suxiaogang223 opened a new issue, #62101:
URL: https://github.com/apache/doris/issues/62101

   ### Background
   
   The current third-party Docker startup flow in Doris has accumulated several 
usability and performance issues over time, especially around heavyweight 
services such as Hive, Iceberg-related components, and other stateful external 
dependencies.
   
   Common pain points include:
   
   - Long startup latency caused by repeated initialization, redundant 
downloads, and expensive bootstrap steps
   - Service startup being tightly coupled with data initialization, which makes 
simple restarts and daily development workflows slow
   - Lack of incremental refresh mechanisms, so small data/script changes often 
require broad re-initialization
   - Poor usability of startup control, with little distinction between modes 
such as fast start, refresh, rebuild, and targeted reset
   - Repeated environment preparation work that could be cached or reused safely
   
   These issues affect both local development efficiency and CI stability/cost.
   
   ### Goal
   
   This tracking issue focuses on improving the third-party Docker startup scripts 
with two primary goals:
   
   1. Reduce startup time for common developer and CI workflows
   2. Improve usability, observability, and control of startup/reset behavior
   
   ### Scope
   
   The optimization work may include, but is not limited to:
   
   - Reducing redundant initialization work during container startup
   - Caching or reusing downloaded/bootstrap artifacts when safe
   - Merging or simplifying expensive bootstrap steps
   - Removing unnecessary metadata repair or data scan operations
   - Decoupling service readiness from heavyweight data loading
   - Introducing clearer startup modes for different scenarios
   - Improving partial refresh / targeted rebuild support
   - Improving logs, diagnostics, and failure visibility
   - Standardizing script behavior across different third-party components
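
The "caching or reusing downloaded/bootstrap artifacts when safe" item could be approached roughly as follows. This is only a minimal sketch, not the existing Doris script logic: the function names `cache_is_valid` and `fetch_with_cache` are hypothetical, and it assumes `sha256sum` (GNU coreutils) and `curl` are available on the host. "Safe" here is interpreted as "the cached file's checksum still matches a known value".

```shell
#!/usr/bin/env bash
# Hypothetical helpers (names are assumptions, not part of the Doris scripts):
# reuse a previously downloaded artifact when its checksum still matches.
set -eu

# cache_is_valid FILE EXPECTED_SHA256
# Returns 0 when FILE exists and its SHA-256 digest equals EXPECTED_SHA256.
cache_is_valid() {
    local file expected actual
    file="$1"
    expected="$2"
    [ -f "${file}" ] || return 1
    actual="$(sha256sum "${file}" | awk '{print $1}')"
    [ "${actual}" = "${expected}" ]
}

# fetch_with_cache URL FILE EXPECTED_SHA256
# Downloads URL to FILE only when the cached copy is missing or does not match.
fetch_with_cache() {
    local url file expected
    url="$1"
    file="$2"
    expected="$3"
    if cache_is_valid "${file}" "${expected}"; then
        echo "cache hit: ${file}"
        return 0
    fi
    echo "cache miss: downloading ${url}"
    mkdir -p "$(dirname "${file}")"
    # Download to a temporary name first so an interrupted transfer
    # is never mistaken for a complete cache entry.
    curl -fsSL -o "${file}.tmp" "${url}"
    mv "${file}.tmp" "${file}"
    # Fail if the freshly downloaded file does not match either.
    cache_is_valid "${file}" "${expected}"
}
```

A startup script would call `fetch_with_cache` once per artifact; on a warm cache no network access happens at all, which is where most of the saving would come from.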
   
   ### Non-goals
   
   This tracking issue does not require all startup scripts to be fully redesigned 
in one step. Incremental improvements are acceptable as long as they clearly 
improve startup performance or usability without introducing instability.
   
   ### Proposed Work Items
   
   - [ ] Audit current third-party startup bottlenecks by component
   - [ ] Optimize Hive startup hot path
   - [ ] Reduce repeated downloads and improve local cache reuse
   - [ ] Clean up redundant metadata repair and bootstrap work
   - [ ] Introduce clearer startup mode semantics where needed
   - [ ] Improve restart experience after machine reboot or container restart
   - [ ] Improve script usability and error reporting
   - [ ] Add regression coverage for key startup flows
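
To make the "clearer startup mode semantics" item more concrete, one possible shape is a single function that maps a mode to the ordered steps it runs. The mode names (fast start, refresh, rebuild, targeted reset) come from the Background section; the function name `plan_steps`, the step names, and the mapping itself are purely illustrative assumptions, not a proposal that has been agreed on. Note that every mode waits for service readiness before any data loading, which reflects the goal of decoupling the two.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: map a startup mode to the steps it would run.
# Step names and the mode-to-step mapping are assumptions for illustration.
set -eu

# plan_steps MODE
# Prints the space-separated steps for MODE; returns 2 for an unknown mode.
plan_steps() {
    local mode
    mode="$1"
    case "${mode}" in
        fast)    echo "start_containers wait_ready" ;;
        refresh) echo "start_containers wait_ready load_changed_data" ;;
        rebuild) echo "remove_containers remove_volumes start_containers wait_ready load_all_data" ;;
        reset)   echo "remove_component_state start_containers wait_ready load_component_data" ;;
        *)       echo "unknown mode: ${mode}" >&2; return 2 ;;
    esac
}
```

Keeping the plan separate from its execution would also make it easy to add a dry-run option and to cover the mode semantics in regression tests without starting any containers.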
   
   ### Expected Benefits
   
   - Faster local setup and restart for contributors
   - Lower CI initialization cost and shorter feedback loops
   - Easier debugging and maintenance of third-party environments
   - More predictable and controllable startup behavior
   
   ### Notes
   
   This issue is intended to track a series of incremental PRs instead of one 
large refactor.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
