GWphua opened a new issue, #18446:
URL: https://github.com/apache/druid/issues/18446

   ### Description
   
   - Propose a configurable startup strategy that eagerly loads only recent 
(“hot”) segments, while leaving older (“cold”) segments to load lazily on first 
access.
   - Propose to deprecate druid.segmentCache.lazyLoadOnStart in favour of a 
config that gives more flexibility over how a Historical loads its segment 
cache during startup.
   
   ### Motivation
   
   - Non-lazy (eager) segment loading takes a long time when a Historical holds 
many segments (observed ~22 minutes per Historical; ~39 hours cluster-wide).
   - Lazy-loading improves startup time but initial queries over hot data can 
be slow.
   - Many clusters primarily query the last N days/weeks; we can make that 
slice eager at startup to maintain query performance.
   
   ### Proposal
   Deprecate druid.segmentCache.lazyLoadOnStart in favor of a single 
strategy-driven config:
   
   New: historicalStartupSegmentCacheStrategy with options: 
   1. loadLazily (all segments lazy)
   2. loadAllEagerly (all segments eager)
   3. loadEagerlyForPeriod (recent window eager, older lazy)
   
   When loadEagerlyForPeriod is selected, require:
   historicalStartupSegmentCacheStrategy.loadPeriod (ISO-8601 period, e.g., 
P7D, P30D)
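
A minimal sketch of how the three options could decide, per segment, whether to load eagerly at startup. All names here (`StartupLoadStrategy`, `StartupLoadDecider`, `shouldLoadEagerly`) are hypothetical and do not exist in Druid. It also assumes that "recent" means the segment interval ends after `now - loadPeriod`, and it uses `java.time` only to stay self-contained.

```java
import java.time.Instant;
import java.time.Period;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical enum mirroring the three proposed options.
enum StartupLoadStrategy { LOAD_LAZILY, LOAD_ALL_EAGERLY, LOAD_EAGERLY_FOR_PERIOD }

// Hypothetical helper: decides whether one segment is loaded eagerly at startup.
final class StartupLoadDecider {
    private final StartupLoadStrategy strategy;
    private final Period loadPeriod; // only required for LOAD_EAGERLY_FOR_PERIOD

    StartupLoadDecider(StartupLoadStrategy strategy, Period loadPeriod) {
        if (strategy == StartupLoadStrategy.LOAD_EAGERLY_FOR_PERIOD && loadPeriod == null) {
            throw new IllegalArgumentException("loadPeriod is required for loadEagerlyForPeriod");
        }
        this.strategy = strategy;
        this.loadPeriod = loadPeriod;
    }

    // Assumption: a segment is "hot" if its interval ends after (now - loadPeriod).
    boolean shouldLoadEagerly(Instant segmentEnd, Instant now) {
        switch (strategy) {
            case LOAD_LAZILY:
                return false;
            case LOAD_ALL_EAGERLY:
                return true;
            default:
                // Period arithmetic needs a calendar, so go through a UTC date-time.
                Instant cutoff = ZonedDateTime.ofInstant(now, ZoneOffset.UTC)
                        .minus(loadPeriod)
                        .toInstant();
                return segmentEnd.isAfter(cutoff);
        }
    }
}
```

With `loadPeriod = P7D`, a segment ending two days ago would be loaded eagerly, while one ending a month ago would be left for lazy loading on first access.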
   
   #### Backward compatibility and migration
   Keep reading druid.segmentCache.lazyLoadOnStart for at least a few more 
releases with a deprecation warning.
   We can map true -> loadLazily and false -> loadAllEagerly.
   If the new `historicalStartupSegmentCacheStrategy` is set, it overrides the 
`lazyLoadOnStart` setting [Optional: and a warning is logged if both settings 
are configured].
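
The migration rule above could be sketched roughly as follows. `StartupStrategyResolver` and `resolve` are hypothetical names, the strategy is kept as a plain string for brevity, and the fallback when neither setting is present assumes the current default of `lazyLoadOnStart=false`.

```java
import java.util.Optional;
import java.util.logging.Logger;

// Hypothetical resolver combining the new strategy config with the legacy flag.
final class StartupStrategyResolver {
    private static final Logger LOG = Logger.getLogger("StartupStrategyResolver");

    static String resolve(Optional<String> newStrategy, Optional<Boolean> legacyLazyLoadOnStart) {
        if (newStrategy.isPresent()) {
            // The new config wins; optionally warn when both are configured.
            if (legacyLazyLoadOnStart.isPresent()) {
                LOG.warning("Both the new strategy and lazyLoadOnStart are set; ignoring lazyLoadOnStart.");
            }
            return newStrategy.get();
        }
        if (legacyLazyLoadOnStart.isPresent()) {
            LOG.warning("druid.segmentCache.lazyLoadOnStart is deprecated.");
            // true -> loadLazily, false -> loadAllEagerly
            return legacyLazyLoadOnStart.get() ? "loadLazily" : "loadAllEagerly";
        }
        // Assumption: with nothing configured, keep eager loading as the default.
        return "loadAllEagerly";
    }
}
```

Keeping the mapping in one place means the deprecated flag can later be removed by deleting a single branch.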
   
   The advantage of relying on a strategy-driven config is that it lets us 
add further load strategies later without introducing new flags.
   
   Config names are open for discussion; please do drop some suggestions!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

