Fly-Style opened a new pull request, #19549:
URL: https://github.com/apache/druid/pull/19549

   **Description**
     
   The cost-based autoscaler derives candidate task counts from possible 
partitions-per-task ratios. For large partition counts, these candidates can be 
very far apart near the top of the assignment range, e.g. for 400 partitions 
the candidates jump from `200` straight to `400`. Because the cost model only 
evaluates the generated candidates, it has no intermediate option to settle on, 
forcing coarse, all-or-nothing scaling decisions.
     
   This PR adds deterministic intermediate candidates so the cost model has 
finer-grained options, without changing the cost model itself. 
     
   **Introduced intermediate valid task counts for large gaps**
     
   `CostBasedAutoScaler.computeValidTaskCounts` now post-processes the 
generated candidate list: after the base partitions-per-task candidates are 
produced and sorted, every adjacent pair whose gap exceeds `MAX_CANDIDATE_GAP` 
(100) is split by inserting intermediate candidates at the fractions 
`INTERPOLATION_FRACTIONS = {0.33, 0.66}` of the gap, rounded to the nearest 
integer.
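
   The post-processing step described above can be sketched roughly as follows. 
This is an illustration only, not the code in this PR: the class name 
`CandidateGapSketch` and the method `fillLargeGaps` are hypothetical, and only 
the two constants and the "split gaps above 100 at 0.33 and 0.66, then round" 
rule come from the description. It assumes the fractions are applied to the gap 
between the two neighbouring candidates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration; not the actual CostBasedAutoScaler code.
final class CandidateGapSketch
{
  // Values as stated in the PR description.
  static final int MAX_CANDIDATE_GAP = 100;
  static final double[] INTERPOLATION_FRACTIONS = {0.33, 0.66};

  // Returns the sorted, de-duplicated candidates plus intermediate values
  // for every adjacent pair whose gap is strictly greater than MAX_CANDIDATE_GAP.
  static List<Integer> fillLargeGaps(List<Integer> candidates)
  {
    TreeSet<Integer> sorted = new TreeSet<>(candidates);
    TreeSet<Integer> result = new TreeSet<>(sorted);
    Integer prev = null;
    for (Integer curr : sorted) {
      if (prev != null && curr - prev > MAX_CANDIDATE_GAP) {
        for (double fraction : INTERPOLATION_FRACTIONS) {
          // Assumption: interpolate linearly between the two neighbours.
          result.add((int) Math.round(prev + fraction * (curr - prev)));
        }
      }
      prev = curr;
    }
    return new ArrayList<>(result);
  }

  public static void main(String[] args)
  {
    // Only the 200 -> 400 gap exceeds 100, so only it is split.
    System.out.println(fillLargeGaps(List.of(100, 134, 200, 400)));
    // prints [100, 134, 200, 266, 332, 400]
  }
}
```

   Under these assumptions, the `200 -> 400` gap from the 400-partition example 
gains the candidates `266` and `332`, while a gap of exactly 100 is left 
untouched because the comparison is strict.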
   
   **Tuned the lag amplification multiplier**
   
   `WeightedCostFunction.LAG_AMPLIFICATION_MULTIPLIER` is lowered from 0.4 to 
0.35 based on further testing, for a slightly more balanced high-lag recovery 
response.
   
   **Release note**
     
   The cost-based supervisor autoscaler now considers intermediate task counts 
when the candidate task counts derived from partition assignment are far apart, 
enabling smoother scaling for streams with large partition counts.
    
     This PR has:
   
     - [x] been self-reviewed.
     - [x] added unit tests covering the new candidate-generation behaviour 
(no-gap, single-gap,
     multi-gap, and single-candidate cases) in CostBasedAutoScalerTest.
     - [x] been tested in a test Druid cluster.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

