Re: [PR] Introduce temporary config params to tweak high lag handling (druid)

via GitHub Tue, 03 Feb 2026 22:39:37 -0800


kfaraz commented on code in PR #18976:
URL: https://github.com/apache/druid/pull/18976#discussion_r2762448736



##########
indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/supervisor/autoscaler/CostBasedAutoScaler.java:
##########
@@ -231,7 +227,7 @@ int computeOptimalTaskCount(CostMetrics metrics)
     for (int taskCount : validTaskCounts) {

Review Comment:
   Let's also add a log line here that prints out the `idleWeight`, `lagWeight`, `aggregateLag` and `poll-to-idle-ratio`.
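
A minimal sketch of what such a log line could look like, assuming the four values are available as plain numbers at this point. The class name, method name, and message wording are hypothetical and are not taken from the PR; the sketch only builds the message with `String.format`, so it stays independent of the actual logger.

```java
import java.util.Locale;

final class CostInputLogSketch
{
  // Hypothetical helper, not part of the PR: renders the cost inputs
  // in a name[value] style so they are easy to grep in the logs.
  static String formatCostInputs(
      double idleWeight,
      double lagWeight,
      long aggregateLag,
      double pollToIdleRatio
  )
  {
    return String.format(
        Locale.ROOT,
        "Computing optimal task count with idleWeight[%.2f], lagWeight[%.2f], "
        + "aggregateLag[%d], pollToIdleRatio[%.4f]",
        idleWeight,
        lagWeight,
        aggregateLag,
        pollToIdleRatio
    );
  }
}
```

Emitting this once before the loop over `validTaskCounts`, rather than once per candidate, would keep the output short while still recording the inputs behind every proposed cost.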



##########
indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/supervisor/autoscaler/CostBasedAutoScaler.java:
##########
@@ -57,24 +58,19 @@ public class CostBasedAutoScaler implements SupervisorTaskAutoScaler
 
   private static final int MAX_INCREASE_IN_PARTITIONS_PER_TASK = 2;
   private static final int MAX_DECREASE_IN_PARTITIONS_PER_TASK = MAX_INCREASE_IN_PARTITIONS_PER_TASK * 2;
+
   /**
-   * Defines the step size used for evaluating lag when computing scaling actions.
-   * This constant helps control the granularity of lag considerations in scaling decisions,
-   * ensuring smoother transitions between scaled states and avoiding abrupt changes in task counts.
+   * Controls how fast the additional tasks grow with the square root of current tasks.
+   * This allows bigger jumps when under-provisioned, but growth slows down as the task count increases.
    */
-  private static final int LAG_STEP = 50_000;
+  private static final int SQRT_TASK_COUNT_SCALE_FACTOR = 5;
   /**
-   * This parameter fine-tunes autoscaling behavior by adding extra flexibility
-   * when calculating maximum allowable partitions per task in response to lag,
-   * which must be processed as fast, as possible.
-   * It acts as a foundational factor that balances the responsiveness and stability of autoscaling.
+   * Caps the maximum number of additional tasks in a single scale-up to preserve stability.
    */
-  private static final int BASE_RAW_EXTRA = 5;
+  private static final int MAX_JUMP = 12;
+
   // Base PPT lag threshold allowing to activate a burst scaleup to eliminate high lag.
-  static final int EXTRA_SCALING_LAG_PER_PARTITION_THRESHOLD = 25_000;
-  // Extra PPT lag threshold allowing activation of even more aggressive scaleup to eliminate high lag,
-  // also enabling lag-amplified idle calculation decay in the cost function (to reduce idle weight).
-  static final int AGGRESSIVE_SCALING_LAG_PER_PARTITION_THRESHOLD = 50_000;
+  static final int EXTRA_SCALING_LAG_PER_PARTITION_THRESHOLD = 50_000;

Review Comment:
   Let's remove this for now to keep things simple.
   
   The high lag would already increase the lag cost, which should cause the auto-scaler to choose a high task count anyway. For the time being, let's just focus on the cost function and tweak the `idleWeight` and `lagWeight` to get the auto-scaler to do the right thing.
   
   The more parameters we add, the less deterministic the entire setup becomes.
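
To illustrate the point, here is a toy sketch of a weighted cost search, assuming a cost of the form `lagWeight * lagCost + idleWeight * idleCost`. The `lagCost` and `idleCost` formulas, the `costPerTask` parameter, and all names below are illustrative assumptions, not the actual `costFunction.computeCost` implementation.

```java
import java.util.List;

final class WeightedCostSketch
{
  // Assumed toy lag term: the backlog each task has to work through
  // shrinks as more tasks share the aggregate lag.
  static double lagCost(long aggregateLag, int taskCount)
  {
    return (double) aggregateLag / taskCount;
  }

  // Assumed toy idle term: every extra task adds a fixed overhead.
  static double idleCost(int taskCount, double costPerTask)
  {
    return taskCount * costPerTask;
  }

  // Returns the candidate with the lowest weighted total cost,
  // or -1 if the candidate list is empty.
  static int pickTaskCount(
      List<Integer> validTaskCounts,
      long aggregateLag,
      double lagWeight,
      double idleWeight,
      double costPerTask
  )
  {
    int best = -1;
    double bestCost = Double.POSITIVE_INFINITY;
    for (int taskCount : validTaskCounts) {
      double cost = lagWeight * lagCost(aggregateLag, taskCount)
                    + idleWeight * idleCost(taskCount, costPerTask);
      if (cost < bestCost) {
        bestCost = cost;
        best = taskCount;
      }
    }
    return best;
  }
}
```

In this toy model, with candidates 1, 2, 4, 8, 16, both weights at 1 and `costPerTask` at 10, an aggregate lag of 1,000 selects 8 tasks, while a lag of 100,000 selects 16 without any separate lag threshold. Raising `idleWeight` to 4 at a lag of 1,000 pulls the choice down to 4 tasks.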



##########
indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/supervisor/autoscaler/CostBasedAutoScaler.java:
##########
@@ -231,7 +227,7 @@ int computeOptimalTaskCount(CostMetrics metrics)
     for (int taskCount : validTaskCounts) {
       CostResult costResult = costFunction.computeCost(metrics, taskCount, config);
       double cost = costResult.totalCost();
-      log.debug(
+      log.info(
           "Proposed task count: %d, Cost: %.4f (lag: %.4f, idle: %.4f)",

Review Comment:
   ```suggestion
             "Proposed task count[%d] has total Cost[%.4f] = lagCost[%.4f] + idleCost[%.4f]",
   ```



##########
indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/supervisor/autoscaler/CostBasedAutoScaler.java:
##########
@@ -57,24 +58,19 @@ public class CostBasedAutoScaler implements SupervisorTaskAutoScaler
 
   private static final int MAX_INCREASE_IN_PARTITIONS_PER_TASK = 2;
   private static final int MAX_DECREASE_IN_PARTITIONS_PER_TASK = MAX_INCREASE_IN_PARTITIONS_PER_TASK * 2;
+
   /**
-   * Defines the step size used for evaluating lag when computing scaling actions.
-   * This constant helps control the granularity of lag considerations in scaling decisions,
-   * ensuring smoother transitions between scaled states and avoiding abrupt changes in task counts.
+   * Controls how fast the additional tasks grow with the square root of current tasks.
+   * This allows bigger jumps when under-provisioned, but growth slows down as the task count increases.
    */
-  private static final int LAG_STEP = 50_000;
+  private static final int SQRT_TASK_COUNT_SCALE_FACTOR = 5;

Review Comment:
   @Fly-Style, I wonder if we should just remove this limit altogether, at least for the time being. We already have the `maxTaskCount` and `minTaskCount` guardrails. I feel they should be enough for now so that we can see how reactive the auto-scaler really is.
   
   The user should be able to tweak the `idleWeight` and `lagWeight` to ensure 
that the auto-scaler doesn't always scale up to the `maxTaskCount`.
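
As a rough sketch of relying on the two guardrails alone, assuming the cost search has already produced a proposed task count: the class and method names are hypothetical, and the only limit applied is the `[minTaskCount, maxTaskCount]` range, with no per-step jump cap.

```java
final class GuardrailSketch
{
  // Hypothetical helper: bounds the proposed task count by the configured
  // minTaskCount and maxTaskCount only, with no limit on the size of a jump.
  static int clampToGuardrails(int proposedTaskCount, int minTaskCount, int maxTaskCount)
  {
    if (minTaskCount > maxTaskCount) {
      throw new IllegalArgumentException("minTaskCount must not exceed maxTaskCount");
    }
    return Math.max(minTaskCount, Math.min(maxTaskCount, proposedTaskCount));
  }
}
```

With this shape, a proposal of 40 tasks under `minTaskCount = 1` and `maxTaskCount = 25` becomes 25 in a single step, which makes it easy to observe how reactive the cost function is on its own.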



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]