mxm commented on code in PR #879:
URL: https://github.com/apache/flink-kubernetes-operator/pull/879#discussion_r1757005246
##########
flink-autoscaler/src/main/java/org/apache/flink/autoscaler/JobVertexScaler.java:
##########
@@ -389,15 +413,51 @@ protected static int scale(
             return newParallelism;
         }
 
-        // When the shuffle type of vertex inputs contains keyBy or vertex is a source, we try to
-        // adjust the parallelism such that it divides the maxParallelism without a remainder
-        // => data is evenly spread across subtasks
-        for (int p = newParallelism; p <= maxParallelism / 2 && p <= upperBound; p++) {
-            if (maxParallelism % p == 0) {
+        int adjustableMaxParallelism = maxParallelism;
+        int adjustableUpperBound;
+        if (numPartitions <= 0) {
+            adjustableUpperBound = Math.min(maxParallelism / 2, upperBound);
+        } else {
+            adjustableUpperBound = Math.min(numPartitions, upperBound);
+            adjustableMaxParallelism = numPartitions;
+        }
+
+        // When the shuffle type of vertex inputs contains keyBy or the vertex is a source,
+        // we try to adjust the parallelism such that it divides
+        // the adjustableMaxParallelism without a remainder => data is evenly spread across subtasks
+        for (int p = newParallelism; p <= adjustableUpperBound; p++) {
+            if (adjustableMaxParallelism % p == 0) {
                 return p;
             }
         }
 
+        if (numPartitions > 0) {
+
+            // When the rounded-up parallelism does not evenly divide the source numPartitions,
+            // try to find the smallest parallelism that still satisfies the current
+            // consumption rate.
+            int finalParallelism = newParallelism;
+            for (; finalParallelism > parallelismLowerLimit; finalParallelism--) {
+                if (numPartitions / finalParallelism > numPartitions / newParallelism) {
+                    if (numPartitions % finalParallelism != 0) {
+                        finalParallelism++;
+                    }
+                    break;
+                }
+            }
+            consumer.accept(
+                    String.format(
+                            SCALE_LIMITED_MESSAGE_FORMAT,
+                            vertex,
+                            newParallelism,
+                            finalParallelism,
+                            String.format(
+                                    "numPartitions: %s, upperBound (maxParallelism or parallelismUpperLimit): %s, "
+                                            + "parallelismLowerLimit: %s.",
+                                    numPartitions, upperBound, parallelismLowerLimit)));

Review Comment:
   I would prefer if we worked on the EventHandler directly here instead of going through the consumer. The reason is that I would like to keep the event generation code in one place.
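   To make that concrete, here is a minimal sketch of what reporting through an event handler could look like, with event creation owned by the scaler itself. The `ScalerEventHandler` interface, the `handleEvent` signature, and the `ScalingLimited` reason are illustrative stand-ins, not the autoscaler's actual `AutoScalerEventHandler` API:

   ```java
   // Hypothetical stand-in for the autoscaler's event handler; only the structure matters:
   // the scaler holds the handler and builds the event itself, so event generation stays
   // in one place instead of being hidden behind a caller-supplied Consumer<String>.
   interface ScalerEventHandler {
       void handleEvent(String reason, String message);
   }

   class ScaleLimitReporter {
       private final ScalerEventHandler eventHandler;

       ScaleLimitReporter(ScalerEventHandler eventHandler) {
           this.eventHandler = eventHandler;
       }

       void reportScaleLimited(
               String vertex, int requestedParallelism, int appliedParallelism, String details) {
           // Event message construction and emission live together here, next to the
           // scaling logic, rather than in each call site's consumer lambda.
           eventHandler.handleEvent(
                   "ScalingLimited", // hypothetical reason constant
                   String.format(
                           "Scaling of vertex %s was limited: requested parallelism %d, applied %d (%s)",
                           vertex, requestedParallelism, appliedParallelism, details));
       }
   }
   ```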
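   Separately, for readers skimming the hunk, the new adjustment logic boils down to the following self-contained sketch. The `adjust` signature and the `main` driver are hypothetical; the two loops mirror the diff:

   ```java
   // Simplified sketch of the partition-aware parallelism adjustment from the hunk above.
   public class PartitionAwareScaleSketch {

       static int adjust(
               int newParallelism,
               int maxParallelism,
               int upperBound,
               int numPartitions,
               int parallelismLowerLimit) {
           // With known source partitions, spread partitions (not key groups) evenly.
           int adjustableMaxParallelism = maxParallelism;
           int adjustableUpperBound;
           if (numPartitions <= 0) {
               adjustableUpperBound = Math.min(maxParallelism / 2, upperBound);
           } else {
               adjustableUpperBound = Math.min(numPartitions, upperBound);
               adjustableMaxParallelism = numPartitions;
           }

           // First choice: the smallest parallelism >= newParallelism that divides the
           // effective maximum without a remainder.
           for (int p = newParallelism; p <= adjustableUpperBound; p++) {
               if (adjustableMaxParallelism % p == 0) {
                   return p;
               }
           }

           if (numPartitions <= 0) {
               // No partition information: keep the rounded-up parallelism.
               return newParallelism;
           }

           // Fallback for partitioned sources: walk down to the smallest parallelism that
           // does not increase the maximum number of partitions any single subtask reads.
           int finalParallelism = newParallelism;
           for (; finalParallelism > parallelismLowerLimit; finalParallelism--) {
               if (numPartitions / finalParallelism > numPartitions / newParallelism) {
                   if (numPartitions % finalParallelism != 0) {
                       finalParallelism++;
                   }
                   break;
               }
           }
           return finalParallelism;
       }

       public static void main(String[] args) {
           // 12 partitions, target 5: 5 does not divide 12 but 6 does -> prints 6.
           System.out.println(adjust(5, 128, 24, 12, 1));
           // 10 partitions, upper bound 7, target 6: no divisor of 10 in [6, 7];
           // fallback finds 5 (each subtask reads 2 partitions, same max as at 6) -> prints 5.
           System.out.println(adjust(6, 128, 7, 10, 1));
       }
   }
   ```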