Hi Jiazhou,

Autoscaling via the Flink Kubernetes operator supports "in-place" rescaling, but it works the way you described: when scaling up, additional resources are allocated first to increase the cluster size, and then the job is restarted. This is not the kind of "in-place" rescaling that works without a job restart. I'm not aware of true in-place rescaling currently being developed.
-Max

On Fri, Feb 14, 2025 at 11:40 PM JIAZHOU GAO <gjz140...@gmail.com> wrote:
>
> Hi,
>
> I have a few questions on in-place rescaling support in Flink k8s operator.
>
> 1. Seems it is in the development roadmap
> <https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/development/roadmap/>.
> Is there a plan for developing it? Any discussion threads that I can follow?
> 2. I noticed that the standalone autoscaler
> <https://github.com/apache/flink-kubernetes-operator/blob/44dd679a5952569a7ae36395202fdf609130fdef/flink-autoscaler-standalone/src/main/java/org/apache/flink/autoscaler/standalone/realizer/RescaleApiScalingRealizer.java#L117>
> has the logic to trigger in-place rescaling through rescale api. But it does
> not check if the Flink cluster has some task slots resources to accommodate
> that request. Is that right? In other words, if it asks for more
> parallelism than what the cluster current task slots can accommodate, the
> rescale would not happen?
> 3. To automatically increase task slots resources in the cluster to
> fulfill the in-place rescaling request who asks for more parallelism than
> what the current available task slots, I would assume that Flink k8s
> operator needs a way to somehow add more task manager Pods without invoking
> the full job restart. Is that something we already support or to be
> developed?
>
> Thank you!
> Jiazhou