You could also check out the Autoscaler logic in the Flink Kubernetes Operator
(https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resource/autoscaler/).
On the current main branch, and in the upcoming 1.5.0 release, the mechanism
is pretty solid :)
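For reference, enabling the operator's autoscaler is a matter of flipping a few settings on the FlinkDeployment. The sketch below is illustrative only: the deployment name is a placeholder, and the exact config keys and defaults may differ between operator versions, so check the docs linked above.

```yaml
# Hypothetical sketch of a FlinkDeployment with the operator autoscaler
# enabled (names and values are illustrative, not a tested config).
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: example-job            # hypothetical name
spec:
  flinkVersion: v1_17
  flinkConfiguration:
    job.autoscaler.enabled: "true"
    job.autoscaler.target.utilization: "0.7"   # aim for ~70% busy
    job.autoscaler.metrics.window: "5m"        # metric aggregation window
```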
Thank you for all your responses! I think Gyula is right: simply doing MAX -
some_offset is not ideal, as it can render the standby TM useless.
It is difficult for the scheduler to determine whether a pod has been lost
or scaled down once we enable autoscaling, which affects its decision to
utilize standby task managers.
I think the behaviour is going to get a little weird, because this would
actually defeat the purpose of the standby TM: MAX - some_offset will
decrease once you lose a TM, so in that case we would scale down to again
have a spare (which we never actually use).
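To make the failure mode above concrete, here is a minimal sketch (plain Python, not Flink code, with a hypothetical policy function) of why a target derived as "MAX - offset" consumes the spare after a failure: if MAX tracks the TMs currently available, losing one lowers MAX, the target follows it down, and the cluster rescales to keep a spare that is never actually used.

```python
def target_parallelism(available_tms: int, offset: int = 1) -> int:
    """Hypothetical policy: always target MAX - offset, where MAX is the
    number of TaskManagers currently available in the cluster."""
    return max(available_tms - offset, 1)

# 5 TMs available, 1 reserved as standby -> the job runs on 4.
assert target_parallelism(5) == 4

# One TM is lost: MAX drops to 4, so the target drops to 3. Instead of
# the standby TM absorbing the failure, the job is scaled down to
# re-create a spare that is never used.
assert target_parallelism(4) == 3
```

This is exactly the "scale down to again have a spare" behaviour described above.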
Gyula
On Wed, Apr 26, 2023 at 4:02 PM
Reactive mode doesn't support standby task managers. As you said, it
always uses all available resources in the cluster.
I can see it being useful, though, to not always scale to MAX but to
(MAX - some_offset).
I'd suggest filing a ticket.
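For context on the setup being discussed: with reactive mode on a standalone Kubernetes deployment, the HPA scales the TaskManager Deployment and the job adapts to whatever replicas exist. A sketch of such an HPA is below; the resource names are placeholders, and capping maxReplicas is only a rough HPA-side analogue of "MAX - some_offset", not a substitute for true standby TMs.

```yaml
# Hypothetical HPA for a reactive-mode TaskManager Deployment
# (names are placeholders; tune thresholds for your workload).
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: flink-taskmanager        # hypothetical
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: flink-taskmanager      # hypothetical TM Deployment
  minReplicas: 2
  maxReplicas: 10                # hard cap; reactive mode uses all replicas
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```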
On 26/04/2023 00:17, Wei Hou via user wrote:
Hi Flink community,
We are trying to use Flink’s reactive mode with the Kubernetes HPA for
autoscaling; however, since reactive mode always uses all available
resources, this causes a problem when we need standby task managers for
fast failure recovery:
The job will always use these extra standby task managers instead of
leaving them idle for recovery.