Re: [DISCUSS] Change the default restart-strategy to exponential-delay

2023-12-19 Thread Rui Fan
Thanks everyone for the feedback! There has been no further feedback here, so I just started a new vote[1] to update the default value of backoff-multiplier from 1.2 to 1.5. [1] https://lists.apache.org/thread/0b1dcwb49owpm6v1j8rhrg9h0fvs5nkt Best, Rui On Tue, Dec 12, 2023 at 7:14 PM

Re: [DISCUSS] Change the default restart-strategy to exponential-delay

2023-12-12 Thread Maximilian Michels
Thank you Rui! I think a 1.5 multiplier is a reasonable tradeoff between restarting fast and not putting too much pressure on the cluster due to restarts. -Max On Tue, Dec 12, 2023 at 8:19 AM Rui Fan <1996fan...@gmail.com> wrote: > > Hi Maximilian and Mason, > > Thanks a lot for your feedback! >
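
To make the tradeoff between the 1.2 and 1.5 multipliers concrete, here is a minimal sketch of how an exponential backoff schedule grows. It is not Flink's implementation: the function names are hypothetical, jitter is ignored, and the 1 second initial backoff and 60 second cap are assumed values chosen for illustration, not figures stated in this thread.

```python
def backoff_schedule(multiplier, initial_s=1.0, max_s=60.0, attempts=10):
    """Return the delay in seconds before each consecutive restart.

    initial_s and max_s are assumed illustration values; jitter is omitted.
    """
    delays = []
    delay = initial_s
    for _ in range(attempts):
        # Each delay is capped at max_s, then grown for the next attempt.
        delays.append(min(delay, max_s))
        delay *= multiplier
    return delays


def restarts_until_cap(multiplier, initial_s=1.0, max_s=60.0):
    """Count how many times the delay is multiplied before it reaches max_s."""
    count = 0
    delay = initial_s
    while delay < max_s:
        delay *= multiplier
        count += 1
    return count


if __name__ == "__main__":
    for m in (1.2, 1.5):
        schedule = [round(d, 2) for d in backoff_schedule(m, attempts=6)]
        print(f"multiplier={m}: first delays {schedule}, "
              f"cap reached after {restarts_until_cap(m)} multiplications")
```

Under these assumed values, the 1.5 multiplier reaches the cap in roughly half as many consecutive failures as 1.2, so a job that keeps failing backs off sooner and issues fewer restart-related requests to the cluster, while the first few restarts still happen within a few seconds.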

Re: [DISCUSS] Change the default restart-strategy to exponential-delay

2023-12-11 Thread Rui Fan
Hi Maximilian and Mason, Thanks a lot for your feedback! After an offline consultation with Max, I think I now understand your concern: when a Flink job restarts, it makes a bunch of calls to the Kubernetes API, e.g. reads/writes to config maps and creating task managers. Currently, the default

Re: [DISCUSS] Change the default restart-strategy to exponential-delay

2023-12-07 Thread Maximilian Michels
Hey Rui, +1 for changing the default restart strategy to exponential-delay. This is something all users eventually run into. They end up changing the restart strategy to exponential-delay. I think the current defaults are quite balanced. Restarts happen quickly enough unless there are consecutive

Re: [DISCUSS] Change the default restart-strategy to exponential-delay

2023-12-05 Thread Mason Chen
Hi Rui, Sorry for the late reply. I was suggesting that perhaps we could do some testing with Kubernetes wrt configuring values for the exponential restart strategy. We've noticed that the default strategy in 1.17 caused a lot of requests to the K8s API server for unstable deployments. However,

Re: [DISCUSS] Change the default restart-strategy to exponential-delay

2023-11-19 Thread Rui Fan
Hi David and Mason, Thanks for your feedback! To David: > Given that the new default feels more complex than the current behavior, if we decide to do this I think it will be important to include the rationale you've shared in the documentation. That makes sense to me; I will add the related

Re: [DISCUSS] Change the default restart-strategy to exponential-delay

2023-11-17 Thread Mason Chen
Hi Rui, I suppose we could do some benchmarking on what works well for the resource providers that Flink relies on e.g. Kubernetes. Based on conferences and blogs, it seems most people are relying on Kubernetes to deploy Flink and the restart strategy has a large dependency on how well Kubernetes

Re: [DISCUSS] Change the default restart-strategy to exponential-delay

2023-11-17 Thread David Anderson
Rui, I don't have any direct experience with this topic, but given the motivation you shared, the proposal makes sense to me. Given that the new default feels more complex than the current behavior, if we decide to do this I think it will be important to include the rationale you've shared in the

[DISCUSS] Change the default restart-strategy to exponential-delay

2023-11-15 Thread Rui Fan
Hi dear Flink users and devs: FLIP-364[1] intends to make some improvements to restart-strategy and to discuss updating some of the default values of exponential-delay, and whether exponential-delay can be used as the default restart-strategy. After discussing on the dev mailing list[2], we hope to collect