Hi all,

Per the Flink bylaws, only votes from active committers are binding for FLIP changes.

So far, we have only 2 binding votes. Could one of the committers/PMC members vote on this FLIP?

Thanks

Etienne


On 08/08/2023 at 10:19, Etienne Chauchot wrote:

Hi Joseph,

Thanks for the detailed review!

Best

Etienne

On 14/07/2023 at 11:57, Prabhu Joseph wrote:
*+1 (non-binding)*

Thanks for working on this. We have seen good improvement during the cool
down period with this feature.
Below are details on the test results from one of our clusters:

On a scale-out operation, 8 new nodes were added one by one with a gap of
~30 seconds. There were 8 restarts within 4 minutes with the default
behaviour, whereas there was only one with this feature (cooldown period
of 4 minutes).

With this feature, the job processed more records during the restart
window (2909764) than with the default behaviour (1323960). With the
default behaviour, the multiple restarts mean the job spends most of its
time recovering, and any progress the tasks made after the last
successfully completed checkpoint is lost.
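
To illustrate why the restart counts differ, here is a minimal Python sketch
of a simplified cooldown model. It is not Flink's scheduler code: the function
`count_restarts`, its parameters, and the assumption that the previous rescale
happened 30 seconds before the first node arrived are all hypothetical and
chosen only to mirror the scenario described above.

```python
def count_restarts(arrivals, cooldown, last_rescale=0.0):
    """Count restarts under a simplified cooldown model (hypothetical,
    not Flink's implementation).

    A rescale may only happen once `cooldown` seconds have elapsed since
    the previous rescale. Resource changes that arrive while a rescale is
    still pending are merged into that pending rescale.
    """
    restarts = 0
    scheduled = None  # time at which a pending rescale will fire, if any
    for t in sorted(arrivals):
        if scheduled is not None and scheduled < t:
            # The pending rescale fired before this node arrived.
            restarts += 1
            last_rescale = scheduled
            scheduled = None
        if scheduled is None:
            # Schedule a rescale as soon as the cooldown allows it.
            scheduled = max(t, last_rescale + cooldown)
        # Otherwise this arrival is absorbed by the pending rescale.
    if scheduled is not None:
        restarts += 1  # the last pending rescale eventually fires
    return restarts


# Assumption: 8 nodes arrive 30 s apart, the first one 30 s after the
# previous rescale (taken as t = 0).
arrivals = [30 * i for i in range(1, 9)]

default_restarts = count_restarts(arrivals, cooldown=0)
cooldown_restarts = count_restarts(arrivals, cooldown=240)
print(default_restarts, cooldown_restarts)
```

Under these assumptions the sketch gives 8 restarts without a cooldown and 1
restart with a 4-minute cooldown, which matches the counts reported above.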

Metrics             | Default Adaptive Scheduler                         | Adaptive Scheduler With Cooldown Period
NumRecordsProcessed | 1323960                                            | 2909764
Job Parallelism     | 13 -> 20 -> 27 -> 34 -> 41 -> 48 -> 55 -> 62 -> 69 | 13 -> 69
NumRestarts         | 8                                                  | 1

Remarks:

1. NumRecordsProcessed metric indicates the difference the cool down period
brings in. When the job is doing multiple restarts, the task spends most of
the time recovering, and the progress the task made will be lost during the
restart.

2. There is only one restart with Cool Down Period which happened when the
8th node got added back.

On Wed, Jul 12, 2023 at 8:03 PM Etienne Chauchot <echauc...@apache.org> wrote:

Hi all,

I'm going on vacation tonight for 3 weeks.

Even though the vote is not finished, the implementation is rather quick
and the design discussion has settled, so I went ahead and implemented
FLIP-322 [1] so that people can take a look while I'm off.

[1] https://github.com/apache/flink/pull/22985

Best

Etienne

On 12/07/2023 at 09:56, Etienne Chauchot wrote:
Hi all,

Would you mind casting your vote in this second vote thread (opened
after new discussions) so that the subject can move forward?

@David, @Chesnay, @Robert, you took part in the discussions; could you
please send your vote?

Thank you very much

Best

Etienne

On 06/07/2023 at 13:02, Etienne Chauchot wrote:
Hi all,

Thanks for your feedback on FLIP-322: Cooldown period for
adaptive scheduler [1].

This FLIP was discussed in [2].

I'd like to start a vote for it. The vote will be open for at least 72
hours (until July 9th 15:00 GMT) unless there is an objection or
insufficient votes.

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-322+Cooldown+period+for+adaptive+scheduler
[2] https://lists.apache.org/thread/qvgxzhbp9rhlsqrybxdy51h05zwxfns6

Best,

Etienne
