Re: [DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-07-06 Thread Etienne Chauchot
…when you mention "every taskManagers connecting", if you are referring to the start of the pipeline, please keep in mind that the adaptive scheduler has a "waiting for resources" timeout period before starting the pipeline, during which all TaskManagers connect and the parallelism is decided.

Best

Etienne

On 13/06/2023 at 03:58, yuxia wrote:

Hi, Etienne. Thanks for driving it. I have one question about the mechanism of the cooldown timeout.

From the Proposed Changes part, if a scaling event is received and it falls during the cooldown period, it'll be stacked to be executed after the period ends. Also, from the description of FLINK-21883 [1], the cooldown timeout is meant to avoid rescaling the job very frequently, because TaskManagers are not all connecting at the same time.

So, is it possible that every TaskManager connection will produce a scaling event, and that many scale-up events will be stacked, causing it to take a long time to finish them all? Can we just take the last event?

[1]: https://issues.apache.org/jira/browse/FLINK-21883

Best regards, Yuxia

----- Original Message ----- From: "Etienne Chauchot" To: "dev", "Robert Metzger" <metrob...@gmail.com> Sent: Monday, June 12, 2023, 11:34:25 PM Subject: [DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

Hi,

I’d like to start a discussion about FLIP-322 [1] which
introduces a
cooldown period for the adaptive scheduler.

I'd like to get your feedback, especially @Robert, as you opened the related ticket and worked on the reactive mode a lot.

[1]


https://cwiki.apache.org/confluence/display/FLINK/FLIP-322+Cooldown+period+for+adaptive+scheduler

Best

Etienne
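The stacking behaviour yuxia asks about, and the alternative of taking only the last event, can be sketched as follows. This is a hypothetical Python illustration of the discussion, not Flink code; all names are made up:

```python
from collections import deque

# Hypothetical sketch: every TaskManager connection produces one scaling
# event, and events received during the cooldown period are stacked.
cooldown_queue = deque()

def on_taskmanager_connected():
    # each new TaskManager raises the desired parallelism by one here
    cooldown_queue.append(+1)

# ten TaskManagers connect one by one during a single cooldown period
for _ in range(10):
    on_taskmanager_connected()

# replaying the whole stack would mean ten separate rescale operations...
rescales_if_replayed = len(cooldown_queue)

# ...while taking only the most recent desired state needs exactly one
rescales_if_last_only = 1 if cooldown_queue else 0

print(rescales_if_replayed, rescales_if_last_only)  # 10 1
```

Replaying the whole stack restarts the pipeline once per event, while keeping only the latest desired state needs a single restart, which is essentially what the aggregation idea discussed later in the thread generalizes.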

Re: [DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-07-05 Thread Etienne Chauchot

Re: [DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-07-04 Thread David Morávek
> >>>>>>> 1. Taking a look at the AdaptiveScheduler class, which takes all its
> >>>>>>> configuration from the JobManagerOptions, and also to be consistent
> >>>>>>> with other parameters name, I'd suggest
> >>>>>>> /jobmanager.scheduler-scaling-cooldown-period/
> >>>>>>>
> >>>>>>> 2. I thought scaling events existed already and the scheduler
> >>>>>>> received them as mentioned in FLIP-160 (cf "Whenever the scheduler
> >>>>>>> is in the Executing state and receives new slots") or in FLIP-138
> >>>>>>> (cf "Whenever new slots are available the SlotPool notifies the
> >>>>>>> Scheduler"). If it is not the case (it is the scheduler who asks
> >>>>>>> for slots), then there is no need for storing scaling requests
> >>>>>>> indeed.
> >>>>>>> => I need a confirmation here
> >>>>>>>
> >>>>>>> 3. If we lose the JobManager, we lose both the AdaptiveScheduler
> >>>>>>> state and the CoolDownTimer state. So, upon recovery, it would be
> >>>>>>> as if there was no ongoing coolDown period. So, a first re-scale
> >>>>>>> could happen right away and it will start a coolDown period. A
> >>>>>>> second re-scale would have to wait for the end of this period.
> >>>>>>>
> >>>>>>> 4. When a pipeline is re-scaled, it is restarted. Upon restart, the
> >>>>>>> AdaptiveScheduler passes again in the "waiting for resources" state
> >>>>>>> as FLIP-160 suggests. If so, then it seems that the coolDown period
> >>>>>>> is kind of redundant with the resource-stabilization-timeout. I
> >>>>>>> guess it is not the case otherwise the FLINK-21883 ticket would not
> >>>>>>> have been created.
> >>>>>>>
> >>>>>>> => I need a confirmation here also.
> >>>>>>>
> >>>>>>>
> >>>>>>> Thanks for your views on point 2 and 4.
> >>>>>>>
> >>>>>>>
> >>>>>>> Best
> >>>>>>>
> >>>>>>> Etienne
> >>>>>>>
> >>>>>>> On 15/06/2023 at 13:35, Robert Metzger wrote:
> >>>>>>>> Thanks for the FLIP.
> >>>>>>>>
> >>>>>>>> Some comments:
> >>>>>>>> 1. Can you specify the full proposed configuration name? "
> >>>>>>>> scaling-cooldown-period" is probably not the full config name?
> >>>>>>>> 2. Why is the concept of scaling events and a scaling queue
> >>>>>>>> needed? If I remember correctly, the adaptive scheduler will just
> >>>>>>>> check how many TaskManagers are available and then adjust the
> >>>>>>>> execution graph accordingly.
> >>>>>>>> There's no need to store a number of scaling events. We just need
> >>>>>>>> to determine the time to trigger an adjustment of the execution
> >>>>>>>> graph.
> >>>>>>>> 3. What's the behavior wrt JobManager failures (e.g. we lose the
> >>>>>>>> state of the Adaptive Scheduler)? My proposal would be to just
> >>>>>>>> reset the cooldown period, so after recovery of a JobManager, we
> >>>>>>>> have to wait at least for the cooldown period until further
> >>>>>>>> scaling operations are done.
> >>>>>>>> 4. What's the relationship to the
> >>>>>>>> "jobmanager.adaptive-scheduler.resource-stabilization-timeout"
> >>>>>>>> configuration?
> >>>>>>>>
> >>>>>>>> Thanks a lot for working on this!
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Robert
> >>>>>>>>
> >>>>>>>> On Wed, Jun 14, 2023 at 3:38 PM Etienne
> >>>>>>>> Chauchot
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Hi all,
> >>>>>>>>>
> >>>>>>>>> @Yuxia, I updated the FLIP to include the aggregation of the
> >>>>>>>>> stacked operations that we discussed below. PTAL.
> >>>>>>>>>
> >>>>>>>>> Best
> >>>>>>>>>
> >>>>>>>>> Etienne
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On 13/06/2023 at 16:31, Etienne Chauchot wrote:
> >>>>>>>>>> Hi Yuxia,
> >>>>>>>>>>
> >>>>>>>>>> Thanks for your feedback. The number of potentially stacked
> >>>>>>>>>> operations
> >>>>>>>>>> depends on the configured length of the cooldown period.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> The proposition in the FLIP is to add a minimum delay between 2
> >>>>>>>>>> scaling
> >>>>>>>>>> operations. But, indeed, an optimization could be to still stack
> >>>>>>>>>> the
> >>>>>>>>>> operations (that arrive during a cooldown period) but maybe not
> >>>>>>>>>> take
> >>>>>>>>>> only the last operation but rather aggregate them in order to
> >>>>>>>>>> end up
> >>>>>>>>>> with a single aggregated operation when the cooldown period
> >>>>>>>>>> ends. For
> >>>>>>>>>> example, let's say 3 taskManagers come up and 1 comes down
> >>>>>>>>>> during the
> >>>>>>>>>> cooldown period, we could generate a single operation of scale
> >>>>>>>>>> up +2
> >>>>>>>>>> when the period ends.
> >>>>>>>>>>
> >>>>>>>>>> As a side note regarding your comment on "it'll take a long time
> >>>>>>>>>> to finish all", please keep in mind that the reactive mode (at
> >>>>>>>>>> least for now) is only available for streaming pipelines, which
> >>>>>>>>>> are in essence infinite processing.
> >>>>>>>>>>
> >>>>>>>>>> Another side note: when you mention "every taskManagers
> >>>>>>>>>> connecting",
> >>>>>>>>>> if you are referring to the start of the pipeline, please keep
> >>>>>>>>>> in mind
> >>>>>>>>>> that the adaptive scheduler has a "waiting for resources"
> timeout
> >>>>>>>>>> period before starting the pipeline in which all taskmanagers
> >>>>>>>>>> connect
> >>>>>>>>>> and the parallelism is decided.
> >>>>>>>>>>
> >>>>>>>>>> Best
> >>>>>>>>>>
> >>>>>>>>>> Etienne
> >>>>>>>>>>
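The aggregation described in the quoted message (3 TaskManagers come up and 1 comes down during the cooldown period, collapsing into a single scale up of +2 when the period ends) can be sketched as follows. This is an illustrative Python snippet, not the FLIP's or Flink's actual code:

```python
# Illustrative-only sketch of aggregating stacked scaling operations:
# instead of replaying every event, emit one net operation when the
# cooldown period ends.

def aggregate_stacked(deltas):
    """Collapse per-TaskManager scaling events into one net operation."""
    net = sum(deltas)
    if net == 0:
        return None          # the events cancelled out: no rescale at all
    return net               # a single scale-up (+) or scale-down (-) operation

stacked = [+1, +1, +1, -1]   # events received while the cooldown was running
print(aggregate_stacked(stacked))    # 2  (3 up, 1 down -> one +2 rescale)
print(aggregate_stacked([+1, -1]))   # None (nothing to do)
```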


Re: [DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-07-04 Thread Etienne Chauchot




Re: [DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-07-04 Thread Chesnay Schepler
t;>>> remember correctly, the adaptive scheduler will just
check how many
>>>>> TaskManagers are available and then adjust the execution
graph
>>>>> accordingly.
>>>>> There's no need to store a number of scaling events. We
just need to
>>>>> determine the time to trigger an adjustment of the
execution graph.
>>>>> 3. What's the behavior wrt to JobManager failures (e.g.
we lose
>>>>> the state
>>>>> of the Adaptive Scheduler?). My proposal would be to
just reset the
>>>>> cooldown period, so after recovery of a JobManager, we
have to
>>>>> wait at
>>>>> least for the cooldown period until further scaling
operations are
>>>>> done.
>>>>> 4. What's the relationship to the
>>>>>
"jobmanager.adaptive-scheduler.resource-stabilization-timeout"
>>>>> configuration?
>>>>>
>>>>> Thanks a lot for working on this!
>>>>>
>>>>> Best,
>>>>> Robert
>>>>>
>>>>> On Wed, Jun 14, 2023 at 3:38 PM Etienne
>>>>> Chauchot
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> @Yukia,I updated the FLIP to include the aggregation of
the staked
>>>>>> operations that we discussed below PTAL.
>>>>>>
>>>>>> Best
>>>>>>
>>>>>> Etienne
>>>>>>
>>>>>>
>>>>>> Le 13/06/2023 à 16:31, Etienne Chauchot a écrit :
>>>>>>> Hi Yuxia,
>>>>>>>
>>>>>>> Thanks for your feedback. The number of potentially
stacked
>>>>>>> operations
>>>>>>> depends on the configured length of the cooldown period.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> The proposition in the FLIP is to add a minimum delay
between 2
>>>>>>> scaling
>>>>>>> operations. But, indeed, an optimization could be to
still stack
>>>>>>> the
>>>>>>> operations (that arrive during a cooldown period) but
maybe not
>>>>>>> take
>>>>>>> only the last operation but rather aggregate them in
order to
>>>>>>> end up
>>>>>>> with a single aggregated operation when the cooldown
period
>>>>>>> ends. For
>>>>>>> example, let's say 3 taskManagers come up and 1 comes
down
>>>>>>> during the
>>>>>>> cooldown period, we could generate a single operation
of scale
>>>>>>> up +2
>>>>>>> when the period ends.
>>>>>>>
>>>>>>> As a side note regarding your comment on "it'll take a
long time to
>>>>>>> finish all", please keep in mind that the reactive
mode (at
>>>>>>> least for
>>>>>>> now) is only available for streaming pipeline which
are in essence
>>>>>>> infinite processing.
>>>>>>>
>>>>>>> Another side note: when you mention "every taskManagers
>>>>>>> connecting",
>>>>>>> if you are referring to the start of the pipeline,
please keep
>>>>>>> in mind
>>>>>>> that the adaptive scheduler has a "waiting for
resources" timeout
>>>>>>> period before starting the pipeline in which all
taskmanagers
>>>>>>> connect
>>>>>>> and the parallelism is decided.
>>>>>>>
>>>>>>> Best
>>>>>>>
>>>>>>> Etienne
>>>>>>>
>>>>>>> Le 13/06/2023 à 03:58, yuxia a écrit :
>>>>>>>> Hi, Etienne. Thanks for driving it. I have one
question about the
>>>>>>>> mechanism of the cooldown timeout.
>>>>>>>>
>>>>>>>>  From the Proposed Changes part, if a scalling event is
>>>>>>>> received and
>>>>>>>> it falls during the cooldown period, it'll be stacked
to be
>>>>>>>> executed
>>>>>>>> after the period ends. Also, from the description of
>>>>>>>> FLINK-21883[1],
>>>>>>>> cooldown timeout is to avoid rescaling the job very
frequently,
>>>>>>>> because TaskManagers are not all connecting at the
same time.
>>>>>>>>
>>>>>>>> So, is it possible that every taskmanager connecting
will
>>>>>>>> produce a
>>>>>>>> scalling event and it'll be stacked with many scale
up event which
>>>>>>>> causes it'll take a long time to finish all? Can we
just take the
>>>>>>>> last one event?
>>>>>>>>
>>>>>>>> [1]:https://issues.apache.org/jira/browse/FLINK-21883
>>>>>>>>
>>>>>>>> Best regards, Yuxia
>>>>>>>>
>>>>>>>> - 原始邮件 - 发件人: "Etienne
>>>>>>>> Chauchot"
>>>>>>>> 收件人:
>>>>>>>> "dev", "Robert
Metzger"
>>>>>>>> 发送时间: 星期一, 2023年 6 月 12日 下午 11:34:25 主题:
[DISCUSS]
>>>>>>>> FLIP-322
>>>>>>>> Cooldown
>>>>>>>> period for adaptive scheduler
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I’d like to start a discussion about FLIP-322 [1] which
>>>>>>>> introduces a
>>>>>>>> cooldown period for the adaptive scheduler.
>>>>>>>>
>>>>>>>> I'd like to get your feedback especially @Robert as
you opened the
>>>>>>>> related ticket and worked on the reactive mode a lot.
>>>>>>>>
>>>>>>>> [1]
>>>>>>>>
>>>>>>

https://cwiki.apache.org/confluence/display/FLINK/FLIP-322+Cooldown+period+for+adaptive+scheduler

>>>>>>
>>>>>>> Best
>>>>>>>> Etienne
>>>
>>>
>



Re: [DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-07-04 Thread David Morávek
…a cooldown when there is already, at each rescale, a stable resource timeout?
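The question above — why add a cooldown when each rescale already goes through a stable-resource timeout — contrasts two different settings. A hypothetical flink-conf.yaml fragment shows them side by side; the stabilization option exists in Flink, while the cooldown key is only the name suggested earlier in this thread, and the values are placeholders:

```yaml
# Existing option: how long the adaptive scheduler stays in the
# "waiting for resources" state, letting TaskManagers connect and
# resources stabilize, before (re)starting the pipeline.
jobmanager.adaptive-scheduler.resource-stabilization-timeout: 10 s

# Name proposed in this thread (illustrative, not final): the minimum
# delay enforced between two consecutive scaling operations of a
# running job.
jobmanager.scheduler-scaling-cooldown-period: 30 s
```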

Re: [DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-07-04 Thread David Morávek


Re: [DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-06-29 Thread Etienne Chauchot

Re: [DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-06-28 Thread Chesnay Schepler
 

Re: [DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-06-20 Thread Etienne Chauchot

Re: [DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-06-16 Thread Chesnay Schepler


Re: [DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-06-16 Thread Etienne Chauchot

Hi Robert,

Thanks for your feedback. I don't know the scheduler part well enough 
yet and I'm taking this ticket as a learning workshop.


Regarding your comments:

1. Taking a look at the AdaptiveScheduler class, which takes all its 
configuration from JobManagerOptions, and to be consistent with the other 
parameter names, I'd suggest 
/jobmanager.scheduler-scaling-cooldown-period/


2. I thought scaling events already existed and that the scheduler received 
them, as mentioned in FLIP-160 (cf. "Whenever the scheduler is in the 
Executing state and receives new slots") or in FLIP-138 (cf. "Whenever 
new slots are available the SlotPool notifies the Scheduler"). If that is 
not the case (i.e., it is the scheduler that asks for slots), then there is 
indeed no need to store scaling requests.


=> I need a confirmation here

3. If we lose the JobManager, we lose both the AdaptiveScheduler state 
and the CoolDownTimer state. So, upon recovery, it would be as if there 
were no ongoing coolDown period: a first re-scale could happen right 
away and would start a coolDown period, and a second re-scale would have 
to wait for the end of this period.


4. When a pipeline is re-scaled, it is restarted. Upon restart, the 
AdaptiveScheduler passes again through the "waiting for resources" state, as 
FLIP-160 suggests. If so, the coolDown period seems somewhat redundant 
with the resource-stabilization-timeout. I guess it is not, otherwise 
the FLINK-21883 ticket would not have been created.


=> I need a confirmation here also.
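For illustration, the option name proposed in point 1 would sit alongside the existing resource-stabilization timeout in flink-conf.yaml. A hypothetical fragment (the cooldown key below is only the name suggested in point 1, not a released Flink option at this time; the values are arbitrary):

```yaml
# Hypothetical: the name proposed in point 1 of this message.
jobmanager.scheduler-scaling-cooldown-period: 30 s

# Existing adaptive-scheduler option discussed in point 4, for comparison:
jobmanager.adaptive-scheduler.resource-stabilization-timeout: 10 s
```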


Thanks for your views on point 2 and 4.


Best

Etienne


Re: [DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-06-15 Thread Robert Metzger
Thanks for the FLIP.

Some comments:
1. Can you specify the full proposed configuration name?
"scaling-cooldown-period" is probably not the full config name.
2. Why is the concept of scaling events and a scaling queue needed? If I
remember correctly, the adaptive scheduler will just check how many
TaskManagers are available and then adjust the execution graph accordingly.
There's no need to store a number of scaling events. We just need to
determine the time to trigger an adjustment of the execution graph.
3. What's the behavior wrt JobManager failures (e.g. we lose the state
of the Adaptive Scheduler)? My proposal would be to just reset the
cooldown period, so after recovery of a JobManager, we have to wait at
least for the cooldown period until further scaling operations are done.
4. What's the relationship to the
"jobmanager.adaptive-scheduler.resource-stabilization-timeout"
configuration?
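Points 2 and 3 above can be sketched together: no queue of scaling events, just the time of the last rescale plus at most one deferred re-check, with the timer reset after recovery. A minimal sketch under those assumptions (hypothetical names, not the actual AdaptiveScheduler API):

```python
import time


class CooldownGate:
    """Sketch of points 2 and 3: instead of storing scaling events, remember
    only when the last rescale happened and schedule at most one re-check for
    when the cooldown expires; that check then looks at *current* resources."""

    def __init__(self, cooldown_s, clock=time.monotonic):
        self.cooldown_s = cooldown_s
        self.clock = clock
        self.last_rescale = None   # None: no rescale has happened yet
        self.check_due_at = None   # single deferred check, not an event queue

    def on_resources_changed(self):
        """Called whenever TaskManagers come or go. Returns 'rescale' if the
        execution graph may be adjusted now, 'deferred' if a single check was
        scheduled for the end of the cooldown, 'pending' if one already is."""
        now = self.clock()
        if self.last_rescale is None or now - self.last_rescale >= self.cooldown_s:
            self.last_rescale = now
            self.check_due_at = None
            return "rescale"
        if self.check_due_at is None:
            self.check_due_at = self.last_rescale + self.cooldown_s
            return "deferred"
        return "pending"

    def reset_after_recovery(self):
        """Point 3's proposal: after JobManager recovery the timer state is
        gone, so restart the cooldown and wait before rescaling again."""
        self.last_rescale = self.clock()
        self.check_due_at = None
```

A resource change during the cooldown only records the due time of one re-check; later changes before that time collapse into it, which is why no event storage is needed.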

Thanks a lot for working on this!

Best,
Robert



Re: [DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-06-14 Thread Etienne Chauchot

Hi all,

@Yuxia, I updated the FLIP to include the aggregation of the stacked 
operations that we discussed below. PTAL.


Best

Etienne



Re: [DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-06-13 Thread Etienne Chauchot

Hi Yuxia,

Thanks for your feedback. The number of potentially stacked operations 
depends on the configured length of the cooldown period.




The proposition in the FLIP is to add a minimum delay between two scaling
operations. But, indeed, an optimization could be to still stack the
operations that arrive during a cooldown period and, rather than take
only the last one, aggregate them so as to end up with a single
aggregated operation when the cooldown period ends. For example, if 3
TaskManagers come up and 1 comes down during the cooldown period, we
could generate a single scale-up operation of +2 when the period ends.
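The aggregation described above amounts to collapsing the stacked deltas into their net sum. A minimal sketch (illustrative only, not Flink code; each event is modeled as a signed TaskManager-count delta):

```python
def aggregate_scaling_events(deltas):
    """Collapse the scaling events stacked during a cooldown period into a
    single net operation, instead of replaying them all or keeping only
    the last one."""
    return sum(deltas)


# 3 TaskManagers come up and 1 comes down during the cooldown period:
net = aggregate_scaling_events([+1, +1, +1, -1])  # -> 2: one scale-up of +2
```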

As a side note regarding your comment on "it'll take a long time to 
finish all", please keep in mind that the reactive mode (at least for 
now) is only available for streaming pipelines, which are in essence 
infinite processing.


Another side note: when you mention "every taskManagers connecting", if 
you are referring to the start of the pipeline, please keep in mind that 
the adaptive scheduler has a "waiting for resources" timeout period 
before starting the pipeline, during which all TaskManagers connect and 
the parallelism is decided.


Best

Etienne

Le 13/06/2023 à 03:58, yuxia a écrit :

Hi, Etienne. Thanks for driving it. I have one question about the
mechanism of the cooldown timeout.

From the Proposed Changes part, if a scalling event is received and
it falls during the cooldown period, it'll be stacked to be executed
after the period ends. Also, from the description of FLINK-21883[1],
cooldown timeout is to avoid rescaling the job very frequently,
because TaskManagers are not all connecting at the same time.

So, is it possible that every taskmanager connecting will produce a
scalling event and it'll be stacked with many scale up event which
causes it'll take a long time to finish all? Can we just take the
last one event?

[1]: https://issues.apache.org/jira/browse/FLINK-21883

Best regards, Yuxia

- 原始邮件 - 发件人: "Etienne Chauchot"  收件人:
"dev" , "Robert Metzger"  
发送时间: 星期一, 2023年 6 月 12日 下午 11:34:25 主题: [DISCUSS] FLIP-322 Cooldown

period for adaptive scheduler

Hi,

I’d like to start a discussion about FLIP-322 [1] which introduces a 
cooldown period for the adaptive scheduler.


I'd like to get your feedback especially @Robert as you opened the 
related ticket and worked on the reactive mode a lot.


[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-322+Cooldown+period+for+adaptive+scheduler





Best


Etienne


Re: [DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-06-12 Thread yuxia
Hi, Etienne.
Thanks for driving it.
I have one question about the mechanism of the cooldown timeout.

From the Proposed Changes part, if a scaling event is received and it falls 
during the cooldown period, it'll be stacked to be executed after the period 
ends. Also, from the description of FLINK-21883 [1], the cooldown timeout is 
meant to avoid rescaling the job very frequently, because TaskManagers are not 
all connecting at the same time.

So, is it possible that every TaskManager connecting will produce a scaling 
event, and that it'll be stacked with many scale-up events, which causes it to 
take a long time to finish them all?
Can we just take the last event?

[1]: https://issues.apache.org/jira/browse/FLINK-21883

Best regards,
Yuxia



[DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-06-12 Thread Etienne Chauchot

Hi,

I’d like to start a discussion about FLIP-322 [1] which introduces a 
cooldown period for the adaptive scheduler.


I'd like to get your feedback, especially @Robert, as you opened the 
related ticket and worked on the reactive mode a lot.


[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-322+Cooldown+period+for+adaptive+scheduler


Best

Etienne