>
> The adaptive scheduler only supports streaming jobs. That's the biggest
> limitation that probably won't be fixed anytime soon.


Since FLIP-283 [1] has been accepted, I think this limitation might have
already been addressed to a certain extent. I'd be completely fine with
having a separate scheduler for batch and streaming (maybe we could build a
hybrid one at some point that automatically switches between the two).

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-283%3A+Use+adaptive+batch+scheduler+as+default+scheduler+for+batch+jobs


On Fri, Jan 27, 2023 at 9:58 AM Chesnay Schepler <ches...@apache.org> wrote:

> The adaptive scheduler only supports streaming jobs. That's the biggest
> limitation that probably won't be fixed anytime soon.
> The goal was though to make the adaptive scheduler the default for
> streaming jobs eventually.
> It was very much meant as a better version of the default scheduler for
> streaming jobs.
>
> On 26/01/2023 19:06, David Morávek wrote:
> > Hi Gyula,
> >
> >
> >> can you please explain why the AdaptiveScheduler is not the default
> >> scheduler?
> >
> > There are still some smaller bits missing. As far as I know, the missing
> > parts are:
> >
> > 1) Local recovery (reusing the already downloaded state files after
> > restart / rescale)
> > 2) Support for fine-grained resource management
> > 3) Support for the session cluster (Chesnay will be submitting a FLIP for
> > this soon)
> >
> > We're looking into addressing all of these limitations in the short term.
> >
> > Personally, I'd love to start a discussion about transitioning the
> > AdaptiveScheduler into the default one after those limitations are fixed.
> > Being able to eventually deprecate and remove the DefaultScheduler would
> > simplify the code-base by a lot since there are many adapters between new
> > and old interfaces (eg. SlotPool-related interfaces).
> >
> > Best,
> > D.
> >
> > On Thu, Jan 26, 2023 at 6:27 PM Gyula Fóra <gyula.f...@gmail.com> wrote:
> >
> >> Chesnay,
> >>
> >> Seems like you are suggesting that the Adaptive scheduler does everything
> >> the standard scheduler does and more.
> >>
> >> I am clearly not an expert on this topic but can you please explain why the
> >> AdaptiveScheduler is not the default scheduler?
> >> If it can do everything, why do we even have 2 schedulers? Why not simply
> >> drop the "old" one?
> >>
> >> That would probably clear up all confusions then :)
> >>
> >> Gyula
> >>
> >> On Thu, Jan 26, 2023 at 6:23 PM Chesnay Schepler <ches...@apache.org>
> >> wrote:
> >>
> >>> There's the default and reactive mode; nothing else.
> >>> At its core they are the same thing; reactive mode just cranks up the
> >>> desired parallelism to infinity and enforces certain assumptions (e.g.,
> >>> no active resource management).
> >>>
> >>> The advantage is that the adaptive scheduler can run jobs while
> >>> insufficient resources are available, and scale things up again once they
> >>> are available.
> >>> This is its core functionality, but we always intended to extend it
> >>> such that users can modify the parallelism at runtime as well.
> >>> And since the AS can already rescale jobs (and was purpose-built with
> >>> that functionality in mind), this is just a matter of exposing an API
> >>> for it. Everything else is already there.
> >>>
> >>> As a concrete use-case, let's say you have an SLA that says jobs must
> >>> not be down longer than X seconds, and a TM just crashed.
> >>> If you can absolutely guarantee that your k8s cluster can provision a
> >>> new TM within X seconds, no matter what cruel reality has in store for
> >>> you, then you /may/ not need it.
> >>> If you can't, well then here's a use-case for you.
> >>>
> >>>   > Last time I looked they implemented the same interface and the same
> >>> base class. Of course, their behavior is quite different.
> >>>
> >>> They never shared a base class since day 1. Are you maybe mixing up the
> >>> AdaptiveScheduler and AdaptiveBatchScheduler?
> >>>
> >>> As for FLINK-30773, I think that should be covered.
> >>>
> >>> On 26/01/2023 17:10, Maximilian Michels wrote:
> >>>> Thanks for the explanation. If not for the "reactive mode", what is
> >>>> the advantage of the adaptive scheduler? What other modes does it
> >>>> support?
> >>>>
> >>>>> Apart from implementing the same interface the implementations of the
> >>>>> adaptive and default schedulers are separate.
> >>>> Last time I looked they implemented the same interface and the same
> >>>> base class. Of course, their behavior is quite different.
> >>>>
> >>>> I'm still very interested in learning about the future FLIPs
> >>>> mentioned. Based on the replies, I'm assuming that they will support
> >>>> the changes required for
> >>>> https://issues.apache.org/jira/browse/FLINK-30773, or at least provide
> >>>> the basis for implementing them.
> >>>>
> >>>> -Max
> >>>>
> >>>> On Thu, Jan 26, 2023 at 4:57 PM Chesnay Schepler <ches...@apache.org> wrote:
> >>>>> On 26/01/2023 16:18, Maximilian Michels wrote:
> >>>>>
> >>>>> I see slightly different goals for the standard and the adaptive
> >>>>> scheduler. The adaptive scheduler's goal is to adapt the Flink job
> >>>>> according to the available resources.
> >>>>>
> >>>>> This is really a misconception that we just have to stomp out.
> >>>>>
> >>>>> This statement only applies to reactive mode, a special mode that the
> >>>>> adaptive scheduler (AS) can run in, where active resource management is
> >>>>> not supported since requesting infinite resources from k8s doesn't really
> >>>>> make sense.
> >>>>> The AS itself can work perfectly fine with active resource management,
> >>>>> and has no effect on how the RM talks to k8s. It can just keep the job
> >>>>> running in cases where fewer resources than desired (== user-provided
> >>>>> parallelism) are provided by k8s (possibly temporarily).
> >>>>> On 26/01/2023 16:18, Maximilian Michels wrote:
> >>>>>
> >>>>> After
> >>>>> all, both schedulers share the same super class
> >>>>>
> >>>>> Apart from implementing the same interface the implementations of the
> >>>>> adaptive and default schedulers are separate.
> >>>
> >>>
>
>
