This sounds like a good plan to me, Gyula. And there is always the Flink
packages option if we want to make it available earlier.

Cheers,
Till

On Fri, May 15, 2020 at 10:12 AM Gyula Fóra <gyula.f...@gmail.com> wrote:

> Hi Till!
>
> I agree to some extent that managing multiple clusters is not Flink's
> primary responsibility.
>
> However, many (if not most) production users run Flink in per-job-cluster
> mode, which gives superior configurability and resource isolation compared
> to the standalone/session modes.
> Still, the best job management experience is on standalone clusters, where
> users see all the jobs and can interact with them purely by their unique
> job IDs.
>
> This is the mismatch we were trying to resolve here, to get the best of
> both worlds. This of course only concerns production users running many
> different jobs, so we can definitely call it an enterprise feature.
>
> I agree that this would be new code to maintain in contrast to the current
> history server which "just works".
>
> We are completely okay with not adding this to Flink just yet, as it will
> be part of the next Cloudera Flink release anyway. We will test-run it
> there, gather production feedback for the Flink community, and can make a
> better decision afterwards when we see the real value.
>
> Cheers,
> Gyula
>
>
>
> On Thu, May 14, 2020 at 3:36 PM Till Rohrmann <trohrm...@apache.org>
> wrote:
>
> > Hi Gyula,
> >
> > thanks for proposing this extension. I can see that such a feature could
> > be helpful.
> >
> > However, I wouldn't consider the management of multiple clusters core to
> > Flink. Managing a single cluster is already complex enough and given the
> > available community capacity I would rather concentrate on doing this
> > aspect right instead of adding more complexity and more code to maintain.
> >
> > Maybe we could add this feature as a Flink package instead. That way it
> > would still be available to our users. If it gains enough traction then
> > we can also add it to Flink later. What do you think?
> >
> > Cheers,
> > Till
> >
> > On Wed, May 13, 2020 at 11:36 AM Gyula Fóra <gyula.f...@gmail.com>
> wrote:
> >
> > > It seems that not everyone can see the screenshot in the email, so here
> > > is a link:
> > >
> > > https://drive.google.com/open?id=1abrlpI976NFqOZSX20k2FoiAfVhBbER9
> > >
> > > On Wed, May 13, 2020 at 11:29 AM Gyula Fóra <gyula.f...@gmail.com>
> > wrote:
> > >
> > > > Oops I forgot the screenshot, thanks Ufuk :D
> > > >
> > > >
> > > > @Jeff Zhang <zjf...@gmail.com> : Yes, we simply call the individual
> > > > clusters' REST endpoints, so it would work with multiple Flink
> > > > versions.
> > > > Gyula
> > > >
> > > >
> > > > On Wed, May 13, 2020 at 10:56 AM Jeff Zhang <zjf...@gmail.com>
> wrote:
> > > >
> > > >> Hi Gyula,
> > > >>
> > > >> Big +1 for this, it would be very helpful for Flink job and cluster
> > > >> operations. Do you call the Flink REST API to gather the job info? I
> > > >> hope this history server could work with multiple versions of Flink
> > > >> as long as the Flink REST API is compatible.
> > > >>
> > > >> On Wed, May 13, 2020 at 4:13 PM, Gyula Fóra <gyula.f...@gmail.com> wrote:
> > > >>
> > > >> > Hi All!
> > > >> >
> > > >> > With the growing number of Flink streaming applications, the
> > > >> > current HS implementation is starting to lose its value. Users
> > > >> > running streaming applications mostly care about what is running
> > > >> > right now on the cluster, and a centralised view on history is not
> > > >> > very useful.
> > > >> >
> > > >> > We have been experimenting with reworking the current HS into a
> > > >> > Global Flink Dashboard that would show all running and
> > > >> > completed/failed jobs on all the running Flink clusters the users
> > > >> > have.
> > > >> >
> > > >> > In essence we would get a view similar to the current HS, but it
> > > >> > would also show the running jobs, with a link redirecting to the
> > > >> > actual cluster-specific dashboard.
> > > >> >
> > > >> > This is how it looks now:
> > > >> >
> > > >> >
> > > >> > In this version we took a very simple approach of introducing a
> > > >> > cluster discovery abstraction to collect all the running Flink
> > > >> > clusters (by listing YARN apps, for instance).
> > > >> >
> > > >> > The main pages aggregating jobs from different clusters would then
> > > >> > simply make calls to all clusters and aggregate the responses.
> > > >> > Job-specific endpoints would simply be routed to the correct target
> > > >> > cluster. This way the changes required are localised to the current
> > > >> > HS implementation, and the cluster REST endpoints don't need to be
> > > >> > changed.
> > > >> >
> > > >> > In addition to getting a fully working global dashboard, this also
> > > >> > gets us a fully functioning REST endpoint for accessing all jobs in
> > > >> > all clusters without having to provide the clusterId (YARN app id,
> > > >> > for instance), which we can use to enhance the CLI experience in
> > > >> > multi-cluster (lots of per-job clusters) environments.
> > > >> >
> > > >> > Please let us know what you think!
> > > >> > Gyula
> > > >> >
> > > >>
> > > >>
> > > >> --
> > > >> Best Regards
> > > >>
> > > >> Jeff Zhang
> > > >>
> > > >
> > >
> >
>
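For readers skimming the thread, the fan-out-and-merge design Gyula describes (query every discovered cluster's REST endpoint, merge the job overviews, and tag each job so job-specific requests can be routed back) can be sketched roughly as follows. This is only an illustration, not the actual implementation: the cluster-id-to-URL mapping and the helper names `fetch_overview`/`merge_overviews` are hypothetical, while `/jobs/overview` is Flink's standard REST endpoint for listing jobs.

```python
# Rough sketch of the aggregation approach described in the thread.
# Assumptions (not from the original mail): discovery yields a mapping
# of cluster id -> REST base URL (e.g. from listing YARN apps), and the
# helper names below are made up for illustration.
import json
from urllib.request import urlopen

def fetch_overview(base_url):
    """Fetch the job overview from one Flink cluster's REST API.
    /jobs/overview returns a JSON object of the form {"jobs": [...]}."""
    with urlopen(base_url + "/jobs/overview") as resp:
        return json.load(resp)["jobs"]

def merge_overviews(overviews):
    """Merge per-cluster job lists into one global list, tagging each job
    with its cluster id so job-specific requests can later be routed back
    to the right cluster's REST endpoint."""
    merged = []
    for cluster_id, jobs in overviews.items():
        for job in jobs:
            merged.append({**job, "cluster-id": cluster_id})
    return merged

# A global jobs page would then do something like:
#   merge_overviews({cid: fetch_overview(url)
#                    for cid, url in discovered_clusters.items()})
```

The tagging step is what makes the routing half of the proposal work: given a request for a specific job, the dashboard can look up the job's cluster id and proxy the call to that cluster's own REST endpoint unchanged.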
