Re: FLIP-6 and running many "small" jobs

Maximilian Michels Tue, 25 Oct 2016 08:50:34 -0700

Hi Maciek,

Your use case will be covered by the FLIP-6 "Sessions". Sessions are
similar to how an on-premise Flink cluster or a YARN session operates
today. We will have a long-running dispatcher, resource manager, and
task managers. We will bring up a job manager for each job, but the
overhead of this one extra node (non-HA) is relatively small if you
have a cluster with many nodes. After all, the resource-intensive
computation is performed by the task managers. The job manager is only
responsible for coordinating the job execution.


Note that the dispatcher hosts the web UI and is responsible for
taking care of the job submission. The role of the resource manager
changes slightly to span across jobs. Task managers have always been
able to serve multiple jobs. Dispatcher, resource manager and task
managers live across jobs within a session.
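
To make the lifetimes concrete, here is a rough toy model of a session.
It is only a sketch and not Flink code: the class names, the slot
bookkeeping and the first-fit slot assignment are assumptions made for
illustration. The point it shows is that task managers (and the
dispatcher and resource manager roles) are session-scoped, while a job
manager exists only as long as its job.

```python
from dataclasses import dataclass, field


@dataclass
class TaskManager:
    """Long-running worker whose slots can be used by different jobs."""
    name: str
    slots: int
    assigned: dict = field(default_factory=dict)  # job_id -> slots in use

    def free_slots(self) -> int:
        return self.slots - sum(self.assigned.values())


class Session:
    """Hypothetical session: task managers outlive the jobs they serve."""

    def __init__(self, task_managers):
        self.task_managers = list(task_managers)  # session-scoped
        self.job_managers = {}                    # job-scoped, one per running job

    def submit(self, job_id: str, slots_needed: int) -> None:
        # Dispatcher role: accept the job and start a job manager for it.
        if job_id in self.job_managers:
            raise ValueError(f"job {job_id} is already running")
        if slots_needed > sum(tm.free_slots() for tm in self.task_managers):
            raise RuntimeError("not enough free slots in this session")
        self.job_managers[job_id] = f"jobmanager-{job_id}"

        # Resource manager role: hand out slots across all task managers
        # (first fit, purely for illustration).
        remaining = slots_needed
        for tm in self.task_managers:
            take = min(tm.free_slots(), remaining)
            if take > 0:
                tm.assigned[job_id] = take
                remaining -= take
            if remaining == 0:
                break

    def finish(self, job_id: str) -> None:
        # Only the per-job pieces go away; the session keeps running.
        del self.job_managers[job_id]
        for tm in self.task_managers:
            tm.assigned.pop(job_id, None)


session = Session([TaskManager("tm-1", slots=2), TaskManager("tm-2", slots=2)])
session.submit("filter-a", slots_needed=1)
session.submit("filter-b", slots_needed=2)
session.finish("filter-a")
```

In this sketch, two small jobs end up sharing tm-1, and finishing one of
them removes only its job manager and frees its slots.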

In my opinion, you won't have to change your usage pattern once FLIP-6
is ready, which is targeted for Flink 1.3.0.

-Max


On Thu, Oct 20, 2016 at 10:07 AM, Maciek Próchniak <m...@touk.pl> wrote:
> Hi,
>
> we're looking at FLIP-6 and while it looks really great we started to wonder
> how it fits in our use case.
>
> We currently have around 20 processes but the idea is to have many more of
> them. Many of them are pretty "small" - they don't have large sources, are
> stateless, mainly filtering data.
>
> As I understand, FLIP-6 makes a job an even more heavyweight thing than
> today - e.g. each job will have its own jobmanager process etc.
>
> Our concern is that each job will now require more resources - e.g. the
> number of threads, memory and so on. We are thinking about a way to make
> some jobs share these resources - of course that means they won't be really
> isolated from each other.
>
> So far the only idea we see is deploying these small jobs together, as one
> job - but this leads to some problems, like how to track which version is
> really deployed (we talk about stateless processes so the only problem is
> maintaining source kafka offsets)
>
> Unfortunately our jobs can have many different sources and outcomes, so we
> don't think doing sth similar to King&RBEA would work for us...
>
> Do you have any views/ideas about such a use case? Or is the common view
> that we should deploy our stuff to Mesos and let it handle resource
> allocation? But still - for some jobs we'd need sth like "1/4" slot :)
>
> thanks,
>
> maciek
>
