Re: resolve the scalability problem caused by app monitoring in livy with an actor-based design

Nan Zhu Wed, 16 Aug 2017 14:10:05 -0700

As time goes on, the reply from YARN can only grow larger. Given a
consistent workload pattern, the cost of one large query can eventually
exceed that of the individual requests.

I would say go with individual requests + a thread pool, or one large batch
for all apps, first; if any performance issue is observed, add the
optimization on top of it.

Regarding how to optimize:

The major issue is that the YarnClient API is a simplified version of the
REST APIs, with fewer filtering parameters.

I looked at the usage of YarnClient in the current implementation (Livy
server only); only the SparkYarnApp class uses it. Since there will be a
big refactoring of this class, replacing YarnClient with a home-made
RESTful client might not be that costly.

*Multiple individual requests:*

Batch the individual requests based on submission time.
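
A minimal sketch of what batching by submission time could look like, in
Python for brevity rather than Livy's Scala. The function name, the
(app_id, submit_time_ms) tuple shape, and the fixed window width are all
assumptions made for illustration; nothing here is taken from the Livy code
base.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def batch_by_submission_time(
    apps: Iterable[Tuple[str, int]], window_ms: int
) -> Dict[Tuple[int, int], List[str]]:
    """Group (app_id, submit_time_ms) pairs into fixed-width time windows.

    Returns a mapping from (window_start_ms, window_end_ms) to the app ids
    submitted inside that window. Hypothetical helper, not a Livy API.
    """
    buckets: Dict[int, List[str]] = defaultdict(list)
    for app_id, submitted_ms in apps:
        # Integer division assigns each app to the window containing it.
        buckets[submitted_ms // window_ms].append(app_id)
    return {
        (idx * window_ms, (idx + 1) * window_ms - 1): ids
        for idx, ids in sorted(buckets.items())
    }
```

Each window could then be served by one request bounded by its start and end
time, instead of one request per app.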

*A single large request:*

The number of fetched app statuses can be limited with, e.g., the
application submission time or a result limit, which are only available
with the REST APIs. However, even with the REST API there are some corner
cases, e.g. a long-running app lasting for days (training some models)
alongside short ones that last only minutes.
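
To make the filtering idea concrete, here is a rough sketch of building such
a query against the ResourceManager's Cluster Applications REST endpoint
(/ws/v1/cluster/apps), using its states, applicationTypes, startedTimeBegin
and limit query parameters. It is written in Python for brevity; the function
name, the defaults, and the host in the usage example are assumptions, and
the sketch only builds the URL without sending anything.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode


def build_apps_query(
    rm_base_url: str,
    started_after_ms: Optional[int] = None,
    limit: Optional[int] = None,
    states: Sequence[str] = ("RUNNING",),
    app_types: Sequence[str] = ("SPARK",),
) -> str:
    """Build a filtered Cluster Applications URL (hypothetical helper)."""
    params = {}
    if states:
        params["states"] = ",".join(states)
    if app_types:
        params["applicationTypes"] = ",".join(app_types)
    if started_after_ms is not None:
        # Only apps started at or after this epoch-millisecond timestamp.
        params["startedTimeBegin"] = str(started_after_ms)
    if limit is not None:
        params["limit"] = str(limit)
    # Keep commas unescaped so multi-valued filters stay readable.
    query = urlencode(params, safe=",")
    return f"{rm_base_url.rstrip('/')}/ws/v1/cluster/apps?{query}"


# Example host is a placeholder, not a real cluster.
url = build_apps_query(
    "http://rm.example.com:8088", started_after_ms=1502900000000, limit=50
)
```

The corner case above shows up directly here: started_after_ms has to be no
later than the submission time of the oldest app still being tracked, so a
single app running for days widens the window for every query.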

Best,

Nan

On Wed, Aug 16, 2017 at 1:01 PM, Marcelo Vanzin <[email protected]> wrote:

> On Wed, Aug 16, 2017 at 12:57 PM, Nan Zhu <[email protected]> wrote:
> > yes, we finally converge on the idea
> >
> > how large can the reply be? what if I have only one running application
> > and still need to fetch 1000?
> >
> > on the other side
> >
> > I have 1000 running apps, what's the cost of sending 1000 requests even
> > if the thread pool and yarn client are shared?
>
> I don't know the answers, but I'm asking you, since you are proposing
> the design, to consider that as an option, since it does not seem like
> you considered that tradeoff when suggesting your current approach.
>
> My comments about filtering are targeted at making things better in
> your first case; if there's really only one app being monitored, and
> you can figure out a filter that returns let's say 50 apps instead of
> 1000 that may be monitored by YARN, then you can do that.
>
> Or maybe you can go with a hybrid approach, where you use individual
> requests but past a certain threshold you fall back to bulk requests
> to avoid overloading YARN.
>
> Again, I'm asking you to consider alternatives that are not mentioned
> in your design document, because I identified potential performance
> issues in the current approach.
>
>
> --
> Marcelo
>
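
The hybrid approach suggested in the quoted message (individual requests up
to a threshold, bulk requests beyond it) could be sketched roughly as
follows. This is Python for brevity; the function name, the request tuple
shape, and the threshold value are assumptions for illustration, and the
right threshold would have to come from measurement.

```python
from typing import List, Sequence, Tuple

Request = Tuple[str, Tuple[str, ...]]


def plan_requests(app_ids: Sequence[str], bulk_threshold: int) -> List[Request]:
    """Decide how to poll YARN for the given apps (hypothetical helper).

    Up to bulk_threshold apps: one "individual" request per app.
    Above it: a single "bulk" request covering all of them.
    """
    if len(app_ids) > bulk_threshold:
        return [("bulk", tuple(app_ids))]
    return [("individual", (app_id,)) for app_id in app_ids]
```

With a threshold of 2, two monitored apps produce two individual requests,
while three produce one bulk request.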
