> What I proposed is having a single request to YARN to get all applications'
> statuses, if that's possible. You'd still have multiple application handles
> that are independent of each other. They'd all be updated separately from
> that one thread talking to YARN. This has nothing to do with a "shared data
> structure". There's no shared data structure here to track application
> status.

You are still avoiding the question of how you make all "application handles"
accessible to this thread.

Please keep the discussion direct.

> No, but I suggested that you look whether that exists since I think that's
> a better solution both from YARN and Livy's perspectives, since it requires
> less resources. It should at least be mentioned as an alternative in your
> mini-spec and, if it doesn't work for whatever reason, deserves an
> explanation.

"I would investigate whether there's any API in YARN to do a bulk get of
running applications with a particular filter;" - from your email

If you suggest something, please find evidence to support it.

> Irrelevant.

Please keep the discussion direct.

> What if YARN goes down? What if your datacenter has a massive power
> failure? You have to handle errors in any scenario.

Again, I am describing one concrete scenario that is involved in any bulk
operation; even if we go in the bulk direction, you have to handle it.
Since you proposed this bulk operation, I am asking what your expectation
is for these cases. Instead, you are throwing out hypotheticals that add
no value.

Please keep the discussion direct.
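
For reference, a minimal Java sketch of the design being debated: one polling round issues a single bulk request and then updates each application handle separately. Everything here is an assumption made for illustration. `StatusSource` is a hypothetical stand-in for a bulk status call (nobody in this thread has confirmed that YARN offers one), and `BulkPoller`, `AppHandle`, and the string states are invented names. The registry map is just one possible answer to how the handles become reachable from the polling thread. The partial-failure policy (mark only the affected handle) is likewise only one option.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class BulkPoller {
    /** Hypothetical bulk status call; NOT a real YARN API. */
    interface StatusSource {
        Map<String, String> fetchStatuses(Set<String> appIds) throws Exception;
    }

    /** One handle per application; each one holds only its own state. */
    static final class AppHandle {
        private volatile String state = "UNKNOWN";
        private volatile String lastError = null;

        void update(String newState) { state = newState; lastError = null; }
        void fail(String error) { lastError = error; }
        String state() { return state; }
        Optional<String> lastError() { return Optional.ofNullable(lastError); }
    }

    // Registry that makes the handles reachable from the polling thread.
    private final Map<String, AppHandle> handles = new ConcurrentHashMap<>();
    private final StatusSource source;

    BulkPoller(StatusSource source) { this.source = source; }

    AppHandle register(String appId) {
        return handles.computeIfAbsent(appId, id -> new AppHandle());
    }

    /** One polling round: a single request, then per-handle updates. */
    void pollOnce() {
        Map<String, String> statuses;
        try {
            statuses = source.fetchStatuses(new HashSet<>(handles.keySet()));
        } catch (Exception e) {
            // Whole request failed: record the error on every handle, keep old states.
            handles.values().forEach(h -> h.fail("poll failed: " + e.getMessage()));
            return;
        }
        handles.forEach((id, handle) -> {
            String s = statuses.get(id);
            if (s == null || s.isEmpty()) {
                // Partial failure: only this handle is marked, the rest still update.
                handle.fail("missing or malformed status");
            } else {
                handle.update(s);
            }
        });
    }
}
```

With this policy, the "999 succeed, 1 fails" cases leave 999 handles updated and one handle carrying an error until the next round; whether that is the desired behavior is exactly the open question above.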




On Wed, Aug 16, 2017 at 9:11 AM, Marcelo Vanzin <van...@cloudera.com> wrote:

> On Wed, Aug 16, 2017 at 9:06 AM, Nan Zhu <zhunanmcg...@gmail.com> wrote:
> >> I'm not really sure what you're talking about here, since I did not
> >> suggest a "shared data structure", and I'm not really sure what that
> >> means in this context.
> >
> > What you claimed is just monitoring/updating the state with a single
> > thread *given* all applications have been there.
>
> What I proposed is having a single request to YARN to get all
> applications' statuses, if that's possible. You'd still have multiple
> application handles that are independent of each other. They'd all be
> updated separately from that one thread talking to YARN.
>
> This has nothing to do with a "shared data structure". There's no
> shared data structure here to track application status.
>
> >> Yes. While there are applications that need monitoring, you poll YARN
> >> at a constant frequency. Basically what would be done by multiple
> >> threads, but there's a single one.
> >
> > Did you find the bulk API?
>
> No, but I suggested that you look whether that exists since I think
> that's a better solution both from YARN and Livy's perspectives, since
> it requires less resources. It should at least be mentioned as an
> alternative in your mini-spec and, if it doesn't work for whatever
> reason, deserves an explanation.
>
> >> Why not. The expensive part is not parsing results, I'll bet, but
> >> having a whole bunch of different tasks opening and closing YARN
> >> connections.
> >
> > First, YARNClient is thread safe and can be shared by multiple
> > threads....
>
> Irrelevant.
>
> > Second, If I have 1000 applications, what's your expectation to the
> > following cases
> >
> > 1. YARN processed request for 999 and failed on the last one for some
> > reason
> >
> > 2. Livy received 999 well-formatted response but get 1 malformed response
>
> What if YARN goes down? What if your datacenter has a massive power
> failure?
>
> You have to handle errors in any scenario.
>
>
> --
> Marcelo
>
