On Wed, Aug 16, 2017 at 11:17 AM, Nan Zhu <[email protected]> wrote:
> Looks like non-REST API also contains this
> https://hadoop.apache.org/docs/r2.7.0/api/src-html/org/apache/hadoop/yarn/client/api/YarnClient.html#line.225
>
> my concern which was skipped in your last email (again) is that, how many
> app states we want to fetch through this API. What I can see is we cannot
> filter applications since application state can change between two polls,
> any thoughts?

I didn't skip it. I'm intentionally keeping the discussion high level
because there's no code here to compare. It's purely a "multiple
requests for single app state" vs. "single request for multiple
applications' statuses" discussion.

The bulk API I suggested you investigate should be able to support
enough filtering so that Livy only gets the information it needs
(maybe with a little extra noise). It shouldn't get every single YARN
application ever run, for example.

This method is more what I was thinking of:

  public abstract List<ApplicationReport> getApplications(
      Set<String> applicationTypes,
      EnumSet<YarnApplicationState> applicationStates) throws YarnException,
      IOException;

It lets you query apps with a given type and multiple states that
you're interested in. It's not optimal (it doesn't let you filter by
tags, for example), but it's better than getting all apps. Maybe that's
not enough either, but you're proposing the changes, so please explain
why that is not enough instead of just throwing the question back at me.
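
To make the shape of the "single request, then filter" approach
concrete, here is a rough, self-contained sketch. It does not call
YARN: State and Report are hypothetical stand-ins for
YarnApplicationState and ApplicationReport, and the local
getApplications() only mimics the type-and-state filtering of the
method above. keepTracked() and the app ids are likewise made up for
illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class BulkPollSketch {
    // Hypothetical stand-in for YarnApplicationState (subset of states only).
    enum State { ACCEPTED, RUNNING, FINISHED, FAILED, KILLED }

    // Hypothetical stand-in for ApplicationReport, reduced to the fields used below.
    static final class Report {
        final String id;
        final String type;
        final State state;

        Report(String id, String type, State state) {
            this.id = id;
            this.type = type;
            this.state = state;
        }
    }

    // Mimics the filtering of getApplications(types, states):
    // keep reports whose type and state are both in the requested sets.
    static List<Report> getApplications(List<Report> all, Set<String> types,
                                        EnumSet<State> states) {
        List<Report> out = new ArrayList<>();
        for (Report r : all) {
            if (types.contains(r.type) && states.contains(r.state)) {
                out.add(r);
            }
        }
        return out;
    }

    // Client-side step: drop the "extra noise", i.e. apps the caller does not track.
    static List<Report> keepTracked(List<Report> reports, Set<String> trackedIds) {
        List<Report> out = new ArrayList<>();
        for (Report r : reports) {
            if (trackedIds.contains(r.id)) {
                out.add(r);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Report> cluster = Arrays.asList(
            new Report("app_1", "SPARK", State.RUNNING),
            new Report("app_2", "SPARK", State.FINISHED),
            new Report("app_3", "MAPREDUCE", State.RUNNING),
            new Report("app_4", "SPARK", State.ACCEPTED));

        Set<String> types = new HashSet<>(Arrays.asList("SPARK"));
        EnumSet<State> states = EnumSet.of(State.RUNNING, State.FINISHED);
        Set<String> tracked = new HashSet<>(Arrays.asList("app_2", "app_4"));

        // One bulk request, then one local pass over the result.
        for (Report r : keepTracked(getApplications(cluster, types, states), tracked)) {
            System.out.println(r.id + " " + r.state);
        }
    }
}
```

In this toy run app_4 is tracked but does not come back, because
ACCEPTED was not in the requested states. So the set of states passed
in has to cover every state the caller still cares about; whether that
is acceptable for Livy is exactly the part that needs explaining.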

-- 
Marcelo
