[ 
https://issues.apache.org/jira/browse/EAGLE-942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892181#comment-15892181
 ] 

Lantao Jin commented on EAGLE-942:
----------------------------------

In Hadoop 2.7, the RM added more fields to the {{AppInfo}} structure that its 
/apps RESTful API returns. The largest new field is 
{{List<ResourceRequestInfo> resourceRequests}}. For example, we query the 
following URI:
http://<rm http address:port>/ws/v1/cluster/apps?states=running,accepted&limit=2
The response sizes differ greatly between versions:
||Hadoop version||Total characters||Total words||Total lines||Size||
|2.4.1|1192|42|42|1.2 KB|
|2.7.1|1222179|48740|48735|1.21 MB|
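
To put the table in proportion, here is a small sketch of the arithmetic. The class and method names are made up for illustration, and the per-app figure assumes that both responses contained exactly the two apps allowed by limit=2.

```java
// Figures copied from the table above (states=running,accepted&limit=2).
final class AppsResponseGrowth {

    /** How many times larger the newer response is than the older one. */
    static double growthFactor(long newer, long older) {
        return (double) newer / older;
    }

    /** Average characters per returned app (integer division, rounds down). */
    static long charsPerApp(long totalChars, int appsReturned) {
        return totalChars / appsReturned;
    }

    public static void main(String[] args) {
        long chars241 = 1192, chars271 = 1222179;
        long lines241 = 42, lines271 = 48735;
        int appsReturned = 2; // assumption: each response held exactly `limit` apps

        System.out.printf("characters: x%.0f%n", growthFactor(chars271, chars241));
        System.out.printf("lines:      x%.0f%n", growthFactor(lines271, lines241));
        System.out.printf("chars per app on 2.7.1: %d%n",
                charsPerApp(chars271, appsReturned));
    }
}
```

So the same two-app query grows roughly a thousandfold, and under the assumption above each app costs on the order of 600,000 characters on 2.7.1.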

As a result, Eagle's apps queries have become very inefficient, and they could 
overload the RM on a very busy cluster. 
So why not use the YarnClient API to do the same thing? The 
{{ApplicationReport}} object returned by the RPC call is much smaller than the 
JSON result of the {{/apps}} RESTful API, and it does not carry the noisy 
{{ResourceRequestInfo}} data.
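
A minimal sketch of what the YarnClient approach could look like, not the actual Eagle change. The class name {{RunningAppsFetcher}} and its helper are hypothetical, and the code assumes hadoop-yarn-client is on the classpath and that a yarn-site.xml pointing at the RM can be found there.

```java
import java.io.IOException;
import java.util.EnumSet;
import java.util.List;

import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.api.records.YarnApplicationState;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.exceptions.YarnException;

// Hypothetical helper, for illustration only.
final class RunningAppsFetcher {

    // Same filter as states=running,accepted in the REST query.
    static final EnumSet<YarnApplicationState> ACTIVE_STATES =
            EnumSet.of(YarnApplicationState.RUNNING, YarnApplicationState.ACCEPTED);

    /** Asks the RM over RPC for all apps in the active states. */
    static List<ApplicationReport> fetchActiveApps(YarnConfiguration conf)
            throws YarnException, IOException {
        YarnClient client = YarnClient.createYarnClient();
        client.init(conf);
        client.start();
        try {
            return client.getApplications(ACTIVE_STATES);
        } finally {
            client.stop(); // always release the RPC connection
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumption: new YarnConfiguration() picks up yarn-site.xml from the classpath.
        for (ApplicationReport app : fetchActiveApps(new YarnConfiguration())) {
            System.out.printf("%s %s user=%s queue=%s%n",
                    app.getApplicationId(),
                    app.getYarnApplicationState(),
                    app.getUser(),
                    app.getQueue());
        }
    }
}
```

Note that this {{getApplications}} overload takes no limit argument, so a caller that wants the equivalent of limit=2 would have to truncate the returned list itself.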

> Fetch running apps info with Yarn client
> ----------------------------------------
>
>                 Key: EAGLE-942
>                 URL: https://issues.apache.org/jira/browse/EAGLE-942
>             Project: Eagle
>          Issue Type: Improvement
>          Components: App::Job Performance Monitor
>    Affects Versions: v0.5.0
>            Reporter: Zhao, Qingwen
>            Assignee: Lantao Jin
>             Fix For: v0.5.0
>
>
> Since the upgrade to Hadoop 2.7, the /apps API returns much more data than 
> before, which places a heavy burden on the resource manager. 
> Requirements:
> * support multiple Hadoop versions, at least 2.4 and 2.7
> * avoid placing a heavy burden on the resource manager with each request



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
