Hi

Could you give more information? Which version of Hadoop are you using?


>> QueueMetrics.AppsKilled/Failed metrics shows much higher nos i.e ~100. 
>> However RMAuditLogger shows 1 or 2 Apps as Killed/Failed in the logs.
I suspect the logs may have been rolled over. Are there many applications 
running?

The history of all applications is displayed on the RM web UI (provided the RM 
has not been restarted, or RM recovery is enabled). You can check the 
application list there.
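
As a rough sketch of how to compare the RM's application list with the metric, 
the same list can be pulled from the RM REST API (Cluster Applications API) and 
tallied by state. The host name and port below are placeholders, and the 
`states` query parameter is assumed to be available in your Hadoop version; 
the helper names are made up for illustration.

```python
import json
from collections import Counter
from urllib.request import urlopen

# Assumption: placeholder RM web address; replace with your own host and port.
RM_URL = "http://rm-host.example.com:8088"


def count_by_state(payload):
    """Tally applications by state from a Cluster Applications API response."""
    # "apps" can be null when no application matches the filter.
    apps = (payload.get("apps") or {}).get("app") or []
    return Counter(app["state"] for app in apps)


def fetch_apps(rm_url=RM_URL, states="KILLED,FAILED"):
    """Fetch applications in the given states from the RM REST API."""
    with urlopen(f"{rm_url}/ws/v1/cluster/apps?states={states}") as resp:
        return json.load(resp)


if __name__ == "__main__":
    payload = fetch_apps()
    print(count_by_state(payload))
    # Print each application with whatever diagnostics the RM recorded.
    for app in (payload.get("apps") or {}).get("app") or []:
        print(app["id"], app["state"], app.get("diagnostics", ""))
```

If these counts are close to the QueueMetrics values, the audit log is probably 
the incomplete side (for example, because of log rollover).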

To find out why an application was killed or failed, one option is to check 
the NodeManager logs as well. Search them for the container_id values that 
belong to the application in question.
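
A minimal sketch of that search, assuming the usual ID layout in which a 
container ID embeds the cluster timestamp and sequence number of its 
application (application_<ts>_<seq> gives container_<ts>_<seq>_<attempt>_<n>, 
with an optional e<epoch>_ segment in newer releases). The function names and 
the log path passed on the command line are illustrative only.

```python
import re
import sys


def container_pattern(app_id):
    """Build a regex matching container IDs that belong to app_id."""
    m = re.fullmatch(r"application_(\d+)_(\d+)", app_id)
    if not m:
        raise ValueError(f"not an application ID: {app_id!r}")
    ts, seq = m.groups()
    # The optional "e<epoch>_" segment is allowed but not required.
    return re.compile(rf"container_(?:e\d+_)?{ts}_{seq}_\d+_\d+")


def matching_lines(lines, app_id):
    """Return the log lines that mention a container of app_id."""
    pat = container_pattern(app_id)
    return [line for line in lines if pat.search(line)]


if __name__ == "__main__":
    # Usage: python find_containers.py <application_id> <nodemanager-log-file>
    app_id, log_path = sys.argv[1], sys.argv[2]
    with open(log_path, errors="replace") as fh:
        for line in matching_lines(fh, app_id):
            print(line, end="")
```

The lines it prints usually include the container's state transitions and exit 
code, which is where the reason for a failure tends to show up.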

Thanks & Regards
Rohith Sharma K S

From: Suma Shivaprasad [mailto:sumasai.shivapra...@gmail.com]
Sent: 03 February 2015 21:35
To: user@hadoop.apache.org; yarn-...@hadoop.apache.org
Subject: QueueMetrics.AppsKilled/Failed metrics and failure reasons

Hello,

I was trying to debug the reasons for Killed/Failed apps and was checking the 
RM logs (from RMAuditLogger) for the applications that were killed/failed.
The QueueMetrics.AppsKilled/Failed metrics show much higher numbers, i.e. ~100. 
However, RMAuditLogger shows only 1 or 2 apps as Killed/Failed in the logs. Is 
it possible that some logs are missed by AuditLogger, or is it the other way 
round and the metrics are being over-reported?
Thanks
Suma
