[ 
https://issues.apache.org/jira/browse/HADOOP-3956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12622778#action_12622778
 ] 

Amir Youssefi commented on HADOOP-3956:
---------------------------------------

Sample issues we detect from task/job logs: 

 -  Shuffle

 -  Map Spill

 -  A single lagging reducer (uneven distribution in terms of run time or 
row count)

and more... 
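
As a rough illustration of the lagging-reducer check, here is a minimal sketch. It assumes per-reducer row counts (or run times) have already been extracted from the task logs into a dict. The function name, the task IDs, and the 2x-median threshold are all hypothetical and are not part of any existing Hadoop tool.

```python
from statistics import median


def find_lagging_reducers(metric_by_task, ratio=2.0):
    """Return IDs of tasks whose metric exceeds ratio * the median metric.

    metric_by_task maps a task ID to a number such as input row count or
    run time in seconds. The ratio threshold is an assumed heuristic.
    """
    if not metric_by_task:
        return []
    typical = median(metric_by_task.values())
    if typical <= 0:
        # Without a positive baseline there is nothing to compare against.
        return []
    return sorted(
        task for task, value in metric_by_task.items()
        if value > ratio * typical
    )


# Hypothetical per-reducer input row counts parsed from task logs.
rows = {
    "r_000000": 1000,
    "r_000001": 1100,
    "r_000002": 950,
    "r_000003": 9000,
}
lagging = find_lagging_reducers(rows)
print(lagging)
```

The same function could be run on run times instead of row counts to cover the "time" point of view.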

Runping brought up a good point about the availability of data while the 
process is running. In some installations, logs are gathered by HOD after the 
job has finished, so the user needs to wait for the job to finish to see all 
logs. We can change the logging process to some extent and improve the 
availability of items in the in-progress Live Log. Task counters are available 
as soon as each task finishes.

BTW, the diagram in item 1 above refers to a progress diagram developed by 
Owen O'Malley. 

> map-reduce doctor (Mr Doctor)
> -----------------------------
>
>                 Key: HADOOP-3956
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3956
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: Amir Youssefi
>
> Problem Description: 
>  Users typically submit jobs with sub-optimal parameters resulting in 
> under-utilization, black-listed task-trackers, time-outs, re-tries etc.
>  Issue can be mitigated by submitting job with custom Hadoop parameters.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
