[ 
https://issues.apache.org/jira/browse/MAPREDUCE-479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiaqi Tan updated MAPREDUCE-479:
--------------------------------

    Release Note: Adds Reduce Attempt ID to ClientTrace log messages, and adds 
Reduce Attempt ID to HTTP query string sent to mapOutputServlet.  (was: Adds 
Reduce ID to ClientTrace log messages. Explicitly uses new mapreduce.JobID for 
compatibility with updated TaskID constructor.)
          Status: Patch Available  (was: Open)

I would prefer adding the reduce attempt ID to the HTTP query string because 
this removes the need to assume that no two attempts of the same task can run 
on the same node. I can see scenarios where a custom scheduler breaks this 
assumption, which would make tracing very complicated. The additional network 
traffic from the reduce attempt ID should be minimal, and much smaller than 
the total data shuffled in a typical job. 
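
To make the proposal concrete, here is a rough sketch of how a reducer-side 
fetcher might append the reduce attempt ID to the map output request, and how 
many bytes that adds per fetch. This is not the code from the attached patch: 
the class name, the "reduceAttempt" parameter name, the servlet path, and the 
sample IDs are all assumptions made for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical sketch; names below are illustrative, not taken from the patch.
public class ShuffleUrlSketch {

    // Assumed query parameter name for the reduce attempt ID.
    static final String REDUCE_ATTEMPT_PARAM = "reduceAttempt";

    // URL-encode a value as UTF-8, converting the checked exception.
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Build the request path and query for one map output fetch, including
    // the ID of the reduce attempt that is doing the fetching.
    static String buildMapOutputQuery(String jobId, String mapAttemptId,
                                      int partition, String reduceAttemptId) {
        return "/mapOutput?job=" + enc(jobId)
            + "&map=" + enc(mapAttemptId)
            + "&reduce=" + partition
            + "&" + REDUCE_ATTEMPT_PARAM + "=" + enc(reduceAttemptId);
    }

    // Extra characters added to each request by the new parameter.
    static int extraBytes(String reduceAttemptId) {
        return ("&" + REDUCE_ATTEMPT_PARAM + "=" + enc(reduceAttemptId)).length();
    }

    public static void main(String[] args) {
        // Sample IDs, made up for this example.
        String reduceAttempt = "attempt_200907011234_0001_r_000003_0";
        System.out.println(buildMapOutputQuery(
            "job_200907011234_0001",
            "attempt_200907011234_0001_m_000007_0",
            3,
            reduceAttempt));
        System.out.println("extra bytes per fetch: " + extraBytes(reduceAttempt));
    }
}
```

Under these assumptions the overhead is a few dozen bytes per map output 
fetch, which is small next to the map output segments themselves, and the 
server side can log the attempt ID directly instead of inferring it from the 
requesting host.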

> Add reduce ID to shuffle clienttrace
> ------------------------------------
>
>                 Key: MAPREDUCE-479
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-479
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.21.0
>            Reporter: Jiaqi Tan
>            Assignee: Jiaqi Tan
>            Priority: Minor
>             Fix For: 0.21.0
>
>         Attachments: HADOOP-6013.patch, MAPREDUCE-479-1.patch, 
> MAPREDUCE-479.patch
>
>
> Current clienttrace messages from shuffles note only the destination map ID 
> but not the source reduce ID. Having both the source and destination IDs of 
> each shuffle enables full tracing of execution. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
