[ https://issues.apache.org/jira/browse/MAPREDUCE-479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jiaqi Tan updated MAPREDUCE-479:
--------------------------------
Release Note: Adds Reduce Attempt ID to ClientTrace log messages, and adds
Reduce Attempt ID to HTTP query string sent to mapOutputServlet. (was: Adds
Reduce ID to ClientTrace log messages. Explicitly uses new mapreduce.JobID for
compatibility with updated TaskID constructor.)
Status: Patch Available (was: Open)
I would prefer adding the reduce attempt ID to the HTTP query string, because
it removes the need to assume that no two attempts of the same task can run on
the same node. I can see scenarios where a custom scheduler breaks this
assumption, which would make tracing very complicated. The additional network
traffic from the reduce attempt ID should be minimal, and much smaller than the
total data shuffled in a typical job.
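
To make the proposal concrete, here is a rough sketch of how a shuffle request
URL might carry the reduce attempt ID, and how many bytes the extra parameter
adds per request. It is not the code from the attached patch. The class name,
the `reduceAttempt` parameter name, the host and port, and the sample IDs are
all made up for illustration, and the `job`/`map`/`reduce` query layout is
assumed rather than taken from this message.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class ShuffleUrlSketch {
    // Hypothetical parameter name; the name used by the actual patch is not
    // stated in this message.
    static final String REDUCE_ATTEMPT_PARAM = "reduceAttempt";

    // URL-encode a single query value as UTF-8.
    static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Build a map-output fetch URL that also identifies the requesting
    // reduce attempt. The job/map/reduce layout is an assumption.
    public static String buildMapOutputUrl(String trackerHttp, String jobId,
                                           String mapAttemptId, int reducePartition,
                                           String reduceAttemptId) {
        return trackerHttp + "/mapOutput"
            + "?job=" + enc(jobId)
            + "&map=" + enc(mapAttemptId)
            + "&reduce=" + reducePartition
            + "&" + REDUCE_ATTEMPT_PARAM + "=" + enc(reduceAttemptId);
    }

    // Characters added to each request's query string by the extra parameter.
    public static int extraQueryBytes(String reduceAttemptId) {
        return ("&" + REDUCE_ATTEMPT_PARAM + "=" + enc(reduceAttemptId)).length();
    }

    public static void main(String[] args) {
        // Sample IDs are invented for illustration.
        String reduceAttempt = "attempt_200907150000_0001_r_000003_0";
        String url = buildMapOutputUrl("http://tt1:50060", "job_200907150000_0001",
            "attempt_200907150000_0001_m_000007_0", 3, reduceAttempt);
        System.out.println(url);
        System.out.println("extra bytes per fetch: " + extraQueryBytes(reduceAttempt));
    }
}
```

With attempt IDs of this shape, the extra parameter costs a few dozen bytes per
fetch, which is small compared with the map output transferred in the same
request. Because the serving side receives the full attempt ID, it can log it
directly instead of inferring the attempt from the task ID and the node.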
> Add reduce ID to shuffle clienttrace
> ------------------------------------
>
> Key: MAPREDUCE-479
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-479
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Affects Versions: 0.21.0
> Reporter: Jiaqi Tan
> Assignee: Jiaqi Tan
> Priority: Minor
> Fix For: 0.21.0
>
> Attachments: HADOOP-6013.patch, MAPREDUCE-479-1.patch,
> MAPREDUCE-479.patch
>
>
> Current clienttrace messages from shuffles note only the destination map ID
> but not the source reduce ID. Having both source and destination ID of each
> shuffle enables full tracing of execution.