[
https://issues.apache.org/jira/browse/MAPREDUCE-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated MAPREDUCE-5042:
----------------------------------
Attachment: MAPREDUCE-5042.patch
Patch that adds a unit test to verify that the AM does not attempt recovery if
the shuffle secret was not provided as part of the job credentials.
I also fixed some tests in TestPipeApplication that had hardcoded the unexposed
name of the job token, which changed in this patch.
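
For illustration, the condition the new test exercises could be sketched roughly
as below. This is only a sketch under assumptions, not the code in the patch:
the class name, the "shuffle-secret" alias, and the plain Map standing in for
the job credentials are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a plain Map stands in for the job credentials, and the
// alias below is a hypothetical name, not necessarily the one Hadoop uses.
final class RecoveryGuardSketch {

    static final String SHUFFLE_SECRET_ALIAS = "shuffle-secret"; // hypothetical

    // Recovery is attempted only when a non-empty shuffle secret is present.
    static boolean shouldRecover(Map<String, byte[]> jobCredentials) {
        byte[] secret = jobCredentials.get(SHUFFLE_SECRET_ALIAS);
        return secret != null && secret.length > 0;
    }

    public static void main(String[] args) {
        Map<String, byte[]> credentials = new HashMap<>();
        System.out.println("without secret: " + shouldRecover(credentials));

        credentials.put(SHUFFLE_SECRET_ALIAS, new byte[] {1, 2, 3});
        System.out.println("with secret: " + shouldRecover(credentials));
    }
}
```

The idea is that skipping recovery is the safer choice when the secret is
missing, since the relaunched attempt has nothing to match the secret used
when the recovered map outputs were produced.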
> Reducer unable to fetch for a map task that was recovered
> ---------------------------------------------------------
>
> Key: MAPREDUCE-5042
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5042
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mr-am, security
> Affects Versions: 0.23.7, 2.0.4-alpha
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Priority: Blocker
> Attachments: MAPREDUCE-5042.patch, MAPREDUCE-5042.patch,
> MAPREDUCE-5042.patch
>
>
> If an application attempt fails and is relaunched, the AM will try to recover
> previously completed tasks. If a reducer needs to fetch the output of a map
> task attempt that was recovered, then it will fail with a 401 error like this:
> {noformat}
> java.io.IOException: Server returned HTTP response code: 401 for URL: http://xx:xx/mapOutput?job=job_1361569180491_21845&reduce=0&map=attempt_1361569180491_21845_m_000016_0
> at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1615)
> at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:231)
> at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:156)
> {noformat}
> Looking at the corresponding NM's logs, we see the shuffle failed due to
> "Verification of the hashReply failed".
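
As a rough illustration of how a hash verification like this can fail, the
sketch below signs a request string with one secret and verifies it with
another. It assumes an HMAC-SHA1 scheme and that the two sides ended up holding
different secrets; the class, method names, secrets, and request string are
hypothetical and this is not Hadoop's actual shuffle code.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Illustrative only: hypothetical names, not Hadoop's shuffle classes.
final class ShuffleHashSketch {

    // Base64-encoded HMAC-SHA1 of the message under the given secret.
    static String hash(byte[] secret, String message) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(secret, "HmacSHA1"));
            byte[] digest = mac.doFinal(message.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC computation failed", e);
        }
    }

    // Server side: recompute the hash with the server's secret and compare.
    static boolean verify(byte[] serverSecret, String message, String claimedHash) {
        byte[] expected = hash(serverSecret, message).getBytes(StandardCharsets.UTF_8);
        byte[] claimed = claimedHash.getBytes(StandardCharsets.UTF_8);
        return MessageDigest.isEqual(expected, claimed);
    }

    public static void main(String[] args) {
        // Hypothetical secrets for the original and the relaunched attempt.
        byte[] originalSecret = "secret-of-first-attempt".getBytes(StandardCharsets.UTF_8);
        byte[] relaunchSecret = "secret-of-second-attempt".getBytes(StandardCharsets.UTF_8);
        String request = "/mapOutput?job=job_1&reduce=0&map=attempt_1_m_000016_0";

        String reducerHash = hash(relaunchSecret, request);
        System.out.println("same secret: " + verify(relaunchSecret, request, reducerHash));
        System.out.println("different secret: " + verify(originalSecret, request, reducerHash));
    }
}
```

When the verifier's secret differs from the one the requester signed with, the
comparison fails, and a server in that position would be expected to reject the
fetch, which would be consistent with the 401 seen by the reducer.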