[jira] [Commented] (MAPREDUCE-5042) Reducer unable to fetch for a map task that was recovered
[ https://issues.apache.org/jira/browse/MAPREDUCE-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604207#comment-13604207 ]

Hudson commented on MAPREDUCE-5042:

Integrated in Hadoop-Yarn-trunk #157 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/157/])
MAPREDUCE-5042. Reducer unable to fetch for a map task that was recovered (Jason Lowe via bobby) (Revision 1457119)

Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1457119
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/YarnChild.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRApp.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRecovery.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestJobImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Task.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobSubmitter.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/security/TokenCache.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/pipes/TestPipeApplication.java

Reducer unable to fetch for a map task that was recovered

Key: MAPREDUCE-5042
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5042
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mr-am, security
Affects Versions: 0.23.7, 2.0.4-alpha
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker
Fix For: 3.0.0, 0.23.7, 2.0.5-beta
Attachments: MAPREDUCE-5042.patch, MAPREDUCE-5042.patch, MAPREDUCE-5042.patch, MAPREDUCE-5042.patch

If an application attempt fails and is relaunched the AM will try to recover previously completed tasks. If a reducer needs to fetch the output of a map task attempt that was recovered then it will fail with a 401 error like this:

{noformat}
java.io.IOException: Server returned HTTP response code: 401 for URL: http://xx:xx/mapOutput?job=job_1361569180491_21845&reduce=0&map=attempt_1361569180491_21845_m_16_0
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1615)
	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:231)
	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:156)
{noformat}

Looking at the corresponding NM's logs, we see the shuffle failed due to "Verification of the hashReply failed".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
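The 401 in the description is what one would expect when a request hash is computed with one secret and checked against another. The following is a minimal sketch of that failure mode, not the Hadoop shuffle code: the class and method names are hypothetical, HMAC-SHA1 over the request path is an assumption made for illustration, and which side ends up holding the stale secret after recovery is also an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical illustration only; none of these names come from Hadoop.
public class ShuffleHashSketch {

    // Compute a Base64-encoded HMAC-SHA1 of the message using the given secret.
    static String sign(String message, byte[] secret) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(secret, "HmacSHA1"));
            byte[] digest = mac.doFinal(message.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA1 unavailable", e);
        }
    }

    // Recompute the hash with the verifier's own secret and compare in constant time.
    static boolean verify(String message, String presentedHash, byte[] verifierSecret) {
        byte[] expected = sign(message, verifierSecret).getBytes(StandardCharsets.UTF_8);
        return MessageDigest.isEqual(expected, presentedHash.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Assumption: each app attempt generated its own secret.
        byte[] firstAttemptSecret = "secret-of-attempt-1".getBytes(StandardCharsets.UTF_8);
        byte[] secondAttemptSecret = "secret-of-attempt-2".getBytes(StandardCharsets.UTF_8);
        String request = "/mapOutput?job=job_1&reduce=0&map=attempt_1_m_16_0";

        // Same secret on both sides: the fetch would be accepted.
        System.out.println(verify(request, sign(request, firstAttemptSecret), firstAttemptSecret));
        // Secrets from different attempts: verification fails, which maps to a 401.
        System.out.println(verify(request, sign(request, secondAttemptSecret), firstAttemptSecret));
    }
}
```

Under these assumptions the first check succeeds and the second fails, which matches the symptom reported for recovered map outputs.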
[jira] [Commented] (MAPREDUCE-5042) Reducer unable to fetch for a map task that was recovered
[ https://issues.apache.org/jira/browse/MAPREDUCE-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604235#comment-13604235 ]

Hudson commented on MAPREDUCE-5042:

Integrated in Hadoop-Hdfs-0.23-Build #555 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/555/])
svn merge -c 1457119 FIXES: MAPREDUCE-5042. Reducer unable to fetch for a map task that was recovered (Jason Lowe via bobby) (Revision 1457123)

Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1457123
Files :
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/YarnChild.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRApp.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRecovery.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestJobImpl.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Task.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobSubmitter.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/security/TokenCache.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/pipes/TestPipeApplication.java
[jira] [Commented] (MAPREDUCE-5042) Reducer unable to fetch for a map task that was recovered
[ https://issues.apache.org/jira/browse/MAPREDUCE-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604245#comment-13604245 ]

Hudson commented on MAPREDUCE-5042:

Integrated in Hadoop-Hdfs-trunk #1346 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1346/])
MAPREDUCE-5042. Reducer unable to fetch for a map task that was recovered (Jason Lowe via bobby) (Revision 1457119)

Result = FAILURE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1457119
[jira] [Commented] (MAPREDUCE-5042) Reducer unable to fetch for a map task that was recovered
[ https://issues.apache.org/jira/browse/MAPREDUCE-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604267#comment-13604267 ]

Hudson commented on MAPREDUCE-5042:

Integrated in Hadoop-Mapreduce-trunk #1374 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1374/])
MAPREDUCE-5042. Reducer unable to fetch for a map task that was recovered (Jason Lowe via bobby) (Revision 1457119)

Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1457119
[jira] [Commented] (MAPREDUCE-5042) Reducer unable to fetch for a map task that was recovered
[ https://issues.apache.org/jira/browse/MAPREDUCE-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603872#comment-13603872 ]

Hudson commented on MAPREDUCE-5042:

Integrated in Hadoop-trunk-Commit #3483 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/3483/])
MAPREDUCE-5042. Reducer unable to fetch for a map task that was recovered (Jason Lowe via bobby) (Revision 1457119)

Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1457119
[jira] [Commented] (MAPREDUCE-5042) Reducer unable to fetch for a map task that was recovered
[ https://issues.apache.org/jira/browse/MAPREDUCE-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13601266#comment-13601266 ]

Robert Joseph Evans commented on MAPREDUCE-5042:

The patch looks good to me. I am a +1 for it. But I'll wait a couple of hours before checking it in, to give anyone else time to comment if they want to.
[jira] [Commented] (MAPREDUCE-5042) Reducer unable to fetch for a map task that was recovered
[ https://issues.apache.org/jira/browse/MAPREDUCE-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600544#comment-13600544 ]

Hadoop QA commented on MAPREDUCE-5042:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12573413/MAPREDUCE-5042.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files.
{color:red}-1 one of tests included doesn't have a timeout.{color}
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3408//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3408//console

This message is automatically generated.
[jira] [Commented] (MAPREDUCE-5042) Reducer unable to fetch for a map task that was recovered
[ https://issues.apache.org/jira/browse/MAPREDUCE-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600630#comment-13600630 ]

Hadoop QA commented on MAPREDUCE-5042:

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12573432/MAPREDUCE-5042.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files.
{color:green}+1 tests included appear to have a timeout.{color}
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3410//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3410//console

This message is automatically generated.
[jira] [Commented] (MAPREDUCE-5042) Reducer unable to fetch for a map task that was recovered
[ https://issues.apache.org/jira/browse/MAPREDUCE-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596289#comment-13596289 ]

Vinod Kumar Vavilapalli commented on MAPREDUCE-5042:

In my preliminary security work, I once had the JobClient generate the secret, and then later had the MR AM generate the tokens and reupload the tokens file into the submit directory. That was another hop to DFS and we have since changed that, but this recovery code bug fell through.

So there are multiple solutions:
- Have a single secret but let the client generate it
- Have a single secret but upload the tokens file for future app-attempts
- Have multiple tokens

It's future-proof to separate the task and shuffle security secrets, but I'm not sure that is tied directly to this issue if we consider the reupload solution.

I don't feel strongly about any solution, but one thing we should keep in mind is to move as much as possible into the AM, so that the client is thinner and we can do submits via web services.
[jira] [Commented] (MAPREDUCE-5042) Reducer unable to fetch for a map task that was recovered
[ https://issues.apache.org/jira/browse/MAPREDUCE-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596332#comment-13596332 ]

Jason Lowe commented on MAPREDUCE-5042:

I thought about the upload-to-staging-for-future-attempts solution but it seemed passing the secret in the job credentials was a bit cleaner and avoided the extra HDFS operations.

As for splitting the job token into shuffle and task, I didn't want to change the current task authentication behavior. Allowing an old task attempt to authenticate with a new app attempt seemed like it would be a problem waiting to happen. But we need the shuffle secret to persist across app attempts, hence the push to split them as part of this change.
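The split described in the comment above (a shuffle secret that persists across app attempts, and a job token that does not) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the attached patch: a plain `Map` stands in for the job credentials, and `SplitSecretsSketch`, `AttemptSecrets`, `submitJob`, `startAppAttempt`, and the `shuffle.secret` alias are all hypothetical names.

```java
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a plain Map stands in for the job credentials, and every
// name in this class is hypothetical rather than part of Hadoop's API.
final class SplitSecretsSketch {
    static final String SHUFFLE_SECRET = "shuffle.secret"; // hypothetical alias
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Generates a fresh random 20-byte secret. */
    static byte[] randomSecret() {
        byte[] secret = new byte[20];
        RANDOM.nextBytes(secret);
        return secret;
    }

    /** Submission time: the shuffle secret is created once and stored. */
    static Map<String, byte[]> submitJob() {
        Map<String, byte[]> credentials = new HashMap<>();
        credentials.put(SHUFFLE_SECRET, randomSecret());
        return credentials;
    }

    /** The secrets a single app attempt hands to its tasks. */
    static final class AttemptSecrets {
        final byte[] jobToken;      // regenerated for every app attempt
        final byte[] shuffleSecret; // read from the credentials, never regenerated

        AttemptSecrets(byte[] jobToken, byte[] shuffleSecret) {
            this.jobToken = jobToken;
            this.shuffleSecret = shuffleSecret;
        }
    }

    /** Each app attempt rolls a new job token but reuses the stored shuffle secret. */
    static AttemptSecrets startAppAttempt(Map<String, byte[]> credentials) {
        return new AttemptSecrets(randomSecret(), credentials.get(SHUFFLE_SECRET));
    }

    public static void main(String[] args) {
        Map<String, byte[]> credentials = submitJob();
        AttemptSecrets first = startAppAttempt(credentials);
        AttemptSecrets second = startAppAttempt(credentials);
        System.out.println("shuffle secret shared across attempts: "
                + Arrays.equals(first.shuffleSecret, second.shuffleSecret));
        System.out.println("job token shared across attempts: "
                + Arrays.equals(first.jobToken, second.jobToken));
    }
}
```

In this sketch an old task attempt cannot authenticate against a new app attempt, because its job token differs, while map output written under the first attempt can still be fetched with the shared shuffle secret.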
[jira] [Commented] (MAPREDUCE-5042) Reducer unable to fetch for a map task that was recovered
[ https://issues.apache.org/jira/browse/MAPREDUCE-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595370#comment-13595370 ]

Hadoop QA commented on MAPREDUCE-5042:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12572443/MAPREDUCE-5042.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
{color:green}+1 tests included appear to have a timeout.{color}
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3389//testReport/
Release audit warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3389//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3389//console

This message is automatically generated.
[jira] [Commented] (MAPREDUCE-5042) Reducer unable to fetch for a map task that was recovered
[ https://issues.apache.org/jira/browse/MAPREDUCE-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595392#comment-13595392 ]

Jason Lowe commented on MAPREDUCE-5042:

Release audit warnings are unrelated to the patch.
[jira] [Commented] (MAPREDUCE-5042) Reducer unable to fetch for a map task that was recovered
[ https://issues.apache.org/jira/browse/MAPREDUCE-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592379#comment-13592379 ]

Jason Lowe commented on MAPREDUCE-5042:

Sorry, I was wrong. It appears it will happen without security as well. The problem is that the job token is rolled from scratch each time the AM starts up, so the subsequent AM attempt has no idea what job token was used by the previous attempt. My non-secure cluster was only one node, and any node that launches a container for the new AM attempt will smash the old shuffle token with the new one. Any node that only ran tasks for the old AM attempt will report shuffle verification failures from reduce tasks launched by the new AM attempt.
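The failure mode described in the comment above can be reproduced in miniature with a keyed hash. The sketch below is an illustration under assumptions, not Hadoop's shuffle code: it assumes the fetch request is authenticated with an HMAC over the request URL, uses HMAC-SHA1 purely as an example algorithm, and the names `ShuffleHashDemo`, `newSecret`, `sign`, and `verify` as well as the sample URL are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Illustrative only: models a node that still holds the secret of the old app
// attempt while a reducer signs its request with the secret of the new one.
final class ShuffleHashDemo {
    static final String ALGORITHM = "HmacSHA1"; // assumed for illustration
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Rolls a secret from scratch, as each AM start-up does in the report. */
    static byte[] newSecret() {
        byte[] secret = new byte[20];
        RANDOM.nextBytes(secret);
        return secret;
    }

    /** Computes the keyed hash of a request string. */
    static byte[] sign(byte[] secret, String message) {
        try {
            Mac mac = Mac.getInstance(ALGORITHM);
            mac.init(new SecretKeySpec(secret, ALGORITHM));
            return mac.doFinal(message.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC computation failed", e);
        }
    }

    /** Recomputes the hash with the verifier's own secret and compares. */
    static boolean verify(byte[] secret, String message, byte[] hash) {
        return MessageDigest.isEqual(sign(secret, message), hash);
    }

    public static void main(String[] args) {
        String request = "/mapOutput?job=job_1&reduce=0&map=attempt_1"; // hypothetical
        byte[] oldAttemptSecret = newSecret(); // known to nodes of the old attempt
        byte[] newAttemptSecret = newSecret(); // used by reducers of the new attempt

        byte[] hashFromReducer = sign(newAttemptSecret, request);
        System.out.println("node updated with new secret accepts: "
                + verify(newAttemptSecret, request, hashFromReducer));
        System.out.println("node holding only old secret accepts: "
                + verify(oldAttemptSecret, request, hashFromReducer));
    }
}
```

A node that only ever saw the old secret rejects the request, which corresponds to the 401 seen by the reducer, while a single-node cluster hides the problem because the one node always receives the new secret.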
[jira] [Commented] (MAPREDUCE-5042) Reducer unable to fetch for a map task that was recovered
[ https://issues.apache.org/jira/browse/MAPREDUCE-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591072#comment-13591072 ]

Jason Lowe commented on MAPREDUCE-5042:

This only seems to occur when security is enabled.