[jira] [Commented] (MAPREDUCE-5924) Windows: Sort Job failed due to 'Invalid event: TA_COMMIT_PENDING at COMMIT_PENDING'
[ https://issues.apache.org/jira/browse/MAPREDUCE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14035581#comment-14035581 ] Hudson commented on MAPREDUCE-5924: --- FAILURE: Integrated in Hadoop-Yarn-trunk #587 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/587/]) MAPREDUCE-5924. Changed TaskAttemptImpl to ignore TA_COMMIT_PENDING event at COMMIT_PENDING state. Contributed by Zhijie Shen (jianhe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1603338) * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestMRApp.java Windows: Sort Job failed due to 'Invalid event: TA_COMMIT_PENDING at COMMIT_PENDING' Key: MAPREDUCE-5924 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5924 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Yesha Vora Assignee: Zhijie Shen Fix For: 2.5.0 Attachments: MAPREDUCE-5924.1.patch The Sort job over 1GB data failed with below error {code} 2014-06-09 09:15:38,746 INFO [Socket Reader #1 for port 63415] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1402304714683_0002 (auth:SIMPLE) 2014-06-09 09:15:38,750 INFO [IPC Server handler 13 on 63415] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Commit-pending state update from attempt_1402304714683_0002_r_15_1000 2014-06-09 09:15:38,751 ERROR [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle this event at current state for attempt_1402304714683_0002_r_15_1000 org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: TA_COMMIT_PENDING at COMMIT_PENDING at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1058) at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:145) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1271) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1263) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) at java.lang.Thread.run(Thread.java:722) 2014-06-09 09:15:38,753 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1402304714683_0002Job Transitioned from RUNNING to ERROR {code} The JobHistory Url prints job state = ERROR -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5924) Windows: Sort Job failed due to 'Invalid event: TA_COMMIT_PENDING at COMMIT_PENDING'
[ https://issues.apache.org/jira/browse/MAPREDUCE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14035700#comment-14035700 ] Hudson commented on MAPREDUCE-5924: --- FAILURE: Integrated in Hadoop-Hdfs-trunk #1778 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1778/]) MAPREDUCE-5924. Changed TaskAttemptImpl to ignore TA_COMMIT_PENDING event at COMMIT_PENDING state. Contributed by Zhijie Shen (jianhe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1603338) * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestMRApp.java Windows: Sort Job failed due to 'Invalid event: TA_COMMIT_PENDING at COMMIT_PENDING' Key: MAPREDUCE-5924 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5924 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Yesha Vora Assignee: Zhijie Shen Fix For: 2.5.0 Attachments: MAPREDUCE-5924.1.patch The Sort job over 1GB data failed with below error {code} 2014-06-09 09:15:38,746 INFO [Socket Reader #1 for port 63415] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1402304714683_0002 (auth:SIMPLE) 2014-06-09 09:15:38,750 INFO [IPC Server handler 13 on 63415] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Commit-pending state update from attempt_1402304714683_0002_r_15_1000 2014-06-09 09:15:38,751 ERROR [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle this event at current state for attempt_1402304714683_0002_r_15_1000 org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: TA_COMMIT_PENDING at COMMIT_PENDING at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1058) at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:145) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1271) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1263) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) at java.lang.Thread.run(Thread.java:722) 2014-06-09 09:15:38,753 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1402304714683_0002Job Transitioned from RUNNING to ERROR {code} The JobHistory Url prints job state = ERROR -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5924) Windows: Sort Job failed due to 'Invalid event: TA_COMMIT_PENDING at COMMIT_PENDING'
[ https://issues.apache.org/jira/browse/MAPREDUCE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14035809#comment-14035809 ] Hudson commented on MAPREDUCE-5924: --- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1805 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1805/]) MAPREDUCE-5924. Changed TaskAttemptImpl to ignore TA_COMMIT_PENDING event at COMMIT_PENDING state. Contributed by Zhijie Shen (jianhe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1603338) * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestMRApp.java Windows: Sort Job failed due to 'Invalid event: TA_COMMIT_PENDING at COMMIT_PENDING' Key: MAPREDUCE-5924 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5924 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Yesha Vora Assignee: Zhijie Shen Fix For: 2.5.0 Attachments: MAPREDUCE-5924.1.patch The Sort job over 1GB data failed with below error {code} 2014-06-09 09:15:38,746 INFO [Socket Reader #1 for port 63415] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1402304714683_0002 (auth:SIMPLE) 2014-06-09 09:15:38,750 INFO [IPC Server handler 13 on 63415] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Commit-pending state update from attempt_1402304714683_0002_r_15_1000 2014-06-09 09:15:38,751 ERROR [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle this event at current state for attempt_1402304714683_0002_r_15_1000 org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: TA_COMMIT_PENDING at COMMIT_PENDING at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1058) at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:145) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1271) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1263) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) at java.lang.Thread.run(Thread.java:722) 2014-06-09 09:15:38,753 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1402304714683_0002Job Transitioned from RUNNING to ERROR {code} The JobHistory Url prints job state = ERROR -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5924) Windows: Sort Job failed due to 'Invalid event: TA_COMMIT_PENDING at COMMIT_PENDING'
[ https://issues.apache.org/jira/browse/MAPREDUCE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034664#comment-14034664 ] Jian He commented on MAPREDUCE-5924: LGTM, +1 Windows: Sort Job failed due to 'Invalid event: TA_COMMIT_PENDING at COMMIT_PENDING' Key: MAPREDUCE-5924 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5924 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Yesha Vora Assignee: Zhijie Shen Attachments: MAPREDUCE-5924.1.patch The Sort job over 1GB data failed with below error {code} 2014-06-09 09:15:38,746 INFO [Socket Reader #1 for port 63415] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1402304714683_0002 (auth:SIMPLE) 2014-06-09 09:15:38,750 INFO [IPC Server handler 13 on 63415] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Commit-pending state update from attempt_1402304714683_0002_r_15_1000 2014-06-09 09:15:38,751 ERROR [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle this event at current state for attempt_1402304714683_0002_r_15_1000 org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: TA_COMMIT_PENDING at COMMIT_PENDING at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1058) at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:145) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1271) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1263) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) at java.lang.Thread.run(Thread.java:722) 2014-06-09 09:15:38,753 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1402304714683_0002Job Transitioned from RUNNING to ERROR {code} The JobHistory Url prints job state = ERROR -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5924) Windows: Sort Job failed due to 'Invalid event: TA_COMMIT_PENDING at COMMIT_PENDING'
[ https://issues.apache.org/jira/browse/MAPREDUCE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034668#comment-14034668 ] Jian He commented on MAPREDUCE-5924: committing.. Windows: Sort Job failed due to 'Invalid event: TA_COMMIT_PENDING at COMMIT_PENDING' Key: MAPREDUCE-5924 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5924 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Yesha Vora Assignee: Zhijie Shen Attachments: MAPREDUCE-5924.1.patch The Sort job over 1GB data failed with below error {code} 2014-06-09 09:15:38,746 INFO [Socket Reader #1 for port 63415] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1402304714683_0002 (auth:SIMPLE) 2014-06-09 09:15:38,750 INFO [IPC Server handler 13 on 63415] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Commit-pending state update from attempt_1402304714683_0002_r_15_1000 2014-06-09 09:15:38,751 ERROR [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle this event at current state for attempt_1402304714683_0002_r_15_1000 org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: TA_COMMIT_PENDING at COMMIT_PENDING at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1058) at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:145) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1271) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1263) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) at java.lang.Thread.run(Thread.java:722) 2014-06-09 09:15:38,753 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1402304714683_0002Job Transitioned from RUNNING to ERROR {code} The JobHistory Url prints job state = ERROR -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5924) Windows: Sort Job failed due to 'Invalid event: TA_COMMIT_PENDING at COMMIT_PENDING'
[ https://issues.apache.org/jira/browse/MAPREDUCE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034681#comment-14034681 ] Jian He commented on MAPREDUCE-5924: Zhijie, can you open jira for the exception issue on Windows you mentioned? thx Windows: Sort Job failed due to 'Invalid event: TA_COMMIT_PENDING at COMMIT_PENDING' Key: MAPREDUCE-5924 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5924 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Yesha Vora Assignee: Zhijie Shen Fix For: 2.5.0 Attachments: MAPREDUCE-5924.1.patch The Sort job over 1GB data failed with below error {code} 2014-06-09 09:15:38,746 INFO [Socket Reader #1 for port 63415] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1402304714683_0002 (auth:SIMPLE) 2014-06-09 09:15:38,750 INFO [IPC Server handler 13 on 63415] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Commit-pending state update from attempt_1402304714683_0002_r_15_1000 2014-06-09 09:15:38,751 ERROR [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle this event at current state for attempt_1402304714683_0002_r_15_1000 org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: TA_COMMIT_PENDING at COMMIT_PENDING at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1058) at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:145) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1271) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1263) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) at java.lang.Thread.run(Thread.java:722) 2014-06-09 09:15:38,753 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1402304714683_0002Job Transitioned from RUNNING to ERROR {code} The JobHistory Url prints job state = ERROR -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5924) Windows: Sort Job failed due to 'Invalid event: TA_COMMIT_PENDING at COMMIT_PENDING'
[ https://issues.apache.org/jira/browse/MAPREDUCE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034688#comment-14034688 ] Hudson commented on MAPREDUCE-5924: --- FAILURE: Integrated in Hadoop-trunk-Commit #5720 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5720/]) MAPREDUCE-5924. Changed TaskAttemptImpl to ignore TA_COMMIT_PENDING event at COMMIT_PENDING state. Contributed by Zhijie Shen (jianhe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1603338) * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestMRApp.java Windows: Sort Job failed due to 'Invalid event: TA_COMMIT_PENDING at COMMIT_PENDING' Key: MAPREDUCE-5924 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5924 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Yesha Vora Assignee: Zhijie Shen Fix For: 2.5.0 Attachments: MAPREDUCE-5924.1.patch The Sort job over 1GB data failed with below error {code} 2014-06-09 09:15:38,746 INFO [Socket Reader #1 for port 63415] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1402304714683_0002 (auth:SIMPLE) 2014-06-09 09:15:38,750 INFO [IPC Server handler 13 on 63415] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Commit-pending state update from attempt_1402304714683_0002_r_15_1000 2014-06-09 09:15:38,751 ERROR [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle this event at current state for attempt_1402304714683_0002_r_15_1000 org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: TA_COMMIT_PENDING at COMMIT_PENDING at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1058) at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:145) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1271) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1263) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) at java.lang.Thread.run(Thread.java:722) 2014-06-09 09:15:38,753 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1402304714683_0002Job Transitioned from RUNNING to ERROR {code} The JobHistory Url prints job state = ERROR -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5924) Windows: Sort Job failed due to 'Invalid event: TA_COMMIT_PENDING at COMMIT_PENDING'
[ https://issues.apache.org/jira/browse/MAPREDUCE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14030310#comment-14030310 ] Hadoop QA commented on MAPREDUCE-5924: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12650231/MAPREDUCE-5924.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4655//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4655//console This message is automatically generated. Windows: Sort Job failed due to 'Invalid event: TA_COMMIT_PENDING at COMMIT_PENDING' Key: MAPREDUCE-5924 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5924 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Zhijie Shen Assignee: Zhijie Shen Attachments: MAPREDUCE-5924.1.patch Post the issue on behalf of [~yeshavora]: The Sort job over 1GB data failed with below error {code} 2014-06-09 09:15:38,746 INFO [Socket Reader #1 for port 63415] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1402304714683_0002 (auth:SIMPLE) 2014-06-09 09:15:38,750 INFO [IPC Server handler 13 on 63415] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Commit-pending state update from attempt_1402304714683_0002_r_15_1000 2014-06-09 09:15:38,751 ERROR [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle this event at current state for attempt_1402304714683_0002_r_15_1000 org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: TA_COMMIT_PENDING at COMMIT_PENDING at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1058) at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:145) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1271) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1263) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) at java.lang.Thread.run(Thread.java:722) 2014-06-09 09:15:38,753 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1402304714683_0002Job Transitioned from RUNNING to ERROR {code} The JobHistory Url prints job state = ERROR -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5924) Windows: Sort Job failed due to 'Invalid event: TA_COMMIT_PENDING at COMMIT_PENDING'
[ https://issues.apache.org/jira/browse/MAPREDUCE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14030268#comment-14030268 ] Zhijie Shen commented on MAPREDUCE-5924: Some investigation based our logs: TA_COMMIT_PENDING at COMMIT_PENDING happens because there're two TA_COMMIT_PENDING that have been sent to TaskAttemptListenerImpl. The first one causes the state transition from RUNNING to COMMIT_PENDING, and the second one causes the invalid transition. Here's the code that notifies AM of committing the pending task. {code} // say the task tracker that task is commit pending while (true) { try { umbilical.commitPending(taskId, taskStatus); break; } catch (InterruptedException ie) { // ignore } catch (IOException ie) { LOG.warn(Failure sending commit pending: + StringUtils.stringifyException(ie)); if (--retries == 0) { System.exit(67); } } } {code} The loop has been iterated twice, because the following exception happens: {code} 2014-06-09 09:15:38,539 WARN [main] org.apache.hadoop.mapred.Task: Failure sending commit pending: java.io.IOException: Failed on local exception: java.io.IOException: An existing connection was forcibly closed by the remote host; Host Details : local host is: sdevin-clster53/10.215.16.72; destination host is: sdevin-clster54:63415; at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764) at org.apache.hadoop.ipc.Client.call(Client.java:1414) at org.apache.hadoop.ipc.Client.call(Client.java:1363) at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:231) at com.sun.proxy.$Proxy9.commitPending(Unknown Source) at org.apache.hadoop.mapred.Task.done(Task.java:1006) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:397) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162) Caused by: java.io.IOException: An existing connection was forcibly closed by the remote host at sun.nio.ch.SocketDispatcher.read0(Native Method) at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:43) at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:225) at sun.nio.ch.IOUtil.read(IOUtil.java:198) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:359) at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:57) at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142) at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161) at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131) at java.io.FilterInputStream.read(FilterInputStream.java:133) at java.io.FilterInputStream.read(FilterInputStream.java:133) at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:510) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read(BufferedInputStream.java:254) at java.io.DataInputStream.readInt(DataInputStream.java:387) at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1054) at org.apache.hadoop.ipc.Client$Connection.run(Client.java:949) {code} Throughout the log, I can see a lot of IOException: An existing connection was forcibly closed by the remote host (This issue seems to have been raised and solved in HADOOP-317, but it still occurs on Windows), and this particular exception instance results in re-sending committing message to AM. In addition, it seem to have a logic problem about re-sending the committing message. Given the service side has done everything, but IOException happens on the response. The second should not to be sent by the client, or should be ignored by the service. This is something we need to fix at the MR side. There're two options to fix the invalid transition: 1. At the state machine, we ignore TA_COMMIT_PENDING when a task attempt is at COMMIT_PENDING. This is just a one-line fix. 2. We can add RetryCache for TaskAttemptListenerImpl, when commitPending is called, we check whether we have cache for a particular attempt Id. If we have, it means the transition has been executed before, and we can skip the transition. Windows: Sort Job failed due to 'Invalid event: TA_COMMIT_PENDING at COMMIT_PENDING'