[jira] [Created] (MAPREDUCE-4774) repair test org.apache.hadoop.mapred.TestClusterMRNotification.testMR
Ivan A. Veselovsky created MAPREDUCE-4774:
-------------------------------------------

Summary: repair test org.apache.hadoop.mapred.TestClusterMRNotification.testMR
Key: MAPREDUCE-4774
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4774
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Ivan A. Veselovsky

The test org.apache.hadoop.mapred.TestClusterMRNotification.testMR frequently fails in the mapred build (e.g. see https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2988/testReport/junit/org.apache.hadoop.mapred/TestClusterMRNotification/testMR/ , or https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2982//testReport/org.apache.hadoop.mapred/TestClusterMRNotification/testMR/).

The test aims to check job status notifications received through an HTTP servlet. It runs 3 jobs: successful, killed, and failed. The test expects the servlet to receive specific notifications in a specific order. It also tries to test the retry-on-failure notification functionality, so on each 1st notification attempt the servlet answers 400, forcing an error, and on each 2nd notification attempt it answers OK.

In general, the test fails because the actual number and/or type of the notifications differs from what is expected.

Investigation shows that the actual root cause of the problem is an incorrect job state transition: a task of the 3rd job fails (through an intentionally thrown RuntimeException, see UtilsForTests#runJobFail()), and the state of the task changes from RUNNING to FAILED. At this point a JobEventType.JOB_TASK_ATTEMPT_COMPLETED event is submitted (in method org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handleTaskAttemptCompletion(TaskAttemptId, TaskAttemptCompletionEventStatus)), and this event gets processed in AsyncDispatcher, but this transition is impossible according to the event transition map (see org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl#stateMachineFactory).
This causes the following exception to be thrown when the event is processed:

2012-11-06 12:22:02,335 ERROR [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: JOB_TASK_ATTEMPT_COMPLETED at FAILED
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:309)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$3(StateMachineFactory.java:290)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:454)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:716)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:1)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:917)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:130)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:79)
    at java.lang.Thread.run(Thread.java:662)

So the job gets into state INTERNAL_ERROR, and a job end notification like this is sent:

http://localhost:48656/notification/mapred?jobId=job_1352199715842_0002&jobStatus=ERROR

(here we can see the ERROR status instead of FAILED).

After that the notification servlet receives either only the ERROR notification, or one more ERROR notification after FAILED, which finally causes the test to fail. (There is some variation in the test behavior caused by race conditions, because a lot of processing there is asynchronous; the test is, in fact, flaky.)

In any case, it looks like the root cause of the problem is that the forbidden transition "Invalid event: JOB_TASK_ATTEMPT_COMPLETED at FAILED" is possible at all. Expert advice is needed on how this should be fixed.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
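The failure mode described in MAPREDUCE-4774 can be illustrated with a small table-driven state machine. This is only a minimal sketch under stated assumptions, not Hadoop's JobImpl or StateMachineFactory: the class JobStateSketch, the event JOB_FAIL_TRIGGER, and the ignoreLateEvents flag are hypothetical names introduced for illustration. It shows how an event with no entry in the transition table pushes the job into an error state, and how registering a self-transition for the late event (one possible remedy, not confirmed by the report) would keep the job in FAILED.

```java
import java.util.EnumMap;
import java.util.Map;

// Simplified model of a table-driven job state machine (NOT Hadoop's JobImpl).
final class JobStateSketch {
    enum State { RUNNING, FAILED, ERROR }

    // JOB_FAIL_TRIGGER is a hypothetical stand-in for whatever moves the job to FAILED.
    enum Event { JOB_FAIL_TRIGGER, JOB_TASK_ATTEMPT_COMPLETED }

    private final Map<State, Map<Event, State>> table =
            new EnumMap<State, Map<Event, State>>(State.class);
    private State current = State.RUNNING;

    // ignoreLateEvents is an assumed switch: when true, FAILED tolerates late completions.
    JobStateSketch(boolean ignoreLateEvents) {
        addTransition(State.RUNNING, Event.JOB_TASK_ATTEMPT_COMPLETED, State.RUNNING);
        addTransition(State.RUNNING, Event.JOB_FAIL_TRIGGER, State.FAILED);
        if (ignoreLateEvents) {
            // Self-transition: the late event is accepted and the state is unchanged.
            addTransition(State.FAILED, Event.JOB_TASK_ATTEMPT_COMPLETED, State.FAILED);
        }
    }

    private void addTransition(State from, Event on, State to) {
        Map<Event, State> row = table.get(from);
        if (row == null) {
            row = new EnumMap<Event, State>(Event.class);
            table.put(from, row);
        }
        row.put(on, to);
    }

    // Applies the event; an unregistered (state, event) pair ends in ERROR,
    // mirroring the "Invalid event ... at FAILED" situation from the report.
    State handle(Event event) {
        Map<Event, State> row = table.get(current);
        State next = (row == null) ? null : row.get(event);
        current = (next == null) ? State.ERROR : next;
        return current;
    }

    public static void main(String[] args) {
        JobStateSketch strict = new JobStateSketch(false);
        strict.handle(Event.JOB_FAIL_TRIGGER);
        System.out.println(strict.handle(Event.JOB_TASK_ATTEMPT_COMPLETED)); // ERROR

        JobStateSketch lenient = new JobStateSketch(true);
        lenient.handle(Event.JOB_FAIL_TRIGGER);
        System.out.println(lenient.handle(Event.JOB_TASK_ATTEMPT_COMPLETED)); // FAILED
    }
}
```

In the strict variant the late completion event turns a FAILED job into ERROR, which matches the unexpected jobStatus=ERROR notification seen by the test servlet. Whether ignoring the event in FAILED is the right fix for the real state machine is exactly the question the report leaves open.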
Hadoop-Mapreduce-trunk - Build # 1248 - Still Failing
See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1248/

###
## LAST 60 LINES OF THE CONSOLE
###
[...truncated 30846 lines...]
Running org.apache.hadoop.mapreduce.lib.partition.TestBinaryPartitioner
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.424 sec
Running org.apache.hadoop.mapreduce.lib.partition.TestMRKeyFieldBasedPartitioner
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.389 sec
Running org.apache.hadoop.mapreduce.TestChild
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 59.741 sec
Running org.apache.hadoop.mapreduce.filecache.TestURIFragments
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.06 sec
Running org.apache.hadoop.mapreduce.TestMapReduce
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.529 sec

Results :

Failed tests:
  testMR(org.apache.hadoop.mapred.TestClusterMRNotification): expected:<6> but was:<4>

Tests run: 441, Failures: 1, Errors: 0, Skipped: 14

[INFO]
[INFO] Reactor Summary:
[INFO]
[INFO] hadoop-mapreduce-client ... SUCCESS [3.224s]
[INFO] hadoop-mapreduce-client-core .. SUCCESS [17.816s]
[INFO] hadoop-mapreduce-client-common SUCCESS [22.077s]
[INFO] hadoop-mapreduce-client-shuffle ... SUCCESS [1.060s]
[INFO] hadoop-mapreduce-client-app ... SUCCESS [4:35.037s]
[INFO] hadoop-mapreduce-client-hs SUCCESS [1:11.896s]
[INFO] hadoop-mapreduce-client-jobclient . FAILURE [39:58.970s]
[INFO] Apache Hadoop MapReduce Examples .. SKIPPED
[INFO] hadoop-mapreduce .. SKIPPED
[INFO]
[INFO] BUILD FAILURE
[INFO]
[INFO] Total time: 46:30.622s
[INFO] Finished at: Tue Nov 06 14:02:06 UTC 2012
[INFO] Final Memory: 20M/127M
[INFO]
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.12.3:test (default-test) on project hadoop-mapreduce-client-jobclient: There are test failures.
[ERROR]
[ERROR] Please refer to /home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/surefire-reports for the individual test results.
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :hadoop-mapreduce-client-jobclient
Build step 'Execute shell' marked build as failure
[FINDBUGS] Skipping publisher since build result is FAILURE
Archiving artifacts
Updating HADOOP-9009
Updating HDFS-4151
Updating HADOOP-9010
Updating MAPREDUCE-4771
Updating HDFS-4046
Email was triggered for: Failure
Sending email for trigger: Failure

###
## FAILED TESTS (if any)
##
No tests ran.
[jira] [Resolved] (MAPREDUCE-4773) MultipleOutput with different output path for each
[ https://issues.apache.org/jira/browse/MAPREDUCE-4773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J resolved MAPREDUCE-4773.
--------------------------------

Resolution: Not A Problem

Good to know. In the future, please use the user lists when you have an issue in your development that you wish to discuss or get an answer for. It is a pretty active list today. The JIRA exists for identified bugs and/or feature requests, not for user help.

Resolving JIRA as Not a Problem (for now).

MultipleOutput with different output path for each
---------------------------------------------------

Key: MAPREDUCE-4773
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4773
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: Rohit Dandona

Is it possible to have multiple outputs in map reduce code where each output is directed to a different path? e.g.

    FileOutputFormat.setOutputPath(job, new Path(outputPath));
    MultipleOutputs.addNamedOutput(job, "Output 1", TextOutputFormat.class, Text.class, Text.class);
    MultipleOutputs.addNamedOutput(job, "Output 2", TextOutputFormat.class, Text.class, Text.class);

Can Output 1 and Output 2 be allotted separate paths?
[jira] [Created] (MAPREDUCE-4775) Reducer will never commit suicide
Robert Joseph Evans created MAPREDUCE-4775:
--------------------------------------------

Summary: Reducer will never commit suicide
Key: MAPREDUCE-4775
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4775
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
Priority: Critical

In 1.0 there are a number of conditions that will cause a reducer to commit suicide and exit. These include the reducer being stalled and the error percentage of total fetches being too high. In the new code it will only commit suicide when the total number of failures for a single task attempt is >= max(30, totalMaps/10). In the best case, with the quadratic back-off, it would take 20.5 hours for a single map attempt to reach 30 failures. And unless there is only one reducer running, the map task would have been restarted before then.

We should go back to including the same reducer suicide checks that are in 1.0.
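The abort threshold quoted in MAPREDUCE-4775 can be written out as a small sketch. This is an illustration of the formula as stated in the report, not the actual shuffle code: the class and method names are hypothetical, and integer division for totalMaps/10 is an assumption. The back-off timing (the 20.5 hours figure) is not modeled, since the report does not give the delay formula.

```java
// Sketch of the abort condition described in the report (names are hypothetical).
final class ShuffleAbortSketch {

    // Threshold from the report: max(30, totalMaps/10); integer division is assumed.
    static int abortFailureLimit(int totalMaps) {
        return Math.max(30, totalMaps / 10);
    }

    // The reducer gives up only once a single map attempt has failed this often.
    static boolean shouldAbort(int failuresForAttempt, int totalMaps) {
        return failuresForAttempt >= abortFailureLimit(totalMaps);
    }

    public static void main(String[] args) {
        System.out.println(abortFailureLimit(100));   // 30
        System.out.println(abortFailureLimit(1000));  // 100
        System.out.println(shouldAbort(29, 100));     // false
        System.out.println(shouldAbort(30, 100));     // true
    }
}
```

Under this formula the limit never drops below 30 failures for one map attempt, no matter how small the job is, which is why the report argues that this check alone is effectively unreachable.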