[jira] [Created] (MAPREDUCE-4774) repair test org.apache.hadoop.mapred.TestClusterMRNotification.testMR

2012-11-06 Thread Ivan A. Veselovsky (JIRA)
Ivan A. Veselovsky created MAPREDUCE-4774:
-

 Summary: repair test org.apache.hadoop.mapred.TestClusterMRNotification.testMR
 Key: MAPREDUCE-4774
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4774
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Ivan A. Veselovsky


The test org.apache.hadoop.mapred.TestClusterMRNotification.testMR frequently 
fails in the mapred build (e.g. see 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2988/testReport/junit/org.apache.hadoop.mapred/TestClusterMRNotification/testMR/
 or 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2982//testReport/org.apache.hadoop.mapred/TestClusterMRNotification/testMR/).

The test checks the job status notifications received through an HTTP servlet. 
It runs 3 jobs: one successful, one killed, and one failed. 
The test expects the servlet to receive specific notifications in a specific 
order. It also exercises the retry-on-failure notification functionality: on 
each 1st notification attempt the servlet answers 400 to force an error, and 
on each 2nd notification attempt it answers OK. 
In general, the test fails because the actual number and/or type of the 
notifications differs from what is expected.
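
The retry behavior described above can be illustrated with a minimal sketch. It is not the test's actual servlet: the class name FlakyNotificationEndpoint, the single shared attempt counter, and the deliver helper are assumptions made only for illustration.

```java
/** Hypothetical stand-in for the notification servlet; not the test's real code. */
class FlakyNotificationEndpoint {
    // Total number of delivery attempts seen so far (assumed to be one shared counter).
    private int attempts = 0;

    /** Returns the HTTP status the endpoint answers for this delivery attempt. */
    int handle(String notificationUrl) {
        attempts++;
        // Odd attempts (1st, 3rd, ...) are rejected to force a retry; even attempts succeed.
        return (attempts % 2 == 1) ? 400 : 200;
    }

    /**
     * Tries to deliver one notification, retrying until it is accepted or
     * maxAttempts is exhausted. Returns the number of attempts used, or -1.
     */
    static int deliver(FlakyNotificationEndpoint endpoint, String url, int maxAttempts) {
        for (int i = 1; i <= maxAttempts; i++) {
            if (endpoint.handle(url) == 200) {
                return i;
            }
        }
        return -1;
    }
}
```

With this model every notification needs exactly two attempts, so any extra or missing notification shifts the accept/reject pattern for everything that follows, which is one way the observed count can drift from the expected one.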

Investigation shows that the actual root cause of the problem is an incorrect 
job state transition: a mapred task of the 3rd job fails (by an intentionally 
thrown RuntimeException, see UtilsForTests#runJobFail()), and the state of the 
task changes from RUNNING to FAILED.
At this point a JobEventType.JOB_TASK_ATTEMPT_COMPLETED event is submitted (in 
the method 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handleTaskAttemptCompletion(TaskAttemptId,
 TaskAttemptCompletionEventStatus)), and this event gets processed in 
AsyncDispatcher, but the transition is not allowed by the event transition map 
(see 
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl#stateMachineFactory). This 
causes the following exception to be thrown when the event is processed:
2012-11-06 12:22:02,335 ERROR [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: JOB_TASK_ATTEMPT_COMPLETED at FAILED
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:309)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$3(StateMachineFactory.java:290)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:454)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:716)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:1)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:917)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:130)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:79)
    at java.lang.Thread.run(Thread.java:662)

So the job gets into the INTERNAL_ERROR state, and a job end notification like 
this is sent:
http://localhost:48656/notification/mapred?jobId=job_1352199715842_0002&jobStatus=ERROR
(here we can see the ERROR status instead of FAILED).
After that, the notification servlet receives either only an ERROR 
notification, or one more ERROR notification after FAILED, which ultimately 
causes the test to fail. (Some variation in the test behavior is caused by 
race conditions, because a lot of asynchronous processing is involved; the 
test is, in fact, flaky.)

In any case, it looks like the root cause of the problem is that the forbidden 
transition "Invalid event: JOB_TASK_ATTEMPT_COMPLETED at FAILED" can occur. 
Expert advice is needed on how this should be fixed.
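
To make the failure mode concrete, here is a minimal sketch of a table-driven state machine in which an event that arrives after the job has already reached FAILED pushes it into an error state. This is not Hadoop's StateMachineFactory or JobImpl: the class ToyJobStateMachine, the TASK_FAILED event, and the simplified ERROR state are hypothetical and chosen only to mirror the description above.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical, simplified job state machine; not Hadoop's StateMachineFactory. */
class ToyJobStateMachine {
    // ERROR stands in for the error state the report describes.
    enum State { RUNNING, FAILED, ERROR }
    // TASK_FAILED is an invented event name used only for this sketch.
    enum Event { JOB_TASK_ATTEMPT_COMPLETED, TASK_FAILED }

    // Transition table: for each state, the events it accepts and the resulting state.
    private static final Map<State, Map<Event, State>> TRANSITIONS = new EnumMap<>(State.class);
    static {
        Map<Event, State> running = new EnumMap<>(Event.class);
        running.put(Event.JOB_TASK_ATTEMPT_COMPLETED, State.RUNNING);
        running.put(Event.TASK_FAILED, State.FAILED);
        TRANSITIONS.put(State.RUNNING, running);
        // FAILED and ERROR register no events, so anything arriving there is invalid.
        TRANSITIONS.put(State.FAILED, new EnumMap<>(Event.class));
        TRANSITIONS.put(State.ERROR, new EnumMap<>(Event.class));
    }

    private State current = State.RUNNING;

    State getState() {
        return current;
    }

    /** Applies the event; an unregistered (state, event) pair moves the job to ERROR. */
    void handle(Event event) {
        State next = TRANSITIONS.get(current).get(event);
        if (next == null) {
            System.err.println("Invalid event: " + event + " at " + current);
            current = State.ERROR;
            return;
        }
        current = next;
    }
}
```

In this toy model, sending TASK_FAILED and then a late JOB_TASK_ATTEMPT_COMPLETED leaves the job in ERROR rather than FAILED, which matches the ERROR status seen in the notification URL. Possible directions would be to register the late event as a no-op in the FAILED state or to stop emitting it once the job has failed; which one is right for JobImpl is exactly the open question.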

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Hadoop-Mapreduce-trunk - Build # 1248 - Still Failing

2012-11-06 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1248/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 30846 lines...]
Running org.apache.hadoop.mapreduce.lib.partition.TestBinaryPartitioner
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.424 sec
Running org.apache.hadoop.mapreduce.lib.partition.TestMRKeyFieldBasedPartitioner
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.389 sec
Running org.apache.hadoop.mapreduce.TestChild
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 59.741 sec
Running org.apache.hadoop.mapreduce.filecache.TestURIFragments
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.06 sec
Running org.apache.hadoop.mapreduce.TestMapReduce
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.529 sec

Results :

Failed tests:   testMR(org.apache.hadoop.mapred.TestClusterMRNotification): expected:<6> but was:<4>

Tests run: 441, Failures: 1, Errors: 0, Skipped: 14

[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] hadoop-mapreduce-client ... SUCCESS [3.224s]
[INFO] hadoop-mapreduce-client-core .. SUCCESS [17.816s]
[INFO] hadoop-mapreduce-client-common  SUCCESS [22.077s]
[INFO] hadoop-mapreduce-client-shuffle ... SUCCESS [1.060s]
[INFO] hadoop-mapreduce-client-app ... SUCCESS [4:35.037s]
[INFO] hadoop-mapreduce-client-hs  SUCCESS [1:11.896s]
[INFO] hadoop-mapreduce-client-jobclient . FAILURE [39:58.970s]
[INFO] Apache Hadoop MapReduce Examples .. SKIPPED
[INFO] hadoop-mapreduce .. SKIPPED
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 46:30.622s
[INFO] Finished at: Tue Nov 06 14:02:06 UTC 2012
[INFO] Final Memory: 20M/127M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.12.3:test (default-test) on 
project hadoop-mapreduce-client-jobclient: There are test failures.
[ERROR] 
[ERROR] Please refer to 
/home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/surefire-reports
 for the individual test results.
[ERROR] - [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :hadoop-mapreduce-client-jobclient
Build step 'Execute shell' marked build as failure
[FINDBUGS] Skipping publisher since build result is FAILURE
Archiving artifacts
Updating HADOOP-9009
Updating HDFS-4151
Updating HADOOP-9010
Updating MAPREDUCE-4771
Updating HDFS-4046
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Resolved] (MAPREDUCE-4773) MultipleOutput with different output path for each

2012-11-06 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J resolved MAPREDUCE-4773.


Resolution: Not A Problem

Good to know. In the future, please use the user mailing lists when you have 
a development issue you wish to discuss or get an answer for. It is a pretty 
active list today. The JIRA exists for identified bugs and/or feature requests, 
not for user help.

Resolving JIRA as Not a Problem (For now).

 MultipleOutput with different output path for each 
 ---

 Key: MAPREDUCE-4773
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4773
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Rohit Dandona

 Is it possible to have multiple outputs in MapReduce code where each 
 output is directed to a different path?
 e.g. 
 FileOutputFormat.setOutputPath(job, new Path(outputPath));
 MultipleOutputs.addNamedOutput(job, "Output 1", TextOutputFormat.class, 
 Text.class, Text.class);
 MultipleOutputs.addNamedOutput(job, "Output 2", TextOutputFormat.class, 
 Text.class, Text.class);
 Can "Output 1" & "Output 2" be allotted separate paths?



[jira] [Created] (MAPREDUCE-4775) Reducer will never commit suicide

2012-11-06 Thread Robert Joseph Evans (JIRA)
Robert Joseph Evans created MAPREDUCE-4775:
--

 Summary: Reducer will never commit suicide
 Key: MAPREDUCE-4775
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4775
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
Priority: Critical


In 1.0 there are a number of conditions that will cause a reducer to commit 
suicide and exit.

These include the reducer being stalled and the error percentage of total 
fetches being too high.  In the new code it will only commit suicide when the 
total number of failures for a single task attempt is >= max(30, totalMaps/10).  
In the best case, with the quadratic back-off, it would take 20.5 hours for a 
single map attempt to reach 30 failures.  And unless there is only one reducer 
running, the map task would have been restarted before then.
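
As a rough illustration of the threshold mentioned above, the following sketch computes max(30, totalMaps/10) and checks a failure count against it. The class and method names are hypothetical, and it assumes integer division and a "greater than or equal to" comparison; it is not the actual shuffle code.

```java
/** Hypothetical helper illustrating the abort threshold quoted above. */
class ShuffleAbortThreshold {
    /** Failure count at which a reducer would give up: max(30, totalMaps/10). */
    static int abortLimit(int totalMaps) {
        // Integer division is assumed here.
        return Math.max(30, totalMaps / 10);
    }

    /** True if the failures recorded for one task attempt reach the limit. */
    static boolean shouldAbort(int failuresForAttempt, int totalMaps) {
        return failuresForAttempt >= abortLimit(totalMaps);
    }
}
```

Under these assumptions the limit stays at 30 for jobs with up to 300 maps and only grows beyond that (for example, 100 for a job with 1000 maps), so the floor of 30 failures is the quantity that matters for most jobs.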

We should go back to including the same reducer suicide checks that are in 1.0.
