[jira] [Resolved] (MAPREDUCE-5535) TestClusterMRNotification.testMR is failing
[ https://issues.apache.org/jira/browse/MAPREDUCE-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhijie Shen resolved MAPREDUCE-5535.
------------------------------------
    Resolution: Duplicate

Will fix together in MAPREDUCE-5538.

TestClusterMRNotification.testMR is failing
-------------------------------------------
                 Key: MAPREDUCE-5535
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5535
             Project: Hadoop Map/Reduce
          Issue Type: Bug
            Reporter: Jian He

{code}
testMR(org.apache.hadoop.mapred.TestClusterMRNotification)  Time elapsed: 35.222 sec  <<< FAILURE!
junit.framework.AssertionFailedError: expected:<2> but was:<0>
	at junit.framework.Assert.fail(Assert.java:50)
	at junit.framework.Assert.failNotEquals(Assert.java:287)
	at junit.framework.Assert.assertEquals(Assert.java:67)
	at junit.framework.Assert.assertEquals(Assert.java:199)
	at junit.framework.Assert.assertEquals(Assert.java:205)
	at org.apache.hadoop.mapred.NotificationTestCase.testMR(NotificationTestCase.java:163)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at junit.framework.TestCase.runTest(TestCase.java:168)
	at junit.framework.TestCase.runBare(TestCase.java:134)
	at junit.framework.TestResult$1.protect(TestResult.java:110)
	at junit.framework.TestResult.runProtected(TestResult.java:128)
	at junit.framework.TestResult.run(TestResult.java:113)
	at junit.framework.TestCase.run(TestCase.java:124)
	at junit.framework.TestSuite.runTest(TestSuite.java:243)
	at junit.framework.TestSuite.run(TestSuite.java:238)
	at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
{code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-5540) Speculative tasks make the default JobQueueTaskScheduler's scheduling unreasonable
guowp_aily created MAPREDUCE-5540:
-------------------------------------
             Summary: Speculative tasks make the default JobQueueTaskScheduler's scheduling unreasonable
                 Key: MAPREDUCE-5540
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5540
             Project: Hadoop Map/Reduce
          Issue Type: Bug
            Reporter: guowp_aily

Even when resources are abundant, speculative tasks can leave (map, reduce) tasks pending under the default scheduler.

Cluster configuration: 3 tasktrackers, 12 reduce slots per node (36 reduce slots in total). The job queue held only two jobs: job_201309221020_0357 had eleven reduce tasks running, while job_201309221020_0358 had one reduce task in the pending state. With 36 slots in the cluster, why did job_201309221020_0358's reduce have to wait? It stayed pending for about two minutes and only ran after job_201309221020_0357 completed one of its reduce tasks.

Checking the job logs and the scheduling algorithm's source code suggests the cause is speculative execution. Task task_201309221020_0357_r_06 actually ran two attempts (attempt_201309221020_0357_r_06_0 and attempt_201309221020_0357_r_06_1), so although job_201309221020_0357 had only eleven reduce tasks, speculative execution made it occupy twelve slots (four per node), filling all 12 running slots counted against the tracker. Under the default scheduling algorithm, job_201309221020_0358's reduce task could not be assigned until one of job_201309221020_0357's reduce tasks finished; otherwise it would stay pending indefinitely. So is the default scheduling algorithm unsuitable when speculative execution is enabled?
JobQueueTaskScheduler:

{code}
double reduceLoadFactor = (double) remainingReduceLoad / clusterReduceCapacity;
// remainingReduceLoad: job_201309221020_0357's running reduces
//                      + job_201309221020_0358's pending reduce = 12
// clusterReduceCapacity: 36
// reduceLoadFactor = 12/36 = 0.33

final int trackerCurrentReduceCapacity =
    Math.min((int) Math.ceil(reduceLoadFactor * trackerReduceCapacity),
             trackerReduceCapacity);
// trackerReduceCapacity: 12 slots per tracker
// trackerCurrentReduceCapacity = ceil(0.33 * 12) = 4

final int availableReduceSlots =
    Math.min((trackerCurrentReduceCapacity - trackerRunningReduces), 1);
// trackerRunningReduces: 4 slots per node
// availableReduceSlots = Math.min((4 - 4), 1) = 0

boolean exceededReducePadding = false;
if (availableReduceSlots > 0) {
  // while job_201309221020_0357's reduce tasks are running,
  // availableReduceSlots is always less than 1
  exceededReducePadding = exceededPadding(false, clusterStatus,
                                          trackerReduceCapacity);
  synchronized (jobQueue) {
    LOG.debug("try to assign 1 reduce task to TaskTracker["
        + taskTracker.trackerName + "]..");
    for (JobInProgress job : jobQueue) {
      if (job.getStatus().getRunState() != JobStatus.RUNNING ||
          job.numReduceTasks == 0) {
        continue;
      }
      ...
{code}
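The slot arithmetic in the report can be reproduced standalone. Below is a minimal sketch of the quoted computation using the reporter's numbers (12 reduces in the queue, 36 cluster reduce slots, 12 slots per tracker, 4 attempts already running on the tracker); the method name is illustrative, not the actual scheduler API.

```java
// Minimal sketch of the JobQueueTaskScheduler reduce-slot arithmetic
// described in MAPREDUCE-5540; the class and method are illustrative.
public class ReduceSlotMath {
    static int availableReduceSlots(int remainingReduceLoad,
                                    int clusterReduceCapacity,
                                    int trackerReduceCapacity,
                                    int trackerRunningReduces) {
        // Fraction of the cluster's reduce capacity the queue wants.
        double reduceLoadFactor =
            (double) remainingReduceLoad / clusterReduceCapacity;
        // Cap how many of this tracker's slots the scheduler will use.
        int trackerCurrentReduceCapacity =
            Math.min((int) Math.ceil(reduceLoadFactor * trackerReduceCapacity),
                     trackerReduceCapacity);
        // At most one reduce is assigned per heartbeat.
        return Math.min(trackerCurrentReduceCapacity - trackerRunningReduces, 1);
    }

    public static void main(String[] args) {
        // 12/36 = 0.33, ceil(0.33 * 12) = 4, min(4 - 4, 1) = 0: the
        // speculative attempt holds the 4th slot, so nothing is offered.
        System.out.println(availableReduceSlots(12, 36, 12, 4)); // prints 0
        // With only 3 attempts on the tracker, one slot would be offered.
        System.out.println(availableReduceSlots(12, 36, 12, 3)); // prints 1
    }
}
```

This makes the reporter's point concrete: the speculative attempt inflates trackerRunningReduces to exactly the computed capacity, so the pending reduce from the second job can never be assigned until a reduce from the first job finishes.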
[jira] [Created] (MAPREDUCE-5541) Improved algorithm for deciding whether a speculative task is needed
zhaoyunjiong created MAPREDUCE-5541:
---------------------------------------
             Summary: Improved algorithm for deciding whether a speculative task is needed
                 Key: MAPREDUCE-5541
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5541
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
            Reporter: zhaoyunjiong
            Assignee: zhaoyunjiong

Most of the time, tasks do not all start running at the same moment, and in that case hasSpeculativeTask in TaskInProgress does not work well. Sometimes a task has only just started running and the scheduler already decides it needs a speculative attempt, which wastes a lot of resources.
Hadoop-Mapreduce-trunk - Build # 1560 - Still Failing
See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1560/

### ## LAST 60 LINES OF THE CONSOLE ###
[...truncated 31709 lines...]
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.311 sec - in org.apache.hadoop.mapreduce.v2.app.commit.TestCommitterEventHandler
Running org.apache.hadoop.mapreduce.v2.app.launcher.TestContainerLauncherImpl
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.706 sec - in org.apache.hadoop.mapreduce.v2.app.launcher.TestContainerLauncherImpl
Running org.apache.hadoop.mapreduce.v2.app.launcher.TestContainerLauncher
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.679 sec - in org.apache.hadoop.mapreduce.v2.app.launcher.TestContainerLauncher
Running org.apache.hadoop.mapreduce.jobhistory.TestJobHistoryEventHandler
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 10.421 sec - in org.apache.hadoop.mapreduce.jobhistory.TestJobHistoryEventHandler
Running org.apache.hadoop.mapreduce.jobhistory.TestEvents
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.716 sec - in org.apache.hadoop.mapreduce.jobhistory.TestEvents

Results :

Tests run: 237, Failures: 0, Errors: 0, Skipped: 0

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] hadoop-mapreduce-client ........................... SUCCESS [2.538s]
[INFO] hadoop-mapreduce-client-core ...................... SUCCESS [39.231s]
[INFO] hadoop-mapreduce-client-common .................... SUCCESS [24.916s]
[INFO] hadoop-mapreduce-client-shuffle ................... SUCCESS [2.403s]
[INFO] hadoop-mapreduce-client-app ....................... FAILURE [5:29.843s]
[INFO] hadoop-mapreduce-client-hs ........................ SKIPPED
[INFO] hadoop-mapreduce-client-jobclient ................. SKIPPED
[INFO] hadoop-mapreduce-client-hs-plugins ................ SKIPPED
[INFO] Apache Hadoop MapReduce Examples .................. SKIPPED
[INFO] hadoop-mapreduce .................................. SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 6:39.599s
[INFO] Finished at: Thu Sep 26 13:24:20 UTC 2013
[INFO] Final Memory: 23M/196M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.16:test (default-test) on project hadoop-mapreduce-client-app: ExecutionException; nested exception is java.util.concurrent.ExecutionException: java.lang.RuntimeException: The forked VM terminated without saying properly goodbye. VM crash or System.exit called ?
[ERROR] Command was /bin/sh -c cd /home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app && /home/jenkins/tools/java/jdk1.6.0_26/jre/bin/java -Xmx1024m -XX:+HeapDumpOnOutOfMemoryError -jar /home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/target/surefire/surefirebooter387726239521400.jar /home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/target/surefire/surefire3444896056369734060tmp /home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/target/surefire/surefire_558291672575213736381tmp
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :hadoop-mapreduce-client-app
Build step 'Execute shell' marked build as failure
[FINDBUGS] Skipping publisher since build result is FAILURE
Archiving artifacts
Updating HDFS-5041
Updating YARN-49
Updating MAPREDUCE-5503
Updating HADOOP-9981
Updating HDFS-5246
Updating MAPREDUCE-5170
Updating YARN-1157
Updating HADOOP-9761
Email was triggered for: Failure
Sending email for trigger: Failure

### ## FAILED TESTS (if any) ##
No tests ran.
[jira] [Created] (MAPREDUCE-5542) Killing a job just as it finishes can generate an NPE in client
Jason Lowe created MAPREDUCE-5542:
-------------------------------------
             Summary: Killing a job just as it finishes can generate an NPE in client
                 Key: MAPREDUCE-5542
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5542
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: client, mrv2
    Affects Versions: 2.1.0-beta
            Reporter: Jason Lowe

If a client tries to kill a job just as the job is finishing, the client can crash with an NPE.
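The report gives no stack trace, but the race it describes is a classic check-then-act gap: the client looks a job up, the job completes and is removed, and a later dereference hits null. A hypothetical sketch of the kind of guard that avoids it (all names here are illustrative, not the actual JobClient code):

```java
// Illustrative sketch of the MAPREDUCE-5542 race: the job finishes
// between the lookup and the kill, so the lookup returns null.
// Cluster and JobStatus below are stand-ins, not Hadoop's real types.
public class KillClientSketch {
    interface Cluster {
        JobStatus getJobStatus(String jobId); // may return null if job is gone
    }

    static final class JobStatus { }

    /** Returns true if a kill was issued, false if the job already finished. */
    static boolean killJob(Cluster cluster, String jobId) {
        JobStatus status = cluster.getJobStatus(jobId);
        if (status == null) {
            // Job completed and was removed before we could kill it;
            // treat this as "nothing to do" instead of dereferencing null.
            return false;
        }
        // ... issue the kill using status ...
        return true;
    }

    public static void main(String[] args) {
        // Job already gone: lookup yields null, and we bail out cleanly.
        System.out.println(killJob(jobId -> null, "job_201309221020_0001")); // prints false
    }
}
```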
[jira] [Created] (MAPREDUCE-5543) In-memory map outputs can be leaked after shuffle completes in 0.23
Jason Lowe created MAPREDUCE-5543:
-------------------------------------
             Summary: In-memory map outputs can be leaked after shuffle completes in 0.23
                 Key: MAPREDUCE-5543
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5543
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: mrv2
    Affects Versions: 2.1.0-beta, 0.23.9
            Reporter: Jason Lowe
            Assignee: Jason Lowe
            Priority: Blocker
             Fix For: 2.1.1-beta

MergeManagerImpl#close adds the contents of inMemoryMergedMapOutputs and inMemoryMapOutputs to a list of map outputs that is subsequently processed, but it does not clear those sets. This prevents some of the map outputs from being garbage collected and significantly reduces the memory available for the subsequent reduce phase.
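A minimal sketch of the leak pattern described above. The class and field names mirror the report but are illustrative, not the actual MergeManagerImpl; the clear() calls at the end are the kind of fix the report implies, since without them the sets keep the map outputs reachable after the final merge has consumed them.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch of MAPREDUCE-5543: close() hands the in-memory
// map outputs to the final merge, but if the source sets are not
// cleared they pin the outputs in memory for the whole reduce phase.
class MergeSketch<T> {
    final Set<T> inMemoryMergedMapOutputs = new HashSet<>();
    final Set<T> inMemoryMapOutputs = new HashSet<>();

    List<T> close() {
        List<T> finalSegments = new ArrayList<>();
        finalSegments.addAll(inMemoryMergedMapOutputs);
        finalSegments.addAll(inMemoryMapOutputs);
        // The fix: drop the references so the outputs become collectable
        // once the final merge consumes finalSegments.
        inMemoryMergedMapOutputs.clear();
        inMemoryMapOutputs.clear();
        return finalSegments;
    }

    public static void main(String[] args) {
        MergeSketch<String> m = new MergeSketch<>();
        m.inMemoryMapOutputs.add("mapOutput1");
        m.inMemoryMergedMapOutputs.add("mergedOutput1");
        System.out.println(m.close().size());            // prints 2
        System.out.println(m.inMemoryMapOutputs.size()); // prints 0: no leak
    }
}
```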
[jira] [Created] (MAPREDUCE-5544) JobClient#getJob loads job conf twice
Sandy Ryza created MAPREDUCE-5544:
-------------------------------------
             Summary: JobClient#getJob loads job conf twice
                 Key: MAPREDUCE-5544
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5544
             Project: Hadoop Map/Reduce
          Issue Type: Bug
            Reporter: Sandy Ryza

Calling JobClient#getJob causes the job conf file to be loaded twice: once in the constructor of JobClient.NetworkedJob and once in Cluster#getJob. We should remove the former. MAPREDUCE-5001 was meant to fix a race that was causing problems in Hive tests, but the problem persists because it only fixed one of the places where the job conf file is loaded.