[jira] [Resolved] (MAPREDUCE-5535) TestClusterMRNotification.testMR is failing

2013-09-26 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen resolved MAPREDUCE-5535.


Resolution: Duplicate

This will be fixed together with MAPREDUCE-5538.

 TestClusterMRNotification.testMR is failing
 ---

 Key: MAPREDUCE-5535
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5535
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Jian He

 {code}
 testMR(org.apache.hadoop.mapred.TestClusterMRNotification)  Time elapsed: 
 35.222 sec   FAILURE!
 junit.framework.AssertionFailedError: expected:<2> but was:<0>
   at junit.framework.Assert.fail(Assert.java:50)
   at junit.framework.Assert.failNotEquals(Assert.java:287)
   at junit.framework.Assert.assertEquals(Assert.java:67)
   at junit.framework.Assert.assertEquals(Assert.java:199)
   at junit.framework.Assert.assertEquals(Assert.java:205)
   at 
 org.apache.hadoop.mapred.NotificationTestCase.testMR(NotificationTestCase.java:163)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at junit.framework.TestCase.runTest(TestCase.java:168)
   at junit.framework.TestCase.runBare(TestCase.java:134)
   at junit.framework.TestResult$1.protect(TestResult.java:110)
   at junit.framework.TestResult.runProtected(TestResult.java:128)
   at junit.framework.TestResult.run(TestResult.java:113)
   at junit.framework.TestCase.run(TestCase.java:124)
   at junit.framework.TestSuite.runTest(TestSuite.java:243)
   at junit.framework.TestSuite.run(TestSuite.java:238)
   at 
 org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
   at 
 org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
   at 
 org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
   at 
 org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
 {code}
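For context, the failing assertion counts job-end HTTP notifications received by the test's embedded servlet. The shape of the check can be sketched as follows; this is a simplified stand-in, not the actual NotificationTestCase code, and all names here are illustrative:

```java
// Simplified stand-in for the notification counter the test asserts on.
// In the real test, a servlet increments a counter for each job-end
// notification callback; testMR then asserts the expected count was reached.
public class NotificationCounter {
    private int received = 0;

    public synchronized void onNotification() {
        received++; // one increment per job-end callback delivered
    }

    public synchronized int count() {
        return received;
    }
}
```

The failure above means the servlet saw 0 callbacks where 2 were expected, i.e. the job-end notifications were never delivered to the test's endpoint.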

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (MAPREDUCE-5540) Speculative tasks make the default JobQueueTaskScheduler scheduling unreasonable

2013-09-26 Thread guowp_aily (JIRA)
guowp_aily created MAPREDUCE-5540:
-

 Summary: Speculative tasks make the default JobQueueTaskScheduler 
scheduling unreasonable
 Key: MAPREDUCE-5540
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5540
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: guowp_aily


Speculative tasks make the default JobQueueTaskScheduler scheduling 
unreasonable.

With speculative execution enabled, even when resources are abundant, the 
default scheduler is still prone to leaving (map, reduce) tasks pending.
The cluster configuration: 3 tasktrackers, 12 reduce slots per node.

 
The job queue holds only 2 jobs:
job_201309221020_0357 has eleven reduce tasks running, and
job_201309221020_0358 has one reduce task pending.
But the cluster has 36 reduce slots in total, so why does 
job_201309221020_0358's reduce need to be pending?
job_201309221020_0358 waited about 2 minutes, and its reduce was finally 
scheduled only after job_201309221020_0357 completed one of its reduce tasks.

Checking the job logs and the scheduler source code suggests that speculative 
tasks cause the default scheduling algorithm to compute less available 
capacity than expected.


Task task_201309221020_0357_r_06 actually started two attempts 
(attempt_201309221020_0357_r_06_0 and attempt_201309221020_0357_r_06_1), 
so although job_201309221020_0357 has only eleven reduce tasks, speculative 
execution makes it occupy twelve slots (four slots per node), accounting for 
all 12 currently running slots.

Under the default scheduling algorithm, job_201309221020_0358's reduce task 
must wait for one of job_201309221020_0357's reduce tasks to complete; 
otherwise it stays pending forever. So is the default scheduling algorithm 
unsuitable when speculative execution is enabled?

 

JobQueueTaskScheduler:

double reduceLoadFactor = (double) remainingReduceLoad / clusterReduceCapacity;
// remainingReduceLoad: job_201309221020_0357's running reduces +
// job_201309221020_0358's pending reduce = 12
// clusterReduceCapacity: 36
// reduceLoadFactor = 12/36 = 0.333
 
final int trackerCurrentReduceCapacity =
    Math.min((int) Math.ceil(reduceLoadFactor * trackerReduceCapacity),
             trackerReduceCapacity);
// trackerReduceCapacity: 12 reduce slots per tracker, all running
// job_201309221020_0357's reduces
// trackerCurrentReduceCapacity = ceil(0.333 * 12) = 4


final int availableReduceSlots =
    Math.min((trackerCurrentReduceCapacity - trackerRunningReduces), 1);
// trackerRunningReduces: 4 per node (all job_201309221020_0357's)
// availableReduceSlots = Math.min((4 - 4), 1) = 0
 
boolean exceededReducePadding = false;
if (availableReduceSlots > 0) {
  // while job_201309221020_0357's reduce tasks are running,
  // availableReduceSlots is always less than 1, so this branch is skipped
  exceededReducePadding = exceededPadding(false, clusterStatus,
      trackerReduceCapacity);
  synchronized (jobQueue) {
    LOG.debug("try to assign 1 reduce task to TaskTracker ["
        + taskTracker.trackerName + "]..");
    for (JobInProgress job : jobQueue) {
      if (job.getStatus().getRunState() != JobStatus.RUNNING
          || job.numReduceTasks == 0) {
        continue;
      }
      ... ...
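Plugging the reporter's numbers into the excerpt above shows how the per-tracker capacity collapses to zero available slots. This is a self-contained re-computation of the excerpt's arithmetic, using the cluster values from the report:

```java
// Reproduces the arithmetic from the JobQueueTaskScheduler excerpt with the
// reported cluster values: 12 reduces in the queue, 36 cluster reduce slots,
// 12 reduce slots per tracker, 4 reduces already running on the tracker.
public class SchedulerMath {
    public static int availableReduceSlots(int remainingReduceLoad,
                                           int clusterReduceCapacity,
                                           int trackerReduceCapacity,
                                           int trackerRunningReduces) {
        double reduceLoadFactor =
            (double) remainingReduceLoad / clusterReduceCapacity; // 12/36
        int trackerCurrentReduceCapacity = Math.min(
            (int) Math.ceil(reduceLoadFactor * trackerReduceCapacity), // 4
            trackerReduceCapacity);
        // 4 slots of capacity minus 4 already running leaves min(0, 1) = 0
        return Math.min(trackerCurrentReduceCapacity - trackerRunningReduces, 1);
    }
}
```

With speculative attempts pushing trackerRunningReduces to 4 on every node, this stays at 0 until one of the big job's reduces finishes, which matches the observed pending state.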





[jira] [Created] (MAPREDUCE-5541) improved algorithm for deciding whether a speculative task is needed

2013-09-26 Thread zhaoyunjiong (JIRA)
zhaoyunjiong created MAPREDUCE-5541:
---

 Summary: improved algorithm for deciding whether a speculative task is needed
 Key: MAPREDUCE-5541
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5541
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: zhaoyunjiong
Assignee: zhaoyunjiong


Most of the time, tasks do not start running at the same time.
In that case, hasSpeculativeTask in TaskInProgress does not work very well:
sometimes a task has only just started running and the scheduler already
decides it needs a speculative attempt, which wastes a lot of resources.
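The improvement described here can be sketched as a minimum-runtime guard in front of the speculation decision. This is a hedged illustration of the idea only, not Hadoop's actual hasSpeculativeTask implementation; the threshold and lag constants are made up:

```java
// Illustrative sketch: do not consider a task for speculation until it has
// run long enough to judge, so freshly started attempts are not immediately
// duplicated. Constants are placeholders, not values from Hadoop.
public class SpeculationCheck {
    static final long MIN_RUNTIME_MS = 60_000;   // illustrative threshold
    static final double LAG_THRESHOLD = 0.2;     // illustrative progress lag

    // progress and avgProgress are in [0, 1]; returns true if the attempt
    // looks slow enough to justify launching a speculative duplicate
    public static boolean needsSpeculation(long runtimeMs, double progress,
                                           double avgProgress) {
        if (runtimeMs < MIN_RUNTIME_MS) {
            return false; // too early to judge a just-started task
        }
        return progress < avgProgress - LAG_THRESHOLD;
    }
}
```

A just-started task with low progress is left alone, while a long-running task that lags well behind the average still qualifies for speculation.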



Hadoop-Mapreduce-trunk - Build # 1560 - Still Failing

2013-09-26 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1560/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 31709 lines...]
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.311 sec - in 
org.apache.hadoop.mapreduce.v2.app.commit.TestCommitterEventHandler
Running org.apache.hadoop.mapreduce.v2.app.launcher.TestContainerLauncherImpl
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.706 sec - in 
org.apache.hadoop.mapreduce.v2.app.launcher.TestContainerLauncherImpl
Running org.apache.hadoop.mapreduce.v2.app.launcher.TestContainerLauncher
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.679 sec - in 
org.apache.hadoop.mapreduce.v2.app.launcher.TestContainerLauncher
Running org.apache.hadoop.mapreduce.jobhistory.TestJobHistoryEventHandler
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 10.421 sec - in 
org.apache.hadoop.mapreduce.jobhistory.TestJobHistoryEventHandler
Running org.apache.hadoop.mapreduce.jobhistory.TestEvents
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.716 sec - in 
org.apache.hadoop.mapreduce.jobhistory.TestEvents

Results :

Tests run: 237, Failures: 0, Errors: 0, Skipped: 0

[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] hadoop-mapreduce-client ... SUCCESS [2.538s]
[INFO] hadoop-mapreduce-client-core .. SUCCESS [39.231s]
[INFO] hadoop-mapreduce-client-common  SUCCESS [24.916s]
[INFO] hadoop-mapreduce-client-shuffle ... SUCCESS [2.403s]
[INFO] hadoop-mapreduce-client-app ... FAILURE [5:29.843s]
[INFO] hadoop-mapreduce-client-hs  SKIPPED
[INFO] hadoop-mapreduce-client-jobclient . SKIPPED
[INFO] hadoop-mapreduce-client-hs-plugins  SKIPPED
[INFO] Apache Hadoop MapReduce Examples .. SKIPPED
[INFO] hadoop-mapreduce .. SKIPPED
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 6:39.599s
[INFO] Finished at: Thu Sep 26 13:24:20 UTC 2013
[INFO] Final Memory: 23M/196M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.16:test (default-test) on 
project hadoop-mapreduce-client-app: ExecutionException; nested exception is 
java.util.concurrent.ExecutionException: java.lang.RuntimeException: The forked 
VM terminated without saying properly goodbye. VM crash or System.exit called ?
[ERROR] Command was /bin/sh -c cd 
/home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app
  /home/jenkins/tools/java/jdk1.6.0_26/jre/bin/java -Xmx1024m 
-XX:+HeapDumpOnOutOfMemoryError -jar 
/home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/target/surefire/surefirebooter387726239521400.jar
 
/home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/target/surefire/surefire3444896056369734060tmp
 
/home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/target/surefire/surefire_558291672575213736381tmp
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn goals -rf :hadoop-mapreduce-client-app
Build step 'Execute shell' marked build as failure
[FINDBUGS] Skipping publisher since build result is FAILURE
Archiving artifacts
Updating HDFS-5041
Updating YARN-49
Updating MAPREDUCE-5503
Updating HADOOP-9981
Updating HDFS-5246
Updating MAPREDUCE-5170
Updating YARN-1157
Updating HADOOP-9761
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Created] (MAPREDUCE-5542) Killing a job just as it finishes can generate an NPE in client

2013-09-26 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-5542:
-

 Summary: Killing a job just as it finishes can generate an NPE in 
client
 Key: MAPREDUCE-5542
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5542
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client, mrv2
Affects Versions: 2.1.0-beta
Reporter: Jason Lowe


If a client tries to kill a job just as the job is finishing, then the client 
can crash with an NPE.
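The race implied here is that the job completes (and its report is purged) between the client's lookup and the kill call, so a status accessor returns null and is then dereferenced. A minimal sketch of the defensive shape such a fix could take; the names below are illustrative, not the actual JobClient code:

```java
// Illustrative only: a stand-in for the client-side kill path. If the
// job's state has become null because the job just finished, report a
// terminal state instead of dereferencing null.
public class KillJobClient {
    public interface StatusSource {
        String getState(); // may return null once the job is gone
    }

    public static String killOrFinalState(StatusSource job) {
        String state = job.getState();
        if (state == null) {
            return "COMPLETED"; // job finished before the kill landed
        }
        return "KILLED";
    }
}
```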



[jira] [Created] (MAPREDUCE-5543) In-memory map outputs can be leaked after shuffle completes in 0.23

2013-09-26 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-5543:
-

 Summary: In-memory map outputs can be leaked after shuffle 
completes in 0.23
 Key: MAPREDUCE-5543
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5543
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.1.0-beta, 0.23.9
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker
 Fix For: 2.1.1-beta


MergeManagerImpl#close adds the contents of inMemoryMergedMapOutputs and 
inMemoryMapOutputs to a list of map outputs that is subsequently processed, but 
it does not clear those sets.  This prevents some of the map outputs from being 
garbage collected and significantly reduces the memory available for the 
subsequent reduce phase.
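The leak pattern described above can be shown in miniature: close() copies both in-memory sets into the returned list but never clears them, so the manager keeps strong references to every map output. Clearing the sets after the copy lets the outputs be garbage collected. This is a simplified sketch, not the real MergeManagerImpl:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified sketch of the fix: copy both in-memory sets into the result,
// then clear them so the manager no longer pins the map outputs in memory.
public class MergeClose {
    final Set<String> inMemoryMergedMapOutputs = new HashSet<>();
    final Set<String> inMemoryMapOutputs = new HashSet<>();

    List<String> close() {
        List<String> memory = new ArrayList<>(inMemoryMergedMapOutputs);
        memory.addAll(inMemoryMapOutputs);
        inMemoryMergedMapOutputs.clear(); // without these two clear() calls,
        inMemoryMapOutputs.clear();       // the outputs stay reachable (the bug)
        return memory;
    }
}
```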



[jira] [Created] (MAPREDUCE-5544) JobClient#getJob loads job conf twice

2013-09-26 Thread Sandy Ryza (JIRA)
Sandy Ryza created MAPREDUCE-5544:
-

 Summary: JobClient#getJob loads job conf twice
 Key: MAPREDUCE-5544
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5544
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Sandy Ryza


Calling JobClient#getJob causes the job conf file to be loaded twice, once in 
the constructor of JobClient.NetworkedJob and once in Cluster#getJob.  We 
should remove the former.

MAPREDUCE-5001 was meant to fix a race that was causing problems in Hive tests, 
but the problem persists because it only fixed one of the places where the job 
conf file is loaded.
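The duplication can be demonstrated with a tiny counter around the conf load. This is schematic: loadConf stands in for the real conf-file read, and the two call sites mirror Cluster#getJob and the NetworkedJob constructor; none of these names are the actual API:

```java
// Schematic: count how many times the (expensive) conf load runs per getJob.
public class ConfLoadDemo {
    static int loads = 0;

    static String loadConf(String jobId) {
        loads++;                 // stands in for reading the job conf file
        return "conf-" + jobId;
    }

    // Current behavior: the outer lookup loads the conf, then the
    // NetworkedJob constructor loads it again.
    static String getJobCurrent(String jobId) {
        String fromCluster = loadConf(jobId);      // mirrors Cluster#getJob
        String fromNetworkedJob = loadConf(jobId); // mirrors NetworkedJob ctor
        return fromNetworkedJob;
    }

    // Proposed: load once and hand the result down.
    static String getJobFixed(String jobId) {
        return loadConf(jobId);
    }
}
```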
