[jira] [Created] (MAPREDUCE-4866) ShuffleRamManager is limited to 2Gb of memory - we should increase that

2012-12-10 Thread Varene Olivier (JIRA)
Varene Olivier created MAPREDUCE-4866:
-

 Summary: ShuffleRamManager is limited to 2Gb of memory - we should 
increase that
 Key: MAPREDUCE-4866
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4866
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Affects Versions: 0.20.2
 Environment: Linux, 64-bit CPU, more than 2 GB of memory for each 
reducer task
Reporter: Varene Olivier
Priority: Minor


In org.apache.hadoop.mapred.ReduceTask, the *ShuffleRamManager* can allocate 
at most 2 GB of memory during the shuffle phase. 
We should be able to allocate more, to take advantage of the full memory 
available on the servers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4866) ShuffleRamManager is limited to 2Gb of memory - we should increase that

2012-12-10 Thread Varene Olivier (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varene Olivier updated MAPREDUCE-4866:
--

Attachment: MAPREDUCE-4866-INCOMPLETE.patch

This patch is incomplete and does not work, because a single byte array 
cannot hold more than about Integer.MAX_VALUE elements.

We need to find another way to allocate such a large amount of memory.

What about BigArrays?
What do you think? Any proposals?
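
One possible direction, sketched under assumptions: instead of a single byte[], keep several ordinary byte[] chunks and address them with a long index. The class name ChunkedByteArray and its methods are hypothetical; this is neither the attached patch nor any existing BigArrays library.

```java
/**
 * Hypothetical sketch: a byte buffer addressed by a long index and backed by
 * several ordinary byte[] chunks, so its total size can exceed
 * Integer.MAX_VALUE (heap size permitting).
 */
final class ChunkedByteArray {
    private final byte[][] chunks;
    private final int chunkSize;
    private final long length;

    ChunkedByteArray(long length, int chunkSize) {
        if (length < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("length must be >= 0 and chunkSize > 0");
        }
        long chunkCount = (length + chunkSize - 1) / chunkSize;
        if (chunkCount > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("too many chunks; use a larger chunkSize");
        }
        this.length = length;
        this.chunkSize = chunkSize;
        this.chunks = new byte[(int) chunkCount][];
        long remaining = length;
        for (int i = 0; i < chunks.length; i++) {
            // The last chunk may be shorter than chunkSize.
            int size = (int) Math.min(chunkSize, remaining);
            chunks[i] = new byte[size];
            remaining -= size;
        }
    }

    long length() {
        return length;
    }

    byte get(long index) {
        checkIndex(index);
        return chunks[(int) (index / chunkSize)][(int) (index % chunkSize)];
    }

    void set(long index, byte value) {
        checkIndex(index);
        chunks[(int) (index / chunkSize)][(int) (index % chunkSize)] = value;
    }

    private void checkIndex(long index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index + ", length " + length);
        }
    }
}
```

A caller would pick a chunk size well below Integer.MAX_VALUE (for example 1 << 30) and use get/set with long offsets wherever the current code indexes a byte[] directly.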

 ShuffleRamManager is limited to 2Gb of memory - we should increase that
 ---

 Key: MAPREDUCE-4866
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4866
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Affects Versions: 0.20.2
 Environment: linux, 64bits cpu, more than 2Gb of memory for each 
 reducer tasks
Reporter: Varene Olivier
Priority: Minor
  Labels: patch
 Attachments: MAPREDUCE-4866-INCOMPLETE.patch


 Inside the org.apache.hadoop.mapred.ReduceTask.java, the *ShuffleRamManager* 
 is limited to allocate up to 2Gb of memory during the shuffle phase. 
 We should be able to allocate more, to take advantage of the full memory we 
 have on servers.



[jira] [Created] (MAPREDUCE-4867) reduces tasks won't start in certain circumstances

2012-12-10 Thread Vincent Behar (JIRA)
Vincent Behar created MAPREDUCE-4867:


 Summary: reduces tasks won't start in certain circumstances 
 Key: MAPREDUCE-4867
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4867
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: scheduler
Affects Versions: 1.0.4
Reporter: Vincent Behar


The start of reduce tasks is conditioned by the value of 
mapred.reduce.slowstart.completed.maps. However, if the number of completed 
map tasks never reaches the configured value (for example because 
mapred.max.map.failures.percent has been set to a high value, to permit a job 
to have a lot of failed tasks), then the reduce tasks won't start.
The job is still running, all map tasks are finished (either successfully or 
not), and all reduce tasks are still pending. The only thing one can do is to 
kill the job.

There are 2 things that could be done:

- document the relation between mapred.max.map.failures.percent and 
mapred.reduce.slowstart.completed.maps: the rule to follow if you want to be 
sure that your reduce tasks will start is: 
mapred.reduce.slowstart.completed.maps * 100 < 100 - 
mapred.max.map.failures.percent

- fix JobInProgress.scheduleReduces() to return true if all map tasks are 
finished
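
The two options above can be illustrated with a small sketch. It is not the real JobInProgress code: the class SlowstartCheck, its method names and its parameters are hypothetical, and the threshold arithmetic is only an assumption about how the slowstart fraction is applied.

```java
/** Hypothetical illustration of the two proposals; not Hadoop code. */
final class SlowstartCheck {

    /**
     * Documentation rule from the report:
     * slowstart * 100 < 100 - maxMapFailuresPercent.
     */
    static boolean reducesGuaranteedToStart(double slowstartCompletedMaps,
                                            int maxMapFailuresPercent) {
        return slowstartCompletedMaps * 100 < 100 - maxMapFailuresPercent;
    }

    /**
     * Proposed behaviour: schedule reduces once the slowstart threshold is
     * reached OR once every map task has finished, successfully or not.
     */
    static boolean shouldScheduleReduces(int totalMaps, int succeededMaps,
                                         int failedMaps,
                                         double slowstartCompletedMaps) {
        boolean allMapsFinished = succeededMaps + failedMaps >= totalMaps;
        boolean thresholdReached =
            succeededMaps >= Math.ceil(slowstartCompletedMaps * totalMaps);
        return allMapsFinished || thresholdReached;
    }
}
```

With a slowstart of 0.5 and mapred.max.map.failures.percent set to 60, the rule is violated (50 is not below 40), which is the configuration where a job with 40 successful and 60 failed maps out of 100 would otherwise leave its reduces pending forever.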



[jira] [Commented] (MAPREDUCE-4867) reduces tasks won't start in certain circumstances

2012-12-10 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527964#comment-13527964
 ] 

Jason Lowe commented on MAPREDUCE-4867:
---

I believe this is a duplicate of MAPREDUCE-2129 which was fixed in 1.1.0.



[jira] [Commented] (MAPREDUCE-4867) reduces tasks won't start in certain circumstances

2012-12-10 Thread Vincent Behar (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527994#comment-13527994
 ] 

Vincent Behar commented on MAPREDUCE-4867:
--

Yes, it is a duplicate of MAPREDUCE-2129 (sorry, I didn't find it).

The fix has been applied to branch-1 and branch-1.1, but not branch-1.0.
Merging r1358233 (from branch-1) into branch-1.0 should be enough.

Thanks



[jira] [Commented] (MAPREDUCE-4867) reduces tasks won't start in certain circumstances

2012-12-10 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528008#comment-13528008
 ] 

Jason Lowe commented on MAPREDUCE-4867:
---

Adding Matt Foley who is the release manager for Hadoop 1.x.  He can comment on 
whether there are plans for another 1.0.x release and if MAPREDUCE-2129 would 
be a good candidate.



[jira] [Updated] (MAPREDUCE-4703) Add the ability to start the MiniMRClientCluster using the configurations used before it is being stopped.

2012-12-10 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated MAPREDUCE-4703:
--

Issue Type: Improvement  (was: Bug)

 Add the ability to start the MiniMRClientCluster using the configurations 
 used before it is being stopped.
 --

 Key: MAPREDUCE-4703
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4703
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, test
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4703_branch-1.patch, 
 MAPREDUCE-4703_branch-1_rev2.patch, MAPREDUCE-4703.patch, 
 MAPREDUCE-4703_rev2.patch, MAPREDUCE-4703_rev3.patch


 The objective here is to enable restarting the cluster, after it has been 
 stopped, using the same configurations/port numbers used before stopping.
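
As a rough illustration of the idea (not the MiniMRClientCluster implementation), a service can remember the port it actually bound to on first start and bind to that same port again on restart. The class RestartableService and all of its methods are hypothetical names used only for this sketch.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

/** Hypothetical sketch: restart a listener on the port used before stopping. */
final class RestartableService {
    private int port;            // 0 means "let the OS choose" on first start
    private ServerSocket socket;

    RestartableService(int port) {
        this.port = port;
    }

    void start() throws IOException {
        ServerSocket s = new ServerSocket();
        try {
            s.setReuseAddress(true);
            s.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port));
        } catch (IOException e) {
            s.close();
            throw e;
        }
        // Remember the resolved port so a later restart reuses it.
        port = s.getLocalPort();
        socket = s;
    }

    void stop() throws IOException {
        if (socket != null) {
            socket.close();
            socket = null;
        }
    }

    void restart() throws IOException {
        stop();
        start();
    }

    int getPort() {
        return port;
    }

    boolean isRunning() {
        return socket != null && !socket.isClosed();
    }
}
```

Clients that cached the address from the first start can keep using it after restart(), which is the property the issue asks for.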



[jira] [Commented] (MAPREDUCE-4703) Add the ability to start the MiniMRClientCluster using the configurations used before it is being stopped.

2012-12-10 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528120#comment-13528120
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4703:
---

Thanks Ahmed. I've committed to trunk and branch-2 after running the tests. 

However, when trying to run the test with the branch-1 patch the test is 
failing with the following output.

Would you please take a look at it? I'll hold off committing to branch-1 until 
this is addressed. Leaving the JIRA open as well.

{code}
Testcase: testJob took 14.387 sec
Testcase: testRestart took 7.333 sec
Caused an ERROR
java.io.IOException: Call to localhost/127.0.0.1:59747 failed on local exception: java.io.EOFException
java.lang.RuntimeException: java.io.IOException: Call to localhost/127.0.0.1:59747 failed on local exception: java.io.EOFException
    at org.apache.hadoop.mapred.MiniMRCluster.waitUntilIdle(MiniMRCluster.java:325)
    at org.apache.hadoop.mapred.MiniMRCluster.init(MiniMRCluster.java:527)
    at org.apache.hadoop.mapred.MiniMRCluster.init(MiniMRCluster.java:465)
    at org.apache.hadoop.mapred.MiniMRCluster.init(MiniMRCluster.java:457)
    at org.apache.hadoop.mapred.MiniMRCluster.init(MiniMRCluster.java:449)
    at org.apache.hadoop.mapred.MiniMRCluster.init(MiniMRCluster.java:439)
    at org.apache.hadoop.mapred.MiniMRCluster.init(MiniMRCluster.java:429)
    at org.apache.hadoop.mapred.MiniMRClusterAdapter.restart(MiniMRClusterAdapter.java:80)
    at org.apache.hadoop.mapred.TestMiniMRClientCluster.testRestart(TestMiniMRClientCluster.java:109)
Caused by: java.io.IOException: Call to localhost/127.0.0.1:59747 failed on local exception: java.io.EOFException
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:1144)
    at org.apache.hadoop.ipc.Client.call(Client.java:1112)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
    at org.apache.hadoop.mapred.$Proxy10.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:411)
    at org.apache.hadoop.mapred.JobClient.createRPCProxy(JobClient.java:505)
    at org.apache.hadoop.mapred.JobClient.init(JobClient.java:496)
    at org.apache.hadoop.mapred.JobClient.init(JobClient.java:479)
    at org.apache.hadoop.mapred.MiniMRCluster.waitUntilIdle(MiniMRCluster.java:311)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:375)
    at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:841)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:786)
{code}



[jira] [Updated] (MAPREDUCE-4703) Add the ability to start the MiniMRClientCluster using the configurations used before it is being stopped.

2012-12-10 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated MAPREDUCE-4703:
--

Affects Version/s: 2.0.3-alpha
   1.2.0
Fix Version/s: 2.0.3-alpha

 Add the ability to start the MiniMRClientCluster using the configurations 
 used before it is being stopped.
 --

 Key: MAPREDUCE-4703
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4703
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, test
Affects Versions: 1.2.0, 2.0.3-alpha
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Fix For: 2.0.3-alpha

 Attachments: MAPREDUCE-4703_branch-1.patch, 
 MAPREDUCE-4703_branch-1_rev2.patch, MAPREDUCE-4703.patch, 
 MAPREDUCE-4703_rev2.patch, MAPREDUCE-4703_rev3.patch


 The objective here is to enable starting back the cluster, after being 
 stopped, using the same configurations/port numbers used before stopping.



[jira] [Commented] (MAPREDUCE-4703) Add the ability to start the MiniMRClientCluster using the configurations used before it is being stopped.

2012-12-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528122#comment-13528122
 ] 

Hudson commented on MAPREDUCE-4703:
---

Integrated in Hadoop-trunk-Commit #3102 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3102/])
MAPREDUCE-4703. Add the ability to start the MiniMRClientCluster using the 
configurations used before it is being stopped. (ahmed.radwan via tucu) 
(Revision 1419618)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1419618
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/MiniMRClientCluster.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/MiniMRClientClusterFactory.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/MiniMRYarnClusterAdapter.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestMiniMRClientCluster.java




[jira] [Commented] (MAPREDUCE-4703) Add the ability to start the MiniMRClientCluster using the configurations used before it is being stopped.

2012-12-10 Thread Ahmed Radwan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528123#comment-13528123
 ] 

Ahmed Radwan commented on MAPREDUCE-4703:
-

Thanks Tucu! I'll take a look at this failure and get back to you.



[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service

2012-12-10 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528271#comment-13528271
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4049:
---

Avner,

If you look at the latest patches for MAPREDUCE-4807, MAPREDUCE-4812 & 
MAPREDUCE-4809, you'll see that they are limited to doing the same thing 
MAPREDUCE-4049 does: define an interface, make existing classes implement 
that interface, and instantiate those classes using 
ReflectionUtils.newInstance(). 
The only extra is a minor refactoring that has already been agreed to be OK 
and poses no risk. The bulk of the patches are testcases that, instead of 
limiting themselves to testing the pluggability, provide simple alternate 
implementations to show the interfaces are adequate for such.
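
The pattern described above (define an interface, have existing classes implement it, instantiate by class name) can be sketched as follows. This is only an illustration: Sorter, DefaultSorter, DescendingSorter and PluginLoader are hypothetical names, and plain java.lang.reflect calls stand in for Hadoop's ReflectionUtils.newInstance().

```java
import java.util.Arrays;

/** Hypothetical plugin interface. */
interface Sorter {
    int[] sort(int[] input);
}

/** Stands in for the existing, built-in implementation. */
final class DefaultSorter implements Sorter {
    public int[] sort(int[] input) {
        int[] copy = input.clone();
        Arrays.sort(copy);
        return copy;
    }
}

/** Simple alternate implementation, as a test might provide. */
final class DescendingSorter implements Sorter {
    public int[] sort(int[] input) {
        int[] copy = input.clone();
        Arrays.sort(copy);
        // Reverse the ascending result in place.
        for (int i = 0, j = copy.length - 1; i < j; i++, j--) {
            int tmp = copy[i];
            copy[i] = copy[j];
            copy[j] = tmp;
        }
        return copy;
    }
}

/** Instantiates a plugin from a class name, e.g. one read from configuration. */
final class PluginLoader {
    static <T> T newInstance(String className, Class<T> iface)
            throws ReflectiveOperationException {
        // asSubclass fails fast if the named class does not implement iface.
        Class<? extends T> impl = Class.forName(className).asSubclass(iface);
        return impl.getDeclaredConstructor().newInstance();
    }
}
```

The framework code only ever refers to the interface; which implementation runs is decided by the class name passed to the loader.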

In order to get this done, I encourage you, as a contributor, to look at the 
work proposed in MAPREDUCE-4807, MAPREDUCE-4812 & MAPREDUCE-4809 and provide 
feedback so we get things in the branch and in trunk.


 plugin for generic shuffle service
 --

 Key: MAPREDUCE-4049
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: performance, task, tasktracker
Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0
Reporter: Avner BenHanoch
Assignee: Avner BenHanoch
  Labels: merge, plugin, rdma, shuffle
 Fix For: 3.0.0

 Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, 
 mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch, 
 mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch


 Support generic shuffle service as a set of two plugins: ShuffleProvider & 
 ShuffleConsumer.
 This will satisfy the following needs:
 # Better shuffle and merge performance. For example: we are working on a 
 shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, 
 or Infiniband) instead of using the current HTTP shuffle. Based on the fast 
 RDMA shuffle, the plugin can also utilize a suitable merge approach during 
 the intermediate merges, hence getting much better performance.
 # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden 
 dependency of NodeManager with a specific version of mapreduce shuffle 
 (currently targeted to 0.24.0).
 References:
 # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu 
 from Auburn University with others, 
 [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
 # I am attaching 2 documents with suggested Top Level Design for both plugins 
 (currently, based on 1.0 branch)
 # I am providing a link for downloading UDA - Mellanox's open source plugin 
 that implements generic shuffle service using RDMA and levitated merge.  
 Note: At this phase, the code is in C++ through JNI and you should consider 
 it as beta only.  Still, it can serve anyone that wants to implement or 
 contribute to levitated merge. (Please be advised that levitated merge is 
 mostly suited to very fast networks) - 
 [http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=144&menu_section=69]



[jira] [Updated] (MAPREDUCE-4859) TestRecoveryManager fails on branch-1

2012-12-10 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated MAPREDUCE-4859:
-

Attachment: MAPREDUCE-4859.patch

I had a look at the failing tests. A couple of the tests were hanging because 
they don't wait for the jobs to complete, so the tasktrackers never exit and 
mini cluster shutdown waits forever. testJobResubmission was failing due to a 
race where the old TIP gets removed while the recovered TIP is running so the 
TT thinks it has never completed. I also made the output directories unique 
since there were occasional clashes between tests despite the test directory 
being deleted each time.
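
Two of the fixes described above, waiting for a job to finish before shutting down and giving each test its own output directory, can be sketched generically. This is not the attached patch: TestIsolation and its methods are hypothetical helpers, and a Callable stands in for a submitted job.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical test helpers illustrating the two fixes. */
final class TestIsolation {

    /** Creates a fresh directory under testRoot so tests never share output paths. */
    static Path uniqueOutputDir(Path testRoot, String testName) throws IOException {
        Files.createDirectories(testRoot);
        return Files.createTempDirectory(testRoot, testName + "-");
    }

    /** Runs the job and blocks until it completes, or fails after the timeout. */
    static <T> T runAndWait(Callable<T> job, long timeoutSeconds) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            return pool.submit(job).get(timeoutSeconds, TimeUnit.SECONDS);
        } finally {
            // Only shut down once the job has finished or timed out.
            pool.shutdownNow();
        }
    }
}
```

A bounded wait also turns a hang into a TimeoutException, which is easier to diagnose than a test run that never exits.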

Tests pass for me on Mac and Linux.

Matt/Arun - can you see if the patch works for you please?


 TestRecoveryManager fails on branch-1
 -

 Key: MAPREDUCE-4859
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4859
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.1
Reporter: Arun C Murthy
Assignee: Arun C Murthy
 Fix For: 1.1.2

 Attachments: MAPREDUCE-4859.patch, MAPREDUCE-4859.patch


 Looks like the tests are extremely flaky and just hang.



[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service

2012-12-10 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528317#comment-13528317
 ] 

Arun C Murthy commented on MAPREDUCE-4049:
--

Forgot to add over the weekend - the GM run finished slightly faster with this 
patch (by 13s) on 300 nodes with ~1300 jobs. Overall runtime was 65 mins.



[jira] [Commented] (MAPREDUCE-4808) Allow reduce-side merge to be pluggable

2012-12-10 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528334#comment-13528334
 ] 

Arun C Murthy commented on MAPREDUCE-4808:
--

Asokan, thanks for the clarification.

However, I'm still trying to understand what you are trying to achieve here.

The original goal of the parent task (MAPREDUCE-2454) was to make 'sort 
pluggable'.

We've accomplished that with MAPREDUCE-4807 and MAPREDUCE-4809.

Now, are we done? If not, what else is remaining to achieve that? Do you need 
some special hook in the Reducer's merge for Syncsort?

As I've told you in person, when making sweeping changes to the framework it's 
better to focus on the 'goal' and make the minimal changes needed to get there.

We can always do more work and add more features, but let's do one thing at a 
time. We can add limit-N etc. separately, it just delays this jira - why do 
that?


 Allow reduce-side merge to be pluggable
 ---

 Key: MAPREDUCE-4808
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4808
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Affects Versions: 2.0.2-alpha
Reporter: Arun C Murthy
Assignee: Mariappan Asokan
 Fix For: 2.0.3-alpha

 Attachments: COMBO-mapreduce-4809-4812-4808.patch, 
 mapreduce-4808.patch


 Allow reduce-side merge to be pluggable for MAPREDUCE-2454



[jira] [Commented] (MAPREDUCE-4808) Allow reduce-side merge to be pluggable

2012-12-10 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528346#comment-13528346
 ] 

Arun C Murthy commented on MAPREDUCE-4808:
--

To be clear, I'm not against newer features - I just want them done 
independently so we can close this out and be done with it.



[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service

2012-12-10 Thread Milind Bhandarkar (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528376#comment-13528376
 ] 

Milind Bhandarkar commented on MAPREDUCE-4049:
--

Thanks for verifying, Arun. FWIW, we have been running with many earlier 
versions of this patch on our Greenplum Analytics Workbench 1000 node cluster 
since May 2012 (I think I had mentioned this to you and Chris Douglas during 
Hadoop Summit in June), and haven't found any issues with this patch so far. 
(See my comment above.)

 plugin for generic shuffle service
 --

 Key: MAPREDUCE-4049
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: performance, task, tasktracker
Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0
Reporter: Avner BenHanoch
Assignee: Avner BenHanoch
  Labels: merge, plugin, rdma, shuffle
 Fix For: 3.0.0

 Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, 
 mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch, 
 mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch


 Support generic shuffle service as a set of two plugins: ShuffleProvider & 
 ShuffleConsumer.
 This will satisfy the following needs:
 # Better shuffle and merge performance. For example: we are working on a 
 shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, 
 or Infiniband) instead of using the current HTTP shuffle. Based on the fast 
 RDMA shuffle, the plugin can also utilize a suitable merge approach during 
 the intermediate merges, and hence get much better performance.
 # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding a hidden 
 dependency of NodeManager on a specific version of the mapreduce shuffle 
 (currently targeted to 0.24.0).
 References:
 # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu 
 from Auburn University with others, 
 [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
 # I am attaching 2 documents with the suggested Top Level Design for both plugins 
 (currently based on the 1.0 branch)
 # I am providing a link for downloading UDA - Mellanox's open source plugin 
 that implements a generic shuffle service using RDMA and levitated merge.  
 Note: At this phase, the code is in C++ through JNI and you should consider 
 it as beta only.  Still, it can serve anyone who wants to implement or 
 contribute to levitated merge. (Please be advised that levitated merge is 
 mostly suited to very fast networks) - 
 [http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=144&menu_section=69]
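
As a rough illustration of the two-plugin idea described above, the sketch below picks a shuffle consumer implementation by class name from a configuration map and falls back to an HTTP default. Every name in it (ShuffleConsumerSketch, HttpShuffleConsumer, RdmaShuffleConsumer, ShufflePluginLoader, the configuration key) is hypothetical and is not taken from the attached patches or design documents.

```java
import java.util.Map;

// Hypothetical consumer-side plugin contract (not the interface from the patch).
interface ShuffleConsumerSketch {
    String transport();
}

// Stand-in for the built-in HTTP shuffle.
class HttpShuffleConsumer implements ShuffleConsumerSketch {
    public String transport() { return "http"; }
}

// Stand-in for an alternative transport such as RDMA.
class RdmaShuffleConsumer implements ShuffleConsumerSketch {
    public String transport() { return "rdma"; }
}

class ShufflePluginLoader {
    // Hypothetical configuration key that names the consumer class.
    static final String CONSUMER_KEY = "sketch.shuffle.consumer.class";

    // Instantiate the configured class reflectively; use HTTP when the key is unset.
    static ShuffleConsumerSketch load(Map<String, String> conf) {
        String className = conf.getOrDefault(CONSUMER_KEY, HttpShuffleConsumer.class.getName());
        try {
            return Class.forName(className)
                        .asSubclass(ShuffleConsumerSketch.class)
                        .getDeclaredConstructor()
                        .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot load shuffle plugin " + className, e);
        }
    }
}
```

The point of the design is that the framework depends only on the interface, so a faster transport can be swapped in through configuration without changing the caller.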



[jira] [Updated] (MAPREDUCE-4594) Add init/shutdown methods to mapreduce Partitioner

2012-12-10 Thread Radim Kolar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radim Kolar updated MAPREDUCE-4594:
---

Attachment: partitioner4.txt

 Add init/shutdown methods to mapreduce Partitioner
 --

 Key: MAPREDUCE-4594
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4594
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: client
Affects Versions: trunk
Reporter: Radim Kolar
 Attachments: partitioner1.txt, partitioner2.txt, partitioner2.txt, 
 partitioner3.txt, partitioner4.txt


 The Partitioner supports only the Configurable API, which can be used for 
 basic init in setConf(). The problem is that there is no shutdown function.
 I propose using the standard setup() and cleanup() functions, as in the mapper / 
 reducer.
 The use case is that I need to start and stop a Spring context and a datagrid client.
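
A minimal sketch of what such lifecycle hooks could look like, assuming a framework that calls setup() once before the first record and cleanup() once after the last. LifecyclePartitioner, TracingPartitioner and PartitionDriver are hypothetical names for illustration only; this is not the code in the attached partitioner patches.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical partitioner contract with lifecycle hooks named after Mapper/Reducer.
abstract class LifecyclePartitioner<K, V> {
    void setup() {}      // acquire resources once, before any record
    abstract int getPartition(K key, V value, int numPartitions);
    void cleanup() {}    // release resources once, after the last record
}

// Records its lifecycle calls; stands in for starting and stopping an external client.
class TracingPartitioner extends LifecyclePartitioner<String, String> {
    final List<String> events = new ArrayList<String>();

    @Override void setup() { events.add("setup"); }

    @Override int getPartition(String key, String value, int numPartitions) {
        // Mask the sign bit so the result is always in [0, numPartitions).
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    @Override void cleanup() { events.add("cleanup"); }
}

class PartitionDriver {
    // How a framework might drive the hooks: setup, partition every key, always cleanup.
    static int[] run(LifecyclePartitioner<String, String> p, List<String> keys, int numPartitions) {
        int[] out = new int[keys.size()];
        p.setup();
        try {
            for (int i = 0; i < keys.size(); i++) {
                out[i] = p.getPartition(keys.get(i), "", numPartitions);
            }
        } finally {
            p.cleanup();
        }
        return out;
    }
}
```

Calling cleanup() in a finally block means the external resource is released even if partitioning fails partway through.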



[jira] [Commented] (MAPREDUCE-4594) Add init/shutdown methods to mapreduce Partitioner

2012-12-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528491#comment-13528491
 ] 

Hadoop QA commented on MAPREDUCE-4594:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12560312/partitioner4.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3115//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3115//console

This message is automatically generated.



[jira] [Commented] (MAPREDUCE-4396) Make LocalJobRunner work with private distributed cache

2012-12-10 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528577#comment-13528577
 ] 

Ivan Mitic commented on MAPREDUCE-4396:
---

This was fixed with HADOOP-8734 in branch-1-win. Maybe just integrate the same 
patch to branch-1?

 Make LocalJobRunner work with private distributed cache
 ---

 Key: MAPREDUCE-4396
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4396
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 1.0.3
Reporter: Luke Lu
Assignee: Yu Gao
Priority: Minor
 Attachments: mapreduce-4396-branch-1.patch, test-afterpatch.result, 
 test-beforepatch.result, test-patch.result


 Some LocalJobRunner related unit tests fail if the user directory permission 
 and/or umask is too restrictive.



[jira] [Reopened] (MAPREDUCE-4549) Distributed cache conflicts breaks backwards compatability

2012-12-10 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza reopened MAPREDUCE-4549:
---


 Distributed cache conflicts breaks backwards compatability
 --

 Key: MAPREDUCE-4549
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4549
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.3
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
Priority: Critical
 Fix For: 0.23.5

 Attachments: MAPREDUCE-4549-trunk.patch, MR-4549-branch-0.23.txt


 I recently put in MAPREDUCE-4503, which went a bit too far and broke 
 backwards compatibility with 1.0 in distributed cache entries.  Instead of 
 changing the behavior of the distributed cache to more closely match 1.0 
 behavior, I want to just change the exception to a warning message informing 
 the users that it will become an error in 2.0.
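
The "warn instead of throw" idea can be sketched as follows. This assumes that a conflict means two cache entries resolving to the same link name (the URI fragment, or the file name when there is no fragment); that reading is an assumption on my part, and CacheConflictCheck is a hypothetical helper, not code from the attached patches.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CacheConflictCheck {
    // Link name used for a cache entry: the fragment if present, else the last path segment.
    static String linkName(URI uri) {
        if (uri.getFragment() != null) {
            return uri.getFragment();
        }
        String path = uri.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    // Collects one warning per conflicting entry instead of throwing on the first conflict.
    static List<String> findConflicts(List<String> cacheUris) {
        Map<String, String> firstClaim = new HashMap<String, String>(); // link name -> first URI seen
        List<String> warnings = new ArrayList<String>();
        for (String entry : cacheUris) {
            String link = linkName(URI.create(entry));
            String earlier = firstClaim.putIfAbsent(link, entry);
            if (earlier != null && !earlier.equals(entry)) {
                warnings.add("cache entry " + entry + " conflicts with " + earlier
                        + " (link name " + link + ")");
            }
        }
        return warnings;
    }
}
```

A caller would log each returned message and continue, which keeps existing 1.0-style jobs running while still telling users what will break later.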



[jira] [Updated] (MAPREDUCE-4549) Distributed cache conflicts breaks backwards compatability

2012-12-10 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated MAPREDUCE-4549:
--

Affects Version/s: 2.0.2-alpha
Fix Version/s: 2.0.3-alpha

 Distributed cache conflicts breaks backwards compatability
 --

 Key: MAPREDUCE-4549
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4549
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.3, 2.0.2-alpha
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
Priority: Critical
 Fix For: 2.0.3-alpha, 0.23.5

 Attachments: MAPREDUCE-4549-trunk.patch, MR-4549-branch-0.23.txt


 I recently put in MAPREDUCE-4503, which went a bit too far and broke 
 backwards compatibility with 1.0 in distributed cache entries.  Instead of 
 changing the behavior of the distributed cache to more closely match 1.0 
 behavior, I want to just change the exception to a warning message informing 
 the users that it will become an error in 2.0.



[jira] [Updated] (MAPREDUCE-4549) Distributed cache conflicts breaks backwards compatability

2012-12-10 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated MAPREDUCE-4549:
--

Attachment: MAPREDUCE-4549-trunk.patch



[jira] [Commented] (MAPREDUCE-1639) Grouping using hashing instead of sorting

2012-12-10 Thread Jerry Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528596#comment-13528596
 ] 

Jerry Chen commented on MAPREDUCE-1639:
---

+1. I think this feature is valuable, and I would like to take time to work on it. The 
hash-based algorithm can be used both for group-by and for join; neither of them 
requires a global sort.

 Grouping using hashing instead of sorting
 -

 Key: MAPREDUCE-1639
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1639
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: Joydeep Sen Sarma

 most applications of map-reduce care about grouping and not sorting. Sorting 
 is a (relatively expensive) way to achieve grouping. In order to achieve just 
 grouping - one can:
 - replace the sort on the Mappers with a HashTable - and maintain lists of 
 key-values against each hash-bucket.
 - key-value tuples inside each hash bucket are sorted - before spilling or 
 sending to Reducer. Anytime this is done - Combiner can be invoked.
 - HashTable is serialized by hash-bucketid. So merges (of either spills or 
 Map Outputs) work similarly to today (at least there's no change in the overall 
 compute complexity of the merge)
 Of course this hashtable has nothing to do with partitioning. It's just a 
 replacement for the map-side sort.
 --
 this is (pretty much) straight from the MARS project paper: 
 http://www.cse.ust.hk/catalac/papers/mars_pact08.pdf. They report a 45% 
 speedup in inverted index calculation using hashing instead of sorting 
 (reference implementation is NOT against Hadoop though).
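
The three steps listed above can be sketched roughly as below: records go into hash buckets, each bucket is sorted only when spilled, and the spill is written in bucket-id order. HashGroupingBuffer is a hypothetical class that uses String keys and Integer values for brevity; it is neither Hadoop code nor the MARS implementation.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class HashGroupingBuffer {
    // One list of key/value pairs per hash bucket.
    private final List<List<Map.Entry<String, Integer>>> buckets =
            new ArrayList<List<Map.Entry<String, Integer>>>();

    HashGroupingBuffer(int numBuckets) {
        for (int i = 0; i < numBuckets; i++) {
            buckets.add(new ArrayList<Map.Entry<String, Integer>>());
        }
    }

    static int bucketOf(String key, int numBuckets) {
        return (key.hashCode() & Integer.MAX_VALUE) % numBuckets;
    }

    // Map-side collect: append to the key's bucket, no global ordering maintained.
    void collect(String key, int value) {
        buckets.get(bucketOf(key, buckets.size()))
               .add(new AbstractMap.SimpleEntry<String, Integer>(key, value));
    }

    // Spill: walk buckets in id order and sort only inside each bucket, so the
    // output is ordered by (bucket id, key) and equal keys end up adjacent.
    List<Map.Entry<String, Integer>> spill() {
        List<Map.Entry<String, Integer>> out = new ArrayList<Map.Entry<String, Integer>>();
        for (List<Map.Entry<String, Integer>> bucket : buckets) {
            bucket.sort(Map.Entry.<String, Integer>comparingByKey());
            out.addAll(bucket);
            bucket.clear();
        }
        return out;
    }
}
```

Because every spill uses the same (bucket id, key) order, spills can still be combined with an ordinary k-way merge; only the comparison key changes.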



[jira] [Commented] (MAPREDUCE-3247) Add hash aggregation style data flow and/or new API

2012-12-10 Thread Jerry Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528609#comment-13528609
 ] 

Jerry Chen commented on MAPREDUCE-3247:
---

Binglin, I noticed that you created this issue from MAPREDUCE-1639, although I think 
the two issues are more or less similar. There is also a lot of related work 
going on, such as MAPREDUCE-2454 and MAPREDUCE-4049.

If you are not working on this, I would like to take time to work on this 
feature.

 Add hash aggregation style data flow and/or new API
 ---

 Key: MAPREDUCE-3247
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3247
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: task
Affects Versions: 0.23.0
Reporter: Binglin Chang
  Labels: api, perfomance

 In many join/aggregation-like queries run on top of mapreduce, sorting is not 
 needed; in fact, a hash table based join/aggregation is more efficient. This is 
 described in detail in "Tenzing A SQL Implementation On The MapReduce 
 Framework". There are two ways to support hash table based join/aggregation in 
 hadoop mapreduce:
 # Only support no sort: the framework does nothing and just passes partitioned k/v 
 pairs from mapper to reducer.
The upper application uses a hash table in its mapper & reducer to do 
 aggregation, and emits all hashtable entries in cleanup() of the mapper/reducer; 
 this is how Google did it in Tenzing. The main problem is memory control of the 
 hashtable.
 # Add a new fold API. It can coexist with the combiner/reducer API; users can use 
 mapper-combiner-reducer or mapper-folder (maybe a bad name, feel free to 
 propose a better one).
Like foldl in functional programming, the folder should have the semantics:
  foldl folder z (x:xs)  =   foldl folder (folder z x) xs
With this approach, upper applications only need to provide the folder; the underlying 
 framework creates and maintains the hashtable for key/value pairs, so it can be 
 managed & optimized by the framework. For example, on the mapper side, we can 
 emit the entire hashtable early, or use a policy such as a cache algorithm to emit part 
 of the k/v pairs to free some memory, when the memory consumption reaches io.sort.mb.
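
A minimal sketch of the second option, assuming the application supplies only a zero value and a folder function while the framework owns the hash table. FoldingAggregator is a hypothetical class, and maxEntries is a simplified stand-in for a memory limit such as io.sort.mb; early flushing here emits the whole table, which is only one of the policies mentioned above.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical framework-side aggregator: per key, acc = folder(acc, value), starting from zero.
class FoldingAggregator<K, V, A> {
    private final Map<K, A> table = new HashMap<K, A>();
    private final List<Map.Entry<K, A>> output = new ArrayList<Map.Entry<K, A>>();
    private final A zero;
    private final BiFunction<A, V, A> folder;
    private final int maxEntries; // simplified stand-in for a memory budget

    FoldingAggregator(A zero, BiFunction<A, V, A> folder, int maxEntries) {
        this.zero = zero;
        this.folder = folder;
        this.maxEntries = maxEntries;
    }

    // Fold one value into the accumulator for its key.
    void accept(K key, V value) {
        A acc = table.containsKey(key) ? table.get(key) : zero;
        table.put(key, folder.apply(acc, value));
        if (table.size() > maxEntries) {
            flush(); // emit partial aggregates early to free memory
        }
    }

    // Emit every entry currently in the table as a partial aggregate.
    private void flush() {
        for (Map.Entry<K, A> e : table.entrySet()) {
            output.add(new AbstractMap.SimpleEntry<K, A>(e.getKey(), e.getValue()));
        }
        table.clear();
    }

    // Called once at the end of the task, like cleanup().
    List<Map.Entry<K, A>> finish() {
        flush();
        return output;
    }
}
```

Note that early flushing can emit several partial aggregates for the same key, so the receiving side has to fold the partials again; that works directly when the accumulator and value types coincide, as with sums or counts.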



[jira] [Created] (MAPREDUCE-4868) Allow multiple iteration for map

2012-12-10 Thread Jerry Chen (JIRA)
Jerry Chen created MAPREDUCE-4868:
-

 Summary: Allow multiple iteration for map
 Key: MAPREDUCE-4868
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4868
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 3.0.0, 2.0.3-alpha
Reporter: Jerry Chen
 Fix For: 3.0.0, 2.0.3-alpha


Currently, the Mapper class allows advanced users to override the public void 
run(Context context) method for more control over the map the execution of the 
mapper, but the Context interface limits the operations available on the data, 
which are the foundation of that control.

One use case: while working on a Hive optimization problem, I 
want to make two passes over the input data instead of using another job or 
task (which may slow down the whole process). Each pass does the same thing but 
with different parameters.

This is a new paradigm of Map Reduce usage and can be achieved easily by extending 
the Context interface slightly to give more control over the data, such as resetting 
the input.
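
To make the proposal concrete, here is a small sketch of a two-pass run loop over an input that can be rewound. ResettableInput and its reset() method are hypothetical; the existing Context interface has no such method, which is exactly the gap described above. An in-memory list stands in for an input split.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical input wrapper that can be rewound to the start of the split.
class ResettableInput<T> {
    private final List<T> records;
    private Iterator<T> it;

    ResettableInput(List<T> records) {
        this.records = records;
        this.it = records.iterator();
    }

    boolean hasNext() { return it.hasNext(); }

    T next() { return it.next(); }

    // The proposed extension: start reading the same input again.
    void reset() { it = records.iterator(); }
}

class TwoPassMapper {
    // Runs the same per-record logic once per parameter, rewinding between passes.
    static List<Integer> run(ResettableInput<Integer> in, int[] params) {
        List<Integer> out = new ArrayList<Integer>();
        for (int p : params) {
            in.reset();
            while (in.hasNext()) {
                out.add(in.next() * p);
            }
        }
        return out;
    }
}
```

In a real task the reset would have to re-open or seek the underlying record reader, which is where most of the cost of an extra pass would come from.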



[jira] [Updated] (MAPREDUCE-4868) Allow multiple iteration for map

2012-12-10 Thread Jerry Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry Chen updated MAPREDUCE-4868:
--

Description: 
Currently, the Mapper class allows advanced users to override the public void 
run(Context context) method for more control over the execution of the mapper, 
but the Context interface limits the operations available on the data, which are 
the foundation of that control.

One use case: while working on a Hive optimization problem, I 
want to make two passes over the input data instead of using another job or 
task (which may slow down the whole process). Each pass does the same thing but 
with different parameters.

This is a new paradigm of Map Reduce usage and can be achieved easily by extending 
the Context interface slightly to give more control over the data, such as resetting 
the input.

  was:
Currently, the Mapper class allows advanced users to override the public void 
run(Context context) method for more control over the map the execution of the 
mapper, but the Context interface limits the operations available on the data, 
which are the foundation of that control.

One use case: while working on a Hive optimization problem, I 
want to make two passes over the input data instead of using another job or 
task (which may slow down the whole process). Each pass does the same thing but 
with different parameters.

This is a new paradigm of Map Reduce usage and can be achieved easily by extending 
the Context interface slightly to give more control over the data, such as resetting 
the input.


 Allow multiple iteration for map
 

 Key: MAPREDUCE-4868
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4868
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 3.0.0, 2.0.3-alpha
Reporter: Jerry Chen
 Fix For: 3.0.0, 2.0.3-alpha

   Original Estimate: 168h
  Remaining Estimate: 168h

 Currently, the Mapper class allows advanced users to override the public void 
 run(Context context) method for more control over the execution of the 
 mapper, but the Context interface limits the operations available on the data, 
 which are the foundation of that control.
 One use case: while working on a Hive optimization problem, I 
 want to make two passes over the input data instead of using another job or 
 task (which may slow down the whole process). Each pass does the same thing but 
 with different parameters.
 This is a new paradigm of Map Reduce usage and can be achieved easily by 
 extending the Context interface slightly to give more control over the data, such as 
 resetting the input.



[jira] [Commented] (MAPREDUCE-4868) Allow multiple iteration for map

2012-12-10 Thread Radim Kolar (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528629#comment-13528629
 ] 

Radim Kolar commented on MAPREDUCE-4868:


Did you try Spring Batch? You can boot it in setup() and do whatever you want 
with the data, including multiple steps and multithreading.



[jira] [Commented] (MAPREDUCE-4868) Allow multiple iteration for map

2012-12-10 Thread Radim Kolar (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528630#comment-13528630
 ] 

Radim Kolar commented on MAPREDUCE-4868:


You may also find this one handy: org.apache.hadoop.mapred.lib.ChainMapper



[jira] [Commented] (MAPREDUCE-4868) Allow multiple iteration for map

2012-12-10 Thread Jerry Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528641#comment-13528641
 ] 

Jerry Chen commented on MAPREDUCE-4868:
---

Radim, thank you very much for your quick response. I checked ChainMapper, and 
it turns out not to be quite the same thing. ChainMapper actually 
iterates over the map data only once, and each key/value pair goes through the 
chain of mappers. The difference here is that the mapper would be able to run 
multiple iterations over the input. At first glance, that seems to make no sense. 
But it does make sense once you consider the parameter data (not the input data) 
needed for each iteration, and whether that parameter data is available in 
memory for each iteration.

In the Hive optimization problem I mentioned above, the parameter data may not 
fit in memory, so we need to partition it, load one partition into 
memory at a time, and go over the input data once for each partition. 
This avoids a complex reduce stage.

Does this make sense, or is there another approach that provides equivalent 
performance?

Thanks again.
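
The access pattern described in this comment can be sketched as follows, assuming the "parameter data" is a key/value lookup table too large to hold at once. PartitionedLookupJoin and chunkSize are hypothetical names, and the in-memory lists stand in for the input split and the side data; this is an illustration of the pattern, not Hive or Hadoop code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PartitionedLookupJoin {
    // Joins input keys against parameter data that is loaded one chunk at a time.
    // Each chunk costs one full pass over the input.
    static List<String> join(List<String> inputKeys, List<String[]> params, int chunkSize) {
        List<String> out = new ArrayList<String>();
        for (int start = 0; start < params.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, params.size());

            // Load only this partition of the parameter data into memory.
            Map<String, String> chunk = new HashMap<String, String>();
            for (String[] kv : params.subList(start, end)) {
                chunk.put(kv[0], kv[1]);
            }

            // One pass over the whole input for this partition.
            for (String key : inputKeys) {
                String value = chunk.get(key);
                if (value != null) {
                    out.add(key + "=" + value);
                }
            }
        }
        return out;
    }
}
```

The number of input passes is the number of chunks, so the approach trades repeated reads of the input for avoiding a shuffle and reduce stage.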



[jira] [Commented] (MAPREDUCE-4868) Allow multiple iteration for map

2012-12-10 Thread Radim Kolar (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528642#comment-13528642
 ] 

Radim Kolar commented on MAPREDUCE-4868:


If you want multiple passes, then go for Spring Batch. All you need to write is an 
HDFS reader, writer driver for Spring Batch. It's about 20 lines of code each.



[jira] [Commented] (MAPREDUCE-4868) Allow multiple iteration for map

2012-12-10 Thread Jerry Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528664#comment-13528664
 ] 

Jerry Chen commented on MAPREDUCE-4868:
---

It seems to me that Spring Batch is another batch processing infrastructure. 
We are seeking to solve the problem within the context of MapReduce, and to 
empower MapReduce in a reasonable manner, rather than simply hooking 
into another batch processing system.




[jira] [Commented] (MAPREDUCE-4848) TaskAttemptContext cast error during AM recovery

2012-12-10 Thread Jerry Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528685#comment-13528685
 ] 

Jerry Chen commented on MAPREDUCE-4848:
---

Hi Jason, I looked into this problem, and it is a bug in the RecoveryService of 
MRv2. The cause is that the RecoveryService doesn't consider the committer type 
(new-API committer or old-API committer).
I can submit a patch for this issue soon.
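
The failure mode and one possible shape of a fix can be illustrated with stand-in types. Everything below is hypothetical (NewApiContext, OldApiContext, NewApiCommitter, OldApiCommitter, RecoverySketch) and only mirrors the relationship visible in the stack trace, where an old-API committer casts the context it receives; it is not the actual patch.

```java
// Stand-ins for the new-API and old-API task attempt contexts.
interface NewApiContext {
    String attemptId();
}

interface OldApiContext extends NewApiContext {
}

class NewApiContextImpl implements NewApiContext {
    private final String id;
    NewApiContextImpl(String id) { this.id = id; }
    public String attemptId() { return id; }
}

class OldApiContextImpl implements OldApiContext {
    private final String id;
    OldApiContextImpl(String id) { this.id = id; }
    public String attemptId() { return id; }
}

// Stand-in for a new-API committer.
class NewApiCommitter {
    String recoverTask(NewApiContext context) {
        return "new:" + context.attemptId();
    }
}

// Stand-in for an old-API committer: it casts the context, so a new-API
// context passed here fails with ClassCastException.
class OldApiCommitter extends NewApiCommitter {
    @Override
    String recoverTask(NewApiContext context) {
        OldApiContext old = (OldApiContext) context;
        return "old:" + old.attemptId();
    }
}

class RecoverySketch {
    // Build the context type that matches the committer type before recovering.
    static NewApiContext contextFor(NewApiCommitter committer, String attemptId) {
        if (committer instanceof OldApiCommitter) {
            return new OldApiContextImpl(attemptId);
        }
        return new NewApiContextImpl(attemptId);
    }
}
```

With this shape, the recovery path never hands a new-API context to an old-API committer, which is the mismatch the ClassCastException points at.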


 TaskAttemptContext cast error during AM recovery
 

 Key: MAPREDUCE-4848
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4848
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Affects Versions: 0.23.4
Reporter: Jason Lowe

 Recently saw an AM that failed and tried to recover, but the subsequent 
 attempt quickly exited with its own failure during recovery:
 {noformat}
 2012-12-05 02:33:36,752 FATAL [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
 java.lang.ClassCastException: org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl cannot be cast to org.apache.hadoop.mapred.TaskAttemptContext
   at org.apache.hadoop.mapred.OutputCommitter.recoverTask(OutputCommitter.java:284)
   at org.apache.hadoop.mapreduce.v2.app.recover.RecoveryService$InterceptingEventHandler.handle(RecoveryService.java:361)
   at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1211)
   at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1177)
   at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:357)
   at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:298)
   at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
   at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
   at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:958)
   at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:135)
   at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:926)
   at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:918)
   at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
   at org.apache.hadoop.mapreduce.v2.app.recover.RecoveryService$RecoveryDispatcher.realDispatch(RecoveryService.java:285)
   at org.apache.hadoop.mapreduce.v2.app.recover.RecoveryService$RecoveryDispatcher.dispatch(RecoveryService.java:281)
   at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
   at java.lang.Thread.run(Thread.java:619)
 2012-12-05 02:33:36,752 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye..
 {noformat}
 The RM then launched a third AM attempt which succeeded. The third attempt 
 saw basically no progress after parsing the history file from the second 
 attempt and ran the job again from scratch.



[jira] [Updated] (MAPREDUCE-4869) TestMapReduceChildJVM fails in branch-trunk-win

2012-12-10 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated MAPREDUCE-4869:
-

Target Version/s: trunk-win

 TestMapReduceChildJVM fails in branch-trunk-win
 ---

 Key: MAPREDUCE-4869
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4869
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: trunk-win
Reporter: Chris Nauroth
Assignee: Chris Nauroth

 The YARN-233 patch for getting YARN working on Windows forgot to include a 
 corresponding change in {{TestMapReduceChildJVM}}, so the test is failing now.



[jira] [Moved] (MAPREDUCE-4869) TestMapReduceChildJVM fails in branch-trunk-win

2012-12-10 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth moved HADOOP-9130 to MAPREDUCE-4869:
--

      Component/s: test  (was: test)
 Target Version/s:   (was: trunk-win)
Affects Version/s: trunk-win  (was: trunk-win)
              Key: MAPREDUCE-4869  (was: HADOOP-9130)
          Project: Hadoop Map/Reduce  (was: Hadoop Common)




[jira] [Updated] (MAPREDUCE-4869) TestMapReduceChildJVM fails in branch-trunk-win

2012-12-10 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated MAPREDUCE-4869:
-

Attachment: MAPREDUCE-4869-branch-trunk-win.1.patch

The attached patch updates the test so that it no longer expects exec in the 
generated command and performs platform-specific escaping of environment 
variable references.  With YARN-233, the exec is now inserted on the container 
side, because it is an OS-specific command.  With this patch, the test passes 
on Mac and Windows.
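
As a rough illustration of what platform-specific escaping of an environment 
variable reference can look like, here is a minimal, self-contained sketch. It 
is not the code from the attached patch: the class and method names are 
hypothetical, and it assumes the usual conventions of %NAME% on Windows and 
$NAME elsewhere. Note that the expected command no longer starts with exec.

```java
// Hypothetical helper, for illustration only; not taken from the patch.
final class EnvRefSketch {

    /** Wraps a variable name in the reference syntax of the given platform. */
    public static String envRef(String name, boolean windows) {
        // Assumption: %NAME% on Windows, $NAME on Unix-like shells.
        return windows ? "%" + name + "%" : "$" + name;
    }

    /** Builds the start of the expected child JVM command, without a leading "exec". */
    public static String expectedJavaCommand(boolean windows) {
        return envRef("JAVA_HOME", windows) + "/bin/java";
    }

    public static void main(String[] args) {
        boolean windows = System.getProperty("os.name").startsWith("Windows");
        System.out.println(expectedJavaCommand(windows));
    }
}
```

A test written this way can compute its expected string per platform rather 
than hard-coding the Unix form.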




[jira] [Commented] (MAPREDUCE-4810) Add admin command options for ApplicationMaster

2012-12-10 Thread Jerry Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528766#comment-13528766
 ] 

Jerry Chen commented on MAPREDUCE-4810:
---

+1. It makes sense to separate stable confs from those that may vary per job.
The question, though, is how strong the need is to set per-job confs (such as 
heap size) on an MRAppMaster that only manages the running of the tasks and 
does not run tasks itself. What I can think of is that jobs may differ in 
the number of tasks, which may lead to different levels of memory consumption. 
Are there any other use cases that have a specific need for a particular 
MRAppMaster heap size?

 Add admin command options for ApplicationMaster
 ---

 Key: MAPREDUCE-4810
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4810
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: applicationmaster
Affects Versions: 2.0.2-alpha, 0.23.4
Reporter: Jason Lowe
Priority: Minor

 It would be nice if the MR ApplicationMaster had the notion of admin options 
 in addition to the existing user options much like we have for map and reduce 
 tasks, e.g.: mapreduce.admin.map.child.java.opts vs. mapreduce.map.java.opts. 
  This allows site-wide configuration options for MR AMs but still allows a 
 user to easily override the heap size of the AM without worrying about 
 dropping other admin-specified options.
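
 A minimal sketch of how such a split could work, assuming the admin options 
 are simply placed before the user options on the AM command line. The 
 property names below are hypothetical placeholders, not existing 
 configuration keys. The sketch also relies on the assumption that when a flag 
 such as -Xmx appears twice, the JVM honors the later one (this is how HotSpot 
 is generally observed to behave), so a user heap setting wins without 
 dropping the other admin-specified flags.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only; the property names are hypothetical placeholders.
final class AmOptsSketch {
    static final String ADMIN_OPTS = "example.am.admin.java.opts";
    static final String USER_OPTS = "example.am.java.opts";

    /** Concatenates admin options first and user options last. */
    public static String buildAmJavaOpts(Map<String, String> conf) {
        String admin = conf.getOrDefault(ADMIN_OPTS, "").trim();
        String user = conf.getOrDefault(USER_OPTS, "").trim();
        // User options come last so that a repeated flag (e.g. -Xmx) is the
        // later occurrence on the command line.
        return (admin + " " + user).trim();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(ADMIN_OPTS, "-Djava.net.preferIPv4Stack=true -Xmx1024m");
        conf.put(USER_OPTS, "-Xmx2048m");
        System.out.println(buildAmJavaOpts(conf));
    }
}
```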



[jira] [Commented] (MAPREDUCE-4816) JobImpl Invalid event: JOB_TASK_ATTEMPT_COMPLETED at FAILED

2012-12-10 Thread Jerry Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528775#comment-13528775
 ] 

Jerry Chen commented on MAPREDUCE-4816:
---

Hi Jason, I looked into this issue. The cause of the exception is that a job 
in the FAILED state should ignore further JOB_TASK_ATTEMPT_COMPLETED events, 
but in version 0.23.5 the transition from the FAILED state on a 
JOB_TASK_ATTEMPT_COMPLETED event is not declared in the state machine, so the 
exception is thrown.

Besides JOB_TASK_ATTEMPT_COMPLETED, other events such as JOB_TASK_COMPLETED 
and JOB_MAP_TASK_RESCHEDULED can also arrive in the FAILED state and should 
also be declared in the state machine. 
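
To illustrate the idea, here is a small self-contained sketch of a 
table-driven state machine. It does not use the YARN StateMachineFactory API, 
and it is not the actual JobImpl code. The three task event names come from 
the discussion above; HYPOTHETICAL_JOB_FAILED and all class and method names 
are made up for the example. An undeclared (state, event) pair raises an 
error, standing in for the exception in the log, while declaring 
self-transitions at FAILED makes late events harmless.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;

// Illustrative sketch only; not the YARN state machine implementation.
final class JobStateSketch {
    public enum State { RUNNING, FAILED }

    public enum Event {
        JOB_TASK_ATTEMPT_COMPLETED,
        JOB_TASK_COMPLETED,
        JOB_MAP_TASK_RESCHEDULED,
        HYPOTHETICAL_JOB_FAILED // made-up trigger for RUNNING -> FAILED
    }

    // Events that may still arrive after the job has already failed.
    private static final EnumSet<Event> LATE_EVENTS = EnumSet.of(
        Event.JOB_TASK_ATTEMPT_COMPLETED,
        Event.JOB_TASK_COMPLETED,
        Event.JOB_MAP_TASK_RESCHEDULED);

    private final Map<State, Map<Event, State>> table = new EnumMap<>(State.class);
    private State current = State.RUNNING;

    public JobStateSketch(boolean ignoreLateEventsAtFailed) {
        for (State s : State.values()) {
            table.put(s, new EnumMap<Event, State>(Event.class));
        }
        for (Event e : LATE_EVENTS) {
            table.get(State.RUNNING).put(e, State.RUNNING);
        }
        table.get(State.RUNNING).put(Event.HYPOTHETICAL_JOB_FAILED, State.FAILED);
        if (ignoreLateEventsAtFailed) {
            // Declare self-transitions so late events are ignored at FAILED.
            for (Event e : LATE_EVENTS) {
                table.get(State.FAILED).put(e, State.FAILED);
            }
        }
    }

    /** Applies an event; an undeclared transition is reported as an error. */
    public State handle(Event event) {
        State next = table.get(current).get(event);
        if (next == null) {
            throw new IllegalStateException("Invalid event: " + event + " at " + current);
        }
        current = next;
        return current;
    }

    public static void main(String[] args) {
        JobStateSketch job = new JobStateSketch(true);
        job.handle(Event.HYPOTHETICAL_JOB_FAILED);
        System.out.println(job.handle(Event.JOB_TASK_ATTEMPT_COMPLETED));
    }
}
```

Constructing the sketch with false instead of true reproduces the failure 
mode: the late event finds no declared transition and an exception is thrown.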

I checked the trunk version. There seems to have been some refactoring done 
around JobState - JobStateInternal, and it already fixes the transition 
problem mentioned above.

If necessary, I can backport the transition fixes to versions such as 
2.0.2-alpha.



 JobImpl Invalid event: JOB_TASK_ATTEMPT_COMPLETED at FAILED
 ---

 Key: MAPREDUCE-4816
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4816
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Affects Versions: 0.23.5
Reporter: Jason Lowe

 Saw this in an AM log of a task that had failed:
 {noformat}
 2012-11-21 23:26:44,533 ERROR [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Can't handle this event at current state
 org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: JOB_TASK_ATTEMPT_COMPLETED at FAILED
   at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
   at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
   at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
   at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:690)
   at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:113)
   at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:904)
   at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:900)
   at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
   at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
   at java.lang.Thread.run(Thread.java:619)
 {noformat}
