[jira] [Updated] (MAPREDUCE-4911) Add node-level aggregation flag feature(setLocalAggregation(boolean)) to JobConf

2013-01-21 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated MAPREDUCE-4911:
--

Attachment: MAPREDUCE-4911.3.patch

Fixed Javadoc warnings again.
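
For context, a minimal sketch of the accessor pair such a flag would add to JobConf (the property key and default below are assumptions for illustration, not taken from the attached patch):

{code:java}
// Hypothetical sketch only -- method names follow the issue summary; the
// property key and default are made up for illustration.
public static final String LOCAL_AGGREGATION =
    "mapreduce.job.local.aggregation";                  // assumed key

/** Enable or disable node-level (local) aggregation for this job. */
public void setLocalAggregation(boolean enabled) {
  setBoolean(LOCAL_AGGREGATION, enabled);
}

/** @return whether node-level aggregation is enabled; assumed default: false. */
public boolean getLocalAggregation() {
  return getBoolean(LOCAL_AGGREGATION, false);
}
{code}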

 Add node-level aggregation flag feature(setLocalAggregation(boolean)) to 
 JobConf
 

 Key: MAPREDUCE-4911
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4911
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: client
Affects Versions: trunk
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: MAPREDUCE-4911.2.patch, MAPREDUCE-4911.3.patch, 
 MAPREDUCE-4911.patch


 This JIRA adds a node-level aggregation flag 
 feature (setLocalAggregation(boolean)) to JobConf.
 This task is a subtask of MAPREDUCE-4502.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4911) Add node-level aggregation flag feature(setNodeLevelAggregation(boolean)) to JobConf

2013-01-21 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated MAPREDUCE-4911:
--

Summary: Add node-level aggregation flag 
feature(setNodeLevelAggregation(boolean)) to JobConf  (was: Add node-level 
aggregation flag feature(setLocalAggregation(boolean)) to JobConf)

 Add node-level aggregation flag feature(setNodeLevelAggregation(boolean)) to 
 JobConf
 

 Key: MAPREDUCE-4911
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4911
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: client
Affects Versions: trunk
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: MAPREDUCE-4911.2.patch, MAPREDUCE-4911.3.patch, 
 MAPREDUCE-4911.patch


 This JIRA adds a node-level aggregation flag 
 feature (setLocalAggregation(boolean)) to JobConf.
 This task is a subtask of MAPREDUCE-4502.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4911) Add node-level aggregation flag feature(setNodeLevelAggregation(boolean)) to JobConf

2013-01-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13558653#comment-13558653
 ] 

Hadoop QA commented on MAPREDUCE-4911:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12565750/MAPREDUCE-4911.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3257//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3257//console

This message is automatically generated.

 Add node-level aggregation flag feature(setNodeLevelAggregation(boolean)) to 
 JobConf
 

 Key: MAPREDUCE-4911
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4911
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: client
Affects Versions: trunk
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: MAPREDUCE-4911.2.patch, MAPREDUCE-4911.3.patch, 
 MAPREDUCE-4911.patch


 This JIRA adds a node-level aggregation flag 
 feature (setLocalAggregation(boolean)) to JobConf.
 This task is a subtask of MAPREDUCE-4502.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4952) FSSchedulerNode is always instantiated with a 0 virtual core capacity

2013-01-21 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13558658#comment-13558658
 ] 

Harsh J commented on MAPREDUCE-4952:


One may also use More Actions -> Move to move incorrect JIRAs to the right 
project; the URLs are automatically mapped to the new one, making the process 
more elegant. Just something for the future :-)

 FSSchedulerNode is always instantiated with a 0 virtual core capacity
 -

 Key: MAPREDUCE-4952
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4952
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza

 After YARN-2, FSSchedulerNode was not updated to initialize with the 
 underlying RMNode's CPU capacity, and thus always has 0 virtual cores.
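
For illustration, the kind of fix implied here seeds the scheduler node from the RMNode's full Resource; this is a hedged sketch, not the actual FSSchedulerNode code (field names are assumptions):

{code:java}
// Hedged sketch -- not the actual patch; field names are assumptions.
public FSSchedulerNode(RMNode node) {
  this.rmNode = node;
  // getTotalCapability() carries both memory and virtual cores after YARN-2,
  // so using it avoids the 0-vcore capacity described above.
  this.totalResource = node.getTotalCapability();
  this.availableResource = node.getTotalCapability();
}
{code}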

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4876) Adopt a Tuple MapReduce API instead of classic MapReduce one

2013-01-21 Thread posa Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13558665#comment-13558665
 ] 

posa Wu commented on MAPREDUCE-4876:


[~ivan.prado] Does Pangool support MultipleInputs and MultipleOutputs?

 Adopt a Tuple MapReduce API instead of classic MapReduce one
 

 Key: MAPREDUCE-4876
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4876
 Project: Hadoop Map/Reduce
  Issue Type: Wish
Reporter: Iván de Prado
Priority: Minor

 After using MapReduce for many years, we have noticed that it lacks some 
 important features: compound records, easy intra-reduce sorting and join 
 capabilities. We have elaborated a slightly modified MapReduce foundation to 
 overcome these problems: Tuple MapReduce. You can see a full paper published 
 at the ICDM 2012 that describes it at http://pangool.net/TupleMapReduce.pdf 
 The good news is:
 1) That no architectural changes on Hadoop are needed to embrace Tuple 
 MapReduce.
 2) Indeed, we have proven that it is possible to implement it on top of 
 Hadoop. See the Pangool Open Source project ( http://pangool.net/ ). 
 3) It performs very efficiently ( http://pangool.net/benchmark.html )
 4) It is compatible with the whole Hadoop stack: Writables, Serializers, 
 Input/OutputFormats, etc. 
 We believe the Hadoop community could benefit from it in different ways:
 1) By getting ideas for a future API redesign.
 2) By adopting Pangool inside Hadoop. Of course, we would help and 
 contribute with anything needed, doing any adaptation changes if needed (not 
 many, because as I said, everything is compatible with existing MapReduce).
 Obviously, we prefer the second. But at least, we believe some good ideas can 
 be obtained by looking at Tuple MapReduce and Pangool.  
 There are also other improvements in Pangool that would improve the Hadoop API:
 1) Configuration by instance: passing parameters by constructor. For example, 
 Pangool Input/OutputFormats can be configured by providing values to the 
 constructor.
 2) Stateful serialization. What is requested in  
 https://issues.apache.org/jira/browse/MAPREDUCE-1462 is already supported by 
 Pangool.
 3) First-class multipleinput/multipleoutput.
 Well, we are open to discussion and happy to contribute.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4876) Adopt a Tuple MapReduce API instead of classic MapReduce one

2013-01-21 Thread Iván de Prado (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13558673#comment-13558673
 ] 

Iván de Prado commented on MAPREDUCE-4876:
--

[~posa88] Yes, Pangool offers multiple inputs. Example (from 
[http://pangool.net/userguide/TupleMrBuilder.html]):

{code:java}
...
mr.addInput(new Path(input1), new HadoopInputFormat(TextInputFormat.class),
    new UrlMapProcessor());
mr.addInput(new Path(input2), new HadoopInputFormat(TextInputFormat.class),
    new UrlProcessor());
...
{code}

Pangool also supports multiple outputs, with some advantages over Hadoop 
multiple outputs because of the use of configuration by instance. See 
[http://pangool.net/userguide/named_outputs.html].

[Here|http://www.datasalt.com/2012/05/pangool-solr/] you can find an example of 
using Pangool's named outputs for generating different Solr indexes in one Job. 

 Adopt a Tuple MapReduce API instead of classic MapReduce one
 

 Key: MAPREDUCE-4876
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4876
 Project: Hadoop Map/Reduce
  Issue Type: Wish
Reporter: Iván de Prado
Priority: Minor

 After using MapReduce for many years, we have noticed that it lacks some 
 important features: compound records, easy intra-reduce sorting and join 
 capabilities. We have elaborated a slightly modified MapReduce foundation to 
 overcome these problems: Tuple MapReduce. You can see a full paper published 
 at the ICDM 2012 that describes it at http://pangool.net/TupleMapReduce.pdf 
 The good news is:
 1) That no architectural changes on Hadoop are needed to embrace Tuple 
 MapReduce.
 2) Indeed, we have proven that it is possible to implement it on top of 
 Hadoop. See the Pangool Open Source project ( http://pangool.net/ ). 
 3) It performs very efficiently ( http://pangool.net/benchmark.html )
 4) It is compatible with the whole Hadoop stack: Writables, Serializers, 
 Input/OutputFormats, etc. 
 We believe the Hadoop community could benefit from it in different ways:
 1) By getting ideas for a future API redesign.
 2) By adopting Pangool inside Hadoop. Of course, we would help and 
 contribute with anything needed, doing any adaptation changes if needed (not 
 many, because as I said, everything is compatible with existing MapReduce).
 Obviously, we prefer the second. But at least, we believe some good ideas can 
 be obtained by looking at Tuple MapReduce and Pangool.  
 There are also other improvements in Pangool that would improve the Hadoop API:
 1) Configuration by instance: passing parameters by constructor. For example, 
 Pangool Input/OutputFormats can be configured by providing values to the 
 constructor.
 2) Stateful serialization. What is requested in  
 https://issues.apache.org/jira/browse/MAPREDUCE-1462 is already supported by 
 Pangool.
 3) First-class multipleinput/multipleoutput.
 Well, we are open to discussion and happy to contribute.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service

2013-01-21 Thread Avner BenHanoch (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13558770#comment-13558770
 ] 

Avner BenHanoch commented on MAPREDUCE-4049:


Hi Alejandro,

Thanks for your comprehensive answer and code.   It is important for me to 
agree with you on all the details before I go to implementation.

Kindly, please let me know if you agree on all the bullets below:
 # I’ll keep consumer & producer in the same JIRA and submit one patch when all 
is ready (since this JIRA issue deals with a generic shuffle service as a set of 
two plugins).  I’ll do my best to do it as soon as possible, within a few days.
 # The consumer will be according to all our agreements so far.
 # The producer will be according to your code and remarks.
 # Also, I understand that in _... remove that line and discover, instantiate 
and initialize the provider plugin_ you simply mean _... remove that line and 
call multiShuffleProviderPlugin.initialize()_.
 # In addition, I’ll move the current call _shuffleConsumerPlugin.destroy()_ 
from the _TaskTracker.close()_ method into the _TaskTracker.shutdown()_ method – 
this will match the move of _shuffleConsumerPlugin.initialize()_ into the TT CTOR.

Thanks,
  Avner
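
For the last bullet, the lifecycle pairing being proposed would look roughly like this (a sketch with abbreviated, partly assumed names; not the actual TaskTracker code):

{code:java}
// Sketch only -- abbreviated; helper and field names are assumptions.
class TaskTracker {
  private ShuffleConsumerPlugin shuffleConsumerPlugin;

  TaskTracker(JobConf conf) {
    shuffleConsumerPlugin = loadShuffleConsumerPlugin(conf); // hypothetical helper
    // initialize() moves into the constructor ...
    shuffleConsumerPlugin.initialize(this);
  }

  public synchronized void shutdown() {
    // ... so destroy() moves from close() into shutdown() to match it.
    shuffleConsumerPlugin.destroy();
  }
}
{code}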


 plugin for generic shuffle service
 --

 Key: MAPREDUCE-4049
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: performance, task, tasktracker
Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0
Reporter: Avner BenHanoch
Assignee: Avner BenHanoch
  Labels: merge, plugin, rdma, shuffle
 Fix For: 3.0.0

 Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, 
 MAPREDUCE-4049--branch-1.patch, mapreduce-4049.patch


 Support a generic shuffle service as a set of two plugins: ShuffleProvider & 
 ShuffleConsumer.
 This will satisfy the following needs:
 # Better shuffle and merge performance. For example: we are working on a 
 shuffle plugin that performs shuffle over RDMA in fast networks (10GbE, 40GbE, 
 or InfiniBand) instead of using the current HTTP shuffle. Based on the fast 
 RDMA shuffle, the plugin can also utilize a suitable merge approach during 
 the intermediate merges, hence getting much better performance.
 # Satisfy MAPREDUCE-3060 - a generic shuffle service for avoiding a hidden 
 dependency of the NodeManager on a specific version of the mapreduce shuffle 
 (currently targeted to 0.24.0).
 References:
 # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu 
 from Auburn University with others, 
 [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
 # I am attaching 2 documents with suggested Top Level Design for both plugins 
 (currently, based on 1.0 branch)
 # I am providing a link for downloading UDA - Mellanox's open source plugin 
 that implements a generic shuffle service using RDMA and levitated merge.  
 Note: At this phase, the code is in C++ through JNI and you should consider 
 it as beta only.  Still, it can serve anyone that wants to implement or 
 contribute to levitated merge. (Please be advised that levitated merge is 
 mostly suited to very fast networks) - 
 [http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=144&menu_section=69]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4952) FSSchedulerNode is always instantiated with a 0 virtual core capacity

2013-01-21 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13558936#comment-13558936
 ] 

Sandy Ryza commented on MAPREDUCE-4952:
---

Thanks Harsh, good to know.

 FSSchedulerNode is always instantiated with a 0 virtual core capacity
 -

 Key: MAPREDUCE-4952
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4952
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza

 After YARN-2, FSSchedulerNode was not updated to initialize with the 
 underlying RMNode's CPU capacity, and thus always has 0 virtual cores.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4271) Make TestCapacityScheduler more robust with non-Sun JDK

2013-01-21 Thread Amir Sanjar (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13558967#comment-13558967
 ] 

Amir Sanjar commented on MAPREDUCE-4271:


This test case also fails with a non-IBM JDK; verified to fail with Oracle Java 7.

 Make TestCapacityScheduler more robust with non-Sun JDK
 ---

 Key: MAPREDUCE-4271
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4271
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: capacity-sched
Affects Versions: 1.0.3
Reporter: Luke Lu
Assignee: Yu Gao
  Labels: alt-jdk, capacity
 Attachments: mapreduce-4271-branch-1.patch, test-afterepatch.result, 
 test-beforepatch.result, test-patch.result


 The capacity scheduler queue is initialized with a HashMap, the values of 
 which are later added to a list (a queue for assigning tasks). 
 TestCapacityScheduler depends on the order of that list and is hence not 
 portable across JDKs.
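
The root cause is the usual unspecified iteration order of HashMap; a small, self-contained illustration (deliberately unrelated to the scheduler code itself):

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class OrderDemo {
  public static void main(String[] args) {
    Map<String, Integer> queues = new HashMap<String, Integer>();
    queues.put("default", 1);
    queues.put("research", 2);
    queues.put("marketing", 3);

    // HashMap iteration order is unspecified and differs between JDK
    // implementations (Sun/Oracle vs. IBM), so a test asserting on it is fragile.
    System.out.println(queues.keySet());

    // Imposing a deterministic order (sorting here, or preserving insertion
    // order with a LinkedHashMap) makes such assertions portable across JDKs.
    System.out.println(new TreeMap<String, Integer>(queues).keySet());
  }
}
{code}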

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4951) Container preemption interpreted as task failure

2013-01-21 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13558987#comment-13558987
 ] 

Bikas Saha commented on MAPREDUCE-4951:
---

Will that differentiate between preemption killing and resource (e.g. out of 
memory) killing?

 Container preemption interpreted as task failure
 

 Key: MAPREDUCE-4951
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4951
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mr-am, mrv2
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-4951.patch


 When YARN reports a completed container to the MR AM, it always interprets it 
 as a failure.  This can lead to a job failing because too many of its tasks 
 failed, when in fact they only failed because the scheduler preempted them.
 MR needs to recognize the special exit code value of -100 and interpret it as 
 a container being killed instead of a container failure.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4951) Container preemption interpreted as task failure

2013-01-21 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559023#comment-13559023
 ] 

Sandy Ryza commented on MAPREDUCE-4951:
---

I believe in that case the exit code will be FORCE_KILLED(137) or 
TERMINATED(143) (from ContainerExecutor.java).

 Container preemption interpreted as task failure
 

 Key: MAPREDUCE-4951
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4951
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mr-am, mrv2
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-4951.patch


 When YARN reports a completed container to the MR AM, it always interprets it 
 as a failure.  This can lead to a job failing because too many of its tasks 
 failed, when in fact they only failed because the scheduler preempted them.
 MR needs to recognize the special exit code value of -100 and interpret it as 
 a container being killed instead of a container failure.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4946) Type conversion of map completion events leads to performance problems with large jobs

2013-01-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559079#comment-13559079
 ] 

Hadoop QA commented on MAPREDUCE-4946:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12565590/MAPREDUCE-4946.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3258//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3258//console

This message is automatically generated.

 Type conversion of map completion events leads to performance problems with 
 large jobs
 --

 Key: MAPREDUCE-4946
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4946
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Affects Versions: 2.0.2-alpha, 0.23.5
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Critical
 Attachments: MAPREDUCE-4946.patch


 We've seen issues with large jobs (e.g.: 13,000 maps and 3,500 reduces) where 
 reducers fail to connect back to the AM after being launched due to 
 connection timeout.  Looking at stack traces of the AM during this time we 
 see a lot of IPC servers stuck waiting for a lock to get the application ID 
 while type converting the map completion events.  What's odd is that normally 
 getting the application ID should be very cheap, but in this case we're 
 type-converting thousands of map completion events for *each* reducer 
 connecting.  That means we end up type-converting the map completion events 
 over 45 million times during the lifetime of the example job (13,000 * 3,500).
 We either need to make the type conversion much cheaper (i.e.: lockless or at 
 least read-write locked) or, even better, store the completion events in a 
 form that does not require type conversion when serving them up to reducers.
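
One reading of the second suggestion is to convert each event exactly once, when it arrives, and serve the cached mrv1 form to every reducer; a hedged sketch of that idea with made-up class names (the conversion call mirrors the existing TypeConverter, but the surrounding structure is illustrative only):

{code:java}
// Hedged sketch -- class and method names are illustrative, not the AM's code.
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

import org.apache.hadoop.mapred.TaskCompletionEvent;
import org.apache.hadoop.mapreduce.TypeConverter;
import org.apache.hadoop.mapreduce.v2.api.records.TaskAttemptCompletionEvent;

class MapCompletionEventCache {
  // Stored once in the form handed to reducers, so serving them needs no
  // per-request type conversion and no lock on the conversion path.
  private final List<TaskCompletionEvent> converted =
      new CopyOnWriteArrayList<TaskCompletionEvent>();

  void add(TaskAttemptCompletionEvent event) {
    converted.add(TypeConverter.fromYarn(event));  // convert exactly once
  }

  TaskCompletionEvent[] getEvents(int fromIndex, int max) {
    int to = Math.min(converted.size(), fromIndex + max);
    return converted.subList(fromIndex, to)
        .toArray(new TaskCompletionEvent[to - fromIndex]);
  }
}
{code}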

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4946) Type conversion of map completion events leads to performance problems with large jobs

2013-01-21 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559107#comment-13559107
 ] 

Siddharth Seth commented on MAPREDUCE-4946:
---

The change looks good to me. Jason, could you please post a patch for 
branch-0.23 as well?

Agreed. TaskUmbilical using TaskAttemptCompletionEvents seems like a longer-term 
change - the conversion ends up getting pushed to individual tasks, 
unless Task itself is changed to work with mrv2 constructs.

 Type conversion of map completion events leads to performance problems with 
 large jobs
 --

 Key: MAPREDUCE-4946
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4946
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Affects Versions: 2.0.2-alpha, 0.23.5
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Critical
 Attachments: MAPREDUCE-4946.patch


 We've seen issues with large jobs (e.g.: 13,000 maps and 3,500 reduces) where 
 reducers fail to connect back to the AM after being launched due to 
 connection timeout.  Looking at stack traces of the AM during this time we 
 see a lot of IPC servers stuck waiting for a lock to get the application ID 
 while type converting the map completion events.  What's odd is that normally 
 getting the application ID should be very cheap, but in this case we're 
 type-converting thousands of map completion events for *each* reducer 
 connecting.  That means we end up type-converting the map completion events 
 over 45 million times during the lifetime of the example job (13,000 * 3,500).
 We either need to make the type conversion much cheaper (i.e.: lockless or at 
 least read-write locked) or, even better, store the completion events in a 
 form that does not require type conversion when serving them up to reducers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4838) Add extra info to JH files

2013-01-21 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated MAPREDUCE-4838:
---

Attachment: MAPREDUCE-4838_1.patch

I've migrated the code that adds more task-info to JH from Hadoop-1 to 
Hadoop-2. There are the following major differences:

1) There's no JobInProgress (actually nearly empty) and TaskInProgress, where 
the locality and the avataar attributes are set and logged. Instead, avataar 
is now set in TaskImpl#addAndScheduleAttempt by judging whether there are other 
active task attempts, while locality is set in 
TaskAttemptImpl#ContainerAssignedTransition#transition by judging whether the 
assigned container's host is within the local host/rack list of the task 
attempt.

2) Workflow-related info is added in JobImpl. The function 
getWorkflowAdjacencies and its dependent functions are also imported into this 
class.

3) Locality has the same enum values as YARN's NodeType, but I still 
created Locality because it represents one attribute of a task attempt.

The current trunk can be built correctly with the patch applied. However, I 
still need to do some more work on the test cases.

Arun and Sid, please have a look at the patch, and give some comments. Thank 
you!

Zhijie
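
For readers without the branch-1 patch at hand, the two attributes discussed above are small enums along these lines (indicative only; the exact values come from the branch-1 work being ported):

{code:java}
// Indicative sketch of the two enums being ported; names/values follow the
// branch-1 patch discussion above.
enum Locality {
  NODE_LOCAL, RACK_LOCAL, OFF_SWITCH   // mirrors YARN's NodeType values
}

enum Avataar {
  VIRGIN,      // the original attempt of a task
  SPECULATIVE  // a redundantly scheduled (speculative) attempt
}
{code}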

 Add extra info to JH files
 --

 Key: MAPREDUCE-4838
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4838
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Arun C Murthy
Assignee: Zhijie Shen
 Attachments: MAPREDUCE-4838_1.patch, MAPREDUCE-4838.patch


 It will be useful to add more task-info to JH for analytics.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (MAPREDUCE-4953) HadoopPipes misuses fprintf

2013-01-21 Thread Andy Isaacson (JIRA)
Andy Isaacson created MAPREDUCE-4953:


 Summary: HadoopPipes misuses fprintf
 Key: MAPREDUCE-4953
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4953
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: pipes
Affects Versions: 3.0.0, 2.0.3-alpha
Reporter: Andy Isaacson
Assignee: Andy Isaacson


{code}
 [exec] 
/mnt/trunk/hadoop-tools/hadoop-pipes/src/main/native/pipes/impl/HadoopPipes.cc:130:58:
 warning: format not a string literal and no format arguments 
[-Wformat-security]
{code}


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4953) HadoopPipes misuses fprintf

2013-01-21 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated MAPREDUCE-4953:
-

Status: Patch Available  (was: Open)

 HadoopPipes misuses fprintf
 ---

 Key: MAPREDUCE-4953
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4953
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: pipes
Affects Versions: 3.0.0, 2.0.3-alpha
Reporter: Andy Isaacson
Assignee: Andy Isaacson
 Attachments: mapreduce-4953.txt


 {code}
  [exec] 
 /mnt/trunk/hadoop-tools/hadoop-pipes/src/main/native/pipes/impl/HadoopPipes.cc:130:58:
  warning: format not a string literal and no format arguments 
 [-Wformat-security]
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4953) HadoopPipes misuses fprintf

2013-01-21 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated MAPREDUCE-4953:
-

Attachment: mapreduce-4953.txt

Fix the warning by switching to fputs.

 HadoopPipes misuses fprintf
 ---

 Key: MAPREDUCE-4953
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4953
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: pipes
Affects Versions: 3.0.0, 2.0.3-alpha
Reporter: Andy Isaacson
Assignee: Andy Isaacson
 Attachments: mapreduce-4953.txt


 {code}
  [exec] 
 /mnt/trunk/hadoop-tools/hadoop-pipes/src/main/native/pipes/impl/HadoopPipes.cc:130:58:
  warning: format not a string literal and no format arguments 
 [-Wformat-security]
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4951) Container preemption interpreted as task failure

2013-01-21 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated MAPREDUCE-4951:
--

Attachment: MAPREDUCE-4951-1.patch

 Container preemption interpreted as task failure
 

 Key: MAPREDUCE-4951
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4951
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mr-am, mrv2
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-4951-1.patch, MAPREDUCE-4951.patch


 When YARN reports a completed container to the MR AM, it always interprets it 
 as a failure.  This can lead to a job failing because too many of its tasks 
 failed, when in fact they only failed because the scheduler preempted them.
 MR needs to recognize the special exit code value of -100 and interpret it as 
 a container being killed instead of a container failure.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4951) Container preemption interpreted as task failure

2013-01-21 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559160#comment-13559160
 ] 

Sandy Ryza commented on MAPREDUCE-4951:
---

New patch includes a test and uses the constant from YarnConfiguration instead 
of a hardcoded -100.
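
The shape of the check in the AM's completed-container handling is presumably along these lines (a hedged sketch; the constant name is inferred from the comment above, and the surrounding event names are abbreviated):

{code:java}
// Hedged sketch -- not the patch itself; event types and constant are assumptions.
int exitStatus = containerStatus.getExitStatus();
if (exitStatus == YarnConfiguration.ABORTED_CONTAINER_EXIT_STATUS) {  // -100
  // Killed/preempted by the framework: report the attempt as KILLED, not FAILED,
  // so it does not count against the job's failure limits.
  eventHandler.handle(
      new TaskAttemptEvent(attemptId, TaskAttemptEventType.TA_KILL));
} else {
  eventHandler.handle(
      new TaskAttemptEvent(attemptId, TaskAttemptEventType.TA_FAILMSG));
}
{code}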

 Container preemption interpreted as task failure
 

 Key: MAPREDUCE-4951
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4951
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mr-am, mrv2
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-4951-1.patch, MAPREDUCE-4951.patch


 When YARN reports a completed container to the MR AM, it always interprets it 
 as a failure.  This can lead to a job failing because too many of its tasks 
 failed, when in fact they only failed because the scheduler preempted them.
 MR needs to recognize the special exit code value of -100 and interpret it as 
 a container being killed instead of a container failure.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4953) HadoopPipes misuses fprintf

2013-01-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559162#comment-13559162
 ] 

Hadoop QA commented on MAPREDUCE-4953:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12565858/mapreduce-4953.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3259//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3259//console

This message is automatically generated.

 HadoopPipes misuses fprintf
 ---

 Key: MAPREDUCE-4953
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4953
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: pipes
Affects Versions: 3.0.0, 2.0.3-alpha
Reporter: Andy Isaacson
Assignee: Andy Isaacson
 Attachments: mapreduce-4953.txt


 {code}
  [exec] 
 /mnt/trunk/hadoop-tools/hadoop-pipes/src/main/native/pipes/impl/HadoopPipes.cc:130:58:
  warning: format not a string literal and no format arguments 
 [-Wformat-security]
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4951) Container preemption interpreted as task failure

2013-01-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559170#comment-13559170
 ] 

Hadoop QA commented on MAPREDUCE-4951:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12565866/MAPREDUCE-4951-1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3260//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3260//console

This message is automatically generated.

 Container preemption interpreted as task failure
 

 Key: MAPREDUCE-4951
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4951
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mr-am, mrv2
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-4951-1.patch, MAPREDUCE-4951.patch


 When YARN reports a completed container to the MR AM, it always interprets it 
 as a failure.  This can lead to a job failing because too many of its tasks 
 failed, when in fact they only failed because the scheduler preempted them.
 MR needs to recognize the special exit code value of -100 and interpret it as 
 a container being killed instead of a container failure.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4953) HadoopPipes misuses fprintf

2013-01-21 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559201#comment-13559201
 ] 

Andy Isaacson commented on MAPREDUCE-4953:
--

Since hitting the bug would probably segfault the Pipes app, and the fix is 
trivially correct, no tests are included.

 HadoopPipes misuses fprintf
 ---

 Key: MAPREDUCE-4953
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4953
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: pipes
Affects Versions: 3.0.0, 2.0.3-alpha
Reporter: Andy Isaacson
Assignee: Andy Isaacson
 Attachments: mapreduce-4953.txt


 {code}
  [exec] 
 /mnt/trunk/hadoop-tools/hadoop-pipes/src/main/native/pipes/impl/HadoopPipes.cc:130:58:
  warning: format not a string literal and no format arguments 
 [-Wformat-security]
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4838) Add extra info to JH files

2013-01-21 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559422#comment-13559422
 ] 

Siddharth Seth commented on MAPREDUCE-4838:
---

Thanks for working on this Zhijie.
bq. There's no JobInProgress (actually nearly empty) and TaskInProgress, where 
the locality and the avataar attributes are set and logged. Instead, avataar 
is now set in TaskImpl#addAndScheduleAttempt by judging whether there are other 
active task attempts, while locality is set in 
TaskAttemptImpl#ContainerAssignedTransition#transition by judging whether the 
assigned container's host is within the local host/rack list of the task 
attempt.
Right. I think you've got these, as well as the other changes mapped to the 
correct places for trunk.

Comments on the patch
- Some lines exceed the 80 column limit. (Coding style guidelines at 
http://wiki.apache.org/hadoop/CodeReviewChecklist)
- Job specific configuration strings like mapreduce.workflow.id etc should be 
in MRJobConfig
- TaskAttemptImpl - Default locality set to NODE_LOCAL. Should be OFF_SWITCH to 
match branch-1.
- In TaskAttemptImpl, avoid resolving the host names multiple times.
- In TaskImpl, SPECULATIVE should be set when the RedundantScheduleTransition 
is taken, not for retries caused by FAILED / KILLED attempts. The same likely 
applies to the branch-1 patch.
- In the history events (JobSubmittedEvent etc) - null check on the new 
strings. (the Utf8 constructor does not work with nulls)
- The toString implementations in the history events, as well as in the Rumen 
events, are not really needed. If implemented, they should include the additional 
fields.

The additional information may be useful to expose - via the UI at least - in 
which case it'll need to be exposed via TaskAttemptReport. This can be done in 
a follow-up JIRA. For now, the Locality and Avataar enums could be moved to mrv2 - 
the hadoop-mapreduce-client-common module (ref TaskType).

 Add extra info to JH files
 --

 Key: MAPREDUCE-4838
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4838
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Arun C Murthy
Assignee: Zhijie Shen
 Attachments: MAPREDUCE-4838_1.patch, MAPREDUCE-4838.patch


 It will be useful to add more task-info to JH for analytics.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4469) Resource calculation in child tasks is CPU-heavy

2013-01-21 Thread Ahmed Radwan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan updated MAPREDUCE-4469:


Attachment: MAPREDUCE-4469_rev5.patch

Here is the updated patch. Thanks Todd and Phil for your comments! I have 
updated the patch to get rid of the excludedPids caching which may result in 
miscalculations due to pid recycling as Todd highlighted. The patch also uses 
StringUtils.split(). Since getrusage only accounts for terminated children, the 
updates will be missing important resource usage info for any currently running 
children that haven't terminated yet, so in my opinion we shouldn't use it. I 
have also added a time-frequency property (in milliseconds instead of a skip 
count) determining when the resource usage counters are updated. Please take a look 
and let me know if you have any comments.
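
The time-based throttle described above might look roughly like this (property key, default, and method names are placeholders rather than the ones in the patch):

{code:java}
// Sketch only -- field, method, and configuration names below are placeholders.
private long updateIntervalMs;   // read from the JobConf when the task starts
private long lastUpdateMs = 0;

void maybeUpdateResourceCounters() {
  long now = System.currentTimeMillis();
  if (now - lastUpdateMs < updateIntervalMs) {
    return;                      // skip the expensive walk over /proc this time
  }
  lastUpdateMs = now;
  updateResourceCounters();      // the existing (CPU-heavy) calculation
}
{code}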

 Resource calculation in child tasks is CPU-heavy
 

 Key: MAPREDUCE-4469
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4469
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: performance, task
Affects Versions: 1.0.3
Reporter: Todd Lipcon
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4469.patch, MAPREDUCE-4469_rev2.patch, 
 MAPREDUCE-4469_rev3.patch, MAPREDUCE-4469_rev4.patch, 
 MAPREDUCE-4469_rev5.patch


 In doing some benchmarking on a hadoop-1 derived codebase, I noticed that 
 each of the child tasks was doing a ton of syscalls. Upon stracing, I noticed 
 that it's spending a lot of time looping through all the files in /proc to 
 calculate resource usage.
 As a test, I added a flag to disable use of the ResourceCalculatorPlugin 
 within the tasks. On a CPU-bound 500G-sort workload, this improved total job 
 runtime by about 10% (map slot-seconds by 14%, reduce slot-seconds by 8%).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4469) Resource calculation in child tasks is CPU-heavy

2013-01-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559433#comment-13559433
 ] 

Hadoop QA commented on MAPREDUCE-4469:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12565910/MAPREDUCE-4469_rev5.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3261//console

This message is automatically generated.

 Resource calculation in child tasks is CPU-heavy
 

 Key: MAPREDUCE-4469
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4469
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: performance, task
Affects Versions: 1.0.3
Reporter: Todd Lipcon
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4469.patch, MAPREDUCE-4469_rev2.patch, 
 MAPREDUCE-4469_rev3.patch, MAPREDUCE-4469_rev4.patch, 
 MAPREDUCE-4469_rev5.patch


 In doing some benchmarking on a hadoop-1 derived codebase, I noticed that 
 each of the child tasks was doing a ton of syscalls. Upon stracing, I noticed 
 that it's spending a lot of time looping through all the files in /proc to 
 calculate resource usage.
 As a test, I added a flag to disable use of the ResourceCalculatorPlugin 
 within the tasks. On a CPU-bound 500G-sort workload, this improved total job 
 runtime by about 10% (map slot-seconds by 14%, reduce slot-seconds by 8%).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira