[jira] [Commented] (MAPREDUCE-4884) streaming tests fail to start MiniMRCluster due to Queue configuration missing child queue names for root

2013-02-04 Thread Ivan A. Veselovsky (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570110#comment-13570110
 ] 

Ivan A. Veselovsky commented on MAPREDUCE-4884:
---

Can this change please be back-ported to branch-2?

 streaming tests fail to start MiniMRCluster due to Queue configuration 
 missing child queue names for root
 ---

 Key: MAPREDUCE-4884
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4884
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/streaming, test
Affects Versions: 3.0.0, trunk-win
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Fix For: 3.0.0

 Attachments: MAPREDUCE-4884.1.patch


 Multiple tests in hadoop-streaming, such as {{TestFileArgs}}, fail to 
 initialize {{MiniMRCluster}} due to a {{YarnException}} with reason Queue 
 configuration missing child queue names for root.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (MAPREDUCE-4974) optimising the LineRecordReader initialize method

2013-02-04 Thread Arun A K (JIRA)
Arun A K created MAPREDUCE-4974:
---

 Summary: optimising the LineRecordReader initialize method
 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 0.23.5, 2.0.2-alpha
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Arun A K
 Fix For: 0.20.204.0, 0.24.0


I found there is scope for optimizing the code in initialize(): 
compressionCodecs and codec need to be instantiated only if the input is 
compressed.
Meanwhile, Gelesh George Omathil suggested that we could also avoid the null 
check of key and value. This would save time, since the null check is done on 
every next key/value generation. The intention is to instantiate them only 
once and avoid an NPE as well. Both goals could be met if key and value are 
initialized in the initialize() method. We both have worked on it.



[jira] [Updated] (MAPREDUCE-4974) optimising the LineRecordReader initialize method

2013-02-04 Thread Gelesh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gelesh updated MAPREDUCE-4974:
--

Assignee: Gelesh  (was: Arun A K)
Target Version/s: 0.23.5, 0.23.4, 2.0.1-alpha, 2.0.0-alpha, 1.1.1, 1.0.4, 
1.0.0  (was: 1.0.0, 1.0.4, 1.1.1, 2.0.0-alpha, 2.0.1-alpha, 0.23.4, 0.23.5)
  Status: Patch Available  (was: Open)

 optimising the LineRecordReader initialize method
 -

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 0.23.5, 2.0.2-alpha
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.20.204.0, 0.24.0

   Original Estimate: 1h
  Remaining Estimate: 1h



[jira] [Updated] (MAPREDUCE-4974) optimising the LineRecordReader initialize method

2013-02-04 Thread Gelesh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gelesh updated MAPREDUCE-4974:
--

Attachment: MAPREDUCE-4974.1.patch

Combined thoughts of mine and Arun A K's.

 optimising the LineRecordReader initialize method
 -

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.20.204.0, 0.24.0

 Attachments: MAPREDUCE-4974.1.patch

   Original Estimate: 1h
  Remaining Estimate: 1h



[jira] [Commented] (MAPREDUCE-4974) optimising the LineRecordReader initialize method

2013-02-04 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570140#comment-13570140
 ] 

Gelesh commented on MAPREDUCE-4974:
---

Somebody please review the patch.
I couldn't even see Hadoop QA running on this.
Kindly advise.

 optimising the LineRecordReader initialize method
 -

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.20.204.0, 0.24.0

 Attachments: MAPREDUCE-4974.1.patch

   Original Estimate: 1h
  Remaining Estimate: 1h



[jira] [Commented] (MAPREDUCE-4974) optimising the LineRecordReader initialize method

2013-02-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570145#comment-13570145
 ] 

Hadoop QA commented on MAPREDUCE-4974:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12567831/MAPREDUCE-4974.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3297//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3297//console

This message is automatically generated.

 optimising the LineRecordReader initialize method
 -

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.20.204.0, 0.24.0

 Attachments: MAPREDUCE-4974.1.patch

   Original Estimate: 1h
  Remaining Estimate: 1h



[jira] [Commented] (MAPREDUCE-4974) optimising the LineRecordReader initialize method

2013-02-04 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570154#comment-13570154
 ] 

Gelesh commented on MAPREDUCE-4974:
---

It's an improvement to the existing code; no features were added or removed,
and hence the existing test cases should suffice.

 optimising the LineRecordReader initialize method
 -

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.20.204.0, 0.24.0

 Attachments: MAPREDUCE-4974.1.patch

   Original Estimate: 1h
  Remaining Estimate: 1h



[jira] [Updated] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-04 Thread Arun A K (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun A K updated MAPREDUCE-4974:


Summary: Optimising the LineRecordReader initialize() method  (was: 
optimising the LineRecordReader initialize method)

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.20.204.0, 0.24.0

 Attachments: MAPREDUCE-4974.1.patch

   Original Estimate: 1h
  Remaining Estimate: 1h



[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-04 Thread Arun A K (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570161#comment-13570161
 ] 

Arun A K commented on MAPREDUCE-4974:
-

The review request URL for this issue is 
https://reviews.apache.org/r/9287/

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.20.204.0, 0.24.0

 Attachments: MAPREDUCE-4974.1.patch

   Original Estimate: 1h
  Remaining Estimate: 1h



[jira] [Created] (MAPREDUCE-4975) streaming/gridmix docs missing

2013-02-04 Thread Thomas Graves (JIRA)
Thomas Graves created MAPREDUCE-4975:


 Summary: streaming/gridmix docs missing
 Key: MAPREDUCE-4975
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4975
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.6
Reporter: Thomas Graves


The docs for hadoop streaming and gridmix weren't moved out of the mrv1 code, so 
they don't exist in the 0.23 or 2.x lines. 

For the 1.x line they are at http://hadoop.apache.org/docs/r1.1.0/streaming.html and 
http://hadoop.apache.org/docs/r1.1.0/gridmix.html

We should also check for others that are missing.



[jira] [Updated] (MAPREDUCE-4884) streaming tests fail to start MiniMRCluster due to Queue configuration missing child queue names for root

2013-02-04 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-4884:
---

Fix Version/s: 2.0.3-alpha

Thanks for the fix.  I just merged this into branch-2, because the tests were 
failing there still.

 streaming tests fail to start MiniMRCluster due to Queue configuration 
 missing child queue names for root
 ---

 Key: MAPREDUCE-4884
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4884
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/streaming, test
Affects Versions: 3.0.0, trunk-win
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Fix For: 3.0.0, 2.0.3-alpha

 Attachments: MAPREDUCE-4884.1.patch




[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-04 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570547#comment-13570547
 ] 

Todd Lipcon commented on MAPREDUCE-4974:


Do you have any benchmark that shows this helps? Null checks can often be 
completely optimized out by the JIT, or at least hoisted out of the tight loop.
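
A rough way to start answering that question is a timing loop like the one 
below. It is only a sketch: the class and field names are hypothetical and it 
does not use any Hadoop classes. A naive System.nanoTime() loop is easily 
distorted by JIT warm-up and dead-code elimination, so a proper benchmark 
harness such as JMH would be needed before drawing conclusions.

```java
// Hypothetical micro-timing sketch; not part of Hadoop or of the patch.
class NullCheckTiming {
    // Stand-in for a reusable Writable-like holder.
    static final class Holder { long v; }

    private Holder lazy;                        // created on first use
    private final Holder eager = new Holder();  // created up front

    // Variant with a null check on every iteration.
    long runLazy(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            if (lazy == null) {
                lazy = new Holder();
            }
            lazy.v = i;
            sum += lazy.v;
        }
        return sum;
    }

    // Variant without the null check; the holder already exists.
    long runEager(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            eager.v = i;
            sum += eager.v;
        }
        return sum;
    }

    public static void main(String[] args) {
        NullCheckTiming t = new NullCheckTiming();
        int n = 50_000_000;
        // Repeat a few rounds so that later rounds run JIT-compiled code.
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            long a = t.runLazy(n);
            long t1 = System.nanoTime();
            long b = t.runEager(n);
            long t2 = System.nanoTime();
            System.out.printf("round %d: lazy=%d ms, eager=%d ms (sums %d, %d)%n",
                    round, (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, a, b);
        }
    }
}
```

Both variants compute the same sum, so any difference that shows up is down 
to the null check and the way the JIT treats it.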

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.20.204.0, 0.24.0

 Attachments: MAPREDUCE-4974.1.patch

   Original Estimate: 1h
  Remaining Estimate: 1h



[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-04 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570567#comment-13570567
 ] 

Gelesh commented on MAPREDUCE-4974:
---

[~tlipcon]
nextKeyValue() is called once for every delimiter or newline that occurs 
within a given split.
Each time, it executes the code below:

-if (key == null) {
-  key = new LongWritable();
-}
-key.set(pos);
-if (value == null) {
-  value = new Text();
-}

The conditions hold true only on the first iteration, when the key and value 
objects are created.
The same result can be achieved by creating the key and value objects in the 
initialize phase, which lets us skip these null checks.

Also,
-compressionCodecs = new CompressionCodecFactory(job);
-codec = compressionCodecs.getCodec(file);
needs to be done only when the input file is compressed. This change is also 
included in the patch.
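
The idea described above can be sketched roughly as follows. This is not the 
attached patch and not the Hadoop LineRecordReader: EagerLineReader, LongBox 
and TextBox are hypothetical stand-ins for the reader, LongWritable and Text, 
and the ".gz" file-name test is only an assumed placeholder for however 
compressed input is actually detected.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a line record reader; not the Hadoop class.
class EagerLineReader {
    // Hypothetical mutable holders standing in for LongWritable and Text.
    static final class LongBox { long v; }
    static final class TextBox { String v = ""; }

    private LongBox key;
    private TextBox value;
    private Iterator<String> lines;
    private long pos;
    private boolean codecNeeded;

    // Create key and value once, up front, and decide here whether any
    // codec set-up is needed at all (assumption: ".gz" means compressed).
    void initialize(String fileName, List<String> input) {
        key = new LongBox();
        value = new TextBox();
        lines = input.iterator();
        pos = 0;
        codecNeeded = fileName.endsWith(".gz");
    }

    // No null checks here: initialize() has already created both objects.
    boolean nextKeyValue() {
        if (!lines.hasNext()) {
            return false;
        }
        String line = lines.next();
        key.v = pos;               // key is the offset of the line start
        value.v = line;
        pos += line.length() + 1;  // +1 for the newline delimiter
        return true;
    }

    long getCurrentKey() { return key.v; }
    String getCurrentValue() { return value.v; }
    boolean usesCodec() { return codecNeeded; }
}
```

The trade-off is that nextKeyValue() now relies on initialize() having been 
called first; calling it on an uninitialized reader would throw an NPE 
instead of lazily creating the objects.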

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.20.204.0, 0.24.0

 Attachments: MAPREDUCE-4974.1.patch

   Original Estimate: 1h
  Remaining Estimate: 1h



[jira] [Created] (MAPREDUCE-4976) Fix test failure for HADOOP-9252

2013-02-04 Thread Tsz Wo (Nicholas), SZE (JIRA)
Tsz Wo (Nicholas), SZE created MAPREDUCE-4976:
-

 Summary: Fix test failure for HADOOP-9252
 Key: MAPREDUCE-4976
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4976
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
Priority: Minor


HADOOP-9252 slightly changes the format of some StringUtils outputs.  It may 
cause test failures.

Also, some methods was deprecated by HADOOP-9252.  The use of them should be 
replaced with the new methods.



[jira] [Updated] (MAPREDUCE-4976) Fix test failure for HADOOP-9252

2013-02-04 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated MAPREDUCE-4976:
--

Description: 
HADOOP-9252 slightly changes the format of some StringUtils outputs.  It may 
cause test failures.

Also, some methods were deprecated by HADOOP-9252.  The use of them should be 
replaced with the new methods.

  was:
HADOOP-9252 slightly changes the format of some StringUtils outputs.  It may 
cause test failures.

Also, some methods was deprecated by HADOOP-9252.  The use of them should be 
replaced with the new methods.


 Fix test failure for HADOOP-9252
 

 Key: MAPREDUCE-4976
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4976
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
Priority: Minor



[jira] [Commented] (MAPREDUCE-4964) JobLocalizer#localizeJobFiles can potentially write job.xml to the wrong user's directory

2013-02-04 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570664#comment-13570664
 ] 

Arun C Murthy commented on MAPREDUCE-4964:
--

Karthik - makes sense. Please upload this patch to MAPREDUCE-4843 and we can 
commit via the same jira. Thanks.

 JobLocalizer#localizeJobFiles can potentially write job.xml to the wrong 
 user's directory
 -

 Key: MAPREDUCE-4964
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4964
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.1.1
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: MR-4964.patch, MR-4964.patch


 In the following code, if jobs corresponding to different users (X and Y) are 
 localized simultaneously, it is possible that jobconf can be written to the 
 wrong user's directory. (X's job.xml can be written to Y's directory)
 {code}
   public void localizeJobFiles(JobID jobid, JobConf jConf,
       Path localJobTokenFile, TaskUmbilicalProtocol taskTracker)
       throws IOException, InterruptedException {
     localizeJobFiles(jobid, jConf,
         lDirAlloc.getLocalPathForWrite(JOBCONF, ttConf), localJobTokenFile,
         taskTracker);
   }
 {code}



[jira] [Updated] (MAPREDUCE-4822) Unnecessary conversions in History Events

2013-02-04 Thread Chu Tong (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chu Tong updated MAPREDUCE-4822:


Attachment: MAPREDUCE-4822.patch

Removed unnecessary conversion, please review.

 Unnecessary conversions in History Events
 -

 Key: MAPREDUCE-4822
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4822
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Affects Versions: 0.23.4
Reporter: Robert Joseph Evans
Priority: Trivial
 Attachments: MAPREDUCE-4822.patch


 There are a number of conversions in the Job History Event classes that are 
 totally unnecessary.  It appears that they were originally used to convert 
 from the internal avro format, but now many of them do not pull the values 
 from the avro; they store them internally.
 For example:
 {code:title=TaskAttemptFinishedEvent.java}
   /** Get the task type */
   public TaskType getTaskType() {
     return TaskType.valueOf(taskType.toString());
   }
 {code}
 The code currently takes an enum, converts it to a string and then 
 asks the same enum to convert it back to an enum.  If Java works properly 
 this should be a no-op and a reference to the original taskType should be 
 returned.
 There are several places where toString is called on a string, and 
 since strings are immutable it returns a reference to itself.
 The various ids are not immutable and probably should not be changed at this 
 point.
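
 The enum round trip described above can be illustrated with a small 
 self-contained sketch. Kind is a hypothetical enum standing in for TaskType, 
 and the sketch assumes the enum does not override toString(), since 
 valueOf() matches on the constant's declared name.

```java
// Hypothetical illustration; Kind stands in for Hadoop's TaskType.
class EnumRoundTrip {
    enum Kind { MAP, REDUCE }

    // The pattern from the issue: enum -> String -> enum.
    static Kind viaString(Kind k) {
        return Kind.valueOf(k.toString());
    }

    // The simplification: return the reference directly.
    static Kind direct(Kind k) {
        return k;
    }
}
```

 Because each enum constant is a singleton, both methods return the very same 
 object, so the string conversion only adds work.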



[jira] [Assigned] (MAPREDUCE-4843) When using DefaultTaskController, JobLocalizer not thread safe

2013-02-04 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla reassigned MAPREDUCE-4843:
---

Assignee: Karthik Kambatla

 When using DefaultTaskController, JobLocalizer not thread safe
 --

 Key: MAPREDUCE-4843
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4843
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 1.1.1
Reporter: zhaoyunjiong
Assignee: Karthik Kambatla
Priority: Critical
 Attachments: MAPREDUCE-4843-branch-1.1.patch


 In our cluster, jobs sometimes fail with the exception below:
 2012-12-03 23:11:54,811 WARN org.apache.hadoop.mapred.TaskTracker: Error 
 initializing attempt_201212031626_1115_r_23_0:
 org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find 
 taskTracker/$username/jobcache/job_201212031626_1115/job.xml in any of the 
 configured local directories
   at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:424)
   at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:160)
   at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1175)
   at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1058)
   at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:2213)
 The root cause is that JobLocalizer is not thread safe.
 In the DefaultTaskController.initializeJob method:
  JobLocalizer localizer = new JobLocalizer((JobConf)getConf(), user, jobid);
 but JobLocalizer simply keeps the reference to the conf.
 When two TaskLauncher threads (mapLauncher and reduceLauncher) try to 
 initializeJob at the same time, there are two JobLocalizer instances but only 
 one conf instance.
 So sometimes ttConf.setStrings(JOB_LOCAL_CTXT, localDirs) resets the 
 previous job's conf.
 This then causes the previous job's job.xml to be stored in another user's dir.
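
 The interleaving described above can be replayed deterministically on a 
 single thread. The sketch below uses no Hadoop classes: Localizer is a 
 hypothetical stand-in for JobLocalizer, a plain Map stands in for the shared 
 conf, and the key string is made up. Copying the conf per localizer is shown 
 only as one possible remedy, not necessarily what the attached patches do.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of two localizers sharing one mutable conf.
class SharedConfSketch {
    // Made-up key; stands in for JOB_LOCAL_CTXT.
    static final String JOB_LOCAL_CTXT = "job.local.ctxt";

    // Hypothetical stand-in for JobLocalizer.
    static final class Localizer {
        private final Map<String, String> conf;
        private final String user;

        Localizer(Map<String, String> conf, String user, boolean copy) {
            // Either keep the caller's reference or take a private copy.
            this.conf = copy ? new HashMap<>(conf) : conf;
            this.user = user;
        }

        void setLocalDirs() {
            conf.put(JOB_LOCAL_CTXT, "taskTracker/" + user);
        }

        String jobXmlDir() {
            return conf.get(JOB_LOCAL_CTXT);
        }
    }

    // Replays the problematic ordering: X sets its dirs, Y sets its dirs,
    // then X looks up where to write job.xml.
    static String dirSeenByFirstUser(boolean copy) {
        Map<String, String> ttConf = new HashMap<>();
        Localizer x = new Localizer(ttConf, "X", copy);
        Localizer y = new Localizer(ttConf, "Y", copy);
        x.setLocalDirs();
        y.setLocalDirs(); // with a shared map this overwrites X's setting
        return x.jobXmlDir();
    }
}
```

 With the shared map, user X ends up looking at user Y's directory; with a 
 per-localizer copy, X still sees its own.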



[jira] [Updated] (MAPREDUCE-4843) When using DefaultTaskController, JobLocalizer not thread safe

2013-02-04 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4843:


Attachment: mr-4843.patch

Uploading the patch from MAPREDUCE-4964 as that solves this issue in a 
simpler/cleaner way. The discussion on that JIRA has all the details.

Applied the patch to latest branch-1 and it applies cleanly. Also, verified 
TestJobLocalizer passes.

 When using DefaultTaskController, JobLocalizer not thread safe
 --

 Key: MAPREDUCE-4843
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4843
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 1.1.1
Reporter: zhaoyunjiong
Assignee: Karthik Kambatla
Priority: Critical
 Attachments: MAPREDUCE-4843-branch-1.1.patch, mr-4843.patch




[jira] [Updated] (MAPREDUCE-4964) JobLocalizer#localizeJobFiles can potentially write job.xml to the wrong user's directory

2013-02-04 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4964:


Resolution: Duplicate
Status: Resolved  (was: Patch Available)

Thanks Arun. Closing this JIRA as a duplicate of MR-4843. I have uploaded the
latest patch from this issue to MAPREDUCE-4843.

 JobLocalizer#localizeJobFiles can potentially write job.xml to the wrong 
 user's directory
 -

 Key: MAPREDUCE-4964
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4964
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.1.1
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: MR-4964.patch, MR-4964.patch


 In the following code, if jobs corresponding to different users (X and Y) are 
 localized simultaneously, it is possible that jobconf can be written to the 
 wrong user's directory. (X's job.xml can be written to Y's directory)
 {code}
   public void localizeJobFiles(JobID jobid, JobConf jConf,
       Path localJobTokenFile, TaskUmbilicalProtocol taskTracker)
       throws IOException, InterruptedException {
     localizeJobFiles(jobid, jConf,
         lDirAlloc.getLocalPathForWrite(JOBCONF, ttConf), localJobTokenFile,
         taskTracker);
   }
 {code}



[jira] [Commented] (MAPREDUCE-4843) When using DefaultTaskController, JobLocalizer not thread safe

2013-02-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13570687#comment-13570687
 ] 

Hadoop QA commented on MAPREDUCE-4843:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12567889/mr-4843.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3298//console

This message is automatically generated.

 When using DefaultTaskController, JobLocalizer not thread safe
 --

 Key: MAPREDUCE-4843
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4843
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 1.1.1
Reporter: zhaoyunjiong
Assignee: Karthik Kambatla
Priority: Critical
 Attachments: MAPREDUCE-4843-branch-1.1.patch, mr-4843.patch




[jira] [Assigned] (MAPREDUCE-4822) Unnecessary conversions in History Events

2013-02-04 Thread Chu Tong (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chu Tong reassigned MAPREDUCE-4822:
---

Assignee: Chu Tong

 Unnecessary conversions in History Events
 -

 Key: MAPREDUCE-4822
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4822
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Affects Versions: 0.23.4
Reporter: Robert Joseph Evans
Assignee: Chu Tong
Priority: Trivial
 Attachments: MAPREDUCE-4822.patch


 There are a number of conversions in the Job History Event classes that are
 totally unnecessary.  It appears that they were originally used to convert
 from the internal avro format, but now many of them do not pull the values
 from the avro; they store them internally.
 For example:
 {code:title=TaskAttemptFinishedEvent.java}
   /** Get the task type */
   public TaskType getTaskType() {
     return TaskType.valueOf(taskType.toString());
   }
 {code}
 The code currently takes an enum, converts it to a string and then
 asks the same enum to convert it back to an enum.  If Java works properly
 this should be a no-op and a reference to the original taskType should be
 returned.
 There are several places where toString is called on a string; since
 strings are immutable, it returns a reference to itself.
 The various ids are not immutable and probably should not be changed at this
 point.
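
The round trip described above can be checked in isolation. This is a small sketch that assumes a hypothetical two-value TaskType enum in place of the real one, and an enum that does not override toString (valueOf matches on the constant name, so an overridden toString could break the round trip).

```java
class EnumRoundTripDemo {
    // Hypothetical enum standing in for the real TaskType.
    enum TaskType { MAP, REDUCE }

    // The conversion pattern from the issue: enum -> String -> enum.
    static TaskType roundTrip(TaskType t) {
        return TaskType.valueOf(t.toString());
    }

    public static void main(String[] args) {
        TaskType original = TaskType.REDUCE;
        // Enum constants are singletons, so the result is the same reference.
        System.out.println(roundTrip(original) == original); // true

        String s = "attempt";
        // String.toString() returns the receiver itself.
        System.out.println(s.toString() == s); // true
    }
}
```

Since both comparisons hold by reference, returning the stored field directly gives the same result without the intermediate string.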



[jira] [Commented] (MAPREDUCE-4843) When using DefaultTaskController, JobLocalizer not thread safe

2013-02-04 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13570694#comment-13570694
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4843:
---

+1. As per discussion in MAPREDUCE-4964 the latest patch seems a better way of 
doing it.

 When using DefaultTaskController, JobLocalizer not thread safe
 --

 Key: MAPREDUCE-4843
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4843
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 1.1.1
Reporter: zhaoyunjiong
Assignee: Karthik Kambatla
Priority: Critical
 Attachments: MAPREDUCE-4843-branch-1.1.patch, mr-4843.patch




[jira] [Updated] (MAPREDUCE-4843) When using DefaultTaskController, JobLocalizer not thread safe

2013-02-04 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated MAPREDUCE-4843:
--

   Resolution: Fixed
Fix Version/s: 1.2.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thanks Karthik. Committed to branch-1. Arun, thanks for double checking on this 
one.

 When using DefaultTaskController, JobLocalizer not thread safe
 --

 Key: MAPREDUCE-4843
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4843
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 1.1.1
Reporter: zhaoyunjiong
Assignee: Karthik Kambatla
Priority: Critical
 Fix For: 1.2.0

 Attachments: MAPREDUCE-4843-branch-1.1.patch, mr-4843.patch




[jira] [Updated] (MAPREDUCE-4822) Unnecessary conversions in History Events

2013-02-04 Thread Chu Tong (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chu Tong updated MAPREDUCE-4822:


Attachment: (was: MAPREDUCE-4822.patch)

 Unnecessary conversions in History Events
 -

 Key: MAPREDUCE-4822
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4822
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Affects Versions: 0.23.4
Reporter: Robert Joseph Evans
Assignee: Chu Tong
Priority: Trivial



[jira] [Updated] (MAPREDUCE-4822) Unnecessary conversions in History Events

2013-02-04 Thread Chu Tong (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chu Tong updated MAPREDUCE-4822:


Attachment: MAPREDUCE-4822.patch

 Unnecessary conversions in History Events
 -

 Key: MAPREDUCE-4822
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4822
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Affects Versions: 0.23.4
Reporter: Robert Joseph Evans
Assignee: Chu Tong
Priority: Trivial
 Attachments: MAPREDUCE-4822.patch




[jira] [Updated] (MAPREDUCE-4822) Unnecessary conversions in History Events

2013-02-04 Thread Chu Tong (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chu Tong updated MAPREDUCE-4822:


   Fix Version/s: 0.24.0
  Labels: patch  (was: )
Target Version/s: 0.23.4
  Status: Patch Available  (was: Open)

 Unnecessary conversions in History Events
 -

 Key: MAPREDUCE-4822
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4822
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Affects Versions: 0.23.4
Reporter: Robert Joseph Evans
Assignee: Chu Tong
Priority: Trivial
  Labels: patch
 Fix For: 0.24.0

 Attachments: MAPREDUCE-4822.patch




[jira] [Commented] (MAPREDUCE-4822) Unnecessary conversions in History Events

2013-02-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13570840#comment-13570840
 ] 

Hadoop QA commented on MAPREDUCE-4822:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12567924/MAPREDUCE-4822.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3299//console

This message is automatically generated.

 Unnecessary conversions in History Events
 -

 Key: MAPREDUCE-4822
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4822
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Affects Versions: 0.23.4
Reporter: Robert Joseph Evans
Assignee: Chu Tong
Priority: Trivial
  Labels: patch
 Fix For: 0.24.0

 Attachments: MAPREDUCE-4822.patch




[jira] [Resolved] (MAPREDUCE-4973) Backport history clean up configurations to branch-1

2013-02-04 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla resolved MAPREDUCE-4973.
-

Resolution: Duplicate

 Backport history clean up configurations to branch-1
 

 Key: MAPREDUCE-4973
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4973
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.1.1
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla

 In trunk-based versions, we can configure the max-age of files after which 
 they will be cleaned up. This JIRA is to backport those configurations to 
 branch-1.



[jira] [Updated] (MAPREDUCE-4820) MRApps distributed-cache duplicate checks are incorrect

2013-02-04 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-4820:
-

Priority: Major  (was: Blocker)

 MRApps distributed-cache duplicate checks are incorrect
 ---

 Key: MAPREDUCE-4820
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4820
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Affects Versions: 2.0.2-alpha
Reporter: Alejandro Abdelnur
 Fix For: 2.0.3-alpha


 This seems to be a combination of issues that are being exposed in
 2.0.2-alpha by MAPREDUCE-4549.
 MAPREDUCE-4549 introduces a check to ensure there are no duplicate JARs
 in the distributed-cache (using the JAR name as identity).
 In Hadoop 2 (unlike Hadoop 1), all JARs in the distributed-cache are
 symlinked into the current directory of the task.
 MRApps, when setting up the DistributedCache
 (MRApps#setupDistributedCache-parseDistributedCacheArtifacts), assumes that
 the local resources (files in CURRENT_DIR/, CURRENT_DIR/classes/ and
 CURRENT_DIR/lib/) are already part of the distributed-cache.
 For systems like Oozie, which use a launcher job to submit the real job, this
 poses a problem because MRApps is run from the launcher job to submit the
 real job. The configuration of the real job has the correct distributed-cache
 entries (no duplicates), but because the current dir has the same files, the
 submission fails.
 It seems that MRApps should not be checking distributed-cache entries for
 dups against JARs in CURRENT_DIR/ or CURRENT_DIR/lib/. The dup check should be
 done among distributed-cache entries only.
 It seems YARNRunner is symlinking all files in the distributed-cache into the
 current directory. In Hadoop 1 this was done only for files added to the
 distributed-cache using a fragment (i.e. #FOO) to trigger symlink creation.
 Marking as a blocker because without a fix for this, Oozie cannot submit jobs
 to Hadoop 2. (I've debugged Oozie in a live cluster being used by BigTop
 to test their release work, thanks Roman, and I've verified that Oozie 3.3
 does not create duplicate entries in the distributed-cache.)
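
The proposed behaviour (checking for duplicates among distributed-cache entries only) can be sketched as follows. This is an illustration under stated assumptions: CacheDupCheck, its method names and the sample paths are hypothetical, identity is taken to be the last path segment, and none of this is the MRApps code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CacheDupCheck {
    // Returns the file name (last path segment) used as the entry's identity.
    static String fileName(String path) {
        int slash = path.lastIndexOf('/');
        return slash >= 0 ? path.substring(slash + 1) : path;
    }

    // Reports names that occur more than once among the cache entries
    // themselves; files in the task's current directory are never consulted.
    static List<String> duplicateNames(List<String> cacheEntries) {
        Set<String> seen = new HashSet<>();
        List<String> dups = new ArrayList<>();
        for (String entry : cacheEntries) {
            String name = fileName(entry);
            if (!seen.add(name) && !dups.contains(name)) {
                dups.add(name);
            }
        }
        return dups;
    }

    public static void main(String[] args) {
        // Hypothetical cache entries of the real job; no duplicates here.
        List<String> cache = Arrays.asList(
                "hdfs:///apps/lib/a.jar", "hdfs:///apps/lib/b.jar");
        System.out.println(duplicateNames(cache).isEmpty());
    }
}
```

Under this scheme a launcher job whose working directory already contains a.jar would not cause the real job's submission to be rejected, because only the cache entries are compared with each other.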



[jira] [Updated] (MAPREDUCE-4967) TestJvmReuse fails on assertion

2013-02-04 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4967:


Attachment: mr-4967.patch

Uploading a trivial patch that converts TestJvmReuse to use junit4.

ant test -Dtestcase=TestJvmReuse doesn't run any tests now.

 TestJvmReuse fails on assertion
 ---

 Key: MAPREDUCE-4967
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4967
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker, test
Affects Versions: 1.1.2
Reporter: Chris Nauroth
Assignee: Karthik Kambatla
 Attachments: mr-4967.patch


 {{TestJvmReuse}} on branch-1 consistently fails on an assertion.



[jira] [Updated] (MAPREDUCE-4967) TestJvmReuse fails on assertion

2013-02-04 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4967:


Status: Patch Available  (was: Open)

 TestJvmReuse fails on assertion
 ---

 Key: MAPREDUCE-4967
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4967
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker, test
Affects Versions: 1.1.2
Reporter: Chris Nauroth
Assignee: Karthik Kambatla
 Attachments: mr-4967.patch




[jira] [Commented] (MAPREDUCE-4967) TestJvmReuse fails on assertion

2013-02-04 Thread Surenkumar Nihalani (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13570884#comment-13570884
 ] 

Surenkumar Nihalani commented on MAPREDUCE-4967:


+1

 TestJvmReuse fails on assertion
 ---

 Key: MAPREDUCE-4967
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4967
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker, test
Affects Versions: 1.1.2
Reporter: Chris Nauroth
Assignee: Karthik Kambatla
 Attachments: mr-4967.patch




[jira] [Commented] (MAPREDUCE-4967) TestJvmReuse fails on assertion

2013-02-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13570885#comment-13570885
 ] 

Hadoop QA commented on MAPREDUCE-4967:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12567939/mr-4967.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3300//console

This message is automatically generated.

 TestJvmReuse fails on assertion
 ---

 Key: MAPREDUCE-4967
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4967
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker, test
Affects Versions: 1.1.2
Reporter: Chris Nauroth
Assignee: Karthik Kambatla
 Attachments: mr-4967.patch




[jira] [Updated] (MAPREDUCE-4434) Backport MR-2779 (JobSplitWriter.java can't handle large job.split file) to branch-1

2013-02-04 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4434:


 Target Version/s: 1.2.0
Affects Version/s: (was: 1.0.3)
   1.1.1
   Status: Patch Available  (was: Open)

 Backport MR-2779 (JobSplitWriter.java can't handle large job.split file) to 
 branch-1
 

 Key: MAPREDUCE-4434
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4434
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.1.1
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: mr-4434.patch






[jira] [Updated] (MAPREDUCE-4434) Backport MR-2779 (JobSplitWriter.java can't handle large job.split file) to branch-1

2013-02-04 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4434:


Attachment: mr-4434.patch

Trivial backport of MR-2779. Verified that TestJobSplitWriter passes.

 Backport MR-2779 (JobSplitWriter.java can't handle large job.split file) to 
 branch-1
 

 Key: MAPREDUCE-4434
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4434
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.0.3
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: mr-4434.patch






[jira] [Commented] (MAPREDUCE-4434) Backport MR-2779 (JobSplitWriter.java can't handle large job.split file) to branch-1

2013-02-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13570890#comment-13570890
 ] 

Hadoop QA commented on MAPREDUCE-4434:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12567941/mr-4434.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3301//console

This message is automatically generated.

 Backport MR-2779 (JobSplitWriter.java can't handle large job.split file) to 
 branch-1
 

 Key: MAPREDUCE-4434
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4434
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.1.1
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: mr-4434.patch






[jira] [Commented] (MAPREDUCE-2264) Job status exceeds 100% in some cases

2013-02-04 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13571025#comment-13571025
 ] 

Sandy Ryza commented on MAPREDUCE-2264:
---

Arun,

Haven't heard from you on this.  As most of the changes are unrelated to the 
original issue, I'll mark this as resolved and work on a cleanup JIRA tomorrow 
unless you say otherwise?

 Job status exceeds 100% in some cases 
 --

 Key: MAPREDUCE-2264
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2264
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.2, 0.20.205.0
Reporter: Adam Kramer
Assignee: Devaraj K
  Labels: critical-0.22.0
 Fix For: 1.2.0, 2.0.3-alpha

 Attachments: MAPREDUCE-2264-0.20.205-1.patch, 
 MAPREDUCE-2264-0.20.205.patch, MAPREDUCE-2264-0.20.3.patch, 
 MAPREDUCE-2264-branch-1-1.patch, MAPREDUCE-2264-branch-1-2.patch, 
 MAPREDUCE-2264-branch-1.patch, MAPREDUCE-2264-trunk-1.patch, 
 MAPREDUCE-2264-trunk-1.patch, MAPREDUCE-2264-trunk-2.patch, 
 MAPREDUCE-2264-trunk-3.patch, MAPREDUCE-2264-trunk-4.patch, 
 MAPREDUCE-2264-trunk-5.patch, MAPREDUCE-2264-trunk-5.patch, 
 MAPREDUCE-2264-trunk.patch, more than 100%.bmp


 I'm looking now at my jobtracker's list of running reduce tasks. One of them 
 is 120.05% complete, the other is 107.28% complete.
 I understand that these numbers are estimates, but there is no case in which 
 an estimate of 100% for a non-complete task is better than an estimate of 
 99.99%, nor is there any case in which an estimate greater than 100% is valid.
 I suggest that whatever logic is computing these set 99.99% as a hard maximum.
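
The reporter's suggestion (a hard maximum of 99.99% for a task that has not finished) could look roughly like the sketch below. ProgressClamp and clampRunning are hypothetical names, progress is assumed to be a fraction between 0 and 1, and this is not the change made in the attached patches.

```java
class ProgressClamp {
    // Upper bound for a task that is still running: 99.99%.
    static final float MAX_RUNNING = 0.9999f;

    // Caps a progress estimate (a fraction, 1.0 == 100%) for a running task.
    static float clampRunning(float progress) {
        if (progress < 0f) {
            return 0f;
        }
        return Math.min(progress, MAX_RUNNING);
    }

    public static void main(String[] args) {
        // The two estimates quoted in the report, as fractions.
        System.out.println(clampRunning(1.2005f));
        System.out.println(clampRunning(1.0728f));
        // An in-range estimate passes through unchanged.
        System.out.println(clampRunning(0.5f));
    }
}
```

A clamp like this hides the symptom only; it does not explain why the estimate exceeded 100% in the first place.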



[jira] [Commented] (MAPREDUCE-4953) HadoopPipes misuses fprintf

2013-02-04 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13571039#comment-13571039
 ] 

Colin Patrick McCabe commented on MAPREDUCE-4953:
-

looks good to me

 HadoopPipes misuses fprintf
 ---

 Key: MAPREDUCE-4953
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4953
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: pipes
Affects Versions: 3.0.0, 2.0.3-alpha
Reporter: Andy Isaacson
Assignee: Andy Isaacson
 Attachments: mapreduce-4953.txt


 {code}
  [exec] /mnt/trunk/hadoop-tools/hadoop-pipes/src/main/native/pipes/impl/HadoopPipes.cc:130:58: warning: format not a string literal and no format arguments [-Wformat-security]
 {code}



[jira] [Commented] (MAPREDUCE-2308) Sort buffer size (io.sort.mb) is limited to 2 GB

2013-02-04 Thread kiran sreekumar (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13571040#comment-13571040
 ] 

kiran sreekumar commented on MAPREDUCE-2308:


io.sort.mb should be 10 * io.sort.factor.
HADOOP-3473 suggests keeping its default at 100.

 Sort buffer size (io.sort.mb) is limited to  2 GB
 --

 Key: MAPREDUCE-2308
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2308
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.1, 0.20.2, 0.21.0
 Environment: Cloudera CDH3b3 (0.20.2+)
Reporter: Jay Hacker
Priority: Minor

 I have MapReduce jobs that use a large amount of per-task memory, because the
 algorithm I'm using converges faster if more data is together on a node.  I
 have my JVM heap size set at 3200 MB, and if I use the popular rule of thumb
 that io.sort.mb should be ~70% of that, I get 2240 MB.  I rounded this down
 to 2048 MB, but map tasks crash with:
 {noformat}
 java.io.IOException: Invalid io.sort.mb: 2048
 at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:790)
 ...
 {noformat}
 MapTask.MapOutputBuffer implements its buffer with a byte[] of size
 io.sort.mb (in bytes), and sanity-checks the size before allocating the
 array.  The problem is that Java arrays can't have more than 2^31 - 1
 elements (even with a 64-bit JVM), and this is a limitation of the Java
 language specification itself.  As memory and data sizes grow, this would
 seem to be a crippling limitation of Java.
 It would be nice if this ceiling were documented, and an error issued sooner,
 e.g. in jobtracker startup upon reading the config.  Going forward, we may
 need to implement some array of arrays hack for large buffers. :(
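
The arithmetic behind the ceiling described above can be sketched as follows. This is an illustration only: the class and method names are hypothetical, and the check is not the one MapTask actually performs.

```java
/** Illustrative arithmetic only; hypothetical class, not Hadoop code. */
final class SortBufferLimit {
    /** Convert megabytes to bytes using 64-bit arithmetic so the result cannot wrap. */
    static long megabytesToBytes(int megabytes) {
        return (long) megabytes << 20;
    }

    /** True if a byte count of this many megabytes is representable as an int array length. */
    static boolean fitsInByteArray(int megabytes) {
        return megabytes >= 0 && megabytesToBytes(megabytes) <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        // 2048 MB is exactly 2^31 bytes, one more than Integer.MAX_VALUE (2^31 - 1).
        System.out.println(megabytesToBytes(2048));
        // The same shift done in 32-bit int arithmetic wraps to a negative value.
        System.out.println(2048 << 20);
        System.out.println(fitsInByteArray(2047));
        System.out.println(fitsInByteArray(2048));
    }
}
```

Under this arithmetic the largest whole-megabyte value that still fits is 2047, which is consistent with 2048 being rejected in the report.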



[jira] [Updated] (MAPREDUCE-4822) Unnecessary conversions in History Events

2013-02-04 Thread Chu Tong (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chu Tong updated MAPREDUCE-4822:


Status: Open  (was: Patch Available)

 Unnecessary conversions in History Events
 -

 Key: MAPREDUCE-4822
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4822
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Affects Versions: 0.23.4
Reporter: Robert Joseph Evans
Assignee: Chu Tong
Priority: Trivial
  Labels: patch
 Fix For: 0.24.0

 Attachments: MAPREDUCE-4822.patch


 There are a number of conversions in the Job History Event classes that are
 totally unnecessary.  It appears that they were originally used to convert
 from the internal Avro format, but now many of them do not pull the values
 from Avro; they store them internally.
 For example:
 {code:title=TaskAttemptFinishedEvent.java}
   /** Get the task type */
   public TaskType getTaskType() {
     return TaskType.valueOf(taskType.toString());
   }
 {code}
 The code currently takes an enum, converts it to a string and then
 asks the same enum type to convert it back to an enum.  If Java works properly
 this should be a no-op, and a reference to the original taskType should be
 returned.
 There are several places where toString is called on a string, and
 since strings are immutable it returns a reference to itself.
 The various ids are not immutable and probably should not be changed at this
 point.
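
The round trip described above can be checked with a small stand-alone sketch. DemoTaskType is a stand-in enum introduced here for illustration; the real TaskType class is not reproduced.

```java
/** Stand-in enum for illustration; the real TaskType lives in Hadoop. */
enum DemoTaskType { MAP, REDUCE }

final class EnumRoundTrip {
    /** Round-trips through the constant's string form, as in the getter quoted above. */
    static DemoTaskType viaString(DemoTaskType t) {
        return DemoTaskType.valueOf(t.toString());
    }

    public static void main(String[] args) {
        DemoTaskType original = DemoTaskType.REDUCE;
        // Enum constants are singletons, so the round trip yields the same reference.
        System.out.println(viaString(original) == original);
    }
}
```

This only holds while toString() is not overridden: valueOf() looks constants up by name(), so an enum with a custom toString() could make the round trip throw IllegalArgumentException. Returning the field directly avoids both the extra work and that risk.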



[jira] [Updated] (MAPREDUCE-4822) Unnecessary conversions in History Events

2013-02-04 Thread Chu Tong (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chu Tong updated MAPREDUCE-4822:


Attachment: (was: MAPREDUCE-4822.patch)

 Unnecessary conversions in History Events
 -

 Key: MAPREDUCE-4822
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4822
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Affects Versions: 0.23.4
Reporter: Robert Joseph Evans
Assignee: Chu Tong
Priority: Trivial
  Labels: patch
 Fix For: 0.24.0

 Attachments: MAPREDUCE-4822.patch




[jira] [Updated] (MAPREDUCE-4822) Unnecessary conversions in History Events

2013-02-04 Thread Chu Tong (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chu Tong updated MAPREDUCE-4822:


Attachment: MAPREDUCE-4822.patch

 Unnecessary conversions in History Events
 -

 Key: MAPREDUCE-4822
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4822
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Affects Versions: 0.23.4
Reporter: Robert Joseph Evans
Assignee: Chu Tong
Priority: Trivial
  Labels: patch
 Fix For: 0.24.0

 Attachments: MAPREDUCE-4822.patch




[jira] [Updated] (MAPREDUCE-4822) Unnecessary conversions in History Events

2013-02-04 Thread Chu Tong (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chu Tong updated MAPREDUCE-4822:


Status: Patch Available  (was: Open)

 Unnecessary conversions in History Events
 -

 Key: MAPREDUCE-4822
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4822
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Affects Versions: 0.23.4
Reporter: Robert Joseph Evans
Assignee: Chu Tong
Priority: Trivial
  Labels: patch
 Fix For: 0.24.0

 Attachments: MAPREDUCE-4822.patch




[jira] [Commented] (MAPREDUCE-4953) HadoopPipes misuses fprintf

2013-02-04 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13571045#comment-13571045
 ] 

Aaron T. Myers commented on MAPREDUCE-4953:
---

+1, the patch looks good to me. I confirmed that this gets rid of the compiler 
warning.

I'm going to commit this momentarily.

 HadoopPipes misuses fprintf
 ---

 Key: MAPREDUCE-4953
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4953
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: pipes
Affects Versions: 3.0.0, 2.0.3-alpha
Reporter: Andy Isaacson
Assignee: Andy Isaacson
 Attachments: mapreduce-4953.txt




[jira] [Updated] (MAPREDUCE-4953) HadoopPipes misuses fprintf

2013-02-04 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated MAPREDUCE-4953:
--

   Resolution: Fixed
Fix Version/s: 2.0.3-alpha
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I've just committed this to trunk and branch-2.

Thanks a lot for the contribution, Andy.

 HadoopPipes misuses fprintf
 ---

 Key: MAPREDUCE-4953
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4953
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: pipes
Affects Versions: 3.0.0, 2.0.3-alpha
Reporter: Andy Isaacson
Assignee: Andy Isaacson
 Fix For: 2.0.3-alpha

 Attachments: mapreduce-4953.txt




[jira] [Updated] (MAPREDUCE-4822) Unnecessary conversions in History Events

2013-02-04 Thread Chu Tong (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chu Tong updated MAPREDUCE-4822:


Status: Open  (was: Patch Available)

 Unnecessary conversions in History Events
 -

 Key: MAPREDUCE-4822
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4822
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Affects Versions: 0.23.4
Reporter: Robert Joseph Evans
Assignee: Chu Tong
Priority: Trivial
  Labels: patch
 Fix For: 0.24.0

 Attachments: MAPREDUCE-4822.patch




[jira] [Commented] (MAPREDUCE-4953) HadoopPipes misuses fprintf

2013-02-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13571052#comment-13571052
 ] 

Hudson commented on MAPREDUCE-4953:
---

Integrated in Hadoop-trunk-Commit #3323 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3323/])
MAPREDUCE-4953. HadoopPipes misuses fprintf. Contributed by Andy Isaacson. 
(Revision 1442471)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1442471
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-tools/hadoop-pipes/src/main/native/pipes/impl/HadoopPipes.cc


 HadoopPipes misuses fprintf
 ---

 Key: MAPREDUCE-4953
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4953
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: pipes
Affects Versions: 3.0.0, 2.0.3-alpha
Reporter: Andy Isaacson
Assignee: Andy Isaacson
 Fix For: 2.0.3-alpha

 Attachments: mapreduce-4953.txt




[jira] [Updated] (MAPREDUCE-4822) Unnecessary conversions in History Events

2013-02-04 Thread Chu Tong (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chu Tong updated MAPREDUCE-4822:


Attachment: (was: MAPREDUCE-4822.patch)

 Unnecessary conversions in History Events
 -

 Key: MAPREDUCE-4822
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4822
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Affects Versions: 0.23.4
Reporter: Robert Joseph Evans
Assignee: Chu Tong
Priority: Trivial
  Labels: patch
 Fix For: 0.24.0




[jira] [Updated] (MAPREDUCE-4822) Unnecessary conversions in History Events

2013-02-04 Thread Chu Tong (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chu Tong updated MAPREDUCE-4822:


Attachment: MAPREDUCE-4822.patch

No test case is included as the change is trivial.

The JavaDoc warnings are false positives, as the same number of warnings is
generated on a clean build in my dev environment.


-1 overall.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified
tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

-1 javadoc.  The javadoc tool appears to have generated 20 warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version ) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.


 Unnecessary conversions in History Events
 -

 Key: MAPREDUCE-4822
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4822
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Affects Versions: 0.23.4
Reporter: Robert Joseph Evans
Assignee: Chu Tong
Priority: Trivial
  Labels: patch
 Fix For: 0.24.0

 Attachments: MAPREDUCE-4822.patch




[jira] [Updated] (MAPREDUCE-4822) Unnecessary conversions in History Events

2013-02-04 Thread Chu Tong (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chu Tong updated MAPREDUCE-4822:


Status: Patch Available  (was: Open)

 Unnecessary conversions in History Events
 -

 Key: MAPREDUCE-4822
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4822
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Affects Versions: 0.23.4
Reporter: Robert Joseph Evans
Assignee: Chu Tong
Priority: Trivial
  Labels: patch
 Fix For: 0.24.0

 Attachments: MAPREDUCE-4822.patch




[jira] [Commented] (MAPREDUCE-4822) Unnecessary conversions in History Events

2013-02-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13571122#comment-13571122
 ] 

Hadoop QA commented on MAPREDUCE-4822:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12567961/MAPREDUCE-4822.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include
any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3302//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3302//console

This message is automatically generated.

 Unnecessary conversions in History Events
 -

 Key: MAPREDUCE-4822
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4822
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Affects Versions: 0.23.4
Reporter: Robert Joseph Evans
Assignee: Chu Tong
Priority: Trivial
  Labels: patch
 Fix For: 0.24.0

 Attachments: MAPREDUCE-4822.patch




[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-04 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13571123#comment-13571123
 ] 

Gelesh commented on MAPREDUCE-4974:
---

[~tlipcon]
I ran a rough estimate locally, with a small data set, by subtracting the long
values obtained from System.nanoTime() at the beginning and at the end of the
method.

The average time difference was 200 nanoseconds per individual call made to
nextKeyValue(), excluding the very first call, since it involves the object
creation.

The total time difference would be 200 ns * the number of key-value pairs
generated per map task.
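
As a rough worked example of the estimate above: the 200 ns figure is the measurement reported in this comment, while the record count of 10 million per map task is purely an assumption chosen for illustration. The class name is hypothetical.

```java
/** Back-of-the-envelope arithmetic; hypothetical class, not Hadoop code. */
final class NullCheckOverhead {
    /** Total overhead in milliseconds for a per-record cost (ns) and a record count. */
    static double totalMillis(long nanosPerRecord, long records) {
        return nanosPerRecord * records / 1_000_000.0;
    }

    public static void main(String[] args) {
        long nanosPerRecord = 200L;        // figure reported in the comment
        long records = 10_000_000L;        // assumed record count per map task
        System.out.println(totalMillis(nanosPerRecord, records) + " ms");
    }
}
```

With these inputs the total comes to 2000 ms per map task, so whether the saving matters depends heavily on how many records each task processes.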

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.20.204.0, 0.24.0

 Attachments: MAPREDUCE-4974.1.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is scope for optimizing the code in initialize() if we
 instantiate compressionCodecs & codec only when the input is compressed.
 Meanwhile, Gelesh George Omathil suggested avoiding the null check of
 key & value. This would save time, since the null check is done for every
 next key-value generation. The intention is to instantiate only once and
 avoid an NPE as well. Both goals could be met by initializing key & value in
 the initialize() method. We both have worked on it.
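
The idea above, allocating the reusable key and value holders once in initialize() so the per-record path needs no null checks, might look like the following sketch. It is a simplified, hypothetical reader over an in-memory list, not the actual LineRecordReader, and it assumes one-character line separators.

```java
import java.util.Iterator;
import java.util.List;

/** Hypothetical, simplified reader; it only mirrors the shape of the idea. */
final class EagerInitReader {
    private final Iterator<String> lines;
    private long[] key;            // reusable holder for the record's byte offset
    private StringBuilder value;   // reusable holder for the line contents
    private long pos;              // offset of the next line to be read

    EagerInitReader(List<String> input) {
        this.lines = input.iterator();
    }

    /** Allocate the reusable holders once, up front. */
    void initialize() {
        key = new long[1];
        value = new StringBuilder();
    }

    /** No per-record null checks: initialize() is assumed to have run already. */
    boolean nextKeyValue() {
        if (!lines.hasNext()) {
            return false;
        }
        String line = lines.next();
        key[0] = pos;
        value.setLength(0);
        value.append(line);
        pos += line.length() + 1;  // +1 for the assumed one-character separator
        return true;
    }

    long currentKey() { return key[0]; }

    String currentValue() { return value.toString(); }
}
```

The trade-off is that calling nextKeyValue() before initialize() now fails with a NullPointerException instead of allocating lazily, so this relies on the framework always calling initialize() first.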



[jira] [Updated] (MAPREDUCE-4049) plugin for generic shuffle service

2013-02-04 Thread Avner BenHanoch (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Avner BenHanoch updated MAPREDUCE-4049:
---

Attachment: MAPREDUCE-4049--branch-1.patch

 plugin for generic shuffle service
 --

 Key: MAPREDUCE-4049
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: performance, task, tasktracker
Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0
Reporter: Avner BenHanoch
Assignee: Avner BenHanoch
  Labels: merge, plugin, rdma, shuffle
 Fix For: 2.0.3-alpha

 Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, 
 MAPREDUCE-4049--branch-1.patch, MAPREDUCE-4049--branch-1.patch, 
 MAPREDUCE-4049--branch-1.patch, mapreduce-4049.patch


 Support generic shuffle service as a set of two plugins: ShuffleProvider &
 ShuffleConsumer.
 This will satisfy the following needs:
 # Better shuffle and merge performance. For example: we are working on a
 shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE,
 or Infiniband) instead of using the current HTTP shuffle. Based on the fast
 RDMA shuffle, the plugin can also utilize a suitable merge approach during
 the intermediate merges, and hence get much better performance.
 # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden
 dependency of NodeManager with a specific version of mapreduce shuffle
 (currently targeted to 0.24.0).
 References:
 # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu
 from Auburn University with others,
 [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
 # I am attaching 2 documents with suggested Top Level Design for both plugins
 (currently, based on 1.0 branch)
 # I am providing a link for downloading UDA - Mellanox's open source plugin
 that implements generic shuffle service using RDMA and levitated merge.
 Note: At this phase, the code is in C++ through JNI and you should consider
 it as beta only.  Still, it can serve anyone who wants to implement or
 contribute to levitated merge. (Please be advised that levitated merge is
 mostly suited to very fast networks) -
 [http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=144&menu_section=69]



[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service

2013-02-04 Thread Avner BenHanoch (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13571125#comment-13571125
 ] 

Avner BenHanoch commented on MAPREDUCE-4049:


Arun, Alejandro,
I attached a new patch for branch-1 that addresses all your comments.
I am still passing ReduceTask to the plugin according to my explanation in the
previous comment.
Cheers,
Avner

 plugin for generic shuffle service
 --

 Key: MAPREDUCE-4049
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: performance, task, tasktracker
Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0
Reporter: Avner BenHanoch
Assignee: Avner BenHanoch
  Labels: merge, plugin, rdma, shuffle
 Fix For: 2.0.3-alpha

 Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, 
 MAPREDUCE-4049--branch-1.patch, MAPREDUCE-4049--branch-1.patch, 
 MAPREDUCE-4049--branch-1.patch, mapreduce-4049.patch

