[jira] [Commented] (MAPREDUCE-6223) TestJobConf#testNegativeValueForTaskVmem failures

2015-02-05 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308760#comment-14308760
 ] 

Masatake Iwasaki commented on MAPREDUCE-6223:
-

s/not local value but//

 TestJobConf#testNegativeValueForTaskVmem failures
 -

 Key: MAPREDUCE-6223
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6223
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0
Reporter: Gera Shegalov
Assignee: Varun Saxena
 Attachments: MAPREDUCE-6223.001.patch, MAPREDUCE-6223.002.patch


 {code}
 Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.328 sec  
 FAILURE! - in org.apache.hadoop.conf.TestJobConf
 testNegativeValueForTaskVmem(org.apache.hadoop.conf.TestJobConf)  Time 
 elapsed: 0.089 sec   FAILURE!
 java.lang.AssertionError: expected:1024 but was:-1
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at org.junit.Assert.assertEquals(Assert.java:555)
   at org.junit.Assert.assertEquals(Assert.java:542)
   at 
 org.apache.hadoop.conf.TestJobConf.testNegativeValueForTaskVmem(TestJobConf.java:111)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6165) [JDK8] TestCombineFileInputFormat failed on JDK8

2015-02-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308545#comment-14308545
 ] 

Hadoop QA commented on MAPREDUCE-6165:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12685053/MAPREDUCE-6165-001.patch
  against trunk revision 6583ad1.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient:

  org.apache.hadoop.conf.TestJobConf

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5169//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5169//console

This message is automatically generated.

 [JDK8] TestCombineFileInputFormat failed on JDK8
 

 Key: MAPREDUCE-6165
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6165
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Wei Yan
Assignee: Akira AJISAKA
Priority: Minor
 Attachments: MAPREDUCE-6165-001.patch, MAPREDUCE-6165-reproduce.patch


 The error msg:
 {noformat}
 testSplitPlacementForCompressedFiles(org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat)
   Time elapsed: 2.487 sec   FAILURE!
 junit.framework.AssertionFailedError: expected:2 but was:1
   at junit.framework.Assert.fail(Assert.java:57)
   at junit.framework.Assert.failNotEquals(Assert.java:329)
   at junit.framework.Assert.assertEquals(Assert.java:78)
   at junit.framework.Assert.assertEquals(Assert.java:234)
   at junit.framework.Assert.assertEquals(Assert.java:241)
   at junit.framework.TestCase.assertEquals(TestCase.java:409)
   at 
 org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat.testSplitPlacementForCompressedFiles(TestCombineFileInputFormat.java:911)
 testSplitPlacement(org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat)
   Time elapsed: 0.985 sec   FAILURE!
 junit.framework.AssertionFailedError: expected:2 but was:1
   at junit.framework.Assert.fail(Assert.java:57)
   at junit.framework.Assert.failNotEquals(Assert.java:329)
   at junit.framework.Assert.assertEquals(Assert.java:78)
   at junit.framework.Assert.assertEquals(Assert.java:234)
   at junit.framework.Assert.assertEquals(Assert.java:241)
   at junit.framework.TestCase.assertEquals(TestCase.java:409)
   at 
 org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat.testSplitPlacement(TestCombineFileInputFormat.java:368)
 {noformat}





[jira] [Commented] (MAPREDUCE-6227) DFSIO for truncate

2015-02-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308606#comment-14308606
 ] 

Hadoop QA commented on MAPREDUCE-6227:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12696938/DFSIO-truncate-00.patch
  against trunk revision 6583ad1.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient:

  org.apache.hadoop.conf.TestJobConf

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5170//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5170//console

This message is automatically generated.

 DFSIO for truncate
 --

 Key: MAPREDUCE-6227
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6227
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: benchmarks, test
Affects Versions: 2.7.0
Reporter: Konstantin Shvachko
Assignee: Konstantin Shvachko
 Attachments: DFSIO-truncate-00.patch


 Create a benchmark and a test for truncate within the framework of TestDFSIO.





[jira] [Updated] (MAPREDUCE-6234) MRJobConfig.DEFAULT_*_MEMORY_MB should be consistent with mapred-default.xml

2015-02-05 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated MAPREDUCE-6234:

Attachment: MAPREDUCE-6234.002.patch

.002 addresses my comment in MAPREDUCE-6223. Tests needing the default value in 
conf can use {{MRJobConfig.DEFAULT_MAP_MEMORY_MB}}, and tests needing the value 
processed by JobConf#getMemoryRequired can use 
{{JobConf.DEFAULT_MAP_MEMORY_REQUIRED}}.

 MRJobConfig.DEFAULT_*_MEMORY_MB should be consistent with mapred-default.xml
 

 Key: MAPREDUCE-6234
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6234
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/gridmix, mrv2
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
 Attachments: MAPREDUCE-6234.001.patch, MAPREDUCE-6234.002.patch


 TestHighRamJob fails because of this.
 {code}
 ---
  T E S T S
 ---
 Running org.apache.hadoop.mapred.gridmix.TestHighRamJob
 Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.162 sec  
 FAILURE! - in org.apache.hadoop.mapred.gridmix.TestHighRamJob
 testHighRamFeatureEmulation(org.apache.hadoop.mapred.gridmix.TestHighRamJob)  
 Time elapsed: 1.102 sec   FAILURE!
 java.lang.AssertionError: expected:1024 but was:-1
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at org.junit.Assert.assertEquals(Assert.java:555)
   at org.junit.Assert.assertEquals(Assert.java:542)
   at 
 org.apache.hadoop.mapred.gridmix.TestHighRamJob.testHighRamConfig(TestHighRamJob.java:98)
   at 
 org.apache.hadoop.mapred.gridmix.TestHighRamJob.testHighRamFeatureEmulation(TestHighRamJob.java:117)
 {code}





[jira] [Commented] (MAPREDUCE-6234) MRJobConfig.DEFAULT_*_MEMORY_MB should be consistent with mapred-default.xml

2015-02-05 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308750#comment-14308750
 ] 

Tsuyoshi OZAWA commented on MAPREDUCE-6234:
---

Makes sense.

[~jira.shegalov], do you know why DEFAULT_MAP_MEMORY_MB was not updated in 
MAPREDUCE-5785? If there is no reason, I think we can apply this patch to 
trunk. 

 MRJobConfig.DEFAULT_*_MEMORY_MB should be consistent with mapred-default.xml
 

 Key: MAPREDUCE-6234
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6234
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/gridmix, mrv2
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
 Attachments: MAPREDUCE-6234.001.patch


 TestHighRamJob fails because of this.
 {code}
 ---
  T E S T S
 ---
 Running org.apache.hadoop.mapred.gridmix.TestHighRamJob
 Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.162 sec  
 FAILURE! - in org.apache.hadoop.mapred.gridmix.TestHighRamJob
 testHighRamFeatureEmulation(org.apache.hadoop.mapred.gridmix.TestHighRamJob)  
 Time elapsed: 1.102 sec   FAILURE!
 java.lang.AssertionError: expected:1024 but was:-1
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at org.junit.Assert.assertEquals(Assert.java:555)
   at org.junit.Assert.assertEquals(Assert.java:542)
   at 
 org.apache.hadoop.mapred.gridmix.TestHighRamJob.testHighRamConfig(TestHighRamJob.java:98)
   at 
 org.apache.hadoop.mapred.gridmix.TestHighRamJob.testHighRamFeatureEmulation(TestHighRamJob.java:117)
 {code}





[jira] [Commented] (MAPREDUCE-6223) TestJobConf#testNegativeValueForTaskVmem failures

2015-02-05 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308748#comment-14308748
 ] 

Masatake Iwasaki commented on MAPREDUCE-6223:
-

I think JobConf#getMemoryRequired should get 1024 from not local value but 
constant in MRJobConfig other than DEFAULT_*_MEMORY_MB because 1024 is never 
set in Configuration. [~ajisakaa] / [~ozawa], please commit the patch of this 
issue first. I will update the patch of MAPREDUCE-6234 later.

 TestJobConf#testNegativeValueForTaskVmem failures
 -

 Key: MAPREDUCE-6223
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6223
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0
Reporter: Gera Shegalov
Assignee: Varun Saxena
 Attachments: MAPREDUCE-6223.001.patch, MAPREDUCE-6223.002.patch


 {code}
 Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.328 sec  
 FAILURE! - in org.apache.hadoop.conf.TestJobConf
 testNegativeValueForTaskVmem(org.apache.hadoop.conf.TestJobConf)  Time 
 elapsed: 0.089 sec   FAILURE!
 java.lang.AssertionError: expected:1024 but was:-1
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at org.junit.Assert.assertEquals(Assert.java:555)
   at org.junit.Assert.assertEquals(Assert.java:542)
   at 
 org.apache.hadoop.conf.TestJobConf.testNegativeValueForTaskVmem(TestJobConf.java:111)
 {code}





[jira] [Updated] (MAPREDUCE-6227) DFSIO for truncate

2015-02-05 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated MAPREDUCE-6227:
---
Attachment: DFSIO-truncate-01.patch

Moved TestDFSIO_results.log under {{target/test-dir}} for tests.

 DFSIO for truncate
 --

 Key: MAPREDUCE-6227
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6227
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: benchmarks, test
Affects Versions: 2.7.0
Reporter: Konstantin Shvachko
Assignee: Konstantin Shvachko
 Attachments: DFSIO-truncate-00.patch, DFSIO-truncate-01.patch


 Create a benchmark and a test for truncate within the framework of TestDFSIO.





[jira] [Commented] (MAPREDUCE-6234) MRJobConfig.DEFAULT_*_MEMORY_MB should be consistent with mapred-default.xml

2015-02-05 Thread Gera Shegalov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308764#comment-14308764
 ] 

Gera Shegalov commented on MAPREDUCE-6234:
--

I apologize, I am a little too tied up right now to do a thorough review. 
Looking into resolving this is on my list. I was thinking that direct 
references to DEFAULT_*_MEMORY_MB should be wrapped in a single method. Maybe 
[~kasha] can chime in in the meantime.

 MRJobConfig.DEFAULT_*_MEMORY_MB should be consistent with mapred-default.xml
 

 Key: MAPREDUCE-6234
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6234
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/gridmix, mrv2
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
 Attachments: MAPREDUCE-6234.001.patch


 TestHighRamJob fails because of this.
 {code}
 ---
  T E S T S
 ---
 Running org.apache.hadoop.mapred.gridmix.TestHighRamJob
 Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.162 sec  
 FAILURE! - in org.apache.hadoop.mapred.gridmix.TestHighRamJob
 testHighRamFeatureEmulation(org.apache.hadoop.mapred.gridmix.TestHighRamJob)  
 Time elapsed: 1.102 sec   FAILURE!
 java.lang.AssertionError: expected:1024 but was:-1
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at org.junit.Assert.assertEquals(Assert.java:555)
   at org.junit.Assert.assertEquals(Assert.java:542)
   at 
 org.apache.hadoop.mapred.gridmix.TestHighRamJob.testHighRamConfig(TestHighRamJob.java:98)
   at 
 org.apache.hadoop.mapred.gridmix.TestHighRamJob.testHighRamFeatureEmulation(TestHighRamJob.java:117)
 {code}





[jira] [Commented] (MAPREDUCE-6223) TestJobConf#testNegativeValueForTaskVmem failures

2015-02-05 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308671#comment-14308671
 ] 

Varun Saxena commented on MAPREDUCE-6223:
-

[~ajisakaa] / [~ozawa],
As you wish; test failures will keep appearing until MAPREDUCE-6223 is 
committed.
Ideally {{MRJobConfig.DEFAULT_MAP_MEMORY_MB}} should not be changed. I feel we 
should not be taking the default value from a local variable.
MAPREDUCE-6234 will hence be a redundant fix, as we will have to revert its 
changes again.

It is somewhat confusing that the default value in mapred-default.xml is -1 
while the code takes it as 1024. But if somebody reads the config description, 
which should be done, it's quite clear what the behavior of this config is.
{code}
<description>
  The amount of memory to request from the scheduler for each
  map task. If this is not specified or is non-positive, it is inferred from
  mapreduce.map.java.opts and mapreduce.job.heap.memory-mb.ratio.
  If java-opts are also not specified, we set it to 1024.
</description>
{code}

You can take a call whether to commit that or not. Alternatively you can review 
and commit this as well.
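
The fallback order in the quoted description can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in JobConf: the class and method names are hypothetical, the -Xmx parsing is simplified, and rounding the inferred value up is an assumption the description does not spell out.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration; not Hadoop's JobConf.
class MapMemorySketch {
    // Value the description says is used when nothing else is specified.
    static final int FALLBACK_MB = 1024;

    // Simplified matcher for a JVM max-heap flag such as -Xmx1024m.
    private static final Pattern XMX = Pattern.compile("-Xmx(\\d+)([kKmMgG]?)");

    /** Returns the heap size in MB found in javaOpts, or -1 if there is none. */
    static int heapMbFromOpts(String javaOpts) {
        if (javaOpts == null) {
            return -1;
        }
        Matcher m = XMX.matcher(javaOpts);
        if (!m.find()) {
            return -1;
        }
        long value = Long.parseLong(m.group(1));
        switch (m.group(2).toLowerCase()) {
            case "g": return (int) (value * 1024);
            case "m": return (int) value;
            case "k": return (int) (value / 1024);
            default:  return (int) (value / (1024 * 1024)); // no suffix: bytes
        }
    }

    /**
     * configuredMb stands in for the memory setting (-1 in mapred-default.xml),
     * javaOpts for the java.opts setting, heapRatio for the heap/memory ratio.
     */
    static int resolveMemoryMb(int configuredMb, String javaOpts, double heapRatio) {
        if (configuredMb > 0) {
            return configuredMb;                          // explicit positive value wins
        }
        int heapMb = heapMbFromOpts(javaOpts);
        if (heapMb > 0) {
            return (int) Math.ceil(heapMb / heapRatio);   // inferred from the heap size
        }
        return FALLBACK_MB;                               // nothing specified
    }

    public static void main(String[] args) {
        System.out.println(resolveMemoryMb(-1, null, 0.5));
        System.out.println(resolveMemoryMb(-1, "-Xmx1024m", 0.5));
        System.out.println(resolveMemoryMb(2048, "-Xmx512m", 0.5));
    }
}
```

Under this reading, a test that asserts against the raw configuration default sees -1, while a test that asserts against the resolved value sees 1024, which matches the "expected:1024 but was:-1" failure quoted below.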

 TestJobConf#testNegativeValueForTaskVmem failures
 -

 Key: MAPREDUCE-6223
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6223
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0
Reporter: Gera Shegalov
Assignee: Varun Saxena
 Attachments: MAPREDUCE-6223.001.patch, MAPREDUCE-6223.002.patch


 {code}
 Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.328 sec  
 FAILURE! - in org.apache.hadoop.conf.TestJobConf
 testNegativeValueForTaskVmem(org.apache.hadoop.conf.TestJobConf)  Time 
 elapsed: 0.089 sec   FAILURE!
 java.lang.AssertionError: expected:1024 but was:-1
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at org.junit.Assert.assertEquals(Assert.java:555)
   at org.junit.Assert.assertEquals(Assert.java:542)
   at 
 org.apache.hadoop.conf.TestJobConf.testNegativeValueForTaskVmem(TestJobConf.java:111)
 {code}





[jira] [Commented] (MAPREDUCE-6235) Bundle and compress files passed with -libjars prior to uploading and distributing

2015-02-05 Thread Dustin Cote (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307179#comment-14307179
 ] 

Dustin Cote commented on MAPREDUCE-6235:


Thanks folks, I believe I was seeing a time difference because of the time to 
compress.  I'll go ahead and close this out since no code change should be made 
here.

 Bundle and compress files passed with -libjars prior to uploading and 
 distributing
 --

 Key: MAPREDUCE-6235
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6235
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: distributed-cache, mrv2
Affects Versions: 2.6.0
Reporter: Dustin Cote
Assignee: Dustin Cote
Priority: Minor

 To improve performance, we should upload jars flagged by -libjars as a single 
 bundle and expand on arrival instead of uploading the jars one by one.   This 
 would also reduce network overhead of using the -libjars option.





[jira] [Commented] (MAPREDUCE-6235) Bundle and compress files passed with -libjars prior to uploading and distributing

2015-02-05 Thread Dustin Cote (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307181#comment-14307181
 ] 

Dustin Cote commented on MAPREDUCE-6235:


Time to *zip* not compress... ok now closing it.

 Bundle and compress files passed with -libjars prior to uploading and 
 distributing
 --

 Key: MAPREDUCE-6235
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6235
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: distributed-cache, mrv2
Affects Versions: 2.6.0
Reporter: Dustin Cote
Assignee: Dustin Cote
Priority: Minor

 To improve performance, we should upload jars flagged by -libjars as a single 
 bundle and expand on arrival instead of uploading the jars one by one.   This 
 would also reduce network overhead of using the -libjars option.





[jira] [Resolved] (MAPREDUCE-6235) Bundle and compress files passed with -libjars prior to uploading and distributing

2015-02-05 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote resolved MAPREDUCE-6235.

Resolution: Invalid

 Bundle and compress files passed with -libjars prior to uploading and 
 distributing
 --

 Key: MAPREDUCE-6235
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6235
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: distributed-cache, mrv2
Affects Versions: 2.6.0
Reporter: Dustin Cote
Assignee: Dustin Cote
Priority: Minor

 To improve performance, we should upload jars flagged by -libjars as a single 
 bundle and expand on arrival instead of uploading the jars one by one.   This 
 would also reduce network overhead of using the -libjars option.





[jira] [Commented] (MAPREDUCE-6245) Fixed split shuffling.

2015-02-05 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307331#comment-14307331
 ] 

Eric Payne commented on MAPREDUCE-6245:
---

[~lbkzman], Can you please describe the problem that this Jira is trying to 
resolve?

 Fixed split shuffling.
 --

 Key: MAPREDUCE-6245
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6245
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: lbkzman
Assignee: lbkzman







[jira] [Commented] (MAPREDUCE-6059) Speed up history server startup time

2015-02-05 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307434#comment-14307434
 ] 

Allen Wittenauer commented on MAPREDUCE-6059:
-

It wasn't committed to branch-2 because I generally don't.  

 Speed up history server startup time
 

 Key: MAPREDUCE-6059
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6059
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.4.0
Reporter: Siqi Li
Assignee: Siqi Li
 Fix For: 3.0.0

 Attachments: YARN-2366.v1.patch


 When the history server starts up, it scans every history directory and puts 
 all history files into a cache, whereas this cache only stores the 20K most 
 recent history files. Therefore, it wastes a large portion of time loading 
 old history files into the cache, and the startup time will keep increasing 
 if we don't trim the number of history files. For example, when the history 
 server started up with 2.5M history files in HDFS, it took ~5 minutes.
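
The cost described above can be illustrated with a small bounded cache. This is a sketch only, not the history server's cache: the class and method names are hypothetical, integer ids stand in for history files ordered oldest first, and the second loader assumes the newest files can be identified without reading the old ones.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; not the history server's actual cache.
class HistoryCacheSketch {

    /** Insertion-ordered map that evicts its oldest entry once it exceeds capacity. */
    static class BoundedCache extends LinkedHashMap<Integer, String> {
        private final int capacity;
        int insertions = 0; // counts every put, including entries evicted later

        BoundedCache(int capacity) {
            this.capacity = capacity;
        }

        @Override
        public String put(Integer key, String value) {
            insertions++;
            return super.put(key, value);
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
            return size() > capacity;
        }
    }

    /** Startup as described in the issue: every file passes through the cache. */
    static BoundedCache loadAll(int totalFiles, int capacity) {
        BoundedCache cache = new BoundedCache(capacity);
        for (int id = 0; id < totalFiles; id++) { // id order stands in for age
            cache.put(id, "job_" + id);
        }
        return cache;
    }

    /** Alternative: skip the files that would be evicted anyway. */
    static BoundedCache loadNewestOnly(int totalFiles, int capacity) {
        BoundedCache cache = new BoundedCache(capacity);
        for (int id = Math.max(0, totalFiles - capacity); id < totalFiles; id++) {
            cache.put(id, "job_" + id);
        }
        return cache;
    }

    public static void main(String[] args) {
        BoundedCache all = loadAll(1000, 20);
        BoundedCache newest = loadNewestOnly(1000, 20);
        System.out.println(all.insertions + " vs " + newest.insertions + " insertions");
        System.out.println("same contents: " + all.equals(newest));
    }
}
```

Both loaders end with the same cache contents, but the first performs one insertion per file while the second performs at most one per cache slot, which is the kind of saving the issue is after.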





[jira] [Commented] (MAPREDUCE-6059) Speed up history server startup time

2015-02-05 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307452#comment-14307452
 ] 

Jason Lowe commented on MAPREDUCE-6059:
---

If you have no objections, I'd like to commit this to branch-2 as well.  I'd 
like to keep the trunk and branch-2 lines as reasonably close as we can to 
minimize the pain of maintaining the two lines.

 Speed up history server startup time
 

 Key: MAPREDUCE-6059
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6059
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.4.0
Reporter: Siqi Li
Assignee: Siqi Li
 Fix For: 3.0.0

 Attachments: YARN-2366.v1.patch


 When the history server starts up, it scans every history directory and puts 
 all history files into a cache, whereas this cache only stores the 20K most 
 recent history files. Therefore, it wastes a large portion of time loading 
 old history files into the cache, and the startup time will keep increasing 
 if we don't trim the number of history files. For example, when the history 
 server started up with 2.5M history files in HDFS, it took ~5 minutes.





[jira] [Commented] (MAPREDUCE-6059) Speed up history server startup time

2015-02-05 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307471#comment-14307471
 ] 

Allen Wittenauer commented on MAPREDUCE-6059:
-

No objection from me if you want to be Sisyphus.  :)

 Speed up history server startup time
 

 Key: MAPREDUCE-6059
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6059
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.4.0
Reporter: Siqi Li
Assignee: Siqi Li
 Fix For: 3.0.0

 Attachments: YARN-2366.v1.patch


 When the history server starts up, it scans every history directory and puts 
 all history files into a cache, whereas this cache only stores the 20K most 
 recent history files. Therefore, it wastes a large portion of time loading 
 old history files into the cache, and the startup time will keep increasing 
 if we don't trim the number of history files. For example, when the history 
 server started up with 2.5M history files in HDFS, it took ~5 minutes.





[jira] [Updated] (MAPREDUCE-5847) Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask

2015-02-05 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-5847:

Status: Open  (was: Patch Available)

Cancelling patch as it no longer applies.

 Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask 
 --

 Key: MAPREDUCE-5847
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5847
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2, task
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5847.v01.patch, MAPREDUCE-5847.v02.patch


 Both MapTask and ReduceTask carry redundant code to update BYTES_WRITTEN 
 counter. However, {{Task.updateCounters}} uses file system stats for this. 





[jira] [Updated] (MAPREDUCE-207) Computing Input Splits on the MR Cluster

2015-02-05 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-207:
---
Status: Open  (was: Patch Available)

 Computing Input Splits on the MR Cluster
 

 Key: MAPREDUCE-207
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-207
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: applicationmaster, mrv2
Reporter: Philip Zeyliger
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-207.patch, MAPREDUCE-207.v02.patch, 
 MAPREDUCE-207.v03.patch, MAPREDUCE-207.v05.patch, MAPREDUCE-207.v06.patch, 
 MAPREDUCE-207.v07.patch


 Instead of computing the input splits as part of job submission, Hadoop could 
 have a separate job task type that computes the input splits, therefore 
 allowing that computation to happen on the cluster.





[jira] [Updated] (MAPREDUCE-207) Computing Input Splits on the MR Cluster

2015-02-05 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-207:
---
Status: Patch Available  (was: Open)

 Computing Input Splits on the MR Cluster
 

 Key: MAPREDUCE-207
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-207
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: applicationmaster, mrv2
Reporter: Philip Zeyliger
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-207.patch, MAPREDUCE-207.v02.patch, 
 MAPREDUCE-207.v03.patch, MAPREDUCE-207.v05.patch, MAPREDUCE-207.v06.patch, 
 MAPREDUCE-207.v07.patch


 Instead of computing the input splits as part of job submission, Hadoop could 
 have a separate job task type that computes the input splits, therefore 
 allowing that computation to happen on the cluster.





[jira] [Commented] (MAPREDUCE-207) Computing Input Splits on the MR Cluster

2015-02-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308003#comment-14308003
 ] 

Hadoop QA commented on MAPREDUCE-207:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12655331/MAPREDUCE-207.v07.patch
  against trunk revision e1990ab.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5167//console

This message is automatically generated.

 Computing Input Splits on the MR Cluster
 

 Key: MAPREDUCE-207
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-207
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: applicationmaster, mrv2
Reporter: Philip Zeyliger
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-207.patch, MAPREDUCE-207.v02.patch, 
 MAPREDUCE-207.v03.patch, MAPREDUCE-207.v05.patch, MAPREDUCE-207.v06.patch, 
 MAPREDUCE-207.v07.patch


 Instead of computing the input splits as part of job submission, Hadoop could 
 have a separate job task type that computes the input splits, therefore 
 allowing that computation to happen on the cluster.





[jira] [Updated] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them

2015-02-05 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-5044:

Status: Open  (was: Patch Available)

 Have AM trigger jstack on task attempts that timeout before killing them
 

 Key: MAPREDUCE-5044
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5044
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mr-am
Affects Versions: 2.1.0-beta
Reporter: Jason Lowe
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5044.v01.patch, MAPREDUCE-5044.v02.patch, 
 MAPREDUCE-5044.v03.patch, MAPREDUCE-5044.v04.patch, MAPREDUCE-5044.v05.patch, 
 MAPREDUCE-5044.v06.patch, Screen Shot 2013-11-12 at 1.05.32 PM.png, Screen 
 Shot 2013-11-12 at 1.06.04 PM.png


 When an AM expires a task attempt it would be nice if it triggered a jstack 
 output via SIGQUIT before killing the task attempt.  This would be invaluable 
 for helping users debug their hung tasks, especially if they do not have 
 shell access to the nodes.
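
The requested ordering (dump first, kill second) might look like the sketch below. It only builds the two command lines and does not run them. The class and method names are hypothetical, and how the AM would actually deliver signals to a task's container is not shown here. The sketch relies on the usual HotSpot behavior where SIGQUIT makes the JVM print a thread dump to its standard output without exiting.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the signal ordering; not the AM's implementation.
class DumpThenKillSketch {

    /**
     * Returns the commands to run, in order, for a hung task JVM:
     * first SIGQUIT (thread dump), then SIGKILL (terminate).
     */
    static List<List<String>> dumpThenKillCommands(long pid) {
        String p = Long.toString(pid);
        return Arrays.asList(
            Arrays.asList("kill", "-QUIT", p),   // ask the JVM for a thread dump
            Arrays.asList("kill", "-KILL", p));  // then terminate the process
    }

    public static void main(String[] args) {
        for (List<String> command : dumpThenKillCommands(12345L)) {
            System.out.println(String.join(" ", command));
        }
    }
}
```

A real implementation would presumably wait briefly between the two signals so the dump can be written to the task's logs before the process disappears.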





[jira] [Updated] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them

2015-02-05 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-5044:

Status: Patch Available  (was: Open)

 Have AM trigger jstack on task attempts that timeout before killing them
 

 Key: MAPREDUCE-5044
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5044
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mr-am
Affects Versions: 2.1.0-beta
Reporter: Jason Lowe
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5044.v01.patch, MAPREDUCE-5044.v02.patch, 
 MAPREDUCE-5044.v03.patch, MAPREDUCE-5044.v04.patch, MAPREDUCE-5044.v05.patch, 
 MAPREDUCE-5044.v06.patch, Screen Shot 2013-11-12 at 1.05.32 PM.png, Screen 
 Shot 2013-11-12 at 1.06.04 PM.png


 When an AM expires a task attempt it would be nice if it triggered a jstack 
 output via SIGQUIT before killing the task attempt.  This would be invaluable 
 for helping users debug their hung tasks, especially if they do not have 
 shell access to the nodes.





[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them

2015-02-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307791#comment-14307791
 ] 

Hadoop QA commented on MAPREDUCE-5044:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12645521/MAPREDUCE-5044.v06.patch
  against trunk revision c4980a2.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5164//console

This message is automatically generated.

 Have AM trigger jstack on task attempts that timeout before killing them
 

 Key: MAPREDUCE-5044
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5044
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mr-am
Affects Versions: 2.1.0-beta
Reporter: Jason Lowe
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5044.v01.patch, MAPREDUCE-5044.v02.patch, 
 MAPREDUCE-5044.v03.patch, MAPREDUCE-5044.v04.patch, MAPREDUCE-5044.v05.patch, 
 MAPREDUCE-5044.v06.patch, Screen Shot 2013-11-12 at 1.05.32 PM.png, Screen 
 Shot 2013-11-12 at 1.06.04 PM.png


 When an AM expires a task attempt it would be nice if it triggered a jstack 
 output via SIGQUIT before killing the task attempt.  This would be invaluable 
 for helping users debug their hung tasks, especially if they do not have 
 shell access to the nodes.





[jira] [Updated] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them

2015-02-05 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-5044:

Status: Open  (was: Patch Available)

Cancelling patch as it no longer applies.

 Have AM trigger jstack on task attempts that timeout before killing them
 

 Key: MAPREDUCE-5044
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5044
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mr-am
Affects Versions: 2.1.0-beta
Reporter: Jason Lowe
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5044.v01.patch, MAPREDUCE-5044.v02.patch, 
 MAPREDUCE-5044.v03.patch, MAPREDUCE-5044.v04.patch, MAPREDUCE-5044.v05.patch, 
 MAPREDUCE-5044.v06.patch, Screen Shot 2013-11-12 at 1.05.32 PM.png, Screen 
 Shot 2013-11-12 at 1.06.04 PM.png


 When an AM expires a task attempt it would be nice if it triggered a jstack 
 output via SIGQUIT before killing the task attempt.  This would be invaluable 
 for helping users debug their hung tasks, especially if they do not have 
 shell access to the nodes.





[jira] [Commented] (MAPREDUCE-6237) DBRecordReader is not thread safe

2015-02-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307813#comment-14307813
 ] 

Hadoop QA commented on MAPREDUCE-6237:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12696811/mapreduce-6237.patch
  against trunk revision d27439f.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 1154 javac 
compiler warnings (more than the trunk's current 1149 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 13 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5162//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5162//artifact/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-core.html
Javac warnings: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5162//artifact/patchprocess/diffJavacWarnings.txt
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5162//console

This message is automatically generated.

 DBRecordReader is not thread safe
 -

 Key: MAPREDUCE-6237
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6237
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.5.0
Reporter: Kannan Rajah
Assignee: Kannan Rajah
 Attachments: mapreduce-6237.patch, mapreduce-6237.patch, 
 mapreduce-6237.patch


 DBInputFormat.createDBRecorder is reusing JDBC connections across instances 
 of DBRecordReader. This is not a good idea. We should be creating a separate 
 connection for each reader. If performance is a concern, then we should be 
 using connection pooling instead.
 I looked at DBOutputFormat.getRecordReader. It actually creates a new 
 Connection object for each DBRecordReader. So can we just change 
 DBInputFormat to create a new Connection every time? The connection reuse code 
 was added as part of the connection leak bug in MAPREDUCE-1443. Is there any 
 reason for caching the connection?
 We observed this issue in a customer setup where they were reading data from 
 MySQL using Pig. According to the customer, the query returns two records, 
 which causes Pig to create two instances of DBRecordReader. These two 
 instances share the database connection instance. The first DBRecordReader 
 extracts the first record from MySQL just fine, but then closes the shared 
 connection instance. When the second DBRecordReader runs, it tries to execute 
 a query to retrieve the second record on the closed shared connection 
 instance, which fails. If we set mapred.map.tasks to 1, the query succeeds.
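
The failure mode described above can be reproduced without a database. The following is a sketch only: FakeConnection stands in for java.sql.Connection, and SketchReader and ConnectionSharingSketch are hypothetical classes, not Hadoop code. It assumes, as the report states, that each reader closes the connection it was handed once it has finished reading.

```java
import java.util.function.Supplier;

// Stand-in for java.sql.Connection so the sketch runs without a JDBC driver.
class FakeConnection {
  private boolean closed = false;

  String query(String sql) {
    if (closed) {
      throw new IllegalStateException("connection is closed");
    }
    return "row for: " + sql;
  }

  void close() {
    closed = true;
  }
}

// Hypothetical reader that closes its connection after reading one record.
class SketchReader {
  private final FakeConnection conn;

  SketchReader(FakeConnection conn) {
    this.conn = conn;
  }

  String readAndClose(String sql) {
    try {
      return conn.query(sql);
    } finally {
      conn.close();
    }
  }
}

class ConnectionSharingSketch {

  // Creates two readers from the given connection source and runs them in turn.
  // Returns true only if both reads succeed.
  static boolean bothReadsSucceed(Supplier<FakeConnection> source) {
    SketchReader first = new SketchReader(source.get());
    SketchReader second = new SketchReader(source.get());
    try {
      first.readAndClose("SELECT 1");
      second.readAndClose("SELECT 2");
      return true;
    } catch (IllegalStateException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    FakeConnection shared = new FakeConnection();
    // Both readers receive the same instance: the second read hits a closed connection.
    System.out.println(bothReadsSucceed(() -> shared));
    // Each reader receives its own instance: both reads succeed.
    System.out.println(bothReadsSucceed(FakeConnection::new));
  }
}
```

Passing a supplier makes the two policies interchangeable, which is also where a connection pool would plug in if creating a connection per reader proved too expensive.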





[jira] [Commented] (MAPREDUCE-6243) Fix findbugs warnings in hadoop-rumen

2015-02-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307267#comment-14307267
 ] 

Hudson commented on MAPREDUCE-6243:
---

SUCCESS: Integrated in Hadoop-Hdfs-trunk #2027 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2027/])
MAPREDUCE-6243. Fix findbugs warnings in hadoop-rumen. Contributed by Masatake 
Iwasaki. (aajisaka: rev 34fe11c987730932f99dec6eb458a22624eb075b)
* 
hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/RandomSeedGenerator.java
* 
hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/MapAttempt20LineHistoryEventEmitter.java
* 
hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/ParsedConfigFile.java
* 
hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/Hadoop20JHParser.java
* 
hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/ReduceAttempt20LineHistoryEventEmitter.java
* hadoop-mapreduce-project/CHANGES.txt
* 
hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/HadoopLogsAnalyzer.java


 Fix findbugs warnings in hadoop-rumen
 -

 Key: MAPREDUCE-6243
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6243
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tools/rumen
Affects Versions: 2.6.0
Reporter: Akira AJISAKA
Assignee: Masatake Iwasaki
Priority: Minor
  Labels: newbie
 Fix For: 2.7.0

 Attachments: MAPREDUCE-6243.001.patch, MAPREDUCE-6243.002.patch, 
 findbugs.xml


 There are 7 findbugs warnings in hadoop-rumen modules.





[jira] [Updated] (MAPREDUCE-6245) Fixed split shuffling.

2015-02-05 Thread lbkzman (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lbkzman updated MAPREDUCE-6245:
---
Status: Patch Available  (was: Open)

index 72b47f2..8b89782 100644
--- src/java/org/apache/hadoop/mapreduce/lib/partition/InputSampler.java
+++ src/java/org/apache/hadoop/mapreduce/lib/partition/InputSampler.java
@@ -203,12 +203,8 @@ public class InputSampler<K,V> extends Configured implements Tool  {
   r.setSeed(seed);
   LOG.debug("seed: " + seed);
   // shuffle splits
-  for (int i = 0; i < splits.size(); ++i) {
-    InputSplit tmp = splits.get(i);
-    int j = r.nextInt(splits.size());
-    splits.set(i, splits.get(j));
-    splits.set(j, tmp);
-  }
+  Collections.shuffle(splits);  
+
   // our target rate is in terms of the maximum number of sample splits,
   // but we accept the possibility of sampling additional splits to hit
   // the target sample keyset
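
For context on the change above: the removed loop draws the swap index j from the whole list on every step, which is a well-known biased variant of the Fisher-Yates shuffle. The patch as posted calls the one-argument Collections.shuffle, which does not use the seeded Random r set up just before it. The sketch below is an illustration under assumptions, not part of the patch: SeededShuffleSketch, fisherYates and shuffledCopy are hypothetical names, and the use of the two-argument Collections.shuffle(List, Random) overload is a suggestion for keeping the seed in effect.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class SeededShuffleSketch {

  // Fisher-Yates: position i swaps with an index drawn from [0, i] only,
  // whereas the removed loop drew j from the whole range on every step.
  static <T> void fisherYates(List<T> items, Random r) {
    for (int i = items.size() - 1; i > 0; --i) {
      int j = r.nextInt(i + 1);
      T tmp = items.get(i);
      items.set(i, items.get(j));
      items.set(j, tmp);
    }
  }

  // Shuffles a copy with the two-argument library overload, so the same
  // seed always produces the same order.
  static List<Integer> shuffledCopy(List<Integer> items, long seed) {
    List<Integer> copy = new ArrayList<>(items);
    Collections.shuffle(copy, new Random(seed));
    return copy;
  }

  public static void main(String[] args) {
    List<Integer> splits = Arrays.asList(0, 1, 2, 3, 4, 5);
    // Same seed, same order: the comparison holds.
    System.out.println(shuffledCopy(splits, 42L).equals(shuffledCopy(splits, 42L)));
    List<Integer> manual = new ArrayList<>(splits);
    fisherYates(manual, new Random(42L));
    System.out.println(manual);
  }
}
```

Either form keeps the sampling reproducible for a given seed, which matters if the seed logged by LOG.debug is meant to let a run be replayed.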


 Fixed split shuffling.
 --

 Key: MAPREDUCE-6245
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6245
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: lbkzman
Assignee: lbkzman







[jira] [Updated] (MAPREDUCE-6245) Fixed split shuffling.

2015-02-05 Thread lbkzman (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lbkzman updated MAPREDUCE-6245:
---
Status: Open  (was: Patch Available)

 Fixed split shuffling.
 --

 Key: MAPREDUCE-6245
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6245
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: lbkzman
Assignee: lbkzman







[jira] [Commented] (MAPREDUCE-5988) Fix dead links to the javadocs in mapreduce project

2015-02-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307270#comment-14307270
 ] 

Hudson commented on MAPREDUCE-5988:
---

SUCCESS: Integrated in Hadoop-Hdfs-trunk #2027 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2027/])
MAPREDUCE-5988. Fix dead links to the javadocs in mapreduce project. (aajisaka) 
(aajisaka: rev cc6bbfceae1cddfae6a3892cb7e7104531a689be)
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/package-info.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/counters/package-info.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/api/protocolrecords/package-info.java
* hadoop-mapreduce-project/CHANGES.txt


 Fix dead links to the javadocs in mapreduce project
 ---

 Key: MAPREDUCE-5988
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5988
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.4.1
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
 Fix For: 2.7.0

 Attachments: MAPREDUCE-5988.2.patch, MAPREDUCE-5988.patch


 In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, some 
 classes are listed, but not documented.





[jira] [Commented] (MAPREDUCE-6059) Speed up history server startup time

2015-02-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307278#comment-14307278
 ] 

Hudson commented on MAPREDUCE-6059:
---

SUCCESS: Integrated in Hadoop-Hdfs-trunk #2027 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2027/])
MAPREDUCE-6059. Speed up history server startup time (Siqi Li via aw) (aw: rev 
fd57ab2002f97dcc83d455a5e0c770c8efde77a4)
* hadoop-mapreduce-project/CHANGES.txt
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryFileManager.java


 Speed up history server startup time
 

 Key: MAPREDUCE-6059
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6059
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.4.0
Reporter: Siqi Li
Assignee: Siqi Li
 Fix For: 3.0.0

 Attachments: YARN-2366.v1.patch


 When the history server starts up, it scans every history directory and puts 
 all history files into a cache, even though this cache only stores 20K recent 
 history files. It therefore wastes a large portion of its startup time 
 loading old history files into the cache, and the startup time will keep 
 increasing if we don't trim the number of history files. For example, when 
 the history server started up with 2.5M history files in HDFS, it took ~5 minutes.
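
One way to picture the saving described above is a scan that stops once the cache is full. This is a sketch under stated assumptions, not the committed change: BoundedHistoryScanSketch and loadRecent are hypothetical names, and the directories are assumed to be available newest first.

```java
import java.util.ArrayList;
import java.util.List;

class BoundedHistoryScanSketch {

  // dirsNewestFirst holds the file names of each directory, newest directory
  // first (a hypothetical layout). The scan stops as soon as `capacity`
  // entries are collected, so older directories are never visited.
  static List<String> loadRecent(List<List<String>> dirsNewestFirst, int capacity) {
    List<String> cache = new ArrayList<>();
    for (List<String> dir : dirsNewestFirst) {
      for (String file : dir) {
        if (cache.size() == capacity) {
          return cache;
        }
        cache.add(file);
      }
    }
    return cache;
  }
}
```

With a capacity of 20K, the work at startup is bounded by the cache size rather than by the total number of history files in HDFS.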





[jira] [Created] (MAPREDUCE-6245) Fixed split shuffling.

2015-02-05 Thread lbkzman (JIRA)
lbkzman created MAPREDUCE-6245:
--

 Summary: Fixed split shuffling.
 Key: MAPREDUCE-6245
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6245
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: lbkzman








[jira] [Commented] (MAPREDUCE-6059) Speed up history server startup time

2015-02-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307309#comment-14307309
 ] 

Hudson commented on MAPREDUCE-6059:
---

SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #92 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/92/])
MAPREDUCE-6059. Speed up history server startup time (Siqi Li via aw) (aw: rev 
fd57ab2002f97dcc83d455a5e0c770c8efde77a4)
* hadoop-mapreduce-project/CHANGES.txt
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryFileManager.java


 Speed up history server startup time
 

 Key: MAPREDUCE-6059
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6059
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.4.0
Reporter: Siqi Li
Assignee: Siqi Li
 Fix For: 3.0.0

 Attachments: YARN-2366.v1.patch


 When the history server starts up, it scans every history directory and puts 
 all history files into a cache, even though this cache only stores 20K recent 
 history files. It therefore wastes a large portion of its startup time 
 loading old history files into the cache, and the startup time will keep 
 increasing if we don't trim the number of history files. For example, when 
 the history server started up with 2.5M history files in HDFS, it took ~5 minutes.





[jira] [Commented] (MAPREDUCE-5988) Fix dead links to the javadocs in mapreduce project

2015-02-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307301#comment-14307301
 ] 

Hudson commented on MAPREDUCE-5988:
---

SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #92 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/92/])
MAPREDUCE-5988. Fix dead links to the javadocs in mapreduce project. (aajisaka) 
(aajisaka: rev cc6bbfceae1cddfae6a3892cb7e7104531a689be)
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/api/protocolrecords/package-info.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/package-info.java
* hadoop-mapreduce-project/CHANGES.txt
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/counters/package-info.java


 Fix dead links to the javadocs in mapreduce project
 ---

 Key: MAPREDUCE-5988
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5988
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.4.1
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
 Fix For: 2.7.0

 Attachments: MAPREDUCE-5988.2.patch, MAPREDUCE-5988.patch


 In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, some 
 classes are listed, but not documented.





[jira] [Commented] (MAPREDUCE-6243) Fix findbugs warnings in hadoop-rumen

2015-02-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307298#comment-14307298
 ] 

Hudson commented on MAPREDUCE-6243:
---

SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #92 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/92/])
MAPREDUCE-6243. Fix findbugs warnings in hadoop-rumen. Contributed by Masatake 
Iwasaki. (aajisaka: rev 34fe11c987730932f99dec6eb458a22624eb075b)
* 
hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/Hadoop20JHParser.java
* 
hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/RandomSeedGenerator.java
* 
hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/ReduceAttempt20LineHistoryEventEmitter.java
* 
hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/HadoopLogsAnalyzer.java
* 
hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/ParsedConfigFile.java
* 
hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/MapAttempt20LineHistoryEventEmitter.java
* hadoop-mapreduce-project/CHANGES.txt


 Fix findbugs warnings in hadoop-rumen
 -

 Key: MAPREDUCE-6243
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6243
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tools/rumen
Affects Versions: 2.6.0
Reporter: Akira AJISAKA
Assignee: Masatake Iwasaki
Priority: Minor
  Labels: newbie
 Fix For: 2.7.0

 Attachments: MAPREDUCE-6243.001.patch, MAPREDUCE-6243.002.patch, 
 findbugs.xml


 There are 7 findbugs warnings in hadoop-rumen modules.





[jira] [Updated] (MAPREDUCE-6245) Fixed split shuffling.

2015-02-05 Thread lbkzman (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lbkzman updated MAPREDUCE-6245:
---
Affects Version/s: 2.6.0
 Release Note: 
index 72b47f2..8b89782 100644
--- src/java/org/apache/hadoop/mapreduce/lib/partition/InputSampler.java
+++ src/java/org/apache/hadoop/mapreduce/lib/partition/InputSampler.java
@@ -203,12 +203,8 @@ public class InputSampler<K,V> extends Configured implements Tool  {
   r.setSeed(seed);
   LOG.debug("seed: " + seed);
   // shuffle splits
-  for (int i = 0; i < splits.size(); ++i) {
-    InputSplit tmp = splits.get(i);
-    int j = r.nextInt(splits.size());
-    splits.set(i, splits.get(j));
-    splits.set(j, tmp);
-  }
+  Collections.shuffle(splits);  
+
   // our target rate is in terms of the maximum number of sample splits,
   // but we accept the possibility of sampling additional splits to hit
   // the target sample keyset

   Status: Patch Available  (was: Open)

 Fixed split shuffling.
 --

 Key: MAPREDUCE-6245
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6245
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: lbkzman







[jira] [Updated] (MAPREDUCE-6245) Fixed split shuffling.

2015-02-05 Thread lbkzman (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lbkzman updated MAPREDUCE-6245:
---
Assignee: lbkzman
Target Version/s: 2.6.0
  Status: Open  (was: Patch Available)

 Fixed split shuffling.
 --

 Key: MAPREDUCE-6245
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6245
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: lbkzman
Assignee: lbkzman







[jira] [Updated] (MAPREDUCE-6245) Fixed split shuffling.

2015-02-05 Thread lbkzman (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lbkzman updated MAPREDUCE-6245:
---
Release Note:   (was: index 72b47f2..8b89782 100644
--- src/java/org/apache/hadoop/mapreduce/lib/partition/InputSampler.java
+++ src/java/org/apache/hadoop/mapreduce/lib/partition/InputSampler.java
@@ -203,12 +203,8 @@ public class InputSampler<K,V> extends Configured implements Tool  {
   r.setSeed(seed);
   LOG.debug("seed: " + seed);
   // shuffle splits
-  for (int i = 0; i < splits.size(); ++i) {
-    InputSplit tmp = splits.get(i);
-    int j = r.nextInt(splits.size());
-    splits.set(i, splits.get(j));
-    splits.set(j, tmp);
-  }
+  Collections.shuffle(splits);  
+
   // our target rate is in terms of the maximum number of sample splits,
   // but we accept the possibility of sampling additional splits to hit
   // the target sample keyset
)

 Fixed split shuffling.
 --

 Key: MAPREDUCE-6245
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6245
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: lbkzman
Assignee: lbkzman







[jira] [Updated] (MAPREDUCE-6245) Fixed split shuffling.

2015-02-05 Thread lbkzman (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lbkzman updated MAPREDUCE-6245:
---
Status: Patch Available  (was: Open)

 Fixed split shuffling.
 --

 Key: MAPREDUCE-6245
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6245
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: lbkzman
Assignee: lbkzman







[jira] [Commented] (MAPREDUCE-6243) Fix findbugs warnings in hadoop-rumen

2015-02-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307344#comment-14307344
 ] 

Hudson commented on MAPREDUCE-6243:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #96 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/96/])
MAPREDUCE-6243. Fix findbugs warnings in hadoop-rumen. Contributed by Masatake 
Iwasaki. (aajisaka: rev 34fe11c987730932f99dec6eb458a22624eb075b)
* hadoop-mapreduce-project/CHANGES.txt
* 
hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/ParsedConfigFile.java
* 
hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/MapAttempt20LineHistoryEventEmitter.java
* 
hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/Hadoop20JHParser.java
* 
hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/ReduceAttempt20LineHistoryEventEmitter.java
* 
hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/HadoopLogsAnalyzer.java
* 
hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/RandomSeedGenerator.java


 Fix findbugs warnings in hadoop-rumen
 -

 Key: MAPREDUCE-6243
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6243
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tools/rumen
Affects Versions: 2.6.0
Reporter: Akira AJISAKA
Assignee: Masatake Iwasaki
Priority: Minor
  Labels: newbie
 Fix For: 2.7.0

 Attachments: MAPREDUCE-6243.001.patch, MAPREDUCE-6243.002.patch, 
 findbugs.xml


 There are 7 findbugs warnings in hadoop-rumen modules.





[jira] [Commented] (MAPREDUCE-5988) Fix dead links to the javadocs in mapreduce project

2015-02-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307347#comment-14307347
 ] 

Hudson commented on MAPREDUCE-5988:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #96 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/96/])
MAPREDUCE-5988. Fix dead links to the javadocs in mapreduce project. (aajisaka) 
(aajisaka: rev cc6bbfceae1cddfae6a3892cb7e7104531a689be)
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/api/protocolrecords/package-info.java
* hadoop-mapreduce-project/CHANGES.txt
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/package-info.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/counters/package-info.java


 Fix dead links to the javadocs in mapreduce project
 ---

 Key: MAPREDUCE-5988
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5988
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.4.1
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
 Fix For: 2.7.0

 Attachments: MAPREDUCE-5988.2.patch, MAPREDUCE-5988.patch


 In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, some 
 classes are listed, but not documented.





[jira] [Commented] (MAPREDUCE-6059) Speed up history server startup time

2015-02-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307356#comment-14307356
 ] 

Hudson commented on MAPREDUCE-6059:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #96 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/96/])
MAPREDUCE-6059. Speed up history server startup time (Siqi Li via aw) (aw: rev 
fd57ab2002f97dcc83d455a5e0c770c8efde77a4)
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryFileManager.java
* hadoop-mapreduce-project/CHANGES.txt


 Speed up history server startup time
 

 Key: MAPREDUCE-6059
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6059
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.4.0
Reporter: Siqi Li
Assignee: Siqi Li
 Fix For: 3.0.0

 Attachments: YARN-2366.v1.patch


 When the history server starts up, it scans every history directory and puts 
 all history files into a cache, even though this cache only stores 20K recent 
 history files. It therefore wastes a large portion of its startup time 
 loading old history files into the cache, and the startup time will keep 
 increasing if we don't trim the number of history files. For example, when 
 the history server started up with 2.5M history files in HDFS, it took ~5 minutes.





[jira] [Commented] (MAPREDUCE-5988) Fix dead links to the javadocs in mapreduce project

2015-02-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307388#comment-14307388
 ] 

Hudson commented on MAPREDUCE-5988:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2046 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2046/])
MAPREDUCE-5988. Fix dead links to the javadocs in mapreduce project. (aajisaka) 
(aajisaka: rev cc6bbfceae1cddfae6a3892cb7e7104531a689be)
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/counters/package-info.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/package-info.java
* hadoop-mapreduce-project/CHANGES.txt
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/api/protocolrecords/package-info.java


 Fix dead links to the javadocs in mapreduce project
 ---

 Key: MAPREDUCE-5988
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5988
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.4.1
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
 Fix For: 2.7.0

 Attachments: MAPREDUCE-5988.2.patch, MAPREDUCE-5988.patch


 In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, some 
 classes are listed, but not documented.





[jira] [Commented] (MAPREDUCE-6059) Speed up history server startup time

2015-02-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307397#comment-14307397
 ] 

Hudson commented on MAPREDUCE-6059:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2046 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2046/])
MAPREDUCE-6059. Speed up history server startup time (Siqi Li via aw) (aw: rev 
fd57ab2002f97dcc83d455a5e0c770c8efde77a4)
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryFileManager.java
* hadoop-mapreduce-project/CHANGES.txt


 Speed up history server startup time
 

 Key: MAPREDUCE-6059
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6059
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.4.0
Reporter: Siqi Li
Assignee: Siqi Li
 Fix For: 3.0.0

 Attachments: YARN-2366.v1.patch


 When the history server starts up, it scans every history directory and puts 
 all history files into a cache, even though this cache only stores 20K recent 
 history files. It therefore wastes a large portion of its startup time 
 loading old history files into the cache, and the startup time will keep 
 increasing if we don't trim the number of history files. For example, when 
 the history server started up with 2.5M history files in HDFS, it took ~5 minutes.





[jira] [Commented] (MAPREDUCE-6243) Fix findbugs warnings in hadoop-rumen

2015-02-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307385#comment-14307385
 ] 

Hudson commented on MAPREDUCE-6243:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2046 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2046/])
MAPREDUCE-6243. Fix findbugs warnings in hadoop-rumen. Contributed by Masatake 
Iwasaki. (aajisaka: rev 34fe11c987730932f99dec6eb458a22624eb075b)
* hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/ReduceAttempt20LineHistoryEventEmitter.java
* hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/RandomSeedGenerator.java
* hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/HadoopLogsAnalyzer.java
* hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/Hadoop20JHParser.java
* hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/MapAttempt20LineHistoryEventEmitter.java
* hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/ParsedConfigFile.java
* hadoop-mapreduce-project/CHANGES.txt


 Fix findbugs warnings in hadoop-rumen
 -

 Key: MAPREDUCE-6243
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6243
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tools/rumen
Affects Versions: 2.6.0
Reporter: Akira AJISAKA
Assignee: Masatake Iwasaki
Priority: Minor
  Labels: newbie
 Fix For: 2.7.0

 Attachments: MAPREDUCE-6243.001.patch, MAPREDUCE-6243.002.patch, 
 findbugs.xml


 There are 7 findbugs warnings in hadoop-rumen modules.





[jira] [Commented] (MAPREDUCE-6059) Speed up history server startup time

2015-02-05 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307293#comment-14307293
 ] 

Jason Lowe commented on MAPREDUCE-6059:
---

Any reason this should not be committed to branch-2?  Most patches are 
committed there, so I'm curious about the criteria wrt. this patch.

 Speed up history server startup time
 

 Key: MAPREDUCE-6059
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6059
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.4.0
Reporter: Siqi Li
Assignee: Siqi Li
 Fix For: 3.0.0

 Attachments: YARN-2366.v1.patch


 When the history server starts up, it scans every history directory and puts 
 all history files into a cache, even though this cache stores only the 20K most 
 recent history files. It therefore wastes a large portion of its startup time 
 loading old history files into the cache, and the startup time will keep 
 increasing if we don't trim the number of history files. For example, when the 
 history server started up with 2.5M history files in HDFS, it took ~5 minutes.





[jira] [Commented] (MAPREDUCE-6242) Progress report log is incredibly excessive in application master

2015-02-05 Thread Jian Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308060#comment-14308060
 ] 

Jian Fang commented on MAPREDUCE-6242:
--

Thanks for your quick fix.

 Progress report log is incredibly excessive in application master
 -

 Key: MAPREDUCE-6242
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6242
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Affects Versions: 2.4.0
Reporter: Jian Fang
Assignee: Varun Saxena
 Attachments: MAPREDUCE-6242.001.patch


 We saw incredibly excessive logging in the application master for a 
 long-running application with many task attempts. The log write rate is around 
 1 MB/sec in some cases. Most of the log entries were progress reports such as 
 the following:
 2015-02-03 17:46:14,321 INFO [IPC Server handler 56 on 37661] 
 org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
 attempt_1422985365246_0001_m_00_0 is : 0.15605757
 2015-02-03 17:46:17,581 INFO [IPC Server handler 2 on 37661] 
 org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
 attempt_1422985365246_0001_m_00_0 is : 0.4108217
 2015-02-03 17:46:20,426 INFO [IPC Server handler 0 on 37661] 
 org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
 attempt_1422985365246_0001_m_02_0 is : 0.06634143
 2015-02-03 17:46:20,807 INFO [IPC Server handler 4 on 37661] 
 org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
 attempt_1422985365246_0001_m_00_0 is : 0.6506
 2015-02-03 17:46:21,013 INFO [IPC Server handler 6 on 37661] 
 org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
 attempt_1422985365246_0001_m_01_0 is : 0.21723115
 It looks like the report interval is controlled by the hard-coded constant 
 PROGRESS_INTERVAL, set to 3 seconds, in class org.apache.hadoop.mapred.Task. We 
 should allow users to set an appropriate progress interval for their 
 applications.
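
A minimal sketch of what a user-settable interval could look like, assuming a plain `java.util.Properties` object in place of the Hadoop configuration. The property name `example.task.progress.interval.ms` and the class below are made up for illustration and are not taken from the attached patch. Only the 3-second default comes from the description above.

```java
import java.util.Properties;

// Hypothetical illustration; the property name is invented for this sketch.
final class ProgressIntervalConfig {

    /** Assumed key name, not a real Hadoop property. */
    static final String INTERVAL_KEY = "example.task.progress.interval.ms";

    /** Mirrors the hard-coded 3 seconds mentioned in the issue description. */
    static final long DEFAULT_INTERVAL_MS = 3000L;

    /** Returns the configured interval, falling back to the default when unset or invalid. */
    static long progressIntervalMs(Properties conf) {
        String raw = conf.getProperty(INTERVAL_KEY);
        if (raw == null) {
            return DEFAULT_INTERVAL_MS;
        }
        try {
            long value = Long.parseLong(raw.trim());
            // Reject zero and negative intervals rather than reporting in a tight loop.
            return value > 0 ? value : DEFAULT_INTERVAL_MS;
        } catch (NumberFormatException e) {
            return DEFAULT_INTERVAL_MS;
        }
    }
}
```

Falling back to the old default keeps existing jobs unchanged, while a long-running job could raise the interval to cut the number of progress log entries.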





[jira] [Commented] (MAPREDUCE-6233) org.apache.hadoop.mapreduce.TestLargeSort.testLargeSort failed in trunk

2015-02-05 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308094#comment-14308094
 ] 

Robert Kanter commented on MAPREDUCE-6233:
--

+1

 org.apache.hadoop.mapreduce.TestLargeSort.testLargeSort failed in trunk
 ---

 Key: MAPREDUCE-6233
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6233
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Reporter: Yongjun Zhang
Assignee: zhihai xu
 Attachments: MAPREDUCE-6233.000.patch


  https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2039/
 {code}
 Stack Trace:
 java.lang.AssertionError: Large sort failed for 128 expected:<0> but was:<1>
 at org.junit.Assert.fail(Assert.java:88)
 at org.junit.Assert.failNotEquals(Assert.java:743)
 at org.junit.Assert.assertEquals(Assert.java:118)
 at org.junit.Assert.assertEquals(Assert.java:555)
 at org.apache.hadoop.mapreduce.TestLargeSort.testLargeSort(TestLargeSort.java:61)
 {code}





[jira] [Commented] (MAPREDUCE-5847) Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask

2015-02-05 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308116#comment-14308116
 ] 

Allen Wittenauer commented on MAPREDUCE-5847:
-

Incompatible changes can go into trunk.

 Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask 
 --

 Key: MAPREDUCE-5847
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5847
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2, task
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5847.v01.patch, MAPREDUCE-5847.v02.patch


 Both MapTask and ReduceTask carry redundant code to update BYTES_WRITTEN 
 counter. However, {{Task.updateCounters}} uses file system stats for this. 





[jira] [Commented] (MAPREDUCE-5847) Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask

2015-02-05 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308129#comment-14308129
 ] 

Jason Lowe commented on MAPREDUCE-5847:
---

bq. Incompatible changes can go into trunk.

Understood, but I'm arguing we shouldn't break compatibility without 
sufficient merit.  Each incompatibility is a hurdle someone needs to 
jump to move from Hadoop 2.x to Hadoop 3.x.  Hence I'm wondering whether others 
feel this is worth adding another hurdle.

 Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask 
 --

 Key: MAPREDUCE-5847
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5847
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2, task
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5847.v01.patch, MAPREDUCE-5847.v02.patch


 Both MapTask and ReduceTask carry redundant code to update BYTES_WRITTEN 
 counter. However, {{Task.updateCounters}} uses file system stats for this. 





[jira] [Commented] (MAPREDUCE-5847) Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask

2015-02-05 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308140#comment-14308140
 ] 

Allen Wittenauer commented on MAPREDUCE-5847:
-

This seems like such a low risk, and as it stands today, aren't we actually 
reporting wrong information? That's significantly worse!  (I know of one vendor 
that actually mentions that they report correct values for some metrics 
since we blow it so badly in lots of places...)

While I understand the concerns about moving from 2.x to 3.x, users should 
expect some degree of pain when moving major versions. 

 Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask 
 --

 Key: MAPREDUCE-5847
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5847
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2, task
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5847.v01.patch, MAPREDUCE-5847.v02.patch


 Both MapTask and ReduceTask carry redundant code to update BYTES_WRITTEN 
 counter. However, {{Task.updateCounters}} uses file system stats for this. 





[jira] [Updated] (MAPREDUCE-6233) org.apache.hadoop.mapreduce.TestLargeSort.testLargeSort failed in trunk

2015-02-05 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated MAPREDUCE-6233:
-
   Resolution: Fixed
Fix Version/s: 2.7.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thanks Zhihai.  Committed to trunk and branch-2!

 org.apache.hadoop.mapreduce.TestLargeSort.testLargeSort failed in trunk
 ---

 Key: MAPREDUCE-6233
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6233
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Reporter: Yongjun Zhang
Assignee: zhihai xu
 Fix For: 2.7.0

 Attachments: MAPREDUCE-6233.000.patch


  https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2039/
 {code}
 Stack Trace:
 java.lang.AssertionError: Large sort failed for 128 expected:<0> but was:<1>
 at org.junit.Assert.fail(Assert.java:88)
 at org.junit.Assert.failNotEquals(Assert.java:743)
 at org.junit.Assert.assertEquals(Assert.java:118)
 at org.junit.Assert.assertEquals(Assert.java:555)
 at org.apache.hadoop.mapreduce.TestLargeSort.testLargeSort(TestLargeSort.java:61)
 {code}





[jira] [Commented] (MAPREDUCE-5847) Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask

2015-02-05 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308171#comment-14308171
 ] 

Jason Lowe commented on MAPREDUCE-5847:
---

If the counters are wrong then that's a separate JIRA that I think would be 
very well worth fixing in 2.x.  However IIUC this isn't about fixing incorrect 
counter values, rather it's about removing counters.

I can see the value of storing the separate counters, since they are not 
exactly equivalent.  One of them records the number of bytes written to the 
filesystem overall during the life of the task, while the other records the 
amount of data written to the filesystem during the output collector's write 
method.  For many jobs these will be the same values, however if the task was 
doing out-of-band I/O with the filesystem outside of the output collector's 
write method then they will not be equivalent.  Comparing these counters could 
be used to audit tasks that aren't writing data through the normal framework 
channels.
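
To make the audit idea concrete, here is a small sketch under assumptions: the two `long` arguments stand in for the two counter values discussed above, and the class name, method names, and the tolerance parameter are hypothetical. It does not read real Hadoop counters.

```java
// Hypothetical illustration; the inputs stand in for the two byte counters.
final class CounterAudit {

    /** Bytes written to the filesystem that did not go through the output collector. */
    static long outOfBandBytes(long fileSystemBytesWritten, long collectorBytesWritten) {
        // Clamp at zero so a collector count above the filesystem count is not reported as negative.
        return Math.max(0L, fileSystemBytesWritten - collectorBytesWritten);
    }

    /** Flags a task whose out-of-band writes exceed the given tolerance in bytes. */
    static boolean wroteOutsideFramework(long fileSystemBytesWritten,
                                         long collectorBytesWritten,
                                         long toleranceBytes) {
        return outOfBandBytes(fileSystemBytesWritten, collectorBytesWritten) > toleranceBytes;
    }
}
```

A check like this only works while both counters exist, which is the point being made above.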

 Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask 
 --

 Key: MAPREDUCE-5847
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5847
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2, task
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5847.v01.patch, MAPREDUCE-5847.v02.patch


 Both MapTask and ReduceTask carry redundant code to update BYTES_WRITTEN 
 counter. However, {{Task.updateCounters}} uses file system stats for this. 





[jira] [Updated] (MAPREDUCE-5965) Hadoop streaming throws error if list of input files is high. Error is: error=7, Argument list too long at if number of input file is high

2015-02-05 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated MAPREDUCE-5965:

Attachment: MAPREDUCE-5965.1.patch

Reattaching updated patch.

 Hadoop streaming throws error if list of input files is high. Error is: 
 error=7, Argument list too long at if number of input file is high
 

 Key: MAPREDUCE-5965
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5965
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Arup Malakar
Assignee: Arup Malakar
 Attachments: MAPREDUCE-5965.1.patch, MAPREDUCE-5965.patch


 Hadoop streaming exposes all the key/value pairs in the job conf as environment 
 variables when it forks a process for the streaming code to run. Unfortunately, 
 the variable mapreduce_input_fileinputformat_inputdir contains the list of 
 input files, and Linux limits the combined size of environment variables and 
 arguments.
 Depending on the number of files and the length of their full paths, this 
 variable can be huge. Given that these variables may not even be used, the limit 
 stops users from running a Hadoop job with a large number of files, even though 
 the job could otherwise run.
 Linux fails with E2BIG, which is error code 7, if the size exceeds the limit, 
 and Java reports that as error=7, Argument list too long. More: 
 http://man7.org/linux/man-pages/man2/execve.2.html
 I suggest skipping variables whose values are longer than a certain length. 
 With that change, user code that requires a skipped environment variable would 
 fail, so the patch should also introduce a config variable to skip long 
 variables, set to false by default. Users then have to explicitly set it to 
 true to enable this feature.
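
The suggestion above can be sketched roughly as follows. This is an illustration under assumptions, not the attached patch: the class, the `buildEnv` helper, and its `skipLongValues` and `maxValueLength` parameters are hypothetical, and a plain `Map` stands in for the job conf. Replacing dots with underscores mirrors the variable name quoted in the description.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; not the streaming code or the attached patch.
final class StreamingEnvFilter {

    /**
     * Builds the environment for the forked streaming process from job conf
     * entries. When skipLongValues is true, entries whose value is longer
     * than maxValueLength are left out; when false, everything is exported.
     */
    static Map<String, String> buildEnv(Map<String, String> jobConf,
                                        boolean skipLongValues,
                                        int maxValueLength) {
        Map<String, String> env = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : jobConf.entrySet()) {
            String value = entry.getValue();
            if (skipLongValues && value.length() > maxValueLength) {
                continue; // skipped: too long to pass safely through execve
            }
            // e.g. mapreduce.input.fileinputformat.inputdir -> mapreduce_input_fileinputformat_inputdir
            env.put(entry.getKey().replace('.', '_'), value);
        }
        return env;
    }
}
```

Keeping the flag false by default preserves the current behavior, so only jobs that opt in can lose a variable their streaming code might depend on.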
 Here is the exception:
 {code}
 Error: java.lang.RuntimeException: Error in configuring object
     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:415)
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: java.lang.reflect.InvocationTargetException
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:606)
     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
     ... 9 more
 Caused by: java.lang.RuntimeException: Error in configuring object
     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
     at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
     ... 14 more
 Caused by: java.lang.reflect.InvocationTargetException
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:606)
     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
     ... 17 more
 Caused by: java.lang.RuntimeException: configuration exception
     at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:222)
     at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
     ... 22 more
 Caused by: java.io.IOException: Cannot run program /data/hadoop/hadoop-yarn/cache/yarn/nm-local-dir/usercache/oo-analytics/appcache/application_1403599726264_13177/container_1403599726264_13177_01_06/./rbenv_runner.sh: error=7, Argument list too long
     at java.lang.ProcessBuilder.start(ProcessBuilder.java:1041)
     at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:209)
     ... 23 more
 Caused by: java.io.IOException: error=7, Argument list too long
     at java.lang.UNIXProcess.forkAndExec(Native Method)
     at java.lang.UNIXProcess.<init>(UNIXProcess.java:135)
     at java.lang.ProcessImpl.start(ProcessImpl.java:130)
     at java.lang.ProcessBuilder.start(ProcessBuilder.java:1022)
     ... 24 more
 Container killed by

[jira] [Commented] (MAPREDUCE-5847) Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask

2015-02-05 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308114#comment-14308114
 ] 

Jason Lowe commented on MAPREDUCE-5847:
---

Looks like the patch still applies, but I'm not sure this should go in per the 
incompatibility concerns I raised earlier.  I don't think the benefits of this 
change are worth that cost, even if this just goes into trunk.  Thoughts?

 Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask 
 --

 Key: MAPREDUCE-5847
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5847
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2, task
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5847.v01.patch, MAPREDUCE-5847.v02.patch


 Both MapTask and ReduceTask carry redundant code to update BYTES_WRITTEN 
 counter. However, {{Task.updateCounters}} uses file system stats for this. 





[jira] [Commented] (MAPREDUCE-6233) org.apache.hadoop.mapreduce.TestLargeSort.testLargeSort failed in trunk

2015-02-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308159#comment-14308159
 ] 

Hudson commented on MAPREDUCE-6233:
---

FAILURE: Integrated in Hadoop-trunk-Commit #7028 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7028/])
MAPREDUCE-6233. org.apache.hadoop.mapreduce.TestLargeSort.testLargeSort failed 
in trunk (zxu via rkanter) (rkanter: rev 
e2ee2ff7d7ca429487d7e3883daedffbb269ebd4)
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/TestLargeSort.java


 org.apache.hadoop.mapreduce.TestLargeSort.testLargeSort failed in trunk
 ---

 Key: MAPREDUCE-6233
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6233
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Reporter: Yongjun Zhang
Assignee: zhihai xu
 Fix For: 2.7.0

 Attachments: MAPREDUCE-6233.000.patch


  https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2039/
 {code}
 Stack Trace:
 java.lang.AssertionError: Large sort failed for 128 expected:<0> but was:<1>
 at org.junit.Assert.fail(Assert.java:88)
 at org.junit.Assert.failNotEquals(Assert.java:743)
 at org.junit.Assert.assertEquals(Assert.java:118)
 at org.junit.Assert.assertEquals(Assert.java:555)
 at org.apache.hadoop.mapreduce.TestLargeSort.testLargeSort(TestLargeSort.java:61)
 {code}





[jira] [Updated] (MAPREDUCE-5965) Hadoop streaming throws error if list of input files is high. Error is: error=7, Argument list too long at if number of input file is high

2015-02-05 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated MAPREDUCE-5965:

Status: Patch Available  (was: Open)

 Hadoop streaming throws error if list of input files is high. Error is: 
 error=7, Argument list too long at if number of input file is high
 

 Key: MAPREDUCE-5965
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5965
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Arup Malakar
Assignee: Arup Malakar
 Attachments: MAPREDUCE-5965.1.patch, MAPREDUCE-5965.patch


 Hadoop streaming exposes all the key/value pairs in the job conf as environment 
 variables when it forks a process for the streaming code to run. Unfortunately, 
 the variable mapreduce_input_fileinputformat_inputdir contains the list of 
 input files, and Linux limits the combined size of environment variables and 
 arguments.
 Depending on the number of files and the length of their full paths, this 
 variable can be huge. Given that these variables may not even be used, the limit 
 stops users from running a Hadoop job with a large number of files, even though 
 the job could otherwise run.
 Linux fails with E2BIG, which is error code 7, if the size exceeds the limit, 
 and Java reports that as error=7, Argument list too long. More: 
 http://man7.org/linux/man-pages/man2/execve.2.html
 I suggest skipping variables whose values are longer than a certain length. 
 With that change, user code that requires a skipped environment variable would 
 fail, so the patch should also introduce a config variable to skip long 
 variables, set to false by default. Users then have to explicitly set it to 
 true to enable this feature.
 Here is the exception:
 {code}
 Error: java.lang.RuntimeException: Error in configuring object
     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:415)
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: java.lang.reflect.InvocationTargetException
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:606)
     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
     ... 9 more
 Caused by: java.lang.RuntimeException: Error in configuring object
     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
     at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
     ... 14 more
 Caused by: java.lang.reflect.InvocationTargetException
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:606)
     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
     ... 17 more
 Caused by: java.lang.RuntimeException: configuration exception
     at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:222)
     at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
     ... 22 more
 Caused by: java.io.IOException: Cannot run program /data/hadoop/hadoop-yarn/cache/yarn/nm-local-dir/usercache/oo-analytics/appcache/application_1403599726264_13177/container_1403599726264_13177_01_06/./rbenv_runner.sh: error=7, Argument list too long
     at java.lang.ProcessBuilder.start(ProcessBuilder.java:1041)
     at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:209)
     ... 23 more
 Caused by: java.io.IOException: error=7, Argument list too long
     at java.lang.UNIXProcess.forkAndExec(Native Method)
     at java.lang.UNIXProcess.<init>(UNIXProcess.java:135)
     at java.lang.ProcessImpl.start(ProcessImpl.java:130)
     at java.lang.ProcessBuilder.start(ProcessBuilder.java:1022)
     ... 24 more
 Container killed by the ApplicationMaster.

[jira] [Commented] (MAPREDUCE-5965) Hadoop streaming throws error if list of input files is high. Error is: error=7, Argument list too long at if number of input file is high

2015-02-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308092#comment-14308092
 ] 

Hadoop QA commented on MAPREDUCE-5965:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12696883/MAPREDUCE-5965.1.patch
  against trunk revision e1990ab.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5168//console

This message is automatically generated.

 Hadoop streaming throws error if list of input files is high. Error is: 
 error=7, Argument list too long at if number of input file is high
 

 Key: MAPREDUCE-5965
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5965
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Arup Malakar
Assignee: Arup Malakar
 Attachments: MAPREDUCE-5965.1.patch, MAPREDUCE-5965.patch


 Hadoop streaming exposes all the key/value pairs in the job conf as environment 
 variables when it forks a process for the streaming code to run. Unfortunately, 
 the variable mapreduce_input_fileinputformat_inputdir contains the list of 
 input files, and Linux limits the combined size of environment variables and 
 arguments.
 Depending on the number of files and the length of their full paths, this 
 variable can be huge. Given that these variables may not even be used, the limit 
 stops users from running a Hadoop job with a large number of files, even though 
 the job could otherwise run.
 Linux fails with E2BIG, which is error code 7, if the size exceeds the limit, 
 and Java reports that as error=7, Argument list too long. More: 
 http://man7.org/linux/man-pages/man2/execve.2.html
 I suggest skipping variables whose values are longer than a certain length. 
 With that change, user code that requires a skipped environment variable would 
 fail, so the patch should also introduce a config variable to skip long 
 variables, set to false by default. Users then have to explicitly set it to 
 true to enable this feature.
 Here is the exception:
 {code}
 Error: java.lang.RuntimeException: Error in configuring object
     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:415)
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: java.lang.reflect.InvocationTargetException
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:606)
     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
     ... 9 more
 Caused by: java.lang.RuntimeException: Error in configuring object
     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
     at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
     ... 14 more
 Caused by: java.lang.reflect.InvocationTargetException
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:606)
     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
     ... 17 more
 Caused by: java.lang.RuntimeException: configuration exception
     at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:222)
     at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
     ... 22 more
 Caused by: java.io.IOException: Cannot run program /data/hadoop/hadoop-yarn/cache/yarn/nm-local-dir/usercache/oo-analytics/appcache/application_1403599726264_13177/container_1403599726264_13177_01_06/./rbenv_runner.sh: error=7, Argument list too long
     at java.lang.ProcessBuilder.start(ProcessBuilder.java:1041)
     at

[jira] [Commented] (MAPREDUCE-5847) Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask

2015-02-05 Thread Gera Shegalov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308132#comment-14308132
 ] 

Gera Shegalov commented on MAPREDUCE-5847:
--

Agreed, let us close it as 'Won't Fix'.

 Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask 
 --

 Key: MAPREDUCE-5847
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5847
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2, task
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5847.v01.patch, MAPREDUCE-5847.v02.patch


 Both MapTask and ReduceTask carry redundant code to update BYTES_WRITTEN 
 counter. However, {{Task.updateCounters}} uses file system stats for this. 





[jira] [Resolved] (MAPREDUCE-5847) Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask

2015-02-05 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov resolved MAPREDUCE-5847.
--
Resolution: Won't Fix

 Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask 
 --

 Key: MAPREDUCE-5847
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5847
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2, task
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5847.v01.patch, MAPREDUCE-5847.v02.patch


 Both MapTask and ReduceTask carry redundant code to update BYTES_WRITTEN 
 counter. However, {{Task.updateCounters}} uses file system stats for this. 





[jira] [Commented] (MAPREDUCE-6233) org.apache.hadoop.mapreduce.TestLargeSort.testLargeSort failed in trunk

2015-02-05 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308152#comment-14308152
 ] 

zhihai xu commented on MAPREDUCE-6233:
--

thanks [~rkanter] for the review and commit.

 org.apache.hadoop.mapreduce.TestLargeSort.testLargeSort failed in trunk
 ---

 Key: MAPREDUCE-6233
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6233
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Reporter: Yongjun Zhang
Assignee: zhihai xu
 Fix For: 2.7.0

 Attachments: MAPREDUCE-6233.000.patch


  https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2039/
 {code}
 Stack Trace:
 java.lang.AssertionError: Large sort failed for 128 expected:<0> but was:<1>
 at org.junit.Assert.fail(Assert.java:88)
 at org.junit.Assert.failNotEquals(Assert.java:743)
 at org.junit.Assert.assertEquals(Assert.java:118)
 at org.junit.Assert.assertEquals(Assert.java:555)
 at 
 org.apache.hadoop.mapreduce.TestLargeSort.testLargeSort(TestLargeSort.java:61)
 {code}





[jira] [Commented] (MAPREDUCE-6244) Hadoop examples when run without an argument, gives ERROR instead of just usage info

2015-02-05 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308237#comment-14308237
 ] 

Akira AJISAKA commented on MAPREDUCE-6244:
--

bq. We should inspect all jobs to make their behavior consistent. My thought is 
that it's enough to print usage when the number of given arguments is wrong.
I'm okay with just printing usage. Consistency is more important.

 Hadoop examples when run without an argument, gives ERROR instead of just 
 usage info
 

 Key: MAPREDUCE-6244
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6244
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.23.0, trunk-win, 2.6.0
Reporter: Robert Justice
Assignee: Abhishek Kapoor
Priority: Minor
 Attachments: HADOOP-8834.patch, HADOOP-8834.patch


 The Hadoop sort example should not give an ERROR and should only display usage 
 when run with no parameters. 
 {code}
 $ hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar sort
 ERROR: Wrong number of parameters: 0 instead of 2.
 sort [-m <maps>] [-r <reduces>] [-inFormat <input format class>] [-outFormat 
 <output format class>] [-outKey <output key class>] [-outValue <output value 
 class>] [-totalOrder <pcnt> <num samples> <max splits>] <input> <output>
 Generic options supported are
 -conf <configuration file>     specify an application configuration file
 -D <property=value>            use value for given property
 -fs <local|namenode:port>      specify a namenode
 -jt <local|jobtracker:port>    specify a job tracker
 -files <comma separated list of files>    specify comma separated files to be 
 copied to the map reduce cluster
 -libjars <comma separated list of jars>    specify comma separated jar files 
 to include in the classpath.
 -archives <comma separated list of archives>    specify comma separated 
 archives to be unarchived on the compute machines.
 The general command line syntax is
 bin/hadoop command [genericOptions] [commandOptions]
 {code}
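
 The requested behavior can be sketched roughly as follows. This is only an 
 illustration, not the code of the actual sort example: the class name 
 UsageSketch, the run method, the abbreviated usage string and the exit code 2 
 are all assumptions made for this sketch.

```java
import java.io.PrintStream;

// Hypothetical sketch: a bare invocation prints only the usage text, while a
// non-empty but wrong argument list still gets the ERROR line.
final class UsageSketch {
    // Abbreviated usage string, assumed for illustration only.
    static final String USAGE = "sort [-m <maps>] [-r <reduces>] <input> <output>";

    // Returns the process exit code and writes any message to the given stream.
    static int run(String[] args, PrintStream out) {
        if (args.length == 0) {
            out.println(USAGE);   // no ERROR line when nothing was passed
            return 2;
        }
        if (args.length != 2) {
            out.println("ERROR: Wrong number of parameters: "
                + args.length + " instead of 2.");
            out.println(USAGE);
            return 2;
        }
        return 0;                 // a real example would submit the job here
    }
}
```

 Whether the non-empty case should keep the ERROR line is the consistency 
 question raised in the comment above; the sketch keeps it only to show the 
 distinction.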





[jira] [Commented] (MAPREDUCE-6233) org.apache.hadoop.mapreduce.TestLargeSort.testLargeSort failed in trunk

2015-02-05 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308205#comment-14308205
 ] 

Yongjun Zhang commented on MAPREDUCE-6233:
--

Thanks [~zxu] and [~rkanter]!


 org.apache.hadoop.mapreduce.TestLargeSort.testLargeSort failed in trunk
 ---

 Key: MAPREDUCE-6233
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6233
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Reporter: Yongjun Zhang
Assignee: zhihai xu
 Fix For: 2.7.0

 Attachments: MAPREDUCE-6233.000.patch


  https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2039/
 {code}
 Stack Trace:
 java.lang.AssertionError: Large sort failed for 128 expected:<0> but was:<1>
 at org.junit.Assert.fail(Assert.java:88)
 at org.junit.Assert.failNotEquals(Assert.java:743)
 at org.junit.Assert.assertEquals(Assert.java:118)
 at org.junit.Assert.assertEquals(Assert.java:555)
 at 
 org.apache.hadoop.mapreduce.TestLargeSort.testLargeSort(TestLargeSort.java:61)
 {code}





[jira] [Updated] (MAPREDUCE-2293) Enhance MultipleOutputs to allow additional characters in the named output name

2015-02-05 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-2293:

Status: Patch Available  (was: Open)

 Enhance MultipleOutputs to allow additional characters in the named output 
 name
 ---

 Key: MAPREDUCE-2293
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2293
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.21.0
Reporter: David Rosenstrauch
Assignee: Harsh J
Priority: Minor
 Attachments: mapreduce.mo.removecheck.r1.diff, 
 mapreduce.mo.removecheck.r2.diff, mapreduce.mo.removecheck.r3.diff, 
 mapreduce.mo.removecheck.r4.diff, mapreduce.mo.removecheck.r5.diff


 Currently you are only allowed to use alpha-numeric characters in a named 
 output name in the MultipleOutputs class.  This is a bit of an onerous 
 restriction, as it would be extremely convenient to be able to use non 
 alpha-numerics in the name too.  (E.g., a '.' character would be very 
 helpful, so that you can use the named output name for holding a file 
 name/extension.  Perhaps '-' and a '_' characters as well.)
 The restriction seems to be somewhat arbitrary - it appears to be only 
 enforced in the checkTokenName method.  (Though I don't know if there's any 
 downstream impact by loosening this restriction.)
 Would be extremely helpful/useful to have this fixed though!
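
 A relaxed check along the lines requested could look roughly like the sketch 
 below. It is not the actual checkTokenName code: the class and method names 
 are hypothetical, and the accepted set (ASCII letters and digits plus '.', 
 '-' and '_') is simply the one suggested in this description. Any downstream 
 impact of accepting these characters is not examined here.

```java
// Hypothetical sketch of a relaxed named-output check; not the Hadoop source.
final class NamedOutputCheckSketch {
    // Accept ASCII letters and digits plus '.', '-' and '_'.
    static boolean isAllowed(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
            || (c >= '0' && c <= '9') || c == '.' || c == '-' || c == '_';
    }

    // Throws IllegalArgumentException for an empty name or a disallowed character.
    static void checkName(String name) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("Name cannot be null or empty");
        }
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (!isAllowed(c)) {
                throw new IllegalArgumentException("Name cannot have a '" + c + "' char");
            }
        }
    }
}
```

 Characters such as '/' stay rejected in this sketch, since a named output 
 that doubles as a file name should presumably not contain a path separator.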





[jira] [Updated] (MAPREDUCE-2293) Enhance MultipleOutputs to allow additional characters in the named output name

2015-02-05 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-2293:

Status: Open  (was: Patch Available)

Cancelling, as patch no longer applies.

 Enhance MultipleOutputs to allow additional characters in the named output 
 name
 ---

 Key: MAPREDUCE-2293
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2293
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.21.0
Reporter: David Rosenstrauch
Assignee: Harsh J
Priority: Minor
 Attachments: mapreduce.mo.removecheck.r1.diff, 
 mapreduce.mo.removecheck.r2.diff, mapreduce.mo.removecheck.r3.diff, 
 mapreduce.mo.removecheck.r4.diff, mapreduce.mo.removecheck.r5.diff


 Currently you are only allowed to use alpha-numeric characters in a named 
 output name in the MultipleOutputs class.  This is a bit of an onerous 
 restriction, as it would be extremely convenient to be able to use non 
 alpha-numerics in the name too.  (E.g., a '.' character would be very 
 helpful, so that you can use the named output name for holding a file 
 name/extension.  Perhaps '-' and a '_' characters as well.)
 The restriction seems to be somewhat arbitrary - it appears to be only 
 enforced in the checkTokenName method.  (Though I don't know if there's any 
 downstream impact by loosening this restriction.)
 Would be extremely helpful/useful to have this fixed though!





[jira] [Updated] (MAPREDUCE-1554) If user name contains '_', then searching of jobs based on user name on job history web UI doesn't work

2015-02-05 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-1554:

Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

Only applies to a dead version of Hadoop. Closing as won't fix.

 If user name contains '_', then searching of jobs based on user name on job 
 history web UI doesn't work
 ---

 Key: MAPREDUCE-1554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Ravi Gummadi
Assignee: Devaraj K
 Fix For: 0.22.1

 Attachments: MAPREDUCE-1554-0.22.patch, MAPREDUCE-1554.patch


 If the user name contains an underscore, then searching of jobs based on user 
 name on the job history web UI doesn't work. This is because the code 
 everywhere calls {code}split("_"){code} on the history file name to get the 
 user name. The other parts of the history file name also should *not* be 
 obtained by using split("_").
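
 The failure mode can be reproduced with a small sketch. The file name layout 
 used here (id, user and job name joined by '_') is a simplified, hypothetical 
 stand-in for the real history file name, and the escaping scheme is just one 
 possible remedy, not the approach taken in the attached patches.

```java
// Hypothetical sketch: why splitting on '_' breaks for user names with '_'.
final class HistoryNameSketch {
    // Naive parsing, assuming the layout <id>_<user>_<jobName>.
    static String naiveUser(String fileName) {
        return fileName.split("_")[1];
    }

    // One possible remedy: escape '_' (and the escape character '%') inside
    // each field, so that '_' only ever appears as the field delimiter.
    static String encode(String field) {
        return field.replace("%", "%25").replace("_", "%5F");
    }

    static String decode(String field) {
        return field.replace("%5F", "_").replace("%25", "%");
    }

    static String build(String id, String user, String jobName) {
        return encode(id) + "_" + encode(user) + "_" + encode(jobName);
    }

    static String safeUser(String fileName) {
        return decode(fileName.split("_")[1]);
    }
}
```

 With this layout, naiveUser("job42_john_doe_wordcount") yields "john", while 
 safeUser(build("job42", "john_doe", "wordcount")) recovers "john_doe".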





[jira] [Updated] (MAPREDUCE-2293) Enhance MultipleOutputs to allow additional characters in the named output name

2015-02-05 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-2293:

Status: Open  (was: Patch Available)

 Enhance MultipleOutputs to allow additional characters in the named output 
 name
 ---

 Key: MAPREDUCE-2293
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2293
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.21.0
Reporter: David Rosenstrauch
Assignee: Harsh J
Priority: Minor
 Attachments: mapreduce.mo.removecheck.r1.diff, 
 mapreduce.mo.removecheck.r2.diff, mapreduce.mo.removecheck.r3.diff, 
 mapreduce.mo.removecheck.r4.diff, mapreduce.mo.removecheck.r5.diff


 Currently you are only allowed to use alpha-numeric characters in a named 
 output name in the MultipleOutputs class.  This is a bit of an onerous 
 restriction, as it would be extremely convenient to be able to use non 
 alpha-numerics in the name too.  (E.g., a '.' character would be very 
 helpful, so that you can use the named output name for holding a file 
 name/extension.  Perhaps '-' and a '_' characters as well.)
 The restriction seems to be somewhat arbitrary - it appears to be only 
 enforced in the checkTokenName method.  (Though I don't know if there's any 
 downstream impact by loosening this restriction.)
 Would be extremely helpful/useful to have this fixed though!





[jira] [Commented] (MAPREDUCE-2293) Enhance MultipleOutputs to allow additional characters in the named output name

2015-02-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307608#comment-14307608
 ] 

Hadoop QA commented on MAPREDUCE-2293:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12555681/mapreduce.mo.removecheck.r5.diff
  against trunk revision afbecbb.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5157//console

This message is automatically generated.

 Enhance MultipleOutputs to allow additional characters in the named output 
 name
 ---

 Key: MAPREDUCE-2293
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2293
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.21.0
Reporter: David Rosenstrauch
Assignee: Harsh J
Priority: Minor
 Attachments: mapreduce.mo.removecheck.r1.diff, 
 mapreduce.mo.removecheck.r2.diff, mapreduce.mo.removecheck.r3.diff, 
 mapreduce.mo.removecheck.r4.diff, mapreduce.mo.removecheck.r5.diff


 Currently you are only allowed to use alpha-numeric characters in a named 
 output name in the MultipleOutputs class.  This is a bit of an onerous 
 restriction, as it would be extremely convenient to be able to use non 
 alpha-numerics in the name too.  (E.g., a '.' character would be very 
 helpful, so that you can use the named output name for holding a file 
 name/extension.  Perhaps '-' and a '_' characters as well.)
 The restriction seems to be somewhat arbitrary - it appears to be only 
 enforced in the checkTokenName method.  (Though I don't know if there's any 
 downstream impact by loosening this restriction.)
 Would be extremely helpful/useful to have this fixed though!





[jira] [Updated] (MAPREDUCE-6059) Speed up history server startup time

2015-02-05 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-6059:
--
Fix Version/s: (was: 3.0.0)
   2.7.0
 Hadoop Flags: Reviewed

Thanks Siqi and Allen!  I committed this to branch-2 as well.

 Speed up history server startup time
 

 Key: MAPREDUCE-6059
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6059
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.4.0
Reporter: Siqi Li
Assignee: Siqi Li
 Fix For: 2.7.0

 Attachments: YARN-2366.v1.patch


 When the history server starts up, it scans every history directory and puts 
 all history files into a cache, whereas this cache only stores 20K recent 
 history files. Therefore, it wastes a large portion of time loading old 
 history files into the cache, and the startup time will keep increasing if we 
 don't trim the number of history files. For example, when the history server 
 started up with 2.5M history files in HDFS, it took ~5 minutes.
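
 One way to picture the saving is the sketch below. It is not the committed 
 patch: it assumes the history directories can be visited newest first (here 
 modelled as a list of lists of file names), and the class and method names 
 are hypothetical. The scan stops as soon as the cache limit is reached, so 
 older directories are never read.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: stop scanning once the cache capacity is reached.
final class HistoryScanSketch {
    // dirsNewestFirst: directories ordered from newest to oldest (assumption).
    // cacheLimit: maximum number of entries the cache holds, e.g. 20000.
    static List<String> loadNewest(List<List<String>> dirsNewestFirst, int cacheLimit) {
        List<String> cached = new ArrayList<>();
        for (List<String> dir : dirsNewestFirst) {
            for (String file : dir) {
                if (cached.size() >= cacheLimit) {
                    return cached;   // older files would be evicted anyway
                }
                cached.add(file);
            }
        }
        return cached;
    }
}
```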





[jira] [Updated] (MAPREDUCE-6165) [JDK8] TestCombineFileInputFormat failed on JDK8

2015-02-05 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated MAPREDUCE-6165:
-
Status: Patch Available  (was: Open)

Resubmitting.

 [JDK8] TestCombineFileInputFormat failed on JDK8
 

 Key: MAPREDUCE-6165
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6165
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Wei Yan
Assignee: Akira AJISAKA
Priority: Minor
 Attachments: MAPREDUCE-6165-001.patch, MAPREDUCE-6165-reproduce.patch


 The error msg:
 {noformat}
 testSplitPlacementForCompressedFiles(org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat)
   Time elapsed: 2.487 sec  <<< FAILURE!
 junit.framework.AssertionFailedError: expected:<2> but was:<1>
   at junit.framework.Assert.fail(Assert.java:57)
   at junit.framework.Assert.failNotEquals(Assert.java:329)
   at junit.framework.Assert.assertEquals(Assert.java:78)
   at junit.framework.Assert.assertEquals(Assert.java:234)
   at junit.framework.Assert.assertEquals(Assert.java:241)
   at junit.framework.TestCase.assertEquals(TestCase.java:409)
   at 
 org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat.testSplitPlacementForCompressedFiles(TestCombineFileInputFormat.java:911)
 testSplitPlacement(org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat)
   Time elapsed: 0.985 sec  <<< FAILURE!
 junit.framework.AssertionFailedError: expected:<2> but was:<1>
   at junit.framework.Assert.fail(Assert.java:57)
   at junit.framework.Assert.failNotEquals(Assert.java:329)
   at junit.framework.Assert.assertEquals(Assert.java:78)
   at junit.framework.Assert.assertEquals(Assert.java:234)
   at junit.framework.Assert.assertEquals(Assert.java:241)
   at junit.framework.TestCase.assertEquals(TestCase.java:409)
   at 
 org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat.testSplitPlacement(TestCombineFileInputFormat.java:368)
 {noformat}





[jira] [Updated] (MAPREDUCE-6165) [JDK8] TestCombineFileInputFormat failed on JDK8

2015-02-05 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated MAPREDUCE-6165:
-
Status: Open  (was: Patch Available)

 [JDK8] TestCombineFileInputFormat failed on JDK8
 

 Key: MAPREDUCE-6165
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6165
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Wei Yan
Assignee: Akira AJISAKA
Priority: Minor
 Attachments: MAPREDUCE-6165-001.patch, MAPREDUCE-6165-reproduce.patch


 The error msg:
 {noformat}
 testSplitPlacementForCompressedFiles(org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat)
   Time elapsed: 2.487 sec  <<< FAILURE!
 junit.framework.AssertionFailedError: expected:<2> but was:<1>
   at junit.framework.Assert.fail(Assert.java:57)
   at junit.framework.Assert.failNotEquals(Assert.java:329)
   at junit.framework.Assert.assertEquals(Assert.java:78)
   at junit.framework.Assert.assertEquals(Assert.java:234)
   at junit.framework.Assert.assertEquals(Assert.java:241)
   at junit.framework.TestCase.assertEquals(TestCase.java:409)
   at 
 org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat.testSplitPlacementForCompressedFiles(TestCombineFileInputFormat.java:911)
 testSplitPlacement(org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat)
   Time elapsed: 0.985 sec  <<< FAILURE!
 junit.framework.AssertionFailedError: expected:<2> but was:<1>
   at junit.framework.Assert.fail(Assert.java:57)
   at junit.framework.Assert.failNotEquals(Assert.java:329)
   at junit.framework.Assert.assertEquals(Assert.java:78)
   at junit.framework.Assert.assertEquals(Assert.java:234)
   at junit.framework.Assert.assertEquals(Assert.java:241)
   at junit.framework.TestCase.assertEquals(TestCase.java:409)
   at 
 org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat.testSplitPlacement(TestCombineFileInputFormat.java:368)
 {noformat}





[jira] [Commented] (MAPREDUCE-5657) [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments

2015-02-05 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308457#comment-14308457
 ] 

Andrew Purtell commented on MAPREDUCE-5657:
---

Go for it!

 [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments
 -

 Key: MAPREDUCE-5657
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5657
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Minor
 Attachments: 5657-branch-2.patch, 5657-branch-2.patch, 
 5657-trunk.patch, 5657-trunk.patch


 Javadoc is more strict by default in JDK8 and will error out on malformed or 
 illegal tags found in doc comments. Although tagged as JDK8 all of the 
 required changes are generic Javadoc cleanups.
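
 As an illustration of the kind of cleanup involved, the sketch below shows a 
 doc comment that avoids three things the JDK8 Javadoc checker commonly 
 rejects: a self-closing <p/> element, a raw '<' or '>' in running text, and 
 an @param tag naming a parameter that does not exist. The class is 
 hypothetical and is not taken from the attached patches.

```java
// Hypothetical example of a doc comment that passes the stricter JDK8 checks.
final class JavadocSketch {
    /**
     * Clamps a value to a closed range.
     *
     * <p>Returns {@code lo} when {@code value < lo} and {@code hi} when
     * {@code value > hi}. Wrapping the comparisons in an inline code tag
     * avoids raw angle brackets in the comment text.</p>
     *
     * @param value the value to clamp
     * @param lo the lower bound, inclusive
     * @param hi the upper bound, inclusive
     * @return the clamped value
     */
    static int clamp(int value, int lo, int hi) {
        return Math.max(lo, Math.min(hi, value));
    }
}
```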





[jira] [Commented] (MAPREDUCE-5657) [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments

2015-02-05 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308393#comment-14308393
 ] 

Akira AJISAKA commented on MAPREDUCE-5657:
--

Hi [~apurtell], how is this issue going? If you don't have time to rebase your 
patch, I'd like to take over your work.

 [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments
 -

 Key: MAPREDUCE-5657
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5657
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Minor
 Attachments: 5657-branch-2.patch, 5657-branch-2.patch, 
 5657-trunk.patch, 5657-trunk.patch


 Javadoc is more strict by default in JDK8 and will error out on malformed or 
 illegal tags found in doc comments. Although tagged as JDK8 all of the 
 required changes are generic Javadoc cleanups.





[jira] [Commented] (MAPREDUCE-6234) MRJobConfig.DEFAULT_*_MEMORY_MB should be consistent with mapred-default.xml

2015-02-05 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308424#comment-14308424
 ] 

Tsuyoshi OZAWA commented on MAPREDUCE-6234:
---

[~iwasakims] thank you for taking this JIRA. Currently, -1 for this fix.
 
I think we should fix the test side to follow the default configuration, since the 
assertion only check whether check if the high ram properties are not set.

{code}
// check if the high ram properties are not set
assertEquals(expectedMapMB, 
 simulatedConf.getLong(MRJobConfig.MAP_MEMORY_MB,
   MRJobConfig.DEFAULT_MAP_MEMORY_MB));
assertEquals(expectedReduceMB, 
 simulatedConf.getLong(MRJobConfig.REDUCE_MEMORY_MB, 
   MRJobConfig.DEFAULT_MAP_MEMORY_MB));
{code}

We should also rethink what we should test in TestHighRamJob - it refers to 
JT_MAX_MAPMEMORY_MB or some old configurations. Do we really need these tests?
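
The role of the fallback argument in the assertions above can be shown with a 
small stand-in. The class below is a hypothetical, map-backed substitute for 
the real configuration object and only mimics a getLong(key, default) lookup; 
the key name used in the usage note is likewise made up.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a configuration object; not the Hadoop class.
final class ConfDefaultSketch {
    private final Map<String, String> props = new HashMap<>();

    void set(String key, String value) {
        props.put(key, value);
    }

    // Returns the stored value, or the caller-supplied fallback when unset.
    long getLong(String key, long defaultValue) {
        String raw = props.get(key);
        return raw == null ? defaultValue : Long.parseLong(raw);
    }
}
```

With nothing set, getLong("example.memory.mb", -1) returns -1, so an assertion 
that expects a fixed number such as 1024 fails as soon as the fallback 
constant differs from it. Comparing against the same constant that is passed 
as the fallback keeps such a test in step with the default.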

 MRJobConfig.DEFAULT_*_MEMORY_MB should be consistent with mapred-default.xml
 

 Key: MAPREDUCE-6234
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6234
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/gridmix, mrv2
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
 Attachments: MAPREDUCE-6234.001.patch


 TestHighRamJob fails because of this.
 {code}
 ---
  T E S T S
 ---
 Running org.apache.hadoop.mapred.gridmix.TestHighRamJob
 Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.162 sec <<< 
 FAILURE! - in org.apache.hadoop.mapred.gridmix.TestHighRamJob
 testHighRamFeatureEmulation(org.apache.hadoop.mapred.gridmix.TestHighRamJob)  
 Time elapsed: 1.102 sec  <<< FAILURE!
 java.lang.AssertionError: expected:<1024> but was:<-1>
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at org.junit.Assert.assertEquals(Assert.java:555)
   at org.junit.Assert.assertEquals(Assert.java:542)
   at 
 org.apache.hadoop.mapred.gridmix.TestHighRamJob.testHighRamConfig(TestHighRamJob.java:98)
   at 
 org.apache.hadoop.mapred.gridmix.TestHighRamJob.testHighRamFeatureEmulation(TestHighRamJob.java:117)
 {code}





[jira] [Commented] (MAPREDUCE-6234) MRJobConfig.DEFAULT_*_MEMORY_MB should be consistent with mapred-default.xml

2015-02-05 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308430#comment-14308430
 ] 

Tsuyoshi OZAWA commented on MAPREDUCE-6234:
---

s/whether check if the high ram properties are not set./whether the high ram 
properties are set./

 MRJobConfig.DEFAULT_*_MEMORY_MB should be consistent with mapred-default.xml
 

 Key: MAPREDUCE-6234
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6234
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/gridmix, mrv2
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
 Attachments: MAPREDUCE-6234.001.patch


 TestHighRamJob fails because of this.
 {code}
 ---
  T E S T S
 ---
 Running org.apache.hadoop.mapred.gridmix.TestHighRamJob
 Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.162 sec <<< 
 FAILURE! - in org.apache.hadoop.mapred.gridmix.TestHighRamJob
 testHighRamFeatureEmulation(org.apache.hadoop.mapred.gridmix.TestHighRamJob)  
 Time elapsed: 1.102 sec  <<< FAILURE!
 java.lang.AssertionError: expected:<1024> but was:<-1>
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at org.junit.Assert.assertEquals(Assert.java:555)
   at org.junit.Assert.assertEquals(Assert.java:542)
   at 
 org.apache.hadoop.mapred.gridmix.TestHighRamJob.testHighRamConfig(TestHighRamJob.java:98)
   at 
 org.apache.hadoop.mapred.gridmix.TestHighRamJob.testHighRamFeatureEmulation(TestHighRamJob.java:117)
 {code}





[jira] [Commented] (MAPREDUCE-5657) [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments

2015-02-05 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308460#comment-14308460
 ] 

Akira AJISAKA commented on MAPREDUCE-5657:
--

Thank you Andrew!

 [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments
 -

 Key: MAPREDUCE-5657
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5657
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Minor
 Attachments: 5657-branch-2.patch, 5657-branch-2.patch, 
 5657-trunk.patch, 5657-trunk.patch


 Javadoc is more strict by default in JDK8 and will error out on malformed or 
 illegal tags found in doc comments. Although tagged as JDK8 all of the 
 required changes are generic Javadoc cleanups.





[jira] [Assigned] (MAPREDUCE-5657) [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments

2015-02-05 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA reassigned MAPREDUCE-5657:


Assignee: Akira AJISAKA  (was: Andrew Purtell)

 [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments
 -

 Key: MAPREDUCE-5657
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5657
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Andrew Purtell
Assignee: Akira AJISAKA
Priority: Minor
 Attachments: 5657-branch-2.patch, 5657-branch-2.patch, 
 5657-trunk.patch, 5657-trunk.patch


 Javadoc is more strict by default in JDK8 and will error out on malformed or 
 illegal tags found in doc comments. Although tagged as JDK8 all of the 
 required changes are generic Javadoc cleanups.





[jira] [Commented] (MAPREDUCE-5988) Fix dead links to the javadocs in mapreduce project

2015-02-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307020#comment-14307020
 ] 

Hudson commented on MAPREDUCE-5988:
---

FAILURE: Integrated in Hadoop-Yarn-trunk #829 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/829/])
MAPREDUCE-5988. Fix dead links to the javadocs in mapreduce project. (aajisaka) 
(aajisaka: rev cc6bbfceae1cddfae6a3892cb7e7104531a689be)
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/counters/package-info.java
* hadoop-mapreduce-project/CHANGES.txt
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/package-info.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/api/protocolrecords/package-info.java


 Fix dead links to the javadocs in mapreduce project
 ---

 Key: MAPREDUCE-5988
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5988
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.4.1
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
 Fix For: 2.7.0

 Attachments: MAPREDUCE-5988.2.patch, MAPREDUCE-5988.patch


 In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, some 
 classes are listed, but not documented.





[jira] [Commented] (MAPREDUCE-6243) Fix findbugs warnings in hadoop-rumen

2015-02-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307017#comment-14307017
 ] 

Hudson commented on MAPREDUCE-6243:
---

FAILURE: Integrated in Hadoop-Yarn-trunk #829 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/829/])
MAPREDUCE-6243. Fix findbugs warnings in hadoop-rumen. Contributed by Masatake 
Iwasaki. (aajisaka: rev 34fe11c987730932f99dec6eb458a22624eb075b)
* 
hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/MapAttempt20LineHistoryEventEmitter.java
* 
hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/ReduceAttempt20LineHistoryEventEmitter.java
* 
hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/Hadoop20JHParser.java
* 
hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/RandomSeedGenerator.java
* 
hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/HadoopLogsAnalyzer.java
* hadoop-mapreduce-project/CHANGES.txt
* 
hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/ParsedConfigFile.java


 Fix findbugs warnings in hadoop-rumen
 -

 Key: MAPREDUCE-6243
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6243
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tools/rumen
Affects Versions: 2.6.0
Reporter: Akira AJISAKA
Assignee: Masatake Iwasaki
Priority: Minor
  Labels: newbie
 Fix For: 2.7.0

 Attachments: MAPREDUCE-6243.001.patch, MAPREDUCE-6243.002.patch, 
 findbugs.xml


 There are 7 findbugs warnings in hadoop-rumen modules.





[jira] [Commented] (MAPREDUCE-6059) Speed up history server startup time

2015-02-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307027#comment-14307027
 ] 

Hudson commented on MAPREDUCE-6059:
---

FAILURE: Integrated in Hadoop-Yarn-trunk #829 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/829/])
MAPREDUCE-6059. Speed up history server startup time (Siqi Li via aw) (aw: rev 
fd57ab2002f97dcc83d455a5e0c770c8efde77a4)
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryFileManager.java
* hadoop-mapreduce-project/CHANGES.txt


 Speed up history server startup time
 

 Key: MAPREDUCE-6059
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6059
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.4.0
Reporter: Siqi Li
Assignee: Siqi Li
 Fix For: 3.0.0

 Attachments: YARN-2366.v1.patch


 When the history server starts up, it scans every history directory and puts 
 all history files into a cache, whereas this cache only stores 20K recent 
 history files. Therefore, it wastes a large portion of time loading old 
 history files into the cache, and the startup time will keep increasing if we 
 don't trim the number of history files. For example, when the history server 
 started up with 2.5M history files in HDFS, it took ~5 minutes.





[jira] [Commented] (MAPREDUCE-6242) Progress report log is incredibly excessive in application master

2015-02-05 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307081#comment-14307081
 ] 

Varun Saxena commented on MAPREDUCE-6242:
-

Oh, it's urgent. Thanks for letting me know. Will fix this on priority and upload 
a patch today.

 Progress report log is incredibly excessive in application master
 -

 Key: MAPREDUCE-6242
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6242
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Affects Versions: 2.4.0
Reporter: Jian Fang
Assignee: Varun Saxena

 We saw excessive logging in the application master of a long-running
 application with many task attempts. The log write rate is around 1MB/sec in
 some cases. Most of the log entries came from progress reports such as the
 following.
 2015-02-03 17:46:14,321 INFO [IPC Server handler 56 on 37661] 
 org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
 attempt_1422985365246_0001_m_00_0 is : 0.15605757
 2015-02-03 17:46:17,581 INFO [IPC Server handler 2 on 37661] 
 org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
 attempt_1422985365246_0001_m_00_0 is : 0.4108217
 2015-02-03 17:46:20,426 INFO [IPC Server handler 0 on 37661] 
 org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
 attempt_1422985365246_0001_m_02_0 is : 0.06634143
 2015-02-03 17:46:20,807 INFO [IPC Server handler 4 on 37661] 
 org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
 attempt_1422985365246_0001_m_00_0 is : 0.6506
 2015-02-03 17:46:21,013 INFO [IPC Server handler 6 on 37661] 
 org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
 attempt_1422985365246_0001_m_01_0 is : 0.21723115
 It looks like the report interval is controlled by a hard-coded variable,
 PROGRESS_INTERVAL, set to 3 seconds in class org.apache.hadoop.mapred.Task. We
 should allow users to set an appropriate progress interval for their
 applications.
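
 A configurable interval with the current 3-second value as the fallback could look roughly like this. It is only a sketch: the property key shown is hypothetical, and java.util.Properties stands in for the Hadoop configuration object so that the example stays self-contained.

```java
import java.util.Properties;

class ProgressIntervalSketch {
    // Hypothetical property key, used only for this illustration.
    static final String INTERVAL_KEY = "example.task.progress.interval.ms";
    // Matches the hard-coded 3 seconds mentioned in the description.
    static final long DEFAULT_INTERVAL_MS = 3000L;

    /** Reads the interval from conf, falling back to the default on missing or invalid values. */
    static long progressIntervalMs(Properties conf) {
        String raw = conf.getProperty(INTERVAL_KEY);
        if (raw == null) {
            return DEFAULT_INTERVAL_MS;
        }
        try {
            long value = Long.parseLong(raw.trim());
            return value > 0 ? value : DEFAULT_INTERVAL_MS;
        } catch (NumberFormatException e) {
            return DEFAULT_INTERVAL_MS;
        }
    }
}
```

 Falling back to the default for non-positive or unparsable values keeps a misconfigured job from reporting in a tight loop.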





[jira] [Commented] (MAPREDUCE-5988) Fix dead links to the javadocs in mapreduce project

2015-02-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14306993#comment-14306993
 ] 

Hudson commented on MAPREDUCE-5988:
---

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #95 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/95/])
MAPREDUCE-5988. Fix dead links to the javadocs in mapreduce project. (aajisaka) 
(aajisaka: rev cc6bbfceae1cddfae6a3892cb7e7104531a689be)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/api/protocolrecords/package-info.java
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/counters/package-info.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/package-info.java


 Fix dead links to the javadocs in mapreduce project
 ---

 Key: MAPREDUCE-5988
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5988
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.4.1
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
 Fix For: 2.7.0

 Attachments: MAPREDUCE-5988.2.patch, MAPREDUCE-5988.patch


 In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, some 
 classes are listed, but not documented.





[jira] [Commented] (MAPREDUCE-6243) Fix findbugs warnings in hadoop-rumen

2015-02-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14306990#comment-14306990
 ] 

Hudson commented on MAPREDUCE-6243:
---

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #95 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/95/])
MAPREDUCE-6243. Fix findbugs warnings in hadoop-rumen. Contributed by Masatake 
Iwasaki. (aajisaka: rev 34fe11c987730932f99dec6eb458a22624eb075b)
* hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/MapAttempt20LineHistoryEventEmitter.java
* hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/ParsedConfigFile.java
* hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/RandomSeedGenerator.java
* hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/Hadoop20JHParser.java
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/ReduceAttempt20LineHistoryEventEmitter.java
* hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/HadoopLogsAnalyzer.java


 Fix findbugs warnings in hadoop-rumen
 -

 Key: MAPREDUCE-6243
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6243
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tools/rumen
Affects Versions: 2.6.0
Reporter: Akira AJISAKA
Assignee: Masatake Iwasaki
Priority: Minor
  Labels: newbie
 Fix For: 2.7.0

 Attachments: MAPREDUCE-6243.001.patch, MAPREDUCE-6243.002.patch, 
 findbugs.xml


 There are 7 findbugs warnings in the hadoop-rumen module.





[jira] [Commented] (MAPREDUCE-6059) Speed up history server startup time

2015-02-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307000#comment-14307000
 ] 

Hudson commented on MAPREDUCE-6059:
---

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #95 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/95/])
MAPREDUCE-6059. Speed up history server startup time (Siqi Li via aw) (aw: rev 
fd57ab2002f97dcc83d455a5e0c770c8efde77a4)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryFileManager.java
* hadoop-mapreduce-project/CHANGES.txt


 Speed up history server startup time
 

 Key: MAPREDUCE-6059
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6059
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.4.0
Reporter: Siqi Li
Assignee: Siqi Li
 Fix For: 3.0.0

 Attachments: YARN-2366.v1.patch


 When the history server starts up, it scans every history directory and puts
 all history files into a cache, even though this cache stores only the 20K
 most recent history files. It therefore wastes a large portion of the startup
 time loading old history files into the cache, and the startup time will keep
 increasing if we don't trim the number of history files. For example, when the
 history server started up with 2.5M history files in HDFS, it took ~5 minutes.





[jira] [Updated] (MAPREDUCE-6227) DFSIO for truncate

2015-02-05 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated MAPREDUCE-6227:
---
Attachment: DFSIO-truncate-00.patch

Adding truncate to DFSIO.

 DFSIO for truncate
 --

 Key: MAPREDUCE-6227
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6227
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: benchmarks, test
Reporter: Konstantin Shvachko
 Attachments: DFSIO-truncate-00.patch


 Create a benchmark and a test for truncate within the framework of TestDFSIO.





[jira] [Commented] (MAPREDUCE-6234) MRJobConfig.DEFAULT_*_MEMORY_MB should be consistent with mapred-default.xml

2015-02-05 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308487#comment-14308487
 ] 

Masatake Iwasaki commented on MAPREDUCE-6234:
-

I agree that the gridmix code should be updated. But I think it is very
confusing that MRJobConfig.DEFAULT_MAP_MEMORY_MB is not the same as the value of
mapreduce.map.memory.mb in mapred-default.xml, and this should be fixed
regardless of the gridmix test failure.

 MRJobConfig.DEFAULT_*_MEMORY_MB should be consistent with mapred-default.xml
 

 Key: MAPREDUCE-6234
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6234
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/gridmix, mrv2
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
 Attachments: MAPREDUCE-6234.001.patch


 TestHighRamJob fails because of this.
 {code}
 ---
  T E S T S
 ---
 Running org.apache.hadoop.mapred.gridmix.TestHighRamJob
 Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.162 sec <<< FAILURE! - in org.apache.hadoop.mapred.gridmix.TestHighRamJob
 testHighRamFeatureEmulation(org.apache.hadoop.mapred.gridmix.TestHighRamJob)  Time elapsed: 1.102 sec  <<< FAILURE!
 java.lang.AssertionError: expected:<1024> but was:<-1>
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at org.junit.Assert.assertEquals(Assert.java:555)
   at org.junit.Assert.assertEquals(Assert.java:542)
   at org.apache.hadoop.mapred.gridmix.TestHighRamJob.testHighRamConfig(TestHighRamJob.java:98)
   at org.apache.hadoop.mapred.gridmix.TestHighRamJob.testHighRamFeatureEmulation(TestHighRamJob.java:117)
 {code}





[jira] [Updated] (MAPREDUCE-6227) DFSIO for truncate

2015-02-05 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated MAPREDUCE-6227:
---
 Assignee: Konstantin Shvachko
Affects Version/s: 2.7.0
   Status: Patch Available  (was: Open)

 DFSIO for truncate
 --

 Key: MAPREDUCE-6227
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6227
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: benchmarks, test
Affects Versions: 2.7.0
Reporter: Konstantin Shvachko
Assignee: Konstantin Shvachko
 Attachments: DFSIO-truncate-00.patch


 Create a benchmark and a test for truncate within the framework of TestDFSIO.





[jira] [Commented] (MAPREDUCE-6240) Hadoop client displays confusing error message

2015-02-05 Thread Mohammad Kamrul Islam (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308509#comment-14308509
 ] 

Mohammad Kamrul Islam commented on MAPREDUCE-6240:
--

[~jira.shegalov] if the message "Please check your configuration for
mapreduce.framework.name and the correspond server addresses." is shown, please
also include the current values of those properties. That will help users find
out whether their configuration is actually in effect.
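
The suggestion above can be sketched as follows. This is an illustration only, not code from any attached patch: the class and method names are made up, a plain Map stands in for the Hadoop configuration, and the caller decides which property keys to report.

```java
import java.util.Map;

class ClusterInitMessageSketch {
    /** Builds the existing error text followed by the current value of each listed key. */
    static String describe(Map<String, String> conf, String... keys) {
        StringBuilder sb = new StringBuilder(
            "Cannot initialize Cluster. Please check your configuration for "
                + "mapreduce.framework.name and the correspond server addresses.");
        sb.append(" Current values:");
        for (String key : keys) {
            String value = conf.get(key);
            // Make an unset property visible instead of printing "null".
            sb.append(' ').append(key).append('=')
              .append(value == null ? "<unset>" : value).append(';');
        }
        return sb.toString();
    }
}
```

Printing "<unset>" explicitly distinguishes a property that was never loaded from one that was set to an unexpected value.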

 Hadoop client displays confusing error message
 --

 Key: MAPREDUCE-6240
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6240
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
 Attachments: MAPREDUCE-6240-gera.001.patch, 
 MAPREDUCE-6240-gera.001.patch, MAPREDUCE-6240-gera.002.patch, 
 MAPREDUCE-6240.1.patch


 The Hadoop client often throws an exception with "java.io.IOException: Cannot
 initialize Cluster. Please check your configuration for
 mapreduce.framework.name and the correspond server addresses."
 This is a misleading and generic message for any cluster initialization
 problem. It takes many hours of debugging to identify the root cause. A
 correct error message could resolve this problem quickly.
 In one such instance, the Oozie log showed the following exception, while the
 root cause was a CNF that the Hadoop client didn't return in the exception.
 {noformat}
  JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
 at org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:412)
 at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:392)
 at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:979)
 at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1134)
 at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:228)
 at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63)
 at org.apache.oozie.command.XCommand.call(XCommand.java:281)
 at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:323)
 at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:252)
 at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:174)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:744)
 Caused by: java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
 at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:120)
 at org.apache.hadoop.mapreduce.Cluster.init(Cluster.java:82)
 at org.apache.hadoop.mapreduce.Cluster.init(Cluster.java:75)
 at org.apache.hadoop.mapred.JobClient.init(JobClient.java:470)
 at org.apache.hadoop.mapred.JobClient.init(JobClient.java:449)
 at org.apache.oozie.service.HadoopAccessorService$1.run(HadoopAccessorService.java:372)
 at org.apache.oozie.service.HadoopAccessorService$1.run(HadoopAccessorService.java:370)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
 at org.apache.oozie.service.HadoopAccessorService.createJobClient(HadoopAccessorService.java:379)
 at org.apache.oozie.action.hadoop.JavaActionExecutor.createJobClient(JavaActionExecutor.java:1185)
 at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:927)
  ... 10 more
 {noformat}





[jira] [Updated] (MAPREDUCE-5492) Suppress expected log output stated on MAPREDUCE-5

2015-02-05 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-5492:

Status: Open  (was: Patch Available)

 Suppress expected log output stated on MAPREDUCE-5
 --

 Key: MAPREDUCE-5492
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5492
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.20.2
Reporter: Harsh J
Assignee: bc Wong
Priority: Trivial
 Attachments: 
 0001-MAPREDUCE-5492.-Do-not-reuse-a-committed-ServletResp.patch, 
 mr-5492-2.patch


 Jetty in MR1 may produce an expected EOFException during its operation that 
 we shouldn't log out in ERROR form.
 This shouldn't affect MR2, however, as it uses Netty.
 See MAPREDUCE-5 (Jothi's comments) for more info.





[jira] [Resolved] (MAPREDUCE-5492) Suppress expected log output stated on MAPREDUCE-5

2015-02-05 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-5492.
-
Resolution: Won't Fix

Closing as Won't Fix since this is no longer an issue in 2.x and up.

 Suppress expected log output stated on MAPREDUCE-5
 --

 Key: MAPREDUCE-5492
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5492
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.20.2
Reporter: Harsh J
Assignee: bc Wong
Priority: Trivial
 Attachments: 
 0001-MAPREDUCE-5492.-Do-not-reuse-a-committed-ServletResp.patch, 
 mr-5492-2.patch


 Jetty in MR1 may produce an expected EOFException during its operation that 
 we shouldn't log out in ERROR form.
 This shouldn't affect MR2, however, as it uses Netty.
 See MAPREDUCE-5 (Jothi's comments) for more info.





[jira] [Updated] (MAPREDUCE-5696) Add Localization counters to MR

2015-02-05 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-5696:

Status: Patch Available  (was: Open)

 Add Localization counters to MR
 ---

 Key: MAPREDUCE-5696
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5696
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: LocalizationCounters.png, MAPREDUCE-5696.v01.patch, 
 MAPREDUCE-5696.v02.patch


 Users are often unaware of the localization cost that their jobs incur. To
 measure the effectiveness of localization caches, it is necessary to expose the
 overhead in the form of user-visible metrics. The purpose of this JIRA is to
 complement YARN-1529. While YARN-1529 attempts to provide a cluster-wide view
 to cluster admins, this JIRA focuses on exposing the localization overhead on
 a per-job basis to the job owner/user.





[jira] [Updated] (MAPREDUCE-5696) Add Localization counters to MR

2015-02-05 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-5696:

Status: Open  (was: Patch Available)

 Add Localization counters to MR
 ---

 Key: MAPREDUCE-5696
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5696
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: LocalizationCounters.png, MAPREDUCE-5696.v01.patch, 
 MAPREDUCE-5696.v02.patch


 Users are often unaware of the localization cost that their jobs incur. To
 measure the effectiveness of localization caches, it is necessary to expose the
 overhead in the form of user-visible metrics. The purpose of this JIRA is to
 complement YARN-1529. While YARN-1529 attempts to provide a cluster-wide view
 to cluster admins, this JIRA focuses on exposing the localization overhead on
 a per-job basis to the job owner/user.





[jira] [Updated] (MAPREDUCE-5648) Allow user-specified diagnostics for killed tasks and jobs

2015-02-05 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-5648:

Status: Open  (was: Patch Available)

 Allow user-specified diagnostics for killed tasks and jobs
 --

 Key: MAPREDUCE-5648
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5648
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: client, mr-am, mrv2
Affects Versions: 2.2.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5648.v01.patch, MAPREDUCE-5648.v02.patch, 
 MAPREDUCE-5648.v03.patch, MAPREDUCE-5648.v04.patch, MAPREDUCE-5648.v05.patch, 
 Screen Shot 2013-11-23 at 11.12.15 AM.png


 Our users and tools want to be able to supply additional custom diagnostic 
 messages to mapreduce ClientProtocol killTask.





[jira] [Commented] (MAPREDUCE-5648) Allow user-specified diagnostics for killed tasks and jobs

2015-02-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307712#comment-14307712
 ] 

Hadoop QA commented on MAPREDUCE-5648:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12634947/MAPREDUCE-5648.v05.patch
  against trunk revision b6466de.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5158//console

This message is automatically generated.

 Allow user-specified diagnostics for killed tasks and jobs
 --

 Key: MAPREDUCE-5648
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5648
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: client, mr-am, mrv2
Affects Versions: 2.2.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5648.v01.patch, MAPREDUCE-5648.v02.patch, 
 MAPREDUCE-5648.v03.patch, MAPREDUCE-5648.v04.patch, MAPREDUCE-5648.v05.patch, 
 Screen Shot 2013-11-23 at 11.12.15 AM.png


 Our users and tools want to be able to supply additional custom diagnostic 
 messages to mapreduce ClientProtocol killTask.





[jira] [Updated] (MAPREDUCE-5839) Provide a boolean switch to enable LazyOutputFormat

2015-02-05 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-5839:

Status: Patch Available  (was: Open)

 Provide a boolean switch to enable LazyOutputFormat
 ---

 Key: MAPREDUCE-5839
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5839
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: client
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5839.v01.patch, MAPREDUCE-5839.v02.patch








[jira] [Updated] (MAPREDUCE-5839) Provide a boolean switch to enable LazyOutputFormat

2015-02-05 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-5839:

Status: Open  (was: Patch Available)

 Provide a boolean switch to enable LazyOutputFormat
 ---

 Key: MAPREDUCE-5839
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5839
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: client
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5839.v01.patch, MAPREDUCE-5839.v02.patch








[jira] [Updated] (MAPREDUCE-5696) Add Localization counters to MR

2015-02-05 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-5696:

Status: Open  (was: Patch Available)

Cancelling patch as it no longer applies.

 Add Localization counters to MR
 ---

 Key: MAPREDUCE-5696
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5696
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: LocalizationCounters.png, MAPREDUCE-5696.v01.patch, 
 MAPREDUCE-5696.v02.patch


 Users are often unaware of the localization cost that their jobs incur. To
 measure the effectiveness of localization caches, it is necessary to expose the
 overhead in the form of user-visible metrics. The purpose of this JIRA is to
 complement YARN-1529. While YARN-1529 attempts to provide a cluster-wide view
 to cluster admins, this JIRA focuses on exposing the localization overhead on
 a per-job basis to the job owner/user.





[jira] [Commented] (MAPREDUCE-5839) Provide a boolean switch to enable LazyOutputFormat

2015-02-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307729#comment-14307729
 ] 

Hadoop QA commented on MAPREDUCE-5839:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12640581/MAPREDUCE-5839.v02.patch
  against trunk revision b6466de.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5160//console

This message is automatically generated.

 Provide a boolean switch to enable LazyOutputFormat
 ---

 Key: MAPREDUCE-5839
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5839
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: client
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5839.v01.patch, MAPREDUCE-5839.v02.patch








[jira] [Updated] (MAPREDUCE-6237) DBRecordReader is not thread safe

2015-02-05 Thread Kannan Rajah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kannan Rajah updated MAPREDUCE-6237:

Attachment: mapreduce-6237.patch

 DBRecordReader is not thread safe
 -

 Key: MAPREDUCE-6237
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6237
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.5.0
Reporter: Kannan Rajah
Assignee: Kannan Rajah
 Attachments: mapreduce-6237.patch, mapreduce-6237.patch, 
 mapreduce-6237.patch


 DBInputFormat.createDBRecorder is reusing JDBC connections across instances
 of DBRecordReader. This is not a good idea. We should be creating a separate
 connection for each instance. If performance is a concern, then we should be
 using connection pooling instead.
 I looked at DBOutputFormat.getRecordReader. It actually creates a new
 Connection object for each DBRecordReader. So can we just change
 DBInputFormat to create a new Connection every time? The connection reuse code
 was added as part of the connection leak bug in MAPREDUCE-1443. Is there any
 reason for caching the connection?
 We observed this issue in a customer setup where they were reading data from
 MySQL using Pig. According to the customer, the query returns two records,
 which causes Pig to create two instances of DBRecordReader. These two
 instances share the database connection instance. The first DBRecordReader
 runs to extract the first record from MySQL just fine, but then closes the
 shared connection instance. When the second DBRecordReader runs, it tries to
 execute a query to retrieve the second record on the closed shared connection
 instance, which fails. If we set mapred.map.tasks to 1, the query succeeds.
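
 The failure mode described above can be reproduced in miniature. The sketch below is an assumption-laden illustration rather than Hadoop code: FakeConnection is a hypothetical stand-in for java.sql.Connection that only tracks whether it is open, and Reader is a hypothetical reader that closes its connection when it finishes.

```java
class SharedConnectionSketch {
    /** Hypothetical stand-in for java.sql.Connection; tracks only open/closed state. */
    static final class FakeConnection {
        private boolean closed;

        String query(String sql) {
            if (closed) {
                throw new IllegalStateException("connection is closed");
            }
            return "row for: " + sql;
        }

        void close() {
            closed = true;
        }
    }

    /** Hypothetical reader that closes its connection once it has read its record. */
    static final class Reader {
        private final FakeConnection conn;

        Reader(FakeConnection conn) {
            this.conn = conn;
        }

        String readAndClose(String sql) {
            try {
                return conn.query(sql);
            } finally {
                conn.close();
            }
        }
    }

    /** Runs two readers in sequence and reports whether the second read works. */
    static boolean secondReadSucceeds(boolean shareConnection) {
        FakeConnection first = new FakeConnection();
        FakeConnection second = shareConnection ? first : new FakeConnection();
        new Reader(first).readAndClose("SELECT 1");
        try {
            new Reader(second).readAndClose("SELECT 2");
            return true;
        } catch (IllegalStateException e) {
            return false; // the first reader already closed the shared connection
        }
    }
}
```

 Calling secondReadSucceeds(true) models the shared-connection case and fails on the second read, while secondReadSucceeds(false) gives each reader its own connection and succeeds.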





[jira] [Updated] (MAPREDUCE-6237) DBRecordReader is not thread safe

2015-02-05 Thread Kannan Rajah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kannan Rajah updated MAPREDUCE-6237:

Attachment: (was: mapreduce-6237.patch)

 DBRecordReader is not thread safe
 -

 Key: MAPREDUCE-6237
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6237
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.5.0
Reporter: Kannan Rajah
Assignee: Kannan Rajah
 Attachments: mapreduce-6237.patch, mapreduce-6237.patch, 
 mapreduce-6237.patch


 DBInputFormat.createDBRecorder is reusing JDBC connections across instances
 of DBRecordReader. This is not a good idea. We should be creating a separate
 connection for each instance. If performance is a concern, then we should be
 using connection pooling instead.
 I looked at DBOutputFormat.getRecordReader. It actually creates a new
 Connection object for each DBRecordReader. So can we just change
 DBInputFormat to create a new Connection every time? The connection reuse code
 was added as part of the connection leak bug in MAPREDUCE-1443. Is there any
 reason for caching the connection?
 We observed this issue in a customer setup where they were reading data from
 MySQL using Pig. According to the customer, the query returns two records,
 which causes Pig to create two instances of DBRecordReader. These two
 instances share the database connection instance. The first DBRecordReader
 runs to extract the first record from MySQL just fine, but then closes the
 shared connection instance. When the second DBRecordReader runs, it tries to
 execute a query to retrieve the second record on the closed shared connection
 instance, which fails. If we set mapred.map.tasks to 1, the query succeeds.




