[jira] [Created] (MAPREDUCE-5604) TestMRAMWithNonNormalizedCapabilities fails on Windows due to exceeding max path length
Chris Nauroth created MAPREDUCE-5604: Summary: TestMRAMWithNonNormalizedCapabilities fails on Windows due to exceeding max path length Key: MAPREDUCE-5604 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5604 Project: Hadoop Map/Reduce Issue Type: Bug Components: client, test Affects Versions: 2.2.0, 3.0.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Priority: Minor The test uses the full class name as a component of the {{yarn.nodemanager.local-dirs}} setting for a {{MiniMRYarnCluster}}. This causes container launch to fail when trying to access files at a path longer than the maximum of 260 characters. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (MAPREDUCE-5392) mapred job -history all command throws IndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/MAPREDUCE-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809963#comment-13809963 ] Shinichi Yamashita commented on MAPREDUCE-5392: --- Jenkins hit an OutOfMemoryError. I checked the Jenkins log, and the OOM occurred during the native-maven-plugin phase. To make sure, I am attaching the same patch again to rerun the Jenkins test. mapred job -history all command throws IndexOutOfBoundsException -- Key: MAPREDUCE-5392 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5392 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 3.0.0, 2.0.5-alpha, 2.2.0 Reporter: Shinichi Yamashita Assignee: Shinichi Yamashita Priority: Minor Fix For: 3.0.0 Attachments: MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, MAPREDUCE-5392.patch When I use the all option of the mapred job -history command, the following exception is displayed and the command does not work. {code} Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: -3 at java.lang.String.substring(String.java:1875) at org.apache.hadoop.mapreduce.util.HostUtil.convertTrackerNameToHostName(HostUtil.java:49) at org.apache.hadoop.mapreduce.jobhistory.HistoryViewer.getTaskLogsUrl(HistoryViewer.java:459) at org.apache.hadoop.mapreduce.jobhistory.HistoryViewer.printAllTaskAttempts(HistoryViewer.java:235) at org.apache.hadoop.mapreduce.jobhistory.HistoryViewer.print(HistoryViewer.java:117) at org.apache.hadoop.mapreduce.tools.CLI.viewHistory(CLI.java:472) at org.apache.hadoop.mapreduce.tools.CLI.run(CLI.java:313) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.mapred.JobClient.main(JobClient.java:1233) {code} This is because the node name recorded in the history file is not prefixed with tracker_.
Therefore, the patch modifies the code so that the history file can be read even when the node name lacks the tracker_ prefix. In addition, it fixes the URL of the displayed task log.
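The failure mode and the tolerant parsing described above can be sketched as follows. This is an illustrative approximation only: the method bodies are stand-ins, not the actual code in org.apache.hadoop.mapreduce.util.HostUtil or the attached patch.

```java
public class TrackerNameDemo {
    // Naive conversion: assumes every node name carries the "tracker_" prefix,
    // so a short name like "node1" underruns and throws.
    static String convertNaive(String trackerName) {
        return trackerName.substring("tracker_".length());
    }

    // Tolerant conversion in the spirit of the patch: accept node names with
    // or without the prefix, so history files written either way still parse.
    static String convertSafe(String trackerName) {
        return trackerName.startsWith("tracker_")
                ? trackerName.substring("tracker_".length())
                : trackerName;
    }

    public static void main(String[] args) {
        System.out.println(convertSafe("tracker_host1.example.com"));
        System.out.println(convertSafe("node1")); // no prefix: returned unchanged
        try {
            convertNaive("node1"); // 5 chars, substring(8): index out of range
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println(e.getClass().getSimpleName());
        }
    }
}
```

Note that a 5-character node name gives substring(8) an overshoot of 3, which matches the "String index out of range: -3" in the reported stack trace.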
[jira] [Updated] (MAPREDUCE-5392) mapred job -history all command throws IndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/MAPREDUCE-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichi Yamashita updated MAPREDUCE-5392: -- Attachment: MAPREDUCE-5392.patch
[jira] [Updated] (MAPREDUCE-5604) TestMRAMWithNonNormalizedCapabilities fails on Windows due to exceeding max path length
[ https://issues.apache.org/jira/browse/MAPREDUCE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated MAPREDUCE-5604: - Attachment: MAPREDUCE-5604.1.patch I'm attaching a patch that applies the same fix we've used in similar cases: use the simple class name instead of the fully qualified class name so that the testing directory is shorter.
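The simple-name-versus-fully-qualified-name shortening can be illustrated with a small stand-alone sketch. The class used here is only a stand-in; the actual patch changes how the test names the MiniMRYarnCluster working directory.

```java
public class LocalDirNameDemo {
    public static void main(String[] args) {
        // Stand-in for the test class whose name feeds yarn.nodemanager.local-dirs.
        Class<?> c = java.util.concurrent.ConcurrentHashMap.class;
        String full = c.getName();         // "java.util.concurrent.ConcurrentHashMap"
        String simple = c.getSimpleName(); // "ConcurrentHashMap"
        // The fully qualified name adds the entire package path to every local
        // dir, which is what pushes container file paths past Windows' 260-char limit.
        System.out.println(full.length() > simple.length());
    }
}
```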
[jira] [Updated] (MAPREDUCE-5604) TestMRAMWithNonNormalizedCapabilities fails on Windows due to exceeding max path length
[ https://issues.apache.org/jira/browse/MAPREDUCE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated MAPREDUCE-5604: - Status: Patch Available (was: Open)
[jira] [Commented] (MAPREDUCE-5604) TestMRAMWithNonNormalizedCapabilities fails on Windows due to exceeding max path length
[ https://issues.apache.org/jira/browse/MAPREDUCE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809977#comment-13809977 ] Hadoop QA commented on MAPREDUCE-5604: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12611259/MAPREDUCE-5604.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4160//console This message is automatically generated.
[jira] [Commented] (MAPREDUCE-5392) mapred job -history all command throws IndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/MAPREDUCE-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809983#comment-13809983 ] Hadoop QA commented on MAPREDUCE-5392: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12611258/MAPREDUCE-5392.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs: org.apache.hadoop.mapreduce.v2.hs.TestJobHistoryParsing {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4161//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4161//console This message is automatically generated.
[jira] [Commented] (MAPREDUCE-5604) TestMRAMWithNonNormalizedCapabilities fails on Windows due to exceeding max path length
[ https://issues.apache.org/jira/browse/MAPREDUCE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809987#comment-13809987 ] Chuan Liu commented on MAPREDUCE-5604: -- +1. Change looks good to me.
[jira] [Commented] (MAPREDUCE-5604) TestMRAMWithNonNormalizedCapabilities fails on Windows due to exceeding max path length
[ https://issues.apache.org/jira/browse/MAPREDUCE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810414#comment-13810414 ] Chris Nauroth commented on MAPREDUCE-5604: -- bq. -1 javac. The patch appears to cause the build to fail. This is unrelated to the patch. It looks like another problem with the Jenkins server being overloaded: {code} Error occurred during initialization of VM Cannot create VM thread. Out of system resources. {code}
[jira] [Updated] (MAPREDUCE-5186) mapreduce.job.max.split.locations causes some splits created by CombineFileInputFormat to fail
[ https://issues.apache.org/jira/browse/MAPREDUCE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-5186: -- Target Version/s: 2.2.1 Status: Patch Available (was: Open) mapreduce.job.max.split.locations causes some splits created by CombineFileInputFormat to fail -- Key: MAPREDUCE-5186 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5186 Project: Hadoop Map/Reduce Issue Type: Bug Components: job submission Affects Versions: 2.2.0, 2.0.4-alpha Reporter: Sangjin Lee Assignee: Robert Parker Priority: Critical Attachments: MAPREDUCE-5186v1.patch, MAPREDUCE-5186v2.patch, MAPREDUCE-5186v3.patch CombineFileInputFormat can easily create splits that can come from many different locations (during the last pass of creating global splits). However, we observe that this often runs afoul of the mapreduce.job.max.split.locations check that's done by JobSplitWriter. The default value for mapreduce.job.max.split.locations is 10, and with any decent-sized cluster, CombineFileInputFormat creates splits that are well above this limit.
[jira] [Updated] (MAPREDUCE-5186) mapreduce.job.max.split.locations causes some splits created by CombineFileInputFormat to fail
[ https://issues.apache.org/jira/browse/MAPREDUCE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-5186: -- Attachment: MAPREDUCE-5186v3.patch Updating Rob's patch with unit tests to verify truncation of locations is occurring when necessary. Also removed the TestBlockLimits test since it was checking for an exception in this case and is no longer necessary.
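The truncation behavior the patch notes describe might look roughly like the sketch below. The truncateLocations helper and its signature are illustrative, not the actual JobSplitWriter API; the idea is simply to keep the first maxLocations hosts instead of failing the job.

```java
import java.util.Arrays;

public class SplitLocationDemo {
    // Truncate instead of throwing when a split lists more hosts than
    // mapreduce.job.max.split.locations allows.
    static String[] truncateLocations(String[] locations, int maxLocations) {
        return locations.length <= maxLocations
                ? locations
                : Arrays.copyOf(locations, maxLocations);
    }

    public static void main(String[] args) {
        // A combined split spanning 15 nodes, as CombineFileInputFormat can produce.
        String[] hosts = new String[15];
        for (int i = 0; i < hosts.length; i++) {
            hosts[i] = "node" + i;
        }
        // 10 is the default value of mapreduce.job.max.split.locations.
        System.out.println(truncateLocations(hosts, 10).length);
        System.out.println(truncateLocations(new String[] {"node0"}, 10).length);
    }
}
```

Dropping the surplus locations only loses scheduling hints (some potential data-local reads), which is why truncating is safe where failing the job was not.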
[jira] [Commented] (MAPREDUCE-5186) mapreduce.job.max.split.locations causes some splits created by CombineFileInputFormat to fail
[ https://issues.apache.org/jira/browse/MAPREDUCE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810631#comment-13810631 ] Sangjin Lee commented on MAPREDUCE-5186: Looks good to me. Thanks!
[jira] [Commented] (MAPREDUCE-5186) mapreduce.job.max.split.locations causes some splits created by CombineFileInputFormat to fail
[ https://issues.apache.org/jira/browse/MAPREDUCE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810639#comment-13810639 ] Hadoop QA commented on MAPREDUCE-5186: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12611465/MAPREDUCE-5186v3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4162//console This message is automatically generated.
[jira] [Updated] (MAPREDUCE-5186) mapreduce.job.max.split.locations causes some splits created by CombineFileInputFormat to fail
[ https://issues.apache.org/jira/browse/MAPREDUCE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-5186: -- Attachment: MAPREDUCE-5186v3.patch Apache build machine had a bunch of processes that had escaped and caused the Jenkins user to hit the process ulimit. Those have been cleaned up, so resubmitting the same patch to kick Jenkins again.
[jira] [Updated] (MAPREDUCE-5601) ShuffleHandler fadvises file regions as DONTNEED even when fetch fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-5601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated MAPREDUCE-5601: -- Attachment: MAPREDUCE-5601.patch ShuffleHandler fadvises file regions as DONTNEED even when fetch fails -- Key: MAPREDUCE-5601 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5601 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 2.2.0 Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: MAPREDUCE-5601.patch, MAPREDUCE-5601.patch, MAPREDUCE-5601.patch When a reducer initiates a fetch request, it does not know whether it will be able to fit the fetched data in memory. The first part of the response tells how much data will be coming. If space is not currently available, the reducer will abandon its request and try again later. When this occurs, the ShuffleHandler still fadvises the file region as DONTNEED, meaning that the next time it's asked for, it will definitely be read from disk, even if it happened to be in the page cache before the request. I noticed this when trying to figure out why my job was doing so much more disk IO in MR2 than in MR1. When I turned the fadvise stuff off, I found that disk reads went to nearly 0 on machines that had enough memory to fit map outputs into the page cache. I then straced the NodeManager and noticed that there were over four times as many fadvise DONTNEED calls as map-reduce pairs. Further logging showed the same map outputs being fetched about this many times. This is a regression from MR1, which only did the fadvise DONTNEED after all the bytes were transferred.
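The MR1-style behavior the description calls for (advise DONTNEED only once all bytes were transferred) can be sketched with a stubbed transfer future. The Netty channel and NativeIO fadvise calls used by the real ShuffleHandler are replaced here with illustrative stand-ins.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

public class FadviseTimingDemo {
    static final AtomicInteger dontNeedCalls = new AtomicInteger();

    // Stand-in for a POSIX_FADV_DONTNEED advise on the map output file.
    static void fadviseDontNeed(String file) {
        dontNeedCalls.incrementAndGet();
    }

    // Drop the page cache only after the transfer completes successfully;
    // an abandoned fetch leaves the cache warm for the reducer's retry.
    static void sendMapOutput(String file, boolean fetchSucceeds) {
        CompletableFuture<Void> transfer = new CompletableFuture<>();
        transfer.whenComplete((v, err) -> {
            if (err == null) {
                fadviseDontNeed(file);
            }
        });
        if (fetchSucceeds) {
            transfer.complete(null);
        } else {
            transfer.completeExceptionally(new RuntimeException("fetch abandoned"));
        }
    }

    public static void main(String[] args) {
        sendMapOutput("attempt_0/file.out", false); // abandoned: no DONTNEED issued
        sendMapOutput("attempt_0/file.out", true);  // completed: one DONTNEED issued
        System.out.println(dontNeedCalls.get());
    }
}
```

Tying the advise to transfer completion is what prevents the repeated cache evictions (and re-reads from disk) observed in the strace output above.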
[jira] [Commented] (MAPREDUCE-5601) ShuffleHandler fadvises file regions as DONTNEED even when fetch fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-5601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810809#comment-13810809 ] Hadoop QA commented on MAPREDUCE-5601: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12611494/MAPREDUCE-5601.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4164//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4164//console This message is automatically generated.
[jira] [Commented] (MAPREDUCE-5186) mapreduce.job.max.split.locations causes some splits created by CombineFileInputFormat to fail
[ https://issues.apache.org/jira/browse/MAPREDUCE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810852#comment-13810852 ] Hadoop QA commented on MAPREDUCE-5186: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12611483/MAPREDUCE-5186v3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The following test timeouts occurred in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient: org.apache.hadoop.mapreduce.v2.TestUberAM {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4163//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4163//console This message is automatically generated.
[jira] [Commented] (MAPREDUCE-5601) ShuffleHandler fadvises file regions as DONTNEED even when fetch fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-5601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810873#comment-13810873 ] Todd Lipcon commented on MAPREDUCE-5601: bq. Or you're saying we would pass the amount of unreserved memory remaining? Yea... though could be problematic due to parallel fetches. I think best would be to add a new field to TaskAttemptCompletionEventProto which contains the size of the completed map output. Then the reducer scheduler could be smarter and avoid wasting the round trip on things which won't fit anyway. Given it's a PB, it could be done compatibly (and fall back to the current optimistic behavior).
[jira] [Commented] (MAPREDUCE-5601) ShuffleHandler fadvises file regions as DONTNEED even when fetch fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-5601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13810875#comment-13810875 ] Todd Lipcon commented on MAPREDUCE-5601: --- Also, +1 for the patch.
[jira] [Commented] (MAPREDUCE-5601) ShuffleHandler fadvises file regions as DONTNEED even when fetch fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-5601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13810914#comment-13810914 ] Sandy Ryza commented on MAPREDUCE-5601: --- bq. Yea... though could be problematic due to parallel fetches. Right. It would help a little if the amount remaining was less than the total fetched in each request, but it wouldn't solve the bigger problem. bq. I think best would be to add a new field to TaskAttemptCompletionEventProto which contains the size of the completed map output. That's been my thinking too. Unfortunately the task umbilical protocol is still on Writables, so there could be compatibility issues.
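The scheduling idea discussed in the comments above, where a size field on the completion event would let the reducer skip fetches that cannot currently fit in shuffle memory, could look roughly like the sketch below. All class, field, and method names here are hypothetical illustrations of the proposal, not Hadoop's actual fetch scheduler.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch: if each map-completion event carried the map
// output size (the new proto field Todd proposes), the reducer could
// defer fetches that will not fit in its unreserved shuffle memory,
// instead of discovering that only after a wasted round trip.
public class SizeAwareFetchScheduler {
    static class MapOutputEvent {
        final String mapId;
        final long outputBytes; // the proposed new size field
        MapOutputEvent(String mapId, long outputBytes) {
            this.mapId = mapId;
            this.outputBytes = outputBytes;
        }
    }

    private long unreservedBytes;
    private final Queue<MapOutputEvent> deferred = new ArrayDeque<>();

    SizeAwareFetchScheduler(long shuffleMemoryBytes) {
        this.unreservedBytes = shuffleMemoryBytes;
    }

    /** Returns true if the fetch was scheduled, false if deferred. */
    boolean tryScheduleFetch(MapOutputEvent event) {
        if (event.outputBytes > unreservedBytes) {
            // Would not fit right now: defer rather than issuing a
            // request the merger would immediately abandon.
            deferred.add(event);
            return false;
        }
        unreservedBytes -= event.outputBytes; // reserve memory up front
        return true;
    }

    /** Called when a fetched output has been merged and its memory freed. */
    void release(long bytes) {
        unreservedBytes += bytes;
    }
}
```

Note this only sketches the reducer-side half; as Sandy points out, actually plumbing the size through would touch the task umbilical protocol, which is still Writable-based, so compatibility is the hard part.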
[jira] [Updated] (MAPREDUCE-4362) If possible, we should get back the feature of propagating task logs back to JobClient
[ https://issues.apache.org/jira/browse/MAPREDUCE-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated MAPREDUCE-4362: --- Target Version/s: 2.3.0 Sandy, can you rebase this patch? We need to remove the YARN changes now that YARN-649 is in. Maybe some tests too? If possible, we should get back the feature of propagating task logs back to JobClient -- Key: MAPREDUCE-4362 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4362 Project: Hadoop Map/Reduce Issue Type: Bug Components: client, mrv2 Affects Versions: 2.0.0-alpha Reporter: Vinod Kumar Vavilapalli Assignee: Sandy Ryza Attachments: MAPREDUCE-4362.patch, MAPREDUCE-4362.patch MAPREDUCE-3889 removed the code which was trying to pull from /tasklog. We should see if it is possible to get back the feature.