[jira] [Updated] (MAPREDUCE-5186) mapreduce.job.max.split.locations causes some splits created by CombineFileInputFormat to fail

2013-11-11 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5186:
--

       Resolution: Fixed
    Fix Version/s: 2.3.0
                   3.0.0
     Hadoop Flags: Reviewed
           Status: Resolved  (was: Patch Available)

Thanks Rob for the contribution, and thanks Sangjin and Daryn for reviews.  I 
committed this to trunk and branch-2.

 mapreduce.job.max.split.locations causes some splits created by 
 CombineFileInputFormat to fail
 --

 Key: MAPREDUCE-5186
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5186
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission
Affects Versions: 2.0.4-alpha, 2.2.0
Reporter: Sangjin Lee
Assignee: Robert Parker
Priority: Critical
 Fix For: 3.0.0, 2.3.0

 Attachments: MAPREDUCE-5186v1.patch, MAPREDUCE-5186v2.patch, 
 MAPREDUCE-5186v3.patch, MAPREDUCE-5186v3.patch


 CombineFileInputFormat can easily create splits whose blocks come from many 
 different locations (during the last pass of creating global splits). 
 However, we observe that this often runs afoul of the 
 mapreduce.job.max.split.locations check that's done by JobSplitWriter.
 The default value for mapreduce.job.max.split.locations is 10, and with any 
 decent-sized cluster, CombineFileInputFormat creates splits whose location 
 counts are well above this limit.
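
 To make the failure mode concrete, here is a minimal sketch of a 
 location-count check of the kind described above. It is not the 
 JobSplitWriter code: the class SplitLocationCheck and the methods 
 checkLocations and hosts are hypothetical names chosen for illustration, 
 and the message text only loosely follows the "Max block location 
 exceeded" IOException mentioned in this thread. The limit of 10 is the 
 default quoted in the description.

```java
import java.io.IOException;

// Hypothetical, simplified model of a location-count check at job submission.
final class SplitLocationCheck {
    // Default limit quoted in the issue description.
    static final int DEFAULT_MAX_SPLIT_LOCATIONS = 10;

    // Rejects a split whose host list is longer than the limit.
    static void checkLocations(String[] locations, int maxLocations) throws IOException {
        if (locations.length > maxLocations) {
            throw new IOException("Max block location exceeded for split: "
                + locations.length + " locations, limit " + maxLocations);
        }
    }

    // Builds a host list such as a combined split might report (illustrative only).
    static String[] hosts(int count) {
        String[] result = new String[count];
        for (int i = 0; i < count; i++) {
            result[i] = "host" + i;
        }
        return result;
    }

    public static void main(String[] args) {
        try {
            // A combined split spanning 25 hosts exceeds the default limit of 10.
            checkLocations(hosts(25), DEFAULT_MAX_SPLIT_LOCATIONS);
            System.out.println("split accepted");
        } catch (IOException e) {
            System.out.println("split rejected: " + e.getMessage());
        }
    }
}
```

 Under these assumptions, any split that spans more hosts than the limit 
 is rejected outright, which matches the symptom reported here.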



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (MAPREDUCE-5186) mapreduce.job.max.split.locations causes some splits created by CombineFileInputFormat to fail

2013-10-31 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5186:
--

Target Version/s: 2.2.1
  Status: Patch Available  (was: Open)

 mapreduce.job.max.split.locations causes some splits created by 
 CombineFileInputFormat to fail
 --

 Key: MAPREDUCE-5186
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5186
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission
Affects Versions: 2.2.0, 2.0.4-alpha
Reporter: Sangjin Lee
Assignee: Robert Parker
Priority: Critical
 Attachments: MAPREDUCE-5186v1.patch, MAPREDUCE-5186v2.patch, 
 MAPREDUCE-5186v3.patch







[jira] [Updated] (MAPREDUCE-5186) mapreduce.job.max.split.locations causes some splits created by CombineFileInputFormat to fail

2013-10-31 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5186:
--

Attachment: MAPREDUCE-5186v3.patch

Updating Rob's patch with unit tests to verify that locations are truncated 
when necessary.  Also removed the TestBlockLimits test, since it was checking 
for an exception in this case and is no longer necessary.
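
As an illustration of the truncation mentioned above, here is a minimal 
sketch of capping a split's location list instead of failing the job. It is 
not the attached patch: the class SplitLocationTruncation and the method 
truncateLocations are hypothetical names, and keeping the first entries of 
the list is an assumption about how the cut is made.

```java
import java.util.Arrays;

// Hypothetical sketch: cap a split's location list instead of failing the job.
final class SplitLocationTruncation {
    // Returns the list unchanged when it fits, otherwise its first maxLocations entries.
    // Assumes maxLocations is non-negative.
    static String[] truncateLocations(String[] locations, int maxLocations) {
        if (locations.length <= maxLocations) {
            return locations;
        }
        return Arrays.copyOf(locations, maxLocations);
    }

    public static void main(String[] args) {
        String[] hosts = {"h1", "h2", "h3", "h4", "h5"};
        String[] kept = truncateLocations(hosts, 3);
        System.out.println(Arrays.toString(kept)); // prints [h1, h2, h3]
    }
}
```

With this approach a split that spans too many hosts is still submitted, 
only with fewer locality hints, so there is no exception left for a test to 
check in this case.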

 mapreduce.job.max.split.locations causes some splits created by 
 CombineFileInputFormat to fail
 --

 Key: MAPREDUCE-5186
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5186
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission
Affects Versions: 2.0.4-alpha, 2.2.0
Reporter: Sangjin Lee
Assignee: Robert Parker
Priority: Critical
 Attachments: MAPREDUCE-5186v1.patch, MAPREDUCE-5186v2.patch, 
 MAPREDUCE-5186v3.patch







[jira] [Updated] (MAPREDUCE-5186) mapreduce.job.max.split.locations causes some splits created by CombineFileInputFormat to fail

2013-10-31 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5186:
--

Attachment: MAPREDUCE-5186v3.patch

The Apache build machine had a bunch of escaped processes that caused the 
Jenkins user to hit the process ulimit.  Those have been cleaned up, so I am 
resubmitting the same patch to kick Jenkins again.

 mapreduce.job.max.split.locations causes some splits created by 
 CombineFileInputFormat to fail
 --

 Key: MAPREDUCE-5186
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5186
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission
Affects Versions: 2.0.4-alpha, 2.2.0
Reporter: Sangjin Lee
Assignee: Robert Parker
Priority: Critical
 Attachments: MAPREDUCE-5186v1.patch, MAPREDUCE-5186v2.patch, 
 MAPREDUCE-5186v3.patch, MAPREDUCE-5186v3.patch







[jira] [Updated] (MAPREDUCE-5186) mapreduce.job.max.split.locations causes some splits created by CombineFileInputFormat to fail

2013-10-24 Thread Robert Parker (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Parker updated MAPREDUCE-5186:
-

Attachment: MAPREDUCE-5186v1.patch

This patch restores the 1.0 behavior. It defines mapreduce.job.max.split.locations 
in mapred-default.xml and fixes the test that was succeeding on the "Max block 
location exceeded" IOException instead of the "Job failed!" IOException.  
Pending [~tomwhite]'s input.

 mapreduce.job.max.split.locations causes some splits created by 
 CombineFileInputFormat to fail
 --

 Key: MAPREDUCE-5186
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5186
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission
Affects Versions: 2.0.4-alpha, 2.2.0
Reporter: Sangjin Lee
Assignee: Robert Parker
Priority: Critical
 Attachments: MAPREDUCE-5186v1.patch







[jira] [Updated] (MAPREDUCE-5186) mapreduce.job.max.split.locations causes some splits created by CombineFileInputFormat to fail

2013-10-24 Thread Robert Parker (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Parker updated MAPREDUCE-5186:
-

Attachment: MAPREDUCE-5186v2.patch

[~sjlee0] I originally tried to restore the mrv1 functionality, but it seems 
that max block locations is an artifact from mrv1 to protect the jobtracker, and 
it is not clear to me what function it serves with respect to an AM.  I have 
modified the patch to remove mapreduce.job.max.split.locations.

 mapreduce.job.max.split.locations causes some splits created by 
 CombineFileInputFormat to fail
 --

 Key: MAPREDUCE-5186
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5186
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission
Affects Versions: 2.0.4-alpha, 2.2.0
Reporter: Sangjin Lee
Assignee: Robert Parker
Priority: Critical
 Attachments: MAPREDUCE-5186v1.patch, MAPREDUCE-5186v2.patch







[jira] [Updated] (MAPREDUCE-5186) mapreduce.job.max.split.locations causes some splits created by CombineFileInputFormat to fail

2013-10-22 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-5186:


Affects Version/s: 2.2.0

 mapreduce.job.max.split.locations causes some splits created by 
 CombineFileInputFormat to fail
 --

 Key: MAPREDUCE-5186
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5186
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 2.0.4-alpha, 2.2.0
Reporter: Sangjin Lee
Priority: Critical






[jira] [Updated] (MAPREDUCE-5186) mapreduce.job.max.split.locations causes some splits created by CombineFileInputFormat to fail

2013-10-22 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-5186:


Component/s:     (was: mrv1)
                 (was: mrv2)
             job submission

 mapreduce.job.max.split.locations causes some splits created by 
 CombineFileInputFormat to fail
 --

 Key: MAPREDUCE-5186
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5186
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission
Affects Versions: 2.0.4-alpha, 2.2.0
Reporter: Sangjin Lee
Priority: Critical






[jira] [Updated] (MAPREDUCE-5186) mapreduce.job.max.split.locations causes some splits created by CombineFileInputFormat to fail

2013-10-11 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee updated MAPREDUCE-5186:
---

Priority: Critical  (was: Major)

 mapreduce.job.max.split.locations causes some splits created by 
 CombineFileInputFormat to fail
 --

 Key: MAPREDUCE-5186
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5186
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 2.0.4-alpha
Reporter: Sangjin Lee
Priority: Critical



