[jira] [Commented] (MAPREDUCE-6989) [Umbrella] Uploader tool for Distributed Cache Deploy

2018-01-02 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308542#comment-16308542
 ] 

Chris Trezzo commented on MAPREDUCE-6989:
-

Hey [~miklos.szeg...@cloudera.com]! Thanks for the work so far! I have a 
question around the high-level approach: Is there a reason why we can't 
leverage the shared cache for this? There is already an upload mechanism that 
has been built, along with a cleaning mechanism and a way to cache similar jars.

> [Umbrella] Uploader tool for Distributed Cache Deploy
> -
>
> Key: MAPREDUCE-6989
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6989
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
> Attachments: MAPREDUCE-6989 Mapreduce framework uploader tool.pdf
>
>
> The proposal is to create a tool that collects all available jars in the 
> Hadoop classpath and adds them to a single tarball file. It then uploads the 
> resulting archive to an HDFS directory. This saves the cluster administrator 
> from having to set this up manually for Distributed Cache Deploy.
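A minimal sketch of the idea being proposed (this is not the tool from the attached design document; the archive name, destination HDFS path, and use of commons-compress are illustrative assumptions only):
{noformat}
// Illustrative sketch only: collect the jars on the local classpath into a single
// tar.gz and copy it to an HDFS directory. File names, the destination path and the
// use of commons-compress are assumptions, not details from the design document.
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream;
import org.apache.commons.compress.utils.IOUtils;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FrameworkUploaderSketch {
  public static void main(String[] args) throws Exception {
    File tarball = new File("mr-framework.tar.gz"); // placeholder archive name
    try (TarArchiveOutputStream tar = new TarArchiveOutputStream(
        new GzipCompressorOutputStream(new FileOutputStream(tarball)))) {
      // Walk the classpath and add every jar file to the archive.
      for (String entry : System.getProperty("java.class.path").split(File.pathSeparator)) {
        File jar = new File(entry);
        if (jar.isFile() && jar.getName().endsWith(".jar")) {
          tar.putArchiveEntry(new TarArchiveEntry(jar, jar.getName()));
          try (FileInputStream in = new FileInputStream(jar)) {
            IOUtils.copy(in, tar);
          }
          tar.closeArchiveEntry();
        }
      }
    }
    // Upload the archive to HDFS; the destination directory is a placeholder.
    FileSystem fs = FileSystem.get(new Configuration());
    fs.copyFromLocalFile(new Path(tarball.getAbsolutePath()),
        new Path("/apps/mapreduce/mr-framework.tar.gz"));
  }
}
{noformat}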






[jira] [Updated] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2017-10-12 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-5951:

   Resolution: Fixed
Fix Version/s: 3.0.0
   2.9.0
   Status: Resolved  (was: Patch Available)

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Fix For: 2.9.0, 3.0.0
>
> Attachments: MAPREDUCE-5951-Overview.001.pdf, 
> MAPREDUCE-5951-trunk-020.patch, MAPREDUCE-5951-trunk-021.patch, 
> MAPREDUCE-5951-trunk-v1.patch, MAPREDUCE-5951-trunk-v10.patch, 
> MAPREDUCE-5951-trunk-v11.patch, MAPREDUCE-5951-trunk-v12.patch, 
> MAPREDUCE-5951-trunk-v13.patch, MAPREDUCE-5951-trunk-v14.patch, 
> MAPREDUCE-5951-trunk-v15.patch, MAPREDUCE-5951-trunk-v2.patch, 
> MAPREDUCE-5951-trunk-v3.patch, MAPREDUCE-5951-trunk-v4.patch, 
> MAPREDUCE-5951-trunk-v5.patch, MAPREDUCE-5951-trunk-v6.patch, 
> MAPREDUCE-5951-trunk-v7.patch, MAPREDUCE-5951-trunk-v8.patch, 
> MAPREDUCE-5951-trunk-v9.patch, MAPREDUCE-5951-trunk.016.patch, 
> MAPREDUCE-5951-trunk.017.patch, MAPREDUCE-5951-trunk.018.patch, 
> MAPREDUCE-5951-trunk.019.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).






[jira] [Updated] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2017-10-12 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-5951:

Release Note: MapReduce support for the YARN shared cache allows MapReduce 
jobs to take advantage of additional resource caching. This saves network 
bandwidth between the job submission client and the cluster, as well as within 
the YARN cluster itself, reducing job submission time and overall job runtime.

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-Overview.001.pdf, 
> MAPREDUCE-5951-trunk-020.patch, MAPREDUCE-5951-trunk-021.patch, 
> MAPREDUCE-5951-trunk-v1.patch, MAPREDUCE-5951-trunk-v10.patch, 
> MAPREDUCE-5951-trunk-v11.patch, MAPREDUCE-5951-trunk-v12.patch, 
> MAPREDUCE-5951-trunk-v13.patch, MAPREDUCE-5951-trunk-v14.patch, 
> MAPREDUCE-5951-trunk-v15.patch, MAPREDUCE-5951-trunk-v2.patch, 
> MAPREDUCE-5951-trunk-v3.patch, MAPREDUCE-5951-trunk-v4.patch, 
> MAPREDUCE-5951-trunk-v5.patch, MAPREDUCE-5951-trunk-v6.patch, 
> MAPREDUCE-5951-trunk-v7.patch, MAPREDUCE-5951-trunk-v8.patch, 
> MAPREDUCE-5951-trunk-v9.patch, MAPREDUCE-5951-trunk.016.patch, 
> MAPREDUCE-5951-trunk.017.patch, MAPREDUCE-5951-trunk.018.patch, 
> MAPREDUCE-5951-trunk.019.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).






[jira] [Commented] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2017-10-12 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202418#comment-16202418
 ] 

Chris Trezzo commented on MAPREDUCE-5951:
-

Committed to trunk, branch-3.0 and branch-2. Thanks for all the help with 
reviews [~mingma], [~sjlee0], and [~kasha]!

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-Overview.001.pdf, 
> MAPREDUCE-5951-trunk-020.patch, MAPREDUCE-5951-trunk-021.patch, 
> MAPREDUCE-5951-trunk-v1.patch, MAPREDUCE-5951-trunk-v10.patch, 
> MAPREDUCE-5951-trunk-v11.patch, MAPREDUCE-5951-trunk-v12.patch, 
> MAPREDUCE-5951-trunk-v13.patch, MAPREDUCE-5951-trunk-v14.patch, 
> MAPREDUCE-5951-trunk-v15.patch, MAPREDUCE-5951-trunk-v2.patch, 
> MAPREDUCE-5951-trunk-v3.patch, MAPREDUCE-5951-trunk-v4.patch, 
> MAPREDUCE-5951-trunk-v5.patch, MAPREDUCE-5951-trunk-v6.patch, 
> MAPREDUCE-5951-trunk-v7.patch, MAPREDUCE-5951-trunk-v8.patch, 
> MAPREDUCE-5951-trunk-v9.patch, MAPREDUCE-5951-trunk.016.patch, 
> MAPREDUCE-5951-trunk.017.patch, MAPREDUCE-5951-trunk.018.patch, 
> MAPREDUCE-5951-trunk.019.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).






[jira] [Commented] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2017-10-10 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199008#comment-16199008
 ] 

Chris Trezzo commented on MAPREDUCE-5951:
-

Thank you [~mingma] for the review! I will wait until Thursday to commit in 
case there are any other comments. Otherwise, I plan to commit to trunk, 
branch-3.0 and branch-2.

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-Overview.001.pdf, 
> MAPREDUCE-5951-trunk-020.patch, MAPREDUCE-5951-trunk-021.patch, 
> MAPREDUCE-5951-trunk-v1.patch, MAPREDUCE-5951-trunk-v10.patch, 
> MAPREDUCE-5951-trunk-v11.patch, MAPREDUCE-5951-trunk-v12.patch, 
> MAPREDUCE-5951-trunk-v13.patch, MAPREDUCE-5951-trunk-v14.patch, 
> MAPREDUCE-5951-trunk-v15.patch, MAPREDUCE-5951-trunk-v2.patch, 
> MAPREDUCE-5951-trunk-v3.patch, MAPREDUCE-5951-trunk-v4.patch, 
> MAPREDUCE-5951-trunk-v5.patch, MAPREDUCE-5951-trunk-v6.patch, 
> MAPREDUCE-5951-trunk-v7.patch, MAPREDUCE-5951-trunk-v8.patch, 
> MAPREDUCE-5951-trunk-v9.patch, MAPREDUCE-5951-trunk.016.patch, 
> MAPREDUCE-5951-trunk.017.patch, MAPREDUCE-5951-trunk.018.patch, 
> MAPREDUCE-5951-trunk.019.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).






[jira] [Commented] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2017-10-06 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16194976#comment-16194976
 ] 

Chris Trezzo commented on MAPREDUCE-5951:
-

Thanks for the comment [~mingma]!

bq. Should any code be moved from MR to YARN to make it easier for other YARN 
applications to use shared cache? For example, maybe other applications can 
benefit from part of LocalResourceBuilder or the special care when dealing with 
fragment.

I have thought about this a fair amount. Originally we started pushing more of 
the fragment code down into the YARN layer (see YARN-3637), but later I 
realized that the code dealing with fragments is purely at the MapReduce layer. 
YARN's API does not use fragments. Instead, the ContainerLaunchContext expects a 
Map<String, LocalResource> localResources, where the String keys are the 
destination file names (i.e. symlinks). We wound up pulling the fragment 
portion back out of YARN (see YARN-7250) because it was not consistent with the 
rest of the YARN api. Additionally, I think that the way MapReduce uses 
fragments right now is very brittle and prone to bugs. Within MapReduce, 
resources with fragments are converted between paths, URIs and URLs multiple 
times throughout the code and each of these three classes supports fragments in 
different ways. If one is not very careful, a fragment can easily be dropped.

I also thought about moving LocalResourceBuilder to YARN, but it has a fair 
amount of MapReduce specific things that would need to change. For example:
# All of the parameters are array based due to how MapReduce currently handles 
resources. We could change this, but then that would need additional 
refactoring at the MapReduce level.
# Components from the MapReduce wildcard feature are in this class. We would 
need to figure out whether that makes sense at the YARN layer.
# LocalResourceBuilder currently handles fragments; we would also need to 
figure out whether that makes sense at the YARN layer.

At the end of the day, it would not be as simple as dropping LocalResourceBuilder 
into YARN and being done; we would have to think about it more. It does seem 
like something YARN could benefit from, along with a resource uploader. I can 
file another jira to cover these topics, but I think it is probably out of 
scope for this jira.

I think in reality the complexity in this jira is due to the way MapReduce 
itself handles resources and the above mentioned issues with fragments. If we 
wanted to implement a generic yarn resource uploader, I think it could be much 
simpler. For example, this is a slightly simplified version of the code devoted 
to using something in the shared cache:
{noformat}
String localPathChecksum = sharedCacheClient.getFileChecksum(localPath);
URL cachedResource = sharedCacheClient.use(appId, localPathChecksum);
LocalResource resource = LocalResource.newInstance(cachedResource,
  LocalResourceType.FILE, LocalResourceVisibility.PUBLIC,
  size, timestamp, null, true);
{noformat}

That LocalResource can then be passed directly to the ContainerLaunchContext 
where a symlink can be specified as a String. As you can see, there is no 
innate need for fragments at the YARN layer.
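To make that concrete, here is a minimal sketch (in the same simplified spirit as the snippet above) of how the resource and its symlink name could be handed to YARN; the path, size, timestamp, environment and command values are placeholders, not code from the patch:
{noformat}
// Sketch with placeholder values. Classes come from org.apache.hadoop.yarn.api.records,
// org.apache.hadoop.fs and java.util. The symlink name is simply the key of the
// localResources map; no URI fragment is involved at the YARN layer.
long size = 1234L;             // placeholder resource size
long timestamp = 1507000000L;  // placeholder modification time
Map<String, LocalResource> localResources = new HashMap<>();
LocalResource resource = LocalResource.newInstance(
    URL.fromPath(new Path("hdfs:///sharedcache/abc123/job.jar")), // placeholder location
    LocalResourceType.FILE, LocalResourceVisibility.PUBLIC, size, timestamp);
localResources.put("job.jar", resource); // "job.jar" becomes the symlink in the container
ContainerLaunchContext ctx = ContainerLaunchContext.newInstance(
    localResources,
    new HashMap<String, String>(),                            // environment
    Collections.singletonList("placeholder launch command"),  // container command
    null, null, null);
{noformat}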

Please let me know if that makes sense or if I have missed something! Thanks.

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-Overview.001.pdf, 
> MAPREDUCE-5951-trunk.016.patch, MAPREDUCE-5951-trunk.017.patch, 
> MAPREDUCE-5951-trunk.018.patch, MAPREDUCE-5951-trunk.019.patch, 
> MAPREDUCE-5951-trunk-020.patch, MAPREDUCE-5951-trunk-021.patch, 
> MAPREDUCE-5951-trunk-v10.patch, MAPREDUCE-5951-trunk-v11.patch, 
> MAPREDUCE-5951-trunk-v12.patch, MAPREDUCE-5951-trunk-v13.patch, 
> MAPREDUCE-5951-trunk-v14.patch, MAPREDUCE-5951-trunk-v15.patch, 
> MAPREDUCE-5951-trunk-v1.patch, MAPREDUCE-5951-trunk-v2.patch, 
> MAPREDUCE-5951-trunk-v3.patch, MAPREDUCE-5951-trunk-v4.patch, 
> MAPREDUCE-5951-trunk-v5.patch, MAPREDUCE-5951-trunk-v6.patch, 
> MAPREDUCE-5951-trunk-v7.patch, MAPREDUCE-5951-trunk-v8.patch, 
> MAPREDUCE-5951-trunk-v9.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).




[jira] [Commented] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2017-10-05 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193828#comment-16193828
 ] 

Chris Trezzo commented on MAPREDUCE-5951:
-

Hi [~sjlee0], [~kasha], [~mingma], [~jlowe], and [~vrushalic]!

I have made another push to get MapReduce support for the shared cache 
committed. I have rebased the patch, added documentation and fixed all 
warnings/issues that I have seen so far. At this point, I need a reviewer for the 
final review. I know this patch is a big one, so if there is anything I can do to 
make the review easier, or if someone else might be interested in reviewing it, 
please let me know. Some good news:

# Much of the patch has already been reviewed by [~kasha] [~sjlee0] and 
[~mingma] during previous iterations.
# I have ensured that the entire feature is behind a configuration switch. When 
disabled (the default), it has no effect on users (see the sketch after this list).
# I have functionally tested this patch on a pseudo distributed cluster.
# I have deployed this patch to a larger test cluster and run jobs with it.
# There is very similar code running in production that has been working for 
years at this point.
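As referenced in the list above, a minimal sketch of what opting a job into the feature could look like; the configuration key and values are illustrative assumptions, not quoted from the patch or its documentation:
{noformat}
// Sketch only: the key name and values are illustrative assumptions
// (org.apache.hadoop.conf.Configuration).
Configuration conf = new Configuration();
// Opt selected resource categories of this job into the shared cache:
conf.set("mapreduce.job.sharedcache.mode", "jobjar,libjars,files,archives");
// Leaving the switch at its default (disabled) keeps job submission behavior unchanged.
{noformat}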

My main goal is to commit this to trunk and branch-2 (2.9.0). If it can make it 
into branch-3.0 for GA that would be great as well, but I understand that the 
beta is already out ([~andrew.wang] please let me know what you think). Once I 
get a +1 on the patch, I would be happy to do the work to commit.

Thanks in advance for the help and effort. I really do appreciate it!

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-Overview.001.pdf, 
> MAPREDUCE-5951-trunk.016.patch, MAPREDUCE-5951-trunk.017.patch, 
> MAPREDUCE-5951-trunk.018.patch, MAPREDUCE-5951-trunk.019.patch, 
> MAPREDUCE-5951-trunk-020.patch, MAPREDUCE-5951-trunk-021.patch, 
> MAPREDUCE-5951-trunk-v10.patch, MAPREDUCE-5951-trunk-v11.patch, 
> MAPREDUCE-5951-trunk-v12.patch, MAPREDUCE-5951-trunk-v13.patch, 
> MAPREDUCE-5951-trunk-v14.patch, MAPREDUCE-5951-trunk-v15.patch, 
> MAPREDUCE-5951-trunk-v1.patch, MAPREDUCE-5951-trunk-v2.patch, 
> MAPREDUCE-5951-trunk-v3.patch, MAPREDUCE-5951-trunk-v4.patch, 
> MAPREDUCE-5951-trunk-v5.patch, MAPREDUCE-5951-trunk-v6.patch, 
> MAPREDUCE-5951-trunk-v7.patch, MAPREDUCE-5951-trunk-v8.patch, 
> MAPREDUCE-5951-trunk-v9.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).






[jira] [Comment Edited] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2017-10-05 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193573#comment-16193573
 ] 

Chris Trezzo edited comment on MAPREDUCE-5951 at 10/5/17 8:17 PM:
--

This is the javac warning:
bq. [WARNING] 
/testptch/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/LocalResourceBuilder.java:[34,44]
 [deprecation] DistributedCache in org.apache.hadoop.mapreduce.filecache has 
been deprecated

LocalResourceBuilder was a class added to fix a checkstyle warning. I have used 
{{@SuppressWarnings("deprecation")}} to silence the warnings around 
DistributedCache usage at the class level. This warning is complaining about 
the import statement. If anyone has an idea for how to apply the annotation to 
the import statement, please let me know. Furthermore, LocalResourceBuilder is 
simply a refactoring of the MRApps#parseDistributedCacheArtifacts method, so I do 
not think it makes sense to fix the usage of a deprecated interface in this 
patch, especially since it is used in many places.


was (Author: ctrezzo):
This is the javac warning:
{noformat}
[WARNING] 
/testptch/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/LocalResourceBuilder.java:[34,44]
 [deprecation] DistributedCache in org.apache.hadoop.mapreduce.filecache has 
been deprecated
{noformat}

LocalResourceBuilder was a class added to fix a checkstyle warning. I have used 
{{@SuppressWarnings("deprecation")}} to silence the warnings around 
DistributedCache usage at the class level. This warning is complaining about 
the import statement. If anyone has an idea for how to apply the annotation to 
the import statement, please let me know. Furthermore, LocalResourceBuilder is 
simply a refactoring of the MRApps#parseDistributedCacheArtifacts method, so I do 
not think it makes sense to fix the usage of a deprecated interface in this 
patch, especially since it is used in many places.

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-Overview.001.pdf, 
> MAPREDUCE-5951-trunk.016.patch, MAPREDUCE-5951-trunk.017.patch, 
> MAPREDUCE-5951-trunk.018.patch, MAPREDUCE-5951-trunk.019.patch, 
> MAPREDUCE-5951-trunk-020.patch, MAPREDUCE-5951-trunk-021.patch, 
> MAPREDUCE-5951-trunk-v10.patch, MAPREDUCE-5951-trunk-v11.patch, 
> MAPREDUCE-5951-trunk-v12.patch, MAPREDUCE-5951-trunk-v13.patch, 
> MAPREDUCE-5951-trunk-v14.patch, MAPREDUCE-5951-trunk-v15.patch, 
> MAPREDUCE-5951-trunk-v1.patch, MAPREDUCE-5951-trunk-v2.patch, 
> MAPREDUCE-5951-trunk-v3.patch, MAPREDUCE-5951-trunk-v4.patch, 
> MAPREDUCE-5951-trunk-v5.patch, MAPREDUCE-5951-trunk-v6.patch, 
> MAPREDUCE-5951-trunk-v7.patch, MAPREDUCE-5951-trunk-v8.patch, 
> MAPREDUCE-5951-trunk-v9.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).






[jira] [Commented] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2017-10-05 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193573#comment-16193573
 ] 

Chris Trezzo commented on MAPREDUCE-5951:
-

This is the javac warning:
{noformat}
[WARNING] 
/testptch/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/LocalResourceBuilder.java:[34,44]
 [deprecation] DistributedCache in org.apache.hadoop.mapreduce.filecache has 
been deprecated
{noformat}

LocalResourceBuilder was a class added to fix a checkstyle warning. I have used 
{{@SuppressWarnings("deprecation")}} to silence the warnings around 
DistributedCache usage at the class level. This warning is complaining about 
the import statement. If anyone has an idea for how to apply the annotation to 
the import statement, please let me know. Furthermore, LocalResourceBuilder is 
simply a refactoring of the MRApps#parseDistributedCacheArtifacts method, so I do 
not think it makes sense to fix the usage of a deprecated interface in this 
patch, especially since it is used in many places.
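For reference, a stand-in illustration of the situation (the class below is hypothetical, not the actual LocalResourceBuilder source): the class-level annotation covers uses inside the class body, but javac still reports the deprecated type where it is imported.
{noformat}
// Hypothetical stand-in, not the actual LocalResourceBuilder source.
import org.apache.hadoop.mapreduce.filecache.DistributedCache; // javac still flags this import

@SuppressWarnings("deprecation") // silences deprecation warnings inside the class body only
public class DeprecationExample {
  java.net.URI[] cacheFiles(org.apache.hadoop.conf.Configuration conf)
      throws java.io.IOException {
    return DistributedCache.getCacheFiles(conf); // no warning: covered by the class annotation
  }
}
{noformat}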

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-Overview.001.pdf, 
> MAPREDUCE-5951-trunk.016.patch, MAPREDUCE-5951-trunk.017.patch, 
> MAPREDUCE-5951-trunk.018.patch, MAPREDUCE-5951-trunk.019.patch, 
> MAPREDUCE-5951-trunk-020.patch, MAPREDUCE-5951-trunk-021.patch, 
> MAPREDUCE-5951-trunk-v10.patch, MAPREDUCE-5951-trunk-v11.patch, 
> MAPREDUCE-5951-trunk-v12.patch, MAPREDUCE-5951-trunk-v13.patch, 
> MAPREDUCE-5951-trunk-v14.patch, MAPREDUCE-5951-trunk-v15.patch, 
> MAPREDUCE-5951-trunk-v1.patch, MAPREDUCE-5951-trunk-v2.patch, 
> MAPREDUCE-5951-trunk-v3.patch, MAPREDUCE-5951-trunk-v4.patch, 
> MAPREDUCE-5951-trunk-v5.patch, MAPREDUCE-5951-trunk-v6.patch, 
> MAPREDUCE-5951-trunk-v7.patch, MAPREDUCE-5951-trunk-v8.patch, 
> MAPREDUCE-5951-trunk-v9.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).






[jira] [Updated] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2017-10-04 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-5951:

Attachment: MAPREDUCE-5951-trunk-021.patch

Attached is trunk v21. This fixes checkstyle issues and adds documentation.

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-Overview.001.pdf, 
> MAPREDUCE-5951-trunk.016.patch, MAPREDUCE-5951-trunk.017.patch, 
> MAPREDUCE-5951-trunk.018.patch, MAPREDUCE-5951-trunk.019.patch, 
> MAPREDUCE-5951-trunk-020.patch, MAPREDUCE-5951-trunk-021.patch, 
> MAPREDUCE-5951-trunk-v10.patch, MAPREDUCE-5951-trunk-v11.patch, 
> MAPREDUCE-5951-trunk-v12.patch, MAPREDUCE-5951-trunk-v13.patch, 
> MAPREDUCE-5951-trunk-v14.patch, MAPREDUCE-5951-trunk-v15.patch, 
> MAPREDUCE-5951-trunk-v1.patch, MAPREDUCE-5951-trunk-v2.patch, 
> MAPREDUCE-5951-trunk-v3.patch, MAPREDUCE-5951-trunk-v4.patch, 
> MAPREDUCE-5951-trunk-v5.patch, MAPREDUCE-5951-trunk-v6.patch, 
> MAPREDUCE-5951-trunk-v7.patch, MAPREDUCE-5951-trunk-v8.patch, 
> MAPREDUCE-5951-trunk-v9.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).






[jira] [Updated] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2017-10-02 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-5951:

Status: Patch Available  (was: In Progress)

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-Overview.001.pdf, 
> MAPREDUCE-5951-trunk.016.patch, MAPREDUCE-5951-trunk.017.patch, 
> MAPREDUCE-5951-trunk.018.patch, MAPREDUCE-5951-trunk.019.patch, 
> MAPREDUCE-5951-trunk-020.patch, MAPREDUCE-5951-trunk-v10.patch, 
> MAPREDUCE-5951-trunk-v11.patch, MAPREDUCE-5951-trunk-v12.patch, 
> MAPREDUCE-5951-trunk-v13.patch, MAPREDUCE-5951-trunk-v14.patch, 
> MAPREDUCE-5951-trunk-v15.patch, MAPREDUCE-5951-trunk-v1.patch, 
> MAPREDUCE-5951-trunk-v2.patch, MAPREDUCE-5951-trunk-v3.patch, 
> MAPREDUCE-5951-trunk-v4.patch, MAPREDUCE-5951-trunk-v5.patch, 
> MAPREDUCE-5951-trunk-v6.patch, MAPREDUCE-5951-trunk-v7.patch, 
> MAPREDUCE-5951-trunk-v8.patch, MAPREDUCE-5951-trunk-v9.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).






[jira] [Updated] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2017-10-02 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-5951:

Attachment: MAPREDUCE-5951-trunk-020.patch

Attached is a v20 trunk patch.

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-Overview.001.pdf, 
> MAPREDUCE-5951-trunk.016.patch, MAPREDUCE-5951-trunk.017.patch, 
> MAPREDUCE-5951-trunk.018.patch, MAPREDUCE-5951-trunk.019.patch, 
> MAPREDUCE-5951-trunk-020.patch, MAPREDUCE-5951-trunk-v10.patch, 
> MAPREDUCE-5951-trunk-v11.patch, MAPREDUCE-5951-trunk-v12.patch, 
> MAPREDUCE-5951-trunk-v13.patch, MAPREDUCE-5951-trunk-v14.patch, 
> MAPREDUCE-5951-trunk-v15.patch, MAPREDUCE-5951-trunk-v1.patch, 
> MAPREDUCE-5951-trunk-v2.patch, MAPREDUCE-5951-trunk-v3.patch, 
> MAPREDUCE-5951-trunk-v4.patch, MAPREDUCE-5951-trunk-v5.patch, 
> MAPREDUCE-5951-trunk-v6.patch, MAPREDUCE-5951-trunk-v7.patch, 
> MAPREDUCE-5951-trunk-v8.patch, MAPREDUCE-5951-trunk-v9.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).






[jira] [Commented] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2017-04-28 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989545#comment-15989545
 ] 

Chris Trezzo commented on MAPREDUCE-5951:
-

[~xkrogen]

bq. Is this so that the uploading to SCM can be done by the NM, which is a 
privileged user, to have more secure control over it?

Yes exactly. We wanted to ensure that only trusted entities (i.e. the SCM and 
the node manager) were modifying the shared cache directories in HDFS. 
Additionally, we wanted to make sure that the checksum used when adding a 
resource to the cache was computed by a trusted entity as well.

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-Overview.001.pdf, 
> MAPREDUCE-5951-trunk.016.patch, MAPREDUCE-5951-trunk.017.patch, 
> MAPREDUCE-5951-trunk.018.patch, MAPREDUCE-5951-trunk.019.patch, 
> MAPREDUCE-5951-trunk-v10.patch, MAPREDUCE-5951-trunk-v11.patch, 
> MAPREDUCE-5951-trunk-v12.patch, MAPREDUCE-5951-trunk-v13.patch, 
> MAPREDUCE-5951-trunk-v14.patch, MAPREDUCE-5951-trunk-v15.patch, 
> MAPREDUCE-5951-trunk-v1.patch, MAPREDUCE-5951-trunk-v2.patch, 
> MAPREDUCE-5951-trunk-v3.patch, MAPREDUCE-5951-trunk-v4.patch, 
> MAPREDUCE-5951-trunk-v5.patch, MAPREDUCE-5951-trunk-v6.patch, 
> MAPREDUCE-5951-trunk-v7.patch, MAPREDUCE-5951-trunk-v8.patch, 
> MAPREDUCE-5951-trunk-v9.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).






[jira] [Commented] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2017-04-27 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15987897#comment-15987897
 ] 

Chris Trezzo commented on MAPREDUCE-5951:
-

Thanks [~xkrogen] for the comment!

bq. is this an oversight, or is this behavior desired?

Originally we just left it private because we wanted to avoid having to change 
the staging directory and that portion of how MapReduce uploaded resources. As 
I am looking more at YARN-5727, I think it makes more sense to do this so that 
the resources are initially uploaded to a public place and explicitly set with 
a public visibility by the MapReduce client. I was thinking of potentially 
adding a public staging directory that is created and cleaned up by the 
MapReduce client along with the current staging directory. [~xkrogen] would you 
have any thoughts on this? [~jlowe] would you have any thoughts on this as well?

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-Overview.001.pdf, 
> MAPREDUCE-5951-trunk.016.patch, MAPREDUCE-5951-trunk.017.patch, 
> MAPREDUCE-5951-trunk.018.patch, MAPREDUCE-5951-trunk.019.patch, 
> MAPREDUCE-5951-trunk-v10.patch, MAPREDUCE-5951-trunk-v11.patch, 
> MAPREDUCE-5951-trunk-v12.patch, MAPREDUCE-5951-trunk-v13.patch, 
> MAPREDUCE-5951-trunk-v14.patch, MAPREDUCE-5951-trunk-v15.patch, 
> MAPREDUCE-5951-trunk-v1.patch, MAPREDUCE-5951-trunk-v2.patch, 
> MAPREDUCE-5951-trunk-v3.patch, MAPREDUCE-5951-trunk-v4.patch, 
> MAPREDUCE-5951-trunk-v5.patch, MAPREDUCE-5951-trunk-v6.patch, 
> MAPREDUCE-5951-trunk-v7.patch, MAPREDUCE-5951-trunk-v8.patch, 
> MAPREDUCE-5951-trunk-v9.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).






[jira] [Commented] (MAPREDUCE-6846) Fragments specified for libjar paths are not handled correctly

2017-04-06 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15959278#comment-15959278
 ] 

Chris Trezzo commented on MAPREDUCE-6846:
-

Thanks everyone for the reviews and commit!

> Fragments specified for libjar paths are not handled correctly
> --
>
> Key: MAPREDUCE-6846
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6846
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0, 2.7.3, 3.0.0-alpha2
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: MAPREDUCE-6846-trunk.001.patch, 
> MAPREDUCE-6846-trunk.002.patch, MAPREDUCE-6846-trunk.003.patch, 
> MAPREDUCE-6846-trunk.004.patch, MAPREDUCE-6846-trunk.005.patch, 
> MAPREDUCE-6846-trunk.006.patch
>
>
> If a user specifies a fragment for a libjars path via generic options parser, 
> the client crashes with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
> does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}
> This is actually inconsistent with the behavior for files and archives. Here 
> is a table showing the current behavior for each type of path and resource:
> | || Qualified path (i.e. file://home/mapred/test.txt#frag.txt) || Absolute 
> path (i.e. /home/mapred/test.txt#frag.txt) || Relative path (i.e. 
> test.txt#frag.txt) ||
> || -libjars | FileNotFound | FileNotFound | FileNotFound |
> || -files | (/) | (/) | (/) |
> || -archives | (/) | (/) | (/) |






[jira] [Commented] (MAPREDUCE-6824) TaskAttemptImpl#createCommonContainerLaunchContext is longer than 150 lines

2017-04-05 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957254#comment-15957254
 ] 

Chris Trezzo commented on MAPREDUCE-6824:
-

Thanks [~ajisakaa]!

> TaskAttemptImpl#createCommonContainerLaunchContext is longer than 150 lines
> ---
>
> Key: MAPREDUCE-6824
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6824
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Trivial
>  Labels: newbie
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: MAPREDUCE-6824-branch-2.001.patch, 
> MAPREDUCE-6824-trunk.001.patch, MAPREDUCE-6824-trunk.002.patch, 
> MAPREDUCE-6824-trunk.003.patch
>
>
> bq. 
> ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java:752:
>  private static ContainerLaunchContext createCommonContainerLaunchContext(:3: 
> Method length is 172 lines (max allowed is 150).
> {{TaskAttemptImpl#createCommonContainerLaunchContext}} is longer than 150 
> lines and needs to be refactored.






[jira] [Updated] (MAPREDUCE-6824) TaskAttemptImpl#createCommonContainerLaunchContext is longer than 150 lines

2017-04-04 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6824:

Attachment: MAPREDUCE-6824-branch-2.001.patch

Attached is a branch-2 backport patch.

> TaskAttemptImpl#createCommonContainerLaunchContext is longer than 150 lines
> ---
>
> Key: MAPREDUCE-6824
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6824
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Trivial
>  Labels: newbie
> Fix For: 3.0.0-alpha3
>
> Attachments: MAPREDUCE-6824-branch-2.001.patch, 
> MAPREDUCE-6824-trunk.001.patch, MAPREDUCE-6824-trunk.002.patch, 
> MAPREDUCE-6824-trunk.003.patch
>
>
> bq. 
> ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java:752:
>  private static ContainerLaunchContext createCommonContainerLaunchContext(:3: 
> Method length is 172 lines (max allowed is 150).
> {{TaskAttemptImpl#createCommonContainerLaunchContext}} is longer than 150 
> lines and needs to be refactored.






[jira] [Commented] (MAPREDUCE-6824) TaskAttemptImpl#createCommonContainerLaunchContext is longer than 150 lines

2017-04-04 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955610#comment-15955610
 ] 

Chris Trezzo commented on MAPREDUCE-6824:
-

I will post a branch-2 patch shortly.

> TaskAttemptImpl#createCommonContainerLaunchContext is longer than 150 lines
> ---
>
> Key: MAPREDUCE-6824
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6824
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Trivial
>  Labels: newbie
> Fix For: 3.0.0-alpha3
>
> Attachments: MAPREDUCE-6824-trunk.001.patch, 
> MAPREDUCE-6824-trunk.002.patch, MAPREDUCE-6824-trunk.003.patch
>
>
> bq. 
> ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java:752:
>  private static ContainerLaunchContext createCommonContainerLaunchContext(:3: 
> Method length is 172 lines (max allowed is 150).
> {{TaskAttemptImpl#createCommonContainerLaunchContext}} is longer than 150 
> lines and needs to be refactored.






[jira] [Commented] (MAPREDUCE-6824) TaskAttemptImpl#createCommonContainerLaunchContext is longer than 150 lines

2017-04-04 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955603#comment-15955603
 ] 

Chris Trezzo commented on MAPREDUCE-6824:
-

[~ajisakaa] is there any way we could backport this to branch-2 as well? It 
fixes a checkstyle issue for MAPREDUCE-5951 that will also hopefully make it to 
branch-2 shortly.

> TaskAttemptImpl#createCommonContainerLaunchContext is longer than 150 lines
> ---
>
> Key: MAPREDUCE-6824
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6824
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Trivial
>  Labels: newbie
> Fix For: 3.0.0-alpha3
>
> Attachments: MAPREDUCE-6824-trunk.001.patch, 
> MAPREDUCE-6824-trunk.002.patch, MAPREDUCE-6824-trunk.003.patch
>
>
> bq. 
> ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java:752:
>  private static ContainerLaunchContext createCommonContainerLaunchContext(:3: 
> Method length is 172 lines (max allowed is 150).
> {{TaskAttemptImpl#createCommonContainerLaunchContext}} is longer than 150 
> lines and needs to be refactored.






[jira] [Commented] (MAPREDUCE-6846) Fragments specified for libjar paths are not handled correctly

2017-04-03 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15954025#comment-15954025
 ] 

Chris Trezzo commented on MAPREDUCE-6846:
-

Any additional comments [~templedf]? Thanks!

> Fragments specified for libjar paths are not handled correctly
> --
>
> Key: MAPREDUCE-6846
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6846
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0, 2.7.3, 3.0.0-alpha2
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
> Attachments: MAPREDUCE-6846-trunk.001.patch, 
> MAPREDUCE-6846-trunk.002.patch, MAPREDUCE-6846-trunk.003.patch, 
> MAPREDUCE-6846-trunk.004.patch, MAPREDUCE-6846-trunk.005.patch, 
> MAPREDUCE-6846-trunk.006.patch
>
>
> If a user specifies a fragment for a libjars path via generic options parser, 
> the client crashes with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
> does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}
> This is actually inconsistent with the behavior for files and archives. Here 
> is a table showing the current behavior for each type of path and resource:
> | || Qualified path (i.e. file://home/mapred/test.txt#frag.txt) || Absolute 
> path (i.e. /home/mapred/test.txt#frag.txt) || Relative path (i.e. 
> test.txt#frag.txt) ||
> || -libjars | FileNotFound | FileNotFound | FileNotFound |
> || -files | (/) | (/) | (/) |
> || -archives | (/) | (/) | (/) |






[jira] [Commented] (MAPREDUCE-6824) TaskAttemptImpl#createCommonContainerLaunchContext is longer than 150 lines

2017-04-03 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15953886#comment-15953886
 ] 

Chris Trezzo commented on MAPREDUCE-6824:
-

Thanks [~ajisakaa] and [~haibochen]!

> TaskAttemptImpl#createCommonContainerLaunchContext is longer than 150 lines
> ---
>
> Key: MAPREDUCE-6824
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6824
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Trivial
>  Labels: newbie
> Fix For: 3.0.0-alpha3
>
> Attachments: MAPREDUCE-6824-trunk.001.patch, 
> MAPREDUCE-6824-trunk.002.patch, MAPREDUCE-6824-trunk.003.patch
>
>
> bq. 
> ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java:752:
>  private static ContainerLaunchContext createCommonContainerLaunchContext(:3: 
> Method length is 172 lines (max allowed is 150).
> {{TaskAttemptImpl#createCommonContainerLaunchContext}} is longer than 150 
> lines and needs to be refactored.






[jira] [Updated] (MAPREDUCE-6846) Fragments specified for libjar paths are not handled correctly

2017-03-31 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6846:

Attachment: MAPREDUCE-6846-trunk.006.patch

Thanks [~templedf] for the additional comments. Attached is v6 to address them.

> Fragments specified for libjar paths are not handled correctly
> --
>
> Key: MAPREDUCE-6846
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6846
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0, 2.7.3, 3.0.0-alpha2
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
> Attachments: MAPREDUCE-6846-trunk.001.patch, 
> MAPREDUCE-6846-trunk.002.patch, MAPREDUCE-6846-trunk.003.patch, 
> MAPREDUCE-6846-trunk.004.patch, MAPREDUCE-6846-trunk.005.patch, 
> MAPREDUCE-6846-trunk.006.patch
>
>
> If a user specifies a fragment for a libjars path via generic options parser, 
> the client crashes with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
> does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}
> This is actually inconsistent with the behavior for files and archives. Here 
> is a table showing the current behavior for each type of path and resource:
> | || Qualified path (i.e. file://home/mapred/test.txt#frag.txt) || Absolute 
> path (i.e. /home/mapred/test.txt#frag.txt) || Relative path (i.e. 
> test.txt#frag.txt) ||
> || -libjars | FileNotFound | FileNotFound | FileNotFound |
> || -files | (/) | (/) | (/) |
> || -archives | (/) | (/) | (/) |






[jira] [Updated] (MAPREDUCE-6824) TaskAttemptImpl#createCommonContainerLaunchContext is longer than 150 lines

2017-03-31 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6824:

Attachment: MAPREDUCE-6824-trunk.003.patch

[~haibochen], good point: I probably should have included the environment conf as 
well. Thanks! Attached is a v3 that breaks that out into a separate method.

> TaskAttemptImpl#createCommonContainerLaunchContext is longer than 150 lines
> ---
>
> Key: MAPREDUCE-6824
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6824
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Trivial
>  Labels: newbie
> Attachments: MAPREDUCE-6824-trunk.001.patch, 
> MAPREDUCE-6824-trunk.002.patch, MAPREDUCE-6824-trunk.003.patch
>
>
> bq. 
> ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java:752:
>  private static ContainerLaunchContext createCommonContainerLaunchContext(:3: 
> Method length is 172 lines (max allowed is 150).
> {{TaskAttemptImpl#createCommonContainerLaunchContext}} is longer than 150 
> lines and needs to be refactored.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6846) Fragments specified for libjar paths are not handled correctly

2017-03-31 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6846:

Attachment: MAPREDUCE-6846-trunk.005.patch

V5 posted to fix whitespace.

> Fragments specified for libjar paths are not handled correctly
> --
>
> Key: MAPREDUCE-6846
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6846
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0, 2.7.3, 3.0.0-alpha2
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
> Attachments: MAPREDUCE-6846-trunk.001.patch, 
> MAPREDUCE-6846-trunk.002.patch, MAPREDUCE-6846-trunk.003.patch, 
> MAPREDUCE-6846-trunk.004.patch, MAPREDUCE-6846-trunk.005.patch
>
>
> If a user specifies a fragment for a libjars path via generic options parser, 
> the client crashes with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
> does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}
> This is actually inconsistent with the behavior for files and archives. Here 
> is a table showing the current behavior for each type of path and resource:
> | || Qualified path (i.e. file://home/mapred/test.txt#frag.txt) || Absolute 
> path (i.e. /home/mapred/test.txt#frag.txt) || Relative path (i.e. 
> test.txt#frag.txt) ||
> || -libjars | FileNotFound | FileNotFound|FileNotFound|
> || -files | (/) | (/) | (/) |
> || -archives | (/) | (/) | (/) |



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6846) Fragments specified for libjar paths are not handled correctly

2017-03-30 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6846:

Attachment: MAPREDUCE-6846-trunk.004.patch

Attached is a v4 rebase on trunk. Thanks!

> Fragments specified for libjar paths are not handled correctly
> --
>
> Key: MAPREDUCE-6846
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6846
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0, 2.7.3, 3.0.0-alpha2
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
> Attachments: MAPREDUCE-6846-trunk.001.patch, 
> MAPREDUCE-6846-trunk.002.patch, MAPREDUCE-6846-trunk.003.patch, 
> MAPREDUCE-6846-trunk.004.patch
>
>
> If a user specifies a fragment for a libjars path via generic options parser, 
> the client crashes with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
> does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}
> This is actually inconsistent with the behavior for files and archives. Here 
> is a table showing the current behavior for each type of path and resource:
> | || Qualified path (i.e. file://home/mapred/test.txt#frag.txt) || Absolute 
> path (i.e. /home/mapred/test.txt#frag.txt) || Relative path (i.e. 
> test.txt#frag.txt) ||
> || -libjars | FileNotFound | FileNotFound|FileNotFound|
> || -files | (/) | (/) | (/) |
> || -archives | (/) | (/) | (/) |



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6846) Fragments specified for libjar paths are not handled correctly

2017-03-30 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15950152#comment-15950152
 ] 

Chris Trezzo commented on MAPREDUCE-6846:
-

Thanks [~mingma]! The DistributedCache.addCacheFile call is necessary because 
the call to DistributedCache.addFileToClassPath in the same method no longer 
adds the URIs to the distributed cache. Note that the last parameter is now set 
to false instead of !wildcard. Please let me know if I have missed something.
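
To make the interplay concrete, here is a minimal sketch of the two calls 
described above. The helper name and imports are my best reading of the code 
under discussion, not the patch itself; in particular it assumes the 
four-argument addFileToClassPath overload referenced above, whose last parameter 
controls whether the path is also added to the distributed cache.
{code}
import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.filecache.DistributedCache;

// Sketch only, not the patch: register a libjar on the task classpath and,
// separately, in the distributed cache so a user-supplied fragment is honored.
public class LibJarRegistrationSketch {
  static void addLibJar(URI pathURI, Configuration conf, FileSystem jtFs)
      throws IOException {
    // Last argument false: classpath only, no implicit distributed-cache entry.
    DistributedCache.addFileToClassPath(new Path(pathURI.getPath()), conf,
        jtFs, false);
    // The cache entry is added explicitly, keeping the fragment so the symlink
    // name matches what the job expects at localization time.
    DistributedCache.addCacheFile(pathURI, conf);
  }
}
{code}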

> Fragments specified for libjar paths are not handled correctly
> --
>
> Key: MAPREDUCE-6846
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6846
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0, 2.7.3, 3.0.0-alpha2
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
> Attachments: MAPREDUCE-6846-trunk.001.patch, 
> MAPREDUCE-6846-trunk.002.patch, MAPREDUCE-6846-trunk.003.patch
>
>
> If a user specifies a fragment for a libjars path via generic options parser, 
> the client crashes with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
> does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}
> This is actually inconsistent with the behavior for files and archives. Here 
> is a table showing the current behavior for each type of path and resource:
> | || Qualified path (i.e. file://home/mapred/test.txt#frag.txt) || Absolute 
> path (i.e. /home/mapred/test.txt#frag.txt) || Relative path (i.e. 
> test.txt#frag.txt) ||
> || -libjars | FileNotFound | FileNotFound|FileNotFound|
> || -files | (/) | (/) | (/) |
> || -archives | (/) | (/) | (/) |



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org

[jira] [Updated] (MAPREDUCE-6824) TaskAttemptImpl#createCommonContainerLaunchContext is longer than 150 lines

2017-03-30 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6824:

Attachment: MAPREDUCE-6824-trunk.002.patch

Attached v2 patch for trunk. This patch fixes the findbugs warning and failed 
test. Thanks!

> TaskAttemptImpl#createCommonContainerLaunchContext is longer than 150 lines
> ---
>
> Key: MAPREDUCE-6824
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6824
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Trivial
>  Labels: newbie
> Attachments: MAPREDUCE-6824-trunk.001.patch, 
> MAPREDUCE-6824-trunk.002.patch
>
>
> bq. 
> ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java:752:
>  private static ContainerLaunchContext createCommonContainerLaunchContext(:3: 
> Method length is 172 lines (max allowed is 150).
> {{TaskAttemptImpl#createCommonContainerLaunchContext}} is longer than 150 
> lines and needs to be refactored.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6862) Fragments are not handled correctly by resource limit checking

2017-03-29 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15948204#comment-15948204
 ] 

Chris Trezzo commented on MAPREDUCE-6862:
-

Thanks [~mingma]!

> Fragments are not handled correctly by resource limit checking
> --
>
> Key: MAPREDUCE-6862
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6862
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0-alpha1
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: MAPREDUCE-6862-trunk.001.patch
>
>
> If a user specifies a fragment for a libjar, files, archives path via generic 
> options parser and resource limit checking is enabled, the client crashes 
> with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
> does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.getFileStatus(JobResourceUploader.java:413)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.explorePath(JobResourceUploader.java:395)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.checkLocalizationLimits(JobResourceUploader.java:304)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:103)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org




[jira] [Updated] (MAPREDUCE-6824) TaskAttemptImpl#createCommonContainerLaunchContext is longer than 150 lines

2017-03-29 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6824:

Status: Patch Available  (was: Open)

> TaskAttemptImpl#createCommonContainerLaunchContext is longer than 150 lines
> ---
>
> Key: MAPREDUCE-6824
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6824
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Trivial
>  Labels: newbie
> Attachments: MAPREDUCE-6824-trunk.001.patch
>
>
> bq. 
> ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java:752:
>  private static ContainerLaunchContext createCommonContainerLaunchContext(:3: 
> Method length is 172 lines (max allowed is 150).
> {{TaskAttemptImpl#createCommonContainerLaunchContext}} is longer than 150 
> lines and needs to be refactored.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6824) TaskAttemptImpl#createCommonContainerLaunchContext is longer than 150 lines

2017-03-29 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6824:

Attachment: MAPREDUCE-6824-trunk.001.patch

Attached is a v1 patch for trunk. I have split the following out into separate 
methods:
# Job jar configuration.
# Job conf configuration.
# Token setup.
# External shuffle provider setup.

I did not modify any unit tests because this is a straightforward refactor. 
Existing coverage should handle this change.
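
As a rough illustration of the shape of the change, here is a hedged sketch of 
the extract-method pattern applied here. The helper names and simplified types 
are invented for the sketch and stand in for the real YARN/MapReduce classes; 
only the structure mirrors the refactor.
{code}
import java.util.HashMap;
import java.util.Map;

// Sketch only: a long "create context" method reduced to a short coordinator
// that delegates each concern to a focused helper, as in the v1 patch outline.
public class LaunchContextRefactorSketch {

  static Map<String, String> createCommonContext(Map<String, String> conf) {
    Map<String, String> ctx = new HashMap<>();
    configureJobJar(conf, ctx);
    configureJobConf(conf, ctx);
    setupTokens(conf, ctx);
    setupExternalShuffleProviders(conf, ctx);
    return ctx;
  }

  private static void configureJobJar(Map<String, String> c,
      Map<String, String> ctx) {
    ctx.put("resource.jobjar", c.getOrDefault("job.jar", "<none>"));
  }

  private static void configureJobConf(Map<String, String> c,
      Map<String, String> ctx) {
    ctx.put("resource.jobconf", c.getOrDefault("job.xml", "<none>"));
  }

  private static void setupTokens(Map<String, String> c,
      Map<String, String> ctx) {
    ctx.put("security.tokens", c.getOrDefault("tokens", "<none>"));
  }

  private static void setupExternalShuffleProviders(Map<String, String> c,
      Map<String, String> ctx) {
    ctx.put("aux.shuffle.providers",
        c.getOrDefault("shuffle.providers", "<default>"));
  }
}
{code}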

> TaskAttemptImpl#createCommonContainerLaunchContext is longer than 150 lines
> ---
>
> Key: MAPREDUCE-6824
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6824
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Trivial
>  Labels: newbie
> Attachments: MAPREDUCE-6824-trunk.001.patch
>
>
> bq. 
> ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java:752:
>  private static ContainerLaunchContext createCommonContainerLaunchContext(:3: 
> Method length is 172 lines (max allowed is 150).
> {{TaskAttemptImpl#createCommonContainerLaunchContext}} is longer than 150 
> lines and needs to be refactored.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6824) TaskAttemptImpl#createCommonContainerLaunchContext is longer than 150 lines

2017-03-29 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15948178#comment-15948178
 ] 

Chris Trezzo commented on MAPREDUCE-6824:
-

Taking this task, as it is causing a checkstyle issue for MAPREDUCE-5951.

> TaskAttemptImpl#createCommonContainerLaunchContext is longer than 150 lines
> ---
>
> Key: MAPREDUCE-6824
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6824
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Trivial
>  Labels: newbie
>
> bq. 
> ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java:752:
>  private static ContainerLaunchContext createCommonContainerLaunchContext(:3: 
> Method length is 172 lines (max allowed is 150).
> {{TaskAttemptImpl#createCommonContainerLaunchContext}} is longer than 150 
> lines and needs to be refactored.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Assigned] (MAPREDUCE-6824) TaskAttemptImpl#createCommonContainerLaunchContext is longer than 150 lines

2017-03-29 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo reassigned MAPREDUCE-6824:
---

Assignee: Chris Trezzo  (was: Udai Kiran Potluri)

> TaskAttemptImpl#createCommonContainerLaunchContext is longer than 150 lines
> ---
>
> Key: MAPREDUCE-6824
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6824
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Trivial
>  Labels: newbie
>
> bq. 
> ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java:752:
>  private static ContainerLaunchContext createCommonContainerLaunchContext(:3: 
> Method length is 172 lines (max allowed is 150).
> {{TaskAttemptImpl#createCommonContainerLaunchContext}} is longer than 150 
> lines and needs to be refactored.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Work started] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2017-03-29 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on MAPREDUCE-5951 started by Chris Trezzo.
---
> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-Overview.001.pdf, 
> MAPREDUCE-5951-trunk.016.patch, MAPREDUCE-5951-trunk.017.patch, 
> MAPREDUCE-5951-trunk.018.patch, MAPREDUCE-5951-trunk.019.patch, 
> MAPREDUCE-5951-trunk-v10.patch, MAPREDUCE-5951-trunk-v11.patch, 
> MAPREDUCE-5951-trunk-v12.patch, MAPREDUCE-5951-trunk-v13.patch, 
> MAPREDUCE-5951-trunk-v14.patch, MAPREDUCE-5951-trunk-v15.patch, 
> MAPREDUCE-5951-trunk-v1.patch, MAPREDUCE-5951-trunk-v2.patch, 
> MAPREDUCE-5951-trunk-v3.patch, MAPREDUCE-5951-trunk-v4.patch, 
> MAPREDUCE-5951-trunk-v5.patch, MAPREDUCE-5951-trunk-v6.patch, 
> MAPREDUCE-5951-trunk-v7.patch, MAPREDUCE-5951-trunk-v8.patch, 
> MAPREDUCE-5951-trunk-v9.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2017-03-29 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15948071#comment-15948071
 ] 

Chris Trezzo commented on MAPREDUCE-5951:
-

This issue is currently waiting on MAPREDUCE-6862 and MAPREDUCE-6846. I would 
like to get those two patches in so I don't have to rebase this multiple times.

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-Overview.001.pdf, 
> MAPREDUCE-5951-trunk.016.patch, MAPREDUCE-5951-trunk.017.patch, 
> MAPREDUCE-5951-trunk.018.patch, MAPREDUCE-5951-trunk.019.patch, 
> MAPREDUCE-5951-trunk-v10.patch, MAPREDUCE-5951-trunk-v11.patch, 
> MAPREDUCE-5951-trunk-v12.patch, MAPREDUCE-5951-trunk-v13.patch, 
> MAPREDUCE-5951-trunk-v14.patch, MAPREDUCE-5951-trunk-v15.patch, 
> MAPREDUCE-5951-trunk-v1.patch, MAPREDUCE-5951-trunk-v2.patch, 
> MAPREDUCE-5951-trunk-v3.patch, MAPREDUCE-5951-trunk-v4.patch, 
> MAPREDUCE-5951-trunk-v5.patch, MAPREDUCE-5951-trunk-v6.patch, 
> MAPREDUCE-5951-trunk-v7.patch, MAPREDUCE-5951-trunk-v8.patch, 
> MAPREDUCE-5951-trunk-v9.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6846) Fragments specified for libjar paths are not handled correctly

2017-03-29 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15948022#comment-15948022
 ] 

Chris Trezzo commented on MAPREDUCE-6846:
-

Any other comments or suggestions on this, [~templedf] [~jlowe] [~sjlee0]? I 
have tested this on a pseudo-distributed cluster as well. Thanks again for the 
reviews!

> Fragments specified for libjar paths are not handled correctly
> --
>
> Key: MAPREDUCE-6846
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6846
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0, 2.7.3, 3.0.0-alpha2
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
> Attachments: MAPREDUCE-6846-trunk.001.patch, 
> MAPREDUCE-6846-trunk.002.patch, MAPREDUCE-6846-trunk.003.patch
>
>
> If a user specifies a fragment for a libjars path via generic options parser, 
> the client crashes with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
> does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}
> This is actually inconsistent with the behavior for files and archives. Here 
> is a table showing the current behavior for each type of path and resource:
> | || Qualified path (i.e. file://home/mapred/test.txt#frag.txt) || Absolute 
> path (i.e. /home/mapred/test.txt#frag.txt) || Relative path (i.e. 
> test.txt#frag.txt) ||
> || -libjars | FileNotFound | FileNotFound|FileNotFound|
> || -files | (/) | (/) | (/) |
> || -archives | (/) | (/) | (/) |



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6862) Fragments are not handled correctly by resource limit checking

2017-03-27 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6862:

Status: Patch Available  (was: Open)

> Fragments are not handled correctly by resource limit checking
> --
>
> Key: MAPREDUCE-6862
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6862
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1, 2.9.0
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
> Attachments: MAPREDUCE-6862-trunk.001.patch
>
>
> If a user specifies a fragment for a libjar, files, archives path via generic 
> options parser and resource limit checking is enabled, the client crashes 
> with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
> does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.getFileStatus(JobResourceUploader.java:413)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.explorePath(JobResourceUploader.java:395)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.checkLocalizationLimits(JobResourceUploader.java:304)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:103)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6862) Fragments are not handled correctly by resource limit checking

2017-03-27 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6862:

Attachment: MAPREDUCE-6862-trunk.001.patch

Attached is a v1 for trunk. This patch does the following:
# Fixes the conversion from string to Path when fragments are present, for the 
limit checker in JobResourceUploader (a sketch follows below).
# Adds a unit test.
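
Here is the sketch referred to in item 1: a minimal, self-contained illustration 
of the conversion problem, using the path from the stack trace. The exact helper 
used in the patch may differ.
{code}
import java.net.URI;
import java.net.URISyntaxException;
import org.apache.hadoop.fs.Path;

// Sketch only: why a fragment-bearing resource string cannot be handed to
// Path/getFileStatus directly.
public class FragmentPathSketch {
  public static void main(String[] args) throws URISyntaxException {
    String resource = "/home/mapred/test.txt#testFrag.txt";

    // Naive conversion keeps the fragment, so the file system is later asked
    // for a file literally named "test.txt#testFrag.txt" -> FileNotFoundException.
    Path broken = new Path(resource);

    // Stripping the fragment first yields the real on-disk path.
    URI uri = new URI(resource);
    Path fixed = new Path(uri.getPath());

    System.out.println("broken: " + broken); // /home/mapred/test.txt#testFrag.txt
    System.out.println("fixed:  " + fixed);  // /home/mapred/test.txt
  }
}
{code}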

> Fragments are not handled correctly by resource limit checking
> --
>
> Key: MAPREDUCE-6862
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6862
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0-alpha1
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
> Attachments: MAPREDUCE-6862-trunk.001.patch
>
>
> If a user specifies a fragment for a libjar, files, archives path via generic 
> options parser and resource limit checking is enabled, the client crashes 
> with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
> does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.getFileStatus(JobResourceUploader.java:413)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.explorePath(JobResourceUploader.java:395)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.checkLocalizationLimits(JobResourceUploader.java:304)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:103)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6846) Fragments specified for libjar paths are not handled correctly

2017-03-24 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6846:

Attachment: MAPREDUCE-6846-trunk.003.patch

Attached is a trunk v3 patch addressing [~templedf]'s comments. Here are some 
high-level changes:
# In JobResourceUploader I modified IOException messages to explicitly call out 
URISyntaxException.
# Added a makeQualified call to JobResourceUploader#copyRemoteFiles, since all 
newPaths returned from that method should be qualified using the jtFs (a sketch 
follows below).
# Refactored both TestJobResourceUploader#ResourceConf inner classes into a 
single class.
# In TestJobResourceUploader I reverted the earlier change that replaced 
Assert.fail calls with assertTrue/assertFalse calls.
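
Here is the sketch referred to in item 2: a small hedged example of what 
makeQualified does. The staging path and local file system here are illustrative 
only, not taken from the patch.
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch only: qualifying a returned path so later consumers see a full
// scheme://authority/path instead of a bare, scheme-less one.
public class QualifySketch {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    FileSystem jtFs = FileSystem.get(conf); // stands in for the submit-cluster FS

    Path newPath = new Path("/user/mapred/.staging/job_1234/libjars/mylib.jar");
    Path qualified = jtFs.makeQualified(newPath);

    // Prints e.g. file:/user/... locally, or hdfs://<namenode>/user/... on a cluster.
    System.out.println(qualified);
  }
}
{code}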

> Fragments specified for libjar paths are not handled correctly
> --
>
> Key: MAPREDUCE-6846
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6846
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0, 2.7.3, 3.0.0-alpha2
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
> Attachments: MAPREDUCE-6846-trunk.001.patch, 
> MAPREDUCE-6846-trunk.002.patch, MAPREDUCE-6846-trunk.003.patch
>
>
> If a user specifies a fragment for a libjars path via generic options parser, 
> the client crashes with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
> does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}
> This is actually inconsistent with the behavior for files and archives. Here 
> is a table showing the current behavior for each type of path and resource:
> | || Qualified path (i.e. file://home/mapred/test.txt#frag.txt) || Absolute 
> path (i.e. /home/mapred/test.txt#frag.txt) || Relative path (i.e. 
> test.txt#frag.txt) ||
> || -libjars | FileNotFound | FileNotFound|FileNotFound|
> || -files | (/) | (/) | (/) |
> || -archives | (/) | (/) | (/) |

[jira] [Commented] (MAPREDUCE-6846) Fragments specified for libjar paths are not handled correctly

2017-03-24 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15941256#comment-15941256
 ] 

Chris Trezzo commented on MAPREDUCE-6846:
-

Thanks for the review [~templedf]!

Quick question:

bq. It seems odd to create pathURI and then do nothing with it that you 
couldn't do with tmpURI until the end.

Can we actually use tmpURI in this case? It seems as though the URIs/paths we 
submit to the DistributedCache#addFileToClassPath and 
DistributedCache#addCacheFile methods should match. This is so that the symlink 
is correctly resolved in MRApps#addToClasspathIfNotJar for libjars that are not 
jars.

My understanding is that we need to use the path returned by copyRemoteFiles() 
for DistributedCache#addCacheFile, otherwise the resource will not be found 
during localization. Because of this, we also need the pathURI so that the 
paths match and we honor user-supplied fragments. I can move the 
addFileToClassPath call to the top, but I would still need the pathURI. Is this 
what you had in mind?
{code}
// Excerpt from the patch; the matching catch/finally block is elided here.
Path newPath =
    copyRemoteFiles(libjarsDir, tmp, conf, submitReplication);
try {
  URI pathURI = getPathURI(newPath, tmpURI.getFragment());
  DistributedCache.addFileToClassPath(new Path(pathURI.getPath()), conf,
      jtFs, false);
  if (!foundFragment) {
    foundFragment = pathURI.getFragment() != null;
  }
  libjarURIs.add(pathURI);
}
{code}

Please let me know if I am missing something! Thanks!

> Fragments specified for libjar paths are not handled correctly
> --
>
> Key: MAPREDUCE-6846
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6846
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0, 2.7.3, 3.0.0-alpha2
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
> Attachments: MAPREDUCE-6846-trunk.001.patch, 
> MAPREDUCE-6846-trunk.002.patch
>
>
> If a user specifies a fragment for a libjars path via generic options parser, 
> the client crashes with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
> does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> 

[jira] [Commented] (MAPREDUCE-4686) hadoop-mapreduce-client-core fails compilation in Eclipse due to missing Avro-generated classes

2017-03-22 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15937215#comment-15937215
 ] 

Chris Trezzo commented on MAPREDUCE-4686:
-

[~cnauroth] I noticed that this patch was removed from trunk by HADOOP-9304. It 
seems like Eclipse m2e builds still need it. I am currently building with 
Eclipse Neon.2 Release (4.6.2) on Mac 10.12. With a fresh project import, I 
still need to add the generated-source directory or the m2e build will not work.

Is there any way you could add this commit back to trunk? Thanks!

> hadoop-mapreduce-client-core fails compilation in Eclipse due to missing 
> Avro-generated classes
> ---
>
> Key: MAPREDUCE-4686
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4686
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 3.0.0-alpha1
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Fix For: 3.0.0-alpha1
>
> Attachments: HADOOP-8848.patch
>
>
> After importing all of hadoop-common trunk into Eclipse with the m2e plugin, 
> the Avro-generated classes in hadoop-mapreduce-client-core don't show up on 
> Eclipse's classpath.  This causes compilation errors for anything that 
> depends on those classes.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6846) Fragments specified for libjar paths are not handled correctly

2017-03-14 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6846:

Attachment: MAPREDUCE-6846-trunk.002.patch

Attaching v2 for trunk.

This version does the following:
# Addresses [~sjlee0]'s comments.
# Adds two tests to cover wildcards.
# Replaces some Assert.fail calls in TestJobResourceUploader with more specific 
assertTrue or assertFalse calls.
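
For context on the third change, here is a small hedged before/after sketch of 
the assertion style, assuming JUnit 4 as used by these tests; the condition and 
message are invented.
{code}
import static org.junit.Assert.assertTrue;
import static org.junit.Assert.fail;

// Sketch only: the style change described above, with an invented condition.
public class AssertionStyleSketch {

  // Before: a guarded Assert.fail hides which condition actually broke.
  static void checkBefore(boolean foundFragment) {
    if (!foundFragment) {
      fail("expected a fragment on the libjar path");
    }
  }

  // After: a direct assertion reports the failing condition itself.
  static void checkAfter(boolean foundFragment) {
    assertTrue("expected a fragment on the libjar path", foundFragment);
  }
}
{code}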

> Fragments specified for libjar paths are not handled correctly
> --
>
> Key: MAPREDUCE-6846
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6846
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0, 2.7.3, 3.0.0-alpha2
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
> Attachments: MAPREDUCE-6846-trunk.001.patch, 
> MAPREDUCE-6846-trunk.002.patch
>
>
> If a user specifies a fragment for a libjars path via generic options parser, 
> the client crashes with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
> does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}
> This is actually inconsistent with the behavior for files and archives. Here 
> is a table showing the current behavior for each type of path and resource:
> | || Qualified path (i.e. file://home/mapred/test.txt#frag.txt) || Absolute 
> path (i.e. /home/mapred/test.txt#frag.txt) || Relative path (i.e. 
> test.txt#frag.txt) ||
> || -libjars | FileNotFound | FileNotFound|FileNotFound|
> || -files | (/) | (/) | (/) |
> || -archives | (/) | (/) | (/) |



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6846) Fragments specified for libjar paths are not handled correctly

2017-03-13 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6846:

Affects Version/s: 2.6.0

> Fragments specified for libjar paths are not handled correctly
> --
>
> Key: MAPREDUCE-6846
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6846
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0, 2.7.3, 3.0.0-alpha2
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
> Attachments: MAPREDUCE-6846-trunk.001.patch
>
>
> If a user specifies a fragment for a libjars path via generic options parser, 
> the client crashes with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
> does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}
> This is actually inconsistent with the behavior for files and archives. Here 
> is a table showing the current behavior for each type of path and resource:
> | || Qualified path (i.e. file://home/mapred/test.txt#frag.txt) || Absolute 
> path (i.e. /home/mapred/test.txt#frag.txt) || Relative path (i.e. 
> test.txt#frag.txt) ||
> || -libjars | FileNotFound | FileNotFound|FileNotFound|
> || -files | (/) | (/) | (/) |
> || -archives | (/) | (/) | (/) |



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6846) Fragments specified for libjar paths are not handled correctly

2017-03-13 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15922836#comment-15922836
 ] 

Chris Trezzo commented on MAPREDUCE-6846:
-

Thanks [~sjlee0] for the review!

bq. Just curious, was this always broken, or is it a regression?
I took a look through various branches and saw that it has been broken at least 
as far back as branch-2.0.0-alpha, so it seems this has been broken for a very 
long time.

bq. Can we not add the fragment if the fragment is null and the original URL 
does not have the fragment either?
I can do that. That is the existing behavior prior to this patch, but it seems 
like a low-risk change.

bq. I'm wondering, would it lead to simpler code if we iterate over the list 
once to determine whether there is a fragment and iterate again to do the 
population of the distributed cache? It might be slightly more expensive, but 
may lead to code that's easier to understand/maintain. Thoughts?
Ack. I will do this as well. Originally I avoided this approach because it is 
slightly more expensive (as you stated above), but I think you are right: the 
small performance gain probably does not outweigh the added complexity.

I will post an updated patch. Thanks!

> Fragments specified for libjar paths are not handled correctly
> --
>
> Key: MAPREDUCE-6846
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6846
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.7.3, 3.0.0-alpha2
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
> Attachments: MAPREDUCE-6846-trunk.001.patch
>
>
> If a user specifies a fragment for a libjars path via generic options parser, 
> the client crashes with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
> does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}
> This is actually 

[jira] [Updated] (MAPREDUCE-6846) Fragments specified for libjar paths are not handled correctly

2017-03-09 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6846:

Status: Patch Available  (was: Open)

> Fragments specified for libjar paths are not handled correctly
> --
>
> Key: MAPREDUCE-6846
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6846
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha2, 2.7.3
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
> Attachments: MAPREDUCE-6846-trunk.001.patch
>
>
> If a user specifies a fragment for a libjars path via generic options parser, 
> the client crashes with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
> does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}
> This is actually inconsistent with the behavior for files and archives. Here 
> is a table showing the current behavior for each type of path and resource:
> | || Qualified path (i.e. file://home/mapred/test.txt#frag.txt) || Absolute 
> path (i.e. /home/mapred/test.txt#frag.txt) || Relative path (i.e. 
> test.txt#frag.txt) ||
> || -libjars | FileNotFound | FileNotFound|FileNotFound|
> || -files | (/) | (/) | (/) |
> || -archives | (/) | (/) | (/) |



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6846) Fragments specified for libjar paths are not handled correctly

2017-03-09 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6846:

Attachment: MAPREDUCE-6846-trunk.001.patch

Attached is a v1 patch for trunk. Here is a summary of the changes:
# Modified JobResourceUploader#uploadLibJars to handle fragments using URIs 
instead of Paths.
# Modified JobResourceUploader#uploadLibJars to work with the wildcard feature. 
We now keep track of the paths and only use the wildcard feature if there are no 
fragments specified for any individual libjar resource (see the sketch after 
this list).
# Added tests to TestJobResourceUploader to verify that paths are handled 
correctly for libjars, files, archives and the jobjar. These tests verify 
behavior with different combinations of path types, for example fragments/no 
fragments, schemes/no schemes, and absolute vs. relative paths.
# Created a new builder in TestJobResourceUploader to make writing new tests 
easier. This splits the resource limit test configuration out into a separate 
conf that extends the new base resource configuration. With these confs it is 
easy to set up a unit test with different numbers of resources and path types.

I will kick hadoop qa for a run.
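As a rough illustration of the fragment check that gates the wildcard feature (a 
hedged sketch only; the method name anyFragment and the surrounding plumbing are 
hypothetical and not taken from the patch):
{code:java}
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Arrays;
import java.util.Collection;

public class LibJarFragmentSketch {
  // Returns true if any libjar entry carries a URI fragment (e.g. "b.jar#renamed.jar").
  static boolean anyFragment(Collection<String> libjars) throws URISyntaxException {
    for (String libjar : libjars) {
      if (new URI(libjar).getFragment() != null) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) throws Exception {
    Collection<String> libjars =
        Arrays.asList("/home/mapred/a.jar", "/home/mapred/b.jar#renamed.jar");
    // Only glob the staging libjars directory (the wildcard feature) when no
    // entry has a fragment; otherwise each entry is added individually so the
    // fragment can be honored as the localized link name.
    boolean useWildcard = !anyFragment(libjars);
    System.out.println("useWildcard = " + useWildcard); // prints false for this input
  }
}
{code}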

> Fragments specified for libjar paths are not handled correctly
> --
>
> Key: MAPREDUCE-6846
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6846
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.7.3, 3.0.0-alpha2
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
> Attachments: MAPREDUCE-6846-trunk.001.patch
>
>
> If a user specifies a fragment for a libjars path via generic options parser, 
> the client crashes with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
> does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}
> This is actually 

[jira] [Comment Edited] (MAPREDUCE-6862) Fragments are not handled correctly by resource limit checking

2017-03-09 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904061#comment-15904061
 ] 

Chris Trezzo edited comment on MAPREDUCE-6862 at 3/9/17 11:32 PM:
--

Working on a fix now. The issue is that the JobResourceUploader#explorePath 
method is Path based instead of URI based. When a Path is created from the 
string passed in via GenericOptionsParser, the fragment is not handled 
appropriately (see the sketch below).
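As a rough illustration of the distinction (a sketch using plain java.net.URI 
only; the real fix goes through Hadoop's own URI/Path handling):
{code:java}
import java.net.URI;

public class FragmentSplitSketch {
  public static void main(String[] args) throws Exception {
    // The string handed to the client via GenericOptionsParser.
    String arg = "file:/home/mapred/test.txt#testFrag.txt";

    // Treating the whole string as a file name is what produces
    // "File file:/home/mapred/test.txt#testFrag.txt does not exist".
    // Parsing it as a URI keeps the fragment separate from the path to stat:
    URI uri = new URI(arg);
    System.out.println("path to stat: " + uri.getPath());     // /home/mapred/test.txt
    System.out.println("link name   : " + uri.getFragment()); // testFrag.txt
  }
}
{code}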


was (Author: ctrezzo):
Working on a fix now.

> Fragments are not handled correctly by resource limit checking
> --
>
> Key: MAPREDUCE-6862
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6862
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0-alpha1
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
>
> If a user specifies a fragment for a libjar, files, archives path via generic 
> options parser and resource limit checking is enabled, the client crashes 
> with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
> does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.getFileStatus(JobResourceUploader.java:413)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.explorePath(JobResourceUploader.java:395)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.checkLocalizationLimits(JobResourceUploader.java:304)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:103)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6862) Fragments are not handled correctly by resource limit checking

2017-03-09 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904061#comment-15904061
 ] 

Chris Trezzo commented on MAPREDUCE-6862:
-

Working on a fix now.

> Fragments are not handled correctly by resource limit checking
> --
>
> Key: MAPREDUCE-6862
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6862
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0-alpha1
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
>
> If a user specifies a fragment for a libjar, files, archives path via generic 
> options parser and resource limit checking is enabled, the client crashes 
> with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
> does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.getFileStatus(JobResourceUploader.java:413)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.explorePath(JobResourceUploader.java:395)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.checkLocalizationLimits(JobResourceUploader.java:304)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:103)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Created] (MAPREDUCE-6862) Fragments are not handled correctly by resource limit checking

2017-03-09 Thread Chris Trezzo (JIRA)
Chris Trezzo created MAPREDUCE-6862:
---

 Summary: Fragments are not handled correctly by resource limit 
checking
 Key: MAPREDUCE-6862
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6862
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0-alpha1, 2.9.0
Reporter: Chris Trezzo
Assignee: Chris Trezzo
Priority: Minor


If a user specifies a fragment for a libjar, files, archives path via generic 
options parser and resource limit checking is enabled, the client crashes with 
a FileNotFoundException:
{noformat}
java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
does not exist
at 
org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
at 
org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
at 
org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
at 
org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
at 
org.apache.hadoop.mapreduce.JobResourceUploader.getFileStatus(JobResourceUploader.java:413)
at 
org.apache.hadoop.mapreduce.JobResourceUploader.explorePath(JobResourceUploader.java:395)
at 
org.apache.hadoop.mapreduce.JobResourceUploader.checkLocalizationLimits(JobResourceUploader.java:304)
at 
org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:103)
at 
org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
at 
org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
at 
org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at 
org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6846) Fragments specified for libjar paths are not handled correctly

2017-02-10 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15861581#comment-15861581
 ] 

Chris Trezzo commented on MAPREDUCE-6846:
-

Perfect! Thanks [~jlowe]!

> Fragments specified for libjar paths are not handled correctly
> --
>
> Key: MAPREDUCE-6846
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6846
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.7.3, 3.0.0-alpha2
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
>
> If a user specifies a fragment for a libjars path via generic options parser, 
> the client crashes with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
> does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}
> This is actually inconsistent with the behavior for files and archives. Here 
> is a table showing the current behavior for each type of path and resource:
> | || Qualified path (i.e. file://home/mapred/test.txt#frag.txt) || Absolute 
> path (i.e. /home/mapred/test.txt#frag.txt) || Relative path (i.e. 
> test.txt#frag.txt) ||
> || -libjars | FileNotFound | FileNotFound|FileNotFound|
> || -files | (/) | (/) | (/) |
> || -archives | (/) | (/) | (/) |



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (MAPREDUCE-6846) Fragments specified for libjar paths are not handled correctly

2017-02-09 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15860382#comment-15860382
 ] 

Chris Trezzo edited comment on MAPREDUCE-6846 at 2/9/17 11:31 PM:
--

bq. I was under the impression that if the wildcard mapped to only one file 
then we would not convey this as a wildcard through to the staging directory 
but instead remap it to the one entry that it globbed to (i.e.: as if the user 
had specified the one path directly rather than a glob to that one path).

True, once it is in the staging dir it will not look like a wildcard. That 
being said, there is a second part to the feature. I will attempt to explain my 
current understanding:

See {{JobResourceUploader#uploadLibJars}}:
{code:java}
  private void uploadLibJars(Configuration conf, Collection<String> libjars,
  Path submitJobDir, FsPermission mapredSysPerms, short submitReplication)
  throws IOException {
Path libjarsDir = JobSubmissionFiles.getJobDistCacheLibjars(submitJobDir);
if (!libjars.isEmpty()) {
  FileSystem.mkdirs(jtFs, libjarsDir, mapredSysPerms);
  for (String tmpjars : libjars) {
Path tmp = new Path(tmpjars);
Path newPath =
copyRemoteFiles(libjarsDir, tmp, conf, submitReplication);

// Add each file to the classpath
DistributedCache.addFileToClassPath(
new Path(newPath.toUri().getPath()), conf, jtFs, !useWildcard);
  }

  if (useWildcard) {
// Add the whole directory to the cache
Path libJarsDirWildcard =
jtFs.makeQualified(new Path(libjarsDir, DistributedCache.WILDCARD));

DistributedCache.addCacheFile(libJarsDirWildcard.toUri(), conf);
  }
}
  }
{code}
{{useWildcard}} is set by the {{mapreduce.client.libjars.wildcard}} config 
parameter. If this is set to true, then we add the files individually to the 
classpath (i.e. {{mapreduce.job.classpath.files}}), but then we glob them all 
together when adding them to the distributed cache (i.e. 
{{mapreduce.job.cache.files}}). At that point, we would lose the fragment name 
because the LocalResource objects submitted to YARN are created based on those 
paths.

As a side note, this method also contains the original bug that motivated this 
jira. The bug is due to the uploadLibJars method creating a Path from tmpjars 
instead of a URI. The Path constructor does not support fragments, so we lose 
them at this point with or without wildcards.
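For context, the fragment on a distributed cache URI is what names the localized 
symlink, which is why dropping it matters. A minimal hypothetical example (not 
code from the patch), using the non-deprecated equivalent of the 
DistributedCache.addCacheFile call shown above:
{code:java}
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class CacheFragmentSketch {
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration());
    // The "#renamed.jar" fragment asks for the localized file to be linked as
    // renamed.jar in the task working directory. A single wildcard entry for
    // the whole libjars directory has no place to carry that per-file name.
    job.addCacheFile(new URI("hdfs:///user/mapred/libs/a.jar#renamed.jar"));
  }
}
{code}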


was (Author: ctrezzo):
bq. I was under the impression that if the wildcard mapped to only one file 
then we would not convey this as a wildcard through to the staging directory 
but instead remap it to the one entry that it globbed to (i.e.: as if the user 
had specified the one path directly rather than a glob to that one path).

True, once it is in the staging dir it will not look like a wildcard. That 
being said, there is a second part to the feature. I will attempt to explain my 
current understanding:

See {{JobResourceUploader#uploadLibJars}}:
{code:java}
  private void uploadLibJars(Configuration conf, Collection<String> libjars,
  Path submitJobDir, FsPermission mapredSysPerms, short submitReplication)
  throws IOException {
Path libjarsDir = JobSubmissionFiles.getJobDistCacheLibjars(submitJobDir);
if (!libjars.isEmpty()) {
  FileSystem.mkdirs(jtFs, libjarsDir, mapredSysPerms);
  for (String tmpjars : libjars) {
Path tmp = new Path(tmpjars);
Path newPath =
copyRemoteFiles(libjarsDir, tmp, conf, submitReplication);

// Add each file to the classpath
DistributedCache.addFileToClassPath(
new Path(newPath.toUri().getPath()), conf, jtFs, !useWildcard);
  }

  if (useWildcard) {
// Add the whole directory to the cache
Path libJarsDirWildcard =
jtFs.makeQualified(new Path(libjarsDir, DistributedCache.WILDCARD));

DistributedCache.addCacheFile(libJarsDirWildcard.toUri(), conf);
  }
}
  }
{code}
{{useWildcard}} is set by the {{mapreduce.client.libjars.wildcard}} config 
parameter. If this is set to true, then we add the files individually to the 
classpath (i.e. {{mapreduce.job.classpath.files}}), but then we glob them all 
together when adding them to the distributed cache (i.e. 
{{mapreduce.job.cache.files}}). At that point, we would lose the fragment name 
because the LocalResource objects submitted to YARN are created based on those 
paths.

> Fragments specified for libjar paths are not handled correctly
> --
>
> Key: MAPREDUCE-6846
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6846
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.7.3, 3.0.0-alpha2
>Reporter: Chris Trezzo
>Assignee: Chris 

[jira] [Commented] (MAPREDUCE-6846) Fragments specified for libjar paths are not handled correctly

2017-02-09 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15860382#comment-15860382
 ] 

Chris Trezzo commented on MAPREDUCE-6846:
-

bq. I was under the impression that if the wildcard mapped to only one file 
then we would not convey this as a wildcard through to the staging directory 
but instead remap it to the one entry that it globbed to (i.e.: as if the user 
had specified the one path directly rather than a glob to that one path).

True, once it is in the staging dir it will not look like a wildcard. That 
being said, there is a second part to the feature. I will attempt to explain my 
current understanding:

See {{JobResourceUploader#uploadLibJars}}:
{code:java}
  private void uploadLibJars(Configuration conf, Collection<String> libjars,
  Path submitJobDir, FsPermission mapredSysPerms, short submitReplication)
  throws IOException {
Path libjarsDir = JobSubmissionFiles.getJobDistCacheLibjars(submitJobDir);
if (!libjars.isEmpty()) {
  FileSystem.mkdirs(jtFs, libjarsDir, mapredSysPerms);
  for (String tmpjars : libjars) {
Path tmp = new Path(tmpjars);
Path newPath =
copyRemoteFiles(libjarsDir, tmp, conf, submitReplication);

// Add each file to the classpath
DistributedCache.addFileToClassPath(
new Path(newPath.toUri().getPath()), conf, jtFs, !useWildcard);
  }

  if (useWildcard) {
// Add the whole directory to the cache
Path libJarsDirWildcard =
jtFs.makeQualified(new Path(libjarsDir, DistributedCache.WILDCARD));

DistributedCache.addCacheFile(libJarsDirWildcard.toUri(), conf);
  }
}
  }
{code}
{{useWildcard}} is set by the {{mapreduce.client.libjars.wildcard}} config 
parameter. If this is set to true, then we add the files individually to the 
classpath (i.e. {{mapreduce.job.classpath.files}}), but then we glob them all 
together when adding them to the distributed cache (i.e. 
{{mapreduce.job.cache.files}}). At that point, we would lose the fragment name 
because the LocalResource objects submitted to YARN are created based on those 
paths.

> Fragments specified for libjar paths are not handled correctly
> --
>
> Key: MAPREDUCE-6846
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6846
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.7.3, 3.0.0-alpha2
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
>
> If a user specifies a fragment for a libjars path via generic options parser, 
> the client crashes with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
> does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   

[jira] [Commented] (MAPREDUCE-6846) Fragments specified for libjar paths are not handled correctly

2017-02-09 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15860355#comment-15860355
 ] 

Chris Trezzo commented on MAPREDUCE-6846:
-

Thanks [~jlowe]! I wasn't even thinking about the wildcard parsing from the 
generic options parser... that will have to change as well. Currently it looks 
like wildcards are only supported if the path is a directory and the wildcard 
is at the end of the path (i.e. in the form {{/mydir/*}}). If the wildcard is 
anywhere else, it is an illegal argument.

I can change the parsing of the path so that it handles fragments as well. At 
that point we can check if the wildcard resolves to multiple jars and throw an 
exception if there are conflicts as you suggested. That handles the generic 
options parser part of it.

Additionally, we still have to deal with the wildcard logic that adds libjars to 
the distributed cache. Right now, if it is enabled, it will add all libjars in 
the staging directory using a single wildcard entry (i.e. 
{{.staging/libjars/*}}) to reduce the size of the jobconf. For this part I 
figure I can go with approach #1 above: if any path has a fragment specified we 
cannot use the wildcard feature, because otherwise the fragment will not be 
honored when the resource is added to the distributed cache.

Let me know if that sounds good!
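A rough sketch of the kind of conflict check described above, where the link 
name is the fragment if present and the file name otherwise (hypothetical 
helper, not taken from the eventual patch):
{code:java}
import java.io.File;
import java.net.URI;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

public class LinkNameConflictSketch {
  // Throws if two entries would localize under the same link name.
  static void checkConflicts(Collection<String> entries) throws Exception {
    Set<String> linkNames = new HashSet<>();
    for (String entry : entries) {
      URI uri = new URI(entry);
      String name = uri.getFragment() != null
          ? uri.getFragment()
          : new File(uri.getPath()).getName();
      if (!linkNames.add(name)) {
        throw new IllegalArgumentException("Conflicting link name: " + name);
      }
    }
  }

  public static void main(String[] args) throws Exception {
    // The second entry conflicts with the first, so this throws.
    checkConflicts(Arrays.asList("/libs/a.jar#x.jar", "/other/b.jar#x.jar"));
  }
}
{code}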

> Fragments specified for libjar paths are not handled correctly
> --
>
> Key: MAPREDUCE-6846
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6846
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.7.3, 3.0.0-alpha2
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
>
> If a user specifies a fragment for a libjars path via generic options parser, 
> the client crashes with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
> does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
> 

[jira] [Commented] (MAPREDUCE-6846) Fragments specified for libjar paths are not handled correctly

2017-02-08 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15858637#comment-15858637
 ] 

Chris Trezzo commented on MAPREDUCE-6846:
-

Unfortunately, it looks like the wildcard feature was implemented on top of this 
bug (see MAPREDUCE-6719). Fixing this so that fragments are honored might be a 
little tricky when wildcards are being used, because you would lose the per-path 
fragment information.

I can see two potential approaches so far:
# Only use a wildcard when there have been no fragments specified by the user. 
This would preserve the intended naming of resources, but would reduce the 
number of instances where wildcards could be used.
# Silently ignore fragments specified by libjars - I am not a fan of this 
approach because the application could be expecting a specific resource name 
for libjars so that symlinks don't conflict when resources are localized.

I will start working on a v1 patch for #1. Thoughts [~templedf] and [~sjlee0]?

> Fragments specified for libjar paths are not handled correctly
> --
>
> Key: MAPREDUCE-6846
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6846
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.7.3, 3.0.0-alpha2
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
>
> If a user specifies a fragment for a libjars path via generic options parser, 
> the client crashes with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
> does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}
> This is actually inconsistent with the behavior for files and archives. Here 
> is a table showing the current behavior for each type of path and resource:
> | || Qualified path (i.e. file://home/mapred/test.txt#frag.txt) || Absolute 
> path (i.e. /home/mapred/test.txt#frag.txt) || Relative path 

[jira] [Comment Edited] (MAPREDUCE-6846) Fragments specified for libjar paths are not handled correctly

2017-02-08 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15858637#comment-15858637
 ] 

Chris Trezzo edited comment on MAPREDUCE-6846 at 2/8/17 10:40 PM:
--

Unfortunately, it looks like the wildcard feature was implemented on top of this 
bug (see MAPREDUCE-6719). Fixing this so that fragments are honored might be a 
little tricky when wildcards are being used, because you would lose the per-path 
fragment information.

I can see two potential approaches so far:
# Only use a wildcard when there have been no fragments specified by the user. 
This would preserve the intended naming of resources, but would reduce the 
number of instances where wildcards could be used.
# Silently ignore fragments specified by libjars when wildcards are enabled - I 
am not a fan of this approach because the application could be expecting a 
specific resource name for libjars so that symlinks don't conflict when 
resources are localized.

I will start working on a v1 patch for #1. Thoughts [~templedf] and [~sjlee0]?


was (Author: ctrezzo):
Unfortunately, it looks like the wildcard feature was implemented on top of this 
bug (see MAPREDUCE-6719). Fixing this so that fragments are honored might be a 
little tricky when wildcards are being used, because you would lose the per-path 
fragment information.

I can see two potential approaches so far:
# Only use a wildcard when there have been no fragments specified by the user. 
This would preserve the intended naming of resources, but would reduce the 
number of instances where wildcards could be used.
# Silently ignore fragments specified by libjars - I am not a fan of this 
approach because the application could be expecting a specific resource name 
for libjars so that symlinks don't conflict when resources are localized.

I will start working on a v1 patch for #1. Thoughts [~templedf] and [~sjlee0]?

> Fragments specified for libjar paths are not handled correctly
> --
>
> Key: MAPREDUCE-6846
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6846
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.7.3, 3.0.0-alpha2
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
>
> If a user specifies a fragment for a libjars path via generic options parser, 
> the client crashes with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
> does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>  

[jira] [Updated] (MAPREDUCE-6846) Fragments specified for libjar paths are not handled correctly

2017-02-08 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6846:

Description: 
If a user specifies a fragment for a libjars path via generic options parser, 
the client crashes with a FileNotFoundException:
{noformat}
java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
does not exist
at 
org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
at 
org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
at 
org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
at 
org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314)
at 
org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387)
at 
org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154)
at 
org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105)
at 
org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
at 
org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
at 
org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at 
org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
{noformat}

This is actually inconsistent with the behavior for files and archives. Here is 
a table showing the current behavior for each type of path and resource:
| || Qualified path (i.e. file://home/mapred/test.txt#frag.txt) || Absolute 
path (i.e. /home/mapred/test.txt#frag.txt) || Relative path (i.e. 
test.txt#frag.txt) ||
|| -libjars | FileNotFound | FileNotFound|FileNotFound|
|| -files | (/) | (/) | (/) |
|| -archives | (/) | (/) | (/) |

  was:
If a user specifies a fragment for a libjars path via generic options parser, 
the client crashes with a FileNotFoundException:
{noformat}
java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
does not exist
at 
org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
at 
org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
at 
org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
at 
org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314)
at 
org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387)
at 
org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154)
at 
org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105)
at 
org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
at 

[jira] [Commented] (MAPREDUCE-6846) Fragments specified for libjar paths are not handled correctly

2017-02-08 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15858606#comment-15858606
 ] 

Chris Trezzo commented on MAPREDUCE-6846:
-

Working on a v1 patch now.

> Fragments specified for libjar paths are not handled correctly
> --
>
> Key: MAPREDUCE-6846
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6846
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.7.3, 3.0.0-alpha2
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
>
> If a user specifies a fragment for a libjars path via generic options parser, 
> the client crashes with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
> does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at 
> org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}
> This is actually inconsistent with the behavior for files and archives. Here 
> is a table showing the current behavior for each type of path and resource:
> | || Qualified path (i.e. file:/home/mapred/test.txt#frag.txt) || Absolute 
> path (i.e. /home/mapred/test.txt#frag.txt) || Relative path (i.e. 
> test.txt#frag.txt) ||
> || -libjars | FileNotFound | FileNotFound|FileNotFound|
> || -files | (/) | (/) | (/) |
> || -archives | (/) | (/) | (/) |



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Created] (MAPREDUCE-6846) Fragments specified for libjar paths are not handled correctly

2017-02-08 Thread Chris Trezzo (JIRA)
Chris Trezzo created MAPREDUCE-6846:
---

 Summary: Fragments specified for libjar paths are not handled 
correctly
 Key: MAPREDUCE-6846
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6846
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0-alpha2, 2.7.3
Reporter: Chris Trezzo
Assignee: Chris Trezzo
Priority: Minor


If a user specifies a fragment for a libjars path via generic options parser, 
the client crashes with a FileNotFoundException:
{noformat}
java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt 
does not exist
at 
org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
at 
org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
at 
org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
at 
org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314)
at 
org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387)
at 
org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154)
at 
org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105)
at 
org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
at 
org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
at 
org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at 
org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
{noformat}

This is actually inconsistent with the behavior for files and archives. Here is 
a table showing the current behavior for each type of path and resource:
| || Qualified path (i.e. file:/home/mapred/test.txt#frag.txt) || Absolute path 
(i.e. /home/mapred/test.txt#frag.txt) || Relative path (i.e. test.txt#frag.txt) 
||
|| -libjars | FileNotFound | FileNotFound|FileNotFound|
|| -files | (/) | (/) | (/) |
|| -archives | (/) | (/) | (/) |
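
For context, here is a hedged sketch of the kind of fragment handling that lets 
-files and -archives accept fragments: strip the fragment before checking that 
the file exists, and keep it only as the requested symlink name. This is not the 
actual fix for this issue; the helper below is hypothetical and only uses stock 
Hadoop and Java APIs:
{code:java}
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class FragmentAwareStat {

  private FragmentAwareStat() {}

  /**
   * Stats a resource whose URI may carry a "#symlink" fragment. The fragment
   * is only the requested link name at localization time, so it must be
   * stripped before asking the file system whether the file exists.
   */
  public static FileStatus statIgnoringFragment(Configuration conf,
      String pathWithFragment) throws IOException {
    URI uri = URI.create(pathWithFragment);
    // Rebuild the path without the fragment before touching the file system.
    Path path = new Path(uri.getScheme(), uri.getAuthority(), uri.getPath());
    FileSystem fs = path.getFileSystem(conf);
    return fs.getFileStatus(path);
  }
}
{code}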



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6824) TaskAttemptImpl#createCommonContainerLaunchContext is longer than 150 lines

2017-02-02 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15850455#comment-15850455
 ] 

Chris Trezzo commented on MAPREDUCE-6824:
-

Sounds great! Thanks [~udai]!

> TaskAttemptImpl#createCommonContainerLaunchContext is longer than 150 lines
> ---
>
> Key: MAPREDUCE-6824
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6824
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chris Trezzo
>Assignee: Udai Kiran Potluri
>Priority: Trivial
>  Labels: newbie
>
> bq. 
> ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java:752:
>  private static ContainerLaunchContext createCommonContainerLaunchContext(:3: 
> Method length is 172 lines (max allowed is 150).
> {{TaskAttemptImpl#createCommonContainerLaunchContext}} is longer than 150 
> lines and needs to be refactored.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6825) YARNRunner#createApplicationSubmissionContext method is longer than 150 lines

2017-02-02 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15850453#comment-15850453
 ] 

Chris Trezzo commented on MAPREDUCE-6825:
-

Thanks [~GergelyNovak]! +1 on the v2 patch. It looks good to me. If someone 
wants to commit this, that would be great!

> YARNRunner#createApplicationSubmissionContext method is longer than 150 lines
> -
>
> Key: MAPREDUCE-6825
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6825
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chris Trezzo
>Assignee: Gergely Novák
>Priority: Trivial
>  Labels: newbie
> Attachments: MAPREDUCE-6825.001.patch, MAPREDUCE-6825.002.patch
>
>
> bq. 
> ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/YARNRunner.java:341:
>  public ApplicationSubmissionContext createApplicationSubmissionContext(:3: 
> Method length is 249 lines (max allowed is 150).
> {{YARNRunner#createApplicationSubmissionContext}} is longer than 150 lines 
> and needs to be refactored.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6825) YARNRunner#createApplicationSubmissionContext method is longer than 150 lines

2017-01-19 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830855#comment-15830855
 ] 

Chris Trezzo commented on MAPREDUCE-6825:
-

Yes! Thank you [~GergelyNovak] for the patch. Kicking off a QA run.

> YARNRunner#createApplicationSubmissionContext method is longer than 150 lines
> -
>
> Key: MAPREDUCE-6825
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6825
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chris Trezzo
>Assignee: Gergely Novák
>Priority: Trivial
>  Labels: newbie
> Attachments: MAPREDUCE-6825.001.patch
>
>
> bq. 
> ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/YARNRunner.java:341:
>  public ApplicationSubmissionContext createApplicationSubmissionContext(:3: 
> Method length is 249 lines (max allowed is 150).
> {{YARNRunner#createApplicationSubmissionContext}} is longer than 150 lines 
> and needs to be refactored.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6825) YARNRunner#createApplicationSubmissionContext method is longer than 150 lines

2017-01-19 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6825:

Status: Patch Available  (was: Open)

> YARNRunner#createApplicationSubmissionContext method is longer than 150 lines
> -
>
> Key: MAPREDUCE-6825
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6825
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chris Trezzo
>Assignee: Gergely Novák
>Priority: Trivial
>  Labels: newbie
> Attachments: MAPREDUCE-6825.001.patch
>
>
> bq. 
> ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/YARNRunner.java:341:
>  public ApplicationSubmissionContext createApplicationSubmissionContext(:3: 
> Method length is 249 lines (max allowed is 150).
> {{YARNRunner#createApplicationSubmissionContext}} is longer than 150 lines 
> and needs to be refactored.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2017-01-19 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830767#comment-15830767
 ] 

Chris Trezzo commented on MAPREDUCE-5951:
-

Note: currently this patch depends on how YARN-3637 is implemented. I will 
adjust this patch once it is committed.

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-Overview.001.pdf, 
> MAPREDUCE-5951-trunk.016.patch, MAPREDUCE-5951-trunk.017.patch, 
> MAPREDUCE-5951-trunk.018.patch, MAPREDUCE-5951-trunk.019.patch, 
> MAPREDUCE-5951-trunk-v10.patch, MAPREDUCE-5951-trunk-v11.patch, 
> MAPREDUCE-5951-trunk-v12.patch, MAPREDUCE-5951-trunk-v13.patch, 
> MAPREDUCE-5951-trunk-v14.patch, MAPREDUCE-5951-trunk-v15.patch, 
> MAPREDUCE-5951-trunk-v1.patch, MAPREDUCE-5951-trunk-v2.patch, 
> MAPREDUCE-5951-trunk-v3.patch, MAPREDUCE-5951-trunk-v4.patch, 
> MAPREDUCE-5951-trunk-v5.patch, MAPREDUCE-5951-trunk-v6.patch, 
> MAPREDUCE-5951-trunk-v7.patch, MAPREDUCE-5951-trunk-v8.patch, 
> MAPREDUCE-5951-trunk-v9.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).
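
The description above asks for per-job configuration of which resource types go 
through the shared cache. As a rough sketch of how a job might opt in (the 
property name below is an assumption for illustration; see the attached overview 
document and patch for the real configuration keys):
{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public final class SharedCacheJobSetup {

  private SharedCacheJobSetup() {}

  /**
   * Creates a job that requests shared-cache handling for all resource types.
   * The property name is assumed here purely for illustration.
   */
  public static Job newJobWithSharedCache(Configuration conf) throws IOException {
    conf.set("mapreduce.job.sharedcache.mode", "jobjar,libjars,files,archives");
    return Job.getInstance(conf, "shared-cache-enabled-job");
  }
}
{code}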



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2017-01-18 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829326#comment-15829326
 ] 

Chris Trezzo commented on MAPREDUCE-5951:
-

These are the same 3 checkstyle warnings I mentioned in the 
[comment|https://issues.apache.org/jira/browse/MAPREDUCE-5951?focusedCommentId=15755192=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15755192]
 above.

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-Overview.001.pdf, 
> MAPREDUCE-5951-trunk.016.patch, MAPREDUCE-5951-trunk.017.patch, 
> MAPREDUCE-5951-trunk.018.patch, MAPREDUCE-5951-trunk.019.patch, 
> MAPREDUCE-5951-trunk-v10.patch, MAPREDUCE-5951-trunk-v11.patch, 
> MAPREDUCE-5951-trunk-v12.patch, MAPREDUCE-5951-trunk-v13.patch, 
> MAPREDUCE-5951-trunk-v14.patch, MAPREDUCE-5951-trunk-v15.patch, 
> MAPREDUCE-5951-trunk-v1.patch, MAPREDUCE-5951-trunk-v2.patch, 
> MAPREDUCE-5951-trunk-v3.patch, MAPREDUCE-5951-trunk-v4.patch, 
> MAPREDUCE-5951-trunk-v5.patch, MAPREDUCE-5951-trunk-v6.patch, 
> MAPREDUCE-5951-trunk-v7.patch, MAPREDUCE-5951-trunk-v8.patch, 
> MAPREDUCE-5951-trunk-v9.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2017-01-18 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-5951:

Attachment: MAPREDUCE-5951-trunk.019.patch

Attached is a v19 to address [~sjlee0]'s comments. Thanks!

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-Overview.001.pdf, 
> MAPREDUCE-5951-trunk.016.patch, MAPREDUCE-5951-trunk.017.patch, 
> MAPREDUCE-5951-trunk.018.patch, MAPREDUCE-5951-trunk.019.patch, 
> MAPREDUCE-5951-trunk-v10.patch, MAPREDUCE-5951-trunk-v11.patch, 
> MAPREDUCE-5951-trunk-v12.patch, MAPREDUCE-5951-trunk-v13.patch, 
> MAPREDUCE-5951-trunk-v14.patch, MAPREDUCE-5951-trunk-v15.patch, 
> MAPREDUCE-5951-trunk-v1.patch, MAPREDUCE-5951-trunk-v2.patch, 
> MAPREDUCE-5951-trunk-v3.patch, MAPREDUCE-5951-trunk-v4.patch, 
> MAPREDUCE-5951-trunk-v5.patch, MAPREDUCE-5951-trunk-v6.patch, 
> MAPREDUCE-5951-trunk-v7.patch, MAPREDUCE-5951-trunk-v8.patch, 
> MAPREDUCE-5951-trunk-v9.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2017-01-13 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-5951:

Attachment: MAPREDUCE-5951-Overview.001.pdf

Attaching a document that gives an overview of MapReduce support for the shared 
cache. This will hopefully make reviewing the patch easier!

I am also working on an updated patch to address the comments from [~sjlee0]. 
Thanks!

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-Overview.001.pdf, 
> MAPREDUCE-5951-trunk.016.patch, MAPREDUCE-5951-trunk.017.patch, 
> MAPREDUCE-5951-trunk.018.patch, MAPREDUCE-5951-trunk-v10.patch, 
> MAPREDUCE-5951-trunk-v11.patch, MAPREDUCE-5951-trunk-v12.patch, 
> MAPREDUCE-5951-trunk-v13.patch, MAPREDUCE-5951-trunk-v14.patch, 
> MAPREDUCE-5951-trunk-v15.patch, MAPREDUCE-5951-trunk-v1.patch, 
> MAPREDUCE-5951-trunk-v2.patch, MAPREDUCE-5951-trunk-v3.patch, 
> MAPREDUCE-5951-trunk-v4.patch, MAPREDUCE-5951-trunk-v5.patch, 
> MAPREDUCE-5951-trunk-v6.patch, MAPREDUCE-5951-trunk-v7.patch, 
> MAPREDUCE-5951-trunk-v8.patch, MAPREDUCE-5951-trunk-v9.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2017-01-11 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819182#comment-15819182
 ] 

Chris Trezzo commented on MAPREDUCE-5951:
-

Thanks [~sjlee0] for the review! I will work on v19 of the patch to address 
your comments.

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-trunk-v1.patch, 
> MAPREDUCE-5951-trunk-v10.patch, MAPREDUCE-5951-trunk-v11.patch, 
> MAPREDUCE-5951-trunk-v12.patch, MAPREDUCE-5951-trunk-v13.patch, 
> MAPREDUCE-5951-trunk-v14.patch, MAPREDUCE-5951-trunk-v15.patch, 
> MAPREDUCE-5951-trunk-v2.patch, MAPREDUCE-5951-trunk-v3.patch, 
> MAPREDUCE-5951-trunk-v4.patch, MAPREDUCE-5951-trunk-v5.patch, 
> MAPREDUCE-5951-trunk-v6.patch, MAPREDUCE-5951-trunk-v7.patch, 
> MAPREDUCE-5951-trunk-v8.patch, MAPREDUCE-5951-trunk-v9.patch, 
> MAPREDUCE-5951-trunk.016.patch, MAPREDUCE-5951-trunk.017.patch, 
> MAPREDUCE-5951-trunk.018.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2016-12-16 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15755575#comment-15755575
 ] 

Chris Trezzo commented on MAPREDUCE-5951:
-

I have filed MAPREDUCE-6825 and MAPREDUCE-6824 to address the checkstyle issues 
around methods that are too long.

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-trunk-v1.patch, 
> MAPREDUCE-5951-trunk-v10.patch, MAPREDUCE-5951-trunk-v11.patch, 
> MAPREDUCE-5951-trunk-v12.patch, MAPREDUCE-5951-trunk-v13.patch, 
> MAPREDUCE-5951-trunk-v14.patch, MAPREDUCE-5951-trunk-v15.patch, 
> MAPREDUCE-5951-trunk-v2.patch, MAPREDUCE-5951-trunk-v3.patch, 
> MAPREDUCE-5951-trunk-v4.patch, MAPREDUCE-5951-trunk-v5.patch, 
> MAPREDUCE-5951-trunk-v6.patch, MAPREDUCE-5951-trunk-v7.patch, 
> MAPREDUCE-5951-trunk-v8.patch, MAPREDUCE-5951-trunk-v9.patch, 
> MAPREDUCE-5951-trunk.016.patch, MAPREDUCE-5951-trunk.017.patch, 
> MAPREDUCE-5951-trunk.018.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Created] (MAPREDUCE-6825) YARNRunner#createApplicationSubmissionContext method is longer than 150 lines

2016-12-16 Thread Chris Trezzo (JIRA)
Chris Trezzo created MAPREDUCE-6825:
---

 Summary: YARNRunner#createApplicationSubmissionContext method is 
longer than 150 lines
 Key: MAPREDUCE-6825
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6825
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Chris Trezzo
Priority: Trivial


bq. 
./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/YARNRunner.java:341:
 public ApplicationSubmissionContext createApplicationSubmissionContext(:3: 
Method length is 249 lines (max allowed is 150).

{{YARNRunner#createApplicationSubmissionContext}} is longer than 150 lines and 
needs to be refactored.
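
As a generic illustration of the requested refactor (names invented, not the 
actual YARNRunner code): split the overlong builder into a short orchestrator 
plus focused helper methods so each stays under the checkstyle limit:
{code:java}
/** Illustrative only: the shape of an extract-method refactor for an overlong builder. */
final class SubmissionContextBuilderSketch {

  // Before: one ~250-line method doing everything.
  // After: a short orchestrator delegating to focused helpers, each well under
  // the 150-line checkstyle limit.
  Object createContext(Object jobConf) {
    Object resources = setupLocalResources(jobConf);
    Object environment = setupEnvironment(jobConf);
    Object amCommand = setupAmCommand(jobConf);
    return assembleContext(resources, environment, amCommand);
  }

  private Object setupLocalResources(Object jobConf) { return new Object(); }
  private Object setupEnvironment(Object jobConf) { return new Object(); }
  private Object setupAmCommand(Object jobConf) { return new Object(); }
  private Object assembleContext(Object r, Object e, Object c) { return new Object(); }
}
{code}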



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Created] (MAPREDUCE-6824) TaskAttemptImpl#createCommonContainerLaunchContext is longer than 150 lines

2016-12-16 Thread Chris Trezzo (JIRA)
Chris Trezzo created MAPREDUCE-6824:
---

 Summary: TaskAttemptImpl#createCommonContainerLaunchContext is 
longer than 150 lines
 Key: MAPREDUCE-6824
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6824
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Chris Trezzo
Priority: Trivial


bq. 
./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java:752:
 private static ContainerLaunchContext createCommonContainerLaunchContext(:3: 
Method length is 172 lines (max allowed is 150).

{{TaskAttemptImpl#createCommonContainerLaunchContext}} is longer than 150 lines 
and needs to be refactored.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2016-12-16 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15755192#comment-15755192
 ] 

Chris Trezzo edited comment on MAPREDUCE-5951 at 12/16/16 8:55 PM:
---

Attached is v18 to fix the one checkstyle issue. There are three outstanding 
checkstyle issues that I am leaning towards not fixing as part of the patch. 
Please let me know your thoughts. They are the following:
bq. 
./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java:752:
  private static ContainerLaunchContext createCommonContainerLaunchContext(:3: 
Method length is 172 lines (max allowed is 150).

This patch barely touches this method so it seems wrong to refactor the method 
as part of this jira. I can file a separate jira to fix this.

bq. 
./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/YARNRunner.java:341:
  public ApplicationSubmissionContext createApplicationSubmissionContext(:3: 
Method length is 249 lines (max allowed is 150).

The same reasoning applies to this warning as the previous issue. I can file a 
separate jira to fix this.

bq. 
./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java:572:
  private static void parseDistributedCacheArtifacts(:23: More than 7 
parameters (found 8).

This issue was caused by this patch adding an additional parameter to this 
method. I could fix the parameter-count issue, but that forces me to touch 
three existing calls to the deprecated DistributedCache api, which would fix 1 
warning but create 3 new ones. Moving off the deprecated api is a larger change 
because the existing code is not set up for it; furthermore, use of the 
deprecated api is currently widespread in this code. My inclination is to leave 
this warning as is.


was (Author: ctrezzo):
Attached is v18 to fix the one checkstyle issue. There are three outstanding 
checkstyle issues that I am leaning towards not fixing as part of the patch. 
Please let me know your thoughts. They are the following:
bq. 
./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java:752:
  private static ContainerLaunchContext createCommonContainerLaunchContext(:3: 
Method length is 172 lines (max allowed is 150).

This patch barely touches this method so it seems wrong to refactor the patch 
as part of this jira. I can file a separate jira to fix this.

bq. 
./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/YARNRunner.java:341:
  public ApplicationSubmissionContext createApplicationSubmissionContext(:3: 
Method length is 249 lines (max allowed is 150).

The same reasoning applies to this warning as the previous issue. I can file a 
separate jira to fix this.

bq. 
./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java:572:
  private static void parseDistributedCacheArtifacts(:23: More than 7 
parameters (found 8).

This issue was caused by this patch adding an additional parameter to this 
method. I could fix the parameter-count issue, but that forces me to touch 
three existing calls to the deprecated DistributedCache api, which would fix 1 
warning but create 3 new ones. Moving off the deprecated api is a larger change 
because the existing code is not set up for it; furthermore, use of the 
deprecated api is currently widespread in this code. My inclination is to leave 
this warning as is.

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-trunk-v1.patch, 
> MAPREDUCE-5951-trunk-v10.patch, MAPREDUCE-5951-trunk-v11.patch, 
> MAPREDUCE-5951-trunk-v12.patch, MAPREDUCE-5951-trunk-v13.patch, 
> MAPREDUCE-5951-trunk-v14.patch, MAPREDUCE-5951-trunk-v15.patch, 
> MAPREDUCE-5951-trunk-v2.patch, MAPREDUCE-5951-trunk-v3.patch, 
> MAPREDUCE-5951-trunk-v4.patch, MAPREDUCE-5951-trunk-v5.patch, 
> MAPREDUCE-5951-trunk-v6.patch, MAPREDUCE-5951-trunk-v7.patch, 
> MAPREDUCE-5951-trunk-v8.patch, MAPREDUCE-5951-trunk-v9.patch, 
> MAPREDUCE-5951-trunk.016.patch, MAPREDUCE-5951-trunk.017.patch, 
> MAPREDUCE-5951-trunk.018.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new 

[jira] [Updated] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2016-12-16 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-5951:

Attachment: MAPREDUCE-5951-trunk.018.patch

Attached is v18 to fix the one checkstyle issue. There are three outstanding 
checkstyle issues that I am leaning towards not fixing as part of the patch. 
Please let me know your thoughts. They are the following:
bq. 
./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java:752:
  private static ContainerLaunchContext createCommonContainerLaunchContext(:3: 
Method length is 172 lines (max allowed is 150).

This patch barely touches this method so it seems wrong to refactor the patch 
as part of this jira. I can file a separate jira to fix this.

bq. 
./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/YARNRunner.java:341:
  public ApplicationSubmissionContext createApplicationSubmissionContext(:3: 
Method length is 249 lines (max allowed is 150).

The same reasoning applies to this warning as the previous issue. I can file a 
separate jira to fix this.

bq. 
./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java:572:
  private static void parseDistributedCacheArtifacts(:23: More than 7 
parameters (found 8).

This issue was caused by this patch adding an additional parameter to this 
method. I could fix the parameter-count issue, but that forces me to touch 
three existing calls to the deprecated DistributedCache api, which would fix 1 
warning but create 3 new ones. Moving off the deprecated api is a larger change 
because the existing code is not set up for it; furthermore, use of the 
deprecated api is currently widespread in this code. My inclination is to leave 
this warning as is.

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-trunk-v1.patch, 
> MAPREDUCE-5951-trunk-v10.patch, MAPREDUCE-5951-trunk-v11.patch, 
> MAPREDUCE-5951-trunk-v12.patch, MAPREDUCE-5951-trunk-v13.patch, 
> MAPREDUCE-5951-trunk-v14.patch, MAPREDUCE-5951-trunk-v15.patch, 
> MAPREDUCE-5951-trunk-v2.patch, MAPREDUCE-5951-trunk-v3.patch, 
> MAPREDUCE-5951-trunk-v4.patch, MAPREDUCE-5951-trunk-v5.patch, 
> MAPREDUCE-5951-trunk-v6.patch, MAPREDUCE-5951-trunk-v7.patch, 
> MAPREDUCE-5951-trunk-v8.patch, MAPREDUCE-5951-trunk-v9.patch, 
> MAPREDUCE-5951-trunk.016.patch, MAPREDUCE-5951-trunk.017.patch, 
> MAPREDUCE-5951-trunk.018.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2016-12-14 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-5951:

Attachment: MAPREDUCE-5951-trunk.017.patch

Attaching v17 to address findbugs, checkstyle and whitespace errors.

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-trunk-v1.patch, 
> MAPREDUCE-5951-trunk-v10.patch, MAPREDUCE-5951-trunk-v11.patch, 
> MAPREDUCE-5951-trunk-v12.patch, MAPREDUCE-5951-trunk-v13.patch, 
> MAPREDUCE-5951-trunk-v14.patch, MAPREDUCE-5951-trunk-v15.patch, 
> MAPREDUCE-5951-trunk-v2.patch, MAPREDUCE-5951-trunk-v3.patch, 
> MAPREDUCE-5951-trunk-v4.patch, MAPREDUCE-5951-trunk-v5.patch, 
> MAPREDUCE-5951-trunk-v6.patch, MAPREDUCE-5951-trunk-v7.patch, 
> MAPREDUCE-5951-trunk-v8.patch, MAPREDUCE-5951-trunk-v9.patch, 
> MAPREDUCE-5951-trunk.016.patch, MAPREDUCE-5951-trunk.017.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2016-12-13 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-5951:

Attachment: MAPREDUCE-5951-trunk.016.patch

Attaching v16 for a QA run. This was a non-trivial rebase. I have also modified 
the patch so that it no longer uses arrays in the JobResourceUploader as 
requested by [~kasha].

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-trunk-v1.patch, 
> MAPREDUCE-5951-trunk-v10.patch, MAPREDUCE-5951-trunk-v11.patch, 
> MAPREDUCE-5951-trunk-v12.patch, MAPREDUCE-5951-trunk-v13.patch, 
> MAPREDUCE-5951-trunk-v14.patch, MAPREDUCE-5951-trunk-v15.patch, 
> MAPREDUCE-5951-trunk-v2.patch, MAPREDUCE-5951-trunk-v3.patch, 
> MAPREDUCE-5951-trunk-v4.patch, MAPREDUCE-5951-trunk-v5.patch, 
> MAPREDUCE-5951-trunk-v6.patch, MAPREDUCE-5951-trunk-v7.patch, 
> MAPREDUCE-5951-trunk-v8.patch, MAPREDUCE-5951-trunk-v9.patch, 
> MAPREDUCE-5951-trunk.016.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6128) Automatic addition of bundled jars to distributed cache

2016-12-12 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6128:

Status: Open  (was: Patch Available)

Canceling patch because it does not apply to trunk.

> Automatic addition of bundled jars to distributed cache 
> 
>
> Key: MAPREDUCE-6128
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6128
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 2.5.1
>Reporter: Gera Shegalov
>Assignee: Gera Shegalov
> Attachments: MAPREDUCE-6128.v01.patch, MAPREDUCE-6128.v02.patch, 
> MAPREDUCE-6128.v03.patch, MAPREDUCE-6128.v04.patch, MAPREDUCE-6128.v05.patch, 
> MAPREDUCE-6128.v06.patch, MAPREDUCE-6128.v07.patch, MAPREDUCE-6128.v08.patch
>
>
> On the client side, the JDK adds Class-Path elements from the job jar 
> manifest to the classpath. In theory there could be many bundled jars in many 
> directories, making it cumbersome to add them manually to task classpaths via 
> libjars or similar means. If this property is enabled, the same jars are added 
> to the task classpaths automatically.
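
A minimal sketch of the mechanism the description refers to, i.e. reading the 
Class-Path entries from the job jar manifest so they can be added to task 
classpaths. This is illustrative only and is not the attached patch:
{code:java}
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

public final class ManifestClassPathReader {

  private ManifestClassPathReader() {}

  /**
   * Reads the Class-Path attribute from a job jar's manifest, i.e. the same
   * entries the JDK adds to the client-side classpath. An uploader could then
   * add these jars to the distributed cache so tasks see the same classpath.
   */
  public static String[] readClassPath(String jobJar) throws IOException {
    try (JarFile jar = new JarFile(jobJar)) {
      Manifest manifest = jar.getManifest();
      if (manifest == null) {
        return new String[0];
      }
      String cp = manifest.getMainAttributes().getValue(Attributes.Name.CLASS_PATH);
      // Class-Path entries are whitespace-separated relative URLs.
      return cp == null ? new String[0] : cp.trim().split("\\s+");
    }
  }
}
{code}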



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6637) Testcase Failure : TestFileInputFormat.testSplitLocationInfo

2016-09-13 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15488800#comment-15488800
 ] 

Chris Trezzo commented on MAPREDUCE-6637:
-

Thanks!

> Testcase Failure : TestFileInputFormat.testSplitLocationInfo
> 
>
> Key: MAPREDUCE-6637
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6637
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Fix For: 2.7.3, 2.6.5, 3.0.0-alpha1
>
> Attachments: MAPREDUCE-6637.patch
>
>
> Following testcase is failing after HADOOP-12810
> {noformat}
> FAILED:  org.apache.hadoop.mapred.TestFileInputFormat.testSplitLocationInfo[0]
> Error Message:
> expected:<2> but was:<1>
> Stack Trace:
> java.lang.AssertionError: expected:<2> but was:<1>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.mapred.TestFileInputFormat.testSplitLocationInfo(TestFileInputFormat.java:115)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-5817) Mappers get rescheduled on node transition even after all reducers are completed

2016-09-13 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15488750#comment-15488750
 ] 

Chris Trezzo commented on MAPREDUCE-5817:
-

Thanks!

> Mappers get rescheduled on node transition even after all reducers are 
> completed
> 
>
> Key: MAPREDUCE-5817
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5817
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Affects Versions: 2.3.0
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Fix For: 2.7.3, 2.6.5, 3.0.0-alpha1
>
> Attachments: MAPREDUCE-5817.001.patch, MAPREDUCE-5817.002.patch, 
> mapreduce-5817.patch
>
>
> We're seeing a behavior where a job runs long after all reducers were already 
> finished. We found that the job was rescheduling and running a number of 
> mappers beyond the point of reducer completion. In one situation, the job ran 
> for some 9 more hours after all reducers completed!
> This happens because whenever a node transition (to an unusable state) comes 
> into the app master, it just reschedules all mappers that already ran on the 
> node in all cases.
> Therefore, any node transition has the potential to extend the job period. 
> Once this window opens, another node transition can prolong it, and this can 
> happen indefinitely in theory.
> If there is some instability in the pool (unhealthy, etc.) for a duration, 
> then any big job is severely vulnerable to this problem.
> If all reducers have been completed, JobImpl.actOnUnusableNode() should not 
> reschedule mapper tasks. If all reducers are completed, the mapper outputs 
> are no longer needed, and there is no need to reschedule mapper tasks as they 
> would not be consumed anyway.
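
A hedged sketch of the guard the last paragraph proposes (the class, method, and 
argument names are invented; the real change would go in 
JobImpl.actOnUnusableNode()):
{code:java}
/** Hypothetical guard, not the actual JobImpl code. */
final class UnusableNodeGuardSketch {

  /**
   * Returns true only while some reduce task still needs the map outputs.
   * Once every reduce task has finished, map outputs can no longer be
   * consumed, so rescheduling completed mappers on an unusable node is
   * pointless.
   */
  boolean shouldRescheduleMapsOnUnusableNode(int completedReduces, int totalReduces) {
    return completedReduces < totalReduces;
  }
}
{code}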



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-5817) Mappers get rescheduled on node transition even after all reducers are completed

2016-08-22 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431581#comment-15431581
 ] 

Chris Trezzo commented on MAPREDUCE-5817:
-

Adding 2.6.5 to the target versions with the intention of backporting this to 
branch-2.6. Please let me know if you think otherwise. Thanks!

> Mappers get rescheduled on node transition even after all reducers are 
> completed
> 
>
> Key: MAPREDUCE-5817
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5817
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Affects Versions: 2.3.0
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Fix For: 2.7.3
>
> Attachments: MAPREDUCE-5817.001.patch, MAPREDUCE-5817.002.patch, 
> mapreduce-5817.patch
>
>
> We're seeing a behavior where a job runs long after all reducers were already 
> finished. We found that the job was rescheduling and running a number of 
> mappers beyond the point of reducer completion. In one situation, the job ran 
> for some 9 more hours after all reducers completed!
> This happens because whenever a node transition (to an unusable state) comes 
> into the app master, it just reschedules all mappers that already ran on the 
> node in all cases.
> Therefore, any node transition has the potential to extend the job period. 
> Once this window opens, another node transition can prolong it, and this can 
> happen indefinitely in theory.
> If there is some instability in the pool (unhealthy, etc.) for a duration, 
> then any big job is severely vulnerable to this problem.
> If all reducers have been completed, JobImpl.actOnUnusableNode() should not 
> reschedule mapper tasks. If all reducers are completed, the mapper outputs 
> are no longer needed, and there is no need to reschedule mapper tasks as they 
> would not be consumed anyway.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-5817) Mappers get rescheduled on node transition even after all reducers are completed

2016-08-22 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-5817:

Target Version/s: 2.8.0, 2.6.5  (was: 2.8.0)

> Mappers get rescheduled on node transition even after all reducers are 
> completed
> 
>
> Key: MAPREDUCE-5817
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5817
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Affects Versions: 2.3.0
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Fix For: 2.7.3
>
> Attachments: MAPREDUCE-5817.001.patch, MAPREDUCE-5817.002.patch, 
> mapreduce-5817.patch
>
>
> We're seeing a behavior where a job runs long after all reducers were already 
> finished. We found that the job was rescheduling and running a number of 
> mappers beyond the point of reducer completion. In one situation, the job ran 
> for some 9 more hours after all reducers completed!
> This happens because whenever a node transition (to an unusable state) comes 
> into the app master, it just reschedules all mappers that already ran on the 
> node in all cases.
> Therefore, any node transition has the potential to extend the job period. 
> Once this window opens, another node transition can prolong it, and this can 
> happen indefinitely in theory.
> If there is some instability in the pool (unhealthy, etc.) for a duration, 
> then any big job is severely vulnerable to this problem.
> If all reducers have been completed, JobImpl.actOnUnusableNode() should not 
> reschedule mapper tasks. If all reducers are completed, the mapper outputs 
> are no longer needed, and there is no need to reschedule mapper tasks as they 
> would not be consumed anyway.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-08-17 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15425190#comment-15425190
 ] 

Chris Trezzo commented on MAPREDUCE-6690:
-

Thanks [~jlowe] for the review and commit!

> Limit the number of resources a single map reduce job can submit for 
> localization
> -
>
> Key: MAPREDUCE-6690
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6690
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Fix For: 2.9.0
>
> Attachments: MAPREDUCE-6690-trunk-v1.patch, 
> MAPREDUCE-6690-trunk-v2.patch, MAPREDUCE-6690-trunk-v3.patch, 
> MAPREDUCE-6690-trunk-v4.patch, MAPREDUCE-6690-trunk-v5.patch, 
> MAPREDUCE-6690-trunk-v6.patch, MAPREDUCE-6690-trunk-v7.patch
>
>
> Users will sometimes submit a large number of resources to be localized as 
> part of a single map reduce job. This can cause issues with YARN localization 
> that destabilize the cluster and potentially impact other user jobs. These 
> resources are specified via the files, libjars, archives and jobjar command 
> line arguments or directly through the configuration (i.e. distributed cache 
> api). The resources specified could be too large in multiple dimensions:
> # Total size
> # Number of files
> # Size of an individual resource (i.e. a large fat jar)
> We would like to encourage good behavior on the client side by having the 
> option of enforcing resource limits along the above dimensions.
> There should be a separate effort to enforce limits at the YARN layer on the 
> server side, but this jira is only covering the map reduce layer on the 
> client side. In practice, having these client side limits will get us a long 
> way towards preventing these localization anti-patterns.
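
As a rough illustration of the kind of client-side check being proposed, here is 
a minimal sketch. The property names below are placeholders of my own invention, 
not necessarily the keys the patch introduces, and the real logic would 
presumably be wired into the job submission path rather than a standalone 
helper:

{noformat}
import org.apache.hadoop.conf.Configuration;

public final class LocalizationLimits {
  // Placeholder keys for the three dimensions listed above (hypothetical names).
  static final String MAX_RESOURCES = "mapreduce.job.cache.limit.max-resources";
  static final String MAX_TOTAL_MB = "mapreduce.job.cache.limit.max-resources-mb";
  static final String MAX_SINGLE_MB = "mapreduce.job.cache.limit.max-single-resource-mb";

  /** Fails fast if the job's resources exceed any configured limit; 0 means unlimited. */
  static void check(Configuration conf, long resourceCount, long totalBytes, long largestBytes) {
    long maxCount = conf.getLong(MAX_RESOURCES, 0);
    long maxTotalBytes = conf.getLong(MAX_TOTAL_MB, 0) * 1024L * 1024L;
    long maxSingleBytes = conf.getLong(MAX_SINGLE_MB, 0) * 1024L * 1024L;
    if (maxCount > 0 && resourceCount > maxCount) {
      throw new IllegalStateException("Too many resources to localize: " + resourceCount);
    }
    if (maxTotalBytes > 0 && totalBytes > maxTotalBytes) {
      throw new IllegalStateException("Total resource size too large: " + totalBytes + " bytes");
    }
    if (maxSingleBytes > 0 && largestBytes > maxSingleBytes) {
      throw new IllegalStateException("Single resource too large: " + largestBytes + " bytes");
    }
  }
}
{noformat}

The point is simply to fail the submission on the client before any upload or 
localization happens, which is much cheaper than having YARN reject the work 
later.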






[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-08-16 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15423593#comment-15423593
 ] 

Chris Trezzo commented on MAPREDUCE-6690:
-

Thanks for the review [~jlowe]! Attached is a v7 patch. Here are the major 
changes:
# Changes to address your comments around getStringCollection, totalConfigSize* 
and ensuring tests failed in the intended way.
# Changes to make the usage of the word resource vs file consistent throughout 
the patch (i.e. a file is a type of resource).

> Limit the number of resources a single map reduce job can submit for 
> localization
> -
>
> Key: MAPREDUCE-6690
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6690
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: MAPREDUCE-6690-trunk-v1.patch, 
> MAPREDUCE-6690-trunk-v2.patch, MAPREDUCE-6690-trunk-v3.patch, 
> MAPREDUCE-6690-trunk-v4.patch, MAPREDUCE-6690-trunk-v5.patch, 
> MAPREDUCE-6690-trunk-v6.patch, MAPREDUCE-6690-trunk-v7.patch
>
>
> Users will sometimes submit a large amount of resources to be localized as 
> part of a single map reduce job. This can cause issues with YARN localization 
> that destabilize the cluster and potentially impact other user jobs. These 
> resources are specified via the files, libjars, archives and jobjar command 
> line arguments or directly through the configuration (i.e. distributed cache 
> api). The resources specified could be too large in multiple dimensions:
> # Total size
> # Number of files
> # Size of an individual resource (i.e. a large fat jar)
> We would like to encourage good behavior on the client side by having the 
> option of enforcing resource limits along the above dimensions.
> There should be a separate effort to enforce limits at the YARN layer on the 
> server side, but this jira is only covering the map reduce layer on the 
> client side. In practice, having these client side limits will get us a long 
> way towards preventing these localization anti-patterns.






[jira] [Updated] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-08-16 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6690:

Attachment: MAPREDUCE-6690-trunk-v7.patch

> Limit the number of resources a single map reduce job can submit for 
> localization
> -
>
> Key: MAPREDUCE-6690
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6690
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: MAPREDUCE-6690-trunk-v1.patch, 
> MAPREDUCE-6690-trunk-v2.patch, MAPREDUCE-6690-trunk-v3.patch, 
> MAPREDUCE-6690-trunk-v4.patch, MAPREDUCE-6690-trunk-v5.patch, 
> MAPREDUCE-6690-trunk-v6.patch, MAPREDUCE-6690-trunk-v7.patch
>
>
> Users will sometimes submit a large amount of resources to be localized as 
> part of a single map reduce job. This can cause issues with YARN localization 
> that destabilize the cluster and potentially impact other user jobs. These 
> resources are specified via the files, libjars, archives and jobjar command 
> line arguments or directly through the configuration (i.e. distributed cache 
> api). The resources specified could be too large in multiple dimensions:
> # Total size
> # Number of files
> # Size of an individual resource (i.e. a large fat jar)
> We would like to encourage good behavior on the client side by having the 
> option of enforcing resource limits along the above dimensions.
> There should be a separate effort to enforce limits at the YARN layer on the 
> server side, but this jira is only covering the map reduce layer on the 
> client side. In practice, having these client side limits will get us a long 
> way towards preventing these localization anti-patterns.






[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-08-05 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409997#comment-15409997
 ] 

Chris Trezzo commented on MAPREDUCE-6690:
-

Filed jira to fix unrelated test failure: MAPREDUCE-6747

> Limit the number of resources a single map reduce job can submit for 
> localization
> -
>
> Key: MAPREDUCE-6690
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6690
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: MAPREDUCE-6690-trunk-v1.patch, 
> MAPREDUCE-6690-trunk-v2.patch, MAPREDUCE-6690-trunk-v3.patch, 
> MAPREDUCE-6690-trunk-v4.patch, MAPREDUCE-6690-trunk-v5.patch, 
> MAPREDUCE-6690-trunk-v6.patch
>
>
> Users will sometimes submit a large amount of resources to be localized as 
> part of a single map reduce job. This can cause issues with YARN localization 
> that destabilize the cluster and potentially impact other user jobs. These 
> resources are specified via the files, libjars, archives and jobjar command 
> line arguments or directly through the configuration (i.e. distributed cache 
> api). The resources specified could be too large in multiple dimensions:
> # Total size
> # Number of files
> # Size of an individual resource (i.e. a large fat jar)
> We would like to encourage good behavior on the client side by having the 
> option of enforcing resource limits along the above dimensions.
> There should be a separate effort to enforce limits at the YARN layer on the 
> server side, but this jira is only covering the map reduce layer on the 
> client side. In practice, having these client side limits will get us a long 
> way towards preventing these localization anti-patterns.






[jira] [Created] (MAPREDUCE-6747) TestMapReduceJobControl#testJobControlWithKillJob times out in trunk

2016-08-05 Thread Chris Trezzo (JIRA)
Chris Trezzo created MAPREDUCE-6747:
---

 Summary: TestMapReduceJobControl#testJobControlWithKillJob times 
out in trunk
 Key: MAPREDUCE-6747
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6747
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Chris Trezzo
Priority: Minor


TestMapReduceJobControl#testJobControlWithKillJob seems to time out while 
waiting for all jobs to complete. This seems to only happen if the test is run 
with the other tests in the class (specifically testJobControlWithFailJob). If 
testJobControlWithKillJob is run by itself, the test passes.

Looking into the test logs, when run with another test from the class, the test 
runs into an issue while setting permissions on the local file system:
{noformat}
2016-08-05 11:40:32,101 WARN  [Thread-100] util.Shell 
(Shell.java:joinThread(1023)) - Interrupted while joining on: 
Thread[Thread-105,5,main]
java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1245)
at java.lang.Thread.join(Thread.java:1319)
at org.apache.hadoop.util.Shell.joinThread(Shell.java:1020)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:969)
at org.apache.hadoop.util.Shell.run(Shell.java:878)
at 
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1172)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:1266)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:1248)
at 
org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:781)
at 
org.apache.hadoop.fs.RawLocalFileSystem.mkOneDirWithMode(RawLocalFileSystem.java:526)
at 
org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:566)
at 
org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:538)
at 
org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:565)
at 
org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:538)
at 
org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:565)
at 
org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:538)
at 
org.apache.hadoop.fs.ChecksumFileSystem.mkdirs(ChecksumFileSystem.java:696)
at 
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.setupJob(FileOutputCommitter.java:343)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:541)
{noformat}

Conversely, when the test is run by itself, this issue is not hit.






[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-08-05 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409982#comment-15409982
 ] 

Chris Trezzo commented on MAPREDUCE-6690:
-

TestMapReduceJobControl#testJobControlWithKillJob times out in trunk without 
this patch. The broken test is unrelated. I will file another jira to fix the 
test, but this patch should be ready for review. Thanks!

> Limit the number of resources a single map reduce job can submit for 
> localization
> -
>
> Key: MAPREDUCE-6690
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6690
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: MAPREDUCE-6690-trunk-v1.patch, 
> MAPREDUCE-6690-trunk-v2.patch, MAPREDUCE-6690-trunk-v3.patch, 
> MAPREDUCE-6690-trunk-v4.patch, MAPREDUCE-6690-trunk-v5.patch, 
> MAPREDUCE-6690-trunk-v6.patch
>
>
> Users will sometimes submit a large amount of resources to be localized as 
> part of a single map reduce job. This can cause issues with YARN localization 
> that destabilize the cluster and potentially impact other user jobs. These 
> resources are specified via the files, libjars, archives and jobjar command 
> line arguments or directly through the configuration (i.e. distributed cache 
> api). The resources specified could be too large in multiple dimensions:
> # Total size
> # Number of files
> # Size of an individual resource (i.e. a large fat jar)
> We would like to encourage good behavior on the client side by having the 
> option of enforcing resource limits along the above dimensions.
> There should be a separate effort to enforce limits at the YARN layer on the 
> server side, but this jira is only covering the map reduce layer on the 
> client side. In practice, having these client side limits will get us a long 
> way towards preventing these localization anti-patterns.






[jira] [Updated] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-08-04 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6690:

Attachment: MAPREDUCE-6690-trunk-v6.patch

V6 attached. Rebased onto trunk.

> Limit the number of resources a single map reduce job can submit for 
> localization
> -
>
> Key: MAPREDUCE-6690
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6690
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: MAPREDUCE-6690-trunk-v1.patch, 
> MAPREDUCE-6690-trunk-v2.patch, MAPREDUCE-6690-trunk-v3.patch, 
> MAPREDUCE-6690-trunk-v4.patch, MAPREDUCE-6690-trunk-v5.patch, 
> MAPREDUCE-6690-trunk-v6.patch
>
>
> Users will sometimes submit a large amount of resources to be localized as 
> part of a single map reduce job. This can cause issues with YARN localization 
> that destabilize the cluster and potentially impact other user jobs. These 
> resources are specified via the files, libjars, archives and jobjar command 
> line arguments or directly through the configuration (i.e. distributed cache 
> api). The resources specified could be too large in multiple dimensions:
> # Total size
> # Number of files
> # Size of an individual resource (i.e. a large fat jar)
> We would like to encourage good behavior on the client side by having the 
> option of enforcing resource limits along the above dimensions.
> There should be a separate effort to enforce limits at the YARN layer on the 
> server side, but this jira is only covering the map reduce layer on the 
> client side. In practice, having these client side limits will get us a long 
> way towards preventing these localization anti-patterns.






[jira] [Commented] (MAPREDUCE-6365) Refactor JobResourceUploader#uploadFilesInternal

2016-07-20 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15386362#comment-15386362
 ] 

Chris Trezzo commented on MAPREDUCE-6365:
-

Thanks [~sjlee0]! The intention is to get MapReduce support for the shared 
cache into 2.9.0, so backporting this refactor to 2.9.0 sounds great.

> Refactor JobResourceUploader#uploadFilesInternal
> 
>
> Key: MAPREDUCE-6365
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6365
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
> Fix For: 3.0.0-alpha2
>
> Attachments: MAPREDUCE-6365-trunk-v1.patch
>
>
> JobResourceUploader#uploadFilesInternal is a large method and there are 
> similar pieces of code that could probably be pulled out into separate 
> methods.  This refactor would improve readability of the code.






[jira] [Commented] (MAPREDUCE-6365) Refactor JobResourceUploader#uploadFilesInternal

2016-07-08 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15368601#comment-15368601
 ] 

Chris Trezzo commented on MAPREDUCE-6365:
-

[~sjlee0] The patch should be good to go and ready for review.

The two minus ones are accounted for:
# The patch does not include unit tests because it is purely a refactor with no 
functional changes. The existing unit tests pass.
# There was one failed test (TestCLI#testGetJob), but that is a known flapping 
test and already has a jira accounting for it (MAPREDUCE-6625). I ran the test 
locally and it passed.



> Refactor JobResourceUploader#uploadFilesInternal
> 
>
> Key: MAPREDUCE-6365
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6365
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
> Attachments: MAPREDUCE-6365-trunk-v1.patch
>
>
> JobResourceUploader#uploadFilesInternal is a large method and there are 
> similar pieces of code that could probably be pulled out into separate 
> methods.  This refactor would improve readability of the code.






[jira] [Updated] (MAPREDUCE-6365) Refactor JobResourceUploader#uploadFilesInternal

2016-07-08 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6365:

Status: Patch Available  (was: Open)

> Refactor JobResourceUploader#uploadFilesInternal
> 
>
> Key: MAPREDUCE-6365
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6365
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
> Attachments: MAPREDUCE-6365-trunk-v1.patch
>
>
> JobResourceUploader#uploadFilesInternal is a large method and there are 
> similar pieces of code that could probably be pulled out into separate 
> methods.  This refactor would improve readability of the code.






[jira] [Updated] (MAPREDUCE-6365) Refactor JobResourceUploader#uploadFilesInternal

2016-07-08 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6365:

Attachment: MAPREDUCE-6365-trunk-v1.patch

v1 patch attached. /cc [~sjlee0]

This is a simple patch that does the following:
# Separates files, libjars, archives, jobjar uploading logic into separate 
methods instead of one big method. This will make the code more readable in the 
future, especially when adding shared cache support.
# Rename the JobResourceUploader#uploadFiles method to uploadResources. This 
more appropriately represents what the method does and avoids a redundant 
method name when creating the new upload methods.
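
To make the intent concrete, here is a rough outline of the resulting shape. 
The per-resource-type method names are illustrative and the real signatures in 
JobResourceUploader differ; see the attached patch for the actual changes:

{noformat}
// Illustrative outline only -- not the actual JobResourceUploader code.
class UploaderOutline {
  // Renamed from uploadFiles() so the name no longer clashes with the
  // per-resource-type helper below.
  void uploadResources() {
    uploadFiles();     // -files
    uploadLibJars();   // -libjars
    uploadArchives();  // -archives
    uploadJobJar();    // the job jar itself
  }

  void uploadFiles()    { /* copy -files entries to the staging directory */ }
  void uploadLibJars()  { /* copy -libjars entries and wire up the classpath */ }
  void uploadArchives() { /* copy -archives entries */ }
  void uploadJobJar()   { /* copy the job jar */ }
}
{noformat}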

> Refactor JobResourceUploader#uploadFilesInternal
> 
>
> Key: MAPREDUCE-6365
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6365
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
> Attachments: MAPREDUCE-6365-trunk-v1.patch
>
>
> JobResourceUploader#uploadFilesInternal is a large method and there are 
> similar pieces of code that could probably be pulled out into separate 
> methods.  This refactor would improve readability of the code.






[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-06-10 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15325223#comment-15325223
 ] 

Chris Trezzo commented on MAPREDUCE-6690:
-

The whitespace errors were for lines that this patch did not touch. I am not 
sure why they appeared during the run. [~jlowe] the patch should be good as is, 
unless you have additional comments. Thanks!

> Limit the number of resources a single map reduce job can submit for 
> localization
> -
>
> Key: MAPREDUCE-6690
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6690
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: MAPREDUCE-6690-trunk-v1.patch, 
> MAPREDUCE-6690-trunk-v2.patch, MAPREDUCE-6690-trunk-v3.patch, 
> MAPREDUCE-6690-trunk-v4.patch, MAPREDUCE-6690-trunk-v5.patch
>
>
> Users will sometimes submit a large amount of resources to be localized as 
> part of a single map reduce job. This can cause issues with YARN localization 
> that destabilize the cluster and potentially impact other user jobs. These 
> resources are specified via the files, libjars, archives and jobjar command 
> line arguments or directly through the configuration (i.e. distributed cache 
> api). The resources specified could be too large in multiple dimensions:
> # Total size
> # Number of files
> # Size of an individual resource (i.e. a large fat jar)
> We would like to encourage good behavior on the client side by having the 
> option of enforcing resource limits along the above dimensions.
> There should be a separate effort to enforce limits at the YARN layer on the 
> server side, but this jira is only covering the map reduce layer on the 
> client side. In practice, having these client side limits will get us a long 
> way towards preventing these localization anti-patterns.






[jira] [Updated] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-06-09 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6690:

Attachment: MAPREDUCE-6690-trunk-v5.patch

> Limit the number of resources a single map reduce job can submit for 
> localization
> -
>
> Key: MAPREDUCE-6690
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6690
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: MAPREDUCE-6690-trunk-v1.patch, 
> MAPREDUCE-6690-trunk-v2.patch, MAPREDUCE-6690-trunk-v3.patch, 
> MAPREDUCE-6690-trunk-v4.patch, MAPREDUCE-6690-trunk-v5.patch
>
>
> Users will sometimes submit a large amount of resources to be localized as 
> part of a single map reduce job. This can cause issues with YARN localization 
> that destabilize the cluster and potentially impact other user jobs. These 
> resources are specified via the files, libjars, archives and jobjar command 
> line arguments or directly through the configuration (i.e. distributed cache 
> api). The resources specified could be too large in multiple dimensions:
> # Total size
> # Number of files
> # Size of an individual resource (i.e. a large fat jar)
> We would like to encourage good behavior on the client side by having the 
> option of enforcing resource limits along the above dimensions.
> There should be a separate effort to enforce limits at the YARN layer on the 
> server side, but this jira is only covering the map reduce layer on the 
> client side. In practice, having these client side limits will get us a long 
> way towards preventing these localization anti-patterns.






[jira] [Updated] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-06-09 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6690:

Attachment: (was: MAPREDUCE-6690-trunk-v5.patch)

> Limit the number of resources a single map reduce job can submit for 
> localization
> -
>
> Key: MAPREDUCE-6690
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6690
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: MAPREDUCE-6690-trunk-v1.patch, 
> MAPREDUCE-6690-trunk-v2.patch, MAPREDUCE-6690-trunk-v3.patch, 
> MAPREDUCE-6690-trunk-v4.patch
>
>
> Users will sometimes submit a large amount of resources to be localized as 
> part of a single map reduce job. This can cause issues with YARN localization 
> that destabilize the cluster and potentially impact other user jobs. These 
> resources are specified via the files, libjars, archives and jobjar command 
> line arguments or directly through the configuration (i.e. distributed cache 
> api). The resources specified could be too large in multiple dimensions:
> # Total size
> # Number of files
> # Size of an individual resource (i.e. a large fat jar)
> We would like to encourage good behavior on the client side by having the 
> option of enforcing resource limits along the above dimensions.
> There should be a separate effort to enforce limits at the YARN layer on the 
> server side, but this jira is only covering the map reduce layer on the 
> client side. In practice, having these client side limits will get us a long 
> way towards preventing these localization anti-patterns.






[jira] [Updated] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-06-09 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6690:

Attachment: MAPREDUCE-6690-trunk-v5.patch

V5 attached.
# Fixed javadoc.
# Test failures are unrelated.

> Limit the number of resources a single map reduce job can submit for 
> localization
> -
>
> Key: MAPREDUCE-6690
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6690
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: MAPREDUCE-6690-trunk-v1.patch, 
> MAPREDUCE-6690-trunk-v2.patch, MAPREDUCE-6690-trunk-v3.patch, 
> MAPREDUCE-6690-trunk-v4.patch, MAPREDUCE-6690-trunk-v5.patch
>
>
> Users will sometimes submit a large amount of resources to be localized as 
> part of a single map reduce job. This can cause issues with YARN localization 
> that destabilize the cluster and potentially impact other user jobs. These 
> resources are specified via the files, libjars, archives and jobjar command 
> line arguments or directly through the configuration (i.e. distributed cache 
> api). The resources specified could be too large in multiple dimensions:
> # Total size
> # Number of files
> # Size of an individual resource (i.e. a large fat jar)
> We would like to encourage good behavior on the client side by having the 
> option of enforcing resource limits along the above dimensions.
> There should be a separate effort to enforce limits at the YARN layer on the 
> server side, but this jira is only covering the map reduce layer on the 
> client side. In practice, having these client side limits will get us a long 
> way towards preventing these localization anti-patterns.






[jira] [Updated] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-06-08 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6690:

Attachment: MAPREDUCE-6690-trunk-v4.patch

V4 attached.
# Fixed checkstyle/javadoc.
# Fixed TestMRJobs failures (test only changes).

> Limit the number of resources a single map reduce job can submit for 
> localization
> -
>
> Key: MAPREDUCE-6690
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6690
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: MAPREDUCE-6690-trunk-v1.patch, 
> MAPREDUCE-6690-trunk-v2.patch, MAPREDUCE-6690-trunk-v3.patch, 
> MAPREDUCE-6690-trunk-v4.patch
>
>
> Users will sometimes submit a large amount of resources to be localized as 
> part of a single map reduce job. This can cause issues with YARN localization 
> that destabilize the cluster and potentially impact other user jobs. These 
> resources are specified via the files, libjars, archives and jobjar command 
> line arguments or directly through the configuration (i.e. distributed cache 
> api). The resources specified could be too large in multiple dimensions:
> # Total size
> # Number of files
> # Size of an individual resource (i.e. a large fat jar)
> We would like to encourage good behavior on the client side by having the 
> option of enforcing resource limits along the above dimensions.
> There should be a separate effort to enforce limits at the YARN layer on the 
> server side, but this jira is only covering the map reduce layer on the 
> client side. In practice, having these client side limits will get us a long 
> way towards preventing these localization anti-patterns.






[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-06-01 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15311624#comment-15311624
 ] 

Chris Trezzo commented on MAPREDUCE-6690:
-

I filed YARN-5192 to address the server-side YARN feature.

> Limit the number of resources a single map reduce job can submit for 
> localization
> -
>
> Key: MAPREDUCE-6690
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6690
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: MAPREDUCE-6690-trunk-v1.patch, 
> MAPREDUCE-6690-trunk-v2.patch, MAPREDUCE-6690-trunk-v3.patch
>
>
> Users will sometimes submit a large amount of resources to be localized as 
> part of a single map reduce job. This can cause issues with YARN localization 
> that destabilize the cluster and potentially impact other user jobs. These 
> resources are specified via the files, libjars, archives and jobjar command 
> line arguments or directly through the configuration (i.e. distributed cache 
> api). The resources specified could be too large in multiple dimensions:
> # Total size
> # Number of files
> # Size of an individual resource (i.e. a large fat jar)
> We would like to encourage good behavior on the client side by having the 
> option of enforcing resource limits along the above dimensions.
> There should be a separate effort to enforce limits at the YARN layer on the 
> server side, but this jira is only covering the map reduce layer on the 
> client side. In practice, having these client side limits will get us a long 
> way towards preventing these localization anti-patterns.






[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-06-01 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15311613#comment-15311613
 ] 

Chris Trezzo commented on MAPREDUCE-6690:
-

Note: I was getting some InvocationTargetException failures during TestMRJobs 
on my local machine. I am submitting the patch anyway to get a precommit run.

> Limit the number of resources a single map reduce job can submit for 
> localization
> -
>
> Key: MAPREDUCE-6690
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6690
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: MAPREDUCE-6690-trunk-v1.patch, 
> MAPREDUCE-6690-trunk-v2.patch, MAPREDUCE-6690-trunk-v3.patch
>
>
> Users will sometimes submit a large amount of resources to be localized as 
> part of a single map reduce job. This can cause issues with YARN localization 
> that destabilize the cluster and potentially impact other user jobs. These 
> resources are specified via the files, libjars, archives and jobjar command 
> line arguments or directly through the configuration (i.e. distributed cache 
> api). The resources specified could be too large in multiple dimensions:
> # Total size
> # Number of files
> # Size of an individual resource (i.e. a large fat jar)
> We would like to encourage good behavior on the client side by having the 
> option of enforcing resource limits along the above dimensions.
> There should be a separate effort to enforce limits at the YARN layer on the 
> server side, but this jira is only covering the map reduce layer on the 
> client side. In practice, having these client side limits will get us a long 
> way towards preventing these localization anti-patterns.






[jira] [Updated] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-06-01 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6690:

Attachment: MAPREDUCE-6690-trunk-v3.patch

V3 attached.

# Addressed comments from [~jlowe].
# Added more unit tests.

> Limit the number of resources a single map reduce job can submit for 
> localization
> -
>
> Key: MAPREDUCE-6690
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6690
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: MAPREDUCE-6690-trunk-v1.patch, 
> MAPREDUCE-6690-trunk-v2.patch, MAPREDUCE-6690-trunk-v3.patch
>
>
> Users will sometimes submit a large amount of resources to be localized as 
> part of a single map reduce job. This can cause issues with YARN localization 
> that destabilize the cluster and potentially impact other user jobs. These 
> resources are specified via the files, libjars, archives and jobjar command 
> line arguments or directly through the configuration (i.e. distributed cache 
> api). The resources specified could be too large in multiple dimensions:
> # Total size
> # Number of files
> # Size of an individual resource (i.e. a large fat jar)
> We would like to encourage good behavior on the client side by having the 
> option of enforcing resource limits along the above dimensions.
> There should be a separate effort to enforce limits at the YARN layer on the 
> server side, but this jira is only covering the map reduce layer on the 
> client side. In practice, having these client side limits will get us a long 
> way towards preventing these localization anti-patterns.






[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-06-01 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15311608#comment-15311608
 ] 

Chris Trezzo commented on MAPREDUCE-6690:
-

bq. should there be a corresponding YARN feature to reject applications that 
are asking for too much localization?

Yes, totally agree. As per the description, I was planning to create a 
follow-up jira for a more complete server-side YARN solution. I am thinking of 
something where we leverage the container launch context so that the node 
manager can be smart about not launching containers that would cause too much 
localization. I haven't thought it through yet, but I will definitely file the 
jira.
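
To make the idea a bit more concrete, here is a sketch of the kind of 
NodeManager-side guard being described (for discussion only, not a design; the 
maxResources parameter is a made-up knob):

{noformat}
import java.util.Map;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.LocalResource;

// Hypothetical server-side check, sketched for discussion only.
final class LocalizationGuard {
  /** True if the container asks for more local resources than the configured cap. */
  static boolean tooManyResources(ContainerLaunchContext ctx, int maxResources) {
    Map<String, LocalResource> resources = ctx.getLocalResources();
    return resources != null && resources.size() > maxResources;
  }
}
{noformat}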

> Limit the number of resources a single map reduce job can submit for 
> localization
> -
>
> Key: MAPREDUCE-6690
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6690
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: MAPREDUCE-6690-trunk-v1.patch, 
> MAPREDUCE-6690-trunk-v2.patch
>
>
> Users will sometimes submit a large amount of resources to be localized as 
> part of a single map reduce job. This can cause issues with YARN localization 
> that destabilize the cluster and potentially impact other user jobs. These 
> resources are specified via the files, libjars, archives and jobjar command 
> line arguments or directly through the configuration (i.e. distributed cache 
> api). The resources specified could be too large in multiple dimensions:
> # Total size
> # Number of files
> # Size of an individual resource (i.e. a large fat jar)
> We would like to encourage good behavior on the client side by having the 
> option of enforcing resource limits along the above dimensions.
> There should be a separate effort to enforce limits at the YARN layer on the 
> server side, but this jira is only covering the map reduce layer on the 
> client side. In practice, having these client side limits will get us a long 
> way towards preventing these localization anti-patterns.






[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-05-31 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308934#comment-15308934
 ] 

Chris Trezzo commented on MAPREDUCE-6690:
-

Thanks for the review [~jlowe]!

bq. Is this intended to apply to all distributed cache items or only those that 
need to be uploaded during job submission?

Yes, it is intended to apply to all distributed cache items as well. Good 
catch! I will add the DC items to the check. As a side note: the reasoning 
for including DC items is that even though the DC items are in an accessible 
place, they could still cause a significant amount of localization to the YARN 
local cache. The amount of localization is affected by the local cache size and 
the hit rate in the cache, but I chose to go with the most conservative 
approach.

I will also address your other comments.

> Limit the number of resources a single map reduce job can submit for 
> localization
> -
>
> Key: MAPREDUCE-6690
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6690
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: MAPREDUCE-6690-trunk-v1.patch, 
> MAPREDUCE-6690-trunk-v2.patch
>
>
> Users will sometimes submit a large amount of resources to be localized as 
> part of a single map reduce job. This can cause issues with YARN localization 
> that destabilize the cluster and potentially impact other user jobs. These 
> resources are specified via the files, libjars, archives and jobjar command 
> line arguments or directly through the configuration (i.e. distributed cache 
> api). The resources specified could be too large in multiple dimensions:
> # Total size
> # Number of files
> # Size of an individual resource (i.e. a large fat jar)
> We would like to encourage good behavior on the client side by having the 
> option of enforcing resource limits along the above dimensions.
> There should be a separate effort to enforce limits at the YARN layer on the 
> server side, but this jira is only covering the map reduce layer on the 
> client side. In practice, having these client side limits will get us a long 
> way towards preventing these localization anti-patterns.






[jira] [Updated] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-05-10 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6690:

Attachment: MAPREDUCE-6690-trunk-v2.patch

Attached is a v2 patch. Here are the updates:
# Fixed checkstyle and whitespace issues.
# Added check for limiting the size of an individual resource.
# Test failures from v1 were unrelated.

> Limit the number of resources a single map reduce job can submit for 
> localization
> -
>
> Key: MAPREDUCE-6690
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6690
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: MAPREDUCE-6690-trunk-v1.patch, 
> MAPREDUCE-6690-trunk-v2.patch
>
>
> Users will sometimes submit a large amount of resources to be localized as 
> part of a single map reduce job. This can cause issues with YARN localization 
> that destabilize the cluster and potentially impact other user jobs. These 
> resources are specified via the files, libjars, archives and jobjar command 
> line arguments or directly through the configuration (i.e. distributed cache 
> api). The resources specified could be too large in multiple dimensions:
> # Total size
> # Number of files
> # Size of an individual resource (i.e. a large fat jar)
> We would like to encourage good behavior on the client side by having the 
> option of enforcing resource limits along the above dimensions.
> There should be a separate effort to enforce limits at the YARN layer on the 
> server side, but this jira is only covering the map reduce layer on the 
> client side. In practice, having these client side limits will get us a long 
> way towards preventing these localization anti-patterns.





