[jira] [Comment Edited] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2017-10-05 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193573#comment-16193573
 ] 

Chris Trezzo edited comment on MAPREDUCE-5951 at 10/5/17 8:17 PM:
--

This is the javac warning:
bq. [WARNING] 
/testptch/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/LocalResourceBuilder.java:[34,44]
 [deprecation] DistributedCache in org.apache.hadoop.mapreduce.filecache has 
been deprecated

LocalResourceBuilder is a class that was added to fix a checkstyle warning. I 
have used {{@SuppressWarnings("deprecation")}} at the class level to silence 
the warnings around DistributedCache usage. The remaining warning is about 
the import statement itself. If anyone has an idea for how to apply the 
annotation to the import statement, please let me know. Furthermore, 
LocalResourceBuilder simply refactors the 
MRApps#parseDistributedCacheArtifacts method, so I do not think it makes sense 
to fix the usage of a deprecated interface in this patch, especially since it 
is used in a lot of places.
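
For context: Java annotations cannot be attached to an import statement, so 
{{@SuppressWarnings}} has nothing to bind to there. One possible workaround, 
sketched below, is to drop the import and refer to the deprecated type by its 
fully qualified name inside the annotated class. The sketch uses the JDK's 
deprecated java.io.StringBufferInputStream as a stand-in for DistributedCache; 
SuppressOnClassSketch and firstByte are hypothetical names and are not part of 
the patch. Note also that javac from JDK 9 onward (JEP 211) no longer reports 
deprecation warnings on import statements, so this should only matter on older 
compilers.

```java
// Sketch only: StringBufferInputStream stands in for the deprecated
// DistributedCache; SuppressOnClassSketch and firstByte are hypothetical.
// There is deliberately no import of the deprecated type.
@SuppressWarnings("deprecation")
class SuppressOnClassSketch {

  // Reads the first byte of the given string through the deprecated stream,
  // referring to the type by its fully qualified name. Returns -1 when the
  // string is empty.
  static int firstByte(String text) {
    java.io.StringBufferInputStream in =
        new java.io.StringBufferInputStream(text);
    return in.read();
  }

  public static void main(String[] args) {
    System.out.println(firstByte("A"));
  }
}
```

The trade-off is readability: fully qualified names are noisier than an 
import, so this is probably only worth it when the import warning has to go 
away on a pre-JDK 9 build.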


was (Author: ctrezzo):
This is the javac warning:
{noformat}
[WARNING] 
/testptch/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/LocalResourceBuilder.java:[34,44]
 [deprecation] DistributedCache in org.apache.hadoop.mapreduce.filecache has 
been deprecated
{noformat}

LocalResourceBuilder is a class that was added to fix a checkstyle warning. I 
have used {{@SuppressWarnings("deprecation")}} at the class level to silence 
the warnings around DistributedCache usage. The remaining warning is about 
the import statement itself. If anyone has an idea for how to apply the 
annotation to the import statement, please let me know. Furthermore, 
LocalResourceBuilder simply refactors the 
MRApps#parseDistributedCacheArtifacts method, so I do not think it makes sense 
to fix the usage of a deprecated interface in this patch, especially since it 
is used in a lot of places.

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
> Reporter: Chris Trezzo
> Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-Overview.001.pdf, 
> MAPREDUCE-5951-trunk.016.patch, MAPREDUCE-5951-trunk.017.patch, 
> MAPREDUCE-5951-trunk.018.patch, MAPREDUCE-5951-trunk.019.patch, 
> MAPREDUCE-5951-trunk-020.patch, MAPREDUCE-5951-trunk-021.patch, 
> MAPREDUCE-5951-trunk-v10.patch, MAPREDUCE-5951-trunk-v11.patch, 
> MAPREDUCE-5951-trunk-v12.patch, MAPREDUCE-5951-trunk-v13.patch, 
> MAPREDUCE-5951-trunk-v14.patch, MAPREDUCE-5951-trunk-v15.patch, 
> MAPREDUCE-5951-trunk-v1.patch, MAPREDUCE-5951-trunk-v2.patch, 
> MAPREDUCE-5951-trunk-v3.patch, MAPREDUCE-5951-trunk-v4.patch, 
> MAPREDUCE-5951-trunk-v5.patch, MAPREDUCE-5951-trunk-v6.patch, 
> MAPREDUCE-5951-trunk-v7.patch, MAPREDUCE-5951-trunk-v8.patch, 
> MAPREDUCE-5951-trunk-v9.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2017-04-27 Thread Erik Krogen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15987585#comment-15987585
 ] 

Erik Krogen edited comment on MAPREDUCE-5951 at 4/27/17 8:22 PM:
-

Hey [~ctrezzo], I have a question about the behavior of this patch. Currently 
the old logic for resource visibility is used, so if a resource is 
world-readable, it will be marked as PUBLIC, else PRIVATE. Given my current 
understanding of this patch's behavior, I see the following scenario:
* Client submits a job with libjar X, which has never been used before. Client 
contacts SCM to mark X as "used", SCM responds that it does not have X.
* Client uploads X to staging directory, which I assume here is _not_ 
world-readable. X is marked as PRIVATE.
* MR-AM localizes X, then uploads it to the shared cache. Other NMs all 
localize X as PRIVATE and do not share it with other applications.
* Client then submits the same job with the same X. Client contacts SCM, and 
SCM responds with a world-readable (755 dirs / 555 file) path inside of the 
shared cache.
* Client does not upload X, and marks X as PUBLIC, since it is currently in a 
world-readable location. 
* MR-AM and NMs all localize X as PUBLIC and share it with other applications.

Please correct me if I am wrong on any of these steps. It seems that it is the 
expected behavior that X is eventually PUBLIC, given that we asked for it to be 
uploaded to the publicly shared cache, but it seems unnecessary for it to be 
marked as PRIVATE the first time around. Do we do this just to avoid changing 
the existing logic for marking a resource as PRIVATE vs PUBLIC, is this an 
oversight, or is this behavior desired?
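
The scenario above can be modeled with a small, self-contained sketch. It only 
encodes the steps as described in this comment (visibility derived from 
world-readability, with an in-memory stand-in for the SCM); it is not the 
patch's code, and FakeScm, submit, and visibilityFor are invented names.

```java
import java.util.HashSet;
import java.util.Set;

class VisibilityScenarioSketch {
  enum Visibility { PUBLIC, PRIVATE }

  // Hypothetical in-memory stand-in for the SCM; not a YARN API.
  static final class FakeScm {
    private final Set<String> cached = new HashSet<>();

    // Returns a shared cache path if the resource is known, else null.
    String use(String checksum) {
      return cached.contains(checksum) ? "/sharedcache/" + checksum : null;
    }

    // Models the upload to the shared cache after the first localization.
    void upload(String checksum) {
      cached.add(checksum);
    }
  }

  // The existing rule as described: world-readable means PUBLIC.
  static Visibility visibilityFor(boolean worldReadable) {
    return worldReadable ? Visibility.PUBLIC : Visibility.PRIVATE;
  }

  // One job submission; returns the visibility the resource gets for that job.
  static Visibility submit(FakeScm scm, String checksum,
      boolean stagingWorldReadable) {
    if (scm.use(checksum) != null) {
      // Cache hit: the shared cache path is assumed to be world-readable.
      return visibilityFor(true);
    }
    // Cache miss: the resource is read from the staging directory for this
    // job and only becomes available in the shared cache afterwards.
    Visibility visibility = visibilityFor(stagingWorldReadable);
    scm.upload(checksum);
    return visibility;
  }

  public static void main(String[] args) {
    FakeScm scm = new FakeScm();
    System.out.println(submit(scm, "X", false)); // first job
    System.out.println(submit(scm, "X", false)); // second job, same X
  }
}
```

With a staging directory that is not world-readable, the first submission 
yields PRIVATE and the second yields PUBLIC, which is the asymmetry the 
question is about.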


> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
> Reporter: Chris Trezzo
> Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-Overview.001.pdf, 
> MAPREDUCE-5951-trunk.016.patch, MAPREDUCE-5951-trunk.017.patch, 
> MAPREDUCE-5951-trunk.018.patch, MAPREDUCE-5951-trunk.019.patch, 
> MAPREDUCE-5951-trunk-v10.patch, MAPREDUCE-5951-trunk-v11.patch, 
> MAPREDUCE-5951-trunk-v12.patch, MAPREDUCE-5951-trunk-v13.patch, 
> MAPREDUCE-5951-trunk-v14.patch, MAPREDUCE-5951-trunk-v15.patch, 
> MAPREDUCE-5951-trunk-v1.patch, MAPREDUCE-5951-trunk-v2.patch, 
> MAPREDUCE-5951-trunk-v3.patch, MAPREDUCE-5951-trunk-v4.patch, 
> MAPREDUCE-5951-trunk-v5.patch, MAPREDUCE-5951-trunk-v6.patch, 
> MAPREDUCE-5951-trunk-v7.patch, MAPREDUCE-5951-trunk-v8.patch, 
> MAPREDUCE-5951-trunk-v9.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).






[jira] [Comment Edited] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2016-12-16 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15755192#comment-15755192
 ] 

Chris Trezzo edited comment on MAPREDUCE-5951 at 12/16/16 8:55 PM:
---

Attached is v18 to fix the one checkstyle issue. There are three outstanding 
checkstyle issues that I am leaning towards not fixing as part of the patch. 
Please let me know your thoughts. They are the following:
bq. 
./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java:752:
  private static ContainerLaunchContext createCommonContainerLaunchContext(:3: 
Method length is 172 lines (max allowed is 150).

This patch barely touches this method so it seems wrong to refactor the method 
as part of this jira. I can file a separate jira to fix this.

bq. 
./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/YARNRunner.java:341:
  public ApplicationSubmissionContext createApplicationSubmissionContext(:3: 
Method length is 249 lines (max allowed is 150).

The same reasoning applies to this warning as the previous issue. I can file a 
separate jira to fix this.

bq. 
./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java:572:
  private static void parseDistributedCacheArtifacts(:23: More than 7 
parameters (found 8).

This issue was caused by this patch adding an additional parameter to this 
method. I can fix the parameter-count issue, but that forces me to touch 
three existing calls to the deprecated DistributedCache API, which would fix 1 
warning but create 3 new ones. Moving off the deprecated API is a larger 
change because the existing code is not set up for it; furthermore, use of the 
deprecated API is currently widespread in this code. I am inclined to leave 
this warning as is.
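
For reference, one common way to get under a parameter-count limit without 
changing behavior is to move the arguments into a small builder. The sketch 
below is a generic illustration only: ArtifactParserSketch and its fields are 
hypothetical and do not reflect the actual signature of 
parseDistributedCacheArtifacts.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical builder; field names are illustrative, not the MRApps API.
class ArtifactParserSketch {
  private String[] uris = new String[0];
  private long[] sizes = new long[0];
  private long[] timestamps = new long[0];

  ArtifactParserSketch uris(String... values) {
    this.uris = values;
    return this;
  }

  ArtifactParserSketch sizes(long... values) {
    this.sizes = values;
    return this;
  }

  ArtifactParserSketch timestamps(long... values) {
    this.timestamps = values;
    return this;
  }

  // Returns uri -> "size@timestamp" and rejects mismatched array lengths.
  Map<String, String> parse() {
    if (uris.length != sizes.length || uris.length != timestamps.length) {
      throw new IllegalArgumentException("Mismatched artifact arrays");
    }
    Map<String, String> result = new LinkedHashMap<>();
    for (int i = 0; i < uris.length; i++) {
      result.put(uris[i], sizes[i] + "@" + timestamps[i]);
    }
    return result;
  }
}
```

A call site would then read 
{{new ArtifactParserSketch().uris("a.jar").sizes(10L).timestamps(5L).parse()}}, 
and adding a further input means adding one setter rather than widening every 
caller's argument list.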


was (Author: ctrezzo):
Attached is v18 to fix the one checkstyle issue. There are three outstanding 
checkstyle issues that I am leaning towards not fixing as part of the patch. 
Please let me know your thoughts. They are the following:
bq. 
./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java:752:
  private static ContainerLaunchContext createCommonContainerLaunchContext(:3: 
Method length is 172 lines (max allowed is 150).

This patch barely touches this method so it seems wrong to refactor the patch 
as part of this jira. I can file a separate jira to fix this.

bq. 
./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/YARNRunner.java:341:
  public ApplicationSubmissionContext createApplicationSubmissionContext(:3: 
Method length is 249 lines (max allowed is 150).

The same reasoning applies to this warning as the previous issue. I can file a 
separate jira to fix this.

bq. 
./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java:572:
  private static void parseDistributedCacheArtifacts(:23: More than 7 
parameters (found 8).

This issue was caused by this patch adding an additional parameter to this 
method. I can fix the parameter-count issue, but that forces me to touch 
three existing calls to the deprecated DistributedCache API, which would fix 1 
warning but create 3 new ones. Moving off the deprecated API is a larger 
change because the existing code is not set up for it; furthermore, use of the 
deprecated API is currently widespread in this code. I am inclined to leave 
this warning as is.

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
> Reporter: Chris Trezzo
> Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-trunk-v1.patch, 
> MAPREDUCE-5951-trunk-v10.patch, MAPREDUCE-5951-trunk-v11.patch, 
> MAPREDUCE-5951-trunk-v12.patch, MAPREDUCE-5951-trunk-v13.patch, 
> MAPREDUCE-5951-trunk-v14.patch, MAPREDUCE-5951-trunk-v15.patch, 
> MAPREDUCE-5951-trunk-v2.patch, MAPREDUCE-5951-trunk-v3.patch, 
> MAPREDUCE-5951-trunk-v4.patch, MAPREDUCE-5951-trunk-v5.patch, 
> MAPREDUCE-5951-trunk-v6.patch, MAPREDUCE-5951-trunk-v7.patch, 
> MAPREDUCE-5951-trunk-v8.patch, MAPREDUCE-5951-trunk-v9.patch, 
> MAPREDUCE-5951-trunk.016.patch, MAPREDUCE-5951-trunk.017.patch, 
> MAPREDUCE-5951-trunk.018.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new