[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319375#comment-14319375
 ] 

Karthik Kambatla commented on MAPREDUCE-5951:
---------------------------------------------

Sorry for the delay in getting to this. Getting a continuous chunk of time to 
look at this somewhat large patch was hard. 

Here are my first round of comments - a combination of high-level and detailed 
comments. Let us see if we can get some of this in through other JIRAs first, 
to allow for a more thorough review.
# DistributedCache changes aren’t central to what this JIRA is trying to 
address. Could we leave them out and address in another JIRA? 
## This has nothing to do with this patch, but it would be nice to make the 
code around setting CLASSPATH_FILES a little more readable. Could we define 
another String prefix to hold “” or classpath, based on whether classpath is 
null. 
# Job
## The new APIs should all be @Unstable
## Let us make the javadoc for the new APIs a little more formal - we don’t 
need to mention SCMClientProtocol.use, or that the APIs are intended for user 
use. Even for the return value, I would go with something like “If shared cache 
is enabled and the resource added successfully, return true. Otherwise, return 
false.”
## How about renaming the methods to addFileToSharedCache, 
addArchiveToSharedCache, addFileToSharedCacheAndClasspath? 
## Make both new methods private static instead of static private.
# JobID changes might not be required. Use ConverterUtils#toApplicationId? 
# JobImpl
## cleanupSharedCacheResources - nit: I would check for (checksums == null || 
checksums.length == 0) and return to save on indentations. 80 chars is already 
too small.
## cleanupSharedCacheUploadPolicies - javadoc should use block comments. Well, 
may be a nit.
# JobSubmitter
## Can we do the code moving from JobSumitter to FileUploader (may be, we need 
a more descriptive name) to another JIRA and look at that first if needed. 
Otherwise, it is hard to review the changes.
## May be, I am misreading the patch. Is this patch hardcoding MR job 
submission to always use SharedCache? If yes, we should definitely avoid that. 
# mapred-default.xml: We need a little more fool-proof config. The way the 
patch currently is, a typo will lead to unexpected behavior without any 
warnings.


> Add support for the YARN Shared Cache
> -------------------------------------
>
>                 Key: MAPREDUCE-5951
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: Chris Trezzo
>            Assignee: Chris Trezzo
>         Attachments: MAPREDUCE-5951-trunk-v1.patch, 
> MAPREDUCE-5951-trunk-v2.patch, MAPREDUCE-5951-trunk-v3.patch, 
> MAPREDUCE-5951-trunk-v4.patch, MAPREDUCE-5951-trunk-v5.patch, 
> MAPREDUCE-5951-trunk-v6.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to