[ 
https://issues.apache.org/jira/browse/HADOOP-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Zeyliger updated HADOOP-2914:
------------------------------------

    Attachment: HADOOP-2914-v2.patch

bq. In DistributedCacheHandle the class doc should go before the class 
declaration, not at the beginning of the file. Also need to add Apache license.

Done.

bq. Use an enum rather than a boolean for isArchive in CacheFile.

Done.
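
For illustration only, a minimal sketch of the enum-over-boolean idea. The 
names CacheFileSketch and FileType are hypothetical and are not taken from the 
patch; the point is just that the call site names the case instead of passing 
a bare true/false.

```java
import java.net.URI;

// Hypothetical sketch; NOT the code in HADOOP-2914-v2.patch.
class CacheFileSketch {
    // An enum spells out both cases at the call site, unlike a boolean flag.
    enum FileType { REGULAR, ARCHIVE }

    final URI uri;
    final FileType type;

    CacheFileSketch(URI uri, FileType type) {
        this.uri = uri;
        this.type = type;
    }

    // Only archives would need unpacking after localization.
    boolean shouldUnpack() {
        return type == FileType.ARCHIVE;
    }
}
```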

bq. We shouldn't remove public methods from DistributedCache, but rather 
deprecate them and remove them in a future release. Can DistributedCache 
delegate to DistributedCacheManager? I like the fact you have documented the 
intended audience for each public method of DistributedCache. (This paves the 
way to separating the public and private interfaces in future.)

Done.
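
As a rough sketch of the deprecate-and-delegate pattern described above (not 
the actual patch): the old static entry point stays, is marked @Deprecated, 
and forwards to a manager object. CacheManagerSketch, LegacyCacheSketch, and 
their method names are assumptions made up for this example.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the manager class; not the patch's real API.
class CacheManagerSketch {
    private final List<URI> files = new ArrayList<URI>();

    void addFile(URI uri) {
        files.add(uri);
    }

    // Return a copy so callers cannot mutate the manager's state.
    List<URI> getFiles() {
        return new ArrayList<URI>(files);
    }
}

// Hypothetical stand-in for the existing public class.
class LegacyCacheSketch {
    private static final CacheManagerSketch MANAGER = new CacheManagerSketch();

    /** @deprecated kept for compatibility; forwards to the manager. */
    @Deprecated
    static void addCacheFile(URI uri) {
        MANAGER.addFile(uri);
    }

    static List<URI> listCacheFiles() {
        return MANAGER.getFiles();
    }
}
```

Existing callers keep compiling (with a deprecation warning), and the method 
can be removed in a later release.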

My current thinking on APIs (for a future JIRA) is that users should access 
DistributedCache through Job.addToCache(URI, flags) and 
Context.getCachedFiles().  But there's some more work to get there.
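
To make the shape of that idea concrete, here is a hedged sketch. Nothing in 
it exists in Hadoop: JobSketch, ContextSketch, CacheFlag, and newContext() are 
hypothetical, and only the method names addToCache and getCachedFiles come 
from the paragraph above.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumSet;
import java.util.List;

// Hypothetical flags; the real set, if any, is undecided.
enum CacheFlag { ARCHIVE, ADD_TO_CLASSPATH, SYMLINK }

// Hypothetical job-side view: files are registered at submission time.
class JobSketch {
    private final List<URI> cached = new ArrayList<URI>();

    // Flags are accepted but ignored in this sketch.
    void addToCache(URI uri, EnumSet<CacheFlag> flags) {
        cached.add(uri);
    }

    // Hypothetical helper that hands tasks a read-only snapshot.
    ContextSketch newContext() {
        return new ContextSketch(cached);
    }
}

// Hypothetical task-side view: read-only access to what was cached.
class ContextSketch {
    private final List<URI> cached;

    ContextSketch(List<URI> cached) {
        this.cached = Collections.unmodifiableList(new ArrayList<URI>(cached));
    }

    List<URI> getCachedFiles() {
        return cached;
    }
}
```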

bq. Is there duplication between TestMRWithDistributedCache and tests that use 
MRCaching that could be avoided?

Probably, but it's hard to tease out.  MRCaching is more complicated than the 
test I'm adding, and does, I believe, test some things that I don't.  On the 
other hand, TestMRWithDistributedCache tests the classpath stuff.  I'm loath to 
delete tests too eagerly.

bq. Could TestMRWithDistributedCache also test symlinking?

It does now test symlinking.  However, I couldn't (easily) get LocalJobRunner 
to do symlinks appropriately.  LocalJobRunner doesn't currently have a notion 
of task directory, and I think this patch is already quite large.

> extend DistributedCache to work locally (LocalJobRunner)
> --------------------------------------------------------
>
>                 Key: HADOOP-2914
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2914
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: sam rash
>            Assignee: Philip Zeyliger
>            Priority: Minor
>         Attachments: HADOOP-2914-v1-full.patch, 
> HADOOP-2914-v1-since-4041.patch, HADOOP-2914-v2.patch
>
>
> The DistributedCache does not work locally when using the recipe outlined at 
> http://hadoop.apache.org/core/docs/r0.16.0/api/org/apache/hadoop/filecache/DistributedCache.html
>  
> Ideally, LocalJobRunner would take care of populating the JobConf and copying 
> remote files to the local file system (http, assume hdfs = default fs = local 
> fs when doing local development).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
