Please see the sample in
src/core/org/apache/hadoop/filecache/DistributedCache.java:
 * JobConf job = new JobConf();
 * DistributedCache.addCacheFile(new URI("/myapp/lookup.dat#lookup.dat"), job);
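Also note that in the new API the Job constructor copies the Configuration it
is handed, so a file added to the original conf afterwards never reaches the
job. A sketch of the fix for your run() method (untested here, same
0.20-era classes as your snippet):

```java
public int run(String[] args) throws Exception {
    Configuration conf = getConf();
    Job job = new Job(conf, "Job");
    // new Job(conf) made a private copy of conf, so register the cache
    // file on the Job's own Configuration, not on the original conf.
    DistributedCache.addCacheFile(new Path(args[0]).toUri(),
            job.getConfiguration());
    return job.waitForCompletion(true) ? 0 : 1;
}
```

With that change, getLocalCacheFiles() in your mapper's setup() should
return a non-null array.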
On Thu, Apr 15, 2010 at 12:56 PM, Larry Compton
<[email protected]> wrote:
> I'm trying to use the distributed cache in a MapReduce job written to the
> new API (org.apache.hadoop.mapreduce.*). In my "Tool" class, a file path is
> added to the distributed cache as follows:
>
> public int run(String[] args) throws Exception {
>     Configuration conf = getConf();
>     Job job = new Job(conf, "Job");
>     ...
>     DistributedCache.addCacheFile(new Path(args[0]).toUri(), conf);
>     ...
>     return job.waitForCompletion(true) ? 0 : 1;
> }
>
> The "setup()" method in my mapper tries to read the path as follows:
>
> protected void setup(Context context) throws IOException {
>     Path[] paths =
>         DistributedCache.getLocalCacheFiles(context.getConfiguration());
> }
>
> But "paths" is null.
>
> I'm assuming I'm setting up the distributed cache incorrectly. I've seen a
> few hints in previous mailing list postings indicating that the distributed
> cache is accessed via the Job and JobContext objects in the revised API,
> but the javadocs don't seem to support that.
>
> Thanks.
> Larry
>
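For what it's worth, the copy-on-construction gotcha behind this can be
reproduced with nothing but the JDK. Below, a Properties snapshot stands in
for the Configuration copy that new Job(conf) makes (the property name is
illustrative only):

```java
import java.util.Properties;

public class CopyGotcha {
    public static void main(String[] args) {
        Properties conf = new Properties();

        // Like new Job(conf), the "job" takes a snapshot of conf up front.
        Properties jobConf = new Properties();
        jobConf.putAll(conf);

        // Mutating the original afterwards does not reach the copy...
        conf.setProperty("mapred.cache.files", "/myapp/lookup.dat");
        System.out.println(jobConf.getProperty("mapred.cache.files"));
        // prints: null

        // ...so the setting has to go on the job's own copy.
        jobConf.setProperty("mapred.cache.files", "/myapp/lookup.dat");
        System.out.println(jobConf.getProperty("mapred.cache.files"));
        // prints: /myapp/lookup.dat
    }
}
```

The same reasoning explains why adding the cache file to conf after
constructing the Job leaves getLocalCacheFiles() returning null.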