Please see the sample in
src/core/org/apache/hadoop/filecache/DistributedCache.java:
 * JobConf job = new JobConf();
 * DistributedCache.addCacheFile(new URI("/myapp/lookup.dat#lookup.dat"), job);
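Also note that in the new API the Job constructor copies the Configuration it
is handed, so a file added to the original conf afterwards never reaches the
job. A sketch of the fix for your run() method (untested here, same
0.20-era classes as your snippet):

```java
public int run(String[] args) throws Exception {
    Configuration conf = getConf();
    Job job = new Job(conf, "Job");
    // new Job(conf) made a private copy of conf, so register the cache
    // file on the Job's own Configuration, not on the original conf.
    DistributedCache.addCacheFile(new Path(args[0]).toUri(),
            job.getConfiguration());
    return job.waitForCompletion(true) ? 0 : 1;
}
```

With that change, getLocalCacheFiles() in your mapper's setup() should
return a non-null array.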
On Thu, Apr 15, 2010 at 12:56 PM, Larry Compton
<[email protected]> wrote:
> I'm trying to use the distributed cache in a MapReduce job written to the
> new API (org.apache.hadoop.mapreduce.*). In my "Tool" class, a file path is
> added to the distributed cache as follows:
>
> public int run(String[] args) throws Exception {
>     Configuration conf = getConf();
>     Job job = new Job(conf, "Job");
>     ...
>     DistributedCache.addCacheFile(new Path(args[0]).toUri(), conf);
>     ...
>     return job.waitForCompletion(true) ? 0 : 1;
> }
>
> The "setup()" method in my mapper tries to read the path as follows:
>
> protected void setup(Context context) throws IOException {
>     Path[] paths =
>         DistributedCache.getLocalCacheFiles(context.getConfiguration());
> }
>
> But "paths" is null.
>
> I'm assuming I'm setting up the distributed cache incorrectly. I've seen a
> few hints in previous mailing list postings indicating that the distributed
> cache is accessed via the Job and JobContext objects in the revised API,
> but the javadocs don't seem to support that.
>
> Thanks.
> Larry
>
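For what it's worth, the copy-on-construction gotcha behind this can be
reproduced with nothing but the JDK. Below, a Properties snapshot stands in
for the Configuration copy that new Job(conf) makes (the property name is
illustrative only):

```java
import java.util.Properties;

public class CopyGotcha {
    public static void main(String[] args) {
        Properties conf = new Properties();

        // Like new Job(conf), the "job" takes a snapshot of conf up front.
        Properties jobConf = new Properties();
        jobConf.putAll(conf);

        // Mutating the original afterwards does not reach the copy...
        conf.setProperty("mapred.cache.files", "/myapp/lookup.dat");
        System.out.println(jobConf.getProperty("mapred.cache.files"));
        // prints: null

        // ...so the setting has to go on the job's own copy.
        jobConf.setProperty("mapred.cache.files", "/myapp/lookup.dat");
        System.out.println(jobConf.getProperty("mapred.cache.files"));
        // prints: /myapp/lookup.dat
    }
}
```

The same reasoning explains why adding the cache file to conf after
constructing the Job leaves getLocalCacheFiles() returning null.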