I'm trying to use the DistributedCache but having an issue resolving the 
symlinks to my files.

My Driver class writes some HashMaps to files and adds them to the
DistributedCache like this:
        Path tPath = new Path("/data/cache/fd", UUID.randomUUID().toString());
        os = new ObjectOutputStream(fs.create(tPath));
        os.writeObject(myHashMap);
        os.close();
        URI uri = new URI(tPath.toString() + "#" + "q_map");
        DistributedCache.addCacheFile(uri, config);
        DistributedCache.createSymlink(config);
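
For reference, the part after "#" in the cache URI is the fragment, and the task log further down shows the link being created under exactly that name (q_map). A minimal standalone sketch, using only java.net.URI and no Hadoop classes, of how such a URI splits into path and fragment. The class and method names are hypothetical, and the path just reuses the UUID from the log:

```java
import java.net.URI;

// Standalone illustration (no Hadoop needed) of how the cache URI splits
// into the file path and the link name carried in the fragment.
class CacheUriDemo {
    // Builds the same kind of URI as the driver: "<path>#<linkName>".
    static URI withLinkName(String path, String linkName) {
        return URI.create(path + "#" + linkName);
    }

    public static void main(String[] args) {
        URI uri = withLinkName(
                "/data/cache/fd/6dc9d5c0-98be-4105-bd59-b344924dd989", "q_map");
        System.out.println("path:     " + uri.getPath());
        System.out.println("fragment: " + uri.getFragment());
    }
}
```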

But which Path do I need to open to read through the symlinks?
I tried "q_map" and "work/q_map", but neither works.

The files are definitely there because I can set a config var to the path and 
read the files in my reducer. For example, in my Driver class I set a variable 
via
        config.set("q_map", tPath.toString());

And then in my Reducer's setup() I do something like
        Path q_map_path = new Path(conf.get("q_map"));
        if (fs.exists(q_map_path)) {
                HashMap<String,String> qMap = loadmap(conf, q_map_path);
        }
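
The loadmap helper is not shown above. A minimal sketch of what such a write/read round trip might look like, assuming plain Java serialization as in the Driver snippet. To stay runnable without Hadoop it works on a local java.io.File instead of a Hadoop Path, and the names MapFileDemo, saveMap, loadMap and roundTrip are hypothetical:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;

// Hypothetical stand-in for the driver's write step and the reducer's
// loadmap() call, using local files rather than a Hadoop FileSystem.
class MapFileDemo {
    // Serializes the map to the given file.
    static void saveMap(HashMap<String, String> map, File file) {
        try (ObjectOutputStream os =
                 new ObjectOutputStream(new FileOutputStream(file))) {
            os.writeObject(map);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the map back; the cast is unchecked because of type erasure.
    @SuppressWarnings("unchecked")
    static HashMap<String, String> loadMap(File file) {
        try (ObjectInputStream is =
                 new ObjectInputStream(new FileInputStream(file))) {
            return (HashMap<String, String>) is.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Writes the map to a temporary file and loads it again.
    static HashMap<String, String> roundTrip(HashMap<String, String> map) {
        try {
            File tmp = File.createTempFile("q_map", ".ser");
            tmp.deleteOnExit();
            saveMap(map, tmp);
            return loadMap(tmp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HashMap<String, String> m = new HashMap<>();
        m.put("key", "value");
        System.out.println(roundTrip(m));
    }
}
```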

I tried to resolve the path to the symlinks via ${mapred.local.dir}/work but 
that doesn't work either. 
In the STDOUT of my mapper attempt I see:

  2012-05-29 03:59:54,369 - INFO  [main:TaskRunner@759] - Creating symlink:
    /tmp/hadoop-mapred/mapred/local/taskTracker/distcache/-3168904771265144450_-884848596_406879224/varuna010/data/cache/fd/6dc9d5c0-98be-4105-bd59-b344924dd989
    <- /tmp/hadoop-mapred/mapred/local/taskTracker/root/jobcache/job_201205250826_0020/attempt_201205250826_0020_m_000000_0/work/q_map

Which says it's creating the symlinks, BUT I also see this output: 

mapred.local.dir: /tmp/hadoop-mapred/mapred/local/taskTracker/root/jobcache/job_201205250826_0020/attempt_201205250826_0020_m_000000_0
   job.local.dir: /tmp/hadoop-mapred/mapred/local/taskTracker/root/jobcache/job_201205250826_0020/work
  mapred.task.id: attempt_201205250826_0020_m_000000_0
Path [work/q_map] does not exist
Path [/tmp/hadoop-mapred/mapred/local/taskTracker/root/jobcache/job_201205250826_0020/attempt_201205250826_0020_m_000000_0/work/q_map] does not exist

Which is from this code in my mapper's setup() method:
try {
        System.out.printf("mapred.local.dir: %s\n", conf.get("mapred.local.dir"));
        System.out.printf("   job.local.dir: %s\n", conf.get("job.local.dir"));
        System.out.printf("  mapred.task.id: %s\n", conf.get("mapred.task.id"));
        fs = FileSystem.get(conf);
        Path symlink = new Path("work/q_map");
        Path fullpath = new Path(conf.get("mapred.local.dir") + "/work/q_map");
        System.out.printf("Path [%s] ", symlink.toString());
        if (fs.exists(symlink)) {
                System.out.println("exists");
        } else {
                System.out.println("does not exist");
        }
        System.out.printf("Path [%s] ", fullpath.toString());
        if (fs.exists(fullpath)) {
                System.out.println("exists");
        } else {
                System.out.println("does not exist");
        }
} catch (IOException e1) {
        e1.printStackTrace();
}
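
One thing that may be worth ruling out: the log shows the link being created on the TaskTracker's local disk, while FileSystem.get(conf) returns whichever file system the configuration names as the default. If that default is HDFS, the exists() calls above would be answered by HDFS rather than the local disk. A minimal standalone sketch of a purely local check with java.io.File follows. The class name and the temp-directory setup are hypothetical, and it is an assumption (not confirmed by the output above) that the task's working directory is the attempt's work directory, in which case new File("q_map") would be the path to try inside setup():

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Standalone illustration of checking a symlink on the local disk,
// bypassing Hadoop's FileSystem abstraction entirely.
class LocalSymlinkCheck {
    // True if the path (following symlinks) exists on the local disk.
    static boolean existsLocally(File f) {
        return f.exists();
    }

    // Creates <tmp>/target.bin and a symlink <tmp>/q_map pointing at it,
    // then reports whether the link resolves. Cleans up afterwards.
    static boolean demo() {
        try {
            Path dir = Files.createTempDirectory("work");
            Path target = Files.write(dir.resolve("target.bin"),
                                      new byte[] {1, 2, 3});
            Path link = Files.createSymbolicLink(dir.resolve("q_map"), target);
            boolean found = existsLocally(link.toFile());
            Files.delete(link);
            Files.delete(target);
            Files.delete(dir);
            return found;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("temp symlink resolves locally: " + demo());
        // Inside a task, the analogous check would be on the relative name
        // (assumption: the current directory is the attempt's work dir).
        System.out.println("q_map in current dir: "
                + existsLocally(new File("q_map")));
    }
}
```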

Regards,
Alan
