I'm trying to use the DistributedCache but having an issue resolving the symlinks to my files.

My Driver class writes some hashmaps to files in the DC like this:

    Path tPath = new Path("/data/cache/fd", UUID.randomUUID().toString());
    os = new ObjectOutputStream(fs.create(tPath));
    os.writeObject(myHashMap);
    os.close();
    URI uri = new URI(tPath.toString() + "#" + "q_map");
    DistributedCache.addCacheFile(uri, config);
    DistributedCache.createSymlink(config);

But which Path do I need to open to read the files through the symlinks? I tried variations of "q_map" and "work/q_map", but neither works.

The files are definitely there, because I can set a config variable to the path and read the files in my reducer. For example, in my Driver class I set a variable via

    config.set("q_map_path", tPath.toString());

And then in my Reducer's setup() I do something like

    Path q_map_path = new Path(config.get("q_map_path"));
    if (fs.exists(q_map_path)) {
        HashMap<String,String> qMap = loadmap(config, q_map_path);
    }

I tried to resolve the path to the symlinks via ${mapred.local.dir}/work, but that doesn't work either.

In the STDOUT of my mapper attempt I see:

    2012-05-29 03:59:54,369 - INFO [main:TaskRunner@759] - Creating symlink: /tmp/hadoop-mapred/mapred/local/taskTracker/distcache/-3168904771265144450_-884848596_406879224/varuna010/data/cache/fd/6dc9d5c0-98be-4105-bd59-b344924dd989 <- /tmp/hadoop-mapred/mapred/local/taskTracker/root/jobcache/job_201205250826_0020/attempt_201205250826_0020_m_000000_0/work/q_map

Which says it's creating the symlinks, BUT I also see this output:

    mapred.local.dir: /tmp/hadoop-mapred/mapred/local/taskTracker/root/jobcache/job_201205250826_0020/attempt_201205250826_0020_m_000000_0
     job.local.dir: /tmp/hadoop-mapred/mapred/local/taskTracker/root/jobcache/job_201205250826_0020/work
     mapred.task.id: attempt_201205250826_0020_m_000000_0
    Path [work/q_map] does not exist
    Path [/tmp/hadoop-mapred/mapred/local/taskTracker/root/jobcache/job_201205250826_0020/attempt_201205250826_0020_m_000000_0/work/q_map] does not exist

Which is from this code in my mapper's setup() method:

    try {
        System.out.printf("mapred.local.dir: %s\n", conf.get("mapred.local.dir"));
        System.out.printf(" job.local.dir: %s\n", conf.get("job.local.dir"));
        System.out.printf(" mapred.task.id: %s\n", conf.get("mapred.task.id"));
        fs = FileSystem.get(conf);
        Path symlink = new Path("work/q_map");
        Path fullpath = new Path(conf.get("mapred.local.dir") + "/work/q_map");

        System.out.printf("Path [%s] ", symlink.toString());
        if (fs.exists(symlink)) {
            System.out.println("exists");
        } else {
            System.out.println("does not exist");
        }

        System.out.printf("Path [%s] ", fullpath.toString());
        if (fs.exists(fullpath)) {
            System.out.println("exists");
        } else {
            System.out.println("does not exist");
        }
    } catch (IOException e1) {
        e1.printStackTrace();
    }

Regards,
Alan
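
P.S. In case it helps anyone reproduce this, here is a minimal standalone sketch of how the cache URI from my driver splits into a file path and a link name. It uses only java.net.URI, no Hadoop classes. The class name CacheUriSketch and the helper cacheUri are made up for illustration, and it uses URI.create rather than new URI(...) only to avoid the checked exception. My understanding, which matches the "Creating symlink" log line above, is that the fragment after "#" is the name given to the symlink.

```java
import java.net.URI;

public class CacheUriSketch {
    // Hypothetical helper: builds the same kind of URI string the driver
    // passes to DistributedCache.addCacheFile (file path + "#" + link name).
    static URI cacheUri(String filePath, String linkName) {
        return URI.create(filePath + "#" + linkName);
    }

    public static void main(String[] args) {
        URI uri = cacheUri("/data/cache/fd/6dc9d5c0-98be-4105-bd59-b344924dd989", "q_map");
        // The path part is the file that was written to the cache directory.
        System.out.println("path:     " + uri.getPath());
        // The fragment part is the requested symlink name.
        System.out.println("fragment: " + uri.getFragment());
    }
}
```

So the URI itself looks right to me: the path is the file I wrote and the fragment is "q_map", the same name that shows up at the end of the symlink path in the log.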