1. Hmm, .tgz files are unpacked, but the name doesn't change.
2. Append "#name" to the archive URI to have it symlinked into the working directory as name.
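
A minimal sketch of point 2, assuming the archive lives at /user/sguha/Rdist.tar.gz on the default filesystem and that "Rdist" is the link name you want (both are guesses from your output, not something I have checked). The helper name withSymlink is made up for illustration, and the two DistributedCache calls are left as comments so the snippet runs without Hadoop on the classpath.

```java
import java.net.URI;

class CacheArchiveLink {
    // Hypothetical helper: appends "#linkName" as the URI fragment.
    // DistributedCache reads the fragment as the symlink name.
    public static URI withSymlink(String archivePath, String linkName) {
        return URI.create(archivePath + "#" + linkName);
    }

    public static void main(String[] args) {
        // Path and link name are assumptions based on the output in the question.
        URI archive = withSymlink("/user/sguha/Rdist.tar.gz", "Rdist");
        System.out.println(archive);               // the URI to register
        System.out.println(archive.getFragment()); // the symlink name

        // In the job driver (Hadoop 0.20 on the classpath, conf is the job Configuration):
        //   DistributedCache.addCacheArchive(archive, conf);
        //   DistributedCache.createSymlink(conf);
    }
}
```

With the fragment in place, the task's working directory should get a symlink called Rdist pointing at the directory where the archive was unpacked; without it, the archive is still localized under distcache but nothing is linked.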

On Thu, Apr 28, 2011 at 10:23 PM, Saptarshi Guha <[email protected]> wrote:
> Hello,
>
> From the docs (for 0.20) for DistributedCache [1] I'm under the
> impression that .tgz files will be unzipped, untarred and symlinked
> into the job's current dir.
>
> However, when running the job, this little fragment [2] reveals (I
> have called DistributedCache.createSymlink(config_); just after
> adding the cache components)
>
> Arch=/data01/hadoop/mapred/mapred/taskTracker/distcache/5775566659502863353_-129792898_530471609/a.X.com/user/sguha/tmp/rhipe-hbase.jar
> Arch=/data01/hadoop/mapred/mapred/taskTracker/distcache/5324957355881422466_25039836_529778096/a.X.com/user/sguha/Rdist.tar.gz
> File=/data01/hadoop/mapred/mapred/taskTracker/distcache/1213508244132138160_-278348214_531319237/a.X.com/user/sguha/mscript.sh
>
> But having inspected the ls -lR of the working directory, I don't see
> this happening (only mscript.sh was symlinked; it was added via
> addCacheFile)
>
> ls -lR
> .:
> total 12
> lrwxrwxrwx 1 mapred mapred 90 Apr 28 22:11 job.jar -> /data01/hadoop/mapred/mapred/taskTracker/sguha/jobcache/job_201102231451_6814/jars/job.jar
> lrwxrwxrwx 1 mapred mapred 141 Apr 28 22:11 mscript.sh -> /data01/hadoop/mapred/mapred/taskTracker/distcache/1213508244132138160_-278348214_531319237/a.X.com/user/sguha/mscript.sh
> drwxr-xr-x 2 mapred mapred 4096 Apr 28 22:11 tmp
> ./tmp:
> total 0
>
> In summary:
>
> - I added mscript.sh via addCacheFile - symlinked into working
>   directory. OK
> - I added a JAR file with some classes I needed - added using
>   addArchiveToClassPath and this worked too - OK
> - I added a tgz file hoping it would be untarred, unzipped and
>   symlinked in current folder (using addCacheArchive) - NOT-OK
>
> Have I missed anything?
>
> Cheers
> Joy
>
> [1] http://hadoop.apache.org/common/docs/r0.20.0/api/org/apache/hadoop/filecache/DistributedCache.html
> [2] Path[] localArchives =
>         DistributedCache.getLocalCacheArchives(context.getConfiguration());
>     Path[] localFiles =
>         DistributedCache.getLocalCacheFiles(context.getConfiguration());
>     for(Path p : localArchives) System.out.println("Arch="+p);
>     for(Path p : localFiles) System.out.println("File="+p);
