Hi Akhil,

DistributedCache.addCacheArchive takes a path on HDFS. From your code, it looks
like you are passing a local path.
Also, if you want to create a symlink, you should pass the URI as hdfs://<path>#<linkname>, besides calling DistributedCache.createSymlink(conf);
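To illustrate the URI form (the namenode host/port here is hypothetical; adjust to your cluster): the path part is the HDFS location of the archive, and the fragment after # is the symlink name that will appear in the task's working directory. A minimal sketch using java.net.URI shows how the two parts are carried:

```java
import java.net.URI;

public class CacheUriDemo {
    public static void main(String[] args) throws Exception {
        // Hypothetical cluster address; the #Config fragment names the
        // symlink created in the task's local working directory.
        URI cacheUri = new URI("hdfs://namenode:9000/home/akhil1988/Config.zip#Config");
        System.out.println(cacheUri.getPath());      // prints /home/akhil1988/Config.zip
        System.out.println(cacheUri.getFragment());  // prints Config
    }
}
```

You would then pass such a URI to DistributedCache.addCacheArchive(cacheUri, conf) and call DistributedCache.createSymlink(conf), so the unarchived directory is reachable as Config/ from the map task.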

Thanks
Amareshwari


akhil1988 wrote:
Please ask questions if anything is unclear about the problem I am
facing.

Thanks,
Akhil

akhil1988 wrote:
Hi All!

I want a directory to be present in the local working directory of the
task for which I am using the following statements:
DistributedCache.addCacheArchive(new URI("/home/akhil1988/Config.zip"),
conf);
DistributedCache.createSymlink(conf);

Here Config is a directory which I have zipped and put at the given
location in HDFS.
I have zipped the directory because the API doc of DistributedCache
(http://hadoop.apache.org/core/docs/r0.20.0/api/index.html) says that
archive files are unzipped in the local cache directory:

DistributedCache can be used to distribute simple, read-only data/text
files and/or more complex types such as archives, jars etc. Archives (zip,
tar and tgz/tar.gz files) are un-archived at the slave nodes.

So, from my understanding of the API docs, I expect that the Config.zip
file will be unzipped to a Config directory, and since I have symlinked it
I can access the directory in the following manner from my map function:

FileInputStream fin = new FileInputStream("Config/file1.config");

But I get a FileNotFoundException when this statement executes.
Please let me know where I am going wrong.

Thanks,
Akhil
