Hi,

I've found it much easier to write the file to HDFS using the API, then pass
the path to the file in HDFS as a job property. You'll need to remember to
clean up the file after you're done with it.
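
In code, the approach looks roughly like this (an untested sketch against
the 0.20 FileSystem API; the property name "my.config.path", the paths, and
configBytes are just placeholders):

// imports: org.apache.hadoop.conf.Configuration, org.apache.hadoop.fs.*

// Driver side, before submitting the job:
FileSystem fs = FileSystem.get(conf);
Path configPath = new Path("/user/akhil1988/job-config/file1.config");
FSDataOutputStream out = fs.create(configPath);
out.write(configBytes); // whatever bytes you need to ship to the tasks
out.close();
conf.set("my.config.path", configPath.toString());

// Task side, e.g. in your mapper's configure(JobConf):
Path p = new Path(conf.get("my.config.path"));
FSDataInputStream in = FileSystem.get(conf).open(p);
// ... read your config from 'in' ...
in.close();

// Driver side again, once the job has finished:
fs.delete(configPath, true);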

Example details are in this thread:
http://groups.google.com/group/cascading-user/browse_thread/thread/d5c619349562a8d6#

Hope this helps,

Chris

On Thu, Jun 25, 2009 at 4:50 PM, akhil1988 <akhilan...@gmail.com> wrote:

>
> Please ask if anything about the problem I described is unclear.
>
> Thanks,
> Akhil
>
> akhil1988 wrote:
> >
> > Hi All!
> >
> > I want a directory to be present in the local working directory of the
> > task, for which I am using the following statements:
> >
> > DistributedCache.addCacheArchive(new URI("/home/akhil1988/Config.zip"), conf);
> > DistributedCache.createSymlink(conf);
> >
> > Here, Config is a directory which I have zipped and put at the given
> > location in HDFS.
> >
> > I have zipped the directory because the API doc of DistributedCache
> > (http://hadoop.apache.org/core/docs/r0.20.0/api/index.html) says that
> > the archive files are unzipped in the local cache directory:
> >
> > "DistributedCache can be used to distribute simple, read-only data/text
> > files and/or more complex types such as archives, jars etc. Archives
> > (zip, tar and tgz/tar.gz files) are un-archived at the slave nodes."
> >
> > So, from my understanding of the API docs, I expect that the Config.zip
> > file will be unzipped to a Config directory and, since I have symlinked
> > it, that I can access the directory in the following manner from my map
> > function:
> >
> > FileInputStream fin = new FileInputStream("Config/file1.config");
> >
> > But I get a FileNotFoundException when this statement executes.
> > Please let me know where I am going wrong.
> >
> > Thanks,
> > Akhil
> >
>
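
P.S. If you would rather stick with DistributedCache, my reading of the
0.20 docs (untested here) is that the symlink takes its name from the URI
fragment, so the archive URI needs a "#Config" suffix before a Config link
shows up in the task's working directory:

// The fragment ("#Config") names the symlink in the task's working
// directory; without a fragment, createSymlink() has nothing to link.
DistributedCache.addCacheArchive(
    new URI("/home/akhil1988/Config.zip#Config"), conf);
DistributedCache.createSymlink(conf);

// The map task should then see the unpacked archive as:
FileInputStream fin = new FileInputStream("Config/file1.config");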
