When you use the command-line option -archives,
a directory named "archives" is created in HDFS under the per-job submission
area, to store the archives.
So there should be no collisions, as long as no other JobTracker is using
the same system directory path (conf.get("mapred.system.dir",
"/tmp/hadoop/mapred/system")) in your HDFS.


On Tue, Sep 29, 2009 at 2:55 AM, Erik Forsberg <[email protected]> wrote:

> Hi!
>
> If I distribute files using the Distributed Cache (-archives option),
> are they guaranteed to be unique per job, or is there a risk that if I
> distribute a file named A with job 1, job 2 which also distributes a
> file named A will read job 1's file?
>
> I think they are unique per job, just want to verify that.
>
> Thanks,
> \EF
> --
> Erik Forsberg <[email protected]>
> Developer, Opera Software - http://www.opera.com/
>



-- 
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals