When you use the command-line option -archives,
a directory named "archives" is created in HDFS under the per-job submission
area, to store the archives.
So there should be no collisions, as long as no other JobTracker is using
the same system directory path (conf.get("mapred.system.dir",
"/tmp/hadoop/mapred/system")) in your HDFS.


On Tue, Sep 29, 2009 at 2:55 AM, Erik Forsberg <[email protected]> wrote:

> Hi!
>
> If I distribute files using the Distributed Cache (-archives option),
> are they guaranteed to be unique per job, or is there a risk that if I
> distribute a file named A with job 1, job 2 which also distributes a
> file named A will read job 1's file?
>
> I think they are unique per job, just want to verify that.
>
> Thanks,
> \EF
> --
> Erik Forsberg <[email protected]>
> Developer, Opera Software - http://www.opera.com/
>



-- 
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals