Hi Keith Wiley,

The -files option takes a single argument: a comma-separated list of files
(passed as URIs). It makes them available on the compute nodes for map and
reduce tasks.
For example,
  -files file:///myfiles/file1,file:///myfiles/file2,hdfs://localhost:9000/files/dfsfile
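
To make the structure of that value concrete, here is a rough Python sketch that splits such a comma-separated list into its URIs and shows the scheme, authority, and path of each. This is only an illustration of the format, not Hadoop's own parsing code; the function name split_files_option is made up, and the HDFS URI is written in the hdfs://host:port form.

```python
from urllib.parse import urlsplit


def split_files_option(value):
    """Split a -files style value into (scheme, authority, path) tuples.

    Illustration only: this mimics the shape of the argument,
    it is not how Hadoop itself parses it.
    """
    entries = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # ignore empty pieces such as a trailing comma
        parts = urlsplit(item)
        entries.append((parts.scheme, parts.netloc, parts.path))
    return entries


example = (
    "file:///myfiles/file1,"
    "file:///myfiles/file2,"
    "hdfs://localhost:9000/files/dfsfile"
)

for scheme, authority, path in split_files_option(example):
    print(scheme, authority or "(none)", path)
```

The point is that the whole list is one argument: every comma-separated piece is a URI, and there are no alternating "tag" arguments.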

You can also pass a symlink name in the URI's fragment (the part after '#').
For example,
  -files file:///myfiles/file1#file1,file:///myfiles2/file1#file2
Note that the second example does not work as expected in branch 0.20 (see
http://issues.apache.org/jira/browse/MAPREDUCE-787).
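
As a rough sketch of how the fragment relates to the file, the Python below pulls the symlink name out of each URI in the second example. It is not Hadoop's implementation; the function name link_names is made up, and falling back to the file's base name when there is no fragment is an assumption made for illustration.

```python
import posixpath
from urllib.parse import urlsplit


def link_names(value):
    """Return (path, link_name) pairs for a -files style value.

    Illustration only, not Hadoop's code. Assumption: when a URI has
    no '#fragment', the base name of its path is used as the name.
    """
    pairs = []
    for item in value.split(","):
        parts = urlsplit(item.strip())
        name = parts.fragment or posixpath.basename(parts.path)
        pairs.append((parts.path, name))
    return pairs


with_fragments = "file:///myfiles/file1#file1,file:///myfiles2/file1#file2"
for path, name in link_names(with_fragments):
    print(name, "->", path)
```

Under that assumption, dropping the fragments would give both files the name file1, which is the kind of clash the fragment is there to avoid.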
I hope these examples clear up the confusion.

Thanks
Amareshwari


On 4/10/10 4:44 AM, "Keith Wiley" <[email protected]> wrote:

I'm a little confused how the -files flag works.  My understanding is that it 
takes two arguments: a file URI (could be local or on HDFS, assumed local if no 
URI scheme is provided) and a short "tag" representing the file on the 
distributed cache, usually just the name of the file without the long path that 
precedes it in the URI.

But -files can also pass multiple files to the distributed cache, so how does 
this all fit together?  Are odd arguments all URIs and even arguments all 
cache-tags?  Is it that simple?  I'm not really sure how to fit it all together 
if I need to send several files to the distributed cache (several shared 
libraries for example).



