Hi Keith Wiley,

The -files option takes a comma-separated list of files (passed as URIs) and makes them available on the compute nodes for map and reduce tasks. For example:

-files file:///myfiles/file1,file:///myfiles/file2,hdfs://localhost:9000/files/dfsfile
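To make the shape of the argument concrete, here is a minimal sketch of how such a comma-separated value could be split into individual URIs, with an optional #fragment read as the link name. This is only an illustration and not Hadoop's own parsing code: the class FilesOptionSketch and its parse method are hypothetical, and falling back to the last path component when there is no fragment is an assumption based on the usual convention.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Hadoop.
class FilesOptionSketch {

    // Splits a -files value on commas and maps each link name to the URI path.
    // Assumption: with no "#fragment", the last path component is the name.
    static Map<String, String> parse(String filesArg) {
        Map<String, String> links = new LinkedHashMap<>();
        for (String entry : filesArg.split(",")) {
            URI uri = URI.create(entry.trim());
            String path = uri.getPath();
            String fragment = uri.getFragment();
            String name = (fragment != null)
                    ? fragment
                    : path.substring(path.lastIndexOf('/') + 1);
            links.put(name, path);
        }
        return links;
    }

    public static void main(String[] args) {
        String arg = "file:///myfiles/file1#file1,"
                + "file:///myfiles2/file1#file2,"
                + "hdfs://localhost:9000/files/dfsfile";
        // Print one "name -> path" pair per entry of the -files value.
        parse(arg).forEach((name, path) -> System.out.println(name + " -> " + path));
    }
}
```

So the arguments are not alternating URIs and tags: each comma-separated entry is one URI, and any tag travels inside that URI after the # character.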
You can also pass a symlink name in the URI's fragment. For example:

-files file:///myfiles/file1#file1,file:///myfiles2/file1#file2

However, the second example does not work as expected in branch 0.20 (see http://issues.apache.org/jira/browse/MAPREDUCE-787).

I hope the above examples clear up the confusion.

Thanks
Amareshwari

On 4/10/10 4:44 AM, "Keith Wiley" <[email protected]> wrote:

I'm a little confused about how the -files flag works. My understanding is that it takes two arguments: a file URI (could be local or on HDFS, assumed local if no URI scheme is provided) and a short "tag" representing the file on the distributed cache, usually just the name of the file without the long path that precedes it in the URI.

But -files can also pass multiple files to the distributed cache, so how does this all go together? Are odd arguments all URIs and even arguments all cache-tags? Is it that simple? I'm not really sure how to fit it all together if I need to send several files to the distributed cache (several shared libraries, for example).
