Arun C Murthy wrote:

On Feb 23, 2009, at 2:01 AM, Bing TANG wrote:

Hi, everyone,
Could someone tell me how the "-file" option works when using Hadoop
Streaming? I want to ship a big file to the slaves, so how does it work?

Does Hadoop use "SCP" to copy it? How does Hadoop deal with the -file option?


No, -file just copies the file from the local filesystem to HDFS, and the DistributedCache copies it to the local filesystem of the node on which the map/reduce task runs.

The -file option does not use the DistributedCache yet; HADOOP-2622 is still open for that. Instead, -file ships the files along with the streaming jar (it unpacks the jar, copies the files in, and packs the jar again). You can use -files, -libjars and -archives to copy files to the distributed cache.
-Amareshwari
Arun
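
To make the "unpack, copy, repack" step described above concrete, here is a minimal Python sketch. It is not Hadoop's actual code: the function name repack_jar_with_files and its parameters are hypothetical, and it only relies on the fact that a jar is a zip archive.

```python
import os
import shutil
import tempfile
import zipfile


def repack_jar_with_files(jar_path, extra_files, out_path):
    """Illustration only (hypothetical helper, not Hadoop's implementation):
    unpack jar_path, copy extra_files into its root, and write out_path."""
    workdir = tempfile.mkdtemp(prefix="jar_repack_")
    try:
        # 1. Unpack the existing jar (a jar is a zip archive).
        with zipfile.ZipFile(jar_path) as jar:
            jar.extractall(workdir)
        # 2. Copy the files to be shipped next to the unpacked contents.
        for path in extra_files:
            shutil.copy(path, os.path.join(workdir, os.path.basename(path)))
        # 3. Pack everything into a new jar.
        with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as out:
            for root, _dirs, names in os.walk(workdir):
                for name in names:
                    full = os.path.join(root, name)
                    out.write(full, os.path.relpath(full, workdir))
    finally:
        # Remove the temporary working directory in all cases.
        shutil.rmtree(workdir)
    return out_path
```

If the shipping works this way, a file passed with -file travels inside the job jar itself, so a big file makes the jar correspondingly bigger.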

