I'm finding that "hadoop fs -put" on a cluster is quite slow when I
have a large number of small files, much slower than native file operations.
Note that I'm using RawLocalFileSystem as the underlying backing
filesystem being written to in this case, so HDFS isn't the issue.

I see that the Put class creates a LinkedList whose size is the number of
elements in the path.

1) Is there a more performant way to run "fs -put"?

2) Has anyone else noticed that "fs -put" has extra overhead?

I'm going to trace some more, but I just wanted to bounce this off the
mailing list; maybe others have also run into this issue.

** Is "hadoop fs -put" inherently slower than a Unix "cp" action, regardless
of filesystem, and if so, why? **
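
For anyone who wants to reproduce the comparison, here is a minimal sketch in
Python. It is not taken from any Hadoop code; the helper names, file count,
file size, and destination URI are all assumptions made for illustration. It
generates a directory of small files, times a plain recursive copy, and then
times "hadoop fs -put" on the same directory if the hadoop command is on the
PATH. Note that the put timing includes JVM startup, and that a file:// URI
is normally served by the checksumming LocalFileSystem unless the client is
configured to use RawLocalFileSystem, so the destination may need adjusting.

```python
import shutil
import subprocess
import tempfile
import time
from pathlib import Path


def make_small_files(directory: Path, count: int, size: int = 1024) -> None:
    """Create `count` files of `size` bytes each in `directory`."""
    directory.mkdir(parents=True, exist_ok=True)
    payload = b"x" * size
    for i in range(count):
        (directory / f"part-{i:05d}").write_bytes(payload)


def time_native_copy(src: Path, dst: Path) -> float:
    """Time a plain recursive copy, roughly comparable to `cp -r`."""
    start = time.perf_counter()
    shutil.copytree(src, dst)
    return time.perf_counter() - start


def time_hadoop_put(src: Path, dst_uri: str):
    """Time `hadoop fs -put src dst_uri`; return None if hadoop is not on PATH."""
    if shutil.which("hadoop") is None:
        return None
    start = time.perf_counter()
    # Wall-clock time here includes JVM startup, not just the copy itself.
    subprocess.run(["hadoop", "fs", "-put", str(src), dst_uri], check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        src = root / "src"
        make_small_files(src, count=1000)  # assumed workload: 1000 x 1 KiB
        print("native copy (s):", time_native_copy(src, root / "native"))
        # Assumed destination: a file:// URI under the same temp directory.
        print("hadoop put (s): ", time_hadoop_put(src, (root / "put").as_uri()))
```

Running it with a few different file counts should show whether the gap grows
per file or is mostly a fixed startup cost.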


-- 
Jay Vyas
http://jayunit100.blogspot.com
