Re: performance of hadoop fs -put

2014-01-29 Thread Jay Vyas
No, I'm using a glob pattern; it's all done in one put statement.
On Tue, Jan 28, 2014 at 9:22 PM, Harsh J ha...@cloudera.com wrote:
Are you calling one command per file? That's bound to be slow as it invokes a new JVM each time.
On Jan 29, 2014 7:15 AM, Jay Vyas jayunit...@gmail.com wrote:

performance of hadoop fs -put

2014-01-28 Thread Jay Vyas
I'm finding that hadoop fs -put on a cluster is quite slow for me when I have large numbers of small files... much slower than native file ops. Note that I'm using the RawLocalFileSystem as the underlying backing filesystem that is being written to in this case, so HDFS isn't the issue. I see that

Re: performance of hadoop fs -put

2014-01-28 Thread Harsh J
Are you calling one command per file? That's bound to be slow as it invokes a new JVM each time.
On Jan 29, 2014 7:15 AM, Jay Vyas jayunit...@gmail.com wrote:
I'm finding that hadoop fs -put on a cluster is quite slow for me when I have large numbers of small files... much slower than native
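
To make the distinction Harsh asks about concrete, here is a minimal sketch in Python that only builds the argument lists for the two approaches and counts how many times the `hadoop` launcher (and so a JVM) would be started. It does not run anything. The file names, the destination path, and the helper names `per_file_commands` and `batched_command` are hypothetical and not taken from the thread; the only thing assumed about the CLI is that `hadoop fs -put` accepts several local sources followed by one destination, which is also what a shell glob expands to.

```python
from typing import List


def per_file_commands(files: List[str], dest: str) -> List[List[str]]:
    """One `hadoop fs -put` invocation per file: one JVM launch each."""
    return [["hadoop", "fs", "-put", f, dest] for f in files]


def batched_command(files: List[str], dest: str) -> List[List[str]]:
    """A single invocation carrying every source path, as a glob would produce."""
    return [["hadoop", "fs", "-put", *files, dest]]


# Hypothetical inputs, for illustration only.
files = ["data/part-0001.txt", "data/part-0002.txt", "data/part-0003.txt"]
dest = "/user/example/in"

slow = per_file_commands(files, dest)  # one argv list per file
fast = batched_command(files, dest)    # a single argv list

print(len(slow), "launches versus", len(fast))
```

Each inner list could be passed to `subprocess.run` as-is. Note that Jay reports already using the single-invocation form, so this only illustrates the question being asked, not the cause of the slowdown he sees.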