Both write directly to HDFS; there's no additional buffering on the client's local file system.
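
To make the idea concrete, here is a rough sketch of the behaviour using only the Java standard library. It is a simplified model, not Hadoop's actual client code: the class name PacketBufferedStream, the tiny 4-byte packet size, and the ByteArrayOutputStream standing in for the connection to a DataNode are all assumptions made for illustration. The point it tries to show is that bytes sit in a small in-memory buffer and are pushed downstream as each packet fills, with no local file involved.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Simplified model only; this is NOT Hadoop's client implementation.
// All names and sizes here are hypothetical.
class PacketBufferedStream extends OutputStream {
    private final OutputStream sink; // stands in for the stream to a DataNode
    private final byte[] packet;     // in-memory packet buffer, never a local file
    private int used = 0;            // bytes currently held in the buffer
    private int packetsSent = 0;     // packets pushed to the sink so far

    PacketBufferedStream(OutputStream sink, int packetSize) {
        this.sink = sink;
        this.packet = new byte[packetSize];
    }

    @Override
    public void write(int b) throws IOException {
        packet[used++] = (byte) b;
        if (used == packet.length) {
            sendPacket(); // buffer is full: ship it downstream immediately
        }
    }

    @Override
    public void flush() throws IOException {
        if (used > 0) {
            sendPacket(); // ship a partially filled final packet
        }
        sink.flush();
    }

    @Override
    public void close() throws IOException {
        flush();
        sink.close();
    }

    int packetsSent() {
        return packetsSent;
    }

    private void sendPacket() throws IOException {
        sink.write(packet, 0, used);
        used = 0;
        packetsSent++;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream fakeDataNode = new ByteArrayOutputStream();
        PacketBufferedStream out = new PacketBufferedStream(fakeDataNode, 4);
        out.write("hello hdfs".getBytes(StandardCharsets.UTF_8)); // 10 bytes
        out.close();
        System.out.println(out.packetsSent() + " packets, "
                + fakeDataNode.size() + " bytes");
    }
}
```

In a real client the packets are much larger and go over a socket to a pipeline of DataNodes, but the shape is the same: memory in, network out.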

-Joey

On Tue, May 31, 2011 at 7:56 PM, Mapred Learn <mapred.le...@gmail.com> wrote:
> Hi guys,
> I asked this question earlier but did not get any response. So, posting
> again. Hope somebody can point to the right description:
>
> When you do hadoop fs -copyFromLocal or use API to call fs.write() (when
> Filesystem fs is HDFS), does it write to local filesystem first before
> writing to HDFS?
>
> I read and found out that it writes on local file-system until block-size is
> reached and then writes on HDFS.
> Wouldn't HDFS Client choke if it writes to local filesystem if multiple such
> fs -copyFromLocal commands are running? I thought at least in fs.write(), if
> you provide byte array, it should not write on local file-system?
>
> Some places I found out that hdfs client and datanode communicate through
> rpc/sockets. Do they write on local file-systems also in this case or is it
> just a buffer in memory that they write directly on HDFS?
> Could somebody point me to some doc/code where I could find out how fs
> -copyFromLocal and fs.write() work? Do they write on local-filesystem
> before block size is reached and then write to HDFS or write directly to
> HDFS?
>
> Thanks in advance,
> -JJ

--
Joseph Echeverria
Cloudera, Inc.
443.305.9434