>> * It seems that HDFS' staging strategy, "In fact, initially the HDFS
>> client caches the file data into a temporary local file. Application writes
>> are transparently redirected to this temporary local file", is quite
>> different from the original GFS paper (see Section 2.3 of the GFS paper: "neither
>> client nor the chunkserver caches file data"). Can someone help me
>> understand it?
>
> People keep referencing this on the list, but it hasn't been that way in
> about 3 years :) Where do you see this, so we can update the docs?

I believe the "Staging" section of http://hadoop.apache.org/hdfs/docs/r0.21.0/hdfs_design.html#Data+Organization is the culprit. If you agree, I'll file the JIRA.