Yes, you're correct. Also note that the request may sometimes be for 3 replicas, but the NameNode may only be able to grant fewer because the remaining DNs are full, unreachable, or overloaded with threads. In that case the write proceeds with the smaller pipeline, as long as its size is >= dfs.replication.min.
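As a toy illustration of that rule (this is a simplified sketch, not actual NameNode code; the function name and the default of 1 for dfs.replication.min are assumptions for the example):

```python
DFS_REPLICATION_MIN = 1  # assumed default of dfs.replication.min

def pipeline_outcome(requested_replicas, granted_datanodes,
                     replication_min=DFS_REPLICATION_MIN):
    """Toy model: describe what happens to a write given how many
    DataNodes the NameNode could actually grant."""
    if granted_datanodes == 0:
        # The "file could only be replicated to 0 nodes" failure case
        return "fail"
    if granted_datanodes >= replication_min:
        # Write succeeds even with fewer DNs than requested; the
        # under-replicated blocks are re-replicated in the background later.
        return "succeed with pipeline of %d" % granted_datanodes
    return "fail"

print(pipeline_outcome(3, 3))  # succeed with pipeline of 3
print(pipeline_outcome(3, 1))  # succeed with pipeline of 1
print(pipeline_outcome(3, 0))  # fail
```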
If it gets 0 assignments when requesting a write, it runs into this:
wiki.apache.org/hadoop/FAQ#What_does_.22file_could_only_be_replicated_to_0_nodes.2C_instead_of_1.22_mean.3F

On Fri, Jan 27, 2012 at 4:53 AM, Zhenhua (Gerald) Guo <jen...@gmail.com> wrote:
> Thanks, Harsh J. Your answer is quite helpful!
> If I understand right, writes wait until all replicas are created if
> there is no error during the replication process. If there is any
> error in the replication pipeline, dfs.replication.min comes into play.
> Is my understanding correct?
>
> Gerald
>
> On Thu, Jan 26, 2012 at 4:07 PM, Harsh J <ha...@cloudera.com> wrote:
>> Hi,
>>
>> On Fri, Jan 27, 2012 at 12:27 AM, Zhenhua (Gerald) Guo <jen...@gmail.com> wrote:
>>> I have two questions regarding the creation of replicas.
>>> - When a user uploads a file to HDFS, does the call return as soon
>>>   as the first replica is created, or does the client need to wait
>>>   until all replicas are created?
>>> - When the output of MapReduce jobs is written to HDFS (by reduce
>>>   tasks), does the write return when the first replica is created,
>>>   or wait until all replicas are created?
>>
>> Both questions are the same, as both perform the same form of DFS write.
>>
>> Writes are synchronous and replication is pipelined, presently in
>> Apache Hadoop.
>>
>> But a write will succeed if at least one replica was written (controlled
>> via dfs.replication.min -- the pipeline can lose DNs to errors, or can
>> get fewer than the requested DNs because of load/space issues, but the
>> write will succeed if it gets at least one DN).
>>
>> Also see the whole conversation at
>> http://search-hadoop.com/m/bF99W1ZmNqz1 for some more tidbits you
>> might find interesting.
>>
>> --
>> Harsh J
>> Customer Ops. Engineer, Cloudera

--
Harsh J
Customer Ops. Engineer, Cloudera