Yes.

Joep

On Fri, May 17, 2013 at 6:38 AM, John Lilley <[email protected]> wrote:

> Right, sorry for the ambiguity, I was talking about HDFS writes only.
>
> So my application doesn't need to do anything to signal that it is writing
> from inside vs. outside of the Hadoop cluster, it figures that out from IP
> or hostname?
>
>
> -----Original Message-----
> From: Harsh J [mailto:[email protected]]
> Sent: Thursday, May 16, 2013 11:12 PM
> To: <[email protected]>
> Subject: Re: Question about writing HDFS files
>
> Thanks for the clarification Rahul. In that case, then the reading is
> correct (and that an HDFS client behaves the same, in and out of MR - it's
> not really related to MR at all).
>
> A "client outside" would write to a random set of datanodes, across at
> least two racks for 3 replicas if rack awareness is turned on.
>
> On Fri, May 17, 2013 at 8:17 AM, Rahul Bhattacharjee <
> [email protected]> wrote:
> > Hi Harsh,
> >
> > I think what John meant by writing to local disk is writing to the
> > same data node first which has initiated the write call.
> >
> > John can further clarify.
> >
> >
> > On Fri, May 17, 2013 at 4:23 AM, Harsh J <[email protected]> wrote:
> >>
> >> That is not true. HDFS writes are not staged to a local disk first
> >> before being written onto the DataNodes. The old architecture docs
> >> seem to suggest that the writes get staged to a local disk but that's
> >> not true anymore, see https://issues.apache.org/jira/browse/HDFS-1454.
> >>
> >> Also worth noting that an HDFS client behaves the same way in almost
> >> all contexts, whether it's invoked from an MR framework or directly
> >> from shell.
> >>
> >> On Fri, May 17, 2013 at 3:38 AM, John Lilley
> >> <[email protected]>
> >> wrote:
> >> > I seem to recall reading that when a MapReduce task writes a file,
> >> > the blocks of the file are always written to local disk, and
> >> > replicated to other nodes. If this is true, is this also true for
> >> > non-MR applications writing to HDFS from Hadoop worker nodes? What
> >> > about clients outside of the cluster doing a file load?
> >> >
> >> > Thanks
> >> >
> >> > John
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Harsh J
> >
> >
> >
> > --
> Harsh J
>
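
For anyone reading this thread later, the placement behavior described above can be sketched roughly in Python. This is only a toy model under stated assumptions, not Hadoop code: `Node`, `choose_replicas`, and the host and rack names are made up for illustration, and the logic is a simplified reading of the default policy with 3 replicas and rack awareness turned on. The writer passes no "inside/outside" flag; the sketch just checks whether the writer's hostname matches a known datanode, which mirrors the IP/hostname answer given above.

```python
import random
from collections import namedtuple

# Hypothetical toy model of a datanode: a hostname plus the rack it sits on.
Node = namedtuple("Node", ["host", "rack"])


def choose_replicas(datanodes, writer_host, rng=None):
    """Pick three datanodes for one block (assumes at least 3 datanodes)."""
    rng = rng or random.Random()
    by_host = {n.host: n for n in datanodes}

    def pick(preferred, used):
        # Choose among the preferred nodes; fall back to any unused node.
        pool = [n for n in preferred if n not in used]
        if not pool:
            pool = [n for n in datanodes if n not in used]
        return rng.choice(pool)

    # Replica 1: the writer's own node if the writer is a datanode,
    # otherwise (a client outside the cluster) a random datanode.
    first = by_host.get(writer_host) or rng.choice(datanodes)
    # Replica 2: a node on a different rack than replica 1.
    second = pick([n for n in datanodes if n.rack != first.rack], [first])
    # Replica 3: a different node on the same rack as replica 2.
    third = pick([n for n in datanodes if n.rack == second.rack],
                 [first, second])
    return [first, second, third]


# Made-up cluster: six datanodes spread over two racks.
cluster = [Node("dn%d" % i, "rack%d" % (i % 2)) for i in range(6)]
print(choose_replicas(cluster, "dn1"))        # writer runs on a datanode
print(choose_replicas(cluster, "edge-node"))  # writer is outside the cluster
```

In both calls the three replicas end up on two racks; the only difference is that the first replica lands on `dn1` when the writer is itself a datanode.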
