The namenode decides replica placement in both cases. It just so happens that when the client runs on a datanode, the first replica is stored on that same node. Hope this makes sense.

On Oct 30, 2012 8:13 PM, "Mohit Anchlia" <[email protected]> wrote:
> Thanks and if it is not the datanode then I am guessing namenode decides
> the nodes in replication pipeline?
>
> On Tue, Oct 30, 2012 at 5:36 PM, ranjith raghunath <
> [email protected]> wrote:
>
>> If your client node is a datanode with your cluster then the first copy
>> does get written to that data node.
>>
>> Experts please feel free to correct me here.
>> On Oct 30, 2012 7:11 PM, "Mohit Anchlia" <[email protected]> wrote:
>>
>>> With respect to replication if I run pig job from one of the nodes
>>> within the Hadoop cluster then do I always end up with writing 1 replica
>>> copy to that client node always and remaining 2 replica copies to other
>>> nodes?
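
To make the placement rule discussed in this thread concrete, here is a minimal sketch in Python. It assumes HDFS's default rack-aware policy with a replication factor of 3: first replica on the writer's node if the writer is a datanode (otherwise any node), second replica on a different rack, third replica on another node in the second replica's rack. The function name `choose_replica_nodes`, the `node_racks` mapping, and the node names are all hypothetical and are not part of any Hadoop API. The real namenode also weighs free space, load, and node health, which this toy model ignores.

```python
import random


def choose_replica_nodes(writer, node_racks, replication=3, rng=None):
    """Toy model of namenode-side replica placement (hypothetical, not a Hadoop API).

    writer      -- host name of the client doing the write
    node_racks  -- dict mapping datanode name -> rack name
    replication -- number of replicas wanted
    """
    if replication < 1:
        return []
    rng = rng or random.Random()
    chosen = []
    all_nodes = list(node_racks)

    def pick(candidates):
        # Pick one not-yet-chosen node from candidates, or None if none are left.
        pool = sorted(n for n in candidates if n not in chosen)
        return rng.choice(pool) if pool else None

    # Replica 1: the writer's own node if it is a datanode, otherwise any node.
    first = writer if writer in node_racks else pick(all_nodes)
    if first is None:
        return chosen
    chosen.append(first)

    # Replica 2: prefer a node on a different rack than replica 1.
    if len(chosen) < replication:
        remote = [n for n in all_nodes if node_racks[n] != node_racks[first]]
        second = pick(remote) or pick(all_nodes)
        if second is not None:
            chosen.append(second)

    # Replica 3: prefer another node on the same rack as replica 2.
    if len(chosen) == 2 and replication > 2:
        same_rack = [n for n in all_nodes if node_racks[n] == node_racks[chosen[1]]]
        third = pick(same_rack) or pick(all_nodes)
        if third is not None:
            chosen.append(third)

    # Any further replicas: any remaining node (a simplification).
    while len(chosen) < replication:
        extra = pick(all_nodes)
        if extra is None:
            break
        chosen.append(extra)
    return chosen


if __name__ == "__main__":
    racks = {"dn1": "r1", "dn2": "r1", "dn3": "r2", "dn4": "r2"}
    # Client running on a datanode: the first entry is always that node.
    print(choose_replica_nodes("dn1", racks))
    # Client outside the cluster: all three nodes are chosen for it.
    print(choose_replica_nodes("edge-client", racks))
```

In both calls the choice is made by the same function, which mirrors the point above: the namenode picks the pipeline either way, and a client that is itself a datanode simply gets its own node as the first target.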
