Hi, I am still going in circles ... I still can't pinpoint a single function call that actually talks to HDFS for block locations. It looks as if the classes are making circular calls to getBlockLocations(), each implementation just calling the same method on a different class, so I can't see where it ever reaches HDFS.
Please help!
momina

On 5/7/10, Amogh Vasekar <am...@yahoo-inc.com> wrote:
> Hi,
> The (o.a.h.fs) FileSystem API has getFileBlockLocations, which is used to
> determine replicas.
> In general cases, (o.a.h.mapreduce.lib.input) FileInputFormat's getSplits()
> calls this method, and the result is passed on for job scheduling along
> with the split info.
>
> Hope this is what you were looking for.
>
> Amogh
>
>
> On 5/7/10 4:22 PM, "momina khan" <momina.a...@gmail.com> wrote:
>
> Hi,
>
> I am trying to figure out how Hadoop uses data locality to schedule maps
> on nodes that locally store the map input. Going through the code I keep
> going in circles between a couple of files without really getting
> anywhere; that is to say, I cannot locate the HDFS API or function that
> returns the list of nodes storing the replicas of, say, a block.
>
> I am going from FSNamesystem.java to DFSClient.java to
> BlocksWithLocations.java to DatanodeDescriptor.java and back again,
> without ever reaching the HDFS interface that reports which nodes hold
> the replicas of a block.
>
> Someone please help!
> momina
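
For anyone hitting the same wall: in the 0.20-era source the chain is roughly FileInputFormat.getSplits() -> FileSystem.getFileBlockLocations() -> (for an hdfs:// path) DistributedFileSystem -> DFSClient -> a ClientProtocol.getBlockLocations() RPC to the NameNode, where FSNamesystem finally answers from its block map. The client-side classes never resolve it themselves, which is why following them feels circular. Below is a minimal, self-contained sketch of the client-side call that the MapReduce layer makes; the class name BlockLocationDemo and the command-line path argument are made up for illustration, while getFileBlockLocations and BlockLocation.getHosts are the real o.a.h.fs API.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical demo class: prints the replica hosts of every block of a file.
public class BlockLocationDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path path = new Path(args[0]);                 // e.g. an hdfs:// path to an input file
    FileSystem fs = path.getFileSystem(conf);      // DistributedFileSystem for hdfs:// URIs
    FileStatus status = fs.getFileStatus(path);

    // For an hdfs:// path this call goes DistributedFileSystem -> DFSClient ->
    // ClientProtocol RPC to the NameNode, which answers from FSNamesystem.
    BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());

    for (BlockLocation block : blocks) {
      System.out.println("block at offset " + block.getOffset()
          + ", length " + block.getLength());
      for (String host : block.getHosts()) {       // datanodes holding a replica of this block
        System.out.println("  replica on " + host);
      }
    }
  }
}

getSplits() copies those host names into each FileSplit, and the JobTracker then tries to schedule each map task on a TaskTracker whose host appears in that split's location list; that is where the data locality in map scheduling comes from.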