Hey Momina,

Here's the path on 20:
DistributedFileSystem#getFileBlockLocations
  -> DFSClient#getFileBlockLocations
  -> callGetBlockLocations
  -> ClientProtocol#getBlockLocations
  -> (via proxy) NameNode#getBlockLocations

See createNamenode and createRPCNamenode in the DFSClient constructor for
how the RPC proxy is established.

Thanks,
Eli

On Fri, May 7, 2010 at 8:16 PM, momina khan <momina.a...@gmail.com> wrote:
> hi
>
> I am still going in circles ... I still can't pinpoint a single
> function call that interacts with HDFS for block locations. It is as
> if the files are making circular calls to getBlockLocations(), which
> is implemented such that it calls the same function in a different
> class ... I mean, it is not talking to HDFS anywhere.
>
> please help!
> momina
>
> On 5/7/10, Amogh Vasekar <am...@yahoo-inc.com> wrote:
>> Hi,
>> The (o.a.h.fs) FileSystem API has getFileBlockLocations, which is
>> used to determine replicas.
>> In the general case, (o.a.h.mapreduce.lib.input) FileInputFormat's
>> getSplits() calls this method, and the result is passed on for job
>> scheduling along with the split info.
>>
>> Hope this is what you were looking for.
>>
>> Amogh
>>
>> On 5/7/10 4:22 PM, "momina khan" <momina.a...@gmail.com> wrote:
>>
>> hi,
>>
>> I am trying to figure out how Hadoop uses data locality to schedule
>> maps on nodes that locally store the map input. Going through the
>> code, I keep going in circles between a couple of files without
>> really getting anywhere; that is to say, I can't locate the HDFS API
>> or function that can communicate the list of nodes that store the
>> replicas of, say, a block.
>>
>> I am going from FSNameSystem.java to DFSClient.java to
>> BlocksWithLocations.java to DataNodeDescriptor.java and back again
>> without getting to the HDFS interface that communicates replicas'
>> storing nodes for a block.
>>
>> someone please help!
>> momina
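The delegation chain Eli describes can be sketched as a toy Java program. These are simplified stand-ins, not the real Hadoop classes: the real ClientProtocol proxy goes over RPC to the NameNode process, and the real return type is BlockLocation[] / LocatedBlocks rather than a plain string.

```java
// Toy model of the HDFS block-location lookup chain (hypothetical
// stand-in classes; the real versions live in o.a.h.hdfs and go over RPC).
interface ClientProtocol {                        // the client<->NameNode RPC interface
    String getBlockLocations(String src);
}

class NameNode implements ClientProtocol {        // server side: owns the block map
    public String getBlockLocations(String src) {
        return "datanode1,datanode2,datanode3";   // fake replica list for illustration
    }
}

class DFSClient {
    private final ClientProtocol namenode;        // in real HDFS: an RPC proxy,
    DFSClient(ClientProtocol namenode) {          // built in createRPCNamenode
        this.namenode = namenode;
    }
    String getFileBlockLocations(String src) {
        return namenode.getBlockLocations(src);   // the actual network hop in HDFS
    }
}

class DistributedFileSystem {
    private final DFSClient dfs;                  // FileSystem facade delegates down
    DistributedFileSystem(DFSClient dfs) { this.dfs = dfs; }
    String getFileBlockLocations(String src) {
        return dfs.getFileBlockLocations(src);
    }
}

public class Main {
    public static void main(String[] args) {
        DistributedFileSystem fs =
            new DistributedFileSystem(new DFSClient(new NameNode()));
        // Looks like a circular chain of same-named methods, but the calls
        // terminate at the NameNode, which holds the block->replica mapping.
        System.out.println(fs.getFileBlockLocations("/user/momina/input"));
    }
}
```

The point of the sketch is why the code "looks circular": each layer repeats the method name while delegating one level down, and only the last hop (the proxy call to the NameNode) leaves the client process.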