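Hi,

The FileSystem API (o.a.h.fs) has getFileBlockLocations(), which returns the hosts that hold replicas of each block within a byte range of a file. In the general case, FileInputFormat's getSplits() (o.a.h.mapreduce.lib.input) calls this method and records the returned hosts in each split; those hosts are then passed on for job scheduling along with the rest of the split info.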
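To make the flow concrete, here is a minimal, dependency-free sketch of how split locality can be derived from block locations. It is a simplified model, not the Hadoop source: `Block` and `Split` are hypothetical stand-ins for `BlockLocation` and `FileSplit`, and details such as the split-size computation are left out. In a real job the block array would come from `FileSystem.getFileBlockLocations(status, 0, status.getLen())`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SplitLocalitySketch {

    /** Hypothetical stand-in for o.a.h.fs.BlockLocation: a block's byte range and replica hosts. */
    static final class Block {
        final long offset;
        final long length;
        final String[] hosts;

        Block(long offset, long length, String... hosts) {
            this.offset = offset;
            this.length = length;
            this.hosts = hosts;
        }
    }

    /** Hypothetical stand-in for a file split: a byte range plus the hosts preferred for it. */
    static final class Split {
        final long start;
        final long length;
        final String[] hosts;

        Split(long start, long length, String[] hosts) {
            this.start = start;
            this.length = length;
            this.hosts = hosts;
        }
    }

    /** Returns the index of the block that contains the given file offset. */
    static int blockIndex(Block[] blocks, long offset) {
        for (int i = 0; i < blocks.length; i++) {
            long end = blocks[i].offset + blocks[i].length;
            if (blocks[i].offset <= offset && offset < end) {
                return i;
            }
        }
        throw new IllegalArgumentException("Offset " + offset + " is outside the file");
    }

    /** Cuts the file into fixed-size splits; each split inherits the hosts of the block it starts in. */
    static List<Split> splits(Block[] blocks, long fileLength, long splitSize) {
        List<Split> result = new ArrayList<>();
        for (long start = 0; start < fileLength; start += splitSize) {
            long length = Math.min(splitSize, fileLength - start);
            Block block = blocks[blockIndex(blocks, start)];
            result.add(new Split(start, length, block.hosts));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up layout: a 100-byte file stored as two blocks with three replicas each.
        Block[] blocks = {
            new Block(0, 64, "node1", "node2", "node3"),
            new Block(64, 36, "node2", "node4", "node5"),
        };
        for (Split s : splits(blocks, 100, 64)) {
            System.out.println(s.start + "+" + s.length + " -> " + Arrays.toString(s.hosts));
        }
    }
}
```

The host list attached to each split is what the scheduler later uses as a hint when it tries to place the map task on a node that already stores the data.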
Hope this is what you were looking for.

Amogh

On 5/7/10 4:22 PM, "momina khan" <momina.a...@gmail.com> wrote:

Hi,

I am trying to figure out how Hadoop uses data locality to schedule maps on the nodes that locally store the map input. Going through the code, I keep going in circles between a couple of files without really getting anywhere. That is, I can't locate the HDFS API or function that communicates the list of nodes storing the replicas of, say, a block. I go from FSNameSystem.java to DFSClient.java to BlocksWithLocations.java to DataNodeDescriptor.java and then back again without reaching the HDFS interface that reports which nodes store a block's replicas.

Could someone please help?

momina