Hey Momina,

Here's the call path on 0.20:

DistributedFileSystem#getFileBlockLocations
-> DFSClient#getFileBlockLocations
    -> callGetBlockLocations
       -> ClientProtocol#getBlockLocations
           -> (via proxy) NameNode#getBlockLocations
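
That delegation chain can be sketched in plain Java. The class and method
names below mirror the real ones, but the signatures are hypothetical
simplifications (the real calls take offsets and lengths and return
BlockLocation / LocatedBlocks objects, not string lists):

```java
// Hypothetical, simplified model of the 0.20 delegation chain.
import java.util.Arrays;
import java.util.List;

// Mirrors ClientProtocol#getBlockLocations (the client-NameNode RPC interface).
interface ClientProtocol {
    List<String> getBlockLocations(String src);
}

// Stands in for NameNode, the server-side implementation of ClientProtocol.
class NameNode implements ClientProtocol {
    public List<String> getBlockLocations(String src) {
        // The real NameNode consults its block map; here we fake two hosts.
        return Arrays.asList("datanode1:50010", "datanode2:50010");
    }
}

// Stands in for DFSClient: holds a ClientProtocol handle and delegates to it.
class DFSClient {
    private final ClientProtocol namenode;
    DFSClient(ClientProtocol namenode) { this.namenode = namenode; }

    // Mirrors DFSClient#getFileBlockLocations -> callGetBlockLocations.
    List<String> getFileBlockLocations(String src) {
        return namenode.getBlockLocations(src);
    }
}

// Stands in for DistributedFileSystem, the public FileSystem entry point.
class DistributedFileSystem {
    private final DFSClient dfs;
    DistributedFileSystem(DFSClient dfs) { this.dfs = dfs; }

    List<String> getFileBlockLocations(String src) {
        return dfs.getFileBlockLocations(src);
    }
}

public class PathSketch {
    public static void main(String[] args) {
        DistributedFileSystem fs =
            new DistributedFileSystem(new DFSClient(new NameNode()));
        System.out.println(fs.getFileBlockLocations("/user/momina/input"));
    }
}
```

The point is that each layer only forwards the call; the actual answer comes
from whatever implements ClientProtocol, which in a running cluster is the
NameNode on the far side of an RPC proxy.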

See createNamenode and createRPCNamenode in the DFSClient constructor
for how the RPC proxy is established.
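
The proxy mechanism itself is just java.lang.reflect.Proxy: the RPC layer
hands DFSClient an object implementing ClientProtocol, so a plain method
call is turned into a wire request. Here's a simplified stand-in, not the
real o.a.h.ipc code; the fake handler answers locally instead of actually
contacting a NameNode:

```java
// Simplified sketch of how an RPC proxy for ClientProtocol is constructed.
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.List;

interface ClientProtocol {
    List<String> getBlockLocations(String src);
}

public class RpcProxySketch {

    // Simplified stand-in for DFSClient.createRPCNamenode.
    static ClientProtocol createRPCNamenode() {
        return (ClientProtocol) Proxy.newProxyInstance(
            ClientProtocol.class.getClassLoader(),
            new Class<?>[] { ClientProtocol.class },
            (proxy, method, args) -> {
                // The real InvocationHandler serializes method.getName() and
                // args and sends them to the NameNode's RPC server; here we
                // fake the reply that would come back.
                System.out.println("would send RPC: " + method.getName());
                return Arrays.asList("datanode1:50010");
            });
    }

    public static void main(String[] args) {
        // DFSClient only ever sees the ClientProtocol interface.
        ClientProtocol namenode = createRPCNamenode();
        System.out.println(namenode.getBlockLocations("/user/momina/input"));
    }
}
```

That's why the code looks "circular" when you read it: every class you land
in calls getBlockLocations on an interface, and the implementation that does
the real work lives in the NameNode process.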

Thanks,
Eli

On Fri, May 7, 2010 at 8:16 PM, momina khan <momina.a...@gmail.com> wrote:
> hi
>
> I am still going in circles ... I still can't pinpoint a single
> function call that actually talks to HDFS for block locations. It
> looks as if the classes are making circular calls to
> getBlockLocations(), each implementation just calling the same method
> on a different class ... I mean, it never seems to talk to HDFS anywhere.
>
> Please help!
> momina
>
> On 5/7/10, Amogh Vasekar <am...@yahoo-inc.com> wrote:
>> Hi,
>> The (o.a.h.fs) FileSystem API has getFileBlockLocations, which is used to
>> determine replica locations.
>> In the common case, (o.a.h.mapreduce.lib.input) FileInputFormat's getSplits()
>> calls this method, and the block locations are passed along with the split
>> info for job scheduling.
>>
>> Hope this is what you were looking for.
>>
>> Amogh
>>
>>
>> On 5/7/10 4:22 PM, "momina khan" <momina.a...@gmail.com> wrote:
>>
>> hi,
>>
>> I am trying to figure out how Hadoop uses data locality to schedule maps on
>> nodes that locally store the map input. Going through the code I keep going
>> in circles between a couple of files without really getting anywhere; that
>> is to say, I can't locate the HDFS API or function that returns the list of
>> nodes storing the replicas of, say, a block!
>>
>> I am going from FSNamesystem.java to DFSClient.java to
>> BlocksWithLocations.java to DatanodeDescriptor.java and then back again
>> without reaching the HDFS interface that reports which nodes store a
>> block's replicas!
>>
>> Please help!
>> momina
>>
>>
>
