Hi Demai,

Sorry, I missed that you had already tried this out. I think you can construct the block's location on the local file system if you have the block pool id and the block id. If you are using the Cloudera distribution, the default location is under /dfs/dn (the value of the dfs.data.dir / dfs.datanode.data.dir configuration keys).
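If it helps, here is a rough sketch (untested, written against the Hadoop 2.3 client; note that DFSClient is an internal class, so this may shift between releases) that prints the block pool id and the block name for each block of a file. With those two values you can locate the backing file under the datanode's data directory:

import java.net.URI;
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSClient;
import org.apache.hadoop.hdfs.protocol.ExtendedBlock;
import org.apache.hadoop.hdfs.protocol.LocatedBlock;
import org.apache.hadoop.hdfs.protocol.LocatedBlocks;

public class BlockToLocalPath {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Assumes fs.defaultFS points at your namenode, e.g. hdfs://nn:8020
    DFSClient client = new DFSClient(new URI(conf.get("fs.defaultFS")), conf);
    try {
      // Ask the namenode for every block of the file
      LocatedBlocks blocks =
          client.getLocatedBlocks("/tmp/test.txt", 0, Long.MAX_VALUE);
      for (LocatedBlock lb : blocks.getLocatedBlocks()) {
        ExtendedBlock b = lb.getBlock();
        // getBlockName() is the on-disk file name, e.g. blk_1073741825
        System.out.println("pool=" + b.getBlockPoolId()
            + " block=" + b.getBlockName()
            + " hosts=" + Arrays.toString(lb.getLocations()));
      }
    } finally {
      client.close();
    }
  }
}

On the datanode itself, finalized replicas sit in nested subdirectories under <dfs.datanode.data.dir>/current/<blockPoolId>/current/finalized, so Stanley's find [DATANODE_DIR] -name [blockname] is still the quickest way to turn the block name into an actual local path (it will also match the accompanying .meta checksum file).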
Thanks,
Yehia

On 27 August 2014 21:20, Yehia Elshater <[email protected]> wrote:

> Hi Demai,
>
> You can use the fsck utility like the following:
>
> hadoop fsck /path/to/your/hdfs/file -files -blocks -locations -racks
>
> This will display all the information you need about the blocks of your
> file.
>
> Hope it helps.
> Yehia
>
>
> On 27 August 2014 20:18, Demai Ni <[email protected]> wrote:
>
>> Hi, Stanley,
>>
>> Many thanks. Your method works. For now, I have a two-step approach:
>> 1) getFileBlockLocations() to grab the hdfs BlockLocation[]
>> 2) use a local file system call (like the find command) to match each
>> block to a file on the local file system.
>>
>> Maybe there is an existing Hadoop API that already returns such info?
>>
>> Demai on the run
>>
>> On Aug 26, 2014, at 9:14 PM, Stanley Shi <[email protected]> wrote:
>>
>> I am not sure this is what you want, but you can try this shell command:
>>
>> find [DATANODE_DIR] -name [blockname]
>>
>>
>> On Tue, Aug 26, 2014 at 6:42 AM, Demai Ni <[email protected]> wrote:
>>
>>> Hi, folks,
>>>
>>> New in this area. Hoping to get a couple of pointers.
>>>
>>> I am using CentOS and have Hadoop set up using CDH 5.1 (Hadoop 2.3).
>>>
>>> I am wondering whether there is an interface to get each hdfs block's
>>> information in terms of the local file system.
>>>
>>> For example, I can use "hadoop fsck /tmp/test.txt -files -blocks -racks"
>>> to get the blockID and its replicas on the nodes, such as: repl=3 [
>>> /rack/hdfs01, /rack/hdfs02...]
>>>
>>> With such info, is there a way to
>>> 1) log in to hdfs01, and read the block directly at the local file
>>> system level?
>>>
>>>
>>> Thanks
>>>
>>> Demai on the run
>>
>>
>>
>>
>> --
>> Regards,
>> *Stanley Shi,*
