Hi Lars, thank you very much for your reply. As you said, a region server writes its HFiles so that all of their data blocks end up on the same physical machine, unless the region server fails. I ran the following hadoop command to see the block locations of one HFile:
*hadoop fsck /hbase/testtable/9488ef7fbd23b62b9bf85b722c015e90/testcf/08dc1940944b4952b23f0cbee51bcea8 -files -locations -blocks*

Here is the output:

FSCK started by hadoop from /203.235.211.142 for path /hbase/testtable/9488ef7fbd23b62b9bf85b722c015e90/testcf/08dc1940944b4952b23f0cbee51bcea8 at Thu Jun 21 13:40:11 KST 2012
/hbase/testtable/9488ef7fbd23b62b9bf85b722c015e90/testcf/08dc1940944b4952b23f0cbee51bcea8 727156659 bytes, 11 block(s): OK
0. blk_1832396139416350298_1296638 len=67108864 repl=3 [hadoop-145:50010, hadoop-143:50010, hadoop-144:50010]
1. blk_8910330590545256327_1296640 len=67108864 repl=3 [hadoop-143:50010, hadoop-157:50010, hadoop-159:50010]
2. blk_-3868612696419011016_1296640 len=67108864 repl=3 [hadoop-145:50010, hadoop-143:50010, hadoop-156:50010]
3. blk_-7551946394410945015_1296640 len=67108864 repl=3 [hadoop-145:50010, hadoop-143:50010, hadoop-157:50010]
4. blk_-1875839158119319613_1296640 len=67108864 repl=3 [hadoop-145:50010, hadoop-143:50010, hadoop-146:50010]
5. blk_-6953623390282045248_1296640 len=67108864 repl=3 [hadoop-143:50010, hadoop-157:50010, hadoop-144:50010]
6. blk_-3016727256928339770_1296640 len=67108864 repl=3 [hadoop-143:50010, hadoop-146:50010, hadoop-159:50010]
7. blk_3526351456802007773_1296640 len=67108864 repl=3 [hadoop-143:50010, hadoop-160:50010, hadoop-156:50010]
8. blk_5134681308608742320_1296640 len=67108864 repl=3 [hadoop-145:50010, hadoop-143:50010, hadoop-144:50010]
9. blk_-6875541109589395450_1296640 len=67108864 repl=3 [hadoop-145:50010, hadoop-143:50010, hadoop-156:50010]
10. blk_-553661064097182668_1296640 len=56068019 repl=3 [hadoop-143:50010, hadoop-146:50010, hadoop-159:50010]

Status: HEALTHY
 Total size: 727156659 B
 Total dirs: 0
 Total files: 1
 Total blocks (validated): 11 (avg. block size 66105150 B)
 Minimally replicated blocks: 11 (100.0 %)
 Over-replicated blocks: 0 (0.0 %)
 Under-replicated blocks: 0 (0.0 %)
 Mis-replicated blocks: 0 (0.0 %)
 Default replication factor: 3
 Average block replication: 3.0
 Corrupt blocks: 0
 Missing replicas: 0 (0.0 %)
 Number of data-nodes: 9
 Number of racks: 1
FSCK ended at Thu Jun 21 13:40:11 KST 2012 in 4 milliseconds

As you can see, the data blocks of the HFile are stored across two different datanodes (hadoop-145 and hadoop-143). Say a map task runs on hadoop-145 and needs to access block 7. That task then has to read block 7 remotely from hadoop-143. Almost half of the data blocks are stored, and would be accessed, remotely. Judging from this example, it is hard to say that data locality is really being achieved in HBase.

Ben

On Fri, Jun 15, 2012 at 5:21 PM, Lars George <[email protected]> wrote:

> Hi Ben,
>
> See inline...
>
> On Jun 15, 2012, at 6:56 AM, Ben Kim wrote:
>
> > Hi,
> >
> > I've been posting questions in the mailing list quite often lately, and
> > here goes another one about data locality.
> > I read the excellent blog post about data locality that Lars George
> > wrote at http://www.larsgeorge.com/2010/05/hbase-file-locality-in-hdfs.html
> >
> > I understand data locality in HBase as locating a region in a
> > region server where most of its data blocks reside.
>
> The opposite is happening, i.e. the region server process causes all the
> data it writes to be located on the same physical machine.
>
> > So that way fast data access is guaranteed when running a MR job,
> > because each map/reduce task is run for a region on the tasktracker
> > where that region is co-located.
>
> Correct.
>
> > But what if the data blocks of the region are evenly spread over
> > multiple region servers?
>
> This will not happen unless the original server fails. Then the region
> is moved to another server that now needs to do a lot of remote reads
> over the network.
> This is why there is work being done to allow for custom placement
> policies in HDFS. That way you could store the entire region and all its
> copies as complete units on three data nodes. In case of a failure you
> could then move the region to one of the two copies. This is not
> available yet, though, but it is being worked on (so I heard).
>
> > Does a MR task have to remotely access the data blocks from other
> > region servers?
>
> For the above failure case, it would be the region server accessing the
> remote data, yes.
>
> > How good is HBase at locating data blocks where a region resides?
>
> That is again the wrong way around. HBase has no clue as to where blocks
> reside, nor does it even know that the file system uses separate blocks.
> HBase stores files; HDFS does the block magic under the hood,
> transparently to HBase.
>
> > Also, is it correct to say that if I set a smaller data block size,
> > data locality gets worse, and if the data block size gets bigger, data
> > locality gets better?
>
> This is not applicable here; I am assuming it stems from the above
> confusion about which system is handling the blocks, HBase or HDFS. See
> above.
>
> HTH,
> Lars
>
> > Best regards,
> >
> > *Benjamin Kim*
> > *benkimkimben at gmail*

--
*Benjamin Kim*
*benkimkimben at gmail*
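P.S. In case it helps to quantify the locality question, here is a small Python sketch (the helper names are my own, not anything from Hadoop or HBase; the block list is copied verbatim from the fsck output above) that parses the replica locations and computes, for a given host, what fraction of the HFile's blocks have a local replica:

```python
import re

# Replica locations copied verbatim from the fsck output above.
FSCK_BLOCKS = """\
0. blk_1832396139416350298_1296638 len=67108864 repl=3 [hadoop-145:50010, hadoop-143:50010, hadoop-144:50010]
1. blk_8910330590545256327_1296640 len=67108864 repl=3 [hadoop-143:50010, hadoop-157:50010, hadoop-159:50010]
2. blk_-3868612696419011016_1296640 len=67108864 repl=3 [hadoop-145:50010, hadoop-143:50010, hadoop-156:50010]
3. blk_-7551946394410945015_1296640 len=67108864 repl=3 [hadoop-145:50010, hadoop-143:50010, hadoop-157:50010]
4. blk_-1875839158119319613_1296640 len=67108864 repl=3 [hadoop-145:50010, hadoop-143:50010, hadoop-146:50010]
5. blk_-6953623390282045248_1296640 len=67108864 repl=3 [hadoop-143:50010, hadoop-157:50010, hadoop-144:50010]
6. blk_-3016727256928339770_1296640 len=67108864 repl=3 [hadoop-143:50010, hadoop-146:50010, hadoop-159:50010]
7. blk_3526351456802007773_1296640 len=67108864 repl=3 [hadoop-143:50010, hadoop-160:50010, hadoop-156:50010]
8. blk_5134681308608742320_1296640 len=67108864 repl=3 [hadoop-145:50010, hadoop-143:50010, hadoop-144:50010]
9. blk_-6875541109589395450_1296640 len=67108864 repl=3 [hadoop-145:50010, hadoop-143:50010, hadoop-156:50010]
10. blk_-553661064097182668_1296640 len=56068019 repl=3 [hadoop-143:50010, hadoop-146:50010, hadoop-159:50010]
"""

def replica_hosts(fsck_text):
    """For each block line, return the set of hostnames holding a replica."""
    blocks = []
    for line in fsck_text.splitlines():
        m = re.search(r"\[([^\]]+)\]", line)
        if m:
            # "hadoop-145:50010" -> "hadoop-145"
            blocks.append({e.split(":")[0].strip() for e in m.group(1).split(",")})
    return blocks

def local_fraction(fsck_text, host):
    """Fraction of the file's blocks that have at least one replica on `host`."""
    blocks = replica_hosts(fsck_text)
    return sum(host in b for b in blocks) / len(blocks)
```

Running it over the listing above shows, for example, that every one of the 11 blocks has a replica on hadoop-143, while hadoop-145 holds only 6 of the 11, which is what the remote-read scenario in my mail is based on.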
