While data locality is nice, you may see it becoming less of a bonus or an issue.

With coprocessors available, indexing becomes viable. So you may see jobs where, within the M/R, you process a row from table A, maybe hit an index to find a value in table B, and then do some processing. There's more to it, like using the intersection of indexes across a common key to join data, which blows the whole issue of data locality out of the water. But that's a longer discussion requiring a whiteboard, and some alcohol or coffee... ;-)
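To make that pattern concrete, here is a minimal sketch of a mapper that scans table A and, for each row, does a point lookup against a secondary-index table. All names here ("indexTable", the "cf:fk" column) are hypothetical, not from this thread, and the index itself would be maintained elsewhere, e.g. by a coprocessor. Note the lookup is a plain client Get that goes to whichever region server holds the index row, so locality does not apply to it.

import java.io.IOException;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.Text;

public class IndexLookupMapper extends TableMapper<ImmutableBytesWritable, Text> {
  private HTable indexTable; // remote point reads; data locality does not apply here

  @Override
  protected void setup(Context context) throws IOException {
    // "indexTable" is a hypothetical secondary-index table, maintained
    // for example by a coprocessor on table B.
    indexTable = new HTable(context.getConfiguration(), "indexTable");
  }

  @Override
  protected void map(ImmutableBytesWritable row, Result value, Context context)
      throws IOException, InterruptedException {
    // Use a value from table A's row as the key into the index.
    byte[] joinKey = value.getValue(Bytes.toBytes("cf"), Bytes.toBytes("fk"));
    if (joinKey == null) return;
    Result indexHit = indexTable.get(new Get(joinKey)); // network round trip per row
    if (!indexHit.isEmpty()) {
      context.write(row, new Text(Bytes.toString(indexHit.getRow())));
    }
  }

  @Override
  protected void cleanup(Context context) throws IOException {
    indexTable.close();
  }
}

Since the Get is a round trip per row, at scale you would batch the lookups or cache hot index rows; the sketch keeps it simple.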
On Jun 21, 2012, at 5:07 AM, Lars George wrote:

> Hi Ben,
>
> According to your fsck dump, the first copy is located on hadoop-143, which
> has all the blocks for the region. So if you check, I would assume that the
> region is currently open and served by hadoop-143, right?
>
> The TableInputFormat getSplits() will report that server to the MapReduce
> framework, so the task will run on that node and access the data locally.
> *Only* if you have speculative execution turned on will you have a task
> run on another, random node in parallel, which would need to do heaps of
> remote reads. That is why it is recommended to turn it off when combining
> HBase and MapReduce.
>
> Lars
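The speculative-execution setting Lars refers to is plain job configuration. Below is a minimal sketch of a job driver that turns it off before scanning the table. Assumptions: the Hadoop 1.x property names ("mapred.*") of that era, "testtable" taken from Ben's fsck path below, and the hypothetical IndexLookupMapper sketched above.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class LocalScanJob {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Turn speculative execution off so no duplicate attempt runs on a
    // random node and does heaps of remote reads (Hadoop 1.x property names).
    conf.setBoolean("mapred.map.tasks.speculative.execution", false);
    conf.setBoolean("mapred.reduce.tasks.speculative.execution", false);

    Job job = new Job(conf, "scan-testtable");
    job.setJarByClass(LocalScanJob.class);

    Scan scan = new Scan();
    scan.setCaching(500);        // fewer RPCs per mapper
    scan.setCacheBlocks(false);  // don't churn the block cache from an MR scan

    // This helper sets TableInputFormat, whose getSplits() reports the
    // region's server so each map task is scheduled on that node.
    TableMapReduceUtil.initTableMapperJob("testtable", scan,
        IndexLookupMapper.class, ImmutableBytesWritable.class, Text.class, job);
    job.setNumReduceTasks(0);
    job.setOutputFormatClass(NullOutputFormat.class); // map-only, no output files
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}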
> On Jun 21, 2012, at 6:57 AM, Ben Kim wrote:
>
>> Hi Lars,
>> Thanks a lot for your reply.
>>
>> As you said, a region server writes its HFiles so that all data blocks end
>> up on the same physical machine, unless the region server fails.
>> I ran the following hadoop command to see the locations of an HFile:
>>
>> *hadoop fsck
>> /hbase/testtable/9488ef7fbd23b62b9bf85b722c015e90/testcf/08dc1940944b4952b23f0cbee51bcea8
>> -files -locations -blocks*
>>
>> Here is the output...
>>
>>> FSCK started by hadoop from /203.235.211.142 for path
>>> /hbase/testtable/9488ef7fbd23b62b9bf85b722c015e90/testcf/08dc1940944b4952b23f0cbee51bcea8
>>> at Thu Jun 21 13:40:11 KST 2012
>>> /hbase/testtable/9488ef7fbd23b62b9bf85b722c015e90/testcf/08dc1940944b4952b23f0cbee51bcea8
>>> 727156659 bytes, 11 block(s): OK
>>> 0. blk_1832396139416350298_1296638 len=67108864 repl=3 [hadoop-145:50010, hadoop-143:50010, hadoop-144:50010]
>>> 1. blk_8910330590545256327_1296640 len=67108864 repl=3 [hadoop-143:50010, hadoop-157:50010, hadoop-159:50010]
>>> 2. blk_-3868612696419011016_1296640 len=67108864 repl=3 [hadoop-145:50010, hadoop-143:50010, hadoop-156:50010]
>>> 3. blk_-7551946394410945015_1296640 len=67108864 repl=3 [hadoop-145:50010, hadoop-143:50010, hadoop-157:50010]
>>> 4. blk_-1875839158119319613_1296640 len=67108864 repl=3 [hadoop-145:50010, hadoop-143:50010, hadoop-146:50010]
>>> 5. blk_-6953623390282045248_1296640 len=67108864 repl=3 [hadoop-143:50010, hadoop-157:50010, hadoop-144:50010]
>>> 6. blk_-3016727256928339770_1296640 len=67108864 repl=3 [hadoop-143:50010, hadoop-146:50010, hadoop-159:50010]
>>> 7. blk_3526351456802007773_1296640 len=67108864 repl=3 [hadoop-143:50010, hadoop-160:50010, hadoop-156:50010]
>>> 8. blk_5134681308608742320_1296640 len=67108864 repl=3 [hadoop-145:50010, hadoop-143:50010, hadoop-144:50010]
>>> 9. blk_-6875541109589395450_1296640 len=67108864 repl=3 [hadoop-145:50010, hadoop-143:50010, hadoop-156:50010]
>>> 10. blk_-553661064097182668_1296640 len=56068019 repl=3 [hadoop-143:50010, hadoop-146:50010, hadoop-159:50010]
>>>
>>> Status: HEALTHY
>>> Total size: 727156659 B
>>> Total dirs: 0
>>> Total files: 1
>>> Total blocks (validated): 11 (avg. block size 66105150 B)
>>> Minimally replicated blocks: 11 (100.0 %)
>>> Over-replicated blocks: 0 (0.0 %)
>>> Under-replicated blocks: 0 (0.0 %)
>>> Mis-replicated blocks: 0 (0.0 %)
>>> Default replication factor: 3
>>> Average block replication: 3.0
>>> Corrupt blocks: 0
>>> Missing replicas: 0 (0.0 %)
>>> Number of data-nodes: 9
>>> Number of racks: 1
>>> FSCK ended at Thu Jun 21 13:40:11 KST 2012 in 4 milliseconds
>>
>> As you see, the data blocks of the HFile are stored across two different
>> datanodes (hadoop-145 and hadoop-143).
>>
>> Let's say a map task runs on hadoop-145 and needs to access block 7. Then
>> the map task has to remotely read block 7 from hadoop-143. Almost half of
>> the data blocks would be stored and accessed remotely. Going by the above
>> example, it's hard to say that data locality applies to HBase.
>>
>> Ben
>>
>> On Fri, Jun 15, 2012 at 5:21 PM, Lars George <lars.geo...@gmail.com> wrote:
>>
>>> Hi Ben,
>>>
>>> See inline...
>>>
>>> On Jun 15, 2012, at 6:56 AM, Ben Kim wrote:
>>>
>>>> Hi,
>>>>
>>>> I've been posting questions to the mailing list quite often lately, and
>>>> here goes another one, about data locality.
>>>> I read the excellent blog post about data locality that Lars George wrote
>>>> at http://www.larsgeorge.com/2010/05/hbase-file-locality-in-hdfs.html
>>>>
>>>> I understand data locality in HBase as locating a region in a region
>>>> server where most of its data blocks reside.
>>>
>>> The opposite is happening, i.e. the region server process causes all the
>>> data it writes to be located on the same physical machine.
>>>
>>>> So that way fast data access is guaranteed when running an MR job,
>>>> because each map/reduce task runs for a region on the tasktracker where
>>>> that region's data is co-located.
>>>
>>> Correct.
>>>
>>>> But what if the data blocks of the region are evenly spread over
>>>> multiple region servers?
>>>
>>> This will not happen unless the original server fails. Then the region is
>>> moved to another server, which now needs to do a lot of remote reads over
>>> the network. This is why there is work being done to allow for custom
>>> placement policies in HDFS. That way you can store the entire region and
>>> all copies as complete units on three data nodes. In case of a failure
>>> you can then move the region to one of the two copies. This is not
>>> available yet, but it is being worked on (so I heard).
>>>
>>>> Does an MR task have to remotely access the data blocks from other
>>>> region servers?
>>>
>>> For the above failure case, it would be the region server accessing the
>>> remote data, yes.
>>>
>>>> How good is HBase at locating data blocks where a region resides?
>>>
>>> That is again the wrong way around. HBase has no clue as to where blocks
>>> reside, nor does it know that the file system in fact uses separate
>>> blocks. HBase stores files; HDFS does the block magic under the hood,
>>> transparently to HBase.
>>>
>>>> Also, is it correct to say that with a smaller data block size data
>>>> locality gets worse, and with a bigger data block size it gets better?
>>>
>>> This is not applicable here; I am assuming it stems from the above
>>> confusion about which system is handling the blocks, HBase or HDFS. See
>>> above.
>>>
>>> HTH,
>>> Lars
>>>
>>>>
>>>> Best regards,
>>>> --
>>>>
>>>> *Benjamin Kim*
>>>> *benkimkimben at gmail*
>>
>>
>> --
>>
>> *Benjamin Kim*
>> *benkimkimben at gmail*
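One way to verify Lars's point at the top of the thread (hadoop-143 holds a replica of every block, so fsck does not contradict locality) is to tally replica hosts per block line. A minimal sketch that reads the fsck output from stdin; the 50010 port in the pattern is the standard datanode port seen in the dump above:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FsckLocalityTally {
  public static void main(String[] args) throws Exception {
    // Matches replica locations like "hadoop-143:50010" in fsck -locations output.
    Pattern host = Pattern.compile("([\\w.-]+):50010");
    Map<String, Integer> counts = new HashMap<String, Integer>();
    int blocks = 0;
    BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
    String line;
    while ((line = in.readLine()) != null) {
      if (!line.contains("blk_")) continue; // only per-block lines
      blocks++;
      Matcher m = host.matcher(line);
      while (m.find()) {
        Integer c = counts.get(m.group(1));
        counts.put(m.group(1), c == null ? 1 : c + 1);
      }
    }
    // A node hosting a replica of every block can serve the whole HFile locally.
    for (Map.Entry<String, Integer> e : counts.entrySet()) {
      System.out.printf("%s: %d/%d blocks local%n", e.getKey(), e.getValue(), blocks);
    }
  }
}

Fed Ben's dump, this reports hadoop-143 with 11/11 blocks and hadoop-145 with 6/11: a map task scheduled on hadoop-143, where the region is served, does no remote reads, while Ben's hypothetical task on hadoop-145 would.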