Thank you all for your valuable responses.

Lars, you are right. The root of the problem was my misunderstanding of
which system handles the blocks. I was thinking about data locality at
the HDFS level, which is handled transparently underneath HBase. Things
make a lot more sense now :)

Ben


On Thu, Jun 21, 2012 at 9:45 PM, Michael Segel <[email protected]> wrote:

> While data locality is nice, you may see it becoming less of a bonus or
> an issue.
>
> With coprocessors available, indexing becomes viable. So you may see
> cases where, within the M/R job, you process a row from table A, hit an
> index to find a value in table B, and then do some processing.
>
> There's more to it, like using the intersection of indexes across a
> common key to join data, which blows the whole issue of data locality
> out of the water. But that's a longer discussion requiring a whiteboard
> and some alcohol or coffee.... ;-)
>
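> A minimal sketch of that idea (all table, family, and qualifier names
> below are made up for illustration): a mapper over table A that consults
> an index table to find the matching row in table B. Note that both gets
> go over the network, which is exactly the point about locality.
>
> import java.io.IOException;
> import org.apache.hadoop.hbase.client.Get;
> import org.apache.hadoop.hbase.client.HTable;
> import org.apache.hadoop.hbase.client.Result;
> import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
> import org.apache.hadoop.hbase.mapreduce.TableMapper;
> import org.apache.hadoop.hbase.util.Bytes;
> import org.apache.hadoop.io.Text;
>
> public class IndexJoinMapper extends TableMapper<Text, Text> {
>   private HTable index;   // hypothetical index table for table B
>   private HTable tableB;
>
>   @Override
>   protected void setup(Context ctx) throws IOException {
>     index = new HTable(ctx.getConfiguration(), "B_idx");
>     tableB = new HTable(ctx.getConfiguration(), "B");
>   }
>
>   @Override
>   public void map(ImmutableBytesWritable row, Result aRow, Context ctx)
>       throws IOException, InterruptedException {
>     // Use a value from the table A row to look up table B's row key.
>     byte[] joinVal = aRow.getValue(Bytes.toBytes("cf"), Bytes.toBytes("join"));
>     if (joinVal == null) return;
>     Result hit = index.get(new Get(joinVal));        // index lookup (remote)
>     if (hit.isEmpty()) return;                       // no matching B row
>     Result bRow = tableB.get(new Get(hit.value()));  // fetch from B (remote)
>     ctx.write(new Text(Bytes.toString(row.get())),
>               new Text(Bytes.toString(bRow.value())));
>   }
>
>   @Override
>   protected void cleanup(Context ctx) throws IOException {
>     index.close();
>     tableB.close();
>   }
> }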
>
> On Jun 21, 2012, at 5:07 AM, Lars George wrote:
>
> > Hi Ben,
> >
> > According to your fsck dump, the first copy is located on hadoop-143,
> > which has all the blocks for the region. So if you check, I would
> > assume that the region is currently open and served by hadoop-143,
> > right?
> >
> > The TableInputFormat getSplits() will report that server to the
> > MapReduce framework, so the task would run on that node to access the
> > data locally. *Only* if you have speculative execution turned on will
> > you have a task run in parallel on another, random node, which would
> > need to do heaps of remote reads. That is why it is recommended to
> > turn that off when combining HBase and MapReduce.
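> >
> > A minimal sketch of such a job setup with speculative execution
> > disabled (property names are the classic MapReduce ones of that era;
> > the table name and identity mapper are just for illustration):
> >
> > import java.io.IOException;
> > import org.apache.hadoop.conf.Configuration;
> > import org.apache.hadoop.hbase.HBaseConfiguration;
> > import org.apache.hadoop.hbase.client.Result;
> > import org.apache.hadoop.hbase.client.Scan;
> > import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
> > import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
> > import org.apache.hadoop.hbase.mapreduce.TableMapper;
> > import org.apache.hadoop.mapreduce.Job;
> > import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;
> >
> > public class LocalScan {
> >   // Trivial identity mapper, just to keep the sketch self-contained.
> >   static class MyMapper extends TableMapper<ImmutableBytesWritable, Result> {
> >     @Override
> >     public void map(ImmutableBytesWritable row, Result value, Context ctx)
> >         throws IOException, InterruptedException {
> >       ctx.write(row, value);
> >     }
> >   }
> >
> >   public static void main(String[] args) throws Exception {
> >     Configuration conf = HBaseConfiguration.create();
> >     // No duplicate speculative task on a random, non-local node.
> >     conf.setBoolean("mapred.map.tasks.speculative.execution", false);
> >     conf.setBoolean("mapred.reduce.tasks.speculative.execution", false);
> >     Job job = new Job(conf, "scan-testtable");
> >     job.setJarByClass(LocalScan.class);
> >     TableMapReduceUtil.initTableMapperJob("testtable", new Scan(),
> >         MyMapper.class, ImmutableBytesWritable.class, Result.class, job);
> >     job.setNumReduceTasks(0);
> >     job.setOutputFormatClass(NullOutputFormat.class);
> >     System.exit(job.waitForCompletion(true) ? 0 : 1);
> >   }
> > }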
> >
> > Lars
> >
> > On Jun 21, 2012, at 6:57 AM, Ben Kim wrote:
> >
> >> Hi Lars,
> >> Thank you very much for your reply.
> >>
> >> As you explained, a region server writes HFiles so that all of their
> >> data blocks are located on the same physical machine, unless the
> >> region server fails. I ran the following hadoop command to see the
> >> block locations of an HFile:
> >>
> >> *hadoop fsck /hbase/testtable/9488ef7fbd23b62b9bf85b722c015e90/testcf/08dc1940944b4952b23f0cbee51bcea8 -files -locations -blocks*
> >>
> >> here is the output...
> >>
> >>> FSCK started by hadoop from /203.235.211.142 for path /hbase/testtable/9488ef7fbd23b62b9bf85b722c015e90/testcf/08dc1940944b4952b23f0cbee51bcea8 at Thu Jun 21 13:40:11 KST 2012
> >>> /hbase/testtable/9488ef7fbd23b62b9bf85b722c015e90/testcf/08dc1940944b4952b23f0cbee51bcea8 727156659 bytes, 11 block(s):  OK
> >>> 0. blk_1832396139416350298_1296638 len=67108864 repl=3 [hadoop-145:50010, hadoop-143:50010, hadoop-144:50010]
> >>> 1. blk_8910330590545256327_1296640 len=67108864 repl=3 [hadoop-143:50010, hadoop-157:50010, hadoop-159:50010]
> >>> 2. blk_-3868612696419011016_1296640 len=67108864 repl=3 [hadoop-145:50010, hadoop-143:50010, hadoop-156:50010]
> >>> 3. blk_-7551946394410945015_1296640 len=67108864 repl=3 [hadoop-145:50010, hadoop-143:50010, hadoop-157:50010]
> >>> 4. blk_-1875839158119319613_1296640 len=67108864 repl=3 [hadoop-145:50010, hadoop-143:50010, hadoop-146:50010]
> >>> 5. blk_-6953623390282045248_1296640 len=67108864 repl=3 [hadoop-143:50010, hadoop-157:50010, hadoop-144:50010]
> >>> 6. blk_-3016727256928339770_1296640 len=67108864 repl=3 [hadoop-143:50010, hadoop-146:50010, hadoop-159:50010]
> >>> 7. blk_3526351456802007773_1296640 len=67108864 repl=3 [hadoop-143:50010, hadoop-160:50010, hadoop-156:50010]
> >>> 8. blk_5134681308608742320_1296640 len=67108864 repl=3 [hadoop-145:50010, hadoop-143:50010, hadoop-144:50010]
> >>> 9. blk_-6875541109589395450_1296640 len=67108864 repl=3 [hadoop-145:50010, hadoop-143:50010, hadoop-156:50010]
> >>> 10. blk_-553661064097182668_1296640 len=56068019 repl=3 [hadoop-143:50010, hadoop-146:50010, hadoop-159:50010]
> >>>
> >>> Status: HEALTHY
> >>> Total size:    727156659 B
> >>> Total dirs:    0
> >>> Total files:   1
> >>> Total blocks (validated):      11 (avg. block size 66105150 B)
> >>> Minimally replicated blocks:   11 (100.0 %)
> >>> Over-replicated blocks:        0 (0.0 %)
> >>> Under-replicated blocks:       0 (0.0 %)
> >>> Mis-replicated blocks:         0 (0.0 %)
> >>> Default replication factor:    3
> >>> Average block replication:     3.0
> >>> Corrupt blocks:                0
> >>> Missing replicas:              0 (0.0 %)
> >>> Number of data-nodes:          9
> >>> Number of racks:               1
> >>> FSCK ended at Thu Jun 21 13:40:11 KST 2012 in 4 milliseconds
> >>
> >> As you can see, the data blocks of the HFile are spread across several
> >> datanodes, with the first listed replica on either hadoop-145 or
> >> hadoop-143.
> >>
> >> Let's say a map task runs on hadoop-145 and needs to access block 7.
> >> Then the map task needs to remotely read block 7 from hadoop-143.
> >> Almost half of the data blocks would be stored on, and accessed from,
> >> a remote node. Judging from the above example, it's hard to say that
> >> data locality is being applied to HBase.
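> >>
> >> (As a side note, the same per-block host lists can be pulled
> >> programmatically with the plain HDFS client API; a minimal sketch,
> >> with the "BlockHosts" class name made up here:)
> >>
> >> import java.util.Arrays;
> >> import org.apache.hadoop.conf.Configuration;
> >> import org.apache.hadoop.fs.BlockLocation;
> >> import org.apache.hadoop.fs.FileStatus;
> >> import org.apache.hadoop.fs.FileSystem;
> >> import org.apache.hadoop.fs.Path;
> >>
> >> public class BlockHosts {
> >>   public static void main(String[] args) throws Exception {
> >>     // args[0]: an HFile path, e.g. the one passed to fsck above.
> >>     FileSystem fs = FileSystem.get(new Configuration());
> >>     FileStatus st = fs.getFileStatus(new Path(args[0]));
> >>     // One BlockLocation per HDFS block, like fsck -locations -blocks.
> >>     BlockLocation[] blocks = fs.getFileBlockLocations(st, 0, st.getLen());
> >>     for (int i = 0; i < blocks.length; i++) {
> >>       System.out.println(i + ". offset=" + blocks[i].getOffset()
> >>           + " hosts=" + Arrays.toString(blocks[i].getHosts()));
> >>     }
> >>   }
> >> }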
> >>
> >> Ben
> >>
> >>
> >> On Fri, Jun 15, 2012 at 5:21 PM, Lars George <[email protected]> wrote:
> >>
> >>> Hi Ben,
> >>>
> >>> See inline...
> >>>
> >>> On Jun 15, 2012, at 6:56 AM, Ben Kim wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> I've been posting questions to the mailing list quite often lately,
> >>>> and here goes another one, about data locality.
> >>>> I read the excellent blog post about data locality that Lars George
> >>>> wrote at http://www.larsgeorge.com/2010/05/hbase-file-locality-in-hdfs.html
> >>>>
> >>>> I understand data locality in HBase as placing a region on the
> >>>> region server where most of its data blocks reside.
> >>>
> >>> The opposite is happening, i.e. the region server process causes all
> >>> data it writes to be located on the same physical machine: HDFS places
> >>> the first replica of every block on the datanode local to the writer.
> >>>
> >>>> So that way fast data access is guaranteed when running a MR job,
> >>>> because each map task is run for one region, on the tasktracker
> >>>> where that region's data is co-located.
> >>>
> >>> Correct.
> >>>
> >>>> But what if the data blocks of the region are evenly spread over
> >>>> multiple region servers?
> >>>
> >>> This will not happen, unless the original server fails. Then the
> >>> region is moved to another server, which now needs to do a lot of
> >>> remote reads over the network. This is why there is work being done
> >>> to allow for custom placement policies in HDFS. That way you can
> >>> store the entire region and all copies as complete units on three
> >>> data nodes. In case of a failure you can then move the region to one
> >>> of the two nodes holding copies. This is not available yet though,
> >>> but it is being worked on (so I heard).
> >>>
> >>>> Does a MR task have to remotely access the data blocks from other
> >>>> region servers?
> >>>
> >>> For the above failure case, it would be the region server accessing the
> >>> remote data, yes.
> >>>
> >>>> How good is HBase at locating data blocks where a region resides?
> >>>
> >>> That is again the wrong way around. HBase has no clue as to where
> >>> blocks reside, nor does it know that the file system in fact uses
> >>> separate blocks. HBase stores files; HDFS does the block magic under
> >>> the hood, transparently to HBase.
> >>>
> >>>> Also, is it correct to say that if I set a smaller data block size,
> >>>> data locality gets worse, and if the data block size gets bigger,
> >>>> data locality gets better?
> >>>
> >>> This is not applicable here; I am assuming this stems from the above
> >>> confusion about which system is handling the blocks, HBase or HDFS.
> >>> See above.
> >>>
> >>> HTH,
> >>> Lars
> >>>
> >>>>
> >>>> Best regards,
> >>>> --
> >>>>
> >>>> *Benjamin Kim*
> >>>> *benkimkimben at gmail*
> >>>
> >>>
> >>
> >>
> >> --
> >>
> >> *Benjamin Kim*
> >> *benkimkimben at gmail*
> >
> >
>
>


-- 

*Benjamin Kim*
*benkimkimben at gmail*
