Re: scanner on a given column: whole table scan or just the rows that have values

Ric Wang Tue, 09 Jun 2009 23:22:52 -0700

Billy,

Thank you, it's clearer to me now. But WITHIN the one family that holds the
column I need to scan (I only have one family for the entire table), will it
still have to scan EVERY row in that family, whether or not each row has a
value in that column?


-Ric


On Wed, Jun 10, 2009 at 1:03 AM, Billy Pearson
<[email protected]> wrote:

> It will not scan every row if there is more than one column family, only the
> rows that have data for that column.
>
> You do have parallelism when scanning large tables: the MR job should
> split the work into one mapper per region if it is set up correctly. New
> patches in development, slated for 0.20, will allow more mappers per
> region, which speeds this up in some cases.
>
> Row-based databases can have indexes, but they do not scale well, since
> indexes require more memory.
> HBase is designed to be distributed, parallel, and fault tolerant, and it
> scales easily from one server to hundreds or thousands.
>
> Billy
>
>
>
> "Ric Wang" <[email protected]> wrote in message
> news:[email protected]...
>
>  Hi,
>>
>> Thanks. But if it is still scanning EVERY row in the entire table, how does
>> HBase achieve better scan performance, compared to a row-based database?
>>
>> Thanks,
>> Ric
>>
>>
>>
>> On Tue, Jun 9, 2009 at 9:35 PM, Ryan Rawson <[email protected]> wrote:
>>
>>  Without the use of indexes, there is no easy way to get the info without
>>> touching every row.
>>>
>>> So yes, you'll be scanning every row. But HBase has good bulk scan
>>> performance.
>>>
>>> On Jun 9, 2009 7:24 PM, "Ric Wang" <[email protected]> wrote:
>>>
>>> How does the scanner know how to get ONLY the "relevant" rows, without a
>>> whole table scan?
>>>
>>> Thanks!
>>> Ric
>>>
>>> On Tue, Jun 9, 2009 at 4:31 PM, Naveen Koorakula <[email protected]>
>>> wrote:
>>> > The scanner only s...
>>> --
>>>
>>> Ric Wang [email protected]
>>>
>>>
>>
>>
>> --
>> Ric Wang
>> [email protected]
>>
>>
>
>


-- 
Ric Wang
[email protected]
