The scanner only scans the relevant rows.
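
To illustrate why, here is a toy model in Python. It is not the HBase client API: ToyTable, put, scan_column and cells_visited are names invented for this sketch. It assumes each column family stores only the cells that actually exist, and that familyA holds little besides labelB. Cells of other columns in the same family would still be walked.

```python
from collections import defaultdict


class ToyTable:
    """Toy column-family store: each family keeps only the cells that exist."""

    def __init__(self):
        # family -> {(row_key, qualifier): value}; absent cells take no space
        self.families = defaultdict(dict)
        self.cells_visited = 0

    def put(self, row, family, qualifier, value):
        self.families[family][(row, qualifier)] = value

    def scan_column(self, family, qualifier):
        """Yield (row, value) for one column, counting every cell touched."""
        self.cells_visited = 0
        cells = self.families.get(family, {})
        for (row, qual), value in sorted(cells.items()):
            self.cells_visited += 1
            if qual == qualifier:
                yield row, value


table = ToyTable()

# Every row has a cell in familyB (hypothetical dense column).
for i in range(100_000):
    table.put(f"row{i:06d}", "familyB", "other", 0)

# Only every 1000th row has a value in familyA:labelB (sparse column).
for i in range(0, 100_000, 1000):
    table.put(f"row{i:06d}", "familyA", "labelB", i)

# Collect row keys whose familyA:labelB value meets a condition.
matches = [row for row, value in table.scan_column("familyA", "labelB")
           if value >= 50_000]

print(len(matches), table.cells_visited)
```

In this model the scan touches only the cells stored under familyA, not the 100,000 rows that exist in familyB.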

On Tue, Jun 9, 2009 at 2:10 PM, Ric Wang <[email protected]> wrote:

> Hi,
>
> My HBase table has millions of rows, and for a given column (e.g.
> familyA:labelB) only a couple of thousand rows actually have values
> (sparse).
> Now my task is to find the set of row keys whose value in column
> "familyA:labelB" satisfies some condition.
>
> For that task, I am getting a scanner on the column "familyA:labelB" and
> looping over the values of that column (I guess I'd be better off using
> some kind of filter instead, but regardless...). If a value matches my
> condition, I take the corresponding row key and add it to the result set.
>
> My questions are:
>
> 1. When the scanner loops over the column, is it scanning the whole table
> of millions of rows, or mostly just the rows that actually have values for
> that particular column? My guess is that it's NOT scanning the whole table,
> per my very limited understanding of how column-based databases work; that
> would seem awfully inefficient. Can someone please let me know?
>
> 2. If, unfortunately, a whole-table scan does have to happen, any
> suggestions on how I could change my table design (adding an index..?) to
> avoid the performance hit?
>
> Thanks very much for your help!
> Ric
>
