Take a look at this blog:
http://blog.sematext.com/2012/08/09/consider-using-fuzzyrowfilter-when-in-need-for-secondary-indexes-in-hbase/

From your earlier description, the components of your rowkey have a fixed length,
so you can consider using FuzzyRowFilter.
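
For the plain prefix case ("20140101_EN_*"), a start/stop row pair may be all you need: the start row is the prefix itself, and the stop row is the prefix with its last byte incremented. Below is a rough sketch of how those byte arrays, and a FuzzyRowFilter mask, could be built. RowKeyRange and its method names are made up for illustration and are not part of HBase. The mask convention (0 = byte must match, 1 = any byte) is the one described in the FuzzyRowFilter javadoc. It assumes the wildcard part has a fixed width.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not an HBase class: builds the byte arrays you would
// hand to Scan(byte[] startRow, byte[] stopRow) or to FuzzyRowFilter.
class RowKeyRange {

    // The start row of a prefix scan is simply the prefix bytes.
    static byte[] startRow(String prefix) {
        return prefix.getBytes(StandardCharsets.UTF_8);
    }

    // The stop row (exclusive) is the prefix with its last non-0xFF byte
    // incremented and any trailing 0xFF bytes dropped. An empty array is
    // returned when every byte is 0xFF, i.e. no upper bound.
    static byte[] stopRow(byte[] prefix) {
        for (int i = prefix.length - 1; i >= 0; i--) {
            if (prefix[i] != (byte) 0xFF) {
                byte[] stop = Arrays.copyOf(prefix, i + 1);
                stop[i]++;
                return stop;
            }
        }
        return new byte[0];
    }

    // Fuzzy mask: 0 for each fixed position, 1 for each "don't care" position.
    // Assumes the wildcard part has a fixed width (wildcardLen bytes).
    static byte[] fuzzyMask(int fixedLen, int wildcardLen) {
        byte[] mask = new byte[fixedLen + wildcardLen];
        Arrays.fill(mask, fixedLen, mask.length, (byte) 1);
        return mask;
    }

    public static void main(String[] args) {
        byte[] start = startRow("20140101_EN_");
        byte[] stop = stopRow(start);
        System.out.println(new String(start, StandardCharsets.UTF_8));
        System.out.println(new String(stop, StandardCharsets.UTF_8));
        System.out.println(Arrays.toString(fuzzyMask(12, 1)));
    }
}
```

The start/stop arrays would go into Scan(byte[] startRow, byte[] stopRow). The key "20140101_EN_?" together with the mask would go into a FuzzyRowFilter set on the Scan. As far as I understand it, the fuzzy filter skips ahead with seek hints rather than doing a direct lookup, so it is more useful when the fixed part is in the middle of the key (e.g. "????????_EN_?") than for a pure prefix.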

Cheers

On Jan 14, 2014, at 11:08 PM, Ramon Wang <[email protected]> wrote:

> Hi Ted
> 
> Thanks for the quick reply.
> 
> With this FuzzyRowFilter, do I still need to pass in startRow and stopRow
> like below when constructing a Scan object?
> 
>> Scan(byte [] startRow, byte [] stopRow)
> 
> 
> Will the FuzzyRowFilter give us performance like a direct Get by row
> when we pass something like "20140101_EN_?"?
> 
> Cheers
> Ramon
> 
> 
> On Wed, Jan 15, 2014 at 2:22 PM, Ted Yu <[email protected]> wrote:
> 
>> Please take a look at
>> http://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/filter/FuzzyRowFilter.html
>> 
>> Cheers
>> 
>> On Jan 14, 2014, at 10:16 PM, Ramon Wang <[email protected]> wrote:
>> 
>>> Hi Folks
>>> 
>>> We have a table with a fixed-pattern row key design. The format for the row
>>> key is YEAR_COUNTRY_randomNumber, for example:
>>> 
>>> 20140101_EN_1
>>> 20140101_EN_2
>>> 20140101_EN_3
>>> 20140101_US_1
>>> 20140101_US_2
>>> 20140101_US_3
>>> ...
>>> 
>>> Is there a way I can quickly get the data for "20140101_EN_*" by using Scan
>>> without scanning the full table? I think we are probably going to use
>>> the PrefixFilter with the Scan object, but the problem is that we
>>> don't know the "startRow" for each scan. Any ideas?
>>> 
>>> Thanks
>>> Ramon
>> 
