[
https://issues.apache.org/jira/browse/HBASE-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13965203#comment-13965203
]
Igor Kuzmitshov commented on HBASE-6618:
----------------------------------------
[~alexb], you are right about keeping the mask separate. Somehow I forgot that
? can be a “normal byte”, sorry.
I have just checked the other Filters. They all seem to be quite low-level and
take byte arrays as constructor parameters. Using byte arrays as parameters
makes sense for consistency, but adding a builder could be nice as well.
For me, the biggest “inconvenience” (especially when using the HBase shell) of
constructing a FuzzyRowFilter is not the byte arrays themselves, but the Lists of
Pairs (or Triples) of byte arrays. I would add a simpler constructor for one
rule (I guess one rule would be enough quite often) and a separate method to
add rules:
{code}
FuzzyRowFilter(byte[] fuzzyInfo, byte[] lowerBytes, byte[] upperBytes)
void addRule(byte[] fuzzyInfo, byte[] lowerBytes, byte[] upperBytes)
{code}
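To make the proposal a bit more concrete, here is a minimal sketch of how a single-rule constructor plus addRule could hold the rules internally. Everything here is an assumption for illustration: the class name FuzzyRangeRulesSketch, the nested Rule type and ruleCount() are hypothetical, and this is not the actual FuzzyRowFilter or the code in the attached patches. It only shows that callers would no longer need to build Lists of Pairs or Triples themselves.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch only; NOT the real HBase FuzzyRowFilter.
 * It illustrates the proposed one-rule constructor and addRule method.
 */
final class FuzzyRangeRulesSketch {

    /** One rule: a mask plus lower and upper bound bytes (parameter names from the proposal). */
    private static final class Rule {
        final byte[] fuzzyInfo;
        final byte[] lowerBytes;
        final byte[] upperBytes;

        Rule(byte[] fuzzyInfo, byte[] lowerBytes, byte[] upperBytes) {
            // Assumption: the mask and both bounds describe the same key positions,
            // so they must have equal length.
            if (fuzzyInfo.length != lowerBytes.length
                    || fuzzyInfo.length != upperBytes.length) {
                throw new IllegalArgumentException(
                        "fuzzyInfo, lowerBytes and upperBytes must have equal length");
            }
            // Defensive copies so later changes to the caller's arrays do not alter the rule.
            this.fuzzyInfo = fuzzyInfo.clone();
            this.lowerBytes = lowerBytes.clone();
            this.upperBytes = upperBytes.clone();
        }
    }

    // The list of rules is built internally, one addRule call at a time.
    private final List<Rule> rules = new ArrayList<>();

    /** Simple constructor for the common single-rule case. */
    FuzzyRangeRulesSketch(byte[] fuzzyInfo, byte[] lowerBytes, byte[] upperBytes) {
        addRule(fuzzyInfo, lowerBytes, upperBytes);
    }

    /** Adds one more rule to the filter. */
    void addRule(byte[] fuzzyInfo, byte[] lowerBytes, byte[] upperBytes) {
        rules.add(new Rule(fuzzyInfo, lowerBytes, upperBytes));
    }

    /** Hypothetical helper, present only so the sketch can be inspected. */
    int ruleCount() {
        return rules.size();
    }
}
```

With something like this, the one-rule case becomes a single constructor call with three byte arrays, and further rules are added with addRule instead of being collected into a List up front.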
> Implement FuzzyRowFilter with ranges support
> --------------------------------------------
>
> Key: HBASE-6618
> URL: https://issues.apache.org/jira/browse/HBASE-6618
> Project: HBase
> Issue Type: New Feature
> Components: Filters
> Reporter: Alex Baranau
> Assignee: Alex Baranau
> Priority: Minor
> Fix For: 0.99.0
>
> Attachments: HBASE-6618-algo-desc-bits.png, HBASE-6618-algo.patch,
> HBASE-6618.patch, HBASE-6618_2.path, HBASE-6618_3.path, HBASE-6618_4.patch,
> HBASE-6618_5.patch
>
>
> Apart from the current ability to specify a fuzzy row filter, e.g. for the
> <userId_actionId> format as ????_0004 (where 0004 is the actionId), it would be
> great to also be able to specify a "fuzzy range", e.g. ????_0004,
> ..., ????_0099.
> See initial discussion here: http://search-hadoop.com/m/WVLJdX0Z65
> Note: currently it is possible to provide multiple fuzzy row rules to the
> existing FuzzyRowFilter, but when the range is big (contains
> thousands of values) this is not efficient.
> The filter should perform efficient fast-forwarding during the scan (this is what
> distinguishes it from a regex row filter).
> While such functionality may seem like a proper fit for a custom filter (i.e.
> one not included in the standard filter set), it looks like the filter may be very
> reusable. We may judge based on the implementation that will hopefully be
> added.
--
This message was sent by Atlassian JIRA
(v6.2#6252)