[ https://issues.apache.org/jira/browse/HBASE-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alex Baranau updated HBASE-6618:
--------------------------------
    Attachment: HBASE-6618.patch

Looks like Anil didn't find time for this in the end, but I believe the functionality is very useful, so I will work on it.

Attached a patch that implements the filter. I have doubts about the filter API (i.e. defining fuzzy rules with Triple<byte[], byte[], byte[]>); any suggestions are very welcome!

> Implement FuzzyRowFilter with ranges support
> --------------------------------------------
>
>                 Key: HBASE-6618
>                 URL: https://issues.apache.org/jira/browse/HBASE-6618
>             Project: HBase
>          Issue Type: New Feature
>          Components: Filters
>            Reporter: Alex Baranau
>            Priority: Minor
>         Attachments: HBASE-6618-algo-desc-bits.png, HBASE-6618-algo.patch,
>                      HBASE-6618.patch
>
>
> Apart from the current ability to specify a fuzzy row filter, e.g. for the
> <userId_actionId> format as ????_0004 (where 0004 is the actionId), it would be
> great to also be able to specify a "fuzzy range", e.g. ????_0004,
> ..., ????_0099.
> See the initial discussion here: http://search-hadoop.com/m/WVLJdX0Z65
> Note: it is currently possible to provide multiple fuzzy row rules to the
> existing FuzzyRowFilter, but when the range is big (contains
> thousands of values) this is not efficient.
> The filter should perform efficient fast-forwarding during the scan (this is what
> distinguishes it from a regex row filter).
> While such functionality may seem like a proper fit for a custom filter (i.e.
> one not included in the standard filter set), it looks like the filter may be very
> reusable. We can judge based on the implementation that will hopefully be
> added.
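
To make the "fuzzy range" idea from the description concrete, here is a minimal sketch of the matching predicate only. It is not taken from the attached patch. The class name, the method names, and the mask convention (0 = fixed position, 1 = wildcard) are assumptions made for illustration, and the three arrays only loosely mirror the Triple<byte[], byte[], byte[]> mentioned in the comment. The sketch does not implement the fast-forwarding the issue asks for; it only decides whether a single row key falls inside the range.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not the HBASE-6618 patch and not an HBase API.
class FuzzyRangeSketch {

    // Helper for the examples: ASCII string to bytes.
    static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    // Compares only the fixed positions (mask[i] == 0) of row and bound,
    // byte by byte as unsigned values, left to right.
    // Assumes bound.length >= mask.length and row.length >= mask.length.
    static int compareFixed(byte[] row, byte[] bound, byte[] mask) {
        for (int i = 0; i < mask.length; i++) {
            if (mask[i] != 0) {
                continue; // wildcard position: any byte is accepted
            }
            int diff = (row[i] & 0xff) - (bound[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return 0;
    }

    // True if the fixed positions of row lie in [lower, upper], inclusive.
    static boolean matches(byte[] row, byte[] mask, byte[] lower, byte[] upper) {
        if (row.length < mask.length) {
            return false; // row too short to carry all fixed positions
        }
        return compareFixed(row, lower, mask) >= 0
            && compareFixed(row, upper, mask) <= 0;
    }

    public static void main(String[] args) {
        // Rule for ????_0004 ... ????_0099: first four bytes are wildcards.
        byte[] mask = {1, 1, 1, 1, 0, 0, 0, 0, 0};
        byte[] lower = ascii("????_0004");
        byte[] upper = ascii("????_0099");

        System.out.println(matches(ascii("abcd_0042"), mask, lower, upper)); // true
        System.out.println(matches(ascii("abcd_0003"), mask, lower, upper)); // false
        System.out.println(matches(ascii("abcd_0100"), mask, lower, upper)); // false
    }
}
```

A real filter would additionally have to compute the next row key that could possibly match and hand it to the scanner as a seek hint; that step is what makes a large range cheaper than thousands of individual fuzzy rules, and it is deliberately left out of this sketch.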