[
https://issues.apache.org/jira/browse/HBASE-22833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905099#comment-16905099
]
Itsuki Toyota commented on HBASE-22833:
---------------------------------------
[~openinx] Thanks for your reply and the neat article!
> I think we did not expand the MultiRowRangeFilter as multiple start/stop
> ranges
My understanding was that _MultiRowRangeFilter_ can handle multiple start/stop
ranges, since it has the constructor
_MultiRowRangeFilter(List<MultiRowRangeFilter.RowRange> list)_ and
_MultiRowRangeFilter.RowRange_ is composed of _startRow_, _stopRow_, and flags
that indicate whether each of the start/stop rows is inclusive. [0][1]
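As a rough illustration of how a prefix could map onto such a start/stop pair, here is a minimal sketch in plain Java. The class and method names (_PrefixRangeSketch_, _exclusiveStopRow_) are hypothetical and not part of HBase. The sketch only assumes that row keys sort as unsigned bytes and that an empty stop row means "no upper bound":

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of the HBase API.
final class PrefixRangeSketch {

    /**
     * Returns the smallest row key that sorts after every key starting with
     * the given prefix, or an empty array when no such key exists (the prefix
     * is empty or consists only of 0xFF bytes).
     */
    static byte[] exclusiveStopRow(byte[] prefix) {
        int end = prefix.length;
        // Trailing 0xFF bytes cannot be incremented, so drop them first.
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0]; // assumption: empty stop row = no upper bound
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++; // increment the last remaining byte
        return stop;
    }

    public static void main(String[] args) {
        for (String p : new String[] {"123", "456", "678"}) {
            byte[] start = p.getBytes(StandardCharsets.UTF_8);
            byte[] stop = exclusiveStopRow(start);
            // Each pair would describe one range: start inclusive, stop exclusive.
            System.out.println("[" + p + ", "
                + new String(stop, StandardCharsets.UTF_8) + ")");
        }
    }
}
```

Each resulting pair could then presumably be passed as one _RowRange_ with the start row inclusive and the stop row exclusive.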
I found the following mention in your article:
> 例如对PrefixFilter(333)来说,碰到rowkey=111的行时,其实是可以根据前缀为333直接定位到下一个rowkey=333的Cell,只是当前的PrefixFilter没有做这个优化。
(In English)
> For example, with PrefixFilter(333), when a row with rowkey=111 is
> encountered, the scan could actually seek directly to the next Cell with
> rowkey=333 based on the prefix 333, but the current PrefixFilter does not
> implement this optimization.
I thought a filter that contains multiple start/stop row pairs (e.g.,
_MultiRowRangeFilter_) could achieve this kind of optimization. (By "it's better
than the current formal way" I meant this optimization.)
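To make the idea concrete, here is a minimal sketch in plain Java of how a filter holding sorted, non-overlapping ranges could decide, for each row, whether to include it, skip ahead to the start of the next range, or stop. This is not HBase's actual implementation. The class _SeekHintSketch_, its methods, and the string results are hypothetical names used only for illustration:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of range-based skipping; not HBase code.
final class SeekHintSketch {

    /** Builds one half-open range {start, stop} from two strings. */
    static byte[][] range(String start, String stop) {
        return new byte[][] {
            start.getBytes(StandardCharsets.UTF_8),
            stop.getBytes(StandardCharsets.UTF_8)
        };
    }

    /** Unsigned lexicographic comparison of two row keys. */
    static int compareRows(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /** Binary search: index of the first range whose exclusive stop is > row. */
    static int firstCandidate(byte[][][] ranges, byte[] row) {
        int lo = 0;
        int hi = ranges.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (compareRows(ranges[mid][1], row) <= 0) {
                lo = mid + 1; // this range ends at or before row
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    /** Ranges must be sorted by start row and must not overlap. */
    static String decide(byte[][][] ranges, String rowKey) {
        byte[] row = rowKey.getBytes(StandardCharsets.UTF_8);
        int i = firstCandidate(ranges, row);
        if (i == ranges.length) {
            return "DONE"; // row is past the last range
        }
        if (compareRows(row, ranges[i][0]) >= 0) {
            return "INCLUDE"; // row lies inside range i
        }
        // Row lies in a gap: jump straight to the start of the next range.
        return "SEEK " + new String(ranges[i][0], StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[][][] ranges = {range("333", "334"), range("456", "457")};
        for (String row : new String[] {"111", "3331", "400", "500"}) {
            System.out.println(row + " -> " + decide(ranges, row));
        }
    }
}
```

With the ranges for prefixes 333 and 456, a row with rowkey=111 yields "SEEK 333", which is the skip described in the quoted passage.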
Could you tell me if anything in my understanding is incorrect?
Cheers,
[0]
https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/MultiRowRangeFilter.html
[1]
https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/MultiRowRangeFilter.RowRange.html
> MultiRowRangeFilter should provide a method for creating a filter which is
> functionally equivalent to multiple prefix filters
> -----------------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-22833
> URL: https://issues.apache.org/jira/browse/HBASE-22833
> Project: HBase
> Issue Type: Wish
> Components: Client
> Affects Versions: 3.0.0
> Reporter: Itsuki Toyota
> Priority: Minor
>
> Hi,
> I think the current formal way to apply multiple prefix filters is to create a
> _FilterList_ and add _PrefixFilter_ instances to the list:
> {code:java}
> FilterList allFilters = new FilterList(FilterList.Operator.MUST_PASS_ONE);
> allFilters.addFilter(new PrefixFilter(Bytes.toBytes("123")));
> allFilters.addFilter(new PrefixFilter(Bytes.toBytes("456")));
> allFilters.addFilter(new PrefixFilter(Bytes.toBytes("678")));
> scan.setFilter(allFilters);
> {code}
> (cf.
> https://stackoverflow.com/questions/41074213/hbase-how-to-specify-multiple-prefix-filters-in-a-single-scan-operation
> )
> However, for a single prefix filter, HBase provides the
> _scan.setRowPrefixFilter_ method.
> This method restricts the scan to a row range by setting a start row and a
> stop row. The stop row is computed by calling
> _calculateTheClosestNextRowKeyForPrefix_ (cf.
> https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Scan.java#L574-L597
> )
> _MultiRowRangeFilter_ could take a list of start row and stop row pairs,
> and _calculateTheClosestNextRowKeyForPrefix_ could compute the stop row
> corresponding to a given start row (i.e., a prefix).
> I think this kind of filter (one that is functionally equivalent to
> multiple prefix filters) should be creatable by _MultiRowRangeFilter_, and
> it's better than the current formal way.
> Cheers,