[jira] [Commented] (HBASE-22833) MultiRowRangeFilter should provide a method for creating a filter which is functionally equivalent to multiple prefix filters

Itsuki Toyota (JIRA) Mon, 12 Aug 2019 04:30:32 -0700


    [ 
https://issues.apache.org/jira/browse/HBASE-22833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905099#comment-16905099
 ]


Itsuki Toyota commented on HBASE-22833:
---------------------------------------

[~openinx] Thanks for your reply and the neat article!

> I think we did not expand the MultiRowRangeFilter as multiple start/stop 
> ranges

My understanding was that _MultiRowRangeFilter_ can handle multiple start/stop 
ranges, since it has the constructor 
_MultiRowRangeFilter(List<MultiRowRangeFilter.RowRange> list)_ and each 
_MultiRowRangeFilter.RowRange_ consists of a _startRow_, a _stopRow_, and flags 
indicating whether the start row and the stop row are inclusive. [0][1]

I found the following mention in your article:

> 例如对PrefixFilter(333)来说，碰到rowkey=111的行时，其实是可以根据前缀为333直接定位到下一个rowkey=333的Cell，只是当前的PrefixFilter没有做这个优化。

(In English)
> For example, for PrefixFilter(333), when a row with rowkey=111 is 
> encountered, the scan could seek directly to the next Cell with rowkey=333 
> based on the prefix 333, but the current PrefixFilter does not implement 
> this optimization.

I thought a filter that contains multiple start/stop row pairs (e.g., 
_MultiRowRangeFilter_) could achieve this kind of optimization. (By "it's better 
than the current formal way" I meant this optimization.)
Could you tell me if anything in my understanding is incorrect?
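
To make the optimization I have in mind concrete, here is a minimal, self-contained sketch. It is not HBase code: _Range_, _nextSeekTarget_, and _bytes_ are hypothetical names of my own, and the sketch assumes the ranges are sorted, non-overlapping, and half-open (start inclusive, stop exclusive). For a row that falls in a gap between ranges, it returns the start of the next range as a seek target instead of checking every row in between.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

final class RangeSeekSketch {
    /** Half-open row range [start, stop); a hypothetical stand-in for a row range. */
    static final class Range {
        final byte[] start;
        final byte[] stop;

        Range(byte[] start, byte[] stop) {
            this.start = start;
            this.stop = stop;
        }
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Unsigned lexicographic comparison of two row keys. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    /**
     * Returns the row itself if it lies inside a range, the start row of the
     * next range if it lies in a gap, or null if it is past every range.
     * A linear scan keeps the sketch short; a binary search would also work.
     */
    static byte[] nextSeekTarget(List<Range> sortedRanges, byte[] row) {
        for (Range r : sortedRanges) {
            if (compare(row, r.stop) >= 0) {
                continue; // this range ends at or before the row
            }
            // r is the first range whose stop row is greater than the row.
            return compare(row, r.start) >= 0 ? row : r.start;
        }
        return null;
    }

    public static void main(String[] args) {
        List<Range> ranges = Arrays.asList(
            new Range(bytes("123"), bytes("124")),
            new Range(bytes("456"), bytes("457")),
            new Range(bytes("678"), bytes("679")));
        byte[] target = nextSeekTarget(ranges, bytes("111"));
        System.out.println(new String(target, StandardCharsets.UTF_8));
    }
}
```

With the three prefixes from the issue description, a row such as 111 would be sent straight to 123, and a row such as 500 straight to 678.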

Cheers,

[0] 
https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/MultiRowRangeFilter.html
[1] 
https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/MultiRowRangeFilter.RowRange.html


> MultiRowRangeFilter should provide a method for creating a filter which is 
> functionally equivalent to multiple prefix filters
> -----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-22833
>                 URL: https://issues.apache.org/jira/browse/HBASE-22833
>             Project: HBase
>          Issue Type: Wish
>          Components: Client
>    Affects Versions: 3.0.0
>            Reporter: Itsuki Toyota
>            Priority: Minor
>
> Hi,
> I think the current formal way to apply multiple prefix filters is to create a 
> _FilterList_ and add _PrefixFilter_ instances to the list:
> {code:java}
> FilterList allFilters = new FilterList(FilterList.Operator.MUST_PASS_ONE);
> allFilters.addFilter(new PrefixFilter(Bytes.toBytes("123")));
> allFilters.addFilter(new PrefixFilter(Bytes.toBytes("456")));
> allFilters.addFilter(new PrefixFilter(Bytes.toBytes("678")));
> scan.setFilter(allFilters);
> {code}
> (cf. 
> https://stackoverflow.com/questions/41074213/hbase-how-to-specify-multiple-prefix-filters-in-a-single-scan-operation
>  )
> However, for a single prefix, HBase provides the 
> _scan.setRowPrefixFilter_ method.
> Rather than adding a filter, this method restricts the scan to a row range 
> by setting a start row and a stop row.
> The stop row is computed by calling 
> _calculateTheClosestNextRowKeyForPrefix_ (cf. 
> https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Scan.java#L574-L597
>  )
> _MultiRowRangeFilter_ could take a list of start row and stop row pairs, 
> and _calculateTheClosestNextRowKeyForPrefix_ could compute the stop row 
> corresponding to a given start row (i.e., a prefix).
> I think this kind of filter (one that is functionally equivalent to 
> multiple prefix filters) should be creatable through _MultiRowRangeFilter_, 
> and that would be better than the current formal way.
> Cheers,
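
The stop-row computation described above can be sketched in plain Java. This is only an illustration of the idea, not the code in _Scan.java_: the class name _PrefixRange_ and the method _closestNextRowKeyForPrefix_ are hypothetical, and returning an empty array to mean "scan to the end of the table" is an assumption of this sketch.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class PrefixRange {
    /**
     * Returns the smallest row key that is greater than every key starting
     * with the given prefix. Returns an empty array when no such key exists
     * (empty prefix or a prefix made only of 0xFF bytes).
     */
    static byte[] closestNextRowKeyForPrefix(byte[] prefix) {
        int offset = prefix.length;
        // Trailing 0xFF bytes cannot be incremented, so drop them.
        while (offset > 0 && prefix[offset - 1] == (byte) 0xFF) {
            offset--;
        }
        if (offset == 0) {
            return new byte[0]; // assumption: empty means "no upper bound"
        }
        byte[] stop = Arrays.copyOf(prefix, offset);
        stop[offset - 1]++; // increment the last byte that is not 0xFF
        return stop;
    }

    public static void main(String[] args) {
        // Each prefix becomes one half-open range [prefix, stop).
        for (String p : new String[] {"123", "456", "678"}) {
            byte[] stop = closestNextRowKeyForPrefix(p.getBytes(StandardCharsets.UTF_8));
            System.out.println(p + " -> " + new String(stop, StandardCharsets.UTF_8));
        }
    }
}
```

Under these assumptions, each prefix maps to one start/stop pair (for example, prefix 123 gives the range from 123 inclusive to 124 exclusive), and a list of such pairs is what a multi-range filter would need.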



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
