[ 
https://issues.apache.org/jira/browse/HBASE-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177451#comment-13177451
 ] 

Kannan Muthukkaruppan commented on HBASE-5104:
----------------------------------------------

Lars: Yes.

Jiakai wrote in with: <<< the filters in FilterList are applied in order. The 
ColumnPaginationFilter's filterKeyValue() is called only when 
ColumnPrefixFilter's filterKeyValue() returns true. i.e. the current 
implementation should be equivalent to:
select * from (select * from Tab where filter1) where filter2

So it should return the desired result after the bug is fixed.

If you meant to suggest that filters in FilterList should be interchangeable, 
then it becomes a design question. I'm fine with the alternative approaches you 
suggested, too. >>>

Response: In terms of the existing code structure, Jiakai is correct. The 
filters are evaluated in order, so once SEEK_NEXT_USING_HINT is handled 
correctly, you'll get the behavior you want. But I am concerned overall with 
ColumnPaginationFilter being a stateful filter whose state gets updated 
depending on which other filters were ahead of it. But perhaps, for backward 
compatibility, we cannot change its existing behavior.
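
To make the ordering concern concrete, here is a minimal, self-contained 
sketch. It does not use the HBase classes: Pagination, scan() and 
sampleColumns() are hypothetical stand-ins, and the ten columns per tag are 
assumed sample data. It only models a stateful filter that counts every 
column it is shown, placed behind or ahead of a prefix check in a 
short-circuiting AND list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class FilterOrderSketch {
    /** Toy stand-in for a stateful pagination filter (not the HBase class). */
    static final class Pagination implements Predicate<String> {
        private final int limit;
        private final int offset;
        private int seen = 0;

        Pagination(int limit, int offset) {
            this.limit = limit;
            this.offset = offset;
        }

        @Override
        public boolean test(String column) {
            int index = seen++; // state advances on every column shown to this filter
            return index >= offset && index < offset + limit;
        }
    }

    /** Applies the filters in order with short-circuit AND semantics. */
    static List<String> scan(List<String> columns, List<Predicate<String>> filters) {
        List<String> out = new ArrayList<>();
        for (String column : columns) {
            boolean pass = true;
            for (Predicate<String> filter : filters) {
                if (!filter.test(column)) {
                    pass = false; // later filters never see this column
                    break;
                }
            }
            if (pass) {
                out.add(column);
            }
        }
        return out;
    }

    /** Assumed sample data: tag0..tag2, each with entries 001..010, in sorted order. */
    static List<String> sampleColumns() {
        List<String> columns = new ArrayList<>();
        for (String tag : new String[] {"tag0", "tag1", "tag2"}) {
            for (int i = 1; i <= 10; i++) {
                columns.add(String.format("%s:%03d:thread%d", tag, i, i));
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        Predicate<String> prefix = column -> column.startsWith("tag1");
        // Prefix first: pagination counts only the tag1 columns.
        System.out.println(scan(sampleColumns(), List.of(prefix, new Pagination(5, 5))));
        // Pagination first: it counts every column, so its window falls inside tag0.
        System.out.println(scan(sampleColumns(), List.of(new Pagination(5, 5), prefix)));
    }
}
```

In this toy model the prefix-first order yields the five tag1 columns 
006 through 010, while the reversed order yields nothing, because the 
pagination window is used up on tag0 columns that the prefix check then rejects.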

So we'll probably need to do both: fix the SEEK_NEXT_USING_HINT handling so that 
it works correctly with FilterList (at which point your case will start working 
fine), and also support limit/offset at the Scan/Get or ColumnPrefixFilter level 
as a cleaner alternative for pagination.
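
As a rough illustration of the first half of that plan, the sketch below 
models the reported symptom with a toy return-code enum. It is not the actual 
FilterList source: ReturnCode, prefix(), always(), buggyAll() and fixedAll() 
are hypothetical names, and the toy prefix filter returns SKIP past the prefix 
purely for simplicity.

```java
import java.util.List;
import java.util.function.Function;

public class ReturnCodeSketch {
    /** Toy subset of filter return codes (assumption: three are enough for the sketch). */
    enum ReturnCode { INCLUDE, SKIP, SEEK_NEXT_USING_HINT }

    /** Toy prefix filter: asks for a seek when the column sorts before the prefix. */
    static Function<String, ReturnCode> prefix(String p) {
        return column -> {
            if (column.startsWith(p)) {
                return ReturnCode.INCLUDE;
            }
            return column.compareTo(p) < 0 ? ReturnCode.SEEK_NEXT_USING_HINT : ReturnCode.SKIP;
        };
    }

    /** Toy filter that always answers with the same code. */
    static Function<String, ReturnCode> always(ReturnCode rc) {
        return column -> rc;
    }

    /** Models the symptom: only SKIP stops evaluation, so a seek hint acts like INCLUDE. */
    static ReturnCode buggyAll(String column, List<Function<String, ReturnCode>> filters) {
        for (Function<String, ReturnCode> filter : filters) {
            if (filter.apply(column) == ReturnCode.SKIP) {
                return ReturnCode.SKIP;
            }
        }
        return ReturnCode.INCLUDE;
    }

    /** One possible fix: the first non-INCLUDE code is handed back to the caller. */
    static ReturnCode fixedAll(String column, List<Function<String, ReturnCode>> filters) {
        for (Function<String, ReturnCode> filter : filters) {
            ReturnCode rc = filter.apply(column);
            if (rc != ReturnCode.INCLUDE) {
                return rc;
            }
        }
        return ReturnCode.INCLUDE;
    }

    public static void main(String[] args) {
        List<Function<String, ReturnCode>> filters =
                List.of(prefix("tag1"), always(ReturnCode.INCLUDE));
        System.out.println(buggyAll("tag0:001:thread1", filters));
        System.out.println(fixedAll("tag0:001:thread1", filters));
    }
}
```

With the toy buggy variant, a tag0 column is included even though the prefix 
filter asked to seek past it, which matches the "as if filter1 were not set" 
symptom in the report.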

One disadvantage of sticking with the FilterList approach would be that it 
might be trickier to get the "seek_next_using_hint" optimization. The 
ColumnPrefixFilter can only seek next using hint in limited circumstances. For 
example, if you have an OR filter of two prefix filters:

(ColumnPrefix("B") OR ColumnPrefix("A")) AND (PaginationFilter(5, 5))

we cannot have the first filter suggest a SEEK_NEXT_USING_HINT to go to prefix 
B, as that'll miss out columns starting at "A".
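
A small sketch of why this goes wrong, using a sorted set of made-up column 
names rather than the HBase classes. scanFrom(), firstFilterHint() and 
smallestHint() are hypothetical helpers. Taking the smallest hint across the 
OR branches is shown only as one conservative option, not as what FilterList 
does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

public class OrSeekHintSketch {
    /** Returns the columns at or after the seek target that match any of the prefixes. */
    static List<String> scanFrom(TreeSet<String> columns, String seekTarget, List<String> prefixes) {
        List<String> out = new ArrayList<>();
        for (String column : columns.tailSet(seekTarget, true)) {
            for (String prefix : prefixes) {
                if (column.startsWith(prefix)) {
                    out.add(column);
                    break;
                }
            }
        }
        return out;
    }

    /** Naive choice: trust the hint of whichever prefix filter is listed first. */
    static String firstFilterHint(List<String> prefixes) {
        return prefixes.get(0);
    }

    /** Conservative choice (an assumption): seek only as far as the smallest hint. */
    static String smallestHint(List<String> prefixes) {
        return Collections.min(prefixes);
    }

    public static void main(String[] args) {
        TreeSet<String> columns = new TreeSet<>(List.of("0x", "A1", "A2", "B1", "B2", "C1"));
        List<String> prefixes = List.of("B", "A"); // same order as in the example above

        // Seeking straight to "B" skips every column that starts with "A".
        System.out.println(scanFrom(columns, firstFilterHint(prefixes), prefixes));
        // Seeking to the smallest hint keeps both groups.
        System.out.println(scanFrom(columns, smallestHint(prefixes), prefixes));
    }
}
```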

We'll need to restrict the SEEK_NEXT_USING_HINT to be used in much more limited 
circumstances... and if there are other filters in the mix, we probably need to 
scan one cell at a time. This might be another reason to deal with LIMIT/OFFSET 
as either an option to the ColumnPrefixFilter itself or at the Scan/Get API 
level.

                
> FilterList doesn't work right with ColumnPaginationFilter
> ---------------------------------------------------------
>
>                 Key: HBASE-5104
>                 URL: https://issues.apache.org/jira/browse/HBASE-5104
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Kannan Muthukkaruppan
>            Assignee: Madhuwanti Vaidya
>         Attachments: testFilterList.rb
>
>
> Thanks Jiakai Liu for reporting this issue and doing the initial 
> investigation. Email from Jiakai below:
> Assuming that we have an index column family with the following entries:
> "tag0:001:thread1"
> ...
> "tag1:001:thread1"
> "tag1:002:thread2"
> ...
> "tag1:010:thread10"
> ...
> "tag2:001:thread1"
> "tag2:005:thread5"
> ...
> To get threads with "tag1" in range [5, 10), I tried the following code:
>     ColumnPrefixFilter filter1 = new ColumnPrefixFilter(Bytes.toBytes("tag1"));
>     ColumnPaginationFilter filter2 = new ColumnPaginationFilter(5 /* limit */, 5 /* offset */);
>     FilterList filters = new FilterList(Operator.MUST_PASS_ALL);
>     filters.addFilter(filter1);
>     filters.addFilter(filter2);
>     Get get = new Get(USER);
>     get.addFamily(COLUMN_FAMILY);
>     get.setMaxVersions(1);
>     get.setFilter(filters);
> Somehow it didn't work as expected. It returned the entries as if the filter1 
> were not set.
> It turns out that ColumnPrefixFilter returns SEEK_NEXT_USING_HINT in some 
> cases. FilterList does not handle this return code properly (it treats it as 
> INCLUDE).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
