[
https://issues.apache.org/jira/browse/HBASE-20565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16546584#comment-16546584
]
Zheng Hu edited comment on HBASE-20565 at 7/17/18 1:19 PM:
---
Uploaded patch.v1. Let me explain the core idea:
Assume that filterList = filter-A AND filter-B AND filter-C AND ,
if a cell has been filtered out by filter-A, then there is no need to
pass the cell to filter-B and filter-C. Only the cells included by filter-A
should be passed to filter-B, and only the cells included by both filter-A and
filter-B should be passed to filter-C.
The max rule can still work, but only the include* return codes should be
merged into a max return code.
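To make the idea concrete, here is a minimal Java sketch of the short-circuit AND evaluation described above. It is not the patch itself: the class `AndFilterListSketch`, the simplified `Code` enum (its names only loosely mirror HBase's return codes), the `Filter` interface, `PrefixFilter` and `CountFilter` are all hypothetical stand-ins, and the ordering used for the max rule among the include* codes is an assumption.

```java
import java.util.List;

// Hypothetical stand-ins for illustration; this is not HBase's Filter API.
class AndFilterListSketch {
    // Assumption: among the INCLUDE* codes, a later constant means a larger forward step.
    enum Code { INCLUDE, INCLUDE_AND_NEXT_COL, INCLUDE_AND_SEEK_NEXT_ROW, SKIP, NEXT_COL, NEXT_ROW }

    interface Filter { Code filterCell(String cell); }

    static boolean isInclude(Code c) {
        return c.ordinal() <= Code.INCLUDE_AND_SEEK_NEXT_ROW.ordinal();
    }

    // AND semantics: stop at the first filter that rejects the cell, so later
    // filters never see it; merge only INCLUDE* codes with the max rule.
    static Code filterCell(List<Filter> filters, String cell) {
        Code merged = Code.INCLUDE;
        for (Filter f : filters) {
            Code rc = f.filterCell(cell);
            if (!isInclude(rc)) {
                return rc;
            }
            if (rc.compareTo(merged) > 0) {
                merged = rc;
            }
        }
        return merged;
    }

    // Includes cells whose name starts with the given prefix, skips the rest.
    static class PrefixFilter implements Filter {
        private final String prefix;
        PrefixFilter(String prefix) { this.prefix = prefix; }
        public Code filterCell(String cell) {
            return cell.startsWith(prefix) ? Code.INCLUDE : Code.SKIP;
        }
    }

    // Count-related filter: includes the first `limit` cells it is shown.
    static class CountFilter implements Filter {
        private final int limit;
        int seen = 0;
        CountFilter(int limit) { this.limit = limit; }
        public Code filterCell(String cell) {
            seen++;
            return seen > limit ? Code.NEXT_ROW : Code.INCLUDE;
        }
    }

    public static void main(String[] args) {
        CountFilter count = new CountFilter(2);
        List<Filter> filters = List.of(new PrefixFilter("a"), count);
        for (String cell : List.of("b1", "a1", "b2", "a2", "a3")) {
            System.out.println(cell + " -> " + filterCell(filters, cell));
        }
        System.out.println("cells seen by count filter: " + count.seen);
    }
}
```

In this sketch the count filter's counter only advances for cells that the earlier filter included, which is the behaviour the count-related filters need.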
The problem is that the order of the filters may change which cells are returned, so we need to
tell the user explicitly to place the count-related filters in the last
position. In SQL syntax, a statement such as the following is accepted:
{code}
select * from table where xxx and yyy limit 1, 100
{code}
Here the limit is at the end of the statement.
A statement such as:
{code}
select * from table where xxx limit 1, 1000 and yyy
{code}
will not be accepted.
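As a rough analogy for why the position of the count-related step matters, the following Java sketch applies a predicate and a limit in both orders. It uses only `java.util.stream`; the class `FilterOrderSketch`, its method names and the sample cell names are hypothetical, and the streams merely stand in for a filter list.

```java
import java.util.List;
import java.util.stream.Collectors;

// Analogy only: Java streams stand in for a filter list; nothing here is HBase API.
class FilterOrderSketch {
    static boolean matches(String cell) {
        return cell.startsWith("a");
    }

    // Like "where xxx limit n": the count-related step comes last.
    static List<String> filterThenLimit(List<String> cells, int n) {
        return cells.stream()
                .filter(FilterOrderSketch::matches)
                .limit(n)
                .collect(Collectors.toList());
    }

    // Count-related step first: it also counts cells the predicate rejects afterwards.
    static List<String> limitThenFilter(List<String> cells, int n) {
        return cells.stream()
                .limit(n)
                .filter(FilterOrderSketch::matches)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> cells = List.of("b1", "a1", "b2", "a2", "a3");
        System.out.println(filterThenLimit(cells, 2)); // [a1, a2]
        System.out.println(limitThenFilter(cells, 2)); // [a1]
    }
}
```

The same input yields two different results, so the count-related step has to sit at the end to get the intended behaviour.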
was (Author: openinx):
Uploaded patch.v1, and pasted the discussion with [~anoop.hbase] ...
> What if the order of filters be opposite way in FL?
A good question. I think we need to tell the user explicitly to place the
count-related filters in the last position. In SQL syntax, a statement such as
"select * from table where xxx and xxx limit 1, 100" is accepted because the limit is at the end
of the statement, while a statement such as "select * from table where xxx limit 1, 1000 and
xx" will not be accepted.
I think it's meaningful to require that the count-related filters be put at the end of
the sub-filters.
On Fri, May 25, 2018 at 6:25 PM, Anoop John wrote:
> if a cell has been filtered out by filter-A, then no need to
> pass the cell to filter-B and filter-C, only the included cell set of
> filter-A should be passed to filter-B, and only the included cell set
> of filter-A & filter-B should be passed to filter-C ...
You mean you propose such a change now? Then the order of the filters matters,
right? Say the count-based filter comes second and the other
(which can filter out some cells) comes first; then it will work. What if
the filters are in the opposite order in the FL?
-Anoop-
On Fri, May 25, 2018 at 12:29 PM, OpenInx wrote:
> I have to admit that my previous solution was one-sided...
> Not only does the ColumnPaginationFilter have the problem; other counter-related
> filters have it too.
>
>> We have 2 filters in a FL. We pass cells 1 and 2. The first filter selects cell 1,
>> but it is filtered out by F2. Now we need to tell both filters that we
>> have excluded this cell. This will be useful for filters which work on a
>> counting basis. Such a filter can reduce the counter which it would have advanced.
>> Pls see the possibility.
>
> Assume that FilterList = filter-A AND ColumnCountGetFilter. If cell x
> has been filtered out by filter-A, then what return code should
> ColumnCountGetFilter#filterKeyValue return?
> In theory, the count in ColumnCountGetFilter should not increment when
> checking cell x. So what is the purpose of passing cell x to
> ColumnCountGetFilter#filterKeyValue?
> To get the return code from ColumnCountGetFilter so as to maximize the forward step?
>
> Now I'm thinking that the implementation in branch-1.2 is more reasonable.
> Assume that filterList = filter-A AND filter-B AND filter-C AND ,
> if a cell has been filtered out by filter-A, then there is no need to
> pass the cell to filter-B and filter-C. Only the cells included by
> filter-A should be passed to filter-B, and only the cells included by both
> filter-A and filter-B should be passed to filter-C.
>
> The max rule can still work, but only the include* return codes should be
> merged into a max return code.
>
> I think these semantics are more reasonable.
>
>
> On Thu, May 24, 2018 at 4:31 PM, Anoop John wrote:
>>
>> The offset is the cell offset in a row na. This says we already fetched
>> till there. So ya, if there is another filter along with this pagination
>> filter, it must be hard for the pagination filter to decide the column
>> offset for the next request. So ya, ideally the column offset might work
>> there.
>> But the issue is that we cannot really generalize this. It depends on the way
>> the col offset and column value offset have been implemented in the pagination
>> filter.
>>
>> I am kind of thinking that we need a generic framework change now. If we pass
>> all cells to all filters (which is also correct), then there should be a way
>> to tell all filters afterwards that we later decided this cell is
>> not included in the result.
>>
>> We have 2 filters in a FL. We pass cells 1 and 2. The first filter selects cell 1,
>> but it is filtered out by F2. Now we need to tell both filters that we have
>> excluded this