[
https://issues.apache.org/jira/browse/HBASE-17958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15982567#comment-15982567
]
Duo Zhang commented on HBASE-17958:
-----------------------------------
Yeah, this is really bad practice. The filter returns SEEK_NEXT_ROW or
SEEK_NEXT_COL, but we may still pass a cell from the same row or the same
column to SQM. We have a strange optimization in SQM called stickyNextRow
(which was really confusing to me when refactoring SQM), so SEEK_NEXT_ROW
usually works, but for SEEK_NEXT_COL there is no such optimization, so it is
broken.
In fact, if we decide that a skip is better than a seek, then we should call
heap.next() repeatedly until we reach the next row or next column, and only
then start calling SQM.match again. It is really confusing that SQM returns
SEEK_NEXT_ROW or SEEK_NEXT_COL but could still receive a cell from the
same row or same column, right?
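
The skip loop described above might be sketched as follows. This is a toy
model, not HBase code: Cell, CountingMatcher, scanNaive and scanSkipping are
hypothetical names invented for illustration, and the matcher simply counts
every cell it is shown as a new column, roughly the way a column-counting
filter would.

```java
import java.util.ArrayList;
import java.util.List;

public class SkipUntilNextColumn {
    // Hypothetical stand-in for a cell: row key, column qualifier, version timestamp.
    static final class Cell {
        final String row;
        final String qualifier;
        final long ts;
        Cell(String row, String qualifier, long ts) {
            this.row = row;
            this.qualifier = qualifier;
            this.ts = ts;
        }
    }

    enum MatchCode { SEEK_NEXT_COL }

    // Toy matcher (an assumption, not the real SQM): every call counts one
    // column, then asks the scanner to move on to the next column.
    static final class CountingMatcher {
        int columnsSeen = 0;
        MatchCode match(Cell cell) {
            columnsSeen++;
            return MatchCode.SEEK_NEXT_COL;
        }
    }

    static boolean sameColumn(Cell a, Cell b) {
        return a.row.equals(b.row) && a.qualifier.equals(b.qualifier);
    }

    // Problematic pattern: the seek is downgraded to a skip, but the very
    // next cell is handed to the matcher even if it is in the same column.
    static int scanNaive(List<Cell> cells) {
        CountingMatcher matcher = new CountingMatcher();
        for (Cell cell : cells) {
            matcher.match(cell);
        }
        return matcher.columnsSeen;
    }

    // Suggested pattern: after SEEK_NEXT_COL, keep advancing (the analogue of
    // heap.next()) until the column changes, and only then call match again.
    static int scanSkipping(List<Cell> cells) {
        CountingMatcher matcher = new CountingMatcher();
        int i = 0;
        while (i < cells.size()) {
            Cell current = cells.get(i);
            MatchCode code = matcher.match(current);
            i++;
            if (code == MatchCode.SEEK_NEXT_COL) {
                while (i < cells.size() && sameColumn(current, cells.get(i))) {
                    i++; // skipped cells never reach the matcher
                }
            }
        }
        return matcher.columnsSeen;
    }

    // One row with two columns: three versions of q1 and two versions of q2.
    static List<Cell> sampleCells() {
        List<Cell> cells = new ArrayList<>();
        cells.add(new Cell("r1", "q1", 3));
        cells.add(new Cell("r1", "q1", 2));
        cells.add(new Cell("r1", "q1", 1));
        cells.add(new Cell("r1", "q2", 2));
        cells.add(new Cell("r1", "q2", 1));
        return cells;
    }

    public static void main(String[] args) {
        System.out.println("naive count:    " + scanNaive(sampleCells()));
        System.out.println("skipping count: " + scanSkipping(sampleCells()));
    }
}
```

With the sample data, the naive scan shows the matcher all five cells, while
the skipping scan shows it only the first cell of each of the two columns, so
only the latter count matches the real number of columns.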
Thanks.
> Avoid passing unexpected cell to ScanQueryMatcher when optimize SEEK to SKIP
> ----------------------------------------------------------------------------
>
> Key: HBASE-17958
> URL: https://issues.apache.org/jira/browse/HBASE-17958
> Project: HBase
> Issue Type: Bug
> Reporter: Guanghao Zhang
>
> {code}
> ScanQueryMatcher.MatchCode qcode = matcher.match(cell);
> qcode = optimize(qcode, cell);
> {code}
> The optimize method may change the MatchCode from SEEK_NEXT_COL/SEEK_NEXT_ROW
> to SKIP, but the next cell is still passed to ScanQueryMatcher. This gives
> wrong results with some filters, e.g. ColumnCountGetFilter, which just counts
> the number of columns: if a cell from the same column is passed to this
> filter again, the count will be wrong. So we should avoid passing such cells
> to ScanQueryMatcher when optimizing SEEK to SKIP.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)