[ https://issues.apache.org/jira/browse/HBASE-17958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15982567#comment-15982567 ]
Duo Zhang commented on HBASE-17958:
-----------------------------------

Yeah, this is really bad practice. The filter returns SEEK_NEXT_ROW or SEEK_NEXT_COL, but we may still pass a cell from the same row or the same column to SQM (ScanQueryMatcher). We have a strange optimization in SQM called stickyNextRow (which really confused me when I was refactoring SQM), so SEEK_NEXT_ROW usually works, but there is no such optimization for SEEK_NEXT_COL, so it is broken.

In fact, if we decide that a skip is better than a seek, then we should call heap.next() repeatedly until we reach the next row or next column, and only then start calling SQM.match again. It is really confusing that SQM returns SEEK_NEXT_ROW or SEEK_NEXT_COL but can still receive a cell from the same row or same column, right?

Thanks.

> Avoid passing unexpected cell to ScanQueryMatcher when optimize SEEK to SKIP
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-17958
>                 URL: https://issues.apache.org/jira/browse/HBASE-17958
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Guanghao Zhang
>
> {code}
> ScanQueryMatcher.MatchCode qcode = matcher.match(cell);
> qcode = optimize(qcode, cell);
> {code}
> The optimize method may change the MatchCode from SEEK_NEXT_COL/SEEK_NEXT_ROW
> to SKIP, but the next cell is still passed to ScanQueryMatcher. This gives
> wrong results with some filters, e.g. ColumnCountGetFilter, which simply
> counts the number of columns: if a cell from the same column is passed to
> the filter again, the count will be wrong. So we should avoid passing cells
> to ScanQueryMatcher when a SEEK is optimized to a SKIP.
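
The suggestion in the comment (keep advancing past cells of the same column before calling the matcher again) can be illustrated with a small standalone simulation. This is only a sketch under stated assumptions, not HBase code: Cell, MatchCode, CountingMatcher, and both scan methods are hypothetical stand-ins, the list index plays the role of heap.next(), and the matcher simply counts every cell it is handed, roughly the way the description says ColumnCountGetFilter counts columns.

```java
import java.util.Arrays;
import java.util.List;

public class SeekToSkipSketch {

    /** Hypothetical stand-in for an HBase cell; only row and qualifier matter here. */
    static final class Cell {
        final String row;
        final String qualifier;
        final long timestamp;

        Cell(String row, String qualifier, long timestamp) {
            this.row = row;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
        }

        boolean sameColumn(Cell other) {
            return row.equals(other.row) && qualifier.equals(other.qualifier);
        }
    }

    /** Reduced, hypothetical set of match codes. */
    enum MatchCode { SEEK_NEXT_COL }

    /** Hypothetical matcher: treats every cell it receives as a new column. */
    static final class CountingMatcher {
        int columnsSeen = 0;

        MatchCode match(Cell cell) {
            columnsSeen++;
            return MatchCode.SEEK_NEXT_COL; // only one version per column is wanted
        }
    }

    /** Problem case: SEEK_NEXT_COL is downgraded to a single skip, so the
     *  matcher is handed later versions of a column it already counted. */
    static int scanPassingEveryCell(List<Cell> cells) {
        CountingMatcher matcher = new CountingMatcher();
        for (Cell cell : cells) {
            matcher.match(cell); // return code effectively ignored
        }
        return matcher.columnsSeen;
    }

    /** Suggested behavior: after SEEK_NEXT_COL, advance until the column
     *  changes, and only then call the matcher again. */
    static int scanSkippingWithinColumn(List<Cell> cells) {
        CountingMatcher matcher = new CountingMatcher();
        int i = 0;
        while (i < cells.size()) {
            Cell current = cells.get(i);
            MatchCode code = matcher.match(current);
            i++; // stands in for heap.next()
            if (code == MatchCode.SEEK_NEXT_COL) {
                while (i < cells.size() && cells.get(i).sameColumn(current)) {
                    i++; // skip remaining versions without consulting the matcher
                }
            }
        }
        return matcher.columnsSeen;
    }

    public static void main(String[] args) {
        List<Cell> cells = Arrays.asList(
            new Cell("r1", "a", 3), new Cell("r1", "a", 2),
            new Cell("r1", "b", 5), new Cell("r1", "b", 4), new Cell("r1", "b", 1),
            new Cell("r1", "c", 1));
        System.out.println(scanPassingEveryCell(cells));     // prints 6
        System.out.println(scanSkippingWithinColumn(cells)); // prints 3
    }
}
```

With three columns and six cell versions, the first loop counts 6 "columns" while the second counts 3, which is the kind of miscount the issue describes.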