Tanuj Khurana created PHOENIX-7758:
--------------------------------------
Summary: Read repair with scan filters can give incorrect results
Key: PHOENIX-7758
URL: https://issues.apache.org/jira/browse/PHOENIX-7758
Project: Phoenix
Issue Type: Bug
Affects Versions: 5.3.0, 5.2.1, 5.1.2
Reporter: Tanuj Khurana
Assignee: Tanuj Khurana
When we scan an index table and find an unverified row, we trigger the read
repair process. Read repair can end up deleting the index row, but if the scan
has filters, their state is not reset afterwards, so a filter can keep state
derived from a row that no longer exists. One such case is
DistinctPrefixFilter. Assume that the first row with a given prefix is
unverified and is deleted by read repair. When we scan the next row with the
same prefix, DistinctPrefixFilter ignores it because it has already seen that
prefix, and seeks to the next row key prefix, thereby skipping all subsequent
rows with that prefix. No row is returned for that prefix even though valid
rows exist.
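The failure mode described above can be reproduced with a small standalone
model. This is only a sketch: PrefixFilterSketch and ReadRepairScanSketch are
hypothetical names, row keys are plain strings at least as long as the prefix,
and "seeking to the next prefix" is modeled by skipping rows. It is not the
actual Phoenix or HBase code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified stand-in for a distinct-prefix filter.
final class PrefixFilterSketch {
    private final int prefixLen;
    private String lastPrefix; // state that is kept across rows

    PrefixFilterSketch(int prefixLen) {
        this.prefixLen = prefixLen;
    }

    // Returns true only for the first row seen with a given prefix.
    boolean accept(String rowKey) {
        String prefix = rowKey.substring(0, prefixLen);
        if (prefix.equals(lastPrefix)) {
            return false; // prefix already seen: row is skipped
        }
        lastPrefix = prefix;
        return true;
    }
}

final class ReadRepairScanSketch {
    // Assumption: the filter sees each row before the row is verified.
    static List<String> scan(List<String> indexRows, Set<String> unverified, int prefixLen) {
        PrefixFilterSketch filter = new PrefixFilterSketch(prefixLen);
        List<String> results = new ArrayList<>();
        for (String row : indexRows) {
            if (!filter.accept(row)) {
                continue;
            }
            if (unverified.contains(row)) {
                // Read repair deletes the row, but the filter still
                // remembers its prefix.
                continue;
            }
            results.add(row);
        }
        return results;
    }
}
```

With rows a1, a2, b1, a prefix length of 1, and a1 unverified, this scan
returns only b1. The valid row a2 is lost because the filter recorded prefix
"a" when it saw a1.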
One solution is to add a _reinitialize_ API to the filter interface so that we
can reset the state of the filter after read repair deletes a row. HBase
already defines a _reset_ API on the filter interface, but it is called to
reset the filter's per-row state after every row. The state we want to
_reinitialize_ is maintained across rows.
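A minimal sketch of how the proposed split between _reset_ and _reinitialize_
could look, using the same simplified string-based model. All names here
(ReinitializableFilterSketch, DistinctPrefixSketch, RepairAwareScanSketch) are
hypothetical, and the point at which the scanner calls reinitialize() is an
assumption, not the committed design.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical interface; not the HBase Filter API.
interface ReinitializableFilterSketch {
    boolean accept(String rowKey);
    void reset();        // clears per-row state, called after every row
    void reinitialize(); // clears state that is maintained across rows
}

final class DistinctPrefixSketch implements ReinitializableFilterSketch {
    private final int prefixLen;
    private String lastPrefix; // cross-row state

    DistinctPrefixSketch(int prefixLen) {
        this.prefixLen = prefixLen;
    }

    @Override
    public boolean accept(String rowKey) {
        String prefix = rowKey.substring(0, prefixLen);
        if (prefix.equals(lastPrefix)) {
            return false;
        }
        lastPrefix = prefix;
        return true;
    }

    @Override
    public void reset() {
        // This sketch keeps no per-row state, so there is nothing to clear.
    }

    @Override
    public void reinitialize() {
        lastPrefix = null; // forget the prefix of the deleted row
    }
}

final class RepairAwareScanSketch {
    static List<String> scan(List<String> indexRows, Set<String> unverified,
                             ReinitializableFilterSketch filter) {
        List<String> results = new ArrayList<>();
        for (String row : indexRows) {
            boolean accepted = filter.accept(row);
            filter.reset();
            if (!accepted) {
                continue;
            }
            if (unverified.contains(row)) {
                // Assumed hook: read repair deleted the row, so drop the
                // cross-row state that was derived from it.
                filter.reinitialize();
                continue;
            }
            results.add(row);
        }
        return results;
    }
}
```

With rows a1, a2, a3, b1, a prefix length of 1, and a1 unverified, this scan
returns a2 and b1: after a1 is deleted the filter forgets prefix "a", so a2 is
treated as the first row of that prefix and a3 is still skipped.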