Viraj Jasani created PHOENIX-7733:
-------------------------------------

             Summary: Data integrity issue impacting uncovered indexes (and 
potentially others) with rare occurrence
                 Key: PHOENIX-7733
                 URL: https://issues.apache.org/jira/browse/PHOENIX-7733
             Project: Phoenix
          Issue Type: Improvement
            Reporter: Viraj Jasani


Phoenix provides two types of indexes: covered indexes and uncovered indexes. 
While running some tests on uncovered indexes, we discovered a data integrity 
issue that occurs when running against HBase 2.5 but not against HBase 2.6. 
The issue is likely not limited to uncovered indexes.

While scanning rows in an uncovered index table, the corresponding full row is 
read from the data table. If the user provides a condition expression, the 
condition is evaluated on the data table row, as a server-side filter on the 
table regions. In the test that discovered the issue, a very large number of 
rows at the start of the scan do not satisfy the filter expression. In other 
words, more than "hbase.client.scanner.max.result.size" bytes worth of rows are 
scanned before any row satisfies the filter expression. As a result, the 
scanner returns no rows on HBase 2.5. Increasing 
"hbase.client.scanner.max.result.size" to a higher value made the scanner 
return the correct result.
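
The symptom described above can be illustrated with a toy model. This is only a 
sketch and not HBase or Phoenix code: the class ScanSizeLimitSketch, its 
methods, and the way scanned bytes are counted are all hypothetical. In 
particular, "the client stops on an empty batch" is an assumed failure mode 
chosen to show how a size limit and a filter can interact. It is not the 
confirmed root cause, which this Jira does not identify.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Toy model of a size-limited, filtered scan. Not HBase code; all names are hypothetical. */
final class ScanSizeLimitSketch {

    /** One simulated scan response. */
    static final class Batch {
        final List<String> rows;    // rows that passed the filter
        final int nextIndex;        // where the next call should resume
        final boolean moreResults;  // true if unscanned rows remain

        Batch(List<String> rows, int nextIndex, boolean moreResults) {
            this.rows = rows;
            this.nextIndex = nextIndex;
            this.moreResults = moreResults;
        }
    }

    /**
     * Simulated server side. ASSUMPTION: every scanned row counts toward the
     * size limit, including rows that the filter rejects.
     */
    static Batch next(List<String> table, int start, Predicate<String> filter, long maxResultSize) {
        List<String> passed = new ArrayList<>();
        long scannedBytes = 0;
        int i = start;
        while (i < table.size() && scannedBytes < maxResultSize) {
            String row = table.get(i++);
            scannedBytes += row.length();
            if (filter.test(row)) {
                passed.add(row);
            }
        }
        return new Batch(passed, i, i < table.size());
    }

    /** ASSUMED faulty behavior: an empty batch is treated as the end of the scan. */
    static List<String> scanStoppingOnEmptyBatch(List<String> table, Predicate<String> filter, long maxResultSize) {
        List<String> result = new ArrayList<>();
        int pos = 0;
        while (true) {
            Batch batch = next(table, pos, filter, maxResultSize);
            if (batch.rows.isEmpty()) {
                break; // loses any matching rows that come later
            }
            result.addAll(batch.rows);
            pos = batch.nextIndex;
            if (!batch.moreResults) {
                break;
            }
        }
        return result;
    }

    /** Expected behavior: keep asking until the server reports no more rows. */
    static List<String> scanUntilExhausted(List<String> table, Predicate<String> filter, long maxResultSize) {
        List<String> result = new ArrayList<>();
        int pos = 0;
        Batch batch;
        do {
            batch = next(table, pos, filter, maxResultSize);
            result.addAll(batch.rows);
            pos = batch.nextIndex;
        } while (batch.moreResults);
        return result;
    }

    /** Builds a table whose leading rows all fail a "starts with match" filter. */
    static List<String> sampleTable(int nonMatching, int matching) {
        List<String> table = new ArrayList<>();
        for (int i = 0; i < nonMatching; i++) {
            table.add("skip-" + i);
        }
        for (int i = 0; i < matching; i++) {
            table.add("match-" + i);
        }
        return table;
    }
}
```

With sampleTable(10, 2), a filter of row -> row.startsWith("match"), and a 
limit of 30, scanStoppingOnEmptyBatch returns no rows while scanUntilExhausted 
returns both matching rows. Raising the limit to 1000 makes both variants 
return the two rows, which mirrors the effect of increasing 
"hbase.client.scanner.max.result.size" in the failing test.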

This data correctness issue is not present on HBase 2.6 because HBASE-27558 
happens to fix it, although fixing it was not the intent of that Jira. The 
large number of changes between HBase 2.5 and 2.6 in the scan path (mostly 
related to quotas and metrics) makes it difficult to find the root cause.

Jira for fixing the issue on HBase 2.5: HBASE-29722

The purpose of this Jira is to create tests that reproduce the rare occurrence 
of the data correctness issue. We need to wait until a new HBase 2.5 release 
with the above fix is available.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
