[
https://issues.apache.org/jira/browse/PHOENIX-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16657644#comment-16657644
]
Vincent Poon commented on PHOENIX-4980:
---------------------------------------
I can now repro this in an integration test. Filed PHOENIX-4988, which is
causing the issue seen here.
In this JIRA, we only see the inconsistency in row counts after a flush.
PHOENIX-4988 causes an upsert that touches only non-indexed columns to
generate an incorrect index rowkey, based on the indexed column of the
previously deleted row. Example:
upsert (pk, indexed, nonindexed) values (1, i1, n1)
delete a
upsert (pk, nonindexed) values (1, n2)
This creates an index rowkey of i1_1 (previousIndexedVal_pk).
If you don't flush, the next update to the rowkey will generate the same index
key, so the index maintenance of delete+put works fine.
But if you flush after the last upsert, the next update will not delete the
prior index rowkey, because it no longer has the correct info.
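
To make the sequence above concrete, here is a rough toy model in Python. It is
not Phoenix code. The class ToyIndexedTable, its fields, and the way the
"unflushed" state ignores the delete marker are all assumptions made up for
illustration, chosen only so the model reproduces the behavior described in
this comment. The real code path is tracked in PHOENIX-4988.

```python
def index_key(indexed_val, pk):
    # Index rowkey layout used in the comment: <indexedVal>_<pk>.
    shown = "null" if indexed_val is None else indexed_val
    return f"{shown}_{pk}"


class ToyIndexedTable:
    """Hypothetical toy model of the described behavior; not Phoenix internals."""

    def __init__(self):
        self.visible = {}    # pk -> columns, honoring deletes (correct view)
        self.unflushed = {}  # pk -> in-memory columns that ignore delete markers (assumed bug)
        self.index = set()   # index rowkeys

    def _prior_indexed(self, pk):
        # Assumed bug: before a flush, the deleted row's indexed value leaks through.
        if pk in self.unflushed:
            return self.unflushed[pk].get("indexed")
        return self.visible.get(pk, {}).get("indexed")

    def upsert(self, pk, **cols):
        had_row = pk in self.unflushed or pk in self.visible
        old_val = self._prior_indexed(pk)
        if had_row:
            # Index maintenance: delete the index row believed to be current.
            self.index.discard(index_key(old_val, pk))
        # If the upsert does not set the indexed column, reuse the prior value.
        new_val = cols.get("indexed", old_val)
        self.index.add(index_key(new_val, pk))
        base = self.unflushed.get(pk, self.visible.get(pk, {}))
        self.unflushed[pk] = {**base, **cols}
        self.visible[pk] = {**self.visible.get(pk, {}), **cols}

    def delete(self, pk):
        row = self.visible.pop(pk, None)
        if row is not None:
            self.index.discard(index_key(row.get("indexed"), pk))
        # self.unflushed is deliberately left alone to model the stale state.

    def flush(self):
        # After a flush only the correct, delete-aware view remains.
        self.unflushed.clear()


def run(flush_before_last_update):
    t = ToyIndexedTable()
    t.upsert(1, indexed="i1", nonindexed="n1")
    t.delete(1)
    t.upsert(1, nonindexed="n2")   # writes the incorrect index rowkey i1_1
    if flush_before_last_update:
        t.flush()
    t.upsert(1, indexed="i2")      # the "next update"
    return len(t.visible), sorted(t.index)


print(run(False))  # (1, ['i2_1'])          counts match
print(run(True))   # (1, ['i1_1', 'i2_1'])  orphaned i1_1, counts differ
```

In the toy model, the flush is the only difference between the two runs: with
it, the next update computes the old index key from the delete-aware state
(null_1), so i1_1 is never removed.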
> Mismatch in row counts between data and index tables while multiple clients
> try to upsert data
> ----------------------------------------------------------------------------------------------
>
> Key: PHOENIX-4980
> URL: https://issues.apache.org/jira/browse/PHOENIX-4980
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.14.0
> Reporter: Abhishek Talluri
> Priority: Major
> Labels: LocalIndex, globalMutableSecondaryIndex, secondaryIndex
> Attachments: TestSecIndex.java
>
>
> Phoenix table has A,B,C,D,E as its columns and A as the primary key for the
> table.
> CREATE TABLE TEST (A VARCHAR NOT NULL PRIMARY KEY, B VARCHAR, C VARCHAR, D
> VARCHAR , E VARCHAR);
> Global index is built on D & E
> CREATE INDEX TEST_IND on TEST (D,E);
> Client 1 updates A,B,C whereas client 2 updates A,B,D,E
> I used the Phoenix 5.14.2-1.cdh5.14.2.p0.3 parcel to test this issue. Ran with
> two threads that load data using upserts, reading from the csv file. Within 10
> iterations, I could observe the difference in the row counts between the data
> table and the index table. Attaching the code used to test this behavior. This
> issue exists in both Global and Local indexes.