[ 
https://issues.apache.org/jira/browse/PHOENIX-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16657644#comment-16657644
 ] 

Vincent Poon commented on PHOENIX-4980:
---------------------------------------

I can now reproduce this in an integration test.  I filed PHOENIX-4988, which 
is the cause of the issue seen here.
In this JIRA, the inconsistency in row counts only shows up after a flush. 
PHOENIX-4988 causes an upsert that touches only non-indexed columns to 
generate an incorrect index rowkey, based on the indexed column of the 
previously deleted row.  Example:
upsert (pk, indexed, nonindexed) values (1, i1, n1)
delete a
upsert (pk, nonindexed) values (1, n2)

This creates an index rowkey of i1_1 (previousIndexedVal_pk).

If you don't flush, the next update to the row will generate the same index 
rowkey, so the index maintenance of delete+put works fine.
But if you flush after the last upsert, the next update will not delete the 
prior index rowkey because it no longer has the correct info.
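
A toy sketch of the rowkey derivation described above may help. It is not 
Phoenix code: ToyTable, index_rowkey, last_indexed and the buggy flag are 
hypothetical names, and modeling a missing indexed value as an empty string is 
an assumption. It only shows how the upsert after the delete can end up with 
i1_1, and it does not model the flush.

```python
# Toy model only: every name here is hypothetical, not a Phoenix internal.

def index_rowkey(indexed_val, pk):
    """Build <indexedVal>_<pk>; a missing indexed value is modeled as ''."""
    return f"{indexed_val or ''}_{pk}"


class ToyTable:
    def __init__(self):
        self.live = {}          # pk -> {"indexed": ..., "nonindexed": ...}
        self.last_indexed = {}  # pk -> last indexed value, kept even after delete

    def upsert(self, pk, nonindexed, indexed=None, buggy=False):
        prior = self.live.get(pk)
        if indexed is None:
            if prior is not None:
                # Indexed column not touched: the live value carries over.
                indexed = prior["indexed"]
            elif buggy:
                # Assumed bug shape: reuse the deleted row's indexed value.
                indexed = self.last_indexed.get(pk)
        self.live[pk] = {"indexed": indexed, "nonindexed": nonindexed}
        if indexed is not None:
            self.last_indexed[pk] = indexed
        return index_rowkey(indexed, pk)

    def delete(self, pk):
        self.live.pop(pk, None)


def replay(buggy):
    """Replay the three statements from the example; return the last index key."""
    t = ToyTable()
    t.upsert(1, "n1", indexed="i1")
    t.delete(1)
    return t.upsert(1, "n2", buggy=buggy)


buggy_key = replay(buggy=True)
expected_key = replay(buggy=False)
print(buggy_key, expected_key)
```

With buggy=True the last upsert yields the stale key built from i1; with 
buggy=False it yields a key with no indexed value.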



> Mismatch in row counts between data and index tables while multiple clients 
> try to upsert data
> ----------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-4980
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4980
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.14.0
>            Reporter: Abhishek Talluri
>            Priority: Major
>              Labels: LocalIndex, globalMutableSecondaryIndex, secondaryIndex
>         Attachments: TestSecIndex.java
>
>
> Phoenix table has A,B,C,D,E as its columns and A as the primary key for the 
> table.
> CREATE TABLE TEST (A VARCHAR NOT NULL PRIMARY KEY, B VARCHAR, C VARCHAR, D 
> VARCHAR , E VARCHAR);
> Global index is built on D & E
> CREATE INDEX TEST_IND on TEST (D,E);
> Client 1 updates A,B,C whereas client 2 updates A,B,D,E
> I used the Phoenix 5.14.2-1.cdh5.14.2.p0.3 parcel to test this issue. I ran 
> two threads that load data using upserts, reading from a CSV file. Within 10 
> iterations, I could observe a difference in the row counts between the data 
> table and the index table. Attaching the code used to test this behavior. This 
> issue exists in both Global and Local indexes.



