[ 
https://issues.apache.org/jira/browse/PHOENIX-5768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057227#comment-17057227
 ] 

Kadir OZDEMIR commented on PHOENIX-5768:
----------------------------------------

[~gjacoby] and [~giskender], unfortunately, we cannot determine whether a row
update is full or partial without looking at the current state of the data row
when the update does not cover all columns. A row update can be just a put
mutation, just a delete mutation (a set of delete column cells), or a pair of
put and delete mutations. Even if we consider all these combinations, we cannot
always determine whether an update is a full row update. This is because
"full row update" here does not mean that we have values for all columns of an
index row. The first update on a row must always be considered full regardless
of how many columns it covers, and we cannot determine whether an update is the
first or a subsequent one without checking whether the row already exists.
A subsequent update should be considered partial only if it does not cover the
previously covered columns. Again, we cannot determine this without retrieving
the existing row state. This means that with option 2 we would have to label
every update that does not cover all columns as partial, which may produce
many unverified rows unnecessarily.
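
To make the rule concrete, here is a minimal Java sketch. It is not Phoenix
code: the class, enum, and method names are hypothetical, and it assumes that
"covering" a column means the update carries a put or delete cell for it, and
that an update is full when it covers every previously covered column. It
contrasts the classification that needs the existing row state with the only
one available without a read.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Phoenix.
final class UpdateClassifier {
    enum Kind { FULL, PARTIAL }

    // Classification that requires the existing row state.
    // existingColumns is null when the row does not exist yet.
    // updatedColumns holds every column touched by the put and/or delete cells.
    static Kind classifyWithRowState(Set<String> existingColumns,
                                     Set<String> updatedColumns) {
        if (existingColumns == null) {
            return Kind.FULL; // the first update on a row is always full
        }
        // Assumption: full means every previously covered column is covered again.
        return updatedColumns.containsAll(existingColumns) ? Kind.FULL : Kind.PARTIAL;
    }

    // Classification without reading the row: anything that does not
    // cover all columns has to be labeled partial.
    static Kind classifyWithoutRowState(Set<String> allColumns,
                                        Set<String> updatedColumns) {
        return updatedColumns.containsAll(allColumns) ? Kind.FULL : Kind.PARTIAL;
    }

    public static void main(String[] args) {
        Set<String> allColumns = new HashSet<>(Arrays.asList("A", "B", "C"));
        Set<String> update = new HashSet<>(Arrays.asList("A", "B"));

        // First write to the row: full by the rule, partial without a read.
        System.out.println(classifyWithRowState(null, update));
        System.out.println(classifyWithoutRowState(allColumns, update));

        // The existing row covers A and C, so an update of A and B is partial.
        Set<String> existing = new HashSet<>(Arrays.asList("A", "C"));
        System.out.println(classifyWithRowState(existing, update));
    }
}
```

The first two calls show the cost: the same update is full under the rule but
has to be labeled partial when the row state is not read.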

> Supporting partial overwrites for immutable tables with indexes
> ---------------------------------------------------------------
>
>                 Key: PHOENIX-5768
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5768
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 5.0.0, 4.14.3
>            Reporter: Kadir OZDEMIR
>            Assignee: Kadir OZDEMIR
>            Priority: Critical
>
> Phoenix allows immutable tables with indexes to be overwritten partially as 
> long as the indexed columns are not updated during partial overwrites. 
> However, there is no check/enforcement for this. The immutable index 
> mutations are prepared on the client side without reading the existing data 
> table rows. This means the index mutations prepared by the client will be 
> partial when the data table row mutations are partial. The new indexing 
> design assumes index rows are always full and all cells within an index row 
> have the same timestamp. On the read path, GlobalIndexChecker returns only 
> the cells with the most recent timestamp of the row. This means that if the 
> client updates the same row multiple times, the client will read back only 
> the most recent update which could be partial.
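
The read-path behavior described in the issue can be illustrated with a
simplified Java model. This is a sketch only, not the GlobalIndexChecker
implementation: the Cell class and readLatest method are hypothetical, and
the model assumes a row is just a list of (column, value, timestamp) cells.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of "return only the cells with the most recent timestamp".
final class LatestTimestampRead {
    static final class Cell {
        final String column;
        final String value;
        final long timestamp;

        Cell(String column, String value, long timestamp) {
            this.column = column;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Keeps only the cells whose timestamp equals the row's maximum timestamp.
    static List<Cell> readLatest(List<Cell> rowCells) {
        long max = Long.MIN_VALUE;
        for (Cell c : rowCells) {
            max = Math.max(max, c.timestamp);
        }
        List<Cell> result = new ArrayList<>();
        for (Cell c : rowCells) {
            if (c.timestamp == max) {
                result.add(c);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Cell> row = new ArrayList<>();
        // Full index row written at timestamp 1.
        row.add(new Cell("A", "a1", 1L));
        row.add(new Cell("B", "b1", 1L));
        // Partial overwrite at timestamp 2 touches only column B.
        row.add(new Cell("B", "b2", 2L));

        // Only the partial update is read back; column A is no longer visible.
        for (Cell c : readLatest(row)) {
            System.out.println(c.column + "=" + c.value + " @" + c.timestamp);
        }
    }
}
```

In this model the reader sees only B=b2 after the second write, which is the
partial read-back the issue describes.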



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
