[ 
https://issues.apache.org/jira/browse/PHOENIX-5768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056477#comment-17056477
 ] 

Kadir OZDEMIR commented on PHOENIX-5768:
----------------------------------------

Because of the correctness issue mentioned above, PHOENIX-5708 made sure that 
the new index implementation returns only the cells from the last update for a 
given row. Although this fixes the above correctness issue, it creates another 
one when there are no orphan index rows but there are partial overwrites. In the 
new design, the orphan index rows are unverified rows. There are two types of 
solutions for this issue:
 # Full row update during write: We can detect partial writes while preparing 
index updates on the client side, read the data table, add the missing column 
values, and prepare full index updates as we do for mutable tables. We can 
either prepare the full index row on the client side or treat this partial 
write as if it were on a mutable table and leverage the server-side 
implementation.
 # Full row update during read: We can detect partial writes while preparing 
index updates on the client side and skip the last index write phase. This 
leaves the index rows unverified. During a read, read repair will repair these 
rows and turn them into full row updates.
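
As a rough illustration of the problem and of what both options ultimately 
have to produce, here is a minimal, self-contained Java sketch. It is a toy 
model and not Phoenix code: the class PartialOverwriteSketch, its methods 
readLatestOnly, rebuildFullRow and demoRow, and the sample columns and values 
are all hypothetical. readLatestOnly mimics a read that returns only the cells 
from the last update, and rebuildFullRow stands in for the full row that either 
a write-time read of the data table (option 1) or read repair (option 2) would 
reconstruct.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: none of these names are Phoenix classes or methods.
public class PartialOverwriteSketch {

    // One cell of a row: column name, value, and write timestamp.
    static final class Cell {
        final String column;
        final String value;
        final long ts;

        Cell(String column, String value, long ts) {
            this.column = column;
            this.value = value;
            this.ts = ts;
        }
    }

    // Assumed sample data: a full write at ts=1, then a partial
    // overwrite that touches only column c2 at ts=2.
    static List<Cell> demoRow() {
        List<Cell> row = new ArrayList<>();
        row.add(new Cell("c1", "a1", 1L));
        row.add(new Cell("c2", "a2", 1L));
        row.add(new Cell("c2", "b2", 2L));
        return row;
    }

    // Keeps only the cells that carry the row's most recent timestamp,
    // which is how a partial overwrite ends up hiding older columns.
    static Map<String, String> readLatestOnly(List<Cell> row) {
        long maxTs = Long.MIN_VALUE;
        for (Cell c : row) {
            maxTs = Math.max(maxTs, c.ts);
        }
        Map<String, String> out = new TreeMap<>();
        for (Cell c : row) {
            if (c.ts == maxTs) {
                out.put(c.column, c.value);
            }
        }
        return out;
    }

    // Rebuilds a full row by taking the newest value of every column.
    static Map<String, String> rebuildFullRow(List<Cell> dataRow) {
        Map<String, Cell> latest = new TreeMap<>();
        for (Cell c : dataRow) {
            Cell prev = latest.get(c.column);
            if (prev == null || c.ts >= prev.ts) {
                latest.put(c.column, c);
            }
        }
        Map<String, String> out = new TreeMap<>();
        for (Map.Entry<String, Cell> e : latest.entrySet()) {
            out.put(e.getKey(), e.getValue().value);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(readLatestOnly(demoRow())); // prints {c2=b2}
        System.out.println(rebuildFullRow(demoRow())); // prints {c1=a1, c2=b2}
    }
}
```

In this model the two options differ only in when rebuildFullRow would run: on 
the write path before the index update is sent, or lazily on the read path 
when an unverified row is encountered.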

[~giskender], [~gjacoby], I prefer option 2 as it is simpler to implement and 
does not impact write performance. Any thoughts on this?

> Supporting partial overwrites for immutable tables with indexes
> ---------------------------------------------------------------
>
>                 Key: PHOENIX-5768
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5768
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 5.0.0, 4.14.3
>            Reporter: Kadir OZDEMIR
>            Assignee: Kadir OZDEMIR
>            Priority: Critical
>
> Phoenix allows immutable tables with indexes to be overwritten partially as 
> long as the indexed columns are not updated during partial overwrites. 
> However, there is no check/enforcement for this. The immutable index 
> mutations are prepared on the client side without reading the existing data 
> table rows. This means the index mutations prepared by the client will be 
> partial when the data table row mutations are partial. The new indexing 
> design assumes index rows are always full and all cells within an index row 
> have the same timestamp. On the read path, GlobalIndexChecker returns only 
> the cells with the most recent timestamp of the row. This means that if the 
> client updates the same row multiple times, the client will read back only 
> the most recent update which could be partial.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
