[ https://issues.apache.org/jira/browse/HIVE-23725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192760#comment-17192760 ]
Denys Kuzmenko commented on HIVE-23725:
---------------------------------------
reverted commit e2a02f1: https://github.com/apache/hive/pull/1474
[~kgyrtkirk], [~pvargacl], thank you for the review!
> ValidTxnManager snapshot outdating causing partial reads in merge insert
> ------------------------------------------------------------------------
>
> Key: HIVE-23725
> URL: https://issues.apache.org/jira/browse/HIVE-23725
> Project: Hive
> Issue Type: Bug
> Reporter: Peter Varga
> Assignee: Peter Varga
> Priority: Major
> Labels: pull-request-available
> Time Spent: 7h 40m
> Remaining Estimate: 0h
>
> When the ValidTxnManager invalidates the snapshot during a merge insert and
> starts to read transactions that were not yet committed when the query was
> compiled, it can cause partial reads if one of those committed transactions
> created a new partition in the source or target table.
> The solution should be not only to fix the snapshot but also to recompile the
> query and acquire the locks again.
> You could construct an example like this:
> 1. Open and compile transaction 1, which merge inserts data from a partitioned
> source table that has a few partitions.
> 2. Open, run and commit transaction 2, which inserts data into an old and a new
> partition of the source table.
> 3. Open, run and commit transaction 3, which inserts data into the target table
> of the merge statement; this will retrigger snapshot generation in
> transaction 1.
> 4. Run transaction 1. The snapshot will be regenerated, and it will read
> partial data from transaction 2, breaking the ACID properties.
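
The four steps above can be modeled with a small, self-contained simulation. This is only a sketch under simplifying assumptions, not Hive code: the `Table` class, `compile_plan`, `execute`, and the partition names are all hypothetical, a "snapshot" is reduced to a set of visible transaction ids, and the target-table commit in step 3 is represented only by its effect (the snapshot refresh).

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    # Hypothetical model: partition name -> list of (writer_txn_id, row)
    partitions: dict = field(default_factory=dict)

    def insert(self, txn_id, partition, row):
        self.partitions.setdefault(partition, []).append((txn_id, row))


def compile_plan(table):
    """Freeze the list of partitions to scan, as known at compile time."""
    return sorted(table.partitions)


def execute(table, plan, visible_txns):
    """Scan only the planned partitions; keep rows whose writer is visible."""
    return [row
            for part in plan
            for txn_id, row in table.partitions.get(part, [])
            if txn_id in visible_txns]


source = Table()
source.insert(0, "p=old", "a")      # pre-existing data, written by txn 0

# Step 1: transaction 1 compiles the merge; only "p=old" exists.
plan = compile_plan(source)
snapshot = {0}

# Step 2: transaction 2 commits rows to an old and a new partition.
source.insert(2, "p=old", "b")
source.insert(2, "p=new", "c")

# Steps 3-4: the snapshot is regenerated and now includes txn 2,
# but the stale plan is reused without recompilation.
snapshot = {0, 2}
rows = execute(source, plan, snapshot)
print(rows)  # ['a', 'b']
```

Row `c` is missing from the result even though transaction 2 is visible in the refreshed snapshot, which is the partial read the description refers to.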
> Different setup.
> Switch the transaction order:
> 1. Compile transaction 1, which inserts data into an old and a new partition of
> the source table.
> 2. Compile transaction 2, which inserts data into the target table.
> 3. Compile transaction 3, which merge inserts data from the source table into
> the target table.
> 4. Run and commit transaction 1.
> 5. Run and commit transaction 2.
> 6. Run transaction 3. Since it contains 1 and 2 in its snapshot, the
> isValidTxnListState check will be triggered and we do a partial read of
> transaction 1 for the same reasons.
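
The remedy proposed in the description (refresh the snapshot and also recompile the query) can be sketched the same way. Again this is a hypothetical model rather than the actual Hive implementation: `run_with_recompile` and its validity check are assumptions made for illustration, tables are plain dicts, and lock re-acquisition is only marked by a comment.

```python
def compile_plan(table):
    """Freeze the list of partitions to scan, as known at compile time."""
    return sorted(table)


def scan(table, plan, visible_txns):
    """Scan only the planned partitions; keep rows whose writer is visible."""
    return [row
            for part in plan
            for txn_id, row in table.get(part, [])
            if txn_id in visible_txns]


def run_with_recompile(table, plan, snapshot, committed_now):
    # Hypothetical validity check: the snapshot is outdated if the set of
    # committed transactions has changed since compilation.
    if snapshot != committed_now:
        snapshot = set(committed_now)   # fix the snapshot ...
        plan = compile_plan(table)      # ... and recompile the query
        # Locks would be re-acquired here for the partitions in the new plan.
    return scan(table, plan, snapshot), plan


# Source table modeled as: partition name -> list of (writer_txn_id, row)
source = {"p=old": [(0, "a")]}

# The merge is compiled before transaction 1 commits.
plan, snapshot = compile_plan(source), {0}

# Transaction 1 commits rows to an old and a new partition.
source["p=old"].append((1, "b"))
source.setdefault("p=new", []).append((1, "c"))

rows, new_plan = run_with_recompile(source, plan, snapshot, {0, 1})
print(sorted(rows))  # ['a', 'b', 'c']
```

Because the plan is rebuilt together with the snapshot, the scan covers the new partition and all rows written by transaction 1 are read.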