Peter Varga created HIVE-24481:
----------------------------------

             Summary: Skipped compaction can cause data corruption with streaming
                 Key: HIVE-24481
                 URL: https://issues.apache.org/jira/browse/HIVE-24481
             Project: Hive
          Issue Type: Bug
            Reporter: Peter Varga
            Assignee: Peter Varga


Timeline:
1. create a partitioned table, add one static partition
2. transaction 1 writes delta_1, and aborts
3. create a streaming connection with a transaction batch size of 3 and 
withStaticPartitionValues set to the existing partition
4. beginTransaction, write, commitTransaction
5. beginTransaction, write, abortTransaction
6. beginTransaction, write, commitTransaction
7. close the connection; the row count of the table is 2
8. run a manual minor compaction on the partition. The compaction itself is 
skipped because the delta count is 1, but cleaning is still triggered because 
of the aborted txn 1
9. the cleaner will remove the records of both aborted transactions from 
txn_components
10. wait for acidhousekeeper to remove the empty aborted txns
11. select * from the table returns *3* records, including the aborted record
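
The timeline above can be walked through with a small toy model. This is only 
a sketch of one plausible reading of the steps, not Hive code: the class 
ToyMetastore, the functions count_rows and reproduce, the txn ids 2 to 4 for 
the streaming batch, and the directory name delta_2_4 are all assumptions made 
for illustration. It also assumes that a reader treats any txn id it no longer 
knows as aborted as if it were committed.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMetastore:
    """Hypothetical stand-in for the transaction metadata, not a Hive API."""
    high_watermark: int = 0
    aborted: set = field(default_factory=set)         # aborted txn ids
    txn_components: set = field(default_factory=set)  # txns with component rows

    def open_txn(self) -> int:
        self.high_watermark += 1
        return self.high_watermark

    def abort(self, txn_id: int) -> None:
        self.aborted.add(txn_id)
        self.txn_components.add(txn_id)

    def is_visible(self, txn_id: int) -> bool:
        # Assumption: a txn id not known as aborted reads as committed.
        return txn_id not in self.aborted


def count_rows(ms: ToyMetastore, deltas: dict) -> int:
    """Count rows whose writing transaction is visible to a reader."""
    return sum(ms.is_visible(txn) for rows in deltas.values() for txn, _ in rows)


def reproduce() -> tuple:
    ms = ToyMetastore()
    deltas = {}

    # Step 2: transaction 1 writes delta_1 and aborts.
    t1 = ms.open_txn()
    deltas["delta_1"] = [(t1, "row-1")]
    ms.abort(t1)

    # Steps 3-7: one streaming batch of 3 txns sharing a single delta
    # directory (assumed name); the middle transaction aborts.
    batch = [ms.open_txn() for _ in range(3)]
    deltas["delta_2_4"] = [(t, f"row-{t}") for t in batch]
    ms.abort(batch[1])
    before = count_rows(ms, deltas)

    # Steps 8-9: compaction is skipped, but the cleaner still runs. It deletes
    # directories written only by aborted txns and drops the component rows of
    # every aborted txn, including the one whose row is still in delta_2_4.
    fully_aborted = [name for name, rows in deltas.items()
                     if all(txn in ms.aborted for txn, _ in rows)]
    for name in fully_aborted:
        del deltas[name]
    ms.txn_components -= ms.aborted

    # Step 10: the housekeeper forgets aborted txns that have no components.
    ms.aborted &= ms.txn_components

    # Step 11: the aborted row in delta_2_4 is now readable.
    after = count_rows(ms, deltas)
    return before, after
```

Calling print(reproduce()) prints (2, 3), matching the counts in steps 7 and 
11: once the aborted streaming transaction is forgotten while its row still 
sits in the shared delta directory, the row becomes readable.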



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
