[ https://issues.apache.org/jira/browse/HIVE-21052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17218186#comment-17218186 ]
Denys Kuzmenko commented on HIVE-21052:
---------------------------------------
[~vpnvishv], do you plan to cherry-pick this change into branch-3.1? We did it
a bit differently in master (no p-type compaction), so it would be great if you
could take a look.
> Make sure transactions get cleaned if they are aborted before addPartitions
> is called
> -------------------------------------------------------------------------------------
>
> Key: HIVE-21052
> URL: https://issues.apache.org/jira/browse/HIVE-21052
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 3.0.0, 3.1.1
> Reporter: Jaume M
> Assignee: Jaume M
> Priority: Critical
> Labels: pull-request-available
> Attachments: Aborted Txn w_Direct Write.pdf, HIVE-21052.1.patch,
> HIVE-21052.10.patch, HIVE-21052.11.patch, HIVE-21052.12.patch,
> HIVE-21052.2.patch, HIVE-21052.3.patch, HIVE-21052.4.patch,
> HIVE-21052.5.patch, HIVE-21052.6.patch, HIVE-21052.7.patch,
> HIVE-21052.8.patch, HIVE-21052.9.patch
>
> Time Spent: 12h 10m
> Remaining Estimate: 0h
>
> If a transaction is aborted between openTxn and addPartitions and data has
> already been written to the table, the transaction manager will treat it as
> an empty transaction and no cleaning will be done.
> This is currently an issue in the streaming API and in micromanaged tables.
> As proposed by [~ekoifman] this can be solved by:
> * Writing an entry with a special marker to TXN_COMPONENTS at openTxn. When
> addPartitions is called, remove this entry from TXN_COMPONENTS and add the
> corresponding partition entry to TXN_COMPONENTS.
> * If the cleaner finds an entry with a special marker in TXN_COMPONENTS,
> indicating that a transaction was opened and then aborted, it must generate
> jobs for the worker for every possible partition available.
> cc [~ewohlstadter]
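
The proposal quoted above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not Hive's metastore code: the class `TxnComponentsSketch`, the marker value `OPEN_MARKER`, and the method `partitionsToClean` are hypothetical names, and TXN_COMPONENTS is modeled as a plain map rather than a database table.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy stand-in for the TXN_COMPONENTS bookkeeping; all names are hypothetical. */
class TxnComponentsSketch {
    // Assumed marker value; the issue only says "a special marker".
    static final String OPEN_MARKER = "__OPEN_TXN_MARKER__";

    // txnId -> entries recorded for that transaction (partition names or the marker)
    private final Map<Long, List<String>> txnComponents = new HashMap<>();

    /** At openTxn, record a marker entry so the transaction is never seen as empty. */
    void openTxn(long txnId) {
        List<String> entries = new ArrayList<>();
        entries.add(OPEN_MARKER);
        txnComponents.put(txnId, entries);
    }

    /** At addPartitions, swap the marker for the partitions actually written. */
    void addPartitions(long txnId, List<String> partitions) {
        List<String> entries = txnComponents.get(txnId);
        if (entries == null) {
            throw new IllegalStateException("Unknown transaction " + txnId);
        }
        entries.remove(OPEN_MARKER);
        entries.addAll(partitions);
    }

    /**
     * For an aborted transaction, return the partitions that need cleanup jobs.
     * If the marker is still present, the abort happened before addPartitions,
     * so every known partition must be scheduled.
     */
    List<String> partitionsToClean(long abortedTxnId, List<String> allPartitions) {
        List<String> entries =
            txnComponents.getOrDefault(abortedTxnId, new ArrayList<>());
        if (entries.contains(OPEN_MARKER)) {
            return new ArrayList<>(allPartitions);
        }
        return new ArrayList<>(entries);
    }
}
```

The trade-off shown here is that a transaction aborted before addPartitions costs a scan of every partition, while one aborted afterwards only touches the partitions it registered.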