[ 
https://issues.apache.org/jira/browse/HIVE-29420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18055871#comment-18055871
 ] 

Stamatis Zampetakis commented on HIVE-29420:
--------------------------------------------

Most of the tests added in the pull request, especially those simulating 
retriggered compactions, set the metastore.txn.use.minhistorywriteid 
property. [~dkuzmenko] [~kuczoram] can you clarify how 
metastore.txn.use.minhistorywriteid relates to the risk of data loss and what 
its impact is? Can the data loss occur when metastore.txn.use.minhistorywriteid 
is false? Does setting metastore.txn.use.minhistorywriteid to true increase the 
likelihood of losing data in versions that don't have this fix? 
 
The questions aim to complete the description of this ticket so that users can 
fully determine if they are affected or not.
 
 

> Hive ACID: Cleaner mishandles retries of killed compactions
> -----------------------------------------------------------
>
>                 Key: HIVE-29420
>                 URL: https://issues.apache.org/jira/browse/HIVE-29420
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 4.2.0
>            Reporter: Denys Kuzmenko
>            Assignee: Denys Kuzmenko
>            Priority: Major
>              Labels: Compaction, pull-request-available
>             Fix For: 4.3.0
>
>
> Compaction retries triggered by timeouts under an incorrect configuration 
> pose a risk of data loss. If a compaction is re-attempted before the prior 
> attempt has completed or been properly aborted, the Cleaner may observe 
> multiple base directories with the same writeId and erroneously delete all of 
> them.
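
To make the condition in the description concrete, here is a minimal Python
sketch of spotting several base directories that share one writeId in a
partition listing. It is an illustration only, not Hive's Cleaner logic. It
assumes base directories are named base_<writeId>_v<visibilityTxnId>; the
function name duplicate_bases and the sample listing are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: compacted base directories follow base_<writeId>_v<visibilityTxnId>.
# The _v suffix is treated as optional so plain base_<writeId> names also match.
BASE_DIR = re.compile(r"^base_(\d+)(?:_v(\d+))?$")


def duplicate_bases(dir_names):
    """Return {writeId: [directory names]} for writeIds with more than one base."""
    by_write_id = defaultdict(list)
    for name in dir_names:
        match = BASE_DIR.match(name)
        if match:  # skip deltas and anything else that is not a base directory
            by_write_id[int(match.group(1))].append(name)
    return {wid: names for wid, names in by_write_id.items() if len(names) > 1}


# Hypothetical listing: two compaction attempts produced a base for writeId 5.
listing = [
    "base_0000005_v0000010",
    "base_0000005_v0000017",
    "delta_0000006_0000006_0000",
]
print(duplicate_bases(listing))
```

A non-empty result for a partition means more than one attempt wrote a base for
the same writeId, which is the situation the description says the Cleaner
mishandles.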



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
