[
https://issues.apache.org/jira/browse/HIVE-18772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16590496#comment-16590496
]
Eugene Koifman commented on HIVE-18772:
---------------------------------------
[~sankarh], would this have any impact on replication?
> Make Acid Cleaner use MIN_HISTORY_LEVEL
> ---------------------------------------
>
> Key: HIVE-18772
> URL: https://issues.apache.org/jira/browse/HIVE-18772
> Project: Hive
> Issue Type: Improvement
> Components: Transactions
> Affects Versions: 3.0.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Priority: Major
>
> The Cleaner should use MIN_HISTORY_LEVEL instead of the Lock Manager state
> it currently relies on. This will eliminate possible race conditions.
> See this
> [comment|https://issues.apache.org/jira/browse/HIVE-18192?focusedCommentId=16338208&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16338208]
> Suppose A is the set of all ValidTxnList instances across all active readers.
> Each ValidTxnList has a minOpenTxnId.
> MIN_HISTORY_LEVEL allows us to determine X = min(minOpenTxnId) across all
> currently active readers.
> This means that no active transaction in the system sees any txn with txnid <
> X as open.
> This means that if we construct a ValidTxnIdList with HWM=X-1 and use it in
> getAcidState(), any files that call determines to be 'obsolete' will be seen
> as obsolete by any existing or future reader, i.e. they can be physically deleted.
> This is also necessary for multi-statement transactions where relying on the
> state of Lock Manager is not sufficient. For example:
> Suppose txn 17 starts at t1 and sees txnid 13 with writeID 13 open.
> 13 commits (via its parent txn) at t2 > t1. (17 is still running.)
> Compaction runs at t3 > t2 to produce base_14 (or delta_10_14, for example) on
> Table1/Part1. (17 is still running.)
> Now delta_13 may be cleaned, since it can be seen as obsolete and there may be
> no locks on it, i.e. no one is reading it.
> Now at t4 > t3, 17 (a multi-stmt txn) may need to read Table1/Part1. It cannot
> use base_14, as that may have absorbed delete events from delete_delta_14.
> Using MIN_HISTORY_LEVEL solves this.
> See description of HIVE-18747 for more details on MIN_HISTORY_LEVEL
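
The rule quoted above (X = min(minOpenTxnId), then HWM = X-1) can be illustrated with a small sketch. This is not Hive code: the class and method names are hypothetical, txn ids and write ids are treated as the same number (as in the txn 17 / txn 13 example), and the case with no active readers is an assumption (X is taken as highest txn id + 1). The obsolescence check is a toy stand-in for getAcidState(), not its real logic.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Hive.
public class MinHistoryLevelSketch {

    // Returns X - 1, where X = min(minOpenTxnId) over all active readers.
    // Assumption: with no active readers, X is taken as highestTxnId + 1.
    static long cleanerHighWatermark(List<Long> minOpenTxnIds, long highestTxnId) {
        long x = highestTxnId + 1;
        for (long id : minOpenTxnIds) {
            x = Math.min(x, id);
        }
        return x - 1;
    }

    // Toy stand-in for the obsolescence decision: a delta counts as obsolete
    // only if a base covering it is itself visible under the cleaner's HWM.
    static boolean isObsolete(long deltaWriteId, long baseWriteId, long cleanerHwm) {
        return baseWriteId <= cleanerHwm && deltaWriteId <= baseWriteId;
    }

    public static void main(String[] args) {
        // Txn 17 is still running and recorded 13 as its minimum open txn.
        long hwmWhile17Runs = cleanerHighWatermark(Arrays.asList(13L), 17L);
        System.out.println("HWM while 17 runs: " + hwmWhile17Runs);
        System.out.println("delta_13 obsolete: " + isObsolete(13L, 14L, hwmWhile17Runs));

        // Once 17 has finished, no reader holds the history level back.
        long hwmAfter17 = cleanerHighWatermark(Collections.<Long>emptyList(), 17L);
        System.out.println("HWM after 17 ends: " + hwmAfter17);
        System.out.println("delta_13 obsolete: " + isObsolete(13L, 14L, hwmAfter17));
    }
}
```

In this sketch, while 17 is running the cleaner's snapshot stops at 12, so base_14 is not visible to it and delta_13 is kept. Once 17 finishes, the watermark advances and delta_13 becomes eligible for deletion.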