[ 
https://issues.apache.org/jira/browse/HIVE-18772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18772:
----------------------------------
    Attachment: HIVE-18772.03.patch

> Make Acid Cleaner use MIN_HISTORY_LEVEL
> ---------------------------------------
>
>                 Key: HIVE-18772
>                 URL: https://issues.apache.org/jira/browse/HIVE-18772
>             Project: Hive
>          Issue Type: Improvement
>          Components: Transactions
>    Affects Versions: 3.0.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>            Priority: Major
>         Attachments: HIVE-18772.01.patch, HIVE-18772.02.patch, 
> HIVE-18772.02.patch, HIVE-18772.03.patch
>
>
> The Cleaner should use MIN_HISTORY_LEVEL instead of the Lock Manager state 
> that it currently relies on.
> This will eliminate possible race conditions.
> See this 
> [comment|https://issues.apache.org/jira/browse/HIVE-18192?focusedCommentId=16338208&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16338208]
> Suppose A is the set of all ValidTxnList across all active readers.  Each 
> ValidTxnList has minOpenTxnId.
> MIN_HISTORY_LEVEL allows us to determine X = min(minOpenTxnId) across all 
> currently active readers.
> This means that no active transaction in the system sees any txn with txnid < 
> X as open.
> This means that if we construct a ValidTxnIdList with HWM=X-1 and use it in 
> getAcidState(), any files that this call determines to be 'obsolete' will be 
> seen as obsolete by any existing/future reader, i.e. they can be physically 
> deleted.
> This is also necessary for multi-statement transactions, where relying on the 
> state of Lock Manager is not sufficient.  For example:
> Suppose txn 17 starts at t1 and sees txnid 13 with writeID 13 open.
> 13 commits (via its parent txn) at t2 > t1.  (17 is still running.)
> Compaction runs at t3 > t2 to produce base_14 (or delta_10_14 for example) on 
> Table1/Part1.  (17 is still running.)
> Now delta_13 may be cleaned since it can be seen as obsolete and there may be 
> no locks on it, i.e. no one is reading it.
> Now at t4 > t3, 17 (a multi-statement txn) may need to read Table1/Part1.  It 
> cannot use base_14, as that may have absorbed delete events from 
> delete_delta_14.
> Using MIN_HISTORY_LEVEL solves this.
> See description of HIVE-18747 for more details on MIN_HISTORY_LEVEL
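
The rule described above can be sketched roughly as follows. This is a minimal 
illustration, not Hive's actual Cleaner code: the function names, the 
fallback_x parameter (what X should be when no reader is active) and the 
simplified "superseding file must be at or below the HWM" check are all 
assumptions made for the example. As in the scenario above, txn ids and 
write ids are treated as interchangeable.

```python
def cleaner_high_water_mark(min_open_txn_ids, fallback_x):
    """Return HWM = X - 1, where X = min(minOpenTxnId) over active readers.

    min_open_txn_ids: one minOpenTxnId per active reader (hypothetical input,
                      standing in for what MIN_HISTORY_LEVEL records).
    fallback_x:       assumed value of X when there are no active readers.
    """
    x = min(min_open_txn_ids, default=fallback_x)
    return x - 1


def can_delete(superseded_by_max_id, hwm):
    """Simplified stand-in for the 'obsolete' decision of getAcidState().

    A file is only treated as obsolete if the compacted file that supersedes
    it (e.g. base_14 -> 14) is itself visible under the given HWM.
    """
    return superseded_by_max_id <= hwm


# Scenario from the description: txn 17 is still running and saw 13 as open,
# so X = 13 and the Cleaner works with HWM = 12.
hwm_while_17_runs = cleaner_high_water_mark([13], fallback_x=18)

# delta_13 is superseded by base_14, but base_14 is above the HWM,
# so delta_13 must be kept for txn 17.
keep_delta_13 = not can_delete(14, hwm_while_17_runs)

# Once 17 finishes and no reader is active, the assumed fallback X = 18
# gives HWM = 17, and delta_13 can be removed.
hwm_after_17 = cleaner_high_water_mark([], fallback_x=18)
delete_delta_13_later = can_delete(14, hwm_after_17)
```

In this model the Lock Manager is never consulted: the decision depends only 
on what the oldest active reader may still consider open, which is the point 
of the proposed change.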



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
