[ 
https://issues.apache.org/jira/browse/HIVE-21052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16746727#comment-16746727
 ] 

Jaume M commented on HIVE-21052:
--------------------------------

[~ekoifman] I agree with doing the db/tables/partitions lookup before 
submitting the work to avoid syncing later. It is probably also useful to 
parallelize this lookup if the list is long.
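
A minimal sketch of what parallelizing the lookup could look like. The class 
ParallelLookup and the lookup function are hypothetical stand-ins, not Hive 
APIs; the real lookup would go to the metastore.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Hive.
class ParallelLookup {
  // Resolves every name up front, in parallel, before any work is submitted.
  // The result keeps the order of the input list.
  static <T> List<T> resolveAll(List<String> names, Function<String, T> lookup) {
    return names.parallelStream()
        .map(lookup)
        .collect(Collectors.toList());
  }
}
```

parallelStream() runs on the common fork-join pool; for blocking metastore 
calls a dedicated executor may be a better fit.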

I'm not sure I understand the table/partition level lock concept. If two 
partitions of the same table were attempted to be cleaned, wouldn't the second 
one get blocked because the first one got the lock for the table?
If I understand correctly, what needs to be avoided is:
* A 'p'-table-clean running together with another 'p'-table-clean, the two 
tables being different.
* A 'p'-table-clean running together with a partition clean of that table.
We could still get two different table locks and run two 'p'-table-cleans 
concurrently.
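
To make the question concrete, here is one possible reading of the 
table/partition level lock, sketched with a shared/exclusive lock per table. 
This is an assumption for discussion, not what the patch does; CleanerLocks and 
its method names are hypothetical. Partition cleans take the table lock in 
shared mode, so two partitions of the same table do not block each other, while 
a 'p'-table-clean takes it exclusively.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical helper, not part of Hive: one read/write lock per table name.
class CleanerLocks {
  private final ConcurrentHashMap<String, ReentrantReadWriteLock> locks =
      new ConcurrentHashMap<>();

  private ReentrantReadWriteLock lockFor(String table) {
    return locks.computeIfAbsent(table, t -> new ReentrantReadWriteLock());
  }

  // Partition cleans share the table lock (read mode).
  boolean tryLockForPartitionClean(String table) {
    return lockFor(table).readLock().tryLock();
  }

  void unlockPartitionClean(String table) {
    lockFor(table).readLock().unlock();
  }

  // A 'p'-table-clean needs the table lock exclusively (write mode).
  // The locks are reentrant per thread, so this assumes each clean runs
  // on its own thread.
  boolean tryLockForTableClean(String table) {
    return lockFor(table).writeLock().tryLock();
  }

  void unlockTableClean(String table) {
    lockFor(table).writeLock().unlock();
  }
}
```

With this reading, two 'p'-table-cleans on different tables still run 
concurrently; avoiding that would need something extra, for example a single 
global lock for 'p'-table-cleans.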

I mentioned it in a comment: right now, if we have a bunch of compactions that 
are executed concurrently and one of them takes 1 hour while all the others 
finish in 5 minutes, the cleaner would spend 55 minutes waiting for a single 
compaction. In the future we probably want the cleaner to keep on iterating and 
submitting.
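
A rough sketch of the "keep on iterating and submitting" idea, under the 
assumption that each compaction to clean is identified by a long id. 
NonBlockingCleaner and cleanOne are hypothetical names, not Hive code. Each 
iteration submits whatever is ready and not already being cleaned, then returns 
without waiting for the slow one.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

// Hypothetical helper, not part of Hive.
class NonBlockingCleaner {
  // Ids of compactions whose clean has been submitted but not finished.
  private final Set<Long> inFlight = ConcurrentHashMap.newKeySet();
  private final Executor pool;
  private final Consumer<Long> cleanOne; // stand-in for the real clean step

  NonBlockingCleaner(Executor pool, Consumer<Long> cleanOne) {
    this.pool = pool;
    this.cleanOne = cleanOne;
  }

  // One cleaner iteration: submit every ready compaction that is not
  // already in flight and return immediately with the number submitted.
  int submitReady(List<Long> readyToClean) {
    int submitted = 0;
    for (Long id : readyToClean) {
      if (inFlight.add(id)) {
        CompletableFuture.runAsync(() -> cleanOne.accept(id), pool)
            .whenComplete((ignored, error) -> inFlight.remove(id));
        submitted++;
      }
    }
    return submitted;
  }
}
```

The in-flight set keeps a long-running clean from being submitted twice while 
later iterations pick up new work.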

> Make sure transactions get cleaned if they are aborted before addPartitions 
> is called
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-21052
>                 URL: https://issues.apache.org/jira/browse/HIVE-21052
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 3.0.0
>            Reporter: Jaume M
>            Assignee: Jaume M
>            Priority: Critical
>         Attachments: Aborted Txn w_Direct Write.pdf, HIVE-21052.1.patch, 
> HIVE-21052.2.patch, HIVE-21052.3.patch, HIVE-21052.4.patch, 
> HIVE-21052.5.patch, HIVE-21052.6.patch, HIVE-21052.7.patch
>
>
> If the transaction is aborted between openTxn and addPartitions and data has 
> been written to the table, the transaction manager will think it's an empty 
> transaction and no cleaning will be done.
> This is currently an issue in the streaming API and in micromanaged tables. 
> As proposed by [~ekoifman] this can be solved by:
> * Writing an entry with a special marker to TXN_COMPONENTS at openTxn and 
> when addPartitions is called remove this entry from TXN_COMPONENTS and add 
> the corresponding partition entry to TXN_COMPONENTS.
> * If the cleaner finds an entry with a special marker in TXN_COMPONENTS that 
> specifies that a transaction was opened and then aborted, it must generate 
> jobs for the worker for every possible partition available.
> cc [~ewohlstadter]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
