[jira] [Updated] (HIVE-20435) Failed Dynamic Partition Insert into insert only table may loose transaction metadata

2019-01-02 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20435:
--
Description: 
{{TxnHandler.enqueueLockWithRetry()}} has an optimization where it doesn't write 
to {{TXN_COMPONENTS}} if the write is a dynamic partition insert, because it 
expects {{addDynamicPartitions()}} to write to this table later.

For insert-only transactional tables, we create the target dir and start 
writing to it before {{addDynamicPartitions()}} is called. So if a txn is 
aborted, we may have a delta dir in the partition but no corresponding entry in 
{{TXN_COMPONENTS}}. This means {{TxnStore.cleanEmptyAbortedTxns()}} may clean 
up the {{TXNS}} entry for the aborted transaction before the Compactor removes 
this delta dir, at which point its contents look like committed data.
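
The sequence above can be sketched with a toy in-memory model. This is not Hive's code: the class {{AbortedDeltaRace}}, its fields, the simplified delta dir name, and the rule that a reader treats any txn no longer recorded as open or aborted as committed are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy in-memory model of the race described above; hypothetical, not Hive's actual code. */
final class AbortedDeltaRace {
    enum TxnState { OPEN, ABORTED, COMMITTED }

    // Stand-ins for the TXNS and TXN_COMPONENTS metastore tables.
    final Map<Long, TxnState> txns = new HashMap<>();
    final Set<Long> txnComponents = new HashSet<>();
    // Stand-in for the file system: delta dir path -> id of the txn that wrote it.
    final Map<String, Long> deltaDirs = new HashMap<>();
    long highWatermark = 0;

    long openTxn() {
        long id = ++highWatermark;
        txns.put(id, TxnState.OPEN);
        return id;
    }

    // Models the optimization: a dynamic partition insert skips TXN_COMPONENTS here,
    // expecting a later addDynamicPartitions() call to record it.
    void enqueueLock(long txnId, boolean dynamicPartitionInsert) {
        if (!dynamicPartitionInsert) {
            txnComponents.add(txnId);
        }
    }

    // Insert-only tables write straight into the target partition (simplified dir name).
    String writeDelta(long txnId, String partition) {
        String dir = partition + "/delta_" + txnId;
        deltaDirs.put(dir, txnId);
        return dir;
    }

    void abort(long txnId) {
        txns.put(txnId, TxnState.ABORTED);
    }

    // Models cleanEmptyAbortedTxns(): drop aborted txns that have no recorded components.
    void cleanEmptyAbortedTxns() {
        txns.entrySet().removeIf(e ->
            e.getValue() == TxnState.ABORTED && !txnComponents.contains(e.getKey()));
    }

    // Assumed reader rule: a txn at or below the high watermark that is not
    // recorded as open or aborted is treated as committed.
    boolean looksCommitted(String deltaDir) {
        Long txnId = deltaDirs.get(deltaDir);
        if (txnId == null || txnId > highWatermark) {
            return false;
        }
        TxnState state = txns.get(txnId);
        return state == null || state == TxnState.COMMITTED;
    }

    // Runs: open, lock, write delta, abort (before addDynamicPartitions), clean, read.
    static boolean runScenario(boolean dynamicPartitionInsert) {
        AbortedDeltaRace store = new AbortedDeltaRace();
        long txnId = store.openTxn();
        store.enqueueLock(txnId, dynamicPartitionInsert);
        String dir = store.writeDelta(txnId, "p=1");
        store.abort(txnId);
        store.cleanEmptyAbortedTxns();
        return store.looksCommitted(dir);
    }

    public static void main(String[] args) {
        System.out.println("dynamic partition insert, aborted delta looks committed: "
            + runScenario(true));
        System.out.println("static insert, aborted delta looks committed: "
            + runScenario(false));
    }
}
```

In this model the only difference between the two runs is whether a {{TXN_COMPONENTS}} row exists when the cleanup runs. With the row present, the aborted txn stays in {{TXNS}} and the reader keeps ignoring its delta dir.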

Streaming API V2 with dynamic partition mode also has this problem.

Full CRUD tables are currently immune to this since they rely on the "move" 
operation in MoveTask, but longer term they should follow the same model as 
insert-only tables.

  was:
{{TxnHandler.enqueueLockWithRetry()}} has an optimization where it doesn't writ 
to {{TXN_COMPONENTS}} if the write is a dynamic partition insert because it 
expects to write to this table from {{addDynamicPartitions()}}.  

For insert-only, transactional tables, we create the target dir and start 
writing to it before {{addDynamicPartitions()}} is called.  So if a txn is 
aborted, we may have a delta dir in the partition but no corresponding entry in 
{{TXN_COMPONENTS}}.  This means {{TxnStore.cleanEmptyAbortedTxns()}} may clean 
up {{TXNS}} entry for the aborted transaction before Compactor removes this 
delta dir, at which point it looks like committed data.

Full CRUD are currently immune to this since they rely on "move" operation in 
MoveTask but longer term they should follow the same model as insert-only 
tables.


> Failed Dynamic Partition Insert into insert only table may loose transaction 
> metadata
> -
>
> Key: HIVE-20435
> URL: https://issues.apache.org/jira/browse/HIVE-20435
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 3.0.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Priority: Critical
>
> {{TxnHandler.enqueueLockWithRetry()}} has an optimization where it doesn't 
> write to {{TXN_COMPONENTS}} if the write is a dynamic partition insert, 
> because it expects {{addDynamicPartitions()}} to write to this table later.
> For insert-only transactional tables, we create the target dir and start 
> writing to it before {{addDynamicPartitions()}} is called. So if a txn is 
> aborted, we may have a delta dir in the partition but no corresponding entry 
> in {{TXN_COMPONENTS}}. This means {{TxnStore.cleanEmptyAbortedTxns()}} may 
> clean up the {{TXNS}} entry for the aborted transaction before the Compactor 
> removes this delta dir, at which point its contents look like committed data.
> Streaming API V2 with dynamic partition mode also has this problem.
> Full CRUD tables are currently immune to this since they rely on the "move" 
> operation in MoveTask, but longer term they should follow the same model as 
> insert-only tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20435) Failed Dynamic Partition Insert into insert only table may loose transaction metadata

2018-10-02 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20435:
--
Summary: Failed Dynamic Partition Insert into insert only table may loose 
transaction metadata  (was: Failed Dynamic Partition Insert into insert only 
table may loose transaction metadat)



