[ 
https://issues.apache.org/jira/browse/HIVE-19867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16522959#comment-16522959
 ] 

Steve Yeom edited comment on HIVE-19867 at 6/26/18 12:31 AM:
-------------------------------------------------------------

Eugene and I talked. 
What I missed above is that, when we have two concurrent INSERTs,
only one of them can have the other's write id in its writeIdList.

But a possible solution, if atomicity is guaranteed,
is to check whether either of the following two conditions is true:
1. The old stats' writeIdList in TBLS/PARTITIONS has the new updater's writeId.
2. The new updater's writeIdList has the old stats' writeId (to be saved in
TBLS/PARTITIONS).
If so, we can say we have concurrent INSERTs.
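
The check described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the class and method names are hypothetical, and a writeIdList is modeled as a plain set of write ids rather than Hive's actual writeIdList structure.

```java
import java.util.Set;

// Hypothetical sketch; none of these names come from the Hive code base.
class ConcurrentInsertCheck {

    /**
     * Returns true if either side has the other's write id in its writeIdList.
     *
     * Assumption: each writeIdList is modeled as a plain set of write ids.
     *
     * @param oldWriteId      writeId saved with the old stats in TBLS/PARTITIONS
     * @param oldWriteIdList  writeIdList saved with the old stats
     * @param newWriteId      writeId of the new stats updater
     * @param newWriteIdList  writeIdList of the new stats updater
     */
    static boolean isConcurrent(long oldWriteId, Set<Long> oldWriteIdList,
                                long newWriteId, Set<Long> newWriteIdList) {
        // Condition 1: the old stats' writeIdList has the new updater's writeId.
        boolean oldHasNew = oldWriteIdList.contains(newWriteId);
        // Condition 2: the new updater's writeIdList has the old stats' writeId.
        boolean newHasOld = newWriteIdList.contains(oldWriteId);
        // Only one side can have the other's write id, so either condition suffices.
        return oldHasNew || newHasOld;
    }
}
```

The sketch does not address atomicity; it assumes the read of the old stats and the comparison happen as one atomic step, as stated above.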


was (Author: steveyeom2017):
A simple idea is that
1. We save the writeId of the stats updater into TBLS/PARTITIONS.
2. When we update stats, we check whether the new stats updater's writeId is in
   the old stats updater's writeIdList, and whether the old stats updater's
   writeId is in the current stats updater's writeIdList. If both are true, it
   is a concurrent update, so we set COLUMN_STATS_ACCURATE to false for the
   current table/partition.

> Test and verify Concurrent INSERTS  
> ------------------------------------
>
>                 Key: HIVE-19867
>                 URL: https://issues.apache.org/jira/browse/HIVE-19867
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Transactions
>    Affects Versions: 4.0.0
>            Reporter: Steve Yeom
>            Assignee: Steve Yeom
>            Priority: Major
>             Fix For: 4.0.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
