[ https://issues.apache.org/jira/browse/HIVE-15032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eugene Koifman updated HIVE-15032:
----------------------------------
    Description: 
{noformat}
create table if not exists TAB_PART (a int, b int) partitioned by (p string)
clustered by (a) into 2 buckets stored as orc TBLPROPERTIES
('transactional'='true');

insert into TAB_PART partition(p='blah') values(1,2); -- this uses a static partition
update TAB_PART set b = 7 where p = 'blah'; -- this uses DP... WHY?
{noformat}

The Update is rewritten into an Insert statement, but SemanticAnalyzer.genFileSink()
sets up this Insert with dynamic partitions (DP), even though the WHERE clause
pins the statement to a single partition.

At least in theory, we should be able to analyze the WHERE clause so that
the Insert doesn't have to use DP.
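
As a rough illustration of the WHERE-clause analysis suggested above, here is a
minimal sketch. It does not use Hive's parser or AST classes: EqPredicate and
staticSpec() are hypothetical names, and the sketch assumes the WHERE clause has
already been reduced to a list of AND-ed "column = literal" conjuncts. If every
partition column is pinned to one literal, a static partition spec can be
produced; otherwise the caller would fall back to DP.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class StaticPartitionSketch {

    /** Hypothetical stand-in for one parsed conjunct of the form column = 'literal'. */
    public static final class EqPredicate {
        final String column;
        final String literal;

        public EqPredicate(String column, String literal) {
            this.column = column;
            this.literal = literal;
        }
    }

    /**
     * Returns a full static partition spec when every partition column is pinned
     * to exactly one literal by the AND-ed conjuncts. Returns empty otherwise,
     * which in this sketch means "keep using dynamic partitions".
     */
    public static Optional<Map<String, String>> staticSpec(
            List<String> partitionColumns, List<EqPredicate> conjuncts) {
        Map<String, String> spec = new LinkedHashMap<>();
        for (String partCol : partitionColumns) {
            String pinned = null;
            for (EqPredicate p : conjuncts) {
                if (!p.column.equalsIgnoreCase(partCol)) {
                    continue; // predicate on a non-partition column, e.g. a = 1
                }
                if (pinned != null && !pinned.equals(p.literal)) {
                    return Optional.empty(); // contradictory, e.g. p='x' AND p='y'
                }
                pinned = p.literal;
            }
            if (pinned == null) {
                return Optional.empty(); // this partition column is not constrained
            }
            spec.put(partCol, pinned);
        }
        return Optional.of(spec);
    }
}
```

For the statement above, the conjunct p = 'blah' with partition column p would
yield the spec {p=blah}, i.e. the same static partition the plain Insert uses.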

Another important side effect of this is how locks are acquired. If the table
doesn't have partition 'blah', then, as it stands, a SHARED_WRITE lock is
acquired on the whole TAB_PART table.
However, it would suffice to acquire a SHARED_WRITE on the single partition
operated on or, better yet, to short-circuit the query.

If the table does have partition 'blah', we get only the partition lock.
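
The lock behavior proposed above can be summarized as a small decision function.
This is only a sketch of the desired outcome, not of how Hive's lock manager
works today: LockScope and chooseScope() are hypothetical names, and the
partition is represented as a plain string such as "p=blah".

```java
import java.util.Optional;
import java.util.Set;

public class LockTargetSketch {

    /** Where a SHARED_WRITE would be taken; NONE means the query can be short-circuited. */
    public enum LockScope { TABLE, PARTITION, NONE }

    /**
     * staticPartition is present when the WHERE clause pins a single partition.
     * existingPartitions holds the partitions the table currently has.
     */
    public static LockScope chooseScope(
            Optional<String> staticPartition, Set<String> existingPartitions) {
        if (!staticPartition.isPresent()) {
            return LockScope.TABLE; // target partitions unknown up front
        }
        return existingPartitions.contains(staticPartition.get())
                ? LockScope.PARTITION // lock only the partition operated on
                : LockScope.NONE;     // no such partition, so nothing to update
    }
}
```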

See TestDbTxnManager2.testWriteSetTracking3() and testWriteSetTracking5().


> Update/Delete statements use dynamic partitions when it's not necessary
> -----------------------------------------------------------------------
>
>                 Key: HIVE-15032
>                 URL: https://issues.apache.org/jira/browse/HIVE-15032
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 1.0.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>
> {noformat}
> create table if not exists TAB_PART (a int, b int) partitioned by (p string)
> clustered by (a) into 2 buckets stored as orc TBLPROPERTIES
> ('transactional'='true');
> insert into TAB_PART partition(p='blah') values(1,2); -- this uses a static partition
> update TAB_PART set b = 7 where p = 'blah'; -- this uses DP... WHY?
> {noformat}
> The Update is rewritten into an Insert statement, but
> SemanticAnalyzer.genFileSink() sets up this Insert with dynamic
> partitions (DP).
> At least in theory, we should be able to analyze the WHERE clause so that
> the Insert doesn't have to use DP.
> Another important side effect of this is how locks are acquired. If the
> table doesn't have partition 'blah', then, as it stands, a SHARED_WRITE lock
> is acquired on the whole TAB_PART table.
> However, it would suffice to acquire a SHARED_WRITE on the single partition
> operated on or, better yet, to short-circuit the query.
> If the table does have partition 'blah', we get only the partition lock.
> See TestDbTxnManager2.testWriteSetTracking3() and testWriteSetTracking5().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
