[ 
https://issues.apache.org/jira/browse/HIVE-17969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16236683#comment-16236683
 ] 

Adam Szita commented on HIVE-17969:
-----------------------------------

The optimization in [^HIVE-17969.0.patch] aims to reduce the time the query 
spends in the alterPartition() call. Inside this method we call 
removeUnusedColumnDescriptor(). When many (if not all) partitions share the 
same schema, we make a lot of redundant calls when checking whether the old 
column descriptor (CD) can be removed. Instead, we should take a batch of 
partitions, collect the distinct old CDs they reference (if there is no schema 
change across 30k partitions, then there will be a single CD for all 30k), and 
run the check on this distinct set of CDs only, thereby saving a lot of DB 
query time.
[~pvary] can you take a look please?
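
To illustrate the batching idea described in this comment, here is a minimal, 
self-contained sketch. It is not the code from the patch: PartitionStub, 
distinctOldCdIds(), batches() and countChecks() are hypothetical names invented 
for this example, and the batch size of 1000 is an assumed value. The sketch 
only counts how many removability checks would be issued if each batch checked 
its distinct old CDs once instead of once per partition.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class BatchedCdCleanup {

    /** Hypothetical stand-in for a partition; only its old CD id matters here. */
    static final class PartitionStub {
        final String name;
        final long oldCdId;

        PartitionStub(String name, long oldCdId) {
            this.name = name;
            this.oldCdId = oldCdId;
        }
    }

    /** Collects the distinct old CD ids referenced by one batch of partitions. */
    static Set<Long> distinctOldCdIds(List<PartitionStub> batch) {
        Set<Long> ids = new LinkedHashSet<>();
        for (PartitionStub p : batch) {
            ids.add(p.oldCdId);
        }
        return ids;
    }

    /** Splits a list into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> batches(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            out.add(items.subList(i, Math.min(i + batchSize, items.size())));
        }
        return out;
    }

    /**
     * Counts the removability checks needed when each batch checks every
     * distinct old CD once. A real implementation would query the DB here.
     */
    static int countChecks(List<PartitionStub> partitions, int batchSize) {
        int checks = 0;
        for (List<PartitionStub> batch : batches(partitions, batchSize)) {
            checks += distinctOldCdIds(batch).size();
        }
        return checks;
    }

    public static void main(String[] args) {
        // 30k partitions that all share one column descriptor (no schema change).
        List<PartitionStub> parts = new ArrayList<>();
        for (int i = 0; i < 30_000; i++) {
            parts.add(new PartitionStub("p" + i, 1L));
        }
        // Assumed batch size of 1000: 30 batches, one distinct CD each.
        System.out.println(countChecks(parts, 1_000)); // prints 30
    }
}
```

Under these assumptions the per-partition approach would issue 30,000 checks, 
while the batched approach issues one per batch; a single batch covering all 
partitions would need just one.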

> Metastore to alter table in batches of partitions when renaming table
> ---------------------------------------------------------------------
>
>                 Key: HIVE-17969
>                 URL: https://issues.apache.org/jira/browse/HIVE-17969
>             Project: Hive
>          Issue Type: Improvement
>          Components: Metastore
>            Reporter: Adam Szita
>            Assignee: Adam Szita
>            Priority: Major
>         Attachments: HIVE-17969.0.patch
>
>
> I'm currently trying to speed up the {{alter table rename to}} feature of 
> HMS. The recently submitted change (HIVE-9447) already helps a lot, especially 
> on Oracle HMS DBs.
> This time I intend to gain throughput independently of the DB type by enabling 
> HMS to execute this alter table command on batches of partitions (rather than 
> one by one).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
