[
https://issues.apache.org/jira/browse/HIVE-17969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16236683#comment-16236683
]
Adam Szita commented on HIVE-17969:
-----------------------------------
The optimization in [^HIVE-17969.0.patch] aims to reduce the time spent in the
alterPartition() call. Inside this method we call
removeUnusedColumnDescriptor(). When many (if not all) partitions share the
same schema, we make a lot of redundant calls when checking whether the old
column descriptor (CD) can be removed. Instead, we should take a batch of
partitions, collect the distinct old CDs among them (if there is no schema
change across 30k partitions, there will be just one CD for all 30k), and
then run the check on this distinct set of CDs only, thereby saving a lot of
DB query time.
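To illustrate the idea (this is only a rough sketch, not the code in the patch): the class, the Part type, the checksIssued counter and the isUnused predicate below are all made-up names. The predicate stands in for whatever DB lookup decides that a CD is no longer referenced.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.LongPredicate;

// Illustrative only: none of these names exist in the metastore code.
class CdBatchSketch {

    // Hypothetical stand-in for a partition; only the id of its old CD matters here.
    static final class Part {
        final String name;
        final long oldCdId;

        Part(String name, long oldCdId) {
            this.name = name;
            this.oldCdId = oldCdId;
        }
    }

    // Counts how many removability checks were issued (for demonstration only).
    static int checksIssued = 0;

    // Collect the distinct old CD ids referenced by a batch of partitions.
    static Set<Long> distinctOldCds(List<Part> batch) {
        Set<Long> ids = new LinkedHashSet<>();
        for (Part p : batch) {
            ids.add(p.oldCdId);
        }
        return ids;
    }

    // Run the (assumed expensive) check once per distinct CD instead of once per partition.
    static List<Long> removableCds(List<Part> batch, LongPredicate isUnused) {
        List<Long> removable = new ArrayList<>();
        for (long cdId : distinctOldCds(batch)) {
            checksIssued++;
            if (isUnused.test(cdId)) {
                removable.add(cdId);
            }
        }
        return removable;
    }
}
```

With a batch of 30k partitions that all point at the same old CD, removableCds() would issue a single check rather than 30k of them.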
[~pvary] can you take a look please?
> Metastore to alter table in batches of partitions when renaming table
> ---------------------------------------------------------------------
>
> Key: HIVE-17969
> URL: https://issues.apache.org/jira/browse/HIVE-17969
> Project: Hive
> Issue Type: Improvement
> Components: Metastore
> Reporter: Adam Szita
> Assignee: Adam Szita
> Priority: Major
> Attachments: HIVE-17969.0.patch
>
>
> I'm currently trying to speed up the {{alter table rename to}} feature of
> HMS. The recently submitted change (HIVE-9447) already helps a lot, especially
> on Oracle HMS DBs.
> This time I intend to gain throughput independently of the DB type by enabling
> HMS to execute this alter table command on batches of partitions (rather than
> one by one).
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)