[ https://issues.apache.org/jira/browse/HIVE-24649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267194#comment-17267194 ]
Peter Vary commented on HIVE-24649:
-----------------------------------
[~anishek]: If the transaction is not committed, the partitions will still have
been created for the table. Because the writes are not committed, readers know
that they should not read them and treat each such partition as empty, so no
data corruption should happen. Having extra empty partitions is not nice, but
the next query will find them already present and will not create them again.
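
To illustrate why an uncommitted write leaves the partition looking empty, here
is a minimal sketch. It assumes a simplified model in which each data directory
in a partition is tagged with the write ID that produced it, and a reader only
accepts directories whose write ID is in its committed set. The names
UncommittedWriteSketch, DeltaDir and visibleDirs are hypothetical and are not
Hive's actual classes or API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model for illustration only; not Hive's real reader logic.
class UncommittedWriteSketch {

    // A data directory inside a partition, tagged with the write ID that created it.
    static final class DeltaDir {
        final String path;
        final long writeId;
        DeltaDir(String path, long writeId) {
            this.path = path;
            this.writeId = writeId;
        }
    }

    // Returns only the directories whose write ID the reader considers committed.
    static List<String> visibleDirs(List<DeltaDir> dirs, Set<Long> committedWriteIds) {
        List<String> visible = new ArrayList<>();
        for (DeltaDir d : dirs) {
            if (committedWriteIds.contains(d.writeId)) {
                visible.add(d.path);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        // The partition exists and holds data from write ID 7, which never committed.
        List<DeltaDir> partitionDirs = new ArrayList<>();
        partitionDirs.add(new DeltaDir("ds=2021-01-18/delta_7", 7L));

        Set<Long> committed = new HashSet<>();
        committed.add(5L);
        committed.add(6L);

        // The reader sees no usable directories, so the partition reads as empty.
        System.out.println("visible dirs: " + visibleDirs(partitionDirs, committed));
    }
}
```

In this model the leftover partition costs only a metadata entry: its
uncommitted data is never returned to any reader.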
> Optimise Hive::addWriteNotificationLog for large data inserts
> -------------------------------------------------------------
>
> Key: HIVE-24649
> URL: https://issues.apache.org/jira/browse/HIVE-24649
> Project: Hive
> Issue Type: Improvement
> Components: HiveServer2
> Reporter: Rajesh Balamohan
> Priority: Major
> Labels: performance
>
> When loading dynamic partitions with a large dataset, a lot of time is spent in
> "Hive::loadDynamicPartitions --> addWriteNotificationLog".
> Though every call is for the same table, it ends up loading the table and
> partition details for every partition and writing to the notification log.
> Also, the "Partition" details may already be present in the {{PartitionDetails}}
> object in {{Hive::loadDynamicPartitions}}. They are unnecessarily recomputed
> in {{HiveMetaStore::add_write_notification_log}}.
>
> Lines of interest:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L3028
> https://github.com/apache/hive/blob/89073a94354f0cc14ec4ae0a43e05aae29276b4d/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L8500
>
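
The cost pattern in the description above can be sketched roughly as follows.
This is not the Hive code or the eventual patch. WriteNotificationSketch,
loadTable, notifyPerPartition and notifyBatched are hypothetical names, and the
nested PartitionDetails class is a simplified stand-in that only borrows its
name from the issue. The sketch counts simulated table lookups to show the
difference between fetching table metadata once per partition and fetching it
once per load.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; none of these are Hive's real classes.
class WriteNotificationSketch {

    // Counts simulated metastore table lookups so the two strategies can be compared.
    static int tableLookups = 0;

    static final class PartitionDetails {
        final String partitionName;
        final List<String> newFiles;
        PartitionDetails(String partitionName, List<String> newFiles) {
            this.partitionName = partitionName;
            this.newFiles = newFiles;
        }
    }

    // Stands in for an expensive metastore round trip that fetches table metadata.
    static String loadTable(String tableName) {
        tableLookups++;
        return tableName;
    }

    // Per-partition strategy: the same table is looked up once for every partition.
    static List<String> notifyPerPartition(String tableName, List<PartitionDetails> parts) {
        List<String> log = new ArrayList<>();
        for (PartitionDetails p : parts) {
            String table = loadTable(tableName);
            log.add(table + "/" + p.partitionName + " files=" + p.newFiles.size());
        }
        return log;
    }

    // Batched strategy: one table lookup, partition details reused from memory.
    static List<String> notifyBatched(String tableName, List<PartitionDetails> parts) {
        String table = loadTable(tableName);
        List<String> log = new ArrayList<>(parts.size());
        for (PartitionDetails p : parts) {
            log.add(table + "/" + p.partitionName + " files=" + p.newFiles.size());
        }
        return log;
    }

    // Builds n single-file partitions for the demo.
    static List<PartitionDetails> samplePartitions(int n) {
        List<PartitionDetails> parts = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            List<String> files = new ArrayList<>();
            files.add("ds=" + i + "/000000_0");
            parts.add(new PartitionDetails("ds=" + i, files));
        }
        return parts;
    }

    public static void main(String[] args) {
        List<PartitionDetails> parts = samplePartitions(1000);

        tableLookups = 0;
        notifyPerPartition("sales", parts);
        System.out.println("per-partition lookups: " + tableLookups);

        tableLookups = 0;
        notifyBatched("sales", parts);
        System.out.println("batched lookups: " + tableLookups);
    }
}
```

Both strategies produce the same log entries. Only the number of lookups
differs, and it grows with the partition count in the first case while staying
constant in the second.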