Rajesh Balamohan created HIVE-24649:
---------------------------------------

             Summary: Optimise Hive::addWriteNotificationLog for large data inserts
                 Key: HIVE-24649
                 URL: https://issues.apache.org/jira/browse/HIVE-24649
             Project: Hive
          Issue Type: Improvement
          Components: HiveServer2
            Reporter: Rajesh Balamohan


When loading dynamic partitions with a large dataset, a lot of time is spent in 
"Hive::loadDynamicPartitions --> addWriteNotificationLog".

Though this is all for the same table, it ends up loading the table and partition 
details for every partition and then writes to the notification log.

Also, the "Partition" details may already be present in the {{PartitionDetails}} object 
in {{Hive::loadDynamicPartitions}}. They are unnecessarily recomputed in 
{{HiveMetaStore::add_write_notification_log}}.
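
To illustrate the cost pattern described above, here is a minimal, self-contained sketch. It does not use the real Hive or metastore APIs: {{FakeMetastore}}, {{getTable}}, {{writeNotification}}, and both {{notify...}} methods are hypothetical names invented for this example. It only counts how many table lookups happen when table metadata is re-fetched per partition versus fetched once and reused. The actual fix may look quite different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WriteNotificationSketch {

    /** Hypothetical stand-in for the metastore; counts how often table metadata is fetched. */
    static class FakeMetastore {
        int tableLookups = 0;
        final List<String> notificationLog = new ArrayList<>();

        String getTable(String tableName) {
            tableLookups++; // each call models one (expensive) table lookup
            return tableName;
        }

        void writeNotification(String table, String partition) {
            notificationLog.add(table + "/" + partition);
        }
    }

    /** Pattern described in the issue: table details are loaded again for every partition. */
    static void notifyPerPartition(FakeMetastore ms, String tableName, List<String> partitions) {
        for (String partition : partitions) {
            String table = ms.getTable(tableName);
            ms.writeNotification(table, partition);
        }
    }

    /** One possible alternative (an assumption, not the committed fix): look up once, reuse. */
    static void notifyWithSharedTable(FakeMetastore ms, String tableName, List<String> partitions) {
        String table = ms.getTable(tableName);
        for (String partition : partitions) {
            ms.writeNotification(table, partition);
        }
    }

    public static void main(String[] args) {
        List<String> partitions = Arrays.asList("ds=2021-01-01", "ds=2021-01-02", "ds=2021-01-03");

        FakeMetastore perPartition = new FakeMetastore();
        notifyPerPartition(perPartition, "sales", partitions);

        FakeMetastore shared = new FakeMetastore();
        notifyWithSharedTable(shared, "sales", partitions);

        System.out.println("lookups per partition: " + perPartition.tableLookups);
        System.out.println("lookups with reuse:    " + shared.tableLookups);
    }
}
```

Both variants write the same notification entries; only the number of table lookups differs, growing with the partition count in the first case and staying at one in the second.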

 
Lines of interest:
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L3028
https://github.com/apache/hive/blob/89073a94354f0cc14ec4ae0a43e05aae29276b4d/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L8500
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
