[ https://issues.apache.org/jira/browse/HIVE-8061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pengcheng Xiong updated HIVE-8061:
----------------------------------
    Attachment: HIVE-8061.1.patch

Major improvements:
(1) All partition stats updates/inserts are now done in one transaction.
(2) Rather than using one query to update each column of each partition (total queries = #col * #part), we now use 1 query to delete everything and then 1 query to insert everything. The transaction ensures that this happens with ACID guarantees.

> improve the speed of col stats update speed
> -------------------------------------------
>
>                 Key: HIVE-8061
>                 URL: https://issues.apache.org/jira/browse/HIVE-8061
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Pengcheng Xiong
>            Assignee: Pengcheng Xiong
>            Priority: Minor
>         Attachments: HIVE-8061.1.patch
>
>
> We previously worked hard on faster stats updates for the columns of a
> partition of a table in https://issues.apache.org/jira/browse/HIVE-7736
> and https://issues.apache.org/jira/browse/HIVE-7876
> Although there was some improvement, the result is only correct on the
> first run. Later runs produce duplicate column stats. Thanks to Eugene
> Koifman for the comments.
> We fixed this in https://issues.apache.org/jira/browse/HIVE-7944 by
> reverting the patch.
> This JIRA ticket is another attempt to improve the speed.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
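
For illustration only, the delete-then-insert approach described in the update could be sketched roughly as follows. This is not the code in HIVE-8061.1.patch: the table part_col_stats, its columns, and the function replace_col_stats are hypothetical names, and SQLite stands in for the metastore database. The point is that one DELETE plus one multi-row INSERT inside a single transaction replaces #col * #part separate updates, and that re-running it cannot leave duplicate rows.

```python
import sqlite3

# Hypothetical schema for illustration; NOT the real Hive metastore tables.
SCHEMA = """
CREATE TABLE part_col_stats (
    table_name     TEXT    NOT NULL,
    partition_name TEXT    NOT NULL,
    column_name    TEXT    NOT NULL,
    num_nulls      INTEGER NOT NULL
)
"""


def replace_col_stats(conn, table_name, rows):
    """Replace the column stats of the partitions named in `rows`.

    rows: list of (partition_name, column_name, num_nulls) tuples.
    Uses one DELETE and one multi-row INSERT in a single transaction.
    """
    if not rows:
        return
    partitions = sorted({partition for partition, _, _ in rows})
    in_list = ",".join("?" * len(partitions))
    values = ",".join(["(?, ?, ?, ?)"] * len(rows))
    params = [v for p, c, n in rows for v in (table_name, p, c, n)]

    # `with conn` commits if the block succeeds and rolls back if it raises,
    # so the old stats survive a failed insert.
    with conn:
        conn.execute(
            "DELETE FROM part_col_stats "
            f"WHERE table_name = ? AND partition_name IN ({in_list})",
            [table_name, *partitions],
        )
        conn.execute(f"INSERT INTO part_col_stats VALUES {values}", params)
```

Because the old rows for the affected partitions are deleted before the new ones are inserted, calling replace_col_stats twice with the same input leaves the same rows as calling it once. A real implementation would also have to respect the database's limit on bound parameters per statement, which this sketch ignores.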