[jira] [Updated] (HIVE-7736) improve the columns stats update speed for all the partitions of a table
[ https://issues.apache.org/jira/browse/HIVE-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-7736:
-----------------------------------
       Resolution: Fixed
    Fix Version/s: 0.14.0
           Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Pengcheng!

> improve the columns stats update speed for all the partitions of a table
> ------------------------------------------------------------------------
>
>                 Key: HIVE-7736
>                 URL: https://issues.apache.org/jira/browse/HIVE-7736
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: pengcheng xiong
>            Assignee: pengcheng xiong
>            Priority: Minor
>             Fix For: 0.14.0
>
>         Attachments: HIVE-7736.0.patch, HIVE-7736.1.patch, HIVE-7736.2.patch, HIVE-7736.3.patch, HIVE-7736.4.patch
>
>
> The current implementation of columns stats update for all the partitions of a table takes a long time when there are thousands of partitions.
> For example, on a given cluster, it took 600+ seconds to update all the partitions' columns stats for a table with 2 columns but 2000 partitions.
> ANALYZE TABLE src_stat_part partition (partitionId) COMPUTE STATISTICS for columns;
> We would like to improve the columns stats update speed for all the partitions of a table

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Updated] (HIVE-7736) improve the columns stats update speed for all the partitions of a table
[ https://issues.apache.org/jira/browse/HIVE-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

pengcheng xiong updated HIVE-7736:
----------------------------------
    Attachment: HIVE-7736.4.patch
[jira] [Updated] (HIVE-7736) improve the columns stats update speed for all the partitions of a table
[ https://issues.apache.org/jira/browse/HIVE-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

pengcheng xiong updated HIVE-7736:
----------------------------------
    Status: Patch Available  (was: Open)

wait for QA tests
[jira] [Updated] (HIVE-7736) improve the columns stats update speed for all the partitions of a table
[ https://issues.apache.org/jira/browse/HIVE-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

pengcheng xiong updated HIVE-7736:
----------------------------------
    Status: Open  (was: Patch Available)
[jira] [Updated] (HIVE-7736) improve the columns stats update speed for all the partitions of a table
[ https://issues.apache.org/jira/browse/HIVE-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

pengcheng xiong updated HIVE-7736:
----------------------------------
    Attachment: HIVE-7736.3.patch

regenerate the patch (rebase), wait for QA tests
[jira] [Updated] (HIVE-7736) improve the columns stats update speed for all the partitions of a table
[ https://issues.apache.org/jira/browse/HIVE-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

pengcheng xiong updated HIVE-7736:
----------------------------------
    Attachment: HIVE-7736.2.patch

simple write path (just send a colstats list to the server)
add test cases
[jira] [Updated] (HIVE-7736) improve the columns stats update speed for all the partitions of a table
[ https://issues.apache.org/jira/browse/HIVE-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

pengcheng xiong updated HIVE-7736:
----------------------------------
    Status: Patch Available  (was: Open)

improve the columns stats update speed for all the partitions of a table through MetaStoreDirectSQL
[jira] [Updated] (HIVE-7736) improve the columns stats update speed for all the partitions of a table
[ https://issues.apache.org/jira/browse/HIVE-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

pengcheng xiong updated HIVE-7736:
----------------------------------
    Attachment: HIVE-7736.1.patch
[jira] [Updated] (HIVE-7736) improve the columns stats update speed for all the partitions of a table
[ https://issues.apache.org/jira/browse/HIVE-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

pengcheng xiong updated HIVE-7736:
----------------------------------
    Attachment: HIVE-7736.0.patch