[jira] [Updated] (HIVE-7736) improve the columns stats update speed for all the partitions of a table

2014-08-22 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7736:
---

Resolution: Fixed
Fix Version/s: 0.14.0
Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Pengcheng!

> improve the columns stats update speed for all the partitions of a table
> 
>
> Key: HIVE-7736
> URL: https://issues.apache.org/jira/browse/HIVE-7736
> Project: Hive
> Issue Type: Improvement
> Reporter: pengcheng xiong
> Assignee: pengcheng xiong
> Priority: Minor
> Fix For: 0.14.0
>
> Attachments: HIVE-7736.0.patch, HIVE-7736.1.patch, HIVE-7736.2.patch, 
> HIVE-7736.3.patch, HIVE-7736.4.patch
>
>
> The current implementation of the column stats update for all the partitions
> of a table takes a long time when there are thousands of partitions.
> For example, on a given cluster, it took 600+ seconds to update the column
> stats of every partition for a table with 2 columns but 2000 partitions:
> ANALYZE TABLE src_stat_part PARTITION (partitionId) COMPUTE STATISTICS FOR COLUMNS;
> We would like to improve the column stats update speed for all the
> partitions of a table.
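As a back-of-the-envelope reading of the figures quoted in the description, the sketch below works out the implied cost per partition and per column stats entry. The function name is illustrative only, and 600 seconds is treated as a lower bound because the report says "600+".

```python
def per_unit_cost(total_seconds, partitions, columns):
    """Return (seconds per partition, seconds per partition-column entry)."""
    per_partition = total_seconds / partitions
    per_column_entry = per_partition / columns
    return per_partition, per_column_entry


# Figures from the issue description: 600+ s, 2000 partitions, 2 columns.
per_partition, per_column_entry = per_unit_cost(600, 2000, 2)
print(f"at least {per_partition:.2f} s per partition")
print(f"at least {per_column_entry:.2f} s per column stats entry")
```

On these numbers each partition costs at least about 0.3 seconds, which is why the total grows so quickly with the partition count.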



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7736) improve the columns stats update speed for all the partitions of a table

2014-08-21 Thread pengcheng xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

pengcheng xiong updated HIVE-7736:
--

Attachment: HIVE-7736.4.patch

> improve the columns stats update speed for all the partitions of a table
> 
>
> Key: HIVE-7736
> URL: https://issues.apache.org/jira/browse/HIVE-7736
> Project: Hive
> Issue Type: Improvement
> Reporter: pengcheng xiong
> Assignee: pengcheng xiong
> Priority: Minor
> Attachments: HIVE-7736.0.patch, HIVE-7736.1.patch, HIVE-7736.2.patch, 
> HIVE-7736.3.patch, HIVE-7736.4.patch


[jira] [Updated] (HIVE-7736) improve the columns stats update speed for all the partitions of a table

2014-08-21 Thread pengcheng xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

pengcheng xiong updated HIVE-7736:
--

Status: Patch Available  (was: Open)

Waiting for QA tests.

> improve the columns stats update speed for all the partitions of a table
> 
>
> Key: HIVE-7736
> URL: https://issues.apache.org/jira/browse/HIVE-7736
> Project: Hive
> Issue Type: Improvement
> Reporter: pengcheng xiong
> Assignee: pengcheng xiong
> Priority: Minor
> Attachments: HIVE-7736.0.patch, HIVE-7736.1.patch, HIVE-7736.2.patch, 
> HIVE-7736.3.patch, HIVE-7736.4.patch


[jira] [Updated] (HIVE-7736) improve the columns stats update speed for all the partitions of a table

2014-08-21 Thread pengcheng xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

pengcheng xiong updated HIVE-7736:
--

Status: Open  (was: Patch Available)

> improve the columns stats update speed for all the partitions of a table
> 
>
> Key: HIVE-7736
> URL: https://issues.apache.org/jira/browse/HIVE-7736
> Project: Hive
> Issue Type: Improvement
> Reporter: pengcheng xiong
> Assignee: pengcheng xiong
> Priority: Minor
> Attachments: HIVE-7736.0.patch, HIVE-7736.1.patch, HIVE-7736.2.patch, 
> HIVE-7736.3.patch


[jira] [Updated] (HIVE-7736) improve the columns stats update speed for all the partitions of a table

2014-08-21 Thread pengcheng xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

pengcheng xiong updated HIVE-7736:
--

Attachment: HIVE-7736.3.patch

Regenerated the patch (rebased); waiting for QA tests.

> improve the columns stats update speed for all the partitions of a table
> 
>
> Key: HIVE-7736
> URL: https://issues.apache.org/jira/browse/HIVE-7736
> Project: Hive
> Issue Type: Improvement
> Reporter: pengcheng xiong
> Assignee: pengcheng xiong
> Priority: Minor
> Attachments: HIVE-7736.0.patch, HIVE-7736.1.patch, HIVE-7736.2.patch, 
> HIVE-7736.3.patch


[jira] [Updated] (HIVE-7736) improve the columns stats update speed for all the partitions of a table

2014-08-20 Thread pengcheng xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

pengcheng xiong updated HIVE-7736:
--

Attachment: HIVE-7736.2.patch

Simpler write path (just send a colstats list to the server).

Added test cases.
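
The write path noted above (send a colstats list to the server) can be pictured as replacing one metastore request per partition with a single request that carries the whole list. The sketch below is a toy model of that idea only, not the code in the patch: `FakeMetastore`, its `update_one` and `update_many` methods, the column names `key` and `value`, and the stats dictionaries are all hypothetical names invented for illustration, and the model counts round trips instead of talking to a real Hive metastore.

```python
class FakeMetastore:
    """Hypothetical in-memory stand-in for a metastore server."""

    def __init__(self):
        self.round_trips = 0
        self.stats = {}

    def update_one(self, partition, col_stats):
        # One request per partition.
        self.round_trips += 1
        self.stats[partition] = col_stats

    def update_many(self, col_stats_list):
        # One request carrying the whole list of (partition, col_stats) pairs.
        self.round_trips += 1
        for partition, col_stats in col_stats_list:
            self.stats[partition] = col_stats


def make_stats(num_partitions):
    """Build made-up stats for a table with two assumed columns."""
    return [
        (f"partitionId={i}", {"key": {"num_nulls": 0}, "value": {"num_nulls": 0}})
        for i in range(num_partitions)
    ]


# Per-partition path: 2000 partitions mean 2000 requests.
per_partition = FakeMetastore()
for partition, col_stats in make_stats(2000):
    per_partition.update_one(partition, col_stats)

# Batched path: the same stats travel in a single request.
batched = FakeMetastore()
batched.update_many(make_stats(2000))

print(per_partition.round_trips, batched.round_trips)  # prints "2000 1"
```

Both paths leave the same stats on the server; only the number of requests differs, and that is the cost the per-partition path pays once for every partition.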



> improve the columns stats update speed for all the partitions of a table
> 
>
> Key: HIVE-7736
> URL: https://issues.apache.org/jira/browse/HIVE-7736
> Project: Hive
> Issue Type: Improvement
> Reporter: pengcheng xiong
> Assignee: pengcheng xiong
> Priority: Minor
> Attachments: HIVE-7736.0.patch, HIVE-7736.1.patch, HIVE-7736.2.patch


[jira] [Updated] (HIVE-7736) improve the columns stats update speed for all the partitions of a table

2014-08-16 Thread pengcheng xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

pengcheng xiong updated HIVE-7736:
--

Status: Patch Available  (was: Open)

Improve the column stats update speed for all the partitions of a table
through MetaStoreDirectSQL.

> improve the columns stats update speed for all the partitions of a table
> 
>
> Key: HIVE-7736
> URL: https://issues.apache.org/jira/browse/HIVE-7736
> Project: Hive
> Issue Type: Improvement
> Reporter: pengcheng xiong
> Assignee: pengcheng xiong
> Priority: Minor
> Attachments: HIVE-7736.0.patch, HIVE-7736.1.patch


[jira] [Updated] (HIVE-7736) improve the columns stats update speed for all the partitions of a table

2014-08-16 Thread pengcheng xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

pengcheng xiong updated HIVE-7736:
--

Attachment: HIVE-7736.1.patch

> improve the columns stats update speed for all the partitions of a table
> 
>
> Key: HIVE-7736
> URL: https://issues.apache.org/jira/browse/HIVE-7736
> Project: Hive
> Issue Type: Improvement
> Reporter: pengcheng xiong
> Assignee: pengcheng xiong
> Priority: Minor
> Attachments: HIVE-7736.0.patch, HIVE-7736.1.patch


[jira] [Updated] (HIVE-7736) improve the columns stats update speed for all the partitions of a table

2014-08-15 Thread pengcheng xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

pengcheng xiong updated HIVE-7736:
--

Attachment: HIVE-7736.0.patch

> improve the columns stats update speed for all the partitions of a table
> 
>
> Key: HIVE-7736
> URL: https://issues.apache.org/jira/browse/HIVE-7736
> Project: Hive
> Issue Type: Improvement
> Reporter: pengcheng xiong
> Assignee: pengcheng xiong
> Priority: Minor
> Attachments: HIVE-7736.0.patch