[ 
https://issues.apache.org/jira/browse/HIVE-6157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13880385#comment-13880385
 ] 

Sergey Shelukhin commented on HIVE-6157:
----------------------------------------

metadata_only_queries.q.out was modified by the Tez test run, but the changes 
appear to come from unrelated patches. Because that test doesn't run in HiveQA, 
its output file was not updated when those patches went in. I'm not including 
it in this patch. The rest of the tests passed.

> Fetching column stats slower than the 101 during rush hour
> ----------------------------------------------------------
>
>                 Key: HIVE-6157
>                 URL: https://issues.apache.org/jira/browse/HIVE-6157
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.13.0
>            Reporter: Gunther Hagleitner
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-6157.01.patch, HIVE-6157.01.patch, 
> HIVE-6157.03.patch, HIVE-6157.nogen.patch, HIVE-6157.nogen.patch, 
> HIVE-6157.prelim.patch
>
>
> "hive.stats.fetch.column.stats" controls whether the column stats for a table 
> are fetched during explain (in Tez: during query planning). On my setup (1 
> table 4000 partitions, 24 columns) the time spent in semantic analyze goes 
> from ~1 second to ~66 seconds when turning the flag on. 65 seconds spent 
> fetching column stats...
> The reason is probably that the APIs force you to make separate metastore 
> calls for each column in each partition. That's probably the first thing that 
> has to change. The question is if in addition to that we need to cache this 
> in the client or store the stats as a single blob in the database to further 
> cut down on the time. However, the way it stands right now column stats seem 
> unusable.
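
As a rough illustration of the numbers in the description above, here is a 
back-of-the-envelope sketch. It assumes exactly one metastore round trip per 
(partition, column) pair, which the description only calls the probable cause. 
The function names and the batched variant are hypothetical and are not part of 
any Hive API.

```python
# Back-of-the-envelope model of the numbers quoted in the issue description.
# Assumption: one metastore round trip per (partition, column) pair. The
# description calls this the probable cause; it is not a measured fact.

PARTITIONS = 4000          # from the issue description
COLUMNS = 24               # from the issue description
STATS_FETCH_SECONDS = 65   # from the issue description


def per_column_calls(partitions: int, columns: int) -> int:
    """Round trips if every column of every partition needs its own call."""
    return partitions * columns


def batched_calls(partitions: int, batch_size: int) -> int:
    """Round trips for a hypothetical call that returns all column stats
    for up to batch_size partitions at once."""
    return -(-partitions // batch_size)  # ceiling division


calls = per_column_calls(PARTITIONS, COLUMNS)
ms_per_call = STATS_FETCH_SECONDS * 1000 / calls
```

Under that assumption the setup issues 96,000 calls at roughly 0.68 ms each, so 
the per-call cost is small and the total is dominated by the number of round 
trips. A hypothetical batched call covering 1000 partitions at a time would 
need 4 round trips.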



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
