Re: Improve performance of Analyze table compute statistics

Prabhakar Reddy Thu, 13 Sep 2018 10:38:47 -0700

Thank you, Gopal, for this information. Currently I am using EMR to run this
query. As this operation is CPU intensive, could you please let me know if
increasing the RAM/cores can speed up this process?


On Tue, Aug 28, 2018 at 8:56 PM Gopal Vijayaraghavan <gop...@apache.org>
wrote:

>
> > Will it refer to the ORC metadata, or will it load the whole
> > file and then count the rows?
>
> Depends on the partial-scan setting, or on whether it is computing full
> column stats (full column stats compute an nDV, the number of distinct
> values, which reads all rows).
>
> hive> analyze table compute statistics ... partialscan;
>
> https://issues.apache.org/jira/browse/HIVE-4177
>
> AFAIK, this got removed in Hive 3.x (because we really want autogather
> column stats on insert, not just basic stats from this).
>
> > Is there any place to cache this information so that I don't need to
> > scan all the files every time?
>
> https://cwiki.apache.org/confluence/display/Hive/LLAP
>
> Cheers,
> Gopal
>
>
>
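
For reference, the statistics variants discussed in the quoted reply can be sketched in HiveQL roughly as follows. This is only an illustration under assumptions: the table name `sales` and the partition column `ds` are hypothetical, and which statements are accepted depends on the Hive version (per the reply, `PARTIALSCAN` is gone in Hive 3.x).

```sql
-- Hypothetical table `sales`, partitioned by `ds`; names are for illustration only.

-- Basic stats (row count, data size) for one partition.
ANALYZE TABLE sales PARTITION (ds='2018-09-13') COMPUTE STATISTICS;

-- Basic stats without scanning the data: only file count and physical size.
ANALYZE TABLE sales PARTITION (ds='2018-09-13') COMPUTE STATISTICS NOSCAN;

-- Partial scan, as in the quoted reply (HIVE-4177); pre-3.x Hive only.
ANALYZE TABLE sales PARTITION (ds='2018-09-13') COMPUTE STATISTICS PARTIALSCAN;

-- Full column stats, including nDV; this is the variant that reads all rows.
ANALYZE TABLE sales PARTITION (ds='2018-09-13') COMPUTE STATISTICS FOR COLUMNS;

-- Gather stats automatically on insert instead of running ANALYZE afterwards.
SET hive.stats.autogather=true;         -- basic stats
SET hive.stats.column.autogather=true;  -- column stats
```

The `FOR COLUMNS` form is the CPU-intensive one, since the nDV estimate has to see every row; the other forms collect basic stats only.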
