Thank you, Gopal, for this information. I am currently using EMR to run this query. Since this operation is CPU-intensive, could you please let me know whether increasing the RAM or the number of cores would speed it up?
On Tue, Aug 28, 2018 at 8:56 PM Gopal Vijayaraghavan <gop...@apache.org> wrote:
>
> > Will it be referring to orc metadata or it will be loading the whole
> > file and then counting the rows.
>
> Depends on the partial-scan setting or if it is computing full column
> stats (the full column stats does an nDV, which reads all rows).
>
> hive> analyze table compute statistics ... partialscan;
>
> https://issues.apache.org/jira/browse/HIVE-4177
>
> AFAIK, this got removed in Hive 3.x (because we really want autogather
> column stats on insert, not just basic stats from this).
>
> > Is there any place to cache this information so that I don't need to
> > scan all the files every time.
>
> https://cwiki.apache.org/confluence/display/Hive/LLAP
>
> Cheers,
> Gopal