> Will it be referring to orc metadata or it will be loading the whole file and 
> then counting the rows.

That depends on whether you use the partial-scan option or compute full column 
stats. A partial scan takes the stats from the ORC metadata, while full column 
stats compute an NDV, which reads all rows.

hive> analyze table compute statistics ... partialscan;

https://issues.apache.org/jira/browse/HIVE-4177

AFAIK, partialscan was removed in Hive 3.x (because we really want to autogather 
column stats on insert, not just the basic stats that partialscan produces).
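
For reference, a rough sketch of the variants discussed above. The table name 
web_logs is hypothetical, and the exact behaviour varies by Hive version, so 
treat this as illustrative rather than definitive:

```sql
-- Basic table stats (row count, data size).
ANALYZE TABLE web_logs COMPUTE STATISTICS;

-- File-level stats only (number of files, total size); does not read rows.
ANALYZE TABLE web_logs COMPUTE STATISTICS NOSCAN;

-- Full column stats, including NDV; this reads all rows.
ANALYZE TABLE web_logs COMPUTE STATISTICS FOR COLUMNS;

-- Gather basic and column stats automatically on insert
-- (assumes a Hive version that supports both settings).
SET hive.stats.autogather=true;
SET hive.stats.column.autogather=true;
```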

> Is there any place to cache this information so that I don't need to scan all 
> the files every time.

https://cwiki.apache.org/confluence/display/Hive/LLAP

Cheers,
Gopal

