[ 
https://issues.apache.org/jira/browse/HIVE-23776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reassigned HIVE-23776:
---------------------------------------


> Retire quickstats autocollection
> --------------------------------
>
>                 Key: HIVE-23776
>                 URL: https://issues.apache.org/jira/browse/HIVE-23776
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Zoltan Haindrich
>            Assignee: Zoltan Haindrich
>            Priority: Major
>
> this is about:
> * num files
> * datasize (sum of filesizes)
> * num erasure coded files
> Right now these are scanned during every BasicStatsTask execution, which
> means extra filesystem reads. For small inserts the cost is visible when
> the filesystem is a bit slower (S3 and friends).
> I don't think they are really in use: we rely more on column stats, which are
> more accurate, and the data size here is the "offline" (on-disk) size,
> whereas we should instead calculate with "online" sizes.
> proposal:
> * remove collection and storage of this data
> * collect it on the fly during "desc formatted" statements to provide them 
> for informational purposes
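
As a rough illustration of what "collect it on the fly" could mean, the sketch below walks a table directory on a local filesystem and computes the file count and the summed file sizes at request time instead of reading stored values. It is not Hive code: the function name quick_stats and the result keys are hypothetical, a real implementation would go through Hadoop's FileSystem API, and the erasure-coded file count is left out because it needs HDFS-specific metadata.

```python
import os
from typing import Dict


def quick_stats(table_dir: str) -> Dict[str, int]:
    """Compute file count and total on-disk size for a directory tree.

    Hypothetical helper for illustration only; the key names are
    assumptions chosen to resemble the stats discussed in the issue.
    """
    num_files = 0
    total_size = 0
    # os.walk visits the directory and all nested partition directories.
    for root, _dirs, files in os.walk(table_dir):
        for name in files:
            num_files += 1
            # getsize returns the on-disk ("offline") size in bytes.
            total_size += os.path.getsize(os.path.join(root, name))
    return {"numFiles": num_files, "totalSize": total_size}
```

Computing the values only when someone asks for them moves the filesystem cost from every insert to the comparatively rare describe call, which is the trade-off the proposal is after.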



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
