[ https://issues.apache.org/jira/browse/HIVE-23776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zoltan Haindrich reassigned HIVE-23776:
---------------------------------------

> Retire quickstats autocollection
> --------------------------------
>
>                 Key: HIVE-23776
>                 URL: https://issues.apache.org/jira/browse/HIVE-23776
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Zoltan Haindrich
>            Assignee: Zoltan Haindrich
>            Priority: Major
>
> This is about:
> * num files
> * datasize (sum of file sizes)
> * num erasure coded files
> Right now these are scanned during every BasicStatsTask execution, which
> means some filesystem reads etc. For small inserts this overhead is visible
> when the fs is a bit slower (S3 and friends).
> I don't think they are really in use. We rely more on column stats, which are
> more accurate; also, the datasize in this case is the "offline" (on-disk)
> size, while we should instead calculate with "online" sizes.
> Proposal:
> * remove collection and storage of this data
> * collect it on the fly during "desc formatted" statements to provide it
> for informational purposes

--
This message was sent by Atlassian Jira
(v8.3.4#803005)