[ https://issues.apache.org/jira/browse/HIVE-7811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116563#comment-14116563 ]
Alan Gates commented on HIVE-7811:
----------------------------------

A few comments and questions on review board, mostly minor.

One larger issue.  If I understand correctly, this is recomputing stats on all compactions.  I think it's ok to only do it at major compactions.  Minor compactions are initiated by the number of delta files, not the number of records.  So needing a minor compaction tells you little about how far out of step the stats are.  Major compactions, on the other hand, are driven by the size of the delta file(s).  Thus when we are doing a major compaction it is reasonable to assume our stats are off as well.  Thoughts?

Also, this may conflict with HIVE-7508.  Poor [~roshan_naik] has waited for 2 months for a review, so I'll check that one in first, which may force you to rebase this patch.  Sorry.

> Compactions need to update table/partition stats
> ------------------------------------------------
>
>                 Key: HIVE-7811
>                 URL: https://issues.apache.org/jira/browse/HIVE-7811
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Transactions
>    Affects Versions: 0.13.1
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>         Attachments: HIVE-7811.3.patch, HIVE-7811.4.patch, HIVE-7811.5.patch
>
>
> Compactions should trigger stats recalculation for columns which already
> have stats.
> https://reviews.apache.org/r/25201/
> Major compactions will cause the Compactor to see which columns already have
> stats and run the analyze command for those columns.  If compacting a partition,
> then stats for that partition will be computed.  If the table is not partitioned,
> then for the whole table.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
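
For illustration only, the behavior described in the issue (recompute stats after a major compaction, only for columns that already have stats, scoped to the compacted partition if there is one) could be sketched roughly as below.  This is not code from the attached patches.  The class name, the enum, the method, and the sample table and column names are all hypothetical; only the ANALYZE TABLE ... COMPUTE STATISTICS FOR COLUMNS statement form is taken from HiveQL.

```java
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical sketch, not the Compactor code from the HIVE-7811 patches.
public class CompactionStatsSketch {

    public enum CompactionType { MINOR, MAJOR }

    /**
     * Builds the ANALYZE statement to run after a compaction.
     * Returns null when no stats refresh is wanted: the compaction was
     * minor, or no column of the table currently has stats.
     * An empty partitionSpec means the table is not partitioned.
     * Assumption: names and values need no quoting or escaping.
     */
    public static String statsQueryFor(CompactionType type,
                                       String table,
                                       Map<String, String> partitionSpec,
                                       List<String> columnsWithStats) {
        if (type != CompactionType.MAJOR || columnsWithStats.isEmpty()) {
            return null;
        }
        StringBuilder sb = new StringBuilder("ANALYZE TABLE ").append(table);
        if (!partitionSpec.isEmpty()) {
            // Restrict the stats computation to the compacted partition.
            StringJoiner parts = new StringJoiner(", ", " PARTITION(", ")");
            for (Map.Entry<String, String> e : partitionSpec.entrySet()) {
                parts.add(e.getKey() + "='" + e.getValue() + "'");
            }
            sb.append(parts.toString());
        }
        sb.append(" COMPUTE STATISTICS FOR COLUMNS ")
          .append(String.join(", ", columnsWithStats));
        return sb.toString();
    }
}
```

With a hypothetical table web_logs partitioned by ds, a major compaction of partition ds=2014-08-30 where ip and status already have stats would yield "ANALYZE TABLE web_logs PARTITION(ds='2014-08-30') COMPUTE STATISTICS FOR COLUMNS ip, status", while a minor compaction would yield no statement at all.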