[ https://issues.apache.org/jira/browse/IMPALA-7876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17184697#comment-17184697 ]
Vincent Tran commented on IMPALA-7876:
--------------------------------------

I can reproduce this on ~3.2.0. I think this may be unrelated to the width of the table. The specs for my table are below:
{noformat}
default> show create table one_gram_p;
Query: show create table one_gram_p
CREATE TABLE default.one_gram_p (
  ngram STRING,
  match_count INT,
  volume_count INT
)
PARTITIONED BY (
  year STRING
)
STORED AS TEXTFILE
LOCATION 'hdfs:////user/hive/warehouse/one_gram_p'
TBLPROPERTIES ('DO_NOT_UPDATE_STATS'='true', 'STATS_GENERATED'='TASK', 'impala.enable.stats.extrapolation'='true', 'impala.lastComputeStatsTime'='1598383227', 'numRows'='1430731493', 'totalSize'='22081529047')
{noformat}
I need to check against the master branch next.

> COMPUTE STATS TABLESAMPLE is not updating number of estimated rows
> ------------------------------------------------------------------
>
>                 Key: IMPALA-7876
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7876
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>   Affects Versions: Impala 3.0
>            Reporter: Andre Araujo
>            Priority: Critical
>
> Running the command below seems to have no impact on the #rows stats.
> {code}
> [host:21000] default> COMPUTE STATS wide TABLESAMPLE SYSTEM(5);
> Query: COMPUTE STATS wide TABLESAMPLE SYSTEM(100)
> +-------------------------------------------+
> | summary                                   |
> +-------------------------------------------+
> | Updated 1 partition(s) and 103 column(s). |
> +-------------------------------------------+
> WARNINGS: Ignoring TABLESAMPLE because the effective sampling rate is 100%.
> The minimum sample size is COMPUTE_STATS_MIN_SAMPLE_SIZE=1.00GB and the table size 20.35GB
> Fetched 1 row(s) in 43.67s
> [host:21000] default> show table stats wide;
> Query: show table stats wide
> +-------+--------------+--------+---------+--------------+-------------------+---------+-------------------+-------------------------------------+
> | #Rows | Extrap #Rows | #Files | Size    | Bytes Cached | Cache Replication | Format  | Incremental stats | Location                            |
> +-------+--------------+--------+---------+--------------+-------------------+---------+-------------------+-------------------------------------+
> | 0     | -1           | 84     | 20.35GB | NOT CACHED   | NOT CACHED        | PARQUET | false             | hdfs://ns1/user/hive/warehouse/wide |
> +-------+--------------+--------+---------+--------------+-------------------+---------+-------------------+-------------------------------------+
> Fetched 1 row(s) in 0.01s
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)