[
https://issues.apache.org/jira/browse/IMPALA-7876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17184733#comment-17184733
]
Vincent Tran commented on IMPALA-7876:
--------------------------------------
This should reproduce the issue on master.
{noformat}
CREATE TABLE default.one_gram_p (ngram STRING, match_count INT, volume_count INT)
PARTITIONED BY (year STRING)
STORED AS TEXTFILE
TBLPROPERTIES ('impala.enable.stats.extrapolation'='true');
insert into one_gram_p partition(year) values('abc',122,2564,'2010'),
('abc',122,2564,'2010'), ('abc',122,2564,'2010'), ('abc',122,2564,'2010'),
('abc',122,2564,'2010'), ('abc',122,2564,'2010'), ('abc',122,2564,'2010'),
('abc',122,2564,'2010'), ('abc',122,2564,'2010'), ('abc',122,2564,'2010'),
('abc',122,2564,'2010'), ('abc',122,2564,'2010'), ('abc',122,2564,'2010'),
('abc',122,2564,'2010'), ('abc',122,2564,'2010'), ('abc',122,2564,'2010'),
('abc',122,2564,'2010'), ('abc',122,2564,'2010'), ('abc',122,2564,'2010'),
('abc',122,2564,'2010'), ('abc',122,2564,'2010'), ('abc',122,2564,'2010'),
('abc',122,2564,'2010'), ('abc',122,2564,'2010'), ('abc',122,2564,'2010'),
('abc',122,2564,'2010'), ('abc',122,2564,'2010'), ('abc',122,2564,'2010'),
('abc',122,2564,'2010'), ('abc',122,2564,'2010');
set compute_stats_min_sample_size=1B;
compute stats one_gram_p tablesample system(50);
show table stats one_gram_p;
{noformat}
> COMPUTE STATS TABLESAMPLE is not updating number of estimated rows
> ------------------------------------------------------------------
>
> Key: IMPALA-7876
> URL: https://issues.apache.org/jira/browse/IMPALA-7876
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 3.0
> Reporter: Andre Araujo
> Priority: Critical
>
> Running the command below seems to have no impact on the #rows stats.
> {code}
> [host:21000] default> COMPUTE STATS wide TABLESAMPLE SYSTEM(5);
> Query: COMPUTE STATS wide TABLESAMPLE SYSTEM(100)
> +-------------------------------------------+
> | summary                                   |
> +-------------------------------------------+
> | Updated 1 partition(s) and 103 column(s). |
> +-------------------------------------------+
> WARNINGS: Ignoring TABLESAMPLE because the effective sampling rate is 100%.
> The minimum sample size is COMPUTE_STATS_MIN_SAMPLE_SIZE=1.00GB and the table
> size 20.35GB
> Fetched 1 row(s) in 43.67s
> [host:21000] default> show table stats wide;
> Query: show table stats wide
> +-------+--------------+--------+---------+--------------+-------------------+---------+-------------------+-------------------------------------+
> | #Rows | Extrap #Rows | #Files | Size    | Bytes Cached | Cache Replication | Format  | Incremental stats | Location                            |
> +-------+--------------+--------+---------+--------------+-------------------+---------+-------------------+-------------------------------------+
> | 0     | -1           | 84     | 20.35GB | NOT CACHED   | NOT CACHED        | PARQUET | false             | hdfs://ns1/user/hive/warehouse/wide |
> +-------+--------------+--------+---------+--------------+-------------------+---------+-------------------+-------------------------------------+
> Fetched 1 row(s) in 0.01s
> {code}