Andre Araujo created IMPALA-7876:
------------------------------------
Summary: COMPUTE STATS TABLESAMPLE is not updating number of estimated rows
Key: IMPALA-7876
URL: https://issues.apache.org/jira/browse/IMPALA-7876
Project: IMPALA
Issue Type: Bug
Affects Versions: Impala 3.0
Reporter: Andre Araujo
Running the command below has no apparent effect on the row-count statistics: afterwards, SHOW TABLE STATS reports #Rows = 0 and Extrap #Rows = -1.
{code}
[host:21000] default> COMPUTE STATS wide TABLESAMPLE SYSTEM(5);
Query: COMPUTE STATS wide TABLESAMPLE SYSTEM(100)
+-------------------------------------------+
| summary |
+-------------------------------------------+
| Updated 1 partition(s) and 103 column(s). |
+-------------------------------------------+
WARNINGS: Ignoring TABLESAMPLE because the effective sampling rate is 100%.
The minimum sample size is COMPUTE_STATS_MIN_SAMPLE_SIZE=1.00GB and the table size 20.35GB
Fetched 1 row(s) in 43.67s
[host:21000] default> show table stats wide;
Query: show table stats wide
+-------+--------------+--------+---------+--------------+-------------------+---------+-------------------+-------------------------------------+
| #Rows | Extrap #Rows | #Files | Size    | Bytes Cached | Cache Replication | Format  | Incremental stats | Location                            |
+-------+--------------+--------+---------+--------------+-------------------+---------+-------------------+-------------------------------------+
| 0     | -1           | 84     | 20.35GB | NOT CACHED   | NOT CACHED        | PARQUET | false             | hdfs://ns1/user/hive/warehouse/wide |
+-------+--------------+--------+---------+--------------+-------------------+---------+-------------------+-------------------------------------+
Fetched 1 row(s) in 0.01s
{code}