Zoltán Borók-Nagy created IMPALA-11608:
------------------------------------------
Summary: Impala SHOW TABLE STATS shows wrong number of files
Key: IMPALA-11608
URL: https://issues.apache.org/jira/browse/IMPALA-11608
Project: IMPALA
Issue Type: Bug
Components: Frontend
Reporter: Zoltán Borók-Nagy
For Iceberg tables, SHOW TABLE STATS reports the wrong value in the #Files
column. It counts every file under the table directory, including metadata
files, orphaned files, and old data files that do not belong to the current
snapshot.
It should report only the number of data files in the current snapshot, which
would make the output consistent with SHOW FILES IN tbl;
In the following example the table has just been created and contains no data
files, yet #Files is 2:
{noformat}
create table test (i int) stored as iceberg;
compute stats test;
show table stats test;
+-------+--------+--------+--------------+-------------------+---------+-------------------+--------------------------------------------+
| #Rows | #Files | Size   | Bytes Cached | Cache Replication | Format  | Incremental stats | Location                                   |
+-------+--------+--------+--------------+-------------------+---------+-------------------+--------------------------------------------+
| -1    | 2      | 2.70KB | NOT CACHED   | NOT CACHED        | PARQUET | false             | hdfs://localhost:20500/test-warehouse/test |
+-------+--------+--------+--------------+-------------------+---------+-------------------+--------------------------------------------+
{noformat}
SHOW TABLE STATS is handled here:
https://github.com/apache/impala/blob/66484a4c081f3242750a3a0e04159dd4580b37a4/fe/src/main/java/org/apache/impala/service/Frontend.java#L1429-L1457
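The difference between the two counting rules described in this issue can be illustrated with a minimal, self-contained sketch. This is not Impala's or Iceberg's code: the `TableFile` class, the `Kind` enum, both counting methods, and the sample paths are all hypothetical names made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class FileCountSketch {
  // Hypothetical classification of a file found under a table directory.
  enum Kind { DATA, METADATA }

  // Hypothetical description of one file; not an Impala or Iceberg class.
  static final class TableFile {
    final String path;
    final Kind kind;
    final boolean inCurrentSnapshot;

    TableFile(String path, Kind kind, boolean inCurrentSnapshot) {
      this.path = path;
      this.kind = kind;
      this.inCurrentSnapshot = inCurrentSnapshot;
    }
  }

  // Behavior described as the bug: every file under the directory is counted.
  static long countAllFiles(List<TableFile> files) {
    return files.size();
  }

  // Behavior the issue asks for: only data files in the current snapshot.
  static long countCurrentDataFiles(List<TableFile> files) {
    return files.stream()
        .filter(f -> f.kind == Kind.DATA && f.inCurrentSnapshot)
        .count();
  }

  // Made-up directory listing: one live data file, one old data file,
  // one orphaned data file, and two metadata files.
  static List<TableFile> sampleFiles() {
    return Arrays.asList(
        new TableFile("data/current.parquet", Kind.DATA, true),
        new TableFile("data/old.parquet", Kind.DATA, false),
        new TableFile("data/orphan.parquet", Kind.DATA, false),
        new TableFile("metadata/v1.metadata.json", Kind.METADATA, false),
        new TableFile("metadata/snap-1.avro", Kind.METADATA, false));
  }

  public static void main(String[] args) {
    List<TableFile> files = sampleFiles();
    System.out.println("all files under directory: " + countAllFiles(files));
    System.out.println("data files in current snapshot: "
        + countCurrentDataFiles(files));
  }
}
```

With this made-up listing the first rule yields 5 and the second yields 1. An actual fix would presumably obtain the data file list from the table's current snapshot metadata instead of listing the directory.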