[ 
https://issues.apache.org/jira/browse/IMPALA-11608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17623164#comment-17623164
 ] 

ASF subversion and git services commented on IMPALA-11608:
----------------------------------------------------------

Commit 3973fc6d09dd1bc2abaae1e75e151f0f167f6602 in impala's branch 
refs/heads/master from LPL
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=3973fc6d0 ]

IMPALA-11608: Fix SHOW TABLE STATS iceberg_tbl shows wrong number of files

Impala's SHOW TABLE STATS outputs the wrong number of files for
Iceberg tables. It should count only the data files and delete files,
but it counts all files under the table directory, including metadata
files, orphaned files, and old data files not belonging to the current
snapshot.

Testing:
 - add e2e tests

Change-Id: I110e5e13cec3aa898f115e1ed795ce98e68ef06c
Reviewed-on: http://gerrit.cloudera.org:8080/19150
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Impala SHOW TABLE STATS shows wrong number of files for Iceberg tables
> ----------------------------------------------------------------------
>
>                 Key: IMPALA-11608
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11608
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>            Reporter: Zoltán Borók-Nagy
>            Assignee: LiPenglin
>            Priority: Major
>              Labels: impala-iceberg, ramp-up
>
> Impala's SHOW TABLE STATS outputs the wrong number of files for Iceberg
> tables. It should count only the data files, but it counts all files
> under the table directory, including metadata files, orphaned files, and
> old data files not belonging to the current snapshot.
> It should only output the number of data files in the current snapshot,
> making the output consistent with SHOW FILES IN tbl;
> {noformat}
> create table test (i int) stored as iceberg;
> compute stats test;
> show table stats test;
> +-------+--------+--------+--------------+-------------------+---------+-------------------+--------------------------------------------+
> | #Rows | #Files | Size   | Bytes Cached | Cache Replication | Format  | Incremental stats | Location                                   |
> +-------+--------+--------+--------------+-------------------+---------+-------------------+--------------------------------------------+
> | -1    | 2      | 2.70KB | NOT CACHED   | NOT CACHED        | PARQUET | false             | hdfs://localhost:20500/test-warehouse/test |
> +-------+--------+--------+--------------+-------------------+---------+-------------------+--------------------------------------------+
> {noformat}
> SHOW TABLE STATS is handled here: 
> https://github.com/apache/impala/blob/66484a4c081f3242750a3a0e04159dd4580b37a4/fe/src/main/java/org/apache/impala/service/Frontend.java#L1429-L1457
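
The counting rule from the commit message above can be illustrated with a small sketch. This is not Impala's frontend code (which is Java) and it does not use the Iceberg API; the `SnapshotFile` type, both helper functions, and all file paths are hypothetical and exist only to contrast "count everything under the table directory" with "count only the data and delete files of the current snapshot".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnapshotFile:
    """Hypothetical stand-in for a file referenced by the current snapshot."""
    path: str
    content: str  # "data" or "delete"


def count_directory_files(all_paths):
    """Old behaviour as described in the issue: count every file in the directory."""
    return len(all_paths)


def count_snapshot_files(snapshot_files):
    """Behaviour described in the commit: count data and delete files only."""
    return sum(1 for f in snapshot_files if f.content in ("data", "delete"))


# Hypothetical table directory listing (all paths are made up).
directory_listing = [
    "metadata/v1.metadata.json",   # metadata file
    "metadata/snap-1.avro",        # metadata file
    "data/old-00000.parquet",      # data file of an older snapshot
    "data/orphan-00000.parquet",   # orphaned file, never committed
    "data/00000.parquet",          # data file in the current snapshot
    "data/00001-delete.parquet",   # delete file in the current snapshot
]

# Files the current snapshot actually references (again, made up).
current_snapshot = [
    SnapshotFile("data/00000.parquet", "data"),
    SnapshotFile("data/00001-delete.parquet", "delete"),
]

print(count_directory_files(directory_listing))  # 6
print(count_snapshot_files(current_snapshot))    # 2
```

Under this model an empty table has an empty snapshot file list, so the second function returns 0 even when metadata files already exist in the table directory.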



