[ 
https://issues.apache.org/jira/browse/HIVE-28266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Fingerman updated HIVE-28266:
-------------------------------------
    Summary: Iceberg: select count(*) from data_files metadata tables gives 
wrong result  (was: Iceberg: select count(*) from *.data_files metadata tables 
gives wrong result)

> Iceberg: select count(*) from data_files metadata tables gives wrong result
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-28266
>                 URL: https://issues.apache.org/jira/browse/HIVE-28266
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Dmitriy Fingerman
>            Assignee: Dmitriy Fingerman
>            Priority: Major
>
> In Hive Iceberg, every table has a corresponding metadata table 
> "*.data_files" that contains information about the files holding the 
> table's data. Running select count(*) on a data_files metadata table 
> returns the number of rows in the data table instead of the number of 
> data files listed in the metadata table.
>  
> {code:sql}
> CREATE TABLE x (name VARCHAR(50), age TINYINT, num_clicks BIGINT) stored by 
> iceberg stored as orc TBLPROPERTIES 
> ('external.table.purge'='true','format-version'='2');
> insert into x values 
> ('amy', 35, 123412344),
> ('adxfvy', 36, 123412534),
> ('amsdfyy', 37, 123417234),
> ('asafmy', 38, 123412534);
> insert into x values 
> ('amerqwy', 39, 123441234),
> ('amyxzcv', 40, 123341234),
> ('erweramy', 45, 122341234);
> Select * from default.x.data_files;
> -- Returns 2 records in the output
> Select count(*) from default.x.data_files;
> -- Returns 7 instead of 2
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
