FelixYBW opened a new issue, #15332:
URL: https://github.com/apache/iceberg/issues/15332

   ### Apache Iceberg version
   
   1.8.1
   
   ### Query engine
   
   Spark
   
   ### Please describe the bug 🐞
   
   I tried to migrate a TPC-DS Hive table to Iceberg using
   `CALL spark_catalog.system.migrate('tpcdsdbiceberg_10tb_partitioned2_perfteam.catalog_sales')`.
   The null partition is stored as `cs_sold_date_sk=__HIVE_DEFAULT_PARTITION__` on S3.
   
   After migration, I get the unexpected results below:
   
   ```sql
   select count(*) from catalog_sales;
   -- 14399880363
   select count(*) from catalog_sales_BACK_UP_;
   -- 14399880363
   ```
   
   ```sql
   select count(*) from catalog_sales where cs_sold_date_sk is null;
   -- 0
   select count(*) from catalog_sales_BACK_UP_ where cs_sold_date_sk is null;
   -- 71981198
   ```
   
   ```sql
   select count(*) from catalog_sales where cs_sold_date_sk is not null;
   -- 14327899165
   select count(*) from catalog_sales_BACK_UP_ where cs_sold_date_sk is not null;
   -- 14327899165
   ```
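
A quick arithmetic check on the counts above may help narrow this down. This is only a sketch in Python; the variable names are illustrative and the numbers are copied from the query results. It suggests that the rows from the null partition are still present in the migrated table but match neither `is null` nor `is not null`:

```python
# Counts copied from the query results above (variable names are illustrative).
iceberg_total = 14399880363
iceberg_null = 0
iceberg_not_null = 14327899165

backup_total = 14399880363
backup_null = 71981198
backup_not_null = 14327899165

# In the Hive backup table, the two predicates together cover every row.
backup_unaccounted = backup_total - (backup_null + backup_not_null)

# In the migrated table, some rows satisfy neither predicate.
iceberg_unaccounted = iceberg_total - (iceberg_null + iceberg_not_null)

print(backup_unaccounted)                  # 0
print(iceberg_unaccounted)                 # 71981198
print(iceberg_unaccounted == backup_null)  # True
```

The number of unaccounted rows in the migrated table equals the null-partition row count in the backup table, so the data itself does not appear to be lost; only the filter on `cs_sold_date_sk` behaves differently.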
   
   Is this a known issue? I could not find anything when searching.
   
   ### Willingness to contribute
   
   - [ ] I can contribute a fix for this bug independently
   - [ ] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
   - [x] I cannot contribute a fix for this bug at this time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

