ZhendongBai opened a new issue, #7775: URL: https://github.com/apache/iceberg/issues/7775
### Query engine

Spark SQL 3.3.2

### Question

When I use Spark 3.3.2 to write the same data into a Hive partitioned table (table A) and an Iceberg partitioned table (table B, with metadata stored in Hive), where both tables use ORC format and the same compression strategy, I ran the following test (sketched in the SQL after the image below):

1. First, create an Iceberg table (table C) matching the Hive table's schema, and add table A's data files into table C (via the Spark `add_files` procedure).
2. Compare table C and table B by selecting from the `.data_files` metadata table (extracting each field's size from the `column_sizes` map by field id). The result is shown below:

<img width="790" alt="image" src="https://github.com/apache/iceberg/assets/18043146/03311e28-e311-47fb-85d3-930b0f5bf435">

Why are the Iceberg table's string field byte sizes larger than the Spark SQL (Hive) table's, especially for the `map<string, string>` and `string` types?
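For reference, a minimal sketch of the two steps above. The catalog name (`spark_catalog`), database (`db`), table names (`table_a`, `table_b`, `table_c`), and the field id used for the lookup are all hypothetical; the issue does not show the original identifiers.

```sql
-- Step 1 (sketch): register table A's existing ORC files into Iceberg
-- table C using the add_files procedure. Names are hypothetical.
CALL spark_catalog.system.add_files(
  table => 'db.table_c',
  source_table => 'db.table_a'
);

-- Step 2 (sketch): read per-column byte sizes from the data_files metadata
-- table. column_sizes is a map keyed by Iceberg field id, so
-- column_sizes[1] is the on-disk size of field id 1 (id chosen for illustration).
SELECT file_path,
       column_sizes[1] AS field_1_bytes
FROM db.table_c.data_files;

-- Compare total bytes per field between table C (populated via add_files)
-- and table B (written directly by Spark).
SELECT sum(column_sizes[1]) AS field_1_total_bytes
FROM db.table_b.data_files;
```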
