andrey-mindrin opened a new issue #4217: URL: https://github.com/apache/iceberg/issues/4217
We tested Iceberg performance against the Hive table format using the Spark TPC-DS performance tests from Databricks (scale factor 1000) and found that performance on Iceberg tables was 50% lower.

Environment:
1. On-premises cluster running Spark 3.1.2 with Iceberg 0.13.0, with the same number of executors, cores, memory, etc. for both formats.
2. Parquet codec: snappy.
3. Tables were partitioned the same way as the original Hive tables.
4. Tables were copy-on-write (COW) and were created in Spark from the Hive tables with CTAS.
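For readers trying to reproduce the setup, the following is a minimal sketch of how a CTAS statement matching the environment above might be assembled. The report does not show the actual statements, so everything here is an assumption: the helper `build_ctas`, the table names `hive_db.store_sales` and `iceberg_catalog.tpcds.store_sales`, and the partition column `ss_sold_date_sk` are hypothetical. The table properties are standard Iceberg write properties, but whether the reporter set them this way is not stated.

```python
def build_ctas(source_table, target_table, partition_cols, codec="snappy"):
    """Build a Spark SQL CTAS statement that copies a Hive table into Iceberg.

    Hypothetical helper for illustration only; it returns the SQL text and
    does not talk to Spark.
    """
    # Iceberg table properties: Parquet codec plus copy-on-write row-level modes.
    props = {
        "write.parquet.compression-codec": codec,
        "write.delete.mode": "copy-on-write",
        "write.update.mode": "copy-on-write",
        "write.merge.mode": "copy-on-write",
    }
    props_sql = ", ".join(f"'{key}'='{value}'" for key, value in props.items())

    # Keep the same partition columns as the source Hive table, if any.
    partition_sql = ""
    if partition_cols:
        partition_sql = f"PARTITIONED BY ({', '.join(partition_cols)}) "

    return (
        f"CREATE TABLE {target_table} USING iceberg "
        f"{partition_sql}"
        f"TBLPROPERTIES ({props_sql}) "
        f"AS SELECT * FROM {source_table}"
    )


# Assumed names: none of these appear in the original report.
stmt = build_ctas(
    "hive_db.store_sales",
    "iceberg_catalog.tpcds.store_sales",
    ["ss_sold_date_sk"],
)
print(stmt)
```

The resulting string could be passed to `spark.sql(...)` in a session that has an Iceberg catalog configured. Stating the codec and write modes explicitly in the statement makes it easier to confirm that the Hive and Iceberg runs really differ only in table format.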
