FelixYBW commented on issue #11397:
URL:
https://github.com/apache/incubator-gluten/issues/11397#issuecomment-3774901997
To record here:
store_sales partitioned by ss_sold_date_sk has 1824 partitions.
Hive: df.repartition('ss_sold_date_sk') produces as many shuffle partitions as
"spark.sql.shuffle.partitions", which is 90 in this test, so each shuffle
partition writes multiple Parquet files.
Iceberg has 1765 partitions.
Delta Lake has 1479 partitions.
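
A rough sketch of the arithmetic behind the Hive case, using only the two
numbers reported above (1824 table partitions, 90 shuffle partitions). It
assumes the keys spread evenly, and a plain modulo stands in for Spark's
real hash partitioner, so the per-task counts are illustrative only. The
function name is hypothetical.

```python
from collections import Counter

NUM_DATE_KEYS = 1824          # ss_sold_date_sk partitions reported above
NUM_SHUFFLE_PARTITIONS = 90   # shuffle partition count used in this test


def files_per_shuffle_partition(num_keys, num_shuffle_partitions):
    """Map each shuffle partition id to the number of distinct keys it holds.

    Each distinct key in a shuffle partition becomes one output file, because
    the writer emits one file per table partition it sees.
    """
    # Assumption: modulo is a simplified stand-in for Spark's hash partitioning.
    return Counter(key % num_shuffle_partitions for key in range(num_keys))


counts = files_per_shuffle_partition(NUM_DATE_KEYS, NUM_SHUFFLE_PARTITIONS)
total_files = sum(counts.values())
min_files = min(counts.values())
max_files = max(counts.values())
print(total_files, min_files, max_files)
```

Under this even-spread assumption each of the 90 write tasks produces about
20 files (1824 / 90), one per ss_sold_date_sk value routed to it.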