lordk911 commented on issue #5236:
URL: https://github.com/apache/kyuubi/issues/5236#issuecomment-1704662042
@ulysses-you
1. I've changed spark-defaults.conf to:
spark.sql.extensions org.apache.kyuubi.plugin.spark.authz.ranger.RangerSparkExtension,org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions,org.apache.kyuubi.sql.KyuubiSparkSQLExtension
2. Then I connected to Kyuubi:
2.1) set spark.sql.optimizer.insertRepartitionBeforeWrite.enabled=true;
2.2) executed the test SQL (an insert into select)
2.3) before the InsertIntoHadoopFsRelationCommand there was an Exchange
node with RoundRobinPartitioning

2.4) about 20 minutes later I canceled the query, because the shuffle write
data size became much larger when using Spark 3.2.3 with KyuubiSparkSQLExtension
2.5) set spark.sql.optimizer.insertRepartitionBeforeWrite.enabled=false;
2.6) executed the test SQL (insert into select) again
2.7) the SQL finished with the same output data size and file count as
using Spark directly without KyuubiSparkSQLExtension.
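The steps above can be sketched as a single SQL session against Kyuubi; this is a minimal illustration, and the table names `dst` and `src` are hypothetical placeholders, not taken from the report:

```sql
-- Turn the Kyuubi insert-repartition optimization on at session level.
SET spark.sql.optimizer.insertRepartitionBeforeWrite.enabled=true;

-- Inspect the physical plan: with KyuubiSparkSQLExtension loaded, an
-- Exchange node with RoundRobinPartitioning appears before
-- InsertIntoHadoopFsRelationCommand. (dst and src are hypothetical tables.)
EXPLAIN FORMATTED INSERT INTO dst SELECT * FROM src;

-- Turn it off and rerun: the extra shuffle is gone, and the output data
-- size and file count match plain Spark without the extension.
SET spark.sql.optimizer.insertRepartitionBeforeWrite.enabled=false;
INSERT INTO dst SELECT * FROM src;
```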
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]