guobj opened a new issue, #9859:
URL: https://github.com/apache/incubator-gluten/issues/9859
### Backend
VL (Velox)
### Bug description
After enabling Gluten, query performance drops significantly. The execution
plan is as follows:
== Physical Plan ==
Execute InsertIntoHiveTable (62)
+- AdaptiveSparkPlan (61)
+- == Final Plan ==
* Project (41)
+- VeloxColumnarToRow (40)
+- ^ ShuffledHashJoinExecTransformer LeftOuter BuildRight (38)
:- ^ ProjectExecTransformer (13)
: +- ^ FilterExecTransformer (12)
: +- ^ WindowExecTransformer (11)
: +- ^ SortExecTransformer (10)
: +- ^ InputIteratorTransformer (9)
: +- AQEShuffleRead (7)
: +- ShuffleQueryStage (6)
: +- ColumnarExchange (5)
: +- VeloxResizeBatches (4)
: +- ^ ProjectExecTransformer (2)
: +- ^ NativeScan hive loan_data_warehouse.ads_did_deviceid_mapping_df (1)
+- ^ RegularHashAggregateExecTransformer (37)
+- ^ InputIteratorTransformer (36)
+- AQEShuffleRead (34)
+- ShuffleQueryStage (33)
+- ColumnarExchange (32)
+- VeloxResizeBatches (31)
+- ^ ProjectExecTransformer (29)
+- ^ FlushableHashAggregateExecTransformer (28)
+- ^ ProjectExecTransformer (27)
+- ^ FilterExecTransformer (26)
+- ^ WindowExecTransformer (25)
+- ^ SortExecTransformer (24)
+- ^ InputIteratorTransformer (23)
+- AQEShuffleRead (21)
+- ShuffleQueryStage (20)
+- ColumnarExchange (19)
+- VeloxResizeBatches (18)
+- ^ ProjectExecTransformer (16)
+- ^ FilterExecTransformer (15)
+- ^ NativeScan hive loan_data_warehouse.ads_did_deviceid_mapping_df (14)
+- == Initial Plan ==
Project (60)
+- SortMergeJoin LeftOuter (59)
:- Project (47)
: +- Filter (46)
: +- Window (45)
: +- Sort (44)
: +- Exchange (43)
: +- Scan hive loan_data_warehouse.ads_did_deviceid_mapping_df (42)
+- SortAggregate (58)
+- Sort (57)
+- Exchange (56)
+- SortAggregate (55)
+- Project (54)
+- Filter (53)
+- Window (52)
+- Sort (51)
+- Exchange (50)
+- Filter (49)
+- Scan hive loan_data_warehouse.ads_did_deviceid_mapping_df (48)
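
For triage, a plan like the one above can be summarized by counting how many operators were offloaded and how many stayed in vanilla Spark. The following is only a minimal sketch and is not part of the original report: `summarize_plan` is a hypothetical helper, and it assumes that a leading `^` marks an operator running inside a native (Velox) whole-stage transformer while `*` marks Spark whole-stage codegen. The sample text is a shortened excerpt of the Final Plan.

```python
import re
from collections import Counter

# Assumption: "^" prefixes natively executed operators, "*" prefixes
# Spark whole-stage-codegen operators; unmarked nodes are also non-native.
NODE = re.compile(r"(?P<mark>[\^*])?\s*(?P<name>[A-Za-z]\w*)\b.*\((?P<id>\d+)\)\s*$")


def summarize_plan(plan_text):
    """Count native vs. non-native operators in a pasted physical plan."""
    counts = Counter()
    for line in plan_text.splitlines():
        # Drop tree-drawing characters (spaces, ":", "|", "+", "-").
        body = re.sub(r"^[\s:|+\-]*", "", line)
        match = NODE.match(body)
        if not match:
            continue  # section headers or lines without an operator id
        kind = "native" if match.group("mark") == "^" else "vanilla"
        counts[kind] += 1
        counts[match.group("name")] += 1
    return counts


# Shortened excerpt of the Final Plan from this report.
sample = """\
* Project (41)
+- VeloxColumnarToRow (40)
   +- ^ ShuffledHashJoinExecTransformer LeftOuter BuildRight (38)
      :- ^ ProjectExecTransformer (13)
      +- ^ RegularHashAggregateExecTransformer (37)
"""
summary = summarize_plan(sample)
print(summary["native"], summary["vanilla"])
```

The per-operator counts (for example `summary["VeloxColumnarToRow"]`) give a quick view of where the plan leaves native execution, which may or may not be related to the slowdown reported here.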
### Gluten version
Gluten-1.3
### Spark version
Spark-3.2.x
### Spark configurations
bin/spark-submit \
--class org.apache.kyuubi.engine.spark.SparkSQLEngine \
--conf spark.driver.memory=10g \
--conf spark.dynamicAllocation.enabled=false \
--conf spark.executor.instances=20 \
--conf spark.executor.cores=4 \
--conf spark.executor.extraClassPath=./gluten-velox-bundle-spark3.2_2.12-centos_7_x86_64-1.3.0.jar \
--conf spark.executor.memory=10g \
--conf spark.executorEnv.JAVA_HOME=./openjdk-1.8.0.345/java-1.8.0-openjdk-1.8.0.345.b01-1.el7_9.x86_64 \
--conf spark.files=/home/spark/agent/gluten-velox-bundle-spark3.2_2.12-centos_7_x86_64-1.3.0.jar \
--conf spark.gluten.sql.debug=true \
--conf spark.memory.offHeap.enabled=true \
--conf spark.memory.offHeap.size=10g \
--conf spark.network.timeout=300s \
--conf spark.plugins=org.apache.gluten.GlutenPlugin \
--conf spark.shuffle.manager=org.apache.spark.shuffle.sort.ColumnarShuffleManager \
--conf spark.yarn.appMasterEnv.JAVA_HOME=./openjdk-1.8.0.345/java-1.8.0-openjdk-1.8.0.345.b01-1.el7_9.x86_64 \
--conf spark.yarn.dist.archives=/home/spark/agent/java-1.8.0-openjdk.tar.gz#openjdk-1.8.0.345
### System information
_No response_
### Relevant logs
```bash
```