acruise opened a new issue, #7200:
URL: https://github.com/apache/incubator-gluten/issues/7200

   ### Backend
   
   VL (Velox)
   
   ### Bug description
   
   I have a TPC-DS dataset in ORC format on S3. On vanilla Spark 3.5.1 on a 
single node, this query completes in 1-3 seconds:
   ```
   val customers = spark.read.orc("s3a://mybucket/datasets/tpcds_sf100.orc/customer/*.orc").toDF
   customers.count()
   ```
   
   With Gluten enabled (built from the 1.2.0 tag, with S3 enabled),
   initializing the DataFrame is fine. When I invoke `count()`, the expected
   number of tasks is spawned, but the tasks never make any progress.
   
   I've tried disabling whole-stage codegen, but it makes no difference.
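   
   For reference, one way to turn off whole-stage codegen from inside the same spark-shell session is sketched below. It assumes the standard Spark SQL setting `spark.sql.codegen.wholeStage` and the `customers` DataFrame defined above; the exact method used for this report may have differed.
   
   ```scala
   // Assumes an active spark-shell session, where `spark` is the SparkSession
   // and `customers` is the DataFrame created in the snippet above.
   // spark.sql.codegen.wholeStage is a standard Spark SQL option (default: true).
   spark.conf.set("spark.sql.codegen.wholeStage", "false")
   
   // Re-run the action that hangs to check whether the behaviour changes.
   customers.count()
   ```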
   
   ### Spark version
   
   Spark-3.5.x
   
   ### Spark configurations
   
   ```
   /opt/spark/bin/spark-shell \
       --jars /home/alex/incubator-gluten/package/target/gluten-velox-bundle-spark3.5_2.12-ubuntu_22.04_x86_64-1.2.0.jar \
       --packages org.apache.hadoop:hadoop-aws:3.3.4 \
       -c spark.hadoop.fs.s3a.aws.credentials.provider=com.amazonaws.auth.DefaultAWSCredentialsProviderChain \
       -c spark.plugins=org.apache.gluten.GlutenPlugin \
       -c spark.memory.offHeap.enabled=true \
       -c spark.memory.offHeap.size=32G \
       -c spark.driver.memory=8g \
       -c spark.executor.memory=16g
   ```
   
   ### System information
   
   No such script in v1.2.0 :)
   
   It's a c5a.8xlarge (64GB, 32 cores, >100G local disk)
   
   
   ### Relevant logs
   
   ```bash
   scala> customers.count
   24/09/11 22:36:32 INFO FileSourceStrategy: Pushed Filters: 
   24/09/11 22:36:32 INFO FileSourceStrategy: Post-Scan Filters: 
   24/09/11 22:36:32 INFO GlutenFallbackReporter: Validation failed for plan: 
Exchange[QueryId=0], due to: [FallbackByBackendSettings] Validation failed on 
node Exchange.
   24/09/11 22:36:32 INFO InputPartitionsUtil: Planning scan with bin packing, 
max size: 6213148 bytes, open cost is considered as scanning 4194304 bytes.
   24/09/11 22:36:32 INFO DAGScheduler: Registering RDD 5 (count at 
<console>:27) as input to shuffle 0
   24/09/11 22:36:32 INFO DAGScheduler: Got map stage job 1 (count at 
<console>:27) with 22 output partitions
   24/09/11 22:36:32 INFO DAGScheduler: Final stage: ShuffleMapStage 1 (count 
at <console>:27)
   24/09/11 22:36:32 INFO DAGScheduler: Parents of final stage: List()
   24/09/11 22:36:32 INFO DAGScheduler: Missing parents: List()
   24/09/11 22:36:32 INFO DAGScheduler: Submitting ShuffleMapStage 1 
(MapPartitionsRDD[5] at count at <console>:27), which has no missing parents
   24/09/11 22:36:32 INFO MemoryStore: Block broadcast_1 stored as values in 
memory (estimated size 32.9 KiB, free 36.6 GiB)
   24/09/11 22:36:32 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes 
in memory (estimated size 14.6 KiB, free 36.6 GiB)
   24/09/11 22:36:32 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory 
on ip-172-31-0-251.us-west-1.compute.internal:38299 (size: 14.6 KiB, free: 36.6 
GiB)
   24/09/11 22:36:32 INFO SparkContext: Created broadcast 1 from broadcast at 
DAGScheduler.scala:1585
   24/09/11 22:36:32 INFO DAGScheduler: Submitting 22 missing tasks from 
ShuffleMapStage 1 (MapPartitionsRDD[5] at count at <console>:27) (first 15 
tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 
14))
   24/09/11 22:36:32 INFO TaskSchedulerImpl: Adding task set 1.0 with 22 tasks 
resource profile 0
   24/09/11 22:36:32 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 
1) (ip-172-31-0-251.us-west-1.compute.internal, executor driver, partition 0, 
PROCESS_LOCAL, 9202 bytes) 
   24/09/11 22:36:32 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 
2) (ip-172-31-0-251.us-west-1.compute.internal, executor driver, partition 1, 
PROCESS_LOCAL, 9204 bytes) 
   24/09/11 22:36:32 INFO TaskSetManager: Starting task 2.0 in stage 1.0 (TID 
3) (ip-172-31-0-251.us-west-1.compute.internal, executor driver, partition 2, 
PROCESS_LOCAL, 9204 bytes) 
   24/09/11 22:36:32 INFO TaskSetManager: Starting task 3.0 in stage 1.0 (TID 
4) (ip-172-31-0-251.us-west-1.compute.internal, executor driver, partition 3, 
PROCESS_LOCAL, 9204 bytes) 
   24/09/11 22:36:32 INFO TaskSetManager: Starting task 4.0 in stage 1.0 (TID 
5) (ip-172-31-0-251.us-west-1.compute.internal, executor driver, partition 4, 
PROCESS_LOCAL, 9204 bytes) 
   24/09/11 22:36:32 INFO TaskSetManager: Starting task 5.0 in stage 1.0 (TID 
6) (ip-172-31-0-251.us-west-1.compute.internal, executor driver, partition 5, 
PROCESS_LOCAL, 9204 bytes) 
   24/09/11 22:36:32 INFO TaskSetManager: Starting task 6.0 in stage 1.0 (TID 
7) (ip-172-31-0-251.us-west-1.compute.internal, executor driver, partition 6, 
PROCESS_LOCAL, 9204 bytes) 
   24/09/11 22:36:32 INFO TaskSetManager: Starting task 7.0 in stage 1.0 (TID 
8) (ip-172-31-0-251.us-west-1.compute.internal, executor driver, partition 7, 
PROCESS_LOCAL, 9204 bytes) 
   24/09/11 22:36:32 INFO TaskSetManager: Starting task 8.0 in stage 1.0 (TID 
9) (ip-172-31-0-251.us-west-1.compute.internal, executor driver, partition 8, 
PROCESS_LOCAL, 9204 bytes) 
   24/09/11 22:36:32 INFO TaskSetManager: Starting task 9.0 in stage 1.0 (TID 
10) (ip-172-31-0-251.us-west-1.compute.internal, executor driver, partition 9, 
PROCESS_LOCAL, 9204 bytes) 
   24/09/11 22:36:32 INFO TaskSetManager: Starting task 10.0 in stage 1.0 (TID 
11) (ip-172-31-0-251.us-west-1.compute.internal, executor driver, partition 10, 
PROCESS_LOCAL, 9204 bytes) 
   24/09/11 22:36:32 INFO TaskSetManager: Starting task 11.0 in stage 1.0 (TID 
12) (ip-172-31-0-251.us-west-1.compute.internal, executor driver, partition 11, 
PROCESS_LOCAL, 9204 bytes) 
   24/09/11 22:36:32 INFO TaskSetManager: Starting task 12.0 in stage 1.0 (TID 
13) (ip-172-31-0-251.us-west-1.compute.internal, executor driver, partition 12, 
PROCESS_LOCAL, 9204 bytes) 
   24/09/11 22:36:32 INFO TaskSetManager: Starting task 13.0 in stage 1.0 (TID 
14) (ip-172-31-0-251.us-west-1.compute.internal, executor driver, partition 13, 
PROCESS_LOCAL, 9204 bytes) 
   24/09/11 22:36:32 INFO TaskSetManager: Starting task 14.0 in stage 1.0 (TID 
15) (ip-172-31-0-251.us-west-1.compute.internal, executor driver, partition 14, 
PROCESS_LOCAL, 9204 bytes) 
   24/09/11 22:36:32 INFO TaskSetManager: Starting task 15.0 in stage 1.0 (TID 
16) (ip-172-31-0-251.us-west-1.compute.internal, executor driver, partition 15, 
PROCESS_LOCAL, 9204 bytes) 
   24/09/11 22:36:32 INFO TaskSetManager: Starting task 16.0 in stage 1.0 (TID 
17) (ip-172-31-0-251.us-west-1.compute.internal, executor driver, partition 16, 
PROCESS_LOCAL, 9204 bytes) 
   24/09/11 22:36:32 INFO TaskSetManager: Starting task 17.0 in stage 1.0 (TID 
18) (ip-172-31-0-251.us-west-1.compute.internal, executor driver, partition 17, 
PROCESS_LOCAL, 9204 bytes) 
   24/09/11 22:36:32 INFO TaskSetManager: Starting task 18.0 in stage 1.0 (TID 
19) (ip-172-31-0-251.us-west-1.compute.internal, executor driver, partition 18, 
PROCESS_LOCAL, 9204 bytes) 
   24/09/11 22:36:32 INFO TaskSetManager: Starting task 19.0 in stage 1.0 (TID 
20) (ip-172-31-0-251.us-west-1.compute.internal, executor driver, partition 19, 
PROCESS_LOCAL, 9204 bytes) 
   24/09/11 22:36:32 INFO TaskSetManager: Starting task 20.0 in stage 1.0 (TID 
21) (ip-172-31-0-251.us-west-1.compute.internal, executor driver, partition 20, 
PROCESS_LOCAL, 9204 bytes) 
   24/09/11 22:36:32 INFO TaskSetManager: Starting task 21.0 in stage 1.0 (TID 
22) (ip-172-31-0-251.us-west-1.compute.internal, executor driver, partition 21, 
PROCESS_LOCAL, 9204 bytes) 
   24/09/11 22:36:32 INFO Executor: Running task 0.0 in stage 1.0 (TID 1)
   24/09/11 22:36:32 INFO Executor: Running task 1.0 in stage 1.0 (TID 2)
   24/09/11 22:36:32 INFO Executor: Running task 2.0 in stage 1.0 (TID 3)
   24/09/11 22:36:32 INFO Executor: Running task 3.0 in stage 1.0 (TID 4)
   24/09/11 22:36:32 INFO Executor: Running task 4.0 in stage 1.0 (TID 5)
   24/09/11 22:36:32 INFO Executor: Running task 5.0 in stage 1.0 (TID 6)
   24/09/11 22:36:32 INFO Executor: Running task 6.0 in stage 1.0 (TID 7)
   24/09/11 22:36:32 INFO Executor: Running task 7.0 in stage 1.0 (TID 8)
   24/09/11 22:36:32 INFO Executor: Running task 8.0 in stage 1.0 (TID 9)
   24/09/11 22:36:32 INFO Executor: Running task 9.0 in stage 1.0 (TID 10)
   24/09/11 22:36:32 INFO Executor: Running task 10.0 in stage 1.0 (TID 11)
   24/09/11 22:36:32 INFO Executor: Running task 11.0 in stage 1.0 (TID 12)
   24/09/11 22:36:32 INFO Executor: Running task 12.0 in stage 1.0 (TID 13)
   24/09/11 22:36:32 INFO Executor: Running task 13.0 in stage 1.0 (TID 14)
   24/09/11 22:36:32 INFO Executor: Running task 14.0 in stage 1.0 (TID 15)
   24/09/11 22:36:32 INFO Executor: Running task 15.0 in stage 1.0 (TID 16)
   24/09/11 22:36:32 INFO Executor: Running task 16.0 in stage 1.0 (TID 17)
   24/09/11 22:36:32 INFO Executor: Running task 17.0 in stage 1.0 (TID 18)
   24/09/11 22:36:32 INFO Executor: Running task 18.0 in stage 1.0 (TID 19)
   24/09/11 22:36:32 INFO Executor: Running task 19.0 in stage 1.0 (TID 20)
   24/09/11 22:36:32 INFO Executor: Running task 20.0 in stage 1.0 (TID 21)
   24/09/11 22:36:32 INFO Executor: Running task 21.0 in stage 1.0 (TID 22)
   [Stage 0:>                  (0 + 1) / 1][Stage 1:>                (0 + 22) / 22]
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

