Neuw84 opened a new issue, #10670:
URL: https://github.com/apache/incubator-gluten/issues/10670

   ### Backend
   
   VL (Velox)
   
   ### Bug description
   
   Using Spark 3.5.2 and Gluten 1.4 with the configs below, I am not able to read from or write to S3. The tasks appear to start and keep running, but they report no errors and make no progress.
   
   If I disable the Gluten plugin, the same job works. With Gluten enabled, reading from S3 fails in the same way as writing.
   
   Any hints?
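   
   For reference, a minimal sketch of the kind of job that hangs, with hypothetical placeholder paths (`s3a://my-bucket/...`) instead of the real bucket, using the session built with the configs below:
   
   ```python
   # Minimal repro sketch (placeholder paths, not the actual job): read Parquet
   # from S3 and write it back with the session configured below.
   df = spark.read.parquet("s3a://my-bucket/input/")              # read also hangs with Gluten
   df.write.mode("overwrite").parquet("s3a://my-bucket/output/")  # write stage stalls
   ```
   
   The log below is from the write stage: the four VeloxColumnarWriteFilesRDD tasks start but never report progress or errors.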
   
   ```
   25/09/10 12:03:31 INFO DAGScheduler: Submitting 4 missing tasks from ResultStage 5 (VeloxColumnarWriteFilesRDD[20] at parquet at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0, 1, 2, 3))
   25/09/10 12:03:31 INFO TaskSchedulerImpl: Adding task set 5.0 with 4 tasks resource profile 0
   25/09/10 12:03:31 INFO TaskSetManager: Starting task 0.0 in stage 5.0 (TID 33) (jupyter, executor driver, partition 0, NODE_LOCAL, 9274 bytes)
   25/09/10 12:03:31 INFO TaskSetManager: Starting task 1.0 in stage 5.0 (TID 34) (jupyter, executor driver, partition 1, NODE_LOCAL, 9274 bytes)
   25/09/10 12:03:31 INFO TaskSetManager: Starting task 2.0 in stage 5.0 (TID 35) (jupyter, executor driver, partition 2, NODE_LOCAL, 9274 bytes)
   25/09/10 12:03:31 INFO TaskSetManager: Starting task 3.0 in stage 5.0 (TID 36) (jupyter, executor driver, partition 3, NODE_LOCAL, 9274 bytes)
   25/09/10 12:03:31 INFO Executor: Running task 0.0 in stage 5.0 (TID 33)
   25/09/10 12:03:31 INFO Executor: Running task 2.0 in stage 5.0 (TID 35)
   25/09/10 12:03:31 INFO Executor: Running task 3.0 in stage 5.0 (TID 36)
   25/09/10 12:03:31 INFO Executor: Running task 1.0 in stage 5.0 (TID 34)
   ```
   
   ### Gluten version
   
   Gluten-1.4
   
   ### Spark version
   
   Spark-3.5.x
   
   ### Spark configurations
   
   ```
   from pyspark.sql import SparkSession
   
   spark = (
       SparkSession.builder
       .master("local[*]")
       .appName("s3a-committers-stable")
       .config("spark.plugins", "org.apache.gluten.GlutenPlugin")
       .config("spark.memory.offHeap.size", "8g")
       .config("spark.shuffle.manager", "org.apache.spark.shuffle.sort.ColumnarShuffleManager")
       .config("spark.memory.offHeap.enabled", "true")
       .config("spark.driver.extraClassPath", "/opt/spark/jars/gluten-velox-bundle-spark3.5_2.12-linux_amd64-1.4.0.jar")
       .config("spark.executor.extraClassPath", "/opt/spark/jars/gluten-velox-bundle-spark3.5_2.12-linux_amd64-1.4.0.jar")
       .config("spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version", "2")
       .config("spark.hadoop.fs.s3a.bucket.all.committer.magic.enabled", "true")
       .config("spark.hadoop.fs.s3a.use.instance.credentials", "true")
       # .config("spark.speculation", "false")
       .config("spark.dynamicAllocation.enabled", "false")
       .config("spark.gluten.velox.awsSdkLogLevel", "debug")
       .config("spark.log.level", "info")
       .getOrCreate()
   )
   ```
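   
   For completeness, a sketch of what explicit S3A credentials would look like in place of `fs.s3a.use.instance.credentials` (which appears to be a legacy key in recent Hadoop releases); the values are placeholders, and I have not confirmed whether the native Velox S3 client honours the instance-credentials setting:
   
   ```python
   # Sketch only: explicit S3A credentials instead of instance credentials,
   # to rule out the credential chain. The keys are standard Hadoop S3A
   # options; the values are placeholders, not my real configuration.
   spark = (
       SparkSession.builder
       # ... same Gluten configs as above ...
       .config("spark.hadoop.fs.s3a.aws.credentials.provider",
               "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider")
       .config("spark.hadoop.fs.s3a.access.key", "<ACCESS_KEY>")    # placeholder
       .config("spark.hadoop.fs.s3a.secret.key", "<SECRET_KEY>")    # placeholder
       .config("spark.hadoop.fs.s3a.endpoint", "s3.amazonaws.com")  # adjust per region
       .getOrCreate()
   )
   ```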
   
   
   
   ### System information
   
   Docker based on Jupyter Image.
   
   ### Relevant logs
   
   ```bash
   
   ```

