VaibhavFRI opened a new issue, #8207:
URL: https://github.com/apache/incubator-gluten/issues/8207
### Backend
VL (Velox)
### Bug description
I encountered an UnsupportedOperationException while running a Spark job
on JDK 17 using Gluten with the Velox backend on an ARM-based platform. The
error occurs during execution and reports that sun.misc.Unsafe or the
java.nio.DirectByteBuffer constructor is not available.
Error message:
org.apache.gluten.exception.GlutenException: Error during calling Java code from native code:
java.lang.UnsupportedOperationException: sun.misc.Unsafe or java.nio.DirectByteBuffer.<init>(long, int) not available
Command used to run the Spark job:
spark-submit --class com.example.KMeansExample --properties-file spark-config.conf --jars /path/to/gluten-velox-bundle-spark3.5_2.12-ubuntu_22.04_aarch_64-1.3.0-SNAPSHOT.jar target/<spark-build.jar>
Gluten Version: 1.3.0-SNAPSHOT
Spark Version: 3.5.2
JDK Version: 17
Platform: ARM (AWS Graviton)
Backend: Velox
OS: Ubuntu 22.04
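
The error message above names the constructor `java.nio.DirectByteBuffer.<init>(long, int)`, and the stack trace in the logs shows it being raised from Netty's `PlatformDependent.directBuffer`. As a rough diagnostic, the sketch below tries to reach that constructor reflectively on the current JVM. It only approximates the kind of check Netty performs and is not Netty's or Gluten's code; the class name `DirectBufferProbe` and the returned strings are made up for illustration.

```java
import java.lang.reflect.Constructor;
import java.nio.ByteBuffer;

public class DirectBufferProbe {
    // Reports whether DirectByteBuffer(long, int) can be made accessible
    // via reflection on the JVM this runs on.
    static String probe() {
        try {
            // The concrete class of a direct buffer is java.nio.DirectByteBuffer.
            Class<?> cls = ByteBuffer.allocateDirect(1).getClass();
            Constructor<?> ctor = cls.getDeclaredConstructor(long.class, int.class);
            // On JDK 9+ this throws InaccessibleObjectException unless the
            // java.nio package is opened to the calling module.
            ctor.setAccessible(true);
            return "accessible";
        } catch (NoSuchMethodException e) {
            // The constructor signature does not exist on this JDK.
            return "missing";
        } catch (RuntimeException e) {
            // InaccessibleObjectException and SecurityException land here.
            return "blocked: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("DirectByteBuffer(long, int): " + probe());
    }
}
```

Running it with the same JVM options as the executor (for example by copying the value of `spark.executor.extraJavaOptions` onto the `java` command line) should indicate whether the constructor is blocked by module encapsulation or simply absent on that JDK.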
### Spark version
Spark-3.5.x
### Spark configurations
spark.executor.instances 1
spark.executor.cores 1
spark.task.cpus 1
spark.dynamicAllocation.enabled false
spark.cores.max 1
spark.executor.memory 56g
spark.driver.memory 4g
spark.memory.offHeap.enabled true
spark.memory.offHeap.size 20g
spark.executor.memoryOverhead 1g
spark.driver.extraJavaOptions "--illegal-access=permit -Dio.netty.tryReflectionSetAccessible=true --add-opens java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED"
spark.executor.extraJavaOptions "--illegal-access=permit -Dio.netty.tryReflectionSetAccessible=true --add-opens java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED"
spark.plugins org.apache.gluten.GlutenPlugin
spark.gluten.sql.columnar.forceShuffledHashJoin true
spark.shuffle.manager org.apache.spark.shuffle.sort.ColumnarShuffleManager
spark.executor.extraClassPath '/pathto/gluten-velox-bundle-spark3.5_2.12-ubuntu_22.04_aarch_64-1.3.0-SNAPSHOT.jar'
spark.driver.extraClassPath '/pathto/gluten-velox-bundle-spark3.5_2.12-ubuntu_22.04_aarch_64-1.3.0-SNAPSHOT.jar'
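
The `extraJavaOptions` values above open `java.base/java.lang` and `java.base/java.util`, but they do not mention any `java.nio` package. Whether that gap is the cause of the failure is not established in this report; it is only one thing that may be worth checking. The sketch below lists the packages named by `--add-opens` in an options string, accepting both the space-separated and the `=` form used above. `AddOpensCheck` and `openedPackages` are hypothetical helper names, not part of Spark or Gluten.

```java
import java.util.Set;
import java.util.TreeSet;

public class AddOpensCheck {
    // Returns the "module/package" targets named by --add-opens in a
    // whitespace-separated JVM options string.
    static Set<String> openedPackages(String javaOptions) {
        Set<String> opened = new TreeSet<>();
        String[] tokens = javaOptions.trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            String target = null;
            if (tokens[i].startsWith("--add-opens=")) {
                // Form: --add-opens=module/package=target-module
                target = tokens[i].substring("--add-opens=".length());
            } else if (tokens[i].equals("--add-opens") && i + 1 < tokens.length) {
                // Form: --add-opens module/package=target-module
                target = tokens[++i];
            }
            if (target != null) {
                int eq = target.indexOf('=');
                opened.add(eq >= 0 ? target.substring(0, eq) : target);
            }
        }
        return opened;
    }

    public static void main(String[] args) {
        // Options string copied from the configuration in this report.
        String opts = "--illegal-access=permit "
                + "-Dio.netty.tryReflectionSetAccessible=true "
                + "--add-opens java.base/java.lang=ALL-UNNAMED "
                + "--add-opens=java.base/java.util=ALL-UNNAMED";
        Set<String> opened = openedPackages(opts);
        System.out.println("opened: " + opened);
        System.out.println("java.nio opened: " + opened.contains("java.base/java.nio"));
    }
}
```

For the options in this report the helper finds only the `java.lang` and `java.util` entries, so `java.base/java.nio` is not among the explicitly opened packages.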
### System information
_No response_
### Relevant logs
```bash
Caused by: org.apache.gluten.exception.GlutenException: org.apache.gluten.exception.GlutenException: Error during calling Java code from native code: java.lang.UnsupportedOperationException: sun.misc.Unsafe or java.nio.DirectByteBuffer.<init>(long, int) not available
at io.netty.util.internal.PlatformDependent.directBuffer(PlatformDependent.java:534)
at org.apache.gluten.vectorized.LowCopyFileSegmentJniByteInputStream.read(LowCopyFileSegmentJniByteInputStream.java:100)
at org.apache.gluten.vectorized.ColumnarBatchOutIterator.nativeNext(Native Method)
at org.apache.gluten.vectorized.ColumnarBatchOutIterator.next0(ColumnarBatchOutIterator.java:62)
at org.apache.gluten.iterator.ClosableIterator.next(ClosableIterator.java:51)
at org.apache.gluten.vectorized.ColumnarBatchSerializerInstance$TaskDeserializationStream.liftedTree1$1(ColumnarBatchSerializer.scala:180)
at org.apache.gluten.vectorized.ColumnarBatchSerializerInstance$TaskDeserializationStream.readValue(ColumnarBatchSerializer.scala:179)
at org.apache.spark.serializer.DeserializationStream$$anon$2.getNext(Serializer.scala:188)
at org.apache.spark.serializer.DeserializationStream$$anon$2.getNext(Serializer.scala:185)
at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:490)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:31)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at scala.collection.Iterator.isEmpty(Iterator.scala:387)
at scala.collection.Iterator.isEmpty$(Iterator.scala:387)
at scala.collection.AbstractIterator.isEmpty(Iterator.scala:1431)
at org.apache.gluten.execution.VeloxColumnarToRowExec$.toRowIterator(VeloxColumnarToRowExec.scala:121)
at org.apache.gluten.execution.VeloxColumnarToRowExec.$anonfun$doExecuteInternal$1(VeloxColumnarToRowExec.scala:77)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2(RDD.scala:858)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2$adapted(RDD.scala:858)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:840)
at org.apache.gluten.iterator.ClosableIterator.next(ClosableIterator.java:53)
at org.apache.gluten.vectorized.ColumnarBatchSerializerInstance$TaskDeserializationStream.liftedTree1$1(ColumnarBatchSerializer.scala:180)
at org.apache.gluten.vectorized.ColumnarBatchSerializerInstance$TaskDeserializationStream.readValue(ColumnarBatchSerializer.scala:179)
at org.apache.spark.serializer.DeserializationStream$$anon$2.getNext(Serializer.scala:188)
at org.apache.spark.serializer.DeserializationStream$$anon$2.getNext(Serializer.scala:185)
at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:490)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:31)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at scala.collection.Iterator.isEmpty(Iterator.scala:387)
at scala.collection.Iterator.isEmpty$(Iterator.scala:387)
at scala.collection.AbstractIterator.isEmpty(Iterator.scala:1431)
at org.apache.gluten.execution.VeloxColumnarToRowExec$.toRowIterator(VeloxColumnarToRowExec.scala:121)
at org.apache.gluten.execution.VeloxColumnarToRowExec.$anonfun$doExecuteInternal$1(VeloxColumnarToRowExec.scala:77)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2(RDD.scala:858)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2$adapted(RDD.scala:858)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:840)
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]