mixermt commented on issue #1714:
URL:
https://github.com/apache/datafusion-comet/issues/1714#issuecomment-2849354985
Another observation, failure happens while read of data from Iceberg table.
Either we have some versions mismatch, jar hell or some other still unknown
reason to me
We are using Iceberg version 1.6.1
Occasionally I see things like
```
java.lang.IllegalAccessError: tried to access method Å.()V from class
org.apache.iceberg.shaded.org.apache.parquet.bytes.SingleBufferInputStream
at
org.apache.iceberg.shaded.org.apache.parquet.bytes.SingleBufferInputStream.(SingleBufferInputStream.java:37)
at Å.wrap(ByteBufferInputStream.java:38)
at
org.apache.iceberg.shaded.org.apache.parquet.bytes.BytesInput$ByteBufferBytesInput.toInputStream(BytesInput.java:532)
at
org.apache.iceberg.shaded.org.apache.parquet.hadoop.CodecFactory$HeapBytesDecompressor.decompress(CodecFactory.java:112)
at
org.apache.iceberg.shaded.org.apache.parquet.hadoop.ColumnChunkPageReadStore$ColumnChunkPageReader$1.visit(ColumnChunkPageReadStore.java:139)
at
org.apache.iceberg.shaded.org.apache.parquet.hadoop.ColumnChunkPageReadStore$ColumnChunkPageReader$1.visit(ColumnChunkPageReadStore.java:131)
at
org.apache.iceberg.shaded.org.apache.parquet.column.page.DataPageV1.accept(DataPageV1.java:120)
at
org.apache.iceberg.shaded.org.apache.parquet.hadoop.ColumnChunkPageReadStore$ColumnChunkPageReader.readPage(ColumnChunkPageReadStore.java:131)
at
org.apache.iceberg.parquet.BaseColumnIterator.advance(BaseColumnIterator.java:59)
at
org.apache.iceberg.arrow.vectorized.parquet.VectorizedColumnIterator.access$100(VectorizedColumnIterator.java:35)
at
org.apache.iceberg.arrow.vectorized.parquet.VectorizedColumnIterator$BatchReader.nextBatch(VectorizedColumnIterator.java:75)
at
org.apache.iceberg.arrow.vectorized.VectorizedArrowReader.read(VectorizedArrowReader.java:150)
at
org.apache.iceberg.spark.data.vectorized.ColumnarBatchReader$ColumnBatchLoader.readDataToColumnVectors(ColumnarBatchReader.java:123)
at
org.apache.iceberg.spark.data.vectorized.ColumnarBatchReader$ColumnBatchLoader.loadDataToColumnBatch(ColumnarBatchReader.java:98)
at
org.apache.iceberg.spark.data.vectorized.ColumnarBatchReader.read(ColumnarBatchReader.java:72)
at
org.apache.iceberg.spark.data.vectorized.ColumnarBatchReader.read(ColumnarBatchReader.java:44)
at
org.apache.iceberg.parquet.VectorizedParquetReader$FileIterator.next(VectorizedParquetReader.java:147)
at org.apache.iceberg.spark.source.BaseReader.next(BaseReader.java:138)
at
org.apache.spark.sql.execution.datasources.v2.PartitionIterator.hasNext(DataSourceRDD.scala:120)
at
org.apache.spark.sql.execution.datasources.v2.MetricsIterator.hasNext(DataSourceRDD.scala:158)
at
org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.$anonfun$hasNext$1(DataSourceRDD.scala:63)
at
org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.$anonfun$hasNext$1$adapted(DataSourceRDD.scala:63)
at scala.Option.exists(Option.scala:257)
at
org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.hasNext(DataSourceRDD.scala:63)
at
org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
at
org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage6.columnartorow_nextBatch_0$(Unknown
Source)
at
org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage6.hashAgg_doAggregateWithKeys_0$(Unknown
Source)
at
org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage6.processNext(Unknown
Source)
at
org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at
org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
at
org.apache.spark.shuffle.sort.UnsafeShuffleWriter.write(UnsafeShuffleWriter.java:179)
at
org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
at
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:104)
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]