qinghui-xu commented on issue #14239:
URL: https://github.com/apache/iceberg/issues/14239#issuecomment-3370604560
It seems the issue happens not only during the data-rewriting stage of a COW
`delete`, but also on a `select` against a table that is being upserted. It's
likely related to the equality deletes that need to be applied:
```
java.lang.IllegalStateException: Not an instance of java.lang.CharSequence: 176
	at org.apache.iceberg.data.GenericRecord.get(GenericRecord.java:138)
	at org.apache.iceberg.data.InternalRecordWrapper.get(InternalRecordWrapper.java:101)
	at org.apache.iceberg.types.Comparators$StructLikeComparator.compare(Comparators.java:121)
	at org.apache.iceberg.types.Comparators$StructLikeComparator.compare(Comparators.java:94)
	at org.apache.iceberg.util.StructLikeWrapper.equals(StructLikeWrapper.java:91)
	at java.base/java.util.HashMap.putVal(HashMap.java:631)
	at java.base/java.util.HashMap.put(HashMap.java:608)
	at java.base/java.util.HashSet.add(HashSet.java:220)
	at org.apache.iceberg.util.StructLikeSet.add(StructLikeSet.java:102)
	at org.apache.iceberg.util.StructLikeSet.add(StructLikeSet.java:32)
	at org.apache.iceberg.relocated.com.google.common.collect.Iterators.addAll(Iterators.java:370)
	at org.apache.iceberg.relocated.com.google.common.collect.Iterables.addAll(Iterables.java:332)
	at org.apache.iceberg.data.BaseDeleteLoader.loadEqualityDeletes(BaseDeleteLoader.java:115)
	at org.apache.iceberg.data.DeleteFilter.applyEqDeletes(DeleteFilter.java:207)
	at org.apache.iceberg.data.DeleteFilter.applyEqDeletes(DeleteFilter.java:224)
	at org.apache.iceberg.data.DeleteFilter.filter(DeleteFilter.java:178)
	at org.apache.iceberg.spark.source.RowDataReader.open(RowDataReader.java:99)
	at org.apache.iceberg.spark.source.RowDataReader.open(RowDataReader.java:43)
	at org.apache.iceberg.spark.source.BaseReader.next(BaseReader.java:141)
	at org.apache.spark.sql.execution.datasources.v2.PartitionIterator.hasNext(DataSourceRDD.scala:120)
	at org.apache.spark.sql.execution.datasources.v2.MetricsIterator.hasNext(DataSourceRDD.scala:158)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.$anonfun$hasNext$1(DataSourceRDD.scala:63)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.$anonfun$hasNext$1$adapted(DataSourceRDD.scala:63)
	at scala.Option.exists(Option.scala:376)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.hasNext(DataSourceRDD.scala:63)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.advanceToNextIter(DataSourceRDD.scala:97)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.hasNext(DataSourceRDD.scala:63)
	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
	at org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
	at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:388)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:893)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:893)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
	at org.apache.spark.scheduler.Task.run(Task.scala:141)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
	at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
	at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)
```
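For context, the exception comes from the runtime class guard that `GenericRecord.get` applies when a value is read with an expected Java class. A minimal standalone sketch (this is not the Iceberg code; the `get` helper below is a hypothetical stand-in for that guard) reproduces the exact message when an `Integer` surfaces where the equality-delete schema expects a string column:

```java
// Hypothetical stand-in for the class guard in GenericRecord.get(int, Class<T>):
// if the stored value's runtime class does not match the class the delete
// schema projects, an IllegalStateException like the one above is thrown.
public class EqDeleteMismatch {
    static <T> T get(Object value, Class<T> javaClass) {
        if (value != null && !javaClass.isInstance(value)) {
            throw new IllegalStateException(
                "Not an instance of " + javaClass.getName() + ": " + value);
        }
        return javaClass.cast(value);
    }

    public static void main(String[] args) {
        // A value whose class matches the expected column type passes through:
        System.out.println(get("abc", CharSequence.class)); // prints "abc"

        // An int surfacing where a CharSequence column is expected fails,
        // mirroring the "Not an instance of java.lang.CharSequence: 176" above:
        try {
            get(176, CharSequence.class);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If this reading is right, it would point at a mismatch between the equality-delete projection schema and the actual column types of the records being compared, rather than at the Spark read path itself.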
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]