qinghui-xu commented on issue #14239:
URL: https://github.com/apache/iceberg/issues/14239#issuecomment-3370604560
It seems the issue happens not only during the data-rewriting stage of a COW
`delete`, but also on a `select` against a table that is being upserted. It's
likely related to the equality deletes that need to be applied:
```
java.lang.IllegalStateException: Not an instance of java.lang.CharSequence: 176
	at org.apache.iceberg.data.GenericRecord.get(GenericRecord.java:138)
	at org.apache.iceberg.data.InternalRecordWrapper.get(InternalRecordWrapper.java:101)
	at org.apache.iceberg.types.Comparators$StructLikeComparator.compare(Comparators.java:121)
	at org.apache.iceberg.types.Comparators$StructLikeComparator.compare(Comparators.java:94)
	at org.apache.iceberg.util.StructLikeWrapper.equals(StructLikeWrapper.java:91)
	at java.base/java.util.HashMap.putVal(HashMap.java:631)
	at java.base/java.util.HashMap.put(HashMap.java:608)
	at java.base/java.util.HashSet.add(HashSet.java:220)
	at org.apache.iceberg.util.StructLikeSet.add(StructLikeSet.java:102)
	at org.apache.iceberg.util.StructLikeSet.add(StructLikeSet.java:32)
	at org.apache.iceberg.relocated.com.google.common.collect.Iterators.addAll(Iterators.java:370)
	at org.apache.iceberg.relocated.com.google.common.collect.Iterables.addAll(Iterables.java:332)
	at org.apache.iceberg.data.BaseDeleteLoader.loadEqualityDeletes(BaseDeleteLoader.java:115)
	at org.apache.iceberg.data.DeleteFilter.applyEqDeletes(DeleteFilter.java:207)
	at org.apache.iceberg.data.DeleteFilter.applyEqDeletes(DeleteFilter.java:224)
	at org.apache.iceberg.data.DeleteFilter.filter(DeleteFilter.java:178)
	at org.apache.iceberg.spark.source.RowDataReader.open(RowDataReader.java:99)
	at org.apache.iceberg.spark.source.RowDataReader.open(RowDataReader.java:43)
	at org.apache.iceberg.spark.source.BaseReader.next(BaseReader.java:141)
	at org.apache.spark.sql.execution.datasources.v2.PartitionIterator.hasNext(DataSourceRDD.scala:120)
	at org.apache.spark.sql.execution.datasources.v2.MetricsIterator.hasNext(DataSourceRDD.scala:158)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.$anonfun$hasNext$1(DataSourceRDD.scala:63)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.$anonfun$hasNext$1$adapted(DataSourceRDD.scala:63)
	at scala.Option.exists(Option.scala:376)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.hasNext(DataSourceRDD.scala:63)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.advanceToNextIter(DataSourceRDD.scala:97)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.hasNext(DataSourceRDD.scala:63)
	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
	at org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
	at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:388)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:893)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:893)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
	at org.apache.spark.scheduler.Task.run(Task.scala:141)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
	at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
	at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)
```
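For context, the exception comes from the runtime class guard that `GenericRecord.get` applies when a value is read with an expected Java class. A minimal standalone sketch (this is not the Iceberg code; the `get` helper below is a hypothetical stand-in for that guard) reproduces the exact message when an `Integer` surfaces where the equality-delete schema expects a string column:

```java
// Hypothetical stand-in for the class guard in GenericRecord.get(int, Class<T>):
// if the stored value's runtime class does not match the class the delete
// schema projects, an IllegalStateException like the one above is thrown.
public class EqDeleteMismatch {
    static <T> T get(Object value, Class<T> javaClass) {
        if (value != null && !javaClass.isInstance(value)) {
            throw new IllegalStateException(
                "Not an instance of " + javaClass.getName() + ": " + value);
        }
        return javaClass.cast(value);
    }

    public static void main(String[] args) {
        // A value whose class matches the expected column type passes through:
        System.out.println(get("abc", CharSequence.class)); // prints "abc"

        // An int surfacing where a CharSequence column is expected fails,
        // mirroring the "Not an instance of java.lang.CharSequence: 176" above:
        try {
            get(176, CharSequence.class);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If this reading is right, it would point at a mismatch between the equality-delete projection schema and the actual column types of the records being compared, rather than at the Spark read path itself.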
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]