ad1happy2go opened a new issue, #13970:
URL: https://github.com/apache/hudi/issues/13970

   ### Bug Description
   
   Compaction fails with a `java.lang.ClassCastException` (`Integer` cannot be cast to `Long`) when positional merge is enabled:
   
   ```
   25/09/23 05:45:01 WARN TaskSetManager: Lost task 2.3 in stage 30.0 (TID 10174) (ip-10-0-114-99.us-west-2.compute.internal executor 3): TaskKilled (Stage cancelled: Job aborted due to stage failure: Task 31 in stage 30.0 failed 4 times, most recent failure: Lost task 31.3 in stage 30.0 (TID 10102) (ip-10-0-123-223.us-west-2.compute.internal executor 1): java.lang.ClassCastException: class java.lang.Integer cannot be cast to class java.lang.Long (java.lang.Integer and java.lang.Long are in module java.base of loader 'bootstrap')
        at java.base/java.lang.Long.compareTo(Long.java:71)
        at org.apache.hudi.DefaultSparkRecordMerger.partialMerge(DefaultSparkRecordMerger.java:60)
        at org.apache.hudi.common.table.read.BufferedRecordMergerFactory$PartialUpdateBufferedRecordMerger.finalMerge(BufferedRecordMergerFactory.java:322)
        at org.apache.hudi.common.table.read.buffer.FileGroupRecordBuffer.hasNextBaseRecord(FileGroupRecordBuffer.java:238)
        at org.apache.hudi.common.table.read.buffer.PositionBasedFileGroupRecordBuffer.hasNextBaseRecord(PositionBasedFileGroupRecordBuffer.java:238)
        at org.apache.hudi.common.table.read.buffer.KeyBasedFileGroupRecordBuffer.doHasNext(KeyBasedFileGroupRecordBuffer.java:148)
        at org.apache.hudi.common.table.read.buffer.FileGroupRecordBuffer.hasNext(FileGroupRecordBuffer.java:153)
        at org.apache.hudi.common.table.read.HoodieFileGroupReader.hasNext(HoodieFileGroupReader.java:247)
        at org.apache.hudi.common.table.read.HoodieFileGroupReader$HoodieFileGroupReaderIterator.hasNext(HoodieFileGroupReader.java:334)
        at org.apache.hudi.common.util.collection.MappingIterator.hasNext(MappingIterator.java:39)
        at org.apache.spark.sql.execution.datasources.parquet.HoodieFileGroupReaderBasedFileFormat$$anon$1.hasNext(HoodieFileGroupReaderBasedFileFormat.scala:354)
        at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.$anonfun$hasNext$1(FileScanRDD.scala:269)
        at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
        at org.apache.spark.util.FileAccessContext$.withContext(FileAccessContext.scala:41)
        at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:269)
        at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.hashAgg_doAggregateWithKeys_0$(Unknown Source)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
        at org.apache.spark.sql.execution.UnsafeRowInterceptor.hasNext(UnsafeRowInterceptor.java:24)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.hasNext(Unknown Source)
        at org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
        at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
        at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.start(UnsafeShuffleWriter.java:229)
        at org.apache.spark.shuffle.DirectShuffleWriteProcessor.doWrite(DirectShuffleWriteProcessor.scala:44)
   ```
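   
   For context on the failure mode: the trace shows `Long.compareTo` receiving an `Integer`, which is what happens when the two ordering values being merged are decoded with different boxed numeric types and compared through the raw `Comparable` interface. A minimal standalone sketch of just that cast failure (illustrative code, not Hudi internals; the comments about where each type comes from are an assumption):
   
   ```java
   public class OrderingCastRepro {
       public static void main(String[] args) {
           // Ordering value decoded as a boxed Long on one side of the merge
           // (assumption: e.g. the field read as a long from one record source).
           Comparable left = 42L;
           // The same logical field decoded as a boxed Integer on the other side
           // (assumption: e.g. the field read as an int elsewhere).
           Comparable right = 42;
           // Dispatch goes through the erased Comparable.compareTo(Object)
           // bridge method on Long, which casts its argument to Long and throws:
           // java.lang.ClassCastException: class java.lang.Integer cannot be
           // cast to class java.lang.Long
           left.compareTo(right);
       }
   }
   ```
   
   This suggests the two sides of the positional merge produce the ordering field with inconsistent numeric types before `DefaultSparkRecordMerger.partialMerge` compares them.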
   
   **Steps to reproduce:**
   1. Reproducible code: https://gist.github.com/ad1happy2go/f1399d3f5af7fcbefa14ff17f31fbdbb
   2. Run against Hudi master from Sep 22.
   
   
   ### Environment
   
   **Hudi version:** master (Sep 22 build; see steps above)
   **Query engine:** Spark
   **Relevant configs:**
   
   
   ### Logs and Stack Trace
   
   _No response_

