MrAladdin opened a new issue, #10982:
URL: https://github.com/apache/hudi/issues/10982
**Describe the problem you faced**
1. A Spark job upserts into a Hudi MOR (merge-on-read) table.
2. Log compaction fails with org.apache.hudi.exception.HoodieException: Unsupported Operation Exception.
3. Rolling back the inflight log compaction then fails with org.apache.hudi.exception.HoodieRollbackException: Unknown listing type, during rollback of [==>20240409000634923005__logcompaction__INFLIGHT]

I also want to know why, after the log compaction fails, the instant remains in the inflight state and the program does not exit abnormally.
**Environment Description**
* Hudi version: 0.14.1
* Spark version: 3.4.1
* Hive version: 3.1.2
* Hadoop version: 3.1.3
* Storage (HDFS/S3/GCS..): HDFS
* Running on Docker? (yes/no): no
**Additional context**
.option("hoodie.metadata.enable", "true")
.option("hoodie.metadata.index.async", "false")
.option("hoodie.metadata.index.check.timeout.seconds", "900")
.option("hoodie.auto.adjust.lock.configs", "true")
.option("hoodie.metadata.optimized.log.blocks.scan.enable", "true")
.option("hoodie.metadata.metrics.enable", "false")
.option("hoodie.metadata.index.column.stats.enable", "false")
.option("hoodie.metadata.compact.max.delta.commits", "10")
.option("hoodie.metadata.record.index.enable", "true")
.option("hoodie.index.type", "RECORD_INDEX")
.option("hoodie.metadata.max.init.parallelism", "100000")
.option("hoodie.metadata.record.index.min.filegroup.count", "10")
.option("hoodie.metadata.record.index.max.filegroup.count", "10000")
.option("hoodie.metadata.record.index.max.filegroup.size", "1073741824")
.option("hoodie.metadata.auto.initialize", "true")
.option("hoodie.metadata.record.index.growth.factor", "2.0")
.option("hoodie.metadata.max.logfile.size", "2147483648")
.option("hoodie.metadata.log.compaction.enable", "true")
.option("hoodie.metadata.log.compaction.blocks.threshold", "5")
.option("hoodie.write.concurrency.mode", "optimistic_concurrency_control")
.option("hoodie.write.lock.provider", "org.apache.hudi.client.transaction.lock.FileSystemBasedLockProvider")
.option("hoodie.write.lock.filesystem.expire", "10")
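For isolating the failure, it may help to keep the writer options in one place so that a single switch can be flipped between runs. The following is a minimal sketch, not part of the original job: the class `HudiWriterOptionsSketch` and its `build` method are hypothetical, only a subset of the options above is repeated, and whether turning off `hoodie.metadata.log.compaction.enable` avoids the exception is an untested assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Hudi): collects some of the writer options
// from this report so the log compaction switch can be toggled per run.
final class HudiWriterOptionsSketch {

    // Option key copied verbatim from the report.
    static final String METADATA_LOG_COMPACTION = "hoodie.metadata.log.compaction.enable";

    // Builds a subset of the reported options; the flag sets the log compaction switch.
    static Map<String, String> build(boolean metadataLogCompaction) {
        Map<String, String> opts = new LinkedHashMap<>();
        opts.put("hoodie.metadata.enable", "true");
        opts.put("hoodie.metadata.record.index.enable", "true");
        opts.put("hoodie.index.type", "RECORD_INDEX");
        opts.put("hoodie.metadata.log.compaction.blocks.threshold", "5");
        opts.put(METADATA_LOG_COMPACTION, Boolean.toString(metadataLogCompaction));
        return opts;
    }

    public static void main(String[] args) {
        // Print the options as they would be handed to the writer.
        build(false).forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

The resulting map can be handed to Spark's `DataFrameWriter.options(java.util.Map)` in place of the individual `.option(...)` calls.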
**Stacktrace**
First exception:
Job aborted due to stage failure: Task 6 in stage 203.0 failed 4 times, most recent failure: Lost task 6.3 in stage 203.0 (TID 4263) (11.slave.hdp executor 13): org.apache.hudi.exception.HoodieException: Unsupported Operation Exception
at org.apache.hudi.common.util.collection.BitCaskDiskMap.values(BitCaskDiskMap.java:302)
at org.apache.hudi.common.util.collection.ExternalSpillableMap.values(ExternalSpillableMap.java:275)
at org.apache.hudi.table.HoodieSparkMergeOnReadTable.handleInsertsForLogCompaction(HoodieSparkMergeOnReadTable.java:206)
at org.apache.hudi.table.action.compact.LogCompactionExecutionHelper.writeFileAndGetWriteStats(LogCompactionExecutionHelper.java:79)
at org.apache.hudi.table.action.compact.HoodieCompactor.compact(HoodieCompactor.java:237)
at org.apache.hudi.table.action.compact.HoodieCompactor.lambda$compact$988df80a$1(HoodieCompactor.java:132)
at org.apache.spark.api.java.JavaPairRDD$.$anonfun$toScalaFunction$1(JavaPairRDD.scala:1070)
at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:223)
at org.apache.spark.storage.memory.MemoryStore.putIteratorAsBytes(MemoryStore.scala:352)
at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1552)
at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1462)
at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1526)
at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1349)
at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:375)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:326)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
at org.apache.spark.scheduler.Task.run(Task.scala:139)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1529)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
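The top frames of this trace show `ExternalSpillableMap.values()` delegating to `BitCaskDiskMap.values()`, which throws. One possible reading is that `values()` works while all records fit in memory and fails only once some have spilled to disk. The toy model below illustrates that pattern. It is a sketch under stated assumptions, not Hudi's implementation: `ToyDiskMap` and `ToySpillableMap` are hypothetical classes, the spill limit counts entries rather than bytes, and the report does not confirm that spilling actually occurred.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for a disk-backed map: values() is deliberately unsupported,
// mirroring what the trace shows for BitCaskDiskMap.values().
final class ToyDiskMap<K, V> {
    private final Map<K, V> backing = new HashMap<>(); // pretend this lives on disk

    void put(K key, V value) { backing.put(key, value); }

    boolean isEmpty() { return backing.isEmpty(); }

    Collection<V> values() {
        throw new UnsupportedOperationException("values() is not supported on the disk map");
    }
}

// Toy stand-in for a spillable map: entries stay in memory up to a limit
// (an entry count here, as a simplifying assumption), then go to "disk".
final class ToySpillableMap<K, V> {
    private final int maxInMemoryEntries;
    private final Map<K, V> inMemory = new HashMap<>();
    private final ToyDiskMap<K, V> onDisk = new ToyDiskMap<>();

    ToySpillableMap(int maxInMemoryEntries) { this.maxInMemoryEntries = maxInMemoryEntries; }

    void put(K key, V value) {
        if (inMemory.containsKey(key) || inMemory.size() < maxInMemoryEntries) {
            inMemory.put(key, value);
        } else {
            onDisk.put(key, value); // spill to the disk map
        }
    }

    Collection<V> values() {
        if (onDisk.isEmpty()) {
            return inMemory.values(); // nothing spilled: safe
        }
        List<V> all = new ArrayList<>(inMemory.values());
        all.addAll(onDisk.values()); // throws once anything has spilled
        return all;
    }
}

final class SpillableValuesSketch {
    public static void main(String[] args) {
        ToySpillableMap<String, Integer> map = new ToySpillableMap<>(2);
        map.put("a", 1);
        map.put("b", 2);
        System.out.println("in-memory only: " + map.values().size() + " values");
        map.put("c", 3); // third entry spills
        try {
            map.values();
        } catch (UnsupportedOperationException e) {
            System.out.println("after spill: " + e.getMessage());
        }
    }
}
```

If this reading holds, the failure would depend on data volume and the memory available to the spillable map, which could explain why it appears only on some log compaction runs.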
Second exception:
org.apache.hudi.exception.HoodieRollbackException: Unknown listing type, during rollback of [==>20240409000634923005__logcompaction__INFLIGHT]
at org.apache.hudi.table.action.rollback.ListingBasedRollbackStrategy.lambda$getRollbackRequests$742513f$1(ListingBasedRollbackStrategy.java:187)
at org.apache.hudi.client.common.HoodieSparkEngineContext.lambda$flatMap$7d470b86$1(HoodieSparkEngineContext.java:150)