[
https://issues.apache.org/jira/browse/HUDI-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ethan Guo updated HUDI-5936:
----------------------------
Component/s: cleaning
table-service
> Potential serialization issue when FileStatus is not serializable
> -----------------------------------------------------------------
>
> Key: HUDI-5936
> URL: https://issues.apache.org/jira/browse/HUDI-5936
> Project: Apache Hudi
> Issue Type: Bug
> Components: cleaning, table-service
> Reporter: Shawn Chang
> Assignee: Shawn Chang
> Priority: Major
> Labels: pull-request-available
>
> Hadoop 3's FileStatus implements Serializable, so it is not affected. However, when
> users run Hudi on an older Hadoop version, or with a custom FileSystem
> implementation whose FileStatus is not serializable, they can hit a
> serialization failure.
>
> Exception:
> {code:java}
> com.esotericsoftware.kryo.KryoException: java.util.ConcurrentModificationException
> Caused by: java.util.ConcurrentModificationException
>     at java.util.Vector$Itr.checkForComodification(Vector.java:1212)
>     at java.util.Vector$Itr.next(Vector.java:1165)
>     at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:99)
>     at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:40)
>     at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:575)
>     at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:79)
> {code}
>
> The line of code that triggers this issue:
> https://github.com/apache/hudi/blob/master/hudi-common/src/main/java/org/apache/hudi/metadata/FileSystemBackedTableMetadata.java#L109
> Driver stack trace:
> {code:java}
> Driver stacktrace:
>     at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2863)
>     at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2799)
>     at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2798)
>     at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
>     at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
>     at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2798)
>     at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1239)
>     at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1239)
>     at scala.Option.foreach(Option.scala:407)
>     at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1239)
>     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:3051)
>     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2993)
>     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2982)
>     at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
>     at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:1009)
>     at org.apache.spark.SparkContext.runJob(SparkContext.scala:2229)
>     at org.apache.spark.SparkContext.runJob(SparkContext.scala:2250)
>     at org.apache.spark.SparkContext.runJob(SparkContext.scala:2269)
>     at org.apache.spark.SparkContext.runJob(SparkContext.scala:2294)
>     at org.apache.spark.rdd.RDD.$anonfun$collect$1(RDD.scala:1021)
>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>     at org.apache.spark.rdd.RDD.withScope(RDD.scala:406)
>     at org.apache.spark.rdd.RDD.collect(RDD.scala:1020)
>     at org.apache.spark.api.java.JavaRDDLike.collect(JavaRDDLike.scala:362)
>     at org.apache.spark.api.java.JavaRDDLike.collect$(JavaRDDLike.scala:361)
>     at org.apache.spark.api.java.AbstractJavaRDDLike.collect(JavaRDDLike.scala:45)
>     at org.apache.hudi.client.common.HoodieSparkEngineContext.flatMap(HoodieSparkEngineContext.java:137)
>     at org.apache.hudi.metadata.FileSystemBackedTableMetadata.getAllPartitionPaths(FileSystemBackedTableMetadata.java:86)
>     at org.apache.hudi.table.action.clean.CleanPlanner.getPartitionPathsForFullCleaning(CleanPlanner.java:214)
>     at org.apache.hudi.table.action.clean.CleanPlanner.getPartitionPathsForCleanByCommits(CleanPlanner.java:168)
>     at org.apache.hudi.table.action.clean.CleanPlanner.getPartitionPathsToClean(CleanPlanner.java:133)
>     at org.apache.hudi.table.action.clean.CleanPlanActionExecutor.requestClean(CleanPlanActionExecutor.java:106)
>     at org.apache.hudi.table.action.clean.CleanPlanActionExecutor.requestClean(CleanPlanActionExecutor.java:148)
>     at org.apache.hudi.table.action.clean.CleanPlanActionExecutor.execute(CleanPlanActionExecutor.java:173)
>     at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.scheduleCleaning(HoodieSparkCopyOnWriteTable.java:204)
>     at org.apache.hudi.client.BaseHoodieWriteClient.scheduleTableServiceInternal(BaseHoodieWriteClient.java:1354)
>     at org.apache.hudi.client.BaseHoodieWriteClient.clean(BaseHoodieWriteClient.java:865)
>     at org.apache.hudi.client.BaseHoodieWriteClient.clean(BaseHoodieWriteClient.java:827)
>     at org.apache.hudi.async.AsyncCleanerService.lambda$startService$0(AsyncCleanerService.java:55)
>     at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     ... 1 more
> Caused by: com.esotericsoftware.kryo.KryoException:
> java.util.ConcurrentModificationException
> {code}
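> The failure pattern is generic: a Spark collect hands the serializer an object
> graph containing non-serializable FileStatus instances. Below is a minimal,
> hedged sketch of the usual workaround, extracting the serializable fields
> (here, the path string) before collecting instead of shipping the status
> objects themselves. {{LegacyFileStatus}} is a hypothetical stand-in for a
> pre-Hadoop-3 FileStatus, not Hudi's actual code, and JDK serialization is used
> here as a stand-in for Kryo:

```java
import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for a pre-Hadoop-3 FileStatus, which does not
// implement java.io.Serializable.
class LegacyFileStatus {
    private final String path;
    LegacyFileStatus(String path) { this.path = path; }
    String getPathString() { return path; }
}

public class SerializableExtraction {
    // Returns true iff the object survives JDK serialization
    // (a proxy here for surviving Kryo serialization in Spark).
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream oos =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(o);
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<LegacyFileStatus> statuses = Arrays.asList(
                new LegacyFileStatus("s3://bucket/table/2023/03/01"),
                new LegacyFileStatus("s3://bucket/table/2023/03/02"));

        // Shipping the raw status objects across the wire fails:
        if (canSerialize(statuses.get(0))) {
            throw new AssertionError("status should not be serializable");
        }

        // Mapping to plain path strings before collecting avoids the problem:
        List<String> paths = statuses.stream()
                .map(LegacyFileStatus::getPathString)
                .collect(Collectors.toList());
        if (!canSerialize(paths)) {
            throw new AssertionError("plain path strings should serialize");
        }
        System.out.println(paths);
    }
}
```

> This mirrors one plausible fix direction for the flatMap in
> FileSystemBackedTableMetadata: have executors return path strings (or another
> serializable representation) rather than FileStatus objects.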
--
This message was sent by Atlassian Jira
(v8.20.10#820010)