[ 
https://issues.apache.org/jira/browse/HUDI-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5936:
----------------------------
    Component/s: cleaning
                 table-service

> Potential serialization issue when FileStatus is not serializable
> -----------------------------------------------------------------
>
>                 Key: HUDI-5936
>                 URL: https://issues.apache.org/jira/browse/HUDI-5936
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: cleaning, table-service
>            Reporter: Shawn Chang
>            Assignee: Shawn Chang
>            Priority: Major
>              Labels: pull-request-available
>
> Hadoop 3's FileStatus is serializable and does not have this issue. However, when 
> users run Hudi on an older Hadoop version, or on a customized FileSystem 
> implementation whose FileStatus is not serializable, they can run into a 
> serialization issue.
>  
> Exception:
> {code:java}
> com.esotericsoftware.kryo.KryoException: java.util.ConcurrentModificationException
> Caused by: java.util.ConcurrentModificationException
>       at java.util.Vector$Itr.checkForComodification(Vector.java:1212)
>       at java.util.Vector$Itr.next(Vector.java:1165)
>       at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:99)
>       at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:40)
>       at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:575)
>       at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:79)
> {code}
>  
> The line of code that causes this issue: 
> https://github.com/apache/hudi/blob/master/hudi-common/src/main/java/org/apache/hudi/metadata/FileSystemBackedTableMetadata.java#L109
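A minimal sketch of one way to sidestep the problem (not necessarily the fix in the linked pull request): map the FileStatus objects to plain path strings before handing them to the Spark engine context, so Kryo never has to serialize FileStatus itself. The `FileStatus` class below is a hypothetical stand-in for `org.apache.hadoop.fs.FileStatus`, and `toSerializablePaths` is an illustrative helper, not a Hudi API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for org.apache.hadoop.fs.FileStatus, which may not
// be Serializable on Hadoop 2.x or on custom FileSystem implementations.
class FileStatus {
    private final String path;

    FileStatus(String path) {
        this.path = path;
    }

    String getPathString() {
        return path;
    }
}

public class SerializablePaths {
    // Convert FileStatus objects to plain path strings (always Serializable)
    // on the driver, before the collection is shipped to executors. Kryo then
    // only ever sees Strings, never FileStatus instances.
    static List<String> toSerializablePaths(List<FileStatus> statuses) {
        return statuses.stream()
                .map(FileStatus::getPathString)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<FileStatus> statuses = Arrays.asList(
                new FileStatus("/table/2023/01"),
                new FileStatus("/table/2023/02"));
        System.out.println(toSerializablePaths(statuses));
    }
}
```

The same idea applies to any non-serializable Hadoop type reaching a Spark closure: extract the serializable fields you actually need (here, the path) on the driver, and reconstruct richer objects on the executors if required.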
> Driver stack trace:
> {code:java}
> Driver stacktrace:
>       at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2863)
>       at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2799)
>       at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2798)
>       at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
>       at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
>       at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
>       at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2798)
>       at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1239)
>       at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1239)
>       at scala.Option.foreach(Option.scala:407)
>       at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1239)
>       at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:3051)
>       at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2993)
>       at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2982)
>       at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
>       at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:1009)
>       at org.apache.spark.SparkContext.runJob(SparkContext.scala:2229)
>       at org.apache.spark.SparkContext.runJob(SparkContext.scala:2250)
>       at org.apache.spark.SparkContext.runJob(SparkContext.scala:2269)
>       at org.apache.spark.SparkContext.runJob(SparkContext.scala:2294)
>       at org.apache.spark.rdd.RDD.$anonfun$collect$1(RDD.scala:1021)
>       at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>       at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>       at org.apache.spark.rdd.RDD.withScope(RDD.scala:406)
>       at org.apache.spark.rdd.RDD.collect(RDD.scala:1020)
>       at org.apache.spark.api.java.JavaRDDLike.collect(JavaRDDLike.scala:362)
>       at org.apache.spark.api.java.JavaRDDLike.collect$(JavaRDDLike.scala:361)
>       at org.apache.spark.api.java.AbstractJavaRDDLike.collect(JavaRDDLike.scala:45)
>       at org.apache.hudi.client.common.HoodieSparkEngineContext.flatMap(HoodieSparkEngineContext.java:137)
>       at org.apache.hudi.metadata.FileSystemBackedTableMetadata.getAllPartitionPaths(FileSystemBackedTableMetadata.java:86)
>       at org.apache.hudi.table.action.clean.CleanPlanner.getPartitionPathsForFullCleaning(CleanPlanner.java:214)
>       at org.apache.hudi.table.action.clean.CleanPlanner.getPartitionPathsForCleanByCommits(CleanPlanner.java:168)
>       at org.apache.hudi.table.action.clean.CleanPlanner.getPartitionPathsToClean(CleanPlanner.java:133)
>       at org.apache.hudi.table.action.clean.CleanPlanActionExecutor.requestClean(CleanPlanActionExecutor.java:106)
>       at org.apache.hudi.table.action.clean.CleanPlanActionExecutor.requestClean(CleanPlanActionExecutor.java:148)
>       at org.apache.hudi.table.action.clean.CleanPlanActionExecutor.execute(CleanPlanActionExecutor.java:173)
>       at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.scheduleCleaning(HoodieSparkCopyOnWriteTable.java:204)
>       at org.apache.hudi.client.BaseHoodieWriteClient.scheduleTableServiceInternal(BaseHoodieWriteClient.java:1354)
>       at org.apache.hudi.client.BaseHoodieWriteClient.clean(BaseHoodieWriteClient.java:865)
>       at org.apache.hudi.client.BaseHoodieWriteClient.clean(BaseHoodieWriteClient.java:827)
>       at org.apache.hudi.async.AsyncCleanerService.lambda$startService$0(AsyncCleanerService.java:55)
>       at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       ... 1 more
> Caused by: com.esotericsoftware.kryo.KryoException: java.util.ConcurrentModificationException
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)