[ https://issues.apache.org/jira/browse/HUDI-8963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Davis Zhang updated HUDI-8963:
------------------------------
    Fix Version/s: 1.0.2

> Multi-writer schema evolution interleaved with compaction can have issues
> -------------------------------------------------------------------------
>
>                 Key: HUDI-8963
>                 URL: https://issues.apache.org/jira/browse/HUDI-8963
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Davis Zhang
>            Priority: Major
>             Fix For: 1.0.2
>
>
> The first delta commit uses schema 1 and is committed.
> The second delta commit uses schema 2 (a valid schema evolution) and goes inflight.
> A compaction is requested (compaction.requested).
> The second delta commit finishes.
> Compaction execution then fails.
> {code:java}
> drwxr-xr-x@ 2 zhanyeha  staff    64 Feb  5 17:36 history
> -rw-r--r--@ 1 zhanyeha  staff     0 Feb  5 17:36 0011.deltacommit.requested
> -rw-r--r--@ 1 zhanyeha  staff     0 Feb  5 17:36 0011.deltacommit.inflight
> -rw-r--r--@ 1 zhanyeha  staff  4314 Feb  5 17:36 0011_20250205173626508.deltacommit
> -rw-r--r--@ 1 zhanyeha  staff     0 Feb  5 17:36 0012.deltacommit.requested
> -rw-r--r--@ 1 zhanyeha  staff  2795 Feb  5 17:36 0012.deltacommit.inflight
> -rw-r--r--@ 1 zhanyeha  staff  4502 Feb  5 17:36 0012_20250205173628037.deltacommit
> -rw-r--r--@ 1 zhanyeha  staff     0 Feb  5 17:36 0021.deltacommit.requested
> -rw-r--r--@ 1 zhanyeha  staff   113 Feb  5 17:36 0021.deltacommit.inflight
> -rw-r--r--@ 1 zhanyeha  staff     0 Feb  5 17:36 0031.deltacommit.requested
> -rw-r--r--@ 1 zhanyeha  staff   113 Feb  5 17:36 0031.deltacommit.inflight
> -rw-r--r--@ 1 zhanyeha  staff  3186 Feb  5 17:36 0032.compaction.requested
> -rw-r--r--@ 1 zhanyeha  staff  3829 Feb  5 17:36 0021_20250205173628336.deltacommit
>  {code}
> The error is an NPE in the projection step. Most likely the compaction
> writer schema does not match the data being compacted, so the projection
> accesses data fields that do not exist.
>  
> {code:java}
>     at org.apache.spark.api.java.JavaRDDLike.collect$(JavaRDDLike.scala:361)
>     at org.apache.spark.api.java.AbstractJavaRDDLike.collect(JavaRDDLike.scala:45)
>     at org.apache.hudi.data.HoodieJavaRDD.collectAsList(HoodieJavaRDD.java:200)
>     at org.apache.hudi.table.action.compact.RunCompactionActionExecutor.execute(RunCompactionActionExecutor.java:113)
>     ... 136 more
> Caused by: java.lang.NullPointerException
>     at org.apache.spark.sql.catalyst.expressions.codegen.UnsafeWriter.write(UnsafeWriter.java:110)
>     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.writeFields_0_4$(Unknown Source)
>     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
>     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
>     at org.apache.spark.sql.execution.datasources.RecordReaderIterator$$anon$1.next(RecordReaderIterator.scala:62)
>     at org.apache.hudi.util.CloseableInternalRowIterator.next(CloseableInternalRowIterator.scala:57)
>     at org.apache.hudi.util.CloseableInternalRowIterator.next(CloseableInternalRowIterator.scala:36)
>     at org.apache.hudi.common.table.read.HoodieKeyBasedFileGroupRecordBuffer.doHasNext(HoodieKeyBasedFileGroupRecordBuffer.java:140)
>     at org.apache.hudi.common.table.read.HoodieBaseFileGroupRecordBuffer.hasNext(HoodieBaseFileGroupRecordBuffer.java:160)
>     at org.apache.hudi.common.table.read.HoodieFileGroupReader.hasNext(HoodieFileGroupReader.java:260)
>     at org.apache.hudi.common.table.read.HoodieFileGroupReader$HoodieFileGroupReaderIterator.hasNext(HoodieFileGroupReader.java:331)
>     at org.apache.hudi.io.HoodieSparkFileGroupReaderBasedMergeHandle.write(HoodieSparkFileGroupReaderBasedMergeHandle.java:203)
>     at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.compactUsingFileGroupReader(HoodieSparkCopyOnWriteTable.java:281)
>     at org.apache.hudi.table.action.compact.HoodieCompactor.compact(HoodieCompactor.java:305)
>     at org.apache.hudi.table.action.compact.HoodieCompactor.lambda$compact$8ace6636$1(HoodieCompactor.java:159)
>      {code}
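
The interleaving described in the issue can be illustrated with a toy model. This is only a sketch of one possible reading of the report, not Hudi code: the `Instant` class, the helper functions, the field names, and the instant times are all hypothetical, and the assumption that the compaction writer schema is taken from the latest completed delta commit at scheduling time is not confirmed by the issue.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Instant:
    """Hypothetical, simplified stand-in for a timeline instant."""
    time: str                      # illustrative instant time
    action: str                    # "deltacommit" or "compaction"
    state: str                     # "requested", "inflight", or "completed"
    fields: Tuple[str, ...] = ()   # writer schema reduced to field names


def latest_completed_fields(timeline: List[Instant]) -> Optional[Tuple[str, ...]]:
    """Field names of the latest completed delta commit, or None if there is none."""
    completed = [i for i in timeline
                 if i.action == "deltacommit" and i.state == "completed"]
    if not completed:
        return None
    return max(completed, key=lambda i: i.time).fields


def fields_unknown_to_writer(writer_fields, data_fields) -> List[str]:
    """Fields present in the data but absent from the writer schema."""
    return [f for f in data_fields if f not in writer_fields]


# Hypothetical schemas: schema 2 adds one column to schema 1.
SCHEMA_1 = ("id", "name")
SCHEMA_2 = ("id", "name", "age")

# Timeline when the compaction is requested: the second delta commit is inflight.
at_scheduling = [
    Instant("0011", "deltacommit", "completed", SCHEMA_1),
    Instant("0021", "deltacommit", "inflight", SCHEMA_2),
    Instant("0032", "compaction", "requested"),
]
# ASSUMPTION: the writer schema is resolved here, so it only sees schema 1.
writer_fields = latest_completed_fields(at_scheduling)

# Timeline when the compaction executes: the second delta commit has finished.
at_execution = [
    Instant("0011", "deltacommit", "completed", SCHEMA_1),
    Instant("0021", "deltacommit", "completed", SCHEMA_2),
    Instant("0032", "compaction", "requested"),
]
data_fields = latest_completed_fields(at_execution)

mismatch = fields_unknown_to_writer(writer_fields, data_fields)
print(mismatch)
```

Under these assumptions the model reports a field that the data carries but the writer schema does not know about. The mismatch could equally run in the opposite direction (a writer schema with fields missing from older data), which would fit the "accessing non-existing data fields" hypothesis in the description; the model does not decide between the two.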



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
