[ https://issues.apache.org/jira/browse/HUDI-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17902014#comment-17902014 ]

Lokesh Jain commented on HUDI-8609:
-----------------------------------

| This actually has nothing to do with the double insert.

If we remove the duplicate insert statement, the test passes even when `ts` is used as both the precombine field and the partition field.
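
For reference, the scenario from the linked commit can be sketched in Spark SQL roughly like this (table name and column types are assumptions, not taken from the test; the key point is that `ts` serves as both the precombine and the partition field, and the same record is inserted twice):

{code:sql}
-- Hypothetical sketch of the failing scenario (names/types assumed):
CREATE TABLE hudi_repro (
  id INT,
  name STRING,
  price DOUBLE,
  ts LONG
) USING hudi
TBLPROPERTIES (
  type = 'mor',
  primaryKey = 'id',
  preCombineField = 'ts'   -- ts is the precombine field...
)
PARTITIONED BY (ts);       -- ...and also the partition field

INSERT INTO hudi_repro VALUES (1, 'a1', 10.0, 1000);
-- Inserting the same record again is what triggers the read-side failure:
INSERT INTO hudi_repro VALUES (1, 'a1', 10.0, 1000);
SELECT * FROM hudi_repro;
{code}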

> Fix field does not exist error with HoodieFileGroupReader
> ---------------------------------------------------------
>
>                 Key: HUDI-8609
>                 URL: https://issues.apache.org/jira/browse/HUDI-8609
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Lokesh Jain
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 1.0.0
>
>
> [https://github.com/lokeshj1703/hudi/commit/04ea4757cc739c91251015fb55b223fedea54155] reproduces the error in a test.
> The error happens when we insert the same record into the table again.
> {code:java}
> Caused by: java.lang.IllegalArgumentException: Field: ts does not exist in the table schema
>     at org.apache.hudi.common.table.read.HoodieFileGroupReaderSchemaHandler.generateRequiredSchema(HoodieFileGroupReaderSchemaHandler.java:165)
>     at org.apache.hudi.common.table.read.HoodieFileGroupReaderSchemaHandler.prepareRequiredSchema(HoodieFileGroupReaderSchemaHandler.java:206)
>     at org.apache.hudi.common.table.read.HoodieFileGroupReaderSchemaHandler.<init>(HoodieFileGroupReaderSchemaHandler.java:95)
>     at org.apache.hudi.common.table.read.HoodieFileGroupReader.<init>(HoodieFileGroupReader.java:118)
>     at org.apache.spark.sql.execution.datasources.parquet.HoodieFileGroupReaderBasedParquetFileFormat.$anonfun$buildReaderWithPartitionValues$3(HoodieFileGroupReaderBasedParquetFileFormat.scala:184)
>     at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:209)
>     at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:270)
>     at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:116)
>     at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
>     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)