[
https://issues.apache.org/jira/browse/HUDI-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17902014#comment-17902014
]
Lokesh Jain commented on HUDI-8609:
-----------------------------------
This actually has nothing to do with the double insert.
If we remove the duplicate insert statement, the test still passes even when the
same field `ts` is used as both the precombine and partition field.
> Fix field does not exist error with HoodieFileGroupReader
> ---------------------------------------------------------
>
> Key: HUDI-8609
> URL: https://issues.apache.org/jira/browse/HUDI-8609
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Lokesh Jain
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 1.0.0
>
>
> [https://github.com/lokeshj1703/hudi/commit/04ea4757cc739c91251015fb55b223fedea54155] reproduces
> the error in a test.
> The error happens when we insert the same record into the table again.
> {code:java}
> Caused by: java.lang.IllegalArgumentException: Field: ts does not exist in the table schema
>     at org.apache.hudi.common.table.read.HoodieFileGroupReaderSchemaHandler.generateRequiredSchema(HoodieFileGroupReaderSchemaHandler.java:165)
>     at org.apache.hudi.common.table.read.HoodieFileGroupReaderSchemaHandler.prepareRequiredSchema(HoodieFileGroupReaderSchemaHandler.java:206)
>     at org.apache.hudi.common.table.read.HoodieFileGroupReaderSchemaHandler.<init>(HoodieFileGroupReaderSchemaHandler.java:95)
>     at org.apache.hudi.common.table.read.HoodieFileGroupReader.<init>(HoodieFileGroupReader.java:118)
>     at org.apache.spark.sql.execution.datasources.parquet.HoodieFileGroupReaderBasedParquetFileFormat.$anonfun$buildReaderWithPartitionValues$3(HoodieFileGroupReaderBasedParquetFileFormat.scala:184)
>     at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:209)
>     at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:270)
>     at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:116)
>     at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
>     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
> {code}
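> A minimal sketch of the double-insert scenario in Spark SQL, with `ts` as both precombine and partition field. The table name, columns, and values here are illustrative assumptions, not taken from the linked commit:
> {code:sql}
> -- hypothetical Hudi table where ts is both the precombine and the partition field
> CREATE TABLE hudi_tbl (id INT, name STRING, ts BIGINT)
> USING hudi
> PARTITIONED BY (ts)
> TBLPROPERTIES (primaryKey = 'id', preCombineField = 'ts');
>
> INSERT INTO hudi_tbl VALUES (1, 'a', 1000);
> -- inserting the same record again, then reading, hits the stack trace above
> INSERT INTO hudi_tbl VALUES (1, 'a', 1000);
> SELECT * FROM hudi_tbl;
> {code}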
--
This message was sent by Atlassian Jira
(v8.20.10#820010)