[https://issues.apache.org/jira/browse/HUDI-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17902003#comment-17902003]
Jonathan Vexler commented on HUDI-8609:
---------------------------------------
This actually has nothing to do with the double insert. It happens because you
are using "ts" as both the precombine field and the partition field. In your
example, if you change the precombine field to "price", it will not fail.
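For reference, a minimal sketch of the kind of write configuration that hits
this (the "ts" and "price" field names come from the reproducer; the record
key, table name, and base path are assumed placeholders):

{code:scala}
// Assumed: df is a DataFrame with "uuid", "ts", and "price" columns.
// Using "ts" for both precombine and partition triggers the
// "Field: ts does not exist in the table schema" error on read:
df.write.format("hudi").
  option("hoodie.datasource.write.recordkey.field", "uuid").
  option("hoodie.datasource.write.precombine.field", "ts").     // same field...
  option("hoodie.datasource.write.partitionpath.field", "ts").  // ...used twice
  option("hoodie.table.name", "test_tbl").
  mode("append").
  save(basePath)

// Workaround: precombine on a different column, e.g.
//   option("hoodie.datasource.write.precombine.field", "price")
{code}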
> Fix field does not exist error with HoodieFileGroupReader
> ---------------------------------------------------------
>
> Key: HUDI-8609
> URL: https://issues.apache.org/jira/browse/HUDI-8609
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Lokesh Jain
> Priority: Blocker
> Fix For: 1.0.0
>
>
> [https://github.com/lokeshj1703/hudi/commit/04ea4757cc739c91251015fb55b223fedea54155]
> reproduces the error in a test.
> The error happens when we insert the same record into the table again.
> {code:java}
> Caused by: java.lang.IllegalArgumentException: Field: ts does not exist in
> the table schema
> at
> org.apache.hudi.common.table.read.HoodieFileGroupReaderSchemaHandler.generateRequiredSchema(HoodieFileGroupReaderSchemaHandler.java:165)
> at
> org.apache.hudi.common.table.read.HoodieFileGroupReaderSchemaHandler.prepareRequiredSchema(HoodieFileGroupReaderSchemaHandler.java:206)
> at
> org.apache.hudi.common.table.read.HoodieFileGroupReaderSchemaHandler.<init>(HoodieFileGroupReaderSchemaHandler.java:95)
> at
> org.apache.hudi.common.table.read.HoodieFileGroupReader.<init>(HoodieFileGroupReader.java:118)
> at
> org.apache.spark.sql.execution.datasources.parquet.HoodieFileGroupReaderBasedParquetFileFormat.$anonfun$buildReaderWithPartitionValues$3(HoodieFileGroupReaderBasedParquetFileFormat.scala:184)
> at
> org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:209)
> at
> org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:270)
> at
> org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:116)
> at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
> at
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown
> Source)
> {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)