[
https://issues.apache.org/jira/browse/HUDI-3105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shuo Cheng resolved HUDI-3105.
------------------------------
> flink bootstrap cause Invalid Hoodie Table error
> ------------------------------------------------
>
> Key: HUDI-3105
> URL: https://issues.apache.org/jira/browse/HUDI-3105
> Project: Apache Hudi
> Issue Type: Bug
> Components: flink
> Reporter: konwu
> Priority: Major
>
> Steps to reproduce:
> * Start a Flink task with the bootstrap index enabled.
> * Hit an error before the first successful checkpoint.
> * Restart the task, again with the bootstrap index enabled.
> The restarted task then fails with:
> org.apache.hudi.exception.InvalidTableException: Invalid Hoodie Table. viewfs://xx/xx/flight_order_info
> at org.apache.hudi.common.table.TableSchemaResolver.lambda$getTableParquetSchemaFromDataFile$0(TableSchemaResolver.java:88)
> at org.apache.hudi.common.util.Option.orElseThrow(Option.java:123)
> at org.apache.hudi.common.table.TableSchemaResolver.getTableParquetSchemaFromDataFile(TableSchemaResolver.java:88)
> at org.apache.hudi.common.table.TableSchemaResolver.getTableAvroSchemaFromDataFile(TableSchemaResolver.java:153)
> at org.apache.hudi.common.table.TableSchemaResolver.getTableAvroSchema(TableSchemaResolver.java:187)
> at org.apache.hudi.common.table.TableSchemaResolver.getTableAvroSchema(TableSchemaResolver.java:163)
> at org.apache.hudi.sink.bootstrap.BootstrapFunction.loadRecords(BootstrapFunction.java:160)
> at org.apache.hudi.sink.bootstrap.BootstrapFunction.processElement(BootstrapFunction.java:110)
> at org.apache.flink.streaming.api.operators.ProcessOperator.processElement(ProcessOperator.java:66)
> at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:187)
> at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:204)
> at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:174)
> at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
> at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:395)
> at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:191)
> at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:609)
> at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:573)
> at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:755)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:570)
> at java.lang.Thread.run(Thread.java:748)
>
> To resolve this, see:
> [https://github.com/apache/hudi/blob/c81df99e50f2df84d85f08ff3a839595dad974d7/hudi-flink/src/main/java/org/apache/hudi/sink/bootstrap/BootstrapOperator.java#L183]
>
> Maybe we need to move the getTableAvroSchema call inside the if condition.
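
The suggestion to move the schema lookup inside the if condition can be illustrated with a minimal, self-contained sketch. It does not use the Hudi API. The report does not say which condition is meant, so the sketch assumes it is "a completed commit exists". The names LazySchemaBootstrap, resolveSchema, and loadRecords here are hypothetical stand-ins, not the actual Hudi classes or methods.

```java
import java.util.Optional;
import java.util.function.Supplier;

public class LazySchemaBootstrap {

    /**
     * Hypothetical stand-in for a schema resolver that throws when the
     * table has no data files yet (as after a failure before the first checkpoint).
     */
    public static String resolveSchema(boolean tableHasDataFiles) {
        if (!tableHasDataFiles) {
            throw new IllegalStateException("Invalid Hoodie Table.");
        }
        return "avro-schema";
    }

    /**
     * Returns 1 if index records were loaded and 0 if loading was skipped.
     * The schema supplier is only invoked inside the if branch, so a table
     * without a completed commit never triggers schema resolution.
     */
    public static int loadRecords(Optional<String> latestCompletedCommit,
                                  Supplier<String> schemaSupplier) {
        if (latestCompletedCommit.isPresent()) {
            // Assumed condition: resolve the schema only once a commit exists.
            String schema = schemaSupplier.get();
            // A real implementation would read base files with this schema here.
            return schema.isEmpty() ? 0 : 1;
        }
        return 0;
    }

    public static void main(String[] args) {
        // No completed commit: the resolver is never called, so nothing is thrown.
        int skipped = loadRecords(Optional.empty(), () -> resolveSchema(false));
        System.out.println("loaded without commit: " + skipped);

        // A completed commit exists: the schema is resolved and records are loaded.
        int loaded = loadRecords(Optional.of("commit-1"), () -> resolveSchema(true));
        System.out.println("loaded with commit: " + loaded);
    }
}
```

Passing the resolver as a Supplier keeps the call lazy; calling it eagerly before the check would reproduce the failure pattern described in the report.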