big-doudou commented on PR #9182: URL: https://github.com/apache/hudi/pull/9182#issuecomment-1651615360
> > How does this affect metadata cleaning?
>
> It removes the preceding partial metadata if there is any.

Before the checkpoint completes, BucketStreamWrite flushes its buffered records. At that moment, the TM restarts.

Step1: TM RUNNING -> CANCELING
Step2: JM -> subtaskFailed() sets eventBuffer = null
Step3: TM restarts and sends a bootstrap event to the coordinator
Step4: JM handles the bootstrap event to clean the metadata

JM -> subtaskFailed(): only the event from before the checkpoint is lost, so it does not affect the process after the restart.

https://github.com/apache/hudi/blob/2a0223933884cb044e7aa56f205cae926358a030/hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/common/AbstractStreamWriteFunction.java#L223

The case here is that the failover causes the bootstrap event not to be sent.
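For readers following the thread, the failover sequence in this comment can be sketched roughly as below. This is only an illustrative model, not Hudi's coordinator: the class `CoordinatorSketch`, the methods `handleWriteEvent`, `handleBootstrapEvent`, `bufferedEvent` and `partialMetadataCount`, and the use of plain strings for events are all hypothetical. Only the names `subtaskFailed` and `eventBuffer` come from the comment, and the sketch assumes the buffer holds one slot per subtask.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the JM-side write coordinator; not Hudi's actual class. */
class CoordinatorSketch {
    // Assumption: one slot per subtask, holding its latest uncommitted write event.
    private final String[] eventBuffer;
    // Assumption: partial metadata written before the checkpoint completed.
    private final List<String> partialMetadata = new ArrayList<>();

    CoordinatorSketch(int parallelism) {
        this.eventBuffer = new String[parallelism];
    }

    // A subtask flushed its buffer before the checkpoint completed.
    void handleWriteEvent(int subtask, String event) {
        eventBuffer[subtask] = event;
        partialMetadata.add(event);
    }

    // Step2: on failure, drop the buffered event of the failed subtask.
    // Only the pre-checkpoint event is lost; other subtasks are untouched.
    void subtaskFailed(int subtask) {
        eventBuffer[subtask] = null;
    }

    // Step4: the bootstrap event of a restarted subtask triggers metadata cleaning.
    void handleBootstrapEvent(int subtask) {
        eventBuffer[subtask] = null;
        partialMetadata.clear();
    }

    String bufferedEvent(int subtask) {
        return eventBuffer[subtask];
    }

    int partialMetadataCount() {
        return partialMetadata.size();
    }

    public static void main(String[] args) {
        CoordinatorSketch coordinator = new CoordinatorSketch(2);
        coordinator.handleWriteEvent(0, "flush-before-checkpoint");
        coordinator.subtaskFailed(0);        // Step2
        System.out.println("buffered after failure: " + coordinator.bufferedEvent(0));
        coordinator.handleBootstrapEvent(0); // Step3 + Step4
        System.out.println("partial metadata left: " + coordinator.partialMetadataCount());
    }
}
```

In this model, the stale event is already gone after `subtaskFailed`, so the buffer stays clean even when the failover prevents the bootstrap event from being sent, which is the case the comment is concerned with.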
