big-doudou commented on issue #8892:
URL: https://github.com/apache/hudi/issues/8892#issuecomment-1632159235

   The following steps reproduce this error:
   1. Start the Flink job.
   2. Before the delta commit completes, wait until some log files have been written to disk, then kill the TaskManager (TM).
   3. Wait for a delta commit to complete, then cancel the job and restart it. The restart fails with the exception: duplicate file id
   
   Timeline before kill:
   2023-07-12 16:42 viewfs:///.hoodie/20230712164243784.deltacommit.inflight
   2023-07-12 16:42 viewfs:///.hoodie/20230712164243784.deltacommit.requested
   Files before kill:
   2023-07-12 16:45 viewfs:///.00000255-add0-4f0d-b367-f1f7954c7717_20230712164243784.log.1_5-64-0
   
   Timeline after restart:
   2023-07-12 16:58 viewfs:///.hoodie/20230712164243784.deltacommit
   2023-07-12 16:42 viewfs:///.hoodie/20230712164243784.deltacommit.inflight
   2023-07-12 16:42 viewfs:///.hoodie/20230712164243784.deltacommit.requested
   Files after restart:
   2023-07-12 16:50 viewfs:///.00000255-00fd-4f23-9373-d85e12686dd3_20230712164243784.log.1_5-64-1
   2023-07-12 16:45 viewfs:///.00000255-add0-4f0d-b367-f1f7954c7717_20230712164243784.log.1_5-64-0
   
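   You can see that instant 20230712164243784 is reused and that the file is duplicated: the log file written before the kill (16:45) is still present next to the one written after recovery (16:50).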
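
   The duplication in the listing above can be checked mechanically. The following is a minimal sketch, not part of Hudi: it assumes log file names have the shape `.<fileId>_<instant>.log.<version>_<writeToken>` (inferred from the two names shown) and that the part of the file ID before the first `-` identifies the bucket. The function name `find_duplicate_buckets` is hypothetical.

```python
import re
from collections import defaultdict

# Assumed name layout, inferred from the listing above (not taken from Hudi code):
#   .<fileId>_<instant>.log.<version>_<writeToken>
LOG_NAME = re.compile(
    r"^\.(?P<file_id>[0-9a-f-]+)_(?P<instant>\d+)\.log\.(?P<version>\d+)_(?P<token>[\d-]+)$"
)


def find_duplicate_buckets(names):
    """Return {(bucket_prefix, instant): sorted file IDs} for every bucket
    prefix that has more than one file ID under the same instant."""
    groups = defaultdict(set)
    for name in names:
        match = LOG_NAME.match(name)
        if match is None:
            continue  # not a log file in the assumed format
        file_id = match.group("file_id")
        # Assumption: the segment before the first '-' is the bucket prefix.
        bucket_prefix = file_id.split("-", 1)[0]
        groups[(bucket_prefix, match.group("instant"))].add(file_id)
    return {key: sorted(ids) for key, ids in groups.items() if len(ids) > 1}


# Base names of the two log files present after the restart.
files_after_restart = [
    ".00000255-00fd-4f23-9373-d85e12686dd3_20230712164243784.log.1_5-64-1",
    ".00000255-add0-4f0d-b367-f1f7954c7717_20230712164243784.log.1_5-64-0",
]
duplicates = find_duplicate_buckets(files_after_restart)
print(duplicates)
```

   Run against the files that exist after the restart, this reports prefix 00000255 under instant 20230712164243784 with two different file IDs. Run against the single file that existed before the kill, it reports nothing.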
   
   I located the error at the line linked below.
   When the job recovers from the failure automatically, the bootstrap event is not sent, so the old instant is not rolled back. As a result, the files temporarily written by instant 20230712164243784 are never cleaned up.
   
https://github.com/danny0405/hudi/blob/50712dceb582c0ebbce263dec4413c11b2e92ddd/hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/common/AbstractStreamWriteFunction.java#L216C1-L216C1
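
   The effect described above can be illustrated with a toy model. This is only a sketch of the reasoning, not Hudi code: the function `recover`, its parameters, and the dictionary representation of files are all hypothetical, and the only behavior it encodes is the one stated in this comment (a bootstrap event leads to a rollback of the pending instant; without it, nothing is cleaned up).

```python
def recover(pending_instant, files, bootstrap_event_sent):
    """Toy model of the restart path described above (not Hudi code).

    `files` maps a log file name to the instant that wrote it.
    Returns the files left on disk once the writer resumes.
    """
    if bootstrap_event_sent:
        # Rollback path: drop everything written under the pending instant.
        return {name: inst for name, inst in files.items() if inst != pending_instant}
    # Automatic recovery path: no bootstrap event, so no rollback happens and
    # the partially written file stays behind while the instant is reused.
    return dict(files)


instant = "20230712164243784"
on_disk = {
    ".00000255-add0-4f0d-b367-f1f7954c7717_20230712164243784.log.1_5-64-0": instant,
}

left_with_rollback = recover(instant, on_disk, bootstrap_event_sent=True)
left_without_rollback = recover(instant, on_disk, bootstrap_event_sent=False)
print(len(left_with_rollback), len(left_without_rollback))
```

   In the second case the stale file survives, and the next write under the same instant adds a second file beside it, which matches the listing after the restart.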

