stream2000 commented on PR #9887:
URL: https://github.com/apache/hudi/pull/9887#issuecomment-1774674394

   > Do we have a conclusion that the data source V2 drop has no guarantee of 
recycling existing stray tasks?
   
   > Can you share more info on how you confirmed this?
   
   @danny0405 @boneanxs we can add a breakpoint at 
`org.apache.hudi.table.action.commit.BulkInsertDataInternalWriterHelper#abort` 
and run `TestInsertTable#Test Bulk Insert Into Consistent Hashing Bucket Index 
Table` locally, and we will see the following: 
   
   1. When `abort` is called, the log shows that Spark is trying to cancel 
the other write tasks: 
   
![image](https://github.com/apache/hudi/assets/39240496/bca5ffd9-e839-4354-ba49-476e471ff480)
   
   2. The tasks to be cancelled are still running (they are initializing the 
meta client): 
   
![image](https://github.com/apache/hudi/assets/39240496/5a9089fd-fde2-4593-85e6-47a975a0abce)
   
   So we can confirm that Spark DataSource V2 does not wait for all subtasks 
to finish before calling `BatchWrite.abort`.
   
   @boneanxs 
   >  can we solve it by cleaning partial files in 
   
   We can't, because partial files may still be written after `abort` has 
finished. I think we should clean up failed writes using the rollback 
mechanism after the whole Spark job has finished, which is common practice in Hudi.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
