ferenc-csaky commented on PR #25231:
URL: https://github.com/apache/flink/pull/25231#issuecomment-2312059662

   Thanks for the fix @gaborgsomogyi!
   
   As I tried to understand the problem we have here, I checked the history of 
what happened here before and why. The previous change that was made 
`S3RecoverableFsDataOutputStream#sync()` done in 
[FLINK-28513](https://issues.apache.org/jira/browse/FLINK-28513). The error 
msg. there states:
   ```
   S3RecoverableFsDataOutputStream cannot sync state to S3. Use persist() to 
create a persistent recoverable intermediate point.
   ```
   If we check the [`persist()` 
implementation](https://github.com/apache/flink/blob/9bcd8f4b8f48c8d9ad05575b60779c9216ee4965/flink-filesystems/flink-s3-fs-base/src/main/java/org/apache/flink/fs/s3/common/writer/S3RecoverableFsDataOutputStream.java#L158),
 it will upload the local part to S3, but will not enforce closure of that 
local part file, neither will commit the snapshotted state. IIUC committing is 
crucial inside `sync()`, because it is a void method, so if we would simply 
call `persist()` inside, we would lose the `S3Recoverable` object it returns, 
and the ability to resume correctly in case of a failure. So without changing 
the interfaces, it is necessary to commit here and make sure we have a 
consistent state before we leave the `sync()` method and release the lock.
   
   Based on these conclusions, if the current solution performs good based 
according to the actual usage, the fix itself LGTM.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to