galenwarren commented on pull request #15599: URL: https://github.com/apache/flink/pull/15599#issuecomment-818921856
Sorry for the long delay. A couple of additional questions came up that I wanted to ask:

- Does the license NOTICE file get generated automatically during build/deploy, or is that something I need to generate? I saw a script called ```collect_license_files.sh``` in the project, but I wasn't sure how to use it. Right now, there is no NOTICE.
- [RecoverableFsDataOutputStream.Committer](https://ci.apache.org/projects/flink/flink-docs-release-1.8/api/java/org/apache/flink/core/fs/RecoverableFsDataOutputStream.Committer.html) contains both ```commit``` and ```commitAfterRecovery```, and the descriptions say that the latter should be tolerant of situations where, say, the file has already been committed, which suggests that the former should not tolerate that situation. I've implemented it that way, but in thinking through some possible scenarios, it seems possible for a file to be written and committed (which deletes the temp files), and then for processing to be restarted from an earlier checkpoint/savepoint at which the recoverable write was still in progress. In that case, temporary files would continue to be written from that point on, but at commit time the commit would fail because some of the temporary files would already have been deleted.

  So I wasn't sure whether it made more sense to *always* look for the presence of the final file when committing with either method, not just with ```commitAfterRecovery```, so that the commit would not fail in that case. The cost would be an extra file read on every commit, to see whether the commit had already completed.

@xintongsong, thanks for your help so far and looking forward to your feedback.
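
To make the second question concrete, here is a minimal standalone sketch of the replay scenario. It does not use the Flink or GCS APIs: `InMemoryStore` and `SketchCommitter` are hypothetical names made up for illustration, and only the method names `commit` and `commitAfterRecovery` mirror `RecoverableFsDataOutputStream.Committer`. It also assumes that a commit combines the temp files into the final file and then deletes them, as described above.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory stand-in for the object store (assumption, not a real API).
final class InMemoryStore {
    private final Set<String> objects = new HashSet<>();

    void put(String name) { objects.add(name); }
    boolean exists(String name) { return objects.contains(name); }
    void delete(String name) { objects.remove(name); }
}

// Hypothetical committer; only the two method names follow Flink's Committer.
final class SketchCommitter {
    private final InMemoryStore store;
    private final List<String> tempNames;
    private final String finalName;

    SketchCommitter(InMemoryStore store, List<String> tempNames, String finalName) {
        this.store = store;
        this.tempNames = tempNames;
        this.finalName = finalName;
    }

    // Strict variant: fails if any temp file has already been deleted.
    void commit() {
        composeAndCleanUp();
    }

    // Tolerant variant: one extra lookup for the final file before doing any work.
    void commitAfterRecovery() {
        if (store.exists(finalName)) {
            return; // an earlier attempt already completed the commit
        }
        composeAndCleanUp();
    }

    private void composeAndCleanUp() {
        for (String temp : tempNames) {
            if (!store.exists(temp)) {
                throw new IllegalStateException("missing temp file: " + temp);
            }
        }
        store.put(finalName);          // stands in for composing the temp files
        for (String temp : tempNames) {
            store.delete(temp);        // commit deletes the temp files
        }
    }
}
```

If a first committer commits `tmp-1` and `tmp-2`, and a replay from an earlier checkpoint then rewrites only `tmp-2`, a second strict `commit()` throws because `tmp-1` is gone, while `commitAfterRecovery()` returns quietly. Doing the same lookup in `commit()` would be the "always check" option. Note that the tolerant path in this sketch leaves the rewritten `tmp-2` behind; cleaning that up is out of scope here.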
