I am using Flink 1.13.1 on AWS EMR 6.4.  I have an existing application
using the DataStream API that I would like to modify to write output to S3.  I
am testing the StreamingFileSink with a bounded input.  I have enabled
checkpointing.

A couple of questions:
1) When the program finishes, all the files remain .inprogress.  Is that the
behavior described in "Important Note 2" in the documentation
<https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/connectors/datastream/streamfile_sink/>?
Is there a solution to this other than renaming the files myself?  I think
renaming the files in S3 could be costly.

2) If I use a deprecated method such as DataStream.writeAsText(), is it
guaranteed to write *all* the records from the stream, as long as the job
does not fail?  I understand checkpointing will not be effective here.

Thanks,
David
