Hi Swapnil,
       We ran into this problem once. I think moving the checkpoint directory
to HDFS and keeping the sink directory on S3, with EMRFS consistent view
enabled, solves it. If you are not using EMR, I don't know how else it can be
solved. In a nutshell, EMRFS consistent view uses DynamoDB on the back end to
track every file written to S3. That effectively makes S3 consistent, and
StreamingFileSink works just fine.
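
As a rough sketch of the setup described above (not a verified config): the
HDFS paths below are placeholders I made up, and the EMRFS property goes into
the emrfs-site classification on the EMR cluster, not into flink-conf.yaml.
The S3 sink path itself is still set in the job code where the
StreamingFileSink is built.

```yaml
# flink-conf.yaml -- keep checkpoints and savepoints on HDFS
# (placeholder paths, adjust to your cluster)
state.checkpoints.dir: hdfs:///flink/checkpoints
state.savepoints.dir: hdfs:///flink/savepoints

# EMR side (emrfs-site classification, NOT flink-conf.yaml):
#   fs.s3.consistent: true
# This turns on EMRFS consistent view, which tracks S3 objects in DynamoDB.
```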



On Sat, Aug 17, 2019, 3:32 AM Swapnil Kumar <swku...@zendesk.com> wrote:

> Hello, We are using Flink to process input events, aggregate them, and write
> the output of our streaming job to S3 using StreamingFileSink, but whenever
> we try to restore the job from a savepoint, the restoration fails with a
> missing part files error. As per my understanding, S3 deletes those
> part (intermediate) files, so they can no longer be found on S3. Is there a
> workaround for this, so that we can use S3 as a sink?
>
> --
> Thanks,
> Swapnil Kumar
>
