Hi,

S3 deletes objects on its own only if 'lifecycle rules' [1] are defined on
the bucket. Could that be the case? If so, either disable the rule or extend
the object expiration period.

[1]
https://docs.aws.amazon.com/AmazonS3/latest/dev/object-lifecycle-mgmt.html
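
A minimal sketch of how you could check this, assuming you already have the
bucket's lifecycle configuration as a dict in the shape returned by boto3's
get_bucket_lifecycle_configuration. The function name risky_lifecycle_rules
and the sample prefix "flink/output/" are made up for illustration. It flags
enabled rules that expire objects, and also rules that abort incomplete
multipart uploads. As far as I know, the sink keeps in-progress part files as
multipart uploads, so such a rule may matter too. Tag-based filters are
ignored here.

```python
from typing import Any, Dict, List


def risky_lifecycle_rules(config: Dict[str, Any], key_prefix: str) -> List[str]:
    """List enabled lifecycle rules that could remove data under key_prefix.

    `config` is assumed to look like the response of
    boto3.client("s3").get_bucket_lifecycle_configuration(Bucket=...).
    """
    findings: List[str] = []
    for rule in config.get("Rules", []):
        if rule.get("Status") != "Enabled":
            continue

        # The prefix can live in Filter.Prefix, Filter.And.Prefix,
        # or the legacy top-level Prefix. Empty means "whole bucket".
        flt = rule.get("Filter") or {}
        rule_prefix = (
            flt.get("Prefix")
            or flt.get("And", {}).get("Prefix")
            or rule.get("Prefix")
            or ""
        )
        # Overlap check: the rule covers our path, or a sub-path of it.
        if not (key_prefix.startswith(rule_prefix)
                or rule_prefix.startswith(key_prefix)):
            continue

        rule_id = rule.get("ID", "<no id>")
        expiration = rule.get("Expiration", {})
        if "Days" in expiration:
            findings.append(
                f"{rule_id}: expires objects after {expiration['Days']} day(s)")
        elif "Date" in expiration:
            findings.append(
                f"{rule_id}: expires objects on {expiration['Date']}")

        abort = rule.get("AbortIncompleteMultipartUpload", {})
        if "DaysAfterInitiation" in abort:
            findings.append(
                f"{rule_id}: aborts incomplete multipart uploads after "
                f"{abort['DaysAfterInitiation']} day(s)")
    return findings


if __name__ == "__main__":
    # Hypothetical configuration, for illustration only.
    sample = {
        "Rules": [
            {
                "ID": "cleanup",
                "Status": "Enabled",
                "Filter": {"Prefix": "flink/"},
                "Expiration": {"Days": 7},
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 3},
            }
        ]
    }
    for line in risky_lifecycle_rules(sample, "flink/output/"):
        print(line)
```

If the list is non-empty for the prefix your sink writes to, those are the
rules to disable or extend.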

Thanks,
Rafi


On Sat, Aug 17, 2019 at 1:48 AM Oytun Tez <oy...@motaword.com> wrote:

> Hi Swapnil,
>
> I am not familiar with the StreamingFileSink; however, this sounds like a
> checkpointing issue to me. The FileSink should keep its sink state, and
> remove from the state only the files that it *really successfully* sinks
> (perhaps you may want to add a validation here with S3 to check file
> integrity). This leaves the failed files, partial files, etc. in the state.
>
>
>
> ---
> Oytun Tez
>
> *M O T A W O R D*
> The World's Fastest Human Translation Platform.
> oy...@motaword.com — www.motaword.com
>
>
> On Fri, Aug 16, 2019 at 6:02 PM Swapnil Kumar <swku...@zendesk.com> wrote:
>
>> Hello,
>>
>> We are using Flink to process and aggregate input events and write the
>> output of our streaming job to S3 using StreamingFileSink. However,
>> whenever we try to restore the job from a savepoint, the restoration
>> fails with a missing part files error. As per my understanding, S3 deletes
>> those part (intermittent) files, so they can no longer be found on S3. Is
>> there a workaround for this, so that we can use S3 as a sink?
>>
>> --
>> Thanks,
>> Swapnil Kumar
>>
>
