Thanks for your answer.

Aljoscha

On Tue, 8 Sep 2015 at 20:04 Johny Rufus <[email protected]> wrote:

> Your assumption is correct: duplicates can occur in a failure scenario.
>
> Thanks,
> Rufus
>
> On Tue, Sep 8, 2015 at 4:10 AM, Aljoscha Krettek <[email protected]>
> wrote:
>
>> Hi,
>> as I understand it, the HDFS sink uses the transaction system to verify
>> that all the elements in a transaction are written. This is what I would
>> call at-least-once semantics.
>>
>> My question is what happens if writing fails partway through the
>> elements of a transaction. When the transaction is retried, some of the
>> elements might be written again, i.e. the output contains duplicates.
>> Is this assumption correct, or is there something in place that
>> prevents this from happening?
>>
>> Thanks for your time,
>> Aljoscha
>>
>
>

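For readers of the archive, the failure scenario discussed in this thread can be sketched with a toy model. This is not the Flume HDFS sink API: FlakySink, deliver, fail_after and max_attempts are hypothetical names made up for illustration, and the sketch only assumes that a failed transaction is retried as a whole while events already written to the file are not undone.

```python
class FlakySink:
    """Toy stand-in for a file sink (hypothetical, not Flume's HDFS sink)."""

    def __init__(self, fail_after):
        self.output = []              # what ended up in the "file"
        self.fail_after = fail_after  # fail once after this many writes
        self.failed_once = False

    def write(self, event):
        # Simulate a single failure in the middle of a batch.
        if not self.failed_once and len(self.output) == self.fail_after:
            self.failed_once = True
            raise IOError("simulated write failure")
        self.output.append(event)


def deliver(batch, sink, max_attempts=3):
    """Retry the whole batch until every event is written (at-least-once)."""
    for _ in range(max_attempts):
        try:
            for event in batch:
                sink.write(event)
            return True   # "commit": the batch is considered delivered
        except IOError:
            # "rollback": the batch will be retried from the start, but
            # events already appended to the output are not removed.
            continue
    return False


sink = FlakySink(fail_after=2)
delivered = deliver(["a", "b", "c"], sink)
print(sink.output)  # ['a', 'b', 'a', 'b', 'c']
```

Every event is present at least once, but "a" and "b" appear twice because they were written before the simulated failure and again on the retry. Avoiding this would need something extra on top of the retry, for example deduplicating downstream by a unique event ID.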