Thanks Joe. This is really helpful.

On Tue, Aug 11, 2020 at 9:33 AM Joe Witt <[email protected]> wrote:
> Asmath
>
> In a traditional installation, regardless of how a NiFi cluster obtains
> data (Kafka, FTP, HTTP calls, TCP listening, etc.), once it is
> responsible for the data it has ack'd its receipt to the source(s).
>
> If that NiFi node were to go offline, the data it owns is delayed. If
> that node becomes unrecoverably offline, the data is likely to be lost.
>
> If you're going to run in environments with more powerful storage
> alignment options, as in many Kubernetes-based deployments, then there
> are definitely options to solve the possibility-of-loss case to a very
> high degree and to ensure there is only minimal data delay in the worst
> case.
>
> In a Hadoop-style environment, though, the traditional model I describe
> works very well, leverages appropriate RAID, and is proven highly
> reliable and durable.
>
> Thanks
>
> On Tue, Aug 11, 2020 at 7:26 AM KhajaAsmath Mohammed <
> [email protected]> wrote:
>
>> Hi,
>>
>> [image: image.png]
>>
>> We have a 3-node NiFi cluster, and for some reason NODE 2 and NODE 3
>> were disconnected while the flow was running. ConsumeKafka was reading
>> data on all nodes and loading the data into the database.
>>
>> In the above scenario, is there a possibility of data loss? Distributed
>> processing in Hadoop handles this automatically and assigns the task to
>> other active nodes. Will it be the same case with the NiFi cluster?
>>
>> Thanks,
>> Asmath
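To make Joe's distinction concrete: *new* Kafka data is not lost when nodes disconnect, because each ConsumeKafka instance joins the same consumer group and Kafka rebalances partitions onto the surviving members. What is at risk is data a disconnected node already ack'd and holds in its local repositories. A minimal sketch of round-robin partition assignment (node names and partition counts are hypothetical, for illustration only; real Kafka assignment happens broker-side):

```python
def assign_partitions(partitions, consumers):
    """Round-robin assignment of topic partitions to consumer-group
    members, similar in spirit to Kafka's RoundRobinAssignor."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(sorted(partitions)):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

partitions = list(range(6))  # a topic with 6 partitions

# All three NiFi nodes connected: partitions are spread across them.
before = assign_partitions(partitions, ["node1", "node2", "node3"])

# node2 and node3 disconnect: on rebalance, node1 takes everything,
# so new records keep flowing. Data already sitting in node2/node3's
# FlowFile and content repositories stays there until they rejoin.
after = assign_partitions(partitions, ["node1"])
```

Here `before["node1"]` is `[0, 3]` while `after["node1"]` is all six partitions, which is why consumption continues but already-ack'd data on the offline nodes is only delayed (or lost if the node is unrecoverable).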
