Thanks Joe. This is really helpful.

On Tue, Aug 11, 2020 at 9:33 AM Joe Witt <[email protected]> wrote:
> Asmath
>
> In a traditional installation, regardless of how a NiFi cluster obtains
> data (Kafka, FTP, HTTP calls, TCP listening, etc.), once it is
> responsible for the data it has ack'd its receipt to the source(s).
>
> If that NiFi node were to go offline, the data it owns is delayed. If
> that node becomes unrecoverably offline, the data is likely to be lost.
>
> If you're going to run in environments with more powerful storage
> alignment options, as in many Kubernetes-based deployments, then there
> are definitely options to solve the possibility-of-loss case to a very
> high degree and to ensure there is only minimal data delay in the worst
> case.
>
> In a Hadoop-style environment, though, the traditional model I describe
> works very well, leverages appropriate RAID, and is proven highly
> reliable and durable.
>
> Thanks
>
> On Tue, Aug 11, 2020 at 7:26 AM KhajaAsmath Mohammed <
> [email protected]> wrote:
>
>> Hi,
>>
>> [image: image.png]
>>
>> We have a 3-node NiFi cluster, and for some reason NODE 2 and NODE 3
>> were disconnected while the flow was running. ConsumeKafka was reading
>> data on all nodes and loading the data into the database.
>>
>> In the above scenario, is there a possibility of data loss? Distributed
>> processing in Hadoop handles this automatically and assigns the task to
>> other active nodes. Will it be the same case with the NiFi cluster?
>>
>> Thanks,
>> Asmath
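To make Joe's distinction concrete: *new* Kafka data is not lost when nodes disconnect, because each ConsumeKafka instance joins the same consumer group and Kafka rebalances partitions onto the surviving members. What is at risk is data a disconnected node already ack'd and holds in its local repositories. A minimal sketch of round-robin partition assignment (node names and partition counts are hypothetical, for illustration only; real Kafka assignment happens broker-side):

```python
def assign_partitions(partitions, consumers):
    """Round-robin assignment of topic partitions to consumer-group
    members, similar in spirit to Kafka's RoundRobinAssignor."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(sorted(partitions)):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

partitions = list(range(6))  # a topic with 6 partitions

# All three NiFi nodes connected: partitions are spread across them.
before = assign_partitions(partitions, ["node1", "node2", "node3"])

# node2 and node3 disconnect: on rebalance, node1 takes everything,
# so new records keep flowing. Data already sitting in node2/node3's
# FlowFile and content repositories stays there until they rejoin.
after = assign_partitions(partitions, ["node1"])
```

Here `before["node1"]` is `[0, 3]` while `after["node1"]` is all six partitions, which is why consumption continues but already-ack'd data on the offline nodes is only delayed (or lost if the node is unrecoverable).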
