We ended up building a simple Groovy processor that uses a MySQL database
to queue up flowfiles. If flowfile A fails, flowfile B sits in the queue
until we resolve the issue with flowfile A. We also used the back pressure
feature to slow down the upstream Kafka consumers.
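
As a rough illustration of that queueing idea (not the actual Groovy
processor): the sketch below uses Python with SQLite standing in for MySQL.
The table name flowfile_queue and the functions open_queue, enqueue and
drain are made up for the example. Rows are handled strictly in arrival
order, and a failure at the head of the queue blocks everything behind it.

```python
import sqlite3


def open_queue(path=":memory:"):
    """Open the queue table (SQLite here; the real setup used MySQL)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS flowfile_queue ("
        " seq INTEGER PRIMARY KEY AUTOINCREMENT,"  # arrival order
        " payload TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'PENDING')"
    )
    return conn


def enqueue(conn, payload):
    """Append a message to the tail of the queue and return its sequence number."""
    cur = conn.execute(
        "INSERT INTO flowfile_queue (payload) VALUES (?)", (payload,)
    )
    conn.commit()
    return cur.lastrowid


def drain(conn, handler):
    """Process queued messages in order; stop at the first failure.

    A failed message stays at the head of the queue, so later messages
    wait until a subsequent call manages to process it.
    """
    processed = []
    while True:
        row = conn.execute(
            "SELECT seq, payload FROM flowfile_queue"
            " WHERE status != 'DONE' ORDER BY seq LIMIT 1"
        ).fetchone()
        if row is None:
            break  # queue is empty
        seq, payload = row
        try:
            handler(payload)
        except Exception:
            conn.execute(
                "UPDATE flowfile_queue SET status = 'FAILED' WHERE seq = ?",
                (seq,),
            )
            conn.commit()
            break  # hold back everything queued behind the failure
        conn.execute(
            "UPDATE flowfile_queue SET status = 'DONE' WHERE seq = ?", (seq,)
        )
        conn.commit()
        processed.append(payload)
    return processed
```

Calling drain again after the problem is fixed retries the failed message
first and then continues with the ones behind it. On a multi-node cluster
something would also have to ensure that only one node drains at a time
(for example a row lock in MySQL); the sketch leaves that out.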

After experimenting with Wait/Notify, we found it extremely difficult and
cumbersome to use. EnforceOrder did not do much for us either. Our use
case was to process Kafka messages in order on a 3-node NiFi cluster.

The Groovy/MySQL approach worked really well for us in the end.

On Thu, Apr 1, 2021, 03:35 Van Autreve Dries <[email protected]>
wrote:

> Hello all
>
> We recently started using NiFi and we were wondering if strict order of
> processing flow files in a cluster could be guaranteed by NiFi.
>
> One of the use cases is as follows: messages arrive in a specific order,
> go through a simple flow with some basic transformations and are written to
> the destination (usually a relational database). The source of the messages
> can be a database, Kafka queue, …
> It’s important that messages are written to the destination in exactly the
> same order they arrived at NiFi. The reason is that messages could be
> deltas and we do not want to overwrite newer data with older deltas.
> Moreover we do not always control the message format, hence controlling
> this from the messaging protocol point of view might not be possible.
>
> We did some research in various places but have not found a satisfying
> answer. Our own investigations have revealed that:
> - Just running the first processor on the primary node is not enough even
> with a load balancing strategy “single node”. While testing with stopping /
> starting the primary node we had some situations where messages got out of
> order.
> - Using the EnforceOrder processor with high timeouts prevented the
> messages getting processed out of order, but each time the primary node
> changes, manual intervention is required to reconfigure the initial order
> property. Moreover it requires that the source system or first processor
> provides this incrementing sequence attribute.
>
> It also does not seem possible to pin a flow to a specific node. At least
> we have not found this option. We do understand that this would affect
> scalability and availability or failover, but might be acceptable for those
> specific cases.
>
> If there are other options we can explore, any input would be helpful.
> Or if it’s not (easily) possible with NiFi on its own, it would be good to
> know!
>
> --
> Kind Regards
> Dries Van Autreve
>
>
> (Sorry if this will result in a double post. I was not yet subscribed when
> I did the first post and my message does not seem to appear in the list...)
>
>
