We ended up building a simple Groovy processor that uses a MySQL database to queue up flowfiles. If flowfile A fails, flowfile B sits in the queue until we address the issue with flowfile A. We also used the back pressure feature to slow down the upstream Kafka consumers.
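
The queueing idea can be sketched roughly as follows. This is an illustration only, not the Groovy processor itself: it uses Python with the standard-library sqlite3 module as a stand-in for MySQL, and the table name flow_queue, the status values and the helper functions are all hypothetical.

```python
import sqlite3


def open_queue():
    # In-memory SQLite stands in for the MySQL database (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE flow_queue ("
        " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'PENDING')"
    )
    return conn


def enqueue(conn, payload):
    # Arrival order is captured by the auto-incrementing seq column.
    cur = conn.execute("INSERT INTO flow_queue (payload) VALUES (?)", (payload,))
    conn.commit()
    return cur.lastrowid


def next_ready(conn):
    # Look only at the oldest entry that is not DONE. If that entry has
    # FAILED, everything behind it stays queued and None is returned.
    row = conn.execute(
        "SELECT seq, payload, status FROM flow_queue"
        " WHERE status != 'DONE' ORDER BY seq LIMIT 1"
    ).fetchone()
    if row is None or row[2] == "FAILED":
        return None
    return row[0], row[1]


def mark(conn, seq, status):
    # status is one of 'PENDING', 'FAILED' or 'DONE' (hypothetical values).
    conn.execute("UPDATE flow_queue SET status = ? WHERE seq = ?", (status, seq))
    conn.commit()
```

With this shape, marking A as FAILED makes next_ready return nothing, so B is never handed out ahead of A. Resetting A to PENDING after the issue is fixed lets processing resume in the original order.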
After playing with Wait/Notify we found it extremely difficult and cumbersome. EnforceOrder was not really doing much for us either. Our use case was to process Kafka messages in order on a 3-node NiFi cluster. The custom processor worked really well for us in the end.

On Thu, Apr 1, 2021, 03:35 Van Autreve Dries <[email protected]> wrote:

> Hello all
>
> We recently started using NiFi and we were wondering if strict order of
> processing flow files in a cluster could be guaranteed by NiFi.
>
> One of the use cases is as following: messages arrive in a specific order,
> go through a simple flow with some basic transformations and are written to
> the destination (usually a relational database). The source of the messages
> can be a database, Kafka queue, …
> It’s important that messages are written to the destination in exactly the
> same order they arrived at NiFi. The reason is that messages could be
> deltas and we do not want to overwrite newer data with older deltas.
> Moreover we do not always control the message format, hence controlling
> this from the messaging protocol point of view might not be possible.
>
> We did some research in various places but have not found a satisfying
> answer. Our own investigations have revealed that:
> - Just running the first processor on the primary node is not enough even
> with a load balancing strategy “single node”. While testing with stopping /
> starting the primary node we had some situations were messages got out of
> order.
> - Using the EnforceOrder processor with high timeouts prevented the
> messages getting processed out of order, but each time the primary node
> changes, manual intervention is required to reconfigure the initial order
> property. Moreover it requires that the source system or first processor
> provides this incrementing sequence attribute.
>
> It seems also not possible to pinpoint a flow to a specific node. At least
> we have not found this option. We do understand that this would affect
> scalability and availability or failover, but might be acceptable for those
> specific cases.
>
> If there are other options we can explore, any input would be helpful.
> Or if it’s not (easily) possible with NiFi on its own, it would be good to
> know!
>
> --
> Kind Regards
> Dries Van Autreve
>
> (Sorry if this will result in a double post. I was not yet subscribed when
> I did the first post and my message does not seem to appear in the list...)
