Mark Payne created NIFI-9239:
--------------------------------

             Summary: Create processor to run Stateless dataflow, enabling 
Kafka's exactly once semantics
                 Key: NIFI-9239
                 URL: https://issues.apache.org/jira/browse/NIFI-9239
             Project: Apache NiFi
          Issue Type: New Feature
          Components: Extensions, NiFi Stateless
            Reporter: Mark Payne
            Assignee: Mark Payne


There have been many requests for the ability to create a dataflow in NiFi that 
makes use of Kafka's "Exactly Once Semantics" (EOS). While there are benefits 
to being able to do so, the requirements that Kafka puts forth don't really 
work well with NiFi's architecture.

However, it would make a lot of sense to run a NiFi dataflow using Stateless 
NiFi, in a manner that could support EOS.

To do so, we would need to update the consume and publish processors to 
support exactly-once semantics. The Kafka consumer would need to be capable 
of not committing its offsets, and the publisher would need to recognize that 
this is the case and commit those offsets as part of its own commit.

This would require that all messages for the transaction be sent to 
PublishKafka(Record) as a single group, but that is possible with the Batch 
Output mode of Process Groups.
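
As a rough illustration of why the batch and the consumer offsets must be 
committed together, here is a minimal in-memory sketch. It does not use the 
Kafka client or any NiFi API: Broker, Message, and runBatch are hypothetical 
names invented for this example, and the upper-casing step is only a 
placeholder for whatever the dataflow does. With the real Kafka Java client, 
the corresponding step would be along the lines of 
KafkaProducer.sendOffsetsToTransaction followed by commitTransaction.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** In-memory model of a consume-transform-publish transaction; NOT the Kafka API. */
final class EosSketch {

    /** Hypothetical broker: an output log plus the consumer's committed offsets. */
    static final class Broker {
        final List<String> outputLog = new ArrayList<>();
        final Map<Integer, Long> committedOffsets = new HashMap<>();

        /** Applies the records and the offsets together, or neither if it fails. */
        void commitTransaction(List<String> records, Map<Integer, Long> offsets, boolean fail) {
            if (fail) {
                throw new IllegalStateException("transaction aborted");
            }
            outputLog.addAll(records);
            committedOffsets.putAll(offsets);
        }
    }

    /** One consumed message: source partition, offset, and payload. */
    static final class Message {
        final int partition;
        final long offset;
        final String value;

        Message(int partition, long offset, String value) {
            this.partition = partition;
            this.offset = offset;
            this.value = value;
        }
    }

    /** Handles the whole batch as one transaction; returns true if it committed. */
    static boolean runBatch(Broker broker, List<Message> batch, boolean simulateFailure) {
        List<String> outputs = new ArrayList<>();
        Map<Integer, Long> nextOffsets = new HashMap<>();
        for (Message m : batch) {
            // Placeholder transformation standing in for the dataflow.
            outputs.add(m.value.toUpperCase(Locale.ROOT));
            // Track the next offset to read per partition; the consumer itself commits nothing.
            nextOffsets.merge(m.partition, m.offset + 1, Math::max);
        }
        try {
            broker.commitTransaction(outputs, nextOffsets, simulateFailure);
            return true;
        } catch (IllegalStateException e) {
            // Nothing was published and no offsets moved, so the batch can be consumed again.
            return false;
        }
    }

    public static void main(String[] args) {
        Broker broker = new Broker();
        List<Message> batch = Arrays.asList(new Message(0, 5L, "a"), new Message(0, 6L, "b"));
        System.out.println(runBatch(broker, batch, true) + " " + broker.outputLog + " " + broker.committedOffsets);
        System.out.println(runBatch(broker, batch, false) + " " + broker.outputLog + " " + broker.committedOffsets);
    }
}
```

In this model a failed attempt leaves both the output log and the committed 
offsets untouched, so retrying the same batch does not produce duplicates. If 
the consumer had committed its offsets on its own, a failure between the two 
commits would either lose or duplicate messages.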

While this is possible, it leaves a concern about the ease of running a 
Stateless flow with NiFi. While it can be run from the command line, we should 
also build a Processor that is capable of fetching a dataflow (from file, 
registry, etc.) and running that flow as a Processor within NiFi. This offers 
additional advantages as well, such as the ability to perform a file listing 
in NiFi, which is persisted, and then process it with Stateless NiFi.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
