Mark Payne created NIFI-9239:
--------------------------------
Summary: Create processor to run Stateless dataflow, enabling
Kafka's exactly once semantics
Key: NIFI-9239
URL: https://issues.apache.org/jira/browse/NIFI-9239
Project: Apache NiFi
Issue Type: New Feature
Components: Extensions, NiFi Stateless
Reporter: Mark Payne
Assignee: Mark Payne
There have been many requests for the ability to create a dataflow in NiFi that
makes use of Kafka's "Exactly Once Semantics" (EOS). While there are benefits
to being able to do so, the requirements that Kafka puts forth don't really
work well with NiFi's architecture.
However, it would make a lot of sense to run a NiFi dataflow using Stateless
NiFi, in a manner that could support EOS.
To do so, we would need to update the consume & publish processors to
support exactly-once semantics. The Kafka consumer would need to be capable
of not committing its offsets, and the publisher would need to be aware that
the offsets are uncommitted and commit them as part of its own transaction.
This would require that all messages for the transaction be sent to
PublishKafka(Record) as a single group, but that is possible with Batch Output
mode of Process Groups.
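The consume/publish handshake described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: InMemoryBroker, poll, commit_transaction and run_once are hypothetical names invented for illustration, not NiFi or Kafka APIs. In the real Kafka Java client, the closest equivalent of the atomic step is the producer's sendOffsetsToTransaction call followed by commitTransaction.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryBroker:
    """Hypothetical stand-in for a Kafka cluster; not a real client API."""
    input_records: list
    output_records: list = field(default_factory=list)
    committed_offset: int = 0  # next input offset the consumer group will read

    def poll(self, max_records):
        # Read from the last committed offset WITHOUT committing anything.
        start = self.committed_offset
        return start, self.input_records[start:start + max_records]

    def commit_transaction(self, outputs, next_offset):
        # Published records and the consumer offset become visible together.
        self.output_records.extend(outputs)
        self.committed_offset = next_offset


def run_once(broker, transform, max_records=100):
    """One trigger of a consume -> transform -> publish flow."""
    start, batch = broker.poll(max_records)
    if not batch:
        return 0
    # If transform raises, nothing has been published and the offset has not
    # moved, so the same batch is read again on the next trigger.
    outputs = [transform(record) for record in batch]
    # The whole batch reaches the publisher as a single group.
    broker.commit_transaction(outputs, start + len(batch))
    return len(batch)
```

In this model a failed run leaves both the output and the committed offset untouched, so a retry reprocesses the same batch without producing duplicates. Handing the publisher the complete batch at once mirrors what Batch Output mode would provide for the publish step.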
While this is possible, it leaves a concern about the ease of running a
Stateless flow with NiFi. While it can be run from the command line, we should
also build a Processor that is capable of fetching a dataflow (from file,
registry, etc.) and running that flow as a Processor within NiFi. This offers
additional advantages as well, such as the ability to perform a file listing
in NiFi, which is persisted, and then process it with Stateless NiFi.