David Handermann created NIFI-11259:
---------------------------------------

             Summary: Redesign Kafka Component Integration
                 Key: NIFI-11259
                 URL: https://issues.apache.org/jira/browse/NIFI-11259
             Project: Apache NiFi
          Issue Type: Epic
          Components: Extensions
            Reporter: David Handermann
            Assignee: David Handermann


NiFi has supported integration with Kafka across a number of major release 
versions. As Kafka has continued to release new major versions, NiFi 
integration has followed a general pattern of duplicating existing code, 
updating Kafka libraries, and making small modifications. Although this 
approach has allowed NiFi to support a range of Kafka versions, it has made 
Kafka components difficult to maintain.

A new approach should be designed and implemented that abstracts access to 
Kafka library components. This approach should provide more maintainable Kafka 
Processors, and also define a better path to support new major Kafka releases.

The general approach should consist of a NiFi Controller Service and supporting 
classes that abstract Kafka operations for publishing and consuming messages. 
The Controller Service interface should not have any dependencies on Kafka 
libraries, making it independent of Kafka versions. The interface should define 
a contract for accessing Kafka Brokers so that Controller Service 
implementations will be responsible for connection and authentication settings. 
Controller Service implementations should be aligned to major Kafka versions, 
allowing minor dependency version updates without changing Controller Service 
implementations.
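
The contract described above might be sketched as follows. This is only an 
illustration under stated assumptions, not the proposed design: every type and 
method name below is hypothetical, and the in-memory class merely stands in for 
a version-specific implementation that would wrap the Kafka client libraries. 
In NiFi the top-level interface would presumably extend 
org.apache.nifi.controller.ControllerService; that is omitted here so the 
sketch compiles on its own. The point to notice is that nothing imports 
org.apache.kafka.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; none of these types exist in NiFi or Kafka.

/** Version-neutral representation of one Kafka Record. */
final class KafkaRecordSketch {
    final String topic;
    final byte[] key;
    final byte[] value;

    KafkaRecordSketch(String topic, byte[] key, byte[] value) {
        this.topic = topic;
        this.key = key;
        this.value = value;
    }
}

/** Publishing half of the contract. */
interface ProducerServiceSketch extends AutoCloseable {
    void send(KafkaRecordSketch record);

    @Override
    void close();
}

/** Consuming half of the contract. */
interface ConsumerServiceSketch {
    List<KafkaRecordSketch> poll(String topic);
}

/**
 * Contract a Processor would depend on. Connection and authentication
 * settings belong to the implementation, not to this interface.
 */
interface KafkaConnectionServiceSketch {
    ProducerServiceSketch getProducerService();

    ConsumerServiceSketch getConsumerService();
}

/** In-memory stand-in for an implementation aligned to one major Kafka version. */
final class InMemoryKafkaConnectionService implements KafkaConnectionServiceSketch {
    private final List<KafkaRecordSketch> log = new ArrayList<>();

    @Override
    public ProducerServiceSketch getProducerService() {
        return new ProducerServiceSketch() {
            @Override
            public void send(KafkaRecordSketch record) {
                log.add(record);
            }

            @Override
            public void close() {
                // Nothing to release in the in-memory stand-in.
            }
        };
    }

    @Override
    public ConsumerServiceSketch getConsumerService() {
        // Return every stored record whose topic matches the requested one.
        return topic -> {
            List<KafkaRecordSketch> matched = new ArrayList<>();
            for (KafkaRecordSketch record : log) {
                if (record.topic.equals(topic)) {
                    matched.add(record);
                }
            }
            return matched;
        };
    }
}
```

A Processor written against KafkaConnectionServiceSketch could be exercised in 
a unit test with the in-memory class, without a Broker or any Kafka library on 
the classpath.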

New Kafka Processors should be designed and implemented to use the Controller 
Service interface, without requiring knowledge of Kafka connection details. New 
Processors should support the same publishing and consuming capabilities as 
current Processors, with configurable strategies for Kafka Record handling. 
These strategies should include treating entire FlowFiles as Kafka Records, as 
well as processing FlowFiles using a configurable delimiter or using NiFi 
Record services.
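
Two of the Record handling strategies mentioned above, whole-FlowFile and 
delimiter-based, could be expressed roughly as below. This is a sketch with 
hypothetical names (RecordSplitterSketch, Strategy, toRecordValues), not code 
from the existing Processors. It assumes FlowFile content is already available 
as a byte array and that empty segments between delimiters are skipped, which 
is an assumption made for the example. The NiFi Record services strategy is 
not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; the names are illustrative and not part of NiFi.
final class RecordSplitterSketch {

    enum Strategy { WHOLE_FLOWFILE, DELIMITED }

    /** Turns FlowFile content into the values of one or more Kafka Records. */
    static List<byte[]> toRecordValues(byte[] content, Strategy strategy, byte[] delimiter) {
        List<byte[]> values = new ArrayList<>();
        if (strategy == Strategy.WHOLE_FLOWFILE || delimiter == null || delimiter.length == 0) {
            // Entire FlowFile content becomes a single Kafka Record value.
            values.add(content);
            return values;
        }
        int start = 0;
        int i = 0;
        while (i <= content.length - delimiter.length) {
            if (matchesAt(content, i, delimiter)) {
                // Assumption: empty segments are skipped rather than published.
                if (i > start) {
                    values.add(Arrays.copyOfRange(content, start, i));
                }
                i += delimiter.length;
                start = i;
            } else {
                i++;
            }
        }
        // Trailing content after the last delimiter, if any.
        if (start < content.length) {
            values.add(Arrays.copyOfRange(content, start, content.length));
        }
        return values;
    }

    private static boolean matchesAt(byte[] content, int offset, byte[] delimiter) {
        for (int j = 0; j < delimiter.length; j++) {
            if (content[offset + j] != delimiter[j]) {
                return false;
            }
        }
        return true;
    }
}
```

Keeping the strategy a configurable value rather than a separate Processor per 
strategy matches the "configurable strategies" wording above, though the 
actual property design is left open by this issue.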

New Kafka components should build on the pattern of the nifi-kafka-shared 
module to promote code reuse across major Kafka versions.

The new decoupled Kafka Processors and Controller Services should enable better 
unit testing. The Testcontainers library should be considered for integration 
testing with different Kafka runtime versions.

Documentation should be written to describe common migration scenarios from 
existing Kafka Processors to new components.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
