greyp9 opened a new pull request, #8463:
URL: https://github.com/apache/nifi/pull/8463

   # Background
   Previous iterations of support for Kafka client versions in NiFi (1.0, 2.0, 
2.6) duplicated code from existing Kafka processors into new Maven modules, 
adjusted Kafka client library dependencies for the new modules, and adjusted 
for API differences as needed. The original JIRA associated with NiFi support 
for Kafka 3 (`NIFI-9330`), followed this same approach. After discussion, a new 
approach was chosen, and a new JIRA (`NIFI-11259`) created.
   
   - Refactor Kafka client library dependencies into a new controller service.
   - Expose a service API that was agnostic of any particular Kafka version.
   - Create new processor implementations that interacted with Kafka through 
the service API.
   
   In particular, the new processor module should have no Kafka dependencies. 
This is expected to ease the transition to future Kafka versions.
   
   - A new 3.0++ controller service might be created to isolate any major API 
changes to the Kafka client APIs.
   - The new `PublishKafka` and `ConsumeKafka` processors would need minimal / 
no changes to enable interactivity with the new controller service.
   
   Other refactoring activities have been undertaken at the same time:
   
   - The new `PublishKafka` processor is intended as a replacement for the 
existing `PublishKafka_2_6` and `PublishKafkaRecord_2_6` processor pair. It 
provides FlowFile-based or record-based data handling modes, controlled via a 
per-processor property. Similarly, `ConsumeKafka` replaces both 
`ConsumeKafka_2_6` and `ConsumeKafkaRecord_2_6`. This design adjustment reduces 
the code duplication that was present in the 2.6 processor set.
   
   # New Modules
   - `nifi-kafka-service-api` - API contract for `KafkaConnectionService`, 
which exposes access to instances of `KafkaConsumerService` and 
`KafkaProducerService` in a manner agnostic to a particular version of Kafka
     - `KafkaProducerService` - intermediary logical service brokering 
interactions with the producer APIs of the Kafka client libraries
     - `KafkaConsumerService` - intermediary logical service brokering 
interactions with the producer APIs of the Kafka client libraries
   - `nifi-kafka-service-api-nar` - NiFi NAR wrapper for the 
`KafkaConnectionService` API contract
   
   - `nifi-kafka-3-service` - home for `Kafka3ConnectionService`, which 
abstracts Kafka dependencies away from the new Kafka processors, and manages 
runtime connections to a remote Kafka 3 service instance
   - `nifi-kafka-3-service-nar` - NiFi NAR wrapper for `Kafka3ConnectionService`
   
   - `nifi-kafka-processors` - home for `PublishKafka` and `ConsumeKafka` 
processors, which allow interactivity with remote Kafka service instances 
agnostic to a particular Kafka version
   - `nifi-kafka-nar` - NiFi NAR wrapper for the `PublishKafka` and 
`ConsumeKafka` processors
   
   - `nifi-kafka-2-6-integration` - test bed to establish runtime behavior 
(testcontainers/kafka) of Kafka 2.6 processors for certain conditions, in order 
to better replicate those behaviors
   - `nifi-kafka-3-integration` - testing infrastructure to test new processors 
/ controller service while communicating with an actual (testcontainers) Kafka 
instance
   - `nifi-kafka-jacoco` - module home for configuration to instrument 
executions of `nifi-kafka-bundle` unit tests / integration tests, in order to 
assess test coverage
   
   # Notes
   - Integration tests are employed as a "first-class citizen" means of testing 
expected interactions with Kafka instances, running in `testcontainers`.
     - https://github.com/testcontainers/testcontainers-java
   - It is possible to use a single instance of testcontainers/kafka per Maven 
module, in order to incur the startup/teardown cost only once. The intent is to 
employ this strategy where feasible.
   - Instances of `simplelogger.properties` have been useful during 
development, but these would be removed before PR merge.
   - I’ve done a significant amount of runtime testing against containerized 
Kafka instances using the repo/branch below. This resource may be useful for 
others who want to do runtime testing without the need for fixed Kafka 
infrastructure.
     - https://github.com/greyp9/kafka-images/tree/NIFI-12194
   - There are opportunities to refactor common methods and declarations into 
the `nifi-kafka-shared` module; I’ve avoided that during development to ease 
the process of rebasing to `nifi/main`.
   - It is likely that this set of new components will co-exist with the 
existing Kafka 2.6 based processors for some period of time.
     - Allow for migration of existing flows to use the new components.
     - Slight behavioral differences might be anticipated during the "burn in" 
phase of the new components, due to the scope of work.
   - Code compatibility with JRE 8 has been targeted, to leave open the option 
of backporting this work to the support/1.x branch.
   - The PR as is should support the following authentication strategies: 
`PLAINTEXT`, `SSL`, `SASL_PLAINTEXT`, and `SASL_SSL`. Support for additional 
authentication strategies could be handled via follow on JIRAs.
   - Support for the `KafkaRecordSink` form factor could be handled via follow 
on JIRAs.
   - Support for Kafka "exactly once" semantics could be handled via follow on 
JIRAs.
   - Migration documentation could be handled via follow on JIRAs.
   
   
   ### Issue Tracking
   
   - [x] [Apache NiFi Jira](https://issues.apache.org/jira/browse/NIFI) issue 
created
   
   ### Pull Request Tracking
   
   - [x] Pull Request title starts with Apache NiFi Jira issue number, such as 
`NIFI-00000`
   - [x] Pull Request commit message starts with Apache NiFi Jira issue number, 
as such `NIFI-00000`
   
   ### Pull Request Formatting
   
   - [x] Pull Request based on current revision of the `main` branch
   - [x] Pull Request refers to a feature branch with one commit containing 
changes
   
   # Verification
   
   Please indicate the verification steps performed prior to pull request 
creation.
   
   ### Build
   
   - [x] Build completed using `mvn clean install -P contrib-check`
     - [x] JDK 21
   
   ### Licensing
   
   - [ ] New dependencies are compatible with the [Apache License 
2.0](https://apache.org/licenses/LICENSE-2.0) according to the [License 
Policy](https://www.apache.org/legal/resolved.html)
   - [ ] New dependencies are documented in applicable `LICENSE` and `NOTICE` 
files
   
   ### Documentation
   
   - [ ] Documentation formatting appears as expected in rendered files
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to