Alex Goos created NIFI-14882:
--------------------------------
Summary: ConsumeKafka improvement for Avro+SchemaRegistry input
Key: NIFI-14882
URL: https://issues.apache.org/jira/browse/NIFI-14882
Project: Apache NiFi
Issue Type: Improvement
Components: Extensions
Affects Versions: 2.5.0
Environment: NiFi 2.4+, Kafka 3
Reporter: Alex Goos
When ConsumeKafka receives Avro records tagged with a schema identifier from
the Confluent Schema Registry, and no immediate transformation is needed, the
costly conversion into NiFi Records and back to Avro is unnecessary.
Throughput can be substantially improved by simply batching records from the
same source and with the same schema into an Avro data file.
A new ProcessingStrategy "Avro Datafile" can be added to the existing ones.
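To illustrate the batching idea, here is a minimal sketch, not the proposed
NiFi implementation. It assumes the standard Confluent wire format (one magic
byte 0x0, a 4-byte big-endian schema ID, then the Avro binary payload). The
class and method names (ConfluentAvroBatcher, schemaId, payload,
groupBySchema) are hypothetical and exist only for this example. The payloads
are grouped by schema ID without ever being decoded:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only; not part of NiFi or Kafka.
public class ConfluentAvroBatcher {
    // Assumption: Confluent wire format = magic byte 0x0 + 4-byte schema ID + payload.
    private static final byte MAGIC_BYTE = 0x0;
    private static final int HEADER_LENGTH = 5;

    /** Reads the big-endian schema ID from a Confluent-framed Kafka record value. */
    public static int schemaId(byte[] value) {
        if (value == null || value.length < HEADER_LENGTH || value[0] != MAGIC_BYTE) {
            throw new IllegalArgumentException("Not a Confluent-framed Avro record");
        }
        // ByteBuffer reads in big-endian order by default.
        return ByteBuffer.wrap(value, 1, 4).getInt();
    }

    /** Returns a view of the still-encoded Avro payload that follows the header. */
    public static ByteBuffer payload(byte[] value) {
        schemaId(value); // validates the header before slicing
        return ByteBuffer.wrap(value, HEADER_LENGTH, value.length - HEADER_LENGTH).slice();
    }

    /** Groups encoded payloads by schema ID, keeping arrival order within each group. */
    public static Map<Integer, List<ByteBuffer>> groupBySchema(List<byte[]> values) {
        Map<Integer, List<ByteBuffer>> batches = new LinkedHashMap<>();
        for (byte[] value : values) {
            batches.computeIfAbsent(schemaId(value), id -> new ArrayList<>())
                   .add(payload(value));
        }
        return batches;
    }
}
```

Each resulting group could then be written to one Avro data file, for example
by resolving the writer schema for that ID from the registry and appending the
payloads with Avro's DataFileWriter.appendEncoded(ByteBuffer), which accepts
already-encoded datums. Whether the actual change does it this way is an
assumption on our part.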
In our setup (a VM with an Intel(R) Xeon(R) CPU E5-2695 v4 @ 2.10GHz and a
single executor thread for ConsumeKafka), this raises throughput from
500 MB per 5 minutes (Processing Strategy RECORD) to 17 GB per 5 minutes,
about a 34-fold increase.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)