Alex Goos created NIFI-14882:
--------------------------------

             Summary: ConsumeKafka improvement for Avro+SchemaRegistry input
                 Key: NIFI-14882
                 URL: https://issues.apache.org/jira/browse/NIFI-14882
             Project: Apache NiFi
          Issue Type: Improvement
          Components: Extensions
    Affects Versions: 2.5.0
         Environment: NiFi 2.4+, Kafka 3 
            Reporter: Alex Goos


When ConsumeKafka receives Avro records that carry a schema identifier from the 
Confluent Schema Registry, and no immediate transformation is needed, the 
costly conversion into NiFi Records and back to Avro is unnecessary. 
Performance can be improved substantially by simply batching records from the 
same source and with the same schema into an Avro data file. 

A new ProcessingStrategy "Avro Datafile" can be added to the existing ones.
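
To make the idea concrete, here is a minimal sketch of the batching step. It is 
not NiFi code and not a proposed implementation. It assumes the Confluent wire 
format (one zero magic byte, a 4-byte big-endian schema ID, then the Avro binary 
body) and the Avro object container file layout with the "null" codec. All 
function names are hypothetical, and the schema JSON is passed in directly where 
a real processor would look it up in the Schema Registry by ID. Python is used 
only for brevity.

```python
import os
import struct
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

MAGIC_BYTE = 0  # first byte of a Confluent-framed message value


def split_confluent_message(value: bytes) -> Tuple[int, bytes]:
    """Return (schema_id, avro_body) without decoding the Avro body."""
    if len(value) < 5 or value[0] != MAGIC_BYTE:
        raise ValueError("not a Confluent-framed Avro message")
    (schema_id,) = struct.unpack(">I", value[1:5])
    return schema_id, value[5:]


def group_by_topic_and_schema(
    messages: Iterable[Tuple[str, bytes]]
) -> Dict[Tuple[str, int], List[bytes]]:
    """Group raw Avro bodies by (topic, schema_id), preserving order."""
    groups = defaultdict(list)
    for topic, value in messages:
        schema_id, body = split_confluent_message(value)
        groups[(topic, schema_id)].append(body)
    return dict(groups)


def encode_long(n: int) -> bytes:
    """Avro long: zigzag encoding followed by a base-128 varint."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_bytes(b: bytes) -> bytes:
    """Avro bytes/string: length as a long, then the raw bytes."""
    return encode_long(len(b)) + b


def write_datafile(schema_json: str, bodies: List[bytes]) -> bytes:
    """Wrap already-encoded Avro bodies in one container-file block."""
    sync = os.urandom(16)  # sync marker, repeated after every block
    meta = {"avro.schema": schema_json.encode("utf-8"), "avro.codec": b"null"}
    out = bytearray(b"Obj\x01")  # container file magic
    out += encode_long(len(meta))  # metadata map: one block of entries
    for key, val in meta.items():
        out += encode_bytes(key.encode("utf-8")) + encode_bytes(val)
    out += encode_long(0)  # end of metadata map
    out += sync
    if bodies:
        payload = b"".join(bodies)  # bodies are copied as-is, never parsed
        out += encode_long(len(bodies)) + encode_long(len(payload))
        out += payload + sync
    return bytes(out)
```

The point of the sketch is that the record bodies are only concatenated. 
Nothing is deserialized, which is where the saving over the RECORD strategy 
would come from. A real implementation would also need to bound batch size and 
handle non-Avro or unframed messages.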

In our setup (a VM with an Intel(R) Xeon(R) E5-2695 v4 CPU @ 2.10GHz and a 
single executor thread for ConsumeKafka), this raises throughput from 
500 MB/5 min (Processing Strategy RECORD) to 17 GB/5 min, roughly a 34-fold 
increase.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
