Mark Payne created NIFI-9822:
--------------------------------
Summary: Update ConsumeKafkaRecord to allow writing out of the
Kafka record key
Key: NIFI-9822
URL: https://issues.apache.org/jira/browse/NIFI-9822
Project: Apache NiFi
Issue Type: New Feature
Components: Extensions
Reporter: Mark Payne
The ConsumeKafkaRecord processors are among the most commonly used in NiFi, as
they provide a very efficient mechanism for consuming structured data from
Kafka. The downside of these processors is that they do not support writing out
the Kafka record's key. This was done because we wanted to bundle the records'
values together into a single FlowFile, and we didn't have a good way to
include the key.
For users who don't care about the key, this works great. Users who do need
the key are often forced to use the non-record-oriented ConsumeKafka, which
adds the key as an attribute. But this means that the key may need to be
hex-encoded, which makes it less usable, and it means that we are creating a
FlowFile per Kafka record.
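To illustrate why a hex-encoded key attribute is less usable, here is a minimal sketch of what hex encoding does to a key. The class and method names are hypothetical and this is not NiFi's actual encoding code; it only shows that a readable key becomes opaque once rendered as hex.

```java
import java.nio.charset.StandardCharsets;

final class KeyAttributeSketch {
    // Hypothetical helper: renders raw key bytes as lowercase hex, two characters per byte.
    static String toHex(byte[] key) {
        StringBuilder sb = new StringBuilder(key.length * 2);
        for (byte b : key) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] textKey = "user-42".getBytes(StandardCharsets.UTF_8);
        byte[] binaryKey = {0x00, (byte) 0xff, 0x10};
        // A readable key is no longer readable once hex-encoded.
        System.out.println(toHex(textKey));
        // A binary key cannot be represented safely as plain text, hence the encoding.
        System.out.println(toHex(binaryKey));
    }
}
```

Anything downstream that wants the original key has to decode the attribute first, which is the usability cost described above.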
We should improve this by introducing some new properties to the
ConsumeKafkaRecord processors:
- Output Strategy. This property should have the following values:
  - Write Value Only - This should be the default value in order to maintain
    backward compatibility and should behave the same as it does now.
  - Use Wrapper - If selected, records that are provided to the Record Writer
    should be wrapped in a wrapper element that contains 4 keys: "key" (the
    Kafka record key), "value" (the Kafka record value), "headers" (a Map type
    of field with Strings as both the keys and values), and "metadata" (should
    include topic, partition, offset, timestamp, checksum).
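The shape of the proposed wrapper can be sketched roughly as follows. This uses plain Java maps rather than NiFi's record classes, and the class name, method name, and sample values are all hypothetical; only the four top-level field names and the metadata field names come from the description above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class WrapperSketch {
    // Hypothetical helper: builds the wrapper as nested maps, in the field order listed in the ticket.
    static Map<String, Object> wrap(Object key, Object value, Map<String, String> headers,
                                    String topic, int partition, long offset,
                                    long timestamp, long checksum) {
        Map<String, Object> metadata = new LinkedHashMap<>();
        metadata.put("topic", topic);
        metadata.put("partition", partition);
        metadata.put("offset", offset);
        metadata.put("timestamp", timestamp);
        metadata.put("checksum", checksum);

        Map<String, Object> wrapper = new LinkedHashMap<>();
        wrapper.put("key", key);
        wrapper.put("value", value);
        wrapper.put("headers", headers);
        wrapper.put("metadata", metadata);
        return wrapper;
    }

    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("source", "orders-service"); // made-up header for illustration

        Map<String, Object> value = new LinkedHashMap<>();
        value.put("orderId", 1001); // made-up record value for illustration

        Map<String, Object> wrapped =
                wrap("user-42", value, headers, "orders", 0, 15L, 1646000000000L, 0L);
        System.out.println(wrapped);
    }
}
```

Each record handed to the Record Writer would carry its own key, headers, and metadata, so many Kafka records can still be bundled into a single FlowFile.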
If the Output Strategy selected is "Use Wrapper", we should provide the
following properties:
- Key Format - Allowable Values of (String, Byte Array, Record)
- Key Record Reader - if Key Format = "Record" then should allow specifying a
Record Reader for the key. Should be dependent on Key Format = Record.
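How the Key Format choice might drive key conversion can be sketched as below. This is only an illustration under stated assumptions: the enum, the method, and the use of a Function as a stand-in for the Key Record Reader are hypothetical, and UTF-8 is assumed for the String format since the ticket does not name a character set.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

final class KeyFormatSketch {
    // The three allowable values proposed for the Key Format property.
    enum KeyFormat { STRING, BYTE_ARRAY, RECORD }

    // Hypothetical conversion: keyRecordParser stands in for the configured Key Record Reader.
    static Object convertKey(byte[] rawKey, KeyFormat format, Function<byte[], Object> keyRecordParser) {
        if (rawKey == null) {
            return null; // Kafka records may have no key
        }
        switch (format) {
            case STRING:
                return new String(rawKey, StandardCharsets.UTF_8); // assumption: UTF-8
            case BYTE_ARRAY:
                return rawKey;
            case RECORD:
                if (keyRecordParser == null) {
                    throw new IllegalStateException(
                            "Key Format is Record but no Key Record Reader is configured");
                }
                return keyRecordParser.apply(rawKey);
            default:
                throw new IllegalArgumentException("Unknown Key Format: " + format);
        }
    }

    public static void main(String[] args) {
        byte[] raw = "user-42".getBytes(StandardCharsets.UTF_8);
        System.out.println(convertKey(raw, KeyFormat.STRING, null));
        System.out.println(((byte[]) convertKey(raw, KeyFormat.BYTE_ARRAY, null)).length);
    }
}
```

The RECORD branch is the reason Key Record Reader only makes sense when Key Format is set to Record.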
Additionally, the headers and the Kafka record key should only be added as
attributes if using an Output Strategy of "Write Value Only". As a result, the
following existing properties should be made dependent on using an Output
Strategy of "Write Value Only":
- Headers to Add as Attributes (Regex)
- Key Attribute Encoding
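The property dependencies described above can be summarized with a small sketch. It does not use NiFi's own property-dependency mechanism; the class, enums, and method are hypothetical, and only the property names are taken from this ticket.

```java
import java.util.ArrayList;
import java.util.List;

final class PropertyDependencySketch {
    enum OutputStrategy { WRITE_VALUE_ONLY, USE_WRAPPER }
    enum KeyFormat { STRING, BYTE_ARRAY, RECORD }

    // Hypothetical helper: lists which of the discussed properties apply for a given configuration.
    static List<String> applicableProperties(OutputStrategy strategy, KeyFormat keyFormat) {
        List<String> props = new ArrayList<>();
        props.add("Output Strategy");
        if (strategy == OutputStrategy.WRITE_VALUE_ONLY) {
            // Existing attribute-related properties only apply when no wrapper is written.
            props.add("Headers to Add as Attributes (Regex)");
            props.add("Key Attribute Encoding");
        } else {
            props.add("Key Format");
            if (keyFormat == KeyFormat.RECORD) {
                props.add("Key Record Reader");
            }
        }
        return props;
    }

    public static void main(String[] args) {
        System.out.println(applicableProperties(OutputStrategy.WRITE_VALUE_ONLY, null));
        System.out.println(applicableProperties(OutputStrategy.USE_WRAPPER, KeyFormat.RECORD));
    }
}
```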
It will also be important to update the additionalDetails.html to explain the
differences between the two output modes, and provide examples, including when
one strategy should be preferred over the other.