Mark Payne created NIFI-9822:
--------------------------------

             Summary: Update ConsumeKafkaRecord to allow writing out of the 
Kafka record key
                 Key: NIFI-9822
                 URL: https://issues.apache.org/jira/browse/NIFI-9822
             Project: Apache NiFi
          Issue Type: New Feature
          Components: Extensions
            Reporter: Mark Payne


The ConsumeKafkaRecord processors are among the most commonly used in NiFi, as 
they provide a very efficient mechanism for consuming structured data from 
Kafka. The downside of these processors is that they do not support writing 
out the Kafka record's key. They were designed this way because we wanted to 
bundle the records' values together into a single FlowFile, and we didn't have 
a good way to include the key.

For users who don't care about the key, this works great. Users who do need 
the key are often forced to use the non-record-oriented ConsumeKafka, which 
adds the key as an attribute. But this means that the key may need to be 
hex-encoded, which makes it less usable, and it means that we are creating a 
FlowFile per Kafka record.
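
To illustrate why a hex-encoded key attribute is less usable, here is a small 
Python sketch. It is not NiFi code: the function name key_as_attribute and its 
encoding argument are hypothetical stand-ins for how a key might be turned into 
an attribute string.

```python
def key_as_attribute(raw_key: bytes, encoding: str = "utf-8") -> str:
    """Turn raw Kafka key bytes into a string suitable for an attribute.

    Hypothetical helper: "utf-8" decodes the bytes as text, "hex" renders
    each byte as two hexadecimal characters.
    """
    if encoding == "hex":
        return raw_key.hex()
    if encoding == "utf-8":
        return raw_key.decode("utf-8")
    raise ValueError(f"Unsupported encoding: {encoding}")


if __name__ == "__main__":
    binary_key = bytes([0xFF, 0x00, 0x2A])  # not valid UTF-8
    try:
        key_as_attribute(binary_key, "utf-8")
    except UnicodeDecodeError:
        print("key cannot be represented as UTF-8 text")
    # The hex form always works, but downstream components must decode it
    # again before they can use the key.
    print(key_as_attribute(binary_key, "hex"))  # prints ff002a
```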

We should improve this by introducing some new properties to the 
ConsumeKafkaRecord processors:

- Output Strategy. This property should have the following values:
  - Write Value Only - This should be the default value in order to maintain 
backward compatibility and should behave the same as it does now.
  - Use Wrapper - If selected, records that are provided to the Record Writer 
should be wrapped in a wrapper element that contains four fields: "key" (the 
Kafka record key), "value" (the Kafka record value), "headers" (a Map field 
with Strings as both keys and values), and "metadata" (which should include 
topic, partition, offset, timestamp, checksum).
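
As a rough illustration of the "Use Wrapper" layout described above, the 
following Python sketch builds one wrapped record from the parts of a Kafka 
record. It is not NiFi code: the function name wrap_record, its parameters, and 
the sample values are hypothetical. Only the four top-level field names and the 
metadata field names come from this description.

```python
import json
from typing import Any, Dict, Optional


def wrap_record(key: Any, value: Any, headers: Dict[str, str],
                topic: str, partition: int, offset: int,
                timestamp: int, checksum: Optional[int] = None) -> Dict[str, Any]:
    """Build the four-field wrapper: key, value, headers, metadata."""
    return {
        "key": key,
        "value": value,
        # Headers are a map with strings as both keys and values.
        "headers": {str(k): str(v) for k, v in headers.items()},
        "metadata": {
            "topic": topic,
            "partition": partition,
            "offset": offset,
            "timestamp": timestamp,
            "checksum": checksum,
        },
    }


if __name__ == "__main__":
    wrapped = wrap_record(
        key="user-42",
        value={"name": "Alice", "visits": 3},
        headers={"source": "web"},
        topic="users", partition=0, offset=1001, timestamp=1647302400000,
    )
    print(json.dumps(wrapped, indent=2))
```

With "Write Value Only", only the content of the "value" field would be passed 
to the Record Writer, as it is today.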

If the Output Strategy selected is "Use Wrapper", we should provide the 
following properties:
  - Key Format - Allowable Values of (String, Byte Array, Record)
  - Key Record Reader - if Key Format = "Record" then should allow specifying a 
Record Reader for the key. Should be dependent on Key Format = Record.
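
The three Key Format values could behave roughly as in the Python sketch below. 
This is an assumption about the intended semantics, not NiFi code: format_key 
is a hypothetical name, UTF-8 is assumed for the String format, and the 
record_reader argument stands in for the Key Record Reader (json.loads is used 
here only as a placeholder parser).

```python
import json
from typing import Any, Callable, Optional


def format_key(raw_key: Optional[bytes], key_format: str,
               record_reader: Optional[Callable[[bytes], Any]] = None) -> Any:
    """Convert raw Kafka key bytes according to the Key Format value."""
    if raw_key is None:
        return None  # a Kafka record may have no key
    if key_format == "String":
        return raw_key.decode("utf-8")  # assumes UTF-8 encoded keys
    if key_format == "Byte Array":
        return list(raw_key)  # one integer (0-255) per byte
    if key_format == "Record":
        if record_reader is None:
            raise ValueError("Key Format 'Record' requires a Key Record Reader")
        return record_reader(raw_key)
    raise ValueError(f"Unknown Key Format: {key_format}")


if __name__ == "__main__":
    raw = b'{"id": 42}'
    print(format_key(raw, "String"))
    print(format_key(raw, "Byte Array"))
    print(format_key(raw, "Record", record_reader=json.loads))
```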

Additionally, the headers and the Kafka record key should only be added as 
attributes if using an Output Strategy of "Write Value Only". As a result, the 
following existing properties should be made dependent on using an Output 
Strategy of "Write Value Only":
  - Headers to Add as Attributes (Regex)
  - Key Attribute Encoding
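
The property dependencies described above can be summarized with a small Python 
model. It is only a sketch of the rules: applicable_properties is a 
hypothetical function, and in NiFi itself the dependencies would be declared on 
the property descriptors rather than computed like this.

```python
from typing import List


def applicable_properties(output_strategy: str,
                          key_format: str = "String") -> List[str]:
    """Return the properties that apply for the given settings."""
    props = ["Output Strategy"]
    if output_strategy == "Write Value Only":
        # Existing attribute-related properties only apply in this mode.
        props += ["Headers to Add as Attributes (Regex)",
                  "Key Attribute Encoding"]
    elif output_strategy == "Use Wrapper":
        props.append("Key Format")
        if key_format == "Record":
            # The reader is only needed when the key is parsed as a record.
            props.append("Key Record Reader")
    else:
        raise ValueError(f"Unknown Output Strategy: {output_strategy}")
    return props


if __name__ == "__main__":
    print(applicable_properties("Write Value Only"))
    print(applicable_properties("Use Wrapper", "String"))
    print(applicable_properties("Use Wrapper", "Record"))
```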


It will also be important to update the additionalDetails.html to explain the 
differences between the two output modes, and provide examples, including when 
one strategy should be preferred over the other.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
