[ https://issues.apache.org/jira/browse/NIFI-16026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18090550#comment-18090550 ]
ASF subversion and git services commented on NIFI-16026:
--------------------------------------------------------
Commit 4776001c55e1657c0e81935711cff9f05f636eb1 in nifi's branch
refs/heads/main from Alaksiej Ščarbaty
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=4776001c55e ]
NIFI-16026 - Add Hex header encoding option to ConsumeKafka (#11356)

Add a 'Header Format' property (String/Hex) to ConsumeKafka so binary
Kafka Record Header values can be written as a lowercase hexadecimal
string instead of being corrupted by charset decoding. The existing
'Header Encoding' property keeps its meaning (the character set used for
the String format) and is shown only when Header Format is String, so
existing flows are unchanged and no property migration is needed.

Header decoding is resolved once into a HeaderValueConverter in
onScheduled and applied uniformly to FlowFile attributes and the
wrapper/inject-metadata record header fields. Internal batch headers
(kafka.max.offset, kafka.count) decode as fixed UTF-8, matching how they
are written.

Also add unit tests for KafkaUtils.toKeyString and correct the KeyEncoding
HEX description, which claimed uppercase output although HexFormat.of()
emits lowercase.
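
The "resolve once, apply uniformly" approach described in the commit message could look roughly like the sketch below. It is only an illustration: the interface shape, the HeaderFormat enum, the resolve method and the null handling are assumptions made for this example, not the code from the commit.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HexFormat;

public class HeaderConverterSketch {

    // Hypothetical shape; the real HeaderValueConverter in NiFi may differ.
    @FunctionalInterface
    interface HeaderValueConverter {
        String convert(byte[] value);
    }

    // Hypothetical enum mirroring the String/Hex choices of 'Header Format'.
    enum HeaderFormat { STRING, HEX }

    // Pick the conversion strategy once (for example when the processor is
    // scheduled), so per-record code calls convert() without re-checking config.
    static HeaderValueConverter resolve(HeaderFormat format, Charset charset) {
        if (format == HeaderFormat.HEX) {
            HexFormat hex = HexFormat.of(); // lowercase digits, no delimiter
            // Assumption: a null header value is passed through as null.
            return value -> value == null ? null : hex.formatHex(value);
        }
        return value -> value == null ? null : new String(value, charset);
    }

    public static void main(String[] args) {
        HeaderValueConverter converter = resolve(HeaderFormat.HEX, StandardCharsets.UTF_8);
        byte[] raw = {(byte) 0xCA, (byte) 0xFE, 0x00, 0x01};
        System.out.println(converter.convert(raw)); // prints cafe0001
    }
}
```

HexFormat.of() formats with lowercase digits; HexFormat.of().withUpperCase() would be needed for uppercase output, which is why the old KeyEncoding HEX description was inaccurate.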
> Write message headers as hexadecimal strings in ConsumeKafka
> ------------------------------------------------------------
>
> Key: NIFI-16026
> URL: https://issues.apache.org/jira/browse/NIFI-16026
> Project: Apache NiFi
> Issue Type: Improvement
> Reporter: Alaksiej Ščarbaty
> Assignee: Alaksiej Ščarbaty
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> h2. Problem
> {{ConsumeKafka}}'s *Header Encoding* property is charset-only
> ({{Charset.forName(...)}}, {{ConsumeKafka.java:197}}/{{:379}}).
> Binary headers (e.g. little-endian Int64) are corrupted by UTF-8 decoding and
> unrecoverable. The *Key Attribute Encoding* property already supports {{HEX}}
> ({{KeyEncoding}} enum), but headers have no binary-safe path.
> h2. Ask
> Make header format configurable to support either strings (according to the
> provided charset in *Header Encoding*) or hex format for Kafka message
> headers.
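
The corruption described in the issue can be reproduced outside NiFi with a small standalone program. The class and method names below are made up for this illustration, and the value 255 is an arbitrary choice whose little-endian encoding starts with a byte (0xFF) that is never valid in UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HexFormat;

public class BinaryHeaderDemo {

    // Encode a long as 8 little-endian bytes, as a binary header value might be.
    static byte[] littleEndian(long value) {
        return ByteBuffer.allocate(Long.BYTES).order(ByteOrder.LITTLE_ENDIAN).putLong(value).array();
    }

    // True only if decoding as UTF-8 and re-encoding returns the original bytes.
    static boolean survivesUtf8(byte[] raw) {
        String decoded = new String(raw, StandardCharsets.UTF_8);
        return Arrays.equals(raw, decoded.getBytes(StandardCharsets.UTF_8));
    }

    // Lossless text form: two lowercase hex digits per byte.
    static String toHex(byte[] raw) {
        return HexFormat.of().formatHex(raw);
    }

    // Recover the original long from the hex text.
    static long fromHex(String hex) {
        return ByteBuffer.wrap(HexFormat.of().parseHex(hex)).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    public static void main(String[] args) {
        byte[] raw = littleEndian(255L);
        System.out.println(survivesUtf8(raw));   // false: 0xFF becomes U+FFFD
        System.out.println(toHex(raw));          // ff00000000000000
        System.out.println(fromHex(toHex(raw))); // 255
    }
}
```

Malformed bytes are replaced with U+FFFD during charset decoding, so the original value cannot be recovered from the resulting string, while the hex form round-trips exactly.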
--
This message was sent by Atlassian Jira
(v8.20.10#820010)