Ok thanks.

From: Kyle Lahnakoski <kyle.lahnako...@xe.com>
Sent: 25 August 2025 18:51
To: Kamal Mittal <kamal.mit...@ericsson.com>; user@flink.apache.org
Subject: Re: Sharing data among flink operator pipeline

Kamal,


Ensure the headers you need are part of the stream element. In Kafka, 
deserialize with KafkaRecordDeserializationSchema/KafkaDeserializationSchema 
and emit a POJO that carries value + headers. Then map/flatMap/filter can use 
them directly—no keyed/broadcast state required. If you want operators to 
execute on the same task thread (operator chain), avoid shuffles (keyBy, 
rebalance, etc.) and keep parallelism equal so Flink uses forward partitioning 
and chaining.
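A minimal sketch of that approach, assuming the Kafka value is a UTF-8 string; the class and field names (EventWithHeaders, EventWithHeadersDeserializer) are illustrative, not from this thread:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.connector.kafka.source.reader.deserializer.KafkaRecordDeserializationSchema;
import org.apache.flink.util.Collector;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.header.Header;

// POJO that carries the record value together with its Kafka headers.
// Public fields and a no-arg constructor let Flink treat it as a POJO type.
public class EventWithHeaders {
    public String value;
    public Map<String, byte[]> headers;
    public EventWithHeaders() {}
}

class EventWithHeadersDeserializer
        implements KafkaRecordDeserializationSchema<EventWithHeaders> {

    @Override
    public void deserialize(ConsumerRecord<byte[], byte[]> record,
                            Collector<EventWithHeaders> out) throws IOException {
        EventWithHeaders e = new EventWithHeaders();
        // Assumes a UTF-8 string payload; swap in your real value deserializer.
        e.value = new String(record.value(), StandardCharsets.UTF_8);
        e.headers = new HashMap<>();
        for (Header h : record.headers()) {
            e.headers.put(h.key(), h.value());
        }
        out.collect(e);
    }

    @Override
    public TypeInformation<EventWithHeaders> getProducedType() {
        return TypeInformation.of(EventWithHeaders.class);
    }
}
```

Downstream, any plain operator can then read the headers from the element itself, e.g. `stream.map(e -> e.headers.get("trace-id"))` (header name hypothetical); with equal parallelism and no keyBy/rebalance in between, such a map chains with the source as described above.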


From: Kamal Mittal via user <user@flink.apache.org>
Date: Monday, August 25, 2025 at 7:57 AM
To: Nikola Milutinovic <n.milutino...@levi9.com>, user@flink.apache.org 
<user@flink.apache.org>
Subject: RE: Sharing data among flink operator pipeline


No, I am not talking about the Kafka message key, but rather the key/value pairs 
that come separately, as below.

Headers org.apache.kafka.clients.consumer.ConsumerRecord.headers()

From: Nikola Milutinovic <n.milutino...@levi9.com>
Sent: 25 August 2025 14:56
To: user@flink.apache.org
Subject: Re: Sharing data among flink operator pipeline

Hi Kamal.

When you say “Kafka Headers”, are you referring to “Kafka Message Key”? In 
Kafka, messages come as Key/Value pairs. You can omit the key, but if you do 
have it, you can read it in the job and then use it when writing to the other 
topic.

Nix.

From: Kamal Mittal via user <user@flink.apache.org>
Date: Thursday, August 21, 2025 at 11:35 AM
To: user@flink.apache.org <user@flink.apache.org>
Subject: Sharing data among flink operator pipeline
Subject: Sharing data among flink operator pipeline
Hello,

I have a scenario with a Kafka source where Kafka headers come along with the 
Kafka record/event. The fetched Kafka headers need to be shared/passed to the 
next/parallel operators in the pipeline.

So, is there any way to share data across operator pipeline?

I explored keyed state, which has the limitation that it only works with keyBy().
I explored broadcast state, which has the limitation that it only works with 
process functions and not with map/flatMap/filter etc.

Rgds,
Kamal
