Hi Lec,

What I know is that Flink has defined a set of standard metrics, and
each connector is expected to report at least these standard metrics.
This helps users significantly reduce the development work needed to
monitor individual connectors.
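
For reference, the standardization effort I mean is FLIP-33
("Standardize Connector Metrics"). As a rough sketch of how a platform
could lean on it (the metric names below are my recollection of FLIP-33
and the REST endpoint shape is an assumption to verify against the Flink
docs, not something Paimon-specific):

```python
# Standard source metrics proposed in FLIP-33 ("Standardize Connector
# Metrics"). Treat these names as assumptions to check against the
# Flink documentation for your version.
FLIP33_SOURCE_METRICS = [
    "numRecordsIn",             # records pulled from the external system
    "numBytesIn",               # bytes pulled from the external system
    "numRecordsInErrors",       # records that failed to be consumed
    "currentFetchEventTimeLag", # fetch time minus event time
    "currentEmitEventTimeLag",  # emit time minus event time
    "sourceIdleTime",           # time since the source last processed data
    "pendingRecords",           # backlog, if the system can report it
]

def metrics_query_url(base, job_id, vertex_id,
                      metrics=FLIP33_SOURCE_METRICS):
    """Build a Flink REST API URL fetching the given metrics for one job
    vertex (assumed endpoint: /jobs/:jobid/vertices/:vertexid/metrics)."""
    return "%s/jobs/%s/vertices/%s/metrics?get=%s" % (
        base, job_id, vertex_id, ",".join(metrics))
```

A monitoring system that only ever asks for these standard names stays
connector-agnostic, as long as every connector actually reports them.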

But there are indeed challenges involved.
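
For example, until there is a first-class "logical ingress point"
abstraction, a platform usually has to fall back on heuristics over the
job topology: if a job starts with a parallelism-1 enumerator/monitor
source (the Paimon consumer-id pattern you describe), the real reading
happens in the next vertex; otherwise the source vertex itself is the
ingress point (the Kafka pattern). A minimal sketch of that fallback —
the dict fields loosely mirror Flink's REST job-detail response, and the
name-matching rule is purely an illustrative assumption:

```python
def pick_ingress_vertex(vertices):
    """Heuristically pick the vertex representing real data ingestion.

    `vertices` is a topologically ordered list of dicts with at least
    "name" and "parallelism" (shaped loosely like the vertex entries in
    Flink's /jobs/:jobid response). If the first vertex looks like a
    parallelism-1 monitor/enumerator, report throughput from the vertex
    after it; otherwise report it from the source vertex itself.
    """
    source = vertices[0]
    looks_like_monitor = (
        source["parallelism"] == 1
        and len(vertices) > 1
        and "monitor" in source["name"].lower()
    )
    return vertices[1] if looks_like_monitor else source
```

The obvious weakness is exactly the one you raise: the heuristic encodes
connector-internal knowledge (operator naming, the monitor/reader split)
that can change between connectors, modes, and versions.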

Best,
Jingsong

On Tue, Feb 3, 2026 at 10:17 PM lec ssmi <[email protected]> wrote:
>
> Hi Jingsong,
>
> Thanks a lot for the quick reply and for pointing me to the Paimon metrics 
> documentation — that is very helpful.
>
> I fully agree that, at the connector level, relying on the metrics exposed by 
> the Flink Paimon Source is the most accurate way to observe its runtime 
> behavior. However, from a platform or engine-agnostic monitoring perspective, 
> this also highlights a broader challenge.
>
> If source throughput detection relies on connector-specific metrics, then 
> monitoring inevitably becomes customized per connector. In practice, Flink 
> has many connectors, and each of them may expose different metrics, follow 
> different execution models, or even change behavior depending on 
> configuration. Paimon is a good example of this: different consumption modes 
> (e.g., with or without consumer-id) lead to quite different execution graphs 
> and operator responsibilities.
>
> From the perspective of a generic Flink platform or monitoring system, 
> implementing and maintaining connector-specific logic for each source type 
> (Kafka, Paimon, filesystem, etc.), and even for different modes of the same 
> connector, does not scale well.
>
> This is why I’m particularly interested in whether there is (or could be) a 
> more stable semantic definition of a “logical ingress point” in Flink — one 
> that monitoring systems could rely on without deeply understanding each 
> connector’s internal implementation. Today, the physical JobGraph structure 
> and the actual data-ingesting operator do not always align in a uniform way 
> across connectors.
>
> I understand this may be more of a Flink-level abstraction question rather 
> than a Paimon-only one, but Paimon’s design makes the issue especially 
> visible.
>
> Thanks again for the clarification and for the great work on Paimon.
>
> Best regards,
> Lec
>
>
> Jingsong Li <[email protected]> 于2026年2月2日周一 10:39写道:
>>
>> Hi Lec,
>>
>> You can take a look at the Flink Paimon Source's metrics, which cover
>> a wealth of information. Paimon itself is just a lake format, so it
>> does not define any metrics on its own, but the Flink Source we
>> implemented produces a large number of them.
>>
>> See doc: https://paimon.apache.org/docs/master/maintenance/metrics/
>>
>> Best,
>> Jingsong
>>
>> On Thu, Jan 29, 2026 at 10:17 PM lec ssmi <[email protected]> wrote:
>> >
>> > Hi Paimon community,
>> >
>> > I have a question regarding the runtime execution model and throughput 
>> > semantics when reading Paimon tables in streaming mode with consumer-id.
>> >
>> > From my understanding and observations, when consumer-id is specified, the 
>> > execution graph generated by Flink is different from some other common 
>> > sources (e.g. Kafka). Instead of a single source operator that directly 
>> > emits records, the graph usually contains:
>> >
>> > - A monitor-like source (often with parallelism = 1), which tracks 
>> > snapshot changes and produces snapshot/split events
>> > - One or more downstream read operators, which receive those splits and 
>> > perform the actual file reading, emitting the real RowData records
>> >
>> > In this setup, the “source” node in the execution graph mainly emits 
>> > metadata events (snapshot IDs / splits), while the real data throughput is 
>> > produced by the downstream read operators.
>> >
>> > This leads to a practical issue for platform-level monitoring tools. In 
>> > many Flink platforms, source throughput (records/s, bytes/s) is commonly 
>> > measured by observing the source vertex metrics. That approach works well 
>> > for sources like Kafka, where the source operator itself emits user 
>> > records. However, in the Paimon + consumer-id case, monitoring only the 
>> > source vertex seems misleading, because it does not reflect the actual 
>> > data ingestion rate.
>> >
>> > So my questions are:
>> >
>> > 1. Is this monitor + reader split in the execution graph an intentional 
>> > and stable design for Paimon streaming reads with consumer-id?
>> > 2. From the Paimon/Flink semantics perspective, which operator should be 
>> > considered the “ingress point” for measuring real data throughput?
>> > 3. Is there any recommended or documented way for external monitoring 
>> > systems to correctly identify the operator that represents actual data 
>> > ingestion when reading from Paimon?
>> >
>> > The motivation here is to build a connector-agnostic source rate detection 
>> > mechanism, and understanding the intended semantics on the Paimon side 
>> > would be very helpful.
>> >
>> > Thanks in advance for your insights, and thanks for the great work on 
>> > Paimon.
>> >
>> > Best regards.
