Hi Paimon community,

I have a question regarding the runtime execution model and throughput
semantics when reading Paimon tables in streaming mode with consumer-id.

From my understanding and observations, when consumer-id is specified, the
execution graph generated by Flink is different from some other common
sources (e.g. Kafka). Instead of a single source operator that directly
emits records, the graph usually contains:

- A monitor-like source (often with parallelism = 1), which tracks snapshot
changes and produces snapshot/split events
- One or more downstream read operators, which receive those splits and
perform the actual file reading, emitting the real RowData records

In this setup, the “source” node in the execution graph mainly emits
metadata events (snapshot IDs / splits), while the real data throughput is
produced by the downstream read operators.

This leads to a practical issue for platform-level monitoring tools. In
many Flink platforms, source throughput (records/s, bytes/s) is commonly
measured by observing the source vertex metrics. That approach works well
for sources like Kafka, where the source operator itself emits user
records. However, in the Paimon + consumer-id case, monitoring only the
source vertex seems misleading, because it does not reflect the actual data
ingestion rate.
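
To make the issue concrete, here is a minimal Python sketch of the naive
detection described above. The job layout, vertex names, and record counts
are all hypothetical and made up for illustration; they are not taken from a
real Paimon job or from any Flink API response.

```python
from typing import Dict, List

# Hypothetical job description. Names, shape, and numbers are illustrative
# assumptions only, not real Flink or Paimon output.
job: Dict = {
    "vertices": [
        # Monitor-like source: emits snapshot/split events, not user records.
        {"id": "v1", "name": "monitor", "parallelism": 1, "records_out": 12},
        # Reader: receives splits and emits the actual rows.
        {"id": "v2", "name": "reader", "parallelism": 4, "records_out": 3_000_000},
        {"id": "v3", "name": "sink", "parallelism": 4, "records_out": 0},
    ],
    # Directed edges (upstream id, downstream id).
    "edges": [("v1", "v2"), ("v2", "v3")],
}


def source_vertices(job: Dict) -> List[Dict]:
    """Return vertices with no incoming edge, i.e. what a platform treats as sources."""
    targets = {dst for _, dst in job["edges"]}
    return [v for v in job["vertices"] if v["id"] not in targets]


def naive_source_records(job: Dict) -> int:
    """Naive source throughput: sum of records emitted by input-less vertices."""
    return sum(v["records_out"] for v in source_vertices(job))


# The naive rule picks only the monitor vertex, so it counts split events
# rather than the rows emitted by the reader.
print(naive_source_records(job))
```

With a layout like this, the naive rule reports the number of split events
from the monitor vertex, while the rows that matter are counted on the reader
vertex one hop downstream.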

So my questions are:

1. Is this monitor + reader split in the execution graph an intentional and
stable design for Paimon streaming reads with consumer-id?
2. From the Paimon/Flink semantics perspective, which operator should be
considered the “ingress point” for measuring real data throughput?
3. Is there any recommended or documented way for external monitoring
systems to correctly identify the operator that represents actual data
ingestion when reading from Paimon?

The motivation here is to build a connector-agnostic source rate detection
mechanism, and understanding the intended semantics on the Paimon side
would be very helpful.

Thanks in advance for your insights, and thanks for the great work on
Paimon.

Best regards.
