DanielCarter-stack commented on issue #10375:
URL: https://github.com/apache/seatunnel/issues/10375#issuecomment-3795839621
Thanks for the detailed evidence. I've confirmed this is a **real bug** in
the metrics aggregation logic.
**Root Cause:**
In `JobClient.getJobMetricsSummary()`
(seatunnel-engine/seatunnel-engine-client/src/main/java/org/apache/seatunnel/engine/client/job/JobClient.java:165-170),
the code incorrectly uses `sourceReaders.size()` as the loop bound while
indexing into `sinkWriters`:
```java
for (int i = 0; i < sourceReaders.size(); i++) {
    JsonNode sinkWriter = sinkWriters.get(i); // BUG: assumes same size
    sinkWriteCount += sinkWriter.get("value").asLong();
}
```
In multi-sink scenarios with parallelism > 1, `sinkWriters` typically has
**more elements** than `sourceReaders` (each sink can have multiple writer
tasks). Your JSON shows 2 sourceReaders vs 4 sinkWriters. Because the loop
runs only 2 iterations, the sink writers at indexes 2 and 3 are never read.
If those skipped writers hold the actual data (e.g., your taskGroupId=2
writers with 232003), the summary incorrectly shows 0.
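To make the undercount concrete, here is a minimal, self-contained sketch of the same loop pattern. It uses plain `List<Long>` values in place of the Jackson `JsonNode` arrays, and the class name, method name, and numbers are illustrative assumptions, not the values from your JSON or the actual `JobClient` code.

```java
import java.util.List;

public class UndercountDemo {
    // Mirrors the buggy pattern: the sink loop is bounded by the source count.
    public static long sumSinksBoundedBySources(
            List<Long> sourceReaders, List<Long> sinkWriters) {
        long sinkWriteCount = 0;
        for (int i = 0; i < sourceReaders.size(); i++) {
            sinkWriteCount += sinkWriters.get(i);
        }
        return sinkWriteCount;
    }

    public static void main(String[] args) {
        // Hypothetical shape: 2 source readers, 4 sink writers,
        // with the non-zero sink values at indexes 2 and 3.
        List<Long> sourceReaders = List.of(5L, 7L);
        List<Long> sinkWriters = List.of(0L, 0L, 5L, 7L);

        // Only indexes 0 and 1 are visited, so the result is 0 instead of 12.
        System.out.println(sumSinksBoundedBySources(sourceReaders, sinkWriters));
    }
}
```

In this sketch the two writers that carry data are never visited, which matches the symptom of a sink count of 0 despite rows having been written.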
The existing test `testMultipleSinks` (JobClientTest.java:117-146) uses
symmetric arrays (both size 3) and doesn't catch this.
**Fix Approach:**
- Change SinkWriteCount/SinkCommittedCount loops to iterate over
`sinkWriters.size()` independently
- Add test case for asymmetric source/sink arrays (e.g., 2 sources, 4 sinks)
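A rough sketch of the approach, again with plain lists standing in for the `JsonNode` arrays. `sumValues` is a hypothetical helper, not an existing method in `JobClient`; the real change would apply the same idea to the loops over `sinkWriters`.

```java
import java.util.List;

public class IndependentSumSketch {
    // Hypothetical helper: sums one metric array over its own length,
    // so the result never depends on the size of another array.
    public static long sumValues(List<Long> values) {
        long total = 0;
        for (long value : values) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        // Asymmetric shape: 2 sources, 4 sinks (illustrative numbers).
        List<Long> sourceReaders = List.of(5L, 7L);
        List<Long> sinkWriters = List.of(0L, 0L, 5L, 7L);

        long sourceReadCount = sumValues(sourceReaders);
        long sinkWriteCount = sumValues(sinkWriters);

        System.out.println(sourceReadCount + " " + sinkWriteCount);
    }
}
```

An asymmetric test along these lines would fail against the current loop (which stops after two sink entries) and pass once each array is iterated over its own size.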
Would you like to proceed with a PR?