wuhainan commented on issue #11013:
URL: https://github.com/apache/seatunnel/issues/11013#issuecomment-4645128312

   > Thanks for collecting the cleaner reproduction data; this is much stronger evidence.
   > 
   > The new logs make the direction here considerably narrower. In the current reader path, entering streaming mode is not by itself proof that the initial snapshot was fully materialized downstream. `resolvedTs` can continue to move forward once the CDC client starts, while actual row emission still depends on the PREWRITE/COMMIT matching and flush path. That means the symptom you captured is consistent with a real reader-side bug, not just a metrics illusion.
   > 
   > The most important part of your new report is this combination:
   > 
   > * one source split
   > * `startup.mode = initial`
   > * snapshot starts
   > * streaming begins about 18 seconds later
   > * downstream row count stalls at `93,354` while the source table had `522,625` rows
   > * `resolvedTs` and checkpoints keep advancing afterward
   > 
   > That strongly suggests the problem is in the snapshot-to-stream handoff family, rather than a normal or expected transition. In other words, this now looks much closer to either:
   > 
   > 1. the snapshot phase exiting before the split was fully materialized, or
   > 2. the reader entering streaming with incomplete materialization state and then continuing to advance `resolvedTs`.
   > 
   > So this issue is worth keeping separate and open. It is related to `#8815`, but the new evidence here is much more specific and should help us drive a more targeted fix.
   > 
   > The next high-value step would be a deterministic regression test around the single-split `INITIAL` path, especially validating that the reader does not enter effective streaming progress before the snapshot for that split is fully drained downstream.
   
   @DanielLeens Thank you for the detailed analysis and confirmation.
   
   I will keep the reproduction job logs and metrics available. If a deterministic regression test or a fix PR needs more runtime evidence, I can provide the full JobManager / TaskManager logs and the source/target count SQL results.
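
To make the proposed check concrete, here is a minimal sketch of the invariant a regression test for the single-split `INITIAL` path could assert: once the split reports streaming, the cumulative downstream row count must already cover the snapshot. Everything in it is an assumption for illustration. `SnapshotHandoffCheck`, `Phase`, `Sample`, and `firstViolation` are hypothetical names and not part of SeaTunnel's reader API, and the samples would have to come from whatever metrics or logs the test harness exposes.

```java
import java.util.List;

/** Hypothetical model of the handoff invariant; not SeaTunnel code. */
final class SnapshotHandoffCheck {

    /** Phases a single split passes through when startup.mode = initial. */
    public enum Phase { SNAPSHOT, STREAMING }

    /** One observation of the split (assumed shape, e.g. sampled from job metrics). */
    public static final class Sample {
        public final Phase phase;
        public final long rowsEmitted; // cumulative rows seen downstream
        public final long resolvedTs;  // recorded for diagnostics only

        public Sample(Phase phase, long rowsEmitted, long resolvedTs) {
            this.phase = phase;
            this.rowsEmitted = rowsEmitted;
            this.resolvedTs = resolvedTs;
        }
    }

    /**
     * Returns the index of the first sample that is in STREAMING while fewer
     * than expectedSnapshotRows rows have been emitted, or -1 if none does.
     * resolvedTs is deliberately ignored: it can advance without row emission.
     */
    public static int firstViolation(List<Sample> samples, long expectedSnapshotRows) {
        for (int i = 0; i < samples.size(); i++) {
            Sample s = samples.get(i);
            if (s.phase == Phase.STREAMING && s.rowsEmitted < expectedSnapshotRows) {
                return i;
            }
        }
        return -1;
    }
}
```

Fed with the counts from this report (streaming observed at `93,354` emitted rows against `522,625` source rows), the check would flag the first streaming sample, regardless of how far `resolvedTs` has advanced by then.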

