DanielLeens commented on issue #11013:
URL: https://github.com/apache/seatunnel/issues/11013#issuecomment-4644692689

   Thanks for collecting the cleaner reproduction data; this is much stronger 
evidence.
   
   The new logs narrow the direction considerably. In the current 
reader path, entering streaming mode is not by itself proof that the initial 
snapshot was fully materialized downstream. `resolvedTs` can continue to move 
forward once the CDC client starts, while actual row emission still depends on 
the PREWRITE/COMMIT matching and flush path. That means the symptom you 
captured is consistent with a real reader-side bug, not just a metrics illusion.
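   
   As a rough illustration of why `resolvedTs` can advance while no rows are 
emitted, here is a minimal Python model of PREWRITE/COMMIT matching with a 
resolved-timestamp flush. It is a sketch, not the SeaTunnel reader: `ToyReader` 
and all of its method names are hypothetical, and it assumes a row is emitted 
only after its COMMIT has been matched to a PREWRITE and `resolvedTs` has 
reached the commit timestamp.

```python
from dataclasses import dataclass, field


@dataclass
class ToyReader:
    """Toy model of a CDC reader (hypothetical, not SeaTunnel's classes)."""

    resolved_ts: int = 0
    prewrites: dict = field(default_factory=dict)  # (key, start_ts) -> value
    committed: list = field(default_factory=list)  # (commit_ts, key, value)
    emitted: list = field(default_factory=list)    # (key, value) sent downstream

    def on_prewrite(self, key, start_ts, value):
        # Buffer the value until the matching COMMIT arrives.
        self.prewrites[(key, start_ts)] = value

    def on_commit(self, key, start_ts, commit_ts):
        # A COMMIT without a buffered PREWRITE yields no row in this model.
        value = self.prewrites.pop((key, start_ts), None)
        if value is not None:
            self.committed.append((commit_ts, key, value))

    def on_resolved_ts(self, ts):
        # The watermark always advances, whether or not anything is flushed.
        self.resolved_ts = max(self.resolved_ts, ts)
        ready = [c for c in self.committed if c[0] <= self.resolved_ts]
        self.committed = [c for c in self.committed if c[0] > self.resolved_ts]
        for _, key, value in sorted(ready, key=lambda c: c[0]):
            self.emitted.append((key, value))


reader = ToyReader()
reader.on_commit("k0", start_ts=3, commit_ts=4)  # unmatched COMMIT: dropped
reader.on_resolved_ts(30)
print(reader.resolved_ts, reader.emitted)  # 30 []
```

   In this model the watermark reaches 30 while the emitted list stays empty, 
which is the same shape as the symptom in the report: progress indicators 
advance, row output does not.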
   
   The most important part of your new report is this combination:
   - one source split
   - `startup.mode = initial`
   - snapshot starts
   - streaming begins about 18 seconds later
   - downstream row count stalls at `93,354` while the source table had 
`522,625` rows (about 18% materialized)
   - `resolvedTs` and checkpoints keep advancing afterward
   
   That strongly suggests a bug in the snapshot-to-stream handoff rather than 
a normal, expected transition. In other words, this now looks much closer to 
either:
   1. the snapshot phase exiting before the split was fully materialized, or
   2. the reader entering streaming with incomplete materialization state and 
then continuing to advance `resolvedTs`.
   
   So this issue is worth keeping separate and open. It is related to `#8815`, 
but the new evidence here is much more specific and should help us drive a more 
targeted fix.
   
   The next high-value step would be a deterministic regression test around the 
single-split `INITIAL` path, especially validating that the reader does not 
enter effective streaming progress before the snapshot for that split is fully 
drained downstream.
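   
   To make the shape of such a test concrete, here is a minimal Python sketch 
of the invariant it would check. Everything in it is hypothetical: 
`SingleSplitReader`, `poll`, `try_enter_streaming`, and `check_handoff` are 
illustrative names, not SeaTunnel APIs, and the reader is a stand-in that 
drains an in-memory snapshot. A real test would drive the actual reader against 
a fixed dataset.

```python
class SingleSplitReader:
    """Stand-in for a reader with one snapshot split (hypothetical)."""

    def __init__(self, snapshot_rows):
        self._pending = list(snapshot_rows)  # snapshot rows not yet emitted
        self.emitted = []                    # rows delivered downstream
        self.streaming = False

    def poll(self, batch_size):
        # Emit up to batch_size snapshot rows and return how many were sent.
        batch = self._pending[:batch_size]
        self._pending = self._pending[batch_size:]
        self.emitted.extend(batch)
        return len(batch)

    def try_enter_streaming(self):
        # Guard under test: refuse the handoff while snapshot rows remain.
        if self._pending:
            return False
        self.streaming = True
        return True


def check_handoff(reader, expected_rows, batch_size):
    # Keep polling until the reader accepts the handoff; it must never report
    # streaming while the snapshot is still being drained.
    while not reader.try_enter_streaming():
        assert not reader.streaming
        reader.poll(batch_size)
    # At handoff time every snapshot row must already be downstream.
    assert len(reader.emitted) == expected_rows


# Uses the source table size from the report as the expected row count.
check_handoff(SingleSplitReader(range(522_625)), 522_625, batch_size=1_000)
```

   A reader with the bug described above would fail the final assertion, 
because it would accept the handoff with only part of the snapshot emitted.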


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.