yzeng1618 commented on issue #10571:
URL: https://github.com/apache/seatunnel/issues/10571#issuecomment-4020941433

   Thanks for reporting this.
   
   Based on code analysis, this also looks like a real bug, and the problem is 
likely in SQLServer CDC offset handling during restore.
   
   Preliminary conclusion:
   1. SeaTunnel’s SQLServer CDC path currently reduces the SQL Server position 
to `commit_lsn` in several places.
   2. But SQL Server/Debezium resume semantics require a more complete 
position, including at least:
      - `commit_lsn`
      - `change_lsn`
      - `event_serial_no`
   3. The restored incremental split state and the runtime filtering logic 
appear to compare only the reduced offset, which can cause post-restart events 
to be treated as already processed and therefore filtered out.
   
   Relevant code paths:
   - `IncrementalSourceRecordEmitter` updates incremental split startup offset 
from each source record
   - `LsnOffsetFactory.specific(...)` only rebuilds offset from `commit_lsn`
   - `SqlServerUtils.getLsnPosition(...)` also reduces the position to 
`commit_lsn`
   - `IncrementalSourceReader.snapshotState()` stores that reduced startup 
offset
   - `IncrementalSourceStreamFetcher.shouldEmit()` uses the restored startup 
offset as the filtering boundary after restart
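
To illustrate why a `commit_lsn`-only boundary can drop events, here is a minimal, self-contained sketch. It is not SeaTunnel or Debezium code: the `Position` class, both comparators, and `isAfter` are hypothetical names, the LSN values are made up, and LSNs are modelled as fixed-width lowercase hex strings so that plain string comparison matches their order. It also assumes the filter emits only records ordered strictly after the restored boundary.

```java
import java.util.Comparator;

public class LsnPositionSketch {

    /** Hypothetical value type holding the three offset fields named above. */
    static final class Position {
        final String commitLsn;
        final String changeLsn;
        final long eventSerialNo;

        Position(String commitLsn, String changeLsn, long eventSerialNo) {
            this.commitLsn = commitLsn;
            this.changeLsn = changeLsn;
            this.eventSerialNo = eventSerialNo;
        }
    }

    /** Reduced ordering: only commit_lsn takes part in the comparison. */
    static final Comparator<Position> COMMIT_ONLY =
            Comparator.comparing((Position p) -> p.commitLsn);

    /** Fuller ordering: commit_lsn, then change_lsn, then event_serial_no. */
    static final Comparator<Position> FULL =
            Comparator.comparing((Position p) -> p.commitLsn)
                    .thenComparing(p -> p.changeLsn)
                    .thenComparingLong(p -> p.eventSerialNo);

    /** Assumed filter rule: emit a record only if it is strictly after the boundary. */
    static boolean isAfter(Comparator<Position> order, Position record, Position boundary) {
        return order.compare(record, boundary) > 0;
    }

    public static void main(String[] args) {
        // Two changes that share a commit_lsn but differ in change_lsn.
        Position restored = new Position("00000027:00000ac0:0003", "00000027:00000ac0:0001", 1);
        Position next = new Position("00000027:00000ac0:0003", "00000027:00000ac0:0002", 1);

        System.out.println(isAfter(COMMIT_ONLY, next, restored)); // false: looks already processed
        System.out.println(isAfter(FULL, next, restored));        // true: would be emitted
    }
}
```

Under these assumptions, any change that shares a `commit_lsn` with the restored boundary compares as equal in the reduced ordering, so a strict "after" check would filter it out even though it has not been processed yet.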
   
   This matches the reported symptom: the first record after restart may be 
captured, while subsequent records can be missed because the resume/filter 
boundary is handled incorrectly.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
