yzeng1618 commented on issue #10571:
URL: https://github.com/apache/seatunnel/issues/10571#issuecomment-4020941433
Thanks for reporting this.
Based on code analysis, this also looks like a real bug, and the problem is
likely in SQLServer CDC offset handling during restore.
Preliminary conclusion:
1. SeaTunnel’s SQLServer CDC path currently reduces the SQL Server position
to `commit_lsn` in several places.
2. But SQL Server/Debezium resume semantics require a more complete
position, including at least:
- `commit_lsn`
- `change_lsn`
- `event_serial_no`
3. The restored incremental split state and the runtime filtering logic
appear to compare only the reduced offset, which can cause post-restart events
to be treated as already processed and therefore filtered out.
Relevant code paths:
- `IncrementalSourceRecordEmitter` updates incremental split startup offset
from each source record
- `LsnOffsetFactory.specific(...)` rebuilds the offset from `commit_lsn` only
- `SqlServerUtils.getLsnPosition(...)` also reduces the position to
`commit_lsn`
- `IncrementalSourceReader.snapshotState()` stores that reduced startup
offset
- `IncrementalSourceStreamFetcher.shouldEmit()` uses the restored startup
offset as the filtering boundary after restart
This matches the reported symptom: the first record after restart may be
captured, while subsequent records can be missed because the resume/filter
boundary is handled incorrectly.
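
To illustrate point 3, here is a minimal, self-contained sketch of how a
`commit_lsn`-only comparison can drop an event that a full-position comparison
would keep. Everything in it is hypothetical and for illustration only:
`LsnPositionSketch`, `Position`, `COMMIT_ONLY`, `FULL`, the sample LSN values,
and this `shouldEmit` are not SeaTunnel or Debezium code, and the sketch assumes
LSNs are fixed-width lowercase hex strings so that string ordering matches LSN
ordering. It also assumes a "strictly after the restored boundary" filter, which
may differ from the actual rule in `IncrementalSourceStreamFetcher`.

```java
import java.util.Comparator;

public class LsnPositionSketch {

    /** Hypothetical stand-in for a SQL Server CDC position (not a real SeaTunnel class). */
    static final class Position {
        final String commitLsn;   // LSN of the transaction commit
        final String changeLsn;   // LSN of the individual change inside the transaction
        final long eventSerialNo; // disambiguates events sharing both LSNs

        Position(String commitLsn, String changeLsn, long eventSerialNo) {
            this.commitLsn = commitLsn;
            this.changeLsn = changeLsn;
            this.eventSerialNo = eventSerialNo;
        }
    }

    /** Reduced ordering: only commit_lsn is considered. */
    static final Comparator<Position> COMMIT_ONLY =
            Comparator.comparing((Position p) -> p.commitLsn);

    /** Full ordering: commit_lsn, then change_lsn, then event_serial_no. */
    static final Comparator<Position> FULL =
            Comparator.comparing((Position p) -> p.commitLsn)
                    .thenComparing(p -> p.changeLsn)
                    .thenComparingLong(p -> p.eventSerialNo);

    /** Assumed filter rule: emit only events strictly after the restored boundary. */
    static boolean shouldEmit(Comparator<Position> order, Position event, Position restored) {
        return order.compare(event, restored) > 0;
    }

    public static void main(String[] args) {
        // Two changes in the same transaction share commit_lsn but differ in change_lsn.
        Position restored = new Position("00000027:00000ac0:0003", "00000027:00000ac0:0001", 1);
        Position next = new Position("00000027:00000ac0:0003", "00000027:00000ac0:0002", 1);

        // With the reduced ordering, the next event looks "not after" the boundary.
        System.out.println("commit-only emits next: " + shouldEmit(COMMIT_ONLY, next, restored));
        // With the full ordering, the next event is correctly seen as newer.
        System.out.println("full emits next: " + shouldEmit(FULL, next, restored));
    }
}
```

Under these assumptions, the reduced ordering reports `false` for the second
change and the full ordering reports `true`, which is consistent with events
in the same transaction being filtered out after a restore.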