wuhainan commented on issue #11013:
URL: https://github.com/apache/seatunnel/issues/11013#issuecomment-4641280555

   > That additional observation is useful.
   > 
   > I checked the current reader flow more closely, and I do not see an 
intentional handoff that marks the snapshot as finished early in the normal 
path. In `INITIAL` mode, the reader only flips `snapshotCompleted` after 
`snapshotEvents(...)` returns, and `resolvedTs` is only advanced after the 
snapshot scan reaches the end of the current split.
   > 
   > So the `~91,746` ceiling you observed is more likely a symptom of the same 
underlying reader bug family rather than a separate "expected switch to 
streaming" path. In other words, this still looks like either:
   > 
   > 1. the snapshot phase exited prematurely for some split, or
   > 2. the reader reached streaming with an already incomplete materialization 
state and then continued advancing `resolvedTs`.
   > 
   > The most useful next evidence, if you still have it, would be:
   > 
   > 1. the startup log lines around `[splitId] Snapshot events start.` and the 
first `Capture streaming event from resolvedTs` for the same split
   > 2. how many source splits were assigned for that table in the job
   > 
   > That will let us distinguish "snapshot never fully drained" from 
"streaming advanced after a partial snapshot" much more cleanly. Either way, 
this still looks like a real TiDB-CDC reader bug worth keeping open.
   
   I started a new isolated reproduction job and collected cleaner evidence.
   
   Job:
   
   - Application: `application_1756448821197_5356`
   - Job name: `tidb2tidb_deposit_applications`
   - SeaTunnel connector: `connector-cdc-tidb-2.3.13`
   - TiKV client: `tikv-client-java-3.2.0`
   - `startup.mode = initial`
   - Flink source parallelism: `1`
   
   The source table row count was:
   
       alpha_online.deposit_applications: 522,625 rows
   
   The target table stopped at:
   
       target table: 93,354 rows
   
   The Flink metrics also stopped at exactly the same number:
   
       Source numRecordsOut = 93354
       Sink numRecordsIn = 93354
       SourceReceivedQPS = 0.0
       SinkWriteQPS = 0.0
       Sink numRecordsSendErrors = 0
       backPressuredTimeMsPerSecond = 0
       idleTimeMsPerSecond = 1000
   
   I queried the same metrics twice and the values stayed at `93354`, so the 
job was no longer emitting rows downstream.
   
   Startup log evidence:
   
   2026-06-07 10:46:40,705 INFO  
org.apache.seatunnel.connectors.seatunnel.cdc.tidb.source.reader.TiDBSourceReader
 [] - [alpha_online:deposit_applications:<ByteString@446b0db2 size=19 
contents="t\200\000\000\000\000\000e\377_r\000\000\000\000\000\000\000\000">-<ByteString@2828c34c
 size=19 
contents="t\200\000\000\000\000\000e\377_s\000\000\000\000\000\000\000\000">] 
Snapshot events start.
   2026-06-07 10:46:41,460 WARN  
org.apache.seatunnel.shade.com.zaxxer.hikari.HikariConfig    [] - HikariPool-1 
- idleTimeout has been set but has no effect because the pool is operating as a 
fixed size pool.
   2026-06-07 10:46:41,460 INFO  
org.apache.seatunnel.shade.com.zaxxer.hikari.HikariDataSource [] - HikariPool-1 
- Starting...
   2026-06-07 10:46:41,474 INFO  
org.apache.seatunnel.shade.com.zaxxer.hikari.HikariDataSource [] - HikariPool-1 
- Start completed.
   2026-06-07 10:46:59,162 INFO  
org.apache.seatunnel.connectors.seatunnel.cdc.tidb.source.reader.TiDBSourceReader
 [] - Capture streaming event from resolvedTs:466826140257353737
   2026-06-07 10:46:59,171 INFO  org.tikv.cdc.CDCClient                         
              [] - remove regions: []
   2026-06-07 10:46:59,171 INFO  org.tikv.cdc.CDCClient                         
              [] - add regions: [{Region[1485688] ConfVer[313901] 
Version[12789] Store[177965148] 
KeyRange[t\200\000\000\000\000\000e\377_r\200\000\000\000\000\003Z\247]:[t\200\000\000\000\000\000e\377_r\200\000\000\000\000\f\206\233]},
 {Region[1749236] ConfVer[313949] Version[12781] Store[177965148] 
KeyRange[t\200\000\000\000\000\000e\377_r\200\000\000\000\00029\351]:[t\200\000\000\000\000\000f\001_i\200\000\000\000\000\000\000\002\000\003\200\000\000\000\000\000C\225]},
 {Region[164304772] ConfVer[313907] Version[12784] Store[177965147] 
KeyRange[t\200\000\000\000\000\000e\377_i\200\000\000\000\000\000\000\004\004\000\000\027\365|\305\270\375\003\200\000\000\000\000\037!\316]:[t\200\000\000\000\000\000e\377_r\200\000\000\000\000\003Z\247]},
 {Region[164304978] ConfVer[313919] Version[12781] Store[164454270] 
KeyRange[t\200\000\000\000\000\000e\377_r\200\000\000\000\000\f\206\233]:[t\200\000\000\000\000\000e
 \377_r\200\000\000\000\000\025\351\000]}, {Region[575192972] ConfVer[313955] 
Version[12781] Store[177965147] 
KeyRange[t\200\000\000\000\000\000e\377_r\200\000\000\000\000\025\351\000]:[t\200\000\000\000\000\000e\377_r\200\000\000\000\00029\351]}],
 timestamp: 466826140257353737
   2026-06-07 10:46:59,175 INFO  org.tikv.cdc.RegionCDCClient                   
              [] - start streaming region: 1485688, running: true
   2026-06-07 10:46:59,181 INFO  org.tikv.cdc.RegionCDCClient                   
              [] - start streaming region: 1749236, running: true
   2026-06-07 10:46:59,181 INFO  org.tikv.cdc.RegionCDCClient                   
              [] - start streaming region: 164304772, running: true
   2026-06-07 10:46:59,181 INFO  org.tikv.cdc.RegionCDCClient                   
              [] - start streaming region: 164304978, running: true
   2026-06-07 10:46:59,181 INFO  org.tikv.cdc.RegionCDCClient                   
              [] - start streaming region: 575192972, running: true
   2026-06-07 10:46:59,182 INFO  org.tikv.cdc.CDCClient                         
              [] - keyRange applied
   2026-06-07 10:46:59,282 INFO  
org.apache.seatunnel.connectors.seatunnel.cdc.tidb.source.reader.TiDBSourceReader
 [] - Capture streaming event next resolvedTs:466826140257353737
   2026-06-07 10:46:59,284 INFO  
org.apache.seatunnel.connectors.seatunnel.cdc.tidb.source.reader.TiDBSourceReader
 [] - Capture streaming event from resolvedTs:466826140257353737
   2026-06-07 10:46:59,384 INFO  
org.apache.seatunnel.connectors.seatunnel.cdc.tidb.source.reader.TiDBSourceReader
 [] - Capture streaming event next resolvedTs:466826140257353737
   2026-06-07 10:47:00,385 INFO  
org.apache.seatunnel.connectors.seatunnel.cdc.tidb.source.reader.TiDBSourceReader
 [] - Capture streaming event from resolvedTs:466826140257353737
   2026-06-07 10:47:00,386 INFO  org.tikv.cdc.CDCClient                         
              [] - handle resolvedTs: 466826145175175177, regionId: 575192972
   
   So the reader entered streaming about 18 seconds after `Snapshot events 
start`, while only about `93,354 / 522,625` rows were materialized downstream.
   
   For the source split count:
   
   - The Flink execution plan shows source parallelism was `1`.
   - The TaskManager log contains only one occurrence of `Snapshot events 
start`.
   - The 5 regions shown in the CDCClient log are TiKV regions watched by 
CDCClient for that one split, not SeaTunnel source splits.
   
   After that, `resolvedTs` kept advancing:
   
       2026-06-07 10:56:55,574 INFO  TiDBSourceReader - Capture streaming event 
next resolvedTs:466826177279950865
       2026-06-07 10:56:57,574 INFO  TiDBSourceReader - Capture streaming event 
next resolvedTs:466826177355972613
   
   Checkpoints also kept succeeding:
   
       2026-06-07 10:56:29,604 INFO  CheckpointCoordinator - Completed 
checkpoint 20
       2026-06-07 10:56:59,591 INFO  CheckpointCoordinator - Completed 
checkpoint 21
   
   This looks like a reproducible case where the reader entered the streaming 
phase before the initial snapshot was fully materialized downstream.
   
   I still cannot prove the exact internal reason from logs alone because there 
is no explicit `Snapshot events finished` log or snapshot row count, but the 
evidence strongly suggests that the snapshot phase exited prematurely for the 
split, then the reader continued advancing `resolvedTs` in streaming mode.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to