github-actions[bot] commented on code in PR #63079:
URL: https://github.com/apache/doris/pull/63079#discussion_r3239195177


##########
fe/fe-core/src/main/java/org/apache/doris/job/offset/jdbc/JdbcTvfSourceOffsetProvider.java:
##########
@@ -312,9 +300,12 @@ public void updateOffset(Offset offset) {
      */
     @Override
     public void replayIfNeed(StreamingInsertJob job) throws JobException {

Review Comment:
   Restoring `cachedSyncTables` alone is not enough for TVF mid-snapshot 
recovery. This override restores `remainingSplits`/`finishedSplits` from 
`streaming_job_meta` and committed offsets, but unlike the base 
`JdbcSourceOffsetProvider.replayIfNeed()` it never rebuilds `cdcSplitProgress` 
from the last restored mid-table split. After FE restart, 
`cdcSplitProgress.currentSplittingTable` stays null; because the single TVF 
table is already present in `remainingSplits`/`finishedSplits`, inherited 
`noMoreSplits()` returns true and `advanceSplitsIfNeed()` stops fetching 
subsequent batches. Once the restored batch is drained, the job can transition 
to binlog/snapshot completion and skip the rest of the snapshot. Please mirror 
the base replay cursor restoration (`findResumeMidSplit` + 
`applySplitToProgress`, or equivalent) for the TVF path after restoring split 
metadata. This is distinct from the earlier `cachedSyncTables`-null issue: the 
table cache is now restored, but the async fetch cursor is still missing.
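
   To illustrate the failure mode, here is a minimal, self-contained sketch. 
It is not the Doris implementation: the `Split` and `SplitProgress` classes, 
the method signatures, and the rule that a null `splitEnd` marks a table's 
final split are all assumptions made for illustration. Only the names 
`findResumeMidSplit`, `applySplitToProgress`, `noMoreSplits`, and 
`currentSplittingTable` come from the code under review.

```java
import java.util.List;

public class ReplayCursorSketch {
    /** Hypothetical, simplified split. Assumption: a null splitEnd marks a table's final split. */
    static final class Split {
        final String table;
        final Object splitEnd;

        Split(String table, Object splitEnd) {
            this.table = table;
            this.splitEnd = splitEnd;
        }
    }

    /** Hypothetical stand-in for cdcSplitProgress. */
    static final class SplitProgress {
        String currentSplittingTable; // null means no table is mid-split
        Object nextSplitStart;        // where the next fetched batch would begin
    }

    /** Returns the last restored split if it is not the table's final split, else null. */
    static Split findResumeMidSplit(List<Split> restoredInOrder) {
        if (restoredInOrder.isEmpty()) {
            return null;
        }
        Split last = restoredInOrder.get(restoredInOrder.size() - 1);
        return last.splitEnd == null ? null : last;
    }

    /** Points the async fetch cursor at the table and position where splitting stopped. */
    static void applySplitToProgress(Split split, SplitProgress progress) {
        progress.currentSplittingTable = split.table;
        progress.nextSplitStart = split.splitEnd;
    }

    /** Simplified check: splitting is done when no table is mid-split and none are pending. */
    static boolean noMoreSplits(SplitProgress progress, List<String> pendingTables) {
        return progress.currentSplittingTable == null && pendingTables.isEmpty();
    }
}
```

   In this model, replaying two restored splits of a single table without 
calling `applySplitToProgress` leaves `currentSplittingTable` null, so 
`noMoreSplits` reports true even though the table was only partly split. 
Applying the resume split makes it report false, and fetching can continue.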



##########
fe/fe-common/src/main/java/org/apache/doris/common/Config.java:
##########
@@ -1182,6 +1182,9 @@ public class Config extends ConfigBase {
     @ConfField(mutable = true, masterOnly = true)
     public static int streaming_cdc_heavy_rpc_timeout_sec = 600;
 
+    @ConfField(mutable = true, masterOnly = true)
+    public static int streaming_cdc_fetch_splits_batch_size = 100;

Review Comment:
   This mutable config is used directly as `FetchTableSplitsRequest.batchSize`, 
and both cdc_client split loops run only while `result.size() < batchSize`. If 
an operator sets this to `0` or a negative value dynamically, cdc_client 
returns an empty split batch without error; FE then keeps the current table 
cursor unchanged and will repeatedly see no progress. Please validate/clamp 
this config to a positive value before sending it (or fail the scheduler tick 
with a clear error) so an invalid dynamic config cannot wedge async splitting.
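
   The two options could look roughly like the sketch below. The class 
`BatchSizeGuard` and both method names are hypothetical and do not exist in 
Doris; the fallback of 100 simply mirrors the default declared in this diff. 
Which behavior fits better (silent clamp vs. failing the tick) is a design 
choice for the PR author.

```java
public class BatchSizeGuard {
    // Mirrors the default of streaming_cdc_fetch_splits_batch_size in this diff.
    static final int DEFAULT_BATCH_SIZE = 100;

    /** Option 1: fall back to the default when the configured value is not positive. */
    static int effectiveBatchSize(int configured) {
        return configured > 0 ? configured : DEFAULT_BATCH_SIZE;
    }

    /** Option 2: reject a non-positive value with a clear error. */
    static int requirePositiveBatchSize(int configured) {
        if (configured <= 0) {
            throw new IllegalArgumentException(
                    "streaming_cdc_fetch_splits_batch_size must be positive, got " + configured);
        }
        return configured;
    }
}
```

   Either guard would be applied once, where the request's `batchSize` is 
populated, so cdc_client never receives a value that makes its 
`result.size() < batchSize` loop condition false on entry.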



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

