yzeng1618 commented on issue #10574:
URL: https://github.com/apache/seatunnel/issues/10574#issuecomment-4020704191

   sequenceDiagram
       participant Broker as Kafka Broker
       participant SplitReader as KafkaPartitionSplitReader
       participant Emitter as KafkaRecordEmitter
       participant Reader as KafkaSourceReader
       participant SrcFlow as SourceFlowLifeCycle
       participant SinkFlow as SinkFlowLifeCycle
       participant JdbcWriter as JdbcSinkWriter
       participant Ckpt as CheckpointCoordinator
       participant Enum as KafkaSourceSplitEnumerator
   
       %% 正常消费链路
       Broker->>SplitReader: poll()
       SplitReader->>Emitter: emitRecord(record)
       Emitter->>Emitter: deserialize & update currentOffset
   
       %% 快照状态
       SrcFlow->>Reader: snapshotState(checkpointId)
       Reader->>SrcFlow: return KafkaSourceSplit(startOffset=currentOffset)
       SrcFlow->>Ckpt: save state & ack
   
       %% Sink 异常
       SinkFlow->>JdbcWriter: prepareCommit()
       JdbcWriter-->>SinkFlow: flush error / BatchUpdateException
       Note over JdbcWriter,SinkFlow: 非XA JDBC无恢复状态,sink失败
   
       %% 恢复流程(问题点)
       Ckpt-->>SrcFlow: restore latest completed checkpoint
       SrcFlow->>Enum: RestoredSplitOperation(addSplitsBack)
   
       %% 核心BUG
       Enum->>Enum: force use GROUP_OFFSETS
       Enum->>Enum: compute wrong start offset(endOffset+1/group offset)
       Enum->>SplitReader: assign restored split
   
       %% 最终错误
       SplitReader->>Broker: seek(wrong offset)
       Note over SplitReader,Broker: 未使用CK偏移量,导致重复/丢失


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to