hyboll opened a new issue, #9368: URL: https://github.com/apache/seatunnel/issues/9368
### Search before asking

- [x] I had searched in the [issues](https://github.com/apache/seatunnel/issues?q=is%3Aissue+label%3A%22bug%22) and found no similar issues.

### What happened

After a checkpoint timeout occurs in the source pipeline, the source pipeline's state changes to CANCELED and never recovers, but the job itself stays in the running state with a throughput of 0.

Inspecting the job with `bin/seatunnel.sh -j jobId` shows that the state in `pipelineStateMapperMap` is CANCELING. Of the two `TaskGroupLocation` objects in the inner `executionStateMap`, one is CANCELED and the other stays stuck in CANCELING.

### SeaTunnel Version

2.3.10

### SeaTunnel Config

```conf
{
  "env": {
    "checkpoint.interval": 300000,
    "checkpoint.timeout": 10000,
    "job.mode": "BATCH",
    "parallelism": 1
  },
  "sink": [
    {
      "data_save_mode": "APPEND_DATA",
      "database": "c_test",
      "doris.batch.size": 4096,
      "doris.config": {
        "column_separator": "\t",
        "enclose": "#",
        "format": "csv",
        "trim_double_quotes": true
      },
      "fenodes": "******:8030",
      "password": "******",
      "plugin_name": "DORIS",
      "sink.buffer-count": 3,
      "sink.buffer-size": "256 * 1024",
      "sink.enable-2pc": false,
      "sink.label-prefix": "custom_fields-1747459814",
      "sink.max-retries": 3,
      "table": "custom_fields",
      "username": "******"
    }
  ],
  "source": [
    {
      "access_key": "******",
      "access_secret": "******",
      "bucket": "obs://******",
      "endpoint": "https://******",
      "field_delimiter": "\u0001",
      "file_format_type": "text",
      "path": "/data/20250527/174000/1748338893139",
      "plugin_name": "OBSFILE",
      "schema": {
        "fields": {
          "ATTRIBUTE": "STRING",
          "CREATED_DATE_TIME": "TIMESTAMP",
          "HADP_RS_CUSTOM_FIELDS_TRANSACTION_ID": "BIGINT",
          "HADP_RS_CUSTOM_FIELDS_TRANS_NUMBER": "BIGINT",
          "HADP_RS_CUSTOM_FIELDS_TRANS_TYPE": "SHORT",
          "HANDLE": "STRING",
          "MODIFIED_DATE_TIME": "TIMESTAMP",
          "VALUE": "STRING"
        }
      },
      "skip_header_row_number": 1,
      "tmp_path": "/tmp"
    }
  ],
  "transform": [
    {
      "plugin_name": "SQL",
      "query": "select ATTRIBUTE, CREATED_DATE_TIME, HANDLE, MODIFIED_DATE_TIME, VALUE from dual"
    }
  ]
}
```

### Running Command

```shell
bin/seatunnel.sh -c doris.json
```

### Error Exception

```log
[979681127113424900] 2025-05-27 17:48:33,061 INFO [.s.e.s.c.CheckpointCoordinator] [checkpoint-coordinator-1/979681127113424900] - timeout checkpoint: [979681127113424900]/1/1, CHECKPOINT_TYPE
[979681127113424900] 2025-05-27 17:48:33,066 INFO [.s.e.s.c.CheckpointCoordinator] [checkpoint-coordinator-1/979681127113424900] - start clean pending checkpoint cause Checkpoint expired before completing. Please increase checkpoint timeout in the seatunnel.yaml or jobConfig env.
[979681127113424900] 2025-05-27 17:48:33,066 ERROR [s.e.s.c.CheckpointCoordinator] [seatunnel-coordinator-service-749] - trigger checkpoint failed
org.apache.seatunnel.engine.server.checkpoint.CheckpointException: Checkpoint expired before completing. Please increase checkpoint timeout in the seatunnel.yaml or jobConfig env.
    at org.apache.seatunnel.engine.server.checkpoint.PendingCheckpoint.abortCheckpoint(PendingCheckpoint.java:176) ~[seatunnel-starter.jar:2.3.10]
    at org.apache.seatunnel.engine.server.checkpoint.CheckpointCoordinator.lambda$cleanPendingCheckpoint$20(CheckpointCoordinator.java:789) ~[seatunnel-starter.jar:2.3.10]
    at java.util.concurrent.ConcurrentHashMap$ValuesView.forEach(ConcurrentHashMap.java:4705) ~[?:1.8.0_362]
    at org.apache.seatunnel.engine.server.checkpoint.CheckpointCoordinator.cleanPendingCheckpoint(CheckpointCoordinator.java:787) ~[seatunnel-starter.jar:2.3.10]
    at org.apache.seatunnel.engine.server.checkpoint.CheckpointCoordinator.handleCoordinatorError(CheckpointCoordinator.java:288) ~[seatunnel-starter.jar:2.3.10]
    at org.apache.seatunnel.engine.server.checkpoint.CheckpointCoordinator.lambda$startTriggerPendingCheckpoint$9(CheckpointCoordinator.java:667) ~[seatunnel-starter.jar:2.3.10]
    at org.apache.seatunnel.api.tracing.MDCRunnable.run(MDCRunnable.java:43) ~[seatunnel-starter.jar:2.3.10]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_362]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_362]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) ~[?:1.8.0_362]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[?:1.8.0_362]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_362]
```

### Zeta or Flink or Spark Version

Zeta

### Java or Scala Version

_No response_

### Screenshots

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
