zhaoxqing opened a new issue, #7496: URL: https://github.com/apache/seatunnel/issues/7496
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/seatunnel/issues?q=is%3Aissue+label%3A%22bug%22) and found no similar issues.

### What happened

The job's task turned from state RUNNING to FAILED, but the job status still shows RUNNING and the data in the sink database no longer increases. The same happens even in situations like OOM (`Caused by: java.lang.OutOfMemoryError: Java heap space`).

### SeaTunnel Version

2.3.7

### SeaTunnel Config

```conf
# SeaTunnel configuration file
env {
  parallelism = 1
  job.mode = "STREAMING"
  job.name = "callcenter"
  checkpoint.interval = 300000
  # read_limit.bytes_per_second = 1000000
  # read_limit.rows_per_second = 5000
  connection_check_timeout_sec = 300
}

source {
  MySQL-CDC {
    catalog = {
      factory = MySQL
    }
    base-url = "jdbc:mysql://192.168.0.17:3306/callcenter?useUnicode=true&characterEncoding=utf-8&autoReconnect=true"
    driver = "com.mysql.cj.jdbc.Driver"
    server-id = "119010-119011"
    username = "xxxx"
    password = "yyyyyyyyyy"
    startup.mode = "initial"
    database-name = ["callcenter"]
    table-names = ["callcenter.users"]
  }
}

transform {
}

sink {
  jdbc {
    url = "jdbc:mysql://localhost:3306/callcenter?useUnicode=true&characterEncoding=UTF-8&rewriteBatchedStatements=true"
    driver = "com.mysql.cj.jdbc.Driver"
    user = "yyyy"
    password = "xxxxxxxxxxxxxxxxx"
    generate_sink_sql = true
    database = "callcenter"
    connection_check_timeout_sec = 100
  }
}
```

### Running Command

```shell
./bin/seatunnel.sh --config /app/seatunnel/conf/callcenter.conf --async
```

### Error Exception

```log
2024-08-26 10:25:34,848 INFO [o.a.s.e.s.CoordinatorService ] [hz.main.generic-operation.thread-15] - [seatunnel-node02]:5801 [seatunnel-cluster] [5.1] Received task end from execution TaskGroupLocation{jobId=879947535932719107, pipelineId=1, taskGroupId=30000}, state FAILED
2024-08-26 10:25:34,973 INFO [o.a.s.e.s.d.p.PhysicalVertex ] [hz.main.generic-operation.thread-15] - Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SourceTask (1/1)] turned from state RUNNING to FAILED.
2024-08-26 10:25:34,973 INFO [o.a.s.e.s.d.p.PhysicalVertex ] [hz.main.generic-operation.thread-15] - Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SourceTask (1/1)] state process is stopped
2024-08-26 10:25:34,973 ERROR [o.a.s.e.s.d.p.PhysicalVertex ] [hz.main.generic-operation.thread-15] - Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SourceTask (1/1)] end with state FAILED and Exception: com.hazelcast.core.OperationTimeoutException: GetOperation invocation failed to complete due to operation-heartbeat-timeout. Current time: 2024-08-26 10:24:23.826. Start time: 2024-08-26 10:22:23.853. Total elapsed time: 119974 ms. Last operation heartbeat: never. Last operation heartbeat from member: 2024-08-26 10:21:11.368. Invocation{op=com.hazelcast.map.impl.operation.GetOperation{serviceName='hz:impl:mapService', identityHash=1355377163, partitionId=110, replicaIndex=0, callId=-31976, invocationTime=1724638943605 (2024-08-26 10:22:23.605), waitTimeout=-1, callTimeout=60000, tenantControl=com.hazelcast.spi.impl.tenantcontrol.NoopTenantControl@0, name=engine_runningJobMetrics}, tryCount=20, tryPauseMillis=500, invokeCount=1, callTimeoutMillis=60000, firstInvocationTimeMs=1724638943853, firstInvocationTime='2024-08-26 10:22:23.853', lastHeartbeatMillis=0, lastHeartbeatTime='1970-01-01 08:00:00.000', target=[seatunnel-node01]:5801, pendingResponse={VOID}, backupsAcksExpected=-1, backupsAcksReceived=0, connection=null}
2024-08-26 10:25:34,973 INFO [o.a.s.e.s.d.p.SubPlan ] [seatunnel-coordinator-service-525] - Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SourceTask (1/1)] future complete with state FAILED
2024-08-26 10:25:34,973 ERROR [o.a.s.e.s.d.p.SubPlan ] [seatunnel-coordinator-service-525] - Task TaskGroupLocation{jobId=879947535932719107, pipelineId=1, taskGroupId=30000} Failed in Job callcenter (879947535932719107), Pipeline: [(1/1)], Begin to cancel other tasks in this pipeline.
2024-08-26 10:25:34,974 INFO [o.a.s.e.s.m.JobMaster ] [hz.main.generic-operation.thread-15] - release the task group resource TaskGroupLocation{jobId=879947535932719107, pipelineId=1, taskGroupId=30000}
2024-08-26 10:25:35,101 INFO [o.a.s.e.s.d.p.SubPlan ] [seatunnel-coordinator-service-525] - Job callcenter (879947535932719107), Pipeline: [(1/1)] turned from state RUNNING to FAILING.
2024-08-26 10:25:35,102 INFO [o.a.s.e.s.d.p.PhysicalVertex ] [seatunnel-coordinator-service-525] - Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SplitEnumerator (1/1)] state process is start
2024-08-26 10:25:35,195 INFO [o.a.s.e.s.d.p.PhysicalVertex ] [seatunnel-coordinator-service-525] - Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SplitEnumerator (1/1)] turned from state RUNNING to CANCELING.
2024-08-26 10:25:35,326 INFO [o.a.s.e.s.d.p.PhysicalVertex ] [seatunnel-coordinator-service-525] - Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SplitEnumerator (1/1)] turned from state CANCELING to CANCELED.
2024-08-26 10:25:35,326 INFO [o.a.s.e.s.d.p.PhysicalVertex ] [seatunnel-coordinator-service-525] - Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SplitEnumerator (1/1)] state process is stopped
2024-08-26 10:25:35,327 INFO [o.a.s.e.s.d.p.PhysicalVertex ] [seatunnel-coordinator-service-525] - Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SourceTask (1/1)] state process is start
2024-08-26 10:25:35,327 INFO [o.a.s.e.s.d.p.SubPlan ] [seatunnel-coordinator-service-536] - Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SplitEnumerator (1/1)] future complete with state CANCELED
2024-08-26 10:25:35,327 INFO [.s.e.s.c.CheckpointCoordinator] [seatunnel-coordinator-service-536] - Turn checkpoint_state_879947535932719107_1 state from RUNNING to CANCELED
2024-08-26 10:25:35,375 INFO [o.a.s.e.s.d.p.SubPlan ] [seatunnel-coordinator-service-536] - Job callcenter (879947535932719107), Pipeline: [(1/1)] will end with state FAILED
2024-08-26 10:25:35,459 INFO [o.a.s.e.s.d.p.SubPlan ] [seatunnel-coordinator-service-536] - Job callcenter (879947535932719107), Pipeline: [(1/1)] turned from state FAILING to FAILED.
2024-08-26 10:25:35,459 INFO [o.a.s.e.s.d.p.SubPlan ] [seatunnel-coordinator-service-536] - Restore time 2, pipeline Job callcenter (879947535932719107), Pipeline: [(1/1)]
2024-08-26 10:25:35,459 INFO [o.a.s.e.s.d.p.SubPlan ] [seatunnel-coordinator-service-536] - Reset pipeline Job callcenter (879947535932719107), Pipeline: [(1/1)] state to CREATED
2024-08-26 10:25:35,558 INFO [o.a.s.e.s.d.p.SubPlan ] [seatunnel-coordinator-service-536] - Reset pipeline Job callcenter (879947535932719107), Pipeline: [(1/1)] state to CREATED complete
2024-08-26 10:25:35,642 INFO [o.a.s.e.s.d.p.PhysicalVertex ] [seatunnel-coordinator-service-536] - Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SplitEnumerator (1/1)] turn to state CREATED.
2024-08-26 10:25:35,657 INFO [o.a.s.e.s.CoordinatorService ] [hz.main.generic-operation.thread-18] - [seatunnel-node02]:5801 [seatunnel-cluster] [5.1] Received task end from execution TaskGroupLocation{jobId=879947535932719107, pipelineId=1, taskGroupId=1}, state FAILED
2024-08-26 10:25:35,740 INFO [o.a.s.e.s.d.p.PhysicalVertex ] [seatunnel-coordinator-service-536] - Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SourceTask (1/1)] turn to state CREATED.
2024-08-26 10:25:35,740 INFO [.s.e.s.c.CheckpointCoordinator] [seatunnel-coordinator-service-536] - Turn checkpoint_state_879947535932719107_1 state from CANCELED to RUNNING
2024-08-26 10:25:35,780 INFO [o.a.s.e.s.d.p.PhysicalVertex ] [seatunnel-coordinator-service-536] - The task Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SourceTask (1/1)] is in state CREATED when init state future
2024-08-26 10:25:35,819 INFO [o.a.s.e.s.d.p.PhysicalVertex ] [seatunnel-coordinator-service-536] - The task Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SplitEnumerator (1/1)] is in state FAILED when init state future
2024-08-26 10:25:35,819 INFO [o.a.s.e.s.d.p.SubPlan ] [seatunnel-coordinator-service-536] - Wait 3s and then restore the pipeline Job callcenter (879947535932719107), Pipeline: [(1/1)]
2024-08-26 10:25:35,820 INFO [o.a.s.e.s.d.p.PhysicalVertex ] [hz.main.generic-operation.thread-18] - Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SplitEnumerator (1/1)] turned from state CREATED to FAILED.
2024-08-26 10:25:35,820 WARN [o.a.s.e.s.d.p.PhysicalVertex ] [hz.main.generic-operation.thread-18] - Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SplitEnumerator (1/1)] state process is not start
2024-08-26 10:25:38,819 INFO [o.a.s.e.s.m.JobMaster ] [seatunnel-coordinator-service-536] - release the pipeline Job callcenter (879947535932719107), Pipeline: [(1/1)] resource
2024-08-26 10:25:38,875 INFO [o.a.s.e.s.d.p.SubPlan ] [seatunnel-coordinator-service-536] - Job callcenter (879947535932719107), Pipeline: [(1/1)] state process is start
2024-08-26 10:25:38,992 INFO [o.a.s.e.s.d.p.SubPlan ] [seatunnel-coordinator-service-536] - Job callcenter (879947535932719107), Pipeline: [(1/1)] turned from state CREATED to SCHEDULED.
2024-08-26 10:25:39,158 INFO [o.a.s.e.s.d.p.SubPlan ] [seatunnel-coordinator-service-536] - Job callcenter (879947535932719107), Pipeline: [(1/1)] turned from state SCHEDULED to DEPLOYING.
2024-08-26 10:25:39,158 INFO [o.a.s.e.s.d.p.PhysicalVertex ] [seatunnel-coordinator-service-536] - Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SourceTask (1/1)] state process is start
2024-08-26 10:25:39,271 INFO [o.a.s.e.s.d.p.PhysicalVertex ] [seatunnel-coordinator-service-536] - Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SourceTask (1/1)] turned from state CREATED to DEPLOYING.
2024-08-26 10:25:39,371 INFO [o.a.s.e.s.d.p.PhysicalVertex ] [seatunnel-coordinator-service-536] - Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SourceTask (1/1)] turned from state DEPLOYING to RUNNING.
2024-08-26 10:25:39,371 INFO [o.a.s.e.s.d.p.PhysicalVertex ] [seatunnel-coordinator-service-536] - Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SourceTask (1/1)] current state equals target state: RUNNING, skip
2024-08-26 10:25:39,452 INFO [o.a.s.e.s.d.p.SubPlan ] [seatunnel-coordinator-service-536] - Job callcenter (879947535932719107), Pipeline: [(1/1)] turned from state DEPLOYING to RUNNING.
2024-08-26 10:27:35,829 INFO [o.a.s.e.s.CoordinatorService ] [hz.main.generic-operation.thread-4] - [seatunnel-node02]:5801 [seatunnel-cluster] [5.1] Received task end from execution TaskGroupLocation{jobId=879947535932719107, pipelineId=1, taskGroupId=30000}, state FAILED
2024-08-26 10:27:35,961 INFO [o.a.s.e.s.d.p.PhysicalVertex ] [hz.main.generic-operation.thread-4] - Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SourceTask (1/1)] turned from state RUNNING to FAILED.
2024-08-26 10:27:35,961 INFO [o.a.s.e.s.d.p.PhysicalVertex ] [hz.main.generic-operation.thread-4] - Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SourceTask (1/1)] state process is stopped
2024-08-26 10:27:35,961 ERROR [o.a.s.e.s.d.p.PhysicalVertex ] [hz.main.generic-operation.thread-4] - Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SourceTask (1/1)] end with state FAILED and Exception: com.hazelcast.core.OperationTimeoutException: GetOperation invocation failed to complete due to operation-heartbeat-timeout. Current time: 2024-08-26 10:24:23.826. Start time: 2024-08-26 10:22:23.853. Total elapsed time: 119974 ms. Last operation heartbeat: never. Last operation heartbeat from member: 2024-08-26 10:21:11.368. Invocation{op=com.hazelcast.map.impl.operation.GetOperation{serviceName='hz:impl:mapService', identityHash=1355377163, partitionId=110, replicaIndex=0, callId=-31976, invocationTime=1724638943605 (2024-08-26 10:22:23.605), waitTimeout=-1, callTimeout=60000, tenantControl=com.hazelcast.spi.impl.tenantcontrol.NoopTenantControl@0, name=engine_runningJobMetrics}, tryCount=20, tryPauseMillis=500, invokeCount=1, callTimeoutMillis=60000, firstInvocationTimeMs=1724638943853, firstInvocationTime='2024-08-26 10:22:23.853', lastHeartbeatMillis=0, lastHeartbeatTime='1970-01-01 08:00:00.000', target=[seatunnel-node01]:5801, pendingResponse={VOID}, backupsAcksExpected=-1, backupsAcksReceived=0, connection=null}
2024-08-26 10:27:35,961 INFO [o.a.s.e.s.d.p.SubPlan ] [seatunnel-coordinator-service-536] - Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SourceTask (1/1)] future complete with state FAILED
2024-08-26 10:27:35,961 ERROR [o.a.s.e.s.d.p.SubPlan ] [seatunnel-coordinator-service-536] - Task TaskGroupLocation{jobId=879947535932719107, pipelineId=1, taskGroupId=30000} Failed in Job callcenter (879947535932719107), Pipeline: [(1/1)], Begin to cancel other tasks in this pipeline.
2024-08-26 10:27:35,962 INFO [o.a.s.e.s.m.JobMaster ] [hz.main.generic-operation.thread-4] - release the task group resource TaskGroupLocation{jobId=879947535932719107, pipelineId=1, taskGroupId=30000}
2024-08-26 10:27:36,055 INFO [o.a.s.e.s.d.p.SubPlan ] [seatunnel-coordinator-service-536] - Job callcenter (879947535932719107), Pipeline: [(1/1)] turned from state RUNNING to FAILING.
2024-08-26 10:27:36,055 INFO [o.a.s.e.s.d.p.PhysicalVertex ] [seatunnel-coordinator-service-536] - Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SplitEnumerator (1/1)] state process is start
2024-08-26 10:27:36,055 INFO [o.a.s.e.s.d.p.PhysicalVertex ] [seatunnel-coordinator-service-536] - Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SourceTask (1/1)] state process is start
2024-08-26 10:28:50,405 INFO [o.a.s.e.s.CoordinatorService ] [hz.main.generic-operation.thread-33] - [seatunnel-node02]:5801 [seatunnel-cluster] [5.1] Received task end from execution TaskGroupLocation{jobId=879947535932719107, pipelineId=1, taskGroupId=30000}, state FAILED
2024-08-26 10:28:50,405 INFO [o.a.s.e.s.d.p.PhysicalVertex ] [hz.main.generic-operation.thread-33] - Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SourceTask (1/1)] current state equals target state: FAILED, skip
2024-08-26 10:28:50,405 INFO [o.a.s.e.s.m.JobMaster ] [hz.main.generic-operation.thread-33] - release the task group resource TaskGroupLocation{jobId=879947535932719107, pipelineId=1, taskGroupId=30000}
```

### Zeta or Flink or Spark Version

Zeta 2.3.7

### Java or Scala Version

Java 1.8

### Screenshots

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!
### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
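
For triage, the state transitions in the log under "Error Exception" can be tallied to see in which state each task and the pipeline were last left. The following is a minimal Python sketch and not part of SeaTunnel: the `last_states` helper and its regular expression are hypothetical, and it assumes the engine log keeps the `turned from state X to Y` wording seen in that log. It deliberately ignores the `turn to state CREATED` reset lines.

```python
import re
from typing import Dict

# Hypothetical pattern, derived only from the log lines quoted in this issue.
# It matches both task-level lines ("..., task: [<name> (1/1)] turned from state A to B")
# and pipeline-level lines ("Pipeline: [(1/1)] turned from state A to B").
TRANSITION = re.compile(
    r"Pipeline: \[(?P<pipeline>\(\d+/\d+\))\]"
    r"(?:, task: \[(?P<task>[^,]+?\(\d+/\d+\))\])?"
    r" turned from state (?P<old>[A-Z]+) to (?P<new>[A-Z]+)"
)


def last_states(log_text: str) -> Dict[str, str]:
    """Return the most recent target state per task or pipeline found in log_text."""
    states: Dict[str, str] = {}
    for match in TRANSITION.finditer(log_text):
        # Task-level lines are keyed by task name, pipeline-level lines by pipeline index.
        name = match.group("task") or "pipeline " + match.group("pipeline")
        states[name] = match.group("new")  # later transitions overwrite earlier ones
    return states


# Four message fragments copied from the log in this issue (timestamps omitted).
SAMPLE = "\n".join([
    "Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 "
    "[Source[0]-MySQL-CDC]-SourceTask (1/1)] turned from state DEPLOYING to RUNNING.",
    "Job callcenter (879947535932719107), Pipeline: [(1/1)] turned from state DEPLOYING to RUNNING.",
    "Job callcenter (879947535932719107), Pipeline: [(1/1)], task: [pipeline-1 "
    "[Source[0]-MySQL-CDC]-SourceTask (1/1)] turned from state RUNNING to FAILED.",
    "Job callcenter (879947535932719107), Pipeline: [(1/1)] turned from state RUNNING to FAILING.",
])

if __name__ == "__main__":
    for name, state in last_states(SAMPLE).items():
        print(f"{name}: {state}")
```

In the log above, the last pipeline transition recorded is RUNNING to FAILING at 10:27:36,055, with no later FAILING to FAILED line, which is the kind of stall this tally is meant to make visible.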
