github-actions[bot] commented on issue #17732:
URL: 
https://github.com/apache/dolphinscheduler/issues/17732#issuecomment-3578646489

   ### Search before asking
   
   - [x] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### What happened
   
   Tasks hang in the submitted status, with no further information.
   Here is the master log:
   
   [WI-0][TI-0] - 2025-11-26 02:26:44.426 WARN  [Curator-TreeCache-0] 
o.a.d.s.m.c.AbstractClusterSubscribeListener:[45] - Server 
MasterServerMetadata(super=BaseServerMetadata(processId=1629687, 
serverStartupTime=1764123193415, address=10.16.10.119:5678, 
cpuUsage=0.0012515644555694619, memoryUsage=0.09452163781361474, 
serverStatus=NORMAL)) removed
   [WI-0][TI-0] - 2025-11-26 02:26:44.426 WARN  [Curator-TreeCache-0] 
o.a.d.s.m.c.MasterSlotManager:[75] - Do rebalance failed, cannot found the 
current master: 10.16.10.119:5678 in the normal master clusters: []. Please 
check the current master server status
   [WI-0][TI-0] - 2025-11-26 02:26:44.426 INFO  [Curator-TreeCache-0] 
o.a.d.s.m.e.s.SystemEventBus:[40] - Published SystemEvent: 
MasterFailoverEvent{masterServerMetadata='MasterServerMetadata(super=BaseServerMetadata(processId=1629687,
 serverStartupTime=1764123193415, address=10.16.10.119:5678, 
cpuUsage=0.0012515644555694619, memoryUsage=0.09452163781361474, 
serverStatus=NORMAL))', eventTime=Wed Nov 26 02:26:44 UTC 2025, delayTime=30000}
   [WI-0][TI-0] - 2025-11-26 02:26:44.427 WARN  [Curator-TreeCache-0] 
o.a.d.s.m.c.AbstractClusterSubscribeListener:[45] - Server 
WorkerServerMetadata(workerGroup=default, workerWeight=100.0, 
taskThreadPoolUsage=0.0) removed
   [WI-0][TI-0] - 2025-11-26 02:26:44.427 INFO  [Curator-TreeCache-0] 
o.a.d.s.m.e.s.SystemEventBus:[40] - Published SystemEvent: 
WorkerFailoverEvent{workerServerMetadata='WorkerServerMetadata(workerGroup=default,
 workerWeight=100.0, taskThreadPoolUsage=0.0)', eventTime=Wed Nov 26 02:26:44 
UTC 2025, delayTime=30000}
   [WI-0][TI-0] - 2025-11-26 02:26:44.434 INFO  [Curator-TreeCache-0] 
o.a.d.r.a.h.DefaultServerStatusChangeListener:[32] - The status is standby now.
   [WI-0][TI-0] - 2025-11-26 02:26:44.434 INFO  [Curator-TreeCache-0] 
o.a.d.s.m.e.TaskGroupCoordinator:[463] - TaskGroupCoordinator closed
   [WI-0][TI-0] - 2025-11-26 02:26:44.435 ERROR [Thread-20] 
o.a.d.c.t.ThreadUtils:[80] - Current thread sleep error
   java.lang.InterruptedException: sleep interrupted
           at java.lang.Thread.sleep(Native Method)
           at 
org.apache.dolphinscheduler.common.thread.ThreadUtils.sleep(ThreadUtils.java:77)
           at 
org.apache.dolphinscheduler.server.master.engine.TaskGroupCoordinator.doStart(TaskGroupCoordinator.java:121)
           at java.lang.Thread.run(Thread.java:750)
   [WI-0][TI-0] - 2025-11-26 02:26:44.879 INFO  [Curator-TreeCache-0] 
o.a.d.s.m.c.AbstractClusterSubscribeListener:[41] - Server 
WorkerServerMetadata(workerGroup=default, workerWeight=100.0, 
taskThreadPoolUsage=0.0) added
   [WI-0][TI-0] - 2025-11-26 02:26:45.260 WARN  [MasterCommandLoopThread] 
o.a.d.s.m.e.c.IdSlotBasedCommandFetcher:[60] - MasterSlotManager check slot (-1 
-> 1)is invalidated.
   [WI-0][TI-0] - 2025-11-26 02:26:45.917 INFO  [Curator-TreeCache-0] 
o.a.d.s.m.c.AbstractClusterSubscribeListener:[41] - Server 
MasterServerMetadata(super=BaseServerMetadata(processId=1629687, 
serverStartupTime=1764123193415, address=10.16.10.119:5678, 
cpuUsage=0.003740648379052369, memoryUsage=0.09459177738163037, 
serverStatus=NORMAL)) added
   [WI-0][TI-0] - 2025-11-26 02:26:45.917 INFO  [Curator-TreeCache-0] 
o.a.d.s.m.c.MasterSlotManager:[89] - Do rebalance success, current master slot: 
0, total master slots: 1
   [WI-0][TI-0] - 2025-11-26 02:27:14.427 INFO  [SystemEventBusFireWorker] 
o.a.d.s.m.f.FailoverCoordinator:[105] - 
Master[MasterServerMetadata(super=BaseServerMetadata(processId=1629687, 
serverStartupTime=1764123193415, address=10.16.10.119:5678, 
cpuUsage=0.0012515644555694619, memoryUsage=0.09452163781361474, 
serverStatus=NORMAL))] failover starting
   [WI-0][TI-0] - 2025-11-26 02:27:14.427 INFO  [SystemEventBusFireWorker] 
o.a.d.s.m.f.FailoverCoordinator:[113] - The 
master[MasterServerMetadata(super=BaseServerMetadata(processId=1629687, 
serverStartupTime=1764123193415, address=10.16.10.119:5678, 
cpuUsage=0.0012515644555694619, memoryUsage=0.09452163781361474, 
serverStatus=NORMAL))] is alive, maybe it reconnect to registry skip failover
   [WI-0][TI-0] - 2025-11-26 02:27:14.427 INFO  [SystemEventBusFireWorker] 
o.a.d.s.m.e.s.SystemEventBusFireWorker:[103] - Fire SystemEvent: 
MasterFailoverEvent{masterServerMetadata='MasterServerMetadata(super=BaseServerMetadata(processId=1629687,
 serverStartupTime=1764123193415, address=10.16.10.119:5678, 
cpuUsage=0.0012515644555694619, memoryUsage=0.09452163781361474, 
serverStatus=NORMAL))', eventTime=Wed Nov 26 02:26:44 UTC 2025, 
delayTime=30000} cost: 0 ms
   [WI-0][TI-0] - 2025-11-26 02:27:14.427 INFO  [SystemEventBusFireWorker] 
o.a.d.s.m.f.FailoverCoordinator:[191] - 
Worker[WorkerServerMetadata(workerGroup=default, workerWeight=100.0, 
taskThreadPoolUsage=0.0)] failover starting
   [WI-0][TI-0] - 2025-11-26 02:27:14.427 INFO  [SystemEventBusFireWorker] 
o.a.d.s.m.f.FailoverCoordinator:[198] - The 
worker[WorkerServerMetadata(workerGroup=default, workerWeight=100.0, 
taskThreadPoolUsage=0.0)] is alive, maybe it reconnect to registry skip failover
   [WI-0][TI-0] - 2025-11-26 02:27:14.427 INFO  [SystemEventBusFireWorker] 
o.a.d.s.m.e.s.SystemEventBusFireWorker:[103] - Fire SystemEvent: 
WorkerFailoverEvent{workerServerMetadata='WorkerServerMetadata(workerGroup=default,
 workerWeight=100.0, taskThreadPoolUsage=0.0)', eventTime=Wed Nov 26 02:26:44 
UTC 2025, delayTime=30000} cost: 0 ms
   [WI-0][TI-0] - 2025-11-26 02:28:06.455 INFO  [MasterCommandHandleThreadPool] 
o.a.d.s.m.e.WorkflowEventBus:[41] - Publish event: 
WorkflowStartLifecycleEvent{workflow=03.CUST_加载(dim)_fr_project_code_dealer-v2-20251126022806126}
   [WI-0][TI-0] - 2025-11-26 02:28:06.456 INFO  [MasterCommandHandleThreadPool] 
o.a.d.s.m.e.c.CommandEngine:[174] - Success bootstrap command {
     "id" : 8928,
     "commandType" : "START_PROCESS",
     "workflowDefinitionCode" : 18819871298950,
     "workflowDefinitionVersion" : 19,
     "workflowInstanceId" : 14481,
     "commandParam" : 
"{\"commandType\":\"START_PROCESS\",\"subWorkflowInstance\":false,\"startNodes\":[],\"commandParams\":[{\"prop\":\"bizDate\",\"direct\":\"IN\",\"type\":\"VARCHAR\",\"value\":\"$[yyyy-MM-dd-1]\"},{\"prop\":\"tableName\",\"direct\":\"IN\",\"type\":\"VARCHAR\",\"value\":\"fr_project_code_dealer\"},{\"prop\":\"srcSystem\",\"direct\":\"IN\",\"type\":\"VARCHAR\",\"value\":\"yecai\"},{\"prop\":\"DMP_DB\",\"direct\":\"IN\",\"type\":\"VARCHAR\",\"value\":\"cust\"},{\"prop\":\"SRC_DB\",\"direct\":\"IN\",\"type\":\"VARCHAR\",\"value\":\"loan\"},{\"prop\":\"slctColums\",\"direct\":\"IN\",\"type\":\"VARCHAR\",\"value\":\"t.project_code,t.dealer_name,t.create_time,t.yewuyuan\"},{\"prop\":\"dof\",\"direct\":\"IN\",\"type\":\"VARCHAR\",\"value\":\"dim\"},{\"prop\":\"tableIdCol\",\"direct\":\"IN\",\"type\":\"VARCHAR\",\"value\":\"project_code\"}],\"timeZone\":\"UTC\"}",
     "workflowInstancePriority" : "MEDIUM",
     "executorId" : 0,
     "taskDependType" : "TASK_POST",
     "failureStrategy" : "CONTINUE",
     "warningType" : "NONE",
     "warningGroupId" : null,
     "scheduleTime" : null,
     "startTime" : null,
     "updateTime" : "2025-11-26 02:28:06",
     "workerGroup" : null,
     "tenantCode" : "default",
     "environmentCode" : -1,
     "dryRun" : 0
   }
   [WI-14481][TI-0] - 2025-11-26 02:28:06.471 INFO  
[ds-workflow-eventbus-worker-3] 
o.a.d.s.m.e.w.l.h.AbstractWorkflowLifecycleEventHandler:[47] - Begin fire 
workflow 03.CUST_加载(dim)_fr_project_code_dealer-v2-20251126022806126 
LifecycleEvent[WorkflowStartLifecycleEvent{workflow=03.CUST_加载(dim)_fr_project_code_dealer-v2-20251126022806126}]
 with state: RUNNING_EXECUTION
   [WI-14481][TI-0] - 2025-11-26 02:28:06.471 INFO  
[ds-workflow-eventbus-worker-3] o.a.d.s.m.e.WorkflowEventBus:[41] - Publish 
event: TaskStartLifecycleEvent{task=mysql->hdfs}
   [WI-14481][TI-0] - 2025-11-26 02:28:06.471 INFO  
[ds-workflow-eventbus-worker-3] 
o.a.d.s.m.e.w.l.h.AbstractWorkflowLifecycleEventHandler:[52] - Fired workflow 
03.CUST_加载(dim)_fr_project_code_dealer-v2-20251126022806126 
LifecycleEvent[WorkflowStartLifecycleEvent{workflow=03.CUST_加载(dim)_fr_project_code_dealer-v2-20251126022806126}]
 with state: RUNNING_EXECUTION
   [WI-14481][TI-0] - 2025-11-26 02:28:06.482 INFO  
[ds-workflow-eventbus-worker-3] o.a.d.s.m.e.WorkflowEventBus:[41] - Publish 
event: TaskDispatchLifecycleEvent{task=mysql->hdfs}
   [WI-14481][TI-0] - 2025-11-26 02:28:06.482 INFO  
[ds-workflow-eventbus-worker-3] 
o.a.d.s.m.e.t.l.h.AbstractTaskLifecycleEventHandler:[47] - Fired task 
mysql->hdfs TaskStartLifecycleEvent{task=mysql->hdfs} with state 
SUBMITTED_SUCCESS
   [WI-14481][TI-0] - 2025-11-26 02:28:06.486 INFO  
[ds-workflow-eventbus-worker-3] o.a.d.s.m.e.t.d.WorkerGroupDispatcher:[56] - 
Initialize WorkerGroupDispatcher: WorkerGroupTaskDispatcher-default
   [WI-14481][TI-0] - 2025-11-26 02:28:06.486 INFO  
[ds-workflow-eventbus-worker-3] o.a.d.s.m.e.t.d.WorkerGroupDispatcher:[62] - 
The WorkerGroupTaskDispatcher-default starting...
   [WI-14481][TI-0] - 2025-11-26 02:28:06.486 INFO  
[ds-workflow-eventbus-worker-3] o.a.d.s.m.e.t.d.WorkerGroupDispatcher:[64] - 
The WorkerGroupTaskDispatcher-default  started
   [WI-14481][TI-0] - 2025-11-26 02:28:06.486 INFO  
[ds-workflow-eventbus-worker-3] 
o.a.d.s.m.e.t.d.WorkerGroupDispatcherCoordinator:[59] - Success add 
Task[id=55958] to WorkerGroupDispatcher[name=default]
   [WI-14481][TI-0] - 2025-11-26 02:28:06.486 INFO  
[ds-workflow-eventbus-worker-3] 
o.a.d.s.m.e.t.l.h.AbstractTaskLifecycleEventHandler:[47] - Fired task 
mysql->hdfs TaskDispatchLifecycleEvent{task=mysql->hdfs} with state 
SUBMITTED_SUCCESS
   [WI-14481][TI-55958] - 2025-11-26 02:28:06.522 INFO  
[WorkerGroupTaskDispatcher-default] 
o.a.d.e.b.c.JdkDynamicRpcClientProxyFactory:[56] - Create DynamicRpcClientProxy 
cache for host: 10.16.10.117:1234
   [WI-0][TI-0] - 2025-11-26 02:28:06.576 INFO  
[MasterRpcServer-methodInvoker-1] o.a.d.s.m.e.WorkflowEventBus:[41] - Publish 
event: TaskDispatchedLifecycleEvent{task=mysql->hdfs, 
executorHost='10.16.10.117:1234'}
   [WI-14481][TI-0] - 2025-11-26 02:28:06.599 INFO  
[ds-workflow-eventbus-worker-2] 
o.a.d.s.m.e.t.l.h.AbstractTaskLifecycleEventHandler:[47] - Fired task 
mysql->hdfs TaskDispatchedLifecycleEvent{task=mysql->hdfs, 
executorHost='10.16.10.117:1234'} with state SUBMITTED_SUCCESS
   [WI-0][TI-0] - 2025-11-26 02:28:06.622 INFO  
[MasterRpcServer-methodInvoker-2] o.a.d.s.m.e.WorkflowEventBus:[41] - Publish 
event: TaskRunningLifecycleEvent{task=mysql->hdfs, 
logPath='/opt/datasophon/dolphinscheduler-3.3.2/worker-server/logs/20251126/18819871298950/19/14481/55958.log',
 startTime=Wed Nov 26 02:28:06 UTC 2025}
   [WI-14481][TI-0] - 2025-11-26 02:28:06.724 INFO  
[ds-workflow-eventbus-worker-16] 
o.a.d.s.m.e.t.l.h.AbstractTaskLifecycleEventHandler:[47] - Fired task 
mysql->hdfs TaskRunningLifecycleEvent{task=mysql->hdfs, 
logPath='/opt/datasophon/dolphinscheduler-3.3.2/worker-server/logs/20251126/18819871298950/19/14481/55958.log',
 startTime=Wed Nov 26 02:28:06 UTC 2025} with state DISPATCH
   [WI-0][TI-0] - 2025-11-26 02:28:06.727 INFO  
[MasterRpcServer-methodInvoker-3] o.a.d.s.m.e.WorkflowEventBus:[41] - Publish 
event: TaskRunningLifecycleEvent{task=mysql->hdfs, runtimeContext=null}
   [WI-14481][TI-0] - 2025-11-26 02:28:06.830 INFO  
[ds-workflow-eventbus-worker-14] 
o.a.d.s.m.e.t.l.h.AbstractTaskLifecycleEventHandler:[47] - Fired task 
mysql->hdfs TaskRunningLifecycleEvent{task=mysql->hdfs, runtimeContext=null} 
with state RUNNING_EXECUTION
   [WI-0][TI-0] - 2025-11-26 02:28:07.574 INFO  
[MasterRpcServer-methodInvoker-4] o.a.d.s.m.e.WorkflowEventBus:[41] - Publish 
event: TaskSuccessLifecycleEvent{task=mysql->hdfs, endTime=Wed Nov 26 02:28:07 
UTC 2025, varPool='[Property(prop=fLines, direct=OUT, type=VARCHAR, 
value=${sCount}), Property(prop=newLineColNums, direct=OUT, type=VARCHAR, 
value=)]'}
   [WI-14481][TI-0] - 2025-11-26 02:28:07.641 INFO  
[ds-workflow-eventbus-worker-15] o.a.d.s.m.e.WorkflowEventBus:[41] - Publish 
event: 
WorkflowTopologyLogicalTransitionWithTaskFinishLifecycleEvent{task=mysql->hdfstaskState=SUCCESS}
   [WI-14481][TI-0] - 2025-11-26 02:28:07.644 INFO  
[ds-workflow-eventbus-worker-15] 
o.a.d.s.m.e.t.l.h.AbstractTaskLifecycleEventHandler:[47] - Fired task 
mysql->hdfs TaskSuccessLifecycleEvent{task=mysql->hdfs, endTime=Wed Nov 26 
02:28:07 UTC 2025, varPool='[Property(prop=fLines, direct=OUT, type=VARCHAR, 
value=${sCount}), Property(prop=newLineColNums, direct=OUT, type=VARCHAR, 
value=)]'} with state RUNNING_EXECUTION
   [WI-14481][TI-0] - 2025-11-26 02:28:07.644 INFO  
[ds-workflow-eventbus-worker-15] 
o.a.d.s.m.e.w.l.h.AbstractWorkflowLifecycleEventHandler:[47] - Begin fire 
workflow 03.CUST_加载(dim)_fr_project_code_dealer-v2-20251126022806126 
LifecycleEvent[WorkflowTopologyLogicalTransitionWithTaskFinishLifecycleEvent{task=mysql->hdfstaskState=SUCCESS}]
 with state: RUNNING_EXECUTION
   [WI-14481][TI-0] - 2025-11-26 02:28:07.644 INFO  
[ds-workflow-eventbus-worker-15] o.a.d.s.m.e.WorkflowEventBus:[41] - Publish 
event: TaskStartLifecycleEvent{task=加载到临时表hive}
   [WI-14481][TI-0] - 2025-11-26 02:28:07.644 INFO  
[ds-workflow-eventbus-worker-15] 
o.a.d.s.m.e.w.l.h.AbstractWorkflowLifecycleEventHandler:[52] - Fired workflow 
03.CUST_加载(dim)_fr_project_code_dealer-v2-20251126022806126 
LifecycleEvent[WorkflowTopologyLogicalTransitionWithTaskFinishLifecycleEvent{task=mysql->hdfstaskState=SUCCESS}]
 with state: RUNNING_EXECUTION
   [WI-14481][TI-0] - 2025-11-26 02:28:07.653 INFO  
[ds-workflow-eventbus-worker-15] o.a.d.s.m.e.TaskGroupCoordinator:[363] - 
Success insert TaskGroupQueue: TaskGroupQueue(id=null, taskId=55959, 
taskName=加载到临时表hive, projectName=null, projectCode=null, 
workflowInstanceName=null, groupId=1, workflowInstanceId=14481, priority=0, 
forceStart=0, inQueue=1, status=WAIT_QUEUE, createTime=Wed Nov 26 02:28:07 UTC 
2025, updateTime=Wed Nov 26 02:28:07 UTC 2025) for TaskInstance: 加载到临时表hive
   [WI-14481][TI-0] - 2025-11-26 02:28:07.662 INFO  
[ds-workflow-eventbus-worker-15] o.a.d.s.m.e.t.s.AbstractTaskStateAction:[238] 
- Task[name=加载到临时表hive] using taskGroup, success acquire taskGroup slot
   [WI-14481][TI-0] - 2025-11-26 02:28:07.662 INFO  
[ds-workflow-eventbus-worker-15] 
o.a.d.s.m.e.t.l.h.AbstractTaskLifecycleEventHandler:[47] - Fired task 
加载到临时表hive TaskStartLifecycleEvent{task=加载到临时表hive} with state SUBMITTED_SUCCESS
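
In the excerpt above, the last entry for `加载到临时表hive` is its `TaskStartLifecycleEvent` with state `SUBMITTED_SUCCESS`; no `TaskDispatchLifecycleEvent` follows it, unlike for `mysql->hdfs`. To spot the same pattern in a longer master log, a minimal sketch like the one below may help. The line format is inferred only from this excerpt (it is not a documented DolphinScheduler format), task names are assumed to contain no spaces, and `last_task_events` and `stuck_tasks` are hypothetical helper names.

```python
import re

# Pattern inferred from the "Fired task ..." lines in the log excerpt above.
# Assumption: task names contain no whitespace.
FIRED = re.compile(
    r"Fired task (\S+) (\w+LifecycleEvent)\{.*?\} with state:? (\w+)"
)


def last_task_events(log_text):
    """Return {task_name: (last_fired_event, state_logged_with_it)}."""
    # The mail archive hard-wraps log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    last = {}
    for name, event, state in FIRED.findall(flat):
        last[name] = (event, state)  # later matches overwrite earlier ones
    return last


def stuck_tasks(log_text, stuck_state="SUBMITTED_SUCCESS"):
    """List tasks whose last fired event was logged with `stuck_state`."""
    events = last_task_events(log_text)
    return sorted(name for name, (_, state) in events.items()
                  if state == stuck_state)
```

Calling `stuck_tasks(open("master.log", encoding="utf-8").read())` (the file name is a placeholder) lists the candidates. A task that was started only moments before the log ends will also appear, so the result is a starting point for inspection rather than a diagnosis.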
   
   <img width="774" height="671" alt="Image" src="https://github.com/user-attachments/assets/b816a660-e18f-475d-81c8-1a71cd25cc03" />
   <img width="714" height="677" alt="Image" src="https://github.com/user-attachments/assets/50de7b72-e6bd-40bf-af0b-d89f7c40b107" />
   
   ### What you expected to happen
   
   The workflow should continue running instead of hanging.
   
   ### How to reproduce
   
   Start a workflow.
   
   ### Anything else
   
   Workflows often get stuck at different nodes, and they cannot be stopped either.
   
   ### Version
   
   dev
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

