ruanwenjun commented on issue #16976:
URL: https://github.com/apache/dolphinscheduler/issues/16976#issuecomment-2608748984

> [@ruanwenjun](https://github.com/ruanwenjun) I tested a task configured with a 5-minute failure-retry interval: I ran the workflow, the task failed and was waiting to retry, and then I stopped the workflow. The workflow only stopped after 5 minutes.
>
> ```
> ...
>
> [WI-0][TI-0] - 2025-01-22 14:55:19.537 INFO [MasterRpcServer-methodInvoker-5] o.a.d.s.m.e.WorkflowEventBus:[41] - Publish event: TaskRunningLifecycleEvent{task=<Task-with-retry>, runtimeContext=null}
> [WI-3954361][TI-0] - 2025-01-22 14:55:19.641 INFO [ds-workflow-eventbus-worker-11] o.a.d.s.m.e.t.l.h.AbstractTaskLifecycleEventHandler:[47] - Fired task <Task-with-retry> TaskRunningLifecycleEvent{task=<Task-with-retry>, runtimeContext=null} with state RUNNING_EXECUTION
>
> [WI-0][TI-0] - 2025-01-22 14:55:20.400 INFO [MasterRpcServer-methodInvoker-12] o.a.d.s.m.e.WorkflowEventBus:[41] - Publish event: TaskFailedLifecycleEvent{task=<Task-with-retry>, endTime=Wed Jan 22 14:55:20 GMT+08:00 2025}
> [WI-3954361][TI-0] - 2025-01-22 14:55:20.445 INFO [ds-workflow-eventbus-worker-10] o.a.d.s.m.e.WorkflowEventBus:[41] - Publish event: TaskRetryLifecycleEvent{task=<Task-with-retry>, delayTime=300096/ms}
> [WI-3954361][TI-0] - 2025-01-22 14:55:20.447 INFO [ds-workflow-eventbus-worker-10] o.a.d.s.m.e.t.l.h.AbstractTaskLifecycleEventHandler:[47] - Fired task <Task-with-retry> TaskFailedLifecycleEvent{task=<Task-with-retry>, endTime=Wed Jan 22 14:55:20 GMT+08:00 2025} with state RUNNING_EXECUTION
>
> [WI-0][TI-0] - 2025-01-22 14:55:34.205 INFO [MasterRpcServer-methodInvoker-27] o.a.d.s.m.e.WorkflowEventBus:[41] - Publish event: WorkflowStopLifecycleEvent{workflow=<Workflow-with-retry-task>-20250122145518737}
>
>
> @@@@#### here was blocking WorkflowStopLifecycleEvent for 5mins ####@@@@
>
>
> [WI-3954361][TI-0] - 2025-01-22 15:00:20.577 INFO [ds-workflow-eventbus-worker-20] o.a.d.s.m.e.WorkflowEventBus:[41] - Publish event: TaskStartLifecycleEvent{task=<Task-with-retry>}
> [WI-3954361][TI-0] - 2025-01-22 15:00:20.578 INFO [ds-workflow-eventbus-worker-20] o.a.d.s.m.e.t.l.h.AbstractTaskLifecycleEventHandler:[47] - Fired task <Task-with-retry> TaskRetryLifecycleEvent{task=<Task-with-retry>, delayTime=300096/ms} with state FAILURE
> [WI-3954361][TI-0] - 2025-01-22 15:00:20.579 INFO [ds-workflow-eventbus-worker-20] o.a.d.s.m.e.w.l.h.AbstractWorkflowLifecycleEventHandler:[47] - Begin fire workflow <Workflow-with-retry-task>-20250122145518737 LifecycleEvent[WorkflowStopLifecycleEvent{workflow=<Workflow-with-retry-task>-20250122145518737}] with state: RUNNING_EXECUTION
> [WI-3954361][TI-0] - 2025-01-22 15:00:20.582 INFO [ds-workflow-eventbus-worker-20] o.a.d.s.m.e.w.s.AbstractWorkflowStateAction:[150] - Success set WorkflowExecuteRunnable: <Workflow-with-retry-task>-20250122145518737 state from: RUNNING_EXECUTION to READY_STOP
>
> ...
> ```
>
> And I just found the main reason here:
>
> [dolphinscheduler/dolphinscheduler-eventbus/src/main/java/org/apache/dolphinscheduler/eventbus/AbstractDelayEvent.java](https://github.com/apache/dolphinscheduler/blob/352b47bd8576a47f83285ecfffec589de462fac0/dolphinscheduler-eventbus/src/main/java/org/apache/dolphinscheduler/eventbus/AbstractDelayEvent.java#L62-L64)
>
> Lines 62 to 64 in [352b47b](/apache/dolphinscheduler/commit/352b47bd8576a47f83285ecfffec589de462fac0)
>
>     public int compareTo(Delayed other) {
>         return Long.compare(this.createTimeInNano, ((AbstractDelayEvent) other).createTimeInNano);
>     }
>
> `AbstractDelayEvent` uses `createTimeInNano` to compare itself with other events, so `DelayQueue` orders the events by `createTimeInNano`. Because the retry event was put in the queue first, `DelayQueue` takes the retry event first.
>
> If I change the compared value from `createTimeInNano` to `createTimeInNano + delayTime`, the retry event no longer blocks the following 0-delay events.

This is a bug 👍

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
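
The ordering problem described in the quoted comment can be reproduced outside DolphinScheduler with a plain `java.util.concurrent.DelayQueue`. The following is a minimal sketch, not the project's code: the class `DelayOrderingSketch`, the nested `Event` type, and the `compareByExpiry` switch are hypothetical names introduced for illustration, and the 300 ms delay stands in for the 5-minute retry. It also assumes the delay is held in milliseconds (as `delayTime=300096/ms` in the log suggests), so it converts the delay to nanoseconds before adding it to the creation time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

class DelayOrderingSketch {

    /** Hypothetical stand-in for a delayed event; not the project's AbstractDelayEvent. */
    static class Event implements Delayed {
        final String name;
        final long createTimeInNano = System.nanoTime();
        final long delayNanos;
        final boolean compareByExpiry;

        Event(String name, long delayMillis, boolean compareByExpiry) {
            this.name = name;
            this.delayNanos = TimeUnit.MILLISECONDS.toNanos(delayMillis);
            this.compareByExpiry = compareByExpiry;
        }

        long expireTimeInNano() {
            return createTimeInNano + delayNanos;
        }

        @Override
        public long getDelay(TimeUnit unit) {
            return unit.convert(expireTimeInNano() - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        @Override
        public int compareTo(Delayed other) {
            Event o = (Event) other;
            // false: order by creation time, as in the quoted snippet.
            // true: order by creation time plus delay, as the comment proposes.
            return compareByExpiry
                    ? Long.compare(expireTimeInNano(), o.expireTimeInNano())
                    : Long.compare(createTimeInNano, o.createTimeInNano);
        }
    }

    /** Queues a delayed "retry" event, then a 0-delay "stop" event, and returns the take() order. */
    static List<String> drainOrder(boolean compareByExpiry) {
        try {
            DelayQueue<Event> queue = new DelayQueue<>();
            queue.put(new Event("retry", 300, compareByExpiry));
            Thread.sleep(5); // make sure the two creation times differ
            queue.put(new Event("stop", 0, compareByExpiry));

            List<String> order = new ArrayList<>();
            order.add(queue.take().name); // blocks until the head of the queue has expired
            order.add(queue.take().name);
            return order;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("compare by creation time: " + drainOrder(false));
        System.out.println("compare by expiry time:   " + drainOrder(true));
    }
}
```

With creation-time ordering, the `retry` event sits at the head of the queue, so `take()` waits out its delay before the already-expired `stop` event can be returned. With expiry-time ordering, `stop` becomes the head and is returned immediately.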
