bullyb1911 commented on issue #13636:
URL: 
https://github.com/apache/dolphinscheduler/issues/13636#issuecomment-1710219178

   Experienced this issue in a pseudo-cluster configuration. When all servers
were stopped and then restarted minutes later using the packaged start/stop
scripts, the server came back up and created scheduled workflow instances even
though the jobs that were pending while the server was offline were no longer
executing because they were in a Recover Serial wait state. The scheduler
continues to schedule new executions while the pending job remains in a
Recover Serial wait state. The jobs in that state terminate only when they
eventually time out. There are roughly 400 scheduled jobs, and each job has its
timeout set to the default. Cron Manage is offline to prevent new scheduled
executions. Users have to wait for the jobs to time out before they can run the
workflow. As a workaround, I copied the workflow config into a temporary
project to get the required tasks to run. Changing the workflow config to
parallel execution also avoids this, if the workflow can support that setting.
This is version 3.1.7.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
