[I] [Improvement][Scheduler] Rerun workflow instance should follow the specified workerGroup parameter [dolphinscheduler]

via GitHub Sun, 14 Dec 2025 00:13:12 -0800


washingxian opened a new issue, #17794:
URL: https://github.com/apache/dolphinscheduler/issues/17794


   ### Search before asking
   
   - [x] I have searched the 
[issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and 
found no similar feature request.
   
   
   ### Description
   
   The current rerun mechanism of workflow instances ignores the pre-configured 
`workerGroup` parameter, leading to random assignment of tasks to idle workers 
instead of the specified worker group. This breaks resource isolation and 
scheduling rules, making it impossible to control task execution nodes as 
expected during rerun scenarios.
   
   
   ### Issue Description
   When a workflow instance is rerun, the system does not honor the 
`workerGroup` specified in the startup parameters and instead assigns the 
tasks to any idle worker node. This violates the expected resource isolation 
and scheduling rules, and the task execution environment is no longer 
guaranteed to be the same for the first run and the rerun.
   
   ### What version of DolphinScheduler are you using?
   Version: 3.3.2
   
   ### What Operating System are you using?
   OS: Debian 12
   
   ### What happened?
   1. Create a workflow and set a specific `workerGroup` (e.g., "w1") in the 
startup parameters when running the workflow for the first time.
   2. The first run correctly executes on the nodes in the specified 
`workerGroup`.
   3. When the failed or finished workflow instance is rerun (via the "Rerun" 
button), the system ignores the `workerGroup` parameter.
   4. The rerun tasks are assigned to any idle worker node, not to the 
specified `workerGroup`.
   
   
   ### What you expected to happen?
   1. When re-running a workflow instance, the system should inherit and use 
the `workerGroup` parameter specified in the original startup parameters;
   2. The rerun task must be executed only on the nodes in the specified 
`workerGroup`, consistent with the first run;
   3. If the specified `workerGroup` has no idle nodes, the task should wait in 
the queue instead of being randomly assigned to other worker groups.
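
The expected behavior above can be sketched in a few lines of Java. This is only an illustration of the rule being requested, not DolphinScheduler code: the class `RerunWorkerGroupResolver`, its methods, the `Decision` enum, and the `"default"` fallback name are all hypothetical and exist only for this example.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of the requested rerun rule; not a DolphinScheduler API. */
final class RerunWorkerGroupResolver {

    /** Fallback group name, assumed here purely for illustration. */
    static final String DEFAULT_WORKER_GROUP = "default";

    /** Outcome of a dispatch attempt for a rerun task. */
    enum Decision { DISPATCH, WAIT_IN_QUEUE }

    /**
     * Returns the worker group a rerun should use: the group recorded in the
     * original startup parameters, or the fallback if none was recorded.
     */
    static String resolve(String originalWorkerGroup) {
        return Optional.ofNullable(originalWorkerGroup)
                .map(String::trim)
                .filter(group -> !group.isEmpty())
                .orElse(DEFAULT_WORKER_GROUP);
    }

    /**
     * Dispatches only when the resolved group has an idle node; otherwise the
     * task waits in the queue and is never moved to another group.
     */
    static Decision decide(String workerGroup, Map<String, Integer> idleNodesByGroup) {
        int idle = idleNodesByGroup.getOrDefault(workerGroup, 0);
        return idle > 0 ? Decision.DISPATCH : Decision.WAIT_IN_QUEUE;
    }
}
```

With this rule, a rerun of an instance started with `workerGroup` "w1" resolves to "w1" again, and if "w1" has no idle node the task waits even when other groups have free workers.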
   
   ### How to reproduce it (as minimally and clearly as possible)?
   1. Prepare a DolphinScheduler cluster with at least two independent worker 
groups (e.g., group A: node1/node2, group B: node3/node4);
   2. Create a simple test workflow (e.g., a shell task that prints the worker 
node name);
   3. Submit the workflow instance with startup parameter `workerGroup=group A`;
   4. Confirm the first run executes on node1/node2 (group A) by checking the 
task log;
   5. After the instance finishes/fails, click the "Rerun" button to re-execute 
the instance (without modifying any parameters);
   6. Check the task execution node: the rerun task runs on node3/node4 (group 
B) instead of group A.
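
The check in steps 4 and 6 amounts to testing whether the host named in the task log belongs to the expected group. A minimal sketch, assuming the group-to-node mapping from step 1; the class `RerunPlacementCheck` and its method are hypothetical helpers for this example, not part of DolphinScheduler.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical helper for verifying where a task ran; not a DolphinScheduler API. */
final class RerunPlacementCheck {

    /** Returns true if the given host is a member of the named worker group. */
    static boolean ranInGroup(String host, String group, Map<String, Set<String>> groupNodes) {
        return groupNodes.getOrDefault(group, Set.of()).contains(host);
    }

    public static void main(String[] args) {
        // Cluster layout taken from step 1 of the reproduction.
        Map<String, Set<String>> groupNodes = Map.of(
                "group A", Set.of("node1", "node2"),
                "group B", Set.of("node3", "node4"));

        // First run on node1 satisfies the constraint; a rerun on node3 does not.
        System.out.println(ranInGroup("node1", "group A", groupNodes));
        System.out.println(ranInGroup("node3", "group A", groupNodes));
    }
}
```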
   
   
   
   ### Are you willing to submit a PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

