[GitHub] [dolphinscheduler] github-actions[bot] commented on issue #6959: [Bug] [endless loop] Workergroup have only one Worker,when this worker is down。run task job is endless loop

GitBox Sun, 21 Nov 2021 23:54:14 -0800


github-actions[bot] commented on issue #6959:
URL: https://github.com/apache/dolphinscheduler/issues/6959#issuecomment-975215962

### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar issues.

### What happened

My DolphinScheduler cluster runs version 1.3.3 with two master nodes and many
worker groups. One of the worker groups contains only a single worker.

When that worker goes down, any task configured to run on its worker group
enters an endless dispatch loop. All other tasks are blocked, waiting for that
loop to end before they can execute.

### What you expected to happen

When one worker group is abnormal (for example, its only worker node is down),
the execution of tasks on other worker groups should not be affected.

### How to reproduce

1. Create two worker groups, A and B, and set up one worker for each group.
2. Create two tasks: one executed by worker group A, the other by worker group B.
3. Stop the worker service on the worker group B node, then run the group B
   task several times.
4. Run the group A task. It is blocked and cannot be executed.

Looking at the master log, you will see the error for the downed node reported
continuously.

### Anything else

dolphinscheduler-master.log:

[ERROR] 2021-11-18 14:42:24.568 org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer:[148]-dispatch error
org.apache.dolphinscheduler.server.master.dispatch.exceptions.ExecuteException: fail to execute: Command [type=TASK_EXECUTE_REQUEST, opaque=2867, bodyLen=1735] due to no suitable worker, current task need to bi worker group execute
  at org.apache.dolphinscheduler.server.master.dispatch.ExecutorDispatcher.dispatch(ExecutorDispatcher.java:87)
  at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.dispatch(TaskPriorityQueueConsumer.java:145)
  at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.run(TaskPriorityQueueConsumer.java:114)
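
To illustrate the symptom, here is a toy model of the blocking. It is not the DolphinScheduler source. It only assumes, based on the behaviour reported in this issue, that a single consumer retries the task at the head of the queue until dispatch succeeds. The class `DispatchSketch`, the `Task` type, both method names, and the `maxAttempts` cap (added so the demo terminates) are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Set;

// Toy model only: none of these names come from DolphinScheduler.
final class DispatchSketch {

    /** Hypothetical task: a name plus the worker group it must run on. */
    static final class Task {
        final String name;
        final String workerGroup;

        Task(String name, String workerGroup) {
            this.name = name;
            this.workerGroup = workerGroup;
        }
    }

    /** Retries the head task in place; later tasks wait behind it. */
    static List<String> retryInPlace(List<Task> tasks, Set<String> liveGroups, int maxAttempts) {
        Deque<Task> queue = new ArrayDeque<>(tasks);
        List<String> dispatched = new ArrayList<>();
        for (int attempt = 0; attempt < maxAttempts && !queue.isEmpty(); attempt++) {
            Task head = queue.peekFirst();
            if (liveGroups.contains(head.workerGroup)) {
                dispatched.add(queue.pollFirst().name);
            }
            // Otherwise the head stays where it is and is retried next round.
        }
        return dispatched;
    }

    /** Moves a task that cannot be dispatched to the tail so others get a turn. */
    static List<String> requeueOnFailure(List<Task> tasks, Set<String> liveGroups, int maxAttempts) {
        Deque<Task> queue = new ArrayDeque<>(tasks);
        List<String> dispatched = new ArrayList<>();
        for (int attempt = 0; attempt < maxAttempts && !queue.isEmpty(); attempt++) {
            Task head = queue.pollFirst();
            if (liveGroups.contains(head.workerGroup)) {
                dispatched.add(head.name);
            } else {
                queue.addLast(head);
            }
        }
        return dispatched;
    }

    public static void main(String[] args) {
        // Two tasks for the dead group B are queued ahead of one task for group A.
        List<Task> tasks = Arrays.asList(
                new Task("B1", "B"), new Task("B2", "B"), new Task("A1", "A"));
        Set<String> liveGroups = Collections.singleton("A");

        System.out.println(retryInPlace(tasks, liveGroups, 10));     // prints []
        System.out.println(requeueOnFailure(tasks, liveGroups, 10)); // prints [A1]
    }
}
```

With retry in place, the group A task never gets a turn, which matches step 4 above. Requeueing lets it through, although the group B tasks would still cycle through the queue, so a real fix would probably also need some back-off or a check that the worker group has live workers.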

### Version

1.3.3

### Are you willing to submit PR?

- [X] Yes, I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
