[GitHub] [dolphinscheduler] Radeity opened a new issue, #14192: [Feature][Master] Support label management for DS Worker

via GitHub Wed, 24 May 2023 00:53:18 -0700


Radeity opened a new issue, #14192:
URL: https://github.com/apache/dolphinscheduler/issues/14192

### Search before asking

- [X] I have searched the
[issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and
found no similar feature requirement.

### Description

- Add label management, so that one or more labels can be defined for each DS
Worker.

### Why we need this
- **Flexible Worker selection**
Currently, DS uses a `worker group` to maintain a set of Workers, and the user
selects a `worker group` for a process instance to decide which Workers can
execute its tasks. We can replace `worker group` with a label selector that
determines the Worker set, similar to the [Kubernetes label
selector](https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/).
This would also let us support both equality-based and set-based
requirements. With a fixed set of Worker labels, different selector
expressions can represent different Worker sets while the labels stay unchanged.

- **Worker-affinity scheduling**
There are two use cases:
**(i)** If a task has a special hardware demand, e.g. a GPU, we expect it to
be dispatched to a node that has a GPU. For example, if GPU Workers carry a
`gpu` label, we can set the task-level label selector to `gpu Exists`, and for
other tasks that do not need a GPU, we can set the process-level label selector
to `gpu DoesNotExist`. The task-level label selector should have **HIGHER**
priority than the process-level one.
**(ii)** In the future, if we also manage data dependencies between upstream
and downstream task nodes, we can dynamically give a downstream task the same
task-level label selector as its upstream task. In this way, we can avoid the
overhead of transferring data between Workers.
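
To make the proposal above more concrete, here is a minimal sketch of a Kubernetes-like label selector and of the task-over-process priority rule. It is only an illustration under assumptions: `Requirement`, `select_workers`, `effective_selector`, and the Worker names and labels are all hypothetical and are not part of DolphinScheduler. The operators follow the Kubernetes semantics, where `!=` and `NotIn` also match objects that lack the key.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class Requirement:
    """One selector term (hypothetical type, not a DolphinScheduler class)."""
    key: str
    operator: str  # "=", "!=", "In", "NotIn", "Exists", "DoesNotExist"
    values: FrozenSet[str] = frozenset()

    def matches(self, labels: Dict[str, str]) -> bool:
        present = self.key in labels
        value = labels.get(self.key)
        if self.operator == "Exists":
            return present
        if self.operator == "DoesNotExist":
            return not present
        if self.operator in ("=", "In"):
            return present and value in self.values
        if self.operator in ("!=", "NotIn"):
            # As in Kubernetes, a Worker without the key also satisfies these.
            return not present or value not in self.values
        raise ValueError(f"unknown operator: {self.operator}")


Selector = List[Requirement]


def select_workers(workers: Dict[str, Dict[str, str]], selector: Selector) -> List[str]:
    """Return the names of Workers whose labels satisfy every requirement."""
    return sorted(
        name
        for name, labels in workers.items()
        if all(req.matches(labels) for req in selector)
    )


def effective_selector(process_selector: Selector,
                       task_selector: Optional[Selector] = None) -> Selector:
    """A task-level selector, when set, overrides the process-level one."""
    return task_selector if task_selector is not None else process_selector


# Hypothetical Worker labels, assumed only for this example.
workers = {
    "worker-1": {"gpu": "a100", "zone": "east"},
    "worker-2": {"zone": "east"},
    "worker-3": {"zone": "west"},
}

gpu_exists = [Requirement("gpu", "Exists")]       # task-level selector
no_gpu = [Requirement("gpu", "DoesNotExist")]     # process-level selector

# A GPU task inside a process whose default is "no GPU" still lands on a GPU Worker.
gpu_task_workers = select_workers(workers, effective_selector(no_gpu, gpu_exists))
# A task without its own selector falls back to the process-level one.
other_task_workers = select_workers(workers, effective_selector(no_gpu))
```

In this sketch the labels on the Workers never change; only the selector expression does, which is the flexibility the first bullet describes.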

### Discussion
- Should this feature be supported only for DolphinScheduler deployed inside
Kubernetes, or should we implement a Kubernetes-like label selector ourselves
so that it can also be used in normal cluster mode?

### Use case

Already described above.

### Related issues

https://github.com/apache/dolphinscheduler/pull/14126#issuecomment-1552356663

### Are you willing to submit a PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of
Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
