Radeity opened a new issue, #14192:
URL: https://github.com/apache/dolphinscheduler/issues/14192

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and 
found no similar feature requirement.
   
   
   ### Description
   
   - Add label management, which lets us define one or more labels for each DS 
Worker.
   
   ### Why we need this
   - **Flexible Worker selection** 
   Currently, DS uses a `worker group` to maintain a set of Workers, and the 
user selects a `worker group` for a process instance to decide which Workers 
can execute its tasks. We could replace `worker group` with a label selector 
that defines a Worker set, much like the [Kubernetes label 
selector](https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/). 
This would let us support both equality-based and set-based requirements. With 
the label settings below, different expressions select different Worker sets 
while the labels themselves stay unchanged.
                     
     <img width="665" alt="image" 
src="https://github.com/apache/dolphinscheduler/assets/45198818/f8a05697-bccd-4cbc-a3d4-e224dba0dd9e">
   
   - **Worker-affinity scheduling** 
   There are two use cases:
        **(i)** If a task has a special hardware demand, e.g. a GPU, we expect 
it to be dispatched to a node that has GPU resources. In the above example, we 
can set the task-level label selector to `gpu Exists`; for the other tasks, 
which do not need a GPU, we can set the process-level label selector to 
`gpu DoesNotExist`. The task-level label selector should take **HIGHER** 
priority than the process-level one.
        **(ii)** In the future, if we go on to manage data dependencies between 
upstream and downstream task nodes, we can dynamically give a downstream task 
the same task-level label selector as its upstream task. This way, we can avoid 
the overhead of transferring data between nodes.
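
   To make the two points above concrete, here is a minimal Python sketch of 
how a label selector could be evaluated against Worker labels, and how a 
task-level selector could override a process-level one. It is only an 
illustration, not a proposed DS API: the names `Requirement`, `select_workers` 
and `effective_selector`, the operator names other than `Exists` and 
`DoesNotExist`, and the Worker labels are all assumptions made up for this 
example. The matching rules follow the Kubernetes semantics, where `NotEquals` 
and `NotIn` also match Workers that lack the key.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class Requirement:
    """One selector requirement (hypothetical type, not part of DS)."""
    key: str
    operator: str  # Equals, NotEquals, In, NotIn, Exists, DoesNotExist
    values: FrozenSet[str] = frozenset()

    def matches(self, labels: Dict[str, str]) -> bool:
        present = self.key in labels
        value = labels.get(self.key)
        if self.operator == "Exists":
            return present
        if self.operator == "DoesNotExist":
            return not present
        if self.operator in ("Equals", "In"):
            return present and value in self.values
        if self.operator in ("NotEquals", "NotIn"):
            # Kubernetes semantics: a missing key also satisfies these.
            return not present or value not in self.values
        raise ValueError(f"unknown operator: {self.operator}")


def select_workers(workers: Dict[str, Dict[str, str]],
                   selector: List[Requirement]) -> List[str]:
    """Return the names of Workers whose labels satisfy every requirement."""
    return sorted(
        name for name, labels in workers.items()
        if all(req.matches(labels) for req in selector)
    )


def effective_selector(task_selector: Optional[List[Requirement]],
                       process_selector: Optional[List[Requirement]]
                       ) -> Optional[List[Requirement]]:
    """Task-level selector wins over the process-level one when it is set."""
    return task_selector if task_selector is not None else process_selector


# Hypothetical Worker labels, used only for this example.
workers = {
    "worker-1": {"gpu": "a100", "zone": "east"},
    "worker-2": {"zone": "east"},
    "worker-3": {"zone": "west"},
}

process_level = [Requirement("gpu", "DoesNotExist")]
task_level = [Requirement("gpu", "Exists")]

gpu_task_workers = select_workers(
    workers, effective_selector(task_level, process_level))
other_task_workers = select_workers(
    workers, effective_selector(None, process_level))
```

   With these made-up labels, the GPU task is limited to `worker-1`, while 
tasks without their own selector fall back to the process-level selector and 
can run on `worker-2` or `worker-3`. The labels on the Workers never change; 
only the selector expression does.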
   
   ### Discussion
   - Should we support this feature only for DolphinScheduler deployed inside 
Kubernetes, or implement a Kubernetes-like label selector ourselves so that it 
can also be used in normal cluster mode?
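
   If we go with our own implementation, one piece of it would be parsing 
selector expressions such as `gpu Exists`. The sketch below shows roughly how 
small that piece could be. The expression grammar for set-based requirements 
(`key In (a, b)`), the function name `parse_requirement`, and the tuple result 
are assumptions for illustration only; nothing here is an existing DS or 
Kubernetes API.

```python
import re
from typing import Tuple

# (key, operator, values); a hypothetical representation for this sketch.
Parsed = Tuple[str, str, Tuple[str, ...]]

# Assumed grammar: "<key> Exists", "<key> DoesNotExist",
# "<key> In (v1, v2)", "<key> NotIn (v1, v2)".
_UNARY = re.compile(r"^(?P<key>[\w./-]+)\s+(?P<op>Exists|DoesNotExist)$")
_SET = re.compile(
    r"^(?P<key>[\w./-]+)\s+(?P<op>In|NotIn)\s*\((?P<values>[^()]*)\)$")


def parse_requirement(text: str) -> Parsed:
    """Parse one selector expression into (key, operator, values)."""
    text = text.strip()
    m = _UNARY.match(text)
    if m:
        return (m["key"], m["op"], ())
    m = _SET.match(text)
    if m:
        values = tuple(v.strip() for v in m["values"].split(",") if v.strip())
        if not values:
            raise ValueError(f"empty value list in: {text!r}")
        return (m["key"], m["op"], values)
    raise ValueError(f"unrecognized selector expression: {text!r}")


gpu_required = parse_requirement("gpu Exists")
zone_choice = parse_requirement("zone In (east, west)")
```

   Because the parser and the matching logic would live inside DS itself, the 
same selector could work in both Kubernetes and normal cluster deployments.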
   
   
   ### Use case
   
   Already described above.
   
   ### Related issues
   
   https://github.com/apache/dolphinscheduler/pull/14126#issuecomment-1552356663
   
   ### Are you willing to submit a PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
