GWphua opened a new pull request, #17749:
URL: https://github.com/apache/druid/pull/17749

   ### Description
   
   This PR aims to fix an issue introduced by #17738. 
   
   When task pods run in multiple namespaces, it is difficult to tell which 
cluster started which task pod, especially when the job names are very 
similar. 
    
   #### Added an alias tag to allow Kubernetes operators to identify which 
cluster the task pods are created from.
   Here are the things I considered when implementing the change:
   1. The job name must respect the 63-character limit.
   2. It should be easy to differentiate between tasks and between namespaces.
   3. The job name must be unique.
   4. Users who run everything within one namespace may still prefer the old 
way of displaying the task ID.
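
As a rough illustration of constraints 1 and 3, here is a minimal sketch of a name check. It assumes the job name has to be a valid DNS label (lowercase alphanumerics and `-`, starting and ending with an alphanumeric, at most 63 characters). The class `JobNameConstraints` and the method `isValidLabel` are hypothetical names used only for this example; they are not part of this PR.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the PR: checks a candidate job name
// against the DNS-label rules assumed above.
class JobNameConstraints {
    static final int MAX_LENGTH = 63;

    // Lowercase alphanumerics and '-', starting and ending with an alphanumeric.
    private static final Pattern DNS_LABEL =
        Pattern.compile("[a-z0-9]([-a-z0-9]*[a-z0-9])?");

    static boolean isValidLabel(String name) {
        return name != null
            && name.length() <= MAX_LENGTH
            && DNS_LABEL.matcher(name).matches();
    }
}
```

Uniqueness is not something a format check can establish; in the naming scheme described in this PR it comes from the hash segment.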
   
   Intuition:
   `k8sJobName` currently consists of two segments: 
`{taskId}-{hash(taskId)}`. The hash takes up 32 characters and is needed to 
maintain uniqueness, which leaves at most 30 characters for the prefix once the 
separator is counted. Hence, only the `taskId` part can change. My first 
thought was to replace `taskId` with the namespace, so that we at least know 
which namespace a task pod belongs to.
   
   However, I encountered cases where namespaces have long names with similar 
prefixes (think `data-infra-druid-cluster...-ns1`, 
`data-infra-druid-cluster...-ns2`). So, I introduced a new 
`druid.indexer.runner.alias` config to allow users to provide an alias instead. 
Now, `k8sJobName` will be of the form `{alias}-{hash(taskId)}`. If no alias is 
provided, we default to the current `{taskId}-{hash(taskId)}` naming 
convention.
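
The naming rule above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the class `JobNameSketch` and its methods are hypothetical, MD5 is used only as a stand-in for a hash that renders as 32 hex characters (the PR description does not name the hash function), and the sanitizing step is an assumption about how a prefix could be made label-safe.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

// Hypothetical sketch of {alias}-{hash(taskId)} with a fallback to
// {taskId}-{hash(taskId)} when no alias is configured.
class JobNameSketch {
    static final int MAX_NAME_LENGTH = 63;
    static final int HASH_LENGTH = 32;
    // 63 total - 32 hash characters - 1 separator = 30 characters for the prefix.
    static final int MAX_PREFIX_LENGTH = MAX_NAME_LENGTH - HASH_LENGTH - 1;

    // Stand-in hash (assumption): MD5 rendered as 32 lowercase hex characters.
    static String hash(String taskId) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                .digest(taskId.getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is unavailable", e);
        }
    }

    // Lowercase, replace runs of other characters with '-', trim to the prefix budget.
    static String sanitize(String raw) {
        String cleaned = raw.toLowerCase(Locale.ENGLISH)
            .replaceAll("[^a-z0-9]+", "-")
            .replaceAll("^-+", "");
        if (cleaned.length() > MAX_PREFIX_LENGTH) {
            cleaned = cleaned.substring(0, MAX_PREFIX_LENGTH);
        }
        return cleaned.replaceAll("-+$", "");
    }

    static String jobName(String alias, String taskId) {
        String prefix = sanitize(alias == null || alias.isEmpty() ? taskId : alias);
        String suffix = hash(taskId);
        return prefix.isEmpty() ? suffix : prefix + "-" + suffix;
    }
}
```

For example, `JobNameSketch.jobName("cluster-a", taskId)` yields a name that starts with `cluster-a-`, while `JobNameSketch.jobName(null, taskId)` falls back to a truncated task ID prefix. In both cases the hash is computed from the task ID, so two tasks that share an alias still get different names.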
   
   Note that with an alias set, the job name no longer shows the task ID, so 
tasks cannot be told apart at a glance. However, this is the best I could come 
up with to work around the headache of having hundreds of tasks with roughly 
the same job name.
   
   #### Release note
   You can now set a custom prefix for K8s job names using 
`druid.indexer.runner.alias`.
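
A minimal configuration fragment might look like the following. The value `cluster-a` is a hypothetical alias chosen for illustration, and placing the property in the runtime properties of the service that runs the Kubernetes task runner is an assumption:

```properties
# Hypothetical alias value; job names would then take the form cluster-a-{hash(taskId)}
druid.indexer.runner.alias=cluster-a
```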
   
   <hr>
   
   
   
   This PR has:
   
   - [x] been self-reviewed.
   - [ ] added documentation for new or modified features or behaviors.
   - [x] a release note entry in the PR description.
   - [x] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for [code 
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
 is met.
   - [x] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
