georgew5656 opened a new pull request, #14620:
URL: https://github.com/apache/druid/pull/14620
Fixes a issue where having data source names (or more generally task ids)
that are too long cause the K8sTaskRunner to throw errors.
### Description
The K8sTaskRunner truncates task ids in order to get a valid K8s job name
(less than 64 characters) and doing this can cause loss of uniquenes I fixed
this by appending a 128 bit hash onto the end of the task id which i think
should be sufficient to ensure uniqueness. (Change is in
KubernetesOverlordUtils).
I also had to fix a related issue that was covered up by the fact that our
truncation function can be applied repeatedly wihtout any affect.
The issue is the line in KubernetesPeonClient
K8sTaskId taskId = new K8sTaskId(job.getMetadata().getName());.
This takes the job name from the k8s api (which has already been truncated
by our truncation function since it is a job name), and then passes it into the
K8sTaskId constructor as the originalTaskId. The K8sTaskId calls the truncation
function a second time to generate the k8sTaskId. This is okay when our
truncation function just grabs the first 63 characters of the task id since
repeatedly applying this won't do anything, but with my latest changes this is
no longer the case.
I updated this logic to just use the job name directly from the kubernetes
api (that's all it needs anyways) and I had to change a lot of function
signatures to do this. I also explicitly changed the k8sTaskId field to
k8sJobName since that's the only thing that field is used for.
This does make it slightly more confusing to find out which task a job
corresponds to from a k8s client like k8s, but the PodTemplateTaskAdapter also
applies the whole unmodified task id as a annotation.
#### Release note
- Fixes a issue where having data source names (or more generally task ids)
that are too long cause the K8sTaskRunner to throw errors.
-->
<hr>
##### Key changed/added classes in this PR
* `KubernetesOverlordUtils`
* `KubernetesPeonClient`
* `K8sTaskId`
This PR has:
- [ X] been self-reviewed.
- [ ] using the [concurrency
checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md)
(Remove this item if the PR doesn't have any relation to concurrency.)
- [ ] added documentation for new or modified features or behaviors.
- [ ] a release note entry in the PR description.
- [ ] added Javadocs for most classes and all non-trivial methods. Linked
related entities via Javadoc links.
- [ ] added or updated version, license, or notice information in
[licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
- [ X] added comments explaining the "why" and the intent of the code
wherever would not be obvious for an unfamiliar reader.
- [X ] added unit tests or modified existing tests to cover new code paths,
ensuring the threshold for [code
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
is met.
- [ ] added integration tests.
- [ X] been tested in a test Druid cluster.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]