[ https://issues.apache.org/jira/browse/FLINK-32412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated FLINK-32412:
-----------------------------------
Labels: pull-request-available (was: )
> JobID collisions in FlinkSessionJob
> -----------------------------------
>
> Key: FLINK-32412
> URL: https://issues.apache.org/jira/browse/FLINK-32412
> Project: Flink
> Issue Type: Bug
> Components: Kubernetes Operator
> Affects Versions: kubernetes-operator-1.5.0
> Reporter: Fabio Wanner
> Assignee: Fabio Wanner
> Priority: Major
> Labels: pull-request-available
>
> From time to time we see {{JobId}} collisions in our deployments due to the
> low entropy of the generated {{JobId}}. The problem is that, although the
> {{uid}} of the k8s resource is a UUID v4, only its {{hashCode}} is used for
> the {{JobId}}. The {{hashCode}} is an integer, thus 32 bits. By the birthday
> problem, we can expect a collision with a 50% chance after only about 77,000
> random integers.
> In reality we seem to see the problem more often, but this could be because
> the {{uid}} might not be completely random, which increases the chance of a
> collision if we use only parts of it.
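
The figures quoted here can be checked with the usual birthday-problem approximation n ≈ sqrt(2 · 2^b · ln(1 / (1 - p))). The following is only a sketch under the assumption that the IDs are uniformly random b-bit values; the class and method names are made up for illustration and are not part of Flink or the operator.

```java
public final class BirthdayBound {

    /**
     * Approximate number of uniformly random values of the given bit width
     * that must be drawn before a collision occurs with probability p.
     * Uses the standard approximation n ~ sqrt(2 * N * ln(1 / (1 - p))),
     * where N = 2^bits is the size of the ID space.
     */
    public static double idsForCollisionProbability(int bits, double p) {
        double space = Math.pow(2.0, bits);
        return Math.sqrt(2.0 * space * Math.log(1.0 / (1.0 - p)));
    }

    public static void main(String[] args) {
        // 32-bit hashCode versus the full 64 bits of the upper JobID part.
        System.out.printf("32 bits: %.0f IDs%n", idsForCollisionProbability(32, 0.5));
        System.out.printf("64 bits: %.2e IDs%n", idsForCollisionProbability(64, 0.5));
    }
}
```

Under that assumption the 32-bit case comes out at roughly 77,000 IDs and the 64-bit case at roughly 5.1×10^9, matching the numbers in the description. If the inputs are less than uniformly random, collisions would be expected sooner.
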
> We propose to at least use the complete 64 bits of the upper part of the
> {{JobId}}, where 5.1×10^9 IDs are needed for a collision chance of
> 50%. We could even argue that 64 bits for the generation number are most
> probably not needed and that another 32 bits could be spent on the uid to
> increase the entropy of the {{JobId}} even further (this would mean the max
> generation would be 4,294,967,295).
> Our suggestion for using 64 bits would be:
> {code:java}
> new JobID(
>     UUID.fromString(Preconditions.checkNotNull(uid)).getMostSignificantBits(),
>     Preconditions.checkNotNull(generation)
> );
> {code}
> Any thoughts on this? I would create a PR once we know how to proceed.
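
To make the difference between the two derivations concrete, here is a minimal sketch that uses only `java.util.UUID` from the JDK. It assumes the `uid` is available as a string, as the `UUID.fromString(...)` call in the proposal suggests; the class and helper names are hypothetical and the Flink `JobID` class is deliberately left out so the example runs standalone.

```java
import java.util.UUID;

public final class JobIdUpperBits {

    /**
     * Upper part as described in the report: only the 32-bit hashCode of the
     * uid string is used (widened to a long, so at most 2^32 distinct values).
     */
    public static long upperFromHashCode(String uid) {
        return uid.hashCode();
    }

    /** Upper part as proposed: the most significant 64 bits of the UUID. */
    public static long upperFromUuid(String uid) {
        return UUID.fromString(uid).getMostSignificantBits();
    }

    public static void main(String[] args) {
        // Random v4 UUID standing in for the k8s resource uid (assumption).
        String uid = UUID.randomUUID().toString();
        System.out.printf("uid:             %s%n", uid);
        System.out.printf("hashCode-based:  %016x%n", upperFromHashCode(uid));
        System.out.printf("UUID-based:      %016x%n", upperFromUuid(uid));
    }
}
```

One caveat worth keeping in mind: in a version 4 UUID, four bits of the most significant half hold the fixed version number, so that half carries at most 60 random bits rather than 64. That is still far more than the 32 bits of a `hashCode`.
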
--
This message was sent by Atlassian Jira
(v8.20.10#820010)