[ 
https://issues.apache.org/jira/browse/FLINK-32412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17736056#comment-17736056
 ] 

Gyula Fora commented on FLINK-32412:
------------------------------------

I think this is a very good improvement. We just have to make sure not to break 
existing jobs, but since the JobId is recorded in the status, I think we are 
good.

> JobID collisions in FlinkSessionJob
> -----------------------------------
>
>                 Key: FLINK-32412
>                 URL: https://issues.apache.org/jira/browse/FLINK-32412
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.5.0
>            Reporter: Fabio Wanner
>            Priority: Major
>
> From time to time we see {{JobId}} collisions in our deployments due to the 
> low entropy of the generated {{JobId}}. The problem is that, although the 
> {{uid}} of the k8s resource is a UUID (we don't know which version), only 
> its {{hashCode}} is used for the {{JobId}}. The {{hashCode}} is an integer, 
> thus 32 bits. By the birthday problem, we can expect a collision with a 50% 
> chance after only about 77,000 random integers. 
> In reality we seem to see the problem more often, but this could be because 
> the {{uid}} might not be completely random, which increases the chance of a 
> collision if we use only parts of it.
> We propose to at least use the complete 64 bits of the upper part of the 
> {{JobId}}, where about 5.1×10^9 IDs are needed for a collision chance of 
> 50%. We could even argue that 64 bits are most probably not needed for the 
> generation number, and that another 32 bits could be spent on the uid to 
> increase the entropy of the {{JobId}} even further (this would mean the max 
> generation would be 4,294,967,295).
> Our suggestion for using 64 bits would be:
> {code:java}
> new JobID(
>     UUID.fromString(Preconditions.checkNotNull(uid)).getMostSignificantBits(),
>     Preconditions.checkNotNull(generation)
> );
> {code}
> Any thoughts on this? I would create a PR once we know how to proceed.
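
The figures quoted above can be checked with a rough sketch like the one below. It uses only the JDK, so it does not touch the Flink {{JobID}} class. The class and method names are hypothetical, and {{upperPartFromHashCode}} only assumes that the current 32-bit value is the {{hashCode}} of the uid string, which is not confirmed here.

```java
import java.util.UUID;

// Hypothetical helper, not part of Flink or the Kubernetes operator.
final class JobIdEntropySketch {

    // Approximate number of uniformly random IDs needed for a 50% collision
    // chance in a space of 2^bits values: sqrt(2 * ln(2) * 2^bits).
    static double fiftyPercentThreshold(int bits) {
        return Math.sqrt(2.0 * Math.log(2.0) * Math.pow(2.0, bits));
    }

    // Assumed shape of the 32-bit derivation: an int hash widened to a long,
    // so at most 2^32 distinct upper parts are possible.
    static long upperPartFromHashCode(String uid) {
        return uid.hashCode();
    }

    // Proposed derivation: the upper 64 bits of the parsed UUID.
    static long upperPartFromUuid(String uid) {
        return UUID.fromString(uid).getMostSignificantBits();
    }

    public static void main(String[] args) {
        // Example uid chosen for illustration only.
        String uid = "123e4567-e89b-12d3-a456-426614174000";
        System.out.printf("32 bits: about %.0f IDs%n", fiftyPercentThreshold(32));
        System.out.printf("64 bits: about %.2e IDs%n", fiftyPercentThreshold(64));
        System.out.printf("hashCode upper part: %016x%n", upperPartFromHashCode(uid));
        System.out.printf("UUID upper part:     %016x%n", upperPartFromUuid(uid));
    }
}
```

One caveat on the 64-bit variant: the upper half of a UUID contains the 4 version bits, so even for a fully random (version 4) UUID the effective entropy is slightly below 64 bits, and for time-based versions it depends on how the timestamps are distributed.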



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
