[ 
https://issues.apache.org/jira/browse/FLINK-17295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17253383#comment-17253383
 ] 

Till Rohrmann commented on FLINK-17295:
---------------------------------------

Thanks for restarting this discussion [~karmagyz]. I agree that introducing a 
random element looks like the most reliable solution, in particular since we 
ran into several problems with using a non-random id.

The increase in TDD (TaskDeploymentDescriptor) size is not nice, though. Do you 
have numbers on how much slower the deployment of the WordCount job has become? 
Maybe it is not as bad as it looks, because other concurrent processes such as 
the blob download slow down the deployment anyway.

If the TDD size becomes a serious problem, then we might also be able to 
serialize the content in a more sophisticated way. For example, we could 
collect all {{ExecutionAttemptIDs}}, serialize each of them only once per TDD, 
and replace every occurrence with a monotonically increasing number that serves 
as a lookup key. Alternatively, we might rethink whether the individual ids 
(ExecutionVertexID and random part for example) really need to be 128 bits long.
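
To make the first idea a bit more concrete, here is a rough sketch of what I 
mean by serializing each id once and referring to it by number. It is not 
based on the actual TDD serialization code: {{IdTableSketch}}, {{Encoded}}, 
{{encode}} and {{decode}} are made-up names, and {{java.util.UUID}} merely 
stands in for a 128 bit id.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Illustrative only: not Flink code, all names here are hypothetical. */
final class IdTableSketch {

    /** Each distinct id stored once, plus one small index per reference. */
    static final class Encoded {
        final List<UUID> table = new ArrayList<>();
        final List<Integer> refs = new ArrayList<>();
    }

    /** Replaces every id by its position in a table of distinct ids. */
    static Encoded encode(List<UUID> ids) {
        Encoded out = new Encoded();
        Map<UUID, Integer> indexOf = new HashMap<>();
        for (UUID id : ids) {
            Integer idx = indexOf.get(id);
            if (idx == null) {
                // First occurrence: assign the next monotonically increasing number.
                idx = out.table.size();
                indexOf.put(id, idx);
                out.table.add(id);
            }
            out.refs.add(idx);
        }
        return out;
    }

    /** Restores the original sequence of ids by looking up each index. */
    static List<UUID> decode(Encoded encoded) {
        List<UUID> ids = new ArrayList<>(encoded.refs.size());
        for (int idx : encoded.refs) {
            ids.add(encoded.table.get(idx));
        }
        return ids;
    }

    public static void main(String[] args) {
        UUID a = UUID.randomUUID();
        UUID b = UUID.randomUUID();
        Encoded e = encode(Arrays.asList(a, b, a, a));
        System.out.println(e.table.size() + " distinct ids, refs = " + e.refs);
    }
}
```

Whether this pays off depends on how often the same id is actually repeated 
within one TDD, which is something we would need to measure first.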

> Refactor the ExecutionAttemptID to consist of ExecutionVertexID and 
> attemptNumber
> ---------------------------------------------------------------------------------
>
>                 Key: FLINK-17295
>                 URL: https://issues.apache.org/jira/browse/FLINK-17295
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Coordination
>            Reporter: Yangze Guo
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.13.0
>
>
> Make the ExecutionAttemptID be composed of (ExecutionVertexID, 
> attemptNumber).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
