[
https://issues.apache.org/jira/browse/FLINK-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16626190#comment-16626190
]
Ufuk Celebi commented on FLINK-10292:
-------------------------------------
I understand that non-determinism may be an issue when generating the
{{JobGraph}}, but do we have some data about how common that is for
applications? Would it be possible to keep a fixed JobGraph in the image
instead of persisting one in the {{SubmittedJobGraphStore}}?
I like our current approach, because it keeps the source of truth for the job
in the image instead of the {{SubmittedJobGraphStore}}. I'm wondering about the
following scenario:
* A user creates a job cluster with high availability enabled (cluster ID for
the logical application, e.g. myapp)
** This will persist the job with a fixed ID (after FLINK-10291) on first
submission
* The user kills the application *without* cancelling
** This will leave all data in the high availability store(s) such as job
graphs or checkpoints
* The user updates the image with a modified application and keeps the high
availability configuration (e.g. cluster ID stays myapp)
** This will result in the job in the image being ignored, since we already
have a job graph with the same (fixed) ID
I think in such a scenario it can be desirable to still have the checkpoints
available, but it might be problematic if the job graph is recovered from the
{{SubmittedJobGraphStore}} instead of using the job that is part of the image.
What do you think about this scenario? Is it the responsibility of the user to
handle this? If so, I think that the approach outlined in this ticket makes
sense. If not, we may want to consider alternatives or ignore potential
non-determinism.
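A minimal sketch of the scenario above, assuming a plain in-memory map as a stand-in for the {{SubmittedJobGraphStore}} and strings as stand-ins for job graphs. The class name, the {{resolveJobGraph}} helper and the {{myapp-job}} ID are hypothetical and are not Flink APIs; the sketch only illustrates the "recover by fixed ID first, generate from the image otherwise" behaviour discussed here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical illustration only; none of these names are Flink APIs. */
public class JobGraphRecoverySketch {

    /**
     * Returns the job graph to run. If the HA store already holds an entry
     * for the fixed job ID, that entry wins. Otherwise the graph is generated
     * from the image and persisted under the fixed ID.
     */
    public static String resolveJobGraph(
            Map<String, String> haStore,
            String fixedJobId,
            Supplier<String> generateFromImage) {
        // computeIfAbsent calls the generator only when no entry exists yet.
        return haStore.computeIfAbsent(fixedJobId, id -> generateFromImage.get());
    }

    public static void main(String[] args) {
        // Stand-in for HA storage that survives a kill without cancellation.
        Map<String, String> haStore = new HashMap<>();
        String fixedJobId = "myapp-job"; // hypothetical fixed ID

        // First start: nothing stored yet, so the image's job is used and persisted.
        String firstRun = resolveJobGraph(
                haStore, fixedJobId, () -> "job graph built from image v1");

        // The application is killed without cancelling, the image is updated,
        // and the HA configuration (and therefore the fixed ID) stays the same.
        String secondRun = resolveJobGraph(
                haStore, fixedJobId, () -> "job graph built from image v2");

        System.out.println(firstRun);
        System.out.println(secondRun);
    }
}
```

Both runs print the v1 graph: the updated image is never consulted on the second start because an entry with the same fixed ID already exists in the store.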
> Generate JobGraph in StandaloneJobClusterEntrypoint only once
> -------------------------------------------------------------
>
> Key: FLINK-10292
> URL: https://issues.apache.org/jira/browse/FLINK-10292
> Project: Flink
> Issue Type: Improvement
> Components: Distributed Coordination
> Affects Versions: 1.6.0, 1.7.0
> Reporter: Till Rohrmann
> Assignee: vinoyang
> Priority: Major
> Fix For: 1.7.0, 1.6.2
>
>
> Currently the {{StandaloneJobClusterEntrypoint}} generates the {{JobGraph}}
> from the given user code every time it starts/is restarted. This can be
> problematic if the {{JobGraph}} generation has side effects. Therefore,
> it would be better to generate the {{JobGraph}} only once and store it in HA
> storage, from which it can then be retrieved.