Hi Vishal,
you can also keep the same cluster id when cancelling a job with a savepoint
and then resuming a new job from it. Terminating the job should clean up
all of its state in ZooKeeper.
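For reference, with the standard Flink CLI that would look roughly like the
following (savepoint directory, job id and jar name are just placeholders):

    # cancel the running job and write a savepoint on the way out
    bin/flink cancel -s hdfs:///savepoints <jobId>

    # start the new job from that savepoint, reusing the same cluster id
    bin/flink run -s hdfs:///savepoints/savepoint-xxxx new-job.jar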
Cheers,
Till
On Fri, Feb 8, 2019 at 11:26 PM Vishal Santoshi
wrote:
In one case, however, we do want to retain the same cluster id (think
ingress on k8s and thus SLAs with external touch points), but it is
essentially a new job (an incompatible change was added, but at the interface
level it retains the same contract). The only way seems to be to remove
the chroot/s
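Concretely, the manual cleanup we would have to do is something like the
following (host and path are illustrative; the znode is
high-availability.zookeeper.path.root plus high-availability.cluster-id, and
older ZooKeeper clients use rmr instead of deleteall):

    bin/zkCli.sh -server zk-host:2181 deleteall /flink/<cluster-id>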
If you keep the same cluster id, the upgraded job should pick up
checkpoints from the completed checkpoint store. However, I would recommend
taking a savepoint and resuming from that savepoint, because then you can
also specify that you allow non-restored state, for example.
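As a rough sketch with the CLI (savepoint path, job id and jar name are
placeholders):

    # trigger a savepoint for the running job
    bin/flink savepoint <jobId> hdfs:///savepoints

    # resume the upgraded job from it, tolerating state that no longer
    # maps to any operator in the new job graph
    bin/flink run -s hdfs:///savepoints/savepoint-xxxx --allowNonRestoredState upgraded-job.jar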
Cheers,
Till
On Fri, Fe
Is the rationale for using a jobID of 00* also roughly the same? As in, a
Flink job cluster is a single job and thus a single job id suffices? I am
more wondering about the case when we are making compatible changes to a
job and want to resume (given we are in HA mode and thus have a
chroot/subc
Hi Sergey,
the rationale for using a K8s Job instead of a Deployment is that a
Flink job cluster should terminate after it has successfully executed the
Flink job. This is unlike a session cluster, which should run forever and
for which a K8s Deployment would be better suited.
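A minimal sketch of the difference (manifest names are only illustrative,
loosely following the flink-container Kubernetes examples):

    # Job cluster: the JobManager pod is created by a K8s Job and completes
    # once the Flink job has finished.
    kubectl apply -f jobmanager-job.yaml

    # Session cluster: the JobManager runs under a K8s Deployment, which
    # keeps it running indefinitely.
    kubectl apply -f jobmanager-deployment.yaml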
If in your us
Hi,
my team is currently experimenting with Flink running in Kubernetes (job
cluster setup), and we found out that with the JobManager being deployed as a
"Job" we can't simply update certain values in the job's yaml, e.g.
spec.template.spec.containers.image (
https://github.com/kubernetes/kubernetes/i