Hi Stephan,
Thank you for the detailed explanation. As you said, my opinion is to keep the
task running during JobManager failover, even if sending the status update
fails.
For the first reason you mentioned, if I understand correctly, the key issue is
the status being out of sync between the TaskManager and
Hi!
That is a super interesting idea. If I understand you correctly, you are
suggesting to try and reconcile the TaskManagers and the JobManager before
restarting the job. That would mean that in case of a master failure, the
jobs may simply continue to run. That would be a nice enhancement, but
Hi, As far as I know, when a TaskManager sends UpdateTaskExecutionState to the
JobManager, if the JobManager fails over and the future response fails, the
task is failed. Is it feasible to retry sending UpdateTaskExecutionState
when the future response fails, until it succeeds? In JobManager HA mode, the
U
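
The retry idea in the original question could look roughly like the sketch below. It is not Flink code: `send_with_retry` and the `send` callable are hypothetical stand-ins for whatever delivers UpdateTaskExecutionState, and the bounded attempt count and doubling delay are assumptions rather than anything proposed in the thread. The point is only that a failed send leads to another attempt instead of failing the task right away.

```python
import time
from typing import Callable


def send_with_retry(
    send: Callable[[], bool],
    max_attempts: int = 5,
    initial_delay_s: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call the hypothetical `send` until it is acknowledged or attempts run out.

    `send` returns True when the master acknowledged the update, and either
    returns False or raises ConnectionError when no master could be reached.
    """
    delay = initial_delay_s
    for attempt in range(1, max_attempts + 1):
        try:
            if send():
                return True  # acknowledged, nothing more to do
        except ConnectionError:
            pass  # assumed transient, e.g. a new leader is not elected yet
        if attempt < max_attempts:
            sleep(delay)  # wait before the next attempt
            delay *= 2    # double the wait each time
    return False  # caller decides what to do, e.g. fail the task only now
```

The attempt count is bounded here so that a task does not wait forever on a master that never returns. An unbounded "retry until success" loop would be the literal reading of the question.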