Hi,
I've set up Flink HA on AWS (3 Taskmanagers and 2 Jobmanagers, each on an
EC2 m4.large instance, with checkpointing enabled on S3). My topology works
fine, but after a few hours I see that the Taskmanagers get detached from
the Jobmanager. I tried to reach the Jobmanager using telnet at the same
time and it worked, but the Taskmanager does not succeed in connecting
again. It attaches only after I restart it. I tried the following settings,
but the problem still persists.

akka.ask.timeout: 20 s
akka.lookup.timeout: 20 s
akka.watch.heartbeat.interval: 20 s

Please find attached a snapshot from one of the Taskmanagers. Is there any
setting that I need to change?

-- 
Thanks,
Deepak Jha
