Hi Alexey & Smile,
The JM and RM run in the same process, so a network issue is unlikely.
Such timeouts are usually caused by one of the two endpoints not
responding in time.
Some common causes:
- The process is under severe GC pressure. You can check the GC logs for
long or very frequent pauses to confirm this.
- Insuffic
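
For the GC point, here is a minimal sketch of how one might scan a GC log for long pauses. It assumes the JVM writes unified GC logging output (JDK 9+ with -Xlog:gc); older JVMs use a different line format and would need a different pattern. The function name long_pauses and the 1000 ms threshold are only illustrative, not something taken from Flink.

```python
import re

# Assumed line shape (JDK unified logging, -Xlog:gc), for example:
# [12.345s][info][gc] GC(7) Pause Young (Normal) (G1 Evacuation Pause) 512M->128M(1024M) 35.123ms
PAUSE_RE = re.compile(
    r"GC\(\d+\) (Pause .+?) \d+[KMG]->\d+[KMG]\(\d+[KMG]\) (\d+(?:\.\d+)?)ms"
)


def long_pauses(lines, threshold_ms=1000.0):
    """Return (pause_kind, duration_ms) for every GC pause >= threshold_ms."""
    hits = []
    for line in lines:
        match = PAUSE_RE.search(line)
        if match is None:
            continue  # not a pause line (or not in the assumed format)
        duration_ms = float(match.group(2))
        if duration_ms >= threshold_ms:
            hits.append((match.group(1), duration_ms))
    return hits
```

To use it, pass the lines of the JobManager's GC log file, e.g. long_pauses(open(path)), where path is wherever your deployment writes that log. Pauses close to the configured heartbeat timeout would be a strong hint that GC is the cause.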
JM log shows this:
INFO org.apache.flink.yarn.YarnResourceManager - The heartbeat of JobManager with id 41e3ef1f248d24ddefdccd1887947106 timed out.
--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Hi Alexey,
We also have the same problem running on Yarn using Flink 1.9.0.
JM log shows this:
We are also looking for a way to troubleshoot this problem.
Best regards.
Smile
Alexey Trenikhun wrote
> Hello,
>
> I periodically see in JM log (Flink 12.2):
>
> {"ts":"2021-05-15T21:10:36.325Z",
Hello,
I periodically see in JM log (Flink 12.2):
{"ts":"2021-05-15T21:10:36.325Z","message":"The heartbeat of JobManager with id
be8225ebae1d6422b7f268c801044b05 timed
out.","logger_name":"org.apache.flink.runtime.resourcemanager.StandaloneResourceManager","thread_name":"flink-akka.actor.defau