pltbkd commented on pull request #16357:
URL: https://github.com/apache/flink/pull/16357#issuecomment-874688034


   > Well one needs to ask themselves why it is that the timeout is multiple 
   > times the interval.
   > 
   > If the timeout is that large because a target should truly only be 
   > considered unreachable if nothing got through during this entire period, then 
   > in any case both mechanism will work the same way (<= because users configure 
   > it that way).
   
   In fact, I saw that in `HeartbeatManagerOptions`, where it is defined via 
`defaultValue`, although that method is deprecated. Am I looking in the right place?
   
   > IOW, any RPC message would be treated like a heartbeat request, and 
   > heartbeats are just a way to ensure periodic communication.
   
   There's a counterexample: assume the heartbeat interval is 10s and the 
timeout is 5s. If the JM sends an RPC request (not a heartbeat) that takes more 
than 5s to process on the remote TM, and no response to any other RPC or 
heartbeat request is received during those 5s, the request will result in a 
heartbeat timeout, though nothing is wrong.
   The main difference is that a heartbeat request is expected to be answered 
as soon as possible, while an RPC request may take some time before it is 
answered.
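
   To make the counterexample concrete, here is a minimal sketch of the timing 
argument. It is a toy model, not Flink's actual heartbeat code: the class 
`HeartbeatTimeoutSketch`, the method `timedOut`, and the specific timestamps 
are all hypothetical and chosen only for illustration. It assumes a request is 
considered timed out when its response arrives more than the timeout after it 
was sent.

```java
// Toy model of the counterexample; names and numbers are illustrative
// assumptions, not part of Flink's heartbeat implementation.
final class HeartbeatTimeoutSketch {

    /**
     * Returns true if the response arrived more than timeoutMs
     * after the request was sent.
     */
    static boolean timedOut(long sentAtMs, long respondedAtMs, long timeoutMs) {
        return respondedAtMs - sentAtMs > timeoutMs;
    }

    public static void main(String[] args) {
        long timeoutMs = 5_000; // the 5s timeout from the example above

        // A heartbeat request is answered almost immediately (assumed 50 ms).
        boolean heartbeatTimedOut = timedOut(0, 50, timeoutMs);

        // An ordinary RPC request that needs 6s of processing on the TM.
        // If it were treated like a heartbeat request, it would trip the timeout
        // even though the TM is healthy.
        boolean slowRpcTimedOut = timedOut(1_000, 7_000, timeoutMs);

        System.out.println("heartbeat timed out: " + heartbeatTimedOut);
        System.out.println("slow RPC timed out:  " + slowRpcTimedOut);
    }
}
```

   In this model only the slow RPC exceeds the timeout, which is the false 
positive I'm worried about if every RPC message starts the heartbeat timer.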


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

