Ngone51 commented on issue #25971: [SPARK-29298][CORE] Separate block manager heartbeat endpoint from driver endpoint
URL: https://github.com/apache/spark/pull/25971#issuecomment-552453978

Actually, what I want to know is whether these two issues (the failed task/job and the lost executor) happened within a short time span rather than over a long-running duration. If they happened within a short time span, then I think this could really explain why other messages may also have timed out once the heartbeat had already timed out. Anyway, I think this PR is good enough for certain executor-lost issues.
