Igor Konev created SPARK-43978:
----------------------------------
Summary: Dropping blocks from memory to disk may result in heartbeat loss
Key: SPARK-43978
URL: https://issues.apache.org/jira/browse/SPARK-43978
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 3.3.2
Reporter: Igor Konev
Dropping blocks from memory to disk is executed inside a synchronized block on
MemoryManager (see
org.apache.spark.storage.memory.MemoryStore#evictBlocksToFreeSpace). Heartbeats
include memory metrics that are retrieved from MemoryManager using synchronized
methods. While blocks are being dropped, heartbeats cannot be sent because the
Heartbeater is blocked waiting for the same lock. If dropping blocks takes
longer than the network timeout, heartbeats are considered lost and the
executor gets killed by the driver.
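
The contention described above can be reproduced in isolation. The following
is a minimal sketch, not Spark code: ToyMemoryManager, evictBlocks,
storageMemoryUsed and measureHeartbeatDelayMillis are hypothetical names, and
the slow disk write is simulated with Thread.sleep. It only assumes what the
description states, namely that eviction and the metric getters synchronize on
the same object.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class HeartbeatBlockingSketch {

    // Hypothetical stand-in for MemoryManager: one monitor guards both
    // the eviction path and the metric getter.
    static class ToyMemoryManager {
        private long storageUsed = 1024;

        // Stand-in for a synchronized metrics getter read by the heartbeat.
        synchronized long storageMemoryUsed() {
            return storageUsed;
        }

        // Stand-in for eviction running inside a synchronized block.
        void evictBlocks(long millis, CountDownLatch lockHeld) throws InterruptedException {
            synchronized (this) {
                lockHeld.countDown();   // signal that the monitor is now held
                Thread.sleep(millis);   // simulated slow write to disk
                storageUsed = 0;
            }
        }
    }

    // Returns how long (in ms) the metric read waited for the monitor
    // while the simulated eviction was in progress.
    public static long measureHeartbeatDelayMillis(long evictionMillis) {
        ToyMemoryManager mm = new ToyMemoryManager();
        CountDownLatch lockHeld = new CountDownLatch(1);
        Thread evictor = new Thread(() -> {
            try {
                mm.evictBlocks(evictionMillis, lockHeld);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        try {
            evictor.start();
            lockHeld.await();           // eviction now owns the monitor
            long start = System.nanoTime();
            mm.storageMemoryUsed();     // the "heartbeat" blocks here
            long waited = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            evictor.join();
            return waited;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        long waited = measureHeartbeatDelayMillis(500);
        System.out.println("metric read waited about " + waited + " ms");
    }
}
```

In this sketch the metric read cannot return until the simulated eviction
releases the monitor, so its delay grows with the eviction time. If that delay
exceeded the heartbeat timeout, the driver would see no heartbeat for the
whole period.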