Github user JeetKunDoug commented on a diff in the pull request:
https://github.com/apache/spark/pull/21322#discussion_r188314886
--- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
@@ -384,15 +385,36 @@ private[spark] class MemoryStore(
     }
   }

+  private def maybeReleaseResources(entry: MemoryEntry[_]): Unit = {
+    entry match {
+      case SerializedMemoryEntry(buffer, _, _) => buffer.dispose()
+      case DeserializedMemoryEntry(objs: Array[Any], _, _) =>
+        maybeCloseValues(objs)
--- End diff --
Ah, ok, I see where the issue is. So in this case you may have a
deserialized instance, but the memory store is full, so the put fails. Now
we've got a live, deserialized object that isn't in the MemoryStore. Thanks for
catching this. It looks like this case could be handled in
`MemoryStore.putIteratorAsValues` when the `putIterator` call fails, which
would cover several cases in `BlockManager` where we try (and fail) to put
deserialized values. But it means checking for potential `AutoCloseable` values
any time we fail to put into `MemoryStore`, and I'm not sure what the
performance impact of that would be.
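
For what it's worth, here's a rough sketch of what that cleanup could look like
on the failure path. The `maybeCloseValues` helper and the standalone object are
just for illustration, not the actual patch:

```scala
import scala.util.control.NonFatal

// Hypothetical helper, not the PR's code as written: when a put into the
// MemoryStore fails, walk the values we still hold and close anything that
// is AutoCloseable so the live, un-stored objects don't leak resources.
object CloseOnFailedPut {
  def maybeCloseValues(values: Iterator[Any]): Unit = {
    values.foreach {
      case c: AutoCloseable =>
        // Swallow per-value errors so one bad close doesn't mask the
        // original put failure.
        try c.close() catch { case NonFatal(_) => }
      case _ => // non-closeable values need no cleanup
    }
  }
}
```

The extra cost on the failure path would just be a per-value pattern match, so
I'd only expect it to matter when a failed put carries a very large number of
values, but that's the part worth measuring.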
---