Github user davies commented on a diff in the pull request:
https://github.com/apache/spark/pull/11095#discussion_r56693830
--- Diff: core/src/main/java/org/apache/spark/memory/TaskMemoryManager.java ---
@@ -256,7 +261,20 @@ public MemoryBlock allocatePage(long size, MemoryConsumer consumer) {
}
allocatedPages.set(pageNumber);
}
- final MemoryBlock page = memoryManager.tungstenMemoryAllocator().allocate(acquired);
+ MemoryBlock page = null;
+ try {
+ page = memoryManager.tungstenMemoryAllocator().allocate(acquired);
+ } catch (OutOfMemoryError e) {
+ logger.warn("Failed to allocate a page ({} bytes), try again.", acquired);
+ // there is no enough memory actually, it means the actual free memory is smaller than
+ // MemoryManager thought, we should keep the acquired memory.
+ acquiredButNotUsed += acquired;
+ synchronized (this) {
+ allocatedPages.clear(pageNumber);
+ }
+ // this could trigger spilling to free some pages.
+ return allocatePage(size, consumer);
--- End diff --
Since we keep holding the memory that was acquired, the amount of free
memory becomes smaller and smaller, so acquiring will fail soon.
Yes, it's better to move `acquiredButNotUsed += acquired` into the
synchronized section. If acquiredButNotUsed is not calculated correctly
(because of race conditions), you will only see a warning message at the
end of the task.
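
A minimal sketch of the suggestion above, assuming a simplified stand-in for
the manager rather than the actual TaskMemoryManager code. The class
`PageAllocatorSketch` and the fields `believedFree` and `simulatedFailures`
are hypothetical names used only for illustration. It shows the counter
update sharing a synchronized block with the page-bit reset, and the retry
stopping once the bookkeeping has no bytes left to grant:

```java
import java.util.BitSet;

/** Hypothetical stand-in for a task-level memory manager; not Spark's class. */
final class PageAllocatorSketch {
    private final BitSet allocatedPages = new BitSet();
    private long acquiredButNotUsed = 0L; // bytes reserved but never backed by a page
    private long believedFree;            // what the bookkeeping thinks is still free
    private int simulatedFailures;        // how many allocations should throw (assumption)

    PageAllocatorSketch(long believedFree, int simulatedFailures) {
        this.believedFree = believedFree;
        this.simulatedFailures = simulatedFailures;
    }

    /** Returns the page number, or -1 once no more bytes can be acquired. */
    int allocatePage(long size) {
        final int pageNumber;
        synchronized (this) {
            if (believedFree < size) {
                return -1; // acquiring fails, so the retry chain ends here
            }
            believedFree -= size;
            pageNumber = allocatedPages.nextClearBit(0);
            allocatedPages.set(pageNumber);
        }
        try {
            allocate(size);
        } catch (OutOfMemoryError e) {
            synchronized (this) {
                // Keep the reserved bytes and record them under the same lock
                // that guards the page bitmap, so concurrent callers cannot
                // interleave with the update.
                acquiredButNotUsed += size;
                allocatedPages.clear(pageNumber);
            }
            return allocatePage(size); // each retry shrinks believedFree further
        }
        return pageNumber;
    }

    /** Simulated allocator: throws for the first simulatedFailures calls. */
    private synchronized void allocate(long size) {
        if (simulatedFailures > 0) {
            simulatedFailures--;
            throw new OutOfMemoryError("simulated allocation failure");
        }
    }

    synchronized long acquiredButNotUsed() {
        return acquiredButNotUsed;
    }

    public static void main(String[] args) {
        PageAllocatorSketch sketch = new PageAllocatorSketch(100, 2);
        System.out.println("page = " + sketch.allocatePage(40));
        System.out.println("acquiredButNotUsed = " + sketch.acquiredButNotUsed());
    }
}
```

With 100 bytes believed free and two simulated failures, two 40-byte
reservations are kept as unused and the third attempt cannot acquire, so
the call returns -1 with 80 bytes counted in `acquiredButNotUsed`.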