Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/11095#issuecomment-181532082
This change makes sense to me. An OOM in `allocatePage` means that
user code (or some other source) is using memory that is not tracked by Spark's
memory manager, so Spark mistakenly believes that it has more available memory
for managed use than it actually does. The key idea behind this patch is that
the unaccounted-for memory use can be handled by updating the memory bookkeeping
structures after an OOM: if we OOM, we count the size of the original failed
request as `acquiredButNotUsed` and then re-request, which will cause Spark to
evict / spill pages because its own estimate of how much managed memory is
available will now be more accurate.
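
To make the bookkeeping concrete, here is a minimal toy model of the retry path described above. It is a sketch under stated assumptions, not the patch itself: `ToyMemoryManager`, `SimulatedOOM`, and the two capacity parameters are hypothetical names invented for illustration, and real spilling is reduced to dropping the oldest page.

```python
class SimulatedOOM(Exception):
    """Stands in for a JVM OutOfMemoryError in this toy model (hypothetical)."""


class ToyMemoryManager:
    """Hypothetical single-task model of the OOM bookkeeping idea."""

    def __init__(self, believed_capacity, real_capacity):
        # What the bookkeeping thinks is available for managed use.
        self.believed_capacity = believed_capacity
        # What the allocator can really provide; it is smaller when
        # untracked memory (e.g. user code) occupies part of the heap.
        self.real_capacity = real_capacity
        self.pages = []                 # sizes of live, spillable pages
        self.acquired_but_not_used = 0  # reservations written off after an OOM
        self.spilled = 0                # total bytes spilled so far

    def _granted(self):
        # Everything the bookkeeping currently counts against this task.
        return sum(self.pages) + self.acquired_but_not_used

    def _acquire(self, size):
        # Spill the oldest pages until the bookkeeping has room for `size`.
        while self._granted() + size > self.believed_capacity and self.pages:
            self.spilled += self.pages.pop(0)
        return self._granted() + size <= self.believed_capacity

    def _raw_allocate(self, size):
        # The "real" allocator fails when actual memory is exhausted.
        if sum(self.pages) + size > self.real_capacity:
            raise SimulatedOOM()
        self.pages.append(size)

    def allocate_page(self, size):
        if not self._acquire(size):
            return False
        try:
            self._raw_allocate(size)
        except SimulatedOOM:
            # Count the failed request as acquired but not used, then
            # re-request; the retry sees less free memory and spills.
            self.acquired_but_not_used += size
            return self.allocate_page(size)
        return True
```

With `ToyMemoryManager(believed_capacity=100, real_capacity=60)`, two 30-unit pages allocate normally. The third request passes the bookkeeping check but hits the simulated OOM, so 30 units are recorded as `acquired_but_not_used`; the retry then spills one existing page and succeeds.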
This strategy effectively counts the unmanaged memory as belonging to an
arbitrary task (the one that triggered the OOM) and assumes that the memory
will not be freed until that task finishes. This isn't perfectly accurate, but
I don't really see how we can do much better: we don't have any clue as to
where the memory came from, so if we didn't attribute it to an arbitrary task
then we'd have the problem of determining when to consider the arbitrary memory
to be freed. I suppose we could try to measure the difference between the JVM's
total memory usage and the sum of all tasks' managed memory in order to
estimate the amount of unmanaged memory, but that approach seems much more
complex and might not even be possible.
Therefore, this seems reasonable to me. Also, note that this shouldn't
really come into play unless `spark.memoryFraction` is inaccurate.