GitHub user sitalkedia commented on the issue:
https://github.com/apache/spark/pull/15722
@davies - We fixed a similar issue with `UnsafeExternalSorter` in
SPARK-14363.
Basically, the following scenario leads to the OOM. Let's say we have a total of 4G
of memory shared across 4 tasks, so the fair share of each task
is around 1G. At some point a few tasks finish, and before the scheduler
can schedule more tasks on the executor, the currently running tasks grab all the
memory from the memory manager. In that situation the `LongArray` grows beyond
the fair share of memory for those tasks. Later, when the scheduler schedules more
tasks on this executor, the already running tasks are forced to spill, but since
they do not reset the `LongArray`, the result is an OOM.
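
To make the arithmetic concrete, here is a rough toy simulation of that scenario. It is only a sketch: `ToySorter`, `simulate`, and all the sizes other than the 4G total and the 4 tasks are made up for illustration, and none of this is Spark's actual memory manager or sorter code. It just checks whether two newly scheduled tasks can get their 1G fair share after the surviving tasks spill, with and without shrinking the pointer array on spill.

```python
GB = 1024 ** 3
TOTAL = 4 * GB  # total executor memory from the scenario above


class ToySorter:
    """Hypothetical stand-in for a sorter holding data pages plus a pointer array."""

    INITIAL_ARRAY = 1024 * 1024  # assumed initial pointer-array size (1 MB)

    def __init__(self, reset_array_on_spill):
        self.reset_array_on_spill = reset_array_on_spill
        self.pages = 0
        self.array = self.INITIAL_ARRAY

    @property
    def used(self):
        return self.pages + self.array

    def grow(self, pages, array):
        self.pages = pages
        self.array = array

    def spill(self):
        """Free the data pages; shrink the pointer array only if configured to."""
        before = self.used
        self.pages = 0
        if self.reset_array_on_spill:
            self.array = self.INITIAL_ARRAY
        return before - self.used  # bytes released


def simulate(reset_array_on_spill):
    # Two of the four tasks have finished; the two survivors grab all 4G.
    survivors = [ToySorter(reset_array_on_spill) for _ in range(2)]
    for s in survivors:
        # Assumed split: the 1.5G array alone exceeds the 1G fair share.
        s.grow(pages=GB // 2, array=3 * GB // 2)

    # Two new tasks are scheduled, so the fair share is 1G again
    # and the survivors are forced to spill.
    fair_share = TOTAL // 4
    for s in survivors:
        s.spill()

    free_for_new_tasks = TOTAL - sum(s.used for s in survivors)
    return free_for_new_tasks >= 2 * fair_share


print("new tasks fit without reset:", simulate(False))
print("new tasks fit with reset:   ", simulate(True))
```

With these assumed numbers, spilling without resetting the array leaves 3G pinned by the two old tasks, so the two new tasks cannot both get 1G; resetting the array on spill releases almost everything.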