[
https://issues.apache.org/jira/browse/SPARK-21033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16211574#comment-16211574
]
Cosmin Lehene commented on SPARK-21033:
---------------------------------------
[~cloud_fan] Can you update the title and description? An accurate
description within JIRA helps when finding the issue through Google.
{quote}In UnsafeInMemorySorter, one record may take 32 bytes: 1 long for
pointer, 1 long for key-prefix, and another 2 longs as the temporary buffer for
radix sort.
In UnsafeExternalSorter, we set DEFAULT_NUM_ELEMENTS_FOR_SPILL_THRESHOLD to
1024 * 1024 * 1024 / 2, hoping to cap the pointer array at 8 GB. However, this
is wrong: 1024 * 1024 * 1024 / 2 * 32 is actually 16 GB, and if we grow the
pointer array before reaching this limit, we may hit the max-page-size error.
This PR fixes this by halving DEFAULT_NUM_ELEMENTS_FOR_SPILL_THRESHOLD and
adding a safety check in UnsafeExternalSorter.growPointerArrayIfNecessary to
avoid allocating a page larger than the max page size.
{quote}
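To make the arithmetic in the quoted description easy to check, here is a minimal Java sketch. The class name, method names, and the ASSUMED_MAX_PAGE_SIZE_BYTES constant are hypothetical and are not taken from the Spark source. The cap of ((1L << 31) - 1) * 8 bytes is an assumption about the max page size, so treat the "fits" result as illustrative only.

```java
public class PointerArraySizing {
    // Per the quoted description: 1 long for the pointer, 1 long for the
    // key prefix, and 2 longs of temporary buffer for radix sort.
    static final long BYTES_PER_RECORD = 4L * Long.BYTES; // 32 bytes

    // ASSUMPTION: hypothetical stand-in for the max page size, not read from Spark.
    static final long ASSUMED_MAX_PAGE_SIZE_BYTES = ((1L << 31) - 1) * 8L;

    // Size of the pointer array for a given record count, computed in long
    // arithmetic so the multiplication cannot silently overflow.
    static long pointerArrayBytes(long numRecords) {
        return Math.multiplyExact(numRecords, BYTES_PER_RECORD);
    }

    // The kind of guard the description calls a safety check: would a pointer
    // array for this many records exceed the assumed page-size cap?
    static boolean fitsInOnePage(long numRecords) {
        return pointerArrayBytes(numRecords) <= ASSUMED_MAX_PAGE_SIZE_BYTES;
    }

    public static void main(String[] args) {
        long gib = 1L << 30;
        long oldThreshold = 1024L * 1024 * 1024 / 2; // threshold before the fix
        long newThreshold = oldThreshold / 2;        // halved threshold

        System.out.println("old: " + pointerArrayBytes(oldThreshold) / gib
                + " GiB, fits=" + fitsInOnePage(oldThreshold));
        System.out.println("new: " + pointerArrayBytes(newThreshold) / gib
                + " GiB, fits=" + fitsInOnePage(newThreshold));
    }
}
```

Under these assumptions, the old threshold gives a 16 GiB pointer array, which does not fit under the assumed cap. The halved threshold gives 8 GiB, which does.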
> fix the potential OOM in UnsafeExternalSorter
> ---------------------------------------------
>
> Key: SPARK-21033
> URL: https://issues.apache.org/jira/browse/SPARK-21033
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.0
> Reporter: Wenchen Fan
> Assignee: Wenchen Fan
>
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)