[ 
https://issues.apache.org/jira/browse/SPARK-21033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16211574#comment-16211574
 ] 

Cosmin Lehene commented on SPARK-21033:
---------------------------------------

[~cloud_fan] Can you update the title and description? An accurate 
description within JIRA helps people who find the issue through Google.

{quote}In UnsafeInMemorySorter, one record may take 32 bytes: 1 long for 
pointer, 1 long for key-prefix, and another 2 longs as the temporary buffer for 
radix sort.

In UnsafeExternalSorter, we set DEFAULT_NUM_ELEMENTS_FOR_SPILL_THRESHOLD to 
1024 * 1024 * 1024 / 2, hoping the max size of the pointer array would be 8 GB. 
However, this is wrong: 1024 * 1024 * 1024 / 2 * 32 is actually 16 GB, and if we 
grow the pointer array before reaching this limit, we may hit the max-page-size 
error.

This PR fixes this by halving DEFAULT_NUM_ELEMENTS_FOR_SPILL_THRESHOLD 
and adding a safety check in 
UnsafeExternalSorter.growPointerArrayIfNecessary to avoid allocating a page 
larger than the max page size.

{quote}
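
For anyone landing here from a search, the arithmetic in the quoted text can be checked with a short sketch. This is not Spark code: the function names are hypothetical, and the maximum page size of (2^31 - 1) * 8 bytes is my assumption about the limit the quote refers to, so treat that constant as illustrative.

```python
# Sketch of the sizing arithmetic from the quoted description (not Spark code).
BYTES_PER_LONG = 8
# Per record: 1 long for the pointer, 1 for the key prefix, 2 as radix-sort buffer.
LONGS_PER_RECORD = 4
BYTES_PER_RECORD = BYTES_PER_LONG * LONGS_PER_RECORD  # 32 bytes

GIB = 1024 ** 3

# ASSUMPTION: maximum page size of (2^31 - 1) * 8 bytes, just under 16 GiB.
ASSUMED_MAX_PAGE_SIZE_BYTES = (2 ** 31 - 1) * 8


def pointer_array_bytes(num_records):
    """Bytes needed by a pointer array holding num_records records."""
    return num_records * BYTES_PER_RECORD


def fits_in_one_page(num_records, max_page_bytes=ASSUMED_MAX_PAGE_SIZE_BYTES):
    """Hypothetical guard: True if the pointer array fits in a single page."""
    return pointer_array_bytes(num_records) <= max_page_bytes


old_threshold = 1024 * 1024 * 1024 // 2  # value described as wrong in the quote
new_threshold = old_threshold // 2       # halved value proposed by the PR

for name, threshold in (("old", old_threshold), ("new", new_threshold)):
    size_gib = pointer_array_bytes(threshold) / GIB
    print(name, threshold, "records ->", size_gib, "GiB,",
          "fits" if fits_in_one_page(threshold) else "exceeds max page size")
```

Under these assumptions the old threshold needs a 16 GiB array, which is slightly above the assumed page limit, while the halved threshold needs 8 GiB and fits.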

> fix the potential OOM in UnsafeExternalSorter
> ---------------------------------------------
>
>                 Key: SPARK-21033
>                 URL: https://issues.apache.org/jira/browse/SPARK-21033
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Wenchen Fan
>            Assignee: Wenchen Fan
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
