[jira] [Commented] (SPARK-7542) Support off-heap sort buffer in UnsafeExternalSorter

    [ https://issues.apache.org/jira/browse/SPARK-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14990761#comment-14990761 ]

Apache Spark commented on SPARK-7542:
-------------------------------------

User 'davies' has created a pull request for this issue:
https://github.com/apache/spark/pull/9477

> Support off-heap sort buffer in UnsafeExternalSorter
> ----------------------------------------------------
>
>                 Key: SPARK-7542
>                 URL: https://issues.apache.org/jira/browse/SPARK-7542
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>    Affects Versions: 1.4.0
>            Reporter: Josh Rosen
>            Assignee: Davies Liu
>
> {{UnsafeExternalSorter}}, introduced in SPARK-7081, uses on-heap {{long[]}}
> arrays as its sort buffers. When records are small, the sorting array might
> be as large as the data pages, so it would be useful to be able to allocate
> this array off-heap (using our unsafe LongArray). Unfortunately, we can't
> currently do this because TimSort calls {{allocate()}} to create data buffers
> but doesn't call any corresponding cleanup methods to free them.
> We should look into extending TimSort with buffer freeing methods, then
> consider switching to LongArray in UnsafeShuffleSortDataFormat.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Commented] (SPARK-7542) Support off-heap sort buffer in UnsafeExternalSorter

    [ https://issues.apache.org/jira/browse/SPARK-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14680140#comment-14680140 ]

Apache Spark commented on SPARK-7542:
-------------------------------------

User 'viirya' has created a pull request for this issue:
https://github.com/apache/spark/pull/8068

> Support off-heap sort buffer in UnsafeExternalSorter
> ----------------------------------------------------
>
>                 Key: SPARK-7542
>                 URL: https://issues.apache.org/jira/browse/SPARK-7542
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>    Affects Versions: 1.4.0
>            Reporter: Josh Rosen
>
> {{UnsafeExternalSorter}}, introduced in SPARK-7081, uses on-heap {{long[]}}
> arrays as its sort buffers. When records are small, the sorting array might
> be as large as the data pages, so it would be useful to be able to allocate
> this array off-heap (using our unsafe LongArray). Unfortunately, we can't
> currently do this because TimSort calls {{allocate()}} to create data buffers
> but doesn't call any corresponding cleanup methods to free them.
> We should look into extending TimSort with buffer freeing methods, then
> consider switching to LongArray in UnsafeShuffleSortDataFormat.
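
For readers following the issue description, here is a rough Java sketch of
what "extending the sort with buffer freeing methods" could look like. It is
not Spark's code and not what either pull request implements: BufferFormat,
TrackingFormat and the plain merge sort are hypothetical stand-ins for the
sort data format and TimSort, and the buffers are ordinary on-heap long[]
arrays. The only point is the pairing of allocate() with free(), so that a
format backed by off-heap memory would get a chance to release its temporary
buffer.

```java
import java.util.Arrays;

final class BufferFreeingSketch {

    /** Hypothetical stand-in for a sort data format; NOT Spark's actual interface. */
    public interface BufferFormat {
        long[] allocate(int length);   // the sort asks for a temporary buffer
        void free(long[] buffer);      // assumed addition: the sort hands it back
    }

    /** Counts allocations and frees so that a leaked buffer is observable. */
    public static final class TrackingFormat implements BufferFormat {
        private int allocated = 0;
        private int freed = 0;

        @Override public long[] allocate(int length) { allocated++; return new long[length]; }
        @Override public void free(long[] buffer) { freed++; }

        public int allocations() { return allocated; }
        public int outstanding() { return allocated - freed; }
    }

    /** Sorts in place, borrowing one temporary buffer and always returning it. */
    public static void sort(long[] data, BufferFormat format) {
        if (data.length < 2) {
            return;
        }
        long[] tmp = format.allocate(data.length);
        try {
            mergeSort(data, tmp, 0, data.length);
        } finally {
            format.free(tmp);  // runs even if the sort throws
        }
    }

    // Top-down merge sort over the half-open range [lo, hi); a simple stand-in for TimSort.
    private static void mergeSort(long[] a, long[] tmp, int lo, int hi) {
        if (hi - lo < 2) {
            return;
        }
        int mid = (lo + hi) >>> 1;
        mergeSort(a, tmp, lo, mid);
        mergeSort(a, tmp, mid, hi);
        System.arraycopy(a, lo, tmp, lo, hi - lo);
        int i = lo, j = mid, k = lo;
        while (i < mid && j < hi) {
            a[k++] = tmp[i] <= tmp[j] ? tmp[i++] : tmp[j++];
        }
        while (i < mid) {
            a[k++] = tmp[i++];
        }
        while (j < hi) {
            a[k++] = tmp[j++];
        }
    }

    public static void main(String[] args) {
        TrackingFormat format = new TrackingFormat();
        long[] data = {5L, 3L, 9L, 1L, 3L};
        sort(data, format);
        System.out.println(Arrays.toString(data) + " outstanding=" + format.outstanding());
    }
}
```

The try/finally is the design choice that matters here: once the sort owns
the lifetime of its temporary buffer, the format underneath is free to
allocate that buffer wherever it likes.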