[
https://issues.apache.org/jira/browse/SPARK-7375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yin Huai resolved SPARK-7375.
-----------------------------
Resolution: Fixed
Fix Version/s: 1.4.0
Issue resolved by pull request 5948
[https://github.com/apache/spark/pull/5948]
> Avoid defensive copying in SQL exchange operator when sort-based shuffle
> buffers data in serialized form
> --------------------------------------------------------------------------------------------------------
>
> Key: SPARK-7375
> URL: https://issues.apache.org/jira/browse/SPARK-7375
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Reporter: Josh Rosen
> Assignee: Josh Rosen
> Fix For: 1.4.0
>
>
> The original sort-based shuffle buffers shuffle input records in memory while
> sorting them. This causes problems when mutable records are presented to the
> shuffle, which happens in Spark SQL's Exchange operator. To work around this
> issue, SPARK-2967 and SPARK-4479 added defensive copying of shuffle inputs in
> the Exchange operator when sort-based shuffle is enabled.
> I think that [~sandyr]'s recent patch for enabling serialization of records
> in sort-based shuffle (SPARK-4550) and my proposed {{unsafe}}-based shuffle
> path (SPARK-7081) may allow us to avoid this defensive copying in certain
> cases (since our patches cause records to be serialized one-at-a-time and
> remove the buffering of deserialized records).
> As mentioned in SPARK-4479, a long-term fix for this issue might be to add
> hooks for informing the shuffle about object (im)mutability in order to allow
> the shuffle layer to decide whether to copy. In the meantime, though, I think
> that we should just extend the checks added in SPARK-4479 to avoid copies
> when these new serialized sort paths are used.
> /cc [~rxin] [~marmbrus] [~yhuai]
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]