[jira] [Updated] (SPARK-3132) Avoid serialization for Array[Byte] in TorrentBroadcast
[ https://issues.apache.org/jira/browse/SPARK-3132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Rosen updated SPARK-3132:
------------------------------
    Assignee:     (was: Davies Liu)

> Avoid serialization for Array[Byte] in TorrentBroadcast
> -------------------------------------------------------
>
>                 Key: SPARK-3132
>                 URL: https://issues.apache.org/jira/browse/SPARK-3132
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Spark Core
>            Reporter: Reynold Xin
>
> If the input data is a byte array, we should allow TorrentBroadcast to skip
> serializing and compressing the input.
> To do this, we should add a new parameter (shortCircuitByteArray) to
> TorrentBroadcast, and then avoid serialization if the input is a byte array
> and shortCircuitByteArray is true.
> We should then also do compression in task serialization itself instead of
> doing it in TorrentBroadcast.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-3132) Avoid serialization for Array[Byte] in TorrentBroadcast
[ https://issues.apache.org/jira/browse/SPARK-3132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated SPARK-3132:
-----------------------------
    Component/s: Spark Core

> Avoid serialization for Array[Byte] in TorrentBroadcast
> -------------------------------------------------------
>
>                 Key: SPARK-3132
>                 URL: https://issues.apache.org/jira/browse/SPARK-3132
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Spark Core
>            Reporter: Reynold Xin
>            Assignee: Davies Liu
[jira] [Updated] (SPARK-3132) Avoid serialization for Array[Byte] in TorrentBroadcast
[ https://issues.apache.org/jira/browse/SPARK-3132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Rosen updated SPARK-3132:
------------------------------
    Assignee: Davies Liu

[~davies], I'm assigning this to you since the PySpark broadcast changes that we discussed sound very similar to this.

> Avoid serialization for Array[Byte] in TorrentBroadcast
> -------------------------------------------------------
>
>                 Key: SPARK-3132
>                 URL: https://issues.apache.org/jira/browse/SPARK-3132
>             Project: Spark
>          Issue Type: Sub-task
>            Reporter: Reynold Xin
>            Assignee: Davies Liu
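
The proposal in the issue description (a shortCircuitByteArray flag that lets byte-array input bypass serialization and compression) can be sketched roughly as follows. This is only an illustration in Python, not Spark's TorrentBroadcast code: the function names to_broadcast_payload and from_broadcast_payload are hypothetical, and pickle and zlib stand in for whatever serializer and compression codec the real implementation would use.

```python
import pickle
import zlib


def to_broadcast_payload(value, short_circuit_byte_array=False):
    """Return (payload, short_circuited) for a value about to be broadcast.

    Hypothetical helper: when the flag is set and the value is already a
    byte array, the bytes are passed through untouched. Otherwise the value
    is serialized and compressed (pickle and zlib are stand-ins).
    """
    if short_circuit_byte_array and isinstance(value, (bytes, bytearray)):
        # Input is already bytes: skip both serialization and compression.
        return bytes(value), True
    return zlib.compress(pickle.dumps(value)), False


def from_broadcast_payload(payload, short_circuited):
    """Inverse of to_broadcast_payload on the receiving side."""
    if short_circuited:
        # Short-circuited payloads are returned exactly as they were sent.
        return payload
    return pickle.loads(zlib.decompress(payload))


if __name__ == "__main__":
    # The caller (for example, task serialization) serializes and compresses
    # once, then hands the resulting bytes to the broadcast layer unchanged.
    task_bytes = zlib.compress(pickle.dumps({"stage": 1}))
    payload, short_circuited = to_broadcast_payload(
        task_bytes, short_circuit_byte_array=True
    )
    assert payload == task_bytes and short_circuited
```

The receiver needs to know whether the short circuit was taken, which is why the sketch returns a flag alongside the payload. How the real implementation would carry that information is not specified in the issue.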