[
https://issues.apache.org/jira/browse/SPARK-21867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Apache Spark reassigned SPARK-21867:
------------------------------------
Assignee: Apache Spark
> Support async spilling in UnsafeShuffleWriter
> ---------------------------------------------
>
> Key: SPARK-21867
> URL: https://issues.apache.org/jira/browse/SPARK-21867
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 2.2.0
> Reporter: Sital Kedia
> Assignee: Apache Spark
> Priority: Minor
> Attachments: Async ShuffleExternalSorter.pdf
>
>
> Currently, Spark tasks are single-threaded, but we see that multi-threading
> parts of a task could greatly improve job performance. For example, profiling
> our map tasks, which read large amounts of data from HDFS and spill to disk,
> shows that they are blocked on HDFS reads and spilling the majority of the
> time. Since both operations are IO-intensive, average CPU consumption during
> the map phase is low. In theory, the HDFS read and the spill can run in
> parallel if we have additional memory to hold data read from HDFS while the
> previous batch is being spilled.
> Let's say we have 1G of shuffle memory available per task. Currently, a map
> task reads from HDFS and stores the records in the available memory buffer.
> Once we hit the memory limit and there is no more space to store records, we
> sort and spill the contents to disk. While we are spilling to disk, we have
> no memory available, so we cannot read from HDFS concurrently.
> Here we propose supporting async spilling for UnsafeShuffleWriter, so that we
> can keep reading from HDFS while the sort and spill happen asynchronously.
> Let's say the total 1G of shuffle memory is split into two regions, an active
> region and a spilling region, each of size 500 MB. We start by reading from
> HDFS and filling the active region. Once we hit the limit of the active
> region, we issue an asynchronous spill while flipping the active region and
> the spilling region. While the spill is happening asynchronously, we still
> have 500 MB of memory available to read data from HDFS. This way we can
> amortize the high disk/network IO cost during spilling.
> We made a prototype hack to implement this feature, and our map tasks were
> as much as 40% faster.
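
The active/spilling region flip described above can be illustrated with a small toy model. This is only a sketch in Python, not Spark code and not the prototype mentioned in the issue: the class name DoubleBufferedSpiller, the region_capacity parameter (counted in records rather than bytes), and the spill_fn callback are all hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


class DoubleBufferedSpiller:
    """Toy model of the active/spilling region scheme (hypothetical, not Spark)."""

    def __init__(self, region_capacity, spill_fn):
        self.region_capacity = region_capacity  # records per region (assumed unit)
        self.spill_fn = spill_fn                # receives one sorted batch per spill
        self.active = []                        # region currently being filled
        self.pending = None                     # Future of the in-flight spill, if any
        self.pool = ThreadPoolExecutor(max_workers=1)

    def insert(self, record):
        # When the active region is full, hand it off and keep filling a fresh one.
        if len(self.active) >= self.region_capacity:
            self._flip()
        self.active.append(record)

    def _flip(self):
        # Only two regions exist, so the previous spill must finish
        # before its region can be reused as the new active region.
        if self.pending is not None:
            self.pending.result()
        batch, self.active = self.active, []
        # Sort and spill on the background thread while the caller keeps reading.
        self.pending = self.pool.submit(lambda: self.spill_fn(sorted(batch)))

    def close(self):
        # Spill whatever is left, then wait for the last spill to complete.
        if self.active:
            self._flip()
        if self.pending is not None:
            self.pending.result()
        self.pool.shutdown()


# Usage: a list stands in for the spill files on disk.
spills = []
spiller = DoubleBufferedSpiller(region_capacity=3, spill_fn=spills.append)
for record in [5, 1, 4, 2, 9, 7, 3]:
    spiller.insert(record)
spiller.close()
print(spills)
```

In this sketch the reader blocks only if it fills the active region before the previous spill has finished, which mirrors the limit of having just two regions; a real implementation would also need to account for memory in bytes and for failures in the spill thread.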
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)