Matt Massie created SPARK-7884:
----------------------------------
Summary: Allow Spark shuffle APIs to be more customizable
Key: SPARK-7884
URL: https://issues.apache.org/jira/browse/SPARK-7884
Project: Spark
Issue Type: Improvement
Components: Spark Core
Reporter: Matt Massie
The current Spark shuffle has some hard-coded assumptions about how shuffle
managers will read and write data.
The FileShuffleBlockResolver.forMapTask method creates disk writers by calling
BlockManager.getDiskWriter. This forces all shuffle managers to store data
using DiskBlockObjectWriter, which reads and writes data in a record-oriented
layout (preventing column-oriented writing).
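
To illustrate the kind of flexibility meant here, the following is a minimal
sketch of a writer abstraction that a shuffle manager could choose for itself.
Every name in it (ShuffleWriterSketch, RowOrientedWriter, ColumnOrientedWriter,
WriterSketchDemo) is hypothetical and not part of Spark; strings stand in for
serialized bytes.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical abstraction for illustration only; not a Spark API.
trait ShuffleWriterSketch[K, V] {
  def write(key: K, value: V): Unit
  // Flushes any buffered state and returns the "serialized" layout.
  def close(): Seq[String]
}

// Record-oriented: each key/value pair is emitted together, in arrival order.
final class RowOrientedWriter[K, V] extends ShuffleWriterSketch[K, V] {
  private val out = ArrayBuffer.empty[String]
  def write(key: K, value: V): Unit = { out += s"$key=$value" }
  def close(): Seq[String] = out.toList
}

// Column-oriented: keys and values are buffered separately and emitted
// as two columns when the writer is closed.
final class ColumnOrientedWriter[K, V] extends ShuffleWriterSketch[K, V] {
  private val keys = ArrayBuffer.empty[K]
  private val values = ArrayBuffer.empty[V]
  def write(key: K, value: V): Unit = { keys += key; values += value }
  def close(): Seq[String] = List(keys.mkString(","), values.mkString(","))
}

object WriterSketchDemo {
  // The caller supplies the writer, so the on-disk layout is not fixed here.
  def writeAll(
      writer: ShuffleWriterSketch[String, Int],
      records: Seq[(String, Int)]): Seq[String] = {
    records.foreach { case (k, v) => writer.write(k, v) }
    writer.close()
  }
}
```

With a hard-coded record-oriented writer, only the first layout is possible;
the column-oriented variant needs to control when and how bytes are emitted.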
The BlockStoreShuffleFetcher.fetch method relies on
ShuffleBlockFetcherIterator, which assumes shuffle data was written using the
BlockManager.getDiskWriter method and does not allow for customization.
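
On the read side, one possible shape for this customization is a fetch routine
that takes a decoder as a parameter instead of assuming a single format. This
is only a sketch: ShuffleBlockDecoder, RowDecoder, ColumnDecoder and
FetchSketch are hypothetical names, not Spark classes, and blocks are modeled
as sequences of strings rather than byte streams.

```scala
// Hypothetical abstraction for illustration only; not a Spark API.
trait ShuffleBlockDecoder[K, V] {
  def decode(block: Seq[String]): Iterator[(K, V)]
}

// Decodes record-oriented blocks where each entry looks like "key=value".
object RowDecoder extends ShuffleBlockDecoder[String, Int] {
  def decode(block: Seq[String]): Iterator[(String, Int)] =
    block.iterator.map { rec =>
      val idx = rec.indexOf('=')
      (rec.substring(0, idx), rec.substring(idx + 1).toInt)
    }
}

// Decodes column-oriented blocks: one line of keys, one line of values.
object ColumnDecoder extends ShuffleBlockDecoder[String, Int] {
  def decode(block: Seq[String]): Iterator[(String, Int)] = block match {
    case Seq(keys, values) if keys.nonEmpty =>
      keys.split(",").iterator.zip(values.split(",").iterator.map(_.toInt))
    case _ => Iterator.empty
  }
}

object FetchSketch {
  // The decoder is chosen by the caller, so the fetch path does not need
  // to know how the map side laid out its data.
  def fetch[K, V](
      blocks: Seq[Seq[String]],
      decoder: ShuffleBlockDecoder[K, V]): Iterator[(K, V)] =
    blocks.iterator.flatMap(b => decoder.decode(b))
}
```

Both decoders yield the same records from differently laid-out blocks, which
is the property a customizable fetch path would need to preserve.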