[ 
https://issues.apache.org/jira/browse/SPARK-7884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Massie updated SPARK-7884:
-------------------------------
    Description: 
The current Spark shuffle has some hard-coded assumptions about how shuffle 
managers will read and write data.

The BlockStoreShuffleFetcher.fetch method relies on 
ShuffleBlockFetcherIterator, which assumes shuffle data was written through the 
BlockManager.getDiskWriter method and allows no customization.
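The decoupling being proposed can be sketched as follows. This is a hypothetical illustration, not Spark's actual API: the trait and class names (BlockFetcher, CustomShuffleReader, JavaSerializationReader) are invented for the example. The point is that the fetcher hands back raw streams and the reader owns deserialization, so a shuffle manager with a custom (e.g. column-oriented) format can plug in its own decoding.

```scala
// Hypothetical sketch, not actual Spark code: moves block deserialization
// out of the fetcher and into the shuffle reader.
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream,
  ObjectInputStream, ObjectOutputStream}

// The fetcher only returns raw block streams; it makes no assumption
// about how the bytes were written.
trait BlockFetcher {
  def fetchBlocks(): Iterator[InputStream]
}

// The reader owns deserialization, so a custom ShuffleManager can
// supply its own decoding instead of inheriting the fetcher's format.
trait CustomShuffleReader[K, V] {
  def deserialize(in: InputStream): Iterator[(K, V)]
  def read(fetcher: BlockFetcher): Iterator[(K, V)] =
    fetcher.fetchBlocks().flatMap(deserialize)
}

// A trivial Java-serialization reader, used only to exercise the sketch.
class JavaSerializationReader extends CustomShuffleReader[Int, String] {
  def deserialize(in: InputStream): Iterator[(Int, String)] =
    new ObjectInputStream(in)
      .readObject().asInstanceOf[List[(Int, String)]].iterator
}

object Demo {
  def main(args: Array[String]): Unit = {
    // Write a block with plain Java serialization...
    val bytes = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(bytes)
    oos.writeObject(List(1 -> "a", 2 -> "b"))
    oos.close()
    // ...and read it back through the reader, not the fetcher.
    val fetcher = new BlockFetcher {
      def fetchBlocks() = Iterator(new ByteArrayInputStream(bytes.toByteArray))
    }
    println(new JavaSerializationReader().read(fetcher).toList)
  }
}
```

Under this shape, a column-oriented shuffle manager would implement only deserialize; the fetch path stays format-agnostic.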

  was:
The current Spark shuffle has some hard-coded assumptions about how shuffle 
managers will read and write data.

The FileShuffleBlockResolver.forMapTask method creates disk writers by calling 
BlockManager.getDiskWriter. This forces all shuffle managers to store data 
using DiskBlockObjectWriter, which reads and writes data as record-oriented 
streams (preventing column-oriented record writing).

The BlockStoreShuffleFetcher.fetch method relies on 
ShuffleBlockFetcherIterator, which assumes shuffle data was written through the 
BlockManager.getDiskWriter method and allows no customization.


> Move block deserialization from BlockStoreShuffleFetcher to ShuffleReader
> -------------------------------------------------------------------------
>
>                 Key: SPARK-7884
>                 URL: https://issues.apache.org/jira/browse/SPARK-7884
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>            Reporter: Matt Massie
>
> The current Spark shuffle has some hard-coded assumptions about how shuffle 
> managers will read and write data.
> The BlockStoreShuffleFetcher.fetch method relies on 
> ShuffleBlockFetcherIterator, which assumes shuffle data was written through the 
> BlockManager.getDiskWriter method and allows no customization.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
