[ 
https://issues.apache.org/jira/browse/SPARK-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043154#comment-14043154
 ] 

Raymond Liu commented on SPARK-2275:
------------------------------------

check out https://github.com/apache/spark/pull/1209 for my general idea about 
problem 1


> More general Storage Interface for Shuffle / Spill etc.
> -------------------------------------------------------
>
>                 Key: SPARK-2275
>                 URL: https://issues.apache.org/jira/browse/SPARK-2275
>             Project: Spark
>          Issue Type: Improvement
>          Components: Block Manager, Shuffle
>            Reporter: Raymond Liu
>
> Problem 1:
> In the current design, when shuffle / spill is involved, A File based 
> interface is assumed for various classes. While this might not been true when 
> we have disk store implemented base on something else other than FileSystem ( 
> e.g. an kv/object based NVM device) . And also if we try to utilize memory or 
> off heap store for shuffle, the File interface also do not work.
> Possible approaching : 
> So my general idea here is to hide the File Interface, instead using a 
> general ObjectId to represent the object been written to external store, And 
> pass around this ObjectId to various class to access the data. 
> e.g.  
> For Write path: DiskBlockObjectWritter( could be rename to 
> FileBlockObjectWritter) now take in ObjectId instead of File and do mapping 
> to File internally
> For Read path , A InputStream Interface is suppose to be able to retrieved 
> from the ObjectId by specific store, Thus  various read operation do not need 
> to rely on the assumption that the lower level storage is a filesystem and 
> rely on File to build their own FileInputStream etc.
> In this way, the current File base diskStore could still using File to 
> implement it's internal  storage, while other solution could be easily been 
> plug in with other low level implementation and just mapping to the ObjectId 
> for other module to interact with.
> Problem 2 :
> At present, In shuffle write path, the shuffle block manager manage the 
> mapping from some blockID to a FileSegment for the benefit of consolidate 
> shuffle, this way it bypass the block store's blockId based access mode. Then 
> in the read path, when read a shuffle block data, disk store query 
> shuffleBlockManager to hack the normal blockId to file mapping in order to 
> correctly read data from file. This really rend to a lot of bi-directional 
> dependencies between modules and the code logic is some how messed up. None 
> of the shuffle block manager and blockManager/Disk Store fully control the 
> read path. They are tightly coupled in low level code modules. And it make it 
> hard to implement other shuffle manager logics. e.g. a sort based shuffle 
> which might merge all output from one map partition to a single file. This 
> will need to hack more into the diskStore/diskBlockManager etc to find out 
> the right data to be read.
> Possible approaching:
> So I think it might be better that we expose an object + offset based read 
> interface  for BlockStore, ( or at least for DiskStore), e.g.  a 
> getObjectData(objectId, offset, length)  in addition to the current blockID 
> based interface.
> Then those mapping blockId to object and offset code logic can all reside in 
> the specific shuffle manager, if they do need to merge data into one single 
> object(File here in current diskStore implementation) they take care of the 
> mapping logic in both read/write path and take the responsibility of read / 
> write shuffle data ( since they already take care of write data,  then read 
> data also go through them instead of go through blockmanager is also 
> reasonable, they can further use the object+offset based read interface for 
> actual read work )
> The BlockStore itself should just take care of read/write as required, it 
> should not involve into the data mapping logic at all. This might make the 
> interface between modules more clear and decouple each other in a more clean 
> way. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to