xuechendi commented on issue #24322: [SPARK-27412][Shuffle]Add a new shuffle 
manager: PmemShuffleManager
URL: https://github.com/apache/spark/pull/24322#issuecomment-481240897
 
 
   @attilapiros, thanks for the comment. I am trying to propose this idea to 
Spark, and I have also filed a JIRA here: 
https://issues.apache.org/jira/browse/SPARK-27412
   I opened this PR so that people who are interested can not only read 
the design doc but also check the code.
   
   And for your questions:
   1. My ultimate goal is to add an abstraction layer above 
DiskBlockObjectWriter, let's call it 'BlockObjectStream', that hides the file 
channel behind an input stream and an output stream. That way, people who want 
to implement a new storage backend for shuffle and the external sorter only 
need to derive from 'BlockObjectStream' and implement an input stream and an 
output stream for it, as PmemBlockObjectWriter does in my current code. So the 
code I have submitted so far is more of a proposal.
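
   The layering described in item 1 could look roughly like the sketch below. 
This is only an illustration, not code from the PR: 'BlockObjectStream' is the 
name proposed above, but its methods (openOutputStream, openInputStream), the 
InMemoryBlockObjectStream stand-in backend, and the roundTrip helper are all 
hypothetical names chosen for the example. A pmem backend would subclass the 
same abstract type and return streams backed by persistent memory instead of 
a heap buffer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class BlockObjectStreamSketch {

    // Hypothetical abstraction: a block is exposed only as a pair of streams,
    // so callers never touch a file channel directly.
    abstract static class BlockObjectStream {
        abstract OutputStream openOutputStream() throws IOException;
        abstract InputStream openInputStream() throws IOException;
    }

    // Stand-in backend (assumption for this example) that keeps the block on the heap.
    static final class InMemoryBlockObjectStream extends BlockObjectStream {
        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

        @Override
        OutputStream openOutputStream() {
            return buffer;
        }

        @Override
        InputStream openInputStream() {
            return new ByteArrayInputStream(buffer.toByteArray());
        }
    }

    // Writer and reader code depends only on the abstract type, not on the backend.
    static byte[] roundTrip(BlockObjectStream block, byte[] payload) {
        try {
            try (OutputStream out = block.openOutputStream()) {
                out.write(payload);
            }
            try (InputStream in = block.openInputStream()) {
                return in.readAllBytes(); // requires Java 9 or later
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] payload = "shuffle-record".getBytes(StandardCharsets.UTF_8);
        byte[] back = roundTrip(new InMemoryBlockObjectStream(), payload);
        System.out.println(new String(back, StandardCharsets.UTF_8));
    }
}
```

   Swapping the storage backend then means passing a different subclass to 
roundTrip; the calling code does not change.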
   
   2. For the C++ code, thanks for the suggestion; I will make it a separate 
artifact later.
   
   3. If you want to give it a try, add the configuration below and run with 
Persistent Memory or emulation 
(https://pmem.io/2016/02/22/pm-emulation.html).
   I will also add some shuffle manager tests to cover the PmemShuffle path; 
thanks for the suggestion.
   
   Key | Value | Description
   -- | -- | --
   spark.shuffle.manager | org.apache.spark.shuffle.pmem.PmemShuffleManager | Enable PmemShuffleManager
   spark.shuffle.spill.pmem.MemoryThreshold | 16777216 | When inMemoryData exceeds 16MB, spill to pmem
   spark.shuffle.pmem.pmem_list | /dev/dax0.0,/dev/dax1.0 | List of pmem devices
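
   As a rough illustration of how the settings in the table above might be 
passed to a job, the sketch below collects them in a map and renders them as 
spark-submit `--conf key=value` arguments. The keys and example values come 
from the table; the class and method names (PmemShuffleConfSketch, 
pmemShuffleConf, toSubmitArgs) are made up for this example, and the settings 
only have an effect on a Spark build that includes this PR.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PmemShuffleConfSketch {

    // Settings copied from the table above; the values are the examples given there.
    static Map<String, String> pmemShuffleConf() {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("spark.shuffle.manager",
                 "org.apache.spark.shuffle.pmem.PmemShuffleManager");
        // 16 MB spill threshold, expressed in bytes (16 * 1024 * 1024 = 16777216).
        conf.put("spark.shuffle.spill.pmem.MemoryThreshold",
                 String.valueOf(16L * 1024 * 1024));
        conf.put("spark.shuffle.pmem.pmem_list", "/dev/dax0.0,/dev/dax1.0");
        return conf;
    }

    // Render each entry as a "--conf key=value" pair for spark-submit.
    static List<String> toSubmitArgs(Map<String, String> conf) {
        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            args.add("--conf");
            args.add(e.getKey() + "=" + e.getValue());
        }
        return args;
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", toSubmitArgs(pmemShuffleConf())));
    }
}
```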
   
