[GitHub] attilapiros opened a new pull request #23688: [SPARK-25035][Core] Avoiding memory mapping at disk-stored blocks replication

GitBox Tue, 29 Jan 2019 13:13:04 -0800

attilapiros opened a new pull request #23688: [SPARK-25035][Core] Avoiding 
memory mapping at disk-stored blocks replication
URL: https://github.com/apache/spark/pull/23688
 
 
   ## What changes were proposed in this pull request?
   
   Before this PR, `BlockManager#putBlockDataAsStream()` read the content of 
the file received via streaming into memory during block replication, even 
when the storage level was DISK_ONLY. 
   
   With this change, the received file, which is stored as a temporary file, 
is moved directly to the location backing the block instead of being read 
into memory.
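
   The idea of relocating the received temporary file rather than reading it 
can be sketched with plain `java.nio`. This is only an illustration, not the 
code in this PR: `MoveTempFileSketch`, `moveIntoPlace` and the paths used are 
hypothetical names.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path, StandardCopyOption}

object MoveTempFileSketch {
  // Hypothetical helper: relocates the received temp file to the file backing
  // the block. The content is never loaded into memory by this code.
  def moveIntoPlace(tmpFile: Path, blockFile: Path): Path = {
    Files.createDirectories(blockFile.getParent)
    Files.move(tmpFile, blockFile, StandardCopyOption.REPLACE_EXISTING)
  }

  def main(args: Array[String]): Unit = {
    val dir = Files.createTempDirectory("block-sketch")
    // Stand-in for the file that was received via streaming.
    val tmp = Files.createTempFile(dir, "received", ".tmp")
    Files.write(tmp, "block bytes".getBytes(StandardCharsets.UTF_8))

    val target = dir.resolve("blocks").resolve("rdd_0_0")
    moveIntoPlace(tmp, target)
    println(s"block file exists: ${Files.exists(target)}")
  }
}
```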
   
   To avoid code duplication, `doPutBytes` is refactored into a template 
method class called `BlockStoreUpdater`, which has separate implementations 
for byte buffer based and temporary file based updates.
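
   A minimal sketch of the template method idea follows. It is not the 
`BlockStoreUpdater` from this PR: the class names, the `useDisk` flag and the 
map standing in for the memory store are assumptions made for illustration.

```scala
import java.nio.ByteBuffer
import java.nio.file.{Files, Path, StandardCopyOption}
import scala.collection.mutable

// Simplified stand-in: the shared flow lives in `save`, while subclasses
// supply only the steps that depend on where the data currently resides.
abstract class BlockStoreUpdaterSketch(useDisk: Boolean) {
  protected def readToMemory(): Array[Byte]
  protected def saveToDisk(target: Path): Unit

  // Template method: one code path for both kinds of input.
  final def save(target: Path, memoryStore: mutable.Map[String, Array[Byte]]): Unit = {
    if (useDisk) saveToDisk(target)
    else memoryStore(target.getFileName.toString) = readToMemory()
  }
}

// Hypothetical implementation for data that is already in a byte buffer.
class ByteBufferUpdaterSketch(bytes: ByteBuffer, useDisk: Boolean)
    extends BlockStoreUpdaterSketch(useDisk) {
  override protected def readToMemory(): Array[Byte] = {
    val dup = bytes.duplicate() // leave the caller's buffer position untouched
    val arr = new Array[Byte](dup.remaining())
    dup.get(arr)
    arr
  }
  override protected def saveToDisk(target: Path): Unit = {
    Files.write(target, readToMemory())
    ()
  }
}

// Hypothetical implementation for data received as a temporary file.
class TempFileUpdaterSketch(tmpFile: Path, useDisk: Boolean)
    extends BlockStoreUpdaterSketch(useDisk) {
  override protected def readToMemory(): Array[Byte] = Files.readAllBytes(tmpFile)
  override protected def saveToDisk(target: Path): Unit = {
    // Disk-only case: move the file, never read it into memory.
    Files.move(tmpFile, target, StandardCopyOption.REPLACE_EXISTING)
    ()
  }
}
```

   In this sketch the file based updater touches memory only when the block 
must be kept in memory; for a disk-only block it just moves the file.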
   
   ## How was this patch tested?
   
   With existing unit tests from `DistributedSuite`:
   - caching on disk, replicated (encryption = off) (with replication as stream)
   - caching on disk, replicated (encryption = on) (with replication as stream)
   - caching in memory, serialized, replicated (encryption = on) (with 
replication as stream)
   - caching in memory, serialized, replicated (encryption = off) (with 
replication as stream)
   - etc.
   
   And with new unit tests testing `putBlockDataAsStream` directly:
   - test putBlockDataAsStream with caching (encryption = off) 
   - test putBlockDataAsStream with caching (encryption = on) 
   - test putBlockDataAsStream with caching on disk (encryption = off) 
   - test putBlockDataAsStream with caching on disk (encryption = on)

