Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/20026#discussion_r162803492
--- Diff: core/src/main/scala/org/apache/spark/storage/DiskStore.scala ---
@@ -152,7 +153,7 @@ private class DiskBlockData(
file: File,
blockSize: Long) extends BlockData {
- override def toInputStream(): InputStream = new FileInputStream(file)
+ override def toInputStream(): InputStream = new NioBufferedFileInputStream(file)
--- End diff --
IIUC, network (Netty) transmission uses zero-copy `sendFile`, which goes
through a different path (`toNetty`). From the code, `toInputStream` seems to
be used only by `BlockManager`.
Also, this is not the critical path, so reducing memory copies will not improve
performance much. Besides, given the complexity of the code stack, we cannot
guarantee that there is no memory copy anywhere in the whole call chain, so I
doubt this change will yield a measurable improvement.
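
To make the distinction between the two paths concrete, here is a minimal sketch using only JDK classes. It is not Spark's code: `CopyPaths`, `readViaStream`, and `transferViaChannel` are hypothetical names chosen for illustration, and `FileChannel.transferTo` stands in for a sendfile-style transfer. Whether `transferTo` actually avoids user-space copies depends on the OS and on the target channel type.

```scala
import java.io.{BufferedInputStream, ByteArrayOutputStream, File, FileInputStream}
import java.nio.channels.{Channels, FileChannel, WritableByteChannel}
import java.nio.file.{Files, StandardOpenOption}

// Hypothetical helper object, for illustration only (not part of Spark).
object CopyPaths {
  // Stream path: every chunk is copied into a user-space byte array first.
  def readViaStream(file: File): Array[Byte] = {
    val in = new BufferedInputStream(new FileInputStream(file))
    try {
      val out = new ByteArrayOutputStream()
      val buf = new Array[Byte](8192)
      var n = in.read(buf)
      while (n != -1) {
        out.write(buf, 0, n)
        n = in.read(buf)
      }
      out.toByteArray
    } finally in.close()
  }

  // Channel path: transferTo hands the copy to the JDK/OS, which may avoid
  // intermediate user-space buffers for suitable targets (e.g. sockets).
  // Assumes a blocking target channel, so each call makes progress.
  def transferViaChannel(file: File, target: WritableByteChannel): Long = {
    val ch = FileChannel.open(file.toPath, StandardOpenOption.READ)
    try {
      val size = ch.size()
      var pos = 0L
      while (pos < size) {
        pos += ch.transferTo(pos, size - pos, target)
      }
      pos
    } finally ch.close()
  }
}
```

Swapping the stream class inside `readViaStream` changes only how the first copy is buffered; the bytes still pass through user space, which is why the gain on this path is expected to be small.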