Github user squito commented on a diff in the pull request:
https://github.com/apache/spark/pull/21440#discussion_r203245221
--- Diff: core/src/main/scala/org/apache/spark/util/io/ChunkedByteBuffer.scala ---
@@ -166,6 +170,34 @@ private[spark] class ChunkedByteBuffer(var chunks: Array[ByteBuffer]) {
 }
+object ChunkedByteBuffer {
+  // TODO eliminate this method if we switch BlockManager to getting InputStreams
+  def fromManagedBuffer(data: ManagedBuffer, maxChunkSize: Int): ChunkedByteBuffer = {
+    data match {
+      case f: FileSegmentManagedBuffer =>
+        map(f.getFile, maxChunkSize, f.getOffset, f.getLength)
+      case other =>
+        new ChunkedByteBuffer(other.nioByteBuffer())
+    }
+  }
+
+  def map(file: File, maxChunkSize: Int, offset: Long, length: Long): ChunkedByteBuffer = {
+    Utils.tryWithResource(new FileInputStream(file).getChannel()) { channel =>
--- End diff --
I wasn't aware of that issue, thanks for sharing it; I'll update this.
Should we also update the other uses? There seem to be a lot of other cases, e.g.
`UnsafeShuffleWriter`, `DiskBlockObjectWriter`, etc.
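For context, the chunking that the `map` method in the diff is building toward can be sketched standalone: split a file region `[offset, offset + length)` into memory-mapped buffers of at most `maxChunkSize` bytes each. This is a hypothetical illustration using only `java.nio`, not Spark's actual implementation; the name `mapChunks` is invented, and a plain try/finally stands in for Spark's `Utils.tryWithResource`.

```scala
import java.io.{File, FileInputStream}
import java.nio.ByteBuffer
import java.nio.channels.FileChannel

// Hypothetical sketch: map [offset, offset + length) of `file` into
// read-only chunks of at most maxChunkSize bytes each.
def mapChunks(file: File, maxChunkSize: Int, offset: Long, length: Long): Array[ByteBuffer] = {
  val channel = new FileInputStream(file).getChannel()
  try {
    val chunks = scala.collection.mutable.ArrayBuffer.empty[ByteBuffer]
    var pos = offset
    var remaining = length
    while (remaining > 0) {
      // Each chunk is capped at maxChunkSize; the last one may be smaller.
      val size = math.min(remaining, maxChunkSize.toLong)
      chunks += channel.map(FileChannel.MapMode.READ_ONLY, pos, size)
      pos += size
      remaining -= size
    }
    chunks.toArray
  } finally {
    channel.close()
  }
}
```

Capping each mapped region avoids the 2 GB limit of a single `ByteBuffer` and is why the method takes `maxChunkSize` rather than mapping the whole segment at once.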
---