Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/2969#discussion_r19454860
--- Diff: core/src/main/scala/org/apache/spark/storage/TachyonStore.scala ---
@@ -105,25 +106,17 @@ private[spark] class TachyonStore(
return None
}
val is = file.getInStream(ReadType.CACHE)
- var buffer: ByteBuffer = null
+ assert (is != null)
try {
- if (is != null) {
- val size = file.length
- val bs = new Array[Byte](size.asInstanceOf[Int])
- val fetchSize = is.read(bs, 0, size.asInstanceOf[Int])
- buffer = ByteBuffer.wrap(bs)
- if (fetchSize != size) {
- logWarning(s"Failed to fetch the block $blockId from Tachyon:
Size $size " +
- s"is not equal to fetched size $fetchSize")
- return None
- }
- }
+ val size = file.length
+ val bs = new Array[Byte](size.asInstanceOf[Int])
+ ByteStreams.readFully(is, bs)
+ Some(ByteBuffer.wrap(bs))
} catch {
--- End diff --
I think that catching OutOfMemoryError is inconsistent with the rest of the
code base. The only places where we catch it are for the purposes of logging
more information about why an executor died.
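(For context, a minimal self-contained sketch of that log-and-rethrow pattern; the object name and the SLF4J logger are my own assumptions for illustration, not actual Spark code:)

    import org.slf4j.LoggerFactory

    object OomLoggingSketch {
      private val log = LoggerFactory.getLogger(getClass)

      // Illustrative only: catch OutOfMemoryError just to record why the
      // executor is dying, then rethrow so the JVM still fails as before.
      def runLogged(body: => Unit): Unit = {
        try {
          body
        } catch {
          case oom: OutOfMemoryError =>
            log.error("Executor thread ran out of memory", oom)
            throw oom
        }
      }
    }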
I don't think that any of the changes I've made here are inconsistent with the
old implementation: if we fail to fetch, we still return None, since
ByteStreams.readFully throws an IOException when it encounters errors (an
EOFException if the stream ends before the buffer is filled, which covers the
old fetchSize != size check).
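(A minimal sketch of how those failure modes map to None; the readBlock helper below is hypothetical and only illustrates the readFully behavior, it is not the PR's code:)

    import java.io.{EOFException, IOException, InputStream}
    import java.nio.ByteBuffer
    import com.google.common.io.ByteStreams

    object ReadBlockSketch {
      // readFully either fills the buffer completely or throws:
      // EOFException if the stream ends early, IOException for any other
      // read failure. Both map to None here.
      def readBlock(is: InputStream, size: Int): Option[ByteBuffer] = {
        val bs = new Array[Byte](size)
        try {
          ByteStreams.readFully(is, bs)
          Some(ByteBuffer.wrap(bs))
        } catch {
          case _: EOFException => None
          case _: IOException => None
        } finally {
          is.close()
        }
      }
    }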
Besides, couldn't the OOM have occurred due to some other thread starving
this one of memory?