Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/2969#discussion_r19454860
--- Diff: core/src/main/scala/org/apache/spark/storage/TachyonStore.scala ---
@@ -105,25 +106,17 @@ private[spark] class TachyonStore(
return None
}
val is = file.getInStream(ReadType.CACHE)
- var buffer: ByteBuffer = null
+ assert (is != null)
try {
- if (is != null) {
- val size = file.length
- val bs = new Array[Byte](size.asInstanceOf[Int])
- val fetchSize = is.read(bs, 0, size.asInstanceOf[Int])
- buffer = ByteBuffer.wrap(bs)
- if (fetchSize != size) {
- logWarning(s"Failed to fetch the block $blockId from Tachyon:
Size $size " +
- s"is not equal to fetched size $fetchSize")
- return None
- }
- }
+ val size = file.length
+ val bs = new Array[Byte](size.asInstanceOf[Int])
+ ByteStreams.readFully(is, bs)
+ Some(ByteBuffer.wrap(bs))
} catch {
--- End diff --
I think that catching OutOfMemoryError is inconsistent with the rest of the
code base. The only places where we catch it are for the purposes of logging
more information about why an executor died.
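(For context, a minimal self-contained sketch of that log-and-rethrow pattern; the object name and the SLF4J logger are my own assumptions for illustration, not actual Spark code:)

    import org.slf4j.LoggerFactory

    object OomLoggingSketch {
      private val log = LoggerFactory.getLogger(getClass)

      // Illustrative only: catch OutOfMemoryError just to record why the
      // executor is dying, then rethrow so the JVM still fails as before.
      def runLogged(body: => Unit): Unit = {
        try {
          body
        } catch {
          case oom: OutOfMemoryError =>
            log.error("Executor thread ran out of memory", oom)
            throw oom
        }
      }
    }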
I don't think that any of the changes I've made here are inconsistent with the
old implementation: if we fail to fetch, we still return None, since
ByteStreams.readFully throws an IOException when it encounters errors (an
EOFException if the stream ends before the buffer is filled, which covers the
old fetchSize != size check).
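(A minimal sketch of how those failure modes map to None; the readBlock helper below is hypothetical and only illustrates the readFully behavior, it is not the PR's code:)

    import java.io.{EOFException, IOException, InputStream}
    import java.nio.ByteBuffer
    import com.google.common.io.ByteStreams

    object ReadBlockSketch {
      // readFully either fills the buffer completely or throws:
      // EOFException if the stream ends early, IOException for any other
      // read failure. Both map to None here.
      def readBlock(is: InputStream, size: Int): Option[ByteBuffer] = {
        val bs = new Array[Byte](size)
        try {
          ByteStreams.readFully(is, bs)
          Some(ByteBuffer.wrap(bs))
        } catch {
          case _: EOFException => None
          case _: IOException => None
        } finally {
          is.close()
        }
      }
    }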
Besides, couldn't the OOM have occurred due to some other thread starving
this one of memory?