Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/5732#discussion_r29398445
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/rdd/WriteAheadLogBackedBlockRDD.scala
---
@@ -94,47 +111,53 @@ class WriteAheadLogBackedBlockRDD[T: ClassTag](
val blockManager = SparkEnv.get.blockManager
val partition = split.asInstanceOf[WriteAheadLogBackedBlockRDDPartition]
val blockId = partition.blockId
- blockManager.get(blockId) match {
- case Some(block) => // Data is in Block Manager
- val iterator = block.data.asInstanceOf[Iterator[T]]
- logDebug(s"Read partition data of $this from block manager, block $blockId")
- iterator
- case None => // Data not found in Block Manager, grab it from write ahead log file
- var dataRead: ByteBuffer = null
- var writeAheadLog: WriteAheadLog = null
- try {
- // The WriteAheadLogUtils.createLog*** method needs a directory to create a
- // WriteAheadLog object as the default FileBasedWriteAheadLog needs a directory for
- // writing log data. However, the directory is not needed if data needs to be read, hence
- // a dummy path is provided to satisfy the method parameter requirements.
- // FileBasedWriteAheadLog will not create any file or directory at that path.
- val dummyDirectory = FileUtils.getTempDirectoryPath()
- writeAheadLog = WriteAheadLogUtils.createLogForReceiver(
- SparkEnv.get.conf, dummyDirectory, hadoopConf)
- dataRead = writeAheadLog.read(partition.walRecordHandle)
- } catch {
- case NonFatal(e) =>
- throw new SparkException(
- s"Could not read data from write ahead log record ${partition.walRecordHandle}", e)
- } finally {
- if (writeAheadLog != null) {
- writeAheadLog.close()
- writeAheadLog = null
- }
- }
- if (dataRead == null) {
+
+ def getBlockFromBlockManager(): Option[Iterator[T]] = {
--- End diff --
This hunk only moves the old code into inner functions, so it is not really
a big change. The real change is that these functions are now called
selectively in lines 157..160.
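
To illustrate the pattern described above, here is a minimal, self-contained sketch. It is not the code from the pull request: the in-memory maps, the `blockIdValid` flag, and the `SelectiveReadSketch` object are hypothetical stand-ins, and only the two inner function names echo the diff. It shows how wrapping each read path in an inner function lets the caller pick between them.

```scala
// Hypothetical stand-ins for illustration only; none of this is Spark's API.
object SelectiveReadSketch {
  // Pretend block manager: block id -> cached records.
  val blockStore: Map[String, Seq[Int]] = Map("block-1" -> Seq(1, 2, 3))

  // Pretend write ahead log: record handle -> logged records.
  val walStore: Map[String, Seq[Int]] =
    Map("wal-1" -> Seq(1, 2, 3), "wal-2" -> Seq(4, 5))

  // `blockIdValid` is an assumed flag that decides which read path to try.
  def compute(blockId: String, walHandle: String, blockIdValid: Boolean): Iterator[Int] = {
    // Inner function wrapping the former `case Some(block)` branch.
    def getBlockFromBlockManager(): Option[Iterator[Int]] =
      blockStore.get(blockId).map(_.iterator)

    // Inner function wrapping the former `case None` branch.
    def getBlockFromWriteAheadLog(): Iterator[Int] =
      walStore.get(walHandle).map(_.iterator).getOrElse(
        throw new IllegalStateException(
          s"Could not read data from write ahead log record $walHandle"))

    // The selective call: consult the block store only when the id is usable,
    // and fall back to the log when the block is missing.
    if (blockIdValid) getBlockFromBlockManager().getOrElse(getBlockFromWriteAheadLog())
    else getBlockFromWriteAheadLog()
  }
}
```

Because `getOrElse` takes its argument by name, the log is read only when the block store lookup returns `None`.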