Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10916368
--- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ---
@@ -50,26 +50,36 @@ private[spark] class
MapOutputTrackerMasterActor(tracker: MapOutputTrackerMaster
}
}
-private[spark] class MapOutputTracker(conf: SparkConf) extends Logging {
+/**
+ * Class that keeps track of the location of the map output of
+ * a stage. This is abstract because different versions of MapOutputTracker
+ * (driver and worker) use different HashMap to store its metadata.
+ */
+private[spark] abstract class MapOutputTracker(conf: SparkConf) extends
Logging {
private val timeout = AkkaUtils.askTimeout(conf)
- // Set to the MapOutputTrackerActor living on the driver
+ /** Set to the MapOutputTrackerActor living on the driver */
var trackerActor: ActorRef = _
- protected val mapStatuses = new TimeStampedHashMap[Int, Array[MapStatus]]
+ /** This HashMap needs to have different storage behavior for driver and
worker */
+ protected val mapStatuses: Map[Int, Array[MapStatus]]
- // Incremented every time a fetch fails so that client nodes know to
clear
- // their cache of map output locations if this happens.
+ /**
+ * Incremented every time a fetch fails so that client nodes know to
clear
+ * their cache of map output locations if this happens.
+ */
protected var epoch: Long = 0
protected val epochLock = new java.lang.Object
--- End diff --
nit: not part of your patch, but can this just be AnyRef?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---