Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10729469
--- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ---
@@ -259,23 +299,24 @@ private[spark] class MapOutputTrackerMaster(conf: SparkConf)
bytes
}
- protected override def cleanup(cleanupTime: Long) {
- super.cleanup(cleanupTime)
- cachedSerializedStatuses.clearOldValues(cleanupTime)
- }
-
override def stop() {
super.stop()
+ metadataCleaner.cancel()
cachedSerializedStatuses.clear()
}
- override def updateEpoch(newEpoch: Long) {
- // This might be called on the MapOutputTrackerMaster if we're running in local mode.
+ protected def cleanup(cleanupTime: Long) {
+ mapStatuses.clearOldValues(cleanupTime)
+ cachedSerializedStatuses.clearOldValues(cleanupTime)
}
+}
- def has(shuffleId: Int): Boolean = {
- cachedSerializedStatuses.get(shuffleId).isDefined || mapStatuses.contains(shuffleId)
- }
+/**
+ * MapOutputTracker for the workers, which fetches map output information from the driver's
+ * MapOutputTrackerMaster.
+ */
+private[spark] class MapOutputTrackerWorker(conf: SparkConf) extends MapOutputTracker(conf) {
--- End diff --
@pwendell Given that MapOutputTrackerWorker only changes the type of HashMap used relative to MapOutputTracker, does it make sense to define two separate classes, MapOutputTrackerMaster and MapOutputTrackerWorker? Or should we keep it as before: MapOutputTracker (for workers) and MapOutputTrackerMaster (for the driver)? I think that since we instantiate different classes in the driver and the worker, it's best to name them accordingly.
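
To illustrate the naming question, here is a minimal sketch (not the actual Spark code; the `conf` parameter and `TimeStampedHashMap` are simplified away) of the class split under discussion, where the two subclasses differ mainly in the concrete map backing `mapStatuses`:

```scala
import scala.collection.mutable

// Shared base: holds the status map and the logic common to driver and workers.
private[spark] abstract class MapOutputTracker {
  // Subclasses choose the backing map: a plain HashMap on workers, and on the
  // driver a time-stamped map whose stale entries can be cleaned periodically.
  protected val mapStatuses: mutable.Map[Int, Array[Byte]]

  def stop(): Unit = mapStatuses.clear()
}

// Driver-side tracker: the authoritative source of map output locations.
private[spark] class MapOutputTrackerMaster extends MapOutputTracker {
  // Stand-in for TimeStampedHashMap in this sketch.
  protected val mapStatuses = new mutable.HashMap[Int, Array[Byte]]

  protected def cleanup(cleanupTime: Long): Unit = {
    // In the real class: mapStatuses.clearOldValues(cleanupTime), etc.
  }
}

// Worker-side tracker: caches statuses fetched from the driver's master tracker.
private[spark] class MapOutputTrackerWorker extends MapOutputTracker {
  protected val mapStatuses = new mutable.HashMap[Int, Array[Byte]]
}
```

Naming the classes after where they are instantiated (driver vs. worker) makes the split explicit at the call site, even though the worker subclass adds little beyond its map type.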