GitHub user kanzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/1082#discussion_r15440360
--- Diff: core/src/main/scala/org/apache/spark/api/java/JavaSparkContext.scala ---
@@ -559,6 +559,19 @@ class JavaSparkContext(val sc: SparkContext) extends JavaSparkContextVarargsWorkaround
   def getLocalProperty(key: String): String = sc.getLocalProperty(key)
+  /**
+   * Get a set of RDD IDs that have marked themselves as persistent via cache() call.
+   * Note that this does not necessarily mean the caching or computation was successful.
+   */
+  def getPersistentRddIds(): java.util.Set[Int] =
+    setAsJavaSet(sc.getPersistentRDDs.keySet)
+
+  /**
+   * Unpersist an RDD from memory and/or disk storage
+   */
+  def unpersistRDD(rddId: Int, blocking: Boolean): Unit =
+    sc.unpersistRDD(rddId, blocking)
--- End diff ---
I personally don't think there is much downside to making
```SparkContext.unpersistRDD``` public (if only to keep the Scala API in sync
with Java/Python), since RDD IDs are already exposed to users and are unique
per SparkContext. I think the key question here is whether there is a
legitimate use case we want to support, and whether making the method public
is the best approach for supporting that use case.
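
For concreteness, here is a minimal sketch of how a driver program might use
the proposed Java API, assuming the two methods land exactly as in the diff
above. The driver setup (local master, app name, sample data) is hypothetical:

```scala
import java.util.Arrays

import org.apache.spark.api.java.JavaSparkContext

object UnpersistByIdExample {
  def main(args: Array[String]): Unit = {
    // Hypothetical local driver; master and app name are illustrative.
    val jsc = new JavaSparkContext("local", "unpersist-by-id-example")

    // Mark an RDD as persistent and force a computation so blocks get cached.
    val rdd = jsc.parallelize(Arrays.asList(1, 2, 3)).cache()
    rdd.count()

    // getPersistentRddIds() (from the diff above) returns the IDs of all RDDs
    // that marked themselves persistent; unpersist each one, blocking until
    // the blocks are actually removed.
    val ids = jsc.getPersistentRddIds().iterator()
    while (ids.hasNext) {
      jsc.unpersistRDD(ids.next(), blocking = true)
    }

    jsc.stop()
  }
}
```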