GitHub user kanzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/1082#discussion_r15440360
--- Diff: core/src/main/scala/org/apache/spark/api/java/JavaSparkContext.scala ---
@@ -559,6 +559,19 @@ class JavaSparkContext(val sc: SparkContext) extends JavaSparkContextVarargsWorkaround
   def getLocalProperty(key: String): String = sc.getLocalProperty(key)
+  /**
+   * Get a set of RDD IDs that have marked themselves as persistent via cache() call.
+   * Note that this does not necessarily mean the caching or computation was successful.
+   */
+  def getPersistentRddIds(): java.util.Set[Int] =
+    setAsJavaSet(sc.getPersistentRDDs.keySet)
+
+  /**
+   * Unpersist an RDD from memory and/or disk storage
+   */
+  def unpersistRDD(rddId: Int, blocking: Boolean): Unit =
+    sc.unpersistRDD(rddId, blocking)
--- End diff ---
I personally don't think there is much downside to making
```SparkContext.unpersistRDD``` public (if only to keep the Scala API in sync
with Java/Python), since RDD IDs are already exposed to users and are unique
per SparkContext. I think the key question here is whether there is a
legitimate use case we want to support, and whether making the method public
is the best approach for supporting that use case.
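
For concreteness, here is a minimal sketch of how a driver program might use
the proposed Java API, assuming the two methods land exactly as in the diff
above. The driver setup (local master, app name, sample data) is hypothetical:

```scala
import java.util.Arrays

import org.apache.spark.api.java.JavaSparkContext

object UnpersistByIdExample {
  def main(args: Array[String]): Unit = {
    // Hypothetical local driver; master and app name are illustrative.
    val jsc = new JavaSparkContext("local", "unpersist-by-id-example")

    // Mark an RDD as persistent and force a computation so blocks get cached.
    val rdd = jsc.parallelize(Arrays.asList(1, 2, 3)).cache()
    rdd.count()

    // getPersistentRddIds() (from the diff above) returns the IDs of all RDDs
    // that marked themselves persistent; unpersist each one, blocking until
    // the blocks are actually removed.
    val ids = jsc.getPersistentRddIds().iterator()
    while (ids.hasNext) {
      jsc.unpersistRDD(ids.next(), blocking = true)
    }

    jsc.stop()
  }
}
```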