Github user pwendell commented on a diff in the pull request:
https://github.com/apache/spark/pull/1535#discussion_r15324267
--- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
@@ -1269,6 +1269,19 @@ abstract class RDD[T: ClassTag](
   /** A description of this RDD and its recursive dependencies for debugging. */
   def toDebugString: String = {
+    // Get a debug description of an rdd without its children
+    def debugSelf (rdd: RDD[_]): Seq[String] = {
+      import Utils.bytesToString
+
+      val persistence = storageLevel.description
+      val storageInfo = rdd.context.getRDDStorageInfo.filter(_.id == rdd.id).map(info =>
--- End diff --
Ah sorry, yeah, I mean this is very costly. I'd rather not do this in a debug
function, because people will do things like print debug statements inside of
loops. In that case the debugging will significantly alter the performance of
their application. There is a separate JIRA to make this function faster (it's
a function also used in the UI), but until that's fixed I'd rather not call it
here:
https://issues.apache.org/jira/browse/SPARK-2316
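
To make the concern concrete, here is a minimal sketch (not the PR's actual
code) of how debugSelf could skip the expensive getRDDStorageInfo scan for
RDDs that are not persisted. It is written as if it lived inside RDD.scala;
the StorageLevel.NONE guard and the output formatting are assumptions for
illustration only:

    // Hypothetical variant of debugSelf: only touch getRDDStorageInfo when
    // the RDD is actually persisted, so plain toDebugString calls stay cheap.
    def debugSelf(rdd: RDD[_]): Seq[String] = {
      import org.apache.spark.storage.StorageLevel
      import org.apache.spark.util.Utils.bytesToString

      val persistence = rdd.getStorageLevel.description
      val storageInfo =
        if (rdd.getStorageLevel == StorageLevel.NONE) {
          // Not cached: nothing to look up, and no expensive call is made.
          Seq.empty[String]
        } else {
          // getRDDStorageInfo walks storage status for every RDD, which is
          // the costly part tracked by SPARK-2316.
          rdd.context.getRDDStorageInfo.filter(_.id == rdd.id).toSeq.map { info =>
            s"CachedPartitions: ${info.numCachedPartitions}; " +
              s"MemorySize: ${bytesToString(info.memSize)}; " +
              s"DiskSize: ${bytesToString(info.diskSize)}"
          }
        }
      persistence +: storageInfo
    }

This keeps the common (uncached) path free of any storage-status lookup, at
the cost of slightly different output for cached RDDs; it is only a sketch of
one possible workaround, not what the PR necessarily ended up doing.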