[
https://issues.apache.org/jira/browse/SPARK-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100239#comment-14100239
]
Cheng Lian commented on SPARK-975:
----------------------------------
Usually we just filter them out by checking package/class names. A similar
trick is used in {{org.apache.spark.util.Utils.getCallSite}}.
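To make the idea concrete, here is a minimal sketch of name-based frame filtering. It is not the actual {{getCallSite}} code: the class {{CallSiteFilter}}, the method {{firstUserFrame}}, and the prefix list are all hypothetical names chosen for illustration, and the only real API it relies on is the standard {{StackTraceElement}}.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Illustrative sketch only; names and the prefix list are assumptions,
// not Spark's real implementation.
final class CallSiteFilter {
    // Hypothetical list of package prefixes treated as "internal" frames.
    static final List<String> INTERNAL_PREFIXES =
        Arrays.asList("org.apache.spark.", "scala.", "java.");

    // A frame is internal if its class name starts with any listed prefix.
    static boolean isInternal(String className) {
        for (String prefix : INTERNAL_PREFIXES) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    // Walk the trace from the innermost frame outward and return the first
    // frame that is not internal, if any.
    static Optional<StackTraceElement> firstUserFrame(StackTraceElement[] trace) {
        for (StackTraceElement frame : trace) {
            if (!isInternal(frame.getClassName())) {
                return Optional.of(frame);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        StackTraceElement[] trace = Thread.currentThread().getStackTrace();
        // The innermost frame (java.lang.Thread.getStackTrace) is skipped
        // because it matches the "java." prefix.
        firstUserFrame(trace).ifPresent(frame ->
            System.out.println(frame.getClassName() + "." + frame.getMethodName()));
    }
}
```

Matching on prefixes is crude (user code living under an excluded package would be hidden as well), so the prefix list would need tuning for a real project.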
> Spark Replay Debugger
> ---------------------
>
> Key: SPARK-975
> URL: https://issues.apache.org/jira/browse/SPARK-975
> Project: Spark
> Issue Type: New Feature
> Components: Spark Core
> Affects Versions: 0.9.0
> Reporter: Cheng Lian
> Labels: arthur, debugger
> Attachments: IMG_20140722_184149.jpg, RDD DAG.png
>
>
> The Spark debugger was first mentioned as {{rddbg}} in the [RDD technical
> report|http://www.cs.berkeley.edu/~matei/papers/2011/tr_spark.pdf].
> [Arthur|https://github.com/mesos/spark/tree/arthur], authored by [Ankur
> Dave|https://github.com/ankurdave], is an old implementation of the Spark
> debugger, which demonstrated both the elegance and power behind the RDD
> abstraction. Unfortunately, the corresponding GitHub branch was never merged
> into the master branch, and its development stopped 2 years ago. For more
> information
> about Arthur, please refer to [the Spark Debugger Wiki
> page|https://github.com/mesos/spark/wiki/Spark-Debugger] in the old GitHub
> repository.
> As a useful tool for Spark application debugging and analysis, it would be
> nice to have a complete Spark debugger. In
> [PR-224|https://github.com/apache/incubator-spark/pull/224], I propose a new
> implementation of the Spark debugger, the Spark Replay Debugger (SRD).
> [PR-224|https://github.com/apache/incubator-spark/pull/224] is only a preview
> for discussion. In the current version, I only implemented features that can
> illustrate the basic mechanisms. Some features that appeared in Arthur are
> still missing in SRD, such as checksum-based nondeterminism detection and
> single-task debugging with a conventional debugger (like {{jdb}}). However,
> these features can be easily built upon the current SRD framework. To minimize
> code review effort, I intentionally left them out of the current version.
> Attached is the visualization of the MLlib ALS application (with 1 iteration)
> generated by SRD. For more information, please refer to [the SRD overview
> document|http://spark-replay-debugger-overview.readthedocs.org/en/latest/].