[ 
https://issues.apache.org/jira/browse/SPARK-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983905#comment-13983905
 ] 

Cheng Lian commented on SPARK-975:
----------------------------------

Hi [~sarutak], thanks for caring about this. Sorry that this issue hasn't been 
updated for a while. At the time SRD was developed, the Spark interfaces it 
relied on were not well chosen and exposed some implementation details to API 
users, so SRD has not been merged yet. We do have plans to improve Spark's 
debugging facilities. Before we settle on a final design, I would like to 
rebase the SRD branch onto the current master so that people can use it to 
debug and analyze their applications, though I can't promise anything for now.

> Spark Replay Debugger
> ---------------------
>
>                 Key: SPARK-975
>                 URL: https://issues.apache.org/jira/browse/SPARK-975
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>    Affects Versions: 0.9.0
>            Reporter: Cheng Lian
>              Labels: arthur, debugger
>
> The Spark debugger was first mentioned as {{rddbg}} in the [RDD technical 
> report|http://www.cs.berkeley.edu/~matei/papers/2011/tr_spark.pdf].
> [Arthur|https://github.com/mesos/spark/tree/arthur], authored by [Ankur 
> Dave|https://github.com/ankurdave], is an old implementation of the Spark 
> debugger, which demonstrated both the elegance and power behind the RDD 
> abstraction.  Unfortunately, the corresponding GitHub branch was not merged 
> into the master branch, and development stopped 2 years ago.  For more information 
> about Arthur, please refer to [the Spark Debugger Wiki 
> page|https://github.com/mesos/spark/wiki/Spark-Debugger] in the old GitHub 
> repository.
> A complete Spark debugger would be a useful tool for debugging and analyzing 
> Spark applications.  In 
> [PR-224|https://github.com/apache/incubator-spark/pull/224], I propose a new 
> implementation of the Spark debugger, the Spark Replay Debugger (SRD).
> [PR-224|https://github.com/apache/incubator-spark/pull/224] is only a preview 
> for discussion.  In the current version, I only implemented features that can 
> illustrate the basic mechanisms.  Some features that appear in Arthur are 
> still missing from SRD, such as checksum-based nondeterminism detection and 
> single-task debugging with a conventional debugger (like {{jdb}}).  However, 
> these features can be easily built on the current SRD framework.  To minimize 
> code review effort, I intentionally left them out of the current version.
> Attached is the visualization of the MLlib ALS application (with 1 iteration) 
> generated by SRD.  For more information, please refer to [the SRD overview 
> document|http://spark-replay-debugger-overview.readthedocs.org/en/latest/].



--
This message was sent by Atlassian JIRA
(v6.2#6252)