GitHub user andrewor14 opened a pull request:

    https://github.com/apache/spark/pull/5912

    [SPARK-7347] Add hover to RDDs in DAG visualization

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/andrewor14/spark viz-hover

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/5912.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5912
    
----
commit 6b3403be587fce495276fcb137d3d8d7afc839a7
Author: Andrew Or <[email protected]>
Date:   2015-04-17T00:33:26Z

    Scope all RDD methods
    
    This commit provides a mechanism to set and unset the call scope
    around each RDD operation defined in RDD.scala. This is useful
    for tagging an RDD with the scope in which it is created. This
    will be extended to similar methods in SparkContext.scala and
    other relevant files in a future commit.

commit a9ed4f9e563a6b4ba4a351f0170da53b3a4c973f
Author: Andrew Or <[email protected]>
Date:   2015-04-17T00:46:19Z

    Add a few missing scopes to certain RDD methods

commit 5143523227d1dc989658f2f8a11e5fa97d8add03
Author: Andrew Or <[email protected]>
Date:   2015-04-17T01:44:08Z

    Expose the necessary information in RDDInfo
    
    This includes the scope field that we added in previous commits,
    and the parent IDs for tracking the lineage through the listener
    API.

commit 21843488193295fea8a08c3cb1556d0b62a809ba
Author: Andrew Or <[email protected]>
Date:   2015-04-17T18:00:31Z

    Translate RDD information to dot file
    
    It turns out that the previous scope information is insufficient
    for producing a valid dot file. In particular, the scope hierarchy
    was missing, yet it is crucial for differentiating between a parent
    RDD in the same encompassing scope and one in a completely
    distinct scope. Also, unique scope identifiers are needed to
    simplify the code significantly.
    
    This commit further adds the translation logic in a UI listener
    that converts RDDInfos to dot files.
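
To illustrate why the hierarchy and unique identifiers matter, here is a minimal Python sketch of such a translation, in which each scope becomes a nested `subgraph cluster_<id>` block. The `Scope` class and the `to_dot` function are hypothetical and do not mirror the actual listener code.

```python
from dataclasses import dataclass, field


@dataclass
class Scope:
    """Hypothetical scope record: a unique id, a display name, nested
    scopes, and the ids of the RDDs created directly in this scope."""
    id: str
    name: str
    children: list = field(default_factory=list)
    node_ids: list = field(default_factory=list)


def scope_to_dot(scope, indent="  "):
    """Render one scope, and recursively its children, as a DOT cluster."""
    lines = [
        f"{indent}subgraph cluster_{scope.id} {{",
        f'{indent}  label="{scope.name}";',
    ]
    for rdd_id in scope.node_ids:
        lines.append(f'{indent}  {rdd_id} [label="RDD {rdd_id}"];')
    for child in scope.children:
        lines.extend(scope_to_dot(child, indent + "  "))
    lines.append(f"{indent}}}")
    return lines


def to_dot(root_scopes, edges):
    """Build a DOT digraph from top-level scopes and (parent, child) edges."""
    lines = ["digraph G {"]
    for scope in root_scopes:
        lines.extend(scope_to_dot(scope))
    for parent, child in edges:
        lines.append(f"  {parent}->{child};")
    lines.append("}")
    return "\n".join(lines)
```

Because every cluster name is derived from a unique scope id, two scopes with the same display name (for example two separate `map` calls) do not collapse into a single cluster.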

commit f22f3379edbdb301631440d1627fb633d0da143f
Author: Andrew Or <[email protected]>
Date:   2015-04-17T20:52:17Z

    First working implementation of visualization with viz.js

commit 9fac6f37e08b74ae19fa268923d10871ffe08aed
Author: Andrew Or <[email protected]>
Date:   2015-04-22T02:23:16Z

    Re-implement scopes through annotations instead
    
    The previous "working" implementation frequently ran into
    NotSerializableExceptions. Why? ClosureCleaner doesn't like
    closures being wrapped in other closures, and these closures
    are simply not cleaned (details are intentionally omitted here).
    
    This commit reimplements scoping through annotations. All methods
    that should be scoped are now annotated with @RDDScope. Then, on
    creation, each RDD derives its scope from the stack trace, similar
    to how it derives its call site. This is the cleanest approach
    that bypasses NotSerializableExceptions with the least significant
    limitations.
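
The idea of deriving a scope from the stack trace can be sketched in Python as below. The decorator `rdd_scope` is a hypothetical stand-in for the @RDDScope annotation, and `current_scope` is an assumed helper; neither is taken from the patch.

```python
import inspect

# Code objects of all functions marked as scoped (hypothetical registry).
_SCOPED = set()


def rdd_scope(func):
    """Mark a function as a scope, analogous to annotating a method."""
    _SCOPED.add(func.__code__)
    return func


def current_scope():
    """Walk the call stack from the innermost frame outwards and return
    the name of the first marked function, or None if there is none."""
    for frame_info in inspect.stack():
        if frame_info.frame.f_code in _SCOPED:
            return frame_info.function
    return None


@rdd_scope
def map_op():
    # Something created here would be tagged with the scope "map_op".
    return current_scope()


def unscoped_helper():
    return current_scope()
```

Note that the stack only yields the name of the enclosing function: two separate calls to `map_op` produce the same result, so this approach identifies the method but not the particular invocation.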

commit 494d5c28b38d3d829f008a1bba406e63d4ec8680
Author: Andrew Or <[email protected]>
Date:   2015-04-22T02:39:14Z

    Revert a few unintended style changes

commit 6a7cdcaed6bb6fd856bd7e2e15b0d78cbdb0b2d1
Author: Andrew Or <[email protected]>
Date:   2015-04-22T03:00:30Z

    Move RDD scope util methods and logic to their own file
    
    Just a small code re-organization.

commit 5e22946945f683927cabafeb0ede3bc8e275e4a0
Author: Andrew Or <[email protected]>
Date:   2015-04-22T03:01:17Z

    Merge branch 'master' of github.com:apache/spark into viz

commit 205f838477de8cabd28aab6301a67fd7d07bc517
Author: Andrew Or <[email protected]>
Date:   2015-04-23T05:33:31Z

    Reimplement rendering with dagre-d3 instead of viz.js
    
    Before this commit, this patch relied on a JavaScript version of
    GraphViz that was compiled from C. Even the minified version of
    this resource was ~2.5M. The main motivation for switching away
    from this library, however, is that it is a complete black box
    over which we have absolutely no control. It is not at all extensible,
    and if something breaks we will have a hard time understanding
    why.
    
    The new library, dagre-d3, is not perfect either. It does not
    officially support clustering of nodes; for certain large graphs,
    the clusters will have a lot of unnecessary whitespace. A few in
    the dagre-d3 community are looking into a solution, but until then
    we will have to live with this (minor) inconvenience.

commit fe7816fe25c2f68ff2eee931ebe7a95b1cc97cdf
Author: Andrew Or <[email protected]>
Date:   2015-04-27T19:37:41Z

    Merge branch 'master' of github.com:apache/spark into viz

commit 8dd5af265ee0c395c4c6d831ca697775d9e28104
Author: Andrew Or <[email protected]>
Date:   2015-04-27T21:50:45Z

    Fill in documentation + miscellaneous minor changes
    
    For instance, this adds the ability to throw away old stage graphs.

commit 71281fa15d3bebac583e93ff84c5062f760b753d
Author: Andrew Or <[email protected]>
Date:   2015-04-27T22:40:52Z

    Embed the viz in the UI in a toggleable manner

commit 09d361eb53a98d758891f3db39d8c9d4c239ee88
Author: Andrew Or <[email protected]>
Date:   2015-04-27T23:42:19Z

    Add ID to node label (minor)

commit 52187fcfaafe8d9ac4531a4a76c2c79281d43f73
Author: Andrew Or <[email protected]>
Date:   2015-04-28T00:17:09Z

    Rat excludes

commit c3bfcae2ae12e1ebc2a817df4eb9dca8fcce463f
Author: Andrew Or <[email protected]>
Date:   2015-04-27T23:21:04Z

    Re-implement scopes using closures instead of annotations
    
    The problem with annotations is that there is no way to associate
    an RDD's scope with another's. This is because the stack trace
    simply does not expose enough information for us to associate one
    instance of a method invocation with another.
    
    So, we're back to closures. Note that this still suffers from the
    same not serializable issue previously discussed, and this is being
    fixed in the ClosureCleaner separately.

commit aa868a98430fafa0c3227d34140d211c38549a1e
Author: Andrew Or <[email protected]>
Date:   2015-04-27T23:34:31Z

    Ensure that HadoopRDD is actually serializable

commit 4310271e39bb67f489a18a5070374c71b8439c37
Author: Andrew Or <[email protected]>
Date:   2015-04-28T00:30:26Z

    Merge branch 'master' of github.com:apache/spark into viz2

commit 7ef957cdfd1889f27dc9b4be81d22d15d4225eb9
Author: Andrew Or <[email protected]>
Date:   2015-04-28T00:31:19Z

    Fix scala style

commit d19c4da59f126b5ee0126fbc88f19b2055e6f359
Author: Andrew Or <[email protected]>
Date:   2015-04-28T21:05:59Z

    Merge branch 'master' of github.com:apache/spark into viz2

commit 6e2cfeae9db3b05ac836a229e888af1a54e4f9d3
Author: Andrew Or <[email protected]>
Date:   2015-04-29T00:40:30Z

    Remove all return statements in `withScope`
    
    The closure cleaner doesn't like these statements, for a good
    reason.

commit 43de96ef71eb5e6ca81102c6e5a5f75b55cdebeb
Author: Andrew Or <[email protected]>
Date:   2015-04-29T00:42:43Z

    Add parent IDs to StageInfo

commit 5e388ea6bf356c9700aeeb325429d27940788c5e
Author: Andrew Or <[email protected]>
Date:   2015-04-29T01:03:18Z

    Fix line too long

commit 5f07e9c3f1ab16f4bf89606a9e3b2633be305df7
Author: Andrew Or <[email protected]>
Date:   2015-04-29T02:49:58Z

    Remove more return statements from scopes

commit ab9141660cac4503309efa07f3b801e9216fc8b9
Author: Andrew Or <[email protected]>
Date:   2015-05-01T04:10:39Z

    Introduce visualization to the Job Page
    
    This includes a generalization of the visualization previously
    displayed on the stage page. More functionality is needed in
    JavaScript to prevent the job visualization from looking too
    cluttered. This is still WIP.

commit 5c7ce164f8ba820daaa5e19dbaa8be166ac90e64
Author: Andrew Or <[email protected]>
Date:   2015-05-01T19:26:53Z

    Connect RDDs across stages + update style
    
    This requires us to track incoming and outgoing edges in each
    stage on the backend, and render the connecting edges manually
    ourselves in d3.
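
As a rough illustration of the bookkeeping this implies, the Python sketch below splits RDD edges into those internal to a stage and those crossing stages. The function `split_edges` and its data layout are assumptions, not the backend code.

```python
def split_edges(edges, stage_of):
    """Classify (parent_rdd, child_rdd) edges by stage.

    edges:    iterable of (parent_rdd_id, child_rdd_id) pairs
    stage_of: dict mapping each RDD id to the id of its stage
    Returns three dicts keyed by stage id: edges internal to the stage,
    edges coming in from other stages, and edges going out to them.
    """
    internal, incoming, outgoing = {}, {}, {}
    for parent, child in edges:
        src, dst = stage_of[parent], stage_of[child]
        if src == dst:
            internal.setdefault(src, []).append((parent, child))
        else:
            # A cross-stage edge is outgoing for the parent's stage
            # and incoming for the child's stage.
            outgoing.setdefault(src, []).append((parent, child))
            incoming.setdefault(dst, []).append((parent, child))
    return internal, incoming, outgoing
```

Internal edges can be handed to the layout library one stage at a time, while the incoming and outgoing lists identify the connections that have to be drawn separately between stage boxes.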

commit deb48a0d0580ecfef0ad1fb3b867ef365723785d
Author: Andrew Or <[email protected]>
Date:   2015-05-01T20:45:09Z

    Translate stage boxes taking into account the width
    
    Previously we had a lot of overlapping boxes for, say, ALS. This is
    because we did not take into account the widths of the previous
    boxes.
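
The fix amounts to a running sum over the widths of the preceding boxes. A minimal Python sketch, assuming a left-to-right layout and a hypothetical `padding` between boxes:

```python
def stage_offsets(widths, padding=20):
    """Return the x offset of each stage box so that no two boxes overlap.

    widths:  rendered width of each stage box, in layout order
    padding: assumed gap between neighbouring boxes
    """
    offsets = []
    x = 0
    for width in widths:
        offsets.append(x)
        # The next box starts after this one plus the gap.
        x += width + padding
    return offsets
```

For example, `stage_offsets([100, 50, 80], padding=10)` gives `[0, 110, 170]`, whereas translating each box by a fixed amount would let a wide box cover its neighbour.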

commit 0706992a995d711c268fecab69da421b3dd12144
Author: Andrew Or <[email protected]>
Date:   2015-05-01T20:59:28Z

    Add link from jobs to stages

commit b80cc52d81e6963d78b670299b40c8fc033f40e2
Author: Andrew Or <[email protected]>
Date:   2015-05-03T00:37:25Z

    Merge branch 'master' of github.com:apache/spark into viz2
    
    Conflicts:
        core/src/main/scala/org/apache/spark/storage/RDDInfo.scala
        core/src/main/scala/org/apache/spark/ui/jobs/JobPage.scala
        core/src/main/scala/org/apache/spark/ui/jobs/JobsTab.scala

commit f9830a2f09a2a8f55b3fd0309c46374f75a6501d
Author: Andrew Or <[email protected]>
Date:   2015-05-03T23:23:43Z

    Refactor + clean up + document JS visualization code
    
    This commit should not introduce any substantial functionality
    differences. It just cleans up the JavaScript side of this patch
    such that it is easier to follow.

----


