Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/5729#discussion_r29610035
--- Diff: core/src/main/resources/org/apache/spark/ui/static/spark-dag-viz.js ---
@@ -0,0 +1,390 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/*
+ * This file contains the logic to render the RDD DAG visualization in the UI.
+ *
+ * This DAG describes the relationships between
+ * (1) an RDD and its dependencies,
+ * (2) an RDD and its operation scopes, and
+ * (3) an RDD's operation scopes and the stage / job hierarchy
+ *
+ * An operation scope is a general, named code block representing an operation
+ * that instantiates RDDs (e.g. filter, textFile, reduceByKey). An operation
+ * scope can be nested inside of other scopes if the corresponding RDD operation
+ * invokes other such operations (for more detail, see o.a.s.rdd.operationScope).
+ *
+ * A stage may include one or more operation scopes if the RDD operations are
+ * streamlined into one stage (e.g. rdd.map(...).filter(...).flatMap(...)).
+ * On the flip side, an operation scope may also include one or many stages,
+ * or even jobs if the RDD operation is higher level than Spark's scheduling
+ * primitives (e.g. take, any SQL query).
+ *
+ * In the visualization, an RDD is expressed as a node, and its dependencies
+ * as directed edges (from parent to child). Operation scopes, stages, and
+ * jobs are expressed as clusters that may contain one or many nodes. These
+ * clusters may be nested inside of each other in the scenarios described
+ * above.
+ *
+ * The visualization is rendered in an SVG contained in "div#dag-viz-graph",
+ * and its input data is expected to be populated in "div#dag-viz-metadata"
+ * by Spark's UI code. This is currently used only on the stage page and on
+ * the job page.
+ *
+ * This requires jQuery, d3, and dagre-d3. Note that we use a custom release
+ * of dagre-d3 (http://github.com/andrewor14/dagre-d3) for some specific
+ * functionality. For more detail, please track the changes in that project
+ * since it was forked (commit 101503833a8ce5fe369547f6addf3e71172ce10b).
+ */
+
+var VizConstants = {
+ rddColor: "#444444",
--- End diff ---
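
To make the structure described in the file's header comment more concrete, here is a minimal, purely illustrative sketch (not code from this patch) of how such a compound graph could be declared with dagre-d3's graphlib API; all IDs, labels, and the selector below are made up for illustration:

    // Hypothetical example only: RDDs become nodes, scopes/stages become nested clusters.
    var g = new dagreD3.graphlib.Graph({compound: true})
      .setGraph({})
      .setDefaultEdgeLabel(function() { return {}; });

    // A stage cluster containing two operation scope clusters.
    g.setNode("stage0", {label: "Stage 0"});
    g.setNode("scope_textFile", {label: "textFile"});
    g.setNode("scope_filter", {label: "filter"});
    g.setParent("scope_textFile", "stage0");
    g.setParent("scope_filter", "stage0");

    // RDD nodes, each nested inside the scope that created it.
    g.setNode("rdd0", {label: "RDD 0"});
    g.setNode("rdd1", {label: "RDD 1"});
    g.setParent("rdd0", "scope_textFile");
    g.setParent("rdd1", "scope_filter");

    // An RDD dependency is a directed edge from parent to child.
    g.setEdge("rdd0", "rdd1");

    // Lay out and render into the SVG the page provides (selector is illustrative).
    var render = new dagreD3.render();
    render(d3.select("#dag-viz-graph svg g"), g);
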
Yeah, we can do that in CSS, but I actually think it's quite nice if
everything about the DAG is in this file. If we decide to move it we can always
do it later.
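
For illustration only, a rough sketch of the two options being weighed; the class name, selector, and stylesheet rule below are hypothetical and not part of the patch:

    // Keeping it in this file: apply the JS constant directly when rendering.
    d3.selectAll("#dag-viz-graph g.node rect")
      .style("stroke", VizConstants.rddColor);

    // The CSS alternative: tag the element with a class here ...
    d3.selectAll("#dag-viz-graph g.node rect").classed("dag-rdd-node", true);
    // ... and keep the color in a stylesheet instead, e.g.
    //   .dag-rdd-node { stroke: #444444; }

The patch keeps the constant in JS so that everything about the visualization stays in this one file, as noted above.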