Github user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/7774#issuecomment-127429075
Review status: 0 of 22 files reviewed at latest revision, 66 unresolved
discussions, some commit checks failed.
---
<sup>**[sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala,
line 71
\[r4\]](https://reviewable.io:443/reviews/apache/spark/7774#-JvpjDe0mLMQg8-alRGn)**
([raw
file](https://github.com/apache/spark/blob/94065929603633714929c5ecbd43c2a65182552a/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala#L71)):</sup>
shall we keep this `private[spark]` for now? We can always expose it in the
future if we want. Same in the accessor several lines down.
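For illustration, keeping both the member and its accessor package-private would look something like this (the member names here are hypothetical, not the actual ones in SparkPlan.scala):
```
package org.apache.spark.example

class Plan {
  // Hypothetical members: the point is the `private[spark]` visibility,
  // which we can always widen later without breaking compatibility.
  private[spark] var planDescription: String = ""
  private[spark] def description: String = planDescription
}
```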
---
<sup>**[sql/core/src/main/scala/org/apache/spark/sql/execution/SQLExecution.scala,
line 58
\[r3\]](https://reviewable.io:443/reviews/apache/spark/7774#-JvokJQh1eM3HWsvMBR6-r3-58)**
([raw
file](https://github.com/apache/spark/blob/ca1811f5beb035db74e932e683716bd09e1f7302/sql/core/src/main/scala/org/apache/spark/sql/execution/SQLExecution.scala#L58)):</sup>
I should add though that (1) allows us to write these custom events to the
event log, and thus lets us see the information in the history server, which
is very valuable.
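As a rough sketch of what a loggable custom event could look like, assuming the scheduler's `SparkListenerEvent` trait can be extended (the event name and fields are invented, not what this PR defines):
```
import org.apache.spark.scheduler.SparkListenerEvent

// Invented event: anything posted to the listener bus as a SparkListenerEvent
// can be written to the event log and replayed by the history server.
case class SQLExecutionStart(executionId: Long, description: String, startTime: Long)
  extends SparkListenerEvent
```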
---
<sup>**[sql/core/src/main/scala/org/apache/spark/sql/ui/AllExecutionsPage.scala,
line 133
\[r4\]](https://reviewable.io:443/reviews/apache/spark/7774#-Jvpk6HAav0HqRcnGqcD)**
([raw
file](https://github.com/apache/spark/blob/94065929603633714929c5ecbd43c2a65182552a/sql/core/src/main/scala/org/apache/spark/sql/ui/AllExecutionsPage.scala#L133)):</sup>
do you need `else { Nil }` here?
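For context, whether the `else { Nil }` can be dropped depends on the expected type of the expression; a standalone sketch:
```
val condition = true
// With the else branch, the expression is a Seq[Int]:
val withElse: Seq[Int] = if (condition) Seq(1, 2) else Nil
// Without it, the if-expression widens (the missing branch is Unit),
// so this would not type-check:
// val withoutElse: Seq[Int] = if (condition) Seq(1, 2)
```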
---
<sup>**[sql/core/src/main/scala/org/apache/spark/sql/ui/AllExecutionsPage.scala,
line 154
\[r4\]](https://reviewable.io:443/reviews/apache/spark/7774#-JvpkGDIsvkYxcFl-p1P)**
([raw
file](https://github.com/apache/spark/blob/94065929603633714929c5ecbd43c2a65182552a/sql/core/src/main/scala/org/apache/spark/sql/ui/AllExecutionsPage.scala#L154)):</sup>
`+detail` (without space) to be consistent with above
---
<sup>**[sql/core/src/main/scala/org/apache/spark/sql/ui/SparkPlanGraph.scala,
line 52
\[r4\]](https://reviewable.io:443/reviews/apache/spark/7774#-JvpkizmZ3pVvaR_iuiG)**
([raw
file](https://github.com/apache/spark/blob/94065929603633714929c5ecbd43c2a65182552a/sql/core/src/main/scala/org/apache/spark/sql/ui/SparkPlanGraph.scala#L52)):</sup>
does this need to be an `AtomicLong`? Seems a little overkill. Can this just
be `nextNodeId = 0` or something?
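A minimal sketch of the suggestion, assuming the graph is only ever built from a single thread:
```
// Plain counter instead of an AtomicLong; fine as long as
// SparkPlanGraph construction is single-threaded.
private var nextNodeId = 0L

private def nextId(): Long = {
  val id = nextNodeId
  nextNodeId += 1
  id
}
```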
---
<sup>**[sql/core/src/main/scala/org/apache/spark/sql/ui/SQLListener.scala,
line 139
\[r4\]](https://reviewable.io:443/reviews/apache/spark/7774#-JvplbxtEcRT_dHYezAz)**
([raw
file](https://github.com/apache/spark/blob/94065929603633714929c5ecbd43c2a65182552a/sql/core/src/main/scala/org/apache/spark/sql/ui/SQLListener.scala#L139)):</sup>
use the named argument `finishTask = true`; same in `onExecutorMetricsUpdate`
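That is, spell the boolean out at the call site so it is self-documenting; a hypothetical call (parameter names invented, using the rename suggested in the next comment):
```
// A bare `true` says nothing at the call site; `finishTask = true` does.
updateTaskAccumulatorValues(taskId, stageId, accumulatorUpdates, finishTask = true)
```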
---
<sup>**[sql/core/src/main/scala/org/apache/spark/sql/ui/SQLListener.scala,
line 142
\[r4\]](https://reviewable.io:443/reviews/apache/spark/7774#-Jvpm5Q5pWW4XPL9JV6S)**
([raw
file](https://github.com/apache/spark/blob/94065929603633714929c5ecbd43c2a65182552a/sql/core/src/main/scala/org/apache/spark/sql/ui/SQLListener.scala#L142)):</sup>
I would rename this `updateTaskAccumulatorValues`, then add a Scaladoc:
```
/**
 * Update the accumulator values of a task with the latest metrics for this task. This is
 * called every time we receive an executor heartbeat or when a task finishes.
 */
```
---
<sup>**[sql/core/src/main/scala/org/apache/spark/sql/ui/SQLListener.scala,
line 198
\[r4\]](https://reviewable.io:443/reviews/apache/spark/7774#-JvpnK83TUW9ZFVHe00-)**
([raw
file](https://github.com/apache/spark/blob/94065929603633714929c5ecbd43c2a65182552a/sql/core/src/main/scala/org/apache/spark/sql/ui/SQLListener.scala#L198)):</sup>
The `toMap` here doesn't seem safe. It turns out Scala silently collapses
items with the same key when you call `toMap`:
```
scala> Seq((1, 1), (1, 2), (1, 3)).toMap
res5: scala.collection.immutable.Map[Int,Int] = Map(1 -> 3)
```
What are the semantics here? My understanding is that we want to display
the accumulators for each of the operators in this plan, not just for the last
one that has non-zero accumulator values.
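If the intent is to keep every operator's values, one option is to group by key before building the map; a sketch with made-up data:
```
val updates = Seq((1L, 10L), (1L, 20L), (2L, 5L))
// groupBy preserves all values per key, unlike toMap, which keeps only the last pair:
val perKey: Map[Long, Seq[Long]] =
  updates.groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2) }
// perKey: Map(1 -> Seq(10, 20), 2 -> Seq(5))
```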
---
<sup>**[sql/core/src/main/scala/org/apache/spark/sql/ui/SQLListener.scala,
line 252
\[r4\]](https://reviewable.io:443/reviews/apache/spark/7774#-JvppYcO2JgB6qKiIEyF)**
([raw
file](https://github.com/apache/spark/blob/94065929603633714929c5ecbd43c2a65182552a/sql/core/src/main/scala/org/apache/spark/sql/ui/SQLListener.scala#L252)):</sup>
this could be the Scaladoc itself, actually
---
Comments from the [review on
Reviewable.io](https://reviewable.io:443/reviews/apache/spark/7774)