Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/12612#discussion_r60852131
--- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala ---
@@ -287,11 +258,10 @@ private[spark] object TaskMetrics extends Logging {
   def fromAccumulatorUpdates(accumUpdates: Seq[AccumulableInfo]): TaskMetrics = {
     val definedAccumUpdates = accumUpdates.filter(_.update.isDefined)
     val metrics = new ListenerTaskMetrics(definedAccumUpdates)
-    // We don't register this [[ListenerTaskMetrics]] for cleanup, and this is only used to post
-    // event, we should un-register all accumulators immediately.
-    metrics.internalAccums.foreach(acc => Accumulators.remove(acc.id))
-    definedAccumUpdates.filter(_.internal).foreach { accum =>
-      metrics.internalAccums.find(_.name == accum.name).foreach(_.setValueAny(accum.update.get))
+    definedAccumUpdates.filter(_.internal).foreach { accInfo =>
+      metrics.internalAccums.find(_.name == accInfo.name).foreach { acc =>
+        acc.asInstanceOf[Accumulator[Any, Any]].add(accInfo.update.get)
--- End diff ---
This example shows a weakness of the new API: we can't `setValue`. Here we
already have the final output and want to set the accumulator's value so that
it produces that same output. With the new API, we can't guarantee that every
accumulator can implement `setValue`, e.g. the average accumulator. I'm still
thinking about how to fix this or work around it, @rxin any ideas?
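
To make the `setValue` problem concrete, here is a minimal sketch (the
`AverageAccumulator` below is a hypothetical illustration, not Spark's actual
class): when an accumulator's output is derived from richer internal state,
the final value alone underdetermines that state, so no faithful `setValue`
exists:

```scala
// Hypothetical average accumulator: its output is derived from the
// internal (sum, count) pair, not stored directly.
class AverageAccumulator {
  private var sum: Double = 0.0
  private var count: Long = 0L

  def add(v: Double): Unit = {
    sum += v
    count += 1
  }

  def value: Double = if (count == 0) 0.0 else sum / count

  // def setValue(avg: Double): Unit = ???
  // Cannot be implemented faithfully: an average of 2.0 could have come
  // from (sum = 4.0, count = 2) or (sum = 6.0, count = 3), and subsequent
  // add/merge calls behave differently depending on which state we pick.
}
```

This is also why the diff above falls back to calling `add` on a freshly
created accumulator: starting from a zero state, `add` happens to reproduce
the value for sum-like accumulators, but it would not for an average.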