[
https://issues.apache.org/jira/browse/SPARK-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael Armbrust updated SPARK-10620:
-------------------------------------
Target Version/s: 2.0.0 (was: 1.6.0)
> Look into whether accumulator mechanism can replace TaskMetrics
> ---------------------------------------------------------------
>
> Key: SPARK-10620
> URL: https://issues.apache.org/jira/browse/SPARK-10620
> Project: Spark
> Issue Type: Task
> Components: Spark Core
> Reporter: Patrick Wendell
> Assignee: Andrew Or
>
> This task is simply to explore whether the internal bookkeeping currently done
> by TaskMetrics could instead be implemented with accumulators, rather than
> maintaining two separate mechanisms. Note that we need to continue to preserve
> the existing "Task Metric" data structures that are exposed to users through
> event logs etc. The question is whether we can use a single internal codepath
> and perhaps make this easier to extend in the future.
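> A minimal, self-contained sketch of the idea (hypothetical class and metric
> names only; this is not Spark's actual internals): each user-facing TaskMetrics
> field becomes a thin view over an internal accumulator, so everything the
> driver aggregates flows through a single codepath.
>
> class LongAccumulator(val name: String) {
>   private var _value: Long = 0L
>   def add(v: Long): Unit = _value += v
>   def value: Long = _value
>   def reset(): Unit = _value = 0L                        // zeroed e.g. on a stage retry
>   def merge(other: LongAccumulator): Unit = _value += other.value
> }
>
> class TaskMetrics {
>   // user-facing fields are preserved, but each is backed by an accumulator
>   val recordsRead         = new LongAccumulator("internal.metrics.recordsRead")
>   val shuffleBytesWritten = new LongAccumulator("internal.metrics.shuffleBytesWritten")
>   // one internal codepath: the driver only needs to aggregate this list
>   def internalAccums: Seq[LongAccumulator] = Seq(recordsRead, shuffleBytesWritten)
> }
>
> object MergeDemo extends App {
>   val task1 = new TaskMetrics
>   val task2 = new TaskMetrics
>   task1.recordsRead.add(100)
>   task2.recordsRead.add(250)
>   val stageTotal = new LongAccumulator("recordsRead")
>   Seq(task1, task2).foreach(t => stageTotal.merge(t.recordsRead))
>   println(s"stage recordsRead = ${stageTotal.value}")    // 350
> }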
> I think a full exploration would answer the following questions:
> - How do the semantics of accumulators on stage retries differ from aggregate
> TaskMetrics for a stage? Could we define clearer retry semantics for internal
> accumulators so that the two match - for instance, zeroing accumulator values
> when a stage is retried (see the discussion in SPARK-10042 and the sketch
> after this list)?
> - Are there metrics that do not fit well into the accumulator model, or that
> would be difficult to update as accumulators?
> - If we expose metrics through accumulators in the future rather than
> continuing to add fields to TaskMetrics, what is the best way to preserve
> compatibility?
> - Are there any other considerations?
> - Is it worth it to do this, or is the consolidation too complicated to
> justify?
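> The retry question above can be made concrete with a toy example (hypothetical
> names, not the actual scheduler logic): if a failed attempt's partial counts
> are not zeroed, the retried stage double-counts them.
>
> object RetrySemanticsSketch extends App {
>   var recordsRead: Long = 0L                 // stands in for an internal accumulator
>
>   // A "stage attempt": tasks add to the metric as they run; returns true on success.
>   def runStageAttempt(records: Long, fails: Boolean): Boolean = {
>     recordsRead += records
>     !fails
>   }
>
>   // Attempt 0 fails partway through and the stage is retried.
>   if (!runStageAttempt(records = 120, fails = true)) {
>     recordsRead = 0L                         // proposed semantics: zero on retry
>     runStageAttempt(records = 300, fails = false)
>   }
>   println(s"recordsRead after retry = $recordsRead")  // 300 rather than 420
> }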