GitHub user pwendell opened a pull request:

    https://github.com/apache/spark/pull/1309

    SPARK-2380 [WIP]: Support displaying accumulator values in the web UI

    This patch adds support for giving accumulators user-visible names and 
displaying accumulator values in the web UI. This allows users to create custom 
counters that are displayed in the UI. The current approach displays both the 
accumulator deltas caused by each task and a "current" value of the accumulator 
totals for each stage, which gets updated as tasks finish.
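
    To make the delta/total distinction concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark's implementation: the `NamedAccumulator` class, its methods, and the "records read" name are all hypothetical and exist only to show how per-task deltas could roll up into a per-stage "current" value as tasks finish.

```python
from collections import defaultdict


class NamedAccumulator:
    """Toy stand-in for a named accumulator (hypothetical, not Spark's class)."""

    def __init__(self, name, initial=0):
        self.name = name
        self.value = initial                 # running "current" stage total
        self.task_deltas = defaultdict(int)  # task id -> delta caused by that task

    def add(self, task_id, amount):
        # Record the contribution under the task that produced it.
        self.task_deltas[task_id] += amount

    def task_finished(self, task_id):
        # Fold a finished task's delta into the stage-level total.
        delta = self.task_deltas[task_id]
        self.value += delta
        return delta


acc = NamedAccumulator("records read")
acc.add(task_id=0, amount=120)
acc.add(task_id=1, amount=80)
acc.add(task_id=0, amount=30)

# Only task 0 has finished, so the stage total reflects task 0 alone.
delta0 = acc.task_finished(0)
print(acc.name, "| task 0 delta:", delta0, "| stage total:", acc.value)
```

    In this model a UI would show `task_deltas` per task row and `value` per stage; task 1's delta stays out of the total until that task reports completion.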
    
    Currently, Spark developers have been extending the `TaskMetrics` 
functionality to provide custom instrumentation for RDDs. This patch provides a 
potentially nicer alternative: going through the existing accumulator 
framework (`TaskMetrics` and accumulators are in fact on an awkward collision 
course as we add more features to the former). The current patch demonstrates 
how we can use the feature to provide instrumentation for RDD input sizes. The 
nice thing about going through accumulators is that users can actually read the 
current value of the data being tracked in their programs. This could be useful, 
for example, to decide whether to short-circuit a Spark stage depending on how 
things are going.
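
    The short-circuit idea can be sketched as follows. This is plain Python rather than Spark code, and everything in it is an assumption made for illustration: `run_stage`, `max_bad_records`, and the convention that `None` marks a bad record are hypothetical. The point is only that a counter the program can read lets it stop work early.

```python
def run_stage(partitions, max_bad_records):
    """Process partitions in order, stopping early on too many bad records.

    `bad_records` plays the role of a user-readable accumulator; `None`
    marks a bad record (an assumption made for this sketch).
    """
    bad_records = 0
    processed = 0
    for part in partitions:
        bad_records += sum(1 for rec in part if rec is None)
        processed += 1
        if bad_records > max_bad_records:
            break  # short-circuit: the tracked value says things are going badly
    return processed, bad_records


partitions = [[1, None, 3], [None, None, 6], [7, 8, 9]]
processed, bad = run_stage(partitions, max_bad_records=2)
print("partitions processed:", processed, "| bad records seen:", bad)
```

    Here the third partition is never processed because the counter crosses the threshold after the second one.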

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/pwendell/spark metrics

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1309.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1309
    
----
commit 0b72660a8da074f303ea1795af9ee1f0312877a7
Author: Patrick Wendell <pwend...@gmail.com>
Date:   2014-07-06T04:11:15Z

    Initial WIP example of supporting globally named accumulators.

commit ad85076f621df3dc688761bd189af2fd5935bd52
Author: Patrick Wendell <pwend...@gmail.com>
Date:   2014-07-06T11:41:51Z

    Example of using named accumulators for custom RDD metrics.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
