Micah Whitacre created CRUNCH-558:
-------------------------------------

             Summary: Add name to Spark Accumulators
                 Key: CRUNCH-558
                 URL: https://issues.apache.org/jira/browse/CRUNCH-558
             Project: Crunch
          Issue Type: Improvement
          Components: Spark
            Reporter: Micah Whitacre


It was brought up on the mailing list that our Crunch counters are not showing 
up on the Spark web UI, possibly because they are not named.

{quote}
We are currently testing a few capabilities using Spark, and one thing we 
noticed is that Spark does not list any user-defined accumulators on the web 
UI. 

On MapReduce I would expect counters to be displayed on the job page; on a 
SparkPipeline, however, I was only able to pull counter information from 
PipelineResult#getStageResult(). 

I think the reason these accumulators are not visible on the web UI is that 
Crunch does not name them. Spark expects an accumulator to have a name in 
order to be visible on the UI.

https://github.com/apache/crunch/blob/apache-crunch-0.13.0/crunch-spark/src/main/java/org/apache/crunch/impl/spark/SparkRuntime.java#L125-L126

https://github.com/apache/spark/blob/v1.4.1/core/src/main/scala/org/apache/spark/api/java/JavaSparkContext.scala#L616-L624
 (accumulator API with Name)

I would like to know if it's possible in Crunch to name these accumulators so 
they are available in the web UI. This would let users monitor accumulators 
from the web UI to obtain key information about their jobs. 
{quote}
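
For reference, the difference described in the quote can be sketched against 
the Spark 1.4.x Java API linked above. This is only an illustration, not the 
change to SparkRuntime itself: the class name, the application name and the 
accumulator name "records-seen" are hypothetical, and the behavior of the web 
UI is taken from the report above rather than verified here.

{code:java}
import java.util.Arrays;

import org.apache.spark.Accumulator;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.VoidFunction;

// Hypothetical standalone sketch; assumes Spark 1.4.x on the classpath.
public class NamedAccumulatorSketch {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf()
        .setMaster("local[2]")
        .setAppName("named-accumulator-sketch");
    JavaSparkContext sc = new JavaSparkContext(conf);
    try {
      // Unnamed accumulator: readable from the driver, but per the report
      // above it is not listed on the web UI.
      final Accumulator<Integer> unnamed = sc.accumulator(0);

      // Named accumulator: same type, created through the overload that
      // also takes a name ("records-seen" is an illustrative name).
      final Accumulator<Integer> named = sc.accumulator(0, "records-seen");

      // Increment both accumulators once per element.
      sc.parallelize(Arrays.asList(1, 2, 3, 4)).foreach(
          new VoidFunction<Integer>() {
            @Override
            public void call(Integer value) {
              unnamed.add(1);
              named.add(1);
            }
          });

      // Accumulator values are read back on the driver.
      System.out.println("unnamed=" + unnamed.value()
          + " named=" + named.value());
    } finally {
      sc.stop();
    }
  }
}
{code}

For Crunch the equivalent would presumably be to pass a name when 
SparkRuntime creates its counter accumulator, using the overload that accepts 
both a name and an AccumulatorParam.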



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
