[ https://issues.apache.org/jira/browse/SPARK-20404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976506#comment-15976506 ]

Sergey Zhemzhitsky commented on SPARK-20404:
--------------------------------------------

I would agree if the error occurred only at the time the accumulator is 
created, but in this case it may surface at any point (possibly after hours of 
running the job) when the accumulator is updated. So to me the current 
behaviour seems misleading and confusing rather than expected.
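For reference, the Some-vs-Option distinction described in the issue can be sketched as follows (a minimal standalone snippet, not Spark's actual code; the accumulator name "acc" is illustrative):

```scala
object OptionVsSome {
  def main(args: Array[String]): Unit = {
    val name: String = null

    // Option(x) maps a null argument to None, so later calls are safe.
    val safe: Option[String] = Option(name)
    println(safe.exists(_.startsWith("acc")))  // false, no exception

    // Some(x) keeps the null inside, so the predicate in exists
    // dereferences it and throws, mirroring AccumulatorV2.toInfo.
    val unsafe: Option[String] = Some(name)
    try {
      unsafe.exists(_.startsWith("acc"))
    } catch {
      case _: NullPointerException =>
        println("NPE only on later use, not at creation time")
    }
  }
}
```

This is why the failure shows up in the DAGScheduler during accumulator updates rather than when the accumulator is created: `Some(null)` is a perfectly valid value until something inspects its contents.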

> Regression with accumulator names when migrating from 1.6 to 2.x
> ----------------------------------------------------------------
>
>                 Key: SPARK-20404
>                 URL: https://issues.apache.org/jira/browse/SPARK-20404
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.0.0, 2.0.1, 2.0.2, 2.1.0
>         Environment: Spark: 2.1
> Scala: 2.11
> Spark master: local
>            Reporter: Sergey Zhemzhitsky
>         Attachments: spark-context-accum-option.patch
>
>
> Creating an accumulator with an explicitly specified name of _null_, like the 
> following
> {code:java}
> sparkContext.accumulator(0, null)
> {code}
> throws exception at runtime
> {code:none}
> ERROR | DAGScheduler | dag-scheduler-event-loop | Failed to update 
> accumulators for task 0
> java.lang.NullPointerException
>       at 
> org.apache.spark.util.AccumulatorV2$$anonfun$1.apply(AccumulatorV2.scala:106)
>       at 
> org.apache.spark.util.AccumulatorV2$$anonfun$1.apply(AccumulatorV2.scala:106)
>       at scala.Option.exists(Option.scala:240)
>       at org.apache.spark.util.AccumulatorV2.toInfo(AccumulatorV2.scala:106)
>       at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$updateAccumulators$1.apply(DAGScheduler.scala:1091)
>       at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$updateAccumulators$1.apply(DAGScheduler.scala:1080)
>       at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>       at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>       at 
> org.apache.spark.scheduler.DAGScheduler.updateAccumulators(DAGScheduler.scala:1080)
>       at 
> org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:1183)
>       at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1647)
>       at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1605)
>       at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1594)
>       at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> {code}
> The issue is that the name is wrapped in _Some_ instead of _Option_ when 
> accumulators are created, so a _null_ name survives as _Some(null)_ and is 
> dereferenced later.
> Patch is available.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
