Shixiong Zhu created SPARK-31923: ------------------------------------ Summary: Event log cannot be generated when some internal accumulators use unexpected types Key: SPARK-31923 URL: https://issues.apache.org/jira/browse/SPARK-31923 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 2.4.6 Reporter: Shixiong Zhu
A user may use internal accumulators by adding the "internal.metrics." prefix to the accumulator name to hide sensitive information from UI (Accumulators will be shown in Spark UI by default). However, org.apache.spark.util.JsonProtocol.accumValueToJson assumes an internal accumulator has only 3 possible types: int, long, and java.util.List[(BlockId, BlockStatus)]. When an internal accumulator uses an unexpected type, it will crash. It's better to make accumValueToJson more robust. How to reproduce it: - Enable Spark event log - Run the following command: {code} scala> val accu = sc.doubleAccumulator("internal.metrics.foo") accu: org.apache.spark.util.DoubleAccumulator = DoubleAccumulator(id: 0, name: Some(internal.metrics.foo), value: 0.0) scala> sc.parallelize(1 to 1, 1).foreach { _ => accu.add(1.0) } 20/06/06 16:11:27 ERROR AsyncEventQueue: Listener EventLoggingListener threw an exception java.lang.ClassCastException: java.lang.Double cannot be cast to java.util.List at org.apache.spark.util.JsonProtocol$.accumValueToJson(JsonProtocol.scala:330) at org.apache.spark.util.JsonProtocol$$anonfun$accumulableInfoToJson$3.apply(JsonProtocol.scala:306) at org.apache.spark.util.JsonProtocol$$anonfun$accumulableInfoToJson$3.apply(JsonProtocol.scala:306) at scala.Option.map(Option.scala:146) at org.apache.spark.util.JsonProtocol$.accumulableInfoToJson(JsonProtocol.scala:306) at org.apache.spark.util.JsonProtocol$$anonfun$accumulablesToJson$2.apply(JsonProtocol.scala:299) at org.apache.spark.util.JsonProtocol$$anonfun$accumulablesToJson$2.apply(JsonProtocol.scala:299) at scala.collection.immutable.List.map(List.scala:284) at org.apache.spark.util.JsonProtocol$.accumulablesToJson(JsonProtocol.scala:299) at org.apache.spark.util.JsonProtocol$.taskInfoToJson(JsonProtocol.scala:291) at org.apache.spark.util.JsonProtocol$.taskEndToJson(JsonProtocol.scala:145) at org.apache.spark.util.JsonProtocol$.sparkEventToJson(JsonProtocol.scala:76) at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:138) at org.apache.spark.scheduler.EventLoggingListener.onTaskEnd(EventLoggingListener.scala:158) at org.apache.spark.scheduler.SparkListenerBus$class.doPostEvent(SparkListenerBus.scala:45) at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37) at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37) at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:91) at org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$super$postToAll(AsyncEventQueue.scala:92) at org.apache.spark.scheduler.AsyncEventQueue$$anonfun$org$apache$spark$scheduler$AsyncEventQueue$$dispatch$1.apply$mcJ$sp(AsyncEventQueue.scala:92) at org.apache.spark.scheduler.AsyncEventQueue$$anonfun$org$apache$spark$scheduler$AsyncEventQueue$$dispatch$1.apply(AsyncEventQueue.scala:87) at org.apache.spark.scheduler.AsyncEventQueue$$anonfun$org$apache$spark$scheduler$AsyncEventQueue$$dispatch$1.apply(AsyncEventQueue.scala:87) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) at org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$dispatch(AsyncEventQueue.scala:87) at org.apache.spark.scheduler.AsyncEventQueue$$anon$1$$anonfun$run$1.apply$mcV$sp(AsyncEventQueue.scala:83) at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1302) at org.apache.spark.scheduler.AsyncEventQueue$$anon$1.run(AsyncEventQueue.scala:82) {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org