[jira] [Commented] (SPARK-30586) NPE in LiveRDDDistribution (AppStatusListener)
[ https://issues.apache.org/jira/browse/SPARK-30586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17036788#comment-17036788 ] Jungtaek Lim commented on SPARK-30586:
--

"onExecutorAdded" doesn't fill in hostPort on LiveExecutor, which appears to be null in the stack trace; "onBlockManagerAdded" does. That seems to show one possible cause. If we have the event log file for the application that encountered the issue, it would be easier to check what's happening there.

> NPE in LiveRDDDistribution (AppStatusListener)
> --
>
> Key: SPARK-30586
> URL: https://issues.apache.org/jira/browse/SPARK-30586
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.4.4
> Environment: A Hadoop cluster consisting of CentOS 7.4 machines.
> Reporter: Jan Van den bosch
> Priority: Major
>
> We've been noticing a great number of NullPointerExceptions in our long-running Spark job driver logs:
> {noformat}
> 20/01/17 23:40:12 ERROR AsyncEventQueue: Listener AppStatusListener threw an exception
> java.lang.NullPointerException
>   at org.spark_project.guava.base.Preconditions.checkNotNull(Preconditions.java:191)
>   at org.spark_project.guava.collect.MapMakerInternalMap.putIfAbsent(MapMakerInternalMap.java:3507)
>   at org.spark_project.guava.collect.Interners$WeakInterner.intern(Interners.java:85)
>   at org.apache.spark.status.LiveEntityHelpers$.weakIntern(LiveEntity.scala:603)
>   at org.apache.spark.status.LiveRDDDistribution.toApi(LiveEntity.scala:486)
>   at org.apache.spark.status.LiveRDD$$anonfun$2.apply(LiveEntity.scala:548)
>   at org.apache.spark.status.LiveRDD$$anonfun$2.apply(LiveEntity.scala:548)
>   at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at scala.collection.mutable.HashMap$$anon$2$$anonfun$foreach$3.apply(HashMap.scala:139)
>   at scala.collection.mutable.HashMap$$anon$2$$anonfun$foreach$3.apply(HashMap.scala:139)
>   at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
>   at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
>   at scala.collection.mutable.HashMap$$anon$2.foreach(HashMap.scala:139)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:104)
>   at org.apache.spark.status.LiveRDD.doUpdate(LiveEntity.scala:548)
>   at org.apache.spark.status.LiveEntity.write(LiveEntity.scala:49)
>   at org.apache.spark.status.AppStatusListener.org$apache$spark$status$AppStatusListener$$update(AppStatusListener.scala:991)
>   at org.apache.spark.status.AppStatusListener.org$apache$spark$status$AppStatusListener$$maybeUpdate(AppStatusListener.scala:997)
>   at org.apache.spark.status.AppStatusListener$$anonfun$onExecutorMetricsUpdate$2.apply(AppStatusListener.scala:764)
>   at org.apache.spark.status.AppStatusListener$$anonfun$onExecutorMetricsUpdate$2.apply(AppStatusListener.scala:764)
>   at scala.collection.mutable.HashMap$$anon$2$$anonfun$foreach$3.apply(HashMap.scala:139)
>   at scala.collection.mutable.HashMap$$anon$2$$anonfun$foreach$3.apply(HashMap.scala:139)
>   at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
>   at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
>   at scala.collection.mutable.HashMap$$anon$2.foreach(HashMap.scala:139)
>   at org.apache.spark.status.AppStatusListener.org$apache$spark$status$AppStatusListener$$flush(AppStatusListener.scala:788)
>   at org.apache.spark.status.AppStatusListener.onExecutorMetricsUpdate(AppStatusListener.scala:764)
>   at org.apache.spark.scheduler.SparkListenerBus$class.doPostEvent(SparkListenerBus.scala:59)
>   at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
>   at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
>   at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:91)
>   at org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$super$postToAll(AsyncEventQueue.scala:92)
>   at org.apache.spark.scheduler.AsyncEventQueue$$anonfun$org$apache$spark$scheduler$AsyncEventQueue$$dispatch$1.apply$mcJ$sp(AsyncEventQueue.scala:92)
>   at org.apache.spark.scheduler.AsyncEventQueue$$anonfun$org$apache$spark$scheduler$AsyncEventQueue$$dispatch$1.apply(AsyncEventQueue.scala:87)
>   at
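The event ordering suspected in the comment above can be sketched as follows. This is a hypothetical, simplified model: `LiveExec`, `npeOnNullHostPort`, and the local `weakIntern` are stand-ins I introduce for illustration, not the real Spark classes; the real `LiveEntityHelpers.weakIntern` delegates to a Guava weak interner, which rejects null the same way.

```scala
// Minimal model of the suspected ordering: onExecutorAdded leaves hostPort
// null, and only onBlockManagerAdded fills it in, so interning the value
// in between blows up. Names here are hypothetical stand-ins.
object HostPortRace {
  final class LiveExec { var hostPort: String = _ }  // null until set

  // Stand-in for LiveEntityHelpers.weakIntern: like Guava's
  // Interners.intern, it throws NullPointerException on null input.
  def weakIntern(s: String): String = {
    if (s == null) throw new NullPointerException("null passed to intern")
    s.intern()
  }

  // Simulates flushing stats after onExecutorAdded but before
  // onBlockManagerAdded: hostPort is still null, so interning fails.
  def npeOnNullHostPort(): Boolean = {
    val exec = new LiveExec
    try { weakIntern(exec.hostPort); false }
    catch { case _: NullPointerException => true }
  }

  // Once onBlockManagerAdded has set hostPort, interning succeeds.
  def okAfterBlockManagerAdded(): String = {
    val exec = new LiveExec
    exec.hostPort = "host1:7337"
    weakIntern(exec.hostPort)
  }
}
```

Under this model, any flush of the live RDD distributions that races ahead of the block-manager event would hit exactly the `checkNotNull` frame at the top of the trace.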
[jira] [Commented] (SPARK-30586) NPE in LiveRDDDistribution (AppStatusListener)
[ https://issues.apache.org/jira/browse/SPARK-30586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17036734#comment-17036734 ] Saisai Shao commented on SPARK-30586:
--

We also hit the same issue. It seems the code doesn't check whether the string is null and directly calls String intern, which makes Guava throw the NPE. My first thought is to add a null check in {{weakIntern}}. Still investigating how this could happen; it might be due to a lost or out-of-order Spark listener event.

> NPE in LiveRDDDistribution (AppStatusListener)
> --
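The null check proposed in the comment above could look like the following sketch. It is an assumption about the fix's shape, not the actual patch: the real `LiveEntityHelpers.weakIntern` uses a Guava weak interner, which this `ConcurrentHashMap` only approximates (it holds strong references).

```scala
import java.util.concurrent.ConcurrentHashMap

// Sketch of a null-tolerant weakIntern (hypothetical shape of the proposed
// fix). Instead of letting Guava's checkNotNull throw, a null input is
// passed through unchanged.
object SafeIntern {
  private val interner = new ConcurrentHashMap[String, String]()

  def weakIntern(s: String): String =
    if (s == null) null                    // tolerate a missing hostPort instead of NPE-ing
    else interner.computeIfAbsent(s, k => k)  // return the canonical instance
}
```

With a guard like this, a LiveExecutor whose hostPort has not yet been filled in would carry a null through to the API object rather than killing the AppStatusListener's event processing.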