[ https://issues.apache.org/jira/browse/SPARK-32615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Leanken.Lin updated SPARK-32615:
--------------------------------
    Description: 
{code:java}
// Reproduce Step
sql/test-only org.apache.spark.sql.execution.adaptive.AdaptiveQueryExecSuite -- -z "SPARK-32573: Eliminate NAAJ when BuildSide is EmptyHashedRelationWithAllNullKeys"
{code}
{code:java}
// Error Message
14:40:44.089 ERROR org.apache.spark.util.Utils: Uncaught exception in thread element-tracking-store-worker
14:40:44.089 ERROR org.apache.spark.util.Utils: Uncaught exception in thread element-tracking-store-worker
java.util.NoSuchElementException: key not found: 12
	at scala.collection.immutable.Map$Map1.apply(Map.scala:114)
	at org.apache.spark.sql.execution.ui.SQLAppStatusListener.$anonfun$aggregateMetrics$11(SQLAppStatusListener.scala:257)
	at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
	at scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149)
	at scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237)
	at scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230)
	at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:44)
	at scala.collection.mutable.HashMap.foreach(HashMap.scala:149)
	at scala.collection.TraversableLike.map(TraversableLike.scala:238)
	at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
	at scala.collection.AbstractTraversable.map(Traversable.scala:108)
	at org.apache.spark.sql.execution.ui.SQLAppStatusListener.aggregateMetrics(SQLAppStatusListener.scala:256)
	at org.apache.spark.sql.execution.ui.SQLAppStatusListener.$anonfun$onExecutionEnd$2(SQLAppStatusListener.scala:365)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at org.apache.spark.util.Utils$.tryLog(Utils.scala:1971)
	at org.apache.spark.status.ElementTrackingStore$$anon$1.run(ElementTrackingStore.scala:117)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
[info] - SPARK-32573: Eliminate NAAJ when BuildSide is EmptyHashedRelationWithAllNullKeys (2 seconds, 14 milliseconds)
{code}
This issue occurs because, during AQE, the recorded metrics are overwritten when a sub-plan changes. For example, in this UT the plan changes from BroadcastHashJoinExec to LocalTableScanExec. In the onExecutionEnd action, the listener tries to aggregate all metrics collected during the execution, including the old ones, which causes a NoSuchElementException because the metricsType has already been updated for the rewritten plan. So we need to filter out those outdated metrics.

  was:
{code:java}
// Reproduce Step
sql/test-only org.apache.spark.sql.execution.adaptive.AdaptiveQueryExecSuite -- -z "SPARK-32573: Eliminate NAAJ when BuildSide is EmptyHashedRelationWithAllNullKeys"
{code}
{code:java}
// Error Message
14:40:44.089 ERROR org.apache.spark.util.Utils: Uncaught exception in thread element-tracking-store-worker
14:40:44.089 ERROR org.apache.spark.util.Utils: Uncaught exception in thread element-tracking-store-worker
java.util.NoSuchElementException: key not found: 12
	at scala.collection.immutable.Map$Map1.apply(Map.scala:114)
	at org.apache.spark.sql.execution.ui.SQLAppStatusListener.$anonfun$aggregateMetrics$11(SQLAppStatusListener.scala:257)
	at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
	at scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149)
	at scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237)
	at scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230)
	at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:44)
	at scala.collection.mutable.HashMap.foreach(HashMap.scala:149)
	at scala.collection.TraversableLike.map(TraversableLike.scala:238)
	at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
	at scala.collection.AbstractTraversable.map(Traversable.scala:108)
	at org.apache.spark.sql.execution.ui.SQLAppStatusListener.aggregateMetrics(SQLAppStatusListener.scala:256)
	at org.apache.spark.sql.execution.ui.SQLAppStatusListener.$anonfun$onExecutionEnd$2(SQLAppStatusListener.scala:365)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at org.apache.spark.util.Utils$.tryLog(Utils.scala:1971)
	at org.apache.spark.status.ElementTrackingStore$$anon$1.run(ElementTrackingStore.scala:117)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
[info] - SPARK-32573: Eliminate NAAJ when BuildSide is EmptyHashedRelationWithAllNullKeys (2 seconds, 14 milliseconds)
{code}
This issue occurs because, during AQE, the recorded metrics are overwritten when a sub-plan changes. For example, in this UT the plan changes from BroadcastHashJoinExec to LocalTableScanExec. In the onExecutionEnd action, the listener tries to aggregate all metrics collected during the execution, which causes a NoSuchElementException.


> Fix AQE aggregateMetrics java.util.NoSuchElementException
> ---------------------------------------------------------
>
>                 Key: SPARK-32615
>                 URL: https://issues.apache.org/jira/browse/SPARK-32615
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Leanken.Lin
>            Priority: Minor
>
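
The failure mode in the description (looking up a metric type with Map.apply for an ID that belongs to a replaced sub-plan) and the proposed filtering can be illustrated with a small standalone sketch. This is not Spark's SQLAppStatusListener code: the object name, the method names, the sample IDs other than 12, and the values are all hypothetical, chosen only to show why Map.apply throws and how filtering out stale IDs avoids it.

```scala
// Standalone sketch; names and data are hypothetical and do not mirror Spark's listener.
object StaleMetricsSketch {
  // Metric types known for the rewritten plan: metric ID -> metric type.
  val metricTypes: Map[Long, String] = Map(20L -> "sum")

  // Values reported during the execution; ID 12 comes from the replaced sub-plan.
  val reported: Seq[(Long, Long)] = Seq(12L -> 5L, 20L -> 7L, 20L -> 3L)

  // Unsafe: metricTypes(id) is Map.apply, which throws
  // java.util.NoSuchElementException for the stale ID 12.
  def aggregateUnsafe(): Map[Long, String] =
    reported.groupBy(_._1).map { case (id, values) =>
      id -> s"${metricTypes(id)}=${values.map(_._2).sum}"
    }

  // Safer: drop values whose ID is no longer in metricTypes, then aggregate.
  def aggregateFiltered(): Map[Long, String] =
    reported
      .filter { case (id, _) => metricTypes.contains(id) }
      .groupBy(_._1)
      .map { case (id, values) =>
        id -> s"${metricTypes(id)}=${values.map(_._2).sum}"
      }
}
```

Calling StaleMetricsSketch.aggregateUnsafe() fails with "key not found: 12", while aggregateFiltered() aggregates only the metrics that still have a known type. Whether the actual fix filters before aggregation or handles the missing key differently is not stated in this message.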