[ https://issues.apache.org/jira/browse/SPARK-27264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mike Chan resolved SPARK-27264.
-------------------------------
    Resolution: Invalid

Before making a further attempt, I want to ensure that broadcast join is
enabled on the cluster. Thank you.

> spark sql released all executor but the job is not done
> -------------------------------------------------------
>
>                 Key: SPARK-27264
>                 URL: https://issues.apache.org/jira/browse/SPARK-27264
>             Project: Spark
>          Issue Type: Question
>          Components: SQL
>    Affects Versions: 2.4.0
>         Environment: Azure HDInsight Spark 2.4 on Azure Storage. SQL: read and
> join some data and finally write the result to a Hive metastore. The query is
> executed on JupyterHub, while the pre-migration cluster used Jupyter (non-hub).
>            Reporter: Mike Chan
>            Priority: Major
>
> I have a Spark SQL query that used to execute in under 10 minutes and now
> runs for 3 hours after a cluster migration, and I need to dig into what it is
> actually doing. I'm new to Spark, so please bear with me if I ask something
> unrelated.
> I increased spark.executor.memory, but with no luck.
> Env: Azure HDInsight Spark 2.4 on Azure Storage
> SQL: read and join some data and finally write the result to a Hive metastore
> The spark.sql call ends with the code below:
> .write.mode("overwrite").saveAsTable("default.mikemiketable")
> Application behavior: within the first 15 minutes, it loads the data and
> completes most tasks (199/200). After that, only 1 executor process stays
> alive and continues to shuffle read/write data. Because only 1 executor is
> left, we have to wait 3 hours for the application to finish.
> [!https://i.stack.imgur.com/6hqvh.png!|https://i.stack.imgur.com/6hqvh.png]
> Only 1 executor is left alive
> [!https://i.stack.imgur.com/55162.png!|https://i.stack.imgur.com/55162.png]
> I am not sure what the executor is doing:
> [!https://i.stack.imgur.com/TwhuX.png!|https://i.stack.imgur.com/TwhuX.png]
> From time to time, we can see that the shuffle read has increased:
> [!https://i.stack.imgur.com/WhF9A.png!|https://i.stack.imgur.com/WhF9A.png]
> Therefore I increased spark.executor.memory to 20g, but nothing changed.
> From Ambari and YARN I can tell that the cluster has plenty of resources left.
> [!https://i.stack.imgur.com/pngQA.png!|https://i.stack.imgur.com/pngQA.png]
> Release of almost all executors
> [!https://i.stack.imgur.com/pA134.png!|https://i.stack.imgur.com/pA134.png]
> Any guidance is greatly appreciated.
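
The resolution comment mentions checking whether broadcast join is enabled on the cluster. A minimal sketch of one way to do that from a notebook is shown here. It assumes a PySpark session named `spark` is available (as is usual in a Jupyter kernel); the helper name `broadcast_join_enabled` is hypothetical and not part of Spark. The setting `spark.sql.autoBroadcastJoinThreshold` is the documented Spark SQL option, and setting it to -1 disables automatic broadcast joins. Treating 0 as "off" as well is an assumption of this sketch.

```python
from typing import Optional

# Documented Spark SQL setting; -1 disables automatic broadcast joins.
CONF_KEY = "spark.sql.autoBroadcastJoinThreshold"


def broadcast_join_enabled(threshold: Optional[str]) -> bool:
    """Interpret the raw string value of the broadcast threshold.

    Hypothetical helper: returns False when the value is a number <= 0,
    True otherwise.
    """
    if threshold is None:
        # Nothing was passed in: assume Spark falls back to its built-in
        # default, which leaves automatic broadcast on.
        return True
    try:
        return int(threshold.strip()) > 0
    except ValueError:
        # Size strings such as "10MB" are not parsed here; a non-numeric
        # value is assumed to be a positive size.
        return True


# Usage inside a notebook with an active session (not executed here):
#   value = spark.conf.get(CONF_KEY)
#   print(CONF_KEY, "=", value, "enabled:", broadcast_join_enabled(value))
```

Even when the setting is enabled, Spark only broadcasts a table whose estimated size falls under the threshold, so it is worth confirming the chosen strategy by calling `explain()` on the joined DataFrame and looking for `BroadcastHashJoin` rather than `SortMergeJoin` in the physical plan.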