Mike Chan created SPARK-27264:
---------------------------------

             Summary: spark sql released all executor but the job is not done
                 Key: SPARK-27264
                 URL: https://issues.apache.org/jira/browse/SPARK-27264
             Project: Spark
          Issue Type: Question
          Components: SQL
    Affects Versions: 2.4.0
         Environment: Azure HDinsight spark 2.4 on Azure storage SQL: Read and 
Join some data and finally write result to a Hive metastore
            Reporter: Mike Chan


I have a Spark SQL job that used to finish in under 10 minutes but now takes 
about 3 hours after a cluster migration, and I need to dig into what it is 
actually doing. I'm new to Spark, so please bear with me if I'm asking 
something unrelated.

I increased spark.executor.memory, but it did not help. Environment: Azure 
HDInsight Spark 2.4 on Azure Storage. The SQL reads and joins some data, then 
writes the result to a Hive metastore.

The spark.sql call ends with the code below: 
.write.mode("overwrite").saveAsTable("default.mikemiketable")

Application behavior: within the first 15 minutes, it loads the data and 
completes most tasks (199/200). After that, only 1 executor process is left 
alive, and it continually shuffle reads / writes data. Because only 1 executor 
is left, we have to wait 3 hours for the application to finish. 
[!https://i.stack.imgur.com/6hqvh.png!|https://i.stack.imgur.com/6hqvh.png]
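
The 199/200 pattern described above is consistent with a skewed join key, 
although that is only an assumption and nothing here confirms it. The toy 
simulation below is a sketch in plain Python, not Spark's actual partitioner: 
it assigns rows to 200 shuffle partitions (the default value of 
spark.sql.shuffle.partitions) by hash modulo. The row counts and the hot key 
are hypothetical and only illustrate how one partition can end up holding 
almost all the data while the other 199 finish quickly.

```python
from collections import Counter

NUM_PARTITIONS = 200  # default of spark.sql.shuffle.partitions


def partition_sizes(keys, num_partitions=NUM_PARTITIONS):
    """Count rows per partition using a simple hash-mod scheme (toy model)."""
    sizes = Counter(hash(k) % num_partitions for k in keys)
    return [sizes.get(p, 0) for p in range(num_partitions)]


# Hypothetical data: 1,000,000 rows share one hot join key (here 0, standing
# in for something like a default value), while 199,000 rows have distinct keys.
hot_rows = [0] * 1_000_000
other_rows = list(range(1, 199_001))
sizes = partition_sizes(hot_rows + other_rows)

largest = max(sizes)
median = sorted(sizes)[len(sizes) // 2]
print(f"largest partition: {largest} rows, median partition: {median} rows")
```

In this toy model a single task would process roughly a thousand times more 
rows than any other, which would look like one long-running task after the 
rest have completed. Checking the per-key row counts of the real join inputs 
would be the way to see whether this applies.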

Only 1 executor is left alive: 
[!https://i.stack.imgur.com/55162.png!|https://i.stack.imgur.com/55162.png]

I'm not sure what the executor is doing: 
[!https://i.stack.imgur.com/TwhuX.png!|https://i.stack.imgur.com/TwhuX.png]

From time to time, we can see the shuffle read increase: 
[!https://i.stack.imgur.com/WhF9A.png!|https://i.stack.imgur.com/WhF9A.png]

Therefore I increased spark.executor.memory to 20g, but nothing changed. 
From Ambari and YARN I can tell the cluster has plenty of resources left: 
[!https://i.stack.imgur.com/pngQA.png!|https://i.stack.imgur.com/pngQA.png]

Release of almost all executors: 
[!https://i.stack.imgur.com/pA134.png!|https://i.stack.imgur.com/pA134.png]
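
One possible explanation for the released executors, assuming dynamic 
allocation is enabled on this cluster (not verified here), is that executors 
with no running tasks are removed after spark.dynamicAllocation.executorIdleTimeout, 
which defaults to 60 seconds. The sketch below is a toy model of that rule in 
plain Python, not Spark code; the executor ids and timestamps are hypothetical.

```python
IDLE_TIMEOUT_S = 60  # default of spark.dynamicAllocation.executorIdleTimeout


def released_executors(last_task_end_s, now_s, idle_timeout_s=IDLE_TIMEOUT_S):
    """Return ids of executors that have been idle longer than the timeout.

    last_task_end_s maps executor id -> second at which its last task ended,
    or None if the executor is still running a task.
    """
    return sorted(
        ex
        for ex, ended in last_task_end_s.items()
        if ended is not None and now_s - ended > idle_timeout_s
    )


# Hypothetical timeline: three executors finished their tasks within the
# first 15 minutes, and a fourth is still working on the last task.
executors = {"exec-1": 600, "exec-2": 720, "exec-3": 840, "exec-4": None}

print(released_executors(executors, now_s=900))
print(released_executors(executors, now_s=3600))
```

Under this model, idle executors being released while one task is still 
running would be expected behavior and not a cause of the slowdown itself.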

Any guidance is greatly appreciated.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

