Thomas Graves created SPARK-30831:
-------------------------------------

             Summary: Executors UI shows more active tasks than possible
                 Key: SPARK-30831
                 URL: https://issues.apache.org/jira/browse/SPARK-30831
             Project: Spark
          Issue Type: Bug
          Components: Web UI
    Affects Versions: 3.0.0
            Reporter: Thomas Graves


I regularly see the Executors page of the web UI showing more active tasks for 
an executor than it has cores. Looking at the code, it seems that we track the 
two separately, and the message sent for task end is asynchronous, so it shows 
up at the UI much later than the start event of the next task.

On statusUpdate, CoarseGrainedSchedulerBackend increases freeCores, which 
then allows the scheduler to assign another task right away, but the 
taskEndEvent is delivered asynchronously.
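
To make that ordering concrete, here is a toy model of the behavior described above. It is not Spark code: ToyScheduler, ToyExecutorUi, and every method name are made up for this sketch, and the only assumption carried over from Spark is that the core is freed immediately while the end event reaches the UI later.

```scala
import scala.collection.mutable

sealed trait UiEvent
final case class TaskStart(taskId: Int) extends UiEvent
final case class TaskEnd(taskId: Int) extends UiEvent

// Hypothetical stand-in for the UI side: it only knows what events it has seen.
final class ToyExecutorUi {
  private var active = 0
  def onEvent(e: UiEvent): Unit = e match {
    case TaskStart(_) => active += 1
    case TaskEnd(_)   => active -= 1
  }
  def activeTasks: Int = active
}

// Hypothetical stand-in for the scheduling side: tracks free cores on its own.
final class ToyScheduler(cores: Int, ui: ToyExecutorUi) {
  private var freeCores = cores
  private var nextTaskId = 0
  private val pendingEnds = mutable.Queue.empty[TaskEnd]

  // Launch a task if a core is free; the start event is visible to the UI at once.
  def launchIfPossible(): Option[Int] =
    if (freeCores > 0) {
      freeCores -= 1
      val id = nextTaskId
      nextTaskId += 1
      ui.onEvent(TaskStart(id))
      Some(id)
    } else None

  // The core is freed immediately, but the end event is only queued for later.
  def statusUpdateFinished(taskId: Int): Unit = {
    freeCores += 1
    pendingEnds.enqueue(TaskEnd(taskId))
  }

  // Models the delayed, asynchronous delivery of the queued end events.
  def deliverPendingEnds(): Unit =
    while (pendingEnds.nonEmpty) ui.onEvent(pendingEnds.dequeue())
}

object ActiveTaskLagDemo {
  // Returns (active tasks seen before the end event arrives, after it arrives).
  def run(): (Int, Int) = {
    val ui = new ToyExecutorUi
    val sched = new ToyScheduler(cores = 1, ui)
    val first = sched.launchIfPossible().get
    sched.statusUpdateFinished(first) // frees the single core
    sched.launchIfPossible()          // second task starts on the freed core
    val during = ui.activeTasks
    sched.deliverPendingEnds()
    (during, ui.activeTasks)
  }

  def main(args: Array[String]): Unit = {
    val (during, after) = run()
    println(s"active tasks before end event arrives: $during, after: $after")
  }
}
```

With a single core, the toy UI counts two active tasks until the queued end event is delivered, which matches what the Executors page shows.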

We definitely don't want to slow down the scheduling path, so I'm not sure how 
easy this will be to improve.

 

To reproduce, I just ran the following in spark-shell:

val df = sc.makeRDD(1 to 10000000, 6).toDF
val df2 = sc.makeRDD(1 to 10000000, 6).toDF

spark.time(
  df.select($"value" as "a")
    .join(df2.select($"value" as "b"), $"a" === $"b")
    .write.mode("overwrite").csv("somefile"))

 

Then view the Executors page of the UI. I started spark-shell with just 1 core 
per executor, yet you see 2 active tasks.


