Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20251#discussion_r161296230
  
    --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/AllJobsPage.scala ---
    @@ -429,20 +429,40 @@ private[ui] class JobDataSource(
         val formattedDuration = duration.map(d => 
UIUtils.formatDuration(d)).getOrElse("Unknown")
         val submissionTime = jobData.submissionTime
         val formattedSubmissionTime = 
submissionTime.map(UIUtils.formatDate).getOrElse("Unknown")
    -    val jobDescription = 
UIUtils.makeDescription(jobData.description.getOrElse(""),
    -      basePath, plainText = false)
    +
    +    val lastStageAttempt = {
    +      val stageAttempts = jobData.stageIds.flatMap(store.stageData(_))
    --- End diff --
    
    Your code at L436 only needs the last stage for this job, so loading all stages here seems unnecessary.
    
    I guess it's not guaranteed that the attempt that ran for this particular job is the last attempt of that particular stage. But the logic you have here, as far as I can see, will return the *first* attempt of the stage with the highest id:
    
    ```scala
    scala> case class Foo(id: Int, str: String)
    defined class Foo
    
    scala> Seq(Foo(3, "one"), Foo(2, "two"), Foo(3, "three")).maxBy(_.id)
    res1: Foo = Foo(3,one)
    ```
    
    Given that you're only interested in the description anyway, and that it won't change between attempts, either logic is fine. Loading only the last attempt is just a little cheaper.
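    
    A minimal, self-contained sketch of the distinction (using a hypothetical `StageAttempt` case class, not the real `StageData` API): ordering on the `(stageId, attemptId)` pair picks the latest attempt, whereas `maxBy` on the stage id alone returns the first element with the maximum id.
    
    ```scala
    // Hypothetical stand-in for stage attempt data; field names are illustrative.
    case class StageAttempt(stageId: Int, attemptId: Int, description: String)
    
    val attempts = Seq(
      StageAttempt(3, 0, "one"),
      StageAttempt(2, 0, "two"),
      StageAttempt(3, 1, "three")
    )
    
    // maxBy returns the FIRST element with the maximum key,
    // so keying on stageId alone yields attempt 0 of stage 3:
    val firstOfMax = attempts.maxBy(_.stageId)
    // firstOfMax == StageAttempt(3, 0, "one")
    
    // Keying on the (stageId, attemptId) tuple (lexicographic Ordering)
    // yields the last attempt of the highest stage:
    val lastOfMax = attempts.maxBy(a => (a.stageId, a.attemptId))
    // lastOfMax == StageAttempt(3, 1, "three")
    ```
    
    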

