GitHub user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5473#discussion_r28906576
  
    --- Diff: streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingJobProgressListener.scala ---
    @@ -40,6 +43,8 @@ private[streaming] class StreamingJobProgressListener(ssc: StreamingContext)
       private var totalProcessedRecords = 0L
       private val receiverInfos = new HashMap[Int, ReceiverInfo]
     
    +  private val batchTimeToBatchUIData = new HashMap[Time, BatchUIData]
    --- End diff --
    
    This is good, but it is still not enough. The whole point of adding BatchUIData
    is to encapsulate BatchInfo and the outputOpId --> SparkJobId mapping in it, so
    that all of it can be cleaned up easily and we don't have to maintain separate
    data structures for BatchInfos (that is, waitingBatchInfos, runningBatchInfos,
    etc.) and for BatchUIData. So we should convert all the
    (waiting/running/completed)BatchInfos to the corresponding
    (waiting/running/completed)BatchUIData. Then we don't need this extra
    `batchTimeToBatchUIData` any more, and we don't need two methods, getBatchInfo
    and getBatchUIData; only the latter is needed.
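
A rough sketch of the shape being suggested, not the actual patch. `Time` and `BatchInfo` here are simplified stand-ins for the real Spark Streaming classes, and `ListenerSketch`, `withJob`, `onJobStart` and `retainedBatches` are hypothetical names used only for illustration. The idea is that each batch is tracked as a single BatchUIData that moves from waiting to running to completed, so there is one lookup method and no separate `batchTimeToBatchUIData` map to clean up.

```scala
import scala.collection.mutable

// Simplified stand-ins for the Spark Streaming types (assumptions, not the real classes).
final case class Time(millis: Long)
final case class BatchInfo(batchTime: Time, numRecords: Long)

// Assumed shape: one object per batch holding BatchInfo plus outputOpId -> Spark job ids.
final case class BatchUIData(
    batchInfo: BatchInfo,
    outputOpIdToSparkJobIds: Map[Int, Seq[Int]] = Map.empty[Int, Seq[Int]]) {

  def batchTime: Time = batchInfo.batchTime

  // Returns a copy with one more Spark job recorded for the given output operation.
  def withJob(outputOpId: Int, sparkJobId: Int): BatchUIData = {
    val jobs = outputOpIdToSparkJobIds.getOrElse(outputOpId, Seq.empty[Int]) :+ sparkJobId
    copy(outputOpIdToSparkJobIds = outputOpIdToSparkJobIds.updated(outputOpId, jobs))
  }
}

// Hypothetical listener: only BatchUIData collections, no parallel BatchInfo maps.
class ListenerSketch(retainedBatches: Int) {
  private val waitingBatchUIData = new mutable.HashMap[Time, BatchUIData]
  private val runningBatchUIData = new mutable.HashMap[Time, BatchUIData]
  private val completedBatchUIData = new mutable.Queue[BatchUIData]

  def onBatchSubmitted(info: BatchInfo): Unit = synchronized {
    waitingBatchUIData(info.batchTime) = BatchUIData(info)
  }

  def onBatchStarted(info: BatchInfo): Unit = synchronized {
    waitingBatchUIData.remove(info.batchTime)
    runningBatchUIData(info.batchTime) = BatchUIData(info)
  }

  // Records a Spark job against a running batch; ignored if the batch is unknown.
  def onJobStart(batchTime: Time, outputOpId: Int, sparkJobId: Int): Unit = synchronized {
    runningBatchUIData.get(batchTime).foreach { data =>
      runningBatchUIData(batchTime) = data.withJob(outputOpId, sparkJobId)
    }
  }

  def onBatchCompleted(info: BatchInfo): Unit = synchronized {
    waitingBatchUIData.remove(info.batchTime)
    val jobs = runningBatchUIData.remove(info.batchTime)
      .map(_.outputOpIdToSparkJobIds)
      .getOrElse(Map.empty[Int, Seq[Int]])
    completedBatchUIData.enqueue(BatchUIData(info, jobs))
    // Dropping the oldest entry discards its BatchInfo and its job ids together.
    while (completedBatchUIData.size > retainedBatches) {
      completedBatchUIData.dequeue()
    }
  }

  // Single lookup method, replacing the getBatchInfo / getBatchUIData pair.
  def getBatchUIData(batchTime: Time): Option[BatchUIData] = synchronized {
    waitingBatchUIData.get(batchTime)
      .orElse(runningBatchUIData.get(batchTime))
      .orElse(completedBatchUIData.find(_.batchTime == batchTime))
  }
}
```

With this layout, callers that previously used getBatchInfo can read `batchInfo` off the returned BatchUIData instead.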



