Github user mwws commented on a diff in the pull request:
https://github.com/apache/spark/pull/13259#discussion_r64457578
--- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobScheduler.scala ---
@@ -193,6 +197,14 @@ class JobScheduler(val ssc: StreamingContext) extends Logging {
           listenerBus.post(StreamingListenerOutputOperationCompleted(job.toOutputOperationInfo))
           logInfo("Finished job " + job.id + " from job set of time " + jobSet.time)
           if (jobSet.hasCompleted) {
+            // submit fake BatchCompleted event to show missing inputInfo on Streaming UI
+            inputInfoMissedTimes.foreach (time => {
+              val streamIdToInputInfos = inputInfoTracker.getInfo(time)
+              val fakeJobSet = JobSet(time, Seq(), streamIdToInputInfos)
+              listenerBus.post(StreamingListenerBatchCompleted(fakeJobSet.toBatchInfo))
--- End diff --
Good point, it would be a breaking change in that case. Given that some
information is indeed missing now, we need to either send more events or add
additional fields to the current event. The latter might be better and more
semantically correct, although it can still break existing implementations of
users' custom listeners. I also noticed that the listener interface is annotated
as DeveloperApi, so it is expected that the API might change. What do you think?
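
To make the trade-off of the second option concrete, here is a rough sketch. It uses simplified, hypothetical stand-ins (`InputInfo`, `BatchCompleted`, `missedInputInfo`, `ListenerCompat`), not Spark's actual classes or fields. It assumes the extra field is added with a default value: callers and listeners that access fields by name keep compiling, while listeners that destructure the event positionally have to be updated. Binary compatibility is not preserved either way, since the generated `apply` and `copy` signatures change.

```scala
// Hypothetical, simplified stand-ins for illustration; NOT Spark's real classes.
final case class InputInfo(streamId: Int, numRecords: Long)

// The existing event gains one extra field with a default value.
final case class BatchCompleted(
    batchTime: Long,
    numOutputOps: Int,
    missedInputInfo: Map[Long, Seq[InputInfo]] = Map.empty)

object ListenerCompat {
  // A call site written against the old two-field shape still compiles,
  // because the new field has a default.
  val oldStyle: BatchCompleted = BatchCompleted(1000L, 2)

  // A listener that reads fields by name is unaffected by the new field.
  def describe(e: BatchCompleted): String =
    s"batch ${e.batchTime}: ${e.numOutputOps} ops, ${e.missedInputInfo.size} missed batches"

  // A listener that destructures positionally must list every field.
  // The old two-argument pattern `case BatchCompleted(t, n)` would no longer compile.
  def opsCount(e: BatchCompleted): Int = e match {
    case BatchCompleted(_, n, _) => n
  }
}
```

So the source-level breakage would be limited to listeners that pattern match on the event's full shape, which seems acceptable for an interface marked as a developer API.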