squito commented on a change in pull request #25943:
[WIP][SPARK-29261][SQL][CORE] Support recover live entities from KVStore for
(SQL)AppStatusListener
URL: https://github.com/apache/spark/pull/25943#discussion_r334157041
##########
File path: core/src/main/scala/org/apache/spark/status/storeTypes.scala
##########
@@ -76,6 +109,29 @@ private[spark] class JobDataWrapper(
@JsonIgnore @KVIndex("completionTime")
private def completionTime: Long =
info.completionTime.map(_.getTime).getOrElse(-1L)
+
+ def toLiveJob: LiveJob = {
Review comment:
This is just a brainstorm, and not something I'm sure is the right approach
at all yet --
we could also consider just letting some details like this be lossy. But if
we were to go down that road, I'd like some mechanism to ensure that we only
accept lost info where it "probably" doesn't matter. E.g. we'd save metadata
in such a way that the SHS (Spark History Server) would know to replay currently running jobs from the
event log, even though some of that info had already been parsed and stored in
the snapshot. But say there was some really late speculative task completion,
from a job that had finished long ago -- that may not be represented at
full fidelity.
I feel like this would cover streaming pretty well. It would not work as
well for job-server style deployments ... but, really, I can't think of
anything which covers that very well.
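To make the idea a bit more concrete, here is a very rough sketch of the replay decision. Everything in it is hypothetical: `LoggedEvent`, `SnapshotMeta`, and `plan` are names made up for illustration, not anything in Spark or in this PR, and a real version would have to work on actual listener events rather than bare job ids.

```scala
object LossyReplaySketch {
  // Hypothetical stand-in for one event-log entry; only the job id matters here.
  final case class LoggedEvent(jobId: Int, description: String)

  // Hypothetical metadata saved alongside the snapshot: which jobs were still
  // running when it was written, and which had already finished.
  final case class SnapshotMeta(runningJobIds: Set[Int], completedJobIds: Set[Int])

  // Decide whether an event should be re-applied from the event log.
  def shouldReplay(meta: SnapshotMeta, event: LoggedEvent): Boolean =
    if (meta.runningJobIds.contains(event.jobId)) {
      true  // job was live at snapshot time: rebuild it from the log
    } else if (meta.completedJobIds.contains(event.jobId)) {
      false // job finished before the snapshot: trust the snapshot, accept the loss
    } else {
      true  // job unknown to the snapshot: it must have started afterwards
    }

  // Split the log into (events to replay, events knowingly dropped).
  def plan(meta: SnapshotMeta, log: Seq[LoggedEvent]): (Seq[LoggedEvent], Seq[LoggedEvent]) =
    log.partition(e => shouldReplay(meta, e))
}
```

With job 1 finished and job 2 running at snapshot time, a late speculative task completion for job 1 would land in the dropped half, while events for job 2 and for a later job 3 would be replayed. That is exactly the loss that would be accepted under this scheme.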
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services