Github user squito commented on a diff in the pull request:
https://github.com/apache/spark/pull/19582#discussion_r148901343
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -486,8 +580,21 @@ private[history] class FsHistoryProvider(conf:
SparkConf, clock: Clock)
bus.addListener(listener)
replay(fileStatus, isApplicationCompleted(fileStatus), bus,
eventsFilter)
- listener.applicationInfo.foreach(addListing)
- listing.write(LogInfo(logPath.toString(), fileStatus.getLen()))
+ listener.applicationInfo.foreach { app =>
+ // Invalidate the existing UI for the reloaded app attempt, if any.
Note that this does
+ // not remove the UI from the active list; that has to be done in
onUIDetached, so that
+ // cleanup of files can be done in a thread-safe manner. It does
mean the UI will remain
+ // in memory for longer than it should.
--- End diff --
it took me some time to figure how the cache & invalidation worked, mostly
because I wasn't looking in the right places. I don't think you've made this
any more confusing than it was before (in fact its probably better), but seems
like a good opportunity to improve commenting a little. I think it might help
to have one comment in the code where the entire sequence is described ( here
on `mergeApplicationListing`, or on `AppCache`, or
`ApplicationCacheCheckFilter`, doesn't really matter, but they could all
reference the longer comment). if I understand correctly, it would be
something like:
Logs of incomplete apps are regularly polled to see if they have been
updated (based on an increase in file size). If they have, the existing data
for that app is marked as invalid in LoadedAppUI. However, no memory is freed,
no files are cleaned up at this time, nor is a new UI built. On each request
for one app's UI, the application cache is checked to see if it has a valid
`LoadedAppUI` in the cache. If there is data in the cache and its valid, then
its served. If there is data in the cache but it is invalid, then the UI is
rebuilt from the raw event logs. If there is nothing in the cache, then the UI
is built from the raw event logs and added to the cache. This may kick another
entry out of the cache -- if its for an incomplete app, then any KVStore data
written to disk is deleted (as the KVStore for an incomplete app is always
regenerated from scratch anyway).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]