Ngone51 commented on issue #25577: [WIP][CORE][SPARK-28867] InMemoryStore checkpoint to speed up replay log file in HistoryServer
URL: https://github.com/apache/spark/pull/25577#issuecomment-525979016

> AppStatusListener has code to "flush" live entities to the kvstore, so you should never need to snapshot any live data.

That's true, but the KVStore still lacks some of the information needed to recover live entities, because live entities don't always write all of their fields out. For example, `LiveStage` doesn't write out the field `savedTasks`, yet that field is used to decide whether we should clean up old tasks according to the configured task retention threshold in AppStatusListener. Of course, we could start writing that field out, but that would break api.v1.StageData (I'm not sure what standard we have for this API data; maybe @vanzin has a suggestion?) and introduce an unnecessary field into UI-related data.

Also, LiveStage doesn't write out `completedIndices`, `activeTasksPerExecutor`, or `blackListedExecutors` (this seems like a defect? The SHS can't yet show blacklist info about executors, though we have it in the Live UI). Other entities have similar problems.

Some fields could be recovered indirectly, and some are not strictly needed. But fields like `savedTasks` are still necessary. I think if we could write all of those fields out regardless, then recovering live entities wouldn't be a big problem.
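
To make the `savedTasks` point concrete, here is a minimal sketch of the failure mode. It is a simplified model in Python, not Spark's actual code: the class, the functions `on_task_end`, `needs_cleanup`, and `recover`, the dict-based store, and the threshold value are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the configured task retention threshold.
MAX_TASKS_PER_STAGE = 3


@dataclass
class LiveStage:
    """Simplified, hypothetical model of a live entity (not Spark's real class)."""
    stage_id: int
    num_complete_tasks: int = 0
    saved_tasks: int = 0  # live-only counter; never written to the store

    def to_stored(self) -> dict:
        # Only the API-facing fields are written out; saved_tasks is omitted.
        return {
            "stage_id": self.stage_id,
            "num_complete_tasks": self.num_complete_tasks,
        }


def on_task_end(stage: LiveStage, store: dict) -> None:
    """Update the live entity and flush its stored view."""
    stage.num_complete_tasks += 1
    stage.saved_tasks += 1
    store[stage.stage_id] = stage.to_stored()


def needs_cleanup(stage: LiveStage) -> bool:
    """The cleanup decision depends on the live-only counter."""
    return stage.saved_tasks > MAX_TASKS_PER_STAGE


def recover(stored: dict) -> LiveStage:
    """Rebuild a live entity from the store; saved_tasks falls back to 0."""
    return LiveStage(
        stage_id=stored["stage_id"],
        num_complete_tasks=stored["num_complete_tasks"],
    )


store = {}
live = LiveStage(stage_id=1)
for _ in range(5):
    on_task_end(live, store)

recovered = recover(store[1])
print(needs_cleanup(live), needs_cleanup(recovered))
```

In this model the original entity is over the threshold while the recovered one is not, because the counter was never persisted. Writing the field out, or recomputing it by counting the stored tasks for the stage, would close the gap; which of those is acceptable depends on the API compatibility question raised above.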
