Ngone51 commented on issue #25577: [WIP][CORE][SPARK-28867] InMemoryStore checkpoint to speed up replay log file in HistoryServer
URL: https://github.com/apache/spark/pull/25577#issuecomment-542189755

@squito We are not ignoring snapshots in the SHS. As you can see, @HeartSaVioR and I are currently working together on SPARK-28594, and snapshotting in the SHS is a necessary part of that work.

Although snapshotting in the driver can be more complicated than doing it in the SHS, we may still want to give it a try. IIUC, SPARK-28594 and SPARK-20656 only help with replaying **incomplete** event log files (though SPARK-28594 has more objectives than that). For a **completed** event log file, they don't reduce replay time at all, even with a snapshot. Snapshotting in the driver is therefore aimed at optimizing replay of **completed** event log files, which is what SPARK-28867 is trying to do. Of course, it would also speed up replay of incomplete event log files.

Our plan is to address SPARK-28594 first and then move on to SPARK-28867. As you know, the latter involves more complications, so it still requires a new, detailed design and discussion in the community.
