Ngone51 commented on issue #25577: [WIP][CORE][SPARK-28867] InMemoryStore 
checkpoint to speed up replay log file in HistoryServer
URL: https://github.com/apache/spark/pull/25577#issuecomment-542189755
 
 
   @squito We aren't ignoring snapshots in the SHS. As you can see, @HeartSaVioR 
and I are currently working together on SPARK-28594, and snapshotting in the SHS 
is a necessary part of SPARK-28594. 
   
   And although snapshotting in the driver can be more complicated than doing it 
in the SHS, we may still want to give it a try. IIUC, SPARK-28594 and 
SPARK-20656 only help with replaying **incomplete** event log files (though 
SPARK-28594 has more objectives than this). For a **completed** event log file, 
they don't reduce replay time at all (even with snapshots). Snapshotting in the 
driver, by contrast, is aimed at optimizing replay of **completed** event log 
files, and this is what SPARK-28867 is trying to do. Of course, snapshotting in 
the driver would also speed up replay of incomplete event log files.
   
   Our plan now is to address SPARK-28594 first. After that, we'll move on to 
SPARK-28867. As you know, it involves more complications, so it still needs a 
new, detailed design and discussion in the community.

