[jira] [Commented] (SPARK-28867) InMemoryStore checkpoint to speed up replay log file in HistoryServer
    [ https://issues.apache.org/jira/browse/SPARK-28867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16950307#comment-16950307 ]

wuyi commented on SPARK-28867:
------------------------------

[~irashid] Thanks for linking this as a related issue to SPARK-20656.

Yeah, I agree that SPARK-28867 and SPARK-20656 are not quite the same. But both want to implement incremental replay based on a snapshot mechanism (SPARK-20656 says it will have UI data stored on disk).

Actually, I think SPARK-20656 is more similar to SPARK-28594: both SPARK-20656 and SPARK-28594 plan to take snapshots in the SHS, while SPARK-28867 plans to take snapshots in AppStatusListener (driver side).

> InMemoryStore checkpoint to speed up replay log file in HistoryServer
> ----------------------------------------------------------------------
>
>                 Key: SPARK-28867
>                 URL: https://issues.apache.org/jira/browse/SPARK-28867
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.0.0
>            Reporter: wuyi
>            Priority: Major
>
> The HistoryServer can be very slow to replay a large log file the first
> time, and it always re-replays an in-progress log file after it changes.
> We could periodically checkpoint the InMemoryStore to speed up log replay.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-28867) InMemoryStore checkpoint to speed up replay log file in HistoryServer
    [ https://issues.apache.org/jira/browse/SPARK-28867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949767#comment-16949767 ]

Imran Rashid commented on SPARK-28867:
--------------------------------------

This is closely related to SPARK-20656. It's not *quite* a duplicate, because that issue was about reparsing the logs for the same application within the same SHS instance, so the SHS still had whatever state it had stored in memory. Here you're also talking about speeding up parsing of those files even when the SHS is restarted, which additionally requires some way to restore state across SHS restarts.
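
To make the restart case concrete, here is a minimal Python sketch of checkpointed incremental replay. It is not Spark code: `replay`, `apply_event`, the checkpoint file layout, and the per-event-type counter standing in for the InMemoryStore are all hypothetical, and the log is assumed to be one JSON object per line with an "Event" field. The idea is only that the snapshot holds both the folded state and the byte offset it corresponds to, so a fresh process can resume from disk instead of reparsing from the start.

```python
import json
import os


def apply_event(state, event):
    """Fold one event into the state (hypothetical stand-in: count per event type)."""
    kind = event["Event"]
    state[kind] = state.get(kind, 0) + 1


def replay(log_path, checkpoint_path):
    """Replay log_path, resuming from checkpoint_path when it exists.

    Returns (state, number of events parsed during this call).
    """
    state, offset = {}, 0
    if os.path.exists(checkpoint_path):
        # Restore the snapshot written by an earlier run (possibly another process).
        with open(checkpoint_path) as f:
            saved = json.load(f)
        state, offset = saved["state"], saved["offset"]

    parsed = 0
    with open(log_path, "rb") as log:
        log.seek(offset)  # skip everything the snapshot already covers
        while True:
            line = log.readline()
            if not line.endswith(b"\n"):
                break  # end of file, or a partially written line: retry next time
            apply_event(state, json.loads(line))
            offset = log.tell()
            parsed += 1

    # Write the new snapshot to a temp file, then rename it into place, so a
    # crash mid-write never leaves a truncated checkpoint behind.
    tmp_path = checkpoint_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"state": state, "offset": offset}, f)
    os.replace(tmp_path, checkpoint_path)
    return state, parsed
```

Calling `replay` again after the log has grown parses only the appended lines. Because the state and offset live in the checkpoint file rather than in process memory, the same holds after a restart, which is the difference from the SPARK-20656 case described above.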