[jira] [Commented] (SPARK-28867) InMemoryStore checkpoint to speed up replay log file in HistoryServer

2019-10-13 Thread wuyi (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16950307#comment-16950307
 ] 

wuyi commented on SPARK-28867:
--

[~irashid] Thanks for linking this as a related issue to SPARK-20656.

Yeah, I agree that SPARK-28867 and SPARK-20656 are not quite the same. But 
they both want to implement incremental replay based on a snapshot 
mechanism (SPARK-20656 says it will have UI data stored on disk). Actually, I 
think SPARK-20656 is more similar to SPARK-28594, since both SPARK-20656 and 
SPARK-28594 plan to take the snapshot in the SHS, while SPARK-28867 plans to 
take the snapshot in AppStatusListener (driver side).

> InMemoryStore checkpoint to speed up replay log file in HistoryServer
> -
>
> Key: SPARK-28867
> URL: https://issues.apache.org/jira/browse/SPARK-28867
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 3.0.0
> Reporter: wuyi
> Priority: Major
>
> HistoryServer can be very slow the first time it replays a large log file, 
> and it always re-replays an in-progress log file after the file changes. We 
> could periodically checkpoint InMemoryStore to speed up log file replay.
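
The periodic checkpoint idea in the description above can be illustrated 
with a minimal sketch. This is not Spark code: `Checkpoint`, `apply_event` 
and `replay` are hypothetical names, the store is a plain dict standing in 
for InMemoryStore, and the event log is assumed to be one JSON object per 
line with an "Event" field. The point is only that a snapshot of the state 
plus the offset it corresponds to lets a later replay skip the prefix that 
was already processed.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    """Hypothetical snapshot: store contents plus how far into the log we got."""
    offset: int = 0                                # log lines already applied
    state: dict = field(default_factory=dict)      # stand-in for InMemoryStore


def apply_event(state, event):
    """Toy listener: count events per type (stands in for real UI state)."""
    name = event["Event"]
    state[name] = state.get(name, 0) + 1


def replay(lines, checkpoint=None, interval=2):
    """Replay JSON event lines, resuming from `checkpoint` if one is given.

    Returns the final state and the most recent checkpoint taken.
    """
    latest = checkpoint if checkpoint is not None else Checkpoint()
    state = dict(latest.state)        # copy so the checkpoint is not mutated
    for i in range(latest.offset, len(lines)):
        apply_event(state, json.loads(lines[i]))
        applied = i + 1
        if applied % interval == 0:   # periodic checkpoint
            latest = Checkpoint(offset=applied, state=dict(state))
    return state, latest
```

When an in-progress log grows, calling `replay(lines, checkpoint)` with the 
last checkpoint only parses the lines after `checkpoint.offset` instead of 
the whole file.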



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28867) InMemoryStore checkpoint to speed up replay log file in HistoryServer

2019-10-11 Thread Imran Rashid (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949767#comment-16949767
 ] 

Imran Rashid commented on SPARK-28867:
--

This is closely related to SPARK-20656.  It's not *quite* a duplicate, because 
that was about reparsing the logs for the same application within the same SHS 
instance, so the SHS still had its state stored in memory.  Here you're 
also talking about speeding up parsing of those files even when the SHS is 
restarted, which also requires some way to restore any state across SHS 
restarts.
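
As a rough illustration of what restoring state across SHS restarts 
involves, the sketch below writes a checkpoint to disk and reads it back. 
It is an assumption-laden example, not the SHS design: `save_checkpoint` 
and `load_checkpoint` are hypothetical helpers, the state is a JSON-friendly 
dict, and the file format is made up for the example.

```python
import json
import os
import tempfile


def save_checkpoint(path, offset, state):
    """Write the checkpoint to a temp file, then rename it into place,
    so a crash mid-write does not leave a partial file at `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"offset": offset, "state": state}, f)
    os.replace(tmp, path)             # rename within the same directory


def load_checkpoint(path):
    """Return (offset, state); fall back to a full replay if nothing usable
    is stored at `path`."""
    try:
        with open(path) as f:
            data = json.load(f)
        return data["offset"], data["state"]
    except (FileNotFoundError, json.JSONDecodeError, KeyError):
        return 0, {}
```

After a restart, a server following this scheme would call 
`load_checkpoint` first and replay the log only from the returned offset; 
an offset of 0 means no checkpoint was found and the whole file is parsed.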
