shahidki31 opened a new pull request #24171: [SPARK-27204]SHS: Background caching for completed applications, when the disk store enabled
URL: https://github.com/apache/spark/pull/24171

## What changes were proposed in this pull request?

Loading an application UI from the History Server page for the first time is slow when the event log is large (e.g. 47 minutes for an 18 GB event log file). Most of that time is spent replaying the event log.

This PR proposes background caching for the disk store: the history provider replays the event log once the application completes and creates a store cache. When a user then accesses the application, the provider either serves the UI directly from the cache or waits until caching has completed. In either case, the loading time is less than or equal to the current one.

## How was this patch tested?

Manually tested, plus existing UTs.

Event log size: 1.7 GB

**Before:** First-time UI loading time = 391 sec

**After:** First-time UI loading time = 6 sec (best case: already cached), 344 sec (worst case: immediately after background caching started)
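The "serve from cache or wait for the in-flight replay" behaviour described above can be illustrated with a minimal, self-contained Java sketch. This is not the code in this PR and does not use the History Server's own classes: `BackgroundStoreCache`, `onApplicationCompleted`, `getStore`, and the `replay` function are hypothetical names chosen for illustration, and the store type `S` stands in for whatever the provider builds from an event log.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

/** Hypothetical sketch of background caching; not the classes used by Spark. */
final class BackgroundStoreCache<S> {
    // One future per application: either a finished store or a replay in progress.
    private final ConcurrentHashMap<String, CompletableFuture<S>> stores =
        new ConcurrentHashMap<>();
    private final ExecutorService replayPool;
    // Assumed stand-in for "replay the event log and build a store".
    private final Function<String, S> replay;

    BackgroundStoreCache(int threads, Function<String, S> replay) {
        this.replayPool = Executors.newFixedThreadPool(threads, r -> {
            Thread t = new Thread(r, "event-log-replay");
            t.setDaemon(true); // do not keep the JVM alive for background replays
            return t;
        });
        this.replay = replay;
    }

    /** Called when an application is seen as completed: start the replay early. */
    void onApplicationCompleted(String appId) {
        startOrGet(appId);
    }

    /**
     * Called when a user opens the application UI. Returns immediately if the
     * store is cached, otherwise blocks until the in-flight replay finishes.
     */
    S getStore(String appId) {
        return startOrGet(appId).join();
    }

    // computeIfAbsent ensures at most one replay is started per application.
    private CompletableFuture<S> startOrGet(String appId) {
        return stores.computeIfAbsent(appId,
            id -> CompletableFuture.supplyAsync(() -> replay.apply(id), replayPool));
    }

    void shutdown() {
        replayPool.shutdown();
    }
}
```

Because a UI request joins the same future that the background replay started, the wait is never longer than a full replay, which matches the "less than or equal to the current one" claim. The sketch leaves out eviction, disk-usage limits, and retrying a failed replay, all of which a real provider would need to handle.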
