[
https://issues.apache.org/jira/browse/SPARK-18010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Owen updated SPARK-18010:
------------------------------
Fix Version/s: 2.0.3
> Remove unneeded heavy work performed by FsHistoryProvider for building up the
> application listing UI page
> ---------------------------------------------------------------------------------------------------------
>
> Key: SPARK-18010
> URL: https://issues.apache.org/jira/browse/SPARK-18010
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core, Web UI
> Affects Versions: 1.6.2, 2.0.1, 2.1.0
> Reporter: Vinayak Joshi
> Assignee: Vinayak Joshi
> Fix For: 2.0.3, 2.1.0
>
>
> There are known complaints that the History Server's application list does
> not update quickly enough when the event log files that need replay are huge.
> Currently, the FsHistoryProvider design causes the entire event log file to
> be replayed when building the initial application listing (see the method
> mergeApplicationListing(fileStatus: FileStatus)). The process of replay
> involves:
> - reading each line in the event log as a string,
> - parsing the string into a JSON structure,
> - converting the JSON to the corresponding Scala classes with nested
> structures.
> The parsing of each string to JSON and then to Scala classes is particularly
> expensive. Tests show that the majority of the time spent in replay goes to
> this work.
> When the replay is performed for building the application listing, the only
> two events that the code really cares about are
> "SparkListenerApplicationStart" and "SparkListenerApplicationEnd", since the
> only listener attached to the ReplayListenerBus at that point is the
> ApplicationEventListener. This means that when processing an event log file
> with a huge number of events (hundreds of thousands, possibly more), the work
> done to deserialize all of these events and then replay them is not needed.
> Since only two events are of interest, a replay performed for the purpose of
> building the application list can be restricted to these two events and skip
> all others.
> My tests show that this drastically improves application list load time. For
> a 150MB event log from a user, with over 100,000 events, the load time
> (running locally on my Mac) comes down from about 16 seconds to under 1
> second using this approach. For customers that typically run applications
> with large event logs, and thus have multiple large event logs present, this
> can considerably reduce the time the history server UI takes to list the
> apps.
> I will be updating a pull request with an attempt at fixing this.
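
The filtering idea described above can be sketched roughly as follows. This is a minimal illustration in Python, not Spark's Scala code and not the actual patch. It assumes each event log line is a JSON object whose "Event" field carries the event name; the function name replay_for_listing is hypothetical. The point is only that a cheap substring check lets most lines skip JSON parsing entirely.

```python
import json

# The two events the application listing needs, as named in the issue.
LISTING_EVENTS = ("SparkListenerApplicationStart", "SparkListenerApplicationEnd")


def replay_for_listing(lines):
    """Yield only the parsed events needed for the application listing.

    Hypothetical helper: assumes one JSON object per line with an "Event" key.
    """
    for line in lines:
        # Cheap substring test first, so that non-matching lines are never parsed.
        if not any(name in line for name in LISTING_EVENTS):
            continue
        event = json.loads(line)
        # The name could also appear inside another event's payload,
        # so confirm the event type after parsing.
        if event.get("Event") in LISTING_EVENTS:
            yield event
```

With a log of several hundred thousand lines, only the lines that mention one of the two event names reach json.loads; every other line costs a pair of substring scans.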