Vinayak Joshi created SPARK-18010:
-------------------------------------

             Summary: Remove unneeded heavy work performed by FsHistoryProvider 
for building up the application listing UI page
                 Key: SPARK-18010
                 URL: https://issues.apache.org/jira/browse/SPARK-18010
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core, Web UI
    Affects Versions: 2.0.1, 1.6.2, 2.1.0
            Reporter: Vinayak Joshi


There are known complaints that the History Server's application list does not 
update quickly enough when the event log files that need replay are huge. 
Currently, the FsHistoryProvider design causes the entire event log file to be 
replayed when building the initial application listing (see the method 
mergeApplicationListing(fileStatus: FileStatus)). The process of replay 
involves:
 - reading each line in the event log as a string,
 - parsing the string into a JSON structure,
 - converting the JSON into the corresponding Scala classes with nested 
   structures.

Parsing each string into JSON and then converting it into Scala classes is 
particularly expensive. Tests show that the majority of the time spent in 
replay goes to this work. 
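
As a rough illustration of the three replay steps listed above, here is a 
minimal Python sketch. It is not Spark's code (which is Scala): the Event 
class, the replay_all function, and the simplified sample lines are all 
hypothetical stand-ins, assuming only that each log line is a JSON object with 
an "Event" field.

```python
import json
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical stand-in for Spark's typed listener event classes."""
    name: str
    payload: dict


def replay_all(lines):
    """Full replay: every line is parsed and converted, wanted or not."""
    events = []
    for line in lines:                            # step 1: line read as a string
        obj = json.loads(line)                    # step 2: string -> JSON structure
        events.append(Event(obj["Event"], obj))   # step 3: JSON -> typed object
    return events


# Simplified, made-up log lines (real event logs carry many more fields).
sample_log = [
    '{"Event": "SparkListenerApplicationStart", "App Name": "demo"}',
    '{"Event": "SparkListenerTaskEnd", "Task Info": {"Task ID": 1}}',
    '{"Event": "SparkListenerApplicationEnd", "Timestamp": 2}',
]
all_events = replay_all(sample_log)
```

In this sketch, steps 2 and 3 run for every line, which is the cost the 
paragraph above describes.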

When the replay is performed for building the application listing, the only two 
events that the code really cares about are "SparkListenerApplicationStart" and 
"SparkListenerApplicationEnd", since the only listener attached to the 
ReplayListenerBus at that point is the ApplicationEventListener. This means 
that when processing an event log file with a huge number of events (hundreds 
of thousands, possibly more), the work done to deserialize all of these events 
and then replay them is not needed. Since only these two events are of 
interest, replay performed for the purpose of building the application list 
can be restricted to these two events and skip all others. 
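
One possible way to restrict replay to the two events is sketched below in 
Python. This is an assumption about how the selection could be done, not a 
description of the actual change: the cheap substring check before parsing, 
the replay_for_listing function, and the sample lines are all hypothetical.

```python
import json

# The two event names the application listing needs (taken from the text above).
WANTED = ("SparkListenerApplicationStart", "SparkListenerApplicationEnd")


def replay_for_listing(lines):
    """Parse only lines that mention a wanted event name; skip the rest unparsed."""
    kept = []
    for line in lines:
        # Cheap substring test first, so most lines never reach the JSON parser.
        if not any(name in line for name in WANTED):
            continue
        obj = json.loads(line)
        # Re-check after parsing: the name could occur inside another event's payload.
        if obj.get("Event") in WANTED:
            kept.append(obj)
    return kept


# Simplified, made-up log lines; the third mentions a wanted name only in its payload.
listing_sample = [
    '{"Event": "SparkListenerApplicationStart", "App Name": "demo"}',
    '{"Event": "SparkListenerTaskEnd", "Task Info": {"Task ID": 1}}',
    '{"Event": "SparkListenerTaskEnd", "Note": "SparkListenerApplicationEnd"}',
    '{"Event": "SparkListenerApplicationEnd", "Timestamp": 2}',
]
listing_events = replay_for_listing(listing_sample)
```

The second check matters because a substring match alone cannot tell an event's 
name apart from the same text appearing elsewhere in the line.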

My tests show that this drastically improves application list load time. For a 
150MB event log from a user, with over 100,000 events, the load time (measured 
locally on my Mac) comes down from about 16 seconds to under 1 second using 
this approach. For customers that typically run applications with large event 
logs, and thus have multiple large event logs present, this can considerably 
shorten the time the history server UI takes to list the apps.

I will follow up with a pull request that attempts to fix this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

