GitHub user vanzin opened a pull request:
https://github.com/apache/spark/pull/19681
[SPARK-20652][sql] Store SQL UI data in the new app status store.
This change replaces the SQLListener with a new implementation that
saves its data to the same underlying store used by the SparkContext's
status store. To make that possible, the types used by the old
SQLListener were changed to be more serialization-friendly.
The interface for getting data from the store was abstracted into
a new class, SQLAppStatusStore (following the convention used in
core).
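The split between the listener (write side) and SQLAppStatusStore
(read side) can be pictured with a minimal sketch. It is written in
Python purely for illustration and is not the Scala code in this PR.
Every name other than SQLAppStatusStore is hypothetical, and a plain
in-memory dict stands in for the shared store.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ExecutionData:
    """Hypothetical flat record for one SQL execution (easy to serialize)."""
    execution_id: int
    description: str
    completed: bool = False
    job_ids: List[int] = field(default_factory=list)


class InMemoryStore:
    """Stand-in for the shared store; keys are (kind, id) pairs (assumption)."""

    def __init__(self) -> None:
        self._data: Dict[Tuple[str, int], object] = {}

    def write(self, key: Tuple[str, int], value: object) -> None:
        self._data[key] = value

    def read(self, key: Tuple[str, int]) -> Optional[object]:
        return self._data.get(key)

    def values(self, kind: str) -> List[object]:
        return [v for (k, _), v in self._data.items() if k == kind]


class SQLStatusListener:
    """Write side (hypothetical name): turns events into stored records."""

    def __init__(self, store: InMemoryStore) -> None:
        self._store = store

    def on_execution_start(self, execution_id: int, description: str) -> None:
        record = ExecutionData(execution_id, description)
        self._store.write(("execution", execution_id), record)

    def on_execution_end(self, execution_id: int) -> None:
        record = self._store.read(("execution", execution_id))
        if record is not None:
            record.completed = True
            self._store.write(("execution", execution_id), record)


class SQLAppStatusStore:
    """Read side: the only interface the UI uses to fetch SQL data."""

    def __init__(self, store: InMemoryStore) -> None:
        self._store = store

    def executions_list(self) -> List[ExecutionData]:
        return sorted(self._store.values("execution"),
                      key=lambda e: e.execution_id)

    def execution(self, execution_id: int) -> Optional[ExecutionData]:
        return self._store.read(("execution", execution_id))
```

Because the UI only talks to the read-side class, the same pages can be
served from an in-memory store or a disk-backed one without changes.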
Another change is the way the SQL UI hooks into the core UI or the
Spark History Server (SHS). The old "SparkHistoryListenerFactory" was
replaced with a new "AppStatePlugin" that more explicitly separates
the two use cases: processing events and showing the UI. Both live
apps and the SHS use this new API (previously, the old factory was
used only by the SHS).
Note on the above: this causes a slight change of behavior for
live apps; the SQL tab will only show up after the first execution
is started.
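The two use cases the plugin separates, and the resulting tab behavior,
might look roughly like the sketch below. This is an illustration in
Python, not the API added by this PR: the method names, the event
shape, and the dict-based store are all assumptions.

```python
from typing import Callable, Dict, List


class AppStatePlugin:
    """Hypothetical plugin shape: one hook per use case."""

    def create_listeners(self, store: Dict) -> List[Callable[[Dict], None]]:
        """Event processing: return callbacks that write into the store."""
        raise NotImplementedError

    def setup_ui(self, tabs: List[str], store: Dict) -> None:
        """UI: attach tabs depending on what the store contains."""
        raise NotImplementedError


class SQLPlugin(AppStatePlugin):
    def create_listeners(self, store):
        def on_event(event):
            # Assumed event shape: {"type": ..., "id": ...}
            if event.get("type") == "sql_execution_start":
                store.setdefault("executions", []).append(event["id"])
        return [on_event]

    def setup_ui(self, tabs, store):
        # The SQL tab is attached only once an execution has been recorded,
        # which mirrors the behavior change described for live apps.
        if store.get("executions") and "SQL" not in tabs:
            tabs.append("SQL")
```

With an empty store, setup_ui leaves the tab list untouched; after the
first start event is delivered to the listener, the next call adds the
"SQL" tab exactly once.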
The metrics gathering code was reworked so that the types it uses
consume less memory and are more serialization-friendly. This
reduces memory usage with in-memory stores and reduces load
times with disk stores.
Tested with existing and newly added unit tests. One unit test is
disabled because it depends on SPARK-20653, which has not been merged yet.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/vanzin/spark SPARK-20652
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19681.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19681
----
commit ccd5adc1d6273b92fd6c9a0d4817451a5acb566a
Author: Marcelo Vanzin
Date: 2017-04-06T17:00:25Z
[SPARK-20652][sql] Store SQL UI data in the new app status store.
----