GitHub user vanzin opened a pull request:
https://github.com/apache/spark/pull/19698
[SPARK-20648][core] Port JobsTab and StageTab to the new UI backend.
This change is a little larger because there's a whole lot of logic
behind these pages, all really tied to internal types and listeners,
and some of that logic had to be implemented in the new listener and
the needed data exposed through the API types.
- Added missing StageData and ExecutorStageSummary fields which are
used by the UI. Some json golden files needed to be updated to account
for new fields.
- Save RDD graph data in the store. This reuses existing types as much as
possible so that the code doesn't need to be rewritten, which means the
stored representation is probably not optimal.
- Some old classes (e.g. JobProgressListener) still remain, since they're used
in other parts of the code; they're not used by the UI anymore, though, and
will be cleaned up in a separate change.
- Save information about active pools in the store. This data is not really used
in the Spark History Server (SHS), but it's not a lot of data, so it's still
recorded when replaying applications.
- Because the new store sorts things slightly differently from the previous
code, some json golden files had some elements within them shuffled
around.
- The retention unit test in UISeleniumSuite was disabled because the code
to throw away old stages / tasks hasn't been added yet.
- The job description field in the API tries to follow the old behavior, which
leaves it empty most of the time, even though there's information to fill it
in. For stages, a new field was added to hold the description (which is
basically the job description), so that the UI can be rendered in the old way.
- A new stage status ("SKIPPED") was added to account for the fact that the API
couldn't represent that state before. Without this, the stage would show up as
"PENDING" in the UI, which is now based on API types.
- The API used to expose "executorRunTime" as the value of the task's duration,
which wasn't correct (and was redundant, since that value is already available
from the metrics object). This change fixes that by storing the correct
duration, which also means a few expectation files needed to be updated to
account for the new durations and for sorting differences caused by the
changed values.
- Added changes to implement SPARK-20713 and SPARK-21922 in the new code.
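
To illustrate the task duration item in the list above, here is a minimal Python sketch of the distinction between a task's duration and its "executorRunTime" metric. The `TaskTimes` record, its field names, and the definition of duration as finish time minus launch time are assumptions made for illustration; they are not taken from the change itself.

```python
from dataclasses import dataclass


@dataclass
class TaskTimes:
    # Hypothetical record; field names are illustrative, not Spark's API types.
    launch_time_ms: int        # when the task was launched
    finish_time_ms: int        # when the task finished
    executor_run_time_ms: int  # metric: time the executor spent running the task


def task_duration_ms(task: TaskTimes) -> int:
    # Assumption: "duration" means wall-clock time from launch to finish,
    # which also covers overhead that the executor run time metric excludes.
    return task.finish_time_ms - task.launch_time_ms


task = TaskTimes(launch_time_ms=1_000, finish_time_ms=1_500, executor_run_time_ms=420)
print(task_duration_ms(task))     # duration derived from the timestamps
print(task.executor_run_time_ms)  # value the API previously reported as duration
```

Under these assumptions the two values differ, which is consistent with expectation files changing once the stored duration no longer mirrors the metric.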
Tested with existing unit tests (and by using the UI a lot).
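
As a rough illustration of why the "SKIPPED" status in the list above matters, the Python sketch below marks stages that are still pending when their job ends as skipped. The `finalize_stage_statuses` helper and the rule it applies are hypothetical and only approximate the idea; the status names other than "PENDING" and "SKIPPED" are assumed.

```python
from enum import Enum
from typing import Dict


class StageStatus(Enum):
    # "PENDING" and "SKIPPED" come from the description; the rest are assumed.
    ACTIVE = "ACTIVE"
    COMPLETE = "COMPLETE"
    FAILED = "FAILED"
    PENDING = "PENDING"
    SKIPPED = "SKIPPED"


def finalize_stage_statuses(statuses: Dict[int, StageStatus]) -> Dict[int, StageStatus]:
    # Hypothetical rule: once a job has ended, a stage that is still PENDING
    # never ran, so report it as SKIPPED instead of leaving it PENDING.
    return {
        stage_id: StageStatus.SKIPPED if status is StageStatus.PENDING else status
        for stage_id, status in statuses.items()
    }


job_stages = {0: StageStatus.COMPLETE, 1: StageStatus.PENDING, 2: StageStatus.FAILED}
for stage_id, status in finalize_stage_statuses(job_stages).items():
    print(stage_id, status.value)
```

Without a distinct status, a consumer of the API types could not tell a stage that will never run from one that is merely waiting.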
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/vanzin/spark SPARK-20648
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19698.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19698
----
commit a22c45889d8fc0982caf4325eb729048537872bb
Author: Marcelo Vanzin <[email protected]>
Date: 2017-01-31T21:31:55Z
[SPARK-20648][core] Port JobsTab and StageTab to the new UI backend.
----