Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/4435#issuecomment-77918134
Hi @squito,
I took a quick pass through this PR and design doc. I like the overall
approach here:
- Structuring the hierarchy around applications / jobs / stages / tasks is
a good design.
- Using our existing binary-compatibility checking tools to ensure JSON
compatibility is a good idea. The end-to-end "golden file" tests are good,
too, since this guards against breakage unexpected changes in the JSON
serialization / translation layer.
- Building explicit versioning into the API from day one is a good idea.
A couple of high-level questions:
- Does this work significantly differently in the HistoryServer versus a
live application UI?
- How do the JSON endpoints respond to invalid requests or requests for
stages / jobs / tasks whose information has been garbage-collected? We should
add a test to ensure that these endpoints return the right HTTP error codes.
- Will adding / changing the Jackson dependency create a dependency-hell
mess for us? Should we pre-emptively resort to shading or a different
packaging strategy?
General code review comments:
- I don't think that the public JSON response classes should be case
classes: the addition of new fields will break binary compatiblity (since
`apply / unapply` will have their signatures change, breaking users' ability to
pattern-match on these objects). We've already hit this problem with
SparkListenerEvent, so I think we should seek to avoid it here.
- Restrict the visibility of components as much as possible. For example,
`JsonRoute` and all of its subclasses should probably be `private[ui]`.
- Imports need to be re-organized in several files.
- The `WorkerJsonRoute.scala` file is empty.
- I haven't been following the mailing list thread about enumerations very
closely, but we should figure out whether the new `enum` is compatible with
that decision.
- The debugging `println`s should become log statements or be removed.
- Minor code style nit, but there are a bunch of style violations here that
aren't caught by our checker. In general, you need extra whitespace around
operators, braces, etc, e.g. `.map{case(k,v) => k` should be `.map { case (k,
v) => k`.
- I see unit tests for the individual components, but it would nice to add
some end-to-end tests which can also serve as usage examples. Take a look at
the tests in `UISeleniumSuite` for inspiration. It might be nice to add a test
which spins up a driver, runs some long-running tasks / jobs, and polls the
REST api to get their statuses.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]