GitHub user tgravescs commented on the issue:
https://github.com/apache/spark/pull/19270
Right, so from my understanding
https://issues.apache.org/jira/browse/SPARK-20657 is proposing to not use the
REST API and to rely more on the backing store, which exists only for the
history server and not for a running application. Running applications already
have this data in memory anyway. Or are you proposing changing the REST API to
use the new backing store and the stage page to use the REST API?
If not, then you are splitting the UI pages to be more specific to running
vs. history, or at least having a different backend store to fetch them from.
If it's specific to the UI pages then it doesn't help the REST API. If it helps
the REST API, then why not use that for the web pages too?
Either way the data has to get to the web browser. I would think not using
the REST API would be more memory efficient if it can read directly from
existing objects (or the backing store) without having to create new objects
in the REST API to send out. Maybe that doesn't matter so much if we do more of
it server side and send out small bits at a time. The REST API would be a nice
way to abstract the backend out from the UI pages, but if it doesn't perform
well enough we shouldn't do it.
Either one is doable; I think we should just choose a direction.
I think the server side stuff makes a lot of sense; the question is whether we
want to do it now with this PR, or perhaps in a pre-JIRA to this one, or do
it later. To me the current pages are frustrating enough that I don't mind
doing it later, but it probably depends on how often you load one of the pages
with 100,000+ tasks. I don't think it's any slower than loading the history
server pages now (which I'm glad you are working on).
Let me know your thoughts.