GitHub user tdas commented on the pull request:
https://github.com/apache/spark/pull/5473#issuecomment-97198401
High-level points, as discussed offline:
We are planning to increase the number of retainedBatches, so we need to
be careful about all the data that we retain in memory. It's not a good idea
to retain `BatchInfo` objects in memory, because each one contains many
`ReceivedBlockInfo` objects, which can carry arbitrary metadata ( #5732 ). A
better design would be to retain only `BatchUIData` objects, which contain only
the data necessary for rendering the UI (timings, number of records, etc.). So
`waitingBatches`, `runningBatches`, and `completedBatches` should hold only
`BatchUIData` objects.
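
To make the idea concrete, here is a rough, language-neutral sketch in Python (not the actual Spark code, which is Scala). The field names, the `from_batch_info` helper, and the `BatchListener` class are all assumptions made for illustration. Only the names `BatchInfo`, `ReceivedBlockInfo`, `BatchUIData`, and the three collections come from the discussion above. The point is that the listener copies the few values the UI needs and drops its reference to the heavy object.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Deque, Dict, List, Optional


@dataclass
class ReceivedBlockInfo:
    stream_id: int
    num_records: int
    metadata: Any = None  # arbitrary, possibly large (hypothetical field)


@dataclass
class BatchInfo:
    """Heavy object: keeps every ReceivedBlockInfo alive (hypothetical layout)."""
    batch_time: int
    received_blocks: Dict[int, List[ReceivedBlockInfo]]
    submission_time: int
    processing_start_time: Optional[int] = None
    processing_end_time: Optional[int] = None


@dataclass(frozen=True)
class BatchUIData:
    """Light object: only what the UI renders (timings, record count)."""
    batch_time: int
    num_records: int
    submission_time: int
    processing_start_time: Optional[int]
    processing_end_time: Optional[int]

    @staticmethod
    def from_batch_info(info: BatchInfo) -> "BatchUIData":
        # Collapse the block list into a single count; no block is referenced.
        num_records = sum(
            block.num_records
            for blocks in info.received_blocks.values()
            for block in blocks
        )
        return BatchUIData(
            info.batch_time,
            num_records,
            info.submission_time,
            info.processing_start_time,
            info.processing_end_time,
        )


class BatchListener:
    """Hypothetical listener that stores BatchUIData only, never BatchInfo."""

    def __init__(self, retained_batches: int) -> None:
        self.waiting_batches: Dict[int, BatchUIData] = {}
        self.running_batches: Dict[int, BatchUIData] = {}
        # Bounded: the oldest completed batch is evicted automatically.
        self.completed_batches: Deque[BatchUIData] = deque(maxlen=retained_batches)

    def on_batch_submitted(self, info: BatchInfo) -> None:
        self.waiting_batches[info.batch_time] = BatchUIData.from_batch_info(info)

    def on_batch_started(self, info: BatchInfo) -> None:
        self.waiting_batches.pop(info.batch_time, None)
        self.running_batches[info.batch_time] = BatchUIData.from_batch_info(info)

    def on_batch_completed(self, info: BatchInfo) -> None:
        self.waiting_batches.pop(info.batch_time, None)
        self.running_batches.pop(info.batch_time, None)
        self.completed_batches.append(BatchUIData.from_batch_info(info))
```

With this shape, raising the retained-batch limit costs a handful of numbers per batch, regardless of how much metadata the receivers attach to their blocks.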