Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/3009#issuecomment-63554720
> although there are possibly some race conditions here with the StageInfo
> being modified by the DAGScheduler concurrently with a listener using it

I think this is actually okay: most of `StageInfo`'s fields are immutable
and the DAGScheduler already exposes its StageInfos to listeners through the
`SparkListenerStageSubmitted` event. Passing the StageInfos in the
`onJobSubmitted` event would let me fix another weird UI anomaly on the job
details page where the UI knows about the existence of a stage but doesn't know
its name / call site, etc. until that stage starts.

This change (or one where we just store the number of tasks) would cause some
trickiness in the JSONProtocol backwards-compatibility support, but I don't
think that should be too hard to address.
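
To make the backwards-compatibility concern concrete, here is a minimal, language-neutral sketch (written in Python, not Spark's actual JSONProtocol code) of reading a job-start event from both older logs, which lack the new field, and newer logs, which include it. The function name `job_start_from_json`, the JSON key names, and the placeholder values are all assumptions made for illustration.

```python
import json

# Hypothetical event shape and key names, for illustration only; this is
# not Spark's actual JSONProtocol code or event-log schema.
def job_start_from_json(text):
    """Parse a job-start event, tolerating logs written before the new field existed."""
    obj = json.loads(text)
    stage_ids = obj["Stage IDs"]
    # Newer logs carry full stage infos; older logs only list stage IDs.
    raw_infos = obj.get("Stage Infos")
    if raw_infos is None:
        # Fall back to placeholder entries so callers always see one shape.
        stage_infos = [
            {"Stage ID": sid, "Stage Name": "unknown", "Number of Tasks": 0}
            for sid in stage_ids
        ]
    else:
        stage_infos = raw_infos
    return {"Job ID": obj["Job ID"], "Stage IDs": stage_ids, "Stage Infos": stage_infos}


# An event written by an older version (no stage infos) still parses.
old_event = job_start_from_json('{"Job ID": 1, "Stage IDs": [3, 4]}')

# An event written by a newer version keeps the richer information.
new_event = job_start_from_json(
    '{"Job ID": 2, "Stage IDs": [5], "Stage Infos": '
    '[{"Stage ID": 5, "Stage Name": "count", "Number of Tasks": 8}]}'
)
```

The design choice here is to treat the new field as optional on read and synthesize placeholders from the stage IDs, so consumers such as a UI never have to branch on the log version.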