Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/1056#issuecomment-49970751
I don't entirely understand the advantage of having a separate
PartialTaskMetrics. Ultimately every field of TaskMetrics, except maybe
shuffleFinishTime, will be updatable incrementally. Any time the driver is
dealing with a TaskMetrics, the context makes it pretty clear whether it's
full or partial. We already have code that touches the TaskMetrics from
multiple threads; the ShuffleReadMetrics, for example, gets updated this way.
I think you're right that I've missed some race conditions. We probably need
to clone the metrics in a synchronized block before sending them off, but I'm
not convinced that a superclass makes these races any easier to fix.
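
As a rough sketch of what "clone in a synchronized block" could look like:
the classes below (SketchMetrics, Snapshot) and their fields are hypothetical
stand-ins written in plain Java, not Spark's actual TaskMetrics or
ShuffleReadMetrics. The assumption is that writers and the sender share one
lock, so a copy taken mid-task is internally consistent.

```java
import java.util.ArrayList;
import java.util.List;

public class MetricsSnapshotSketch {

    /** Hypothetical stand-in for a metrics object; NOT Spark's TaskMetrics. */
    static final class SketchMetrics {
        private final Object lock = new Object();
        private long remoteBytesRead;
        private long recordsRead;

        /** Called from worker threads; both fields change under one lock. */
        void recordFetch(long bytes, long records) {
            synchronized (lock) {
                remoteBytesRead += bytes;
                recordsRead += records;
            }
        }

        /** Copies the fields under the same lock, so the copy is consistent. */
        Snapshot snapshot() {
            synchronized (lock) {
                return new Snapshot(remoteBytesRead, recordsRead);
            }
        }
    }

    /** Immutable copy that is safe to hand to another thread or serialize. */
    static final class Snapshot {
        final long remoteBytesRead;
        final long recordsRead;

        Snapshot(long remoteBytesRead, long recordsRead) {
            this.remoteBytesRead = remoteBytesRead;
            this.recordsRead = recordsRead;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SketchMetrics metrics = new SketchMetrics();
        List<Thread> writers = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            Thread t = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    metrics.recordFetch(100, 1);
                }
            });
            writers.add(t);
            t.start();
        }

        // Partial snapshot taken while writers may still be running.
        Snapshot partial = metrics.snapshot();
        for (Thread t : writers) {
            t.join();
        }
        // Full snapshot taken after every writer has finished.
        Snapshot full = metrics.snapshot();

        // Each fetch adds 100 bytes per record, so this invariant holds
        // for any snapshot taken under the lock.
        System.out.println(partial.remoteBytesRead == 100 * partial.recordsRead);
        System.out.println(full.recordsRead);
    }
}
```

The same snapshot() call serves both the partial and the full case, which is
the point above: whether a copy is partial or full comes from when it was
taken, not from its type.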