Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/7770#issuecomment-127378665
Hey Imran,
So I think there is a larger design discussion at hand that we can probably
break out into a mailing list thread or maybe discuss offline: as we add more
metrics, what internal mechanisms do we use, and how do we present them to
the user (these could be decoupled as concerns, actually)? We had been going
in the direction of moving some of these to accumulators to avoid some of the
clunkiness of the TaskMetrics approach and to avoid duplicate mechanisms for
the same things (we discussed along these lines even when doing updates to
TaskMetrics way back with @sryza).
However, an open question is how this will affect the user-facing story
regarding consumption. For instance, we could have public constants for the
accumulator keys we plan to support and give them standard names, if we want
them to be caught by MiMa and more formally supported.
I would recommend forking a new thread on the dev list to discuss the
trade-offs. I don't think the current patch, as it stands, represents a
long-term commitment in any direction. We could always add these to the
TaskMetrics struct later on if we wanted to send them back to users that
way. We could also have all the TaskMetrics use accumulators internally and
still support the current user-facing struct classes.
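
To make the last two ideas a bit more concrete, here is a rough sketch of
what "public constants for accumulator keys" plus "a user-facing struct backed
by accumulators" could look like. It is plain Python rather than Spark code,
and every name in it (MetricKeys, Accumulators, TaskMetricsView, snapshot, and
the two metric names) is hypothetical and made up for illustration; it is not
the existing API and not a proposal for exact naming.

```python
# Hypothetical sketch only: none of these names are Spark APIs.
from dataclasses import dataclass
from typing import Dict


class MetricKeys:
    """Public, stable key names that a compatibility checker could track."""
    RECORDS_READ = "recordsRead"      # assumed metric name
    BYTES_WRITTEN = "bytesWritten"    # assumed metric name


class Accumulators:
    """Internal mechanism: one running total per named key."""

    def __init__(self) -> None:
        self._values: Dict[str, int] = {}

    def add(self, key: str, delta: int) -> None:
        # Sum updates for the same key; unseen keys start at zero.
        self._values[key] = self._values.get(key, 0) + delta

    def get(self, key: str) -> int:
        return self._values.get(key, 0)


@dataclass(frozen=True)
class TaskMetricsView:
    """User-facing struct, filled in from the accumulators on demand."""
    records_read: int
    bytes_written: int


def snapshot(acc: Accumulators) -> TaskMetricsView:
    # The struct is only a read-only view; accumulators hold the real state.
    return TaskMetricsView(
        records_read=acc.get(MetricKeys.RECORDS_READ),
        bytes_written=acc.get(MetricKeys.BYTES_WRITTEN),
    )
```

The point of the split is that the internal mechanism and the presentation
stay decoupled: new metrics only need a new key, and the struct can keep its
current shape for users who already depend on it.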