viirya edited a comment on pull request #31451: URL: https://github.com/apache/spark/pull/31451#issuecomment-810554421
> Thanks for the explanation. This sounds like a change from the API discussed in #31476. IIUC, before, the expectation was that `PartitionReader#currentMetricsValues()` is called after the partition is read. Now, the expectation is that `PartitionReader#currentMetricsValues()` is called for every row we iterate through in the reader. Such expectation should be documented clearly in the API, for implementors of custom metrics.

I don't see that we have documented in the API exactly when `currentMetricsValues` will be called. That is an implementation detail. If you are worried that an implementation of `currentMetricsValues` might do something time-consuming, we can add a note to the API suggesting that implementors avoid heavy logic in it.
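To illustrate what such a note could mean for implementors, here is a minimal sketch of a reader whose metric reporting stays cheap no matter how often it is polled. It does not depend on Spark: `TaskMetric` and `CountingReader` are hypothetical stand-ins that only loosely mirror the shape of `CustomTaskMetric` and `PartitionReader`, and the metric name `rowsRead` is likewise an assumption for illustration.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a task metric; not Spark's actual interface.
interface TaskMetric {
    String name();
    long value();
}

// Hypothetical reader, loosely modeled on the shape of PartitionReader.
final class CountingReader {
    private final Iterator<String> rows;
    private String current;
    private long rowsRead = 0; // maintained in next(), so reporting is O(1)

    CountingReader(List<String> rows) {
        this.rows = rows.iterator();
    }

    boolean next() {
        if (!rows.hasNext()) {
            return false;
        }
        current = rows.next();
        rowsRead++; // the bookkeeping happens here, once per row
        return true;
    }

    String get() {
        return current;
    }

    // Cheap by design: wraps a counter that is already up to date.
    // No I/O, no scanning, no locking, so calling it per row is harmless.
    TaskMetric[] currentMetricsValues() {
        final long snapshot = rowsRead;
        return new TaskMetric[] {
            new TaskMetric() {
                @Override public String name() { return "rowsRead"; }
                @Override public long value() { return snapshot; }
            }
        };
    }
}
```

With this shape, the caller is free to poll after every row or only once at the end of the partition; the implementation does not need to know which, which is why the call timing can remain an implementation detail.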
