viirya edited a comment on pull request #31451:
URL: https://github.com/apache/spark/pull/31451#issuecomment-810554421


   > Thanks for the explanation. This sounds like a change from the API 
discussed in #31476. IIUC, before, the expectation was that 
`PartitionReader#currentMetricsValues()` is called after the partition is read. 
Now, the expectation is that `PartitionReader#currentMetricsValues()` is called 
for every row we iterate through in the reader. Such an expectation should be 
documented clearly in the API, for implementors of custom metrics.
   
   I don't see that we have documented in the API exactly when 
`currentMetricsValues` will be called. This is an implementation detail. If you 
are worried that an implementation of `currentMetricsValues` might do something 
time-consuming, we can add a note to the API advising against heavy logic in 
it. 
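
To illustrate what "no heavy logic" could look like for an implementor, here is a minimal sketch. It does not use Spark's classes: `TaskMetric` and `CountingReader` are hypothetical stand-ins, loosely modeled on the reader and metric interfaces discussed in this PR, and the metric names are made up. The idea is that counters are updated incrementally in `next()`, so `currentMetricsValues()` only snapshots them and stays cheap whether it is called once per partition or once per row.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT Spark's interfaces.
public class CheapMetricsSketch {

    // Simplified analogue of a task metric: a name plus a long value.
    interface TaskMetric {
        String name();
        long value();
    }

    // Simplified analogue of a partition reader over in-memory rows.
    static final class CountingReader {
        private final Iterator<String> rows;
        private String current;
        private long rowsRead = 0;   // maintained incrementally in next()
        private long charsRead = 0;  // maintained incrementally in next()

        CountingReader(List<String> rows) {
            this.rows = rows.iterator();
        }

        boolean next() {
            if (!rows.hasNext()) {
                return false;
            }
            current = rows.next();
            rowsRead++;
            charsRead += current.length();
            return true;
        }

        String get() {
            return current;
        }

        // Only snapshots counters that are already up to date: no I/O, no
        // scanning, no locking. The cost is one small array allocation.
        TaskMetric[] currentMetricsValues() {
            return new TaskMetric[] {
                metric("rowsRead", rowsRead),
                metric("charsRead", charsRead)
            };
        }

        private static TaskMetric metric(final String name, final long value) {
            return new TaskMetric() {
                @Override public String name() { return name; }
                @Override public long value() { return value; }
            };
        }
    }

    public static void main(String[] args) {
        CountingReader reader = new CountingReader(Arrays.asList("a", "bb", "ccc"));
        while (reader.next()) {
            // The caller is free to poll on every row, or only at the end.
            TaskMetric[] m = reader.currentMetricsValues();
            System.out.println(m[0].name() + "=" + m[0].value()
                + ", " + m[1].name() + "=" + m[1].value());
        }
    }
}
```

With this shape, the implementor does not need to know when the method is called, which is consistent with treating the call timing as an implementation detail.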
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



