rdblue commented on issue #27021: [SPARK-30362][Core] Update InputMetrics in DataSourceRDD
URL: https://github.com/apache/spark/pull/27021#issuecomment-578958347

+1 for this change, with a fix to avoid exposing the new helper classes.

To address @cloud-fan's objection, this solution records the amount of data read by Hadoop file systems. We can always expose an additional way for v2 sources to return a size metric if the bytes read by those sources do not go through the Hadoop FileSystem API, but there are many cases that do use the file system API and at least those are supported by this change.

Thanks @sandeep-katta!
