gene-db commented on code in PR #48172:
URL: https://github.com/apache/spark/pull/48172#discussion_r1793952556
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileScanRDD.scala:
##########
@@ -72,9 +74,30 @@ case class PartitionedFile(
   }
 }
 
+/**
+ * Class used to store statistical data that is collected during a file scan and could be used to
+ * update the SQL metrics of the scan node. More members could be added to this class to to collect
+ * metrics related to new features.
+ */
+case class FileScanMetrics(
+    topLevelVariantMetrics: Option[VariantMetrics] = None,

Review Comment:
   I don't understand how `FileScanMetrics` and `topLevelVariantMetrics`/`nestedVariantMetrics` are used. It looks like there is only one caller that uses `new FileScanMetrics()`, and that caller always provides both of these variant metrics. When would we provide `None` for both or either of them? Do you foresee more callers that would change the option values?
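
   To make the question concrete, here is a rough sketch of the two construction patterns that the `Option` defaults allow. This is only an illustration, not the PR's code: `VariantMetrics` here is a hypothetical stand-in with a single made-up `variantCount` field, the type of `nestedVariantMetrics` is assumed to match `topLevelVariantMetrics`, and `totalVariants` is a helper invented for the example.

```scala
// Hypothetical stand-in: the real VariantMetrics fields are not visible in this hunk.
final case class VariantMetrics(variantCount: Long = 0L)

// Same shape as the class under review; the type of nestedVariantMetrics is an assumption.
final case class FileScanMetrics(
    topLevelVariantMetrics: Option[VariantMetrics] = None,
    nestedVariantMetrics: Option[VariantMetrics] = None)

object FileScanMetricsSketch {
  // Hypothetical consumer: sums the counts of whichever metrics are present.
  def totalVariants(m: FileScanMetrics): Long =
    (m.topLevelVariantMetrics.toSeq ++ m.nestedVariantMetrics.toSeq)
      .map(_.variantCount)
      .sum

  def main(args: Array[String]): Unit = {
    // Pattern I could find in the PR: both metrics are always supplied.
    val both = FileScanMetrics(Some(VariantMetrics(3L)), Some(VariantMetrics(2L)))
    // Pattern the defaults permit, but which no caller appears to use.
    val neither = FileScanMetrics()

    assert(totalVariants(both) == 5L)
    assert(totalVariants(neither) == 0L)
  }
}
```

   If the second pattern never occurs, plain non-optional members might be simpler. If it is expected to occur, it would help to say when in the class comment.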
