snmvaughan commented on PR #46188: URL: https://github.com/apache/spark/pull/46188#issuecomment-2123350238
@cloud-fan Spark already collects information about the number of rows and bytes written, but only reports the total aggregate. If you're concerned about the overall size, it is bounded by the number of partitions, because the statistics are collected per partition rather than per file. The current V1 writers only know the path they are writing to, which is why I wanted to augment `newFile` with additional information.
