thinkharderdev commented on code in PR #5898:
URL: https://github.com/apache/arrow-datafusion/pull/5898#discussion_r1163904273
##########
datafusion/core/src/physical_plan/file_format/file_stream.rs:
##########
@@ -140,14 +141,32 @@ impl StartableTime {
}
}
+/// Metrics for [`FileStream`]
+///
+/// Note that all of these metrics are in terms of wall clock time
+/// (not cpu time) so they include time spent waiting on I/O as well
+/// as other operators.
struct FileStreamMetrics {
- /// Time elapsed for file opening
+ /// Wall clock time elapsed for file opening.
+ ///
+ /// Time between when [`FileReader::open`] is called and when the
+ /// [`FileStream`] receives a stream for reading.
Review Comment:
```suggestion
/// Wall clock time elapsed for file opening.
///
/// Time between when [`FileReader::open`] is called and when the
/// [`FileStream`] receives a stream for reading.
///
/// If multiple files are being scanned, the stream opens the next
/// file in the background while scanning the current file. This
/// metric only captures time spent opening while not also scanning.
```
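To illustrate what the suggested wording means in practice, here is a minimal sketch using only the standard library. It is not the `FileStream` implementation: `OpenTimer` is a hypothetical, simplified stand-in for the `StartableTime` helper in the diff, `attributed_open_time` is an invented name, and the timeline is made up. It assumes the timer runs only while the stream is blocked waiting for an open to finish, so an open that completes in the background during a scan adds nothing to the metric.

```rust
use std::time::{Duration, Instant};

/// Hypothetical, simplified stand-in for the `StartableTime` helper in the diff.
/// Timestamps are passed in explicitly so the example is deterministic.
#[derive(Default)]
struct OpenTimer {
    total: Duration,
    started_at: Option<Instant>,
}

impl OpenTimer {
    /// Start timing, unless the timer is already running.
    fn start(&mut self, now: Instant) {
        if self.started_at.is_none() {
            self.started_at = Some(now);
        }
    }

    /// Stop timing and add the elapsed interval to the total.
    fn stop(&mut self, now: Instant) {
        if let Some(start) = self.started_at.take() {
            self.total += now.duration_since(start);
        }
    }
}

/// Replays an invented three-file timeline and returns the time that
/// would be attributed to opening under the assumption stated above.
fn attributed_open_time() -> Duration {
    let t0 = Instant::now();
    let ms = |n: u64| t0 + Duration::from_millis(n);
    let mut time_opening = OpenTimer::default();

    // File 1: nothing to scan yet, so the stream waits for the whole
    // open (0..30 ms) and all of it is counted.
    time_opening.start(ms(0));
    time_opening.stop(ms(30));

    // File 2: opened in the background (30..50 ms) while file 1 is
    // scanned (30..100 ms). The open is already done when the scan
    // finishes, so the timer never runs and nothing is counted.

    // File 3: its open starts in the background at 100 ms and finishes
    // at 150 ms, but the scan of file 2 ends at 120 ms. Only the part
    // where the stream is waiting and not scanning (120..150 ms) counts.
    time_opening.start(ms(120));
    time_opening.stop(ms(150));

    time_opening.total
}
```

In this timeline the three opens take 30 + 20 + 50 = 100 ms of wall clock time in total, but only 30 + 0 + 30 = 60 ms would show up in the metric, which is the distinction the extra doc paragraph is trying to call out.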
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.