viirya commented on code in PR #1108:
URL: https://github.com/apache/datafusion-comet/pull/1108#discussion_r1852759262
##########
docs/source/user-guide/tuning.md:
##########
@@ -105,8 +105,15 @@ then any shuffle operations that cannot be supported in this mode will fall back
 
 ## Metrics
 
-Comet metrics are not directly comparable to Spark metrics in some cases.
+Some Comet metrics are not directly comparable to Spark metrics in some cases.
 
 `CometScanExec` uses nanoseconds for total scan time. Spark also measures scan time in nanoseconds but converts to
-milliseconds _per batch_ which can result in a large loss of precision. In one case we saw total scan time
-of 41 seconds reported as 23 seconds for example.
+milliseconds _per batch_ which can result in a large loss of precision.
+
+Comet also adds some custom metrics:
+
+### ShuffleWriterExec
+
+| Metric           | Description                                                                                                                                                                               |
+| ---------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `jvm_fetch_time` | Measure the time it takes for `ShuffleWriterExec` to fetch an existing batch from the JVM. Note that this does not include the execution time of the query that produced the input batch. |

Review Comment:
   Looks like the new metric measures the time spent fetching all batches, not just a single batch?

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

For additional commands, e-mail: github-h...@datafusion.apache.org