alamb opened a new pull request #908: URL: https://github.com/apache/arrow-datafusion/pull/908
# Which issue does this PR close? Closes https://github.com/apache/arrow-datafusion/issues/679. See https://github.com/apache/arrow-datafusion/pull/901 for an earlier version with feedback from @tustvold and @andygrove # Rationale for this change See the description on https://github.com/apache/arrow-datafusion/issues/679#issue-937015511 for the full rationale, but the TLDR version is: 1. Better align the metric data model with industry best practice to ease integration in other metric systems (e.g. prometheus, influxdb, etc) 2. Ability to get per-partition metrics 3. Ability to get current metric values *during* execution without allocation # What changes are included in this PR? 1. Update the `SQLMetric` API to be in its own module, have labels, know about partitions, and allow for real time inspection 2. Rename `SQLMetric` --> `Metric` 2. Update uses of metrics in DataFusion and Ballista to the new API 2. Functionality to aggregate (sum) metrics via predicate and via partition (as requested by @Dandandan in https://github.com/apache/arrow-datafusion/issues/679#issuecomment-874341354 and @andygrove in https://github.com/apache/arrow-datafusion/issues/679#issuecomment-874187741) 3. Rename metric names to snake case (`output_rows`) rather than camel 🐫 case (`outputRows`) to conform to Rust expectations (see note from @andygrove on the reason the names were camelCase to begin with: https://github.com/apache/arrow-datafusion/pull/901#issuecomment-901362155) # Are there any user-facing changes? YES! The `SQLMetric` / `Metric` API is now totally different so any code that creates / uses `SQLMetrics` would have to be updated. The updates are fairly mechanical as you can see in this PR) # Notes In keeping with Rust's tradition of static typing, I also changed to using more strongly typed versions of the metric values to avoid mistakes such as adding a "time" to a counter value, as well as allowing other counter specific operations. # Open Questions: The current SQL counters use "camel case" for the counter names (e.g. `numRows`) rather than the Rust standard "snake case" (e.g. `num_rows`). I kept the same naming convention in this PR, but I wonder if we want to make them more Rust standard snake case given we are messing with them all anyways. # Not included in this PR: 1. Ensure that all operators have reasonable metrics: (I plan this in a follow on PR for #866, using this API) 2. Support for a global "operator id" as described by @andygrove in https://github.com/apache/arrow-datafusion/issues/679#issuecomment-874187741 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
