adriangb commented on code in PR #21160:
URL: https://github.com/apache/datafusion/pull/21160#discussion_r2998336701


##########
datafusion/common/src/format.rs:
##########
@@ -206,6 +206,142 @@ impl ConfigField for ExplainFormat {
     }
 }
 
+/// Classifies a metric by what it measures.
+///
+/// This is orthogonal to [`MetricType`] (SUMMARY / DEV), which controls
+/// *verbosity*. `MetricCategory` controls *what kind of value* is shown,
+/// so that `EXPLAIN ANALYZE` output can be narrowed to only the categories
+/// that are useful in a given context.
+///
+/// For testing, the key property is **determinism**:
+/// - [`Rows`](Self::Rows) and [`Bytes`](Self::Bytes) depend on the plan
+///   and the data, so they are deterministic across runs (given the same
+///   input).
+/// - [`Timing`](Self::Timing) depends on hardware, system load, scheduling,
+///   etc., so it varies from run to run even on the same machine.
+///
+/// [`MetricCategory`] is especially useful in sqllogictest (`.slt`) files:
+/// setting `datafusion.explain.analyze_categories = 'rows'` lets a test
+/// assert on row-count metrics without sprinkling `<slt:ignore>` over every
+/// timing value.
+///
+/// Metrics that do not declare a category (the default for custom
+/// `Count` / `Gauge` metrics) are **always included** unless the config
+/// is set to `'none'`.
+///
+/// [`MetricType`]: datafusion_physical_expr_common::metrics::MetricType
+#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
+pub enum MetricCategory {
+    /// Row counts and related dimensionless counters: `output_rows`,
+    /// `spilled_rows`, `output_batches`, pruning metrics, ratios, etc.
+    ///
+    /// Deterministic given the same plan and data.
+    Rows,
+    /// Byte measurements: `output_bytes`, `spilled_bytes`,
+    /// `current_memory_usage`, `bytes_scanned`, etc.
+    ///
+    /// Deterministic given the same plan and data.
+    Bytes,
+    /// Wall-clock durations and timestamps: `elapsed_compute`,
+    /// operator-defined `Time` metrics, `start_timestamp` /
+    /// `end_timestamp`, etc.
+    ///
+    /// **Non-deterministic** — varies across runs even on the same hardware.
+    Timing,
+}

Review Comment:
   Done!
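
For readers of this thread, the inclusion rule described in the doc comment above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: `CategoryFilter`, `parse_filter`, and `is_shown` are hypothetical names, and the accepted values `'all'`, `'bytes'`, `'timing'`, and comma-separated lists are assumptions (only `'rows'` and `'none'` appear in the doc comment).

```rust
use std::collections::HashSet;

/// Mirrors the enum from the diff above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum MetricCategory {
    Rows,
    Bytes,
    Timing,
}

/// Hypothetical parsed form of `datafusion.explain.analyze_categories`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum CategoryFilter {
    /// Assumed value 'all': show every metric.
    All,
    /// 'none': show no metrics, not even uncategorized ones.
    None,
    /// An explicit set of categories, e.g. 'rows'.
    Only(HashSet<MetricCategory>),
}

/// Parses a config string such as "rows" or "rows,bytes".
/// The comma-separated syntax is an assumption made for this sketch.
pub fn parse_filter(s: &str) -> Result<CategoryFilter, String> {
    let s = s.trim().to_ascii_lowercase();
    match s.as_str() {
        "all" => return Ok(CategoryFilter::All),
        "none" => return Ok(CategoryFilter::None),
        _ => {}
    }
    let mut set = HashSet::new();
    for part in s.split(',') {
        let cat = match part.trim() {
            "rows" => MetricCategory::Rows,
            "bytes" => MetricCategory::Bytes,
            "timing" => MetricCategory::Timing,
            other => return Err(format!("unknown metric category: {other:?}")),
        };
        set.insert(cat);
    }
    Ok(CategoryFilter::Only(set))
}

/// Decides whether a metric is displayed. `category` is `None` for metrics
/// that do not declare a category; per the doc comment, those are always
/// included unless the filter is 'none'.
pub fn is_shown(filter: &CategoryFilter, category: Option<MetricCategory>) -> bool {
    match (filter, category) {
        (CategoryFilter::None, _) => false,
        (CategoryFilter::All, _) => true,
        (CategoryFilter::Only(_), None) => true,
        (CategoryFilter::Only(set), Some(c)) => set.contains(&c),
    }
}
```

Under these assumptions, a filter parsed from `'rows'` keeps row-count metrics and uncategorized metrics while dropping timing values, which is the behavior the doc comment describes for `.slt` tests.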



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

