2010YOUY01 commented on code in PR #21160:
URL: https://github.com/apache/datafusion/pull/21160#discussion_r2998967487
##########
datafusion/common/src/format.rs:
##########
@@ -206,47 +206,227 @@ impl ConfigField for ExplainFormat {
}
}
-/// Verbosity levels controlling how `EXPLAIN ANALYZE` renders metrics
+/// Categorizes metrics so the display layer can choose the desired verbosity.
+///
+/// The `datafusion.explain.analyze_level` configuration controls which
+/// type is shown:
+/// - `"dev"` (the default): all metrics are shown.
+/// - `"summary"`: only metrics tagged as `Summary` are shown.
+///
+/// This is orthogonal to [`MetricCategory`], which filters by *what kind*
+/// of value a metric represents (rows / bytes / timing).
+///
+/// # Difference from `EXPLAIN ANALYZE VERBOSE`
+///
+/// The `VERBOSE` keyword controls whether per-partition metrics are shown
+/// (when specified) or aggregated metrics are displayed (when omitted).
+/// In contrast, `MetricType` determines which *levels* of metrics are
+/// displayed.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
-pub enum ExplainAnalyzeLevel {
- /// Show a compact view containing high-level metrics
+pub enum MetricType {
Review Comment:
I agree the new category tags are necessary
##########
datafusion/physical-expr-common/src/metrics/builder.rs:
##########
@@ -88,6 +93,15 @@ impl<'a> MetricBuilder<'a> {
self
}
+ /// Set the semantic category for the metric being constructed.
+ ///
+ /// See [`MetricCategory`] for details on the determinism properties
+ /// of each category.
+ pub fn with_category(mut self, category: MetricCategory) -> Self {
Review Comment:
I'm wondering if some metrics may need multiple categories.
Currently `Timing/Bytes/Rows` classify metrics by output type, but we could
also group them by functional aspects like `Pruning` or `Spilling`. That would
enable more flexible filtering, e.g.:
```sql
set datafusion.explain.analyze_categories = ['Timing', 'Spilling']
```
to show metrics matching both tags.
If this direction is plausible, should we consider storing multiple
categories (e.g. `Vec<MetricCategory>`) instead? To be more flexible for the
future change.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]