moomindani opened a new issue, #16573: URL: https://github.com/apache/iceberg/issues/16573
### Feature Request / Improvement

Add a built-in mechanism that lets users restrict which tables' `ScanReport`s and `CommitReport`s are forwarded to a configured `MetricsReporter`, applied uniformly across all reporter implementations (`LoggingMetricsReporter`, `RESTMetricsReporter`, `OtelMetricsReporter`, and custom user-supplied ones).

#### Motivation

In deployments with many tables, users frequently want to emit metrics for only a subset:

- Only tables in production databases (e.g., `prod.*`), not staging or sandbox
- Only specific business-critical tables, excluding intermediate ones
- Exclude noisy test or scratch tables (`tmp.*`, `*.bench_*`)

Existing per-reporter knobs only partially address this. The `iceberg.otel.metrics.attributes` allowlist added in #16250 controls *which attributes* an OTel metric carries. That is useful for limiting cardinality, but it does not stop metrics from being emitted for tables the user doesn't care about. Cardinality-control mechanisms in time-series backends (OTel Views, Prometheus relabel rules, etc.) are reporter-specific and require host-side knowledge.

Table-level filtering is a cross-cutting concern that belongs above any single reporter. Putting it inside each reporter implementation would lead to repeated, slightly inconsistent flag sets per reporter. Putting it once in the framework layer means every existing and future `MetricsReporter` benefits without re-implementation.

#### Proposal

Introduce two catalog properties recognized by the catalog when constructing the reporter pipeline:

```
metrics-reporter-impl=org.apache.iceberg.metrics.OtelMetricsReporter
metrics-reporter.table-name.include=prod\..*
metrics-reporter.table-name.exclude=.*\.tmp_.*
```

Values are Java regex patterns matched against `ScanReport.tableName()` / `CommitReport.tableName()`. The catalog wraps the user's reporter in a filtering layer when either property is present. When both are present, `exclude` wins over `include` (an explicit deny overrides an include).
When neither is set, behavior is identical to today (pass-through, with no runtime overhead).

#### Behavior

- `include` only set: forward reports whose table name matches; drop others.
- `exclude` only set: drop reports whose table name matches; forward others.
- Both set: drop if `exclude` matches; otherwise forward only if `include` matches.
- Neither set: forward everything (current behavior).
- Empty value (`metrics-reporter.table-name.include=`) is treated as "not set" rather than "match nothing" to avoid accidentally silencing all metrics on misconfiguration.

#### Relationship to existing work

- #16169, #16250: surfaced this concern during discussion of per-table cardinality of the OTel reporter. This proposal complements `iceberg.otel.metrics.attributes` (attribute pruning) by giving users a way to also drop entire reports for uninteresting tables.
- dev@ DISCUSS for #16250: https://lists.apache.org/thread/vn4gglocg2g40p69mfrrh86qzkn1rr4b

### Query engine

None (applies to all engines that consume `MetricsReporter`).

### Willingness to contribute

- [X] I can contribute this improvement/feature independently
- [ ] I would be willing to contribute this improvement/feature with guidance from the Iceberg community
- [ ] I cannot contribute this improvement/feature at this time
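
#### Illustrative sketch of the filter semantics

For illustration only, the include/exclude rules listed under Behavior could be expressed as a small predicate. This is a minimal sketch and not a proposed implementation: `TableNameFilter` and `shouldForward` are hypothetical names that do not exist in Iceberg, and the sketch assumes full-string matching via `Matcher.matches()`, which the proposal text does not pin down.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not an Iceberg class): decides whether a report
// for the given table name should be forwarded to the wrapped reporter.
final class TableNameFilter {
  private final Pattern include; // null means "not set"
  private final Pattern exclude; // null means "not set"

  TableNameFilter(String includeRegex, String excludeRegex) {
    this.include = compileOrNull(includeRegex);
    this.exclude = compileOrNull(excludeRegex);
  }

  // A missing or empty value is treated as "not set", not "match nothing".
  private static Pattern compileOrNull(String regex) {
    return (regex == null || regex.isEmpty()) ? null : Pattern.compile(regex);
  }

  boolean shouldForward(String tableName) {
    // An explicit exclude wins over any include.
    // Assumption: the whole table name must match the pattern.
    if (exclude != null && exclude.matcher(tableName).matches()) {
      return false;
    }
    // With no include pattern, everything not excluded is forwarded.
    return include == null || include.matcher(tableName).matches();
  }
}
```

A filtering wrapper around the user's reporter would consult such a predicate with `ScanReport.tableName()` or `CommitReport.tableName()` before delegating, and the catalog could skip the wrapper entirely when both patterns are unset.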
