Hello Dmitri,

I was thinking something along the lines of exposing catalog, namespace, and 
table name to CEL. For example:
namespace == "demo"
catalog == "prod" && namespace == "sales"
!table.startsWith("tmp_")

This would allow users to enable metrics for specific catalogs, namespaces, or 
tables during troubleshooting, or exclude noisy tables.

That said, I'm not sure we need the flexibility of CEL right away. I'm 
wondering if we should start with something simpler, such as include/exclude 
lists or glob patterns, which may be easier to configure and understand.
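
To make the simpler option concrete, here is a rough sketch of include/exclude glob rules, written in Python only for brevity. Everything in it is an assumption for illustration: the MetricsFilter name, the (catalog, namespace, table) rule shape, and the "exclude wins over include" precedence are hypothetical and not part of any existing Polaris API or configuration.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

# A rule is one glob per component: (catalog, namespace, table).
# Hypothetical shape, chosen so "*" never matches across components.
Rule = tuple[str, str, str]


def _matches(rule: Rule, catalog: str, namespace: str, table: str) -> bool:
    """True when every component matches its glob (case-sensitive)."""
    values = (catalog, namespace, table)
    return all(fnmatchcase(value, pattern) for value, pattern in zip(values, rule))


@dataclass(frozen=True)
class MetricsFilter:
    """Hypothetical include/exclude filter for metrics reporting."""

    # Default mirrors the proposal: include everything, exclude nothing.
    include: tuple[Rule, ...] = (("*", "*", "*"),)
    exclude: tuple[Rule, ...] = ()

    def allows(self, catalog: str, namespace: str, table: str) -> bool:
        # Assumed precedence: an exclude match always wins.
        if any(_matches(r, catalog, namespace, table) for r in self.exclude):
            return False
        return any(_matches(r, catalog, namespace, table) for r in self.include)


# Roughly the CEL examples above, expressed as glob rules:
# only prod/sales tables, minus anything starting with "tmp_".
sales_only = MetricsFilter(
    include=(("prod", "sales", "*"),),
    exclude=(("*", "*", "tmp_*"),),
)
```

With these rules, sales_only.allows("prod", "sales", "orders") is true, while tmp_ tables and tables outside prod/sales are rejected. Anything beyond simple prefix/suffix matching (for example combining conditions with "or") is where CEL would start to pay off.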

I'm pretty open-ended on the implementation. If we end up using CEL for 
maintenance utilities, it may make sense to use the same approach here as well 
so we can reuse the code and provide a consistent experience across features.

Thanks,
Yong Zheng

On 2026/06/09 01:08:40 Dmitri Bourlatchkov wrote:
> Hi Yong,
> 
> The feature request sounds reasonable to me. I think other users would
> appreciate this feature too.
> 
> How do you envision defining these filters?
> 
> I think CEL expressions can be a good fit. They are currently used in NoSQL
> maintenance tasks, but concerns have been raised about using CEL (Nessie's
> cel-java impl.), which are tracked in [3847].
> 
> [3847] https://github.com/apache/polaris/issues/3847
> 
> Cheers,
> Dmitri.
> 
> On Sun, Jun 7, 2026 at 2:52 PM Yong Zheng <[email protected]> wrote:
> 
> > Hello,
> >
> > Currently we have polaris.iceberg-metrics.reporting and the ability to
> > persist those metrics to the backend. This can be enabled by changing the
> > log level for org.apache.polaris.service.reporting to INFO for log-based
> > metrics, and by setting polaris.iceberg-metrics.reporting.type to
> > persistent if we want the metrics persisted to the backend. Currently this
> > setting is all or nothing: once it is enabled, metrics for all tables are
> > reported/persisted. Should we introduce a filter (include/exclude
> > settings) that lets people fine-tune what to include or exclude (and
> > defaults to include all)?
> >
> > There are a couple of use cases, such as:
> > 1. exclude noisy tables
> > 2. enable metrics for a given namespace during troubleshooting without
> > enabling them for everything (e.g. only certain tables are user facing and
> > we would want to closely monitor their performance metrics, while other
> > tables may be batched and less latency sensitive)
> > 3. avoid potential storage growth, as there is no cleanup job at the
> > moment (raised in a different ML thread), and avoid extra I/O to the
> > backend RDS if metrics for the majority of the tables are not necessary
> >
> > Thanks,
> > Yong Zheng
> >
> 
