Hi Pierre,
Thanks for sharing the updated proposal. I like the direction of keeping
Polaris as a pure store/serve layer while delegating metric computation
externally. It keeps the scope clean and aligns well with our community
call discussion.
While reviewing, I had a few questions/clarifications:
- For trivial metrics derivable from metadata.json, do you envision
external services recalculating them each time, or could Polaris parse and
persist them at commit time? Given that Polaris already has to handle the
metric endpoint (/v1/{prefix}/namespaces/{namespace}/tables/{table}/metrics)
anyway, it would be reasonable to let Polaris handle everything derived
from metadata.json as well (see the sketch after this list).
- How do you see the external metrics services working in practice,
particularly their triggering mechanism? William’s delegation service
proposal on asynchronous tasks (
https://docs.google.com/document/d/1AhR-cZ6WW6M-z8v53txOfcWvkDXvS-0xcMe3zjLMLj8/edit?tab=t.0#heading=h.57vglsnkoru0)
seems like a good starting point.
- Do we store metrics separately from, or together with, the other Polaris
transactional tables? What retention policy do you think makes sense for
these metrics?
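To make the first bullet concrete, here is a rough sketch of what I mean by
trivial metrics derivable from metadata.json (Java + Jackson; the summary
keys follow the Iceberg table spec, everything else is purely illustrative).
How those values would then be written to the metric endpoint is left out on
purpose, since that is exactly the shape the proposal needs to define:

    import com.fasterxml.jackson.databind.JsonNode;
    import com.fasterxml.jackson.databind.ObjectMapper;
    import java.io.File;

    // Illustrative only: read aggregate counters from the current snapshot's
    // summary in a table's metadata.json. These are the kind of "trivial"
    // metrics that either an external service or Polaris itself could persist
    // at commit time.
    public class TrivialMetricsFromMetadata {
        public static void main(String[] args) throws Exception {
            JsonNode metadata = new ObjectMapper().readTree(new File(args[0]));
            long currentSnapshotId = metadata.path("current-snapshot-id").asLong();
            for (JsonNode snapshot : metadata.path("snapshots")) {
                if (snapshot.path("snapshot-id").asLong() != currentSnapshotId) {
                    continue;
                }
                JsonNode summary = snapshot.path("summary");
                System.out.println("total-records    = " + summary.path("total-records").asText("n/a"));
                System.out.println("total-data-files = " + summary.path("total-data-files").asText("n/a"));
                System.out.println("total-files-size = " + summary.path("total-files-size").asText("n/a"));
            }
        }
    }

Whether that kind of loop runs in an external service on every read or inside
Polaris at commit time is really the question above.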
Overall, this feels like a solid step forward. I’ll add more detailed
comments directly in the doc, but would you mind updating the sharing
settings to allow comments?
Yufei
On Mon, Sep 22, 2025 at 6:27 AM Pierre Laporte <[email protected]>
wrote:
> Hello folks
>
> Thanks for your feedback. I have published a new proposal that I think
> addresses most of the points raised. It is available at
>
> https://docs.google.com/document/d/1oFsuI_WKY0QqVqBNS4gtLlDdlZiV9fGmmUG3rLSED4Y/edit?tab=t.0#heading=h.1duembdpfkwi
>
> Notable changes compared to the first proposal:
>
>    1. This proposal describes a collaborative approach where metrics are
>    computed exclusively by external services and then pushed to Polaris.
>    2. Polaris stores and serves those metrics, but does not compute any.
>    3. As discussed during the community call, this ^ means that even
>    trivial metrics are not computed by Polaris (e.g. metrics that can be
>    derived from the table’s metadata.json).
>    4. A reference implementation for certain metric computations may be
>    included for illustrative/demo purposes only. It is not packaged with
>    the Polaris runtime. It may also not be suited for large tables.
>    5. This proposal does not cover Iceberg commit and scan reports.
>
> --
>
> Pierre
>
>
> On Mon, Sep 15, 2025 at 8:58 PM Pierre Laporte <[email protected]>
> wrote:
>
> >
> >
> > On Mon, Sep 15, 2025 at 8:17 PM Yufei Gu <[email protected]> wrote:
> >
> >> > From my perspective, Polaris should compute the aggregations it needs
> >> > (like file size distributions)
> >>
> >> That’d be a pretty big perf hit on Polaris itself, unless you mean
> >> Polaris in the broader sense, including peripheral services like TMS.
> >
> >
> > I am not sure I understand what you mean. Let's imagine that Polaris does
> > not compute metrics and instead stores and serves arbitrary numbers
> > provided by external services. In that case, the "operational metrics"
> > part of Polaris is nothing more than a wrapper around a database. It does
> > not add any value over that database and, in fact, removes value from the
> > database, given that it will not expose all of its configuration
> > parameters.
> >
> > Regarding the overhead, you are correct that this computation will be
> > done by Polaris. As a result, the resource usage will likely increase.
> > See the sections "threading model" and "deployment options" of the
> > "Design" tab for measures that would prevent this overhead from impacting
> > other Catalog workloads.
> >
> >> > I don't think clients should be able to push values to Polaris. Could
> >> > you elaborate on what you mean by "a new metric write endpoint"?
> >>
> >> It's in your design doc. Any REST client can update metrics with this
> >> design. I'm with you that we shouldn't do that now; that's also not the
> >> only option.
> >>
> >> Endpoint: /v1/{prefix}/namespaces/{namespace}/tables/{table}
> >> Method:   POST
> >> Summary:  Request all metrics of the given table to be updated.
> >>
> >
> > This endpoint is there for external services that need fresh data. Note
> > that it does not allow those external services to push data to Polaris.
> >
> >
> >> > I would not consider serving historical data for now.
> >>
> >> There’s no way to handle only point-in-time data unless all the metrics
> >> we’re talking about come directly from the metadata.json file, such as
> >> snapshot summaries.
> >> Any async or additional metrics collection requires a time dimension;
> >> e.g., the snapshot timestamp is essential to indicate the scope of
> >> snapshot/commit metrics.
> >
> >
> > How could we serve historical data (i.e. "the value of this metric was
> > [...] as of two days ago") if we cannot compute point-in-time data (i.e.
> > "the value of this metric is currently [...]") ?
> >
> >
>