Thanks Anand for working on this.

Given that the IRC event endpoint is still a work in progress, I think it'd
be best to stay consistent with it. For context, the IRC event endpoint is
not yet finalized, so I'd recommend speeding up the IRC-side work to avoid
any inconsistencies between Polaris and the IRC spec.

I have a few questions about the metrics endpoint:
1. Do we need to expose them via a Polaris REST endpoint? Can users grab
the metrics from the backend directly? I understand RBAC won't be there,
but direct access gives users flexibility. For example, some users may
choose a different persistence model, such as a KV store or storing the
metrics as objects in S3, which usually scales better than an RDBMS like
Postgres. My understanding is that Polaris is not intended to be a full
metrics system anyway, but rather to provide a way for downstream systems
to consume this data.
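To illustrate the direct-access option in point 1, here is a minimal sketch
of a downstream consumer reading commit metrics straight from the backing
store, bypassing any Polaris REST endpoint. The table and column names are
invented for illustration, and sqlite3 stands in for a real Postgres
backend:

```python
# Hypothetical sketch: the commit_metrics schema below is invented;
# a real deployment would use the actual Polaris persistence schema.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE commit_metrics (
           table_name    TEXT,
           snapshot_id   INTEGER,
           added_files   INTEGER,
           removed_files INTEGER,
           added_records INTEGER
       )"""
)
conn.execute("INSERT INTO commit_metrics VALUES ('db.orders', 101, 12, 3, 50000)")
conn.execute("INSERT INTO commit_metrics VALUES ('db.orders', 102, 4, 0, 9000)")

# Aggregate files and records per table, as a downstream consumer might.
rows = conn.execute(
    """SELECT table_name,
              SUM(added_files), SUM(removed_files), SUM(added_records)
       FROM commit_metrics
       GROUP BY table_name"""
).fetchall()
print(rows)  # [('db.orders', 16, 3, 59000)]
```

The trade-off, as noted above, is that access control would then have to be
enforced at the database level rather than by Polaris RBAC.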
2. Should we consider making it an IRC endpoint? Given the metrics report
endpoint in IRC, the Iceberg community might also be interested in serving
metrics back, similar to the event endpoint. In that case, creating a
Polaris-specific endpoint now risks fragmentation, which we should avoid if
possible. It might be worth checking with the Iceberg community first.

Happy to hear others’ thoughts on this.

Yufei


On Wed, Mar 11, 2026 at 9:27 AM Nándor Kollár <[email protected]> wrote:

> I agree with Dmitri. The direction outlined in the proposal looks good to
> me, and the finer details can be worked out as implementation gets
> underway. We can adjust the design doc accordingly later on.
>
>
> Thanks,
> Nandor
>
> On Tue, Mar 10, 2026 at 8:43 PM Dmitri Bourlatchkov <[email protected]>
> wrote:
>
> > Hi Anand and All,
> >
> > The proposal LGTM in its current form.
> >
> > My personal approach is that a proposal does not have to be as "polished"
> > as the final API spec. As long as we have consensus on the general
> > approach and the basic API principles, I think we can proceed to
> > implementation and iron out final wrinkles during actual API spec and
> code
> > PRs.
> >
> > Would this approach work for everyone?
> >
> > Re-posting my comment from GH [1] here for visibility, in case people
> have
> > different opinions and wish to discuss this in more depth:
> >
> > From my POV, Polaris is a platform, indeed. In this sense, I think it is
> > > critical to enable users to control which features are in play at
> runtime,
> > > since different users have different use cases. This is why I
> originally
> > > advocated for isolating Metrics Persistence from MetaStore Persistence.
> > >
> > > If a user decides to leverage the "native" Polaris (scan/commit)
> Metrics
> > > Persistence, I do not see any disadvantage in also exposing an
> (optional)
> > > REST API for loading these metrics from Polaris Persistence.
> > >
> > > The degree of support and sophistication that goes into this sub-system
> > is
> > > up to the community. If we have contributors (like @obelix74 ) who are
> > > willing to evolve it, I see no harm in some functionality overlap with
> > more
> > > focused metrics platforms. Again, the key point is for all users to
> have
> > > control and be able to opt in or out of this feature in their specific
> > > deployments.
> > >
> > > Of course, offloading scan/commit metrics storage to a specialized
> > > observability system is possible too (assuming someone develops
> > integration
> > > code for that, which is very welcome).
> >
> >
> > [1] https://github.com/apache/polaris/pull/3924#discussion_r2913947744
> >
> > Thanks,
> > Dmitri.
> >
> > On Tue, Mar 10, 2026 at 12:37 PM Anand Kumar Sankaran via dev <
> > [email protected]> wrote:
> >
> > > Hi EJ Wang and Dmitri,
> > >
> > > I addressed all your concerns about the proposal, in particular
> > > https://github.com/apache/polaris/pull/3924#discussion_r2908317696.
> > >
> > > Does this address your concerns?
> > >
> > > -
> > > Anand
> > >
> > > From: Dmitri Bourlatchkov <[email protected]>
> > > Date: Monday, March 9, 2026 at 1:17 PM
> > > To: [email protected] <[email protected]>
> > > Subject: Re: Proposal for REST endpoints for table metrics and events
> > >
> > >
> > >
> > > Hi EJ,
> > >
> > > You make good points about the metrics API extensibility and evolution.
> > >
> > > However, we need to consider practical aspects too. Anand appears to
> have
> > > some specific use cases in mind, and I assume his proposal addresses
> > them.
> > >
> > > Starting with an API + implementation that works for some real
> > > world applications will validate the feature's usability.
> > >
> > > We can revamp the API completely in its v2 after v1 is merged. New
> major
> > > API versions do not have to be backward-compatible with older versions
> of
> > > the same API [1].
> > >
> > > In my personal experience, a v1 API can hardly be expected to cover all
> > use
> > > cases and extensions well. We can certainly take a bit more time to
> > polish
> > > it, but from my POV it might be best to iterate in terms of API
> versions
> > > rather than on unmerged commits in the initial proposal. Just my 2
> cents
> > :)
> > >
> > > That said, we should flag the new APIs in this proposal as "beta"... at
> > > least initially (which is the usual practice in Polaris).
> > >
> > > > I wonder if it would help to evaluate the Events API and Metrics API
> a
> > > bit more independently.
> > >
> > > That makes sense to me. However, the current proposal has progressed
> > > a lot since its initial submission and contains both APIs. I would not want
> to
> > > lose this momentum.
> > >
> > > It might still be advisable to implement the events and metrics APIs
> > > separately and gather additional feedback at that time.
> > >
> > > [1] https://polaris.apache.org/in-dev/unreleased/evolution/
> > >
> > > Cheers,
> > > Dmitri.
> > >
> > > On Mon, Mar 9, 2026 at 3:48 PM EJ Wang <[email protected]>
> > > wrote:
> > >
> > > > Hi Anand,
> > > >
> > > > I think the proposal is moving in a better direction, especially on
> the
> > > > Events side, and I appreciate the iteration so far. That said, I
> still
> > > have
> > > > some concerns about the Metrics side, but they are less about
> > individual
> > > > parameters or endpoint shape, and more about product boundary.
> > > >
> > > > 2 cents: I wonder if it would help to evaluate the Events API and
> > Metrics
> > > > API a bit more independently.
> > > >
> > > > The Events side feels relatively close to Polaris' catalog/change-log
> > > > scope. It is easier to justify as part of the core/community surface,
> > > > especially if the goal is to expose completed catalog mutations in a
> > way
> > > > that aligns with Iceberg-style events.
> > > >
> > > > The Metrics side feels different to me. Once we start adding more and
> > > more
> > > > type-specific filters, query semantics, and schema shape for
> individual
> > > > metric families, it seems easy for Polaris to drift toward a built-in
> > > > observability backend. My bias would be for Polaris to support a
> > smaller
> > > > set of community-recognized built-in metrics well, while providing
> good
> > > > extensibility points for deployments that want richer querying,
> > > > visualization, or use-case-specific metrics.
> > > >
> > > > Related to that, I am not yet convinced the current metrics model is
> > > > generic enough as a long-term direction. Even after consolidating to
> a
> > > > single endpoint, the design still feels fairly tied to the current
> > > > scan/commit shape. I worry that otherwise each new metric family will
> > > keep
> > > > pulling us into more storage/schema/API reshaping inside Polaris
> core.
> > > > So the framing question I would suggest is something like:
> > > > > What is the minimal built-in metrics surface Polaris should own in
> > > core,
> > > > and where should we instead rely on extensibility / sink-export /
> > > > plugin-style integration?
> > > >
> > > > For me, getting that boundary right matters more than settling every
> > > query
> > > > parameter detail first.
> > > >
> > > > -ej
> > > >
> > > > On Tue, Mar 3, 2026 at 12:29 PM Anand Kumar Sankaran via dev <
> > > > [email protected]> wrote:
> > > >
> > > > > Hi Yufei and Dmitri,
> > > > >
> > > > > Here is a proposal for the REST endpoints for metrics and events.
> > > > >
> > > > > https://github.com/apache/polaris/pull/3924/changes
> > > > >
> > > > > I did not see any precedents for raising a PR for proposals, so
> > trying
> > > > > this.  Please let me know what you think.
> > > > >
> > > > > -
> > > > > Anand
> > > > >
> > > > > From: Anand Kumar Sankaran <[email protected]>
> > > > > Date: Monday, March 2, 2026 at 10:25 AM
> > > > > To: [email protected] <[email protected]>
> > > > > Subject: Re: Polaris Telemetry and Audit Trail
> > > > >
> > > > > About the REST API, based on my use cases:
> > > > >
> > > > >
> > > > >   1.  I want to be able to query commit metrics to track files
> > > > > added / removed per commit, along with record counts. The ingestion
> > > > > pipeline that writes this data is owned by us and we are guaranteed
> > > > > to write this information for each write.
> > > > >   2.  I want to be able to query scan metrics for reads. I
> > > > > understand clients do not fulfill this requirement.
> > > > >   3.  I want to be able to query the events table (events are
> > > > > persisted) - this may supersede #2, I am not sure yet.
> > > > >
> > > > > All this information is in the JDBC based persistence model and is
> > > > > persisted in the metastore. I currently don’t have a need to query
> > > > > prometheus or open telemetry. I do publish some events to
> Prometheus
> > > and
> > > > > they are forwarded to our dashboards elsewhere.
> > > > >
> > > > > About the CLI utilities, I meant the admin user utilities. In one
> of
> > > the
> > > > > earliest drafts of my proposal, Prashant mentioned that the metrics
> > > > tables
> > > > > can grow indefinitely and that a similar problem exists with the
> > events
> > > > > table as well. We discussed that cleaning up of old records from
> both
> > > > > metrics tables and events tables can be done via a CLI utility.
> > > > >
> > > > > I see that Yufei has covered the discussion about datasources.
> > > > >
> > > > > -
> > > > > Anand
> > > > >
> > > > >
> > > > >
> > > > > From: Yufei Gu <[email protected]>
> > > > > Date: Friday, February 27, 2026 at 9:54 PM
> > > > > To: [email protected] <[email protected]>
> > > > > Subject: Re: Polaris Telemetry and Audit Trail
> > > > >
> > > > >
> > > > >
> > > > > As I mentioned in https://github.com/apache/polaris/issues/3890,
> > > > > supporting
> > > > > multiple data sources is not a trivial change. I would strongly
> > > recommend
> > > > > starting with a design document to carefully evaluate the
> > architectural
> > > > > implications and long term impact.
> > > > >
> > > > > A REST endpoint to query metrics seems reasonable given the current
> > > JDBC
> > > > > based persistence model. That said, we may also consider
> alternative
> > > > > storage models. For example, if we later adopt a time series system
> > > such
> > > > as
> > > > > Prometheus to store metrics, the query model and access patterns
> > would
> > > be
> > > > > fundamentally different. Designing the REST API without considering
> > > these
> > > > > potential evolutions may limit flexibility. I'd suggest to start
> with
> > > the
> > > > > use case.
> > > > >
> > > > > Yufei
> > > > >
> > > > >
> > > > > On Fri, Feb 27, 2026 at 3:42 PM Dmitri Bourlatchkov <
> > [email protected]>
> > > > > wrote:
> > > > >
> > > > > > Hi Anand,
> > > > > >
> > > > > > Sharing my view... subject to discussion:
> > > > > >
> > > > > > 1. Adding non-IRC REST API to Polaris is perfectly fine.
> > > > > >
> > > > > > Figuring out specific endpoint URIs and payloads might require a
> > few
> > > > > > roundtrips, so opening a separate thread for that might be best.
> > > > > > Contributors commonly create Google Docs for new API proposals
> too
> > > > (they
> > > > > > are fairly easy to update as the email discussion progresses).
> > > > > >
> > > > > > There was a suggestion to try Markdown (with PRs) for proposals
> [1]
> > > ...
> > > > > > feel free to give it a try if you are comfortable with that.
> > > > > >
> > > > > > 2. Could you clarify whether you mean end user utilities or admin
> > > user
> > > > > > utilities? In the latter case those might be more suitable for
> the
> > > > Admin
> > > > > > CLI (java) not the Python CLI, IMHO.
> > > > > >
> > > > > > Why would these utilities be common with events? IMHO, event use
> > > cases
> > > > > are
> > > > > > distinct from scan/commit metrics.
> > > > > >
> > > > > > 3. I'd prefer separating metrics persistence from MetaStore
> > > persistence
> > > > > at
> > > > > > the code level, so that they could be mixed and matched
> > > independently.
> > > > > The
> > > > > > separate datasource question will become a non-issue with that
> > > > approach,
> > > > > I
> > > > > > guess.
> > > > > >
> > > > > > The rationale for separating scan metrics and metastore
> persistence
> > > is
> > > > > that
> > > > > > "cascading deletes" between them are hardly ever required.
> > > Furthermore,
> > > > > the
> > > > > > data and query patterns are very different so different
> > technologies
> > > > > might
> > > > > > be beneficial in each case.
> > > > > >
> > > > > > [1] https://lists.apache.org/thread/yto2wp982t43h1mqjwnslswhws5z47cy
> > > > > > Cheers,
> > > > > > Dmitri.
> > > > > >
> > > > > > On Fri, Feb 27, 2026 at 6:19 PM Anand Kumar Sankaran via dev <
> > > > > > [email protected]> wrote:
> > > > > >
> > > > > > > Thanks all. This PR is merged now.
> > > > > > >
> > > > > > > Here are the follow-up features / work needed.  These were all
> > part
> > > > of
> > > > > > the
> > > > > > > merged PR at some point in time and were removed to reduce
> scope.
> > > > > > >
> > > > > > > Please let me know what you think.
> > > > > > >
> > > > > > >
> > > > > > >   1.  A REST API to paginate through table metrics. This will
> > > > > > > be a non-IRC standard addition.
> > > > > > >   2.  Utilities for managing old records, which should be
> > > > > > > shared with events. There was some discussion that these belong
> > > > > > > in the CLI.
> > > > > > >   3.  Separate datasource (metrics, events, even other
> tables?).
> > > > > > >
> > > > > > >
> > > > > > > Anything else?
> > > > > > >
> > > > > > > -
> > > > > > > Anand
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > >
> > >
> >
>
