Hi All,

Polaris can support both a local API and new endpoints in the IRC API once the Iceberg community adopts the latter.

What is the concern with having different APIs to access the same data?

Cheers,
Dmitri.

On Tue, Mar 17, 2026 at 3:49 PM EJ Wang <[email protected]> wrote:
> I share the same feeling as Yufei after reviewing the PRs. I want to avoid creating discrepancies if the IRC side ends up supporting it.
>
> -ej
>
> On Wed, Mar 11, 2026 at 2:47 PM Yufei Gu <[email protected]> wrote:
> > Thanks Anand for working on this.
> >
> > Given the IRC event endpoint is WIP, I think it'd be best to be consistent with the IRC endpoint. For more context on the IRC event, the event endpoint is not finalized yet. I'd recommend speeding up the IRC side work to avoid any inconsistencies between Polaris and IRC spec.
> >
> > I have a few questions about the metrics endpoint:
> > 1. Do we need to expose them via Polaris REST endpoint? Can users grab the metrics from the backend directly? I understand the RBAC won't be there, but it provides flexibility for users. For example, some users may choose a different persistence model such as a KV store or storing the metrics as objects in S3, which usually scales better than an RDBMS like Postgres. My understanding is that Polaris is not intended to be a full metrics system anyway, but rather to provide a way for downstream systems to consume these data.
> > 2. Should we consider it as an IRC endpoint? Given the metrics report endpoint in IRC, the Iceberg community might also be interested in serving back metrics, similar to the event endpoint. In that case, there is a risk of fragmentation if we create a Polaris endpoint now. We should avoid that if possible. It might be worth checking with the Iceberg community first.
> >
> > Happy to hear others’ thoughts on this.
> >
> > Yufei
> >
> > On Wed, Mar 11, 2026 at 9:27 AM Nándor Kollár <[email protected]> wrote:
> > > I agree with Dmitri.
> > > The direction outlined in the proposal looks good to me, and the finer details can be worked out as implementation gets underway. We can adjust the design doc accordingly, later on.
> > >
> > > Thanks,
> > > Nandor
> > >
> > > On Tue, Mar 10, 2026 at 20:43, Dmitri Bourlatchkov <[email protected]> wrote:
> > > > Hi Anand and All,
> > > >
> > > > The proposal LGTM in its current form.
> > > >
> > > > My personal approach is that a proposal does not have to be as "polished" as the final API spec. As long as we have consensus on the general approach and the basic API principles, I think we can proceed to implementation and iron out final wrinkles during actual API spec and code PRs.
> > > >
> > > > Would this approach work for everyone?
> > > >
> > > > Re-posting my comment from GH [1] here for visibility, in case people have different opinions and wish to discuss this in more depth:
> > > >
> > > > > From my POV, Polaris is a platform, indeed. In this sense, I think it is critical to enable users to control what features are at play at runtime, since different users have different use cases. This is why I originally advocated for isolating Metrics Persistence from MetaStore Persistence.
> > > > >
> > > > > If a user decides to leverage the "native" Polaris (scan/commit) Metrics Persistence, I do not see any disadvantage in also exposing an (optional) REST API for loading these metrics from Polaris Persistence.
> > > > >
> > > > > The degree of support and sophistication that goes into this sub-system is up to the community. If we have contributors (like @obelix74 ) who are willing to evolve it, I see no harm in some functionality overlap with more focused metrics platforms.
> > > > > Again, the key point is for all users to have control and be able to opt in or out of this feature in their specific deployments.
> > > > >
> > > > > Of course, offloading scan/commit metrics storage to a specialized observability system is possible too (assuming someone develops integration code for that, which is very welcome).
> > > >
> > > > [1] https://github.com/apache/polaris/pull/3924#discussion_r2913947744
> > > >
> > > > Thanks,
> > > > Dmitri.
> > > >
> > > > On Tue, Mar 10, 2026 at 12:37 PM Anand Kumar Sankaran via dev <[email protected]> wrote:
> > > > > Hi EJ Wang and Dmitri,
> > > > >
> > > > > I addressed all your concerns about the proposal, in particular https://github.com/apache/polaris/pull/3924#discussion_r2908317696 .
> > > > >
> > > > > Does this address your concerns?
> > > > >
> > > > > -
> > > > > Anand
> > > > >
> > > > > From: Dmitri Bourlatchkov <[email protected]>
> > > > > Date: Monday, March 9, 2026 at 1:17 PM
> > > > > To: [email protected] <[email protected]>
> > > > > Subject: Re: Proposal for REST endpoints for table metrics and events
> > > > >
> > > > > Hi EJ,
> > > > >
> > > > > You make good points about the metrics API extensibility and evolution.
> > > > >
> > > > > However, we need to consider practical aspects too. Anand appears to have some specific use cases in mind, and I assume his proposal addresses them.
> > > > >
> > > > > Starting with an API + implementation that works for some real-world applications will validate the feature's usability.
> > > > >
> > > > > We can revamp the API completely in its v2 after v1 is merged. New major API versions do not have to be backward-compatible with older versions of the same API [1].
> > > > >
> > > > > In my personal experience, a v1 API can hardly be expected to cover all use cases and extensions well. We can certainly take a bit more time to polish it, but from my POV it might be best to iterate in terms of API versions rather than on unmerged commits in the initial proposal. Just my 2 cents :)
> > > > >
> > > > > That said, we should flag the new APIs in this proposal as "beta"... at least initially (which is the usual practice in Polaris).
> > > > >
> > > > > > I wonder if it would help to evaluate the Events API and Metrics API a bit more independently.
> > > > >
> > > > > That makes sense to me. However, the current proposal has progressed a lot since its initial submission and contains both APIs. I would not want to lose this momentum.
> > > > >
> > > > > It might still be advisable to implement the events and metrics APIs separately and gather additional feedback at that time.
> > > > >
> > > > > [1] https://polaris.apache.org/in-dev/unreleased/evolution/
> > > > >
> > > > > Cheers,
> > > > > Dmitri.
> > > > >
> > > > > On Mon, Mar 9, 2026 at 3:48 PM EJ Wang <[email protected]> wrote:
> > > > > > Hi Anand,
> > > > > >
> > > > > > I think the proposal is moving in a better direction, especially on the Events side, and I appreciate the iteration so far. That said, I still have some concerns about the Metrics side, but they are less about individual parameters or endpoint shape, and more about product boundary.
> > > > > >
> > > > > > 2 cents: I wonder if it would help to evaluate the Events API and Metrics API a bit more independently.
> > > > > >
> > > > > > The Events side feels relatively close to Polaris' catalog/change-log scope. It is easier to justify as part of the core/community surface, especially if the goal is to expose completed catalog mutations in a way that aligns with Iceberg-style events.
> > > > > >
> > > > > > The Metrics side feels different to me. Once we start adding more and more type-specific filters, query semantics, and schema shape for individual metric families, it seems easy for Polaris to drift toward a built-in observability backend. My bias would be for Polaris to support a smaller set of community-recognized built-in metrics well, while providing good extensibility points for deployments that want richer querying, visualization, or use-case-specific metrics.
> > > > > >
> > > > > > Related to that, I am not yet convinced the current metrics model is generic enough as a long-term direction. Even after consolidating to a single endpoint, the design still feels fairly tied to the current scan/commit shape.
> > > > > > I worry that otherwise each new metric family will keep pulling us into more storage/schema/API reshaping inside Polaris core.
> > > > > > So the framing question I would suggest is something like:
> > > > > >
> > > > > > > What is the minimal built-in metrics surface Polaris should own in core, and where should we instead rely on extensibility / sink-export / plugin-style integration?
> > > > > >
> > > > > > For me, getting that boundary right matters more than settling every query parameter detail first.
> > > > > >
> > > > > > -ej
> > > > > >
> > > > > > On Tue, Mar 3, 2026 at 12:29 PM Anand Kumar Sankaran via dev <[email protected]> wrote:
> > > > > > > Hi Yufei and Dmitri,
> > > > > > >
> > > > > > > Here is a proposal for the REST endpoints for metrics and events.
> > > > > > >
> > > > > > > https://github.com/apache/polaris/pull/3924/changes
> > > > > > >
> > > > > > > I did not see any precedent for raising a PR for proposals, so I am trying this. Please let me know what you think.
> > > > > > >
> > > > > > > -
> > > > > > > Anand
> > > > > > >
> > > > > > > From: Anand Kumar Sankaran <[email protected]>
> > > > > > > Date: Monday, March 2, 2026 at 10:25 AM
> > > > > > > To: [email protected] <[email protected]>
> > > > > > > Subject: Re: Polaris Telemetry and Audit Trail
> > > > > > >
> > > > > > > About the REST API, based on my use cases:
> > > > > > >
> > > > > > > 1. I want to be able to query commit metrics to track files added / removed per commit, along with record counts.
> > > > > > >    The ingestion pipeline that writes this data is owned by us and we are guaranteed to write this information for each write.
> > > > > > > 2. I want to be able to query scan metrics for reads. I understand clients do not fulfill this requirement.
> > > > > > > 3. I want to be able to query the events table (events are persisted). This may supersede #2; I am not sure yet.
> > > > > > >
> > > > > > > All this information is in the JDBC-based persistence model and is persisted in the metastore. I currently don’t have a need to query Prometheus or OpenTelemetry. I do publish some events to Prometheus and they are forwarded to our dashboards elsewhere.
> > > > > > >
> > > > > > > About the CLI utilities, I meant the admin user utilities. In one of the earliest drafts of my proposal, Prashant mentioned that the metrics tables can grow indefinitely and that a similar problem exists with the events table as well. We discussed that cleaning up old records from both the metrics tables and the events tables can be done via a CLI utility.
> > > > > > >
> > > > > > > I see that Yufei has covered the discussion about datasources.
> > > > > > >
> > > > > > > -
> > > > > > > Anand
> > > > > > >
> > > > > > > From: Yufei Gu <[email protected]>
> > > > > > > Date: Friday, February 27, 2026 at 9:54 PM
> > > > > > > To: [email protected] <[email protected]>
> > > > > > > Subject: Re: Polaris Telemetry and Audit Trail
> > > > > > >
> > > > > > > As I mentioned in https://github.com/apache/polaris/issues/3890, supporting multiple data sources is not a trivial change. I would strongly recommend starting with a design document to carefully evaluate the architectural implications and long-term impact.
> > > > > > >
> > > > > > > A REST endpoint to query metrics seems reasonable given the current JDBC-based persistence model. That said, we may also consider alternative storage models. For example, if we later adopt a time series system such as Prometheus to store metrics, the query model and access patterns would be fundamentally different. Designing the REST API without considering these potential evolutions may limit flexibility. I'd suggest starting with the use case.
> > > > > > >
> > > > > > > Yufei
> > > > > > >
> > > > > > > On Fri, Feb 27, 2026 at 3:42 PM Dmitri Bourlatchkov <[email protected]> wrote:
> > > > > > > > Hi Anand,
> > > > > > > >
> > > > > > > > Sharing my view... subject to discussion:
> > > > > > > >
> > > > > > > > 1. Adding non-IRC REST API to Polaris is perfectly fine.
> > > > > > > >
> > > > > > > > Figuring out specific endpoint URIs and payloads might require a few roundtrips, so opening a separate thread for that might be best. Contributors commonly create Google Docs for new API proposals too (they are fairly easy to update as the email discussion progresses).
> > > > > > > >
> > > > > > > > There was a suggestion to try Markdown (with PRs) for proposals [1] ... feel free to give it a try if you are comfortable with that.
> > > > > > > >
> > > > > > > > 2. Could you clarify whether you mean end user utilities or admin user utilities? In the latter case those might be more suitable for the Admin CLI (java), not the Python CLI, IMHO.
> > > > > > > >
> > > > > > > > Why would these utilities be common with events? IMHO, event use cases are distinct from scan/commit metrics.
> > > > > > > >
> > > > > > > > 3. I'd prefer separating metrics persistence from MetaStore persistence at the code level, so that they could be mixed and matched independently. The separate datasource question will become a non-issue with that approach, I guess.
> > > > > > > >
> > > > > > > > The rationale for separating scan metrics and metastore persistence is that "cascading deletes" between them are hardly ever required. Furthermore, the data and query patterns are very different, so different technologies might be beneficial in each case.
> > > > > > > >
> > > > > > > > [1] https://lists.apache.org/thread/yto2wp982t43h1mqjwnslswhws5z47cy
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > > Dmitri.
> > > > > > > >
> > > > > > > > On Fri, Feb 27, 2026 at 6:19 PM Anand Kumar Sankaran via dev <[email protected]> wrote:
> > > > > > > > > Thanks all. This PR is merged now.
> > > > > > > > >
> > > > > > > > > Here are the follow-up features / work needed. These were all part of the merged PR at some point in time and were removed to reduce scope.
> > > > > > > > >
> > > > > > > > > Please let me know what you think.
> > > > > > > > >
> > > > > > > > > 1. A REST API to paginate through table metrics. This will be a non-IRC-standard addition.
> > > > > > > > > 2. Utilities for managing old records, which should be common with events. There was some discussion that this belongs in the CLI.
> > > > > > > > > 3. Separate datasource (metrics, events, even other tables?).
> > > > > > > > >
> > > > > > > > > Anything else?
> > > > > > > > >
> > > > > > > > > -
> > > > > > > > > Anand
