Hi All,

WDYT about another sync call next week?

Thanks,
Dmitri.

On Wed, May 6, 2026 at 5:29 PM Dmitri Bourlatchkov <[email protected]> wrote:

> Hi EJ,
>
> Thanks for the summary! It covers what we discussed in the meeting very
> well, IMHO.
>
> Looking forward to concrete PRs :)
>
> Cheers,
> Dmitri.
>
> On Wed, May 6, 2026 at 5:08 PM EJ Wang <[email protected]>
> wrote:
>
>> Hi folks,
>>
>> We had a community sync earlier; thanks, JB, for scheduling it. Below are
>> notes from the first metrics architecture sync (May 6, 10-11am PT).
>> Discussion doc with per-section status:
>>
>> https://docs.google.com/document/d/100h7c4damrUzVuquYbBHM0EvA4LSWuW2IT2dN_7nYVA/edit?tab=t.0
>>
>> *The meeting covered both topics from the doc. Direction-level alignment
>> was reached on the headline pieces; details remain for PR review or
>> follow-up sessions.*
>>
>> *Topic 1 — Persistence schema redesign*
>> Idea-level alignment on consolidating per-type tables
>> (scan_metrics_report,
>> commit_metrics_report) into a single metrics_report table. The motivating
>> cost is the surface area added by every new metric type today: new table,
>> SPI method, record class, model, converter, schema migration.
>>
>> Most schema details are deferred to the schema PR. A few specific points
>> came up:
>> •   metric_schema_version: Yufei prefers dropping it, since there is no
>> spec-level concept of metrics versioning today and it is hard to define
>> unilaterally. Robert prefers keeping it, given IRC v2 is coming and the
>> schema should be considered against its likely shape; Robert also raised
>> how to differentiate various payload formats if any. EJ's read is that
>> this
>> is a two-way-door decision. We can start without the field, and if IRC v2
>> changes the shape we would likely roll a corresponding new schema anyway,
>> which is not particularly costly.
>> •   Payload format: Robert pointed out that future formats beyond JSON may
>> be worth supporting. The exact shape is deferred to the schema discussion.
>> •   Partition strategy: Anand suggested monthly partitioning based on his
>> experience as potentially helpful at scale.
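A minimal sketch of the monthly partitioning Anand suggested, assuming partitions are keyed on the UTC calendar month of a report's receive timestamp. The class name, the method, and the `metrics_report_YYYY_MM` naming scheme are hypothetical illustrations; nothing of the sort was agreed in the sync.

```java
import java.time.Instant;
import java.time.YearMonth;
import java.time.ZoneOffset;

// Hypothetical helper: maps a report's receive time to a monthly partition name.
final class MonthlyPartitions {
  private MonthlyPartitions() {}

  // Buckets by UTC calendar month so every node computes the same boundary.
  static String partitionFor(Instant receivedTs) {
    YearMonth month = YearMonth.from(receivedTs.atZone(ZoneOffset.UTC));
    // Assumed naming scheme: metrics_report_<year>_<two-digit month>.
    return String.format("metrics_report_%04d_%02d", month.getYear(), month.getMonthValue());
  }
}
```

For example, a report received at 2026-05-06T17:00:00Z would map to `metrics_report_2026_05`. Whether partitions are created ahead of time or on demand is left open here, as it was in the meeting.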
>>
>> *Topic 2 — Where metrics ingestion and storage belong*
>> Idea-level alignment that metrics should be a separate SPI from the
>> entity
>> persistence stack. Two reasons surfaced: (a) workloads and capability
>> requirements diverge enough that coupling them creates artificial
>> constraints, and (b) admin experience improves when metrics has its own
>> bootstrap, retention, and lifecycle. Dmitri noted that Polaris, as a platform,
>> should have the flexibility to support different persistence backends per
>> concern, and pointed to a concrete next step of separating the JDBC
>> bootstrap for metrics from the metastore bootstrap. Robert proposed an
>> additional UX extension: detect an unbootstrapped metrics store on first
>> use and auto-bootstrap rather than requiring an explicit manual bootstrap
>> step.
>> The meeting also confirmed that Polaris metrics can start small and stay
>> Iceberg-focused. Naming and persistence schema can lean Iceberg-specific.
>> If a future expansion to generic-table metrics or operational metrics
>> arrives, an abstraction layer can be built on top of the Iceberg metrics
>> reporter at that point. Robert remains on the fence and would prefer
>> something more generic but did not block the direction; Dmitri's read was
>> that the proposed framework already has enough flexibility to absorb
>> future
>> expansion.
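A rough sketch of the auto-bootstrap idea Robert raised above: a wrapper checks the metrics store on first use and bootstraps it if needed, so no manual step is required. All names here (`MetricsStoreSketch`, `AutoBootstrappingStore`, `InMemoryStore`) are hypothetical and are not existing Polaris interfaces; the in-memory store only stands in for a real backend.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical store abstraction; not an existing Polaris interface.
interface MetricsStoreSketch {
  boolean isBootstrapped();
  void bootstrap();
  void write(String reportJson);
}

// Wraps a store and bootstraps it lazily before the first write.
final class AutoBootstrappingStore implements MetricsStoreSketch {
  private final MetricsStoreSketch delegate;
  private volatile boolean ready;

  AutoBootstrappingStore(MetricsStoreSketch delegate) {
    this.delegate = delegate;
  }

  @Override
  public boolean isBootstrapped() { return delegate.isBootstrapped(); }

  @Override
  public void bootstrap() { delegate.bootstrap(); }

  @Override
  public void write(String reportJson) {
    ensureBootstrapped();
    delegate.write(reportJson);
  }

  // Double-checked so the lock is only taken until the first successful check.
  private void ensureBootstrapped() {
    if (ready) return;
    synchronized (this) {
      if (!ready) {
        if (!delegate.isBootstrapped()) delegate.bootstrap();
        ready = true;
      }
    }
  }
}

// In-memory stand-in used only to exercise the wrapper.
final class InMemoryStore implements MetricsStoreSketch {
  int bootstrapCalls;
  final List<String> rows = new ArrayList<>();
  private boolean bootstrapped;

  @Override public boolean isBootstrapped() { return bootstrapped; }
  @Override public void bootstrap() { bootstrapped = true; bootstrapCalls++; }
  @Override public void write(String reportJson) {
    if (!bootstrapped) throw new IllegalStateException("metrics store not bootstrapped");
    rows.add(reportJson);
  }
}
```

The sketch only coordinates within one process. Several Polaris instances starting against the same unbootstrapped store would still need the bootstrap itself to be idempotent.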
>>
>> The Trade-offs and Proposed structure sections in the doc were not
>> reviewed
>> in detail. They remain open for either the next sync or PR review.
>>
>> *Cross-cutting alignment — battery-included plus pluggable*
>> A common philosophy emerged from the discussion. EJ summarized it as:
>> Polaris should provide a battery-included UX for beginners and the
>> flexibility for advanced users to swap the included battery for something
>> more powerful or tailored to their use case. The SPI design needs to
>> enable
>> both.
>>
>> The inputs that shaped this framing:
>> •   Anand described how his team uses the current metrics persistence
>> (three metrics consumers in v1.4).
>> •   Yufei raised Grafana and dashboard integrations as a destination use
>> case beyond the default.
>> •   Robert called out that the current design is more JDBC-focused.
>>
>> Two concrete instances:
>> •   Async metrics intake: Yufei's initial position was that async should
>> largely live on the producer side and there is not much Polaris can do.
>> Robert suggested a Polaris-side default is doable via Vert.x. Dmitri
>> agreed
>> the direction is worth exploring. The meeting converged on a
>> battery-included default (likely Vert.x-backed) with an SPI shape that
>> lets
>> power users route to a more scalable backend (k8s-hosted queue, AWS SQS,
>> etc.).
>> •   Pluggable destinations: combining Yufei's dashboard use case with
>> Robert's JDBC-focused call-out, the meeting agreed the SPI should be
>> structured for multiple sinks so integrations become impl choices rather
>> than architectural changes.
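One way to read "structured for multiple sinks" is a small sink interface plus a fan-out wrapper, sketched below. The `MetricsSink` name and signature are assumptions for illustration only. The fan-out is synchronous and does not model the Vert.x-backed async intake discussed above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sink SPI; the name and signature are illustrative only.
interface MetricsSink {
  void accept(String metricType, String payloadJson);
}

// Delivers each report to every configured sink; one failing sink does not block the rest.
final class FanOutSink implements MetricsSink {
  private final List<MetricsSink> sinks;

  FanOutSink(List<MetricsSink> sinks) {
    this.sinks = List.copyOf(sinks);
  }

  @Override
  public void accept(String metricType, String payloadJson) {
    for (MetricsSink sink : sinks) {
      try {
        sink.accept(metricType, payloadJson);
      } catch (RuntimeException e) {
        // A real implementation would log or count the failure; the sketch only isolates it.
      }
    }
  }
}

// Stand-in for a JDBC-backed default or a dashboard exporter: records what it receives.
final class RecordingSink implements MetricsSink {
  final List<String> received = new ArrayList<>();

  @Override
  public void accept(String metricType, String payloadJson) {
    received.add(metricType + ":" + payloadJson);
  }
}
```

With this shape, adding a Grafana-style destination would mean registering one more `MetricsSink` implementation instead of changing the intake path.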
>>
>> The battery-included default is most likely to use the existing
>> JDBC-backed
>> approach.
>>
>> *Direction (idea-level alignment)*
>> •   Single metrics_report table consolidating per-type metrics, replacing
>> scan_metrics_report and commit_metrics_report
>> •   Iceberg-focused naming and schema for now, revisit if generic-table or
>> operational metrics arrive
>> •   Metrics persistence as a separate SPI, not on BasePersistence
>> •   Bootstrap path separated for metrics, independent of metastore
>> bootstrap
>> •   "Battery-included plus pluggable" as the SPI design philosophy
>>
>> *Open items*
>> •   Schema details: metric_schema_version, payload format, IRC v2
>> forward-compat shape
>> •   SPI design details — full review either in the next sync or in the
>> corresponding PR
>> •   Schema refactor PR ownership
>>
>> *Action items*
>> •   EJ to take a first stab at the SPI design and potentially partner with
>> Anand to incorporate the lessons learned from the existing reporter and
>> persistence work.
>> •   Schema refactor PR ownership is not yet decided. If anyone is
>> interested in driving it, reply on this thread.
>> •   JB to schedule the next sync, tentatively in two weeks.
>>
>> -ej
>>
>> On Mon, Apr 27, 2026 at 3:07 PM EJ Wang <[email protected]>
>> wrote:
>>
>> > Thanks Yufei for the +1.
>> >
>> > JB, could you help add a biweekly metrics architecture sync to the
>> Polaris
>> > community calendar? I'm thinking Thursdays at 9-10am PT, on the
>> off-weeks
>> > from the community meeting (starting May 7), 60 minutes.
>> >
>> > Here's a rough agenda to work through over the first few sessions,
>> grouped
>> > by priority:
>> >
>> > *First: foundational direction*
>> >
>> > 1.  MetricsPersistence: public SPI or internal implementation detail?
>> >    •   Marked @Beta, javadoc calls it a "Service Provider Interface",
>> but
>> > only one consumer (JdbcBasePersistenceImpl), lives on BasePersistence.
>> If
>> > demoted to a private helper inside a persisting reporter impl, most
>> > downstream design decisions become implementation details rather than
>> > contract questions.
>> >
>> > 2.  Persistence schema redesign
>> >    •   Current two-table layout (scan_metrics_report,
>> > commit_metrics_report) with ~25 flattened columns each. Every new metric
>> > type requires a new table, SPI method, record class, model, converter,
>> and
>> > schema migration. Direction to explore: single table with metric_type
>> enum,
>> > schema_version, and JSON payload column.
>> >
>> > *Second: design details once direction is set*
>> >
>> > 3.  Partition key strategy
>> >    •   Single-table design means scan metrics at scale will have high
>> > write concurrency per table. Schema needs to expose enough structure for
>> > backends to shard by entity or time range.
>> >
>> > 4.  Read/write path consistency
>> >    •   Writes go through PolarisMetricsManager on MetaStoreManager.
>> Reads
>> > bypass MetaStoreManager and go straight to BasePersistence, excluding
>> > non-JDBC backends from the read API.
>> >
>> > *Third: cleanup and alignment*
>> >
>> > 5.  PolarisMetricsReporter naming
>> >    •   Only handles IRC (ScanReport/CommitReport), doesn't cover generic
>> > tables or operational metrics. Name is broader than scope.
>> >
>> > 6.  PolarisMetricsManager facade passthrough
>> >    •   Entire default method is
>> callCtx.getMetaStore().writeScanReport().
>> > Zero logic, passes Level 1 straight through to Level 3. Same
>> anti-pattern
>> > as PolarisEventManager.
>> >
>> > 7.  Iceberg community alignment
>> >    •   Payload-type extension needs discussion on dev@iceberg.
>> obelix74's
>> > Feb thread got zero replies. Needs a committer voice.
>> >
>> > Let's confirm prioritization in the first session.
>> >
>> > -ej
>> >
>> > On Tue, Apr 21, 2026 at 3:18 PM Yufei Gu <[email protected]> wrote:
>> >
>> >> Thanks everyone for continuing to drive this forward. I agree that the
>> >> problem is getting complex enough that a more structured discussion
>> would
>> >> help.
>> >>
>> >> +1 on setting up a biweekly sync for the metrics architecture. I’m
>> happy
>> >> to
>> >> join.
>> >>
>> >> Yufei
>> >>
>> >>
>> >> On Tue, Apr 21, 2026 at 2:34 PM EJ Wang <
>> [email protected]>
>> >> wrote:
>> >>
>> >> > Also, I've been looking more closely at the *persistence schema in
>> the
>> >> > current metrics work*, and I think there's a structural rigidity
>> problem
>> >> > worth raising before the shape gets locked in.
>> >> >
>> >> > Right now we have two separate tables (scan_metrics_report and
>> >> > commit_metrics_report), each with ~25 flattened columns that directly
>> >> > mirror the Iceberg report fields. The SPI follows the same split:
>> >> > writeScanReport and writeCommitReport as separate methods, with
>> per-type
>> >> > record classes, converters, and model objects. *The practical cost:
>> >> > adding a new metric type (operational metrics, for example) requires
>> a
>> >> new
>> >> > table, a new SPI method, a new record class, a new model class, a new
>> >> > converter branch, and a schema migration*. That's a lot of surface
>> area
>> >> > for what should be "one more kind of metric."
>> >> >
>> >> > *My bias* would be toward a single metrics table with *a typed JSON
>> >> > payload*. Something like: metric_type (enum), entity_id,
>> >> > table_identifier, snapshot_id (nullable), received_ts,
>> schema_version,
>> >> and
>> >> > a payload column for the metric-specific data. The metric_type +
>> >> > schema_version pair gives us a forward-compatible contract for the
>> >> payload
>> >> > shape. Adding a new metric type becomes an enum value and a payload
>> >> schema,
>> >> > not a schema migration. One thing I think we need to be deliberate
>> >> about is
>> >> > the partition key design. If all metric types land in one table, scan
>> >> > metrics at scale (high concurrency, high frequency across many
>> tables)
>> >> > could easily create hot partitions. We'd want the persistence layer
>> to
>> >> be
>> >> > able to shard by entity or time range, and that means the logical
>> schema
>> >> > needs to expose enough structure for backends to partition on. I
>> don't
>> >> > think the current flattened layout gives us that.
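To make the single-table proposal above concrete, here is a hedged sketch of one row as a Java record, using the columns listed in the message. The Java types (for example `long` for `entity_id`), the record name, and the `payloadContract` helper are assumptions, not the schema under review.

```java
import java.time.Instant;
import java.util.Objects;

// Hypothetical enum of metric kinds; a new kind is one more constant, not a new table.
enum MetricType { SCAN, COMMIT }

// Hypothetical row shape following the columns named in the proposal.
record MetricsReportRow(
    MetricType metricType,
    long entityId,            // assumed numeric id
    String tableIdentifier,
    Long snapshotId,          // nullable, as in the proposal
    Instant receivedTs,
    int schemaVersion,
    String payload) {         // metric-specific data, e.g. a JSON document

  MetricsReportRow {
    Objects.requireNonNull(metricType, "metricType");
    Objects.requireNonNull(tableIdentifier, "tableIdentifier");
    Objects.requireNonNull(receivedTs, "receivedTs");
    Objects.requireNonNull(payload, "payload");
  }

  // The (metric_type, schema_version) pair tells a reader which payload schema to apply.
  String payloadContract() {
    return metricType + "/v" + schemaVersion;
  }
}
```

A scan report and a commit report would then differ only in `metricType` and in the contents of `payload`, which is what lets a new metric type skip the schema migration.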
>> >> >
>> >> > This is getting complex enough that I don't think ad-hoc PR/ML
>> threads
>> >> > will converge well. *Would people be open to a biweekly sync for
>> metrics
>> >> > architecture?* I think 30 minutes every two weeks with interested
>> >> parties
>> >> > would be enough to work through the schema, SPI shape, and read API
>> >> design
>> >> > together. Happy to help set that up.
>> >> >
>> >> > -ej
>> >> >
>> >> > On Mon, Apr 20, 2026 at 2:19 PM EJ Wang <
>> [email protected]
>> >> >
>> >> > wrote:
>> >> >
>> >> >> Reviewed #4115, left a comment on the code organization side.
>> >> >>
>> >> >> One thing stood out: the metrics write path enters through
>> >> >> PolarisMetricsManager on MetaStoreManager, but the new read path
>> >> bypasses
>> >> >> MetaStoreManager entirely and goes straight to BasePersistence via
>> >> >> callContext.getMetaStore(). That means the read API only works for
>> >> backends
>> >> >> that implement BasePersistence. NoSQL and remote backends can't
>> >> participate.
>> >> >>
>> >> >> Stepping back, I think the metrics subsystem is growing into
>> something
>> >> >> real (write + read + REST API + AuthZ + pagination) *but the
>> >> persistence
>> >> >> side is split across two layers in a way that's hard to extend*. I
>> put
>> >> >> together two diagrams to show what I mean (my best effort).
>> >> >>
>> >> >> *Current state* (Diagram 1): three interfaces at three different
>> >> levels.
>> >> >> The engine-facing SPI (PolarisMetricsReporter) is clean. But
>> >> >> PolarisMetricsManager on MetaStoreManager is a passthrough to
>> >> >> MetricsPersistence on BasePersistence. The @Beta annotation and SPI
>> >> javadoc
>> >> >> are on the BasePersistence layer, while the actual extension points
>> >> >> (PolarisMetricsReporter, PolarisMetricsManager) carry no stability
>> >> >> annotation. The write path goes through the MetaStoreManager layer,
>> the
>> >> >> read path doesn't.
>> >> >>
>> >> >> *What I envision* (Diagram 2): two SPIs at two levels.
>> >> >> PolarisMetricsReporter stays as the engine-facing SPI.
>> >> >> PolarisMetricsManager becomes the backend-facing SPI with both write
>> >> and
>> >> >> read methods at the MetaStoreManager level, where any backend (JDBC,
>> >> NoSQL,
>> >> >> remote) can implement them. MetricsPersistence on BasePersistence
>> goes
>> >> >> away. Where metrics actually land is an implementation detail, not a
>> >> core
>> >> >> interface.
>> >> >>
>> >> >> *Minor naming thing*: PolarisMetricsReporter is broader than what it
>> >> >> actually handles. It only accepts Iceberg REST Catalog metrics
>> >> (ScanReport,
>> >> >> CommitReport via MetricsReport). Generic table metrics or
>> operational
>> >> >> metrics aren't in scope. Not blocking, but worth noting if the
>> metrics
>> >> >> surface expands.
>> >> >>
>> >> >> *Rough sketch of how to get there*:
>> >> >>  1.  Add read methods to PolarisMetricsManager (listScanReports,
>> >> >> listCommitReports) with default no-op, same as the existing write
>> >> methods.
>> >> >> (Probably make PolarisMetricsManager more explicit on being Iceberg
>> >> >> specific like package name or class name etc.)
>> >> >>  2.  Wire MetricsReportsService through MetaStoreManager instead of
>> >> >> callContext.getMetaStore().
>> >> >>  3.  Extract metrics persistence from JdbcBasePersistenceImpl into
>> its
>> >> >> own class. That file carries ~7 responsibilities, metrics being one
>> of
>> >> them.
>> >> >>  4.  Remove MetricsPersistence from BasePersistence.
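Step 1 above could look roughly like the sketch below: read methods added next to the write methods, all with no-op defaults so existing backends keep compiling. The real PolarisMetricsManager signatures are not shown in this thread, so the interface name, parameter lists, and String-based reports are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical shape only; not the actual PolarisMetricsManager signatures.
interface IcebergMetricsManagerSketch {
  // Write methods with no-op defaults.
  default void writeScanReport(String tableIdentifier, String reportJson) {}
  default void writeCommitReport(String tableIdentifier, String reportJson) {}

  // Proposed read methods: the default reports "nothing stored".
  default List<String> listScanReports(String tableIdentifier, int limit) {
    return List.of();
  }
  default List<String> listCommitReports(String tableIdentifier, int limit) {
    return List.of();
  }
}

// A backend opts in by overriding only what it supports (scan reports here).
final class InMemoryMetricsManager implements IcebergMetricsManagerSketch {
  private final Map<String, List<String>> scans = new HashMap<>();

  @Override
  public void writeScanReport(String tableIdentifier, String reportJson) {
    scans.computeIfAbsent(tableIdentifier, k -> new ArrayList<>()).add(reportJson);
  }

  @Override
  public List<String> listScanReports(String tableIdentifier, int limit) {
    List<String> all = scans.getOrDefault(tableIdentifier, List.of());
    int end = Math.max(0, Math.min(limit, all.size()));
    return List.copyOf(all.subList(0, end));
  }
}
```

Because both paths live on the same interface, a NoSQL or remote backend could serve reads without implementing BasePersistence, which is the gap the review comment points at.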
>> >> >>
>> >> >> *None of this needs to happen in #4115. But if the direction makes
>> >> sense,
>> >> >> it would be good to align before the metrics surface grows further.
>> >> Curious
>> >> >> what others think.*
>> >> >>
>> >> >> *My mental model note*: Level 1 MetaStoreManager; level 2
>> transactional
>> >> >> persistence; level 3 base persistence
>> >> >>
>> >> >> Diagram 1
>> >> >> <
>> >>
>> https://www.plantuml.com/plantuml/uml/bLHDR-Cs4BthLmpIYupw0zbkKQ1r3M-S7Bp8xhhM7WCOb3IM65EaGD9EX2RzxHrHb4CxRelwa4YSDu_lpOVcnZ9jzvM8BBS2uGjQpJC3dtHMSekPtMk44IpsMgEqa5XcCOhCZikQQLP1pR8TAp2n3ILhmZDP20m0fcIvUkAoW2qJXd9z1bpToO9BX3WXu0ucy5rpgGPNm0nW5_epUWtm2Ue3pn3kMOFQmKntGZW0BYtgBSi8k5A2QMwybJNMIbFiGSR9QZc4nUqIvikStF0jHprua5C-amge42aNt3R0f5JaaoivdV2Pkqbx4hee4ymOkBh5BTiB-_uIeGeo8zL8rPsPl4DktdEiK1jkB1NdZCRbrSTecDe_mlHbF0wvBmCkaOH5_S8a_TTTKI6-nmCAkEw4LpxsZ-LbYLKQFKMNOgf_wuM7_bV9gOer5SYMMksBSWXFcbi49KNZXNLicwfe3TETC7gPdPqI7uBcHMb1RSzYq34c6PDUM9mn8HRsUTZEiDBve3NjVZumBj0U7SS37mGO7vcwtiK-_pU7U7L_f-digo9YbhSwIfMRwIITKGXbxdIUTCGF1SeCJxloKsU-3k9ddRbX1eDq1q_fx1JbBGT0glVyXimDuP4TQ5qpCAmnGEj2s_6n5mtn1z-97-63itFQZLPO1Ev2tu_WF7Ju-VPc0Skg5bYXxBhkY1xpD7EM_7fyflSpIsqMgVth5xhVr4eQxWQ8enaSAJQSG16yFSDuJ798rrcXr_3n-lfdk7icQjEBmFujL7AodiP_Y4Z7-YxvtZNs4zMgpNTl6tF8sglyPsmqchrjvQ-m-aP94r-TwCA2Ka8upPJZwtvSpoYCXkYMZU2NXvRMBfq9P3i3Le4VAZUAlUZ_oPKsxPgY0Q_BSKLkyr9bhQhQrJjo_x3TPlIB0DPjnMfcIoYP0QaYw1a0fTKDr8fB6ntNuvmoL1ZGkXa69Njh43zf9GiGxHQrA_jDYWRSzF5--WmTVrN97_Sm8LbLUy_lGBmLanJjFkDlGkRqjA_4tm00
>> >> >
>> >> >> :
>> >> >>
>> >> >> [image: image.png]
>> >> >>
>> >> >> Diagram 2
>> >> >> <
>> >>
>> https://www.plantuml.com/plantuml/uml/VLLDR-8m4BtdLupO2sWBLVU8AaGB7AXAbssGzb896SS9RXqxjKqBwkv_tt7iV43fdaZYDpFlpRmnOsE9jhjSH9PRmM31hERKm8scMsuPjJlDe0yheZDc8RR4iYWoBrmMH9CS2a9VICPYUy1OZN0YCy5Q0BCbYNhdCeEK28En8G8wCvbnoQ0R8_05Bc6bkLIz3X03p1zzH7zR-9ZfDquPt9C3qoNCX2yV4G2NbkcKu5jdgGJHt0GbZwnG6i-UP3TUpk5gM6Ldqke350eZUqzoCft3U9xWHvxoa5-7K4nF1J46EbEMafsmdrCBbQ44gVggy18IZrn_ph5asd1ZiIKdQSgueZvjXrQFSFrdC3YN-nXmBacxbGiYyLVxLaBtdhqn0LSzdBDhqQtQoOJeGyad3z0lUqnYgpGB6Ns8oVyta00Dy_WnX0tIOZ8v6SYxHll1TrH6aejAik-mh-AphVFCwSUQqFypElag5QRGFDjQKEd96K1P8QP41c9TzA_IIQyvdAWyv_RSiS3skb0_EzDDkK2v5xWF6MiGFlvhpFLcD2Dq2pml14gaF67eQkmd8gulDoC4kSOu6KVpkvlUJg1RTbWISU40RdBUUS_9XfRZ2dwxm_SW8LYFISgm_MnlDQ6M9P1gbKEc4X-2pH_FvJCkCqm9pbVjD6LrwdLeOrDWfOaqc8Wh9BE85oNKxkNQ6o4yGRy_Eae0G_G8tZv81d3bHDB23WOdisohVr3nh_j6lbSjbNaLRTc8UgtPbAU1J_tygOfZX9DWEJeHDvYx-qmSi5FgNLPZwHrHcUsncGQ5-skhUclpE5fo4ounpFauYrUbkU6ccfnxMvitwag4IyerhTxj8In_Oj1bDO4pQru674loYrGlULHLEGCjwJJ8gDoVZR8MxO4BT3IzRvIcAQKezC6xpziGnTyImrfEGyJI_OcKfgtxIvnTqFEMS17L9Z-jsARN5FmTheP7HtSdtOMT0B4GY2FYHXxgQmMtj2bRqiLFGapiVe1_QVKDrkqXcm83aFEXnMYCZ-xlyHy
>> >> >
>> >> >> :
>> >> >> [image: image.png]
>> >> >>
>> >> >>  -ej
>> >> >>
>> >> >> On Wed, Apr 15, 2026 at 8:22 AM Dmitri Bourlatchkov <
>> [email protected]>
>> >> >> wrote:
>> >> >>
>> >> >>> Hi All,
>> >> >>>
>> >> >>> Heads up: The current state of PR [4115] looks pretty solid to me.
>> I
>> >> >>> believe this PR is approaching a mergeable condition.
>> >> >>>
>> >> >>> Please post your reviews if you have any comments.
>> >> >>>
>> >> >>> [4115] https://github.com/apache/polaris/pull/4115
>> >> >>>
>> >> >>> Thanks,
>> >> >>> Dmitri.
>> >> >>>
>> >> >>> On Tue, Mar 3, 2026 at 3:29 PM Anand Kumar Sankaran via dev <
>> >> >>> [email protected]> wrote:
>> >> >>>
>> >> >>> > Hi Yufei and Dmitri,
>> >> >>> >
>> >> >>> > Here is a proposal for the REST endpoints for metrics and events.
>> >> >>> >
>> >> >>> > https://github.com/apache/polaris/pull/3924/changes
>> >> >>> >
>> >> >>> > I did not see any precedents for raising a PR for proposals, so
>> >> trying
>> >> >>> > this.  Please let me know what you think.
>> >> >>> >
>> >> >>> > -
>> >> >>> > Anand
>> >> >>> >
>> >> >>> > From: Anand Kumar Sankaran <[email protected]>
>> >> >>> > Date: Monday, March 2, 2026 at 10:25 AM
>> >> >>> > To: [email protected] <[email protected]>
>> >> >>> > Subject: Re: Polaris Telemetry and Audit Trail
>> >> >>> >
>> >> >>> > About the REST API, based on my use cases:
>> >> >>> >
>> >> >>> >
>> >> >>> >   1.
>> >> >>> > I want to be able to query commit metrics to track files added /
>> >> >>> removed
>> >> >>> > per commit, along with record counts. The ingestion pipeline that
>> >> >>> writes
>> >> >>> > this data is owned by us and we are guaranteed to write this
>> >> >>> information
>> >> >>> > for each write.
>> >> >>> >   2.
>> >> >>> > I want to be able to query scan metrics for read. I understand
>> >> clients
>> >> >>> do
>> >> >>> > not fulfill this requirement.
>> >> >>> >   3.
>> >> >>> > I want to be able to query the events table (events are
>> persisted) -
>> >> >>> this
>> >> >>> > may supersede #2, I am not sure yet.
>> >> >>> >
>> >> >>> > All this information is in the JDBC based persistence model and
>> is
>> >> >>> > persisted in the metastore. I currently don’t have a need to
>> query
>> >> >>> > prometheus or open telemetry. I do publish some events to
>> Prometheus
>> >> >>> and
>> >> >>> > they are forwarded to our dashboards elsewhere.
>> >> >>> >
>> >> >>> > About the CLI utilities, I meant the admin user utilities. In
>> one of
>> >> >>> the
>> >> >>> > earliest drafts of my proposal, Prashant mentioned that the
>> metrics
>> >> >>> tables
>> >> >>> > can grow indefinitely and that a similar problem exists with the
>> >> events
>> >> >>> > table as well. We discussed that cleaning up of old records from
>> >> both
>> >> >>> > metrics tables and events tables can be done via a CLI utility.
>> >> >>> >
>> >> >>> > I see that Yufei has covered the discussion about datasources.
>> >> >>> >
>> >> >>> > -
>> >> >>> > Anand
>> >> >>> >
>> >> >>> >
>> >> >>> >
>> >> >>> > From: Yufei Gu <[email protected]>
>> >> >>> > Date: Friday, February 27, 2026 at 9:54 PM
>> >> >>> > To: [email protected] <[email protected]>
>> >> >>> > Subject: Re: Polaris Telemetry and Audit Trail
>> >> >>> >
>> >> >>> > As I mentioned in
>> >> >>> >
>> >> >>>
>> >>
>> https://github.com/apache/polaris/issues/3890
>> >> >>> ,
>> >> >>> > supporting
>> >> >>> > multiple data sources is not a trivial change. I would strongly
>> >> >>> recommend
>> >> >>> > starting with a design document to carefully evaluate the
>> >> architectural
>> >> >>> > implications and long term impact.
>> >> >>> >
>> >> >>> > A REST endpoint to query metrics seems reasonable given the
>> current
>> >> >>> JDBC
>> >> >>> > based persistence model. That said, we may also consider
>> alternative
>> >> >>> > storage models. For example, if we later adopt a time series
>> system
>> >> >>> such as
>> >> >>> > Prometheus to store metrics, the query model and access patterns
>> >> would
>> >> >>> be
>> >> >>> > fundamentally different. Designing the REST API without
>> considering
>> >> >>> these
>> >> >>> > potential evolutions may limit flexibility. I'd suggest starting
>> >> >>> > with the use case.
>> >> >>> >
>> >> >>> > Yufei
>> >> >>> >
>> >> >>> >
>> >> >>> > On Fri, Feb 27, 2026 at 3:42 PM Dmitri Bourlatchkov <
>> >> [email protected]>
>> >> >>> > wrote:
>> >> >>> >
>> >> >>> > > Hi Anand,
>> >> >>> > >
>> >> >>> > > Sharing my view... subject to discussion:
>> >> >>> > >
>> >> >>> > > 1. Adding non-IRC REST API to Polaris is perfectly fine.
>> >> >>> > >
>> >> >>> > > Figuring out specific endpoint URIs and payloads might require
>> a
>> >> few
>> >> >>> > > roundtrips, so opening a separate thread for that might be
>> best.
>> >> >>> > > Contributors commonly create Google Docs for new API proposals
>> too
>> >> >>> (they
>> >> >>> > > are fairly easy to update as the email discussion progresses).
>> >> >>> > >
>> >> >>> > > There was a suggestion to try Markdown (with PRs) for proposals
>> >> [1]
>> >> >>> ...
>> >> >>> > > feel free to give it a try if you are comfortable with that.
>> >> >>> > >
>> >> >>> > > 2. Could you clarify whether you mean end user utilities or
>> admin
>> >> >>> user
>> >> >>> > > utilities? In the latter case those might be more suitable for
>> the
>> >> >>> Admin
>> >> >>> > > CLI (java) not the Python CLI, IMHO.
>> >> >>> > >
>> >> >>> > > Why would these utilities be common with events? IMHO, event
>> use
>> >> >>> cases
>> >> >>> > are
>> >> >>> > > distinct from scan/commit metrics.
>> >> >>> > >
>> >> >>> > > 3. I'd prefer separating metrics persistence from MetaStore
>> >> >>> persistence
>> >> >>> > at
>> >> >>> > > the code level, so that they could be mixed and matched
>> >> >>> independently.
>> >> >>> > The
>> >> >>> > > separate datasource question will become a non-issue with that
>> >> >>> approach,
>> >> >>> > I
>> >> >>> > > guess.
>> >> >>> > >
>> >> >>> > > The rationale for separating scan metrics and metastore
>> >> persistence
>> >> >>> is
>> >> >>> > that
>> >> >>> > > "cascading deletes" between them are hardly ever required.
>> >> >>> Furthermore,
>> >> >>> > the
>> >> >>> > > data and query patterns are very different so different
>> >> technologies
>> >> >>> > might
>> >> >>> > > be beneficial in each case.
>> >> >>> > >
>> >> >>> > > [1]
>> >> >>> >
>> >> >>>
>> >>
>> https://lists.apache.org/thread/yto2wp982t43h1mqjwnslswhws5z47cy
>> >> >>> > >
>> >> >>> > > Cheers,
>> >> >>> > > Dmitri.
>> >> >>> > >
>> >> >>> > > On Fri, Feb 27, 2026 at 6:19 PM Anand Kumar Sankaran via dev <
>> >> >>> > > [email protected]> wrote:
>> >> >>> > >
>> >> >>> > > > Thanks all. This PR is merged now.
>> >> >>> > > >
>> >> >>> > > > Here are the follow-up features / work needed.  These were
>> all
>> >> >>> part of
>> >> >>> > > the
>> >> >>> > > > merged PR at some point in time and were removed to reduce
>> >> scope.
>> >> >>> > > >
>> >> >>> > > > Please let me know what you think.
>> >> >>> > > >
>> >> >>> > > >
>> >> >>> > > >   1.  A REST API to paginate through table metrics. This will
>> >> >>> > > > be a non-IRC-standard addition.
>> >> >>> > > >   2.  Utilities for managing old records, should be common
>> with
>> >> >>> events.
>> >> >>> > > > There was some discussion that it belongs to the CLI.
>> >> >>> > > >   3.  Separate datasource (metrics, events, even other
>> tables?).
>> >> >>> > > >
>> >> >>> > > >
>> >> >>> > > > Anything else?
>> >> >>> > > >
>> >> >>> > > > -
>> >> >>> > > > Anand
>> >> >>> > > >
>> >> >>> > > >
>> >> >>> > >
>> >> >>> >
>> >> >>> >
>> >> >>>
>> >> >>
>> >>
>> >
>>
>
