Hello! Just want to add my 2 cents here.

Using external engines to handle heavy I/O operations looks well justified. On the other hand, it was mentioned that some metrics could be handled by Polaris itself, though something still needs to control when they are calculated, pushed to storage, etc.
Maybe we could use an approach in between the 1st and 2nd versions of Pierre's proposal:

- Polaris controls which metrics are calculated and when (benefiting from event listeners).
- Polaris delegates computation to external engines if needed (SPI or API?).
- Polaris pushes metrics to persistent storage via SPI.
- Polaris provides an Open API to request metrics and force a refresh; we could probably still allow externally calculated metrics to be pushed as well.

This way we would have a more centralized setup for different kinds of metrics and better control over the granularity of incremental updates (we probably will not want to update metrics after each small commit). It would also provide some flexibility: e.g. there were some proposals in the Iceberg community to enrich metadata with metrics like histograms, so Polaris could choose between calculating internally and externally depending on whether those metrics are available in metadata.

On the downside, of course, such a solution will complicate Polaris' runtime, but IMO it is still worth considering.

Oleg

On Mon, Sep 29, 2025 at 4:48 PM Pierre Laporte <[email protected]> wrote:

> Hi Yufei, thanks for the feedback
>
> > Just to confirm, this is not a Polaris event listener but the Iceberg Event
> > REST endpoint(WIP), right? If we are using the Polaris event listener, we
> > still have to figure out the protocol between the delegation service
> > clients and servers, which are described in William's doc.
> >
>
> The proposal does not include any sort of triggering system. So there is
> no single answer to your question. I was merely trying to explore possible
> implementation ideas. But keep in mind that this can come at a later time,
> as we first have to define how Polaris defines operational metrics and deal
> with them, before we can consider how external systems could integrate
> them.
>
> > To be clear, I think using the Iceberg Event REST endpoint is a good idea,
> > it decouples the external service nicely, but we may have to wait for a
> > while, as it's still WIP in the Iceberg community.
> >
>
> Exactly. I do not recommend adding a dependency between this proposal and
> other proposals, unless those are strictly necessary.
>
> > Other than that, the SPI interface design seems missing in the doc. That's
> > an essential part of the metrics persistence. I think we will need more
> > interface details to move forward.
> >
>
> Note that the SPI cannot define how metrics are persisted. It is an
> interface that should be extended so that metrics are persisted against a
> certain database, and using a certain format. I do not think Polaris
> should force a certain storage system for operational metrics.
>
> Could you list the information you would like to see added to the
> document? I am having difficulties understanding the ask.
>
> --
>
> Pierre
>
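
To make the SPI discussion a bit more concrete, here is a rough sketch of how the pieces could fit together: a computation SPI, a persistence SPI that says nothing about the storage system or format, and a coordinator that decides when metrics are refreshed. All names (MetricsComputer, MetricsSink, MetricsCoordinator, MetricRecord) and the 10-minute interval are made up for illustration; none of them exist in Polaris, and this is not meant as a concrete design.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; none of these types exist in Polaris today.
public class MetricsSketch {

    /** One computed metric value for a table. */
    record MetricRecord(String table, String name, double value, Instant computedAt) {}

    /** Computation SPI: could run in-process or call out to an external engine. */
    interface MetricsComputer {
        List<MetricRecord> compute(String table, Instant now);
    }

    /** Persistence SPI: states what is stored, not where or in which format. */
    interface MetricsSink {
        void write(List<MetricRecord> records);
    }

    /** Toy sink that keeps the latest value per table/metric in memory. */
    static final class InMemorySink implements MetricsSink {
        final Map<String, MetricRecord> latest = new HashMap<>();

        @Override
        public void write(List<MetricRecord> records) {
            for (MetricRecord r : records) {
                latest.put(r.table() + "/" + r.name(), r);
            }
        }
    }

    /** Controls when metrics are refreshed (single-threaded sketch, not thread-safe). */
    static final class MetricsCoordinator {
        private final MetricsComputer computer;
        private final MetricsSink sink;
        private final Duration minInterval;
        private final Map<String, Instant> lastRun = new HashMap<>();

        MetricsCoordinator(MetricsComputer computer, MetricsSink sink, Duration minInterval) {
            this.computer = computer;
            this.sink = sink;
            this.minInterval = minInterval;
        }

        /** Would be called from a commit event listener; returns true if a refresh ran. */
        boolean onCommit(String table, Instant now) {
            Instant prev = lastRun.get(table);
            if (prev != null && Duration.between(prev, now).compareTo(minInterval) < 0) {
                return false; // too soon after the previous refresh, skip this commit
            }
            refresh(table, now);
            return true;
        }

        /** Would back a "force refresh" API call; ignores the interval. */
        void refresh(String table, Instant now) {
            sink.write(computer.compute(table, now));
            lastRun.put(table, now);
        }
    }

    public static void main(String[] args) {
        InMemorySink sink = new InMemorySink();
        // Stand-in for a real computation (internal or delegated).
        MetricsComputer fake =
            (table, now) -> List.of(new MetricRecord(table, "row_count", 42.0, now));
        MetricsCoordinator coordinator =
            new MetricsCoordinator(fake, sink, Duration.ofMinutes(10));

        Instant t0 = Instant.parse("2025-09-29T16:00:00Z");
        System.out.println(coordinator.onCommit("db.t", t0));                  // true
        System.out.println(coordinator.onCommit("db.t", t0.plusSeconds(60)));  // false
        System.out.println(coordinator.onCommit("db.t", t0.plusSeconds(600))); // true
    }
}
```

With something along these lines, a JDBC- or object-store-backed MetricsSink would be a deployment choice, and the coordinator is the only place that knows about refresh granularity.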
