Hi all,

Very sorry for the late reply - this week has been busy.

I was (and still somewhat am) in favor of strongly-typed events. I had formed my opinion based on other systems that do consume their own events later in their execution. It seems we do not have this use case yet, and it is not on the near horizon either, as Dmitri has noted.

However, my one remaining concern with keeping PolarisEvents as a flattened "bag of properties" is that, unless we have comprehensive per-event testing (which defeats the whole point of removing the strongly-typed event structure), we may be vulnerable to typos and inconsistent naming, which could effectively render the unified filtering/pruning mechanisms useless.

As a result, I propose the following (building on Alex's proposal) to move this conversation forward: the new method signature would be `Map<PolarisEvent.EventPropertyType, Object> attributes()`, where EventPropertyType is an enum defined within PolarisEvent that contains all the different types of properties an event could have.

Edge case call-out: special care will be needed for events such as (Before/After)CommitTableEvent, which have metadata objects for both before AND after. These can be modeled using two separate EventPropertyType values: one for metadataBefore and one for metadataAfter. All other events that only generate an "after" metadata object should store their metadata in "metadataAfter" and leave "metadataBefore" unset, just like any other unused property. This may slightly complicate the unified filtering/pruning logic, but this is, IMO, an acceptable trade-off.

WDYT?

Best,
Adnan Hemani

On Fri, Nov 14, 2025 at 1:48 AM Oleg Soloviov <[email protected]> wrote:

> Hi all,
>
> It looks like we have a lazy consensus on this proposal. If that's the case
> and there are no further objections, I would like to work on this one.
>
> Thanks,
> Oleg
>
> On Sat, Nov 8, 2025 at 12:13 AM Dmitri Bourlatchkov <[email protected]> wrote:
>
> > Hi Alex,
> >
> > I agree that using a flat (single class?) type hierarchy for events on the
> > server side is reasonable. Polaris Server itself does not appear to "read"
> > the events it produces, so maintaining the multitude of getters does seem
> > like an unnecessary overhead.
> > At the same time producing well-structured payloads for delivering events
> > to external systems (including persistence in the Polaris database) can be
> > achieved without a verbose type hierarchy.
> >
> > Cheers,
> > Dmitri.
> >
> > On Fri, Nov 7, 2025 at 11:30 AM Alexandre Dutra <[email protected]> wrote:
> >
> > > Hi all,
> > >
> > > I'm writing to express my concerns about the current state of the
> > > PolarisEvent API and to propose a solution.
> > >
> > > Current challenges:
> > >
> > > 1) Excessive complexity: the PolarisEvent interface currently has over
> > > 150 concrete subtypes, with a corresponding number of methods in the
> > > PolarisEventListener interface. This forces each concrete listener to
> > > implement all 150+ methods, even when the logic is similar or
> > > identical, leading to significant boilerplate (see example [1] from a
> > > recent PR).
> > >
> > > 2) Manual processes: afaik the current plan for event pruning (e.g.,
> > > removing sensitive or large data) is to implement this event by event.
> > > This has been a slow process so far. We only have 2-3 events
> > > implemented, we still have 147 more to go.
> > >
> > > While I generally advocate for strongly typed APIs, I believe that in
> > > this specific context, the PolarisEvent hierarchy is slowing down the
> > > development of event-related features.
> > >
> > > Do we need so many subtypes? Events are very short-lived objects; they
> > > are created, immediately passed to a listener, and then
> > > garbage-collected. Besides, most listeners will likely apply the same
> > > logic to all events (basically: serialize and dispatch). This hints at
> > > a type hierarchy that isn't being useful to its main consumers.
> > >
> > > My proposal is to completely flatten the PolarisEvent hierarchy.
> > > Instead of numerous concrete types, we would have a single
> > > implementation.
> > > This implementation would expose the methods I'm
> > > adding in [2], including type() which allows distinguishing events by
> > > type ID.
> > >
> > > It would also expose a new method: Map<String, Object> attributes().
> > >
> > > An event factory would be responsible for creating events and
> > > populating these attributes using a common set of well-defined, typed
> > > attribute keys such as "catalog_name", "table_identifier",
> > > "table_metadata", etc.
> > >
> > > This creates a schemaless-ish view of the event, which is ideal for
> > > pruning and serialization. It would enable us to apply common rules
> > > more efficiently. For example:
> > >
> > > 1) All events containing the "table_metadata" attribute could
> > > automatically apply a pruning logic to reduce its size.
> > >
> > > 2) All events containing a specific attribute could automatically have
> > > sensitive data removed from its value.
> > >
> > > I'm curious to hear what the community thinks of this proposal.
> > >
> > > Thanks,
> > > Alex
> > >
> > > [1]:
> > > https://github.com/vchag/polaris/blob/4c0aef587e63d5e60d657561a0a53701417f324b/runtime/service/src/main/java/org/apache/polaris/service/events/listeners/AllEventsForwardingListener.java
> > > [2]: https://github.com/apache/polaris/pull/2998
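
P.S. To make the enum-keyed `attributes()` idea from the top of this message a bit more concrete, here is a rough Java sketch. Everything in it is an assumption for illustration only: the class name `EventSketch`, the `with` and `pruned` helpers, the `PRUNED_KEYS` set, and the specific enum constants are hypothetical and are not existing Polaris API. It only aims to show how an enum key rules out misspelled property names and lets one pruning rule cover both metadataBefore and metadataAfter.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a flattened PolarisEvent; not the real Polaris class.
final class EventSketch {

  // Closed set of property keys. A misspelled key is a compile error,
  // unlike a misspelled String key in Map<String, Object>.
  enum EventPropertyType {
    CATALOG_NAME,
    TABLE_IDENTIFIER,
    METADATA_BEFORE,
    METADATA_AFTER
  }

  // Assumed rule: one shared pruning pass drops both metadata properties,
  // whatever the event type is.
  static final Set<EventPropertyType> PRUNED_KEYS =
      EnumSet.of(EventPropertyType.METADATA_BEFORE, EventPropertyType.METADATA_AFTER);

  private final String type;
  private final EnumMap<EventPropertyType, Object> attributes =
      new EnumMap<>(EventPropertyType.class);

  EventSketch(String type) {
    this.type = type;
  }

  // Sets one property and returns this event for chaining.
  // Properties that are never set are simply absent from the map.
  EventSketch with(EventPropertyType key, Object value) {
    attributes.put(key, value);
    return this;
  }

  String type() {
    return type;
  }

  // Read-only view, mirroring the proposed attributes() signature.
  Map<EventPropertyType, Object> attributes() {
    return Collections.unmodifiableMap(attributes);
  }

  // Returns a copy without the properties listed in PRUNED_KEYS.
  EventSketch pruned() {
    EventSketch copy = new EventSketch(type);
    attributes.forEach((key, value) -> {
      if (!PRUNED_KEYS.contains(key)) {
        copy.with(key, value);
      }
    });
    return copy;
  }
}
```

In this sketch, an event that only produces an "after" metadata object would set METADATA_AFTER and never touch METADATA_BEFORE, and the same `pruned()` call would work for it unchanged. A real implementation would more likely shrink the metadata than drop it outright; dropping it just keeps the example short.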
