Hi Dmitri,

Sorry, I meant my local implementation. I have not raised a PR yet; I need the REST API first (so we can retrieve auditing data). Nandor said he will tackle the OSS implementation.
Noted about the batch size, I will pull that in.

— Anand

From: Dmitri Bourlatchkov <[email protected]>
Date: Friday, March 6, 2026 at 9:54 AM
To: [email protected]
Cc: Adnan Hemani <[email protected]>, Anand Kumar Sankaran <[email protected]>
Subject: Re: [DISCUSS] Polaris event persistence

Hi Anand,

When you wrote "implemented", did you mean a new metrics persistence PR or your local implementation?

Using InMemoryBufferEventListener sounds quite valid to me. maxBufferSize 5 might be too low, I guess. I'd imagine under high load we'd want larger write batches for more efficiency at the Persistence layer. How about 100? (But I do not have any data to back that up.)

Cheers,
Dmitri.

On Fri, Mar 6, 2026 at 10:01 AM Anand Kumar Sankaran via dev <[email protected]> wrote:

Hi all,

As discussed in this proposal <https://github.com/apache/polaris/pull/3924#issuecomment-4000072265>, for our auditing purposes, I implemented event persistence like this: it uses an in-memory buffering strategy provided by Apache Polaris' InMemoryBufferEventListener to batch events before flushing.

events:
  persistence:
    enabled: true
    bufferTime: "5000ms" # Flush after 5 seconds
    maxBufferSize: 5     # Or after 5 events

I implemented an audit event listener that extends InMemoryBufferEventListener, listens for many events, creates PolarisEvents, and calls InMemoryBufferEventListener.processEvents() to buffer them.
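As a rough illustration of the size-or-time buffering behavior described above, here is a minimal, self-contained sketch. It is NOT Polaris' actual InMemoryBufferEventListener; the class and method names here are hypothetical, and a real listener would use a clock and a scheduled flush task rather than a caller-supplied timestamp.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of a size-or-time buffered listener (NOT Polaris'
// actual InMemoryBufferEventListener; names and shapes are illustrative).
// Events accumulate until either the buffer holds maxBufferSize entries or
// bufferTimeMillis has passed since the first buffered event, then the whole
// batch is handed to a single flush target (e.g. one batched persistence write).
class BufferedListenerSketch {
  private final int maxBufferSize;
  private final long bufferTimeMillis;
  private final Consumer<List<String>> flushTarget;
  private final List<String> buffer = new ArrayList<>();
  private long firstEventAtMillis = -1;

  BufferedListenerSketch(int maxBufferSize, long bufferTimeMillis,
                         Consumer<List<String>> flushTarget) {
    this.maxBufferSize = maxBufferSize;
    this.bufferTimeMillis = bufferTimeMillis;
    this.flushTarget = flushTarget;
  }

  // The caller passes the current time so the sketch stays deterministic.
  synchronized void onEvent(String event, long nowMillis) {
    if (buffer.isEmpty()) {
      firstEventAtMillis = nowMillis;
    }
    buffer.add(event);
    if (buffer.size() >= maxBufferSize
        || nowMillis - firstEventAtMillis >= bufferTimeMillis) {
      flushTarget.accept(new ArrayList<>(buffer)); // flush one batch
      buffer.clear();
    }
  }
}
```

With maxBufferSize 5 as configured above, the fifth event triggers a flush; with Dmitri's suggested 100, batches would simply grow larger before each write, trading a little latency for fewer, bigger writes at the Persistence layer.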
Is there a problem in doing this while I wait for the fix discussed here?

- Anand

From: Adnan Hemani via dev <[email protected]>
Date: Thursday, March 5, 2026 at 3:44 PM
To: [email protected]
Cc: Adnan Hemani <[email protected]>
Subject: Re: [DISCUSS] Polaris event persistence

Hi all,

Thanks for reviving the conversation regarding this feature, Nándor!

My last recollection of this conversation was that we, as a community, had agreed to use async PolarisEventListeners to accommodate multiple event listeners at the same time. Using the Quarkus event-bus seems like a reasonable implementation idea based on my quick research. Nándor, if you would like to work on this, I would be glad to help in whatever way I can; I just don't have the bandwidth to own this feature in the immediate month or so.

Regarding event flattening and information redaction, I mostly agree with Alex: flattening the events was not a panacea for the issue, but it should unlock our ability to apply mass transformations across event types. The PoC should show how we can achieve this along with the event flattening. I wasn't aware that people were waiting on me for a design proposal; my apologies if I accidentally made this promise. If anyone else would like to work on this, please feel free to.

The one thing to keep in mind is that different event listeners will need different transformation patterns. For example, the Events Persistence Listener must fit the schema we merged earlier, which closely resembles the (proposed) Iceberg Event API's schema.
But the AWS CloudWatch one will require much less transformation and can be used almost transparently, minus the light redactions for security concerns.

Additionally, we must tackle one more workstream: storing events in a database separate from the one that holds the metadata. As the volume of events increases, this will become a big concern if we turn on the JDBC listener by default.

Best,
Adnan Hemani

On Thu, Mar 5, 2026 at 5:11 AM Alexandre Dutra <[email protected]> wrote:

> Hi Nándor,
>
> > it seems that the transformation of most service events to event
> > entities is missing from PolarisPersistenceEventListener
>
> Yes, unfortunately this has never been addressed, and yet it makes the
> whole events feature pretty unusable. There is an old issue for that:
> [1]. An external contributor tried to tackle it, but their PR has been
> blocked for "security concerns" [2], and it's probably too old now. I
> think we need to make this configurable for each listener.
>
> > The event flattening approach in [4] doesn't seem to help much here, as
> > it replaces roughly 150 classes with about 150 switch branches.
>
> That wasn't the intent. The idea was rather to define transformation
> rules per event attribute: e.g., if an event has the TABLE_METADATA
> attribute, then we would apply some specific transformation rule to
> "prune" the attribute of sensitive data, or things like that. This
> idea received a PoC [3] a while ago, but I'm afraid the PoC is
> obsolete by now. IIRC, Adnan was supposed to provide us with a design
> proposal.
>
> > I am considering using the Quarkus event-bus [5] for the
> > PolarisEventListener implementations.
>
> Very good idea :-) My hot take here is that we will need multiple
> listeners ASAP, because the JDBC listener will become kind of
> mandatory now, and should probably be "on" by default. This old ML
> thread is relevant: [4]. This PR also outlines a few good ideas: [5].
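[Editor's sketch] Alex's per-attribute transformation idea above could look roughly like the following self-contained Java sketch. The class, method, and attribute names are illustrative, not Polaris APIs: a rule is registered once per attribute name and applied to whichever events happen to carry that attribute, instead of writing one transformation per event type.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch of per-attribute transformation rules (illustrative
// names, not Polaris APIs). Instead of ~150 per-event-type branches, one
// rule is registered per attribute name and applied to any event that
// carries that attribute.
class AttributeTransformer {
  private final Map<String, UnaryOperator<Object>> rules = new HashMap<>();

  // Register a rule for one attribute, e.g. prune TABLE_METADATA of sensitive data.
  void rule(String attribute, UnaryOperator<Object> transform) {
    rules.put(attribute, transform);
  }

  // Apply every matching rule; attributes without a rule pass through unchanged.
  Map<String, Object> apply(Map<String, Object> event) {
    Map<String, Object> out = new LinkedHashMap<>();
    event.forEach((k, v) ->
        out.put(k, rules.getOrDefault(k, UnaryOperator.identity()).apply(v)));
    return out;
  }
}
```

The appeal of this shape is that the number of rules scales with the number of distinct attributes, not with the number of event types, and listeners with different redaction needs (persistence vs. CloudWatch) could each register a different rule set.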
> Lastly, you can also have a look at what Nessie did [6] (although
> Nessie has a complex delivery logic).
>
> So I'd suggest tackling these 3 problems first:
>
> - Configurable event types per listener
> - The "transformation" layer (it could start simple, with just "no
>   transformation at all")
> - Multiple listeners
>
> We could probably parallelize the efforts to some extent.
>
> Thanks,
> Alex
>
> [1]: https://github.com/apache/polaris/issues/2630
> [2]: https://github.com/apache/polaris/pull/2962
> [3]: https://github.com/apache/polaris/pull/3217
> [4]: https://lists.apache.org/thread/wj14coln1k4f9l8dmm21ktj2ql787gvc
> [5]: https://github.com/apache/polaris/pull/3442
> [6]: https://github.com/projectnessie/nessie/blob/fe7fbb3cf2c0b573acd2d773f2d62ae67fef153d/events/quarkus/src/main/java/org/projectnessie/events/quarkus/QuarkusEventService.java
>
> On Thu, Mar 5, 2026 at 1:37 PM Romain Manni-Bucau <[email protected]> wrote:
> >
> > Hi,
> >
> > My 2 cts would be that parts of the industry have tried this webhook-like
> > implementation, and it works, though it is not that widely adopted.
> > Since Iceberg is quite closely bound to eventing for ingestion in general,
> > it can make sense to bypass REST (which doesn't scale by design until you
> > adopt a more relevant design like JSON-RPC, which has bulk built-in) and
> > just go AsyncAPI and support messaging by default; there I totally agree
> > with JB that internal can be external in a lot of cases.
> > It would also enable the use of a messaging API (MicroProfile), or even the
> > plain local CDI bus, instead of an ad-hoc API that doesn't integrate with
> > anything existing, or the Quarkus one, which can stay in a niche in terms
> > of ecosystem/adoption/end-user knowledge.
> >
> > In terms of mapping, I would just go model -> JSON/Avro with the schema
> > exposed and documented with every release (optionally synced into a schema
> > registry), and be done. That will enable the external case as well as the
> > internal one with a database that supports a JSON column type (almost all
> > modern ones do).
> >
> > So overall, keep it simple.
> >
> > Just my 2 cts
> >
> > Romain Manni-Bucau
> > @rmannibucau
> >
> > On Thu, Mar 5, 2026 at 12:53, Jean-Baptiste Onofré <[email protected]> wrote:
> >
> > > Hi Nandor,
> > >
> > > I will take a look. Generally speaking, I wonder if we should implement
> > > a kind of internal event bus that supports "event dispatching."
> > >
> > > For example, I previously created a framework called Apache Karaf
> > > Decanter (https://github.com/apache/karaf-decanter) based on this
> > > concept. It allows for multiple event appenders, which could provide a
> > > flexible way to collect, process, and dispatch events.
> > >
> > > Just a thought.
> > >
> > > Regards,
> > > JB
> > >
> > > On Thu, Mar 5, 2026 at 6:04 AM Nándor Kollár <[email protected]> wrote:
> > >
> > > > Hi All,
> > > >
> > > > I recently reviewed how Polaris events are persisted, which is a
> > > > prerequisite for implementing both the Iceberg event proposal [1]
> > > > and the event API in Polaris [2]. I identified two areas for
> > > > improvement: it appears that we only persist two types of events,
> > > > and Polaris allows only a single event listener. Because of this
> > > > limitation, we cannot, for example, persist events in the database
> > > > *and* send them to CloudWatch at the same time.
> > > >
> > > > Regarding the first problem, it seems that the transformation of most
> > > > service events to event entities is missing from
> > > > PolarisPersistenceEventListener [3]. Supporting each service event
> > > > would likely require implementing a transformation for every event
> > > > type, which could result in more than 150 separate methods or switch
> > > > cases. The event flattening approach in [4] doesn't seem to help much
> > > > here, as it replaces roughly 150 classes with about 150 switch
> > > > branches. At the moment, I do not yet have a good idea how we could
> > > > simplify this transformation. In the worst case, we would need to
> > > > implement a large number of branches.
> > > >
> > > > As for the second problem, I am considering using the Quarkus
> > > > event-bus [5] for the PolarisEventListener implementations. This
> > > > would hopefully keep the listeners configurable, allowing individual
> > > > listeners to be enabled or disabled while also making it possible for
> > > > multiple listeners to consume Polaris events simultaneously.
> > > >
> > > > What do you think?
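[Editor's sketch] The multi-listener fan-out Nándor describes above can be illustrated in plain Java, without the Quarkus dependency; the names here are hypothetical, not Polaris or Quarkus APIs. One publish point delivers each event to every enabled listener, so a JDBC persistence listener and a CloudWatch listener can both consume the same events.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical plain-Java sketch of the fan-out a Quarkus event-bus (or any
// internal bus) would provide; names are illustrative, not Polaris or Quarkus
// APIs. A single publish point delivers each event to every enabled listener.
class EventFanOut {
  private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();

  // Disabled listeners are simply never subscribed, which keeps each
  // listener individually configurable.
  void register(Consumer<String> listener, boolean enabled) {
    if (enabled) {
      listeners.add(listener);
    }
  }

  void publish(String event) {
    listeners.forEach(l -> l.accept(event)); // a real bus would dispatch asynchronously
  }
}
```

With the Quarkus event-bus the dispatch would additionally be asynchronous and decoupled, so a slow listener (e.g. a remote CloudWatch call) would not block the request path or the other listeners.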
> > > >
> > > > Nandor
> > > >
> > > > [1] https://docs.google.com/document/d/1WtIsNGVX75-_MsQIOJhXLAWg6IbplV4-DkLllQEiFT8/edit?tab=t.0
> > > > [2] https://github.com/apache/polaris/pull/3924
> > > > [3] https://github.com/apache/polaris/blob/main/runtime/service/src/main/java/org/apache/polaris/service/events/listeners/PolarisPersistenceEventListener.java#L39
> > > > [4] https://lists.apache.org/thread/xonxwf9b38t9cxo841r0hn1b34plf7og
> > > > [5] https://quarkus.io/guides/reactive-event-bus

--
Dmitri Bourlatchkov
Senior Staff Software Engineer, Dremio
