Hi Andrew,

I think the idea of extending / augmenting Polaris behavior is very
appealing, but deserves some more thinking.

First off, I'm a bit concerned that this event listener interface would end
up with dozens, maybe hundreds of rather unrelated methods: "onFooStarted",
"onBarDropped", etc.

Next, and since you mentioned Quarkus: when using Quarkus, the application
state is entirely constructed at build time, not at runtime. It's thus not
possible to just "drop a jar" somewhere with your listener implementation,
and expect it to be picked up by the running Polaris server. Instead, this
scenario requires a bit more work: one has to create an application that
extends the Polaris Quarkus runtime, add the listener implementation, build
the modified app, then package and deploy it. I'm not saying that this
isn't possible, only that it requires some work from users. In Nessie, we
do this e.g. to allow users to provide a subscriber for Nessie events.

Speaking of Nessie events, I wonder if a better option would be to
implement something similar in Polaris? IOW Polaris would emit events
("tableCreated", "catalogAdded", etc.) and users interested in these would
just provide a subscriber/listener that would send the events to a
messaging system or anything that suits their needs. This would require
more work, but OTOH, would achieve a better decoupling between Polaris
proper and how users can extend it.
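
To make the contrast with a many-method listener concrete, here is a
minimal sketch of the event-based shape. Everything in it is hypothetical:
the names EventBusSketch, EventType, CatalogEvent, EventSubscriber and
EventBus are assumptions for illustration only and do not exist in Polaris
or Nessie. The point is that the subscriber has a single method no matter
how many event types are added, and that a failing subscriber does not
break the operation that emitted the event.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class EventBusSketch {

    // Hypothetical event types; the names mirror the examples above.
    enum EventType { TABLE_CREATED, CATALOG_ADDED }

    // Hypothetical immutable event payload.
    record CatalogEvent(EventType type, String catalog, String subject) {}

    // A single method, regardless of how many event types exist.
    @FunctionalInterface
    interface EventSubscriber {
        void onEvent(CatalogEvent event);
    }

    // In-process dispatcher; a real subscriber might forward to a messaging system.
    static final class EventBus {
        private final List<EventSubscriber> subscribers = new CopyOnWriteArrayList<>();

        void subscribe(EventSubscriber subscriber) {
            subscribers.add(subscriber);
        }

        void emit(CatalogEvent event) {
            for (EventSubscriber subscriber : subscribers) {
                try {
                    subscriber.onEvent(event);
                } catch (RuntimeException e) {
                    // Isolate failures so one subscriber cannot break the emitter.
                    System.err.println("subscriber failed: " + e.getMessage());
                }
            }
        }
    }

    public static void main(String[] args) {
        EventBus bus = new EventBus();
        List<String> outbox = new ArrayList<>();

        // This subscriber only cares about table creation and ignores the rest.
        bus.subscribe(e -> {
            if (e.type() == EventType.TABLE_CREATED) {
                outbox.add(e.catalog() + "." + e.subject());
            }
        });

        bus.emit(new CatalogEvent(EventType.CATALOG_ADDED, "prod", "prod"));
        bus.emit(new CatalogEvent(EventType.TABLE_CREATED, "prod", "db.orders"));
        System.out.println(outbox); // prints [prod.db.orders]
    }
}
```

Adding a new event type in this shape means adding an enum constant (or a
new event class), not a new method that every existing listener has to
implement or ignore.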

And that's where I think your proposal might step a bit into Catalog
Federation land. With federation, catalogs must have a way to send and
receive notifications. Polaris already has the "receiving end" (the
notifications endpoint), but nothing has been implemented so far for the
sending side. This gap was already raised
<https://lists.apache.org/thread/zcv6qm9ysknrhfpg093qgnrkrolptcht> on the
Iceberg mailing list, where different approaches were discussed
(push-based vs. pull-based), but no decision was taken. Some time later, JB
brought a similar proposal
<https://lists.apache.org/thread/48fczg6okw9mos94o90tr347fbz9qc3b> to the
Polaris mailing list, but again, the discussion seems to have stalled.

I think my preference here would be to transfer this proposal to the
Catalog Federation epic, and make sure that we consider that events might
be consumed not only by reflecting catalogs, but by a broader audience
including general-purpose applications that wish to be notified of a
catalog's events (e.g. for the purpose of triggering table optimizations).

Thanks,

Alex


On Wed, Dec 11, 2024 at 8:00 PM Andrew Guterman <andrew.guterm...@gmail.com>
wrote:

> Hi folks,
>
> It would be useful to add a generic event listener interface to Polaris,
> consistent with other OSS projects. Users of the project may require
> additional functionality that doesn't have a clear enough value proposition
> to be in OSS. Instead, there can be event listeners that let you hook into
> various parts of the Polaris functionality (e.g. "before table commit")
> without OSS prescribing the limits of the extra functionality.
>
> Here are some examples of the shape of changes this would help:
>
>    - https://github.com/apache/polaris/pull/458
>    - https://github.com/apache/polaris/pull/155
>    - https://github.com/apache/polaris/pull/423
>    - Modification to application startup (e.g. by registering new Jersey
>    resources)
>    - Further observability changes
>
> In all of these cases we can introduce (and sometimes have introduced) new
> ad-hoc interfaces to allow pluggable behavior, but I believe it's more
> scalable to
> have a single event listener interface whose number of events grows over
> time. The intent is also not to reduce the amount of contributions to OSS,
> but rather keep the changes focused and vendor-agnostic.
>
> Here are some example implementations in other OSS projects:
>
>    - https://spark.apache.org/docs/3.5.3/api/java/org/apache/spark/scheduler/SparkListener.html
>    - https://heroiclabs.com/docs/nakama/guides/server-framework/using-hooks/
>
> I'm waiting for the Quarkus work to settle before suggesting a concrete
> implementation, as I heard Quarkus might make it easy to intercept method
> calls, but I did want to gather thoughts in the interim.
>
> Andrew
>
