Re: Proposal: usage of service as foundation for Graph lifecycle listeners

Andrii Lomakin via dev Tue, 25 Nov 2025 04:48:31 -0800

Good day, Andrea.

Regarding your questions:


1. Data Manipulation in Listeners (e.g., Account Balance Change)

Yes, I absolutely imagine users wanting to execute data manipulation. We
already implement this behavior in our Java-based listeners. For reference,
you can see a similar test case here:
https://github.com/JetBrains/youtrackdb/blob/develop/tests/src/test/java/com/jetbrains/youtrackdb/auto/LinksConsistencyTest.java

2. Synchronizing with External Systems (e.g., REST Calls)

I do not believe synchronizing with external systems falls within the scope
of this proposal, as it would necessitate granting listeners broad access
to the entire system. I have a separate solution for this use case that I
plan to propose later.

3. Scope of Listener Access

- Internal System Access: I imagine the listener having access to the
public API of TinkerPop plus the core JDK. If they need more access, they
should be required to explicitly specify this in the settings.
- Structure and Process APIs: They should have full access to the Process
API. Regarding the Structure API, they will naturally have access to
Elements, but nothing beyond that.
- Access Beyond APIs (e.g., Netty layer): This should not be an
all-or-nothing approach. I suggest we define a default set of classes that
listeners can access and allow providers to extend these permissions as
needed.
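
To make the third point more concrete, here is a minimal sketch of such a
default allowlist that a provider can extend. Everything in it is an
assumption made for illustration: `ListenerAccessPolicy`, its methods, and
the particular defaults are hypothetical and are not part of TinkerPop or
YouTrackDB. It only shows the "default set plus provider extensions" idea,
not how a WASM runtime would enforce it.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch: this class does not exist in TinkerPop or YouTrackDB.
final class ListenerAccessPolicy {
    // Packages a listener may use (prefix match, so subpackages are included).
    private final Set<String> packagePrefixes = new LinkedHashSet<>();
    // Individual classes allowed from packages that are otherwise closed.
    private final Set<String> exactClasses = new LinkedHashSet<>();

    // Assumed defaults, chosen only for illustration: the Process API,
    // a slice of the core JDK, and Element types from the Structure API.
    static ListenerAccessPolicy defaults() {
        ListenerAccessPolicy policy = new ListenerAccessPolicy();
        policy.allowPackage("org.apache.tinkerpop.gremlin.process.traversal");
        policy.allowPackage("java.lang");
        policy.allowPackage("java.util");
        policy.allowClass("org.apache.tinkerpop.gremlin.structure.Element");
        policy.allowClass("org.apache.tinkerpop.gremlin.structure.Vertex");
        policy.allowClass("org.apache.tinkerpop.gremlin.structure.Edge");
        return policy;
    }

    // Provider extension point: open a whole package to listeners.
    ListenerAccessPolicy allowPackage(String packageName) {
        packagePrefixes.add(packageName + ".");
        return this;
    }

    // Provider extension point: open a single class to listeners.
    ListenerAccessPolicy allowClass(String className) {
        exactClasses.add(className);
        return this;
    }

    // Returns true if the fully qualified class name is covered by the policy.
    boolean isAllowed(String className) {
        if (exactClasses.contains(className)) {
            return true;
        }
        for (String prefix : packagePrefixes) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}
```

A provider that wants to expose part of the Netty layer could then call
`ListenerAccessPolicy.defaults().allowPackage("io.netty.buffer")`, while the
rest of the Structure API (for example `Graph`) stays closed by default.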

4. Authorization Requirements (Similar to SQL Triggers)

While I agree that this functionality requires a proper authorization
scheme, I view the relationship as being the reverse: triggers are a
simplified version of listeners. Authorization is definitely required for
registering WASM listeners. However, this does not necessarily need to be
part of the TinkerPop reference implementation, just as our existing
predicate-based security, which controls access to each element, does not
contradict the TinkerPop API.

Best regards,
Andrii Lomakin
YouTrackDB development lead

On Fri, Nov 21, 2025 at 12:19 AM Andrea Child <[email protected]> wrote:

> Hi Andrii,
>
> Regarding your goal to unify the API for embedded and remote, I do agree
> that would be a nice thing to achieve. I find it can be difficult for new
> users to understand (and for TP contributors to maintain) a system which
> has multiple ways to do the same thing. For example bytecode vs scripts,
> OLAP vs OLTP, web sockets vs http, etc. As a new user, how do you know
> which path to take? Unifying the API could help reduce complexity and
> increase usability.
>
> As for user-provided WASM event listeners I would like to ask more
> questions to help me understand the use cases and the scope of access the
> listeners would have.
>
> Regarding use cases:
> - would you imagine the user wanting to execute data manipulation? For
> example, if the graph holds data related to accounts and balances, a user
> may want to register a listener which detects account balance changes and
> if the balance goes below zero, set a property on the account or add an
> edge from the account to a person who is a financial advisor.
> - do you imagine a user wanting to do more than just modify data and for
> example detect data change events in order to synchronize it into an
> external system? For example they can register a listener that detects
> creation of new accounts and propagates the new account data into an
> external system via a REST call.
>
> Regarding scope of listener access:
> - what parts of the internal system do you imagine the listener having
> access to?
> - will the listeners have access to the structure and process APIs?
> - will the listeners have access beyond the APIs, for example the netty
> layer?
>
> The concept of WASM listeners reminds me of SQL triggers, however creation
> of SQL triggers requires elevated admin privileges. Should registration of
> WASM listeners also require some level of authorization? TinkerPop does not
> currently dictate how authorization should be handled so perhaps the onus
> would be on the provider to limit access to listener registration.
>
> Andrea
>
> On 2025/11/18 08:18:01 Andrii Lomakin wrote:
> > Hi Andrea,
> >
> > One more detail I wanted to add: I think splitting the API into distinct
> > structural and process parts is problematic, as it currently prevents
> > users of embedded databases from scaling their applications easily.
> >
> > One of the key goals of this proposal is to unify these two APIs,
> > allowing users to evolve their deployment schemas (from embedded to
> > remote) without requiring significant changes. I believe achieving this
> > unification is quite important for the broader usability of the system.
> >
> >
> > On Tue, Nov 18, 2025 at 9:14 AM Andrii Lomakin <[email protected]> wrote:
> >
> > > Hi Andrea.
> > >
> > > > Who is the ‘user’ in your proposal? I believe it is a developer who
> > > > is using a GLV to query the graph but would just like to confirm.
> > >
> > > Completely correct :-)
> > >
> > > > In your proposal, a user supplies the event handling logic via WASM
> > > > and this code is executed on the server, potentially very frequently
> > > > depending on the event type it is handling. How can we ensure that
> > > > multiple users do not register many expensive event handlers which
> > > > will then contend for server resources?
> > >
> > > 1. WASM, by definition, requires a module to specify the amount of
> > > memory it uses. It is part of the WASM module declaration. It is a
> > > stack machine with linearly declared memory. We can easily check limits
> > > and reject incorrect or dangerous declarations.
> > > https://webassembly.github.io/spec/core/text/modules.html#memories
> > > 2. If it is absent, we should limit the number of threads used by user
> > > connections even without the context of this proposal, IMHO.
> > >
> > > > What are the pros and cons of user-provided WASM event handler
> > > > executed on the server vs a mechanism for allowing users to subscribe
> > > > to events published by the server?
> > >
> > > In such cases, the user is limited to data provided by the event, which
> > > limits the scope of applicability of this approach. In reality, it
> > > breaks user abstractions and, as a result, makes this approach
> > > impractical for application developers to use.
> > > Once developers start to implement sophisticated routines, which is
> > > typical for this pattern of data handling, they will soon discover
> > > that the latency of remote access to the server will void the real
> > > applicability of this design.
> > >
> > >
> > > On Fri, Nov 14, 2025 at 7:30 PM Andrea Child
> > > <[email protected]> wrote:
> > >
> > >> Hi Andrii,
> > >>
> > >> My personal experience with event processing has involved either
> > >> server-side business logic or client-side logic that consumes events
> > >> published by the server, so I have a few questions to clarify your
> > >> proposal.
> > >>
> > >> Who is the ‘user’ in your proposal? I believe it is a developer who is
> > >> using a GLV to query the graph but would just like to confirm.
> > >>
> > >> In your proposal, a user supplies the event handling logic via WASM
> > >> and this code is executed on the server, potentially very frequently
> > >> depending on the event type it is handling. How can we ensure that
> > >> multiple users do not register many expensive event handlers which
> > >> will then contend for server resources?
> > >>
> > >> What are the pros and cons of user-provided WASM event handler
> > >> executed on the server vs a mechanism for allowing users to subscribe
> > >> to events published by the server? I would think that one benefit is
> > >> that it would be faster to implement the user-provided WASM however it
> > >> would be more restrictive as the server should not just allow
> > >> execution of any code.
> > >>
> > >> Thanks!
> > >>
> > >> Andrea
> > >>
> > >>
> > >> From: Andrii Lomakin <[email protected]>
> > >> Date: Tuesday, November 11, 2025 at 11:03 PM
> > >> To: [email protected] <[email protected]>
> > >> Subject: Re: Proposal: usage of service as foundation for Graph
> > >> lifecycle listeners
> > >>
> > >> Just a small addition: Kotlin/WASM and IntelliJ already support
> > >> DWARF-based debugging.
> > >> https://kotlinlang.org/docs/whatsnew2120.html#kotlin-wasm
> > >>
> > >> On Wed, Nov 12, 2025 at 7:58 AM Andrii Lomakin <[email protected]> wrote:
> > >>
> > >> > Hi Andrea.
> > >> >
> > >> > Answering your first question.
> > >> > The use of listeners for changes in entity state as triggers for
> > >> > business logic is quite typical for EE applications, especially those
> > >> > with complex logic.
> > >> >
> > >> > One of the applications uses them a lot because they use embedded
> > >> > distribution, but I strive to achieve a state where there will be no
> > >> > difference between remote and embedded deployments, and users can
> > >> > start with a small, embedded database and scale it, moving to a
> > >> > standalone server or cloud if they later want to scale the load.
> > >> > WASM listeners allow the use of the process API for both cases, no
> > >> > matter what deployment they use.
> > >> >
> > >> > As for debugging.
> > >> > GraalVM's WASM engine has announced DWARF-based debugging support as
> > >> > an upcoming feature:
> > >> > https://youtu.be/Z2SWSIThHXY?si=33sxaLmJ26Gob9Aa&t=1127
> > >> >
> > >> >
> > >> >
> > >> > On Mon, Nov 10, 2025 at 8:46 PM Andrea Child
> > >> > <[email protected]> wrote:
> > >> >
> > >> >> Hi Andrii,
> > >> >>
> > >> >> Could you please elaborate on the types of scenarios that this
> > >> >> proposal would help users to troubleshoot and how the WASM byte
> > >> >> code provided could be used to debug?
> > >> >>
> > >> >> Thanks!
> > >> >>
> > >> >> Andrea
> > >> >>
> > >> >> From: Andrii Lomakin <[email protected]>
> > >> >> Date: Monday, November 10, 2025 at 12:59 AM
> > >> >> To: [email protected] <[email protected]>
> > >> >> Subject: Proposal: usage of service as foundation for Graph
> > >> >> lifecycle listeners
> > >> >>
> > >> >> Good day.
> > >> >>
> > >> >> I would like to propose using the service as a foundation for
> > >> >> listeners of the Graph life cycle events.
> > >> >>
> > >> >> It can look like the following:
> > >> >> 1. User creates a service from the provided WASM byte code using a
> > >> >> special step like registerService(name, wasmByteCode).
> > >> >> 2. User registers the service as a listener for graph events. I
> > >> >> think those could be events like vertex and edge lifecycle events
> > >> >> or TX commit, again using GraphTraversal commands.
> > >> >> 3. During the generation of the event, a special traversal that
> > >> >> contains only the affected elements is created, and the service is
> > >> >> called upon it. Like `_.inject(elements).call()`.
> > >> >>
> > >> >> In practice, it is advised for each GLV to provide a specialized
> > >> >> implementation of `registerService()` step that will compile the
> > >> >> passed-in code into the WASM if possible.
> > >> >> That is possible on Java, and I suppose it is possible for
> > >> >> JavaScript at least.
> > >> >>
> > >> >> That will allow us to blur the difference between remote and
> > >> >> embedded deployments, and for vendors that provide both variants to
> > >> >> provide debugging tools for lifecycle listeners when users can test
> > >> >> and debug implementation on their workstation and then deploy in
> > >> >> production.
> > >> >>
> > >> >> WASM bytecode was intentionally designed for such use cases. I
> > >> >> would be interested in opinions about this proposal, which is, of
> > >> >> course, subject to a separate specification and many additional
> > >> >> clarifications.
> > >> >>
> > >> >
> > >>
> > >
> >
>
