Good day.
Stephen and Michael, cool questions!

Let me share our perspective.

1. From my point of view, GQL has no definition of a match as a materialized
entity per se, and since we do not have a RETURN statement, match() returns
only a single instance of Optional.empty(). Additionally, returning one
result per match is impractical from a performance standpoint.
The materialization of bound variables can be optimized by analyzing
`select` steps, but that is harder to do with the semantics of returning
an Optional for every match (regardless of its definition).
2. Yeah, good catch, we should add an additional type; IMHO it is better to
use a non-literal for it.
3, 4. Let me first reiterate how I understood it: a match is always executed
on the whole graph instance, but it has access to side effects registered
in the Traverser as query parameters. Additionally, in such cases, it is
executed by the Traverser. We fully support it in such a case.
5, 6. Super cool question. Let me share our mental model:
   a. `match` is a barrier step, so it sees all changes made before
execution of this step.
   b. `match` is a streaming step.
   c. `match` does not see changes done by steps after it, so it seems as
if we have a temporary snapshot of the state of the database. We (YTDB) can
implement this, but I am not sure that all vendors can, so I would
recommend it as a best practice. To clarify, this behavior in general is
not different from the behavior of
        `g.V().hasLabel('label').addV('label').count()`, but I am not sure
that all vendors implement it correctly at the moment.
7. I would skip it for now and ask the GQL committee to add support for
multi-properties; if they do not want to add it in principle, we can extend
the language.
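
The snapshot behavior described in 5, 6 (c) can be sketched with a toy model
in Python. This is only an illustration under stated assumptions, not
TinkerPop or YTDB code: `ToyGraph`, `scan_snapshot`, `add_vertex` and
`count_after_add` are hypothetical names, and the loop loosely mimics
`hasLabel(...).addV(...).count()` on an in-memory list.

```python
from typing import Iterator, List


class ToyGraph:
    """Hypothetical in-memory graph: each vertex is just its label."""

    def __init__(self, labels: List[str]) -> None:
        self.vertices: List[str] = list(labels)

    def scan_snapshot(self, label: str) -> Iterator[str]:
        # Copy the matching vertices up front, so vertices added by later
        # steps are not visible to this scan.
        return iter([v for v in self.vertices if v == label])

    def add_vertex(self, label: str) -> str:
        self.vertices.append(label)
        return label


def count_after_add(graph: ToyGraph, label: str) -> int:
    """Add one vertex per scanned vertex and count the traversers."""
    produced = 0
    for _ in graph.scan_snapshot(label):
        graph.add_vertex(label)  # a later step mutating the graph
        produced += 1
    return produced


graph = ToyGraph(["label", "label", "other"])
result = count_after_add(graph, "label")
print(result)               # number of traversers that reached count()
print(len(graph.vertices))  # total vertices after the traversal
```

Because the scan works on a copy, the loop terminates and counts only the
vertices that existed before the scan started. Iterating over the live list
instead would keep seeing the newly added vertices.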

Thank you, Lev Sivashov, for your valuable input in answering those
questions.


On Tue, Nov 11, 2025 at 6:37 PM Stephen Mallette <[email protected]>
wrote:

> I received some feedback about the match proposal and after some reflection
> myself wanted to share some points/questions for discussion here:
>
> 1. While match is said to return Optional.empty() I would think that the
> proposal means one of those per match. Therefore g.match(...).count() would
> return a count of the number of results extracted by match. Is that
> correct?
> 2. How will Optional.empty() translate to the TinkerPop type system.
> There's no serialization for "empty", right?
> 3. The point of mid-traversal match doing a "reset" and therefore matching
> against the entire graph for each input traverser like mid-traversal V(),
> made sense to me at first because I liked the simplicity, however, I can't
> help wondering if it will satisfy the use cases that arise from usage.
> Historically speaking, we almost always find folks wanting to dynamically
> inject values to a step. That seems particularly useful with respect to the
> parameterization function. Just making up the following as a silly example
> that looks a bit like math():
>
> g.V().as('a').match("MATCH (p:Person WHERE p.name = $a)").by('name')
>
> or use Traversal argument as an overload:
>
> g.V().project('m').by('name').as('a').match("MATCH (p:Person WHERE p.name = $a)", select('m'))
>
> If we didn't want to support this kind of use case, i'd be curious if match
> should work mid-traversal at all, because a good use case escapes me. are
> there examples of a good one?
>
> 4. In any event, i think i understand the proposal for mid-traversal
> match() as executing once per incoming traverser. I assume if there is no
> incoming traverser it doesn't execute?
> 5. Is match() a barrier step? Is it blocking? Can it be evaluated in
> parallel with what’s coming before/after? => these questions would matter
> particularly for update queries -> even if you don’t support update within
> match, it may happen in Gremlin parts before and after, so you’d want to
> have some execution order guarantees.
> 6. Similar questions regarding streaming output, early termination of
> limit() queries, etc.
> 7. Areas in which languages have an impedance mismatch. For instance,
> Gremlin supports multi-valued properties, openCypher (and, afaict, GQL)
> does not -- how do we handle that?
>
> I think we might want to update the proposal a bit once there is consensus
> on these points. Thanks to Michael Schmidt for calling out much of what's
> written here.
>
>
> On Mon, Oct 6, 2025 at 2:33 AM Andrii Lomakin <[email protected]>
> wrote:
>
> > Hi Cole.
> > Done https://github.com/apache/tinkerpop/pull/3232
> >
> > I'm sorry, but implementing schema changes over the manipulation by
> > vertices consumed me a bit.
> >
> > On Tue, Sep 23, 2025 at 8:31 AM Andrii Lomakin
> > <[email protected]> wrote:
> > >
> > > Hi Cole, sure.
> > >
> > > > Let me wait for the results of additional discussion in Discord, and
> > > > then I will summarize everything as a PR.
> > >
> > > On Mon, Sep 22, 2025 at 9:13 PM Cole Greer <[email protected]
> > .invalid>
> > > wrote:
> > >
> > > > Thanks Andrii,
> > > >
> > > > > That proposal looks good to me. I’d like to retain that same our
> > > > > records as well, would you mind
> > > > > opening a PR to the 3.8-dev branch which adds that as a proposal to
> > > > > our future docs:
> > > > https://github.com/apache/tinkerpop/tree/3.8-dev/docs/src/dev/future
> > > >
> > > > Regards,
> > > > Cole
> > > >
> > > > From: Andrii Lomakin <[email protected]>
> > > > Date: Monday, September 22, 2025 at 12:04 AM
> > > > To: [email protected] <[email protected]>
> > > > Subject: Re: Proposal of the new declarative semantics of the match
> > step
> > > > Good day.
> > > > > As this thread has been dormant for two weeks already, I have
> > > > > summarized our discussion and created a spec proposal.
> > > > > Please skip the part related to GQL DSL that is our specifics ->
> > > > >
> > > > > https://youtrack.jetbrains.com/articles/YTDB-A-32/Specification-for-the-declarative-match-step
> > > >
> > > > Looking forward to any constructive feedback.
> > > >
> > > >
> > > > On Sun, Sep 7, 2025 at 4:40 PM Andrii Lomakin <
> > > > [email protected]>
> > > > wrote:
> > > >
> > > > > Good day, Cole.
> > > > > Thank you for your feedback.
> > > > >
> > > > > > As for your concerns about supporting other pattern-matching
> > > > > > languages, it is unlikely that a pattern-matching language can be
> > > > > > implemented without the concept of a variable.
> > > > >
> > > > > > I will wait two weeks for feedback from other participants, and
> > > > > > then, if there is no activity, I will summarize our discussion.
> > > > >
> > > > > On Thu, Sep 4, 2025 at 2:02 AM Cole Greer <[email protected]>
> > wrote:
> > > > >
> > > > >> Hi Andrii and Lev,
> > > > >>
> > > > >> I really like that idea. It's clean, simple, and concise. I
> suppose
> > one
> > > > >> downside of
> > > > >> dropping the RETURN is that you can no longer use aggregator
> > functions
> > > > as
> > > > >> in
> > > > >> RETURN count(n), however those capabilities already exist in
> > gremlin so
> > > > >> there
> > > > >> really isn't much of an impact here.
> > > > >>
> > > > >> I am concerned that no returns/always returning Optional.EMPTY may
> > place
> > > > >> undesired restrictions on providers who are using different
> > declarative
> > > > >> languages
> > > > >> via with("language", "GSQL"). If we want to give providers full
> > > > >> flexibility here, they
> > > > >> may want the ability to directly return data.
> > > > >>
> > > > >> For my purposes, I definitely want the default/reference GQL-based
> > match
> > > > >> language
> > > > >> to work like your example:
> > > > >>
> > > > > >> g.V(1).property('friendWeight', match("MATCH (n{name:'Cole'})-[e:knows]->()").select("n").values("weight").sum()))
> > > > >>
> > > > >> I'm still not sure how much providers will choose to use their own
> > > > >> declarative
> > > > >> languages instead of using our default implementation, so this may
> > not
> > > > be
> > > > >> much of a
> > > > >> concern in practice. I would be happy to start with always
> returning
> > > > >> Optional.EMPTY,
> > > > >> and then we can consider giving providers some extensibility on
> > returns
> > > > >> in the future if
> > > > >> there is demand.
> > > > >>
> > > > >> I'm fully onboard with this proposal. Thanks for all of the work
> > that
> > > > has
> > > > >> gone into this.
> > > > >>
> > > > >> Regards,
> > > > >> Cole
> > > > >>
> > > > >> On 2025/09/03 08:03:36 Andrii Lomakin wrote:
> > > > >> > Hi Cole.
> > > > >> > Thank you for sharing. We reached an agreement on all topics
> > except
> > > > >> > the use of the RETURN statement.
> > > > >> >
> > > > >> > We brainstormed inside the team and came up with an interesting
> > idea
> > > > >> > about handling the output from the match statement, thanks to
> Lev
> > > > >> > Sivashov, who provided it.
> > > > >> >
> > > > > >> > This idea is combined with another of my proposals to treat
> > > > > >> > Optional.EMPTY returned by Traverser as a jolt to the execution
> > > > > >> > of the next step by Traversal, but it is treated as no value
> > > > > >> > for the steps that do not process input values, such as addV().
> > > > >> > It will fix queries such as `g.addV(__.inject('x'))` and similar
> > ones
> > > > >> > in Gremlin that accept Traversal and need a fake Traverser with
> a
> > > > >> > value to work as expected.
> > > > >> >
> > > > >> > So we propose not to support RETURN at all, as we already have a
> > means
> > > > >> > to handle projections in Gremlin.
> > > > >> >
> > > > >> > Instead:
> > > > >> > 1. match() steps returns Optional.empty() as result.
> > > > >> > 2. We specify which MATCH variables we need to fetch using the
> > > > select()
> > > > >> step.
> > > > >> >
> > > > >> > So query
> > > > >> > g.V(1).property('friendWeight', match("MATCH
> > > > >> > (n{name:'Cole'})-[e:knows]->() RETURN sum(e.weight)"))
> > > > >> >
> > > > >> > will look like
> > > > > >> > g.V(1).property('friendWeight', match("MATCH (n{name:'Cole'})-[e:knows]->()").select("n").values("weight").sum()))
> > > > >> >
> > > > >> > This approach is easily optimized for execution by analyzing the
> > > > >> > select steps and providing GQL executor names of variables that
> > are
> > > > >> > really needed. It also looks elegant, prevents informational
> > clutter,
> > > > >> > and offers minimal and efficient pattern-matching methods for
> > Gremlin.
> > > > >> > WDYT?
> > > > >> >
> > > > >> > If you agree, I will wait a week to gather feedback from other
> > > > >> > participants. If no additions are provided, I will publish a
> > summary
> > > > >> > here and link to our design document for general information,
> and
> > I
> > > > >> > will start implementing it at our pace.
> > > > >> >
> > > > >> > On Wed, Sep 3, 2025 at 5:27 AM Cole Greer <[email protected]
> >
> > > > wrote:
> > > > >> > >
> > > > >> > > Hi Andrii,
> > > > >> > >
> > > > >> > > I've taken more time to think through your proposal.
> > > > >> > >
> > > > >> > > > I think we can transform the idea of introduction of new
> > step, to
> > > > >> the idea of usage
> > > > >> > > > of `with` step and provide the following modulation rule for
> > the
> > > > new
> > > > >> > > > `match` step: if name of the key in with step is passed in
> > with
> > > > "$"
> > > > >> prefix,
> > > > >> > > > this prefix is removed an the rest of the key is used as
> query
> > > > >> parameter.
> > > > >> > > > It is quite a common way of naming the parameters.  As for
> > binding
> > > > >> of
> > > > >> > > > parameters for server queries, if query parameters are not
> > > > provided
> > > > >> > > > explicitly, then we will perform an implicit lookup over the
> > > > >> bindings of
> > > > >> > > > those parameters.
> > > > >> > >
> > > > >> > > I like this. It gives good flexibility for localized "match
> > > > >> parameters", while retaining some connection to the existing
> > parameter
> > > > >> bindings in the server.
> > > > >> > >
> > > > >> > > > There is a discrepancy between the naming of parameters
> > between
> > > > GQL
> > > > >> and
> > > > >> > > > Gremlin, but that is, IMHO, acceptable.
> > > > >> > > > As one more alternative, probably even more appealing, we
> can
> > wrap
> > > > >> > > > parameters in "{}", as Koltin does :-)
> > > > >> > > > That will resemble GQL style and will not create a visual
> > mess.
> > > > >> > > >
> > > > >> > > > So it will look like:
> > > > >> > > > ` g.match("MATCH (src:Airport {code:srcCode}),
> (dest:Airport
> > > > >> > > > {code:destCode}) RETURN src")
> > > > >> > > >     .addE("Route").to("dest")
> > > > >> > > >     .property(T.id,
> > > > >> > > >
> > > > >>
> > > >
> >
> format("%{_}-%{_}").by(constant("{srcCode}")).by(constant("{destCode}")))`
> > > > >> > >
> > > > >> > > We don't currently support any parameter replacement within a
> > string
> > > > >> literal, currently parameters can only be used to swap out the
> > string
> > > > >> literal in its entirety. It may be complicated to implement as
> that
> > > > >> parameter resolution would need to be added to all steps which
> > accept
> > > > >> string arguments. It may be best to spin this into it's own
> > discussion
> > > > if
> > > > >> there is interest in pursuing this.
> > > > >> > >
> > > > >> > > > > I still haven't quite aligned myself regarding single
> > > > non-element
> > > > >> > > > returns. I'll reply back on this topic soon.
> > > > >> > > >
> > > > >> > > > I'm curious to see what you think.
> > > > >> > >
> > > > >> > > I've worked through some examples here and my preference is
> not
> > to
> > > > >> wrap single returns in maps. I understand the desire to limit the
> > > > possible
> > > > >> return types from the match step to just Elements and Maps, but in
> > my
> > > > >> opinion this is outweighed by the convenience of directly using
> the
> > > > >> results. For instance with map wrapping:
> > > > >> > > g.match("MATCH (n{name:'Cole'}) RETURN
> > > > >> n.birthday").select("n.birthday").dateDiff(datetime("2000-01-01"))
> > > > >> > > compared to without maps:
> > > > >> > > g.match("MATCH (n{name:'Cole'}) RETURN
> > > > >> n.birthday").dateDiff(datetime("2000-01-01"))
> > > > >> > >
> > > > >> > > The map wrapping and associated select feels unnecessary to me
> > and
> > > > >> gets in the way. I feel similarly about the following examples:
> > > > >> > >
> > > > >> > > g.match("MATCH (n:person) RETURN
> > > > >> n.age").select("n.age").order().limit(5) vs.
> > > > >> > > g.match("MATCH (n:person) RETURN n.age").order().limit(5)
> > > > >> > >
> > > > >> > > g.V(1).property('friendWeight', match("MATCH
> > > > >> (n{name:'Cole'})-[e:knows]->() RETURN
> > > > >> sum(e.weight)").select("sum(e.weight)")) vs.
> > > > >> > > g.V(1).property('friendWeight', match("MATCH
> > > > >> (n{name:'Cole'})-[e:knows]->() RETURN sum(e.weight)"))
> > > > >> > >
> > > > >> > > I couldn't come up with examples where I wanted to retain the
> > > > results
> > > > >> in their maps so the select() always feels like an unnecessary
> > chore to
> > > > me.
> > > > >> Without these maps, the possible return types of match() would
> grow
> > to
> > > > >> include any property type supported by the graph, as well as the
> > return
> > > > >> types of any functions included in the declarative language. This
> is
> > > > more
> > > > >> complex but not without precedent considering steps such as
> > inject() and
> > > > >> constant().
> > > > >> > >
> > > > >> > > Of course for any match query which returns multiple results,
> a
> > map
> > > > >> of all of them should be returned:
> > > > >> > > g.match("MATCH (p:person)-[e:created]->(s:software) RETURN *")
> > > > >> > > -> {"p": V[1], "e": E[9], "s": V[3]}
> > > > >> > >
> > > > >> > > In my mind this is mostly a matter of a small convenience. If
> > you
> > > > >> feel strongly that wrapping any non-element results into maps is
> > > > >> preferable, I can accept that as well.
> > > > >> > >
> > > > >> > > Thanks,
> > > > >> > > Cole
> > > > >> > >
> > > > >> > >
> > > > >> > > On 2025/08/27 15:20:31 Andrii Lomakin wrote:
> > > > >> > > > Good day.
> > > > >> > > >
> > > > >> > > > >I suppose I'm approaching this one more from the
> perspective
> > that
> > > > >> I don't
> > > > >> > > > see why these parameters need to be isolated to just the
> match
> > > > >> subquery.
> > > > >> > > >
> > > > >> > > > Thank you, Cole, for your feedback.
> > > > >> > > > While you paused further analysis, I investigated code a
> bit,
> > and
> > > > I
> > > > >> think
> > > > >> > > > we can transform the idea of introduction of new step, to
> the
> > idea
> > > > >> of usage
> > > > >> > > > of `with` step and provide the following modulation rule for
> > the
> > > > new
> > > > >> > > > `match` step: if name of the key in with step is passed in
> > with
> > > > "$"
> > > > >> prefix,
> > > > >> > > > this prefix is removed an the rest of the key is used as
> query
> > > > >> parameter.
> > > > >> > > > It is quite a common way of naming the parameters.  As for
> > binding
> > > > >> of
> > > > >> > > > parameters for server queries, if query parameters are not
> > > > provided
> > > > >> > > > explicitly, then we will perform an implicit lookup over the
> > > > >> bindings of
> > > > >> > > > those parameters.
> > > > >> > > > "Global" parameters can be applied in `with` Step in
> > > > >> GraphTraversalSource
> > > > >> > > > using the same approach.
> > > > >> > > >
> > > > >> > > > In such case, your query example would look like:
> > > > >> > > >
> > > > >> > > > ` g.match("MATCH (src:Airport {code:srcCode}),
> (dest:Airport
> > > > >> > > > {code:destCode}) RETURN src")
> > > > >> > > >     .addE("Route").to("dest")
> > > > >> > > >     .property(T.id,
> > > > >> > > >
> > > > >>
> > format("%{_}-%{_}").by(constant("$srcCode")).by(constant("$destCode")))`
> > > > >> > > >
> > > > >> > > > There is a discrepancy between the naming of parameters
> > between
> > > > GQL
> > > > >> and
> > > > >> > > > Gremlin, but that is, IMHO, acceptable.
> > > > >> > > > As one more alternative, probably even more appealing, we
> can
> > wrap
> > > > >> > > > parameters in "{}", as Koltin does :-)
> > > > >> > > > That will resemble GQL style and will not create a visual
> > mess.
> > > > >> > > >
> > > > >> > > > So it will look like:
> > > > >> > > > ` g.match("MATCH (src:Airport {code:srcCode}),
> (dest:Airport
> > > > >> > > > {code:destCode}) RETURN src")
> > > > >> > > >     .addE("Route").to("dest")
> > > > >> > > >     .property(T.id,
> > > > >> > > >
> > > > >>
> > > >
> >
> format("%{_}-%{_}").by(constant("{srcCode}")).by(constant("{destCode}")))`
> > > > >> > > >
> > > > >> > > > Also, nobody prohibits keeping the policy of resolving
> > parameter
> > > > >> binding as
> > > > >> > > > it is right now for server queries, with the recommended way
> > to
> > > > use
> > > > >> the new
> > > > >> > > > approach, so it will not be a breaking change and I doubt
> that
> > > > many
> > > > >> users
> > > > >> > > > use string literals wrapped {} as values.
> > > > >> > > >
> > > > >> > > > > I still haven't quite aligned myself regarding single
> > > > non-element
> > > > >> > > > returns. I'll reply back on this topic soon.
> > > > >> > > >
> > > > >> > > > I'm curious to see what you think.
> > > > >> > > >
> > > > >> > > > > Thanks again for driving these discussions. In my opinion
> > this
> > > > >> will be
> > > > >> > > > one of the most exciting additions to gremlin in quite some
> > time.
> > > > >> > > >
> > > > >> > > > Thank you, I am totally flattered :-)
> > > > >> > > >
> > > > >> > > >
> > > > >> > > >
> > > > >> > > >
> > > > >> > > >
> > > > >> > > >
> > > > >> > > >
> > > > >> > > >
> > > > >> > > > On Tue, Aug 26, 2025 at 12:13 AM Cole Greer <
> > [email protected]
> > > > >
> > > > >> wrote:
> > > > >> > > >
> > > > >> > > > > Hi Andrii,
> > > > >> > > > >
> > > > >> > > > > It was great to see your response. I think we are mostly
> in
> > > > >> agreement here.
> > > > >> > > > >
> > > > >> > > > > > It would be even better, IMHO, if the TP project added
> an
> > > > >> ANTLR4 parser
> > > > >> > > > > for GQL match statements
> > > > >> > > > >
> > > > >> > > > > Agreed, I've been loosely following LDBC's Open GQL
> project
> > > > which
> > > > >> has
> > > > >> > > > > produced an Apache 2 licensed GQL Antlr grammar which
> likely
> > > > >> offers a good
> > > > >> > > > > starting point.
> > > > >> > > > > https://github.com/opengql/grammar
> > > > >> > > > >
> > > > >> > > > > > Except for obvious query injection cases, which, in the
> > > > absence
> > > > >> of query
> > > > >> > > > > parameters, should be handled by users themselves
> > > > >> > > > >
> > > > >> > > > > I mostly considered this in the remote context, in which
> > > > reliance
> > > > >> on
> > > > >> > > > > gremlin-server for parameters is not an issue. I suppose
> > there
> > > > >> may be
> > > > >> > > > > embedded use cases in which query injection is a concern,
> > > > however
> > > > >> this
> > > > >> > > > > seems much rarer than the remote case.
> > > > >> > > > >
> > > > >> > > > > > another important argument for the presence of query
> > > > parameters
> > > > >> is that
> > > > >> > > > > query parsing is quite a heavy process
> > > > >> > > > >
> > > > >> > > > > I definitely agree on this front.
> > > > >> > > > >
> > > > >> > > > > > >I would prefer to solve that problem at the broader
> > gremlin
> > > > >> level,
> > > > >> > > > > instead of isolating it to the match step.
> > > > >> > > > > >
> > > > >> > > > > > Would you happen to have any other applications in mind?
> > > > >> > > > >
> > > > >> > > > > I suppose I'm approaching this one more from the
> perspective
> > > > that
> > > > >> I don't
> > > > >> > > > > see why these parameters need to be isolated to just the
> > match
> > > > >> subquery.
> > > > >> > > > >
> > > > >> > > > > Parameters is already a bit overloaded and messy in
> > TinkerPop
> > > > and
> > > > >> I hope
> > > > >> > > > > to reduce that complexity overtime. As already noted,
> remote
> > > > >> gremlin
> > > > >> > > > > scripts already have the ability to use parameters via
> > > > >> gremlin-server.
> > > > >> > > > > Bytecode requests currently have bindings which serve a
> > similar
> > > > >> purpose.
> > > > >> > > > > Internally we also have the Parameterizing interface which
> > is
> > > > >> more about
> > > > >> > > > > steps supporting things like `with()` modulation, and not
> > > > related
> > > > >> to query
> > > > >> > > > > parameters.
> > > > >> > > > >
> > > > >> > > > > I think it's easier for users if we simply have one set of
> > query
> > > > >> > > > > parameters instead of fractured gremlin parameters and
> match
> > > > >> parameters. I
> > > > >> > > > > expect there are some cases where it is useful to
> reference
> > the
> > > > >> same
> > > > >> > > > > parameter in both the gremlin and GQL portions of a query,
> > > > >> although it is
> > > > >> > > > > admittedly not a common use case. The following query is a
> > > > >> somewhat
> > > > >> > > > > contrived example where the same parameters are used to
> > match 2
> > > > >> nodes, and
> > > > >> > > > > then the same parameters are concatenated together to form
> > an id
> > > > >> for a new
> > > > >> > > > > edge which is added between the nodes:
> > > > >> > > > > g.match("MATCH (src:Airport {code:srcCode}),
> (dest:Airport
> > > > >> > > > > {code:destCode}) RETURN src")
> > > > >> > > > >     .addE("Route").to("dest")
> > > > >> > > > >     .property(T.id,
> > > > >> > > > >
> > > > format("%{_}-%{_}").by(constant(srcCode)).by(constant(destCode)))
> > > > >> > > > >
> > > > >> > > > > There may also be cases where it is useful to have
> multiple
> > > > match
> > > > >> steps in
> > > > >> > > > > a single traversal which reuse the same parameters.
> > > > >> > > > >
> > > > >> > > > > Taking the existing remote query parameters, reworking
> them
> > to
> > > > >> support the
> > > > >> > > > > embedded case as well, then making those parameters
> > available to
> > > > >> the new
> > > > >> > > > > match step would solve the query injection and parse cache
> > > > >> problems without
> > > > >> > > > > introducing an additional form of parameters for users to
> > > > handle.
> > > > >> > > > >
> > > > >> > > > > > > I will take some time next week to work through some
> > example
> > > > >> queries
> > > > >> > > > > and get a better sense of how I feel on each option here.
> > > > >> > > > > >
> > > > >> > > > > > Looking forward to reading your conclusions.
> > > > >> > > > >
> > > > >> > > > > I still haven't quite aligned myself regarding single
> > > > non-element
> > > > >> returns.
> > > > >> > > > > I'll reply back on this topic soon.
> > > > >> > > > >
> > > > >> > > > > Thanks again for driving these discussions. In my opinion
> > this
> > > > >> will be one
> > > > >> > > > > of the most exciting additions to gremlin in quite some
> > time.
> > > > >> > > > >
> > > > >> > > > > Regards,
> > > > >> > > > > Cole
> > > > >> > > > >
> > > > >> > > > > On 2025/08/23 14:00:51 Andrii Lomakin wrote:
> > > > >> > > > > > Good day, Cole.
> > > > >> > > > > >
> > > > >> > > > > > Glad to exchange more ideas with you in this thread.
> > > > >> > > > > >
> > > > >> > > > > > >I think it would make sense for TinkerPop to adopt a
> > default
> > > > >> language
> > > > >> > > > > for the new match step, which is some heavily restricted
> > form of
> > > > >> GQL
> > > > >> > > > > (read-only, limited to basic MATCH, WHERE, and RETURN
> > > > >> statements). This
> > > > >> > > > > "standard" language could then be used in the new match
> step
> > > > >> without a
> > > > >> > > > > language with-modulator. Providers would still be free to
> > > > support
> > > > >> their own
> > > > >> > > > > languages via that modulator if they choose.
> > > > >> > > > > >
> > > > >> > > > > > That makes sense, I agree with you.
> > > > >> > > > > > It would be even better, IMHO, if the TP project added
> an
> > > > ANTLR4
> > > > >> > > > > > parser for GQL match statements (there is already at
> > least one
> > > > >> ANTLR
> > > > >> > > > > > spec in the public domain) that vendors can use to work
> > on the
> > > > >> AST
> > > > >> > > > > > level. We can talk about possible collaboration on this
> > task.
> > > > >> > > > > >
> > > > >> > > > > > > I'd be interested if you have any examples where
> > embedded
> > > > >> parameters
> > > > >> > > > > present a clear advantage.
> > > > >> > > > > >
> > > > >> > > > > > I expected that this question would be raised :-)
> > > > >> > > > > > But decided to move the discussion to a follow-up thread
> > to
> > > > >> avoid
> > > > >> > > > > > polluting the main proposal.
> > > > >> > > > > > Except for obvious query injection cases, which, in the
> > > > absence
> > > > >> of
> > > > >> > > > > > query parameters, should be handled by users themselves,
> > > > another
> > > > >> > > > > > important argument for the presence of query parameters
> is
> > > > that
> > > > >> query
> > > > >> > > > > > parsing is quite a heavy process, and the consumption of
> > 20%
> > > > of
> > > > >> CPU
> > > > >> > > > > > resources on query parsing is not a rare exception.
> > > > >> > > > > > To avoid this overhead, query parsing results (likely
> > ASTs)
> > > > are
> > > > >> cached
> > > > >> > > > > > by a simple string hash code (likely the only way, as
> > they are
> > > > >> not
> > > > >> > > > > > parsed in this phase). Of course, the absence of query
> > > > >> parameters very
> > > > >> > > > > > often increases the variability of queries by several
> > orders
> > > > of
> > > > >> > > > > > magnitude and voids caching efforts.
> > > > >> > > > > >
> > > > >> > > > > > >I would prefer to solve that problem at the broader
> > gremlin
> > > > >> level,
> > > > >> > > > > instead of isolating it to the match step.
> > > > >> > > > > >
> > > > >> > > > > > Would you happen to have any other applications in mind?
> > > > >> > > > > >
> > > > >> > > > > > > I will take some time next week to work through some
> > example
> > > > >> queries
> > > > >> > > > > and get a better sense of how I feel on each option here.
> > > > >> > > > > >
> > > > >> > > > > > Looking forward to reading your conclusions.
> > > > >> > > > > >
> > > > >> > > > > > >. I think that all "variables" bound in the match query
> > > > should
> > > > >> be
> > > > >> > > > > stored such that they are later selectable.
> > > > >> > > > > >
> > > > >> > > > > > Yeah, cool idea!
> > > > >> > > > > >
> > > > >> > > > > > >Overall I think this would be a great change to
> gremlin.
> > I
> > > > >> look forward
> > > > >> > > > > to keeping this discussion going and ultimately seeing the
> > > > >> changes land in
> > > > >> > > > > TinkerPop.
> > > > >> > > > > >
> > > > >> > > > > > Thank you, Cole!
> > > > >> > > > > > Once the discussion comes to a natural conclusion, I
> will
> > > > >> summarize
> > > > >> > > > > > all the ideas again to ensure that we are all on the
> same
> > > > page.
> > > > >> Then,
> > > > >> > > > > > we will add it to our roadmap.
> > > > >> > > > > >
> > > > >> > > > > > On Sat, Aug 23, 2025 at 12:01 AM Cole Greer <
> > > > >> [email protected]>
> > > > >> > > > > wrote:
> > > > >> > > > > > >
> > > > >> > > > > > > Hi Andrii,
> > > > >> > > > > > >
> > > > >> > > > > > > Thanks for starting this discussion and putting
> together
> > > > this
> > > > >> > > > > proposal. I want to start by saying that overall, I'm
> > massively
> > > > >> in favour
> > > > >> > > > > of the proposed overhaul of match(). This is a topic that
> > has
> > > > >> come up many
> > > > >> > > > > times in the past, and taking advantage of an established
> > > > >> declarative
> > > > >> > > > > language like GQL always seems to be the preferred
> solution.
> > > > >> > > > > > >
> > > > >> > > > > > > The idea of having the language configurable via
> > something
> > > > >> like
> > > > >> > > > > `.with(“language”,
> > > > >> > > > > > > “GQL”)` is quite interesting, and something I haven't
> > seen
> > > > in
> > > > >> previous
> > > > >> > > > > discussions. There is clear value in allowing providers to
> > > > >> support their
> > > > >> > > > > own preferred declarative languages here, but I also worry
> > about
> > > > >> the loss
> > > > >> > > > > of query portability if TinkerPop is too hands off on the
> > choice
> > > > >> of
> > > > >> > > > > declarative language. I believe the vast majority of
> usages
> > here
> > > > >> will be
> > > > >> > > > > seeing a traversal with a simple GQL-like match pattern. I
> > think
> > > > >> it would
> > > > >> > > > > make sense for TinkerPop to adopt a default language for
> > the new
> > > > >> match
> > > > >> > > > > step, which is some heavily restricted form of GQL
> > (read-only,
> > > > >> limited to
> > > > >> > > > > basic MATCH, WHERE, and RETURN statements). This
> "standard"
> > > > >> language could
> > > > >> > > > > then be used in the new match step without a language
> > > > >> with-modulator.
> > > > >> > > > > Providers would still be free to support their own
> > languages via
> > > > >> that
> > > > >> > > > > modulator if they choose.
> > > > >> > > > > > >
> > > > >> > > > > > > I will take a bit more time to consider the withParameter() proposal.
> > > > >> > > > > My initial reaction is that I prefer to tie it into the existing parameter
> > > > >> > > > > bindings included in remote requests to gremlin-server. I would like query
> > > > >> > > > > parameters to function in a unified manner across the entire traversal if
> > > > >> > > > > possible, instead of a separate detached system isolated to the new match
> > > > >> > > > > step. I understand the current limitation of only supporting parameters in
> > > > >> > > > > remote traversals. I'm not immediately seeing the need to support
> > > > >> > > > > parameters for embedded traversals here; I'd be interested if you have any
> > > > >> > > > > examples where embedded parameters present a clear advantage. If we do
> > > > >> > > > > decide there is a need for embedded parameters, I would prefer to solve
> > > > >> > > > > that problem at the broader gremlin level, instead of isolating it to the
> > > > >> > > > > match step.
> > > > >> > > > > > >
> > > > >> > > > > > > I totally agree that the start and mid-step behaviour of the new match
> > > > >> > > > > step should be modeled after V() and E().
> > > > >> > > > > > >
> > > > >> > > > > > > I think the trickiest part of getting this right is the return types.
> > > > >> > > > > The most common use case I expect is where the RETURN clause only includes
> > > > >> > > > > a single node or edge. In this case I completely agree with returning the
> > > > >> > > > > element itself. I definitely want to support usages such as
> > > > >> > > > > g.match("MATCH (n{name:'Cole'}) RETURN n").out()... My main tenet here is that
> > > > >> > > > > results should naturally flow from the declarative match into the subsequent
> > > > >> > > > > gremlin and be easy to consume. If multiple objects are returned, I would
> > > > >> > > > > agree that it is necessary to return a Map<String, ?> as in
> > > > >> > > > > g.match("MATCH (p:person)-[e:created]->(s:software) RETURN *") -> {"p": V[1], "e": E[9], "s": V[3]} ...
> > > > >> > > > > > >
> > > > >> > > > > > > I'm still on the fence for how to handle single returns of
> > > > >> > > > > non-elements. I see the value in your recommendation to return a map of
> > > > >> > > > > size 1, but I also see some convenience to directly returning the value
> > > > >> > > > > (usually a single property). I will take some time next week to work
> > > > >> > > > > through some example queries and get a better sense of how I feel on each
> > > > >> > > > > option here.
> > > > >> > > > > > >
> > > > >> > > > > > > There is one final item which I would like to see added to the
> > > > >> > > > > proposal. I think that all "variables" bound in the match query should be
> > > > >> > > > > stored such that they are later selectable. Essentially I think it's
> > > > >> > > > > important to support something like this:
> > > > >> > > > > > >
> > > > >> > > > > > > g.match("MATCH (n1{name:'Cole'})-[]->(n2) RETURN n1").where(...)...select(n2).out()...
> > > > >> > > > > > >
> > > > >> > > > > > > The ability to select other bound variables later in the traversal
> > > > >> > > > > should greatly limit the number of times users are forced to return
> > > > >> > > > > multiple items at once, which reduces the number of use cases where users
> > > > >> > > > > will be forced to break down maps in gremlin to complete their query.
> > > > >> > > > > > >
> > > > >> > > > > > > Overall I think this would be a great change to gremlin. I look
> > > > >> > > > > forward to keeping this discussion going and ultimately seeing the changes
> > > > >> > > > > land in TinkerPop.
> > > > >> > > > > > >
> > > > >> > > > > > > Thanks,
> > > > >> > > > > > > Cole
> > > > >> > > > > > >
> > > > >> > > > > > > On 2025/08/22 15:46:10 Andrii Lomakin wrote:
> > > > >> > > > > > > > Good day.
> > > > >> > > > > > > >
> > > > >> > > > > > > > I propose new semantics for the match step in Gremlin, which we discussed
> > > > >> > > > > > > > briefly in the Discord chat. The ideas listed below partially summarize
> > > > >> > > > > > > > suggestions made by several discussion participants.
> > > > >> > > > > > > >
> > > > >> > > > > > > > The current semantics of the match step are complex to optimize, so users
> > > > >> > > > > > > > do not use this step in practice, and DB vendors do not recommend using
> > > > >> > > > > > > > the match step in queries.
> > > > >> > > > > > > >
> > > > >> > > > > > > > Instead, what is proposed is to provide a new match step based on
> > > > >> > > > > > > > declarative semantics.
> > > > >> > > > > > > >
> > > > >> > > > > > > > The signature of this step is quite simple:
> > > > >> > > > > > > > Traversal<S, E> match(String matchQuery).
> > > > >> > > > > > > >
> > > > >> > > > > > > > Here matchQuery is a match statement written in a declarative query language
> > > > >> > > > > > > > supported by the provider; I will use GQL as an example below.
> > > > >> > > > > > > >
> > > > >> > > > > > > > This step will require the language as a configuration parameter,
> > > > >> > > > > > > > provided using the with() step.
> > > > >> > > > > > > >
> > > > >> > > > > > > > So the simplest query will look like:
> > > > >> > > > > > > >
> > > > >> > > > > > > > g.match("MATCH (person:Person)-[:knows]->(friend:Person)").with("language", "GQL")
> > > > >> > > > > > > >
> > > > >> > > > > > > > The match step can accept query parameters, so if we provide a query like
> > > > >> > > > > > > >
> > > > >> > > > > > > > g.match("MATCH (p:Person WHERE p.name = $personName) RETURN p.email").with("language", "GQL")
> > > > >> > > > > > > >
> > > > >> > > > > > > > we may use parameter bindings, but it will work only for interaction with
> > > > >> > > > > > > > Gremlin Server, so instead, I propose an additional modulator step:
> > > > >> > > > > > > > withParameter(String name, Object value)
> > > > >> > > > > > > >
> > > > >> > > > > > > > In that case, the final version will look like:
> > > > >> > > > > > > >
> > > > >> > > > > > > > g.match("MATCH (p:Person WHERE p.name = $personName) RETURN p.email").with("language", "GQL").withParameter("personName", "Stephen")
> > > > >> > > > > > > >
> > > > >> > > > > > > > Alongside the version of the withParameter step that takes the name of the
> > > > >> > > > > > > > query parameter, a version with the following signature should also be
> > > > >> > > > > > > > provided: withParameter(int index, Object value), for query languages that
> > > > >> > > > > > > > support indexed parameters alongside or instead of named parameters.
> > > > >> > > > > > > >
> > > > >> > > > > > > > Because we already introduced one modulator step, it is reasonable to
> > > > >> > > > > > > > consider replacing the with() step with a more specific withQueryLanguage()
> > > > >> > > > > > > > modulator step that will allow us to add more expressiveness to the
> > > > >> > > > > > > > resulting queries.
> > > > >> > > > > > > >
> > > > >> > > > > > > > In that case, the final version will look like:
> > > > >> > > > > > > >
> > > > >> > > > > > > > g.match("MATCH (p:Person WHERE p.name = $personName) RETURN p.email").withQueryLanguage("GQL").withParameter("personName", "Stephen")
> > > > >> > > > > > > >
> > > > >> > > > > > > > As for the scope of application of this step, I recommend making it behave
> > > > >> > > > > > > > exactly as it is implemented for the V() and E() steps. It could be added
> > > > >> > > > > > > > in the middle of a GraphTraversal, but the execution result will be the same:
> > > > >> > > > > > > > pattern matching is applied to the whole graph stored in the
> > > > >> > > > > > > > database (not to the items filtered/transformed by the previous steps).
> > > > >> > > > > > > >
> > > > >> > > > > > > > It also means that the match step will be added to the GraphTraversalSource.
> > > > >> > > > > > > >
> > > > >> > > > > > > > As for the format of the output of the match step, I would recommend the
> > > > >> > > > > > > > following:
> > > > >> > > > > > > >
> > > > >> > > > > > > > 1.  If the match statement returns an Element instance, it is returned as is.
> > > > >> > > > > > > >
> > > > >> > > > > > > > 2.  Otherwise, it should return any value that is allowed to be a property
> > > > >> > > > > > > > value in Element.
> > > > >> > > > > > > >
> > > > >> > > > > > > > 3. I would add an optional recommendation to return either Element or
> > > > >> > > > > > > > Map<String, ?>, where each key of the map is the name of a projection of
> > > > >> > > > > > > > the query result. For the query
> > > > >> > > > > > > >
> > > > >> > > > > > > > g.match("MATCH (p:Person WHERE p.name = $personName) RETURN p.email").withQueryLanguage("GQL").withParameter("personName", "Stephen")
> > > > >> > > > > > > >
> > > > >> > > > > > > > the result will look like {"p.email": "[email protected]"}. Following this optional
> > > > >> > > > > > > > recommendation will, IMHO, improve user experience.
> > > > >> > > > > > > >
> > > > >> > > > > > > > This step should be restricted to executing only idempotent queries.
> > > > >> > > > > > > >
> > > > >> > > > > > > > I would also recommend adding versions of withParameter() that accept
> > > > >> > > > > > > > Traversal as a value of the parameters, namely:
> > > > >> > > > > > > > 1.  withParameter(String name, TraversalSource value)
> > > > >> > > > > > > >
> > > > >> > > > > > > > 2.  withParameter(int index, TraversalSource value)
> > > > >> > > > > > > >
> > > > >> > > > > > > >
> > > > >> > > > > > > >
> > > > >> > > > > > > > The current version of the match step should be deprecated and then removed.
> > > > >> > > > > > > >
> > > > >> > > > > > > > I want to thank Stephen Mallette, whose initial idea closely aligned with
> > > > >> > > > > > > > ours and who actively contributed to our discussions.
> > > > >> > > > > > > >
> > > > >> > > > > > > > I'm looking forward to your thoughts, observations, and any other feedback
> > > > >> > > > > > > > you may have.
> > > > >> > > > > > > >
> > > > >> > > > > > > > Best Regards,
> > > > >> > > > > > > > YouTrackDB development lead
> > > > >> > > > > > > > Andrii Lomakin
> > > > >> > > > > > > >
