Hi Cole. Done https://github.com/apache/tinkerpop/pull/3232
I'm sorry, but implementing schema changes over the manipulation by vertices consumed me a bit.

On Tue, Sep 23, 2025 at 8:31 AM Andrii Lomakin <[email protected]> wrote:
>
> Hi Cole, sure.
>
> Let me wait for the results of additional discussion in Discord, and then I will summarize everything as a PR.
>
> On Mon, Sep 22, 2025 at 9:13 PM Cole Greer <[email protected]> wrote:
>
> > Thanks Andrii,
> >
> > That proposal looks good to me. I'd like to retain that in our records as well. Would you mind opening a PR to the 3.8-dev branch which adds that as a proposal to our future docs:
> > https://github.com/apache/tinkerpop/tree/3.8-dev/docs/src/dev/future
> >
> > Regards,
> > Cole
> >
> > From: Andrii Lomakin <[email protected]>
> > Date: Monday, September 22, 2025 at 12:04 AM
> > To: [email protected] <[email protected]>
> > Subject: Re: Proposal of the new declarative semantics of the match step
> > Good day.
> > As this thread has been dormant for two weeks already, I have summarized our discussion and created a spec proposal.
> > Please skip the part related to the GQL DSL, which is specific to us ->
> > https://youtrack.jetbrains.com/articles/YTDB-A-32/Specification-for-the-declarative-match-step
> >
> > Looking forward to any constructive feedback.
> >
> > On Sun, Sep 7, 2025 at 4:40 PM Andrii Lomakin <[email protected]> wrote:
> >
> > > Good day, Cole.
> > > Thank you for your feedback.
> > >
> > > As for your concerns about supporting other pattern-matching languages, it is unlikely that a pattern-matching language can be implemented without the concept of a variable.
> > >
> > > I will wait two weeks for feedback from other participants, and then, if there is no activity, I will summarize our discussion.
> > >
> > > On Thu, Sep 4, 2025 at 2:02 AM Cole Greer <[email protected]> wrote:
> > >
> > >> Hi Andrii and Lev,
> > >>
> > >> I really like that idea. It's clean, simple, and concise.
> > >> I suppose one downside of dropping the RETURN is that you can no longer use aggregator functions as in RETURN count(n), however those capabilities already exist in gremlin so there really isn't much of an impact here.
> > >>
> > >> I am concerned that no returns/always returning Optional.EMPTY may place undesired restrictions on providers who are using different declarative languages via with("language", "GSQL"). If we want to give providers full flexibility here, they may want the ability to directly return data.
> > >>
> > >> For my purposes, I definitely want the default/reference GQL-based match language to work like your example:
> > >>
> > >> g.V(1).property('friendWeight', match("MATCH (n{name:'Cole'})-[e:knows]->()").select("n").values("weight").sum())
> > >>
> > >> I'm still not sure how much providers will choose to use their own declarative languages instead of using our default implementation, so this may not be much of a concern in practice. I would be happy to start with always returning Optional.EMPTY, and then we can consider giving providers some extensibility on returns in the future if there is demand.
> > >>
> > >> I'm fully on board with this proposal. Thanks for all of the work that has gone into this.
> > >>
> > >> Regards,
> > >> Cole
> > >>
> > >> On 2025/09/03 08:03:36 Andrii Lomakin wrote:
> > >> > Hi Cole.
> > >> > Thank you for sharing. We reached an agreement on all topics except the use of the RETURN statement.
> > >> >
> > >> > We brainstormed inside the team and came up with an interesting idea about handling the output from the match statement, thanks to Lev Sivashov, who provided it.
> > >> >
> > >> > This idea is combined with another of my proposals: to treat Optional.EMPTY returned by Traverser as a jolt to the execution of the next step by Traversal, while it is treated as no value for the steps that do not process input values, such as addV().
> > >> > It will fix queries such as `g.addV(__.inject('x'))` and similar ones in Gremlin that accept Traversal and need a fake Traverser with a value to work as expected.
> > >> >
> > >> > So we propose not to support RETURN at all, as we already have a means to handle projections in Gremlin.
> > >> >
> > >> > Instead:
> > >> > 1. The match() step returns Optional.empty() as its result.
> > >> > 2. We specify which MATCH variables we need to fetch using the select() step.
> > >> >
> > >> > So the query
> > >> > g.V(1).property('friendWeight', match("MATCH (n{name:'Cole'})-[e:knows]->() RETURN sum(e.weight)"))
> > >> >
> > >> > will look like
> > >> > g.V(1).property('friendWeight', match("MATCH (n{name:'Cole'})-[e:knows]->()").select("n").values("weight").sum())
> > >> >
> > >> > This approach is easily optimized for execution by analyzing the select steps and providing the GQL executor with the names of the variables that are really needed. It also looks elegant, prevents informational clutter, and offers minimal and efficient pattern-matching methods for Gremlin.
> > >> > WDYT?
> > >> >
> > >> > If you agree, I will wait a week to gather feedback from other participants. If no additions are provided, I will publish a summary here and link to our design document for general information, and I will start implementing it at our pace.
> > >> >
> > >> > On Wed, Sep 3, 2025 at 5:27 AM Cole Greer <[email protected]> wrote:
> > >> > >
> > >> > > Hi Andrii,
> > >> > >
> > >> > > I've taken more time to think through your proposal.
> > >> > >
> > >> > > > I think we can transform the idea of introducing a new step into the idea of using the `with` step, and provide the following modulation rule for the new `match` step: if the name of a key in the with step is passed with a "$" prefix, this prefix is removed and the rest of the key is used as a query parameter. It is quite a common way of naming the parameters. As for binding of parameters for server queries, if query parameters are not provided explicitly, then we will perform an implicit lookup over the bindings of those parameters.
> > >> > >
> > >> > > I like this. It gives good flexibility for localized "match parameters", while retaining some connection to the existing parameter bindings in the server.
> > >> > >
> > >> > > > There is a discrepancy between the naming of parameters between GQL and Gremlin, but that is, IMHO, acceptable.
> > >> > > > As one more alternative, probably even more appealing, we can wrap parameters in "{}", as Kotlin does :-)
> > >> > > > That will resemble GQL style and will not create a visual mess.
> > >> > > >
> > >> > > > So it will look like:
> > >> > > > ` g.match("MATCH (src:Airport {code:srcCode}), (dest:Airport {code:destCode}) RETURN src")
> > >> > > >     .addE("Route").to("dest")
> > >> > > >     .property(T.id, format("%{_}-%{_}").by(constant("{srcCode}")).by(constant("{destCode}")))`
> > >> > >
> > >> > > We don't currently support any parameter replacement within a string literal; currently parameters can only be used to swap out the string literal in its entirety. It may be complicated to implement as that parameter resolution would need to be added to all steps which accept string arguments.
> > >> > > It may be best to spin this into its own discussion if there is interest in pursuing this.
> > >> > >
> > >> > > > > I still haven't quite aligned myself regarding single non-element returns. I'll reply back on this topic soon.
> > >> > > >
> > >> > > > I'm curious to see what you think.
> > >> > >
> > >> > > I've worked through some examples here and my preference is not to wrap single returns in maps. I understand the desire to limit the possible return types from the match step to just Elements and Maps, but in my opinion this is outweighed by the convenience of directly using the results. For instance with map wrapping:
> > >> > > g.match("MATCH (n{name:'Cole'}) RETURN n.birthday").select("n.birthday").dateDiff(datetime("2000-01-01"))
> > >> > > compared to without maps:
> > >> > > g.match("MATCH (n{name:'Cole'}) RETURN n.birthday").dateDiff(datetime("2000-01-01"))
> > >> > >
> > >> > > The map wrapping and associated select feels unnecessary to me and gets in the way. I feel similarly about the following examples:
> > >> > >
> > >> > > g.match("MATCH (n:person) RETURN n.age").select("n.age").order().limit(5) vs.
> > >> > > g.match("MATCH (n:person) RETURN n.age").order().limit(5)
> > >> > >
> > >> > > g.V(1).property('friendWeight', match("MATCH (n{name:'Cole'})-[e:knows]->() RETURN sum(e.weight)").select("sum(e.weight)")) vs.
> > >> > > g.V(1).property('friendWeight', match("MATCH (n{name:'Cole'})-[e:knows]->() RETURN sum(e.weight)"))
> > >> > >
> > >> > > I couldn't come up with examples where I wanted to retain the results in their maps so the select() always feels like an unnecessary chore to me. Without these maps, the possible return types of match() would grow to include any property type supported by the graph, as well as the return types of any functions included in the declarative language.
> > >> > > This is more complex but not without precedent considering steps such as inject() and constant().
> > >> > >
> > >> > > Of course for any match query which returns multiple results, a map of all of them should be returned:
> > >> > > g.match("MATCH (p:person)-[e:created]->(s:software) RETURN *")
> > >> > > -> {"p": V[1], "e": E[9], "s": V[3]}
> > >> > >
> > >> > > In my mind this is mostly a matter of a small convenience. If you feel strongly that wrapping any non-element results into maps is preferable, I can accept that as well.
> > >> > >
> > >> > > Thanks,
> > >> > > Cole
> > >> > >
> > >> > > On 2025/08/27 15:20:31 Andrii Lomakin wrote:
> > >> > > > Good day.
> > >> > > >
> > >> > > > >I suppose I'm approaching this one more from the perspective that I don't see why these parameters need to be isolated to just the match subquery.
> > >> > > >
> > >> > > > Thank you, Cole, for your feedback.
> > >> > > > While you paused further analysis, I investigated the code a bit, and I think we can transform the idea of introducing a new step into the idea of using the `with` step, and provide the following modulation rule for the new `match` step: if the name of a key in the with step is passed with a "$" prefix, this prefix is removed and the rest of the key is used as a query parameter. It is quite a common way of naming the parameters. As for binding of parameters for server queries, if query parameters are not provided explicitly, then we will perform an implicit lookup over the bindings of those parameters.
> > >> > > > "Global" parameters can be applied in the `with` step in GraphTraversalSource using the same approach.
> > >> > > >
> > >> > > > In such case, your query example would look like:
> > >> > > >
> > >> > > > ` g.match("MATCH (src:Airport {code:srcCode}), (dest:Airport {code:destCode}) RETURN src")
> > >> > > >     .addE("Route").to("dest")
> > >> > > >     .property(T.id, format("%{_}-%{_}").by(constant("$srcCode")).by(constant("$destCode")))`
> > >> > > >
> > >> > > > There is a discrepancy between the naming of parameters between GQL and Gremlin, but that is, IMHO, acceptable.
> > >> > > > As one more alternative, probably even more appealing, we can wrap parameters in "{}", as Kotlin does :-)
> > >> > > > That will resemble GQL style and will not create a visual mess.
> > >> > > >
> > >> > > > So it will look like:
> > >> > > > ` g.match("MATCH (src:Airport {code:srcCode}), (dest:Airport {code:destCode}) RETURN src")
> > >> > > >     .addE("Route").to("dest")
> > >> > > >     .property(T.id, format("%{_}-%{_}").by(constant("{srcCode}")).by(constant("{destCode}")))`
> > >> > > >
> > >> > > > Also, nobody prohibits keeping the policy of resolving parameter bindings as it is right now for server queries, with the new approach as the recommended way, so it will not be a breaking change, and I doubt that many users use string literals wrapped in {} as values.
> > >> > > >
> > >> > > > > I still haven't quite aligned myself regarding single non-element returns. I'll reply back on this topic soon.
> > >> > > >
> > >> > > > I'm curious to see what you think.
> > >> > > >
> > >> > > > > Thanks again for driving these discussions. In my opinion this will be one of the most exciting additions to gremlin in quite some time.
> > >> > > >
> > >> > > > Thank you, I am totally flattered :-)
> > >> > > >
> > >> > > > On Tue, Aug 26, 2025 at 12:13 AM Cole Greer <[email protected]> wrote:
> > >> > > >
> > >> > > > > Hi Andrii,
> > >> > > > >
> > >> > > > > It was great to see your response. I think we are mostly in agreement here.
> > >> > > > >
> > >> > > > > > It would be even better, IMHO, if the TP project added an ANTLR4 parser for GQL match statements
> > >> > > > >
> > >> > > > > Agreed, I've been loosely following LDBC's Open GQL project which has produced an Apache 2 licensed GQL Antlr grammar which likely offers a good starting point.
> > >> > > > > https://github.com/opengql/grammar
> > >> > > > >
> > >> > > > > > Except for obvious query injection cases, which, in the absence of query parameters, should be handled by users themselves
> > >> > > > >
> > >> > > > > I mostly considered this in the remote context, in which reliance on gremlin-server for parameters is not an issue. I suppose there may be embedded use cases in which query injection is a concern, however this seems much rarer than the remote case.
> > >> > > > >
> > >> > > > > > another important argument for the presence of query parameters is that query parsing is quite a heavy process
> > >> > > > >
> > >> > > > > I definitely agree on this front.
> > >> > > > >
> > >> > > > > > >I would prefer to solve that problem at the broader gremlin level, instead of isolating it to the match step.
> > >> > > > > >
> > >> > > > > > Would you happen to have any other applications in mind?
> > >> > > > >
> > >> > > > > I suppose I'm approaching this one more from the perspective that I don't see why these parameters need to be isolated to just the match subquery.
> > >> > > > >
> > >> > > > > Parameters are already a bit overloaded and messy in TinkerPop and I hope to reduce that complexity over time. As already noted, remote gremlin scripts already have the ability to use parameters via gremlin-server. Bytecode requests currently have bindings which serve a similar purpose. Internally we also have the Parameterizing interface which is more about steps supporting things like `with()` modulation, and not related to query parameters.
> > >> > > > >
> > >> > > > > I think it's easier for users if we simply have one set of query parameters instead of fractured gremlin parameters and match parameters. I expect there are some cases where it is useful to reference the same parameter in both the gremlin and GQL portions of a query, although it is admittedly not a common use case.
> > >> > > > > The following query is a somewhat contrived example where the same parameters are used to match 2 nodes, and then the same parameters are concatenated together to form an id for a new edge which is added between the nodes:
> > >> > > > > g.match("MATCH (src:Airport {code:srcCode}), (dest:Airport {code:destCode}) RETURN src")
> > >> > > > >     .addE("Route").to("dest")
> > >> > > > >     .property(T.id, format("%{_}-%{_}").by(constant(srcCode)).by(constant(destCode)))
> > >> > > > >
> > >> > > > > There may also be cases where it is useful to have multiple match steps in a single traversal which reuse the same parameters.
> > >> > > > >
> > >> > > > > Taking the existing remote query parameters, reworking them to support the embedded case as well, then making those parameters available to the new match step would solve the query injection and parse cache problems without introducing an additional form of parameters for users to handle.
> > >> > > > >
> > >> > > > > > > I will take some time next week to work through some example queries and get a better sense of how I feel on each option here.
> > >> > > > > >
> > >> > > > > > Looking forward to reading your conclusions.
> > >> > > > >
> > >> > > > > I still haven't quite aligned myself regarding single non-element returns. I'll reply back on this topic soon.
> > >> > > > >
> > >> > > > > Thanks again for driving these discussions. In my opinion this will be one of the most exciting additions to gremlin in quite some time.
> > >> > > > >
> > >> > > > > Regards,
> > >> > > > > Cole
> > >> > > > >
> > >> > > > > On 2025/08/23 14:00:51 Andrii Lomakin wrote:
> > >> > > > > > Good day, Cole.
> > >> > > > > >
> > >> > > > > > Glad to exchange more ideas with you in this thread.
> > >> > > > > >
> > >> > > > > > >I think it would make sense for TinkerPop to adopt a default language for the new match step, which is some heavily restricted form of GQL (read-only, limited to basic MATCH, WHERE, and RETURN statements). This "standard" language could then be used in the new match step without a language with-modulator. Providers would still be free to support their own languages via that modulator if they choose.
> > >> > > > > >
> > >> > > > > > That makes sense, I agree with you.
> > >> > > > > > It would be even better, IMHO, if the TP project added an ANTLR4 parser for GQL match statements (there is already at least one ANTLR spec in the public domain) that vendors can use to work on the AST level. We can talk about possible collaboration on this task.
> > >> > > > > >
> > >> > > > > > > I'd be interested if you have any examples where embedded parameters present a clear advantage.
> > >> > > > > >
> > >> > > > > > I expected that this question would be raised :-)
> > >> > > > > > But decided to move the discussion to a follow-up thread to avoid polluting the main proposal.
> > >> > > > > > Except for obvious query injection cases, which, in the absence of query parameters, should be handled by users themselves, another important argument for the presence of query parameters is that query parsing is quite a heavy process, and the consumption of 20% of CPU resources on query parsing is not a rare exception.
> > >> > > > > > To avoid this overhead, query parsing results (likely ASTs) are cached by a simple string hash code (likely the only way, as they are not parsed in this phase). Of course, the absence of query parameters very often increases the variability of queries by several orders of magnitude and voids caching efforts.
> > >> > > > > >
> > >> > > > > > >I would prefer to solve that problem at the broader gremlin level, instead of isolating it to the match step.
> > >> > > > > >
> > >> > > > > > Would you happen to have any other applications in mind?
> > >> > > > > >
> > >> > > > > > > I will take some time next week to work through some example queries and get a better sense of how I feel on each option here.
> > >> > > > > >
> > >> > > > > > Looking forward to reading your conclusions.
> > >> > > > > >
> > >> > > > > > > I think that all "variables" bound in the match query should be stored such that they are later selectable.
> > >> > > > > >
> > >> > > > > > Yeah, cool idea!
> > >> > > > > >
> > >> > > > > > >Overall I think this would be a great change to gremlin. I look forward to keeping this discussion going and ultimately seeing the changes land in TinkerPop.
> > >> > > > > >
> > >> > > > > > Thank you, Cole!
> > >> > > > > > Once the discussion comes to a natural conclusion, I will summarize all the ideas again to ensure that we are all on the same page. Then, we will add it to our roadmap.
> > >> > > > > >
> > >> > > > > > On Sat, Aug 23, 2025 at 12:01 AM Cole Greer <[email protected]> wrote:
> > >> > > > > > >
> > >> > > > > > > Hi Andrii,
> > >> > > > > > >
> > >> > > > > > > Thanks for starting this discussion and putting together this proposal.
> > >> > > > > > > I want to start by saying that overall, I'm massively in favour of the proposed overhaul of match(). This is a topic that has come up many times in the past, and taking advantage of an established declarative language like GQL always seems to be the preferred solution.
> > >> > > > > > >
> > >> > > > > > > The idea of having the language configurable via something like `.with("language", "GQL")` is quite interesting, and something I haven't seen in previous discussions. There is clear value in allowing providers to support their own preferred declarative languages here, but I also worry about the loss of query portability if TinkerPop is too hands off on the choice of declarative language. I believe the vast majority of usages here will be seeing a traversal with a simple GQL-like match pattern. I think it would make sense for TinkerPop to adopt a default language for the new match step, which is some heavily restricted form of GQL (read-only, limited to basic MATCH, WHERE, and RETURN statements). This "standard" language could then be used in the new match step without a language with-modulator. Providers would still be free to support their own languages via that modulator if they choose.
> > >> > > > > > >
> > >> > > > > > > I will take a bit more time to consider the withParameter() proposal. My initial reaction is that I prefer to tie it into the existing parameter bindings included in remote requests to gremlin-server.
> > >> > > > > > > I would like query parameters to function in a unified manner across the entire traversal if possible, instead of a separate detached system isolated to the new match step. I understand the current limitation of only supporting parameters in remote traversals. I'm not immediately seeing the need to support parameters for embedded traversals here; I'd be interested if you have any examples where embedded parameters present a clear advantage. If we do decide there is a need for embedded parameters, I would prefer to solve that problem at the broader gremlin level, instead of isolating it to the match step.
> > >> > > > > > >
> > >> > > > > > > I totally agree that the start and mid-step behaviour of the new match step should be modeled after V() and E().
> > >> > > > > > >
> > >> > > > > > > I think the trickiest part of getting this right is the return types. The most common use case I expect is where the RETURN clause only includes a single node or edge. In this case I completely agree with returning the element itself. I definitely want to support usages such as g.match("MATCH (n{name:'Cole'}) RETURN n").out()... My main tenet here is that results should naturally flow from the declarative match into the subsequent gremlin and be easy to consume. If multiple objects are returned, I would agree that it is necessary to return a Map<String, ?> as in g.match("MATCH (p:person)-[e:created]->(s:software) RETURN *") -> {"p": V[1], "e": E[9], "s": V[3]} ...
> > >> > > > > > >
> > >> > > > > > > I'm still on the fence for how to handle single returns of non-elements. I see the value in your recommendation to return a map of size 1, but I also see some convenience to directly returning the value (usually a single property). I will take some time next week to work through some example queries and get a better sense of how I feel on each option here.
> > >> > > > > > >
> > >> > > > > > > There is one final item which I would like to see added to the proposal. I think that all "variables" bound in the match query should be stored such that they are later selectable. Essentially I think it's important to support something like this:
> > >> > > > > > >
> > >> > > > > > > g.match("MATCH (n1{name:'Cole'})-[]->(n2) RETURN n1").where(...)...select(n2).out()...
> > >> > > > > > >
> > >> > > > > > > The ability to select other bound variables later in the traversal should greatly limit the number of times users are forced to return multiple items at once, which reduces the amount of use cases where users will be forced to break down maps in gremlin to complete their query.
> > >> > > > > > >
> > >> > > > > > > Overall I think this would be a great change to gremlin. I look forward to keeping this discussion going and ultimately seeing the changes land in TinkerPop.
> > >> > > > > > >
> > >> > > > > > > Thanks,
> > >> > > > > > > Cole
> > >> > > > > > >
> > >> > > > > > > On 2025/08/22 15:46:10 Andrii Lomakin wrote:
> > >> > > > > > > > Good day.
> > >> > > > > > > >
> > >> > > > > > > > I propose new semantics for the match step in Gremlin, which we discussed briefly in the Discord chat.
> > >> > > > > > > > The current ideas listed partially summarize ideas suggested by several discussion participants.
> > >> > > > > > > >
> > >> > > > > > > > The current semantics of the match step are complex to optimize, so users do not use this step in practice, and DB vendors do not recommend using the match step in queries.
> > >> > > > > > > >
> > >> > > > > > > > Instead, what is proposed is to provide a new match step based on declarative semantics.
> > >> > > > > > > >
> > >> > > > > > > > The signature of this step is quite simple: Traversal<S, E> match(String matchQuery).
> > >> > > > > > > >
> > >> > > > > > > > Here matchQuery is a match statement written in a declarative query language supported by the provider; I will use GQL as an example below.
> > >> > > > > > > >
> > >> > > > > > > > This step will require the language as a configuration parameter provided using the with step.
> > >> > > > > > > >
> > >> > > > > > > > So the simplest query will look like:
> > >> > > > > > > >
> > >> > > > > > > > g.match("MATCH (person:Person)-[:knows]->(friend:Person)").with("language", "GQL")
> > >> > > > > > > >
> > >> > > > > > > > The match step can accept query parameters, so if we provide a query like
> > >> > > > > > > > g.match("MATCH (p:Person WHERE p.name = $personName) RETURN p.email").with("language", "GQL")
> > >> > > > > > > >
> > >> > > > > > > > we may use parameter bindings, but it will work only for interaction with Gremlin Server, so instead, I propose an additional modulator step:
> > >> > > > > > > > withParameter(String name, Object value)
> > >> > > > > > > >
> > >> > > > > > > > In such case the final version will look like:
> > >> > > > > > > > g.match("MATCH (p:Person WHERE p.name = $personName) RETURN p.email").with("language", "GQL").withParameter("personName", "Stephen")
> > >> > > > > > > >
> > >> > > > > > > > Alongside the version of the withParameter step that provides the name of the query parameter, a version with the following signature should also be provided: withParameter(int index, Object value) for query languages that support indexed parameters with/instead of named parameters.
> > >> > > > > > > >
> > >> > > > > > > > Because we already introduced one modulator step, it is reasonable to consider replacing the with step with a more specific withQueryLanguage() modulator step that will allow us to add more expressiveness to the resulting queries.
> > >> > > > > > > >
> > >> > > > > > > > In such case the final version will look like:
> > >> > > > > > > > g.match("MATCH (p:Person WHERE p.name = $personName) RETURN p.email").withQueryLanguage("GQL").withParameter("personName", "Stephen")
> > >> > > > > > > >
> > >> > > > > > > > As for the scope of application of this step, I recommend making it behave exactly as it is implemented for the V() and E() steps. It could be added in the middle of GraphTraversal, but the execution result will be the same pattern matching execution applied to the whole graph stored in the database (not to the item filtered/transformed by the previous steps).
> > >> > > > > > > >
> > >> > > > > > > > It also means that the match step will be added to the GraphTraversalSource.
> > >> > > > > > > >
> > >> > > > > > > > As for the format of the output of the match step, I would recommend the following:
> > >> > > > > > > >
> > >> > > > > > > > 1. If the match statement returns an Element instance, it is returned as is.
> > >> > > > > > > >
> > >> > > > > > > > 2. Otherwise, it should return any value that is allowed to be a property value in Element.
> > >> > > > > > > >
> > >> > > > > > > > 3. I would add an optional recommendation to return either an Element or a Map<String, ?>, where each key of the map is a projection from the query result. In the case of the query
> > >> > > > > > > > g.match("MATCH (p:Person WHERE p.name = $personName) RETURN p.email").withQueryLanguage("GQL").withParameter("personName", "Stephen")
> > >> > > > > > > >
> > >> > > > > > > > the result will look like {"p.email": "[email protected]"}. Following this optional recommendation will, IMHO, improve user experience.
> > >> > > > > > > >
> > >> > > > > > > > This step should be restricted to executing only idempotent queries.
> > >> > > > > > > >
> > >> > > > > > > > I would also recommend adding versions of withParameter() that accept Traversal as a value of the parameters, namely:
> > >> > > > > > > > 1. withParameter(String name, TraversalSource value)
> > >> > > > > > > >
> > >> > > > > > > > 2. withParameter(int index, TraversalSource value)
> > >> > > > > > > >
> > >> > > > > > > > The current version of the match step should be deprecated and then removed.
> > >> > > > > > > >
> > >> > > > > > > > I want to thank Stephen Mallette, whose initial idea closely aligned with ours and who actively contributed to our discussions.
> > >> > > > > > > >
> > >> > > > > > > > I'm looking forward to your thoughts, observations, and any other feedback you may have.
> > >> > > > > > > >
> > >> > > > > > > > Best Regards,
> > >> > > > > > > > YouTrackDB development lead
> > >> > > > > > > > Andrii Lomakin
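
For readers of the archive, the "$"-prefix rule from the Aug 27 message can be illustrated with a small, self-contained Java sketch. It is only an illustration under stated assumptions, not TinkerPop code: the class `MatchParameterSketch` and the methods `extractParameters` and `resolve` are hypothetical names, the maps stand in for `with()` options and Gremlin Server bindings, and the list of required names is assumed to come from whatever parser a provider uses. It also assumes that explicitly passed parameters take precedence over the implicit binding lookup.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names exist in TinkerPop.
final class MatchParameterSketch {

    // with() keys that start with "$" are treated as query parameters:
    // the prefix is stripped and the remainder becomes the parameter name.
    public static Map<String, Object> extractParameters(Map<String, Object> withOptions) {
        Map<String, Object> parameters = new LinkedHashMap<>();
        for (Map.Entry<String, Object> option : withOptions.entrySet()) {
            String key = option.getKey();
            if (key.startsWith("$") && key.length() > 1) {
                parameters.put(key.substring(1), option.getValue());
            }
        }
        return parameters;
    }

    // Assumption: explicit parameters win; names the query needs but that were
    // not passed explicitly fall back to an implicit lookup in the bindings.
    public static Map<String, Object> resolve(List<String> requiredNames,
                                              Map<String, Object> withOptions,
                                              Map<String, Object> serverBindings) {
        Map<String, Object> explicit = extractParameters(withOptions);
        Map<String, Object> resolved = new LinkedHashMap<>();
        for (String name : requiredNames) {
            if (explicit.containsKey(name)) {
                resolved.put(name, explicit.get(name));
            } else if (serverBindings.containsKey(name)) {
                resolved.put(name, serverBindings.get(name));
            }
            // Names found in neither place are left unresolved in this sketch.
        }
        return resolved;
    }

    public static void main(String[] args) {
        Map<String, Object> withOptions = new LinkedHashMap<>();
        withOptions.put("language", "GQL");  // ordinary option, not a parameter
        withOptions.put("$srcCode", "AUS");  // explicit query parameter
        Map<String, Object> bindings = Map.of("destCode", "LHR");

        System.out.println(resolve(List.of("srcCode", "destCode"), withOptions, bindings));
    }
}
```

In this sketch the `language` key is ignored by the parameter extraction because it lacks the prefix, which mirrors how the proposal keeps ordinary `with()` configuration separate from query parameters.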
