Hi Junwang,
> The fix is simply:
> >
> > tr = replace_property_refs(rte->relid, pf->whereClause, graph_path);
> >
> > Ashutosh, could you include this fix in the next patch revision?
>
> This fixes the crash.
>
Thanks for confirming.
> > Also, I'd like to check — do you see any potential side effects from
> > passing the full graph_path instead of list_make1(pe)? Since the mutator
> > now has access to all element mappings, I want to make sure there are no
> > unintended interactions in other code paths.
>
> One concern is that if we support
>
> MATCH (a IS users)-[]->(x IS users)<-[]-(b IS users WHERE b.name != a.name
> )
>
> user may expect the following also works:
>
> MATCH (a IS users WHERE b.name != a.name)-[]->(x IS users)<-[]-(b IS
> users)
>
> but the second actually failed to pass the transform phase.
>
> I tested neo4j, both are well supported.
>
Good catch. You're right -- both orderings should work.
The standard is explicit about this. Subclause 10.6 Syntax Rule 18
(ISO/IEC 9075-16:2023) states:
"The scope of an <element variable> that is declared by an
<element pattern> EP is the innermost <graph table> containing EP."
The scope is the entire <graph table>, not "from the point of
declaration onward." So a forward reference like your second example
is just as valid as the backward reference in the first.
The current implementation registers variables into gpstate->variables
sequentially inside transformGraphElementPattern(), which makes forward
references invisible at transform time.
So we might follow the same behavior. The solution I came out is in
> transformPathTerm
> we collect gpstate->variables before each transformGraphElementPattern.
>
> Something like:
>
> transformPathTerm(ParseState *pstate, List *path_term)
> {
> List *result = NIL;
> + GraphTableParseState *gpstate = pstate->p_graph_table_pstate;
> +
> + /*
> + * First gather all element variables from this path term so that
> WHERE
> + * clauses in any element pattern can reference variables
> appearing anywhere
> + * in the term, regardless of order.
> + */
> + foreach_node(GraphElementPattern, gep, path_term)
> + {
> + if (gep->variable)
> + {
> + String *v = makeString(pstrdup(gep->variable));
> +
> + if (!list_member(gpstate->variables, v))
> + gpstate->variables =
> lappend(gpstate->variables, v);
> + }
> + }
>
> foreach_node(GraphElementPattern, gep, path_term)
> result = lappend(result,
>
> Thoughts?
This correctly matches the standard's scoping semantics for a
single path term.
One thing worth noting for the future: the standard says the
variable scope covers the entire <graph table> (SR 18), which
means cross-path-term references should also work once we support
multiple path patterns. For example:
MATCH (a IS users WHERE b.name != a.name)-[]->(x IS users),
(b IS users)-[]->(y IS users)
Here `b` is declared in the second path term but forward-referenced
in the first. Pre-collecting inside transformPathTerm would not see
`b` when processing the first path term. Moving the collection up
to transformPathPatternList, spanning all path terms before any
transformation, would cover both cases. I'll be posting a
multi-pattern path matching patch soon in a separate thread.
Regards,
Henson