Hi Ashutosh and Junwang,
> patch 0002 in the attached patchset implements it this way. It
> > collects all the element variables before transforming the path
> > pattern. Later it uses this list to detect non-local variable
> > references and throws error.
>
I applied both patches (0001 and 0002) and ran some extended testing
beyond the included regression tests. All non-local cross-references
are properly rejected across every combination I tried:
vertex <-> edge cross-references:
- vertex WHERE referencing a later edge (forward) -> ERROR
- vertex WHERE referencing an earlier edge (backward) -> ERROR
- edge WHERE referencing a later vertex (forward) -> ERROR
- edge WHERE referencing an earlier vertex (backward) -> ERROR
longer patterns ()-[]->()-[]->():
- 3rd vertex -> 1st vertex (long backward) -> ERROR
- 1st edge -> 3rd vertex (long forward) -> ERROR
- 2nd vertex -> 1st vertex (adjacent backward) -> ERROR
- 2nd vertex -> 3rd vertex (adjacent forward) -> ERROR
- 1st vertex -> 3rd vertex (long forward) -> ERROR
- all references local -> OK
All combinations -- vertex-vertex, vertex-edge, edge-vertex, adjacent
and long-distance -- are blocked as expected. Looks good.
Seeing the comment above transformGraphTablePropertyRef:
>
> *
> * A property reference is parsed as a ColumnRef of the form:
> * <variable>.<property>. If <variable> is one of the variables bound to an
> * element pattern in the graph pattern and <property> can be resolved as a
> * property of the property graph, then we return a GraphPropertyRef node
>
> it seems to imply that G041 is supported. Should we update this comment
> to better reflect the current behavior?
>
Agreed, it would be nice to update the comment to reflect that
non-local element variable references are now rejected.
> I have a question about this. Why do we allow a ColumnRef to reference a
> lateral table outside the graph pattern?
>
> For example, in the following query:
>
> SELECT x1.a, g.*
> FROM x1,
> GRAPH_TABLE (
> myshop
> MATCH (x1 IS customers WHERE address = 'US')
> -[IS customer_orders]->(o IS orders)
> COLUMNS (x1.name AS customer_name,
> x1.customer_id AS cid,
> o.order_id)
> ) g;
>
> It’s easy for users to get confused and assume that `address` is a column
> of customers rather than the outside x1.
>
GRAPH_TABLE has implicit LATERAL semantics per the SQL standard
(ISO/IEC 9075-16, Syntax Rule 5 of Subclause 7.6: <graph table derived
table> is added to the list of <table primary>s that support lateral
join), so referencing outer FROM items is expected behavior.
When the FROM table name doesn't conflict with an element variable,
lateral references work as expected:
SELECT x2.a, g.*
FROM x2,
GRAPH_TABLE (myshop
MATCH (c IS customers WHERE c.address = 'US'
AND c.customer_id = x2.a)
-[IS customer_orders]->(o IS orders)
COLUMNS (c.name AS customer_name,
c.customer_id AS cid, o.order_id)) g;
-- OK: x2.a resolves to lateral table x2
But when the same name is used for both (as in your example with x1),
qualified references are always captured by the element variable:
SELECT x1.a, g.*
FROM x1,
GRAPH_TABLE (myshop
MATCH (x1 IS customers WHERE x1.address = 'US')
-[IS customer_orders]->(o IS orders
WHERE o.order_id = x1.a)
COLUMNS (x1.name AS customer_name,
x1.customer_id AS cid, o.order_id)) g;
-- ERROR: non-local element variable reference is not supported
-- x1.a is recognized as element variable x1, not lateral table x1
I agree this can be a bit confusing for users, but given that PGQ is
still in its early stage, prioritizing stability seems reasonable
for now. This should be resolved once G041 support is added in the
future.
Best Regards,
Henson