Copilot commented on code in PR #2341:
URL: https://github.com/apache/age/pull/2341#discussion_r2879747295


##########
regress/sql/cypher_match.sql:
##########
@@ -1492,6 +1492,56 @@ $$) AS (val agtype);
 
 SELECT drop_graph('issue_2308', true);
 
+--
+-- Issue 2193: CREATE ... WITH ... MATCH on brand-new label returns 0 rows
+-- on first execution because match_check_valid_label() runs before
+-- transform_prev_cypher_clause() creates the label table.
+--
+SELECT create_graph('issue_2193');
+
+-- Reporter's exact case: CREATE two Person nodes, then MATCH on Person
+-- Should return 2 rows on the very first execution
+SELECT * FROM cypher('issue_2193', $$
+    CREATE (a:Person {name: 'Jane', livesIn: 'London'}),
+           (b:Person {name: 'Tom', livesIn: 'Copenhagen'})
+    WITH a, b
+    MATCH (p:Person)
+    RETURN p.name ORDER BY p.name
+$$) AS (result agtype);
+
+-- Single CREATE + MATCH on brand-new label
+SELECT * FROM cypher('issue_2193', $$
+    CREATE (a:City {name: 'Berlin'})
+    WITH a
+    MATCH (c:City)
+    RETURN c.name ORDER BY c.name
+$$) AS (result agtype);
+
+-- MATCH on a label that now exists (second execution) still works
+SELECT * FROM cypher('issue_2193', $$
+    CREATE (a:City {name: 'Paris'})
+    WITH a
+    MATCH (c:City)
+    RETURN c.name ORDER BY c.name
+$$) AS (result agtype);
+
+-- MATCH on non-existent label without DML predecessor still returns 0 rows
+SELECT * FROM cypher('issue_2193', $$
+    MATCH (x:NonExistentLabel)
+    RETURN x
+$$) AS (result agtype);
+
+-- MATCH on non-existent label after DML predecessor still returns 0 rows
+-- and MATCH-introduced variable (p) is properly registered
+SELECT * FROM cypher('issue_2193', $$
+    CREATE (a:Person {name: 'Alice'})
+    WITH a
+    MATCH (p:NonExistentLabel)
+    RETURN p
+$$) AS (result agtype);
+

Review Comment:
   The regression case `CREATE ... WITH ... MATCH (p:NonExistentLabel) RETURN p`
   asserts only that the result set is empty; it doesn’t verify that the
   preceding CREATE still executed (i.e., that DML side effects aren’t optimized
   away when MATCH injects an always-false predicate). Consider adding a
   follow-up query in the same graph to confirm that the created vertex exists
   after this statement, so the test catches plan-elimination regressions (like
   the original `One-Time Filter: false` issue).
   ```suggestion
   
   -- Verify that the preceding CREATE was actually executed by matching the
   -- created vertex
   SELECT * FROM cypher('issue_2193', $$
       MATCH (a:Person {name: 'Alice'})
       RETURN a.name ORDER BY a.name
   $$) AS (result agtype);
   ```



##########
src/backend/parser/cypher_clause.c:
##########
@@ -2949,6 +2952,19 @@ static Query *transform_cypher_match_pattern(cypher_parsestate *cpstate,
              */
             pnsi = get_namespace_item(pstate, rte);
             query->targetList = expandNSItemAttrs(pstate, pnsi, 0, true, -1);
+
+            /*
+             * Now that the predecessor chain is fully transformed and
+             * any CREATE-generated labels exist in the cache, check
+             * whether the MATCH pattern references valid labels. This
+             * deferred check is only needed when the chain has DML,
+             * since labels created by CREATE are not in the cache at
+             * the time of the early check in transform_cypher_match().
+             */
+            if (has_dml && !match_check_valid_label(self, cpstate))
+            {
+                where = make_false_where_clause();
+            }

Review Comment:
   In the deferred invalid-label path
   (`has_dml && !match_check_valid_label(self, cpstate)`), setting
   `where = make_false_where_clause()` builds a constant-foldable predicate
   (`true = false`). PostgreSQL can fold this into a `Result` node with
   `One-Time Filter: false` and eliminate the FROM subtree entirely, which would
   skip the predecessor DML custom scan (the same class of symptom described in
   #2193). For DML predecessor chains, the “always false” predicate should be
   non-constant at plan time (e.g., wrap the constants in a VOLATILE function
   such as `ag_catalog.agtype_volatile_wrapper(...)`, or otherwise force the
   optimizer to keep the predecessor scan) so that DML side effects still occur
   even when MATCH returns 0 rows.
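
   One way to check whether the planner drops the predecessor subtree is to
   inspect the plan directly. The following is a diagnostic sketch, not part of
   the PR: it assumes the AGE extension is installed, that the `issue_2193`
   graph from the regression test still exists, and that plain `EXPLAIN` can be
   applied to a `cypher()` call in this build. The `pg_class` query is only a
   plain-PostgreSQL baseline for comparison.

   ```sql
   -- Assumption: standard AGE session setup.
   LOAD 'age';
   SET search_path = ag_catalog, "$user", public;

   -- Baseline: a constant-false qual on an ordinary table, which the
   -- planner is able to fold away at plan time.
   EXPLAIN (COSTS OFF)
   SELECT relname FROM pg_class WHERE true = false;

   -- Same inspection for the deferred invalid-label path. EXPLAIN without
   -- ANALYZE only plans the statement; it does not run the CREATE.
   EXPLAIN (COSTS OFF)
   SELECT * FROM cypher('issue_2193', $$
       CREATE (a:Person {name: 'Alice'})
       WITH a
       MATCH (p:NonExistentLabel)
       RETURN p
   $$) AS (result agtype);
   ```

   If the second plan collapses to a `Result` node with
   `One-Time Filter: false` and no child node for the CREATE custom scan, the
   DML would likely be skipped at execution time, which is the case the
   suggested follow-up MATCH is meant to catch.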


