Re: [PR] Add MERGE ON CREATE SET / ON MATCH SET support (#1619) [age]

via GitHub Mon, 02 Mar 2026 10:48:15 -0800


Copilot commented on code in PR #2347:
URL: https://github.com/apache/age/pull/2347#discussion_r2873984555



##########
regress/sql/cypher_merge.sql:
##########
@@ -868,9 +868,87 @@ SELECT * FROM cypher('issue_1630', $$ MATCH (n) DETACH 
DELETE n $$) AS (a agtype
 SELECT * FROM cypher('issue_1709', $$ MATCH (n) DETACH DELETE n $$) AS (a 
agtype);
 SELECT * FROM cypher('issue_1446', $$ MATCH (n) DETACH DELETE n $$) AS (a 
agtype);
 
+--
+-- ON CREATE SET / ON MATCH SET tests (issue #1619)
+--
+SELECT create_graph('merge_actions');
+
+-- Basic ON CREATE SET: first run creates the node
+SELECT * FROM cypher('merge_actions', $$
+  MERGE (n:Person {name: 'Alice'})
+    ON CREATE SET n.created = true
+  RETURN n.name, n.created
+$$) AS (name agtype, created agtype);
+
+-- ON MATCH SET: second run matches the existing node
+SELECT * FROM cypher('merge_actions', $$
+  MERGE (n:Person {name: 'Alice'})
+    ON MATCH SET n.found = true
+  RETURN n.name, n.created, n.found
+$$) AS (name agtype, created agtype, found agtype);
+
+-- Both ON CREATE SET and ON MATCH SET (first run = create)
+SELECT * FROM cypher('merge_actions', $$
+  MERGE (n:Person {name: 'Bob'})
+    ON CREATE SET n.created = true
+    ON MATCH SET n.matched = true
+  RETURN n.name, n.created, n.matched
+$$) AS (name agtype, created agtype, matched agtype);
+
+-- Both ON CREATE SET and ON MATCH SET (second run = match)
+SELECT * FROM cypher('merge_actions', $$
+  MERGE (n:Person {name: 'Bob'})
+    ON CREATE SET n.created = true
+    ON MATCH SET n.matched = true
+  RETURN n.name, n.created, n.matched
+$$) AS (name agtype, created agtype, matched agtype);
+
+-- ON MATCH SET with MERGE after MATCH (Case 1: has predecessor)
+SELECT * FROM cypher('merge_actions', $$
+  MATCH (a:Person {name: 'Alice'})
+  MERGE (a)-[:KNOWS]->(b:Person {name: 'Charlie'})
+    ON CREATE SET b.source = 'merge_create'
+  RETURN a.name, b.name, b.source
+$$) AS (a agtype, b agtype, source agtype);
+

Review Comment:
   The test labeled "ON MATCH SET with MERGE after MATCH (Case 1: has 
predecessor)" only tests `ON CREATE SET` in Case 1 (MERGE with a predecessor 
clause). There is no test that verifies `ON MATCH SET` behavior in Case 1 — 
specifically, running a `MATCH ... MERGE ... ON MATCH SET` query where the path 
already exists and the ON MATCH SET clause fires. This execution path (lines 
682-685 and 704-708 in `cypher_merge.c`) is untested.
   ```suggestion
   -- ON MATCH SET with MERGE after MATCH (Case 1: has predecessor, first run = 
create)
   SELECT * FROM cypher('merge_actions', $$
     MATCH (a:Person {name: 'Alice'})
     MERGE (a)-[:KNOWS]->(b:Person {name: 'Charlie'})
       ON CREATE SET b.source = 'merge_create'
       ON MATCH SET b.source = 'merge_match'
     RETURN a.name, b.name, b.source
   $$) AS (a agtype, b agtype, source agtype);
   
   -- ON MATCH SET with MERGE after MATCH (Case 1: has predecessor, second run 
= match)
   SELECT * FROM cypher('merge_actions', $$
     MATCH (a:Person {name: 'Alice'})
     MERGE (a)-[:KNOWS]->(b:Person {name: 'Charlie'})
       ON CREATE SET b.source = 'merge_create'
       ON MATCH SET b.source = 'merge_match'
     RETURN a.name, b.name, b.source
   $$) AS (a agtype, b agtype, source agtype);
   ```



##########
src/backend/executor/cypher_set.c:
##########
@@ -485,11 +489,31 @@ static void process_update_list(CustomScanState *node)
          * this is a REMOVE clause or the variable references a variable that 
is
          * NULL. It will be possible for a variable to be NULL when OPTIONAL
          * MATCH is implemented.
+         *
+         * If prop_expr is set (used by MERGE ON CREATE/MATCH SET), evaluate
+         * the expression directly rather than reading from the scan tuple.
+         * The planner may have stripped the target entry at prop_position.
          */
         if (update_item->remove_item)
         {
             remove_property = true;
         }
+        else if (update_item->prop_expr != NULL)
+        {
+            ExprState *expr_state;
+            Datum val;
+            bool isnull;
+
+            expr_state = ExecInitExpr((Expr *)update_item->prop_expr,
+                                      (PlanState *)node);
+            val = ExecEvalExpr(expr_state, econtext, &isnull);

Review Comment:
   `ExecInitExpr` is called on every invocation of `apply_update_list` (once 
per row in the non-terminal MERGE path, lines 640-720 in `cypher_merge.c`). 
Expression state initialization should happen once at plan initialization time 
(in `create_cypher_merge_plan_state`), not per row. 
   
   The established pattern in this codebase is to call `ExecInitExpr` once 
during the plan state setup (see `cypher_merge.c:175-185` and 
`cypher_create.c:115-125`). Pre-initialized `ExprState` objects should be 
stored alongside `cypher_update_item` (e.g., a parallel `prop_expr_state` 
field) or in the `cypher_merge_custom_scan_state`. Calling `ExecInitExpr` on 
each row incurs unnecessary memory allocation and state-building overhead that 
grows linearly with the number of matched or created entities.



##########
regress/sql/cypher_merge.sql:
##########
@@ -868,9 +868,87 @@ SELECT * FROM cypher('issue_1630', $$ MATCH (n) DETACH 
DELETE n $$) AS (a agtype
 SELECT * FROM cypher('issue_1709', $$ MATCH (n) DETACH DELETE n $$) AS (a 
agtype);
 SELECT * FROM cypher('issue_1446', $$ MATCH (n) DETACH DELETE n $$) AS (a 
agtype);
 
+--
+-- ON CREATE SET / ON MATCH SET tests (issue #1619)
+--
+SELECT create_graph('merge_actions');
+
+-- Basic ON CREATE SET: first run creates the node
+SELECT * FROM cypher('merge_actions', $$
+  MERGE (n:Person {name: 'Alice'})
+    ON CREATE SET n.created = true
+  RETURN n.name, n.created
+$$) AS (name agtype, created agtype);
+
+-- ON MATCH SET: second run matches the existing node
+SELECT * FROM cypher('merge_actions', $$
+  MERGE (n:Person {name: 'Alice'})
+    ON MATCH SET n.found = true
+  RETURN n.name, n.created, n.found
+$$) AS (name agtype, created agtype, found agtype);
+
+-- Both ON CREATE SET and ON MATCH SET (first run = create)
+SELECT * FROM cypher('merge_actions', $$
+  MERGE (n:Person {name: 'Bob'})
+    ON CREATE SET n.created = true
+    ON MATCH SET n.matched = true
+  RETURN n.name, n.created, n.matched
+$$) AS (name agtype, created agtype, matched agtype);
+
+-- Both ON CREATE SET and ON MATCH SET (second run = match)
+SELECT * FROM cypher('merge_actions', $$
+  MERGE (n:Person {name: 'Bob'})
+    ON CREATE SET n.created = true
+    ON MATCH SET n.matched = true
+  RETURN n.name, n.created, n.matched
+$$) AS (name agtype, created agtype, matched agtype);
+
+-- ON MATCH SET with MERGE after MATCH (Case 1: has predecessor)

Review Comment:
   The comment says "ON MATCH SET with MERGE after MATCH (Case 1: has 
predecessor)" but the test only uses `ON CREATE SET`. The comment is misleading 
— it should say "ON CREATE SET with MERGE after MATCH" or the test should also 
cover the ON MATCH SET path by re-running the same MERGE query so that the path 
is found on the second run, triggering ON MATCH SET.
   ```suggestion
   -- ON CREATE SET with MERGE after MATCH (Case 1: has predecessor)
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] Add MERGE ON CREATE SET / ON MATCH SET support (#1619) [age]

Reply via email to