Copilot commented on code in PR #2347:
URL: https://github.com/apache/age/pull/2347#discussion_r2897186089
##########
src/include/parser/cypher_kwlist.h:
##########
@@ -29,6 +29,7 @@ PG_KEYWORD("match", MATCH, RESERVED_KEYWORD)
PG_KEYWORD("merge", MERGE, RESERVED_KEYWORD)
PG_KEYWORD("not", NOT, RESERVED_KEYWORD)
PG_KEYWORD("null", NULL_P, RESERVED_KEYWORD)
+PG_KEYWORD("on", ON, RESERVED_KEYWORD)
Review Comment:
The `ON` keyword is registered as `RESERVED_KEYWORD` in `cypher_kwlist.h`,
which means it cannot be used as a plain identifier (variable name) in Cypher
queries. While it is listed in `safe_keywords` in the grammar (so it can be
used as a label, property key, or schema-qualified name), using `ON` as a
variable name (e.g., `MATCH (on)`) would fail because `var_name` only accepts
`symbolic_name` (i.e., `IDENTIFIER`), and reserved keywords are not
identifiers.
If any existing user queries use `on` as a variable name, this would be a
breaking change. Consider whether `on` needs to be `RESERVED_KEYWORD` or if it
could use a different classification (though the current grammar likely
requires it to be `RESERVED_KEYWORD` to be recognized as the start of `ON MATCH
SET` / `ON CREATE SET` in the merge context). At minimum, the regression tests
should cover the case of using `on` as an identifier/property key to confirm
backward compatibility.
##########
src/backend/parser/cypher_clause.c:
##########
@@ -6833,6 +6833,52 @@ static Query *transform_cypher_merge(cypher_parsestate
*cpstate,
merge_information->graph_oid = cpstate->graph_oid;
merge_information->path = merge_path;
+ /* Transform ON MATCH SET items, if any */
+ if (self->on_match != NIL)
+ {
+ ListCell *lc2;
+
+ merge_information->on_match_set_info =
+ transform_cypher_set_item_list(cpstate, self->on_match, query);
+ merge_information->on_match_set_info->clause_name = "MERGE ON MATCH
SET";
+ merge_information->on_match_set_info->graph_name = cpstate->graph_name;
+
+ /*
+ * Store prop_expr for direct evaluation in the MERGE executor.
+ * The planner may strip SET expression target entries from the plan,
+ * so we embed the Expr in the update item for direct evaluation.
+ */
+ foreach(lc2, merge_information->on_match_set_info->set_items)
+ {
+ cypher_update_item *item = lfirst(lc2);
+ TargetEntry *set_tle = get_tle_by_resno(query->targetList,
+ item->prop_position);
+ if (set_tle != NULL)
+ item->prop_expr = (Node *)set_tle->expr;
+ }
+ }
+
+ /* Transform ON CREATE SET items, if any */
+ if (self->on_create != NIL)
+ {
+ ListCell *lc2;
+
+ merge_information->on_create_set_info =
+ transform_cypher_set_item_list(cpstate, self->on_create, query);
+ merge_information->on_create_set_info->clause_name = "MERGE ON CREATE
SET";
+ merge_information->on_create_set_info->graph_name =
cpstate->graph_name;
+
+ /* Store prop_expr for MERGE executor (see comment above) */
+ foreach(lc2, merge_information->on_create_set_info->set_items)
+ {
+ cypher_update_item *item = lfirst(lc2);
+ TargetEntry *set_tle = get_tle_by_resno(query->targetList,
+ item->prop_position);
+ if (set_tle != NULL)
+ item->prop_expr = (Node *)set_tle->expr;
+ }
Review Comment:
The same issue described in the `on_match_set_info` loop (silent fallback to
potentially-stripped `prop_position`) also applies here for
`on_create_set_info`. An assertion or error should be added for the case when
`set_tle` is `NULL`.
##########
src/backend/parser/cypher_clause.c:
##########
@@ -6833,6 +6833,52 @@ static Query *transform_cypher_merge(cypher_parsestate
*cpstate,
merge_information->graph_oid = cpstate->graph_oid;
merge_information->path = merge_path;
+ /* Transform ON MATCH SET items, if any */
+ if (self->on_match != NIL)
+ {
+ ListCell *lc2;
+
+ merge_information->on_match_set_info =
+ transform_cypher_set_item_list(cpstate, self->on_match, query);
+ merge_information->on_match_set_info->clause_name = "MERGE ON MATCH
SET";
+ merge_information->on_match_set_info->graph_name = cpstate->graph_name;
+
+ /*
+ * Store prop_expr for direct evaluation in the MERGE executor.
+ * The planner may strip SET expression target entries from the plan,
+ * so we embed the Expr in the update item for direct evaluation.
+ */
+ foreach(lc2, merge_information->on_match_set_info->set_items)
+ {
+ cypher_update_item *item = lfirst(lc2);
+ TargetEntry *set_tle = get_tle_by_resno(query->targetList,
+ item->prop_position);
+ if (set_tle != NULL)
+ item->prop_expr = (Node *)set_tle->expr;
+ }
Review Comment:
The `prop_expr` field is only populated when `set_tle != NULL`. The comment
states that the planner may strip the `prop_position` target entry, which is
precisely why `prop_expr` was introduced. However, if `get_tle_by_resno`
returns `NULL` (which should not happen at parse-time since the TLE was just
added, but is guarded defensively), `prop_expr` stays `NULL`, causing the
executor to fall back to reading from `scanTupleSlot->tts_values[prop_position
- 1]` — but this is exactly the entry that the planner might have stripped.
This silent fallback path could cause incorrect behavior or a crash instead of
a clear error.
An assertion or error should be added for the case where `set_tle` is
`NULL`, since this would indicate an internal bug in the transform. The silent
fallback to the potentially-stripped scan tuple position is dangerous.
##########
src/backend/executor/cypher_merge.c:
##########
@@ -752,8 +822,18 @@ static TupleTableSlot *exec_cypher_merge(CustomScanState
*node)
css->created_paths_list = new_path;
process_path(css, prebuilt_path_array, true);
+
+ /* ON CREATE SET: path was just created */
+ if (css->on_create_set_info)
+ apply_update_list(&css->css, css->on_create_set_info);
Review Comment:
The same issue as in the non-terminal Case 1: in the terminal Case 1 ON
CREATE SET path (line 824-828), `apply_update_list` is called with
`ecxt_scantuple` pointing to the lateral join's projection (set at line
789-790). The newly-created entity's values would be null in this slot.
Depending on whether `process_path` writes entity values into this lateral join
slot (which it does via `merge_vertex`), this may or may not work, but it is
inconsistent with the Case 2 and Case 3 approaches that call
`ExecStoreVirtualTuple` first.
```suggestion
if (css->on_create_set_info)
{
/*
* Materialize the current virtual tuple into the
scan
* slot so that apply_update_list sees the values
* corresponding to the newly-created path.
*
* This mirrors the behavior in the other MERGE
cases,
* which call ExecStoreVirtualTuple before
* apply_update_list.
*/
ExecStoreVirtualTuple(econtext->ecxt_scantuple);
apply_update_list(&css->css,
css->on_create_set_info);
}
```
##########
src/backend/executor/cypher_merge.c:
##########
@@ -668,8 +723,19 @@ static TupleTableSlot *exec_cypher_merge(CustomScanState
*node)
css->created_paths_list = new_path;
process_path(css, prebuilt_path_array, true);
+
+ /* ON CREATE SET: path was just created */
+ if (css->on_create_set_info)
+ apply_update_list(&css->css,
+ css->on_create_set_info);
Review Comment:
In the Case 1 (has predecessor, non-terminal) ON CREATE SET path (line
728-730), `apply_update_list` is called after `process_path(css,
prebuilt_path_array, true)` with `ecxt_scantuple` still pointing to the lateral
join's scantuple (set at the top of the loop). Unlike Case 2
(`process_simple_merge`, line 360) and Case 3 (line 988),
`ExecStoreVirtualTuple` is not called before `apply_update_list` to ensure
`tts_nvalid` is set correctly.
While this may work if the lateral join's projected slot already has
`tts_nvalid = natts` after `ExecProject`, the entity values for the
newly-created MERGE path entities (e.g., a fresh node `b` in `MERGE
(a)-[:R]->(b)`) would be null in the lateral join's slot. However,
`apply_update_list` reads the entity from
`scanTupleSlot->tts_values[entity_position - 1]` to get its graphid for the
UPDATE. If the entity slot is null (from the lateral join), this would fail at
the `tts_isnull` check (line 453, `if
(scanTupleSlot->tts_isnull[update_item->entity_position - 1]) continue`),
silently skipping the ON CREATE SET update without applying it.
The fix would be to use a fresh scan tuple slot (like the
`sss->ss.ss_ScanTupleSlot` approach used in Case 3), calling
`ExecStoreVirtualTuple` before `apply_update_list`.
```suggestion
if (css->on_create_set_info)
{
/* Use the scan tuple slot populated by
process_path */
econtext->ecxt_scantuple =
css->css.ss.ss_ScanTupleSlot;
ExecStoreVirtualTuple(econtext->ecxt_scantuple);
apply_update_list(&css->css,
css->on_create_set_info);
}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]