gregfelice opened a new pull request, #2360:
URL: https://github.com/apache/age/pull/2360
## Summary
Enables bare graph patterns as boolean expressions in WHERE clauses (issue
#1577):
```cypher
// Now valid (equivalent to EXISTS):
MATCH (a:Person), (b:Person)
WHERE (a)-[:KNOWS]->(b)
RETURN a.name, b.name
// Also works with NOT, AND, OR:
WHERE NOT (a)-[:OWNS]->(:Asset)
WHERE (a)-[:KNOWS]->(b) AND a.name = 'Alice'
WHERE (a)-[:KNOWS]->(b) OR (a)-[:WORKS_WITH]->(b)
```
This is standard openCypher syntax and is used extensively in Neo4j. Its
absence was the most frequently cited migration blocker; issue #1577 has been
open since 2024 with no prior PR.
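
To make the intended semantics concrete, here is a minimal Python model of a pattern used as a boolean predicate. It is only an illustration: the toy graph, the names `EDGES`, `PEOPLE`, and `pattern_holds` are all hypothetical, and nothing here reflects how AGE evaluates the query internally.

```python
from itertools import product

# Hypothetical toy graph: (source, relationship type, target) triples.
EDGES = {
    ("Alice", "KNOWS", "Bob"),
    ("Bob", "KNOWS", "Carol"),
    ("Alice", "OWNS", "House"),
}
PEOPLE = ["Alice", "Bob", "Carol"]


def pattern_holds(src, rel_type, dst=None):
    """Return True if at least one edge src-[:rel_type]->dst exists.

    dst=None models an anonymous end node, i.e. any target matches.
    """
    return any(
        s == src and r == rel_type and (dst is None or d == dst)
        for s, r, d in EDGES
    )


def knows_pairs():
    # Models: MATCH (a:Person), (b:Person) WHERE (a)-[:KNOWS]->(b)
    return [
        (a, b)
        for a, b in product(PEOPLE, repeat=2)
        if pattern_holds(a, "KNOWS", b)
    ]


def non_owners():
    # Models: WHERE NOT (a)-[:OWNS]->(:Asset), ignoring the label for brevity.
    return [a for a in PEOPLE if not pattern_holds(a, "OWNS")]
```

The predicate is an existence test, so a bare pattern and `EXISTS(pattern)` filter rows identically in this model.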
## Approach: GLR Parser
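The fundamental challenge is that `(a)` is valid as both a parenthesized
expression and a graph node pattern. An LALR(1) parser must commit to one
reading while it can see only the next token, and the token that settles the
question (the `-` or `<-` that starts a relationship) arrives only after the
closing `)`.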
**Solution:** Switch from LALR(1) to Bison's GLR mode. When the parser
encounters the ambiguity, it forks the parse stack and tries both
interpretations. The failing path is discarded. `%dprec` annotations resolve
cases where both paths succeed (bare `(a)` prefers the expression
interpretation).
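
The fork-and-discard idea can be sketched outside Bison. The Python below is a toy model, not the PR's grammar: it runs two small recursive-descent recognizers (a simplified expression grammar and a simplified path grammar, both assumptions made for illustration) over the same tokens, keeps whichever succeeds, and prefers the expression reading when both do, mirroring the `%dprec` tie-break.

```python
import re

TOKEN_RE = re.compile(r"\s*(->|<-|[()\[\]:\-+*/]|[A-Za-z_]\w*|\d+)")
OPS = {"+", "-", "*", "/"}


def tokenize(text):
    text, tokens, pos = text.rstrip(), [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise ValueError(f"unexpected input at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def _is_name(tok):
    return tok[0].isalpha() or tok[0] == "_"


# Branch 1: toy expression grammar. Each helper returns the next index or None.
def _term(toks, i):
    if i >= len(toks):
        return None
    if toks[i] == "(":
        j = _expr(toks, i + 1)
        return j + 1 if j is not None and j < len(toks) and toks[j] == ")" else None
    return i + 1 if _is_name(toks[i]) or toks[i].isdigit() else None


def _expr(toks, i):
    i = _term(toks, i)
    while i is not None and i < len(toks) and toks[i] in OPS:
        i = _term(toks, i + 1)
    return i


# Branch 2: toy path grammar, node (rel node)*.
def _name_label(toks, i):
    if i < len(toks) and _is_name(toks[i]):
        i += 1
    if i + 1 < len(toks) and toks[i] == ":" and _is_name(toks[i + 1]):
        i += 2
    return i


def _node(toks, i):
    if i >= len(toks) or toks[i] != "(":
        return None
    i = _name_label(toks, i + 1)
    return i + 1 if i < len(toks) and toks[i] == ")" else None


def _rel(toks, i):
    if i + 1 >= len(toks) or toks[i] not in ("-", "<-") or toks[i + 1] != "[":
        return None
    closers = {"->", "-"} if toks[i] == "-" else {"-"}
    i = _name_label(toks, i + 2)
    if i + 1 < len(toks) and toks[i] == "]" and toks[i + 1] in closers:
        return i + 2
    return None


def _path(toks, i):
    i = _node(toks, i)
    while i is not None and i < len(toks):
        j = _rel(toks, i)
        if j is None:
            break
        i = _node(toks, j)
    return i


def classify(text):
    toks = tokenize(text)
    as_expr = _expr(toks, 0) == len(toks)   # fork 1
    as_path = _path(toks, 0) == len(toks)   # fork 2
    if as_expr:
        return "expression"  # also wins ties, like the higher %dprec
    if as_path:
        return "pattern"
    raise SyntaxError("neither interpretation parses")
```

In this model `classify("(a)")` succeeds on both branches and resolves to `"expression"`, while `classify("(a)-[:KNOWS]->(b)")` survives only on the path branch. A real GLR parser advances both stacks in lockstep rather than running two full passes, so this shows the outcome, not the mechanism.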
**Changes:**
- `%glr-parser` directive + `YYRHSLOC` fix for GLR location compatibility
- `anonymous_path` added as `expr_atom` alternative with `%dprec 1`
- `'(' expr ')'` annotated with `%dprec 2` (higher priority)
- `expr_var` / `var_name_opt` rules annotated with `%dprec` to resolve
variable reduce/reduce conflicts
- Extracted `make_exists_pattern_sublink()` helper (shared by
`EXISTS(pattern)` and bare pattern rules)
**Conflict budget:** 7 shift/reduce (path extension vs arithmetic), 3
reduce/reduce (expr_var vs var_name_opt). All expected and correctly handled by
GLR forking + `%dprec`.
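
A reviewer could check the conflict budget mechanically. The sketch below assumes Bison's diagnostics contain the wording `N shift/reduce` and `N reduce/reduce`; the helper names and the sample text are hypothetical, and the exact message format varies between Bison versions.

```python
import re

# Budget stated in this PR description.
EXPECTED = {"shift/reduce": 7, "reduce/reduce": 3}


def conflict_counts(bison_output):
    """Extract conflict counts from captured Bison diagnostics.

    A kind that is not mentioned is reported as 0.
    """
    counts = {}
    for kind in EXPECTED:
        m = re.search(r"(\d+)\s+" + re.escape(kind), bison_output)
        counts[kind] = int(m.group(1)) if m else 0
    return counts


def budget_mismatches(bison_output, expected=EXPECTED):
    """Return {kind: (found, expected)} for every count that differs."""
    found = conflict_counts(bison_output)
    return {
        kind: (found[kind], want)
        for kind, want in expected.items()
        if found[kind] != want
    }


# Illustrative diagnostics only; not captured from a real build.
SAMPLE = (
    "grammar.y: warning: 7 shift/reduce conflicts [-Wconflicts-sr]\n"
    "grammar.y: warning: 3 reduce/reduce conflicts [-Wconflicts-rr]\n"
)
```

An empty result from `budget_mismatches(SAMPLE)` means the counts match. Bison also provides `%expect` and `%expect-rr` declarations that make the build itself fail on a mismatch; the description does not say whether this PR uses them.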
**Performance:** GLR adds ~20% to generated parser size. For non-ambiguous
queries (the common case), overhead is negligible: conflict states are only
reached when patterns appear in expression context. When a fork does occur,
the failing branch is dropped as soon as it hits a syntax error, and `%dprec`
picks the winner only if both branches succeed.
## Regression Tests
New `pattern_expression` test (15 queries):
- Basic `WHERE (a)-[:REL]->(b)` patterns
- `NOT` pattern negation
- Labeled first node `(a:Person)-[:REL]->(b)`
- `AND` / `OR` boolean combinations
- Left-directed patterns `(a)<-[:REL]-(b)`
- Anonymous node patterns `(a)-[:REL]->()`
- Multi-hop patterns `(a)-[:R1]->()-[:R2]->(c)`
- `EXISTS()` backward compatibility verification
- Equivalence check: bare pattern produces same results as `EXISTS(pattern)`
- Non-pattern expression regression (`RETURN (1 + 2)`, `RETURN (n.name)`)
All 32 regression tests pass (31 existing + 1 new).
## Test plan
- [x] All 32 regression tests pass (31 existing + 1 new)
- [x] Verified `WHERE (a)-[:KNOWS]->(b)` returns correct results
- [x] Verified `WHERE NOT (a)-[:KNOWS]->(:Person)` returns correct results
- [x] Verified backward compatibility with `EXISTS(pattern)` syntax
- [x] Verified non-pattern expressions `(1 + 2)` still work
- [ ] Reviewer: verify GLR conflict counts match expectations
🤖 Generated with [Claude Code](https://claude.com/claude-code)