gregfelice opened a new issue, #2349:
URL: https://github.com/apache/age/issues/2349
## Summary
The variable-length edge (VLE) expansion algorithm used for patterns like
`[*1..3]` exhibits O(n^k) time complexity because it does not perform cycle
detection or deduplication during recursive expansion. On graphs with cycles
(which are common in real-world data), this causes exponential path explosion.
## Reproduction
Setup: 100K nodes with edges forming natural cycles.
```sql
-- VLE query: find all nodes reachable within 1-3 hops
SELECT * FROM cypher('benchmark', $$
MATCH (n:Node {bench_id: 42})-[*1..3]->(m)
RETURN count(m)
$$) AS (cnt agtype);
```
On graphs with cycles, the expansion visits the same nodes exponentially
many times because `UNION ALL` semantics are used internally (no deduplication).
## Performance Data
Benchmark W06 (variable-length edge traversal):
| Approach | 10K scale | 100K scale |
|---|---|---|
| Cypher VLE (before SDK workaround) | ~23ms | ~23ms |
| Recursive CTE with UNION (dedup) | 0.08ms | 0.34ms |
The SDK workaround uses a recursive CTE with `UNION` (not `UNION ALL`) which
provides implicit cycle detection through set deduplication, preventing
exponential blowup:
```sql
WITH RECURSIVE vle AS (
SELECT e.end_id AS node_id, 1 AS depth
FROM benchmark."EDGE" e
WHERE e.start_id = (SELECT id FROM benchmark."Node" WHERE _bench_id = 42)
UNION -- implicit dedup = cycle detection
SELECT e.end_id, v.depth + 1
FROM vle v
JOIN benchmark."EDGE" e ON e.start_id = v.node_id
WHERE v.depth < 3
)
SELECT count(*) FROM vle WHERE depth >= 1;
```
## Suggested Fix
Add a visited-node set to the VLE expansion in the executor. When expanding
`[*min..max]`, track which `(node_id, depth)` pairs have already been explored
and skip duplicates. This is equivalent to using `UNION` instead of `UNION ALL`
in the recursive expansion.
## Environment
- PostgreSQL 18.2
- Apache AGE 1.7.0 (built from source, branch `release/PG18/1.7.0`)
- 32-core, 64GB RAM, Linux 6.17
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]