jrgemignani commented on issue #2428:
URL: https://github.com/apache/age/issues/2428#issuecomment-4652088532
I tried to validate this on the current `master` (`73d0705e`, PostgreSQL
18.3) and I think the framing of this as an Apache AGE bug is misleading.

Summary of what I found:

**The two queries are semantically equivalent (confirmed), and there is no
correctness bug.** I loaded `BUG003_reproducer_graph.cypher.txt` (1280 nodes /
2450 edges, matching the report) and ran both queries. Their sorted result
multisets are **byte-identical (12000 rows each)**. The injected `expr IS NULL
AND expr IS NOT NULL` disjunct is indeed always false, as stated.
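
For anyone who wants to sanity-check the always-false claim without an AGE or PostgreSQL install, here is a minimal sketch using Python's built-in `sqlite3`. It is an illustration under assumptions, not part of the reproducer: the table `t`, its sample values, and the helper `rows` are made up, and SQLite stands in for PostgreSQL only because both treat `IS NULL` / `IS NOT NULL` on a scalar as returning true or false, never NULL.

```python
import sqlite3

def rows(conn, predicate):
    # Hypothetical helper: sorted ids of the rows selected by a WHERE predicate.
    return sorted(r[0] for r in conn.execute(f"SELECT id FROM t WHERE {predicate}"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, g INTEGER)")
# Sample data is an assumption; it deliberately includes NULLs.
conn.executemany("INSERT INTO t (g) VALUES (?)",
                 [(v,) for v in [None, 1, 5, 6, 10, None]])

base = "g > 5"
expr = "(g IS NULL OR g IS NOT NULL)"
contradiction = f"{expr} IS NULL AND {expr} IS NOT NULL"
injected = f"{base} OR ({contradiction})"

# The injected disjunct selects nothing, so both predicates agree row for row.
contradiction_hits = rows(conn, contradiction)
same_results = rows(conn, base) == rows(conn, injected)
print(contradiction_hits, same_results)
```

The same comparison against the real graph is what the sorted-multiset check above does; this only shows why the outcome is expected for any input, NULLs included.
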

**The "raised rows" figure is a planner *estimate* artifact, not real output.**
The `rows` delta (1405.83 → 3032.0) is a change in estimated cardinality; the
rows actually returned are identical. On current master the wall-clock impact
is also much milder than the report's 6.25× (I measured ~281 ms → ~358 ms,
both fully in memory, no disk spill), so the severity depends on statistics
and plan choice and is not a fixed regression.

**Root cause is PostgreSQL's planner, not AGE.** PostgreSQL's
`eval_const_expressions` only folds a `NullTest` when its argument is a
compile-time `Const`. It deliberately does not reason about contradictions /
non-nullability over columns, function results, or nested `NullTest`s. This
reproduces with **zero AGE involvement** in plain SQL:
```sql
-- PG keeps the contradiction in the Filter (does NOT fold):
EXPLAIN (COSTS OFF)
SELECT g FROM generate_series(1,10) g
WHERE g > 5 OR (((g IS NULL OR g IS NOT NULL)) IS NULL
AND ((g IS NULL OR g IS NOT NULL)) IS NOT NULL);
-- Filter: ((g > 5) OR ((((g IS NULL) OR (g IS NOT NULL)) IS NULL)
--         AND (((g IS NULL) OR (g IS NOT NULL)) IS NOT NULL)))
-- Even the simpler contradiction is NOT folded:
EXPLAIN (COSTS OFF)
SELECT g FROM generate_series(1,10) g
WHERE g > 5 OR (g IS NULL AND g IS NOT NULL);
-- Filter: ((g > 5) OR ((g IS NULL) AND (g IS NOT NULL)))
-- A real Const contradiction IS folded (sanity check):
EXPLAIN (COSTS OFF)
SELECT g FROM generate_series(1,10) g WHERE g > 5 OR (1=1 AND 1=2);
-- Filter: (g > 5)
```
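
To make the folding rule described above concrete, here is a toy model in Python. It is a sketch under assumptions, not PostgreSQL source: `Const`, `Var`, `NullTest`, `And`, and `fold` are hypothetical stand-ins that only mimic the behavior "fold a null test when its argument is a constant, otherwise leave it alone".

```python
from dataclasses import dataclass
from typing import Any

@dataclass(frozen=True)
class Const:
    value: Any  # None models SQL NULL

@dataclass(frozen=True)
class Var:
    name: str  # a column reference, unknown at plan time

@dataclass(frozen=True)
class NullTest:
    arg: Any
    is_null: bool  # True for IS NULL, False for IS NOT NULL

@dataclass(frozen=True)
class And:
    left: Any
    right: Any

def fold(e):
    """Toy constant folder: only null tests over constants are evaluated."""
    if isinstance(e, NullTest):
        arg = fold(e.arg)
        if isinstance(arg, Const):
            return Const((arg.value is None) == e.is_null)
        return NullTest(arg, e.is_null)  # non-constant argument: kept as-is
    if isinstance(e, And):
        left, right = fold(e.left), fold(e.right)
        if Const(False) in (left, right):
            return Const(False)
        if left == Const(True):
            return right
        if right == Const(True):
            return left
        return And(left, right)
    return e

g = Var("g")
over_column = And(NullTest(g, True), NullTest(g, False))
over_const = And(NullTest(Const(1), True), NullTest(Const(1), False))

column_folded = fold(over_column)  # unchanged: nothing here is a constant
const_folded = fold(over_const)    # collapses to a constant false
print(column_folded == over_column, const_folded)
```

Recognizing that `g IS NULL AND g IS NOT NULL` is contradictory would need an extra rule that compares the two sibling tests, which is a different kind of reasoning from evaluating constants.
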

**Why AGE *appears* to amplify it:** the top-level `OR` prevents the ten
`rN.id <> rM.id` inequalities from being decomposed and pushed to individual
join levels, so they collapse into one monolithic Nested-Loop join filter and
the row estimate balloons. That is standard SQL `OR` planning behavior, not
AGE-specific.
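
The decomposition effect can be sketched with a small Python model. This is an illustration under assumptions, not planner code: quals are plain tuples, `conjuncts` and `rels` are hypothetical helpers, and the five aliases `r1`..`r5` are assumed only because five relationships give exactly ten pairwise inequalities. The shape of the wrapped predicate (inequalities on one side of the `OR`, the always-false disjunct on the other) is likewise an assumption about the reported query.

```python
from itertools import combinations

def conjuncts(qual):
    """Split a qual on top-level ANDs only; anything else stays one unit."""
    if qual[0] == "and":
        out = []
        for child in qual[1:]:
            out.extend(conjuncts(child))
        return out
    return [qual]

def rels(qual):
    """Set of relation aliases that a qual references."""
    if qual[0] in ("and", "or"):
        found = set()
        for child in qual[1:]:
            found |= rels(child)
        return found
    return set(qual[1:])  # leaf: ("ne", a, b) or ("always_false", a)

edges = ["r1", "r2", "r3", "r4", "r5"]  # assumed aliases
inequalities = [("ne", a, b) for a, b in combinations(edges, 2)]

plain = ("and", *inequalities)
wrapped = ("or", plain, ("always_false", "r1"))

plain_quals = conjuncts(plain)      # ten quals, each touching two relations
wrapped_quals = conjuncts(wrapped)  # one qual touching all five relations

print(len(plain_quals), [len(rels(q)) for q in plain_quals])
print(len(wrapped_quals), sorted(rels(wrapped_quals[0])))
```

A qual that references two relations can be checked as soon as those two are joined, while a qual that references all five can only be checked once every relation is present, which matches the single large join filter seen in the plan.
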

**Conclusion:** the equivalence claim is correct, but this is (a) not a
correctness bug and (b) not an AGE bug. It is a generic PostgreSQL
expression-simplification limitation that AGE inherits and that reproduces
without AGE. A fix would belong in PostgreSQL core (`eval_const_expressions`),
or would require AGE to add a bespoke boolean-contradiction simplifier for a
pattern that, as far as I can tell, only an automated metamorphic fuzzer would
generate.

Environment: Apache AGE `master` @ `73d0705e`, PostgreSQL 18.3,
`installcheck` 36/36 passing.