YaeSakuraQ opened a new issue, #2428: URL: https://github.com/apache/age/issues/2428
# Missed optimization: NULL contradiction disjunct adds Apache AGE filter rows I found a missed optimization opportunity: adding the semantically redundant predicate `base OR (expr IS NULL AND expr IS NOT NULL)` changes the execution plan and raises PROFILE rows, although the base and transformed queries return the same results. **Apache AGE Version:** Apache AGE 1.7.0 on PostgreSQL 18.1, Docker image `apache/age:latest`, digest `sha256:4241e2d8bb86a6b2ea44e9ad06c73856e12b209de295124603a599dd7feb70eb` **Operating System:** macOS Darwin 25.3.0 arm64 **Installation Method:** Docker image `apache/age:latest`, port `127.0.0.1:5455 -> 5432` **API/Driver:** psycopg Python client through Apache AGE `cypher(graph, query)` wrapper ### Steps to reproduce 1. Load the attached graph setup file into an empty disposable database: [BUG003_reproducer_graph.cypher.txt](https://github.com/user-attachments/files/28394319/BUG003_reproducer_graph.cypher.txt) ```bash # Execute each Cypher statement from BUG003_reproducer_graph.cypher.txt through Apache AGE's cypher(graph, query) wrapper. ``` 2. Verify the graph size: ```cypher MATCH (n) RETURN count(n) AS nodes; -- nodes: 1280 MATCH ()-[r]->() RETURN count(r) AS relationships; -- relationships: 2450 ``` 3. Run the base query with the native profile command: ```cypher EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) MATCH (n0 :L2)-[r0 :T3]->(n1)<-[r1 :T4]-(n2 :L1), (n4 :L0)<-[r3 :T1]-(n5 :L1), (n6 :L3)-[r4 :T1]->(n7 :L3)-[r5 :T2]->(n3) WHERE ((((((((((true AND ((r0.id) <> (r1.id))) AND ((r0.id) <> (r3.id))) AND ((r0.id) <> (r4.id))) AND ((r0.id) <> (r5.id))) AND ((r1.id) <> (r3.id))) AND ((r1.id) <> (r4.id))) AND ((r1.id) <> (r5.id))) AND ((r3.id) <> (r4.id))) AND ((r3.id) <> (r5.id))) AND ((r4.id) <> (r5.id))) UNWIND [-2082148190, -1254759639, (r1.k62)] AS a0 RETURN (r0.k55) AS a1 ORDER BY a1 DESC; ``` 4. Run the transformed query with the native profile command: ```cypher EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) MATCH (n0 :L2)-[r0 :T3]->(n1)<-[r1 :T4]-(n2 :L1), (n4 :L0)<-[r3 :T1]-(n5 :L1), (n6 :L3)-[r4 :T1]->(n7 :L3)-[r5 :T2]->(n3) WHERE (((((((((((true AND ((r0.id) <> (r1.id))) AND ((r0.id) <> (r3.id))) AND ((r0.id) <> (r4.id))) AND ((r0.id) <> (r5.id))) AND ((r1.id) <> (r3.id))) AND ((r1.id) <> (r4.id))) AND ((r1.id) <> (r5.id))) AND ((r3.id) <> (r4.id))) AND ((r3.id) <> (r5.id))) AND ((r4.id) <> (r5.id)))) OR (((n0.id IS NULL OR n0.id IS NOT NULL)) IS NULL AND ((n0.id IS NULL OR n0.id IS NOT NULL)) IS NOT NULL) UNWIND [-2082148190, -1254759639, (r1.k62)] AS a0 RETURN (r0.k55) AS a1 ORDER BY a1 DESC; ``` The transformed query is built by rewriting the target `WHERE` predicate as: ```text base OR (expr IS NULL AND expr IS NOT NULL) ``` where `base` is the original predicate: ```cypher ((((((((((true AND ((r0.id) <> (r1.id))) AND ((r0.id) <> (r3.id))) AND ((r0.id) <> (r4.id))) AND ((r0.id) <> (r5.id))) AND ((r1.id) <> (r3.id))) AND ((r1.id) <> (r4.id))) AND ((r1.id) <> (r5.id))) AND ((r3.id) <> (r4.id))) AND ((r3.id) <> (r5.id))) AND ((r4.id) <> (r5.id))) ``` and `expr` is: ```cypher (n0.id IS NULL OR n0.id IS NOT NULL) ``` The injected boolean item is always false because the same value cannot be both `NULL` and `NOT NULL`. The transformed predicate reduces to `base OR false`, so it is equivalent to the original base predicate. ### Expected behavior The planner should simplify the redundant boolean item before plan construction because the transformed predicate is semantically equivalent to the base predicate. It should avoid retaining predicate residue or introducing extra filters, joins, barriers, scans, row production, memory, or buffer work. ### Actual behavior Both queries return the same result set on the attached graph. Primary PROFILE metric: `rows` changed from `1405.83` to `3032.0`; ratio `2.156`, delta `1626.17`. Base PROFILE excerpt: ```text Sort (cost=21282514.73..21304326.56 rows=8724732 width=32) (actual time=355.785..356.895 rows=12000.00 loops=1) Sort Key: (agtype_access_operator(VARIADIC ARRAY[_age_default_alias_previous_cypher_clause.r0, '"k55"'::agtype])) DESC Sort Method: quicksort Memory: 385kB Buffers: shared hit=5783, temp read=921 written=936 -> Subquery Scan on _age_default_alias_previous_cypher_clause (cost=53984.43..19859203.99 rows=8724732 width=32) (actual time=259.301..356.092 rows=12000.00 loops=1) Buffers: shared hit=5783, temp read=921 written=936 -> Gather (cost=53984.43..19750144.84 rows=8724732 width=448) (actual time=259.299..349.231 rows=12000.00 loops=1) Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=5783, temp read=921 written=936 -> ProjectSet (cost=52984.43..18876671.64 rows=3635305000 width=448) (actual time=250.327..337.343 rows=4000.00 loops=3) Buffers: shared hit=5783, temp read=921 written=936 -> Parallel Hash Join (cost=52984.43..618352.28 rows=3635305 width=112) (actual time=250.219..333.744 rows=1333.33 loops=3) Hash Cond: (r0.end_id = n1.id) Buffers: shared hit=5541, temp read=921 written=936 -> Hash Join (cost=52829.56..564932.27 rows=100967 width=112) (actual time=29.072..112.428 rows=1333.33 loops=3) Hash Cond: (r3.end_id = n4.id) Buffers: shared hit=5513, temp read=921 written=936 -> Hash Join (cost=52792.56..564629.29 rows=100967 width=120) (actual time=28.989..111.952 rows=9333.33 loops=3) Hash Cond: (r3.start_id = n5.id) Buffers: shared hit=5504, temp read=921 written=936 -> Nested Loop (cost=52755.56..564326.32 rows=100967 width=128) (actual time=28.956..109.908 rows=46533.33 loops=3) Join Filter: (((agtype_access_operator(VARIADIC ARRAY[r3.properties, '"id"'::agtype]) <> agtype_access_operator(VARIADIC ARRAY[r5.properties, '"id"'::agtype])))::agtype AND ((agtype_access_operator(VARIADIC ARRAY[r0.properties, '"id"'::agtype]) <> agtype_access_operator(VARIADIC ARRAY[r3.properties, '"id"'::agtype])))::agtype AND ((agtype_access_operator(VARIADIC ARRAY[r3.properties, '"id"'::agtype]) <> agtype_access_operator(VARIADIC ARRAY[r4.properties, '"id"'::agtype])))::agtype AND ((agtype_access_operator(VARIADIC ARRAY[r1.properties, '"id"'::agtype]) <> agtype_access_operator(VARIADIC ARRAY[r3.properties, '"id"'::agtype])))::agtype) Rows Removed by Join Filter: 133 Buffers: shared hit=5492, temp read=921 written=936 -> Hash Join (cost=52755.56..53174.82 rows=8431 width=176) (actual time=28.940..29.004 rows=133.33 loops=3) Hash Cond: (n3.id = r5.end_id) Buffers: shared hit=5471, temp read=921 written=936 -> Parallel Append (cost=0.00..117.36 rows=3001 width=8) (actual time=0.003..0.049 rows=426.67 loops=3) Buffers: shared hit=28 -> Parallel Seq Scan on "L2" n3_2 (cost=0.00..17.06 rows=706 width=8) (actual time=0.003..0.008 rows=70.00 loops=3) Buffers: shared hit=6 -> Parallel Seq Scan on "L0" n3_3 (cost=0.00..17.06 rows=706 width=8) (actual time=0.002..0.010 rows=170.00 loops=1) Buffers: shared hit=3 -> Parallel Seq Scan on "L4" n3_4 (cost=0.00..17.06 rows=706 width=8) (actual time=0.004..0.018 rows=310.00 loops=1) Buffers: shared hit=7 -> Parallel Seq Scan on "GdsmithNode" n3_5 (cost=0.00..17.06 rows=706 width=8) (actual time=0.003..0.008 rows=150.00 loops=1) Buffers: shared hit=2 -> Parallel Seq Scan on "L3" n3_6 (cost=0.00..17.06 rows=706 width=8) (actual time=0.003..0.017 rows=240.00 loops=1) Buffers: shared hit=6 -> Parallel Seq Scan on "L1" n3_7 (cost=0.00..17.06 rows=706 width=8) (actual time=0.001..0.008 rows=100.00 loops=2) Buffers: shared hit=4 -> Parallel Seq Scan on _ag_label_vertex n3_1 (cost=0.00..0.00 rows=1 width=8) (actual time=0.001..0.001 rows=0.00 loops=1) -> Hash (cost=52748.54..52748.54 rows=562 width=184) (actual time=28.902..28.906 rows=400.00 loops=3) Buckets: 1024 Batches: 1 Memory Usage: 244kB Buffers: shared hit=5443, temp read=921 written=936 -> Nested Loop (cost=50055.55..52748.54 rows=562 width=184) (actual time=27.366..28.852 rows=400.00 loops=3) Buffers: shared hit=5443, temp read=921 written=936 -> Merge Join (cost=50055.40..52625.19 rows=562 width=184) (actual time=26.637..28.438 rows=1200.00 loops=3) Merge Cond: (r1.end_id = r0.end_id) Join Filter: (_ag_enforce_edge_uniqueness2(r0.id, r1.id) AND ((agtype_access_operator(VARIADIC ARRAY[r1.properties, '"id"'::agtype]) <> agtype_access_operator(VARIADIC ARRAY[r5.properties, '"id"'::agtype])))::agtype AND ((agtype_access_operator(VARIADIC ARRAY[r0.properties, '"id"'::agtype]) <> agtype_access_operator(VARIADIC ARRAY[r1.properties, '"id"'::agtype])))::agtype AND ((agtype_access_operator(VARIADIC ARRAY[r1.properties, '"id"'::agtype]) <> agtype_access_operator(VARIADIC ARRAY[r4.properties, '"id"'::agtype])))::agtype) Buffers: shared hit=635, temp read=921 written=936 -> Index Scan using "T4_end_id_idx" on "T4" r1 (cost=0.15..62.70 rows=970 width=56) (actual time=0.011..0.051 rows=310.00 loops=3) Index Searches: 3 Buffers: shared hit=299 -> Sort (cost=50055.25..50078.73 rows=9390 width=128) (actual time=26.492..26.632 rows=3200.00 loops=3) Sort Key: r0.end_id Sort Method: quicksort Memory: 1436kB Buffers: shared hit=336, temp read=921 written=936 Worker 0: Sort Method: quicksort Memory: 1436kB Worker 1: Sort Method: quicksort Memory: 1436kB -> Hash Join (cost=34031.33..49435.66 rows=9390 width=128) (actual time=20.232..26.074 rows=3200.00 loops=3) Hash Cond: (r4.start_id = n6.id) Buffers: shared hit=330, temp read=921 written=936 -> Hash Join (cost=33994.33..49373.92 rows=9390 width=136) (actual time=20.053..25.745 rows=6400.00 loops=3) Hash Cond: (r4.end_id = n7.id) Join Filter: (_ag_enforce_edge_uniqueness2(r4.id, r5.id) AND ((agtype_access_operator(VARIADIC ARRAY[r4.properties, '"id"'::agtype]) <> agtype_access_operator(VARIADIC ARRAY[r5.properties, '"id"'::agtype])))::agtype AND ((agtype_access_operator(VARIADIC ARRAY[r0.properties, '"id"'::agtype]) <> agtype_access_operator(VARIADIC ARRAY[r4.properties, '"id"'::agtype])))::agtype) Buffers: shared hit=312, temp read=921 written=936 -> Seq Scan on "T1" r4 (cost=0.00..19.70 rows=970 width=56) (actual time=0.004..0.025 rows=350.00 loops=3) Buffers: shared hit=21 -> Hash (cost=24559.92..24559.92 rows=313633 width=120) (actual time=19.158..19.161 rows=7200.00 loops=3) Buckets: 65536 Batches: 8 Memory Usage: 2519kB Buffers: shared hit=229, temp written=912 -> Hash Join (cost=37.43..24559.92 rows=313633 width=120) (actual time=0.130..17.149 rows=7200.00 loops=3) Hash Cond: (r0.start_id = n0.id) Buffers: shared hit=229 -> Nested Loop (cost=0.43..23696.73 rows=313633 width=120) (actual time=0.091..15.741 rows=31500.00 loops=3) Join Filter: ((agtype_access_operator(VARIADIC ARRAY[r0.properties, '"id"'::agtype]) <> agtype_access_operator(VARIADIC ARRAY[r5.properties, '"id"'::agtype])))::agtype Buffers: shared hit=211 -> Merge Join (cost=0.43..152.10 rows=970 width=64) (actual time=0.073..0.149 rows=90.00 loops=3) Merge Cond: (n7.id = r5.start_id) Buffers: shared hit=187 -> Index Only Scan using "L3_pkey" on "L3" n7 (cost=0.15..66.15 rows=1200 width=8) (actual time=0.014..0.039 rows=240.00 loops=3) Heap Fetches: 720 Index Searches: 3 Buffers: shared hit=23 -> Index Scan using "T2_start_id_idx" on "T2" r5 (cost=0.28..70.83 rows=970 width=56) (actual time=0.006..0.070 rows=381.00 loops=3) Index Searches: 3 Buffers: shared hit=164 -> Materialize (cost=0.00..24.55 rows=970 width=56) (actual time=0.000..0.008 rows=350.00 loops=270) Storage: Memory Maximum Storage: 75kB Buffers: shared hit=24 -> Seq Scan on "T3" r0 (cost=0.00..19.70 rows=970 width=56) (actual time=0.004..0.024 rows=350.00 loops=3) Buffers: shared hit=24 -> Hash (cost=22.00..22.00 rows=1200 width=8) (actual time=0.024..0.024 rows=210.00 loops=3) Buckets: 2048 Batches: 1 Memory Usage: 25kB Buffers: shared hit=18 -> Seq Scan on "L2" n0 (cost=0.00..22.00 rows=1200 width=8) (actual time=0.004..0.013 rows=210.00 loops=3) Buffers: shared hit=18 -> Hash (cost=22.00..22.00 rows=1200 width=8) (actual time=0.026..0.026 rows=240.00 loops=3) Buckets: 2048 Batches: 1 Memory Usage: 26kB Buffers: shared hit=18 -> Seq Scan on "L3" n6 (cost=0.00..22.00 rows=1200 width=8) (actual time=0.005..0.015 rows=240.00 loops=3) Buffers: shared hit=18 -> Index Only Scan using "L1_pkey" on "L1" n2 (cost=0.15..0.22 rows=1 width=8) (actual time=0.000..0.000 rows=0.33 loops=3600) Index Cond: (id = r1.start_id) Heap Fetches: 1200 Index Searches: 3600 Buffers: shared hit=4808 -> Materialize (cost=0.00..24.55 rows=970 width=48) (actual time=0.000..0.011 rows=350.00 loops=400) Storage: Memory Maximum Storage: 66kB Buffers: shared hit=21 -> Seq Scan on "T1" r3 (cost=0.00..19.70 rows=970 width=48) (actual time=0.005..0.022 rows=350.00 loops=3) Buffers: shared hit=21 -> Hash (cost=22.00..22.00 rows=1200 width=8) (actual time=0.025..0.025 rows=200.00 loops=3) Buckets: 2048 Batches: 1 Memory Usage: 24kB Buffers: shared hit=12 -> Seq Scan on "L1" n5 (cost=0.00..22.00 rows=1200 width=8) (actual time=0.006..0.016 rows=200.00 loops=3) Buffers: shared hit=12 -> Hash (cost=22.00..22.00 rows=1200 width=8) (actual time=0.034..0.035 rows=170.00 loops=3) Buckets: 2048 Batches: 1 Memory Usage: 23kB Buffers: shared hit=9 -> Seq Scan on "L0" n4 (cost=0.00..22.00 rows=1200 width=8) (actual time=0.017..0.025 rows=170.00 loops=3) Buffers: shared hit=9 -> Parallel Hash (cost=117.36..117.36 rows=3001 width=8) (actual time=221.089..221.091 rows=426.67 loops=3) Buckets: 8192 Batches: 1 Memory Usage: 192kB Buffers: shared hit=28 -> Parallel Append (cost=0.00..117.36 rows=3001 width=8) (actual time=208.615..208.664 rows=426.67 loops=3) Buffers: shared hit=28 -> Parallel Seq Scan on "L2" n1_2 (cost=0.00..17.06 rows=706 width=8) (actual time=106.891..106.899 rows=105.00 loops=2) Buffers: shared hit=6 -> Parallel Seq Scan on "L0" n1_3 (cost=0.00..17.06 rows=706 width=8) (actual time=108.276..108.281 rows=85.00 loops=2) Buffers: shared hit=3 -> Parallel Seq Scan on "L4" n1_4 (cost=0.00..17.06 rows=706 width=8) (actual time=0.003..0.017 rows=310.00 loops=1) Buffers: shared hit=7 -> Parallel Seq Scan on "GdsmithNode" n1_5 (cost=0.00..17.06 rows=706 width=8) (actual time=0.005..0.011 rows=150.00 loops=1) Buffers: shared hit=2 -> Parallel Seq Scan on "L3" n1_6 (cost=0.00..17.06 rows=706 width=8) (actual time=0.005..0.018 rows=240.00 loops=1) Buffers: shared hit=6 -> Parallel Seq Scan on "L1" n1_7 (cost=0.00..17.06 rows=706 width=8) (actual time=195.514..195.529 rows=200.00 loops=1) Buffers: shared hit=4 -> Parallel Seq Scan on _ag_label_vertex n1_1 (cost=0.00..0.00 rows=1 width=8) (actual time=0.001..0.002 rows=0.00 loops=1) Planning: Buffers: shared hit=24 Planning Time: 22.477 ms JIT: Functions: 359 Options: Inlining true, Optimization true, Expressions true, Deforming true Timing: Generation 6.236 ms (Deform 2.235 ms), Inlining 72.404 ms, Optimization 282.046 ms, Emission 271.598 ms, Total 632.284 ms Execution Time: 358.945 ms ``` Transformed PROFILE excerpt: ```text Sort (cost=6360501050.93..6366923957.41 rows=2569162591 width=32) (actual time=2228.241..2228.717 rows=12000.00 loops=1) Sort Key: (agtype_access_operator(VARIADIC ARRAY[_age_default_alias_previous_cypher_clause.r0, '"k55"'::agtype])) DESC Sort Method: quicksort Memory: 385kB Buffers: shared hit=2780, temp written=25773 -> Subquery Scan on _age_default_alias_previous_cypher_clause (cost=28586766.50..5713081376.33 rows=2569162591 width=32) (actual time=2207.761..2227.782 rows=12000.00 loops=1) Buffers: shared hit=2780, temp written=25773 -> Gather (cost=28586766.50..5680966843.95 rows=2569162591 width=448) (actual time=2207.759..2221.255 rows=12000.00 loops=1) Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=2780, temp written=25773 -> ProjectSet (cost=28585766.50..5424049584.85 rows=1070484413000 width=448) (actual time=886.021..890.234 rows=4000.00 loops=3) Buffers: shared hit=2780, temp written=25773 -> Hash Join (cost=28585766.50..47541620.55 rows=1070484413 width=112) (actual time=885.997..887.013 rows=1333.33 loops=3) Hash Cond: (r4.start_id = n6.id) Buffers: shared hit=2780, temp written=25773 -> Parallel Hash Join (cost=28585729.50..44721637.69 rows=1070484413 width=120) (actual time=668.595..669.440 rows=2666.67 loops=3) Hash Cond: (r5.end_id = n3.id) Buffers: shared hit=2762, temp written=25773 -> Merge Join (cost=28585574.63..29036604.14 rows=29731549 width=128) (actual time=668.478..669.012 rows=2666.67 loops=3) Merge Cond: (n1.id = r0.end_id) Buffers: shared hit=2734, temp written=25773 -> Sort (cost=290.68..298.19 rows=3001 width=8) (actual time=0.074..0.084 rows=278.67 loops=3) Sort Key: n1.id Sort Method: quicksort Memory: 49kB Buffers: shared hit=34 Worker 0: Sort Method: quicksort Memory: 25kB Worker 1: Sort Method: quicksort Memory: 25kB -> Parallel Append (cost=0.00..117.36 rows=3001 width=8) (actual time=0.001..0.034 rows=426.67 loops=3) Buffers: shared hit=28 -> Parallel Seq Scan on "L2" n1_2 (cost=0.00..17.06 rows=706 width=8) (actual time=0.002..0.009 rows=210.00 loops=1) Buffers: shared hit=6 -> Parallel Seq Scan on "L0" n1_3 (cost=0.00..17.06 rows=706 width=8) (actual time=0.002..0.007 rows=170.00 loops=1) Buffers: shared hit=3 -> Parallel Seq Scan on "L4" n1_4 (cost=0.00..17.06 rows=706 width=8) (actual time=0.002..0.012 rows=310.00 loops=1) Buffers: shared hit=7 -> Parallel Seq Scan on "GdsmithNode" n1_5 (cost=0.00..17.06 rows=706 width=8) (actual time=0.002..0.006 rows=150.00 loops=1) Buffers: shared hit=2 -> Parallel Seq Scan on "L3" n1_6 (cost=0.00..17.06 rows=706 width=8) (actual time=0.001..0.010 rows=240.00 loops=1) Buffers: shared hit=6 -> Parallel Seq Scan on "L1" n1_7 (cost=0.00..17.06 rows=706 width=8) (actual time=0.002..0.009 rows=200.00 loops=1) Buffers: shared hit=4 -> Parallel Seq Scan on _ag_label_vertex n1_1 (cost=0.00..0.00 rows=1 width=8) (actual time=0.000..0.000 rows=0.00 loops=1) -> Materialize (cost=28585283.95..28595193.09 rows=1981828 width=128) (actual time=2005.130..2006.106 rows=8000.00 loops=1) Storage: Memory Maximum Storage: 292kB Buffers: shared hit=2700, temp written=25773 -> Sort (cost=28585283.95..28590238.52 rows=1981828 width=128) (actual time=2005.128..2005.334 rows=8000.00 loops=1) Sort Key: r0.end_id Sort Method: quicksort Memory: 2880kB Buffers: shared hit=2700, temp written=25773 -> Hash Join (cost=219.68..28120589.59 rows=1981828 width=128) (actual time=1061.307..2004.429 rows=8000.00 loops=1) Hash Cond: (r3.start_id = n5.id) Buffers: shared hit=2700, temp written=25773 -> Hash Join (cost=182.68..28115331.91 rows=1981828 width=136) (actual time=1060.058..2002.860 rows=40000.00 loops=1) Hash Cond: (r1.start_id = n2.id) Buffers: shared hit=2696, temp written=25773 -> Hash Join (cost=145.68..28110074.24 rows=1981828 width=136) (actual time=518.268..1997.557 rows=120000.00 loops=1) Hash Cond: (r0.start_id = n0.id) Join Filter: ((((agtype_access_operator(VARIADIC ARRAY[r0.properties, '"id"'::agtype]) <> agtype_access_operator(VARIADIC ARRAY[r1.properties, '"id"'::agtype])))::agtype AND ((agtype_access_operator(VARIADIC ARRAY[r0.properties, '"id"'::agtype]) <> agtype_access_operator(VARIADIC ARRAY[r3.properties, '"id"'::agtype])))::agtype AND ((agtype_access_operator(VARIADIC ARRAY[r0.properties, '"id"'::agtype]) <> agtype_access_operator(VARIADIC ARRAY[r4.properties, '"id"'::agtype])))::agtype AND ((agtype_access_operator(VARIADIC ARRAY[r0.properties, '"id"'::agtype]) <> agtype_access_operator(VARIADIC ARRAY[r5.properties, '"id"'::agtype])))::agtype AND ((agtype_access_operator(VARIADIC ARRAY[r1.properties, '"id"'::agtype]) <> agtype_access_operator(VARIADIC ARRAY[r3.properties, '"id"'::agtype])))::agtype AND ((agtype_access_operator(VARIADIC ARRAY[r1.properties, '"id"'::agtype]) <> agtype_access_operator(VARIADIC ARRAY[r4.p roperties, '"id"'::agtype])))::agtype AND ((agtype_access_operator(VARIADIC ARRAY[r1.properties, '"id"'::agtype]) <> agtype_access_operator(VARIADIC ARRAY[r5.properties, '"id"'::agtype])))::agtype AND ((agtype_access_operator(VARIADIC ARRAY[r3.properties, '"id"'::agtype]) <> agtype_access_operator(VARIADIC ARRAY[r4.properties, '"id"'::agtype])))::agtype AND ((agtype_access_operator(VARIADIC ARRAY[r3.properties, '"id"'::agtype]) <> agtype_access_operator(VARIADIC ARRAY[r5.properties, '"id"'::agtype])))::agtype AND ((agtype_access_operator(VARIADIC ARRAY[r4.properties, '"id"'::agtype]) <> agtype_access_operator(VARIADIC ARRAY[r5.properties, '"id"'::agtype])))::agtype) OR ((((agtype_access_operator(VARIADIC ARRAY[n0.properties, '"id"'::agtype]) IS NULL) OR (agtype_access_operator(VARIADIC ARRAY[n0.properties, '"id"'::agtype]) IS NOT NULL)) IS NULL) AND (((agtype_access_operator(VARIADIC ARRAY[n0.properties, '"id"'::agtype]) IS NULL) OR (agtype_access_operator(VARIADIC ARRAY[n0.properti es, '"id"'::agtype]) IS NOT NULL)) IS NOT NULL))) Buffers: shared hit=2692, temp written=25773 -> Merge Join (cost=108.68..27067637.46 rows=397012715 width=232) (actual time=407.044..1508.330 rows=440000.00 loops=1) Merge Cond: (r1.end_id = r0.end_id) Join Filter: _ag_enforce_edge_uniqueness2(r0.id, r1.id) Buffers: shared hit=2686, temp written=25773 -> Index Scan using "T4_end_id_idx" on "T4" r1 (cost=0.15..62.70 rows=970 width=56) (actual time=0.010..0.145 rows=310.00 loops=1) Index Searches: 1 Buffers: shared hit=99 -> Materialize (cost=108.53..6838339.95 rows=245574900 width=176) (actual time=0.181..1435.236 rows=1400000.00 loops=1) Storage: Disk Maximum Storage: 206184kB Buffers: shared hit=2587, temp written=25773 -> Nested Loop (cost=108.53..6224402.70 rows=245574900 width=176) (actual time=0.180..1109.826 rows=1400000.00 loops=1) Buffers: shared hit=2587 -> Index Scan using "T3_end_id_idx" on "T3" r0 (cost=0.15..62.70 rows=970 width=56) (actual time=0.005..0.190 rows=350.00 loops=1) Index Searches: 1 Buffers: shared hit=111 -> Hash Join (cost=108.38..3993.42 rows=253170 width=120) (actual time=0.026..2.991 rows=4000.00 loops=350) Hash Cond: (r3.end_id = n4.id) Buffers: shared hit=2476 -> Nested Loop (cost=71.38..3289.50 rows=253170 width=128) (actual time=0.001..2.154 rows=28000.00 loops=350) Buffers: shared hit=2473 -> Seq Scan on "T1" r3 (cost=0.00..19.70 rows=970 width=48) (actual time=0.001..0.013 rows=350.00 loops=350) Buffers: shared hit=2450 -> Materialize (cost=71.38..105.82 rows=261 width=80) (actual time=0.000..0.002 rows=80.00 loops=122500) Storage: Memory Maximum Storage: 40kB Buffers: shared hit=23 -> Hash Join (cost=71.38..104.52 rows=261 width=80) (actual time=0.087..0.117 rows=80.00 loops=1) Hash Cond: (r4.end_id = n7.id) Join Filter: _ag_enforce_edge_uniqueness2(r4.id, r5.id) Buffers: shared hit=23 -> Seq Scan on "T1" r4 (cost=0.00..19.70 rows=970 width=56) (actual time=0.001..0.013 rows=350.00 loops=1) Buffers: shared hit=7 -> Hash (cost=59.26..59.26 rows=970 width=64) (actual time=0.081..0.082 rows=90.00 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 27kB Buffers: shared hit=16 -> Hash Join (cost=37.00..59.26 rows=970 width=64) (actual time=0.039..0.074 rows=90.00 loops=1) Hash Cond: (r5.start_id = n7.id) Buffers: shared hit=16 -> Seq Scan on "T2" r5 (cost=0.00..19.70 rows=970 width=56) (actual time=0.003..0.021 rows=410.00 loops=1) Buffers: shared hit=10 -> Hash (cost=22.00..22.00 rows=1200 width=8) (actual time=0.031..0.032 rows=240.00 loops=1) Buckets: 2048 Batches: 1 Memory Usage: 26kB Buffers: shared hit=6 -> Seq Scan on "L3" n7 (cost=0.00..22.00 rows=1200 width=8) (actual time=0.011..0.021 rows=240.00 loops=1) Buffers: shared hit=6 -> Hash (cost=22.00..22.00 rows=1200 width=8) (actual time=0.018..0.018 rows=170.00 loops=1) Buckets: 2048 Batches: 1 Memory Usage: 23kB Buffers: shared hit=3 -> Seq Scan on "L0" n4 (cost=0.00..22.00 rows=1200 width=8) (actual time=0.004..0.010 rows=170.00 loops=1) Buffers: shared hit=3 -> Hash (cost=22.00..22.00 rows=1200 width=40) (actual time=0.018..0.019 rows=210.00 loops=1) Buckets: 2048 Batches: 1 Memory Usage: 60kB Buffers: shared hit=6 -> Seq Scan on "L2" n0 (cost=0.00..22.00 rows=1200 width=40) (actual time=0.002..0.008 rows=210.00 loops=1) Buffers: shared hit=6 -> Hash (cost=22.00..22.00 rows=1200 width=8) (actual time=0.017..0.017 rows=200.00 loops=1) Buckets: 2048 Batches: 1 Memory Usage: 24kB Buffers: shared hit=4 -> Seq Scan on "L1" n2 (cost=0.00..22.00 rows=1200 width=8) (actual time=0.002..0.008 rows=200.00 loops=1) Buffers: shared hit=4 -> Hash (cost=22.00..22.00 rows=1200 width=8) (actual time=0.022..0.023 rows=200.00 loops=1) Buckets: 2048 Batches: 1 Memory Usage: 24kB Buffers: shared hit=4 -> Seq Scan on "L1" n5 (cost=0.00..22.00 rows=1200 width=8) (actual time=0.005..0.013 rows=200.00 loops=1) Buffers: shared hit=4 -> Parallel Hash (cost=117.36..117.36 rows=3001 width=8) (actual time=0.074..0.076 rows=426.67 loops=3) Buckets: 8192 Batches: 1 Memory Usage: 128kB Buffers: shared hit=28 -> Parallel Append (cost=0.00..117.36 rows=3001 width=8) (actual time=0.009..0.126 rows=1280.00 loops=1) Buffers: shared hit=28 -> Parallel Seq Scan on "L2" n3_2 (cost=0.00..17.06 rows=706 width=8) (actual time=0.003..0.012 rows=210.00 loops=1) Buffers: shared hit=6 -> Parallel Seq Scan on "L0" n3_3 (cost=0.00..17.06 rows=706 width=8) (actual time=0.003..0.009 rows=170.00 loops=1) Buffers: shared hit=3 -> Parallel Seq Scan on "L4" n3_4 (cost=0.00..17.06 rows=706 width=8) (actual time=0.003..0.016 rows=310.00 loops=1) Buffers: shared hit=7 -> Parallel Seq Scan on "GdsmithNode" n3_5 (cost=0.00..17.06 rows=706 width=8) (actual time=0.003..0.008 rows=150.00 loops=1) Buffers: shared hit=2 -> Parallel Seq Scan on "L3" n3_6 (cost=0.00..17.06 rows=706 width=8) (actual time=0.003..0.011 rows=240.00 loops=1) Buffers: shared hit=6 -> Parallel Seq Scan on "L1" n3_7 (cost=0.00..17.06 rows=706 width=8) (actual time=0.006..0.016 rows=200.00 loops=1) Buffers: shared hit=4 -> Parallel Seq Scan on _ag_label_vertex n3_1 (cost=0.00..0.00 rows=1 width=8) (actual time=0.002..0.002 rows=0.00 loops=1) -> Hash (cost=22.00..22.00 rows=1200 width=8) (actual time=217.393..217.393 rows=240.00 loops=3) Buckets: 2048 Batches: 1 Memory Usage: 26kB Buffers: shared hit=18 -> Seq Scan on "L3" n6 (cost=0.00..22.00 rows=1200 width=8) (actual time=217.352..217.375 rows=240.00 loops=3) Buffers: shared hit=18 Planning: Buffers: shared hit=6 Planning Time: 20.595 ms JIT: Functions: 383 Options: Inlining true, Optimization true, Expressions true, Deforming true Timing: Generation 5.759 ms (Deform 2.084 ms), Inlining 69.621 ms, Optimization 290.668 ms, Emission 291.912 ms, Total 657.961 ms Execution Time: 2238.285 ms ``` Missed optimization summary: the planner keeps the NULL-contradiction branch in a filter instead of reducing it to `false`, raising `rows` from `1405.83` to `3032.0`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
