On Mon, 3 Feb 2025 at 15:54, Alena Rybakina <a.rybak...@postgrespro.ru> wrote: > > > On 03.02.2025 14:32, Alexander Korotkov wrote: > > On Mon, Feb 3, 2025 at 12:22 PM Alena Rybakina > <a.rybak...@postgrespro.ru> wrote: > > Thank you for updated version! I agree for your version of the code. > > On 02.02.2025 21:00, Alexander Korotkov wrote: > > On Fri, Jan 31, 2025 at 4:31 PM Alena Rybakina > <a.rybak...@postgrespro.ru> wrote: > > I started reviewing at the patch and saw some output "ERROR" in the output of > the test and is it okay here? > > SELECT * FROM tenk1 t1 > WHERE t1.thousand = 42 OR t1.thousand = (SELECT t2.tenthous FROM tenk1 t2 > WHERE t2.thousand = t1.tenthous); > ERROR: more than one row returned by a subquery used as an expression > > The output is correct for this query. But the query is very > unfortunate for the regression test. I've revised query in the v47 > revision [1]. > > Links. > 1. > https://www.postgresql.org/message-id/CAPpHfdsBZmNt9qUoJBqsQFiVDX1%3DyCKpuVAt1YnR7JCpP%3Dk8%2BA%40mail.gmail.com > > While analyzing the modified query plan from the regression test, I noticed > that despite using a full seqscan for table t2 in the original plan, > its results are cached by Materialize node, and this can significantly speed > up the execution of the NestedLoop algorithm. > > For example, after running the query several times, I got results that show > that the query execution time was twice as bad. > > Original plan: > > EXPLAIN ANALYZE SELECT * FROM bitmap_split_or t1, bitmap_split_or t2 WHERE > t1.a=t2.b OR t1.a=1; QUERY PLAN > -------------------------------------------------------------------------------------------------------------------------------- > Nested Loop (cost=0.00..70067.00 rows=2502499 width=24) (actual > time=0.015..1123.247 rows=2003000 loops=1) Join Filter: ((t1.a = t2.b) OR > (t1.a = 1)) Rows Removed by Join Filter: 1997000 Buffers: shared hit=22 -> > Seq Scan on bitmap_split_or t1 (cost=0.00..31.00 rows=2000 width=12) (actual > time=0.006..0.372 rows=2000 loops=1) Buffers: shared hit=11 -> Materialize > (cost=0.00..41.00 rows=2000 width=12) (actual time=0.000..0.111 rows=2000 > loops=2000) Storage: Memory Maximum Storage: 110kB Buffers: shared hit=11 -> > Seq Scan on bitmap_split_or t2 (cost=0.00..31.00 rows=2000 width=12) (actual > time=0.003..0.188 rows=2000 loops=1) Buffers: shared hit=11 Planning Time: > 0.118 ms Execution Time: 1204.874 ms (13 rows) > > Query plan after the patch: > > EXPLAIN ANALYZE SELECT * FROM bitmap_split_or t1, bitmap_split_or t2 WHERE > t1.a=t2.b OR t1.a=1; QUERY PLAN > ----------------------------------------------------------------------------------------------------------------------------------------------- > Nested Loop (cost=0.28..56369.00 rows=2502499 width=24) (actual > time=0.121..2126.606 rows=2003000 loops=1) Buffers: shared hit=16009 read=2 > -> Seq Scan on bitmap_split_or t2 (cost=0.00..31.00 rows=2000 width=12) > (actual time=0.017..0.652 rows=2000 loops=1) Buffers: shared hit=11 -> Index > Scan using t_a_b_idx on bitmap_split_or t1 (cost=0.28..18.15 rows=1002 > width=12) (actual time=0.044..0.627 rows=1002 loops=2000) Index Cond: (a = > ANY (ARRAY[t2.b, 1])) Buffers: shared hit=15998 read=2 Planning Time: 0.282 > ms Execution Time: 2344.367 ms (9 rows) > > I'm afraid that we may lose this with this optimization. Maybe this can be > taken into account somehow, what do you think? > > The important aspect is that the second plan have lower cost than the > first one. So, that's the question to the cost model. The patch just > lets optimizer consider more comprehensive plurality of paths. You > can let optimizer select the first plan by tuning *_cost params. For > example, setting cpu_index_tuple_cost = 0.02 makes first plan win for > me. > > Other than that the test query is quite unfortunate as t1.a=1 is very > frequent. I've adjusted the query so that nested loop with index scan > wins both in cost and execution time. > > I've also adjusted another test query as proposed by Andrei. > > I'm going to push this patch is there is no more notes. > > Links. > 1. > https://www.postgresql.org/message-id/fc1017ca-877b-4f86-b491-154cf123eedd%40gmail.com > > > Okay.I agree with your code and have no more notes
Hi, Alexander! I've looked at patchset v48 and it looks good to me. Regards, Pavel Borisov Supabase