Re: Replace IN VALUES with ANY in WHERE clauses during optimization

Alena Rybakina Fri, 04 Apr 2025 11:23:13 -0700

On 02.04.2025 22:00, Alexander Korotkov wrote:

Hi, Alena!
On Wed, Apr 2, 2025 at 5:33 PM Alena Rybakina<a.rybak...@postgrespro.ru> wrote:
    I prepared a patch according to my suggestions, it just checks
    that the transformation is not carried out if there is a var
    element, there are changes only in one test, but I think it is
    correct.

    diff -U3
    /home/alena/postgrespro_or3/src/test/regress/expected/subselect.out
    /home/alena/postgrespro_or3/src/test/regress/results/subselect.out
    ---
    /home/alena/postgrespro_or3/src/test/regress/expected/subselect.out
    2025-04-02 02:50:07.018329864 +0300
    +++
    /home/alena/postgrespro_or3/src/test/regress/results/subselect.out
    2025-04-02 17:27:09.845104001 +0300
    @@ -3027,18 +3027,15 @@
     SELECT ten FROM onek t WHERE unique1 IN (VALUES (0), ((2 IN
       (SELECT (3)))::integer)
     );
    -                     QUERY PLAN
    -----------------------------------------------------
    - Nested Loop
    -   ->  Unique
    -         ->  Sort
    -               Sort Key: "*VALUES*".column1
    -               ->  Values Scan on "*VALUES*"
    -                     SubPlan 1
    -                       ->  Result
    -   ->  Index Scan using onek_unique1 on onek t
    -         Index Cond: (unique1 = "*VALUES*".column1)
    -(9 rows)
    +                                           QUERY PLAN
    
+------------------------------------------------------------------------------------------------
    + Bitmap Heap Scan on onek t
    +   Recheck Cond: (unique1 = ANY (ARRAY[0, ((ANY (2 = (hashed
    SubPlan 1).col1)))::integer]))
    +   ->  Bitmap Index Scan on onek_unique1
    +         Index Cond: (unique1 = ANY (ARRAY[0, ((ANY (2 = (hashed
    SubPlan 1).col1)))::integer]))
    +   SubPlan 1
    +     ->  Result
    +(6 rows)

     -- Alow to transformation and hold conversion between types of
    colemns and
     -- declared type of column pointed in RTE

Your patch turns back performance degradation issue that I mention in [1].

With your patch.
# explain analyze select * from test where val in (VALUES (5000),(4000), (9000), (2000), (1000), (140050), ((select 1)));
            QUERY PLAN
-------------------------------------------------------------------------------------------------------
Seq Scan on test (cost=0.01..23175.01 rows=7 width=4) (actualtime=0.455..417.929 rows=7.00 loops=1) Filter: (val = ANY (ARRAY[5000, 4000, 9000, 2000, 1000, 140050,(InitPlan 1).col1]))
   Rows Removed by Filter: 999993
   Buffers: shared hit=2362 read=2063
   InitPlan 1
-> Result (cost=0.00..0.01 rows=1 width=4) (actualtime=0.002..0.002 rows=1.00 loops=1)
 Planning Time: 0.211 ms
 Execution Time: 417.984 ms
(8 rows)

Without your patch.
# explain analyze select * from test where val in (VALUES (5000),(4000), (9000), (2000), (1000), (140050), ((select 1)));
 QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Hash Semi Join (cost=0.19..17050.26 rows=7 width=4) (actualtime=0.289..210.335 rows=7.00 loops=1)
   Hash Cond: (test.val = "*VALUES*".column1)
   Buffers: shared hit=2211 read=2214
   InitPlan 1
-> Result (cost=0.00..0.01 rows=1 width=4) (actualtime=0.002..0.002 rows=1.00 loops=1) -> Seq Scan on test (cost=0.00..14425.00 rows=1000000 width=4)(actual time=0.203..85.925 rows=1000000.00 loops=1)
         Buffers: shared hit=2211 read=2214
-> Hash (cost=0.09..0.09 rows=7 width=4) (actualtime=0.067..0.068 rows=7.00 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
-> Values Scan on "*VALUES*" (cost=0.00..0.09 rows=7width=4) (actual time=0.004..0.023 rows=7.00 loops=1)
 Planning Time: 0.321 ms
 Execution Time: 210.409 ms
(12 rows)
Only when we have all consts in the array, we can have SAOP hashingwhich is competitive with hashing of VALUES node. So, I'm continuingwith my version of patch.
Links.
1.https://www.postgresql.org/message-id/CAPpHfds-7eJ3ZMgyeVrMfC5E1nTHD4Bp0ch5MZhrYSoiCfERXw%40mail.gmail.com

Okay, I agree with you.

--
Regards,
Alena Rybakina
Postgres Professional

Re: Replace IN VALUES with ANY in WHERE clauses during optimization

Reply via email to