Hi!

On 03.10.2024 22:52, Ivan Kush wrote:

Hello, hackers! I with my friends propose the patch to replace IN VALUES to ANY in WHERE clauses.

# Intro

The `VALUES` in the `IN VALUES` construct is replaced with with an array of values when `VALUES` contains 1 column. In the end it will be replaced with ANY by the existing function makeA_Expr (src/backend/nodes/makefuncs.c)

This improves performance, especially if the values are small.

# Patch

v1-in_values_to_array.patch

# How realized

`VALUES` statement corresponds to `values_clause` nonterminal symbol in gram.y, where it's parsed to `SelectStmt` node.

`IN` is parsed in `a_expr` symbol. When it contains `VALUES` with 1 column, parser extracts data from `SelectStmt` and passes it

to function call `makeSimpleA_Expr` where simple `A_Expr` is created.

Later during optimizations of parser tree this `A_Expr` will be transformed to `ArrayExpr` (already realized in Postgres)


# Authors.
Author: Ivan Kush <ivan.k...@tantorlabs.com>
Author: Vadim Yacenko <vadim.yace...@tantorlabs.com>
Author: Alexander Simonov <alexander.simo...@tantorlabs.com>

# Tests
Implementation contains many regression tests of varying complexity, which check supported features.

# Platform
This patch was checkouted from tag REL_17_STABLE. Code is developed in Linux, doesn't contain platfrom-specific code, only Postgres internal data structures and functions.

# Documentation
Regression tests contain many examples

# Performance
It increases performance

# Example
Let's compare result. With path the execution time is significantly lower.

We have a table table1 with 10000 rows.

postgres=# \d table1;
                         Table "public.table1"
 Column |            Type             | Collation | Nullable | Default
--------+-----------------------------+-----------+----------+---------
 fld1   | timestamp without time zone |           | not null |
 fld2   | bytea                       |           | not null |
Indexes:
    "table1index" btree (fld2)

Let's execute several commands
see commands.sql

Plan no patch
see plan_no_patch.txt


Plan with patch
see plan_with_patch.txt

I think you should think about putting these constants in ANY Array

EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF)
select * from t
    where x in (VALUES(1200), (1));
                    QUERY PLAN
---------------------------------------------------
 Seq Scan on t (actual rows=1 loops=1)
   Filter: (x = ANY ('{1200,1}'::integer[]))
(3 rows)

Anlrey Lepikhov and I recently described this in an article [0] here and the implementation already exists, but for now it was posted a binary application for testing. The acceleration is significant I agree.

[0] https://danolivo.substack.com/p/7456653e-9716-4e91-ad09-83737784c665

--
Regards,
Alena Rybakina
Postgres Professional



Reply via email to