Hi, I am testing performance both PG12 and PG11. I found the case of performance degradation in PG12.
Amit Langote help me to analyze and to create patch. Thanks! * environment CentOS Linux release 7.6.1810 (Core) postgresql 12.1 postgresql 11.6 * postgresql.conf shared_buffers = 2048MB max_parallel_workers_per_gather = 0 work_mem = '64MB' jit = off * test case CREATE TABLE realtest(a real, b real, c real, d real, e real); INSERT INTO realtest SELECT i,i,i,i,i FROM generate_series(0,10000000) AS i; EXPLAIN (ANALYZE on, VERBOSE on, BUFFERS on) select (2 * a) , (2 * b) , (2 * c), (2 * d), (2 * e) from realtest; * result PG12.1 5878.389 ms PG11.6 4533.554 ms ** PostgreSQL 12.1 pgbench=# EXPLAIN (ANALYZE on, VERBOSE on, BUFFERS on) select (2 * a) , (2 * b) , (2 * c), (2 * d), (2 * e) from realtest; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------ Seq Scan on public.realtest (cost=0.00..288697.59 rows=10000115 width=40) (actual time=0.040..5195.328 rows=10000001 loops=1) Output: ('2'::double precision * a), ('2'::double precision * b), ('2'::double precision * c), ('2'::double precision * d), ('2'::double precision * e) Buffers: shared hit=63695 Planning Time: 0.051 ms Execution Time: 5878.389 ms (5 行) Samples: 6K of event 'cpu-clock', Event count (approx.): 1577750000 Overhead Command Shared Object Symbol 25.48% postgres postgres [.] ExecInterpExpr ★18.65% postgres libc-2.17.so [.] __isinf 14.36% postgres postgres [.] float84mul 8.54% postgres [vdso] [.] __vdso_clock_gettime 4.02% postgres postgres [.] ExecScan 3.69% postgres postgres [.] tts_buffer_heap_getsomeattrs 2.63% postgres libc-2.17.so [.] __clock_gettime 2.55% postgres postgres [.] HeapTupleSatisfiesVisibility 2.00% postgres postgres [.] heapgettup_pagemode ** PostgreSQL 11.6 pgbench=# EXPLAIN (ANALYZE on, VERBOSE on, BUFFERS on) select (2 * a) , (2 * b) , (2 * c), (2 * d), (2 * e) from realtest; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------ Seq Scan on public.realtest (cost=0.00..288697.59 rows=10000115 width=40) (actual time=0.012..3845.480 rows=10000001 loops=1) Output: ('2'::double precision * a), ('2'::double precision * b), ('2'::double precision * c), ('2'::double precision * d), ('2'::double precision * e) Buffers: shared hit=63695 Planning Time: 0.033 ms Execution Time: 4533.554 ms (5 行) Samples: 4K of event 'cpu-clock', Event count (approx.): 1192000000 Overhead Command Shared Object Symbol 32.30% postgres postgres [.] ExecInterpExpr 14.95% postgres postgres [.] float84mul 10.57% postgres [vdso] [.] __vdso_clock_gettime ★ 6.84% postgres libc-2.17.so [.] __isinf 3.96% postgres postgres [.] ExecScan 3.50% postgres libc-2.17.so [.] __clock_gettime 3.31% postgres postgres [.] heap_getnext 3.08% postgres postgres [.] HeapTupleSatisfiesMVCC 2.77% postgres postgres [.] slot_deform_tuple 2.37% postgres postgres [.] ExecProcNodeInstr 2.08% postgres postgres [.] standard_ExecutorRun * cause Obviously, even in common cases where no overflow occurs, you can tell that PG 12 is performing isinf() 3 times on every call of float8_mul() once for each of val1, val2, result where as PG 11 is performing only once for result. That's because check_float8_val() (in PG 12) is a function whose arguments must be evaluated before it is called (it is inline, but that's irrelevant), whereas CHECKFLOATVAL() (in PG11) is a macro whose arguments are only substituted into its body. By the way, this change of float8mul() implementation is mostly due to the following commit in PG 12 development cycle: commit 6bf0bc842bd75877e31727eb559c6a69e237f831 Especially, the following diff: @@ -894,13 +746,8 @@ float8mul(PG_FUNCTION_ARGS) { float8 arg1 = PG_GETARG_FLOAT8(0); float8 arg2 = PG_GETARG_FLOAT8(1); - float8 result; - - result = arg1 * arg2; - CHECKFLOATVAL(result, isinf(arg1) || isinf(arg2), - arg1 == 0 || arg2 == 0); - PG_RETURN_FLOAT8(result); + PG_RETURN_FLOAT8(float8_mul(arg1, arg2)); } * patch This patch uses MACRO which was used by PG11. I tried attached patch, which can be applied to PG 12 source and performed a benchmark: PG12.1 5878.389 ms PG11.6 4533.554 ms PG12.1 + Patch 4679.162 ms ** PostgreSQL 12.1 + Patch postgres=# EXPLAIN (ANALYZE on, VERBOSE on, BUFFERS on) select (2 * a) , (2 * b) , (2 * c), (2 * d), (2 * e) from realtest; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------ Seq Scan on public.realtest (cost=0.00..307328.38 rows=10828150 width=40) (actual time=0.012..4009.012 rows=10000001 loops=1) Output: ('2'::double precision * a), ('2'::double precision * b), ('2'::double precision * c), ('2'::double precision * d), ('2'::double precision * e) Buffers: shared hit=63695 Planning Time: 0.038 ms Execution Time: 4679.162 ms (5 rows) Samples: 5K of event 'cpu-clock', Event count (approx.): 1376750000 Overhead Command Shared Object Symbol 31.43% postgres postgres [.] ExecInterpExpr 14.24% postgres postgres [.] float84mul 10.40% postgres [vdso] [.] __vdso_clock_gettime ★ 5.41% postgres libc-2.17.so [.] __isinf 4.63% postgres postgres [.] tts_buffer_heap_getsomeattrs 4.03% postgres postgres [.] ExecScan 3.54% postgres libc-2.17.so [.] __clock_gettime 3.12% postgres postgres [.] HeapTupleSatisfiesVisibility 2.36% postgres postgres [.] heap_getnextslot 2.16% postgres postgres [.] heapgettup_pagemode 2.09% postgres postgres [.] standard_ExecutorRun 2.07% postgres postgres [.] SeqNext 2.03% postgres postgres [.] ExecProcNodeInstr 2.03% postgres postgres [.] tts_virtual_clear PG 12 is still slower compared to PG 11, but the __isinf() situation is better with the patch. Best Regards, Keisuke Kuroda
check-float-val.patch
Description: Binary data