On Tue, 13 Jul 2021 at 01:59, Ronan Dunklau <ronan.dunk...@aiven.io> wrote: > > 3. This seems to be a bug fix where byval datum sorts do not properly > > handle bounded sorts. I think that maybe that should be fixed and > > backpatched. I don't see anything that says Datum sorts can't be > > bounded and if there were some restriction on that I'd expect > > tuplesort_set_bound() to fail when the Tuplesortstate had been set up > > with tuplesort_begin_datum(). > > I've kept this as-is for now, but I will remove it from my patch if it is > deemed worthy of back-patching in your other thread.
I've now pushed that bug fix so it's fine to remove the change to tuplesort.c now. I also did a round of benchmarking on this patch using the attached script. Anyone wanting to run it will need to run make installcheck first to create the required tables. On an AMD machine, I got the following results. Result in transactions per second. Test master v5 patch compare Test1 446.1 657.3 147.32% Test2 315.8 314.0 99.44% Test3 302.3 392.1 129.67% Test4 232.7 230.7 99.12% Test5 230.0 446.1 194.00% Test6 199.5 217.9 109.23% Test7 188.7 185.3 98.21% Test8 385.4 544.0 141.17% Tests 2, 4, 7 are designed to check if there is any regression from doing the additional run-time checks to see if we're doing datumSort. I measured a very small penalty from this. It's most visible in test7 with a drop of about 1.8%. Each test did OFFSET 1000000 as I didn't want to measure the overhead of outputting tuples. All the other tests show a pretty good gain. Test6 is testing a byref type, so it appears the gains are not just from byval datums. It would be good to see the benchmark script run on a few other machines to get an idea if the gains and losses are consistent. In theory, we likely could get rid of the small regression by having two versions of ExecSort() and setting the correct one during ExecInitSort() by setting the function pointer to the version we want to use in sortstate->ss.ps.ExecProcNode. But maybe the small regression is not worth going to that trouble over. I'm not aware of any other executor nodes that have logic like that so maybe it would be a bad idea to introduce something like that. David
#!/bin/bash # This should improve echo "select two from tenk1 order by two offset 1000000" > bench.sql echo Test1 pgbench -n -f bench.sql -T 10 regression | grep tps pgbench -n -f bench.sql -T 10 regression | grep tps pgbench -n -f bench.sql -T 10 regression | grep tps # This should stay the same echo "select * from tenk1 order by two offset 1000000" > bench.sql echo Test2 pgbench -n -f bench.sql -T 10 regression | grep tps pgbench -n -f bench.sql -T 10 regression | grep tps pgbench -n -f bench.sql -T 10 regression | grep tps # This should improve echo "select tenthous from tenk1 order by tenthous offset 1000000" > bench.sql echo Test3 pgbench -n -f bench.sql -T 10 regression | grep tps pgbench -n -f bench.sql -T 10 regression | grep tps pgbench -n -f bench.sql -T 10 regression | grep tps # This should stay the same echo "select * from tenk1 order by tenthous offset 1000000" > bench.sql echo Test4 pgbench -n -f bench.sql -T 10 regression | grep tps pgbench -n -f bench.sql -T 10 regression | grep tps pgbench -n -f bench.sql -T 10 regression | grep tps # This should improve echo "select stringu1 from tenk1 order by stringu1 offset 1000000" > bench.sql echo Test5 pgbench -n -f bench.sql -T 10 regression | grep tps pgbench -n -f bench.sql -T 10 regression | grep tps pgbench -n -f bench.sql -T 10 regression | grep tps # This should improve echo "select stringu2 from tenk1 order by stringu2 offset 1000000" > bench.sql echo Test6 pgbench -n -f bench.sql -T 10 regression | grep tps pgbench -n -f bench.sql -T 10 regression | grep tps pgbench -n -f bench.sql -T 10 regression | grep tps # This should stay the same echo "select * from tenk1 order by stringu2 offset 1000000" > bench.sql echo Test7 pgbench -n -f bench.sql -T 10 regression | grep tps pgbench -n -f bench.sql -T 10 regression | grep tps pgbench -n -f bench.sql -T 10 regression | grep tps # This should improve echo "select twenty from tenk1 order by twenty offset 1000000" > bench.sql echo Test8 pgbench -n -f bench.sql -T 10 regression | grep tps pgbench -n -f bench.sql -T 10 regression | grep tps pgbench -n -f bench.sql -T 10 regression | grep tps