Hello Andrey! Following up on the things I owed you: the benchmarks, the consistency check, and a note for the 2^53 case.
I added a fast path. Each integer opclass's consistent() / distance() now
detects the "same type" case and calls the original gbt_num_consistent() /
gbt_num_distance() directly.
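To make the shape of that fast path concrete, here is a minimal standalone sketch. It is not the patch code: Int4Key, same_type_may_contain(), cross_type_may_contain() and int4_may_contain() are hypothetical names, and it assumes a same-type call arrives with the subtype equal to the opclass input type or InvalidOid. The OID values are the ones from pg_type.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Oid;

/* OID values as found in PostgreSQL's pg_type catalog. */
#define InvalidOid 0u
#define INT8OID 20u
#define INT2OID 21u
#define INT4OID 23u

/* Hypothetical, simplified int4 index key: the [lower, upper] range of a page. */
typedef struct
{
	int32_t		lower;
	int32_t		upper;
} Int4Key;

/* Stand-in for the original same-type path; compares int4 values directly. */
static bool
same_type_may_contain(const Int4Key *key, int32_t query)
{
	return key->lower <= query && query <= key->upper;
}

/* Stand-in for the cross-type path; widens the key to int64 before comparing. */
static bool
cross_type_may_contain(const Int4Key *key, int64_t query)
{
	return (int64_t) key->lower <= query && query <= (int64_t) key->upper;
}

/*
 * Hypothetical dispatcher for an "=" lookup.  Assumption: a same-type call
 * carries subtype == INT4OID or InvalidOid, so the query fits in int32.
 */
static bool
int4_may_contain(const Int4Key *key, int64_t query, Oid subtype)
{
	if (subtype == InvalidOid || subtype == INT4OID)
		return same_type_may_contain(key, (int32_t) query);	/* fast path */

	return cross_type_may_contain(key, query);
}
```

The only extra cost a same-type query pays in this sketch is the single subtype comparison before the original code runs.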
To confirm there's no regression I ran a microbenchmark on an -O2 build, no
asserts, single client, over a 500k row int4 GiST index, with the following
options:
-c enable_seqscan=off \
-c enable_bitmapscan=off \
-c enable_sort=off \
-c max_parallel_workers_per_gather=0
This is the setup for the benchmark:
CREATE EXTENSION IF NOT EXISTS btree_gist;
DROP TABLE IF EXISTS benchg;
CREATE TABLE benchg (a int4);
INSERT INTO benchg SELECT g FROM generate_series(0, 499999) g;
CREATE INDEX benchg_idx ON benchg USING gist (a);
VACUUM (ANALYZE, FREEZE) benchg;
And the two workloads:
consistent(), full-range index-only count(*):
SELECT count(*) FROM benchg WHERE a >= 0 AND a <= 499999;
distance(), full KNN ordering (ORDER BY a<->k over all rows):
SELECT count(*)
FROM (SELECT a FROM benchg ORDER BY a <-> 250000 LIMIT 1000000) q;
The numbers in ms (12 repetitions, 15s each) before
(3e3d7875e95621b02311ea3443e5139e3bce944a) and after my patch:
                    min     med     mean   (ms)
before consistent   51.754  52.718  54.137
after  consistent   52.042  52.480  52.572
------------------------------------------------
before distance     76.863  77.177  77.395
after  distance     77.357  77.803  77.980
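All numbers appear to be within measurement noise (the medians differ by less
than 1% in both workloads). The exception is the "before" mean for
consistent(), which is probably inflated by one slow rep: its median sits
close to the "after" median.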
Regarding the other point, I explored the regression suite path I mentioned.
The consistent() / distance() functions dispatch cross-type queries through a
single static table of supported subtype OIDs (gbt_int_crosstype_table in
btree_utils_num.c). I expose that exact table to SQL through
gbt_int_crosstype_subtypes(), so there is no hand-maintained second copy of
the list.
The int_crosstype.sql regression test then builds the set of cross-type
(lefttype, righttype, strategy) entries that should exist in pg_amop from that
function, and EXCEPTs it against the cross-type rows actually present in
gist_int{2,4,8}_ops:
- a pg_amop row whose subtype the C dispatch does not handle shows up as
"unexpected in pg_amop", and
- a dispatch entry without the matching pg_amop rows shows up as
"missing from pg_amop".
Either kind of drift produces a diff under `make check`. So adding an ALTER
OPERATOR FAMILY entry without a matching dispatch entry (or vice versa) fails
the suite (as I mentioned in my previous email, I'm not aware of a way to do
this with amvalidate() without patching core).
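The two-way comparison can be sketched outside SQL as well. The following is only an illustration of the logic, not the regression test itself: AmopEntry and count_except() are hypothetical names, and each call mirrors one EXCEPT direction over (lefttype, righttype, strategy) entries.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;

/* Hypothetical in-memory form of one cross-type pg_amop entry. */
typedef struct
{
	Oid			lefttype;
	Oid			righttype;
	int			strategy;
} AmopEntry;

static bool
entry_equal(AmopEntry a, AmopEntry b)
{
	return a.lefttype == b.lefttype &&
		a.righttype == b.righttype &&
		a.strategy == b.strategy;
}

static bool
set_contains(const AmopEntry *set, size_t n, AmopEntry e)
{
	for (size_t i = 0; i < n; i++)
	{
		if (entry_equal(set[i], e))
			return true;
	}
	return false;
}

/*
 * Count the entries of a that are absent from b (the "a EXCEPT b" direction).
 * expected EXCEPT actual  -> "missing from pg_amop"
 * actual EXCEPT expected  -> "unexpected in pg_amop"
 */
static size_t
count_except(const AmopEntry *a, size_t na, const AmopEntry *b, size_t nb)
{
	size_t		count = 0;

	for (size_t i = 0; i < na; i++)
	{
		if (!set_contains(b, nb, a[i]))
			count++;
	}
	return count;
}
```

The check passes only when both directions are empty, which is why a stray entry on either side is enough to fail the suite.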
I'm attaching the new set of patches, this time including the tests.
Best regards!
0001-Implement-cross-type-operators-for-GiST-indexes.patch
0002-Add-tests-for-cross-type-operators-for-GiST-indexes.patch
