On Wed, 20 May 2026 at 10:45, Maxime Schoemans <[email protected]> wrote: > > Hi hackers, > > This patch set adds multi-entry support to the GiST and SP-GiST access > methods, allowing a single heap tuple to produce multiple index entries -- > similar to how GIN decomposes values via extractValue, but within the > R-tree (GiST) and quad-tree (SP-GiST) frameworks.
I think the attachments were lost in transmission. > Motivation > ---------- > > The existing GiST multirange opclass compresses a multirange into its > bounding union range, losing information about gaps. This makes operators > like @> and <@ imprecise at the index level, requiring many false-positive > rechecks. Multi-entry GiST instead stores each component range as a > separate index entry, giving the R-tree much tighter bounds and > significantly reducing rechecks for multiranges with gaps. Cool! I'm not sure this is preferable over a specialized AM (comparable to what GIN is to btree), but cool! > Other design decisions: > [...] > - Index-only scans are disabled on a multi-entry key column, since > the stored sub-entries do not represent the original datum. How is this done? Surely it's not through a planner check of GiST opclasses? > - For SP-GiST, compress becomes optional when extractValue is present: > extractValue produces the leaf-typed values directly, and leafType > may differ from the input type (e.g., anymultirange -> anyrange). I don't think it's generally safe to remove compression even when extractValue is present. > - Since leaf consistent functions see components, a separate opclass > (multirange_me_ops) is provided alongside the existing multirange_ops. > > Patch overview > -------------- > > 0001 - GiST AM infrastructure (gist.h, gist_private.h, gist.c, > gistbuild.c, gistget.c, gistscan.c, gistutil.c, gistvalidate.c): new > extractValue support function (proc 13), multi-entry insert/build paths, > TID deduplication during scans using simplehash. Could you help me understand how this deduplication works? Wouldn't this possibly use ~ unbounded memory (or however much memory is needed to store up to 2^48 TIDs)? GIN gets away with it because it uses a lossy Bitmap structure, and because it traverses the TID space linearly, but we can't use lossy checks for amgettuple() without losing correctness, and I don't think GiST can do the same. Kind regards, Matthias van de Meent
