I've created a new Wiki page that describes a scheme for normalizing
internal page items within B-Tree indexes, and the many optimizations
that this can enable:


Key normalization means creating a representation for internal page
items that can always be compared with a plain memcmp(), regardless of
the details of the underlying datatypes.
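
To make the idea concrete, here is a minimal sketch in Python. It is not
taken from the wiki page or from any PostgreSQL code. The function names
and the (int4, text) key layout are hypothetical, and it assumes that
text sorts in plain code point order (as with the C collation) and never
contains a NUL byte. Python compares bytes objects byte by byte as
unsigned values, which is the same ordering that memcmp() gives.

```python
import struct

INT32_BIAS = 2**31  # shifts the signed range [-2^31, 2^31) onto [0, 2^32)


def normalize_int32(value: int) -> bytes:
    """Encode a signed 32-bit integer so that byte order matches numeric order.

    Adding the bias flips the sign bit, and big-endian packing puts the
    most significant byte first.
    """
    return struct.pack(">I", value + INT32_BIAS)


def normalize_text(value: str) -> bytes:
    """Encode text so that byte order matches code point order.

    Assumption: no NUL bytes in the input. The trailing NUL terminator
    makes a string sort before any longer string that it is a prefix of,
    even when more key columns follow.
    """
    encoded = value.encode("utf-8")
    if b"\x00" in encoded:
        raise ValueError("text must not contain NUL bytes")
    return encoded + b"\x00"


def normalize_key(number: int, text: str) -> bytes:
    """Build one opaque key for a hypothetical (int4, text) index."""
    return normalize_int32(number) + normalize_text(text)


rows = [(5, "b"), (-3, "z"), (5, "a"), (-3, "zz"), (0, "")]

# Sorting by the normalized bytes needs no knowledge of the column types.
by_normalized_key = sorted(rows, key=lambda row: normalize_key(*row))
```

Once keys are in this form, the comparison code does not need to know
how many attributes a key has or what their types are. A real scheme
would also have to deal with NULLs, DESC ordering, and collations, none
of which this sketch attempts.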

My intent in creating this wiki page is to document these techniques
centrally, along with the problems that they may solve, and to show
how they're all interrelated. Confusion about how one optimization
enables another may be holding back patch authors.

It might appear excessive to talk about several different techniques
in one place, but that seemed like the best way to me, because there
are subtle dependencies. If most of the optimizations are pursued as a
project all at once (say, key normalization, suffix truncation, and
treating heap TID as a unique-ifier), that may actually be more likely
to succeed than a project to do just one. The techniques don't appear
to be related at first, but they really are.

I'm not planning on working on key normalization or any of the other
techniques as projects myself, but FWIW I have produced minimal
prototypes of a few of the techniques over the past several years,
just to verify my understanding. My theories on this topic seem worth
writing down. I'm happy to explain or clarify any aspect of what I
describe, and to revise the design based on feedback. It is still very
much a work in progress.

Peter Geoghegan

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)