Re: [HACKERS] Indirect indexes

Robert Haas Wed, 19 Oct 2016 05:53:22 -0700

On Tue, Oct 18, 2016 at 2:28 PM, Alvaro Herrera
<[email protected]> wrote:
> I propose we introduce the concept of "indirect indexes".  I have a toy
> implementation and before I go further with it, I'd like this assembly's
> input on the general direction.
>
> Indirect indexes are similar to regular indexes, except that instead of
> carrying a heap TID as payload, they carry the value of the table's
> primary key.  Because this is laid out on top of existing index support
> code, values indexed by the PK can only be six bytes long (the length of
> ItemPointerData); in other words, 281,474,976,710,656 rows are
> supported, which should be sufficient for most use cases.[1]


So, I think that this is a really promising direction, but also that
you should try very hard to get out from under this 6-byte PK
limitation.  That seems really ugly, and in practice it probably means
your PK is going to be limited to int4, which is kind of sad
since it leaves people using int8 or text PKs out in the cold.  I
believe Claudio Freire is on to something when he suggests storing the
PK in the index tuple; one could try to skip storing the TID, or
always store it as all-zeroes.  Simon objected that putting the PK
into the index tuple would disable HOT, but I don't think that's a
valid objection.  The whole point of an indirect index is that it
doesn't disable HOT, and where within the index page you stick the PK
value doesn't have any impact on whether that's safe.
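
To make the size limit concrete, here is a rough sketch in Python.  It
is purely illustrative and not PostgreSQL code: the names are made up,
and it only models non-negative integer keys squeezed into the 6 bytes
of an ItemPointerData.

```python
# Illustrative only: the proposal reuses the 6-byte ItemPointerData slot,
# so that is all the room there is for a PK value.
PAYLOAD_BYTES = 6
PAYLOAD_BITS = PAYLOAD_BYTES * 8


def fits_in_payload(value: int) -> bool:
    """Return True if a non-negative integer key fits in the 6-byte payload."""
    return 0 <= value < 2 ** PAYLOAD_BITS


INT4_MAX = 2 ** 31 - 1  # largest int4 value
INT8_MAX = 2 ** 63 - 1  # largest int8 value

print(2 ** PAYLOAD_BITS)          # 281474976710656 distinct values
print(fits_in_payload(INT4_MAX))  # True
print(fits_in_payload(INT8_MAX))  # False
```

An int4 key always fits, an int8 key fits only while its values stay
below 2^48, and a text key has no natural place in this scheme at all.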

The VACUUM problems seem fairly serious.  It's true that these indexes
will be less subject to bloat, because they only need updating when
the PK or their own indexed columns change, not when columns covered
only by other indexes change.  On the other hand, there's nothing to
prevent a PK from being
recycled for an unrelated tuple.  We can guarantee that a TID won't be
recycled until all index references to the TID are gone, but there's
no such guarantee for a PK.  AFAICT, that would mean that an indirect
index would have to be viewed as unreliable: after looking up the PK,
you'd *always* have to recheck that it actually matched the index
qual.
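
A toy model of the problem, again in Python and again only a sketch
under my own assumptions: the dictionaries, function names, and the
"city" column are hypothetical stand-ins, not anything in the patch.
It shows a stale index entry pointing at a PK that has since been
reused, and how the recheck filters it out.

```python
# Toy model, not PostgreSQL code.
heap = {}            # pk -> row (dict of column values)
indirect_index = {}  # indexed value -> set of PKs recorded for it


def insert(pk, row):
    """Store the row and add an index entry for its 'city' value."""
    heap[pk] = row
    indirect_index.setdefault(row["city"], set()).add(pk)


def delete_without_index_cleanup(pk):
    """Remove the row but leave its index entry, as before VACUUM runs."""
    del heap[pk]


def lookup(city, recheck=True):
    """Follow index -> PK -> row, optionally rechecking the index qual."""
    result = []
    for pk in sorted(indirect_index.get(city, ())):
        row = heap.get(pk)
        if row is None:
            continue  # the PK no longer exists
        if recheck and row["city"] != city:
            continue  # the PK was reused by an unrelated row
        result.append((pk, row))
    return result


insert(1, {"city": "Oslo"})
delete_without_index_cleanup(1)
insert(1, {"city": "Lima"})  # same PK, unrelated tuple

print(lookup("Oslo", recheck=False))  # [(1, {'city': 'Lima'})], a wrong match
print(lookup("Oslo", recheck=True))   # []
```

Without the recheck, the stale "Oslo" entry returns a row that never
had that value.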

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
