On Wed, Jul 19, 2017 at 7:57 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Peter Geoghegan <p...@bowt.ie> writes:
>> My argument for the importance of index bloat to the more general
>> bloat problem is simple: any bloat that accumulates, that cannot be
>> cleaned up, will probably accumulate until it impacts performance
>> quite noticeably.
> But that just begs the question: *does* it accumulate indefinitely, or
> does it eventually reach a more-or-less steady state?

Yes, I believe it does reach a more-or-less steady state. It saturates
when there is a lot of contention, because then you actually can reuse
the bloat. If it didn't saturate, and instead became arbitrarily bad,
then we'd surely have heard about that before now.

The bloat is not entirely wasted, because it actually prevents you
from getting even more bloat in that part of the keyspace.

> The traditional
> wisdom about btrees, for instance, is that no matter how full you pack
> them to start with, the steady state is going to involve something like
> 1/3rd free space.  You can call that bloat if you want, but it's not
> likely that you'll be able to reduce the number significantly without
> paying exorbitant costs.

For the purposes of this discussion, I'm mostly talking about
duplicates within a page on a unique index. If the keyspace owned by
an int4 unique index page only covers 20 distinct values, it will only
ever cover 20 distinct values, now and forever, despite the fact that
there is room for about 400 (a 90/10 split leaves you with 366 items +
1 high key).

I don't know if I should really even call this bloat, since the term
is so overloaded, although this is what other database systems call
index bloat. I like to think of it as "damage to the keyspace",
although that terminology seems unlikely to catch on.

Peter Geoghegan

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to