On Mon, Apr 28, 2014 at 6:02 AM, Robert Haas <robertmh...@gmail.com> wrote:
> Also true.  But the problem is that it is very rarely, if ever, the
> case that all pages are *equally* hot.  On a pgbench workload, for
> example, I'm very confident that while there's not really any cold
> data, the btree roots and visibility map pages are a whole lot hotter
> than a randomly-selected heap page.  If you evict a heap page, you're
> going to need it back pretty quick, because it won't be long until the
> random-number generator again chooses a key that happens to be located
> on that page.  But if you evict the root of the btree index, you're
> going to need it back *immediately*, because the very next query, no
> matter what key it's looking for, is going to need that page.  I'm
> pretty sure that's a significant difference.

I emphasized leaf pages because even with master the root and inner
pages are still going to be so hot that they stay in cache constantly,
at least with pgbench's use of a uniform distribution. You'd
have to have an absolutely enormous scale factor before this might not
be the case. As such, I'm not all that worried about inner pages when
performing these simple benchmarks. However, in the case of the
pgbench_accounts table, each of the B-Tree leaf pages, which comprise
about 99.5% of the index's pages, is still going to be accessed about
six times more frequently than each heap page. That's a small enough
difference for it to easily go unappreciated, and yet a big enough
difference for it to hurt a lot.
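
A rough sketch of where a factor of about six can come from under a
uniform key distribution: a page is touched in proportion to the number
of keys it holds. The per-page figures below are illustrative
assumptions (roughly 61 pgbench_accounts rows per 8 kB heap page and
roughly 366 entries per primary key leaf page), not numbers measured in
this thread, and the function names are made up for the example.

```python
# Illustrative figures only (assumptions, not taken from this thread):
# roughly 61 pgbench_accounts rows fit on an 8 kB heap page, and roughly
# 366 index entries fit on a B-Tree leaf page of its primary key index.
ROWS_PER_HEAP_PAGE = 61
ENTRIES_PER_LEAF_PAGE = 366


def page_access_probability(keys_on_page: int, total_keys: int) -> float:
    """Chance that one uniformly chosen key lands on a given page."""
    return keys_on_page / total_keys


def leaf_to_heap_ratio(total_keys: int) -> float:
    """How much more often one leaf page is touched than one heap page."""
    leaf = page_access_probability(ENTRIES_PER_LEAF_PAGE, total_keys)
    heap = page_access_probability(ROWS_PER_HEAP_PAGE, total_keys)
    return leaf / heap


if __name__ == "__main__":
    # pgbench creates 100,000 accounts per unit of scale factor;
    # a scale factor of 100 is assumed here purely for illustration.
    print(f"{leaf_to_heap_ratio(100 * 100_000):.1f}")
```

The total number of keys cancels out, so under these assumptions the
ratio depends only on how many keys each kind of page holds, whatever
the scale factor.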


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
