> The point is that a big seqscan (either VACUUM or a plain table scan)
> hits a lot of pages, and thereby tends to fill your cache with pages
> that aren't actually likely to get hit again soon, perhaps pushing out
> pages that will be needed again soon.  This happens at both the
> shared-buffer and kernel-disk-cache levels of caching.

OK, I had thought (wrongly it seems, as usual, but this is how we learn!)
that a plain VACUUM did not incur a read of all pages.  I still don't
understand *why* it does, but I'll take your word for it.

Clearly if it distorts the 'normal' balance of pages in any caches, PG's or
the OS's, that's a _bad thing_.  I am currently in the nice position of
having a DB that (just about) fits in RAM, so I pretty much don't care about
read performance, but I will have to soon as it grows beyond 3GB :-(  These
conversations are invaluable in planning for that dread time...

> It would be good to find some way to prevent big seqscans from
> populating cache, but I don't know of any portable way to tell the OS
> that we don't want it to cache a page we are reading.

Quite.  The only natural way would be to read those pages through some
special device, but then you might as well do raw disk access from the
get-go.  Portability vs. Performance, the age old quandary.  FWIW I and many
others stand back in pure amazement at the sheer _quality_ of PostgreSQL.



---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
    (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

Reply via email to