On 30 August 2012, 17:53, Robert Haas wrote:
> On Fri, Aug 24, 2012 at 6:36 PM, Tomas Vondra <t...@fuzzy.cz> wrote:
>> attached is a patch that improves performance when dropping multiple
>> tables within a transaction. Instead of scanning the shared buffers for
>> each table separately, the patch removes this and evicts all the tables
>> in a single pass through shared buffers.
>>
>> Our system creates a lot of "working tables" (even 100.000) and we need
>> to perform garbage collection (dropping obsolete tables) regularly. This
>> often took ~ 1 hour, because we're using big AWS instances with lots of
>> RAM (which tends to be slower than RAM on bare hw). After applying this
>> patch and dropping tables in groups of 100, the gc runs in less than 4
>> minutes (i.e. a 15x speed-up).
>>
>> This is not likely to improve usual performance, but for systems like
>> ours, this patch is a significant improvement.
>
> Seems pretty reasonable. But instead of duplicating so much code,
> couldn't we find a way to replace DropRelFileNodeAllBuffers with
> DropRelFileNodeAllBuffersList? Surely anyone who was planning to call
> the first one could instead call the second one with a count of one
> and a pointer to the address of the data they were planning to pass.
> I'd probably swap the order of arguments to
> DropRelFileNodeAllBuffersList as well. We could do something similar
> with smgrdounlink/smgrdounlinkall so that, again, only one copy of the
> code is needed.

Yeah, I was thinking about that too, but I wasn't sure which option was
the best choice, so I sent the raw patch. OTOH these functions are called
from a very limited number of places, so a refactoring like this seems fine.

Tomas
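
To make the suggested refactoring concrete, here is a minimal sketch of the idea, using a toy buffer pool rather than the real bufmgr code. Every name in it (ToyRelFileNode, ToyBuffer, toy_buffers, toy_drop_buffers_list, toy_drop_buffers) is hypothetical and only stands in for the PostgreSQL structures. The argument order and return value are assumptions as well, and locking is left out entirely. It shows the list variant evicting all listed relations in one pass over the pool, and the single-relation variant reduced to a call with a count of one.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins, NOT the real PostgreSQL definitions. */
typedef struct ToyRelFileNode
{
    unsigned    spc;            /* tablespace */
    unsigned    db;             /* database */
    unsigned    rel;            /* relation */
} ToyRelFileNode;

typedef struct ToyBuffer
{
    ToyRelFileNode tag;         /* relation this buffer belongs to */
    bool        valid;          /* does the buffer hold a page? */
} ToyBuffer;

#define TOY_NBUFFERS 8

/* Hypothetical buffer pool; zero-initialized, so all buffers start invalid. */
ToyBuffer   toy_buffers[TOY_NBUFFERS];

bool
toy_node_equals(const ToyRelFileNode *a, const ToyRelFileNode *b)
{
    return a->spc == b->spc && a->db == b->db && a->rel == b->rel;
}

/*
 * Invalidate the buffers of every relation in rnodes[0..nrels-1] in a
 * single pass over the pool. Returns the number of buffers invalidated.
 */
int
toy_drop_buffers_list(const ToyRelFileNode *rnodes, int nrels)
{
    int         dropped = 0;

    assert(nrels >= 0);

    for (int i = 0; i < TOY_NBUFFERS; i++)
    {
        if (!toy_buffers[i].valid)
            continue;

        /* Linear search of the list; fine for small groups of relations. */
        for (int j = 0; j < nrels; j++)
        {
            if (toy_node_equals(&toy_buffers[i].tag, &rnodes[j]))
            {
                toy_buffers[i].valid = false;
                dropped++;
                break;
            }
        }
    }
    return dropped;
}

/* Single-relation variant: the list variant with a count of one. */
int
toy_drop_buffers(ToyRelFileNode rnode)
{
    return toy_drop_buffers_list(&rnode, 1);
}
```

With this shape the pool is scanned once per call, however many relations are passed, so dropping tables in groups costs one scan per group instead of one scan per table. The inner loop is a plain linear search, which is an assumption of this sketch and may not suit very large groups.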