Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas
On Tue, 5 Jun 2012, Simon Riggs wrote: Sounds less good and we'd need reasonable proof it actually did anything useful without being dangerous. Doing an initial unlocked test speeds things up another 2.69 fold (on top of 3.55 for your patch) for me, with 1GB of shared buffers. That seems like it should be worthwhile. How do we go about getting reasonable proof that it is safe? That's enough for me. So is it planned to apply that patch for 9.2 ? Thanks, S * Sergey E. Koposov, PhD, Research Associate Institute of Astronomy, University of Cambridge Madingley road, CB3 0HA, Cambridge, UK Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas
Sergey Koposov kopo...@ast.cam.ac.uk writes: On Tue, 5 Jun 2012, Simon Riggs wrote: Sounds less good and we'd need reasonable proof it actually did anything useful without being dangerous. Doing an initial unlocked test speeds things up another 2.69 fold (on top of 3.55 for your patch) for me, with 1GB of shared buffers. That seems like it should be worthwhile. How do we go about getting reasonable proof that it is safe? That's enough for me. Say what? That's a performance result and proves not a damn thing about safety. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas
On 7 June 2012 14:56, Tom Lane t...@sss.pgh.pa.us wrote: Sergey Koposov kopo...@ast.cam.ac.uk writes: On Tue, 5 Jun 2012, Simon Riggs wrote: Sounds less good and we'd need reasonable proof it actually did anything useful without being dangerous. Doing an initial unlocked test speeds things up another 2.69 fold (on top of 3.55 for your patch) for me, with 1GB of shared buffers. That seems like it should be worthwhile. How do we go about getting reasonable proof that it is safe? That's enough for me. Say what? That's a performance result and proves not a damn thing about safety. Of course not. Based on the rationale explained in the code comments in the patch, it seems like a reasonable thing to me now. The argument was that since we hold AccessExclusiveLock on the relation, no other agent can be reading in new parts of the table into new buffers, so the only change to a buffer would be away from the dropping relation, in which case we wouldn't care. Which seems correct to me. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas
Simon Riggs si...@2ndquadrant.com writes: On 7 June 2012 14:56, Tom Lane t...@sss.pgh.pa.us wrote: Say what? That's a performance result and proves not a damn thing about safety. Of course not. Based on the rationale explained in the code comments in the patch, it seems like a reasonable thing to me now. The argument was that since we hold AccessExclusiveLock on the relation, no other agent can be reading in new parts of the table into new buffers, so the only change to a buffer would be away from the dropping relation, in which case we wouldn't care. Which seems correct to me. Oh, I must be confused about which patch we are talking about --- I thought this was in reference to some of the WIP ideas that were being thrown about with respect to using lock-free access primitives. Which patch are you proposing for commit now, exactly? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas
On 7 June 2012 17:34, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: On 7 June 2012 14:56, Tom Lane t...@sss.pgh.pa.us wrote: Say what? That's a performance result and proves not a damn thing about safety. Of course not. Based on the rationale explained in the code comments in the patch, it seems like a reasonable thing to me now. The argument was that since we hold AccessExclusiveLock on the relation, no other agent can be reading in new parts of the table into new buffers, so the only change to a buffer would be away from the dropping relation, in which case we wouldn't care. Which seems correct to me. Oh, I must be confused about which patch we are talking about --- I thought this was in reference to some of the WIP ideas that were being thrown about with respect to using lock-free access primitives. Which patch are you proposing for commit now, exactly? Both of these, as attached up thread. Simon's patch - dropallforks.v1.patch Jeff's patch - DropRelFileNodeBuffers_unlock_v1.patch (needs a little tidyup) -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas
Simon Riggs si...@2ndquadrant.com writes: On 7 June 2012 17:34, Tom Lane t...@sss.pgh.pa.us wrote: Oh, I must be confused about which patch we are talking about --- I thought this was in reference to some of the WIP ideas that were being thrown about with respect to using lock-free access primitives. Which patch are you proposing for commit now, exactly? Both of these, as attached up thread. Simon's patch - dropallforks.v1.patch Jeff's patch - DropRelFileNodeBuffers_unlock_v1.patch (needs a little tidyup) OK, will take a look. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas
Jeff Janes jeff.ja...@gmail.com writes: On 30 May 2012 12:10, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Also, I wonder if DropRelFileNodeBuffers() could scan the pool without grabbing the spinlocks on every buffer? It could do an unlocked test first, and only grab the spinlock on buffers that need to be dropped. Sounds less good and we'd need reasonable proof it actually did anything useful without being dangerous. Doing an initial unlocked test speeds things up another 2.69 fold (on top of 3.55 for your patch) for me, with 1GB of shared buffers. That seems like it should be worthwhile. With shared_buffers set to 1GB, I see about a 2X reduction in the total time to drop a simple table, ie create table zit(f1 text primary key); drop table zit; (This table definition is chosen to ensure there's an index and a toast table involved, so several scans of the buffer pool are needed.) The DROP goes from about 40ms to about 20ms on a fairly recent Xeon desktop. So I'm convinced this is a win. I extended the patch to also cover DropDatabaseBuffers, FlushRelationBuffers, and FlushDatabaseBuffers, which have got the exact same type of full-pool scan loop, and committed it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas
I wrote: Simon Riggs si...@2ndquadrant.com writes: Both of these, as attached up thread. Simon's patch - dropallforks.v1.patch Jeff's patch - DropRelFileNodeBuffers_unlock_v1.patch (needs a little tidyup) OK, will take a look. I didn't like dropallforks.v1.patch at all as presented, for several reasons: * Introducing an AllForks notation that only works in some contexts is a foot-gun of large caliber. This concern is not academic: you broke dropping of temp relations entirely, in the patch as presented, because for temp rels DropRelFileNodeBuffers would hand off to DropRelFileNodeLocalBuffers and the latter had not been taught about AllForks. * Since we have found out in this thread that the inner loop of DropRelFileNodeBuffers is performance-critical for the cases we're dealing with, it seems inappropriate to me to make its tests more complex. We want simpler, and we can have simpler given that the relation-drop case cares neither about fork nor block number. * The patch modified behavior of XLogDropRelation, which has not been shown to be performance-critical, and probably can't be because the hashtable it searches should never be all that large. It certainly doesn't matter to the speed of normal execution. I thought it would be a lot safer and probably a little bit quicker if we just split DropRelFileNodeBuffers into two routines, one for the specific-fork case and one for the all-forks case; and then the same for its main caller smgrdounlink. So I modified the patch along those lines and committed it. As committed, the smgrdounlinkfork case is actually dead code; it's never called from anywhere. I left it in place just in case we want it someday. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas
On 7 June 2012 22:54, Tom Lane t...@sss.pgh.pa.us wrote: I thought it would be a lot safer and probably a little bit quicker if we just split DropRelFileNodeBuffers into two routines, one for the specific-fork case and one for the all-forks case; and then the same for its main caller smgrdounlink. So I modified the patch along those lines and committed it. As committed, the smgrdounlinkfork case is actually dead code; it's never called from anywhere. I left it in place just in case we want it someday. That's fine. The first version of the patch did it exactly that way. I tried to double guess objections and so recoded it the way submitted. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas
On Thu, 7 Jun 2012, Tom Lane wrote: I extended the patch to also cover DropDatabaseBuffers, FlushRelationBuffers, and FlushDatabaseBuffers, which have got the exact same type of full-pool scan loop, and committed it. Thanks everybody for the patches/commits. Sergey * Sergey E. Koposov, PhD, Research Associate Institute of Astronomy, University of Cambridge Madingley road, CB3 0HA, Cambridge, UK Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas
On 3 June 2012 19:07, Jeff Janes jeff.ja...@gmail.com wrote: On Thu, May 31, 2012 at 5:04 AM, Simon Riggs si...@2ndquadrant.com wrote: On 30 May 2012 12:10, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Hmm, we do this in smgrDoPendingDeletes: for (i = 0; i = MAX_FORKNUM; i++) { smgrdounlink(srel, i, false); } So we drop the buffers for each relation fork separately, which means that we scan the buffer pool four times. Relation forks in 8.4 introduced that issue, and 9.1 made it worse by adding another fork for unlogged tables. With some refactoring, we could scan the buffer pool just once. That would help a lot. That struck me as a safe and easy optimisation. This was a problem I'd been trying to optimise for 9.2, so I've written a patch that appears simple and clean enough to be applied directly. By directly do you mean before the fork/commit fest begins? Also, I wonder if DropRelFileNodeBuffers() could scan the pool without grabbing the spinlocks on every buffer? It could do an unlocked test first, and only grab the spinlock on buffers that need to be dropped. Sounds less good and we'd need reasonable proof it actually did anything useful without being dangerous. Doing an initial unlocked test speeds things up another 2.69 fold (on top of 3.55 for your patch) for me, with 1GB of shared buffers. That seems like it should be worthwhile. How do we go about getting reasonable proof that it is safe? That's enough for me. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas
On Thu, May 31, 2012 at 5:04 AM, Simon Riggs si...@2ndquadrant.com wrote: On 30 May 2012 12:10, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Hmm, we do this in smgrDoPendingDeletes: for (i = 0; i = MAX_FORKNUM; i++) { smgrdounlink(srel, i, false); } So we drop the buffers for each relation fork separately, which means that we scan the buffer pool four times. Relation forks in 8.4 introduced that issue, and 9.1 made it worse by adding another fork for unlogged tables. With some refactoring, we could scan the buffer pool just once. That would help a lot. That struck me as a safe and easy optimisation. This was a problem I'd been trying to optimise for 9.2, so I've written a patch that appears simple and clean enough to be applied directly. By directly do you mean before the fork/commit fest begins? Also, I wonder if DropRelFileNodeBuffers() could scan the pool without grabbing the spinlocks on every buffer? It could do an unlocked test first, and only grab the spinlock on buffers that need to be dropped. Sounds less good and we'd need reasonable proof it actually did anything useful without being dangerous. Doing an initial unlocked test speeds things up another 2.69 fold (on top of 3.55 for your patch) for me, with 1GB of shared buffers. That seems like it should be worthwhile. How do we go about getting reasonable proof that it is safe? Thanks, Jeff DropRelFileNodeBuffers_unlock_v1.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas
On Fri, 1 Jun 2012, Simon Riggs wrote: Why do you have 10,000 tables and why is it important to drop them so quickly? 1 tables are there, because that's the number of partitions. And I'm dropping them at the moment, because I'm doing testing. So it won't be really crucial for production. But I still thought it was worth reporting. Especially when the table dropping took .5 a sec. The problem is that when I set up the shared_buffers say to 48G, the timings of the tables rise significantly again. If its that important, why not run the drop in parallel sessions? Yes, before there was a strong reason to do that, now the timings are more manageable, but maybe I'll implement that. Cheers, S * Sergey E. Koposov, PhD, Research Associate Institute of Astronomy, University of Cambridge Madingley road, CB3 0HA, Cambridge, UK Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas
On 1 June 2012 12:34, Sergey Koposov kopo...@ast.cam.ac.uk wrote: On Fri, 1 Jun 2012, Simon Riggs wrote: Why do you have 10,000 tables and why is it important to drop them so quickly? 1 tables are there, because that's the number of partitions. And I'm dropping them at the moment, because I'm doing testing. So it won't be really crucial for production. But I still thought it was worth reporting. Especially when the table dropping took .5 a sec. Ah, partitions. That explains the long drop time. Hopefully people don't need to do that too frequently. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas
On 30 May 2012 12:10, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Hmm, we do this in smgrDoPendingDeletes: for (i = 0; i = MAX_FORKNUM; i++) { smgrdounlink(srel, i, false); } So we drop the buffers for each relation fork separately, which means that we scan the buffer pool four times. Relation forks in 8.4 introduced that issue, and 9.1 made it worse by adding another fork for unlogged tables. With some refactoring, we could scan the buffer pool just once. That would help a lot. That struck me as a safe and easy optimisation. This was a problem I'd been trying to optimise for 9.2, so I've written a patch that appears simple and clean enough to be applied directly. Also, I wonder if DropRelFileNodeBuffers() could scan the pool without grabbing the spinlocks on every buffer? It could do an unlocked test first, and only grab the spinlock on buffers that need to be dropped. Sounds less good and we'd need reasonable proof it actually did anything useful without being dangerous. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services dropallforks.v1.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas
On Thu, 31 May 2012, Simon Riggs wrote: That struck me as a safe and easy optimisation. This was a problem I'd been trying to optimise for 9.2, so I've written a patch that appears simple and clean enough to be applied directly. Thanks! The patch indeed improved the timings, The dropping of 100 tables in a single commit before the patch took ~ 50 seconds, now it takes ~ 5 sec (it would be nice to reduce it further though, because the dropping of 1 tables still takes ~10 min). Cheers, S * Sergey E. Koposov, PhD, Research Associate Institute of Astronomy, University of Cambridge Madingley road, CB3 0HA, Cambridge, UK Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas
On Thu, May 31, 2012 at 11:09 AM, Sergey Koposov kopo...@ast.cam.ac.uk wrote: On Thu, 31 May 2012, Simon Riggs wrote: That struck me as a safe and easy optimisation. This was a problem I'd been trying to optimise for 9.2, so I've written a patch that appears simple and clean enough to be applied directly. Thanks! The patch indeed improved the timings, The dropping of 100 tables in a single commit before the patch took ~ 50 seconds, now it takes ~ 5 sec (it would be nice to reduce it further though, because the dropping of 1 tables still takes ~10 min). I'm surprised it helped that much. I thought the most it could theoretically could help would be a factor of 4. I tried the initially unlocked test, and for me it cut the time by a factor of 3. But I only have a 1GB shared_buffers at the max, I would expect it help more at larger sizes because there is a constant overhead not related to scanning the shared buffers which gets diluted out the larger shared_buffers is. I added to that a drop-all very similar to what Simon posted and got another factor of 3. But, if you can do this during a maintenance window, then just restarting with a much smaller shared_buffers should give you a much larger speed up than either or both of these. If I can extrapolate up to 10G from my current curve, setting it to 8MB instead would give a speed up of nearly 400 fold. Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas
On 31 May 2012 19:09, Sergey Koposov kopo...@ast.cam.ac.uk wrote: On Thu, 31 May 2012, Simon Riggs wrote: That struck me as a safe and easy optimisation. This was a problem I'd been trying to optimise for 9.2, so I've written a patch that appears simple and clean enough to be applied directly. Thanks! The patch indeed improved the timings, The dropping of 100 tables in a single commit before the patch took ~ 50 seconds, now it takes ~ 5 sec Thanks for the timing. (it would be nice to reduce it further though, because the dropping of 1 tables still takes ~10 min). Why do you have 10,000 tables and why is it important to drop them so quickly? If its that important, why not run the drop in parallel sessions? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas
On 30.05.2012 03:40, Sergey Koposov wrote: I was running some tests on PG9.2beta where I'm creating and dropping large number of tables (~ 2). And I noticed that table dropping was extremely slow -- e.g. like half a second per table. ... I also stopped PG with gdb a few times and it was always at this backtrace: (gdb) bt #0 tas (lock=0x7fa4e3007538 \001) at ../../../../src/include/storage/s_lock.h:218 #1 0x006e6956 in DropRelFileNodeBuffers (rnode=..., forkNum=VISIBILITYMAP_FORKNUM, firstDelBlock=0) at bufmgr.c:2062 #2 0x0070c014 in smgrdounlink (reln=0x1618210, forknum=VISIBILITYMAP_FORKNUM, isRedo=0 '\000') at smgr.c:354 #3 0x0051ecf6 in smgrDoPendingDeletes (isCommit=1 '\001') at storage.c:364 Hmm, we do this in smgrDoPendingDeletes: for (i = 0; i = MAX_FORKNUM; i++) { smgrdounlink(srel, i, false); } So we drop the buffers for each relation fork separately, which means that we scan the buffer pool four times. Relation forks in 8.4 introduced that issue, and 9.1 made it worse by adding another fork for unlogged tables. With some refactoring, we could scan the buffer pool just once. That would help a lot. Also, I wonder if DropRelFileNodeBuffers() could scan the pool without grabbing the spinlocks on every buffer? It could do an unlocked test first, and only grab the spinlock on buffers that need to be dropped. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas
On Wed, May 30, 2012 at 7:10 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: So we drop the buffers for each relation fork separately, which means that we scan the buffer pool four times. Relation forks in 8.4 introduced that issue, and 9.1 made it worse by adding another fork for unlogged tables. With some refactoring, we could scan the buffer pool just once. That would help a lot. +1. Also, I wonder if DropRelFileNodeBuffers() could scan the pool without grabbing the spinlocks on every buffer? It could do an unlocked test first, and only grab the spinlock on buffers that need to be dropped. I think it would be possible for the unlocked test to indicate that the buffer should be dropped when it really ought not to be, because someone else might be in the middle of changing the buffer tag, and that's not atomic. So you'd have to recheck after taking the spinlock. However, I don't think it's possible for the unlocked test to report a false negative, because we've already taken AccessExclusiveLock on the relation, which had better be enough to guarantee that nobody's pulling in any more buffers from that relation (if it doesn't guarantee that, the current code is already broken). Acquiring a heavyweight lock also interposes a full memory barrier, which should eliminate any risks due to memory-ordering effects. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas
On Wed, May 30, 2012 at 4:10 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 30.05.2012 03:40, Sergey Koposov wrote: I was running some tests on PG9.2beta where I'm creating and dropping large number of tables (~ 2). And I noticed that table dropping was extremely slow -- e.g. like half a second per table. ... I also stopped PG with gdb a few times and it was always at this backtrace: (gdb) bt #0 tas (lock=0x7fa4e3007538 \001) at ../../../../src/include/storage/s_lock.h:218 #1 0x006e6956 in DropRelFileNodeBuffers (rnode=..., forkNum=VISIBILITYMAP_FORKNUM, firstDelBlock=0) at bufmgr.c:2062 #2 0x0070c014 in smgrdounlink (reln=0x1618210, forknum=VISIBILITYMAP_FORKNUM, isRedo=0 '\000') at smgr.c:354 #3 0x0051ecf6 in smgrDoPendingDeletes (isCommit=1 '\001') at storage.c:364 Hmm, we do this in smgrDoPendingDeletes: for (i = 0; i = MAX_FORKNUM; i++) { smgrdounlink(srel, i, false); } So we drop the buffers for each relation fork separately, which means that we scan the buffer pool four times. Relation forks in 8.4 introduced that issue, and 9.1 made it worse by adding another fork for unlogged tables. With some refactoring, we could scan the buffer pool just once. That would help a lot. If someone drops many tables in the same transaction, could it be made to stuff them into a hash table and then drop all of them with one cycle around the buffer pool? Or is the use case for that too narrow a use case to be worthwhile? Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas
On Wed, May 30, 2012 at 2:10 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Also, I wonder if DropRelFileNodeBuffers() could scan the pool without grabbing the spinlocks on every buffer? It could do an unlocked test first, and only grab the spinlock on buffers that need to be dropped. The scanning of buffers seemed awfully slow to me. Doing the math it ends up being somewhere around 120ns per buffer, about consistent with a full cache miss. It looks like the spinlock tas implementation (lock xchgb) is preventing prefetching. This suspicion is corroborated by the following comment in Linux kernels per_cpu.h: /* * xchg is implemented using cmpxchg without a lock prefix. xchg is * expensive due to the implied lock prefix. The processor cannot prefetch * cachelines if xchg is used. */ I'm not sure, but doing an unlocked test first might also be useful in triggering the prefetchers. The CPU should be doing a lot better than the current ~4.3GB/s when scanning buffer descriptors. Of course not scanning at all or doing less scans at the expense of more work in the inner loop would be even better. Regards, Ants Aasma -- Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers