Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-06-07 Thread Sergey Koposov

On Tue, 5 Jun 2012, Simon Riggs wrote:


Sounds less good and we'd need reasonable proof it actually did
anything useful without being dangerous.


Doing an initial unlocked test speeds things up another 2.69 fold (on
top of 3.55 for your patch) for me, with 1GB of shared buffers.  That
seems like it should be worthwhile.

How do we go about getting reasonable proof that it is safe?


That's enough for me.


So is it planned to apply that patch for 9.2 ?

Thanks,
S

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-06-07 Thread Tom Lane
Sergey Koposov kopo...@ast.cam.ac.uk writes:
 On Tue, 5 Jun 2012, Simon Riggs wrote:
 Sounds less good and we'd need reasonable proof it actually did
 anything useful without being dangerous.

 Doing an initial unlocked test speeds things up another 2.69 fold (on
 top of 3.55 for your patch) for me, with 1GB of shared buffers.  That
 seems like it should be worthwhile.

 How do we go about getting reasonable proof that it is safe?

 That's enough for me.

Say what?  That's a performance result and proves not a damn thing about
safety.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-06-07 Thread Simon Riggs
On 7 June 2012 14:56, Tom Lane t...@sss.pgh.pa.us wrote:
 Sergey Koposov kopo...@ast.cam.ac.uk writes:
 On Tue, 5 Jun 2012, Simon Riggs wrote:
 Sounds less good and we'd need reasonable proof it actually did
 anything useful without being dangerous.

 Doing an initial unlocked test speeds things up another 2.69 fold (on
 top of 3.55 for your patch) for me, with 1GB of shared buffers.  That
 seems like it should be worthwhile.

 How do we go about getting reasonable proof that it is safe?

 That's enough for me.

 Say what?  That's a performance result and proves not a damn thing about
 safety.

Of course not.

Based on the rationale explained in the code comments in the patch, it
seems like a reasonable thing to me now.

The argument was that since we hold AccessExclusiveLock on the
relation, no other agent can be reading in new parts of the table into
new buffers, so the only change to a buffer would be away from the
dropping relation, in which case we wouldn't care. Which seems correct
to me.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-06-07 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On 7 June 2012 14:56, Tom Lane t...@sss.pgh.pa.us wrote:
 Say what?  That's a performance result and proves not a damn thing about
 safety.

 Of course not.

 Based on the rationale explained in the code comments in the patch, it
 seems like a reasonable thing to me now.

 The argument was that since we hold AccessExclusiveLock on the
 relation, no other agent can be reading in new parts of the table into
 new buffers, so the only change to a buffer would be away from the
 dropping relation, in which case we wouldn't care. Which seems correct
 to me.

Oh, I must be confused about which patch we are talking about --- I
thought this was in reference to some of the WIP ideas that were being
thrown about with respect to using lock-free access primitives.  Which
patch are you proposing for commit now, exactly?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-06-07 Thread Simon Riggs
On 7 June 2012 17:34, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 On 7 June 2012 14:56, Tom Lane t...@sss.pgh.pa.us wrote:
 Say what?  That's a performance result and proves not a damn thing about
 safety.

 Of course not.

 Based on the rationale explained in the code comments in the patch, it
 seems like a reasonable thing to me now.

 The argument was that since we hold AccessExclusiveLock on the
 relation, no other agent can be reading in new parts of the table into
 new buffers, so the only change to a buffer would be away from the
 dropping relation, in which case we wouldn't care. Which seems correct
 to me.

 Oh, I must be confused about which patch we are talking about --- I
 thought this was in reference to some of the WIP ideas that were being
 thrown about with respect to using lock-free access primitives.  Which
 patch are you proposing for commit now, exactly?

Both of these, as attached up thread.

Simon's patch - dropallforks.v1.patch
Jeff's patch - DropRelFileNodeBuffers_unlock_v1.patch
(needs a little tidyup)

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-06-07 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On 7 June 2012 17:34, Tom Lane t...@sss.pgh.pa.us wrote:
 Oh, I must be confused about which patch we are talking about --- I
 thought this was in reference to some of the WIP ideas that were being
 thrown about with respect to using lock-free access primitives.  Which
 patch are you proposing for commit now, exactly?

 Both of these, as attached up thread.

 Simon's patch - dropallforks.v1.patch
 Jeff's patch - DropRelFileNodeBuffers_unlock_v1.patch
 (needs a little tidyup)

OK, will take a look.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-06-07 Thread Tom Lane
Jeff Janes jeff.ja...@gmail.com writes:
 On 30 May 2012 12:10, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 Also, I wonder if DropRelFileNodeBuffers() could scan the pool without
 grabbing the spinlocks on every buffer? It could do an unlocked test first,
 and only grab the spinlock on buffers that need to be dropped.

 Sounds less good and we'd need reasonable proof it actually did
 anything useful without being dangerous.

 Doing an initial unlocked test speeds things up another 2.69 fold (on
 top of 3.55 for your patch) for me, with 1GB of shared buffers.  That
 seems like it should be worthwhile.

With shared_buffers set to 1GB, I see about a 2X reduction in the total
time to drop a simple table, ie
create table zit(f1 text primary key);
drop table zit;
(This table definition is chosen to ensure there's an index and a toast
table involved, so several scans of the buffer pool are needed.)  The
DROP goes from about 40ms to about 20ms on a fairly recent Xeon desktop.
So I'm convinced this is a win.

I extended the patch to also cover DropDatabaseBuffers,
FlushRelationBuffers, and FlushDatabaseBuffers, which have got the exact
same type of full-pool scan loop, and committed it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-06-07 Thread Tom Lane
I wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 Both of these, as attached up thread.
 Simon's patch - dropallforks.v1.patch
 Jeff's patch - DropRelFileNodeBuffers_unlock_v1.patch
 (needs a little tidyup)

 OK, will take a look.

I didn't like dropallforks.v1.patch at all as presented, for several
reasons:

* Introducing an AllForks notation that only works in some contexts is
a foot-gun of large caliber.  This concern is not academic: you broke
dropping of temp relations entirely, in the patch as presented, because
for temp rels DropRelFileNodeBuffers would hand off to
DropRelFileNodeLocalBuffers and the latter had not been taught about
AllForks.

* Since we have found out in this thread that the inner loop of
DropRelFileNodeBuffers is performance-critical for the cases we're
dealing with, it seems inappropriate to me to make its tests more
complex.  We want simpler, and we can have simpler given that the
relation-drop case cares neither about fork nor block number.

* The patch modified behavior of XLogDropRelation, which has not been
shown to be performance-critical, and probably can't be because the
hashtable it searches should never be all that large.  It certainly
doesn't matter to the speed of normal execution.

I thought it would be a lot safer and probably a little bit quicker
if we just split DropRelFileNodeBuffers into two routines, one for
the specific-fork case and one for the all-forks case; and then the
same for its main caller smgrdounlink.  So I modified the patch along
those lines and committed it.

As committed, the smgrdounlinkfork case is actually dead code; it's
never called from anywhere.  I left it in place just in case we want
it someday.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-06-07 Thread Simon Riggs
On 7 June 2012 22:54, Tom Lane t...@sss.pgh.pa.us wrote:

 I thought it would be a lot safer and probably a little bit quicker
 if we just split DropRelFileNodeBuffers into two routines, one for
 the specific-fork case and one for the all-forks case; and then the
 same for its main caller smgrdounlink.  So I modified the patch along
 those lines and committed it.

 As committed, the smgrdounlinkfork case is actually dead code; it's
 never called from anywhere.  I left it in place just in case we want
 it someday.

That's fine. The first version of the patch did it exactly that way.

I tried to double guess objections and so recoded it the way submitted.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-06-07 Thread Sergey Koposov

On Thu, 7 Jun 2012, Tom Lane wrote:

I extended the patch to also cover DropDatabaseBuffers,
FlushRelationBuffers, and FlushDatabaseBuffers, which have got the exact
same type of full-pool scan loop, and committed it.


Thanks everybody for the patches/commits.

Sergey

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-06-05 Thread Simon Riggs
On 3 June 2012 19:07, Jeff Janes jeff.ja...@gmail.com wrote:
 On Thu, May 31, 2012 at 5:04 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 30 May 2012 12:10, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:

 Hmm, we do this in smgrDoPendingDeletes:

 for (i = 0; i = MAX_FORKNUM; i++)
 {
        smgrdounlink(srel, i, false);
 }

 So we drop the buffers for each relation fork separately, which means that
 we scan the buffer pool four times. Relation forks in 8.4 introduced that
 issue, and 9.1 made it worse by adding another fork for unlogged tables.
 With some refactoring, we could scan the buffer pool just once. That would
 help a lot.

 That struck me as a safe and easy optimisation. This was a problem I'd
 been trying to optimise for 9.2, so I've written a patch that appears
 simple and clean enough to be applied directly.

 By directly do you mean before the fork/commit fest begins?


 Also, I wonder if DropRelFileNodeBuffers() could scan the pool without
 grabbing the spinlocks on every buffer? It could do an unlocked test first,
 and only grab the spinlock on buffers that need to be dropped.

 Sounds less good and we'd need reasonable proof it actually did
 anything useful without being dangerous.

 Doing an initial unlocked test speeds things up another 2.69 fold (on
 top of 3.55 for your patch) for me, with 1GB of shared buffers.  That
 seems like it should be worthwhile.

 How do we go about getting reasonable proof that it is safe?

That's enough for me.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-06-03 Thread Jeff Janes
On Thu, May 31, 2012 at 5:04 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 30 May 2012 12:10, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:

 Hmm, we do this in smgrDoPendingDeletes:

 for (i = 0; i = MAX_FORKNUM; i++)
 {
        smgrdounlink(srel, i, false);
 }

 So we drop the buffers for each relation fork separately, which means that
 we scan the buffer pool four times. Relation forks in 8.4 introduced that
 issue, and 9.1 made it worse by adding another fork for unlogged tables.
 With some refactoring, we could scan the buffer pool just once. That would
 help a lot.

 That struck me as a safe and easy optimisation. This was a problem I'd
 been trying to optimise for 9.2, so I've written a patch that appears
 simple and clean enough to be applied directly.

By directly do you mean before the fork/commit fest begins?


 Also, I wonder if DropRelFileNodeBuffers() could scan the pool without
 grabbing the spinlocks on every buffer? It could do an unlocked test first,
 and only grab the spinlock on buffers that need to be dropped.

 Sounds less good and we'd need reasonable proof it actually did
 anything useful without being dangerous.

Doing an initial unlocked test speeds things up another 2.69 fold (on
top of 3.55 for your patch) for me, with 1GB of shared buffers.  That
seems like it should be worthwhile.

How do we go about getting reasonable proof that it is safe?

Thanks,

Jeff


DropRelFileNodeBuffers_unlock_v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-06-01 Thread Sergey Koposov

On Fri, 1 Jun 2012, Simon Riggs wrote:



Why do you have 10,000 tables and why is it important to drop them so quickly?


1 tables are there, because that's the number of partitions. And I'm 
dropping them at the moment, because I'm doing testing. So it won't be
really crucial for production. But I still thought it was worth reporting. 
Especially when the table dropping took .5 a sec.


The problem is that when I set up the shared_buffers say to 48G, the 
timings of the tables rise significantly again.



If its that important, why not run the drop in parallel sessions?


Yes, before there was a strong reason to do that, now the timings are more 
manageable, but maybe I'll implement that.


Cheers,
S

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-06-01 Thread Simon Riggs
On 1 June 2012 12:34, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
 On Fri, 1 Jun 2012, Simon Riggs wrote:


 Why do you have 10,000 tables and why is it important to drop them so
 quickly?


 1 tables are there, because that's the number of partitions. And I'm
 dropping them at the moment, because I'm doing testing. So it won't be
 really crucial for production. But I still thought it was worth reporting.
 Especially when the table dropping took .5 a sec.

Ah, partitions. That explains the long drop time.

Hopefully people don't need to do that too frequently.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-05-31 Thread Simon Riggs
On 30 May 2012 12:10, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:

 Hmm, we do this in smgrDoPendingDeletes:

 for (i = 0; i = MAX_FORKNUM; i++)
 {
        smgrdounlink(srel, i, false);
 }

 So we drop the buffers for each relation fork separately, which means that
 we scan the buffer pool four times. Relation forks in 8.4 introduced that
 issue, and 9.1 made it worse by adding another fork for unlogged tables.
 With some refactoring, we could scan the buffer pool just once. That would
 help a lot.

That struck me as a safe and easy optimisation. This was a problem I'd
been trying to optimise for 9.2, so I've written a patch that appears
simple and clean enough to be applied directly.

 Also, I wonder if DropRelFileNodeBuffers() could scan the pool without
 grabbing the spinlocks on every buffer? It could do an unlocked test first,
 and only grab the spinlock on buffers that need to be dropped.

Sounds less good and we'd need reasonable proof it actually did
anything useful without being dangerous.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


dropallforks.v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-05-31 Thread Sergey Koposov

On Thu, 31 May 2012, Simon Riggs wrote:



That struck me as a safe and easy optimisation. This was a problem I'd
been trying to optimise for 9.2, so I've written a patch that appears
simple and clean enough to be applied directly.


Thanks! The patch indeed improved the timings, 
The dropping of 100 tables in a single commit before the patch took ~ 50 
seconds, now it takes ~ 5 sec (it would be nice to reduce it further 
though, because the dropping of 1 tables still takes ~10 min).


Cheers,
S

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-05-31 Thread Jeff Janes
On Thu, May 31, 2012 at 11:09 AM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
 On Thu, 31 May 2012, Simon Riggs wrote:


 That struck me as a safe and easy optimisation. This was a problem I'd
 been trying to optimise for 9.2, so I've written a patch that appears
 simple and clean enough to be applied directly.


 Thanks! The patch indeed improved the timings, The dropping of 100 tables in
 a single commit before the patch took ~ 50 seconds, now it takes ~ 5 sec (it
 would be nice to reduce it further though, because the dropping of 1
 tables still takes ~10 min).

I'm surprised it helped that much.  I thought the most it could
theoretically could help would be a factor of 4.

I tried the initially unlocked test, and for me it cut the time by a
factor of 3.  But I only have a 1GB shared_buffers at the max, I would
expect it help more at larger sizes because there is a constant
overhead not related to scanning the shared buffers which gets diluted
out the larger shared_buffers is.

I added to that a drop-all very similar to what Simon posted and got
another factor of 3.

But, if you can do this during a maintenance window, then just
restarting with a much smaller shared_buffers should give you a much
larger speed up than either or both of these.  If I can extrapolate up
to 10G from my current curve, setting it to 8MB instead would give a
speed up of nearly 400 fold.

Cheers,

Jeff

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-05-31 Thread Simon Riggs
On 31 May 2012 19:09, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
 On Thu, 31 May 2012, Simon Riggs wrote:


 That struck me as a safe and easy optimisation. This was a problem I'd
 been trying to optimise for 9.2, so I've written a patch that appears
 simple and clean enough to be applied directly.


 Thanks! The patch indeed improved the timings, The dropping of 100 tables in
 a single commit before the patch took ~ 50 seconds, now it takes ~ 5 sec

Thanks for the timing.

(it
 would be nice to reduce it further though, because the dropping of 1
 tables still takes ~10 min).

Why do you have 10,000 tables and why is it important to drop them so quickly?

If its that important, why not run the drop in parallel sessions?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-05-30 Thread Heikki Linnakangas

On 30.05.2012 03:40, Sergey Koposov wrote:

I was running some tests on PG9.2beta where I'm creating and dropping
large number of tables (~ 2).

And I noticed that table dropping was extremely slow -- e.g. like half a
second per table.

...

I also stopped PG with gdb a few times and it was always at this backtrace:

(gdb) bt
#0 tas (lock=0x7fa4e3007538 \001) at
../../../../src/include/storage/s_lock.h:218
#1 0x006e6956 in DropRelFileNodeBuffers (rnode=...,
forkNum=VISIBILITYMAP_FORKNUM, firstDelBlock=0) at bufmgr.c:2062
#2 0x0070c014 in smgrdounlink (reln=0x1618210,
forknum=VISIBILITYMAP_FORKNUM, isRedo=0 '\000') at smgr.c:354
#3 0x0051ecf6 in smgrDoPendingDeletes (isCommit=1 '\001') at
storage.c:364


Hmm, we do this in smgrDoPendingDeletes:

for (i = 0; i = MAX_FORKNUM; i++)
{
smgrdounlink(srel, i, false);
}

So we drop the buffers for each relation fork separately, which means 
that we scan the buffer pool four times. Relation forks in 8.4 
introduced that issue, and 9.1 made it worse by adding another fork for 
unlogged tables. With some refactoring, we could scan the buffer pool 
just once. That would help a lot.


Also, I wonder if DropRelFileNodeBuffers() could scan the pool without 
grabbing the spinlocks on every buffer? It could do an unlocked test 
first, and only grab the spinlock on buffers that need to be dropped.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-05-30 Thread Robert Haas
On Wed, May 30, 2012 at 7:10 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 So we drop the buffers for each relation fork separately, which means that
 we scan the buffer pool four times. Relation forks in 8.4 introduced that
 issue, and 9.1 made it worse by adding another fork for unlogged tables.
 With some refactoring, we could scan the buffer pool just once. That would
 help a lot.

+1.

 Also, I wonder if DropRelFileNodeBuffers() could scan the pool without
 grabbing the spinlocks on every buffer? It could do an unlocked test first,
 and only grab the spinlock on buffers that need to be dropped.

I think it would be possible for the unlocked test to indicate that
the buffer should be dropped when it really ought not to be, because
someone else might be in the middle of changing the buffer tag, and
that's not atomic.  So you'd have to recheck after taking the
spinlock.  However, I don't think it's possible for the unlocked test
to report a false negative, because we've already taken
AccessExclusiveLock on the relation, which had better be enough to
guarantee that nobody's pulling in any more buffers from that relation
(if it doesn't guarantee that, the current code is already broken).
Acquiring a heavyweight lock also interposes a full memory barrier,
which should eliminate any risks due to memory-ordering effects.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-05-30 Thread Jeff Janes
On Wed, May 30, 2012 at 4:10 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 30.05.2012 03:40, Sergey Koposov wrote:

 I was running some tests on PG9.2beta where I'm creating and dropping
 large number of tables (~ 2).

 And I noticed that table dropping was extremely slow -- e.g. like half a
 second per table.

 ...


 I also stopped PG with gdb a few times and it was always at this
 backtrace:

 (gdb) bt
 #0 tas (lock=0x7fa4e3007538 \001) at
 ../../../../src/include/storage/s_lock.h:218
 #1 0x006e6956 in DropRelFileNodeBuffers (rnode=...,
 forkNum=VISIBILITYMAP_FORKNUM, firstDelBlock=0) at bufmgr.c:2062
 #2 0x0070c014 in smgrdounlink (reln=0x1618210,
 forknum=VISIBILITYMAP_FORKNUM, isRedo=0 '\000') at smgr.c:354
 #3 0x0051ecf6 in smgrDoPendingDeletes (isCommit=1 '\001') at
 storage.c:364


 Hmm, we do this in smgrDoPendingDeletes:

 for (i = 0; i = MAX_FORKNUM; i++)
 {
        smgrdounlink(srel, i, false);
 }

 So we drop the buffers for each relation fork separately, which means that
 we scan the buffer pool four times. Relation forks in 8.4 introduced that
 issue, and 9.1 made it worse by adding another fork for unlogged tables.
 With some refactoring, we could scan the buffer pool just once. That would
 help a lot.

If someone drops many tables in the same transaction, could it be made
to stuff them into a hash table and then drop all of them with one
cycle around the buffer pool?  Or is the use case for that too narrow
a use case to be worthwhile?

Cheers,

Jeff

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-05-30 Thread Ants Aasma
On Wed, May 30, 2012 at 2:10 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 Also, I wonder if DropRelFileNodeBuffers() could scan the pool without
 grabbing the spinlocks on every buffer? It could do an unlocked test first,
 and only grab the spinlock on buffers that need to be dropped.

The scanning of buffers seemed awfully slow to me. Doing the math it
ends up being somewhere around 120ns per buffer, about consistent with
a full cache miss. It looks like the spinlock tas implementation (lock
xchgb) is preventing prefetching. This suspicion is corroborated by
the following comment in Linux kernels per_cpu.h:

/*
 * xchg is implemented using cmpxchg without a lock prefix. xchg is
 * expensive due to the implied lock prefix.  The processor cannot prefetch
 * cachelines if xchg is used.
 */

I'm not sure, but doing an unlocked test first might also be useful in
triggering the prefetchers. The CPU should be doing a lot better than
the current ~4.3GB/s when scanning buffer descriptors.

Of course not scanning at all or doing less scans at the expense of
more work in the inner loop would be even better.

Regards,
Ants Aasma
-- 
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers