Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-09 Thread Simon Riggs
On Wed, Jun 8, 2011 at 6:43 PM, Joshua Berkus j...@agliodbs.com wrote:
 Simon,

 The point I have made is that I disagree with a feature freeze date
 fixed ahead of time without regard to the content of the forthcoming
 release. I've not said I disagree with feature freezes altogether,
 which would be utterly ridiculous. Fixed dates are IMHO much less
 important than a sensible and useful feature set for our users.

 This is such a non-argument it's silly.  We have so many new major features 
 for 9.1 that I'm having trouble writing sensible press releases which don't 
 sound like a laundry list.

You're right this is a non-argument.

I am not continuing this debate using the above point. I am merely
correcting people's assertions about what I think, which is a little
tiresome for all of us and it would be much better if people didn't
foolishly put words in my mouth, as multiple people have done on this
thread.

I'm also quite happy with the feature set for 9.1.


 MySQL
 repeatedly delivered releases with half-finished features and earned
 much disrespect. We have never done that previously and I am against
 doing so in the future.

 This is also total BS.  I worked on the MySQL team.

 Before Sun/Oracle, MySQL specifically had feature-driven releases, where 
 Marketing decided what features 5.0, 5.1 and 5.2 would have.  They also 
 accepted new features during beta if Marketing liked them enough.  This 
 resulted in the 5.1 release being *three years late*, and 5.3 being cancelled 
 altogether.  And let's talk about the legendary instability of 5.0, because 
 they decided that they couldn't cancel partitioning and stored procedures, 
 whether they were ready for prime time or not and because they kept changing 
 the API during beta.

 MySQL never had time-based releases before Oracle took them over.  And Oracle 
 has been having feature-free releases because they're trying to work through 
 MySQL's list of thousands of unfixed bugs which dates back to 2003.


I claimed they delivered half-finished features. You clearly agree
with me on that. I'm not sure which part you see as BS?


 An argument for feature-driven releases is in fact an argument for the MySQL 
 AB development model.  And that's not a company I want to emulate.


Yes, I've also experienced totally marketing-driven software
development, and that's why I'm *here*. I've spoken at length about
how good our process is and have considerable respect for it and the
people that have made it work. I am not advocating any changes to it
at all, especially not to the model used by MySQL AB.

I have asked that we maintain the Reasonableness we have always had
about how the feature freeze date was applied. An example of such
reasonableness is that if a feature is a few days late and it is
important, then it would still go into the release. An example of
unreasonableness would be to close the feature freeze on a
predetermined date, without regard to the state of the feature set in
the release. To date, we have always been reasonable and I don't want
to change the process in the way Robert has suggested we should
change. I was one of a number of developers making that point at the
developer meeting and I would say I was part of the majority view.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-09 Thread Robert Haas
On Thu, Jun 9, 2011 at 5:09 AM, Simon Riggs si...@2ndquadrant.com wrote:
 I have asked that we maintain the Reasonableness we have always had
 about how the feature freeze date was applied. An example of such
 reasonableness is that if a feature is a few days late and it is
 important, then it would still go into the release. An example of
 unreasonableness would be to close the feature freeze on a
 predetermined date, without regard to the state of the feature set in
 the release. To date, we have always been reasonable and I don't want
 to change the process in the way Robert has suggested we should
 change.

Now you're putting words in my mouth.  I wouldn't want to put out a
release without a good feature set, either, but we don't have that
problem.  Getting them out on a fairly regular schedule without a
really long feature freeze has traditionally been a bit harder.  I
believe that over the last few releases we've actually gotten better
at integrating larger patches while also sticking closer to the
schedule; and I'd like to continue to get better at both of those
things.  I don't advocate blind adherence to the feature freeze date
either, but I do prefer to see deviations measured in days or at most
weeks rather than months; and I have a lot more sympathy for the
"patch submitted and no one got around to reviewing it" situation than
I do for the "patch just plain got here late" case.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-09 Thread Dave Page
On Thu, Jun 9, 2011 at 2:13 PM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Jun 9, 2011 at 5:09 AM, Simon Riggs si...@2ndquadrant.com wrote:
 I have asked that we maintain the Reasonableness we have always had
 about how the feature freeze date was applied. An example of such
 reasonableness is that if a feature is a few days late and it is
 important, then it would still go into the release. An example of
 unreasonableness would be to close the feature freeze on a
 predetermined date, without regard to the state of the feature set in
 the release. To date, we have always been reasonable and I don't want
 to change the process in the way Robert has suggested we should
 change.

 Now you're putting words in my mouth.  I wouldn't want to put out a
 release without a good feature set, either, but we don't have that
 problem.  Getting them out on a fairly regular schedule without a
 really long feature freeze has traditionally been a bit harder.  I
 believe that over the last few releases we've actually gotten better
 at integrating larger patches while also sticking closer to the
 schedule; and I'd like to continue to get better at both of those
 things.  I don't advocate blind adherence to the feature freeze date
 either, but I do prefer to see deviations measured in days or at most
 weeks rather than months; and I have a lot more sympathy for the
 "patch submitted and no one got around to reviewing it" situation than
 I do for the "patch just plain got here late" case.

Can we make this the last post on this topic please?

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-09 Thread Pavan Deolasee

 
 Can we make this the last post on this topic please?
 

+1 :)

Thanks,
Pavan



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-08 Thread Jim Nasby
On Jun 7, 2011, at 8:24 AM, Stephen Frost wrote:
 * Alvaro Herrera (alvhe...@commandprompt.com) wrote:
 I note that if 2nd Quadrant is interested in having a game-changing
 platform without having to wait a full year for 9.2, they can obviously
 distribute a modified version of Postgres that integrates Robert's
 patch.
 
 Having thought about this, I've got to agree with Alvaro on this one.
 The people who need this patch are likely to pull it down and patch it
 in and use it, regardless of if it's in a release or not.  My money is
 that Treat's already got it running on some massive prod system that he
 supports ( ;) ).
 
 If we get it into the first CF of 9.2 then people are going to be even
 more likely to pull it down and back-patch it into 9.1.  As soon as we
 wrap up CF1 and put out our first alpha, the performance testers will
 have something to point at and say "look!  PG scales *even better* now!"
 and they're not going to particularly care that it's an alpha and the
 blog-o-sphere isn't going to either, especially if we can say "and it'll
 be in the next release which is scheduled for May".

From the Thinking Outside The Box dept.:

Also, if the performance gains prove to be as earth-shattering as initial 
results indicate, there's nothing that says we *have* to wait until the middle 
of next year to get this out. We could push to get 9.2 out with fewer other 
features, or possibly even break with tradition and backport this to 9.1 (or 
perhaps have a fork of 9.1 that we only support until 9.2 is out).

Obviously, those options all involve serious time commitments and the community 
will have to weigh those carefully. And we'd have to have very strong evidence 
of the benefits before even having that discussion, because the discussion 
itself will likely be resource intensive. But the option *is* there, should we 
decide to pursue it.

This means that "this patch is too important to wait another 12 months" isn't 
really a valid point: it only has to wait 12 months if that's what the community 
thinks is best; otherwise it could miss 9.1 *and* be out significantly before 
12 months from now.
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net





Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-08 Thread Simon Riggs
On Wed, Jun 8, 2011 at 5:19 AM, Bruce Momjian br...@momjian.us wrote:
 Robert Haas wrote:
 On Mon, Jun 6, 2011 at 10:49 AM, Simon Riggs si...@2ndquadrant.com wrote:
  My point was that we have in the past implemented performance changes
  to increase scalability at the last minute, and also that our personal
  risk perspectives are not always set in stone.
 
  Robert has highlighted the value of this change and it's clearly not
  beyond our wit to include it, even if it is beyond our will to do so.

 So, at the risk of totally derailing this thread -- what this boils
 down to is a philosophical disagreement.

 It seems to me (and, I think, to Tom and Heikki and others as well)
 that it's not possible to keep on making changes to the release right
 up until the last minute and then expect the release to be of high
 quality.  If we keep committing new features, then we'll keep
 introducing new bugs.  The only hope of making the bug count go down
 at some point is to stop making changes that aren't bug fixes.  We
 could come up with some complex procedure for determining whether a
 patch is important enough and non-invasive enough to bypass the normal
 deadline, but that would probably lead to a lot more arguing about
 procedure, and realistically, it's still going to increase the bug
 count at least somewhat.  IMHO, it's better to just have a deadline,
 and stuff either makes it or it doesn't.  I realize we haven't always
 adhered to the principle in the past, but at least IMV that's not a
 mistake we want to continue repeating.

 Simon is right that we slipped the vxid patch into 8.3 when a Postgres
 user I talked to at Linuxworld mentioned high vacuum freeze activity and
 simple calculations showed the many read-only queries could cause high
 xid usage.  Fortunately we already had a patch available and Tom applied
 it during beta.  It was an existing patch that took on new urgency
 during beta.

 Robert's point above is that it isn't so much making the decision of
 whether something should slip past the deadline, but the time-sapping
 discussion of whether something should slip, and the frankly disturbing
 behavior of some in this group to not accept a clear consensus,
 therefore prolonging the discussion of slippage far longer than
 necessary.

 Basically, if you propose something, and it gets shot down due to
 procedure, accept that unless you have some very good _new_ reason for
 continuing the discussion.  If you don't like that, then you are not
 going to do well in our group and maybe this isn't the group for you.

 I think we are going to need to be much more forceful about this, and if
 the threat is that someone has commit rights and therefore we can't ignore
 them, we will have to reconsider who can commit to this project.  Do I
 need to be any clearer?

You are very clear, but as to why, I am not sure.

On Monday, realising that Robert had discovered something of massive
potential benefit to the community, I asked Tom to take a look at the
patch to see if I could get his interest in including it in this
release. I did that out of pure altruism; how could I possibly benefit
from highlighting the work of another person, another company?

Tom has agreed with me that making tuning proposals during beta is
acceptable. In this case, he thinks it is too risky to apply. In fact,
I agreed, having reviewed the patch myself, suggesting a much simpler,
non-invasive patch instead (a new reason, as you say). I then
immediately accepted his decision to exclude any patch involving
locking from further consideration.

Given the level of potential benefit, I don't have a problem tapping
Tom on the shoulder to review it and see if it is tweakable. At no
point have I discussed applying the patch myself, nor have I ever even
considered it. The main point is that in his hands a task can be done
in days, not the months others have quoted. You can read that as
respect and optimism, or you can see chaos and disrespect, but that is
all in the eye of the beholder.

As a result of this, I've been insulted, told I have no respect for
process and even suggested there was a threat of patch war. None of
that is reasonable or anywhere close to truth. If there has been a
time sapping discussion, it is because people have jumped to
conclusions and responded irrationally. To be honest, I'm completely
surprised by all of that. I had no idea that me asking Tom a question
was perceived as a denial of service attack on the community, nor that
it would result in the comments made to me and about me.

As long as I am allowed the freedom to speak in this forum then I will
speak up for PostgreSQL users, committer or not. As long as I'm a
committer, I will take responsibility for the code and seek to improve
it and fix it according to the community process.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-08 Thread Robert Haas
On Wed, Jun 8, 2011 at 11:39 AM, Jim Nasby j...@nasby.net wrote:
 On Jun 7, 2011, at 8:24 AM, Stephen Frost wrote:
 * Alvaro Herrera (alvhe...@commandprompt.com) wrote:
 I note that if 2nd Quadrant is interested in having a game-changing
 platform without having to wait a full year for 9.2, they can obviously
 distribute a modified version of Postgres that integrates Robert's
 patch.

 Having thought about this, I've got to agree with Alvaro on this one.
 The people who need this patch are likely to pull it down and patch it
 in and use it, regardless of if it's in a release or not.  My money is
 that Treat's already got it running on some massive prod system that he
 supports ( ;) ).

 If we get it into the first CF of 9.2 then people are going to be even
 more likely to pull it down and back-patch it into 9.1.  As soon as we
 wrap up CF1 and put out our first alpha, the performance testers will
 have something to point at and say "look!  PG scales *even better* now!"
 and they're not going to particularly care that it's an alpha and the
 blog-o-sphere isn't going to either, especially if we can say "and it'll
 be in the next release which is scheduled for May".

 From the Thinking Outside The Box dept.:

 Also, if the performance gains prove to be as earth-shattering as initial 
 results indicate, there's nothing that says we *have* to wait until the 
 middle of next year to get this out. We could push to get 9.2 out with fewer 
 other features, or possibly even break with tradition and backport this to 
 9.1 (or perhaps have a fork of 9.1 that we only support until 9.2 is out).

 Obviously, those options all involve serious time commitments and the 
 community will have to weigh those carefully. And we'd have to have very 
 strong evidence of the benefits before even having that discussion, because 
 the discussion itself will likely be resource intensive. But the option *is* 
 there, should we decide to pursue it.

 This means that "this patch is too important to wait another 12 months" isn't 
 really a valid point: it only has to wait 12 months if that's what the 
 community thinks is best; otherwise it could miss 9.1 *and* be out 
 significantly before 12 months from now.

Right.  The community gets to decide when the community wants to
release, and with what features.  Right now, the consensus is that we
want to finish up 9.1 and release it.  It doesn't seem impossible that
we could manage to do that before this patch is ready for commit,
which is why I don't want to try to slip this into 9.1 no matter how
valuable it is.

I also feel that the fundamental thing we need in order to have better
releases is more developers spending more time developing cool stuff.
That is why I am somewhat dismayed to see this discussion veer off on
what I consider to be a tangent about release scheduling.  It took me
about 3 days to write the patch.  I've now spent the better part of a
day on this scheduling discussion.  I would rather have spent that
time improving the patch.  Or working on some other patch.  Or getting
9.1 out the door.  Now, mind you, I think release scheduling is
important.  I believe in the value of good project management.  But if
we make every cool patch that comes along into an opportunity to fight
about the release schedule, that's not productive.  Already, I feel
that any hope I might have had of getting useful technical feedback on
this patch anytime in the near future has been basically obliterated.
What a bummer.

As for the 9.2 schedule, I'm actually hoping that 9.2 will be a big
release for performance, sorta like 8.3 was.  I think that to make
that happen, we're going to need more than one good patch.  This patch
can be part of that picture, but there are many users who derive no
benefit or only a small benefit from it.  Of course, there are some
who will get a big benefit, and I'm as excited about that as everyone
else, but if we can broaden the aperture a bit and come up with a
variety of improvements that hit on a variety of use cases, then we'll
really have something to brag about.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-08 Thread Simon Riggs
On Wed, Jun 8, 2011 at 6:02 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Bruce Momjian br...@momjian.us writes:
 Simon is right that we slipped the vxid patch into 8.3 when a Postgres
 user I talked to at Linuxworld mentioned high vacuum freeze activity and
 simple calculations showed the many read-only queries could cause high
 xid usage.  Fortunately we already had a patch available and Tom applied
 it during beta.  It was an existing patch that took on new urgency
 during beta.

 Just to set the record straight on this ... the vxid patch went in on
 2007-09-05:
 http://archives.postgresql.org/pgsql-committers/2007-09/msg00026.php
 which was a day shy of a month before we wrapped 8.3beta1:
 http://archives.postgresql.org/pgsql-committers/2007-10/msg00089.php
 so it was during alpha phase not beta.  And 8.3RC1 was stamped on
 2008-01-03.  So Simon's assertion that this was "days before we produced
 a release candidate" is correct, if you take "days" as "4 months".

The patch went in slightly more than 6 months after feature freeze,
even though it was written by a summer student and did not even pass
review by the student's mentor (me).

The patch is invasive, involving core changes to the transaction
infrastructure and touching more than 30 files.

It was a brilliant contribution from Florian.

I take it as an example of
* what you can do when you set your mind to it, given sufficient cause
and a good starting point
* how people can propose things of value to the community even at a late stage
* how I have respected the process at other times

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-08 Thread Simon Riggs
On Wed, Jun 8, 2011 at 5:33 AM, Bruce Momjian br...@momjian.us wrote:

 One more thing --- when Tom applied that patch during 8.3 beta it was
 with everyone's agreement, so the policy should be that if we are going
 to break the rules, everyone has to agree --- if anyone disagrees, the
 rules stand.

I spoke against applying the patch, and to my knowledge was the only
person to have reviewed it at that stage.

I was happy that Tom applied it, but I would not have done so myself
then, nor would I do so now. I would trust only Tom to do that, which
is why I proposed to Tom that he look at Robert's patch.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-08 Thread Robert Haas
On Wed, Jun 8, 2011 at 12:25 PM, Simon Riggs si...@2ndquadrant.com wrote:
 As a result of this, I've been insulted, told I have no respect for
 process and even suggested there was a threat of patch war.

Well, you've pretty much said flat out you don't like the process, and
you don't agree with having a firm feature freeze.  I think it's a
perfectly legitimate question to ask whether we're going to have to
continually relitigate that point.  This is at least the second major
dust-up on this point since the end of 9.1CF4, and there were some
smaller ones, too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-08 Thread Simon Riggs
On Wed, Jun 8, 2011 at 5:32 PM, Robert Haas robertmh...@gmail.com wrote:

 It took me
 about 3 days to write the patch.  I've now spent the better part of a
 day on this scheduling discussion.  I would rather have spent that
 time improving the patch.  Or working on some other patch.  Or getting
 9.1 out the door.

Sync Rep took 6 days to write initially and about 6 months to discuss
it, so you have a long way to go before your experience matches mine.

Sometimes people sidetrack you onto things you think are pointless,
and sometimes you voice the opinion that they shouldn't have done so.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-08 Thread Simon Riggs
On Wed, Jun 8, 2011 at 5:44 PM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Jun 8, 2011 at 12:25 PM, Simon Riggs si...@2ndquadrant.com wrote:
 As a result of this, I've been insulted, told I have no respect for
 process and even suggested there was a threat of patch war.

 Well, you've pretty much said flat out you don't like the process, and
 you don't agree with having a firm feature freeze.  I think it's a
 perfectly legitimate question to ask whether we're going to have to
 continually relitigate that point.  This is at least the second major
 dust-up on this point since the end of 9.1CF4, and there were some
 smaller ones, too.

Why do you address this to me? Many others have been committing
patches against raised issues well after feature freeze.

You do not wish to stop all patches, only those you disagree with. How
would I know you disagree with a patch without discussing it?

I note that you've claimed *everything* I have discussed is a new
feature, whereas everything you or others have done is an open item.
You can claim that everything I suggest is a dust-up if you wish, but
who makes it a dust up and why?

The point I have made is that I disagree with a feature freeze date
fixed ahead of time without regard to the content of the forthcoming
release. I've not said I disagree with feature freezes altogether,
which would be utterly ridiculous. Fixed dates are IMHO much less
important than a sensible and useful feature set for our users. MySQL
repeatedly delivered releases with half-finished features and earned
much disrespect. We have never done that previously and I am against
doing so in the future.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-08 Thread Joshua Berkus
Simon,

 The point I have made is that I disagree with a feature freeze date
 fixed ahead of time without regard to the content of the forthcoming
 release. I've not said I disagree with feature freezes altogether,
 which would be utterly ridiculous. Fixed dates are IMHO much less
 important than a sensible and useful feature set for our users.

This is such a non-argument it's silly.  We have so many new major features for 
9.1 that I'm having trouble writing sensible press releases which don't sound 
like a laundry list.

 MySQL
 repeatedly delivered releases with half-finished features and earned
 much disrespect. We have never done that previously and I am against
 doing so in the future.

This is also total BS.  I worked on the MySQL team.  Before Sun/Oracle, MySQL 
specifically had feature-driven releases, where Marketing decided what features 
5.0, 5.1 and 5.2 would have.  They also accepted new features during beta if 
Marketing liked them enough.  This resulted in the 5.1 release being *three 
years late*, and 5.3 being cancelled altogether.  And let's talk about the 
legendary instability of 5.0, because they decided that they couldn't cancel 
partitioning and stored procedures, whether they were ready for prime time or 
not and because they kept changing the API during beta.

MySQL never had time-based releases before Oracle took them over.  And Oracle 
has been having feature-free releases because they're trying to work through 
MySQL's list of thousands of unfixed bugs which dates back to 2003.

An argument for feature-driven releases is in fact an argument for the MySQL AB 
development model.  And that's not a company I want to emulate.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-08 Thread Joshua D. Drake

On 06/07/2011 11:55 AM, Tom Lane wrote:

Simon Riggs si...@2ndquadrant.com writes:

Before you arrived, it was quite normal to suggest tuning patches
after feature freeze.


*Low risk* tuning patches make sense at this stage, yes.  Fooling with
the lock mechanisms doesn't qualify as low risk in my book.  The
probability of undetected subtle problems is just too great.

regards, tom lane


I would like to see us continue on the path of release, not 
destabilization. Any patch that breaks into core feature mechanisms 
(like locking) is bound to have something unexpected waiting in the wings.


+1 for submitting for 9.2.
+1 for not committing to 9.1.

Sincerely,

Joshua D. Drake



--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
The PostgreSQL Conference - http://www.postgresqlconference.org/
@cmdpromptinc - @postgresconf - 509-416-6579



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-08 Thread Robert Haas
On Wed, Jun 8, 2011 at 1:10 PM, Simon Riggs si...@2ndquadrant.com wrote:
 Why do you address this to me? Many others have been committing
 patches against raised issues well after feature freeze.

No one other than you has proposed committing anything nearly as
invasive as this, and the great majority of what we've committed has
been targeted at new regressions in 9.1.

There is a difference between a feature and a bug fix.  Sometimes the
distinction is arguable, but this isn't one of those cases.  A feature
freeze does not mean an absolute code freeze; it means a freeze on
*features*.

 You do not wish to stop all patches, only those you disagree with. How
 would I know you disagree with a patch without discussing it?

 I note that you've claimed *everything* I have discussed is a new
 feature, whereas everything you or others have done is an open item.
 You can claim that everything I suggest is a dust-up if you wish, but
 who makes it a dust up and why?

I think the people, including me, who feel that it's not a good idea
to commit new features have been very clear about the reasons for
their position - namely, (1) the desire to get the release out the
door in a timely fashion, and (2) the desire to treat everyone's
patches in a fair and even-handed way rather than privileging some
over others.  I'm just as much against committing my own features, or
Tom's features, or Alvaro's features as I am against committing your
features - not because I don't like the features (I do) but because I
want to release 9.1 in about a month.

 The point I have made is that I disagree with a feature freeze date
 fixed ahead of time without regard to the content of the forthcoming
 release. I've not said I disagree with feature freezes altogether,
 which would be utterly ridiculous. Fixed dates are IMHO much less
 important than a sensible and useful feature set for our users. MySQL
 repeatedly delivered releases with half-finished features and earned
 much disrespect. We have never done that previously and I am against
 doing so in the future.

So am I.  But apparently, we have very different ideas of what that
means.  I thought that making the server shut down properly, even
if you are using sync rep, was a clear-cut case of correcting a
half-finished feature, but you argued against that change.  And I
think that revamping the locking mechanism so it's faster is clearly
a new feature, not a repair to something half-finished.  I don't
expect it's very realistic to think that everyone is going to agree on
every patch, but if we can't agree that bug fixes and features should be
treated differently, or if we can't agree at least in most cases on
what the difference is between one and the other, then we will spend a
lot of time talking past each other.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-08 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On Wed, Jun 8, 2011 at 6:02 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Just to set the record straight on this ... the vxid patch went in on
 2007-09-05:
 http://archives.postgresql.org/pgsql-committers/2007-09/msg00026.php
 which was a day shy of a month before we wrapped 8.3beta1:
 http://archives.postgresql.org/pgsql-committers/2007-10/msg00089.php
 so it was during alpha phase not beta.  And 8.3RC1 was stamped on
 2008-01-03.  So Simon's assertion that this was days before we produced
 a release candidate is correct, if you take "days" as "4 months".

 The patch went in slightly more than 6 months after feature freeze,
 even though it was written by a summer student and did not even pass
 review by the student's mentor (me).

I'm not sure why you're having such a hard time distinguishing "before
beta" from "after beta", but in any case please notice that you're
describing a cycle where we spent nine months in feature freeze.
Nobody else here is going to hold that up as an example of sound project
management that we ought to repeat.  And the way to not repeat it is to
not accept risky new patches late in the cycle.

(This may be something of an apples-to-oranges comparison, though, since
as best I can tell from a quick look in the archives, we were not then
using the term "feature freeze" the same as we are now --- 2007-04-01
seems to have been the point that we would now call "beginning of the
last CF", ie, all feature patches for 8.3 were supposed to have been
*submitted*, not necessarily committed.  And we had a lot of them
pending at that point, because of lack of the CF process to get things
in earlier.)

regards, tom lane
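
[Archive note] The interval in the message above can be checked directly. This is a minimal sketch that uses only the two dates given there (the vxid commit on 2007-09-05 and the 8.3RC1 stamp on 2008-01-03); the variable names are illustrative.

```python
from datetime import date

# Dates taken from the message above.
vxid_commit = date(2007, 9, 5)   # vxid patch committed
rc1_stamped = date(2008, 1, 3)   # 8.3RC1 stamped

# Exact gap in days.
gap_days = (rc1_stamped - vxid_commit).days

# Difference in calendar month numbers; the real span is two days
# short of four full months (2007-09-05 to 2008-01-05 would be four).
gap_months = (rc1_stamped.year - vxid_commit.year) * 12 + (
    rc1_stamped.month - vxid_commit.month
)

print(f"{gap_days} days, roughly {gap_months} months")
```

So the gap between the commit and the release candidate is on the order of four months, not days.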



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-08 Thread Tom Lane
Joshua Berkus j...@agliodbs.com writes:
 Simon,
 The point I have made is that I disagree with a feature freeze date
 fixed ahead of time without regard to the content of the forthcoming
 release. I've not said I disagree with feature freezes altogether,
 which would be utterly ridiculous. Fixed dates are IMHO much less
 important than a sensible and useful feature set for our users.

 This is such a non-argument it's silly.

Perhaps more to the point, we've tried that approach in the past,
repeatedly, and it's been a scheduling disaster every single time.
Slipping the release date in order to get in newly-written features,
no matter *how* attractive they are, does not work.  Maybe there are
people who can make it work, but not us.

regards, tom lane



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Dave Page
On Tue, Jun 7, 2011 at 12:29 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Dave Page dp...@pgadmin.org writes:
 On Mon, Jun 6, 2011 at 8:44 PM, Stephen Frost sfr...@snowman.net wrote:
 If we're going to start putting in changes like this, I'd suggest that
 we try and target something like September for 9.1 to actually be
 released.  Playing with the lock management isn't something we want to
 be doing lightly and I think we definitely need to have serious testing
 of this, similar to what has been done for the SSI changes, before we're
 going to be able to release it.

 Completely aside from the issue at hand, aren't we looking at a
 September release by now anyway (assuming we have to avoid late
 July/August as we usually do)?

 Very possibly.  So if we add this in, we're talking November or December
 instead of September.  You can't argue that July/August will be lost
 time for one development path but not another.

That would depend on 2 things - a) whether testing and review of this
single patch would really add 2 - 3 months to the schedule (I'm no
expert on our locking, but I suspect it would not), and b) whether
there are people around over the summer who could test/review. The
reason we usually skip the summer isn't actually a wholesale lack of
people - it's because it's not so good from a publicity perspective,
and it's hard to get all the packagers around at the same time.


-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Stephen Frost
* Alvaro Herrera (alvhe...@commandprompt.com) wrote:
 I note that if 2nd Quadrant is interested in having a game-changing
 platform without having to wait a full year for 9.2, they can obviously
 distribute a modified version of Postgres that integrates Robert's
 patch.

Having thought about this, I've got to agree with Alvaro on this one.
The people who need this patch are likely to pull it down and patch it
in and use it, regardless of whether it's in a release or not.  My money is
that Treat's already got it running on some massive prod system that he
supports ( ;) ).

If we get it into the first CF of 9.2 then people are going to be even
more likely to pull it down and back-patch it into 9.1.  As soon as we
wrap up CF1 and put out our first alpha, the performance testers will
have something to point at and say "look!  PG scales *even better* now!"
and they're not going to particularly care that it's an alpha and the
blog-o-sphere isn't going to either, especially if we can say "and it'll
be in the next release which is scheduled for May".

So, all-in-all, -1 from me on trying to get this into 9.1.  Let's get
9.1 done and out the door already, hopefully before summer saps away
*too* many resources..

Thanks,

Stephen




Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Joshua D. Drake

On 06/06/2011 04:43 PM, Robert Haas wrote:
 On Mon, Jun 6, 2011 at 6:53 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
 Excerpts from Robert Haas's message of vie jun 03 09:17:08 -0400 2011:
 I've spent enough time working on this issue now to be convinced
 that the approach has merit, if we can work out the kinks.  I'll start
 with some performance numbers.

 I hereby recommend that people with patches such as this one while on
 the last weeks till release should refrain from posting them until the
 release has actually taken place.

 %@#!

 Next time I'll be sure to only post my patches during beta if they suck.

I think Alvaro's point isn't directed at you Robert but at the idea that 
this should be applied to 9.1.


Sincerely,

Joshua D. Drake

--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
The PostgreSQL Conference - http://www.postgresqlconference.org/
@cmdpromptinc - @postgresconf - 509-416-6579



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Simon Riggs
On Mon, Jun 6, 2011 at 8:50 PM, Dave Page dp...@pgadmin.org wrote:
 On Mon, Jun 6, 2011 at 8:40 PM, Stefan Kaltenbrunner
 ste...@kaltenbrunner.cc wrote:
 On 06/06/2011 09:24 PM, Dave Page wrote:
 On Mon, Jun 6, 2011 at 8:12 PM, Dimitri Fontaine dimi...@2ndquadrant.fr 
 wrote:
 So, to the question “do we want hard deadlines?” I think the answer is
 “no”, to “do we need hard deadlines?”, my answer is still “no”, and to
 the question “should this very change be considered this late?” my
 answer is yes.

 Because it really changes the game for PostgreSQL users.

 Much as I hate to say it (I too want to keep our schedule as
 predictable and organised as possible), I have to agree. Assuming the
 patch is good, I think this is something we should push into 9.1. It
 really could be a game changer.

 I disagree - the proposed patch maybe provides a very significant
 improvement for a certain workload type (nothing less but nothing more),
 but it was posted way after -BETA and I'm not sure we yet understand all
 implications of the changes.

 We certainly need to be happy with the implications if we were to make
 such a decision.

 We also have to consider that the underlying issues are known problems
 for multiple years^releases so I don't think there is a particular rush
 to force them into a particular release (as in 9.1).

 No, there's no *technical* reason we need to do this, as there would
 be if it were a bug fix for example. I would just like to see us
 narrow the gap with our competitors sooner rather than later, *if*
 we're a) happy with the change, and b) we're talking about a minimal
 delay (which we may be - Robert says he thinks the patch is good, so
 with another review and beta testing).

Stefan/Robert's observation that we perform a
VirtualXactLockTableInsert() to no real benefit is a good one.

It leads to the following simple patch to remove one lock table hit
per transaction. It's a lot smaller impact on the LockMgr locks, but
it will still be substantial. Performance tests please?

This patch is much less invasive and has impact only on CREATE INDEX
CONCURRENTLY and Hot Standby. It's taken me about 2 hours to write and
test and there's no way it will cause any delay at all to the release
schedule. (Though I'm sure Robert can improve it).

If we combine this patch with Koichi-san's recommended changes to the
number of lock partitions, we will have considerable impact for 9.1.
Robert will still get his day in the sun, just with 9.2.

This way we get something now *and* something later, while the risk
minimisers will have succeeded in protecting the code. A compromise
for everyone.

Please consider this as a serious proposal for tuning in 9.1.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


remove_VirtualXactLockTableInsert.v1.patch
Description: Binary data
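
[Archive note] For readers unfamiliar with the lock partition idea mentioned above: the shared lock table is divided into partitions, and a lock's hash code selects the partition it lives in. The sketch below is a toy model, not PostgreSQL code. `partition_for`, `partition_load`, the use of CRC32 as the hash, the `relation/<n>` tag format, and the partition counts 16 and 64 are all assumptions chosen for illustration.

```python
import zlib
from collections import Counter


def partition_for(lock_tag: str, num_partitions: int) -> int:
    """Map a lock tag to a partition by hashing (CRC32 is a stand-in hash)."""
    return zlib.crc32(lock_tag.encode("utf-8")) % num_partitions


def partition_load(lock_tags, num_partitions: int) -> Counter:
    """Count how many lock tags land in each partition."""
    return Counter(partition_for(tag, num_partitions) for tag in lock_tags)


# Hypothetical workload: 10,000 distinct relation-like lock tags.
tags = [f"relation/{n}" for n in range(10_000)]

for n in (16, 64):
    load = partition_load(tags, n)
    print(f"{n} partitions: busiest partition holds {max(load.values())} tags")
```

With more partitions, fewer tags share any one partition, which is the intuition behind raising the partition count. Real trade-offs, such as the cost of any operation that has to touch every partition, are not modelled here.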



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Robert Haas
On Tue, Jun 7, 2011 at 12:51 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On Mon, Jun 6, 2011 at 8:50 PM, Dave Page dp...@pgadmin.org wrote:
 On Mon, Jun 6, 2011 at 8:40 PM, Stefan Kaltenbrunner
 ste...@kaltenbrunner.cc wrote:
 On 06/06/2011 09:24 PM, Dave Page wrote:
 On Mon, Jun 6, 2011 at 8:12 PM, Dimitri Fontaine dimi...@2ndquadrant.fr 
 wrote:
 So, to the question “do we want hard deadlines?” I think the answer is
 “no”, to “do we need hard deadlines?”, my answer is still “no”, and to
 the question “should this very change be considered this late?” my
 answer is yes.

 Because it really changes the game for PostgreSQL users.

 Much as I hate to say it (I too want to keep our schedule as
 predictable and organised as possible), I have to agree. Assuming the
 patch is good, I think this is something we should push into 9.1. It
 really could be a game changer.

 I disagree - the proposed patch maybe provides a very significant
 improvement for a certain workload type (nothing less but nothing more),
 but it was posted way after -BETA and I'm not sure we yet understand all
 implications of the changes.

 We certainly need to be happy with the implications if we were to make
 such a decision.

 We also have to consider that the underlying issues are known problems
 for multiple years^releases so I don't think there is a particular rush
 to force them into a particular release (as in 9.1).

 No, there's no *technical* reason we need to do this, as there would
 be if it were a bug fix for example. I would just like to see us
 narrow the gap with our competitors sooner rather than later, *if*
 we're a) happy with the change, and b) we're talking about a minimal
 delay (which we may be - Robert says he thinks the patch is good, so
 with another review and beta testing).

 Stefan/Robert's observation that we perform a
 VirtualXactLockTableInsert() to no real benefit is a good one.

 It leads to the following simple patch to remove one lock table hit
 per transaction. It's a lot smaller impact on the LockMgr locks, but
 it will still be substantial. Performance tests please?

 This patch is much less invasive and has impact only on CREATE INDEX
 CONCURRENTLY and Hot Standby. It's taken me about 2 hours to write and
 test and there's no way it will cause any delay at all to the release
 schedule. (Though I'm sure Robert can improve it).

 If we combine this patch with Koichi-san's recommended changes to the
 number of lock partitions, we will have considerable impact for 9.1.
 Robert will still get his day in the sun, just with 9.2.

 This way we get something now *and* something later, while the risk
 minimisers will have succeeded in protecting the code. A compromise
 for everyone.

 Please consider this as a serious proposal for tuning in 9.1.

You seem to have completely ignored the reason why it works that way
in the first place, which is that there is otherwise a risk of
undetected deadlock.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
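
[Archive note] The concern above can be pictured with a wait-for graph: a deadlock detector can only find cycles among the waits it is able to see. The following is a toy illustration under that assumption, not PostgreSQL's deadlock detector. `find_cycle` and the backend names are hypothetical.

```python
def find_cycle(waits_for):
    """Return a list of nodes forming a cycle in a wait-for graph, or None.

    waits_for maps each waiter to the set of nodes it is waiting on.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    colour = {}
    stack = []                     # nodes on the current depth-first path

    def visit(node):
        colour[node] = GREY
        stack.append(node)
        for nxt in waits_for.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                # Back edge to a node on the current path: cycle found.
                return stack[stack.index(nxt):]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        colour[node] = BLACK
        return None

    for node in list(waits_for):
        if colour.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Both waits are recorded, so the cycle A -> B -> A is visible.
visible = {"backend_A": {"backend_B"}, "backend_B": {"backend_A"}}

# Same situation, but B's wait on A is not recorded anywhere the
# detector looks, so the graph it sees has no cycle.
hidden = {"backend_A": {"backend_B"}}

print(find_cycle(visible))
print(find_cycle(hidden))
```

In the second graph both backends are still stuck, but the detector reports nothing, which is the "undetected deadlock" risk in miniature.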



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Robert Haas
On Tue, Jun 7, 2011 at 11:56 AM, Joshua D. Drake j...@commandprompt.com wrote:
 On 06/06/2011 04:43 PM, Robert Haas wrote:

 On Mon, Jun 6, 2011 at 6:53 PM, Alvaro Herrera
 alvhe...@commandprompt.com  wrote:

 Excerpts from Robert Haas's message of vie jun 03 09:17:08 -0400 2011:

 I've spent enough time working on this issue now to be convinced
 that the approach has merit, if we can work out the kinks.  I'll start
 with some performance numbers.

 I hereby recommend that people with patches such as this one while on
 the last weeks till release should refrain from posting them until the
 release has actually taken place.

 %@#!

 Next time I'll be sure to only post my patches during beta if they suck.


 I think Alvaro's point isn't directed at you Robert but at the idea that
 this should be applied to 9.1.

Oh, I get that.  I'm just dismayed that we can't have a discussion
about the patch without getting sidetracked into a conversation about
whether we should throw feature freeze out the window.  If posting
patches that do interesting things during beta results in everyone
ignoring both the work that needs to be done to get from beta to final
release, and the patch itself, in favor of talking about the release
schedule, then I think at the next developer meeting we're going to
get to hear Tom argue that overlapping the end of beta with the
beginning of the next release cycle is a mistake and we should go back
to the old system where we yell at everyone to shut up unless they're
helping test or fix bugs.  Since that overlap is going to (hopefully)
allow this patch to get into the tree ~2-3 months SOONER than it would
have under the old system, I would be unhappy to see it abolished.

Everyone who is arguing for the inclusion of this patch in 9.1 should
take a minute to think about the following fact: If the PostgreSQL
development process does not work for Tom, it does not work.  Full
stop.  We all know that Tom is conservative with respect to release
management, but we also know that his output is enormous, that he
fixes virtually all of the bugs that *get* fixed, and that our
well-deserved reputation for high quality releases is in large part
attributable to him.  We will not be better off if we design a process
that leaves him cold.  The fact that Alvaro, Heikki, Andrew, Kevin,
and myself don't like the proposed process either is just icing on the
cake.  And I use the term "process" loosely, because what's really
being proposed is the complete absence of any process.  The idea of
having a feature freeze some time prior to release is hardly a novel
roadblock that we've invented here at the PostgreSQL Global
Development Group.  It's a basic software engineering principle that
has been universally adopted by just about every open and closed
source development project in existence, and with good reason.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 Please consider this as a serious proposal for tuning in 9.1.

Look: it is at least four months too late for anything of the sort in 9.1.
We should be fixing bugs, and nothing else, if we ever want to get 9.1
out the door.  Performance improvements don't qualify, especially not
ones that tinker with fundamental parts of the system and seem highly
likely to introduce new bugs.

regards, tom lane



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Joshua Berkus
 ... there are people around over the summer who could test/review. The
 reason we usually skip the summer isn't actually a wholesale lack of
 people - it's because it's not so good from a publicity perspective,
 and it's hard to get all the packagers around at the same time.

Actually, the summer is *excellent* from a publicity perspective ... at least, 
June and July are.  Both of those months are full of US conferences whose PR we 
can piggyback on to make a splash.

August is really the only bad month from a PR perspective, because we lose a 
lot of our European RCs, and there's no bandwagons to jump on.  But even August 
has the advantage of having no major US or Christian holidays to interfere with 
release dates.

However, we're more likely to have an issue with *packager* availability in 
August.  Besides, isn't this a little premature?  Last I looked, we still have 
some big nasty open items.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 ... I think at the next developer meeting we're going to
 get to hear Tom argue that overlapping the end of beta with the
 beginning of the next release cycle is a mistake and we should go back
 to the old system where we yell at everyone to shut up unless they're
 helping test or fix bugs.

I think we have already got quite enough evidence to conclude that this
approach is broken.  Not only does it appear that hardly anybody but me
is actively working on stabilizing 9.1, but I'm wasting quite a bit of
my time trying to keep Simon from destabilizing it; to say nothing of
reacting to design proposals for 9.2 work (or else feeling guilty
because I'm ignoring them, which is in fact what I've mostly been
doing).

As a measure of how completely this is not working: I've had "read the
SSI code" as a number one priority item for about two months now, and
still haven't found time to read one line of it.

 Everyone who is arguing for the inclusion of this patch in 9.1 should
 take a minute to think about the following fact: If the PostgreSQL
 development process does not work for Tom, it does not work.

I'd like to think that I'm not the sole driver of this process.
However, if everybody else is going to start playing in their 9.2
sandbox and ignore getting a release out, then yeah it comes down
to how much bandwidth I've got.  And that's finite.

regards, tom lane



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Joshua Berkus
Robert,

 Oh, I get that. I'm just dismayed that we can't have a discussion
 about the patch without getting sidetracked into a conversation about
 whether we should throw feature freeze out the window. 

That's not something you can change.  Whatever the patch is, even if it's a 
psql improvement, *someone* will argue that it's super-critical to shoehorn it 
into the release at the last minute.  It's a truism of human nature to 
rationalize exceptions where your own interest is concerned.

As long as we have solidarity of the committers that this is not allowed, 
however, this is not a real problem.  And it appears that we do.  In the 
future, it shouldn't even be necessary to discuss it.

For my part, I'm excited that we seem to be getting some big hairy important 
patches into CF1, which means that those patches will be well-tested by the 
time 9.2 reaches beta.  Especially getting Robert's patch and Simon's 
WALInsertLock work into CF1 means that we'll have 7 months to find serious bugs 
before beta starts.  So I'd really like to carry on with the current 
development schedule.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco



9.1 release scheduling (was Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch)

2011-06-07 Thread Tom Lane
Joshua Berkus j...@agliodbs.com writes:
 Actually, the summer is *excellent* from a publicity perspective ... at 
 least, June and July are.  Both of those months are full of US conferences 
 whose PR we can piggyback on to make a splash.

 August is really the only bad month from a PR perspective, because we lose 
 a lot of our European RCs, and there's no bandwagons to jump on.  But even 
 August has the advantage of having no major US or Christian holidays to 
 interfere with release dates.

 However, we're more likely to have an issue with *packager* availability in 
 August.  Besides, isn't this a little premature?  Last I looked, we still 
 have some big nasty open items.

Well, we're trying to fix them --- I'm still hoping that the known beta
blockers will be cleared by Thursday so we can ship beta2.  However,
what happens after that is uncertain.  I'm concerned that once the CF
starts, the number of developer cycles devoted to 9.1 testing will go to
zero, meaning that four weeks or so from now when the CF is over, we'll
have made no real progress beyond beta2.  It's hard to see how we have a
release before August if that's how things stand in early July.

regards, tom lane



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Robert Haas
On Tue, Jun 7, 2011 at 1:27 PM, Joshua Berkus j...@agliodbs.com wrote:
 As long as we have solidarity of the committers that this is not allowed, 
 however, this is not a real problem.  And it appears that we do.  In the 
 future, it shouldn't even be necessary to discuss it.

Solidarity?

Simon - who was a committer last time I checked - seems to think that
the current process is entirely bunko.  And that is resulting in the
waste of a lot of time that could be better spent.  Our ability to
sustain this development process rests on the idea that we have some
kind of shared idea of what is and is not acceptable in general and at
particular points in the release cycle.  It *shouldn't* be necessary
to discuss it, but it apparently is.  Over and over and over again, in
fact.  It is critically important for the future success of this
project that we learn to walk and chew gum at the same time.  We are
failing outright.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: 9.1 release scheduling (was Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch)

2011-06-07 Thread Thom Brown
On 7 June 2011 19:32, Tom Lane t...@sss.pgh.pa.us wrote:
 Joshua Berkus j...@agliodbs.com writes:
 Actually, the summer is *excellent* from a publicity perspective ... at 
 least, June and July are.  Both of those months are full of US conferences 
 whose PR we can piggyback on to make a splash.

 August is really the only bad month from a PR perspective, because we lose 
 a lot of our European RCs, and there's no bandwagons to jump on.  But even 
 August has the advantage of having no major US or Christian holidays to 
 interfere with release dates.

 However, we're more likely to have an issue with *packager* availability in 
 August.  Besides, isn't this a little premature?  Last I looked, we still 
 have some big nasty open items.

 Well, we're trying to fix them --- I'm still hoping that the known beta
 blockers will be cleared by Thursday so we can ship beta2.  However,
 what happens after that is uncertain.  I'm concerned that once the CF
 starts, the number of developer cycles devoted to 9.1 testing will go to
 zero, meaning that four weeks or so from now when the CF is over, we'll
 have made no real progress beyond beta2.  It's hard to see how we have a
 release before August if that's how things stand in early July.

Speaking of which, is it now safe to remove the "NOT VALID constraints
don't dump properly" issue from the blocker list since the fix has
been committed?

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Robert Haas
On Tue, Jun 7, 2011 at 1:21 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 ... I think at the next developer meeting we're going to
 get to hear Tom argue that overlapping the end of beta with the
 beginning of the next release cycle is a mistake and we should go back
 to the old system where we yell at everyone to shut up unless they're
 helping test or fix bugs.

 I think we have already got quite enough evidence to conclude that this
 approach is broken.  Not only does it appear that hardly anybody but me
 is actively working on stabilizing 9.1, but I'm wasting quite a bit of
 my time trying to keep Simon from destabilizing it; to say nothing of
 reacting to design proposals for 9.2 work (or else feeling guilty
 because I'm ignoring them, which is in fact what I've mostly been
 doing).

 As a measure of how completely this is not working: I've had "read the
 SSI code" as a number one priority item for about two months now, and
 still haven't found time to read one line of it.

 Everyone who is arguing for the inclusion of this patch in 9.1 should
 take a minute to think about the following fact: If the PostgreSQL
 development process does not work for Tom, it does not work.

 I'd like to think that I'm not the sole driver of this process.
 However, if everybody else is going to start playing in their 9.2
 sandbox and ignore getting a release out, then yeah it comes down
 to how much bandwidth I've got.  And that's finite.

I plead guilty to taking my eye off the ball post-beta1.  I busted my
ass for two months stabilizing other people's code after CF4 was over,
and then I moved on to other things.  I will try to get my eye back on
the ball - but actually I'm not sure there's all that much to do.   A
quick review of the open items list suggests that we have fixed a
total of six issues since beta1, as opposed to 47 prior to beta1.  And
all of those are being handled (two by you).  I don't see much in
the way of unanswered 9.1 bug reports on pgsql-bugs, either.  There
may well be other open items, and I'm not unwilling to work on them,
but I don't read minds.  What needs doing?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: 9.1 release scheduling (was Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch)

2011-06-07 Thread Robert Haas
On Tue, Jun 7, 2011 at 1:45 PM, Thom Brown t...@linux.com wrote:
 Speaking of which, is it now safe to remove the "NOT VALID constraints
 don't dump properly" issue from the blocker list since the fix has
 been committed?

I hope so, because I just did that (before noticing this email from you).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Jun 7, 2011 at 1:27 PM, Joshua Berkus j...@agliodbs.com wrote:
 As long as we have solidarity of the committers that this is not allowed, 
 however, this is not a real problem.  And it appears that we do.  In the 
 future, it shouldn't even be necessary to discuss it.

 Solidarity?

 Simon - who was a committer last time I checked - seems to think that
 the current process is entirely bunko.  And that is resulting in the
 waste of a lot of time that could be better spent.

Yes.  If it were anybody but Simon, we wouldn't be spending a lot of
time on it; we'd just say "sorry, this has to wait for 9.2" and that
would be the end of it.  As things stand, we have to convince him not to
commit these things ... or else be prepared to fight a war over whether
to revert them, which will be even more time-consuming and
trust-destroying.

regards, tom lane



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Simon Riggs
On Tue, Jun 7, 2011 at 6:33 PM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jun 7, 2011 at 1:27 PM, Joshua Berkus j...@agliodbs.com wrote:
 As long as we have solidarity of the committers that this is not allowed, 
 however, this is not a real problem.  And it appears that we do.  In the 
 future, it shouldn't even be necessary to discuss it.

 Solidarity?

 Simon - who was a committer last time I checked - seems to think that
 the current process is entirely bunko.

I'm not sure why anyone that disagrees with you should be accused of
wanting to junk the whole process. I've not said that and I don't
think this.

Before you arrived, it was quite normal to suggest tuning patches
after feature freeze.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Stephen Frost
* Simon Riggs (si...@2ndquadrant.com) wrote:
 Before you arrived, it was quite normal to suggest tuning patches
 after feature freeze.

I haven't been around as long as some, but I think I've been around
longer than Robert, and I can say that I don't recall serious
performance patches, particularly ones around lock management and which
change a fair bit of code, generally being white-listed from feature
freeze or being pushed in after beta1.

Perhaps I've missed them, or perhaps there have been a few exceptions
that I'm not remembering and that make it look routine rather than
exceptional.  We might have tweaked a config variable or changed a #define
somewhere close to the end of a cycle, but I really don't put those into
the same category as this change.

Thanks,

Stephen




Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Kevin Grittner
Simon Riggs si...@2ndquadrant.com wrote:
 
 Before you arrived, it was quite normal to suggest tuning patches
 after feature freeze.
 
I've worn a lot of hats in the practical end of this industry, and
regardless of which perspective I look at this from, I can't think
of anything so destructive to productivity, developer morale,
meeting deadlines or release quality as slipping in just one more
item after feature freeze.  It's *always* something that someone
feels is so important that it's worth the delay and/or risk, and it
never works out well.
 
There are a lot of aspects of the development and release processes
on which I can see valid trade-offs and a lot of room for
negotiations and compromise, but having a feature freeze which is
treated seriously isn't one of them.  If nobody else was making an
issue of this, I still would be.
 
There's absolutely nothing personal or political in this -- I just
know what I've seen work and what I've seen cause problems.
 
-Kevin



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Robert Haas
On Tue, Jun 7, 2011 at 2:06 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On Tue, Jun 7, 2011 at 6:33 PM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jun 7, 2011 at 1:27 PM, Joshua Berkus j...@agliodbs.com wrote:
 As long as we have solidarity of the committers that this is not allowed, 
 however, this is not a real problem.  And it appears that we do.  In the 
 future, it shouldn't even be necessary to discuss it.

 Solidarity?

 Simon - who was a committer last time I checked - seems to think that
 the current process is entirely bunko.

 I'm not sure why anyone that disagrees with you should be accused of
 wanting to junk the whole process. I've not said that and I don't
 think this.

 Before you arrived, it was quite normal to suggest tuning patches
 after feature freeze.

I, of course, am not in a position to comment on what happened before
I arrived.  But of the six committers who have weighed in on this
thread, you're the only one who thinks this can plausibly be called a
tuning patch.  Nor would the outcome of this discussion have been any
different if I hadn't participated in it, which is why I steered clear
of the whole topic of how the patch should be handled procedurally for
the first three days.  By the time I weighed in with my opinion, Tom
and Heikki had already expressed theirs.

Now it's possible that my influence is so widespread and pernicious
that I've managed to change Tom and Heikki's opinions on
the topic of feature freeze.  Perhaps, three years ago, they would
have been willing to accept the patch at the last minute, but now,
because of my advocacy for a disciplined feature freeze, they are not.
 To accept this argument, you would have to believe that I have the
power to make Tom Lane more conservative.  I don't believe I have
either the power or the inclination to do any such thing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 Before you arrived, it was quite normal to suggest tuning patches
 after feature freeze.

*Low risk* tuning patches make sense at this stage, yes.  Fooling with
the lock mechanisms doesn't qualify as low risk in my book.  The
probability of undetected subtle problems is just too great.

regards, tom lane



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Jignesh Shah
On Mon, Jun 6, 2011 at 11:20 PM, Jignesh Shah jks...@gmail.com wrote:


 Okay I tried it out with sysbench read scaling test..
 Note I had tried that earlier on 9.0
 http://jkshah.blogspot.com/2010/11/postgresql-90-simple-select-scaling.html

 And on that test I found that doing that test on anything bigger than
 4 cores led to decreased performance ..
 Redoing the same test with 100 users on 4 vCPU Virtual Machine with
 8GB with 1M rows I get
   transactions:                        17870082 (59566.46 per sec.)
 which is in line with the best number on 9.0.
 This test hardly had any idle CPUs.

 However where it made a huge impact was doing the same test on my 8
 vCPU VM with 8GB RAM I get
    transactions:                        33274594 (110914.85 per sec.)

 which is a whopping 1.8x scaling for 2x scaling (from 4 to 8 vCPU)..
 My idle CPU was less than 7%, which, considering that the useful work
 is in line with my expectations, is really impressive..
 (And plus the last time I did MySQL they were around 95K or so for the
 same test).


 Next step DBT-2..



I tried with a warehouse size of 50, all cached in memory, and my
initial tests with DBT-2 using 8 vCPUs do not show any major changes
for a quick 10 minute run. I did eliminate write bottlenecks for this
test so as to put the stress on locks (using full_page_writes=off,
synchronous_commit=off, etc). I also have a large enough buffer pool to
fit the entire 50-warehouse DB in memory.

Without patch score:  29088 NOTPM
With patch score:     30161 NOTPM

It could be that I have other problems in the setup.. One of the things
I noticed is that there are too many idle connections being
reported, which tells me something else is becoming a bottleneck here
:-) I also tested with multiple clients but got similar results..
PostgreSQL shows multiple backends idle in transaction and waiting on
fetch, while the clients show waiting in pqSocketCheck.. as shown below,
for example.

#0  0x7fc4e83a43c6 in poll () from /lib64/libc.so.6
#1  0x7fc4e8abd61a in pqSocketCheck ()
#2  0x7fc4e8abd730 in pqWaitTimed ()
#3  0x7fc4e8abc215 in PQgetResult ()
#4  0x7fc4e8abc398 in PQexecFinish ()
#5  0x004050e1 in execute_new_order ()
#6  0x0040374f in process_transaction ()
#7  0x00403519 in db_worker ()


So yes for DBT2 I think this is inconclusive since there still could
be other bottlenecks in play..  (Networking included)
But overall yes I like the sysbench read scaling numbers quite a bit..
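
The scaling figures above can be checked with a few lines of arithmetic.
This is only a sketch using the throughput and NOTPM numbers quoted in
this message; the variable names are illustrative and nothing here was
measured independently.

```python
# Throughput figures quoted in the message (transactions per second).
tps_4_vcpu = 59566.46
tps_8_vcpu = 110914.85

# Speedup from doubling the vCPU count, and how close that is to the ideal 2x.
speedup = tps_8_vcpu / tps_4_vcpu
efficiency = speedup / (8 / 4)

# DBT-2 scores quoted in the message (NOTPM).
notpm_without_patch = 29088
notpm_with_patch = 30161
dbt2_gain_pct = (notpm_with_patch / notpm_without_patch - 1) * 100

print(f"sysbench speedup 4 -> 8 vCPU: {speedup:.2f}x ({efficiency:.0%} of ideal)")
print(f"DBT-2 gain with patch: {dbt2_gain_pct:.1f}%")
```

On these numbers the sysbench speedup is about 1.86x (quoted as 1.8x
above), while the DBT-2 difference is under 4%, which fits the
description of that result as inconclusive.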


Regards,
Jignesh



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I plead guilty to taking my eye off the ball post-beta1.  I busted my
 ass for two months stabilizing other people's code after CF4 was over,
 and then I moved on to other things.  I will try to get my eye back on
 the ball - but actually I'm not sure there's all that much to do.   A
 quick review of the open items list suggests that we have fixed a
 total of six issues since beta1, as opposed to 47 prior to beta1.  And
 all of those are being handled (two by you).  I also don't see much in
 the way of unanswered 9.1 bug reports on pgsql-bugs, either.  There
 may well be other open items, and I'm not unwilling to work on them,
 but I don't read minds.  What needs doing?

Well, right at the moment there's not that much (if there were, I'd not
have proposed wrapping beta2 in two days).  You could look at some of
the "not blocker" items on the open-items list --- we really ought to
either do those things, or punt them off to TODO or the next CF as
appropriate, sometime before 9.1 final.

regards, tom lane



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Simon Riggs
On Tue, Jun 7, 2011 at 7:55 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 Before you arrived, it was quite normal to suggest tuning patches
 after feature freeze.

 *Low risk* tuning patches make sense at this stage, yes.  Fooling with
 the lock mechanisms doesn't qualify as low risk in my book.  The
 probability of undetected subtle problems is just too great.

Good, then we do agree. Some things are allowed, with suitable
justification. That has not been a point accepted by everybody here
though.

Upthread, I proposed that we leave Robert's patch until 9.2. That was
*after* I had reviewed it for impact and risk. I agree, it's High Risk,
and so must be put off until normal dev opens because of the
sensitivity and criticality of getting the locking interactions right.

Moving on from that, I have proposed other solutions. Koichi, Jignesh
and then Robert have shown measurements of the huge contention in
this area of our software. Robert's patch addresses the problems, as
do Koichi's and my latest patch.  I would like to see us do
*something* about these problems for 9.1. Not all of them are risky or
time consuming. I'm clearly not alone in this thought; Dave, Dimitri
and Koichi-san have also spoken in favour of action for this release.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 Moving on from that, I have proposed other solutions. Koichi, Jignesh
 and then Robert have shown measurements of the huge contention in
 this area of our software. Robert's patch addresses the problems, as
 do Koichi's and my latest patch.  I would like to see us do
 *something* about these problems for 9.1. Not all of them are risky or
 time consuming.

In the first place, all of these issues predate 9.1 by years.  They are
not regressions or new bugs, and they have not suddenly gotten more
urgent.  In the second place, I haven't seen any proposals in the area
that appear low risk.  I seriously doubt that I would consider *any*
meaningful change in the locking area to be low risk.

regards, tom lane



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Robert Haas
On Tue, Jun 7, 2011 at 3:44 PM, Jignesh Shah jks...@gmail.com wrote:
 On Mon, Jun 6, 2011 at 11:20 PM, Jignesh Shah jks...@gmail.com wrote:
 Okay I tried it out with sysbench read scaling test..
 Note I had tried that earlier on 9.0
 http://jkshah.blogspot.com/2010/11/postgresql-90-simple-select-scaling.html

 And on that test I found that doing that test on anything bigger than
 4 cores led to decreased performance ..
 Redoing the same test with 100 users on 4 vCPU Virtual Machine with
 8GB with 1M rows I get
   transactions:                        17870082 (59566.46 per sec.)
 which is in line with the best number on 9.0.
 This test hardly had any idle CPUs.

 However where it made a huge impact was doing the same test on my 8
 vCPU VM with 8GB RAM I get
    transactions:                        33274594 (110914.85 per sec.)

 which is a whopping 1.8x scaling for 2x scaling (from 4 to 8 vCPU)..
 My idle CPU was less than 7%, which, considering that the useful work
 is in line with my expectations, is really impressive..
 (And plus the last time I did MySQL they were around 95K or so for the
 same test).


 Next step DBT-2..



 I tried with a warehouse size of 50, all cached in memory, and my
 initial tests with DBT-2 using 8 vCPUs do not show any major changes
 for a quick 10 minute run. I did eliminate write bottlenecks for this
 test so as to put the stress on locks (using full_page_writes=off,
 synchronous_commit=off, etc). I also have a large enough buffer pool to
 fit the entire 50-warehouse DB in memory.

 Without patch score:  29088 NOTPM
 With patch score:     30161 NOTPM

 It could be that I have other problems in the setup.. One of the things
 I noticed is that there are too many idle connections being
 reported, which tells me something else is becoming a bottleneck here
 :-) I also tested with multiple clients but got similar results..
 PostgreSQL shows multiple backends idle in transaction and waiting on
 fetch, while the clients show waiting in pqSocketCheck.. as shown below,
 for example.

 #0  0x7fc4e83a43c6 in poll () from /lib64/libc.so.6
 #1  0x7fc4e8abd61a in pqSocketCheck ()
 #2  0x7fc4e8abd730 in pqWaitTimed ()
 #3  0x7fc4e8abc215 in PQgetResult ()
 #4  0x7fc4e8abc398 in PQexecFinish ()
 #5  0x004050e1 in execute_new_order ()
 #6  0x0040374f in process_transaction ()
 #7  0x00403519 in db_worker ()


 So yes for DBT2 I think this is inconclusive since there still could
 be other bottlenecks in play..  (Networking included)
 But overall yes I like the sysbench read scaling numbers quite a bit..

I think you will find that for write workloads WALInsertLock is so
badly contended that nothing else matters.  We really need to spend
some time working on that during the 9.2 cycle, but I don't have
anything that resembles a plan at this point.  If you have the cycles,
try compiling with LWLOCK_STATS defined and looking at the blk
numbers just to confirm that's where the bottleneck is.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Simon Riggs
On Tue, Jun 7, 2011 at 9:00 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 Moving on from that, I have proposed other solutions. Koichi, Jignesh
 and then Robert have shown measurements of the huge contention in
 this area of our software. Robert's patch addresses the problems, as
 do Koichi's and my latest patch.  I would like to see us do
 *something* about these problems for 9.1. Not all of them are risky or
 time consuming.

 In the first place, all of these issues predate 9.1 by years.  They are
 not regressions or new bugs, and they have not suddenly gotten more
 urgent.  In the second place, I haven't seen any proposals in the area
 that appear low risk.  I seriously doubt that I would consider *any*
 meaningful change in the locking area to be low risk.

That's a shame. We'll fix it in 9.2 then.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Robert Haas
On Tue, Jun 7, 2011 at 12:51 PM, Simon Riggs si...@2ndquadrant.com wrote:
 Stefan/Robert's observation that we perform a
 VirtualXactLockTableInsert() to no real benefit is a good one.

 It leads to the following simple patch to remove one lock table hit
 per transaction. It's a lot smaller impact on the LockMgr locks, but
 it will still be substantial. Performance tests please?

 This patch is much less invasive and has impact only on CREATE INDEX
 CONCURRENTLY and Hot Standby. It's taken me about 2 hours to write and
 test and there's no way it will cause any delay at all to the release
 schedule. (Though I'm sure Robert can improve it).

Incidentally, I spent the morning (before we got off on this tangent)
writing a patch to make VXID locks spring into existence on demand
instead of creating them for every transaction.  This applies on top
of my fastlock patch and fits in quite nicely with the existing
infrastructure that patch creates, and it helps modestly.  Well,
according to one metric, at least, it helps dramatically: traffic on
each lock manager partition lock drops from hundreds of thousands of
lock requests in a five minute period to just a few hundred.  But the
actual user-visible performance benefit is fairly modest - it goes
from ~36K TPS unpatched to ~129K TPS with the fast relation locks
alone to ~138K TPS with the fast relation locks plus a similar hack
for fast VXID locks (all results with pgbench -c 36 -j 36 -n -S -T 300
on a Nate-Boley-provided 24-core box).  Now, I'm not going to knock a
7% performance improvement and the benefit may be larger on Stefan's
80-core box and I think it's definitely worth going to the trouble to
implement that optimization for 9.2, but it appears at least based on
the testing that I've done so far that the fast relation locks are the
big win and after that it gets much harder to make an improvement.  If
we were to fix ONLY the vxid issue in 9.1 as you were advocating, the
benefit would probably be much less, because at least in my tests, the
fast relation lock patch increases overall system throughput
sufficiently to cause a 12x increase in contention due to vxid
traffic.
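
To make the relative sizes of those gains concrete, here is a small
sketch that uses only the approximate TPS figures quoted above; the
variable names are illustrative.

```python
# Approximate pgbench throughput quoted above (transactions per second).
tps_unpatched = 36_000
tps_fast_relation = 129_000
tps_fast_relation_and_vxid = 138_000

# Gain from the fast relation locks alone, relative to unpatched.
relation_speedup = tps_fast_relation / tps_unpatched

# Additional gain from adding the fast VXID locks on top of that.
vxid_gain_pct = (tps_fast_relation_and_vxid / tps_fast_relation - 1) * 100

print(f"fast relation locks: {relation_speedup:.1f}x over unpatched")
print(f"fast VXID locks on top: +{vxid_gain_pct:.1f}%")
```

The first step is roughly a 3.6x improvement and the second roughly 7%,
which is the sense in which the fast relation locks are the big win.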

With both the fast-relation locks and the fast-vxid locks in place, as
I mentioned, the lock manager partition lock contention is completely
gone; in fact the lock manager partition traffic is pretty much gone.
The remaining contention comes mostly from the free list locks (blk
~13%) and the buffer mapping locks (which were roughly: 800k shacq,
12000 exacq, 850 blk).  Interestingly, I saw that one buffer mapping
lock got about 5x hotter than the others, which is odd, but possibly
harmless, since the absolute amount of blocking is really rather small
(~0.1%).  At least for read performance, we may need to start looking
less at reducing lock contention and more at making the actual
underlying operations faster.
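
As a rough check on the ~0.1% figure, the sketch below assumes that
the blocking rate is the blk count divided by the total number of
acquisitions (shacq plus exacq). That reading of the counters is an
assumption made for illustration, not something stated in the message.

```python
# Rough buffer mapping lock counters quoted above.
shacq = 800_000   # shared acquisitions
exacq = 12_000    # exclusive acquisitions
blk = 850         # acquisitions that had to block (assumed meaning)

total_acquisitions = shacq + exacq
blocked_pct = blk / total_acquisitions * 100

print(f"blocked on {blocked_pct:.2f}% of {total_acquisitions} acquisitions")
```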

In the process of doing all of this, I discovered that I had neglected
to update GetLockConflicts() and, consequently, fastlock-v2 is broken
insofar as CREATE INDEX CONCURRENTLY and Hot Standby are concerned.  I
will fix that and post an updated version; and I'll also post the
follow-on patch to accelerate the VXID locks at that time.  In the
meantime, I would appreciate any review or testing of the remainder of
the patch.

 If we combine this patch with Koichi-san's recommended changes to the
 number of lock partitions, we will have considerable impact for 9.1.
 Robert will still get his day in the sun, just with 9.2.

At this point, I am of the view that there is little point in
raising the number of lock partitions.  If you are doing very simple
SELECT statements across a large number of tables, then increasing the
number of lock partitions will help.  On read-write workloads, there's
really no benefit, because WALInsertLock contention is the bottleneck.
 And on read-only workloads that only touch one or a handful of
tables, the individual lock manager partitions where the locks fall
get very hot regardless of how many partitions you have.  Now that
does still leave some space for improvement - specifically, lots of
tables, read-only or read-mostly - but the fast-relation-lock and
fast-vxid-lock stuff will address those bottlenecks far more
thoroughly.  And increasing the number of lock partitions also has a
downside: it will slow down end-of-transaction cleanup, which is
already an area where we know we have problems.
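
The point about hot partitions can be illustrated with a toy model.
The sketch below is not PostgreSQL code: the tag format, the use of
crc32 as the hash, and the function names are all assumptions chosen
for illustration. It only shows that a lock on a single table always
maps to the same partition, however many partitions exist.

```python
import zlib
from collections import Counter


def partition_for(lock_tag: bytes, num_partitions: int) -> int:
    """Map a lock tag to a partition; crc32 stands in for the real hash."""
    return zlib.crc32(lock_tag) % num_partitions


def partition_traffic(table_ids, num_partitions):
    """Count lock requests per partition for a stream of table accesses."""
    traffic = Counter()
    for table_id in table_ids:
        tag = f"relation:{table_id}".encode()  # hypothetical tag layout
        traffic[partition_for(tag, num_partitions)] += 1
    return traffic


one_hot_table = [1] * 10_000        # every query touches the same table
many_tables = list(range(10_000))   # every query touches a different table

for num_partitions in (16, 64):
    hot = partition_traffic(one_hot_table, num_partitions)
    spread = partition_traffic(many_tables, num_partitions)
    print(
        f"{num_partitions} partitions: hottest partition sees "
        f"{max(hot.values())} requests for one table, "
        f"{max(spread.values())} when spread over many tables"
    )
```

In this model, adding partitions thins out the load only in the
many-tables case; the single hot table keeps sending every request to
one partition.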

There might be some point in raising the number of buffer mapping
partitions, but I don't know how to create a test case where it's
actually material, especially without the fastlock stuff.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Simon Riggs
On Tue, Jun 7, 2011 at 9:52 PM, Robert Haas robertmh...@gmail.com wrote:

 If we were to fix ONLY the vxid issue in 9.1 as you were advocating

Sensible debate is impossible when you don't read what I've written.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Robert Haas
On Tue, Jun 7, 2011 at 5:43 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On Tue, Jun 7, 2011 at 9:52 PM, Robert Haas robertmh...@gmail.com wrote:
 If we were to fix ONLY the vxid issue in 9.1 as you were advocating

 Sensible debate is impossible when you don't read what I've written.

I've read every word you've written on this thread.  Much of it,
multiple times.  I am unclear what we are arguing about.  I don't want
to have a debate.  I want to figure out what works, and do it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Josh Berkus
On 6/7/11 1:11 PM, Simon Riggs wrote:
 that appear low risk.  I seriously doubt that I would consider *any*
  meaningful change in the locking area to be low risk.
 That's a shame. We'll fix it in 9.2 then.

I will point out that we bounced Alvaro's FK patch, which *was*
submitted in time for CF4, because of unknown locking impact.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: 9.1 release scheduling (was Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch)

2011-06-07 Thread Alvaro Herrera
Excerpts from Robert Haas's message of Tue Jun 07 13:53:23 -0400 2011:
 On Tue, Jun 7, 2011 at 1:45 PM, Thom Brown t...@linux.com wrote:
  Speaking of which, is it now safe to remove the "NOT VALID constraints
  don't dump properly" issue from the blocker list since the fix has
  been committed?
 
 I hope so, because I just did that (before noticing this email from you).

Yeah, pg_dump works in HEAD ... the bug now is that psql prints NOT
VALID twice.  Will fix.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Bruce Momjian
Robert Haas wrote:
 On Mon, Jun 6, 2011 at 10:49 AM, Simon Riggs si...@2ndquadrant.com wrote:
  My point was that we have in the past implemented performance changes
  to increase scalability at the last minute, and also that our personal
  risk perspectives are not always set in stone.
 
  Robert has highlighted the value of this change and it's clearly not
  beyond our wit to include it, even if it is beyond our will to do so.
 
 So, at the risk of totally derailing this thread -- what this boils
 down to is a philosophical disagreement.
 
 It seems to me (and, I think, to Tom and Heikki and others as well)
 that it's not possible to keep on making changes to the release right
 up until the last minute and then expect the release to be of high
 quality.  If we keep committing new features, then we'll keep
 introducing new bugs.  The only hope of making the bug count go down
 at some point is to stop making changes that aren't bug fixes.  We
 could come up with some complex procedure for determining whether a
 patch is important enough and non-invasive enough to bypass the normal
 deadline, but that would probably lead to a lot more arguing about
 procedure, and realistically, it's still going to increase the bug
 count at least somewhat.  IMHO, it's better to just have a deadline,
 and stuff either makes it or it doesn't.  I realize we haven't always
 adhered to the principle in the past, but at least IMV that's not a
 mistake we want to continue repeating.

Simon is right that we slipped the vxid patch into 8.3 when a Postgres
user I talked to at Linuxworld mentioned high vacuum freeze activity and
simple calculations showed the many read-only queries could cause high
xid usage.  Fortunately we already had a patch available and Tom applied
it during beta.  It was an existing patch that took on new urgency
during beta.

Robert's point above is that it isn't so much making the decision of
whether something should slip past the deadline, but the time-sapping
discussion of whether something should slip, and the frankly disturbing
behavior of some in this group to not accept a clear consensus,
therefore prolonging the discussion of slippage far longer than
necessary.

Basically, if you propose something, and it gets shot down due to
procedure, accept that unless you have some very good _new_ reason for
continuing the discussion.  If you don't like that, then you are not
going to do well in our group and maybe this isn't the group for you.  

I think we are going to need to be much more forceful about this, and if
the threat is that someone has commit rights and therefore we can't ignore
them, we will have to reconsider who can commit to this project.  Do I
need to be any clearer?

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Bruce Momjian
Bruce Momjian wrote:
 Simon is right that we slipped the vxid patch into 8.3 when a Postgres
 user I talked to at Linuxworld mentioned high vacuum freeze activity and
 simple calculations showed the many read-only queries could cause high
 xid usage.  Fortunately we already had a patch available and Tom applied
 it during beta.  It was an existing patch that took on new urgency
 during beta.
 
 Robert's point above is that it isn't so much making the decision of
 whether something should slip past the deadline, but the time-sapping
 discussion of whether something should slip, and the frankly disturbing
 behavior of some in this group to not accept a clear consensus,
 therefore prolonging the discussion of slippage far longer than
 necessary.
 
 Basically, if you propose something, and it gets shot down due to
 procedure, accept that unless you have some very good _new_ reason for
 continuing the discussion.  If you don't like that, then you are not
 going to do well in our group and maybe this isn't the group for you.  
 
 I think we are going to need to be much more forceful about this, and if
 the threat is that someone has commit rights and therefore we can't ignore
 them, we will have to reconsider who can commit to this project.  Do I
 need to be any clearer?

One more thing --- when Tom applied that patch during 8.3 beta it was
with everyone's agreement, so the policy should be that if we are going
to break the rules, everyone has to agree --- if anyone disagrees, the
rules stand.

In this case, several people felt early on that we should stick with the rules
--- at that point, there should have been no further discussion of
slipping things into 9.1.

Discussion takes energy, and discussing slipping things into 9.1 after
anyone objects is just wasting our valuable time.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Simon is right that we slipped the vxid patch into 8.3 when a Postgres
 user I talked to at Linuxworld mentioned high vacuum freeze activity and
 simple calculations showed the many read-only queries could cause high
 xid usage.  Fortunately we already had a patch available and Tom applied
 it during beta.  It was an existing patch that took on new urgency
 during beta.

Just to set the record straight on this ... the vxid patch went in on
2007-09-05:
http://archives.postgresql.org/pgsql-committers/2007-09/msg00026.php
which was a day shy of a month before we wrapped 8.3beta1:
http://archives.postgresql.org/pgsql-committers/2007-10/msg00089.php
so it was during alpha phase not beta.  And 8.3RC1 was stamped on
2008-01-03.  So Simon's assertion that this was days before we produced
a release candidate is correct, if you take days as 4 months.
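
The gap between the two dates quoted above can be checked with a few
lines of Python. This is only a sketch of the arithmetic; the variable
names are made up for illustration, and the dates are the ones given in
this message.

```python
from datetime import date

vxid_commit = date(2007, 9, 5)   # vxid patch committed, per the message
rc1_stamped = date(2008, 1, 3)   # 8.3RC1 stamped, per the message

# Number of days between the commit and the release candidate.
gap_days = (rc1_stamped - vxid_commit).days
print(gap_days)  # 120
```

That is 120 days, or roughly four months.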

regards, tom lane



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Heikki Linnakangas

On 06.06.2011 07:12, Robert Haas wrote:

I did some further investigation of this.  It appears that more than
99% of the lock manager lwlock traffic that remains with this patch
applied has locktag_type == LOCKTAG_VIRTUALTRANSACTION.  Every SELECT
statement runs in a separate transaction, and for each new transaction
we run VirtualXactLockTableInsert(), which takes a lock on the vxid of
that transaction, so that other processes can wait for it.  That
requires acquiring and releasing a lock manager partition lock, and we
have to do the same thing a moment later at transaction end to dump
the lock.

A quick grep seems to indicate that the only places where we actually
make use of those VXID locks are in DefineIndex(), when CREATE INDEX
CONCURRENTLY is in use, and during Hot Standby, when max_standby_delay
expires.  Considering that these are not commonplace events, it seems
tremendously wasteful to incur the overhead for every transaction.  It
might be possible to make the lock entry spring into existence on
demand - i.e. if a backend wants to wait on a vxid entry, it creates
the LOCK and PROCLOCK objects for that vxid.  That presents a few
synchronization challenges, and plus we have to make sure that the
backend that's just been given a lock knows that it needs to release
it, but those seem like they might be manageable problems, especially
given the new infrastructure introduced by the current patch, which
already has to deal with some of those issues.  I'll look into this
further.


Ah, I remember I saw that vxid lock pop up quite high in an oprofile 
profile recently. I think it was the case of executing a lot of very 
simple prepared queries. So it would be nice to address that, even from 
a single CPU point of view.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Simon Riggs
On Sat, Jun 4, 2011 at 5:55 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 The approach looks sound to me. It's a fairly isolated patch and we
 should be considering this for inclusion in 9.1, not wait another
 year.

 That suggestion is completely insane.  The patch is only WIP and full of
 bugs, even according to its author.  Even if it were solid, it is way
 too late to be pushing such stuff into 9.1.  We're trying to ship a
 release, not find ways to cause it to slip more.

In 8.3, you implemented virtual transactionids days before we produced
a Release Candidate, against my recommendation.

At that time, I didn't start questioning your sanity. In fact we all
applauded that because it was a great performance gain.

The fact that you disagree with me does not make me insane. Inaction
on this point, resulting in a year's delay, will be considered to be a
gross waste by the majority of objective observers.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Heikki Linnakangas

On 06.06.2011 12:40, Simon Riggs wrote:

On Sat, Jun 4, 2011 at 5:55 PM, Tom Lane t...@sss.pgh.pa.us wrote:

Simon Riggs si...@2ndquadrant.com writes:

The approach looks sound to me. It's a fairly isolated patch and we
should be considering this for inclusion in 9.1, not wait another
year.


That suggestion is completely insane.  The patch is only WIP and full of
bugs, even according to its author.  Even if it were solid, it is way
too late to be pushing such stuff into 9.1.  We're trying to ship a
release, not find ways to cause it to slip more.


In 8.3, you implemented virtual transactionids days before we produced
a Release Candidate, against my recommendation.


FWIW, this bottleneck was not introduced by the introduction of virtual 
transaction ids. Before that patch, we just took the lock on the real 
transaction id instead.



The fact that you disagree with me does not make me insane.


You are not insane, even if your suggestion is.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Robert Haas
On Mon, Jun 6, 2011 at 2:54 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 Ah, I remember I saw that vxid lock pop up quite high in an oprofile profile
 recently. I think it was the case of executing a lot of very simple prepared
 queries. So it would be nice to address that, even from a single CPU point
 of view.

It doesn't seem too hard to do, although I have to think about the
details.  Even though the VXID locks involved are Exclusive locks,
they are actually very much like the weak locks that the current
patch accelerates, because the Exclusive lock is taken only by the
VXID owner, and it can therefore be safely assumed that the initial
lock acquisition won't block anything.  Therefore, it's really
unnecessary to touch the primary lock table at transaction start (and
we need to touch it at the end only if someone's waiting).  However, there's a
fly in the ointment: when someone tries to ShareLock a VXID, we need
to determine whether that VXID is still around and, if so, make an
Exclusive lock entry for it in the primary lock table.  And, unlike
what I'm doing for strong relation locks, it's probably NOT acceptable
for that to acquire and release every per-backend LWLock, because
every place that waits for VXID locks waits for a list of locks in
sequence, so we could end up with O(n^2) behavior.  Now, in theory
that's not a huge problem: the VXID includes the backend ID, so we
ought to be able to figure out which single per-backend LWLock is of
interest and just acquire/release that one.  Unfortunately, it appears
that there's no easy way to go from a backend ID to a PGPROC.  The
backend IDs are offsets into the ProcState array, so they give us a
pointer to the backend's sinval state, not its PGPROC.  And while the
PGPROC has a pointer to the sinval info, there's no pointer in the
opposite direction.  Even if there were, we'd probably need to hold
SInvalWriteLock in shared mode to follow it.
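
A minimal single-threaded sketch of the idea described above, written in
Python rather than C. It is a toy model, not PostgreSQL code: the names
Backend, ToyLockManager, wait_for and so on are hypothetical, and the
per-backend LWLock is only mentioned in a comment. It illustrates that
the owner never touches the shared table unless a waiter shows up, and
that a waiter only needs to look at the one backend named in the VXID.

```python
class Backend:
    """Toy per-backend state (hypothetical, not a PGPROC)."""

    def __init__(self, backend_id):
        self.backend_id = backend_id
        self.next_local_xid = 0
        self.current_vxid = None   # (backend_id, local_xid) while in a transaction
        self.published = False     # True once a waiter created the shared entry


class ToyLockManager:
    def __init__(self, n_backends):
        self.backends = {i: Backend(i) for i in range(1, n_backends + 1)}
        self.shared_table = {}         # stands in for the primary lock table
        self.shared_table_touches = 0  # how often the shared table was modified

    def start_transaction(self, backend_id):
        # Owner side: record the vxid privately; the shared table is untouched.
        b = self.backends[backend_id]
        b.next_local_xid += 1
        b.current_vxid = (backend_id, b.next_local_xid)
        b.published = False
        return b.current_vxid

    def wait_for(self, vxid):
        # Waiter side: the vxid names its backend, so only that one backend's
        # state is inspected (a real system would take that backend's lock here).
        b = self.backends[vxid[0]]
        if b.current_vxid != vxid:
            return False               # already finished, nothing to wait for
        if not b.published:
            self.shared_table[vxid] = b.backend_id  # owner's Exclusive entry
            self.shared_table_touches += 1
            b.published = True
        return True                    # caller would now block on the entry

    def end_transaction(self, backend_id):
        # Owner side: clean up the shared table only if a waiter published us.
        b = self.backends[backend_id]
        if b.published:
            del self.shared_table[b.current_vxid]
            self.shared_table_touches += 1
            b.published = False
        b.current_vxid = None


mgr = ToyLockManager(n_backends=4)
for _ in range(1000):                  # many short transactions, no waiters
    mgr.start_transaction(1)
    mgr.end_transaction(1)
touches_without_waiters = mgr.shared_table_touches

vxid = mgr.start_transaction(2)
waiting = mgr.wait_for(vxid)           # a waiter shows up mid-transaction
mgr.end_transaction(2)
stale = mgr.wait_for(vxid)             # the transaction is gone by now
```

In this model the thousand waiter-free transactions leave the shared
table untouched, and the one waited-on transaction costs two touches
(one to publish the entry, one to remove it).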

That might not be the end of the world, since VXID locks are fairly
infrequently used, but it's certainly a little grotty.  I do rather
wonder if we should be trying to reduce the number of separate places
where we list the running processes.  We have arrays of PGPROC
structures, and then we have one set of pointers to PGPROCs in the
ProcArray, and then we have the ProcState structures for sinval.  I
wonder if there's some way to rearrange all this to simplify the
bookkeeping.

BTW, how do you identify from oprofile that *vxid* locks were the
problem?  I didn't think it could produce that level of detail.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Heikki Linnakangas

On 06.06.2011 07:12, Robert Haas wrote:

I did some further investigation of this.  It appears that more than
99% of the lock manager lwlock traffic that remains with this patch
applied has locktag_type == LOCKTAG_VIRTUALTRANSACTION.  Every SELECT
statement runs in a separate transaction, and for each new transaction
we run VirtualXactLockTableInsert(), which takes a lock on the vxid of
that transaction, so that other processes can wait for it.  That
requires acquiring and releasing a lock manager partition lock, and we
have to do the same thing a moment later at transaction end to dump
the lock.

A quick grep seems to indicate that the only places where we actually
make use of those VXID locks are in DefineIndex(), when CREATE INDEX
CONCURRENTLY is in use, and during Hot Standby, when max_standby_delay
expires.  Considering that these are not commonplace events, it seems
tremendously wasteful to incur the overhead for every transaction.  It
might be possible to make the lock entry spring into existence on
demand - i.e. if a backend wants to wait on a vxid entry, it creates
the LOCK and PROCLOCK objects for that vxid.  That presents a few
synchronization challenges, and plus we have to make sure that the
backend that's just been given a lock knows that it needs to release
it, but those seem like they might be manageable problems, especially
given the new infrastructure introduced by the current patch, which
already has to deal with some of those issues.  I'll look into this
further.


At the moment, the transaction with given vxid acquires an ExclusiveLock 
on the vxid, and anyone who wants to wait for it to finish acquires a 
ShareLock. If we simply reverse that, so that the transaction itself 
takes ShareLock, and anyone wanting to wait on it takes an ExclusiveLock, 
will this fastlock patch bust this bottleneck too?
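
As a rough illustration of why the reversal keeps the waiting
behaviour, here is a sketch in Python using a reduced conflict table
for just these two modes (ShareLock does not conflict with ShareLock;
ExclusiveLock conflicts with both). The helper would_block is a
made-up name for this example, not a lock manager function.

```python
SHARE = "ShareLock"
EXCLUSIVE = "ExclusiveLock"

# Reduced conflict table: requested mode -> held modes it conflicts with.
CONFLICTS = {
    SHARE: {EXCLUSIVE},
    EXCLUSIVE: {SHARE, EXCLUSIVE},
}


def would_block(requested, held_modes):
    """True if the requested mode conflicts with any mode already held."""
    return any(held in CONFLICTS[requested] for held in held_modes)


# Current scheme: owner holds Exclusive, waiter asks for Share.
current_scheme = would_block(SHARE, [EXCLUSIVE])
# Reversed scheme: owner holds Share, waiter asks for Exclusive.
reversed_scheme = would_block(EXCLUSIVE, [SHARE])
# Once the owner's transaction ends nothing is held, so the waiter proceeds.
after_owner_ends = would_block(EXCLUSIVE, [])
# Difference between the schemes: two waiters conflict with each other
# when they take Exclusive, but not when they take Share.
waiters_reversed = would_block(EXCLUSIVE, [EXCLUSIVE])
waiters_current = would_block(SHARE, [SHARE])
```

Either way the waiter blocks until the owner is done; the one visible
difference in this sketch is that reversed-scheme waiters would also
briefly conflict with one another.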


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Robert Haas
On Mon, Jun 6, 2011 at 8:02 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 06.06.2011 07:12, Robert Haas wrote:

 I did some further investigation of this.  It appears that more than
 99% of the lock manager lwlock traffic that remains with this patch
 applied has locktag_type == LOCKTAG_VIRTUALTRANSACTION.  Every SELECT
 statement runs in a separate transaction, and for each new transaction
 we run VirtualXactLockTableInsert(), which takes a lock on the vxid of
 that transaction, so that other processes can wait for it.  That
 requires acquiring and releasing a lock manager partition lock, and we
 have to do the same thing a moment later at transaction end to dump
 the lock.

 A quick grep seems to indicate that the only places where we actually
 make use of those VXID locks are in DefineIndex(), when CREATE INDEX
 CONCURRENTLY is in use, and during Hot Standby, when max_standby_delay
 expires.  Considering that these are not commonplace events, it seems
 tremendously wasteful to incur the overhead for every transaction.  It
 might be possible to make the lock entry spring into existence on
 demand - i.e. if a backend wants to wait on a vxid entry, it creates
 the LOCK and PROCLOCK objects for that vxid.  That presents a few
 synchronization challenges, and plus we have to make sure that the
 backend that's just been given a lock knows that it needs to release
 it, but those seem like they might be manageable problems, especially
 given the new infrastructure introduced by the current patch, which
 already has to deal with some of those issues.  I'll look into this
 further.

 At the moment, the transaction with given vxid acquires an ExclusiveLock on
 the vxid, and anyone who wants to wait for it to finish acquires a
 ShareLock. If we simply reverse that, so that the transaction itself takes
 ShareLock, and anyone wanting to wait on it takes an ExclusiveLock, will this
 fastlock patch bust this bottleneck too?

Not without some further twaddling.  Right now, the fast path only
applies when you are taking a lock < ShareUpdateExclusiveLock on an
unshared relation.  See also the email I just sent on why using the
exact same mechanism might not be such a hot idea.
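
To make that condition concrete, here is a small Python sketch. It
assumes the rule is "a relation lock weaker than
ShareUpdateExclusiveLock on an unshared relation", and it uses the
usual weakest-to-strongest numbering of the lock modes. The function
name and the tag strings are hypothetical and do not come from the
patch.

```python
# Lock modes numbered from weakest to strongest.
ACCESS_SHARE = 1
ROW_SHARE = 2
ROW_EXCLUSIVE = 3
SHARE_UPDATE_EXCLUSIVE = 4
SHARE = 5
SHARE_ROW_EXCLUSIVE = 6
EXCLUSIVE = 7
ACCESS_EXCLUSIVE = 8


def fast_path_eligible(tag_type, mode, is_shared_relation=False):
    """Assumed rule: weak lock on an unshared relation only."""
    return (
        tag_type == "relation"
        and not is_shared_relation
        and mode < SHARE_UPDATE_EXCLUSIVE
    )


plain_select = fast_path_eligible("relation", ACCESS_SHARE)
shared_catalog = fast_path_eligible("relation", ROW_EXCLUSIVE, is_shared_relation=True)
too_strong = fast_path_eligible("relation", SHARE_UPDATE_EXCLUSIVE)
# Reversing the vxid modes would have the owner take ShareLock on a
# virtual transaction tag, which fails both parts of the assumed rule.
vxid_owner_reversed = fast_path_eligible("virtualtransaction", SHARE)
```

Under these assumptions only the first case qualifies, which is why
swapping the modes alone would not be enough.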

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Heikki Linnakangas

On 06.06.2011 14:59, Robert Haas wrote:

BTW, how do you identify from oprofile that *vxid* locks were the
problem?  I didn't think it could produce that level of detail.


It can show the call stack of each call, with --callgraph=n option, 
where you can see what percentage of the calls to LockAcquire come from 
VirtualXactLockTableInsert.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Simon Riggs
On Mon, Jun 6, 2011 at 11:19 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 06.06.2011 12:40, Simon Riggs wrote:

 On Sat, Jun 4, 2011 at 5:55 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Simon Riggs si...@2ndquadrant.com writes:

 The approach looks sound to me. It's a fairly isolated patch and we
 should be considering this for inclusion in 9.1, not wait another
 year.

 That suggestion is completely insane.  The patch is only WIP and full of
 bugs, even according to its author.  Even if it were solid, it is way
 too late to be pushing such stuff into 9.1.  We're trying to ship a
 release, not find ways to cause it to slip more.

 In 8.3, you implemented virtual transactionids days before we produced
 a Release Candidate, against my recommendation.

 FWIW, this bottleneck was not introduced by the introduction of virtual
 transaction ids. Before that patch, we just took the lock on the real
 transaction id instead.

Of course it wasn't. You've misunderstood completely.

My point was that we have in the past implemented performance changes
to increase scalability at the last minute, and also that our personal
risk perspectives are not always set in stone.

Robert has highlighted the value of this change and it's clearly not
beyond our wit to include it, even if it is beyond our will to do so.


 The fact that you disagree with me does not make me insane.

 You are not insane, even if your suggestion is.

LOL. Your logic is still poor though. :-)

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Robert Haas
On Mon, Jun 6, 2011 at 10:49 AM, Simon Riggs si...@2ndquadrant.com wrote:
 My point was that we have in the past implemented performance changes
 to increase scalability at the last minute, and also that our personal
 risk perspectives are not always set in stone.

 Robert has highlighted the value of this change and it's clearly not
 beyond our wit to include it, even if it is beyond our will to do so.

So, at the risk of totally derailing this thread -- what this boils
down to is a philosophical disagreement.

It seems to me (and, I think, to Tom and Heikki and others as well)
that it's not possible to keep on making changes to the release right
up until the last minute and then expect the release to be of high
quality.  If we keep committing new features, then we'll keep
introducing new bugs.  The only hope of making the bug count go down
at some point is to stop making changes that aren't bug fixes.  We
could come up with some complex procedure for determining whether a
patch is important enough and non-invasive enough to bypass the normal
deadline, but that would probably lead to a lot more arguing about
procedure, and realistically, it's still going to increase the bug
count at least somewhat.  IMHO, it's better to just have a deadline,
and stuff either makes it or it doesn't.  I realize we haven't always
adhered to the principle in the past, but at least IMV that's not a
mistake we want to continue repeating.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 IMHO, it's better to just have a deadline, and stuff either makes
 it or it doesn't.  I realize we haven't always adhered to the
 principle in the past, but at least IMV that's not a mistake we
 want to continue repeating.
 
+1
 
I've said it before, but I think it bears repeating, that deferring
this to 9.2 doesn't mean that it comes out in a production release
12 months later -- unless we continue to repeat this mistake
endlessly.  It means that this release comes out closer to when we
said it would -- for the sake of argument let's hypothesize one
month.  So by holding the line on such inclusions all the current
9.1 features come out one month sooner, and this feature comes out
11 months later than it would have if we'd put it into 9.1.  With
some feature we consider squeezing in, it would be more like
delaying everything which is done by three months so that one
feature gets out nine months earlier.
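
The trade-off can be put into rough numbers with a short Python
sketch. It assumes a 12-month release cycle and weights every feature
equally, which is a simplification; the function name and the count of
20 finished features are made up for illustration.

```python
def net_feature_months(done_features, slip_months, cycle_months=12):
    """Feature-months gained by holding the line rather than slipping."""
    gained = done_features * slip_months  # finished features ship sooner
    lost = cycle_months - slip_months     # the deferred feature ships later
    return gained - lost


break_even = net_feature_months(done_features=11, slip_months=1)
one_month_slip = net_feature_months(done_features=20, slip_months=1)
three_month_slip = net_feature_months(done_features=20, slip_months=3)
print(break_even, one_month_slip, three_month_slip)  # 0 9 51
```

With a one-month slip, holding the line comes out ahead once more than
11 finished features are waiting; with a three-month slip the margin
grows quickly.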
 
Perhaps the best way to describe the suggestion that this be
included in 9.1 isn't that it's an insane suggestion; but that it's
a suggestion which, if adopted, would be likely to drive those who
are striving for a more organized development and release process
insane.
 
Or one could look at it in a cost/benefit format -- major features
delivered per year go up by holding the line, administrative costs
are reduced, and people who are focusing on release stability get
more months per year to do development.
 
-Kevin



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Simon Riggs
On Mon, Jun 6, 2011 at 5:14 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:

 Perhaps the best way to describe the suggestion that this be
 included in 9.1 isn't that it's an insane suggestion; but that it's
 a suggestion which, if adopted, would be likely to drive those who
 are striving for a more organized development and release process
 insane.

Kevin, I respect your opinion and thank you for stating your case
without insults.

In this discussion it should be recognised that I have personally
driven the development of a more organized dev and release process. I
requested and argued for stated release dates to assist resource
planning and suggested commitfests as a mechanism to reduce the
feedback times for developers. I also provided the first guide to
patch reviews we published. So I am a proponent of planning and
organization, though some would like to claim I see things
differently.

The major problems of the dev process are now solved, yet more
organization is still being discussed, as if more == better. What
I hear proposed is changed organization, and I am not certain that all
change == better in what I see as a leading example of how to
produce great software.

Releasing regularly is important, but not more important than
anything. Ever. Period. Trying to force that will definitely make you
mad, I can see. I request that people stop trying to enforce a process
so strictly that sensible and important change cannot take place when
needed.


 Or one could look at it in a cost/benefit format -- major features
 delivered per year go up by holding the line, administrative costs
 are reduced, and people who are focusing on release stability get
 more months per year to do development.

I do look at it in a cost/benefit format. The problem is the above
statement has nothing user-centric about it.

The cost to us is a few days work and the benefit is a whole year's
worth of increased performance for our user base, which has a hardware
equivalent well into the millions of dollars.

And that's ignoring the users that would've switched to using Postgres
earlier, and those who might leave because of competitive comparison.

I won't say any more about this because I am in no way a beneficiary
from this and even my opinion is given unpaid.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Josh Berkus

 That's an improvement of about ~3.5x.  According to the vmstat output,
 when running without the patch, the CPU state was about 40% idle.
 With the patch, it dropped down to around 6%.

Wow!  That's fantastic.

Jignesh, are you in a position to test any of Robert's work using DBT or
other benchmarks?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
   IMHO, it's better to just have a deadline,

Well, that's the fine point we're now talking about.

I still think that we should try to make the best release possible.
And if that means including changes at beta time because that's when
someone got around to doing them, so be it - well, they should really
be worth it.

So, to the question “do we want hard deadlines?” I think the answer is
“no”, to “do we need hard deadlines?”, my answer is still “no”, and to
the question “should this very change be considered this late?” my
answer is yes.

Because it really changes the game for PostgreSQL users.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Dave Page
On Mon, Jun 6, 2011 at 8:12 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 So, to the question “do we want hard deadlines?” I think the answer is
 “no”, to “do we need hard deadlines?”, my answer is still “no”, and to
 the question “should this very change be considered this late?” my
 answer is yes.

 Because it really changes the game for PostgreSQL users.

Much as I hate to say it (I too want to keep our schedule as
predictable and organised as possible), I have to agree. Assuming the
patch is good, I think this is something we should push into 9.1. It
really could be a game changer.

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Stefan Kaltenbrunner
On 06/06/2011 09:24 PM, Dave Page wrote:
 On Mon, Jun 6, 2011 at 8:12 PM, Dimitri Fontaine dimi...@2ndquadrant.fr 
 wrote:
 So, to the question “do we want hard deadlines?” I think the answer is
 “no”, to “do we need hard deadlines?”, my answer is still “no”, and to
 the question “should this very change be considered this late?” my
 answer is yes.

 Because it really changes the game for PostgreSQL users.
 
 Much as I hate to say it (I too want to keep our schedule as
 predictable and organised as possible), I have to agree. Assuming the
 patch is good, I think this is something we should push into 9.1. It
 really could be a game changer.

I disagree - the proposed patch may provide a very significant
improvement for a certain workload type (nothing less, but nothing more),
but it was posted way after -BETA and I'm not sure we yet understand all
the implications of the changes.
We also have to consider that the underlying issues have been known problems
for multiple years^releases, so I don't think there is a particular rush
to force them into a particular release (as in 9.1).


Stefan



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Stephen Frost
* Dave Page (dp...@pgadmin.org) wrote:
 Much as I hate to say it (I too want to keep our schedule as
 predictable and organised as possible), I have to agree. Assuming the
 patch is good, I think this is something we should push into 9.1. It
 really could be a game changer.

So, with folks putting up that we should hammer this patch out and
force it into 9.1..  What should our new release date for 9.1 be?  What
about other patches that didn't make it into 9.1?  What about the
upcoming CommitFest that we've asked people to start working on?

If we're going to start putting in changes like this, I'd suggest that
we try and target something like September for 9.1 to actually be
released.  Playing with the lock management isn't something we want to
be doing lightly and I think we definitely need to have serious testing
of this, similar to what has been done for the SSI changes, before we're
going to be able to release it.

I don't agree that we should delay 9.1, but if people really want this
in, then we need to figure out what the new schedule is going to be.

Thanks,

Stephen




Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Dave Page
On Mon, Jun 6, 2011 at 8:40 PM, Stefan Kaltenbrunner
ste...@kaltenbrunner.cc wrote:
 On 06/06/2011 09:24 PM, Dave Page wrote:
 On Mon, Jun 6, 2011 at 8:12 PM, Dimitri Fontaine dimi...@2ndquadrant.fr 
 wrote:
 So, to the question “do we want hard deadlines?” I think the answer is
 “no”, to “do we need hard deadlines?”, my answer is still “no”, and to
 the question “should this very change be considered this late?” my
 answer is yes.

 Because it really changes the game for PostgreSQL users.

 Much as I hate to say it (I too want to keep our schedule as
 predictable and organised as possible), I have to agree. Assuming the
 patch is good, I think this is something we should push into 9.1. It
 really could be a game changer.

 I disagree - the proposed patch may provide a very significant
 improvement for a certain workload type (nothing less, but nothing more),
 but it was posted way after -BETA and I'm not sure we yet understand all
 implications of the changes.

We certainly need to be happy with the implications if we were to make
such a decision.

 We also have to consider that the underlying issues are known problems
 for multiple years^releases so I don't think there is a particular rush
 to force them into a particular release (as in 9.1).

No, there's no *technical* reason we need to do this, as there would
be if it were a bug fix for example. I would just like to see us
narrow the gap with our competitors sooner rather than later, *if*
we're a) happy with the change, and b) we're talking about a minimal
delay (which we may be - Robert says he thinks the patch is good, so
with another review and beta testing).

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Josh Berkus
On 6/6/11 12:12 PM, Dimitri Fontaine wrote:
 So, to the question “do we want hard deadlines?” I think the answer is
 “no”, to “do we need hard deadlines?”, my answer is still “no”, and to
 the question “should this very change be considered this late?” my
 answer is yes.

I could not disagree more strongly.  We're in *beta* now.  It's not like
the last CF closed a couple weeks ago.  Heck, I'm about to open the
first CF for 9.2 in just over a week.

Also, a patch like this needs several months of development, discussion
and testing in order to fix the issues Robert already identified and
make sure it doesn't break something fundamental to concurrency.  That
would mean delaying the release until at least November, screwing over
all the users who don't care about this patch.

There will *always* be another really cool patch.  If we keep delaying
release to get in one more patch, then we never release.  At some point
you just have to take what you have and call it a release, and we are
months past that point.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Andrew Dunstan



On 06/06/2011 03:24 PM, Dave Page wrote:

On Mon, Jun 6, 2011 at 8:12 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:

So, to the question “do we want hard deadlines?” I think the answer is
“no”, to “do we need hard deadlines?”, my answer is still “no”, and to
the question “should this very change be considered this late?” my
answer is yes.

Because it really changes the game for PostgreSQL users.

Much as I hate to say it (I too want to keep our schedule as
predictable and organised as possible), I have to agree. Assuming the
patch is good, I think this is something we should push into 9.1. It
really could be a game changer.



I'm not a fan of hard and fast deadlines for releases - it puts too much 
pressure on us to release before we might be ready. But I'm also not a 
fan of totally abandoning our established processes, which accepting 
this would amount to. I don't mind bending the rules a bit occasionally; I do mind 
throwing them out the door.


cheers

andrew




Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Dave Page
On Mon, Jun 6, 2011 at 8:44 PM, Stephen Frost sfr...@snowman.net wrote:
 * Dave Page (dp...@pgadmin.org) wrote:
 Much as I hate to say it (I too want to keep our schedule as
 predictable and organised as possible), I have to agree. Assuming the
 patch is good, I think this is something we should push into 9.1. It
 really could be a game changer.

 So, with folks putting up that we should hammer this patch out and
 force it into 9.1..  What should our new release date for 9.1 be?  What
 about other patches that didn't make it into 9.1?  What about the
 upcoming CommitFest that we've asked people to start working on?

 If we're going to start putting in changes like this, I'd suggest that
 we try and target something like September for 9.1 to actually be
 released.  Playing with the lock management isn't something we want to
 be doing lightly and I think we definitely need to have serious testing
 of this, similar to what has been done for the SSI changes, before we're
 going to be able to release it.

Completely aside from the issue at hand, aren't we looking at a
September release by now anyway (assuming we have to void late
July/August as we usually do)?


-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Jignesh Shah
On Mon, Jun 6, 2011 at 2:49 PM, Josh Berkus j...@agliodbs.com wrote:

 That's an improvement of about ~3.5x.  According to the vmstat output,
 when running without the patch, the CPU state was about 40% idle.
 With the patch, it dropped down to around 6%.

 Wow!  That's fantastic.

 Jignesh, are you in a position to test any of Robert's work using DBT or
 other benchmarks?

 --
 Josh Berkus
 PostgreSQL Experts Inc.
 http://pgexperts.com



I missed the discussion. Can you send me the patch (will that work
with 9.1 beta?)? I can do a before and after with DBT2 and let you
know.
I can also test it with the sysbench read test, which also has a
relation locking bottleneck.

Thanks.

Regards,
Jignesh



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Christopher Browne
On Mon, Jun 6, 2011 at 5:13 PM, Simon Riggs si...@2ndquadrant.com wrote:
 The cost to us is a few days work and the benefit is a whole year's
 worth of increased performance for our user base, which has a hardware
 equivalent well into the millions of dollars.

I doubt that this is an accurate reflection of the cost.

What was presented by Robert Haas was a proof of concept, and he
pointed out that it had numerous problems.  To requote:

There are numerous problems with the code as it stands at this point.
It crashes if you try to use 2PC, which means the regression tests
fail; it probably does horrible things if you run out of shared
memory; pg_locks knows nothing about the new mechanism (arguably, we
could leave it that way: only locks that can't possibly be conflicting
with anything can be taken using this mechanism, but it would be nice
to fix, I think); and there are likely some other gotchas as well.

Turning this into something ready for production deployment in 9.1
would require a non-trivial amount of additional effort, and would
likely have the adverse effect of deferring the release of 9.1, as
well as of further deferring all the effects of the patches submitted
for the latest commitfest
(https://commitfest.postgresql.org/action/commitfest_view?id=10),
since this defers release of 9.2, as well.

While the patch is a fine one, in that it has interesting effects, it
seems like a way wiser idea to me to let it go through the 9.2
process, so that it has 6 months worth of buildfarm runs before it
gets deployed for real just like all the other items in the 2011-06
CommitFest.

Note that it may lead to further discoveries, so that perhaps, in the
9.2 series, we'd see further improvements due to things that are
discovered as further consequence of testing
https://commitfest.postgresql.org/action/patch_view?id=572.
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, How would the Lone Ranger handle this?



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Robert Haas
On Mon, Jun 6, 2011 at 3:59 PM, Christopher Browne cbbro...@gmail.com wrote:
 On Mon, Jun 6, 2011 at 5:13 PM, Simon Riggs si...@2ndquadrant.com wrote:
 The cost to us is a few days work and the benefit is a whole year's
 worth of increased performance for our user base, which has a hardware
 equivalent well into the millions of dollars.

 I doubt that this is an accurate reflection of the cost.

 What was presented by Robert Haas was a proof of concept, and he
 pointed out that it had numerous problems.  To requote:

 There are numerous problems with the code as it stands at this point.
 It crashes if you try to use 2PC, which means the regression tests
 fail; it probably does horrible things if you run out of shared
 memory; pg_locks knows nothing about the new mechanism (arguably, we
 could leave it that way: only locks that can't possibly be conflicting
 with anything can be taken using this mechanism, but it would be nice
 to fix, I think); and there are likely some other gotchas as well.

The latest version of the patch is in much better shape:

http://archives.postgresql.org/pgsql-hackers/2011-06/msg00403.php

But this is not intended as disparagement for the balance of your argument.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Kevin Grittner
Stephen Frost sfr...@snowman.net wrote:
 
 if people really want this in, then we need to figure out what the
 new schedule is going to be.
 
I suggest June, 2012.  That way we can get a whole bunch more really
cool patches in, and the users won't have to wait for 9.2 to get
them.
 
-Kevin



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Simon Riggs
On Mon, Jun 6, 2011 at 8:52 PM, Dave Page dp...@pgadmin.org wrote:
 On Mon, Jun 6, 2011 at 8:44 PM, Stephen Frost sfr...@snowman.net wrote:
 * Dave Page (dp...@pgadmin.org) wrote:
 Much as I hate to say it (I too want to keep our schedule as
 predictable and organised as possible), I have to agree. Assuming the
 patch is good, I think this is something we should push into 9.1. It
 really could be a game changer.

 So, with folks putting up that we should hammer this patch out and
 force it into 9.1..  What should our new release date for 9.1 be?  What
 about other patches that didn't make it into 9.1?  What about the
 upcoming CommitFest that we've asked people to start working on?

 If we're going to start putting in changes like this, I'd suggest that
 we try and target something like September for 9.1 to actually be
 released.  Playing with the lock management isn't something we want to
 be doing lightly and I think we definitely need to have serious testing
 of this, similar to what has been done for the SSI changes, before we're
 going to be able to release it.

 Completely aside from the issue at hand, aren't we looking at a
 September release by now anyway (assuming we have to void late
 July/August as we usually do)?

I see no reason to delay from a July release as has long been planned.

What open items are genuine blockers?

If we need deadlines anywhere it's in beta and final release, otherwise
we all just sit around shrugging and saying another week I guess.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Alvaro Herrera
Excerpts from Dimitri Fontaine's message of lun jun 06 15:12:54 -0400 2011:

 So, to the question “do we want hard deadlines?” I think the answer is
 “no”, to “do we need hard deadlines?”, my answer is still “no”, and to
 the question “should this very change be considered this late?” my
 answer is yes.
 
 Because it really changes the game for PostgreSQL users.

Maybe so, but the problem is that the patch is really WIP at this point
and it obviously still needs a lot of work, judging from the patch
author's comments.

I note that if 2nd Quadrant is interested in having a game-changing
platform without having to wait a full year for 9.2, they can obviously
distribute a modified version of Postgres that integrates Robert's
patch.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Alvaro Herrera
Excerpts from Robert Haas's message of vie jun 03 09:17:08 -0400 2011:
 I've now spent enough time working on this issue now to be convinced
 that the approach has merit, if we can work out the kinks.  I'll start
 with some performance numbers.

I hereby recommend that people with patches such as this one refrain
from posting them during the last weeks before a release, and wait
until the release has actually taken place.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Tom Lane
Dave Page dp...@pgadmin.org writes:
 On Mon, Jun 6, 2011 at 8:44 PM, Stephen Frost sfr...@snowman.net wrote:
 If we're going to start putting in changes like this, I'd suggest that
 we try and target something like September for 9.1 to actually be
 released.  Playing with the lock management isn't something we want to
 be doing lightly and I think we definitely need to have serious testing
 of this, similar to what has been done for the SSI changes, before we're
 going to be able to release it.

 Completely aside from the issue at hand, aren't we looking at a
 September release by now anyway (assuming we have to void late
 July/August as we usually do)?

Very possibly.  So if we add this in, we're talking November or December
instead of September.  You can't argue that July/August will be lost
time for one development path but not another.

regards, tom lane



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Robert Haas
On Mon, Jun 6, 2011 at 6:53 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 Excerpts from Robert Haas's message of vie jun 03 09:17:08 -0400 2011:
 I've now spent enough time working on this issue now to be convinced
 that the approach has merit, if we can work out the kinks.  I'll start
 with some performance numbers.

 I hereby recommend that people with patches such as this one refrain
 from posting them during the last weeks before a release, and wait
 until the release has actually taken place.

%@#!

Next time I'll be sure to only post my patches during beta if they suck.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Stephen Frost
* Simon Riggs (si...@2ndquadrant.com) wrote:
 I see no reason to delay from a July release as has long been planned.
 
 What open items are genuine blockers?
 
 If we need deadlines anywhere it's in beta and final release, otherwise
 we all just sit around shrugging and saying another week I guess.

I'm a bit confused by your response here.  Clearly, if we're going to
try and get this patch cleaned up and committable, then it's an open
item and a genuine blocker with a couple of months of work associated
with it.  If we don't try to shove this patch in then perhaps we can
get a release out in the next month or so.  It was my understanding that
we're in beta and final release right now, and we're trying to hit
deadlines now which are associated with that.  Adding this patch into
the queue of things to be done before release moves us back out of
the beta testing and final release stage.

In other words, if you're arguing to stick to a release soon then it
doesn't make sense, to me anyway, to advocate applying a mostly
untested patch which changes a great deal of very important core logic.

Thanks,

Stephen




Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-06 Thread Jignesh Shah
On Mon, Jun 6, 2011 at 2:49 PM, Josh Berkus j...@agliodbs.com wrote:

 That's an improvement of about ~3.5x.  According to the vmstat output,
 when running without the patch, the CPU state was about 40% idle.
 With the patch, it dropped down to around 6%.

 Wow!  That's fantastic.

 Jignesh, are you in a position to test any of Robert's work using DBT or
 other benchmarks?

 --
 Josh Berkus
 PostgreSQL Experts Inc.
 http://pgexperts.com



Okay I tried it out with sysbench read scaling test..
Note I had tried that earlier on 9.0
http://jkshah.blogspot.com/2010/11/postgresql-90-simple-select-scaling.html

And on that test I found that running it on anything bigger than
4 cores led to decreased performance.
Redoing the same test with 100 users on a 4 vCPU virtual machine with
8GB RAM and 1M rows, I get
    transactions: 17870082 (59566.46 per sec.)
which is in line with the best number on 9.0.
This test hardly had any idle CPUs.

However, where it made a huge impact was doing the same test on my 8
vCPU VM with 8GB RAM, where I get
    transactions: 33274594 (110914.85 per sec.)

which is a whopping 1.8x increase in throughput for 2x the vCPUs (from
4 to 8). Idle CPU was less than 7%, which, considering that the useful
work is in line with my expectations, is really impressive.
(Plus, the last time I ran MySQL it was at around 95K or so for the
same test.)
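
The scaling claim above can be checked directly from the two sysbench
figures. A minimal sketch; the variable names are illustrative, and
"efficiency" here simply means throughput gain divided by the increase
in vCPUs:

```python
# Transactions per second reported above for the two VM sizes.
tps_4_vcpu = 59566.46
tps_8_vcpu = 110914.85

# Throughput gain when doubling the vCPU count from 4 to 8.
scaling = tps_8_vcpu / tps_4_vcpu

# Fraction of ideal (linear) scaling actually achieved.
efficiency = scaling / (8 / 4)

print(f"throughput gain:    {scaling:.2f}x")
print(f"scaling efficiency: {efficiency:.0%}")
```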

Also note that in my earlier case 60K was the max irrespective of the
hardware I threw at it.. For this fastlock patch that does not seem to
be the problem anymore :-)

This gain is impressive..

Next step DBT-2..

Regards,
Jignesh



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-05 Thread Stefan Kaltenbrunner
On 06/03/2011 03:17 PM, Robert Haas wrote:
[...]
 
 As you can see, this works out to a bit more than a 4% improvement on
 this two-core box.  I also got access (thanks to Nate Boley) to a
 24-core box and ran the same test with scale factor 100 and
 shared_buffers=8GB.  Here are the results of alternating runs without
 and with the patch on that machine:
 
 tps = 36291.996228 (including connections establishing)
 tps = 129242.054578 (including connections establishing)
 tps = 36704.393055 (including connections establishing)
 tps = 128998.648106 (including connections establishing)
 tps = 36531.208898 (including connections establishing)
 tps = 131341.367344 (including connections establishing)
 
 That's an improvement of about ~3.5x.  According to the vmstat output,
 when running without the patch, the CPU state was about 40% idle.
 With the patch, it dropped down to around 6%.
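
The ~3.5x figure can be reproduced from the six tps values quoted
above. A minimal sketch, assuming the runs alternate unpatched/patched
in the order listed, as the quoted text says; the variable names are
illustrative:

```python
# tps values from the alternating runs on the 24-core box,
# in the order quoted: unpatched, patched, unpatched, patched, ...
runs = [36291.996228, 129242.054578, 36704.393055,
        128998.648106, 36531.208898, 131341.367344]

unpatched = runs[0::2]   # runs 1, 3, 5
patched = runs[1::2]     # runs 2, 4, 6

mean_unpatched = sum(unpatched) / len(unpatched)
mean_patched = sum(patched) / len(patched)
speedup = mean_patched / mean_unpatched

print(f"unpatched mean: {mean_unpatched:.1f} tps")
print(f"patched mean:   {mean_patched:.1f} tps")
print(f"speedup:        {speedup:.2f}x")
```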

nice - but let's see on real hardware...

Testing this on a brand new E7-4850 4 Socket/10cores+HT Box - so 80
hardware threads:

first some numbers with -HEAD (-T 120; runtimes at lower -c counts have
fairly high variation in the results; the first column is the number of
connections/threads):


-j1:    tps = 7928.965493 (including connections establishing)
-j8:    tps = 53610.572347 (including connections establishing)
-j16:   tps = 80835.446118 (including connections establishing)
-j32:   tps = 75666.731883 (including connections establishing)
-j40:   tps = 74628.568388 (including connections establishing)
-j64:   tps = 68268.081973 (including connections establishing)
-c80:   tps = 66704.216166 (including connections establishing)

postgresql is completely lock limited in this test; anything beyond
around -j10 is basically unable to push the box below about 80% IDLE(!)


and now with the patch applied:

-j1:    tps = 7783.295587 (including connections establishing)
-j8:    tps = 44361.661947 (including connections establishing)
-j16:   tps = 92270.464541 (including connections establishing)
-j24:   tps = 108259.524782 (including connections establishing)
-j32:   tps = 183337.422612 (including connections establishing)
-j40:   tps = 209616.052430 (including connections establishing)
-j48:   tps = 229621.292382 (including connections establishing)
-j56:   tps = 218690.391603 (including connections establishing)
-j64:   tps = 188028.348501 (including connections establishing)
-j80:   tps = 118814.741609 (including connections establishing)


so much better - but I think there is still some headroom left,
although pgbench itself is a CPU hog in this benchmark, eating up
to 10 cores in the worst case - will retest with sysbench, which
in the past showed more reasonable CPU usage for me.
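
To make the two result sets easier to compare, here is a small sketch
that computes the patched/HEAD ratio at each client count present in
both runs, plus the peak-to-peak ratio. It assumes the row labels are
client counts and that the last HEAD row corresponds to 80 clients; the
dictionary names are illustrative:

```python
# tps by client count, copied from the two result sets above.
head = {1: 7928.965493, 8: 53610.572347, 16: 80835.446118,
        32: 75666.731883, 40: 74628.568388, 64: 68268.081973,
        80: 66704.216166}
patched = {1: 7783.295587, 8: 44361.661947, 16: 92270.464541,
           24: 108259.524782, 32: 183337.422612, 40: 209616.052430,
           48: 229621.292382, 56: 218690.391603, 64: 188028.348501,
           80: 118814.741609}

# Ratio at every client count that was measured in both runs.
for clients in sorted(head.keys() & patched.keys()):
    ratio = patched[clients] / head[clients]
    print(f"{clients:>3} clients: {ratio:.2f}x")

# Client counts at which each build peaks, and the peak-to-peak ratio.
best_head = max(head, key=head.get)
best_patched = max(patched, key=patched.get)
peak_ratio = patched[best_patched] / head[best_head]
print(f"peak: HEAD at {best_head}, patched at {best_patched}, "
      f"{peak_ratio:.2f}x")
```

Note that the low client counts (1 and 8) come out slightly below 1.0x,
which is consistent with the remark above about high run-to-run
variation at lower counts.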



and a profile (patched code) for the -j48 (aka fastest) case:

731535   11.8408  postgres s_lock
291878    4.7244  postgres LWLockAcquire
242373    3.9231  postgres AllocSetAlloc
239083    3.8698  postgres LWLockRelease
202341    3.2751  postgres SearchCatCache
190055    3.0763  postgres hash_search_with_hash_value
187148    3.0292  postgres base_yyparse
173265    2.8045  postgres GetSnapshotData
75700     1.2253  postgres core_yylex
74974     1.2135  postgres MemoryContextAllocZeroAligned
61404     0.9939  postgres _bt_compare
57529     0.9312  postgres MemoryContextAlloc


and one for the -j80 case(also patched).


485798   48.9667  postgres s_lock
60327 6.0808  postgres LWLockAcquire
57049 5.7503  postgres LWLockRelease
18357 1.8503  postgres hash_search_with_hash_value
17033 1.7169  postgres GetSnapshotData
14763 1.4881  postgres base_yyparse
14460 1.4575  postgres SearchCatCache
13975 1.4086  postgres AllocSetAlloc
6416  0.6467  postgres PinBuffer
5024  0.5064  postgres SIGetDataEntries
4704  0.4741  postgres core_yylex
4625  0.4662  postgres _bt_compare



Stefan



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-05 Thread Heikki Linnakangas

On 05.06.2011 22:04, Stefan Kaltenbrunner wrote:

and one for the -j80 case(also patched).


485798   48.9667  postgres s_lock
60327 6.0808  postgres LWLockAcquire
57049 5.7503  postgres LWLockRelease
18357 1.8503  postgres hash_search_with_hash_value
17033 1.7169  postgres GetSnapshotData
14763 1.4881  postgres base_yyparse
14460 1.4575  postgres SearchCatCache
13975 1.4086  postgres AllocSetAlloc
6416  0.6467  postgres PinBuffer
5024  0.5064  postgres SIGetDataEntries
4704  0.4741  postgres core_yylex
4625  0.4662  postgres _bt_compare


Hmm, does that mean that it's spending 50% of the time spinning on a 
spinlock? That's bad. It's one thing to be contended on a lock, and have 
a lot of idle time because of that, but it's even worse to spend a lot 
of time spinning because that CPU time won't be spent on doing more 
useful work, even if there is some other process on the system that 
could make use of that CPU time.


I like the overall improvement on the throughput, of course, but we have 
to find a way to avoid the busy-wait.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-05 Thread Stefan Kaltenbrunner
On 06/05/2011 09:12 PM, Heikki Linnakangas wrote:
 On 05.06.2011 22:04, Stefan Kaltenbrunner wrote:
 and one for the -j80 case(also patched).


 485798   48.9667  postgres s_lock
 60327 6.0808  postgres LWLockAcquire
 57049 5.7503  postgres LWLockRelease
 18357 1.8503  postgres hash_search_with_hash_value
 17033 1.7169  postgres GetSnapshotData
 14763 1.4881  postgres base_yyparse
 14460 1.4575  postgres SearchCatCache
 13975 1.4086  postgres AllocSetAlloc
 6416  0.6467  postgres PinBuffer
 5024  0.5064  postgres SIGetDataEntries
 4704  0.4741  postgres core_yylex
 4625  0.4662  postgres _bt_compare
 
 Hmm, does that mean that it's spending 50% of the time spinning on a
 spinlock? That's bad. It's one thing to be contended on a lock, and have
 a lot of idle time because of that, but it's even worse to spend a lot
 of time spinning because that CPU time won't be spent on doing more
 useful work, even if there is some other process on the system that
 could make use of that CPU time.

well yeah - we are broken right now with only being able to use ~20% of
CPU on a modern mid-range box, but using 80% CPU (or 4x like in the
above case) and only getting less than 2x the performance seems wrong as
well. I also wonder if we are still missing something fundamental -
because even with the current patch we are quite far away from linear
scaling and light-years from some of our competitors...


Stefan



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-05 Thread Robert Haas
On Sun, Jun 5, 2011 at 4:01 PM, Stefan Kaltenbrunner
ste...@kaltenbrunner.cc wrote:
 On 06/05/2011 09:12 PM, Heikki Linnakangas wrote:
 On 05.06.2011 22:04, Stefan Kaltenbrunner wrote:
 and one for the -j80 case(also patched).


 485798   48.9667  postgres                 s_lock
 60327     6.0808  postgres                 LWLockAcquire
 57049     5.7503  postgres                 LWLockRelease
 18357     1.8503  postgres                 hash_search_with_hash_value
 17033     1.7169  postgres                 GetSnapshotData
 14763     1.4881  postgres                 base_yyparse
 14460     1.4575  postgres                 SearchCatCache
 13975     1.4086  postgres                 AllocSetAlloc
 6416      0.6467  postgres                 PinBuffer
 5024      0.5064  postgres                 SIGetDataEntries
 4704      0.4741  postgres                 core_yylex
 4625      0.4662  postgres                 _bt_compare

 Hmm, does that mean that it's spending 50% of the time spinning on a
 spinlock? That's bad. It's one thing to be contended on a lock, and have
 a lot of idle time because of that, but it's even worse to spend a lot
 of time spinning because that CPU time won't be spent on doing more
 useful work, even if there is some other process on the system that
 could make use of that CPU time.

 well yeah - we are broken right now with only being able to use ~20% of
 CPU on a modern mid-range box, but using 80% CPU (or 4x like in the
 above case) and only getting less than 2x the performance seems wrong as
 well. I also wonder if we are still missing something fundamental -
 because even with the current patch we are quite far away from linear
 scaling and light-years from some of our competitors...

Could you compile with LWLOCK_STATS, rerun these tests, total up the
blk numbers by LWLockId, and post the results?  (Actually, totalling
up the shacq and exacq numbers would be useful as well, if you
wouldn't mind.)

Unless I very much miss my guess, we're going to see zero contention
on the new structures introduced by this patch.  Rather, I suspect
what we're going to find is that, with the hideous contention on one
particular lock manager partition lock removed, there's a more
spread-out contention problem, likely involving the lock manager
partition lock, the buffer mapping locks, and possibly other LWLocks
as well.  The fact that the system is busy-waiting rather than just
not using the CPU at all probably means that the remaining contention
is more spread out than that which is removed by this patch.  We don't
actually have everything pile up on a single LWLock (as happens in git
master), but we do spend a lot of time fighting cache lines away from
other CPUs.  Or at any rate, that's my guess: we need some real
numbers to know for sure.
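
For anyone wanting to do the totalling requested above, a minimal
sketch follows. It assumes each backend logs one line per lock
containing text of the form "lwlock N: shacq A exacq B blk C"; the
"PID" prefix and the sample lines are made up for illustration, and the
function name is hypothetical:

```python
import re
from collections import defaultdict

# Matches the per-lock counters wherever they appear in a log line.
LINE_RE = re.compile(r"lwlock (\d+): shacq (\d+) exacq (\d+) blk (\d+)")


def total_lwlock_stats(lines):
    """Sum shacq, exacq and blk per LWLock id across all backends."""
    totals = defaultdict(lambda: [0, 0, 0])
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a stats line
        lock_id, shacq, exacq, blk = map(int, match.groups())
        entry = totals[lock_id]
        entry[0] += shacq
        entry[1] += exacq
        entry[2] += blk
    return {lock_id: tuple(entry) for lock_id, entry in totals.items()}


# Synthetic sample input, NOT real measurements.
sample = [
    "PID 1001 lwlock 0: shacq 0 exacq 10 blk 2",
    "PID 1002 lwlock 0: shacq 0 exacq 7 blk 1",
    "PID 1001 lwlock 4: shacq 50 exacq 0 blk 0",
]

totals = total_lwlock_stats(sample)
for lock_id, (shacq, exacq, blk) in sorted(totals.items()):
    if blk > 0:  # only report locks that actually blocked
        print(f"lwlock {lock_id}: shacq {shacq} exacq {exacq} blk {blk}")
```

In practice the input would be the server log from the test run, for
example by passing an open file object instead of the sample list.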

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-05 Thread Robert Haas
On Sun, Jun 5, 2011 at 5:46 PM, Robert Haas robertmh...@gmail.com wrote:
 Could you compile with LWLOCK_STATS, rerun these tests, total up the
 blk numbers by LWLockId, and post the results?  (Actually, totalling
 up the shacq and exacq numbers would be useful as well, if you
 wouldn't mind.)

I did this on the loaner 24-core box from Nate Boley and got the
following results.  This is just the LWLocks that had blk>0.

lwlock 0: shacq 0 exacq 200625 blk 24044
lwlock 4: shacq 80101430 exacq 196 blk 28
lwlock 33: shacq 8333673 exacq 11977 blk 864
lwlock 34: shacq 7092293 exacq 11890 blk 803
lwlock 35: shacq 7893875 exacq 11909 blk 848
lwlock 36: shacq 7567514 exacq 11912 blk 830
lwlock 37: shacq 7427774 exacq 11930 blk 745
lwlock 38: shacq 7120108 exacq 11989 blk 853
lwlock 39: shacq 7584952 exacq 11982 blk 782
lwlock 40: shacq 7949867 exacq 12056 blk 821
lwlock 41: shacq 6612240 exacq 11929 blk 746
lwlock 42: shacq 47512112 exacq 11844 blk 4503
lwlock 43: shacq 7943511 exacq 11871 blk 878
lwlock 44: shacq 7534558 exacq 12033 blk 800
lwlock 45: shacq 7128256 exacq 12045 blk 856
lwlock 46: shacq 7575339 exacq 12015 blk 818
lwlock 47: shacq 6745173 exacq 12094 blk 806
lwlock 48: shacq 8410348 exacq 12104 blk 977
lwlock 49: shacq 0 exacq 5007594 blk 172533
lwlock 50: shacq 0 exacq 5011704 blk 172282
lwlock 51: shacq 0 exacq 5003356 blk 172802
lwlock 52: shacq 0 exacq 5009020 blk 174648
lwlock 53: shacq 0 exacq 5010808 blk 172080
lwlock 54: shacq 0 exacq 5004908 blk 169934
lwlock 55: shacq 0 exacq 5009324 blk 170281
lwlock 56: shacq 0 exacq 5005904 blk 171001
lwlock 57: shacq 0 exacq 5006984 blk 169942
lwlock 58: shacq 0 exacq 5000346 blk 170001
lwlock 59: shacq 0 exacq 5004884 blk 170484
lwlock 60: shacq 0 exacq 5006304 blk 171325
lwlock 61: shacq 0 exacq 5008421 blk 170866
lwlock 62: shacq 0 exacq 5008162 blk 170868
lwlock 63: shacq 0 exacq 5002238 blk 170291
lwlock 64: shacq 0 exacq 5005348 blk 169764
lwlock 307: shacq 0 exacq 2 blk 1
lwlock 315: shacq 0 exacq 3 blk 2
lwlock 337: shacq 0 exacq 4 blk 3
lwlock 345: shacq 0 exacq 2 blk 1
lwlock 349: shacq 0 exacq 2 blk 1
lwlock 231251: shacq 0 exacq 2 blk 1
lwlock 253831: shacq 0 exacq 2 blk 1

So basically, even with the patch, at 24 cores the lock manager locks
are still under tremendous pressure.  But note that there's a big
difference between what's happening here and what's happening without
the patch.  Here's without the patch:

lwlock 0: shacq 0 exacq 191613 blk 17591
lwlock 4: shacq 21543085 exacq 102 blk 20
lwlock 33: shacq 2237938 exacq 11976 blk 463
lwlock 34: shacq 1907344 exacq 11890 blk 458
lwlock 35: shacq 2125308 exacq 11908 blk 442
lwlock 36: shacq 2038220 exacq 11912 blk 430
lwlock 37: shacq 1998059 exacq 11927 blk 449
lwlock 38: shacq 1916179 exacq 11953 blk 409
lwlock 39: shacq 2042173 exacq 12019 blk 479
lwlock 40: shacq 2140002 exacq 12056 blk 448
lwlock 41: shacq 1776772 exacq 11928 blk 392
lwlock 42: shacq 12777368 exacq 11842 blk 2451
lwlock 43: shacq 2132240 exacq 11869 blk 478
lwlock 44: shacq 2026845 exacq 12031 blk 446
lwlock 45: shacq 1918618 exacq 12045 blk 449
lwlock 46: shacq 2038437 exacq 12011 blk 472
lwlock 47: shacq 1814660 exacq 12089 blk 401
lwlock 48: shacq 2261208 exacq 12105 blk 478
lwlock 49: shacq 0 exacq 1347524 blk 17020
lwlock 50: shacq 0 exacq 1350678 blk 16888
lwlock 51: shacq 0 exacq 1346260 blk 16744
lwlock 52: shacq 0 exacq 1348432 blk 16864
lwlock 53: shacq 0 exacq 22216779 blk 4914363
lwlock 54: shacq 0 exacq 22217309 blk 4525381
lwlock 55: shacq 0 exacq 1348406 blk 13438
lwlock 56: shacq 0 exacq 1345996 blk 13299
lwlock 57: shacq 0 exacq 1347890 blk 13654
lwlock 58: shacq 0 exacq 1343486 blk 13349
lwlock 59: shacq 0 exacq 1346198 blk 13471
lwlock 60: shacq 0 exacq 1346236 blk 13532
lwlock 61: shacq 0 exacq 1343688 blk 13547
lwlock 62: shacq 0 exacq 1350068 blk 13614
lwlock 63: shacq 0 exacq 1345302 blk 13420
lwlock 64: shacq 0 exacq 1348858 blk 13635
lwlock 321: shacq 0 exacq 2 blk 1
lwlock 329: shacq 0 exacq 4 blk 3
lwlock 337: shacq 0 exacq 6 blk 4
lwlock 347: shacq 0 exacq 5 blk 4
lwlock 357: shacq 0 exacq 3 blk 2
lwlock 363: shacq 0 exacq 3 blk 2
lwlock 369: shacq 0 exacq 4 blk 3
lwlock 379: shacq 0 exacq 2 blk 1
lwlock 383: shacq 0 exacq 2 blk 1
lwlock 445: shacq 0 exacq 2 blk 1
lwlock 449: shacq 0 exacq 2 blk 1
lwlock 451: shacq 0 exacq 2 blk 1
lwlock 1023: shacq 0 exacq 2 blk 1
lwlock 11401: shacq 0 exacq 2 blk 1
lwlock 115591: shacq 0 exacq 2 blk 1
lwlock 117177: shacq 0 exacq 2 blk 1
lwlock 362839: shacq 0 exacq 2 blk 1

In the unpatched case, two lock manager locks are getting beaten to
death, and the others are all about equally contended.  By eliminating the
portion of the lock manager contention that pertains specifically to
the two heavily trafficked locks, system throughput improves by about
3.5x - and, not surprisingly, traffic on the lock manager locks
increases by approximately the same multiple.  Those locks now become
the contention bottleneck, with about 12x the blocking they had
pre-patch.  
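
The two multiples mentioned above can be checked against the listings.
A minimal sketch using two of the locks that were not hot before the
patch (ids 49 and 55), and assuming, as the text implies, that ids 49-64
are the lock manager partition locks; the names are illustrative:

```python
# (exacq, blk) copied from the two listings above.
unpatched = {49: (1347524, 17020), 55: (1348406, 13438)}
patched = {49: (5007594, 172533), 55: (5009324, 170281)}

ratios = {}
for lock_id in sorted(unpatched):
    exacq_before, blk_before = unpatched[lock_id]
    exacq_after, blk_after = patched[lock_id]
    traffic = exacq_after / exacq_before   # growth in acquisitions
    blocking = blk_after / blk_before      # growth in blocked acquisitions
    ratios[lock_id] = (traffic, blocking)
    print(f"lwlock {lock_id}: traffic {traffic:.1f}x, "
          f"blocking {blocking:.1f}x")
```

The traffic multiple tracks the throughput gain, while blocking grows
several times faster, which is the pattern described above.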

Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-05 Thread Robert Haas
On Sun, Jun 5, 2011 at 10:16 PM, Robert Haas robertmh...@gmail.com wrote:
 I'm definitely interested in investigating what to do
 about that, but I don't think it's this patch's problem to fix all of
 our lock manager bottlenecks.

I did some further investigation of this.  It appears that more than
99% of the lock manager lwlock traffic that remains with this patch
applied has locktag_type == LOCKTAG_VIRTUALTRANSACTION.  Every SELECT
statement runs in a separate transaction, and for each new transaction
we run VirtualXactLockTableInsert(), which takes a lock on the vxid of
that transaction, so that other processes can wait for it.  That
requires acquiring and releasing a lock manager partition lock, and we
have to do the same thing a moment later at transaction end to dump
the lock.

A quick grep seems to indicate that the only places where we actually
make use of those VXID locks are in DefineIndex(), when CREATE INDEX
CONCURRENTLY is in use, and during Hot Standby, when max_standby_delay
expires.  Considering that these are not commonplace events, it seems
tremendously wasteful to incur the overhead for every transaction.  It
might be possible to make the lock entry spring into existence on
demand - i.e. if a backend wants to wait on a vxid entry, it creates
the LOCK and PROCLOCK objects for that vxid.  That presents a few
synchronization challenges, plus we have to make sure that the
backend that's just been given a lock knows that it needs to release
it, but those seem like they might be manageable problems, especially
given the new infrastructure introduced by the current patch, which
already has to deal with some of those issues.  I'll look into this
further.
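To make the potential saving concrete, here is a toy accounting model of the idea, not PostgreSQL code. It assumes that the current scheme costs two partition-lock acquisitions per transaction (one to insert the vxid lock, one to remove it), and that an on-demand scheme would pay a comparable cost only for transactions that some other backend actually waits on. The class and function names are hypothetical, and the model ignores the synchronization challenges mentioned above.

```python
class PartitionLockCounter:
    """Counts acquisitions of a (hypothetical) lock manager partition lock."""

    def __init__(self):
        self.acquisitions = 0

    def acquire_release(self):
        # One acquire/release cycle of the partition lock.
        self.acquisitions += 1


def eager_scheme(n_transactions, counter):
    """Every transaction inserts its vxid lock at start and removes it at end."""
    for _ in range(n_transactions):
        counter.acquire_release()  # insert the vxid lock entry
        counter.acquire_release()  # remove it at transaction end


def on_demand_scheme(n_transactions, waited_on, counter):
    """Only transactions that someone waits on get a lock table entry.

    waited_on is the set of transaction indexes that another backend
    waits for (an assumption of this model).
    """
    for xact in range(n_transactions):
        if xact in waited_on:
            counter.acquire_release()  # the waiter creates the entry on demand
            counter.acquire_release()  # the owner releases it at transaction end


eager = PartitionLockCounter()
eager_scheme(1000, eager)

lazy = PartitionLockCounter()
on_demand_scheme(1000, {10, 500}, lazy)

print(eager.acquisitions, lazy.acquisitions)
```

With waits as rare as CREATE INDEX CONCURRENTLY or a Hot Standby conflict, nearly all of the per-transaction partition-lock traffic disappears in this model.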

It's likely that if we lick this problem, the BufFreelistLock and
BufMappingLocks are going to be the next hot spot.  Of course, we're
ignoring the ten-thousand pound gorilla in the corner, which is that
on write workloads we have a pretty bad contention problem with
WALInsertLock, which I fear will not be so easily addressed.  But one
problem at a time, I guess.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-04 Thread Simon Riggs
On Fri, Jun 3, 2011 at 2:17 PM, Robert Haas robertmh...@gmail.com wrote:

 I've now spent enough time working on this issue to be convinced
 that the approach has merit, if we can work out the kinks.

Yes, the approach has merits and I'm sure we can work out the kinks.

 As you can see, this works out to a bit more than a 4% improvement on
 this two-core box.  I also got access (thanks to Nate Boley) to a
 24-core box and ran the same test with scale factor 100 and
 shared_buffers=8GB.  Here are the results of alternating runs without
 and with the patch on that machine:

 tps = 36291.996228 (including connections establishing)
 tps = 129242.054578 (including connections establishing)
 tps = 36704.393055 (including connections establishing)
 tps = 128998.648106 (including connections establishing)
 tps = 36531.208898 (including connections establishing)
 tps = 131341.367344 (including connections establishing)

 That's an improvement of about 3.5x.  According to the vmstat output,
 when running without the patch, the CPU state was about 40% idle.
 With the patch, it dropped down to around 6%.
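As a quick check of the arithmetic, the six tps figures quoted above can be averaged per configuration. This sketch assumes the runs alternate starting with the unpatched build, as the quoted text says ("without and with the patch").

```python
from statistics import mean

# tps figures from the quoted runs, in alternating order:
# without the patch, then with the patch.
unpatched = [36291.996228, 36704.393055, 36531.208898]
patched = [129242.054578, 128998.648106, 131341.367344]

# Ratio of mean throughput with the patch to mean throughput without it.
speedup = mean(patched) / mean(unpatched)
print(f"mean unpatched: {mean(unpatched):.1f} tps")
print(f"mean patched:   {mean(patched):.1f} tps")
print(f"speedup:        {speedup:.2f}x")
```

The ratio comes out a little above 3.5, consistent with the figure quoted.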

Congratulations. I believe that is realistic based upon my investigations.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-04 Thread Simon Riggs
On Sat, Jun 4, 2011 at 2:59 PM, Simon Riggs si...@2ndquadrant.com wrote:

 As you can see, this works out to a bit more than a 4% improvement on
 this two-core box.  I also got access (thanks to Nate Boley) to a
 24-core box and ran the same test with scale factor 100 and
 shared_buffers=8GB.  Here are the results of alternating runs without
 and with the patch on that machine:

 tps = 36291.996228 (including connections establishing)
 tps = 129242.054578 (including connections establishing)
 tps = 36704.393055 (including connections establishing)
 tps = 128998.648106 (including connections establishing)
 tps = 36531.208898 (including connections establishing)
 tps = 131341.367344 (including connections establishing)

 That's an improvement of about 3.5x.  According to the vmstat output,
 when running without the patch, the CPU state was about 40% idle.
 With the patch, it dropped down to around 6%.

 Congratulations. I believe that is realistic based upon my investigations.


Tom,

You should look at this. It's good.

The approach looks sound to me. It's a fairly isolated patch and we
should be considering this for inclusion in 9.1, not wait another
year.

I will happily add that it's a completely different approach to the one
I'd been working on and, even more happily, it is so different from the
Oracle approach that we are definitely unencumbered by patent issues here.
Well done Robert, Noah.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-04 Thread Heikki Linnakangas

On 04.06.2011 18:01, Simon Riggs wrote:

It's a fairly isolated patch and we
should be considering this for inclusion in 9.1, not wait another
year.


-1

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-04 Thread Kevin Grittner
Simon Riggs  wrote:
 
 we should be considering this for inclusion in 9.1, not wait
 another year.
 
-1
 
I'm really happy that we're addressing the problems with scaling to
a large number of cores, and this patch sounds great.  Adding a new
feature at this point in the release cycle would be horrible. 
Frankly, from the tone of Robert's post, it probably wouldn't be
appropriate to include it in a release if it showed up in this
condition at the start of the last CF for that release.
 
The nice thing about annual releases is there's never one too far
away -- unless, of course, we hold up a release to squeeze in
just one more feature.
 
-Kevin



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-04 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 The approach looks sound to me. It's a fairly isolated patch and we
 should be considering this for inclusion in 9.1, not wait another
 year.

That suggestion is completely insane.  The patch is only WIP and full of
bugs, even according to its author.  Even if it were solid, it is way
too late to be pushing such stuff into 9.1.  We're trying to ship a
release, not find ways to cause it to slip more.

regards, tom lane



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-03 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 That's an improvement of about 3.5x.
 
Outstanding!
 
I don't want to even peek at this until I've posted the two WIP SSI
patches (now both listed on the Open Items page), but will
definitely take a look after that.
 
-Kevin



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-03 Thread Robert Haas
On Fri, Jun 3, 2011 at 10:13 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Robert Haas robertmh...@gmail.com wrote:

 That's an improvement of about 3.5x.

 Outstanding!

 I don't want to even peek at this until I've posted the two WIP SSI
 patches (now both listed on the Open Items page), but will
 definitely take a look after that.

Yeah, those SSI items are important to get nailed down RSN.  But
thanks for your interest in this patch.  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
