Re: [HACKERS] SSI modularity questions
On 27.06.2011 21:23, Kevin Grittner wrote:

There are two outstanding patches for SSI which involve questions about modularity. In particular, they involve calls to predicate locking and conflict detection from executor source files rather than AM source files (where most such calls exist).

(1) Dan submitted this patch: http://archives.postgresql.org/message-id/20110622045850.gn83...@csail.mit.edu which is a very safe and very simple patch to improve performance on sequential heap scans at the serializable transaction isolation level. The location of the code being modified raised questions about modularity. There is a reasonably clear place to which it could be moved in the heap AM, but because it would acquire a predicate lock during node setup, it would get a lock on the heap even if the node was never used, which could be a performance regression in some cases.

The bigger question is whether those calls are needed at all (http://archives.postgresql.org/message-id/4e072ea9.3030...@enterprisedb.com). I'm uneasy about changing them this late in the release cycle, but I don't feel good about leaving useless clutter in place just because we're late in the release cycle either. More importantly, if locking the whole relation in a seqscan is not just a performance optimization but is actually required for correctness, it's important that we make the code and comments reflect that, or someone will break it in the future.

(2) In reviewing the above, Heikki noticed that there was a second place in the executor where SSI calls were needed but missing. I submitted a patch here: http://archives.postgresql.org/message-id/4e07550f02250003e...@gw.wicourts.gov I wonder, though, whether the section of code which I needed to modify should be moved to a new function in heapam.c on modularity grounds. If these two places were moved, there would be no SSI calls from any source file in the executor subdirectory.
Same here, we might not need those PredicateLockTuple calls in bitmap heap scan at all. Can you check my logic, and verify if those PredicateLockTuple() calls are needed?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Your Postgresql 9.2 patch
> Leonardo, Your patch: use less space in xl_xact_commit ... has been waiting on an updated version from you for 10 days now. Do you think you're likely to complete it for this CommitFest?

I sent an email on the subject: http://postgresql.1045698.n5.nabble.com/use-less-space-in-xl-xact-commit-patch-tp4400729p4495125.html and the thread went on and on but now I'm confused about what to do... Can someone tell me what we want???

Leonardo
Re: [HACKERS] WIP: Fast GiST index build
On Mon, Jun 27, 2011 at 10:32 PM, Alexander Korotkov aekorot...@gmail.com wrote:
> I didn't have an estimate yet, but I'm working on it.

Now, it seems that I have an estimate.

N - total number of itups
B - avg. number of itups in page
H - height of tree
K - avg. number of itups fitting in node buffer
step - level step of buffers

K = 2 * B^step
avg. number of internal pages with buffers = 2*N/((2*B)^step - 1) (assume pages to be half-filled)
avg. itups in node buffer = K / 2 (assume node buffers to be half-filled)

Each internal page with buffers can be produced by a split of another internal page with buffers. So, the number of additional penalty calls = 2*N/((2*B)^step - 1) * K / 2 =(approximately)= 2*N*(1/2)^step, while the number of regular penalty calls is H*N.

It seems that the fraction of additional penalty calls should decrease as the level step increases (though I haven't done experiments with level step != 1). Also, we can try to break the K = 2 * B^step equation. This can increase the number of IOs, but decrease the number of additional penalty calls and, probably, increase tree quality in some cases.

--
With best regards,
Alexander Korotkov.
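Alexander's arithmetic is easy to sanity-check mechanically. The sketch below (the parameter values are purely illustrative, not measured from any real index) recomputes the exact expression and compares it against the 2*N*(1/2)^step approximation:

```python
# Rough sanity check of the penalty-call estimate above.
# All parameter values are illustrative, not measured.

def penalty_call_fractions(N, B, H, step):
    K = 2 * B ** step                               # itups fitting in a node buffer
    buffered_pages = 2 * N / ((2 * B) ** step - 1)  # half-filled internal pages with buffers
    extra_calls = buffered_pages * K / 2            # ~ 2*N*(1/2)**step
    regular_calls = H * N
    return extra_calls, regular_calls

extra, regular = penalty_call_fractions(N=10_000_000, B=150, H=4, step=1)
print(extra / regular)   # fraction of additional penalty calls relative to H*N
```

For step = 1 the exact value is within a fraction of a percent of the 2*N*(1/2)^step approximation, and the additional calls come out to roughly N/(H*N) = 1/H of the regular ones.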
Re: [HACKERS] silent_mode and LINUX_OOM_ADJ
On Tue, 28 Jun 2011 at 10:40, Heikki Linnakangas wrote:
> It seems to me you could just stop setting silent_mode. If you want to capture any early errors at startup into a log file, like silent_mode does to postmaster.log, you can redirect stdout and stderr in the startup script. pg_ctl start -l postmaster.log will do the same.

OK, thanks. I'll try that next time I touch the packages.

cu
Reinhard

--
SUSE LINUX Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 16746 (AG Nürnberg)
Re: [HACKERS] Your Postgresql 9.2 patch
On Tue, Jun 28, 2011 at 8:30 AM, Leonardo Francalanci m_li...@yahoo.it wrote:
>> Leonardo, Your patch: use less space in xl_xact_commit ... has been waiting on an updated version from you for 10 days now. Do you think you're likely to complete it for this CommitFest?
> I sent an email on the subject: http://postgresql.1045698.n5.nabble.com/use-less-space-in-xl-xact-commit-patch-tp4400729p4495125.html and the thread went on and on but now I'm confused about what to do... Can someone tell me what we want???

I've nearly finished editing prior to commit, so no worries.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] [Hackers]Backend quey plan process
Excerpts from HuangQi's message of lun jun 27 23:56:22 -0400 2011:
> BTW, which email system are you using to send to postgres mailing list? As you can keep the top-answering and maintain the title of your email with [hackers] in front, my gmail can not help on that. For this email, I just add by hand.

You shouldn't add the [Hackers] prefix yourself. The mail list server will add it automatically for you. Gmail deals with this perfectly fine.

--
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] [COMMITTERS] pgsql: Branch refs/heads/REL9_1_STABLE was removed
Magnus Hagander mag...@hagander.net writes:
> We discussed earlier to potentially block the creation, and removal, of branches on the origin server, to prevent mistakes like this. It has only happened once in almost a year, so it's probably not necessary - but I wanted to raise the option anyway in case people forgot about it. The downside would be that in order to create or drop a branch *when intended* a committer would need someone from the infrastructure team to temporarily switch off the branch-blocking setting, and then back on..

I think it would be sensible to block branch removal, as there's basically never a scenario where we'd do that during current usage. I'm not excited about blocking branch addition, although I worry sooner or later somebody will accidentally push a private development branch :-(. Is it possible to block only removal and not addition?

regards, tom lane
Re: [HACKERS] [COMMITTERS] pgsql: Branch refs/heads/REL9_1_STABLE was removed
On Tue, Jun 28, 2011 at 16:39, Tom Lane t...@sss.pgh.pa.us wrote:
> Magnus Hagander mag...@hagander.net writes:
>> We discussed earlier to potentially block the creation, and removal, of branches on the origin server, to prevent mistakes like this. It has only happened once in almost a year, so it's probably not necessary - but I wanted to raise the option anyway in case people forgot about it. The downside would be that in order to create or drop a branch *when intended* a committer would need someone from the infrastructure team to temporarily switch off the branch-blocking setting, and then back on..
> I think it would be sensible to block branch removal, as there's basically never a scenario where we'd do that during current usage. I'm not excited about blocking branch addition, although I worry sooner or later somebody will accidentally push a private development branch :-(. Is it possible to block only removal and not addition?

Yes, I think so. Either way it'll require a small addition to the scripts we're using, so I'll try to just turn it into two different settings. I don't see offhand any reason why this shouldn't work.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] Range Types, constructors, and the type system
On Mon, Jun 27, 2011 at 11:42 PM, Jeff Davis pg...@j-davis.com wrote:
> So, in effect, RANGEINPUT is a special type used only for range constructors. If someone tried to output it, it would throw an exception, and we'd even have enough information at that point to print a nice error message with a hint.

I don't think I like the idea of throwing an error when you try to output it, but the rest seems reasonably sensible.

> Actually, this is pretty much exactly Florian's idea (thanks again, Florian), but at the time I didn't like it because pair didn't capture everything that I wanted to capture, like infinite bounds, etc. But there's no reason that it can't, and your point made me realize that -- you are effectively just using TEXT as the intermediate type (which works, but has some undesirable characteristics).

What undesirable characteristics?

--
Robert Haas EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [COMMITTERS] pgsql: Branch refs/heads/REL9_1_STABLE was removed
On 06/28/2011 10:39 AM, Tom Lane wrote:
> Magnus Hagander mag...@hagander.net writes:
>> We discussed earlier to potentially block the creation, and removal, of branches on the origin server, to prevent mistakes like this. It has only happened once in almost a year, so it's probably not necessary - but I wanted to raise the option anyway in case people forgot about it. The downside would be that in order to create or drop a branch *when intended* a committer would need someone from the infrastructure team to temporarily switch off the branch-blocking setting, and then back on..
> I think it would be sensible to block branch removal, as there's basically never a scenario where we'd do that during current usage. I'm not excited about blocking branch addition, although I worry sooner or later somebody will accidentally push a private development branch :-(. Is it possible to block only removal and not addition?

+1. Spurious branch addition shouldn't cause us much pain - we'd just remove the new branch. Unwanted deletion is more disruptive.

cheers
andrew
Re: [HACKERS] [v9.2] Fix leaky-view problem, part 1
I took a look at this patch. It's incredibly simple, which is great, and it seems to achieve its goal.

Suppose your query references two views owned by different roles. The quals of those views will have the same depth. Is there a way for information to leak from one view owner to another due to that?

I like how you've assumed that the table owner trusts the operator class functions of indexes on his table to not leak information. That handily catches some basic and important qual pushdowns that would otherwise be lost. This makes assumptions, at a distance, about the possible scan types and how quals can be fitted to them. Specifically, this patch achieves its goals because any indexable qual is trustworthy, and any non-indexable qual cannot be pushed down further than the view's own quals. That seems to be true with current plan types, but it feels fragile. I don't have a concrete idea for improvement, though. Robert suggested another approach in http://archives.postgresql.org/message-id/aanlktimbn_6tyxreh5rc7pmizt-vjb3xgp8cuho0o...@mail.gmail.com ; might that approach avoid this hazard?

The part 2 patch in this series implements the planner restriction more likely to yield performance regressions, so it introduces a mechanism for identifying when to apply the restriction. This patch, however, applies its restriction unconditionally. Since we will inevitably have such a mechanism before you are done sealing the leaks in our view implementation, the restriction in this patch should also use that mechanism. Therefore, I think we should review and apply part 2 first, then update this patch to use its conditional behavior.

A few minor questions/comments on the implementation:

On Mon, Jun 06, 2011 at 01:37:11PM +0100, Kohei Kaigai wrote:
> --- a/src/backend/optimizer/plan/planner.c
> +++ b/src/backend/optimizer/plan/planner.c
> + else if (IsA(node, Query))
> + {
> + depth += 2;

Why two?
> --- a/src/test/regress/expected/select_views.out
> +++ b/src/test/regress/expected/select_views.out
> +EXPLAIN SELECT * FROM your_credit WHERE f_leak(number,expired);
> +QUERY PLAN
> +--
> + Seq Scan on credit_cards (cost=0.00..181.20 rows=1 width=96)

Use EXPLAIN (COSTS OFF) in regression tests. We do not put much effort into the stability of exact cost values, and they do not matter for the purpose of this test.

> --- a/src/test/regress/sql/select_views.sql
> +++ b/src/test/regress/sql/select_views.sql
> @@ -8,3 +8,46 @@ SELECT * FROM street;
>  SELECT name, #thepath FROM iexit ORDER BY 1, 2;
>  SELECT * FROM toyemp WHERE name = 'sharon';
> +
> +--
> +-- Test for leaky-view problem
> +--
> +
> +-- setups
> +SET client_min_messages TO 'warning';
> +
> +DROP ROLE IF EXISTS alice;
> +DROP FUNCTION IF EXISTS f_leak(text);
> +DROP TABLE IF EXISTS credit_cards;
> +
> +RESET client_min_messages;

No need for this. The regression tests always start on a clean database.

> +
> +CREATE USER alice;
> +CREATE FUNCTION f_leak(text, text)
> +RETURNS bool LANGUAGE 'plpgsql'
> +AS 'begin raise notice ''% = %'', $1, $2; return true; end';

I ran this test case on master, and it did not reproduce the problem. However, adding COST 0.1 to this CREATE FUNCTION did yield the expected problem behavior. I suggest adding that.

Thanks, nm
Re: [HACKERS] add support for logging current role (what to review?)
On Mon, Jun 27, 2011 at 6:25 PM, Alex Hunsaker bada...@gmail.com wrote:
> Ive been holding off because its marked as Waiting on Author, am now thinking thats a mistake. =)
> It links to this patch: http://archives.postgresql.org/message-id/20110215135131.gx4...@tamriel.snowman.net
> Which is older than the latest patch in that thread posted by Robert: http://archives.postgresql.org/message-id/AANLkTikMadttguOWTkKLtgfe90kxR=u9njk9zebrw...@mail.gmail.com
> (There are also various other patches and versions in that thread...)
> The main difference between the first and the last patch is the first one has support for changing what csv columns we output, while the latter just tacks on an additional column. The thread was very long and left me a bit confused as to what I should actually be looking at. Or perhaps thats the point-- we need to decide if a csvlog_fields GUC is worth it.

This is pretty much a classic example of the tail wagging the dog. Adding a new field to the output would be pretty easy if it weren't for the fact that we want the CSV log output to contain all the same fields that are available in the regular log. Of course, adding a field to the CSV format is no big deal either. But now you have more problems: (1) it breaks backward compatibility, (2) it makes the log files bigger, and (3) if the new field is more expensive to compute than the existing ones, then you're forcing people who want to use the CSV log format to incur an overhead for a feature they don't care about.

We could surmount these problems by making the CSV log format more flexible, but (a) that's a lot of work, (b) it imposes a burden on tool-builders, and (c) it might produce a really long icky-looking configuration line in postgresql.conf to show the default value of the GUC controlling the log format. (You may think I'm joking about this last one, but the point was raised... several times.)
The upshot, presumably, is that if Stephen would like us to add this new field, he needs to be willing to rewrite our entire logging infrastructure first... and then maybe we'll bounce the rewrite because it adds too much complexity, but who was it that asked for all that complexity again? Oh, that's right: we did. Now, I admit that I've been guilty of engaging in this kind of scope creep from time to time myself, so everyone feel free to pitch your favorite loose object in my direction.

Anyway, if we are going to insist on rewriting substantial chunks of the logging infrastructure before doing this, we at least need to reach some agreement on what would be an acceptable outcome - and then let Stephen code that up as a separate patch, submit it, get it committed, and then do this on top of that. Or we can just decide that adding one relatively minor field to the text format output is not worth knocking over the applecart for.

--
Robert Haas EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [COMMITTERS] pgsql: Branch refs/heads/REL9_1_STABLE was removed
On Tue, Jun 28, 2011 at 11:05 AM, Andrew Dunstan and...@dunslane.net wrote:
> +1. Spurious branch addition shouldn't cause us much pain - we'd just remove the new branch. Unwanted deletion is more disruptive.

How about if we allow addition only of branches matching /^REL_[0-9_]+_STABLE$/ and disallow deletion of all branches? That seems like it'd allow the one operation we will likely want to do with any regularity (creating a new release branch once a year) without going through hoops, while disallowing most of the problem cases.

The problem with allowing people to create branches and not remove them is that someone might push a private branch and not be able to get rid of it. But if we only allow creation of branches that look like the branches that are supposed to be there, then that shouldn't be a danger.

--
Robert Haas EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Your Postgresql 9.2 patch
> I've nearly finished editing prior to commit, so no worries.

Thank you, let me know if I can help.

Leonardo
Re: [HACKERS] [COMMITTERS] pgsql: Make the visibility map crash-safe.
2011/6/27 Robert Haas robertmh...@gmail.com:
> On Thu, Jun 23, 2011 at 9:22 AM, Robert Haas robertmh...@gmail.com wrote:
>> On Wed, Jun 22, 2011 at 10:23 PM, Robert Haas robertmh...@gmail.com wrote:
>>> Well, it seems I didn't put nearly enough thought into heap_update(). The fix for the immediate problem looks simple enough - all the code has been refactored to use the new API, so the calls can easily be moved into the critical section (see attached). But looking at this a little more, I see that heap_update() is many bricks short of a load, because there are several places where the buffer can be unlocked and relocked, and we don't recheck whether the page is all-visible after reacquiring the lock. So I've got some more work to do here.
>> See what you think of the attached. I *think* this covers all bases. It's a little more complicated than I would like, but I don't think fatally so.
> For lack of comment, committed. It's hopefully at least better than what was there before, which was clearly several bricks short of a load.
> --
> Robert Haas EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

out of curiosity, does it affect the previous benchmarks of the feature ?

--
Cédric Villemain 2ndQuadrant
http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] [COMMITTERS] pgsql: Make the visibility map crash-safe.
On Tue, Jun 28, 2011 at 11:44 AM, Cédric Villemain cedric.villemain.deb...@gmail.com wrote:
> out of curiosity, does it affect the previous benchmarks of the feature ?

I don't think there's much performance impact, because the only case where we actually have to do any real work is when vacuum comes through and marks a buffer we're about to update all-visible. It would probably be good to run some tests to verify that, though. I'll try to set something up.

--
Robert Haas EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [Hackers]Backend quey plan process
On 28 June 2011 22:36, Alvaro Herrera alvhe...@commandprompt.com wrote:
> Excerpts from HuangQi's message of lun jun 27 23:56:22 -0400 2011:
>> BTW, which email system are you using to send to postgres mailing list? As you can keep the top-answering and maintain the title of your email with [hackers] in front, my gmail can not help on that. For this email, I just add by hand.
> You shouldn't add the [Hackers] prefix yourself. The mail list server will add it automatically for you. Gmail deals with this perfectly fine.

I see. Thanks.

--
Best Regards
Huang Qi Victor
Re: [HACKERS] Range Types, constructors, and the type system
On Jun 27, 2011, at 8:42 PM, Jeff Davis wrote:
> Do we think that this is a good way forward? The only thing I can think of that's undesirable is that it's not normal to be required to cast the result of a function, and might be slightly difficult to explain in the documentation in a straightforward way

That's the part that bothers me. I think that if it's not cast it should somehow be useful. Maybe default to a text range or something?

Best, David
Re: [HACKERS] [COMMITTERS] pgsql: Branch refs/heads/REL9_1_STABLE was removed
Robert Haas robertmh...@gmail.com writes:
> How about if we allow addition only of branches matching /^REL_[0-9_]+_STABLE$/ and disallow deletion of all branches?

+1, if feasible.

regards, tom lane
Re: [HACKERS] [COMMITTERS] pgsql: Branch refs/heads/REL9_1_STABLE was removed
On Jun 28, 2011 6:29 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
>> How about if we allow addition only of branches matching /^REL_[0-9_]+_STABLE$/ and disallow deletion of all branches?
> +1, if feasible.

Pretty sure that's just a Small Matter Of Programming. I'll give it a try.

/Magnus
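For what it's worth, the policy under discussion fits naturally into a server-side git update hook, which receives the ref name, old SHA-1, and new SHA-1 for each ref being pushed. The sketch below is only an illustration of the idea, not the actual gitmaster scripts; the branch pattern here is slightly relaxed from Robert's /^REL_[0-9_]+_STABLE$/ so that existing names like REL9_1_STABLE (no underscore after REL) would also match:

```python
#!/usr/bin/env python
# Sketch of a server-side git "update" hook enforcing the policy
# discussed above: branch deletion is always refused, and branch
# creation is allowed only for release-branch-looking names.
import re
import sys

ZERO = "0" * 40  # git passes an all-zero SHA-1 for "no object"
BRANCH_RE = re.compile(r"^refs/heads/REL_?[0-9][0-9_]*_STABLE$")

def check_ref(refname, oldrev, newrev):
    """Return True if the push to refname should be allowed."""
    if not refname.startswith("refs/heads/"):
        return True                    # tags etc. are out of scope here
    if newrev == ZERO:
        return False                   # never allow branch deletion
    if oldrev == ZERO:                 # branch creation
        return bool(BRANCH_RE.match(refname))
    return True                        # ordinary update of an existing branch

if __name__ == "__main__" and len(sys.argv) >= 4:
    refname, oldrev, newrev = sys.argv[1:4]
    sys.exit(0 if check_ref(refname, oldrev, newrev) else 1)
```

Splitting creation and deletion into two separate checks like this also makes it trivial to turn them into the "two different settings" Magnus mentions.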
Re: [HACKERS] Range Types, constructors, and the type system
On Tue, 2011-06-28 at 10:58 -0400, Robert Haas wrote:
> On Mon, Jun 27, 2011 at 11:42 PM, Jeff Davis pg...@j-davis.com wrote:
>> So, in effect, RANGEINPUT is a special type used only for range constructors. If someone tried to output it, it would throw an exception, and we'd even have enough information at that point to print a nice error message with a hint.
> I don't think I like the idea of throwing an error when you try to output it, but the rest seems reasonably sensible.

I thought it might add a little confusion if people thought they had a range type but really had RANGEINPUT. For instance, if you do a create table as select range(1,2) then the result might be slightly unexpected. But it's probably no more unexpected than create table as select 'foo'. So, I suppose there's not much reason to throw an error. We can just output it in the same format as a range type. It's also much easier to explain something in the documentation that has an output format, because at least it's tangible.

>> Actually, this is pretty much exactly Florian's idea (thanks again, Florian), but at the time I didn't like it because pair didn't capture everything that I wanted to capture, like infinite bounds, etc. But there's no reason that it can't, and your point made me realize that -- you are effectively just using TEXT as the intermediate type (which works, but has some undesirable characteristics).
> What undesirable characteristics?

Well, for one, outputting something as text and then reading it back in does not always produce the same value. For instance, for float, it only does that if you have extra_float_digits set to some high-enough value. I suppose I could save the GUC, set it, and set it back; but that seems like unnecessary ugliness. There's also the deparsing/reparsing cycle. That might not really matter for performance, but it seems unnecessary.
And there's always the fallback that we have types for a reason. Wouldn't it be odd if you wrote a query like: select range(1,2) || 'foo' and it succeeded? I'm sure that kind of thing can lead to some dangerous situations.

Regards, Jeff Davis
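Jeff's point about text round-trips is easy to reproduce outside Postgres. The snippet below (Python, purely for illustration; the format precision plays the role extra_float_digits plays for float8 output) shows that a double printed with too few digits does not survive an output/input cycle:

```python
# A float printed with too few digits does not round-trip; with
# enough digits (17 for IEEE doubles) it does.  This is the same
# issue extra_float_digits addresses for float8 output.
x = 0.1 + 0.2                      # actually 0.30000000000000004...

short = f"{x:.15g}"                # too few digits: prints "0.3"
full = f"{x:.17g}"                 # enough digits to round-trip

print(float(short) == x)           # False: information was lost
print(float(full) == x)            # True: value survives the cycle
```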
Re: [HACKERS] Range Types, constructors, and the type system
On Tue, 2011-06-28 at 09:30 -0700, David E. Wheeler wrote:
> On Jun 27, 2011, at 8:42 PM, Jeff Davis wrote:
>> Do we think that this is a good way forward? The only thing I can think of that's undesirable is that it's not normal to be required to cast the result of a function, and might be slightly difficult to explain in the documentation in a straightforward way
> That's the part that bothers me.

Yeah, that bothered me, too.

> I think that if it's not cast it should somehow be useful.

Let's see, what can one do with a range that has no ordering yet? ;)

Robert suggested that we don't need to throw an error, and I think I agree. Just having a working output function solves most of the documentation problem, because it makes it less abstract. The only operators that we could really support are accessors, which seems somewhat reasonable. However, I'd have some concerns even about that, because if you do range(10,1), then what's the upper bound?

> Maybe default to a text range or something?

That sounds a little dangerous: select range('1','09') would fail before it could be cast to int4range. We could invent an UNKNOWNRANGE type or something. But I don't particularly like that; it would start out working nicely when people only had one textrange type, and then their old queries would start failing when they added another range type based on text.

I think it's fine if the RANGEINPUT type isn't too useful by itself. It's already a common requirement to cast unknown literals, and this isn't too much different. It's only for constructors, so it still fits pretty closely with that idea.

Regards, Jeff Davis
Re: [HACKERS] Small patch for GiST: move childoffnum to child
Actually, there is no more direct need for this patch because I've rewritten the insert function for fast build. But there are still two points in favor of these changes:
1) As noted before, it simplifies the code a bit.
2) It would be better if childoffnum had the same semantics in regular insert and fast-build insert.

--
With best regards,
Alexander Korotkov.
Re: [HACKERS] Range Types, constructors, and the type system
On Tue, Jun 28, 2011 at 12:58 PM, Jeff Davis pg...@j-davis.com wrote:
> On Tue, 2011-06-28 at 10:58 -0400, Robert Haas wrote:
>> On Mon, Jun 27, 2011 at 11:42 PM, Jeff Davis pg...@j-davis.com wrote:
>>> So, in effect, RANGEINPUT is a special type used only for range constructors. If someone tried to output it, it would throw an exception, and we'd even have enough information at that point to print a nice error message with a hint.
>> I don't think I like the idea of throwing an error when you try to output it, but the rest seems reasonably sensible.
> I thought it might add a little confusion if people thought they had a range type but really had RANGEINPUT. For instance, if you do a create table as select range(1,2) then the result might be slightly unexpected.

True...

> But it's probably no more unexpected than create table as select 'foo'. So, I suppose there's not much reason to throw an error. We can just output it in the same format as a range type.

+1.

> It's also much easier to explain something in the documentation that has an output format, because at least it's tangible.

+1.

>>> Actually, this is pretty much exactly Florian's idea (thanks again, Florian), but at the time I didn't like it because pair didn't capture everything that I wanted to capture, like infinite bounds, etc. But there's no reason that it can't, and your point made me realize that -- you are effectively just using TEXT as the intermediate type (which works, but has some undesirable characteristics).
>> What undesirable characteristics?
> Well, for one, outputting something as text and then reading it back in does not always produce the same value. For instance, for float, it only does that if you have extra_float_digits set to some high-enough value. I suppose I could save the GUC, set it, and set it back; but that seems like unnecessary ugliness.

Yeah, I don't think we want to go there.

> There's also the deparsing/reparsing cycle. That might not really matter for performance, but it seems unnecessary.
> And there's always the fallback that we have types for a reason. Wouldn't it be odd if you wrote a query like: select range(1,2) || 'foo' and it succeeded? I'm sure that kind of thing can lead to some dangerous situations.

That's pretty much what we tried to get rid of with the 8.3 casting changes, so agreed.

--
Robert Haas EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Word-smithing doc changes
Excerpts from Greg Stark's message of sáb jun 25 21:01:36 -0400 2011:
> I think this commit was ill-advised: http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=a03feb9354bda5084f19cc952bc52ba7be89f372 Seems way too implementation-specific and detailed for a user to make heads or tails of. Except in the sections talking about locking internals we don't talk about shared locks on virtual transactions identifiers we just talk about waiting for a transaction to complete.

Hmm, right. And looping over the transactions one by one is purely an implementation detail and uninteresting to users.

> Also it uses ill-defined terms like active transactions, potentially interfering older transactions, and original index -- from the user's point of view there's only one index and it just isn't completely built yet.

Wow, that's a lot of mistakes for a single paragraph, sorry about that.

> Are we not yet in string-freeze though? I'll go ahead and edit it if people don't mind. I'm curious to see the original complaint though.

I don't -- please go ahead. Original complaint in Message-id 4ddb64cb.7070...@2ndquadrant.com

--
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] [COMMITTERS] pgsql: Branch refs/heads/REL9_1_STABLE was removed
Excerpts from Tom Lane's message of mar jun 28 10:39:22 -0400 2011: I think it would be sensible to block branch removal, as there's basically never a scenario where we'd do that during current usage. I'm not excited about blocking branch addition, although I worry sooner or later somebody will accidentally push a private development branch :-(. Is it possible to block only removal and not addition? If we can tweak the thing, how about we only allow creating branches that match a certain pattern, say ^REL_\d+_\d+_STABLE$? -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] marking old branches as no longer maintained
I'd like to add a feature to the buildfarm that automatically picks up any new branch and automatically stops building any branch we're not maintaining any more. To do the latter, I need some way for the client to detect that we are or aren't interested in a branch. What I'd like to do is add a file to the old back branches (say from 7.4 to 8.1 currently - I can grandfather the rest) called end-of-life-reached or some such, with some appropriate text. As a branch reaches its end of life, i.e. right before the last release we make, we should add that file to the branch. I think this would possibly be useful anyway, regardless of buildfarm utility - I still hear of people mistakenly grabbing and building releases past EOL, and maybe this will give one or two the extra clue they need that this is less than a good idea. Comments? cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI modularity questions
Heikki Linnakangas wrote: On 27.06.2011 21:23, Kevin Grittner wrote: The bigger question is if those calls are needed at all ( http://archives.postgresql.org/message-id/4e072ea9.3030...@enterprisedb.com ). Ah, I didn't properly grasp your concerns the first time I read that. The heap relation lock for a seqscan is indeed required for correctness and has been there all along. The rs_relpredicatelocked flag was added in response to this: http://archives.postgresql.org/pgsql-hackers/2011-01/msg00730.php

I'm uneasy about changing them this late in the release cycle, but I don't feel good about leaving useless clutter in place just because we're late in the release cycle either. More importantly, if locking the whole relation in a seqscan is not just a performance optimization, but is actually required for correctness, it's important that we make the code and comments reflect that or someone will break it in the future.

OK, if that isn't clear in the comments, we should definitely make it clear. Basically, the predicate locking strategy is as follows:

(1) We're only concerned with read/write dependencies, also known as rw-conflicts. This is where two transactions overlap (each gets its snapshot before the other commits, so neither can see the work of the other), and one does a read which doesn't see the write of the other due only to the timing.

(2) For rw-conflicts where the read follows the write, the predicate locks don't come into play -- we use the MVCC data in the heap tuples directly.

(3) Heap tuples are locked so that updates or deletes by an overlapping transaction of the tuple which has been read can be detected as a rw-conflict. Keep in mind that access for such a delete or update may not go through the same index on which the conflicting read occurred. It might use a different index or a seqscan.
These may be promoted to page or heap relation locks to control the shared space used by predicate locks, but the concept is the same -- we're locking actual tuples read, not any gaps.

(4) Index ranges are locked to detect inserts or updates which create heap tuples which would have been read by an overlapping transaction if they had existed and been visible at the time of the index scan. The entire goal of locks on indexes is to lock the gaps where a scan *didn't* find anything; we only care about conflicting index tuple inserts.

(5) When a heap scan is executed, there is no index gap to lock to cover the predicate involved, so we need to acquire a heap relation lock -- any insert to the relation by an overlapping transaction is a rw-conflict. While these *look* just like tuple locks which got promoted, their purpose is entirely different. Like index locks, they are for detecting inserts into the gaps.

[Light bulb goes on over head: in some future release, perhaps it would be worth differentiating between the two uses of heap relation locks, to reduce the frequency of false positives. A couple bit flags in the lock structure might do it.]

So, the heap relation lock is clearly needed for the seqscan. There is room for performance improvement there in skipping the tuple lock attempt when we're in a seqscan, which will always be a no-op when it finds the heap relation lock after a hash table lookup.

But you are also questioning whether the predicate locking of the tuples where rs_relpredicatelocked is tested can be removed entirely, rather than conditioned on the boolean. The question is: can the code be reached on something other than a seqscan of the heap, and can this happen for a non-temporary, non-system table using a MVCC snapshot? I've been trying to work backward to all the spots which call these functions, directly or indirectly, to determine that.
That's obviously not trivial or easy work, and I fear that unless someone more familiar with the code than I can weigh in on that question for any particular PredicateLockTuple() call, I would rather leave the calls alone for 9.1 and sort this out in 9.2. I'm confident that they don't do any damage where they are; it's a matter of very marginal performance benefit (skipping a call to a fast return) and code tidiness (not making unnecessary calls). I can, with confidence, now answer my own previous question about moving the calls outside the influence of HeapKeyTest(): it's not necessary. The rows currently excluded won't be seen by the caller, so they don't fit under the needs of (3) above, and if (4) or (5) aren't covered where they need to be, locking a few extra rows won't help at all. So we can drop that issue. (2) In reviewing the above, Heikki noticed that there was a second place in the executor that SSI calls were needed but missing. I submitted a patch here: http://archives.postgresql.org/message-id/4e07550f02250003e...@gw.wicourts.gov I wonder, though, whether the section of code which I needed to modify should be moved to a new function in heapam.c on modularity
Re: [HACKERS] marking old branches as no longer maintained
On Tue, Jun 28, 2011 at 19:46, Andrew Dunstan and...@dunslane.net wrote: I'd like to add a feature to the buildfarm that automatically picks up any new branch and automatically stops building any branch we're not maintaining any more. To do the latter, I need some way for the client to detect that we are or aren't interested in a branch. What I'd like to do is add a file to the old back branches (say from 7.4 to 8.1 currently - I can grandfather the rest) called end-of-life-reached or some such, with some appropriate text. As a branch reaches its end of life, i.e. right before the last release we make, we should add that file to the branch. Does this need to be driven out of the main tree? Couldn't you just have a blacklist in the buildfarm code or site? (disclaimer: I haven't looked at how it works so that may be a completely useless idea..) Another way would be to just not run builds if there are no commits in n days on a branch. Don't we already not run builds on branches with no commits? Maybe just put a limit on how long we allow an override of that? I think this would possibly be useful anyway, regardless of buildfarm utility - I still hear of people mistakenly grabbing and building releases past EOL, and maybe this will give one or two the extra clue they need that this is less than a good idea. If you want that to actually work, you probably need to do something to the point of breaking the configure script. There's zero chance that people who're not reading the information about which releases are supported are actually going to read a random file somewhere in the source tree, regardless of where you place it and what you name it. You could require something like ./configure --yes-i-know-what-i-am-doing or something like that, I guess... Comments?
-- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE
On Mon, Jun 27, 2011 at 10:43 PM, Noah Misch n...@leadboat.com wrote: On Mon, Jun 27, 2011 at 03:45:43PM -0400, Robert Haas wrote: On Wed, Jun 15, 2011 at 1:03 AM, Noah Misch n...@leadboat.com wrote: [patch to avoid index rebuilds] With respect to the documentation hunks, it seems to me that the first hunk might be made clearer by leaving the paragraph of which it is a part as-is, and adding another paragraph afterwards beginning with the words In addition. The added restriction elaborates on the transitivity requirement, so I wanted to keep the new language adjacent to that. That makes sense, but it reads a bit awkwardly to me. Maybe it's just that the sentence itself is so complicated that I have difficulty understanding it. I guess it's the same problem as with the text you added about hash indexes: without thinking about it, it's hard to understand what territory is covered by the new sentence that is not covered by what's already there. In the case of the hash indexes, I think I have it figured out: we already know that we must get compatible hash values out of logically equal values of different datatypes. But it's possible that the inter-type cast changes the value in some arbitrary way, and the hash function is then defined in such a way as to compensate. Similarly, for btrees, we need the relative ordering of A and B to remain the same after casting within the opfamily, not to be rearranged somehow. Maybe the text you've got is OK to explain this, but I wonder if there's a way to do it more simply. As you no doubt expected, my eyes were immediately drawn to the index-resurrection hack. Reviewing the thread, I see that you asked about that in January and never got feedback. I have to say that what you've done here looks like a pretty vile hack, but it's hard to say for sure without knowing what to compare it against.
You made reference to this being smaller and simpler than updating the index definition in place - can you give a sketch of what would need to be done if we went that route instead? In at7-index-opfamily.patch attached to http://archives.postgresql.org/message-id/20110113230124.ga18...@tornado.gateway.2wire.net check out the code following the comment /* The old index is compatible. Update catalogs. */ until the end of the function. That code would need updates for per-column collations, and it incorrectly reuses values/nulls/replace arrays. It probably does not belong in tablecmds.c, either. However, it gives the right general outline. It would be valuable to avoid introducing a second chunk of code that knows everything about the catalog entries behind an index. That's what led me to put forward the most recent version as best. What do you find vile about that approach? I wasn't comfortable with it at first, because I suspected the checks in RelationPreserveStorage() might be important for correctness. Having studied it some more, though, I think they just reflect the narrower needs of its current sole user. Maybe vile is a strong word, but it seems like a modularity violation: we're basically letting DefineIndex() do some stuff we don't really want done, and then backing out the parts of it that we don't really want done after all. It seems like it would be better to provide DefineIndex with enough information not to do the wrong thing in the first place. Could we maybe pass stmt->oldNode to DefineIndex(), and let it work out the details? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] add support for logging current role (what to review?)
On Tue, Jun 28, 2011 at 09:15, Robert Haas robertmh...@gmail.com wrote: Anyway, if we are going to insist on rewriting substantial chunks of the logging infrastructure before doing this, we at least need to reach some agreement on what would be an acceptable outcome - and then let Stephen code that up as a separate patch, submit it, get it committed, and then do this on top of that. Or we can just decide that adding one relatively minor field to the text format output is not worth knocking over the applecart for. Well, one of the reasons I picked up this patch is that it's a feature *I* want. I remember being quite surprised at the role csv logging reports when I switched to it. I was logging everything, so I have on occasion had to find a SET ROLE and follow the session... it's quite annoying :-) I think if Stephen was proposing 10 fields, or if there was a list of fields we were planning on adding in the next release or 3, it might be worth re-factoring. I know of no such list, and I think this field is useful/important enough that people who are using csv logging would want it anyway. +1 on just tacking on the field and causing a flag day for csv users. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq SSL with non-blocking sockets
On 06/25/2011 12:14 AM, Steve Singer wrote: I'm not a libpq guru but I've given your patch a look over. Thanks for the review! I have since simplified the patch to assume that partial SSL writes are disabled -- according to SSL_write(3) this is the default behaviour. Now the SSL retry buffer only holds the data to be retried; the remainder is moved to the new outBuffer.

- The patch doesn't compile with non-ssl builds, the debug at the bottom of pqSendSome isn't in an #ifdef

Ack, there was another one in pqFlush. Fixed that.

- I don't think you're handling the return code properly. Consider this case:

    pqSendSome(some data)
        sslRetryBuf = some data
        returns 1
    pqSendSome(more data)
        it sends all of 'some data'
        returns 0

I think 1 should be returned because all of 'more data' still needs to be sent. I think returning a 0 will break PQsetnonblocking if you call it when there is data in both sslRetryBuf and outputBuffer.

Hmm, I thought I thought about that. There was a check in the original patch:

    if (conn->sslRetryBytes || (conn->outCount - remaining) > 0)

So if the SSL retry buffer was emptied it would return 1 if there was something left in the regular output buffer. Or did I miss something there?

We might even want to try sending the data in outputBuffer after we've sent all the data sitting in sslRetryBuf.

IMHO that'd add unnecessary complexity to pqSendSome. As this only happens in non-blocking mode the caller should be well prepared to handle the retry.

If you close the connection with an outstanding sslRetryBuf you need to free it.

Fixed. New version of the patch attached.
regards, Martin

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 9e4807e..8dc0226 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2792,6 +2792,10 @@ freePGconn(PGconn *conn)
 	if (conn->gsslib)
 		free(conn->gsslib);
 #endif
+#if defined(USE_SSL)
+	if (conn->sslOutBufPtr)
+		free(conn->sslOutBufPtr);
+#endif
 	/* Note that conn->Pfdebug is not ours to close or free */
 	if (conn->last_query)
 		free(conn->last_query);
diff --git a/src/interfaces/libpq/fe-misc.c b/src/interfaces/libpq/fe-misc.c
index 17dde4a..d0c7812 100644
--- a/src/interfaces/libpq/fe-misc.c
+++ b/src/interfaces/libpq/fe-misc.c
@@ -781,6 +781,8 @@ pqSendSome(PGconn *conn, int len)
 	char *ptr = conn->outBuffer;
 	int remaining = conn->outCount;
 	int result = 0;
+	int sent = 0;
+	bool shiftOutBuffer = true;
 
 	if (conn->sock < 0)
 	{
@@ -789,23 +791,34 @@ pqSendSome(PGconn *conn, int len)
 		return -1;
 	}
 
+#ifdef USE_SSL
+	if (conn->sslRetryPos)
+	{
+		/*
+		 * We have some leftovers from previous SSL write attempts, these need
+		 * to be sent before the regular output buffer.
+		 */
+		len = remaining = conn->sslRetrySize;
+		ptr = conn->sslRetryPos;
+		shiftOutBuffer = false;
+	}
+#endif
+
 	/* while there's still data to send */
 	while (len > 0)
 	{
-		int sent;
 		char sebuf[256];
+		int write_size = len;
 
-#ifndef WIN32
-		sent = pqsecure_write(conn, ptr, len);
-#else
-
+#ifdef WIN32
 		/*
 		 * Windows can fail on large sends, per KB article Q201213. The
 		 * failure-point appears to be different in different versions of
 		 * Windows, but 64k should always be safe.
 		 */
-		sent = pqsecure_write(conn, ptr, Min(len, 65536));
+		write_size = Min(len, 65536);
 #endif
+		sent = pqsecure_write(conn, ptr, write_size);
 
 		if (sent < 0)
 		{
@@ -857,6 +870,38 @@ pqSendSome(PGconn *conn, int len)
 			return -1;
 		}
 	}
+#ifdef USE_SSL
+		else if (sent == 0 && conn->ssl && !conn->sslRetryPos)
+		{
+			/*
+			 * pqsecure_write indicates that a SSL write needs to be retried.
+			 * With non-blocking connections we need to ensure that the buffer
+			 * passed to the SSL_write() retry is the exact same buffer as
+			 * in the previous write -- see SSL_write(3SSL) for more on this.
+			 * For this we need to save the original buffer and allocate a
+			 * new buffer for outgoing data.
+			 */
+			conn->sslOutBufPtr = conn->outBuffer;
+			conn->sslRetryPos = ptr;
+			conn->sslRetrySize = write_size;
+
+			/*
+			 * Allocate a new output buffer and prepare to move the remainder
+			 * of the original outBuffer there.
+			 */
+			remaining -= write_size;
+			ptr += write_size;
+
+			conn->outBufSize = Max(16 * 1024, remaining);
+			conn->outBuffer = malloc(conn->outBufSize);
+			if (conn->outBuffer == NULL)
+			{
+				printfPQExpBuffer(&conn->errorMessage,
+								  "cannot allocate memory for output buffer\n");
+				return -1;
+			}
+		}
+#endif
 		else
 		{
 			ptr += sent;
@@ -903,10 +948,36 @@ pqSendSome(PGconn *conn, int len)
 		}
 	}
 
-	/* shift the remaining contents of the buffer */
-	if (remaining > 0)
-		memmove(conn->outBuffer, ptr, remaining);
-	conn->outCount = remaining;
+#ifdef USE_SSL
+	if (conn->sslRetryPos && sent > 0)
+	{
+		/*
+		 * A SSL write was successfully retried, free the retry buffer and
+		 * switch to the regular outBuffer.
Re: [HACKERS] SSI modularity questions
On 28.06.2011 20:47, Kevin Grittner wrote: (3) Heap tuples are locked so that updates or deletes by an overlapping transaction of the tuple which has been read can be detected as a rw-conflict. Keep in mind that access for such a delete or update may not go through the same index on which the conflicting read occurred. It might use a different index or a seqscan. These may be promoted to page or heap relation locks to control the shared space used by predicate locks, but the concept is the same -- we're locking actual tuples read, not any gaps. Ok, that's what I was missing. So the predicate locks on heap tuples are necessary. Thanks for explaining this again. So, the heap relation lock is clearly needed for the seqscan. There is room for performance improvement there in skipping the tuple lock attempt when we're in a seqscan, which will always be a no-op when it finds the heap relation lock after a hash table lookup. But you are also questioning whether the predicate locking of the tuples where rs_relpredicatelocked is tested can be removed entirely, rather than conditioned on the boolean. The question is: can the code be reached on something other than a seqscan of the heap, and can this happen for a non-temporary, non-system table using a MVCC snapshot? I've been trying to work backward to all the spots which call these functions, directly or indirectly to determine that. That's obviously not trivial or easy work, and I fear that unless someone more familiar with the code than I can weigh in on that question for any particular PredicateLockTuple() call, I would rather leave the calls alone for 9.1 and sort this out in 9.2. I'm confident that they don't do any damage where they are; it's a matter of very marginal performance benefit (skipping a call to a fast return) and code tidiness (not making unnecessary calls). Hmm, the calls in question are the ones in heapgettup() and heapgettup_pagemode(), which are subroutines of heap_getnext(). 
heap_getnext() is only used in sequential scans, so it seems safe to remove those calls. (2) In reviewing the above, Heikki noticed that there was a second place in the executor that SSI calls were needed but missing. I submitted a patch here: http://archives.postgresql.org/message-id/4e07550f02250003e...@gw.wicourts.gov I wonder, though, whether the section of code which I needed to modify should be moved to a new function in heapam.c on modularity grounds. If these two places were moved, there would be no SSI calls from any source file in the executor subdirectory. Same here, we might not need those PredicateLockTuple calls in bitmap heap scan at all. Can you check my logic, and verify if those PredicateLockTuple() calls are needed? These sure look like they are needed per point (3) above. Yep. I would like to add a test involving a lossy bitmap scan. How many rows are normally needed to force a bitmap scan to be lossy? The size of bitmaps is controlled by work_mem, so you can set work_mem very small to cause them to become lossy earlier. Off the top of my head I don't have any guesstimate on how many rows you need. What's the easiest way to check whether a plan is going to use (or is using) a lossy bitmap scan? Good question. There doesn't seem to be anything in the EXPLAIN ANALYZE output to show that, so I think you'll have to resort to adding some elog()s in the right places. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI modularity questions
On Tue, Jun 28, 2011 at 1:47 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: (5) When a heap scan is executed, there is no index gap to lock to cover the predicate involved, so we need to acquire a heap relation lock -- any insert to the relation by an overlapping transaction is a rw-conflict. While these *look* just like tuple locks which got promoted, their purpose is entirely different. Like index locks, they are for detecting inserts into the gaps. [Light bulb goes on over head: in some future release, perhaps it would be worth differentiating between the two uses of heap relation locks, to reduce the frequency of false positives. A couple bit flags in the lock structure might do it.] You know, it just occurred to me while reading this email that you're using the term predicate lock in a way that is totally different from what I learned in school. What I was taught is that the word predicate in predicate lock is like the word tuple in tuple lock or the word relation in relation lock - that is, it describes *the thing being locked*. In other words, you are essentially doing: LOCK TABLE foo WHERE i = 1; I think that what you're calling the predicate lock manager should really be called the siread lock manager, and all of the places where you are predicate locking a tuple should really be siread locking the tuple. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] spinlock contention
On Jun23, 2011, at 23:40 , Robert Haas wrote: I tried rewriting the LWLocks using CAS. It actually seems to make things slightly worse on the tests I've done so far, perhaps because I didn't make it respect spins_per_delay. Perhaps fetch-and-add would be better, but I'm not holding my breath. Everything I'm seeing so far leads me to the belief that we need to get rid of the contention altogether, not just contend more quickly.

I got curious and put a tool together to benchmark this. The code can be found at https://github.com/fgp/lockbench. The tool starts N workers who each acquire a lock in SHARE mode, increment a per-worker counter and release the lock again, rinse, repeat. That is done for T seconds, after which it reports the sum of the individual counters, the elapsed (wall) time and the sums of the user and system times consumed by the workers. The lock implementations currently supported are:

atomicinc: RW-lock using atomic increment/decrement instructions (locked xaddl) to acquire and release the lock in SHARE mode in the non-contested case. The contested case is unimplemented.

cmpxchng: Like atomicinc, but uses a locked cmpxchng loop instead of locked xaddl.

spin: RW-lock built around a simple cmpxchng-based spinlock (i.e., to lock/unlock, the spinlock is taken, the shared-lockers count is incremented/decremented, and the spinlock is released). Similar to pg_lwlock, but without the sleep() in the contested path of the spinlock. The contested case of the RW-lock is again unimplemented.

none: No locking.

pg_lwlock: Postgres LWLock implementation. The contested case doesn't work because the workers currently don't have a valid MyProc.

pg_lwlock_cas: Like pg_lwlock but with your CAS patch applied.
On a 4-core Intel Xeon system (Intel Xeon X3430, 2.4 GHz, 8MB cache) I get the following results (wall seconds per cycle and user seconds per cycle):

              |    1 worker     |    2 workers    |    3 workers    |    4 workers    |
              | wallsec usersec | wallsec usersec | wallsec usersec | wallsec usersec |
              |  /cycle  /cycle |  /cycle  /cycle |  /cycle  /cycle |  /cycle  /cycle |
------------------------------------------------------------------------------------
none          | 2.5e-09 2.5e-09 | 1.3e-09 2.5e-09 | 8.5e-10 2.5e-09 | 6.8e-10 2.5e-09 |
atomicinc     | 1.8e-08 1.8e-08 | 2.8e-08 5.6e-08 | 2.7e-08 8.1e-08 | 2.9e-08 1.1e-07 |
cmpxchng      | 3.0e-08 3.0e-08 | 7.1e-08 1.4e-07 | 6.9e-08 2.0e-07 | 7.2e-08 2.9e-07 |
spin          | 5.0e-08 5.0e-08 | 3.0e-07 5.9e-07 | 3.8e-07 1.1e-06 | 4.0e-07 1.5e-06 |
pg_lwlock     | 2.8e-08 2.8e-08 | 2.9e-08 3.0e-08 | 3.0e-08 3.2e-08 | 2.9e-08 3.1e-08 |
pg_lwlock_cas | 2.6e-08 2.6e-08 | 6.2e-08 1.2e-07 | 7.7e-08 2.3e-07 | 7.8e-08 3.1e-07 |

These results allow the following conclusions to be drawn.

First, our current pg_lwlock is quite good at preserving resources, i.e. at not spinning excessively - at least for up to four workers. It's the only lock implementation that consistently uses about 30ns of processor time for one acquire/release cycle. Only atomicinc with one worker (i.e. no cacheline fighting) manages to beat that. It does, however, effectively serialize execution with this workload - the consumed user time is approximately equal to the elapsed wall clock time, even in the case of 4 workers, meaning that most of the time 3 of those 4 workers are sleeping.

Secondly, pg_lwlock_cas and cmpxchng perform similarly, which comes as no surprise, since they effectively use the same algorithm. One could expect spin and pg_lwlock to also perform similarly, but these two don't. The reason is probably that spin uses a rather brain-dead spinning method, while pg_lwlock uses rep; nop.

Now to atomicinc, which is (the fast-path of) the most optimized RW-lock possible, at least on x86 and x86_64. One interesting result here is that wall time seconds / cycle are suspiciously similar for atomicinc and pg_lwlock.
This proves, I think, that the limiting factor in both of these tests is the speed at which cache line invalidation can occur. It doesn't seem to matter whether the other cores are idle (as in the pg_lwlock case), or trying to obtain ownership of the cache line themselves (as in the atomicinc case). Finally, the no-locking test (none) shows that, even though the counter is written to shared memory after every increment, memory bandwidth isn't an issue here, since performance scaled nearly linearly with the number of workers. Here's the result of another run, this time with the per-worker counter being incremented 100 times in each acquire/release cycle.

--- 100 Increments per Cycle ---
              |    1 worker     |    2 workers    |    3 workers    |    4 workers    |
Re: [HACKERS] marking old branches as no longer maintained
On 06/28/2011 01:54 PM, Magnus Hagander wrote: On Tue, Jun 28, 2011 at 19:46, Andrew Dunstanand...@dunslane.net wrote: I'd like to add a feature to the buildfarm that automatically picks up any new branch and automatically stops building any branch we're not maintaining any more. To do the latter, I need some way for the client to detect that we are or aren't interested in a branch. What I'd like to do is add a file to the old back branches (say from 7.4 to 8.1 currently - I can grandfather the rest) called end-of-life-reached or some such, with some appropriate text. As a branch reaches its end of life, i.e. right before the last release we make, we should add that file to the branch. Does this need to be driven out of the main tree? Couldn't you just have a blacklist in the buildfarm code or site? (disclaimer: I haven't looked at how it works so that may be a completely useless idea..) Not very easily, mainly because of difficulties with MinGW SDK perl. Building it into the code would defeat the purpose. If it's contentious I won't bother. We've managed OK for years, and can go on managing as we are. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Fast GiST index build
New version of the patch. A bug which caused failures on trees with a high number of levels was fixed. Also some more comments and refactoring. -- With best regards, Alexander Korotkov. gist_fast_build-0.3.0.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] marking old branches as no longer maintained
On Tue, Jun 28, 2011 at 20:38, Andrew Dunstan and...@dunslane.net wrote: On 06/28/2011 01:54 PM, Magnus Hagander wrote: On Tue, Jun 28, 2011 at 19:46, Andrew Dunstanand...@dunslane.net wrote: I'd like to add a feature to the buildfarm that automatically picks up any new branch and automatically stops building any branch we're not maintaining any more. To do the latter, I need some way for the client to detect that we are or aren't interested in a branch. What I'd like to do is add a file to the old back branches (say from 7.4 to 8.1 currently - I can grandfather the rest) called end-of-life-reached or some such, with some appropriate text. As a branch reaches its end of life, i.e. right before the last release we make, we should add that file to the branch. Does this need to be driven out of the main tree? Couldn't you just have a blacklist in the buildfarm code or site? (disclaimer: I haven't looked at how it works so that may be a completely useless idea..) Not very easily, mainly because of difficulties with MinGW SDK perl. Building it into the code would defeat the purpose. Drat. If it's contentious I won't bother. We've managed OK for years, and can go on managing as we are. If we can find a good way to do it, I think having BF animals automatically picking up new branches is a very good thing to have. So don't give up so easily :D If adding a more or less random file to back branches is the only way to do it, I'm for doing that - I'd just like to find some method that feels cleaner. But maybe I'm just bikeshedding for no real use here. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] marking old branches as no longer maintained
On Tue, Jun 28, 2011 at 8:02 PM, Magnus Hagander mag...@hagander.net wrote: If we can find a good way to do it, I think having BF animals automatically picking up new branches is a very good thing to have. So don't give up so easily :D If adding a more or less random file to back branches is the only way to do it, I'm for doing that - I'd just like to find some method that feels cleaner. But maybe I'm just bikeshedding for no real use here. Adding new branches automatically would be great, but it'll need some work from the animal herders as well as some careful design - for example, my Windows animals have separate schedules for each branch (some running more frequently than others), whilst my Solaris ones now use a runner script that cycles through the list of branches on each of a couple of animals. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] marking old branches as no longer maintained
On 06/28/2011 03:02 PM, Magnus Hagander wrote: If it's contentious I won't bother. We've managed OK for years, and can go on managing as we are. If we can find a good way to do it, I think having BF animals automatically picking up new branches is a very good thing to have. So don't give up so easily :D If adding a more or less random file to back branches is the only way to do it, I'm for doing that - I'd just like to find some method that feels cleaner. But maybe I'm just bikeshedding for no real use here. Another way would be for us to have a file on master with the names of the branches we care about. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] marking old branches as no longer maintained
On 06/28/2011 03:17 PM, Dave Page wrote: On Tue, Jun 28, 2011 at 8:02 PM, Magnus Hagander mag...@hagander.net wrote: If we can find a good way to do it, I think having BF animals automatically picking up new branches is a very good thing to have. So don't give up so easily :D If adding a more or less random file to back branches is the only way to do it, I'm for doing that - I'd just like to find some method that feels cleaner. But maybe I'm just bikeshedding for no real use here. Adding new branches automatically would be great, but it'll need some work from the animal herders as well as some careful design - for example, my Windows animals have separate schedules for each branch (some running more frequently than others), whilst my Solaris ones now use a runner script that cycles through the list of branches on each of a couple of animals. Modern buildfarm code has a wrapper built in, so my crontab usually just looks like this:

27 * * * * cd bf && ./run_branches.pl --config=nightjar.conf --run-all

The buildfarm.conf has a section like this:

if ($branch eq 'global') {
    $conf{branches_to_build} = [qw( HEAD REL9_1_STABLE REL9_0_STABLE
                                    REL8_4_STABLE REL8_3_STABLE REL8_2_STABLE )];
}

What I'd like to do is to allow this to read:

if ($branch eq 'global') {
    $conf{branches_to_build} = 'ALL';
}

and have it choose the right set for you. But if you want to run some more frequently you'd still be stuck having to manage that yourself. There's actually not a lot of point in doing it that way, though. We don't build unless there have been changes on the branch, unless told otherwise, so you might as well run frequently and test all the branches - for the most part only HEAD (i.e. master) will be built because it gets far more changes than the back branches. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
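The 'ALL' setting Andrew describes has to be expanded into a concrete branch list somewhere. A rough illustration of that selection logic (in Python rather than the buildfarm client's Perl; the branch-name pattern and the set of end-of-life branches are assumptions for the sake of the example):

```python
import re

# Hypothetical sketch: expand an 'ALL' branches_to_build setting into the live
# branches, skipping any branch known to have reached end of life (e.g. by
# carrying an end-of-life-reached marker file, as proposed in this thread).
def resolve_branches(setting, all_branches, eol_branches):
    if setting != "ALL":
        return list(setting)  # an explicit list is passed through unchanged
    stable = re.compile(r"^REL\d+_\d+_STABLE$")
    return [b for b in all_branches
            if (b == "HEAD" or stable.match(b)) and b not in eol_branches]

branches = ["HEAD", "REL9_1_STABLE", "REL9_0_STABLE", "REL8_1_STABLE", "REL7_4_STABLE"]
eol = {"REL8_1_STABLE", "REL7_4_STABLE"}  # branches carrying the marker file
print(resolve_branches("ALL", branches, eol))
```

This keeps explicit per-animal lists working while letting animals that opt into 'ALL' track new branches automatically.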
Re: [HACKERS] Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE
On Tue, Jun 28, 2011 at 02:11:11PM -0400, Robert Haas wrote: On Mon, Jun 27, 2011 at 10:43 PM, Noah Misch n...@leadboat.com wrote: On Mon, Jun 27, 2011 at 03:45:43PM -0400, Robert Haas wrote: On Wed, Jun 15, 2011 at 1:03 AM, Noah Misch n...@leadboat.com wrote: [patch to avoid index rebuilds] With respect to the documentation hunks, it seems to me that the first hunk might be made clearer by leaving the paragraph of which it is a part as-is, and adding another paragraph afterwards beginning with the words In addition. The added restriction elaborates on the transitivity requirement, so I wanted to keep the new language adjacent to that. That makes sense, but it reads a bit awkwardly to me. Maybe it's just that the sentence itself is so complicated that I have difficulty understanding it. I guess it's the same problem as with the text you added about hash indexes: without thinking about it, it's hard to understand what territory is covered by the new sentence that is not covered by what's already there. In the case of the hash indexes, I think I have it figured out: we already know that we must get compatible hash values out of logically equal values of different datatypes. But it's possible that the inter-type cast changes the value in some arbitrary way and then compensates for it by defining the hash function in such a way as to compensate. Similarly, for btrees, we need the relative ordering of A and B to remain the same after casting within the opfamily, not to be rearranged somehow. Maybe the text you've got is OK to explain this, but I wonder if there's a way to do it more simply. That's basically right, I think. Presently, there is no connection between casts and operator family notions of equality. For example, a cast can change the hash value. In general, that's not wrong. However, I wish to forbid it when some hash operator family covers both the source and destination types of the cast. 
Note that I don't especially care whether the stored bits changed or not. I just want casts to preserve equality when an operator family defines equality across the types involved in the cast. The specific way of articulating that is probably vulnerable to improvement. It would be valuable to avoid introducing a second chunk of code that knows everything about the catalog entries behind an index. That's what led me to put forward the most recent version as best. What do you find vile about that approach? I wasn't comfortable with it at first, because I suspected the checks in RelationPreserveStorage() might be important for correctness. Having studied it some more, though, I think they just reflect the narrower needs of its current sole user. Maybe vile is a strong word, but it seems like a modularity violation: we're basically letting DefineIndex() do some stuff we don't really want done, and then backing out parts of it that we don't really want done after all. It seems like it would be better to provide DefineIndex with enough information not to do the wrong thing in the first place. Could we maybe pass stmt->oldNode to DefineIndex(), and let it work out the details? True. I initially shied away from that, because we assume somewhat deep into the stack that the new relation will have pg_class.oid = pg_class.relfilenode. Here's the call stack in question:

RelationBuildLocalRelation
heap_create
index_create
DefineIndex
ATExecAddIndex

Looking at it again, it wouldn't be the end of the world to add a relfilenode argument to each. None of those have more than four callers. ATExecAddIndex() would then call RelationPreserveStorage() before calling DefineIndex(), which would in turn put things in a correct state from the start. Does that seem appropriate? Offhand, I do like it better than what I had.
Thanks, nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] marking old branches as no longer maintained
On Tuesday, June 28, 2011, Andrew Dunstan and...@dunslane.net wrote: On 06/28/2011 03:17 PM, Dave Page wrote: On Tue, Jun 28, 2011 at 8:02 PM, Magnus Hagander mag...@hagander.net wrote: If we can find a good way to do it, I think having BF animals automatically picking up new branches is a very good thing to have. So don't give up so easily :D If adding a more or less random file to back branches is the only way to do it, I'm for doing that - I'd just like to find some method that feels cleaner. But maybe I'm just bikeshedding for no real use here. Adding new branches automatically would be great, but it'll need some work from the animal herders as well as some careful design - for example, my Windows animals have separate schedules for each branch (some running more frequently than others), whilst my Solaris ones now use a runner script that cycles through the list of branches on each of a couple of animals. Modern buildfarm code has a wrapper built in, so my crontab usually just looks like this:

27 * * * * cd bf && ./run_branches.pl --config=nightjar.conf --run-all

The buildfarm.conf has a section like this:

if ($branch eq 'global') {
    $conf{branches_to_build} = [qw( HEAD REL9_1_STABLE REL9_0_STABLE
                                    REL8_4_STABLE REL8_3_STABLE REL8_2_STABLE )];
}

What I'd like to do is to allow this to read:

if ($branch eq 'global') {
    $conf{branches_to_build} = 'ALL';
}

and have it choose the right set for you. Oh, cool. Guess I'll be reconfiguring my animals soon :-) But if you want to run some more frequently you'd still be stuck having to manage that yourself. There's actually not a lot of point in doing it that way, though. We don't build unless there have been changes on the branch, unless told otherwise, so you might as well run frequently and test all the branches - for the most part only HEAD (i.e. master) will be built because it gets far more changes than the back branches. It was something Tom asked for ages ago, so he could see if the Windows build got broken more promptly.
I didn't want multiple branches running with increased frequency as I run a number of animals on a single machine with vmware, and a back patched change could cause a lot of extra work. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] marking old branches as no longer maintained
On 06/28/2011 03:48 PM, Dave Page wrote: But if you want to run some more frequently you'd still be stuck having to manage that yourself. There's actually not a lot of point in doing it that way, though. We don't build unless there have been changes on the branch, unless told otherwise, so you might as well run frequently and test all the branches - for the most part only HEAD (i.e. master) will be built because it gets far more changes than the back branches. It was something Tom asked for ages ago, so he could see if the Windows build got broken more promptly. I didn't want multiple branches running with increased frequency as I run a number of animals on a single machine with vmware, and a back patched change could cause a lot of extra work. Oh, so maybe we need to have some sort of throttle. Probably just for non-head or non-(head-or-latest) would suffice. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
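The throttle Andrew suggests amounts to a per-branch minimum rebuild interval that exempts HEAD (or HEAD and the latest stable branch). A minimal Python sketch of that policy, with the interval and state layout as assumptions:

```python
import time

# Hypothetical throttle: always consider HEAD, but only attempt a back branch
# if its last build is older than a minimum gap (here six hours).
def should_attempt(branch, last_build_ts, now, min_back_branch_gap=6 * 3600):
    if branch == "HEAD":
        return True
    return (now - last_build_ts) >= min_back_branch_gap

now = time.time()
print(should_attempt("HEAD", now, now))                  # HEAD is never throttled
print(should_attempt("REL9_0_STABLE", now - 3600, now))  # built 1h ago: skipped
```

Since the client already skips branches with no new commits, this only limits how often a freshly back-patched change can trigger simultaneous builds across many branches on one machine.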
Re: [HACKERS] Fwd: Keywords in pg_hba.conf should be field-specific
Excerpts from Pavel Stehule's message of dom jun 26 13:10:13 -0400 2011: Hello I have touched next_token() and next_token_expand() a bit more, because it seemed to me that they could be simplified further (the bit about returning the comma in the token, instead of being a boolean return, seemed strange). Please let me know what you think. I am thinking, so it is ok. Thanks, I have pushed it. Brendan: thanks for the patch. I am not sure, if load_ident is correct. In load_hba you clean a previous context after successful processing. In load_ident you clean data on start. Is it ok? Yeah, they are a bit inconsistent. I am uninclined to mess with it, though. Right now it seems to me that the only way it could fail is if you hit an out-of-memory or a similar problem. And if you hit that at postmaster startup ... well ... I guess you have A Problem. If somebody wants to submit a patch to fix that, be my guest. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.2] Fix leaky-view problem, part 1
Thanks for your review, 2011/6/28 Noah Misch n...@leadboat.com: I took a look at this patch. It's incredibly simple, which is great, and it seems to achieve its goal. Suppose your query references two views owned by different roles. The quals of those views will have the same depth. Is there a way for information to leak from one view owner to another due to that? Even if multiple subqueries were pulled up and qualifiers got merged, user-given qualifiers will have a smaller depth value, so they will always be evaluated after the qualifiers originally used in the subqueries. Of course, it is another topic in the case when the view is originally defined with leaky functions. I like how you've assumed that the table owner trusts the operator class functions of indexes on his table to not leak information. That handily catches some basic and important qual pushdowns that would otherwise be lost. This makes assumptions, at a distance, about the possible scan types and how quals can be fitted to them. Specifically, this patch achieves its goals because any indexable qual is trustworthy, and any non-indexable qual cannot be pushed down further than the view's own quals. That seems to be true with current plan types, but it feels fragile. I don't have a concrete idea for improvement, though. Robert suggested another approach in http://archives.postgresql.org/message-id/aanlktimbn_6tyxreh5rc7pmizt-vjb3xgp8cuho0o...@mail.gmail.com ; might that approach avoid this hazard? The reason why we didn't adopt the idea of checking privileges of underlying tables is that PostgreSQL checks privileges in the executor phase, not the planner phase. If we try to have a flag on pg_proc, it is tough work to categorize trustworthy functions and non-trustworthy ones from the beginning, because we have more than 2000 built-in functions. So, it is a reasonable assumption that index access methods do not leak the contents of scanned tuples, because only a superuser can define them.
The part 2 patch in this series implements the planner restriction more likely to yield performance regressions, so it introduces a mechanism for identifying when to apply the restriction. This patch, however, applies its restriction unconditionally. Since we will inevitably have such a mechanism before you are done sealing the leaks in our view implementation, the restriction in this patch should also use that mechanism. Therefore, I think we should review and apply part 2 first, then update this patch to use its conditional behavior. The reason why this patch always gives the depth higher priority is that the matter is relatively minor compared to the issue the part 2 patch tries to tackle. That one affects the selection of the scan plan (IndexScan or SeqScan), so it is a significant decision that should be controllable. However, this issue is just on a particular scan. In addition, the implementation will become complex if both qualifiers pulled up from a security barrier view and qualifiers pulled up from regular views are mixed within a single qualifier list. A few minor questions/comments on the implementation: On Mon, Jun 06, 2011 at 01:37:11PM +0100, Kohei Kaigai wrote: --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c + else if (IsA(node, Query)) + { + depth += 2; Why two? This patch is groundwork for the upcoming row-level security feature. In the case when the backend appends security policy functions, they should be evaluated prior to user-given qualifiers. So, I intend to give odd depth numbers to these functions so they have higher priority than the others. Of course, 1 might work well right now. --- a/src/test/regress/expected/select_views.out +++ b/src/test/regress/expected/select_views.out +EXPLAIN SELECT * FROM your_credit WHERE f_leak(number,expired); + QUERY PLAN +-- + Seq Scan on credit_cards (cost=0.00..181.20 rows=1 width=96) Use EXPLAIN (COSTS OFF) in regression tests.
We do not put much effort into the stability of exact cost values, and they do not matter for the purpose of this test. OK, fixed. --- a/src/test/regress/sql/select_views.sql +++ b/src/test/regress/sql/select_views.sql @@ -8,3 +8,46 @@ SELECT * FROM street; SELECT name, #thepath FROM iexit ORDER BY 1, 2; SELECT * FROM toyemp WHERE name = 'sharon'; + +-- +-- Test for leaky-view problem +-- + +-- setups +SET client_min_messages TO 'warning'; + +DROP ROLE IF EXISTS alice; +DROP FUNCTION IF EXISTS f_leak(text); +DROP TABLE IF EXISTS credit_cards; + +RESET client_min_messages; No need for this. The regression tests always start on a clean database. Fixed. + +CREATE USER alice; +CREATE FUNCTION f_leak(text, text) + RETURNS bool LANGUAGE 'plpgsql' + AS 'begin raise notice ''% = %'', $1, $2; return true; end'; I ran this test case on master, and it did not
Re: [HACKERS] spinlock contention
On Tue, Jun 28, 2011 at 2:33 PM, Florian Pflug f...@phlo.org wrote: [ testing of various spinlock implementations ] I set T=30 and N=1 2 4 8 16 32 and tried this out on a 32-core loaner from Nate Boley:

100 counter increments per cycle

 workers         1                  2                  4                  8                  16                 32
               wall     user      wall     user      wall     user      wall     user      wall     user      wall     user
 none          2.8e-07  2.8e-07   1.5e-07  3.0e-07   8.0e-08  3.2e-07   4.2e-08  3.3e-07   2.1e-08  3.3e-07   1.1e-08  3.4e-07
 atomicinc     3.6e-07  3.6e-07   2.6e-07  5.1e-07   1.4e-07  5.5e-07   1.4e-07  1.1e-06   1.5e-07  2.3e-06   1.5e-07  4.9e-06
 cmpxchng      3.6e-07  3.6e-07   3.4e-07  6.9e-07   3.2e-07  1.3e-06   2.9e-07  2.3e-06   4.2e-07  6.6e-06   4.5e-07  1.4e-05
 spin          4.1e-07  4.1e-07   2.8e-07  5.7e-07   1.6e-07  6.3e-07   1.2e-06  9.4e-06   3.8e-06  6.1e-05   1.4e-05  4.3e-04
 pg_lwlock     3.8e-07  3.8e-07   2.7e-07  5.3e-07   1.5e-07  6.2e-07   3.9e-07  3.1e-06   1.6e-06  2.5e-05   6.4e-06  2.0e-04
 pg_lwlock_cas 3.7e-07  3.7e-07   2.8e-07  5.6e-07   1.4e-07  5.8e-07   1.4e-07  1.1e-06   1.9e-07  3.0e-06   2.4e-07  7.5e-06

I wrote a little script to reorganize this data in a possibly-easier-to-understand format - ordering each column from lowest to highest, and showing each algorithm as a multiple of the cheapest value for that column:

wall-1: none(1.0),atomicinc(1.3),cmpxchng(1.3),pg_lwlock_cas(1.3),pg_lwlock(1.4),spin(1.5)
user-1: none(1.0),atomicinc(1.3),cmpxchng(1.3),pg_lwlock_cas(1.3),pg_lwlock(1.4),spin(1.5)
wall-2: none(1.0),atomicinc(1.7),pg_lwlock(1.8),spin(1.9),pg_lwlock_cas(1.9),cmpxchng(2.3)
user-2: none(1.0),atomicinc(1.7),pg_lwlock(1.8),pg_lwlock_cas(1.9),spin(1.9),cmpxchng(2.3)
wall-4: none(1.0),atomicinc(1.7),pg_lwlock_cas(1.7),pg_lwlock(1.9),spin(2.0),cmpxchng(4.0)
user-4: none(1.0),atomicinc(1.7),pg_lwlock_cas(1.8),pg_lwlock(1.9),spin(2.0),cmpxchng(4.1)
wall-8: none(1.0),atomicinc(3.3),pg_lwlock_cas(3.3),cmpxchng(6.9),pg_lwlock(9.3),spin(28.6)
user-8: none(1.0),atomicinc(3.3),pg_lwlock_cas(3.3),cmpxchng(7.0),pg_lwlock(9.4),spin(28.5)
wall-16: none(1.0),atomicinc(7.1),pg_lwlock_cas(9.0),cmpxchng(20.0),pg_lwlock(76.2),spin(181.0)
user-16: none(1.0),atomicinc(7.0),pg_lwlock_cas(9.1),cmpxchng(20.0),pg_lwlock(75.8),spin(184.8)
wall-32: none(1.0),atomicinc(13.6),pg_lwlock_cas(21.8),cmpxchng(40.9),pg_lwlock(581.8),spin(1272.7)
user-32: none(1.0),atomicinc(14.4),pg_lwlock_cas(22.1),cmpxchng(41.2),pg_lwlock(588.2),spin(1264.7)

Here's the result of the same calculation applied to your second set of data:

wall-1: none(1.0),atomicinc(1.0),cmpxchng(1.0),pg_lwlock(1.0),pg_lwlock_cas(1.0),spin(1.2)
user-1: none(1.0),atomicinc(1.0),cmpxchng(1.0),pg_lwlock(1.0),pg_lwlock_cas(1.0),spin(1.2)
wall-2: none(1.0),atomicinc(1.0),cmpxchng(1.0),pg_lwlock(1.0),spin(1.2),pg_lwlock_cas(1.2)
user-2: none(1.0),cmpxchng(1.0),pg_lwlock(1.0),atomicinc(1.0),spin(1.2),pg_lwlock_cas(1.2)
wall-3: none(1.0),pg_lwlock(1.0),atomicinc(1.0),cmpxchng(1.0),spin(1.3),pg_lwlock_cas(1.3)
user-3: none(1.0),atomicinc(1.0),pg_lwlock(1.0),cmpxchng(1.0),spin(1.3),pg_lwlock_cas(1.3)
wall-4: none(1.0),atomicinc(1.0),cmpxchng(1.2),pg_lwlock_cas(1.3),pg_lwlock(1.8),spin(5.2)
user-4: none(1.0),atomicinc(1.0),cmpxchng(1.2),pg_lwlock_cas(1.3),pg_lwlock(1.8),spin(5.4)

There seems to be something a bit funky in your 3-core data, but overall I read this data to indicate that 4 cores aren't really enough to see a severe problem with spinlock contention. I'm not even sure that you can see this problem on a real workload on a 16-core system. But as you add cores the problem gets rapidly worse, because the more cores you have, the more likely it is that someone else is already holding the spinlock (or has just very recently held the spinlock) that you want. And, of course, the problem multiplies itself, because once your lock acquisitions start taking longer, they become a larger percentage of your execution time, which makes it proportionately more likely that you'll find someone already there when you go to grab the lock.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
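The reordering script Robert mentions is not attached to the message. A minimal Python sketch of the same transformation, using a subset of the wall/user timings from the 32-core run (the input layout is an assumption):

```python
# For each (metric, worker-count) column, sort algorithms from cheapest to most
# expensive and express each timing as a multiple of the cheapest one.
timings = {
    # algorithm: [(wall, user) for N = 1, 2, 4, 8, 16, 32 workers]
    "none":      [(2.8e-07, 2.8e-07), (1.5e-07, 3.0e-07), (8.0e-08, 3.2e-07),
                  (4.2e-08, 3.3e-07), (2.1e-08, 3.3e-07), (1.1e-08, 3.4e-07)],
    "atomicinc": [(3.6e-07, 3.6e-07), (2.6e-07, 5.1e-07), (1.4e-07, 5.5e-07),
                  (1.4e-07, 1.1e-06), (1.5e-07, 2.3e-06), (1.5e-07, 4.9e-06)],
    "spin":      [(4.1e-07, 4.1e-07), (2.8e-07, 5.7e-07), (1.6e-07, 6.3e-07),
                  (1.2e-06, 9.4e-06), (3.8e-06, 6.1e-05), (1.4e-05, 4.3e-04)],
}

def normalize(timings, workers=(1, 2, 4, 8, 16, 32)):
    lines = []
    for i, n in enumerate(workers):
        for metric, field in (("wall", 0), ("user", 1)):
            col = sorted((vals[i][field], name) for name, vals in timings.items())
            cheapest = col[0][0]
            ranked = ",".join(f"{name}({t / cheapest:.1f})" for t, name in col)
            lines.append(f"{metric}-{n}: {ranked}")
    return lines

for line in normalize(timings):
    print(line)
```

With all six algorithms in the dictionary this reproduces the `wall-N:`/`user-N:` listings shown in the message.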
Re: [HACKERS] Fwd: Keywords in pg_hba.conf should be field-specific
2011/6/28 Alvaro Herrera alvhe...@commandprompt.com: Excerpts from Pavel Stehule's message of dom jun 26 13:10:13 -0400 2011: Hello I have touched next_token() and next_token_expand() a bit more, because it seemed to me that they could be simplified further (the bit about returning the comma in the token, instead of being a boolean return, seemed strange). Please let me know what you think. I am thinking, so it is ok. Thanks, I have pushed it. Brendan: thanks for the patch. I am not sure, if load_ident is correct. In load_hba you clean a previous context after successful processing. In load_ident you clean data on start. Is it ok? Yeah, they are a bit inconsistent. I am uninclined to mess with it, though. Right now it seems to me that the only way it could fail is if you hit an out-of-memory or a similar problem. And if you hit that at postmaster startup ... well ... I guess you have A Problem. there should be a format (syntax) error. If somebody breaks pg_ident and then does a reload, all ident mappings are lost. This is not related to the topic and doesn't modify current behavior, so it should not be fixed now. Pavel If somebody wants to submit a patch to fix that, be my guest. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types, constructors, and the type system
On Jun 28, 2011, at 05:42 , Jeff Davis wrote: On Mon, 2011-06-27 at 14:50 -0400, Robert Haas wrote: Couldn't we also do neither of these things? I mean, presumably '[1,10]'::int8range had better work. I think that if we combine this idea with Florian's PAIR suggestion here: http://archives.postgresql.org/message-id/ad4fc75d-db99-48ed-9082-52ee3a4d7...@phlo.org then I think we have a solution. If we add a type RANGEINPUT that is not a pseudotype, we can use that as an intermediate type that is returned by range constructors. Then, we add casts from RANGEINPUT to each range type. That would allow range(1,2)::int8range to work without changing the type system around, because range() would have the signature: range(ANYELEMENT, ANYELEMENT) -> RANGEINPUT and then the cast would change it into an int8range. But we only need the one cast per range type, and we can also support all of the other kinds of constructors like: range_cc(ANYELEMENT, ANYELEMENT) -> RANGEINPUT range_linf_c(ANYELEMENT) -> RANGEINPUT without additional hassle. Hm, so RANGEINPUT would actually be what was previously discussed as the range as a pair of bounds definition, as opposed to the range as a set of values definition. So essentially we'd add a second concept of what a range is to work around the range input troubles. I believe if we go that route we should make RANGEINPUT a full-blown type, having pair-of-bounds semantics. Adding a lobotomized version just for the sake of range input feels a bit like a kludge to me. Alternatively, we could replace RANGEINPUT simply with ANYARRAY, and add functions ANYRANGE -> ANYRANGE which allow specifying the bound operator (<, <= respectively >, >=) after construction.
So you'd write (using the functions-as-fields syntax I believe we support) (ARRAY[1,2]::int8range).left_open.right_closed for '(1,2]' and ARRAY[NULL,2]::int8range for '[-inf,2]' All assuming that modifying the type system to support polymorphic type resolution based on the return type is out of the question... ;-) best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
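Florian's "pair of bounds" reading of RANGEINPUT, as distinct from the "set of values" reading a concrete range type has, can be illustrated with a toy model (Python, not PostgreSQL code; all names here are invented for illustration):

```python
from dataclasses import dataclass

# A RANGEINPUT-style value just records the endpoints plus open/closed flags;
# only a later "cast" interprets it over a concrete element type.
@dataclass(frozen=True)
class RangeInput:
    lower: object            # None models an infinite bound
    upper: object
    lower_closed: bool = True
    upper_closed: bool = True

def contains_int(r: RangeInput, x: int) -> bool:
    # Interpret the bound pair over the integers (the "cast" to an int range).
    if r.lower is not None:
        if x < r.lower or (x == r.lower and not r.lower_closed):
            return False
    if r.upper is not None:
        if x > r.upper or (x == r.upper and not r.upper_closed):
            return False
    return True

half_open = RangeInput(1, 2, lower_closed=False)   # models '(1,2]'
print(contains_int(half_open, 1), contains_int(half_open, 2))
```

The point of the model: the constructor output carries no element-type semantics of its own, which is exactly why a cast per range type suffices.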
Re: [HACKERS] Fwd: Keywords in pg_hba.conf should be field-specific
Excerpts from Pavel Stehule's message of mar jun 28 16:19:02 -0400 2011: there should be a format (syntax) error. If somebody breaks a pg_ident and will do a reload, then all ident mapping is lost. No, the file is not actually parsed until the auth verification runs. The incorrect tokens are happily stored in memory by the tokeniser. In my view that's the wrong way to do it, but changing it seems to be a lot of work (I didn't try). -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fwd: Keywords in pg_hba.conf should be field-specific
2011/6/28 Alvaro Herrera alvhe...@commandprompt.com: Excerpts from Pavel Stehule's message of mar jun 28 16:19:02 -0400 2011: there should be a format (syntax) error. If somebody breaks a pg_ident and will do a reload, then all ident mapping is lost. No, the file is not actually parsed until the auth verification runs. The incorrect tokens are happily stored in memory by the tokeniser. In my view that's the wrong way to do it, but changing it seems to be a lot of work (I didn't try). ok -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] marking old branches as no longer maintained
On tis, 2011-06-28 at 15:37 -0400, Andrew Dunstan wrote: What I'd like to do is to allow this to read: if ($branch eq 'global') { $conf{branches_to_build} = 'ALL'; } and have it choose the right set for you. It seems to me that if you put a marker file into old branches, you'd still have to check out and poll the old branches, which could become somewhat expensive as the number of old branches grows. Couldn't you just put a text file on the build farm server with recommended branches? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI modularity questions
2011/6/28, Robert Haas robertmh...@gmail.com: You know, it just occurred to me while reading this email that you're using the term predicate lock in a way that is totally different from what I learned in school. What I was taught is that the word predicate in predicate lock is like the word tuple in tuple lock or the word relation in relation lock - that is, it describes *the thing being locked*. In other words, you are essentially doing: LOCK TABLE foo WHERE i = 1; I think that what you're calling the predicate lock manager should really be called the siread lock manager, and all of the places where you are predicate locking a tuple should really be siread locking the tuple. The predicate in the full table case is: any tuple in this table (including tuples that don't exist yet, otherwise it wouldn't be a predicate). The predicate in the index case is: any tuple that would be returned by so-and-such index scan (idem regarding tuples that don't exist yet, hence locking the gaps). The lock semantics (i.e., how conflicts between it and other locks are defined and treated) are siread. The thing that it applies to is a predicate. (I.e., PostgreSQL before SSI already supported some rather trivial kind of predicate lock: the full table lock.) Conclusion: I don't see the problem :-). Nicolas -- A. Because it breaks the logical sequence of discussion. Q. Why is top posting bad? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] marking old branches as no longer maintained
On 06/28/2011 04:51 PM, Peter Eisentraut wrote: On tis, 2011-06-28 at 15:37 -0400, Andrew Dunstan wrote: What I'd like to do is to allow this to read: if ($branch eq 'global') { $conf{branches_to_build} = 'ALL'; } and have it choose the right set for you. It seems to me that if you put a marker file into old branches, you'd still have to check out and poll the old branches, which could become somewhat expensive as the number of old branches grows. No, not really. I'd use `git ls-tree $branchname $filetolookfor`. I have tested it and this works just fine, takes a second or so. Couldn't you just put a text file on the build farm server with recommended branches? As I told Magnus, that gets ugly because of limitations in MinGW's SDK perl. I suppose I could just not implement the feature for MinGW, but I've tried damn hard not to make those sorts of compromises and I'm not keen to start. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
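The `git ls-tree` trick works because the command prints one line for the path if it exists on the given branch and nothing otherwise, without needing a checkout. A sketch of that check (Python rather than the client's Perl; the repo path and marker-file name are assumptions):

```python
import subprocess

# Ask git whether a marker file exists on a branch, without checking it out.
def marker_file_present(repo, branch, marker="end-of-life-reached"):
    out = subprocess.run(
        ["git", "ls-tree", branch, marker],
        cwd=repo, capture_output=True, text=True, check=True,
    ).stdout
    return has_entry(out)

def has_entry(ls_tree_output):
    # Non-empty output means the path exists in that branch's tree.
    return ls_tree_output.strip() != ""
```

Because `git ls-tree` only reads the local object store, this stays cheap even as the number of old branches grows.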
Re: [HACKERS] marking old branches as no longer maintained
On tis, 2011-06-28 at 17:05 -0400, Andrew Dunstan wrote: Couldn't you just put a text file on the build farm server with recommended branches? As I told Magnus, that gets ugly because of limitations in MinGW's SDK perl. I suppose I could just not implement the feature for MinGW, but I've tried damn hard not to make those sorts of compromises and I'm not keen to start. The buildfarm code can upload the build result via HTTP; why can't it download a file via HTTP? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] spinlock contention
On Tue, Jun 28, 2011 at 3:18 PM, Robert Haas robertmh...@gmail.com wrote: user-32: none(1.0),atomicinc(14.4),pg_lwlock_cas(22.1),cmpxchng(41.2),pg_lwlock(588.2),spin(1264.7) I may not be following all this correctly, but doesn't this suggest a huge potential upside for the CAS-based patch you posted upthread when combined with your earlier patches that were bogging down on spinlock contention? merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI modularity questions
Heikki Linnakangas wrote: On 28.06.2011 20:47, Kevin Grittner wrote: Hmm, the calls in question are the ones in heapgettup() and heapgettup_pagemode(), which are subroutines of heap_getnext(). heap_getnext() is only used in sequential scans, so it seems safe to remove those calls. I haven't found anything to the contrary, if I understand correctly, Dan found the same, and all the tests pass without them. Here's a patch to remove them. This makes the recently-added rs_relpredicatelocked boolean field unnecessary, so that's removed in this patch, too. I would like to add a test involving a lossy bitmap scan. How many rows are normally needed to force a bitmap scan to be lossy? The size of bitmaps is controlled by work_mem, so you can set work_mem very small to cause them to become lossy earlier. Off the top of my head I don't have any guesstimate on how many rows you need. What's the easiest way to check whether a plan is going to use (or is using) a lossy bitmap scan? Good question. There doesn't seem to be anything in the EXPLAIN ANALYZE output to show that, so I think you'll have to resort to adding some elog()s in the right places. OK, thanks. -Kevin ssi-seqscan-cleanup.patch Description: Binary data
Re: [HACKERS] spinlock contention
On Tue, Jun 28, 2011 at 5:33 PM, Merlin Moncure mmonc...@gmail.com wrote: On Tue, Jun 28, 2011 at 3:18 PM, Robert Haas robertmh...@gmail.com wrote: user-32: none(1.0),atomicinc(14.4),pg_lwlock_cas(22.1),cmpxchng(41.2),pg_lwlock(588.2),spin(1264.7) I may not be following all this correctly, but doesn't this suggest a huge potential upside for the cas based patch you posted upthread when combined with your earlier patches that were bogging down on spinlock contention? Well, you'd think so, but in fact that patch makes it slower. Don't ask me why, 'cuz I dunno. :-( -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] marking old branches as no longer maintained
On 06/28/2011 05:31 PM, Peter Eisentraut wrote: On tis, 2011-06-28 at 17:05 -0400, Andrew Dunstan wrote: Couldn't you just put a text file on the build farm server with recommended branches? As I told Magnus, that gets ugly because of limitations in MinGW's SDK perl. I suppose I could just not implement the feature for MinGW, but I've tried damn hard not to make those sorts of compromises and I'm not keen to start. The buildfarm code can upload the build result via HTTP; why can't it download a file via HTTP? It has to use a separate script to do that. I don't really want to add another one just for this. (thinks a bit) I suppose I can make it do:

my $url = "http://buildfarm.postgresql.org/branches_of_interest.txt";
my $branches_of_interest = `perl -MLWP::Simple -e "getprint(q{$url})"`;

Maybe that's the best option. It's certainly going to be less code than anything else :-) cheers andrew
Re: [HACKERS] spinlock contention
On Jun 28, 2011, at 23:48 , Robert Haas wrote: On Tue, Jun 28, 2011 at 5:33 PM, Merlin Moncure mmonc...@gmail.com wrote: On Tue, Jun 28, 2011 at 3:18 PM, Robert Haas robertmh...@gmail.com wrote: user-32: none(1.0),atomicinc(14.4),pg_lwlock_cas(22.1),cmpxchng(41.2),pg_lwlock(588.2),spin(1264.7) I may not be following all this correctly, but doesn't this suggest a huge potential upside for the cas based patch you posted upthread when combined with your earlier patches that were bogging down on spinlock contention? Well, you'd think so, but in fact that patch makes it slower. Don't ask me why, 'cuz I dunno. :-( Huh? Where do you see your CAS patch being slower than unpatched LWLocks in these results? Or are you referring to pgbench runs you made, not to these artificial benchmarks? best regards, Florian Pflug
Re: [HACKERS] spinlock contention
On Tue, Jun 28, 2011 at 5:55 PM, Florian Pflug f...@phlo.org wrote: On Jun 28, 2011, at 23:48 , Robert Haas wrote: On Tue, Jun 28, 2011 at 5:33 PM, Merlin Moncure mmonc...@gmail.com wrote: On Tue, Jun 28, 2011 at 3:18 PM, Robert Haas robertmh...@gmail.com wrote: user-32: none(1.0),atomicinc(14.4),pg_lwlock_cas(22.1),cmpxchng(41.2),pg_lwlock(588.2),spin(1264.7) I may not be following all this correctly, but doesn't this suggest a huge potential upside for the cas based patch you posted upthread when combined with your earlier patches that were bogging down on spinlock contention? Well, you'd think so, but in fact that patch makes it slower. Don't ask me why, 'cuz I dunno. :-( Huh? Where do you see your CAS patch being slower than unpatched LWLocks in these results? Or are you referring to pgbench runs you made, not to these artificial benchmarks? pgbench -S -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [v9.2] Fix leaky-view problem, part 1
On Tue, Jun 28, 2011 at 10:11:59PM +0200, Kohei KaiGai wrote: 2011/6/28 Noah Misch n...@leadboat.com: Suppose your query references two views owned by different roles. The quals of those views will have the same depth. Is there a way for information to leak from one view owner to another due to that? Even if multiple subqueries were pulled up and qualifiers got merged, user-given qualifiers shall have a smaller depth value, so they will always be launched after the qualifiers originally used in the subqueries. Of course, it is another topic in the case when the view is originally defined with leaky functions. Right. I was thinking of a pair of quals, one in each of two view definitions. The views are mutually-untrusting. Something like this:

CREATE VIEW a AS SELECT * FROM ta WHERE ac = 5;
ALTER VIEW a OWNER TO alice;
CREATE VIEW b AS SELECT * FROM tb WHERE bc = 6;
ALTER VIEW b OWNER TO bob;
SELECT * FROM a, b;

Both the ac=5 and the bc=6 quals do run at the same depth despite enforcing security for different principals. I can't think of a way that one view owner could use this situation to subvert the security of the other view owner, but I wanted to throw it out. This makes assumptions, at a distance, about the possible scan types and how quals can be fitted to them. Specifically, this patch achieves its goals because any indexable qual is trustworthy, and any non-indexable qual cannot be pushed down further than the view's own quals. That seems to be true with current plan types, but it feels fragile. I don't have a concrete idea for improvement, though. Robert suggested another approach in http://archives.postgresql.org/message-id/aanlktimbn_6tyxreh5rc7pmizt-vjb3xgp8cuho0o...@mail.gmail.com ; might that approach avoid this hazard? The reason why we didn't adopt the idea to check privileges of underlying tables is that PostgreSQL checks privileges on executor phase, not planner phase.
If we try to have a flag on pg_proc, it is tough work to categorize trustworthy functions and non-trustworthy ones from the beginning, because we have more than 2000 built-in functions. So, it is a reasonable assumption that index access methods do not leak the contents of tuples scanned, because only superusers can define them. I was referring to this paragraph: On the technical side, I am pretty doubtful that the approach of adding a nestlevel to FuncExpr and RelOptInfo is the right way to go. I believe we have existing code (to handle left joins) that prevents quals from being pushed down too far by fudging the set of relations that are supposedly needed to evaluate the qual. I suspect a similar approach would work here. The part 2 patch in this series implements the planner restriction more likely to yield performance regressions, so it introduces a mechanism for identifying when to apply the restriction. This patch, however, applies its restriction unconditionally. Since we will inevitably have such a mechanism before you are done sealing the leaks in our view implementation, the restriction in this patch should also use that mechanism. Therefore, I think we should review and apply part 2 first, then update this patch to use its conditional behavior. The reason why this patch always gives the depth higher priority is that the matter is relatively minor compared to the issue the part 2 patch tries to tackle. That affects the selection of scan plan (IndexScan or SeqScan), so it is a significant decision to be controllable. However, this issue is just on a particular scan. True. The lost optimization opportunity is relatively minor, but perhaps not absolutely minor. It would be one thing if we could otherwise get away without designing any mechanism for applying these restrictions conditionally. However, since you have implemented the conditional behavior elsewhere, it would be nice to apply it here.
In addition, the implementation will become complex if both qualifiers pulled up from a security barrier view and qualifiers pulled up from regular views are mixed within a single qualifier list. I only scanned the part 2 patch, but isn't the bookkeeping already happening for its purposes? How much more complexity would we get to apply the same strategy to the behavior of this patch? On Mon, Jun 06, 2011 at 01:37:11PM +0100, Kohei Kaigai wrote:

--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
+		else if (IsA(node, Query))
+		{
+			depth += 2;

Why two? This patch is groundwork for the upcoming row-level security feature. In the case when the backend appends security policy functions, they should be launched prior to user-given qualifiers. So, I intend to give odd depth numbers to these functions so they have higher priority than the others. Of course, 1 might work well right now. I'd say it should either be 1 until such time as that's needed,
Re: [HACKERS] spinlock contention
On Jun 28, 2011, at 22:18 , Robert Haas wrote: On Tue, Jun 28, 2011 at 2:33 PM, Florian Pflug f...@phlo.org wrote: [ testing of various spinlock implementations ] I set T=30 and N=1 2 4 8 16 32 and tried this out on a 32-core loaner from Nate Boley: Cool, thanks! 100 counter increments per cycle (wall and user time per cycle, in seconds):

worker           1               2               4               8               16              32
                 wall    user    wall    user    wall    user    wall    user    wall    user    wall    user
none             2.8e-07 2.8e-07 1.5e-07 3.0e-07 8.0e-08 3.2e-07 4.2e-08 3.3e-07 2.1e-08 3.3e-07 1.1e-08 3.4e-07
atomicinc        3.6e-07 3.6e-07 2.6e-07 5.1e-07 1.4e-07 5.5e-07 1.4e-07 1.1e-06 1.5e-07 2.3e-06 1.5e-07 4.9e-06
cmpxchng         3.6e-07 3.6e-07 3.4e-07 6.9e-07 3.2e-07 1.3e-06 2.9e-07 2.3e-06 4.2e-07 6.6e-06 4.5e-07 1.4e-05
spin             4.1e-07 4.1e-07 2.8e-07 5.7e-07 1.6e-07 6.3e-07 1.2e-06 9.4e-06 3.8e-06 6.1e-05 1.4e-05 4.3e-04
pg_lwlock        3.8e-07 3.8e-07 2.7e-07 5.3e-07 1.5e-07 6.2e-07 3.9e-07 3.1e-06 1.6e-06 2.5e-05 6.4e-06 2.0e-04
pg_lwlock_cas    3.7e-07 3.7e-07 2.8e-07 5.6e-07 1.4e-07 5.8e-07 1.4e-07 1.1e-06 1.9e-07 3.0e-06 2.4e-07 7.5e-06

And here's the throughput table calculated from your results, i.e. the wall time per cycle relative to the wall time per cycle for 1 worker.
workers          2     4     8     16    32
none             1.9   3.5   6.7   13    26
atomicinc        1.4   2.6   2.6   2.4   2.4
cmpxchng         1.1   1.1   1.2   0.9   0.8
spin             1.5   2.6   0.3   0.1   0.0
pg_lwlock        1.4   2.5   1.0   0.2   0.1
pg_lwlock_cas    1.3   2.6   2.6   1.9   1.5

Hm, so in the best case we get 2.6x the throughput of a single core, and that only for 4 and 8 workers (1.4e-07 seconds/cycle vs. 3.6e-07). In that case, there also seems to be little difference between pg_lwlock{_cas} and atomicinc. atomicinc again manages to at least sustain that throughput when the worker count is increased, while for the others the throughput actually *decreases*. What totally puzzles me is that your results don't show any trace of a decreased system load for the pg_lwlock implementation, which I'd have expected due to the sleep() in the contested path. Here are the user vs. wall time ratios - I'd have expected to see values significantly below the number of workers for pg_lwlock:

workers          1     2     4     8     16    32
none             1.0   2.0   4.0   7.9   16    31
atomicinc        1.0   2.0   3.9   7.9   15    33
cmpxchng         1.0   2.0   4.1   7.9   16    31
spin             1.0   2.0   3.9   7.8   16    31
pg_lwlock        1.0   2.0   4.1   7.9   16    31
pg_lwlock_cas    1.0   2.0   4.1   7.9   16    31

I wrote a little script to reorganize this data in a possibly-easier-to-understand format - ordering each column from lowest to highest, and showing each algorithm as a multiple of the cheapest value for that column: If you're OK with that, I'd like to add that to the lockbench repo. There seems to be something a bit funky in your 3-core data, but overall I read this data to indicate that 4 cores aren't really enough to see a severe problem with spinlock contention. Hm, it starts to show if you lower the counter increment per cycle (the D constant in run.sh). But yeah, it's never as bad as the 32-core results above.. best regards, Florian Pflug
Re: [HACKERS] pg_upgrade defaulting to port 25432
Peter Eisentraut wrote: On mån, 2011-06-27 at 14:34 -0400, Bruce Momjian wrote: Bruce Momjian wrote: Robert Haas wrote: It's easier to read the patches if you do separate changes in separate patches. Anyway, I'm a bit nervous about this hunk:

+	if (old_cluster.port == DEF_PGUPORT)
+		pg_log(PG_FATAL, "When checking a live old server, "
+			   "you must specify the old server's port number.\n");

Is the implication here that I'm now going to need to specify more than 4 command-line options/environment variables for this to work? Yes, we don't inherit PGPORT anymore. Doing anything else was too complex to explain in the docs. But only if you are running --check on a live server. Otherwise, we will just default to 50432 instead of 5432/PGPORT. When checking a live server, the built-in default port number or the value of the environment variable PGPORT is used. But when performing an upgrade, a different port number is used by default, namely 50432, which can be overridden XXX [how?] Seems pretty clear to me, as long as that last bit is figured out. Not sure where you got that text --- I assume that was an old email. I decided it was too complex to have PGPORT be honored only if there is a live server, so I just always default to 50432 for both servers, and I added this error check:

	if (user_opts.check && is_server_running(old_cluster.pgdata))
	{
		*live_check = true;
+		if (old_cluster.port == DEF_PGUPORT)
+			pg_log(PG_FATAL, "When checking a live old server, "
+				   "you must specify the old server's port number.\n");

OK? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] spinlock contention
On Tue, Jun 28, 2011 at 8:11 PM, Florian Pflug f...@phlo.org wrote: I wrote a little script to reorganize this data in a possibly-easier-to-understand format - ordering each column from lowest to highest, and showing each algorithm as a multiple of the cheapest value for that column: If you're OK with that, I'd like to add that to the lockbench repo. It's a piece of crap, but you're welcome to try to fix it up. See attached. With respect to the apparent lack of impact of the sleep loop, I can only speculate that this test case doesn't really mimic real-world conditions inside the backend very well. There is a lot more going on there - multiple locks, lots of memory access, etc. But I don't really understand it either. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company sortby Description: Binary data
Re: [HACKERS] time-delayed standbys
On Wed, Jun 15, 2011 at 1:58 AM, Fujii Masao masao.fu...@gmail.com wrote: After we run pg_ctl promote, time-delayed replication should be disabled? Otherwise, failover might take a very long time when we set recovery_time_delay to a high value. PFA a patch that I believe will disable recovery_time_delay after promotion. The only change from the previous version is:

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 1dbf792..41b3ae9 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5869,7 +5869,7 @@ pg_is_xlog_replay_paused(PG_FUNCTION_ARGS)
 static void
 recoveryDelay(void)
 {
-	while (1)
+	while (!CheckForStandbyTrigger())
 	{
 		long		secs;
 		int			microsecs;

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company time-delayed-standby-v2.patch Description: Binary data
Re: [HACKERS] Online base backup from the hot-standby
On 11-06-28 01:52 AM, Jun Ishiduka wrote: Considering everything that has been discussed on this thread so far, do you still think your patch is the best way to accomplish base backups from standby servers? If not, what changes do you think should be made? I reconsidered the way to not use pg_stop_backup(). Process of online base backup on a standby server:
1. pg_start_backup('x');
2. copy the data directory
3. copy *pg_control*
Behavior during restore:
* read the minimum recovery ending location from the copied pg_control.
* use that value for the same purposes as the end-of-backup location.
- When the value is equal to 0/0, skip this behavior; that situation means the backup was taken from the master server.
The behaviour you describe above sounds okay to me; if someone else sees issues with this then they should speak up (ideally before you go off and write a new patch). I'm going to consolidate my other comments below so this can act as a more complete review. Usability Review - We would like to be able to perform base backups from slaves without having to call pg_start_backup() on the master. We can not currently do this. The patch tries to do this. From a usability point of view it would be nice if this could be done both manually with pg_start_backup() and with pg_basebackup. The main issue I have with the first patch you submitted is that it does not work for cases where you don't want to call pg_basebackup or you don't want to include the wal segments in the output of pg_basebackup. There are a number of these use-cases (examples include the wal already being available on an archive server, or wanting to use filesystem/disk array level snapshots instead of tar). I feel your above proposal, to copy the control file as the last step in the base backup and then get the minimum recovery ending location from it, solves these issues.
It would be nice if things 'worked' when calling pg_basebackup against the slave (maybe by having perform_base_backup() resend the control file after it has sent the log?). Feature test / Performance review - Skipped, since a new patch is coming. Coding Review -- I didn't look too closely at the code, since a new patch might change a lot of it. I did like how you added comments to most of the larger code blocks that you added. Architecture Review --- There were some concerns with your original approach, but using the control file was suggested by Heikki and seems sound to me. I'm marking this 'waiting for author'; if you don't think you'll be able to get a reworked patch out during this commitfest then you should move it to 'returned with feedback'. Steve Jun Ishizuka NTT Software Corporation TEL:045-317-7018 E-Mail: ishizuka@po.ntts.co.jp
Re: [HACKERS] Range Types, constructors, and the type system
On Tue, 2011-06-28 at 22:20 +0200, Florian Pflug wrote: Hm, so RANGEINPUT would actually be what was previously discussed as the range as a pair of bounds definition, as opposed to the range as a set of values definition. So essentially we'd add a second concept of what a range is to work around the range input troubles. I believe if we go that route we should make RANGEINPUT a full-blown type, having pair of bounds semantics. Adding a lobotomized version just for the sake of range input feels a bit like a kludge to me. I think David Wheeler was trying to make a similar point, but I'm still not convinced. It's not a pair, because it can be made up of 0, 1, or 2 scalar values (unless you count infinity as one of those values, in which case 0 or 2). And without ordering, it's not clear that those values are really bounds. The type needs to: * represent two values, either of which might be a special infinite value * represent the value empty * represent inclusivity/exclusivity of both values and those things seem fairly specific to ranges, so I don't really see what other use we'd have for such a type. But I'm open to suggestion. I don't think that having an extra type around is so bad. It solves a lot of problems, and doesn't seem like it would get in the way. And it's only for the construction of ranges out of scalars, which seems like the most natural place where a cast might be required (similar to casting an unknown literal, which is fairly common). Alternatively, we could replace RANGEINPUT simply with ANYARRAY, and add functions ANYRANGE->ANYRANGE which allow specifying the bound operator (<, <= respectively <, <=) after construction.
So you'd write (using the functions-as-fields syntax I believe we support) (ARRAY[1,2]::int8range).left_open.right_closed for '(1,2]' and ARRAY[NULL,2]::int8range for '[-inf,2]' I think we can rule this one out: * The functions-as-fields syntax is all but deprecated (or should be) * That's hardly a readability improvement * It still suffers similar problems as casting back and forth to text: ANYARRAY is too general, doesn't really take advantage of the type system, and not a great fit anyway. All assuming that modifying the type system to support polymorphic type resolution based on the return type is out of the question... ;-) It's still not out of the question, but I thought that the intermediate type would be a less-intrusive alternative (and Robert seemed concerned about how intrusive it was). There also might be a little more effort educating users if we selected the function based on the return type, because they might think that casting the inputs explicitly would be enough to get it to pick the right function. If it were a new syntax like RANGE[]::int8range, then I think it would be easier to understand. Regards, Jeff Davis
Re: [HACKERS] Process local hint bit cache
On Thu, Apr 7, 2011 at 2:49 PM, Merlin Moncure mmonc...@gmail.com wrote: On Thu, Apr 7, 2011 at 1:28 PM, Merlin Moncure mmonc...@gmail.com wrote: int ByteOffset = xid / BITS_PER_BYTE; whoops, I just noticed this was wrong -- the byte offset needs to take the bucket into account. I need to clean this up some more obviously, but the issues at play remain the same. So, where is the latest version of this patch? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] Inconsistency between postgresql.conf and docs
All, A tester correctly reported this: But in the sample file, the synchronous_standby_names parameter is the first parameter under the heading - Streaming Replication - Server Settings, while in the documentation, that parameter has its own subsection 18.5.5 after the streaming replication section 18.5.4. Since the rest of section 18.5.4 was more than a screenful in my browser, at first glance I didn't spot 18.5.5 and was confused. He is correct. So, my question is, should the docs change, or should postgresql.conf.sample change? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com