Re: [HACKERS] review: pgbench progress report improvements

2013-09-13 Thread Fabien COELHO


Hello,

About patch eols:


  postgresql> patch -p1 < ../pgbench-measurements-v2.patch
patching file contrib/pgbench/pgbench.c
patching file doc/src/sgml/pgbench.sgml


It can depend on the OS. I did my tests on Fedora 14, and to apply the
patch without warnings I had to use the dos2unix tool.


Hmmm. I use a Linux Ubuntu laptop, so generating DOS end-of-lines is
unlikely if they are not there to begin with. Running dos2unix on the
patch file locally does not seem to change anything. So I assume that the 
patch encoding was changed somewhere along the path you used to get it.


--
Fabien.




Re: [HACKERS] record identical operator

2013-09-13 Thread Benedikt Grundmann
On Thu, Sep 12, 2013 at 11:27 PM, Kevin Grittner kgri...@ymail.com wrote:

 Attached is a patch for a bit of infrastructure I believe to be
 necessary for correct behavior of REFRESH MATERIALIZED VIEW
 CONCURRENTLY as well as incremental maintenance of matviews.
 [...]
 The patch adds an identical operator (===) for the record type:


[...]

 The new operator is logically similar to IS NOT DISTINCT FROM for a
 record, although its implementation is very different.  For one
 thing, it doesn't replace the operation with column level operators
 in the parser.  For another thing, it doesn't look up operators for
 each type, so the identical operator does not need to be
 implemented for each type to use it as shown above.  It compares
 values byte-for-byte, after detoasting.  The test for identical
 records can avoid the detoasting altogether for any values with
 different lengths, and it stops when it finds the first column with
 a difference.

 I toyed with the idea of supporting hashing of records using this
 operator, but could not see how that would be a performance win.

 The identical (===) and not identical (!==) operator names were
 chosen because of a vague similarity to the exactly equals
 concepts in JavaScript and PHP, which use that name.  The semantics
 aren't quite the same, but it seemed close enough not to be too
 surprising.  The additional operator names seemed natural to me
 based on the first two, but I'm not really that attached to these
 names for the operators if someone has a better idea.

 Since the comparison of record values is not documented (only
 comparisons involving row value constructors), it doesn't seem like
 we should document this special case.  It is intended primarily for
 support of matview refresh and maintenance, and it seems likely
 that record comparison was not documented on the basis that it is
 intended primarily for support of such things as indexing and merge
 joins -- so leaving the new operators undocumented seems consistent
 with existing policy.  I'm open to arguments that the policy should
 change.

 -


Wouldn't it be slightly less surprising / magical to not declare new
operators but just new primitive functions?  In the least invasive version
they could even be called matview_is_record_identical or similar.

cheers,

Bene


Re: [HACKERS] 9.4 HEAD: select() failed in postmaster

2013-09-13 Thread MauMau

From: Jeff Janes jeff.ja...@gmail.com
--
I've implemented the Min to Max change and did some more testing.  Now I
have a different  but related problem (which I also saw before, but less
often than the select() one).  The 5 second clock doesn't get turned off.
So after all processes end, and a new startup is launched, if that startup
doesn't report back to the postmaster soon enough, it gets SIGKILLED.

postmaster.c near line 1681


   if ((Shutdown >= ImmediateShutdown || (FatalError && !SendStop)) &&
       now - AbortStartTime >= SIGKILL_CHILDREN_AFTER_SECS)

It seems like this needs to have an additional and-test of pmState, but
which states to test I don't really know.

I've added in && (pmState > PM_RUN) and have not had any more failures, so
I think that this is on the right path but testing an enum for inequality
feels wrong.
--


AbortStartTime > 0 is also necessary to avoid sending SIGKILL repeatedly. 
I sent the attached patch during the original discussion.  The below 
fragment is relevant:



--- 1663,1688 ----
  			TouchSocketLockFiles();
  			last_touch_time = now;
  		}
+ 
+ 		/*
+ 		 * When postmaster got an immediate shutdown request
+ 		 * or some child terminated abnormally (FatalError case),
+ 		 * postmaster sends SIGQUIT to all children except
+ 		 * syslogger and dead_end ones, then wait for them to terminate.
+ 		 * If some children didn't terminate within a certain amount of time,
+ 		 * postmaster sends SIGKILL to them and wait again.
+ 		 * This resolves, for example, the hang situation where
+ 		 * a backend gets stuck in the call chain:
+ 		 * free() acquires some lock -> received SIGQUIT ->
+ 		 * quickdie() -> ereport() -> gettext() -> malloc() -> lock acquisition
+ 		 */
+ 		if (AbortStartTime > 0 &&	/* SIGKILL only once */
+ 			(Shutdown == ImmediateShutdown || (FatalError && !SendStop)) &&
+ 			now - AbortStartTime >= 10)
+ 		{
+ 			SignalAllChildren(SIGKILL);
+ 			AbortStartTime = 0;
+ 		}
  	}
  }


Regards
MauMau


reliable_immediate_shutdown.patch
Description: Binary data



Re: [HACKERS] Protocol forced to V2 in low-memory conditions?

2013-09-13 Thread Robert Haas
On Wed, Sep 11, 2013 at 2:35 PM, Andrew Dunstan and...@dunslane.net wrote:
 I vote against this.  If we remove V2 support from libpq, then we'll
 have no easy way to test that the backend's support still works.  And
 we've got too many people using V2 to think that it's OK not to have
 an easy way of testing that.  I think the question we ought to be
 asking is: how can we get widely-used connectors to stop relying on V2
 in the first place?

 How is it tested now, and who is doing the testing?

I don't know that there's any automated testing, but if someone
reports a bug that can only be reproduced using the v2 protocol, the
first thing I'm going to try to do is reproduce that using libpq, just
as I would with any other connector malfunction.  From what's being
said here it sounds like I might have to hack libpq a bit to get it to
speak the v2 protocol, but how long is that going to take?  Less time
than setting up an environment that can run whatever crazy connector
the client is using and trying to figure out whether the client or the
server is to blame, for sure.

Don't get me wrong, I think that the v2 protocol should go away, but
the real issue is the connectors that are still relying on it as a
primary way of talking to the server, not libpq.  We could force all
of those connectors to update or die by removing server support, and
once the server support is gone I don't care about libpq any more.  Or
maybe there's some friendlier way to wean those connectors off v2.
But in the meantime I'm skeptical that removing the code from libpq
gets us anywhere.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] record identical operator

2013-09-13 Thread Kevin Grittner
Benedikt Grundmann bgrundm...@janestreet.com wrote:

 Kevin Grittner kgri...@ymail.com wrote:

 Attached is a patch for a bit of infrastructure I believe to be
 necessary for correct behavior of REFRESH MATERIALIZED VIEW
 CONCURRENTLY as well as incremental maintenance of matviews.
 [...]
 The patch adds an identical operator (===) for the record
 type:

 Wouldn't it be slightly less surprising / magical to not declare
 new operators but just new primitive functions?  In the least
 invasive version they could even be called
 matview_is_record_identical or similar.

I'm not sure what is particularly surprising or magical about new
operators, but it is true that the patch could be smaller if
operators were not added.  The SQL functions added by the current
patch to support the operator approach are:

extern Datum record_image_eq(PG_FUNCTION_ARGS);
extern Datum record_image_ne(PG_FUNCTION_ARGS);
extern Datum record_image_lt(PG_FUNCTION_ARGS);
extern Datum record_image_gt(PG_FUNCTION_ARGS);
extern Datum record_image_le(PG_FUNCTION_ARGS);
extern Datum record_image_ge(PG_FUNCTION_ARGS);
extern Datum btrecordimagecmp(PG_FUNCTION_ARGS);

All take two record arguments and all but the last return a
boolean.  The last one returns -1, 0, or 1 depending on how the
values compare.  All comparisons are based on memcmp() of the data
in its storage format (after detoasting, where applicable).
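
To make the semantics concrete, here is a hypothetical usage sketch (it
assumes the patch is applied and the citext extension is installed; it is
not taken from the patch itself).  citext's = operator treats 'foo' and
'FOO' as equal, but their stored byte images differ, so the image
comparison should report them as not identical:

SELECT ROW('foo'::citext) =   ROW('FOO'::citext) AS is_equal,      -- expected: true
       ROW('foo'::citext) === ROW('FOO'::citext) AS is_identical;  -- expected: false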

As currently written, the patch still requires a unique index on
the matview in order to allow RMVC, but this patch was intended to
support removal of that restriction, which is something some people
were saying they wanted.  It just seemed best to do that with a
separate patch once we had the mechanism to support it.

RMVC currently generates its diff data with a query similar to
this:

test=# explain analyze
test-# SELECT *
test-#   FROM citext_matview m
test-#   FULL JOIN citext_table t ON (t === m)
test-#   WHERE t.id IS NULL OR m.id IS NULL;
test=# \pset pager off
Pager usage is off.
test=# explain analyze
SELECT *
  FROM citext_matview m
  FULL JOIN citext_table t ON (t.id = m.id AND t === m)
  WHERE t.id IS NULL OR m.id IS NULL;
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Hash Full Join  (cost=1.11..2.24 rows=1 width=16) (actual time=0.056..0.056 rows=0 loops=1)
   Hash Cond: (m.id = t.id)
   Join Filter: (t.* === m.*)
   Filter: ((t.id IS NULL) OR (m.id IS NULL))
   Rows Removed by Filter: 5
   ->  Seq Scan on citext_matview m  (cost=0.00..1.05 rows=5 width=40) (actual time=0.002..0.006 rows=5 loops=1)
   ->  Hash  (cost=1.05..1.05 rows=5 width=40) (actual time=0.023..0.023 rows=5 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 1kB
         ->  Seq Scan on citext_table t  (cost=0.00..1.05 rows=5 width=40) (actual time=0.010..0.016 rows=5 loops=1)
 Total runtime: 0.102 ms
(10 rows)

With the operator support, we can remove the primary key columns
from the join conditions, and it still works, albeit with a slower
plan:

test=# explain analyze
SELECT *
  FROM citext_matview m
  FULL JOIN citext_table t ON (t === m)
  WHERE t.id IS NULL OR m.id IS NULL;
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Merge Full Join  (cost=2.22..2.32 rows=1 width=16) (actual time=0.072..0.072 rows=0 loops=1)
   Merge Cond: (m.* === t.*)
   Filter: ((t.id IS NULL) OR (m.id IS NULL))
   Rows Removed by Filter: 5
   ->  Sort  (cost=1.11..1.12 rows=5 width=40) (actual time=0.035..0.035 rows=5 loops=1)
         Sort Key: m.*
         Sort Method: quicksort  Memory: 25kB
         ->  Seq Scan on citext_matview m  (cost=0.00..1.05 rows=5 width=40) (actual time=0.012..0.016 rows=5 loops=1)
   ->  Sort  (cost=1.11..1.12 rows=5 width=40) (actual time=0.014..0.014 rows=5 loops=1)
         Sort Key: t.*
         Sort Method: quicksort  Memory: 25kB
         ->  Seq Scan on citext_table t  (cost=0.00..1.05 rows=5 width=40) (actual time=0.003..0.003 rows=5 loops=1)
 Total runtime: 0.128 ms
(13 rows)

So, if the operators are included it will be a very small and
simple patch to relax the requirement for a unique index. 
Currently we generate an error if no index columns were found, but
with the new operators we could leave them out and the query would
still work.  Adding indexes to matviews would then be just a matter
of optimization, not a requirement to be able to use the RMVC
feature.
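
For context, the requirement being relaxed looks like this today (a sketch
using the table names from the plans above; it assumes citext_table already
exists and shows the current behavior, not what the follow-on patch would
allow):

CREATE MATERIALIZED VIEW citext_matview AS SELECT * FROM citext_table;
CREATE UNIQUE INDEX ON citext_matview (id);  -- without this, the CONCURRENTLY form currently errors out
REFRESH MATERIALIZED VIEW CONCURRENTLY citext_matview;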

Using the functions instead of the operators things work just as
well as long as we use columns in the join conditions, which is
currently based on indexed columns:

test=# explain analyze
test-# SELECT *
test-#   FROM citext_matview m
test-#   FULL JOIN citext_table t ON (t.id = 

Re: [HACKERS] Custom Plan node

2013-09-13 Thread Robert Haas
On Tue, Sep 10, 2013 at 11:45 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 Fair enough, I think.  So the action item for KaiGai is to think of
 how the planner integration might work.

 Do you think the idea I mentioned upthread is worth investigating in more
 detail? Or does it seem to you short-sighted thinking to try to fit this
 infrastructure into the planner?

 It categorizes plan nodes into three kinds: join, scan and other stuff.
 Cost-based estimation is mostly applied to joins and scans, so abstracted
 scan and join nodes may make sense, to inform the core planner what a
 custom plan node is trying to do.
 On the other hand, other stuff, like Agg, must be added according to the
 provided query, even if its estimated cost is not small, in order to
 produce what the query describes.
 So I thought the important thing is integration of join and scan, while for
 other stuff, manipulation of the plan tree is sufficient.

 How about your opinion?

Well, I don't know that I'm smart enough to predict every sort of
thing that someone might want to do here, unfortunately.  This is a
difficult area: there are many possible things someone might want to
do, and as Tom points out, there's a lot of special handling of
particular node types that can make things difficult.  And I can't
claim to be an expert in this area.

That having been said, I think the idea of a CustomScan node is
probably worth investigating.  I don't know if that would work out
well or poorly, but I think it would be worth some experimentation.
Perhaps you could have a hook that gets called for each baserel, and
can decide whether or not it wishes to inject any additional paths;
and then a CustomScan node that could be used to introduce such paths.
 I've been thinking that we really ought to have the ability to
optimize CTID range queries, like SELECT * FROM foo WHERE ctid > 'some
constant'.  We have a Tid Scan node, but it only handles equalities,
not inequalities.  I suppose this functionality should really be in
core, but since it's not it might make an interesting test for the
infrastructure you're proposing.  You may be able to think of
something else you like better; it's just a thought.

I am a little less sanguine about the chances of a CustomJoin node
working out well.  I agree that we need something to handle join
pushdown, but it seems to me that might be done by providing a Foreign
Scan path into the joinrel rather than by adding a concept of foreign
joins per se.  There are other possible new join types, like the Range
Join that Jeff Davis has mentioned in the past, which might be
interesting.  But overall, I can't see us adding very many more join
types, so I'm not totally sure how much extensibility would help us
here.  We've added a few scan types over the years (index only scans
in 9.2, and bitmap index scans in 8.1, I think) but all of our
existing join types go back to time immemorial.

And I think that lumping everything else together under not a scan or
join has the least promise of all.  Is replacing Append really the
same as replacing Sort?  I think we'll need to think harder here about
what we're trying to accomplish and how to get there.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-13 Thread Robert Haas
On Wed, Sep 11, 2013 at 3:40 PM, Josh Berkus j...@agliodbs.com wrote:
 I think that most of the arguments in this thread drastically
 overestimate the precision and the effect of effective_cache_size. The
 planner logic behind it basically only uses it to calculate things
 within a single index scan. That alone shows that any precise
 calculation cannot be very meaningful.
 It also does *NOT* directly influence how the kernel caches disk
 io. It's there to guess how likely it is something is still cached when
 accessing things repeatedly.

 Agreed.  I think we should take the patch as-is, and spend the rest of
 the 9.4 dev cycle arguing about 3x vs. 4x.

 ;-)

I'm happy with that option, but I think the larger point here is that
this only has a hope of being right if you're setting shared_buffers
to 25% of system memory.  And more and more, people are not doing
that, because of the other recommendation, not much discussed here, to
cap shared_buffers at about 8GB.  Systems whose total memory is far
larger than 32GB are becoming quite commonplace, and only figure to
become moreso.  So while I don't particularly object to this proposal,
it would have had a lot more value if we'd done it 5 years ago.

Now the good news is that right now the default is 128MB, and under
any of these proposals the default will go up, quite a bit.  Default
shared_buffers is now 128MB, so we're looking at raising the default
to at least 384MB, and for people who also tune shared_buffers but
might not bother with effective cache size, it'll go up a lot more.
That's clearly a move in the right direction even if the accuracy of
the formula is suspect (which it is).
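
For concreteness, a sketch of the arithmetic being discussed (the values are
the ones named in this thread, not output from any particular build):

SHOW shared_buffers;          -- 128MB by default
SHOW effective_cache_size;    -- 128MB today; 3 x 128MB = 384MB under the auto-tuning proposal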

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PostgreSQL 9.3 beta breaks some extensions make install

2013-09-13 Thread Cédric Villemain



Marti Raudsepp ma...@juffo.org wrote:
On Tue, May 14, 2013 at 4:12 AM, Marti Raudsepp ma...@juffo.org
wrote:
 While testing out PostgreSQL 9.3beta1, I stumbled upon a problem

 % make DESTDIR=/tmp/foo install

 /usr/bin/install: will not overwrite just-created
 ‘/tmp/foo/usr/share/postgresql/extension/semver--0.3.0.sql’ with
 ‘./sql/semver--0.3.0.sql’
 make: *** [install] Error 1

On Wed, May 15, 2013 at 4:49 PM, Peter Eisentraut pete...@gmx.net
wrote:
 That said, I'm obviously outnumbered here.  What about the following
 compromise:  Use the configure-selected install program inside
 PostgreSQL (which we can test easily), and use install-sh under
 USE_PGXS?  Admittedly, the make install time of extensions is probably
 not an issue.

Did we ever do anything about this? It looks like the thread got
distracted with VPATH builds and now I'm seeing this problem in 9.3.0.
:(

This occurs in Arch Linux, but for some odd reason not on Ubuntu when
using apt.postgresql.org. Somehow the pgxs.mk supplied by
apt.postgresql.org differs from the one shipped in PostgreSQL.

Is there a chance of getting this resolved in PostgreSQL or should we
get extension writers to fix their makefiles instead?

Apt.pgdg got the patch present in PostgreSQL head applied.
Andrew is about to commit (well... I hope) a doc patch about that and also a
little fix.
IMHO this is a bugfix, so I hope it will be applied in older branches.

--
Sent from my phone. Please excuse the brevity.




Re: [HACKERS] record identical operator

2013-09-13 Thread Andres Freund
Hi Kevin,

On 2013-09-12 15:27:27 -0700, Kevin Grittner wrote:
 The new operator is logically similar to IS NOT DISTINCT FROM for a
 record, although its implementation is very different.  For one
 thing, it doesn't replace the operation with column level operators
 in the parser.  For another thing, it doesn't look up operators for
 each type, so the identical operator does not need to be
 implemented for each type to use it as shown above.  It compares
 values byte-for-byte, after detoasting.  The test for identical
 records can avoid the detoasting altogether for any values with
 different lengths, and it stops when it finds the first column with
 a difference.

In the general case, that operator sounds dangerous to me. We don't
guarantee that a Datum containing the same data always has the same
binary representation. E.g. arrays can have a null bitmap or not,
depending on how they were created.
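
As an aside, an easy way to see the general point (not the array case
specifically, and assuming the proposed === operator is available) is a type
whose equal values can differ in their stored bytes, e.g. numeric values
with different display scales:

SELECT '1.0'::numeric = '1.00'::numeric;              -- true
SELECT ROW('1.0'::numeric) === ROW('1.00'::numeric);  -- expected: false, the stored dscale differs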

I am not actually sure whether that's a problem for your usecase, but I
get headaches when we try circumventing the type abstraction that way.

Yes, we do such tricks in other places already, but afaik in all those
places erroneously believing two Datums are distinct is not an error, just
a missed optimization. Allowing a general operator with such a murky
definition to creep into something SQL-exposed... Hm. Not sure.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-13 Thread Merlin Moncure
On Fri, Sep 13, 2013 at 10:08 AM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Sep 11, 2013 at 3:40 PM, Josh Berkus j...@agliodbs.com wrote:
 I think that most of the arguments in this thread drastically
 overestimate the precision and the effect of effective_cache_size. The
 planner logic behind it basically only uses it to calculate things
 within a single index scan. That alone shows that any precise
 calculation cannot be very meaningful.
 It also does *NOT* directly influence how the kernel caches disk
 io. It's there to guess how likely it is something is still cached when
 accessing things repeatedly.

 Agreed.  I think we should take the patch as-is, and spend the rest of
 the 9.4 dev cycle arguing about 3x vs. 4x.

 ;-)

 I'm happy with that option, but I think the larger point here is that
 this only has a hope of being right if you're setting shared_buffers
 to 25% of system memory.  And more and more, people are not doing
 that, because of the other recommendation, not much discussed here, to
 cap shared_buffers at about 8GB.  Systems whose total memory is far
 larger than 32GB are becoming quite commonplace, and only figure to
 become moreso.  So while I don't particularly object to this proposal,
 it would have had a lot more value if we'd done it 5 years ago.

 Now the good news is that right now the default is 128MB, and under
 any of these proposals the default will go up, quite a bit.  Default
 shared_buffers is now 128MB, so we're looking at raising the default
 to at least 384MB, and for people who also tune shared_buffers but
 might not bother with effective cache size, it'll go up a lot more.
 That's clearly a move in the right direction even if the accuracy of
 the formula is suspect (which it is).

This is a very important point: the 8GB cap is also too high.  We have
a very high transaction rate server here that exploded at 32GB, was
downgraded to 2GB and ran fine, then was upgraded to 4GB (over my strenuous
objection) and exploded again.

The stock documentation advice probably needs to be revised so that
it's the lesser of 2GB and 25%.  I'm more and more coming around to
the opinion that in terms of shared buffers we have some major
problems that manifest on high-end servers.  So +1 to your point,
although I'm still OK with the auto-setting on the basis that for very
high-end servers most of the settings end up being manually tweaked
anyway.  We do need to be cautious though; it's not impossible that
improvements to the buffer system might cause the 25% advice to be
revised upwards in the not-too-near future if some of the problems get
solved.

merlin




Re: [HACKERS] Successor of MD5 authentication, let's use SCRAM

2013-09-13 Thread Robert Haas
On Thu, Sep 12, 2013 at 11:33 AM, Magnus Hagander mag...@hagander.net wrote:
 Well, undocumented and OpenSSL tend to go hand in hand a lot. Or,
 well, it might be documented, but not in a useful way. I wouldn't
 count on it.

The OpenSSL code is some of the worst-formatted spaghetti code I've
ever seen, and the reason I know that is because whenever I try to do
anything with OpenSSL I generally end up having to read it, precisely
because, as you say, the documentation is extremely incomplete.  I
hate to be critical of other projects, but everything I've ever done
with OpenSSL has been difficult, and I really think we should try to
get less dependent on it rather than more.

 I fear starting to use that is going to make it even harder to break
 out from our openssl dependency, which people do complain about at
 least semi-regularly.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2013-09-13 Thread Robert Haas
On Wed, Sep 11, 2013 at 8:47 PM, Peter Geoghegan p...@heroku.com wrote:
 In practice the vast majority of insertions don't involve TOASTing.
 That's not an excuse for allowing the worst case to be really bad in
 terms of its impact on query response time, but it may well justify
 having whatever ameliorating measures we take result in bloat. It's at
 least the kind of bloat we're more or less used to dealing with, and
 have already invested a lot in controlling. Plus bloat-wise it can't
 be any worse than just inserting the tuple and having the transaction
 abort on a duplicate, since that already happens after toasting has
 done its work with regular insertion.

Andres is being very polite here, but the reality is that this
approach has zero chance of being accepted.  You can't hold buffer
locks for a long period of time across complex operations.  Full stop.
 It's a violation of the rules that are clearly documented in
src/backend/storage/buffer/README, which have been in place for a very
long time, and this patch is nowhere near important enough to warrant
a revision of those rules.  We are not going to risk breaking every
bit of code anywhere in the backend or in third-party code that takes
a buffer lock.  You are never going to convince me, or Tom, that the
benefit of doing that is worth the risk; in fact, I have a hard time
believing that you'll find ANY committer who thinks this approach is
worth considering.

Even if you get the code to run without apparent deadlocks, that
doesn't mean there aren't any; it just means that you haven't found
them all yet.  And even if you managed to squash every such hazard
that exists today, so what?  Fundamentally, locking protocols that
don't include deadlock detection don't scale.  You can use such locks
in limited contexts where proofs of correctness are straightforward,
but trying to stretch them beyond that point results not only in bugs,
but also in bad performance and unmaintainable code.  With a much more
complex locking regimen, even if your code is absolutely bug-free,
you've put a burden on the next guy who wants to change anything; how
will he avoid breaking things?  Our buffer locking regimen suffers
from painful complexity and serious maintenance difficulties as is.
Moreover, we've already got performance and scalability problems that
are attributable to every backend in the system piling up waiting on a
single lwlock, or a group of simultaneously-held lwlocks.
Dramatically broadening the scope of where lwlocks are used and for
how long they're held is going to make that a whole lot worse.  What's
worse, the problems will be subtle, restricted to the people using
this feature, and very difficult to measure on production systems, and
I have no confidence they'd ever get fixed.

A further problem is that a backend which holds even one lwlock can't
be interrupted.  We've had this argument before and it seems that you
don't think that non-interruptibility is a problem, but it is project
policy to allow for timely interrupts in all parts of the backend, and
we're not going to change that policy for this patch.  Heavyweight
locks are heavy weight precisely because they provide services - like
deadlock detection, satisfactory interrupt handling, and, also
importantly, FIFO queuing behavior - that are *important* for locks
that are held over an extended period of time.  We're not going to go
put those services into the lightweight lock mechanism because then it
would no longer be light weight, and we're not going to ignore the
importance of them, either.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-13 Thread Merlin Moncure
On Fri, Sep 13, 2013 at 11:07 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-09-13 10:50:06 -0500, Merlin Moncure wrote:
 The stock documentation advice I probably needs to be revised to so
 that's the lesser of 2GB and 25%.

 I think that would be a pretty bad idea. There are lots of workloads
 where people have postgres happily chugging along with s_b lots bigger
 than that and see benefits.
 We have a couple people reporting mostly undiagnosed (because that turns
 out to be hard!) problems that seem to be avoided with smaller s_b. We
 don't even remotely know enough about the problem to make such general
 recommendations.

I happen to be one of those couple people.  Load goes from 0.1 to
500 without warning then back to 0.1 equally without warning.
Unfortunately the server is in a different jurisdiction such that it
makes deep forensic analysis impossible.  I think this is happening
more and more often as postgres is becoming increasingly deployed on
high(er) -end servers.  I've personally (alone) dealt with 4-5
confirmed cases and there have been many more.  We have a problem.

But, to address your point, the big s_b benefits are equally hard to
quantify (unless your database happens to fit in s_b) -- they mostly
help high write activity servers where the write activity fits a very
specific pattern.  But the risks of low s_b (mostly slightly higher
i/o and query latency) are much easier to deal with than high s_b
(even if less likely); random inexplicable server stalls and other
weird manifestations.  My stock advice remains to set it to at most 2GB
until you have a reason (of which there can be many) to set it otherwise.

merlin




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-13 Thread Andres Freund
On 2013-09-13 10:50:06 -0500, Merlin Moncure wrote:
 The stock documentation advice I probably needs to be revised to so
 that's the lesser of 2GB and 25%.

I think that would be a pretty bad idea. There are lots of workloads
where people have postgres happily chugging along with s_b lots bigger
than that and see benefits.
We have a couple people reporting mostly undiagnosed (because that turns
out to be hard!) problems that seem to be avoided with smaller s_b. We
don't even remotely know enough about the problem to make such general
recommendations.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Stephen Frost
Heikki, all,

* Stephen Frost (sfr...@snowman.net) wrote:
 Very curious.  Out of time right now to look into it, but will probably
 be back at it later tonight.

Alright, I was back at this a bit today and decided to go with a hunch-
and it looks like I might have been right to try.

Leaving the locking callback hooks in place appears to prevent the
deadlocks from happening, at least with this little app.

IOW, in destroy_ssl_system(), simply arrange to not have
CRYPTO_set_locking_callback(NULL); and CRYPTO_set_id_callback(NULL);
called.  I've done this with the very simple attached patch.  Per the
comment above destroy_ssl_system(), this doesn't seem to be an
acceptable solution because libpq might get unloaded from the
application, but perhaps it points the way towards what the issue is.

My thought had been that there was an issue with regard to signal
handling, but I've been unable to confirm that, nor is it clear why that
would be the case.  In any case, I'm curious what others think of these
results and if anyone can reproduce the deadlock with this patch
applied.

Thanks!

Stephen
diff --git a/src/interfaces/libpq/fe-secure.c b/src/interfaces/libpq/fe-secure.c
new file mode 100644
index 3bd0113..e025b49 100644
*** a/src/interfaces/libpq/fe-secure.c
--- b/src/interfaces/libpq/fe-secure.c
*** static SSL_CTX *SSL_context = NULL;
*** 102,107 
--- 102,108 
  
  #ifdef ENABLE_THREAD_SAFETY
  static long ssl_open_connections = 0;
+ static int initialized_hooks = 0;
  
  #ifndef WIN32
  static pthread_mutex_t ssl_config_mutex = PTHREAD_MUTEX_INITIALIZER;
*** init_ssl_system(PGconn *conn)
*** 947,952 
--- 948,955 
  
  		if (ssl_open_connections++ == 0)
  		{
+ 			initialized_hooks = 1;
+ 
  			/* These are only required for threaded libcrypto applications */
  			CRYPTO_set_id_callback(pq_threadidcallback);
  			CRYPTO_set_locking_callback(pq_lockingcallback);
*** destroy_ssl_system(void)
*** 1015,1021 
  	if (pq_init_crypto_lib && ssl_open_connections > 0)
  		--ssl_open_connections;
  
! 	if (pq_init_crypto_lib && ssl_open_connections == 0)
  	{
  		/* No connections left, unregister libcrypto callbacks */
  		CRYPTO_set_locking_callback(NULL);
--- 1018,1024 
  	if (pq_init_crypto_lib && ssl_open_connections > 0)
  		--ssl_open_connections;
  
! 	if (pq_init_crypto_lib && ssl_open_connections == 0 && initialized_hooks == 0)
  	{
  		/* No connections left, unregister libcrypto callbacks */
  		CRYPTO_set_locking_callback(NULL);




Re: [HACKERS] Possible memory leak with SQL function?

2013-09-13 Thread Robert Haas
On Thu, Sep 12, 2013 at 5:29 AM, Yeb Havinga yebhavi...@gmail.com wrote:
 Is the following known behaviour, or should I put some time in writing a
 self contained test case?

 We have a function that takes a value and returns a ROW type. With the
 function implemented in language SQL, when executing this function in a
 large transaction, memory usage of the backend process increases.
 MemoryContextStats showed a lot of SQL function data. Debugging
 init_sql_fcache() showed that it was for the same function oid each time,
 and the oid was the function from value to ROW type.

 When the function is implemented in PL/pgSQL, the memory usage was much
 less.

 I'm sorry I cannot be more specific at the moment, such as what 'much
 less' memory means with a PL/pgSQL function, and whether there are as many
 SQL function data contexts as there are calls to the SQL function, because
 I would have to write a test case for this. I was just wondering if this is
 known behavior of SQL functions vs. PL/pgSQL functions, or could it be a bug?

It sounds like a bug to me, although I can't claim to know everything
there is to know about this topic.
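
For reference, a minimal sketch of the kind of function being described
(hypothetical names, not Yeb's actual code; many calls inside one large
transaction is the scenario in which the SQL function data contexts were
seen to accumulate):

CREATE TYPE pair AS (a int, b text);

CREATE FUNCTION make_pair(i int) RETURNS pair
    LANGUAGE sql AS $$ SELECT $1, $1::text $$;

BEGIN;
SELECT make_pair(g) FROM generate_series(1, 100000) AS g;
COMMIT;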

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-13 Thread Andres Freund
On 2013-09-13 11:27:03 -0500, Merlin Moncure wrote:
 On Fri, Sep 13, 2013 at 11:07 AM, Andres Freund and...@2ndquadrant.com 
 wrote:
  On 2013-09-13 10:50:06 -0500, Merlin Moncure wrote:
  The stock documentation advice I probably needs to be revised to so
  that's the lesser of 2GB and 25%.
 
  I think that would be a pretty bad idea. There are lots of workloads
  where people have postgres happily chugging along with s_b lots bigger
  than that and see benefits.
  We have a couple people reporting mostly undiagnosed (because that turns
  out to be hard!) problems that seem to be avoided with smaller s_b. We
  don't even remotely know enough about the problem to make such general
  recommendations.

 I happen to be one of those couple people.  Load goes from 0.1 to
 500 without warning then back to 0.1 equally without warning.
 Unfortunately the server is in a different jurisdiction such that it
 makes deep forensic analysis impossible.  I think this is happening
 more and more often as postgres is becoming increasingly deployed on
 high(er) -end servers.  I've personally (alone) dealt with 4-5
 confirmed cases and there have been many more.  We have a problem.

Absolutely not claiming the contrary. I think it sucks that we couldn't
fully figure out what's happening in detail. I'd love to get my hands on
a setup where it can be reliably reproduced.

 But, to address your point, the big s_b benefits are equally hard to
 quantify (unless your database happens to fit in s_b)

Databases where the hot dataset fits in s_b are a pretty honking big use
case, though. That's one of the primary reasons to buy machines with
craploads of memory.

That said, I think having a note in the docs that large s_b can cause
such a problem might not be a bad idea and I surely wouldn't argue
against it.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Andres Freund
On 2013-09-13 12:40:11 -0400, Stephen Frost wrote:
 Heikki, all,
 
 * Stephen Frost (sfr...@snowman.net) wrote:
  Very curious.  Out of time right now to look into it, but will probably
  be back at it later tonight.
 
 Alright, I was back at this a bit today and decided to go with a hunch-
 and it looks like I might have been right to try.
 
 Leaving the locking callback hooks in place appears to prevent the
 deadlocks from happening, at least with this little app.
 
 IOW, in destroy_ssl_system(), simply arrange to not have
 CRYPTO_set_locking_callback(NULL); and CRYPTO_set_id_callback(NULL);
 called.  I've done this with the very simple attached patch.  Per the
 comment above destroy_ssl_system(), this doesn't seem to be an
 acceptable solution because libpq might get unloaded from the
 application, but perhaps it points the way towards what the issue is.

It'd be interesting to replace the original callbacks with ones that
immediately do an abort() or similar, to see whether they are perhaps
called after they shouldn't be, and from where.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Stephen Frost
Andres,

On Friday, September 13, 2013, Andres Freund wrote:

 It'd be interesting to replace the origin callbacks with one immediately
 doing an abort() or similar to see whether they maybe are called after
 they shouldn't be and from where.


Good thought. Got sucked into a meeting but once I'm out I'll try having
the lock/unlock routines abort if they're called while ssl_open_connections
is zero, which should not be happening, but seems like it is.

Thanks,

Stephen


Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Andres Freund
On 2013-09-13 13:15:34 -0400, Stephen Frost wrote:
 Andres,
 
 On Friday, September 13, 2013, Andres Freund wrote:
 
  It'd be interesting to replace the origin callbacks with one immediately
  doing an abort() or similar to see whether they maybe are called after
  they shouldn't be and from where.
 
 
 Good thought. Got sucked into a meeting but once I'm out I'll try having
 the lock/unlock routines abort if they're called while ssl_open_connections
 is zero, which should not be happening, but seems like it is.

Hm. close_SSL() first does pqsecure_destroy(), which will unset the
callbacks and decrement the count, and then goes on to do X509_free() and
ENGINE_finish()/ENGINE_free() if either is used.

It's not implausible that one of those actually needs locking. I doubt
engines play a role here, but, without having looked at the testcase,
X509_free() might be a possibility.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] PostgreSQL 9.3 beta breaks some extensions make install

2013-09-13 Thread Marti Raudsepp
On Fri, Sep 13, 2013 at 6:42 PM, Cédric Villemain
ced...@2ndquadrant.com wrote:
 Marti Raudsepp ma...@juffo.org wrote:
Did we ever do anything about this? It looks like the thread got
distracted with VPATH builds and now I'm seeing this problem in 9.3.0.

 Andrew is about to commit (well...I hope) a doc patch about that and also a 
 little fix.
 Imho this is a bugfix so I hope it will be applyed in older branches.

Oh I see, indeed commit 6697aa2bc25c83b88d6165340348a31328c35de6
(Improve support for building PGXS modules with VPATH) fixes the
problem, and I see it's not present in REL9_3_0.

Andrew and others, does this seem safe enough to backport to 9.3.1?

 Apt.pgdg got the patch present in postgresql head applyed.

Erm, isn't apt.postgresql.org supposed to ship the *official*
PostgreSQL versions? Given that this issue affects all distros, I
don't see why Ubuntu/Debian need to be patched separately.

Does anyone else think this is problematic? By slipping patches into
distro-specific packages, you're confusing users (like me) and
bypassing the PostgreSQL QA process.

PS: Where are the sources used to build packages on apt.postgresql.org?

Regards,
Marti




Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Stephen Frost
* Stephen Frost (sfr...@snowman.net) wrote:
 * Andres Freund (and...@2ndquadrant.com) wrote:
  Hm. close_SSL() first does pqsecure_destroy() which will unset the
  callbacks, and the count and then goes on to do X509_free() and
  ENGINE_finish(), ENGINE_free() if either is used.
  
  It's not implausible that one of those actually needs locking. I doubt
  engines play a role here, but, without having looked at the testcase,
  X509_free() might be a possibility.
 
 Unfortunately, while I can still easily get the deadlock to happen when
 the hooks are reset, the hooks don't appear to ever get called when
 ssl_open_connections is set to zero.  You have a good point about the
 additional SSL calls after the hooks are unloaded though, I wonder if
 holding the ssl_config_mutex lock over all of close_SSL might be more
 sensible..

I went ahead and moved the locks to be around all of close_SSL() and
haven't been able to reproduce the deadlock, so perhaps those calls are
the issue and what's happening is that another thread is dropping or
adding the hooks in a common place while the X509_free, etc, are trying
to figure out if they should be calling the locking functions or not,
but there's a race because there's no higher-level locking happening
around those.

Attached is a patch to move those and which doesn't deadlock for me.

Thoughts?

Thanks,

Stephen
diff --git a/src/interfaces/libpq/fe-secure.c b/src/interfaces/libpq/fe-secure.c
new file mode 100644
index 3bd0113..e14c301 100644
*** a/src/interfaces/libpq/fe-secure.c
--- b/src/interfaces/libpq/fe-secure.c
*** destroy_ssl_system(void)
*** 1009,1017 
  {
  #ifdef ENABLE_THREAD_SAFETY
  	/* Mutex is created in initialize_ssl_system() */
- 	if (pthread_mutex_lock(&ssl_config_mutex))
- 		return;
- 
  	if (pq_init_crypto_lib && ssl_open_connections > 0)
  		--ssl_open_connections;
  
--- 1009,1014 
*** destroy_ssl_system(void)
*** 1031,1037 
  		 */
  	}
  
- 	pthread_mutex_unlock(&ssl_config_mutex);
  #endif
  }
  
--- 1028,1033 
*** open_client_SSL(PGconn *conn)
*** 1537,1542 
--- 1533,1543 
  static void
  close_SSL(PGconn *conn)
  {
+ #ifdef ENABLE_THREAD_SAFETY
+ 	if (pthread_mutex_lock(&ssl_config_mutex))
+ 		return;
+ #endif
+ 
  	if (conn->ssl)
  	{
  		DECLARE_SIGPIPE_INFO(spinfo);
*** close_SSL(PGconn *conn)
*** 1565,1570 
--- 1566,1575 
  		conn->engine = NULL;
  	}
  #endif
+ 
+ #ifdef ENABLE_THREAD_SAFETY
+ 	pthread_mutex_unlock(&ssl_config_mutex);
+ #endif
  }
  
  /*




Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Stephen Frost
* Andres Freund (and...@2ndquadrant.com) wrote:
 On 2013-09-13 13:15:34 -0400, Stephen Frost wrote:
  Good thought. Got sucked into a meeting but once I'm out I'll try having
  the lock/unlock routines abort if they're called while ssl_open_connections
  is zero, which should not be happening, but seems like it is.
 
 Hm. close_SSL() first does pqsecure_destroy() which will unset the
 callbacks, and the count and then goes on to do X509_free() and
 ENGINE_finish(), ENGINE_free() if either is used.
 
 It's not implausible that one of those actually needs locking. I doubt
 engines play a role here, but, without having looked at the testcase,
 X509_free() might be a possibility.

Unfortunately, while I can still easily get the deadlock to happen when
the hooks are reset, the hooks don't appear to ever get called when
ssl_open_connections is set to zero.  You have a good point about the
additional SSL calls after the hooks are unloaded though, I wonder if
holding the ssl_config_mutex lock over all of close_SSL might be more
sensible..

Thanks,

Stephen




Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Andres Freund
On 2013-09-13 14:33:25 -0400, Stephen Frost wrote:
 * Stephen Frost (sfr...@snowman.net) wrote:
  * Andres Freund (and...@2ndquadrant.com) wrote:
   Hm. close_SSL() first does pqsecure_destroy() which will unset the
   callbacks, and the count and then goes on to do X509_free() and
   ENGINE_finish(), ENGINE_free() if either is used.
   
   It's not implausible that one of those actually needs locking. I doubt
   engines play a role here, but, without having looked at the testcase,
   X509_free() might be a possibility.
  
  Unfortunately, while I can still easily get the deadlock to happen when
  the hooks are reset, the hooks don't appear to ever get called when
  ssl_open_connections is set to zero.  You have a good point about the
  additional SSL calls after the hooks are unloaded though, I wonder if
  holding the ssl_config_mutex lock over all of close_SSL might be more
  sensible..
 
 I went ahead and moved the locks to be around all of close_SSL() and
 haven't been able to reproduce the deadlock, so perhaps those calls are
 the issue and what's happening is that another thread is dropping or
 adding the hooks in a common place while the X509_free, etc, are trying
 to figure out if they should be calling the locking functions or not,
 but there's a race because there's no higher-level locking happening
 around those.
 
 Attached is a patch to move those and which doesn't deadlock for me.

It seems slightly cleaner to just move the pqsecure_destroy(); to the
end of that function, based on a boolean. But if you think otherwise, I
won't protest...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Andres Freund
On 2013-09-13 13:59:54 -0400, Stephen Frost wrote:
 Unfortunately, while I can still easily get the deadlock to happen when
 the hooks are reset, the hooks don't appear to ever get called when
 ssl_open_connections is set to zero.  You have a good point about the
 additional SSL calls after the hooks are unloaded though, I wonder if
 holding the ssl_config_mutex lock over all of close_SSL might be more
 sensible..

Hm. You might want to do the check for ssl_open_connections after a
memory barrier or such; it might not actually be up to date (too tired to
think about x86's actual cache coherency guarantees).

Does it burn if you move the pqsecure_destroy(); to the end of the
function, after setting a boolean in the if (conn->ssl) branch?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Heikki Linnakangas

On 13.09.2013 22:26, Heikki Linnakangas wrote:

I'm afraid the move_locks.diff patch you posted earlier is also
broken; close_SSL() is called in error scenarios from
pqsecure_open_client(), while already holding the mutex. So it will
deadlock with itself if the connection cannot be established.


Actually, I think there's a pre-existing bug there in git master. If the 
SSL_set_app_data or SSL_set_fd call in pqsecure_open_client fails for 
some reason, it will call close_SSL() with conn->ssl already set, and 
the mutex held. close_SSL() will call 
pqsecure_destroy() -> destroy_SSL() -> destroy_ssl_system(), which will try 
to acquire the mutex again.


- Heikki




Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2013-09-13 Thread Stephen Frost
* Peter Geoghegan (p...@heroku.com) wrote:
 I would suggest letting those other individuals speak for themselves
 too. Particularly if you're going to name someone who is on vacation
 like that.

It was my first concern regarding this patch.

Thanks,

Stephen




Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Stephen Frost
* Heikki Linnakangas (hlinnakan...@vmware.com) wrote:
 Umm, with that patch, pqsecure_destroy() is never called. The if
 (conn->ssl) test that's now at the end of the close_SSL function is
 never true, because conn->ssl is set to NULL earlier.

Yeah, got ahead of myself, as Andres pointed out.

 I'm afraid the move_locks.diff patch you posted earlier is also
 broken; close_SSL() is called in error scenarios from
 pqsecure_open_client(), while already holding the mutex. So it will
 deadlock with itself if the connection cannot be established.

Sorry, I was really just tossing it up to see if it really did avoid
this particular deadlock.  I'm running some tests now with the
attached to see if I can get it to deadlock now.

Thanks,

Stephen
diff --git a/src/interfaces/libpq/fe-secure.c b/src/interfaces/libpq/fe-secure.c
new file mode 100644
index 3bd0113..c008dcf 100644
*** a/src/interfaces/libpq/fe-secure.c
--- b/src/interfaces/libpq/fe-secure.c
*** open_client_SSL(PGconn *conn)
*** 1537,1542 
--- 1537,1544 
  static void
  close_SSL(PGconn *conn)
  {
+ 	bool destroy_needed = conn->ssl ? true : false;
+ 
  	if (conn->ssl)
  	{
  		DECLARE_SIGPIPE_INFO(spinfo);
*** close_SSL(PGconn *conn)
*** 1545,1551 
  		SSL_shutdown(conn->ssl);
  		SSL_free(conn->ssl);
  		conn->ssl = NULL;
- 		pqsecure_destroy();
  		/* We have to assume we got EPIPE */
  		REMEMBER_EPIPE(spinfo, true);
  		RESTORE_SIGPIPE(conn, spinfo);
--- 1547,1552 
*** close_SSL(PGconn *conn)
*** 1565,1570 
--- 1566,1582 
  		conn->engine = NULL;
  	}
  #endif
+ 
+ 	/*
+ 	 * This will remove our SSL locking hooks, if this is the last SSL
+ 	 * connection, which means we must wait to call it until after all
+ 	 * SSL calls have been made, otherwise we can end up with a race
+ 	 * condition and possible deadlocks.
+ 	 *
+ 	 * See comments above destroy_ssl_system().
+ 	 */
+ 	if (destroy_needed)
+ 		pqsecure_destroy();
  }
  
  /*




Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2013-09-13 Thread Peter Geoghegan
On Fri, Sep 13, 2013 at 9:23 AM, Robert Haas robertmh...@gmail.com wrote:
 Andres is being very polite here, but the reality is that this
 approach has zero chance of being accepted.

I quite like Andres, but I have yet to see him behave as you describe
in a situation where someone proposed what was fundamentally a bad
idea. Maybe you should let him speak for himself?

 You can't hold buffer
 locks for a long period of time across complex operations.  Full stop.
  It's a violation of the rules that are clearly documented in
 src/backend/storage/buffer/README, which have been in place for a very
 long time, and this patch is nowhere near important enough to warrant
 a revision of those rules.

The importance of this patch is a value judgement. Our users have been
screaming for this for over ten years, so to my mind it has a fairly
high importance. Also, every other database system of every stripe
worth mentioning has something approximately equivalent to this,
including ones with much less functionality generally. The fact that
we don't is a really unfortunate omission.

As to the rules you refer to, you must mean "These locks are intended
to be short-term: they should not be held for long". I don't think
that they will ever be held for long - at least, not once I've better
managed the amount of work that a heap_insert() can do. I expect to produce
a revision where toasting doesn't happen with the locks held soon.
Actually, I've already written the code, I just need to do some
testing.

 We are not going to risk breaking every
 bit of code anywhere in the backend or in third-party code that takes
 a buffer lock.  You are never going to convince me, or Tom, that the
 benefit of doing that is worth the risk; in fact, I have a hard time
 believing that you'll find ANY committer who thinks this approach is
 worth considering.

I would suggest letting those other individuals speak for themselves
too. Particularly if you're going to name someone who is on vacation
like that.

 Even if you get the code to run without apparent deadlocks, that
 doesn't mean there aren't any;

Of course it doesn't. Who said otherwise?

 Our buffer locking regimen suffers
 from painful complexity and serious maintenance difficulties as is.

That's true to a point, but it has more to do with things like how
VACUUM interacts with hio.c. Things like this:

/*
 * Release the file-extension lock; it's now OK for someone else to extend
 * the relation some more. Note that we cannot release this lock before
 * we have buffer lock on the new page, or we risk a race condition
 * against vacuumlazy.c --- see comments therein.
 */
if (needLock)
UnlockRelationForExtension(relation, ExclusiveLock);

The btree code is different, though: It implements a well-defined
interface, with much clearer separation of concerns. As I've said
already, with trivial exception (think contrib), no external code
considers itself to have license to obtain locks of any sort on btree
buffers. No external code of ours - without exception - does anything
with multiple locks, or exclusive locks on btree buffers. I'll remind
you that I'm only holding shared locks when control is outside of the
btree code.

Even within the btree code, the number of access method functions that
could conflict with what I do here (that acquire exclusive locks) is
very small when you exclude things that only exclusive lock the
meta-page (there are also very few of those). So the surface area is
quite small.

I'm not denying that there is a cost, or that I haven't expanded
things in a direction I'd prefer not to. I just think that it may well
be worth it, particularly when you consider the alternatives - this
may well be the least worst thing. I mean, if we do the promise tuple
thing, and there are multiple unique indexes, what happens when an
inserter needs to block pending the outcome of another transaction?
They had better go clean up the promise tuples from the other unique
indexes that they're trying to insert into, because they cannot afford
to hold value locks for a long time, no matter how they're
implemented. That could take much longer than just releasing a shared
buffer lock, since for each unique index the promise tuple must be
re-found from scratch. There are huge issues with additional
complexity and bloat. Oh, and now your lightweight locks aren't so
lightweight any more.

If the value locks were made interruptible through some method, such
as the promise tuples approach, does that really make deadlocking
acceptable? So at least your system didn't seize up. But on the other
hand, the user randomly had a deadlock error through no fault of their
own. The former may be worse, but the latter is also inexcusable. In
general, the best solution is just to not have deadlock hazards. I
wouldn't be surprised if reasoning about deadlocking was harder with
that alternative approach to value locking, not easier.

 Moreover, we've already got performance and scalability problems that
 are 

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Heikki Linnakangas

On 13.09.2013 22:03, Stephen Frost wrote:

* Andres Freund (and...@2ndquadrant.com) wrote:

It seems slightly cleaner to just move the pqsecure_destroy(); to the
end of that function, based on a boolean. But if you think otherwise, I
won't protest...


Hmm, agreed; I had originally been concerned that the SIGPIPE madness
needed to be around the pqsecure_destroy() call, but I can't see why
that would be.

I've run through a few times w/ the attached and haven't seen the
deadlock.  Will continue testing, of course.

Heikki, any thoughts regarding this change?  Any concerns about
back-patching it?


Umm, with that patch, pqsecure_destroy() is never called. The if 
(conn->ssl) test that's now at the end of the close_SSL function is 
never true, because conn->ssl is set to NULL earlier.


I'm afraid the move_locks.diff patch you posted earlier is also 
broken; close_SSL() is called in error scenarios from 
pqsecure_open_client(), while already holding the mutex. So it will 
deadlock with itself if the connection cannot be established.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large shared_buffer stalls WAS: proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-13 Thread Josh Berkus
On 09/13/2013 09:27 AM, Merlin Moncure wrote:
 I happen to be one of those couple people.  Load goes from 0.1 to
 500 without warning then back to 0.1 equally without warning.
 Unfortunately the server is in a different jurisdiction such that it
 makes deep forensic analysis impossible.  I think this is happening
 more and more often as postgres is becoming increasingly deployed on
 high(er) -end servers.  I've personally (alone) dealt with 4-5
 confirmed cases and there have been many more.  We have a problem.

Can you explain a bit more about this?  I'm currently grappling with a
db cluster which has periodic mysterious total LWLock paralysis, and is
configured with 8GB shared_buffers (and 512GB ram installed).

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Stephen Frost
* Andres Freund (and...@2ndquadrant.com) wrote:
 It seems slightly cleaner to just move the pqsecure_destroy(); to the
 end of that function, based on a boolean. But if you think otherwise, I
 won't protest...

Hmm, agreed; I had originally been concerned that the SIGPIPE madness
needed to be around the pqsecure_destroy() call, but I can't see why
that would be.

I've run through a few times w/ the attached and haven't seen the
deadlock.  Will continue testing, of course.

Heikki, any thoughts regarding this change?  Any concerns about
back-patching it?

Thanks,

Stephen
diff --git a/src/interfaces/libpq/fe-secure.c b/src/interfaces/libpq/fe-secure.c
new file mode 100644
index 3bd0113..ca2c5e4 100644
*** a/src/interfaces/libpq/fe-secure.c
--- b/src/interfaces/libpq/fe-secure.c
*** close_SSL(PGconn *conn)
*** 1545,1551 
  		SSL_shutdown(conn->ssl);
  		SSL_free(conn->ssl);
  		conn->ssl = NULL;
- 		pqsecure_destroy();
  		/* We have to assume we got EPIPE */
  		REMEMBER_EPIPE(spinfo, true);
  		RESTORE_SIGPIPE(conn, spinfo);
--- 1545,1550 
*** close_SSL(PGconn *conn)
*** 1565,1570 
--- 1564,1572 
  		conn->engine = NULL;
  	}
  #endif
+ 
+ 	if (conn->ssl)
+ 		pqsecure_destroy();
  }
  
  /*


signature.asc
Description: Digital signature


Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Andres Freund
On 2013-09-13 15:03:31 -0400, Stephen Frost wrote:
 * Andres Freund (and...@2ndquadrant.com) wrote:
  It seems slightly cleaner to just move the pqsecure_destroy(); to the
  end of that function, based on a boolean. But if you think otherwise, I
  won't protest...
 
 Hmm, agreed; I had originally been concerned that the SIGPIPE madness
 needed to be around the pqsecure_destroy() call, but I can't see why
 that would be.
 
 I've run through a few times w/ the attached and haven't seen the
 deadlock.  Will continue testing, of course.

That patch looks wrong to me. Note that the if (conn->ssl) branch resets
conn->ssl to NULL.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2013-09-13 Thread Peter Geoghegan
On Fri, Sep 13, 2013 at 12:14 PM, Stephen Frost sfr...@snowman.net wrote:
 It was my first concern regarding this patch.

It was my first concern too.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2013-09-13 Thread Andres Freund
On 2013-09-13 11:59:54 -0700, Peter Geoghegan wrote:
 On Fri, Sep 13, 2013 at 9:23 AM, Robert Haas robertmh...@gmail.com wrote:
  Andres is being very polite here, but the reality is that this
  approach has zero chance of being accepted.
 
 I quite like Andres, but I have yet to see him behave as you describe
 in a situation where someone proposed what was fundamentally a bad
 idea. Maybe you should let him speak for himself?

Unfortunately I have to agree with Robert here, I think it's a complete
nogo to do what you propose so far and I've several times now presented
arguments why I think so.
The reason I wasn't saying this will never get accepted are twofold:
a) I don't want to stifle alternative ideas to the promises idea,
just because I think it's the way to go. That might stop a better idea
from being articulated. b) I am not actually in the position to say it's
not going to be accepted.

*I* think that unless you make some fundamental and very, very clever
modifications to your algorithm that end up *not holding a lock over
other operations at all*, it's not going to get committed. And I'll chip
in with my -1.
And clever modification doesn't mean slightly restructuring heapam.c's
operations.

 The importance of this patch is a value judgement. Our users have been
 screaming for this for over ten years, so to my mind it has a fairly
 high importance. Also, every other database system of every stripe
 worth mentioning has something approximately equivalent to this,
 including ones with much less functionality generally. The fact that
 we don't is a really unfortunate omission.

I aggree it's quite important but that doesn't mean we have to do stuff
that we think are unacceptable, especially as there *are* other ways to
do it.

 As to the rules you refer to, you must mean These locks are intended
 to be short-term: they should not be held for long. I don't think
 that they will ever be held for long. At least, when I've managed the
 amount of work that a heap_insert() can do better. I expect to produce
 a revision where toasting doesn't happen with the locks held soon.
 Actually, I've already written the code, I just need to do some
 testing.

I personally think - and have stated so before - that doing a
heap_insert() while holding the btree lock is unacceptable.

 The btree code is different, though: It implements a well-defined
 interface, with much clearer separation of concerns.

Which you're completely violating by linking the btree buffer locking
with the heap locking. It's not about the btree code alone.

At this point I am a bit confused why you are asking for review.

 I mean, if we do the promise tuple thing, and there are multiple
 unique indexes, what happens when an inserter needs to block pending
 the outcome of another transaction?  They had better go clean up the
 promise tuples from the other unique indexes that they're trying to
 insert into, because they cannot afford to hold value locks for a long
 time, no matter how they're implemented.

Why? We're using normal transaction visibility rules here. We don't stop
*other* values on the same index getting updated or similar.
And anyway. It doesn't matter which problem the promises idea
has. We're discussing your proposal here.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Stephen Frost
* Heikki Linnakangas (hlinnakan...@vmware.com) wrote:
 Actually, I think there's a pre-existing bug there in git master. If
 the SSL_set_app_data or SSL_set_fd call in pqsecure_open_client
 fails for some reason, it will call close_SSL() with conn->ssl
 already set, and the mutex held. close_SSL() will call
 pqsecure_destroy()->destroy_SSL()->destroy_ssl_system(), which will
 try to acquire the mutex again.

Yeah, good point, and that one looks like my fault.  Moving it after the
unlock should address that tho, no?  Looking through the other
close_SSL() calls, I don't see any other situations where it might be
called with the lock held.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Stephen Frost
* Andres Freund (and...@2ndquadrant.com) wrote:
 That patch looks wrong to me. Note that the if (conn->ssl) branch resets
 conn->ssl to NULL.

huh, it figures one would overlook the simplest things.  Of course it's
not locking up now- we never remove the hooks (as my original patch was
doing :).  That's what I get for trying to go to meetings and code at
the same time.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Large shared_buffer stalls WAS: proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-13 Thread Merlin Moncure
On Fri, Sep 13, 2013 at 2:36 PM, Josh Berkus j...@agliodbs.com wrote:
 On 09/13/2013 09:27 AM, Merlin Moncure wrote:
 I happen to be one of those couple people.  Load goes from 0.1 to
 500 without warning then back to 0.1 equally without warning.
 Unfortunately the server is in a different jurisdiction such that it
 makes deep forensic analysis impossible.  I think this is happening
 more and more often as postgres is becoming increasingly deployed on
 high(er) -end servers.  I've personally (alone) dealt with 4-5
 confirmed cases and there have been many more.  We have a problem.

 Can you explain a bit more about this?  I'm currently grappling with a
 db cluster which has periodic mysterious total LWLock paralysis, and is
 configured with 8GB shared_buffers (and 512GB ram installed).

see thread: 
http://postgresql.1045698.n5.nabble.com/StrategyGetBuffer-optimization-take-2-td5766307.html
plus others.  In this particular case, s_b needs to be set to 2GB or
lower (they tried raising to 4GB for no good reason) and problem
reoccured.

what are the specific symptoms of your problem?  anything interesting
in pg_locks?  is $client willing to experiment with custom patches?

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large shared_buffer stalls WAS: proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-13 Thread Josh Berkus
On 09/13/2013 12:55 PM, Merlin Moncure wrote:
 what are the specific symptoms of your problem?  anything interesting
 in pg_locks?  is $client willing to experiment with custom patches?

3 servers: 1 master, two replicas.
32-core Xeon, hyperthreaded to 64 cores
512GB RAM each
s_b set to 8GB
Load-balanced between all 3
~~ 11 different databases
combined database size around 600GB
using pgbouncer

Irregularly, during periods of high activity (although not necessarily
peak activity) one or another of the systems will go into paralysis,
with all backends apparently waiting on LWLocks (we need more tracing
information to completely confirm this).  Activity at this time is
usually somewhere between 50 and 100 concurrent queries (and 80 to 150
connections).  pg_locks doesn't think anything is waiting on a lock.
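
(For reference, the sort of quick check we've been running during an
episode -- just a sketch against the 9.3 catalogs:

    select mode, count(*) from pg_locks where not granted group by mode;
    select waiting, count(*) from pg_stat_activity group by waiting;

Neither shows anything waiting on a heavyweight lock, which is part of
why we suspect LWLocks.)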

What's notable is that sometimes it's just *one* of the replicas which
goes into paralysis.  If the master gets this issue though, the replicas
experience it soon afterwards.  Increasing wal_buffers from 16GB to 64GB
seems to make this issue happen less frequently, but it doesn't go away
entirely.  Only a restart of the server, or killing all backend, ends
the lockup.

The workload is OLTP, essentially, around 20/80 write/read.  They use
PostGIS.  The other notable thing about their workload is that due to an
ORM defect, they get idle-in-transactions which last from 5 to 15
seconds several times a minute.

They are willing to use experimental patches, but only if those patches
can be applied only to a replica.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large shared_buffer stalls WAS: proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-13 Thread Josh Berkus
On 09/13/2013 01:25 PM, Andres Freund wrote:
 
 
 Josh Berkus j...@agliodbs.com schrieb:
 
 What's notable is that sometimes it's just *one* of the replicas which
 goes into paralysis.  If the master gets this issue though, the
 replicas
 experience it soon afterwards.  Increasing wal_buffers from 16GB to
 64GB
 seems to make this issue happen less frequently, but it doesn't go away
 entirely.  
 
 Youre talking about MB, not GB here, right?

Oh, yes, right.


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large shared_buffer stalls WAS: proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-13 Thread Andres Freund


Josh Berkus j...@agliodbs.com schrieb:

What's notable is that sometimes it's just *one* of the replicas which
goes into paralysis.  If the master gets this issue though, the
replicas
experience it soon afterwards.  Increasing wal_buffers from 16GB to
64GB
seems to make this issue happen less frequently, but it doesn't go away
entirely.  

Youre talking about MB, not GB here, right?

Andres

-- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Completing PL support for Event Triggers

2013-09-13 Thread Dimitri Fontaine
Hi,

Please find attached to this email three patches covering the missing PL
support for Event Triggers: pltcl, plperl and plpython.

Due to “platform” problems here tonight and the CF deadline, the
plpython patch is known not to pass regression tests on my machine. The
code is fully rebased and compiles without warning, though, so I'm still
entering it into this CF: hopefully I will figure out what's wrong with
my local plpython platform support here early next week.

I'm going to add the 3 attached patches to the CF as a single item, but
maybe we'll then split if we have separate subsystem maintainers for
those 3 PLs?

Given the size of the changes (docs included in the following stats) I
don't think that's going to be necessary, but well, let me know.

 tcl 4 files changed, 204 insertions(+), 24 deletions(-)
 perl4 files changed, 240 insertions(+), 12 deletions(-)
 python  8 files changed, 165 insertions(+), 14 deletions(-)
 total  16 files changed, 609 insertions(+), 50 deletions(-)

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



event_trigger_plperl.v1.patch.gz
Description: Binary data


event_trigger_plpython.v1.patch.gz
Description: Binary data


event_trigger_pltcl.v1.patch.gz
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large shared_buffer stalls WAS: proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-13 Thread Merlin Moncure
On Fri, Sep 13, 2013 at 3:20 PM, Josh Berkus j...@agliodbs.com wrote:
 On 09/13/2013 12:55 PM, Merlin Moncure wrote:
 what are the specific symptoms of your problem?  anything interesting
 in pg_locks?  is $client willing to experiment with custom patches?

 3 servers: 1 master, two replicas.
 32-core Xeon, hyperthreaded to 64 cores
 512GB RAM each
 s_b set to 8GB
 Load-balanced between all 3
 ~~ 11 different databases
 combined database size around 600GB
 using pgbouncer

 Irregularly, during periods of high activity (although not necessarily
 peak activity) one or another of the systems will go into paralysis,
 with all backends apparently waiting on LWLocks (we need more tracing
 information to completely confirm this).  Activity at this time is
 usually somewhere between 50 and 100 concurrent queries (and 80 to 150
 connections).  pg_locks doesn't think anything is waiting on a lock.

 What's notable is that sometimes it's just *one* of the replicas which
 goes into paralysis.  If the master gets this issue though, the replicas
 experience it soon afterwards.  Increasing wal_buffers from 16GB to 64GB
 seems to make this issue happen less frequently, but it doesn't go away
 entirely.  Only a restart of the server, or killing all backend, ends
 the lockup.

 The workload is OLTP, essentially, around 20/80 write/read.  They use
 PostGIS.  The other notable thing about their workload is that due to an
 ORM defect, they get idle-in-transactions which last from 5 to 15
 seconds several times a minute.

 They are willing to use experimental patches, but only if those patches
 can be applied only to a replica.

ok, points similar:
*) master/slave config (two slaves for me)
*) 'big' server 256GB mem, 32 core
*) 80% approx. (perhaps more)
*) some spacial searching (but not very much)
*) OLTP
*) presentation of load, although in my case it did resolve anywhere
from 30 secs to half hour
*) aside from the spike, 100% healthy

points different
*) application side pooling: 96 app servers, max 5 connections each
(aside: are you using transaction mode pgbouncer?)
*) I saw gripes about relation extension in pg_locks

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-13 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:

 Absolutely not claiming the contrary. I think it sucks that we
 couldn't fully figure out what's happening in detail. I'd love to
 get my hand on a setup where it can be reliably reproduced.

I have seen two completely different causes for symptoms like this,
and I suspect that these aren't the only two.

(1)  The dirty page avalanche: PostgreSQL hangs on to a large
number of dirty buffers and then dumps a lot of them at once.  The
OS does the same.  When PostgreSQL dumps its buffers to the OS it
pushes the OS over a tipping point where it is writing dirty
buffers too fast for the controller's BBU cache to absorb them. 
Everything freezes until the controller writes and accepts OS
writes for a lot of data.  This can take several minutes, during
which time the database seems frozen.  Cure is some combination
of these: reduce shared_buffers, make the background writer more
aggressive, checkpoint more often, make the OS dirty page writing
more aggressive, add more BBU RAM to the controller.
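
Purely as an illustration of where those knobs live (the numbers here
are placeholders, not recommendations -- they have to be fitted to the
particular controller and workload):

    # postgresql.conf
    shared_buffers = 2GB                  # reduce shared_buffers
    bgwriter_delay = 100ms                # background writer more aggressive
    bgwriter_lru_maxpages = 400
    checkpoint_timeout = 5min             # checkpoint more often
    checkpoint_completion_target = 0.9    # and spread the writes out

    # Linux, e.g. /etc/sysctl.conf -- more aggressive OS dirty page writing
    vm.dirty_background_bytes = 268435456
    vm.dirty_bytes = 1073741824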

(2)  Transparent huge page support goes haywire on its defrag work.
Clues on this include very high system CPU time during an
episode, and `perf top` shows more time in kernel spinlock
functions than anywhere else.  The database doesn't completely lock
up like with the dirty page avalanche, but it is slow enough that
users often describe it that way.  So far I have only seen this
cured by disabling THP support (in spite of some people urging that
just the defrag be disabled).  It does make me wonder whether there
is something we could do in PostgreSQL to interact better with
THPs.
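
For the record, the switches in question on a stock kernel look like
this (just a sketch -- Red Hat's backported THP exposes them under a
different sysfs path):

    cat /sys/kernel/mm/transparent_hugepage/defrag
    echo never > /sys/kernel/mm/transparent_hugepage/defrag   # defrag only
    echo never > /sys/kernel/mm/transparent_hugepage/enabled  # THP entirely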

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-13 Thread Merlin Moncure
On Fri, Sep 13, 2013 at 4:04 PM, Kevin Grittner kgri...@ymail.com wrote:
 Andres Freund and...@2ndquadrant.com wrote:

 Absolutely not claiming the contrary. I think it sucks that we
 couldn't fully figure out what's happening in detail. I'd love to
 get my hand on a setup where it can be reliably reproduced.

 I have seen two completely different causes for symptoms like this,
 and I suspect that these aren't the only two.

 (1)  The dirty page avalanche: PostgreSQL hangs on to a large
 number of dirty buffers and then dumps a lot of them at once.  The
 OS does the same.  When PostgreSQL dumps its buffers to the OS it
 pushes the OS over a tipping point where it is writing dirty
 buffers too fast for the controller's BBU cache to absorb them.
 Everything freezes until the controller writes and accepts OS
 writes for a lot of data.  This can take several minutes, during
 which time the database seems frozen.  Cure is some combination
 of these: reduce shared_buffers, make the background writer more
 aggressive, checkpoint more often, make the OS dirty page writing
 more aggressive, add more BBU RAM to the controller.

Yeah -- I've seen this too, and it's a well understood problem.
Getting o/s to spin dirty pages out faster is the name of the game I
think.  Storage is getting so fast that it's (mostly) moot anyways.
Also, this is under the umbrella of 'high i/o' -- the stuff I've been
seeing is low- or no-I/O.

 (2)  Transparent huge page support goes haywire on its defrag work.
 Clues on this include very high system CPU time during an
 episode, and `perf top` shows more time in kernel spinlock
 functions than anywhere else.  The database doesn't completely lock
 up like with the dirty page avalanche, but it is slow enough that
 users often describe it that way.  So far I have only seen this
 cured by disabling THP support (in spite of some people urging that
 just the defrag be disabled).  It does make me wonder whether there
 is something we could do in PostgreSQL to interact better with
 THPs.

Ah, that's a useful tip; need to research that, thanks.  Maybe Josh
might be able to give it a whirl...

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large shared_buffer stalls WAS: proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-13 Thread Josh Berkus
On 09/13/2013 01:58 PM, Merlin Moncure wrote:
 ok, points similar:
 *) master/slave config (two slaves for me)
 *) 'big' server 256GB mem, 32 core
 *) 80% approx. (perhaps more)
 *) some spacial searching (but not very much)
 *) OLTP
 *) presentation of load, although in my case it did resolve anywhere
 from 30 secs to half hour
 *) aside from the spike, 100% healthy
 
 points different
 *) application side pooling: 96 app servers, max 5 connections each
 (aside: are you using transaction mode pgbouncer?)

Yes.

 *) I saw gripes about relation extension in pg_locks

I'll check for that next time.

We're also working on seeing if we can reproduce this under test conditions.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large shared_buffer stalls WAS: proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-13 Thread Merlin Moncure
On Fri, Sep 13, 2013 at 4:15 PM, Josh Berkus j...@agliodbs.com wrote:
 On 09/13/2013 01:58 PM, Merlin Moncure wrote:
 ok, points similar:
 *) master/slave config (two slaves for me)
 *) 'big' server 256GB mem, 32 core
 *) 80% approx. (perhaps more)
 *) some spacial searching (but not very much)
 *) OLTP
 *) presentation of load, although in my case it did resolve anywhere
 from 30 secs to half hour
 *) aside from the spike, 100% healthy

 points different
 *) application side pooling: 96 app servers, max 5 connections each
 (aside: are you using transaction mode pgbouncer?)

 Yes

Given that, I would be curious to see what dropping connection pool
down to somewhere between 32-64 connections would do.  This is a
band-aid obviously since not everyone can or wants to run pgbouncer.
But the first thing that pops into my mind in these situations is
that it might alleviate the symptoms.
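
In pgbouncer terms that's just something along these lines (illustrative
numbers only; note default_pool_size is per database/user pair, so it
would need to be sized with their 11 databases in mind):

    [pgbouncer]
    pool_mode = transaction
    default_pool_size = 48
    max_client_conn = 500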

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator

2013-09-13 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:
 On 2013-09-12 15:27:27 -0700, Kevin Grittner wrote:
 The new operator is logically similar to IS NOT DISTINCT FROM for a
 record, although its implementation is very different.  For one
 thing, it doesn't replace the operation with column level operators
 in the parser.  For another thing, it doesn't look up operators for
 each type, so the identical operator does not need to be
 implemented for each type to use it as shown above.  It compares
 values byte-for-byte, after detoasting.  The test for identical
 records can avoid the detoasting altogether for any values with
 different lengths, and it stops when it finds the first column with
 a difference.

 In the general case, that operator sounds dangerous to me. We don't
 guarantee that a Datum containing the same data always has the same
 binary representation. E.g. array can have a null bitmap or may not have
 one, depending on how they were created.

 I am not actually sure whether that's a problem for your usecase, but I
 get headaches when we try circumventing the type abstraction that way.

 Yes, we do such tricks in other places already, but afaik in all those
 places erroneously believing two Datums are distinct is not an error, just
 a missed optimization. Allowing a general operator with such a murky
 definition to creep into something SQL exposed... Hm. Not sure.

Well, the only two alternatives I could see were to allow
user-visible differences not to be carried to the matview if they
old and new values were considered equal, or to implement an
identical operator or function in every type that was to be
allowed in a matview.  Given those options, what's in this patch
seemed to me to be the least evil.

It might be worth noting that this scheme doesn't have a problem
with correctness if there are multiple equal values which are not
identical, as long as any two identical values are equal.  If the
query which generates contents for a matview generates
non-identical but equal values from one run to the next without any
particular reason, that might cause performance problems.
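
To make that concrete with a trivial example (numeric preserves display
scale, which is one easy way to get values that are equal but not
byte-for-byte identical):

    select '1.0'::numeric = '1.00'::numeric;   -- true, the values are equal
    -- but the stored datums differ (different display scale), so a
    -- byte-wise comparison like the proposed identical operator would
    -- not consider them identical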

To mangle Orwell: Among pairs of equal values, some pairs are more
equal than others.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2013-09-13 Thread Peter Geoghegan
On Fri, Sep 13, 2013 at 12:23 PM, Andres Freund and...@2ndquadrant.com wrote:
 The reason I wasn't saying this will never get accepted are twofold:
 a) I don't want to stifle alternative ideas to the promises idea,
 just because I think it's the way to go. That might stop a better idea
 from being articulated. b) I am not actually in the position to say it's
 not going to be accepted.

Well, the reality is that the promises idea hasn't been described in
remotely enough detail to compare it to what I have here. I've pointed
out plenty of problems with it. After all, it was the first thing that
I considered, and I'm on the record talking about it in the 2012 dev
meeting. I didn't take that approach for many good reasons.

The reason I ended up here is not because I didn't get the memo about
holding buffer locks across complex operations being a bad thing. At
least grant me that. I'm here because in all these years no one has
come up with a suggestion that doesn't have some very major downsides.
Like, even worse than this.

 As to the rules you refer to, you must mean These locks are intended
 to be short-term: they should not be held for long. I don't think
 that they will ever be held for long. At least, when I've managed the
 amount of work that a heap_insert() can do better. I expect to produce
 a revision where toasting doesn't happen with the locks held soon.
 Actually, I've already written the code, I just need to do some
 testing.

 I personally think - and have stated so before - that doing a
 heap_insert() while holding the btree lock is unacceptable.

Presumably your reason is essentially that we exclusive lock a heap
buffer (exactly one heap buffer) while holding shared locks on btree
index buffers. Is that really so different to holding an exclusive
lock on a btree buffer while holding a shared lock on a heap buffer?
Because that's what _bt_check_unique() does today.

Now, I'll grant you that there is one appreciable difference, which is
that multiple unique indexes may be involved. But limiting ourselves
to the primary key or something like that remains an option. And I'm
not sure that it's really any worse anyway.

 The btree code is different, though: It implements a well-defined
 interface, with much clearer separation of concerns.

 Which you're completely violating by linking the btree buffer locking
 with the heap locking. It's not about the btree code alone.

You're right that it isn't about just the btree code.

In order for a deadlock to occur, there must be a mutual dependency.
What code could feel entitled to hold buffer locks on btree buffers
and heap buffers at the same time except the btree code itself? It
already does so. But no one else does the same thing. If anyone did
anything with a heap buffer lock held that could result in a call into
one of the btree access method functions (I'm not contemplating the
possibility of this other code locking the btree buffer *directly*),
I'm quite sure that that would be rejected outright today, because
that causes deadlocks. Certainly, vacuumlazy.c doesn't do it, for
example. Why would anyone ever want to do that anyway? I cannot think
of any reason. I suppose that that does still leave transitive
dependencies, but now you're stretching things. After all, you're not
supposed to hold buffer locks for long! The dependency would have to
transit through, say, one of the system LWLocks used for WAL Logging.
Seems pretty close to impossible that it'd be an issue - index stuff
is only WAL-logged as index tuples are inserted (that is, as the locks
are finally released). Everyone automatically does that kind of thing
in a consistent order of locking, unlocking in the opposite order
anyway.

But what of the btree code deadlocking with itself? There are only a
few functions (2 or 3) where that's possible even in principle. I
think that they're going to be not too hard to analyze. For example,
with insertion, the trick is to always lock in a consistent order and
unlock/insert in the opposite order. The heap shared lock(s) needed in
the btree code cannot deadlock with another upserter because once the
other upserter has that exclusive heap buffer lock, it's *inevitable*
that it will release all of its shared buffer locks. Granted, I could
stand to think about this case more, but you get the idea - it *is*
possible to clamp down on the code that needs to care about this stuff
to a large degree. It's subtle, but btrees are generally considered
pretty complicated, and the btree code already cares about some odd
cases like these (it takes special precautions for catalog indexes,
for example).

The really weird thing about my patch is that the btree code trusts
the executor to call the heapam code to do the right thing in the
right way - it now knows more than I'd prefer. Would you be happier if
the btree code took more direct responsibility for the heap tuple
insertion instead? Before you say that's ridiculous, consider the
big modularity violation that 

[HACKERS] plpgsql.print_strict_params

2013-09-13 Thread Marko Tiikkaja

Hi,

Attached is a patch for optionally printing more information on STRICT 
failures in PL/PgSQL:



set plpgsql.print_strict_params to true;

create or replace function footest() returns void as $$
declare
x record;
p1 int := 2;
p3 text := 'foo';
begin
  -- too many rows
  select * from foo where f1 > p1 or f1::text = p3  into strict x;
  raise notice 'x.f1 = %, x.f2 = %', x.f1, x.f2;
end$$ language plpgsql;
select footest();
ERROR:  query returned more than one row
DETAIL:  p1 = '2', p3 = 'foo'
CONTEXT:  PL/pgSQL function footest() line 8 at SQL statement


This parameter is turned off by default to preserve old behaviour, but 
can be extremely useful when debugging code in test environments.


I will add this to the open commitfest, but in the meanwhile, any 
feedback is appreciated.



Regards,
Marko Tiikkaja
*** a/doc/src/sgml/plpgsql.sgml
--- b/doc/src/sgml/plpgsql.sgml
***
*** 1076,1081  END;
--- 1076,1088 
   always sets <literal>FOUND</literal> to true.
  </para>
  
+   <para>
+      The configuration parameter <literal>plpgsql.print_strict_params</>
+      can be enabled to get information about the parameters passed to the
+      query in the <literal>DETAIL</> part of the error message produced
+      when the requirements of STRICT are not met.
+   </para>
+ 
  <para>
   For <command>INSERT</>/<command>UPDATE</>/<command>DELETE</> with
   <literal>RETURNING</>, <application>PL/pgSQL</application> reports
*** a/src/pl/plpgsql/src/pl_exec.c
--- b/src/pl/plpgsql/src/pl_exec.c
***
*** 139,144  static void plpgsql_estate_setup(PLpgSQL_execstate *estate,
--- 139,150 
 ReturnSetInfo *rsi);
  static void exec_eval_cleanup(PLpgSQL_execstate *estate);
  
+ static char *exec_get_query_params(PLpgSQL_execstate *estate,
+ 								   const PLpgSQL_expr *expr);
+ static char *exec_get_dynquery_params(PLpgSQL_execstate *estate,
+ 									  const PreparedParamsData *ppd);
+ 
+ 
  static void exec_prepare_plan(PLpgSQL_execstate *estate,
  PLpgSQL_expr *expr, int cursorOptions);
  static bool exec_simple_check_node(Node *node);
***
*** 3226,3231  exec_prepare_plan(PLpgSQL_execstate *estate,
--- 3232,3310 
exec_simple_check_plan(expr);
  }
  
+ static char *
+ exec_get_query_params(PLpgSQL_execstate *estate,
+ 					  const PLpgSQL_expr *expr)
+ {
+ 	int paramno;
+ 	int dno;
+ 	StringInfoData paramstr;
+ 	Bitmapset *tmpset;
+ 
+ 	if (!expr->paramnos)
+ 		return "(no parameters)";
+ 
+ 	initStringInfo(&paramstr);
+ 	tmpset = bms_copy(expr->paramnos);
+ 	paramno = 1;
+ 	while ((dno = bms_first_member(tmpset)) >= 0)
+ 	{
+ 		Datum paramdatum;
+ 		Oid paramtypeid;
+ 		bool paramisnull;
+ 		int32 paramtypmod;
+ 		PLpgSQL_var *curvar;
+ 
+ 		curvar = (PLpgSQL_var *) estate->datums[dno];
+ 
+ 		exec_eval_datum(estate, (PLpgSQL_datum *) curvar, &paramtypeid,
+ 						&paramtypmod, &paramdatum, &paramisnull);
+ 
+ 		if (paramno > 1)
+ 			appendStringInfo(&paramstr, ", ");
+ 
+ 		if (paramisnull)
+ 			appendStringInfo(&paramstr, "%s = NULL", curvar->refname);
+ 		else
+ 		{
+ 			char *value = convert_value_to_string(estate, paramdatum, paramtypeid);
+ 			appendStringInfo(&paramstr, "%s = '%s'", curvar->refname, value);
+ 		}
+ 
+ 		paramno++;
+ 	}
+ 	bms_free(tmpset);
+ 
+ 	return paramstr.data;
+ }
+ 
+ static char *
+ exec_get_dynquery_params(PLpgSQL_execstate *estate,
+ 						 const PreparedParamsData *ppd)
+ {
+ 	int i;
+ 	StringInfoData paramstr;
+ 
+ 	if (!ppd)
+ 		return "(no parameters)";
+ 
+ 	initStringInfo(&paramstr);
+ 	for (i = 0; i < ppd->nargs; ++i)
+ 	{
+ 		if (i > 0)
+ 			appendStringInfoString(&paramstr, ", ");
+ 
+ 		if (ppd->nulls[i] == 'n')
+ 			appendStringInfo(&paramstr, "$%d = NULL", i+1);
+ 		else
+ 		{
+ 			char *value = convert_value_to_string(estate, ppd->values[i], ppd->types[i]);
+ 			appendStringInfo(&paramstr, "$%d = '%s'", i+1, value);
+ 		}
+ 	}
+ 
+ 	return paramstr.data;
+ }
  
  /* ----------
   * exec_stmt_execsql			Execute an SQL statement (possibly with INTO).
***
*** 3391,3408  exec_stmt_execsql(PLpgSQL_execstate *estate,
if (n == 0)
{
		if (stmt->strict)

Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-13 Thread Andres Freund
On 2013-09-13 14:04:55 -0700, Kevin Grittner wrote:
 Andres Freund and...@2ndquadrant.com wrote:
 
  Absolutely not claiming the contrary. I think it sucks that we
  couldn't fully figure out what's happening in detail. I'd love to
  get my hand on a setup where it can be reliably reproduced.
 
 I have seen two completely different causes for symptoms like this,
 and I suspect that these aren't the only two.
 
 (1)  The dirty page avalanche: PostgreSQL hangs on to a large
 number of dirty buffers and then dumps a lot of them at once.  The
 OS does the same.  When PostgreSQL dumps its buffers to the OS it
 pushes the OS over a tipping point where it is writing dirty
 buffers too fast for the controller's BBU cache to absorb them. 
 Everything freezes until the controller writes and accepts OS
 writes for a lot of data.  This can take several minutes, during
 which time the database seems frozen.  Cure is some combination
 of these: reduce shared_buffers, make the background writer more
 aggressive, checkpoint more often, make the OS dirty page writing
 more aggressive, add more BBU RAM to the controller.

That should hopefully be diagnosable from other system stats like the
dirty rate.

 (2)  Transparent huge page support goes haywire on its defrag work.
 Clues on this include very high system CPU time during an
 episode, and `perf top` shows more time in kernel spinlock
 functions than anywhere else.  The database doesn't completely lock
 up like with the dirty page avalanche, but it is slow enough that
 users often describe it that way.  So far I have only seen this
 cured by disabling THP support (in spite of some people urging that
 just the defrag be disabled).  

Yes, I have seen that issue a couple of times now as well. I can confirm
that in at least two cases disabling defragmentation alone proved to be
enough to fix the issue.
Annoyingly enough there are different ways to disable
defragmentation/THP depending on whether you're using THP backported by
redhat or the upstream version...

 It does make me wonder whether there
 is something we could do in PostgreSQL to interact better with
 THPs.

The best thing I see is to just use explicit hugepages. I've previously
sent a prototype for that, which has since been turned into an actual
implementation by Christian Kruse.
A colleague of mine is working on polishing that patch into something
committable.
If you use large s_b, the memory savings alone (some 100kb instead of
dozens of megabytes per backend) can be worth it, to say nothing of
actual performance gains.

Updating the kernel helps as well, they've improved the efficiency of
defragmentation a good bit.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator

2013-09-13 Thread Andres Freund
On 2013-09-13 14:36:27 -0700, Kevin Grittner wrote:
 Andres Freund and...@2ndquadrant.com wrote:
  On 2013-09-12 15:27:27 -0700, Kevin Grittner wrote:
  The new operator is logically similar to IS NOT DISTINCT FROM for a
  record, although its implementation is very different.  For one
  thing, it doesn't replace the operation with column level operators
  in the parser.  For another thing, it doesn't look up operators for
  each type, so the identical operator does not need to be
  implemented for each type to use it as shown above.  It compares
  values byte-for-byte, after detoasting.  The test for identical
  records can avoid the detoasting altogether for any values with
  different lengths, and it stops when it finds the first column with
  a difference.
 
  In the general case, that operator sounds dangerous to me. We don't
  guarantee that a Datum containing the same data always has the same
  binary representation. E.g. array can have a null bitmap or may not have
  one, depending on how they were created.
 
  I am not actually sure whether that's a problem for your usecase, but I
  get headaches when we try circumventing the type abstraction that way.
 
  Yes, we do such tricks in other places already, but afaik in all those
  places erroneously believing two Datums are distinct is not an error, just
  a missed optimization. Allowing a general operator with such a murky
  definition to creep into something SQL exposed... Hm. Not sure.
 
 Well, the only two alternatives I could see were to allow
 user-visible differences not to be carried to the matview if the
 old and new values were considered equal, or to implement an
 identical operator or function in every type that was to be
 allowed in a matview.  Given those options, what's in this patch
 seemed to me to be the least evil.
 
 It might be worth noting that this scheme doesn't have a problem
 with correctness if there are multiple equal values which are not
 identical, as long as any two identical values are equal.  If the
 query which generates contents for a matview generates
 non-identical but equal values from one run to the next without any
 particular reason, that might cause performance problems.

I am not actually that concerned with MVCs using this, you're quite
capable of analyzing the dangers. What I am wary of is exposing an
operator that's basically broken from the get go to SQL.
Now, the obvious issue there is that matviews use SQL to refresh :(

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2013-09-13 Thread Kevin Grittner
Peter Geoghegan p...@heroku.com wrote:

 we exclusive lock a heap buffer (exactly one heap buffer) while
 holding shared locks on btree index buffers. Is that really so
 different to holding an exclusive lock on a btree buffer while
 holding a shared lock on a heap buffer? Because that's what
 _bt_check_unique() does today.

Is it possible to get a deadlock doing only one of those two
things?  Is it possible to avoid a deadlock doing both of them?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator

2013-09-13 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:

 I am not actually that concerned with MVCs using this, you're quite
 capable of analyzing the dangers. What I am wary of is exposing an
 operator that's basically broken from the get go to SQL.
 Now, the obvious issue there is that matviews use SQL to refresh :(

I'm not sure why these operators are more broken or dangerous than
those which already exist to support the text_pattern_ops and
bpchar_pattern_ops operator families.  I could overload those
operator names as much as possible if that seems better.  As I said
at the start of the thread, I have no particular attachment to
these operator names.  For example, that would mean using ~=~ as
the operator for record_image_ge() instead of using ==.  It would
be trivial to make that adjustment to the patch.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator

2013-09-13 Thread Andres Freund
On 2013-09-13 15:13:20 -0700, Kevin Grittner wrote:
 Andres Freund and...@2ndquadrant.com wrote:
 
  I am not actually that concerned with MVCs using this, you're quite
  capable of analyzing the dangers. What I am wary of is exposing an
  operator that's basically broken from the get go to SQL.
  Now, the obvious issue there is that matviews use SQL to refresh :(
 
 I'm not sure why these operators are more broken or dangerous than
 those which already exist to support the text_pattern_ops and
 bpchar_pattern_ops operator families.  I could overload those
 operator names as much as possible if that seems better.  As I said
 at the start of the thread, I have no particular attachment to
  these operator names.  For example, that would mean using ~>=~ as
  the operator for record_image_ge() instead of using >==.  It would
 be trivial to make that adjustment to the patch.

Hm. I don't see the similarity. Those have pretty clearly defined
behaviour. Not one that's dependent on padding bytes, null bitmaps
that can or cannot be present and such.

I guess we need input from others here.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpgsql.print_strict_params

2013-09-13 Thread Alvaro Herrera
Marko Tiikkaja wrote:

 set plpgsql.print_strict_params to true;

Hmm, shouldn't this be a compile option?  See the comp_option
production in pl_gram.y for existing examples.
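
I'm thinking of something in the style of the existing #variable_conflict
option, i.e. (sketch only -- the #print_strict_params spelling here is
hypothetical):

    create or replace function footest() returns void as $$
    #print_strict_params on
    -- (hypothetical option name, by analogy with #variable_conflict)
    declare
    x record;
    begin
      select * from foo into strict x;
    end$$ language plpgsql;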

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpgsql.print_strict_params

2013-09-13 Thread Marko Tiikkaja

On 2013-09-14 00:39, Alvaro Herrera wrote:

set plpgsql.print_strict_params to true;


Hmm, shouldn't this be a compile option?  See the comp_option
production in pl_gram.y for existing examples.


I thought about that, but it seemed more like a runtime option to me. 
Any particular reason you think it would be better as a compile time option?



Regards,
Marko Tiikkaja



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] patch: add MAP_HUGETLB to mmap() where supported (WIP)

2013-09-13 Thread Richard Poole
The attached patch adds the MAP_HUGETLB flag to mmap() for shared memory
on systems that support it. It's based on Christian Kruse's patch from
last year, incorporating suggestions from Andres Freund.

On a system with 4GB shared_buffers, doing pgbench runs long enough for
each backend to touch most of the buffers, this patch saves nearly 8MB of
memory per backend and improves performance by just over 2% on average.

It is still WIP as there are a couple of points that Andres has pointed
out to me that haven't been addressed yet; also, the documentation is
incomplete.

Richard

-- 
Richard Poole http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 23ebc11..703b28f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1052,6 +1052,42 @@ include 'filename'
   /listitem
  /varlistentry
 
+ <varlistentry id="guc-huge-tlb-pages" xreflabel="huge_tlb_pages">
+  <term><varname>huge_tlb_pages</varname> (<type>enum</type>)</term>
+  <indexterm>
+   <primary><varname>huge_tlb_pages</> configuration parameter</primary>
+  </indexterm>
+  <listitem>
+   <para>
+    Enables/disables the use of huge tlb pages. Valid values are
+    <literal>on</literal>, <literal>off</literal> and <literal>try</literal>.
+    The default value is <literal>try</literal>.
+   </para>
+
+	   <para>
+	   Use of huge tlb pages reduces the cpu time spent on memory management and
+	   the amount of memory used for page tables and therefore improves performance.
+	   </para>
+
+   <para>
+    With <varname>huge_tlb_pages</varname> set to <literal>on</literal>
+    <symbol>mmap()</symbol> will be called with <symbol>MAP_HUGETLB</symbol>.
+    If the call fails the server will fail fatally.
+   </para>
+
+   <para>
+    With <varname>huge_tlb_pages</varname> set to <literal>off</literal> we
+    will not use <symbol>MAP_HUGETLB</symbol> at all.
+   </para>
+
+   <para>
+    With <varname>huge_tlb_pages</varname> set to <literal>try</literal>
+    we will try to use <symbol>MAP_HUGETLB</symbol> and fall back to
+    <symbol>mmap()</symbol> without <symbol>MAP_HUGETLB</symbol>.
+   </para>
+  </listitem>
+ </varlistentry>
+
  <varlistentry id="guc-temp-buffers" xreflabel="temp_buffers">
   <term><varname>temp_buffers</varname> (<type>integer</type>)</term>
   <indexterm>
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index 20e3c32..57fff35 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -27,10 +27,14 @@
 #ifdef HAVE_SYS_SHM_H
 #include <sys/shm.h>
 #endif
+#ifdef MAP_HUGETLB
+#include <dirent.h>
+#endif
 
 #include "miscadmin.h"
 #include "storage/ipc.h"
 #include "storage/pg_shmem.h"
+#include "utils/guc.h"
 
 
 typedef key_t IpcMemoryKey;		/* shared memory key passed to shmget(2) */
@@ -61,6 +65,13 @@ typedef int IpcMemoryId;		/* shared memory ID returned by shmget(2) */
 #define MAP_FAILED ((void *) -1)
 #endif
 
+#ifdef MAP_HUGETLB
+#define PG_HUGETLB_BASE_ADDR (void *)(0x0UL)
+#define PG_MAP_HUGETLB MAP_HUGETLB
+#else
+#define PG_MAP_HUGETLB 0
+#endif
+
 
 unsigned long UsedShmemSegID = 0;
 void	   *UsedShmemSegAddr = NULL;
@@ -342,6 +353,161 @@ PGSharedMemoryIsInUse(unsigned long id1, unsigned long id2)
 }
 
 
+#ifdef MAP_HUGETLB
+#define HUGE_PAGE_INFO_DIR  "/sys/kernel/mm/hugepages"
+
+/*
+ *	static long InternalGetFreeHugepagesCount(const char *name)
+ *
+ * Attempt to read the number of available hugepages from
+ * /sys/kernel/mm/hugepages/hugepages-<size>/free_hugepages
+ * Will fail (return -1) if file could not be opened, 0 if no pages are available
+ * and > 0 if there are free pages
+ *
+ */
+static long
+InternalGetFreeHugepagesCount(const char *name)
+{
+	int fd;
+	char buff[1024];
+	size_t len;
+	long result;
+	char *ptr;
+
+	len = snprintf(buff, 1024, "%s/%s/free_hugepages", HUGE_PAGE_INFO_DIR, name);
+	if (len == 1024) /* I don't think that this will happen ever */
+	{
+		ereport(huge_tlb_pages == HUGE_TLB_TRY ? DEBUG1 : WARNING,
+				(errmsg("Filename %s/%s/free_hugepages is too long", HUGE_PAGE_INFO_DIR, name),
+				 errcontext("while checking hugepage size")));
+		return -1;
+	}
+
+	fd = open(buff, O_RDONLY);
+	if (fd <= 0)
+	{
+		ereport(huge_tlb_pages == HUGE_TLB_TRY ? DEBUG1 : WARNING,
+				(errmsg("Could not open file %s: %s", buff, strerror(errno)),
+				 errcontext("while checking hugepage size")));
+		return -1;
+	}
+
+	len = read(fd, buff, 1024);
+	if (len <= 0)
+	{
+		ereport(huge_tlb_pages == HUGE_TLB_TRY ? DEBUG1 : WARNING,
+				(errmsg("Error reading from file %s: %s", buff, strerror(errno)),
+				 errcontext("while checking hugepage size")));
+		close(fd);
+		return -1;
+	}
+
+	/*
+	 * If the content of free_hugepages is longer than or equal to 1024 bytes
+	 * the rest is irrelevant; we simply want to know if there are any
+	 * hugepages left
+	 */
+	if (len == 1024)
+	{
+		buff[1023] = 0;
+	}
+	else
+	{
+		buff[len] = 0;
+	}
+
+	close(fd);
+
+	result = 

Re: [HACKERS] ENABLE/DISABLE CONSTRAINT NAME

2013-09-13 Thread Robert Haas
On Thu, Sep 12, 2013 at 10:03 PM,  wangs...@highgo.com.cn wrote:
 Second, I tested the check and the foreign key constraint as your test
 above.
 And no error found, as fellow:

You're missing the point.  Peter wasn't worried that your patch throws
an error; he's concerned about the fact that it doesn't.

In PostgreSQL, you can only create the following view because test1
has a primary key over column a:

= create table test1 (a int constraint pk primary key, b text);
= create view test2 as select a, b from test1 group by a;
= alter table test1 drop constraint pk;

The reason is that, if the primary key weren't there, it would be
ambiguous which row should be returned from among multiple rows where
a is equal and b is not.  If you can disable the constraint, then you
can create precisely that problem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpgsql.print_strict_params

2013-09-13 Thread Alvaro Herrera
Marko Tiikkaja wrote:
 On 2013-09-14 00:39, Alvaro Herrera wrote:
 set plpgsql.print_strict_params to true;
 
 Hmm, shouldn't this be a compile option?  See the comp_option
 production in pl_gram.y for existing examples.
 
 I thought about that, but it seemed more like a runtime option to
 me. Any particular reason you think it would be better as a compile
 time option?

Seems like something that would be useful to change per function, rather
than globally, no?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)

2013-09-13 Thread Peter Eisentraut
On Fri, 2013-09-13 at 14:56 +0530, Atri Sharma wrote:
 This is our complete patch for implementation of WITHIN GROUP.

Please fix compiler warnings:

inversedistribution.c: In function ‘mode_final’:
inversedistribution.c:276:11: warning: ‘mode_val’ may be used uninitialized in 
this function [-Wmaybe-uninitialized]
inversedistribution.c:299:8: warning: ‘last_val’ may be used uninitialized in 
this function [-Wmaybe-uninitialized]




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpgsql.print_strict_params

2013-09-13 Thread Marko Tiikkaja

On 2013-09-14 03:33, Alvaro Herrera wrote:

Marko Tiikkaja wrote:

I thought about that, but it seemed more like a runtime option to
me. Any particular reason you think it would be better as a compile
time option?


Seems like something that would be useful to change per function, rather
than globally, no?


The way I see it, STRICT is like an assertion, and you *would* pretty 
much always change it globally.


However, it would be useful to also have the option to set it for a 
single function only.  Will look into that.  Thanks for the suggestion!



Regards,
Marko Tiikkaja


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator

2013-09-13 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:

 Not one that's dependent on padding bytes, null bitmaps that
 can or cannot be present and such.

Can you provide an example of where that's an issue with this
patch?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL/pgSQL, RAISE and error context

2013-09-13 Thread Marko Tiikkaja

On 23/08/2013 10:36, I wrote:

On 8/23/13 8:38 AM, Pavel Stehule wrote:

do you prepare patch ?


I should have the time to produce one for the September commitfest, but
if you (or anyone else) want to work on this, I won't object.

My opinion at this very moment is that we should leave the DEFAULT
verbosity alone and add a new one (call it COMPACT or such) with the
suppressed context for non-ERRORs.


Well, turns out there isn't really any way to preserve complete 
backwards compatibility if we want to do this change.


The attached patch (based on Pavel's patch) changes the default to be 
slightly more verbose (the CONTEXT lines which were previously omitted 
will be visible), but adds a new PGVerbosity called COMPACT which 
suppresses CONTEXT in non-error messages.  Now DEFAULT will be more 
useful when debugging PL/PgSQL, and people who are annoyed by the new 
behaviour can use the COMPACT mode.
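
For illustration, with the patch applied the new mode would presumably be
selected the same way as the existing ones (a sketch, not taken from the
patch's documentation):

\set VERBOSITY compact       -- in psql; see the startup.c hunk below

and a libpq application would call PQsetErrorVerbosity(conn,
PQERRORS_COMPACT), per the libpq-fe.h and fe-protocol3.c hunks.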


Any thoughts?



Regards,
Marko Tiikkaja

*** a/doc/src/sgml/libpq.sgml
--- b/doc/src/sgml/libpq.sgml
***
*** 5418,5423  int PQsetClientEncoding(PGconn 
*replaceableconn/replaceable, const char *re
--- 5418,5424 
  typedef enum
  {
  PQERRORS_TERSE,
+ PQERRORS_COMPACT,
  PQERRORS_DEFAULT,
  PQERRORS_VERBOSE
  } PGVerbosity;
***
*** 5430,5439  PGVerbosity PQsetErrorVerbosity(PGconn *conn, PGVerbosity 
verbosity);
returned messages include severity, primary text, and position only;
this will normally fit on a single line.  The default mode produces
messages that include the above plus any detail, hint, or context
!   fields (these might span multiple lines).  The <firstterm>VERBOSE</>
!   mode includes all available fields.  Changing the verbosity does not
!   affect the messages available from already-existing
!   <structname>PGresult</> objects, only subsequently-created ones.
   </para>
  </listitem>
 </varlistentry>
--- 5431,5442 
returned messages include severity, primary text, and position only;
this will normally fit on a single line.  The default mode produces
messages that include the above plus any detail, hint, or context
!   fields (these might span multiple lines).  The COMPACT mode is otherwise
!   the same as the default, except the context field will be omitted for
!   non-error messages.  The <firstterm>VERBOSE</> mode includes all
!   available fields.  Changing the verbosity does not affect the messages
!   available from already-existing <structname>PGresult</> objects, only
!   subsequently-created ones.
   </para>
  </listitem>
 </varlistentry>
*** a/src/bin/psql/startup.c
--- b/src/bin/psql/startup.c
***
*** 796,801  verbosity_hook(const char *newval)
--- 796,803 
pset.verbosity = PQERRORS_DEFAULT;
	else if (strcmp(newval, "terse") == 0)
		pset.verbosity = PQERRORS_TERSE;
+	else if (strcmp(newval, "compact") == 0)
+		pset.verbosity = PQERRORS_COMPACT;
	else if (strcmp(newval, "verbose") == 0)
pset.verbosity = PQERRORS_VERBOSE;
else
*** a/src/interfaces/libpq/fe-protocol3.c
--- b/src/interfaces/libpq/fe-protocol3.c
***
*** 915,920  pqGetErrorNotice3(PGconn *conn, bool isError)
--- 915,924 
		if (val)
			appendPQExpBuffer(workBuf, libpq_gettext("QUERY:  %s\n"), val);
		val = PQresultErrorField(res, PG_DIAG_CONTEXT);
+	}
+	if (isError || (conn->verbosity != PQERRORS_TERSE &&
+					conn->verbosity != PQERRORS_COMPACT))
+	{
		if (val)
			appendPQExpBuffer(workBuf, libpq_gettext("CONTEXT:  %s\n"), val);
	}
*** a/src/interfaces/libpq/libpq-fe.h
--- b/src/interfaces/libpq/libpq-fe.h
***
*** 106,111  typedef enum
--- 106,112 
  typedef enum
  {
	PQERRORS_TERSE,				/* single-line error messages */
+	PQERRORS_COMPACT,			/* single-line error messages on non-error messages */
	PQERRORS_DEFAULT,			/* recommended style */
	PQERRORS_VERBOSE			/* all the facts, ma'am */
  } PGVerbosity;
*** a/src/pl/plpgsql/src/pl_exec.c
--- b/src/pl/plpgsql/src/pl_exec.c
***
*** 39,46 
  #include "utils/typcache.h"
  
  
- static const char *const raise_skip_msg = "RAISE";
- 
  typedef struct
  {
	int			nargs;			/* number of arguments */
--- 39,44 
***
*** 867,876  plpgsql_exec_error_callback(void *arg)
  {
PLpgSQL_execstate *estate = (PLpgSQL_execstate *) arg;
  
-   /* if we are doing RAISE, don't report its location */
-	if (estate->err_text == raise_skip_msg)
-		return;
-
	if (estate->err_text != NULL)
{
/*
--- 865,870 
***
*** 3032,3038  

[HACKERS] Proposal: PL/PgSQL strict_mode

2013-09-13 Thread Marko Tiikkaja

Hi,

After my previous suggestion for adding a STRICT keyword got shot 
down[1], I've been thinking about an idea Andrew Gierth tossed out: 
adding a new strict mode into PL/PgSQL.  In this mode, any query which 
processes more than one row will raise an exception.  This is a bit 
similar to specifying INTO STRICT .. for every statement, except 
processing no rows does not trigger the exception.  The need for this 
mode comes from a few observations I make almost every day:

  1) The majority of statements only deal with exactly 0 or 1 rows.
  2) Checking row_count for a statement is ugly and cumbersome, so 
often it just isn't checked.  I often use RETURNING TRUE INTO STRICT _OK 
for DML (see the sketch after this list), but that a) requires an extra 
variable, and b) isn't possible if 0 rows affected is not an error in the 
application logic.
  3) SELECT .. INTO only fetches one row and ignores the rest.  Even 
row_count is always set to 0 or 1, so there's no way to fetch a value 
*and* to check that there would not have been more rows.  This creates 
bugs which make your queries return wrong results and which could go 
undetected for a long time.
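
To make points 2 and 3 concrete, here is a small sketch (the table and
column names are invented for illustration):

do $$
declare
    _ok      boolean;
    _balance numeric;
    _rows    bigint;
begin
    -- point 2: the extra-variable idiom to assert exactly one row for DML
    update accounts set balance = balance - 100 where id = 42
        returning true into strict _ok;

    -- point 3: SELECT .. INTO quietly ignores extra rows, and row_count
    -- is only ever 0 or 1, so a duplicate-row bug can go unnoticed
    select balance into _balance from accounts where owner = 'bob';
    get diagnostics _rows = row_count;
end$$;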


Attached is a proof-of-concept patch (no docs, probably some actual code 
problems too) to implement this as a compile option:


=# create or replace function footest() returns void as $$
$# #strict_mode strict
$# begin
$#   -- not allowed to delete more than one row
$#   delete from foo where f1  100;
$# end$$ language plpgsql;
CREATE FUNCTION
=# select footest();
ERROR:  query processed more than one row
CONTEXT:  PL/pgSQL function footest() line 5 at SQL statement

Now while I think this is a step into the right direction, I do have a 
couple of problems with this patch:
  1) I'm not sure what would be the correct behaviour with EXECUTE. 
I'm tempted to just leave EXECUTE alone, as it has slightly different 
rules anyway.
  2) If you're running in strict mode and you want to 
insert/update/delete more than one row, things get a bit uglier; a wCTE 
would work for some cases.  If strict mode doesn't affect EXECUTE (see 
point 1 above), that could work too.  Or maybe there could be a new 
command which runs a query, discards the results and ignores the number 
of rows processed.
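
For instance, with the foo table from the example above, a wCTE version
that should still pass under strict mode might look like this (assuming the
mode only counts the rows the outer statement processes; the predicate is
just illustrative):

with deleted as (
    delete from foo where f1 < 100 returning *
)
select count(*) from deleted;   -- the outer query yields exactly one row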


I'll be adding this to the open commitfest in hopes of getting some 
feedback on this idea (I'm prepared to hear a lot of you're crazy!), 
but feel free to comment away any time you please.



Regards,
Marko Tiikkaja

[1]: http://www.postgresql.org/message-id/510bf731.5020...@gmx.net
*** a/src/pl/plpgsql/src/pl_comp.c
--- b/src/pl/plpgsql/src/pl_comp.c
***
*** 351,356  do_compile(FunctionCallInfo fcinfo,
--- 351,357 
	function->fn_cxt = func_cxt;
	function->out_param_varno = -1;		/* set up for no OUT param */
	function->resolve_option = plpgsql_variable_conflict;
+	function->strict_mode = plpgsql_strict_mode;
  
	if (is_dml_trigger)
		function->fn_is_trigger = PLPGSQL_DML_TRIGGER;
***
*** 847,852  plpgsql_compile_inline(char *proc_source)
--- 848,854 
	function->fn_cxt = func_cxt;
	function->out_param_varno = -1;		/* set up for no OUT param */
	function->resolve_option = plpgsql_variable_conflict;
+	function->strict_mode = plpgsql_strict_mode;
  
plpgsql_ns_init();
plpgsql_ns_push(func_name);
*** a/src/pl/plpgsql/src/pl_exec.c
--- b/src/pl/plpgsql/src/pl_exec.c
***
*** 3238,3243  exec_stmt_execsql(PLpgSQL_execstate *estate,
--- 3238,3244 
	ParamListInfo paramLI;
	long		tcount;
	int			rc;
+	bool		strict_mode = estate->func->strict_mode;
	PLpgSQL_expr *expr = stmt->sqlstmt;
  
/*
***
*** 3278,3293  exec_stmt_execsql(PLpgSQL_execstate *estate,
  
/*
 * If we have INTO, then we only need one row back ... but if we have INTO
!* STRICT, ask for two rows, so that we can verify the statement returns
!* only one.  INSERT/UPDATE/DELETE are always treated strictly. Without
!* INTO, just run the statement to completion (tcount = 0).
 *
 * We could just ask for two rows always when using INTO, but there are
 * some cases where demanding the extra row costs significant time, eg by
 * forcing completion of a sequential scan.  So don't do it unless we need
 * to enforce strictness.
 */
!	if (stmt->into)
	{
		if (stmt->strict || stmt->mod_stmt)
tcount = 2;
--- 3279,3297 
  
/*
 * If we have INTO, then we only need one row back ... but if we have INTO
!* STRICT or we're in strict mode, ask for two rows, so that we can verify
!* the statement returns only one.  INSERT/UPDATE/DELETE are always treated
!* strictly.  Without INTO, just run the statement to completion
!* (tcount = 0).
 *
 * We 

Re: [HACKERS] Proposal: PL/PgSQL strict_mode

2013-09-13 Thread Marko Tiikkaja

On 14/09/2013 06:28, I wrote:

2) Checking row_count for a statement is ugly and cumbersome, so
often it just isn't checked.  I often use RETURNING TRUE INTO STRICT _OK
for DML, but that a) requires an extra variable, and b) isn't possible
if 0 rows affected is not an error in the application logic.


The b) part here wasn't exactly true; you could use RETURNING TRUE INTO 
OK, since INSERT/UPDATE/DELETE with RETURNING .. INTO raises an exception 
if it returns more than one row.  But it's even more convoluted than 
RETURNING TRUE INTO STRICT _OK, and it's repetitive.
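
In other words (variable names invented), the difference is only in which
cases the INTO clause complains about:

-- without STRICT: raises only if more than one row comes back
delete from foo where f1 = 1 returning true into _ok;

-- with STRICT: additionally raises if no row comes back
delete from foo where f1 = 1 returning true into strict _ok;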



Regards,
Marko Tiikkaja


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: PL/PgSQL strict_mode

2013-09-13 Thread chris travers
A few thoughts about this.

 On 14 September 2013 at 05:28 Marko Tiikkaja ma...@joh.to wrote:


 Hi,

 After my previous suggestion for adding a STRICT keyword got shot
 down[1], I've been thinking about an idea Andrew Gierth tossed out:
 adding a new strict mode into PL/PgSQL. In this mode, any query which
 processes more than one row will raise an exception. This is a bit
 similar to specifying INTO STRICT .. for every statement, except
 processing no rows does not trigger the exception. The need for this
 mode comes from a few observations I make almost every day:
 1) The majority of statements only deal with exactly 0 or 1 rows.
 2) Checking row_count for a statement is ugly and cumbersome, so
 often it just isn't checked. I often use RETURNING TRUE INTO STRICT _OK
 for DML, but that a) requires an extra variable, and b) isn't possible
 if 0 rows affected is not an error in the application logic.
 3) SELECT .. INTO only fetches one row and ignores the rest. Even
 row_count is always set to 0 or 1, so there's no way to fetch a value
 *and* to check that there would not have been more rows. This creates
 bugs which make your queries return wrong results and which could go
 undetected for a long time.

I am going to suggest that STRICT is semantically pretty far from what is meant
here in common speech.  I think STRICT here would be confusing.  This would be
really pretty severe for people coming from Perl or MySQL backgrounds.

May I suggest SINGLE as a keyword instead?  It might be worth having it attached
to INSERT, UPDATE, and DELETE statements.

I am thinking something like:

DELETE SINGLE FROM foo WHERE f1  1000;

would be more clearer.  Similarly one could have:

INSERT SINGLE INTO foo SELECT * from foo2;

and

UPDATE SINGLE foo

You could even use SELECT SINGLE, but I'm not sure what the use case would be
where unique indexes are not sufficient.



 Attached is a proof-of-concept patch (no docs, probably some actual code
 problems too) to implement this as a compile option:

 =# create or replace function footest() returns void as $$
 $# #strict_mode strict
 $# begin
 $# -- not allowed to delete more than one row
 $# delete from foo where f1  100;
 $# end$$ language plpgsql;
 CREATE FUNCTION
 =# select footest();
 ERROR: query processed more than one row
 CONTEXT: PL/pgSQL function footest() line 5 at SQL statement

 Now while I think this is a step into the right direction, I do have a
 couple of problems with this patch:
 1) I'm not sure what would be the correct behaviour with EXECUTE.
 I'm tempted to just leave EXECUTE alone, as it has slightly different
 rules anyway.
 2) If you're running in strict mode and you want to
 insert/update/delete more than one row, things get a bit uglier; a wCTE
 would work for some cases. If strict mode doesn't affect EXECUTE (see
 point 1 above), that could work too. Or maybe there could be a new
 command which runs a query, discards the results and ignores the number
 of rows processed.

Yeah, I am worried about this one.  I am concerned that if you can't disable it on
a statement-by-statement basis, you end up removing the mode from the function
entirely, and then it becomes a lot harder for everyone maintaining the function
to have a clear picture of what is going on.  I am further worried that ugly
hacked-up workarounds will introduce plenty of other maintenance pain points
that will be worse than what you are solving.

 I'll be adding this to the open commitfest in hopes of getting some
 feedback on this idea (I'm prepared to hear a lot of you're crazy!),
 but feel free to comment away any time you please.

Well, I don't know if my feedback above is helpful, but there it is ;-)


 Regards,
 Marko Tiikkaja

 [1]: http://www.postgresql.org/message-id/510bf731.5020...@gmx.net

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers
Best Wishes,
Chris Travers
http://www.2ndquadrant.com
PostgreSQL Services, Training, and Support