Re: [HACKERS] review: pgbench progress report improvements
Hello,

> About patch eols:
>
>   postgresql> patch -p1 < ../pgbench-measurements-v2.patch
>   patching file contrib/pgbench/pgbench.c
>   patching file doc/src/sgml/pgbench.sgml
>
> It can depend on the OS. I did tests on Fedora 14, and to apply the patch without warnings I had to use the dos2unix tool.

Hmm. I use a Linux Ubuntu laptop, so generating DOS end-of-lines is unlikely if they are not there at the beginning. Running dos2unix on the patch file locally does not seem to change anything, so I assume that the patch encoding was changed somewhere along the path you used to get it.

-- Fabien.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
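The encoding mix-up above comes down to CRLF (DOS) versus LF (Unix) line endings. As a hedged illustration only (this helper is hypothetical, not part of pgbench or any PostgreSQL tool), the essence of what dos2unix does can be sketched in a few lines of C:

```c
#include <stddef.h>
#include <string.h>

/* Strip carriage returns in place, turning CRLF (DOS) line endings into
 * plain LF (Unix) ones -- essentially what dos2unix does to a patch file.
 * Hypothetical helper for illustration only. */
static void strip_cr(char *s)
{
    char *dst = s;

    for (const char *src = s; *src; src++)
        if (*src != '\r')
            *dst++ = *src;      /* copy everything except '\r' */
    *dst = '\0';
}
```

Running a patch file through such a filter (or the real dos2unix) before applying it sidesteps the "different line endings" warnings from patch(1).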
Re: [HACKERS] record identical operator
On Thu, Sep 12, 2013 at 11:27 PM, Kevin Grittner kgri...@ymail.com wrote:

Attached is a patch for a bit of infrastructure I believe to be necessary for correct behavior of REFRESH MATERIALIZED VIEW CONCURRENTLY, as well as incremental maintenance of matviews. [...] The patch adds an identical operator (===) for the record type: [...]

The new operator is logically similar to IS NOT DISTINCT FROM for a record, although its implementation is very different. For one thing, it doesn't replace the operation with column-level operators in the parser. For another thing, it doesn't look up operators for each type, so the identical operator does not need to be implemented for each type to use it as shown above. It compares values byte-for-byte, after detoasting. The test for identical records can avoid the detoasting altogether for any values with different lengths, and it stops when it finds the first column with a difference.

I toyed with the idea of supporting hashing of records using this operator, but could not see how that would be a performance win.

The identical (===) and not identical (!==) operator names were chosen because of a vague similarity to the "exactly equals" concepts in JavaScript and PHP, which use those names. The semantics aren't quite the same, but it seemed close enough not to be too surprising. The additional operator names seemed natural to me based on the first two, but I'm not really that attached to these names for the operators if someone has a better idea.

Since the comparison of record values is not documented (only comparisons involving row value constructors), it doesn't seem like we should document this special case. It is intended primarily for support of matview refresh and maintenance, and it seems likely that record comparison was not documented on the basis that it is intended primarily for support of such things as indexing and merge joins -- so leaving the new operators undocumented seems consistent with existing policy.
I'm open to arguments that the policy should change.

Wouldn't it be slightly less surprising / magical to not declare new operators, but just new primitive functions? In the least invasive version they could even be called matview_is_record_identical or similar.

cheers,

Bene
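The comparison strategy described above -- skip detoasting entirely when lengths differ, otherwise memcmp the stored bytes, stopping at the first differing column -- can be sketched in miniature. This is not the patch's actual code (real PostgreSQL attributes are Datums with far more structure); the `Field` type and function name below are hypothetical simplifications:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical simplified attribute: a length plus a pointer to the
 * stored bytes. Stands in for a detoasted Datum. */
typedef struct
{
    size_t len;
    const unsigned char *data;
} Field;

/* Byte-for-byte "identical" test over natts columns. Fields of different
 * length can never be identical, so the byte compare (and, in the real
 * code, the detoasting) is skipped entirely in that case; otherwise we
 * memcmp and stop at the first column with a difference. */
static int record_image_eq_sketch(const Field *a, const Field *b, int natts)
{
    for (int i = 0; i < natts; i++)
    {
        if (a[i].len != b[i].len)
            return 0;           /* cheap short-circuit, no byte compare */
        if (memcmp(a[i].data, b[i].data, a[i].len) != 0)
            return 0;           /* first differing column ends the scan */
    }
    return 1;
}
```

Note that this is strictly stricter than logical equality: two values a type considers equal can still differ as byte images, which is exactly the point of the proposed operator.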
Re: [HACKERS] 9.4 HEAD: select() failed in postmaster
From: Jeff Janes jeff.ja...@gmail.com

> I've implemented the Min to Max change and did some more testing. Now I have a different but related problem (which I also saw before, but less often than the select() one). The 5 second clock doesn't get turned off. So after all processes end, and a new startup is launched, if that startup doesn't report back to the postmaster soon enough, it gets SIGKILLed. postmaster.c near line 1681:
>
>     if ((Shutdown == ImmediateShutdown || (FatalError && !SendStop)) &&
>         now - AbortStartTime >= SIGKILL_CHILDREN_AFTER_SECS)
>
> It seems like this needs to have an additional and-test of pmState, but which states to test I don't really know. I've added in (pmState > PM_RUN) and have not had any more failures, so I think that this is on the right path, but testing an enum for inequality feels wrong.

AbortStartTime > 0 is also necessary to avoid sending SIGKILL repeatedly. I sent the attached patch during the original discussion. The below fragment is relevant:

*** 1663,1688 ****
  			TouchSocketLockFiles();
  			last_touch_time = now;
  		}
+ 
+ 		/*
+ 		 * When postmaster got an immediate shutdown request
+ 		 * or some child terminated abnormally (FatalError case),
+ 		 * postmaster sends SIGQUIT to all children except
+ 		 * syslogger and dead_end ones, then waits for them to terminate.
+ 		 * If some children didn't terminate within a certain amount of time,
+ 		 * postmaster sends SIGKILL to them and waits again.
+ 		 * This resolves, for example, the hang situation where
+ 		 * a backend gets stuck in the call chain:
+ 		 * free() acquires some lock -> received SIGQUIT ->
+ 		 * quickdie() -> ereport() -> gettext() -> malloc() -> lock acquisition
+ 		 */
+ 		if (AbortStartTime > 0 &&	/* SIGKILL only once */
+ 			(Shutdown == ImmediateShutdown || (FatalError && !SendStop)) &&
+ 			now - AbortStartTime >= 10)
+ 		{
+ 			SignalAllChildren(SIGKILL);
+ 			AbortStartTime = 0;
+ 		}
 	}
 }

Regards
MauMau

Attachment: reliable_immediate_shutdown.patch
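The timeout-and-escalate guard being discussed reduces to a small pure decision function. The sketch below is illustrative only (names are hypothetical, the grace period is hard-coded to the 10 seconds used in the patch fragment): the key point is that zeroing AbortStartTime after firing is what makes the SIGKILL one-shot, addressing the repeated-SIGKILL concern:

```c
#include <stdbool.h>

/* Decide whether the postmaster should escalate to SIGKILL now.
 * abort_start_time > 0 means SIGQUIT was already sent and the clock is
 * running; zeroing it afterwards disarms the timer so SIGKILL is sent
 * at most once per abort cycle. Hypothetical simplification. */
static bool should_sigkill(long *abort_start_time, long now,
                           bool immediate_shutdown, bool fatal_error,
                           bool send_stop)
{
    if (*abort_start_time > 0 &&                  /* SIGKILL only once */
        (immediate_shutdown || (fatal_error && !send_stop)) &&
        now - *abort_start_time >= 10)            /* seconds of grace */
    {
        *abort_start_time = 0;                    /* disarm the timer */
        return true;
    }
    return false;
}
```

With the timer disarmed after firing, a freshly launched startup process that is slow to report back no longer inherits a live 5-or-10-second SIGKILL clock, which is the symptom Jeff describes.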
Re: [HACKERS] Protocol forced to V2 in low-memory conditions?
On Wed, Sep 11, 2013 at 2:35 PM, Andrew Dunstan and...@dunslane.net wrote:

> I vote against this. If we remove V2 support from libpq, then we'll have no easy way to test that the backend's support still works. And we've got too many people using V2 to think that it's OK not to have an easy way of testing that. I think the question we ought to be asking is: how can we get widely-used connectors to stop relying on V2 in the first place? How is it tested now, and who is doing the testing?

I don't know that there's any automated testing, but if someone reports a bug that can only be reproduced using the v2 protocol, the first thing I'm going to try to do is reproduce that using libpq, just as I would with any other connector malfunction. From what's being said here it sounds like I might have to hack libpq a bit to get it to speak the v2 protocol, but how long is that going to take? Less time than setting up an environment that can run whatever crazy connector the client is using and trying to figure out whether the client or the server is to blame, for sure.

Don't get me wrong, I think that the v2 protocol should go away, but the real issue is the connectors that are still relying on it as a primary way of talking to the server, not libpq. We could force all of those connectors to update or die by removing server support, and once the server support is gone I don't care about libpq any more. Or maybe there's some friendlier way to wean those connectors off v2. But in the meantime I'm skeptical that removing the code from libpq gets us anywhere.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] record identical operator
Benedikt Grundmann bgrundm...@janestreet.com wrote:
> Kevin Grittner kgri...@ymail.com wrote:
> > Attached is a patch for a bit of infrastructure I believe to be
> > necessary for correct behavior of REFRESH MATERIALIZED VIEW
> > CONCURRENTLY as well as incremental maintenance of matviews. [...]
> > The patch adds an identical operator (===) for the record type: [...]
>
> Wouldn't it be slightly less surprising / magical to not declare new
> operators but just new primitive functions? In the least invasive
> version they could even be called matview_is_record_identical or
> similar.

I'm not sure what is particularly surprising or magical about new operators, but it is true that the patch could be smaller if operators were not added. The SQL functions added by the current patch to support the operator approach are:

  extern Datum record_image_eq(PG_FUNCTION_ARGS);
  extern Datum record_image_ne(PG_FUNCTION_ARGS);
  extern Datum record_image_lt(PG_FUNCTION_ARGS);
  extern Datum record_image_gt(PG_FUNCTION_ARGS);
  extern Datum record_image_le(PG_FUNCTION_ARGS);
  extern Datum record_image_ge(PG_FUNCTION_ARGS);
  extern Datum btrecordimagecmp(PG_FUNCTION_ARGS);

All take two record arguments, and all but the last return a boolean. The last one returns -1, 0, or 1 depending on how the values compare. All comparisons are based on memcmp() of the data in its storage format (after detoasting, where applicable).

As currently written, the patch still requires a unique index on the matview in order to allow RMVC, but this patch was intended to support removal of that restriction, which is something some people were saying they wanted. It just seemed best to do that with a separate patch once we had the mechanism to support it. RMVC currently generates its diff data with a query similar to this:

test=# explain analyze
test-# SELECT *
test-# FROM citext_matview m
test-# FULL JOIN citext_table t ON (t === m)
test-# WHERE t.id IS NULL OR m.id IS NULL;
test=# \pset pager off
Pager usage is off.
test=# explain analyze SELECT * FROM citext_matview m FULL JOIN citext_table t ON (t.id = m.id AND t === m) WHERE t.id IS NULL OR m.id IS NULL;
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Hash Full Join  (cost=1.11..2.24 rows=1 width=16) (actual time=0.056..0.056 rows=0 loops=1)
   Hash Cond: (m.id = t.id)
   Join Filter: (t.* === m.*)
   Filter: ((t.id IS NULL) OR (m.id IS NULL))
   Rows Removed by Filter: 5
   ->  Seq Scan on citext_matview m  (cost=0.00..1.05 rows=5 width=40) (actual time=0.002..0.006 rows=5 loops=1)
   ->  Hash  (cost=1.05..1.05 rows=5 width=40) (actual time=0.023..0.023 rows=5 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 1kB
         ->  Seq Scan on citext_table t  (cost=0.00..1.05 rows=5 width=40) (actual time=0.010..0.016 rows=5 loops=1)
 Total runtime: 0.102 ms
(10 rows)

With the operator support, we can remove the primary key columns from the join conditions, and it still works, albeit with a slower plan:

test=# explain analyze SELECT * FROM citext_matview m FULL JOIN citext_table t ON (t === m) WHERE t.id IS NULL OR m.id IS NULL;
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Merge Full Join  (cost=2.22..2.32 rows=1 width=16) (actual time=0.072..0.072 rows=0 loops=1)
   Merge Cond: (m.* === t.*)
   Filter: ((t.id IS NULL) OR (m.id IS NULL))
   Rows Removed by Filter: 5
   ->  Sort  (cost=1.11..1.12 rows=5 width=40) (actual time=0.035..0.035 rows=5 loops=1)
         Sort Key: m.*
         Sort Method: quicksort  Memory: 25kB
         ->  Seq Scan on citext_matview m  (cost=0.00..1.05 rows=5 width=40) (actual time=0.012..0.016 rows=5 loops=1)
   ->  Sort  (cost=1.11..1.12 rows=5 width=40) (actual time=0.014..0.014 rows=5 loops=1)
         Sort Key: t.*
         Sort Method: quicksort  Memory: 25kB
         ->  Seq Scan on citext_table t  (cost=0.00..1.05 rows=5 width=40) (actual time=0.003..0.003 rows=5 loops=1)
 Total runtime: 0.128 ms
(13 rows)

So, if the operators are included it will be a very small and simple patch to relax the requirement for a unique index. Currently we generate an error if no index columns were found, but with the new operators we could leave them out and the query would still work.
Adding indexes to matviews would then be just a matter of optimization, not a requirement to be able to use the RMVC feature. Using the functions instead of the operators, things work just as well as long as we use columns in the join conditions, which is currently based on indexed columns:

test=# explain analyze
test-# SELECT *
test-# FROM citext_matview m
test-# FULL JOIN citext_table t ON (t.id =
Re: [HACKERS] Custom Plan node
On Tue, Sep 10, 2013 at 11:45 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:

> > Fair enough, I think. So the action item for KaiGai is to think of how
> > the planner integration might work.
>
> Do you think the idea I mentioned upthread is worth investigating in more detail? Or does it seem to you short-sighted to fit this infrastructure into the planner that way?
>
> It categorizes plan nodes into three kinds: join, scan, and other stuff. Cost-based estimation mostly applies to joins and scans, so abstracted scan and join nodes may make sense to inform the core planner what a custom plan node is trying to do. On the other hand, the other stuff, like Agg, must be added according to the provided query, even if its estimated cost is not small, to perform as the provided query describes. So I thought the important part is integration of join and scan, while plan-tree manipulation is sufficient for the other stuff. What is your opinion?

Well, I don't know that I'm smart enough to predict every sort of thing that someone might want to do here, unfortunately. This is a difficult area: there are many possible things someone might want to do, and as Tom points out, there's a lot of special handling of particular node types that can make things difficult. And I can't claim to be an expert in this area.

That having been said, I think the idea of a CustomScan node is probably worth investigating. I don't know if that would work out well or poorly, but I think it would be worth some experimentation. Perhaps you could have a hook that gets called for each baserel, and can decide whether or not it wishes to inject any additional paths; and then a CustomScan node that could be used to introduce such paths. I've been thinking that we really ought to have the ability to optimize CTID range queries, like SELECT * FROM foo WHERE ctid > 'some constant'. We have a Tid Scan node, but it only handles equalities, not inequalities.
I suppose this functionality should really be in core, but since it's not, it might make an interesting test for the infrastructure you're proposing. You may be able to think of something else you like better; it's just a thought.

I am a little less sanguine about the chances of a CustomJoin node working out well. I agree that we need something to handle join pushdown, but it seems to me that might be done by providing a Foreign Scan path into the joinrel rather than by adding a concept of foreign joins per se. There are other possible new join types, like the Range Join that Jeff Davis has mentioned in the past, which might be interesting. But overall, I can't see us adding very many more join types, so I'm not totally sure how much extensibility would help us here. We've added a few scan types over the years (index-only scans in 9.2, and bitmap index scans in 8.1, I think) but all of our existing join types go back to time immemorial.

And I think that lumping everything else together under "not a scan or join" has the least promise of all. Is replacing Append really the same as replacing Sort? I think we'll need to think harder here about what we're trying to accomplish and how to get there.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
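The inequality a CTID range scan would need is just lexicographic comparison of (block, offset) pairs; the backend's real implementation lives in ItemPointerCompare, but the idea can be sketched self-containedly (the struct and function names below are simplified, not the backend's):

```c
#include <assert.h>

/* Simplified tuple identifier: heap block number plus line-pointer
 * offset -- the two components of a ctid like '(0,5)'. */
typedef struct
{
    unsigned block;
    unsigned short offset;
} TidSketch;

/* Lexicographic compare: block number first, offset as tiebreaker.
 * Returns <0, 0, or >0, like memcmp, so it supports range predicates
 * (ctid > 'constant'), not just equality. */
static int tid_cmp(TidSketch a, TidSketch b)
{
    if (a.block != b.block)
        return a.block < b.block ? -1 : 1;
    if (a.offset != b.offset)
        return a.offset < b.offset ? -1 : 1;
    return 0;
}
```

A Tid Scan extended to inequalities would evaluate such a comparison against the scan bound while walking heap blocks in physical order, which is what makes CTID range queries cheap in principle.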
Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers
On Wed, Sep 11, 2013 at 3:40 PM, Josh Berkus j...@agliodbs.com wrote:

> > I think that most of the arguments in this thread drastically overestimate the precision and the effect of effective_cache_size. The planner logic behind it basically only uses it to calculate things within a single index scan. That alone shows that any precise calculation cannot be very meaningful. It also does *NOT* directly influence how the kernel caches disk I/O. It's there to guess how likely it is that something is still cached when accessing things repeatedly.
>
> Agreed. I think we should take the patch as-is, and spend the rest of the 9.4 dev cycle arguing about 3x vs. 4x. ;-)

I'm happy with that option, but I think the larger point here is that this only has a hope of being right if you're setting shared_buffers to 25% of system memory. And more and more, people are not doing that, because of the other recommendation, not much discussed here, to cap shared_buffers at about 8GB. Systems whose total memory is far larger than 32GB are becoming quite commonplace, and only figure to become more so. So while I don't particularly object to this proposal, it would have had a lot more value if we'd done it 5 years ago.

Now the good news is that under any of these proposals the default will go up, quite a bit. Default shared_buffers is now 128MB, so we're looking at raising the default effective_cache_size to at least 384MB, and for people who tune shared_buffers but might not bother with effective_cache_size, it'll go up a lot more. That's clearly a move in the right direction even if the accuracy of the formula is suspect (which it is).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
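The proposal in the subject line amounts to something like the following sketch (function name hypothetical; the multiplier is exactly what the thread is debating, shown here as 3x to match the 128MB -> 384MB figures above):

```c
/* Derive an effective_cache_size default from shared_buffers: take the
 * larger of the explicitly configured value and N * shared_buffers.
 * Sizes are in arbitrary units (e.g. MB); the multiplier (3 vs 4) is
 * the open question in the thread. */
static long auto_effective_cache_size(long configured, long shared_buffers,
                                      int multiplier)
{
    long derived = shared_buffers * multiplier;

    return configured > derived ? configured : derived;
}
```

Under this scheme a stock install (128MB shared_buffers) would get 384MB of effective_cache_size at 3x, while anyone who sets effective_cache_size explicitly to something larger keeps their setting -- which is the "greater of" behavior in the subject line.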
Re: [HACKERS] PostgreSQL 9.3 beta breaks some extensions make install
Marti Raudsepp ma...@juffo.org wrote:
> On Tue, May 14, 2013 at 4:12 AM, Marti Raudsepp ma...@juffo.org wrote:
> > While testing out PostgreSQL 9.3beta1, I stumbled upon a problem:
> >
> >   % make DESTDIR=/tmp/foo install
> >   /usr/bin/install: will not overwrite just-created ‘/tmp/foo/usr/share/postgresql/extension/semver--0.3.0.sql’ with ‘./sql/semver--0.3.0.sql’
> >   make: *** [install] Error 1
>
> On Wed, May 15, 2013 at 4:49 PM, Peter Eisentraut pete...@gmx.net wrote:
> > That said, I'm obviously outnumbered here. What about the following compromise: use the configure-selected install program inside PostgreSQL (which we can test easily), and use install-sh under USE_PGXS? Admittedly, the make install time of extensions is probably not an issue.
>
> Did we ever do anything about this? It looks like the thread got distracted with VPATH builds, and now I'm seeing this problem in 9.3.0. :( This occurs in Arch Linux, but for some odd reason not on Ubuntu when using apt.postgresql.org. Somehow the pgxs.mk supplied by apt.postgresql.org differs from the one shipped in PostgreSQL. Is there a chance of getting this resolved in PostgreSQL, or should we get extension writers to fix their makefiles instead?

Apt.pgdg has the patch present in PostgreSQL head applied. Andrew is about to commit (well... I hope) a doc patch about that, and also a little fix. IMHO this is a bugfix, so I hope it will be applied to the older branches.

--
Sent from my phone. Please excuse the brevity.
Re: [HACKERS] record identical operator
Hi Kevin,

On 2013-09-12 15:27:27 -0700, Kevin Grittner wrote:
> The new operator is logically similar to IS NOT DISTINCT FROM for a record, although its implementation is very different. For one thing, it doesn't replace the operation with column-level operators in the parser. For another thing, it doesn't look up operators for each type, so the identical operator does not need to be implemented for each type to use it as shown above. It compares values byte-for-byte, after detoasting. The test for identical records can avoid the detoasting altogether for any values with different lengths, and it stops when it finds the first column with a difference.

In the general case, that operator sounds dangerous to me. We don't guarantee that a Datum containing the same data always has the same binary representation. E.g. an array can have a null bitmap or not have one, depending on how it was created.

I am not actually sure whether that's a problem for your use case, but I get headaches when we try circumventing the type abstraction that way. Yes, we do such tricks in other places already, but afaik in all those places erroneously believing two Datums are distinct is not an error, just a missed optimization. Allowing a general operator with such a murky definition to creep into something SQL-exposed... Hm. Not sure.

Greetings,

Andres Freund

-- 
Andres Freund                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
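The concern here is that logical equality and byte-image equality can disagree. Arrays with and without a null bitmap are one backend-internal example; a self-contained illustration of the same phenomenon is IEEE floating point, where +0.0 and -0.0 compare equal under the type's own comparison yet have different byte images (only the sign bit differs):

```c
#include <string.h>

/* Return nonzero iff a and b have identical byte images -- the kind of
 * comparison the proposed === operator performs, as opposed to the
 * type's own notion of equality. */
static int same_bits(double a, double b)
{
    return memcmp(&a, &b, sizeof a) == 0;
}
```

So `0.0 == -0.0` holds while `same_bits(0.0, -0.0)` does not: byte-image "identical" is strictly stronger than logical equality, which is both the point of the operator and the source of the danger Andres describes.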
Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers
On Fri, Sep 13, 2013 at 10:08 AM, Robert Haas robertmh...@gmail.com wrote:
> On Wed, Sep 11, 2013 at 3:40 PM, Josh Berkus j...@agliodbs.com wrote:
> > > I think that most of the arguments in this thread drastically overestimate the precision and the effect of effective_cache_size. The planner logic behind it basically only uses it to calculate things within a single index scan. That alone shows that any precise calculation cannot be very meaningful. It also does *NOT* directly influence how the kernel caches disk I/O. It's there to guess how likely it is that something is still cached when accessing things repeatedly.
> >
> > Agreed. I think we should take the patch as-is, and spend the rest of the 9.4 dev cycle arguing about 3x vs. 4x. ;-)
>
> I'm happy with that option, but I think the larger point here is that this only has a hope of being right if you're setting shared_buffers to 25% of system memory. And more and more, people are not doing that, because of the other recommendation, not much discussed here, to cap shared_buffers at about 8GB. Systems whose total memory is far larger than 32GB are becoming quite commonplace, and only figure to become more so. So while I don't particularly object to this proposal, it would have had a lot more value if we'd done it 5 years ago. Now the good news is that under any of these proposals the default will go up, quite a bit. Default shared_buffers is now 128MB, so we're looking at raising the default to at least 384MB, and for people who tune shared_buffers but might not bother with effective_cache_size, it'll go up a lot more. That's clearly a move in the right direction even if the accuracy of the formula is suspect (which it is).

This is a very important point: the 8GB cap is also too high. We have a very high transaction rate server here that exploded at 32GB, was downgraded to 2GB and ran fine, then upgraded to 4GB (over my strenuous objection) and exploded again. The stock documentation advice probably needs to be revised so that it's the lesser of 2GB and 25%.

I'm more and more coming around to the opinion that in terms of shared buffers we have some major problems that manifest on high-end servers. So +1 to your point, although I'm still OK with the auto-setting, on the basis that for very high-end servers most of the settings end up being manually tweaked anyway. We do need to be cautious though; it's not impossible that improvements to the buffer system might cause the 25% advice to be revised upwards in the not-too-near future if some of the problems get solved.

merlin
Re: [HACKERS] Successor of MD5 authentication, let's use SCRAM
On Thu, Sep 12, 2013 at 11:33 AM, Magnus Hagander mag...@hagander.net wrote:
> Well, undocumented and OpenSSL tend to go hand in hand a lot. Or, well, it might be documented, but not in a useful way. I wouldn't count on it.

The OpenSSL code is some of the worst-formatted spaghetti code I've ever seen, and the reason I know that is because whenever I try to do anything with OpenSSL I generally end up having to read it, precisely because, as you say, the documentation is extremely incomplete. I hate to be critical of other projects, but everything I've ever done with OpenSSL has been difficult, and I really think we should try to get less dependent on it rather than more.

> I fear starting to use that is going to make it even harder to break out from our openssl dependency, which people do complain about at least semi-regularly.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
On Wed, Sep 11, 2013 at 8:47 PM, Peter Geoghegan p...@heroku.com wrote:
> In practice the vast majority of insertions don't involve TOASTing. That's not an excuse for allowing the worst case to be really bad in terms of its impact on query response time, but it may well justify having whatever ameliorating measures we take result in bloat. It's at least the kind of bloat we're more or less used to dealing with, and have already invested a lot in controlling. Plus bloat-wise it can't be any worse than just inserting the tuple and having the transaction abort on a duplicate, since that already happens after toasting has done its work with regular insertion.

Andres is being very polite here, but the reality is that this approach has zero chance of being accepted. You can't hold buffer locks for a long period of time across complex operations. Full stop. It's a violation of the rules that are clearly documented in src/backend/storage/buffer/README, which have been in place for a very long time, and this patch is nowhere near important enough to warrant a revision of those rules. We are not going to risk breaking every bit of code anywhere in the backend or in third-party code that takes a buffer lock. You are never going to convince me, or Tom, that the benefit of doing that is worth the risk; in fact, I have a hard time believing that you'll find ANY committer who thinks this approach is worth considering.

Even if you get the code to run without apparent deadlocks, that doesn't mean there aren't any; it just means that you haven't found them all yet. And even if you managed to squash every such hazard that exists today, so what? Fundamentally, locking protocols that don't include deadlock detection don't scale. You can use such locks in limited contexts where proofs of correctness are straightforward, but trying to stretch them beyond that point results not only in bugs, but also in bad performance and unmaintainable code.
With a much more complex locking regimen, even if your code is absolutely bug-free, you've put a burden on the next guy who wants to change anything; how will he avoid breaking things? Our buffer locking regimen suffers from painful complexity and serious maintenance difficulties as is.

Moreover, we've already got performance and scalability problems that are attributable to every backend in the system piling up waiting on a single lwlock, or a group of simultaneously-held lwlocks. Dramatically broadening the scope of where lwlocks are used and for how long they're held is going to make that a whole lot worse. What's worse, the problems will be subtle, restricted to the people using this feature, and very difficult to measure on production systems, and I have no confidence they'd ever get fixed.

A further problem is that a backend which holds even one lwlock can't be interrupted. We've had this argument before, and it seems that you don't think that non-interruptibility is a problem, but it is project policy to allow for timely interrupts in all parts of the backend, and we're not going to change that policy for this patch.

Heavyweight locks are heavy weight precisely because they provide services - like deadlock detection, satisfactory interrupt handling, and, also importantly, FIFO queuing behavior - that are *important* for locks that are held over an extended period of time. We're not going to go put those services into the lightweight lock mechanism, because then it would no longer be light weight, and we're not going to ignore the importance of them, either.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers
On Fri, Sep 13, 2013 at 11:07 AM, Andres Freund and...@2ndquadrant.com wrote:
> On 2013-09-13 10:50:06 -0500, Merlin Moncure wrote:
> > The stock documentation advice probably needs to be revised so that it's the lesser of 2GB and 25%.
>
> I think that would be a pretty bad idea. There are lots of workloads where people have postgres happily chugging along with s_b lots bigger than that and see benefits. We have a couple of people reporting mostly undiagnosed (because that turns out to be hard!) problems that seem to be avoided with smaller s_b. We don't even remotely know enough about the problem to make such general recommendations.

I happen to be one of those couple of people. Load goes from 0.1 to 500 without warning, then back to 0.1 equally without warning. Unfortunately the server is in a different jurisdiction, such that deep forensic analysis is impossible. I think this is happening more and more often as postgres is increasingly deployed on high(er)-end servers. I've personally (alone) dealt with 4-5 confirmed cases, and there have been many more. We have a problem.

But, to address your point, the big-s_b benefits are equally hard to quantify (unless your database happens to fit in s_b) -- they mostly help high-write-activity servers where the write activity fits a very specific pattern. But the risks of low s_b (mostly slightly higher I/O and query latency) are much easier to deal with than those of high s_b (even if less likely): random inexplicable server stalls and other weird manifestations. My stock advice remains to set at most 2GB until you have a reason (of which there can be many) to set otherwise.

merlin
Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers
On 2013-09-13 10:50:06 -0500, Merlin Moncure wrote:
> The stock documentation advice probably needs to be revised so that it's the lesser of 2GB and 25%.

I think that would be a pretty bad idea. There are lots of workloads where people have postgres happily chugging along with s_b lots bigger than that and see benefits. We have a couple of people reporting mostly undiagnosed (because that turns out to be hard!) problems that seem to be avoided with smaller s_b. We don't even remotely know enough about the problem to make such general recommendations.

Greetings,

Andres Freund

-- 
Andres Freund                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Strange hanging bug in a simple milter
Heikki, all,

* Stephen Frost (sfr...@snowman.net) wrote:
> Very curious. Out of time right now to look into it, but will probably be back at it later tonight.

Alright, I was back at this a bit today and decided to go with a hunch -- and it looks like I might have been right to try. Leaving the locking callback hooks in place appears to prevent the deadlocks from happening, at least with this little app. IOW, in destroy_ssl_system(), simply arrange to not have CRYPTO_set_locking_callback(NULL); and CRYPTO_set_id_callback(NULL); called. I've done this with the very simple attached patch.

Per the comment above destroy_ssl_system(), this doesn't seem to be an acceptable solution because libpq might get unloaded from the application, but perhaps it points the way towards what the issue is. My thought had been that there was an issue with regard to signal handling, but I've been unable to confirm that, nor is it clear why that would be the case.

In any case, I'm curious what others think of these results and if anyone can reproduce the deadlock with this patch applied.

Thanks!

Stephen

diff --git a/src/interfaces/libpq/fe-secure.c b/src/interfaces/libpq/fe-secure.c
new file mode 100644
index 3bd0113..e025b49 100644
*** a/src/interfaces/libpq/fe-secure.c
--- b/src/interfaces/libpq/fe-secure.c
*** static SSL_CTX *SSL_context = NULL;
*** 102,107 ****
--- 102,108 ----
  #ifdef ENABLE_THREAD_SAFETY
  static long ssl_open_connections = 0;
+ static int initialized_hooks = 0;
  
  #ifndef WIN32
  static pthread_mutex_t ssl_config_mutex = PTHREAD_MUTEX_INITIALIZER;
*** init_ssl_system(PGconn *conn)
*** 947,952 ****
--- 948,955 ----
  	if (ssl_open_connections++ == 0)
  	{
+ 		initialized_hooks = 1;
+ 
  		/* These are only required for threaded libcrypto applications */
  		CRYPTO_set_id_callback(pq_threadidcallback);
  		CRYPTO_set_locking_callback(pq_lockingcallback);
*** destroy_ssl_system(void)
*** 1015,1021 ****
  	if (pq_init_crypto_lib && ssl_open_connections > 0)
  		--ssl_open_connections;
  
! 	if (pq_init_crypto_lib && ssl_open_connections == 0)
  	{
  		/* No connections left, unregister libcrypto callbacks */
  		CRYPTO_set_locking_callback(NULL);
--- 1018,1024 ----
  	if (pq_init_crypto_lib && ssl_open_connections > 0)
  		--ssl_open_connections;
  
! 	if (pq_init_crypto_lib && ssl_open_connections == 0 && initialized_hooks == 0)
  	{
  		/* No connections left, unregister libcrypto callbacks */
  		CRYPTO_set_locking_callback(NULL);
Re: [HACKERS] Possible memory leak with SQL function?
On Thu, Sep 12, 2013 at 5:29 AM, Yeb Havinga yebhavi...@gmail.com wrote: Is the following known behaviour, or should I put some time in writing a self contained test case? We have a function that takes a value and returns a ROW type. With the function implemented in language SQL, when executing this function in a large transaction, memory usage of the backend process increases. MemoryContextStats showed a lot of SQL function data. Debugging init_sql_fcache() showed that it was for the same function oid each time, and the oid was the function from value to ROW type. When the function is implemented in PL/pgSQL, the memory usage was much less. I'm sorry I cannot be more specific at the moment, such as what is 'much less' memory with a PL/pgSQl function, and are there as many SQL function data's as calls to the SQL function, because I would have to write a test case for this. I was just wondering, if this is known behavior of SQL functions vs PL/pgSQL functions, or could it be a bug? It sounds like a bug to me, although I can't claim to know everything there is to know about this topic. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers
On 2013-09-13 11:27:03 -0500, Merlin Moncure wrote: On Fri, Sep 13, 2013 at 11:07 AM, Andres Freund and...@2ndquadrant.com wrote: On 2013-09-13 10:50:06 -0500, Merlin Moncure wrote: The stock documentation advice probably needs to be revised so that it's the lesser of 2GB and 25%.

I think that would be a pretty bad idea. There are lots of workloads where people have postgres happily chugging along with s_b lots bigger than that and see benefits. We have a couple people reporting mostly undiagnosed (because that turns out to be hard!) problems that seem to be avoided with smaller s_b. We don't even remotely know enough about the problem to make such general recommendations.

I happen to be one of those couple people. Load goes from 0.1 to 500 without warning, then back to 0.1 equally without warning. Unfortunately the server is in a different jurisdiction such that it makes deep forensic analysis impossible. I think this is happening more and more often as postgres is increasingly deployed on high(er)-end servers. I've personally (alone) dealt with 4-5 confirmed cases and there have been many more. We have a problem.

Absolutely not claiming the contrary. I think it sucks that we couldn't fully figure out what's happening in detail. I'd love to get my hands on a setup where it can be reliably reproduced.

But, to address your point, the big s_b benefits are equally hard to quantify (unless your database happens to fit in s_b)

Databases where the hot dataset fits in s_b are a pretty honking big use case, tho. That's one of the primary reasons to buy machines with craploads of memory.

That said, I think having a note in the docs that large s_b can cause such a problem might not be a bad idea, and I surely wouldn't argue against it.
Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Strange hanging bug in a simple milter
On 2013-09-13 12:40:11 -0400, Stephen Frost wrote: Leaving the locking callback hooks in place appears to prevent the deadlocks from happening, at least with this little app. [...] Per the comment above destroy_ssl_system(), this doesn't seem to be an acceptable solution because libpq might get unloaded from the application, but perhaps it points the way towards what the issue is.

It'd be interesting to replace the original callbacks with one immediately doing an abort() or similar, to see whether they maybe are called after they shouldn't be, and from where.

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Strange hanging bug in a simple milter
Andres, On Friday, September 13, 2013, Andres Freund wrote: It'd be interesting to replace the origin callbacks with one immediately doing an abort() or similar to see whether they maybe are called after they shouldn't be and from where. Good thought. Got sucked into a meeting but once I'm out I'll try having the lock/unlock routines abort if they're called while ssl_open_connections is zero, which should not be happening, but seems like it is. Thanks, Stephen
Re: [HACKERS] Strange hanging bug in a simple milter
On 2013-09-13 13:15:34 -0400, Stephen Frost wrote: Good thought. Got sucked into a meeting but once I'm out I'll try having the lock/unlock routines abort if they're called while ssl_open_connections is zero, which should not be happening, but seems like it is.

Hm. close_SSL() first does pqsecure_destroy(), which will unset the callbacks and the count, and then goes on to do X509_free() and, if an engine is used, ENGINE_finish() and ENGINE_free(). It's not implausible that one of those actually needs locking. I doubt engines play a role here, but, without having looked at the testcase, X509_free() might be a possibility.

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] PostgreSQL 9.3 beta breaks some extensions make install
On Fri, Sep 13, 2013 at 6:42 PM, Cédric Villemain ced...@2ndquadrant.com wrote: Marti Raudsepp ma...@juffo.org a écrit : Did we ever do anything about this? It looks like the thread got distracted with VPATH builds and now I'm seeing this problem in 9.3.0.

Andrew is about to commit (well... I hope) a doc patch about that and also a little fix. Imho this is a bugfix so I hope it will be applied in older branches.

Oh I see, indeed commit 6697aa2bc25c83b88d6165340348a31328c35de6 "Improve support for building PGXS modules with VPATH" fixes the problem, and I see it's not present in REL9_3_0. Andrew and others, does this seem safe enough to backport to 9.3.1?

Apt.pgdg got the patch present in postgresql head applied.

Erm, isn't apt.postgresql.org supposed to ship the *official* PostgreSQL versions? Given that this issue affects all distros, I don't see why Ubuntu/Debian need to be patched separately. Does anyone else think this is problematic? By slipping patches into distro-specific packages, you're confusing users (like me) and bypassing the PostgreSQL QA process.

PS: Where are the sources used to build packages on apt.postgresql.org?

Regards, Marti
Re: [HACKERS] Strange hanging bug in a simple milter
* Stephen Frost (sfr...@snowman.net) wrote: * Andres Freund (and...@2ndquadrant.com) wrote: Hm. close_SSL() first does pqsecure_destroy() which will unset the callbacks, and the count and then goes on to do X509_free() and ENGINE_finish(), ENGINE_free() if either is used. It's not implausible that one of those actually needs locking. I doubt engines play a role here, but, without having looked at the testcase, X509_free() might be a possibility. Unfortunately, while I can still easily get the deadlock to happen when the hooks are reset, the hooks don't appear to ever get called when ssl_open_connections is set to zero. You have a good point about the additional SSL calls after the hooks are unloaded though, I wonder if holding the ssl_config_mutex lock over all of close_SSL might be more sensible.. I went ahead and moved the locks to be around all of close_SSL() and haven't been able to reproduce the deadlock, so perhaps those calls are the issue and what's happening is that another thread is dropping or adding the hooks in a common place while the X509_free, etc, are trying to figure out if they should be calling the locking functions or not, but there's a race because there's no higher-level locking happening around those. Attached is a patch to move those and which doesn't deadlock for me. Thoughts? 
Thanks, Stephen

diff --git a/src/interfaces/libpq/fe-secure.c b/src/interfaces/libpq/fe-secure.c
new file mode 100644
index 3bd0113..e14c301 100644
*** a/src/interfaces/libpq/fe-secure.c
--- b/src/interfaces/libpq/fe-secure.c
*** destroy_ssl_system(void)
*** 1009,1017 ****
  {
  #ifdef ENABLE_THREAD_SAFETY
  	/* Mutex is created in initialize_ssl_system() */
- 	if (pthread_mutex_lock(&ssl_config_mutex))
- 		return;
-
  	if (pq_init_crypto_lib && ssl_open_connections > 0)
  		--ssl_open_connections;
--- 1009,1014 ----
*** destroy_ssl_system(void)
*** 1031,1037 ****
  	 */
  	}

- 	pthread_mutex_unlock(&ssl_config_mutex);
  #endif
  }
--- 1028,1033 ----
*** open_client_SSL(PGconn *conn)
*** 1537,1542 ****
--- 1533,1543 ----
  static void
  close_SSL(PGconn *conn)
  {
+ #ifdef ENABLE_THREAD_SAFETY
+ 	if (pthread_mutex_lock(&ssl_config_mutex))
+ 		return;
+ #endif
+
  	if (conn->ssl)
  	{
  		DECLARE_SIGPIPE_INFO(spinfo);
*** close_SSL(PGconn *conn)
*** 1565,1570 ****
--- 1566,1575 ----
  		conn->engine = NULL;
  	}
  #endif
+
+ #ifdef ENABLE_THREAD_SAFETY
+ 	pthread_mutex_unlock(&ssl_config_mutex);
+ #endif
  }

  /*
Re: [HACKERS] Strange hanging bug in a simple milter
* Andres Freund (and...@2ndquadrant.com) wrote: On 2013-09-13 13:15:34 -0400, Stephen Frost wrote: Good thought. [...] Hm. close_SSL() first does pqsecure_destroy(), which will unset the callbacks and the count, and then goes on to do X509_free() and ENGINE_finish(), ENGINE_free() if either is used. It's not implausible that one of those actually needs locking. I doubt engines play a role here, but, without having looked at the testcase, X509_free() might be a possibility.

Unfortunately, while I can still easily get the deadlock to happen when the hooks are reset, the hooks don't appear to ever get called when ssl_open_connections is set to zero. You have a good point about the additional SSL calls after the hooks are unloaded though; I wonder if holding the ssl_config_mutex lock over all of close_SSL might be more sensible..

Thanks, Stephen
Re: [HACKERS] Strange hanging bug in a simple milter
On 2013-09-13 14:33:25 -0400, Stephen Frost wrote: [...] I went ahead and moved the locks to be around all of close_SSL() and haven't been able to reproduce the deadlock, so perhaps those calls are the issue and what's happening is that another thread is dropping or adding the hooks in a common place while the X509_free, etc, are trying to figure out if they should be calling the locking functions or not, but there's a race because there's no higher-level locking happening around those. Attached is a patch to move those and which doesn't deadlock for me.

It seems slightly cleaner to just move the pqsecure_destroy(); to the end of that function, based on a boolean. But if you think otherwise, I won't protest...

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Strange hanging bug in a simple milter
On 2013-09-13 13:59:54 -0400, Stephen Frost wrote: Unfortunately, while I can still easily get the deadlock to happen when the hooks are reset, the hooks don't appear to ever get called when ssl_open_connections is set to zero. You have a good point about the additional SSL calls after the hooks are unloaded though, I wonder if holding the ssl_config_mutex lock over all of close_SSL might be more sensible..

Hm. You might want to do the check for ssl_open_connections after a memory barrier or such, it might not actually be up2date (too tired to think about the actual details of x86's cache coherency guarantees).

Does it burn if you move the pqsecure_destroy(); to the end of the function, after setting a boolean in the if (conn->ssl) branch?

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Strange hanging bug in a simple milter
On 13.09.2013 22:26, Heikki Linnakangas wrote: I'm afraid the move_locks.diff patch you posted earlier is also broken; close_SSL() is called in error scenarios from pqsecure_open_client(), while already holding the mutex. So it will deadlock with itself if the connection cannot be established.

Actually, I think there's a pre-existing bug there in git master. If the SSL_set_app_data or SSL_set_fd call in pqsecure_open_client fails for some reason, it will call close_SSL() with conn->ssl already set, and the mutex held. close_SSL() will call pqsecure_destroy() -> destroy_SSL() -> destroy_ssl_system(), which will try to acquire the mutex again.

- Heikki
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
* Peter Geoghegan (p...@heroku.com) wrote: I would suggest letting those other individuals speak for themselves too. Particularly if you're going to name someone who is on vacation like that.

It was my first concern regarding this patch.

Thanks, Stephen
Re: [HACKERS] Strange hanging bug in a simple milter
* Heikki Linnakangas (hlinnakan...@vmware.com) wrote: Umm, with that patch, pqsecure_destroy() is never called. The if (conn->ssl) test that's now at the end of the close_SSL function is never true, because conn->ssl is set to NULL earlier.

Yeah, got ahead of myself, as Andres pointed out.

I'm afraid the move_locks.diff patch you posted earlier is also broken; close_SSL() is called in error scenarios from pqsecure_open_client(), while already holding the mutex. So it will deadlock with itself if the connection cannot be established.

Sorry, I was really just tossing it up to see if it really did avoid this particular deadlock. I'm running some tests now with the attached to see if I can get it to deadlock now.

Thanks, Stephen

diff --git a/src/interfaces/libpq/fe-secure.c b/src/interfaces/libpq/fe-secure.c
new file mode 100644
index 3bd0113..c008dcf 100644
*** a/src/interfaces/libpq/fe-secure.c
--- b/src/interfaces/libpq/fe-secure.c
*** open_client_SSL(PGconn *conn)
*** 1537,1542 ****
--- 1537,1544 ----
  static void
  close_SSL(PGconn *conn)
  {
+ 	bool		destroy_needed = conn->ssl ? true : false;
+
  	if (conn->ssl)
  	{
  		DECLARE_SIGPIPE_INFO(spinfo);
*** close_SSL(PGconn *conn)
*** 1545,1551 ****
  		SSL_shutdown(conn->ssl);
  		SSL_free(conn->ssl);
  		conn->ssl = NULL;
- 		pqsecure_destroy();
  		/* We have to assume we got EPIPE */
  		REMEMBER_EPIPE(spinfo, true);
  		RESTORE_SIGPIPE(conn, spinfo);
--- 1547,1552 ----
*** close_SSL(PGconn *conn)
*** 1565,1570 ****
--- 1566,1582 ----
  		conn->engine = NULL;
  	}
  #endif
+
+ 	/*
+ 	 * This will remove our SSL locking hooks, if this is the last SSL
+ 	 * connection, which means we must wait to call it until after all
+ 	 * SSL calls have been made, otherwise we can end up with a race
+ 	 * condition and possible deadlocks.
+ 	 *
+ 	 * See comments above destroy_ssl_system().
+ 	 */
+ 	if (destroy_needed)
+ 		pqsecure_destroy();
  }

  /*
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
On Fri, Sep 13, 2013 at 9:23 AM, Robert Haas robertmh...@gmail.com wrote: Andres is being very polite here, but the reality is that this approach has zero chance of being accepted. I quite like Andres, but I have yet to see him behave as you describe in a situation where someone proposed what was fundamentally a bad idea. Maybe you should let him speak for himself? You can't hold buffer locks for a long period of time across complex operations. Full stop. It's a violation of the rules that are clearly documented in src/backend/storage/buffer/README, which have been in place for a very long time, and this patch is nowhere near important enough to warrant a revision of those rules. The importance of this patch is a value judgement. Our users have been screaming for this for over ten years, so to my mind it has a fairly high importance. Also, every other database system of every stripe worth mentioning has something approximately equivalent to this, including ones with much less functionality generally. The fact that we don't is a really unfortunate omission. As to the rules you refer to, you must mean These locks are intended to be short-term: they should not be held for long. I don't think that they will ever be held for long. At least, when I've managed the amount of work that a heap_insert() can do better. I expect to produce a revision where toasting doesn't happen with the locks held soon. Actually, I've already written the code, I just need to do some testing. We are not going to risk breaking every bit of code anywhere in the backend or in third-party code that takes a buffer lock. You are never going to convince me, or Tom, that the benefit of doing that is worth the risk; in fact, I have a hard time believing that you'll find ANY committer who thinks this approach is worth considering. I would suggest letting those other individuals speak for themselves too. Particularly if you're going to name someone who is on vacation like that. 
Even if you get the code to run without apparent deadlocks, that doesn't mean there aren't any;

Of course it doesn't. Who said otherwise?

Our buffer locking regimen suffers from painful complexity and serious maintenance difficulties as is.

That's true to a point, but it has more to do with things like how VACUUM interacts with hio.c. Things like this:

	/*
	 * Release the file-extension lock; it's now OK for someone else to extend
	 * the relation some more.  Note that we cannot release this lock before
	 * we have buffer lock on the new page, or we risk a race condition
	 * against vacuumlazy.c --- see comments therein.
	 */
	if (needLock)
		UnlockRelationForExtension(relation, ExclusiveLock);

The btree code is different, though: It implements a well-defined interface, with much clearer separation of concerns. As I've said already, with trivial exception (think contrib), no external code considers itself to have license to obtain locks of any sort on btree buffers. No external code of ours - without exception - does anything with multiple locks, or exclusive locks on btree buffers. I'll remind you that I'm only holding shared locks when control is outside of the btree code. Even within the btree code, the number of access method functions that could conflict with what I do here (that acquire exclusive locks) is very small when you exclude things that only exclusive lock the meta-page (there are also very few of those). So the surface area is quite small.

I'm not denying that there is a cost, or that I haven't expanded things in a direction I'd prefer not to. I just think that it may well be worth it, particularly when you consider the alternatives - this may well be the least worst thing. I mean, if we do the promise tuple thing, and there are multiple unique indexes, what happens when an inserter needs to block pending the outcome of another transaction?
They had better go clean up the promise tuples from the other unique indexes that they're trying to insert into, because they cannot afford to hold value locks for a long time, no matter how they're implemented. That could take much longer than just releasing a shared buffer lock, since for each unique index the promise tuple must be re-found from scratch. There are huge issues with additional complexity and bloat. Oh, and now your lightweight locks aren't so lightweight any more. If the value locks were made interruptible through some method, such as the promise tuples approach, does that really make deadlocking acceptable? So at least your system didn't seize up. But on the other hand, the user randomly had a deadlock error through no fault of their own. The former may be worse, but the latter is also inexcusable. In general, the best solution is just to not have deadlock hazards. I wouldn't be surprised if reasoning about deadlocking was harder with that alternative approach to value locking, not easier. Moreover, we've already got performance and scalability problems that are
Re: [HACKERS] Strange hanging bug in a simple milter
On 13.09.2013 22:03, Stephen Frost wrote: * Andres Freund (and...@2ndquadrant.com) wrote: It seems slightly cleaner to just move the pqsecure_destroy(); to the end of that function, based on a boolean. But if you think otherwise, I won't protest... Hmm, agreed; I had originally been concerned that the SIGPIPE madness needed to be around the pqsecure_destroy() call, but I can't see why that would be. I've run through a few times w/ the attached and haven't seen the deadlock. Will continue testing, of course. Heikki, any thoughts regarding this change? Any concerns about back-patching it?

Umm, with that patch, pqsecure_destroy() is never called. The if (conn->ssl) test that's now at the end of the close_SSL function is never true, because conn->ssl is set to NULL earlier.

I'm afraid the move_locks.diff patch you posted earlier is also broken; close_SSL() is called in error scenarios from pqsecure_open_client(), while already holding the mutex. So it will deadlock with itself if the connection cannot be established.

- Heikki
Re: [HACKERS] Large shared_buffer stalls WAS: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
On 09/13/2013 09:27 AM, Merlin Moncure wrote: I happen to be one of those couple people. Load goes from 0.1 to 500 without warning then back to 0.1 equally without warning. Unfortunately the server is in a different jurisdiction such that it makes deep forensic analysis impossible. I think this is happening more and more often as postgres is becoming increasingly deployed on high(er) -end servers. I've personally (alone) dealt with 4-5 confirmed cases and there have been many more. We have a problem.

Can you explain a bit more about this? I'm currently grappling with a db cluster which has periodic mysterious total LWLock paralysis, and is configured with 8GB shared_buffers (and 512GB ram installed).

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Strange hanging bug in a simple milter
* Andres Freund (and...@2ndquadrant.com) wrote: It seems slightly cleaner to just move the pqsecure_destroy(); to the end of that function, based on a boolean. But if you think otherwise, I won't protest...

Hmm, agreed; I had originally been concerned that the SIGPIPE madness needed to be around the pqsecure_destroy() call, but I can't see why that would be. I've run through a few times w/ the attached and haven't seen the deadlock. Will continue testing, of course.

Heikki, any thoughts regarding this change? Any concerns about back-patching it?

Thanks, Stephen

diff --git a/src/interfaces/libpq/fe-secure.c b/src/interfaces/libpq/fe-secure.c
new file mode 100644
index 3bd0113..ca2c5e4 100644
*** a/src/interfaces/libpq/fe-secure.c
--- b/src/interfaces/libpq/fe-secure.c
*** close_SSL(PGconn *conn)
*** 1545,1551 ****
  		SSL_shutdown(conn->ssl);
  		SSL_free(conn->ssl);
  		conn->ssl = NULL;
- 		pqsecure_destroy();
  		/* We have to assume we got EPIPE */
  		REMEMBER_EPIPE(spinfo, true);
  		RESTORE_SIGPIPE(conn, spinfo);
--- 1545,1550 ----
*** close_SSL(PGconn *conn)
*** 1565,1570 ****
--- 1564,1572 ----
  		conn->engine = NULL;
  	}
  #endif
+
+ 	if (conn->ssl)
+ 		pqsecure_destroy();
  }

  /*
Re: [HACKERS] Strange hanging bug in a simple milter
On 2013-09-13 15:03:31 -0400, Stephen Frost wrote: * Andres Freund (and...@2ndquadrant.com) wrote: It seems slightly cleaner to just move the pqsecure_destroy(); to the end of that function, based on a boolean. But if you think otherwise, I won't protest... Hmm, agreed; I had originally been concerned that the SIGPIPE madness needed to be around the pqsecure_destroy() call, but I can't see why that would be. I've run through a few times w/ the attached and haven't seen the deadlock. Will continue testing, of course.

That patch looks wrong to me. Note that the if (conn->ssl) branch resets conn->ssl to NULL.

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
On Fri, Sep 13, 2013 at 12:14 PM, Stephen Frost sfr...@snowman.net wrote: It was my first concern regarding this patch.

It was my first concern too.

-- Peter Geoghegan
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
On 2013-09-13 11:59:54 -0700, Peter Geoghegan wrote: On Fri, Sep 13, 2013 at 9:23 AM, Robert Haas robertmh...@gmail.com wrote: Andres is being very polite here, but the reality is that this approach has zero chance of being accepted. I quite like Andres, but I have yet to see him behave as you describe in a situation where someone proposed what was fundamentally a bad idea. Maybe you should let him speak for himself?

Unfortunately I have to agree with Robert here, I think it's a complete nogo to do what you propose so far and I've several times now presented arguments why I think so. The reason I wasn't saying "this will never get accepted" is twofold: a) I don't want to stifle alternative ideas to the promises idea, just because I think it's the way to go. That might stop a better idea from being articulated. b) I am not actually in the position to say it's not going to be accepted.

*I* think that unless you make some fundamental and very, very clever modifications to your algorithm that end up *not holding a lock over other operations at all*, it's not going to get committed. And I'll chip in with my -1. And "clever modification" doesn't mean slightly restructuring heapam.c's operations.

The importance of this patch is a value judgement. Our users have been screaming for this for over ten years, so to my mind it has a fairly high importance. Also, every other database system of every stripe worth mentioning has something approximately equivalent to this, including ones with much less functionality generally. The fact that we don't is a really unfortunate omission.

I agree it's quite important, but that doesn't mean we have to do stuff that we think is unacceptable, especially as there *are* other ways to do it.

As to the rules you refer to, you must mean "These locks are intended to be short-term: they should not be held for long." I don't think that they will ever be held for long. At least, when I've managed the amount of work that a heap_insert() can do better.
I expect to produce a revision where toasting doesn't happen with the locks held soon. Actually, I've already written the code, I just need to do some testing.

I personally think - and have stated so before - that doing a heap_insert() while holding the btree lock is unacceptable.

The btree code is different, though: It implements a well-defined interface, with much clearer separation of concerns.

Which you're completely violating by linking the btree buffer locking with the heap locking. It's not about the btree code alone. At this point I am a bit confused why you are asking for review.

I mean, if we do the promise tuple thing, and there are multiple unique indexes, what happens when an inserter needs to block pending the outcome of another transaction? They had better go clean up the promise tuples from the other unique indexes that they're trying to insert into, because they cannot afford to hold value locks for a long time, no matter how they're implemented.

Why? We're using normal transaction visibility rules here. We don't stop *other* values on the same index getting updated or similar.

And anyway. It doesn't matter which problem the promises idea has. We're discussing your proposal here.

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Strange hanging bug in a simple milter
* Heikki Linnakangas (hlinnakan...@vmware.com) wrote: Actually, I think there's a pre-existing bug there in git master. If the SSL_set_app_data or SSL_set_fd call in pqsecure_open_client fails for some reason, it will call close_SSL() with conn->ssl already set, and the mutex held. close_SSL() will call pqsecure_destroy() -> destroy_SSL() -> destroy_ssl_system(), which will try to acquire the mutex again.

Yeah, good point, and that one looks like my fault. Moving it after the unlock should address that tho, no? Looking through the other close_SSL() calls, I don't see any other situations where it might be called with the lock held.

Thanks, Stephen
Re: [HACKERS] Strange hanging bug in a simple milter
* Andres Freund (and...@2ndquadrant.com) wrote: That patch looks wrong to me. Note that the if (conn->ssl) branch resets conn->ssl to NULL. Huh, it figures one would overlook the simplest things. Of course it's not locking up now: we never remove the hooks (as my original patch was doing :). That's what I get for trying to go to meetings and code at the same time. Thanks, Stephen
Re: [HACKERS] Large shared_buffer stalls WAS: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
On Fri, Sep 13, 2013 at 2:36 PM, Josh Berkus j...@agliodbs.com wrote: On 09/13/2013 09:27 AM, Merlin Moncure wrote: I happen to be one of those couple people. Load goes from 0.1 to 500 without warning, then back to 0.1 equally without warning. Unfortunately the server is in a different jurisdiction such that it makes deep forensic analysis impossible. I think this is happening more and more often as postgres is becoming increasingly deployed on high(er)-end servers. I've personally (alone) dealt with 4-5 confirmed cases and there have been many more. We have a problem. Can you explain a bit more about this? I'm currently grappling with a db cluster which has periodic mysterious total LWLock paralysis, and is configured with 8GB shared_buffers (and 512GB RAM installed). See thread: http://postgresql.1045698.n5.nabble.com/StrategyGetBuffer-optimization-take-2-td5766307.html plus others. In this particular case, s_b needs to be set to 2GB or lower (they tried raising to 4GB for no good reason) and the problem reoccurred. What are the specific symptoms of your problem? Anything interesting in pg_locks? Is $client willing to experiment with custom patches? merlin
Re: [HACKERS] Large shared_buffer stalls WAS: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
On 09/13/2013 12:55 PM, Merlin Moncure wrote: what are the specific symptoms of your problem? anything interesting in pg_locks? is $client willing to experiment with custom patches? 3 servers: 1 master, two replicas. 32-core Xeon, hyperthreaded to 64 cores 512GB RAM each s_b set to 8GB Load-balanced between all 3 ~~ 11 different databases combined database size around 600GB using pgbouncer Irregularly, during periods of high activity (although not necessarily peak activity), one or another of the systems will go into paralysis, with all backends apparently waiting on LWLocks (we need more tracing information to completely confirm this). Activity at this time is usually somewhere between 50 and 100 concurrent queries (and 80 to 150 connections). pg_locks doesn't think anything is waiting on a lock. What's notable is that sometimes it's just *one* of the replicas which goes into paralysis. If the master gets this issue, though, the replicas experience it soon afterwards. Increasing wal_buffers from 16GB to 64GB seems to make this issue happen less frequently, but it doesn't go away entirely. Only a restart of the server, or killing all backends, ends the lockup. The workload is OLTP, essentially, around 20/80 write/read. They use PostGIS. The other notable thing about their workload is that due to an ORM defect, they get idle-in-transactions which last from 5 to 15 seconds, several times a minute. They are willing to use experimental patches, but only if those patches can be applied only to a replica. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Large shared_buffer stalls WAS: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
On 09/13/2013 01:25 PM, Andres Freund wrote: Josh Berkus j...@agliodbs.com schrieb: What's notable is that sometimes it's just *one* of the replicas which goes into paralysis. If the master gets this issue though, the replicas experience it soon afterwards. Increasing wal_buffers from 16GB to 64GB seems to make this issue happen less frequently, but it doesn't go away entirely. You're talking about MB, not GB here, right? Oh, yes, right. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Large shared_buffer stalls WAS: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
Josh Berkus j...@agliodbs.com schrieb: What's notable is that sometimes it's just *one* of the replicas which goes into paralysis. If the master gets this issue though, the replicas experience it soon afterwards. Increasing wal_buffers from 16GB to 64GB seems to make this issue happen less frequently, but it doesn't go away entirely. You're talking about MB, not GB here, right? Andres -- Please excuse brevity and formatting - I am writing this on my mobile phone. Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
[HACKERS] Completing PL support for Event Triggers
Hi, Please find attached to this email three patches covering the missing PL support for Event Triggers: pltcl, plperl and plpython. Due to “platform” problems here tonight and the CF deadline, the plpython patch is known not to pass regression tests on my machine. The code is fully rebased and compiles without warning, though, so I'm still entering it into this CF: hopefully I will figure out what's wrong with my local plpython platform support here early next week. I'm going to add the 3 attached patches to the CF as a single item, but maybe we'll then split if we have separate subsystem maintainers for those 3 PLs? Given the size of the changes (docs included in the following stats) I don't think that's going to be necessary, but well, let me know. tcl 4 files changed, 204 insertions(+), 24 deletions(-) perl4 files changed, 240 insertions(+), 12 deletions(-) python 8 files changed, 165 insertions(+), 14 deletions(-) total 16 files changed, 609 insertions(+), 50 deletions(-) Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support event_trigger_plperl.v1.patch.gz Description: Binary data event_trigger_plpython.v1.patch.gz Description: Binary data event_trigger_pltcl.v1.patch.gz Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Large shared_buffer stalls WAS: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
On Fri, Sep 13, 2013 at 3:20 PM, Josh Berkus j...@agliodbs.com wrote: On 09/13/2013 12:55 PM, Merlin Moncure wrote: what are the specific symptoms of your problem? anything interesting in pg_locks? is $client willing to experiment with custom patches? 3 servers: 1 master, two replicas. 32-core Xeon, hyperthreaded to 64 cores 512GB RAM each s_b set to 8GB Load-balanced between all 3 ~~ 11 different databases combined database size around 600GB using pgbouncer Irregularly, during periods of high activity (although not necessarily peak activity) one or another of the systems will go into paralysis, with all backends apparently waiting on LWLocks (we need more tracing information to completely confirm this). Activity at this time is usually somewhere between 50 and 100 concurrent queries (and 80 to 150 connections). pg_locks doesn't think anything is waiting on a lock. What's notable is that sometimes it's just *one* of the replicas which goes into paralysis. If the master gets this issue though, the replicas experience it soon afterwards. Increasing wal_buffers from 16GB to 64GB seems to make this issue happen less frequently, but it doesn't go away entirely. Only a restart of the server, or killing all backend, ends the lockup. The workload is OLTP, essentially, around 20/80 write/read. They use PostGIS. The other notable thing about their workload is that due to an ORM defect, they get idle-in-transactions which last from 5 to 15 seconds several times a minute. They are willing to use experimental patches, but only if those patches can be applied only to a replica. ok, points similar: *) master/slave config (two slaves for me) *) 'big' server 256GB mem, 32 core *) 80% approx. 
(perhaps more) *) some spatial searching (but not very much) *) OLTP *) presentation of load, although in my case it did resolve anywhere from 30 secs to half hour *) aside from the spike, 100% healthy. Points different: *) application side pooling: 96 app servers, max 5 connections each (aside: are you using transaction mode pgbouncer?) *) I saw gripes about relation extension in pg_locks merlin
Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers
Andres Freund and...@2ndquadrant.com wrote: Absolutely not claiming the contrary. I think it sucks that we couldn't fully figure out what's happening in detail. I'd love to get my hand on a setup where it can be reliably reproduced. I have seen two completely different causes for symptoms like this, and I suspect that these aren't the only two. (1) The dirty page avalanche: PostgreSQL hangs on to a large number of dirty buffers and then dumps a lot of them at once. The OS does the same. When PostgreSQL dumps its buffers to the OS it pushes the OS over a tipping point where it is writing dirty buffers too fast for the controller's BBU cache to absorb them. Everything freezes until the controller writes and accepts OS writes for a lot of data. This can take several minutes, during which time the database seems frozen. Cure is some combination of these: reduce shared_buffers, make the background writer more aggressive, checkpoint more often, make the OS dirty page writing more aggressive, add more BBU RAM to the controller. (2) Transparent huge page support goes haywire on its defrag work. Clues on this include very high system CPU time during an episode, and `perf top` shows more time in kernel spinlock functions than anywhere else. The database doesn't completely lock up like with the dirty page avalanche, but it is slow enough that users often describe it that way. So far I have only seen this cured by disabling THP support (in spite of some people urging that just the defrag be disabled). It does make me wonder whether there is something we could do in PostgreSQL to interact better with THPs. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers
On Fri, Sep 13, 2013 at 4:04 PM, Kevin Grittner kgri...@ymail.com wrote: Andres Freund and...@2ndquadrant.com wrote: Absolutely not claiming the contrary. I think it sucks that we couldn't fully figure out what's happening in detail. I'd love to get my hand on a setup where it can be reliably reproduced. I have seen two completely different causes for symptoms like this, and I suspect that these aren't the only two. (1) The dirty page avalanche: PostgreSQL hangs on to a large number of dirty buffers and then dumps a lot of them at once. The OS does the same. When PostgreSQL dumps its buffers to the OS it pushes the OS over a tipping point where it is writing dirty buffers too fast for the controller's BBU cache to absorb them. Everything freezes until the controller writes and accepts OS writes for a lot of data. This can take several minutes, during which time the database seems frozen. Cure is some combination of these: reduce shared_buffers, make the background writer more aggressive, checkpoint more often, make the OS dirty page writing more aggressive, add more BBU RAM to the controller. Yeah -- I've seen this too, and it's a well understood problem. Getting o/s to spin dirty pages out faster is the name of the game I think. Storage is getting so fast that it's (mostly) moot anyways. Also, this is under the umbrella of 'high i/o' -- the stuff I've been seeing is low- or no- I/o. (2) Transparent huge page support goes haywire on its defrag work. Clues on this include very high system CPU time during an episode, and `perf top` shows more time in kernel spinlock functions than anywhere else. The database doesn't completely lock up like with the dirty page avalanche, but it is slow enough that users often describe it that way. So far I have only seen this cured by disabling THP support (in spite of some people urging that just the defrag be disabled). It does make me wonder whether there is something we could do in PostgreSQL to interact better with THPs. 
Ah, that's a useful tip; need to research that, thanks. Maybe Josh might be able to give it a whirl... merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Large shared_buffer stalls WAS: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
On 09/13/2013 01:58 PM, Merlin Moncure wrote: ok, points similar: *) master/slave config (two slaves for me) *) 'big' server 256GB mem, 32 core *) 80% approx. (perhaps more) *) some spatial searching (but not very much) *) OLTP *) presentation of load, although in my case it did resolve anywhere from 30 secs to half hour *) aside from the spike, 100% healthy. Points different: *) application side pooling: 96 app servers, max 5 connections each (aside: are you using transaction mode pgbouncer?) Yes. *) I saw gripes about relation extension in pg_locks I'll check for that next time. We're also working on seeing if we can reproduce this under test conditions. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Large shared_buffer stalls WAS: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
On Fri, Sep 13, 2013 at 4:15 PM, Josh Berkus j...@agliodbs.com wrote: On 09/13/2013 01:58 PM, Merlin Moncure wrote: ok, points similar: *) master/slave config (two slaves for me) *) 'big' server 256GB mem, 32 core *) 80% approx. (perhaps more) *) some spacial searching (but not very much) *) OLTP *) presentation of load, although in my case it did resolve anywhere from 30 secs to half hour *) aside from the spike, 100% healthy points different *) application side pooling: 96 app servers, max 5 connections each (aside: are you using transaction mode pgbouncer?) Yes Given that, I would be curious to see what dropping connection pool down to somewhere between 32-64 connections would do. This is a band-aid obviously since not everyone can or wants to run pgbouncer. But this first thing that pops into my mind in these situations is that it might alleviate the symptoms. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
Andres Freund and...@2ndquadrant.com wrote: On 2013-09-12 15:27:27 -0700, Kevin Grittner wrote: The new operator is logically similar to IS NOT DISTINCT FROM for a record, although its implementation is very different. For one thing, it doesn't replace the operation with column level operators in the parser. For another thing, it doesn't look up operators for each type, so the identical operator does not need to be implemented for each type to use it as shown above. It compares values byte-for-byte, after detoasting. The test for identical records can avoid the detoasting altogether for any values with different lengths, and it stops when it finds the first column with a difference. In the general case, that operator sounds dangerous to me. We don't guarantee that a Datum containing the same data always has the same binary representation. E.g. an array can have a null bitmap or may not have one, depending on how it was created. I am not actually sure whether that's a problem for your use case, but I get headaches when we try circumventing the type abstraction that way. Yes, we do such tricks in other places already, but afaik in all those places erroneously believing two Datums are distinct is not an error, just a missed optimization. Allowing a general operator with such a murky definition to creep into something SQL exposed... Hm. Not sure. Well, the only two alternatives I could see were to allow user-visible differences not to be carried to the matview if the old and new values were considered equal, or to implement an identical operator or function in every type that was to be allowed in a matview. Given those options, what's in this patch seemed to me to be the least evil. It might be worth noting that this scheme doesn't have a problem with correctness if there are multiple equal values which are not identical, as long as any two identical values are equal.
If the query which generates contents for a matview generates non-identical but equal values from one run to the next without any particular reason, that might cause performance problems. To mangle Orwell: Among pairs of equal values, some pairs are more equal than others. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
On Fri, Sep 13, 2013 at 12:23 PM, Andres Freund and...@2ndquadrant.com wrote: The reasons I wasn't saying this will never get accepted are twofold: a) I don't want to stifle alternative ideas to the promises idea, just because I think it's the way to go. That might stop a better idea from being articulated. b) I am not actually in the position to say it's not going to be accepted. Well, the reality is that the promises idea hasn't been described in remotely enough detail to compare it to what I have here. I've pointed out plenty of problems with it. After all, it was the first thing that I considered, and I'm on the record talking about it in the 2012 dev meeting. I didn't take that approach for many good reasons. The reason I ended up here is not because I didn't get the memo about holding buffer locks across complex operations being a bad thing. At least grant me that. I'm here because in all these years no one has come up with a suggestion that doesn't have some very major downsides. Like, even worse than this. As to the rules you refer to, you must mean "These locks are intended to be short-term: they should not be held for long." I don't think that they will ever be held for long. At least, when I've managed the amount of work that a heap_insert() can do better. I expect to produce a revision where toasting doesn't happen with the locks held soon. Actually, I've already written the code, I just need to do some testing. I personally think - and have stated so before - that doing a heap_insert() while holding the btree lock is unacceptable. Presumably your reason is essentially that we exclusive lock a heap buffer (exactly one heap buffer) while holding shared locks on btree index buffers. Is that really so different to holding an exclusive lock on a btree buffer while holding a shared lock on a heap buffer? Because that's what _bt_check_unique() does today.
Now, I'll grant you that there is one appreciable difference, which is that multiple unique indexes may be involved. But limiting ourselves to the primary key or something like that remains an option. And I'm not sure that it's really any worse anyway. The btree code is different, though: It implements a well-defined interface, with much clearer separation of concerns. Which you're completely violating by linking the btree buffer locking with the heap locking. It's not about the btree code alone. You're right that it isn't about just the btree code. In order for a deadlock to occur, there must be a mutual dependency. What code could feel entitled to hold buffer locks on btree buffers and heap buffers at the same time except the btree code itself? It already does so. But no one else does the same thing. If anyone did anything with a heap buffer lock held that could result in a call into one of the btree access method functions (I'm not contemplating the possibility of this other code locking the btree buffer *directly*), I'm quite sure that that would be rejected outright today, because that causes deadlocks. Certainly, vacuumlazy.c doesn't do it, for example. Why would anyone ever want to do that anyway? I cannot think of any reason. I suppose that that does still leave transitive dependencies, but now you're stretching things. After all, you're not supposed to hold buffer locks for long! The dependency would have to transit through, say, one of the system LWLocks used for WAL Logging. Seems pretty close to impossible that it'd be an issue - index stuff is only WAL-logged as index tuples are inserted (that is, as the locks are finally released). Everyone automatically does that kind of thing in a consistent order of locking, unlocking in the opposite order anyway. But what of the btree code deadlocking with itself? There are only a few functions (2 or 3) where that's possible even in principle. I think that they're going to be not too hard to analyze. 
For example, with insertion, the trick is to always lock in a consistent order and unlock/insert in the opposite order. The heap shared lock(s) needed in the btree code cannot deadlock with another upserter because once the other upserter has that exclusive heap buffer lock, it's *inevitable* that it will release all of its shared buffer locks. Granted, I could stand to think about this case more, but you get the idea - it *is* possible to clamp down on the code that needs to care about this stuff to a large degree. It's subtle, but btrees are generally considered pretty complicated, and the btree code already cares about some odd cases like these (it takes special precautions for catalog indexes, for example). The really weird thing about my patch is that the btree code trusts the executor to call the heapam code to do the right thing in the right way - it now knows more than I'd prefer. Would you be happier if the btree code took more direct responsibility for the heap tuple insertion instead? Before you say that's ridiculous, consider the big modularity violation that
[HACKERS] plpgsql.print_strict_params
Hi, Attached is a patch for optionally printing more information on STRICT failures in PL/PgSQL:

set plpgsql.print_strict_params to true;

create or replace function footest() returns void as $$
declare
  x record;
  p1 int := 2;
  p3 text := 'foo';
begin
  -- too many rows
  select * from foo where f1 > p1 or f1::text = p3 into strict x;
  raise notice 'x.f1 = %, x.f2 = %', x.f1, x.f2;
end$$ language plpgsql;

select footest();

ERROR:  query returned more than one row
DETAIL:  p1 = '2', p3 = 'foo'
CONTEXT:  PL/pgSQL function footest() line 8 at SQL statement

This parameter is turned off by default to preserve old behaviour, but can be extremely useful when debugging code in test environments. I will add this to the open commitfest, but in the meanwhile, any feedback is appreciated. Regards, Marko Tiikkaja

*** a/doc/src/sgml/plpgsql.sgml
--- b/doc/src/sgml/plpgsql.sgml
***************
*** 1076,1081 ****
  END;
--- 1076,1088 ----
       always sets <literal>FOUND</literal> to true.
      </para>
  
+     <para>
+      The configuration parameter <literal>plpgsql.print_strict_params</>
+      can be enabled to get information about the parameters passed to the
+      query in the <literal>DETAIL</> part of the error message produced
+      when the requirements of STRICT are not met.
+     </para>
+ 
      <para>
       For <command>INSERT</>/<command>UPDATE</>/<command>DELETE</> with
       <literal>RETURNING</>, <application>PL/pgSQL</application> reports

*** a/src/pl/plpgsql/src/pl_exec.c
--- b/src/pl/plpgsql/src/pl_exec.c
***************
*** 139,144 ****
  static void plpgsql_estate_setup(PLpgSQL_execstate *estate,
--- 139,150 ----
                   ReturnSetInfo *rsi);
  static void exec_eval_cleanup(PLpgSQL_execstate *estate);
  
+ static char *exec_get_query_params(PLpgSQL_execstate *estate,
+                                    const PLpgSQL_expr *expr);
+ static char *exec_get_dynquery_params(PLpgSQL_execstate *estate,
+                                       const PreparedParamsData *ppd);
+ 
+ 
  static void exec_prepare_plan(PLpgSQL_execstate *estate,
                                PLpgSQL_expr *expr, int cursorOptions);
  static bool exec_simple_check_node(Node *node);
***************
*** 3226,3231 ****
  exec_prepare_plan(PLpgSQL_execstate *estate,
--- 3232,3310 ----
      exec_simple_check_plan(expr);
  }
  
+ static char *
+ exec_get_query_params(PLpgSQL_execstate *estate,
+                       const PLpgSQL_expr *expr)
+ {
+     int         paramno;
+     int         dno;
+     StringInfoData paramstr;
+     Bitmapset  *tmpset;
+ 
+     if (!expr->paramnos)
+         return "(no parameters)";
+ 
+     initStringInfo(&paramstr);
+     tmpset = bms_copy(expr->paramnos);
+     paramno = 1;
+     while ((dno = bms_first_member(tmpset)) >= 0)
+     {
+         Datum       paramdatum;
+         Oid         paramtypeid;
+         bool        paramisnull;
+         int32       paramtypmod;
+         PLpgSQL_var *curvar;
+ 
+         curvar = (PLpgSQL_var *) estate->datums[dno];
+ 
+         exec_eval_datum(estate, (PLpgSQL_datum *) curvar, &paramtypeid,
+                         &paramtypmod, &paramdatum, &paramisnull);
+ 
+         if (paramno > 1)
+             appendStringInfo(&paramstr, ", ");
+ 
+         if (paramisnull)
+             appendStringInfo(&paramstr, "%s = NULL", curvar->refname);
+         else
+         {
+             char *value = convert_value_to_string(estate, paramdatum, paramtypeid);
+ 
+             appendStringInfo(&paramstr, "%s = '%s'", curvar->refname, value);
+         }
+ 
+         paramno++;
+     }
+     bms_free(tmpset);
+ 
+     return paramstr.data;
+ }
+ 
+ static char *
+ exec_get_dynquery_params(PLpgSQL_execstate *estate,
+                          const PreparedParamsData *ppd)
+ {
+     int         i;
+     StringInfoData paramstr;
+ 
+     if (!ppd)
+         return "(no parameters)";
+ 
+     initStringInfo(&paramstr);
+     for (i = 0; i < ppd->nargs; ++i)
+     {
+         if (i > 0)
+             appendStringInfoString(&paramstr, ", ");
+ 
+         if (ppd->nulls[i] == 'n')
+             appendStringInfo(&paramstr, "$%d = NULL", i + 1);
+         else
+         {
+             char *value = convert_value_to_string(estate, ppd->values[i], ppd->types[i]);
+ 
+             appendStringInfo(&paramstr, "$%d = '%s'", i + 1, value);
+         }
+     }
+ 
+     return paramstr.data;
+ }
  
  /* ----------
   * exec_stmt_execsql          Execute an SQL statement (possibly with INTO).
***************
*** 3391,3408 ****
  exec_stmt_execsql(PLpgSQL_execstate *estate,
      if (n == 0)
      {
          if (stmt->strict)
Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers
On 2013-09-13 14:04:55 -0700, Kevin Grittner wrote: Andres Freund and...@2ndquadrant.com wrote: Absolutely not claiming the contrary. I think it sucks that we couldn't fully figure out what's happening in detail. I'd love to get my hand on a setup where it can be reliably reproduced. I have seen two completely different causes for symptoms like this, and I suspect that these aren't the only two. (1) The dirty page avalanche: PostgreSQL hangs on to a large number of dirty buffers and then dumps a lot of them at once. The OS does the same. When PostgreSQL dumps its buffers to the OS it pushes the OS over a tipping point where it is writing dirty buffers too fast for the controller's BBU cache to absorb them. Everything freezes until the controller writes and accepts OS writes for a lot of data. This can take several minutes, during which time the database seems frozen. Cure is some combination of these: reduce shared_buffers, make the background writer more aggressive, checkpoint more often, make the OS dirty page writing more aggressive, add more BBU RAM to the controller. That should hopefully be diagnosable from other system stats like the dirty rate. (2) Transparent huge page support goes haywire on its defrag work. Clues on this include very high system CPU time during an episode, and `perf top` shows more time in kernel spinlock functions than anywhere else. The database doesn't completely lock up like with the dirty page avalanche, but it is slow enough that users often describe it that way. So far I have only seen this cured by disabling THP support (in spite of some people urging that just the defrag be disabled). Yes, I have seen that issue a couple of times now as well. I can confirm that in at least two cases disabling defragmentation alone proved to be enough to fix the issue. Annoyingly enough there are different ways to disable defragmentation/THP depending on whether you're using THP backported by redhat or the upstream version... 
It does make me wonder whether there is something we could do in PostgreSQL to interact better with THPs. The best thing I see is to just use explicit hugepages. I've previously sent a prototype for that, which has since been turned into an actual implementation by Christian Kruse. A colleague of mine is working on polishing that patch into something committable. If you use large s_b, the memory savings alone (some 100kB instead of dozens of megabytes per backend) can be worth it, not to talk of actual performance gains. Updating the kernel helps as well; they've improved the efficiency of defragmentation a good bit. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] record identical operator
On 2013-09-13 14:36:27 -0700, Kevin Grittner wrote: Andres Freund and...@2ndquadrant.com wrote: On 2013-09-12 15:27:27 -0700, Kevin Grittner wrote: The new operator is logically similar to IS NOT DISTINCT FROM for a record, although its implementation is very different. For one thing, it doesn't replace the operation with column level operators in the parser. For another thing, it doesn't look up operators for each type, so the identical operator does not need to be implemented for each type to use it as shown above. It compares values byte-for-byte, after detoasting. The test for identical records can avoid the detoasting altogether for any values with different lengths, and it stops when it finds the first column with a difference. In the general case, that operator sounds dangerous to me. We don't guarantee that a Datum containing the same data always has the same binary representation. E.g. an array can have a null bitmap or may not have one, depending on how it was created. I am not actually sure whether that's a problem for your use case, but I get headaches when we try circumventing the type abstraction that way. Yes, we do such tricks in other places already, but afaik in all those places erroneously believing two Datums are distinct is not an error, just a missed optimization. Allowing a general operator with such a murky definition to creep into something SQL exposed... Hm. Not sure. Well, the only two alternatives I could see were to allow user-visible differences not to be carried to the matview if the old and new values were considered equal, or to implement an identical operator or function in every type that was to be allowed in a matview. Given those options, what's in this patch seemed to me to be the least evil. It might be worth noting that this scheme doesn't have a problem with correctness if there are multiple equal values which are not identical, as long as any two identical values are equal.
If the query which generates contents for a matview generates non-identical but equal values from one run to the next without any particular reason, that might cause performance problems. I am not actually that concerned with MVCs using this; you're quite capable of analyzing the dangers. What I am wary of is exposing an operator that's basically broken from the get-go to SQL. Now, the obvious issue there is that matviews use SQL to refresh :( Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
Peter Geoghegan p...@heroku.com wrote: we exclusive lock a heap buffer (exactly one heap buffer) while holding shared locks on btree index buffers. Is that really so different to holding an exclusive lock on a btree buffer while holding a shared lock on a heap buffer? Because that's what _bt_check_unique() does today. Is it possible to get a deadlock doing only one of those two things? Is it possible to avoid a deadlock doing both of them? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
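Kevin's question is the classic lock-ordering one, and it can be modeled abstractly (this is not PostgreSQL code; the single-character lock names are a hypothetical stand-in for heap and btree buffers): if every code path acquires the two locks in the same order, no cycle can form between them, but allowing both orders at once is what makes a deadlock possible.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model of the two-lock ordering argument: a deadlock between two
 * code paths over the same pair of locks requires a cycle in the
 * wait-for graph, i.e. the paths must take the locks in opposite orders.
 */
static bool
can_deadlock(char p1_first, char p1_second, char p2_first, char p2_second)
{
    return p1_first == p2_second && p1_second == p2_first;
}
```

So doing "only one of those two things" everywhere is deadlock-free with respect to this lock pair, while doing both (some paths btree-then-heap, others heap-then-btree) is where the danger lies.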
Re: [HACKERS] record identical operator
Andres Freund and...@2ndquadrant.com wrote: I am not actually that concerned with MVCs using this, you're quite capable of analyzing the dangers. What I am wary of is exposing an operator that's basically broken from the get go to SQL. Now, the obvious issue there is that matviews use SQL to refresh :( I'm not sure why these operators are more broken or dangerous than those which already exist to support the text_pattern_ops and bpchar_pattern_ops operator families. I could overload those operator names as much as possible if that seems better. As I said at the start of the thread, I have no particular attachment to these operator names. For example, that would mean using ~>=~ as the operator for record_image_ge() instead of using >==. It would be trivial to make that adjustment to the patch. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
On 2013-09-13 15:13:20 -0700, Kevin Grittner wrote: Andres Freund and...@2ndquadrant.com wrote: I am not actually that concerned with MVCs using this, you're quite capable of analyzing the dangers. What I am wary of is exposing an operator that's basically broken from the get go to SQL. Now, the obvious issue there is that matviews use SQL to refresh :( I'm not sure why these operators are more broken or dangerous than those which already exist to support the text_pattern_ops and bpchar_pattern_ops operator families. I could overload those operator names as much as possible if that seems better. As I said at the start of the thread, I have no particular attachment to these operator names. For example, that would mean using ~>=~ as the operator for record_image_ge() instead of using >==. It would be trivial to make that adjustment to the patch. Hm. I don't see the similarity. Those have pretty clearly defined behaviour. Not one that's dependent on padding bytes, null bitmaps that can or cannot be present and such. I guess we need input from others here. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpgsql.print_strict_params
Marko Tiikkaja wrote: set plpgsql.print_strict_params to true; Hmm, shouldn't this be a compile option? See the comp_option production in pl_gram.y for existing examples. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpgsql.print_strict_params
On 2013-09-14 00:39, Alvaro Herrera wrote: set plpgsql.print_strict_params to true; Hmm, shouldn't this be a compile option? See the comp_option production in pl_gram.y for existing examples. I thought about that, but it seemed more like a runtime option to me. Any particular reason you think it would be better as a compile time option? Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] patch: add MAP_HUGETLB to mmap() where supported (WIP)
The attached patch adds the MAP_HUGETLB flag to mmap() for shared memory on systems that support it. It's based on Christian Kruse's patch from last year, incorporating suggestions from Andres Freund. On a system with 4GB shared_buffers, doing pgbench runs long enough for each backend to touch most of the buffers, this patch saves nearly 8MB of memory per backend and improves performance by just over 2% on average. It is still WIP as there are a couple of points that Andres has pointed out to me that haven't been addressed yet; also, the documentation is incomplete. Richard -- Richard Poole http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 23ebc11..703b28f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1052,6 +1052,42 @@ include 'filename'
      </listitem>
     </varlistentry>

+     <varlistentry id="guc-huge-tlb-pages" xreflabel="huge_tlb_pages">
+      <term><varname>huge_tlb_pages</varname> (<type>enum</type>)</term>
+      <indexterm>
+       <primary><varname>huge_tlb_pages</> configuration parameter</primary>
+      </indexterm>
+      <listitem>
+       <para>
+        Enables/disables the use of huge tlb pages. Valid values are
+        <literal>on</literal>, <literal>off</literal> and <literal>try</literal>.
+        The default value is <literal>try</literal>.
+       </para>
+
+       <para>
+        Use of huge tlb pages reduces the cpu time spent on memory management and
+        the amount of memory used for page tables and therefore improves performance.
+       </para>
+
+       <para>
+        With <varname>huge_tlb_pages</varname> set to <literal>on</literal>
+        <symbol>mmap()</symbol> will be called with <symbol>MAP_HUGETLB</symbol>.
+        If the call fails the server will fail fatally.
+       </para>
+
+       <para>
+        With <varname>huge_tlb_pages</varname> set to <literal>off</literal> we
+        will not use <symbol>MAP_HUGETLB</symbol> at all.
+       </para>
+
+       <para>
+        With <varname>huge_tlb_pages</varname> set to <literal>try</literal>
+        we will try to use <symbol>MAP_HUGETLB</symbol> and fall back to
+        <symbol>mmap()</symbol> without <symbol>MAP_HUGETLB</symbol>.
+       </para>
+      </listitem>
+     </varlistentry>
+
     <varlistentry id="guc-temp-buffers" xreflabel="temp_buffers">
      <term><varname>temp_buffers</varname> (<type>integer</type>)</term>
      <indexterm>
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index 20e3c32..57fff35 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -27,10 +27,14 @@
 #ifdef HAVE_SYS_SHM_H
 #include <sys/shm.h>
 #endif
+#ifdef MAP_HUGETLB
+#include <dirent.h>
+#endif

 #include "miscadmin.h"
 #include "storage/ipc.h"
 #include "storage/pg_shmem.h"
+#include "utils/guc.h"

 typedef key_t IpcMemoryKey;    /* shared memory key passed to shmget(2) */
@@ -61,6 +65,13 @@ typedef int IpcMemoryId;    /* shared memory ID returned by shmget(2) */
 #define MAP_FAILED ((void *) -1)
 #endif

+#ifdef MAP_HUGETLB
+#define PG_HUGETLB_BASE_ADDR (void *)(0x0UL)
+#define PG_MAP_HUGETLB MAP_HUGETLB
+#else
+#define PG_MAP_HUGETLB 0
+#endif
+
 unsigned long UsedShmemSegID = 0;
 void      *UsedShmemSegAddr = NULL;
@@ -342,6 +353,161 @@ PGSharedMemoryIsInUse(unsigned long id1, unsigned long id2)
 }

+#ifdef MAP_HUGETLB
+#define HUGE_PAGE_INFO_DIR "/sys/kernel/mm/hugepages"
+
+/*
+ * static long InternalGetFreeHugepagesCount(const char *name)
+ *
+ * Attempt to read the number of available hugepages from
+ * /sys/kernel/mm/hugepages/hugepages-<size>/free_hugepages
+ * Will fail (return -1) if the file could not be opened, return 0 if no
+ * pages are available, and > 0 if there are free pages
+ *
+ */
+static long
+InternalGetFreeHugepagesCount(const char *name)
+{
+	int fd;
+	char buff[1024];
+	size_t len;
+	long result;
+	char *ptr;
+
+	len = snprintf(buff, 1024, "%s/%s/free_hugepages", HUGE_PAGE_INFO_DIR, name);
+	if (len == 1024) /* I don't think that this will happen ever */
+	{
+		ereport(huge_tlb_pages == HUGE_TLB_TRY ? DEBUG1 : WARNING,
+				(errmsg("Filename %s/%s/free_hugepages is too long", HUGE_PAGE_INFO_DIR, name),
+				 errcontext("while checking hugepage size")));
+		return -1;
+	}
+
+	fd = open(buff, O_RDONLY);
+	if (fd < 0)
+	{
+		ereport(huge_tlb_pages == HUGE_TLB_TRY ? DEBUG1 : WARNING,
+				(errmsg("Could not open file %s: %s", buff, strerror(errno)),
+				 errcontext("while checking hugepage size")));
+		return -1;
+	}
+
+	len = read(fd, buff, 1024);
+	if (len <= 0)
+	{
+		ereport(huge_tlb_pages == HUGE_TLB_TRY ? DEBUG1 : WARNING,
+				(errmsg("Error reading from file %s: %s", buff, strerror(errno)),
+				 errcontext("while checking hugepage size")));
+		close(fd);
+		return -1;
+	}
+
+	/*
+	 * If the content of free_hugepages is longer than or equal to 1024 bytes
+	 * the rest is irrelevant; we simply want to know if there are any
+	 * hugepages left
+	 */
+	if (len == 1024)
+	{
+		buff[1023] = 0;
+	}
+	else
+	{
+		buff[len] = 0;
+	}
+
+	close(fd);
+
+	result =
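The core of the sysfs-reading logic in the (truncated) function above can be restated as a compact standalone helper. This is a simplified sketch, not the patch's code: `read_free_hugepages` is a hypothetical name, error reporting is reduced to a -1 return, and parsing is delegated to strtol; it reads a single number from a file such as /sys/kernel/mm/hugepages/hugepages-2048kB/free_hugepages.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Simplified sketch of InternalGetFreeHugepagesCount(): read one number
 * from a sysfs-style file.  Returns -1 if the file cannot be opened or
 * read, otherwise the parsed value (0 means no free hugepages).
 */
static long
read_free_hugepages(const char *path)
{
    int     fd;
    char    buff[1024];
    ssize_t len;

    fd = open(path, O_RDONLY);
    if (fd < 0)                         /* could not open the file */
        return -1;

    len = read(fd, buff, sizeof(buff) - 1);
    close(fd);
    if (len <= 0)                       /* read error or empty file */
        return -1;

    buff[len] = '\0';                   /* NUL-terminate before parsing */
    return strtol(buff, NULL, 10);
}
```

The real function additionally distinguishes huge_tlb_pages = try (errors downgraded to DEBUG1) from on (warnings), which is why every failure path there goes through ereport rather than silently returning.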
Re: [HACKERS] ENABLE/DISABLE CONSTRAINT NAME
On Thu, Sep 12, 2013 at 10:03 PM, wangs...@highgo.com.cn wrote: Second, I tested the check and the foreign key constraint as your test above. And no error found, as follows: You're missing the point. Peter wasn't worried that your patch throws an error; he's concerned about the fact that it doesn't. In PostgreSQL, you can only create the following view because test1 has a primary key over column a: => create table test1 (a int constraint pk primary key, b text); => create view test2 as select a, b from test1 group by a; => alter table test1 drop constraint pk; The reason is that, if the primary key weren't there, it would be ambiguous which row should be returned from among multiple rows where a is equal and b is not. If you can disable the constraint, then you can create precisely that problem. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpgsql.print_strict_params
Marko Tiikkaja wrote: On 2013-09-14 00:39, Alvaro Herrera wrote: set plpgsql.print_strict_params to true; Hmm, shouldn't this be a compile option? See the comp_option production in pl_gram.y for existing examples. I thought about that, but it seemed more like a runtime option to me. Any particular reason you think it would be better as a compile time option? Seems like something that would be useful to change per function, rather than globally, no? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
On Fri, 2013-09-13 at 14:56 +0530, Atri Sharma wrote: This is our complete patch for implementation of WITHIN GROUP. Please fix compiler warnings: inversedistribution.c: In function ‘mode_final’: inversedistribution.c:276:11: warning: ‘mode_val’ may be used uninitialized in this function [-Wmaybe-uninitialized] inversedistribution.c:299:8: warning: ‘last_val’ may be used uninitialized in this function [-Wmaybe-uninitialized] -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
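For readers unfamiliar with this class of warning, here is a minimal reproduction of the pattern gcc is flagging. This is hypothetical illustration code, not the patch's `mode_final`: the variable is only assigned inside a loop, so when the loop body never runs the compiler cannot prove it holds a value; initializing it at the declaration both silences -Wmaybe-uninitialized and makes the empty-input case well defined.

```c
#include <assert.h>

/*
 * Minimal -Wmaybe-uninitialized reproduction.  Without the "= 0"
 * initializer, gcc warns that mode_val "may be used uninitialized",
 * because the only assignments are inside the loop.  The max-so-far
 * logic is just a stand-in for the real "most frequent value" code.
 */
static int
pick_mode(const int *vals, int n)
{
    int mode_val = 0;   /* initialized: drop "= 0" to see the warning */
    int best = -1;

    for (int i = 0; i < n; i++)
    {
        if (vals[i] > best)
        {
            best = vals[i];
            mode_val = vals[i];
        }
    }
    return mode_val;    /* reachable with n == 0, hence the warning */
}
```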
Re: [HACKERS] plpgsql.print_strict_params
On 2013-09-14 03:33, Alvaro Herrera wrote: Marko Tiikkaja wrote: I thought about that, but it seemed more like a runtime option to me. Any particular reason you think it would be better as a compile time option? Seems like something that would be useful to change per function, rather than globally, no? The way I see it, STRICT is like an assertion, and you *would* pretty much always change it globally. However, it would be useful to also have the option to set it for a single function only. Will look into that. Thanks for the suggestion! Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
Andres Freund and...@2ndquadrant.com wrote: Not one that's dependendant on padding bytes, null bitmaps that can or cannot be present and such. Can you provide an example of where that's an issue with this patch? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PL/pgSQL, RAISE and error context
On 23/08/2013 10:36, I wrote: On 8/23/13 8:38 AM, Pavel Stehule wrote: do you prepare patch ? I should have the time to produce one for the September commitfest, but if you (or anyone else) want to work on this, I won't object. My opinion at this very moment is that we should leave the DEFAULT verbosity alone and add a new one (call it COMPACT or such) with the suppressed context for non-ERRORs. Well, turns out there isn't really any way to preserve complete backwards compatibility if we want to do this change. The attached patch (based on Pavel's patch) changes the default to be slightly more verbose (the CONTEXT lines which were previously omitted will be visible), but adds a new PGVerbosity called COMPACT which suppresses CONTEXT in non-error messages. Now DEFAULT will be more useful when debugging PL/PgSQL, and people who are annoyed by the new behaviour can use the COMPACT mode. Any thoughts? Regards, Marko Tiikkaja

*** a/doc/src/sgml/libpq.sgml
--- b/doc/src/sgml/libpq.sgml
*** 5418,5423 int PQsetClientEncoding(PGconn *<replaceable>conn</replaceable>, const char *re
--- 5418,5424
  typedef enum
  {
      PQERRORS_TERSE,
+     PQERRORS_COMPACT,
      PQERRORS_DEFAULT,
      PQERRORS_VERBOSE
  } PGVerbosity;
*** 5430,5439 PGVerbosity PQsetErrorVerbosity(PGconn *conn, PGVerbosity verbosity);
  returned messages include severity, primary text, and position only;
  this will normally fit on a single line. The default mode produces
  messages that include the above plus any detail, hint, or context
! fields (these might span multiple lines). The <firstterm>VERBOSE</>
! mode includes all available fields. Changing the verbosity does not
! affect the messages available from already-existing
! <structname>PGresult</> objects, only subsequently-created ones.
  </para>
  </listitem>
  </varlistentry>
--- 5431,5442
  returned messages include severity, primary text, and position only;
  this will normally fit on a single line. The default mode produces
  messages that include the above plus any detail, hint, or context
! fields (these might span multiple lines). The COMPACT mode is otherwise
! the same as the default, except the context field will be omitted for
! non-error messages. The <firstterm>VERBOSE</> mode includes all
! available fields. Changing the verbosity does not affect the messages
! available from already-existing <structname>PGresult</> objects, only
! subsequently-created ones.
*** a/src/bin/psql/startup.c
--- b/src/bin/psql/startup.c
*** 796,801 verbosity_hook(const char *newval)
--- 796,803
  pset.verbosity = PQERRORS_DEFAULT;
  else if (strcmp(newval, "terse") == 0)
      pset.verbosity = PQERRORS_TERSE;
+ else if (strcmp(newval, "compact") == 0)
+     pset.verbosity = PQERRORS_COMPACT;
  else if (strcmp(newval, "verbose") == 0)
      pset.verbosity = PQERRORS_VERBOSE;
  else
*** a/src/interfaces/libpq/fe-protocol3.c
--- b/src/interfaces/libpq/fe-protocol3.c
*** 915,920 pqGetErrorNotice3(PGconn *conn, bool isError)
--- 915,924
  if (val)
      appendPQExpBuffer(workBuf, libpq_gettext("QUERY: %s\n"), val);
  val = PQresultErrorField(res, PG_DIAG_CONTEXT);
+ }
+ if (isError || (conn->verbosity != PQERRORS_TERSE &&
+                 conn->verbosity != PQERRORS_COMPACT))
+ {
  if (val)
      appendPQExpBuffer(workBuf, libpq_gettext("CONTEXT: %s\n"), val);
  }
*** a/src/interfaces/libpq/libpq-fe.h
--- b/src/interfaces/libpq/libpq-fe.h
*** 106,111 typedef enum
--- 106,112
  typedef enum
  {
      PQERRORS_TERSE,     /* single-line error messages */
+     PQERRORS_COMPACT,   /* single-line error messages on non-error messages */
      PQERRORS_DEFAULT,   /* recommended style */
      PQERRORS_VERBOSE    /* all the facts, ma'am */
  } PGVerbosity;
*** a/src/pl/plpgsql/src/pl_exec.c
--- b/src/pl/plpgsql/src/pl_exec.c
*** 39,46
  #include "utils/typcache.h"

- static const char *const raise_skip_msg = "RAISE";
-
  typedef struct
  {
      int nargs;  /* number of arguments */
--- 39,44
*** 867,876 plpgsql_exec_error_callback(void *arg)
  {
      PLpgSQL_execstate *estate = (PLpgSQL_execstate *) arg;

-     /* if we are doing RAISE, don't report its location */
-     if (estate->err_text == raise_skip_msg)
-         return;
-
      if (estate->err_text != NULL)
      {
          /*
--- 865,870
*** 3032,3038
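The key decision in the fe-protocol3.c hunk of the patch above can be restated as a standalone function. This is an illustrative sketch (`show_context_field` is a hypothetical name; in libpq the test is inline in pqGetErrorNotice3()): CONTEXT is emitted for genuine errors, or for any message when verbosity is above both TERSE and the proposed COMPACT level.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirror of libpq's PGVerbosity with the proposed COMPACT level added */
typedef enum
{
    PQERRORS_TERSE,
    PQERRORS_COMPACT,   /* proposed: like DEFAULT, but no CONTEXT on non-errors */
    PQERRORS_DEFAULT,
    PQERRORS_VERBOSE
} PGVerbosity;

/*
 * The patch's condition for appending the CONTEXT field, spelled out:
 * always for errors, and for notices only when verbosity is neither
 * TERSE nor COMPACT.
 */
static bool
show_context_field(bool isError, PGVerbosity verbosity)
{
    return isError || (verbosity != PQERRORS_TERSE &&
                       verbosity != PQERRORS_COMPACT);
}
```

This captures the behaviour change being discussed: under DEFAULT, non-error messages now carry CONTEXT (previously suppressed for RAISE), while COMPACT restores the old quieter behaviour without losing CONTEXT on real errors.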
[HACKERS] Proposal: PL/PgSQL strict_mode
Hi, After my previous suggestion for adding a STRICT keyword got shot down[1], I've been thinking about an idea Andrew Gierth tossed out: adding a new strict mode into PL/PgSQL. In this mode, any query which processes more than one row will raise an exception. This is a bit similar to specifying INTO STRICT .. for every statement, except processing no rows does not trigger the exception. The need for this mode comes from a few observations I make almost every day: 1) The majority of statements only deal with exactly 0 or 1 rows. 2) Checking row_count for a statement is ugly and cumbersome, so often it just isn't checked. I often use RETURNING TRUE INTO STRICT _OK for DML, but that a) requires an extra variable, and b) isn't possible if 0 rows affected is not an error in the application logic. 3) SELECT .. INTO only fetches one row and ignores the rest. Even row_count is always set to 0 or 1, so there's no way to fetch a value *and* to check that there would not have been more rows. This creates bugs which make your queries return wrong results and which could go undetected for a long time. Attached is a proof-of-concept patch (no docs, probably some actual code problems too) to implement this as a compile option: =# create or replace function footest() returns void as $$ $# #strict_mode strict $# begin $# -- not allowed to delete more than one row $# delete from foo where f1 > 100; $# end$$ language plpgsql; CREATE FUNCTION =# select footest(); ERROR: query processed more than one row CONTEXT: PL/pgSQL function footest() line 5 at SQL statement Now while I think this is a step into the right direction, I do have a couple of problems with this patch: 1) I'm not sure what would be the correct behaviour with EXECUTE. I'm tempted to just leave EXECUTE alone, as it has slightly different rules anyway. 2) If you're running in strict mode and you want to insert/update/delete more than one row, things get a bit uglier; a wCTE would work for some cases.
If strict mode doesn't affect EXECUTE (see point 1 above), that could work too. Or maybe there could be a new command which runs a query, discards the results and ignores the number of rows processed. I'll be adding this to the open commitfest in hopes of getting some feedback on this idea (I'm prepared to hear a lot of you're crazy!), but feel free to comment away any time you please. Regards, Marko Tiikkaja [1]: http://www.postgresql.org/message-id/510bf731.5020...@gmx.net

*** a/src/pl/plpgsql/src/pl_comp.c
--- b/src/pl/plpgsql/src/pl_comp.c
*** 351,356 do_compile(FunctionCallInfo fcinfo,
--- 351,357
  function->fn_cxt = func_cxt;
  function->out_param_varno = -1;   /* set up for no OUT param */
  function->resolve_option = plpgsql_variable_conflict;
+ function->strict_mode = plpgsql_strict_mode;
  if (is_dml_trigger)
      function->fn_is_trigger = PLPGSQL_DML_TRIGGER;
*** 847,852 plpgsql_compile_inline(char *proc_source)
--- 848,854
  function->fn_cxt = func_cxt;
  function->out_param_varno = -1;   /* set up for no OUT param */
  function->resolve_option = plpgsql_variable_conflict;
+ function->strict_mode = plpgsql_strict_mode;
  plpgsql_ns_init();
  plpgsql_ns_push(func_name);
*** a/src/pl/plpgsql/src/pl_exec.c
--- b/src/pl/plpgsql/src/pl_exec.c
*** 3238,3243 exec_stmt_execsql(PLpgSQL_execstate *estate,
--- 3238,3244
  ParamListInfo paramLI;
  long tcount;
  int rc;
+ bool strict_mode = estate->func->strict_mode;
  PLpgSQL_expr *expr = stmt->sqlstmt;

  /*
*** 3278,3293 exec_stmt_execsql(PLpgSQL_execstate *estate,
  /*
   * If we have INTO, then we only need one row back ... but if we have INTO
!  * STRICT, ask for two rows, so that we can verify the statement returns
!  * only one. INSERT/UPDATE/DELETE are always treated strictly. Without
!  * INTO, just run the statement to completion (tcount = 0).
   *
   * We could just ask for two rows always when using INTO, but there are
   * some cases where demanding the extra row costs significant time, eg by
   * forcing completion of a sequential scan. So don't do it unless we need
   * to enforce strictness.
   */
! if (stmt->into)
  {
      if (stmt->strict || stmt->mod_stmt)
          tcount = 2;
--- 3279,3297
  /*
   * If we have INTO, then we only need one row back ... but if we have INTO
!  * STRICT or we're in strict mode, ask for two rows, so that we can verify
!  * the statement returns only one. INSERT/UPDATE/DELETE are always treated
!  * strictly. Without INTO, just run the statement to completion
!  * (tcount = 0).
   *
   * We
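The row-count rule the proposal enforces can be summarized in two tiny functions. This is a simplified sketch with hypothetical names (`strict_mode_ok`, `strict_mode_tcount`), not the patch's exec_stmt_execsql code: the executor is asked for up to two rows so a forbidden second row can be detected, and zero rows is explicitly allowed, which is exactly how this mode differs from INTO STRICT.

```c
#include <assert.h>
#include <stdbool.h>

/* 0 or 1 rows: fine; 2 or more: "query processed more than one row" */
static bool
strict_mode_ok(unsigned long rows_processed)
{
    return rows_processed <= 1;
}

/*
 * tcount to pass to the executor: asking for 2 rows lets us see the
 * offending second row without running the whole query; 0 means "run
 * the statement to completion".
 */
static long
strict_mode_tcount(bool strict)
{
    return strict ? 2 : 0;
}
```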
Re: [HACKERS] Proposal: PL/PgSQL strict_mode
On 14/09/2013 06:28, I wrote: 2) Checking row_count for a statement is ugly and cumbersome, so often it just isn't checked. I often use RETURNING TRUE INTO STRICT _OK for DML, but that a) requires an extra variable, and b) isn't possible if 0 rows affected is not an error in the application logic. The b) part here wasn't exactly true; you could use RETURNING TRUE INTO OK, as INSERT/UPDATE/DELETE with RETURNING .. INTO raises an exception if it returns more than one row, but it's even more convoluted than the RETURNING TRUE INTO STRICT _OK, and it's repetitive. Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: PL/PgSQL strict_mode
A few thoughts about this. On 14 September 2013 at 05:28 Marko Tiikkaja ma...@joh.to wrote: Hi, After my previous suggestion for adding a STRICT keyword got shot down[1], I've been thinking about an idea Andrew Gierth tossed out: adding a new strict mode into PL/PgSQL. In this mode, any query which processes more than one row will raise an exception. This is a bit similar to specifying INTO STRICT .. for every statement, except processing no rows does not trigger the exception. The need for this mode comes from a few observations I make almost every day: 1) The majority of statements only deal with exactly 0 or 1 rows. 2) Checking row_count for a statement is ugly and cumbersome, so often it just isn't checked. I often use RETURNING TRUE INTO STRICT _OK for DML, but that a) requires an extra variable, and b) isn't possible if 0 rows affected is not an error in the application logic. 3) SELECT .. INTO only fetches one row and ignores the rest. Even row_count is always set to 0 or 1, so there's no way to fetch a value *and* to check that there would not have been more rows. This creates bugs which make your queries return wrong results and which could go undetected for a long time. I am going to suggest that STRICT is semantically pretty far from what is meant here in common speech. I think STRICT here would be confusing. This would be really pretty severe for people coming from Perl or MySQL backgrounds. May I suggest SINGLE as a key word instead? It might be worth having it attached to INSERT, UPDATE, and DELETE statements. I am thinking something like: DELETE SINGLE FROM foo WHERE f1 > 1000; would be clearer. Similarly one could have: INSERT SINGLE INTO foo SELECT * from foo2; and UPDATE SINGLE foo. You could even use SELECT SINGLE, but I am not sure where the use case is there where unique indexes are not sufficient.
Attached is a proof-of-concept patch (no docs, probably some actual code problems too) to implement this as a compile option: =# create or replace function footest() returns void as $$ $# #strict_mode strict $# begin $# -- not allowed to delete more than one row $# delete from foo where f1 > 100; $# end$$ language plpgsql; CREATE FUNCTION =# select footest(); ERROR: query processed more than one row CONTEXT: PL/pgSQL function footest() line 5 at SQL statement Now while I think this is a step into the right direction, I do have a couple of problems with this patch: 1) I'm not sure what would be the correct behaviour with EXECUTE. I'm tempted to just leave EXECUTE alone, as it has slightly different rules anyway. 2) If you're running in strict mode and you want to insert/update/delete more than one row, things get a bit uglier; a wCTE would work for some cases. If strict mode doesn't affect EXECUTE (see point 1 above), that could work too. Or maybe there could be a new command which runs a query, discards the results and ignores the number of rows processed. Yeah, I am worried about this one. I am concerned that if you can't disable it on a statement-by-statement basis, then you have a problem where you end up removing the mode from the function, and then it becomes a lot harder for everyone maintaining the function to have a clear picture of what is going on. I am further worried that hacked, ugly ways of coding around it will introduce plenty of other maintenance pain points that will be worse than what you are solving. I'll be adding this to the open commitfest in hopes of getting some feedback on this idea (I'm prepared to hear a lot of you're crazy!), but feel free to comment away any time you please.
Well, I don't know if my feedback above is helpful, but there it is ;-) Regards, Marko Tiikkaja [1]: http://www.postgresql.org/message-id/510bf731.5020...@gmx.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers Best Wishes, Chris Travers http://www.2ndquadrant.com PostgreSQL Services, Training, and Support