Re: [HACKERS] Proposal: PL/PgSQL strict_mode
2013/9/14 chris travers ch...@2ndquadrant.com

A few thoughts about this.

On 14 September 2013 at 05:28 Marko Tiikkaja ma...@joh.to wrote:

Hi,

After my previous suggestion for adding a STRICT keyword got shot down[1], I've been thinking about an idea Andrew Gierth tossed out: adding a new "strict mode" into PL/PgSQL. In this mode, any query which processes more than one row will raise an exception. This is a bit similar to specifying INTO STRICT .. for every statement, except processing no rows does not trigger the exception.

The need for this mode comes from a few observations I make almost every day:

1) The majority of statements only deal with exactly 0 or 1 rows.
2) Checking row_count for a statement is ugly and cumbersome, so often it just isn't checked. I often use RETURNING TRUE INTO STRICT _OK for DML, but that a) requires an extra variable, and b) isn't possible if 0 rows affected is not an error in the application logic.
3) SELECT .. INTO only fetches one row and ignores the rest. Even row_count is always set to 0 or 1, so there's no way to fetch a value *and* check that there would not have been more rows. This creates bugs which make your queries return wrong results and which could go undetected for a long time.

I am going to suggest that STRICT is semantically pretty far from what is meant here in common speech. I think STRICT here would be confusing. This would be really pretty severe for people coming from Perl or MySQL backgrounds. May I suggest SINGLE as a keyword instead? It might be worth having it attached to INSERT, UPDATE, and DELETE statements.

I don't think SINGLE is better. There is a risk of confusion with the LIMIT clause. (Moreover, we use STRICT now, so a new keyword increases the potential for confusion.) When I look at the translation of STRICT into Czech, I think STRICT is good enough. If we do make a change, then I propose a slightly COBOL-like solution: a CHECK ONE ROW EXPECTED clause.
instead of

  DELETE FROM foo WHERE f1 > 1000

do

  DELETE FROM foo WHERE f1 > 1000 CHECK ONE ROW EXPECTED;

Regards Pavel

I am thinking something like:

  DELETE SINGLE FROM foo WHERE f1 > 1000;

would be clearer. Similarly one could have:

  INSERT SINGLE INTO foo SELECT * from foo2;

and

  UPDATE SINGLE foo

You could even use SELECT SINGLE, but I am not sure where the use case is there where unique indexes are not sufficient.

Attached is a proof-of-concept patch (no docs, probably some actual code problems too) to implement this as a compile option:

=# create or replace function footest() returns void as $$
$# #strict_mode strict
$# begin
$#   -- not allowed to delete more than one row
$#   delete from foo where f1 > 100;
$# end$$ language plpgsql;
CREATE FUNCTION
=# select footest();
ERROR: query processed more than one row
CONTEXT: PL/pgSQL function footest() line 5 at SQL statement

Now while I think this is a step in the right direction, I do have a couple of problems with this patch:

1) I'm not sure what would be the correct behaviour with EXECUTE. I'm tempted to just leave EXECUTE alone, as it has slightly different rules anyway.

2) If you're running in strict mode and you want to insert/update/delete more than one row, things get a bit uglier; a wCTE would work for some cases. If strict mode doesn't affect EXECUTE (see point 1 above), that could work too. Or maybe there could be a new command which runs a query, discards the results and ignores the number of rows processed.

Yeah, I am worried about this one. I am concerned that if you can't disable it on a statement-by-statement basis, then you have a problem where you end up removing the mode from the function, and then it becomes a lot harder for everyone maintaining the function to have a clear picture of what is going on. I am further worried that hacked, ugly ways around it will introduce plenty of other maintenance pain points that will be worse than what you are solving.
I'll be adding this to the open commitfest in hopes of getting some feedback on this idea (I'm prepared to hear a lot of "you're crazy!"), but feel free to comment away any time you please.

Well, I don't know if my feedback above is helpful, but there it is ;-)

Regards, Marko Tiikkaja

[1]: http://www.postgresql.org/message-id/510bf731.5020...@gmx.net

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Best Wishes, Chris Travers http://www.2ndquadrant.com PostgreSQL Services, Training, and Support
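The wCTE workaround mentioned in point 2 above might look something like the following sketch. Note that `#strict_mode strict` is from the proof-of-concept patch, not from any released PostgreSQL version, and `purge_foo` is a hypothetical function name:

```sql
-- Sketch: inside a function compiled with the proposed strict mode,
-- a multi-row DELETE can be pushed into a writable CTE; the outer
-- SELECT then always produces exactly one row, so strict mode is happy.
CREATE OR REPLACE FUNCTION purge_foo() RETURNS bigint AS $$
#strict_mode strict
DECLARE
  _deleted bigint;
BEGIN
  WITH gone AS (
    DELETE FROM foo WHERE f1 > 1000 RETURNING 1
  )
  SELECT count(*) INTO STRICT _deleted FROM gone;  -- exactly one row
  RETURN _deleted;
END
$$ LANGUAGE plpgsql;
```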
Re: [HACKERS] Review: Patch to compute Max LSN of Data Pages
On Monday, July 08, 2013 5:16 PM Andres Freund wrote:
On 2013-07-08 17:10:43 +0530, Amit Kapila wrote:
On Monday, July 08, 2013 4:26 PM Andres Freund wrote:
On 2013-07-08 16:17:54 +0530, Hari Babu wrote:

+ This utility can also be used to decide whether backup is required or not when the data page
+ in old-master precedes the last applied LSN in old-standby (i.e., new-master) at the
+ moment of the failover.
+ </para>
+ </refsect1>

I don't think this is safe in any interesting set of cases. Am I missing something?

No, you are not missing anything. It can only be used to find the max LSN in a database, which can avoid further corruption.

Why is the patch submitted documenting it as a use-case then? I find it rather scary if the *patch authors* document a known unsafe use case as one of the known use-cases.

I understand the problem which can occur with the specified use case. I have removed the wrong use case specified above. Thanks for the review; please find the updated patch attached to this mail.

The patch was not compiling on Windows, so I have made the following changes:
a. updated the patch to fix the Windows build
b. a few documentation changes in pg_resetxlog.sgml for a spelling mistake and a minor line change
c. corrected the Copyright year in the file pg_computemaxlsn.c

With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com

pg_computemaxlsn_v11.patch Description: Binary data
Re: [HACKERS] [PoC] pgstattuple2: block sampling to reduce physical read
(2013/07/23 20:02), Greg Smith wrote:
On 7/23/13 2:16 AM, Satoshi Nagayasu wrote:

I've been working on a new pgstattuple function to allow block sampling [1] in order to reduce block reads while scanning a table. A PoC patch is attached.

Take a look at all of the messages linked in https://commitfest.postgresql.org/action/patch_view?id=778 Jaime and I tried to do what you're working on then, including a random block sampling mechanism modeled on the stats_target mechanism. We didn't do that as part of pgstattuple though, which was a mistake. Noah created some test cases as part of his thorough review that were not computing the correct results. Getting the results correct for all of the various types of PostgreSQL tables and indexes ended up being much harder than the sampling part. See http://www.postgresql.org/message-id/20120222052747.ge8...@tornado.leadboat.com in particular for that.

Thanks for the info. I have read the previous discussion. I'm looking forward to seeing more feedback on this approach, in terms of design and performance improvement. So, I have submitted this for the next CF.

This new function, pgstattuple2(), samples only 3,000 blocks (which amounts to 24MB) from the table at random, and estimates several parameters of the entire table.

There should be an input parameter to the function for how much sampling to do, and if it's possible to make the scale for it look like ANALYZE, that's helpful too. I have a project for this summer that includes reviving this topic and making sure it works on some real-world systems. If you want to work on this too, I can easily combine that project into what you're doing.

Yeah, I'm interested in that. Is there something that can be shared?

Regards, -- Satoshi Nagayasu sn...@uptime.jp Uptime Technologies, LLC. http://www.uptime.jp
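As described above, the PoC function would be called like plain pgstattuple but would read only a bounded sample. A sketch of the intended usage (the function name comes from the proposed patch, not a released API; `pgbench_accounts` is just an example table):

```sql
-- pgstattuple2() per the PoC patch: read a random sample of at most
-- 3,000 blocks and scale the measured counts up to the whole table,
-- instead of reading every block as pgstattuple() does.
SELECT * FROM pgstattuple2('pgbench_accounts');
```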
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
On Fri, Sep 13, 2013 at 2:59 PM, Peter Geoghegan p...@heroku.com wrote:

I would suggest letting those other individuals speak for themselves too. Particularly if you're going to name someone who is on vacation like that.

You seem to be under the impression that I'm mentioning Tom's name, or Andres's, because I need to win some kind of an argument. I don't.

We're not going to accept a patch that uses lwlocks in the way that you are proposing.

I mean, if we do the promise tuple thing, and there are multiple unique indexes, what happens when an inserter needs to block pending the outcome of another transaction? They had better go clean up the promise tuples from the other unique indexes that they're trying to insert into, because they cannot afford to hold value locks for a long time, no matter how they're implemented.

As Andres already pointed out, this is not correct. Just to add to what he said, we already have long-lasting value locks in the form of SIREAD locks. SIREAD locks can exist at different levels of granularity, but one of those levels is index-page-level granularity, where they have the function of guarding against concurrent insertions of values that would fall within that page, which just so happens to be the same thing you want to do here. The difference between those locks and what you're proposing here is that they are implemented differently. That is why those were acceptable and this is not.

That could take much longer than just releasing a shared buffer lock, since for each unique index the promise tuple must be re-found from scratch. There are huge issues with additional complexity and bloat. Oh, and now your lightweight locks aren't so lightweight any more.

Yep, totally agreed.

If you simply lock the buffer, or take some other action which freezes out all concurrent modifications to the page, then re-finding the lock is much simpler.
On the other hand, it's much simpler precisely because you've reduced concurrency to the degree necessary to make it simple. And reducing concurrency is bad.

Similarly, complexity and bloat are not great things taken in isolation, but many of our existing locking schemes are already very complex. Tuple locks result in a complex jig that involves locking the tuple via the heavyweight lock manager, performing a WAL-logged modification to the page, and then releasing the lock in the heavyweight lock manager. As here, that is way more expensive than simply grabbing and holding a share-lock on the page. But we get a number of important benefits out of it. The backend remains interruptible while the tuple is locked, the protocol for granting locks is FIFO to prevent starvation, we don't suppress page eviction while the lock is held, we can simultaneously lock arbitrarily large numbers of tuples, and deadlocks are detected and handled cleanly. If those requirements were negotiable, we would surely have negotiated them away already, because the performance benefits would be immense.

If the value locks were made interruptible through some method, such as the promise tuples approach, does that really make deadlocking acceptable?

Yes. It's not possible to prevent all deadlocks. It IS possible to make sure that they are properly detected and that precisely one of the transactions involved is rolled back to resolve the deadlock.

You can hardly compare a buffer's LWLock with a system one that protects critical shared memory structures. We're talking about a shared lock on a single btree leaf page per unique index involved in upserting.

Actually, I can and I am. Buffers ARE critical shared memory structures.

A further problem is that a backend which holds even one lwlock can't be interrupted.
We've had this argument before, and it seems that you don't think that non-interruptibility is a problem, but it is project policy to allow for timely interrupts in all parts of the backend, and we're not going to change that policy for this patch.

I don't think non-interruptibility is a problem? Really, do you think that this kind of inflammatory rhetoric helps anybody? I said nothing of the sort. I recall saying something about an engineering trade-off. Of course I value interruptibility.

I don't see what's inflammatory about that statement. The point is that this isn't the first time you've proposed a change which would harm interruptibility, and it isn't the first time I've objected on precisely that basis. Interruptibility is not a nice-to-have that we can trade away from time to time; it's essential and non-negotiable.

If you're concerned about non-interruptibility, consider XLogFlush(). That does rather a lot of work with WALWriteLock exclusively locked. On a busy system, some backend is very frequently going to experience a non-interruptible wait for the duration of however long it takes to write and flush perhaps a whole segment. All other flushing backends are stuck in non-interruptible waits waiting for that backend to
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
I haven't read the patch, and the btree code is an area I really don't know, so take this for what it's worth.

It seems to me that the nature of the problem is that there will unavoidably be a nexus between the two parts of the code here. We can try to isolate it as much as possible, but we're going to need a bit of a compromise.

I'm imagining a function that takes two target heap buffers and a btree key. It would descend the btree and, holding the leaf page lock, do a try_lock on the heap pages. If it fails to get the locks, then it releases whatever it got and returns for the heap update to find new pages and try again.

This still leaves the potential problem with page splits, and I assume it would still be tricky to call it without unsatisfactorily mixing executor and btree code. But that's as far as I got.

-- greg
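The function described above could be sketched roughly as follows. This is pseudocode only; the name and structure are invented here for illustration, and nothing like it exists in the tree:

```
btree_lock_value_and_heap(index, key, heapbuf1, heapbuf2):
    descend the btree to the leaf page covering key; hold the leaf lock
    for buf in (heapbuf1, heapbuf2):
        if not try_lock(buf):          # conditional acquire, never waits
            release the leaf lock and any heap locks taken so far
            return RETRY               # caller finds new pages, tries again
    return OK                          # leaf page and both heap pages locked
```

The conditional try_lock is what avoids deadlock between the btree descent order and the heap-page locking order, at the cost of occasional retries.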
Re: [HACKERS] Proposal: PL/PgSQL strict_mode
On 14/09/2013 06:53, chris travers wrote:

I am going to suggest that STRICT is semantically pretty far from what is meant here in common speech. I think STRICT here would be confusing. This would be really pretty severe for people coming from Perl or MySQL backgrounds.

The name of the feature doesn't really matter. I chose STRICT because the behaviour is somewhat similar to INTO .. STRICT.

May I suggest SINGLE as a keyword instead? It might be worth having it attached to INSERT, UPDATE, and DELETE statements.

I already suggested the "attach a keyword to every statement" approach, and got shot down quite quickly.

Regards, Marko Tiikkaja
Re: [HACKERS] pg_stat_statements: calls under-estimation propagation
This patch needs documentation. At a minimum, the new calls_underest field needs to be listed in the description of pg_stat_statements.

I have attached a version which includes documentation. pg_stat_statements-identification-v4.patch.gz http://postgresql.1045698.n5.nabble.com/file/n5770844/pg_stat_statements-identification-v4.patch.gz

Pardon for not following the discussion: what exactly does the calls_underest field mean? I couldn't decipher it from the patch. What can an admin do with the value? How does it compare with just bumping up pg_stat_statements.max?

Paraphrasing the documentation: there is a possibility that queries which are tracked intermittently could have their statistics silently reset. The calls_underest field gives an indication that a query has been tracked intermittently and consequently has a higher probability of erroneous statistics. Queries tracked constantly will have a zero value for calls_underest, indicating zero probability of erroneous statistics. A greater value of calls_underest indicates a greater degree of inconsistent tracking, which in turn means a greater possibility of erroneous statistics due to statistics being reset while the query was not being tracked.

Increasing pg_stat_statements.max reduces the possibility of a query being tracked intermittently, but does not address the problem of indicating the probability of erroneous statistics if the query is indeed being tracked intermittently. I do not believe the admin can influence the value of calls_underest, as it is not a GUC parameter.

We have a need to see this patch committed, hence the revived interest in this thread.

regards Sameer

-- View this message in context: http://postgresql.1045698.n5.nabble.com/pg-stat-statements-calls-under-estimation-propagation-tp5738128p5770844.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
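Given the semantics described above, an admin might use the field like this. The calls_underest column comes from the proposed patch, not from any released pg_stat_statements, so this is only a sketch:

```sql
-- Queries with a high calls_underest have been evicted and re-entered
-- most often, so their accumulated statistics are the least trustworthy.
SELECT query, calls, total_time, calls_underest
FROM pg_stat_statements
ORDER BY calls_underest DESC, calls DESC
LIMIT 20;
```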
Re: [HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
Peter == Peter Eisentraut pete...@gmx.net writes:

Peter> Please fix compiler warnings:

Someone should do the same in WaitForBackgroundWorkerStartup so that building with -Werror works. New patch coming shortly.

-- Andrew.
Re: [HACKERS] record identical operator
On 2013-09-13 19:20:11 -0700, Kevin Grittner wrote:
Andres Freund and...@2ndquadrant.com wrote:

Not one that's dependent on padding bytes, null bitmaps that can or cannot be present, and such.

Can you provide an example of where that's an issue with this patch?

I haven't yet tested your patch, but what I am talking about is that e.g.:

  SELECT (ARRAY[1,2,3,NULL])[1:3] = ARRAY[1,2,3];

obviously should be true. But both arrays don't have the same binary representation, since the former has a null bitmap and the latter does not. So, if you had a composite type like (int4[]) and compared that without invoking operators, you'd return false in some cases because of the null bitmaps.

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] unaccent module - two params function should be immutable
2013/9/11 Bruce Momjian br...@momjian.us

On Tue, Feb 19, 2013 at 08:30:29AM +0100, Pavel Stehule wrote:

Hello

There was a proposal to change the function's flag to immutable, so it can be used in indexes:

  CREATE FUNCTION unaccent(regdictionary, text)
    RETURNS text
    AS 'MODULE_PATHNAME', 'unaccent_dict'
    LANGUAGE C STABLE STRICT;

is there any progress?

I have developed the attached patch based on your suggestion. I did not see anything in the code that would make it STABLE, except a lookup of a dictionary library:

  dictOid = get_ts_dict_oid(stringToQualifiedNameList("unaccent"), false);

Yes, it is a risk, but it is similar to time zones and other external data.

Regards Pavel

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +
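The point of marking the two-argument unaccent IMMUTABLE is to allow expression indexes like the following sketch (the `people` table and `name` column are hypothetical; this requires the unaccent extension and the flag change discussed above, since only IMMUTABLE functions may appear in index expressions):

```sql
CREATE EXTENSION IF NOT EXISTS unaccent;

-- With unaccent(regdictionary, text) marked IMMUTABLE, it can be
-- used in an expression index:
CREATE INDEX people_name_unaccented_idx
    ON people (unaccent('unaccent'::regdictionary, name));

-- A query written with the same expression can then use the index:
SELECT * FROM people
WHERE unaccent('unaccent'::regdictionary, name) = 'Jose';
```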
Re: [HACKERS] review: pgbench progress report improvements
2013/9/13 Fabien COELHO coe...@cri.ensmp.fr

Hello,

About patch eols:

  postgresql> patch -p1 < ../pgbench-measurements-v2.patch
  patching file contrib/pgbench/pgbench.c
  patching file doc/src/sgml/pgbench.sgml

it can depend on the OS. I did tests on Fedora 14, and to patch without warnings I had to use the dos2unix tool.

Hmmm. I use a Linux Ubuntu laptop, so generating DOS end of lines is unlikely if it is not there at the beginning. Running dos2unix on the patch file locally does not seem to change anything. So I assume that the patch encoding was changed somewhere along the path you used to get it.

It is possible, but this is only a minor issue.

Pavel

-- Fabien.
Re: [HACKERS] review: pgbench progress report improvements
2013/9/12 Fabien COELHO coe...@cri.ensmp.fr

Hello Pavel,

Thanks for your review.

* patched with minor warning
* compiles cleanly
* zero impact on PostgreSQL server functionality
* it does what was in the proposal
** change 5sec progress as default (instead of no progress)
** finalize rate limit support - fixes the latency calculation

Just a point about the motivation: the rationale for having a continuous progress report is that benchmarking is subject to possibly long warmup times, and thus a test may have to run for hours to be significant. I find running a command for hours without any hint about what is going on quite annoying.

* code is clean
* documentation is included
* there are no voices against this patch, and this patch increases pgbench's usability

I have only one question. When I tested this patch with throttling, I got very similar values of lag.

Yep. That is just good!

What is the sense, or what is the semantics, of this value?

The lag measures the health of the stochastic process. Actually, it measures how far behind schedule the clients are when performing throttled transactions. If it were to increase, that would mean that something is amiss, possibly not enough client threads, or other issues. If it is small, then all is well.

It is not documented in detail.

It is documented in the section about the --rate option, see http://www.postgresql.org/docs/devel/static/pgbench.html

ok, I see it now. So this patch is ready for commit.

Regards Pavel

Should this value be printed in this form on every row? We can print some warning when lag is higher than latency instead?

Hmmm... what is important is when the lag changes values. Generally one would indeed expect it to be smaller than the latency, but that is not really possible when transactions are very fast, say under -S with read-only queries that hit the memory cache.
Also, the problem with printing warnings is that it changes the output format, and it seems to me more useful to print the value, so that it can be processed automatically and simply.

Also, from a remote client perspective, say a web application, the overall latency is the lag plus the transaction latency: you first wait to get through the database (lag), and then you can perform your transaction (latency).

Or we can use this value, but it should be better documented, please.

Is the documentation pointed to above enough?

-- Fabien.
[HACKERS] GUC for data checksums
Attached is a small patch to add a new GUC to report whether data checksums for a particular cluster are enabled. The only way to get this info, AFAIK, is to look into pg_control and the version number used, but I'd welcome a way to access this remotely, too. If there aren't any objections I'll add this to the CF.

-- Thanks Bernd

data_checksums_guc.patch Description: Binary data
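With such a read-only GUC in place, checking the setting remotely would be as simple as the following sketch (the parameter name `data_checksums` is assumed from the proposal, not confirmed here):

```sql
-- A read-only GUC exposing the cluster's checksum state could be
-- queried over any ordinary connection:
SHOW data_checksums;

-- or, equivalently, from within a query:
SELECT current_setting('data_checksums');
```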
Re: [HACKERS] record identical operator
Andres Freund and...@2ndquadrant.com wrote:

what I am talking about is that e.g.: SELECT (ARRAY[1,2,3,NULL])[1:3] = ARRAY[1,2,3]; obviously should be true.

The patch does not change the behavior of the = operator for any type under any circumstances.

But both arrays don't have the same binary representation since the former has a null bitmap, the latter not. So, if you had a composite type like (int4[]) and would compare that without invoking operators you'd return something false in some cases because of the null bitmaps.

Not for the = operator. The new identical operator would find them to not be identical, though. Since the new operator is only for the record type, I need to wrap the values in your example:

test=# SELECT row((ARRAY[1,2,3,NULL])[1:3])::record
test-#      = row(ARRAY[1,2,3])::record;
 ?column?
----------
 t
(1 row)

test=# SELECT row((ARRAY[1,2,3,NULL])[1:3])::record
test-#      === row(ARRAY[1,2,3])::record;
 ?column?
----------
 f
(1 row)

Or, to borrow from the citext example using arrays:

test=# CREATE TABLE array_table (
test(#   id serial primary key,
test(#   nums int4[]
test(# );
CREATE TABLE
test=# INSERT INTO array_table (nums)
test-# VALUES ((ARRAY[1,2,3,NULL])[1:3]), (ARRAY[1,2,3]),
test-#        (ARRAY[1,2,3]), (NULL), (NULL);
INSERT 0 5
test=# CREATE MATERIALIZED VIEW array_matview AS
test-# SELECT * FROM array_table;
SELECT 5
test=# CREATE UNIQUE INDEX array_matview_id
test-# ON array_matview (id);
CREATE INDEX
test=# select * from array_matview;
 id |  nums
----+---------
  1 | {1,2,3}
  2 | {1,2,3}
  3 | {1,2,3}
  4 |
  5 |
(5 rows)

Note that the on-disk representation of the row where id = 1 differs from the on-disk representation where id in (2,3), both in the table and the matview.

test=# SELECT *
test-# FROM array_matview m
test-# FULL JOIN array_table t ON (t.id = m.id AND t === m)
test-# WHERE t.id IS NULL OR m.id IS NULL;
 id | nums | id | nums
----+------+----+------
(0 rows)

... so the query looking for work for the RMVC statement finds nothing to do.
test=# UPDATE array_table SET nums = (ARRAY[1,2,3,NULL])[1:3]
test-# WHERE id between 1 and 2;
UPDATE 2

Now we have added an unnecessary bitmap to the on-disk storage of the value where id = 2.

test=# SELECT *
test-# FROM array_matview m
test-# FULL JOIN array_table t ON (t.id = m.id AND t === m)
test-# WHERE t.id IS NULL OR m.id IS NULL;
 id |  nums   | id |  nums
----+---------+----+---------
    |         |  2 | {1,2,3}
  2 | {1,2,3} |    |
(2 rows)

... and the query sees that they differ.

test=# REFRESH MATERIALIZED VIEW CONCURRENTLY array_matview;
REFRESH MATERIALIZED VIEW
test=# SELECT *
test-# FROM array_matview m
test-# FULL JOIN array_table t ON (t.id = m.id AND t === m)
test-# WHERE t.id IS NULL OR m.id IS NULL;
 id | nums | id | nums
----+------+----+------
(0 rows)

The REFRESH causes them to match again, and later REFRESH runs won't see a need to do any work there unless the on-disk representation changes again.

As far as I can see, we have four choices:

(1) Never update values that are equal, even if they appear different to the users, as was demonstrated with the citext example.

(2) Require every data type which can be used in a matview to implement some new operator or function for "identical". Perhaps that could be mitigated to only implement it if equal values can have user-visible differences.

(3) Embed special cases into "record identical" tests for types known to allow multiple on-disk representations which have no user-visible differences.

(4) Base the need to update a matview column on whether its on-disk representation is identical to what a new run of the defining query would generate. If this causes performance problems for use of a given type in a matview, one possible solution would be to modify that particular type to use a canonical format when storing a value into a record. For example, storing an array which has a bitmap of null values even though there are no nulls in the array could strip the bitmap as it is stored to the record.

Currently we are using (4).
I only included (3) for completeness; even just typing it as a hypothetical made me want to take a shower. (1) seems pretty horrid, too. (2) isn't evil, exactly, but any types which allowed user-visible differences in equal values would not exhibit correct behavior in matviews until and unless an "identical" operator was added. It might also perform noticeably worse than (4). Option (4) has the advantage of showing correct logical behavior in all types immediately, and restricting any performance fixes to the types with the inconsistent storage formats.

-- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] git apply vs patch -p1
Folks,

Lately I've been running into a lot of reports of false conflicts reported by git apply. The most recent one was the points patch, which git apply rejected for completely fictitious reasons (it claimed that the patch was trying to create a new file where a file already existed, which it wasn't).

I think we should modify the patch review and developer instructions to recommend always using patch -p1 (or -p0, depending), even if the patch was produced with git diff.

Thoughts?

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] pg_stat_statements: calls under-estimation propagation
On Sat, 2013-09-14 at 03:51 -0700, samthakur74 wrote:

We have a need to see this patch committed, hence the revived interest in this thread

You have added this email to the commit fest, but it contains no patch. Please add the email with the actual patch. Maybe the author should be given a chance to update the patches, though, because they are quite old.
[HACKERS] Assertions in PL/PgSQL
Hi,

Attached is a patch for supporting assertions in PL/PgSQL. These are similar to the Assert() backend macro: they can be disabled at compile time, but when enabled, abort execution if the passed expression is not true. A simple example:

CREATE FUNCTION delete_user(username text) RETURNS VOID AS $$
BEGIN
  DELETE FROM users WHERE users.username = delete_user.username;
  ASSERT FOUND;
END
$$ LANGUAGE plpgsql;

SELECT delete_user('mia');
ERROR: Assertion on line 4 failed
CONTEXT: PL/pgSQL function delete_user(text) line 4 at ASSERT

Again, I'll add this to the open commitfest, but feedback is greatly appreciated.

Regards, Marko Tiikkaja

*** a/doc/src/sgml/plpgsql.sgml
--- b/doc/src/sgml/plpgsql.sgml
***************
*** 3528,3533 ****
--- 3528,3596 ----
  RAISE unique_violation USING MESSAGE = 'Duplicate user ID: ' || user_id;
     </para>
    </note>

+  <sect2 id="plpgsql-assert">
+   <title>Assertions</title>
+
+   <para>
+    <literal>Assertions</literal> provide a way to check that the
+    internal state of a function is as expected.  For example:
+ <programlisting>
+ CREATE FUNCTION delete_user(username text) RETURNS VOID AS $$
+ BEGIN
+   DELETE FROM users WHERE users.username = delete_user.username;
+   ASSERT FOUND;
+ END
+ $$ LANGUAGE plpgsql;
+
+ SELECT delete_user('mia');
+ ERROR: Assertion on line 4 failed
+ CONTEXT: PL/pgSQL function delete_user(text) line 4 at ASSERT
+ </programlisting>
+
+    One could implement the equivalent functionality with a conditional
+    RAISE EXCEPTION statement, but assertions have two major differences:
+   <itemizedlist>
+    <listitem>
+     <para>
+      They're a lot faster to write than the equivalent
+      IF .. THEN RAISE EXCEPTION .. END IF constructs.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      They can be (and are by default) disabled in production
+      environments.  A disabled assertion only incurs a negligible
+      compile-time overhead and no execution time overhead, so you
+      can write complex checks for development environments without
+      having to worry about performance.
+     </para>
+    </listitem>
+   </itemizedlist>
+  </para>
+
+  <para>
+   The configuration parameter <varname>plpgsql.enable_assertions</varname>
+   controls whether assertions are enabled.  Note that the value of
+   this parameter only affects subsequent compilations of
+   <application>PL/pgSQL</application> functions, but not statements
+   already compiled in the current session.
+  </para>
+
+  <para>
+   It is also possible to enable assertions in a single function
+   (possibly overriding the global setting) by using a compile
+   option:
+ <programlisting>
+ CREATE FUNCTION delete_user(username text) RETURNS VOID AS $$
+ #enable_assertions
+ BEGIN
+   DELETE FROM users WHERE users.username = delete_user.username;
+   ASSERT FOUND;
+ END
+ $$ LANGUAGE plpgsql;
+ </programlisting>
+  </para>
+ </sect2>
+
  </sect1>

  <sect1 id="plpgsql-trigger">

*** a/src/pl/plpgsql/src/pl_comp.c
--- b/src/pl/plpgsql/src/pl_comp.c
*************** do_compile(FunctionCallInfo fcinfo,
*** 351,356 ****
--- 351,357 ----
  	function->fn_cxt = func_cxt;
  	function->out_param_varno = -1;		/* set up for no OUT param */
  	function->resolve_option = plpgsql_variable_conflict;
+ 	function->enable_assertions = plpgsql_enable_assertions;

  	if (is_dml_trigger)
  		function->fn_is_trigger = PLPGSQL_DML_TRIGGER;

*************** plpgsql_compile_inline(char *proc_source)
*** 847,852 ****
--- 848,854 ----
  	function->fn_cxt = func_cxt;
  	function->out_param_varno = -1;		/* set up for no OUT param */
  	function->resolve_option = plpgsql_variable_conflict;
+ 	function->enable_assertions = plpgsql_enable_assertions;

  	plpgsql_ns_init();
  	plpgsql_ns_push(func_name);

*** a/src/pl/plpgsql/src/pl_exec.c
--- b/src/pl/plpgsql/src/pl_exec.c
***************
*** 133,138 ****
--- 133,140 ----
  static int exec_stmt_dynexecute(PLpgSQL_execstate *estate,
  				PLpgSQL_stmt_dynexecute *stmt);
  static int exec_stmt_dynfors(PLpgSQL_execstate *estate,
  				PLpgSQL_stmt_dynfors *stmt);
+ static int exec_stmt_assert(PLpgSQL_execstate *estate,
+ 				PLpgSQL_stmt_assert *stmt);

  static void plpgsql_estate_setup(PLpgSQL_execstate *estate,
  				PLpgSQL_function *func,

*************** exec_stmt(PLpgSQL_execstate *estate, PLpgSQL_stmt *stmt)
*** 1466,1471 ****
--- 1468,1477 ----
  			rc = exec_stmt_close(estate, (PLpgSQL_stmt_close *) stmt);
  			break;

+ 		case PLPGSQL_STMT_ASSERT:
+ 			rc = exec_stmt_assert(estate, (PLpgSQL_stmt_assert *) stmt);
+ 			break;
+
Re: [HACKERS] Assertions in PL/PgSQL
On 14/09/2013 20:47, I wrote: These are similar to the Assert() backend macro: they can be disabled during compile time, but when enabled, abort execution if the passed expression is not true. And by compile time here, I mean at the time when the PL/PgSQL function is compiled, not the postgres server binary. Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
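For readers following along, the hand-rolled equivalent that the proposed ASSERT is meant to replace looks roughly like this (a sketch in standard PL/pgSQL; the users table and the message text are illustrative):

```sql
CREATE FUNCTION delete_user(username text) RETURNS VOID AS $$
BEGIN
    DELETE FROM users WHERE users.username = delete_user.username;
    -- The manual spelling of "ASSERT FOUND": raise if the DELETE
    -- matched no rows.
    IF NOT FOUND THEN
        RAISE EXCEPTION 'assertion failed: no user named %', username;
    END IF;
END
$$ LANGUAGE plpgsql;
```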
Re: [HACKERS] record identical operator
On 2013-09-14 11:25:52 -0700, Kevin Grittner wrote:

Andres Freund and...@2ndquadrant.com wrote:

what I am talking about is that e.g.:

SELECT (ARRAY[1,2,3,NULL])[1:3] = ARRAY[1,2,3];

obviously should be true.

The patch does not change the behavior of the = operator for any type under any circumstances.

Yes, sure. I wasn't thinking you would. But both arrays don't have the same binary representation since the former has a null bitmap and the latter does not. So, if you had a composite type like (int4[]) and would compare that without invoking operators, you'd return false in some cases because of the null bitmaps.

Not for the = operator. The new identical operator would find them to not be identical, though.

Yep. And I think that's a problem if exposed to SQL. People won't understand the hazards and end up using it because it's faster or some such.

Since the new operator is only for the record type, I need to wrap the values in your example:

Yes.

The REFRESH causes them to match again, and later REFRESH runs won't see a need to do any work there unless the on-disk representation changes again.

Yes, I understand that the matview code itself will just perform superfluous work. We use such comparisons in other parts of the code similarly.

As far as I can see, we have four choices:

(1) Never update values that are equal, even if they appear different to the users, as was demonstrated with the citext example.

I think introducing a noticeable amount of infrastructure for this just because of citext is a bad idea. At some point we need to replace citext with proper case-insensitive collation support - then it really might become necessary.

(2) Require every data type which can be used in a matview to implement some new operator or function for identical. Perhaps that could be mitigated to only implement it if equal values can have user-visible differences.

That basically would require adding a new member to btree opclasses that btrees don't need themselves... Hm.
(3) Embed special cases into record identical tests for types known to allow multiple on-disk representations which have no user-visible differences.

I think this is a complete no-go: a) I don't foresee that we know of all these cases, and b) it wouldn't be extensible.

Oh. Now that I've read further, I see you feel the same. Good ;)

(4) Base the need to update a matview column on whether its on-disk representation is identical to what a new run of the defining query would generate. If this causes performance problems for use of a given type in a matview, one possible solution would be to modify that particular type to use a canonical format when storing a value into a record. For example, storing an array which has a bitmap of null values even though there are no nulls in the array could strip the bitmap as it is stored to the record.

If matview refreshes weren't using plain SQL and thus wouldn't require exposing that operator to SQL, I wouldn't have a problem with this... There's the ungodly ugly choice of having a matview_equal function (or operator) that checks if we're doing a refresh atm...

Greetings,

Andres Freund

--
Andres Freund	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
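The representational issue under discussion can be seen from plain SQL, reusing the example from earlier in the thread (per that discussion, the slice result carries a null bitmap internally while the literal does not):

```sql
-- Both sides contain the elements {1,2,3}; the = operator compares
-- contents, not on-disk representation, so this is true.
SELECT (ARRAY[1,2,3,NULL])[1:3] = ARRAY[1,2,3];  -- true

-- The proposed "identical" comparison would report these as different,
-- since only one of the two values carries a null bitmap.
```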
Re: [HACKERS] git apply vs patch -p1
On 09/14/2013 02:37 PM, Josh Berkus wrote:

Folks, Lately I've been running into a lot of reports of false conflicts reported by git apply. The most recent one was the points patch, which git apply rejected for completely fictitious reasons (it claimed that the patch was trying to create a new file where a file already existed, which it wasn't). I think we should modify the patch review and developer instructions to recommend always using patch -p1 (or -p0, depending), even if the patch was produced with git diff. Thoughts?

FWIW that's what I invariably use. You do have to be careful to git-add/git-rm any added/deleted files, which git-apply does for you (as well as renames) - I've been caught by that a couple of times.

cheers

andrew
Re: [HACKERS] Proposal: PL/PgSQL strict_mode
(Oops, just realized I only replied to a part of your message. I'm blaming it on my lack of sleep.)

On 14/09/2013 06:53, chris travers wrote:

2) If you're running in strict mode and you want to insert/update/delete more than one row, things get a bit uglier; a wCTE would work for some cases. If strict mode doesn't affect EXECUTE (see point 1 above), that could work too. Or maybe there could be a new command which runs a query, discards the results and ignores the number of rows processed.

Yeah, I am worried about this one. I am concerned that if you can't disable on a statement by statement basis, then you have a problem where you end up removing the mode from the function and then it becomes a lot harder for everyone maintaining the function to have a clear picture of what is going on. I am further worried that hacked ugly code ways around it will introduce plenty of other maintenance pain points that will be worse than what you are solving.

I completely agree. If you can't turn the option on globally, you're bound to hit problems at some point. My thinking currently is that this mode would require adding a new type of statement (I'd call it PERFORM, but I know that one's been taken, so please don't get hung up on that):

PERFORM UPDATE foo SET ..;

or

PERFORM SELECT foo(id) FROM ..;

This statement would allow you to run any statement in strict mode without hitting the "more than one row processed" error message.

I'll be adding this to the open commitfest in hopes of getting some feedback on this idea (I'm prepared to hear a lot of "you're crazy!"), but feel free to comment away any time you please.

Well, I don't know if my feedback above is helpful, but there it is ;-)

Thanks for your input!

Regards,
Marko Tiikkaja
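For comparison, the workaround available today checks the row count by hand; the PERFORM form at the end is the hypothetical syntax from this proposal, not something PostgreSQL currently accepts (table and column names are illustrative):

```sql
CREATE FUNCTION archive_user(uid int) RETURNS VOID AS $$
DECLARE
    _rows bigint;
BEGIN
    UPDATE users SET archived = true WHERE id = uid;
    -- Manual one-row check, as described earlier in the thread.
    GET DIAGNOSTICS _rows = ROW_COUNT;
    IF _rows <> 1 THEN
        RAISE EXCEPTION 'expected 1 row, got %', _rows;
    END IF;
END
$$ LANGUAGE plpgsql;

-- Hypothetical, under this proposal: run a multi-row statement inside a
-- strict-mode function without triggering the one-row check.
-- PERFORM UPDATE users SET archived = true WHERE last_login < '2010-01-01';
```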
Re: [HACKERS] git apply vs patch -p1
On 2013-09-14 15:03:52 -0400, Andrew Dunstan wrote:

On 09/14/2013 02:37 PM, Josh Berkus wrote:

Folks, Lately I've been running into a lot of reports of false conflicts reported by git apply. The most recent one was the points patch, which git apply rejected for completely fictitious reasons (it claimed that the patch was trying to create a new file where a file already existed, which it wasn't). I think we should modify the patch review and developer instructions to recommend always using patch -p1 (or -p0, depending), even if the patch was produced with git diff. Thoughts?

FWIW that's what I invariably use. You do have to be careful to git-add/git-rm any added/deleted files, which git-apply does for you (as well as renames) - I've been caught by that a couple of times.

git reset?

Greetings,

Andres Freund

--
Andres Freund	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
[HACKERS] json docs fixup
While writing slides for pgopen next week, I noticed that the JSON docs on json_populate_record and json_populate_recordset contain this sentence: A column may only be specified once. IIRC we removed that restriction during development, so unless there is a squawk I am going to simply remove that sentence where it occurs. Or should we say something like: If a column is specified more than once, the last value is used. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
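A quick way to check which wording matches the actual behaviour (a sketch; the type and values are illustrative, and the last-value-wins outcome is the one the suggested wording above describes):

```sql
CREATE TYPE kv AS (a int, b text);

-- "a" is specified twice; per the suggested wording, the last value wins.
SELECT * FROM json_populate_record(null::kv, '{"a": 1, "a": 2, "b": "x"}');
-- expected under that reading: a = 2, b = x
```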
Re: [HACKERS] Assertions in PL/PgSQL
On Sat, Sep 14, 2013 at 1:52 PM, Marko Tiikkaja ma...@joh.to wrote:

On 14/09/2013 20:47, I wrote:

These are similar to the Assert() backend macro: they can be disabled during compile time, but when enabled, abort execution if the passed expression is not true.

Hi,

That's very good, thanks.

And by "compile time" here, I mean at the time when the PL/PgSQL function is compiled, not the postgres server binary.

And "compile time" means when the function is created or replaced? Or the first time it is used?

If the second, why not have a plpgsql.enable_assert variable, or a directive you can use per function? Then I can keep the ASSERTs and activate them by replacing the function.

--
Jaime Casanova	www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
Phone: +593 4 5107566	Cell: +593 987171157
Re: [HACKERS] git apply vs patch -p1
On 09/14/2013 03:08 PM, Andres Freund wrote:

On 2013-09-14 15:03:52 -0400, Andrew Dunstan wrote:

On 09/14/2013 02:37 PM, Josh Berkus wrote:

Folks, Lately I've been running into a lot of reports of false conflicts reported by git apply. The most recent one was the points patch, which git apply rejected for completely fictitious reasons (it claimed that the patch was trying to create a new file where a file already existed, which it wasn't). I think we should modify the patch review and developer instructions to recommend always using patch -p1 (or -p0, depending), even if the patch was produced with git diff. Thoughts?

FWIW that's what I invariably use. You do have to be careful to git-add/git-rm any added/deleted files, which git-apply does for you (as well as renames) - I've been caught by that a couple of times.

git reset?

Yes, of course you can roll back as long as you haven't published your commits. But it's a nuisance.

cheers

andrew
[HACKERS] PL Code Archive Proposal
Hi,

In my efforts to allow PostgreSQL users to be able to fully use the server even when not granted file system level access to it, came the question of PL lib code management. Where do you manage the library code you need, those parts of your code that are not exposed at the SQL level?

For example, when doing plpython you still need to install files on the server's file system each time you want to be able to import packages from your stored procedures. The Python community seems to have finally solved that problem and now offers a facility (called wheel) comparable to Java .jar files, see details at https://pypi.python.org/pypi/wheel. I don't know about the Perl and TCL communities.

When thinking about a way to benefit from those facilities in our PL infrastructure, we would need to be able to upload an archive file in a suitable format and I guess register per-PL handlers for those archives: storage and loading have to be considered.

CREATE ARCHIVE schema.name
    LANGUAGE plpythonu
    AS $$ binary blob here, maybe base64 encoded, PL dependent $$;

The standard saith that in the case of PL/Java a specific function's classpath is composed of all those JAR archives that you've been registering against the same schema as where you put the function in. If we choose to follow that model then any function created in the same schema and language as any given archive is going to be able to import things from it: the archive will be LOADed (whatever that means in your PL of choice) when the function is compiled and used.

Now, uploading a binary file and storing it in $PGDATA looks a lot like what we're still talking about for the DSO modules bits. So here's another way to think about it, where we don't need any language feature:

CREATE ARCHIVE schema.name
    LANGUAGE plpythonu
    WITH 'path/to/component.py' AS $$ … $$,
         'path/to/__init__.py' AS $$ … $$,
         …;

That would just upload given text/plain contents on the file system and arrange for the language runtime to be able to use it.
For Python, that means tweaking PYTHONPATH.

Regards,
--
Dimitri Fontaine	http://2ndQuadrant.fr
PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] information schema parameter_default implementation
Here is an updated patch which fixes the bug you have pointed out.

On Thu, 2013-01-31 at 18:59 +0500, Ali Dar wrote:

I checked out your patch. There seems to be an issue when we have OUT parameters after the DEFAULT values.

Fixed.

Some other minor observations:

1) Some variables are not aligned in pg_get_function_arg_default().

Are you referring to indentation issues? I think the indentation is good, but pgindent will fix it anyway.

2) I found the following check a bit confusing, maybe you can make it better: if (!argmodes || argmodes[i] == PROARGMODE_IN || argmodes[i] == PROARGMODE_INOUT || argmodes[i] == PROARGMODE_VARIADIC)

Factored that out into a separate helper function.

2) inputargn can be assigned in declaration.

I'd prefer to initialize it close to where it is used.

3) Function level comment for pg_get_function_arg_default() is missing.

I think the purpose of the function is clear.

4) You can also add comments inside the function, for example the comment for the line: nth = inputargn - 1 - (proc->pronargs - proc->pronargdefaults);

Suggestion?

5) I think the line added in the documentation (information_schema.sgml) is very long. Consider revising. Maybe change from: "The default expression of the parameter, or null if none or if the function is not owned by a currently enabled role." to: "The default expression of the parameter, or null if none was specified. It will also be null if the function is not owned by a currently enabled role." I don't know what exactly you mean by "function is not owned by a currently enabled role"?

I think this style is used throughout the documentation of the information schema. We need to keep the descriptions reasonably compact, but I'm willing to entertain other opinions.
From 36f25fa2ec96879bda1993818db9a9632d8ac340 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <pete...@gmx.net>
Date: Sat, 14 Sep 2013 15:55:54 -0400
Subject: [PATCH] Implement information_schema.parameters.parameter_default column

Reviewed-by: Ali Dar <ali.munir@gmail.com>
---
 doc/src/sgml/information_schema.sgml            |  9
 src/backend/catalog/information_schema.sql      |  9 +++-
 src/backend/utils/adt/ruleutils.c               | 72 +
 src/include/catalog/catversion.h                |  2 +-
 src/include/catalog/pg_proc.h                   |  2 +
 src/include/utils/builtins.h                    |  1 +
 src/test/regress/expected/create_function_3.out | 33 +++-
 src/test/regress/sql/create_function_3.sql      | 24 +
 8 files changed, 148 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/information_schema.sgml b/doc/src/sgml/information_schema.sgml
index 22e17bb..22f43c8 100644
--- a/doc/src/sgml/information_schema.sgml
+++ b/doc/src/sgml/information_schema.sgml
@@ -3323,6 +3323,15 @@ <title><literal>parameters</literal> Columns</title>
        in future versions.)
        </entry>
      </row>
+
+     <row>
+      <entry><literal>parameter_default</literal></entry>
+      <entry><type>character_data</type></entry>
+      <entry>
+       The default expression of the parameter, or null if none or if the
+       function is not owned by a currently enabled role.
+      </entry>
+     </row>
    </tbody>
   </tgroup>
  </table>

diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql
index c5f7a8b..fd706e3 100644
--- a/src/backend/catalog/information_schema.sql
+++ b/src/backend/catalog/information_schema.sql
@@ -1133,10 +1133,15 @@ CREATE VIEW parameters AS
            CAST(null AS sql_identifier) AS scope_schema,
            CAST(null AS sql_identifier) AS scope_name,
            CAST(null AS cardinal_number) AS maximum_cardinality,
-           CAST((ss.x).n AS sql_identifier) AS dtd_identifier
+           CAST((ss.x).n AS sql_identifier) AS dtd_identifier,
+           CAST(
+             CASE WHEN pg_has_role(proowner, 'USAGE')
+                  THEN pg_get_function_arg_default(p_oid, (ss.x).n)
+                  ELSE NULL END
+             AS character_data) AS parameter_default
     FROM pg_type t, pg_namespace nt,
-         (SELECT n.nspname AS n_nspname, p.proname, p.oid AS p_oid,
+         (SELECT n.nspname AS n_nspname, p.proname, p.oid AS p_oid, p.proowner,
                  p.proargnames, p.proargmodes,
                  _pg_expandarray(coalesce(p.proallargtypes, p.proargtypes::oid[])) AS x
          FROM pg_namespace n, pg_proc p

diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 9a1d12e..5a05396 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -2265,6 +2265,78 @@ static char *generate_function_name(Oid funcid, int nargs,
 	return argsprinted;
 }

+static bool
+is_input_argument(int nth, const char *argmodes)
+{
+	return (!argmodes
+			|| argmodes[nth] == PROARGMODE_IN
+			|| argmodes[nth] == PROARGMODE_INOUT
+			|| argmodes[nth] == PROARGMODE_VARIADIC);
+}
+
+Datum
+pg_get_function_arg_default(PG_FUNCTION_ARGS)
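With the patch applied, the new column should be queryable like any other attribute of information_schema.parameters; a sketch (the function name and schema are illustrative, and the exact output text depends on the patch):

```sql
CREATE FUNCTION add_defaults(a int, b int DEFAULT 42) RETURNS int
    LANGUAGE sql AS 'SELECT a + b';

SELECT parameter_name, parameter_default
FROM information_schema.parameters
WHERE specific_schema = 'public'
  AND specific_name LIKE 'add_defaults%'
ORDER BY ordinal_position;
-- "a" should show a null parameter_default; "b" should show the
-- default expression, 42.
```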
[HACKERS] Where to load modules from?
Hi,

This topic comes back at every release, more often now that we have proper Extensions with the ability to dump & restore. Lately the guys from the Open Shift project (a Red Hat team) have asked for a way to load DSO module files from a user-owned directory. The way they make that safe is by using cgroups and SELinux, IIUC.

We can attack the problem in several ways:

- have an initdb switch to tweak the library path per cluster,

- have a superuser-only GUC to tweak the library path,

- consider on-disk extensions as templates and move their module files somewhere private in $PGDATA and load the code from there. That would allow OS upgrades not to impact running instances until they do ALTER EXTENSION UPDATE; and allowing co-existence of different versions of the same extension in different clusters of the same major version, and maybe in separate databases of the same cluster in some cases (depends on the extension's module specifics),

- do nothing, even though the current solution is clearly broken, as in not allowing us to answer several user needs and preventing us from implementing full support (e.g. base backups, hot standby) for extensions.

This proposal comes with no patch because I think we are able to understand it without that, so that it would only be a waste of everybody's time to attach code for a random solution on the list here to that email. Or consider that the fourth point is currently the only one addressed in this very proposal…

Regards,
--
Dimitri Fontaine	http://2ndQuadrant.fr
PostgreSQL : Expertise, Formation et Support
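For context, the closest existing knob is the dynamic_library_path GUC, which already lets a superuser prepend directories to the module search path; the options above go beyond it. A config sketch (the directory is illustrative):

```
# postgresql.conf
# Colon-separated list; $libdir expands to the installation's
# standard module directory.
dynamic_library_path = '/srv/pg/cluster1/modules:$libdir'
```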
Re: [HACKERS] Assertions in PL/PgSQL
On 2013-09-14 21:55, Jaime Casanova wrote:

On Sat, Sep 14, 2013 at 1:52 PM, Marko Tiikkaja ma...@joh.to wrote:

And by "compile time" here, I mean at the time when the PL/PgSQL function is compiled, not the postgres server binary.

And "compile time" means when the function is created or replaced? Or the first time it is used?

Uhh.. I have to admit, I'm not sure. I think this would be when you CREATE the function for the backend that created the function, and on the first call in other backends.

If the second, why not have a plpgsql.enable_assert variable, or a directive you can use per function? Then I can keep the ASSERTs and activate them by replacing the function.

The patch supports a compiler option to override the global option and enable it per function:

create function foof() returns void as $$
#enable_assertions
begin
  -- failure
  assert 1 > 2;
end
$$ language plpgsql;

Does this address your wishes?

Regards,
Marko Tiikkaja
Re: [HACKERS] Assertions in PL/PgSQL
Hello

There is a significant issue - a new reserved keyword. There is a high probability that a lot of users have functions named assert.

I like this functionality, but I dislike any compatibility break for a feature that can be implemented as an extension without any loss of compatibility or functionality.

So can you redesign this without a new keyword?

Regards

Pavel

2013/9/14 Marko Tiikkaja ma...@joh.to

Hi,

Attached is a patch for supporting assertions in PL/PgSQL. These are similar to the Assert() backend macro: they can be disabled during compile time, but when enabled, abort execution if the passed expression is not true. A simple example:

CREATE FUNCTION delete_user(username text) RETURNS VOID AS $$
BEGIN
    DELETE FROM users WHERE users.username = delete_user.username;
    ASSERT FOUND;
END
$$ LANGUAGE plpgsql;

SELECT delete_user('mia');
ERROR:  Assertion on line 4 failed
CONTEXT:  PL/pgSQL function delete_user(text) line 4 at ASSERT

Again, I'll add this to the open commitfest, but feedback is greatly appreciated.

Regards,
Marko Tiikkaja
Re: [HACKERS] Assertions in PL/PgSQL
On 14/09/2013 15:18, Marko Tiikkaja ma...@joh.to wrote:

On 2013-09-14 21:55, Jaime Casanova wrote:

On Sat, Sep 14, 2013 at 1:52 PM, Marko Tiikkaja ma...@joh.to wrote:

And by "compile time" here, I mean at the time when the PL/PgSQL function is compiled, not the postgres server binary.

And "compile time" means when the function is created or replaced? Or the first time it is used?

Uhh.. I have to admit, I'm not sure. I think this would be when you CREATE the function for the backend that created the function, and on the first call in other backends.

If the second, why not have a plpgsql.enable_assert variable, or a directive you can use per function? Then I can keep the ASSERTs and activate them by replacing the function.

The patch supports a compiler option to override the global option and enable it per function:

create function foof() returns void as $$
#enable_assertions
begin
  -- failure
  assert 1 > 2;
end
$$ language plpgsql;

Does this address your wishes?

That's exactly what I wanted, thanks.

--
Jaime Casanova
2ndQuadrant: Your PostgreSQL partner
Re: [HACKERS] Assertions in PL/PgSQL
On 2013-09-14 22:24, Pavel Stehule wrote:

There is a significant issue - a new reserved keyword. There is a high probability that a lot of users have functions named assert.

Someone may prove me wrong here, but to me it looks like this would only prevent ASSERT from being used as the name of a PL/PgSQL variable. That's still a backwards compatibility break, but the case you were worried about should still work:

=# create function assert(boolean) returns boolean as $$ begin if $1 is not true then raise exception 'custom assert()'; end if; return true; end $$ language plpgsql;
CREATE FUNCTION
=# create function f() returns int as $$
$# begin
$#   assert false;
$#   perform assert(true);
$#   if assert(true) then
$#     perform assert(false);
$#   end if;
$# end
$# $$ language plpgsql;
CREATE FUNCTION
=# select f();
ERROR:  custom assert()
CONTEXT:  SQL statement "SELECT assert(false)"
PL/pgSQL function f() line 6 at PERFORM

I like this functionality, but I dislike any compatibility break for a feature that can be implemented as an extension without any loss of compatibility or functionality.

I don't see how this could be implemented as an extension, even if you write it in C. I think being able to turn assertions off in production with no execution time overhead is what justifies having this in-core. The nicer syntax isn't enough (compared to, say, PERFORM assert(..)). And if we're only breaking code for people who use assert as the variable name, I'd say we go for it.

So can you redesign this without a new keyword?

I don't see an easy way to do that, but I'm not too familiar with PL/PgSQL's parser so maybe I'm just missing something.

Regards,
Marko Tiikkaja
Re: [HACKERS] Assertions in PL/PgSQL
On 2013-09-14 22:40, I wrote:

Someone may prove me wrong here, but to me it looks like this would only prevent ASSERT from being used as the name of a PL/PgSQL variable.

The comment at the beginning of pl_scanner.c seems to suggest the same.

Regards,
Marko Tiikkaja
Re: [HACKERS] Assertions in PL/PgSQL
2013/9/14 Marko Tiikkaja ma...@joh.to

On 2013-09-14 22:40, I wrote:

Someone may prove me wrong here, but to me it looks like this would only prevent ASSERT from being used as the name of a PL/PgSQL variable.

The comment at the beginning of pl_scanner.c seems to suggest the same.

Yes, there are not too many possibilities for how to do it. An effective implementation needs a special PL/pgSQL statement - but I am not 100% happy with the proposed syntax. We introduce a new special keyword just for one simple construct. Some languages have a generic PRAGMA keyword, so I would be much happier with something like

PRAGMA Assert(found);

It is much closer to Ada, and it allows reuse of the new keyword for any other usage in the future (your proposal is too simple, without the possibility of opening new doors in the future). And we can define some standard predefined asserts too - like Assert, AssertNotNull, AssertNotEmpty, ...

Another issue - assert macros usually have one or two parameters. You should support a two-parameter assert (with a message).

Regards,
Marko Tiikkaja
Re: [HACKERS] [RFC] Extend namespace of valid guc names
On 2013-02-25 22:15:33 +0100, Andres Freund wrote:

Currently guc-file.c allows the following as guc names:

ID            {LETTER}{LETTER_OR_DIGIT}*
QUALIFIED_ID  {ID}"."{ID}

That is either one token starting with a letter followed by numbers or letters, or exactly two of those separated by a dot. Those restrictions exist for neither SET/set_config() via SQL nor for postgres -c styles of setting options. I propose loosening those restrictions to a) allow repeatedly qualified names like a.b.c, b) allow variables to start with a digit from the second level onwards.

The attached patch does a) but not b), as Tom argued it's not allowed in a plain fashion in SQL either. There you need to quote the variable name.

Additionally, should we perhaps enforce the same rules for -c and set_config()/SET?

The discussion seems to have concluded that this isn't necessary, so I haven't included this.

The patch still is pretty trivial - I've looked around and I really didn't see anything else that requires changes. I haven't even found documentation to adapt.

Greetings,

Andres Freund

--
Andres Freund	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services

From 9fa69ccc79c7bc42be554bc3f4e740c4e77d50d7 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Sat, 14 Sep 2013 23:15:10 +0200
Subject: [PATCH] Allow custom GUCs to be nested more than one level in config files

Doing so was allowed via SET, postgres -c and set_config() before. Allowing this is useful for grouping together variables inside an extension's toplevel namespace.

There still are differences in the accepted variable names between the different methods of setting GUCs after this commit, but the consensus is that those are acceptable.
---
 src/backend/utils/misc/guc-file.l | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/utils/misc/guc-file.l b/src/backend/utils/misc/guc-file.l
index b730a12..ff4202d 100644
--- a/src/backend/utils/misc/guc-file.l
+++ b/src/backend/utils/misc/guc-file.l
@@ -78,7 +78,7 @@ LETTER			[A-Za-z_\200-\377]
 LETTER_OR_DIGIT [A-Za-z_0-9\200-\377]

 ID				{LETTER}{LETTER_OR_DIGIT}*
-QUALIFIED_ID	{ID}"."{ID}
+QUALIFIED_ID	{ID}("."{ID})+

 UNQUOTED_STRING {LETTER}({LETTER_OR_DIGIT}|[-._:/])*
 STRING			\'([^'\\\n]|\\.|\'\')*\'
--
1.8.4.21.g992c386.dirty
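To see the asymmetry the patch removes: a multiply-qualified name has long been accepted through SET and set_config(), and the one-line lexer change above makes the same spelling legal in postgresql.conf as well (the extension name is illustrative):

```sql
-- Accepted even before the patch:
SET myext.logging.level = 'debug';
SELECT current_setting('myext.logging.level');  -- debug

-- With the patch, the equivalent postgresql.conf line also parses:
--   myext.logging.level = 'debug'
```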
Re: [HACKERS] PL/pgSQL, RAISE and error context
2013/9/14 Marko Tiikkaja ma...@joh.to

On 23/08/2013 10:36, I wrote:

On 8/23/13 8:38 AM, Pavel Stehule wrote:

Are you preparing a patch?

I should have the time to produce one for the September commitfest, but if you (or anyone else) want to work on this, I won't object. My opinion at this very moment is that we should leave the DEFAULT verbosity alone and add a new one (call it COMPACT or such) with the suppressed context for non-ERRORs.

Well, it turns out there isn't really any way to preserve complete backwards compatibility if we want to make this change. The attached patch (based on Pavel's patch) changes the default to be slightly more verbose (the CONTEXT lines which were previously omitted will be visible), but adds a new PGVerbosity called COMPACT which suppresses CONTEXT in non-error messages. Now DEFAULT will be more useful when debugging PL/PgSQL, and people who are annoyed by the new behaviour can use the COMPACT mode.

Any thoughts?

+1

Regards

Pavel

Regards,
Marko Tiikkaja
Re: [HACKERS] Where to load modules from?
On 2013-09-14 22:15:58 +0200, Dimitri Fontaine wrote:

The way they make that safe is by using cgroups and SELinux, IIUC. We can attack the problem in several ways:

- have an initdb switch to tweak the library path per cluster,

That sounds like an utterly horrible idea without any advantages.

- have a superuser-only GUC to tweak the library path,

Hm. I think we might want to make it a PGC_POSTMASTER/postgresql.conf variable instead. Is that stopping use cases of yours? That's what I vote for.

- consider on-disk extensions as templates and move their module files somewhere private in $PGDATA and load the code from there

I don't understand what that does to address the security concerns.

That would allow OS upgrades not to impact running instances until they do ALTER EXTENSION UPDATE; and allowing co-existence of different versions of the same extension in different clusters of the same major version, and maybe in separate databases of the same cluster in some cases (depends on the extension's module specifics),

And it would be an upgrade nightmare.

This proposal comes with no patch because I think we are able to understand it without that, so that it would only be a waste of everybody's time to attach code for a random solution on the list here to that email. Or consider that the fourth point is currently the only one addressed in this very proposal…

Yeah, the code issues seem to be small here.

Greetings,

Andres Freund

--
Andres Freund	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] [PATCH] bitmap indexes
Hi Abhijit, On 2013-09-14 23:44:24 +0530, Abhijit Menon-Sen wrote: I've been working on this patch for a while, and have made some progress towards (a) general fixing, and (b) a working VACUUM implementation (the major remaining piece). Unfortunately, I've been busy moving house, and the latter is not complete (and not in this patch). I will continue working on the code, and I'll post updates. I expect to have more to show in just a few days. Nevertheless, I'm posting it for review now as I keep working. Given the size and age of the patch, I would appreciate any comments, no matter how nitpicky. It'd be nice if you could quickly sketch out the plan for vacuum, that will make reviewing more useful and easier. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Assertions in PL/PgSQL
On 2013-09-14 23:05, Pavel Stehule wrote: Some languages have a generic PRAGMA keyword. So I would be much happier with something like PRAGMA Assert(found); It is much closer to Ada, and it allows reusing the new keyword for other purposes in the future (your proposal is too simple, with no possibility of opening new doors later). And we could define some standard predefined asserts too - like Assert, AssertNotNull, AssertNotEmpty, ... I don't see why e.g. PRAGMA AssertNotEmpty(foo); would be better than ASSERT NotEmpty(foo); and the NotNull version is even sillier considering the expression is arbitrary SQL, and we'd have to do all kinds of different versions or people would be disappointed (AssertNull, AssertNotNull, AssertExists, AssertNotExists, etc.). I see what you're trying to do, but I don't think crippling new features just because we might do something similar at some point is a good idea. I'm guessing this is what happened with the row_count syntax, which made the feature an absolute nightmare to use. Another issue - assert macros usually have one or two parameters. You should support a two-parameter assert (with a message). That I think is worth looking into. Regards, Marko Tiikkaja
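To make the two competing proposals concrete, here is a hypothetical sketch of both forms under discussion. Neither syntax exists in PL/pgSQL at the time of this thread; all function, table, and pragma names are illustrative only.

```sql
-- Hypothetical syntax sketches; nothing below is implemented.

-- Marko's direction: a dedicated ASSERT statement taking an arbitrary
-- boolean SQL expression, with an optional second "message" parameter:
CREATE FUNCTION withdraw(acct int, amount numeric) RETURNS void AS $$
BEGIN
    UPDATE accounts SET balance = balance - amount WHERE id = acct;
    ASSERT FOUND, 'no account with this id';  -- two-parameter form
END
$$ LANGUAGE plpgsql;

-- Pavel's direction: a generic PRAGMA that could later also carry
-- tracing or profiling directives:
--     PRAGMA Assert(found);
--     PRAGMA Trace('entering hot path');
```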
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
On 2013-09-14 09:57:43 +0100, Greg Stark wrote: It seems to me that the nature of the problem is that there will unavoidably be a nexus between the two parts of the code here. We can try to isolate it as much as possible but we're going to need a bit of a compromise. I think Robert's point, and mine, is that there are several ways to approach this without doing that. Greetings, Andres Freund
Re: [HACKERS] Assertions in PL/PgSQL
On 14. 9. 2013 23:35, Marko Tiikkaja ma...@joh.to wrote: On 2013-09-14 23:05, Pavel Stehule wrote: Some languages have a generic PRAGMA keyword. So I would be much happier with something like PRAGMA Assert(found); It is much closer to Ada, and it allows reusing the new keyword for other purposes in the future (your proposal is too simple, with no possibility of opening new doors later). And we could define some standard predefined asserts too - like Assert, AssertNotNull, AssertNotEmpty, ... I don't see why e.g. PRAGMA AssertNotEmpty(foo); would be better than ASSERT NotEmpty(foo); and the NotNull version is even sillier considering the expression is arbitrary SQL, and we'd have to do all kinds of different versions or people would be disappointed (AssertNull, AssertNotNull, AssertExists, AssertNotExists, etc.). I see what you're trying to do, but I don't think crippling new features just because we might do something similar at some point is a good idea. I'm guessing this is what happened with the row_count syntax, which made the feature an absolute nightmare to use. Having more than one assert form may be my personal preference (it is not important). But introducing a reserved keyword for one very special purpose (without extensibility) is not prudent. plpgsql still has a lot in common with PL/SQL and Ada, and I don't think we have to invent new, original syntax every time. Another issue - assert macros usually have one or two parameters. You should support a two-parameter assert (with a message). That I think is worth looking into. Regards, Marko Tiikkaja
Re: [HACKERS] Assertions in PL/PgSQL
On 14. 9. 2013 23:55, Pavel Stehule pavel.steh...@gmail.com wrote: On 14. 9. 2013 23:35, Marko Tiikkaja ma...@joh.to wrote: On 2013-09-14 23:05, Pavel Stehule wrote: Some languages have a generic PRAGMA keyword. So I would be much happier with something like PRAGMA Assert(found); It is much closer to Ada, and it allows reusing the new keyword for other purposes in the future (your proposal is too simple, with no possibility of opening new doors later). And we could define some standard predefined asserts too - like Assert, AssertNotNull, AssertNotEmpty, ... I don't see why e.g. PRAGMA AssertNotEmpty(foo); would be better than ASSERT NotEmpty(foo); and the NotNull version is even sillier considering the expression is arbitrary SQL, and we'd have to do all kinds of different versions or people would be disappointed (AssertNull, AssertNotNull, AssertExists, AssertNotExists, etc.). I see what you're trying to do, but I don't think crippling new features just because we might do something similar at some point is a good idea. I'm guessing this is what happened with the row_count syntax, which made the feature an absolute nightmare to use. Having more than one assert form may be my personal preference (it is not important). But introducing a reserved keyword for one very special purpose (without extensibility) is not prudent. plpgsql still has a lot in common with PL/SQL and Ada, and I don't think we have to invent new, original syntax every time. This is an opportunity to introduce a new hook and make it possible to implement assertions and similar tasks in a generic form (as an extension) - assertions, tracing, profiling. I like integrated assertions, but I would not close the door to future enhancements (there will probably be no possibility of introducing a new keyword for tracing - although I would like to implement a distinction between development and production usage). So I am against your proposal - it doesn't allow any extensibility.
Another issue - assert macros usually have one or two parameters. You should support a two-parameter assert (with a message). That I think is worth looking into. Regards, Marko Tiikkaja
Re: [HACKERS] Assertions in PL/PgSQL
On 2013-09-14 23:55, Pavel Stehule wrote: but introducing a reserved keyword for one very special purpose (without extensibility) is not prudent. How about using an existing keyword then? ASSERTION used to be reserved in the SQL standard but is unreserved in postgres. CHECK might work and is fully reserved. CONSTRAINT? IS? Regards, Marko Tiikkaja
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
On 2013-09-13 14:41:46 -0700, Peter Geoghegan wrote: On Fri, Sep 13, 2013 at 12:23 PM, Andres Freund and...@2ndquadrant.com wrote: The reasons I wasn't saying this will never get accepted are twofold: a) I don't want to stifle alternative ideas to the promises idea, just because I think it's the way to go. That might stop a better idea from being articulated. b) I am not actually in the position to say it's not going to be accepted. Well, the reality is that the promises idea hasn't been described in remotely enough detail to compare it to what I have here. I've pointed out plenty of problems with it. Even if you disagree, I still think that doesn't matter in the least. You say: I think that the details of how this approach compares to others are totally pertinent. For me, that's the whole point - getting towards something that will balance all of these concerns and be acceptable. Well, the two other people involved in the discussion so far have gone on the record saying that the presented approach is not acceptable to them. And you haven't started reacting to that. Yes, it's entirely possible that that could look quite different to what I have here. I do not want to reduce all this to a question of is this one design acceptable or not?. But the way you're discussing it so far is exactly reducing it that way. If you want the discussion to be about *how* we can implement it so that the various concerns are addressed: fsck*ing great. I am with you there. In the end, even though I have my usual strong opinions about which is the best way, I don't care which algorithm gets pursued further. At least, if, and only if, it has a fighting chance of getting committed. Which this doesn't. After all, it was the first thing that I considered, and I'm on the record talking about it in the 2012 dev meeting. I didn't take that approach for many good reasons. 
Well, I wasn't there when you said that ;) The reason I ended up here is not because I didn't get the memo about holding buffer locks across complex operations being a bad thing. At least grant me that. I'm here because in all these years no one has come up with a suggestion that doesn't have some very major downsides. Like, even worse than this. I think you're massively, massively, massively overstating the dangers of bloat here. It's a known problem that's *NOT* getting worse under any of the other proposals if you compare it with the loop/lock/catch implementation of upsert that we have today as the only option. And we *DO* have infrastructure to deal with bloat, even if it could use some improvement. We *don't* have infrastructure to deal with deadlocks on lwlocks. And we're not going to get that infrastructure, because it would even further remove the lw part of lwlocks. As to the rules you refer to, you must mean These locks are intended to be short-term: they should not be held for long. I don't think that they will ever be held for long. At least, once I've managed the amount of work that a heap_insert() can do better. I expect to produce a revision where toasting doesn't happen with the locks held soon. Actually, I've already written the code, I just need to do some testing. I personally think - and have stated so before - that doing a heap_insert() while holding the btree lock is unacceptable. Presumably your reason is essentially that we exclusive lock a heap buffer (exactly one heap buffer) while holding shared locks on btree index buffers. It's that it interleaves an already complex but local locking scheme that required several years to become correct with another that is just the same. That's an utterly horrid idea. Is that really so different to holding an exclusive lock on a btree buffer while holding a shared lock on a heap buffer? Because that's what _bt_check_unique() does today. Yes, it is different. 
But, in my opinion, _bt_check_unique() doing so is a bug that needs fixing. Not something that we want to extend. (Note that _bt_check_unique() already needs to deal with the fact that it reads an unlocked page, because it moves right in some cases) And, as you say: Now, I'll grant you that there is one appreciable difference, which is that multiple unique indexes may be involved. But limiting ourselves to the primary key or something like that remains an option. And I'm not sure that it's really any worse anyway. I don't think that's an acceptable limitation. If it were something we could lift in a release or two, maybe, but that's not what you're talking about. At this point I am a bit confused why you are asking for review. I am asking for us, collectively, through consensus, to resolve the basic approach to doing this. That was something I stated right up front, pointing out details of where the discussion had gone in the past. That was my explicit goal. There has been plenty of discussion on this down through the years, but nothing ever came from it. At the moment ISTM
Re: [HACKERS] GUC for data checksums
Hi, On 2013-09-14 18:33:38 +0200, Bernd Helmle wrote: Attached is a small patch to add a new GUC to report whether data checksums for a particular cluster are enabled. The only way to get this info afaik is to look into pg_control and the version number used, but I'd welcome a way to access this remotely, too. If there aren't any objections I'll add this to the CF. Looks like a good idea to me. The implementation looks sane as well, except that I am not sure if we really need to introduce that faux variable. If the variable cannot be set and we have a SHOW hook, do we need it? Greetings, Andres Freund
Re: [HACKERS] Assertions in PL/PgSQL
On 2013-09-15 00:09, Pavel Stehule wrote: this is an opportunity to introduce a new hook and make it possible to implement assertions and similar tasks in a generic form (as an extension). It can be assertions, tracing, profiling. You can already do tracing and profiling in an extension. I don't see what you would put inside the function body for these two, either. Regards, Marko Tiikkaja
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
On Sat, Sep 14, 2013 at 12:22 AM, Robert Haas robertmh...@gmail.com wrote: I mean, if we do the promise tuple thing, and there are multiple unique indexes, what happens when an inserter needs to block pending the outcome of another transaction? They had better go clean up the promise tuples from the other unique indexes that they're trying to insert into, because they cannot afford to hold value locks for a long time, no matter how they're implemented. As Andres already pointed out, this is not correct. While not doing this is not incorrect, it certainly would be useful for preventing deadlocks and unnecessary contention. In a world where people expect either an insert or an update, we ought to try and reduce contention across multiple unique indexes. I can understand why that doesn't matter today, though - if you're going to insert duplicates indifferent to whether or not there will be conflicts, that's a kind of abuse, and not worth optimizing - seems probable that most transactions will commit. However, it seems much less probable that most upserters will insert. People may well wish to upsert all the time where an insert is hardly ever necessary, which is one reason why I have doubts about other proposals. Note that today there is no guarantee that the original waiter for a duplicate-inserting xact to complete will be the first one to get a second chance, so I think it's hard to question this on correctness grounds. Even if they are released in FIFO order, there is no reason to assume that the first waiter will win the race with a second. Most obviously, the second waiter may not even ever get the chance to block on the same xid at all (so it's not really a waiter at all) and still be able to insert, if the blocking-xact aborts after the second waiter starts its descent but before it checks uniqueness. All this, even though the second waiter arrived maybe minutes after the first. 
What I'm talking about here is really unlikely to result in lock starvation, because the original waiter typically gets to observe the other waiter go through, and that's reason enough to give up entirely. Now, it's kind of weird that the original waiter will still end up blocking on the xid that caused it to wait in the first instance. So there should be more thought put into that, like remembering the xid and only waiting on it on a retry, or some similar scheme. Maybe you could contrive a scenario where this causes lock starvation, but I suspect you could do the same thing for the present btree insertion code. Just to add to what he said, we already have long-lasting value locks in the form of SIREAD locks. SIREAD can exist at different levels of granularity, but one of those levels is index-page-level granularity, where they have the function of guarding against concurrent insertions of values that would fall within that page, which just so happens to be the same thing you want to do here. The difference between those locks and what you're proposing here is that they are implemented differently. That is why those were acceptable and this is not. As the implementer of this patch, I'm obligated to put some checks in unique index insertion that everyone has to care about. There is no way around that. Complexity issues aside, I think that an argument could be made for this approach *reducing* the impact on concurrency relative to other approaches, if there isn't too many unique indexes to deal with, which is the case the vast majority of the time. I mean, those other approaches necessitate doing so much more with *exclusive* locks held. Like inserting, maybe doing a page split, WAL-logging, all with the lock, and then either updating in place or killing the promise tuple, and WAL-logging that, with an exclusive lock held the second time around. Plus searching for everything twice. 
I think that frequently killing all of those broken-promise tuples could have deleterious effects on concurrency and/or index bloat of the kind only remedied by REINDEX. Do you update the freespace map too? More exclusive locks! Or if you leave it up to VACUUM (and just set the xid to InvalidXid, which is still extra work), autovacuum has to care about a new *class* of bloat - index-only bloat. Plus lots of dead duplicates are bad for performance in btrees generally. As here, that is way more expensive than simply grabbing and holding a share-lock on the page. But we get a number of important benefits out of it. The backend remains interruptible while the tuple is locked, the protocol for granting locks is FIFO to prevent starvation, we don't suppress page eviction while the lock is held, we can simultaneously lock arbitrarily large numbers of tuples, and deadlocks are detected and handled cleanly. If those requirements were negotiable, we would surely have negotiated them away already, because the performance benefits would be immense. False equivalence. We only need to lock as many unique index *values* (not
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
On Sat, Sep 14, 2013 at 3:15 PM, Andres Freund and...@2ndquadrant.com wrote: Well, the reality is that the promises idea hasn't been described in remotely enough detail to compare it to what I have here. I've pointed out plenty of problems with it. Even if you disagree, I still think that doesn't matter in the least. It matters if you care about getting this feature. You say: I think that the details of how this approach compares to others are totally pertinent. For me, that's the whole point - getting towards something that will balance all of these concerns and be acceptable. Well, the two other people involved in the discussion so far have gone on the record saying that the presented approach is not acceptable to them. And you haven't started reacting to that. Uh, yes I have. I'm not really sure what you could mean by that. What am I refusing to address? Yes, it's entirely possible that that could look quite different to what I have here. I do not want to reduce all this to a question of is this one design acceptable or not?. But the way you're discussing it so far is exactly reducing it that way. The fact that I was motivated to do things this way serves to illustrate the problems generally. If you want the discussion to be about *how* we can implement it so that the various concerns are addressed: fsck*ing great. I am with you there. Isn't that what we were doing? There has been plenty of commentary on alternative approaches. In the end, even though I have my usual strong opinions about which is the best way, I don't care which algorithm gets pursued further. At least, if, and only if, it has a fighting chance of getting committed. Which this doesn't. I don't think that any design that has been described to date doesn't have serious problems. Causing excessive bloat, particularly in indexes, is a serious problem also. The reason I ended up here is not because I didn't get the memo about holding buffer locks across complex operations being a bad thing. 
At least grant me that. I'm here because in all these years no one has come up with a suggestion that doesn't have some very major downsides. Like, even worse than this. I think you're massively, massively, massively overstating the dangers of bloat here. It's a known problem that's *NOT* getting worse under any of the other proposals if you compare it with the loop/lock/catch implementation of upsert that we have today as the only option. Why would I compare it with that? That's terrible, and very few of our users actually know about it anyway. Also, will an UPDATE followed by an INSERT really bloat all that much anyway? And we *DO* have infrastructure to deal with bloat, even if it could use some improvement. We *don't* have infrastructure to deal with deadlocks on lwlocks. And we're not going to get that infrastructure, because it would even further remove the lw part of lwlocks. Everything I said so far is predicated on LWLocks not deadlocking here, so I'm not really sure why you'd say that. If I can't find a way to prevent deadlock, then clearly the approach is doomed. It's that it interleaves an already complex but local locking scheme that required several years to become correct with another that is just the same. That's an utterly horrid idea. You're missing my point, which is that it may be possible, with relatively modest effort, to analyze things to ensure that deadlock is impossible - regardless of the complexities of the two systems - because they're reasonably well encapsulated. See below, under I'll say it again. Now, I can certainly understand why you wouldn't be willing to accept that at face value. The idea isn't absurd, though. You could think of the heap_insert() call as being under the control of the btree code (just as, say, heap_hot_search() is), even though the code isn't at all structured that way, and that's awkward. I'm actually slightly tempted to structure it that way. 
Is that really so different to holding an exclusive lock on a btree buffer while holding a shared lock on a heap buffer? Because that's what _bt_check_unique() does today. Yes, it is different. But, in my opinion, _bt_check_unique() doing so is a bug that needs fixing. Not something that we want to extend. Well, I think you know that that's never going to happen. There are all kinds of reasons why it works that way that cannot be disavowed. My definition of a bug includes a user being affected. At this point I am a bit confused why you are asking for review. I am asking for us, collectively, through consensus, to resolve the basic approach to doing this. That was something I stated right up front, pointing out details of where the discussion had gone in the past. That was my explicit goal. There has been plenty of discussion on this down through the years, but nothing ever came from it. At the moment ISTM you're not conceding on *ANY* points. That's not very often the way to find consensus.
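The "loop/lock/catch implementation of upsert that we have today" that Andres compares against is the retry-loop pattern from the PostgreSQL documentation's exception-trapping example. A sketch, with illustrative table and column names:

```sql
-- The retry-loop upsert available today (per the documentation's
-- merge_db example); table and column names are illustrative.
CREATE FUNCTION merge_db(k int, v text) RETURNS void AS $$
BEGIN
    LOOP
        -- first try to update an existing row
        UPDATE db SET data = v WHERE key = k;
        IF found THEN
            RETURN;
        END IF;
        -- not there: try to insert; another session may beat us to it
        BEGIN
            INSERT INTO db (key, data) VALUES (k, v);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- concurrent insert won the race; loop back to the UPDATE
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```

The EXCEPTION block sets up a subtransaction on every iteration, and an aborted INSERT leaves behind a dead heap tuple plus index entries - which is exactly the bloat being argued about above.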
Re: [HACKERS] GUC for data checksums
--On 15. September 2013 00:25:34 +0200 Andres Freund and...@2ndquadrant.com wrote: Looks like a good idea to me. The implementation looks sane as well, except that I am not sure if we really need to introduce that faux variable. If the variable cannot be set and we have a SHOW hook, do we need it? It's in line with the other informational variables like block_size et al. Do you want to have a function instead, or what's your intention? One benefit is having them all in SHOW ALL, which can be used to compare database/cluster settings, to mention one use case I have in mind. -- Thanks Bernd
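For context, what such an informational GUC buys you is remote visibility through the ordinary GUC machinery. A sketch of the intended usage, assuming the variable ends up named data_checksums (the thread itself does not fix the name):

```sql
-- Assuming a read-only variable named data_checksums:
SHOW data_checksums;                       -- 'on' or 'off', over any connection
SELECT current_setting('data_checksums');  -- same value, usable inside queries

-- Today, by contrast, only pg_controldata run on the server host
-- reveals whether checksums are enabled.

-- Since the variable is report-only, setting it would be rejected:
-- SET data_checksums = off;   -- expected to fail with an error
```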
Re: [HACKERS] plpgsql.print_strict_params
On Fri, 2013-09-13 at 23:56 +0200, Marko Tiikkaja wrote: Attached is a patch for optionally printing more information on STRICT failures in PL/PgSQL: Please fix the tabs in the SGML files.
Re: [HACKERS] patch: add MAP_HUGETLB to mmap() where supported (WIP)
On Sat, 2013-09-14 at 00:41 +0100, Richard Poole wrote: The attached patch adds the MAP_HUGETLB flag to mmap() for shared memory on systems that support it. Please fix the tabs in the SGML files.
Re: [HACKERS] PL/pgSQL, RAISE and error context
On Sat, 2013-09-14 at 04:58 +0200, Marko Tiikkaja wrote: The attached patch (based on Pavel's patch) changes the default to be slightly more verbose (the CONTEXT lines which were previously omitted will be visible), but adds a new PGVerbosity called COMPACT which suppresses CONTEXT in non-error messages. Now DEFAULT will be more useful when debugging PL/PgSQL, and people who are annoyed by the new behaviour can use the COMPACT mode. Your patch fails the regression tests.
Re: [HACKERS] [PoC] pgstattuple2: block sampling to reduce physical read
On Sat, 2013-09-14 at 16:18 +0900, Satoshi Nagayasu wrote: I'm looking forward to seeing more feedback on this approach, in terms of design and performance improvement. So, I have submitted this for the next CF. Your patch fails to build: pgstattuple.c: In function ‘pgstat_heap_sample’: pgstattuple.c:737:13: error: ‘SnapshotNow’ undeclared (first use in this function) pgstattuple.c:737:13: note: each undeclared identifier is reported only once for each function it appears in
[HACKERS] Proposal: json_populate_record and nested json objects
Hi all; Currently json_populate_record and json_populate_recordset cannot work with nested json objects. This creates two fundamental problems when trying to use JSON as an interface format.

The first problem is that you can't easily embed a json data type in a json object and have it populate a record. This means that storing extended attributes in the database is somewhat problematic if you accept the whole row as a json object.

The second problem is that nested data structures and json don't go together well. You can't have a composite type which has as an attribute an array of another composite type and populate this from a json object. This makes json largely an alternative to hstore for interfaces in its current shape.

I would propose handling json_populate_record and friends as such:

1. Don't throw errors initially as a pre-check if the json object is nested.
2. If one comes to a nested fragment, check the attribute type it is going into first.
2.1 If it is a json type, put the nested fragment there.
2.2 If it is a composite type (i.e. anything in pg_class), push it through another json_populate_record run.
2.3 If it is neither, then see if a json::[type] cast exists; if so, call it.
2.4 Otherwise raise an exception.

I have a few questions before I go on to look at creating a patch.

1. Are there any problems anyone spots with this approach?
2. Is anyone working on something like this?
3. Would it be preferable to build something like this first as an extension (perhaps with different function names) or first as a patch?

Best Wishes, Chris Travers http://www.2ndquadrant.com PostgreSQL Services, Training, and Support
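The limitation and the proposed rules can be illustrated with a small example. The type and field names below are made up for illustration; the exact failure mode on a nested object depends on the server version.

```sql
-- Illustrative types only.
CREATE TYPE inner_t AS (a int, b text);
CREATE TYPE outer_t AS (id int, detail inner_t, tags json);

-- Today, the nested "detail" object is not recursed into, so this
-- call does not populate detail as a composite value:
SELECT *
FROM json_populate_record(null::outer_t,
       '{"id": 1, "detail": {"a": 2, "b": "x"}, "tags": {"k": "v"}}');

-- Under the proposal:
--   "tags"   is of type json, so the raw fragment is stored as-is (rule 2.1);
--   "detail" is a composite, so the fragment is pushed through a
--            recursive json_populate_record run (rule 2.2).
```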
Re: [HACKERS] [PATCH] Revive line type
Here is a new patch for the line type, with a new input/output format {A,B,C}, as discussed in this thread. From 837fcf5d9b1ee8e589ef4b19f7d6e575229ca758 Mon Sep 17 00:00:00 2001 From: Peter Eisentraut pete...@gmx.net Date: Sun, 15 Sep 2013 00:02:06 -0400 Subject: [PATCH] Revive line type Change the input/output format to {A,B,C}, to match the internal representation. Complete the implementations of line_in, line_out, line_recv, line_send. Remove comments and error messages about the line type not being implemented. Add regression tests for existing line operators and functions. Reviewed-by: rui hua 365507506...@gmail.com --- doc/src/sgml/datatype.sgml | 42 - doc/src/sgml/func.sgml | 6 + src/backend/utils/adt/geo_ops.c| 219 ++- src/include/catalog/pg_type.h | 3 +- src/include/utils/geo_decls.h | 7 - src/test/regress/expected/geometry.out | 3 - src/test/regress/expected/line.out | 271 + src/test/regress/expected/sanity_check.out | 3 +- src/test/regress/output/misc.source| 3 +- src/test/regress/parallel_schedule | 2 +- src/test/regress/serial_schedule | 1 + src/test/regress/sql/geometry.sql | 4 - src/test/regress/sql/line.sql | 87 + 13 files changed, 503 insertions(+), 148 deletions(-) create mode 100644 src/test/regress/expected/line.out create mode 100644 src/test/regress/sql/line.sql diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml index 87668ea..07f0385 100644 --- a/doc/src/sgml/datatype.sgml +++ b/doc/src/sgml/datatype.sgml @@ -3051,9 +3051,7 @@ titleGeometric Types/title para Geometric data types represent two-dimensional spatial objects. xref linkend=datatype-geo-table shows the geometric -types available in productnamePostgreSQL/productname. The -most fundamental type, the point, forms the basis for all of the -other types. +types available in productnamePostgreSQL/productname. 
/para table id=datatype-geo-table @@ -3063,8 +3061,8 @@ titleGeometric Types/title row entryName/entry entryStorage Size/entry -entryRepresentation/entry entryDescription/entry +entryRepresentation/entry /row /thead tbody @@ -3077,8 +3075,8 @@ titleGeometric Types/title row entrytypeline/type/entry entry32 bytes/entry -entryInfinite line (not fully implemented)/entry -entry((x1,y1),(x2,y2))/entry +entryInfinite line/entry +entry{A,B,C}/entry /row row entrytypelseg/type/entry @@ -3153,6 +3151,38 @@ titlePoints/title /sect2 sect2 +titleLines/title + +indexterm + primaryline/primary +/indexterm + +para + Lines (typeline/type) are represented by the linear equation Ax + By + + C = 0, where A and B are not both zero. Values of + type typeline/type is input and output in the following form: +synopsis +{ replaceableA/replaceable, replaceableB/replaceable, replaceableC/replaceable } +/synopsis + + Alternatively, any of the following forms can be used for input: + +synopsis +[ ( replaceablex1/replaceable , replaceabley1/replaceable ) , ( replaceablex2/replaceable , replaceabley2/replaceable ) ] +( ( replaceablex1/replaceable , replaceabley1/replaceable ) , ( replaceablex2/replaceable , replaceabley2/replaceable ) ) + ( replaceablex1/replaceable , replaceabley1/replaceable ) , ( replaceablex2/replaceable , replaceabley2/replaceable ) +replaceablex1/replaceable , replaceabley1/replaceable , replaceablex2/replaceable , replaceabley2/replaceable +/synopsis + + where + literal(replaceablex1/replaceable,replaceabley1/replaceable)/literal + and + literal(replaceablex2/replaceable,replaceabley2/replaceable)/literal + are two (different) points on the line. 
+/para + /sect2 + + sect2 titleLine Segments/title indexterm diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index ee1c957..8f60c56 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -8123,6 +8123,12 @@ titleGeometric Type Conversion Functions/title entryliteralcircle(polygon '((0,0),(1,1),(2,0))')/literal/entry /row row +entryliteralfunctionline(typepoint/type, typepoint/type)/function/literal/entry +entrytypeline/type/entry +entrypoints to line/entry +entryliteralline(point '(-1,0)', point '(1,0)')/literal/entry + /row + row entry indexterm primarylseg/primary diff --git a/src/backend/utils/adt/geo_ops.c b/src/backend/utils/adt/geo_ops.c index 5d0b596..6bfe6d7 100644 --- a/src/backend/utils/adt/geo_ops.c +++ b/src/backend/utils/adt/geo_ops.c @@ -926,42 +926,82 @@ box_diagonal(PG_FUNCTION_ARGS)
Re: [HACKERS] Assertions in PL/PgSQL
2013/9/15 Marko Tiikkaja ma...@joh.to

> On 2013-09-15 00:09, Pavel Stehule wrote:
>> this is a possibility for introducing a new hook, so that assertions
>> and similar tasks can be implemented in a generic form (as an
>> extension). It could cover assertions, tracing, profiling.
>
> You can already do tracing and profiling in an extension. I don't see
> what you would put inside the function body for these two, either.

You cannot mark tracing points explicitly in the current (now
unsupported) extensions. These functions share the same pattern:

CREATE OR REPLACE FUNCTION assert(boolean)
RETURNS void AS $$
BEGIN
  IF current_setting('plpgsql.assertions') = 'on' THEN
    IF NOT $1 THEN
      RAISE EXCEPTION 'Assert fails';
    END IF;
  END IF;
END;
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION trace(text)
RETURNS void AS $$
BEGIN
  IF current_setting('plpgsql.trace') = 'on' THEN
    RAISE WARNING 'trace: %', $1;
  END IF;
END;
$$ LANGUAGE plpgsql;

Depending on usage, these functions will not be extremely slow compared
to a built-in solution - they could be made faster by implementing them
in C, and a little bit faster still as internal PL/pgSQL statements. But
if the function runs even one non-trivial query, the overhead is
probably not significant. You just have to watch some global state
variable and then execute (or not) the functionality.

Regards

Pavel

> Regards,
> Marko Tiikkaja
Re: [HACKERS] git apply vs patch -p1
On Sat, 2013-09-14 at 11:37 -0700, Josh Berkus wrote:
> Lately I've been running into a lot of reports of false conflicts
> reported by git apply. The most recent one was the points patch, which
> git apply rejected for completely fictitious reasons (it claimed that
> the patch was trying to create a new file where a file already
> existed, which it wasn't).

Every file in that patch contains "new file mode 100644", which means it
is creating a new file. I would review how that patch was created.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
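The behavior being described is easy to reproduce. The snippet below is a self-contained sketch (throwaway directory and repo, hypothetical file names) showing why git apply refuses a patch whose header says "new file mode" when the target file already exists in the working tree:

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q demo && cd demo
echo 'some content' > f.txt

# Build a patch that *creates* f.txt from nothing; --no-index exits
# non-zero when the files differ, so ignore its status.
git diff --no-index /dev/null f.txt > create.patch || true

# The patch header declares a brand-new file:
grep 'new file mode' create.patch
# -> new file mode 100644

# git apply rejects the patch, because f.txt is already present in the
# working tree (the error reads along the lines of
# "f.txt: already exists in working directory"):
if ! git apply create.patch 2>/dev/null; then
    echo "git apply rejected the patch"
fi
```

So the rejection is not a false conflict: a patch created this way genuinely claims to add the file, and `git apply --check` would flag it the same way before touching anything.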
[HACKERS] Questions about checksum feature in 9.3
I am getting a new server ready for production and saw the release note
on the new checksum feature. It sounded like something we might want,
and then after reading I realized we have to initdb with the feature on.
I figured I'd better look into it a little more, since changing later
might be a bit of a hassle, and found notes on getting a vectorized
version running for better performance.

My attempts to compile it vectorized on OS X seem to have failed, since
I don't find any vector instructions in the .o file, even though the
options -msse4.1 -funroll-loops -ftree-vectorize should be supported
according to the man page for Apple's llvm-gcc.

So, has anyone compiled the checksum code vectorized on OS X? Is there
any performance data that would indicate whether or not I should worry
about this in the first place? So far we are pretty happy with the
performance of 9.2.4; we have noticed a few situations where it's a
little slower than we might like, but these instances are rare. I'd
accept a small performance hit if we can get better reliability and
awareness of potential problems.

Thanks,

Kevin