Re: [HACKERS] Client Messages
On 25.01.2012 15:29, Jim Mlodgenski wrote:
> On Tue, Jan 24, 2012 at 7:38 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
>> There's one little problem remaining with this, which is what to do if the message is in a different encoding than used by the client? That's not a new problem; we have the same problem with any string GUC, if you do SHOW setting. We restricted application_name to ASCII characters, which is an obvious way to avoid the problem, but I think it would be a shame to restrict this to ASCII-only.
>
> Isn't that an issue for the administrator understanding his audience? Maybe some additional documentation explaining the encoding issue?

The thing is, there's currently no encoding conversion happening, so if you have one database in LATIN1 encoding and another in UTF-8, for example, whatever you put in your postgresql.conf is going to be wrong for one database.

I'm happy to just document the issue for per-database messages (ALTER DATABASE ... SET welcome_message): the encoding used there needs to match the encoding of the database, or it's displayed as garbage. But what about per-user messages, when the user has access to several databases, or postgresql.conf?

For postgresql.conf, I think we could make a rule that it's always in UTF-8. We haven't had to take a stance on the encoding used in postgresql.conf before, but IMHO UTF-8 only would be quite reasonable. We already have a problem there if, for example, you have two databases with different encodings, a schema with non-ASCII characters in it, and you try to put that schema in search_path in postgresql.conf. That's such a rare situation that we haven't heard any complaints, but it's not so unreasonable for welcome_message.

That still leaves the problem with per-user messages, though. We've managed to dodge the encoding issue in shared catalogs this far. It would be pretty hard to enforce any given encoding there, I think.
Perhaps we could just document that, and advise creating per-database and per-user settings in that case.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stats_recovery view
--On 15. Januar 2012 02:50:00 -0500 Jaime Casanova ja...@2ndquadrant.com wrote:
> Attached is a patch that implements a pg_stat_recovery view that keeps counters about processed wal records. I just noticed that it still lacks documentation, but I will add it during the week.

Hi Jaime,

do you have an updated patch? The current v1 patch doesn't apply cleanly anymore, and before I go and rebase the patch I thought I'd ask...

-- 
Thanks

	Bernd
Re: [HACKERS] WIP patch for parameterized inner paths
>> To be clear, I'd love to have this feature. But if there is a choice between reducing planning time significantly for everyone and NOT getting this feature, and increasing planning time significantly for everyone and getting this feature, I think we will make more people happy by doing the first one.
>
> We're not really talking about whether we're going to accept or reject a specific feature. We're talking about whether we're going to decide that the last two years' worth of planner development were headed in the wrong direction, and that we're now going to reject that and try to think of some entirely new concept. This isn't an isolated patch, it's the necessary next step in a multi-year development plan. The fact that it's a bit slower at the moment just means there's still work to do.

Those planner improvements are very promising despite the current extra planning cost in some situations, and they look like a good solution for effective and interesting index-only usage. We've hit the planner's limits, and a lower level of the planner needs to be revisited. Please, Tom, keep on.

It might also be that 9.2 is a very good release to do that in, because if there is a performance impact from this patch (in the end), there are so many improvements in other places that it should be easily compensated for users. It might be harder to make the change in 9.3 if the performance of 9.3 is lower than that of 9.2.

-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation
Re: [HACKERS] xlog location arithmetic
On Sun, Jan 22, 2012 at 1:13 AM, Euler Taveira de Oliveira eu...@timbira.com wrote:
> On 23-12-2011 12:05, Tom Lane wrote:
>> I too think a datatype is overkill, if we're only planning on providing one function. Just emit the values as numeric and have done.
>
> Here it is. Output changed to numeric.

Thanks! When I compiled the source with xlogdiff.patch, I got the following warnings:

xlogfuncs.c:511:2: warning: format '%lX' expects argument of type 'long unsigned int *', but argument 3 has type 'uint64 *' [-Wformat]
xlogfuncs.c:511:2: warning: format '%lX' expects argument of type 'long unsigned int *', but argument 4 has type 'uint64 *' [-Wformat]
xlogfuncs.c:515:2: warning: format '%lX' expects argument of type 'long unsigned int *', but argument 3 has type 'uint64 *' [-Wformat]
xlogfuncs.c:515:2: warning: format '%lX' expects argument of type 'long unsigned int *', but argument 4 has type 'uint64 *' [-Wformat]
xlogfuncs.c:524:3: warning: format '%lX' expects argument of type 'long unsigned int', but argument 2 has type 'uint64' [-Wformat]
xlogfuncs.c:528:3: warning: format '%lX' expects argument of type 'long unsigned int', but argument 2 has type 'uint64' [-Wformat]

When I tested the patch, I got the following error:

postgres=# SELECT pg_current_xlog_location();
 pg_current_xlog_location
--------------------------
 0/274
(1 row)

postgres=# SELECT pg_xlog_location_diff('0/274', '0/274');
ERROR:  xrecoff 274 is out of valid range, 0..A4A534C

In func.sgml:

    <para>
     The functions shown in <xref linkend="functions-admin-backup-table"> assist in making on-line backups. These functions cannot be executed during recovery.
    </para>

Since pg_xlog_location_diff() can be executed during recovery, the above needs to be updated.

> While the output was int8 I could use pg_size_pretty, but now I can't. I attached another patch that implements pg_size_pretty(numeric).

I agree it's necessary.

 * Note: every entry in pg_proc.h is expected to have a DESCR() comment,
 * except for functions that implement pg_operator.h operators and don't
 * have a good reason to be called directly rather than via the operator.

According to the above source code comment in pg_proc.h, ISTM pg_size_pretty() for numeric also needs to have its own DESCR().

+	buf = DatumGetCString(DirectFunctionCall1(numeric_out, NumericGetDatum(size)));
+	result = strcat(buf, " kB");

According to man strcat, the dest string must have enough space for the result. Does buf have enough space?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] Second thoughts on CheckIndexCompatible() vs. operator families
On Wed, Jan 25, 2012 at 11:53:10PM -0500, Tom Lane wrote: Not only is that code spectacularly unreadable, but has nobody noticed that this commit broke the buildfarm? Thanks for reporting the problem. This arose because the new test case temporarily sets client_min_messages=DEBUG1. The default buildfarm configuration sets log_statement=all in its postgresql.conf, so the client gets those log_statement lines. I would fix this as attached, resetting the optional logging to defaults during the test cases in question. Not delightful, but that's what we have to work with. nm *** a/src/test/regress/expected/alter_table.out --- b/src/test/regress/expected/alter_table.out *** *** 1665,1671 where oid = 'test_storage'::regclass; --- 1665,1679 t (1 row) + -- -- SET DATA TYPE without a rewrite + -- + -- Temporarily disable optional logging that make installcheck testers + -- reasonably might enable. + SET log_duration = off; + SET log_lock_waits = off; + SET log_statement = none; + SET log_temp_files = -1; CREATE DOMAIN other_textarr AS text[]; CREATE TABLE norewrite_array(c text[] PRIMARY KEY); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index norewrite_array_pkey for table norewrite_array *** *** 1674,1679 ALTER TABLE norewrite_array ALTER c TYPE text[]; -- no work --- 1682,1691 ALTER TABLE norewrite_array ALTER c TYPE other_textarr; -- rebuild index DEBUG: building index norewrite_array_pkey on table norewrite_array RESET client_min_messages; + RESET log_duration; + RESET log_lock_waits; + RESET log_statement; + RESET log_temp_files; -- -- lock levels -- *** a/src/test/regress/sql/alter_table.sql --- b/src/test/regress/sql/alter_table.sql *** *** 1197,1203 select reltoastrelid 0 as has_toast_table --- 1197,1213 from pg_class where oid = 'test_storage'::regclass; + -- -- SET DATA TYPE without a rewrite + -- + + -- Temporarily disable optional logging that make installcheck testers + -- reasonably might enable. 
+ SET log_duration = off; + SET log_lock_waits = off; + SET log_statement = none; + SET log_temp_files = -1; + CREATE DOMAIN other_textarr AS text[]; CREATE TABLE norewrite_array(c text[] PRIMARY KEY); SET client_min_messages = debug1; *** *** 1205,1210 ALTER TABLE norewrite_array ALTER c TYPE text[]; -- no work --- 1215,1225 ALTER TABLE norewrite_array ALTER c TYPE other_textarr; -- rebuild index RESET client_min_messages; + RESET log_duration; + RESET log_lock_waits; + RESET log_statement; + RESET log_temp_files; + -- -- lock levels -- -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Second thoughts on CheckIndexCompatible() vs. operator families
On Thu, Jan 26, 2012 at 6:55 AM, Noah Misch n...@leadboat.com wrote:
> On Wed, Jan 25, 2012 at 11:53:10PM -0500, Tom Lane wrote:
>> Not only is that code spectacularly unreadable, but has nobody noticed that this commit broke the buildfarm?
>
> Thanks for reporting the problem. This arose because the new test case temporarily sets client_min_messages=DEBUG1. The default buildfarm configuration sets log_statement=all in its postgresql.conf, so the client gets those log_statement lines. I would fix this as attached, resetting the optional logging to defaults during the test cases in question. Not delightful, but that's what we have to work with.

I'm just going to remove the test. This is not very future-proof, and an ugly pattern if it gets copied to other places. We need to work on a more sensible way for ALTER TABLE to report what it did, but a solution based on what GUCs the buildfarm happens to set doesn't seem like it's justified for the narrowness of the case we're testing here.

Whether or not we allow this case to work without a rewrite is in some sense arbitrary. There's no real reason it can't be done; rather, we're just exercising restraint to minimize the risk of future bugs. So I don't want to go to great lengths to test it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] psql NUL record and field separator
At 2012-01-14 14:23:49 +0200, pete...@gmx.net wrote:
> Inspired by this question http://stackoverflow.com/questions/6857265 I have implemented a way to set the psql record and field separators to a zero byte (ASCII NUL character).

Since this patch is in the commitfest, I had a look at it. I agree that the feature is useful.

The patch applies and builds cleanly with HEAD@9f9135d1, but needs a further minor tweak to work (attached). Without it, both zero separators get overwritten with the default value after option parsing. The code looks good otherwise.

There's one problem:

    psql --record-separator-zero -At -c 'select something from somewhere' | xargs -0 dosomething

If you run find -print0 and it finds one file, it will still print "filename\0", and xargs -0 will work fine. But psql --record-separator-zero -At -c 'select 1' will print "1\n", not "1\0" or even "1\0\n", so xargs -0 will use the value "1\n", not "1". If you're doing this in a shell script, handling the last argument specially would be painful.

At issue are (at least) these three lines from print_unaligned_text in src/bin/psql/print.c:

    358     /* the last record needs to be concluded with a newline */
    359     if (need_recordsep)
    360         fputc('\n', fout);

Perhaps the right thing to do would be to change this to output \0 if --record-separator-zero was used (but leave it at \n otherwise)? That is what my second attached patch does:

    $ bin/psql --record-separator-zero --field-separator-zero -At -c 'select 1,2 union select 3,4' | xargs -0 echo
    1 2 3 4

Thoughts?

> I think the most common use of this would be to set the record separator from the command line, so we could use a short option such as -0 or -z for that.

I agree. The current option names are very unwieldy to type.
-- ams

diff --git a/src/bin/psql/startup.c b/src/bin/psql/startup.c
index 69207c1..691ad14 100644
--- a/src/bin/psql/startup.c
+++ b/src/bin/psql/startup.c
@@ -150,12 +150,14 @@ main(int argc, char *argv[])
 
 	parse_psql_options(argc, argv, options);
 
-	if (!pset.popt.topt.fieldSep.separator)
+	if (!pset.popt.topt.fieldSep.separator &&
+		!pset.popt.topt.fieldSep.separator_zero)
 	{
 		pset.popt.topt.fieldSep.separator = pg_strdup(DEFAULT_FIELD_SEP);
 		pset.popt.topt.fieldSep.separator_zero = false;
 	}
-	if (!pset.popt.topt.recordSep.separator)
+	if (!pset.popt.topt.recordSep.separator &&
+		!pset.popt.topt.recordSep.separator_zero)
 	{
 		pset.popt.topt.recordSep.separator = pg_strdup(DEFAULT_RECORD_SEP);
 		pset.popt.topt.recordSep.separator_zero = false;
diff --git a/src/bin/psql/print.c b/src/bin/psql/print.c
index 43ddab1..94535eb 100644
--- a/src/bin/psql/print.c
+++ b/src/bin/psql/print.c
@@ -357,7 +357,12 @@ print_unaligned_text(const printTableContent *cont, FILE *fout)
 		}
 		/* the last record needs to be concluded with a newline */
 		if (need_recordsep)
-			fputc('\n', fout);
+		{
+			if (cont->opt->recordSep.separator_zero)
+				print_separator(cont->opt->recordSep, fout);
+			else
+				fputc('\n', fout);
+		}
 	}
 }
Re: [HACKERS] PATCH: tracking temp files in pg_stat_database
On Sat, Jan 21, 2012 at 20:12, Tomas Vondra t...@fuzzy.cz wrote:
> On 21 Leden 2012, 18:13, Magnus Hagander wrote:
>> On Sat, Jan 21, 2012 at 18:02, Tomas Vondra t...@fuzzy.cz wrote:
>>
>> Though I just looked at the tabstat code again, and we already split that message up at regular intervals. Which would make it quite weird to have global counters in it as well. But instead of there, perhaps we need a general non table, but more than one type of data message sent out at the same time. There is other stuff in the queue for it. I'm not convinced either way - I'm not against the original way in your patch either. I just wanted to throw the idea out there, and was hoping somebody else would have an opinion too :-)
>
> Let's make that in a separate patch. It seems like a good thing to do, but I don't think it should be mixed with this patch.

Yeah, I'm not sure we even want to do that yet, when we go down this road. We might eventually, of course. I do like the idea of not sending the message for each temp file, though.

> One thing that worries me are long running transactions (think about a batch process that runs for several hours within a single transaction). By sending the data only at the commit, such transactions would not be accounted properly. So I'd suggest sending the message either at commit

By that argument they are *already* not accounted properly, because the number of rows and those counters are wrong. By sending the temp file data in the middle of the transaction, you won't be able to correlate those numbers with the temp file usage. I'm not saying the other usecase isn't more common, but whichever way you do it, it's going to get inconsistent with *something*.

> The question is whether the patch should be modified to send the message only at commit (i.e. just like tabstats) or if it can remain the way it is now. And maybe we could change the way all those messages are sent (e.g. to the regular submits I've proposed in the previous post) to make them more consistent.
> I'm not asking for 100% consistency. What I don't like is a batch process consuming resources but not reflected in the stats until the final commit. With a huge spike in the stats right after that, although the activity was actually spread over several hours.

Yeah, having thought about it some more, I think sending them when they happen is a better idea. (It's obviously still only going to send them when the file is closed, but that's as close as we can reasonably get.)

I've cleaned up the patch a bit and applied it - thanks!

Other than cosmetic changes, I changed the text in the description around a bit (so if that's worse now, that's purely my fault) and added docs to the section about pg_stat_database that you'd forgotten.

Also, I'd like to point you in the direction of the src/include/catalog/find_unused_oids and duplicate_oids scripts. The oids you'd picked conflicted with the pg_extension catalog from a year ago. Easy enough to fix, and there are often conflicts with more recent oids that the committer has to deal with anyway, but cleaning those up before submission always helps a little bit. For the next one :-)

-- 
Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
Re: [HACKERS] PATCH: tracking temp files in pg_stat_database
On 26 Leden 2012, 14:44, Magnus Hagander wrote:
> On Sat, Jan 21, 2012 at 20:12, Tomas Vondra t...@fuzzy.cz wrote:
>> On 21 Leden 2012, 18:13, Magnus Hagander wrote:
>>> On Sat, Jan 21, 2012 at 18:02, Tomas Vondra t...@fuzzy.cz wrote:
>>>
>>> Though I just looked at the tabstat code again, and we already split that message up at regular intervals. Which would make it quite weird to have global counters in it as well. But instead of there, perhaps we need a general non table, but more than one type of data message sent out at the same time. There is other stuff in the queue for it. I'm not convinced either way - I'm not against the original way in your patch either. I just wanted to throw the idea out there, and was hoping somebody else would have an opinion too :-)
>>
>> Let's make that in a separate patch. It seems like a good thing to do, but I don't think it should be mixed with this patch.
>
> Yeah, I'm not sure we even want to do that yet, when we go down this road. We might eventually, of course.

Yes, that's one of the reasons why I suggested doing that in a separate patch (that might be rejected if we find it's a bad idea).

> I've cleaned up the patch a bit and applied it - thanks! Other than cosmetic changes, I changed the text in the description around a bit (so if that's worse now, that's purely my fault) and added docs to the section about pg_stat_database that you'd forgotten.

Great. Thanks for the fixes.

> Also, I'd like to point you in the direction of the src/include/catalog/find_unused_oids and duplicate_oids scripts. The oids you'd picked conflicted with the pg_extension catalog from a year ago. Easy enough to fix, and there are often conflicts with more recent oids that the committer has to deal with anyway, but cleaning those up before submission always helps a little bit. For the next one :-)

Aha! I've been wondering how the committers identify duplicate oids etc., but I was unaware of those scripts.

Tomas
Re: [HACKERS] Simulating Clog Contention
At 2012-01-12 12:31:20 +, si...@2ndquadrant.com wrote:
> The following patch adds a pgbench option -I to load data using INSERTs

This is just to confirm that the patch applies and builds and works fine (though of course it does take a long time… pity there doesn't seem to be any easy way to add progress indication like -i has).

I'm aware of the subsequent discussion about using only a long option, documenting -n, and adding a knob to control index creation timing. I don't have a useful opinion on any of those things. It's just that the patch was marked "Needs review", and it was only while waiting for 100k inserts to run that I thought of checking if there was any discussion about it. Oops.

Yes, my laptop really is that slow. It appears to be eight times as fast as mine.

-- ams
Re: [HACKERS] WIP patch for parameterized inner paths
On Wed, Jan 25, 2012 at 11:24 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> I wrote:
>> Attached is a WIP patch for parameterized paths, along the lines we have discussed before: ...
>
> I've made considerable progress on the TODO items I listed: indxpath.c has been ripped apart and restructured, and I have it considering parameterized paths for hash sub-joins. I made a deliberate policy decision not to work very hard on exploring parameterized mergejoin plans, because that seems to inflate the number of paths considered way more than the relative usefulness of merge over hash joins justifies.

I don't fully understand this code, especially not on my first time through it, but it seems to me that the key to getting the performance of this code up to where we'd like it to be is to control the number of useless paths that get generated.

Is there a guard in here against joining a parameterized path to an intermediate relation when no SJ is involved? In other words, if we're joining a parameterized path on A to a path on B, then either the join to B should satisfy at least part of the parameterization needed by A, or there should be a special join with A and B on one side and a relation that satisfies at least part of the parameterization of A on the other.

In the probably not uncommon case where there are no SJs at all, or all such SJs have only a single rel on the nullable side, we ought to be able to avoid creating any more paths than we do right now. Even if there are SJs with multiple rels on the outside, we could try to implement some fast check for whether intermediate paths make any sense: e.g. union the set of rels on the nullable sides of SJs. Then, if the joinrel whose paths we're trying to build isn't a subset of that set, the only thing worth considering at this level is satisfying a parameterization by building a parameter-driven nestloop: a bigger parameterized path will do us no good.

Maybe there's a heuristic like this already in there; I just didn't see it on first glance. I guess I'm surprised by the amount of increase you're seeing in paths considered, though admittedly I haven't seen your test cases. If we're properly filtering out the paths that don't matter, I wouldn't have expected it to have as much impact as you're describing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [v9.2] sepgsql's DROP Permission checks
On Thu, Jan 26, 2012 at 7:27 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
> It seems to me reasonable design. The attached patch is rebased one according to your perform-deletion patch.

That looks pretty sensible. But I don't think this is true any more:

+Please note that it shall not be checked on the objects removed by
+cascaded deletion according to the standard manner in SQL.

I've been thinking more about the question of object access hooks with arguments as well. In a couple of designs you've proposed, we've needed InvokeObjectAccessHook0..11 or whatever, and I don't like that design - it seems grotty and not type-safe. However, looking at what you did in this patch, I have another idea. Suppose that we add ONE additional argument to the object access hook function, as you've done here, and if a particular type of hook invocation needs multiple arguments, it can pass them with a struct. In fact, let's use a struct regardless, for uniformity, and pass the value as a void *. That is:

typedef struct ObjectAccessDrop
{
    int dropflags;
} ObjectAccessDrop;

At the call site, we do this:

if (object_access_hook)
{
    ObjectAccessDrop arg;

    arg.dropflags = flags;
    InvokeObjectAccessHook(..., &arg);
}

If there's no argument, then we can just do:

InvokeObjectAccessHook(..., NULL);

The advantage of this is that if we change the structure definition, loadable modules compiled against a newer code base should either (1) still work or (2) fail to compile. The way we have it right now, if we decide to pass a second argument (say, the DropBehavior) to the hook, we're potentially going to silently break sepgsql no matter how we do it. But if we enforce use of a struct, then the only thing the client should ever be doing with the argument it gets is casting it to ObjectAccessDrop *. Then, if we've added fields to the struct, the code will still do the right thing even if the field order has been changed or whatever.

If we've removed fields or changed their data types, things should blow up fairly obviously instead of seeming to work but actually failing.

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] PATCH: pg_basebackup (missing exit on error)
On Tue, Jan 24, 2012 at 4:39 AM, Thomas Ogrisegg tom-...@patches.fnord.at wrote:
> While testing a backup script, I noticed that pg_basebackup exits with 0, even if it had errors while writing the backup to disk when the backup file is to be sent to stdout. The attached patch should fix this problem (and some special cases when the last write fails).

This part looks like a typo:

+		if (fflush (tarfile) != 0)
+		{
+			fprintf(stderr, _("%s: error flushing stdout: %s\n"),
+					strerror (errno));
+			disconnect_and_exit(1);
+		}

The error is in flushing the tarfile, not stdout, right?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Psql names completion.
On Mon, Jan 23, 2012 at 5:28 PM, Dominik Bylica dominik2c...@gmail.com wrote:
> Hello. It's probably not the right place to write, but I guess you are there to take care of it.
>
> When I was creating a trigger with a command like:
>     create trigger asdf before update on beginninOfTheNameOfMyTable...
> I hit tab and I got:
>     create trigger asdf before update on SET
> That was obviously not what I expected to get.

Should be fixed in 9.2.

Josh
Re: [HACKERS] Avoid FK validations for no-rewrite ALTER TABLE ALTER TYPE
On Wed, Jan 25, 2012 at 10:39:56PM -0500, Noah Misch wrote:
> In any event, the patch needed a rebase, so I've attached it rebased and with that comment edited to reference ri_GenerateQualCollation(), that being the most-relevant source for the assumption in question.

Commit 9d35116611e6a1fc10f2298944fbf0e4e1a826be invalidated the test case hunks again. We'll need to either remove the test cases, as Robert chose to do for that other patch, or bolster them per http://archives.postgresql.org/message-id/20120126115536.GD15670%40tornado.leadboat.com
Re: [HACKERS] pg_stat_database deadlock counter
On Sun, Jan 22, 2012 at 19:35, Jaime Casanova ja...@2ndquadrant.com wrote:
> On Mon, Jan 16, 2012 at 3:19 PM, Magnus Hagander mag...@hagander.net wrote:
>> Attached patch adds a counter for number of deadlocks in a database to pg_stat_database.
>
> A little review:
> - it applies with a few hunks
> - the oid you have chosen for the function pg_stat_get_db_deadlocks() is already in use; please choose another one using the unused_oids script (3150)

yeah, that always happens...

> - pg_stat_reset() doesn't reset the deadlock counter (see pgstat_recv_resetcounter())

Good catch, thanks! I've adjusted the patch for that, resolved the conflicts with the other pg_stat_database patch, and committed it.

-- 
Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
Re: [HACKERS] Different error messages executing CREATE TABLE or ALTER TABLE to create a column xmin
On Tue, Jan 24, 2012 at 6:27 PM, Vik Reykja vikrey...@gmail.com wrote:
> I took my first stab at hacking the sources to fix the bug reported here: http://archives.postgresql.org/pgsql-bugs/2012-01/msg00152.php
>
> It's such a simple patch, but it took me several hours with Google and IRC, and I'm sure I did many things wrong. It seems to work as advertised, though, so I'm submitting it.
>
> I suppose since I have now submitted a patch, it is my moral obligation to review at least one. I'm not sure how helpful I'll be, but I'll go bite the bullet and sign myself up anyway.

I'm not sure that an error message that is accurate but slightly less clear than you'd like qualifies as a bug, but I agree that it would probably be worth improving, and I also agree with the general approach you've taken here. However, I think we ought to handle renaming a column symmetrically to adding one. So here's a revised version of your patch that does that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

systemcolumn-v2.patch
Description: Binary data
Re: [HACKERS] proposal: better support for debugging of overloaded functions
At 2011-11-24 17:44:16 +0100, pavel.steh...@gmail.com wrote: patch is relative long, but almost all are changes in regress tests. Changes in plpgsql are 5 lines

The change looks good in principle. The patch applies to HEAD with a bit of fuzz and builds fine… but it fails tests, because it's incomplete. Pavel, your patch doesn't contain any changes to pl_exec.c. Did you just forget to submit them? Anyway, some errcontext() calls need to be taught to print ->fn_signature rather than ->fn_name. I made those changes, and found some more failing tests. Updated patch attached. Ready for committer.

-- ams

diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index 6ba1e5e..eee6f6c 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -342,6 +342,7 @@ do_compile(FunctionCallInfo fcinfo,
 	compile_tmp_cxt = MemoryContextSwitchTo(func_cxt);
 	function->fn_name = pstrdup(NameStr(procStruct->proname));
+	function->fn_signature = format_procedure(fcinfo->flinfo->fn_oid);
 	function->fn_oid = fcinfo->flinfo->fn_oid;
 	function->fn_xmin = HeapTupleHeaderGetXmin(procTup->t_data);
 	function->fn_tid = procTup->t_self;
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index 5ce8d6e..57e337e 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -799,7 +799,7 @@ plpgsql_exec_error_callback(void *arg)
 		 * local variable initialization
 		 */
 		errcontext("PL/pgSQL function \"%s\" line %d %s",
-				   estate->func->fn_name,
+				   estate->func->fn_signature,
 				   estate->err_stmt->lineno,
 				   _(estate->err_text));
 	}
@@ -810,7 +810,7 @@
 		 * arguments into local variables
 		 */
 		errcontext("PL/pgSQL function \"%s\" %s",
-				   estate->func->fn_name,
+				   estate->func->fn_signature,
 				   _(estate->err_text));
 	}
 }
@@ -818,13 +818,13 @@
 	{
 		/* translator: last %s is a plpgsql statement type name */
 		errcontext("PL/pgSQL function \"%s\" line %d at %s",
-				   estate->func->fn_name,
+				   estate->func->fn_signature,
 				   estate->err_stmt->lineno,
 				   plpgsql_stmt_typename(estate->err_stmt));
 	}
 	else
 		errcontext("PL/pgSQL function \"%s\"",
-				   estate->func->fn_name);
+				   estate->func->fn_signature);
 }
diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h
index 0aef8dc..739b9e4 100644
--- a/src/pl/plpgsql/src/plpgsql.h
+++ b/src/pl/plpgsql/src/plpgsql.h
@@ -679,6 +679,7 @@ typedef struct PLpgSQL_func_hashkey
 typedef struct PLpgSQL_function
 {	/* Complete compiled function */
 	char	   *fn_name;
+	char	   *fn_signature;
 	Oid			fn_oid;
 	TransactionId fn_xmin;
 	ItemPointerData fn_tid;
diff --git a/src/test/regress/expected/domain.out b/src/test/regress/expected/domain.out
index 4f47374..4da1c53 100644
--- a/src/test/regress/expected/domain.out
+++ b/src/test/regress/expected/domain.out
@@ -493,7 +493,7 @@ begin
 end$$ language plpgsql;
 select doubledecrement(3); -- fail because of implicit null assignment
 ERROR:  domain pos_int does not allow null values
-CONTEXT:  PL/pgSQL function "doubledecrement" line 3 during statement block local variable initialization
+CONTEXT:  PL/pgSQL function "doubledecrement(pos_int)" line 3 during statement block local variable initialization
 create or replace function doubledecrement(p1 pos_int) returns pos_int as $$
 declare v pos_int := 0;
 begin
@@ -501,7 +501,7 @@ begin
 end$$ language plpgsql;
 select doubledecrement(3); -- fail at initialization assignment
 ERROR:  value for domain pos_int violates check constraint "pos_int_check"
-CONTEXT:  PL/pgSQL function "doubledecrement" line 3 during statement block local variable initialization
+CONTEXT:  PL/pgSQL function "doubledecrement(pos_int)" line 3 during statement block local variable initialization
 create or replace function doubledecrement(p1 pos_int) returns pos_int as $$
 declare v pos_int := 1;
 begin
@@ -514,10 +514,10 @@
 select doubledecrement(0); -- fail before call
 ERROR:  value for domain pos_int violates check constraint "pos_int_check"
 select doubledecrement(1); -- fail at assignment to v
 ERROR:  value for domain pos_int violates check constraint "pos_int_check"
-CONTEXT:  PL/pgSQL function "doubledecrement" line 4 at assignment
+CONTEXT:  PL/pgSQL function "doubledecrement(pos_int)" line 4 at assignment
 select doubledecrement(2); -- fail at return
 ERROR:  value for domain pos_int violates check constraint "pos_int_check"
-CONTEXT:  PL/pgSQL function "doubledecrement" while casting return value to function's return type
+CONTEXT:  PL/pgSQL function "doubledecrement(pos_int)" while casting return value to function's return type
 select doubledecrement(3); -- good
  doubledecrement
-----------------
@@ -566,7 +566,7 @@ end$$ language plpgsql;
 select array_elem_check(121.00);
 ERROR:  numeric field overflow
 DETAIL:  A field with precision 4, scale 2 must round to an absolute value less than 10^2.
-CONTEXT:  PL/pgSQL
Re: [HACKERS] Client Messages
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: The thing is, there's currently no encoding conversion happening, so if you have one database in LATIN1 encoding and another in UTF-8, for example, whatever you put in your postgresql.conf is going to be wrong for one database. I'm happy to just document the issue for per-database messages, ALTER DATABASE ... SET welcome_message, the encoding used there need to match the encoding of the database, or it's displayed as garbage. But what about per-user messages, when the user has access to several databases, or postgresql.conf? I've not looked at the patch, but what exactly will happen if the string has the wrong encoding? The idea that occurs to me is to have the code that uses the GUC do a verify_mbstr(noerror) on it, and silently ignore it if it doesn't pass (maybe with a LOG message). This would have to be documented of course, but it seems better than the potential consequences of trying to send a wrongly-encoded string. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
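Tom's pg_verifymbstr(noerror) idea boils down to checking the GUC's bytes against the expected encoding and silently skipping the string when they don't fit, instead of raising an error mid-send. As a rough standalone illustration — this is not the server's actual multibyte verifier, and it is simplified to the UTF-8 case only, without overlong-sequence checks — the shape of such a check looks like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Minimal sketch of a "noerror" multibyte verifier for UTF-8 only.
 * Returns false instead of erroring out, so a caller could choose to
 * just not send the string (perhaps logging a message instead).
 * Overlong encodings and surrogate ranges are NOT rejected here;
 * this only checks lead/continuation byte structure.
 */
static bool
utf8_ok(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len)
    {
        unsigned char c = s[i];
        size_t extra;

        if (c < 0x80)
            extra = 0;              /* plain ASCII byte */
        else if ((c & 0xE0) == 0xC0)
            extra = 1;              /* 2-byte sequence */
        else if ((c & 0xF0) == 0xE0)
            extra = 2;              /* 3-byte sequence */
        else if ((c & 0xF8) == 0xF0)
            extra = 3;              /* 4-byte sequence */
        else
            return false;           /* invalid lead byte */

        if (i + 1 + extra > len)
            return false;           /* truncated sequence */

        for (size_t j = 1; j <= extra; j++)
            if ((s[i + j] & 0xC0) != 0x80)
                return false;       /* bad continuation byte */

        i += 1 + extra;
    }
    return true;
}
```

A LATIN1-encoded "café" fails this check (its é is the lone byte 0xE9), while the UTF-8 encoding passes — which is exactly the case the thread worries about.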
Re: [HACKERS] PL/Python result metadata
At 2012-01-11 22:05:34 +0200, pete...@gmx.net wrote: I propose to add two functions to the result object: .colnames() returns a list of column names (strings) .coltypes() returns a list of type OIDs (integers) […] Patch attached. Comments welcome. Applies, builds, passes tests. Code looks simple and good. Ready for committer, unless you want to add a typmod accessor too. -- ams
Re: [HACKERS] Pause at end of recovery
On Thu, Jan 26, 2012 at 08:42, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Dec 28, 2011 at 7:27 PM, Simon Riggs si...@2ndquadrant.com wrote: On Thu, Dec 22, 2011 at 6:16 AM, Simon Riggs si...@2ndquadrant.com wrote: I can see a reason to do this now. I've written patch and will commit on Friday. Nudge me if I don't. It's hard to write this so it works in all cases and doesn't work in the right cases also. Basically, we can't get in the way of crash recovery, so the only way we can currently tell a crash recovery from an archive recovery is the presence of restore_command. If you don't have that and you haven't set a recovery target, it won't pause and there's nothing I can do, AFAICS. Please test this and review before commit. What if wrong recovery target is specified and an archive recovery reaches end of WAL files unexpectedly? Even in this case, we want to pause recovery at the end? Otherwise, we'll lose chance to correct the recovery target and retry archive recovery. Yes, we definitely want to pause then. One idea; starting archive recovery with standby_mode=on meets your needs? I haven't tested, but probably, yes. But in that case, why do we need the pause_at_recovery_target *at all*? It's basically overloaded functionality already, but I figured it was set up that way to keep replication and recovery a bit separated? When archive recovery reaches end of WAL files, regardless of whether recovery target is specified or not, recovery pauses at the end. If hot_standby is enabled, you can check the contents and if it's OK you can finish recovery by pg_ctl promote. That is pretty much the usecase, yes. Or readjust the recovery target (or, heck, add more files to the wal archive because you set it up wrong somehow) and continue the recovery further along the line. 
-- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
[HACKERS] Re: [PATCH] Fix float8 parsing of denormal values (on some platforms?)
At 2011-12-21 18:24:17 +0200, ma...@juffo.org wrote: I think the least invasive fix, as proposed by Jeroen, is to fail only when ERANGE is set *and* the return value is 0.0 or +/-HUGE_VAL. Reading relevant specifications, this seems to be a fairly safe assumption. That's what the attached patch does. Oops, now attached the patch too. Approach seems sensible, patch applies, builds, and passes tests. Marking ready for committer and crossing fingers about buildfarm failures… -- ams
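The rule being proposed is easy to state in code: treat ERANGE from strtod() as a failure only when the returned value is also 0.0 or ±HUGE_VAL, so that platforms which set ERANGE for denormal (but nonzero, finite) results still accept them. A hedged standalone sketch follows — the function name is invented for illustration; the actual patch modifies the backend's float8 input routine:

```c
#include <errno.h>
#include <math.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Parse a double, accepting denormal results.
 *
 * Some strtod() implementations set errno = ERANGE when the result is
 * denormal.  Per the proposal in this thread, we only fail when ERANGE
 * is accompanied by a 0.0 or +/-HUGE_VAL return, i.e. true underflow
 * to zero or overflow.
 */
static bool
parse_float8(const char *str, double *result)
{
    char *end;

    errno = 0;
    *result = strtod(str, &end);

    if (end == str)
        return false;           /* no digits were consumed */

    if (errno == ERANGE &&
        (*result == 0.0 || *result == HUGE_VAL || *result == -HUGE_VAL))
        return false;           /* genuine underflow or overflow */

    return true;                /* denormals fall through to here */
}
```

With this check, "1e-320" (a denormal on IEEE-754 doubles) parses successfully even on platforms that flag it with ERANGE, while "1e999" is still rejected as overflow.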
Re: [HACKERS] WIP patch for parameterized inner paths
Robert Haas robertmh...@gmail.com writes: Is there a guard in here against joining a parameterized path to an intermediate relation when no SJ is involved? In other words, if we're joining a parameterized path on A to a path on B, then either the join to B should satisfy at least part of the parameterization needed by A, or there should be a special join with A and B on one side and a relation that satisfies at least part of the parameterization of A on the other. There is no such guard. We could probably add one with not an outrageous amount of expense, but I'm not clear on why you think it's appropriate to limit the join ordering that way? regards, tom lane
[HACKERS] Re: information schema/aclexplode doesn't know about default privileges
At 2012-01-09 20:23:59 +0200, pete...@gmx.net wrote: Nobody had a better idea, so here is the final patch. I adjusted the regression tests a bit to avoid bloat from the now-visible owner privileges. Patch applies, builds, and passes tests (and does report EXECUTE privileges on a newly-created function). Code looks fine. -- ams
Re: [HACKERS] Different error messages executing CREATE TABLE or ALTER TABLE to create a column xmin
Robert Haas robertmh...@gmail.com writes: However, I think we ought to handle renaming a column symmetrically to adding one. Yeah, I was thinking the same. So here's a revised version of your patch that does that. This looks reasonable to me, except that possibly the new error message text could do with a bit more thought. It seems randomly unlike the normal message, and I also have a bit of logical difficulty with the wording equating a column with a column name. The wording that is in use in the existing CREATE TABLE case is column name \"%s\" conflicts with a system column name We could do worse than to use that verbatim, so as to avoid introducing a new translatable string. Another possibility is column \"%s\" of relation \"%s\" already exists as a system column Or we could keep the primary errmsg the same as it is for a normal column and instead add a DETAIL explaining that this is a system column. regards, tom lane
Re: [HACKERS] Simulating Clog Contention
On Thu, Jan 26, 2012 at 8:18 AM, Abhijit Menon-Sen a...@toroid.org wrote: This is just to confirm that the patch applies and builds and works fine (though of course it does take a long time… pity there doesn't seem to be any easy way to add progress indication like -i has). On any non server grade hardware you'd probably want to disable synchronous_commit while loading. FWIW, this is a great addition to pgbench. merlin
Re: [HACKERS] Should we add crc32 in libpgport?
At 2012-01-17 13:43:11 -0800, dan...@heroku.com wrote: See the attached patch. It has a detailed cover letter/comment at the top of the file. The patch looks good, but the resetxlog and controldata Makefile hunks didn't apply. I don't know why, but I've attached updated versions of those hunks below, to save someone a moment. Everything else is fine.

-- ams

diff --git a/src/bin/pg_controldata/Makefile b/src/bin/pg_controldata/Makefile
index 0eff846..ebaa179 100644
--- a/src/bin/pg_controldata/Makefile
+++ b/src/bin/pg_controldata/Makefile
@@ -15,16 +15,13 @@ subdir = src/bin/pg_controldata
 top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global

-OBJS= pg_controldata.o pg_crc.o $(WIN32RES)
+OBJS= pg_controldata.o $(WIN32RES)

 all: pg_controldata

 pg_controldata: $(OBJS) | submake-libpgport
 	$(CC) $(CFLAGS) $^ $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)

-pg_crc.c: $(top_srcdir)/src/backend/utils/hash/pg_crc.c
-	rm -f $@ && $(LN_S) $< .
-
 install: all installdirs
 	$(INSTALL_PROGRAM) pg_controldata$(X) '$(DESTDIR)$(bindir)/pg_controldata$(X)'
diff --git a/src/bin/pg_resetxlog/Makefile b/src/bin/pg_resetxlog/Makefile
index eb03b8a..b0dfd16 100644
--- a/src/bin/pg_resetxlog/Makefile
+++ b/src/bin/pg_resetxlog/Makefile
@@ -15,16 +15,13 @@ subdir = src/bin/pg_resetxlog
 top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global

-OBJS= pg_resetxlog.o pg_crc.o $(WIN32RES)
+OBJS= pg_resetxlog.o $(WIN32RES)

 all: pg_resetxlog

 pg_resetxlog: $(OBJS) | submake-libpgport
 	$(CC) $(CFLAGS) $^ $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)

-pg_crc.c: $(top_srcdir)/src/backend/utils/hash/pg_crc.c
-	rm -f $@ && $(LN_S) $< .
-
 install: all installdirs
 	$(INSTALL_PROGRAM) pg_resetxlog$(X) '$(DESTDIR)$(bindir)/pg_resetxlog$(X)'
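For context, the CRC-32 code being relocated into libpgport is the ordinary table-driven algorithm over the reflected polynomial 0xEDB88320. A generic sketch of that computation (this is not pg_crc.c's exact layout, which keeps the table precomputed and exposes macros, but it is the same algorithm):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Table-driven CRC-32 over the reflected polynomial 0xEDB88320,
 * with the conventional initial value and final XOR of 0xFFFFFFFF.
 * The 256-entry lookup table is built lazily on first use here;
 * real implementations ship it precomputed.
 */
static uint32_t
crc32_buf(const void *data, size_t len)
{
    static uint32_t table[256];
    static bool have_table = false;
    const unsigned char *p = data;
    uint32_t crc = 0xFFFFFFFFu;

    if (!have_table)
    {
        for (uint32_t i = 0; i < 256; i++)
        {
            uint32_t c = i;

            for (int k = 0; k < 8; k++)
                c = (c & 1) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
            table[i] = c;
        }
        have_table = true;
    }

    while (len-- > 0)
        crc = table[(crc ^ *p++) & 0xFF] ^ (crc >> 8);

    return crc ^ 0xFFFFFFFFu;
}
```

The standard check value for this CRC variant is crc32("123456789") == 0xCBF43926, which is a handy sanity test for any reimplementation.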
Re: [HACKERS] BGWriter latch, power saving
On 17.01.2012 14:38, Peter Geoghegan wrote: On 17 January 2012 11:24, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: In the patch I sent, I did rearrange the sleeping logic. I think it's more readable the way it is now. I have no objection to either your refinement of the sleeping logic, nor that you moved some things in both the existing code and my patch so that they occur when no spinlock is held. Ok, committed with some further cleanup. Do you think the docs need to be updated for this, and if so, where? The only place I found in the docs that speak about how the bgwriter works is in config.sgml, where bgwriter_delay is described. Want to suggest an update to that? Should I proceed with a benchmark on V3, so that we can get this committed? I imagine that a long pgbench-tools run is appropriate, (after all, it was used to justify the re-write of the BGWriter for 8.3) at various scale factors, from smallish to quite large. I did some testing on this, with a highly artificial test case that dirties pages in shared_buffers as fast as possible. I tried to make it a worst-case scenario, see attached script. I tested this with a 32-core HP Itanium box, and on my 2-core laptop, and didn't see any measurable slowdown from this patch. So I think we're good. If setting the latch would become a contention issue, there would be a pretty easy solution: only try to do it every 10 or 100 dirtied pages, for example. A few dirty pages in the buffer cache don't mean anything, as long as we kick the bgwriter in a fairly timely fashion when a larger burst of activity begins. BTW, do you have some sort of a test setup for these power-saving patches, to actually measure the effect on number of interrupts or electricity use? Fewer wakeups should be a good thing, but it would be nice to see some figures to get an idea of how much progress we've done and what still needs to be done. 
-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com dirtypages.sh Description: Bourne shell script
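Heikki's fallback idea — if setting the latch ever became a contention point, set it only once per N dirtied pages — can be sketched in miniature. Everything below is illustrative: SetLatch() is simulated by a counter, and signalling on the very first page of each interval (so the bgwriter is still kicked promptly when a burst starts) is an assumption of this sketch, not the committed behavior:

```c
/*
 * Rate-limiting sketch: notify the bgwriter at most once per
 * LATCH_SET_INTERVAL dirtied pages.  Using (count % N == 1) means the
 * very first dirtied page still triggers a notification, so a new
 * burst of activity wakes the bgwriter in a timely fashion.
 * "latch_sets" stands in for calls to SetLatch() on the bgwriter's
 * process latch.
 */
#define LATCH_SET_INTERVAL 100

static int pages_dirtied = 0;
static int latch_sets = 0;

static void
mark_buffer_dirty(void)
{
    pages_dirtied++;
    if (pages_dirtied % LATCH_SET_INTERVAL == 1)
        latch_sets++;           /* stand-in for SetLatch(&bgwriterLatch) */
}
```

Dirtying 1000 pages this way would set the latch only 10 times, which is the kind of reduction Heikki suggests would make any contention negligible.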
Re: [HACKERS] WIP patch for parameterized inner paths
On Thu, Jan 26, 2012 at 11:04 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Is there a guard in here against joining a parameterized path to an intermediate relation when no SJ is involved? In other words, if we're joining a parameterized path on A to a path on B, then either the join to B should satisfy at least part of the parameterization needed by A, or there should be a special join with A and B on one side and a relation that satisfies at least part of the parameterization of A on the other. There is no such guard. We could probably add one with not an outrageous amount of expense, but I'm not clear on why you think it's appropriate to limit the join ordering that way? Because I think that adding parameterized paths that fail that test is bound to be worse - or at least no better - than just rearranging the join order. In other words, suppose I have this: a some-join (b some-other-join c on b.x = c.x) on a.y = b.y If some-join is a LEFT join and some-other-join is an INNER join, then it makes sense to think about implementing this as a parameterized nest loop with a on the outside and (b ij c) on the inside. But if the joins commute, then AFAICT it doesn't. The trivial case is when they are both inner joins; I could do:

Nest Loop
-> Seq Scan A
-> Nested Loop
   -> Index Scan B
   -> Index Scan C

But that's no better than:

Nest Loop
-> Nest Loop
   -> Seq Scan A
   -> Index Scan B
-> Index Scan C

...because those are alternative formulations of the same concept: scan A, use that to drive index probes against B, and then use those results to drive index probes against C. But the same applies if both joins are left joins, or more generally: if the joins commute, then the plans we're already generating are sufficient. When they *don't* commute, then we can potentially benefit from joining a parameterized path against an intermediate table.
I would expect a lot of people might have things like this: a INNER JOIN b ON a.bid = b.bid INNER JOIN c ON a.cid = c.cid INNER JOIN d ON a.did = d.did INNER JOIN e ON d.eid = e.eid INNER JOIN f ON d.fid = f.fid LEFT JOIN g ON a.gid = g.gid LEFT JOIN h ON a.hid = h.hid LEFT JOIN i ON a.iid = i.iid AFAICS, such queries aren't going to benefit from this optimization, so ideally we'd like them to not have to pay little or nothing for it. Now, if h happens to be a view defined as h1 INNER JOIN h2 ON h1.x = h2.x, then it makes sense to try to join a parameterized path on either h1 or h2 against the other relation before implementing it as a nest loop against some set of relations that includes a, but we still can't get any benefit from parameterization anywhere else - joining h1 or h2 against anything else in there is not legal, and join of a parameterized path over d to, say, f is still no better than what we can get by rearranging the join order: if the path over d is parameterized, then whatever join (d join f) will be no better than (whatever join d) join f. So it seems like we ought to just not build a parameterized d join f path in the first place, unless, of course, I am missing something. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Different error messages executing CREATE TABLE or ALTER TABLE to create a column xmin
On Thu, Jan 26, 2012 at 11:29 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: However, I think we ought to handle renaming a column symmetrically to adding one. Yeah, I was thinking the same. So here's a revised version of your patch that does that. This looks reasonable to me, except that possibly the new error message text could do with a bit more thought. It seems randomly unlike the normal message, and I also have a bit of logical difficulty with the wording equating a column with a column name. The wording that is in use in the existing CREATE TABLE case is column name \"%s\" conflicts with a system column name We could do worse than to use that verbatim, so as to avoid introducing a new translatable string. Another possibility is column \"%s\" of relation \"%s\" already exists as a system column Or we could keep the primary errmsg the same as it is for a normal column and instead add a DETAIL explaining that this is a system column. I intended for the message to match the CREATE TABLE case. I think it does, except I see now that Vik's patch left out the word name where the original string has it. So I'll vote in favor of your first option, above, since that's what I intended anyway. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Simulating Clog Contention
On Thu, Jan 26, 2012 at 11:41 AM, Merlin Moncure mmonc...@gmail.com wrote: On Thu, Jan 26, 2012 at 8:18 AM, Abhijit Menon-Sen a...@toroid.org wrote: This is just to confirm that the patch applies and builds and works fine (though of course it does take a long time… pity there doesn't seem to be any easy way to add progress indication like -i has). On any non server grade hardware you'd probably want to disable synchronous_commit while loading. FWIW, this is a great addition to pgbench. Do you object to separating out the three different things the patch does and adding separate options for each one? If so, why? I find them independently useful. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] WIP patch for parameterized inner paths
On Thu, Jan 26, 2012 at 11:54 AM, Robert Haas robertmh...@gmail.com wrote: AFAICS, such queries aren't going to benefit from this optimization, so ideally we'd like them to not have to pay little or nothing for it. There are a few too many negations in that sentence. Let me try again: AFAICS, such queries aren't going to benefit from this optimization, so ideally we'd like them to pay little or nothing for it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Different error messages executing CREATE TABLE or ALTER TABLE to create a column xmin
On Thu, Jan 26, 2012 at 17:58, Robert Haas robertmh...@gmail.com wrote: On Thu, Jan 26, 2012 at 11:29 AM, Tom Lane t...@sss.pgh.pa.us wrote: This looks reasonable to me, except that possibly the new error message text could do with a bit more thought. It seems randomly unlike the normal message, and I also have a bit of logical difficulty with the wording equating a column with a column name. The wording that is in use in the existing CREATE TABLE case is column name \"%s\" conflicts with a system column name We could do worse than to use that verbatim, so as to avoid introducing a new translatable string. Another possibility is column \"%s\" of relation \"%s\" already exists as a system column Or we could keep the primary errmsg the same as it is for a normal column and instead add a DETAIL explaining that this is a system column. I intended for the message to match the CREATE TABLE case. I think it does, except I see now that Vik's patch left out the word name where the original string has it. So I'll vote in favor of your first option, above, since that's what I intended anyway. My intention was to replicate the CREATE TABLE message; I'm not sure how I failed on that particular point. Thank you guys for noticing and fixing it (along with all the other corrections).
Re: [HACKERS] [v9.2] Add GUC sepgsql.client_label
On Tue, Jan 10, 2012 at 6:28 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote: This patch adds a new GUC sepgsql.client_label that allows client process to switch its privileges into another one, as long as the system security policy admits this transition. Because of this feature, I ported two permissions from process class of SELinux; setcurrent and dyntransition. The first one checks whether the client has a right to switch its privilege. And the other one checks a particular transition path from X to Y. This feature might seem to break assumption of the sepgsql's security model. However, single-directed domain transition from bigger-privileges to smaller-privileged domain by users' operation is also supported on operating system, and useful feature to restrict applications capability at beginning of the session. A few weeks ago, I got a requirement from Joshua Brindle. He is working for Web-application that uses CAC (Common Access Card) for its authentication, and wanted to integrate its security credential and security label of selinux/sepgsql. One problem was the system environment unavailable to use labeled-networking (IPsec), thus, it was not an option to switch the security label of processes on web-server side. An other solution is to port dynamic-transition feature into sepgsql, as an analogy of operating system. An expected scenario is below: The web-server is running with WEBSERV domain. It is allowed to connect to PostgreSQL, and also allowed to invoke an trusted-procedure that takes an argument of security-credential within CAC, but, nothing else are allowed. The trusted-procedure is allowed to reference a table between security-credential and security-label to be assigned on, then it switches the security label of client into CLIENT_n. The CLIENT_n shall be allowed to access tables, functions and others according to the security policy, and also allowed to reset sepgsql.security_label to revert WEBSERV. 
However, he is not available to switch other domain without security-credential stored within CAC card. I and Joshua agreed this scenario is reasonable and secure. So, we'd like to suggest this new feature towards v9.2 timeline. I'm wondering if a function would be a better fit than a GUC. I don't think you can really restrict the ability to revert a GUC change - i.e. if someone does a SET and then a RESET, you pretty much have to allow that. I think. But if you expose a function then it can work however you like. On another note, this is an awfully large patch. Is there a separate patch here that just does code rearrangement that should be separated out? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Simulating Clog Contention
On Thu, Jan 26, 2012 at 10:59 AM, Robert Haas robertmh...@gmail.com wrote: On Thu, Jan 26, 2012 at 11:41 AM, Merlin Moncure mmonc...@gmail.com wrote: On Thu, Jan 26, 2012 at 8:18 AM, Abhijit Menon-Sen a...@toroid.org wrote: This is just to confirm that the patch applies and builds and works fine (though of course it does take a long time… pity there doesn't seem to be any easy way to add progress indication like -i has). On any non server grade hardware you'd probably want to disable synchronous_commit while loading. FWIW, this is a great addition to pgbench. Do you object to separating out the three different things the patch does and adding separate options for each one? If so, why? I find them independently useful. I didn't take a position on that -- although superficially it seems like more granular control is good (and you can always group options together with a 'super option' like as in cp -a) -- just making a general comment on the usefulness of testing against records that don't have the same xid. merlin
Re: [HACKERS] Inline Extension
Robert Haas robertmh...@gmail.com writes: On Mon, Jan 23, 2012 at 7:59 PM, Daniel Farina dan...@heroku.com wrote: Things are still a bit ugly in the more complex cases: consider PostGIS's linkage against libproj and other libraries. In order to After thinking about that for a while I think we should not include the shared module (.so) in the scope of this patch nor 9.2. Ensuring that the .so dependencies are met is a problem well outside the reach of the PostgreSQL system. cover all cases, I feel that what I need is an optional hook (for the sake of argument, let's say it's another command type hook, e.g. archive_command) to be executed when extension (un)installation occurs on a primary or is replayed on a standby whereby I can acquire the necessary dependencies for an extension or signal some kind of error (as to exactly how that interfaces with the server is delicate, should one want to supply good error messages to the user). Supporting command triggers on the replica is in the TODO list. Having that in 9.2 is far from granted though, but the catalog and syntax already have support for the feature. But, that's a bit unlike how we normally do things. And if we're going to WAL-log the writing of the extension files, then I start to feel like we should go back to putting the data in a system catalog rather than the filesystem, because inventing a new WAL record type for this seems like it will make things quite a bit more complicated for no particular benefit. Exactly, and we should talk about concurrent accesses to the filesystem too, and how to have a clean transactional backup including the scripts when that pg_dump option is used.
So I'm going to prepare the next version of the patch with this design:

- in catalog extension scripts for inline extension pg_extension_script(extoid, oldversion, version, script) oldversion is null when create extension is used unless when using the create extension from 'unpackaged' form
- see about adding more control properties in the catalog?
- current code that is parsing the filenames to determine the upgrade path will have to be able to take the version strings from the new catalog as an alternative, and getting to the script content must be able to select from the catalog or read a file on disk
- pg_dump defaults to not dumping extension content
- pg_dump --include-extension-scripts dumps the scripts found either in the filesystem or the catalog, a create script first then any number of update scripts as needed to reach the current installed version
- same as we have -t, add -e --extension to pg_dump so that you can choose to dump only a given extension

The extension dumping will not include the shared modules, so if your extension depends on them being installed on the server, you will be much better served with some OS level packaging. Not for 9.2, but I can't help thinking that if we could manage to host the .so module itself in the catalogs, we could solve updating it in a transactional way and more importantly host it per-database, rather than having the modules work per major version (not even per cluster) and the extension mechanism work per-database inside each cluster. But that's work for another release. Any comments before I spend time coding this? -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Pause at end of recovery
On Fri, Jan 27, 2012 at 12:50 AM, Magnus Hagander mag...@hagander.net wrote: One idea; starting archive recovery with standby_mode=on meets your needs? I haven't tested, but probably, yes. But in that case, why do we need the pause_at_recovery_target *at all*? It's basically overloaded functionality already, but I figured it was set up that way to keep replication and recovery a bit separated? AFAIK, when standby_mode = on, archive recovery pauses only at end of WAL files. When recovery target is specified and archive recovery reaches the target, it doesn't pause. OTOH, when pause_at_recovery_target is set, archive recovery pauses only at the target but not end of WAL files. Neither can cover all the usecases. So pause_at_recovery_target was implemented. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] Different error messages executing CREATE TABLE or ALTER TABLE to create a column xmin
On Thu, Jan 26, 2012 at 12:02 PM, Vik Reykja vikrey...@gmail.com wrote: On Thu, Jan 26, 2012 at 17:58, Robert Haas robertmh...@gmail.com wrote: On Thu, Jan 26, 2012 at 11:29 AM, Tom Lane t...@sss.pgh.pa.us wrote: This looks reasonable to me, except that possibly the new error message text could do with a bit more thought. It seems randomly unlike the normal message, and I also have a bit of logical difficulty with the wording equating a column with a column name. The wording that is in use in the existing CREATE TABLE case is column name \"%s\" conflicts with a system column name We could do worse than to use that verbatim, so as to avoid introducing a new translatable string. Another possibility is column \"%s\" of relation \"%s\" already exists as a system column Or we could keep the primary errmsg the same as it is for a normal column and instead add a DETAIL explaining that this is a system column. I intended for the message to match the CREATE TABLE case. I think it does, except I see now that Vik's patch left out the word name where the original string has it. So I'll vote in favor of your first option, above, since that's what I intended anyway. My intention was to replicate the CREATE TABLE message; I'm not sure how I failed on that particular point. Thank you guys for noticing and fixing it (along with all the other corrections). OK, committed with that further change. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] pg_bulkload ON_DUPLICATE_MERGE
On Wed, Jan 25, 2012 at 2:49 PM, Benjamin Johnson benjamin.john...@getcarbonblack.com wrote: PG Gurus, I have a table like this: CREATE TABLE filemods ( guid BIGINT NOT NULL UNIQUE, filepath_guid BIGINT NOT NULL, createtime TIMESTAMP WITH TIME ZONE DEFAULT NULL, writetime TIMESTAMP WITH TIME ZONE DEFAULT NULL, deletetime TIMESTAMP WITH TIME ZONE DEFAULT NULL, ); One event might have (1, '2012-01-25 11:00:00', NULL, NULL) and another event might have (1, NULL, '2012-01-25 11:05:00', NULL) and the end result should be: (1, '2012-01-25 11:00:00', '2012-01-25 11:05:00', NULL). I'm trying to modify pg_bulkload to merge two rows together. The changes I have made seem to be working, although I would like input on what I am doing that is unsafe or terribly wrong. You can be brutal. I've seen incredible write speed with using pg_bulkload. If I can just get it to consolidate our rows based on the unique key it will remove a lot of complexity in our software. Also, I'm not entirely sure this mailing list is the correct one, but with all the internals you all know, I'm hoping you can help point out nasty flaws in my algorithm. This is the first legitimate attempt I have made at modifying PG source, so I'm not real familiar with the proper way of loading pages and tuples and updating heaps and all that pg core stuff. Here's the modifications to pg_btree.c (from pg_bulkload HEAD): http://pastebin.com/U23CapvR I also attached the patch. I am not sure who maintains pg_bulkload, but it's not part of the core distribution, so this is the wrong mailing list you may want to look here: http://pgfoundry.org/mail/?group_id=1000261 -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
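The merge semantics being asked for — combine two rows with the same unique key, taking the non-NULL value for each column — can be illustrated independently of pg_bulkload. This is only a sketch of the intended outcome, not the patch's heap-level implementation:

```python
def merge_rows(existing, incoming):
    """Column-wise merge: keep the existing value unless it is NULL (None),
    in which case take the value from the incoming row."""
    return tuple(old if old is not None else new
                 for old, new in zip(existing, incoming))

# The example from the post: two events for guid 1, each filling in a
# different timestamp column (guid, createtime, writetime, deletetime).
a = (1, '2012-01-25 11:00:00', None, None)
b = (1, None, '2012-01-25 11:05:00', None)
merged = merge_rows(a, b)
# merged == (1, '2012-01-25 11:00:00', '2012-01-25 11:05:00', None)
```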
Re: [HACKERS] Client Messages
On tor, 2012-01-26 at 10:34 +0200, Heikki Linnakangas wrote: For postgresql.conf I think we could make a rule that it's always in UTF-8. We haven't had to take a stance on the encoding used in postgresql.conf before, but IMHO UTF-8 only would be quite reasonable. We have so far violently resisted forcing UTF-8 anywhere, and this doesn't seem like a good place to start. We could perhaps recognize an encoding declaration like this in the file: http://docs.python.org/reference/lexical_analysis.html#encoding-declarations -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
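The Python-style declaration Peter links to is a magic comment near the top of the file. A rough sketch of how such a declaration could be recognized in postgresql.conf — the regex follows PEP 263, and treating this comment as meaningful in postgresql.conf is purely hypothetical:

```python
import re

# PEP 263-style encoding declaration, e.g. "# -*- coding: utf-8 -*-"
CODING_RE = re.compile(r"coding[:=]\s*([-\w.]+)")

def sniff_encoding(first_lines):
    """Return the declared encoding from the first two lines, if any."""
    for line in first_lines[:2]:
        if line.lstrip().startswith("#"):
            m = CODING_RE.search(line)
            if m:
                return m.group(1)
    return None

print(sniff_encoding(["# -*- coding: latin-1 -*-", "shared_buffers = 128MB"]))
# prints latin-1
```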
Re: [HACKERS] Configuring Postgres to Add A New Source File
On Wed, Jan 25, 2012 at 7:06 AM, Tareq Aljabban tareq.aljab...@gmail.com wrote: Hi, I'm doing some development on the storage manager module of Postgres. I have added a few source files already to the smgr folder, and I was able to have the make system include them by adding their names to the OBJS list in the Makefile inside the smgr folder. (i.e. when I add A.c, I would add A.o to the OBJS list). That was working fine. Now I'm trying to add a new file hdfs_test.c to the project. The problem with this file is that it requires some extra directives in its compilation command (-I and -L directives). Using gcc the file is compiled with the command: gcc hdfs_test.c -I/HDFS_HOME/hdfs/src/c++/libhdfs -I/usr/lib/jvm/default-java/include -L/HDFS_HOME/hdfs/src/c++/libhdfs -L/HDFS_HOME/build/c++/Linux-i386-32/lib -L/usr/lib/jvm/default-java/jre/lib/i386/server -ljvm -lhdfs -o hdfs_test I was told that in order to add the extra directives, I need to modify $LDFLAGS in configure.in and $LIBS in the Makefile. I read about autoconf and checked configure.in of Postgres but still couldn't work out exactly what I should be doing. This is really a general question about make that has nothing to do with PostgreSQL specifically; you might want to get a book on make. Makefiles in subdirectories of src/backend are only intended to be able to build the backend itself, so there's probably no particularly easy way to do what you want there. You likely want to put this code someplace like src/test/hdfs_test and copy one of the other Makefiles into the new directory, perhaps src/test/isolation/Makefile. But even if you do that for starters, you're going to need to make some modifications, which may be a bit tricky if you have no experience at all using make. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
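Following Robert's suggestion, a standalone test-directory Makefile might look roughly like this. The directory name, variable choices, and HDFS paths are assumptions pieced together from the gcc command above and the usual in-tree Makefile layout — a sketch, not a tested recipe:

```make
# Hypothetical src/test/hdfs_test/Makefile, modeled on src/test/isolation/Makefile
subdir = src/test/hdfs_test
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global

# Extra -I/-L/-l directives taken from the gcc command above
override CPPFLAGS += -I/HDFS_HOME/hdfs/src/c++/libhdfs \
                     -I/usr/lib/jvm/default-java/include
LDFLAGS += -L/HDFS_HOME/hdfs/src/c++/libhdfs \
           -L/HDFS_HOME/build/c++/Linux-i386-32/lib \
           -L/usr/lib/jvm/default-java/jre/lib/i386/server
LIBS += -ljvm -lhdfs

hdfs_test: hdfs_test.o
	$(CC) $(CFLAGS) $^ $(LDFLAGS) $(LIBS) -o $@
```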
Re: [HACKERS] hiding variable-length fields from Form_pg_* structs
On mån, 2012-01-09 at 14:04 -0500, Tom Lane wrote: #ifdef CATALOG_VARLEN /* variable-length fields start here */ to be even clearer. What would be appropriate to add instead of those inconsistently-used comments is explicit comments about the exception cases such as proargtypes, to make it clear that the placement of the #ifdef CATALOG_VARLEN is intentional and not a bug in those cases. I implemented your suggestions in the attached patch. diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm index 1ba935c..cdb0bee 100644 --- a/src/backend/catalog/Catalog.pm +++ b/src/backend/catalog/Catalog.pm @@ -143,6 +143,7 @@ sub Catalogs elsif ($declaring_attributes) { next if (/^{|^$/); +next if (/^#/); if (/^}/) { undef $declaring_attributes; @@ -150,6 +151,7 @@ sub Catalogs else { my ($atttype, $attname) = split /\s+/, $_; +die parse error ($input_file) unless $attname; if (exists $RENAME_ATTTYPE{$atttype}) { $atttype = $RENAME_ATTTYPE{$atttype}; diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c index d0138fc..a59950e 100644 --- a/src/backend/utils/cache/relcache.c +++ b/src/backend/utils/cache/relcache.c @@ -3351,15 +3351,30 @@ RelationGetIndexList(Relation relation) while (HeapTupleIsValid(htup = systable_getnext(indscan))) { Form_pg_index index = (Form_pg_index) GETSTRUCT(htup); + Datum indclassDatum; + oidvector *indclass; + bool isnull; /* Add index's OID to result list in the proper order */ result = insert_ordered_oid(result, index-indexrelid); + /* + * indclass cannot be referenced directly through the C struct, because + * it comes after the variable-width indkey field. Must extract the + * datum the hard way... 
+ */ + indclassDatum = heap_getattr(htup, + Anum_pg_index_indclass, + GetPgIndexDescriptor(), + isnull); + Assert(!isnull); + indclass = (oidvector *) DatumGetPointer(indclassDatum); + /* Check to see if it is a unique, non-partial btree index on OID */ if (index-indnatts == 1 index-indisunique index-indimmediate index-indkey.values[0] == ObjectIdAttributeNumber - index-indclass.values[0] == OID_BTREE_OPS_OID + indclass-values[0] == OID_BTREE_OPS_OID heap_attisnull(htup, Anum_pg_index_indpred)) oidIndex = index-indexrelid; } diff --git a/src/include/catalog/genbki.h b/src/include/catalog/genbki.h index 5fe3a5b..bcf31e6 100644 --- a/src/include/catalog/genbki.h +++ b/src/include/catalog/genbki.h @@ -22,6 +22,16 @@ /* Introduces a catalog's structure definition */ #define CATALOG(name,oid) typedef struct CppConcat(FormData_,name) +/* + * This is never defined; it's here only for documentation. + * + * Variable-length catalog fields (except possibly the first not nullable one) + * should not be visible in C structures, so they are made invisible by #ifdefs + * of an undefined symbol. See also MARKNOTNULL in bootstrap.c for how this is + * handled. 
+ */ +#undef CATALOG_VARLEN + /* Options that may appear after CATALOG (on the same line) */ #define BKI_BOOTSTRAP #define BKI_SHARED_RELATION diff --git a/src/include/catalog/pg_aggregate.h b/src/include/catalog/pg_aggregate.h index 4e10021..0c8a20c 100644 --- a/src/include/catalog/pg_aggregate.h +++ b/src/include/catalog/pg_aggregate.h @@ -44,7 +44,9 @@ CATALOG(pg_aggregate,2600) BKI_WITHOUT_OIDS regproc aggfinalfn; Oid aggsortop; Oid aggtranstype; - text agginitval; /* VARIABLE LENGTH FIELD */ +#ifdef CATALOG_VARLEN /* variable-length fields start here */ + text agginitval; +#endif } FormData_pg_aggregate; /* diff --git a/src/include/catalog/pg_attrdef.h b/src/include/catalog/pg_attrdef.h index 01f663c..ad770e4 100644 --- a/src/include/catalog/pg_attrdef.h +++ b/src/include/catalog/pg_attrdef.h @@ -32,8 +32,10 @@ CATALOG(pg_attrdef,2604) { Oid adrelid; /* OID of table containing attribute */ int2 adnum; /* attnum of attribute */ +#ifdef CATALOG_VARLEN /* variable-length fields start here */ pg_node_tree adbin; /* nodeToString representation of default */ text adsrc; /* human-readable representation of default */ +#endif } FormData_pg_attrdef; /* diff --git a/src/include/catalog/pg_attribute.h b/src/include/catalog/pg_attribute.h index 5cb16f6..45e38e4 100644 --- a/src/include/catalog/pg_attribute.h +++ b/src/include/catalog/pg_attribute.h @@ -145,11 +145,8 @@ CATALOG(pg_attribute,1249) BKI_BOOTSTRAP BKI_WITHOUT_OIDS BKI_ROWTYPE_OID(75) BK /* attribute's collation */ Oid attcollation; - /* - * VARIABLE LENGTH FIELDS start here. These fields may be NULL, too. - * - * NOTE: the following fields are not present in tuple descriptors!
Re: [HACKERS] Different error messages executing CREATE TABLE or ALTER TABLE to create a column xmin
On Thu, Jan 26, 2012 at 18:47, Robert Haas robertmh...@gmail.com wrote: OK, committed with that further change. Thank you, Robert! My first real contribution, even if tiny :-) Just a small nit to pick, though: Giuseppe Sucameli contributed to this patch but was not credited in the commit log.
Re: [HACKERS] Client Messages
On 26.01.2012 17:31, Tom Lane wrote: Heikki Linnakangasheikki.linnakan...@enterprisedb.com writes: The thing is, there's currently no encoding conversion happening, so if you have one database in LATIN1 encoding and another in UTF-8, for example, whatever you put in your postgresql.conf is going to be wrong for one database. I'm happy to just document the issue for per-database messages, ALTER DATABASE ... SET welcome_message, the encoding used there need to match the encoding of the database, or it's displayed as garbage. But what about per-user messages, when the user has access to several databases, or postgresql.conf? I've not looked at the patch, but what exactly will happen if the string has the wrong encoding? You get an incorrectly encoded string, ie. garbage, in your console, when you log in with psql. You can also use current_setting() to copy the incorrectly-encoded string elsewhere in the system. If you insert it into a table and run pg_dump, I think the dump might not be restorable. That's a bit of a stretch, perhaps, but it would be nice to avoid that. BTW, you can already do that if you set e.g default_text_search_config to something non-ASCII in postgresql.conf. Or if you do it with search_path, you get a warning at login. For example, I did ALTER USER foouser set search_path ='kääk'; in a LATIN1 database, and then connected to a UTF-8 database and got: $ ~/pgsql.master/bin/psql postgres foouser WARNING: invalid value for parameter search_path: k��k DETAIL: schema k��k does not exist psql (9.2devel) Type help for help. (in case that didn't get across right, I set the search_path to a string containing two a-with-umlauts, and in the warning, they got replaced with question marks with inverse colors, which is apparently a character that the console uses to display bytes that are not valid UTF-8). The problem with welcome_message would look just like that. 
No-one is likely to run into that with search_path, but it's quite reasonable and expected to use your native language in a welcome message. The idea that occurs to me is to have the code that uses the GUC do a verify_mbstr(noerror) on it, and silently ignore it if it doesn't pass (maybe with a LOG message). This would have to be documented of course, but it seems better than the potential consequences of trying to send a wrongly-encoded string. Hmm, fine with me. It would be nice to plug the hole that these bogus characters can leak elsewhere into the system through current_setting, though. Perhaps we could put the verify_mbstr() call somewhere in guc.c, to forbid incorrectly encoded characters from being stored in the guc variable in the first place. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
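The failure mode under discussion is easy to reproduce outside the server: bytes that are valid LATIN1 are frequently not valid UTF-8, which is exactly what an mb-string verification pass would catch. A small illustration (not PostgreSQL code):

```python
s = "kääk"                           # the schema name from the example
latin1_bytes = s.encode("latin-1")   # b'k\xe4\xe4k'

try:
    latin1_bytes.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

print(valid_utf8)  # False: in UTF-8, 0xE4 opens a three-byte sequence,
                   # but here it is not followed by continuation bytes
```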
Re: [HACKERS] Different error messages executing CREATE TABLE or ALTER TABLE to create a column xmin
On Thu, Jan 26, 2012 at 1:13 PM, Vik Reykja vikrey...@gmail.com wrote: On Thu, Jan 26, 2012 at 18:47, Robert Haas robertmh...@gmail.com wrote: OK, committed with that further change. Thank you, Robert! My first real contribution, even if tiny :-) Just a small nit to pick, though: Giuseppe Sucameli contributed to this patch but was not credited in the commit log. Hmm, I should have credited him as the reporter: sorry about that. I didn't use any of his code, though. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] foreign key locks, 2nd attempt
Excerpts from Alvaro Herrera's message of mar ene 24 15:47:16 -0300 2012: Need more code changes for the following: * export FOR KEY UPDATE lock mode in SQL While doing this, I realized that there's an open item here regarding a transaction that locks a tuple, and then in an aborted savepoint deletes it. As things stand, what happens is that the original tuple lock is forgotten entirely, which was one of the things I wanted to fix (and in fact is fixed for all other cases AFAICS). So what we need is to be able to store a MultiXactId that includes a member for KeyUpdate locks, which will represent an UPDATE that touches key columns as well as DELETEs. That closes the hole. However, the problem with this is that we have no more bits left in the flag bitmask, which is another concern you had raised. I chose the easy way out and added a full byte of flags per transaction. This means that we now have 1636 xacts per members page rather than 1900+, but I'm not too concerned about that. (We could cut back to 4 flag bits per xact -- but then, having some room for future growth is probably a good plan anyway). So DELETEs can also create multis. I'm going to submit an updated patch shortly. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.2] Add GUC sepgsql.client_label
2012/1/26 Robert Haas robertmh...@gmail.com: I'm wondering if a function would be a better fit than a GUC. I don't think you can really restrict the ability to revert a GUC change - i.e. if someone does a SET and then a RESET, you pretty much have to allow that. I think. But if you expose a function then it can work however you like. One benefit of using a GUC is that we can utilize the existing mechanism to revert a value set within a transaction block on error. If we implement the same functionality with functions, would an XactCallback allow sepgsql to get control at the appropriate time? On another note, this is an awfully large patch. Is there a separate patch here that just does code rearrangement that should be separated out? OK. I moved some routines related to client_label into label.c. I'll separate this part from the new functionality part. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Different error messages executing CREATE TABLE or ALTER TABLE to create a column xmin
Hi Robert, On Thu, Jan 26, 2012 at 7:59 PM, Robert Haas robertmh...@gmail.com wrote: On Thu, Jan 26, 2012 at 1:13 PM, Vik Reykja vikrey...@gmail.com wrote: On Thu, Jan 26, 2012 at 18:47, Robert Haas robertmh...@gmail.com wrote: OK, committed with that further change. thank you for the fast fix and Vik for the base patch ;) Just a small nit to pick, though: Giuseppe Sucameli contributed to this patch but was not credited in the commit log. I didn't use any of his code, though. I agree, Robert didn't use any part of my patch, so there's no reason to credit me in the log. Regards. -- Giuseppe Sucameli -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.2] Add GUC sepgsql.client_label
On Thu, Jan 26, 2012 at 2:07 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: 2012/1/26 Robert Haas robertmh...@gmail.com: I'm wondering if a function would be a better fit than a GUC. I don't think you can really restrict the ability to revert a GUC change - i.e. if someone does a SET and then a RESET, you pretty much have to allow that. I think. But if you expose a function then it can work however you like. One benefit to use GUC is that we can utilize existing mechanism to revert a value set within a transaction block on error. If we implement same functionality with functions, XactCallback allows sepgsql to get control on appropriate timing? Not sure, but I thought the use case was to set this at connection startup time and then hand the connection off to a client. What keeps the client from just issuing RESET? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parameterized inner paths
Robert Haas robertmh...@gmail.com writes: Is there a guard in here against joining a parameterized path to an intermediate relation when no SJ is involved? In other words, if we're joining a parameterized path on A to a path on B, then either the join to B should satisfy at least part of the parameterization needed by A, or there should be a special join with A and B on one side and a relation that satisfies at least part of the parameterization of A on the other. I've implemented this idea, recast a bit to prevent generating a parameterized join path in the first place unless it depends on a parameter from a relation for which there's a join ordering constraint still outstanding. It seems to get us to where the planning time penalty is only about 10%, which frankly is probably less than sampling error considering the small set of test cases I'm looking at. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] BGWriter latch, power saving
On 26 January 2012 16:48, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Ok, committed with some further cleanup. Do you think the docs need to be updated for this, and if so, where? The only place I found in the docs that speak about how the bgwriter works is in config.sgml, where bgwriter_delay is described. Want to suggest an update to that? This new behaviour might warrant a mention, as in the attached doc-patch. I did some testing on this, with a highly artificial test case that dirties pages in shared_buffers as fast as possible. I tried to make it a worst-case scenario, see attached script. I tested this with a 32-core HP Itanium box, and on my 2-core laptop, and didn't see any measurable slowdown from this patch. So I think we're good. Cool. I was pretty confident that it would be completely impossible to show a regression under any circumstances, but I'm glad that you tested it on a large server like that. BTW, do you have some sort of a test setup for these power-saving patches, to actually measure the effect on number of interrupts or electricity use? Fewer wakeups should be a good thing, but it would be nice to see some figures to get an idea of how much progress we've done and what still needs to be done. The only thing I've been using is Intel's powertop utility. It is pretty easy to see what's happening, and what you'll see is exactly what you'd expect (So by process, I could see that the bgwriter had exactly 5 wake-ups per second with bgwriter_delay at 200). It is useful to sleep quite pro-actively, as CPUs will enter idle C-states and move to lower P-states quite quickly (some will be more familiar with the commercial names for P-states, such as Intel's speedstep technology). I might have been less conservative about the circumstances that cause the bgwriter to go to sleep, but I was conscious of the need to get something into this release. 
It's difficult to know what a useful workload should be to show the benefits of this work, apart from the obvious one, which is to show Postgres when completely idle. I think we started at 11.5 wake-ups per second, and I think that's down to about 6.5 now - the WAL Writer still accounts for 5 of those, so that's why I feel it's particularly important to get it done too, though obviously that's something that should defer to however we end up implementing group commit. I intend to blog about it in the next few days, and I'll present a careful analysis of the benefits of this work there. Look out for it on planet. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml new file mode 100644 index 309b6a5..0e4dd97 *** a/doc/src/sgml/config.sgml --- b/doc/src/sgml/config.sgml *** SET ENABLE_SEQSCAN TO OFF; *** 1318,1334 /indexterm listitem para ! Specifies the delay between activity rounds for the ! background writer. In each round the writer issues writes ! for some number of dirty buffers (controllable by the ! following parameters). It then sleeps for varnamebgwriter_delay/ ! milliseconds, and repeats. The default value is 200 milliseconds ! (literal200ms/). Note that on many systems, the effective ! resolution of sleep delays is 10 milliseconds; setting ! varnamebgwriter_delay/ to a value that is not a multiple of ! 10 might have the same results as setting it to the next higher ! multiple of 10. This parameter can only be set in the ! filenamepostgresql.conf/ file or on the server command line. /para /listitem /varlistentry --- 1318,1335 /indexterm listitem para ! Specifies the delay between activity rounds for the background writer. ! In each round the writer issues writes for some number of dirty buffers ! (controllable by the following parameters). It then sleeps for ! varnamebgwriter_delay/ milliseconds, and repeats, though it will ! 
hibernate when it has looked through every buffer in ! varnameshared_buffers/varname without finding a dirty buffer. The ! default value is 200 milliseconds (literal200ms/). Note that on ! many systems, the effective resolution of sleep delays is 10 ! milliseconds; setting varnamebgwriter_delay/ to a value that is not ! a multiple of 10 might have the same results as setting it to the next ! higher multiple of 10. This parameter can only be set in the ! filenamepostgresql.conf/ file or on the server command line. /para /listitem /varlistentry -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
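The wake-up figures Peter quotes fall straight out of the delay setting: absent hibernation, the bgwriter wakes 1000/bgwriter_delay times per second. A trivial check of the numbers above:

```python
def wakeups_per_second(bgwriter_delay_ms):
    # One activity round per delay interval, when not hibernating
    return 1000 / bgwriter_delay_ms

print(wakeups_per_second(200))  # 5.0, matching the powertop observation
```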
Re: [HACKERS] Progress on fast path sorting, btree index creation time
On Thu, Jan 19, 2012 at 1:47 PM, Peter Geoghegan pe...@2ndquadrant.com wrote: Thoughts? I generated some random data using this query: create table nodups (g) as select (g%10)*1000+g/10 from generate_series(1,1) g; And then used pgbench to repeatedly sort it using this query: select * from nodups order by g offset 10001; The patch came out about 28% faster than master. Admittedly, that's with no client overhead, but still: not bad. I don't like the API you've designed, though: as you have it, PrepareSortSupportFromOrderingOp does this after calling the sort support function: + ssup-usable_compar = ResolveComparatorProper(sortFunction); I think it would be better to simply have the sort support functions set usable_compar themselves. That would allow any user-defined functions that happen to have the same binary representation and comparison rules as one of the types for which we supply a custom qsort() to use initialize it to still make use of the optimization. There's no real reason to have a separate switch to decide how to initialize that field: the sort support trampoline already does that, and I don't see any reason to introduce a second way of doing the same thing. I am also a little unhappy that we have to duplicate code the fastcmp functions from nbtcompare.c in builtins.h. Wouldn't it make more sense to declare those functions as inline in nbtcompare.c, and then call the qsort-generating macro from that file? There were a couple of comment updates in tuplesort.c that looked independent from the reset of the patch, so I've committed those separately. I also committed your change to downgrade the belt-and-suspenders check for self-comparison to an assert, with some rewording of your proposed comment. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parameterized inner paths
On Thu, Jan 26, 2012 at 2:27 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Is there a guard in here against joining a parameterized path to an intermediate relation when no SJ is involved? In other words, if we're joining a parameterized path on A to a path on B, then either the join to B should satisfy at least part of the parameterization needed by A, or there should be a special join with A and B on one side and a relation that satisfies at least part of the parameterization of A on the other. I've implemented this idea, recast a bit to prevent generating a parameterized join path in the first place unless it depends on a parameter from a relation for which there's a join ordering constraint still outstanding. It seems to get us to where the planning time penalty is only about 10%, which frankly is probably less than sampling error considering the small set of test cases I'm looking at. Awesome. If you can post the updated patch, I'll poke at it a little more and see if anything jumps out at me, but that sounds promising. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
On Wed, Jan 11, 2012 at 10:12:31PM +, Simon Riggs wrote: v7 This patch uses FPIs to guard against torn hint writes, even when the checksums are disabled. One could not simply condition them on the page_checksums setting, though. Suppose page_checksums=off and we're hinting a page previously written with page_checksums=on. If that write tears, leaving the checksum intact, that block will now fail verification. A couple of ideas come to mind. (a) Read the on-disk page and emit an FPI only if the old page had a checksum. (b) Add a checksumEnableLSN field to pg_control. Whenever the cluster starts with checksums disabled, set the field to InvalidXLogRecPtr. Whenever the cluster starts with checksums enabled and the field is InvalidXLogRecPtr, set the field to the next LSN. When a checksum failure occurs in a page with LSN older than the stored one, emit either a softer warning or no message at all. Not all WAL-bypassing operations use SetBufferCommitInfoNeedsSave() to dirty the buffer. Here are other sites I noticed that do MarkBufferDirty() without bumping the page LSN: heap_xlog_visible() visibilitymap_clear() visibilitymap_truncate() lazy_scan_heap() XLogRecordPageWithFreeSpace() FreeSpaceMapTruncateRel() fsm_set_and_search() fsm_vacuum_page() fsm_search_avail() Though I have not investigated each of these in detail, I suspect most or all represent continued torn-page hazards. Since the FSM is just a hint, we do not WAL-log it at all; it would be nicest to always skip checksums for it, too. The visibility map shortcuts are more nuanced. When page_checksums=off and we read a page last written by page_checksums=on, we still verify its checksum. If a mostly-insert/read site uses checksums for some time and eventually wishes to shed the overhead, disabling the feature will not stop the costs for reads of old data. They would need a dump/reload to get the performance of a never-checksummed database. 
Let's either unconditionally skip checksum validation under page_checksums=off or add a new state like page_checksums=ignore for that even-weaker condition. --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml + para +Turning this parameter off speeds normal operation, but +might allow data corruption to go unnoticed. The checksum uses +16-bit checksums, using the fast Fletcher 16 algorithm. With this +parameter enabled there is still a non-zero probability that an error +could go undetected, as well as a non-zero probability of false +positives. + /para What sources of false positives do you intend to retain? --- a/src/backend/catalog/storage.c +++ b/src/backend/catalog/storage.c @@ -20,6 +20,7 @@ #include postgres.h #include access/visibilitymap.h +#include access/transam.h #include access/xact.h #include access/xlogutils.h #include catalog/catalog.h @@ -70,6 +71,7 @@ static PendingRelDelete *pendingDeletes = NULL; /* head of linked list */ /* XLOG gives us high 4 bits */ #define XLOG_SMGR_CREATE 0x10 #define XLOG_SMGR_TRUNCATE 0x20 +#define XLOG_SMGR_HINT 0x40 typedef struct xl_smgr_create { @@ -477,19 +479,74 @@ AtSubAbort_smgr(void) smgrDoPendingDeletes(false); } +/* + * Write a backup block if needed when we are setting a hint. + * + * Deciding the if needed bit is delicate and requires us to either + * grab WALInsertLock or check the info_lck spinlock. If we check the + * spinlock and it says Yes then we will need to get WALInsertLock as well, + * so the design choice here is to just go straight for the WALInsertLock + * and trust that calls to this function are minimised elsewhere. + * + * Callable while holding share lock on the buffer content. + * + * Possible that multiple concurrent backends could attempt to write + * WAL records. In that case, more than one backup block may be recorded + * though that isn't important to the outcome and the backup blocks are + * likely to be identical anyway. 
+ */ +#define SMGR_HINT_WATERMARK 13579 +void +smgr_buffer_hint(Buffer buffer) +{ + /* + * Make an XLOG entry reporting the hint + */ + XLogRecPtr lsn; + XLogRecData rdata[2]; + int watermark = SMGR_HINT_WATERMARK; + + /* + * Not allowed to have zero-length records, so use a small watermark + */ + rdata[0].data = (char *) (watermark); + rdata[0].len = sizeof(int); + rdata[0].buffer = InvalidBuffer; + rdata[0].buffer_std = false; + rdata[0].next = (rdata[1]); + + rdata[1].data = NULL; + rdata[1].len = 0; + rdata[1].buffer = buffer; + rdata[1].buffer_std = true; + rdata[1].next = NULL; + + lsn = XLogInsert(RM_SMGR_ID, XLOG_SMGR_HINT, rdata); + + /* + * Set the page LSN if we wrote a backup block. +
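For reference, a common formulation of the Fletcher-16 checksum named in the doc patch above — this is the textbook algorithm, not necessarily the exact variant the patch computes over a page image:

```python
def fletcher16(data: bytes) -> int:
    """Textbook Fletcher-16: two running sums, each modulo 255."""
    sum1 = 0
    sum2 = 0
    for byte in data:
        sum1 = (sum1 + byte) % 255
        sum2 = (sum2 + sum1) % 255
    return (sum2 << 8) | sum1

print(hex(fletcher16(b"abcde")))  # 0xc8f0, the standard test vector
```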
Re: [HACKERS] Progress on fast path sorting, btree index creation time
On 26 January 2012 19:45, Robert Haas robertmh...@gmail.com wrote: The patch came out about 28% faster than master. Admittedly, that's with no client overhead, but still: not bad. Thanks. There was a 28% reduction in the time it took to execute the query, but there would have also been a larger reduction in the time that the backend held that all of that locally-allocated memory. That might also be worth instrumenting directly, by turning on trace_sort - can you report numbers on that, please? I don't like the API you've designed, though: as you have it, PrepareSortSupportFromOrderingOp does this after calling the sort support function: + ssup-usable_compar = ResolveComparatorProper(sortFunction); I think it would be better to simply have the sort support functions set usable_compar themselves. That would allow any user-defined functions that happen to have the same binary representation and comparison rules as one of the types for which we supply a custom qsort() to use initialize it to still make use of the optimization. There's no real reason to have a separate switch to decide how to initialize that field: the sort support trampoline already does that, and I don't see any reason to introduce a second way of doing the same thing. Hmm. You're right. I can't believe that that didn't occur to me. In practice, types that use the SortSupport API are all going to be façades on scalar types anyway, much like date and timestamp, and of those a good proportion will surely have the same comparator representation as the specialisations introduced by this patch. It might be that virtually all third-party types that end up using the API can avail of some specialisation. I am also a little unhappy that we have to duplicate code the fastcmp functions from nbtcompare.c in builtins.h. Wouldn't it make more sense to declare those functions as inline in nbtcompare.c, and then call the qsort-generating macro from that file? 
Maybe it would, but since the meta-qsort_arg introduced includes partial duplicates of code from tuplesort.c, it kind of felt right to instantiate specialisations there. It may be that doing it in nbtcompare.c is the best option available to us. Off the top of my head, I'm pretty sure that that's a good bit less code. There were a couple of comment updates in tuplesort.c that looked independent from the rest of the patch, so I've committed those separately. I also committed your change to downgrade the belt-and-suspenders check for self-comparison to an assert, with some rewording of your proposed comment. That seems reasonable. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Progress on fast path sorting, btree index creation time
On Thu, Jan 26, 2012 at 4:09 PM, Peter Geoghegan pe...@2ndquadrant.com wrote: On 26 January 2012 19:45, Robert Haas robertmh...@gmail.com wrote: The patch came out about 28% faster than master. Admittedly, that's with no client overhead, but still: not bad. Thanks. There was a 28% reduction in the time it took to execute the query, but there would have also been a larger reduction in the time that the backend held all of that locally-allocated memory. That might also be worth instrumenting directly, by turning on trace_sort - can you report numbers on that, please? Apparently not. The sort is too short to register in the trace_sort output. I just get:
LOG: begin tuple sort: nkeys = 1, workMem = 1024, randomAccess = f
LOG: performsort starting: CPU 0.00s/0.00u sec elapsed 0.00 sec
LOG: performsort done: CPU 0.00s/0.00u sec elapsed 0.00 sec
LOG: internal sort ended, 853 KB used: CPU 0.00s/0.00u sec elapsed 0.00 sec
I don't like the API you've designed, though: as you have it, PrepareSortSupportFromOrderingOp does this after calling the sort support function: + ssup->usable_compar = ResolveComparatorProper(sortFunction); I think it would be better to simply have the sort support functions set usable_compar themselves. That would allow any user-defined types that happen to have the same binary representation and comparison rules as one of the types for which we supply a custom qsort() to still make use of the optimization. There's no real reason to have a separate switch to decide how to initialize that field: the sort support trampoline already does that, and I don't see any reason to introduce a second way of doing the same thing. Hmm. You're right. I can't believe that that didn't occur to me. 
In practice, types that use the SortSupport API are all going to be façades on scalar types anyway, much like date and timestamp, and of those a good proportion will surely have the same comparator representation as the specialisations introduced by this patch. It might be that virtually all third-party types that end up using the API can avail of some specialisation. Possibly. At a minimum it keeps the door open. I am also a little unhappy that we have to duplicate the fastcmp functions from nbtcompare.c in builtins.h. Wouldn't it make more sense to declare those functions as inline in nbtcompare.c, and then call the qsort-generating macro from that file? Maybe it would, but since the meta-qsort_arg introduced includes partial duplicates of code from tuplesort.c, it kind of felt right to instantiate specialisations there. It may be that doing it in nbtcompare.c is the best option available to us. Off the top of my head, I'm pretty sure that that's a good bit less code. I was hoping so... There were a couple of comment updates in tuplesort.c that looked independent from the rest of the patch, so I've committed those separately. I also committed your change to downgrade the belt-and-suspenders check for self-comparison to an assert, with some rewording of your proposed comment. That seems reasonable. Cool. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Command Triggers
Dimitri Fontaine dimi...@2ndquadrant.fr writes: Really I think there is not any single point where you can put the command-trigger hook and be done. In almost every case, the right place is going to be buried somewhere within the execution of the command. I'm finally realizing it. I already introduced a structure called CommandContextData to keep track of the current command elements we want to pass down to command triggers; I had been thinking that this should be a static variable that commands would need to edit. In fact, passing it as an argument to the command trigger API is much simpler, and that is done in the attached. I'm having problems here with my install and not enough time this week (you don't speak English if you don't use understatements here and there, right?), so please expect a one-step-further patch that shows the integration concept, not a ready-for-commit one just yet. Next steps are about adding support for more commands, and now that we have settled on a simpler integration that will be easy. The parameters sent to the trigger procedure are now the command tag, the main object Oid, its schema name and its object name. Only the command tag will never be NULL; all the other columns can be left out when calling an ANY COMMAND trigger, or a command on a schemaless object. Note: the per-command integration means the Oid is generally available, so let's just export it. An ack about the way it's now implemented would be awesome, and then we could begin to discuss exactly which set of commands we want supported from the get-go (defaulting to all of them, of course, but I don't think that's necessarily the best use of our time given our command list). Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support command-trigger.v6.patch.gz Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Progress on fast path sorting, btree index creation time
Alright, so while I agree with everything you've asked for, the fact is that there is a controversy in relation to binary bloat, and that's the blocker here. How can we satisfactorily resolve that, or is that question adequately addressed by the benchmark that I produced? What if third party modules could include the template_qsort_arg.h header, and instantiate their own specialisation? The sort support function could either set an enum, or set a pointer to a qsort_arg specialisation, if their comparator were a little different, but still just a few instructions, as with float-based timestamps (I don't care enough about them to provide one in core, though). It would also essentially allow for user-defined sort functions, provided they fulfilled a basic interface. They may not even have to be comparison-based. I know that I expressed scepticism about the weird and wonderful ideas that some people have put forward in that area, but that's mostly because I don't think that GPU-based sorting in a database is going to be practical. Why not add this capability to the SortSupport API, given that it wouldn't be very hard? The user would set the enum too, so it would appear in EXPLAIN ANALYZE output as a custom sort or something like that. While a sorting specialisation of our qsort_arg is actually pretty close to optimal for scalar types and façades thereof, there could be a type that would benefit from using radix sort, for example, which isn't even comparison-based, and a user couldn't very well do that with the existing SortSupport API. I certainly don't care about this capability enough to defend it against any objections that anyone may have, especially at this late stage in the cycle. I just think that we might as well have it. 
-- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parameterized inner paths
Robert Haas robertmh...@gmail.com writes: ... In an ideal world, I'd like the amount of effort we spend planning to be somehow tied to the savings we can expect to get, and deploy optimizations like this only in cases where we have a reasonable expectation of that effort being repaid. BTW, so far as that goes: the only way I know to significantly cut the planner's join-planning resource consumption in any run-time-tunable fashion is to restrict it to planning subproblems separately, as for instance via from_collapse_limit/join_collapse_limit. Now, whatever the merits of those specific heuristics, the killer issue is that maybe you really needed a join order different from what the subproblem division entails. I believe that this patch can help with that. If the issue is that you really don't want to form the entire result of a specific subproblem subquery, that can be dealt with by treating the subquery as a parameterizable path, such that it can be on the inside of a nestloop with whatever other relation is supplying the parameter. Or in other words, the reason from_collapse_limit/join_collapse_limit sometimes lead to bad plans is really that they represent artificial join order restrictions. So I think this patch ought to be able to alleviate the worst cases of that. Or at least it did a few hours ago. What we'll probably want to do is tweak the path-formation heuristic you suggested so that joins to relations outside the current subproblem are treated like outer joins and allowed to form parameterized paths. Anyway this is the sort of thing that I hope to investigate after the basic patch is in. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inline Extension
On Jan 26, 2012, at 9:40 AM, Dimitri Fontaine wrote: So I'm going to prepare the next version of the patch with this design:
- in catalog extension scripts for inline extension pg_extension_script(extoid, oldversion, version, script) oldversion is null when create extension is used, except when using the create extension from 'unpackaged' form Would you keep all the migration scripts used over time to upgrade from one version to another?
- see about adding more control properties in the catalog?
- current code that is parsing the filenames to determine the upgrade path will have to be able to take the version strings from the new catalog as an alternative, and getting to the script content must be able to select from the catalog or read a file on disk
- pg_dump defaults to not dumping extension content
- pg_dump --include-extension-scripts dumps the scripts found either in the filesystem or the catalog, a create script first then any number of update scripts as needed to reach the current installed version
- same as we have -t, add -e --extension to pg_dump so that you can choose to dump only a given extension Also --exclude-extension?
The extension dumping will not include the shared modules, so if your extension depends on them being installed on the server, you will be much better served with some OS-level packaging. Or you must make sure it’s installed on the system before you restore. Not for 9.2, but I can't help thinking that if we could manage to host the .so module itself in the catalogs, we could solve updating it in a transactional way and more importantly host it per-database, rather than having the modules work per major version (not even per cluster) and the extension mechanism work per-database inside each cluster. But that's work for another release. +1 Cloud vendors will *love* this. Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
Some other comments on the checksum patch: I'm not sure why you moved the checksum calculation (PageSetVerificationInfo) to mdwrite() rather than smgrwrite(). If there were ever another storage backend, it would have to duplicate the checksum check, right? Is there a disadvantage to putting it in smgrwrite()? You may have already noticed this, but a bunch of the comments are incorrect, now that you have moved the checksum calculation to mdwrite(). config.sgml - says writes via temp_buffers (which I think means local buffers) are not checksummed -- that's no longer true, right? bufmgr.c, line 1914 - bufToWrite no longer exists. You could revert changes from 1912-1920 localbuf.c, line 203 - as mentioned below, this comment is no longer relevant, you are checksumming local buffers now Dan - Original Message - From: Noah Misch n...@leadboat.com To: Simon Riggs si...@2ndquadrant.com Cc: Heikki Linnakangas heikki.linnakan...@enterprisedb.com, Robert Haas robertmh...@gmail.com, Andres Freund and...@anarazel.de, Kevin Grittner kevin.gritt...@wicourts.gov, da...@fetter.org, ai...@highrise.ca, st...@mit.edu, pgsql-hackers@postgresql.org Sent: Thursday, January 26, 2012 12:20:39 PM Subject: Re: [HACKERS] 16-bit page checksums for 9.2 On Wed, Jan 11, 2012 at 10:12:31PM +, Simon Riggs wrote: v7 This patch uses FPIs to guard against torn hint writes, even when the checksums are disabled. One could not simply condition them on the page_checksums setting, though. Suppose page_checksums=off and we're hinting a page previously written with page_checksums=on. If that write tears, leaving the checksum intact, that block will now fail verification. A couple of ideas come to mind. (a) Read the on-disk page and emit an FPI only if the old page had a checksum. (b) Add a checksumEnableLSN field to pg_control. Whenever the cluster starts with checksums disabled, set the field to InvalidXLogRecPtr. 
Whenever the cluster starts with checksums enabled and the field is InvalidXLogRecPtr, set the field to the next LSN. When a checksum failure occurs in a page with LSN older than the stored one, emit either a softer warning or no message at all. Not all WAL-bypassing operations use SetBufferCommitInfoNeedsSave() to dirty the buffer. Here are other sites I noticed that do MarkBufferDirty() without bumping the page LSN: heap_xlog_visible() visibilitymap_clear() visibilitymap_truncate() lazy_scan_heap() XLogRecordPageWithFreeSpace() FreeSpaceMapTruncateRel() fsm_set_and_search() fsm_vacuum_page() fsm_search_avail() Though I have not investigated each of these in detail, I suspect most or all represent continued torn-page hazards. Since the FSM is just a hint, we do not WAL-log it at all; it would be nicest to always skip checksums for it, too. The visibility map shortcuts are more nuanced. When page_checksums=off and we read a page last written by page_checksums=on, we still verify its checksum. If a mostly-insert/read site uses checksums for some time and eventually wishes to shed the overhead, disabling the feature will not stop the costs for reads of old data. They would need a dump/reload to get the performance of a never-checksummed database. Let's either unconditionally skip checksum validation under page_checksums=off or add a new state like page_checksums=ignore for that even-weaker condition. --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml + para +Turning this parameter off speeds normal operation, but +might allow data corruption to go unnoticed. The checksum uses +16-bit checksums, using the fast Fletcher 16 algorithm. With this +parameter enabled there is still a non-zero probability that an error +could go undetected, as well as a non-zero probability of false +positives. + /para What sources of false positives do you intend to retain? 
--- a/src/backend/catalog/storage.c +++ b/src/backend/catalog/storage.c @@ -20,6 +20,7 @@ #include postgres.h #include access/visibilitymap.h +#include access/transam.h #include access/xact.h #include access/xlogutils.h #include catalog/catalog.h @@ -70,6 +71,7 @@ static PendingRelDelete *pendingDeletes = NULL; /* head of linked list */ /* XLOG gives us high 4 bits */ #define XLOG_SMGR_CREATE 0x10 #define XLOG_SMGR_TRUNCATE 0x20 +#define XLOG_SMGR_HINT 0x40 typedef struct xl_smgr_create { @@ -477,19 +479,74 @@ AtSubAbort_smgr(void) smgrDoPendingDeletes(false); } +/* + * Write a backup block if needed when we are setting a hint. + * + * Deciding the if needed bit is delicate and requires us to either + * grab WALInsertLock or check the info_lck spinlock. If we check the + * spinlock and it says Yes then we will need to get WALInsertLock as well, + * so the design choice here is to just go straight for the
[HACKERS] PGCon 2012 Call for Papers - reminder
A reminder, there are three days left to get in before the deadline… PGCon 2012 will be held 17-18 May 2012, in Ottawa at the University of Ottawa. It will be preceded by two days of tutorials on 15-16 May 2012. We are now accepting proposals for talks. Proposals can be quite simple. We do not require academic-style papers. If you are doing something interesting with PostgreSQL, please submit a proposal. You might be one of the backend hackers or work on a PostgreSQL-related project and want to share your know-how with others. You might be developing an interesting system using PostgreSQL as the foundation. Perhaps you migrated from another database to PostgreSQL and would like to share details. These, and other stories, are welcome. Both users and developers are encouraged to share their experiences. Here are some ideas to jump-start your proposal process:
- novel ways in which PostgreSQL is used
- migration of production systems from another database
- data warehousing
- tuning PostgreSQL for different work loads
- replication and clustering
- hacking the PostgreSQL code
- PostgreSQL derivatives and forks
- applications built around PostgreSQL
- benchmarking and performance engineering
- case studies
- location-aware and mapping software with PostGIS
- the latest PostgreSQL features and features in development
- research and teaching with PostgreSQL
- things the PostgreSQL project could do better
- integrating PostgreSQL with 3rd-party software
The schedule is: 8 Jan 2012 Proposal acceptance begins 29 Jan 2012 Proposal acceptance ends 19 Feb 2012 Confirmation of accepted proposals See also http://www.pgcon.org/2012/papers.php Instructions for submitting a proposal to PGCon 2012 are available from: http://www.pgcon.org/2012/submissions.php -- Dan Langille - http://langille.org -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] hiding variable-length fields from Form_pg_* structs
Peter Eisentraut pete...@gmx.net writes: On mån, 2012-01-09 at 14:04 -0500, Tom Lane wrote: #ifdef CATALOG_VARLEN /* variable-length fields start here */ to be even clearer. What would be appropriate to add instead of those inconsistently-used comments is explicit comments about the exception cases such as proargtypes, to make it clear that the placement of the #ifdef CATALOG_VARLEN is intentional and not a bug in those cases. I implemented your suggestions in the attached patch. This looks ready to go to me, except for one trivial nit: in pg_extension.h, please keep the comment pointing out that extversion should never be null. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Progress on fast path sorting, btree index creation time
On Thu, Jan 26, 2012 at 5:10 PM, Peter Geoghegan pe...@2ndquadrant.com wrote: Alright, so while I agree with everything you've asked for, the fact is that there is a controversy in relation to binary bloat, and that's the blocker here. How can we satisfactorily resolve that, or is that question adequately addressed by the benchmark that I produced? I could go either way on that, honestly. I took a look today and found out that on my machine, a postgres 9.0.x executable was about 282kB larger than an 8.4 postgres executable. 9.1 was about 386kB larger than that, and 9.2 current is 239kB larger still. So it's not as if your patch is the first one to enlarge the size of the binary - it's obviously been growing steadily from release to release for years. I found that your patch adds 55kB to the executable size, which is clearly a lot more than what a typical patch adds, but still under 1%. So I wouldn't be shocked and appalled if we decided to commit this as-is. But if we want to put it on a diet, the first thing I'd probably be inclined to lose is the float4 specialization. Some members of the audience will recall that I take dim view of floating point arithmetic generally, but I'll concede that there are valid reasons for using float8. I have a harder time coming up with a good reason to use float4 - ever, for anything you care about. So I would be inclined to think that if we want to trim this back a bit, maybe that's the one to let go. If we want to be even more aggressive, the next thing I'd probably lose is the optimization of multiple sortkey cases, on the theory that single sort keys are probably by far the most common practical case. I'm not surprised that you weren't able to measure a performance regression from the binary bloat. Any such regression is bound to be very small and probably quite difficult to notice most of the time; it's really the cumulative effect of many binary-size-increasing changes we're worried about, not each individual one. 
Certainly, trying to shrink the binary is micro-optimization at its finest, but then so is inlining comparators. I don't think it can realistically be argued that increasing the size of the binary arbitrarily will never get us in trouble, much like (for a typical American family) spending $30 to have dinner at a cheap restaurant will never break the budget. But if you do it every day, it gets expensive (and fattening). What if third party modules could include the template_qsort_arg.h header, and instantiate their own specialisation? Seems reasonable to me. The sort support function could either set an enum, or set a pointer to a qsort_arg specialisation, if their comparator were a little different, but still The latter seems better. just a few instructions, as with float-based timestamps (I don't care enough about them to provide one in core, though). It would also essentially allow for user-defined sort functions, provided they fulfilled a basic interface. They may not even have to be comparison-based. I know that I expressed scepticism about the weird and wonderful ideas that some people have put forward in that area, but that's mostly because I don't think that GPU-based sorting in a database is going to be practical. A question for another day. Why not add this capability to the SortSupport API, given that it wouldn't be very hard? The user would set the enum too, so it would appear in EXPLAIN ANALYZE output as a custom sort or something like that. I'm unenthused about going to any trouble to change the EXPLAIN format. While a sorting specialisation of our qsort_arg is actually pretty close to optimal for scalar types and façades thereof, there could be a type that would benefit from using radix sort, for example, which isn't even comparison-based, and a user couldn't very well do that with the existing SortSupport API. 
I certainly don't care about this capability enough to defend it against any objections that anyone may have, especially at this late stage in the cycle. I just think that we might as well have it. I don't see any reason not to, assuming it's not a lot of code. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
At 2012-01-15 11:08:05 -0500, and...@dunslane.net wrote: Here's an update that adds row_to_json, plus a bit more cleanup. I've started reviewing this patch, but it'll take me a bit longer to go through json.c properly. Here are a few preliminary notes: 1. The patch has a lot of whitespace errors (primarily lines ending in whitespace), but applying with git apply --whitespace=fix isn't safe, because the test results need some (but not all) of those spaces. I applied the patch, backed out the changes to expected/json.out, and created the file from the patch, then removed the superfluous whitespace. 2. I bumped some function OIDs to avoid conflicts. 3. One documentation typo. Everything other than json.c (which I haven't read yet) looks fine (builds, passes tests). I've attached a patch covering the changes I made. More later. -- ams diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 4f8b35e..ce4c4f6 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -9626,7 +9626,7 @@ table2-mapping /indexterm para -This section descripbes the functions that are available for creating +This section describes the functions that are available for creating JSON (see xref linkend=datatype-json) data. 
/para diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h index cdfa4cc..02c8679 100644 --- a/src/include/catalog/pg_proc.h +++ b/src/include/catalog/pg_proc.h @@ -4027,11 +4027,11 @@ DATA(insert OID = 323 ( json_recv PGNSP PGUID 12 1 0 0 0 f f f t f s 1 0 11 DESCR(I/O); DATA(insert OID = 324 ( json_send PGNSP PGUID 12 1 0 0 0 f f f t f s 1 0 17 114 _null_ _null_ _null_ _null_ json_send _null_ _null_ _null_ )); DESCR(I/O); -DATA(insert OID = 3144 ( query_to_json PGNSP PGUID 12 1 0 0 0 f f f t f s 2 0 114 25 16 _null_ _null_ _null_ _null_ query_to_json _null_ _null_ _null_ )); +DATA(insert OID = 3153 ( query_to_json PGNSP PGUID 12 1 0 0 0 f f f t f s 2 0 114 25 16 _null_ _null_ _null_ _null_ query_to_json _null_ _null_ _null_ )); DESCR(I/O); -DATA(insert OID = 3145 ( array_to_json PGNSP PGUID 12 1 0 0 0 f f f t f s 1 0 114 2277 _null_ _null_ _null_ _null_ array_to_json _null_ _null_ _null_ )); +DATA(insert OID = 3154 ( array_to_json PGNSP PGUID 12 1 0 0 0 f f f t f s 1 0 114 2277 _null_ _null_ _null_ _null_ array_to_json _null_ _null_ _null_ )); DESCR(I/O); -DATA(insert OID = 3146 ( row_to_json PGNSP PGUID 12 1 0 0 0 f f f t f s 1 0 114 2249 _null_ _null_ _null_ _null_ row_to_json _null_ _null_ _null_ )); +DATA(insert OID = 3155 ( row_to_json PGNSP PGUID 12 1 0 0 0 f f f t f s 1 0 114 2249 _null_ _null_ _null_ _null_ row_to_json _null_ _null_ _null_ )); DESCR(I/O); /* uuid */ diff --git a/src/test/regress/expected/json.out b/src/test/regress/expected/json.out index b975d72..1984106 100644 --- a/src/test/regress/expected/json.out +++ b/src/test/regress/expected/json.out @@ -278,11 +278,11 @@ SELECT query_to_json('select x as b, x * 2 as c from generate_series(1,3) x',tru (1 row) SELECT query_to_json(' - SELECT $$a$$ || x AS b, - y AS c, + SELECT $$a$$ || x AS b, + y AS c, ARRAY[ROW(x.*,ARRAY[1,2,3]), - ROW(y.*,ARRAY[4,5,6])] AS z - FROM generate_series(1,2) x, + ROW(y.*,ARRAY[4,5,6])] AS z + FROM generate_series(1,2) x, generate_series(4,5) 
y',true); query_to_json -- @@ -299,19 +299,19 @@ SELECT query_to_json('select array_agg(x) as d from generate_series(5,10) x',fal (1 row) -- array_to_json -SELECT array_to_json(array_agg(x)) +SELECT array_to_json(array_agg(x)) FROM generate_series(1,10) x; array_to_json [1,2,3,4,5,6,7,8,9,10] (1 row) -SELECT array_to_json(array_agg(q)) -FROM (SELECT $$a$$ || x AS b, - y AS c, +SELECT array_to_json(array_agg(q)) +FROM (SELECT $$a$$ || x AS b, + y AS c, ARRAY[ROW(x.*,ARRAY[1,2,3]), - ROW(y.*,ARRAY[4,5,6])] AS z - FROM generate_series(1,2) x, + ROW(y.*,ARRAY[4,5,6])] AS z + FROM generate_series(1,2) x, generate_series(4,5) y) q; array_to_json --- @@ -331,12 +331,12 @@ SELECT row_to_json(row(1,'foo')); {f1:1,f2:foo} (1 row) -SELECT row_to_json(q) -FROM (SELECT $$a$$ || x AS b, - y AS c, +SELECT
Re: [HACKERS] Progress on fast path sorting, btree index creation time
On 27 January 2012 03:32, Robert Haas robertmh...@gmail.com wrote: But if we want to put it on a diet, the first thing I'd probably be inclined to lose is the float4 specialization. Some members of the audience will recall that I take dim view of floating point arithmetic generally, but I'll concede that there are valid reasons for using float8. I have a harder time coming up with a good reason to use float4 - ever, for anything you care about. So I would be inclined to think that if we want to trim this back a bit, maybe that's the one to let go. If we want to be even more aggressive, the next thing I'd probably lose is the optimization of multiple sortkey cases, on the theory that single sort keys are probably by far the most common practical case. Obviously I don't think that we should let anything go, as the improvement in performance is so large that we're bound to be ahead - we only really pay for what we use anyway, and when we use a specialisation the difference is really quite big, particularly when you look at sorting in isolation. If a specialisation is never used, it is more or less never paid for, so there's no point in worrying about that. That said, float4 is obviously the weakest link. I'm inclined to think that float8 is the second weakest though, mostly because we get both dates and timestamps for free with the integer specialisations. I'm not surprised that you weren't able to measure a performance regression from the binary bloat. Any such regression is bound to be very small and probably quite difficult to notice most of the time; it's really the cumulative effect of many binary-size-increasing changes we're worried about, not each individual one. Certainly, trying to shrink the binary is micro-optimization at its finest, but then so is inlining comparators. 
I don't think it can realistically be argued that increasing the size of the binary arbitrarily will never get us in trouble, much like (for a typical American family) spending $30 to have dinner at a cheap restaurant will never break the budget. But if you do it every day, it gets expensive (and fattening). Sure. At the risk of stating the obvious, and of repeating myself, I will point out that the true cost of increasing the size of the binary is not necessarily linear - it's a complex equation. I hope that this doesn't sound flippant, but if some naive person were to look at just the increasing binary size of Postgres and its performance in each successive release, they might conclude that there was a positive correlation between the two (since we didn't add flab to the binary, but muscle that pulls its own weight and then some). At the continued risk of stating the obvious, CPUs don't just cache instructions - they cache data too. If we spend less than half the time sorting data, which is the level of improvement I was able to demonstrate against pre-SortSupport Postgres, that will surely very often have the aggregate effect of ameliorating cache contention between cores. just a few instructions, as with float-based timestamps (I don't care enough about them to provide one in core, though). It would also essentially allow for user-defined sort functions, provided they fulfilled a basic interface. They may not even have to be comparison-based. I know that I expressed scepticism about the weird and wonderful ideas that some people have put forward in that area, but that's mostly because I don't think that GPU-based sorting in a database is going to be practical. A question for another day. Fair enough. I certainly don't care about this capability enough to defend it against any objections that anyone may have, especially at this late stage in the cycle. I just think that we might as well have it. 
I don't see any reason not to, assuming it's not a lot of code. Good. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Speed dblink using alternate libpq tuple storage
Thank you for the comment. First, my priority is on-the-fly result processing, not allocation optimization. And this patch seems to make it possible: I can process results row-by-row, without the need to buffer all of them in PQresult. Which is great! But the current API seems clumsy; I guess it's because the patch grew from trying to replace the low-level allocator. Exactly. I would like to propose a better one-shot API with: void *(*RowStoreHandler)(PGresult *res, PGresAttValue *columns); where the PGresAttValue * is allocated once, inside PQresult, and the pointers inside point directly to the network buffer. Good catch, thank you. The patch is dragging too much from the old implementation. There is no need to copy the data inside getAnotherTuple, as you say. Of course this requires replacing the current per-column malloc+copy pattern with a per-row parse+handle pattern, but I think the resulting API will be better: 1) Pass-through processing does not need to care about unnecessary per-row allocations. 2) Handlers that want a copy of the row (like regular libpq) can optimize allocations by having a global view of the row (e.g. one allocation for row header + data). This also optimizes call patterns - first libpq parses the packet, then the row handler processes the row, with no unnecessary back-and-forth. Summary: the current API makes various assumptions about how the row is processed; let's remove those. Thank you, I will rewrite the patch to realize this. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
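For the archives, here is a minimal, self-contained sketch of the per-row parse+handle pattern being proposed. The struct layouts, the toy NUL-terminated "wire format", and the helper names are invented for illustration - the real PGresAttValue and getAnotherTuple live inside libpq - but it shows the key property: the handler receives column pointers straight into the network buffer, with no per-column malloc+copy.

```c
#include <stddef.h>
#include <string.h>

/* Minimal stand-ins for the libpq types named in the proposal;
 * the real definitions are in libpq's internal headers. */
typedef struct {
    int   len;      /* value length, or -1 for NULL */
    char *value;    /* points directly into the network buffer */
} PGresAttValue;

typedef struct {
    int ncolumns;
} PGresult;

/* The proposed one-shot hook: called once per parsed row. */
typedef void *(*RowStoreHandler)(PGresult *res, PGresAttValue *columns);

static int rows_seen;   /* demo state for the sample handler below */

static void *
count_handler(PGresult *res, PGresAttValue *columns)
{
    (void) res;
    (void) columns;     /* a real handler would consume or copy the row here */
    rows_seen++;
    return NULL;
}

/* Sketch of the per-row parse+handle loop: one reused column array,
 * values left in place in `buf`, one handler call per row.  Toy wire
 * format: each value is a NUL-terminated string; assumes <= 8 columns. */
static void
process_rows(PGresult *res, char *buf, int nrows, RowStoreHandler handler)
{
    PGresAttValue columns[8];   /* allocated once, reused for every row */
    char *p = buf;

    for (int r = 0; r < nrows; r++) {
        for (int c = 0; c < res->ncolumns; c++) {
            columns[c].len = (int) strlen(p);
            columns[c].value = p;            /* zero-copy: points into buf */
            p += columns[c].len + 1;
        }
        handler(res, columns);               /* parse, then handle - no back-and-forth */
    }
}
```

A copying handler would replace count_handler and do one allocation sized from the whole row, which is exactly point 2) above.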
Re: [HACKERS] BGWriter latch, power saving
On 26.01.2012 21:37, Peter Geoghegan wrote: On 26 January 2012 16:48, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Ok, committed with some further cleanup. Do you think the docs need to be updated for this, and if so, where? The only place I found in the docs that speaks about how the bgwriter works is in config.sgml, where bgwriter_delay is described. Want to suggest an update to that? This new behaviour might warrant a mention, as in the attached doc-patch. Hmm, the word hibernate might confuse people in this context. I committed this as: It then sleeps for <varname>bgwriter_delay</varname> milliseconds, and repeats. When there are no dirty buffers in the buffer pool, though, it goes into a longer sleep regardless of <varname>bgwriter_delay</varname>. Thanks! -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
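The committed behaviour amounts to a simple policy: sleep bgwriter_delay between rounds, but sleep much longer when a round found no dirty buffers, relying on the process latch to wake the bgwriter early if work arrives. A hypothetical sketch of that decision (the multiplier and names here are illustrative, not the committed code):

```c
#include <stdbool.h>

/* Illustrative stand-in for the real hibernation multiplier. */
#define HIBERNATE_FACTOR 50

/* Decide how long the bgwriter should sleep before its next round:
 * the normal bgwriter_delay when the last round wrote dirty buffers,
 * or a much longer "hibernation" sleep when the buffer pool had
 * nothing dirty.  In the server this sleep is a latch wait, so a
 * backend that dirties a buffer can set the latch and wake the
 * bgwriter immediately, regardless of which sleep it is in. */
static int
next_sleep_ms(bool found_dirty_buffers, int bgwriter_delay_ms)
{
    return found_dirty_buffers
        ? bgwriter_delay_ms
        : bgwriter_delay_ms * HIBERNATE_FACTOR;
}
```

The power saving comes from the idle case: an otherwise quiet server no longer wakes the CPU every bgwriter_delay milliseconds.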