Re: [HACKERS] patch for ja.po
Hi,

Honda Shigehiro wrote:
> I think this is already patched in cvs of message catalogs. Could you try psql.po from
> http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/pgtranslation/messages/ja/psql.po

I confirmed it. Sorry for the noise.

regards,

--
Tatsuhito Kasahara
kasahara.tatsuh...@oss.ntt.co.jp

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] problem with plural-forms
On Monday 25 May 2009 19:11:24 Zdenek Kotala wrote:
> I tried to run msgfmt -v ... on solaris and I got the following error:
>
> Processing file psql-cs.po...
> GNU PO file found.
> Generating the MO file in the GNU MO format.
> Processing file psql-cs.po...
> Lines 1311, 1312 (psql-cs.po): incompatible printf-format.
> 0 format specifier(s) in msgid, but 1 format specifier(s) in msgstr.
> ...
>
> Problem is in:
>
> #: print.c:2351
> #, c-format
> msgid "(1 row)"
> msgid_plural "(%lu rows)"
> msgstr[0] "(%lu řádka)"
> msgstr[1] "(%lu řádky)"
> msgstr[2] "(%lu řádek)"
>
> The problem here is "(1 row)" instead of "(%lu row)". When I run msgfmt without -v everything works fine but I think we should fix it (there are more occurrences of this issue).

GNU gettext accepts this, and in fact the GNU gettext documentation explicitly points out that this is allowed:

    In the English singular case, the number - always 1 - can be replaced with "one":

        printf (ngettext ("One file removed", "%d files removed", n), n);

    This works because the `printf' function discards excess arguments that are not consumed by the format string.

One might consider this better style (English style, not C style) in some contexts. Of course the concrete example that you show doesn't actually take advantage of this, so if it is important to you, please send a patch to fix it.
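A minimal sketch of the excess-argument behaviour the gettext documentation relies on. It assumes nothing about the real catalog machinery: choose_form() is a hypothetical stand-in for ngettext() that always applies the English rule, and format_row_count() is an illustrative helper, not code from print.c.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for ngettext(): always applies the English rule
 * instead of consulting a message catalog.
 */
const char *
choose_form(const char *singular, const char *plural, unsigned long n)
{
    return (n == 1) ? singular : plural;
}

/*
 * Format a row-count footer into buf and return snprintf()'s result.
 * n is always passed; the singular form has no conversion that consumes
 * it, which is permitted because excess arguments are evaluated but
 * otherwise ignored.
 */
int
format_row_count(char *buf, size_t size, unsigned long n)
{
    return snprintf(buf, size, choose_form("(1 row)", "(%lu rows)", n), n);
}
```

Calling format_row_count(buf, sizeof buf, n) with n equal to 1 takes the singular branch and leaves the argument unused; any other value takes the plural branch and prints the count.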
Re: [HACKERS] A couple of gripes about the gettext plurals patch
On Monday 25 May 2009 22:02:47 Tom Lane wrote:
> The issue of double translation is really a minor point; what is bothering me is that we've got such an ad-hoc, non-compile-time-checkable approach here. Zdenek's discovery today that some of the format strings are flat-out wrong
> http://archives.postgresql.org/pgsql-hackers/2009-05/msg00946.php
> surprises me not in the least.

See my response there for why this is the way it is.

Note also that gcc's format argument checking can see through ngettext() quite well, so as far as I can tell, we are not exposed to accidental format string mismatches. Example code:

    #include <stdarg.h>
    #include <stdio.h>
    #include <libintl.h>

    /* Tell gcc to check errmsg() calls like printf() calls. */
    extern void errmsg(const char *fmt, ...) __attribute__((format(printf,1,2)));

    void
    errmsg(const char *fmt, ...)
    {
        va_list ap;

        va_start(ap, fmt);
        vfprintf(stderr, fmt, ap);
        va_end(ap);
    }

    int
    main(int argc, char *argv[])
    {
        /* gcc checks both plural variants against the argument list. */
        errmsg(ngettext("got %d argument, namely %s\n",
                        "got %d arguments, first is %s\n",
                        argc),
               argc, argv[0]);
        return 0;
    }

I tried throwing various kinds of subtle garbage into the errmsg/ngettext line, but it was all discovered by gcc -Wall.
Re: [HACKERS] [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
On 05/25/2009 07:58 PM, Andres Freund wrote:
> On 05/25/2009 07:53 PM, Andres Freund wrote:
>> On 05/25/2009 07:31 PM, Tom Lane wrote:
>>> David Fetter <da...@fetter.org> writes:
>>>> On Mon, May 25, 2009 at 12:24:05PM -0400, Tom Lane wrote:
>>>>> If you'd like to accomplish something *useful* about this, how about pestering git upstream to support diff -c output format?
>>> If we were to put it into a repository config file, that would more or less have the effect of enforcing a project style for diffs, no?
>> Yes and no. You can define that a subset (or all) files use a specific diff driver in the repository - unfortunately the definition of that driver has to be done locally.
>> Defining it currently involves installing a wrapper like the one on http://wiki.postgresql.org/wiki/Talk:Working_with_Git and doing
> Ugh, hit the wrong key: and executing
> `git config --global diff.context.command git-external-diff`

The content of the former page is now merged into the main page about git, http://wiki.postgresql.org/wiki/Working_with_Git, and the notes on the Talk: page are deleted.

Andres
Re: [HACKERS] information_schema.columns changes needed for OLEDB
On Sunday 24 May 2009 03:37:28 Konstantin Izmailov wrote:
> Number 4 is actually numeric_precision (I typed incorrectly). My recollection is that numeric_precision is sometimes expressed in radix 2 and it caused issues for Windows apps.

It is measured in radix 2 for floating-point types and in radix 10 for fixed-point types.
Re: [HACKERS] [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
On Monday 25 May 2009 20:58:59 Andres Freund wrote:
> and executing
> `git config --global diff.context.command git-external-diff`

We already knew that you could do it with a wrapper. But that isn't the answer we were looking for, because it will basically mean that 98% of casual contributors will get it wrong, and it will probably not work very well on Windows.

The goal is to get git-diff to do it itself.
Re: [HACKERS] generic options for explain
On Monday 25 May 2009 18:02:53 Tom Lane wrote:
> Robert Haas <robertmh...@gmail.com> writes:
>> This is all much more complicated than what I proposed, and I fail to see what it buys us. I'd say that you're just reinforcing the point I made upthread, which is that insisting that XML is the only way to get more detailed information will just create a cottage industry of beating that XML output format into submission.
>
> The impression I have is that (to misquote Churchill) XML is the worst option available, except for all the others. We need something that can represent a fairly complex data structure, easily supports addition or removal of particular fields in the structure (including fields not foreseen in the original design), is not hard for programs to parse, and is widely supported --- ie, "not hard" includes "you don't have to write your own parser", in most languages. How many realistic alternatives are there?

I think we are going in the wrong direction. No one has said that they want a machine-readable EXPLAIN format. OK, there are historically about three people that want one, but they have already solved the problem of parsing the current format. And without having written such a parser myself I think that the current format is not inherently hard to parse.

What people really want is optional additional information in the human-readable format. Giving them a machine readable format does not solve the problem. Giving them a machine readable format with all-or-none of the optional information and saying "figure it out yourself" does not solve anything either. The same people who currently complain will continue to complain.
Re: [HACKERS] generic options for explain
Peter Eisentraut wrote:
> On Monday 25 May 2009 18:02:53 Tom Lane wrote:
>> Robert Haas <robertmh...@gmail.com> writes:
>>> This is all much more complicated than what I proposed, and I fail to see what it buys us. I'd say that you're just reinforcing the point I made upthread, which is that insisting that XML is the only way to get more detailed information will just create a cottage industry of beating that XML output format into submission.
>>
>> The impression I have is that (to misquote Churchill) XML is the worst option available, except for all the others. We need something that can represent a fairly complex data structure, easily supports addition or removal of particular fields in the structure (including fields not foreseen in the original design), is not hard for programs to parse, and is widely supported --- ie, "not hard" includes "you don't have to write your own parser", in most languages. How many realistic alternatives are there?
>
> I think we are going in the wrong direction. No one has said that they want a machine-readable EXPLAIN format.

That is not true. Tool developers like pgAdmin (I know that one for sure), phpPgAdmin (I think they have said it too) and third party tools have asked for this. Right now we parse the EXPLAIN output. Which doesn't get easier with each new thing we add to it :-) It would be very nice to have it tool parseable.

I'm also fairly certain that people using auto_explain would have use for a format that's easier to parse.

> What people really want is optional additional information in the human-readable format. Giving them a machine readable format does not solve the problem. Giving them a machine readable format with all-or-none of the optional information and saying "figure it out yourself" does not solve anything either. The same people who currently complain will continue to complain.

I agree that this is a separate issue. But that doesn't mean they don't both exist.

//Magnus
Re: [HACKERS] generic options for explain
On Tue, May 26, 2009 at 8:15 AM, Peter Eisentraut <pete...@gmx.net> wrote:
> I think we are going in the wrong direction. No one has said that they want a machine-readable EXPLAIN format. OK, there are historically about three people that want one, but they have already solved the problem of parsing the current format.

Pretty sure I've said I want one. And whilst it's true that we already parse the current output in pgAdmin, it's a PITA whenever the format changes. I also want a format in which Tom is not going to refuse to include additional data (such as the schema a relation is in) because it clutters the output. A machine readable format would seem to be the ideal way to include all the data we may need, without making the human-readable output an unreadable mess.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com
Re: [HACKERS] generic options for explain
On May 26, 2009, at 8:15 AM, Peter Eisentraut <pete...@gmx.net> wrote:
> On Monday 25 May 2009 18:02:53 Tom Lane wrote:
>> Robert Haas <robertmh...@gmail.com> writes:
>>> This is all much more complicated than what I proposed, and I fail to see what it buys us. I'd say that you're just reinforcing the point I made upthread, which is that insisting that XML is the only way to get more detailed information will just create a cottage industry of beating that XML output format into submission.
>>
>> The impression I have is that (to misquote Churchill) XML is the worst option available, except for all the others. We need something that can represent a fairly complex data structure, easily supports addition or removal of particular fields in the structure (including fields not foreseen in the original design), is not hard for programs to parse, and is widely supported --- ie, "not hard" includes "you don't have to write your own parser", in most languages. How many realistic alternatives are there?
>
> I think we are going in the wrong direction. No one has said that they want a machine-readable EXPLAIN format. OK, there are historically about three people that want one, but they have already solved the problem of parsing the current format. And without having written such a parser myself I think that the current format is not inherently hard to parse.
>
> What people really want is optional additional information in the human-readable format. Giving them a machine readable format does not solve the problem. Giving them a machine readable format with all-or-none of the optional information and saying "figure it out yourself" does not solve anything either. The same people who currently complain will continue to complain.

Peter,

The check is in the mail. :-)

In all seriousness, I have no problem at all with providing machine-readable formats, but the problem you're describing here is definitely my primary pain point.

...Robert
Re: [HACKERS] generic options for explain
Well, I want an SQL query-able format. I also want a way to retrieve the data for a query run from within an application without disturbing the application, i.e. while still returning the regular result set.

But I also like being able to conveniently run explain and get the results formatted to fit on the screen in a single step. I don't see anything wrong with Robert's direction to pass options to explain. It doesn't solve every problem but it doesn't make any of the other things we need harder either.

On a bike-shedding note, I would rather have the rhs of the option be optional and default to true for boolean options.

Actually, if we make a set of explain_* guc options, we could make the explain options just set those gucs locally.

--
Greg
Re: [HACKERS] [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
Hi,

On 05/26/2009 01:39 PM, Peter Eisentraut wrote:
> On Monday 25 May 2009 20:58:59 Andres Freund wrote:
>> and executing
>> `git config --global diff.context.command git-external-diff`
> We already knew that you could do it with a wrapper. But that isn't the answer we were looking for, because it will basically mean that 98% of casual contributors will get it wrong, and it will probably not work very well on Windows.

It works on windows, linux, solaris (that's what I could get my hands on without bothering). I tested it - it works on any non-ancient version of git. (Ancient in the sense that git at that time didn't work properly on win anyway.)

And providing a 5-line wrapper download-ready surely makes it easier than figuring out how to write one out of some git manpages.

Also it allows at least those who prefer context diffs to use them easily when using git - those are the ones who seem to care about them most.

> The goal is to get git-diff to do it itself.

I do not disagree.

Andres
Re: [HACKERS] problem with plural-forms
Peter Eisentraut wrote on Tue, 26 May 2009 at 13:39 +0300:
> On Monday 25 May 2009 19:11:24 Zdenek Kotala wrote:

<snip>

>> The problem here is "(1 row)" instead of "(%lu row)". When I run msgfmt without -v everything works fine but I think we should fix it (there are more occurrences of this issue).
>
> GNU gettext accepts this, and in fact the GNU gettext documentation explicitly points out that this is allowed:
>
>     In the English singular case, the number - always 1 - can be replaced with "one":
>
>         printf (ngettext ("One file removed", "%d files removed", n), n);
>
>     This works because the `printf' function discards excess arguments that are not consumed by the format string.

Yeah, I also checked the printf specification and it is allowed.

> One might consider this better style (English style, not C style) in some contexts. Of course the concrete example that you show doesn't actually take advantage of this, so if it is important to you, please send a patch to fix it.

It is not a big issue, because it works without -v, but I prefer to fix it. I will send a patch. I also sent a question to the Solaris i18n group asking whether it is supported on Solaris.

thanks
Zdenek
Re: [HACKERS] generic options for explain
On May 26, 2009, at 8:46 AM, Greg Stark <greg.st...@enterprisedb.com> wrote:
> Well I want an SQL query-able format. I also want a way to retrieve the data for a query run from within an application without disturbing the application i.e. while still returning the regular result set.
>
> But I also like being able to conveniently run explain and get the results formatted to fit on the screen in a single step. I don't see anything wrong with Robert's direction to pass options to explain. It doesn't solve every problem but it doesn't make any of the other things we need harder either.

Your check is in the mail, too.

> On a bike-shedding note I would rather have the rhs of the option be optional and default to true for boolean options.

I was thinking about that, too, so +1.

> Actually if we make a set of explain_* guc options we could make the options just locally set those options.

I think that's probably over-complicated, but that's just MHO.

...Robert
Re: [HACKERS] [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
I'll repeat my suggestion that everyone pooh-poohed: we can have the mail list filters recognize patches, run filterdiff on them with our preferred options, and attach the result as an additional attachment (or link to some web directory).

I think it would be simple to do and would be happy to give it a go if I can get the necessary access.

It doesn't solve *all* the problems, since the committer still needs a unified diff if he wants to take advantage of git's merge abilities.

I think this is actually all a red herring since it's pretty easy for the reviewer to run filterdiff anyways. But having things be automatic is still always easier than not.

--
Greg
Re: [HACKERS] generic options for explain
Peter Eisentraut <pete...@gmx.net> writes:
> I think we are going in the wrong direction. No one has said that they want a machine-readable EXPLAIN format. OK, there are historically about three people that want one, but they have already solved the problem of parsing the current format.

Well, obviously the set of tool designers is smaller than the set of casual users of EXPLAIN, but their problems are none the less real and very important.

> What people really want is optional additional information in the human-readable format. Giving them a machine readable format does not solve the problem.

Actually, the exact problem is this: those two goals are in conflict. There'd be little objection to adding any random set of optional stuff to EXPLAIN's textual output, if it weren't for the fact that it would make machine parsing of that output even harder than it is already.

So my feeling is that we need a machine-readable format containing all the data in order to satisfy the needs of tool designers. Once they are freed from having to parse EXPLAIN's textual output, we can whack the textual output around all we want. (Which kills my previous argument that we only need one new option, but such is life.)

Now there is a third set of desires having to do with being able to do simple SQL-based analysis of EXPLAIN output. That's the piece I think we don't have a good handle on. In particular, it's not clear whether a SQL-friendly output format can be the same as either of the other two. (I don't personally find this goal very compelling --- there is no natural law saying that SQL is a good tool for analyzing EXPLAIN output --- but I'm willing to look at it to see if it's feasible.)

regards, tom lane
Re: [HACKERS] [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
Tom Lane wrote on Mon, 25 May 2009 at 13:07 -0400:
> Zdenek Kotala <zdenek.kot...@sun.com> writes:
>> Tom Lane wrote on Sun, 24 May 2009 at 18:46 -0400:
>>> In any case, the barriers to implementing 8.3-style hash indexes in 8.4 are pretty huge: you'd need to duplicate not only the hash AM code, but also all the hash functions, and therefore all of the hash pg_amop and pg_amproc entries.
>
>> I'm not sure if I need to duplicate the functions. Generally yes, but it seems to me that the hash index change did not alter the functions' behavior, so they could be shared at this moment.
>
> No, the behavior of the hash functions themselves changed during 8.4. Twice, even:

Hmm, I missed that. :(

> So as far as I can see, you need completely separate copies of both hash_any() and the SQL-level functions that call it. I'm not really seeing that the proposed refactoring makes this any easier. You might as well just copy-and-paste all that old code into a separate set of files, and not worry about what is in access/hash.h.

Yeah, in this case everything has to be duplicated, which is not a big deal compared to doing the same amount of work for GIN. Then I can start with GIN. The only advantage of the refactoring is then nicer code.

thanks
Zdenek
Re: [HACKERS] problem with plural-forms
Peter Eisentraut <pete...@gmx.net> writes:
> On Monday 25 May 2009 19:11:24 Zdenek Kotala wrote:
>> The problem here is "(1 row)" instead of "(%lu row)". When I run msgfmt without -v everything works fine but I think we should fix it (there are more occurrences of this issue).

> GNU gettext accepts this, and in fact the GNU gettext documentation explicitly points out that this is allowed:

>     In the English singular case, the number - always 1 - can be replaced with "one":
>
>         printf (ngettext ("One file removed", "%d files removed", n), n);
>
>     This works because the `printf' function discards excess arguments that are not consumed by the format string.

That advice is, if not outright wrong, at least incredibly short-sighted. The method breaks the instant you have any additional values to print. For example, this ain't gonna work:

    printf (ngettext ("One file removed, containing %lu bytes",
                      "%d files removed, containing %lu bytes", n),
            n, total_bytes);

I'm of the opinion that the test being performed by msgfmt -v is entirely reasonable, and we should not risk such problems for the sake of sometimes spelling out "one".

regards, tom lane
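A rough sketch of the kind of consistency check being defended here: count the conversions in each variant and require the counts to match. This is only an approximation for illustration, not msgfmt's actual algorithm (it ignores conversion types and numbered arguments), and both function names are hypothetical.

```c
#include <stddef.h>

/*
 * Count printf-style conversions in fmt. "%%" is a literal percent sign
 * and is not counted; a lone '%' at the end of the string is ignored.
 */
size_t
count_conversions(const char *fmt)
{
    size_t      count = 0;

    for (const char *p = fmt; *p != '\0'; p++)
    {
        if (*p != '%')
            continue;
        if (p[1] == '%')
        {
            p++;                /* skip the second '%' of "%%" */
            continue;
        }
        if (p[1] == '\0')
            break;              /* trailing lone '%' */
        count++;
    }
    return count;
}

/* Return nonzero when both format strings consume the same number of arguments. */
int
formats_agree(const char *a, const char *b)
{
    return count_conversions(a) == count_conversions(b);
}
```

Under this check, "(1 row)" versus "(%lu rows)" is rejected, and so is the two-value example above, whose singular form has one conversion where the plural form has two.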
Re: [HACKERS] generic options for explain
Tom Lane wrote:
> Now there is a third set of desires having to do with being able to do simple SQL-based analysis of EXPLAIN output. That's the piece I think we don't have a good handle on. In particular, it's not clear whether a SQL-friendly output format can be the same as either of the other two. (I don't personally find this goal very compelling --- there is no natural law saying that SQL is a good tool for analyzing EXPLAIN output --- but I'm willing to look at it to see if it's feasible.)

In libxml-enabled builds at least, this could presumably be done fairly easily via the XML functions, especially if we get XSLT processing into the core XML functionality as I hope we can do this release. In fact, the ability to leverage existing XML functionality to munge the output is the thing that swings me in favor of XML as the machine readable output format instead of JSON, since we don't have and aren't terribly likely to get an inbuilt JSON parser. It means we wouldn't need some external tool at all.

cheers

andrew
Re: [HACKERS] generic options for explain
On Tue, May 26, 2009 at 9:52 AM, Andrew Dunstan <and...@dunslane.net> wrote:
> In libxml-enabled builds at least, this could presumably be done fairly easily via the XML functions, especially if we get XSLT processing into the core XML functionality as I hope we can do this release. In fact, the ability to leverage existing XML functionality to munge the output is the thing that swings me in favor of XML as the machine readable output format instead of JSON, since we don't have and aren't terribly likely to get an inbuilt JSON parser. It means we wouldn't need some external tool at all.

I was thinking something similar, but from the pgAdmin perspective. We already use libxml2, but JSON would introduce another dependency for us.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com
Re: [HACKERS] problem with plural-forms
Tom Lane wrote:
> That advice is, if not outright wrong, at least incredibly short-sighted. The method breaks the instant you have any additional values to print. For example, this ain't gonna work:
>
>     printf (ngettext ("One file removed, containing %lu bytes",
>                       "%d files removed, containing %lu bytes", n),
>             n, total_bytes);

I think it should use the %2$s style specifier in that case. This should work:

    printf (ngettext ("One file removed, containing %2$lu bytes",
                      "%d files removed, containing %lu bytes", n),
            n, total_bytes);

--
Alvaro Herrera                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
Greg Stark <greg.st...@enterprisedb.com> writes:
> I'll repeat my suggestion that everyone poo-pooed: we can have the mail list filters recognize patches, run filterdiff on them with our preferred options, and attach the result as an additional attachment (or link to some web directory).

The argument that was made at the developer meeting is that the preferred way of working will be to apply the submitted patch in one's local git repository, and then do any needed editorialization as a second patch on top of it. So the critical need as I see it is to be able to see a -c version of a patch-in-progress (ie, diff current working state versus some previous committed state). Readability of the patch as-submitted is useful for quick eyeball checks, but I think all serious reviewing is going to be done on local copies.

> I think this is actually all a red herring since it's pretty easy for the reviewer to run filterdiff anyways.

I don't trust filterdiff one bit :-(

regards, tom lane
Re: [HACKERS] problem with plural-forms
I think in these two cases that using "one" is actively a bad idea. These aren't English sentences; they're fragments meant to report numerical results to programmers. We don't use "two" or "three" either.

If the value were just part of some full sentence where the actual value wasn't the key piece of data, such as some error messages, the situation might be different.

--
Greg

On 26 May 2009, at 15:05, Alvaro Herrera alvhe...@commandprompt.com wrote:
> Tom Lane wrote:
>> That advice is, if not outright wrong, at least incredibly short-sighted. The method breaks the instant you have any additional values to print. For example, this ain't gonna work:
>>
>>     printf (ngettext ("One file removed, containing %lu bytes",
>>                       "%d files removed, containing %lu bytes", n),
>>             n, total_bytes);
>
> I think it should use the %2$s style specifier in that case. This should work:
>
>     printf (ngettext ("One file removed, containing %2$lu bytes",
>                       "%d files removed, containing %lu bytes", n),
>             n, total_bytes);
>
> --
> Alvaro Herrera                        http://www.CommandPrompt.com/
> PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] problem with plural-forms
Alvaro Herrera alvhe...@commandprompt.com writes:
> I think it should use the %2$s style specifier in that case. This should work:
>
>     printf (ngettext ("One file removed, containing %2$lu bytes",
>                       "%d files removed, containing %lu bytes", n),
>             n, total_bytes);

How's that gonna work? In the n=1 case, printf would have no idea about the type/size of the argument it would need to skip over. I think maybe you could make it work like this:

    printf (ngettext ("One file removed, containing %1$lu bytes",
                      "%2$d files removed, containing %1$lu bytes", n),
            total_bytes, n);

but *for sure* I don't want us playing such games without a robust compile-time check on both variants of the ngettext string. I'm not real sure it's a good idea at all, because of the potential for confusing translators.

Notice also that we have subtly embedded the preferred English phrase ordering here: if someone wants to pull the same type of trick in a language where the bytecount ought to come first, he's just plain out of luck.

            regards, tom lane
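The reordered form suggested in the message above can be tried in isolation. What follows is only a minimal sketch, assuming a printf family that supports POSIX numbered arguments (%n$). The name format_removed is a hypothetical helper, not anything in PostgreSQL, and a plain ternary stands in for ngettext so that no message catalog is needed.

```c
#include <stdio.h>

/*
 * format_removed() is a hypothetical helper for illustration only.
 * The ternary stands in for ngettext(): it picks the singular or the
 * plural English string, as an untranslated catalog would.
 *
 * Both variants receive the same argument list (total_bytes, n).
 * The singular string consumes only argument 1; the trailing argument
 * is simply left unused, so no numbered argument is ever skipped.
 */
int
format_removed(char *buf, size_t buflen, int n, unsigned long total_bytes)
{
    const char *fmt = (n == 1)
        ? "One file removed, containing %1$lu bytes"
        : "%2$d files removed, containing %1$lu bytes";

    return snprintf(buf, buflen, fmt, total_bytes, n);
}
```

Because the format string is not a literal here, the compiler cannot check it against the arguments, which is the lack of a compile-time check that the message worries about.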
Re: [HACKERS] problem with plural-forms
Tom Lane wrote:
> Alvaro Herrera alvhe...@commandprompt.com writes:
>> I think it should use the %2$s style specifier in that case. This should work:
>>
>>     printf (ngettext ("One file removed, containing %2$lu bytes",
>>                       "%d files removed, containing %lu bytes", n),
>>             n, total_bytes);
>
> How's that gonna work? In the n=1 case, printf would have no idea about the type/size of the argument it would need to skip over.

Hmm, I admit I have no idea how it works ... but now that I think about it, you are right that at least I only use it with the whole argument array, just in a different order.

> I think maybe you could make it work like this:
>
>     printf (ngettext ("One file removed, containing %1$lu bytes",
>                       "%2$d files removed, containing %1$lu bytes", n),
>             total_bytes, n);
>
> but *for sure* I don't want us playing such games without a robust compile-time check on both variants of the ngettext string. I'm not real sure it's a good idea at all, because of the potential for confusing translators.
>
> Notice also that we have subtly embedded the preferred English phrase ordering here: if someone wants to pull the same type of trick in a language where the bytecount ought to come first, he's just plain out of luck.

Agreed on both counts. We have enough trouble finding translators as it is; I don't want to know what would happen if we were to confuse them with this :-)

I find it strange that this topic has not been fully hashed out in the GNU gettext documentation. Maybe we should talk to them.

--
Alvaro Herrera                        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] PostgreSQL Developer meeting minutes up
* Tom Lane t...@sss.pgh.pa.us [090526 11:20]:
> Aidan Van Dyk ai...@highrise.ca writes:
>> This has been raised and ignored many times before on -hackers... The reason is because the tags in the CVS repository are broken (i.e. they are such that it's impossible to actually create all the tags), so the git cvsimport tools that try to import all the tags croak on the PG CVS repository. The tool which doesn't croak doesn't try and import all the tags, just the sticky branch tags...
>>
>> Scripts to fix (actually, remove) the broken tags have also been posted, along with requests that if somebody is mucking with the actual repository, to make sure it's known about, and access is denied during the mucking period (access being any rsync/anoncvs/mirroring of the cvs root).
>
> Up to now I've always been of the opinion that fixing those tags wasn't worth taking any risk for. But if we are thinking of moving away from CVS, then this clearly becomes one of the hurdles we have to jump on the way. Can you refresh our memory about which tags are problematic and exactly what needs to be done about 'em?

Specifically, it's 2 tags, and I just remove them:
    REL7_1_BETA2
    REL7_1_BETA3

Previous threads:
    http://news.gmane.org/find-root.php?message_id=20080220225300.ge16...@yugib.highrise.ca
    http://news.gmane.org/find-root.php?message_id=20081229155140.gp12...@yugib.highrise.ca

a.

--
Aidan Van Dyk                                             Create like a god,
ai...@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.
Re: [HACKERS] problem with plural-forms
* Tom Lane t...@sss.pgh.pa.us [090526 10:56]:
> Actually, configure checks to see if the local printf supports m$ or not, and we use our own printf implementation if not. So I'm not worried about #2. I agree with your other points though.
>
> (So, if you wanna see how this is done, try src/port/snprintf.c)
>
>             regards, tom lane

So what part of a working libc does PG use that it *doesn't* have to carry around in src/port/?

;-)

a.

--
Aidan Van Dyk                                             Create like a god,
ai...@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.
Re: [HACKERS] generic options for explain
Peter Eisentraut wrote:
> On Tuesday 26 May 2009 16:55:55 Dave Page wrote:
>> I was thinking something similar, but from the pgAdmin perspective. We already use libxml2, but JSON would introduce another dependency for us.
>
> I was actually looking for a C library for JSON (json type for PostgreSQL; you know it is coming :-) ), but only found a library tied to glib, which, considering the experience with libxml, did not excite me. If someone knows of a different, small, and independent JSON library for C, I would like to hear about it.

There are several listed at http://www.json.org/

cheers

andrew
Re: [HACKERS] generic options for explain
On Tue, May 26, 2009 at 04:36:56PM +0200, Magnus Hagander wrote:
>> I was thinking something similar, but from the pgAdmin perspective. We already use libxml2, but JSON would introduce another dependency for us.
>
> Yeah, but probably not a huge one. There is one for wx, but I don't think it's included by default.

...and to put things into perspective:

    to...@floh:~$ apt-cache show libxml2 libjson-glib-1.0-0 | grep ^Size
    Size: 814356
    Size: 33538

(not that I would recommend this one, since that's the one tied to glib, but it seems that XML parsing is nearly one and a half orders of magnitude more complex than JSON).

--
tomás
who thinks that XML-as-a-data-description-language is a denial of service attack on us all
Re: [HACKERS] PostgreSQL Developer meeting minutes up
Tom Lane wrote:
> Aidan Van Dyk ai...@highrise.ca writes:
>> This has been raised and ignored many times before on -hackers... The reason is because the tags in the CVS repository are broken (i.e. they are such that it's impossible to actually create all the tags), so the git cvsimport tools that try to import all the tags croak on the PG CVS repository. The tool which doesn't croak doesn't try and import all the tags, just the sticky branch tags...
>>
>> Scripts to fix (actually, remove) the broken tags have also been posted, along with requests that if somebody is mucking with the actual repository, to make sure it's known about, and access is denied during the mucking period (access being any rsync/anoncvs/mirroring of the cvs root).
>
> Up to now I've always been of the opinion that fixing those tags wasn't worth taking any risk for. But if we are thinking of moving away from CVS, then this clearly becomes one of the hurdles we have to jump on the way. Can you refresh our memory about which tags are problematic and exactly what needs to be done about 'em?

I think we need just to remove the two tags in question (they have long been irrelevant). Prudence suggests that we should do that some time (weeks, I think) after the 8.4 release, when reverting, if we find any breakage, won't be too painful.

cheers

andrew
Re: [HACKERS] problem with memory
Pavel Stehule pavel.steh...@gmail.com writes:
> One Czech PostgreSQL user reports a problem with memory - probably a memory leak? Is it possible to diagnose something from the log?

If he's getting an actual "out of memory" error, let's see the memory context map that gets dumped to the server log (or more specifically, to stderr).

> Server 8G RAM, 32b Debian Etch.
>
> shared_buffers = 324000          # min 16 or max_connections*2, 8KB each

Although you may have told us enough right here. 2.5GB of shared buffers in a 4GB address space (with probably only 3GB available to the application) isn't a very sane choice. Back that off or run 64-bit.

            regards, tom lane
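The 2.5GB figure in the reply above can be checked by hand. The sketch below is only an illustration of that arithmetic; it assumes a block size of 8192 bytes (the "8KB each" in the configuration comment), and both function names are hypothetical.

```c
#include <stdint.h>

/* Bytes taken by shared_buffers: number of buffers times the block size. */
uint64_t
shared_buffers_bytes(uint64_t nbuffers, uint64_t block_size)
{
    return nbuffers * block_size;
}

/* Share of a 32-bit (4 GiB) address space used by `bytes`, in percent,
 * rounded down. */
unsigned
percent_of_32bit_space(uint64_t bytes)
{
    const uint64_t four_gib = UINT64_C(1) << 32;

    return (unsigned) (bytes * 100 / four_gib);
}
```

With nbuffers = 324000 and block_size = 8192 the product is 2,654,208,000 bytes, roughly 2.47 GiB, which is a little over 61% of the whole 4 GiB address space before any other memory use is counted.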
Re: [HACKERS] problem with plural-forms
Aidan Van Dyk ai...@highrise.ca writes:
> From the glibc printf man page:
>
>     There may be no gaps in the numbers of arguments specified using '$'; for example, if arguments 1 and 3 are specified, argument 2 must also be specified somewhere in the format string.
>
> So, is skipping 1 allowed?

No --- the point is that printf has to be able to figure out where each argument is on the stack, so it must be able to infer the size of each of the arguments from left to right.

> That said, I do think the msgid should be using the % args, not words, for a few reasons:
> 1) Make it more clear for translators the arguments and their ordering without having to visit the source code
> 2) On crufty systems without gettext, I wouldn't expect them to support m$ modifiers then either...
> 3) Greg's "these are numbers, not sentences" is how I expect the system to work...

Actually, configure checks to see if the local printf supports m$ or not, and we use our own printf implementation if not. So I'm not worried about #2. I agree with your other points though.

(So, if you wanna see how this is done, try src/port/snprintf.c)

            regards, tom lane
[HACKERS] problem with memory
Hello

One Czech PostgreSQL user reports a problem with memory - probably a memory leak? Is it possible to diagnose something from the log?

Postgres ver. 8.3.7. Server 8G RAM, 32b Debian Etch.

Configuration:

shared_buffers = 324000          # min 16 or max_connections*2, 8KB each
temp_buffers = 16000             # min 100, 8KB each
work_mem = 8126                  # min 64, size in KB
maintenance_work_mem = 16384     # min 1024, size in KB
max_stack_depth = 7680           # min 100, size in KB
max_fsm_pages = 20               # min max_fsm_relations*16, 6 bytes
max_fsm_relations = 12000        # min 100, ~70 bytes each
effective_cache_size = 30        # typically 8KB each

http://jyxo.cz/misc/sql.error.txt

Thank you
Pavel Stehule
Re: [HACKERS] PostgreSQL Developer meeting minutes up
Andrew Dunstan and...@dunslane.net writes:
> Tom Lane wrote:
>> Up to now I've always been of the opinion that fixing those tags wasn't worth taking any risk for. But if we are thinking of moving away from CVS, then this clearly becomes one of the hurdles we have to jump on the way.
>
> I think we need just to remove the two tags in question (they have long been irrelevant). Prudence suggests that we should do that some time (weeks, I think) after the 8.4 release, when reverting, if we find any breakage, won't be too painful.

I don't see a lot of point in waiting till after 8.4.0. There is no time, ever, where we are sure there will be no release for weeks --- a security or data-loss bug could crop up at any time. And not messing up back branch update releases is even more important than not messing up 8.4.0, because the back branches are much more likely to get dropped straight into production.

Obviously we want a solid backup of the pre-modification CVS repository, and we have to follow Aidan's advice about synchronizing the change with mirror repositories, but I don't see a strong argument for waiting weeks to do this. I think we should get it over with, so people can get on with the work that it's blocking.

            regards, tom lane
Re: [HACKERS] problem with memory
The link was to the memory context dump. The only suspicious context I spotted was 300MB in MessageContext.

What are lc_messages and lc_ctype set to on this machine? Were the fixes for the latest round of infinite recursion in the character conversion routines included in 8.3.7?

--
Greg

On 26 May 2009, at 17:00, Tom Lane t...@sss.pgh.pa.us wrote:
> Pavel Stehule pavel.steh...@gmail.com writes:
>> One Czech PostgreSQL user reports a problem with memory - probably a memory leak? Is it possible to diagnose something from the log?
>
> If he's getting an actual "out of memory" error, let's see the memory context map that gets dumped to the server log (or more specifically, to stderr).
>
>> Server 8G RAM, 32b Debian Etch.
>>
>> shared_buffers = 324000          # min 16 or max_connections*2, 8KB each
>
> Although you may have told us enough right here. 2.5GB of shared buffers in a 4GB address space (with probably only 3GB available to the application) isn't a very sane choice. Back that off or run 64-bit.
>
>             regards, tom lane
Re: [HACKERS] problem with plural-forms
I wrote:
> ... Notice also that we have subtly embedded the preferred English phrase ordering here: if someone wants to pull the same type of trick in a language where the bytecount ought to come first, he's just plain out of luck.

Uh, scratch that [ not enough caffeine yet ]. What this coding embeds is the assumption that the filecount is the only variable we might wish to replace with a constant string, which is safe enough since that's the only one that we know a fixed value for in any one ngettext string.

Still, I agree with Greg's opinion that this is just not a real good thing to be doing.

            regards, tom lane
Re: [HACKERS] generic options for explain
On Tue, May 26, 2009 at 11:15:21AM -0400, Aidan Van Dyk wrote:
> * to...@tuxteam.de to...@tuxteam.de [090526 11:03]:
>> ...and to put things into perspective:
>>
>>     to...@floh:~$ apt-cache show libxml2 libjson-glib-1.0-0 | grep ^Size
>>     Size: 814356
>>     Size: 33538
>
> And including glib, which does all the work for libjson-glib:
>
>     moun...@pumpkin:~/projects/postgresql/PostgreSQL$ apt-cache show libxml2 libjson-glib-1.0-0 libglib2.0-0 | grep ^Size
>     Size: 870188
>     Size: 36132
>     Size: 845166
>
> glib also pulls in libpcre:
>
>     Size: 214650
>
> So:
>  - XML:  870188 (libxml) + 76038 (zlib1g) = 946226
>  - JSON: 36132 (json) + 845166 (glib) + 214650 (pcre) = 1095948
>
> ;-)

OK, OK, you win (darn: should have known those bloatophile gnomies. Surprise that they don't pull in Mono :-( ).

But json-c (just downloaded and compiled) is more in the ballpark of 100K, if I count all the produced *.o files. And it's BSD.

Regards
--
tomás
Re: [HACKERS] PostgreSQL Developer meeting minutes up
Aidan Van Dyk ai...@highrise.ca writes:
> Specifically, it's 2 tags, and I just remove them:
>     REL7_1_BETA2
>     REL7_1_BETA3
>
> Previous threads:
>     http://news.gmane.org/find-root.php?message_id=20080220225300.ge16...@yugib.highrise.ca
>     http://news.gmane.org/find-root.php?message_id=20081229155140.gp12...@yugib.highrise.ca

It looks like the ill-considered commit message mentioned in that first thread hasn't been dealt with, either.

            regards, tom lane
Re: [HACKERS] generic options for explain
On Tue, May 26, 2009 at 09:55:55AM -0400, Dave Page wrote:
> from the pgAdmin perspective. We already use libxml2, but JSON would introduce another dependency for us.

...and using XML introduces a dependency for those apps that don't already use some XML parser. I realize that since the pool of apps that care to mechanically parse EXPLAIN output is small, it wouldn't necessarily be a big deal to hand each of them a new dependency in the form of a parser for XML, JSON, etc. But we know the least common denominator is to return a set of tuples; let's make sure that really is unworkable before forcing even that dependency.

- Josh / eggyknap
Re: [HACKERS] problem with memory
Greg Stark greg.st...@enterprisedb.com writes:
> Were the fixes for the latest round of infinite recursion in the character conversion routines included in 8.3.7?

Yes, and in any case the typical symptom of that problem was a SIGSEGV (due to stack overrun), not an out-of-memory complaint.

            regards, tom lane
Re: [HACKERS] usability of pg_get_function_arguments
Gevik Babakhani pg...@xs4all.nl writes:
>> I experimented with your example and noticed that pg_get_expr requires a hack --- it insists on having a relation OID argument, because all previous use-cases for it involved expressions that might possibly refer to a particular table. So you have to do something like
>>
>> regression=# select pg_get_expr(proargdefaults,'pg_proc'::regclass) from pg_proc where proname='f13';
>>  pg_get_expr
>> ---
>>  10, 'hello'::character varying, '2009-01-01 00:00:00'::timestamp without time zone, 'comma here ,'::character varying
>> (1 row)
>
> Unfortunately, there is no way to know to which argument(s) the values above belong.

The last ones --- you can only omit arguments from the right, so it makes no sense to allow a nonconsecutive set of defaults.

            regards, tom lane
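Since defaults always belong to the trailing arguments, matching a default to its argument is simple index arithmetic. The sketch below assumes the caller already knows the total argument count and the number of defaults (in pg_proc these would presumably come from pronargs and pronargdefaults); the function name is hypothetical.

```c
/*
 * Hypothetical helper: defaults attach to the LAST ndefaults arguments.
 * Returns the zero-based position of argument `argno` within the list
 * of defaults, or -1 if that argument has no default.
 */
int
default_index_for_arg(int nargs, int ndefaults, int argno)
{
    int first = nargs - ndefaults;  /* first argument that has a default */

    if (argno < first || argno >= nargs)
        return -1;
    return argno - first;
}
```

For a function with, say, six arguments and the four defaults shown in the query output, the first default would belong to the third argument and the last default to the sixth. The six is an assumed number, since the thread does not show the signature of f13.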
Re: [HACKERS] generic options for explain
On Tue, May 26, 2009 at 10:36 AM, Magnus Hagander mag...@hagander.net wrote:
> Dave Page wrote:
>> On Tue, May 26, 2009 at 9:52 AM, Andrew Dunstan and...@dunslane.net wrote:
>>> In libxml-enabled builds at least, this could presumably be done fairly easily via the XML functions, especially if we get XSLT processing into the core XML functionality as I hope we can do this release. In fact, the ability to leverage existing XML functionality to munge the output is the thing that swings me in favor of XML as the machine readable output format instead of JSON, since we don't have and aren't terribly likely to get an inbuilt JSON parser. It means we wouldn't need some external tool at all.
>
> Actually, I think a number of users would be *very* happy if we had a builtin JSON parser. I'm unsure on how feasible that is though.

I think it's likely that with proper design the amount of extra code that is required to support both XML and JSON is likely to be very small. I don't think we're going to get away without supporting XML because there are so many people already using XML-based tools, and I find Andrew's argument that we already have some built-in XML support that could possibly be used to smooth the road here as well pretty compelling.

On the other hand, XML can be a really difficult technology to work with because it doesn't map cleanly to the data structures that most modern scripting languages (Perl, Python, Ruby, and probably Java and others) use. As a simple example, if you have a hash like { a => 1, b => 2 } (using the Perl syntax) you can map it to <hash><a>1</a><b>2</b></hash>. That's easy to generate, but the reverse transformation is full of error-handling cases, like <hash><a>1</a><b>2<c/></b></hash> and <hash><a>1</a><a>2</a></hash>. I'm sure experienced XML hackers have ways to work around these problems, but the XML libraries I've worked with basically don't even try to turn the thing into any sort of general-purpose data structure. They just let you ask questions like "What is the root element? OK, now what elements does it contain? OK, there's an <a> tag there, what does that have inside it? Any more-deeply-nested tags?". On the other hand, JSON is explicitly designed to serialize and deserialize data structures of this type, and it pretty much just works, even between completely different programming languages.

So to summarize that - if we're only going to support one machine-readable output format, it's probably got to be XML. But if the additional effort to also support JSON is small, which I believe to be the case, then I think it's worth doing because it's actually better technology for this type of application. Maybe someone will feel inspired to work up a contrib/json.

...Robert
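The "easy to generate" half of the argument above can be illustrated with a small sketch that writes the same key/value list both ways. Everything here is hypothetical (struct kv, kv_to_xml, kv_to_json), values are plain integers, and no escaping of keys is attempted, so it is not a general serializer.

```c
#include <stdarg.h>
#include <stdio.h>

struct kv
{
    const char *key;
    int         value;
};

/* Append formatted text at buf + *off; returns -1 if it would not fit. */
static int
append(char *buf, size_t len, size_t *off, const char *fmt, ...)
{
    va_list ap;
    int     w;

    va_start(ap, fmt);
    w = vsnprintf(buf + *off, len - *off, fmt, ap);
    va_end(ap);
    if (w < 0 || (size_t) w >= len - *off)
        return -1;
    *off += (size_t) w;
    return 0;
}

/* Writes <hash><key>value</key>...</hash>; returns 0 on success. */
int
kv_to_xml(char *buf, size_t len, const struct kv *items, size_t n)
{
    size_t off = 0;

    if (append(buf, len, &off, "<hash>") < 0)
        return -1;
    for (size_t i = 0; i < n; i++)
        if (append(buf, len, &off, "<%s>%d</%s>",
                   items[i].key, items[i].value, items[i].key) < 0)
            return -1;
    return append(buf, len, &off, "</hash>");
}

/* Writes {"key":value,...}; returns 0 on success. */
int
kv_to_json(char *buf, size_t len, const struct kv *items, size_t n)
{
    size_t off = 0;

    if (append(buf, len, &off, "{") < 0)
        return -1;
    for (size_t i = 0; i < n; i++)
        if (append(buf, len, &off, "%s\"%s\":%d",
                   i > 0 ? "," : "", items[i].key, items[i].value) < 0)
            return -1;
    return append(buf, len, &off, "}");
}
```

Both writers are about the same size. The asymmetry described in the message shows up only when reading: the JSON text maps straight back to a key/value structure, while an XML reader also has to decide what to do with nested or repeated elements.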
Re: [HACKERS] problem with plural-forms
On Tuesday 26 May 2009 17:19:50 Tom Lane wrote:
> Alvaro Herrera alvhe...@commandprompt.com writes:
>> I think it should use the %2$s style specifier in that case. This should work:
>>
>>     printf (ngettext ("One file removed, containing %2$lu bytes",
>>                       "%d files removed, containing %lu bytes", n),
>>             n, total_bytes);
>
> How's that gonna work? In the n=1 case, printf would have no idea about the type/size of the argument it would need to skip over.

gcc -Wall actually warns if you do this.
Re: [HACKERS] generic options for explain
Dave Page wrote:
> On Tue, May 26, 2009 at 9:52 AM, Andrew Dunstan and...@dunslane.net wrote:
>> In libxml-enabled builds at least, this could presumably be done fairly easily via the XML functions, especially if we get XSLT processing into the core XML functionality as I hope we can do this release. In fact, the ability to leverage existing XML functionality to munge the output is the thing that swings me in favor of XML as the machine readable output format instead of JSON, since we don't have and aren't terribly likely to get an inbuilt JSON parser. It means we wouldn't need some external tool at all.

Actually, I think a number of users would be *very* happy if we had a builtin JSON parser. I'm unsure on how feasible that is though.

> I was thinking something similar, but from the pgAdmin perspective. We already use libxml2, but JSON would introduce another dependency for us.

Yeah, but probably not a huge one. There is one for wx, but I don't think it's included by default.

--
Magnus Hagander
Self: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] problem with plural-forms
* Alvaro Herrera alvhe...@commandprompt.com [090526 10:06]:
> Tom Lane wrote:
>> That advice is, if not outright wrong, at least incredibly short-sighted. The method breaks the instant you have any additional values to print. For example, this ain't gonna work:
>>
>>     printf (ngettext ("One file removed, containing %lu bytes",
>>                       "%d files removed, containing %lu bytes", n),
>>             n, total_bytes);
>
> I think it should use the %2$s style specifier in that case. This should work:
>
>     printf (ngettext ("One file removed, containing %2$lu bytes",
>                       "%d files removed, containing %lu bytes", n),
>             n, total_bytes);

From the glibc printf man page:

    There may be no gaps in the numbers of arguments specified using '$'; for example, if arguments 1 and 3 are specified, argument 2 must also be specified somewhere in the format string.

So, is skipping 1 allowed?

But, it *is* a commonly used form, especially in translations (where orders of things need to be flipped), and is already used in many of the translated PG .po files.

That said, I do think the msgid should be using the % args, not words, for a few reasons:
1) Make it more clear for translators the arguments and their ordering without having to visit the source code
2) On crufty systems without gettext, I wouldn't expect them to support m$ modifiers then either...
3) Greg's "these are numbers, not sentences" is how I expect the system to work...

a.

--
Aidan Van Dyk                                             Create like a god,
ai...@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.
Re: [HACKERS] generic options for explain
On Tuesday 26 May 2009 16:55:55 Dave Page wrote:
> I was thinking something similar, but from the pgAdmin perspective. We already use libxml2, but JSON would introduce another dependency for us.

I was actually looking for a C library for JSON (json type for PostgreSQL; you know it is coming :-) ), but only found a library tied to glib, which, considering the experience with libxml, did not excite me. If someone knows of a different, small, and independent JSON library for C, I would like to hear about it.
Re: [HACKERS] PostgreSQL Developer meeting minutes up
[moving this onto -hackers, where I think it belongs]

Tom Lane wrote:
> Huh? The buildfarm will only prove that HEAD of the active branches builds. What the concern was was whether we could correctly extract past states (particularly, but not solely, the tags corresponding to releases) from a converted git repository. The testing I had in mind was to check out various tags and diff that tree against actual release tarballs.

It appears that our git repo is only picking up the branch tags (e.g. REL8_0_STABLE), not all the release tags (e.g. REL8_0_5). That needs to be fixed (if possible).

cheers

andrew
Re: [HACKERS] generic options for explain
Robert Haas wrote:
> On the other hand, XML can be a really difficult technology to work with because it doesn't map cleanly to the data structures that most modern scripting languages (Perl, Python, Ruby, and probably Java and others) use. As a simple example, if you have a hash like { a => 1, b => 2 } (using the Perl syntax) you can map it to <hash><a>1</a><b>2</b></hash>. That's easy to generate, but the reverse transformation is full of error-handling cases, like <hash><a>1</a><b>2<c/></b></hash> and <hash><a>1</a><a>2</a></hash>. I'm sure experienced XML hackers have ways to work around these problems, but the XML libraries I've worked with basically don't even try to turn the thing into any sort of general-purpose data structure. They just let you ask questions like "What is the root element? OK, now what elements does it contain? OK, there's an <a> tag there, what does that have inside it? Any more-deeply-nested tags?". On the other hand, JSON is explicitly designed to serialize and deserialize data structures of this type, and it pretty much just works, even between completely different programming languages.

Since we will be controlling the XML output, we can restrict it to a form that is equivalent to what JSON and similar serialisation languages use. We can even produce an XSD schema specifying what is allowed, if anyone is so minded, and a validating parser could be told to validate the XML against that schema.

And XSLT processing is a very powerful transformation tool. We could even provide a stylesheet that would turn the XML into JSON. :-)

Anyway, I think we're getting closer to consensus here.

I think there's a good case for being able to stash the EXPLAIN output in a table as XML - that way we could slice and dice it several ways without having to rerun the EXPLAIN.

cheers

andrew
Re: [HACKERS] [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
On Tue, May 26, 2009 at 10:09 AM, Tom Lane t...@sss.pgh.pa.us wrote: Greg Stark greg.st...@enterprisedb.com writes: I'll repeat my suggestion that everyone poo-pooed: we can have the mail list filters recognize patches, run filterdiff on them with our prefered options, and attach the result as an additional attachment (or link to some web directory). The argument that was made at the developer meeting is that the preferred way of working will be to apply the submitted patch in one's local git repository, and then do any needed editorialization as a second patch on top of it. So the critical need as I see it is to be able to see a -c version of a patch-in-progress (ie, diff current working state versus some previous committed state). Readability of the patch as-submitted is useful for quick eyeball checks, but I think all serious reviewing is going to be done on local copies. I think this is actually all a red herring since it's pretty easy for the reviewer to run filterdiff anyways. I don't trust filterdiff one bit :-( For any particular reason, or just natural skepticism? I believe there have been some wild-eyed claims tossed around in this space previously that unified diffs don't provide all the same information as context diffs, which is flatly false. AIUI, the reason for the name unified diff is that it combines, or unifies, the before and after versions of the code into a single chunk. The nice thing about this is that when you have a bunch of small changes in a file, you don't end up with all of the surrounding lines repeated in both the before and after sections. If you change four consecutive lines and run a unified diff, you end up with 4 +s, 4 -s, and 6 lines of context (3 before and 3 after), for a total of 14 lines. If you run a context diff, you end up with 4 !s and 6 lines of context in the before section and the same in the after section, for a total of 20 lines, 6 of which are duplicated. 
This means that in many cases you can see what's changed without having to page up and down in the diff. The not-so-nice thing about unified diffs is that when there is a huge hunk of code that's changed, there are probably by chance a few identical lines buried in there, like }, so the + and - lines end up mixed together in a way that wouldn't happen in a context diff (which would turn the whole thing into two big ! sections). It's no problem for a machine to understand this, but it's hard to read for a human being. I haven't personally verified the filterdiff code, but the transformation is pretty mechanical so I'm not sure why we should believe that it hasn't been implemented correctly without some evidence along those lines. I don't think there's any way to make anyone 100% happy here. I personally prefer unified diffs, so when I'm reviewing a complex patch formatted as a context diff I typically apply it and then run a unified diff using git. When I'm submitting a patch I use a unified diff to check my work and then convert it to a context diff for submission. On the other hand, I assume that you, if you were presented with a complex unified diff, would just apply it and then run a context diff to review it. Since, as you say, serious reviewing will be done on local copies anyway, I really don't see the point of worrying too much about how they're submitted to the mailing list. Let's just tell everyone to keep using context diffs as they have been doing, and if anyone doesn't then let's THROW THEIR PATCH ON THE DUST-HEAP OF HISTORY AND HAUL THEM OUT TO BE DRAWN AND QUARTERED... er, um, I mean, ask them not to do it that way the next time. If there's an issue here that's worth getting worked up about, I'm not seeing it. ...Robert
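The line counts in the message above (14 lines for a unified diff, 20 for a context diff) can be checked mechanically. The following is only a sketch using Python's standard difflib module, not the diff or filterdiff tools discussed in the thread, and the 20-line file it compares is made up for illustration.

```python
import difflib

# A made-up 20-line file in which four consecutive lines (9-12) change.
old = [f"line {i}" for i in range(1, 21)]
new = list(old)
for i in range(8, 12):  # indexes 8..11 are lines 9..12
    new[i] = old[i].upper()

# Both formats with the usual 3 lines of context.
unified = list(difflib.unified_diff(old, new, lineterm="", n=3))
context = list(difflib.context_diff(old, new, lineterm="", n=3))


def unified_body(lines):
    """Context, '+' and '-' lines, skipping the ---/+++/@@ header lines."""
    return [l for l in lines
            if l.startswith((" ", "+", "-"))
            and not l.startswith(("---", "+++"))]


def context_body(lines):
    """Unchanged ('  ') and changed ('! ') lines from both sections."""
    return [l for l in lines if l.startswith(("  ", "! "))]


print(len(unified_body(unified)))  # body lines in the unified diff
print(len(context_body(context)))  # body lines in the context diff
```

For this input the two counts come out as 14 and 20, with the 6 context lines appearing once in the unified diff and twice in the context diff.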
Re: [HACKERS] problem with plural-forms
On Tuesday 26 May 2009 16:47:44 Tom Lane wrote: The method breaks the instant you have any additional values to print. For example, this ain't gonna work: printf (ngettext ("One file removed, containing %lu bytes", "%d files removed, containing %lu bytes", n), n, total_bytes); Don't do that then. This only shows that you cannot implement everything this way. It does not show why the things that you can implement are wrong. I'm of the opinion that the test being performed by msgfmt -v is entirely reasonable, and we should not risk such problems for the sake of sometimes spelling out "one". I have no objections to this. I am only pointing out how we arrived at the current state.
Re: [HACKERS] PostgreSQL Developer meeting minutes up
Andrew Dunstan wrote: [moving this onto -hackers, where I think it belongs] Tom Lane wrote: Huh? The buildfarm will only prove that HEAD of the active branches builds. What the concern was was whether we could correctly extract past states (particularly, but not solely, the tags corresponding to releases) from a converted git repository. The testing I had in mind was to check out various tags and diff that tree against actual release tarballs. It appears that our git repo is only picking up the branch tags (e.g. REL8_0_STABLE), not all the release tags (e.g. REL8_0_5). That needs to be fixed (if possible). Hmm. I looked through the source of the import script. It appears to mention tags here and there, but doesn't seem to actually import them. There is a comment that reads:
# Previous CVS versions just added the tag to the current HEAD
# revision and didn't insert a dead revision on the branch with
# the same date, like it is happening now.
# This means history is unclear as we can't reliably determine
# if the tagging happened at the same time as the addition to
# the branch. For now, just assume it did.
#
# XXX can't reproduce for now, disabling, as it breaks some
# things
#
Basically, it comes down to cvs tags not being actual first-class objects, but just metadata on files. I'm sure we could script the creation of these tags fairly reliably on *our* repository since we know which files are always updated when a tag is added. I'm thinking we could just parse the log for configure.in and grab the tags from there. Thoughts? -- Magnus Hagander Self: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
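One way to read the suggestion of parsing the log for configure.in is to pull the tag names out of the "symbolic names:" section that cvs log prints for a file. The sketch below assumes that reading. The sample text, tag revisions and function names are invented for illustration, and the branch test relies on the assumption that CVS marks branch tags with a 0 in the second-to-last position of the revision number.

```python
import re

# Invented sample in the general shape of `cvs log` header output.
# The revision numbers here are made up, not taken from the real repository.
SAMPLE_LOG = """\
RCS file: /cvsroot/pgsql/configure.in,v
head: 1.600
symbolic names:
\tREL8_0_5: 1.398.4.12
\tREL8_0_STABLE: 1.398.0.4
\tREL8_0_0: 1.398
keyword substitution: kv
"""


def parse_symbolic_names(log_text):
    """Return {tag: revision} from the 'symbolic names:' section."""
    tags = {}
    in_section = False
    for line in log_text.splitlines():
        if line.startswith("symbolic names:"):
            in_section = True
            continue
        if in_section:
            match = re.match(r"^\s+(\S+):\s+([\d.]+)$", line)
            if not match:
                break  # first non-indented line ends the section
            tags[match.group(1)] = match.group(2)
    return tags


def is_branch_tag(revision):
    """Assumed convention: branch tags look like 1.398.0.4."""
    parts = revision.split(".")
    return len(parts) >= 4 and parts[-2] == "0"


all_tags = parse_symbolic_names(SAMPLE_LOG)
release_tags = {t: r for t, r in all_tags.items() if not is_branch_tag(r)}
print(sorted(release_tags))
```

The resulting tag-to-revision map would still have to be translated into commits in the converted repository, which this sketch does not attempt.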
Re: [HACKERS] generic options for explain
On Tue, May 26, 2009 at 1:48 PM, Andrew Dunstan and...@dunslane.net wrote: Robert Haas wrote: On the other hand, XML can be a really difficult technology to work with because it doesn't map cleanly to the data structures that most modern scripting languages (Perl, Python, Ruby, and probably Java and others) use. As a simple example, if you have a hash like { a => 1, b => 2 } (using the Perl syntax) you can map it to <hash><a>1</a><b>2</b></hash>. That's easy to generate, but the reverse transformation is full of error-handling cases, like <hash><a>1</a><b>2<c/></b></hash> and <hash><a>1</a><a>2</a></hash>. I'm sure experienced XML hackers have ways to work around these problems, but the XML libraries I've worked with basically don't even try to turn the thing into any sort of general-purpose data structure. They just let you ask questions like "What is the root element? OK, now what elements does it contain? OK, there's an <a> tag there, what does that have inside it? Any more-deeply-nested tags?". On the other hand, JSON is explicitly designed to serialize and deserialize data structures of this type, and it pretty much just works, even between completely different programming languages. Since we will be controlling the XML output, we can restrict it to a form that is equivalent to what JSON and similar serialisation languages use. We can even produce an XSD schema specifying what is allowed, if anyone is so minded, and a validating parser could be told to validate the XML against that schema. And XSLT processing is a very powerful transformation tool. We could even provide a stylesheet that would turn the XML into JSON. :-) Yeah, that's fine. I think we should target 4/1/2010 as the submission date for that stylesheet. :-) Anyway, I think we're getting closer to consensus here. I think there's a good case for being able to stash the EXPLAIN output in a table as XML - that way we could slice and dice it several ways without having to rerun the EXPLAIN.
Yes, I think there is an excellent case for being able to stash any output format into a table. ...Robert
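The asymmetry described above, where a hash round-trips through JSON for free while the XML form needs explicit error handling on the way back, can be sketched with Python's standard json and xml.etree.ElementTree modules. The helper names are hypothetical, and the converter deliberately handles only the flat hash-of-integers shape used in the example.

```python
import json
import xml.etree.ElementTree as ET


def xml_to_hash(text):
    """Turn <hash><k>int</k>...</hash> back into a dict, or raise ValueError."""
    root = ET.fromstring(text)
    result = {}
    for child in root:
        if len(child) > 0:
            raise ValueError(f"<{child.tag}> contains nested elements")
        if child.tag in result:
            raise ValueError(f"<{child.tag}> appears more than once")
        result[child.tag] = int(child.text)
    return result


def try_parse(text):
    """Return the parsed dict, or a string describing why it was rejected."""
    try:
        return xml_to_hash(text)
    except ValueError as exc:
        return f"rejected: {exc}"


data = {"a": 1, "b": 2}

# JSON: serialize and deserialize with no schema knowledge at all.
json_round_trip = json.loads(json.dumps(data))

# XML: the clean case works, the two awkward cases need explicit handling.
clean = try_parse("<hash><a>1</a><b>2</b></hash>")
nested = try_parse("<hash><a>1</a><b>2<c/></b></hash>")
duplicate = try_parse("<hash><a>1</a><a>2</a></hash>")

print(json_round_trip, clean, nested, duplicate, sep="\n")
```

Restricting the generated XML to a JSON-equivalent form, as proposed in the reply, amounts to guaranteeing that the two rejected shapes never appear in the output.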
Re: [HACKERS] PostgreSQL Developer meeting minutes up
Again, this has been raised and ignored many times before on -hackers... The reason is that the tags in the CVS repository are broken (i.e. they are such that it's impossible to actually create all the tags), so the git cvsimport tools that try to import all the tags croak on the PG CVS repository. The tool which doesn't croak doesn't try and import all the tags, just the sticky branch tags... Scripts to fix (actually, remove) the broken tags have also been posted, along with requests that if somebody is mucking with the actual repository, to make sure it's known about, and access is denied during the mucking period (access being any rsync/anoncvs/mirroring of the cvs root). As long as the tags are broken, you aren't going to get the tags imported. If you're going to fix the tags, warn everybody (because most people doing automatic conversions must know - they may need to be very careful to avoid a full re-import), do it, and let us know when it's done. a. * Andrew Dunstan and...@dunslane.net [090526 10:41]: [moving this onto -hackers, where I think it belongs] Tom Lane wrote: Huh? The buildfarm will only prove that HEAD of the active branches builds. What the concern was was whether we could correctly extract past states (particularly, but not solely, the tags corresponding to releases) from a converted git repository. The testing I had in mind was to check out various tags and diff that tree against actual release tarballs. It appears that our git repo is only picking up the branch tags (e.g. REL8_0_STABLE), not all the release tags (e.g. REL8_0_5). That needs to be fixed (if possible). cheers andrew -- Aidan Van Dyk Create like a god, ai...@highrise.ca command like a king, http://www.highrise.ca/ work like a slave.
Re: [HACKERS] generic options for explain
Peter Eisentraut wrote: On Tuesday 26 May 2009 16:55:55 Dave Page wrote: I was thinking something similar, but from the pgAdmin perspective. We already use libxml2, but JSON would introduce another dependency for us. I was actually looking for a C library for JSON (json type for PostgreSQL; you know it is coming :-) ), but only found a library tied to glib, which, considering the experience with libxml, did not excite me. If someone knows of a different, small, and independent JSON library for C, I would like to hear about it. The JSON page (http://json.org/) lists for example http://fara.cs.uni-potsdam.de/~jsg/json_parser/ which appears not to need glib. But it seems very simple - though I haven't actually looked into the details. -- Magnus Hagander Self: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] usability of pg_get_function_arguments
Tom Lane wrote: Gevik Babakhani pg...@xs4all.nl writes: I experimented with your example and noticed that pg_get_expr requires a hack --- it insists on having a relation OID argument, because all previous use-cases for it involved expressions that might possibly refer to a particular table. So you have to do something like
regression=# select pg_get_expr(proargdefaults,'pg_proc'::regclass) from pg_proc where proname='f13';
 pg_get_expr
---
 10, 'hello'::character varying, '2009-01-01 00:00:00'::timestamp without time zone, 'comma here ,'::character varying
(1 row)
Unfortunately, there is no way to know which argument(s) the values above belong to. The last ones --- you can only omit arguments from the right, so it makes no sense to allow a nonconsecutive set of defaults. regards, tom lane Indeed. I did not see that earlier. Thank you. -- Regards, Gevik
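Since defaults can only be omitted from the right, the N default expressions returned by pg_get_expr belong to the last N arguments. A client-side sketch of that pairing follows. The argument names are hypothetical (the thread does not show the signature of f13), and the splitter is a simplification that only understands single-quoted strings and bracket nesting, not every SQL expression.

```python
def split_defaults(expr_text):
    """Split on top-level commas, ignoring commas inside '...' or brackets."""
    parts, current = [], []
    in_quote, depth = False, 0
    for ch in expr_text:
        if ch == "'":
            in_quote = not in_quote  # a doubled '' toggles twice, which is fine
        elif not in_quote:
            if ch in "([":
                depth += 1
            elif ch in ")]":
                depth -= 1
            elif ch == "," and depth == 0:
                parts.append("".join(current).strip())
                current = []
                continue
        current.append(ch)
    parts.append("".join(current).strip())
    return parts


def pair_defaults(arg_names, defaults):
    """Attach defaults to the rightmost arguments; others get None."""
    if len(defaults) > len(arg_names):
        raise ValueError("more defaults than arguments")
    first = len(arg_names) - len(defaults)
    return {name: (defaults[i - first] if i >= first else None)
            for i, name in enumerate(arg_names)}


# Text as returned by pg_get_expr in the message above.
expr = ("10, 'hello'::character varying, "
        "'2009-01-01 00:00:00'::timestamp without time zone, "
        "'comma here ,'::character varying")

# Hypothetical five-argument signature, so the first argument has no default.
paired = pair_defaults(["a", "b", "c", "d", "e"], split_defaults(expr))
for name, default in paired.items():
    print(name, "=>", default)
```

The quoted comma in the last default is the case a naive split on "," would get wrong, which is presumably why the test function includes it.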
Re: [HACKERS] A couple of gripes about the gettext plurals patch
Peter Eisentraut pete...@gmx.net writes: I tried throwing various kinds of subtle garbage into the errmsg/ngettext line, but it was all discovered by gcc -Wall. I experimented with this and found that indeed both format strings are checked ... if you have a reasonably recent libintl.h AND you have specified --enable-nls. Otherwise it all goes to heck, apparently because the compiler doesn't try to look through our substitute definition #define ngettext(s,p,n) ((n) == 1 ? (s) : (p)) So I'm still of the opinion that we need some work here. I think that instead of this #define we need an actual function that we can hang a couple of __attribute_format_arg__ markers on. Otherwise things are going to slip by us. (Not sure about you, but I don't build with --enable-nls by default.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
Robert Haas robertmh...@gmail.com writes: On Tue, May 26, 2009 at 10:09 AM, Tom Lane t...@sss.pgh.pa.us wrote: I don't trust filterdiff one bit :-( For any particular reason, or just natural skepticism? IIRC it was demonstrated to be broken the last time it was proposed as a solution to our problems. Maybe it's been fixed since then, but I don't have any confidence in it, since evidently it's not been stress tested very hard. I believe there have been some wild-eyed claims tossed around in this space previously that unified diffs don't provide all the same information as context diffs, which is flatly false. No, the gripe has always been just that they're less readable for nontrivial changes. The not-so-nice thing about unified diffs is that when there is a huge hunk of code that's changed, there are probably by chance a few identical lines buried in there, like }, so the + and - lines end up mixed together in a way that wouldn't happen in a context diff (which would turn the whole thing into two big ! sections). It's no problem for a machine to understand this, but it's hard to read for a human being. Exactly. Even without identical lines, I find that the old and new code gets intermixed in easily-confusing ways. -u is very readable for isolated single-line changes, but for anything larger, not so much. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] generic options for explain
* to...@tuxteam.de to...@tuxteam.de [090526 11:03]: ...and to put things into perspective: to...@floh:~$ apt-cache show libxml2 libjson-glib-1.0-0 | grep ^Size Size: 814356 Size: 33538 And including glib, which does all the work for libjson-glib: moun...@pumpkin:~/projects/postgresql/PostgreSQL$ apt-cache show libxml2 libjson-glib-1.0-0 libglib2.0-0 | grep ^Size Size: 870188 Size: 36132 Size: 845166 glib also pulls in libpcre: Size: 214650 So: - XML: 870188(libxml) + 76038 (zlib1g) = 946226 - JSON: 36132 (json) + 845166 (glib) + 214650 (pcre) = 1095948 ;-) -- Aidan Van Dyk Create like a god, ai...@highrise.ca command like a king, http://www.highrise.ca/ work like a slave.
Re: [HACKERS] PostgreSQL Developer meeting minutes up
Aidan Van Dyk ai...@highrise.ca writes: This has been raised and ignored many times before on -hackers... The reason is that the tags in the CVS repository are broken (i.e. they are such that it's impossible to actually create all the tags), so the git cvsimport tools that try to import all the tags croak on the PG CVS repository. The tool which doesn't croak doesn't try and import all the tags, just the sticky branch tags... Scripts to fix (actually, remove) the broken tags have also been posted, along with requests that if somebody is mucking with the actual repository, to make sure it's known about, and access is denied during the mucking period (access being any rsync/anoncvs/mirroring of the cvs root). Up to now I've always been of the opinion that fixing those tags wasn't worth taking any risk for. But if we are thinking of moving away from CVS, then this clearly becomes one of the hurdles we have to jump on the way. Can you refresh our memory about which tags are problematic and exactly what needs to be done about 'em? regards, tom lane
Re: [HACKERS] [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Tue, May 26, 2009 at 10:09 AM, Tom Lane t...@sss.pgh.pa.us wrote: I don't trust filterdiff one bit :-( For any particular reason, or just natural skepticism? IIRC it was demonstrated to be broken the last time it was proposed as a solution to our problems. Maybe it's been fixed since then, but I don't have any confidence in it, since evidently it's not been stress tested very hard. I think you're probably confusing it with interdiff. I've had the latter fail several times (and I haven't really used it all that much), but I've never seen filterdiff make a mistake even though I use it frequently. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] generic options for explain
Hi, Peter Eisentraut pete...@gmx.net writes: I was actually looking for a C library for JSON (json type for PostgreSQL; you know it is coming :-) ), but only found a library tied to glib, which, considering the experience with libxml, did not excite me. If someone knows of a different, small, and independent JSON library for C, I would like to hear about it. Looking at http://json.org/, it seems this particular project could fit: http://lloyd.github.com/yajl/ Yet Another JSON Library. YAJL is a small event-driven (SAX-style) JSON parser written in ANSI C, and a small validating JSON generator. YAJL is released under the BSD license. ... It's all ANSI C. It's been successfully compiled on debian linux, OSX 10.4 i386 ppc, OSX 10.5 i386, winXP, FreeBSD 4.10, FreeBSD 6.1 amd64, FreeBSD 7 i386, and windows vista. More platforms and binaries as time permits. ... A second motivation for writing YAJL, was that many available free JSON parsers fall over on large or complex inputs. YAJL is careful to minimize memory copying and input re-scanning when possible. The result is a parser that should be fast enough for most applications or tunable for any application. On my mac pro (2.66 ghz) it takes 1s to verify a 60meg json file. Minimizing that same file with json_reformat takes 4s. Largely because YAJL deals with streams, it's possible to parse JSON in low memory environments. Oftentimes with other parsers an application must hold both the input text and the memory representation of the tree in memory at one time. With YAJL you can incrementally read the input stream and hold only the in memory representation. Or for filtering or validation tasks, it's not required to hold the entire input text in memory. Hope this helps, regards, -- dim -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
Alvaro Herrera alvhe...@commandprompt.com writes: Tom Lane wrote: IIRC it was demonstrated to be broken the last time it was proposed as a solution to our problems. Maybe it's been fixed since then, but I don't have any confidence in it, since evidently it's not been stress tested very hard. I think you're probably confusing it with interdiff. No, because I never heard of interdiff before. Checking the archives, the discussion I was remembering was definitely about filterdiff, but the rap on it was undocumented (so maybe demonstrated is too harsh): http://archives.postgresql.org/pgsql-hackers/2007-10/msg01243.php regards, tom lane
Re: [HACKERS] generic options for explain
(sorry for top posting - stupid apple) So the real elephant in the room is that the existing explain code is not really designed to be extensible, configurable, or to be printed in different formats. The current code is basically just gobs of text printed by different routines all over the code base. There are no data structures which represent what explain prints. The closest thing is the instrumentation objects which obtain the timing and counts but not the planner expectations or any associated data. If we're going to support multiple output formats or options to turn off and on sections I think we need to build a data structure independent of the format, have code to include or exclude stats as requested and then pass that to the requested formatter. -- Greg On 26 May 2009, at 18:53, Robert Haas robertmh...@gmail.com wrote: On Tue, May 26, 2009 at 1:48 PM, Andrew Dunstan and...@dunslane.net wrote: Robert Haas wrote: On the other hand, XML can be a really difficult technology to work with because it doesn't map cleanly to the data structures that most modern scripting languages (Perl, Python, Ruby, and probably Java and others) use. As a simple example, if you have a hash like { a => 1, b => 2 } (using the Perl syntax) you can map it to <hash><a>1</a><b>2</b></hash>. That's easy to generate, but the reverse transformation is full of error-handling cases, like <hash><a>1</a><b>2<c/></b></hash> and <hash><a>1</a><a>2</a></hash>. I'm sure experienced XML hackers have ways to work around these problems, but the XML libraries I've worked with basically don't even try to turn the thing into any sort of general-purpose data structure. They just let you ask questions like "What is the root element? OK, now what elements does it contain? OK, there's an <a> tag there, what does that have inside it? Any more-deeply-nested tags?".
On the other hand, JSON is explicitly designed to serialize and deserialize data structures of this type, and it pretty much just works, even between completely different programming languages. Since we will be controlling the XML output, we can restrict it to a form that is equivalent to what JSON and similar serialisation languages use. We can even produce an XSD schema specifying what is allowed, if anyone is so minded, and a validating parser could be told to validate the XML against that schema. And XSLT processing is a very powerful transformation tool. We could even provide a stylesheet that would turn the XML into JSON. :-) Yeah, that's fine. I think we should target 4/1/2010 as the submission date for that stylesheet. :-) Anyway, I think we're getting closer to consensus here. I think there's a good case for being able to stash the EXPLAIN output in a table as XML - that way we could slice and dice it several ways without having to rerun the EXPLAIN. Yes, I think there is an excellent case for being able to stash any output format into a table. ...Robert
Re: [HACKERS] [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
Uhm, the rap you quoted was ambiguous but I read it as referring to the ability I described of viewing the difference between two patches -- which I didn't name but is in fact interdiff. -- Greg On 26 May 2009, at 19:58, Tom Lane t...@sss.pgh.pa.us wrote: Alvaro Herrera alvhe...@commandprompt.com writes: Tom Lane wrote: IIRC it was demonstrated to be broken the last time it was proposed as a solution to our problems. Maybe it's been fixed since then, but I don't have any confidence in it, since evidently it's not been stress tested very hard. I think you're probably confusing it with interdiff. No, because I never heard of interdiff before. Checking the archives, the discussion I was remembering was definitely about filterdiff, but the rap on it was undocumented (so maybe demonstrated is too harsh): http://archives.postgresql.org/pgsql-hackers/2007-10/msg01243.php regards, tom lane
Re: [HACKERS] generic options for explain
Greg Stark greg.st...@enterprisedb.com writes: So the real elephant in the room is that the existing explain code is not really designed to be extensible, configurable, or to be printed in different formats. These are implementation details ;-). Let's get a definition that everyone can sign off on, and then worry about what has to be done to the code to make it happen. Even if we end up throwing away and rewriting all of explain.c, that's not *that* much code. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] generic options for explain
On Tue, May 26, 2009 at 3:04 PM, Greg Stark greg.st...@enterprisedb.com wrote: (sorry for top posting - stupid apple) So the real elephant in the room is that the existing explain code is not really designed to be extensible, configurable, or to be printed in different formats. The current code is basically just gobs of text printed by different routines all over the code base. There are no data structures which All over the code base? It looks to me like most of it is in explain.c, specifically explain_outNode(). (On an unrelated point, it's difficult to imagine why someone thought that was a good way of capitalizing and punctuating that function name.) represent what explain prints. The closest thing is the instrumentation objects which obtain the timing and counts but not the planner expectations or any associated data. If we're going to support multiple output formats or options to turn off and on sections I think we need to build a data structure independent of the format, have code to include or exclude stats as requested and then pass that to the requested formatter. That sounds about right to me. I think that representation can be pretty thin, though, maybe just a big struct with all the attributes that are applicable to any node type and pointers to its left and right children. ...Robert
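The "thin" representation suggested above, one record with the common attributes plus left and right children handed to whichever formatter was requested, might look roughly like the sketch below. It is written in Python rather than the C the backend would use, and every field name, cost and row count in it is invented for illustration, not taken from explain.c.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class PlanNode:
    """Format-independent description of one plan node (fields are hypothetical)."""
    node_type: str
    startup_cost: float
    total_cost: float
    plan_rows: int
    left: Optional["PlanNode"] = None
    right: Optional["PlanNode"] = None


def to_text(node, depth=0):
    """Indented text lines, loosely modelled on the existing EXPLAIN output."""
    indent = "  " * depth
    lines = [f"{indent}{node.node_type}  "
             f"(cost={node.startup_cost:.2f}..{node.total_cost:.2f} "
             f"rows={node.plan_rows})"]
    for child in (node.left, node.right):
        if child is not None:
            lines.extend(to_text(child, depth + 1))
    return lines


def to_json(node):
    """The same tree serialized as JSON; absent children become null."""
    return json.dumps(asdict(node), indent=2)


# Made-up plan: the numbers are placeholders, not real planner output.
plan = PlanNode("Hash Join", 1.0, 20.0, 100,
                left=PlanNode("Seq Scan", 0.0, 10.0, 100),
                right=PlanNode("Hash", 1.0, 1.0, 10))

print("\n".join(to_text(plan)))
print(to_json(plan))
```

Both formatters read the same tree, so adding an output format or switching a section off would not require touching the code that collects the data.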
Re: [HACKERS] [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
Greg Stark greg.st...@enterprisedb.com writes: On 26 May 2009, at 19:58, Tom Lane t...@sss.pgh.pa.us wrote: http://archives.postgresql.org/pgsql-hackers/2007-10/msg01243.php Uhm, the rap you quoted was ambiguous but I read it as referring to the ability I described of viewing the difference between two patches -- which I didn't name but is in fact interdiff. [ squint... ] Hmm, maybe you're right. I see how it could be read that way, anyway. regards, tom lane
Re: [HACKERS] generic options for explain
On Tue, May 26, 2009 at 3:20 PM, Tom Lane t...@sss.pgh.pa.us wrote: Greg Stark greg.st...@enterprisedb.com writes: So the real elephant in the room is that the existing explain code is not really designed to be extensible, configurable, or to be printed in different formats. These are implementation details ;-). Let's get a definition that everyone can sign off on, and then worry about what has to be done to the code to make it happen. Even if we end up throwing away and rewriting all of explain.c, that's not *that* much code. I'm actually not sure there's a whole lot to hash out... I was going to take a crack at writing some code. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] generic options for explain
Robert Haas robertmh...@gmail.com writes: On Tue, May 26, 2009 at 3:20 PM, Tom Lane t...@sss.pgh.pa.us wrote: These are implementation details ;-). Let's get a definition that everyone can sign off on, and then worry about what has to be done to the code to make it happen. I'm actually not sure there's a whole lot to hash out... I was going to take a crack at writing some code. I still haven't seen anything but formless handwaving as far as the SQL table output format goes. For that matter, there's not much more than handwaving behind the XML meme either. Show us a spec for the output format, then think about code. (This was somewhere around slide ten here: http://momjian.us/main/writings/pgsql/patch.pdf ;-)) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] generic options for explain
On Tue, May 26, 2009 at 3:33 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Tue, May 26, 2009 at 3:20 PM, Tom Lane t...@sss.pgh.pa.us wrote: These are implementation details ;-). Let's get a definition that everyone can sign off on, and then worry about what has to be done to the code to make it happen. I'm actually not sure there's a whole lot to hash out... I was going to take a crack at writing some code. I still haven't seen anything but formless handwaving as far as the SQL table output format goes. For that matter, there's not much more than handwaving behind the XML meme either. Show us a spec for the output format, then think about code. (This was somewhere around slide ten here: http://momjian.us/main/writings/pgsql/patch.pdf ;-)) OK, how about this: http://archives.postgresql.org/message-id/603c8f070905241827g74c8bf9cie9d98e38037a8...@mail.gmail.com I note in passing that there have been 51 messages posted to this thread since I wrote that email, and none of them were responses to it. At any rate, that email might not be as detailed as what you're looking for, but it's certainly a start. I don't really know how the table-format output is going to work out; I have to look at the code more to get a feeling for that. But I think with respect to XML or JSON, there really aren't too many options for how it can look, modulo minor syntax tweaks like arguing about whether the join type should be labelled JoinType or jointype or join_type. Still, if you have comments or think I'm overlooking something important, I definitely would like to know about that now before I put more time into it. I recognize that we haven't come to a consensus on the best possible syntax for EXPLAIN options, but it seems to me that the threshold issue for improving EXPLAIN is everyone agreeing that we're going to allow for some kind of extendable syntax that doesn't rely on all options being keywords (presented in a fixed order, no less!).
You caved in on that point upthread and I don't think we have any other holdouts. Now, of course, my syntax is the best possible one in the entire universe, but if by chance there is a technically feasible alternative syntax on which more than one person can agree (note: this has not happened yet), adjusting my patch to use that syntax rather than the one I stole from Peter shouldn't be too hard. A second issue on which we don't have consensus is a method to capture explain output. I am 100% of the opinion that there are only two sensible things to do here: (1) make EXPLAIN a fully reserved keyword so that we can use it just like a SELECT, or (2) provide a built-in function like pg_explain() that calls EXPLAIN with a user-specified set of arguments, and which third-party tools can count on to be installed. Since you labelled (1) as a non-starter and AFAICS you're the only holdout on making (2) a built-in rather than something everyone has to define for themselves, I'm hopeful that we'll bring you around. :-) The final issue on which we don't have a clear consensus is what OTHER new options we want for EXPLAIN aside from choice of output format. I posted a few ideas that I have and solicited some others upthread, but I think that the volume of email on other aspects of this patch has deprived people of the necessary time and space to think about how they might like to use an extensible options syntax once we have it - not to mention that the original patch was only posted 3 days ago and on a day when many of us were on airplanes, about to get on airplanes, or still jet-lagged. Personally, I think that that's the most interesting aspect of this whole project so I hope it gets some attention going forward, but I'm not too concerned about the exact timing of that attention. The point is that people not-infrequently come up with more stuff they'd like to see in EXPLAIN output, and those ideas get shot down because we don't have the syntax. 
If we fix the syntax, those ideas will come back around again in due course, and we'll be able to consider them on their merits rather than peremptorily shooting them down. ...Robert
[HACKERS] ALTER CAST
Hi, Is there any reason not to implement an ALTER CAST statement? The situation where I need it is a migration: I'm currently migrating an application from SQL Server, where the cast from int to bool appears to be implicit, and the application relies on that. Instead of changing the whole application, it would be far more convenient to alter the cast (int AS bool) to make it implicit, but at present the only way to do that is to modify the system catalogs directly. -- Regards, Jaime Casanova PostgreSQL support and training, systems consulting and development Guayaquil - Ecuador Cel. +59387171157
Re: [HACKERS] generic options for explain
Robert Haas robertmh...@gmail.com writes: On Tue, May 26, 2009 at 3:33 PM, Tom Lane t...@sss.pgh.pa.us wrote: I still haven't seen anything but formless handwaving as far as the SQL table output format goes. For that matter, there's not much more than handwaving behind the XML meme either. OK, how about this: http://archives.postgresql.org/message-id/603c8f070905241827g74c8bf9cie9d98e38037a8...@mail.gmail.com I note in passing that there have been 51 messages posted to this thread since I wrote that email, and none of them were responses to it. Well, we were having too much fun arguing about trivia ;-). And I suspect a number of people were too jet-lagged to keep track of what they'd read and what not. Anyway, good, we have a starting point. Some issues that I see here: 1. You seem to be assuming that each table row will represent exactly one plan node, no more, no less. That's okay as a first approximation but it breaks down on closer examination. In particular, where will you hang the information that's already available about trigger execution costs? Those are not associated with any particular plan node, as they occur atop the whole plan. The same goes for the total execution time of course, and I can foresee other types of stats that we might gather someday that would be hard to tie to any specific plan node. In XML this is soluble by having a toplevel node ExplainResults that contains not only the plan tree but other children. I'm not seeing how to translate that into a SQL table, though. Or at least not just one SQL table. 2. You didn't say anything about how any but simple scalar fields will be represented. Filter conditions and sort keys are particularly interesting here. I'm not really happy with just plopping down the same textual output we have now --- that is just as human-friendly-and-not-machine-friendly as before, only with a slightly smaller scope. 
I can foresee for example that someone might wish to extract the second or third sort key expression from a Sort node's sort key list. Or what about problems such as find which nodes this field is used in? 3. You left us with a handwave about how the tree structure will be represented in a table. Needs to be explicit. And it's not just simple child relationships that should be represented ... tell us about initplans and subplans, too. 4. The point about having lots of NULL columns is an annoyance that could escalate to the point of near unusability. To get a feeling for how workable that is, we need a pretty exact list of the set of output columns, not just a rough list of the kinds of things that will be there. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
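Points 1 and 2 above can be made a little more concrete with a small sketch. Only the top-level element name ExplainResults comes from the message; every other element and attribute name here (Plan, SortKey, Key, Triggers, Trigger, TotalRuntime), and all of the values, are hypothetical and chosen purely for illustration. The idea is that whole-query statistics hang off the top-level node rather than any plan node, and that each sort key expression is its own element, so a tool could pick out the second or third one.

```python
import xml.etree.ElementTree as ET


def build_explain_results():
    """Build a toy EXPLAIN document; all names except ExplainResults are invented."""
    root = ET.Element("ExplainResults")

    # The plan tree is one child of the top-level node.
    sort_node = ET.SubElement(root, "Plan", {"NodeType": "Sort"})
    sort_key = ET.SubElement(sort_node, "SortKey")
    for expr in ("t.a", "t.b DESC", "lower(t.c)"):
        # One element per sort key expression instead of one text blob.
        ET.SubElement(sort_key, "Key").text = expr
    ET.SubElement(sort_node, "Plan", {"NodeType": "Seq Scan", "Relation": "t"})

    # Whole-query statistics are siblings of the plan tree (made-up numbers).
    triggers = ET.SubElement(root, "Triggers")
    ET.SubElement(triggers, "Trigger", {"Name": "audit_trg", "TimeMs": "0.42"})
    ET.SubElement(root, "TotalRuntime").text = "12.5"
    return root


def nth_sort_key(root, n):
    """Return the n-th (1-based) sort key expression of the top plan node."""
    keys = root.findall("./Plan/SortKey/Key")
    return keys[n - 1].text


root = build_explain_results()
print(nth_sort_key(root, 2))
print(ET.tostring(root, encoding="unicode"))
```

A single SQL table has no obvious place for the Triggers and TotalRuntime children, which is the difficulty raised in point 1.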
Re: [HACKERS] ALTER CAST
Jaime Casanova jcasa...@systemguards.com.ec writes: Hi, any reason not to implement an ALTER CAST statement? Mainly, because we don't really *want* every thinks-he-knows-something DBA fooling around with the built-in casts. If he actually knows enough to know whether it's safe, then he knows enough to do it by poking the catalogs. regards, tom lane
Re: [HACKERS] PostgreSQL Developer meeting minutes up
On Tue, 26 May 2009, Aidan Van Dyk wrote: * Tom Lane t...@sss.pgh.pa.us [090526 11:20]: Aidan Van Dyk ai...@highrise.ca writes: This has been raised and ignored many times before on -hackers... The reason is because the tags in the CVS repository are broken (i.e., they are such that it's impossible to actually create all the tags), so the git cvsimport tools that try to import tags all croak on the PG CVS repository. The tool which doesn't croak doesn't try to import all the tags, just the sticky branch tags... Scripts to fix (actually, remove) the broken tags have also been posted, along with requests that if somebody is mucking with the actual repository, to make sure it's known about, and access is denied during the mucking period (access being any rsync/anoncvs/mirroring of the cvs root). Up to now I've always been of the opinion that fixing those tags wasn't worth taking any risk for. But if we are thinking of moving away from CVS, then this clearly becomes one of the hurdles we have to jump on the way. Can you refresh our memory about which tags are problematic and exactly what needs to be done about 'em? Specifically, it's 2 tags, and I just remove them: REL7_1_BETA2 REL7_1_BETA3 So, you are suggesting:
cvs -q tag -d REL7_1_BETA2 .
cvs -q tag -d REL7_1_BETA3 .
correct? Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email . scra...@hub.org MSN . scra...@hub.org Yahoo . yscrappy Skype: hub.org ICQ . 7615664
Re: [HACKERS] generic options for explain
On Tue, May 26, 2009 at 5:24 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Tue, May 26, 2009 at 3:33 PM, Tom Lane t...@sss.pgh.pa.us wrote: I still haven't seen anything but formless handwaving as far as the SQL table output format goes. For that matter, there's not much more than handwaving behind the XML meme either. OK, how about this: http://archives.postgresql.org/message-id/603c8f070905241827g74c8bf9cie9d98e38037a8...@mail.gmail.com I note in passing that there have been 51 messages posted to this thread since I wrote that email, and none of them were responses to it. Well, we were having too much fun arguing about trivia ;-). And I suspect a number of people were too jet-lagged to keep track of what they'd read and what not. Anyway, good, we have a starting point. Some issues that I see here: 1. You seem to be assuming that each table row will represent exactly one plan node, no more, no less. That's okay as a first approximation but it breaks down on closer examination. In particular, where will you hang the information that's already available about trigger execution costs? Those are not associated with any particular plan node, as they occur atop the whole plan. The same goes for the total execution time of course, and I can foresee other types of stats that we might gather someday that would be hard to tie to any specific plan node. In XML this is soluble by having a toplevel node ExplainResults that contains not only the plan tree but other children. I'm not seeing how to translate that into a SQL table, though. Or at least not just one SQL table. 2. You didn't say anything about how any but simple scalar fields will be represented. Filter conditions and sort keys are particularly interesting here. I'm not really happy with just plopping down the same textual output we have now --- that is just as human-friendly-and-not-machine-friendly as before, only with a slightly smaller scope. 
I can foresee for example that someone might wish to extract the second or third sort key expression from a Sort node's sort key list. Or what about problems such as find which nodes this field is used in? 3. You left us with a handwave about how the tree structure will be represented in a table. Needs to be explicit. And it's not just simple child relationships that should be represented ... tell us about initplans and subplans, too. 4. The point about having lots of NULL columns is an annoyance that could escalate to the point of near unusability. To get a feeling for how workable that is, we need a pretty exact list of the set of output columns, not just a rough list of the kinds of things that will be there. Responding to these in bulk, I think that 1, 3, and 4 are pretty convincing arguments that the SQL-based output format is underspecified. I hereby promise not to do anything about that without further discussion, which is an easy promise to make considering that in light of those comments I have no idea what it should look like. I think (1) is the most damning point. However, as far as I can see, none of these will affect XML or JSON. With respect to (2), I think we should output the same text format that we have now, for starters. I agree that's not the only thing that someone might want, but I think there's a pretty good argument that it's ONE thing that someone might reasonably want, depending on the application. If someone cares to build a better mousetrap in this area, it can be added on once we figure out the design, and without breaking anything! - that's sort of the whole point of this exercise. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Lossy operators, RECHECK, pg_migrator, n all that
Over a year ago we left it unresolved exactly what to do with the now-obsolete RECHECK markings in GIST/GIN opclass definitions: http://archives.postgresql.org/message-id/19212.1208122...@sss.pgh.pa.us The current behavior of CVS HEAD is more or less designed to intentionally fail in this case, and it's time to stop doing that. The question came up again in bug #4817 http://archives.postgresql.org/pgsql-bugs/2009-05/msg00217.php and I also note that there is zero chance of pg_migrator working on databases containing lossy GIST/GIN opclasses as things stand. I think that the right solution is simply to reduce the existing gram.y ERROR about RECHECK to a WARNING or even a NOTICE; that is, we can just have 8.4 ignore it and no great harm will be done. Here are the considerations leading me to that: 1. As I noted in the bug #4817 thread, having pg_dump not emit RECHECK when scanning an old database seems like a very nasty form of foot-gun. We can't prevent people from trying to use a newer pg_dump with an older server. 2. There's also the problem that dumps made with 8.3 or older pg_dump would contain these keywords even if we had 8.4 pg_dump not emit them. So we have a forward-porting problem anyway. 3. As things have settled out, there is really not much harm in loading a pre-8.4 GIST or GIN opclass definition into 8.4. The APIs for some of the support functions have changed, but in an upward-compatible way. Even if your underlying .so module knows about the new arguments, it will still work if the SQL-level function definitions for it don't include those arguments, because GIST/GIN don't check the SQL-level function definitions. (Which might be a bad idea overall, but right now it provides a handy escape hatch.) Certainly, whether the RECHECK flags mean anything is the very least of your worries about whether an old opclass will behave correctly in 8.4 --- so it's pretty silly to throw errors for them when we're not even looking at the support function signatures. 
So it appears to me that downgrading the ERROR is a simple and safe solution. Any objections? regards, tom lane
Re: [HACKERS] Common Table Expressions applied; some issues remain
Greg Stark st...@enterprisedb.com writes: [ point 1 here remains unresolved: http://archives.postgresql.org/message-id/9623.1223158...@sss.pgh.pa.us ] One possibility would be to not flatten the query but find these quals and copy them onto the cte when planning the cte. So we would still materialize the result and avoid duplicate execution but only fetch the records which we know a caller will need. We could even do that for multiple callers if we join their quals with an OR -- that still might allow a bitmap index scan. I'm not too thrilled about that solution because it still eliminates predictability of execution of volatile functions. It's really just a partial form of subquery pullup, so we're paying all the disadvantages for only a subset of the advantages. I could still see doing what I mentioned in the prior message, which is to flatten CTEs as if they are plain sub-selects when 1. they are non-recursive, 2. they are referenced only once, and 3. they contain no volatile functions. Restriction #3 is what we need to ensure we aren't causing visible semantics changes. You could argue #2 either way, I guess, but my feeling is that if someone is using a doubly referenced CTE then he's probably doing something more complex than we are currently prepared to optimize well. I think we should let that case go until we understand typical usage and possible optimizations better. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
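The three conditions above can be written down as a simple predicate. This is only a sketch of the rule as stated in the message, not planner code: the CteInfo record and its field names are hypothetical, and a real implementation would have to derive these facts from the parse tree.

```python
from dataclasses import dataclass


@dataclass
class CteInfo:
    """Hypothetical summary of what the planner knows about one CTE."""
    name: str
    recursive: bool
    reference_count: int
    has_volatile_functions: bool


def can_flatten(cte: CteInfo) -> bool:
    """Apply the three proposed restrictions for treating a CTE as a plain sub-select."""
    return (
        not cte.recursive                    # 1. non-recursive
        and cte.reference_count == 1         # 2. referenced only once
        and not cte.has_volatile_functions   # 3. no volatile functions
    )


simple = CteInfo("recent_orders", recursive=False, reference_count=1,
                 has_volatile_functions=False)
shared = CteInfo("totals", recursive=False, reference_count=2,
                 has_volatile_functions=False)
print(can_flatten(simple), can_flatten(shared))
```

Restriction 3 is the one that protects visible semantics; restriction 2 is the conservative choice argued for above and could be relaxed later without changing the shape of the check.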
Re: [HACKERS] generic options for explain
Sorry to come in on this discussion so late. Just catching up. Robert Haas robertmh...@gmail.com wrote: Responding to these in bulk, I think that 1, 3, and 4 are pretty convincing arguments that the SQL-based output format is underspecified. I hereby promise not to do anything about that without further discussion, which is an easy promise to make considering that in light of those comments I have no idea what it should look like. I think (1) is the most damning point. However, as far as I can see, none of these will affect XML or JSON. Personally, I find XML to be very hard to read; however, I can see the value of writing to that and having someone who can tolerate XSLT turn XML into anything else we want. (That could include morphing it into SELECT statements with the literals to present it as a tuple set, I should think.) As long as nobody considers this issue done until there are useful and convenient ways to display and use the data within psql without having to look at the XML, that seems a reasonable approach. The big plus of the current technique is that it is so convenient to Ctrl+C something which is running too long, arrow up, hit Home, and put the EXPLAIN word in front. Turning the query into a character string literal and feeding it to a function would be a big step backward. A big downside of the current technique is that you can't get both the results of a SELECT and its plan. I haven't seen any discussion here about emitting the EXPLAIN output through some INFO messages or some such, and letting the query return its normal results, but I feel that would be a significant improvement, if that can be done. Also, something I miss from previous database products is a way to control the verbosity of the output when planning. I do think that needs to be some sort of option up front, not a filter phase, because of the high cost it can have. 
If there was a way to show all the candidate plans and their cost estimates in a run time environment, without any special build or configuration needed, I'd use it every now and then. -Kevin
[HACKERS] commitfest management webapp
Back in January, there was some discussion of creating a web application to make it easier to manage CommitFests. http://archives.postgresql.org/message-id/20090127134245.ga6...@alvh.no-ip.org This was further discussed at PGCon, and I now have a working version for folks to play with. With the help of Dave Page, I was able to relearn how to use the BSD ports collection and get it up here: http://coridan.postgresql.org/ Feedback is appreciated, especially from people involved in previous commitfests as committers, reviewers, patch authors, or commitfest managers. The source code, in Perl, is available here: http://git.postgresql.org/gitweb?p=pgcommitfest.git;a=summary A few things to note: 1. You won't see a link anywhere to create a new CommitFest or edit the name of an existing CommitFest. This is not because the functionality doesn't exist, but because your community login account is not enabled with administrator privileges for this application. For the same reason, you won't be able to edit or delete comments other than your own. If you would like to have these great powers, please send me a private email with your community account name and I will power you up. If we decide to use this as official project infrastructure, then you might get un-powered up unless I or one of the CommitFest managers have some idea who you are. :-) 2. There are many things that this application doesn't do. One particularly glaring thing that it doesn't do is allow you to move a patch from one CommitFest to another CommitFest. This is basically a bug that I intend to fix, but it didn't seem necessary to fix it before putting the thing out there for people to look at and comment on. At any rate, if the application doesn't do something that you wish it did, please feel free to let me know what that thing is, or provide a patch. I'm very interested in making this better (unless of course you all hate it, in which case my interest in improving it will likely decline precipitously). 3. 
The integration with the community login system is currently rather poor. The problem is that we can't count on patch submitters to have a community login, and even if they do we can't count on the person adding the patch to the system to know what it is. We could of course require patch submitters to have a community login and to add their patches themselves, but I'm not really that keen on raising the bar for submitting a patch even to that modest extent. I'm open to suggestions on how to improve this situation, though, because it's definitely not ideal, and precludes things that reasonable people might want to do, like contact the guy who submitted this patch, contact the authors of all patches waiting for review, and similar. 4. The intent of this system is really just to get the data that is currently on the CommitFest pages into a database where, I venture to say, structured data about the development cycle of a database product properly belongs. I expect it to be possible to use this tool to build additional infrastructure to facilitate patch review, like an automated test to see whether all the latest versions of all the open patches actually still apply. (If we have a human being sanity check them to make sure they don't contain malicious code, we could also test for compiles and passes regression tests, which would rock.) This infrastructure does not currently exist, but having the data in a database makes it feasible to think about doing it; suggestions are welcome, as is code. I know that there are some of you reading this who may think that we should convert to reviewboard or patchwork or some other system. I can say that personally I'm unimpressed by those suggestions because they will almost certainly require process changes that this does not, process changes which I suspect we're unprepared to make. But there's nothing to prevent you from setting up and advocating your system in this space. 
...Robert
Re: [HACKERS] [pgsql-www] commitfest management webapp
Robert Haas robertmh...@gmail.com writes: Back in January, there was some discussion of creating a web application to make it easier to manage CommitFests. This was further discussed at PGCon, and I now have a working version for folks to play with. Cool. Just reading your description, I have one thought: 3. The integration with the community login system is currently rather poor. The problem is that we can't count on patch submitters to have a community login, and even if they do we can't count on the person adding the patch to the system to know what it is. We could of course require patch submitters to have a community login and to add their patches themselves, but I'm not really that keen on raising the bar for submitting a patch even to that modest extent. Agreed on that, though we have recently been asking people to do that and most seem to have played along. I'm open to suggestions on how to improve this situation, though, because it's definitely not ideal, and precludes things that reasonable people might want to do, like contact the guy who submitted this patch, contact the authors of all patches waiting for review, and similar. I don't understand why that bit would be based on community login at all. Wouldn't contacting someone mainly need an email address? regards, tom lane
[HACKERS] dblink patches for comment
The attached addresses items #2 and #3 as listed by Bruce here: http://momjian.us/cgi-bin/pgsql/joe I think it is consistent with the discussions we had at PGCon last week. Any objections to me committing this for 8.4? On a side note, should I try to address items #1 and #4 for 8.4 as well? Perhaps #4 yes since it is arguably a bug fix, but no to #1? Joe

Index: dblink.c
===================================================================
RCS file: /opt/src/cvs/pgsql/contrib/dblink/dblink.c,v
retrieving revision 1.77
diff -c -r1.77 dblink.c
*** dblink.c    1 Jan 2009 17:23:31 -0000    1.77
--- dblink.c    25 May 2009 22:57:22 -0000
***************
*** 46,51 ****
--- 46,52 ----
  #include "catalog/pg_type.h"
  #include "executor/executor.h"
  #include "executor/spi.h"
+ #include "foreign/foreign.h"
  #include "lib/stringinfo.h"
  #include "miscadmin.h"
  #include "nodes/execnodes.h"
***************
*** 77,83 ****
  /*
   * Internal declarations
   */
! static Datum dblink_record_internal(FunctionCallInfo fcinfo, bool is_async, bool do_get);
  static remoteConn *getConnectionByName(const char *name);
  static HTAB *createConnHash(void);
  static void createNewConnection(const char *name, remoteConn * rconn);
--- 78,84 ----
  /*
   * Internal declarations
   */
! static Datum dblink_record_internal(FunctionCallInfo fcinfo, bool is_async);
  static remoteConn *getConnectionByName(const char *name);
  static HTAB *createConnHash(void);
  static void createNewConnection(const char *name, remoteConn * rconn);
***************
*** 93,101 ****
  static HeapTuple get_tuple_of_interest(Oid relid, int2vector *pkattnums, int16 pknumatts, char **src_pkattvals);
  static Oid get_relid_from_relname(text *relname_text);
  static char *generate_relation_name(Oid relid);
! static void dblink_connstr_check(const char *connstr);
! static void dblink_security_check(PGconn *conn, remoteConn *rconn);
  static void dblink_res_error(const char *conname, PGresult *res, const char *dblink_context_msg, bool fail);

  /* Global */
  static remoteConn *pconn = NULL;
--- 94,103 ----
  static HeapTuple get_tuple_of_interest(Oid relid, int2vector *pkattnums, int16 pknumatts, char **src_pkattvals);
  static Oid get_relid_from_relname(text *relname_text);
  static char *generate_relation_name(Oid relid);
! static void dblink_connstr_check(const char *connstr, bool is_fdw);
! static void dblink_security_check(PGconn *conn, remoteConn *rconn, bool is_fdw);
  static void dblink_res_error(const char *conname, PGresult *res, const char *dblink_context_msg, bool fail);
+ static char *get_connect_string(const char *servername);

  /* Global */
  static remoteConn *pconn = NULL;
***************
*** 165,172 ****
              } \
              else \
              { \
!                 connstr = conname_or_str; \
!                 dblink_connstr_check(connstr); \
                  conn = PQconnectdb(connstr); \
                  if (PQstatus(conn) == CONNECTION_BAD) \
                  { \
--- 167,180 ----
              } \
              else \
              { \
!                 bool is_fdw = true; \
!                 connstr = get_connect_string(conname_or_str); \
!                 if (connstr == NULL) \
!                 { \
!                     is_fdw = false; \
!                     connstr = conname_or_str; \
!                 } \
!                 dblink_connstr_check(connstr, is_fdw); \
                  conn = PQconnectdb(connstr); \
                  if (PQstatus(conn) == CONNECTION_BAD) \
                  { \
***************
*** 177,183 ****
                               errmsg("could not establish connection"), \
                               errdetail("%s", msg))); \
                  } \
!                 dblink_security_check(conn, rconn); \
                  freeconn = true; \
              } \
      } while (0)
--- 185,191 ----
                               errmsg("could not establish connection"), \
                               errdetail("%s", msg))); \
                  } \
!                 dblink_security_check(conn, rconn, is_fdw); \
                  freeconn = true; \
              } \
      } while (0)
***************
*** 210,237 ****
  Datum
  dblink_connect(PG_FUNCTION_ARGS)
  {
      char *connstr = NULL;
      char *connname = NULL;
      char *msg;
      PGconn *conn = NULL;
      remoteConn *rconn = NULL;

      DBLINK_INIT;

      if (PG_NARGS() == 2)
      {
!         connstr = text_to_cstring(PG_GETARG_TEXT_PP(1));
          connname = text_to_cstring(PG_GETARG_TEXT_PP(0));
      }
      else if (PG_NARGS() == 1)
!         connstr = text_to_cstring(PG_GETARG_TEXT_PP(0));

      if (connname)
          rconn = (remoteConn *) MemoryContextAlloc(TopMemoryContext,
                                                    sizeof(remoteConn));

      /* check password in connection string if not superuser */
!     dblink_connstr_check(connstr);
      conn = PQconnectdb(connstr);

      if (PQstatus(conn) == CONNECTION_BAD)
--- 218,255 ----
  Datum
  dblink_connect(PG_FUNCTION_ARGS)
  {
+     char *conname_or_str = NULL;
      char *connstr = NULL;
      char *connname = NULL;
      char *msg;
      PGconn *conn = NULL;
      remoteConn *rconn = NULL;
+     bool is_fdw = true;

      DBLINK_INIT;

      if (PG_NARGS() == 2)
      {
!         conname_or_str = text_to_cstring(PG_GETARG_TEXT_PP(1));
          connname = text_to_cstring(PG_GETARG_TEXT_PP(0));
      }
      else if (PG_NARGS() == 1)
!         conname_or_str = text_to_cstring(PG_GETARG_TEXT_PP(0));

      if (connname)
[HACKERS] effects of posix_fadvise on WAL logs
Hi all, Does anyone have any tests that showcase benefits from the posix_fadvise changes in xlog.c? I tried running some tests with dbt2 to see if any performance changes could be seen with 8.4beta2. I thought an OLTP type test with a lot of inserts and updates would be a good test. Unfortunately, I don't think I see anything interesting. I was hoping to see less page cache activity, but maybe I'm not looking correctly. Maybe there isn't enough activity to the WAL relative to the rest of the database to show anything interesting? Here are the tests I ran: Baseline on 8.4beta2, using wal_sync_method=fsync: http://207.173.203.223/~markwkm/community6/dbt2/m1500-8.4beta2/m1500.8.4beta2.2/report/ Next, set wal_sync_method=open_sync for posix_fadvise: http://207.173.203.223/~markwkm/community6/dbt2/m1500-8.4beta2/m1500.8.4beta2.osync1/report/ Now using the attached patch, with wal_sync_method=open_sync: http://207.173.203.223/~markwkm/community6/dbt2/m1500-8.4beta2/m1500.8.4beta2.osync2/report/ I created the patch because currently posix_fadvise is used right before the file handle to the WAL log is closed. I think posix_fadvise needs to be called when the file is opened. Regards, Mark Wong pgsql-xlog-posix_fadvise-20090425.patch Description: Binary data
Re: [HACKERS] effects of posix_fadvise on WAL logs
On Tue, 26 May 2009, Mark Wong wrote: Maybe there isn't enough activity to the WAL relative to the rest of the database to show anything interesting? Maybe you could reduce checkpoint_segments and focus on UPDATEs? That's how I've been able to generate the most WAL activity relative to database writes in the past, because of the full_page_writes behavior. Quoth the docs: To ensure data page consistency, the first modification of a data page after each checkpoint results in logging the entire page content. In that case, a smaller checkpoint interval increases the volume of output to the WAL log, partially negating the goal of using a smaller interval, and in any case causing more disk I/O. You've got checkpoint_segments set to 3000 in your tests and checkpoint_timeout to 1 hour, which means the tests you ran are really generating minimal WAL volume. -- * Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD
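The effect described in the quoted documentation can be illustrated with a toy model. This is a deliberately simplified sketch, not a model of the real WAL: checkpoints here fire after a fixed number of updates (an assumption; the server triggers them by checkpoint_segments or checkpoint_timeout), and the function and variable names are made up. It only counts how often "first modification of a page after a checkpoint" happens.

```python
def count_wal_records(page_updates, checkpoint_every):
    """Return (full_page_images, ordinary_records) for a stream of page updates.

    page_updates     -- iterable of page numbers, one per UPDATE
    checkpoint_every -- a checkpoint happens after this many updates (toy rule)
    """
    touched_since_checkpoint = set()
    full_page_images = 0
    ordinary_records = 0
    for i, page in enumerate(page_updates):
        if i > 0 and i % checkpoint_every == 0:
            # Checkpoint: the next change to any page is a "first modification".
            touched_since_checkpoint.clear()
        if page in touched_since_checkpoint:
            ordinary_records += 1
        else:
            full_page_images += 1      # whole page content goes to the WAL
            touched_since_checkpoint.add(page)
    return full_page_images, ordinary_records


# 1000 updates cycling over 10 pages (made-up workload).
updates = [i % 10 for i in range(1000)]
print(count_wal_records(updates, 1000))  # checkpoints effectively never happen
print(count_wal_records(updates, 50))    # frequent checkpoints
```

In this toy workload the frequent-checkpoint run logs twenty times as many full pages as the run with no checkpoints, which is the reason given above for reducing checkpoint_segments when the goal is to generate WAL traffic.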
Re: [HACKERS] commitfest management webapp
On Tue, 26 May 2009, Robert Haas wrote: I'm open to suggestions on how to improve this situation, though, because it's definitely not ideal, and precludes things that reasonable people might want to do, like contact the guy who submitted this patch, contact the authors of all patches waiting for review, and similar. Since you're taking the message-id where the patch was submitted as an input, couldn't you scrape this information out of the archives? You probably want to do a bit of that regardless; having the program pull and display the author and subject line of the archived message is a good sanity check that you entered the message ID correctly. I know that there are some of you reading this who may think that we should convert to reviewboard or patchwork or some other system. I can say that personally I'm unimpressed by those suggestions because they will almost certainly require process changes that this does not We used Reviewboard a fair amount here at Truviso for a while. Lately a good chunk of that patch review has been happening more efficiently by passing pointers to private git branches around instead. I think you're right to focus on just the review workflow and not the patch review itself, let people use whatever tools they're already comfortable with for that part. I just spent a few minutes poking around your code and was quickly able to see how things fit together, which is certainly not something I can say about Reviewboard etc. The interface looks good and the code easy enough to improve. The main concerns I'm left with after that review are with how to properly test the security of the code. Some maturity there is one major thing that more packages in larger use have going for them vs. rolling your own in this sort of situation. 
-- * Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD
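The sanity check suggested above (show the author and subject of the archived message so a mistyped message ID is obvious) could look roughly like this. It is a sketch under assumptions: it presumes the raw message text has already been fetched from the archives by some other means, which is not shown, and the function name, dictionary keys, and sample message are all invented for illustration.

```python
from email import message_from_string
from email.utils import parseaddr


def summarize_message(raw_message):
    """Extract the fields a human would use to confirm the right message was found."""
    msg = message_from_string(raw_message)
    name, address = parseaddr(msg.get("From", ""))
    return {
        "message_id": msg.get("Message-ID"),
        "author_name": name,
        "author_email": address,
        "subject": msg.get("Subject"),
    }


# Made-up message standing in for one fetched from the list archives.
SAMPLE_MESSAGE = (
    "From: Jane Hacker <jane@example.org>\n"
    "Subject: [HACKERS] example patch\n"
    "Message-ID: <abc123@example.org>\n"
    "\n"
    "Patch attached.\n"
)

summary = summarize_message(SAMPLE_MESSAGE)
print(summary["author_name"], "|", summary["subject"])
```

The same lookup yields an email address for the submitter, which touches on the contact question raised elsewhere in the thread.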
[HACKERS] survey of WAL blocksize changes
Hi all,

A long long time ago (in 2004) I ran a series of tests surveying the results of changing BLCKSZ when it was used for both the WAL logs and the rest of the database system:

http://archives.postgresql.org/pgsql-hackers/2004-03/msg01194.php

Now, more than 5 years later, and with the ability to set the WAL log and the rest of the database to different block sizes, I have a set of test results with DBT-2 showing the effects of changing the WAL log block size on OLTP transaction throughput on ext2, ranging from 1KB to 64KB:

BS (KB)  notpm  % Change from default
-------  -----  ---------------------
      1  14673  -4.8%
      2  15864   2.9%
      4  15774   2.3%
      8  15413  (default)
     16  16118   4.6%
     32  16051   4.1%
     64  14874  -3.5%

Pointers to raw data:

BS (KB)  url
-------  ---
      1  http://207.173.203.223/~markwkm/community6/dbt2/m1500-8.4beta2/m1500.8.4beta2.wal.1/
      2  http://207.173.203.223/~markwkm/community6/dbt2/m1500-8.4beta2/m1500.8.4beta2.wal.2/
      4  http://207.173.203.223/~markwkm/community6/dbt2/m1500-8.4beta2/m1500.8.4beta2.wal.4/
      8  http://207.173.203.223/~markwkm/community6/dbt2/m1500-8.4beta2/m1500.8.4beta2.2/
     16  http://207.173.203.223/~markwkm/community6/dbt2/m1500-8.4beta2/m1500.8.4beta2.wal.16/
     32  http://207.173.203.223/~markwkm/community6/dbt2/m1500-8.4beta2/m1500.8.4beta2.wal.32/
     64  http://207.173.203.223/~markwkm/community6/dbt2/m1500-8.4beta2/m1500.8.4beta2.wal.64/

It appears that for this workload a 16KB or 32KB WAL block size gives more than 4% throughput improvement, but some of that could be noise. Nothing quite jaw dropping yet. It'll be interesting to see if the combination of changing the table block size can further improve the performance. It will probably be interesting to try different filesystems and filesystem blocksizes too.

Regards, Mark Wong
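For anyone who wants to recheck the "% Change from default" column, the figures follow directly from the notpm numbers in the message. The snippet below is just that arithmetic; the variable and function names are made up, and the 8KB row is taken as the baseline as stated in the results.

```python
# notpm per WAL block size in KB, copied from the results in the message.
NOTPM_BY_WAL_BLOCK_KB = {
    1: 14673,
    2: 15864,
    4: 15774,
    8: 15413,   # default WAL block size, used as the baseline
    16: 16118,
    32: 16051,
    64: 14874,
}
DEFAULT_KB = 8


def percent_change(notpm, baseline):
    """Relative change versus the baseline, in percent, to one decimal place."""
    return round((notpm - baseline) / baseline * 100, 1)


baseline = NOTPM_BY_WAL_BLOCK_KB[DEFAULT_KB]
changes = {kb: percent_change(notpm, baseline)
           for kb, notpm in NOTPM_BY_WAL_BLOCK_KB.items()}

for kb in sorted(changes):
    print(f"{kb:>2} KB  {NOTPM_BY_WAL_BLOCK_KB[kb]:>6}  {changes[kb]:+.1f}%")
```

The recomputed values agree with the reported column, so the spread between the best and worst settings is roughly 9 to 10 percentage points, with the caveat about noise from the message still applying.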
Re: [HACKERS] [pgsql-www] commitfest management webapp
Robert Haas wrote:

> 3. The integration with the community login system is currently rather poor. The problem is that we can't count on patch submitters to have a community login, and even if they do we can't count on the person adding the patch to the system to know what it is. We could of course require patch submitters to have a community login and to add their patches themselves, but I'm not really that keen on raising the bar for submitting a patch even to that modest extent.

Actually we already raised the bar -- people are supposed to add stuff to the commitfest pages on the wiki by themselves. Surely any patch submitter will find the two necessary minutes to create an account.

I suggest you take it as a given that the submitter has an account already. For the cases in which this doesn't hold (i.e. some author just threw in a quick one-liner to fix a typo in the docs and is too lazy to follow procedure), somebody else will be responsible and life will go on.

--
Alvaro Herrera    http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] [pgsql-www] commitfest management webapp
On Tue, May 26, 2009 at 9:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:

>> 3. The integration with the community login system is currently rather poor. The problem is that we can't count on patch submitters to have a community login, and even if they do we can't count on the person adding the patch to the system to know what it is. We could of course require patch submitters to have a community login and to add their patches themselves, but I'm not really that keen on raising the bar for submitting a patch even to that modest extent. I'm open to suggestions on how to improve this situation, though, because it's definitely not ideal, and precludes things that reasonable people might want to do, like "contact the guy who submitted this patch", "contact the authors of all patches waiting for review", and similar.
>
> I don't understand why that bit would be based on community login at all. Wouldn't contacting someone mainly need an email address?

Yes. Humor me by elaborating on your thought here?

...Robert
Re: [HACKERS] [pgsql-www] commitfest management webapp
On Tue, May 26, 2009 at 10:53 PM, Alvaro Herrera alvhe...@commandprompt.com wrote:

> Robert Haas wrote:
>> 3. The integration with the community login system is currently rather poor. The problem is that we can't count on patch submitters to have a community login, and even if they do we can't count on the person adding the patch to the system to know what it is. We could of course require patch submitters to have a community login and to add their patches themselves, but I'm not really that keen on raising the bar for submitting a patch even to that modest extent.
>
> Actually we already raised the bar -- people are supposed to add stuff to the commitfest pages on the wiki by themselves. Surely any patch submitter will find the two necessary minutes to create an account.
>
> I suggest you take it as a given that the submitter has an account already. For the cases in which this doesn't hold (i.e. some author just threw in a quick one-liner to fix a typo in the docs and is too lazy to follow procedure), somebody else will be responsible and life will go on.

There's a very good chance that the person who ends up being responsible will be me - and I have enough problems without people thinking that I own 25% of the patches in the CommitFest.

...Robert
Re: [HACKERS] Common Table Expressions applied; some issues remain
2009/5/27 Tom Lane t...@sss.pgh.pa.us:

> Greg Stark st...@enterprisedb.com writes:
>> [ point 1 here remains unresolved: http://archives.postgresql.org/message-id/9623.1223158...@sss.pgh.pa.us ]
>>
>> One possibility would be to not flatten the query but find these quals and copy them onto the cte when planning the cte. So we would still materialize the result and avoid duplicate execution but only fetch the records which we know a caller will need. We could even do that for multiple callers if we join their quals with an OR -- that still might allow a bitmap index scan.
>
> I'm not too thrilled about that solution because it still eliminates predictability of execution of volatile functions. It's really just a partial form of subquery pullup, so we're paying all the disadvantages for only a subset of the advantages.
>
> I could still see doing what I mentioned in the prior message, which is to flatten CTEs as if they are plain sub-selects when
> 1. they are non-recursive,
> 2. they are referenced only once, and
> 3. they contain no volatile functions.

And 4. only if the sub-selects use index scan? Or in other cases would it be effective?

Regards,
--
Hitoshi Harada
Re: [HACKERS] Common Table Expressions applied; some issues remain
Hitoshi Harada umi.tan...@gmail.com writes:

> 2009/5/27 Tom Lane t...@sss.pgh.pa.us:
>> I could still see doing what I mentioned in the prior message, which is to flatten CTEs as if they are plain sub-selects when
>> 1. they are non-recursive,
>> 2. they are referenced only once, and
>> 3. they contain no volatile functions.
>
> And 4. only if the sub-selects use index scan? Or in other cases would it be effective?

Uh ... you've got the causality backwards, and I don't see the point of such a restriction anyway.

regards, tom lane
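The three conditions proposed in this thread for flattening a CTE can be written down as a simple predicate. This is only a sketch of the rule as stated in the message, not PostgreSQL planner code; `CteInfo` and `may_flatten` are hypothetical names invented for the illustration.

```python
from dataclasses import dataclass


@dataclass
class CteInfo:
    """Hypothetical summary of one CTE; not an actual planner structure."""
    recursive: bool
    reference_count: int
    has_volatile_functions: bool


def may_flatten(cte: CteInfo) -> bool:
    """Apply the three proposed conditions for treating a CTE as a plain sub-select."""
    return (
        not cte.recursive                    # 1. non-recursive
        and cte.reference_count == 1         # 2. referenced only once
        and not cte.has_volatile_functions   # 3. no volatile functions
    )
```

Note that the predicate looks only at properties of the query, not at the plan that results. A fourth condition on index scan usage would depend on the outcome of planning, which is one way to read the "causality backwards" remark.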
Re: [HACKERS] commitfest management webapp
On Tue, May 26, 2009 at 10:15 PM, Greg Smith gsm...@gregsmith.com wrote:

> On Tue, 26 May 2009, Robert Haas wrote:
>> I'm open to suggestions on how to improve this situation, though, because it's definitely not ideal, and precludes things that reasonable people might want to do, like "contact the guy who submitted this patch", "contact the authors of all patches waiting for review", and similar.
>
> Since you're taking the message-id where the patch was submitted as an input, couldn't you scrape this information out of the archives? You probably want to do a bit of that regardless; having the program pull and display the author and subject line of the archived message is a good sanity check that you entered the message ID correctly.

I think something like this might work. I had a suggestion previously of just checking that the message-IDs are even valid, which might be a good place to start, and then I could try to figure out how to extend it along these lines.

I'm not totally keen on pulling the subject lines. I know that's what we've mostly been doing, but sometimes the subject line is something like "patch to improve the way that foo does bar", rather than "make bar use baz algorithm" or (even worse) "patch to add support for foo" rather than "foo".

Also, I think we may want to assign each patch a shortname that can be used as an argument to command-line tools. I'd really like to be able to do something like this:

$ pgcf-patch foo

...and have foo.patch show up in $CWD. Even swankier would be to make this integrate with git somehow.

>> I know that there are some of you reading this who may think that we should convert to reviewboard or patchwork or some other system. I can say that personally I'm unimpressed by those suggestions because they will almost certainly require process changes that this does not
>
> We used Reviewboard a fair amount here at Truviso for a while. Lately a good chunk of that patch review has been happening more efficiently by passing pointers to private git branches around instead. I think you're right to focus on just the review workflow and not the patch review itself, let people use whatever tools they're already comfortable with for that part.

Thanks and well said.

> I just spent a few minutes poking around your code and was quickly able to see how things fit together, which is certainly not something I can say about Reviewboard etc. The interface looks good and the code easy enough to improve. The main concerns I'm left with after that review are with how to properly test the security of the code. Some maturity there is one major thing that more packages in larger use have going for them vs. rolling your own in this sort of situation.

Well, the nice thing about it is that it's not a ton of code, so visual inspection ought to buy you something. But I obviously can't and don't promise that it is free of bugs, security-related or otherwise. I was pretty dismayed when I realized that Template's | html filter did not think apostrophes needed to be quoted, which they obviously do if you're going to use them in a context like a href='[% foo | html %]'. Now that's the one I caught - question is - what did I miss?

...Robert
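To illustrate the apostrophe problem described above: a value placed inside a single-quoted HTML attribute must have its apostrophes escaped, or it can close the attribute early. The sketch below uses Python's standard `html.escape` rather than the Template filter discussed in the message, so it shows the general issue and not that filter's behaviour; `render_link` is a hypothetical helper.

```python
import html


def render_link(url: str, label: str) -> str:
    """Build a link whose href is delimited by single quotes.

    html.escape with quote=True escapes & < > and both quote characters,
    so an apostrophe in the value cannot terminate the attribute.
    """
    safe_url = html.escape(url, quote=True)
    safe_label = html.escape(label, quote=True)
    return f"<a href='{safe_url}'>{safe_label}</a>"


# A value that tries to break out of the attribute and add a handler.
hostile = "x' onclick='alert(1)"

# With quote=False the apostrophes pass through unchanged, which is
# the kind of gap described in the message.
unescaped = html.escape(hostile, quote=False)

escaped_link = render_link(hostile, "patch")
```

With `quote=True` every apostrophe in the value becomes `&#x27;`, so the only literal apostrophes left in the output are the two attribute delimiters.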
Re: [HACKERS] [pgsql-www] commitfest management webapp
Robert Haas robertmh...@gmail.com writes:

> On Tue, May 26, 2009 at 9:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>> I don't understand why that bit would be based on community login at all. Wouldn't contacting someone mainly need an email address?
>
> Yes. Humor me by elaborating on your thought here?

Um, what's to elaborate? I'm thinking you should track submitters and other actors by email address. A login might be appropriate to control who can modify the commitfest data, but that should be seen as a secretarial function, not something that's necessarily carried out by the patch authors or reviewers themselves.

regards, tom lane
Re: [HACKERS] generic options for explain
Magnus Hagander wrote:

> Dave Page wrote:
>> I was thinking something similar, but from the pgAdmin perspective. We already use libxml2, but JSON would introduce another dependency for us.
>
> Yeah, but probably not a huge one. There is one for wx, but I don't think it's included by default.

+1 for the machine readable explain.

FWIW, I have an early patch for phpPgAdmin for a graphical explain. IIRC when I wrote it, I told myself the parser might actually be broken with multi-level sub-queries or something. But I ended up with the same parsing code as pgAdmin anyway.

About the format, JSON would be the best here, as it takes a single function call in PHP to retrieve an associative array from the JSON text.

--
Guillaume (ioguix) de Rorthais
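The "one function call" point holds in most languages with a JSON library. As a rough sketch in Python (not PHP), the snippet below turns a JSON plan into nested dictionaries with a single `json.loads` call. The plan layout and its field names ("node", "relation", "children") are invented for this illustration, since no output format had been settled in the thread.

```python
import json

# Hypothetical machine-readable plan; the structure is an assumption.
plan_json = """
{
  "node": "Hash Join",
  "children": [
    {"node": "Seq Scan", "relation": "a", "children": []},
    {"node": "Hash", "children": [
      {"node": "Seq Scan", "relation": "b", "children": []}
    ]}
  ]
}
"""

# One call yields nested dicts and lists, with no hand-written parser.
plan = json.loads(plan_json)


def node_names(node):
    """Collect node types depth-first, e.g. to draw a graphical explain tree."""
    names = [node["node"]]
    for child in node.get("children", []):
        names.extend(node_names(child))
    return names


names = node_names(plan)
```

Nested sub-plans need no special handling here because the recursion follows the nesting of the data, which is the case the text-parsing approach was suspected of getting wrong.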
Re: [HACKERS] [pgsql-www] commitfest management webapp
Robert Haas wrote:

> I know that there are some of you reading this who may think that we should convert to reviewboard or patchwork or some other system. I can say that personally I'm unimpressed by those suggestions because they will almost certainly require process changes that this does not, process changes which I suspect we're unprepared to make. But there's nothing to prevent you from setting up and advocating your system in this space.

Well, FWIW the patchwork demo system (http://trackerdemo.postgresql.org/) is still up for people to look at. However, going that route would also require some hackery on the code, because patchwork's MIME parser is not really up to speed, so it is actually missing some patches at times...

Stefan