Re: [HACKERS] pl/perl and utf-8 in sql_ascii databases
> > Hello, I've been stuck in the mud trying to make plperl work in a Windows
> > environment. I saw many messages complaining that plperl couldn't be
> > built to work. For the convenience of those people and myself, I describe
> > below the process I used for building PostgreSQL with plperl on Windows
> > with cygwin and VC++.
>
> Hrm, I don't develop on Windows here, but out of curiosity, what were the
> messages like?

My memory of that has already grown faint. As far as I remember, I saw two patterns of crashes.

One is caused by gcc-4's stack protector in cygperl5_10.dll. It caused a crash on CREATE FUNCTION (or CREATE LANGUAGE). Building PostgreSQL with gcc-4 did not help for me. Finally, I gave up on using the pre-installed DLL and built everything, including Perl, with GCC-3 to make it work.

The other is 0xC0000005 (Access Violation) on 'create language plperl' for VC10(:-p) vs ActivePerl 5.14. This happened at ERRSV in plperl_(un)trusted_init(). Replacing ERRSV with get_sv("@", FALSE) suppressed that (but I also don't know if it really works), but finally I got an error "didn't get a CODE reference" from compiling the function on create function .. language plperl, which was the sign of a dead end for me. I settled on using ActivePerl 5.12 and VC8 at last.

I suppose this is a kind of so-called DLL HELL related to memory allocation. ActivePerl 5.12 links against the system's msvcrt.dll, but VC links its output with msvcrxx.dll. MS says memory allocation/deallocation across a DLL boundary can cause crashes. But I don't know why the pair of AP5.12 and VC8 succeeds.

http://msdn.microsoft.com/en-us/library/ms235460.aspx

badalex> The remainder of the patch which fixes the easily fixable leaks
badalex> of palloc'ed memory won't be ported into 9.1. This is only for
badalex> 9.3dev.
badalex>
badalex> What should I do for this?
badalex>
badalex> Just let the committer decide? :-)

Agreed.

-- Kyotaro Horiguchi NTT Open Source Software Center == My e-mail address has been changed since Apr. 1, 2012.
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 06.07.2012 00:54, Jan Urbański wrote:
> On 05/07/12 23:30, Peter Eisentraut wrote:
>> On tor, 2012-07-05 at 22:53 +0200, Jan Urbański wrote:
>>> The problem is that PLyUnicode_Bytes is (via an ifdef) used as
>>> PyString_ToString on Python 3, which means that there are numerous
>>> call sites and new ones might appear at any moment. I'm not that keen
>>> on invoking the traceback machinery on low-level encoding errors.
>>
>> Why not?
>
> Because it can lead to recursion errors, like the one this patch was
> supposed to fix. The traceback machinery calls into the encoding
> functions, because it converts Python strings (like function names)
> into C strings.

In the backend elog routines, there is a global variable 'recursion_depth', which is incremented when an error-handling routine is entered, and decremented afterwards. Can we use a similar mechanism in PLy_elog() to detect and stop recursion?

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 06/07/12 10:05, Heikki Linnakangas wrote:
> On 06.07.2012 00:54, Jan Urbański wrote:
>> On 05/07/12 23:30, Peter Eisentraut wrote:
>>> On tor, 2012-07-05 at 22:53 +0200, Jan Urbański wrote:
>>>> The problem is that PLyUnicode_Bytes is (via an ifdef) used as
>>>> PyString_ToString on Python 3, which means that there are numerous
>>>> call sites and new ones might appear at any moment. I'm not that
>>>> keen on invoking the traceback machinery on low-level encoding
>>>> errors.
>>>
>>> Why not?
>>
>> Because it can lead to recursion errors, like the one this patch was
>> supposed to fix. The traceback machinery calls into the encoding
>> functions, because it converts Python strings (like function names)
>> into C strings.
>
> In the backend elog routines, there is a global variable
> 'recursion_depth', which is incremented when an error-handling routine
> is entered, and decremented afterwards. Can we use a similar mechanism
> in PLy_elog() to detect and stop recursion?

I guess we can. I'll try to do some tests in order to see if there's an easy user-triggerable way of causing PLy_elog to recurse, and if not then a guard like this should be enough as a safety measure against as yet unknown conditions (as opposed to something we expect to happen regularly). Cheers, Jan
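The recursion-depth guard being discussed can be illustrated in isolation. Below is a toy model in plain Python (not the actual elog.c or PLy_elog code; all names are stand-ins): an error reporter whose formatting step may itself fail stops recursing once a fixed depth is exceeded, emitting a bare message instead.

```python
# Toy model of the backend's 'recursion_depth' guard: the depth counter
# is incremented on entry to the error reporter and decremented on
# exit; past the limit, a bare message is emitted and no further
# formatting (which could fail again) is attempted.

RECURSION_LIMIT = 2
recursion_depth = 0
log = []

def report_error(msg, formatting_failures):
    global recursion_depth
    recursion_depth += 1
    try:
        if recursion_depth > RECURSION_LIMIT:
            # Too deep: fall back to a bare message and stop recursing.
            log.append("bare: " + msg)
            return
        log.append("error: " + msg)
        # Simulate traceback formatting that itself fails repeatedly,
        # re-entering the error reporter.
        if formatting_failures:
            report_error("failure while formatting", formatting_failures - 1)
    finally:
        recursion_depth -= 1
```

Calling `report_error("boom", 5)` produces three log entries instead of an unbounded chain, and the depth counter returns to zero even on the early-return path.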
Re: [HACKERS] obsolete copyright notice
On 07/06/2012 02:13 AM, Bruce Momjian wrote:
> On Thu, Jul 05, 2012 at 10:10:08PM +0200, Antonin Houska wrote:
>> I found out by chance that the \copyright command still contains 2011.
>> Perhaps the documentation of new year's day changes needs to be
>> updated, if such a thing exists.
>> What version of Postgres is this? 9.2beta1
>
> We don't update the copyright for minor releases,

I didn't know about this rule. I just saw the copyright updated in the source files, so I thought this had been forgotten. Sorry for the noise,

Tony H.
Re: [HACKERS] Covering Indexes
Hi all,

On Thu, Jun 28, 2012 at 5:16 AM, David E. Wheeler da...@justatheory.com wrote:
> I don't see the virtue of this in this case. Since the index is not
> unique, why not just put the index on (a,b,c,d) and be done with it? Is
> there some advantage to be had in inventing a way to store c and d in
> the index without having them usable for indexing?

Why not restrict it to UNIQUE indexes? For non-unique indexes it has no advantages (it could in fact be indexed on all columns anyway as an implementation detail). For the unique case the problem of many identical entries mentioned by Tom is not relevant, so the additional data will only bloat the index but not otherwise affect index performance.

Would this get close enough to an index-covered table? _That_ would be interesting - I have a really big table (table/index size: 370G/320G, ~8*10^9 rows) which is basically using double the space because its primary key covers all columns of the table.

Cheers, Csaba.
Re: [HACKERS] Incorrect behaviour when using a GiST index on points
Yes, it's a bug and it needs to be applied!

On Tue, Jul 3, 2012 at 7:44 PM, Robert Haas robertmh...@gmail.com wrote:
> On Tue, Jul 3, 2012 at 11:34 AM, Tom Lane t...@sss.pgh.pa.us wrote:
>> Robert Haas robertmh...@gmail.com writes:
>>> On Thu, Jun 21, 2012 at 2:53 PM, Alexander Korotkov
>>> aekorot...@gmail.com wrote:
>>>> I think we definitely should apply this patch before the 9.2
>>>> release, because it is a bug fix. Otherwise people will continue to
>>>> produce incorrect GiST indexes with in-core geometrical opclasses
>>>> until 9.3. The patch is very simple and only changes a few lines of
>>>> code. Any thoughts?
>>>
>>> Do we need to apply this patch to 9.2?
>>
>> It's been like that all along, no?
>
> Yeah, but it seems an awful lot like a bug. In fact... it's hard to
> imagine how it could be any more of a bug than this.
>
> -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Oracle porting sample instr function
On Mon, Jul 2, 2012 at 8:51 PM, Greg Smith g...@2ndquadrant.com wrote:
> <snip>
> The INSTR implementation in the docs will instead search backwards
> from the end of the string if you tell it to start at 0, same as if
> you gave it a negative input. I think it's therefore possible to get
> the plpgsql version to return a value in cases where Oracle would
> instead return 0. Seems like a straightforward thing to confirm and
> change the sample to do differently; just have to add an explicit test
> for a 0 value of beg_index.

I wrote that sample eons ago with the plpgsql-porting doc. I probably overlooked the 0 behavior. Thanks for the report, Greg, and thanks Albe for providing a patch.

Roberto
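The fix being discussed, an explicit test for a 0 starting position, can be sketched compactly. This is a loose Python transcription of the semantics as described in the thread (1-based positions; 0 finds nothing; a negative position searches backwards), not the documentation's plpgsql sample itself:

```python
def instr(string, substring, beg_index=1):
    """Oracle-style INSTR sketch: 1-based result, 0 when not found."""
    if beg_index == 0:
        # The proposed fix: Oracle reportedly returns 0 for a starting
        # position of 0, whereas the sample treated 0 like a negative
        # index and searched backwards.
        return 0
    if beg_index > 0:
        pos = string.find(substring, beg_index - 1)
        return pos + 1 if pos >= 0 else 0
    # Negative: search backwards, starting |beg_index| characters from
    # the end; the match must begin at or before that position.
    start = len(string) + beg_index
    pos = string.rfind(substring, 0, start + len(substring))
    return pos + 1 if pos >= 0 else 0
```

For example, `instr('CORPORATE FLOOR', 'OR', 3)` and `instr('CORPORATE FLOOR', 'OR', -3)` both yield 5, while a starting position of 0 yields 0 rather than falling into the backwards-search branch.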
Re: [HACKERS] [PATCH] pg_dump: Sort overloaded functions in deterministic order
On Thu, Jul 5, 2012 at 10:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> You may in fact need a new field --- I'm just saying it should be in
> the object-type-specific struct, eg FuncInfo, not DumpableObject.

I suggest adding char *funcsig to FuncInfo, and moving the funcsig = format_function_arguments(finfo, funciargs) code from dumpFunc to getFuncs. Because dumpFunc is called after sortDumpableObjectsByTypeName, setting funcsig in the FuncInfo struct in dumpFunc wouldn't work, as it needs to be available when entering sortDumpableObjectsByTypeName. What do you think?
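The ordering argument can be modeled in a few lines. This is a toy Python model of the situation, not the real pg_dump code (the names mirror pg_dump's getFuncs/sort phases but the structures are simplified stand-ins): the tie-breaker key must be computed at collection time, because the sort runs before the dump phase that previously built the signature.

```python
def get_funcs(raw):
    # Collection phase (cf. getFuncs): precompute the signature for
    # each function, so it exists before any sorting happens.
    return [{"name": name, "funcsig": "%s(%s)" % (name, ", ".join(args))}
            for name, args in raw]

def sort_dumpable_objects(funcs):
    # Sort phase (cf. sortDumpableObjectsByTypeName): deterministic
    # order for overloaded functions by falling back to the
    # precomputed signature when names collide.
    return sorted(funcs, key=lambda f: (f["name"], f["funcsig"]))
```

Had the signature only been filled in during the dump phase, the comparator above would have nothing to break ties with for overloaded functions.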
Re: [HACKERS] Schema version management
Tom Lane t...@sss.pgh.pa.us writes:
> This argument seems a bit irrelevant to me. pg_dump doesn't get to
> pick and choose what will be in the database it's told to dump.

Sure.

> If we're going to do something like what Joel wants, we have to have
> file naming conventions for operator and cast objects. So we can't
> just leave them out of the conversation (or if we do, we shouldn't be
> surprised when the ensuing design sucks).

I guess what we're saying is that at this point we can pick non-user-friendly naming rules, like pg_operator/oid.sql or something like that, OID based. Impacted users might as well learn about extensions.

Regards,
-- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Schema version management
On Fri, Jul 6, 2012 at 8:23 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
> Tom Lane t...@sss.pgh.pa.us writes:
>> This argument seems a bit irrelevant to me. pg_dump doesn't get to
>> pick and choose what will be in the database it's told to dump.
>
> Sure.
>
>> If we're going to do something like what Joel wants, we have to have
>> file naming conventions for operator and cast objects. So we can't
>> just leave them out of the conversation (or if we do, we shouldn't be
>> surprised when the ensuing design sucks).
>
> I guess what we're saying is that at this point we can pick
> non-user-friendly naming rules, like pg_operator/oid.sql or something
> like that, OID based. Impacted users might as well learn about
> extensions.

I think that would defeat some of the human-readability goals that people have for this feature, not to mention that it would lose the ability to do diff -r between a dump produced on cluster A and a dump produced on cluster B.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Covering Indexes
On Fri, Jun 29, 2012 at 08:10:03AM +0200, Csaba Nagy wrote:
> Hi all,
>
> On Thu, Jun 28, 2012 at 5:16 AM, David E. Wheeler da...@justatheory.com wrote:
>> I don't see the virtue of this in this case. Since the index is not
>> unique, why not just put the index on (a,b,c,d) and be done with it?
>> Is there some advantage to be had in inventing a way to store c and d
>> in the index without having them usable for indexing?
>
> Why not restrict it to UNIQUE indexes? For non-unique indexes it has
> no advantages (it could in fact be indexed on all columns anyway as an
> implementation detail). For the unique case the problem of many
> identical entries mentioned by Tom is not relevant, so the additional
> data will only bloat the index but not otherwise affect index
> performance.
>
> Would this get close enough to an index-covered table? _That_ would be
> interesting - I have a really big table (table/index size: 370G/320G,
> ~8*10^9 rows) which is basically using double the space because its
> primary key covers all columns of the table.

I was wondering if there was some way to specify an expression index that did a unique index check on some columns but included more columns not part of the unique index.

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] Schema version management
Hello,

Is it imaginable to additionally generate an index file that maps the half-friendly file names to a cleartext object signature? This would allow users to postprocess the output, merging overloaded functions into single files or renaming the files according to their needs and preferences...

best regards,

Marc Mamin

-----Original Message-----
From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas
Sent: Friday, 6 July 2012 15:02
To: Dimitri Fontaine
Cc: Tom Lane; Christopher Browne; Pg Hackers
Subject: Re: [HACKERS] Schema version management

On Fri, Jul 6, 2012 at 8:23 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
> Tom Lane t...@sss.pgh.pa.us writes:
>> This argument seems a bit irrelevant to me. pg_dump doesn't get to
>> pick and choose what will be in the database it's told to dump.
>
> Sure.
>
>> If we're going to do something like what Joel wants, we have to have
>> file naming conventions for operator and cast objects. So we can't
>> just leave them out of the conversation (or if we do, we shouldn't be
>> surprised when the ensuing design sucks).
>
> I guess what we're saying is that at this point we can pick
> non-user-friendly naming rules, like pg_operator/oid.sql or something
> like that, OID based. Impacted users might as well learn about
> extensions.

I think that would defeat some of the human-readability goals that people have for this feature, not to mention that it would lose the ability to do diff -r between a dump produced on cluster A and a dump produced on cluster B.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Covering Indexes
On Friday, 6 July 2012 at 15:41:01, Bruce Momjian wrote:
> On Fri, Jun 29, 2012 at 08:10:03AM +0200, Csaba Nagy wrote:
>> Hi all,
>>
>> On Thu, Jun 28, 2012 at 5:16 AM, David E. Wheeler da...@justatheory.com wrote:
>>> I don't see the virtue of this in this case. Since the index is not
>>> unique, why not just put the index on (a,b,c,d) and be done with it?
>>> Is there some advantage to be had in inventing a way to store c and
>>> d in the index without having them usable for indexing?
>>
>> Why not restrict it to UNIQUE indexes? For non-unique indexes it has
>> no advantages (it could in fact be indexed on all columns anyway as
>> an implementation detail). For the unique case the problem of many
>> identical entries mentioned by Tom is not relevant, so the additional
>> data will only bloat the index but not otherwise affect index
>> performance.
>>
>> Would this get close enough to an index-covered table? _That_ would
>> be interesting - I have a really big table (table/index size:
>> 370G/320G, ~8*10^9 rows) which is basically using double the space
>> because its primary key covers all columns of the table.
>
> I was wondering if there was some way to specify an expression index
> that did a unique index check on some columns but included more
> columns not part of the unique index.

I haven't tried it, but I suppose that an Exclusion Constraint should allow that.

-- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation
Re: [HACKERS] CREATE FOREGIN TABLE LACUNA
On 24 June 2012 04:01, Alvaro Herrera alvhe...@commandprompt.com wrote:
> Excerpts from Dean Rasheed's message of sáb jun 23 18:08:31 -0400 2012:
>> I spotted a couple of other issues during testing:
>
> David, when you generate a new version of the patch, please also make
> sure to use RELKIND_RELATION and RELKIND_FOREIGN_TABLE instead of 'r'
> and 'f'.
>
>> * You're still allowing INCLUDING DEFAULTS and INCLUDING STORAGE,
>>   even though these options are not supported on foreign tables.
>
> Maybe the code should list the options allowed instead of the ones
> disallowed.
>
>> * If I do INCLUDING ALL, I get an error because of the unsupported
>>   options. I think that ALL in this context needs to be made to mean
>>   all the options that foreign tables support (just COMMENTS at the
>>   moment).

I agree. David, do you have an updated version of this patch?

Regards, Dean
Re: [HACKERS] Covering Indexes
Csaba Nagy ncsli...@googlemail.com writes:
> Why not restrict it to UNIQUE indexes?

What benefit would such a restriction provide? AFAICS it doesn't make implementation any easier.

regards, tom lane
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 06/07/12 10:14, Jan Urbański wrote:
> On 06/07/12 10:05, Heikki Linnakangas wrote:
>> In the backend elog routines, there is a global variable
>> 'recursion_depth', which is incremented when an error-handling
>> routine is entered, and decremented afterwards. Can we use a similar
>> mechanism in PLy_elog() to detect and stop recursion?
>
> I guess we can. I'll try to do some tests in order to see if there's
> an easy user-triggerable way of causing PLy_elog to recurse, and if
> not then a guard like this should be enough as a safety measure
> against as yet unknown conditions (as opposed to something we expect
> to happen regularly).

Attached is a patch that stores the recursion level of PLy_traceback and prevents it from running if it's too deep (PLy_traceback is the one doing the heavy lifting; that's why I chose to put the logic to skip running there).

I tried a few things and was not able to easily invoke the infinite recursion condition, but I did notice that there are two more encodings that have different names in Postgres and in Python (KOI8-R and KOI8-U) and added them to the switch.

There's still trouble with EUC_TW and MULE_INTERNAL, which don't have Python equivalents. EUC-TW has been discussed in http://bugs.python.org/issue2066 and rejected (see http://bugs.python.org/issue2066#msg113731). If you use any of these encodings, you *will* get into the recursion trouble described earlier, just as before the patch you'd get into it with CP1252 as your encoding. What shall we do about those? Ignore them? Document that if you're using one of these encodings then PL/Python with Python 2 will be crippled and with Python 3 just won't work?
Cheers, Jan

diff --git a/src/pl/plpython/plpy_elog.c b/src/pl/plpython/plpy_elog.c
index c375ac0..c2b3cb8 100644
*** a/src/pl/plpython/plpy_elog.c
--- b/src/pl/plpython/plpy_elog.c
*** static char *get_source_line(const char
*** 28,33 ****
--- 28,41 ----
  
  /*
+  * Guard against re-entrant calls to PLy_traceback, which can happen if
+  * traceback formatting functions raise Python errors.
+  */
+ #define TRACEBACK_RECURSION_LIMIT 2
+ static int recursion_depth = 0;
+ 
+ 
+ /*
   * Emit a PG error or notice, together with any available info about
   * the current Python error, previously set by PLy_exception_set().
   * This should be used to propagate Python errors into PG. If fmt is
*** PLy_traceback(char **xmsg, char **tbmsg,
*** 147,166 ****
  	StringInfoData xstr;
  	StringInfoData tbstr;
  
  	/*
  	 * get the current exception
  	 */
  	PyErr_Fetch(&e, &v, &tb);
  
  	/*
! 	 * oops, no exception, return
  	 */
! 	if (e == NULL)
  	{
  		*xmsg = NULL;
  		*tbmsg = NULL;
  		*tb_depth = 0;
  		return;
  	}
--- 155,177 ----
  	StringInfoData xstr;
  	StringInfoData tbstr;
  
+ 	recursion_depth++;
+ 
  	/*
  	 * get the current exception
  	 */
  	PyErr_Fetch(&e, &v, &tb);
  
  	/*
! 	 * oops, no exception or recursion depth exceeded, return
  	 */
! 	if (e == NULL || recursion_depth > TRACEBACK_RECURSION_LIMIT)
  	{
  		*xmsg = NULL;
  		*tbmsg = NULL;
  		*tb_depth = 0;
+ 		recursion_depth--;
  		return;
  	}
*** PLy_traceback(char **xmsg, char **tbmsg,
*** 326,331 ****
--- 337,344 ----
  		(*tb_depth)++;
  	}
  
+ 	recursion_depth--;
+ 
  	/* Return the traceback. */
  	*tbmsg = tbstr.data;
diff --git a/src/pl/plpython/plpy_util.c b/src/pl/plpython/plpy_util.c
index bf29532..ea4ecdf 100644
*** a/src/pl/plpython/plpy_util.c
--- b/src/pl/plpython/plpy_util.c
*** PLyUnicode_Bytes(PyObject *unicode)
*** 112,117 ****
--- 112,123 ----
  		case PG_WIN874:
  			serverenc = "cp874";
  			break;
+ 		case PG_KOI8R:
+ 			serverenc = "koi8-r";
+ 			break;
+ 		case PG_KOI8U:
+ 			serverenc = "koi8-u";
+ 			break;
  		default:
  			/* Other encodings have the same name in Python. */
  			serverenc = GetDatabaseEncodingName();
*** PLyUnicode_Bytes(PyObject *unicode)
*** 120,135 ****
  	rv = PyUnicode_AsEncodedString(unicode, serverenc, "strict");
  	if (rv == NULL)
! 	{
! 		/*
! 		 * Use a plain ereport instead of PLy_elog to avoid recursion, if
! 		 * the traceback formatting functions try to do unicode to bytes
! 		 * conversion again.
! 		 */
! 		ereport(ERROR,
! 				(errcode(ERRCODE_INTERNAL_ERROR),
! 				 errmsg("could not convert Python Unicode object to PostgreSQL server encoding")));
! 	}
  	return rv;
  }
--- 126,132 ----
  	rv = PyUnicode_AsEncodedString(unicode, serverenc, "strict");
  	if (rv == NULL)
! 		PLy_elog(ERROR, "could not convert Python Unicode object to PostgreSQL server encoding");
  	return rv;
  }
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 06.07.2012 18:01, Jan Urbański wrote:
> There's still trouble with EUC_TW and MULE_INTERNAL, which don't have
> Python equivalents. EUC-TW has been discussed in
> http://bugs.python.org/issue2066 and rejected (see
> http://bugs.python.org/issue2066#msg113731). If you use any of these
> encodings, you *will* get into the recursion trouble described
> earlier, just as before the patch you'd get into it with CP1252 as
> your encoding. What shall we do about those? Ignore them? Document
> that if you're using one of these encodings then PL/Python with
> Python 2 will be crippled and with Python 3 just won't work?

We could convert to UTF-8, and use the PostgreSQL functions to convert from UTF-8 to the server encoding. Double conversion might be slow, but I think it would be better than failing.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
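The suggested fallback can be sketched in a few lines. This is a toy model in Python rather than the actual PL/Python C code: try the direct Python codec first; only when Python has no codec for the server encoding (as with EUC_TW or MULE_INTERNAL) encode to UTF-8 and hand the bytes to a server-side converter, represented here by a caller-supplied callback standing in for PostgreSQL's own UTF-8-to-server-encoding routines.

```python
def to_server_encoding(s, server_encoding, server_convert):
    """Encode s for the server, pivoting through UTF-8 when needed.

    server_convert is a stand-in for the server's UTF-8 -> server
    encoding conversion; it receives UTF-8 bytes.
    """
    try:
        # Fast path: Python knows a codec with this name.
        return s.encode(server_encoding)
    except LookupError:
        # No such Python codec: double conversion via UTF-8.
        return server_convert(s.encode("utf-8"))
```

The double conversion only runs for encodings Python cannot handle directly, so the common case pays no extra cost.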
Re: [HACKERS] obsolete copyright notice
On Fri, Jul 06, 2012 at 11:06:28AM +0200, Antonin Houska wrote:
> On 07/06/2012 02:13 AM, Bruce Momjian wrote:
>> On Thu, Jul 05, 2012 at 10:10:08PM +0200, Antonin Houska wrote:
>>> I found out by chance that the \copyright command still contains
>>> 2011. Perhaps the documentation of new year's day changes needs to
>>> be updated, if such a thing exists.
>>> What version of Postgres is this? 9.2beta1
>>
>> We don't update the copyright for minor releases,
>
> I didn't know about this rule. I just saw the copyright updated in the
> source files, so I thought this had been forgotten. Sorry for the
> noise,

That is not noise. :-O You are 100% correct that 9.2beta1 should _not_ show 2011! The 9.2 major release comes out in 2012 and should show 2012 everywhere.

We created copyright.pl for 9.2, and it seems it is lacking in some areas. I just made several commits to fix problems I found:

* Fix perltidy problem caused by a missing regex slash --- this was the only file to generate an error file from perltidy, so I am confident about the other files. I reverted the perltidy run and re-ran it to make sure things were good.

* Fix code that was supposed to skip the .git directory, but wasn't (added a basename() call).

* Modified the code so all lines are processed, not just the first match --- this fixes the help.c file that caused the original report above. I checked the impact of not stopping after the first match and could not find any problems.

The attached patch, applied to head and 9.2, addresses all these issues. Thanks for the report.

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true.
+

diff --git a/src/tools/copyright.pl b/src/tools/copyright.pl
new file mode 100755
index a55228d..52a7cb1
*** a/src/tools/copyright.pl
--- b/src/tools/copyright.pl
*** use strict;
*** 11,16 ****
--- 11,17 ----
  use warnings;
  
  use File::Find;
+ use File::Basename;
  use Tie::File;
  
  my $pgdg = 'PostgreSQL Global Development Group';
*** find({ wanted => \&wanted, no_chdir => 1
*** 25,39 ****
  sub wanted
  {
- 
  	# prevent corruption of git indexes by ignoring any .git/
! 	if ($_ eq '.git')
  	{
  		$File::Find::prune = 1;
  		return;
  	}
  
! 	return if !-f $File::Find::name || -l $File::Find::name;
  
  	# skip file names with binary extensions
  	# How are these updated?  bjm 2012-01-02
--- 26,39 ----
  sub wanted
  {
  	# prevent corruption of git indexes by ignoring any .git/
! 	if (basename($_) eq '.git')
  	{
  		$File::Find::prune = 1;
  		return;
  	}
  
! 	return if !-f $File::Find::name || -l $File::Find::name;
  
  	# skip file names with binary extensions
  	# How are these updated?  bjm 2012-01-02
*** sub wanted
*** 46,59 ****
  	{
  		# We only care about lines with a copyright notice.
! 		next unless $line =~ m/$cc . *$pgdg /;
  
! 		# We stop when we've done one substitution. This is both for
! 		# efficiency and, at least in the case of this program, for
! 		# correctness.
! 		last if $line =~ m/$cc.*$year.*$pgdg/;
! 		last if $line =~ s/($cc\d{4})(, $pgdg)/$1-$year$2/;
! 		last if $line =~ s/($cc\d{4})-\d{4}(, $pgdg)/$1-$year$2/;
  	}
  	untie @lines;
  }
--- 46,61 ----
  	{
  		# We only care about lines with a copyright notice.
! 		next unless $line =~ m/$cc.*$pgdg/;
  
! 		# Skip line if already matches the current year; if not
! 		# we get $year-$year, e.g. 2012-2012
! 		next if $line =~ m/$cc$year, $pgdg/;
! 
! 		# We process all lines because some files have copyright
! 		# strings embedded in them, e.g. src/bin/psql/help.c
! 		$line =~ s/($cc\d{4})(, $pgdg)/$1-$year$2/;
! 		$line =~ s/($cc\d{4})-\d{4}(, $pgdg)/$1-$year$2/;
  	}
  	untie @lines;
  }
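The substitution logic in the fixed copyright.pl can be restated compactly. This is a rough Python transcription of that logic, not the script itself (the `CC` prefix string is an assumption based on the patch's `$cc` variable): process every matching line rather than stopping at the first, extend a bare "YYYY" to "YYYY-year", widen an existing "YYYY-YYYY" range, and skip lines already at the current year to avoid producing "2012-2012".

```python
import re

# Assumed to match copyright.pl's $cc and $pgdg variables.
CC = r'Copyright \(c\) '
PGDG = 'PostgreSQL Global Development Group'

def update_copyright(line, year):
    # Only touch lines carrying a copyright notice.
    if not re.search(CC + r'.*' + PGDG, line):
        return line
    # Already at the current year: leave alone, otherwise we would
    # produce "2012-2012".
    if re.search(CC + str(year) + ', ' + PGDG, line):
        return line
    # "Copyright (c) 1996, PGDG" -> "Copyright (c) 1996-2012, PGDG"
    line = re.sub(r'(' + CC + r'\d{4})(, ' + PGDG + r')',
                  r'\g<1>-%d\g<2>' % year, line)
    # "Copyright (c) 1996-2011, PGDG" -> "Copyright (c) 1996-2012, PGDG"
    line = re.sub(r'(' + CC + r'\d{4})-\d{4}(, ' + PGDG + r')',
                  r'\g<1>-%d\g<2>' % year, line)
    return line
```

Running both substitutions unconditionally is safe: the first leaves range-form notices alone, and the second rewrites a fresh "1996-2012" to itself.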
Re: [HACKERS] Bug tracker tool we need
On Wed, Apr 18, 2012 at 10:29:26AM -0400, Robert Haas wrote:
>>>>> IME, bug trackers typically work best when used by a tightly
>>>>> integrated team.
>>>>
>>>> Well, very many loosely distributed open-source projects use bug
>>>> trackers quite successfully.
>>>
>>> So I think Greg has exactly the right idea: we shouldn't try to
>>> incorporate one of these systems that aims to manage workflow;
>>
>> Um, isn't half of the commitfest app about workflow? Patch is waiting
>> for review, who is the reviewer; patch is waiting for author, who is
>> the author; patch is ready for committer, who is the committer? And
>> every week or so the commitfest manager (if any) produces a report on
>> patch progress. Isn't that exactly what these workflow management
>> systems provide?
>
> Yeah, but I thought we'd veered off into a digression about tracking
> bug reports. Here's our workflow for bugs:
>
> 1. If they seem interesting, Tom fixes them.
> 2. Or occasionally someone else fixes them.
> 3. Otherwise, they drift forever in the bleakness of space.
>
> I've been conducting the experiment for a year or two now of leaving
> unresolved bug reports unread in my mailbox. I've got over 100 such
> emails now... and some of them may not really be bugs, but nobody's
> put in the work to figure that out. It would be useful to have a

I saved this email from April and have given it some thought. I decided to approach it by looking at our current workflow, then deciding what the problems were, rather than approaching it with "we need a bug tracker." I started by drawing a diagram of our current development process:

http://momjian.us/main/writings/pgsql/work_flow.pdf

I did this so I could see the weaknesses. First, the left and right sides are what our users see, and the middle is controlled by developers. Looking at the chart, the three sections (left, middle, and right) seem to work fine in isolation. Our interaction with bug reporters is very good, our development flow seems fine, and people are certainly happy with the quality of our releases.

Yes, there are problems, like the ability of patches to get lost, but in general, things are good. I think our big gap is in integrating these sections. There is no easy way for a bug reporter to find out what happens to his report unless the patch is applied in the same email thread as the report. It is hard for users to see _all_ the changes made in a release because the release notes are filtered.

Our current system is designed to have very limited friction of action, and this gives us a high-quality user experience and release quality, but it does have limits in how well we deal with complex cases.

OK, now for the question about a bug tracker. A bug tracker would provide a trackable contact for everyone reporting a bug, and allow them to see exactly which release fixes the bug (in an ideal world). It also allows for more detailed reporting of what is in each release.

For me, the big problem with a bug tracker is that it adds so much friction to the development process, meaning fewer people are involved and less work gets done. If you have company-sponsored development, you can just hire more people to overcome that friction, but for volunteer development, I am not sure a bug tracker really works well, and given the chaotic content in almost every bug tracker database, I think that is true.

So, my question is, what can we do to better integrate these sections? Assign a bug number on email that gets stamped on the commit and release note item? Add email notification of commits somehow? Should we publish the entire git log for each release?

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] Event Triggers reduced, v1
Robert Haas robertmh...@gmail.com writes: Here is an incremental documentation patch which I hope you will like. Definitely, it's better this way. I'm not thrilled with separating it into its own top level chapter, but I can see how it makes sense indeed. This part is strange though: + A trigger definition can also specify a literalWHEN/literal + condition so that, for example, a literalcommand_start/literal + tag can be fired only for particular commands which the user wishes + to intercept. A common use of such triggers is to restrict the range of + DDL operations which users may perform. I don't think of that as firing a command tag, so it's hard for me to parse that sentence. the matrix somewhat. I think as we add more firing points it will be clearer and easier to read if we have all the commands arranged in columns rather than listing a bunch of firing points on each line. I +1 also made a bunch of minor edits to improve readability and improve the English (which wasn't bad, but I touched it up a bit); and I tried to add some extra detail here and there (some of it recycled from previous patch versions). Assuming this all seems reasonably agreeable, can you merge it on your side? Done, thanks ! This took the last several hours, so I haven't looked at your latest code changes yet. However, in the course of editing the documentation, it occurred to me that we seem to be fairly arbitrarily excluding a large number of commands from the event trigger mechanism. As many as that? I'm surprised about the quantity. Yes I did not add all and any command we have, on purpose, and I agree that the new turn of things allow us to add a new set. For example, GRANT and REVOKE. In earlier patches, we needed specific changes for every command, so there was some reason not to try to support everything right out of the gate. 
But ISTM that the argument for this is much less now; presumably it's just a few extra lines of code per command, so maybe we ought to go ahead and try to make this as complete as possible. Will do soon™. I attempt to explain in the attached patch the reasons why we don't support certain classes of commands, but I can't come up with any explanation for not supporting GRANT and REVOKE that doesn't fall flat. I can't even really see a reason not to support things like LISTEN and NOTIFY, and it would certainly be more consistent with the notion of a command_start trigger to support as many commands as we can. I would think that NOTIFY is on a fast track not to be disturbed by calling into user-defined code, and that would explain why we don't support event triggers here. I had an interesting experience while testing this patch. I accidentally redefined my event trigger function to something which errored out. That of course precluded me from using CREATE OR REPLACE FUNCTION to fix it. This makes me feel rather glad that we decided to exclude CREATE/ALTER/DROP EVENT TRIGGER from the event trigger mechanism, else recovery would have had to involve system catalog hackery. Yeah, we have some places where it's not very hard to shoot oneself in the foot; here the resulting hole is a little too big and offers no real benefits. Event triggers on create|alter|drop event triggers, really? Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Event Triggers reduced, v1
On Fri, Jul 6, 2012 at 12:00 PM, Robert Haas robertmh...@gmail.com wrote: On Fri, Jul 6, 2012 at 7:21 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Robert Haas robertmh...@gmail.com writes: Attached is an incremental patch with a bunch of minor cleanups, including reverts of a few spurious white space changes. Could you merge this into your version? Thank you very much for that, yes it's included now. So you have 3 attachments here, the whole new patch revision (v1.7), the incremental patch to go from 1.6 to 1.7 and the incremental patch that should apply cleanly on top of your cleanups. Here is an incremental documentation patch which I hope you will like. And here is another incremental patch, this one doing some more cleanup. Some of this is cosmetic, but it also: - Fixes the new event_trigger type so that it passes the type sanity test, instead of adding the failure as expected output. - Fixes DROP EVENT TRIGGER IF EXISTS on a non-existent trigger. - Fleshes out the ownership handling so that it's more similar to what we do for other types of objects. I'm feeling pretty good about this at this point, although I think there is still some more work to do before we call it done and go home. I have a large remaining maintainability concern about the way we're mapping back and forth between node tags, event tags, and command tags. Right now we've got parse_event_tag, which parses something like 'ALTER AGGREGATE' into E_AlterAggregate; and then we've got command_to_string, which turns E_AlterAggregate back into 'ALTER AGGREGATE', and then we've got InitEventContext(), which turns T_RenameStmt or T_AlterObjectSchemaStmt with OBJECT_AGGREGATE into E_AlterAggregate. I can't easily verify that all three of these things are consistent with each other, and even if they are right now I estimate the chances of that remaining true as other people patch the code as near-zero.
You didn't like my last proposal for dealing with this, which is fine: it might not have been the best way of dealing with it. But I think we have to figure out something better than what we've got now, or this is almost guaranteed to get broken. If you don't have a brilliant idea I'll hack on it and see what I can come up with. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company event-trigger-morecleanup.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Event Triggers reduced, v1
Robert Haas robertmh...@gmail.com writes: And here is another incremental patch, this one doing some more cleanup. Some of this is cosmetic, but it also: Thanks, applied in my github repository! I'm feeling pretty good about this at this point, although I think there is still some more work to do before we call it done and go home. Nice reading that :) I have a large remaining maintainability concern about the way we're mapping back and forth between node tags, event tags, and command tags. Right now we've got parse_event_tag, which parses something […valid concern…] If you don't have a brilliant idea I'll hack on it and see what I can come up with. I think we might be able to install a static array for the setup where we would find the different elements, and then code up some procedures doing different kinds of lookups in that array. like 'ALTER AGGREGATE' into E_AlterAggregate; and then we've got command_to_string, which turns E_AlterAggregate back into 'ALTER AGGREGATE', and then we've got InitEventContext(), which turns T_RenameStmt or T_AlterObjectSchemaStmt with OBJECT_AGGREGATE into E_AlterAggregate. I can't easily verify that all three of these

  {
      E_AlterAggregate,         // TrigEventCommand
      "ALTER AGGREGATE",        // command tag
      T_RenameStmt,             // nodeTag
      -1                        // object type
  },
  {
      E_AlterAggregate,
      "ALTER AGGREGATE",
      T_AlterObjectSchemaStmt,
      OBJECT_AGGREGATE
  }

The problem is coming up with a way of writing the code that does not incur a full array scan for each step of parsing or rewriting. And I don't see that it merits yet another cache. Given the existing event trigger cache it might be that we don't care about having a full scan of this table twice per event-trigger-related command, as I don't think it would happen when executing other DDLs. Scratch that: we need to parse command tags when we build the event cache, so scanning the full array each time would make that O(n²) and we want to avoid that.
So we could install the contents of the array in another hash table in BuildEventTriggerCache(), then use that to look up the TrigEventCommand from the command tag… Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] patch: inline code with params
Hello, I updated my two-year-old patch http://archives.postgresql.org/pgsql-hackers/2010-07/txtIakTCBA15Z.txt Syntax is based on Florian Pflug's proposal http://archives.postgresql.org/pgsql-hackers/2010-07/msg00110.php

  postgres=# do (a int, b int, text) $$
  begin
    raise notice '% % %', $1, $2, $3;
  end;
  $$ language plpgsql using 10+100, 20, :'USER';
  NOTICE:  110 20 pavel
  DO

This patch is not final - missing documentation, regression tests, and it doesn't support subselects as expressions. inline_code_with_params.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Event Triggers reduced, v1
On Fri, Jul 6, 2012 at 3:29 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Here is an incremental documentation patch which I hope you will like. Definitely, it's better this way. I'm not thrilled with separating it into its own top level chapter, but I can see how it makes sense indeed. Oh, really? I thought that was a huge readability improvement. There are some things that are the same between triggers and event triggers, but there's an awful lotta stuff that is completely different. This part is strange though: + A trigger definition can also specify a <literal>WHEN</literal> + condition so that, for example, a <literal>command_start</literal> + tag can be fired only for particular commands which the user wishes + to intercept. A common use of such triggers is to restrict the range of + DDL operations which users may perform. I don't think of that as firing a command tag, so it's hard for me to parse that sentence. Oh, that should say a command_start trigger rather than a command_start tag. Good catch. This took the last several hours, so I haven't looked at your latest code changes yet. However, in the course of editing the documentation, it occurred to me that we seem to be fairly arbitrarily excluding a large number of commands from the event trigger mechanism. As many as that? I'm surprised about the quantity. Yes, I did not add any and all commands we have, on purpose, and I agree that the new turn of things allows us to add a new set. I admit I didn't count them up. :-) Maybe there aren't that many. I think we might want to extend the support matrix to include every command that appears in sql-commands.html and have either an X if it's supported or blank if it's not. That would make it easier to judge how many commands are not supported, not just for us but for users who may be trying to answer the same questions we are.
I would think that NOTIFY is on a fast track not to be disturbed by calling into user-defined code, and that would explain why we don't support event triggers here. If the DBA is allowed to restrict CREATE FUNCTION, why not NOTIFY? I guess I don't see why that one's deserving of special treatment. Mind you, if I ran the world, this would probably be broken up differently: I'd have ddl_command_start covering all the CREATE/ALTER/DROP commands and nothing else; and separate firing points for anything else I wanted to support. It's not too late to make that change, hint, hint. But if we're not gonna do that then I think that we'd better try to cast the net as broadly as reasonably possible. It seems to me that our excuse for not including things like UPDATE and DELETE is a bit thin; surely there are people who would like a sort of universal trigger that applies to every relation in the system. Of course there are recursion problems there that need to be thought about long and hard, and no I don't really want to go there right now, but I'll bet you a nickel that someone is going to ask why it doesn't work that way. Another advantage to recasting this as ddl_command_start is that we could quite easily pass the operation (CREATE, ALTER, DROP) and the named object type (TABLE, FUNCTION, CAST) as separate arguments. I think that would be a usability improvement, since it would then be dead easy to write an event trigger that prohibits DROP (and only DROP) of any sort. Of course it's not that hard to do it right now, but you have to parse the command tag. It would likely simplify the code for mapping between node and command tags, too. One other thought: if we're NOT going to do what I suggested above, then how about renaming TG_WHEN to TG_EVENT? Seems like that would fit better. Also: now that the E_WhatEver constants don't get stored on disk, I don't think they should live in pg_event_trigger.h any more; can we move them to someplace more appropriate?
Possibly make them private to event_trigger.c? And, on a related note, I don't think it's a good idea to use E_ as a prefix for both the event types and the command tags. It's too short, and hard to grep for, and we don't want the same one for both, I think. How about things like EVT_CommandStart for the events and ECT_CreateAggregate for the command tags? I had an interesting experience while testing this patch. I accidentally redefined my event trigger function to something which errored out. That of course precluded me from using CREATE OR REPLACE FUNCTION to fix it. This makes me feel rather glad that we decided to exclude CREATE/ALTER/DROP EVENT TRIGGER from the event trigger mechanism, else recovery would have had to involve system catalog hackery. Yeah, we have some places where it's not very hard to shoot oneself in the foot; here the resulting hole is a little too big and offers no real benefits. Event triggers on create|alter|drop event triggers, really? Indeed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Event Triggers reduced, v1
On Fri, Jul 6, 2012 at 4:00 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Robert Haas robertmh...@gmail.com writes: And here is another incremental patch, this one doing some more cleanup. Some of this is cosmetic, but it also: Thanks, applied in my github repository! Thanks. I have a large remaining maintainability concern about the way we're mapping back and forth between node tags, event tags, and command tags. Right now we've got parse_event_tag, which parses something […valid concern…] If you don't have a brilliant idea I'll hack on it and see what I can come up with. I think we might be able to install a static array for the setup where we would find the different elements, and then code up some procedures doing different kinds of lookups in that array. +1. like 'ALTER AGGREGATE' into E_AlterAggregate; and then we've got command_to_string, which turns E_AlterAggregate back into 'ALTER AGGREGATE', and then we've got InitEventContext(), which turns T_RenameStmt or T_AlterObjectSchemaStmt with OBJECT_AGGREGATE into E_AlterAggregate. I can't easily verify that all three of these

  {
      E_AlterAggregate,         // TrigEventCommand
      "ALTER AGGREGATE",        // command tag
      T_RenameStmt,             // nodeTag
      -1                        // object type
  },
  {
      E_AlterAggregate,
      "ALTER AGGREGATE",
      T_AlterObjectSchemaStmt,
      OBJECT_AGGREGATE
  }

The problem is coming up with a way of writing the code that does not incur a full array scan for each step of parsing or rewriting. And I don't see that it merits yet another cache. Given the existing event trigger cache it might be that we don't care about having a full scan of this table twice per event-trigger-related command, as I don't think it would happen when executing other DDLs. Scratch that: we need to parse command tags when we build the event cache, so scanning the full array each time would make that O(n²) and we want to avoid that.
So we could install the contents of the array in another hash table in BuildEventTriggerCache() then use that to lookup the TrigEventCommand from the command tag… Ugh. Yeah, obviously the most important thing I think is that InitEventContext() needs to be lightning-fast, but we don't want BuildEventTriggerCache() to be pathologically slow either. I think the best thing to do with InitEventContext() might be to get rid of it. It's just a big switch over node tags, and we've already got one of those in standard_ProcessUtility. Maybe every case that already exists in that function should either (a) get a comment of the form /* does not support event triggers */ or (b) get a call of the form EventTriggerStartup(evt, parsetree, E_WhateverCommandThisIs). EventTriggerStartup() could call InitEventContext() and then if CommandFiresTriggersForEvent(..., E_CommandStart) it could also call ExecEventTriggers(). This might seem like it's just moving the wood around, but if someone adds a new case in standard_ProcessUtility, they're going to model it on one of the existing cases, which greatly decreases the likelihood that they're going to screw it up. And if they do screw it up it will be obviously non-parallel to the rest of what's there, so somebody can notice and fix it. As a side benefit, this would probably be faster than having two separate switches that are executed more or less consecutively. Now that leaves the question of how to translate between E_AlterAggregate and ALTER AGGREGATE; I think your idea of a hash table (or two?) might be the most practical option. We'd only need to build the hash table(s) if an index-scan of pg_event_trigger finds it non-empty, and then only once per session. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
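The static-array idea the thread is converging on can be sketched in plain C. Everything below is illustrative rather than actual PostgreSQL code: the enum values, the stand-in node tags, and the placeholder object-type constant are assumptions, and the linear scans merely stand in for the eventual hash-table lookups discussed above:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative subset of the tags discussed in the thread. */
typedef enum { E_Invalid = 0, E_AlterAggregate, E_CreateAggregate } TrigEventCommand;
typedef enum { T_Invalid = 0, T_RenameStmt, T_AlterObjectSchemaStmt, T_DefineStmt } NodeTag;

typedef struct
{
    TrigEventCommand command;   /* event-trigger command id */
    const char *tag;            /* command tag, e.g. "ALTER AGGREGATE" */
    NodeTag     node;           /* parse-tree node tag */
    int         objtype;        /* object type, or -1 if not applicable */
} EventCommandEntry;

static const EventCommandEntry event_command_map[] = {
    { E_AlterAggregate, "ALTER AGGREGATE", T_RenameStmt, -1 },
    { E_AlterAggregate, "ALTER AGGREGATE", T_AlterObjectSchemaStmt, 42 /* stand-in for OBJECT_AGGREGATE */ },
    { E_CreateAggregate, "CREATE AGGREGATE", T_DefineStmt, 42 },
};

/* parse_event_tag: command tag string -> TrigEventCommand (linear scan). */
TrigEventCommand parse_event_tag(const char *tag)
{
    size_t i;

    for (i = 0; i < sizeof(event_command_map) / sizeof(event_command_map[0]); i++)
        if (strcmp(event_command_map[i].tag, tag) == 0)
            return event_command_map[i].command;
    return E_Invalid;
}

/* command_to_string: TrigEventCommand -> command tag string. */
const char *command_to_string(TrigEventCommand cmd)
{
    size_t i;

    for (i = 0; i < sizeof(event_command_map) / sizeof(event_command_map[0]); i++)
        if (event_command_map[i].command == cmd)
            return event_command_map[i].tag;
    return NULL;
}
```

Because every direction of the mapping reads the same table, the three functions cannot drift apart the way three hand-maintained switches can; replacing the scans with hash lookups built once per session changes only the lookup cost, not this property.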
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On fre, 2012-07-06 at 18:53 +0300, Heikki Linnakangas wrote: What shall we do about those? Ignore them? Document that if you're using one of these encodings then PL/Python with Python 2 will be crippled and with Python 3 just won't work? We could convert to UTF-8, and use the PostgreSQL functions to convert from UTF-8 to the server encoding. Double conversion might be slow, but I think it would be better than failing. Actually, we already do the other direction that way (PLyUnicode_FromStringAndSize), so maybe it would be more consistent to always use this. I would hesitate to use this as a kind of fallback, because then we would sometimes be using PostgreSQL's recoding tables and sometimes Python's recoding tables, which could become confusing. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Schema version management
On tor, 2012-07-05 at 23:52 +0200, Dimitri Fontaine wrote: I would argue like Álvaro that when dealing with operators and casts you're probably writing an extension already, and we're providing another way to deal with that. I have code in the wild that defines new operators and casts and has no C code and is not in an extension and has no business being in an extension. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq URI and regression testing
Excerpts from Alex's message of jue abr 19 17:06:05 -0300 2012: Peter Eisentraut pete...@gmx.net writes: On tor, 2012-04-19 at 00:13 +0300, Alex wrote: +#!/usr/bin/env perl Don't do that. Call the script using $(PERL) from the makefile. Thank you for the suggestion. Attached v2 does just this (while keeping a more commonly found shebang line in the perl script for running it w/o the makefile.) I've applied this to 9.3. Andrew, can we have the non-MSVC buildfarm members running on the master branch also run make installcheck in src/interfaces/libpq? If there are platform dependencies here, it would be good to know what they are; if we don't have the tests run automatically we will never know. We'll need to figure out some way for MSVC animals to run the tests as well. Any takers? I think this is necessary, because otherwise it is quite clear that the tests are going to be kept failing forever. They were already failing because of message style changes that didn't update the expected output. (I am unsure about pushing the sh to perl test harness conversion in 9.2. If anybody thinks it should be done, please discuss.) -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WAL format changes
On fre, 2012-06-15 at 00:01 +0300, Heikki Linnakangas wrote: 1. Use a 64-bit segment number, instead of the log/seg combination. And don't waste the last segment on each logical 4 GB log file. The concept of a logical log file is now completely gone. XLogRecPtr is unchanged, but it should now be understood as a plain 64-bit value, just split into two 32-bit integers for historical reasons. On disk, this means that there will be log files ending in FF, those were skipped before. A thought on this. There were some concerns that this would silently break tools that pretend to have detailed knowledge of WAL file numbering and this previous behavior of skipping the FF files. We could address this by fixing the overall file naming from something like 000108D000FD 000108D000FE 000108D000FF 000108D1 to 000108D0FD00 000108D0FE00 000108D0FF00 000108D1 which represents the new true WAL stream numbering as opposed to the old two-part numbering. Thus, any tool that thinks it knows how the WAL files are sequenced will break very obviously, but any tool that just looks for 24 hexadecimal digits will be fine. I wonder if any tools in the former category would also break if one changes XLOG_SEG_SIZE. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
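For concreteness, the traditional tli/log/seg naming that such tools assume can be modeled as below. This is only a sketch of the pre-change scheme with the default 16 MB segment size; it does not reproduce the old behavior of skipping the FF segment, nor the reshuffled digit layout proposed above (the function name is made up for the example):

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define WAL_SEGMENTS_PER_LOGFILE 256   /* 4 GB logical log file / 16 MB segments */

/* Format a 24-hex-digit WAL file name from a timeline and a linear
 * 64-bit segment number, using the traditional tli/log/seg split. */
const char *wal_file_name(uint32_t tli, uint64_t segno)
{
    static char buf[32];

    snprintf(buf, sizeof(buf), "%08X%08X%08X",
             tli,
             (uint32_t) (segno / WAL_SEGMENTS_PER_LOGFILE),
             (uint32_t) (segno % WAL_SEGMENTS_PER_LOGFILE));
    return buf;
}
```

A tool that only checks for 24 hexadecimal digits is insensitive to how the middle and low fields are arranged, which is exactly why the proposed renaming would break the former category of tools loudly and leave the latter alone.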
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 06/07/12 22:47, Peter Eisentraut wrote: On fre, 2012-07-06 at 18:53 +0300, Heikki Linnakangas wrote: What shall we do about those? Ignore them? Document that if you're using one of these encodings then PL/Python with Python 2 will be crippled and with Python 3 just won't work? We could convert to UTF-8, and use the PostgreSQL functions to convert from UTF-8 to the server encoding. Double conversion might be slow, but I think it would be better than failing. Actually, we already do the other direction that way (PLyUnicode_FromStringAndSize), so maybe it would be more consistent to always use this. I would hesitate to use this as a kind of fallback, because then we would sometimes be using PostgreSQL's recoding tables and sometimes Python's recoding tables, which could become confusing. So you're in favour of doing unicode -> bytes by encoding with UTF-8 and then using the server's encoding functions? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WAL format changes
On Thu, Jun 14, 2012 at 10:01 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: This has the advantage that you can calculate the CRC for all the other fields before acquiring WALInsertLock. For xl_prev, you need to know where exactly the record is inserted, so it's handy that it's the last field before CRC. It may be late to mention this but fwiw you don't need to reorder the fields to do this. CRC has the property that you can easily adjust it for any changes to the data covered by it. Regardless of where the xl_prev link is you can calculate the CRC as if xl_prev is 0 and then once you get the lock add in the correct xl_prev. This is an argument in favour of using CRC over other checksums for which that would be hard or impossible. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
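The property Heikki's field reordering exploits (a CRC can be accumulated over the early fields first, with the trailing field folded in under the lock) is easy to demonstrate with a small bitwise CRC-32. This uses the common reflected polynomial 0xEDB88320 purely for illustration, not necessarily the exact CRC the WAL code uses; Greg's stronger point, patching an already-finalized CRC for a changed field, needs a combine step in GF(2) that is not shown here:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Streaming reflected CRC-32 (polynomial 0xEDB88320).  Start with
 * crc = 0xFFFFFFFF, feed data in any number of chunks, finish with ~crc. */
uint32_t crc32_update(uint32_t crc, const unsigned char *buf, size_t len)
{
    size_t i;
    int bit;

    for (i = 0; i < len; i++)
    {
        crc ^= buf[i];
        for (bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc;
}

/* CRC of two concatenated pieces: the bulk of the record hashed up front
 * (before taking WALInsertLock), the final field folded in afterwards. */
uint32_t crc32_two_part(const unsigned char *a, size_t alen,
                        const unsigned char *b, size_t blen)
{
    uint32_t crc = 0xFFFFFFFFu;

    crc = crc32_update(crc, a, alen);   /* everything except xl_prev */
    crc = crc32_update(crc, b, blen);   /* xl_prev, known only under the lock */
    return ~crc;
}
```

Splitting the input at any boundary yields the same final CRC as hashing it in one pass, which is why ordering xl_prev last makes the early work safe to do outside the lock.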
Re: [HACKERS] transforms
I haven't had the time to finish all the issues with this, but I want to keep the discussion going in the meantime and provide an updated patch. On mån, 2012-06-18 at 17:33 +0200, Andres Freund wrote: Cursory code review: * pstrndup already exists as pnstrdup (hstore_plperl.c) Fixed. * PyString_FromStringAndSize return value not decreffed? PyDict_SetItem doesn't steal references Fixed. * In plpython_to_hstore I would move the 'pairs' and some related variables in the PG_TRY block, so the reader doesn't need to check whether it should be volatile * In the same function items needs to be volatile to fit into longjmp semantics I'll recheck that later. * I don't think recording dependencies on transforms used when creating functions is a good idea as the transform might get created after the function already exists. That seems to be a pretty confusing behaviour. We need the dependencies, because otherwise dropping a transform would break or silently alter the behavior of functions that depend on it. That sounds like my worst nightmare, thinking of some applications that would be affected by that. But your point is a good one. I think this could be addressed by prohibiting the creation of a transform that affects functions that already exist. Because of the legacy behavior of PL implementations defaulting to a string representation conversion, we would technically need a dependency on the absence of a transform object to make this airtight. In the far future, I could imagine removing this default behavior, meaning you couldn't create the function if no suitable transforms exist for all argument and return types. * I foresee the need for multiple transforms for the same type/language pair to coexist. The user would need to manually choose/call the transform in that case. This currently isn't easily possible... I thought about this briefly at the beginning, but see under worst nightmare above.
Also, having a configuration setting for this or something would prevent any PL functions from being immutable. We don't allow multiple casts or multiple in/out functions either, which are related concepts. If you want different behavior, you should define a different type or different language. *) No psql backslash commands yet. Could be added. Doesn't really seem necessary to me. Not many people will need to look at this and the list of commands already is rather long. I'm going to leave this out for now. *) Permissions: Transforms don't have owners, a bit like casts. Currently, you are allowed to drop a transform if you own both the type and the language. That might be too strict; maybe owning the type and having privileges on the language would be enough. Seems sensible enough to me. I have made this change. *) There is currently some syntax schizophrenia. The grammar accepts

  CREATE TRANSFORM FOR hstore LANGUAGE plperl (
      FROM SQL WITH hstore_to_plperl,
      TO SQL WITH plperl_to_hstore
  );

but pg_dump produces

  CREATE TRANSFORM FOR hstore LANGUAGE plperl (
      FROM SQL WITH hstore_to_plperl(hstore),
      TO SQL WITH plperl_to_hstore(internal)
  );

The SQL standard allows both. (In the same way that it allows 'DROP FUNCTION foo' without arguments, if it is not ambiguous.) Precedent is that CREATE CAST requires arguments, but CREATE LANGUAGE does not. I don't find that problematic personally. I have fixed the syntax to include argument types, so the dump output and the input grammar are consistent. Other changes: - Fixed ecpg grammar to work again with this. - Changed extension naming to be more consistent. - Build additional contrib modules conditionally depending on whether --with-perl or --with-python were configured. (complaint from Jeff Janes) - Fixed Python 3. Things I still want to do: - Refactor the regression test framework for Python 3 so that contrib modules or external extensions don't have to repeat the magic in src/pl/plpython/Makefile.
(Python 3 with hstore_plpython and ltree_plpython works, but the tests don't run.) - Refactor pyobject_to_string(), which is currently kind of copied and pasted from plpython, but should instead be exported by plpython in some suitable way. - Refactor shared library building so that I can have, say, hstore, hstore_plperl, and hstore_plpython in one directory, rather than in three. The reason being, if someone has a new type in a repository on github or something, I don't want them to have to make three separate projects or some crazy subdirectory structure in order to add some PL support for their type. This will require some deep Makefile.shlib hacking, but I think it's worth trying to make this simple(r). So, it's quite likely that this patch won't get finished in this commit fest. transforms-20120707.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your
Re: [HACKERS] Support for XLogRecPtr in expand_fmt_string?
On tis, 2012-07-03 at 14:52 -0400, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: On tis, 2012-07-03 at 19:35 +0200, Andres Freund wrote: I wonder if we just should add a format code like %R or something similar as a replacement for the %X/%X notion. Maybe just print it as a single 64-bit value from now on. That'd be problematic also, because of the lack of standardization of the format code for uint64. We could write things like message... UINT64_FORMAT ...more message but I wonder how well the translation tools would work with that; and anyway it would at least double the translation effort for messages containing such things. The existing uses of INT64_FORMAT and UINT64_FORMAT show how this is done: You print the value in a temporary buffer and use %s in the final string. It's not terribly pretty, but it's been done this way forever, including in xlog code, so there shouldn't be a reason to hesitate about the use for this particular case. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
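The temporary-buffer idiom being described looks like this in portable C; format_lsn_message and the message text are made-up examples, with C99's PRIu64 standing in for PostgreSQL's UINT64_FORMAT macro:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Print a uint64 into a scratch buffer first, so the (translatable)
 * message string only ever needs a plain %s instead of a
 * platform-dependent length modifier. */
const char *format_lsn_message(uint64_t value)
{
    static char msg[64];
    char numbuf[32];

    snprintf(numbuf, sizeof(numbuf), "%" PRIu64, value);
    snprintf(msg, sizeof(msg), "redo starts at %s", numbuf);
    return msg;
}
```

Because the translatable string contains only %s, translators see a single stable message regardless of how the platform spells its 64-bit format code.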
Re: [HACKERS] Bug tracker tool we need
On Fri, Jul 6, 2012 at 12:21 PM, Bruce Momjian br...@momjian.us wrote: I think our big gap is in integrating these sections. There is no easy way for a bug reporter to find out what happens to his report unless the patch is applied in the same email thread as the report. It is hard for users to see _all_ the changes made in a release because the release notes are filtered. Our current system is designed to have very limited friction of action, and this gives us a high-quality user experience and release quality, but it does have limits in how well we deal with complex cases. I do basically agree with this. I was reflecting on the bug tracker issue (or lack thereof) for unrelated reasons earlier today and I think there are some very nice things to recommend the current email-based system, which are the reasons you identify above. Perhaps the area where it falls down is structured searches (such as for closed or wontfix) and tracking progress of related, complex, or multi-part issues that span multiple root email messages. Maybe just using the message-ids to cross reference things (or at least morally: perhaps a point of indirection so as to collapse multiple bug reports about the same thing, or to provide a place to add more annotation would be good, not unlike the CommitFest web application in relation to emails) is enough. Basically, perhaps an overlay on top of email might be a more supple way to figure out what process improvements work well without committing to a whole new tool chain and workflow all at once. -- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug tracker tool we need
On Fri, Jul 06, 2012 at 03:41:41PM -0700, Daniel Farina wrote: On Fri, Jul 6, 2012 at 12:21 PM, Bruce Momjian br...@momjian.us wrote: I think our big gap is in integrating these sections. There is no easy way for a bug reporter to find out what happens to his report unless the patch is applied in the same email thread as the report. It is hard for users to see _all_ the changes made in a release because the release notes are filtered. Our current system is designed to have very limited friction of action, and this gives us a high-quality user experience and release quality, but it does have limits in how well we deal with complex cases. I do basically agree with this. I was reflecting on the bug tracker issue (or lack thereof) for unrelated reasons earlier today and I think there are some very nice things to recommend the current email-based system, which are the reasons you identify above. Perhaps the area where it falls down is structured searches (such as for closed or wontfix) and tracking progress of related, complex, or multi-part issues that span multiple root email messages. I normally assume friction is just something that slows you down from attaining a goal, but open source development is only _possible_ because of the low friction communication available via the Internet. It isn't that open source development would be slower --- it would probably not exist in its current form (think shareware diskettes for an alternative). So, while it is tempting to think of a bug tracker as just slowing us down, it might really alter our ability to develop software. Yes, I know most other projects use bug trackers, but I doubt their development and user interactions are the same quality as ours. On the flip side, for complex cases, some of our user interactions are terrible.
> Maybe just using the message-ids to cross reference things (or at least
> morally: perhaps a point of indirection so as to collapse multiple bug
> reports about the same thing, or to provide a place to add more
> annotation would be good, not unlike the CommitFest web application in
> relation to emails) is enough. Basically, perhaps an overlay on top of
> email might be a more supple way to figure out what process improvements
> work well without committing to a whole new tool chain and workflow all
> at once.

I know there is work underway to allow cross-month email archive threading, and that might help. I feel we have to be honest about what our current development process does poorly.

--
Bruce Momjian  br...@momjian.us  http://momjian.us
EnterpriseDB   http://enterprisedb.com

+ It's impossible for everything to be true. +
Re: [HACKERS] enhanced error fields
Attached is a revision of this patch, with a few clean-ups, mostly to the wording of certain things.

On 5 July 2012 17:41, Pavel Stehule pavel.steh...@gmail.com wrote:
> * renamed auxiliary functions and moved them to elog.c - header is new
> file relerror.h

In my revision, I've just added a pre-declaration and removed the dedicated header, which didn't make much sense to me:

+ /* Pre-declare Relation, in order to avoid a build dependency on rel.h. */
+ typedef struct RelationData *Relation;

This works fine, and has precedent. See src/include/storage/buffile.h, for example. If you think it's unreasonable that none of the functions now added to elog.h are callable without also including rel.h, consider that all call sites are naturally already doing that, for the errmsg() string itself. Besides, this is a perfectly valid way of reducing build dependencies, or at least a more valid way than creating a new header that does not really represent a natural new division for these new functions, IMHO. Opaque pointers are ordinarily used to encapsulate things in C, rather than to prevent build dependencies, but I believe that's only because, in general, that's what C programmers are more likely to want.

> * new fields constraint_table and trigger_table - constraints and
> triggers are related to a relation in pg, not just to a schema

I've added some remarks to that effect in the docs of my revision, for your consideration.

> * better coverage of enhancing errors in source code

Good. I think it's important that we nail down just where these are expected to be available. It would be nice if there were a quick and easy answer to the question "Just what ErrorResponse fields should this new sub-category of class 23 ereport() site have?". We clearly have yet to work those details out.

I have another concern with this patch. log_error_verbosity appears to be intended as an exact analogue of client verbosity (as set by PQsetErrorVerbosity()).
While this generally holds for this patch, there is one notable exception: you always log all of these new fields within write_csvlog(), even if (Log_error_verbosity < PGERROR_VERBOSE). Why? There will be a bunch of commas in most CSV logs once this happens, so that the schema of the log is consistent. That is kind of ugly, but I don't see a way around it. We already do this for location and application_name (though that's separately controlled by the application_name GUC). I haven't touched that in the attached revision, as I'd like to hear what you have to say.

Another problem that will need to be fixed concerns the following values:

+#define PG_DIAG_COLUMN_NAME       'c'
+#define PG_DIAG_TABLE_NAME        't'
+#define PG_DIAG_SCHEMA_NAME       's'
+#define PG_DIAG_CONSTRAINT_NAME   'n'
+#define PG_DIAG_CONSTRAINT_TABLE  'o'
+#define PG_DIAG_CONSTRAINT_SCHEMA 'm'
+#define PG_DIAG_ROUTINE_NAME      'r'
+#define PG_DIAG_ROUTINE_SCHEMA    'u'
+#define PG_DIAG_TRIGGER_NAME      'g'
+#define PG_DIAG_TRIGGER_TABLE     'i'
+#define PG_DIAG_TRIGGER_SCHEMA    'h'

Not all appear to have a way of setting the value within the ereport interface. For example, there is nothing like errrelation_column() (or errrelcol(), as I call it) to set PG_DIAG_ROUTINE_NAME. This is something I haven't touched.

> src/backend/utils/adt/domains.c
> 162: (errcode(ERRCODE_CHECK_VIOLATION),
>
> these exceptions are related to domains - we do not have adequate
> fields now - and these fields are not in the standard
>
> it needs something like DOMAIN_NAME and DOMAIN_SCHEMA ???

Hmm. I'm not sure that it's worth it. I took a look at recent JDBC documentation, because I'd expect it to offer the most complete support for exposing this in exception classes. It turns out that it does not expose things at as fine a granularity as you have here at all, which is disappointing. It seems to suppose that just a vendor code and cause will be sufficient. If you're using this one proprietary database, there is a command that'll let you get diagnostics.
The wisdom for users of another proprietary database seems to be that you just use stored procedures.

So I agree that CONSTRAINT NAME should be unset in the event of uncatalogued, unnamed not-null integrity constraint violations. The standard seems to be loose on this, and I think we'll have to be too. However, I'd find it pretty intolerable if we were inconsistent between ereport() call sites that raise *effectively the same error*; this could result in a user's application breaking based on the phase of the moon.

Do you suppose it's worth stashing the last set of these fields to one side, and exposing them through an SQL utility command, or even a bundled function? I don't imagine that it's essential to have that right away, but it's something to consider.

--
Peter Geoghegan  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

eelog.2012_07_07.patch
Re: [HACKERS] Bug tracker tool we need
I wonder if maybe the nearest step towards a better bug tracker is a more readily referenceable mail archive. Clearly, one of our frictions is searching for relevant messages, so improved mail archive == lowered friction, no?

There's a very particular use case: people keep ruing that indexes get cut off on a monthly basis. That's doubtless not the only pain, but it keeps getting mentioned, so solving it seems valuable.

Having a correlation between commits, commitfest entries, and associated email seems like another valuable addition. Perhaps there are more...

I'm not yet poking at anything that would suggest "email database", either. A lot of the analysis would be more network-oriented; putting more of a Prolog hat on, not so much tabular / relational...
Re: [HACKERS] Bug tracker tool we need
On Fri, Jul 06, 2012 at 08:44:13PM -0400, Christopher Browne wrote:
> I wonder if maybe the nearest step towards a better bug tracker is a
> more readily referenceable mail archive. Clearly, one of our frictions
> is searching for relevant messages, so improved mail archive == lowered
> friction, no?
>
> There's a very particular use case: people keep ruing that indexes get
> cut off on a monthly basis. That's doubtless not the only pain, but it
> keeps getting mentioned, so solving it seems valuable.

Agreed. I think Magnus is working on having the threads span months. The big question is what we are going to do with this ability once we get it.

> Having a correlation between commits, commitfest entries, and associated
> email seems like another valuable addition.

Yep.

> Perhaps there are more... I'm not yet poking at anything that would
> suggest "email database", either. A lot of the analysis would be more
> network-oriented; putting more of a Prolog hat on, not so much tabular /
> relational...

To put a finer point on this, I think projects that interact with users via a bug tracker have much poorer user/developer communication, and also less impetus to fix bugs quickly, because everything is already recorded in the tracker. And after not dealing with bugs immediately for a while, the bug database becomes huge, and developers can only triage it, fixing commonly-reported bugs and leaving the rest for later, which effectively means never.

--
Bruce Momjian  br...@momjian.us  http://momjian.us
EnterpriseDB   http://enterprisedb.com

+ It's impossible for everything to be true. +
Re: [HACKERS] Bug tracker tool we need
On Fri, Jul 6, 2012 at 6:41 PM, Daniel Farina dan...@heroku.com wrote:
> I do basically agree with this. I was reflecting on the bug tracker
> issue (or lack thereof) for unrelated reasons earlier today and I think
> there are some very nice things to recommend the current email-based
> system, which are the reasons you identify above. Perhaps the area where
> it falls down is structured searches (such as for "closed" or "wontfix")
> and tracking progress of related, complex, or multi-part issues that
> span multiple root email messages.
>
> Maybe just using the message-ids to cross reference things (or at least
> morally: perhaps a point of indirection so as to collapse multiple bug
> reports about the same thing, or to provide a place to add more
> annotation would be good, not unlike the CommitFest web application in
> relation to emails) is enough. Basically, perhaps an overlay on top of
> email might be a more supple way to figure out what process improvements
> work well without committing to a whole new tool chain and workflow all
> at once.

+1. This is almost word-for-word how I feel about it myself.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
[HACKERS] Allow replacement of bloated primary key indexes without foreign key rebuilds
Today I learnt [1,2,3] that the feature ALTER TABLE .. ADD CONSTRAINT ... USING INDEX we added back in the day is not so useful in the field. Having to drop foreign key constraints before this command and recreate them afterwards makes it useless to most database setups. I feel sorry that no one brought this up when we were implementing the feature; maybe we could've done something about it right then. I wish to correct it now, and did some research.

Long story short, I realized that the foreign key constraint depends on the index relation of the primary key, and does not depend on the constraint object as I had expected (please see rows 5 and 11 of the result set shown below). This behaviour is also seen when the FKey references a unique constraint. As much as that perplexes me, I think it makes our job a bit easier. All we need to do is allow swapping of pg_class.relfilenode between two indexes. This will let the dependency entries stand as they are, and allow us to drop the bloated primary key index structure without having to rebuild the foreign key constraints.

As for the syntactic sugar, this could be added to either ALTER TABLE or ALTER INDEX. Although under no normal circumstances would one need to use ALTER INDEX to swap two indexes' relfilenodes (because one can easily create a duplicate index and drop/rename-in-place the old one), I think it would make more sense here, since it is just an operation on two indexes and has nothing to do with the constraints, apart from the fact that we want to use this feature to meddle with the constraints.
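The dependency layout can be inspected directly; a query along these lines (my own sketch, not necessarily the query referenced as [4] below) resolves the pg_depend catalog OIDs for the two example tables:

```sql
-- Sketch: show dependency rows touching test/test2, resolving the
-- classid/refclassid OIDs to catalog names via the regclass cast.
-- deptype: 'n' = NORMAL, 'a' = AUTO, 'i' = INTERNAL.
SELECT classid::regclass    AS classid,
       objid,
       refclassid::regclass AS refclassid,
       refobjid,
       deptype
FROM   pg_depend
WHERE  refobjid IN ('test'::regclass, 'test2'::regclass)
    OR objid    IN ('test'::regclass, 'test2'::regclass);
```

The rows of interest are the two test2_b_fkey entries: one NORMAL dependency on the referenced table, and one on the index relation (not the pg_constraint entry), which is what makes the relfilenode-swap approach viable.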
Syntax options:

ALTER TABLE tbl REPLACE [CONSTRAINT constr] {PRIMARY KEY | UNIQUE} USING INDEX new_index;
ALTER INDEX ind REPLACE WITH new_index;

Note that in both syntaxes, it is assumed that all remnants of new_index will be gone after the command completes successfully; that is, the commands will behave as if they deleted the index structure of the index being replaced and placed the new structure in its place, while dropping the index that was used for the replacement. I don't think we need to ensure that new_index is completely flushed to disk before the operation, but we do need to issue the relevant cache invalidation messages after the operation is done. For the replacement to be successful, new_index should not be associated with any constraints, and new_index should be identical to the index being replaced, except for the index names.

The ALTER TABLE syntax closely emulates the existing syntax for replacing a constraint using an existing index, but looking at the grammar construction I feel it may be more complex to implement than the ALTER INDEX syntax. ALTER INDEX feels easier to do, since we won't have to jump through hoops like ALTER TABLE's multi-command support (the ATExec*() functions), and dropping new_index might be easier. Thoughts?
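To make the intended workflow concrete, here is how the proposed ALTER INDEX variant might be used against the example tables below. This is a sketch of the proposal only; the REPLACE WITH syntax does not exist, and the index names follow the example in this message:

```sql
-- Build an identical replacement index alongside the bloated one.
CREATE UNIQUE INDEX CONCURRENTLY temp_idx ON test (a);

-- Proposed syntax (not implemented): swap the two indexes'
-- pg_class.relfilenode values, so test_pkey keeps its OID (and hence
-- test2_b_fkey's dependency on it) while pointing at the freshly
-- built storage; temp_idx is then dropped as part of the command.
ALTER INDEX test_pkey REPLACE WITH temp_idx;
```

Because the foreign key's pg_depend entry names the index relation by OID, and the OID never changes, no FK drop/recreate is needed.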
postgres=# \d+ test
                         Table "public.test"
 Column |  Type   | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
 a      | integer | not null  | plain   |              |
 b      | integer |           | plain   |              |
Indexes:
    "test_pkey" PRIMARY KEY, btree (a)
    "temp_idx" UNIQUE, btree (a)
Referenced by:
    TABLE "test2" CONSTRAINT "test2_b_fkey" FOREIGN KEY (b) REFERENCES test(a)
Has OIDs: no

postgres=# \d+ test2
                         Table "public.test2"
 Column |  Type   | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
 a      | integer |           | plain   |              |
 b      | integer |           | plain   |              |
Foreign-key constraints:
    "test2_b_fkey" FOREIGN KEY (b) REFERENCES test(a)
Has OIDs: no

Relevant output of query [4] on pg_depend:

    classid    | objid |    objid     |  refclassid   | refobjid | refobjid  | deptype
---------------+-------+--------------+---------------+----------+-----------+---------
 pg_class      | 16413 | test         | pg_namespace  |     2200 | public    | n
 pg_type       | 16415 | test         | pg_class      |    16413 | test      | i
 pg_type       | 16414 | test[]       | pg_type       |    16415 | test      | i
 pg_constraint | 16417 | test_pkey    | pg_class      |    16413 | test      | a
 pg_class      | 16416 | test_pkey    | pg_constraint |    16417 | test_pkey | i
 pg_class      | 16418 | test2        | pg_namespace  |     2200 | public    | n
 pg_type       | 16420 | test2        | pg_class      |    16418 | test2     | i
 pg_type       | 16419 | test2[]      | pg_type       |    16420 | test2     | i
 pg_constraint | 16421 | test2_b_fkey | pg_class      |    16413 | test      | n
 pg_constraint | 16421 | test2_b_fkey | pg_class      |    16418 |