Re: [HACKERS] Replacing plpgsql's lexer
On Tue, 2009-04-14 at 18:29 -0400, Tom Lane wrote: > Simon Riggs writes: > > On Tue, 2009-04-14 at 16:37 -0400, Tom Lane wrote: > >> Comments, objections, better ideas? > > > Please, if you do this, make it optional. > > I don't think making the plpgsql lexer pluggable is realistic. Doesn't sound easy, no. (I didn't suggest pluggable, just optional). > > Potentially changing the behaviour of thousands of functions just to fix > > a rare bug will not endear us to our users. The bug may be something > > that people are relying on in some subtle way, ugly as that sounds. > > That's why I don't want to change it in a minor release. In a major > release, however, it's fair game. If we want to make easy upgrades a reality, this is the type of issue we must consider. Not much point having perfect binary upgrades if all your functions start behaving differently after upgrade and then you discover there isn't a binary downgrade path... Rather than come up with specific solutions, let me just ask the question: Is there a workaround for people caught by these changes? Let's plan that alongside the change itself, so we have a reserve 'chute. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why isn't stats_temp_directory automatically created?
Hi, On Tue, Apr 14, 2009 at 10:26 PM, Euler Taveira de Oliveira wrote: > Fujii Masao escreveu: >> >> Is it worth making the patch which creates stats_temp_directory >> if not present? >> > +1. Here is the patch. This patch should be added to CommitFest-2009-First?, or committed before 8.4 release? The patch is very small, so I don't think that it'll block 8.4 release. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center Index: src/backend/postmaster/pgstat.c === RCS file: /projects/cvsroot/pgsql/src/backend/postmaster/pgstat.c,v retrieving revision 1.187 diff -c -r1.187 pgstat.c *** src/backend/postmaster/pgstat.c 1 Jan 2009 17:23:46 - 1.187 --- src/backend/postmaster/pgstat.c 15 Apr 2009 06:08:04 - *** *** 111,116 --- 111,117 bool pgstat_track_counts = false; int pgstat_track_functions = TRACK_FUNC_OFF; int pgstat_track_activity_query_size = 1024; + char *pgstat_temp_directory; /* -- * Built from GUC parameter *** *** 589,594 --- 590,600 return 0; /* + * Create temporary statistics directory if not present; ignore errors + */ + mkdir(pgstat_temp_directory, 0700); + + /* * Do nothing if too soon since last collector start. This is a safety * valve to protect against continuous respawn attempts if the collector * is dying immediately at launch. Note that since we will be re-called Index: src/backend/utils/misc/guc.c === RCS file: /projects/cvsroot/pgsql/src/backend/utils/misc/guc.c,v retrieving revision 1.502 diff -c -r1.502 guc.c *** src/backend/utils/misc/guc.c 7 Apr 2009 23:27:34 - 1.502 --- src/backend/utils/misc/guc.c 15 Apr 2009 06:08:18 - *** *** 375,382 char *IdentFileName; char *external_pid_file; - char *pgstat_temp_directory; - int tcp_keepalives_idle; int tcp_keepalives_interval; int tcp_keepalives_count; --- 375,380 Index: src/include/pgstat.h === RCS file: /projects/cvsroot/pgsql/src/include/pgstat.h,v retrieving revision 1.82 diff -c -r1.82 pgstat.h *** src/include/pgstat.h 4 Jan 2009 22:19:59 - 1.82 --- src/include/pgstat.h 15 Apr 2009 06:08:22 - *** *** 593,598 --- 593,599 extern bool pgstat_track_counts; extern int pgstat_track_functions; extern PGDLLIMPORT int pgstat_track_activity_query_size; + extern PGDLLIMPORT char *pgstat_temp_directory; extern char *pgstat_stat_tmpname; extern char *pgstat_stat_filename; -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why isn't stats_temp_directory automatically created?
Euler Taveira de Oliveira wrote: > Fujii Masao escreveu: > > Is it worth making the patch which creates stats_temp_directory > > if not present? > > > +1. +1, but AFAIK stats_temp_directory was designed to symlink to a RAM drive. Even if stats_temp_directory exists as a symbolic link, the destination directory might be lost in such a situation. If you try to make servers more robust, you might also need to consider broken symlinks. Regards, --- ITAGAKI Takahiro NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Warm Standby restore_command documentation (was: New trigger option of pg_standby)
Hi, On Wed, Apr 15, 2009 at 3:30 AM, Andreas Pflug wrote: > I've been following the thread with growing lack of understanding why > this is so hardly discussed, and I went back to the documentation of > what the restore_command should do ( > http://www.postgresql.org/docs/8.3/static/warm-standby.html ) > > While the algorithm presented in the pseudocode isn't dealing too good > with a situation where the trigger is set while the restore_command is > sleeping (this should be handled better in a real implementation), the > code says > > "Restore all wal files. If no more wal files are present, stop restoring > if the trigger is set; otherwise wait for a new wal file". > > Since pg_standby is meant as implementation of restore_command, it has > to follow the directive stated above; *anything else is a bug*. > pg_standby currently does *not* obey this directive, and has that > documented, but a documented bug still is a bug. > > Conclusion: There's no "new trigger option" needed, instead pg_standby > has to be fixed so it does what the warm standby option of postgres > needs. The trigger is only to be examined if no more files are > restorable, and only once. Yeah, as a result of the discussion on that thread, I'll change the default behavior instead of adding new trigger option. But, I'm not going to get rid of the current behavior; it's chosen if the trigger file containing "fast" exists. On the other hand, new behavior is chosen when the trigger file containing "smart" or an empty one exists (default). Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Memory exhaustion during bulk insert
Hi all, I'm doing an experimental project using Postgres as the prototype. I want to store attribute values of xml type in an internal XML table which is created for every XML column. One XML node (element, attribute or text) is stored as a tuple in the XML table. While a 127MB XML document 'dblp.xml' (that has about 4 million XML nodes thus 4 million tuples) is being stored, 2GB memory is exhausted rapidly and then my computer hangs up. I guess the reason is that the memory runs out before the transaction is committed because the number of tuples being inserted is too large. The flow of tuple insertion and functions called are as follows: while (get the next XML node != NULL) { /* fill in values and isnull array */ ... tup = heap_form_tuple(tupleDesc, values, isnull); simple_heap_insert(xmlTable, tup); ... heap_freetuple(tup); } I searched the mailinglist archive and noticed that a patch to improve bulk insert performance is committed in Nov 2008. The log message said "(the patch) keeps the current target buffer pinned and make it work in a small ring of buffers to avoid having bulk inserts trash the whole buffer arena." However, I do not know much about the code below the heapam layer. Can that patch solve my problem (the version I use is 8.3.5)? Or could you give me some suggestion about how to avoid memory exhaustion during bulk insert (in the meanwhile it must clean up nicely after a transaction abort)? Thanks in advance. Regards, -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Patch for server-side encoding issues
Here is a WIP patch to solve server-side encoding issues. It includes "Solution of the file name problem of copy on windows" patch. http://archives.postgresql.org/message-id/20090413184335.39be.52131...@oss.ntt.co.jp It could solve the following issues. They are not only in Windows nor Japan-specific problems. They could also occur if you use databases with mulitple encodings or database with non-platform-native encoding even on POSIX platforms. <1> Non-ascii file paths for database that encoding is different from platform's encoding (that comes from $LANG or Windows codepage), especially for COPY TO/FROM. <2> Use appropriate encoding for non-text server log (console, syslog and eventlog). The encoding is the same as <1>. <3> Use appropriate encoding for text server log (stderr and csvlog), especially database cluster has databases with a variety of encoding. New GUC parameter 'log_encoding' specifies the encoding in server log. <4> (incomplete) Avoid encoding conversion error in printing server log and messages for client. Instead of error, print '?' if there is no equivalent character in the target encoding. For <4>, I use PG_TRY and PG_CATCH for now, but it must be a bad manner. Instead, I'm thinking that convertion procedures will take an optional argument whether it should raise error or not. However, we need to modify all of conversion functions to do so. More research is needed against following situations: - NLS messages - Module path for LOAD - Arguments for system(), including archive_command and restore_command - Query texts for other database in pg_stat_activity and pg_stat_statements Comments welcome. Please notify me if I'm missing something. Here is a sample code to test the patch. (client_encoding = sjis / system encoding = sjis) C:\home\>createdb utfdb --encoding=utf8 --locale=C C:\home\>createdb eucdb --encoding=eucjp --locale=C C:\home\>psql utfdb -c "COPY (SELECT 1) TO 'C:/home/日本語ファイル.txt'" C:\home\>psql utfdb -c "SELECT '日本語' WITH ERROR" ERROR: syntax error at or near "WITH ERROR" LINE 1: SELECT '日本語' WITH ERROR ^ C:\home\>psql eucdb -c "COPY (SELECT 1) TO 'C:/home/日本語ファイル.txt'" C:\home\>psql eucdb -c "SELECT '日本語' WITH ERROR" ERROR: syntax error at or near "WITH ERROR" LINE 1: SELECT '日本語' WITH ERROR ^ Regards, --- ITAGAKI Takahiro NTT Open Source Software Center server-side_encoding_issues_20090415.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Replacing plpgsql's lexer
* Bruce Momjian (br...@momjian.us) wrote: > Well, this bug has existed long before 8.4 so we could just leave it for > 8.5, and it is not like we have had tons of complaints; the only > complaint I saw was one from March, 2008. I think it's a good thing to do in general. I'm also concerned about if it will impact the plpgsql functions we have (which are pretty numerous..) but in the end I'd rather have it fixed in 8.4 than possibly delayed indefinitely (after all, if it's in 8.4, why fix it for 8.5?). Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Replacing plpgsql's lexer
Bruce Momjian writes: > Well, this bug has existed long before 8.4 so we could just leave it for > 8.5, and it is not like we have had tons of complaints; the only > complaint I saw was one from March, 2008. We had one last week, which is what prompted me to start looking at the plpgsql lexer situation in the first place. Also, if the unicode literal situation doesn't change, that's going to be problematic as well. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Replacing plpgsql's lexer
Tom Lane wrote: > Simon Riggs writes: > > On Tue, 2009-04-14 at 16:37 -0400, Tom Lane wrote: > >> Comments, objections, better ideas? > > > Please, if you do this, make it optional. > > I don't think making the plpgsql lexer pluggable is realistic. > > > Potentially changing the behaviour of thousands of functions just to fix > > a rare bug will not endear us to our users. The bug may be something > > that people are relying on in some subtle way, ugly as that sounds. > > That's why I don't want to change it in a minor release. In a major > release, however, it's fair game. Well, this bug has existed long before 8.4 so we could just leave it for 8.5, and it is not like we have had tons of complaints; the only complaint I saw was one from March, 2008. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: add columns created and altered to pg_proc and pg_class
Bruce Momjian writes: > How do you handle dump/restore? Is it preserved? I would think not. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode support
> "Peter" == Peter Eisentraut writes: > On Tuesday 14 April 2009 07:07:27 Andrew Gierth wrote: >> FWIW, the SQL spec puts the onus of normalization squarely on the >> application; the database is allowed to assume that Unicode >> strings are already normalized, is allowed to behave in >> implementation-defined ways when presented with strings that >> aren't normalized, and provision of normalization functions and >> predicates is just another optional feature. Peter> Can you name chapter and verse on that? 4.2.8 Universal character sets A UCS string is a character string whose character repertoire is UCS and whose character encoding form is one of UTF8, UTF16, or UTF32. Any two UCS strings are comparable. An SQL-implementation may assume that all UCS strings are normalized in one of Normalization Form C (NFC), Normalization Form D (NFD), Normalization Form KC (NFKC), or Normalization Form KD (NFKD), as specified by [Unicode15]. may be used to verify the normalization form to which a particular UCS string conforms. Applications may also use to enforce a particular . With the exception of and , the result of any operation on an unnormalized UCS string is implementation-defined. Conversion of UCS strings from one character set to another is automatic. Detection of a noncharacter in a UCS-string causes an exception condition to be raised. The detection of an unassigned code point does not. [Obviously there are things here that we don't conform to anyway (we don't raise exceptions for noncharacters, for example. We don't claim conformance to T061.] ::= ::= IS [ NOT ] [ ] NORMALIZED 1) Without Feature T061, "UCS support", conforming SQL language shall not contain a . 2) Without Feature F394, "Optional normal form specification", conforming SQL language shall not contain . ::= NORMALIZE [ [ ] ] ::= NFC | NFD | NFKC | NFKD 7) Without Feature T061, "UCS support", conforming SQL language shall not contain a . 9) Without Feature F394, "Optional normal form specification", conforming SQL language shall not contain . Peter> I see this, for example, Peter> 6.27 [...] Peter> So SQL redirects the question of character length the Unicode Peter> standard. I have not been able to find anything there on a Peter> quick look, but I'm sure the Unicode standard has some very Peter> specific ideas on this. Note that the matter of normalization Peter> is not mentioned here. I've taken a not-so-quick look at the Unicode standard (though I don't claim to be any sort of expert on it), and I certainly can't see any definitive indication what the length is supposed to be; however, the use of terminology such as "combining character sequence" (meaning a series of codepoints that combine to make a single glyph) certainly seems to strongly imply that our interpretation is correct and that the OP's is not. Other indications: the units used by length() must be the same as the units used by position() and substring() (in the spec, when USING CHARACTERS is specified), and it would not make sense to use a definition of "character" that did not allow you to look inside a combining sequence. I've also failed so far to find any examples of other programming languages in which a combining character sequence is taken to be a single character for purposes of length or position specification. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: add columns created and altered to pg_proc and pg_class
Tom Lane wrote: > Pavel Stehule writes: > > this my proposal is very simple. It help to people who have to manage > > large or complex database system. Important data are date of creating > > and date of altering tables and stored procedures. These data cannot > > be modified by user, so implementation doesn't need any new > > statements. > > ISTM anyone who thinks they need this actually need a full DDL log; > or at least, if we give them this, they will be back next week asking > for a full log. So it'd save a lot of work to tell them to just log > their DDL to start with. > > Some obvious objections to the simple approach: > - what if I want to know *who* made the change > - what if I need to know about the change before last > - what if I need to know about a DROP > - what if I need to know about operators, operator classes, schemas, etc > etc How do you handle dump/restore? Is it preserved? -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Yet another regression issue with Fedora-10 + PG 8.4 beta1
Devrim =?ISO-8859-1?Q?G=DCND=DCZ?= writes: > I built PG 8.4 beta1 on 2 different Fedora-10 boxes. One of them was a > local box, which runs under QEMU. The other one is Fedora's build > servers. (I did a scratch build on Fedora build server) > On the first machine, I got a regression failure: > http://www.gunduz.org/temp/regression.out > http://www.gunduz.org/temp/regression.diffs > I could not repeat this with 2nd one (Fedora build servers) > Any ideas what might have caused it? QEMU screwing up the handling of floating-point traps, perhaps? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Replacing plpgsql's lexer
Simon Riggs writes: > On Tue, 2009-04-14 at 16:37 -0400, Tom Lane wrote: >> Comments, objections, better ideas? > Please, if you do this, make it optional. I don't think making the plpgsql lexer pluggable is realistic. > Potentially changing the behaviour of thousands of functions just to fix > a rare bug will not endear us to our users. The bug may be something > that people are relying on in some subtle way, ugly as that sounds. That's why I don't want to change it in a minor release. In a major release, however, it's fair game. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] psql with "Function Type" in \df
On Tue, Apr 14, 2009 at 02:52:32PM -0400, Alvaro Herrera wrote: > Tom Lane wrote: > > > I had a second thought about that: presumably we should make the > > function type names translatable. If we do that, it might be better > > to make the aggregate case be "aggregate" and take the width hit. > > Otherwise translators are going to be puzzled when they come across > > "agg" as a translatable phrase. > > I think it's good to have them translatable. As for using "aggregate" > instead of "agg" I don't think it's that great an idea. If you need to > notify translators that "agg" stands for "aggregate", add a > /* translator: */ comment. Here's the next revision :) Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index 10d42ca..5224440 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -1043,14 +1043,19 @@ testdb=> Lists available functions, together with their argument and -return types. If pattern -is specified, only functions whose names match the pattern are shown. -If the form \df+ is used, additional information about -each function, including volatility, language, source code and description, is shown. -By default, only user-created objects are shown; supply a -pattern or the S modifier to include system -objects. +return types and their function type: 'normal', 'agg', +'trigger', and 'window'. If pattern is specified, only +functions whose names match the pattern are shown. If the +form \df+ is used, additional information +about each function, including volatility, language, source +code and description, is shown. By default, only user-created +objects are shown; supply a pattern or the +S modifier to include system objects. To +include aggregates in the result set, use \dfa, normal +functions, \dfn, trigger functions, \dft, windowing functions, +\dfw. You may freely mix and match the +, S, a, n, t and w +options. @@ -1064,7 +1069,6 @@ testdb=> - \dF[+] [ pattern ] diff --git a/doc/src/sgml/release.sgml b/doc/src/sgml/release.sgml index fc56c3d..09ba686 100644 --- a/doc/src/sgml/release.sgml +++ b/doc/src/sgml/release.sgml @@ -183,6 +183,15 @@ do it for earlier branch release files. + + + In psql, \df now shows which type of function (normal, + aggregate, trigger, or window) it is. You can also specify + mix-and-match options. To get aggregates and windowing + functions, including system ones, for example, invoke \dfwaS+ + + + diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index b39466d..1dc3cc3 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -365,8 +365,22 @@ exec_command(const char *cmd, case 'D': success = listDomains(pattern, show_system); break; - case 'f': - success = describeFunctions(pattern, show_verbose, show_system); + case 'f': /* function subsystem */ + switch (cmd[2]) + { + case '\0': + case '+': + case 'S': + case 'a': + case 'n': + case 't': + case 'w': + success = describeFunctions(&cmd[2], pattern, show_verbose, show_system); + break; + default: + status = PSQL_CMD_UNKNOWN; + break; + } break; case 'g': /* no longer distinct from \du */ diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c index 731baf8..ad5dcbe 100644 --- a/src/bin/psql/describe.c +++ b/src/bin/psql/describe.c @@ -183,15 +183,43 @@ describeTablespaces(const char *pattern, bool verbose) /* \df - * Takes an optional regexp to select particular functions + * Takes an optional regexp to select particular functions. + * + * As with \d, you can specify the kinds of functions you want: + * + * a for aggregat
[HACKERS] Yet another regression issue with Fedora-10 + PG 8.4 beta1
Hi, I built PG 8.4 beta1 on 2 different Fedora-10 boxes. One of them was a local box, which runs under QEMU. The other one is Fedora's build servers. (I did a scratch build on Fedora build server) On the first machine, I got a regression failure: http://www.gunduz.org/temp/regression.out http://www.gunduz.org/temp/regression.diffs I could not repeat this with 2nd one (Fedora build servers) Any ideas what might have caused it? Regards, -- Devrim GÜNDÜZ, RHCE devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr http://www.gunduz.org signature.asc Description: This is a digitally signed message part
Re: [HACKERS] Replacing plpgsql's lexer
Andrew Dunstan writes: > Robert Haas wrote: >> All this sounds good. As for how to handle := and .., I think making >> them lex the same way in PL/pgsql and core SQL would be a good thing. > They don't have any significance in core SQL. What would we do with the > lexeme? It would just fail --- the core grammar will have no production that can accept it. Right offhand I think the only difference is that instead of regression=# select a .. 2; ERROR: syntax error at or near "." LINE 1: select a .. 2; ^ you'd see regression=# select a .. 2; ERROR: syntax error at or near ".." LINE 1: select a .. 2; ^ ie it acts like one token not two in the error message. This solution would become problematic if the core grammar ever had a meaning for := or .. that required treating them as two tokens (eg, the grammar allowed this sequence with whitespace between). I don't think that's very likely though; and if it did happen we could fix it with the aforementioned control switch. > The only thing that makes me nervous about this is that we're very close > to Beta. OTOH, this is one area the regression suite should give a > fairly good workout to. Yeah, I'd rather have done it before beta1, but too late. The other solution still entails massive changes to the plpgsql lexer, so it doesn't really look like much lower risk. AFAICS the practical alternatives are a reimplementation in beta2, or no fix until 8.5. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Replacing plpgsql's lexer
Robert Haas wrote: All this sounds good. As for how to handle := and .., I think making them lex the same way in PL/pgsql and core SQL would be a good thing. They don't have any significance in core SQL. What would we do with the lexeme? ISTR we've used some hacks in the past to split lexemes into pieces, and presumably we'd have to do something similar with these. The only thing that makes me nervous about this is that we're very close to Beta. OTOH, this is one area the regression suite should give a fairly good workout to. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Replacing plpgsql's lexer
On Tue, 2009-04-14 at 16:37 -0400, Tom Lane wrote: > Comments, objections, better ideas? Please, if you do this, make it optional. Potentially changing the behaviour of thousands of functions just to fix a rare bug will not endear us to our users. The bug may be something that people are relying on in some subtle way, ugly as that sounds. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Clean shutdown and warm standby
Hi, On Wed, Apr 8, 2009 at 9:11 PM, I wrote: > Following the discussion here > http://archives.postgresql.org/message-id/49d9e986.8010...@pse-consulting.de > , I wrote a small patch which rotates the last XLog file on shutdown > [snip] Any comment or advice on how I can fix it with a different method if this one is considered wrong? Original message and patch here: http://archives.postgresql.org/message-id/1d4e0c10904081211p2c0f1cdepe620c11d1271c...@mail.gmail.com Thanks. -- Guillaume -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Replacing plpgsql's lexer
On Tue, Apr 14, 2009 at 4:37 PM, Tom Lane wrote: > Whichever way the current discussion about Unicode literals turns out, > it's clear that plpgsql is not up to speed on matching the core lexer's > behavior --- it's wrong anyway with respect to > standard_conforming_strings. > > I had earlier speculated semi-facetiously about ripping out the plpgsql > lexer altogether, but the more I think about it the less silly the idea > looks. Suppose that we change the core lexer so that the keyword lookup > table it's supposed to use is passed to scanner_init() rather than being > hard-wired in. Then make plpgsql call the core lexer using its own > keyword table. Everything else would match core lexical behavior > automatically. The special behavior that we do want, such as being > able to construct a string representing a desired subrange of the input, > could all be handled in plpgsql-specific wrapper code. > > I've just spent a few minutes looking for trouble spots in this theory, > and so far the only real ugliness I can see is that plpgsql treats > ":=" and ".." as single tokens whereas the core would parse them as two > tokens. We could hack the core lexer to have an additional switch that > controls that. Or maybe just make it always return them as single > tokens --- AFAICS, neither combination is legal in core SQL anyway, > so this would only result in a small change in the exact syntax error > you get if you write such a thing in core SQL. > > Another trouble spot is the #option syntax, but that could be handled > by a special-purpose prescan, or just dropped altogether; it's not like > we've ever used that for anything but debugging. > > It looks like this might take about a day's worth of work (IOW two > or three days real time) to get done. > > Normally I'd only consider doing such a thing during development phase, > but since we're staring at at least one and maybe two bugs that are > going to be hard to fix in any materially-less-intrusive way, I'm > thinking about doing it now. Theoretically this change shouldn't break > any working code, so letting it hit the streets in 8.4beta2 doesn't seem > totally unreasonable. > > Comments, objections, better ideas? All this sounds good. As for how to handle := and .., I think making them lex the same way in PL/pgsql and core SQL would be a good thing. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
On Tue, Apr 14, 2009 at 2:55 PM, Tom Lane wrote: > Robert Haas writes: >> Well, that's fine, but that's a long way from Peter's statement that >> "I think the tendency should be to get rid of E'' usage". > > Bear in mind that that's Peter's opinion; it's not necessarily shared > by anyone else. I was just responding to your assertion of the > diametrically opposed position that non-E strings are useless (which > I don't share either). Useless might be an overstatement, but I certainly have gotten no benefit out of them. It seems decidedly odd to me to propose that users embed literal control characters in their code/SQL scripts in lieu of using escape sequences. If that were a normal and reasonable thing to do then I expect C, C++, Perl, Python, Ruby, Javascript, and countless other languages wouldn't provide this functionality. In reality, most of them do provide it, sometimes as the only option. And as to your point about Windows pathnames, if you're trying to avoid escaping lots of backslashes, dollar-quoting is a perfectly adequate solution. The real motivation for standard_conforming_strings is the one embedded in the name. And I'm fine with that. But I don't see E'' disappearing from my code any time soon. I periodically have a need for things like foo || E'\n' || bar and it screws up the formatting of the code to write foo || ' ' || bar. Writing it that way, it's also entirely non-obvious whether there are any spaces or tabs after the opening quote and before the newline, and if so whether or not they are intentional. But I don't mind if someone ELSE likes the new, standard strings - it just isn't me. :-) ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Solution of the file name problem of copy on windows.
Itagaki Takahiro writes: > There are some issues: > * Is it possible to determine the platform encoding? There is no platform encoding in linux. File name encoding depend on user locale, so different users can have different encoding of file name. -- Sergey Burladyan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Replacing plpgsql's lexer
Whichever way the current discussion about Unicode literals turns out, it's clear that plpgsql is not up to speed on matching the core lexer's behavior --- it's wrong anyway with respect to standard_conforming_strings. I had earlier speculated semi-facetiously about ripping out the plpgsql lexer altogether, but the more I think about it the less silly the idea looks. Suppose that we change the core lexer so that the keyword lookup table it's supposed to use is passed to scanner_init() rather than being hard-wired in. Then make plpgsql call the core lexer using its own keyword table. Everything else would match core lexical behavior automatically. The special behavior that we do want, such as being able to construct a string representing a desired subrange of the input, could all be handled in plpgsql-specific wrapper code. I've just spent a few minutes looking for trouble spots in this theory, and so far the only real ugliness I can see is that plpgsql treats ":=" and ".." as single tokens whereas the core would parse them as two tokens. We could hack the core lexer to have an additional switch that controls that. Or maybe just make it always return them as single tokens --- AFAICS, neither combination is legal in core SQL anyway, so this would only result in a small change in the exact syntax error you get if you write such a thing in core SQL. Another trouble spot is the #option syntax, but that could be handled by a special-purpose prescan, or just dropped altogether; it's not like we've ever used that for anything but debugging. It looks like this might take about a day's worth of work (IOW two or three days real time) to get done. Normally I'd only consider doing such a thing during development phase, but since we're staring at at least one and maybe two bugs that are going to be hard to fix in any materially-less-intrusive way, I'm thinking about doing it now. Theoretically this change shouldn't break any working code, so letting it hit the streets in 8.4beta2 doesn't seem totally unreasonable. Comments, objections, better ideas? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
On 4/14/09, Tom Lane wrote: > Peter Eisentraut writes: > > On Tuesday 14 April 2009 18:54:33 Tom Lane wrote: > >> The other proposal that seemed > >> attractive to me was a decode-like function: > >> > >> uescape('foo\00e9bar') > >> uescape('foo\00e9bar', '\') > > > This was discussed previously, but rejected with the following argument: > > > There are some other disadvantages for making a function call. You > > couldn't use that kind of literal in any other place where the parser > > calls for a string constant: role names, tablespace locations, > > passwords, copy delimiters, enum values, function body, file names. > > > I'm less than convinced that those are really plausible use-cases for > characters that one is unable to type directly. However, I'll grant the > point. So that narrows us down to considering the \u extension to E'' > strings as a saner and safer alternative to the spec's syntax. My vote would go to \u. The U& may be "sql standard" but it's different from any established practical standard. Alternative would be to make U& follow stdstr setting: stdstr=on -> you get fully standard-conforming syntax: U&'\xxx' UESCAPE '\' stdstr=off -> you need to follow old quoting rules: U&'\\xxx' UESCAPE '\\' This would result in safe, and when stdstr=on, fully standard compliant syntax. Only downside would be that in practice - stdstr=off - it would be unusable. Third alternative would be to do both of them - \u as a usable method and safe-U& to mark the checkbox for SQL-standard compliance. If we do want U&, I would prefer that to U&-only syntax. -- marko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
Tom Lane wrote: > This is *not* about code within Postgres. One typically provides libraries for this sort of thing, but your point is taken; suggestion withdrawn. --mlp _ Meredith L. Patterson Founder and CTO Osogato, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
Tom Lane wrote: > I suspect that it's actually impossible to parse such a thing correctly > without a full-fledged flex lexer or something of equivalent complexity. > Certainly it's a couple of orders of magnitude harder than it is for > either standard-conforming or E'' literals. Is there a reason not to use a full-fledged flex lexer? I'd be willing to take a crack at such a thing, but I'm working 80-hour weeks through the end of June and likely wouldn't be able to put in any time on it till then. So I definitely couldn't promise anything for 8.4, but if putting it off till 8.5 works, sign me up. Cheers, --mlp _ Meredith L. Patterson Founder and CTO Osogato, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
"Meredith L. Patterson" writes: > Tom Lane wrote: >> I suspect that it's actually impossible to parse such a thing correctly >> without a full-fledged flex lexer or something of equivalent complexity. > Is there a reason not to use a full-fledged flex lexer? The point is that that's a pretty large imposition on client code that we don't control or maintain, in order to get a feature that could be gotten in much less dangerous ways that don't impact any code outside PG. > I'd be willing to take a crack at such a thing, but I'm working 80-hour > weeks through the end of June and likely wouldn't be able to put in any > time on it till then. So I definitely couldn't promise anything for 8.4, > but if putting it off till 8.5 works, sign me up. Shall we pass your name on to every package using Postgres, then? This is *not* about code within Postgres. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
On Tuesday 14 April 2009 21:48:12 Tom Lane wrote: > Peter Eisentraut writes: > > I think we can handle that and the cases Tom presents by erroring out > > when the U& syntax is used with stdstr off. > > I think you're missing the point --- this is not about whether the > syntax is unambiguous (it is already) but about whether a frontend that > doesn't understand it 100% will be secure against subversion. I have no > confidence in the latter assumption. I think I am getting the point quite well. Do you have an example how this can be subverted? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] psql with "Function Type" in \df
On Tue, Apr 14, 2009 at 03:04:55PM -0400, Alvaro Herrera wrote: > David Fetter wrote: > > On Tue, Apr 14, 2009 at 02:52:32PM -0400, Alvaro Herrera wrote: > > > > I think it's good to have them translatable. As for using > > > "aggregate" instead of "agg" I don't think it's that great an > > > idea. If you need to notify translators that "agg" stands for > > > "aggregate", add a /* translator: */ comment. > > > > Where would I add that? First mention, each time, or...? > > Is there more than one mention of "agg"? It's in 3 branches in describe.c. Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] psql with "Function Type" in \df
David Fetter wrote: > On Tue, Apr 14, 2009 at 02:52:32PM -0400, Alvaro Herrera wrote: > > I think it's good to have them translatable. As for using > > "aggregate" instead of "agg" I don't think it's that great an idea. > > If you need to notify translators that "agg" stands for "aggregate", > > add a /* translator: */ comment. > > Where would I add that? First mention, each time, or...? Is there more than one mention of "agg"? -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] psql with "Function Type" in \df
On Tue, Apr 14, 2009 at 02:52:32PM -0400, Alvaro Herrera wrote: > Tom Lane wrote: > > > I had a second thought about that: presumably we should make the > > function type names translatable. If we do that, it might be > > better to make the aggregate case be "aggregate" and take the > > width hit. Otherwise translators are going to be puzzled when > > they come across "agg" as a translatable phrase. > > I think it's good to have them translatable. As for using > "aggregate" instead of "agg" I don't think it's that great an idea. > If you need to notify translators that "agg" stands for "aggregate", > add a /* translator: */ comment. Where would I add that? First mention, each time, or...? Cheers, David (reworking patch per suggestions) -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
Robert Haas writes: > Well, that's fine, but that's a long way from Peter's statement that > "I think the tendency should be to get rid of E'' usage". Bear in mind that that's Peter's opinion; it's not necessarily shared by anyone else. I was just responding to your assertion of the diametrically opposed position that non-E strings are useless (which I don't share either). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] psql with "Function Type" in \df
Tom Lane wrote: > I had a second thought about that: presumably we should make the > function type names translatable. If we do that, it might be better > to make the aggregate case be "aggregate" and take the width hit. > Otherwise translators are going to be puzzled when they come across > "agg" as a translatable phrase. I think it's good to have them translatable. As for using "aggregate" instead of "agg" I don't think it's that great an idea. If you need to notify translators that "agg" stands for "aggregate", add a /* translator: */ comment. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
Peter Eisentraut writes: > On Tuesday 14 April 2009 18:54:33 Tom Lane wrote: >> The other proposal that seemed >> attractive to me was a decode-like function: >> >> uescape('foo\00e9bar') >> uescape('foo\00e9bar', '\') > This was discussed previously, but rejected with the following argument: > There are some other disadvantages for making a function call. You > couldn't use that kind of literal in any other place where the parser > calls for a string constant: role names, tablespace locations, > passwords, copy delimiters, enum values, function body, file names. I'm less than convinced that those are really plausible use-cases for characters that one is unable to type directly. However, I'll grant the point. So that narrows us down to considering the \u extension to E'' strings as a saner and safer alternative to the spec's syntax. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
On Tue, Apr 14, 2009 at 2:22 PM, Tom Lane wrote: > Robert Haas writes: >> Maybe I've just got my head deeply in the sand, but I don't understand >> what the alternative to E'' supposedly is. How am I supposed to write >> the equivalent of E'\t\n\f' without using E''? The >> standard_conforming_strings syntax apparently supports no escapes of >> any kind, which seems so hideously inconvenient that I can't even >> imagine why someone wants that behavior. > > Well, quite aside from issues of compatibility with standards and other > databases, I'm sure there are lots of Windows users who are more > interested in being able to store a Windows pathname without doubling > their backslashes than they are in being able to type readable names > for ASCII control characters. After all, in most cases you can get > those characters into a string just by typing them (especially if you > aren't using readline or something like it). Well, that's fine, but that's a long way from Peter's statement that "I think the tendency should be to get rid of E'' usage". It's only been a minor inconvenience for me to change my applications to use E'', but I certainly don't see how I could get by without it, and it's far more like other programming languages that I use (e.g. C) than the standard syntax is. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
Peter Eisentraut writes: > I think we can handle that and the cases Tom presents by erroring out > when the U& syntax is used with stdstr off. I think you're missing the point --- this is not about whether the syntax is unambiguous (it is already) but about whether a frontend that doesn't understand it 100% will be secure against subversion. I have no confidence in the latter assumption. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
On Tuesday 14 April 2009 17:32:00 Tom Lane wrote: > I admit that the SQL:2008 way also covers Unicode code > points in identifiers, which we can't emulate without a lexical change; > but frankly I think the use-case for that is so thin as to be almost > nonexistent. Who is going to choose identifiers that they cannot easily > type on their keyboards? For example, table names are automatically generated, or you write a test case for weird looking names, or you want to add special characters in an identifier that will later be displayed somewhere, or in general you are writing an application for a foreign language. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
On Tuesday 14 April 2009 18:54:33 Tom Lane wrote: > The other proposal that seemed > attractive to me was a decode-like function: > > uescape('foo\00e9bar') > uescape('foo\00e9bar', '\') This was discussed previously, but rejected with the following argument: There are some other disadvantages for making a function call. You couldn't use that kind of literal in any other place where the parser calls for a string constant: role names, tablespace locations, passwords, copy delimiters, enum values, function body, file names. Some of those appear to be very plausible use cases. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: add columns created and altered topg_proc and pg_class
On Tue, Apr 14, 2009 at 2:13 PM, Kevin Grittner wrote: > "Kevin Grittner" wrote: >> the timestamp column caused the copy to be about 11.3% larger > > Grabbed the wrong numbers. It's really 2.5%, but still Well, that's why Tom doesn't want to add it to pg_class. But putting it in a separate table will have no impact on the speed of anything except DDL statements, and even then it won't require copying the whole table, so the performance impact will be pretty minimal, so I think it should be all right. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
On Tuesday 14 April 2009 21:22:29 Tom Lane wrote: > BTW, does anyone know whether Unicode includes the ASCII control > characters ... ie, is \u0009 a name for tab? If so, maybe this > syntax is in part an attempt to cover that use-case in the standard. Yes on both. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
On Tuesday 14 April 2009 17:13:00 Marko Kreen wrote: > If the parsing does not happen in 2 passes and it does not take account > of stdstr setting then the default breakage would be: > >stdstr=off, U&' \' UESCAPE '!'. I think we can handle that and the cases Tom presents by erroring out when the U& syntax is used with stdstr off. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode support
On Apr 14, 2009, at 11:10 AM, Tom Lane wrote: Andrew Dunstan writes: I think there's a good case for some functions implementing the various Unicode normalization functions, though. I have no objection to that so long as the code footprint is in line with the utility gain (i.e. not all that much). If we have to bring in ICU or something similar to make it happen, the cost/benefit ratio looks pretty bad. I've no idea what it would require, but the mapping table must be pretty substantial. Still, I'd love to have this functionality in the database. Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
On Apr 14, 2009, at 11:22 AM, Tom Lane wrote: BTW, does anyone know whether Unicode includes the ASCII control characters ... ie, is \u0009 a name for tab? If so, maybe this syntax is in part an attempt to cover that use-case in the standard. Yes, you can use, e.g., in HTML to represent a tab character. Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
On Tuesday 14 April 2009 20:35:21 Robert Haas wrote: > Maybe I've just got my head deeply in the sand, but I don't understand > what the alternative to E'' supposedly is. How am I supposed to write > the equivalent of E'\t\n\f' without using E''? Well, the first alternative is to type those characters in literally. The second alternative is the U&'' syntax. ;-) The third alternative is to design applications that don't need this, because the processing behavior of those characters is quite unportable. But of course in some cases using the E'' syntax is the most convenient. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Warm Standby restore_command documentation (was: New trigger option of pg_standby)
I've been following the thread with growing lack of understanding why this is so hardly discussed, and I went back to the documentation of what the restore_command should do ( http://www.postgresql.org/docs/8.3/static/warm-standby.html ) While the algorithm presented in the pseudocode isn't dealing too good with a situation where the trigger is set while the restore_command is sleeping (this should be handled better in a real implementation), the code says "Restore all wal files. If no more wal files are present, stop restoring if the trigger is set; otherwise wait for a new wal file". Since pg_standby is meant as implementation of restore_command, it has to follow the directive stated above; *anything else is a bug*. pg_standby currently does *not* obey this directive, and has that documented, but a documented bug still is a bug. Conclusion: There's no "new trigger option" needed, instead pg_standby has to be fixed so it does what the warm standby option of postgres needs. The trigger is only to be examined if no more files are restorable, and only once. Regards, Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode support
>> I don't believe that the standard forbids the use of combining chars at all. >> RFC 3629 says: >> >> ... This issue is amenable to solutions based on Unicode Normalization >> Forms, see [UAX15]. > This is the relevant part. Tom was claiming that the UTF8 encoding required > normalizing the string of unicode codepoints before encoding. I'm not sure > that's true though, is it? No. I think Tom has mistaken this for the fact that the UTF8 encoding can have multiple byte representations for one and the same code point. The standard requires the shortest byte representation to be used. (Please see http://www.dwheeler.com/secure-programs/Secure-Programs-HOWTO/character-encoding.html for more information). However, this has nothing to do with *code point* normalization. The encoding does not require a code point sequence to be normalized. Infact, UTF-8 could hold any of the 4 different normalized forms, 2 of which are completely decomposed forms, that is, every accent takes up its own code point. Also, UTF-8 could hold non-normalized strings. Encodings just deal with how code points are represented in memory or over wires. > Another question is "what is the purpose of a database"? To me it would > be quite the wrong thing for the DB to not store what is presented, as > long as it's considered legal. Normalization of legal variant forms > seems pretty questionable. So I'm with the camp that says this is the > application's responsibility. What I did not mean is automatic normalization. I meant something like PG providing a function to normalize strings which can be explicitly called by the user in case it is needed. For example: SELECT * FROM table1 WHERE normalize(a, 'NFC') = normalize($1, 'NFC'); -- NFC is one of the 4 mentioned normalization forms and the one that should probably be used, since it combines code points rather than decomposing them. I completely agree that the database should never just normalize by itself, because it might be the users intention to store non-normalized strings. An exception might be an explicit configuration setting which tells PG to normalize automatically. In case of the above SELECT query, the problem of offloading the normalization to the app means, that every single application that is ever used with this database has to a) normalize the string, b) use the same normalization form. If just one application at one point in time fails to do so, string comparison is no longer safe (which is could be a security problem as the quoted RFC text says). But with a callable function like normalize() above, the user himself can choose whether it is important or not. That is, does he want code points to match (do not use normalize() then), or does he want characters to match (use normalize() then). The user can normalize the string exactly where it is needed (for comparison). I've searched PG's source code and it appeared to me that the 'text' type is just a typedef for 'varlena', the same type 'bytea' is based on. Given that the client and database encoding is the same, does this mean that text is internally stored in exactly the same binary representation the client has sent it in? So that if the client has sent it in any of the 4 normalized forms, PG guarantees to store and retrieve it (in case of a later SELECT) exactly as it was sent ("store what is presented")? In other words: does PG guarantuee the code point sequence to remain the same? Because if it does not, you cannot offload the normalization work to the app anyway, since PG would be allowed "un-normalize" it internally. Also, what happens if the client has a different encoding than the database, and PG has to internally convert client strings to UTF-8. Does it only generate code points in the same normalized form that it expects the user input to be in? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
Robert Haas writes: > Maybe I've just got my head deeply in the sand, but I don't understand > what the alternative to E'' supposedly is. How am I supposed to write > the equivalent of E'\t\n\f' without using E''? The > standard_conforming_strings syntax apparently supports no escapes of > any kind, which seems so hideously inconvenient that I can't even > imagine why someone wants that behavior. Well, quite aside from issues of compatibility with standards and other databases, I'm sure there are lots of Windows users who are more interested in being able to store a Windows pathname without doubling their backslashes than they are in being able to type readable names for ASCII control characters. After all, in most cases you can get those characters into a string just by typing them (especially if you aren't using readline or something like it). BTW, does anyone know whether Unicode includes the ASCII control characters ... ie, is \u0009 a name for tab? If so, maybe this syntax is in part an attempt to cover that use-case in the standard. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode support
On Tuesday 14 April 2009 19:26:41 Tom Lane wrote: > Another question is "what is the purpose of a database"? To me it would > be quite the wrong thing for the DB to not store what is presented, as > long as it's considered legal. Normalization of legal variant forms > seems pretty questionable. So I'm with the camp that says this is the > application's responsibility. I think automatically normalizing or otherwise fiddling with Unicode strings with combining characters is not acceptable. But the point is that we should process equivalent forms in a consistent way. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode support
On Monday 13 April 2009 20:18:31 - - wrote: > 2) PG has no support for the Unicode collation algorithm. Collation is > offloaded to the OS, which makes this quite inflexible. This argument is unclear. Do you want the Unicode collation algorithm or do you want flexibility? Some OS do implement the Unicode collation algorithm. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode support
On Tuesday 14 April 2009 18:49:45 Greg Stark wrote: > What's really at issue is "what is a string?". That is, it a sequence > of characters or a sequence of code points. I think a sequence of codepoints would be about as silly a definition as the antiquated notion of a string as a sequence of bytes. > If it's the former then we > would also have to prohibit certain strings such as U&'\0301' > entirely. And we have to make substr() pick out the right number of > code points, etc. Sure enough. That all goes along with what the original poster was saying. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode support
Greg Stark wrote: > Peter Eisentraut wrote: >> SELECT U&'\00E9', char_length(U&'\00E9'); >> ?column? | char_length >> --+- >> é| 1 >> (1 row) >> >> SELECT U&'\0065\0301', char_length(U&'\0065\0301'); >> ?column? | char_length >> --+- >> é| 2 >> (1 row) > > What's really at issue is "what is a string?". That is, it a > sequence of characters or a sequence of code points. Doesn't the SQL standard refer to them as "character string literals"? The function is called character_length or char_length. I'm curious -- can every multi-code-point character be normalized to a single-code-point character? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: add columns created and altered topg_proc and pg_class
"Kevin Grittner" wrote: > the timestamp column caused the copy to be about 11.3% larger Grabbed the wrong numbers. It's really 2.5%, but still -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode support
Andrew Dunstan writes: > I think there's a good case for some functions implementing the various > Unicode normalization functions, though. I have no objection to that so long as the code footprint is in line with the utility gain (i.e. not all that much). If we have to bring in ICU or something similar to make it happen, the cost/benefit ratio looks pretty bad. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: add columns created and altered to pg_proc and pg_class
Robert Haas wrote: > Making pg_class and pg_proc tables larger hurts run-time performance, > potentially. Making a separate table only slows down DDL operations, > which are much less frequent. Copying the pg_class table, with oids and indexes, with and without the addition of one timestamp column, the timestamp column caused the copy to be about 11.3% larger; so I see your point. I guess I didn't realize just how tight the pg_class table was. Given all that, I'm going to say that from my perspective I don't think the convenience of saving the information is worth the cost, with either approach. I understand it might mean more to others. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Regression failure on RHEL 4 w/ PostgreSQL 8.4 beta1
On Tue, 2009-04-14 at 11:16 -0400, Tom Lane wrote: > > So what changed between 8.3 and 8.4? Same box can build 8.3 with > > --with-system-tzdata . > > We didn't have 64-bit tzdata support before --- it's a new test > covering new functionality. Thanks Tom. Regards, -- Devrim GÜNDÜZ, RHCE devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr http://www.gunduz.org signature.asc Description: This is a digitally signed message part
Re: [HACKERS] Unicode support
Kevin Grittner wrote: I'm curious -- can every multi-code-point character be normalized to a single-code-point character? I don't believe so. Those combinations used in the most common orthographic languages have their own code points, but I understand you can use the combining chars with essentially any other chars, although it might not always make much sense to do so. That's important when you're inventing symbols in things like Mathematical and Scientific papers. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode support
David E. Wheeler wrote: On Apr 14, 2009, at 9:26 AM, Tom Lane wrote: Another question is "what is the purpose of a database"? To me it would be quite the wrong thing for the DB to not store what is presented, as long as it's considered legal. Normalization of legal variant forms seems pretty questionable. So I'm with the camp that says this is the application's responsibility. Can `convert()` normalize strings? I think that's handling a quite different problem. It certainly should not do so automatically, IMNSHO. I think there's a good case for some functions implementing the various Unicode normalization functions, though. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
On Tue, Apr 14, 2009 at 8:53 AM, Peter Eisentraut wrote: > This doesn't excite me. I think the tendency should be to get rid of E'' > usage, because its definition of escape sequences is single-byte and ASCII > centric and thus overall a legacy construct. Certainly, we will want to keep > around E'' for a long time or forever, but it is a legitimate goal for > application writers to not use it, which is after all the reason behind this > whole standards-conforming strings project. I wouldn't want to have a > forward-looking feature such as the Unicode escapes be burdened with that kind > of legacy behavior. > > Also note that Unicode escapes are also available for identifiers, for which > there is no existing E"" that you can add it to. Maybe I've just got my head deeply in the sand, but I don't understand what the alternative to E'' supposedly is. How am I supposed to write the equivalent of E'\t\n\f' without using E''? The standard_conforming_strings syntax apparently supports no escapes of any kind, which seems so hideously inconvenient that I can't even imagine why someone wants that behavior. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode support
On Apr 14, 2009, at 9:26 AM, Tom Lane wrote: Another question is "what is the purpose of a database"? To me it would be quite the wrong thing for the DB to not store what is presented, as long as it's considered legal. Normalization of legal variant forms seems pretty questionable. So I'm with the camp that says this is the application's responsibility. Can `convert()` normalize strings? Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] psql with "Function Type" in \df
On Tue, Apr 14, 2009 at 12:35:21PM -0400, Tom Lane wrote: > David Fetter writes: > > On Mon, Apr 13, 2009 at 07:24:31PM -0400, Tom Lane wrote: > >> I'd go for something like > >> > >> Type > >> > >> window > >> agg > >> trigger > >> normal > >> > >> Or we could spell out "aggregate", but that makes the column a > >> couple of characters wider ... > > > Done. > > I had a second thought about that: presumably we should make the > function type names translatable. If we do that, it might be better > to make the aggregate case be "aggregate" and take the width hit. > Otherwise translators are going to be puzzled when they come across > "agg" as a translatable phrase. OK > Or maybe I'm overthinking that problem. Comments from anyone who > actually does translations? > > > I've also added \df[antw], which lets people narrow their search. > > Why didn't you make it work like \d[tisv], ie allow more than one > letter? If you're going to be inventing new features at this late > date, they should at least work like the adjacent precedent. Because I didn't think of it. Lemme see about that :) Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode support
Greg Stark writes: > What's really at issue is "what is a string?". That is, it a sequence > of characters or a sequence of code points. If it's the former then we > would also have to prohibit certain strings such as U&'\0301' > entirely. And we have to make substr() pick out the right number of > code points, etc. Another question is "what is the purpose of a database"? To me it would be quite the wrong thing for the DB to not store what is presented, as long as it's considered legal. Normalization of legal variant forms seems pretty questionable. So I'm with the camp that says this is the application's responsibility. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] psql with "Function Type" in \df
David Fetter writes: > On Mon, Apr 13, 2009 at 07:24:31PM -0400, Tom Lane wrote: >> I'd go for something like >> >> Type >> >> window >> agg >> trigger >> normal >> >> Or we could spell out "aggregate", but that makes the column a >> couple of characters wider ... > Done. I had a second thought about that: presumably we should make the function type names translatable. If we do that, it might be better to make the aggregate case be "aggregate" and take the width hit. Otherwise translators are going to be puzzled when they come across "agg" as a translatable phrase. Or maybe I'm overthinking that problem. Comments from anyone who actually does translations? > I've also added \df[antw], which lets people narrow their search. Why didn't you make it work like \d[tisv], ie allow more than one letter? If you're going to be inventing new features at this late date, they should at least work like the adjacent precedent. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: add columns created and altered to pg_proc and pg_class
Robert Haas writes: > On Tue, Apr 14, 2009 at 10:27 AM, Kevin Grittner >> Yeah, if it would be too heavy to add a timestamp column or two to >> pg_class and maybe one or two others, why is it better to add a whole >> new table to maintain in parallel -- with it's own primary key, >> foreign keys (or similar integrity enforcement mechanism), etc. > Making pg_class and pg_proc tables larger hurts run-time performance, > potentially. Making a separate table only slows down DDL operations, > which are much less frequent. And even more to the point, adding columns to the core system tables means you pay the performance cost *even when not using the feature*. We normally expect that inessential features should avoid making a performance impact on those who have no use for them. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
On Saturday 11 April 2009 18:20:47 Sam Mason wrote: > I can't see much support in the other database engines; searched for > Oracle, MS-SQL, DB2 and Firebird. MySQL has it planned for 7.1, so not > for a while. DB2 supports it, as far as I know. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
On Tuesday 14 April 2009 14:38:38 Marko Kreen wrote: > I think the problem is that they should not act like E'' strings, but they > should act like plain '' strings - they should follow stdstr setting. > > That way existing tools that may (or may not..) understand E'' and stdstr > settings, but definitely have not heard about U&'' strings can still > parse the SQL without new surprises. Can you be more specific in what "surprises" you expect? What algorithms do you suppose those "existing tools" use and what expectations do they have? > I still stand on my proposal, how about extending E'' strings with > unicode escapes (eg. \u)? The E'' strings are already more > clearly defined than '' and they are our "own", we don't need to > consider random standards, but can consider our sanity. This doesn't excite me. I think the tendency should be to get rid of E'' usage, because its definition of escape sequences is single-byte and ASCII centric and thus overall a legacy construct. Certainly, we will want to keep around E'' for a long time or forever, but it is a legitimate goal for application writers to not use it, which is after all the reason behind this whole standards-conforming strings project. I wouldn't want to have a forward-looking feature such as the Unicode escapes be burdened with that kind of legacy behavior. Also note that Unicode escapes are also available for identifiers, for which there is no existing E"" that you can add it to. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
Marko Kreen writes: > I would prefer that such quoting extensions would wait until > stdstr=on setting is the only mode Postgres will operate. > Fitting new quoting ways to environment with flippable stdstr setting > will be rather painful for everyone. It would certainly be a lot safer to wait until non-standard-conforming strings don't exist anymore. The problem is that that may never happen, and is certainly not on the roadmap to happen in the foreseeable future. > I still stand on my proposal, how about extending E'' strings with > unicode escapes (eg. \u)? The E'' strings are already more > clearly defined than '' and they are our "own", we don't need to > consider random standards, but can consider our sanity. That's one way we could proceed. The other proposal that seemed attractive to me was a decode-like function: uescape('foo\00e9bar') uescape('foo\00e9bar', '\') (double all the backslashes if you assume not standard_conforming_strings). The arguments in favor of this one are (1) you can apply it to the result of an expression, it's not strictly tied to literals; and (2) it's a lot lower-footprint solution since it doesn't affect basic literal handling. If you wish to suppose that this is only a stopgap until someday when we can implement the SQL standard syntax more safely, then low footprint is good. One could even imagine back-porting this into existing releases as a user-defined function. The solution with \u in extended literals is probably workable too. I'm slightly worried about the possibility of issues with code that thinks it knows what an E-literal means but doesn't really. In particular something might think it knows that "\u" just means "u", and proceed to strip the backslash. I don't see a path for that to become a security hole though, only a garden-variety bug. So I could live with that one on the grounds of being easier to use (which it would be, because of less typing compared to uescape()). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode support
On Tue, Apr 14, 2009 at 1:32 PM, Peter Eisentraut wrote: > On Monday 13 April 2009 22:39:58 Andrew Dunstan wrote: >> Umm, but isn't that because your encoding is using one code point? >> >> See the OP's explanation w.r.t. canonical equivalence. >> >> This isn't about the number of bytes, but about whether or not we should >> count characters encoded as two or more combined code points as a single >> char or not. > > Here is a test case that shows the problem (if your terminal can display > combining characters (xterm appears to work)): > > SELECT U&'\00E9', char_length(U&'\00E9'); > ?column? | char_length > --+- > é | 1 > (1 row) > > SELECT U&'\0065\0301', char_length(U&'\0065\0301'); > ?column? | char_length > --+- > é | 2 > (1 row) What's really at issue is "what is a string?". That is, it a sequence of characters or a sequence of code points. If it's the former then we would also have to prohibit certain strings such as U&'\0301' entirely. And we have to make substr() pick out the right number of code points, etc. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode support
On Tuesday 14 April 2009 07:07:27 Andrew Gierth wrote: > FWIW, the SQL spec puts the onus of normalization squarely on the > application; the database is allowed to assume that Unicode strings > are already normalized, is allowed to behave in implementation-defined > ways when presented with strings that aren't normalized, and provision > of normalization functions and predicates is just another optional > feature. Can you name chapter and verse on that? I see this, for example, 6.27 5) If a is specified, then Case: a) If the character encoding form of is not UTF8, UTF16, or UTF32, then let S be the . Case: i) If the most specific type of S is character string, then the result is the number of characters in the value of S. NOTE 134 — The number of characters in a character string is determined according to the semantics of the character set of that character string. ii) Otherwise, the result is OCTET_LENGTH(S). b) Otherwise, the result is the number of explicit or implicit in , counted in accordance with the definition of those units in the relevant normatively referenced document. So SQL redirects the question of character length the Unicode standard. I have not been able to find anything there on a quick look, but I'm sure the Unicode standard has some very specific ideas on this. Note that the matter of normalization is not mentioned here. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode support
On Monday 13 April 2009 22:39:58 Andrew Dunstan wrote: > Umm, but isn't that because your encoding is using one code point? > > See the OP's explanation w.r.t. canonical equivalence. > > This isn't about the number of bytes, but about whether or not we should > count characters encoded as two or more combined code points as a single > char or not. Here is a test case that shows the problem (if your terminal can display combining characters (xterm appears to work)): SELECT U&'\00E9', char_length(U&'\00E9'); ?column? | char_length --+- é| 1 (1 row) SELECT U&'\0065\0301', char_length(U&'\0065\0301'); ?column? | char_length --+- é| 2 (1 row) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
Marko Kreen wrote: I still stand on my proposal, how about extending E'' strings with unicode escapes (eg. \u)? The E'' strings are already more clearly defined than '' and they are our "own", we don't need to consider random standards, but can consider our sanity. I suspect there would be lots more support in the user community, where \u is well understood in a number of contexts (Java and ECMAScript, for example). It's also tolerably sane. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Regression failure on RHEL 4 w/ PostgreSQL 8.4 beta1
Devrim =?ISO-8859-1?Q?G=DCND=DCZ?= writes: > On Mon, 2009-04-13 at 15:31 -0400, Tom Lane wrote: >> This test is checking whether you have working 64-bit-tzdata support. >> It seems you don't. > So what changed between 8.3 and 8.4? Same box can build 8.3 with > --with-system-tzdata . We didn't have 64-bit tzdata support before --- it's a new test covering new functionality. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Regression failure on RHEL 4 w/ PostgreSQL 8.4 beta1
On Mon, 2009-04-13 at 15:31 -0400, Tom Lane wrote: > > I'm getting the following failure on RHEL 4: > > > http://www.gunduz.org/temp/regression.out > > http://www.gunduz.org/temp/regression.diffs > > This test is checking whether you have working 64-bit-tzdata support. > It seems you don't. > > If you built with --with-system-tzdata, and RHEL4 doesn't include > 64-bit tzdata files, then this failure would be expected. (I'm not > totally sure about the second premise, but some quick digging in > the specfile's changelog suggests that Red Hat only started to > support 64-bit tzdata in RHEL5.) So what changed between 8.3 and 8.4? Same box can build 8.3 with --with-system-tzdata . Regards, -- Devrim GÜNDÜZ, RHCE devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr http://www.gunduz.org signature.asc Description: This is a digitally signed message part
Re: [HACKERS] proposal: add columns created and altered to pg_proc and pg_class
On Tue, Apr 14, 2009 at 10:27 AM, Kevin Grittner wrote: > Pavel Stehule wrote: >> I though about it too. But I am not sure, if this isn't too >> complicated solution for simple task. If I thing little bit more - >> main important is timestamp of last change. > > Yeah, if it would be too heavy to add a timestamp column or two to > pg_class and maybe one or two others, why is it better to add a whole > new table to maintain in parallel -- with it's own primary key, > foreign keys (or similar integrity enforcement mechanism), etc. Making pg_class and pg_proc tables larger hurts run-time performance, potentially. Making a separate table only slows down DDL operations, which are much less frequent. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
Peter Eisentraut writes: > On Saturday 11 April 2009 00:54:25 Tom Lane wrote: >> If we let this go into 8.4, our previous rounds with security holes >> caused by careless string parsing will look like a day at the beach. > Note that the escape character marks the Unicode escapes; it doesn't > affect the quote characters that delimit the string. So offhand I > can't see any potential for quote confusion/SQL injection type > problems. Please elaborate if you see a problem. The problem is the interaction with non-standard-conforming strings. Case 1: select u&'foo\' uescape ',' ... The backend will see the backslash as just a data character, and will think that "..." is live SQL text. A non-Unicode-literal-aware frontend will think that the backslash escapes the second quote, the comma is live SQL text, and the ... is quoted material. Construction of an actual SQL injection attack is left as an exercise for the reader, but certainly the raw material is here. Case 2: select u&'foo' uescape '\' ... Again, any existing frontend code will think that the backslash quotes the final quote and the ... is quoted material. This one is particularly nasty because we allow arbitrary amounts of whitespace and commenting on either side of "uescape": select u&'foo' /* hello joe, do you /* understand nested comments today? */ -- yes, this one too */ uescape -- but not this one /* '\' ... I suspect that it's actually impossible to parse such a thing correctly without a full-fledged flex lexer or something of equivalent complexity. Certainly it's a couple of orders of magnitude harder than it is for either standard-conforming or E'' literals. Case 3: select u&'foo\' uescape ',' ... select u & 'foo\' uescape ',' ... In the first form the ... is live SQL, in the second form it is quoted material. This means that you might correctly validate a query and then have your results invalidated by later processing that innocently adds or removes whitespace. (This is particularly nasty in a standard that demands we parse "x/-1" and "x / -1" the same ...) So what we've got here is a whole new set of potential SQL injection attacks by confusing frontend literal-syntax checking, plus a truly staggering increase in the minimum *required* complexity of such checking. I understand the usefulness of being able to write Unicode code points, but they're not useful enough to justify this syntax. This thread has already mentioned a couple of ways we could add the same facility without making any lexical-level changes, at least for data values. I admit that the SQL:2008 way also covers Unicode code points in identifiers, which we can't emulate without a lexical change; but frankly I think the use-case for that is so thin as to be almost nonexistent. Who is going to choose identifiers that they cannot easily type on their keyboards? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: add columns created and altered to pg_proc and pg_class
Pavel Stehule wrote: > I though about it too. But I am not sure, if this isn't too > complicated solution for simple task. If I thing little bit more - > main important is timestamp of last change. Yeah, if it would be too heavy to add a timestamp column or two to pg_class and maybe one or two others, why is it better to add a whole new table to maintain in parallel -- with it's own primary key, foreign keys (or similar integrity enforcement mechanism), etc. Others apparently see a bigger advantage to this than I, but if it's not something I can just eyeball while I'm looking at the object definition, it isn't likely to save me much over going to other sources. Let's not over-engineer this. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
On 4/14/09, Peter Eisentraut wrote: > On Tuesday 14 April 2009 14:38:38 Marko Kreen wrote: > > I think the problem is that they should not act like E'' strings, but they > > should act like plain '' strings - they should follow stdstr setting. > > > > That way existing tools that may (or may not..) understand E'' and stdstr > > settings, but definitely have not heard about U&'' strings can still > > parse the SQL without new surprises. > > > Can you be more specific in what "surprises" you expect? What algorithms do > you suppose those "existing tools" use and what expectations do they have? If the parsing does not happen in 2 passes and it does not take account of stdstr setting then the default breakage would be: stdstr=off, U&' \' UESCAPE '!'. And anything, whose security or functionality depends on parsing SQL can be broken that way. Broken functionality would be eg. Slony (or other replication solution) distributing developer-written SQL code to bunch of nodes. It needs to parse text file to SQL statements and execute them separately. There are probably other solutions who expect to understand SQL at least token level to function correctly. (pgpool, java has probably something depending on it, etc.) > > I still stand on my proposal, how about extending E'' strings with > > unicode escapes (eg. \u)? The E'' strings are already more > > clearly defined than '' and they are our "own", we don't need to > > consider random standards, but can consider our sanity. > > > This doesn't excite me. I think the tendency should be to get rid of E'' > usage, because its definition of escape sequences is single-byte and ASCII > centric and thus overall a legacy construct. Why are you concentrating only on \0xx escapes? The \\, \n, etc seem standard and forward looking enough. Yes, unicode escapes are missing but we can add them without breaking anything. > Certainly, we will want to keep > around E'' for a long time or forever, but it is a legitimate goal for > application writers to not use it, which is after all the reason behind this > whole standards-conforming strings project. I wouldn't want to have a > forward-looking feature such as the Unicode escapes be burdened with that > kind > of legacy behavior. > > Also note that Unicode escapes are also available for identifiers, for which > there is no existing E"" that you can add it to. Well, I was not rejecting the standard quoting, but suggesting postponing until the stdstr mess is sorted out. We can use \uXX in meantime and I think most Postgres users would prefer to keep using it... -- marko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Windowing functions vs aggregates
Greg Stark writes: > However, I'm kind of confused by that result. Why does the range > "between unbounded preceding and current row" seem to be doing the > average of the whole result set? That's what it's supposed to do. "Current row" really includes all peers of the current row in the window frame ordering, and since you didn't specify any ORDER BY clause, all the rows are peers. If you put in "order by s" you'll get the result you were expecting: regression=# select s,(avg(s) OVER (range between unbounded preceding and current row)) from foo; s |avg ---+ 1 | 2.5000 2 | 2.5000 3 | 2.5000 4 | 2.5000 (4 rows) regression=# select s,(avg(s) OVER (order by s range between unbounded preceding and current row)) from foo; s | avg ---+ 1 | 1. 2 | 1.5000 3 | 2. 4 | 2.5000 (4 rows) I suppose the SQL committee defined it like that to try to reduce the implementation dependency of the results. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Windowing functions vs aggregates
Teodor Sigaev writes: > Cast of aggregate's type works: > # select avg(s)::int4 from foo; > but that doesn't work for with new windowing functions interface: > # select avg(s)::int4 OVER () from foo; > ERROR: syntax error at or near "OVER" > LINE 1: select avg(s)::int4 OVER () from foo; > Is that intentional? You would need to put the cast around the whole "foo() over ()" construct. That's not a divisible unit. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why isn't stats_temp_directory automatically created?
Fujii Masao escreveu: Is it worth making the patch which creates stats_temp_directory if not present? +1. -- Euler Taveira de Oliveira http://www.timbira.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Windowing functions vs aggregates
2009/4/14 Teodor Sigaev : > select avg(s)::int4 OVER () from foo; You can put the cast outside the window expression such as: postgres=# select s,(avg(s) OVER (range between unbounded preceding and current row))::int4 from foo; s | avg ---+- 1 | 2 2 | 2 3 | 2 (3 rows) However, I'm kind of confused by that result. Why does the range "between unbounded preceding and current row" seem to be doing the average of the whole result set? This is not related to the cast: postgres=# select s,avg(s) OVER (range between unbounded preceding and current row) from foo; s |avg ---+ 1 | 2. 2 | 2. 3 | 2. (3 rows) I haven't recompiled recently and I do recall some bug fixes a while back. Was this that? I'm recompiling now. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Windowing functions vs aggregates
Cast of aggregate's type works: # select avg(s)::int4 from foo; but that doesn't work for with new windowing functions interface: # select avg(s)::int4 OVER () from foo; ERROR: syntax error at or near "OVER" LINE 1: select avg(s)::int4 OVER () from foo; Is that intentional? -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] libpq is not thread safe
When postgreSQL is compiled with --thread-safe that libpq should be thread safe. But it is not true when somebody call fork(). The problem is that fork() forks only active threads and some mutex can stay locked by another thread. We use ssl_config mutex which is global. We need implement atfork handlers to fix this. See http://www.opengroup.org/onlinepubs/009695399/functions/pthread_atfork.html We should add pthread_atfork into _ini libpq section. Another problem with fork is that new process inherit connections and so on. Which is not also good, but it is happened also on single threaded application and developer can fix it in own code. Maybe some notice in documentation should help what application should do after fork. Comments? Zdenek -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
On 4/14/09, Peter Eisentraut wrote: > On Saturday 11 April 2009 00:54:25 Tom Lane wrote: > > It gets worse though: I have seldom seen such a badly designed piece of > > syntax as the Unicode string syntax --- see > > http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html#SQL > >-SYNTAX-STRINGS-UESCAPE > > > > You scan the string, and then after that they tell you what the escape > > character is!? Not to mention the obvious ambiguity with & as an > > operator. > > > > If we let this go into 8.4, our previous rounds with security holes > > caused by careless string parsing will look like a day at the beach. > > No frontend that isn't fully cognizant of the Unicode string syntax is > > going to parse such things correctly --- it's going to be trivial for > > a bad guy to confuse a quoting mechanism as to what's an escape and what > > isn't. > > > Note that the escape character marks the Unicode escapes; it doesn't affect > the > quote characters that delimit the string. So offhand I can't see any > potential > for quote confusion/SQL injection type problems. Please elaborate if you see > a problem. > > If there are problems, we could consider getting rid of the UESCAPE clause. > Without it, the U&'' strings would behave much like the E'' strings. But I'd > like to understand the problem first. I think the problem is that they should not act like E'' strings, but they should act like plain '' strings - they should follow stdstr setting. That way existing tools that may (or may not..) understand E'' and stdstr settings, but definitely have not heard about U&'' strings can still parse the SQL without new surprises. If they already act that way then keeping U& should be fine. And if UESCAPE does not affect main string parsing, but is handled in second pass going over parsed string - like bytea \ - then that should also be fine and should not cause any new surprises. But if not, it must go. I would prefer that such quoting extensions would wait until stdstr=on setting is the only mode Postgres will operate. Fitting new quoting ways to environment with flippable stdstr setting will be rather painful for everyone. I still stand on my proposal, how about extending E'' strings with unicode escapes (eg. \u)? The E'' strings are already more clearly defined than '' and they are our "own", we don't need to consider random standards, but can consider our sanity. -- marko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
On Saturday 11 April 2009 21:50:29 Josh Berkus wrote: > On 4/11/09 11:47 AM, Marko Kreen wrote: > > On 4/11/09, Tom Lane wrote: > >> It gets worse though: I have seldom seen such a badly designed piece > >> of syntax as the Unicode string syntax --- see > >> > >> http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html# > >>SQL-SYNTAX-STRINGS-UESCAPE > > WTF? Whose feature is this? What's the use case? The use case is approximately the same as in HTML: entering Unicode characters that your screen or keyboard cannot easily produce. It's a desperately needed feature for me. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
On Saturday 11 April 2009 00:54:25 Tom Lane wrote: > It gets worse though: I have seldom seen such a badly designed piece of > syntax as the Unicode string syntax --- see > http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html#SQL >-SYNTAX-STRINGS-UESCAPE > > You scan the string, and then after that they tell you what the escape > character is!? Not to mention the obvious ambiguity with & as an > operator. > > If we let this go into 8.4, our previous rounds with security holes > caused by careless string parsing will look like a day at the beach. > No frontend that isn't fully cognizant of the Unicode string syntax is > going to parse such things correctly --- it's going to be trivial for > a bad guy to confuse a quoting mechanism as to what's an escape and what > isn't. Note that the escape character marks the Unicode escapes; it doesn't affect the quote characters that delimit the string. So offhand I can't see any potential for quote confusion/SQL injection type problems. Please elaborate if you see a problem. If there are problems, we could consider getting rid of the UESCAPE clause. Without it, the U&'' strings would behave much like the E'' strings. But I'd like to understand the problem first. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New trigger option of pg_standby
Hi, On Tue, Apr 14, 2009 at 6:35 PM, Simon Riggs wrote: > On Mon, 2009-04-13 at 14:52 +0900, Fujii Masao wrote: > >> A lookahead (the +1) may have pg_standby get stuck as follows. >> Am I missing something? >> >> 1. the trigger file containing "smart" is created. >> 2. pg_standby is executed. >> 2-1. nextWALfile is restored. >> 2-2. the trigger file is deleted because nextWALfile+1 doesn't exist. >> 3. the restored nextWALfile is applied. >> 4. pg_standby is executed again to restore nextWALfile+1. > > This can't happen. (4) will never occur when (2-2) has occurred. A > non-zero error code means file not available which will cause recovery > to end and hence no requests for further WAL files are made. When pg_standby exits with non-zero code, (3) and (4) will never occur, and the transactions in nextWALfile will be lost. So, in (2-2), pg_standby has to call exit(0), I think. On the other hand, if exit(0) is called in (2-2), the above scenario happens. > It does *seem* as if there is a race condition there in that another WAL > file may arrive after we have taken the decision there are no more WAL > files, but it's not a problem. That could happen if we issue the trigger > while the master is still up, which is a mistake - why would we do that? > If we only issue the trigger once we are happy the master is down then > we don't get a problem. Yeah, I agree that such race condition is not a problem. The trigger file has to be created after all the WAL files arrive at the standby server. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New trigger option of pg_standby
On Mon, 2009-04-13 at 14:52 +0900, Fujii Masao wrote: > A lookahead (the +1) may have pg_standby get stuck as follows. > Am I missing something? > > 1. the trigger file containing "smart" is created. > 2. pg_standby is executed. > 2-1. nextWALfile is restored. > 2-2. the trigger file is deleted because nextWALfile+1 doesn't exist. > 3. the restored nextWALfile is applied. > 4. pg_standby is executed again to restore nextWALfile+1. This can't happen. (4) will never occur when (2-2) has occurred. A non-zero error code means file not available which will cause recovery to end and hence no requests for further WAL files are made. It does *seem* as if there is a race condition there in that another WAL file may arrive after we have taken the decision there are no more WAL files, but it's not a problem. That could happen if we issue the trigger while the master is still up, which is a mistake - why would we do that? If we only issue the trigger once we are happy the master is down then we don't get a problem. So lets do it the next+1 way, when triggered. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New trigger option of pg_standby
Hi, On Mon, Apr 13, 2009 at 2:52 PM, Fujii Masao wrote: > But, a lookahead nextWALfile seems to work fine. > > if (triggered) > { > if (smartMode && nextWALfile exists) > exit(0) > else > { > delete trigger file > exit(1) > } > } Umm... in this algorithm, the trigger file remains after failover if the nextWALfile has the invalid record which means the end of WAL files. I'd like to propose another simple idea; pg_standby deletes the trigger file *whenever* the nextWALfile is a timeline history file. A timeline history file is restored at the end of recovery, so it's guaranteed that the trigger file is deleted whether nextWALfile exists or not. A timeline history file is restored also at the beginning of recovery, so the accidentally remaining trigger file is deleted in early warm-standby as a side-effect of this idea. How does that sound? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Why isn't stats_temp_directory automatically created?
Hi, log_directory is automatically created if not present when starting the database server. But, stats_temp_directory is not created. Why? ISTM that current behavior is undesirable. Is it worth making the patch which creates stats_temp_directory if not present? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers