Re: [HACKERS] [BUGS] COPY FROM/TO losing a single byte of a multibyte UTF-8 sequence
On Aug 19, 2010, at 2:35 PM, Tom Lane wrote: > Steven Schlansker writes: >> I'm having a rather annoying problem - a particular string is causing the >> Postgres COPY functionality to lose a byte, causing data corruption in >> backups and transferred data. > > I was able to reproduce this on my own Mac. Some tracing shows that the > problem is that isspace(0x85) returns true when in locale en_US.utf-8. > This causes array_in to drop the final byte of the array element string, > thinking that it's insignificant whitespace. The 0x85 seems to be the second byte of a multibyte UTF-8 sequence. I'm not at all experienced with character encodings so I could be totally off base, but isn't it wrong to ever call isspace(0x85), whatever the result may be, given that the actual character is 0xCF85? (U+03C5, GREEK SMALL LETTER UPSILON) > I believe that you must > not have produced the data file data.copy on a Mac, or at least not in > that locale setting, because array_out should have double-quoted the > array element given that behavior of isspace(). Correct, it was produced on a Linux machine. That said, the charset there was also UTF-8. > > Now, it's probably less than sane for isspace() to be behaving that way, > since in a UTF8-based locale 0x85 can't be regarded as a valid character > code at all. But I'm not hopeful about the results of filing a bug with > Apple, because their UTF8-based locales have a lot of other bu^H^Hdubious > behaviors too, which they appear not to care much about. I actually can't reproduce that behavior here: #include #include int main() { printf("%d\n", isspace(0x85)); return 0; } [ste...@xxx:~]% gcc -o test test.c [ste...@xxx:~]% ./test 0 [ste...@xxx:~]% locale LANG="en_US.utf-8" LC_COLLATE="en_US.utf-8" LC_CTYPE="en_US.utf-8" LC_MESSAGES="en_US.utf-8" LC_MONETARY="en_US.utf-8" LC_NUMERIC="en_US.utf-8" LC_TIME="en_US.utf-8" LC_ALL= [ste...@xxx:~]% uname -a Darwin xxx.local 10.4.0 Darwin Kernel Version 10.4.0: Fri Apr 23 18:28:53 PDT 2010; root:xnu-1504.7.4~1/RELEASE_I386 i386 i386 Thanks much for your help, Steven Schlansker -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [BUGS] COPY FROM/TO losing a single byte of a multibyte UTF-8 sequence
On Aug 19, 2010, at 3:24 PM, Tom Lane wrote: > Steven Schlansker writes: >> >> I'm not at all experienced with character encodings so I could >> be totally off base, but isn't it wrong to ever call isspace(0x85), >> whatever the result may be, given that the actual character is 0xCF85? >> (U+03C5, GREEK SMALL LETTER UPSILON) > > We generally assume that in server-safe encodings, the ctype.h functions > will behave sanely on any single-byte value. You can argue the wisdom > of that, but deciding to change that policy would be a rather massive > code change; I'm not excited about going that direction. Fair enough. I presume there are no "server-safe encodings" for which a multibyte sequence 0x XX20 would be valid - which would break anyway (as the second byte looks like a real space) > You need a setlocale() call, else the program acts as though it's in C > locale regardless of environment. Sigh. I hate C sometimes. :-p Anyway, it looks like this is actually a BSD bug which got copy + pasted into Apple's Darwin source - http://lists.freebsd.org/pipermail/freebsd-i18n/2007-September/000157.html I have a couple of contacts at Apple so I'll see if there's any interest in backporting a fix, but I wouldn't hope for it to happen quickly if at all... Thanks for taking a look into fixing this, I hope you guys can reach consensus on how to get it fixed :) Best, Steven Schlansker -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] small smgrcreate cleanup patch
Robert Haas writes: > So I propose moving the TablespaceCreateDbspace() call to mdcreate(), > removing the redundant check from smgrcreate(), and deleting that > portion of the comment which says this is in the wrong place. While I don't care for having smgr.c call tablespace.c, moving the call to md.c instead is surely not an improvement :-(. The problem here is that we'd like the tablespace code to be above the smgr code, not below. Calling it from md.c makes the layer inversion worse not better. > You could argue that perhaps md.c isn't the right place either, but it > certainly makes more sense than smgr.c, and I'd argue it's exactly > right. On what grounds pray tell? > A related, interesting question is whether there's any purpose to the > smgr layer at all. Not a lot anymore, I think. This has been ranted about before, but I think the Berkeley guys bet on the wrong horse when they put an API layer here. The facilities that might once have been usefully switched between at this level are now all down inside kernel device drivers. I could see merging smgr.c and md.c entirely. But they'd still be appropriately placed below, not above, the tablespace commands. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] security hook on authorization
(2010/08/20 11:45), Robert Haas wrote: > 2010/8/19 KaiGai Kohei: >> I also plan to add a security hook on authorization time. >> It shall allow external security providers to set up credential of >> the authenticated clients. >> >> Please note that it is not intended to control authentication process. >> It is typically checked based on a pair of username and password. >> What I want to discuss is things after success of this authentication >> steps. >> >> From viewpoint of SE-PostgreSQL, it uses getpeercon(3) which obtains >> a security label of the peer process, so it does not need to consider >> database username. But we can easily assume other security mechanism >> which assigns a certain label based on the authenticated database user >> such as Oracle Label Security. >> >> So, I think this hook should be also invoked on the code path of >> SET SESSION AUTHORIZATION, not only database login time, although >> SE-PostgreSQL ignores this case. >> >> So, I think SetSessionUserId() is a candidate to put this hook which is >> entirely called from both of the code path. >> This routine is to assign credential of the default database privilege >> mechanism, so it seems to me it is a good point where external security >> provider also assigns its credential of the authenticated database user. > > How is this different from what we rejected before? > It made clear the purpose of this hook. I also intended to use the previous hook for authorization purpose, but it was deployed just after initialize_acl() without no argument. It might be suitable for SE-PostgreSQL, because it does not depend on authenticated database user, but might be too specific. The new hook shall be invoked on two code paths (database login and SET SESSION AUTHORIZATION). It allows upcoming security module which may assign client's credential based on the database user to utilize this hook also. Thanks, -- KaiGai Kohei -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] security hook on authorization
2010/8/19 KaiGai Kohei : > I also plan to add a security hook on authorization time. > It shall allow external security providers to set up credential of > the authenticated clients. > > Please note that it is not intended to control authentication process. > It is typically checked based on a pair of username and password. > What I want to discuss is things after success of this authentication > steps. > > From viewpoint of SE-PostgreSQL, it uses getpeercon(3) which obtains > a security label of the peer process, so it does not need to consider > database username. But we can easily assume other security mechanism > which assigns a certain label based on the authenticated database user > such as Oracle Label Security. > > So, I think this hook should be also invoked on the code path of > SET SESSION AUTHORIZATION, not only database login time, although > SE-PostgreSQL ignores this case. > > So, I think SetSessionUserId() is a candidate to put this hook which is > entirely called from both of the code path. > This routine is to assign credential of the default database privilege > mechanism, so it seems to me it is a good point where external security > provider also assigns its credential of the authenticated database user. How is this different from what we rejected before? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] small smgrcreate cleanup patch
smgrcreate() currently contains a call to TablespaceCreateDbspace(). As the comment says, this is a rather silly place to put it. The silliest thing about it, IMHO, is that it forces the following check to be done in both smgrcreate() and mdcreate(): if (isRedo && reln->md_fd[forknum] != NULL) return; So I propose moving the TablespaceCreateDbspace() call to mdcreate(), removing the redundant check from smgrcreate(), and deleting that portion of the comment which says this is in the wrong place. You could argue that perhaps md.c isn't the right place either, but it certainly makes more sense than smgr.c, and I'd argue it's exactly right. Moving the TablespaceCreateDbspace() logic into md.c (or smgr.c) seems like it would be more awkward, though I suppose it could be done. One less-than-thrilling result would be that the TablespaceCreateLock logic wouldn't be confined to tablespace.c, not that that's *necessarily* a disaster. A related, interesting question is whether there's any purpose to the smgr layer at all. Would we accept a patch that implemented an alternative smgr layer, perhaps on a per-tablespace basis? The only real alternative I can imagine to "magnetic disk" storage would be network storage. You could imagine using such a thing for temporary tablespaces, essentially writing out temporary table pages to a RAM cache on a remote machine; or perhaps more interestingly, using it for fault tolerance, keeping 2 or 3 copies of each page on different servers. Maybe someone will say "hey, just use iSCSI" or "hey, just use AFS" or something like that, but I dunno if I trust them to do the right thing with fsync, and they might not be as well-optimized as would be possible if we rolled our own. At any rate, if we DON'T think we'd ever consider supporting something like this, perhaps we ought to just fold the md.c stuff into smgr.c and eliminate all the jumping through hoops. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company smgrcreate_cleanup.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] security hook on authorization
I also plan to add a security hook on authorization time. It shall allow external security providers to set up credential of the authenticated clients. Please note that it is not intended to control authentication process. It is typically checked based on a pair of username and password. What I want to discuss is things after success of this authentication steps. >From viewpoint of SE-PostgreSQL, it uses getpeercon(3) which obtains a security label of the peer process, so it does not need to consider database username. But we can easily assume other security mechanism which assigns a certain label based on the authenticated database user such as Oracle Label Security. So, I think this hook should be also invoked on the code path of SET SESSION AUTHORIZATION, not only database login time, although SE-PostgreSQL ignores this case. So, I think SetSessionUserId() is a candidate to put this hook which is entirely called from both of the code path. This routine is to assign credential of the default database privilege mechanism, so it seems to me it is a good point where external security provider also assigns its credential of the authenticated database user. Thanks, -- KaiGai Kohei src/backend/utils/init/miscinit.c | 14 ++ src/include/miscadmin.h |4 2 files changed, 18 insertions(+), 0 deletions(-) diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c index b3243aa..81f7bd1 100644 --- a/src/backend/utils/init/miscinit.c +++ b/src/backend/utils/init/miscinit.c @@ -229,6 +229,15 @@ static int SecurityRestrictionContext = 0; /* We also remember if a SET ROLE is currently active */ static bool SetRoleIsActive = false; +/* + * SetSessionUserId_hook allows external security providers to authorize + * the authenticated client on database login and SET SESSION AUTHORIZATION. + * It takes two arguments; userid_old and userid_new. + * If userid_old would be InvalidOid, it means the hook was invoked on + * database login time. Elsewhere, it was invoked due to SET SESSION + * AUTHORIZATION. + */ +SetSessionUserId_hook_type SetSessionUserId_hook = NULL; /* * GetUserId - get the current effective user ID. @@ -282,6 +291,11 @@ SetSessionUserId(Oid userid, bool is_superuser) { AssertState(SecurityRestrictionContext == 0); AssertArg(OidIsValid(userid)); + + /* We also allow security provides to authorize the client */ + if (SetSessionUserId_hook) + (*SetSessionUserId_hook)(SessionUserId, userid); + SessionUserId = userid; SessionUserIsSuperuser = is_superuser; SetRoleIsActive = false; diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h index 2c775c1..5478de6 100644 --- a/src/include/miscadmin.h +++ b/src/include/miscadmin.h @@ -278,6 +278,10 @@ extern void SetDataDir(const char *dir); extern void ChangeToDataDir(void); extern char *make_absolute_path(const char *path); +/* Hook for plugins to get control in SetSessionUserId */ +typedef void (*SetSessionUserId_hook_type)(Oid userid_old, Oid userid_new); +extern PGDLLIMPORT SetSessionUserId_hook_type SetSessionUserId_hook; + /* in utils/misc/superuser.c */ extern bool superuser(void); /* current user is superuser */ extern bool superuser_arg(Oid roleid); /* given user is superuser */ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: tuplestore, tuplesort aggregate functions
On Thu, Aug 19, 2010 at 06:03:57PM -0400, Robert Haas wrote: > On Thu, Aug 19, 2010 at 5:53 PM, David Fetter wrote: > > On Thu, Aug 19, 2010 at 05:40:46PM -0400, Tom Lane wrote: > >> "Kevin Grittner" writes: > >> > Robert Haas wrote: > >> >> I suppose there could also be a bit of an ambiguity if you're working > >> >> with a type like int4 where the values are discrete steps. Like, what > >> >> do you do with {1, 2}? > >> > >> Hmm, good point. > >> > >> > The same thing you do with the avg function? > >> > >> avg's approach is not at all datatype-independent though. If > >> you're willing to give up the idea of a polymorphic median() > >> function, that would probably be the thing to do. If not, you > >> need to take the left or right one of the two central elements. > > > > Whether the median needs to be in the sample is one question that > > doesn't really have a universal right answer. A separate > > question, also without a universal right answer, is whether the > > median needs to be in the same domain as the sample, int4 being > > the case above. > > > > We could just go with "whatever Oracle, DB2 and MS-SQL Server > > have," assuming it's the same thing, until something appears in > > the SQL standard. > > That's usually a sensible default, when in doubt. If nothing else, > it improves compatibility. That's assuming we find such a thing, which I haven't yet. Trying to stay cheery, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [BUGS] COPY FROM/TO losing a single byte of a multibyte UTF-8 sequence
> We generally assume that in server-safe encodings, the ctype.h functions > will behave sanely on any single-byte value. I think this "wisedom" is only true for C locale. I'm not surprised all that it does not work with non C locales. >From array_funcs.c: while (isspace((unsigned char) *p)) p++; IMO this should be something like: while (isspace((unsigned char) *p)) p += pg_mblen(p); -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] trace_recovery_messages
Fujii Masao writes: > The explanation of trace_recovery_messages in the document > is inconsistent with the definition of it in guc.c. I've applied a patch for this. I was tempted to propose that we just rip out trace_recovery_messages altogether, but I suppose Simon would object. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fw: patch for pg_ctl.c to add windows service start-type
I don't know how to edit documents exactly before. Thanks. On Thu, 19 Aug 2010 08:48:53 -0700 David Fetter wrote: > On Thu, Aug 19, 2010 at 10:24:54PM +0800, Quan Zongliang wrote: > > documents attached. html and man-page > > Thanks! > > For future reference, the way to patch docs is by patching the SGML > source. Please find enclosed a patch which incorporates the code > patch you sent with these docs. > > Cheers, > David. > -- > David Fetter http://fetter.org/ > Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter > Skype: davidfetter XMPP: david.fet...@gmail.com > iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics > > Remember to vote! > Consider donating to Postgres: http://www.postgresql.org/about/donate -- Quan Zongliang -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fw: patch for pg_ctl.c to add windows service start-type
Because Windows's CreateService has serial start-type: SERVICE_AUTO_START SERVICE_BOOT_START SERVICE_DEMAND_START SERVICE_DISABLED SERVICE_SYSTEM_START Although all of them are not useful for pg service. I think it is better to use enum. On Thu, 19 Aug 2010 16:48:53 -0400 Alvaro Herrera wrote: > Excerpts from David Fetter's message of jue ago 19 16:40:18 -0400 2010: > > On Thu, Aug 19, 2010 at 03:47:43PM -0400, Alvaro Herrera wrote: > > > Excerpts from David Fetter's message of jue ago 19 11:48:53 -0400 2010: > > > > > > > + > > > > + -S > > > class="parameter"> > > > > > > You omitted the start-type inside the tag. Also, the "a" > > > and "d" values seem to be accepted but not documented. > > > > D'oh! Changed patch enclosed. Now in context format :) > > Thanks. > > Another thing -- why is there an enum at all? Seems it'd be > simpler to assign the right value to the variable in the getopt() code > to start with. > > -- > Álvaro Herrera > The PostgreSQL Company - Command Prompt, Inc. > PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Quan Zongliang -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Avoiding deadlocks ...
Kevin, This one is for you: Two sessions, in transaction: Process A Process B update session where id = X; update order where orderid = 5; update order where orderid = 5; update order where orderid = 5; ... deadlock error. It seems like we ought to be able to avoid a deadlock in this case; there's a clear precedence of who grabbed the order row first. Does your serializability patch address the above situation at all? -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [BUGS] COPY FROM/TO losing a single byte of a multibyte UTF-8 sequence
Steven Schlansker writes: > On Aug 19, 2010, at 2:35 PM, Tom Lane wrote: >> I was able to reproduce this on my own Mac. Some tracing shows that the >> problem is that isspace(0x85) returns true when in locale en_US.utf-8. >> This causes array_in to drop the final byte of the array element string, >> thinking that it's insignificant whitespace. > The 0x85 seems to be the second byte of a multibyte UTF-8 > sequence. Check. > I'm not at all experienced with character encodings so I could > be totally off base, but isn't it wrong to ever call isspace(0x85), > whatever the result may be, given that the actual character is 0xCF85? > (U+03C5, GREEK SMALL LETTER UPSILON) We generally assume that in server-safe encodings, the ctype.h functions will behave sanely on any single-byte value. You can argue the wisdom of that, but deciding to change that policy would be a rather massive code change; I'm not excited about going that direction. >> I believe that you must >> not have produced the data file data.copy on a Mac, or at least not in >> that locale setting, because array_out should have double-quoted the >> array element given that behavior of isspace(). > Correct, it was produced on a Linux machine. That said, the charset > there was also UTF-8. Right ... but you had an isspace function that meets our expectations. > I actually can't reproduce that behavior here: You need a setlocale() call, else the program acts as though it's in C locale regardless of environment. My test case looks like this: $ cat isspace.c #include #include #include int main() { int c; setlocale(LC_ALL, ""); for (c = 1; c < 256; c++) { if (isspace(c)) printf("%3o is space\n", c); } return 0; } $ gcc -O -Wall isspace.c $ LANG=C ./a.out 11 is space 12 is space 13 is space 14 is space 15 is space 40 is space $ LANG=en_US.utf-8 ./a.out 11 is space 12 is space 13 is space 14 is space 15 is space 40 is space 205 is space 240 is space $ regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: tuplestore, tuplesort aggregate functions
On Thu, Aug 19, 2010 at 5:53 PM, David Fetter wrote: > On Thu, Aug 19, 2010 at 05:40:46PM -0400, Tom Lane wrote: >> "Kevin Grittner" writes: >> > Robert Haas wrote: >> >> I suppose there could also be a bit of an ambiguity if you're working >> >> with a type like int4 where the values are discrete steps. Like, what >> >> do you do with {1, 2}? >> >> Hmm, good point. >> >> > The same thing you do with the avg function? >> >> avg's approach is not at all datatype-independent though. If you're >> willing to give up the idea of a polymorphic median() function, that >> would probably be the thing to do. If not, you need to take the left >> or right one of the two central elements. > > Whether the median needs to be in the sample is one question that > doesn't really have a universal right answer. A separate question, > also without a universal right answer, is whether the median needs to > be in the same domain as the sample, int4 being the case above. > > We could just go with "whatever Oracle, DB2 and MS-SQL Server have," > assuming it's the same thing, until something appears in the SQL > standard. That's usually a sensible default, when in doubt. If nothing else, it improves compatibility. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: tuplestore, tuplesort aggregate functions
On Thu, Aug 19, 2010 at 05:40:46PM -0400, Tom Lane wrote: > "Kevin Grittner" writes: > > Robert Haas wrote: > >> I suppose there could also be a bit of an ambiguity if you're working > >> with a type like int4 where the values are discrete steps. Like, what > >> do you do with {1, 2}? > > Hmm, good point. > > > The same thing you do with the avg function? > > avg's approach is not at all datatype-independent though. If you're > willing to give up the idea of a polymorphic median() function, that > would probably be the thing to do. If not, you need to take the left > or right one of the two central elements. Whether the median needs to be in the sample is one question that doesn't really have a universal right answer. A separate question, also without a universal right answer, is whether the median needs to be in the same domain as the sample, int4 being the case above. We could just go with "whatever Oracle, DB2 and MS-SQL Server have," assuming it's the same thing, until something appears in the SQL standard. Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: tuplestore, tuplesort aggregate functions
"Kevin Grittner" writes: > Robert Haas wrote: >> I suppose there could also be a bit of an ambiguity if you're working >> with a type like int4 where the values are discrete steps. Like, what >> do you do with {1, 2}? Hmm, good point. > The same thing you do with the avg function? avg's approach is not at all datatype-independent though. If you're willing to give up the idea of a polymorphic median() function, that would probably be the thing to do. If not, you need to take the left or right one of the two central elements. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Old git repo
On Thu, 2010-08-19 at 23:30 +0200, Magnus Hagander wrote: > It might well be, and the cost is low. But if you're talking about > gitweb links or so, they'll still be invalid, because it would have to > be renamed to "postgresql-old" or something like that... Sure, that's fine. It would just be nice to have a place to turn to if you have an old SHA1 and no other information; even if it's slightly inconvenient. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [BUGS] COPY FROM/TO losing a single byte of a multibyte UTF-8 sequence
Steven Schlansker writes: > I'm having a rather annoying problem - a particular string is causing the > Postgres COPY functionality to lose a byte, causing data corruption in > backups and transferred data. I was able to reproduce this on my own Mac. Some tracing shows that the problem is that isspace(0x85) returns true when in locale en_US.utf-8. This causes array_in to drop the final byte of the array element string, thinking that it's insignificant whitespace. I believe that you must not have produced the data file data.copy on a Mac, or at least not in that locale setting, because array_out should have double-quoted the array element given that behavior of isspace(). Now, it's probably less than sane for isspace() to be behaving that way, since in a UTF8-based locale 0x85 can't be regarded as a valid character code at all. But I'm not hopeful about the results of filing a bug with Apple, because their UTF8-based locales have a lot of other bu^H^Hdubious behaviors too, which they appear not to care much about. In any case, stepping back and taking a larger viewpoint, it seems unsafe for array_in/array_out to have any locale-sensitive behavior anyhow. As an example, in a LATINx locale it is entirely sane for isspace() to return true for 0xA0, while it should certainly not do so in C locale. This means we are at risk of data corruption, ie dropping a valid data character, when an array value starting or ending with 0xA0 is dumped from a C-locale database and loaded into a LATINx-locale one. So it seems like the safest answer is to modify array_in/array_out to use an ASCII-only definition of isspace(). I believe this is traditionally defined as space, tab, CR, LF, VT, FF. We could perhaps trim that further, like just space and tab, but there might be some risk of breaking client code that expects the other traditional whitespace to be ignored. I'm not sure if there are any other places with similar risks. hstore's I/O routines contain isspace calls, but I haven't analyzed the implications. There is an isspace call in record_out but it is just there for cosmetic purposes and doesn't protect any decisions in record_in, so I think it's okay if it makes platform/locale-dependent choices. Comments? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Old git repo
On Thu, Aug 19, 2010 at 5:29 PM, Jeff Davis wrote: > The new git repository will have different SHA1s for all of the commits, > so any old SHA1s will be useless without the old repository. > > Hopefully nobody used links to specific commits (or SHA1s) pointing to > the old git repository for anything important. But I found myself doing > so occasionally for unimportant things (if it was important, I included > the date as a safeguard) -- so I assume a few other people did, as well. > > Would it be worth keeping the old git repository around in a read-only > mode, just in case people have links/SHA1s floating around for it? Perhaps "http://git.postgresql.org/gitweb?p=postgresql-before-official-git-migration.git;a=summary"; -- http://linuxfinances.info/info/linuxdistributions.html -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Old git repo
On Thu, Aug 19, 2010 at 23:29, Jeff Davis wrote: > The new git repository will have different SHA1s for all of the commits, > so any old SHA1s will be useless without the old repository. > > Hopefully nobody used links to specific commits (or SHA1s) pointing to > the old git repository for anything important. But I found myself doing > so occasionally for unimportant things (if it was important, I included > the date as a safeguard) -- so I assume a few other people did, as well. > > Would it be worth keeping the old git repository around in a read-only > mode, just in case people have links/SHA1s floating around for it? It might well be, and the cost is low. But if you're talking about gitweb links or so, they'll still be invalid, because it would have to be renamed to "postgresql-old" or something like that... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Old git repo
The new git repository will have different SHA1s for all of the commits, so any old SHA1s will be useless without the old repository. Hopefully nobody used links to specific commits (or SHA1s) pointing to the old git repository for anything important. But I found myself doing so occasionally for unimportant things (if it was important, I included the date as a safeguard) -- so I assume a few other people did, as well. Would it be worth keeping the old git repository around in a read-only mode, just in case people have links/SHA1s floating around for it? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [pgsql-hackers] Daily digest v1.11023 (17 messages)
From: Tom Lane mailto:t...@sss.pgh.pa.us>> Date: August 19, 2010 10:25:36 AM PDT To: David Fetter mailto:da...@fetter.org>> Cc: Kevin Grittner mailto:kevin.gritt...@wicourts.gov>>, Robert Haas mailto:robertmh...@gmail.com>>, Pavel Stehule mailto:pavel.steh...@gmail.com>>, Greg Stark mailto:gsst...@mit.edu>>, PostgreSQL Hackers mailto:pgsql-hackers@postgresql.org>> Subject: Re: wip: functions median and percentile David Fetter mailto:da...@fetter.org>> writes: On Thu, Aug 19, 2010 at 12:12:12PM -0500, Kevin Grittner wrote: http://www.merriam-webster.com/dictionary/median If you do a google search for "median" and poke around, you'll find many places where this is the only definition mentioned; the others seem to be rather infrequently used. Why not make the commone usage convenient? The reason not to is the same reason that MEDIAN doesn't appear in the SQL standard, namely that what's common in one field is wrong in another. Hmm, do you have any *evidence* that that's why it's not in the standard? My own take on that is that it's reasonably probable that the SQL committee might standardize a function by that name someday. What we need to be worrying about is the prospect that if there are multiple definitions for the term, they might pick a different one than we did. A name like "arithmetic_median" seems much less likely to get blindsided by future standardization. regards, tom lane Median is in the standard, you just have to look a little harder, under the section on inverse distribution functions: SELECT PERCENTILE_DIST(0.5) WITHIN GROUP (order by x) ... or SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (order by x) ... Depending on whether you want a discrete or continuous median. Oracle added support for the inverse distribution functions in Oracle 9, which is perhaps why you can find it in the standard. Oracle added the "median(x)" aggregate as a synonym for "percentile_cont(0.5) within group (order by x)" in Oracle 10. My money would be on this become standardized at some point, especially since it is a much friendlier syntax. Regards, Caleb
Re: [HACKERS] small typed-table bug in gram.y
Robert Haas wrote: > While examining gram.y today I happened to notice an oversight in the > grammar rules for creating typed tables: the fourth CREATE TABLE > production ignores $2. Which I think means (although of course I > didn't test it) that if you do something like "CREATE TEMP TABLE IF > NOT EXISTS foo OF bar", the TEMP won't stick. I tried it on a build from yesterday's HEAD. You're right. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fw: patch for pg_ctl.c to add windows service start-type
On Thu, Aug 19, 2010 at 04:48:53PM -0400, Alvaro Herrera wrote: > Excerpts from David Fetter's message of jue ago 19 16:40:18 -0400 2010: > > On Thu, Aug 19, 2010 at 03:47:43PM -0400, Alvaro Herrera wrote: > > > Excerpts from David Fetter's message of jue ago 19 11:48:53 -0400 2010: > > > > > > > + > > > > + -S > > > class="parameter"> > > > > > > You omitted the start-type inside the tag. Also, > > > the "a" and "d" values seem to be accepted but not documented. > > > > D'oh! Changed patch enclosed. Now in context format :) > > Thanks. > > Another thing -- why is there an enum at all? Seems it'd be simpler > to assign the right value to the variable in the getopt() code to > start with. That's a question for the patch author. I was just cleaning up the docs :) Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] small typed-table bug in gram.y
While examining gram.y today I happened to notice an oversight in the grammar rules for creating typed tables: the fourth CREATE TABLE production ignores $2. Which I think means (although of course I didn't test it) that if you do something like "CREATE TEMP TABLE IF NOT EXISTS foo OF bar", the TEMP won't stick. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fw: patch for pg_ctl.c to add windows service start-type
Excerpts from David Fetter's message of jue ago 19 16:40:18 -0400 2010: > On Thu, Aug 19, 2010 at 03:47:43PM -0400, Alvaro Herrera wrote: > > Excerpts from David Fetter's message of jue ago 19 11:48:53 -0400 2010: > > > > > + > > > + -S > > class="parameter"> > > > > You omitted the start-type inside the tag. Also, the "a" > > and "d" values seem to be accepted but not documented. > > D'oh! Changed patch enclosed. Now in context format :) Thanks. Another thing -- why is there an enum at all? Seems it'd be simpler to assign the right value to the variable in the getopt() code to start with. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fw: patch for pg_ctl.c to add windows service start-type
On Thu, Aug 19, 2010 at 03:47:43PM -0400, Alvaro Herrera wrote: > Excerpts from David Fetter's message of jue ago 19 11:48:53 -0400 2010: > > > + > > + -S > class="parameter"> > > You omitted the start-type inside the tag. Also, the "a" > and "d" values seem to be accepted but not documented. D'oh! Changed patch enclosed. Now in context format :) Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate *** a/doc/src/sgml/ref/pg_ctl-ref.sgml --- b/doc/src/sgml/ref/pg_ctl-ref.sgml *** *** 96,101 PostgreSQL documentation --- 96,107 -U username -P password -D datadir +-S + +a[uto] +d[emand] + + -w -t seconds -o options *** *** 377,382 PostgreSQL documentation --- 383,400 + + + -S start-type + + +Start type of the system service to register. start-type can +be auto, or demand, or +the first letter of one of these two. If this is omitted, +auto is used. + + + *** a/src/bin/pg_ctl/pg_ctl.c --- b/src/bin/pg_ctl/pg_ctl.c *** *** 69,74 typedef enum --- 69,82 RUN_AS_SERVICE_COMMAND } CtlCommand; + + typedef enum + { + PGCTL_START_AUTO, + PGCTL_START_DEMAND + } PgctlWin32StartType; + + #define DEFAULT_WAIT 60 static bool do_wait = false; *** *** 87,92 static char *exec_path = NULL; --- 95,101 static char *register_servicename = "PostgreSQL"; /* FIXME: + version ID? */ static char *register_username = NULL; static char *register_password = NULL; + static PgctlWin32StartType win32_start_type = PGCTL_START_AUTO; static char *argv0 = NULL; static bool allow_core_files = false; *** *** 1132,1137 pgwin32_doRegister(void) --- 1141,1147 { SC_HANDLE hService; SC_HANDLE hSCM = OpenSCManager(NULL, NULL, SC_MANAGER_ALL_ACCESS); + DWORD starttype; if (hSCM == NULL) { *** *** 1145,1153 pgwin32_doRegister(void) exit(1); } if ((hService = CreateService(hSCM, register_servicename, register_servicename, SERVICE_ALL_ACCESS, SERVICE_WIN32_OWN_PROCESS, ! SERVICE_AUTO_START, SERVICE_ERROR_NORMAL, pgwin32_CommandLine(true), NULL, NULL, "RPCSS\0", register_username, register_password)) == NULL) { --- 1155,1168 exit(1); } + if (win32_start_type == PGCTL_START_AUTO) + starttype = SERVICE_AUTO_START; + else + starttype = SERVICE_DEMAND_START; + if ((hService = CreateService(hSCM, register_servicename, register_servicename, SERVICE_ALL_ACCESS, SERVICE_WIN32_OWN_PROCESS, ! starttype, SERVICE_ERROR_NORMAL, pgwin32_CommandLine(true), NULL, NULL, "RPCSS\0", register_username, register_password)) == NULL) { *** *** 1586,1592 do_help(void) printf(_(" %s killSIGNALNAME PID\n"), progname); #if defined(WIN32) || defined(__CYGWIN__) printf(_(" %s register [-N SERVICENAME] [-U USERNAME] [-P PASSWORD] [-D DATADIR]\n" !"[-w] [-t SECS] [-o \"OPTIONS\"]\n"), progname); printf(_(" %s unregister [-N SERVICENAME]\n"), progname); #endif --- 1601,1607 printf(_(" %s killSIGNALNAME PID\n"), progname); #if defined(WIN32) || defined(__CYGWIN__) printf(_(" %s register [-N SERVICENAME] [-U USERNAME] [-P PASSWORD] [-D DATADIR]\n" !"[-S START-TYPE] [-w] [-t SECS] [-o \"OPTIONS\"]\n"), progname); printf(_(" %s unregister [-N SERVICENAME]\n"), progname); #endif *** *** 1627,1632 do_help(void) --- 1642,1654 printf(_(" -N SERVICENAME service name with which to register PostgreSQL server\n")); printf(_(" -P PASSWORD password of account to register PostgreSQL server\n")); printf(_(" -U USERNAME user name of account to register PostgreSQL server\n")); + printf(_(" -S START-TYPE service start type to register PostgreSQL server,\n" + " can be auto or demand\n")); + +
Re: [HACKERS] proposal: tuplestore, tuplesort aggregate functions
Robert Haas wrote: > I suppose there could also be a bit of an ambiguity if you're working > with a type like int4 where the values are discrete steps. Like, what > do you do with {1, 2}? The same thing you do with the avg function? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: tuplestore, tuplesort aggregate functions
On Thu, Aug 19, 2010 at 4:21 PM, Tom Lane wrote: > Robert Haas writes: >> You've made this assertion at least three times now, but I confess >> that I've only ever learned one way to compute a median; and quick >> Google searches for "median", "kinds of median", and few other >> variants failed to turn up anything obvious either. > > There are different ways to define it when the number of samples is even. > However I believe that "use the mean of the two middle values" is much > the most common way to deal with that. I suppose there could also be a bit of an ambiguity if you're working with a type like int4 where the values are discrete steps. Like, what do you do with {1, 2}? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: tuplestore, tuplesort aggregate functions
Robert Haas writes: > You've made this assertion at least three times now, but I confess > that I've only ever learned one way to compute a median; and quick > Google searches for "median", "kinds of median", and few other > variants failed to turn up anything obvious either. There are different ways to define it when the number of samples is even. However I believe that "use the mean of the two middle values" is much the most common way to deal with that. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: tuplestore, tuplesort aggregate functions
On Thu, Aug 19, 2010 at 9:46 AM, David Fetter wrote: > As to median, please make sure you say in detail which median you're > using and name it so, as there is no single, authoritative median. You've made this assertion at least three times now, but I confess that I've only ever learned one way to compute a median; and quick Google searches for "median", "kinds of median", and few other variants failed to turn up anything obvious either. It seems to me that if median is good enough for Oracle, Sybase, Excel, and the nun who taught my fifth-grade elementary school class, it ought to be good enough for us, too. (Don't mess with Sr. Catherine!) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fw: patch for pg_ctl.c to add windows service start-type
Excerpts from David Fetter's message of jue ago 19 11:48:53 -0400 2010: > + > + -S class="parameter"> You omitted the start-type inside the tag. Also, the "a" and "d" values seem to be accepted but not documented. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] refactoring comment.c
Robert Haas writes: > Any other kibitzing before I commit this? Sure ... + * If the object is a relation or a child object of a relation (e.g. an + * attribute or contraint, *relp will set to point to that relation). This Parenthesis in the wrong place here, grammar and spelling not much better. Also, I still feel that this comment could do better about explaining the behavior, particularly with respect to locking. Perhaps say + * If the target object is a relation or a child object of a relation + * (e.g. an attribute or constraint), the relation is also opened, and *relp + * receives the open relcache entry pointer; otherwise *relp is set to NULL. + * This is a bit grotty but it makes life simpler, since the caller will + * typically need the relcache entry too. Caller must close the relcache + * entry when done with it. The relation is locked with the specified + * lockmode if the target object is the relation itself or an attribute, + * but for other child objects, only AccessShareLock is acquired on the + * relation. + ScanKeyInit(&skey[0], ObjectIdAttributeNumber, BTEqualStrategyNumber, + F_OIDEQ, ObjectIdGetDatum(address.objectId)); There's a standard convention for the layout of ScanKeyInit calls, and this isn't it. Trivial, I know, but it's better to make similar code look similar. There's no longer any need for a diff in src/backend/parser/Makefile. + #define OBJECTADDRESS_H + + #include "nodes/parsenodes.h" + #include "nodes/pg_list.h" + #include "storage/lock.h" + #include "utils/rel.h" You shouldn't need pg_list.h here, as parsenodes.h surely includes it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] wip: functions median and percentile
David Fetter writes: > On Thu, Aug 19, 2010 at 01:25:36PM -0400, Tom Lane wrote: >> A name like "arithmetic_median" seems much less likely to get >> blindsided by future standardization. > Yep. OTOH, if Pavel's right that Oracle already has an aggregate named median(), it seems quite unlikely that the SQL committee would pick a definition incompatible with Oracle's to standardize on. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] wip: functions median and percentile
Pavel Stehule wrote: > I looked there and Oracle11g use "median" in common sense. As does Sybase IQ. FWIW, Excel spreadsheets do, too. The chance of the SQL committee picking some other meaning for bare MEDIAN seems vanishingly small; although I have to grant that with only Oracle and Sybase as a precedent in the SQL world, it's not zero. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!
Heikki Linnakangas writes: > On 19/08/10 20:18, Tom Lane wrote: >> But I'm still not seeing how this self-pipe hack avoids a race >> condition. If the signal handler is sending a byte whenever it >> executes, then you could have bytes already stacked up in the pipe >> from previous interrupts that didn't happen to come while inside >> pg_usleep. If you clear those before sleeping, you have a race >> condition, and if you don't, then you fail to sleep the intended >> amount of time even though there was no interrupt this time. > You clear the pipe after waking up. Hmm ... that doesn't answer my second objection about failing to sleep the expected amount of time, but on reflection I guess that can't be done: we have to be able to cope with interrupts occurring just before the sleep actually begins, and there's no way to define "just before" except by reference to the calling code having done whatever it might need to do before/after sleeping. > Hmm, will need to think about a suitable API for that. Offhand I'd suggest something like SetSleepInterrupt() -- called by signal handlers, writes pipe ClearSleepInterrupt() -- called by sleep-and-do-something loops, clears pipe pg_usleep() itself remains the same, but it is now guaranteed to return immediately if SetSleepInterrupt is called, or has been called since the last ClearSleepInterrupt. > The nice thing > would be that we could implement it using pselect() where available. > (And reliable - the Linux select() man page says that glibc's pselect() > is emulated using select(), and suffers from the very same race > condition pselect() was invented to solve. How awful is that!?) Ick. So how would you tell if it's trustworthy? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] wip: functions median and percentile
On Thu, Aug 19, 2010 at 01:25:36PM -0400, Tom Lane wrote: > David Fetter writes: > > On Thu, Aug 19, 2010 at 12:12:12PM -0500, Kevin Grittner wrote: > >> http://www.merriam-webster.com/dictionary/median > >> > >> If you do a google search for "median" and poke around, you'll find > >> many places where this is the only definition mentioned; the others > >> seem to be rather infrequently used. Why not make the commone usage > >> convenient? > > > The reason not to is the same reason that MEDIAN doesn't appear in the > > SQL standard, namely that what's common in one field is wrong in > > another. > > Hmm, do you have any *evidence* that that's why it's not in the standard? Only from Joe Celko, who was on the standards committee, and has written on the subject. There's nothing I've found in the standard itself for this, if that's what you're looking for. > My own take on that is that it's reasonably probable that the SQL > committee might standardize a function by that name someday. What > we need to be worrying about is the prospect that if there are > multiple definitions for the term, they might pick a different one > than we did. Exactly :) > A name like "arithmetic_median" seems much less likely to get > blindsided by future standardization. Yep. Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] wip: functions median and percentile
2010/8/19 David Fetter : > On Thu, Aug 19, 2010 at 12:12:12PM -0500, Kevin Grittner wrote: >> David Fetter wrote: >> >> > Median may be useful, but we pretty much can't just call it >> > "median." Instead, we need to call it something like "left_median" >> > or "arithmetic_median." >> >> I think it would be reasonable, and perhaps preferable, to use just >> "median" for the semantics described in most dictionaries -- for >> example, this: >> >> http://www.merriam-webster.com/dictionary/median >> >> If you do a google search for "median" and poke around, you'll find >> many places where this is the only definition mentioned; the others >> seem to be rather infrequently used. Why not make the commone usage >> convenient? > > The reason not to is the same reason that MEDIAN doesn't appear in the > SQL standard, namely that what's common in one field is wrong in > another. I think some else. The reason can be more simple - implementation of median is significantly harder then other aggregates. I looked there and Oracle11g use "median" in common sense. Regards Pavel Stehule > > Cheers, > David. > -- > David Fetter http://fetter.org/ > Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter > Skype: davidfetter XMPP: david.fet...@gmail.com > iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics > > Remember to vote! > Consider donating to Postgres: http://www.postgresql.org/about/donate > -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] wip: functions median and percentile
2010/8/19 David Fetter : > On Thu, Aug 19, 2010 at 12:49:45PM -0400, Robert Haas wrote: >> On Thu, Aug 19, 2010 at 11:33 AM, Tom Lane wrote: >> > Greg Stark writes: >> >> On Thu, Aug 19, 2010 at 11:59 AM, Pavel Stehule >> >> wrote: >> >>> I am sending a prototype implementation of functions median and >> >>> percentile. This implementation is very simple and I moved it to >> >>> contrib for this moment - it is more easy maintainable. Later I'll >> >>> move it to core. >> > >> >> So if the entire result set fits in memory it would be nice to use the >> >> O(n) Quickselect algorithm -- which would only be a small change to >> >> the existing Quicksort code -- instead of sorting the entire set. >> > >> > That seems like rather a lot of added infrastructure for functions whose >> > popularity is at best unknown. I think we should KISS for the first >> > implementation. >> >> +1. I think the functions are useful, but the perfect is the enemy of the >> good. > > Percentile is already there as NTILE, a windowing function. Median I don't think it. The purpose is same, but usage is different - aggregate functions can be used as window func, but window functions cannot be used as aggregates. > may be useful, but we pretty much can't just call it "median." > Instead, we need to call it something like "left_median" or > "arithmetic_median." I put some time to deep searching in this area - and I talked with my kolegues mathematican and everywhere use a just median. Some longer or different name isn't barrier for me, but people can be confused. Regards Pavel > > Cheers, > David. > -- > David Fetter http://fetter.org/ > Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter > Skype: davidfetter XMPP: david.fet...@gmail.com > iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics > > Remember to vote! > Consider donating to Postgres: http://www.postgresql.org/about/donate > -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!
On 19/08/10 20:18, Tom Lane wrote: Heikki Linnakangas writes: On 19/08/10 19:57, Tom Lane wrote: Hmm, but couldn't you still do that inside pg_usleep? Signal handlers that do that couldn't know if they were interrupting a sleep per se, so this would have to be a backend-wide convention. I don't understand, do what inside pg_usleep()? I was envisioning still using pg_usleep, and having this interaction between signal handlers and the delay be private to pg_usleep, rather than putting such ugly code into forty-nine different places. There are a *lot* of places where we have loops that break down delays into at-most-one-second pg_usleep calls, and if we're going to have a hack like this we should fix them all, not just two or three that SR cares about. Hmm, will need to think about a suitable API for that. The nice thing would be that we could implement it using pselect() where available. (And reliable - the Linux select() man page says that glibc's pselect() is emulated using select(), and suffers from the very same race condition pselect() was invented to solve. How awful is that!?) But I'm still not seeing how this self-pipe hack avoids a race condition. If the signal handler is sending a byte whenever it executes, then you could have bytes already stacked up in the pipe from previous interrupts that didn't happen to come while inside pg_usleep. If you clear those before sleeping, you have a race condition, and if you don't, then you fail to sleep the intended amount of time even though there was no interrupt this time. You clear the pipe after waking up. Before sending all the pending WAL and deciding that you're fully caught up again: for(;;) { 1. select() 2. clear pipe 3. send any pending WAL } There's more information on the self-pipe trick at e.g http://lwn.net/Articles/177897/ -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] wip: functions median and percentile
David Fetter writes: > On Thu, Aug 19, 2010 at 12:12:12PM -0500, Kevin Grittner wrote: >> http://www.merriam-webster.com/dictionary/median >> >> If you do a google search for "median" and poke around, you'll find >> many places where this is the only definition mentioned; the others >> seem to be rather infrequently used. Why not make the commone usage >> convenient? > The reason not to is the same reason that MEDIAN doesn't appear in the > SQL standard, namely that what's common in one field is wrong in > another. Hmm, do you have any *evidence* that that's why it's not in the standard? My own take on that is that it's reasonably probable that the SQL committee might standardize a function by that name someday. What we need to be worrying about is the prospect that if there are multiple definitions for the term, they might pick a different one than we did. A name like "arithmetic_median" seems much less likely to get blindsided by future standardization. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!
Heikki Linnakangas writes: > On 19/08/10 19:57, Tom Lane wrote: >> Hmm, but couldn't you still do that inside pg_usleep? Signal handlers >> that do that couldn't know if they were interrupting a sleep per se, >> so this would have to be a backend-wide convention. > I don't understand, do what inside pg_usleep()? I was envisioning still using pg_usleep, and having this interaction between signal handlers and the delay be private to pg_usleep, rather than putting such ugly code into forty-nine different places. There are a *lot* of places where we have loops that break down delays into at-most-one-second pg_usleep calls, and if we're going to have a hack like this we should fix them all, not just two or three that SR cares about. But I'm still not seeing how this self-pipe hack avoids a race condition. If the signal handler is sending a byte whenever it executes, then you could have bytes already stacked up in the pipe from previous interrupts that didn't happen to come while inside pg_usleep. If you clear those before sleeping, you have a race condition, and if you don't, then you fail to sleep the intended amount of time even though there was no interrupt this time. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] wip: functions median and percentile
On Thu, Aug 19, 2010 at 12:12:12PM -0500, Kevin Grittner wrote: > David Fetter wrote: > > > Median may be useful, but we pretty much can't just call it > > "median." Instead, we need to call it something like "left_median" > > or "arithmetic_median." > > I think it would be reasonable, and perhaps preferable, to use just > "median" for the semantics described in most dictionaries -- for > example, this: > > http://www.merriam-webster.com/dictionary/median > > If you do a google search for "median" and poke around, you'll find > many places where this is the only definition mentioned; the others > seem to be rather infrequently used. Why not make the commone usage > convenient? The reason not to is the same reason that MEDIAN doesn't appear in the SQL standard, namely that what's common in one field is wrong in another. Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] wip: functions median and percentile
David Fetter wrote: > Median may be useful, but we pretty much can't just call it > "median." Instead, we need to call it something like "left_median" > or "arithmetic_median." I think it would be reasonable, and perhaps preferable, to use just "median" for the semantics described in most dictionaries -- for example, this: http://www.merriam-webster.com/dictionary/median If you do a google search for "median" and poke around, you'll find many places where this is the only definition mentioned; the others seem to be rather infrequently used. Why not make the commone usage convenient? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!
On 19/08/10 19:57, Tom Lane wrote: Heikki Linnakangas writes: On 19/08/10 16:38, Tom Lane wrote: Considering that pg_usleep is implemented with select, I'm not following what you mean by "replace pg_usleep() with select()"? Instead of using pg_usleep(), call select() directly, waiting not only for the timeout, but also for data to arrive on the "self-pipe". The signal handler writes a byte to the self-pipe, waking up the select(). That way the select() is interupted by the signal arriving, even if signals per se don't interrupt it. And it closes the race condition involved with setting a flag in the signal handler and checking that in the main loop. Hmm, but couldn't you still do that inside pg_usleep? Signal handlers that do that couldn't know if they were interrupting a sleep per se, so this would have to be a backend-wide convention. I don't understand, do what inside pg_usleep()? We only need to respond quickly to one signal, the one that tells walsender "there's some new WAL that you should send". We can rely on polling for all the other signals like SIGHUP for config reload or shutdown request, like we do today. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] wip: functions median and percentile
On Thu, Aug 19, 2010 at 12:49:45PM -0400, Robert Haas wrote: > On Thu, Aug 19, 2010 at 11:33 AM, Tom Lane wrote: > > Greg Stark writes: > >> On Thu, Aug 19, 2010 at 11:59 AM, Pavel Stehule > >> wrote: > >>> I am sending a prototype implementation of functions median and > >>> percentile. This implementation is very simple and I moved it to > >>> contrib for this moment - it is more easy maintainable. Later I'll > >>> move it to core. > > > >> So if the entire result set fits in memory it would be nice to use the > >> O(n) Quickselect algorithm -- which would only be a small change to > >> the existing Quicksort code -- instead of sorting the entire set. > > > > That seems like rather a lot of added infrastructure for functions whose > > popularity is at best unknown. I think we should KISS for the first > > implementation. > > +1. I think the functions are useful, but the perfect is the enemy of the > good. Percentile is already there as NTILE, a windowing function. Median may be useful, but we pretty much can't just call it "median." Instead, we need to call it something like "left_median" or "arithmetic_median." Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!
Heikki Linnakangas writes: > On 19/08/10 16:38, Tom Lane wrote: >> Considering that pg_usleep is implemented with select, I'm not following >> what you mean by "replace pg_usleep() with select()"? > Instead of using pg_usleep(), call select() directly, waiting not only > for the timeout, but also for data to arrive on the "self-pipe". The > signal handler writes a byte to the self-pipe, waking up the select(). > That way the select() is interupted by the signal arriving, even if > signals per se don't interrupt it. And it closes the race condition > involved with setting a flag in the signal handler and checking that in > the main loop. Hmm, but couldn't you still do that inside pg_usleep? Signal handlers that do that couldn't know if they were interrupting a sleep per se, so this would have to be a backend-wide convention. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Coerce 'unknown' type parameters to the right type in the
On 19/08/10 18:08, Robert Haas wrote: On Thu, Aug 19, 2010 at 9:47 AM, Tom Lane wrote: Another possibility is for EXECUTE USING to coerce any unknowns to TEXT before it calls the parser at all. This would square with the typical default assumption for unknown literals, and it would avoid having to have any semantics changes below the SPI call. That seems more intuitive than just chucking an error. Ok, I reverted the previous patch, and did that instead. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PL/pgsSQL EXECUTE USING INTO
Heikki Linnakangas writes: > While testing the recent issue with unknown params in EXECUTE USING, I > accidentally did this: >EXECUTE 'SELECT ''foo'' || $1' USING 'bar' INTO t; > The mistake I made? I put the USING and INTO clauses in wrong order, > INTO needs to go first. We should throw an error on that, but it looks > like the INTO clause is just silently ignored. This is more interesting than it looks. It appears that the plpgsql parser interprets the USING's argument expression as being 'bar' INTO t so it generates a plplgsql expression with query SELECT 'bar' INTO t and the only reason that you don't get a failure is that exec_simple_check_plan fails to notice the intoClause, so it thinks this represents a "simple expression", which means it evaluates the 'bar' subexpression and ignores the INTO altogether. That's certainly a bug in exec_simple_check_plan :-( I think that accepting this order of the clauses would require some duplication of code in the stmt_dynexecute production. It might be worth doing anyway, because if you made this mistake then certainly others will. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] wip: functions median and percentile
On Thu, Aug 19, 2010 at 11:33 AM, Tom Lane wrote: > Greg Stark writes: >> On Thu, Aug 19, 2010 at 11:59 AM, Pavel Stehule >> wrote: >>> I am sending a prototype implementation of functions median and >>> percentile. This implementation is very simple and I moved it to >>> contrib for this moment - it is more easy maintainable. Later I'll >>> move it to core. > >> So if the entire result set fits in memory it would be nice to use the >> O(n) Quickselect algorithm -- which would only be a small change to >> the existing Quicksort code -- instead of sorting the entire set. > > That seems like rather a lot of added infrastructure for functions whose > popularity is at best unknown. I think we should KISS for the first > implementation. +1. I think the functions are useful, but the perfect is the enemy of the good. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!
On 19/08/10 16:38, Tom Lane wrote: Heikki Linnakangas writes: BTW, on what platforms signals don't interrupt sleep? Although that issue has been discussed many times before, I couldn't find any reference to a real platform in the archives. I've got one in captivity (my old HPUX box). Happy to test whatever you come up with. Considering that pg_usleep is implemented with select, I'm not following what you mean by "replace pg_usleep() with select()"? Instead of using pg_usleep(), call select() directly, waiting not only for the timeout, but also for data to arrive on the "self-pipe". The signal handler writes a byte to the self-pipe, waking up the select(). That way the select() is interupted by the signal arriving, even if signals per se don't interrupt it. And it closes the race condition involved with setting a flag in the signal handler and checking that in the main loop. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!
On 19/08/10 18:08, Alvaro Herrera wrote: Excerpts from Heikki Linnakangas's message of jue ago 19 02:02:34 -0400 2010: On 19/08/10 04:46, Robert Haas wrote: And so far we haven't seen a patch for that. Somebody write one. And then let's get it reviewed and committed RSN. Fujii is on vacation, but I've started working on it. The two issues with Fujii's latest patch are that it would not respond promptly on platforms where signals don't interrupt sleep, and it suffers the classic race condition that pselect() was invented for. I'm going to replace pg_usleep() with select(), and use the so called "self-pipe trick" to get over the race condition. I have that written up but I want to do some testing and cleanup before posting the patch. Hmm, IIRC the self-pipe trick doesn't work on Windows, mainly because select() doesn't handle pipes, only sockets. You may need some extra hack to make it work there. That's fine, we can do the naive set-a-flag implementation on Windows because on Windows our "signals" are only delivered at CHECK_FOR_INTERRUPTS(), so we don't have the race condition to begin with. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] refactoring comment.c
On Thu, Aug 19, 2010 at 11:57 AM, Tom Lane wrote: > Alvaro Herrera writes: >> Excerpts from Robert Haas's message of mié ago 18 21:32:48 -0400 2010: >>> Here's v3. > >> The header comment in objectaddress.c contains a funny mistake: it says >> it works with ObjectAddresses. However, ObjectAddresses is a different >> type altogether, so I recommend not using that as plural for >> ObjectAddress. Maybe "ObjectAddress objects"? :-D > > Alternatively, maybe ObjectAddresses was a bad choice of type name, > and it should be ObjectAddressList or ObjectAddressArray or some such. > But changing that might be more trouble than it's worth. Yeah, I think it was a bad choice of type name. If I were otherwise touching that code, I'd probably advocate for changing it, but since I'm not, I'm inclined to just reword the comment. It might be something to keep in mind if we ever overhaul that part of the system, though, since at that point anything that must be back-patched will have merge conflicts anyway. Any other kibitzing before I commit this? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] refactoring comment.c
Alvaro Herrera writes: > Excerpts from Robert Haas's message of mié ago 18 21:32:48 -0400 2010: >> Here's v3. > The header comment in objectaddress.c contains a funny mistake: it says > it works with ObjectAddresses. However, ObjectAddresses is a different > type altogether, so I recommend not using that as plural for > ObjectAddress. Maybe "ObjectAddress objects"? :-D Alternatively, maybe ObjectAddresses was a bad choice of type name, and it should be ObjectAddressList or ObjectAddressArray or some such. But changing that might be more trouble than it's worth. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fw: patch for pg_ctl.c to add windows service start-type
On Thu, Aug 19, 2010 at 10:24:54PM +0800, Quan Zongliang wrote: > documents attached. html and man-page Thanks! For future reference, the way to patch docs is by patching the SGML source. Please find enclosed a patch which incorporates the code patch you sent with these docs. Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate diff --git a/doc/src/sgml/ref/pg_ctl-ref.sgml b/doc/src/sgml/ref/pg_ctl-ref.sgml index 20d87bb..a4e675e 100644 --- a/doc/src/sgml/ref/pg_ctl-ref.sgml +++ b/doc/src/sgml/ref/pg_ctl-ref.sgml @@ -377,6 +377,18 @@ PostgreSQL documentation + + + -S + + + Start type of the system service to register. start-type can + be auto, or demand, or + the first letter of one of these two. If this is omitted, + auto is used. + + + diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c index 1caec12..42a8aa9 100644 --- a/src/bin/pg_ctl/pg_ctl.c +++ b/src/bin/pg_ctl/pg_ctl.c @@ -69,6 +69,14 @@ typedef enum RUN_AS_SERVICE_COMMAND } CtlCommand; + +typedef enum +{ + PGCTL_START_AUTO, + PGCTL_START_DEMAND +} PgctlWin32StartType; + + #define DEFAULT_WAIT 60 static bool do_wait = false; @@ -87,6 +95,7 @@ static char *exec_path = NULL; static char *register_servicename = "PostgreSQL"; /* FIXME: + version ID? */ static char *register_username = NULL; static char *register_password = NULL; +static PgctlWin32StartType win32_start_type = PGCTL_START_AUTO; static char *argv0 = NULL; static bool allow_core_files = false; @@ -1132,6 +1141,7 @@ pgwin32_doRegister(void) { SC_HANDLE hService; SC_HANDLE hSCM = OpenSCManager(NULL, NULL, SC_MANAGER_ALL_ACCESS); + DWORD starttype; if (hSCM == NULL) { @@ -1145,9 +1155,14 @@ pgwin32_doRegister(void) exit(1); } + if (win32_start_type == PGCTL_START_AUTO) + starttype = SERVICE_AUTO_START; + else + starttype = SERVICE_DEMAND_START; + if ((hService = CreateService(hSCM, register_servicename, register_servicename, SERVICE_ALL_ACCESS, SERVICE_WIN32_OWN_PROCESS, - SERVICE_AUTO_START, SERVICE_ERROR_NORMAL, + starttype, SERVICE_ERROR_NORMAL, pgwin32_CommandLine(true), NULL, NULL, "RPCSS\0", register_username, register_password)) == NULL) { @@ -1586,7 +1601,7 @@ do_help(void) printf(_(" %s killSIGNALNAME PID\n"), progname); #if defined(WIN32) || defined(__CYGWIN__) printf(_(" %s register [-N SERVICENAME] [-U USERNAME] [-P PASSWORD] [-D DATADIR]\n" -"[-w] [-t SECS] [-o \"OPTIONS\"]\n"), progname); +"[-S START-TYPE] [-w] [-t SECS] [-o \"OPTIONS\"]\n"), progname); printf(_(" %s unregister [-N SERVICENAME]\n"), progname); #endif @@ -1627,6 +1642,13 @@ do_help(void) printf(_(" -N SERVICENAME service name with which to register PostgreSQL server\n")); printf(_(" -P PASSWORD password of account to register PostgreSQL server\n")); printf(_(" -U USERNAME user name of account to register PostgreSQL server\n")); + printf(_(" -S START-TYPE service start type to register PostgreSQL server,\n" + " can be auto or demand\n")); + + printf(_("\nStart types are:\n")); + printf(_(" auto service start automatically during system startup\n")); + printf(_(" demand service start on demand\n")); + #endif printf(_("\nReport bugs to .\n")); @@ -1696,6 +1718,25 @@ set_sig(char *signame) +#if defined(WIN32) || defined(__CYGWIN__) +static void +set_starttype(char *starttypeopt) +{ + if (strcmp(starttypeopt, "a") == 0 || strcmp(starttypeopt, "auto") == 0) + win32_start_type = PGCTL_START_AUTO; + else if (strcmp(starttypeopt, "d") == 0 || strcmp(starttypeopt, "demand") == 0) + win32_start_type = PGCTL_START_DEMAND; + else + { + write_stderr(_("%s: unrecognized start type \"%s\"\n"), progname, starttypeopt); + do_advice(); + exit(1); + } +} +#endif + + + int main(int argc, char **argv) { @@ -1772,7 +1813,7 @@ main(int argc, char **argv) /* process command-line options */ while (optind < argc) { - while ((c = getopt_
Re: [HACKERS] refactoring comment.c
Excerpts from Robert Haas's message of mié ago 18 21:32:48 -0400 2010: > Here's v3. The header comment in objectaddress.c contains a funny mistake: it says it works with ObjectAddresses. However, ObjectAddresses is a different type altogether, so I recommend not using that as plural for ObjectAddress. Maybe "ObjectAddress objects"? :-D -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] wip: functions median and percentile
Greg Stark writes: > On Thu, Aug 19, 2010 at 11:59 AM, Pavel Stehule > wrote: >> I am sending a prototype implementation of functions median and >> percentile. This implementation is very simple and I moved it to >> contrib for this moment - it is more easy maintainable. Later I'll >> move it to core. > So if the entire result set fits in memory it would be nice to use the > O(n) Quickselect algorithm -- which would only be a small change to > the existing Quicksort code -- instead of sorting the entire set. That seems like rather a lot of added infrastructure for functions whose popularity is at best unknown. I think we should KISS for the first implementation. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] FTS wildcard and custom ispell dictionary problem
Hello, I am using PostgreSQL 8.4 full text search in following way: Custom FTS configuration called "dc2" with these dictionaries in following order for asciihword token: latvian_ispell, english_stem, russian_stem Latvian ispell dictionary contains words with different endings but same meaning (latvian langiage specifics, plural words, etc) The problem starts when using wildcard :* to_tsquery syntax. For example. If i look for the word "kriev" i am automatically adding wildcard using syntax: to_tsquery('dc2', 'kriev:*'); By searching kriev:* FTS founds word "krievs" in latvian_ispell dictionary which is totally okei. SELECT * from ts_debug('dc2', 'kriev:*'); alias | description | token | dictionaries | dictionary | lexemes ---+-+---+--++-- asciiword | Word, all ASCII | kriev | {latvian_ispell,english_stem,russian_stem} | latvian_ispell | {krievs} blank | Space symbols | :*| {} If understand correctly now database uses not kriev:* but krievs:* for following queries. And here is the problem, data contains also word: Krievija, and in this case search doesn't find it, because it looks for Krievs:* and not Kriev:* anymore. Is there any solution anone could suggest to get results by both criterias - kriev:* (starting query) and krievs:* (founded in ispell dict). Only idea i had is to somehow combine two tsqueries one - to_tsquery('dc2', 'kriev:*') and to_tsquery('english', 'kriev:*'); so the search looks for both - kriev:* and krievs:* but anyway didnt figured out any syntax i could use :( Thanks -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PL/pgsSQL EXECUTE USING INTO
Excerpts from Heikki Linnakangas's message of jue ago 19 04:29:19 -0400 2010: > While testing the recent issue with unknown params in EXECUTE USING, I > accidentally did this: > > postgres=# DO $$ > DECLARE >t text; > BEGIN >EXECUTE 'SELECT ''foo'' || $1' USING 'bar' INTO t; >RAISE NOTICE '%', t; > END; > $$; > NOTICE: > DO > > The mistake I made? I put the USING and INTO clauses in wrong order, > INTO needs to go first. We should throw an error on that, but it looks > like the INTO clause is just silently ignored. Can't we just accept either order? -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!
On Thu, Aug 19, 2010 at 17:08, Alvaro Herrera wrote: > Excerpts from Heikki Linnakangas's message of jue ago 19 02:02:34 -0400 2010: >> On 19/08/10 04:46, Robert Haas wrote: > >> > And so far we haven't seen a patch for that. >> > Somebody write one. And then let's get it reviewed and committed RSN. >> >> Fujii is on vacation, but I've started working on it. The two issues >> with Fujii's latest patch are that it would not respond promptly on >> platforms where signals don't interrupt sleep, and it suffers the >> classic race condition that pselect() was invented for. I'm going to >> replace pg_usleep() with select(), and use the so called "self-pipe >> trick" to get over the race condition. I have that written up but I want >> to do some testing and cleanup before posting the patch. > > Hmm, IIRC the self-pipe trick doesn't work on Windows, mainly because > select() doesn't handle pipes, only sockets. You may need some extra > hack to make it work there. We have a pipe implementation using sockets that is used on Windows for just this reason, IIRC. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Coerce 'unknown' type parameters to the right type in the
On Aug 19, 2010, at 8:08 AM, Robert Haas wrote: >> Another possibility is for EXECUTE USING to coerce any unknowns to TEXT >> before it calls the parser at all. This would square with the typical >> default assumption for unknown literals, and it would avoid having to >> have any semantics changes below the SPI call. > > That seems more intuitive than just chucking an error. It'd be nice if SPI itself could work this way for UNKNOWNs, too. Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!
Excerpts from Heikki Linnakangas's message of jue ago 19 02:02:34 -0400 2010: > On 19/08/10 04:46, Robert Haas wrote: > > And so far we haven't seen a patch for that. > > Somebody write one. And then let's get it reviewed and committed RSN. > > Fujii is on vacation, but I've started working on it. The two issues > with Fujii's latest patch are that it would not respond promptly on > platforms where signals don't interrupt sleep, and it suffers the > classic race condition that pselect() was invented for. I'm going to > replace pg_usleep() with select(), and use the so called "self-pipe > trick" to get over the race condition. I have that written up but I want > to do some testing and cleanup before posting the patch. Hmm, IIRC the self-pipe trick doesn't work on Windows, mainly because select() doesn't handle pipes, only sockets. You may need some extra hack to make it work there. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Coerce 'unknown' type parameters to the right type in the
On Thu, Aug 19, 2010 at 9:47 AM, Tom Lane wrote: > Another possibility is for EXECUTE USING to coerce any unknowns to TEXT > before it calls the parser at all. This would square with the typical > default assumption for unknown literals, and it would avoid having to > have any semantics changes below the SPI call. That seems more intuitive than just chucking an error. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] wip: functions median and percentile
On Thu, Aug 19, 2010 at 11:59 AM, Pavel Stehule wrote: > I am sending a prototype implementation of functions median and > percentile. This implementation is very simple and I moved it to > contrib for this moment - it is more easy maintainable. Later I'll > move it to core. So if the entire result set fits in memory it would be nice to use the O(n) Quickselect algorithm -- which would only be a small change to the existing Quicksort code -- instead of sorting the entire set. That should be possible both for percentile() and median(). It would be good to generalize the tuplesort api sufficiently so that you can implement this as a contrib module even if we eventually decide to integrate it in core. It's probably worth having two copies of the sort code, one for Quickselect and one for Quicksort just for speed, though I suppose it's worth benchmarking. But I'm not aware of a generalization of tapesort to allow O(n) selection with the sequential i/o properties of tapesort. It would be especially interesting, probably more so than the in-memory case. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: tuplestore, tuplesort aggregate functions
2010/8/19 David Fetter : > On Thu, Aug 19, 2010 at 12:45:13PM +0200, Pavel Stehule wrote: >> Hello >> >> > I'll test both variant first. Maybe there are not any significant >> > difference between them. Now nodeAgg can build, fill a tuplesort. >> > So I think is natural use it. It needs only one - skip a calling a >> > transident function and directly call final function with external >> > tuplesort. Minimally you don't need 2x same code. >> >> yesterday I did a small test. Aggregates without transident >> functions are only about 2% faster, so there has no sense thinking >> more about them. I'll send a patch with median and percentile >> functions immediately - these functions are implemented like usual >> aggregates. > > NTILE is already a windowing function. Might want to check into any > performance improvements you can give that. The performance has to be +/- same. It is based on same technology - tuplesort. > > As to median, please make sure you say in detail which median you're > using and name it so, as there is no single, authoritative median. I searched about this. And I found so there is a different methods for same statistic value with different names. But result has to be same. I don't found a other median than "arithmetic" or "financial" median (with a two derived forms - left, right median). These methods are a different forms of some SQL - see http://www.simple-talk.com/sql/t-sql-programming/median-workbench/ and it is SQL related solutions. With tuplesort I can to use a simple solution - I have a number of elements, have a sorted set and can move to n position is set. Next forms of median what I found are estimations. regards Pavel > > Cheers, > David. > -- > David Fetter http://fetter.org/ > Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter > Skype: davidfetter XMPP: david.fet...@gmail.com > iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics > > Remember to vote! > Consider donating to Postgres: http://www.postgresql.org/about/donate > -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Coerce 'unknown' type parameters to the right type in the
Heikki Linnakangas writes: > I'm starting to wonder if it's worth enforcing the rule that all unknown > Params must be coerced to the same target type. We could just document > the behavior. Or maybe we should revert the whole thing, and add a check > to PL/pgSQL EXECUTE USING code to just throw a nicer error message if > you pass an unknown parameter in the USING clause. +1 for the latter. There's no good reason to be passing unknowns to USING. I'm also getting more and more uncomfortable with the amount of new behavior that's being pushed into an existing SPI call. Another possibility is for EXECUTE USING to coerce any unknowns to TEXT before it calls the parser at all. This would square with the typical default assumption for unknown literals, and it would avoid having to have any semantics changes below the SPI call. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: tuplestore, tuplesort aggregate functions
On Thu, Aug 19, 2010 at 12:45:13PM +0200, Pavel Stehule wrote: > Hello > > > I'll test both variant first. Maybe there are not any significant > > difference between them. Now nodeAgg can build, fill a tuplesort. > > So I think is natural use it. It needs only one - skip a calling a > > transident function and directly call final function with external > > tuplesort. Minimally you don't need 2x same code. > > yesterday I did a small test. Aggregates without transident > functions are only about 2% faster, so there has no sense thinking > more about them. I'll send a patch with median and percentile > functions immediately - these functions are implemented like usual > aggregates. NTILE is already a windowing function. Might want to check into any performance improvements you can give that. As to median, please make sure you say in detail which median you're using and name it so, as there is no single, authoritative median. Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!
Heikki Linnakangas writes: > BTW, on what platforms signals don't interrupt sleep? Although that > issue has been discussed many times before, I couldn't find any > reference to a real platform in the archives. I've got one in captivity (my old HPUX box). Happy to test whatever you come up with. Considering that pg_usleep is implemented with select, I'm not following what you mean by "replace pg_usleep() with select()"? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PL/pgsSQL EXECUTE USING INTO
On Thu, Aug 19, 2010 at 4:29 AM, Heikki Linnakangas wrote: > While testing the recent issue with unknown params in EXECUTE USING, I > accidentally did this: > > postgres=# DO $$ > DECLARE > t text; > BEGIN > EXECUTE 'SELECT ''foo'' || $1' USING 'bar' INTO t; > RAISE NOTICE '%', t; > END; > $$; > NOTICE: > DO > > The mistake I made? I put the USING and INTO clauses in wrong order, INTO > needs to go first. We should throw an error on that, but it looks like the > INTO clause is just silently ignored. Another option would be to make it work as expected. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] wip: functions median and percentile
Hello I am sending a prototype implementation of functions median and percentile. This implementation is very simple and I moved it to contrib for this moment - it is more easy maintainable. Later I'll move it to core. These functions are relative simple, there are not barrier for implementation own specific mutations of this functions - so I propose move to core only basic and well known form of these to core. postgres=# select median(v) from generate_series(1,10) g(v); median 5.5 (1 row) Time: 1.475 ms postgres=# select percentile(v,50) from generate_series(1,10) g(v); percentile 5 (1 row) Time: 0.626 ms This implementation is based on tuplesort and the speed is relative well - the result from 100 rows is less 1 sec. Regards Pavel Stehule *** ./contrib/median/Makefile.orig 2010-08-19 12:38:56.144777253 +0200 --- ./contrib/median/Makefile 2010-08-18 20:23:39.180156339 +0200 *** *** 0 --- 1,17 + # $PostgreSQL: pgsql/contrib/median/Makefile,v 1.1 2008/07/29 18:31:20 tgl Exp $ + + MODULES = median + DATA_built = median.sql + DATA = uninstall_median.sql + REGRESS = median + + ifdef USE_PGXS + PG_CONFIG = pg_config + PGXS := $(shell $(PG_CONFIG) --pgxs) + include $(PGXS) + else + subdir = contrib/median + top_builddir = ../.. + include $(top_builddir)/src/Makefile.global + include $(top_srcdir)/contrib/contrib-global.mk + endif *** ./contrib/median/median.c.orig 2010-08-19 12:39:01.456650776 +0200 --- ./contrib/median/median.c 2010-08-19 12:35:32.104649418 +0200 *** *** 0 --- 1,244 + /* + * $PostgreSQL: pgsql/contrib/citext/citext.c,v 1.2 2009/06/11 14:48:50 momjian Exp $ + */ + #include "postgres.h" + + #include "funcapi.h" + #include "miscadmin.h" + #include "catalog/pg_type.h" + #include "parser/parse_coerce.h" + #include "parser/parse_oper.h" + #include "utils/builtins.h" + #include "utils/tuplesort.h" + + Datum median_transfn(PG_FUNCTION_ARGS); + Datum median_finalfn(PG_FUNCTION_ARGS); + Datum percentile_transfn(PG_FUNCTION_ARGS); + Datum percentile_finalfn(PG_FUNCTION_ARGS); + + + #ifdef PG_MODULE_MAGIC + PG_MODULE_MAGIC; + #endif + + PG_FUNCTION_INFO_V1(median_transfn); + PG_FUNCTION_INFO_V1(median_finalfn); + PG_FUNCTION_INFO_V1(percentile_transfn); + PG_FUNCTION_INFO_V1(percentile_finalfn); + + + typedef struct + { + int nelems; /* number of valid entries */ + Tuplesortstate *sortstate; + FmgrInfo cast_func_finfo; + int p; /* nth for percentille */ + } StatAggState; + + static StatAggState * + makeStatAggState(FunctionCallInfo fcinfo) + { + MemoryContext oldctx; + MemoryContext aggcontext; + StatAggState *aggstate; + Oid sortop, + castfunc; + Oid valtype; + CoercionPathType pathtype; + + if (!AggCheckCallContext(fcinfo, &aggcontext)) + { + /* cannot be called directly because of internal-type argument */ + elog(ERROR, "string_agg_transfn called in non-aggregate context"); + } + + oldctx = MemoryContextSwitchTo(aggcontext); + + aggstate = (StatAggState *) palloc(sizeof(StatAggState)); + aggstate->nelems = 0; + + valtype = get_fn_expr_argtype(fcinfo->flinfo, 1); + get_sort_group_operators(valtype, + true, false, false, + &sortop, NULL, NULL); + + aggstate->sortstate = tuplesort_begin_datum(valtype, + sortop, + SORTBY_NULLS_DEFAULT, + work_mem, false); + + MemoryContextSwitchTo(oldctx); + + if (valtype != FLOAT8OID) + { + /* find a cast function */ + + pathtype = find_coercion_pathway(FLOAT8OID, valtype, + COERCION_EXPLICIT, + &castfunc); + if (pathtype == COERCION_PATH_FUNC) + { + Assert(OidIsValid(castfunc)); + fmgr_info_cxt(castfunc, &aggstate->cast_func_finfo, + aggcontext); + } + else if (pathtype == COERCION_PATH_RELABELTYPE) + { + aggstate->cast_func_finfo.fn_oid = InvalidOid; + } + else + elog(ERROR, "no conversion function from %s %s", + format_type_be(valtype), + format_type_be(FLOAT8OID)); + } + + return aggstate; + } + + /* + * append a non NULL value to tuplesort + */ + Datum + median_transfn(PG_FUNCTION_ARGS) + { + StatAggState *aggstate; + + aggstate = PG_ARGISNULL(0) ? NULL : (StatAggState *) PG_GETARG_POINTER(0); + + if (!PG_ARGISNULL(1)) + { + if (aggstate == NULL) + aggstate = makeStatAggState(fcinfo); + + tuplesort_putdatum(aggstate->sortstate, PG_GETARG_DATUM(1), false); + aggstate->nelems++; + } + + PG_RETURN_POINTER(aggstate); + } + + static double + to_double(Datum value, FmgrInfo *cast_func_finfo) + { + if (cast_func_finfo->fn_oid != InvalidOid) + { + return DatumGetFloat8(FunctionCall1(cast_func_finfo, value)); + } + else + return DatumGetFloat8(value); + } + + Datum + median_finalfn(PG_FUNCTION_ARGS) + { + StatAggState *aggstate; + + Assert(AggCheckCallContext(fcinfo, NULL)); + + aggstate = PG_ARGISNULL(0) ? NULL : (StatAggState *) PG_GETARG_POINTER(0); + + if (aggst
Re: [HACKERS] proposal: tuplestore, tuplesort aggregate functions
Hello > > I'll test both variant first. Maybe there are not any significant > difference between them. Now nodeAgg can build, fill a tuplesort. So I > think is natural use it. It needs only one - skip a calling a > transident function and directly call final function with external > tuplesort. Minimally you don't need 2x same code. yesterday I did a small test. Aggregates without transident functions are only about 2% faster, so there has no sense thinking more about them. I'll send a patch with median and percentile functions immediately - these functions are implemented like usual aggregates. Regards Pavel > > Regards > > Pavel Stehule > >> regards, tom lane >> > -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] git: uh-oh
On Thu, Aug 19, 2010 at 07:00, Michael Haggerty wrote: > Magnus Hagander wrote: >> Is there some way to make cvs2git work this way, and just not bother >> even trying to create merge commits, or is that fundamentally >> impossible and we need to look at another tool? > > The good news: (I just reminded myself/realized that) Max Bowsher has > already implemented pretty much exactly what you want in the cvs2svn > trunk version, including noting in the commit messages any cherry-picks > that are not reflected in the repo ancestry. Ah, that's great. > The bad news: It is broken [1]. But I don't think it should be too much > work to fix it. That's less great of course, but it gives hope! Thanks for your continued efforts! -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] PL/pgsSQL EXECUTE USING INTO
While testing the recent issue with unknown params in EXECUTE USING, I accidentally did this: postgres=# DO $$ DECLARE t text; BEGIN EXECUTE 'SELECT ''foo'' || $1' USING 'bar' INTO t; RAISE NOTICE '%', t; END; $$; NOTICE: DO The mistake I made? I put the USING and INTO clauses in wrong order, INTO needs to go first. We should throw an error on that, but it looks like the INTO clause is just silently ignored. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [COMMITTERS] pgsql: Coerce 'unknown' type parameters to the right type in the
On 18/08/10 18:03, Heikki Linnakangas wrote: On 18/08/10 16:57, Tom Lane wrote: hei...@postgresql.org (Heikki Linnakangas) writes: Log Message: --- Coerce 'unknown' type parameters to the right type in the fixed-params parse_analyze() function. That case occurs e.g with PL/pgSQL EXECUTE ... USING 'stringconstant'. The coercion with a CoerceViaIO node. The result is similar to the coercion via input function performed for unknown constants in coerce_type(), except that this happens at runtime. Unfortunately, this entirely fails to enforce the rule that an unknown Param be coerced the same way everywhere. You'd need a cleanup pass as well, cf check_variable_parameters(). Yeah, you're right. I'll find a way to do the cleanup pass in fixed params case too. It turned out to be messier than I imagined, but I have a working patch now. It still doesn't behave exactly like the variable params case, though. To wit: postgres=# DO $$ declare t text; begin EXECUTE 'SELECT 1+ $1, $1' INTO t USING '123' ; RAISE NOTICE '%', t; end; $$; ERROR: could not determine data type of parameter $1 LINE 1: SELECT 1+ $1, $1 ^ QUERY: SELECT 1+ $1, $1 CONTEXT: PL/pgSQL function "inline_code_block" line 5 at EXECUTE statement The varparams code doesn't thrown an error on that. At the first reference to $1, the parameter is resolved to int4. On the 2nd reference, it's an int4 and there's nothing to force coercion to anything else, so it's returned as an int4. In the fixed params case, however, that throws an error. We could make it behave the same if we really wanted to, but that would add even more code. I'm starting to wonder if it's worth enforcing the rule that all unknown Params must be coerced to the same target type. We could just document the behavior. Or maybe we should revert the whole thing, and add a check to PL/pgSQL EXECUTE USING code to just throw a nicer error message if you pass an unknown parameter in the USING clause. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c index 6b99a10..2cb9f31 100644 --- a/src/backend/parser/analyze.c +++ b/src/backend/parser/analyze.c @@ -92,6 +92,10 @@ parse_analyze(Node *parseTree, const char *sourceText, query = transformStmt(pstate, parseTree); + /* make sure all is well with parameter types */ + if (numParams > 0) + check_fixed_parameters(pstate, query); + free_parsestate(pstate); return query; diff --git a/src/backend/parser/parse_param.c b/src/backend/parser/parse_param.c index 60917f4..7cb34c2 100644 --- a/src/backend/parser/parse_param.c +++ b/src/backend/parser/parse_param.c @@ -59,7 +59,10 @@ static Node *variable_coerce_param_hook(ParseState *pstate, Param *param, static Node *fixed_coerce_param_hook(ParseState *pstate, Param *param, Oid targetTypeId, int32 targetTypeMod, int location); -static bool check_parameter_resolution_walker(Node *node, ParseState *pstate); +static bool check_fixed_parameter_resolution_walker(Node *node, + ParseState *pstate); +static bool check_variable_parameter_resolution_walker(Node *node, + ParseState *pstate); /* @@ -322,6 +325,135 @@ variable_coerce_param_hook(ParseState *pstate, Param *param, return NULL; } + +/* + * Check for consistent assignment of unknown parameters after completion + * of parsing with parse_fixed_parameters. + * + * Note: this code intentionally does not check that all parameter positions + * were used, nor that all got non-UNKNOWN types assigned. Caller of parser + * should enforce that if it's important. + */ +void +check_fixed_parameters(ParseState *pstate, Query *query) +{ + FixedParamState *parstate = (FixedParamState *) pstate->p_ref_hook_state; + + /* + * If parse_fixed_parameters() didn't resolve any unknown types, there's + * nothing to do. + */ + if (parstate->unknownParamTypes != NULL) + (void) query_tree_walker(query, + check_fixed_parameter_resolution_walker, + (void *) pstate, 0); +} + +/* + * Traverse a fully-analyzed tree to verify that all references to unknown + * Params are coerced to the same type. Although we check in + * fixed_coerce_param_hook() that an unknown Param is not coerced to different + * types at different locations in the query, some Params might still be + * uncoerced, if there wasn't anything to force their coercion, and yet other + * instances seen later might have gotten coerced. + */ +static bool +check_fixed_parameter_resolution_walker(Node *node, ParseState *pstate) +{ + FixedParamState *parstate = (FixedParamState *) pstate->p_ref_hook_state; + + if (node == NULL) + return false; + + /* + * Check if this is a CoerceViaIO(Param of type 'unknown') construct, + * created by parse_fixed_parameters(). In theory, it could be a similar + * construct created by other means, but that doesn't currently happen; + * unknown Params needing coerc