[HACKERS] [FeatureRequest] Base Convert Function
Hi! Is there possibility of having internal base converting function in PgSQL? There are already functions for converting between decimal and hexadecimal notations i think pgsql can be able to convert between number with radixes from 1 to 36 (actually fast (de)encoding base36 is what i need)... thx -- S pozdravem Best regards Tomáš Mudruňka - Spoje.net / Arachne Labs XMPP/Jabber: har...@jabbim.cz, ICQ: 283782978 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_ctl and port number detection
Florian Pflug wrote: > On Dec21, 2010, at 03:04 , Bruce Momjian wrote: > > Tom Lane wrote: > >> Actually, if we're going to do this at all, we should do > >> > >>pid > >>datadir > >>port > >>socketdir > >>... here be dragons ... > >> > >> so that pg_ctl doesn't have to assume the server is running with a > >> default value of unix_socket_dir. Not sure what to put in the fourth > >> line on Windows though ... maybe just leave it empty? > > > > I am curious about the use of the socketdir. What can that be used for? > > If it's non-default and you want to connect via unix sockets, just knowing > the port won't help much. Ah, so pg_ctl is going to use that information too --- good point! -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_ctl and port number detection
On Dec21, 2010, at 03:04 , Bruce Momjian wrote: > Tom Lane wrote: >> Actually, if we're going to do this at all, we should do >> >> pid >> datadir >> port >> socketdir >> ... here be dragons ... >> >> so that pg_ctl doesn't have to assume the server is running with a >> default value of unix_socket_dir. Not sure what to put in the fourth >> line on Windows though ... maybe just leave it empty? > > I am curious about the use of the socketdir. What can that be used for? If it's non-default and you want to connect via unix sockets, just knowing the port won't help much. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal : cross-column stats
On Dec18, 2010, at 17:59 , Tomas Vondra wrote: > It seems to me you're missing one very important thing - this was not > meant as a new default way to do estimates. It was meant as an option > when the user (DBA, developer, ...) realizes the current solution gives > really bad estimates (due to correlation). In that case he could create > 'cross-column' statistics on those columns, and the optimizer would use > that info to do the estimates. I do understand that. I just have the nagging feeling that there is a way to judge from dist(A), dist(B) and dist(A,B) whether it makes sense to apply the uniform bayesian approach or to assume the columns are unrelated. I play with this for a bit over the weekend, but unfortunately ran out of time. So I'm writing up what I found, to prevent it from getting lost. I tried to pick up Robert's idea of quantifying "Implicativeness" - i.e., finding a number between 0 and 1 that describes how close the (A,B) are to representing a function A -> B. Observe that dist(A),dist(B) <= dist(A,B) <= dist(A)*dist(B) if the estimates of dist(?) are consistent. From that you easily get dist(A,B)/dist(B) <= dist(A) <= dist(A,B) and dist(A,B)/dist(A) <= dist(B) <= dist(A,B) If dist(A) == dist(A,B), then there is a functional dependency A -> B, and conversely if dist(B) == dist(A,B) there is a functional dependency B -> A. Note that you can have both at the same time! On the other hand, if dist(B) = dist(A,B)/dist(A), then B has the smallest number of distinct values possible for a given combination of dist(A,B) and dist(A). This is the anti-function case. This motivates the definition F(A,B) = [ dist(A)*dist(B) - dist(A,B) ] / [ dist(A,B) * ( dist(B) - 1) ] (You can probably drop the "-1", it doesn't make much of a difference for larger values of dist(B). F(A,B) specifies where dist(A) lies relative to dist(A,B)/dist(B) and dist(A,B) - a value of 0 indicates dist(A) = dist(A,B)/dist(B) while a value of 1 indicates that dist(A) == dist(A,B). So F(A,B) is a suitable measure of "Implicativeness" - it's higher if the table (A,B) looks more like a function A -> B. You might use that to decide if either A->B or B->a looks function-like enough to use the uniform bayesian approach. Or you might even go further, and decide *with* bayesian formula to use - the paper you cited always averages P(A=x|B=y)*P(B=y) and P(B=y|A=x)*P(A=x) but they offer no convincing reason for that other than "We don't know which to pick". I'd like to find a statistical explanation for that definition of F(A,B), but so far I couldn't come up with any. I created a Maple 14 worksheet while playing around with this - if you happen to have a copy of Maple available I'd be happy to send it to you.. This is what I got so far - I hope it may prove to be of use somehow. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_ctl and port number detection
Tom Lane wrote: > Actually, if we're going to do this at all, we should do > > pid > datadir > port > socketdir > ... here be dragons ... > > so that pg_ctl doesn't have to assume the server is running with a > default value of unix_socket_dir. Not sure what to put in the fourth > line on Windows though ... maybe just leave it empty? I am curious about the use of the socketdir. What can that be used for? -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable lock consistency
On Mon, Dec 20, 2010 at 7:19 PM, Florian Pflug wrote: >> Space in the tuple header is >> precious, and I am not at all convinced that we should be willing to >> surrender any for this. > > Thats a pretty tight space to maneuver in, though. So tight, in fact, > that I may as well give up, barring some absolutely genius idea, which > I don't even know where to look for at the moment. > > Every feature has its price, and if giving up on completely hypothetical > future savings is too costly, then surely anything else I might suggest > is too :-( Of the ideas proposed so far, the idea of somehow making use of the existing multi-xid machinery to do this seems the most promising to me. But I haven't yet understood exactly what you're proposing, or fully thought through the performance implications, which is obviously something that needs to happen, and if that doesn't pan out, then, as I said upthread and you said here, yeah, we may need a new idea. It would be useful if some other folks weighed in on this, too. I understand what behavior we're trying to get here and why we want that, but I don't necessarily have the greatest understanding of all the details of the on-disk format. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extensions, patch 22 (cleanup, review, cleanup)
>> On Mon, December 20, 2010 22:35, Dimitri Fontaine wrote: > > I might be mistaken but it looks like a doc/src/sgml/ref/alter_extension.sgml is missing? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_ctl and port number detection
Tom Lane wrote: > Bruce Momjian writes: > > Tom Lane wrote: > >> No. If it goes in, it should go in as the third line. The shmem key > >> data is private to the server --- we do not want external programs > >> assuming anything at all about the private part of postmaster.pid. > > > OK, so you are suggesting having it as a third value on the third line? > > > 10231 > > /u/pgsql/data > > 5432001 45481984 port_here > > ^ > > I'm not sure why you're having such a hard time grasping this concept. > We do not want pg_ctl looking at the shmem key information, not even to > the extent of assuming a particular format for it. Therefore the port > number has to go before it not after it. What I'm thinking of is > > pid > datadir > port > ... here be dragons ... > > Actually, if we're going to do this at all, we should do > > pid > datadir > port > socketdir > ... here be dragons ... > > so that pg_ctl doesn't have to assume the server is running with a > default value of unix_socket_dir. Not sure what to put in the fourth > line on Windows though ... maybe just leave it empty? OK. I was hesitant to modify the existing postmaster.pid format and was trying to just add on the end. It is certainly easier to put it before the shared memory stuff. I will work on a patch. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable lock consistency
On Dec21, 2010, at 00:08 , Robert Haas wrote: > My previously expressed concern about (C) wasn't based on ugliness, > but rather on my believe that there is likely a whole lot of code > which relies on the CTID being a self-link when no UPDATE has > occurred. We'd have to be confident that all such cases had been > found and fixed, which might not be easy to be confident about. Thats certainly a valid concern, and one that a credible proposal will have to address. > The obvious question is - if we can overlay CTID in some > situations, is there a better use for that than this? Just to throw > out something that might be totally impractical, maybe we could get > rid of XMAX. If the tuple hasn't been updated, the CTID field stores > the XMAX; if it HAS been updated, the CTID points to the successor > tuple, whose XMIN is our XMAX. I'm sure there are a bunch of reasons > why that doesn't actually work - I can think of some of them myself - > but the point is that there's an opportunity cost to stealing those > bits. Once they're dedicated to this purpose, they can't ever be > dedicated to any other purpose. Yes, there is an opportunity cost there, I can't argue with that. > Space in the tuple header is > precious, and I am not at all convinced that we should be willing to > surrender any for this. Thats a pretty tight space to maneuver in, though. So tight, in fact, that I may as well give up, barring some absolutely genius idea, which I don't even know where to look for at the moment. Every feature has its price, and if giving up on completely hypothetical future savings is too costly, then surely anything else I might suggest is too :-( > We have to believe not only that this change > is good, but also that it's more good than some other purpose to which > that bit space could potentially be put. Unfortunately, I'm running out of arguments for why *is* important and *is* worth paying a price. I've been forced to simply give up on making some database-side constraint enforcement 100% waterproof in the past. That bugged me greatly each time, and each time weakened my case when I tried to explain to client why enforcing such things in the database is a Good Thing. I therefore have a hard time trying to understand why people mostly seem to regard this is a non-issue or merely a nice-to-have. Regarding the sub-transaction vs. locking issue - I haven't been bitten by this personally, since I tend to avoid using sub-transactions at all. But if I did, I'd feel even stronger about this, since it's clearly a bug. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extensions, patch v20 (bitrot fixes)
On Tue, Dec 21, 2010 at 08:04, Martijn van Oosterhout wrote: > On Mon, Dec 20, 2010 at 10:15:56PM +0100, Nicolas Barbier wrote: >> >From >> >http://en.wikipedia.org/wiki/Japanese_language_and_computers#Character_encodings>: > ISTM that since all the mapping tables are public it should be a SMOP > to *prove* roundtrip conversions are safe, or identify the problems. Another issue in Japanese users is EUDC (End User Defined Character). Unfortunately for both postgres developers and application developers in Japan, many machine dependence characters are still used in popular mobile phones in Japan. Their native encoding is SHIFT_JIS, and we have an EUDC mapping for SHIFT_JIS to/from EUC_JP. But we don't have for UTF-8 to/from other encodings. That is one of the reasons why we cannot move to the UTF-8 world completely. Imagine that a module that manipulate EUDC text. It will be written in EUC_JP because SHIFT_JIS is not supported in postgres. Also, it cannot be rewritten in UTF-8 because there are no mapping for the characters used in it. -- Itagaki Takahiro -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable lock consistency
On Mon, Dec 20, 2010 at 5:32 PM, Florian Pflug wrote: > On Dec20, 2010, at 18:54 , Robert Haas wrote: >> On Mon, Dec 20, 2010 at 12:49 PM, Florian Pflug wrote: >>> For me, this is another very good reason to explore this further. Plus, it >>> improves the ratio of grotty-ness vs. number-of-problems-soved ;-) >> >> By all means, look into it further. I fear the boat is filling up >> with water, but if you manage to come up with a workable solution I'll >> be as happy as anyone, promise! > > I'll try to create a details proposal. To do that, however, I'll require > some guidance on whats acceptable and whats not. > > Here's a summary of the preceding discussion > > To deal with aborted transactions correctly, we need to track the last > locker of a particular tuple that actually committed. If we also want > to fix the bug that causes a row lock to be lost upon doing > lock;savepoint;update;restore that "latest committed locker" will > sometimes need to be a set, since it'll need to store the outer > transaction's xid as well as the latest actually committed locker. > > As long as no transaction aborts are involved, the tuple's xmax > contains all the information we need. If a transaction updates, > deletes or locks a row, the previous xmax is overwritten. If the > transaction later aborts, we cannot decide whether it has previously > been locked or not. > > And these ideas have come up > > A) Transactions who merely lock a row could put the previous > locker's xid (if >= GlobalXmin) *and* their own xid into a multi-xid, > and store that in xmax. For shared locks, this merely means cleaning > out the existing multi-xid a bit less aggressively. There's > no risk of bloat there, since we only need to keep one committed > xid, not all of them. For exclusive locks, we currently never > create a multi-xid. That'd change, we'd need to create one > if we find a previous locker with an xid >= GlobalXmin. This doesn't > solve the UPDATE and DELETE cases. For SELECT-FOR-SHARE this > is probably the best option, since it comes very close to what > we do currently. > > B) A transaction who UPDATEs or DELETEs a tuple could create an > intermediate lock-only tuple which'd contain the necessary > information about previous lock holders. We'd only need to do > that if there actually is one with xid >= GlobalXmin. We could > then choose whether to do the same for SELECT-FOR-UPDATE, or > whether we'd prefer to go with (A) > > C) The ctid field is only necessary for updated tuples. We could thus > overlay it with a field which stores the last committed locker after > a DELETE. UPDATEs could be handled either as in (B), or by storing the > information in the ctid-overlay in the *new* tuple. SELECT-FOR-UPDATE > could again either also use the ctid overlay or use (A). > > D) We could add a new tuple header field xlatest. To support binary > upgrade, we'd need to be able to read tuples without that field > also. We could then either create a new tuple version upon the > first lock request to such a tuple (which would then include the > new header), or we could simply raise a serialization error if > a serializable transaction tried to update a tuple without the > field whose xmax was aborted and >= GlobalXmin. > > I have the nagging feeling that (D) will meet quite some resistance. (C) was > too well received either, though I wonder if that'd change if the grotty-ness > was hidden behind a API, much xvac/cmin/cmax overlay is. (B) seems like a > lot of overhead, but maybe cleaner. More research is needed though to check > how it'd interact with HOT and how to get the locking right. (A) is IMHO the > best solution for the SELECT-FOR-SHARE since it's very close to what we do > today. > > Any comments? Especially of the "don't you dare" kind? I think any solution based on (D) has zero chance of being accepted, and it wouldn't be that high except that probabilities can't be negative. Unfortunately, I don't understand (A) or (B) well enough to comment intelligently. My previously expressed concern about (C) wasn't based on ugliness, but rather on my believe that there is likely a whole lot of code which relies on the CTID being a self-link when no UPDATE has occurred. We'd have to be confident that all such cases had been found and fixed, which might not be easy to be confident about. I have a more "meta" concern, too. Let's suppose (just for example) that you write some absolutely brilliant code which makes it possible to overlay CTID and which has the further advantage of being stunningly obvious, so that we have absolute confidence that it is correct. The obvious question is - if we can overlay CTID in some situations, is there a better use for that than this? Just to throw out something that might be totally impractical, maybe we could get rid of XMAX. If the tuple hasn't been updated, the CTID field stores the XMAX; if it HAS been updated, the CTID points to the successor
Re: [HACKERS] Extensions, patch v20 (bitrot fixes)
On Mon, Dec 20, 2010 at 10:15:56PM +0100, Nicolas Barbier wrote: > >From > >http://en.wikipedia.org/wiki/Japanese_language_and_computers#Character_encodings>: > > "Unicode is supposed to solve all encoding problems in all languages > of the world. [..] There are still controversies. For Japanese, the > kanji characters have been unified with Chinese, that is a character > considered to be the same in both Japanese and Chinese have been given > one and the same code number in Unicode, even if they look a little > different. This process, called Han unification, has caused > controversy." From http://en.wikipedia.org/wiki/CJK_Unified_Ideographs: "However, the source separation rule states that characters encoded separately in an earlier character set would remain separate in the new Unicode encoding." From all the references I've seen this has been applied everywhere and any failures to roundtrip conversions are considered bugs and I can't believe that at this point they havn't all been fixed. This is kind of underscored by the fact that references always point to theoretical problems rather than actual lists of characters that can't be converted. ISTM that since all the mapping tables are public it should be a SMOP to *prove* roundtrip conversions are safe, or identify the problems. Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > Patriotism is when love of your own people comes first; nationalism, > when hate for people other than your own comes first. > - Charles de Gaulle signature.asc Description: Digital signature
Re: [HACKERS] Extensions, patch 22 (cleanup, review, cleanup)
Excerpts from Dimitri Fontaine's message of lun dic 20 18:35:44 -0300 2010: > Hi, > > From last round of review from Robert and Álvaro, here's the patch > version 22. Changes: I noticed this bit in the docs: The admin function pg_extension_flag_dump can be used to revert the default pg_dump policy about objects that belong to an extension and force the flagged objects to be part of the backups. However, the code to that function contains this bit: /* * CREATE EXTENSION is superuser only and we check we're in that code * path, so we don't add another explicit check for superuser here. */ if (!create_extension) ereport(ERROR, (errmsg("this function can only be used from CREATE EXTENSION"))); So presumably this shouldn't be documented because it cannot be called anyway? To be honest I don't understand the purpose of this part of the patch. I attach some minor fixes while reading it over. I compiled but didn't run it :-) -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support extension-fixes.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable lock consistency
On Dec20, 2010, at 18:54 , Robert Haas wrote: > On Mon, Dec 20, 2010 at 12:49 PM, Florian Pflug wrote: >> For me, this is another very good reason to explore this further. Plus, it >> improves the ratio of grotty-ness vs. number-of-problems-soved ;-) > > By all means, look into it further. I fear the boat is filling up > with water, but if you manage to come up with a workable solution I'll > be as happy as anyone, promise! I'll try to create a details proposal. To do that, however, I'll require some guidance on whats acceptable and whats not. Here's a summary of the preceding discussion To deal with aborted transactions correctly, we need to track the last locker of a particular tuple that actually committed. If we also want to fix the bug that causes a row lock to be lost upon doing lock;savepoint;update;restore that "latest committed locker" will sometimes need to be a set, since it'll need to store the outer transaction's xid as well as the latest actually committed locker. As long as no transaction aborts are involved, the tuple's xmax contains all the information we need. If a transaction updates, deletes or locks a row, the previous xmax is overwritten. If the transaction later aborts, we cannot decide whether it has previously been locked or not. And these ideas have come up A) Transactions who merely lock a row could put the previous locker's xid (if >= GlobalXmin) *and* their own xid into a multi-xid, and store that in xmax. For shared locks, this merely means cleaning out the existing multi-xid a bit less aggressively. There's no risk of bloat there, since we only need to keep one committed xid, not all of them. For exclusive locks, we currently never create a multi-xid. That'd change, we'd need to create one if we find a previous locker with an xid >= GlobalXmin. This doesn't solve the UPDATE and DELETE cases. For SELECT-FOR-SHARE this is probably the best option, since it comes very close to what we do currently. B) A transaction who UPDATEs or DELETEs a tuple could create an intermediate lock-only tuple which'd contain the necessary information about previous lock holders. We'd only need to do that if there actually is one with xid >= GlobalXmin. We could then choose whether to do the same for SELECT-FOR-UPDATE, or whether we'd prefer to go with (A) C) The ctid field is only necessary for updated tuples. We could thus overlay it with a field which stores the last committed locker after a DELETE. UPDATEs could be handled either as in (B), or by storing the information in the ctid-overlay in the *new* tuple. SELECT-FOR-UPDATE could again either also use the ctid overlay or use (A). D) We could add a new tuple header field xlatest. To support binary upgrade, we'd need to be able to read tuples without that field also. We could then either create a new tuple version upon the first lock request to such a tuple (which would then include the new header), or we could simply raise a serialization error if a serializable transaction tried to update a tuple without the field whose xmax was aborted and >= GlobalXmin. I have the nagging feeling that (D) will meet quite some resistance. (C) was too well received either, though I wonder if that'd change if the grotty-ness was hidden behind a API, much xvac/cmin/cmax overlay is. (B) seems like a lot of overhead, but maybe cleaner. More research is needed though to check how it'd interact with HOT and how to get the locking right. (A) is IMHO the best solution for the SELECT-FOR-SHARE since it's very close to what we do today. Any comments? Especially of the "don't you dare" kind? best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extensions, patch 22 (cleanup, review, cleanup)
On Mon, December 20, 2010 22:55, Erik Rijkers wrote: > On Mon, December 20, 2010 22:35, Dimitri Fontaine wrote: > > During configure I spotted this: > > [...] > checking for bison... /usr/bin/bison > configure: using bison (GNU Bison) 2.3 > gawk: { if ($4 < 1.875-extension) exit 0; else exit 1;} > gawk: ^ syntax error > gawk: { if ($4 < 1.875-extension) exit 0; else exit 1;} > gawk: ^ syntax error > checking for flex... /usr/local/bin/flex > configure: using flex 2.5.3 > [...] > Apologies - please ignore the above ... It turns out this is an artifact of my build script hacking of the configure file. > Otherwise the patch applies, and compiles, checks, installs and runs OK. > (obviously I haven't > tested anything yet) > > Linux Centos 5.4 > > Erik Rijkers > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extensions, patch 22 (cleanup, review, cleanup)
On Mon, December 20, 2010 22:35, Dimitri Fontaine wrote: > Hi, > > From last round of review from Robert and �lvaro, here's the patch > version 22. Changes: > > - cleanup contrib/ and 'Adjust search_path' comments > - remove contrib/*/uninstall* scripts > - add some documentation to the NO USER DATA option > - remove objects locking in the pg_extension_objects() SRF, per Robert > - propose split patches > I used the 'full' patch. During configure I spotted this: [...] checking for bison... /usr/bin/bison configure: using bison (GNU Bison) 2.3 gawk: { if ($4 < 1.875-extension) exit 0; else exit 1;} gawk: ^ syntax error gawk: { if ($4 < 1.875-extension) exit 0; else exit 1;} gawk: ^ syntax error checking for flex... /usr/local/bin/flex configure: using flex 2.5.3 [...] Otherwise the patch applies, and compiles, checks, installs and runs OK. (obviously I haven't tested anything yet) Linux Centos 5.4 Erik Rijkers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extensions, patch v20 (bitrot fixes)
2010/12/20 Martijn van Oosterhout : > On Mon, Dec 20, 2010 at 09:03:56AM +0900, Itagaki Takahiro wrote: > >> UTF-8 is not a superset of all encodings. > > I think you mean Unicode is not a superset of all character sets. I've > heard this before but never found what's missing. [citation needed]? >From >http://en.wikipedia.org/wiki/Japanese_language_and_computers#Character_encodings>: "Unicode is supposed to solve all encoding problems in all languages of the world. [..] There are still controversies. For Japanese, the kanji characters have been unified with Chinese, that is a character considered to be the same in both Japanese and Chinese have been given one and the same code number in Unicode, even if they look a little different. This process, called Han unification, has caused controversy." For examples (my browser doesn't show any differences though, probably because I don't have the corresponding fonts): http://en.wikipedia.org/wiki/Han_unification#Examples_of_language_dependent_characters> Nicolas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bug in SignalSomeChildren
On Mon, Dec 20, 2010 at 3:36 PM, Martijn van Oosterhout wrote: > Frankly it's a wart, for example strace/truss/whatever could (since > it's tracing anyway) just fudge the correct value in the getppid() call > so the userspace process doesn't notice. This has been a bug since > forever though, so I wouldn't hold my breath. Yeah, I have a vague recollection of noticing this in some other context ~10 years ago. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bug in SignalSomeChildren
On Mon, Dec 20, 2010 at 03:08:02PM -0500, Robert Haas wrote: > The attached patch appears to work correctly on MacOS X. I did check, > BTW: getppid() in the attached process returns gdb's pid. Poor! This appears to be a BSDism at least. On Linux and BSD derivatives the man pages specifically mention the reparenting (needed for catching signals) but on Linux getppid() is specifically documented to return the correct value anyway. Frankly it's a wart, for example strace/truss/whatever could (since it's tracing anyway) just fudge the correct value in the getppid() call so the userspace process doesn't notice. This has been a bug since forever though, so I wouldn't hold my breath. Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > Patriotism is when love of your own people comes first; nationalism, > when hate for people other than your own comes first. > - Charles de Gaulle signature.asc Description: Digital signature
Re: [HACKERS] Extensions, patch v20 (bitrot fixes)
On Mon, Dec 20, 2010 at 03:08:48PM -0500, Tom Lane wrote: > Kenneth Marshall writes: > > On Mon, Dec 20, 2010 at 02:10:39PM -0500, Tom Lane wrote: > >> [citation needed]? Exactly what characters are missing, and why would > >> the Unicode people have chosen to leave them out? It's not like they've > >> not heard of those encodings, I'm sure. > > > Here is an interesting description of some of the gotchas: > > http://en.wikipedia.org/wiki/Windows-1252 > > Well, it's interesting, but I see no glyphs on that page that lack > Unicode assignments. > > regards, tom lane > You are correct. I mis-read the text. Regards, Ken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Happy New Year !
Hi There ! Sorry for offtopic, but I'll have no chance to say all of you "Happy New Year !" at right time, since I'm about to fly to Himalaya for winter trek in the Everest area. Regards, Oleg PS. My pictures from Everest 2009 trek: http://www.flickr.com/photos/obartunov/sets/72157622740300156/with/4078835652/ Slideshow if you have flash available: http://www.flickr.com/photos/obartunov/sets/72157622740300156/show/ _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bug in SignalSomeChildren
On Mon, Dec 20, 2010 at 3:11 PM, Tom Lane wrote: > Robert Haas writes: >> On Mon, Dec 20, 2010 at 2:23 PM, Tom Lane wrote: >>> I like that better actually ... one less thing for developers to get wrong. > >> The attached patch appears to work correctly on MacOS X. I did check, >> BTW: getppid() in the attached process returns gdb's pid. Poor! > > Looks good to me. > >> For my own purposes, I would be just as happy to apply this only to >> master. But I wonder if anyone wants to argue for back-patching, to >> help debug existing installations? > > Given the lack of non-developer complaints, I see no need to backpatch. Well, non-developers don't tend to attach gdb very often. Alvaro mentioned a problem installation upthread, thus the question. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bug in SignalSomeChildren
Robert Haas writes: > On Mon, Dec 20, 2010 at 2:23 PM, Tom Lane wrote: >> I like that better actually ... one less thing for developers to get wrong. > The attached patch appears to work correctly on MacOS X. I did check, > BTW: getppid() in the attached process returns gdb's pid. Poor! Looks good to me. > For my own purposes, I would be just as happy to apply this only to > master. But I wonder if anyone wants to argue for back-patching, to > help debug existing installations? Given the lack of non-developer complaints, I see no need to backpatch. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extensions, patch v20 (bitrot fixes)
Kenneth Marshall writes: > On Mon, Dec 20, 2010 at 02:10:39PM -0500, Tom Lane wrote: >> [citation needed]? Exactly what characters are missing, and why would >> the Unicode people have chosen to leave them out? It's not like they've >> not heard of those encodings, I'm sure. > Here is an interesting description of some of the gotchas: > http://en.wikipedia.org/wiki/Windows-1252 Well, it's interesting, but I see no glyphs on that page that lack Unicode assignments. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extensions, patch v20 (bitrot fixes)
On Dec 20, 2010, at 11:53 AM, Kenneth Marshall wrote: > Here is an interesting description of some of the gotchas: > > http://en.wikipedia.org/wiki/Windows-1252 FWIW, those are gotchas translating between Windows 1252 and Latin-1. Windows 1252's nerbles translate to UTF-8 just fine. David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bug in SignalSomeChildren
On Mon, Dec 20, 2010 at 2:23 PM, Tom Lane wrote: > Robert Haas writes: >> Another option that might be workable (but I have reservations, and >> haven't tested it either) is to check whether the return value of >> getppid() is equal to 1. If it's neither 1 nor PostmasterPid then try >> kill(). > > I like that better actually ... one less thing for developers to get wrong. The attached patch appears to work correctly on MacOS X. I did check, BTW: getppid() in the attached process returns gdb's pid. Poor! For my own purposes, I would be just as happy to apply this only to master. But I wonder if anyone wants to argue for back-patching, to help debug existing installations? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company postmaster-is-alive-bsd-workaround.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extensions, patch v20 (bitrot fixes)
On Mon, Dec 20, 2010 at 02:10:39PM -0500, Tom Lane wrote: > David Fetter writes: > > On Mon, Dec 20, 2010 at 08:01:42PM +0100, Martijn van Oosterhout wrote: > >> I think you mean Unicode is not a superset of all character sets. I've > >> heard this before but never found what's missing. [citation needed]? > > > Windows-1252, ISO-2022-JP-2 and EUC-TW are such encodings. > > [citation needed]? Exactly what characters are missing, and why would > the Unicode people have chosen to leave them out? It's not like they've > not heard of those encodings, I'm sure. > > regards, tom lane > Here is an interesting description of some of the gotchas: http://en.wikipedia.org/wiki/Windows-1252 Regards, Ken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bug in SignalSomeChildren
Robert Haas writes: > Another option that might be workable (but I have reservations, and > haven't tested it either) is to check whether the return value of > getppid() is equal to 1. If it's neither 1 nor PostmasterPid then try > kill(). I like that better actually ... one less thing for developers to get wrong. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bug in SignalSomeChildren
On Mon, Dec 20, 2010 at 2:07 PM, Tom Lane wrote: > Robert Haas writes: >> On Mon, Dec 20, 2010 at 1:26 PM, Tom Lane wrote: >>> Robert Haas writes: Can we add a develop option to force use of the kill(0) method? > >>> How will that avoid needing to have an honest answer from getppid()? >>> Without that you can't know what to issue kill() against. > >> The answer to this question will probably be entirely self-evident if >> you stare at PostmasterIsAlive() for, well, it took me about 10 >> seconds. So probably less than five for you. > > Hmm, I was thinking that PostmasterPid was set originally from getppid, > but it looks like we rely on inheriting it through fork instead. > So maybe this will work. It's still slower and less reliable than the > getppid case for normal use, though. Well, that's why it'd be a developer option, rather than the default behavior. If we can agree on a name I'll work up a patch. Bikeshedding in 3... 2... 1... check_postmaster_via_kill avoid_backend_getppid ...? Another option that might be workable (but I have reservations, and haven't tested it either) is to check whether the return value of getppid() is equal to 1. If it's neither 1 nor PostmasterPid then try kill(). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extensions, patch v20 (bitrot fixes)
David Fetter writes: > On Mon, Dec 20, 2010 at 08:01:42PM +0100, Martijn van Oosterhout wrote: >> I think you mean Unicode is not a superset of all character sets. I've >> heard this before but never found what's missing. [citation needed]? > Windows-1252, ISO-2022-JP-2 and EUC-TW are such encodings. [citation needed]? Exactly what characters are missing, and why would the Unicode people have chosen to leave them out? It's not like they've not heard of those encodings, I'm sure. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bug in SignalSomeChildren
Robert Haas writes: > On Mon, Dec 20, 2010 at 1:26 PM, Tom Lane wrote: >> Robert Haas writes: >>> Can we add a develop option to force use of the kill(0) method? >> How will that avoid needing to have an honest answer from getppid()? >> Without that you can't know what to issue kill() against. > The answer to this question will probably be entirely self-evident if > you stare at PostmasterIsAlive() for, well, it took me about 10 > seconds. So probably less than five for you. Hmm, I was thinking that PostmasterPid was set originally from getppid, but it looks like we rely on inheriting it through fork instead. So maybe this will work. It's still slower and less reliable than the getppid case for normal use, though. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extensions, patch v20 (bitrot fixes)
On Mon, Dec 20, 2010 at 08:01:42PM +0100, Martijn van Oosterhout wrote: > On Mon, Dec 20, 2010 at 09:03:56AM +0900, Itagaki Takahiro wrote: > > On Mon, Dec 20, 2010 at 01:34, Tom Lane wrote: > > >> I agree that "the default encoding is UTF-8", but it should be > > >> configurable by the 'encoding' parameter in control files. > > > > > > Why is it necessary to have such a parameter at all? > > > > UTF-8 is not a superset of all encodings. > > I think you mean Unicode is not a superset of all character sets. I've > heard this before but never found what's missing. [citation needed]? Windows-1252, ISO-2022-JP-2 and EUC-TW are such encodings. Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plperlu problem with utf8
On Sun, Dec 19, 2010 at 7:56 PM, David E. Wheeler wrote: > +1 Awesome. Should this go into the next commitfest? Or might it be > considered a bug fix? CommitFest or no CommitFest, patches get applied when a committer acquires enough round tuits. Putting then into the next CommitFest just provides a backstop to make sure that we remember to think about looking for some round tuits then if not sooner; it doesn't prevent them from being applied sooner. And in fact typically there are 2-4 patches per CommitFest that are committed before the CommitFest technically starts. Unfortunately, I'm not going to be able to pick this patch up now or then, due to lack of subject-matter expertise. I expect Andrew or Tom would be the most likely candidates... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extensions, patch v20 (bitrot fixes)
On Mon, Dec 20, 2010 at 09:03:56AM +0900, Itagaki Takahiro wrote: > On Mon, Dec 20, 2010 at 01:34, Tom Lane wrote: > >> I agree that "the default encoding is UTF-8", but it should be > >> configurable by the 'encoding' parameter in control files. > > > > Why is it necessary to have such a parameter at all? > > UTF-8 is not a superset of all encodings. I think you mean Unicode is not a superset of all character sets. I've heard this before but never found what's missing. [citation needed]? Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > Patriotism is when love of your own people comes first; nationalism, > when hate for people other than your own comes first. > - Charles de Gaulle signature.asc Description: Digital signature
Re: [HACKERS] bug in SignalSomeChildren
On Mon, Dec 20, 2010 at 1:26 PM, Tom Lane wrote: > Robert Haas writes: Attaching gdb to either the startup process or a WAL sender causes PostmasterIsAlive to return false, resulting in a near-immediate exit. Seems pretty stupid for attaching gdb to change the return value of getppid() but it seems like that must be what's happening. > >> Can we add a develop option to force use of the kill(0) method? > > How will that avoid needing to have an honest answer from getppid()? > Without that you can't know what to issue kill() against. The answer to this question will probably be entirely self-evident if you stare at PostmasterIsAlive() for, well, it took me about 10 seconds. So probably less than five for you. > Seems like the correct path here is to complain to gdb and/or BSD > upstreams about this misbehavior. That might be a good thing to do too, but even if they agree to fix it and do in fact fix it right away, it'll take months or years before all of the major PostgreSQL contributors can benefit from those fixes, as opposed to, say, this afternoon. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bug in SignalSomeChildren
Robert Haas writes: >>> Attaching gdb to either the startup process or a WAL sender causes >>> PostmasterIsAlive to return false, resulting in a near-immediate exit. >>> Seems pretty stupid for attaching gdb to change the return value of >>> getppid() but it seems like that must be what's happening. > Can we add a develop option to force use of the kill(0) method? How will that avoid needing to have an honest answer from getppid()? Without that you can't know what to issue kill() against. Seems like the correct path here is to complain to gdb and/or BSD upstreams about this misbehavior. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MingW and MiniDumps
On Mon, Dec 20, 2010 at 19:24, Tom Lane wrote: > Andrew Dunstan writes: >> Maybe. I have no idea where narwhal got its version of dbghelp.h. I >> copied the file from the SDK directory to mingw's include directory and >> the build then ran perfectly. I think therefore the right thing is to >> have a configure test for the file and for MINIDUMP_TYPE on win32, and >> only build in crashdump if it passes those tests. +1. > mmm ... how ya gonna do that on the MSVC side of things? MSVC always has the Platform SDK. So it has the required headers. There was one symbol missing from an older version of the platform sdk, but it's not really one we need... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MingW and MiniDumps
Andrew Dunstan writes: > Maybe. I have no idea where narwhal got its version of dbghelp.h. I > copied the file from the SDK directory to mingw's include directory and > the build then ran perfectly. I think therefore the right thing is to > have a configure test for the file and for MINIDUMP_TYPE on win32, and > only build in crashdump if it passes those tests. mmm ... how ya gonna do that on the MSVC side of things? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] texteq/byteaeq: avoid detoast
texteq, textne, byteaeq and byteane detoast their arguments, then check for equality of length. Unequal lengths imply the answer trivially; given equal lengths, the functions proceed to compare the actual bytes. We can skip detoasting entirely when the lengths are unequal. The attached patch implements this. As submitted, it applies atop of my recent strncmp->memcmp patch, but they are logically independent. To benchmark some optimal and pessimal cases, I used the attached "bench-skip-texteq.sql". It uses a few datum sizes and varies whether the length check succeeds: bench-skip-texteq.sql, 10 MiB nomatch: 58.4s previous, 0.00664s patched bench-skip-texteq.sql, 144 B match: 73.0s previous, 71.9s patched bench-skip-texteq.sql,3 B match: 68.8s previous, 67.3s patched bench-skip-texteq.sql,3 B nomatch: 45.0s previous, 46.0s patched The timing differences in the smaller-length test cases are probably not statistically significant. Thanks, nm *** a/src/backend/utils/adt/varlena.c --- b/src/backend/utils/adt/varlena.c *** *** 1451,1472 text_cmp(text *arg1, text *arg2) Datum texteq(PG_FUNCTION_ARGS) { ! text *arg1 = PG_GETARG_TEXT_PP(0); ! text *arg2 = PG_GETARG_TEXT_PP(1); boolresult; ! /* !* Since we only care about equality or not-equality, we can avoid all the !* expense of strcoll() here, and just do bitwise comparison. !*/ ! if (VARSIZE_ANY_EXHDR(arg1) != VARSIZE_ANY_EXHDR(arg2)) result = false; else ! result = (memcmp(VARDATA_ANY(arg1), VARDATA_ANY(arg2), !VARSIZE_ANY_EXHDR(arg1)) == 0); ! ! PG_FREE_IF_COPY(arg1, 0); ! PG_FREE_IF_COPY(arg2, 1); PG_RETURN_BOOL(result); } --- 1451,1475 Datum texteq(PG_FUNCTION_ARGS) { ! Sizelen1 = toast_raw_datum_size(PG_GETARG_DATUM(0)) - VARHDRSZ; ! Sizelen2 = toast_raw_datum_size(PG_GETARG_DATUM(1)) - VARHDRSZ; boolresult; ! /* Fast path for different-length inputs: possibly skip detoast. */ ! if (len1 != len2) result = false; else ! { ! text *arg1 = PG_GETARG_TEXT_PP(0); ! text *arg2 = PG_GETARG_TEXT_PP(1); ! /* !* Since we only care about equality or not-equality, we can avoid all the !* expense of strcoll() here, and just do bitwise comparison. !*/ ! result = (memcmp(VARDATA_ANY(arg1), VARDATA_ANY(arg2), len1) == 0); ! PG_FREE_IF_COPY(arg1, 0); ! PG_FREE_IF_COPY(arg2, 1); ! } PG_RETURN_BOOL(result); } *** *** 1474,1495 texteq(PG_FUNCTION_ARGS) Datum textne(PG_FUNCTION_ARGS) { ! text *arg1 = PG_GETARG_TEXT_PP(0); ! text *arg2 = PG_GETARG_TEXT_PP(1); boolresult; ! /* !* Since we only care about equality or not-equality, we can avoid all the !* expense of strcoll() here, and just do bitwise comparison. !*/ ! if (VARSIZE_ANY_EXHDR(arg1) != VARSIZE_ANY_EXHDR(arg2)) result = true; else ! result = (memcmp(VARDATA_ANY(arg1), VARDATA_ANY(arg2), !VARSIZE_ANY_EXHDR(arg1)) != 0); ! ! PG_FREE_IF_COPY(arg1, 0); ! PG_FREE_IF_COPY(arg2, 1); PG_RETURN_BOOL(result); } --- 1477,1501 Datum textne(PG_FUNCTION_ARGS) { ! Sizelen1 = toast_raw_datum_size(PG_GETARG_DATUM(0)) - VARHDRSZ; ! Sizelen2 = toast_raw_datum_size(PG_GETARG_DATUM(1)) - VARHDRSZ; boolresult; ! /* Fast path for different-length inputs: possibly skip detoast. */ ! if (len1 != len2) result = true; else ! { ! text *arg1 = PG_GETARG_TEXT_PP(0); ! text *arg2 = PG_GETARG_TEXT_PP(1); ! /* !* Since we only care about equality or not-equality, we can avoid all the !* expense of strcoll() here, and just do bitwise comparison. !*/ ! result = (memcmp(VARDATA_ANY(arg1), VARDATA_ANY(arg2), len1) != 0); ! PG_FREE_IF_COPY(arg1, 0); ! PG_FREE_IF_COPY(arg2, 1); ! } PG_RETURN_BOOL(result); } *** *** 2358,2380 SplitIdentifierString(char *rawstring, char separator, Datum byteaeq(PG_FUNCTION_ARGS) { ! bytea *arg1 = PG_GETARG_BYTEA_PP(0); ! bytea *arg2 = PG_GETARG_BYTEA_PP(1); ! int len1, ! len2; boolresult; ! len1 = VARSIZE_ANY_EXHDR(arg1); ! len2 = VARSIZE_ANY_EXHDR(arg2); ! ! /* fast path for different-le
[HACKERS] strncmp->memcmp when we know the shorter length
When the caller knows the smaller string length, memcmp and strncmp are functionally equivalent. Since memcmp need not watch each byte for a NULL terminator, it often compares a CPU word at a time for better performance. The attached patch changes use of strncmp to memcmp where we have the length of the shorter string. I was most interested in the varlena.c instances, but I tried to find all applicable call sites. To benchmark it, I used the attached "bench-texteq.sql". This patch improved my 5-run average timing of the SELECT from 65.8s to 56.9s, a 13% improvement. I can't think of a case where the change should be pessimal. Thanks, nm *** a/contrib/hstore/hstore_io.c --- b/contrib/hstore/hstore_io.c *** *** 280,288 comparePairs(const void *a, const void *b) { if (((Pairs *) a)->keylen == ((Pairs *) b)->keylen) { ! int res = strncmp(((Pairs *) a)->key, ! ((Pairs *) b)->key, ! ((Pairs *) a)->keylen); if (res) return res; --- 280,288 { if (((Pairs *) a)->keylen == ((Pairs *) b)->keylen) { ! int res = memcmp(((Pairs *) a)->key, !((Pairs *) b)->key, !((Pairs *) a)->keylen); if (res) return res; *** *** 324,330 hstoreUniquePairs(Pairs *a, int4 l, int4 *buflen) while (ptr - a < l) { if (ptr->keylen == res->keylen && ! strncmp(ptr->key, res->key, res->keylen) == 0) { if (ptr->needfree) { --- 324,330 while (ptr - a < l) { if (ptr->keylen == res->keylen && ! memcmp(ptr->key, res->key, res->keylen) == 0) { if (ptr->needfree) { *** a/contrib/hstore/hstore_op.c --- b/contrib/hstore/hstore_op.c *** *** 49,55 hstoreFindKey(HStore *hs, int *lowbound, char *key, int keylen) stopMiddle = stopLow + (stopHigh - stopLow) / 2; if (HS_KEYLEN(entries, stopMiddle) == keylen) ! difference = strncmp(HS_KEY(entries, base, stopMiddle), key, keylen); else difference = (HS_KEYLEN(entries, stopMiddle) > keylen) ? 1 : -1; --- 49,55 stopMiddle = stopLow + (stopHigh - stopLow) / 2; if (HS_KEYLEN(entries, stopMiddle) == keylen) ! difference = memcmp(HS_KEY(entries, base, stopMiddle), key, keylen); else difference = (HS_KEYLEN(entries, stopMiddle) > keylen) ? 1 : -1; *** *** 263,269 hstore_delete(PG_FUNCTION_ARGS) int len = HS_KEYLEN(es, i); char *ptrs = HS_KEY(es, bufs, i); ! if (!(len == keylen && strncmp(ptrs, keyptr, keylen) == 0)) { int vallen = HS_VALLEN(es, i); --- 263,269 int len = HS_KEYLEN(es, i); char *ptrs = HS_KEY(es, bufs, i); ! if (!(len == keylen && memcmp(ptrs, keyptr, keylen) == 0)) { int vallen = HS_VALLEN(es, i); *** *** 331,339 hstore_delete_array(PG_FUNCTION_ARGS) int skeylen = HS_KEYLEN(es, i); if (skeylen == key_pairs[j].keylen) ! difference = strncmp(HS_KEY(es, ps, i), ! key_pairs[j].key, ! key_pairs[j].keylen); else difference = (skeylen > key_pairs[j].keylen) ? 1 : -1; } --- 331,339 int skeylen = HS_KEYLEN(es, i); if (skeylen == key_pairs[j].keylen) ! difference = memcmp(HS_KEY(es, ps, i), ! key_pairs[j].key, ! key_pairs[j].keylen); else difference = (skeylen > key_pairs[j].keylen) ? 1 : -1; } *** *** 416,424 hstore_delete_hstore(PG_FUNCTION_ARGS) int s2keylen = HS_KEYLEN(es
Re: [HACKERS] relaxing sync commit if no WAL written (was Re: unlogged tables)
On Sun, Dec 19, 2010 at 3:08 PM, Simon Riggs wrote: > On Sun, 2010-12-19 at 07:33 -0500, Robert Haas wrote: >> On Sun, Dec 19, 2010 at 7:01 AM, Simon Riggs wrote: >> > On Fri, 2010-12-17 at 13:35 -0500, Robert Haas wrote: >> > >> >> I'm >> >> thinking it makes sense to commit this part first. >> > >> > This will break Hot Standby, as previously explained. Don't. >> >> Uh, why? Skipping the commit record altogether would do that, but >> this patch doesn't do that. > > I was looking for XLogStandbyInfoActive() > > It isn't there, so you're either breaking HS or missing a possible > optimisation. Having said that, it would be useful to be able to assume > that all xids appear in the log, for diagnostic purposes. > > So I now agree with the way you've coded it. OK, thanks. Committed. Note that there is also a long comment in there which includes a discussion of the issues relating to Hot Standby. Hopefully that's clear enough to prevent anyone from getting too clever with this in the future. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable lock consistency
On Mon, Dec 20, 2010 at 12:49 PM, Florian Pflug wrote: > For me, this is another very good reason to explore this further. Plus, it > improves the ratio of grotty-ness vs. number-of-problems-soved ;-) By all means, look into it further. I fear the boat is filling up with water, but if you manage to come up with a workable solution I'll be as happy as anyone, promise! -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_ctl and port number detection
On 12/20/2010 12:41 PM, Tom Lane wrote: Actually, if we're going to do this at all, we should do pid datadir port socketdir ... here be dragons ... so that pg_ctl doesn't have to assume the server is running with a default value of unix_socket_dir. Not sure what to put in the fourth line on Windows though ... maybe just leave it empty? Yes, that seems reasonable. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
Alvaro Herrera writes: > Excerpts from Robert Haas's message of sáb dic 18 02:21:41 -0300 2010: >> 1. pg_dump needs an option to control whether unlogged tables are >> dumped. --no-unlogged-tables seems like the obvious choice, assuming >> we want the default to be to dump them, which seems like the safest >> option. > If there are valid use cases for some unlogged tables being dumped and > some others not, would it make sense to be able to specify a pattern of > tables to be dumped or skipped? Presumably you could still do that with the regular --tables name pattern switch. I don't see a reason for unlogged tables to respond to a different name pattern. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable lock consistency
On Dec20, 2010, at 17:54 , Robert Haas wrote: > On Mon, Dec 20, 2010 at 9:11 AM, Florian Pflug wrote: >> On Dec20, 2010, at 13:13 , Heikki Linnakangas wrote: >>> One way to look at this is that the problem arises because SELECT FOR >>> UPDATE doesn't create a new tuple like UPDATE does. The problematic case >>> was: >>> T1 locks, T1 commits, T2 updates, T2 aborts, all after T0 took its snapshot but before T0 attempts to delete. :-( >>> >>> If T1 does a regular UPDATE, T2 doesn't overwrite the xmax on the original >>> tuple, but on the tuple that T1 created. >> >>> So one way to handle FOR UPDATE would be to lazily turn the lock operation >>> by T1 into a dummy update, when T2 updates the tuple. You can't >>> retroactively make a regular update on behalf of the locking transaction >>> that committed already, or concurrent selects would see the same row twice, >>> but it might work with some kind of a magic tuple that's only followed >>> through the ctid from the original one, and only for the purpose of >>> visibility checks. >> >> In the case of an UPDATE of a recently locked tuple, we could avoid having >> to insert a dummy tuple by storing the old tuple's xmax in the new tuple's >> xmax. We'd flag the old tuple, and attempt to restore the xmax of any >> flagged tuple with an aborted xmax and a ctid != t_self during scanning and >> vacuuming. >> >> For DELETEs, that won't work. However, could we maybe abuse the ctid to >> store the old xmax? It currently contains t_self, but do we actually depend >> on that? > > My first reaction to all of this is that it sounds awfully grotty. Well, there's precedent for overlaying fields in the tuple header for space efficiency... >> FOR-SHARE and FOR-UPDATE locks could preserve information about the latest >> committed locker by creating a multi-xid. For FOR-SHARE locks, we'd just >> need to ensure that we only remove all but one finished transactions. For >> FOR-UPDATE locks, we'd need to create a multi-xid if the old xmax is >= >> GlobalXmin, but I guess that's tolerable. > > Even in the original version of this patch, there's a non-trivial > overhead here when a multi-xid exists that doesn't exist today: a > serializable transaction has to grovel through the XIDs in the > multi-xact and figure out whether any of them are new enough to be a > problem. It has to grovel through them anyway to see if any one of them is still running (in the UPDATE/DELETE/FOR-UPDATE case), or needs to copy them into a new multi-xid (in the FOR-SHARE case). Scanning it a second time thus won't cause any further IO. If the second scan is really a concern here, it could probably be folded into the first, but I doubt very much that it'd be worth the added complexity. Having to create a multi-xid for FOR-UPDATE locks is a real concern, though. But if overlaying/abusing ctid proves to be possible in the case of DELETEs, the same could be done for FOR-SHARE and FOR-UPDATE locks. Only the UPDATE case would then remain... > I fear that this whole approach is a case of trying to jam a > square peg through a round hole. We're trying to force the on-disk > format that we have to meet a requirement it wasn't designed for, and > it's looking pretty ugly. Certainly true. But most of the grotty-ness isn't inherently part of a solution to the problem at hand. Rather, it's imposed on us by the requirement for binary upgradability. IMHO, it's the price we pay for having that. > Kevin Grittner's work is a whole different > approach to this problem, and while that's obviously not fully > debugged and committed yet either, it's often easier to design a new > tool to solve a particular problem than to make an existing tool that > was really meant for something else do some new thing in addition. Kevin's work only solves the problem at hand if we removed support for what we call SERIALIZABLE now completely. As it stands, however, what we now call SERIALIZABLE will still be available as isolation level REPEATABLE READ, just as it is now, and will retain its quirky behaviour regarding row-level locks. For some workloads at least, Kevin's also introduces many more reasons for spurious serialization failures. Don't take me wrong here - I think what Kevin is doing is very important, and will be a huge leap for postgres, putting us ahead of all commercial and open-source competitors that I know of. But having to pay the price for that to be able to simply enforce RI constraints slightly outside of what FK constraints can do still seems wrong. For driving a screw into a piece of wood, a screwdriver still beats even the most powerful hammer, and leaves fewer traces on the wood... BTW, I realized that preventing UPDATEs, DELETEs and FOR-UPDATE locks from removing all traces of previous lock owners would also help to solve the bug cautioned about here: http://www.postgresql.org/docs/9.0/interactive/sql-select.html#SQL-FOR-UPDATE-SHARE Briefly, the problem is if you do BEGIN;
Re: [HACKERS] serializable lock consistency
On Mon, Dec 20, 2010 at 12:39 PM, Kevin Grittner wrote: > I see Florian's patch meeting a real need though, I agree, but that whole approach seems to be foundering on the rocks. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_ctl and port number detection
Bruce Momjian writes: > Tom Lane wrote: >> No. If it goes in, it should go in as the third line. The shmem key >> data is private to the server --- we do not want external programs >> assuming anything at all about the private part of postmaster.pid. > OK, so you are suggesting having it as a third value on the third line? > 10231 > /u/pgsql/data > 5432001 45481984 port_here > ^ I'm not sure why you're having such a hard time grasping this concept. We do not want pg_ctl looking at the shmem key information, not even to the extent of assuming a particular format for it. Therefore the port number has to go before it not after it. What I'm thinking of is pid datadir port ... here be dragons ... Actually, if we're going to do this at all, we should do pid datadir port socketdir ... here be dragons ... so that pg_ctl doesn't have to assume the server is running with a default value of unix_socket_dir. Not sure what to put in the fourth line on Windows though ... maybe just leave it empty? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extensions and custom_variable_classes
Excerpts from Dimitri Fontaine's message of lun dic 20 14:25:14 -0300 2010: > Robert Haas writes: > > Patches are better for me, anyway... > > Here it is then, version 21. Changes: Just noticed a small problem: you're removing the "SET search_path" lines in contrib Makefiles but you're leaving the "Adjust this setting to control where the objects get created" line behind, which should be removed as well. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable lock consistency
Robert Haas wrote: > Kevin Grittner's work is a whole different approach to this > problem, and while that's obviously not fully debugged and > committed yet either, it's often easier to design a new tool to > solve a particular problem than to make an existing tool that was > really meant for something else do some new thing in addition. I see Florian's patch meeting a real need though, even though it is an orthogonal solution to the same problem. For those converting to PostgreSQL from Oracle, there are shops with a lot of code which counts on the behavior which Florian is trying to implement. Without Florian's patch, if they do a minimal-effort conversion to PostgreSQL (without removing SELECT FOR SHARE/UPDATE clauses or explicit locks) and count on SSI behavior, they will be paying for the cost of integrity enforcement *twice*. It also may be useful for those who can't easily structure their code to automatically retry transactions which experience a serialization failure. For those shops doing "green field" development or converting from products with true serializability (DB2, Sybase ASE, InnoDB, etc.) it will probably be easier for them to just use SSI; and I have reason to believe that SSI will generally perform better, so Florian's patch doesn't obviate the need for SSI. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] fix to allow mingw+vpath
The simple change below allows a vpath build to be used on Mingw. Is there any objection? cheers andrew diff --git a/src/bin/pgevent/Makefile b/src/bin/pgevent/Makefile index 013b801..39cb7dc 100644 --- a/src/bin/pgevent/Makefile +++ b/src/bin/pgevent/Makefile @@ -27,7 +27,7 @@ pgevent.dll: pgevent.def $(OBJS) $(DLLWRAP) --def $< -o $(NAME) $(OBJS) pgmsgevent.o: pgmsgevent.rc win32ver.rc - $(WINDRES) $< -o $@ --include-dir=$(top_builddir)/src/include --include-dir=$(top_srcdir)/src/include --include-dir=$(srcdir) + $(WINDRES) $< -o $@ --include-dir=$(top_builddir)/src/include --include-dir=$(top_srcdir)/src/include --include-dir=$(srcdir) --include-dir=. all-lib: $(NAME) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On Mon, Dec 20, 2010 at 9:05 AM, Alvaro Herrera wrote: > Excerpts from Robert Haas's message of sáb dic 18 02:21:41 -0300 2010: >> Here's an attempt to summarize the remaining issues with this patch >> that I know about. I may have forgotten something, so please mention >> it if you notice something missing. >> >> 1. pg_dump needs an option to control whether unlogged tables are >> dumped. --no-unlogged-tables seems like the obvious choice, assuming >> we want the default to be to dump them, which seems like the safest >> option. > > If there are valid use cases for some unlogged tables being dumped and > some others not, would it make sense to be able to specify a pattern of > tables to be dumped or skipped? Well, if you want to dump a subset of the tables in your database, you can already do that. I think that adding a pattern to --no-unlogged-tables (or whatever we end up calling it) would be an unnecessary frammish. There's no particular reason to think that unlogged tables are going to be so widely used or that concerns about which ones are going to be so widespread that we should do something here when we don't even have much simpler things like --function, which IMHO would extremely useful. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A quick warning on the win32 build scripts
(Hello, very old thread) On Tue, Oct 5, 2010 at 16:29, Tom Lane wrote: > Craig Ringer writes: >> The cause turns out to be the automatic .DEF file generation. It skips >> DEF file generation if a DEF file already exists, even if the >> object/sources are newer than the existing DEF file. > > Um, so can't we fix that? It seems like a rather basic expectation of a > build process that it will rebuild dependent files when needed. Does anybody have an idea of *how* to actually do that? My first thought was to just generate a DEF file whenever any of the object files have changed, which could be a simple date comparison. However, generating the DEF file is *really* expensive and slow, so this would make rebuilding anything that *doesn't* add or remove an entrypoint unacceptably slow.. Can someone think of a way to do this without having to dump the list of exports from each and every object file? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable lock consistency
On Mon, Dec 20, 2010 at 9:11 AM, Florian Pflug wrote: > On Dec20, 2010, at 13:13 , Heikki Linnakangas wrote: >> One way to look at this is that the problem arises because SELECT FOR UPDATE >> doesn't create a new tuple like UPDATE does. The problematic case was: >> >>> T1 locks, T1 commits, T2 updates, T2 aborts, all after T0 >>> took its snapshot but before T0 attempts to delete. :-( >> >> If T1 does a regular UPDATE, T2 doesn't overwrite the xmax on the original >> tuple, but on the tuple that T1 created. > >> So one way to handle FOR UPDATE would be to lazily turn the lock operation >> by T1 into a dummy update, when T2 updates the tuple. You can't >> retroactively make a regular update on behalf of the locking transaction >> that committed already, or concurrent selects would see the same row twice, >> but it might work with some kind of a magic tuple that's only followed >> through the ctid from the original one, and only for the purpose of >> visibility checks. > > In the case of an UPDATE of a recently locked tuple, we could avoid having to > insert a dummy tuple by storing the old tuple's xmax in the new tuple's xmax. > We'd flag the old tuple, and attempt to restore the xmax of any flagged tuple > with an aborted xmax and a ctid != t_self during scanning and vacuuming. > > For DELETEs, that won't work. However, could we maybe abuse the ctid to store > the old xmax? It currently contains t_self, but do we actually depend on that? My first reaction to all of this is that it sounds awfully grotty. > FOR-SHARE and FOR-UPDATE locks could preserve information about the latest > committed locker by creating a multi-xid. For FOR-SHARE locks, we'd just need > to ensure that we only remove all but one finished transactions. For > FOR-UPDATE locks, we'd need to create a multi-xid if the old xmax is >= > GlobalXmin, but I guess that's tolerable. Even in the original version of this patch, there's a non-trivial overhead here when a multi-xid exists that doesn't exist today: a serializable transaction has to grovel through the XIDs in the multi-xact and figure out whether any of them are new enough to be a problem. I fear that this whole approach is a case of trying to jam a square peg through a round hole. We're trying to force the on-disk format that we have to meet a requirement it wasn't designed for, and it's looking pretty ugly. Kevin Grittner's work is a whole different approach to this problem, and while that's obviously not fully debugged and committed yet either, it's often easier to design a new tool to solve a particular problem than to make an existing tool that was really meant for something else do some new thing in addition. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extensions and custom_variable_classes
On Mon, Dec 20, 2010 at 11:36 AM, Dimitri Fontaine wrote: > Robert Haas writes: >> The effect of that has to be that the postmaster adds a certain amount >> of space to PostgreSQL's initial shared memory allocation. That means >> the postmaster has to know that pg_stat_statements is a valid custom >> variable class. > > Ah. Yes. Indeed. So you still needed to edit postgresql.conf in such > cases. Well there's only 1 contrib module that touches SHM, but is also > happen to be the only one that needs custom_variable_classes support… > > Meanwhile, the custom_variable_classes related code has been removed > from the extension's git branch, and the WITH ENCODING option is no more > (the control file encoding is still there and defaults to UTF-8). > > Will continue collecting improvements ideas and cleanup needs as we go, > unless you prefer to see new patches (they are easy enough to produce). > > http://git.postgresql.org/gitweb?p=postgresql-extension.git;a=shortlog;h=refs/heads/extension Patches are better for me, anyway... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extensions and custom_variable_classes
Robert Haas writes: > The effect of that has to be that the postmaster adds a certain amount > of space to PostgreSQL's initial shared memory allocation. That means > the postmaster has to know that pg_stat_statements is a valid custom > variable class. Ah. Yes. Indeed. So you still needed to edit postgresql.conf in such cases. Well there's only 1 contrib module that touches SHM, but is also happen to be the only one that needs custom_variable_classes support… Meanwhile, the custom_variable_classes related code has been removed from the extension's git branch, and the WITH ENCODING option is no more (the control file encoding is still there and defaults to UTF-8). Will continue collecting improvements ideas and cleanup needs as we go, unless you prefer to see new patches (they are easy enough to produce). http://git.postgresql.org/gitweb?p=postgresql-extension.git;a=shortlog;h=refs/heads/extension Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Binary timestamp with without timezone
On Mon, Dec 20, 2010 at 6:29 AM, Radosław Smogura wrote: > > On Thu, 16 Dec 2010 14:24:27 -0500, Tom Lane wrote: >> >> =?utf-8?q?Rados=C5=82aw_Smogura?= writes: >>> >>> Tom Lane Thursday 16 December 2010 18:59:56 =?utf-8?q?Rados=C5=82aw_Smogura?= writes: > > ... This timestamp must be properly encoded > depending if target is WITH TZ or not, but JDBC (and other clients, > probably too) doesn't have any knowledge about target type when > statement is executed >> Seems like you need to fix *that*. >> >>> I don't say it's bad way to send timestamps in text mode. It's good >>> solution, >>> because timestamp without tz will silently ignore tz offset, timestamp >>> with tz >>> will use offset in calculations if it is there, if no it will use server >>> TZ. >> >> No, what I'm saying is that it's complete folly to be sending binary >> data for a value you don't know the exact type of. > > I know something about value I want to send, but only this it should be a > timestamp. I don't know if it should be with or without tz. That's your problem right there, full stop. If you don't know if your time is with or without tz, how can you possibly expect the server to know? Either send without tz, or grab the time zone from the local environment and convert to binary timezone tz. Your issue has absolutely nothing to do with which protocol you are using. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bug in SignalSomeChildren
On Mon, Dec 20, 2010 at 8:20 AM, Alvaro Herrera wrote: > Excerpts from Robert Haas's message of vie dic 17 14:08:04 -0300 2010: > >> I'm having a bit of trouble confirming this on MacOS X, though. >> Attaching gdb to either the startup process or a WAL sender causes >> PostmasterIsAlive to return false, resulting in a near-immediate exit. >> Seems pretty stupid for attaching gdb to change the return value of >> getppid() but it seems like that must be what's happening. > > Yeah, this problem has been known for some time and causes quite some > pain. We have an open problem report on autovacuum failing to run after > some time, and we haven't been able to get a backtrace or strace because > of this issue -- trying to attach to it causes a full system restart > (PostmasterIsAlive returns false, autovac launcher dies, this death > causes postmaster to shut everything down and restart). This is on > FreeBSD. Can we add a develop option to force use of the kill(0) method? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extensions and custom_variable_classes
On Mon, Dec 20, 2010 at 3:56 AM, Dimitri Fontaine wrote: > Robert Haas writes: >> I bet it doesn't. The *postmaster* never connects to a database, so >> which copy of pg_extension does it ever read? > > None, which does it need to read? My answer is none, you're saying it's > wrong, I don't get why. postmaster surely has no business with what's in > a specific database and no use at all of placeholder GUCs, right? Let's take a concrete example. Suppose the user installs the extension 'pg_stat_statements' and puts the following into their postgresql.conf: pg_stat_statements.max = 31415; The effect of that has to be that the postmaster adds a certain amount of space to PostgreSQL's initial shared memory allocation. That means the postmaster has to know that pg_stat_statements is a valid custom variable class. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MingW and MiniDumps
On 12/19/2010 03:09 PM, Magnus Hagander wrote: I can't find either the header or the symbols. That's weird - from what I can tell, at least narwahl isn't complaining about a missing include file, just the undefined symbols. Different versions of mingw perhaps? Maybe. I have no idea where narwhal got its version of dbghelp.h. I copied the file from the SDK directory to mingw's include directory and the build then ran perfectly. I think therefore the right thing is to have a configure test for the file and for MINIDUMP_TYPE on win32, and only build in crashdump if it passes those tests. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] queriing the version of libpq
On Thu, Dec 16, 2010 at 17:13, Tom Lane wrote: > Magnus Hagander writes: >> On Thu, Dec 16, 2010 at 17:07, Tom Lane wrote: >>> because if you're trying to link against an older libpq, the link will >>> fail before you ever get to execute. So let's have a less implausible >>> use-case please. > >> Look back at the very start of the thread. > > OK, but let's document those examples instead. Something like this better? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index c502439..a776df4 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -5359,6 +5359,50 @@ void *PQresultAlloc(PGresult *res, size_t nBytes); + + + PQlibVersion + + PQlibVersion + PQserverVersion + + + + + + Return the version of libpq that is being used. + +int PQlibVersion(void); + + + + + The result value of this function can be used to determine, at + runtime, if a specific functionality is available in the currently + loaded version of libpq. Examples of this include whether specific + connection options are available for PQconnectdb and + the new bytea encoding in PostgreSQL 9.0 that requires + libpq version 9.0 or later to work. + + + + The number is formed by converting the major, minor, and revision + numbers into two-decimal-digit numbers and appending them together. + For example, version 9.1 will be returned as 90100, and version + 9.1.2 will be returned as 90102 (leading zeroes are not shown). + + + + + This function appeared in PostgreSQL version 9.1, so + it cannot be used to detect required functionality in earlier + versions, since linking to it will create a load time dependency + on version 9.1. + + + + + diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt index a6c73af..1af8df6 100644 --- a/src/interfaces/libpq/exports.txt +++ b/src/interfaces/libpq/exports.txt @@ -159,3 +159,4 @@ PQconnectdbParams 156 PQconnectStartParams 157 PQping158 PQpingParams 159 +PQlibVersion 160 diff --git a/src/interfaces/libpq/fe-misc.c b/src/interfaces/libpq/fe-misc.c index 24ab7cf..999f60b 100644 --- a/src/interfaces/libpq/fe-misc.c +++ b/src/interfaces/libpq/fe-misc.c @@ -66,6 +66,14 @@ static int pqSocketCheck(PGconn *conn, int forRead, int forWrite, time_t end_time); static int pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time); +/* + * PQlibVersion: return the libpq version number + */ +int +PQlibVersion(void) +{ + return PG_VERSION_NUM; +} /* * fputnbytes: print exactly N bytes to a file diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h index 271afed..6075ea8 100644 --- a/src/interfaces/libpq/libpq-fe.h +++ b/src/interfaces/libpq/libpq-fe.h @@ -547,6 +547,9 @@ extern int lo_export(PGconn *conn, Oid lobjId, const char *filename); /* === in fe-misc.c === */ +/* Get the version of the libpq library in use */ +extern int PQlibVersion(void); + /* Determine length of multibyte encoded char at *s */ extern int PQmblen(const char *s, int encoding); -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bug in SignalSomeChildren
Alvaro Herrera writes: > ... We have an open problem report on autovacuum failing to run after > some time, and we haven't been able to get a backtrace or strace because > of this issue ... I wonder whether that's the already-fixed problem with autovacuum cost limit going to zero in long-lived workers. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Timeout for asynchronous replication Re: Timeout and wait-forever in sync rep
On Mon, Dec 20, 2010 at 3:17 AM, Fujii Masao wrote: > OK. How about keepalive-like parameters and behaviors? > > replication_keepalives_idle > replication_keepalives_interval > replication_keepalives_count > > The master sends the keepalive packet if replication_keepalives_idle > elapsed after receiving the last ACK packet including the receive/ > fsync/replay LSNs from the standby. OTOH, the standby sends the > ACK packet back to the master as soon as receiving the keepalive > packet. > > If the master could not receive the ACK packet for > replication_keepalives_interval, it repeats sending the keepalive > packet and receiving the ACK replication_keepalives_count -1 > times. If no ACK packet has finally arrived, the master thinks the > standby has been dead. I thought we were using a single TCP session per standby/slave? So adding another "KEEPALIVE" into the local buffer side of the TCP stream isn't going to help a "stuck" one arrive earlier. You really only have a few situations: 1) Network problems. Stuffing more stuff into the local buffers isn't gonig to help get packets from the remote that it would like to send (I say like to send, because network problems could be on either/both directions, the remote may or may not have seen our keepalive requrest) 2) The remote is getting them, and is swamped. It's not going to get processing our 2nd keepalive any sooner than processing our 1st. If a walreceiver reads a "keepalive" request, Just declare that it must reply immediately. Then the master config can trust that a keepalive should be replied to pretty quickly if networks is ok. TCP will make it get there "eventually" if it's a bad network, and the admins have set it be very network tolerant. The ACK might report that the salve is hopelessly behind on fsyncing/applying it's WAL, but that's good too. At least then the ACK comes back, and the master knows the slave is still churning away on the last batch of WAL, and can decide if it wants to think the slave is too far behind and boot it out. -- Aidan Van Dyk Create like a god, ai...@highrise.ca command like a king, http://www.highrise.ca/ work like a slave. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Timeout for asynchronous replication Re: Timeout and wait-forever in sync rep
On Mon, Dec 20, 2010 at 3:17 AM, Fujii Masao wrote: > On Tue, Dec 7, 2010 at 12:20 AM, Robert Haas wrote: >> Yeah. If we rely on the TCP send buffer filling up, then the amount >> of time the master takes to notice a dead standby is going to be hard >> for the user to predict. I think the standby ought to send some sort >> of heartbeat and the master should declare the standby dead if it >> doesn't see a heartbeat soon enough. Maybe the heartbeat could even >> include the receive/fsync/replay LSNs, so that sync rep can use the >> same machinery but with more aggressive policies about when they must >> be sent. > > OK. How about keepalive-like parameters and behaviors? > > replication_keepalives_idle > replication_keepalives_interval > replication_keepalives_count > > The master sends the keepalive packet if replication_keepalives_idle > elapsed after receiving the last ACK packet including the receive/ > fsync/replay LSNs from the standby. OTOH, the standby sends the > ACK packet back to the master as soon as receiving the keepalive > packet. > > If the master could not receive the ACK packet for > replication_keepalives_interval, it repeats sending the keepalive > packet and receiving the ACK replication_keepalives_count -1 > times. If no ACK packet has finally arrived, the master thinks the > standby has been dead. This doesn't really make sense, because you're connecting over a TCP connection. Once you send the first keepalive, TCP will keep retrying in some way that we have no control over. If those packets aren't getting through, adding more data to what has to be transmitted seems unlikely to do anything useful. I think the parameters we can usefully set are: - how long does the master wait before sending a keepalive request? - how long does the master wait after sending a keepalive before declaring the slave dead and closing the connection? But this can be further simplified. The slave doesn't really need the master to prompt it to send acknowledgments. It only needs to send them sufficiently often. As part of the start-replication sequence, let's have the master tell the slave "send me an acknowledgment at least every N seconds". And then the slave must do that. The master then has some value K > N, such that if no acknowledgment is received after K seconds, the connection is disconnected. The only reason to have the master send explicit keepalive requests (vs. just telling the client the interval) is if the master might request them for some reason other than timer expiration. Since the main point of this is to detect the situation where the slave has e.g. power cycled so that the connection is gone but the master doesn't know it, you could imagine a system where, when a new replication connection is received, we request keepalives on all of the existing connections to see if any of them are defunct. But I don't really think it needs to be quite that complicated. Another consideration is that you could configure the keepalive-frequency on the slave and the declare-dead-time on the master. Then the master wouldn't need to tell the slave the keepalive-frequency at replication start-up time. But that might also increase the chances of incompatible settings (e.g. slave's keepalive frequency is >= master's declare-dead-time), which would result in a lot of unnecessary reconnects. If both parameters are configured on the master, then we can enforce that declare-dead-time > keepalive-frequency. So I suggest: replication_keepalive_time - how often the slave is instructed to send acknowledgments when idle replication_idle_timeout - the period of inactivity after which the master closes the connection to the slave -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable lock consistency
On Dec20, 2010, at 13:13 , Heikki Linnakangas wrote: > One way to look at this is that the problem arises because SELECT FOR UPDATE > doesn't create a new tuple like UPDATE does. The problematic case was: > >> T1 locks, T1 commits, T2 updates, T2 aborts, all after T0 >> took its snapshot but before T0 attempts to delete. :-( > > If T1 does a regular UPDATE, T2 doesn't overwrite the xmax on the original > tuple, but on the tuple that T1 created. > So one way to handle FOR UPDATE would be to lazily turn the lock operation by > T1 into a dummy update, when T2 updates the tuple. You can't retroactively > make a regular update on behalf of the locking transaction that committed > already, or concurrent selects would see the same row twice, but it might > work with some kind of a magic tuple that's only followed through the ctid > from the original one, and only for the purpose of visibility checks. In the case of an UPDATE of a recently locked tuple, we could avoid having to insert a dummy tuple by storing the old tuple's xmax in the new tuple's xmax. We'd flag the old tuple, and attempt to restore the xmax of any flagged tuple with an aborted xmax and a ctid != t_self during scanning and vacuuming. For DELETEs, that won't work. However, could we maybe abuse the ctid to store the old xmax? It currently contains t_self, but do we actually depend on that? FOR-SHARE and FOR-UPDATE locks could preserve information about the latest committed locker by creating a multi-xid. For FOR-SHARE locks, we'd just need to ensure that we only remove all but one finished transactions. For FOR-UPDATE locks, we'd need to create a multi-xid if the old xmax is >= GlobalXmin, but I guess that's tolerable. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
Excerpts from Robert Haas's message of sáb dic 18 02:21:41 -0300 2010: > Here's an attempt to summarize the remaining issues with this patch > that I know about. I may have forgotten something, so please mention > it if you notice something missing. > > 1. pg_dump needs an option to control whether unlogged tables are > dumped. --no-unlogged-tables seems like the obvious choice, assuming > we want the default to be to dump them, which seems like the safest > option. If there are valid use cases for some unlogged tables being dumped and some others not, would it make sense to be able to specify a pattern of tables to be dumped or skipped? -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bug in SignalSomeChildren
Excerpts from Robert Haas's message of vie dic 17 14:08:04 -0300 2010: > I'm having a bit of trouble confirming this on MacOS X, though. > Attaching gdb to either the startup process or a WAL sender causes > PostmasterIsAlive to return false, resulting in a near-immediate exit. > Seems pretty stupid for attaching gdb to change the return value of > getppid() but it seems like that must be what's happening. Yeah, this problem has been known for some time and causes quite some pain. We have an open problem report on autovacuum failing to run after some time, and we haven't been able to get a backtrace or strace because of this issue -- trying to attach to it causes a full system restart (PostmasterIsAlive returns false, autovac launcher dies, this death causes postmaster to shut everything down and restart). This is on FreeBSD. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_ctl and port number detection
Tom Lane wrote: > Bruce Momjian writes: > >> I wonder if we should write the port number as the 4th line in > >> postmaster.pid and return in a few major releases and use that. We > >> could fall back and use our existing code if there is no 4th line. > > No. If it goes in, it should go in as the third line. The shmem key > data is private to the server --- we do not want external programs > assuming anything at all about the private part of postmaster.pid. OK, so you are suggesting having it as a third value on the third line? 10231 /u/pgsql/data 5432001 45481984 port_here ^ I like that better because it simplifies the test and limits the possibility of non-atomic multi-line writes. For Win32, we would just have the port number because the line is normally empty. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable lock consistency
On 20.12.2010 13:52, Florian Pflug wrote: On Dec20, 2010, at 07:16 , Heikki Linnakangas wrote: On 19.12.2010 20:57, Florian Pflug wrote: If we reuse the legacy field xvac to store xlast, we don't get into trouble with binary upgrades either. We' need to find a way to deal with tuples where HEAP_MOVED_IN or HEAP_MOVED_OUT is set, but that seems manageable.. xvac shares the field with command id, and cid is in use while the tuple is being updated. Right :-( Well, that nails this coffin shut pretty tightly, unless we were willing to add another field to heap tuples. One way to look at this is that the problem arises because SELECT FOR UPDATE doesn't create a new tuple like UPDATE does. The problematic case was: T1 locks, T1 commits, T2 updates, T2 aborts, all after T0 took its snapshot but before T0 attempts to delete. :-( If T1 does a regular UPDATE, T2 doesn't overwrite the xmax on the original tuple, but on the tuple that T1 created. So one way to handle FOR UPDATE would be to lazily turn the lock operation by T1 into a dummy update, when T2 updates the tuple. You can't retroactively make a regular update on behalf of the locking transaction that committed already, or concurrent selects would see the same row twice, but it might work with some kind of a magic tuple that's only followed through the ctid from the original one, and only for the purpose of visibility checks. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable lock consistency
On Dec20, 2010, at 07:16 , Heikki Linnakangas wrote: > On 19.12.2010 20:57, Florian Pflug wrote: >> If we reuse the legacy field xvac to store xlast, we don't get into >> trouble with binary upgrades either. We' need to find a way to deal >> with tuples where HEAP_MOVED_IN or HEAP_MOVED_OUT is set, but that >> seems manageable.. > > xvac shares the field with command id, and cid is in use while the tuple is > being updated. Right :-( Well, that nails this coffin shut pretty tightly, unless we were willing to add another field to heap tuples. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Binary timestamp with without timezone
On Thu, 16 Dec 2010 14:24:27 -0500, Tom Lane wrote: =?utf-8?q?Rados=C5=82aw_Smogura?= writes: Tom Lane Thursday 16 December 2010 18:59:56 =?utf-8?q?Rados=C5=82aw_Smogura?= writes: ... This timestamp must be properly encoded depending if target is WITH TZ or not, but JDBC (and other clients, probably too) doesn't have any knowledge about target type when statement is executed Seems like you need to fix *that*. I don't say it's bad way to send timestamps in text mode. It's good solution, because timestamp without tz will silently ignore tz offset, timestamp with tz will use offset in calculations if it is there, if no it will use server TZ. No, what I'm saying is that it's complete folly to be sending binary data for a value you don't know the exact type of. I know something about value I want to send, but only this it should be a timestamp. I don't know if it should be with or without tz. There are too many ways for that to fail, and too few ways for the backend to validate what you sent. Adding more possible ways to interpret a binary blob makes that problem worse, not better. Official JDBC driver release use this technique to send timezone timestamps, but for text mode; any timestamp is send as UNSPECIFIED. So text mode driver can fail in this way too. What you need to fix is the inadequate type bookkeeping in JDBC. If you don't know the exact type of the value you're going to send, send it in text mode, where you have some reasonable hope of a mismatch being detected. I know that this procedure isn't good as well as in text mode and in binary mode, but gives any chance to do it better. In both cases we can find examples when this behaviour will fail, but In proposed solution I added (I hope in safe way) support for timezone information, that is missing in comparison to binary protocol, which can be useful. Maybe better idea is to create new timestamptz type, that will fully support TIME offsets, as well and most important, will give much more client friendly casting to timestamp and timestamptz-s. I mean it should be casted to timestamptz, as well to timestamp, but in last situation, per field base ('2010-01-01 +1:00)::timestamp -> '2010-01-01'. It could be better, because missing tz offset in current implementation can cause problems with historical DST offset (many posts found). Binary protocol will not have this disadvantage when reading, because Java supports historical DST, and timestamptz is UTC based. Regards, Radek -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extensions and custom_variable_classes
Itagaki Takahiro writes: > I think so. It would be better to remove the CVC support and related code. Will isolate that into another branch just in case and prepare a patch with that removed. > Preloading modules that defines CVC is a good direction to fix the issue, > but we need more consideration about where to do it. I checked for doing that too, but both shared_preload_libraries and local_preload_libraries are managed way earlier than the point where we can connect to a database a check what extensions it contains. So that appeared to me as a dead-end. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extensions and custom_variable_classes
Robert Haas writes: > I bet it doesn't. The *postmaster* never connects to a database, so > which copy of pg_extension does it ever read? None, which does it need to read? My answer is none, you're saying it's wrong, I don't get why. postmaster surely has no business with what's in a specific database and no use at all of placeholder GUCs, right? Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Timeout for asynchronous replication Re: Timeout and wait-forever in sync rep
On Tue, Dec 7, 2010 at 12:20 AM, Robert Haas wrote: > Yeah. If we rely on the TCP send buffer filling up, then the amount > of time the master takes to notice a dead standby is going to be hard > for the user to predict. I think the standby ought to send some sort > of heartbeat and the master should declare the standby dead if it > doesn't see a heartbeat soon enough. Maybe the heartbeat could even > include the receive/fsync/replay LSNs, so that sync rep can use the > same machinery but with more aggressive policies about when they must > be sent. OK. How about keepalive-like parameters and behaviors? replication_keepalives_idle replication_keepalives_interval replication_keepalives_count The master sends the keepalive packet if replication_keepalives_idle elapsed after receiving the last ACK packet including the receive/ fsync/replay LSNs from the standby. OTOH, the standby sends the ACK packet back to the master as soon as receiving the keepalive packet. If the master could not receive the ACK packet for replication_keepalives_interval, it repeats sending the keepalive packet and receiving the ACK replication_keepalives_count -1 times. If no ACK packet has finally arrived, the master thinks the standby has been dead. One obvious merit against my original proposal is that the master can notice the death of the standby even when there are no WAL records sendable. One demerit is that the standby needs to send some packets even in asynchronous replication. Thought? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers