Re: [HACKERS] month abreviation
On 6/22/07, Euler Taveira de Oliveira [EMAIL PROTECTED] wrote: Jaime Casanova wrote: note the month abbreviation (mons?) is this intentional? This notation has been used since the code was written (~7 years ago) [1]. [1] http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/adt/datetime.c?rev=1.42;content-type=text%2Fx-cvsweb-markup mmm... so it has been bad for 7 years now... ;) OK, accepting that as an abbreviation for months, what controls that? Why do you get years, days and mons? I mean, why is this one abbreviated when the other two are not? -- regards, Jaime Casanova Programming today is a race between software engineers striving to build bigger and better idiot-proof programs and the universe trying to produce bigger and better idiots. So far, the universe is winning. Richard Cook ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] Documentation of contrib modules
Hi, I think I'll have some spare time, and I want to add some documentation of the contrib modules as discussed in [1]. At that time it was suggested that only some of the contrib modules should be in the main DocBook documentation. IMHO all of them (except start-scripts, probably) should be there so they get more exposure. I'm sure many PostgreSQL users don't know those contrib modules exist. I'd like to add a new part after Internals called Contrib Modules. Also, one question that comes to mind when looking at the current README files is whether compilation and installation instructions should be there, or whether a simple generic psql -d dbname -f module.sql in the part introduction would be enough. What do you think? [1] http://archives.postgresql.org/pgsql-hackers/2007-01/msg01443.php
Re: What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)
Heikki Linnakangas wrote: Since we're discussing upgrades, let me summarize the discussions we had over dinner in Ottawa for the benefit of all: Thanks for the summary. As before, someone just needs to step up and do it. I'm now working on a proposal. I hope it will be ready soon. Zdenek
Re: [HACKERS] EOL characters and multibyte encodings
Joe Conway [EMAIL PROTECTED] Tom Lane wrote: Joe Conway [EMAIL PROTECTED] writes: My first thought on fixing this issue was to simply replace all instances of '\r' in pg_proc.prosrc with '\n' prior to sending it to the R parser. As far as I know, any instances of '\r' embedded in a syntactically valid R statement must be escaped (i.e. literally the characters \ and r), so that should not be a problem. But I am concerned about how this potentially plays against multibyte characters. Is it safe to do this, or do I need to use a mb-aware replace algorithm? It's safe, because you'll be dealing with prosrc inside the backend, therefore using a backend-legal encoding, and those don't have any ASCII aliasing problems (all bytes of an MB character must have high bit set). The lower byte of some characters in BIG5, GBK, GB18030 may be less than 0x7F and don't have the high bit set. Fortunately, they don't use 0x0D and 0x0A (CR and LF). Regards, William ZHANG Great -- I wasn't sure about that.
Re: [HACKERS] Worries about delayed-commit semantics
On Thu, 2007-06-21 at 18:15 -0400, Tom Lane wrote: I've been reflecting a bit about whether the notion of deferred fsync for transaction commits is really safe. The proposed patch tries to ensure that no consequences of a committed transaction can reach disk before the commit WAL record is fsync'd, but ISTM there are potential holes in what it's doing. In particular the path that concerns me is (1) transaction A commits with deferred fsync; (2) transaction B observes some effect of A (eg, a committed-good tuple); (3) transaction B makes a change that is contingent on the observation. If B's changes were to reach disk in advance of A's commit record, we'd have a risk of logical inconsistency. B's changes cannot reach disk before B's commit record. That is the existing WAL-before-data rule implemented by the buffer manager. If B can see A's changes, then A has written a commit record to the log that is definitely before B's commit record. So B's commit will also commit A's changes to WAL when it flushes at EOX. So whether A is a guaranteed transaction or not, B can always rely on those changes. I agree this feels unsafe when you first think about it, and was the reason for me taking months before publishing the idea. The patch is doing what it can to prevent *direct* effects of A from reaching disk before the commit record does, but it doesn't (and I think cannot) extend this to indirect effects perpetrated by other transactions. An example of the sort of risk I'm worried about is a REINDEX omitting an index entry for a tuple that it sees as committed dead by A. Now this may be safe anyway, but it requires analysis that I don't recall anyone having put forward. The cases that I can see are: 1. Ordinary WAL-logged change in a shared buffer page. The change will not be allowed to reach disk before the associated WAL record does, and that WAL record must follow A's commit, so we're safe. 2. Non-WAL-logged change in a temp table. 
Could reach disk in advance of A's commit, but we don't care since temp table contents don't survive crashes anyway. 3. Non-WAL-logged change made via one of the paths we have introduced to avoid WAL overhead for bulk updates. In these cases it's entirely possible for the data to reach disk before A's commit, because B will fsync it down to disk without any sort of interlock, as soon as it finishes the bulk update. However, I believe it's the case that all these paths are designed to write data that no other transaction can see until after B commits. That commit must follow A's in the WAL log, so until it has reached disk, the contents of the bulk-updated file are unimportant after a crash. So I think it's probably all OK, but this is a sufficiently long chain of reasoning that it had better be checked over by multiple people and recorded as part of the design implications of the patch. Does anyone think any of this is wrong, or too fragile to survive future code changes? Are there cases I've missed? I've done the analysis, but perhaps I should finish the docs now to aid with review of the patch on the points you make. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
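The ordering argument in this message can be illustrated with a toy model. The following is only a sketch under stated assumptions: `ToyWAL`, its methods and the record strings are hypothetical names invented for illustration, not PostgreSQL code. It assumes the log is strictly append-only and that a flush up to a given position makes every earlier record durable as well, which is the property the reasoning above relies on.

```python
class ToyWAL:
    """Append-only log with a single flush pointer (hypothetical model)."""

    def __init__(self):
        self.records = []      # records in log order; LSN = index + 1
        self.flushed_lsn = 0   # every record with LSN <= this is durable

    def append(self, record):
        """Add a record to the in-memory log and return its LSN."""
        self.records.append(record)
        return len(self.records)

    def flush(self, upto_lsn):
        """Model an fsync: all records up to upto_lsn become durable."""
        self.flushed_lsn = max(self.flushed_lsn, upto_lsn)

    def durable(self):
        """Records that would survive a crash right now."""
        return self.records[:self.flushed_lsn]


# Case 1: A commits with deferred fsync, then B (which saw A) commits normally.
wal = ToyWAL()
lsn_a = wal.append("COMMIT A")   # appended only, no flush yet
lsn_b = wal.append("COMMIT B")   # B's commit record necessarily follows A's
wal.flush(lsn_b)                 # B's flush covers everything before it
durable_after_b = wal.durable()

# Case 2: crash before anybody flushes; neither commit survives,
# so B's dependent change cannot outlive A's commit.
crashed = ToyWAL()
crashed.append("COMMIT A")
crashed.append("COMMIT B")
durable_after_crash = crashed.durable()
```

In this model there is no state in which B's commit is durable and A's is not, which is the invariant the thread asks reviewers to check against the real patch.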
Re: [HACKERS] tsearch in core patch
3) ALTER FULLTEXT CONFIGURATION cfgname ADD/ALTER/DROP MAPPING done Why not rename ALTER FULLTEXT CONFIGURATION -- ALTER TEXT SEARCH CONFIGURATION here too? It's renamed too. Most languages can be written using the UNICODE charset and UTF-8 encoding, so neither charset nor encoding can be used to determine the language. Yes --- how many languages use the ISO8859-1 locale? ISO8859-1 is an encoding, not a locale. I meant that if we use the encoding name (for example PG_LATIN1), we can't distinguish the languages that use that encoding (for example Italian, Finnish and some more), but using locale names it's possible: it_IT.ISO8859-1, fi_FI.ISO8859-1 -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/
Re: [HACKERS] tsearch in core patch
The recommendation I was making was to use the language name, not the encoding name, in the user-visible configuration. How does it determine language of db automatically? -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/
Re: [HACKERS] Worries about delayed-commit semantics
Tom Lane [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: I've been reflecting a bit about whether the notion of deferred fsync for transaction commits is really safe. The proposed patch tries to ensure that no consequences of a committed transaction can reach disk before the commit WAL record is fsync'd, but ISTM there are potential holes in what it's doing. In particular the path that concerns me is (1) transaction A commits with deferred fsync; (2) transaction B observes some effect of A (eg, a committed-good tuple); (3) transaction B makes a change that is contingent on the observation. If B's changes were to reach disk in advance of A's commit record, we'd have a risk of logical inconsistency. The patch is doing what it can to prevent *direct* effects of A from reaching disk before the commit record does, but it doesn't (and I think cannot) extend this to indirect effects perpetrated by other transactions. An example of the sort of risk I'm worried about is a REINDEX omitting an index entry for a tuple that it sees as committed dead by A. Now this may be safe anyway, but it requires analysis that I don't recall anyone having put forward. The cases that I can see are: I think Simon did try to put all this in writing when he first proposed it. It's worth going through again with the actual implementation to be sure all the same guarantees hold. So I think it's probably all OK, but this is a sufficiently long chain of reasoning that it had better be checked over by multiple people and recorded as part of the design implications of the patch. Does anyone think any of this is wrong, or too fragile to survive future code changes? Are there cases I've missed? I think the logic you describe is not quite as subtle as you make it out to be. 
Certainly it's a bit surprising at first but it all boils down to the basic idea of how transactions and WAL records work: We never allow any other transactions to see the effects of our transaction until the commit record is fsynced to WAL. So now we're poking a hole in that but we certainly have to ensure that any transactions that do see the results of our deferred commit themselves don't record any visible effects until both their commit and ours hit WAL. The essential point in Simon's approach that guarantees that is that when you fsync you fsync all work that came before you. So committing a transaction also commits all deferred commits that you might depend on. BTW: I really dislike the name transaction guarantee for the feature; it sounds like marketing-speak, not to mention overpromising what we can deliver. Postgres can't guarantee anything in the face of untrustworthy disk hardware, for instance. I'd much rather use names derived from deferred commit or delayed commit or some such. Well from an implementation point of view we're delaying or deferring the commit. But from a user's point of view the important thing for them to realize is that a committed record could be lost. Perhaps we should just not come up with a new name and reuse the fsync variable. That way users of old installs which have fsync=off silently get this new behaviour. I'm not sure I like that idea since I use fsync=off to run cpu overhead tests here. But from a user's point of view it's probably the right thing. This is really what fsync=off should always have been doing. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] EOL characters and multibyte encodings
William ZHANG wrote: It's safe, because you'll be dealing with prosrc inside the backend, therefore using a backend-legal encoding, and those don't have any ASCII aliasing problems (all bytes of an MB character must have high bit set). The lower byte of some characters in BIG5, GBK, GB18030 may be less than 0x7F and don't have the high bit set. Fortunately, they don't use 0x0D and 0x0A (CR and LF). Those are client-only encodings, precisely for this sort of reason, and thus not relevant to the present discussion. As Tom points out above, when the language handler gets the code it will be encoded in the relevant backend encoding which can't be any of these. (Side note: the restriction by the R parser to unix-only line endings is a dreadful piece of design. As Jon Postel rightly said, the best rule is Be liberal in what you accept and conservative in what you send. Just about every parser for every language has been able to handle this, so why must R be different?) cheers andrew
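The byte-level claims in this sub-thread can be checked with a short script. This is only a sketch: the function names are made up for illustration, Python's built-in `big5`, `gbk`, `gb18030` and `utf-8` codecs stand in for the encodings being discussed, and UTF-8 stands in for a backend-legal encoding. The scan covers only the non-ASCII part of the Basic Multilingual Plane, so it is evidence rather than a proof.

```python
def normalize_eol(source: bytes) -> bytes:
    """Replace every CR byte with LF without decoding the text."""
    return source.replace(b"\r", b"\n")


def has_ascii_range_byte(ch: str, encoding: str) -> bool:
    """True if any byte of the encoded character has the high bit clear."""
    return any(b < 0x80 for b in ch.encode(encoding))


def multibyte_uses_cr_or_lf(encoding: str) -> bool:
    """Scan non-ASCII BMP characters for encodings containing 0x0D or 0x0A."""
    for code_point in range(0x80, 0x10000):
        try:
            encoded = chr(code_point).encode(encoding)
        except UnicodeEncodeError:
            continue  # surrogate, or not representable in this encoding
        if 0x0D in encoded or 0x0A in encoded:
            return True
    return False


# A byte-wise replace leaves a multibyte UTF-8 character untouched.
fixed = normalize_eol("x <- 'é'\r\ny <- 2\r".encode("utf-8"))

# A Big5 character whose trailing byte falls in the ASCII range.
big5_aliases_ascii = has_ascii_range_byte("功", "big5")
utf8_aliases_ascii = has_ascii_range_byte("功", "utf-8")

cr_lf_found = {
    name: multibyte_uses_cr_or_lf(name)
    for name in ("big5", "gbk", "gb18030", "utf-8")
}
```

The plain byte replace is enough here because, as both replies note, CR and LF never occur inside a multibyte character in any of these encodings.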
Re: [HACKERS] Worries about delayed-commit semantics
Tom Lane wrote: I've been reflecting a bit about whether the notion of deferred fsync for transaction commits is really safe. The proposed patch tries to ensure that no consequences of a committed transaction can reach disk before the commit WAL record is fsync'd, but ISTM there are potential holes in what it's doing. In particular the path that concerns me is BTW: I really dislike the name transaction guarantee for the feature; it sounds like marketing-speak, not to mention overpromising what we can deliver. Postgres can't guarantee anything in the face of Ahh but it can. :). PostgreSQL can guarantee that if the hardware is not faulty and the OS does what it is supposed to do... etc.. And yes, it is marketing but life is marketing, getting girlfriends is marketing. What matters is that once the marketing is over, you can stand up to the hype. untrustworthy disk hardware, for instance. I'd much rather use names derived from deferred commit or delayed commit or some such. Honestly, I prefer these names as well as it seems directly related versus transaction guarantee which sounds to be more like us saying, if we turn it off our transactions are bogus. Joshua D. Drake regards, tom lane -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
Re: [HACKERS] GUC time unit spelling a bit inconsistent
Michael Paesold wrote: Btw.: I'm currently at DebConf in Edinburgh. On Scottish motorway signage, 5m means five miles. Even the Americans do that better. So, no, you can't have m for minutes. ;) Even with the ;) here and the context, the last sentence sounds to me quite arrogant. Most people here have tried to bring arguments and reasoning... you put it off with irrelevant anecdotes in the wrong context. It is hard to argue with your analysis here. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] GUC time unit spelling a bit inconsistent
Michael Paesold wrote: Marko Kreen wrote: Considering Postgres will never use either meter or mile in settings, I don't consider your argument valid. I don't see the value of having units globally unique (literally). It's enough if they are unique in the context of postgresql.conf. Thus +1 for having the additional shortcuts Tom suggested. Also +1 for having them case-insensitive. Agreed. Although I suggest perhaps not pressing for m as minutes, because it really is ambiguous between months and minutes, esp. in a context like log_rotation_age. Please let's have the unambiguous abbreviations. Please let's make it all case-insensitive. After all this discussion, what about a straightforward vote? Bruce, we had those before, no? Right. No one dictates what goes into PostgreSQL and I think there are clearly enough people who want improvement in this area, including perhaps having 'm' meaning minutes and going with case insensitivity. Please post a patch that we can discuss/review. If it is small we can try to get it into 8.3. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] GUC time unit spelling a bit inconsistent
Peter Eisentraut wrote: Am Donnerstag, 21. Juni 2007 15:12 schrieb Andrew Dunstan: You don't seem to have any understanding that the units should be interpreted in context. You are right. I definitely have an understanding that units must be interpretable without context. And that clearly works for the most part. Consider even if we are clear that min is minutes, it could be chronological minutes or radial degree minutes, so yea, the context has to be considered. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Worries about delayed-commit semantics
Joshua D. Drake [EMAIL PROTECTED] writes: Tom Lane wrote: untrustworthy disk hardware, for instance. I'd much rather use names derived from deferred commit or delayed commit or some such. Honestly, I prefer these names as well as it seems directly related versus transaction guarantee which sounds to be more like us saying, if we turn it off our transactions are bogus. Hm, another possibility: synchronous_commit = off -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] tsearch in core patch
Teodor Sigaev wrote: The recommendation I was making was to use the language name, not the encoding name, in the user-visible configuration. How does it determine language of db automatically? I don't think we are going to do language selection automatically --- the user is going to have to set tsearch_conf_name. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Worries about delayed-commit semantics
On Fri, 2007-06-22 at 14:29 +0100, Gregory Stark wrote: Joshua D. Drake [EMAIL PROTECTED] writes: Tom Lane wrote: untrustworthy disk hardware, for instance. I'd much rather use names derived from deferred commit or delayed commit or some such. Honestly, I prefer these names as well as it seems directly related versus transaction guarantee which sounds to be more like us saying, if we turn it off our transactions are bogus. That was the intention..., but name change accepted. Hm, another possibility: synchronous_commit = off Ooo, I like that. Any other takers? -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Worries about delayed-commit semantics
Simon Riggs wrote: On Fri, 2007-06-22 at 14:29 +0100, Gregory Stark wrote: Joshua D. Drake [EMAIL PROTECTED] writes: Tom Lane wrote: untrustworthy disk hardware, for instance. I'd much rather use names derived from deferred commit or delayed commit or some such. Honestly, I prefer these names as well as it seems directly related versus transaction guarantee which sounds to be more like us saying, if we turn it off our transactions are bogus. That was the intention..., but name change accepted. Hm, another possibility: synchronous_commit = off Ooo, I like that. Any other takers? I like synchronous_commit = off, it even has a little girlfriend getting spin while being accurate :) Joshua D. Drake
Re: [HACKERS] tsearch in core patch
I don't think we are going to do language selection automatically --- the user is going to have to set tsearch_conf_name. Are you suggesting that we remove a long-lived feature of tsearch? In that case we don't need the cfglocale (or cfglanguage, as Tom suggested) and cfgdefault columns in pg_ts_cfg at all. Just set up tsearch_conf_name. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/
Re: [HACKERS] Worries about delayed-commit semantics
So now we're poking a hole in that but we certainly have to ensure that any transactions that do see the results of our deferred commit themselves don't record any visible effects until both their commit and ours hit WAL. The essential point in Simon's approach that guarantees that is that when you fsync you fsync all work that came before you. So committing a transaction also commits all deferred commits that you might depend on. BTW: I really dislike the name transaction guarantee for the feature; it sounds like marketing-speak, not to mention overpromising what we can deliver. Postgres can't guarantee anything in the face of untrustworthy disk hardware, for instance. I'd much rather use names derived from deferred commit or delayed commit or some such. Well from an implementation point of view we're delaying or deferring the commit. But from a user's point of view the important thing for them to realize is that a committed record could be lost. Perhaps we should just not come up with a new name and reuse the fsync variable. That way users of old installs which have fsync=off silently get this new behaviour. I'm not sure I like that idea since I use fsync=off to run cpu overhead tests here. But from a user's point of view it's probably the right thing. This is really what fsync=off should always have been doing. Say you call them SOFT COMMIT and HARD COMMIT... HARD COMMIT fsyncs, obviously. Does SOFT COMMIT fflush() the WAL (so it's postgres-crash-safe) or not? (just in case some user C function misbehaves and crashes) Do we get a config param to set default_commit_mode=hard or soft? By the way, InnoDB has a similar option, innodb_flush_log_at_trx_commit. However, you cannot set it on a per-transaction basis. So, on an e-commerce site, for instance, most transactions will be unimportant (ie.
no need to fsync, ACI only, like incrementing products view counts, add to cart, etc) but some transactions will have to be guaranteed (full ACID) like recording that an order has been submitted / paid / shipped. But with InnoDB you can't choose this on a per-transaction basis, so it's all or nothing.
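The per-transaction choice asked for in this message can be sketched with a toy model. Everything here is hypothetical: `ToyServer`, its `commit(synchronous=...)` flag and the transaction labels are invented for illustration and do not describe how the patch works internally. The model assumes a synchronous commit flushes the whole log up to its own record, and that an asynchronous commit is flushed later by a background step.

```python
class ToyServer:
    """Toy model of commits with a per-transaction durability choice."""

    def __init__(self):
        self.wal = []      # commit records in log order
        self.flushed = 0   # number of leading records known to be on disk

    def commit(self, name, synchronous=True):
        """Append a commit record; flush immediately only if synchronous."""
        self.wal.append(name)
        if synchronous:
            self.flushed = len(self.wal)   # flush covers all earlier records

    def background_flush(self):
        """Model a periodic flush that picks up deferred commits."""
        self.flushed = len(self.wal)

    def crash(self):
        """Return (survived, lost) commits as seen after crash recovery."""
        return self.wal[:self.flushed], self.wal[self.flushed:]


server = ToyServer()
server.commit("view_count+1", synchronous=False)   # unimportant, deferred
server.commit("order_paid", synchronous=True)      # must be durable on return
server.commit("add_to_cart", synchronous=False)    # deferred, not yet flushed
survived, lost = server.crash()
```

In this run the deferred `view_count+1` survives because the later synchronous commit flushed past it, while `add_to_cart` is lost. Only deferred commits issued after the last flush are at risk, which is the trade-off the e-commerce example describes.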
Re: [HACKERS] tsearch in core patch
Alvaro Herrera [EMAIL PROTECTED] writes: I very much doubt that the different spanishes are any different in the stemming rules, so there's no need for es_ES, es_PE, es_AR, es_CL etc; but in the case of portuguese I'm not so sure. Maybe there are other examples (like chinese, but I'm not sure how useful is tsearch for chinese). And the .ISO8859-1 part you don't need at all if you accept that the files are UTF8 by design, as Tom proposed. Also, the problem we're dealing with here is mainly lack of standardization of the encoding part of locale names. AFAIK, just about everybody agrees on es_ES, ru_RU, etc; it's the part that comes after that (if any) that is not too consistent across platforms. So I see no problem in distinguishing between pt_PT and pt_BR if it turns out we have to. The trick is to not look at any more of the locale name than that; and if we standardize on stopword files are UTF8 then I don't think we need to. regards, tom lane
Re: [HACKERS] month abreviation
Jaime Casanova wrote: On 6/22/07, Euler Taveira de Oliveira [EMAIL PROTECTED] wrote: Jaime Casanova wrote: note the month abbreviation (mons?) is this intentional? This notation has been used since the code was written (~7 years ago) [1]. [1] http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/adt/datetime.c?rev=1.42;content-type=text%2Fx-cvsweb-markup mmm... so it has been bad for 7 years now... ;) OK, accepting that as an abbreviation for months, what controls that? Why do you get years, days and mons? I mean, why is this one abbreviated when the other two are not? I thought there was some standard that required that, but I don't remember which one. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Worries about delayed-commit semantics
Joshua D. Drake wrote: Bruce Momjian wrote: Simon Riggs wrote: On Fri, 2007-06-22 at 14:29 +0100, Gregory Stark wrote: Joshua D. Drake [EMAIL PROTECTED] writes: Tom Lane wrote: untrustworthy disk hardware, for instance. I'd much rather use names derived from deferred commit or delayed commit or some such. Honestly, I prefer these names as well as it seems directly related versus transaction guarantee which sounds to be more like us saying, if we turn it off our transactions are bogus. That was the intention..., but name change accepted. Hm, another possibility: synchronous_commit = off Ooo, I like that. Any other takers? Yea, I like that too but I am now realizing that we are not really deferring or delaying the COMMIT command but rather the recovery of the commit. GUC as full_commit_recovery? recovery is a bad word I think. It is related too closely to failure. commit_stability? reliable_commit? -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Worries about delayed-commit semantics
Bruce Momjian wrote: Simon Riggs wrote: On Fri, 2007-06-22 at 14:29 +0100, Gregory Stark wrote: Joshua D. Drake [EMAIL PROTECTED] writes: Tom Lane wrote: untrustworthy disk hardware, for instance. I'd much rather use names derived from deferred commit or delayed commit or some such. Honestly, I prefer these names as well as it seems directly related versus transaction guarantee which sounds to be more like us saying, if we turn it off our transactions are bogus. That was the intention..., but name change accepted. Hm, another possibility: synchronous_commit = off Ooo, I like that. Any other takers? Yea, I like that too but I am now realizing that we are not really deferring or delaying the COMMIT command but rather the recovery of the commit. GUC as full_commit_recovery? recovery is a bad word I think. It is related too closely to failure. Sincerely, Joshua D. Drake
Re: [HACKERS] Worries about delayed-commit semantics
Joshua D. Drake wrote: I like synchronous_commit = off, it even has a little girlfriend getting spin while being accurate :) In my experience, *_commit = off rarely gets you a girlfriend ... cheers andrew
Re: [HACKERS] tsearch in core patch
Teodor Sigaev [EMAIL PROTECTED] writes: I don't think we are going to do language selection automatically --- the user is going to have to set tsearch_conf_name. Are you suggesting that we remove a long-lived feature of tsearch? In that case we don't need the cfglocale (or cfglanguage, as Tom suggested) and cfgdefault columns in pg_ts_cfg at all. Just set up tsearch_conf_name. Is the point here for initdb to be able to establish a sane default initially? Seems to me it can guess the language from the first component of the locale (ru_RU -> russian). regards, tom lane
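The guess described here, taking only the first component of the locale name, could look roughly like the sketch below. It is not initdb code: `split_locale`, `guess_config`, the `LANGUAGE_CONFIGS` table and the `simple` fallback are all assumptions made for illustration. The codeset part is parsed but deliberately ignored, since the thread notes that it is the least standardized part of a locale name.

```python
# Hypothetical mapping from language codes to text search configuration names.
LANGUAGE_CONFIGS = {
    "ru": "russian",
    "it": "italian",
    "fi": "finnish",
    "es": "spanish",
    "pt": "portuguese",
}


def split_locale(locale_name):
    """Split 'll_TT.codeset@modifier' into (language, territory, codeset)."""
    base, _, _modifier = locale_name.partition("@")
    base, _, codeset = base.partition(".")
    language, _, territory = base.partition("_")
    return language.lower(), territory.upper(), codeset


def guess_config(locale_name, default="simple"):
    """Pick a configuration from the language component only."""
    language, _territory, _codeset = split_locale(locale_name)
    return LANGUAGE_CONFIGS.get(language, default)


examples = {
    name: guess_config(name)
    for name in ("ru_RU", "it_IT.ISO8859-1", "fi_FI.UTF-8", "es_PE", "C")
}
```

If pt_PT and pt_BR turned out to need different rules, the lookup key could be extended to the `(language, territory)` pair while still leaving the codeset out of it.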
Re: [HACKERS] tsearch in core patch
Teodor Sigaev wrote: --- how many languages use the ISO8859-1 locale? ISO8859-1 is an encoding, not a locale. I meant that if we use the encoding name (for example PG_LATIN1), we can't distinguish the languages that use that encoding (for example Italian, Finnish and some more), but using locale names it's possible: it_IT.ISO8859-1, fi_FI.ISO8859-1 I don't understand. Why use it_IT.ISO8859-1? You just need to know the language, so it is enough. The _IT part specifies that it's the Italian spoken in Italy. This may be irrelevant in most cases, but consider that pt_PT and pt_BR are AFAIK somewhat different languages. I very much doubt that the different spanishes are any different in the stemming rules, so there's no need for es_ES, es_PE, es_AR, es_CL etc; but in the case of portuguese I'm not so sure. Maybe there are other examples (like chinese, but I'm not sure how useful is tsearch for chinese). And the .ISO8859-1 part you don't need at all if you accept that the files are UTF8 by design, as Tom proposed. -- Alvaro Herrera Developer, http://www.PostgreSQL.org/ Nadie esta tan esclavizado como el que se cree libre no siendolo (Goethe)
Re: [HACKERS] Worries about delayed-commit semantics
Joshua D. Drake wrote: Simon Riggs wrote: On Fri, 2007-06-22 at 14:29 +0100, Gregory Stark wrote: Joshua D. Drake [EMAIL PROTECTED] writes: Tom Lane wrote: untrustworthy disk hardware, for instance. I'd much rather use names derived from deferred commit or delayed commit or some such. Honestly, I prefer these names as well as it seems directly related versus transaction guarantee which sounds to be more like us saying, if we turn it off our transactions are bogus. That was the intention..., but name change accepted. Hm, another possibility: synchronous_commit = off Ooo, I like that. Any other takers? I like synchronous_commit = off, it even has a little girlfriend getting spin while being accurate :) Or perhaps sync_on_commit = off? Less girlfriend-speak perhaps: no_sync_on_commit = on -- Richard Huxton Archonet Ltd
Re: [HACKERS] Worries about delayed-commit semantics
Simon Riggs [EMAIL PROTECTED] writes: On Fri, 2007-06-22 at 14:29 +0100, Gregory Stark wrote: Joshua D. Drake [EMAIL PROTECTED] writes: Hm, another possibility: synchronous_commit = off Ooo, I like that. Any other takers? OK with me regards, tom lane
Re: [HACKERS] GUC time unit spelling a bit inconsistent
Andrew Sullivan wrote: On Thu, Jun 21, 2007 at 03:24:51PM +0200, Michael Paesold wrote: There are valid reasons against 5m as mega-bytes, because here m does not refer to a unit, it refers to a quantifier (if that is a reasonable English word) of a unit. So it should really be 5mb. log_rotation_age = 5m log_rotation_size = 5mb Except, of course, that 5mb would be understood by those of us who work in metric and use both bits and bytes as 5 millibits. At one point I submitted a patch to make units case insensitive; since submitting that patch I have decided that it was a horrible idea. Why can't we use standard units? Mb, Kb, KB, MB... (I don't know the standard unit for minutes). The more I see this going back and forth, the more it seems we should just do it right the first time and tell everyone else to read: the fine manual, and the spec(s) that define the units. Joshua D. Drake Which would be an absurd value, but since Postgres had support for time travel once, who knows what other wonders the developers have come up with ;-) (I will note, though, that this B vs b problem really gets up my nose, especially when I hear people who are ostensibly designing networks talking about gigabyte ethernet cards. I would _like_ such a card, I confess, but to my knowledge the standard hasn't gotten that far yet.) Nevertheless, I think that Tom's original suggestion was at least a HINT, which seems perfectly reasonable to me. A -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
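The case-sensitivity argument in this message can be made concrete with a small sketch. It assumes a hypothetical parser that accepts only exact-case unit suffixes and reports a hint when only the letter case is wrong; the unit tables and the function name parse_setting are illustrative assumptions, not PostgreSQL's actual GUC code.

```python
import re

# Hypothetical unit tables for illustration only; not PostgreSQL's GUC code.
SIZE_UNITS = {"kB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}            # bytes
TIME_UNITS = {"ms": 0.001, "s": 1, "min": 60, "h": 3600, "d": 86400}  # seconds

_VALUE_RE = re.compile(r"^\s*(\d+)\s*([A-Za-z]+)\s*$")


def parse_setting(text, units):
    """Return the value in base units, matching the suffix case-sensitively."""
    match = _VALUE_RE.match(text)
    if match is None:
        raise ValueError(f"malformed value: {text!r}")
    number, suffix = int(match.group(1)), match.group(2)
    if suffix not in units:
        # Offer a hint when only the letter case differs, in the spirit of
        # the HINT mentioned in the thread.
        close = [u for u in units if u.lower() == suffix.lower()]
        hint = f" (did you mean {close[0]!r}?)" if close else ""
        raise ValueError(f"unknown unit {suffix!r}{hint}")
    return number * units[suffix]
```

With these assumed tables, "5MB" and "5min" parse, "5mb" is rejected with a hint pointing at MB, and a bare "5m" is rejected outright because it is ambiguous between minutes and a size prefix.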
Re: [HACKERS] Worries about delayed-commit semantics
Simon Riggs wrote: On Fri, 2007-06-22 at 14:29 +0100, Gregory Stark wrote: Joshua D. Drake [EMAIL PROTECTED] writes: Tom Lane wrote: untrustworthy disk hardware, for instance. I'd much rather use names derived from deferred commit or delayed commit or some such. Honestly, I prefer these names as well as it seems directly related versus transaction guarantee which sounds to be more like us saying, if we turn it off our transactions are bogus. That was the intention..., but name change accepted. Hm, another possibility: synchronous_commit = off Ooo, I like that. Any other takers? Yea, I like that too but I am now realizing that we are not really deferring or delaying the COMMIT command but rather the recovery of the commit. GUC as full_commit_recovery? -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] tsearch in core patch
Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: I very much doubt that the different spanishes are any different in the stemming rules, so there's no need for es_ES, es_PE, es_AR, es_CL etc; but in the case of portuguese I'm not so sure. Maybe there are other examples (like chinese, but I'm not sure how useful is tsearch for chinese). And the .ISO8859-1 part you don't need at all if you accept that the files are UTF8 by design, as Tom proposed. Also, the problem we're dealing with here is mainly lack of standardization of the encoding part of locale names. AFAIK, just about everybody agrees on es_ES, ru_RU, etc; it's the part that comes after that (if any) that is not too consistent across platforms. So I see no problem in distinguishing between pt_PT and pt_BR if it turns out we have to. The trick is to not look at any more of the locale name than that; and if we standardize on stopword files are UTF8 then I don't think we need to. OK, and the open question is when do we do this default setting. If we do it in initdb then we can isolate all the detection there. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Worries about delayed-commit semantics
On Fri, 22 Jun 2007 16:43:00 +0200, Bruce Momjian [EMAIL PROTECTED] wrote: Simon Riggs wrote: On Fri, 2007-06-22 at 14:29 +0100, Gregory Stark wrote: Joshua D. Drake [EMAIL PROTECTED] writes: Tom Lane wrote: untrustworthy disk hardware, for instance. I'd much rather use names derived from deferred commit or delayed commit or some such. Honestly, I prefer these names as well as it seems directly related versus transaction guarantee which sounds to be more like us saying, if we turn it off our transactions are bogus. That was the intention..., but name change accepted. Hm, another possibility: synchronous_commit = off Ooo, I like that. Any other takers? Yea, I like that too but I am now realizing that we are not really deferring or delaying the COMMIT command but rather the recovery of the commit. GUC as full_commit_recovery? commit_waits_for_fsync = force_yes : makes all commits hard yes : commits are hard unless specified otherwise [default] no : commits are soft unless specified otherwise [should replace fsync=off use case] force_no : makes all commits soft (controller with write cache emulator) the force_yes and force_no are for benchmarking purposes mostly, ie. once your app is tuned to specify which commits have to be guaranteed (hard) and which don't (soft) you can then bench it with force_yes and force_no to see how much you gained, and how much you'd gain by buying a write cache controller... ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Worries about delayed-commit semantics
Bruce Momjian [EMAIL PROTECTED] writes: Joshua D. Drake wrote: Hm, another possibility: synchronous_commit = off Ooo, I like that. Any other takers? Yea, I like that too but I am now realizing that we are not really deferring or delaying the COMMIT command but rather the recovery of the commit. GUC as full_commit_recovery? recovery is a bad word I think. It is related too closely to failure. commit_stability? reliable_commit? What's wrong with synchronous_commit? It's accurate and simple. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Worries about delayed-commit semantics
On Fri, 2007-06-22 at 10:52 -0400, Bruce Momjian wrote: commit_stability? reliable_commit? commit_durability? That then relates it directly to the D in ACID. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] tsearch in core patch
On Fri, 22 Jun 2007, Bruce Momjian wrote: Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: I very much doubt that the different spanishes are any different in the stemming rules, so there's no need for es_ES, es_PE, es_AR, es_CL etc; but in the case of portuguese I'm not so sure. Maybe there are other examples (like chinese, but I'm not sure how useful is tsearch for chinese). And the .ISO8859-1 part you don't need at all if you accept that the files are UTF8 by design, as Tom proposed. Also, the problem we're dealing with here is mainly lack of standardization of the encoding part of locale names. AFAIK, just about everybody agrees on es_ES, ru_RU, etc; it's the part that comes after that (if any) that is not too consistent across platforms. So I see no problem in distinguishing between pt_PT and pt_BR if it turns out we have to. The trick is to not look at any more of the locale name than that; and if we standardize on stopword files are UTF8 then I don't think we need to. OK, and the open question is when do we do this default setting. If we do it in initdb then we can isolate all the detection there. We can do that at initdb time, but we still have to decide how to map human-readable language name and lang part of locale name. Are we going to hardcode it ? It's not friendly for hosting solution, when people often have no access to the postgresql.conf, so they need to remember setting tsearch_conf_name. It could be solved using 'alter user ... set tsearch_conf_name' command though. Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83 ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] tsearch in core patch
Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: I very much doubt that the different spanishes are any different in the stemming rules, so there's no need for es_ES, es_PE, es_AR, es_CL etc; but in the case of portuguese I'm not so sure. Maybe there are other examples (like chinese, but I'm not sure how useful is tsearch for chinese). And the .ISO8859-1 part you don't need at all if you accept that the files are UTF8 by design, as Tom proposed. Also, the problem we're dealing with here is mainly lack of standardization of the encoding part of locale names. AFAIK, just about everybody agrees on es_ES, ru_RU, etc; it's the part that comes after that (if any) that is not too consistent across platforms. That may have been true until we started supporting Windows... Swedish_Sweden.1252 is what I get on my machine, for example. Principle is the same, but values certainly aren't. //Magnus ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Worries about delayed-commit semantics
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Joshua D. Drake wrote: Hm, another possibility: synchronous_commit = off Ooo, I like that. Any other takers? Yea, I like that too but I am now realizing that we are not really deferring or delaying the COMMIT command but rather the recovery of the commit. GUC as full_commit_recovery? recovery is a bad word I think. It is related too closely to failure. commit_stability? reliable_commit? What's wrong with synchronous_commit? It's accurate and simple. That is fine too. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Worries about delayed-commit semantics
PFC wrote: On Fri, 22 Jun 2007 16:43:00 +0200, Bruce Momjian [EMAIL PROTECTED] wrote: Simon Riggs wrote: On Fri, 2007-06-22 at 14:29 +0100, Gregory Stark wrote: Joshua D. Drake [EMAIL PROTECTED] writes: Tom Lane wrote: untrustworthy disk hardware, for instance. I'd much rather use names derived from deferred commit or delayed commit or some such. Honestly, I prefer these names as well as it seems directly related versus transaction guarantee which sounds to be more like us saying, if we turn it off our transactions are bogus. That was the intention..., but name change accepted. Hm, another possibility: synchronous_commit = off Ooo, I like that. Any other takers? Yea, I like that too but I am now realizing that we are not really deferring or delaying the COMMIT command but rather the recovery of the commit. GUC as full_commit_recovery? commit_waits_for_fsync = force_yes: makes all commits hard yes: commits are hard unless specified otherwise [default] no: commits are soft unless specified otherwise [should replace fsync=off use case] force_no: makes all commits soft (controller with write cache emulator) I think you got the last line backwards: without the fsync() after a commit, you can't be sure that the data made it into the controller cache. To be safe you *always* need the fsync(), but it will probably be much cheaper if your controller doesn't have to actually write to the disks and can cache in battery-backed RAM instead. Therefore, if you own such a controller, you probably don't need deferred commits. BTW, I like synchronous_commit too, but maybe asynchronous_commit would be even better, with inverted semantics of course. Then you'd have asynchronous_commit = off as default.
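The four-valued setting quoted above can be sketched as a small decision function. This is only an illustration of the proposal as worded in the thread: commit_waits_for_fsync was a suggested name, not an actual PostgreSQL setting, and the function name commit_is_hard and its per_commit argument are assumptions made for the example.

```python
from typing import Optional

# Values of the setting as proposed in the thread. The setting name
# commit_waits_for_fsync is a proposal, not a real PostgreSQL GUC.
VALID_SETTINGS = ("force_yes", "yes", "no", "force_no")


def commit_is_hard(setting: str, per_commit: Optional[bool] = None) -> bool:
    """Decide whether a commit waits for fsync ("hard") or not ("soft").

    per_commit is the application's explicit request for this commit:
    True = hard, False = soft, None = nothing specified.
    """
    if setting not in VALID_SETTINGS:
        raise ValueError(f"invalid setting: {setting!r}")
    if setting == "force_yes":
        return True            # all commits hard, requests ignored
    if setting == "force_no":
        return False           # all commits soft, requests ignored
    if per_commit is not None:
        return per_commit      # "yes"/"no" only supply the default
    return setting == "yes"
```

The two force_ values override whatever the application asked for, which is what makes them usable for the before/after benchmarking described in the proposal.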
Re: [HACKERS] Worries about delayed-commit semantics
On Jun 22, 2007, at 9:23 , Richard Huxton wrote: Or perhaps sync_on_commit = off? Or switch it around... sink_on_commit = on (sorry for the noise) Michael Glaesemann grzm seespotcode net ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] tsearch in core patch
Magnus Hagander wrote: Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: I very much doubt that the different spanishes are any different in the stemming rules, so there's no need for es_ES, es_PE, es_AR, es_CL etc; but in the case of portuguese I'm not so sure. Maybe there are other examples (like chinese, but I'm not sure how useful is tsearch for chinese). And the .ISO8859-1 part you don't need at all if you accept that the files are UTF8 by design, as Tom proposed. Also, the problem we're dealing with here is mainly lack of standardization of the encoding part of locale names. AFAIK, just about everybody agrees on es_ES, ru_RU, etc; it's the part that comes after that (if any) that is not too consistent across platforms. That may have been true until we started supporting Windows... Swedish_Sweden.1252 is what I get on my machine, for example. Principle is the same, but values certainly aren't. Well, at least the name is not itself translated, so a mapping table is not right out of the question. If they had put a name like Español_Chile instead of Spanish_Chile we would be in serious trouble. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Worries about delayed-commit semantics
Bruce Momjian wrote: Tom Lane wrote: What's wrong with synchronous_commit? It's accurate and simple. That is fine too. My concern would be that it can be read two ways: 1. When you commit, sync (something or other - unspecified) 2. Synchronise commits (to each other? to something else?)* It's obvious to people on the -hackers list what we're talking about, but is it so clear to a newbie, perhaps non-English speaker? * I can see people thinking this means something like commit_delay. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] tsearch in core patch
That may have been true until we started supporting Windows... Swedish_Sweden.1252 is what I get on my machine, for example. Principle is the same, but values certainly aren't. Well, at least the name is not itself translated, so a mapping table is not right out of the question. If they had put a name like Español_Chile instead of Spanish_Chile we would be in serious trouble. I don't think so; otherwise you couldn't type or display the name in order to change the locale :). So, final proposal: rename cfglocale to cfglanguages and store in it an array of language names, produced from the first part of the locale names: russian '{ru_RU, Russian_Russia}' spanish '{es_ES, es_CL, Spanish_Spain, Spanish_Chile}' Comments? Are there any obstacles to using GIN indexes in pg_catalog?
Re: [HACKERS] tsearch in core patch
Michael Glaesemann wrote: On Jun 22, 2007, at 9:28 , Tom Lane wrote: Is the point here for initdb to be able to establish a sane default initially? Seems to me it can guess the language from the first component of the locale (ru_RU - russian). How would this work for initdb with locale C? Yea, that's a problem. I am thinking we should just avoid the entire issue and require it to be set by the user, and throw an error if the configuration is not set. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] tsearch in core patch
[EMAIL PROTECTED] wrote: So, final proposal: rename cfglocale to cfglanguages and store in it an array of language names, produced from the first part of the locale names: russian '{ru_RU, Russian_Russia}' spanish '{es_ES, es_CL, Spanish_Spain, Spanish_Chile}' Comments? Why not do it the other way around? es_ES spanish Spanish_Spain spanish ru_RU russian pt_BR portuguese_brazil That way you don't need any funny index. Or do you need the list of locales for each language? (but even if you do, you can easily obtain it by indexing both columns separately using btrees anyway) -- Alvaro Herrera http://www.PlanetPostgreSQL.org/ I can see support will not be a problem. 10 out of 10. (Simon Wittber) (http://archives.postgresql.org/pgsql-general/2004-12/msg00159.php)
Re: [HACKERS] tsearch in core patch
On Jun 22, 2007, at 9:28 , Tom Lane wrote: Is the point here for initdb to be able to establish a sane default initially? Seems to me it can guess the language from the first component of the locale (ru_RU - russian). How would this work for initdb with locale C? I'm worrying about that too. -- Tatsuo Ishii SRA OSS, Inc. Japan ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] tsearch in core patch
On Jun 22, 2007, at 9:28 , Tom Lane wrote: Is the point here for initdb to be able to establish a sane default initially? Seems to me it can guess the language from the first component of the locale (ru_RU - russian). How would this work for initdb with locale C? Michael Glaesemann grzm seespotcode net ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] tsearch in core patch
Why not do it the other way around? es_ES spanish Spanish_Spain spanish ru_RU russian pt_BR portuguese_brazil That way you don't need any funny index. Or do you need the list of locales for each language? (but even if you do, you can easily obtain it by indexing both columns separately using btrees anyway) Yes, that's possible, but it increases the number of identical configurations: russian_win Russian_Russia russian_unix ru_RU They don't differ except for the locale name.
Re: [Fwd: Re: [HACKERS] tsearch in core patch]
How would this work for initdb with locale C? I'm worrying about that too. english '{en_GB, en_US, C}' I suppose that a locale name always has a dot separator, except for the C locale, which is a well-known exception. So would we have to do this?: japanese '{ja_JP, C}' How would we know C - japanese? Also I'm wondering how we could handle texts including both Japanese and English. It's very common in Japan. -- Tatsuo Ishii SRA OSS, Inc. Japan
Re: [HACKERS] tsearch in core patch
Tatsuo Ishii [EMAIL PROTECTED] writes: On Jun 22, 2007, at 9:28 , Tom Lane wrote: Is the point here for initdb to be able to establish a sane default initially? Seems to me it can guess the language from the first component of the locale (ru_RU - russian). How would this work for initdb with locale C? I'm worrying about that too. I would be surprised if C locale defaulted to anything except English. I suppose it would be sensible to add a switch to allow people to select a different language. In any case, the only thing initdb would be doing would be setting up an initial value of a table entry or GUC variable, so you could always change it yourself later; it may not be worth sweating too much about this. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Worries about delayed-commit semantics
Richard Huxton [EMAIL PROTECTED] writes: Tom Lane wrote: What's wrong with synchronous_commit? It's accurate and simple. My concern would be that it can be read two ways: 1. When you commit, sync (something or other - unspecified) 2. Synchronise commits (to each other? to something else?)* Well, that's a fair point. deferred_commit would avoid that objection. I'm not sure it's real important though --- with practically all of the postgresql.conf variables, you really need to read the manual to know exactly what they do. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[Fwd: Re: [HACKERS] tsearch in core patch]
How would this work for initdb with locale C? I'm worrying about that too. english '{en_GB, en_US, C}' I suppose that a locale name always has a dot separator, except for the C locale, which is a well-known exception.
Re: [HACKERS] Worries about delayed-commit semantics
Richard Huxton wrote: Bruce Momjian wrote: Tom Lane wrote: What's wrong with synchronous_commit? It's accurate and simple. That is fine too. My concern would be that it can be read two ways: 1. When you commit, sync (something or other - unspecified) 2. Synchronise commits (to each other? to something else?)* It's obvious to people on the -hackers list what we're talking about, but is it so clear to a newbie, perhaps non-English speaker? * I can see people thinking this means something like commit_delay. OTOH, the concept of synchronous vs. asynchronous (function) calls should be pretty well-known among database programmers and administrators. And (at least to me), this is really what this is about - the commit happens asynchronously, at the convenience of the database, and not the instant that I requested it. greetings, Florian Pflug ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] GUC time unit spelling a bit inconsistent
On Friday, 22 June 2007 at 15:34, Bruce Momjian wrote: Consider even if we are clear that min is minutes, it could be chronological minutes or radial degree minutes, so yea, the context has to be considered. The correct symbol for an arc minute is ´, so there is no context dependency. -- Peter Eisentraut http://developer.postgresql.org/~petere/
Re: [HACKERS] tsearch in core patch
[EMAIL PROTECTED] wrote: Why not do it the other way around? es_ES spanish Spanish_Spain spanish ru_RU russian pt_BR portuguese_brazil That way you don't need any funny index. Or do you need the list of locales for each language? (but even if you do, you can easily obtain it by indexing both columns separately using btrees anyway) Yes, that's possible, but it increases the number of identical configurations: russian_win Russian_Russia russian_unix ru_RU They don't differ except for the locale name. But why do you need them to be different at all? Just make it russian Russian_Russia russian ru_RU Does that not work for some reason? What I was really suggesting was having a table mapping locale names into tsearch languages. Then the configuration could be made based on the language, not on the locale name. So the stopword list is for russian, regardless of whether the locale is Russian_Russia or ru_RU. Is this only for the stopword list, or does it also affect selecting a stemmer? Note: it's possible that the stopword list for Brazilian Portuguese differs from the one for Portuguese as spoken in Portugal, which is why I was suggesting using a language portuguese_brazil and not just portuguese. Whereas you need a single stopword list for all the countries speaking Spanish, which is why you need only one language called spanish. -- Alvaro Herrera http://www.advogato.org/person/alvherre Llegará una época en la que una investigación diligente y prolongada sacará a la luz cosas que hoy están ocultas (Séneca, siglo I)
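The mapping-table idea in this message can be sketched in a few lines. The table contents mirror the examples given in the thread; the names LOCALE_TO_LANGUAGE and language_for_locale are hypothetical, and the rule of ignoring everything after the first dot follows the earlier suggestion not to look at the encoding part of the locale name. It is a sketch under those assumptions, not the tsearch implementation.

```python
from typing import Optional

# Hypothetical mapping table: locale name (without the encoding part) to
# tsearch language. Entries mirror the examples given in the thread.
LOCALE_TO_LANGUAGE = {
    "es_ES": "spanish",
    "es_CL": "spanish",
    "Spanish_Spain": "spanish",
    "Spanish_Chile": "spanish",
    "ru_RU": "russian",
    "Russian_Russia": "russian",
    "pt_BR": "portuguese_brazil",
}


def language_for_locale(locale_name: str) -> Optional[str]:
    """Look up the language for a locale, ignoring encoding and modifier."""
    # Drop the encoding part (".ISO8859-1", ".1252", ...) and any "@modifier".
    base = locale_name.split(".", 1)[0].split("@", 1)[0]
    return LOCALE_TO_LANGUAGE.get(base)
```

Because both the Unix-style and the Windows-style names map to the same language, a single russian configuration serves either platform. A locale with no entry, such as C, returns None here; whether that should then default to English or force the user to choose is the open question in the thread.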
[HACKERS] fast stop before database system is ready
I apologize for not grabbing more information before the evidence was gone, but I think there may be a vulnerability to database corruption on PITR recovery if a stop is done with the fast option right after a database logs archive recovery complete. We normally have about 17 seconds between that and the database system is ready message for a particular database. Someone was watching the log and issued a fast stop about 1.5 seconds after the archive recovery is complete message. When the database came back up, it was corrupted. (The first problem message was about a bad sibling pointer, but the wheels pretty much fell off after that.) He deleted the database instance, got a fresh dump, and tried again without stopping the server at that point, and all is well. The dump used in the problem recovery attempt is now gone. I hesitate to report this since my information is so sketchy, but thought you might want the report anyway. The source and target of this PITR-style copy were both PostgreSQL 8.2.4 on SuSE Linux. For more details on the target see my recent posts about the corrupted database which turned out to be caused by bad hardware and outdated drivers. ( http://archives.postgresql.org/pgsql-admin/2007-06/msg00151.php ) The failed recovery was on that box, after fixing all known hardware and driver issues. No assistance needed. -Kevin ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] Refactoring parser/analyze.c
In connection with bug #3403 http://archives.postgresql.org/pgsql-bugs/2007-06/msg00114.php I've come to the conclusion that we really shouldn't do *any* processing of utility commands at parse analysis time; they should be left as raw-grammar output trees until execution. The key reason for this is that any processing we do that is dependent on database state might be obsolete by the time of execution, and we don't have any infrastructure for taking locks or otherwise checking the up-to-dateness of a utility command tree. The time delay involved could be significant in the case of a command that is put into the plan cache (eg, a statement in a plpgsql function), so this isn't an academic concern. I had already foreseen this and delayed the processing of several utility commands (eg, CREATE INDEX, CREATE RULE) until runtime as part of the plan-cache patch; but I left CREATE TABLE and ALTER TABLE alone, mistakenly thinking that their parse analysis work was purely syntactic transformations and so could be done without reference to the database state. As noted in the discussion of bug #3403, this is wrong with respect to the processing of SERIAL-column sequences. And there's also the matter of CREATE TABLE ... LIKE, for which the CVS-HEAD code says * Change the LIKE subtable portion of a CREATE TABLE statement into * column definitions which recreate the user defined column portions of * subtable. * * Note: because we do this at parse analysis time, any change in the * referenced table between parse analysis and execution won't be reflected * into the new table. Is this OK? So I'm thinking we should complete the break-up and delay the processing done by transformCreateStmt and transformAlterTableStmt until execution of the utility command begins. In the case of ALTER TABLE we should take out an exclusive lock on the target table before we even start to do any of transformAlterTableStmt's work. 
I had originally thought that parser/analyze.c was too intertwined to try to break up, but upon looking more closely I find that there is actually almost complete separation between the handling of plannable commands and utility commands. I would like to refactor analyze.c into two files to reflect this new understanding of when things happen: analyze.c: keeps parse_analyze, transformStmt, and the handling of SELECT/INSERT/UPDATE/DELETE commands, as well as EXPLAIN and DECLARE CURSOR, which are special cases but more nearly related to plannable commands than not. a new file named something like parse_utilcmd.c: transformCreateStmt, transformAlterTableStmt, transformCreateSchemaStmt, transformIndexStmt, transformRuleStmt, and subsidiary routines. These functions would now be called at the beginning of execution of the respective utility commands, and not from parse_analyze() at all. It looks like only release_pstate_resources() and makeFromExpr() are used in common by these two files; both of them arguably belong somewhere else anyway (parse_node.c and makefuncs.c respectively). Also we might need to export transformStmt() from analyze.c; the utility-command routines currently call that directly, and I'm undecided whether they can or should go through parse_analyze() instead. With this refactoring, there will not be any use of the extras_before/extras_after mechanism within analyze.c, and I'm sorely tempted to just rip it out, redeclaring parse_analyze() and friends to return a single Query node instead of a List. Can anyone foresee a reason we might still need to return multiple Query nodes from a single plannable statement? (Note: rule expansion isn't a reason, that happens later.) Comments? regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Bugtraq: Having Fun With PostgreSQL
On Jun 19, 2007, at 1:27 PM, Josh Berkus wrote: I know there are issues with using ident sameuser via TCP, but what about for filesystem socket connections? Not all OSes support ident ... Solaris and OpenBSD for two, don't, because they see ident as insecure. What about the unix domain socket, though? AFAIK that doesn't rely on ident but some other method... -- Jim Nasby [EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Re: [HACKERS] Bugtraq: Having Fun With PostgreSQL
Jim Nasby [EMAIL PROTECTED] writes: On Jun 19, 2007, at 1:27 PM, Josh Berkus wrote: Not all OSes support ident ... Solaris and OpenBSD for two, don't, because they see ident as insecure. What about the unix domain socket, though? AFAIK that doesn't rely on ident but some other method... On OpenBSD we use getpeereid() for unix sockets, and there are equivalent things on some other Unixen. We could never go over to ident as the standard default, though, because not all platforms have these sorts of features (if indeed they have unix sockets at all ...); and in any case it's not very secure for TCP. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] In California for a few days
FYI, I am visiting California until Wednesday, June 27, to attend a funeral. I will be reading email, but not as frequently. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate