Re: [HACKERS] month abreviation

2007-06-22 Thread Jaime Casanova

On 6/22/07, Euler Taveira de Oliveira [EMAIL PROTECTED] wrote:

Jaime Casanova wrote:

 Note the month abbreviation (mons?). Is this intentional?

This notation has been used since the code was written (~7 years ago) [1].

[1]
http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/adt/datetime.c?rev=1.42;content-type=text%2Fx-cvsweb-markup



Hmm... so it has been wrong for 7 years now... ;)
OK, accepting that as an abbreviation for months, what controls it?
Why do you get years, days and mons? I mean, why is this one
abbreviated when the other two are not?

--
regards,
Jaime Casanova

Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning.
  Richard Cook

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[HACKERS] Documentation of contrib modules

2007-06-22 Thread Albert Cervera Areny
Hi,
I think I'll have some spare time, and I'd like to add documentation for the
contrib modules, as discussed in [1]. It was suggested then that only some of
the contrib modules should be in the main DocBook documentation. IMHO all of
them (except start-scripts, probably) should be there so they get more
exposure. I'm sure many PostgreSQL users don't know those contrib modules exist.
I'd like to add a new part after Internals called Contrib Modules.
Also, one question that comes to mind when looking at the current README files
is whether compilation & installation instructions should be there, or whether
a simple generic psql -d dbname -f module.sql in the part introduction would be
enough. What do you think?

[1] http://archives.postgresql.org/pgsql-hackers/2007-01/msg01443.php




Re: What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)

2007-06-22 Thread Zdenek Kotala

Heikki Linnakangas wrote:
Since we're discussing upgrades, let me summarize the discussions we had 
over dinner in Ottawa for the benefit of all:




Thanks for the summary.


As before, someone just needs to step up and do it.


I'm now working on a proposal. I hope that it will be ready soon.

Zdenek



Re: [HACKERS] EOL characters and multibyte encodings

2007-06-22 Thread William ZHANG

Joe Conway [EMAIL PROTECTED]
 Tom Lane wrote:
 Joe Conway [EMAIL PROTECTED] writes:
 My first thought on fixing this issue was to simply replace all 
 instances of '\r' in pg_proc.prosrc with '\n' prior to sending it to the 
 R parser. As far as I know, any instances of '\r' embedded in a 
 syntactically valid R statement must be escaped (i.e. literally the 
 characters \ and r), so that should not be a problem. But I am 
 concerned about how this potentially plays against multibyte characters. 
 Is it safe to do this, or do I need to use a mb-aware replace algorithm?

 It's safe, because you'll be dealing with prosrc inside the backend,
 therefore using a backend-legal encoding, and those don't have any ASCII
 aliasing problems (all bytes of an MB character must have high bit set).

The low byte of some characters in BIG5, GBK and GB18030 may be less than
0x7F, so it does not have the high bit set. Fortunately, these encodings
don't use 0x0D or 0x0A (CR and LF) there.

Regards,
William ZHANG

 Great -- I wasn't sure about that.

 





Re: [HACKERS] Worries about delayed-commit semantics

2007-06-22 Thread Simon Riggs
On Thu, 2007-06-21 at 18:15 -0400, Tom Lane wrote:
 I've been reflecting a bit about whether the notion of deferred fsync
 for transaction commits is really safe.  The proposed patch tries to
 ensure that no consequences of a committed transaction can reach disk
 before the commit WAL record is fsync'd, but ISTM there are potential
 holes in what it's doing.  In particular the path that concerns me is
 
 (1) transaction A commits with deferred fsync;
 
 (2) transaction B observes some effect of A (eg, a committed-good tuple);
 
 (3) transaction B makes a change that is contingent on the observation.
 
 If B's changes were to reach disk in advance of A's commit record, we'd
 have a risk of logical inconsistency.  

B's changes cannot reach disk before B's commit record. That is the
existing WAL-before-data rule implemented by the buffer manager.

If B can see A's changes, then A has written a commit record to the log
that is definitely before B's commit record. So B's commit will also
commit A's changes to WAL when it flushes at EOX. So whether A is a
guaranteed transaction or not, B can always rely on those changes.
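
The ordering argument above can be sketched with a toy model. This is only an illustration under simplifying assumptions, not PostgreSQL's WAL code: the class ToyWAL and its methods are hypothetical names, records get increasing LSNs, and a flush always covers a prefix of the log.

```python
from dataclasses import dataclass, field


@dataclass
class ToyWAL:
    """Toy write-ahead log: records get increasing LSNs; a flush covers a prefix."""
    records: list = field(default_factory=list)  # list index + 1 == LSN
    flushed_upto: int = 0                        # highest LSN known to be on disk

    def append(self, record: str) -> int:
        self.records.append(record)
        return len(self.records)

    def flush(self, lsn: int) -> None:
        # Flushing up to an LSN makes every earlier record durable as well.
        self.flushed_upto = max(self.flushed_upto, lsn)

    def commit(self, xact: str, synchronous: bool) -> int:
        lsn = self.append(f"COMMIT {xact}")
        if synchronous:
            self.flush(lsn)
        return lsn

    def survivors_after_crash(self) -> list:
        # Only the flushed prefix of the log is visible to recovery.
        return self.records[: self.flushed_upto]


wal = ToyWAL()
wal.append("UPDATE by A")
lsn_a = wal.commit("A", synchronous=False)    # deferred fsync
crash_before_b = wal.survivors_after_crash()  # A's commit is not durable yet

wal.append("UPDATE by B (depends on A)")
lsn_b = wal.commit("B", synchronous=True)     # flushes everything up to lsn_b
crash_after_b = wal.survivors_after_crash()
```

In this model A's commit can be lost on its own, but there is no crash point at which B's commit record survives and A's does not, because A's record has the lower LSN.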

I agree this feels unsafe when you first think about it, and was the
reason for me taking months before publishing the idea.

 The patch is doing what it can
 to prevent *direct* effects of A from reaching disk before the commit
 record does, but it doesn't (and I think cannot) extend this to indirect
 effects perpetrated by other transactions.  An example of the sort of
 risk I'm worried about is a REINDEX omitting an index entry for a tuple
 that it sees as committed dead by A.
 
 Now this may be safe anyway, but it requires analysis that I don't
 recall anyone having put forward.  The cases that I can see are:
 
 1. Ordinary WAL-logged change in a shared buffer page.  The change will
 not be allowed to reach disk before the associated WAL record does, and
 that WAL record must follow A's commit, so we're safe.
 
 2. Non-WAL-logged change in a temp table.  Could reach disk in advance
 of A's commit, but we don't care since temp table contents don't survive
 crashes anyway.
 
 3. Non-WAL-logged change made via one of the paths we have introduced
 to avoid WAL overhead for bulk updates.  In these cases it's entirely
 possible for the data to reach disk before A's commit, because B will
 fsync it down to disk without any sort of interlock, as soon as it
 finishes the bulk update.  However, I believe it's the case that all
 these paths are designed to write data that no other transaction can see
 until after B commits.  That commit must follow A's in the WAL log,
 so until it has reached disk, the contents of the bulk-updated file
 are unimportant after a crash.
 
 So I think it's probably all OK, but this is a sufficiently long chain
 of reasoning that it had better be checked over by multiple people and
 recorded as part of the design implications of the patch.  Does anyone
 think any of this is wrong, or too fragile to survive future code
 changes?  Are there cases I've missed?

I've done the analysis, but perhaps I should finish the docs now to aid
with review of the patch on the points you make.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Teodor Sigaev

3) ALTER FULLTEXT CONFIGURATION cfgname ADD/ALTER/DROP MAPPING
done

Why not rename ALTER FULLTEXT CONFIGURATION -- ALTER TEXT SEARCH
CONFIGURATION here too ?


It's renamed too.


most languages can be written using UNICODE charset and UTF-8 encoding,
so neither charset nor encoding can be used to determine language.

yes


 --- how many languages use the ISO8859-1 locale?

 ISO8859-1 is an encoding, not a locale.

I meant that if we use the encoding name (for example PG_LATIN1) we can't
distinguish between languages that use that encoding (for example Italian and
Finnish, among others), but with locale names it's possible: it_IT.ISO8859-1,
fi_FI.ISO8859-1


--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Teodor Sigaev

The recommendation I was making was to use the language name, not the
encoding name, in the user-visible configuration.

How does it determine the language of the db automatically?

--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] Worries about delayed-commit semantics

2007-06-22 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 I've been reflecting a bit about whether the notion of deferred fsync
 for transaction commits is really safe.  The proposed patch tries to
 ensure that no consequences of a committed transaction can reach disk
 before the commit WAL record is fsync'd, but ISTM there are potential
 holes in what it's doing.  In particular the path that concerns me is

 (1) transaction A commits with deferred fsync;

 (2) transaction B observes some effect of A (eg, a committed-good tuple);

 (3) transaction B makes a change that is contingent on the observation.

 If B's changes were to reach disk in advance of A's commit record, we'd
 have a risk of logical inconsistency.  The patch is doing what it can
 to prevent *direct* effects of A from reaching disk before the commit
 record does, but it doesn't (and I think cannot) extend this to indirect
 effects perpetrated by other transactions.  An example of the sort of
 risk I'm worried about is a REINDEX omitting an index entry for a tuple
 that it sees as committed dead by A.

 Now this may be safe anyway, but it requires analysis that I don't
 recall anyone having put forward.  The cases that I can see are:

I think Simon did try to put all this in writing when he first proposed it.
It's worth going through again with the actual implementation to be sure all
the same guarantees hold.

 So I think it's probably all OK, but this is a sufficiently long chain
 of reasoning that it had better be checked over by multiple people and
 recorded as part of the design implications of the patch.  Does anyone
 think any of this is wrong, or too fragile to survive future code
 changes?  Are there cases I've missed?

I think the logic you describe is not quite as subtle as you make it out to
be. Certainly it's a bit surprising at first but it all boils down to the
basic idea of how transactions and WAL records work: We never allow any other
transactions to see the effects of our transaction until the commit record is
fsynced to WAL. 

So now we're poking a hole in that but we certainly have to ensure that any
transactions that do see the results of our deferred commit themselves don't
record any visible effects until both their commit and ours hit WAL. The
essential point in Simon's approach that guarantees that is that when you
fsync you fsync all work that came before you. So committing a transaction
also commits all deferred commits that you might depend on.

 BTW: I really dislike the name transaction guarantee for the feature;
 it sounds like marketing-speak, not to mention overpromising what we
 can deliver.  Postgres can't guarantee anything in the face of
 untrustworthy disk hardware, for instance.  I'd much rather use names
 derived from deferred commit or delayed commit or some such.

Well from an implementation point of view we're delaying or deferring the
commit. But from a user's point of view the important thing for them to
realize is that a committed record could be lost.

Perhaps we should just not come up with a new name and reuse the fsync
variable. That way users of old installs which have fsync=off silently get
this new behaviour. I'm not sure I like that idea since I use fsync=off to run
cpu overhead tests here. But from a user's point of view it's probably the
right thing. This is really what fsync=off should always have been doing.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [HACKERS] EOL characters and multibyte encodings

2007-06-22 Thread Andrew Dunstan



William ZHANG wrote:


It's safe, because you'll be dealing with prosrc inside the backend,
therefore using a backend-legal encoding, and those don't have any ASCII
aliasing problems (all bytes of an MB character must have high bit set).
  


The low byte of some characters in BIG5, GBK and GB18030 may be less than
0x7F, so it does not have the high bit set. Fortunately, these encodings
don't use 0x0D or 0x0A (CR and LF) there.

  
  


Those are client-only encodings, precisely for this sort of reason, and 
thus not relevant to the present discussion. As Tom points out above, 
when the language handler gets the code it will be encoded in the 
relevant backend encoding which can't be any of these.
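
Both halves of this exchange can be checked with a short sketch. It is a rough illustration only: it uses Python's bundled codecs rather than PostgreSQL's conversion tables, it samples just the CJK Unified Ideographs block, and the function names are made up for the example.

```python
def ascii_range_trail_bytes(encoding: str) -> set:
    """Collect bytes below 0x80 that occur inside multibyte characters."""
    found = set()
    for codepoint in range(0x4E00, 0x9FA6):  # CJK Unified Ideographs
        try:
            encoded = chr(codepoint).encode(encoding)
        except UnicodeEncodeError:
            continue  # character not representable in this encoding
        if len(encoded) > 1:
            found.update(b for b in encoded if b < 0x80)
    return found


def replace_cr(source: bytes) -> bytes:
    """Byte-wise replacement of every CR with LF, with no decoding step."""
    return source.replace(b"\r", b"\n")


# Client-only encodings mentioned in the thread, by their Python codec names.
client_only = {enc: ascii_range_trail_bytes(enc) for enc in ("big5", "gbk", "gb18030")}
utf8 = ascii_range_trail_bytes("utf-8")

# A UTF-8 function body with CR and CRLF line endings and two CJK characters.
text = "x <- '\u4e02\u529f'\r\ny <- 1\r"
normalized = replace_cr(text.encode("utf-8")).decode("utf-8")
```

For the sampled range, the three client-only encodings do produce ASCII-range bytes inside multibyte characters, but never 0x0A or 0x0D, while UTF-8 produces none at all, so the byte-wise replacement leaves the multibyte characters intact.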


(Side note: the restriction by the R parser to Unix-only line endings is 
a dreadful piece of design. As Jon Postel rightly said, the best rule is 
"Be liberal in what you accept and conservative in what you send." Just 
about every parser for every language has been able to handle this, so 
why must R be different?)


cheers

andrew



Re: [HACKERS] Worries about delayed-commit semantics

2007-06-22 Thread Joshua D. Drake

Tom Lane wrote:

I've been reflecting a bit about whether the notion of deferred fsync
for transaction commits is really safe.  The proposed patch tries to
ensure that no consequences of a committed transaction can reach disk
before the commit WAL record is fsync'd, but ISTM there are potential
holes in what it's doing.  In particular the path that concerns me is



BTW: I really dislike the name transaction guarantee for the feature;
it sounds like marketing-speak, not to mention overpromising what we
can deliver.  Postgres can't guarantee anything in the face of


Ahh but it can. :). PostgreSQL can guarantee that if the hardware is 
not faulty and the OS does what it is supposed to do... etc..


And yes, it is marketing but life is marketing, getting girlfriends is 
marketing. What matters is that once the marketing is over, you can 
stand up to the hype.



untrustworthy disk hardware, for instance.  I'd much rather use names
derived from deferred commit or delayed commit or some such.


Honestly, I prefer these names as well as it seems directly related 
versus transaction guarantee which sounds to be more like us saying, if 
we turn it off our transactions are bogus.


Joshua D. Drake



regards, tom lane





--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




Re: [HACKERS] GUC time unit spelling a bit inconsistent

2007-06-22 Thread Bruce Momjian
Michael Paesold wrote:
  Btw.: I'm currently at DebConf in Edinburgh.  On Scottish motorway 
  signage, 5m means five miles.  Even the Americans do that better.  So, 
  no, you can't have m for minutes. ;)
 
 Even with the ;) here and the context, the last sentence sounds to me 
 quite arrogant. Most people here have tried to bring arguments and 
 reasoning... you put it off with irrelevant anecdotes in the wrong context.

It is hard to argue with your analysis here.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] GUC time unit spelling a bit inconsistent

2007-06-22 Thread Bruce Momjian
Michael Paesold wrote:
 Marko Kreen wrote:
  Considering Postgres will never use either meter or mile
  in settings, I don't consider your argument valid.
  
  I don't see the value of having units globally unique (literally).
  It's enough if they are unique in the context of postgresql.conf.
  
  Thus +1 of having additional shortcuts Tom suggested.
  Also +1 for having them case-insensitive.
 
 Agreed. Although I suggest perhaps to not press for m as minutes, 
 because it really is ambiguous for months or minutes, esp. in a 
 context like log_rotation_age.
 
 Please let's have the unambiguous abbreviations. Please let's make it all 
 case-insensitive. After all this discussion, what about a 
 straightforward vote? Bruce, we had those before, no?

Right.  No one dictates what goes into PostgreSQL and I think there are
clearly enough people who want improvement in this area, including
perhaps having 'm' meaning minutes and going with case insensitivity.
Please post a patch that we can discuss/review.  If it is small we can
try to get it into 8.3.
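
As a starting point for discussion, here is a minimal sketch of one of the options raised in this thread: unit names matched case-insensitively, with a bare "m" rejected as ambiguous. It is not the GUC parser; parse_duration and the two tables are hypothetical names, and whether "m" should instead mean minutes is exactly the open question.

```python
import re

# Multipliers to seconds for unambiguous, lower-cased time-unit spellings.
UNIT_SECONDS = {"ms": 0.001, "s": 1, "min": 60, "h": 3600, "d": 86400}
AMBIGUOUS = {"m"}  # could be read as minutes, months, or even metres/miles


def parse_duration(setting: str) -> float:
    """Return the duration in seconds for a value such as '5min' or '2 H'."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]+)\s*", setting)
    if match is None:
        raise ValueError(f"not a number followed by a unit: {setting!r}")
    number, unit = int(match.group(1)), match.group(2).lower()
    if unit in AMBIGUOUS:
        raise ValueError(f"ambiguous unit {unit!r}; spell out 'min'")
    if unit not in UNIT_SECONDS:
        raise ValueError(f"unknown unit {unit!r}")
    return number * UNIT_SECONDS[unit]
```

With this sketch, parse_duration("5min") and parse_duration("5MIN") both succeed, while parse_duration("5m") raises an error instead of silently picking one meaning.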

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] GUC time unit spelling a bit inconsistent

2007-06-22 Thread Bruce Momjian
Peter Eisentraut wrote:
 Am Donnerstag, 21. Juni 2007 15:12 schrieb Andrew Dunstan:
  You don't seem to have any understanding that the units should be
  interpreted in context.
 
 You are right.  I definitely have an understanding that units must be 
 interpretable without context.  And that clearly works for the most part.

Consider even if we are clear that min is minutes, it could be
chronological minutes or radial degree minutes, so yea, the context has
to be considered.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Worries about delayed-commit semantics

2007-06-22 Thread Gregory Stark
Joshua D. Drake [EMAIL PROTECTED] writes:

 Tom Lane wrote:

 untrustworthy disk hardware, for instance.  I'd much rather use names
 derived from deferred commit or delayed commit or some such.

 Honestly, I prefer these names as well as it seems directly related versus
 transaction guarantee which sounds to be more like us saying, if we turn it
 off our transactions are bogus.

Hm, another possibility: synchronous_commit = off

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Bruce Momjian
Teodor Sigaev wrote:
  The recommendation I was making was to use the language name, not the
  encoding name, in the user-visible configuration.

 How does it determine the language of the db automatically?

I don't think we are going to do language selection automatically ---
the user is going to have to set tsearch_conf_name.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Worries about delayed-commit semantics

2007-06-22 Thread Simon Riggs
On Fri, 2007-06-22 at 14:29 +0100, Gregory Stark wrote:
 Joshua D. Drake [EMAIL PROTECTED] writes:
 
  Tom Lane wrote:
 
  untrustworthy disk hardware, for instance.  I'd much rather use names
  derived from deferred commit or delayed commit or some such.
 
  Honestly, I prefer these names as well as it seems directly related versus
  transaction guarantee which sounds to be more like us saying, if we turn it
  off our transactions are bogus.

That was the intention..., but name change accepted.

 Hm, another possibility: synchronous_commit = off

Ooo, I like that. Any other takers?

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] Worries about delayed-commit semantics

2007-06-22 Thread Joshua D. Drake

Simon Riggs wrote:

On Fri, 2007-06-22 at 14:29 +0100, Gregory Stark wrote:

Joshua D. Drake [EMAIL PROTECTED] writes:


Tom Lane wrote:


untrustworthy disk hardware, for instance.  I'd much rather use names
derived from deferred commit or delayed commit or some such.

Honestly, I prefer these names as well as it seems directly related versus
transaction guarantee which sounds to be more like us saying, if we turn it off
our transactions are bogus.


That was the intention..., but name change accepted.


Hm, another possibility: synchronous_commit = off


Ooo, I like that. Any other takers?


I like synchronous_commit = off, it even has a little girlfriend 
getting spin while being accurate :)


Joshua D. Drake




--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Teodor Sigaev

I don't think we are going to do language selection automatically ---
the user is going to have to set tsearch_conf_name.


Are you suggesting that we remove a long-lived feature of tsearch? In that
case we don't need the cfglocale (or cfglanguage, as Tom suggested) and
cfgdefault columns in pg_ts_cfg at all. Just set tsearch_conf_name.

--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] Worries about delayed-commit semantics

2007-06-22 Thread PFC


So now we're poking a hole in that but we certainly have to ensure that any
transactions that do see the results of our deferred commit themselves don't
record any visible effects until both their commit and ours hit WAL. The
essential point in Simon's approach that guarantees that is that when you
fsync you fsync all work that came before you. So committing a transaction
also commits all deferred commits that you might depend on.


BTW: I really dislike the name transaction guarantee for the feature;
it sounds like marketing-speak, not to mention overpromising what we
can deliver.  Postgres can't guarantee anything in the face of
untrustworthy disk hardware, for instance.  I'd much rather use names
derived from deferred commit or delayed commit or some such.


Well from an implementation point of view we're delaying or deferring the
commit. But from a user's point of view the important thing for them to
realize is that a committed record could be lost.

Perhaps we should just not come up with a new name and reuse the fsync
variable. That way users of old installs which have fsync=off silently get
this new behaviour. I'm not sure I like that idea since I use fsync=off to run
cpu overhead tests here. But from a user's point of view it's probably the
right thing. This is really what fsync=off should always have been doing.


Say you call them SOFT COMMIT and HARD COMMIT...
HARD COMMIT fsyncs, obviously.
Does SOFT COMMIT fflush() the WAL (so it's postgres-crash-safe) or not ?
(just in case some user C function misbehaves and crashes)

Do we get a config param to set default_commit_mode=hard or soft ?

	By the way, InnoDB has a similar option, innodb_flush_log_at_trx_commit.
However, you cannot set it on a per-transaction basis. On an e-commerce site,
for instance, most transactions are unimportant (i.e. no need to fsync, ACI
only, like incrementing product view counts, adding to a cart, etc.) but some
transactions have to be guaranteed (full ACID), like recording that an order
has been submitted / paid / shipped. With InnoDB you can't choose this per
transaction, so it's all or nothing.






Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 I very much doubt that the different spanishes are any different in the
 stemming rules, so there's no need for es_ES, es_PE, es_AR, es_CL etc;
 but in the case of portuguese I'm not so sure.  Maybe there are other
 examples (like chinese, but I'm not sure how useful is tsearch for
 chinese).

 And the .ISO8859-1 part you don't need at all if you accept that the
 files are UTF8 by design, as Tom proposed.

Also, the problem we're dealing with here is mainly lack of
standardization of the encoding part of locale names.  AFAIK, just about
everybody agrees on es_ES, ru_RU, etc; it's the part that comes
after that (if any) that is not too consistent across platforms.
So I see no problem in distinguishing between pt_PT and pt_BR if it
turns out we have to.  The trick is to not look at any more of the
locale name than that; and if we standardize on "stopword files are
UTF8" then I don't think we need to.

regards, tom lane



Re: [HACKERS] month abreviation

2007-06-22 Thread Bruce Momjian
Jaime Casanova wrote:
 On 6/22/07, Euler Taveira de Oliveira [EMAIL PROTECTED] wrote:
  Jaime Casanova wrote:
 
   Note the month abbreviation (mons?). Is this intentional?
  
  This notation has been used since the code was written (~7 years ago) [1].
 
  [1]
  http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/adt/datetime.c?rev=1.42;content-type=text%2Fx-cvsweb-markup
 
 
 Hmm... so it has been wrong for 7 years now... ;)
 OK, accepting that as an abbreviation for months, what controls it?
 Why do you get years, days and mons? I mean, why is this one
 abbreviated when the other two are not?

I thought there was some standard that required that, but I don't
remember which one.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Worries about delayed-commit semantics

2007-06-22 Thread Bruce Momjian
Joshua D. Drake wrote:
 Bruce Momjian wrote:
  Simon Riggs wrote:
  On Fri, 2007-06-22 at 14:29 +0100, Gregory Stark wrote:
  Joshua D. Drake [EMAIL PROTECTED] writes:
 
  Tom Lane wrote:
 
  untrustworthy disk hardware, for instance.  I'd much rather use names
  derived from deferred commit or delayed commit or some such.
  Honestly, I prefer these names as well as it seems directly related versus
  transaction guarantee which sounds to be more like us saying, if we turn
  it off our transactions are bogus.
  That was the intention..., but name change accepted.
 
  Hm, another possibility: synchronous_commit = off
  Ooo, I like that. Any other takers?
  
  Yea, I like that too but I am now realizing that we are not really
  deferring or delaying the COMMIT command but rather the recovery of
  the commit.  GUC as full_commit_recovery?
 
 recovery is a bad word I think. It is related too closely to failure.

commit_stability?  reliable_commit?

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Worries about delayed-commit semantics

2007-06-22 Thread Joshua D. Drake

Bruce Momjian wrote:

Simon Riggs wrote:

On Fri, 2007-06-22 at 14:29 +0100, Gregory Stark wrote:

Joshua D. Drake [EMAIL PROTECTED] writes:


Tom Lane wrote:


untrustworthy disk hardware, for instance.  I'd much rather use names
derived from deferred commit or delayed commit or some such.

Honestly, I prefer these names as well as it seems directly related versus
transaction guarantee which sounds to be more like us saying, if we turn it off
our transactions are bogus.

That was the intention..., but name change accepted.


Hm, another possibility: synchronous_commit = off

Ooo, I like that. Any other takers?


Yea, I like that too but I am now realizing that we are not really
deferring or delaying the COMMIT command but rather the recovery of
the commit.  GUC as full_commit_recovery?


recovery is a bad word I think. It is related too closely to failure.

Sincerely,

Joshua D. Drake







--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




Re: [HACKERS] Worries about delayed-commit semantics

2007-06-22 Thread Andrew Dunstan



Joshua D. Drake wrote:


I like synchronous_commit = off, it even has a little girlfriend 
getting spin while being accurate :)




In my experience, *_commit = off rarely gets you a girlfriend ...

cheers

andrew



Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Tom Lane
Teodor Sigaev [EMAIL PROTECTED] writes:
 I don't think we are going to do language selection automatically ---
 the user is going to have to set tsearch_conf_name.

 Are you suggesting that we remove a long-lived feature of tsearch? In that
 case we don't need the cfglocale (or cfglanguage, as Tom suggested) and
 cfgdefault columns in pg_ts_cfg at all. Just set tsearch_conf_name.

Is the point here for initdb to be able to establish a sane default
initially?  Seems to me it can guess the language from the first
component of the locale (ru_RU -> russian).
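
A rough sketch of that guess, assuming POSIX-style locale names of the form language_TERRITORY.encoding@modifier. The function names and the LANGUAGE_NAMES table are hypothetical, and platform-specific spellings (for example Windows locale names) are not handled.

```python
import re

# Hypothetical mapping from ISO 639-1 language codes to configuration names.
LANGUAGE_NAMES = {"ru": "russian", "it": "italian", "fi": "finnish",
                  "es": "spanish", "pt": "portuguese"}


def split_locale(locale_name: str):
    """Return (language, territory) and ignore the encoding and modifier parts."""
    base = re.split(r"[.@]", locale_name, maxsplit=1)[0]
    language, _, territory = base.partition("_")
    return language.lower(), territory.upper() or None


def default_config(locale_name: str):
    """Guess a configuration name from the language part alone."""
    language, _territory = split_locale(locale_name)
    return LANGUAGE_NAMES.get(language)  # None when no guess is possible
```

Only the part before the first "." or "@" is inspected, so differences in how platforms spell the encoding do not matter; the territory is still returned in case pt_PT and pt_BR ever need separate handling.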

regards, tom lane



Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Alvaro Herrera
Teodor Sigaev wrote:

  --- how many languages use the ISO8859-1 locale?
  ISO8859-1 is an encoding, not a locale.
 
 I meant that if we use the encoding name (for example PG_LATIN1) we can't 
 distinguish between languages that use that encoding (for example Italian 
 and Finnish, among others), but with locale names it's possible: 
 it_IT.ISO8859-1, fi_FI.ISO8859-1

I don't understand.  Why use it_IT.ISO8859-1?  You just need to know
the language, so "it" is enough.  The _IT part specifies that it's the
italian spoken in Italy.  This may be irrelevant in most cases, but
consider that pt_PT and pt_BR are AFAIK somewhat different languages.

I very much doubt that the different spanishes are any different in the
stemming rules, so there's no need for es_ES, es_PE, es_AR, es_CL etc;
but in the case of portuguese I'm not so sure.  Maybe there are other
examples (like chinese, but I'm not sure how useful is tsearch for
chinese).

And the .ISO8859-1 part you don't need at all if you accept that the
files are UTF8 by design, as Tom proposed.

-- 
Alvaro Herrera  Developer, http://www.PostgreSQL.org/
No one is more enslaved than he who believes himself free without being so (Goethe)



Re: [HACKERS] Worries about delayed-commit semantics

2007-06-22 Thread Richard Huxton

Joshua D. Drake wrote:

Simon Riggs wrote:

On Fri, 2007-06-22 at 14:29 +0100, Gregory Stark wrote:

Joshua D. Drake [EMAIL PROTECTED] writes:


Tom Lane wrote:


untrustworthy disk hardware, for instance.  I'd much rather use names
derived from deferred commit or delayed commit or some such.

Honestly, I prefer these names as well as it seems directly related versus
transaction guarantee which sounds to be more like us saying, if we turn it
off our transactions are bogus.


That was the intention..., but name change accepted.


Hm, another possibility: synchronous_commit = off


Ooo, I like that. Any other takers?


I like synchronous_commit = off, it even has a little girlfriend 
getting spin while being accurate :)


Or perhaps sync_on_commit = off?
Less girlfriend-speak perhaps: no_sync_on_commit = on

--
  Richard Huxton
  Archonet Ltd



Re: [HACKERS] Worries about delayed-commit semantics

2007-06-22 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 On Fri, 2007-06-22 at 14:29 +0100, Gregory Stark wrote:
 Joshua D. Drake [EMAIL PROTECTED] writes:
 Hm, another possibility: synchronous_commit = off

 Ooo, I like that. Any other takers?

OK with me

regards, tom lane



Re: [HACKERS] GUC time unit spelling a bit inconsistent

2007-06-22 Thread Joshua D. Drake

Andrew Sullivan wrote:

On Thu, Jun 21, 2007 at 03:24:51PM +0200, Michael Paesold wrote:
There are valid reasons against 5m as mega-bytes, because here m does 
not refer to a unit, it refers to a quantifier (if that is a reasonable 
English word) of a unit. So it should really be 5mb.


log_rotation_age = 5m
log_rotation_size = 5mb


Except, of course, that 5mb would be understood by those of us who
work in metric and use both bits and bytes as 5 millibits.


At one point I submitted a patch to make units case insensitive; since
submitting that patch I have decided that was a horrible idea. Why can't
we use standard units? Mb, Kb, KB, MB... (I don't know the standard unit
for minutes).
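
A minimal sketch of what case-sensitive unit parsing with a hint could look like. Everything here is an assumption made for illustration: the `UNITS` table, the `parse_setting` name, and the wording of the hint are hypothetical and do not describe what the server actually does.

```python
import re

# Hypothetical unit table. Spelling is case sensitive, so "MB" and "Mb"
# are different units and "mb" is not a unit at all.
UNITS = {
    "kB": "kilobytes", "MB": "megabytes", "GB": "gigabytes",
    "kb": "kilobits", "Mb": "megabits", "Gb": "gigabits",
    "ms": "milliseconds", "s": "seconds", "min": "minutes",
    "h": "hours", "d": "days",
}


def parse_setting(text):
    """Split a value such as "5MB" into (number, unit description)."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]+)\s*", text)
    if match is None:
        raise ValueError(f"cannot parse {text!r}")
    number, unit = int(match.group(1)), match.group(2)
    if unit in UNITS:
        return number, UNITS[unit]
    # Offer a hint listing units that differ from the input only by case.
    near = sorted(u for u in UNITS if u.lower() == unit.lower())
    hint = f' HINT: did you mean {" or ".join(near)}?' if near else ""
    raise ValueError(f"unknown unit {unit!r}.{hint}")
```

Under these assumptions "5m" is rejected outright, because "m" matches no unit even when case is ignored, while "5mb" is rejected with a hint that names both "MB" and "Mb".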


The more I see this going back and forth, the more it seems we should just
do it right the first time and tell everyone else to read:


The fine manual
The spec(s) that define the units.

Joshua D. Drake





 Which
would be an absurd value, but since Postgres had support for time
travel once, who knows what other wonders the developers have come up
with ;-)  (I will note, though, that this B vs b problem really gets
up my nose, especially when I hear people who are ostensibly
designing networks talking about gigabyte ethernet cards.  I would
_like_ such a card, I confess, but to my knowledge the standard
hasn't gotten that far yet.)

Nevertheless, I think that Tom's original suggestion was at least a
HINT, which seems perfectly reasonable to me.  


A




--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




Re: [HACKERS] Worries about delayed-commit semantics

2007-06-22 Thread Bruce Momjian
Simon Riggs wrote:
 On Fri, 2007-06-22 at 14:29 +0100, Gregory Stark wrote:
  Joshua D. Drake [EMAIL PROTECTED] writes:
  
   Tom Lane wrote:
  
   untrustworthy disk hardware, for instance.  I'd much rather use names
   derived from deferred commit or delayed commit or some such.
  
   Honestly, I prefer these names as well as it seems directly related versus
   transaction guarantee which sounds to be more like us saying, if we turn it
   off our transactions are bogus.
 
 That was the intention..., but name change accepted.
 
  Hm, another possibility: synchronous_commit = off
 
 Ooo, I like that. Any other takers?

Yea, I like that too but I am now realizing that we are not really
deferring or delaying the COMMIT command but rather the recovery of
the commit.  GUC as full_commit_recovery?

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Bruce Momjian
Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  I very much doubt that the different spanishes are any different in the
  stemming rules, so there's no need for es_ES, es_PE, es_AR, es_CL etc;
  but in the case of portuguese I'm not so sure.  Maybe there are other
  examples (like chinese, but I'm not sure how useful is tsearch for
  chinese).
 
  And the .ISO8859-1 part you don't need at all if you accept that the
  files are UTF8 by design, as Tom proposed.
 
 Also, the problem we're dealing with here is mainly lack of
 standardization of the encoding part of locale names.  AFAIK, just about
 everybody agrees on es_ES, ru_RU, etc; it's the part that comes
 after that (if any) that is not too consistent across platforms.
 So I see no problem in distinguishing between pt_PT and pt_BR if it
 turns out we have to.  The trick is to not look at any more of the
 locale name than that; and if we standardize on stopword files are
 UTF8 then I don't think we need to.

OK, and the open question is when do we do this default setting.  If we
do it in initdb then we can isolate all the detection there.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Worries about delayed-commit semantics

2007-06-22 Thread PFC

On Fri, 22 Jun 2007 16:43:00 +0200, Bruce Momjian [EMAIL PROTECTED] wrote:


Simon Riggs wrote:

On Fri, 2007-06-22 at 14:29 +0100, Gregory Stark wrote:
 Joshua D. Drake [EMAIL PROTECTED] writes:

  Tom Lane wrote:
 
  untrustworthy disk hardware, for instance.  I'd much rather use names
  derived from deferred commit or delayed commit or some such.
 
  Honestly, I prefer these names as well as it seems directly related versus
  transaction guarantee which sounds to be more like us saying, if we turn it
  off our transactions are bogus.

That was the intention..., but name change accepted.

 Hm, another possibility: synchronous_commit = off

Ooo, I like that. Any other takers?


Yea, I like that too but I am now realizing that we are not really
deferring or delaying the COMMIT command but rather the recovery of
the commit.  GUC as full_commit_recovery?



commit_waits_for_fsync =

	force_yes : makes all commits hard
	yes       : commits are hard unless specified otherwise [default]
	no        : commits are soft unless specified otherwise
	            [should replace fsync=off use case]
	force_no  : makes all commits soft
	            (controller with write cache emulator)


The force_yes and force_no values are mostly for benchmarking purposes, i.e.
once your app is tuned to specify which commits have to be guaranteed (hard)
and which don't (soft), you can then bench it with force_yes and force_no
to see how much you gained, and how much you'd gain by buying a write
cache controller...
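
A minimal sketch of how the four proposed values could combine with a per-transaction request. The function name `commit_waits` and the `transaction_request` parameter are hypothetical; only the four setting values come from the proposal above.

```python
def commit_waits(setting, transaction_request=None):
    """Return True if the commit should wait for fsync (a "hard" commit).

    setting             -- one of "force_yes", "yes", "no", "force_no"
    transaction_request -- True (hard), False (soft) or None (not specified)
    """
    if setting == "force_yes":
        return True    # every commit is hard, requests are ignored
    if setting == "force_no":
        return False   # every commit is soft, requests are ignored
    if setting not in ("yes", "no"):
        raise ValueError(f"unknown setting {setting!r}")
    default = setting == "yes"
    # "yes" and "no" only choose the default; the transaction may override it.
    return default if transaction_request is None else transaction_request
```

For example, with the setting at "no" a transaction that explicitly asks for a hard commit still gets one, whereas "force_no" would make it soft regardless, which is what makes the force values useful for before-and-after benchmarks.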





Re: [HACKERS] Worries about delayed-commit semantics

2007-06-22 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Joshua D. Drake wrote:
 Hm, another possibility: synchronous_commit = off

 Ooo, I like that. Any other takers?

 Yea, I like that too but I am now realizing that we are not really
 deferring or delaying the COMMIT command but rather the recovery of
 the commit.  GUC as full_commit_recovery?
 
 recovery is a bad word I think. It is related too closely to failure.

 commit_stability?  reliable_commit?

What's wrong with synchronous_commit?  It's accurate and simple.

regards, tom lane



Re: [HACKERS] Worries about delayed-commit semantics

2007-06-22 Thread Simon Riggs
On Fri, 2007-06-22 at 10:52 -0400, Bruce Momjian wrote:

 commit_stability?  reliable_commit?

commit_durability?

That then relates it directly to the D in ACID.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Oleg Bartunov

On Fri, 22 Jun 2007, Bruce Momjian wrote:


Tom Lane wrote:

Alvaro Herrera [EMAIL PROTECTED] writes:

I very much doubt that the different spanishes are any different in the
stemming rules, so there's no need for es_ES, es_PE, es_AR, es_CL etc;
but in the case of portuguese I'm not so sure.  Maybe there are other
examples (like chinese, but I'm not sure how useful is tsearch for
chinese).



And the .ISO8859-1 part you don't need at all if you accept that the
files are UTF8 by design, as Tom proposed.


Also, the problem we're dealing with here is mainly lack of
standardization of the encoding part of locale names.  AFAIK, just about
everybody agrees on es_ES, ru_RU, etc; it's the part that comes
after that (if any) that is not too consistent across platforms.
So I see no problem in distinguishing between pt_PT and pt_BR if it
turns out we have to.  The trick is to not look at any more of the
locale name than that; and if we standardize on stopword files are
UTF8 then I don't think we need to.


OK, and the open question is when do we do this default setting.  If we
do it in initdb then we can isolate all the detection there.


We can do that at initdb time, but we still have to decide how to map the
human-readable language name to the lang part of the locale name. Are we
going to hardcode it?

It's not friendly for hosting solutions, where people often have no access
to postgresql.conf, so they need to remember to set tsearch_conf_name.
That could be solved with the 'alter user ... set tsearch_conf_name' command, though.


Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Magnus Hagander
Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
 I very much doubt that the different spanishes are any different in the
 stemming rules, so there's no need for es_ES, es_PE, es_AR, es_CL etc;
 but in the case of portuguese I'm not so sure.  Maybe there are other
 examples (like chinese, but I'm not sure how useful is tsearch for
 chinese).
 
 And the .ISO8859-1 part you don't need at all if you accept that the
 files are UTF8 by design, as Tom proposed.
 
 Also, the problem we're dealing with here is mainly lack of
 standardization of the encoding part of locale names.  AFAIK, just about
 everybody agrees on es_ES, ru_RU, etc; it's the part that comes
 after that (if any) that is not too consistent across platforms.

That may have been true until we started supporting Windows...
Swedish_Sweden.1252 is what I get on my machine, for example. Principle
is the same, but values certainly aren't.

//Magnus




Re: [HACKERS] Worries about delayed-commit semantics

2007-06-22 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Joshua D. Drake wrote:
  Hm, another possibility: synchronous_commit = off
 
  Ooo, I like that. Any other takers?
 
  Yea, I like that too but I am now realizing that we are not really
  deferring or delaying the COMMIT command but rather the recovery of
  the commit.  GUC as full_commit_recovery?
  
  recovery is a bad word I think. It is related too closely to failure.
 
  commit_stability?  reliable_commit?
 
 What's wrong with synchronous_commit?  It's accurate and simple.

That is fine too.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Worries about delayed-commit semantics

2007-06-22 Thread Florian G. Pflug

PFC wrote:

On Fri, 22 Jun 2007 16:43:00 +0200, Bruce Momjian [EMAIL PROTECTED] wrote:

Simon Riggs wrote:

On Fri, 2007-06-22 at 14:29 +0100, Gregory Stark wrote:
 Joshua D. Drake [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  untrustworthy disk hardware, for instance.  I'd much rather use names
  derived from deferred commit or delayed commit or some such.
 
  Honestly, I prefer these names as well as it seems directly related versus
  transaction guarantee which sounds to be more like us saying, if we turn it
  off our transactions are bogus.

That was the intention..., but name change accepted.

 Hm, another possibility: synchronous_commit = off

Ooo, I like that. Any other takers?


Yea, I like that too but I am now realizing that we are not really
deferring or delaying the COMMIT command but rather the recovery of
the commit.  GUC as full_commit_recovery?


commit_waits_for_fsync =

force_yes : makes all commits hard
yes       : commits are hard unless specified otherwise [default]
no        : commits are soft unless specified otherwise
            [should replace fsync=off use case]
force_no  : makes all commits soft
            (controller with write cache emulator)


I think you got the last line backwards - without the fsync() after
a commit, you can't be sure that the data made it into the controller
cache. To be safe you *always* need the fsync() - but it will probably
be much cheaper if your controller doesn't have to actually write to
the disks, but can cache in battery-backed ram instead. Therefore,
if you own such a controller, you probably don't need deferred commits.

BTW, I like synchronous_commit too - but maybe asynchronous_commit
would be even better, with inverted semantics of course.
Then you'd have asynchronous_commit = off as default.




Re: [HACKERS] Worries about delayed-commit semantics

2007-06-22 Thread Michael Glaesemann


On Jun 22, 2007, at 9:23 , Richard Huxton wrote:


Or perhaps sync_on_commit = off?


Or switch it around...

sink_on_commit = on

(sorry for the noise)

Michael Glaesemann
grzm seespotcode net





Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Alvaro Herrera
Magnus Hagander wrote:
 Tom Lane wrote:
  Alvaro Herrera [EMAIL PROTECTED] writes:
  I very much doubt that the different spanishes are any different in the
  stemming rules, so there's no need for es_ES, es_PE, es_AR, es_CL etc;
  but in the case of portuguese I'm not so sure.  Maybe there are other
  examples (like chinese, but I'm not sure how useful is tsearch for
  chinese).
  
  And the .ISO8859-1 part you don't need at all if you accept that the
  files are UTF8 by design, as Tom proposed.
  
  Also, the problem we're dealing with here is mainly lack of
  standardization of the encoding part of locale names.  AFAIK, just about
  everybody agrees on es_ES, ru_RU, etc; it's the part that comes
  after that (if any) that is not too consistent across platforms.
 
 That may have been true until we started supporting Windows...
 Swedish_Sweden.1252 is what I get on my machine, for example. Principle
 is the same, but values certainly aren't.

Well, at least the name is not itself translated, so a mapping table is
not right out of the question.  If they had put a name like
Español_Chile instead of Spanish_Chile we would be in serious
trouble.

-- 
Alvaro Herrera                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Worries about delayed-commit semantics

2007-06-22 Thread Richard Huxton

Bruce Momjian wrote:

Tom Lane wrote:

What's wrong with synchronous_commit?  It's accurate and simple.


That is fine too.


My concern would be that it can be read two ways:
1. When you commit, sync (something or other - unspecified)
2. Synchronise commits (to each other? to something else?)*

It's obvious to people on the -hackers list what we're talking about, 
but is it so clear to a newbie, perhaps non-English speaker?


* I can see people thinking this means something like commit_delay.

--
  Richard Huxton
  Archonet Ltd



Re: [HACKERS] tsearch in core patch

2007-06-22 Thread teodor

 That may have been true until we started supporting Windows...
 Swedish_Sweden.1252 is what I get on my machine, for example. Principle
 is the same, but values certainly aren't.

 Well, at least the name is not itself translated, so a mapping table is
 not right out of the question.  If they had put a name like
 Español_Chile instead of Spanish_Chile we would be in serious
 trouble.
I don't think so; in the opposite case you couldn't type or display it in
order to change the locale :).

So, final proposal:
rename cfglocale to cfglanguages and store in it an array of language names
taken from the first part of the locale names:
russian   '{ru_RU, Russian_Russia}'
spanish   '{es_ES, es_CL, Spanish_Spain, Spanish_Chile}'

Comments?

Are there any obstacles to using GIN indexes in pg_catalog?




Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Bruce Momjian
Michael Glaesemann wrote:
 
 On Jun 22, 2007, at 9:28 , Tom Lane wrote:
 
  Is the point here for initdb to be able to establish a sane default
  initially?  Seems to me it can guess the language from the first
  component of the locale (ru_RU -> russian).
 
 How would this work for initdb with locale C?

Yea, that's a problem.  I am thinking we should just avoid the entire
issue and require it to be set by the user, and throw an error if the
configuration is not set.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Alvaro Herrera
[EMAIL PROTECTED] wrote:

 So, final proposal:
 rename cfglocale to cfglanguages and store in it an array of language names
 taken from the first part of the locale names:
 russian   '{ru_RU, Russian_Russia}'
 spanish   '{es_ES, es_CL, Spanish_Spain, Spanish_Chile}'
 
 Comments?

Why not do it the other way around?
es_ES   spanish
Spanish_Spain   spanish
ru_RU   russian
pt_BR   portuguese_brazil

That way you don't need any funny index.  Or do you need the list of
locales for each language? (but even if you do, you can easily obtain it
by indexing both columns separately using btrees anyway)

-- 
Alvaro Herrera   http://www.PlanetPostgreSQL.org/
I can see support will not be a problem.  10 out of 10.(Simon Wittber)
  (http://archives.postgresql.org/pgsql-general/2004-12/msg00159.php)



Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Tatsuo Ishii
 On Jun 22, 2007, at 9:28 , Tom Lane wrote:
 
  Is the point here for initdb to be able to establish a sane default
  initially?  Seems to me it can guess the language from the first
  component of the locale (ru_RU -> russian).
 
 How would this work for initdb with locale C?

I'm worrying about that too.
--
Tatsuo Ishii
SRA OSS, Inc. Japan



Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Michael Glaesemann


On Jun 22, 2007, at 9:28 , Tom Lane wrote:


Is the point here for initdb to be able to establish a sane default
initially?  Seems to me it can guess the language from the first
component of the locale (ru_RU -> russian).


How would this work for initdb with locale C?

Michael Glaesemann
grzm seespotcode net





Re: [HACKERS] tsearch in core patch

2007-06-22 Thread teodor
 Why not do it the other way around?
 es_ES spanish
 Spanish_Spain spanish
 ru_RU russian
 pt_BR portuguese_brazil

 That way you don't need any funny index.  Or do you need the list of
 locales for each language? (but even if you do, you can easily obtain it
 by indexing both columns separately using btrees anyway)

Yes, that's possible, but it increases the number of identical configurations:
russian_win   Russian_Russia
russian_unix  ru_RU

They don't differ except for the locale name.




Re: [Fwd: Re: [HACKERS] tsearch in core patch]

2007-06-22 Thread Tatsuo Ishii
  How would this work for initdb with locale C?
 
  I'm worrying about that too.
 
 english '{en_GB, en_US, C}'
 
 I suppose that a locale name always has a dot separator, except for the C
 locale --- which is a well-known exception

So we would have to?:

japanese '{ja_JP, C}'

How would we know C -> japanese?

Also I'm wondering how we could handle texts including Japanese and
English. It's very common in Japan.
--
Tatsuo Ishii
SRA OSS, Inc. Japan



Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Tom Lane
Tatsuo Ishii [EMAIL PROTECTED] writes:
 On Jun 22, 2007, at 9:28 , Tom Lane wrote:
 Is the point here for initdb to be able to establish a sane default
 initially?  Seems to me it can guess the language from the first
 component of the locale (ru_RU -> russian).
 
 How would this work for initdb with locale C?

 I'm worrying about that too.

I would be surprised if C locale defaulted to anything except English.
I suppose it would be sensible to add a switch to allow people to select
a different language.  In any case, the only thing initdb would be doing
would be setting up an initial value of a table entry or GUC variable,
so you could always change it yourself later; it may not be worth
sweating too much about this.

regards, tom lane



Re: [HACKERS] Worries about delayed-commit semantics

2007-06-22 Thread Tom Lane
Richard Huxton [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 What's wrong with synchronous_commit?  It's accurate and simple.

 My concern would be that it can be read two ways:
 1. When you commit, sync (something or other - unspecified)
 2. Synchronise commits (to each other? to something else?)*

Well, that's a fair point.  deferred_commit would avoid that objection.

I'm not sure it's real important though --- with practically all of the
postgresql.conf variables, you really need to read the manual to know
exactly what they do.

regards, tom lane



[Fwd: Re: [HACKERS] tsearch in core patch]

2007-06-22 Thread teodor

 How would this work for initdb with locale C?

 I'm worrying about that too.

english '{en_GB, en_US, C}'

I suppose that a locale name always has a dot separator, except for the C
locale --- which is a well-known exception






Re: [HACKERS] Worries about delayed-commit semantics

2007-06-22 Thread Florian G. Pflug

Richard Huxton wrote:

Bruce Momjian wrote:

Tom Lane wrote:

What's wrong with synchronous_commit?  It's accurate and simple.


That is fine too.


My concern would be that it can be read two ways:
1. When you commit, sync (something or other - unspecified)
2. Synchronise commits (to each other? to something else?)*

It's obvious to people on the -hackers list what we're talking about, 
but is it so clear to a newbie, perhaps non-English speaker?


* I can see people thinking this means something like commit_delay.


OTOH, the concept of synchronous vs. asynchronous (function) calls
should be pretty well-known among database programmers and administrators.
And (at least to me), this is really what this is about - the commit
happens asynchronously, at the convenience of the database, and not
the instant that I requested it.

greetings, Florian Pflug




Re: [HACKERS] GUC time unit spelling a bit inconsistent

2007-06-22 Thread Peter Eisentraut
On Friday, 22 June 2007 15:34, Bruce Momjian wrote:
 Consider even if we are clear that min is minutes, it could be
 chronological minutes or radial degree minutes, so yea, the context has
 to be considered.

The correct symbol for an arc minute is ´, so there is no context dependency.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Alvaro Herrera
[EMAIL PROTECTED] wrote:
  Why not do it the other way around?
  es_ES   spanish
  Spanish_Spain   spanish
  ru_RU   russian
  pt_BR   portuguese_brazil
 
  That way you don't need any funny index.  Or do you need the list of
  locales for each language? (but even if you do, you can easily obtain it
  by indexing both columns separately using btrees anyway)
 
 Yes, that's possible, but it increases the number of identical configurations:
 russian_win   Russian_Russia
 russian_unix  ru_RU
 
 They don't differ except for the locale name.

But why do you need them to be different at all?  Just make it

russian Russian_Russia
russian ru_RU

Does that not work for some reason?

What I was really suggesting was having a table mapping locale names
into tsearch languages.  Then the configuration could be made based on
the language, not on the locale name.  So the stopword list is for
russian, regardless of whether the locale is Russian_Russia or ru_RU.
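
A minimal sketch of such a mapping table, using only locale names mentioned in this thread. The names `LOCALE_TO_LANGUAGE`, `tsearch_language` and `locales_for` are hypothetical, and the sketch assumes the encoding part of the locale name can simply be ignored.

```python
# Hypothetical mapping table: locale name (without encoding) -> language.
LOCALE_TO_LANGUAGE = {
    "es_ES": "spanish",
    "es_CL": "spanish",
    "Spanish_Spain": "spanish",
    "Spanish_Chile": "spanish",
    "ru_RU": "russian",
    "Russian_Russia": "russian",
    "pt_BR": "portuguese_brazil",
}


def tsearch_language(locale_name):
    """Look up the language for a locale, ignoring the encoding suffix."""
    # Drop ".ISO8859-1", ".1252" and similar, which vary across platforms.
    base = locale_name.split(".", 1)[0]
    return LOCALE_TO_LANGUAGE.get(base)


def locales_for(language):
    """Reverse lookup: every locale name that maps to the given language."""
    return sorted(loc for loc, lang in LOCALE_TO_LANGUAGE.items()
                  if lang == language)
```

Here both "ru_RU.UTF-8" and "Russian_Russia.1251" resolve to the single language "russian", so one configuration serves both platforms, and the list of locales per language is still recoverable by scanning the table.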

Is this only for the stopword list, or does it also affect selecting a
stemmer?

Note: it's possible that the stopword list is different for brazilian
portuguese than portuguese portuguese, which is why I was suggesting
using a language portuguese_brazil and not just portuguese.  Whereas
you need a single stopword list for all the countries speaking spanish,
which is why you need only one language called spanish.

-- 
Alvaro Herrera                http://www.advogato.org/person/alvherre
A time will come when diligent and prolonged research will bring to light
things that are hidden today (Seneca, 1st century)



[HACKERS] fast stop before database system is ready

2007-06-22 Thread Kevin Grittner
I apologize for not grabbing more information before the evidence was gone,
but I think there may be a vulnerability to database corruption on PITR
recovery if a stop is done with the fast option right after a database logs
"archive recovery complete".  We normally have about 17 seconds between that
and the "database system is ready" message for a particular database.
Someone was watching the log and issued a fast stop about 1.5 seconds after
the "archive recovery complete" message.  When the database came back up,
it was corrupted.  (The first problem message was about a bad sibling
pointer, but the wheels pretty much fell off after that.)  He deleted the
database instance, got a fresh dump, and tried again without stopping the
server at that point, and all is well.
 
The dump used in the problem recovery attempt is now gone.  I hesitate to
report this since my information is so sketchy, but thought you might want
the report anyway.
 
The source and target of this PITR-style copy were both PostgreSQL 8.2.4 on
SuSE Linux.  For more details on the target see my recent posts about the
corrupted database which turned out to be caused by bad hardware and outdated
drivers.
( http://archives.postgresql.org/pgsql-admin/2007-06/msg00151.php )
The failed recovery was on that box, after fixing all known hardware and
driver issues.
 
No assistance needed.
 
-Kevin
 





[HACKERS] Refactoring parser/analyze.c

2007-06-22 Thread Tom Lane
In connection with bug #3403
http://archives.postgresql.org/pgsql-bugs/2007-06/msg00114.php
I've come to the conclusion that we really shouldn't do *any* processing
of utility commands at parse analysis time; they should be left as
raw-grammar output trees until execution.

The key reason for this is that any processing we do that is dependent
on database state might be obsolete by the time of execution, and we
don't have any infrastructure for taking locks or otherwise checking the
up-to-dateness of a utility command tree.  The time delay involved could
be significant in the case of a command that is put into the plan cache
(eg, a statement in a plpgsql function), so this isn't an academic
concern.  I had already foreseen this and delayed the processing of
several utility commands (eg, CREATE INDEX, CREATE RULE) until runtime
as part of the plan-cache patch; but I left CREATE TABLE and ALTER TABLE
alone, mistakenly thinking that their parse analysis work was purely
syntactic transformations and so could be done without reference to the
database state.  As noted in the discussion of bug #3403, this is wrong
with respect to the processing of SERIAL-column sequences.  And there's
also the matter of CREATE TABLE ... LIKE, for which the CVS-HEAD code
says

 * Change the LIKE subtable portion of a CREATE TABLE statement into
 * column definitions which recreate the user defined column portions of
 * subtable.
 *
 * Note: because we do this at parse analysis time, any change in the
 * referenced table between parse analysis and execution won't be reflected
 * into the new table.  Is this OK?

So I'm thinking we should complete the break-up and delay the processing
done by transformCreateStmt and transformAlterTableStmt until execution
of the utility command begins.  In the case of ALTER TABLE we should
take out an exclusive lock on the target table before we even start to
do any of transformAlterTableStmt's work.
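
To illustrate the hazard described above, here is a toy Python model.
It is not PostgreSQL code: Catalog, expand_like, EagerStatement and
DeferredStatement are hypothetical names invented for this sketch, and
a table is reduced to a list of column names.  It only shows why a
LIKE expansion done at analysis time can go stale while a statement
waits in a cache, and why deferring the expansion to execution avoids
that.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Toy stand-in for database state: table name -> list of column names."""
    tables: dict = field(default_factory=dict)


def expand_like(catalog, new_table, like_table):
    """Toy analogue of LIKE expansion: copy the referenced table's
    columns as they exist at the moment of the call."""
    return {"create": new_table, "columns": list(catalog.tables[like_table])}


class EagerStatement:
    """Expands LIKE at 'parse analysis' time and keeps the result."""

    def __init__(self, catalog, new_table, like_table):
        self.catalog = catalog
        self.tree = expand_like(catalog, new_table, like_table)

    def execute(self):
        # Uses the column list captured at analysis time, stale or not.
        self.catalog.tables[self.tree["create"]] = list(self.tree["columns"])


class DeferredStatement:
    """Keeps the raw statement and expands LIKE only at execution."""

    def __init__(self, catalog, new_table, like_table):
        self.catalog = catalog
        self.raw = (new_table, like_table)

    def execute(self):
        # Looks at the catalog as it is now, not as it was at analysis.
        tree = expand_like(self.catalog, *self.raw)
        self.catalog.tables[tree["create"]] = tree["columns"]


def demo():
    catalog = Catalog({"src": ["id", "name"]})
    eager = EagerStatement(catalog, "copy_eager", "src")
    deferred = DeferredStatement(catalog, "copy_deferred", "src")
    # The referenced table changes between analysis and execution,
    # as can happen while a statement sits in a plan cache.
    catalog.tables["src"].append("email")
    eager.execute()
    deferred.execute()
    return catalog.tables["copy_eager"], catalog.tables["copy_deferred"]
```

In this model the eager copy misses the column added after analysis,
while the deferred copy picks it up.  A real fix would also need the
locking mentioned above, which the sketch leaves out.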

I had originally thought that parser/analyze.c was too intertwined to
try to break up, but upon looking more closely I find that there is
actually almost complete separation between the handling of plannable
commands and utility commands.  I would like to refactor analyze.c
into two files to reflect this new understanding of when things happen:

analyze.c: keeps parse_analyze, transformStmt, and the handling of
SELECT/INSERT/UPDATE/DELETE commands, as well as EXPLAIN and DECLARE
CURSOR, which are special cases but more nearly related to plannable
commands than not.

a new file named something like parse_utilcmd.c: transformCreateStmt,
transformAlterTableStmt, transformCreateSchemaStmt, transformIndexStmt,
transformRuleStmt, and subsidiary routines.  These functions would now
be called at the beginning of execution of the respective utility
commands, and not from parse_analyze() at all.

It looks like only release_pstate_resources() and makeFromExpr() are
used in common by these two files; both of them arguably belong
somewhere else anyway (parse_node.c and makefuncs.c respectively).
Also we might need to export transformStmt() from analyze.c; the
utility-command routines currently call that directly, and I'm undecided
whether they can or should go through parse_analyze() instead.

With this refactoring, there will not be any use of the
extras_before/extras_after mechanism within analyze.c, and I'm sorely
tempted to just rip it out, redeclaring parse_analyze() and friends
to return a single Query node instead of a List.  Can anyone foresee
a reason we might still need to return multiple Query nodes from a
single plannable statement?  (Note: rule expansion isn't a reason,
that happens later.)

Comments?

regards, tom lane



Re: [HACKERS] Bugtraq: Having Fun With PostgreSQL

2007-06-22 Thread Jim Nasby

On Jun 19, 2007, at 1:27 PM, Josh Berkus wrote:

I know there's issues with using ident sameuser via TCP, but what
about for filesystem socket connections?


Not all OSes support ident ... Solaris and OpenBSD for two, don't,
because they see ident as insecure.


What about the unix domain socket, though? AFAIK that doesn't rely on  
ident but some other method...

--
Jim Nasby  [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)





Re: [HACKERS] Bugtraq: Having Fun With PostgreSQL

2007-06-22 Thread Tom Lane
Jim Nasby [EMAIL PROTECTED] writes:
 On Jun 19, 2007, at 1:27 PM, Josh Berkus wrote:
 Not all OSes support ident ... Solaris and OpenBSD for two, don't,  
 because they see ident as insecure.

 What about the unix domain socket, though? AFAIK that doesn't rely on  
 ident but some other method...

On OpenBSD we use getpeereid() for unix sockets, and there are
equivalent things on some other Unixen.  We could never go over to
ident as the standard default, though, because not all platforms
have these sorts of features (if indeed they have unix sockets at
all ...); and in any case it's not very secure for TCP.
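
As a rough sketch of what a peer-credential check on a Unix-domain
socket looks like: as far as I know Python's standard library has no
getpeereid() wrapper, so this example substitutes the Linux
SO_PEERCRED socket option, which is one of the equivalent mechanisms
on other platforms.  It is not what the PostgreSQL backend does.
peer_credentials is a hypothetical helper name, and the "iII" layout
(pid, uid, gid) is an assumption about Linux's struct ucred.

```python
import os
import socket
import struct


def peer_credentials(sock):
    """Return (pid, uid, gid) for the peer of a connected Unix-domain
    socket, or None on platforms where SO_PEERCRED is not available."""
    so_peercred = getattr(socket, "SO_PEERCRED", None)
    if so_peercred is None:
        return None
    # Assumed layout of Linux's struct ucred: pid_t, uid_t, gid_t.
    fmt = "iII"
    raw = sock.getsockopt(socket.SOL_SOCKET, so_peercred, struct.calcsize(fmt))
    return struct.unpack(fmt, raw)


if __name__ == "__main__":
    # A socketpair gives two connected Unix-domain sockets in one process,
    # so the reported peer is this process itself.
    left, right = socket.socketpair()
    try:
        creds = peer_credentials(left)
        if creds is None:
            print("SO_PEERCRED not available on this platform")
        else:
            print("peer uid:", creds[1], "our uid:", os.getuid())
    finally:
        left.close()
        right.close()
```

The None fallback mirrors the point made above: the feature is
platform-specific, so anything built on it needs a path for systems
that lack it.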

regards, tom lane



[HACKERS] In California for a few days

2007-06-22 Thread Bruce Momjian
FYI, I am visiting California until Wednesday, June 27, to attend a
funeral.  I will be reading email, but not as frequently.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
