Re: [HACKERS] 7.0.3 reproduceable serious select error

2001-01-19 Thread Hannu Krosing

Rob van Nieuwkerk wrote:
 
 
 The problem query returns the *right* answer now !
 Turning LANG=en_US back on gives the old buggy behaviour.
 
 I know very little about this LANG, LOCALE etc. stuff.
 But for our application it is very important to support "weird" characters
 like " ..." etc. for names.  Basically we need all letter symbols
 in ISO-8859-1 (Latin 1).  A quick experiment shows that without the
 LANG setting I can still insert  select strings containing these
 symbols.
 
 Do I lose any postgresql functionality by just getting rid of the LANG
 environment variable ?  Will I be able to use full ISO-8859-1 in table
 fields without problems ?

You should, except that upper() and lower() will not give you right
answers 
for char128 and order by orders in ASCII (i.e. character code value)
order.

I would suggest that instead you keep the en_US locale (or some nl
locale 
if you need the rigt ordering from DB), but do _not_ create a b-tree 
(the default) index on your text fields. If you need the index for 
exact lookup (field=const) an hash idex will do fine and i'm pretty sure 
that LIKE optimisations will not use them to spoil searches ;).

---
Hannu



Re: [HACKERS] 7.0.3 reproduceable serious select error

2001-01-19 Thread Hannu Krosing

Bruce Momjian wrote:
 
 Added to TODO:
 
 * Add SHOW command to see locale

I'd rather like it to be a function, as version() is, because SHOW
commands may not 
play nice with other interfaces than psql. 
(and it can first be included in ./contrib if it's too late for a
"feature" grin)

Just make sure we will not confict with SQL standard in naming the
function.

---
Hannu



Re: [HACKERS] 7.0.3 reproduceable serious select error

2001-01-18 Thread Tom Lane

 I don't think I'm running postmaster in a non-ASCII locale.
 At least I did not explicitly do anything to accomplish it.

Did you have LANG, LOCALE, or any of the LC_xxx family of
environment variables set when you started the postmaster?
Some Linux distros tend to set those in system profile scripts ...

regards, tom lane



Re: [HACKERS] 7.0.3 reproduceable serious select error

2001-01-18 Thread Rob van Nieuwkerk


Tom Lane wrote:

  I don't think I'm running postmaster in a non-ASCII locale.
  At least I did not explicitly do anything to accomplish it.
 
 Did you have LANG, LOCALE, or any of the LC_xxx family of
 environment variables set when you started the postmaster?
 Some Linux distros tend to set those in system profile scripts ...

Checking whith ps and looking in /proc reveiled that postmaster indeed
had LANG set to "en_US" in its environment.  I disabled the system script
that makes this setting, restarted postgres/postmaster and reran my tests.

The problem query returns the *right* answer now !
Turning LANG=en_US back on gives the old buggy behaviour.

I know very little about this LANG, LOCALE etc. stuff.
But for our application it is very important to support "weird" characters
like "éõåÊ ..." etc. for names.  Basically we need all letter symbols
in ISO-8859-1 (Latin 1).  A quick experiment shows that without the
LANG setting I can still insert  select strings containing these
symbols.

Do I lose any postgresql functionality by just getting rid of the LANG
environment variable ?  Will I be able to use full ISO-8859-1 in table
fields without problems ?

Please tell if you want me to do any other tests !

greetings,
Rob van Nieuwkerk



Re: [HACKERS] 7.0.3 reproduceable serious select error

2001-01-18 Thread Trond Eivind Glomsrød

Rob van Nieuwkerk [EMAIL PROTECTED] writes:

 I know very little about this LANG, LOCALE etc. stuff.
 But for our application it is very important to support "weird" characters
 like " ..." etc. for names.  Basically we need all letter symbols
 in ISO-8859-1 (Latin 1). 

en_US is latin1 - this is what distinguishes it from POSIX/C.

-- 
Trond Eivind Glomsrd
Red Hat, Inc.



Re: [HACKERS] 7.0.3 reproduceable serious select error

2001-01-18 Thread Tom Lane

Rob van Nieuwkerk [EMAIL PROTECTED] writes:
 Checking whith ps and looking in /proc reveiled that postmaster indeed
 had LANG set to "en_US" in its environment.  I disabled the system script
 that makes this setting, restarted postgres/postmaster and reran my tests.

 The problem query returns the *right* answer now !
 Turning LANG=en_US back on gives the old buggy behaviour.

Caution: you can't just change the locale willy-nilly, because doing so
invalidates the sort ordering of btree indexes.  An index built under
one sort order is effectively corrupt under another.  I recommend that
you dumpall, then initdb under the desired LANG setting, then reload,
and be careful always to start the postmaster under that same setting
henceforth.

(BTW, 7.1 prevents this type of index screwup by locking down the
database's locale at initdb time --- the ONLY way to change sort order
in 7.1 is to initdb with the right locale environment variables.  But in
7.0 you gotta be careful about keeping the locale consistent.)

 I know very little about this LANG, LOCALE etc. stuff.
 But for our application it is very important to support "weird" characters
 like "éõåÊ ..." etc. for names.  Basically we need all
 letter symbols in ISO-8859-1 (Latin 1).

As long as you are not expecting things to sort in any particular order,
it really doesn't matter what locale you run Postgres in.  If you do
care about sort order of characters that aren't bog-standard USASCII,
then you may have a problem.  But you can store 'em in any case.

regards, tom lane



Re: [HACKERS] 7.0.3 reproduceable serious select error

2001-01-18 Thread Barry Lind

I meant to ask this the last time this came up on the list, but now is a
good time.  Given what Tom describes below as the behavior in 7.1
(initdb stores the locale info), how do you determine what locale a
database is running in in 7.1 after initdb?  Is there some file to look
at?  Is there some sql statement that can be used to select the setting
from the DB?

thanks,
--Barry


Tom Lane wrote:
 
 Rob van Nieuwkerk [EMAIL PROTECTED] writes:
  Checking whith ps and looking in /proc reveiled that postmaster indeed
  had LANG set to "en_US" in its environment.  I disabled the system script
  that makes this setting, restarted postgres/postmaster and reran my tests.
 
  The problem query returns the *right* answer now !
  Turning LANG=en_US back on gives the old buggy behaviour.
 
 Caution: you can't just change the locale willy-nilly, because doing so
 invalidates the sort ordering of btree indexes.  An index built under
 one sort order is effectively corrupt under another.  I recommend that
 you dumpall, then initdb under the desired LANG setting, then reload,
 and be careful always to start the postmaster under that same setting
 henceforth.
 
 (BTW, 7.1 prevents this type of index screwup by locking down the
 database's locale at initdb time --- the ONLY way to change sort order
 in 7.1 is to initdb with the right locale environment variables.  But in
 7.0 you gotta be careful about keeping the locale consistent.)
 
  I know very little about this LANG, LOCALE etc. stuff.
  But for our application it is very important to support "weird" characters
  like "éõåÊ ..." etc. for names.  Basically we need all
  letter symbols in ISO-8859-1 (Latin 1).
 
 As long as you are not expecting things to sort in any particular order,
 it really doesn't matter what locale you run Postgres in.  If you do
 care about sort order of characters that aren't bog-standard USASCII,
 then you may have a problem.  But you can store 'em in any case.
 
 regards, tom lane



Re: [HACKERS] 7.0.3 reproduceable serious select error

2001-01-18 Thread Tom Lane

Barry Lind [EMAIL PROTECTED] writes:
 I meant to ask this the last time this came up on the list, but now is a
 good time.  Given what Tom describes below as the behavior in 7.1
 (initdb stores the locale info), how do you determine what locale a
 database is running in in 7.1 after initdb?

Hm.  There probably ought to be an inquiry function or SHOW variable
for that, but at the moment there's not.  Offhand I can't think of any
direct way except to paw through the pg_control file looking for the
locale name (at least it's stored there in ASCII ;-)).

regards, tom lane



Re: [HACKERS] 7.0.3 reproduceable serious select error

2001-01-18 Thread Bruce Momjian

Added to TODO:

* Add SHOW command to see locale

 Barry Lind [EMAIL PROTECTED] writes:
  I meant to ask this the last time this came up on the list, but now is a
  good time.  Given what Tom describes below as the behavior in 7.1
  (initdb stores the locale info), how do you determine what locale a
  database is running in in 7.1 after initdb?
 
 Hm.  There probably ought to be an inquiry function or SHOW variable
 for that, but at the moment there's not.  Offhand I can't think of any
 direct way except to paw through the pg_control file looking for the
 locale name (at least it's stored there in ASCII ;-)).
 
   regards, tom lane
 


-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026