Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-04-24 Thread Magnus Hagander
That is pretty much where we are ;-) I think we're fine for 8.0.x with this, because if you actually need UTF-8 (and can live with sorting broken, no upper/lower etc), you can do it using a manual initdb. For 8.1, I think the ICU approach looks a lot more promising than trying to do "on the fly co

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-04-24 Thread John Hansen
> "John Hansen" <[EMAIL PROTECTED]> writes: > > Look at t

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-04-24 Thread Tom Lane
"John Hansen" <[EMAIL PROTECTED]> writes: > Look at the upper/lower I sent to the list, they should be able to > replace upper/lower for the utf8 encoding (and works independent of > locale).. I was under the impression we couldn't use these, precisely because they weren't locale-aware. ("It

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-04-24 Thread John Hansen
> John Hansen wrote: > > Look at the upper/lower I sent to the list, the

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-04-24 Thread John Hansen
> > Where are we on this? As far as I can tell, we never dis

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-04-24 Thread Bruce Momjian
> > > Where are we on this? As far as I can tell, we never disabled UTF8 on > > Win32 in our code. The only thing we

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-04-24 Thread Bruce Momjian
Where are we on this? As far as I can tell, we never disabled UTF8 on Win32 in our code. The only thing we did do was to disable UTF8 in pginstaller. See this FAQ item: http://pginstaller.projects.postgresql.org/faq/FAQ_windows.html#2.6 Is the current setup OK? Should we allow UTF8 o

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-03-15 Thread Bruce Momjian
Tatsuo Ishii wrote: > I do understand the problem, but don't understand the decision you > guys made. The fact that UPPER/LOWER and some other functions do not > work in win32 is surely a problem for some languages, but not a > problem for others. For example, Japanese (and probably Chinese and

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-03-14 Thread Bruce Momjian
Peter Eisentraut wrote: > > o Disallow encodings like UTF8 which PostgreSQL supports > > but the operating system does not (already disallowed by > > pginstaller) > > I think the warning that initdb shouts out is already enough for this. > I don't think we want to dis

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-02-25 Thread Tom Lane
"John Hansen" <[EMAIL PROTECTED]> writes: > Right, so for the sample SQL I sent earlier, the result would be the same as > the input? > That's hardly a working upper/lower [ shrug... ] It works per the locale definition, which is that only 7-bit-ASCII a-z/A-Z get converted. The bottom line
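The behaviour described above, where only 7-bit ASCII a-z/A-Z are converted, can be modelled in a few lines. This is a minimal sketch and not PostgreSQL's implementation: the helper name c_locale_upper is hypothetical, and it assumes every byte is treated as a single-byte character, as reported for mbstowcs on HPUX 10.20 in this thread.

```python
def c_locale_upper(data: bytes) -> bytes:
    """Upper-case only ASCII a-z; leave every other byte untouched.

    Hypothetical model of C-locale case mapping, not PostgreSQL code.
    """
    return bytes(b - 32 if 0x61 <= b <= 0x7A else b for b in data)


# UTF-8 encodes each of ae/o-slash/a-ring as two bytes >= 0x80,
# so none of them fall in the a-z range and they pass through unchanged.
sample = "abc æøå".encode("utf-8")
result = c_locale_upper(sample)
print(result.decode("utf-8"))
```

Under these assumptions the ASCII letters are upper-cased while the multibyte characters come back as they went in, which matches the "result would be the same as the input" observation for a string such as 'æøå'.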

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-02-25 Thread John Hansen
> On HPUX 10.20, mbstowcs seems to treat all byte values as > single-byte characters in C locale, so my sample-of-one says > that it works everywhere ;-). Right, so for the sample SQL I sent earlier, the result would be the same as the input? That's hardly a working upper/lower If a charac

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-02-25 Thread Tom Lane
"John Hansen" <[EMAIL PROTECTED]> writes: >> "It fails on my machine" should not be read as "it doesn't >> work for anyone". >> It all depends on how your local mbstowcs() works. > Ok,... Do you have an example of a system on which it works? On HPUX 10.20, mbstowcs seems to treat all byte values

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-02-25 Thread John Hansen
> > select upper('æøå'); > > ERROR: invalid multibyte character for locale > > HINT: The server's LC_CTYPE locale is probably > incompatible with the database encoding. > > > Consequently it seems that it does not work. > > "It fails on my machine" should not be read as "it doesn't > work for

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-02-25 Thread Tom Lane
"John Hansen" <[EMAIL PROTECTED]> writes: >> Sure it does. It's just that the defined behavior of the C >> locale is often useless in practice. > select upper('æøå'); > ERROR: invalid multibyte character for locale > HINT: The server's LC_CTYPE locale is probably incompatible with the > datab

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-02-25 Thread John Hansen
> John Hansen wrote: > > currently, upper/lower does not work with 2+ byte unicode > characters, > > on any OS under the C locale. > > Sure it does. It's just that the defined behavior of the C > locale is often useless in practice. select upper('æøå'); ERROR: invalid multibyte character for

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-02-24 Thread Peter Eisentraut
John Hansen wrote: > currently, upper/lower does not work with 2+ byte unicode characters, > on any OS under the C locale. Sure it does. It's just that the defined behavior of the C locale is often useless in practice.

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-02-24 Thread Peter Eisentraut
Bruce Momjian wrote: > Oh, sorry. So there is no ordering in Unicode? That statement is meaningless. Unicode is a character set, not a collation order. > No wonder some > languages can't use Unicode effectively. That has nothing to do with it. > o Disallow encodings like UTF8 which

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-02-24 Thread John Hansen
> currently, upper/lower does not work with 2+ byte unicode > characters, on any OS under the C locale. Btw,... There are only 15 cases in the utf8 repertoire that depend on locale, these are the only cases where pg should report: ERROR: invalid multibyte character for locale HINT: The serv

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-02-24 Thread Tom Lane
"John Hansen" <[EMAIL PROTECTED]> writes: > Right,. So if that's fixed, then UTF8 will work only on windows? No. > (currently, upper/lower does not work with 2+ byte unicode characters, on any > OS) This information is obsolete. regards, tom lane --

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-02-24 Thread John Hansen
K, let me rephrase: currently, upper/lower does not work with 2+ byte unicode characters, on any OS under the C locale. ... John

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-02-24 Thread John Hansen
> To fix UTF8, the data needs to be converted to > UTF16 and then > the Win32 wcscoll() can be used, and perhaps other functions > like towupper(). However, UTF8 already works with normal > locales but provides no ordering. Right,. So if that's fixed, then
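The conversion step quoted above (UTF8 to UTF16, then a wide-character collation call) can be sketched as follows. This is an illustration under assumptions, not the proposed backend patch: the helper names are hypothetical, and Python's locale.strcoll stands in for the Win32 wcscoll() call, since it also compares wide strings according to the current LC_COLLATE setting.

```python
import locale


def utf8_to_utf16_units(data: bytes) -> list:
    """Decode UTF-8 bytes and return the UTF-16 code units as integers."""
    raw = data.decode("utf-8").encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def collate_utf8(a: bytes, b: bytes) -> int:
    """Compare two UTF-8 strings with the current locale's collation.

    Returns a negative, zero, or positive integer, like wcscoll().
    """
    return locale.strcoll(a.decode("utf-8"), b.decode("utf-8"))


# One code unit for a BMP character, a surrogate pair above U+FFFF.
print([hex(u) for u in utf8_to_utf16_units("æ".encode("utf-8"))])
print([hex(u) for u in utf8_to_utf16_units("\U0001F600".encode("utf-8"))])
print(collate_utf8(b"a", b"b"))
```

The ordering that collate_utf8 produces depends on whichever locale is active, which is the point made in the thread: the encoding conversion alone provides no ordering, the locale does.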

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-02-24 Thread Bruce Momjian
Magnus Hagander wrote: > The installer does not permit it, but initdb lets you do anything you > want - I think that's where we are. If you know what you're doing, you > can use it by manually initdbing. > > There is no such thing as "unicode locale". Unicode (UTF8) is an > encoding, that has to b

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-02-22 Thread Magnus Hagander
The installer does not permit it, but initdb lets you do anything you want - I think that's where we are. If you know what you're doing, you can use it by manually initdbing. There is no such thing as "unicode locale". Unicode (UTF8) is an encoding, that has to be paired with a locale. I assume yo

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-02-21 Thread Bruce Momjian
Magnus, where are we on this? Seems we should allow unicode encoding and just not unicode locale in pginstaller. Also, Unicode is changing to UTF-8 in 8.1. --- Tatsuo Ishii wrote: > I do understand the problem, but don't

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-01-02 Thread Tatsuo Ishii
> >I do understand the problem, but don't understand the decision you > >guys made. The fact that UPPER/LOWER and some other functions do not > >work in win32 is surely a problem for some languages, but not a > >problem for others. For example, Japanese (and probably Chinese and > >Korean) does

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-01-02 Thread Tatsuo Ishii
> "Magnus Hagander" <[EMAIL PROTECTED]> writes: > > I didn't consider the C locale. Do you know for a fact that it works > > there on win32 as well, or is that an assumption? > > It should work. The only use of strcoll() in the backend is in > varstr_cmp which uses strncmp() instead for C locale.

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-01-02 Thread Tom Lane
"Magnus Hagander" <[EMAIL PROTECTED]> writes: > I didn't consider the C locale. Do you know for a fact that it works > there on win32 as well, or is that an assumption? It should work. The only use of strcoll() in the backend is in varstr_cmp which uses strncmp() instead for C locale. Lack of wo
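The point that the C locale path compares bytes rather than calling strcoll() can be illustrated with a short sketch. The function below is a hypothetical stand-in for that byte-wise comparison, not the actual varstr_cmp code; it relies on the fact that UTF-8 byte order matches Unicode code point order.

```python
def bytewise_cmp(a: bytes, b: bytes) -> int:
    """Return -1, 0, or 1 from a plain byte-by-byte comparison."""
    return (a > b) - (a < b)


words = ["zebra", "æble", "apple", "Zulu"]

# Sorting on the UTF-8 bytes gives a stable, locale-independent order:
# upper-case ASCII first, then lower-case ASCII, then non-ASCII characters.
by_bytes = sorted(words, key=lambda w: w.encode("utf-8"))
print(by_bytes)
print(bytewise_cmp("apple".encode("utf-8"), "æble".encode("utf-8")))
```

The resulting order is consistent and usable for indexes, but it is not a linguistic collation: "Zulu" sorts before "apple" and "æble" sorts after "zebra".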

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-01-02 Thread Magnus Hagander
>I do understand the problem, but don't understand the decision you >guys made. The fact that UPPER/LOWER and some other functions do not >work in win32 is surely a problem for some languages, but not a >problem for others. For example, Japanese (and probably Chinese and >Korean) does not have

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-01-02 Thread Tatsuo Ishii
I do understand the problem, but don't understand the decision you guys made. The fact that UPPER/LOWER and some other functions do not work in win32 is surely a problem for some languages, but not a problem for others. For example, Japanese (and probably Chinese and Korean) does not have a con