That is pretty much where we are ;-)
I think we're fine for 8.0.x with this, because if you actually need
UTF-8 (and can live with sorting broken, no upper/lower etc), you can do
it using a manual initdb.
For 8.1, I think the ICU approach looks a lot more promising than trying
to do "on the fly co
1 AM
> To: John Hansen
> Cc: Bruce Momjian; Tatsuo Ishii; [EMAIL PROTECTED];
> [EMAIL PROTECTED]; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32
>
> "John Hansen" <[EMAIL PROTECTED]> writes:
> > Look at t
"John Hansen" <[EMAIL PROTECTED]> writes:
> Look at the upper/lower I sent to the list, they should be able to
> replace upper/lower for the utf8 encoding (and works independent of
> locale)..
I was under the impression we couldn't use these, precisely because they
weren't locale-aware. ("It
PM
> To: John Hansen
> Cc: Tatsuo Ishii; [EMAIL PROTECTED]; [EMAIL PROTECTED];
> [EMAIL PROTECTED]; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32
>
> John Hansen wrote:
> > Look at the upper/lower I sent to the list, the
ay, April 24, 2005 10:35 PM
> To: Tatsuo Ishii
> Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED];
> [EMAIL PROTECTED]; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32
>
>
> Where are we on this? As far as I can tell, we never dis
MAIL PROTECTED];
> > [EMAIL PROTECTED]; pgsql-hackers@postgresql.org
> > Subject: Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32
> >
> >
> > Where are we on this? As far as I can tell, we never disabled UTF8 on
> > Win32 in our code. The only thing we
Where are we on this? As far as I can tell, we never disabled UTF8 on
Win32 in our code. The only thing we did do was to disable UTF8 in
pginstaller. See this FAQ item:
http://pginstaller.projects.postgresql.org/faq/FAQ_windows.html#2.6
Is the current setup OK? Should we allow UTF8 o
Tatsuo Ishii wrote:
> I do understand the problem, but don't understand the decision you
> guys made. The fact that UPPER/LOWER and some other functions do not
> work in win32 is surely a problem for some languages, but not a
> problem for others. For example, Japanese (and probably Chinese and
Peter Eisentraut wrote:
> > o Disallow encodings like UTF8 which PostgreSQL supports
> > but the operating system does not (already disallowed by
> > pginstaller)
>
> I think the warning that initdb shouts out is already enough for this.
> I don't think we want to dis
"John Hansen" <[EMAIL PROTECTED]> writes:
> Right, so for the sample SQL I sent earlier, the result would be the same as
> the input?
> That's hardly a working upper/lower
[ shrug... ] It works per the locale definition, which is that only
7-bit-ASCII a-z/A-Z get converted.
The bottom line
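The C-locale rule described above can be modelled outside the backend. This is a small Python sketch, not PostgreSQL code: it relies on `bytes.upper()`, which converts only ASCII a-z, to mimic "only 7-bit-ASCII a-z/A-Z get converted". The function name `c_locale_upper` is made up for illustration.

```python
def c_locale_upper(utf8_bytes: bytes) -> bytes:
    """Model of upper() under the C locale: only ASCII a-z is converted."""
    # bytes.upper() touches ASCII letters only; every byte >= 0x80
    # (i.e. every byte of a multibyte UTF-8 sequence) is left alone.
    return utf8_bytes.upper()


sample = "abc æøå".encode("utf-8")
print(c_locale_upper(sample).decode("utf-8"))  # prints: ABC æøå
```

Under this rule the multibyte characters come back unchanged, which is the "result would be the same as the input" behavior being debated in the thread.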
> On HPUX 10.20, mbstowcs seems to treat all byte values as
> single-byte characters in C locale, so my sample-of-one says
> that it works everywhere ;-).
Right, so for the sample SQL I sent earlier, the result would be the same as
the input?
That's hardly a working upper/lower
If a charac
"John Hansen" <[EMAIL PROTECTED]> writes:
>> "It fails on my machine" should not be read as "it doesn't
>> work for anyone".
>> It all depends on how your local mbstowcs() works.
> Ok,... Do you have an example of a system on which it works?
On HPUX 10.20, mbstowcs seems to treat all byte values
> > select upper('æøå');
> > ERROR: invalid multibyte character for locale
> > HINT: The server's LC_CTYPE locale is probably
> incompatible with the database encoding.
>
> > Consequently it seems that it does not work.
>
> "It fails on my machine" should not be read as "it doesn't
> work for
"John Hansen" <[EMAIL PROTECTED]> writes:
>> Sure it does. It's just that the defined behavior of the C
>> locale is often useless in practice.
> select upper('æøå');
> ERROR: invalid multibyte character for locale
> HINT: The server's LC_CTYPE locale is probably incompatible with the
> datab
> John Hansen wrote:
> > currently, upper/lower does not work with 2+ byte unicode
> characters,
> > on any OS under the C locale.
>
> Sure it does. It's just that the defined behavior of the C
> locale is often useless in practice.
select upper('æøå');
ERROR: invalid multibyte character for
John Hansen wrote:
> currently, upper/lower does not work with 2+ byte unicode characters,
> on any OS under the C locale.
Sure it does. It's just that the defined behavior of the C locale is
often useless in practice.
--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Bruce Momjian wrote:
> Oh, sorry. So there is no ordering in Unicode?
That statement is meaningless. Unicode is a character set, not a
collation order.
> No wonder some
> languages can't use Unicode effectively.
That has nothing to do with it.
> o Disallow encodings like UTF8 which
> currently, upper/lower does not work with 2+ byte unicode
> characters, on any OS under the C locale.
Btw,...
There are only 15 cases in the utf8 repertoire that depend on locale; these
are the only cases where pg should report:
ERROR: invalid multibyte character for locale
HINT: The serv
"John Hansen" <[EMAIL PROTECTED]> writes:
> Right. So if that's fixed, then UTF8 will work only on windows?
No.
> (currently, upper/lower does not work with 2+ byte unicode characters, on any
> OS)
This information is obsolete.
regards, tom lane
--
K, let me rephrase:
currently, upper/lower does not work with 2+ byte unicode characters, on any OS
under the C locale.
... John
> To fix UTF8, the data needs to be converted to
> UTF16 and then
> the Win32 wcscoll() can be used, and perhaps other functions
> like towupper(). However, UTF8 already works with normal
> locales but provides no ordering.
Right. So if that's fixed, then
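The conversion step proposed in the quoted text (UTF8 to UTF16, then a wide-character function) can be sketched as follows. This is a Python model under stated assumptions, not the Win32 code path: Python's codecs stand in for the UTF8-to-UTF16 conversion, and `str.upper()` (which is Unicode-aware and locale-independent) stands in for towupper(). The names `utf8_to_utf16_units` and `wide_upper` are hypothetical.

```python
import struct


def utf8_to_utf16_units(utf8_bytes: bytes) -> list:
    """Return the UTF-16 code units for a UTF-8 byte string."""
    encoded = utf8_bytes.decode("utf-8").encode("utf-16-le")
    # Each UTF-16 code unit is two bytes (little-endian in this encoding).
    return list(struct.unpack("<%dH" % (len(encoded) // 2), encoded))


def wide_upper(utf8_bytes: bytes) -> bytes:
    """Upper-case via the decoded (wide) form, then re-encode as UTF-8."""
    return utf8_bytes.decode("utf-8").upper().encode("utf-8")


data = "æøå".encode("utf-8")
print([hex(u) for u in utf8_to_utf16_units(data)])  # ['0xe6', '0xf8', '0xe5']
print(wide_upper(data).decode("utf-8"))             # ÆØÅ
```

The sketch only shows why working on the wide form helps: once the bytes are decoded, each character is a single unit that a case-mapping or collation function can handle.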
Magnus Hagander wrote:
> The installer does not permit it, but initdb lets you do anything you
> want - I think that's where we are. If you know what you're doing, you
> can use it by manually initdbing.
>
> There is no such thing as "unicode locale". Unicode (UTF8) is an
> encoding, that has to b
The installer does not permit it, but initdb lets you do anything you
want - I think that's where we are. If you know what you're doing, you
can use it by manually initdbing.
There is no such thing as "unicode locale". Unicode (UTF8) is an
encoding, that has to be paired with a locale. I assume yo
Magnus, where are we on this? Seems we should allow unicode encoding
and just not unicode locale in pginstaller.
Also, Unicode is changing to UTF-8 in 8.1.
---
Tatsuo Ishii wrote:
> I do understand the problem, but don't
> >I do understand the problem, but don't understand the decision you
> >guys made. The fact that UPPER/LOWER and some other functions do not
> >work in win32 is surely a problem for some languages, but not a
> >problem for others. For example, Japanese (and probably Chinese and
> >Korean) does
> "Magnus Hagander" <[EMAIL PROTECTED]> writes:
> > I didn't consider the C locale. Do you know for a fact that it works
> > there on win32 as well, or is that an assumption?
>
> It should work. The only use of strcoll() in the backend is in
> varstr_cmp which uses strncmp() instead for C locale.
"Magnus Hagander" <[EMAIL PROTECTED]> writes:
> I didn't consider the C locale. Do you know for a fact that it works
> there on win32 as well, or is that an assumption?
It should work. The only use of strcoll() in the backend is in
varstr_cmp which uses strncmp() instead for C locale. Lack of
wo
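The C-locale comparison described above amounts to a plain byte comparison. A minimal Python model, assuming the strncmp() call covers the shorter of the two lengths and that ties are then broken by length (the tie-break is an assumption; the message only names strncmp()). The name `c_locale_cmp` is hypothetical, and unlike strncmp() this model does not stop at NUL bytes.

```python
from functools import cmp_to_key


def c_locale_cmp(a: bytes, b: bytes) -> int:
    """Compare bytewise over the shorter length, then by length."""
    n = min(len(a), len(b))
    head_a, head_b = a[:n], b[:n]
    if head_a != head_b:
        # Python compares bytes objects by unsigned byte value.
        return -1 if head_a < head_b else 1
    if len(a) != len(b):
        return -1 if len(a) < len(b) else 1
    return 0


words = [w.encode("utf-8") for w in ["zebra", "Åse", "apple"]]
ordered = sorted(words, key=cmp_to_key(c_locale_cmp))
print([w.decode("utf-8") for w in ordered])  # ['apple', 'zebra', 'Åse']
```

Because every byte of a multibyte UTF-8 sequence is 0x80 or higher, such characters sort after all ASCII letters. The order is consistent, so comparisons work, but it is not a linguistically meaningful collation.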
>I do understand the problem, but don't understand the decision you
>guys made. The fact that UPPER/LOWER and some other functions do not
>work in win32 is surely a problem for some languages, but not a
>problem for others. For example, Japanese (and probably Chinese and
>Korean) does not have
I do understand the problem, but don't understand the decision you
guys made. The fact that UPPER/LOWER and some other functions do not
work in win32 is surely a problem for some languages, but not a
problem for others. For example, Japanese (and probably Chinese and
Korean) does not have a con