Re: [HACKERS] UTF8 or Unicode

2005-02-25 Thread Bruce Momjian
Peter Eisentraut wrote:
> On Friday, 25 February 2005 at 16:26, Bruce Momjian wrote:
> > OK, but what about latin1?
>
> The following character set names are specified in the SQL standard and
> therefore somewhat non-negotiable:
>
> SQL_CHARACTER
> GRAPHIC_IRV
> LATIN1
> ISO8BIT
> UTF16
> UTF8
>

Re: [HACKERS] UTF8 or Unicode

2005-02-25 Thread Peter Eisentraut
On Friday, 25 February 2005 at 16:26, Bruce Momjian wrote:
> OK, but what about latin1?
The following character set names are specified in the SQL standard and therefore somewhat non-negotiable:
SQL_CHARACTER
GRAPHIC_IRV
LATIN1
ISO8BIT
UTF16
UTF8
UCS2
SQL_TEXT
SQL_IDENTIFIER
So we have to use LA

Re: [HACKERS] UTF8 or Unicode

2005-02-25 Thread Tom Lane
Bruce Momjian writes:
> Peter Eisentraut wrote:
>> I think this is what we should do:
>>
>> UNICODE => UTF8
>> ALT => WIN866
>> WIN => WIN1251
>> TCVN => WIN1258
> OK, but what about latin1?
I think LATIN1 is fine as-is. It's a reasonably popular name for the character set, and despite Tatsuo'
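The renames proposed in the quoted message amount to a small alias table. A minimal sketch in Python, assuming a hypothetical helper named `canonical_encoding_name`; this is an illustration only, not PostgreSQL's lookup code (which, per the thread, lives in src/backend/utils/mb/encnames.c). The "UTF-8" entry is taken from the "unicode = utf8 = utf-8" synonyms mentioned elsewhere in the thread.

```python
# Hypothetical alias table: legacy spelling -> proposed canonical name.
# The four renames come from the quoted proposal; "UTF-8" is a synonym
# mentioned elsewhere in the thread. Nothing here is PostgreSQL source code.
LEGACY_ALIASES = {
    "UNICODE": "UTF8",
    "UTF-8": "UTF8",
    "ALT": "WIN866",
    "WIN": "WIN1251",
    "TCVN": "WIN1258",
}


def canonical_encoding_name(name: str) -> str:
    """Map a user-supplied encoding name to its canonical spelling.

    Lookup is case-insensitive; names without an alias (e.g. LATIN1)
    are returned upper-cased and otherwise unchanged.
    """
    key = name.strip().upper()
    return LEGACY_ALIASES.get(key, key)


if __name__ == "__main__":
    for given in ("unicode", "utf-8", "WIN", "latin1"):
        print(given, "=>", canonical_encoding_name(given))
```

Keeping the old spellings as lookup keys rather than deleting them is what lets a client that still asks for UNICODE keep working.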

Re: [HACKERS] UTF8 or Unicode

2005-02-25 Thread Bruce Momjian
Peter Eisentraut wrote:
> On Friday, 25 February 2005 at 05:51, Bruce Momjian wrote:
> > so I see what he is saying. We are not consistent in favoring the
> > official names vs. the common names.
> >
> > I will work on a patch that people can review and test.
>
> I think this is what we should do:

Re: [HACKERS] UTF8 or Unicode

2005-02-25 Thread Peter Eisentraut
On Friday, 25 February 2005 at 05:51, Bruce Momjian wrote:
> so I see what he is saying. We are not consistent in favoring the
> official names vs. the common names.
>
> I will work on a patch that people can review and test.
I think this is what we should do:
UNICODE => UTF8
ALT => WIN866
WIN =>

Re: [HACKERS] UTF8 or Unicode

2005-02-25 Thread Karel Zak
On Thu, 2005-02-24 at 23:51 -0500, Bruce Momjian wrote:
> Tatsuo Ishii wrote:
> > I do not object the changing UNICODE->UTF-8, but all these discussions
> > sound a little bit funny to me.
> >
> > If you want to blame UNICODE, you should blame LATIN1 etc. as
> > well. LATIN1(ISO-8859-1) is actuall

Re: [HACKERS] UTF8 or Unicode

2005-02-24 Thread Peter Eisentraut
Bruce Momjian wrote:
> We are not consistent in favoring the
> official names vs. the common names.
The problem is rather that there are too many standards and conventions to choose from.
--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: [HACKERS] UTF8 or Unicode

2005-02-24 Thread Bruce Momjian
Tatsuo Ishii wrote:
> I do not object the changing UNICODE->UTF-8, but all these discussions
> sound a little bit funny to me.
>
> If you want to blame UNICODE, you should blame LATIN1 etc. as
> well. LATIN1(ISO-8859-1) is actually a character set name, not an
> encoding name. ISO-8859-1 can be en

Re: [HACKERS] UTF8 or Unicode

2005-02-22 Thread Tatsuo Ishii
yte stream. But it can be encoded in 7-bit too. So when we refer to LATIN1(ISO-8859-1), it's not clear if it's encoded in 7/8-bit.
--
Tatsuo Ishii
From: Bruce Momjian
Subject: Re: [HACKERS] UTF8 or Unicode
Date: Mon, 21 Feb 2005 22:08:25 -0500 (EST)
Message-ID: <[EMAIL PROTECTED]>

Re: [HACKERS] UTF8 or Unicode

2005-02-21 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian writes:
> > I think we just need to _favor_ UTF8.
>
> I agree.
>
> > The question is where are we
> > favoring Unicode rather than UTF8?
>
> It's the canonical name of the encoding, both in the code and the docs.
>
> regression=# create database e encoding 'utf-

Re: [HACKERS] UTF8 or Unicode

2005-02-18 Thread Tom Lane
Bruce Momjian writes:
> I think we just need to _favor_ UTF8.
I agree.
> The question is where are we
> favoring Unicode rather than UTF8?
It's the canonical name of the encoding, both in the code and the docs.
regression=# create database e encoding 'utf-8';
CREATE DATABASE
regression=# \l

Re: [HACKERS] UTF8 or Unicode

2005-02-18 Thread Bruce Momjian
Dave Page wrote:
> Karel Zak wrote:
>
> >> Yes, I think we should fix it and remove UNICODE and WIN encoding names
> >> from PG code.
> >
> > The JDBC driver asks for a UNICODE client encoding before it knows the
> > server version it is talking to. How do you avoid breaking this?
>
> So does pg

Re: [HACKERS] UTF8 or Unicode

2005-02-18 Thread Dave Page
-----Original Message-----
From: [EMAIL PROTECTED] on behalf of Oliver Jowett
Sent: Fri 2/18/2005 11:27 AM
To: Karel Zak
Cc: List pgsql-hackers
Subject: Re: [HACKERS] UTF8 or Unicode
Karel Zak wrote:
>> Yes, I think we should fix it and remove UNICODE and WIN encoding names
>>

Re: [HACKERS] UTF8 or Unicode

2005-02-18 Thread Oliver Jowett
Karel Zak wrote:
> On Sat, 2005-02-19 at 00:27 +1300, Oliver Jowett wrote:
> > Karel Zak wrote:
> > > Yes, I think we should fix it and remove UNICODE and WIN encoding names from PG code.
> > The JDBC driver asks for a UNICODE client encoding before it knows the server version it is talking to. How do you avoid

Re: [HACKERS] UTF8 or Unicode

2005-02-18 Thread Christopher Kings-Lynne
> Add to 8.1 release notes: encoding names 'UNICODE' and 'WIN' are deprecated and it will removed in next release. Please, use correct names "UTF-8" and "WIN1215". 8.2: remove it. OK?
Why on earth remove it? Just leave it in as an alias to UTF8
Chris

Re: [HACKERS] UTF8 or Unicode

2005-02-18 Thread Karel Zak
On Sat, 2005-02-19 at 00:27 +1300, Oliver Jowett wrote:
> Karel Zak wrote:
>
> > Yes, I think we should fix it and remove UNICODE and WIN encoding names
> > from PG code.
>
> The JDBC driver asks for a UNICODE client encoding before it knows the
> server version it is talking to. How do you avoi

Re: [HACKERS] UTF8 or Unicode

2005-02-18 Thread Oliver Jowett
Karel Zak wrote:
> Yes, I think we should fix it and remove UNICODE and WIN encoding names from PG code.
The JDBC driver asks for a UNICODE client encoding before it knows the server version it is talking to. How do you avoid breaking this?
-O

Re: [HACKERS] UTF8 or Unicode

2005-02-18 Thread Karel Zak
On Tue, 2005-02-15 at 14:33 +0100, Peter Eisentraut wrote:
> On Tuesday, 15 February 2005 at 10:22, Karel Zak wrote:
> > in PG: unicode = utf8 = utf-8
> >
> > Our internal routines in src/backend/utils/mb/encnames.c accept all
> > synonyms. The "official" internal PG name for UTF-8 is "UNICODE" :-(

Re: [HACKERS] UTF8 or Unicode

2005-02-16 Thread Agent M
On Feb 14, 2005, at 9:27 PM, Abhijit Menon-Sen wrote:
>> I know UTF8 is a type of unicode but do we need to rename anything from Unicode to UTF8?
> I don't know. I'll go through the documentation to see if I can find anything that needs changing.
It's not the documentation that is wrong. Specifying the

Re: [HACKERS] UTF8 or Unicode

2005-02-15 Thread Peter Eisentraut
On Tuesday, 15 February 2005 at 10:22, Karel Zak wrote:
> in PG: unicode = utf8 = utf-8
>
> Our internal routines in src/backend/utils/mb/encnames.c accept all
> synonyms. The "official" internal PG name for UTF-8 is "UNICODE" :-(
I think in the SQL standard the official name is UTF8. If someone w

Re: [HACKERS] UTF8 or Unicode

2005-02-15 Thread Karel Zak
On Mon, 2005-02-14 at 22:05 -0500, Bruce Momjian wrote:
> Abhijit Menon-Sen wrote:
> > At 2005-02-14 21:14:54 -0500, pgman@candle.pha.pa.us wrote:
> > >
> > > Should our multi-byte encoding be referred to as UTF8 or Unicode?
> >
> > The *encoding* should certainly be referred to as UTF-8. Unicode

Re: [HACKERS] UTF8 or Unicode

2005-02-14 Thread Bruce Momjian
Abhijit Menon-Sen wrote:
> At 2005-02-14 21:14:54 -0500, pgman@candle.pha.pa.us wrote:
> >
> > Should our multi-byte encoding be referred to as UTF8 or Unicode?
>
> The *encoding* should certainly be referred to as UTF-8. Unicode is a
> character set, not an encoding; Unicode characters may be enc

Re: [HACKERS] UTF8 or Unicode

2005-02-14 Thread Abhijit Menon-Sen
At 2005-02-14 21:14:54 -0500, pgman@candle.pha.pa.us wrote:
>
> Should our multi-byte encoding be referred to as UTF8 or Unicode?
The *encoding* should certainly be referred to as UTF-8. Unicode is a character set, not an encoding; Unicode characters may be encoded with UTF-8, among other things.
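The distinction drawn in this message can be made concrete with a short Python sketch (the choice of the character é is an assumption for illustration). One Unicode code point stays the same, while the bytes stored for it depend on which encoding is applied:

```python
# One abstract character: U+00E9 (LATIN SMALL LETTER E WITH ACUTE).
ch = "\u00e9"

# The code point belongs to the character set (Unicode) ...
code_point = ord(ch)

# ... while the bytes belong to whichever encoding is chosen.
utf8_bytes = ch.encode("utf-8")
utf16_bytes = ch.encode("utf-16-be")
latin1_bytes = ch.encode("latin-1")

print(f"code point: U+{code_point:04X}")  # U+00E9
print("UTF-8:   ", utf8_bytes.hex())      # c3a9
print("UTF-16BE:", utf16_bytes.hex())     # 00e9
print("Latin-1: ", latin1_bytes.hex())    # e9
```

The same character yields three different byte sequences, which is why a database encoding setting needs to name the encoding (UTF-8) rather than the character set (Unicode).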

[HACKERS] UTF8 or Unicode

2005-02-14 Thread Bruce Momjian
Should our multi-byte encoding be referred to as UTF8 or Unicode? I know UTF8 is a type of unicode but do we need to rename anything from Unicode to UTF8?
Someone asked me via private email.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us