Re: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Mark Leisher
Dan> FYI I have reported this brain-dead mapping problem to Unicode Consortium but never got an answer. Well, they are not public society in a way they charge for the membership to say anything. One of the reasons so many Japanese love to hate Unicode... This kind

Re: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Dan Kogai
On 2002.02.02, at 00:32, Jarkko Hietaniemi wrote: So far as I see Linux iconv is ascii-preservative while ICU's is Unicode-strict. From Perl's point of view ASCII preservative should be default. Why? I have already answered in the previous mail (Subject: More on Unicode Mappings,

Re: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Dan Kogai
On 2002.02.01, at 23:57, Mark Leisher wrote: Dan> FYI I have reported this brain-dead mapping problem to Unicode Consortium but never got an answer. Well, they are not public society in a way they charge for the membership to say anything. One of the reasons so

Re: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Dan Kogai
On 2002.02.02, at 00:37, Nick Ing-Simmons wrote: Oh, yes. This is the problem of the original Unicode 2.x map; It is not ASCII preservative. I have posted this problem to perl- [EMAIL PROTECTED] when I first released Jcode. Several discussions later, I made Jcode so that it preserves

Re: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Mark Leisher
Dan> As I addressed to [EMAIL PROTECTED], Yet another problems that ftp://ftp.unicode.org/Public/MAPPINGS/EASTASIA/ is now gone so I don't have a practical way to check the mapping. I want the mapping back! *Sigh* Readme.txt, which *is* in the

Re: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Mark Leisher
Nick> ftp://ftp.unicode.org/Public/MAPPINGS/OBSOLETE ***HOWEVER*** if you use the NON-INTUITIVE URL: http://ftp.unicode.org/Public/MAPPINGS/ one gets redirected to http://www.unicode.org/Public/MAPPINGS/ which is as you state. Quite right. The

RE: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Marco Cimarosti
Dan Kogai wrote: As I addressed to [EMAIL PROTECTED], Yet another problems that ftp://ftp.unicode.org/Public/MAPPINGS/EASTASIA/ is now gone so I don't have a practical way to check the mapping. I want the mapping back! The Unicode site is a little bit labyrinthic, sometimes. The web

Re: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Mark Davis
- From: "Dan Kogai" [EMAIL PROTECTED] To: "Nick Ing-Simmons" [EMAIL PROTECTED] Cc: "Nick Ing-Simmons" [EMAIL PROTECTED]; "SADAHIRO Tomoyuki" [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED] Sent: Friday, February 01, 2002 07:46 Subject:

Re: RE: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Rick McGowan
Marco wrote... The web version of the data seems more up to date than the ftp site. They are the same files, available through different protocols! Rick

RE: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Yves Arrouye
As part of the mystery of CJK encodings I notice that IBM's ICU's uconv and SuSE6.4 linux iconv differ as to the UTF-8 representation of table.euc. Both converters will round-trip with themselves and give a byte-exact copy of table.euc. Weirdly they differ in how they map '\' and '~' in

Re: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Dan Kogai
I'll answer this one. On 2002.02.02, at 03:28, Yves Arrouye wrote: That is understandable if they use different tables. The question is which one is the right EUC-JP, and which one do users want? ICU, as well as iconv, could have two tables with the different mappings. The question then

Re: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Mark Davis (jtcsv)
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; SADAHIRO Tomoyuki [EMAIL PROTECTED] Sent: Friday, February 01, 2002 10:21 Subject: Re: ICU's uconv vs Linux iconv and UTF-8 Mark Davis [EMAIL PROTECTED] writes: ICU's pedantic form The goal for ICU is to be charset neutral

RE: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Yves Arrouye
It is definitely a problem to try to interpret what any given label is supposed to be. The problem is that MIME labels and others are ambiguous, and are interpreted different ways on different systems. Still, in the meantime it does make sense to have EUC-JP associated to the most common