RE: Introducing the idea of a ROMAN VARIANT SELECTOR (was: Re: Proposing Fraktur)

2002-02-01 Thread Oliver Christ
Hi, Ken wrote: <fraktur>Das sinkende Schiff sandte</fraktur> SOS<fraktur>-Rufe.</fraktur> or conversely, perhaps better: Das sinkende Schiff sandte <antiqua>SOS</antiqua>-Rufe. at the end, it may be more useful to rather markup the semantics than formatting properties, i.e. This is not a

[partly off-topic] A specialized kind of website, a teleutopia webspace.

2002-02-01 Thread William Overington
The recent sending of attachments in this unicode discussion group has led me to think once again about my idea for a specialized type of website. In view of the fact that, although I can do some client side JavaScript, I have no knowledge of server side scripting, I do not know whether my idea

Re: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Mark Leisher
Dan> FYI I have reported this brain-dead mapping problem to Unicode Consortium but never got an answer. Well, they are not public society in a way they charge for the membership to say anything. One of the reasons so many Japanese love to hate Unicode... This kind

Re: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Dan Kogai
On 2002.02.02, at 00:32, Jarkko Hietaniemi wrote: So far as I see Linux iconv is ASCII-preservative while ICU's is Unicode-strict. From Perl's point of view ASCII-preservative should be default. Why? I have already answered in the previous mail (Subject:More on Unicode Mappings,

Re: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Dan Kogai
On 2002.02.01, at 23:57, Mark Leisher wrote: Dan> FYI I have reported this brain-dead mapping problem to Unicode Consortium but never got an answer. Well, they are not public society in a way they charge for the membership to say anything. One of the reasons so

Re: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Dan Kogai
On 2002.02.02, at 00:37, Nick Ing-Simmons wrote: Oh, yes. This is the problem of the original Unicode 2.x map; It is not ASCII preservative. I have posted this problem to perl- [EMAIL PROTECTED] when I first released Jcode. Several discussions later, I made Jcode so that it preserves

Re: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Mark Leisher
Dan> As I addressed to [EMAIL PROTECTED], Yet another problems that ftp://ftp.unicode.org/Public/MAPPINGS/EASTASIA/ is now gone so I don't have a practical way to check the mapping. I want the mapping back! *Sigh* Readme.txt, which *is* in the

Re: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Mark Leisher
Nick> ftp://ftp.unicode.org/Public/MAPPINGS/OBSOLETE ***HOWEVER*** if you use the NON-INTUITIVE URL: http://ftp.unicode.org/Public/MAPPINGS/ one gets redirected to http://www.unicode.org/Public/MAPPINGS/ which is as you state. Quite right. The

RE: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Marco Cimarosti
Dan Kogai wrote: As I addressed to [EMAIL PROTECTED], Yet another problems that ftp://ftp.unicode.org/Public/MAPPINGS/EASTASIA/ is now gone so I don't have a practical way to check the mapping. I want the mapping back! The Unicode site is a little bit labyrinthine, sometimes. The web

Re: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Mark Davis
ICU's pedantic form The goal for ICU is to be charset neutral, and support all of the conversions that are in modern use. There are a large number of variants of character sets; you can use the one you want. See: http://oss.software.ibm.com/icu/charset/index.html Mark - Original Message

Re: GB 18030 question

2002-02-01 Thread Michael Everson
At 11:23 -0800 2002-02-01, Deborah Goldsmith wrote: There is an error on page 10 of the GB 18030-2000 standard, in that the character with code point A3FE maps to U+FFE3 (FULLWIDTH MACRON), but is shown with a glyph that corresponds to U+FF5E (FULLWIDTH TILDE). The position of the character in

Re: RE: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Rick McGowan
Marco wrote... The web version of the data seems more up to date than the ftp site. They are the same files, available through different protocols! Rick

GB 18030 question

2002-02-01 Thread Deborah Goldsmith
There is an error on page 10 of the GB 18030-2000 standard, in that the character with code point A3FE maps to U+FFE3 (FULLWIDTH MACRON), but is shown with a glyph that corresponds to U+FF5E (FULLWIDTH TILDE). The position of the character in its code block would also seem to indicate that

RE: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Yves Arrouye
As part of the mystery of CJK encodings I notice that IBM's ICU's uconv and SuSE6.4 linux iconv differ as to the UTF-8 representation of table.euc. Both converters will round-trip with themselves and give byte exact copy of table.euc. Weirdly they differ in how they map '\' and '~' in

Re: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Dan Kogai
I'll answer this one. On 2002.02.02, at 03:28, Yves Arrouye wrote: That is understandable if they use different tables. The question is which one is the right EUC-JP, and which one do users want? ICU, as well as iconv, could have two tables with the different mappings. The question then

Re: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Mark Davis (jtcsv)
It is definitely a problem to try to interpret what any given label is supposed to be. The problem is that MIME labels and others are ambiguous, and are interpreted different ways on different systems. MIME/IANA is the best registry we have, but there are a number of significant problems: -

RE: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Yves Arrouye
It is definitely a problem to try to interpret what any given label is supposed to be. The problem is that MIME labels and others are ambiguous, and are interpreted different ways on different systems. Still, in the meantime it does make sense to have EUC-JP associated to the most common