RE: FW: A product compatibility question

2001-10-09 Thread Ayers, Mike
From: Asmus Freytag [mailto:[EMAIL PROTECTED]] Sent: Tuesday, October 09, 2001 01:02 PM At 01:43 PM 10/9/01 -0400, Gary P. Grosso wrote: Because of Unicode's Han unification, I was under the impression that to get both Traditional Chinese and Simplified Chinese to really look right

RE: Code points for al-Qaeda

2001-10-03 Thread Ayers, Mike
From: Sampo Syreeni [mailto:[EMAIL PROTECTED]] Sent: Wednesday, October 03, 2001 02:50 AM On Wed, 3 Oct 2001, Marco Cimarosti wrote: Alef Fatha Lam Sukun Qaf Fatha Alef Ain Kasra Dal Fatha Teh-Marbuta (Damma) It strikes me as weird that none of the major news media have gone

RE: Code points for al-Qaeda

2001-10-03 Thread Ayers, Mike
From: John Cowan [mailto:[EMAIL PROTECTED]] Sent: Wednesday, October 03, 2001 09:14 AM Ayers, Mike wrote: I also recall when the U.S. government decided to switch from Wade-Giles to Pinyin romanization of Chinese and muscled the media into playing along. All that confusion

RE: Shape of the US Dollar Sign

2001-10-02 Thread Ayers, Mike
From: G. Adam Stanislav [mailto:[EMAIL PROTECTED]] Sent: Monday, October 01, 2001 12:07 PM Send him a check instead. Every single US check I have ever seen had a dollar sign printed to the left of the field where the numeric amount is to be entered. They all use the same glyph

RE: [OT] Roman numeral arithmetic

2001-10-02 Thread Ayers, Mike
From: Edward Cherlin [mailto:[EMAIL PROTECTED]] Sent: Saturday, September 29, 2001 05:55 PM If we omit the later use of subtractive notation (iv=4, xc=90 etc.), the original Roman numerals are exactly equivalent to the Chinese abacus where each wire holds four beads below the bar

RE: Re: A pun - will this work?

2001-09-26 Thread Ayers, Mike
From: Kenneth Whistler [EMAIL PROTECTED]; Date: 01/09/26 2:23 Go man! Actually, if he's half Jamaican, I think you have to say "Go mon", which is also the Japanese for 50,000, yes? /|/|ike

RE: Re: A pun - will this work?

2001-09-26 Thread Ayers, Mike
From: Kenneth Whistler [mailto:[EMAIL PROTECTED]] Sent: Wednesday, September 26, 2001 11:34 AM Actually, if he's half Jamaican, I think you have to say Go mon, which is also the Japanese for 50,000, yes? No, actually, it is Japanese for 5th question, although that seems to be

RE: UTF-8 UCS-2/UTF-16 conversion for library use

2001-09-24 Thread Ayers, Mike
From: Asmus Freytag [mailto:[EMAIL PROTECTED]] Sent: Sunday, September 23, 2001 02:24 AM The typical situation involves cases where large data sets are cached in memory, for immediate access. Going to UTF-32 reduces the cache effectively by a factor of two, with no comparable
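
The factor of two mentioned above can be checked with a small sketch. It assumes the comparison is UTF-16 against UTF-32 (as the subject line suggests) and that the text stays within the Basic Multilingual Plane; the helper name `encoded_sizes` and the sample string are illustrative only.

```python
def encoded_sizes(text):
    """Return the byte length of `text` in three encoding forms, without a BOM."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


# Hypothetical sample: every character here is in the BMP,
# so each one takes 2 bytes in UTF-16 and 4 bytes in UTF-32.
sample = "Unicode 文字 text"
sizes = encoded_sizes(sample)
print(sizes)
print(sizes["utf-32"] / sizes["utf-16"])
```

For text containing supplementary characters the ratio drops below two, since those already take four bytes in UTF-16.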

RE: UTF-8 UCS-2/UTF-16 conversion for library use

2001-09-24 Thread Ayers, Mike
If you think you have the answer to all the problems, then you don't know all the problems. I tried to make a point, and apparently made it poorly. I will try again. It seems that some people are arguing that UTF-16 is the ideal solution for all computing, and that UTF-8 and

RE: numeric ordering

2001-09-20 Thread Ayers, Mike
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Sent: Thursday, September 20, 2001 12:10 PM Why not have as part of your kanji collation order, the Han digits one through nine, in that order? I believe that would be because they are not ordinarily sorted that way. Why are

RE: PDUTR #26 posted

2001-09-19 Thread Ayers, Mike
From: John Cowan [mailto:[EMAIL PROTECTED]] [EMAIL PROTECTED] scripsit: Oops! One of two Unicode 101 mistakes I made in the same day. Where was my brain? Unicode Ate Your Brain, of course! (See my tutorial at Orlando this year.) Nah, UTF ate it!

RE: PDUTR #26 posted

2001-09-14 Thread Ayers, Mike
From: Marcin 'Qrczak' Kowalczyk [mailto:[EMAIL PROTECTED]] Sent: Friday, September 14, 2001 02:11 AM Thu, 13 Sep 2001 12:52:04 -0700, Asmus Freytag [EMAIL PROTECTED] pisze: UTF-32 does have the same byte order issues as UTF-16, except that byte order is recognizable without a BOM.
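
One way to read the claim that UTF-32 byte order is recognizable without a BOM: a valid code point never exceeds U+10FFFF, so a wrongly ordered 32-bit unit usually decodes to an out-of-range value. The sketch below is only an illustration of that idea; the function name `guess_utf32_byte_order` is hypothetical, and some inputs (for example a lone U+0000 or U+0100) remain undecidable.

```python
import struct

MAX_CODE_POINT = 0x10FFFF


def guess_utf32_byte_order(data):
    """Guess the byte order of BOM-less UTF-32 data: 'big', 'little' or 'unknown'."""
    if len(data) == 0 or len(data) % 4 != 0:
        raise ValueError("UTF-32 data must be a non-empty multiple of 4 bytes")
    count = len(data) // 4
    # Interpret the same bytes both ways and see which reading stays in range.
    big = struct.unpack(f">{count}I", data)
    little = struct.unpack(f"<{count}I", data)
    big_ok = all(cp <= MAX_CODE_POINT for cp in big)
    little_ok = all(cp <= MAX_CODE_POINT for cp in little)
    if big_ok and not little_ok:
        return "big"
    if little_ok and not big_ok:
        return "little"
    return "unknown"


print(guess_utf32_byte_order("héllo".encode("utf-32-be")))
print(guess_utf32_byte_order("héllo".encode("utf-32-le")))
```

The same trick does not work for UTF-16, where both byte orders of most code units are valid values.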

RE: The trouble with text-sorting algorithms

2001-09-10 Thread Ayers, Mike
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Sent: Monday, September 10, 2001 10:40 AM The trouble with algorithms for sorting *text* is that often an algorithm that purportedly sorts TEXT will really be sorting at least partly by PRONUNCIATION. So is it really sorting text?

RE: [OT] o-circumflex

2001-09-07 Thread Ayers, Mike
From: J M Sykes [mailto:[EMAIL PROTECTED]] Sent: Friday, September 07, 2001 07:50 AM The classic example is 'resume' and 'résumé'. These are, by now, two quite distinct words, and the fact that there is no 'established' order is shown I spell both resume and have never been

RE: [OT] o-circumflex

2001-09-07 Thread Ayers, Mike
From: David Gallardo [mailto:[EMAIL PROTECTED]] Sent: Friday, September 07, 2001 10:07 AM As a practical matter, you need to take the diacritics into account when sorting, even in English where they (may or may not) have linguistic significance, otherwise you'll get nondeterministic

RE: japanese xml

2001-08-30 Thread Ayers, Mike
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Sent: Thursday, August 30, 2001 06:06 AM IMO, I correctly replied to Viranga's question and I've no idea what you're talking about below. Let me try to put it another way. What you said may have been technically correct, but it

RE: japanese xml

2001-08-30 Thread Ayers, Mike
I have no idea what kind of stunt you're trying to pull. /|/|ike From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Sent: Thursday, August 30, 2001 08:37 AM I have no idea of what you're talking about. Misha On 30/08/2001 16:11:14 Ayers, Mike wrote: From: [EMAIL

RE: japanese xml

2001-08-30 Thread Ayers, Mike
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Sent: Thursday, August 30, 2001 08:36 AM Furthermore, Viranga's context appears to be XML, in which case it *is* possible to encode *all* Unicode code points using EUC (or ISO-8859-1 or ASCII or ...) I ask again - where's the
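
The quoted point, that an XML document in a limited encoding can still carry any Unicode code point, rests on numeric character references. A minimal sketch using Python's standard `xmlcharrefreplace` error handler follows; the helper name `to_ascii_xml` is illustrative, and the approach applies only to character data and attribute values, not to element or attribute names.

```python
def to_ascii_xml(text):
    """Encode text as ASCII, turning every non-ASCII character into &#NNNN;."""
    return text.encode("ascii", errors="xmlcharrefreplace")


print(to_ascii_xml("日本語 XML"))  # b'&#26085;&#26412;&#35486; XML'
```

The same error handler works with other codecs such as `euc_jp`, in which case only the characters missing from that encoding become references.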

RE: japanese xml

2001-08-30 Thread Ayers, Mike
From: Addison Phillips [wM] [mailto:[EMAIL PROTECTED]] Sent: Thursday, August 30, 2001 09:51 AM 4. However, you can use any other encoding, provided you tag the file appropriately (so that the parser knows what the encoding is and can translate it to its internal representation).

RE: japanese xml

2001-08-30 Thread Ayers, Mike
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Sent: Thursday, August 30, 2001 10:42 AM Interesting. My original reply is pasted in below. Please tell me how you managed to arrive at your interpretation. As I mentioned already, I misread your original reply, partially

Increasingly OT: RE: Errata in language/script list

2001-08-15 Thread Ayers, Mike
From: Philipp Reichmuth [mailto:[EMAIL PROTECTED]] This is not quite true. In fact, in the academic community (or at least in linguistics and cultural sciences) it is established practice to transliterate some terms. In a sinologist's work on Mao he'll probably write Mao as Mao instead of

RE: Re[2]: Errata in language/script list

2001-08-14 Thread Ayers, Mike
From: Thomas Chan [mailto:[EMAIL PROTECTED]] e.g., If someone asked 1-2 (pre-Unicode 3.1) years ago the question, Can I write Cantonese with Unicode?, the answer would have been no or not really. If it were asked today, the answer would be yes. But try that question today with

RE: Re[2]: Errata in language/script list

2001-08-01 Thread Ayers, Mike
From: Marco Cimarosti [mailto:[EMAIL PROTECTED]] This is not correct: I have found the term Han or hanzi in any kind of literature, not only on Unicode documentation. Hanzi is a loan word which I have also often seen (usually written in italics as it should be), but I never said

RE: Re[2]: Errata in language/script list

2001-07-30 Thread Ayers, Mike
From: Philipp Reichmuth [mailto:[EMAIL PROTECTED]] On a side note of course it would by now probably make sense to add Latin as alphabet to Chinese as well since hanyu pinyin has been adopted as some sort of official latinization system by the Chinese government, but that's an entirely

RE: Re[2]: Errata in language/script list

2001-07-30 Thread Ayers, Mike
From: Kenneth Whistler [mailto:[EMAIL PROTECTED]] Also, I see that the script for Chinese is listed as Han, not Chinese. Must we insist on confusing people? The script in question is designated Han in the Unicode Standard, and has always been so, in part because it is also used

RE: Is there Unicode mail out there?

2001-07-20 Thread Ayers, Mike
From: Tex Texin [mailto:[EMAIL PROTECTED]] So it must not be an NCR, EXCEPT in the seemingly rare case where the string ]] appears in content AND that string is not being used to indicate the end of a CDATA section. How is that supposed to be read? Simple. Since ]] is used to

RE: Is there Unicode mail out there?

2001-07-19 Thread Ayers, Mike
From: John Cowan [mailto:[EMAIL PROTECTED]] I think that any proposal to shrink the range of well-formed documents is simply a nonstarter, regrettable as that is. I had thought that one of the main goals of XML Blueberry was mainframe compatibility. If so, won't they need to

RE: Is there Unicode mail out there?

2001-07-19 Thread Ayers, Mike
From: Shigemichi Yazawa [mailto:[EMAIL PROTECTED]] XML states Its goal is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML. But, in my opinion, XML has outgrown its original goal way too far. XML seems to be used in every

RE: Is there Unicode mail out there?

2001-07-13 Thread Ayers, Mike
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] In a message dated 2001-07-13 5:27:41 Pacific Daylight Time, [EMAIL PROTECTED] writes: @š‚¶‚イ‚¢‚Á‚¿‚á‚ñš @Ž„‚͂낱‚¦‚ñ‚ç‚©‚ׂ³B Robert, please stop this. It doesn't seem to be UTF-8 (that is, I can't copy and

RE: A UTF-8 based News Service

2001-07-13 Thread Ayers, Mike
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Raw UTF-8: 4,382,592; Zipped UTF-8: 2,264,152 (52% of raw UTF-8); Raw SCSU: 1,179,688 (27% of raw UTF-8); Zipped SCSU: 104,316 (9% of raw SCSU, 5% of zipped UTF-8) The data set is truly
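
The quoted percentages can be reproduced from the four byte counts. The short check below uses only the figures in the snippet; the dictionary keys and the `percent` helper are illustrative names.

```python
sizes = {
    "raw_utf8": 4_382_592,
    "zipped_utf8": 2_264_152,
    "raw_scsu": 1_179_688,
    "zipped_scsu": 104_316,
}


def percent(part, whole):
    """Return part/whole as a whole-number percentage."""
    return round(100 * part / whole)


print(percent(sizes["zipped_utf8"], sizes["raw_utf8"]))     # 52
print(percent(sizes["raw_scsu"], sizes["raw_utf8"]))        # 27
print(percent(sizes["zipped_scsu"], sizes["raw_scsu"]))     # 9
print(percent(sizes["zipped_scsu"], sizes["zipped_utf8"]))  # 5
```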

RE: Re: Is there Unicode mail out there?

2001-07-13 Thread Ayers, Mike
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Those are MOJIBAKE for my SIG. Which is what you deserve for not sending UTF-8. Until you upgrade your mailer, your name will be @š‚¶‚イ‚¢‚Á‚¿‚á‚ñš. :-p 1) I think that is mojibake for my name. It looks familiar.

RE: Is there Unicode mail out there?

2001-07-12 Thread Ayers, Mike
From: Jungshik Shin [mailto:[EMAIL PROTECTED]] Mysterious is why this prompting (by MS OE) did not happen to Mike Ayers when he replied to Peter's message with Thai string in Windows-874 adding some Chinese characters while MS OE (5.50.x) I tried certainly prompted me to pick

RE: Is there Unicode mail out there?

2001-07-12 Thread Ayers, Mike
From: Chris Wendt [mailto:[EMAIL PROTECTED]] Replying in the charset of the original message is in my view reasonable behavior: the recipient of your reply has the best chance to read the message in the encoding the original message was sent. Changing the encoding decreases the chance

RE: Is there Unicode mail out there?

2001-07-11 Thread Ayers, Mike
From: Mark Davis [mailto:[EMAIL PROTECTED]] Yes, that works fine. The Thai comes through clearly: ¡ÅÑ»ÁÒÍÂÙèáÅéÇ Woohoo!!! UTF-8 party!!! ???!!! /|/|ike

RE: Is there Unicode mail out there?

2001-07-11 Thread Ayers, Mike
: From: Ayers, Mike [mailto:[EMAIL PROTECTED]] Let's try this again... From: Mark Davis [mailto:[EMAIL PROTECTED]] Yes, that works fine. The Thai comes through clearly: ¡ÅÑ»ÁÒÍÂÙèáÅéÇ Woohoo!!! UTF-8 party!!! ???!!! /|/|ike

RE: Is there Unicode mail out there?

2001-07-11 Thread Ayers, Mike
Let's try this again... From: Mark Davis [mailto:[EMAIL PROTECTED]] Yes, that works fine. The Thai comes through clearly: ¡ÅÑ»ÁÒÍÂÙèáÅéÇ Woohoo!!! UTF-8 party!!! ???!!! /|/|ike

RE: Is there Unicode mail out there?

2001-07-11 Thread Ayers, Mike
From: Jungshik Shin [mailto:[EMAIL PROTECTED]] Nothing cryptic. As with others on this thread, your problem is to mistake Windows-874 (legacy encoding for Thai) for UTF-8. Because Windows-874 does NOT cover Chinese characters, they turned into '?'. Judging from your message header,
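
The effect described above is easy to reproduce. The sketch assumes Python's `cp874` codec as a stand-in for Windows-874 and uses a made-up sample string; it is not a reconstruction of what the mail client actually did.

```python
def encode_with_replacement(text, codec="cp874"):
    """Encode text, substituting '?' for characters the codec cannot represent."""
    return text.encode(codec, errors="replace")


def is_valid_utf8(data):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


mixed = "ไทย 中文"  # hypothetical message: Thai followed by Chinese
encoded = encode_with_replacement(mixed)
print(encoded.decode("cp874"))  # the Thai survives, the Chinese becomes "??"
print(is_valid_utf8(encoded))   # the Windows-874 bytes are not valid UTF-8
```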

RE: Terms constructed script, invented script (was: FW: Re: Shavian)

2001-07-09 Thread Ayers, Mike
From: Edward Cherlin [mailto:[EMAIL PROTECTED]] The 'tsu' sign in reduced form is traditionally used in Japanese for consonant doubling (chyotto is written chi yo tsu to), but has been adapted for glottal stops at the end of words. Odd. I've always considered Japanese double

RE: Unicode transliterations (and other operations)

2001-07-05 Thread Ayers, Mike
From: James Kass [mailto:[EMAIL PROTECTED]] てんどうりゅうじ wrote: Still haven't got the multiplication riddle solved, Mr. Kass? Sorry, I didn't know it was required. Almost asked 'which riddle?', but now notice the × in the signature portion as follows...   らんま  

RE: Innovative use of Latin ?!

2001-07-02 Thread Ayers, Mike
From: Martin Duerst [mailto:[EMAIL PROTECTED]] For people interested in new scripts, and new uses of existing scripts :-) http://www.google.com/intl/xx-hacker/ This looks like what is called L33T (elite) writing. It's popular among online gamers. Kinda like computer pig latin...

RE: Innovative use of Latin ?!

2001-07-02 Thread Ayers, Mike
From: Thomas Chan [mailto:[EMAIL PROTECTED]] On Mon, 2 Jul 2001, Ayers, Mike wrote: /|/|ike The way you sign your messages is related to that, isn't it? :) I've seen ]\/[, too. Only related in spirit. I typed some slashes and bars together once (I forget why - maybe

RE: UTFs, ACEs, and English horns

2001-06-18 Thread Ayers, Mike
From: James [mailto:[EMAIL PROTECTED]] There's already 2 Perl modules on CPAN that implement ACE. These modules are already in use by ISPs for CJKV iDNS registration. (One was packaged by me based on Paul Hoffman's IMC code.) They are based on draft-ietf-idn-race-02.txt So it seems

RE: UTF-8 syntax

2001-06-08 Thread Ayers, Mike
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] The definitions have problems that need to be fixed, though, and they're less clear for UTF-16 than they are for UTF-8. I'm becoming inclined to say that any argumentation for or against UTF-8s on the basis of whether it runs into

RE: UTF-8 syntax

2001-06-08 Thread Ayers, Mike
From: Jianping Yang [mailto:[EMAIL PROTECTED]] This will fix the following problem for example: For a searching engine to search the character U-0001 in UTF-8 string, and it could not find it. But when UTF-8 is converted into UTF-16, it can find it there because ED A0 80 and ED B0
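
The byte sequences in the quote are what results when each UTF-16 surrogate is encoded to UTF-8 separately. The sketch below takes U+10000 as an assumed example (the code point in the snippet is truncated) and uses Python's `surrogatepass` error handler to produce the pairwise form; standard UTF-8 encodes the same character as a single four-byte sequence.

```python
supplementary = "\U00010000"  # assumed example of a supplementary character

# Standard UTF-8: one four-byte sequence.
proper = supplementary.encode("utf-8")

# The same character as a UTF-16 surrogate pair, each half encoded on its own.
surrogate_pair = "\ud800\udc00"
pairwise = surrogate_pair.encode("utf-8", errors="surrogatepass")

print(proper.hex())    # f0908080
print(pairwise.hex())  # eda080edb080

# A byte-level search for one form does not match the other.
print(proper in pairwise)  # False
```

A strict UTF-8 decoder rejects the six-byte form, which is why the two representations cannot be mixed freely in one system.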

RE: Unicode under fire again

2001-06-05 Thread Ayers, Mike
I sense impending laughter! Let's get this straight: This is a claim that Unicode cannot navigate its way through the political sensitivities of the East Asian peoples. It is coming from someone who refers to those peoples as Orientals. I quote: Unicode

RE: RECOMMENDATIONs( Term Asian is not used properly on Computers and NET)

2001-06-05 Thread Ayers, Mike
From: Elliotte Rusty Harold [mailto:[EMAIL PROTECTED]] At 4:15 PM -0500 6/4/01, Ayers, Mike wrote: I have used Arabic numerals all my life without once thinking that I was writing Arabic. Really? I myself have been writing European numerals using the Arabic-Indic place-value

RE: RECOMMENDATIONs( Term Asian is not used properly on Computers and NET)

2001-06-05 Thread Ayers, Mike
Perhaps if Han is too unfamiliar a word to be used directly, Sino or Sinitic could be used as translations to convey the same meaning without using the overloaded term Chinese (language, culture, origin, ethnicity, nationality, etc), e.g., Sino characters, Sinitic characters.

RE: RECOMMENDATIONs( Term Asian is not used properly on Computers and NET)

2001-06-04 Thread Ayers, Mike
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] For the Han characters, I have found in the past that people whose native language does not use these characters usually refer to them as Chinese. Obviously (to us anyway), calling them Chinese characters is not adequate, so we

RE: RECOMMENDATIONs( Term Asian is not used properly on Computers and NET)

2001-06-04 Thread Ayers, Mike
However, some other people somewhere else may not like that even though 'Hanzi/Kanji/Hanja' are just different ways of pronouncing the identical words written in 'Chinese characters' meaning 'Chinese characters'. Let's not work based on imaginary fears. Unless someone can name

RE: RECOMMENDATIONs( Term Asian is not used properly on Computers and NET)

2001-06-04 Thread Ayers, Mike
From: Thomas Chan [mailto:[EMAIL PROTECTED]] I think the problem that Doug might be suggesting (correct me if I'm wrong, Doug) is that Chinese is also the name of a language(s). The I have used Arabic numerals all my life without once thinking that I was writing Arabic. Doug

RE: ISO vs Unicode UTF-8 (was RE: UTF-8 signature in web and email)

2001-05-31 Thread Ayers, Mike
If you have this funny encoding please don't call it UTF8 because it is not UTF8 and will only confuse users. You could call it OTF8 or something like that but not UTF8. How about WTF-8? Sorry - I couldn't resist. /|/|ike

RE: ISO vs Unicode UTF-8 (was RE: UTF-8 signature in web and email)

2001-05-31 Thread Ayers, Mike
From: Carl W. Brown [mailto:[EMAIL PROTECTED]] I resisted calling it FTF-8 (Funky Transfer Format - 8), but if you want to call it Weird Transfer Format - 8, I don't have any real objections. Well, that's ONE possible translation of WTF... /|/|ike

RE: ISO vs Unicode UTF-8 (was RE: UTF-8 signature in web and email)

2001-05-30 Thread Ayers, Mike
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] According to the proposal, UTF-8S and UTF-32S would not have the same status: they wouldn't be for interchange; they'd just be for representation internal to a given system, like UTF-EBCDIC (which, I think I heard, has not actually

RE: Term Asian is not used properly on Computers and NET

2001-05-29 Thread Ayers, Mike
From: Marco Cimarosti [mailto:[EMAIL PROTECTED]] Doug Ewell wrote: Peter has an excellent solution -- much better than trying to explain the term CJK to ordinary people -- and I plan to use the term East Asian in the future. But, if by East Asian you mean languages written with

RE: Genesis v. UDHR?

2001-05-25 Thread Ayers, Mike
From: Herman Ranes [mailto:[EMAIL PROTECTED]] Unfortunately, there are some errors in the UNHCRC 300 language collection. Also not wanting to fan any fires, I wish to point out why I believe the text from Genesis was chosen - most Bible translations (as far as I know) are worked on

RE: Single Unicode Font

2001-05-24 Thread Ayers, Mike
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] We want to be able to tell our characters apart. I don't - I just want to be able to read them. Oh by the way how do you tell LATIN CAPITAL LETTER P from GREEK CAPITAL LETTER RHO? Sure if you have context or if somebody

RE: Single Unicode Font

2001-05-23 Thread Ayers, Mike
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Well, you want serifs. I, l, 1. Those aren't differentiated by serifs - the differentiators are part of the character. In any case, when I encounter a difficult bit like this, I intend to do exactly what I did just now - look at it

RE: Single Unicode Font

2001-05-22 Thread Ayers, Mike
From: Carl W. Brown [mailto:[EMAIL PROTECTED]] I find that the most compelling reason is that many characters should be rendered differently depending on user preferences. For example a Japanese user should have the han rendered into Japanese characters except for ones that do not

RE: Single Unicode Font

2001-05-22 Thread Ayers, Mike
From: Carl W. Brown [mailto:[EMAIL PROTECTED]] For those who do not know enough to tell the difference between Kanji typography and Hanzi typography (and Hanja typography ;-) this yields no benefit and forces a meaningless choice (which script that you can't read do you

RE: Ancient writing found in Turkmenistan

2001-05-17 Thread Ayers, Mike
From: Marco Cimarosti [mailto:[EMAIL PROTECTED]] I wanted to forward it to these mailing lists, but the NYT copyright notice is quite clear in that articles can only be downloaded for private use. Hmmm - the NYT is based in the United States, where copyright laws have an

RE: IDS question

2001-05-03 Thread Ayers, Mike
From: Keld Jørn Simonsen [mailto:[EMAIL PROTECTED]] Is there then anyplace I can get a peek at the Extension B characters? gibbeligobble gobbeligoble gibberish jest to keep sarasvasti happy that there is something new in this message. I think I need to write a few more

RE: IDS question

2001-05-02 Thread Ayers, Mike
Bad web day... From: Thomas Chan [mailto:[EMAIL PROTECTED]] http://deall.ohio-state.edu/grads/chan.200/misc/xin_tangshu-76.3481.jpg I believe the correct address (this is probably line-split) is:

RE: IDS question

2001-05-02 Thread Ayers, Mike
From: John H. Jenkins [mailto:[EMAIL PROTECTED]] Unfortunately, it isn't available yet. Unicode doesn't have a Plane 2 font, although we're actively working to get one. Is there then anyplace I can get a peek at the Extension B characters? TiA, /|/|ike

RE: Tags and the Private Use Area

2001-05-01 Thread Ayers, Mike
From: William Overington [mailto:[EMAIL PROTECTED]] Can there be found a possible usage that such a scheme would not support? Finding just one would resolve the question. I suspect that the whole issue is covered by Goedel's(sp?) Incompleteness theorem, which says (approximately)

UTF-8 on this list

2001-04-30 Thread Ayers, Mike
Long after upgrading to Win2K, setting up all my fonts, and testing everything, I've come to a conclusion: there are darn few Unicode text messages on the Unicode mail list (i.e. characters are referred to by codepoint, but the character itself is never included). In fact, I think

RE: On the possibility of guidance code points for the Private Use Area

2001-04-25 Thread Ayers, Mike
From: Eric Muller [mailto:[EMAIL PROTECTED]] Ayers, Mike wrote: Currently, when sending email or interpreting HTML, the content is tagged for its encoding. Wouldn't PUA users simply use their own tag (say, PUA-mike-1) instead of UTF-8? Am I missing something? What we

RE: On the possibility of guidance code points for the Private Use Area

2001-04-24 Thread Ayers, Mike
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] A good point. A possible workaround would be a new plane-14 tag character. I don't see this as a good solution. This is not because of any objection to the plane 14 characters, but because I think the problem can be handled well

ASCII adequacy (was: RE: benefits of unicode)

2001-04-20 Thread Ayers, Mike
From: David Starner [mailto:[EMAIL PROTECTED]] Which, to the extent which this is true (show me how you plan to handle The Art of Computer Programming or the Dragon book, for example), is equally true of upper case. Capitalizing sentences is redundant with punctuation, and any additional

RE: benefits of unicode

2001-04-19 Thread Ayers, Mike
From: David Starner [mailto:[EMAIL PROTECTED]] THEN WHY WASTE A WHOLE BIT ON UPPER CASE? THEY CERTAINLY ARE NOT NECESSARY AND I HAVE FREQUENTLY SEEN PEOPLE NOT USE THEM WHEN AVAILABLE. Good point. We didn't need 'em to get "Huckleberry Finn", so how necessary can they be?

RE: 21-bit unicode

2001-04-18 Thread Ayers, Mike
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] 21 = 3 * 7 so could you "flatten" it to 7-bit ASCII? Well, such flattening may cause the content to be misinterpreted. However, if you are trying to get Unicode past some really old mailers, this would be a reasonably efficient way
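
The "21 = 3 * 7" idea can be sketched as splitting each code point into three 7-bit units. This is only an illustration of the arithmetic, with hypothetical function names; the resulting values include control codes (0x00 to 0x1F), so a real mail-safe scheme would need a further mapping onto printable characters.

```python
def to_septets(text):
    """Split each code point (at most 21 bits) into three 7-bit values."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        out += bytes(((cp >> 14) & 0x7F, (cp >> 7) & 0x7F, cp & 0x7F))
    return bytes(out)


def from_septets(data):
    """Reassemble code points from groups of three 7-bit values."""
    if len(data) % 3 != 0:
        raise ValueError("input length must be a multiple of 3")
    chars = []
    for i in range(0, len(data), 3):
        high, mid, low = data[i:i + 3]
        chars.append(chr((high << 14) | (mid << 7) | low))
    return "".join(chars)


original = "A\u65e5\U00010000"
packed = to_septets(original)
print(len(packed))                       # 9: three units per character
print(from_septets(packed) == original)  # True
```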

RE: benefits of unicode

2001-04-18 Thread Ayers, Mike
From: Edward Cherlin [mailto:[EMAIL PROTECTED]] At 2:04 PM -0500 4/17/01, Ayers, Mike wrote: From: Edward Cherlin [mailto:[EMAIL PROTECTED]] One of the strongest benefits of Unicode is that it supports adequate *monolingual* computing for the first time in any language

RE: benefits of unicode

2001-04-17 Thread Ayers, Mike
From: Edward Cherlin [mailto:[EMAIL PROTECTED]] I would like to point out, again, that there is not now, and cannot be, an 8-bit code page adequate to English, and the same is necessarily true for every other language in modern use. More than a century of typewriters and computers has

RE: Reviewing IETF documents

2001-04-16 Thread Ayers, Mike
[EMAIL PROTECTED] I hope that the claim of "multiple UTF-8 representations" does indeed refer to glyphs, in the sense that Unicode contains both precomposed characters and separable elements, halfwidth and fullwidth ASCII variants, etc. I hope it does *not* refer to the nonconformant

RE: Ruby Annotation and XHTML 1.1 are W3C Proposed Recommendations

2001-04-11 Thread Ayers, Mike
From: Martin Duerst [mailto:[EMAIL PROTECTED]] At 10:00 01/04/09 -0700, Carl W. Brown wrote: I am wondering how in the absence of a sub language how one should render Chinese ruby. Mandarin ruby will not do a Cantonese reader much good. Can I specify multiple ruby and then have one

[unicode] Re: Helpful info

2001-03-23 Thread Ayers, Mike
Because of this very sentence, I have tested the Who command, and guess what? On Fri, 23 Mar 2001 08:45:18 -0500 (EST), Listar [EMAIL PROTECTED] happily sent me a list of 686 subscribers to the Unicode list. Paranoid, I just tried the same and got: SNIP List context changed to

[unicode] Re: Avalanche recovery

2001-03-22 Thread Ayers, Mike
From: John Wilcock [mailto:[EMAIL PROTECTED]] While I'm at it, let me add another plea in favour of setting the Reply-to: header to point back to the list [*only on messages which lack this header*, allowing those who wish to receive personal replies to set the header accordingly].

[unicode] Re: Listar request results

2001-03-22 Thread Ayers, Mike
From: Gaute B Strokkenes [mailto:[EMAIL PROTECTED]] On Thu, 22 Mar 2001, [EMAIL PROTECTED] wrote: Your message has been rejected because it appears to quote too extensively from other posts. Since when has overquoting been a problem on this list? Does this mean that

[unicode] Re: Moving mail lists

2001-03-22 Thread Ayers, Mike
From: Roozbeh Pournader [mailto:[EMAIL PROTECTED]] On Fri, 23 Mar 2001, Sean O Seaghdha wrote: Please, please, please, can we not use this stupid [unicode] addition to the subject line. I agree with all the points that have been made against it so far. It's redundant, it

RE: Unicode complaints

2001-03-15 Thread Ayers, Mike
From: Michael Everson [mailto:[EMAIL PROTECTED]] How much does a radical weigh? I check in at about 200lb. /|/|ike

RE: Off topic: language death in the US

2001-03-15 Thread Ayers, Mike
From: Marco Cimarosti [mailto:[EMAIL PROTECTED]] Well, one wonders: could that president's madness possibly hide some ingenuity? /SNIP Not really, I suspect. Unlike the situations you describe, American foreign language training seems by and large to have the focus that

RE: Unicode webring

2001-03-12 Thread Ayers, Mike
From: Misha Wolf [mailto:[EMAIL PROTECTED]] What I want to know is whether we get a Unicode ring to wear. Misha Only if you join the UniClub(tm)! Kids who join the UniClub get a secret decoder Unicode webring, Cima's magic pocket encoder (handy for encoding magic pockets),

RE: UTF8 vs. Unicode (UTF16) in code

2001-03-09 Thread Ayers, Mike
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On 03/09/2001 12:53:57 PM "Ayers, Mike" wrote: Um... no. The UTF-32 CES can handle much more than the current space of the Unicode CCS. As far as I can tell, it's good to go until we need more than 32 bits to represe

RE: UTF8 vs. Unicode (UTF16) in code

2001-03-08 Thread Ayers, Mike
If you really want to finish the job, there's always UTF-32, which should do rather nicely until we meet the space aliens with the 4,293,853,186 character alphabet! /|/|ike P.S. No, they're not Klingons! From: Ienup Sung [mailto:[EMAIL PROTECTED]] I think we shouldn't advocate

RE: UTF-8, C1 controls, and UNIX

2001-03-01 Thread Ayers, Mike
From: Frank da Cruz [mailto:[EMAIL PROTECTED]] Just to be sure: ISO 2022 has two modes, 7 bits and 8 bits, hasn't it? And in 7 bit mode (I know it's obsolescent), then C1 controls are not supposed to be interpreted as controls, are they? Nor as graphics. Clarification: If

Tengwar and Cirth (was: RE: Fictional scripts revisited, might as

2001-02-26 Thread Ayers, Mike
From: Michael Everson [mailto:[EMAIL PROTECTED]] Oh, we've got a *proposal* for Klingon. It does not, however, appear that it meets the criteria for use as well as Tengwar and Cirth. Okay, I've finally gotta ask: what are Tengwar and Cirth? Klingon I've heard of (and wish I

RE: Perception that Unicode is 16-bit (was: Re: Surrogate space i

2001-02-23 Thread Ayers, Mike
I advocate taking it one step farther, and referring to Unicode as "21 bits and counting". Sure, it should be a long long time before more space is needed, but it's a good idea to prepare the audience now. After all, pretty much every ceiling ever established in computing has been

RE: fictional scripts revisited

2001-02-23 Thread Ayers, Mike
From: David Starner [mailto:[EMAIL PROTECTED]] The second example I would like to raise are the "Square Words" or "New English Calligraphy"[6] (I don't know which name is more appropriate, but I will refer to it hereafter as "NEC"), which is a Sinoform script. NEC is a system where

RE: Perception that Unicode is 16-bit (was: Re: Surrogate space i

2001-02-23 Thread Ayers, Mike
From: John Cowan [mailto:[EMAIL PROTECTED]] Ayers, Mike wrote: After all, pretty much every ceiling ever established in computing has been broken through, and there is no reason to believe that it won't happen again! On the contrary. There *are* reasons to believe that it won't

RE: [OT] What is DEL for?

2001-02-22 Thread Ayers, Mike
From: Marco Cimarosti [mailto:[EMAIL PROTECTED]] This also casts some light on the fact that some fonts (notably JIS fonts) have a big black box glyphs at position 0x7F: it is probably for overwriting a character already printed on paper, so that it cannot be read anymore.

Kang Jie introduction?

2001-01-08 Thread Ayers, Mike
I am looking for a tutorial or introduction to Kang Jie typing. Kang Jie, sometimes called Chang Jie, as well as some other transliterations (none of which I'm quite sure that I'm spelling correctly, as I don't have a reference handy), is a language and dialect independent method for

RE: Kana and Case (was [totally OT] Unicode terminology)

2000-11-22 Thread Ayers, Mike
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Okay. Get out your copy of the lyrics to the Ranma 1/2 Complete Vocal Collection Vol. 1. Now look at the lyrics to Ranbada Ranma (that's Track 12) and tell me that the long vowel mark is not used with hiragana. The long vowel

RE: Devanagari question

2000-11-15 Thread Ayers, Mike
From: Rick McGowan [mailto:[EMAIL PROTECTED]] Mike Ayers wrote: The last I knew, computer-savvy Taiwan and Hong Kong were continuing to invent new characters. In the end, the onus is on the computer to support the user. Yes, the computer should support the user, but... The

RE: Devanagari question

2000-11-14 Thread Ayers, Mike
From: D.V. Henkel-Wallace [mailto:[EMAIL PROTECTED]] At 06:30 2000-11-14 -0800, Marco Cimarosti wrote: But my point was: not even Mr. Ethnologue himself knows exactly *which* combinations are meaningful, in all orthographic system. And, clearly, no one can figure out which combinations

RE: Number separators

2000-10-31 Thread Ayers, Mike
From: James E. Agenbroad [mailto:[EMAIL PROTECTED]] Tuesday, October 31, 2000 You probably should check out what's done in India. They call hundred thousands "crores" and have a name I don't recall for tens of millions. I don't recall how

RE: Japanese scripts?

2000-10-30 Thread Ayers, Mike
From: Shawn Halwes [mailto:[EMAIL PROTECTED]] Can Japanese be effectively represented with only the Hiragana, and Katakana scripts? "Effectively"? No. Katakana-only writing is just wrong, and hiragana sans kanji (with or without katakana) is considered children's writing. From

RE: Locale ID's again: simplified vs. traditional

2000-10-03 Thread Ayers, Mike
From: Carl W. Brown [mailto:[EMAIL PROTECTED]] It seems that the proper solution is to use ISO 15924 which is part of the new RFC-1766 sublanguage specifications. However to my amazement they do not have separate script designations for traditional and simplified scripts.

This is not UniLocale!

2000-09-18 Thread Ayers, Mike
Isn't there a more appropriate forum for the localization issues? I might even subscribe. However, let's please move the topic to a more appropriate place and let character encoding issues comprise at least half the traffic around here. Thanks, /"\

RE: the Ethnologue

2000-09-13 Thread Ayers, Mike
With English, the problem with spell checking is quite different, and different lists of words would not be as easy for a solution: the en-US vs. en-GB tagging does not seem to adequately cover the various differences such as -ise vs. -ize, -our vs. -or, -re vs. -er, use of shall vs.

RE: the Ethnologue

2000-09-13 Thread Ayers, Mike
From: Arnt Gulbrandsen [mailto:[EMAIL PROTECTED]] Are there valid reasons why the imperfect but comprehensive needs to be a standard? I can see one reason for it _not_ to be a standard: A list can be added to faster, so it's easier for a list to be truly comprehensive.

RE: surrogate terminology (was Re: Surrogate support in *ML?

2000-09-12 Thread Ayers, Mike
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] What is confusing is that sometimes "surrogates" refer to certain code units (for UTF-16) that are reserved as code points, and sometimes "surrogates" is used to refer to 'characters on planes 01-10'. I think the latter is a misuse.
