RE: current version of unicode-font

2004-12-03 Thread Andrew C. West
On Fri, 03 Dec 2004 15:10:37 +0200, Cristian Secară wrote: On Thu, 2 Dec 2004 07:51:42 -0800, Peter Constable wrote: Microsoft has never used the label 'OpenFont' for this or any of the fonts that ship with their products. However, the .ttf fonts that ship with their products are

Re: current version of unicode-font

2004-12-02 Thread Andrew C. West
On Fri, 03 Dec 2004 00:38:25 +0700, Paul Hastings wrote: John Cowan wrote: Googling for free Unicode fonts (no quotes) is useful. sort of, when i've googled for this in the past, language-specific (chinese seemed to be the most frequent) fonts turn up more often than not. hey if you

Re: official languages of ISO / IEC (CIE)

2004-11-09 Thread Andrew C. West
On Mon, 8 Nov 2004 15:13:21 -0800 (PST), E. Keown wrote: At the U.N. and in some countries, they have 'official languages.' The U.N. has 5, I think. Singapore has 4, several African countries have 2-3, and so forth. Does either the ISO or the IEC have official languages? Whether

Re: Public Review Issues Update

2004-10-22 Thread Andrew C. West
On Thu, 21 Oct 2004 12:06:23 -0700 (PDT), Kenneth Whistler wrote: Mark Davis wrote: All comments are reviewed at the next UTC meeting. Due to the volume, we don't reply to each and every one what the disposition was. If actions were taken, they are recorded in the minutes of the

Re: Public Review Issue: UAX #24 Proposed Update

2004-09-09 Thread Andrew C. West
On Thu, 9 Sep 2004 07:29:20 -0400, John Cowan wrote: Jony Rosenne scripsit: The UTC refused to add Yiddish to the name, unlike the other Yiddish specialties, and I am not aware of any other possibility. Why should it? Incorporating a language name into a character name, as in

Re: Ogham and Initialisms

2004-07-22 Thread Andrew C. West
On Thu, 22 Jul 2004 11:24:17 +0200, fantasai wrote: If a Latin initialism appears in a bottom-to-top text and the characters are oriented upright rather than rotated, should the initialism read up or down? UA S or S ? AU In traditional monumental

Medieval CJK race-horse names (was Re: Bantu click letters )

2004-06-11 Thread Andrew C. West
On Fri, 11 Jun 2004 03:04:17 +0100, Michael Everson wrote: How many people use medieval CJK race-horse-name characters? Actually, the famous Song dynasty female poet Li Qingzhao (1084-c.1151) invented a board game (da3 ma3 tu2 in Chinese) which involved racing around a course in which each

Additional examples of the Phoenician script in use

2004-06-08 Thread Andrew C. West
At the risk of keeping the thread from hell alive, I'd like to point out a new contribution by Michael Everson that may be of interest to participants in this debate : http://std.dkuug.dk/JTC1/SC2/WG2/docs/n2787-phoenician.pdf To my untrained eyes this document provides some pretty compelling

Re: Proposal to encode dominoes and other game symbols

2004-06-02 Thread Andrew C. West
On Wed, 2 Jun 2004 08:05:00 -0400, John Cowan wrote: H.7 Some criteria weaken the case for encoding -- the symbol is purely decorative This would seem to exclude dingbats altogether. Or perhaps more apposite examples would be the shamrock and fleur-de-lis symbols (see N2586R). Whilst

Re: Vertical BIDI

2004-05-28 Thread Andrew C. West
On Fri, 28 May 2004 06:51:27 -0700, Mark Davis wrote: As things now stand, Ogham must be wrapped in RLO...PDF brackets when mixed with vertical Han or Mongolian. Yes, that's true -- and I don't see any reason why people can't live with that... Those are the kinds of reasons we have the
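
The RLO...PDF bracketing mentioned in this message can be sketched in a few lines. This is only an illustration, assuming Python's standard `unicodedata` module; the helper name `wrap_rlo` and the sample Ogham letters are hypothetical and do not come from the thread.

```python
import unicodedata

RLO = "\u202E"  # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING


def wrap_rlo(text: str) -> str:
    """Bracket text with RLO...PDF so a bidi-aware renderer overrides its direction."""
    return f"{RLO}{text}{PDF}"


# Two arbitrary Ogham letters (BEITH, LUIS), chosen for illustration only.
ogham = "\u1681\u1682"
wrapped = wrap_rlo(ogham)

# Show the code points and the bidi class of each character in the result.
for ch in wrapped:
    print(f"U+{ord(ch):04X}", unicodedata.bidirectional(ch), unicodedata.name(ch))
```

Whether the override produces the desired vertical result still depends on the renderer, which is the point under debate in the thread.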

White and Black Shogi Pieces [2616..2617] (was Re: Proposal to encode dominoes and other game symbols)

2004-05-27 Thread Andrew C. West
On Wed, 26 May 2004 04:34:21 -0700 (PDT), Andrew C. West wrote: On Tue, 25 May 2004 10:08:26 -0700, John Hudson wrote: Andrew C. West wrote: I've never quite worked out what purpose U+2616 [WHITE SHOGI PIECE] and U+2617 [BLACK SHOGI PIECE] are intended for. I would like

Re: Proposal to encode dominoes and other game symbols

2004-05-26 Thread Andrew C. West
On Tue, 25 May 2004 17:30:37 -0700, Rick McGowan wrote: John that going beyond the double-twelve (for now) is just speculative and not supported by actual use in dominoes books. I don't think this is speculative. A photograph of production domino sets above 12 is included in the

Re: Proposal to encode dominoes and other game symbols

2004-05-26 Thread Andrew C. West
On Tue, 25 May 2004 10:08:26 -0700, John Hudson wrote: Andrew C. West wrote: I've never quite worked out what purpose U+2616 [WHITE SHOGI PIECE] and U+2617 [BLACK SHOGI PIECE] are intended for. The standard game of shogi (Japanese Chess) has 20 uncoloured tiles on each side

Re: Proposal to encode dominoes and other game symbols

2004-05-26 Thread Andrew C. West
On Wed, 26 May 2004 13:09:43 +0100, Michael Everson wrote: At 04:40 -0700 2004-05-26, Andrew C. West wrote: But we're not encoding dominos per se, but rather encoding representations of domino pieces in textual contexts. Whilst pictures of domino sets are interesting, and provide useful

Re: Proposal to encode dominoes and other game symbols

2004-05-25 Thread Andrew C. West
On Mon, 24 May 2004 20:11:08 -0700, Patrick Andries wrote: Proposal to encode dominoes and other game symbols This could get out of hand very quickly. Chinese and Japanese (shogi) chess pieces? To complete U+2616 and U+2617 ? I've never quite worked out what purpose U+2616 [WHITE

Re: Proposal to encode dominoes and other game symbols

2004-05-25 Thread Andrew C. West
On Tue, 25 May 2004 10:23:19 +0100, Michael Everson wrote: Now that you mention it, it could well be that Chaturanga and Chinese Chess both could be considered extensions to a unified Chess repertoire: WHITE CHATURANGA COUNSELLOR (- white chess queen) WHITE CHATURANGA ELEPHANT (- white

Re: Proposal to encode dominoes and other game symbols

2004-05-25 Thread Andrew C. West
On Tue, 25 May 2004 13:00:51 +0100, Michael Everson wrote: At 03:27 -0700 2004-05-25, Andrew C. West wrote: On Tue, 25 May 2004 10:23:19 +0100, Michael Everson wrote: Now that you mention it, it could well be that Chaturanga and Chinese Chess both could be considered extensions

Re: Vertical BIDI

2004-05-19 Thread Andrew C. West
Michael Everson wrote: Come on, people. Read the standard, please. It's on page 338. Michael is absolutely right to rebuke me for not reading the Standard. Of course I have read the Ogham block intro before, and no doubt that is where I got the notion of rendering Ogham BTT from, but I had

Re: Vertical BIDI

2004-05-18 Thread Andrew C. West
On Mon, 17 May 2004 22:59:50 -0400, John Cowan wrote: It should not. That's what makes Ogham different from standard horizontal scripts -- it does have a preferred vertical orientation, It does ? I thought that the whole point of much of the recent discussion was the uncertainty of how Ogham

Re: Multiple Directions (was: Re: Coptic/Greek (Re: Phoenician))

2004-05-17 Thread Andrew C. West
On Sat, 15 May 2004 14:14:50 -0400, fantasai wrote: That's a hack, not a solution. There's a fine line between hack and solution, and I'm not sure which side of the line my proposed technique falls. Again, if you take the text out of the presentational context you've warped it into, it

Re: Multiple Directions (was: Re: Coptic/Greek (Re: Phoenician))

2004-05-17 Thread Andrew C. West
On Mon, 17 May 2004 12:15:55 +0100, Jon Hanna wrote: It seems to me that as far as Ogham goes the positioning of successive glyphs is more comparable to the way a graphics program will position text along a path (allowing text to go in a circle, for example) than the differences between LTR,

Re: Multiple Directions (was: Re: Coptic/Greek (Re: Phoenician))

2004-05-17 Thread Andrew C. West
On Mon, 17 May 2004 10:12:50 -0400, John Cowan wrote: Andrew C. West scripsit: Thus, if tb-lr were supported, your browser would display the following HTML line as vertical Mongolian with embedded Ogham reading top-to-bottom, but in a plain text editor, the Mongolian and Ogham would

Re: Multiple Directions (was: Re: Coptic/Greek (Re: Phoenician))

2004-05-17 Thread Andrew C. West
On Mon, 17 May 2004 12:32:14 -0400, [EMAIL PROTECTED] wrote: I follow you. The question is, then, whether T2B Ogham is legible or not to someone who reads B2T Ogham fluently -- unfortunately, your texts are all pothooks and tick marks to me. If you're used to reading Ogham LTR on the

Re: Multiple Directions (was: Re: Coptic/Greek (Re: Phoenician))

2004-05-15 Thread Andrew C. West
On Fri, 14 May 2004 18:44:10 +0100, Michael Everson wrote: You can't play around with Ogham directionality like that. Reversing it makes it read completely differently! The first example reads INGACLU; the second reads ULCAGNI. Well I disagree. As I said in the message, the RTL result

Re: Multiple Directions (was: Re: Coptic/Greek (Re: Phoenician))

2004-05-14 Thread Andrew C. West
On Thu, 13 May 2004 16:33:51 -0400, [EMAIL PROTECTED] wrote: That's irrelevant. L2R and R2L scripts are often mixed in the same sentence, whereas it's barely possible to mix horizontal and vertical scripts on the same page; when it must be done, the vertical script is generally rotated to

Re: Multiple Directions (was: Re: Coptic/Greek (Re: Phoenician))

2004-05-14 Thread Andrew C. West
On Fri, 14 May 2004 11:09:19 +0100, Michael Everson wrote: At 02:40 -0700 2004-05-14, Andrew C. West wrote: (not that Ogham's strictly BTT, but it is largely BTT in monumental inscriptions I think it is always BTT in the inscriptions. My understanding is that when written along

Re: Multiple Directions (was: Re: Coptic/Greek (Re: Phoenician))

2004-05-14 Thread Andrew C. West
On Fri, 14 May 2004 11:43:53 -0400, [EMAIL PROTECTED] wrote: Andrew C. West scripsit: In bilingual Manchu-Chinese texts, which were common during the Manchu Qing dynasty [1644-1911], the text normally follows the Manchu page layout, with vertical lines of Manchu and Chinese interleaved

Re: CJK(B) and IE6

2004-05-04 Thread Andrew C. West
On Sun, 2 May 2004 12:14:29 -0700, Doug Ewell wrote: jameskass at att dot net wrote: The BabelPad editor can easily convert between UTF-8 and NCRs... As can SC UniPad. For $199 (unless you're only interested in editing files up to 1,000 characters in length). Andrew

Brahmic Unification (was Re: New contribution )

2004-04-30 Thread Andrew C. West
On Thu, 29 Apr 2004 12:35:55 -0700, Rick McGowan wrote: The unified Brahmis proposal exactly proposes unification of systems with vastly different rendering behavior. That's part of the controversy with it. But that proposal is currently sitting on a siding waiting to be taken up by the

Re: Brahmic Unification (was Re: New contribution )

2004-04-30 Thread Andrew C. West
On Fri, Andrew C. West scripsit: For example, the excellent description of the Tocharian script (surely the worst made-up name for a dead script ever) at http://titus.fkidg1.uni-frankfurt.de/didact/idg/toch/tochbr.htm could be the basis of a proposal for this important Brahmic script

Re: Unihan.txt and the four dictionary sorting algorithm

2004-04-21 Thread Andrew C. West
On Tue, 20 Apr 2004 22:36:48 +0100, Raymond Mercier wrote: The problem of the size of Unihan has nothing at all to do with the cost of storage, and everything to do with the functioning of programs that might open and read it. Since the lines in Unihan are separated by 0x0A alone, not
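
Since the complaint here is about programs that try to open the whole file at once, a streaming reader may be a useful illustration. The sketch below assumes the tab-separated `U+XXXX<TAB>field<TAB>value` layout with `#` comment lines; the function name `iter_unihan` is hypothetical and the sample lines are made up for the demonstration rather than taken from any Unihan release.

```python
import io
from typing import Iterator, TextIO, Tuple


def iter_unihan(stream: TextIO) -> Iterator[Tuple[int, str, str]]:
    """Yield (code point, field, value) one line at a time, never loading the whole file."""
    for line in stream:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        code, field, value = line.split("\t", 2)
        yield int(code[2:], 16), field, value  # drop the "U+" prefix


# Illustrative sample only; a real file would be opened with
# open(path, encoding="utf-8", newline="\n") and passed in the same way.
sample = io.StringIO("# comment\nU+4E00\tkMandarin\tyī\nU+4E00\tkCantonese\tjat1\n")
records = list(iter_unihan(sample))
print(records)
```

Because the generator consumes one line at a time, memory use stays flat regardless of file size, and bare 0x0A line endings need no special handling.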

Re: Unicode 4.0.1 Released

2004-04-02 Thread Andrew C. West
On Tue, 30 Mar 2004 15:49:53 -0800, Rick McGowan wrote: Unicode 4.0.1 has been released! The main new features in Unicode 4.0.1 are the following: 1. The first significant update of the Unihan Database (Unihan.txt) since Unicode 3.2.0, including a large number of fixes and

Re: French typographic thin space (was: Fixed Width Spaces)

2004-04-02 Thread Andrew C. West
Patrick Andries wrote: Asmus Freytag [EMAIL PROTECTED] a écrit : Have you folks noticed the addition of Narrow Non Break Space? Yes, but I have not been able to find a font with a narrow enough glyph (I just looked again at Code 2000). Does anyone know of an appropriate font for French in

[OT] BabelMap in French

2004-04-01 Thread Andrew C. West
Some of you may be interested to know that a French version of BabelMap (now supporting Unicode 4.0.1) is available from : http://uk.geocities.com/BabelStone1357/Software/BabelMap_fr.html In this version all Unicode data (character names, block/plane names, UCD properties, character

Re: French typographic thin space (was: Fixed Width Spaces)

2004-04-01 Thread Andrew C. West
On Thu, 1 Apr 2004 18:37:49 +0200, Antoine Leca wrote: On Thursday, April 01, 2004 12:37 AM Asmus Freytag [EMAIL PROTECTED] va escriure: Have you folks noticed the addition of Narrow Non Break Space? Is it intended (in part) for French typography? No, it was introduced for Mongolian;

Re: vertical direction control

2004-03-25 Thread Andrew C. West
On Thu, 25 Mar 2004 03:36:29 -0800, Peter Kirk wrote: What about a cell phone or PDA for use in China. Some users may prefer vertical display of text, but then the system needs to know what to do with Latin etc text embedded in the Chinese. Isn't that a credible scenario? Or are the

Re: LATIN SMALL LIGATURE CT

2004-03-02 Thread Andrew C. West
On Mon, 01 Mar 2004 20:02:45 -0800, D. Starner wrote: Most importantly, you don't need to wander all over the PUA - with modern typesetting systems and good fonts, you just place a ct there and the software automatically ligatures it for you. You can use a ZWJ to ask for a ligature and ZWNJ

Re: Codes for Individual Chinese Brushstrokes

2004-02-20 Thread Andrew C. West
On Thu, 19 Feb 2004 18:27:09 -0800 (PST), Kenneth Whistler wrote: Of the 64 entities listed on the page: http://www.chinavoc.com/arts/calligraphy/eightstroke.asp *none* of them are encoded, and *none* of them are standard enough to merit consideration -- if by consideration you mean

Re: interesting SIL-document

2004-02-04 Thread Andrew C. West
On Tue, 03 Feb 2004 10:53:40 -0800, Peter Kirk wrote: There are minimal pairs at the syllable level between the British pronunciation of Birmingham (silent h, stress on first syllable only) and many similar -ingham names, and (rarer) place names like Odiham (Hampshire) - although I

Re: interesting SIL-document

2004-02-04 Thread Andrew C. West
On Wed, 4 Feb 2004 11:12:41 +, Michael Everson wrote: At 02:50 -0800 2004-02-04, Peter Kirk wrote: As for Birmingham, I like the idea of analysing it as a monosyllable [b?m©Øm] although I would tend to think of the eng and the second m as syllabic, but there is then a near minimal

Re: Chinese FVS? (was: RE: Cuneiform Free Variation Selectors)

2004-01-22 Thread Andrew C. West
On Wed, 21 Jan 2004 11:13:33 -0700, John Jenkins wrote: Granted, epigraphy is tough on plain text. As Unicode starts to deal with dead scripts, we have to deal with the issues it raises. Variation selectors are one way of doing it. Yes, but I'm delighted to see from document N2684

Re: Chinese FVS? (was: RE: Cuneiform Free Variation Selectors)

2004-01-21 Thread Andrew C. West
On Tue, 20 Jan 2004 10:32:06 -0700, John Jenkins wrote: 1) U+9CE6 is a traditional Chinese character (a kind of swallow) without a SC counterpart encoded. However, applying the usual rules for simplifications, it would be easy to derive a simplified form which one could conceivably see

Re: Mongolian Unicoding (was Re: Cuneiform Free Variation Selectors)

2004-01-21 Thread Andrew C. West
On Tue, 20 Jan 2004 16:33:24 -0500, [EMAIL PROTECTED] wrote: Andrew C. West scripsit: These are glyph variants of Phags-pa letters that are used with semantic distinctiveness in a single (but very important) text, _Menggu Ziyun_ , a 14th century rhyming dictionary of Chinese in which

Re: Mongolian Unicoding (was Re: Cuneiform Free Variation Selectors)

2004-01-20 Thread Andrew C. West
On Tue, 20 Jan 2004 00:36:54 -0800, Asmus Freytag wrote: Currently, Variation Selectors work only one way. You could 'force' one particular shape. Leaving the VS off, gives you no restriction, leaving the software free to give you either shape. W/o defining the use of two VSs you cannot

Re: Mongolian Unicoding (was Re: Cuneiform Free Variation Selectors)

2004-01-19 Thread Andrew C. West
On Mon, 19 Jan 2004 05:23:31 +, [EMAIL PROTECTED] wrote: Dean Snyder wrote, Tom Gewecke wrote at 2:26 PM on Sunday, January 18, 2004: ... Agreed. I can't imagine that anyone who has ever tried to actually do anything with Unicode Mongolian would recommend variation selectors

Re: U+0185 in Zhuang and Azeri (was Re: unicode Digest V4 #3)

2004-01-15 Thread Andrew C. West
On Wed, 14 Jan 2004 10:44:18 -0800, Peter Kirk wrote: I received the following reply from a Zhuang researcher, which agrees with what Andrew has written: ... There are two other orthographies in use in Zhuang. Most important, there is an ancient Zhuang square-character script that has

Re: U+0185 in Zhuang and Azeri (was Re: unicode Digest V4 #3)

2004-01-06 Thread Andrew C. West
On Mon, 5 Jan 2004 17:37:30 -0800 (PST), Kenneth Whistler wrote: Perhaps someone on the list who knows more about the actual history of orthographic reform in the Zhuang Autonomous Region of Guangxi could chime in with more details. Well, I'm not really that knowledgeable about Zhuang, but

Re: Bhutanese marks

2004-01-06 Thread Andrew C. West
On Tue, 6 Jan 2004 05:29:51 -, C J Fynn wrote: U+0F09 which was erroneously named BKA-SHOG YIG MGO (should have been ZHU-YIG GO RGYAN), is used for writing respectfully to a senior particularly when requesting something. e.g when writing to a government officer or minister requesting a

Re: LATIN SOFT SIGN

2004-01-05 Thread Andrew C. West
On Mon, 5 Jan 2004 13:54:18 +, Michael Everson wrote: LATIN LETTER TONE SIX **is** the SOFT SIGN clone into Latin, and should be used for Pan-Turkic. I've suggested, but perhaps not loudly enough, that the reference glyph be modified to be more soft-sign like. LATIN LETTER TONE SIX

Re: Aramaic unification and information retrieval

2003-12-23 Thread Andrew C. West
On Mon, 22 Dec 2003 21:36:25 -0800, Doug Ewell wrote: Ancient forms of Aramaic aren't going to be taken up anytime soon for any consideration for encoding. And the Roadmap cannot be taken as a predetermination of the eventual decisions in this regard, in my opinion. Maybe not as far

Re: Aramaic unification and information retrieval

2003-12-23 Thread Andrew C. West
On Tue, 23 Dec 2003 01:59:06 -0800, Doug Ewell wrote: I deliberately followed the roadmap codepoints for my recent 'Phags-pa proposal even though I think 'Phags-pa probably belongs in the SMP (but I don't really care where 'Phags-pa is encoded as long as it is encoded, so I am happy to

Re: Text Editors and Canonical Equivalence (was Coloured diacritics)

2003-12-12 Thread Andrew C. West
On Fri, 12 Dec 2003 07:53:13 -0800, Peter Kirk wrote: OK. In fact I suspect that the number that have meaningful semantics and effective usage is actually rather small and could be fitted within the higher PUA planes if one chose to do that. After all, not many languages use large numbers

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-08 Thread Andrew C. West
On Sun, 7 Dec 2003 17:40:25 -0800, Doug Ewell wrote: There are plenty of things one can do with writing that aren't supported by computer encodings, and aren't really expected to be. The idea of a black i with a red dot was mentioned. Here's another: the piece-by-piece exploded diagrams used

Re: Ideographic Description Characters

2003-12-08 Thread Andrew C. West
On Sun, 7 Dec 2003 11:25:01 -0700, Tom Gewecke wrote: Can anyone tell me whether ideographic description characters are ever actually used? Well, I use them on a couple of my web pages to describe unencoded ideographs (try viewing http://uk.geocities.com/BabelStone1357/Alphabets/Zhuang.html
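
As a rough illustration of how an ideographic description sequence is put together, the sketch below builds one from an operator and two components. It assumes the twelve operators U+2FF0..U+2FFB that were defined at the time of the thread; the helper names `is_idc` and `describe_left_right` are hypothetical, and the example components are arbitrary (the result describes a character that is in fact already encoded).

```python
import unicodedata


def is_idc(ch: str) -> bool:
    """True for the ideographic description characters U+2FF0..U+2FFB."""
    return 0x2FF0 <= ord(ch) <= 0x2FFB


def describe_left_right(left: str, right: str) -> str:
    """Build a two-component IDS using U+2FF0 (left-to-right arrangement)."""
    return "\u2FF0" + left + right


ids = describe_left_right("\u6C35", "\u6728")  # water radical + tree, for illustration
print(ids, [f"U+{ord(c):04X}" for c in ids])
print(unicodedata.name(ids[0]))
```

A reader without special rendering support simply sees the operator followed by its components, which is what makes the sequence usable as a plain-text fallback for unencoded ideographs.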

Re: Unihan kKorean pronunciations

2003-12-08 Thread Andrew C. West
On Fri, 5 Dec 2003 11:20:02 -0700, John Jenkins wrote: I checked with Lee Collins (who's the person who put the data in there originally). Quoth'a: It's called Yale, since it appears in a number of Samuel Martin's works published by Yale Press. Oops, I guess I really ought to have

Re: Unihan kKorean pronunciations

2003-12-08 Thread Andrew C. West
On Sat, 6 Dec 2003 05:17:16 +0900 (KST), Jungshik Shin wrote: For the nice summary of various transliteration/transcription schemes for Korean, see http://www.asahi-net.or.jp/~ez3k-msym/charsets/roma-k.htm Thanks, this page seems to provide just the information I need to convert the

Unihan kKorean pronunciations

2003-12-05 Thread Andrew C. West
Does anyone know what is the system of transliteration used for the kKorean key in the Unihan database ? The notes at the top of Unihan.txt simply state that kKorean gives The Korean pronunciation(s) of this character. However, the readings are in some strange orthography that I am not familiar

Re: Complex Combining

2003-11-28 Thread Andrew C. West
On Thu, 27 Nov 2003 08:11:55 -0800, Peter Kirk wrote: This is all rather interesting speculation. There are surely a lot of potential cases in scripts where some kind of combining mark can be considered as applying to a sequence of an arbitrary number of characters. For example:

RE: Complex Combining

2003-11-28 Thread Andrew C. West
On Fri, 28 Nov 2003 10:32:51 +, Arcane Jill wrote: You are getting personal and indulging in ad hominem. I consider this out of order. Wow, people really are tetchy today. The published Mail List Rules and Etiquette state that Correspondents should remain tolerably polite and consider

Re: numeric properties of Nl characters in the UCD

2003-11-26 Thread Andrew C. West
On Tue, 25 Nov 2003 16:16:15 -0800, Doug Ewell wrote: Well, one reason could be that there is no such character. (Did you mean U+1034A GOTHIC LETTER NINE HUNDRED?) But why do U+10341 [GOTHIC LETTER NINETY] and U+1034A [GOTHIC LETTER NINE HUNDRED], which are letters that are only ever used
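
The property difference being asked about can be inspected directly. The following sketch assumes Python's bundled `unicodedata` tables, which reflect whatever UCD version ships with the interpreter rather than the version current in 2003; U+10330 is included only as an example of an ordinary Gothic letter.

```python
import unicodedata

# U+10330 is an ordinary Gothic letter; U+10341 and U+1034A are the two
# letters discussed in the thread.
for cp in (0x10330, 0x10341, 0x1034A):
    ch = chr(cp)
    print(
        f"U+{cp:04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),        # general category, e.g. Lo or Nl
        unicodedata.numeric(ch, None),   # None when no numeric value is assigned
    )
```

Only the two Nl characters report a numeric value here, even though other Gothic letters could also be used as numerals in running text.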

Re: numeric properties of Nl characters in the UCD

2003-11-26 Thread Andrew C. West
On Wed, 26 Nov 2003 08:04:33 -0800, Peter Kirk wrote: On 26/11/2003 04:40, Andrew C. West wrote: Is this perhaps because all the other Gothic letters can also be used to represent numbers in exactly the same way that U+10341 and U+1034A are used (these two letters were devised specifically

Re: IE settings for surrogates support

2003-11-25 Thread Andrew C. West
On Mon, 24 Nov 2003 15:47:16 +0100 (CET), Philippe VERDY wrote: [HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\International\Scripts\42] IEFixedFontName=Code2001 IEPropFontName=Code2001 This setting is incorrect: the script IDs go between 3 and 40, See

Unihan Vietnamese Readings

2003-11-25 Thread Andrew C. West
I've been looking at the Vietnamese readings given in the Unihan database recently, and although I don't know Vietnamese, I think there may be something not quite right with some of them, and so I wondered if anyone on this list who knows Vietnamese could confirm the validity of the Unihan

Re: creating a test font w/ CJKV Extension B characters.

2003-11-24 Thread Andrew C. West
On Mon, 24 Nov 2003 10:12:52 +, [EMAIL PROTECTED] wrote: Even with the registry changes that allow Uniscript to work with such characters? Oops, my mistake. I had forgotten that I had deliberately deleted the registry settings that control how IE deals with surrogate pairs sometime ago

Re: creating a test font w/ CJKV Extension B characters.

2003-11-21 Thread Andrew C. West
On Thu, 20 Nov 2003 21:02:49 -0800, Doug Ewell wrote: An invalid GB18030 sequence, like FE 40, or a valid but out-of-range sequence, like E3 32 9A 36, should be treated just like an invalid or out-of-range UTF-8 sequence. Issue an error message, format the hard disk, whatever; just don't
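
The "reject, do not guess" policy described here can be sketched with a strict decoder. This example assumes Python's built-in `gb18030` codec and only demonstrates the four-byte out-of-range case; the wrapper name `decode_gb18030` is hypothetical, and how any particular editor treats such input is not something this sketch can show.

```python
from typing import Optional


def decode_gb18030(data: bytes) -> Optional[str]:
    """Decode strictly; return None instead of substituting characters on bad input."""
    try:
        return data.decode("gb18030")  # errors="strict" is the default
    except UnicodeDecodeError:
        return None


first_supplementary = decode_gb18030(bytes.fromhex("90308130"))  # U+10000
last_valid = decode_gb18030(bytes.fromhex("E3329A35"))           # U+10FFFF
out_of_range = decode_gb18030(bytes.fromhex("E3329A36"))         # one past U+10FFFF

print(first_supplementary is not None, last_valid is not None, out_of_range)
```

The four-byte sequence E3 32 9A 36 is well-formed byte-wise but maps beyond U+10FFFF, so a strict decoder reports an error rather than inventing a code point.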

RE: creating a test font w/ CJKV Extension B characters.

2003-11-21 Thread Andrew C. West
On Fri, 21 Nov 2003 15:12:26 +0100, Philippe Verdy wrote: Could an editor loading such incorrect but legacy GB-18030 file accept to load it and work with it using an internal-only UCS-4 mapping (or an extended UTF-8 mapping), to preserve those out of range sequences, as if they were mapped

Re: creating a test font w/ CJKV Extension B characters.

2003-11-20 Thread Andrew C. West
On Thu, 20 Nov 2003 01:32:16 +, [EMAIL PROTECTED] wrote: Frank Yung-Fong Tang wrote, If you visit http://people.netscape.com/ftang/testscript/gb18030/gb18030.cgi?page=596 and your machine have surrogate support install correctly and surrogate font install correctly then you
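
For readers unsure what "surrogate support" has to handle, the arithmetic that maps a supplementary-plane code point to a UTF-16 surrogate pair is shown below. This is a sketch of the standard UTF-16 calculation; the function name `to_surrogates` is hypothetical, and U+20000 is used simply because it is the first CJK Extension B code point.

```python
from typing import Tuple


def to_surrogates(cp: int) -> Tuple[int, int]:
    """Split a supplementary code point into its UTF-16 high and low surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points above the BMP have surrogate pairs")
    v = cp - 0x10000                      # 20-bit offset into the supplementary planes
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)


high, low = to_surrogates(0x20000)
print(f"U+20000 -> {high:04X} {low:04X}")

# Cross-check against Python's own UTF-16 encoder.
print(chr(0x20000).encode("utf-16-be").hex())
```

Software that processes text one 16-bit unit at a time sees two separate values here, which is why Extension B characters need explicit support in both the renderer and the font.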

Re: Ewellic

2003-11-12 Thread Andrew C. West
On Wed, 12 Nov 2003 08:10:49 -0800, Doug Ewell wrote: I don't think the secrecy criterion is sufficient to qualify a writing system as a cipher (whether it is necessary is another question). Nüshü (sp?) was developed primarily for secrecy, if I'm not mistaken, and I doubt anyone would

Re: Handy table of combining character classes

2003-11-10 Thread Andrew C. West
On Fri, 7 Nov 2003 14:57:51 -0500, John Cowan wrote: Here's a little table of the combining classes, showing the value, the number of characters in the class, and a handy name (typically the one used in the Unicode Standard, or a CODE POINT NAME if there is only one; sometimes of my own
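
A table of this kind (class value plus number of characters) can be regenerated from the character database. The sketch below assumes Python's `unicodedata` module, so the counts reflect the UCD version bundled with the interpreter and will differ from the figures posted in 2003; the function name `combining_class_counts` is hypothetical, and no handy names are attached.

```python
import sys
import unicodedata
from collections import Counter


def combining_class_counts() -> Counter:
    """Count characters per non-zero canonical combining class."""
    counts: Counter = Counter()
    for cp in range(sys.maxunicode + 1):
        ccc = unicodedata.combining(chr(cp))
        if ccc:  # class 0 (not reordered) is omitted
            counts[ccc] += 1
    return counts


counts = combining_class_counts()
for ccc in sorted(counts):
    print(f"{ccc:3d}  {counts[ccc]}")
```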

Re: elided base character or obliterated character (was: Hebrew composition model, with cantillation marks)

2003-11-07 Thread Andrew C. West
On Thu, 6 Nov 2003 12:51:53 -0500, John Cowan wrote: IIRC we talked about this a year or so ago, and kicked around the idea that the Chinese square could be treated as a glyph variant of U+3013 GETA MARK, which looks quite different but symbolizes the same thing. I suspect that few Chinese

Re: elided base character or obliterated character (was: Hebrew composition model, with cantillation marks)

2003-11-06 Thread Andrew C. West
On Wed, 5 Nov 2003 12:24:00 +0100, Philippe Verdy wrote: The obliterated character needed for paleolitic studies, or to encode any texts in which the character is not recognizable already exists: isn't it the REPLACEMENT CHARACTER? The problem of how to represent missing/obliterated

Re: [hebrew] Re: Hebrew composition model, with cantillation marks

2003-11-06 Thread Andrew C. West
On Thu, 6 Nov 2003 08:30:24 -0800, Doug Ewell wrote: I can't help thinking that other specialized lists, such as those for bidi and CJK, were created to resolve this exact type of problem. CJK list ? Now if only there was a list of Unicode lists ...

Re: FW: Web Form: Other Question, Problem, or Feedback

2003-10-24 Thread Andrew C. West
i'm looking for a tool or a tutorial to convert japanese signs in numeric unicode signs (e.g. &#30041;). Can you help me? Try BabelPad at uk.geocities.com/BabelStone1357/Software/BabelPad.html Select the text, and click on Convert : NCR to Unicode from the menu. Or simply check the
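
For anyone who would rather script the conversion than use an editor, a minimal sketch in Python follows. It is not how BabelPad does it; the function names `ncr_to_text` and `text_to_ncr` are hypothetical, and the sketch handles only numeric character references (decimal and hexadecimal), not named entities.

```python
import re

# Matches &#NNNN; (decimal) or &#xHHHH; (hexadecimal) references.
_NCR = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def ncr_to_text(s: str) -> str:
    """Replace numeric character references with the characters they denote."""
    def repl(m: "re.Match[str]") -> str:
        cp = int(m.group(1), 16) if m.group(1) else int(m.group(2))
        # Leave references beyond U+10FFFF untouched rather than guessing.
        return chr(cp) if cp <= 0x10FFFF else m.group(0)
    return _NCR.sub(repl, s)


def text_to_ncr(s: str) -> str:
    """Replace every non-ASCII character with a decimal reference."""
    return "".join(c if ord(c) < 0x80 else f"&#{ord(c)};" for c in s)


print(text_to_ncr(ncr_to_text("&#30041;")))
```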

Re: [tibex] Re: TIBETAN DIGIT HALF ZERO

2003-10-24 Thread Andrew C. West
On Thu, 23 Oct 2003 13:05:05 -0700, Peter Lofting wrote: The representation of slashed digits are problematic for two reasons. (1) The notation is that a slash indicates half of the value. This is different to the less a half interpretation Andrew describes, which would only be true for

Re: FW: Web Form: Other Question, Problem, or Feedback

2003-10-24 Thread Andrew C. West
On Fri, 24 Oct 2003 01:58:03 -0700 (PDT), Andrew C. West wrote: Try BabelPad at uk.geocities.com/BabelStone1357/Software/BabelPad.html Select the text, and click on Convert : NCR to Unicode from the menu. Or simply check the Convert NCRs checkbox on the file open dialog when you open

Re: [OT] Meaning of U+24560?

2003-10-13 Thread Andrew C. West
On Sun, 12 Oct 2003, Patrick Andries wrote: Would anyone know what U+24650 means? Probably only Yang Xiong really knows what U+24560 means in the context of Tetragram #9, but unfortunately he's been dead for a couple of thousand years. Rather than bravely attempt to translate the Chinese

Re: kMandarin and kCantonese in Unihan

2003-10-07 Thread Andrew C. West
On Tue, 7 Oct 2003 21:42:09 +0800, Anthony Fok wrote: What is a good place for discussions on these issues? And which personnel and which sources are involved with esp. the CJK-Ext-A kCantonese data? It would be nice to talk with the original people to find out how these errors crept in,

Re: Chinese departing tone marks

2003-09-30 Thread Andrew C. West
On Thu, 19 Jun 2003 17:38:06 -0500, [EMAIL PROTECTED] wrote: That sounds, then, like these are *not* two of the left-stemmed tone letters (mirrors of 02E5..02E9) that I'm going to be including in a proposal for additional modifier characters for tone. Peter, I notice from document N2626

Re: TAI NÜA , TAI LE

2003-09-15 Thread Andrew C. West
Not knowing very much about the Tai script I consulted some Chinese reference books to see how the Chinese designate the Tai script. The Languages and Scripts volume of the _Ci Hai_ encyclopaedia (Shanghai, 1978) states that there are four main script traditions used for writing the Dai (= Tai)

Re: Damn'd fools

2003-07-26 Thread Andrew C. West
On Fri, 25 Jul 2003 21:28:30 +0100, Michael Everson wrote: Presumably the name of the U.K. would change, however. Why? It would be the United Kingdom of Great Britain, which comprises England, Scotland, Wales, and the Duchy of Cornwall. United Kingdom of Great Britain as opposed to the

Re: Nu Shu script

2003-07-15 Thread Andrew C. West
On Mon, 14 Jul 2003 15:15:44 -0700 (PDT), Kenneth Whistler wrote: NuShu (or Nüshu) is periodically discovered and raised for discussion on this list. There has been considerable interest in Nü Shu (literally women's writing) in recent years, especially amongst feminist academics in Japan and

RE: Combining diacriticals and Cyrillic

2003-07-12 Thread Andrew C. West
On Fri, 11 Jul 2003 09:09:08 -0700, Rick Cameron wrote: Ah, but what you don't realise [and it's not surprising, because MSDN doesn't make it clear] is that when ScriptTextOut calls ExtTextOut, it passes glyph indices, and uses the ETO_GLYPH_INDEX option. Thus, the two statements are

Re: Combining diacriticals and Cyrillic

2003-07-11 Thread Andrew C. West
On Fri, 11 Jul 2003 13:15:14 +0200, Philippe Verdy wrote: The Win32 Text APIs (such as TextOut) actually DO support UniScribe transparently on Windows XP... In most applications, this means that the UniScribe support works without requiring explicit calls to the Uniscribe API. Surely some

Re: Documents needed for proposal

2003-07-04 Thread Andrew C. West
On Thu, 3 Jul 2003 08:59:19 -0700, Rick McGowan wrote: That is section 2.2 of the WG2 Principles and Procedures document. It is available on-line. Go here: http://std.dkuug.dk/JTC1/SC2/WG2/docs/principles.html I'm familiar with this document (n2352r), and it does indeed list the

hPhags-pa Proposal

2003-06-30 Thread Andrew C. West
I have made available for consultation a draft proposal for the encoding of the 'Phags-pa or hPhags-pa script (mainly used for writing Chinese and Mongolian during the 13th and 14th centuries) at : uk.geocities.com/BabelStone1357/hPhags-pa/N2352.html A set of additional pages relating to the

Re: Mongolian Rant (was: Biblical Hebrew... was: Tibetan... was: ...)

2003-06-29 Thread Andrew C. West
Ken, Thank you for your kind response to my latest rant. It has gone a long way to reassuring me that my concerns over MFVSs and the already defined Mongolian standardized variants are unfounded. On Fri, 27 Jun 2003 17:25:50 -0700 (PDT), Kenneth Whistler wrote: The Mongolian variants are

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Andrew C. West
On Fri, 27 Jun 2003 04:22:30 -0500, [EMAIL PROTECTED] wrote: I just have a hard time believing that 50 years from now our grandchildren won't look back, What were they thinking? So it took them a couple of years to figure out canonical ordering and normalization; why on earth didn't they

Re: Question about Unicode Ranges in TrueType fonts

2003-06-26 Thread Andrew C. West
On Wed, 25 Jun 2003 21:58:28 -0700, Elisha Berns wrote: Some weeks back there were a number of postings about software for viewing Unicode Ranges in TrueType fonts and I had a few questions about that. Most viewers listed seemed to only check the Unicode Range bits of the fonts which can be

Re: Question about Unicode Ranges in TrueType fonts

2003-06-26 Thread Andrew C. West
On Thu, 26 Jun 2003 14:26:13 +0200, Philippe Verdy wrote: Isn't there a work-around with the following function (quote from Microsoft MSDN): (with the caveat that you first need to allocate and fill a Unicode string for the codepoints you want to test, and this can be lengthy if one wants to

Re: Major Defect in Combining Classes of Tibetan Vowels

2003-06-25 Thread Andrew C. West
On Wed, 25 Jun 2003 15:05:26 +0400, Valeriy E. Ushakov wrote: Err, as in this particular case one vowel sign is above and the other one is below the stack - i.e. they don't interact spatially - you cannot really distinguish them. ;) I know that the vowel signs do not interact with each other

Re: Major Defect in Combining Classes of Tibetan Vowels

2003-06-25 Thread Andrew C. West
On Wed, 25 Jun 2003 19:47:26 +0400, Valeriy E. Ushakov wrote: And given that the two look identical in writing in the first place, this lexical difference had a chance to originate exactly *where*? You are putting the cart before the horse. Well, unless the text has been scanned with OCR, a

Re: Major Defect in Combining Classes of Tibetan Vowels

2003-06-25 Thread Andrew C. West
On Wed, 25 Jun 2003 13:41:27 -0700 (PDT), Kenneth Whistler wrote: Peter asked: How can things that are visually indistinguishable be lexically different? chat (en) chat (fr) And if Unicode reordered vowels in front of consonants, then we wouldn't be able to distinguish : chat (en)

Re: Chinese departing tone marks

2003-06-20 Thread Andrew C. West
On Thu, 19 Jun 2003 17:38:06 -0500, [EMAIL PROTECTED] wrote: That sounds, then, like these are *not* two of the left-stemmed tone letters (mirrors of 02E5..02E9) that I'm going to be including in a proposal for additional modifier characters for tone. I noticed that in Yuan Jiahua's

Re: Chinese departing tone marks

2003-06-20 Thread Andrew C. West
On Fri, 20 Jun 2003 08:28:12 -0500, [EMAIL PROTECTED] wrote: Hard to say without seeing them, but if they are simply contours, then those are already supported in Unicode by means of ligatures of the five already there. If it's something else, go ahead and send me the scan (with bibliographic

Re: Chinese departing tone marks

2003-06-20 Thread Andrew C. West
On Fri, 20 Jun 2003 09:27:34 -0500, [EMAIL PROTECTED] wrote: The bigger question is, can your software access the ligatures? Works like a dream with Uniscribe 1.453.3665.0 and later. Andrew

RE: International Font to be Used

2003-06-10 Thread Andrew C. West
On Mon, 09 Jun 2003 18:04:58 +0100, Raymond Mercier wrote: One (free) tool that will allow you to investigate what blocks of Unicode are actually covered in a font file is: http://pfaedit.sourceforge.net/ And to see what fonts on your disk support specified unicode blocks, another

Re: Classification of U+30FC KATAKANA-HIRAGANA PROLONGED SOUND MARK

2003-06-05 Thread Andrew C. West
On Wed, 4 Jun 2003 18:11:48 -0500 , Mount, Rob (Robert F) wrote: I am investigating differing behavior in various environments of the wide-character version of the C function isAlpha with respect to character U+30FC KATAKANA-HIRAGANA PROLONGED SOUND MARK. The UNICODE documents seem
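
For readers who want to check how one environment classifies this character, the following is a minimal sketch in Python rather than C. It assumes the interpreter's bundled `unicodedata` tables; the helper name `describe` is made up for illustration, and the result says nothing about how any particular C library's wide-character classification behaves.

```python
import unicodedata


def describe(ch):
    """Return (name, general category, str.isalpha() result) for one character."""
    return unicodedata.name(ch), unicodedata.category(ch), ch.isalpha()


# U+30FC is the character discussed in this thread.
name, category, alpha = describe("\u30fc")
print(name, category, alpha)
```

Python's `str.isalpha()` treats the letter categories (Lu, Ll, Lt, Lm, Lo) as alphabetic, so it is only one possible reading of "alphabetic"; other environments may draw the line differently.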

Re: Length of Unicode Name

2003-06-04 Thread Andrew C. West
On Tue, 3 Jun 2003 21:51:26 +0930, Kevin Brown wrote: Does the Unicode Standard specify an upper limit to the length of a character's Unicode Name? See Annex L Character naming guidelines of ISO/IEC 10646-1: 2000 (unfortunately not easily available over the net, which is a shame as you have
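
As a practical complement to the question, here is a rough sketch that measures the longest character name known to a given Python installation. It assumes the `unicodedata` module shipped with the interpreter, so the answer depends on which Unicode version that module carries; it reports an observed maximum, not a limit stated by any standard. The function name `longest_name` is hypothetical.

```python
import sys
import unicodedata


def longest_name(start=0, stop=sys.maxunicode + 1):
    """Scan code points in [start, stop) and return (code point, name)
    for the longest character name found; unnamed code points are skipped."""
    best_cp, best_name = None, ""
    for cp in range(start, stop):
        # The second argument is returned when a code point has no name.
        name = unicodedata.name(chr(cp), "")
        if len(name) > len(best_name):
            best_cp, best_name = cp, name
    return best_cp, best_name


cp, name = longest_name()
print(f"U+{cp:04X} {name} ({len(name)} characters)")
```

Passing a narrower range, for example `longest_name(0x3040, 0x3100)`, restricts the scan to a single block.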

RE: IPA Null Consonant

2003-06-03 Thread Andrew C. West
On Mon, 2 Jun 2003 08:13:49 -0500, [EMAIL PROTECTED] wrote: According to http://www.unicode.org/Public/UNIDATA/StandardizedVariants.html, there is no variation sequence 2205, FE00 defined. Somebody needs to tell the author(s) of this page that they can't make up their own variation-selector

Radical Property (was “book end” or enclosing characters in most languages?)

2003-05-30 Thread Andrew C. West
On Thu, 29 May 2003 16:05:37 -0700 (PDT), Kenneth Whistler wrote: In general, when people are interested in classes of characters, like this, a quick trip into the Unicode Character Database is a useful thing to do. In particular, look for the list of characters with the property
