Re: proposal for a creative commons character

2004-06-15 Thread jcowan
Michael Tiemann scripsit: Without getting greedy, I'd like to propose the adoption of the (cc) symbol in whatever way would be most expedient (so that creative commons authors can identify their work more appropriately), and leave for later the question of the other symbols. It's a logo. We

Re: Bantu click letters

2004-06-10 Thread jcowan
Peter Constable scripsit: Would you consider these too idiosyncratic? No. The idio- in idiosyncratic has to do with an individual. I forgot to point this out earlier, but !Xu phonology isn't idiosyncratic either -- it's just unusual. To the !Xu it's the normal thing. -- Is a chair finely

Re: Bantu click letters

2004-06-10 Thread jcowan
Michael Everson scripsit: You have a weird view of the history of phonetics, John. You haven't addressed the substantive issue: these are Latin characters used to represent sounds which in 1925 could not easily be represented. And never have been represented thus since. In their day,

Re: Bantu click letters

2004-06-10 Thread jcowan
Michael Everson scripsit: You don't KNOW that. You assert that. This is the adversarial style I was objecting to, John. Could you please take this on board? Fair enough, Michael. But the burden of going forward with the evidence is still yours. (I'll do what I can.) But it is QUITE

Re: Bantu click letters

2004-06-10 Thread jcowan
Asmus Freytag scripsit: That doesn't mean that we stop asking all the hard questions, but that we allow a presumption of usefulness for characters that were in demonstrated use over some time and by several authors. I quite agree. Here, however, we have (as far as the evidence goes) a

Re: Bantu click letters

2004-06-10 Thread jcowan
D. Starner scripsit: There's at least a small user community; those people who are actively transcribing old works, like Project Gutenberg. Due to the latest US copyright extensions, it will take us a couple decades, but we'll want to transcribe this article. In 2050. I wouldn't worry about

Re: Sinhala encoding issues

2004-06-09 Thread jcowan
Paul Nelson (TYPOGRAPHY) scripsit: Currently, our implementation is that a character displayed on its own is displayed on a dotted circle. From my recollection, this is what is recommended in TUS. This currently works as a stand-alone mark with a visual representation of the dotted circle in

Re: Sinhala encoding issues

2004-06-09 Thread jcowan
Paul Nelson (TYPOGRAPHY) scripsit: My assumption is the only SINHALA or SINHALA is sent to the Sinhala engine. Okay. How does Unicode propose making sure that, in the plain text case, that the space before the Sinhala combining mark is glued to the combining mark and not the previous

Re: Updated Phoenician proposal: confidential?

2004-06-02 Thread jcowan
Peter Kirk scripsit: As for the speculation that these users have been almost unanimously opposed to the proposal, I consider the remark inaccurate yet find myself unable to attack your credibility in this regard. Well, this sounds like a careful circumlocution for an ad hominem attack.

Re: Phoenician, Fraktur etc

2004-05-28 Thread jcowan
Mark E. Shoulson scripsit: True. I am awaiting with great impatience the arrival of a *comparative* text of the two Pentateuchs: Two columns, one MT, one SP, (both in Square Hebrew, MT pointed), with differences printed in LARGER LETTERS. Do you mean that you have ordered this and

Re: Response to Everson Phoen and why June 7? (Chris Fynn...)

2004-05-28 Thread jcowan
Mark E. Shoulson scripsit: (when he was awake) I talked to the assistant Rabbi, who'd given the talk. I told him we'd been disputing this for weeks, etc... Of course, a discussion lasting mere weeks wouldn't sound very significant to a Talmudist. -- John Cowan [EMAIL PROTECTED]

Re: Palaeo-Hebrew, Phoenician, and Unicode (Phoenician Unicode proposal)

2004-05-26 Thread jcowan
Peter Constable scripsit: So if I understand correctly, the only fonts that we know of so far that have PH glyphs encoded in the 0590..05FF block were developed by someone who thinks PH should be encoded as a distinct script from square Hebrew. Principle is one thing, expediency another.

Re: Why Fraktur is irrelevant (was RE: Fraktur Legibility (was Re:Response to Everson Phoenician)

2004-05-26 Thread jcowan
Peter Kirk scripsit: So I have an honest question. Can anyone, please, remind me of any technical arguments other than legibility for the separate encoding of Phoenician? The same as the general argument for separating any two scripts: the desire to create plain-text documents which contain

Re: Proposal to encode dominoes and other game symbols

2004-05-25 Thread jcowan
Michael Everson scripsit: Trumps in English. I suggest that 21 trumps be encoded, but not named, because the correspondence of names to numbers is variable. This would be the Major Arcana? Yes. AFAIK that term is relatively recent, ca. 1900; trump (i.e. triumph) goes back to the Tarot's

Re: Proposal to encode dominoes and other game symbols

2004-05-25 Thread jcowan
Michael Everson scripsit: Trump seems to mean something else in English these days. Not really. In the game of tarot/tarocchi, which is a species of whist, there are a fixed set of trumps; in successor games using the standard deck, which suit is trumps is determined by one of a variety of

Re: Fraktur Legibility (was Re: Response to Everson Phoenician)

2004-05-25 Thread jcowan
Dean Snyder scripsit: So, you are saying there are glyph streams in German Fraktur that fluent, native Germans would have trouble reading. And in Antiqua too. Consider O0OO000O0O0OOO000O0OO000O0O0OO0. -- John Cowan www.reutershealth.com www.ccil.org/~cowan [EMAIL PROTECTED] The Penguin

Re: New Public Review Issue posted

2004-05-25 Thread jcowan
Rick McGowan scripsit: The Unicode Technical Committee has posted a new issue for public review and comment. Details are on the following web page: http://www.unicode.org/review/ I have prepared a draft DiacriticFolding.txt file for this issue; it is temporarily available at

Re: Response to Everson Phoenician and why June 7?

2004-05-24 Thread jcowan
Michael Everson scripsit: and with interleaved collation, Which was rejected for the default template (and would go against the practices already in place in the default template) but is available to you in your tailorings. I don't accept that the existing practices are necessarily a

Re: Response to Everson Phoenician and why June 7?

2004-05-24 Thread jcowan
Michael Everson scripsit: People who need to override the default template can do so, according to the standard. If they're lucky. The less lucky will only get default-UCA sorting. The least lucky will get nothing but binary codepoint sorting and a few language-specific hacks. The default

Re: Zip vs. Non Zipped and ISO 15924 draft fixes

2004-05-21 Thread jcowan
Jon Hanna scripsit: [T]he default encoding on the server (which really should be utf-8 on www.unicode.org at this stage). Currently it is, but there are sticky issues: in particular, a default encoding overrides information in HTML meta elements as well as browser heuristics, at least for

Re: Zip vs. Non Zipped and ISO 15924 draft fixes

2004-05-21 Thread jcowan
Jon Hanna scripsit: This is passing strange, for the problem was UTF-8 being mis-interpreted as a legacy encoding, not the other way around. a) Not everyone uses a modern browser. 2) The problem might have been speculative (or memorious) rather than actual. -- John Cowan

Re: Zip vs. Non Zipped and ISO 15924 draft fixes

2004-05-21 Thread jcowan
Doug Ewell scripsit: John Cowan jcowan at reutershealth dot com wrote: Consequently, random pages that happen to be in non-Unicode charsets are getting mis-served and mis-displayed. The site will probably revert to having no default as a result, which is a great pity. I'm sure

Re: UTF-8 encoded texts on the website (was Zip vs. Non Zipped and ISO 15924 draft fixes)

2004-05-21 Thread jcowan
Philippe Verdy scripsit: You can instruct Apache to serve a part of the site with another default encoding by uploading with your FTP client a .htaccess file containing a different default MIME type association. .htaccess cannot do anything that hacking the httpd.conf file can't do. In this

Re: UTF-8 encoded texts on the website (was Zip vs. Non Zipped and ISO 15924 draft fixes)

2004-05-21 Thread jcowan
Philippe Verdy scripsit: But today the global httpd.conf does not specify any charset in the content-type, In fact I have seen the current httpd.conf, and it does specify UTF-8 as the DefaultCharset. -- While staying with the Asonu, I met a man from John Cowan the Candensian plane, which

Re: ISO 15924 French name Gotique: a typo...???

2004-05-21 Thread jcowan
Philippe Verdy scripsit: There are many confusions in French with the meaning of the term gothique, Gothic and gothic have exactly the same confusions in English, with the addition of a subculture of people who dress in a rather unusual fashion. -- Dream projects long deferredJohn

Re: Vertical BIDI

2004-05-18 Thread jcowan
Andrew C. West scripsit: It does ? I thought that the whole point of much of the recent discussion was the uncertainty of how Ogham should be laid out in vertically formatted text, such as when embedded in Mongolian or vertical Chinese. What's uncertain is whether a lr or a rl progression is

Re: Multiple Directions (was: Re: Coptic/Greek (Re: Phoenician))

2004-05-17 Thread jcowan
Andrew C. West scripsit: I think you may have misunderstood me. I'm now suggesting that perhaps Ogham shouldn't be rendered bottom-to-top when embedded in vertical text such as Mongolian, but top-to-bottom as is the case with other LTR scripts such as Latin, I follow you. The question is,

Re: Multiple Directions (was: Re: Coptic/Greek (Re: Phoenician))

2004-05-17 Thread jcowan
Philippe Verdy scripsit: How can I get so much difference in Internet Explorer when rendering Ogham vertically (look at the truncated horizontal strokes), and is the absence of ligatures in Mongolian caused by lack of support of Internet Explorer or the version of the Code2000 font that I use

Re: Multiple Directions

2004-05-17 Thread jcowan
fantasai scripsit: One doesn't really have to tilt one's head sideways to read the English; it's possible to process the text vertically. I can read the passage at a comfortable speaking speed by this method, and I imagine the Japanese do the same. With just a little more practice I'd be

Re: Multiple Directions (was: Re: Coptic/Greek (Re: Phoenician))

2004-05-17 Thread jcowan
Michael Everson scripsit: TTB, not T2B, please. [...] BTT, not B2T, please. It would be a violation of my traditional cultural standards to use T instead of 2 for to. Furthermore, using 2 prevents me from writing TBB and other such horrors. Ogham has LTR directionality when horizontal, and

Re: Archaic-Greek/Palaeo-Hebrew (was, interleaved ordering; was, Phoenician)

2004-05-17 Thread jcowan
Michael Everson scripsit: What is a 'diascript' ? Dean's attempt to invent a new term for the gigantic bucket he thinks Hebrew is. Not new, not invented; as I said, already in use in French, German, and Dutch. By analogy with diaphoneme, presumably; an abstract representation which can

Re: ISO-15924 script nodes and UAX#24 script IDs

2004-05-17 Thread jcowan
Michael Everson scripsit: Or shouldn't simply Unicode deprecate script IDs in favor of ISO-15924 codes? This doesn't make any sense. I believe the suggestion is to drop the long-form Unicode script codes currently used for the Script property in favor of 15924 codes exclusively. -- John

Re: Vertical BIDI

2004-05-17 Thread jcowan
fantasai scripsit: (context: http://fantasai.inkedblade.net/style/discuss/vertical-bidi ) Notice that in B, the Chinese and the English are going in opposite directions, even though they're both LTR scripts. That's because the English is rotated and the Chinese is not, and rotated text

Re: Multiple Directions (was: Re: Coptic/Greek (Re: Phoenician))

2004-05-14 Thread jcowan
Andrew C. West scripsit: A page that contained both Mongolian and vertical CJK might require a vertical bidirectional algorithm, but AFAIK that question has not yet arisen. I'm a little confused by the last sentence. So was I. In bilingual Manchu-Chinese texts, which were common

Re: Multiple Directions (was: Re: Coptic/Greek (Re: Phoenician))

2004-05-14 Thread jcowan
Michael Everson scripsit: You can't play around with Ogham directionality like that. Reversing it makes it read completely differently! The first example reads INGACLU; the second reads ULCAGNI. Which is as much to say that R2L Ogham is illegible. But is T2B Ogham necessarily illegible,

Re: [BULK] - Re: Interleaved collation of related scripts

2004-05-14 Thread jcowan
Peter Kirk scripsit: Well, I accepted somewhat reluctantly that Phoenician should be separately encoded because a small number of users want it to be, although a majority apparently do not want it to be. Neither you nor anyone else knows what the majority wants, because most interested

Re: Multiple Directions (was: Re: Coptic/Greek (Re: Phoenician))

2004-05-14 Thread jcowan
Michael Everson scripsit: Which is as much to say that R2L Ogham is illegible. But is T2B Ogham necessarily illegible, especially if the glyphs were to be reversed? Try it and see. ;-) It's all Greek to me. -- How they ever reached any conclusion at all[EMAIL PROTECTED] is starkly

Re: Multiple Directions (was: Re: Coptic/Greek (Re: Phoenician))

2004-05-13 Thread jcowan
E. Keown scripsit: How did you decide that 'horizontal' is the default direction? My impression is that 85 - 95% of *all* elements of writing ever invented by humans are Chinese (or other ..JKV...). That's irrelevant. L2R and R2L scripts are often mixed in the same sentence, whereas it's

Re: UTF-8 nitpicking (was: RE: any unicode conversion tools?)

2004-05-13 Thread jcowan
Kenneth Whistler scripsit: It was only with Unicode 3.0 (and the correlated 10646-1:2000) that this was rationalized to the Unicode definition of UTF-8 formally consisting of only 1-4 bytes sequences, while simultaneously the potential need for 5 and 6-byte sequences in 10646 was removed,
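The 1-4 byte limit described in the snippet above can be checked directly. The following is a minimal sketch in Python, not part of the original message; the helper name `utf8_length` and the sample characters are chosen here purely for illustration.

```python
# Illustrative only: UTF-8 as defined since Unicode 3.0 uses 1 to 4 bytes
# per code point. `utf8_length` is a hypothetical helper name.

def utf8_length(ch: str) -> int:
    """Return the number of bytes UTF-8 uses for a single character."""
    return len(ch.encode("utf-8"))

# One sample character for each possible sequence length.
SAMPLES = ["A", "\u00e9", "\u20ac", "\U00010348"]
LENGTHS = [utf8_length(ch) for ch in SAMPLES]

def rejects_five_byte_form() -> bool:
    """A lead byte of 0xF8 would start an old-style 5-byte sequence;
    a conforming decoder must refuse it."""
    try:
        bytes([0xF8, 0x88, 0x80, 0x80, 0x80]).decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False
```

Under these assumptions, `LENGTHS` covers every legal sequence length, and the obsolete 5-byte form is rejected by the decoder rather than silently accepted.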

Re: interleaved ordering (was RE: Phoenician)

2004-05-12 Thread jcowan
Mike Ayers scripsit: I agree with those who think that interleaving Phoenician and Hebrew would not be a good default. I've asked it before and I'll ask it again: is it not correct that language scholars are those most likely to be able to create and use a nondefault sort order? I see

Re: Everson-bashing

2004-05-12 Thread jcowan
Peter Kirk scripsit: But have the others agreed with his judgments because they are convinced of their correctness? Or is it more that the others have trusted the judgments of the one they consider to be an expert, and have either not dared to stand up to him or have simply been unqualified

Re: Katakana_Or_Hiragana

2004-05-10 Thread jcowan
Tom Emerson scripsit: Perhaps Michael can enlighten us on the rationale for grouping hiragana and katakana together as a single script. They aren't. They are collated together, that's all. -- How they ever reached any conclusion at all[EMAIL PROTECTED] is starkly unknowable to the human

Re: Katakana_Or_Hiragana

2004-05-10 Thread jcowan
Michael Everson scripsit: Phoenician and Hebrew should not be interfiled, of course, in the default table, though John Cowan seems to think otherwise. 'Seems', monsieur? Nay, 'does'; I know not 'seems'. --Not Quite Hamlet The point is, of course, that if Phoenician is to be used to

Re: Phoenician

2004-05-07 Thread jcowan
Jony Rosenne scripsit: A possible strong negative argument would be if having it would cause problems for those who do not think they need it. For example, if it would make searching more difficult. This argument has been raised, but I am not convinced the possible difficulties are

Re: Phoenician

2004-05-07 Thread jcowan
E. Keown scripsit: This could be solved by making Phoenician and Hebrew base characters equivalent at the first level of collation. Could this be translated and expanded into Basic Not-so-Geeky English???---Elaine It means that given an alphabetized list of words, some of which are
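To make "equivalent at the first level of collation" concrete, here is a toy sketch in Python. It is not the Unicode Collation Algorithm and not anything proposed in the thread: the two-level key, the names `PRIMARY` and `sort_key`, and the use of the Phoenician code points U+10900..U+10915 (assigned only after this discussion took place) are all assumptions made for illustration.

```python
# Toy two-level sort key: corresponding Hebrew and Phoenician letters share
# a primary weight; the script only breaks ties at the second level.

# The 22 non-final Hebrew letters, alef through tav, in alphabetical order.
HEBREW_BASE = [0x05D0 + i for i in range(10)] + [
    0x05DB, 0x05DC, 0x05DE, 0x05E0, 0x05E1, 0x05E2,
    0x05E4, 0x05E6, 0x05E7, 0x05E8, 0x05E9, 0x05EA,
]
# The 22 Phoenician letters, alf through tau (code points assigned later).
PHOENICIAN = [0x10900 + i for i in range(22)]

PRIMARY = {}
for rank, (heb, phn) in enumerate(zip(HEBREW_BASE, PHOENICIAN)):
    PRIMARY[chr(heb)] = rank
    PRIMARY[chr(phn)] = rank

def sort_key(word: str):
    """Primary level: letter identity. Secondary level: script."""
    # Characters outside the table sort after all known letters.
    primary = tuple(PRIMARY.get(c, 1000 + ord(c)) for c in word)
    secondary = tuple(1 if 0x10900 <= ord(c) <= 0x10915 else 0 for c in word)
    return (primary, secondary)

HEB_AB = "\u05d0\u05d1"            # alef, bet
HEB_BA = "\u05d1\u05d0"            # bet, alef
PHN_AB = "\U00010900\U00010901"    # alf, bet
PHN_BA = "\U00010901\U00010900"    # bet, alf

WORDS = [PHN_BA, HEB_AB, PHN_AB, HEB_BA]
INTERLEAVED = sorted(WORDS, key=sort_key)
BY_CODE_POINT = sorted(WORDS)
```

With this key the two spellings of the same word end up adjacent in `INTERLEAVED`, whereas plain code point order in `BY_CODE_POINT` lists every Hebrew-script word before every Phoenician-script one. Final-form Hebrew letters are deliberately left out of the table to keep the sketch short.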

Re: Phoenician

2004-05-07 Thread jcowan
E. Keown scripsit: So could you do this with all Semitic/Afroasiatic languages which have something like alef and beth? Is there a numeric limit? No, there's no numerical limit. You could do it for whichever 22CWSAs Unicode ends up encoding. Another consequence is that searching as well as

Re: TR35 (was: Standardize TimeZone ID)

2004-05-07 Thread jcowan
Carl W. Brown scripsit: So which timezone will the tr_TR locale in a TR35 database have? Asia/Istanbul or Europe/Istanbul or both? Both. I guess that the territory possessions list should be an another database that is merged. I think they should be in the same database. Guam is a

Re: TR35 (was: Standardize TimeZone ID)

2004-05-07 Thread jcowan
Philippe Verdy scripsit: I do agree. The fact that both Europe/Istanbul and Asia/Istanbul are referenced is probably not really political, but it reflects the fact that this city is on both continents, and that its timezone covers more than just this city. Someone living on the Asian area

Re: v and u positional variants (Re: New contribution)

2004-05-06 Thread jcowan
Patrick Andries scripsit: The same is true for huit (8) / vit (he lives or virile member) , huitre (oyster) / vitre (window pane), huis (door) / vis (you (sing.) live, live ! or screw), etc. Similarly, English final -u/v was always interpreted as u, so phonetically final v had to be written

Re: For Phoenician

2004-05-06 Thread jcowan
Michael Everson scripsit: At 11:30 -0700 2004-05-06, E. Keown wrote: The logical implication of Everson's work is that part of the Dead Sea Scrolls and all the Samaritan material and all other material of that type, should be encoded in his proposed block. No, Elaine. The implication

Re: Just if and where is the then?

2004-05-05 Thread jcowan
John Jenkins scripsit: There is, moreover, a non-zero cost to revising a program or OS to use a new 8-bit encoding. Realistically, people running machines or using software too old to use Unicode aren't likely to get much advantage at this point by the creation of a new 8-bit standard.

Re: Yoruba Keyboard

2004-05-05 Thread jcowan
African Oracle scripsit: Looking at the above it is obvious that the acute on top of the e and o with dot below is a bit too high almost to the point of looking like a cedilla under E. The fact that it looks that way to you does not mean that it looks that way to everyone. I'm writing this

Re: New contribution

2004-05-04 Thread jcowan
Peter Constable scripsit: 2) the characters in question are structurally / behaviourally very similar to square Hebrew characters, but not to the characters of other scripts Not just very similar: structurally, behaviorally, and even phonemically identical. Item 1, I think we'd agree, is

Re: New contribution

2004-05-04 Thread jcowan
Peter Constable scripsit: What are the directional properties of Phoenician? Is it RTL only, or was it ever written with a different directionality? It's RTL only, except to the extent that you consider Archaic Greek a script variant of Phoenician. :-) -- John Cowan [EMAIL PROTECTED]

Re: New contribution

2004-05-04 Thread jcowan
Michael Everson scripsit: Well. Depends what you mean by forms. Our taxonomy currently lists Samaritan, Square Hebrew, Arabic, Syriac, and Mandaic as modern (RTL) forms of the parent Phoenician. Arabic and Syriac have very specialized shaping behavior which makes them obviously distinct

Re: Just if and where is the then?

2004-05-04 Thread jcowan
African Oracle scripsit: Are we saying we have exhausted such necessity? Yes. And what are these legacy-standard encodings? Those devised by ISO, various national governments, IBM, Microsoft, and Apple, roughly speaking. No new composite values will be added. - Peter Constable The above

Re: Drumming them out

2004-05-04 Thread jcowan
Michael Everson scripsit: Enshrining justifications in the proposal documents really all that important? It sounds like busywork to me. No, this point I insist on. It's really, really important, as we descend further into the labyrinth of difficult choices (how to encode? what to unify or

Re: New contribution

2004-05-03 Thread jcowan
Michael Everson scripsit: If you think that a Hebrew Gemara, with its baroque and wonderful typographic richness, can be represented in a Phoenician font, I don't think that one bit. (Why is it that when I disagree with someone, that person so frequently wants to accuse me of believing in

Re: Nice to join this forum....

2004-05-03 Thread jcowan
[EMAIL PROTECTED] scripsit: Wondering about casing, if the gb digraph appears initially, I have a booklet for learning Yoruba which includes the proper name of the Rt. Rev. Isaac Gbekeleoluwa Abiodun Jadesimi in the bilingual dedication. In both the Yoruba and English versions of the

Re: New contribution

2004-04-30 Thread jcowan
Philippe Verdy scripsit: Suppose that a modern Hebrew text is speaking about Phoenician words, the script distinction is not only a matter of style but carries semantic distinctions as well, as they are distinct languages. It's obvious that a modern Hebrew reader will not be able to decipher

22CWSA

2004-04-30 Thread jcowan
It's been pointed out to me that I never explained the abbreviation 22CWSA. Mea culpa. I got tired of typing 22-character West Semitic abjad. -- Yes, chili in the eye is bad, but so is yourJohn Cowan ear. However, I would suggest you wash your[EMAIL PROTECTED] hands thoroughly before

Re: New contribution

2004-04-30 Thread jcowan
Michael Everson scripsit: But the variation of some Latin and Cyrillic letters can be just as great. Unsupported assertion. You don't have anything like the difference between a single-stroke Hebrew YOD and a three-pronged Phoenician YOD between Cyrillic and Latin. What about the

Re: FW: Web Form: Subj: Against Phoenician

2004-04-30 Thread jcowan
Ego et Michael Everson inter se scripserunt: An alternate version of Michael could present a similarly technically impeccable proposal for Gaelic script, and then the question would be, is it the same as Latin, or is it a separate script requiring a separate encoding? Except that he

Re: New contribution

2004-04-30 Thread jcowan
John Hudson scripsit: On the one hand, the obvious recommendation would be to tell semiticists to continue doing what they have been doing: encoding as Hebrew and displaying with Phoenician-style glyph variants, as this enables textual analysis and comparison with a larger body of Hebrew

Re: Public Review Issues Updated

2004-04-29 Thread jcowan
Philippe Verdy scripsit: So now we are left with orthographic/phonetic letters. c-stroke is one that was covered in your searches. But now that we know that capital C-stroke is also used, can Unicode be updated later to add a case mapping for c-stroke, if C-stroke is added later? Yes.

Re: [META] Should there be a separate public list for CLDR?

2004-04-25 Thread jcowan
Michael Everson scripsit: Please, Mark. You don't spend as much time on the Unicode list as I do. Trust me. Or trust MichKa. Either way, please make a new list for this specialized discussion area. I add my voice to these. Please create a separate list for public discussion of locales,

Variation selectors and vowel marks

2004-04-23 Thread jcowan
I'm surfacing an issue from [EMAIL PROTECTED] because it may have wider applicability. Currently, it's the rule that variation selector characters can't be applied to combining characters. This is sensible in the case of true diacritical marks: if two marks differ in shape, they ought in general

Re: Unihan.txt and the four dictionary sorting algorithm

2004-04-23 Thread jcowan
Edward H. Trager scripsit: (Windows' lack of a decent shell and command-line tools is probably what makes the OS most annoying). Cygwin (http://www.cygwin.com) is your friend; it provides a relatively complete Unix hosted on Win32. It works best on the NT branch of the family when the disks

Re: New Currency sign in Unicode

2004-04-02 Thread jcowan
Kenneth Whistler scripsit: Rick said: [...] I would tend to think that what we have is just a set of variations on the ordinary cent sign, and any number of variant glyphs can be used. [...] I draw a somewhat different conclusion. But why? You don't provide any argument

Re: Fixed Width Spaces (was: Printing and Displaying DependentVowels)

2004-03-31 Thread jcowan
Language Analysis Systems, Inc. Unicode list reader scripsit: It sorta seems like the need to keep phrases like Louis XIV together is a valid one that deserves a solution, but it also seems fairly esoteric-- typesetters and people who give a lot of thought to the presentation of their text

Doing Markup in Plain Text: A Modest Proposal for Planes 4-B of Unicode

2004-03-31 Thread jcowan
XML has become the de facto standard for fancy text. It is therefore useful to explore ways and means of bringing XML into plain text, since obviously plain text is simpler than, and superior to, fancy text. The current method involving and and and / and who knows what else is obviously much

Re: Fixed Width Spaces (was: Printing and Displaying DependentVowels)

2004-03-31 Thread jcowan
Peter Kirk scripsit: But, as Ken has just clarified, with NBSP Louis' neck may be stretched rather uncomfortably, if not cut completely. Here is what I don't want to see (fixed width font required): Louis XVI was guillotinedin 1793. This, however, is a matter of presentation

Re: Unicode 4.0.1 Released

2004-03-31 Thread jcowan
Marco Cimarosti scripsit: So far, my understanding was that the normative properties of existing code points were carved in stone. Not all normative properties are immutable. A normative property is simply one which you have to get right if you claim conformance to that part of Unicode: you

Re: Why is U+17C1 of General category Mc while U+0E40 and U+0EC0 are of category Lo ?

2004-03-29 Thread jcowan
Patrick Andries scripsit: Small question again. Why is U+17C1 KHMER VOWEL SIGN E of General category Mc (Mark, Spacing Combining) while similar signs in Lao and Thai, related scripts, are of General category Lo (Letter, Other) ? See U+0E40 THAI CHARACTER SARA E and U+0EC0 LAO VOWEL
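The General_Category values mentioned in the question above can be looked up with Python's standard `unicodedata` module. This is a small illustrative sketch, not part of the thread; the helper name `describe` is made up here, and the values returned reflect the Unicode version bundled with the Python interpreter in use.

```python
import unicodedata

# The three code points named in the question.
CODE_POINTS = [0x17C1, 0x0E40, 0x0EC0]

def describe(cp: int):
    """Return (U+XXXX label, character name, General_Category) for a code point."""
    ch = chr(cp)
    return (f"U+{cp:04X}", unicodedata.name(ch), unicodedata.category(ch))

TABLE = [describe(cp) for cp in CODE_POINTS]
```

Iterating over `TABLE` and printing each tuple shows the Khmer sign classified as a spacing combining mark while the Thai and Lao vowels are classified as letters, which is exactly the discrepancy the question asks about.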

Re: Printing and Displaying Dependent Vowels

2004-03-29 Thread jcowan
Antoine Leca scripsit: I am sorry John, I must have missed a post of yours. I asked you where it is written, and did not find any answer to this; unless someone consider that all marks, including spacing combining vowels, are (European) diacritics. Well, it depends on what the equivoque

Re: Printing and Displaying Dependent Vowels

2004-03-25 Thread jcowan
Avarangal scripsit: Can anyone provide information on the sequences used for displaying and printing dependent vowels as standalones. The standards-conforming way to do so is to precede the dependent vowel with a space character (U+0020). If this sequence is not displayed correctly, complain
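The sequence described in the reply above (a space followed by the dependent vowel) can be built as follows. This is a minimal sketch in Python; the function name `standalone`, the `base` parameter, and the choice of U+0BBE TAMIL VOWEL SIGN AA as the sample mark are assumptions for illustration. Later versions of the Unicode Standard, as far as I know, prefer U+00A0 NO-BREAK SPACE as the base, which is why the base is left configurable.

```python
import unicodedata

SPACE = "\u0020"

def standalone(mark: str, base: str = SPACE) -> str:
    """Return `base` followed by a single combining mark, for isolated display."""
    if len(mark) != 1 or unicodedata.category(mark) not in ("Mn", "Mc"):
        raise ValueError("expected exactly one combining mark")
    return base + mark

# U+0BBE TAMIL VOWEL SIGN AA, a spacing combining mark (category Mc).
TAMIL_AA = "\u0bbe"
SEQUENCE = standalone(TAMIL_AA)
```

Whether `SEQUENCE` actually renders as an isolated vowel sign still depends on the font and rendering engine, which is the point of the reply's advice to complain when it does not.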

Re: vertical direction control

2004-03-24 Thread jcowan
Kenneth Whistler scripsit: Ernest Cline wrote: It also doesn't account for boustrephedon writing direction either. ^ boustrophedon Ah well. I once referred to Herodotos throughout a posting as Herotodos (googling

Re: Irish dotless I (was: Languages with letters that always take diacriticals

2004-03-19 Thread jcowan
Pavel Adamek scripsit: From the viewpoint of sorting, the coding HCOMBINING C BEFORE would be much better than CCOMBINING H AFTER. For Czech, yes. For Spanish we want the latter. -- Her he asked if O'Hare Doctor tidings sent from far John Cowan coast and she with grameful sigh him

Re: What's the BMP being saved for?

2004-03-18 Thread jcowan
Arcane Jill scripsit: Why are characters being assigned codepoints U+, when there is still loads and loads of unused empty space below that point. In fact the BMP is currently 87.5% full. When the 32 remaining blocks currently shown on the Roadmap are completed, it will be almost 99%

Re: help needed with adding new character

2004-03-18 Thread jcowan
Jon Wilson scripsit: The character in question is a variant of CIRCLED LATIN CAPITAL LETTER A, commonly referred to as the Anarchy symbol. The bars of the A are longer than normal, extending to touch or even overlap the circle. It's basically a logo, and as such doesn't belong in Unicode,

Re: help needed with adding new character

2004-03-18 Thread jcowan
Jon Wilson scripsit: PEACE SYMBOL, YIN YANG and HAMMER AND SICKLE are represented in Unicode. The first and third are even logos for specific organisations (CND and various communist governments). PEACE SYMBOL, as its name indicates, has a considerably wider scope than nuclear disarmament,

Re: Irish dotless I (was: Languages with letters that always take diacriticals

2004-03-18 Thread jcowan
[EMAIL PROTECTED] scripsit: Thus, the digraph 0062+0068 (i.e., bh) represents the same conceptual object as 1E03. Note that, if a selection of Irish text is set using one convention or the other, problems with spell checkers will occur UNLESS there is some metadata that indicates the writing

Re: Irish dotless I (was: Languages with letters that always take diacriticals

2004-03-18 Thread jcowan
[EMAIL PROTECTED] scripsit: In this context, and if it's true that a spell checker could, in theory, be programmed to handle parallel encoding conventions, then why shouldn't Irish language traditionalists encode the i with a LATIN SMALL LETTER DOTLESS I such as 0131? It could be done, yes,

Re: Fwd: Re: (SC2WG2.609) New contribution N2705

2004-02-18 Thread jcowan
Ernest Cline scripsit: I'm not saying that sufficient support can't be shown. I'm saying that the examples shown are not enough to convince me of the desirability of encoding subscript x and subscript / as official Unicode characters instead of as markup or as private use characters. I

Re: (SC2WG2.609) New contribution N2705

2004-02-17 Thread jcowan
Philippe Verdy scripsit: I see some similarities between the undetermined vowel tainting letter (the subscripted x) and the leading star in the expression, used to denote an undetermined inferred historic letter. Shouldn't both use the same glyph with just a distinct positioning? Could it be

Re: interesting SIL-document

2004-02-03 Thread jcowan
Philippe Verdy scripsit: Which words? hungry, hunger, Hungary, Henry ? I don't know a syllable-initial /h/ in English out of word-initial /h/... And even in that case, I think this comes from contracted phonetic of fast or popular speech, where there's an intermediate schwa between /h/ and

Re: interesting SIL-document

2004-02-03 Thread jcowan
Peter Kirk scripsit: John, your phonology isn't actually even reasonable. [eng] occurs intervocally in words like hanger, singing. Whether this is syllable initial depends on your analysis. Fair enough; but hang-er, sing-ing *is* the conventional analysis. English, generally speaking,

Re: Latin Theta?

2004-01-28 Thread jcowan
Mark E. Shoulson scripsit: I was playing around with making my very own IPA keyboard, and I discovered to my surprise that Unicode has no Latin Small Theta (for IPA). We have LATIN SMALL LETTER ALPHA (U+0251), LATIN SMALL LETTER GAMMA (U+0263), LATIN SMALL LETTER EPSILON (U+025B, though

Re: Unicode forms for internal storage - BOCU-1 speed

2004-01-22 Thread jcowan
Markus Scherer scripsit: UTF-8 is useful because it's simple, and supported just about everywhere - but it's otherwise hardly optimal for anything. You entirely omit its principal advantage, sine qua non: it's maximally ASCII-compatible, using bytes 0x00 to 0x7F to represent ASCII characters
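The ASCII compatibility claimed in the reply above can be demonstrated with a short check. This is an illustrative Python sketch, not taken from the message; the function name `ascii_bytes_match` and the sample strings are hypothetical.

```python
# Illustrative only: in UTF-8, bytes 0x00..0x7F occur solely as the encodings
# of ASCII characters; every byte of a multi-byte sequence is 0x80 or above.

def ascii_bytes_match(text: str) -> bool:
    """True if the sub-0x80 bytes of the UTF-8 form are exactly the ASCII
    characters of `text`, in the same order."""
    data = text.encode("utf-8")
    low_bytes = bytes(b for b in data if b < 0x80)
    ascii_chars = "".join(c for c in text if ord(c) < 0x80)
    return low_bytes == ascii_chars.encode("ascii")

PURE_ASCII = "plain /usr/local path"
MIXED = "na\u00efve/\u6771\u4eac.txt"
```

A practical consequence is that byte-oriented tools searching for an ASCII delimiter such as `/` or a newline never get a false match inside a non-ASCII character.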

Re: Unicode forms for internal storage - BOCU-1 speed

2004-01-22 Thread jcowan
Philippe Verdy scripsit: Is the other competing UTF-9 from Jerome Abela this one: No. Abela's version preserves all of 00-7F and A0-FF, packing all the rest of Unicode into sequences beginning with any of 80-9F. -- XQuery Blueberry DOMJohn Cowan Entity parser

Re: Mongolian Unicoding (was Re: Cuneiform Free Variation Selectors)

2004-01-20 Thread jcowan
Andrew C. West scripsit: These are glyph variants of Phags-pa letters that are used with semantic distinctiveness in a single (but very important) text, _Menggu Ziyun_ , a 14th century rhyming dictionary of Chinese in which Chinese ideographs are listed by their Phags-pa spellings. In this

Useful Breton links

2004-01-16 Thread jcowan
Philippe Verdy scripsit: I'd really like to know more about Breton, but the fact is that although I am a native Breton and live there in Brittany, finding resources in this language is hard because it is not supported by public schools and even forbidden in all documents with some legal value.

Re: Klingon

2004-01-15 Thread jcowan
Michael Everson scripsit: At 14:53 +0100 2004-01-15, Chris Jacobs wrote: WHY THEN DISTRIBUTES THE KLI SUCH A BLATANTLY UNCONFORMANT FONT? yIjachQo'. vItlhob. {{{:-) Demonstrating once again that the One True Script for Klingon is Latin. -- John Cowan [EMAIL PROTECTED]

Re: Klingon

2004-01-15 Thread jcowan
Philippe Verdy scripsit: Not really: look at how uppercase letters are used: case mapping, which is quite safe in languages written with the Latin script, Oh, is it? I note quite a difference between a polish manufacturer and a Polish one. Indeed, in one case a Polish-language newspaper in

Re: Klingon

2004-01-15 Thread jcowan
Jon Hanna scripsit: A locale-sensitive title-case operations for the Klingon language would not produce Yijachqo'. Vitlhob. from yIjachQo'. vItlhob. any more than a locale-sensitive title-case operation for the Irish language would produce Nathair from nAthair although a deliberately fuzzy

Re: Klingon

2004-01-15 Thread jcowan
Philippe Verdy scripsit: Even in the case of Irish, the uppercase S denotes a distinct variant of s, which should better be noted with some diacritic, such as a hacek or cedilla... That is not the case. Imagine what happens when reading uppercased Irish book titles and the confusion it

Re: Klingon

2004-01-15 Thread jcowan
Mark E. Shoulson scripsit: It's incredibly useful, Philippe, to have some inkling of what you're talking about before you answer. What, and ruin his large and growing reputation as one of the masters of misinformation? He'll be challenging Abrigon Gusiq next. -- While staying with the

Re: MIME-aware recode or iconv?

2004-01-15 Thread jcowan
Frank da Cruz scripsit: Is anybody aware of a Unix stdin/stdout application (suitable for piping) that converts a text stream from one character encoding to another based on its MIME headers (as you would find, for example, in an email message)? The following script is not bulletproof, but
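The script mentioned in the reply above is cut off in this listing and is not reproduced here. As a separate illustration of the idea being asked about, the following Python sketch converts the first text part of a MIME message to UTF-8 using the charset declared in its headers. It is an assumption-laden sketch rather than a reconstruction: the names `body_as_utf8` and `main` are invented, and like the original it is not bulletproof.

```python
import email
import sys

def body_as_utf8(raw: bytes) -> bytes:
    """Decode the first text/* part of a MIME message using its declared
    charset and return that text re-encoded as UTF-8."""
    msg = email.message_from_bytes(raw)
    part = msg
    if msg.is_multipart():
        # Sketch only: take the first text part and ignore the rest.
        for candidate in msg.walk():
            if candidate.get_content_maintype() == "text":
                part = candidate
                break
    # decode=True undoes base64 or quoted-printable transfer encoding.
    payload = part.get_payload(decode=True) or b""
    charset = part.get_content_charset() or "us-ascii"
    try:
        text = payload.decode(charset, errors="replace")
    except LookupError:
        # Unknown charset label: fall back to a lossless byte mapping.
        text = payload.decode("latin-1")
    return text.encode("utf-8")

def main() -> None:
    """Filter form: read a message on stdin, write the UTF-8 body to stdout."""
    sys.stdout.buffer.write(body_as_utf8(sys.stdin.buffer.read()))
```

Calling `main()` from a script turns this into a stdin/stdout filter suitable for piping; headers are dropped and only one part is converted, so real mail would need more care.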

Re: Klingon

2004-01-15 Thread jcowan
Philippe Verdy scripsit: OK. Then don't say it's Breton: It may occur in any Latin language, either as a typo, or within specific technical usages such as variable names in a C or Java program where a space cannot be used to separate words; here also it's not the normal orthography part of the

Re: [OT] CJK - CJC (Re: Corea?)

2003-12-17 Thread jcowan
Alexander Savenkov scripsit: You mixed everything up, Phillippe. As we say in America, General Grant [1822-1885] Still Dead. -- Do what you will, John Cowan this Life's a Fiction[EMAIL PROTECTED] And is made up of
