Re: Unicode Search Engines

2002-02-21 Thread John Cowan
be put on the currently empty normalization-exception list, and will be decomposed as a result. In practice, no new precomposed characters are being added. -- John Cowan http://www.ccil.org/~cowan [EMAIL PROTECTED] To say that Bilbo's breath was taken away is no desc

Re: Unicode Search Engines

2002-02-20 Thread John Cowan
t present in Unicode). > I think > that there is a list of Unicode characters which are not allowed (forbidden? > deprecated?) in HTML specs. Correct. -- John Cowan http://www.ccil.org/~cowan [EMAIL PROTECTED] To say that Bilbo's breath was taken away is no desc

Re: Unicode Search Engines

2002-02-20 Thread John Cowan
TW, are you sure that it is NFKC? My understanding is that it was NFC + > some extra passages. It is NFC, with the additional proviso that n11n must be done even if characters appear as character references (&#x;) rather than actual characters. -- John Cowan http://www.c
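The point about character references can be sketched in a few lines of Python. This is only an illustration of the idea "expand references first, then check or apply NFC", not the W3C algorithm itself; the function name `normalize_markup_text` is hypothetical, and `html.unescape` also expands named entities, which goes beyond what the message describes.

```python
import html
import unicodedata


def normalize_markup_text(text: str) -> str:
    """Expand character references, then apply NFC to the expanded text."""
    expanded = html.unescape(text)  # e.g. "&#x301;" becomes U+0301
    return unicodedata.normalize("NFC", expanded)


# "e" followed by a reference to COMBINING ACUTE ACCENT composes to U+00E9,
# even though no combining character appears literally in the source text.
sample = "caf" + "e&#x301;"
composed = normalize_markup_text(sample)
```

Normalizing the raw markup alone would miss this case, because the reference is plain ASCII until it is expanded.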

Re: Unicode and end users

2002-02-19 Thread John Cowan
David Hopwood scripsit: > (I've just checked whether NTFS allows ill-formed UTF-16 filenames; it does, > at least on NT4.0, but you could reasonably treat that as an error.) NTFS filenames are UCS-2, not UTF-16, so "ill-formed" has no meaning. -- John Cowan

Re: Unicode and end users

2002-02-18 Thread John Cowan
isn't much of a problem.) The only way to ensure "no data loss" is to store file names as uninterpreted byte sequences, and forget about characters altogether. Which is what the kernel actually does: only 00 and 2F mean anything to it. -- John Cowan http://www.ccil.org/~cowan
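A minimal sketch of that byte-level view, assuming a Unix-style kernel as described: a file name component is any non-empty byte string that contains neither 00 nor 2F. The helper name `is_valid_path_component` is hypothetical, and real systems add limits (such as maximum length) that are ignored here.

```python
def is_valid_path_component(name: bytes) -> bool:
    """True if `name` could serve as a single file name component.

    Only NUL (0x00) and slash (0x2F) are excluded; every other byte value
    is accepted without any character interpretation.
    """
    return len(name) > 0 and b"\x00" not in name and b"/" not in name


# Bytes that are not valid UTF-8 are still acceptable names under this rule.
odd_but_legal = is_valid_path_component(b"\xff\xfe")
```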

Re: Off-Topic (Re: This spoofing and security thread)

2002-02-14 Thread John Cowan
German translation also has one "e" in it -- "Gib uns das tägliche Brot", and Perec apparently (the facts are not quite certain) told someone that there *was* a single "e" in the original -- he did not disclose its whereabouts. -- John Cowan <[EMAIL PROTECTED]>

"Unicode, Oh Unicode": lyrics

2002-02-14 Thread John Cowan
I can't make out the lyrics through my crappy speakers. Are they on line anywhere? -- John Cowan <[EMAIL PROTECTED]> http://www.reutershealth.com I amar prestar aen, han mathon ne nen,http://www.ccil.org/~cowan han mathon ne chae, a han noston ne 'wilith. --Galadriel, _LOTR:FOTR_

Re: Off-Topic (Re: This spoofing and security thread)

2002-02-13 Thread John Cowan
on"): "The Void", wherein the hero, Anton Voyl, becomes Anton Vowl. There are German and Danish translations too. -- John Cowan http://www.ccil.org/~cowan [EMAIL PROTECTED] To say that Bilbo's breath was taken away is no description at all. There are

Re: Unicode and Security

2002-02-10 Thread John Cowan
the same address. In that case comment NOW, TODAY or TOMORROW, to the IETF IDN lists so that they can extend the nameprep process to do such things. (They will be resistant at this stage, no doubt, but it's worth a try.) The Unicode list can't help you. -- John Cowan

Re: Arabic indexes

2002-02-10 Thread John Cowan
emselves appear in the overall LTR order; that is, the pages beginning with aleph/alif are closer to the (LTR) front of the book. -- John Cowan http://www.ccil.org/~cowan [EMAIL PROTECTED] To say that Bilbo's breath was taken away is no description at all. There are no

Re: Unicode and Security

2002-02-09 Thread John Cowan
t a dream come true. And there was much rejoicing. -- John Cowan http://www.ccil.org/~cowan [EMAIL PROTECTED] To say that Bilbo's breath was taken away is no description at all. There are no words left to express his staggerment, since Men changed the language that they le

Arabic indexes

2002-02-08 Thread John Cowan
them. In the Arabic index, however, the indexed words appear with a page number before (that is, to the right) of them. Is this regular practice in Arabic indexing, or some bizarre bidi glitch? -- John Cowan http://www.ccil.org/~cowan [EMAIL PROTECTED] To say that Bilbo

Re: Unicode and Security

2002-02-07 Thread John Cowan
Kenneth Whistler wrote: > The only widely-deployed alternative approach I know of is > ETSI GSM 03.38 (used in mobile telephony), A truly bizarre character set: it supports English, French, mainland Scandinavian languages, Italian, Spanish with Graves, and GREEK SHOUTING. -- John

Re: (no subject)

2002-02-05 Thread John Cowan
rticular keys on the keyboard without regard to what they are used for in one locale or another. ISO 9995 is the controlling standard. -- John Cowan <[EMAIL PROTECTED]> http://www.reutershealth.com I amar prestar aen, han mathon ne nen,http://www.ccil.org/~cowan han mathon ne chae, a han noston ne 'wilith. --Galadriel, _LOTR:FOTR_

Re: A few questions about decomposition, equvalence and rendering

2002-02-05 Thread John Cowan
Lukas Pietsch wrote: > U+1FC1 is spacing in all the fonts that I've seen. Oops. Of course it is. -- John Cowan <[EMAIL PROTECTED]> http://www.reutershealth.com I amar prestar aen, han mathon ne nen,http://www.ccil.org/~cowan han mathon ne chae, a han noston ne 'w

Re: A few questions about decomposition, equvalence and rendering

2002-02-05 Thread John Cowan
number of precomposed spacing diacritical marks for > Greek (e.g. U+1FC1). However, and unless I've missed something, with > the exception of U+0385, they do not have combining (non-spacing) > versions. What's the rationale here? Eh? U+1FC1 *is* nonspacing. The U+1Fxx one

Re: Unicode and Security

2002-02-04 Thread John Cowan
nd Islam respectively". The other order will make no sense at all. -- John Cowan http://www.ccil.org/~cowan [EMAIL PROTECTED] To say that Bilbo's breath was taken away is no description at all. There are no words left to express his staggerment, since Men changed th

Re: Unicode and Security

2002-02-04 Thread John Cowan
ppearance of such a text tells you whether it is basically in English or Arabic. Therefore, this appearance can have either of two encodings: AL-ARAB = the Arabs\nAL-ISLAM = Islam the Arabs = AL-ARAB\nIslam = AL-ISLAM -- John Cowan http://www.ccil.org/~cowan

Re: Unicode and Security

2002-02-03 Thread John Cowan
the merits of an objection when no actual examples of the problem are given. -- John Cowan http://www.ccil.org/~cowan [EMAIL PROTECTED] To say that Bilbo's breath was taken away is no description at all. There are no words left to express his staggerment, since Me

Re: Unicode and Security

2002-02-03 Thread John Cowan
les, because they begin with a strong RTL character. Similar things happen when you construct XML documents with RTL element names: the bidi rules, which are meant for true text and not computer-readable stuff, sometimes produce visually confusing results. -- John Cowan http://www.cci

Re: Unicode Search Engines

2002-01-30 Thread John Cowan
language books, the tonal mark can be printed "alone". One > solution might be to combine them with a "space", but at present, this > does not work always. When does it not? It is the standard Unicode thing to do. -- John Cowan <[EMAIL PROTECTED]> http://ww

Re: RFC: Extended Ethiopic

2002-01-29 Thread John Cowan
in the sense used, "repertoire" is the term used in character standards for a set of characters. Even clearer would be to say "Ethiopic Character Repertoire and Ordering", but that may not be necessary. -- John Cowan http://www.ccil.org/~cowan [EMAIL

Re: Unicode 3.2: BETA files updated

2002-01-25 Thread John Cowan
tion' of a character is less clear than saying 'variant'. The variation selector specifies the variation which will produce the variant. -- John Cowan http://www.ccil.org/~cowan [EMAIL PROTECTED] Please leave your values| Check your assumptions.

Re: Wade -> Pinyin transliteration (Unihan ?)

2002-01-24 Thread John Cowan
"Chiang Kai-Shek" isn't Wade-Giles; it isn't even Mandarin. -- John Cowan http://www.ccil.org/~cowan [EMAIL PROTECTED] Please leave your values| Check your assumptions. In fact, at the front desk. | check yo

Re: TC/SC mapping

2002-01-24 Thread John Cowan
t be adding a new TC (off a newly dug-up bone, perhaps) which simplifies to two different SCs. Fair enough. -- Not to perambulate || John Cowan <[EMAIL PROTECTED]> the corridors || http://www.reutershealth.com during the hours of repose || http://www.ccil.org/

Re: Tengwar vowel signs

2002-01-09 Thread John Cowan
in in particular is known to have used three different "modes", as the conventions are collectively called: abjad with vowels on following consonants, fully alphabetic, abjad with vowels on preceding consonants. (The alphabetic mode is analogous to the alphabetic mode of the Hebrew scrip

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2002-01-03 Thread John Cowan
ns of "colorings" to the fundamental consonant structure. Unicode tribal elders are invited to mention which of the two conflicting principles they reckon to be the more important. -- John Cowan http://www.ccil.org/~cowan [EMAIL PROTECTED] Please leave your values

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2002-01-03 Thread John Cowan
und to creating the Tengwar as we know them today: the Sarati of Ruumil. This is a TTB LTR abjad, like Mongolian. Vowel marks appearing to the left of the consonants are pronounced before them; those to the right, after them. http://user.tninet.se/~xof995c/sarati.htm -- John Cowan

Re: YO, ho ho, and a bottle of vodka

2001-10-30 Thread John Cowan
silon, and was used in borrowings from Greek. -- John Cowan http://www.ccil.org/~cowan [EMAIL PROTECTED] Please leave your values| Check your assumptions. In fact, at the front desk. | check your assumptions at the door.

Re: plane business

2001-10-01 Thread John Cowan
no clue. > BTW, it doesn’t make sense for every code position > ending in or FFFE to be a non character. It doesn't make much sense, but it is the rule anyway. > Why isn’t the same rule applied to the “hidden” non > characters, so that every code value ending in FDD0 to >
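For reference, the noncharacter rule under discussion can be written as a short predicate. This sketch relies on the Unicode Standard's definition rather than on the truncated text above: the last two code points of every plane (those ending in FFFE or FFFF) plus the contiguous range U+FDD0..U+FDEF. The function name `is_noncharacter` is hypothetical.

```python
def is_noncharacter(cp: int) -> bool:
    """True if `cp` is one of the Unicode noncharacter code points."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    # Low 16 bits equal to FFFE or FFFF, in any of the 17 planes.
    ends_in_fffe_or_ffff = (cp & 0xFFFE) == 0xFFFE
    # The 32 additional noncharacters in the Arabic Presentation Forms-A block.
    in_fdd0_block = 0xFDD0 <= cp <= 0xFDEF
    return ends_in_fffe_or_ffff or in_fdd0_block


# 17 planes x 2 code points, plus 32 in the FDD0 range.
total = sum(is_noncharacter(cp) for cp in range(0x110000))
```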

Re: PDUTR #26 posted

2001-09-19 Thread John Cowan
[EMAIL PROTECTED] scripsit: > Oops! One of two "Unicode 101" mistakes I made in the same day. Where was > my brain? Unicode Ate Your Brain, of course! (See my tutorial at Orlando this year.) -- John Cowan http://www.ccil.org/~cowan [EMAIL PROTECT

John Cowan is all right

2001-09-11 Thread John Cowan
Apologies for the cross-post. I and my family are all fine and safe at home, about 3 km from ground zero. There is no problem here except a touch of air pollution. -- John Cowan http://www.ccil.org/~cowan [EMAIL PROTECTED] Please leave your values| Check

Re: [OT] o-circumflex

2001-09-09 Thread John Cowan
Keld Jørn Simonsen scripsit: > Yes, foreigners call our cities many strange things:-) > København is called Köpenhamn, Copenhagen, Kobenhagen, Copenhague, > and many more. Helsingør is called Elsinore. None of which is as weird as Leghorn for Livorno (Italy). -- John Cowan

Re: japanese xml

2001-09-04 Thread John Cowan
's encoding, so an XML file can always be losslessly > expressed in any supported encoding. Only the character content can be represented losslessly, not the element type names, attribute names, enumerated attribute values, comments, processing instructions. -- John Cowan

Multilingual text (toy warning) in 22 languages

2001-07-23 Thread John Cowan
http://www.zompist.com/kinder.gif -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All things/to do/with sparks/galore --Douglas Hofstadter

Re: Is there Unicode mail out there?

2001-07-20 Thread John Cowan
Ayers, Mike scripsit: > Simple. Since "]]>" is used to mark the end of a CDATA section, and > since CDATA can contain anything, if you want to put the sequence "]]>" > INSIDE your CDATA, then you must escape the ">", or else it will END your > CDATA. That isn't what it says, and isn't tru

Ambiguous wording in XML Rec (was: Is there Unicode mail out there?)

2001-07-20 Thread John Cowan
marking the end of a CDATA > section.) Ah, I see. No, it's "must, for compatibility, be escaped using (">" or a character reference) when it appears [...]". If we insert "either" before "'>'", would that help? -- John Cowan
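Read that way, the rule concerns the sequence "]]>" in ordinary character data (outside a CDATA section), where the ">" has to be written as "&gt;" or as a character reference. A minimal sketch, assuming a serializer that writes plain character data; the function name `escape_char_data` is hypothetical:

```python
def escape_char_data(text: str) -> str:
    """Escape text for use as XML character data outside a CDATA section."""
    # "&" and "<" must always be escaped in character data.
    text = text.replace("&", "&amp;").replace("<", "&lt;")
    # ">" needs escaping only where it would complete the sequence "]]>".
    return text.replace("]]>", "]]&gt;")


escaped = escape_char_data("a]]>b")
```

A lone ">" elsewhere is left alone, which matches the "may" reading discussed in the thread: escaping it is allowed but not required.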

Re: Is there Unicode mail out there?

2001-07-20 Thread John Cowan
t string is not marking the end of a CDATA > section." Naah. Just because it says "may" doesn't mean anything: what "may" be done, also "may" be not done. You may use a numeric character reference for any legal character. -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All things/to do/with sparks/galore --Douglas Hofstadter

Re: Is there Unicode mail out there?

2001-07-17 Thread John Cowan
g the range of well-formed documents is an immediate loser, even if there is no plausible use for such documents. Just pretend you'll never get one of the legal non-characters. -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All things/t

Re: Eudora (was: Is there Unicode mail out there?)

2001-07-13 Thread John Cowan
Otto Stolz scripsit: > | Unicode 3.2 will have characters de- > | fined in the range above U+; Should have been "Unicode 3.1 already has characters" etc. -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All things/to do/wi

Re: Shavian

2001-07-06 Thread John Cowan
previous postings, and particularly the [EMAIL PROTECTED] mailing list, which has been running some 60 postings a month lately. -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All things/to do/with sparks/galore --Douglas Hofstadter

Re: Shavian

2001-07-06 Thread John Cowan
Shavian, and > Klingon, were invented by humans. Hear, hear. -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All things/to do/with sparks/galore --Douglas Hofstadter

Re: Shavian

2001-07-06 Thread John Cowan
, which means that the energy spent on invented scripts is nowise taken away from the energy that could be spent on obscure-but-real scripts. Would that it were otherwise. -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All things/to do/with sparks/galore --Douglas Hofstadter

Re: Unicode transliterations (and other operations)

2001-07-05 Thread John Cowan
y hand: no machines involved. > Meanwhile, when someone uses the terms in the > 'broader sense' (id est: dictionary definition), please let's not > chide them for it. Well, fine. But when someone is talking about physics, and uses "energy", "power", and

Re: Shavian (was: Re: UTF-17)

2001-07-04 Thread John Cowan
Everson is more than open-minded about such things. -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All things/to do/with sparks/galore --Douglas Hofstadter

Re: Unicode transliterations (and other operations)

2001-07-04 Thread John Cowan
is pronounced ['b@m@]. This was transcribed into (British) English as "Burma". Of course, to represent the pronunciation I am using an ASCII transliteration of IPA! -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All th

Re: New characters query

2001-07-03 Thread John Cowan
board squares, either. No. But don't the hexagrams appear in running text with hanzi? If so, then IMHO they should be encoded separately. -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All things/to do/with sparks/galore --Douglas Hofstadter

Re: DUDE-8, a compression proposal

2001-06-25 Thread John Cowan
Markus Scherer scripsit: > John Cowan wrote: > > 5. Emit all non-zero bytes. > > Do you mean "omit leading zeroes and emit following bytes"? You would not want to >emit all but a middle byte, right? Yes, of course *assumes paper bag* -- John Cowan

Re: UTF-17

2001-06-23 Thread John Cowan
xtension of UTF-32, so it should be called UTF-33. -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All things/to do/with sparks/galore --Douglas Hofstadter

DUDE-8, a compression proposal

2001-06-22 Thread John Cowan
ing of a DUDE-8 compression stream. DUDE-8 is simpler and simpler than SCSU, but doesn't allow recovery from garbles or even partial random or backwards access. -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All things/to do/with sparks/galo

UTF-8S: a modest proposal

2001-06-12 Thread John Cowan
such labels. By registering UTF-8S with IANA, it becomes a legitimate value of the encoding declaration in an XML document, for example, as well as suitable for use in MIME labels. -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All things/to do

Re: RECOMMENDATIONs( Term Asian is not used properly on Computers andNET)

2001-06-08 Thread John Cowan
est against the use of "innovation" for native Japanese words. It is the Chinese borrowings that are, historically, the innovations. -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All things/to do/with sparks/galore --Douglas Hofstadter

Re: RECOMMENDATIONs( Term Asian is not used properly on Computers andNET)

2001-06-06 Thread John Cowan
" as a political term and "han" as an ethnic one. -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All things/to do/with sparks/galore --Douglas Hofstadter

Re: UTF-8 syntax (RE: UTF-8S (was: Re: ISO vs Unicode UTF-8))

2001-06-05 Thread John Cowan
her requires nor recommends such. That may change, but with due regard for backward compatibility. The W3C (i.e. Misha, Martin, and me :-)) are thinking about it. Further deponent sayeth not. -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All

Re: Unicode under fire again

2001-06-05 Thread John Cowan
in their desire to do so. In short the author thinks that the Unicode and IRG people, to say nothing of WG2, were a) clueless, and b) not representative of the various CJK countries. Both statements are sufficiently refuted by the facts. -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All things/to do/with sparks/galore --Douglas Hofstadter

Re: Term Asian is not used properly on Computers and NET

2001-05-29 Thread John Cowan
David Gallardo scripsit: > Please excuse the unintended querulousness, but isn't the Greenwich meridian > merely the reification of this bias? Sure. Ditto the Gregorian calendar, and the decimal digit system, and many other international standards. But they *are* standards. --

Re: Term Asian is not used properly on Computers and NET

2001-05-29 Thread John Cowan
arbitrary meaning. This has certain annoying consequences, such as that Little Diomede (U.S.) in the Aleutian Islands is reckoned to be some tens of thousands of kilometers west of Big Diomede (Russia), despite the obvious fact that Little Diomede is about 30 km east of Big Diomede. -- John Cowan

Re: Unicode-based Cyrillic-Latin transliteration table

2001-05-28 Thread John Cowan
ence between letters is not a goal; it is > perfectly OK to transliterate U+0429 to "SHCH". I fear you have undertaken something hopeless. One could transliterate U+0429 as SHCH or S^C^ or any number of other things, but that is only appropriate for Russian. In Bulgarian, the only n

Re: UTF-8 signature in web and email

2001-05-25 Thread John Cowan
operating systems used the CRLF convention, which was inherited by CP/M and thence MS-DOS and Windows. -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All things/to do/with sparks/galore --Douglas Hofstadter

Re: A Europe of fonts

2001-05-25 Thread John Cowan
't forget the Hebrew script. Hebrew itself is an Asian language, but Yiddish is a European one. -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All things/to do/with sparks/galore --Douglas Hofstadter

Re: UTF-8 signature in web and email

2001-05-22 Thread John Cowan
to show "in- active" with hyphenation, whereas "in!active" at the end of a line must be "inactive", with wordwrap. -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All things/to do/with sparks/galore --Douglas Hofstadter

Re: [OT] bits and bytes

2001-05-17 Thread John Cowan
ith an extra bit unused. -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All things/to do/with sparks/galore --Douglas Hofstadter

Re: Tags and the Private Use Area

2001-04-28 Thread John Cowan
ent just to extend the 3.x UnicodeData and *Properties files. -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All things/to do/with sparks/galore --Douglas Hofstadter

Re: Tags and the Private Use Area

2001-04-25 Thread John Cowan
ell, I wanted to start the CSUR well in advance of actual usage, and encouraged everyone and his brother to register their scripts, *so that* code clash (at least within the conlang community) would never come into existence at all. -- John Cowan [EMAIL PROTECTED] On

Re: three characters?

2001-04-24 Thread John Cowan
characters that appear to involve ENCLOSING or OVERLAY combining characters don't get decompositions. -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All things/to do/with sparks/galore --Douglas Hofstadter

Re: Identifiers

2001-04-16 Thread John Cowan
RON instead of the "o" in your name, and nobody would be the wiser. This is just an exposure we are going to be stuck with from now on. -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All things/to do/with sparks/galore --Douglas Hofstadter

Re: [?UTF-8?][?UTF-8?]

2001-04-13 Thread John Cowan
ot to destroy it. It's inconsistent to treat this as a virtue of Unicode and a vice of CP 1252. -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All things/to do/with sparks/galore --Douglas Hofstadter

Re: [Fwd: Re: benefits of unicode]

2001-04-13 Thread John Cowan
. Otherwise you might as well claim that Windows 95 supports US-EBCDIC! -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All things/to do/with sparks/galore --Douglas Hofstadter

Re: benefits of unicode

2001-04-13 Thread John Cowan
d Inuktitut and perhaps Byzantine music, but its > a bit hard to establish that there are no other code pages. Oh sure. The point is that ISCII does exist, but Microsoft does not support it: therefore, if you are going to do Indic languages, you must have Unicode (for Microsoft environments,

Re: Too many Han (was: Re: How many noncharacters, unassigned and privatearea code points in 3.1.)

2001-04-09 Thread John Cowan
Marco Cimarosti scripsit: > John Cowan wrote: > > > What is the Chinese equivalent of the Jouyou Kanji, > > > anyway? > > > > There is none. > > Sorry to correct you: the education system in PRC China uses a list called > "Changyong Hanzi" (

Re: Too many Han (was: Re: How many noncharacters, unassigned and privatearea code points in 3.1.)

2001-04-07 Thread John Cowan
t all the songs ever written or something. > I doubt there will ever exist a complete list of all > Han characters. Well, no; new ones are being dug up pretty often. > What is the Chinese equivalent of the Jouyou Kanji, > anyway? There is none. -- John Cowan

[unicode] Re: x-bar character

2001-03-23 Thread John Cowan
ING MACRON the former, they are really both higher-level constructs. -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All things/to do/with sparks/galore --Douglas Hofstadter

[unicode] Re: x-bar character

2001-03-22 Thread John Cowan
Roozbeh Pournader scripsit: > I remember seeing an invisible times character somewhere, I think it was > in 3.2 tables. Would you look? Yes, at U+2062. But I think that is truly invisible, zero-width, and is used to render ab meaning a x b. -- John

[unicode] Re: x-bar character

2001-03-22 Thread John Cowan
y), no? In that case COMBINING MACRON would be better. Or should x-bar times y-bar be written with a THIN SPACE separating them? -- John Cowan [EMAIL PROTECTED] One art/there is/no less/no more/All things/to do/with sparks/galore --Douglas Hofstadter

Re: (SC22WG20.3355) Talking about cultures, see this

2001-03-15 Thread John Cowan
concentrée dans la région. There is a chapter on "Anomalies", such as Manhattan, Washington D.C., and Alaska. This region should perhaps have been included. -- There is / one art || John Cowan <[EMAIL PROTECTED]> no more / no less || http://www.reutershealth.com to do / all things || http://www.ccil.org/~cowan with art- / lessness \\ -- Piet Hein

Re: Unicode complaints

2001-03-15 Thread John Cowan
There is / one art || John Cowan <[EMAIL PROTECTED]> no more / no less || http://www.reutershealth.com to do / all things || http://www.ccil.org/~cowan with art- / lessness \\ -- Piet Hein

Re: Metafont to something real

2001-03-06 Thread John Cowan
Michael Everson wrote: > How do I convert Metafonts to outline fonts? This is a hard problem which Lin YawJen <[EMAIL PROTECTED]> claims to have solved; try contacting him. -- There is / one art || John Cowan <[EMAIL PROTECTED]> no more / no less

Re: Close enough

2001-03-02 Thread John Cowan
Spanner.) > > I believe you are referring to Larry Niven's ``Neutron star''. The > confusion probably arised from the publication of this short-story in > a collection edited by Asimov containing all Hugo's awards winners. This pun does not appear in the Niven st

Re: UTF-8, C1 controls, and UNIX

2001-03-01 Thread John Cowan
are what is sacrosanct in UTF-8. -- There is / one art || John Cowan <[EMAIL PROTECTED]> no more / no less || http://www.reutershealth.com to do / all things || http://www.ccil.org/~cowan with art- / lessness \\ -- Piet Hein

Re: Latin digraph characters (was: Re: Klingon silliness)

2001-02-27 Thread John Cowan
that titlecase was important in polytonic Greek, too. -- There is / one art || John Cowan <[EMAIL PROTECTED]> no more / no less || http://www.reutershealth.com to do / all things || http://www.ccil.org/~cowan with art- / lessness \\ -- Piet Hein

Re: Latin digraph characters (was: Re: Klingon silliness)

2001-02-27 Thread John Cowan
impression that > they were deprecated, though I could find no mention of that in TUS 3.0. They have compatibility decompositions, which is one kind of deprecation. -- There is / one art || John Cowan <[EMAIL PROTECTED]> no more / no less || http://www
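What a compatibility decomposition means in practice can be seen with Python's `unicodedata` module. The choice of U+01C6 LATIN SMALL LETTER DZ WITH CARON as the sample digraph is an assumption made for illustration; the message itself does not name a specific character.

```python
import unicodedata

digraph = "\u01c6"  # LATIN SMALL LETTER DZ WITH CARON, a single code point

# The character database tags the mapping as a compatibility decomposition.
mapping = unicodedata.decomposition(digraph)

# NFC keeps the digraph as is; NFKC replaces it with "d" + U+017E.
nfc = unicodedata.normalize("NFC", digraph)
nfkc = unicodedata.normalize("NFKC", digraph)
```

Because canonical normalization leaves such characters untouched, only the compatibility forms (NFKC, NFKD) fold them into their two-letter spellings.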

Re: Perception that Unicode is 16-bit (was: Re: Surrogate space i

2001-02-26 Thread John Cowan
letters. There is also a distressing and non-English tendency to use "alphabet" as a synonym for "alphabetic letter" (e.g. "English uses 26 alphabets") which I have seen on this mailing list and elsewhere. This is a barbarism. -- There is / one art || John

Re: Perception that Unicode is 16-bit (was: Re: Surrogate space i

2001-02-23 Thread John Cowan
going to "break through" that? -- There is / one art || John Cowan <[EMAIL PROTECTED]> no more / no less || http://www.reutershealth.com to do / all things || http://www.ccil.org/~cowan with art- / lessness \\ -- Piet Hein

Re: An Aburdly Brief Introduction to Unicode (was Re: Perception ...)

2001-02-23 Thread John Cowan
00C0) or two consecutive abstract characters? If the former, does U+0051 U+0300 also represent an abstract character? -- There is / one art || John Cowan <[EMAIL PROTECTED]> no more / no less || http://www.reutershealth.com to do / all things || http:

Re: [OT] What is DEL for?

2001-02-21 Thread John Cowan
holes. Therefore, punched paper tape systems ignored it. -- There is / one art || John Cowan <[EMAIL PROTECTED]> no more / no less || http://www.reutershealth.com to do / all things || http://www.ccil.org/~cowan with art- / lessness \\ -- Piet Hein

Re: [OT] What is DEL for?

2001-02-21 Thread John Cowan
ISO-2022 character-set invocation (LS3, etc). I did not think you could put a 96-element character set such as 8859-1-high-half (ESC 02/13 04/01) into G0, but I see by checking ISO 2022 (ECMA-35) that you can, overriding the usual meanings of SP and DEL. -- There is / one art || John C

Re: [OT] What is DEL for?

2001-02-21 Thread John Cowan
t file > containing "ABCDEF") and then saves it? Would the file's content be > changed to "ABDEF"? No. As part of a text file, DEL has no known significance on any system. -- There is / one art || John Cowan <[EMAIL PROTECTED]> no more / no less

Re: Unicode terminology (RE: Perception that Unicode is 16-bit (was:

2001-02-21 Thread John Cowan
f what he does not understand." -- Samuel Johnson -- There is / one art || John Cowan <[EMAIL PROTECTED]> no more / no less || http://www.reutershealth.com to do / all things || http://www.ccil.org/~cowan with art- / lessness \\ -- Piet Hein

Re: Perception that Unicode is 16-bit (was: Re: Surrogate space in

2001-02-20 Thread John Cowan
o strong to call it an error. -- There is / one art || John Cowan <[EMAIL PROTECTED]> no more / no less || http://www.reutershealth.com to do / all things || http://www.ccil.org/~cowan with art- / lessness \\ -- Piet Hein

Re: Bastardizations of UTF-8 (was: Re: [OT] Unicode-compatible SQL?)

2001-02-05 Thread John Cowan
I know of some interfaces that are writing non-Java > code and are forced to deal with specialized handling of the modified > UTF-8. > It would be great to inform them they can use standard UTF-8 library > routines. *chomp* No such luck Doc! -- There is / one art || John C

Re: Bastardizations of UTF-8 (was: Re: [OT] Unicode-compatible SQL?)

2001-02-05 Thread John Cowan
hat they are general-purpose UTF-8 read/write functions. At one point, this was a FAQ on this list. -- There is / one art || John Cowan <[EMAIL PROTECTED]> no more / no less || http://www.reutershealth.com to do / all things || http://www.ccil.org/~

Re: [ OT ] ISO 10646-x and Unicode 3.0 and Hebrew ?

2001-02-02 Thread John Cowan
, e.g., is there a main Hebrew block and then an > "Alphabetic Presentation Forms" section with other Hebrew items? Yes. -- There is / one art || John Cowan <[EMAIL PROTECTED]> no more / no less || http://www.reutershealth.com to do / all things

Unicode 3.1: incomplete tags considered harmless/useful

2001-01-31 Thread John Cowan
E TAG; in particular, a LANGUAGE TAG at the beginning of plain text that is meant to apply to the whole text (document, human-readable-string in protocols, etc.) should be unproblematic. As currently worded, editors SHOULD not permit such uses. -- There is / one art || John Cowan

Unicode 3.1: UTF-8

2001-01-31 Thread John Cowan
should be illegal. This can be achieved by replacing the U+1000..U+ row in Table 3.1B as follows: U+1000..U+CFFF E1..EC 80..BF 80..BF U+D000..U+D7FF ED 80..9F 80..BF [9F underscored] U+E000..U+ EE 80..BF 80..BF -- There is / one art || John Cowan
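The proposed rows amount to a rule that excludes the surrogate range U+D800..U+DFFF from three-byte UTF-8. The following sketch checks a single three-byte sequence against those rows. The E0 row (second byte A0..BF, which rules out overlong forms) and the EE..EF lead-byte range are taken from the Unicode Standard's well-formedness table, not from the truncated text above, and the function name `is_well_formed_3byte` is hypothetical.

```python
def is_well_formed_3byte(seq: bytes) -> bool:
    """Check one three-byte UTF-8 sequence against the well-formedness rows."""
    if len(seq) != 3:
        return False
    b1, b2, b3 = seq
    if not 0x80 <= b3 <= 0xBF:
        return False
    if b1 == 0xE0:
        return 0xA0 <= b2 <= 0xBF  # excludes overlong encodings
    if 0xE1 <= b1 <= 0xEC or 0xEE <= b1 <= 0xEF:
        return 0x80 <= b2 <= 0xBF
    if b1 == 0xED:
        return 0x80 <= b2 <= 0x9F  # stops at U+D7FF, so no surrogates
    return False


last_before_surrogates = is_well_formed_3byte(bytes.fromhex("ed9fbf"))  # U+D7FF
first_surrogate = is_well_formed_3byte(bytes.fromhex("eda080"))         # U+D800
```

The only row that differs from a naive "lead byte plus two continuation bytes" check, apart from E0, is ED, which is exactly where the surrogates would otherwise fall.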

Unicode 3.1: Georgian (editorial)

2001-01-31 Thread John Cowan
The 2nd paragraph in the revision of 7.5 appears to be a remnant. -- There is / one art || John Cowan <[EMAIL PROTECTED]> no more / no less || http://www.reutershealth.com to do / all things || http://www.ccil.org/~cowan with art- / le

Re: Property error for U+2118?

2001-01-31 Thread John Cowan
THEMATICAL SCRIPT CAPITAL P, which has its own codepoint U+1D4AB and category Lu. -- There is / one art || John Cowan <[EMAIL PROTECTED]> no more / no less || http://www.reutershealth.com to do / all things || http://www.ccil.org/~cowan with art- / lessness \\ -- Piet Hein

Re: extracting words

2001-01-29 Thread John Cowan
hether the document was English or Nootka before deciding whether to block "such". -- There is / one art || John Cowan <[EMAIL PROTECTED]> no more / no less || http://www.reutershealth.com to do / all things || http://www.ccil.org/~cowan with a

Re: Benefits of Unicode

2001-01-29 Thread John Cowan
re is / one art || John Cowan <[EMAIL PROTECTED]> no more / no less || http://www.reutershealth.com to do / all things || http://www.ccil.org/~cowan with art- / lessness \\ -- Piet Hein

Re: off-topic overview of standards

2001-01-26 Thread John Cowan
asked "Why didn't you use ARP?", he replied "What is ARP?" -- There is / one art || John Cowan <[EMAIL PROTECTED]> no more / no less || http://www.reutershealth.com to do / all things || http://www.ccil.org/~cowan with art- / lessness \\ -- Piet Hein

Re: Sarasvati

2001-01-25 Thread John Cowan
ne art || John Cowan <[EMAIL PROTECTED]> no more / no less || http://www.reutershealth.com to do / all things || http://www.ccil.org/~cowan with art- / lessness \\ -- Piet Hein
