Re: Terms for rotations
On 2014-11-10 5:32 PM, Whistler, Ken wrote: WIDDERSHINS is shorter than Aye, but laddie, then we'd have to use DEASIL for CLOCKWISE! And we'd have wiccans after us to spell it "DEOSIL" instead. ;-) And the Irish would no doubt insist on DEISEAL. -- Curtis Clark, PhD http://www.cpp.edu/~jcclark Professor Emeritus Biological Sciences +1 909 869 4140 Cal Poly Pomona, Pomona CA 91768 Please note new email address: jccl...@cpp.edu
Re: Encoding localizable sentences (was: RE: UTC Document Register Now Public)
On 2013-04-20 2:38 AM, William_J_G Overington wrote: I am thinking that the fact that I am not a linguist and that I am implicitly seeking the precision of mathematics and seeking provenance of a translation is perhaps the explanation of why I am thinking that localizable sentences is the way forward. There seems to be a fundamental mismatch deep in human culture of the way that mathematics works precisely yet that translation often conveys an impression of meaning that is not congruently exact. Perhaps that is a factor in all of this. Natural language lacks the logic and precision of mathematics, and is only unpredictably unambiguous. That's why lojban was invented. https://en.wikipedia.org/wiki/Lojban -- Curtis Clark http://www.csupomona.edu/~jcclark Biological Sciences +1 909 869 4140 Cal Poly Pomona, Pomona CA 91768
Re: If Unicode wants to show the Red Card to someone ...
On 2013-04-01 12:19 PM, Buck Golemon wrote: I'm sure that some cards are blue. Do they not also deserve a code point? This amounts to color prejudice. If we generalize the proposal, we should encode all the various colors of cards. Further, we could denormalize the "red card" symbol into combining characters for "red" and "card". This points to a general category of colored combining characters. The only remaining question is whether the colors should be represented in the HSL or HSV color space. Variation selectors! -- Curtis Clark http://www.csupomona.edu/~jcclark Biological Sciences +1 909 869 4140 Cal Poly Pomona, Pomona CA 91768
Re: Missing geometric shapes
On 2012-11-06 4:11 PM, Mark E. Shoulson wrote: That said, I do think it would be reasonable and appropriate to encode the half-stars. There's no such thing as "plain text" on paper (everything in print is formatted somehow), but star ratings are really common in tables that contain nothing else but text, etc. I guess the plain stars have more support, being dingbats in printers' cases since long ago, but these half-stars do feel "texty" to me, anyway. It's just a glyph variant of ½. :-) -- Curtis Clark http://www.csupomona.edu/~jcclark Biological Sciences +1 909 869 4140 Cal Poly Pomona, Pomona CA 91768
Re: Some QR codes each encoding one Unicode character
On 2012-10-08 6:09 AM, William_J_G Overington wrote: The idea is that hopefully in the future these QR codes could be scanned using a mobile telephone that has a QR reader and a suitable app so as to build up a sequence of Unicode characters, such as a telephone number, without the user needing to be able to push buttons. This could potentially be useful to some people with some disabilities. Perhaps it could also be useful to a person trying to make a telephone call from a mobile telephone in cold weather where he or she would prefer not to need to remove his or her gloves to make the call. Inasmuch as QR codes are already able to encode telephone numbers (at least in the US, and I have assumed in the rest of the world as well), I don't see any utility in this, since it would force the user to scan the codes in sequence, whereas a QR code containing a full number would only need to be scanned once, and in at least some phones and software, the scan would initiate dialing. -- Curtis Clark http://www.csupomona.edu/~jcclark Biological Sciences +1 909 869 4140 Cal Poly Pomona, Pomona CA 91768
Re: Mayan numerals
On 2012-08-23 3:58 PM, David Starner wrote: We must encode what people are currently using; stuff that no one is actually setting in type is of lesser interest. I have to ask myself, if these characters were already in use in mobile phones by a Japanese telecom, would people look at it differently? -- Curtis Clark http://www.csupomona.edu/~jcclark Biological Sciences +1 909 869 4140 Cal Poly Pomona, Pomona CA 91768
Re: Definition of character
On 7/13/2011 3:49 PM, Ken Whistler wrote: As Asmus was at pains to point out, the character encoders are essentially engaged in an operational discovery process regarding "what characters there are". That in turn leads to a definition by enumeration: What characters are consists of the list of what characters there are. Speaking as a biologist, that's a common way that biologists approach "life". Trying to define it is essentialist, and essentialism has been rejected by most modern biologists. -- Curtis Clark Cal Poly Pomona
Re: Writing a proposal for an unusual script: SignWriting
On 6/11/2010 2:08 PM, Mark E. Shoulson wrote: I should probably read up more about SignWriting before trying to answer, but (yes, that stupid "I should do X but...") I'm wondering if there might be ways to shoehorn things into Unicode's style anyway. One answer might be what was done for Western musical notation. Another is the Plane 1 math alphabets, which can be used in ordinary writing, but which are more common in formulas with a precise 2-dimensional layout: again, a higher-level protocol (in this case, MathML or TeX) is needed for full use. (One might even imagine a SignML.) -- Curtis Clark http://www.csupomona.edu/~jcclark/ Director, I&IT Web Development +1 909 979 6371 University Web Coordinator, Cal Poly Pomona
Re: Please RSVP... (was: US-ASCII)
on 2004-12-11 09:21 John Cowan wrote: It's been used as an English verb, adjective, and noun for 30-40 years and perhaps much longer: see below. Longer. I can attest from my youth in the 1950s that my parents considered it ordinary English usage, and in fact knew of its origin. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Web Coordinator, Cal Poly Pomona +1 909 979 6371 Professor, Biological Sciences +1 909 869 4062
Re: [increasingly OT--but it's Saturday night] Re: Unicode HTML, download
on 2004-11-22 00:17 fantasai wrote: Unless you are using XML tools to parse or generate the document, there is no advantage to using XHTML. Although I do use XHTML myself, I want to add that a valid HTML document can be unambiguously translated to XHTML by programs such as HTML Tidy (http://tidy.sourceforge.net/). -- Curtis Clark http://www.csupomona.edu/~jcclark/ Web Coordinator, Cal Poly Pomona +1 909 979 6371 Professor, Biological Sciences +1 909 869 4062
Re: [increasingly OT--but it's Saturday night] Re: Unicode HTML, download
on 2004-11-21 05:40 Stefan Persson wrote: I think M$ bases their guesses on what to download on the charsets used. If e.g. EUC-JP is used, you may be asked to download a Japanese fount, even if the page doesn't contain any Japanese characters at all, I can confirm this--I was working with a draft web site made by a student assistant, and when I went to view it, IE asked if I wanted to install a Korean font. Turns out that Dreamweaver on his Korean-localized system had set the encoding to euc-kr, even though there was nothing beyond us-ascii. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Web Coordinator, Cal Poly Pomona +1 909 979 6371 Professor, Biological Sciences +1 909 869 4062
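The euc-kr case above can be checked with a minimal sketch, assuming Python and a made-up page: bytes that are all below 0x80 decode to the same text under either label, so only the declared charset, not the content, could have prompted the font download. The sample markup is hypothetical.

```python
# Hypothetical page content: pure US-ASCII markup, as in the case described above.
page_bytes = b"<html><body><p>Hello, world.</p></body></html>"

# Every byte is below 0x80, so the content is plain US-ASCII.
is_ascii = all(b < 0x80 for b in page_bytes)

# EUC-KR keeps the ASCII range unchanged, so both labels decode to the same text.
same_text = page_bytes.decode("euc_kr") == page_bytes.decode("ascii")

print(is_ascii, same_text)
```

A browser that picks fonts from the declared charset alone has no way to tell such a page from one that really contains Korean.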
Re: Unicode HTML, download
on 2004-11-19 10:36 E. Keown wrote: If I add the proper Unicode-related HTML code at the top, will people get Unicode-compatible text when they download this? I recommend that you post the URL for a beta to this list, and we can all check it for you. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Web Coordinator, Cal Poly Pomona +1 909 979 6371 Professor, Biological Sciences +1 909 869 4062
Re: internationalization assumption
on 2004-09-29 20:45 Rick Cameron wrote: What characters needed by French are missing from Latin-1? I'd look it up, but I can't find the œuvre in which it is listed. :-) -- Curtis Clark http://www.csupomona.edu/~jcclark/ Web Coordinator, Cal Poly Pomona +1 909 979 6371 Professor, Biological Sciences +1 909 869 4062
Re: [OT] Decode Unicode!
on 2004-09-25 09:18 Philippe Verdy wrote: Not completely true. It is a bit less than 2 bits, due to its replication chains, and the presence of insertion points where cross-overs are possible. And ASCII is less than 7 bits when LZW is applied. But the effective code is a bit more complex than just the ATCG system, as some studies have demonstrated that the DNA alone has no function out of its substrate, whose nature influence its "decoding". ASCII of course has plenty of function outside its substrate. That's why I can rename a text file with the .exe extension, and it runs just fine. :-) -- Curtis Clark http://www.csupomona.edu/~jcclark/ Web Coordinator, Cal Poly Pomona +1 909 979 6371 Professor, Biological Sciences +1 909 869 4062
Re: Decode Unicode!
on 2004-09-24 10:05 Peter Constable did quote: After the DNA, the ASCII-Code is the most successful code on this planet. Things get more and more complex. DNA is a 2-bit code. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Web Coordinator, Cal Poly Pomona +1 909 979 6371 Professor, Biological Sciences +1 909 869 4062
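The "2-bit code" remark can be illustrated with a minimal sketch, assuming Python: four bases need two bits each, so a sequence of n bases packs into 2n bits. The particular bit assignment and the function names are arbitrary choices for illustration, not any standard.

```python
# Arbitrary, hypothetical bit assignment: four symbols fit in two bits each.
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def pack(seq: str) -> int:
    """Pack a DNA string into an integer, two bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | BASE_TO_BITS[base]
    return value


def unpack(value: int, length: int) -> str:
    """Recover a DNA string of the given length from its packed integer."""
    bases = []
    for _ in range(length):
        bases.append(BITS_TO_BASE[value & 0b11])
        value >>= 2
    return "".join(reversed(bases))


packed = pack("GATTACA")
print(packed.bit_length(), unpack(packed, 7))
```

The length has to be stored separately, because leading "A" bases pack to zero bits and would otherwise be lost.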
Re: FW: Looking for transcription or transliteration standards latin- >arabic
John Cowan wrote: The Unicode people are probably going to standardize on calling it "diacritic folding", by analogy to the term "case folding". Añd whàt shåll wë câll thë ãddítiõn of dîacrìtícs bÿ spämmêrs, ïñ ân ättëmpt tò fóòl spåm fîltêrs? -- Curtis Clark http://www.csupomona.edu/~jcclark/ Web Coordinator, Cal Poly Pomona +1 909 979 6371 Professor, Biological Sciences +1 909 869 4062
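One common way to approximate "diacritic folding" is sketched below, assuming Python's standard unicodedata module: decompose to NFD, then drop the combining marks. This is an illustration only, not the folding the Unicode people had in mind, and the function name is hypothetical.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Strip combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns a nonzero class for combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(fold_diacritics("Añd whàt shåll wë câll"))  # And what shall we call
```

Letters with no canonical decomposition, such as "ø" or "ł", pass through unchanged, so a spam filter relying on this alone would still miss them.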
Re: Looking for transcription or transliteration standards latin- >arabic
An interesting historical case is Istanbul, whose name comes from the Greek phrase "eis ten poli" ("to the city" -- first "e" is epsilon, and second "e" is eta). That phrase tended to be pronounced "istimboli" and with dissimilation "istamboli". So when the Turks changed the name from Constantinople to Istanbul, they simply changed from a name with an obvious Greek derivation to one with a nonobvious Greek derivation. This explanation seems rather Byzantine to me. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Web Coordinator, Cal Poly Pomona +1 909 979 6371 Professor, Biological Sciences +1 909 869 4062
Re: Revised Phoenician proposal
on 2004-06-06 13:50 D. Starner wrote: Let's be honest; the only people who matter in the least when discussing a script is the people who actually use it. And all evidence presented here indicates that scholars of Semitic languages--that is, the people who can actually read the stuff written in the script--are, not surprisingly, the majority users of Phoenician. (As a rhetorical device,) I have to say that I'm puzzled by this. All I've seemed to hear from Semiticists is that Phoenician is not a separate script. How, then, can these same Semiticists be the major users of something that doesn't exist? -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Phoenician, Fraktur etc
on 2004-05-27 08:13 Otto Stolz wrote: Fraktur characters are not designed to be used in all upper-case text as has been stated before, in this thread. Nobody is used to this sort of pseudo script; hence, nobody will read it fluently. This pseudo-script *is* used in southern California, by aficionados of low-rider automobiles, by some hispanic gangs, by some graffiti artists, and in some prison tattoos (the actual glyphs are more the "Old English" style of blackletter majuscule, rather than a more typical Fraktur). My guess is that it is *intended* not to be read fluently, as a mark of exclusivity. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Response to Everson Phoenician and why June 7?
on 2004-05-25 12:06 Dean Snyder wrote: 3) Palaeo-Hebrew scribal redactions to Jewish Hebrew manuscripts To me, this is a convincing reason to encode palaeo-Hebrew separately: it would allow such manuscripts to be encoded in plain text. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Response to Everson Phoenician and why June 7?
I want to start out by saying that, although I personally support encoding Phoenician, I really have no stake in the outcome one way or the other, and I'm only participating in the "thread from Hell" (as I believe James Kass called it) because its dynamics interest me. on 2004-05-24 03:08 Peter Kirk wrote: If so, please give us some evidence for another side. I have none. I would be astonished if there weren't another side, but far stranger things than that have happened, and I've been wrong before. But maybe it is something else. For example, if you read evolutionary biologists strongly defending Darwinian evolution against creationist theories, does that imply an internal squabble among evolutionary biologists and therefore that some support creationism? Or does it rather imply a closing of ranks against outsiders who are attacking their discipline, a defence against (what they perceive as) unscientific attacks from those who don't know what they are talking about? This is a very apt analogy. IMO, it is *precisely* because evolutionary biologists disagree about some fundamental issues in evolutionary biology (such as the relative importance and scope of natural selection) that they "close ranks". As a result, some of the arguments presented against creationism are caricatures. And the "they don't know what they are talking about" rhetoric is common on both sides. As one who has debated creationists, I know that there are other approaches, that work incrementally better in educating people whose minds are not already made up. But the Semiticists who have posted against the proposal on this group seem to be falling into the same closed-rank pattern that I know so well from my own field. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Fraktur yet again (was: Re: Response to Everson Phoenician and why June 7?)
on 2004-05-24 06:37 Dean Snyder wrote: Diascript is to script as dialect is to language - part of a continuum of relatively minor variations. A script is a diascript with an army? (To paraphrase a saying about dialects...) -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Response to Everson Phoenician and why June 7?
It's hard for me to believe that the world community of Semitic scholars is so small or monolithic that there aren't differences of opinion among them. I have been almost automatically suspicious of the posts by the Semiticists opposed to encoding Phoenician; after thirty-four years in academia (longer if I count that my father was a professor when I was a youth), I have yet to see a field in which there were not differences of opinion. Admittedly, all Semiticists might agree on the nature of Phoenician (just as all chemists accept the periodic table), but the fervor exhibited here makes me wonder what the issues *really* are. I am used to seeing such fervor among academics only when there has been some unstated agenda at work. And so I wonder, are we in this list reading only one side of an internal squabble among Semiticists? -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: ISO 15924 draft fixes
on 2004-05-21 07:10 Michael Everson wrote: I am not very happy about loading the plain-text in browsers. Three of my browsers load it and *all* the French UTF-8 is displayed in Latin 1. This *may* be a server issue. Iirc, the server has to be told to mark the text/plain MIME-type as UTF-8, since there are no tags (as there could be in HTML) and since browsers generally lack the heuristics to decide on coding of plain text. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
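The symptom described above can be reproduced with a minimal sketch, assuming Python and assuming that the browsers fell back to Latin-1 because the response carried no charset. The sample word is made up; the header value shown is the usual way a server labels plain text as UTF-8.

```python
# A hypothetical French word as the server would send it: UTF-8 bytes.
utf8_bytes = "français".encode("utf-8")

# What a browser shows if it assumes Latin-1 for unlabeled text/plain.
misread = utf8_bytes.decode("latin-1")

# What it shows once the charset is declared and honored.
correct = utf8_bytes.decode("utf-8")

# Header value a server would need to send for plain-text files.
CONTENT_TYPE = "text/plain; charset=utf-8"

print(misread)  # franÃ§ais
print(correct)  # français
```

How the header gets set depends on the server software, so that part is left out here.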
Re: ISO 15924 codes for ConScript
on 2004-05-20 07:52 Peter Constable wrote: One person wrote, regarding Qaak for Klingon: It's a shame you didn't pick something that could be pronounced in tlhIngan Hol, perhaps Qaap for pIqaD. Identifiers are identifiers, not words. That's why I sent my message to Doug off-list; it was a joke. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: OT [was TR35]
on 2004-05-11 23:14 Jony Rosenne wrote: How does the Mozilla calendar handle time zone changes - does it store all time as GMT (UTC) and mess them all up when I change the time zone, or does it store them as local time and mess them up when communicating with people in other time zones? AFAICT, the latter. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: OT [was TR35]
on 2004-05-11 10:49 Jony Rosenne wrote: Unfortunately they do not support Hebrew well enough. I did use Eudora before Hebrew e-mail was common, i.e. before Microsoft implemented the Unicode bidi algorithm. Mozilla 1.6 has a localization for Hebrew (http://www.mozilla.org/projects/l10n/mlp_status.html#moz_1.6), and afaict supports bidi (all the Hebrew on this list and on web pages comes out in the right direction). There is a calendar add-in (http://www.mozilla.org/projects/calendar/) that is quite nice (I like it in many ways better than Outlook, and it reads iCalendar files), and the email client does UTF, message threading, and Bayesian spam filtering. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Phoenician
on 2004-05-07 07:47 Peter Constable wrote: Have you not heard that yours is not the only scholarly community? To speak as though there is only one, or that all have the same needs as yours, seems a bit arrogant. Sadly, the hegemonist view is not restricted among scholars to these Semiticists; in systematic biology, a group wanted to adapt the basic classification scheme of organisms to better fit current science. They were resisted, and began constructing their own classification scheme. The hegemonists who had resisted their making the classical scheme useful for their needs have resisted even more their creation of a new scheme. Sadly, "my way or the highway" has always been too common in academia. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Arid Canaanite Wasteland (was: Re: New contribution)
on 2004-05-02 16:26 Michael Everson wrote: Children learning about the history of their alphabets I've been following this discussion off and on, and figured I didn't have much to add, but I can relate to this remark. I was a child, once, and I had a fascination with scripts and languages that has continued to the present day. Although I have never been more than a dilettante in these fields, I'd like to think that what knowledge I have has positively influenced my long career as a botanist and my more recent career as a web developer. In an eighth-grade English class (I was around 14 years old), I wrote a short story about the ancient inhabitants of Palestine. (It was intended to be humorous, in the ways of 14-year-old boys.) In that story I included fictional place names written in what would fit into Michael's Phoenician block (I believe they were some sort of ancient Canaanite, if not Phoenician sensu stricto). I never progressed in my knowledge of Semitic scripts until a couple of years ago, when my daughter wanted a tattoo that said "peace" in Aramaic, and I researched enough to realize that Estrangelo Edessa wasn't likely to have been used to write Aramaic in the time of Jesus. And with these bits of knowledge, I have been able to follow the outlines of the discussion. If Unicode Phoenician had been around when I was 14, I would have used it. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: [OT] Freedom and organization (was RE: help needed with adding new character)
on 2004-03-19 02:04 Marco Cimarosti wrote: Anarchism is against imposing forms of organization, non against organization itself. And standards are quite like the useful side of laws (the organization) without the harmful side (the imposition), so they should be welcome to anarchists. The sort of anarchy that I am familiar with involves decision by consensus. A really good example from my academic discipline is the International Code of Botanical Nomenclature. There are conventions held every six years in which the participants vote on changes to the code, so in that sense it might seem a democracy, but the code only works because there is a consensus among botanical taxonomists to use it. No governments enforce it. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Investigating: LATIN CAPITAL LETTER J WITH DOT ABOVE
on 2004-03-18 01:05 Pavel Adamek wrote: So it would be convenient to have an empty diacritical mark, (COMBINING NOTHING ABOVE) which would cause the "soft" dot of i or j to disappear, without adding anything else. Assuming this could be added to any other character, my mind boggles at the implications, both for decomposition and for rendering. :-) -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
OT? Languages with letters that always take diacriticals
Are there any languages that use letters with diacriticals, but *never* use the base letter without diacriticals? A made-up example to explain what made me think of it: Let's say a language has "ö", to represent the same sound that it does in German, but not "o", because the language lacks the sound represented by that letter in common European languages (the alternative being to use "o" to represent the "ö" sound). -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Astrological symbols
on 2004-02-05 15:29 Ernest Cline wrote: [1] centaur - an asteroid/comet with a perihelion located between the orbits of Jupiter and Neptune whose orbit crosses that of one or more of Saturn, Uranus, or Neptune. The first known and largest of these objects is Chiron discovered 1977. Observation has since shown that Chiron is a large comet like body (150 - 200 km in diameter.) Pluto fits that definition. As to the proposal, as Michael might say, show examples from printed works, especially of the signs used in text. Are any of these also attested as classical symbols for the respective God/desses? -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Phonology [was: interesting SIL-document]
on 2004-02-05 03:54 John Cowan wrote: Indeed. In fact, the first fuccative-insertion on record, laughably tame by today's standards, is an American's: William Randolph Hearst said of one of his reporters: "Tell Coates I said he is too inde-goddam-pendent!" I first heard of the expletive infix in the context of the "familiar speech" of the US Navy. My experience at Navy bases in later years suggests that the forms may have become ar-f***ing-chaic. (Did I divide that in the middle of a metrical foot?) -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Combining down-pointing triangle above?
on 2004-01-18 17:47 Doug Ewell wrote: Is this just a fancified hacek, or a potential candidate for proposal? Evidently a hacek: http://www.chumashlanguage.com/vocab/vocab-01-fr.html -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Detecting encoding in Plain text
on 2004-01-12 08:57 Tom Emerson wrote: You also have to deal with oddities of language: I tried one open source implementation of the Cavnar and Trenkle algorithm THAT CLAIMED THAT SHOUTED ENGLISH WAS ACTUALLY CZECH. SHOUTED AT CLOSE RANGE (~ 1 CM FROM THE EAR) AND WITH A CZECH ACCENT, IT SOUNDS PRETTY MUCH THE SAME. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Chinese rod numerals
on 2004-01-12 17:45 Kenneth Whistler wrote: The obvious precedent for a set of numerals like this are the Aegean numerals, U+10107..U+10118, which are also quite obviously derived from layouts of tallying sticks, and which have a units set 1-9 and a tens set 10-90 oriented at right angles to the 1-9 set. But the Aegean system used other counters for 100 and up, so there is not a problem of alternating values. And historical examples of the Aegean numbers exist *primarily* (if not exclusively?) in written form, on clay tablets. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
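The layout of the Aegean range mentioned above can be inspected with a minimal sketch, assuming Python's standard unicodedata module and a Python build whose Unicode data includes the Aegean Numbers block. The variable name is illustrative.

```python
import unicodedata

# U+10107..U+10118: a units set (1-9) followed by a tens set (10-90).
values = []
for cp in range(0x10107, 0x10119):
    ch = chr(cp)
    value = int(unicodedata.numeric(ch))
    values.append(value)
    print(f"U+{cp:05X} {unicodedata.name(ch)} = {value}")
```

Because each sign carries a fixed numeric value, there is no positional alternation to resolve, which is the contrast with rod numerals drawn in the message.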
Re: Latin letter GHA or Latin letter IO ?
on 2004-01-03 14:23 Philippe Verdy wrote: The problem we were discussing here is that only the informative and non-normative properties are giving the appropriate identity of the encoded letters, but NONE of the existing normative properties... It seems to me that a little reflection would reveal that it is easiest to make properties normative when they are *not* informative, beyond whatever it is that they uniquely specify. There were very good reasons to make both code points and character names normative; people assume that the latter are informative, and that's where we get into trouble. No one argues about the fact that Ƣ is U+01A2; only its name is a "problem". -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
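The point about normative names can be seen in a minimal sketch, assuming Python's standard unicodedata module: the code point is an unambiguous identifier, while the name it returns is the immutable one that this thread finds misleading for the letter gha.

```python
import unicodedata

# The code point is the uncontroversial identifier for the letter.
gha = chr(0x01A2)

# The normative character name, which cannot be changed once published.
name = unicodedata.name(gha)

print(f"U+{ord(gha):04X} {name}")  # U+01A2 LATIN CAPITAL LETTER OI
```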
Re: Pre-1923 characters?
on 2004-01-03 14:40 Philippe Verdy wrote: I have never seen you accepting compromises and I doubt of your negotiation faculties. A lot can be said about Michael, but it is inaccurate to say that he never changes his mind. One of the things that I have come to value over the years in his "pronouncements" is that they invariably reflect careful consideration, whether I agree with them or not. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: German 0364 COMBINING LATIN SMALL LETTER E
on 2003-12-28 16:36 Gerd Schumacher wrote: In German the supralinear e may be used as a variation of the diaeresis above a, o, and u. Though it is old fashioned, indeed, it is still understandable, and might be used for invitation cards and the like. I don’t know a modern font with it, http://www.myfonts.com/fonts/urw/breitkopf-fraktur-d/regular/charmap.html -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: why Aramaic now lumpers and splitters
on 2003-12-24 12:29 Elaine Keown wrote: It appears to me that script experts may resemble experts in dialects/languages: there are lumpers and splitters Following up on my post about wariness to unify being correct in first principles: My day job uses my training as a plant taxonomist, a field in which there are also lumpers and splitters. I am a lumper, but, as you say, a "thinking lumper". If I have any doubts about whether two species of plant are separate, I maintain them as separate, in part as a challenge to future taxonomists (or me) to demonstrate that they are truly the same. Lumped species are "under the radar"--nonspecialists looking at them may never be aware of the disparate elements that make them up, and even specialists may not think to revisit them. It is ultimately easier to lump than to split (with plants, and I assume with languages and scripts as well), so those of us who are lumpers have a greater responsibility--it "comes with the territory". -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: why Aramaic now
on 2003-12-24 12:02 Elaine Keown wrote: Some of the sets of symbols I found---which I simply assumed could be added to "Hebrew"--are innately controversial because of the Roadmap. I've been following these threads with interest, as an uninformed bystander. Michael's unwillingness to unify in haste seems correct in first principles, independent of his expertise and experience. But you have presented the first cogent (to me :-) argument for why delaying the decision is a problem. One thing I've learned on this list is that Unicode done well respects no short-term convenience. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Aramaic unification and information retrieval
on 2003-12-24 02:39 [EMAIL PROTECTED] wrote: The relationship between mysticism/occult studies and language studies should definitely go in only one way. Otherwise we'd end up encoding one character for "true name of God" and fill the rest of the codespace with variant selectors to apply to it :) Um, isn't U+ the true name of God, and all the rest of Unicode variant selectors? -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: [OT] Keyboards (was: American English translation of character names)
on 2003-12-19 00:05 Arcane Jill wrote: The left and right keys are functionally identical anyway, and the key is functionally identical to a right mouse click. It's handy, though, for people who cannot use a mouse. (Okay, so is used for "screen capture to clipboard" but who needs a button for that?). I use it all the time. Saves buying screen capture software. They could have just used, for example, for and for , without then having to scrunch up the and keys and shrink the space bar. With this I agree, and the keys could have retained their meaning in DOS windows. Perhaps the older versions of Windows weren't up to the task. Vaguely ob Unicode, SC Unipad has keyboard layouts for many languages, but has the euro at Alt-Gr w on the "English (British)" keyboard. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Stability of scientific names, was Stability of WG2
on 2003-12-16 15:27 Peter Kirk wrote: I'm no expert on this... I am. :-) but I thought that species could be transferred from genus to genus as knowledge advances. As John pointed out, the epithet stays the same. And presumably obvious spelling mistakes are corrected (contrast "FHTORA" in U+1D0C5), or are you saying that if the first publication had "Brontosuarus" as a typo this error would remain for ever? There are errors and then there are errors. Some are correctable, some are not, and botanists and zoologists have different rules about this. An example that's not entirely OT: There was a Russian physician with the last name ÐÑÑÐÐÑ - a "cyrillicization" of his German family name Escholtz. His name was commonly written then and today in German form as Johann Friedrich Eschscholtz, the schsch reduplication being a reflection of the Cyrillic spelling. He Latinized (language, not alphabet) his name (a common occurrence among naturalists) to Eschscholzius. He was physician to the Kotzebue expedition from Russia to (among other places) California; the ship's naturalist was Adelbert von Chamisso (author of _Peter Schlemiel_). Chamisso and Eschscholtz were fast friends (and some accounts imply that they were lovers). Chamisso named several new species of organisms for his friend, including the California poppy. In the original description of the California poppy, he named it _Eschscholzia californica_, making the genus name the feminine form of Eschscholtz's Latinized name (this is a common occurrence). In the caption of the illustration of the plant, however, it was spelled _Eschholzia_. But for over a century afterwards, most botanists and horticulturists spelled the genus _Eschscholtzia_, assuming that both spellings in the original description were typographic errors. 
But the rules of nomenclature are very specific about which types of errors can be corrected, and, since there is no obvious "correct" spelling of Escholtz, *the spelling that accompanied the original description must stand*, and the plant is correctly _Eschscholzia californica_. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Stability of WG2
on 2003-12-16 02:53 Peter Kirk wrote: Even if this is a millennial reign of peace and prosperity, processes of language change will not stop. A measure of comparison is the system of biological nomenclature, which has maintained stability of names in the face of increasing knowledge of organisms over a period of a quarter of a millennium. There are no ISO standards for scientific names--the system has succeeded through consensus, by biologists agreeing that a stable system is worth the trade of quite a bit of individualism (not to mention the periodic and sometimes raucous conventions when the rules are modified). -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: [OT reversing letters to avoid offence] Re: [Fwd: Re: Swastika to be banned by Microsoft?]
on 2003-12-15 11:24 Doug Ewell wrote: BTW, the first person to suggest using Variation Selectors to encode reversed K's and B's will get bonked in the head with a foam bat. Um, Doug, that would be you, for bringing it to our attention. Consider yourself bonked. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Supporting the Unicode Project
on 2003-12-04 09:43 Edward H. Trager wrote: Actually, I am a bioinformatics programmer, and to date I have given away my programs away for free. The main reason I give them away for free is fairly simple: the market of genetics researchers potentially interested in buying them is too small, so I would not make that much money trying to sell them. To muddy the waters further, vendors who make gel analysis software that is involved in generating the basic data of genomics and proteomics charge huge amounts of money, that labs regularly pay, because some types (and venues) of biomedical research are well-funded. The issue of software cost is a complex one, involving both business and non-business decisions (and especially the latter in one-person operations). -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: MS Windows and Unicode 4.0 ?
on 2003-12-04 07:49 Stefan Persson wrote: Eudora doesn't support Unicode on *any* OS, right? Indeed. I and I'm sure many others on this list sent feedback to Qualcomm at Michael's behest, but a fat lot of good it did. At least Windows users can copy and paste into SC Unipad to get an idea of what's going on, but my solution was to switch to another client for this and other email lists. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: MS Windows and Unicode 4.0 ?
on 2003-12-03 11:28 Edward H. Trager wrote: WHY NOT just *give* away the Linear B, Ogham, I give away Linear B. It's an incomplete set, and has not been vetted by experts, as has Michael's. It's worth what you pay for it. I and Michael both give away Ogham. His has the glyphs in the proper Unicode slots; mine is a font hack (I have the Unicode version sitting on my hard disk, tapping its foot waiting to get out). And making fonts isn't my day job. With Michael, you get scholarly expertise, professional care, and someone to complain to when things are wrong. With my fonts, you get what you pay for. I'd be happy, short-term, if Michael gave away more stuff. But until he becomes independently wealthy, we all lose long-term if he decides he can no longer afford to devote the time to script encoding. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: MS Windows and Unicode 4.0 ?
on 2003-12-03 02:09 Arcane Jill wrote: I don't believe that anyone could rightly argue that, for instance, musical symbols were "esoteric". They're a standard part of my culture. And yet, I still can't put a treble clef in my document using the standard Windows fonts, and nor can I put it on a web site and believe that it will be viewed correctly by most western viewers. Um, as an off-and-on musician, I tend to expect a treble clef on a staff, and I don't really expect my OS to handle musical notation. I suppose if I wanted to say "here is what a treble clef looks like" on a web site, I would have to use a graphic. I'd have to do the same thing to show what a rose looks like. (And if I wanted to demonstrate its smell, I'd be out of luck.) By exactly the same reasoning, I expect all the math symbols to be there too, including mathematical alphanumeric symbols. This is not a strange or exotic requirement, it's just a part of living in this western culture and wanting to use they symbols of my culture. The bulk of math alphanumerics can be represented with markup, using standard fonts. Sure, it's no good for interchange, but viewers of a web page can *see* italic "a" (assuming they can see) with a simple <i>a</i>. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
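The markup-versus-code-point trade-off in this message can be sketched in a few lines of Python. This is only an illustration, not anything from the thread: it compares the Plane 1 character U+1D44E with an ordinary "a" wrapped in an HTML italic tag, and the helper name `describe` is made up for the example.

```python
import unicodedata

# Plain-text route: the Plane 1 character MATHEMATICAL ITALIC SMALL A.
math_italic_a = "\U0001D44E"

# Markup route: an ordinary "a" inside an HTML italic element.
markup_italic_a = "<i>a</i>"


def describe(ch):
    """Return 'U+XXXX NAME' for a single character (hypothetical helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# The dedicated code point carries the italic distinction in plain text...
print(describe(math_italic_a))

# ...but compatibility normalization folds it back to a plain "a",
# which is all the markup route ever stored in the first place.
folded = unicodedata.normalize("NFKC", math_italic_a)
print(folded, markup_italic_a)
```

The markup version needs only a standard font with an italic style; the code-point version needs a font that actually covers Plane 1, which is the practical difficulty being discussed.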
Re: Hexadecimal digits?
on 2003-11-10 07:28 Jim Allan wrote: And the only way you can tell 7 decimal from 7 hex is by giving 7 to different code points, that is File777 in hex should sort after File999 in decimal. The CSS guru Eric Meyer noted that Ohio license plates translate as hex RGB colors, mostly purple: http://www.meyerweb.com/eric/thoughts/2002a.html#t20020228 -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
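A small Python sketch of the point about bases, under the assumption that the base is supplied out of band (the digits themselves cannot say which base they are in). The function names and the sample plate value "80208F" are invented for illustration.

```python
def parse_digits(digits, base):
    """Interpret a digit string under an explicitly stated base."""
    return int(digits, base)


def hex_to_rgb(hex6):
    """Split a six-digit hex string into (red, green, blue) components."""
    value = int(hex6, 16)
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF


hex_777 = parse_digits("777", 16)
dec_999 = parse_digits("999", 10)

# Numerically, hex 777 is larger than decimal 999.
print(hex_777, dec_999, hex_777 > dec_999)   # 1911 999 True

# A plain code-point sort cannot know that, and puts File777 first.
print(sorted(["File999", "File777"]))        # ['File777', 'File999']

# A made-up six-character plate read as an RGB colour.
print(hex_to_rgb("80208F"))                  # (128, 32, 143)
```

Nothing in the characters distinguishes the two readings of "777"; the interpretation comes from the caller, which is the crux of the argument over separate hex digits.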
Re: Berber/Tifinagh
on 2003-11-10 04:17 Michael Everson wrote: It still remains the case that Theban "orthography" is basically English, that is, it is Latin with funny glyphs. Why isn't Latin Serbian just Cyrillic Serbian with funny glyphs? I'm not trying to be intentionally dense here; Theban English and Serbian are different in many ways. But are there truly no edge cases, where whim is the only deciding factor? And how does whim turn into policy? -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Berber/Tifinagh
on 2003-11-09 17:07 John Hudson wrote: I've given a lot of thought to transliteration and transcription at the glyph level: Which comes back to the issue of ciphers. It would seem to me that glyph-level transliteration is the accepted behavior for ciphers (else we would actually have to address whether such things as Theban should be encoded, and Braille would have been a non-issue from the get-go). What determines whether a script is a cipher of another? -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Clarification, please, was Re: Berber/Tifinagh
on 2003-11-09 10:41 Michael Everson wrote: I am appalled. I thought you understood something about Unicode, Philippe. At this point, I'm a bit puzzled about the circumstances in which an alphabet is a cipher of another, and when it isn't. In an offlist conversation, you, I, and others seemed to arrive at the consensus that the Theban "magickal script" was a cipher of Latin. And many years ago, you raised the question of whether Etruscan was a cipher of either Latin or Greek (as we both know now, it isn't). I assumed that the criteria were (1) the scripts can be used interchangeably to write a single language, and (2) there is a one-to-one correspondence between their glyphs. If Philippe were correct about the one-to-one correspondence, wouldn't the Latin glyphs be a cipher of the Tifinagh? And thus a glyph choice rather than a script choice? Let's say that the Klingons prevailed, and pIqaD were encoded. There is a one-to-one correspondence between the letters of pIqaD and single or groups of Latin letters (supposedly). Could one not make a pIqaD font in which the glyphs looked like the Latin letters or groups? I'm assuming I'm missing something here, and would like to know what it is. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
OT: Inuktitut dictionary?
A friend is looking for a vocabulary-rich English-Inuktitut dictionary as a source for names for malamute dogs. He is a scholar in another field (astrobiology), and so is concerned with accuracy. I'm sure he would gladly learn the syllabics to the extent necessary. He has access to university interlibrary loan if the best dictionary is out of print. And I imagine he would be fine with Inupiaq, too. Please email offlist if you have any suggestions. Thanks! -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: PUA
on 2003-10-19 19:34 Chris Jacobs wrote: One problem is that there seems to be no way in plaintext unicode to specify who is in charge of a particular interpretation of the PUA. At last! Another use for Plane 14! :-) -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: About that alphabetician...
Of course, any Unicode character can be expressed as an XML character reference (e.g. &#x92E; for म) in any web page encoding, even US-ASCII. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
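As a rough illustration of numeric character references, here is a Python sketch using only the standard library. It uses the Devanagari letter म (U+092E) from the message; the variable names are arbitrary.

```python
import html

text = "म"  # DEVANAGARI LETTER MA, U+092E

# Encoding to US-ASCII with character references instead of failing:
ascii_bytes = text.encode("ascii", errors="xmlcharrefreplace")
print(ascii_bytes)  # b'&#2350;'

# Decimal and hexadecimal references name the same character.
from_decimal = html.unescape("&#2350;")
from_hex = html.unescape("&#x92E;")
print(from_decimal == from_hex == text)  # True
```

The resulting bytes are pure ASCII, so the page can be served in any ASCII-compatible encoding and still carry the character.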
Re: W3C Objects To Royalties On ISO Country Codes
on 2003-09-21 10:38 Michael Everson wrote: Golly, does that mean they'll pay people like me if they get royalties from people using ISO/IEC 10646? The current economic paradigm: "Steal. Sell." -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Hexadecimal never again
on 2003-08-20 11:03 Rick McGowan wrote: Hex doesn't have an independent existence out in non-computing culture for, e.g., signs in the market place or monetary values. Caviar, 10kg, €FEED -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: [Way OT] Beer measurements (was: Re: Handwritten EURO sign)
on 2003-08-19 04:18 Pim Blokland wrote: Ha! Fat chance! You might as well suggest we abolish the yard altogether! Then, how would I have a yard sale? (or even a yard sail?) -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: [Way OT] Beer measurements (was: Re: Handwritten EURO sign)
on 2003-08-19 02:51 Marco Cimarosti wrote: TOILETS ---> 50 yds (45.72 m) To be precise, it should have said 50.00 yards (or perhaps 46 m). -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)
on 2003-08-06 15:24 Doug Ewell wrote: I'm not a typographer (intelligent or otherwise), but I'm having a tough time seeing how Section 2.10 *requires* fonts and rendering engines to give a space-plus-combining-diacritic combination the exact minimum width of the diacritic alone, or to leave equal space before and after such a combination. All I think it is saying is that, for example, the combination i-plus-tilde may be wider than i alone, because tilde is wider than i. Considering that one approach is to use OpenType to map a letter plus diacritical to a single glyph, an obvious solution would be to include space + diacritical combos in that table. An important font issue, but a font issue nonetheless. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)
on 2003-08-05 15:31 Peter Kirk wrote: Thank you, Mark. This helps to clarify things, but still doesn't explicitly answer my question of how to encode "a sentence like "In this language the diacritic ^ may appear above the letters ...", but instead of ^ I want to use a combining character" and want to display exactly one space before the combining character - do I encode two spaces or one? In this language the diacritic ̊ may appear above the letters... Two spaces, at least in Thunderbird Mail. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: I am not in India II
Michael Everson wrote: People who believe that e-mails with a particular name in the From field must come from that very person can be called, ehem, naiive. That's an interesting way of writing the diaeresis on naïve, Adam. :-) It's a good thing it's soft-dotted! Or perhaps he meant naijve. :-) -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Aramaic, Samaritan, Phoenician
Michael Everson wrote: Particularly as they regularly write text in both Coptic and Greek and this distinction is better expressed in plain text than in the font. This seems to me to be a key issue: would there be a need to include words or passages of any of these early Semitic scripts in Hebrew text? If so, they warrant separate encoding. (There is the case of the Tetragrammaton already mentioned, but it may be an exception.) -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Aramaic, Samaritan, Phoenician
Michael Everson wrote: So is there a real justification for separate alphabets here? To my mind, yes. It's worth noting that Aramaic can also be written in the (encoded) Syriac script, and my superficial googling suggests that at least one currently-used form of Syriac dates back over two millennia. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Announcement: New Unicode Savvy Logo
William Overington wrote: 2.. What is the situation if a page is encoded entirely properly as far as, say, using UTF-8 goes, yet also uses Private Use Area characters? UTF-8 includes the PUA. It specifies nothing, however, about its contents. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
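A minimal Python sketch of that distinction: a Private Use Area code point encodes in UTF-8 like any other, while the character database assigns it no name, only the "private use" category. U+E000 is chosen arbitrarily as the first PUA code point.

```python
import unicodedata

pua_char = "\uE000"  # first code point of the BMP Private Use Area

# UTF-8 encodes it like any other BMP code point above U+07FF: three bytes.
encoded = pua_char.encode("utf-8")
print(encoded)                        # b'\xee\x80\x80'
print(encoded.decode("utf-8") == pua_char)

# The standard says nothing about what it means: no name, category "Co".
print(unicodedata.category(pua_char))        # Co
print(unicodedata.name(pua_char, None))      # None
```

So a page can be perfectly valid UTF-8 and still be unreadable to anyone who does not share the private agreement about those code points.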
Re: Announcement: New Unicode Savvy Logo
Philippe Verdy wrote: May be the PUA allocated spaces could be divided in normative categories, for example by assigning LTR or RTL base letters in some areas, diacritics in another large area splitted in 255 subspaces for combining characters, and symbols or ideographs in another large area. Um, then it wouldn't be private. I seem to remember a recent discussion of how Microsoft doing something similar was causing all kinds of difficulty. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: The role of country codes/Not snazzy
Marion Gunn crossposted: Scríobh John Cowan <[EMAIL PROTECTED]>: Jon Hanna scripsit: ... It's funny, just earlier today, I castigated a member of a list I manage for posting a contribution to another list without the author's permission, an act which some of us regard as seriously *un*professional. I guess netiquette is another one of those cultural things. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Exciting new software release!
John Cowan wrote: There are, strictly speaking (some typographer correct me please if I am wrong), no italic sans serif fonts, but only slanted sans serif fonts. I believe Adobe Myriad claims a "true italic"; the letterforms are sans versions of standard italic letterforms, rather than obliques of the upright forms. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Exciting new software release!
Doug Ewell wrote: I see a few people have actually downloaded MathText and tried it out. I thought it would make a better joke to actually implement the thing, complete with UI mini-frills (icons to indicate scripts supported by the chosen style, selectable Unicode 3.x/4.x conversion to SCRIPT SMALL L, etc.) than simply to describe it on a Web page. This was in the same spirit as Michael, Roozbeh, and John's full-blown COMBINING HEART proposal, which was far funnier than if somebody had just mentioned the idea without developing it. And now of course your joke is perhaps the most robust IME for these characters. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: FAQ entry (was: Looking for information on the UnicodeData file)
John Hudson wrote: The same people consider Latin a dead language, suitable only for study of ancient documents, which is clearly not the view taken at the Vatican, which continues to produce new documents in that language. In recent encyclicals, however, at least as published at www.vatican.va, the æ and œ are not used. Botanical taxonomists also produce new documents in Latin (descriptions of new species and other groups) and also eschew æ and œ, again no doubt because of font issues. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: list etiquette (was Re: Tailoring of normalization
Lars Marius Garshol wrote: * Tex Texin | | There probably isn't a one-size fits all solution, short of those | not wanting a response changing their reply-to address to | "[EMAIL PROTECTED]". That's dangerous. Quite a few email clients will then create replies that go only to that address, so nobody will see them at all... Actually, no, the postmaster of the Montréal Stock Exchange (me.org) is likely to see every one of them. The only domain name that is reserved is example.com/org/whatever. Every other domain is potentially in use. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: LATIN LETTER N WITH DIAERESIS?
Lukas Pietsch wrote: Your F725 Unknown-2, to me, looks like a German SCRIPT CAPITAL S, (compare with U+2112;SCRIPT CAPITAL L). Yes, we were taught to write an S like this in school. Perhaps it's used somewhere in mathematics? Looks to me like the proofreader's marginal deletion mark. F7AA might also be a proofreader's mark. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
LATIN LETTER N WITH DIAERESIS?
I have a distinct memory of a precomposed Latin letter n with diaeresis (as in the band Spinal Tap), but now I can't find it. It doesn't matter to me whether it exists or not, other than helping me to understand my memory. Am I missing it? Did it exist once and is now gone? Or am I making it all up? -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Omega + upsilon ligature?
John Cowan wrote: > And (uniquely for a Greek ligature?) was copied into the Latin alphabet, > and is now in use for /w/ in certain French-derived orthographies. Zum beispiel, U+0222 and U+0223, used in Ȣendat, an indigenous language in Québec. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Romanized Cyrillic bibliographic data--viable fonts?)
I wrote: > Until I c ... Some of you had as much trouble with my XML entities as I might have had with Jarkko's U+ codes. Here is the transliteration: "Until I converted Jarkko's text, I wondered if he wasn't trying to make a Unicode form of rot13, so that readers could choose not to be offended. Torsten, when will Unipad support converting the U+xxxx format?" -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Romanized Cyrillic bibliographic data--viable fonts?)
[EMAIL PROTECTED] wrote: >>And who pays the poor font designer for his work? > > > U+0041 U+006C U+0074 U+0072 U+0075 U+0069 U+0073 U+006D U+0020 U+006F U+0072 U+0020 >U+006B U+0075 U+0064 U+006F U+0073 U+002C U+0020 U+006D U+0061 U+0079 U+0062 U+0065 >U+003F Reminds me of a line by a standup comedian referring to the broader context: "So I went to my landlord and said, 'Hey, *nice* apartment!'" Until I converted Jarkko's text, I wondered if he wasn't trying to make a Unicode form of rot13, so that readers could choose not to be offended. Torsten, when will Unipad support converting the U+xxxx format? -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
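Converting the U+xxxx notation quoted in this message is a short job in Python. The sketch below is only one way to do it; the function name `decode_u_plus` is made up, and the regular expression is written to tolerate the ">" quote markers that got attached to some tokens.

```python
import re


def decode_u_plus(text):
    """Turn every U+XXXX token in the text into the character it names."""
    code_points = re.findall(r"U\+([0-9A-Fa-f]{4,6})", text)
    return "".join(chr(int(cp, 16)) for cp in code_points)


quoted = (
    "U+0041 U+006C U+0074 U+0072 U+0075 U+0069 U+0073 U+006D U+0020 "
    "U+006F U+0072 U+0020 >U+006B U+0075 U+0064 U+006F U+0073 U+002C "
    "U+0020 U+006D U+0061 U+0079 U+0062 U+0065 >U+003F"
)

print(decode_u_plus(quoted))  # Altruism or kudos, maybe?
```

The same function works unchanged for supplementary-plane values such as U+1D44E, since `chr` accepts any code point up to U+10FFFF.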
Re: REALLY *not* Tamil - changing scripts (long)
Keld Jørn Simonsen wrote: > I dont think using @ in a new orthography is a good idea. This was indeed my surmise, and I'm glad to see agreement. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: REALLY *not* Tamil - changing scripts (long)
Addison Phillips [wM] wrote: > Obviously I'm not an expert in these linguistic areas (and hence > rarely comment on them), but it seems to me that the lack of other > mechanisms makes Unicode an attractive target for criticism in this > area. Certainly no Unicode-bashing was intended (I'm more of a Unicode evangelist). I guess I'm confused about the use of Unicode character properties. Are you saying that, even though Unicode defines U+0027 as punctuation, other, I could use it as a glottal stop and create a locale that would treat it as a letter (and still be "Unicode compliant", whatever that is?). And if that's the case, are the Unicode properties just guides? Could I develop an orthography where YβÑبձâ would be a word, and there would be no consequences? -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
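To make the question about U+0027 concrete, here is a sketch of what the default properties look like through Python's standard library. It only shows the defaults that ship with the character database; whether a locale may override them is exactly what is being asked, and the code does not answer that. U+02BC MODIFIER LETTER APOSTROPHE is included for comparison as an assumption about what an orthography designer might pick instead.

```python
import re
import unicodedata

apostrophe = "\u0027"       # APOSTROPHE
modifier_letter = "\u02BC"  # MODIFIER LETTER APOSTROPHE

# Default general categories: punctuation versus letter.
print(unicodedata.category(apostrophe))       # Po
print(unicodedata.category(modifier_letter))  # Lm

# Default word segmentation follows those categories.
split_word = re.findall(r"\w+", "a" + apostrophe + "a")
whole_word = re.findall(r"\w+", "a" + modifier_letter + "a")
print(split_word)  # ['a', 'a']
print(whole_word)  # one token containing the modifier letter
```

Software that relies on the defaults will break a word at U+0027 unless it is tailored, which is the practical consequence of using punctuation as a letter.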
*not* Tamil - changing scripts (long)
James Kass wrote: > Isn't this kind of a Catch-22 for anyone contemplating script reform? > Do we discourage people from altering their own scripts? Should we? > It is suggested that scripts can be "alive" in the same sense that > languages are "alive"; changes (which are part of life) just occur > much more slowly in scripts. This touches on some "Unicode vs. the world" issues I've been thinking about, having to do with indigenous peoples developing orthographies for their own languages. My two examples are both languages of the Takic group in southern California. The Luiseño language declined to a very few native speakers, but has enjoyed a renaissance in recent years. The Gabrieleno (Tongva) language was effectively extinct, with no native speakers, no recordings, and some amount of written documentation, but the Tongva are resurrecting it (it is similar enough to the other Takic languages that it is possible to reconstruct parts that are missing). Anthropological accounts of both languages are of course in the phonetic alphabets beloved by linguists in the days before IPA stabilization. And, like many other native Americans, the Luiseño and Tongva have wanted simpler orthographies that can be typed with US-English keyboards. I don't have a lot of familiarity with Luiseño, but web pages have included passages where non-letters (such as @) are used as letters. This solves the keyboarding problem (since few people would try to pronounce an email address as Luiseño), but I imagine all sorts of issues with sorting, searching, word selection, casing, and all the other sorts of things that computers can do for "major" languages. Where all this involves me is with Tongva. I have been working with a Tongva ethnobotanist on a project that, among other things, involves plant labels in Tongva, English, and Latin. 
Tongva spelling is currently inconsistent, and my colleague has been regularizing it for this project (because he is the primary language teacher for the nation, and few have any fluency at all, he has this freedom). Somewhat like English, Tongva represents both the "oo" and "uh" sounds by "u". Unlike English, the rest of the orthography provides no clues to which sound is meant. /If/ my colleague were to ask (and the Tongva may be satisfied with the existing orthography), I would suggest representing the "uh" sound with a Latin-1 letter (possibly û), and explain several simple alternatives for keyboarding it on Mac and Windows. I would *not* suggest overloading @, or some similar approach. I suppose that Unicode could add at some point "Luiseño letter @", with appropriate properties, but that would circumvent the reason for picking it: its presence in US-ASCII. In an ideal world, indigenous peoples would hook up with folks like Michael Everson (or even me) and get some guidance on how to have their orthography and eat it, too, but as things now stand, overloading, font hacks, and the like are the path of least resistance. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
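The kinds of trouble anticipated here (word selection, casing) can be sketched with Python's default Unicode behaviour. The two strings below are invented letter sequences, not real Luiseño or Tongva words; they differ only in using "@" versus "û" for the same slot.

```python
import re
import unicodedata

# Hypothetical spellings, made up for this illustration only.
word_with_at = "t@vit"
word_with_u_circumflex = "t\u00FBvit"

# General category: "@" is punctuation, "û" is a lowercase letter.
print(unicodedata.category("@"))        # Po
print(unicodedata.category("\u00FB"))   # Ll

# Word selection: the overloaded "@" splits the word in two.
at_tokens = re.findall(r"\w+", word_with_at)
u_tokens = re.findall(r"\w+", word_with_u_circumflex)
print(at_tokens)   # ['t', 'vit']
print(u_tokens)    # a single token

# Casing: "@" has no uppercase form, "û" does.
print(word_with_at.upper())
print(word_with_u_circumflex.upper())
```

A Latin-1 letter gets correct default behaviour for free, while an overloaded symbol would need every tool to be tailored, which is the argument against overloading @.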
*not* Tamil: question about phones and rendering
Asmus Freytag wrote: > I cand e-mail you from my phone - it's too painful and too limited to > carry this conversation at length, besides the phone's not subscribed to > this list, but phones are *NOT* closed systems. Would complex rendering take place in the phone? Or would that happen in the phone company computers that communicate with the phone, and they would communicate with the phone in a private code? -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: How do I encode HTML documents in old languages ſuch as 17th century Swediſh in Unicode?
Stefan Persson wrote: > It wouldn't be poſſible to uſe the HTML > command, becauſe no Fraktur fount is commonly diſtributed with any OS. One > way could be to uſe the plane 1 Fraktur characters intended for mathematical > uſage and the combining "e" and "o" characters, and images for the remaining > characters. Which OS has a font that includes the Plane 1 math characters? -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: (long) Re: Chromatic font research
William Overington wrote: > This post makes the scientific > situation quite clear Several others have taken you to task for using English words with your own private meaning, rather than a generally accepted meaning that can be shared by all on the list. "Science" is one of those words. Science is the activity of finding out things that aren't already known. It involves hypotheses that can be tested by experimentation or observation. Your conclusions about ligatures are completely predictable from knowledge of the way that fonts work. No experiment was necessary, just as it is unnecessary to count stones to establish that four plus three equals seven. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Courtyard Codes and the Private Use Area
At 07:45 2002-05-25, William Overington wrote: >No, it does not. > >Character U+003C is LESS-THAN SIGN >Character U+003E is GREATER-THAN SIGN >Character U+002F is SOLIDUS > >If some other people have used those characters in a markup system with a >non-Unicode file format, that cannot be considered as Unicode providing the >basis for markup. I'm sorry, but I can't tell whether you are being intentionally contrarian or simply dense. To say that <a href="http://www.unicode.org">Unicode</a> does not provide the basis for markup is the same as saying that Unicode does not provide the basis for English or C++. XML is explicitly based on Unicode. And I have not a clue as to what you mean by a "non-Unicode file format" in this context. If you want to invent your own system of markup (using Unicode, just as W3C has), no one is stopping you, but I for one will not be paying attention. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Courtyard Codes and the Private Use Area
At 07:06 2002-05-24, Philipp Reichmuth wrote: >Again, markup is the better solution. And, to be honest, it's a bit of >a waste of space on the mailing list, don't you think? I agree. Unicode already provides the basis for a widely-used and standardized formal system of markup by providing the characters U+003C, U+003E, and U+002F. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
OT Korean spam
Somehow I got on a Korean spam list a while back, and I get between 10 and 20 emails a day in euc-kr. The majority have subject lines that start with U+AD11 U+ACE0. If it's not obscene, could someone tell me what that means? (Thanks to SC Unipad, I can see the Hangul, although I don't read Korean.) -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
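For readers without a tool like Unipad at hand, the two code points can be inspected with a few lines of Python. This sketch only shows the characters, their standard names, and a round trip through the euc-kr codec; it does not attempt a translation.

```python
import unicodedata

code_points = (0xAD11, 0xACE0)
subject_prefix = "".join(chr(cp) for cp in code_points)

# Character names come straight from the Unicode character database.
names = [unicodedata.name(ch) for ch in subject_prefix]
print(subject_prefix)
print(names)  # ['HANGUL SYLLABLE GWANG', 'HANGUL SYLLABLE GO']

# Both syllables are representable in euc-kr, two bytes each.
euc_kr_bytes = subject_prefix.encode("euc-kr")
print(len(euc_kr_bytes), euc_kr_bytes.decode("euc-kr") == subject_prefix)
```

A mail filter could match on either form: the Unicode string after decoding, or the raw euc-kr bytes at the start of the subject line.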
Re: The Arrogants and the sillies (RE: Euros and cents)
At 02:04 PM 3/26/02, Jungshik Shin wrote: > Korean can form plural nouns by adding U+B4E4. Is that the plural, or would it be the deul...uh, dual? -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Talk about Unicode Myths...
At 09:11 AM 3/20/02, John H. Jenkins wrote: >This doesn't reflect, however, what actual Japanese users want (or, at >least, would find acceptable). The correct algorithm is to display kanji >with Japanese glyphs if at all possible. > >Again, the typographic tradition in Japan is to write kanji with Japanese >glyphs *even* when Chinese is the language being written. Maybe I'm missing something here. My browsers don't display ASCII in fraktur, because I have not selected a fraktur font as either the system font or the default browser font. It seems to me that an average Japanese user would have only Japanese fonts installed, so that all CJK would appear in Japanese style no matter what its source. Why is there an issue? -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
OT, questions about Hanzi
Maybe this is off-topic, but I figure this is the place where I could get the quickest answers. What are the code points to write these things in their native languages? 1. "Hanzi" in Traditional and in Simplified 2. "Kanji" in Kanji 3. "Hangul" in Hangul (is it U+D55C U+AD74?) 4. Is "Hanja" ever written in Hanja in modern Korea? Is it U+D55C U+C790 in Hangul? 5. Are "katakana" and "hiragana" written in hiragana, or in Kanji? TIA! -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Private Use Agreements and Unapproved Characters
At 08:59 PM 3/18/02, Doug Ewell wrote: >You are not going to find many fonts on the Web that contain PUA >characters. Actually, every TrueType font with Windows Symbol encoding uses the PUA. >Personally, I'd like to see a font that covers all or most >of the ConScript characters, but that seems impossible since so many of >the ConScript glyphs have become unavailable, possibly forever. Please explain what you mean by this. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Synthetic scripts
At 02:58 AM 3/17/02, Miikka-Markus Alhonen wrote: >What about "a script that was invented by one person with the principal >intention of representing an artificially constructed language"? >This would include Tengwar, Cirth and Klingon but not any of the other >above-mentioned cases. Hmmm. I guess that would also include APL. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Synthetic scripts
At 04:45 PM 3/16/02, Doug Ewell wrote: >But right away that definition includes not only Shavian, Tengwar, >Cirth, Klingon, and most of the contents of ConScript, but also >Ethiopic, Cherokee, Canadian Syllabics, Gothic, Deseret, and maybe Yi >Syllabics, all of which are already encoded in Unicode. And iirc Cyril and Methodius were people, although their script was based on Greek and continued to evolve. >An alternative working definition of "synthetic script" that means "one >invented to support a work of fiction" would be inappropriately aimed at >the Star Trek and Tolkein scripts. If one regards the Bible as a work of fiction, even more scripts could be added to this list. I agree with Michael Everson that we are talking about the *Universal* Character Set. The "Good-Return-on-Investment Character Set" or the "Important to Us Character Set" might also be useful to some people, but they will not be universal. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Should there be a "UniGlyph" standard?
At 15:07 2002-03-05, Kenneth Whistler wrote: >It is a little bit like trying to create a catalog of all >the lifeforms on Earth. [...] What looks easy for the obvious cases >quickly turns near impossible. Bad example--some of us make a career of doing the impossible (even with willows). I think the better point is that all efforts to *standardize* catalogs of living forms have failed. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
Re: Theban alphabet?
At 12:27 AM 3/1/02, Philipp Reichmuth wrote:
> How about a glyph variant of U+2721? ;-)

U+2721 U+FE00 U+20DD, perhaps?

--
Curtis Clark http://www.csupomona.edu/~jcclark/
Mockingbird Font Works http://www.mockfont.com/
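[Archive note] For readers who do not have the code charts memorized, the sequence suggested in the message above can be spelled out with a minimal Python sketch. It uses only the standard `unicodedata` module; the helper name `describe` is an illustrative assumption and not part of the original thread, and the sequence itself was offered in jest rather than as a defined variation sequence.

```python
import unicodedata


def describe(text):
    """Return a (code point, character name) pair for each character in text.

    `describe` is a hypothetical helper written for this note only.
    """
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


# The three code points from the message: a base symbol, a variation
# selector, and a combining enclosing mark.
sequence = "\u2721\uFE00\u20DD"

for code_point, name in describe(sequence):
    print(code_point, name)
```

Read this way, the joke is a base character followed by a variation selector and an enclosing circle, that is, a star glyph variant inside a circle.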
Re: Theban alphabet?
At 10:13 AM 2/28/02, Kenneth Whistler wrote:
>It sounds to me that if Eric Raymond wants to pursue this, he
>needs to get his act together (and maybe some Wiccans to support
>him) to actually update and submit the proposal to the committees.

This Wiccan says it's a cipher.

--
Curtis Clark http://www.csupomona.edu/~jcclark/
Mockingbird Font Works http://www.mockfont.com/
Re: Theban alphabet?
At 11:01 AM 2/28/02, Michael Everson wrote:
>I said that we'd need evidence written up. He did provide me some
>arguments on the line of "if you write ABRACADABRA in Latin it doesn't
>work, but if you write it in Theban it has power" which is, indeed, a
>plain text differentiation. :-)

The word "pentacle" doesn't have the power of the pentacle glyph, and yet I don't see that in Unicode. (I won't accept that it is a glyph variant of U+2606.)

--
Curtis Clark http://www.csupomona.edu/~jcclark/
Mockingbird Font Works http://www.mockfont.com/
UTF-8 was Re: Smiles, faces, etc
At 08:30 PM 2/14/02, David Starner wrote:
>One out of two ain't bad, I guess. That was garbage on the screens of
>some of the subscribers, though - UTF-8 display is still not universal.

That's why I always open SC Unipad when I read this list, and paste as UTF-8. Unfortunately, Unipad seems to choke when one of the bytes of a UTF-8 sequence is 20h.

--
Curtis Clark http://www.csupomona.edu/~jcclark/
Mockingbird Font Works http://www.mockfont.com/
Re: Unicode and Security
At 10:21 AM 2/7/02, Elliotte Rusty Harold wrote:
>I don't like that solution, but not liking it doesn't mean it ain't gonna
>happen as soon as Exxon loses a few billion dollars because somebody
>spoofed them and thereby gained access to their bidding plans for oil leases.

Enron lost a few billion dollars, and iirc Unicode was not involved.

--
Curtis Clark http://www.csupomona.edu/~jcclark/
Mockingbird Font Works http://www.mockfont.com/