from:"Curtis Clark"

Re: Terms for rotations

2014-11-12 Thread Curtis Clark


On 2014-11-10 5:32 PM, Whistler, Ken wrote:

  WIDDERSHINS is shorter than

Aye, but laddie, then we'd have to use DEASIL for CLOCKWISE!

And we'd have wiccans after us to spell it "DEOSIL" instead. ;-)


And the Irish would no doubt insist on DEISEAL.

--
Curtis Clark, PhD  http://www.cpp.edu/~jcclark
Professor Emeritus
Biological Sciences  +1 909 869 4140
Cal Poly Pomona, Pomona CA 91768

Please note new email address: jccl...@cpp.edu

___
Unicode mailing list
Unicode@unicode.org
http://unicode.org/mailman/listinfo/unicode

Re: Encoding localizable sentences (was: RE: UTC Document Register Now Public)

2013-04-20 Thread Curtis Clark


On 2013-04-20 2:38 AM, William_J_G Overington wrote:

I am thinking that the fact that I am not a linguist and that I am implicitly 
seeking the precision of mathematics and seeking provenance of a translation is 
perhaps the explanation of why I am thinking that localizable sentences is the 
way forward. There seems to be a fundamental mismatch deep in human culture of the 
way that mathematics works precisely yet that translation often conveys an 
impression of meaning that is not congruently exact. Perhaps that is a factor 
in all of this.


Natural language lacks the logic and precision of mathematics, and is 
only unpredictably unambiguous. That's why lojban was invented.


https://en.wikipedia.org/wiki/Lojban

--
Curtis Clark  http://www.csupomona.edu/~jcclark
Biological Sciences   +1 909 869 4140
Cal Poly Pomona, Pomona CA 91768

Re: If Unicode wants to show the Red Card to someone ...

2013-04-01 Thread Curtis Clark


On 2013-04-01 12:19 PM, Buck Golemon wrote:

I'm sure that some cards are blue. Do they not also deserve a code point?
This amounts to color prejudice.

If we generalize the proposal, we should encode all the various colors 
of cards.
Further, we could denormalize the "red card" symbol into combining 
characters for "red" and "card".

This points to a general category of colored combining characters.

The only remaining question is whether the colors should be 
represented in the HSL or HSV color space.


Variation selectors!

--
Curtis Clark  http://www.csupomona.edu/~jcclark
Biological Sciences   +1 909 869 4140
Cal Poly Pomona, Pomona CA 91768

Re: Missing geometric shapes

2012-11-07 Thread Curtis Clark


On 2012-11-06 4:11 PM, Mark E. Shoulson wrote:
That said, I do think it would be reasonable and appropriate to encode 
the half-stars.  There's no such thing as "plain text" on paper 
(everything in print is formatted somehow), but star ratings are 
really common in tables that contain nothing else but text, etc. I 
guess the plain stars have more support, being dingbats in printers' 
cases since long ago, but these half-stars do feel "texty" to me, anyway. 


It's just a glyph variant of ½. :-)

--
Curtis Clark  http://www.csupomona.edu/~jcclark
Biological Sciences   +1 909 869 4140
Cal Poly Pomona, Pomona CA 91768

Re: Some QR codes each encoding one Unicode character

2012-10-08 Thread Curtis Clark


On 2012-10-08 6:09 AM, William_J_G Overington wrote:

The idea is that hopefully in the future these QR codes could be scanned using 
a mobile telephone that has a QR reader and a suitable app so as to build up a 
sequence of Unicode characters, such as a telephone number, without the user 
needing to be able to push buttons. This could potentially be useful to some 
people with some disabilities. Perhaps it could also be useful to a person 
trying to make a telephone call from a mobile telephone in cold weather where 
he or she would prefer not to need to remove his or her gloves to make the call.


Inasmuch as QR codes are already able to encode telephone numbers (at 
least in the US, and I have assumed in the rest of the world as well), I 
don't see any utility in this, since it would force the user to scan the 
codes in sequence, whereas a QR code containing a full number would only 
need to be scanned once, and in at least some phones and software, the 
scan would initiate dialing.


--
Curtis Clark  http://www.csupomona.edu/~jcclark
Biological Sciences   +1 909 869 4140
Cal Poly Pomona, Pomona CA 91768

Re: Mayan numerals

2012-08-23 Thread Curtis Clark


On 2012-08-23 3:58 PM, David Starner wrote:

We must encode what people are currently using; stuff that no one is
actually setting in type is of lesser interest.


I have to ask myself, if these characters were already in use in mobile 
phones by a Japanese telcom, would people look at it differently?


--
Curtis Clark  http://www.csupomona.edu/~jcclark
Biological Sciences   +1 909 869 4140
Cal Poly Pomona, Pomona CA 91768

Re: Definition of character

2011-07-13 Thread Curtis Clark


On 7/13/2011 3:49 PM, Ken Whistler wrote:

As Asmus was at pains to point out, the character
encoders are essentially engaged in an operational discovery process
regarding "what characters there are". That in turn leads to a definition by
enumeration: What characters are consists of the list of what characters
there are.

Speaking as a biologist, that's a common way that biologists approach 
"life". Trying to define it is essentialist, and essentialism has been 
rejected by most modern biologists.


--
Curtis Clark
Cal Poly Pomona

Re: Writing a proposal for an unusual script: SignWriting

2010-06-11 Thread Curtis Clark


On 6/11/2010 2:08 PM, Mark E. Shoulson wrote:
I should probably read up more about SignWriting before trying to 
answer, but (yes, that stupid "I should do X but...") I'm wondering if 
there might be ways to shoehorn things into Unicode's style anyway.


One answer might be what was done for Western musical notation.
Another is the Plane 1 math alphabets, which can be used in ordinary 
writing, but which are more common in formulas with a precise 
2-dimensional layout: again, a higher-level protocol (in this case, 
MathML or TeX) is needed for full use. (One might even imagine a SignML.)



--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Director, I&IT Web Development   +1 909 979 6371
University Web Coordinator, Cal Poly Pomona

Re: Please RSVP... (was: US-ASCII)

2004-12-11 Thread Curtis Clark

on 2004-12-11 09:21 John Cowan wrote:
It's been used as an English verb, adjective, and noun for 30-40 years
and perhaps much longer: see below.

Longer. I can attest from my youth in the 1950s that my parents 
considered it ordinary English usage, and in fact knew of its origin.

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Web Coordinator, Cal Poly Pomona +1 909 979 6371
Professor, Biological Sciences   +1 909 869 4062

Re: [increasingly OT--but it's Saturday night] Re: Unicode HTML, download

2004-11-22 Thread Curtis Clark

on 2004-11-22 00:17 fantasai wrote:
Unless you are using XML tools to parse or generate the document, there
is no advantage to using XHTML.

Although I do use XHTML myself, I want to add that a valid HTML document 
can be unambiguously translated to XHTML by programs such as HTML Tidy 
(http://tidy.sourceforge.net/).

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Web Coordinator, Cal Poly Pomona +1 909 979 6371
Professor, Biological Sciences   +1 909 869 4062

Re: [increasingly OT--but it's Saturday night] Re: Unicode HTML, download

2004-11-21 Thread Curtis Clark

on 2004-11-21 05:40 Stefan Persson wrote:
I think M$ bases their guesses on what to download on the charsets used. 
If e.g. EUC-JP is used, you may be asked to download a Japanese fount, 
even if the page doesn't contain any Japanese characters at all, 

I can confirm this--I was working with a draft web site made by a 
student assistant, and when I went to view it, IE asked if I wanted to 
install a Korean font. Turns out that Dreamweaver on his 
Korean-localized system had set the encoding to euc-kr, even though 
there was nothing beyond us-ascii.
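
A minimal Python sketch of why such a page still displays correctly despite the wrong label: text confined to US-ASCII has the same bytes under EUC-KR, so only the declared charset (and hence the browser's font prompt) differs. The sample string is made up for illustration.

```python
# Hypothetical page text containing nothing beyond US-ASCII.
text = "Department home page - draft"

ascii_bytes = text.encode("ascii")
euc_kr_bytes = text.encode("euc_kr")

# EUC-KR encodes the ASCII range as the same single bytes,
# so the two encodings of this text are byte-for-byte identical.
same_bytes = ascii_bytes == euc_kr_bytes

# Decoding the ASCII bytes under the EUC-KR label gives back the same string.
round_trip = ascii_bytes.decode("euc_kr") == text

print(same_bytes, round_trip)
```

The charset label changes nothing about the characters shown; it only changes what the browser believes it might need to render.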

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Web Coordinator, Cal Poly Pomona +1 909 979 6371
Professor, Biological Sciences   +1 909 869 4062

Re: Unicode HTML, download

2004-11-19 Thread Curtis Clark

on 2004-11-19 10:36 E. Keown wrote:
If I add the proper Unicode-related HTML code at the
top, will people get Unicode-compatible text when
they download this? 

I recommend that you post the URL for a beta to this list, and we can 
all check it for you.

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Web Coordinator, Cal Poly Pomona +1 909 979 6371
Professor, Biological Sciences   +1 909 869 4062

Re: internationalization assumption

2004-09-29 Thread Curtis Clark

on 2004-09-29 20:45 Rick Cameron wrote:
What characters needed by French are missing from Latin-1?

I'd look it up, but I can't find the œuvre in which it is listed. :-)
--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Web Coordinator, Cal Poly Pomona +1 909 979 6371
Professor, Biological Sciences   +1 909 869 4062

Re: [OT] Decode Unicode!

2004-09-25 Thread Curtis Clark

on 2004-09-25 09:18 Philippe Verdy wrote:
Not completely true. It is a bit less than 2 bits, due to its 
replication chains, and the presence of insertion points where 
cross-overs are possible.

And ASCII is less than 7 bits when LZW is applied.

But the effective code is a bit more complex 
than just the ATCG system, as some studies have demonstrated that the 
DNA alone has no function out of its substrate, whose nature influences 
its "decoding".

ASCII of course has plenty of function outside its substrate. That's why 
I can rename a text file with the .exe extension, and it runs just fine. :-)

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Web Coordinator, Cal Poly Pomona +1 909 979 6371
Professor, Biological Sciences   +1 909 869 4062

Re: Decode Unicode!

2004-09-24 Thread Curtis Clark

on 2004-09-24 10:05 Peter Constable did quote:
After the DNA, the ASCII-Code is the most successful code on this
planet. 

Things get more and more complex. DNA is a 2-bit code.
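
As a rough illustration of the "2-bit code" remark, here is a small Python sketch that packs a sequence of the four bases into two bits each. The particular bit assignment (A=00, C=01, G=10, T=11) and the function names are assumptions chosen for the example, not anything from the thread.

```python
# Two bits are enough to distinguish four symbols.
# The mapping below is arbitrary; any one-to-one assignment would do.
CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BASES = "ACGT"  # index in this string equals the 2-bit value


def pack(seq):
    """Pack a string of A/C/G/T into an integer, two bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | CODE[base]
    return value


def unpack(value, length):
    """Recover `length` bases from an integer produced by pack()."""
    bases = []
    for _ in range(length):
        bases.append(BASES[value & 0b11])
        value >>= 2
    return "".join(reversed(bases))


packed = pack("GATTACA")
print(packed.bit_length() <= 2 * len("GATTACA"), unpack(packed, 7))
```

The length has to be stored separately, because leading A's pack to zero bits and would otherwise be lost.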
--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Web Coordinator, Cal Poly Pomona +1 909 979 6371
Professor, Biological Sciences   +1 909 869 4062

Re: FW: Looking for transcription or transliteration standards latin->arabic

2004-07-07 Thread Curtis Clark

John Cowan wrote:
The Unicode people are probably going to standardize on calling it
"diacritic folding", by analogy to the term "case folding".

Añd whàt shåll wë câll thë ãddítiõn of dîacrìtícs bÿ spämmêrs, ïñ ân 
ättëmpt tò fóòl spåm fîltêrs?

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Web Coordinator, Cal Poly Pomona +1 909 979 6371
Professor, Biological Sciences   +1 909 869 4062

Re: Looking for transcription or transliteration standards latin->arabic

2004-07-07 Thread Curtis Clark

An interesting historical case is Istanbul, whose name comes from
the Greek phrase "eis ten poli" ("to the city" -- first "e" is epsilon,
and second "e" is eta).  That phrase tended to be pronounced "istimboli"
and with dissimilation "istamboli".  So when the Turks changed the name
from Constantinople to Istanbul, they simply changed from a name with
an obvious Greek derivation to one with a nonobvious Greek derivation.

This explanation seems rather Byzantine to me.
--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Web Coordinator, Cal Poly Pomona +1 909 979 6371
Professor, Biological Sciences   +1 909 869 4062

Re: Revised Phoenician proposal

2004-06-06 Thread Curtis Clark

on 2004-06-06 13:50 D. Starner wrote:
Let's be honest; the only people who matter in the least when discussing
a script is the people who actually use it. And all evidence presented here
indicates that scholars of Semitic languages--that is, the people who can
actually read the stuff written in the script--are, not surprisingly, the
majority users of Phoenician. 

(As a rhetorical device,) I have to say that I'm puzzled by this. All 
I've seemed to hear from Semiticists is that Phoenician is not a 
separate script. How, then, can these same Semiticists be the major 
users of something that doesn't exist?

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Phoenician, Fraktur etc

2004-05-27 Thread Curtis Clark

on 2004-05-27 08:13 Otto Stolz wrote:
Fraktur characters are not designed to be used in all upper-case text
as has been stated before, in this thread. Nobody is used to this sort
of pseudo script; hence, nobody will read it fluently. 

This pseudo-script *is* used in southern California, by aficionados of 
low-rider automobiles, by some Hispanic gangs, by some graffiti artists, 
and in some prison tattoos (the actual glyphs are more the "Old English" 
style of blackletter majuscule, rather than a more typical Fraktur). My 
guess is that it is *intended* not to be read fluently, as a mark of 
exclusivity.

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Response to Everson Phoenician and why June 7?

2004-05-26 Thread Curtis Clark

on 2004-05-25 12:06 Dean Snyder wrote:
3) Palaeo-Hebrew scribal redactions to Jewish Hebrew manuscripts

To me, this is a convincing reason to encode palaeo-Hebrew separately: 
it would allow such manuscripts to be encoded in plain text.

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Response to Everson Phoenician and why June 7?

2004-05-24 Thread Curtis Clark

I want to start out by saying that, although I personally support 
encoding Phoenician, I really have no stake in the outcome one way or 
the other, and I'm only participating in the "thread from Hell" (as I 
believe James Kass called it) because its dynamics interest me.

on 2004-05-24 03:08 Peter Kirk wrote:
If so, please give us some evidence for another side.

I have none. I would be astonished if there weren't another side, but 
far stranger things than that have happened, and I've been wrong before.

But maybe it is 
something else. For example, if you read evolutionary biologists 
strongly defending Darwinian evolution against creationist theories, 
does that imply an internal squabble among evolutionary biologists and 
therefore that some support creationism? Or does it rather imply a 
closing of ranks against outsiders who are attacking their discipline, a 
defence against (what they perceive as) unscientific attacks from those 
who don't know what they are talking about?

This is a very apt analogy. IMO, it is *precisely* because evolutionary 
biologists disagree about some fundamental issues in evolutionary 
biology (such as the relative importance and scope of natural selection) 
that they "close ranks". As a result, some of the arguments presented 
against creationism are caricatures. And the "they don't know what they 
are talking about" rhetoric is common on both sides.

As one who has debated creationists, I know that there are other 
approaches, that work incrementally better in educating people whose 
minds are not already made up. But the Semiticists who have posted 
against the proposal on this group seem to be falling into the same 
closed-rank pattern that I know so well from my own field.

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Fraktur yet again (was: Re: Response to Everson Phoenician and why June 7?)

2004-05-24 Thread Curtis Clark

on 2004-05-24 06:37 Dean Snyder wrote:
Diascript is to script as dialect is to language - part of a continuum of
relatively minor variations.

A script is a diascript with an army? (To paraphrase a saying about 
dialects...)

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Response to Everson Phoenician and why June 7?

2004-05-22 Thread Curtis Clark

It's hard for me to believe that the world community of Semitic scholars 
is so small or monolithic that there aren't differences of opinion among 
them. I have been almost automatically suspicious of the posts by the 
Semiticists opposed to encoding Phoenician; after thirty-four years in 
academia (longer if I count that my father was a professor when I was a 
youth), I have yet to see a field in which there were not differences of 
opinion. Admittedly, all Semiticists might agree on the nature of 
Phoenician (just as all chemists accept the periodic table), but the 
fervor exhibited here makes me wonder what the issues *really* are. I am 
used to seeing such fervor among academics only when there has been some 
unstated agenda at work. And so I wonder, are we in this list reading 
only one side of an internal squabble among Semiticists?

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: ISO 15924 draft fixes

2004-05-21 Thread Curtis Clark

on 2004-05-21 07:10 Michael Everson wrote:
I am not very happy about loading the plain-text in browsers. Three of 
my browsers load it and *all* the French UTF-8 is displayed in Latin 1.

This *may* be a server issue. Iirc, the server has to be told to mark 
the text/plain MIME-type as UTF-8, since there are no <meta> tags (as 
there could be in HTML) and since browsers generally lack the heuristics 
to decide on coding of plain text.
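
A minimal sketch of the server-side fix described above, using Python's standard http.server: the response names the charset in its Content-Type header, so the browser does not have to guess. The handler class, helper function, and sample body are hypothetical; the actual server behind the ISO 15924 pages is not known from this thread.

```python
from http.server import BaseHTTPRequestHandler


def content_type(mime="text/plain", charset="utf-8"):
    """Build a Content-Type header value that names the charset explicitly."""
    return f"{mime}; charset={charset}"


class PlainTextHandler(BaseHTTPRequestHandler):
    """Serve one made-up UTF-8 text body with its encoding declared."""

    BODY = "Nom français\n".encode("utf-8")

    def do_GET(self):
        self.send_response(200)
        # Without the charset parameter, a browser has to guess the encoding.
        self.send_header("Content-Type", content_type())
        self.send_header("Content-Length", str(len(self.BODY)))
        self.end_headers()
        self.wfile.write(self.BODY)


# What a reader sees when UTF-8 bytes are interpreted as Latin-1:
mojibake = "français".encode("utf-8").decode("latin-1")
print(content_type(), mojibake)

# To try the handler locally (not run here):
#   from http.server import HTTPServer
#   HTTPServer(("localhost", 8000), PlainTextHandler).serve_forever()
```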

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: ISO 15924 codes for ConScript

2004-05-20 Thread Curtis Clark

on 2004-05-20 07:52 Peter Constable wrote:
One person wrote, regarding Qaak for Klingon:

It's a shame you didn't pick something that could be pronounced in
tlhIngan Hol, perhaps Qaap for pIqaD.

Identifiers are identifiers, not words. 

That's why I sent my message to Doug off-list; it was a joke.
--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: OT [was TR35]

2004-05-12 Thread Curtis Clark

on 2004-05-11 23:14 Jony Rosenne wrote:
How does the Mozilla calendar handle time zone changes - does it store all
time as GMT (UTC) and mess them all up when I change the time zone, or does
it store them as local time and mess them up when communicating with people
in other time zones?

AFAICT, the latter.
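
The two storage strategies in the question can be sketched in Python as follows. This is only an illustration of the trade-off, not of how the Mozilla calendar is implemented; the fixed offsets stand in for real time zones and ignore daylight-saving rules.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets as stand-ins for real zones (an assumption for brevity;
# a real calendar would consult a time zone database).
PACIFIC = timezone(timedelta(hours=-7))
ISRAEL = timezone(timedelta(hours=3))

# Strategy 1: store the instant in UTC and convert only for display.
meeting_utc = datetime(2004, 5, 12, 17, 0, tzinfo=timezone.utc)
shown_in_pacific = meeting_utc.astimezone(PACIFIC)  # 10:00 local
shown_in_israel = meeting_utc.astimezone(ISRAEL)    # 20:00 local

# Strategy 2: store naive local time. The same stored value names
# different instants depending on which zone the reader assumes.
naive_local = datetime(2004, 5, 12, 10, 0)
read_as_pacific = naive_local.replace(tzinfo=PACIFIC)
read_as_israel = naive_local.replace(tzinfo=ISRAEL)
gap = read_as_pacific - read_as_israel

print(shown_in_pacific.hour, shown_in_israel.hour, gap)
```

With local-time storage, an event exchanged between the two zones silently shifts by the offset difference unless the zone travels with it.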

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: OT [was TR35]

2004-05-11 Thread Curtis Clark

on 2004-05-11 10:49 Jony Rosenne wrote:

Unfortunately they do not support Hebrew well enough. 

I did use Eudora before Hebrew e-mail was common, i.e. before Microsoft
implemented the Unicode bidi algorithm.

Mozilla 1.6 has a localization for Hebrew 
(http://www.mozilla.org/projects/l10n/mlp_status.html#moz_1.6), and 
afaict supports bidi (all the Hebrew on this list and on web pages comes 
out in the right direction). There is a calendar add-in 
(http://www.mozilla.org/projects/calendar/) that is quite nice (I like 
it in many ways better than Outlook, and it reads iCalendar files), and 
the email client does UTF, message threading, and Bayesian spam filtering.

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Phoenician

2004-05-07 Thread Curtis Clark

on 2004-05-07 07:47 Peter Constable wrote:
Have you not heard that yours is not the only scholarly community? To
speak as though there is only one, or that all have the same needs as
yours, seems a bit arrogant.

Sadly, the hegemonist view is not restricted among scholars to these 
Semiticists; in systematic biology, a group wanted to adapt the basic 
classification scheme of organisms to better fit current science. They 
were resisted, and began constructing their own classification scheme. 
The hegemonists who had resisted their making the classical scheme 
useful for their needs have resisted even more their creation of a new 
scheme. Sadly, "my way or the highway" has always been too common in 
academia.

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Arid Canaanite Wasteland (was: Re: New contribution)

2004-05-02 Thread Curtis Clark

on 2004-05-02 16:26 Michael Everson wrote:
Children learning about the history of their alphabets 

I've been following this discussion off and on, and figured I didn't 
have much to add, but I can relate to this remark. I was a child, once, 
and I had a fascination with scripts and languages that has continued to 
the present day. Although I have never been more than a dilettante in 
these fields, I'd like to think that what knowledge I have has 
positively influenced my long career as a botanist and my more recent 
career as a web developer.

In an eighth-grade English class (I was around 14 years old), I wrote a 
short story about the ancient inhabitants of Palestine. (It was intended 
to be humorous, in the ways of 14-year-old boys.) In that story I 
included fictional place names written in what would fit into Michael's 
Phoenician block (I believe they were some sort of ancient Canaanite, if 
not Phoenician sensu stricto).

I never progressed in my knowledge of Semitic scripts until a couple of 
years ago, when my daughter wanted a tattoo that said "peace" in 
Aramaic, and I researched enough to realize that Estrangelo Edessa 
wasn't likely to have been used to write Aramaic in the time of Jesus.

And with these bits of knowledge, I have been able to follow the 
outlines of the discussion.

If Unicode Phoenician had been around when I was 14, I would have used it.
--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: [OT] Freedom and organization (was RE: help needed with adding new character)

2004-03-19 Thread Curtis Clark

on 2004-03-19 02:04 Marco Cimarosti wrote:
Anarchism is against imposing forms of organization, not against
organization itself. And standards are quite like the useful side of laws
(the organization) without the harmful side (the imposition), so they should
be welcome to anarchists.

The sort of anarchy that I am familiar with involves decision by 
consensus. A really good example from my academic discipline is the 
International Code of Botanical Nomenclature. There are conventions held 
every six years in which the participants vote on changes to the code, 
so in that sense it might seem a democracy, but the code only works 
because there is a consensus among botanical taxonomists to use it. No 
governments enforce it.

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Investigating: LATIN CAPITAL LETTER J WITH DOT ABOVE

2004-03-18 Thread Curtis Clark

on 2004-03-18 01:05 Pavel Adamek wrote:
So it would be convenient to have an empty diacritical mark,
(COMBINING NOTHING ABOVE)
which would cause the "soft" dot of i or j to disappear,
without adding anything else.

Assuming this could be added to any other character, my mind boggles at 
the implications, both for decomposition and for rendering. :-)

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

OT? Languages with letters that always take diacriticals

2004-03-15 Thread Curtis Clark

Are there any languages that use letters with diacriticals, but *never* 
use the base letter without diacriticals? A made-up example to explain 
what made me think of it: Let's say a language has "ö", to represent the 
same sound that it does in German, but not "o", because the language 
lacks the sound represented by that letter in common European languages 
(the alternative being to use "o" to represent the "ö" sound).

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Astrological symbols

2004-02-05 Thread Curtis Clark

on 2004-02-05 15:29 Ernest Cline wrote:

[1] centaur -  an asteroid/comet with a perihelion located
between the orbits of Jupiter and Neptune whose orbit crosses
that of one or more of Saturn, Uranus, or Neptune.  The first
known and largest of these objects is Chiron discovered 1977.
Observation has since shown that Chiron is a large
comet-like body (150 - 200 km in diameter.)

Pluto fits that definition.

As to the proposal, as Michael might say, show examples from printed 
works, especially of the signs used in text.

Are any of these also attested as classical symbols for the respective 
God/desses?

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Phonology [was: interesting SIL-document]

2004-02-05 Thread Curtis Clark

on 2004-02-05 03:54 John Cowan wrote:
Indeed.  In fact, the first fuccative-insertion on record, laughably tame
by today's standards, is an American's:  William Randolph Hearst said of
one of his reporters:  "Tell Coates I said he is too inde-goddam-pendent!"

I first heard of the expletive infix in the context of the "familiar 
speech" of the US Navy. My experience at Navy bases in later years 
suggests that the forms may have become ar-f***ing-chaic. (Did I divide 
that in the middle of a metrical foot?)

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Combining down-pointing triangle above?

2004-01-18 Thread Curtis Clark

on 2004-01-18 17:47 Doug Ewell wrote:

Is this just a fancified hacek, or a potential candidate for proposal?

Evidently a hacek: http://www.chumashlanguage.com/vocab/vocab-01-fr.html

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Detecting encoding in Plain text

2004-01-12 Thread Curtis Clark

on 2004-01-12 08:57 Tom Emerson wrote:
You also have to deal with oddities of language: I tried one open
source implementation of the Cavnar and Trenkle algorithm THAT CLAIMED
THAT SHOUTED ENGLISH WAS ACTUALLY CZECH.

SHOUTED AT CLOSE RANGE (~ 1 CM FROM THE EAR) AND WITH A CZECH ACCENT, IT 
SOUNDS PRETTY MUCH THE SAME.

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Chinese rod numerals

2004-01-12 Thread Curtis Clark

on 2004-01-12 17:45 Kenneth Whistler wrote:

The obvious precedent for a set of numerals like this are the Aegean
numerals, U+10107..U+10118, which are also quite obviously derived
from layouts of tallying sticks, and which have a units set 1-9
and a tens set 10-90 oriented at right angles to the 1-9 set. But
the Aegean system used other counters for 100 and up, so there is
not a problem of alternating values.

And historical examples of the Aegean numbers exist *primarily* (if not 
exclusively?) in written form, on clay tablets.

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Latin letter GHA or Latin letter IO ?

2004-01-03 Thread Curtis Clark

on 2004-01-03 14:23 Philippe Verdy wrote:
The problem we were discussing here is that only the informative and
non-normative properties are giving the appropriate identity of the encoded
letters, but NONE of the existing normative properties... 

It seems to me that a little reflection would reveal that it is easiest 
to make properties normative when they are *not* informative, beyond 
whatever it is that they uniquely specify. There were very good reasons 
to make both code points and character names normative; people assume 
that the latter are informative, and that's where we get into trouble. 
No one argues about the fact that Ƣ is U+01A2; only its name is a "problem".
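
The distinction can be seen with Python's standard unicodedata module: the code point is an arbitrary but fixed identifier, while the character name is normative and therefore frozen even where it misleads. The alias lookup assumes a Python whose Unicode database includes formal name aliases (3.3 or later).

```python
import unicodedata

ch = "\u01A2"

# The code point is just a number; nobody reads meaning into it.
code_point = f"U+{ord(ch):04X}"

# The normative name is fixed forever, even though it is a misnomer.
name = unicodedata.name(ch)

# The formal alias records the corrected name without changing the original.
alias_target = unicodedata.lookup("LATIN CAPITAL LETTER GHA")

print(code_point, name, alias_target == ch)
```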

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Pre-1923 characters?

2004-01-03 Thread Curtis Clark

on 2004-01-03 14:40 Philippe Verdy wrote:
I have never
seen you accepting compromises and I doubt of your negotiation faculties.

A lot can be said about Michael, but it is inaccurate to say that he 
never changes his mind. One of the things that I have come to value over 
the years in his "pronouncements" is that they invariably reflect 
careful consideration, whether I agree with them or not.

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: German 0364 COMBINING LATIN SMALL LETTER E

2003-12-29 Thread Curtis Clark

on 2003-12-28 16:36 Gerd Schumacher wrote:

In German the supralinear e may be used as a variation of the diaeresis
above a, o, and u. Though it is old fashioned, indeed, it is still
understandable, and might be used for invitation cards and the like. I don’t know a modern
font with it, 

http://www.myfonts.com/fonts/urw/breitkopf-fraktur-d/regular/charmap.html

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: why Aramaic now lumpers and splitters

2003-12-24 Thread Curtis Clark

on 2003-12-24 12:29 Elaine Keown wrote:
It appears to me that script experts may resemble
experts in dialects/languages:  there are lumpers and
splitters

Following up on my post about wariness to unify being correct in first 
principles:

My day job uses my training as a plant taxonomist, a field in which 
there are also lumpers and splitters. I am a lumper, but, as you say, a 
"thinking lumper". If I have any doubts about whether two species of 
plant are separate, I maintain them as separate, in part as a challenge 
to future taxonomists (or me) to demonstrate that they are truly the 
same. Lumped species are "under the radar"--nonspecialists looking at 
them may never be aware of the disparate elements that make them up, and 
even specialists may not think to revisit them. It is ultimately easier 
to lump than to split (with plants, and I assume with languages and 
scripts as well), so those of us who are lumpers have a greater 
responsibility--it "comes with the territory".

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: why Aramaic now

2003-12-24 Thread Curtis Clark

on 2003-12-24 12:02 Elaine Keown wrote:
Some of the sets of symbols I found---which I simply
assumed could be added to "Hebrew"--are innately
controversial because of the Roadmap.  

I've been following these threads with interest, as an uninformed 
bystander. Michael's unwillingness to unify in haste seems correct in 
first principles, independent of his expertise and experience. But you 
have presented the first cogent (to me :-) argument for why delaying the 
decision is a problem.

One thing I've learned on this list is that Unicode done well respects 
no short-term convenience.

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Aramaic unification and information retrieval

2003-12-24 Thread Curtis Clark

on 2003-12-24 02:39 [EMAIL PROTECTED] wrote:
The relationship between mysticism/occult studies and language studies should 
definitely go in only one way. Otherwise we'd end up encoding one character 
for "true name of God" and fill the rest of the codespace with variant 
selectors to apply to it :)

Um, isn't U+ the true name of God, and all the rest of Unicode 
variant selectors?

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: [OT] Keyboards (was: American English translation of character names)

2003-12-19 Thread Curtis Clark

on 2003-12-19 00:05 Arcane Jill wrote:

The left and right 
 keys are functionally identical anyway, and the  key is 
functionally identical to a right mouse click. 

It's handy, though, for people who cannot use a mouse.

(Okay, so  is used for "screen capture to 
clipboard" but who needs a button for that?). 

I use it all the time. Saves buying screen capture software.

They could have just used, 
for example,  for  and  for , without 
then having to scrunch up the  and  keys and shrink the 
space bar. 

With this I agree, and the keys could have retained their meaning in DOS 
windows. Perhaps the older versions of Windows weren't up to the task.

Vaguely ob Unicode, SC Unipad has keyboard layouts for many languages, 
but has the euro at Alt-Gr w on the "English (British)" keyboard.

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Stability of scientific names, was Stability of WG2

2003-12-17 Thread Curtis Clark

on 2003-12-16 15:27 Peter Kirk wrote:

I'm no expert on this... 
I am. :-)

but I thought that species could be transferred 
from genus to genus as knowledge advances. 
As John pointed out, the epithet stays the same.

And presumably obvious 
spelling mistakes are corrected (contrast "FHTORA" in U+1D0C5), or are 
you saying that if the first publication had "Brontosuarus" as a typo 
this error would remain for ever?
There are errors and then there are errors. Some are correctable, some 
are not, and botanists and zoologists have different rules about this. 
An example that's not entirely OT: There was a Russian physician with 
the last name Эшшольц - a "cyrillicization" of his German family name 
Escholtz. His name was commonly written then and today in German form as 
Johann Friedrich Eschscholtz, the schsch reduplication being a 
reflection of the Cyrillic spelling. He Latinized (language, not 
alphabet) his name (a common occurrence among naturalists) to Eschscholzius.

He was physician to the Kotzebue expedition from Russia to (among other 
places) California; the ship's naturalist was Adelbert von Chamisso 
(author of _Peter Schlemihl_). Chamisso and Eschscholtz were fast 
friends (and some accounts imply that they were lovers). Chamisso named 
several new species of organisms for his friend, including the 
California poppy.

In the original description of the California poppy, he named it 
_Eschscholzia californica_, making the genus name the feminine form of 
Eschscholtz's Latinized name (this is a common occurrence). In the 
caption of the illustration of the plant, however, it was spelled 
_Eschholzia_. But for over a century afterwards, most botanists and 
horticulturists spelled the genus _Eschscholtzia_, assuming that both 
spellings in the original description were typographic errors.

But the rules of nomenclature are very specific about which types of 
errors can be corrected, and, since there is no obvious "correct" 
spelling of Escholtz, *the spelling that accompanied the original 
description must stand*, and the plant is correctly _Eschscholzia 
californica_.

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Stability of WG2

2003-12-16 Thread Curtis Clark

on 2003-12-16 02:53 Peter Kirk wrote:
Even if this is a millennial reign of peace 
and prosperity, processes of language change will not stop. 
A measure of comparison is the system of biological nomenclature, which 
has maintained stability of names in the face of increasing knowledge of 
organisms over a period of a quarter of a millennium. There are no ISO 
standards for scientific names--the system has succeeded through 
consensus, by biologists agreeing that a stable system is worth the 
trade of quite a bit of individualism (not to mention the periodic and 
sometimes raucous conventions when the rules are modified).

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: [OT reversing letters to avoid offence] Re: [Fwd: Re: Swastika to be banned by Microsoft?]

2003-12-15 Thread Curtis Clark

on 2003-12-15 11:24 Doug Ewell wrote:

BTW, the first person to suggest using Variation Selectors to encode
reversed K's and B's will get bonked in the head with a foam bat.
Um, Doug, that would be you, for bringing it to our attention

Consider yourself bonked.

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Supporting the Unicode Project

2003-12-04 Thread Curtis Clark

on 2003-12-04 09:43 Edward H. Trager wrote:
Actually, I am a bioinformatics programmer, and to date I have given
my programs away for free.  The main reason I give them away for 
free is fairly simple: the market of genetics researchers 
potentially interested in buying them is too small,
so I would not make that much money trying to sell them.  
To muddy the waters further, vendors who make gel analysis software that 
is involved in generating the basic data of genomics and proteomics 
charge huge amounts of money, which labs regularly pay, because some 
types (and venues) of biomedical research are well-funded. The issue of 
software cost is a complex one, involving both business and non-business 
decisions (and especially the latter in one-person operations).

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: MS Windows and Unicode 4.0 ?

2003-12-04 Thread Curtis Clark

on 2003-12-04 07:49 Stefan Persson wrote:
Eudora doesn't support Unicode on *any* OS, right?
Indeed. I, and I'm sure many others on this list, sent feedback to 
Qualcomm at Michael's behest, but a fat lot of good it did. At least 
Windows users can copy and paste into SC Unipad to get an idea of what's 
going on, but my solution was to switch to another client for this and 
other email lists.

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: MS Windows and Unicode 4.0 ?

2003-12-03 Thread Curtis Clark

on 2003-12-03 11:28 Edward H. Trager wrote:

WHY NOT just *give* away the Linear B, Ogham, 
I give away Linear B. It's an incomplete set, and has not been vetted by 
experts, as has Michael's. It's worth what you pay for it.

I and Michael both give away Ogham. His has the glyphs in the proper 
Unicode slots; mine is a font hack (I have the Unicode version sitting 
on my hard disk, tapping its foot waiting to get out).

And making fonts isn't my day job. With Michael, you get scholarly 
expertise, professional care, and someone to complain to when things are 
wrong. With my fonts, you get what you pay for.

I'd be happy, short-term, if Michael gave away more stuff. But until he 
becomes independently wealthy, we all lose long-term if he decides he 
can no longer afford to devote the time to script encoding.

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: MS Windows and Unicode 4.0 ?

2003-12-03 Thread Curtis Clark

on 2003-12-03 02:09 Arcane Jill wrote:

I don't believe that anyone could rightly argue that, for 
instance, musical symbols were "esoteric". They're a standard part of my 
culture. And yet, I still can't put a treble clef in my document using 
the standard Windows fonts, and nor can I put it on a web site and 
believe that it will be viewed correctly by most western viewers. 
Um, as an off-and-on musician, I tend to expect a treble clef on a 
staff, and I don't really expect my OS to handle musical notation. I 
suppose if I wanted to say "here is what a treble clef looks like" on a 
web site, I would have to use a graphic. I'd have to do the same thing 
to show what a rose looks like. (And if I wanted to demonstrate its 
smell, I'd be out of luck.)

By 
exactly the same reasoning, I expect all the math symbols to be there 
too, including mathematical alphanumeric symbols. This is not a strange 
or exotic requirement, it's just a part of living in this western 
culture and wanting to use the symbols of my culture. 
The bulk of math alphanumerics can be represented with markup, using 
standard fonts. Sure, it's no good for interchange, but viewers of a web 
page can *see* italic "a" (assuming they can see) with a simple <i>a</i>.
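
A minimal Python sketch of why markup can stand in for most of the math alphanumerics, assuming only the standard library's Unicode data: each of those characters carries a <font> compatibility mapping back to an ordinary letter. U+1D44E is just a sample; the variable name is made up for illustration.

```python
import unicodedata

math_italic_a = "\U0001D44E"  # sample character from the Plane 1 math block

# The character's formal name.
print(unicodedata.name(math_italic_a))           # MATHEMATICAL ITALIC SMALL A
# Its compatibility decomposition: a font variant of plain "a" (U+0061).
print(unicodedata.decomposition(math_italic_a))  # <font> 0061
# Compatibility normalization folds it to the plain letter, which is
# what a page using italic markup around "a" would carry as text.
print(unicodedata.normalize("NFKC", math_italic_a) == "a")  # True
```

The folding loses the italic distinction, which is the interchange problem mentioned above.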

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Hexadecimal digits?

2003-11-10 Thread Curtis Clark

on 2003-11-10 07:28 Jim Allan wrote:

And the only way you can tell 7 decimal from 7 hex is by giving 7 to 
different code points, that is File777 in hex should sort after File999 
in decimal.
The CSS guru Eric Meyer noted that Ohio license plates translate as hex 
RGB colors, mostly purple: 
http://www.meyerweb.com/eric/thoughts/2002a.html#t20020228

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Berber/Tifinagh

2003-11-10 Thread Curtis Clark

on 2003-11-10 04:17 Michael Everson wrote:

It still remains the case that Theban "orthography" is basically 
English, that is, it is Latin with funny glyphs.
Why isn't Latin Serbian just Cyrillic Serbian with funny glyphs? I'm not 
trying to be intentionally dense here; Theban English and Serbian are 
different in many ways. But are there truly no edge cases, where whim is 
the only deciding factor? And how does whim turn into policy?

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Berber/Tifinagh

2003-11-09 Thread Curtis Clark

on 2003-11-09 17:07 John Hudson wrote:

I've given a lot of thought to transliteration and transcription at the 
glyph level: 
Which comes back to the issue of ciphers. It would seem to me that 
glyph-level transliteration is the accepted behavior for ciphers (else 
we would actually have to address whether such things as Theban should 
be encoded, and Braille would have been a non-issue from the get-go). 
What determines whether a script is a cipher of another?

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Clarification, please, was Re: Berber/Tifinagh

2003-11-09 Thread Curtis Clark

on 2003-11-09 10:41 Michael Everson wrote:

I am appalled. I thought you understood something about Unicode, Philippe.
At this point, I'm a bit puzzled about the circumstances in which an 
alphabet is a cipher of another, and when it isn't. In an offlist 
conversation, you, I, and others seemed to arrive at the consensus that 
the Theban "magickal script" was a cipher of Latin. And many years ago, 
you raised the question of whether Etruscan was a cipher of either Latin 
or Greek (as we both know now, it isn't). I assumed that the criteria 
were (1) the scripts can be used interchangeably to write a single 
language, and (2) there is a one-to-one correspondence between their glyphs.

If Philippe were correct about the one-to-one correspondence, wouldn't 
the Latin glyphs be a cipher of the Tifinagh? And thus a glyph choice 
rather than a script choice?

Let's say that the Klingons prevailed, and pIqaD were encoded. There is 
a one-to-one correspondence between the letters of pIqaD and single or 
groups of Latin letters (supposedly). Could one not make a pIqaD font in 
which the glyphs looked like the Latin letters or groups?

I'm assuming I'm missing something here, and would like to know what it is.

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

OT: Inuktitut dictionary?

2003-11-06 Thread Curtis Clark

A friend is looking for a vocabulary-rich English-Inuktitut dictionary 
as a source for names for malamute dogs. He is a scholar in another 
field (astrobiology), and so is concerned with accuracy. I'm sure he 
would gladly learn the syllabics to the extent necessary. He has access 
to university interlibrary loan if the best dictionary is out of print. 
And I imagine he would be fine with Inupiaq, too. Please email offlist 
if you have any suggestions. Thanks!

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: PUA

2003-10-19 Thread Curtis Clark

on 2003-10-19 19:34 Chris Jacobs wrote:

One problem is that there seems to be no way in plaintext unicode to specify
who is in charge of a particular interpretation of the PUA.
At last! Another use for Plane 14! :-)

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: About that alphabetician...

2003-09-25 Thread Curtis Clark

Of course, any Unicode character can be expressed as an XML character 
reference (e.g. म) in any web page encoding, even US-ASCII.
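
A small Python sketch of that round trip, assuming only the standard library. The `xmlcharrefreplace` error handler writes anything outside US-ASCII as a decimal character reference, and `html.unescape` reads it back; the variable names are mine.

```python
import html

text = "\u092E"  # DEVANAGARI LETTER MA, used here as the sample character

# Encode to US-ASCII; characters it cannot hold become numeric references.
ascii_bytes = text.encode("ascii", "xmlcharrefreplace")
print(ascii_bytes)  # b'&#2350;'

# Anything that understands character references recovers the original.
round_trip = html.unescape(ascii_bytes.decode("ascii"))
print(round_trip == text)  # True
```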
--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: W3C Objects To Royalties On ISO Country Codes

2003-09-21 Thread Curtis Clark

on 2003-09-21 10:38 Michael Everson wrote:

Golly, does that mean they'll pay people like me if they get royalties 
from people using ISO/IEC 10646?
The current economic paradigm: "Steal. Sell."

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Hexadecimal never again

2003-08-20 Thread Curtis Clark

on 2003-08-20 11:03 Rick McGowan wrote:

Hex doesn't have an independent  
existence out in non-computing culture for, e.g., signs in the market place  
or monetary values.
Caviar, 10kg, €FEED

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: [Way OT] Beer measurements (was: Re: Handwritten EURO sign)

2003-08-19 Thread Curtis Clark

on 2003-08-19 04:18 Pim Blokland wrote:
Ha! Fat chance! You might as well suggest we abolish the yard
altogether!
Then, how would I have a yard sale? (or even a yard sail?)

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: [Way OT] Beer measurements (was: Re: Handwritten EURO sign)

2003-08-19 Thread Curtis Clark

on 2003-08-19 02:51 Marco Cimarosti wrote:
TOILETS --->
  50 yds (45.72 m)
To be precise, it should have said 50.00 yards (or perhaps 46 m).

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Curtis Clark

on 2003-08-06 15:24 Doug Ewell wrote:
I'm not a typographer (intelligent or otherwise), but I'm having a tough
time seeing how Section 2.10 *requires* fonts and rendering engines to
give a space-plus-combining-diacritic combination the exact minimum
width of the diacritic alone, or to leave equal space before and after
such a combination.  All I think it is saying is that, for example, the
combination i-plus-tilde may be wider than i alone, because tilde is
wider than i.
Considering that one approach is to use OpenType to map a letter plus 
diacritical to a single glyph, an obvious solution would be to include 
space + diacritical combos in that table. An important font issue, but a 
font issue nonetheless.

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-05 Thread Curtis Clark

on 2003-08-05 15:31 Peter Kirk wrote:
Thank you, Mark. This helps to clarify things, but still doesn't 
explicitly answer my question of how to encode "a sentence like "In this 
language the diacritic ^ may appear above the letters ...", but instead 
of ^ I want to use a combining character"  and want to display exactly 
one space before the combining character - do I encode two spaces or one?
In this language the diacritic  ̊ may appear above the letters...

Two spaces, at least in Thunderbird Mail.
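
A rough Python sketch of that encoding, assuming only the standard library; the names are mine, and how it renders still depends on the font and the mail client. It shows that the second space is the base the mark attaches to, and that normalization leaves the pair alone.

```python
import unicodedata

RING_ABOVE = "\u030A"  # COMBINING RING ABOVE

# One space separates the words; the second is the base for the mark.
sentence = "the diacritic" + " " + " " + RING_ABOVE + " may appear"

# A nonspacing mark (Mn) with combining class 230 (above).
print(unicodedata.category(RING_ABOVE), unicodedata.combining(RING_ABOVE))  # Mn 230

# NFC does not change the text: SPACE plus this mark has no canonical
# precomposed form.
print(unicodedata.normalize("NFC", sentence) == sentence)  # True
```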

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: I am not in India II

2003-07-18 Thread Curtis Clark

Michael Everson wrote:

People who believe that e-mails with a particular name in the From field
must come from that very person can be called, ehem, naiive.


That's an interesting way of writing the diaeresis on naïve, Adam. :-)
It's a good thing it's soft-dotted! Or perhaps he meant naĳve. :-)

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Aramaic, Samaritan, Phoenician

2003-07-15 Thread Curtis Clark

Michael Everson wrote:

Particularly as they regularly write text in both Coptic and Greek and 
this distinction is better expressed in plain text than in the font.
This seems to me to be a key issue: would there be a need to include 
words or passages of any of these early Semitic scripts in Hebrew text? 
If so, they warrant separate encoding. (There is the case of the 
Tetragrammaton already mentioned, but it may be an exception.)

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Aramaic, Samaritan, Phoenician

2003-07-14 Thread Curtis Clark

Michael Everson wrote:
So is there a real justification for separate alphabets here?


To my mind, yes.
It's worth noting that Aramaic can also be written in the (encoded) 
Syriac script, and my superficial googling suggests that at least one 
currently-used form of Syriac dates back over two millennia.

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Announcement: New Unicode Savvy Logo

2003-05-31 Thread Curtis Clark

William Overington wrote:
2.. What is the situation if a page is encoded entirely properly as far as,
say, using UTF-8 goes, yet also uses Private Use Area characters?
UTF-8 includes the PUA. It specifies nothing, however, about its contents.
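
A minimal Python sketch of both halves of that statement, assuming only the standard library. U+E000 is picked as an arbitrary Private Use Area code point.

```python
import unicodedata

pua_char = "\uE000"  # first code point of the BMP Private Use Area

# UTF-8 encodes it like any other BMP code point.
encoded = pua_char.encode("utf-8")
print(encoded)                              # b'\xee\x80\x80'
print(encoded.decode("utf-8") == pua_char)  # True

# General category "Co" (private use): the standard assigns no meaning.
print(unicodedata.category(pua_char))       # Co
# There is no character name either, so a default has to be supplied.
print(unicodedata.name(pua_char, "<no name>"))
```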

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Announcement: New Unicode Savvy Logo

2003-05-31 Thread Curtis Clark

Philippe Verdy wrote:
May be the PUA allocated spaces could be divided in normative
categories, for example by assigning LTR or RTL base letters in some
areas, diacritics in another large area splitted in 255 subspaces for
combining characters, and symbols or ideographs in another large
area.
Um, then it wouldn't be private. I seem to remember a recent discussion 
of how Microsoft doing something similar was causing all kinds of 
difficulty.

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: The role of country codes/Not snazzy

2003-05-30 Thread Curtis Clark

Marion Gunn crossposted:
Scríobh John Cowan <[EMAIL PROTECTED]>:

Jon Hanna scripsit:
...
It's funny, just earlier today, I castigated a member of a list I manage 
for posting a contribution to another list without the author's 
permission, an act which some of us regard as seriously 
*un*professional. I guess netiquette is another one of those cultural 
things.

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Exciting new software release!

2003-04-04 Thread Curtis Clark

John Cowan wrote:
There are, strictly speaking (some typographer correct me please if I am
wrong), no italic sans serif fonts, but only slanted sans serif fonts.
I believe Adobe Myriad claims a "true italic"; the letterforms are sans 
versions of standard italic letterforms, rather than obliques of the 
upright forms.

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Exciting new software release!

2003-04-02 Thread Curtis Clark

Doug Ewell wrote:
I see a few people have actually downloaded MathText and tried it out.
I thought it would make a better joke to actually implement the thing,
complete with UI mini-frills (icons to indicate scripts supported by the
chosen style, selectable Unicode 3.x/4.x conversion to SCRIPT SMALL L,
etc.) than simply to describe it on a Web page.  This was in the same
spirit as Michael, Roozbeh, and John's full-blown COMBINING HEART
proposal, which was far funnier than if somebody had just mentioned the
idea without developing it.
And now of course your joke is perhaps the most robust IME for these 
characters.

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-11 Thread Curtis Clark

John Hudson wrote:
The same people consider Latin a dead language, suitable only 
for study of ancient documents, which is clearly not the view taken at 
the Vatican, which continues to produce new documents in that language. 
In recent encyclicals, however, at least as published at www.vatican.va, 
the æ and œ are not used.
Botanical taxonomists also produce new documents in Latin (descriptions 
of new species and other groups) and also eschew æ and œ, again no doubt 
because of font issues.

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: list etiquette (was Re: Tailoring of normalization

2003-02-06 Thread Curtis Clark

Lars Marius Garshol wrote:

* Tex Texin
| 
| There probably isn't a one-size fits all solution, short of those
| not wanting a response changing their reply-to address to
| "[EMAIL PROTECTED]".

That's dangerous. Quite a few email clients will then create replies
that go only to that address, so nobody will see them at all...


Actually, no, the postmaster of the Montréal Stock Exchange (me.org) is 
likely to see every one of them.

The only domain name that is reserved is example.com/org/whatever. Every 
other domain is potentially in use.

--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: LATIN LETTER N WITH DIAERESIS?

2003-02-03 Thread Curtis Clark

Lukas Pietsch wrote:

Your F725 Unknown-2, to me, looks like a German SCRIPT CAPITAL S,
(compare with U+2112;SCRIPT CAPITAL L). Yes, we were taught to write an
S like this in school. Perhaps it's used somewhere in mathematics?


Looks to me like the proofreader's marginal deletion mark. F7AA might 
also be a proofreader's mark.


--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

LATIN LETTER N WITH DIAERESIS?

2003-01-28 Thread Curtis Clark

I have a distinct memory of a precomposed Latin letter n with diaeresis 
(as in the band Spinal Tap), but now I can't find it. It doesn't matter 
to me whether it exists or not, other than helping me to understand my 
memory. Am I missing it? Did it exist once and is now gone? Or am I 
making it all up?
--
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Omega + upsilon ligature?

2002-10-02 Thread Curtis Clark


John Cowan wrote:
> And (uniquely for a Greek ligature?) was copied into the Latin alphabet,
> and is now in use for /w/ in certain French-derived orthographies.

Zum Beispiel, U+0222 and U+0223, used in Ȣendat, an indigenous language 
in Québec.


-- 
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Romanized Cyrillic bibliographic data--viable fonts?)

2002-08-30 Thread Curtis Clark


I wrote:
> Until I c
...
Some of you had as much trouble with my XML entities as I might have had 
with Jarkko's U+ codes. Here is the transliteration:

"Until I converted Jarkko's text, I wondered if he wasn't trying to make 
  a Unicode form of rot13, so that readers could choose not to be 
offended. Torsten, when will Unipad support converting the U+xxxx format?"


-- 
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Romanized Cyrillic bibliographic data--viable fonts?)

2002-08-30 Thread Curtis Clark

[EMAIL PROTECTED] wrote:
>>And who pays the poor font designer for his work?
> 
> 
> U+0041 U+006C U+0074 U+0072 U+0075 U+0069 U+0073 U+006D U+0020 U+006F U+0072 U+0020 
>U+006B U+0075 U+0064 U+006F U+0073 U+002C U+0020 U+006D U+0061 U+0079 U+0062 U+0065 
>U+003F

Reminds me of a line by a standup comedian referring to the broader 
context: "So I went to my landlord and said, 'Hey, *nice* apartment!'"

Until I converted Jarkko's text, I wondered if he wasn't trying to make 
a Unicode form of rot13, so that readers could choose not to be 
offended. Torsten, when will Unipad support converting the U+xxxx format?
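
For anyone without a converter at hand, a short Python sketch that reads the U+xxxx format, assuming only the standard library. The function name `decode_u_plus` is made up for illustration; the input is the quoted text above.

```python
import re

def decode_u_plus(text: str) -> str:
    """Return the characters named by the U+XXXX tokens in text, ignoring everything else."""
    codes = re.findall(r"U\+([0-9A-Fa-f]{4,6})", text)
    return "".join(chr(int(code, 16)) for code in codes)

quoted = (
    "U+0041 U+006C U+0074 U+0072 U+0075 U+0069 U+0073 U+006D U+0020 "
    "U+006F U+0072 U+0020 U+006B U+0075 U+0064 U+006F U+0073 U+002C "
    "U+0020 U+006D U+0061 U+0079 U+0062 U+0065 U+003F"
)
print(decode_u_plus(quoted))  # Altruism or kudos, maybe?
```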

-- 
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: REALLY not Tamil - changing scripts (long)

2002-07-30 Thread Curtis Clark


Keld Jørn Simonsen wrote:

> I dont think using @ in a new orthography is a good idea.

This was indeed my surmise, and I'm glad to see agreement.

-- 
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: REALLY not Tamil - changing scripts (long)

2002-07-27 Thread Curtis Clark

Addison Phillips [wM] wrote:
 > Obviously I'm not an expert in these linguistic areas (and hence
 > rarely comment on them), but it seems to me that the lack of other
 > mechanisms makes Unicode an attractive target for criticism in this
 > area.

Certainly no Unicode-bashing was intended (I'm more of a Unicode 
evangelist). I guess I'm confused about the use of Unicode character 
properties. Are you saying that, even though Unicode defines U+0027 as 
punctuation, other, I could use it as a glottal stop and create a locale 
that would treat it as a letter (and still be "Unicode compliant", 
whatever that is?). And if that's the case, are the Unicode properties 
just guides? Could I develop an orthography where YÎ²ÑØ¨Õ±â would be a 
word, and there would be no consequences?
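
To make the question concrete, a small Python sketch of what generic software sees when it consults the default properties, assuming only the standard library. U+02BC is my own comparison, not something proposed in this thread, and a tailored locale could of course treat either character differently.

```python
import unicodedata

# U+0027 APOSTROPHE versus U+02BC MODIFIER LETTER APOSTROPHE.
for ch in ("\u0027", "\u02BC"):
    print(
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),  # Po for U+0027, Lm for U+02BC
        ch.isalpha(),              # False for U+0027, True for U+02BC
    )
```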

-- 
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

not Tamil - changing scripts (long)

2002-07-26 Thread Curtis Clark

James Kass wrote:
> Isn't this kind of a Catch-22 for anyone contemplating script reform?
> Do we discourage people from altering their own scripts?  Should we?
> It is suggested that scripts can be "alive" in the same sense that
> languages are "alive"; changes (which are part of life) just occur
> much more slowly in scripts.

This touches on some "Unicode vs. the world" issues I've been thinking 
about, having to do with indigenous peoples developing orthographies for 
their own languages.

My two examples are both languages of the Takic group in southern 
California. The Luiseño language declined to a very few native speakers, 
but has enjoyed a renaissance in recent years. The Gabrieleno (Tongva) 
language was effectively extinct--no native speakers, no recordings, some 
amount of written documentation--but the Tongva are resurrecting it (it 
is similar enough to the other Takic languages that it is possible to 
reconstruct parts that are missing).

Anthropological accounts of both languages are of course in the phonetic 
alphabets beloved by linguists in the days before IPA stabilization. 
And, like many other native Americans, the Luiseño and Tongva have 
wanted simpler orthographies that can be typed with US-English keyboards.

I don't have a lot of familiarity with Luiseño, but web pages have 
included passages where non-letters (such as @) are used as letters. 
This solves the keyboarding problem (since few people would try to 
pronounce an email address as Luiseño), but I imagine all sorts of 
issues with sorting, searching, word selection, casing, and all the 
other sorts of things that computers can do for "major" languages.
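
A sketch of the word-selection and casing issues in Python, assuming only the standard library. The two "words" are invented for illustration and are not real Luiseño; the regular expression stands in for whatever an editor or search tool uses to find word boundaries.

```python
import re

# Hypothetical strings, made up for this example.
with_at = "pa@la"
with_letter = "pa\u00F1la"  # same shape, but with a real letter (ñ)

# A generic word matcher splits at @, because @ is not a letter.
print(re.findall(r"\w+", with_at))      # ['pa', 'la']
print(re.findall(r"\w+", with_letter))  # ['pañla']

# Casing: @ has no uppercase form, while the letter does.
print(with_at.upper(), with_letter.upper())  # PA@LA PAÑLA
```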

Where all this involves me is with Tongva. I have been working with a 
Tongva ethnobotanist on a project that, among other things, involves 
plant labels in Tongva, English, and Latin. Tongva spelling is currently 
inconsistent, and my colleague has been regularizing it for this project 
(because he is the primary language teacher for the nation, and few have 
any fluency at all, he has this freedom). Somewhat like English, Tongva 
represents both the "oo" and "uh" sounds by "u". Unlike English, 
the rest of the orthography provides no clues to which sound is meant.

/If/ my colleague were to ask (and the Tongva may be satisfied with the 
existing orthography), I would suggest representing the "uh" sound with 
a Latin-1 letter (possibly û), and explain several simple alternatives 
for keyboarding it on Mac and Windows. I would *not* suggest overloading 
@, or some similar approach.

I suppose that Unicode could add at some point "Luiseño letter @", with 
appropriate properties, but that would circumvent the reason for picking 
it: its presence in US-ASCII. In an ideal world, indigenous peoples 
would hook up with folks like Michael Everson (or even me) and get some 
guidance on how to have their orthography and eat it, too, but as things 
now stand, overloading, font hacks, and the like are the path of least 
resistance.

-- 
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

not Tamil: question about phones and rendering

2002-07-26 Thread Curtis Clark

Asmus Freytag wrote:
> I can e-mail you from my phone - it's too painful and too limited to 
> carry this conversation at length, besides the phone's not subscribed to 
> this list, but phones are *NOT* closed systems.

Would complex rendering take place in the phone? Or would that happen in 
the phone company computers that communicate with the phone, and they 
would communicate with the phone in a private code?

-- 
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: How do I encode HTML documents in old languages ſuch as 17th century Swediſh in Unicode?

2002-07-05 Thread Curtis Clark


Stefan Persson wrote:
> It wouldn't be poſſible to uſe the HTML 
> command, becauſe no Fraktur fount is commonly diſtributed with any OS. One
> way could be to uſe the plane 1 Fraktur characters intended for mathematical
> uſage and the combining "e" and "o" characters, and images for the remaining
> characters.

Which OS has a font that includes the Plane 1 math characters?

-- 
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: (long) Re: Chromatic font research

2002-06-29 Thread Curtis Clark

William Overington wrote:
> This post makes the scientific
> situation quite clear 

Several others have taken you to task for using English words with your 
own private meaning, rather than a generally accepted meaning that can 
be shared by all on the list. "Science" is one of those words. Science 
is the activity of finding out things that aren't already known. It 
involves hypotheses that can be tested by experimentation or 
observation. Your conclusions about ligatures are completely predictable 
from knowledge of the way that fonts work. No experiment was necessary, 
just as it is unnecessary to count stones to establish that four plus 
three equals seven.

-- 
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Courtyard Codes and the Private Use Area

2002-05-25 Thread Curtis Clark

At 07:45 2002-05-25, William Overington wrote:
>No, it does not.
>
>Character U+003C is LESS-THAN SIGN
>Character U+003E is GREATER-THAN SIGN
>Character U+002F is SOLIDUS
>
>If some other people have used those characters in a markup system with a
>non-Unicode file format, that cannot be considered as Unicode providing the
>basis for markup.

I'm sorry, but I can't tell whether you are being intentionally 
contrarian or simply dense. To say that 
<a href="http://www.unicode.org">Unicode</a> does not provide the basis 
for markup is the same as saying that Unicode does not provide the 
basis for English or C++. XML is explicitly based on Unicode. And 
I have not a clue as to what you mean by a "non-Unicode file format" in 
this context.

If you want to invent your own system of markup (using Unicode, just as 
W3C has), no one is stopping you, but I for one will not be paying 
attention.

-- 
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Courtyard Codes and the Private Use Area

2002-05-24 Thread Curtis Clark

At 07:06 2002-05-24, Philipp Reichmuth wrote:
>Again, markup is the better solution. And, to be honest, it's a bit of
>a waste of space on the mailing list, don't you think?

I agree. Unicode already provides the basis for a widely-used and 
standardized formal system of markup by providing the characters U+003C, 
U+003E, and U+002F.

-- 
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

OT Korean spam

2002-04-23 Thread Curtis Clark


Somehow I got on a Korean spam list a while back, and I get between 10 and 
20 emails a day in euc-kr. The majority have subject lines that start with 
U+AD11 U+ACE0. If it's not obscene, could someone tell me what that means? 
(Thanks to SC Unipad, I can see the Hangul, although I don't read Korean.)
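
A rough way to inspect such a subject-line prefix without reading Korean is sketched below, assuming Python 3 with its standard unicodedata module and built-in euc_kr codec. The variable and function names are illustrative only. The two code points are the Hangul syllables that together spell the Korean word for "advertisement".

```python
import unicodedata

# The two code points quoted above, as a Python string.
subject_prefix = "\uAD11\uACE0"

def names(text):
    """List the Unicode character name of each character in text."""
    return [unicodedata.name(ch) for ch in text]

# Encode to EUC-KR, as the incoming mail would be, then decode again.
encoded = subject_prefix.encode("euc_kr")
round_trip = encoded.decode("euc_kr")

print(names(subject_prefix))
print(encoded.hex())
print(round_trip == subject_prefix)
```

Each syllable occupies two bytes in EUC-KR, so the prefix is four bytes on the wire.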

-- 
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: The Arrogants and the sillies (RE: Euros and cents)

2002-03-26 Thread Curtis Clark


At 02:04 PM 3/26/02, Jungshik Shin wrote:
>   Korean can form plural nouns by adding U+B4E4.

Is that the plural, or would it be the deul...uh, dual?




-- 
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Talk about Unicode Myths...

2002-03-20 Thread Curtis Clark

At 09:11 AM 3/20/02, John H. Jenkins wrote:
>This doesn't reflect, however, what actual Japanese users want (or, at 
>least, would find acceptable).  The correct algorithm is to display kanji 
>with Japanese glyphs if at all possible.
>
>Again, the typographic tradition in Japan is to write kanji with Japanese 
>glyphs *even* when Chinese is the language being written.

Maybe I'm missing something here. My browsers don't display ASCII in 
fraktur, because I have not selected a fraktur font as either the system 
font or the default browser font. It seems to me that an average Japanese 
user would have only Japanese fonts installed, so that all CJK would appear 
in Japanese style no matter what its source. Why is there an issue?

-- 
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

OT, questions about Hanzi

2002-03-19 Thread Curtis Clark


Maybe this is off-topic, but I figure this is the place where I could get 
the quickest answers. What are the code points to write these things in 
their native languages?

1. "Hanzi" in Traditional and in Simplified
2. "Kanji" in Kanji
3. "Hangul" in Hangul (is it U+D55C U+AD74?)
4. Is "Hanja" ever written in Hanja in modern Korea? Is it U+D55C U+C790 in 
Hangul?
5. Are "katakana" and "hiragana" written in hiragana, or in Kanji?

TIA!
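
Questions 3 and 4 can be partly checked mechanically. The sketch below assumes Python 3 and its standard unicodedata module, and relies on the fact that every Hangul syllable's Unicode name ends in a romanized reading. The romanize helper and the candidates table are hypothetical names for illustration. It suggests that U+AD74 reads "gul", while the second syllable of the standard spelling of "Hangul" is U+AE00, which the character names romanize as "geul".

```python
import unicodedata

def romanize(text):
    """Join the romanized syllable readings taken from Unicode character names."""
    return "".join(
        unicodedata.name(ch).rsplit(" ", 1)[-1].lower() for ch in text
    )

# Candidate spellings; the labels are descriptive only.
candidates = {
    "U+D55C U+AD74 (as asked)": "\uD55C\uAD74",
    "U+D55C U+AE00": "\uD55C\uAE00",
    "U+D55C U+C790 (as asked)": "\uD55C\uC790",
}

for label, text in candidates.items():
    print(label, romanize(text))
```

This only reports readings. It cannot say whether a given spelling is the one Koreans actually use.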

-- 
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Private Use Agreements and Unapproved Characters

2002-03-19 Thread Curtis Clark


At 08:59 PM 3/18/02, Doug Ewell wrote:
>You are not going to find many fonts on the Web that contain PUA
>characters.

Actually, every TrueType font with Windows Symbol encoding uses the PUA.
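
To illustrate, here is a minimal sketch in Python 3. It assumes the usual convention that a Windows Symbol-encoded font exposes its 8-bit character codes 0x20-0xFF at U+F020-U+F0FF. The names SYMBOL_PUA_BASE, symbol_code_point and is_private_use are hypothetical, not taken from any font library.

```python
import unicodedata

# Assumed offset at which Symbol-encoded fonts place their glyphs.
SYMBOL_PUA_BASE = 0xF000

def symbol_code_point(char_code):
    """Map an 8-bit symbol-font character code to its assumed PUA code point."""
    if not 0x20 <= char_code <= 0xFF:
        raise ValueError("expected a character code in the range 0x20-0xFF")
    return SYMBOL_PUA_BASE + char_code

def is_private_use(code_point):
    """True if the code point has general category Co (private use)."""
    return unicodedata.category(chr(code_point)) == "Co"

cp = symbol_code_point(0x41)
print(f"U+{cp:04X}", is_private_use(cp))
```

Under that assumption, the glyph a Symbol font shows for character code 0x41 sits at U+F041, which is in the Private Use Area.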

>Personally, I'd like to see a font that covers all or most
>of the ConScript characters, but that seems impossible since so many of
>the ConScript glyphs have become unavailable, possibly forever.

Please explain what you mean by this.


-- 
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Synthetic scripts

2002-03-17 Thread Curtis Clark


At 02:58 AM 3/17/02, Miikka-Markus Alhonen wrote:
>What about "a script that was invented by one person with the principal
>intention of representing an artificially constructed language"?
>This would include Tengwar, Cirth and Klingon but not any of the other
>above-mentioned cases.

Hmmm. I guess that would also include APL.


-- 
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Synthetic scripts

2002-03-16 Thread Curtis Clark

At 04:45 PM 3/16/02, Doug Ewell wrote:
>But right away that definition includes not only Shavian, Tengwar,
>Cirth, Klingon, and most of the contents of ConScript, but also
>Ethiopic, Cherokee, Canadian Syllabics, Gothic, Deseret, and maybe Yi
>Syllabics, all of which are already encoded in Unicode.

And iirc Cyril and Methodius were people, although their script was based 
on Greek and continued to evolve.

>An alternative working definition of "synthetic script" that means "one
>invented to support a work of fiction" would be inappropriately aimed at
>the Star Trek and Tolkien scripts.

If one regards the Bible as a work of fiction, even more scripts could be 
added to this list.

I agree with Michael Everson that we are talking about the *Universal* 
Character Set. The "Good-Return-on-Investment Character Set" or the 
"Important to Us Character Set" might also be useful to some people, but 
they will not be universal.

-- 
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Should there be a "UniGlyph" standard?

2002-03-06 Thread Curtis Clark

At 15:07 2002-03-05, Kenneth Whistler wrote:
>It is a little bit like trying to create a catalog of all
>the lifeforms on Earth. [...] What looks easy for the obvious cases 
>quickly turns near impossible.

Bad example--some of us make a career of doing the impossible (even with 
willows). I think the better point is that all efforts to *standardize* 
catalogs of living forms have failed.

-- 
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Theban alphabet?

2002-03-03 Thread Curtis Clark


At 12:27 AM 3/1/02, Philipp Reichmuth wrote:
> How about a glyph variant of U+2721? ;-) 

U+2721 U+FE00 U+20DD, perhaps?


-- 
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Theban alphabet?

2002-02-28 Thread Curtis Clark


At 10:13 AM 2/28/02, Kenneth Whistler wrote:
>It sounds to me that if Eric Raymond wants to pursue this, he
>needs to get his act together (and maybe some Wiccans to support
>him) to actually update and submit the proposal to the committees.

This Wiccan says it's a cipher.


-- 
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Theban alphabet?

2002-02-28 Thread Curtis Clark


At 11:01 AM 2/28/02, Michael Everson wrote:
>I said that we'd need evidence written up. He did provide me some 
>arguments on the line of "if you write ABRACADABRA in Latin it doesn't 
>work, but if you write it in Theban it has power" which is, indeed, a 
>plain text differentiation. :-)

The word "pentacle" doesn't have the power of the pentacle glyph, and yet I 
don't see that in Unicode. (I won't accept that it is a glyph variant of 
U+2606.)


-- 
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

UTF-8 was Re: Smiles, faces, etc

2002-02-15 Thread Curtis Clark

At 08:30 PM 2/14/02, David Starner wrote:
>One out of two ain't bad, I guess. That was garbage on the screens of
>some of the subscribers, though - UTF-8 display is still not universal.

That's why I always open SC Unipad when I read this list, and paste as 
UTF-8. Unfortunately, Unipad seems to choke when one of the bytes of a 
UTF-8 sequence is 20h.

-- 
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Unicode and Security

2002-02-07 Thread Curtis Clark


At 10:21 AM 2/7/02, Elliotte Rusty Harold wrote:
>I don't like that solution, but not liking it doesn't mean it ain't gonna 
>happen as soon as Exxon loses a few billion dollars because somebody 
>spoofed them and thereby gained access to their bidding plans for oil leases.

Enron lost a few billion dollars, and iirc Unicode was not involved.


-- 
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/
