On Thu, Sep 1, 2011 at 12:13 AM, Stephen J. Turnbull <step...@xemacs.org> wrote:
> Where I cut your words, we are in 100% agreement.  (FWIW :-)

Not quite the same here, but I don't feel the need to have the last
word. Most of what you say makes sense; in some cases we'll quibble
later, but there are a few points where I have something to add:

> No, and I can tell you why!  The difference between characters and
> words is much more important than that between code point and grapheme
> cluster for most users and the developers who serve them.  Even small
> children recognize typographical ligatures as being composite objects,

True -- in fact I didn't know that ff and ffl ligatures *existed*
until I learned about Unix troff.

> while at least this Spanish-as-a-second-language learner was taught
> that `ñ' is an atomic character represented by a discontiguous glyph,
> like `i', and it is no more related to `n' than `m' is.  Users really
> believe that characters are atomic.  Even in the cases of Han
> characters and Hangul, users think of the characters as being
> "atomic," but in the sense of Bohr rather than that of Democritus.

Ah, I think this may very well be culture-dependent. In Holland there
are no Dutch words that use accented letters, but the accents are
known because there are a lot of words borrowed from French or German.
We (the Dutch) think of these as letters with accents and in fact we
think of the accents as modifiers that can be added to any letter (at
least I know that's how I thought about it -- perhaps I was also
influenced by the way one had to type those on a mechanical
typewriter). Dutch does have one native use of the umlaut (though it
has a different name, I forget which, maybe trema :-), when there are
two consecutive vowels that would normally be read as a special sound
(diphthong?). E.g. in "koe" (cow) the oe is two letters (not a single
letter formed of two distinct shapes!) that mean a special sound
(roughly KOO). But in a word like "coëxistentie" (coexistence) the o
and e do not form the oe-sound, and to emphasize this to Dutch readers
(who believe their spelling is very logical :-), the official spelling
puts the umlaut on the e. This is definitely thought of as a separate
mark added to the e; ë is not a new letter. I have a feeling it's the
same way for the French and Germans, but I really don't know.
(Antoine? Georg?)
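
For what it's worth, Unicode accommodates both views of ë: it exists as
a single precomposed code point, and also as a plain e followed by a
combining mark. A minimal sketch using only the standard unicodedata
module, assuming Python 3 (the helper name describe is made up for
illustration):

```python
import unicodedata

# ë as one precomposed code point (the "atomic letter" view).
precomposed = "\u00eb"

# NFD normalization splits it into a base letter plus a combining mark
# (the "accent added to a letter" view).
decomposed = unicodedata.normalize("NFD", precomposed)


def describe(s):
    """Return the Unicode character name of each code point in s."""
    return [unicodedata.name(ch) for ch in s]


print(describe(precomposed))
print(describe(decomposed))

# NFC normalization recombines the pair into the single code point.
print(unicodedata.normalize("NFC", decomposed) == precomposed)
```

The same round trip works for ñ, which decomposes into n plus a
combining tilde, so the encoding itself does not settle whether it is
"a letter" or "a letter with a mark".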

Finally, my guess is that the Spanish emphasis on ñ as a separate
letter has to do with teaching how it has a separate position in the
localized collation sequence, doesn't it? I'm also curious if ñ occurs
as a separate character on Spanish keyboards.

-- 
--Guido van Rossum (python.org/~guido)