On Fri, Oct 14, 2016 at 7:18 PM, Cory Benfield <c...@lukasa.co.uk> wrote:
> The many glyphs that exist for writing various human languages are not
> inefficiency to be optimised away. Further, I should note that most places do
> not legislate about what character sets are acceptable to transcribe their
> languages. Indeed, plenty of non-romance-language-speakers have found ways to
> transcribe their languages of choice into the limited 8-bit character sets
> that the Anglophone world propagated: take a look at Arabish for the best
> kind of example of this behaviour, where "الجو عامل ايه النهارده فى
> إسكندرية؟" will get rendered as "el gaw 3amel eh elnaharda f eskendereya?”
I've worked with transliterations enough to have built myself a
dedicated translit tool. It's pretty straightforward to come up with
something you can type on a US-English keyboard (e.g. "a\o" for "å" and
"d\-" for "đ"), and in some cases it helps with visual/audio
synchronization, but nobody would ever claim that it's the best way to
represent that language.
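A translit tool along those lines can be sketched in a few lines of Python. The escape sequences below ("a\o" for "å", "d\-" for "đ") are the ones mentioned above; the third mapping and the helper name `detranslit` are illustrative assumptions, not the actual tool:

```python
# Hypothetical escape-sequence table: ASCII-typeable input on the left,
# Unicode character on the right. Only the first two mappings come from
# the post; "o\:" is an assumed example of the same pattern.
TRANSLIT = {
    "a\\o": "\u00e5",  # å — LATIN SMALL LETTER A WITH RING ABOVE
    "d\\-": "\u0111",  # đ — LATIN SMALL LETTER D WITH STROKE
    "o\\:": "\u00f6",  # ö — LATIN SMALL LETTER O WITH DIAERESIS (assumed)
}

def detranslit(text: str) -> str:
    """Expand ASCII escape sequences into their Unicode characters."""
    for seq, char in TRANSLIT.items():
        text = text.replace(seq, char)
    return text

print(detranslit("a\\o"))    # å
print(detranslit("d\\-en"))  # đen
```

The real tool presumably handles ordering and ambiguity more carefully; this just shows how small the core idea is.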
> But I think you’re in a tiny minority of people who believe that all
> languages should be rendered in the same script. I can think of only two
> reasons to argue for this:
> 1. Dealing with lots of scripts is technologically tricky and it would be
> better if we didn’t bother. This is the anti-Unicode argument, and it’s a
> weak argument, though it has the advantage of being internally consistent.
> 2. There is some genuine harm caused by learning non-ASCII scripts.
#1 does carry a decent bit of weight, but only if you start with the
assumption that "characters are bytes". Once you shed that
assumption (and the related assumption that "characters are 16-bit
numbers"), the only weight it carries is "right-to-left text is
hard"... and let's face it, that *is* hard, but there are far, far
harder problems in computing.
Oh wait. Naming things. In Hebrew.
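A quick Python 3 illustration of why both assumptions break down — `str` is a sequence of code points, so character count, UTF-8 byte count, and UTF-16 code-unit count all diverge once you leave ASCII (the sample strings are mine, chosen to show each case):

```python
# Code points vs UTF-8 bytes vs 16-bit UTF-16 code units.
# ASCII: all three agree. Hebrew: 2 bytes per character in UTF-8.
# An emoji outside the BMP: one character, four bytes, a surrogate pair.
for s in ("abc", "\u05e9\u05dc\u05d5\u05dd", "\U0001f4bb"):
    print(s,
          len(s),                           # code points
          len(s.encode("utf-8")),           # bytes in UTF-8
          len(s.encode("utf-16-le")) // 2)  # 16-bit units in UTF-16
# abc 3 3 3
# שלום 4 8 4
# 💻 1 4 2
```

"Characters are bytes" already fails on the Hebrew line; "characters are 16-bit numbers" survives until the third.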
Python-ideas mailing list
Code of Conduct: http://python.org/psf/codeofconduct/