On Fri, Feb 01, 2002 at 08:22:54AM -0800, Bernard Miller wrote:
> Hello,
> For those of you not already on the Unicode mailing list I thought you would
> like to be aware of www.bytext.org. Bytext has a much better design than
> Unicode and is a better long term solution. One of the main features is that
> it is designed to be searchable with fast 8 bit regular expression
> algorithms. You may want to build in some flexibility to deal with Bytext in
> your implementation of UTF-8, perhaps even give up on UTF-8 altogether if
> it's possible for you to focus on the long term.

And for those of you not already on the Unicode mail list, let me give
you a brief summary from my point of view:

Bytext is a very complex character encoding standard, offering little to
nothing over Unicode, and losing key features of Unicode like combining
characters and ASCII compatibility. (A minor pet peeve is that the
author left out a FORM FEED, as "Since FF looks like a PEC in screen
display it can produce unanticipated results...". Really? Vim shows form
feeds quite nicely, and when I put a form feed in, I usually expect the
results.) I can't really explain more because of the next point.

The author of Bytext shows neither political savvy nor typographic skill.
He does not offer his standard in a plain text format, instead choosing
to offer it in Microsoft Word and PDF format. However, he doesn't take
advantage of those formats, giving us a badly formatted document with
headers and main points poorly marked or left unmarked. Furthermore, the
writing style is very hard to read, littering the document with newly
created acronyms and spending time attacking Unicode that should have
been used explaining the standard. He also shows algorithms through Java
code instead of writing them out. After quickly reading through the
document, I had no idea what a properly formed Bytext string would look
like and I didn't see any examples showing me one. It needs an editor
more than any work I've ever seen.

(Another pet peeve would be insisting that every programming language
add a type uByte for Bytext. Many languages have an existing byte type,
and many languages and programmers would find uByte a hideous name
clashing with the rest of the language. This is yet another place where
the author wants the world to change to fit Bytext instead of Bytext
working with the world.)

I find ISO-2022, Tron and Rosetta to be interesting, but I can't even
say that about Bytext. Maybe after some serious editing, some
interesting ideas might surface, but the complexity would still make it
unusable.

-- 
David Starner - [EMAIL PROTECTED], dvdeug/jabber.com (Jabber)
Pointless website: http://dvdeug.dhis.org
What we've got is a blue-light special on truth. It's the hottest thing 
with the youth. -- Information Society, "Peace and Love, Inc."
--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
