Re: Announcing Bytext

David Starner Sat, 02 Feb 2002 09:44:21 -0800

On Sat, Feb 02, 2002 at 02:16:37AM -0800, Bernard Miller wrote:
> Hopefully flags will go off when members of this list read things that are
> equivalent to "I don't understand it but here is my opinion on it...".


I'm a very intelligent person. If I can't understand your document after
reading through it a couple of times, then the fault is yours.

> Bytext is a superset of Unicode normalization form C, so it certainly
> encodes all of ASCII including form feed, and all combining characters.

Page two, first paragraph: "There are no surrogate pairs; no combining
characters". 

Page one, last paragraph: "In particular, the role of Bytext is clearly
separated from the role of markup. This is contrasted with many features
of Unicode such as interlinear annotation characters (U+FFF9..U+FFFB);
the object replacement character (U+FFFC); the nesting bidirectional
control characters; and the "Tag Characters" (U+E0000..U+E007F)."

Page 37, fourth paragraph: "Unlike Unicode, Bytext does not recognize
any "page break" type of control character even as an informative
property because page formatting is definitely in the domain of markup.
Since FF looks like a PEC in screen display it can produce unanticipated
results that are very frustrating if it causes unnecessary pages to
print, wasting paper." (It doesn't say that it doesn't have a character
named FORM FEED, but it does say that it doesn't have a page break
character - i.e. a form feed.)
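
On the normalization claim itself: form C still contains combining
characters wherever no precomposed character exists, so a superset of it
cannot also have "no combining characters". A minimal sketch with
Python's standard unicodedata module; the two sample strings and the
helper has_combining are my own illustration, not taken from the Bytext
document:

```python
import unicodedata

# e + COMBINING ACUTE ACCENT has a precomposed form, so NFC composes it.
composed = unicodedata.normalize("NFC", "e\u0301")

# q + COMBINING TILDE has no precomposed form, so NFC leaves it alone.
kept = unicodedata.normalize("NFC", "q\u0303")


def has_combining(text):
    """True if any character has a non-zero canonical combining class."""
    return any(unicodedata.combining(ch) != 0 for ch in text)


print(len(composed), has_combining(composed))  # 1 False
print(len(kept), has_combining(kept))          # 2 True
```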

> ASCII code points are rearranged partly so that characters like form feed
> can be quickly identified by normalization algorithms. This is far from
> "losing ASCII compatibility". 

If you're going to rearrange them, they might as well go away - many
were confusing, and better alternatives exist anyway. But moving them
breaks every program that depends on the binary values of ASCII, which
we have more or less guaranteed will stay stable under Unix. Unicode is
almost always encoded as UTF-8 under Unix for one reason - because bytes
0x00-0x7f mean exactly what they mean in ASCII. No surprises. 
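
To make that concrete, here is a small sketch of the property, using
only Python's built-in codecs. The variable names are mine:

```python
# All 128 ASCII code points, U+0000 through U+007F.
ascii_text = "".join(chr(cp) for cp in range(0x80))

as_ascii = ascii_text.encode("ascii")
as_utf8 = ascii_text.encode("utf-8")

# The two encodings produce byte-for-byte identical output.
same_bytes = as_ascii == as_utf8

# FORM FEED stays the single byte 0x0C under UTF-8.
form_feed = "\f".encode("utf-8")

print(same_bytes)       # True
print(form_feed.hex())  # 0c
```

Any encoding that moves those code points fails this check, and every
tool that looks for a raw 0x0A or 0x2F byte fails with it.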

> Also, there is no need for a new
> primitive data type to support Bytext and certainly one is not "insisted"
> upon, merely recommended.

Sorry, recommended. 

Page 6, first paragraph: "It is recommended that "uByte", as in
"unsigned byte", be the name of the data type that is used to store each
byte that is to be interpreted as Bytext. It is also recommended that
each uByte be a primitive data type in programming languages that have
primitive types."

> It is perfectly reasonable to suppose that Bytext will never catch on, but
> the mere fact that it is in an embryonic stage of development is not PROOF
> that it will never catch on. 

The fact that your standard is incoherent and you have no corporate
or governmental support is strong evidence that it will never catch on.
The fact that it's a new startup in a field of preexisting strong
contenders puts the nail in the coffin. If you want to do a grassroots
project, it has to be clear and attractive, and fill a real gap. Yet
another universal charset standard does not fill one.

> I love Linux 

As can be noted from the Microsoft Word document and the lack of an HTML
or plain-text version.

> and do not wish to disturb it's developers any further so
> perhaps this thread can continue off list for those interested.

I see no reason to move it, personally; it's not too far off topic for
the list.

Defend your system in public. If it is to become successful, it will
have to be defended in public, with an understanding of, and answers to,
the standard objections.

How about an example? Say, "ᎰᎵ hat Musik gut gehört." What does that
look like bytewise in Bytext?
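
For comparison, this is how to get the same sentence bytewise in UTF-8.
It is only a sketch of the Unicode side: it says nothing about Bytext's
byte layout, which is the part I'm asking for. It assumes Python 3.8 or
later (for the separator argument to hex()), and the names are mine:

```python
import unicodedata

# The sample sentence from this message, normalized to NFC first.
sentence = unicodedata.normalize("NFC", "ᎰᎵ hat Musik gut gehört.")
encoded = sentence.encode("utf-8")

# One row per character: code point, UTF-8 bytes in hex, Unicode name.
rows = [
    (f"U+{ord(ch):04X}", ch.encode("utf-8").hex(" "), unicodedata.name(ch, "<unnamed>"))
    for ch in sentence
]

for code_point, utf8_hex, name in rows:
    print(f"{code_point}  {utf8_hex:<8}  {name}")

print(len(sentence), "code points,", len(encoded), "bytes")
```

The ASCII letters come out as one byte each, unchanged; only the
non-ASCII characters take more.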

-- 
David Starner - [EMAIL PROTECTED], dvdeug/jabber.com (Jabber)
Pointless website: http://dvdeug.dhis.org
What we've got is a blue-light special on truth. It's the hottest thing 
with the youth. -- Information Society, "Peace and Love, Inc."
--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
