On Sun, Feb 03, 2002 at 06:15:33PM +0100, Pablo Saratxaga wrote:
> > Many of the elegant features of Unixes depend on the notion of 8 bit
> > transparency: pipe, cat, echo... the byte stream is the common denominator.
> > The functions are general purpose and thus more useful. Bytext takes this
> > elegant notion to its logical conclusion: not only can you process text
> > as bytes, you can also process bytes as text.
>
> I don't understand, how can you encode in an 8bit space all the characters
> of the world languages ?
>
> And if it is a multi-byte encoding, then it should have about the same
> problems as utf-8 or euc have when faced with byte-only utilities.

It sounds to me like any 8-bit byte sequence (hopefully excluding NULs) is
a valid character string. That doesn't sound particularly useful, though.
(So what if an arbitrary byte sequence can be displayed as random-ish
characters of equally random languages?)

If it's the case that any string of bytes is valid text, then that raises
the question of how robust the encoding is. (Seeking and resynchronization
are issues that UTF-8 solved.)

I tried to look this up, but one of the first things I saw when paging down
the Word version (after it asked me for a password but worked anyway) was:

"Unicode is messed up beyond repair."

I promptly became disgusted and closed the window. Remarks like that have
no place whatsoever in a "standard". How can he possibly wonder why he gets
negative reactions from Unicode folks when he's making comments like this?

-- 
Glenn Maynard
--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
