Re: Misuse of 8th bit [Was: My Querry]

Antoine Leca Fri, 26 Nov 2004 04:10:42 -0800

On Thursday, November 25th, 2004 08:05Z Philippe Verdy wrote:
>
> In ASCII, or in all other ISO 646 charsets, code positions are ALL in
> the range 0 to 127. Nothing is defined outside of this range, exactly
> like Unicode does not define or mandate anything for code points
> larger than 0x10FFFF, should they be stored or handled in memory with
> 21-, 24-, 32-, or 64-bit code units, more or less packed according to
> architecture or network framing constraints.
> So the question of whether an application can or cannot use the extra
> bits is left to the application, and this has no influence on the
> standard charset encoding or on the encoding of Unicode itself.


What you seem to miss here is that, given that computers are nowadays based
on 8-bit units, there was a strong move in the '80s and the '90s to
_reserve_ ALL 8 bits of the octet for characters. And what A. Freitag was
asking was precisely to avoid introducing diverging ideas about ways to
encode other classes of information in the 8th bit of an ASCII-based
storage of a character.
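
To make the risk concrete, here is a minimal sketch in Python of what goes
wrong. The flag and the two helper functions are hypothetical, invented for
illustration only; they are not taken from any real application.

```python
FLAG = 0x80  # hypothetical per-character flag kept in the 8th bit


def tag(byte):
    """Mark a 7-bit ASCII byte by setting its 8th bit."""
    return byte | FLAG


def untag(byte):
    """Split a stored byte into (character code, flag)."""
    return byte & 0x7F, bool(byte & FLAG)


# Works as long as the text is pure 7-bit ASCII.
assert untag(tag(ord("A"))) == (ord("A"), True)

# Breaks once the text is ISO 8859-1: e-acute is the single byte 0xE9.
e_acute = "\u00e9".encode("latin-1")[0]
code, flagged = untag(e_acute)
print(chr(code), flagged)  # the character is misread as a flagged 'i'
```

The byte 0xE9 loses its top bit and comes back as 0x69, so the application
silently turns an accented letter into a flagged 'i'.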

In a similar vein, I cannot agree that it could be advisable to use the
22nd, 23rd, 32nd, 63rd, etc., upper bits of the storage of a Unicode code
point. Right now, nobody sees any use for them as part of characters, but
history should have taught us to prevent this kind of optimisation from
occurring. Particularly when it is NOT defined by the standards: such a
situation leads everybody and his dog to find his own particular "optimum"
use for this "free space", and these classes of optima generally do not
agree with one another...
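
A minimal sketch of such a clash, again in Python and under invented
assumptions: two hypothetical applications A and B each claim the 22nd bit
of a 32-bit storage unit for their own purpose. Only the limit 0x10FFFF
comes from the standard; the flag names are made up.

```python
MAX_CODE_POINT = 0x10FFFF  # highest code point defined by Unicode

BOLD_A = 1 << 21     # hypothetical: application A marks "bold" here
CHECKED_B = 1 << 21  # hypothetical: application B marks "already checked"


def is_code_point(unit):
    """True only if the unit carries a code point and nothing else."""
    return 0 <= unit <= MAX_CODE_POINT


# Application A stores a bold 'x' in one 32-bit unit.
unit_from_a = ord("x") | BOLD_A

# The stored value is no longer a valid Unicode code point...
assert not is_code_point(unit_from_a)

# ...and application B reads A's "bold" as its own "already checked".
assert unit_from_a & CHECKED_B
```

A receiver that rejects every unit failing is_code_point() stays
interoperable; one that interprets the upper bits depends on a private
agreement that the standard does not provide.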


Antoine
