OK, the eXperimental Transformation Format goes thus (I didn't make it clear enough):
C0, G0, C1 and NBSP (0xA0) stay the same: a single byte. All Unicode characters from U+00A1 onwards are encoded in three bytes, the first of which is in the range C2..FE, the other two in A1..C1. Thus U+00A1 = 0xC2 0xA1 0xA1.

Advantages:
1. ASCII compatibility.
2. C1 compatibility.
3. Can be reduced to a 7-bit SI/SO scheme with no control code overlap, thus being a UTF-7 without the real UTF-7's chief disadvantage of no sync.

Disadvantages:
1. No simple way of filling bits like UTF-8's 110xxxxx 10xxxxxx. I suppose this brings us back to UTF-1's modulo complexities...
2. 3 bytes for all Unicode characters above U+00A0.
3. UTF-16 surrogate piggybacking - 6 bytes per outside-BMP codepoint. Really yucky, but those characters are rare.
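
To make the "modulo complexities" concrete, here is a rough Python sketch of how the byte layout could work. The post only fixes the byte ranges and the one example U+00A1 = C2 A1 A1, so the digit order (a plain base-33 split of the offset from U+00A1, most significant digit in the lead byte) is my assumption, and the names xtf_encode and xtf_decode are made up for illustration. Malformed input is only minimally checked.

```python
BASE = 0xA1     # first value that needs three bytes
LEAD0 = 0xC2    # lead bytes run C2..FE (61 values)
TRAIL0 = 0xA1   # trail bytes run A1..C1 (33 values)
RADIX = 33      # assumed: offset is split into base-33 digits

def _encode_unit(u):
    """Encode one 16-bit unit (0..0xFFFF) as one or three bytes."""
    if u < BASE:
        return bytes([u])            # C0, G0, C1 and NBSP pass through
    hi, rest = divmod(u - BASE, RADIX * RADIX)
    mid, lo = divmod(rest, RADIX)
    return bytes([LEAD0 + hi, TRAIL0 + mid, TRAIL0 + lo])

def xtf_encode(text):
    """Encode a str; code points outside the BMP go out as two surrogates."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp > 0xFFFF:
            cp -= 0x10000
            out += _encode_unit(0xD800 + (cp >> 10))
            out += _encode_unit(0xDC00 + (cp & 0x3FF))
        else:
            out += _encode_unit(cp)
    return bytes(out)

def xtf_decode(data):
    """Decode bytes back to a str, rejoining surrogate pairs."""
    units = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < BASE:
            units.append(b)
            i += 1
            continue
        if not (LEAD0 <= b <= 0xFE) or i + 2 >= len(data):
            raise ValueError("bad or truncated sequence at offset %d" % i)
        mid, lo = data[i + 1] - TRAIL0, data[i + 2] - TRAIL0
        if not (0 <= mid < RADIX and 0 <= lo < RADIX):
            raise ValueError("bad trail byte at offset %d" % i)
        unit = BASE + ((b - LEAD0) * RADIX + mid) * RADIX + lo
        if unit > 0xFFFF:
            raise ValueError("value out of range at offset %d" % i)
        units.append(unit)
        i += 3
    # Let the UTF-16 codec recombine surrogate pairs into code points.
    raw = b"".join(u.to_bytes(2, "big") for u in units)
    return raw.decode("utf-16-be", "surrogatepass")

print(xtf_encode("\u00a1").hex(" "))   # c2 a1 a1
print(xtf_encode("\uffff").hex(" "))   # fe a2 a2
```

Since 61 * 33 * 33 = 66429 and there are only 65375 values from U+00A1 to U+FFFF, the three-byte space is big enough with a little to spare. Lead bytes, trail bytes and single bytes occupy disjoint ranges, which is where the resynchronization property comes from.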

