Pre-proposal for SCSU updates

Doug Ewell Mon, 01 Nov 2010 15:04:05 -0700

I'd like to try to gauge the community's interest, if any, in some
possible updates to UTS #6 and the SCSU mechanism, as follows:


(1)  Updating the spec to add dynamic-window offsets 0xA8 through 0xBF,
to permit encoding the blocks from U+A000 through U+ABFF in single-byte
mode.  This would allow the many small scripts assigned to this range,
such as Bamum, Syloti Nagri, and Phags-pa, to be encoded efficiently
using SCSU.  Other offsets could be added as well, such as for Hangul
Jamo Extended-B.
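
As a rough illustration of item (1), the sketch below computes the
window start for a dynamic-window offset byte.  The first two ranges
and the fixed offsets 0xF9 through 0xFF follow the existing table in
UTS #6.  The 0xA8..0xBF branch is only an assumption: this message does
not spell out a mapping, so the code supposes the obvious linear one,
with 0xA8 selecting U+A000 and each later byte advancing by 0x80.  The
names window_offset and extended are hypothetical.

```python
# Fixed offsets for bytes 0xF9..0xFF, as listed in UTS #6.
FIXED_OFFSETS = {
    0xF9: 0x00C0, 0xFA: 0x0250, 0xFB: 0x0370, 0xFC: 0x0530,
    0xFD: 0x3040, 0xFE: 0x30A0, 0xFF: 0xFF60,
}


def window_offset(x, extended=False):
    """Return the window start selected by offset byte x, or None if
    x is reserved. `extended` enables the proposed 0xA8..0xBF range."""
    if 0x01 <= x <= 0x67:
        return x * 0x80                    # U+0080 .. U+3380
    if 0x68 <= x <= 0xA7:
        return x * 0x80 + 0xAC00           # U+E000 .. U+FF80
    if extended and 0xA8 <= x <= 0xBF:
        # ASSUMPTION: linear mapping, not taken from any published spec.
        return (x - 0xA8) * 0x80 + 0xA000  # U+A000 .. U+AB80
    return FIXED_OFFSETS.get(x)            # None for reserved bytes
```

Under that assumed mapping, Syloti Nagri (starting at U+A800) would be
reached with offset byte 0xB8, while a decoder without the extension
would still see 0xB8 as reserved.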

(2)  Updating the spec to assign "reserved" tag bytes 0x0C (single-byte
mode) and 0xF2 (Unicode mode) as "reset all" commands, similar to 0xFF
in BOCU-1.  This would allow more efficient encoding in some cases, as
well as providing a possible synchronization mechanism for decoders.  As
an alternative, these unused tag bytes could be released for normal,
non-reserved use, so they would no longer require escaping.
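
To make the "reset all" idea in item (2) concrete, here is a minimal
sketch of the decoder state such a command would restore.  The default
dynamic-window positions are the ones given in UTS #6.  The tag
assignments RESET_SINGLE and RESET_UNICODE, the DecoderState class, and
handle_tag are all hypothetical; nothing here is part of the current
spec.

```python
from dataclasses import dataclass, field

# Default dynamic-window positions from UTS #6.
DEFAULT_DYNAMIC = (0x0080, 0x00C0, 0x0400, 0x0600,
                   0x0900, 0x3040, 0x30A0, 0xFF00)

# HYPOTHETICAL: the proposed "reset all" tags (reserved bytes today).
RESET_SINGLE = 0x0C    # single-byte mode
RESET_UNICODE = 0xF2   # Unicode mode


@dataclass
class DecoderState:
    unicode_mode: bool = False
    active: int = 0    # index of the active dynamic window
    windows: list = field(default_factory=lambda: list(DEFAULT_DYNAMIC))

    def reset(self):
        """Return every piece of state to its initial value."""
        self.unicode_mode = False
        self.active = 0
        self.windows = list(DEFAULT_DYNAMIC)


def handle_tag(state, byte):
    """Call only where the decoder expects a tag byte (in Unicode mode,
    the first byte of a pair). Returns True if a reset was performed."""
    expected = RESET_UNICODE if state.unicode_mode else RESET_SINGLE
    if byte == expected:
        state.reset()
        return True
    return False
```

A decoder that joins a stream at such a byte could call reset() and
continue from a known state, which is the synchronization use
mentioned above.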

(3)  Providing an informational section in UTS #6 on "line-safe SCSU," a
special-purpose SCSU encoding profile in which all state is returned to
the default at the end of each line, and all lines are terminated with
CR/LF.
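
One way an encoder might satisfy the line-safe profile in item (3)
using only tags that already exist is sketched below.  It emits
whatever commands are needed to return to the default state and then
appends CR/LF.  The tag values (UC0, SC0, SD0) and the default offset
bytes are from UTS #6; the function end_of_line and its way of tracking
each window by the offset byte that last defined it are illustrative
assumptions (a window defined through SDX can be passed as None).

```python
# Offset bytes that select the default dynamic windows
# (U+0080, U+00C0, U+0400, U+0600, U+0900, U+3040, U+30A0, U+FF00).
DEFAULT_OFFSET_BYTES = (0x01, 0xF9, 0x08, 0x0C, 0x12, 0xFD, 0xFE, 0xA6)

UC0 = 0xE0   # Unicode mode: switch to single-byte mode, window 0 active
SC0 = 0x10   # single-byte mode: make window 0 active
SD0 = 0x18   # single-byte mode: define window n (SD0 + n), make it active


def end_of_line(unicode_mode, active, offset_bytes):
    """Return the bytes that restore default state, followed by CR/LF."""
    out = bytearray()
    if unicode_mode:
        out.append(UC0)
        active = 0
    for n, (cur, default) in enumerate(zip(offset_bytes,
                                           DEFAULT_OFFSET_BYTES)):
        if cur != default:
            out += bytes((SD0 + n, default))  # redefine window n
            active = n                        # SDn also activates it
    if active != 0:
        out.append(SC0)
    out += b"\r\n"
    return bytes(out)
```

For a line that ends in the default state this adds nothing beyond
CR/LF.  In the worst case it costs a few bytes per redefined window,
which is where a single-byte reset command like the one in item (2)
could help.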

I'm aware that many people have been discouraging the use of SCSU
altogether, on the basis of Web-page security concerns or the reputation
of SCSU as "difficult to implement."  These people will not be affected
one way or another by any enhancements to SCSU, and I am not focusing on
them at present.

--
Doug Ewell | Thornton, Colorado, USA | http://www.ewellic.org
RFC 5645, 4645, UTN #14 | ietf-languages @ is dot gd slash 2kf0s
