I'd like to try to gauge the community's interest, if any, in some possible updates to UTS #6 and the SCSU mechanism, as follows:

(1) Updating the spec to add dynamic-window offsets 0xA8 through 0xBF, to permit encoding the blocks from U+A000 through U+ABFF in single-byte mode. This would allow the many small alphabets assigned to this range, such as Bamum and Syloti Nagri and Phags-Pa, to be encoded efficiently using SCSU. Other offsets could be added as well, such as for Hangul Jamo Extended-B.

(2) Updating the spec to assign "reserved" tag bytes 0x0C (single-byte mode) and 0xF2 (Unicode mode) as "reset all" commands, similar to 0xFF in BOCU-1. This would allow more efficient encoding in some cases, as well as providing a possible synchronization mechanism for decoders. As an alternative, these unused tag bytes could be released for normal, non-reserved use, so they would no longer require escaping.

(3) Providing an informational section in UTS #6 on "line-safe SCSU," a special-purpose SCSU encoding profile in which all state is returned to the default at the end of each line, and all lines are terminated with CR/LF.

I'm aware that many people have been discouraging the use of SCSU altogether, on the basis of Web-page security concerns or the reputation of SCSU as "difficult to implement." These people will not be affected one way or another by any enhancements to SCSU, and I am not focusing on them at present.

--
Doug Ewell | Thornton, Colorado, USA | http://www.ewellic.org
RFC 5645, 4645, UTN #14 | ietf-languages @ is dot gd slash 2kf0s
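
For illustration only, the offset arithmetic implied by item (1) might be sketched as below. The first two branches follow the existing dynamic-window offset rules in UTS #6; the third branch is an assumption, namely one linear formula (constant 0x4C00) that happens to map bytes 0xA8 through 0xBF onto U+A000 through U+ABFF in 128-character windows. The function name `window_offset` is hypothetical, and the fixed offsets at 0xF9 through 0xFF are left out for brevity.

```python
def window_offset(x: int) -> int:
    """Return the first code point of the dynamic window selected by offset byte x.

    Sketch only: the 0xA8..0xBF branch is NOT part of UTS #6. It is one possible
    mapping consistent with the range proposed in item (1).
    """
    if 0x01 <= x <= 0x67:
        # Existing rule: windows starting at U+0080 through U+3380
        return x * 0x80
    if 0x68 <= x <= 0xA7:
        # Existing rule: windows starting at U+E000 through U+FF80
        return x * 0x80 + 0xAC00
    if 0xA8 <= x <= 0xBF:
        # Hypothetical rule: windows starting at U+A000 through U+AB80
        return x * 0x80 + 0x4C00
    # 0x00 and the remaining bytes (including the fixed offsets 0xF9..0xFF)
    # are not handled in this sketch.
    raise ValueError(f"offset byte 0x{x:02X} not handled in this sketch")


if __name__ == "__main__":
    for b in (0xA8, 0xB5, 0xB8, 0xBF):
        print(f"0x{b:02X} -> U+{window_offset(b):04X}")
```

Under this assumed mapping, offset byte 0xB8 would select the window at U+A800, which covers both Syloti Nagri (U+A800..U+A82F) and Phags-pa (U+A840..U+A87F), and 0xB5 would select U+A680, which covers the start of Bamum (U+A6A0).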

