Well, is this right? I'm not sure about the status of the single-byte range in Big-5, though.

diff -urN ucm~/big5-eten.ucm ucm/big5-eten.ucm
--- ucm~/big5-eten.ucm	Thu Jan 23 23:21:00 2003
+++ ucm/big5-eten.ucm	Tue Mar 25 21:43:00 2003
@@ -137,38 +137,6 @@
 <U007E> \x7E |0 # TILDE
 <U007F> \x7F |0 # DELETE
 <U0080> \x80 |0 # <control>
-<U0081> \x81 |0 # <control>
-<U0082> \x82 |0 # BREAK PERMITTED HERE
-<U0083> \x83 |0 # NO BREAK HERE
-<U0084> \x84 |0 # <control>
-<U0085> \x85 |0 # NEXT LINE
-<U0086> \x86 |0 # START OF SELECTED AREA
-<U0087> \x87 |0 # END OF SELECTED AREA
-<U0088> \x88 |0 # CHARACTER TABULATION SET
-<U0089> \x89 |0 # CHARACTER TABULATION WITH JUSTIFICATION
-<U008A> \x8A |0 # LINE TABULATION SET
-<U008B> \x8B |0 # PARTIAL LINE DOWN
-<U008C> \x8C |0 # PARTIAL LINE UP
-<U008D> \x8D |0 # REVERSE LINE FEED
-<U008E> \x8E |0 # SINGLE SHIFT TWO
-<U008F> \x8F |0 # SINGLE SHIFT THREE
-<U0090> \x90 |0 # DEVICE CONTROL STRING
-<U0091> \x91 |0 # PRIVATE USE ONE
-<U0092> \x92 |0 # PRIVATE USE TWO
-<U0093> \x93 |0 # SET TRANSMIT STATE
-<U0094> \x94 |0 # CANCEL CHARACTER
-<U0095> \x95 |0 # MESSAGE WAITING
-<U0096> \x96 |0 # START OF GUARDED AREA
-<U0097> \x97 |0 # END OF GUARDED AREA
-<U0098> \x98 |0 # START OF STRING
-<U0099> \x99 |0 # <control>
-<U009A> \x9A |0 # SINGLE CHARACTER INTRODUCER
-<U009B> \x9B |0 # CONTROL SEQUENCE INTRODUCER
-<U009C> \x9C |0 # STRING TERMINATOR
-<U009D> \x9D |0 # OPERATING SYSTEM COMMAND
-<U009E> \x9E |0 # PRIVACY MESSAGE
-<U009F> \x9F |0 # APPLICATION PROGRAM COMMAND
-<U00A0> \xA0 |0 # NO-BREAK SPACE
 <U00A7> \xA1\xB1 |0
 <U00A8> \xC6\xD8 |0
 <U00AF> \xA1\xC2 |0
@@ -178,11 +146,6 @@
 <U00D7> \xA1\xD1 |0
 <U00F7> \xA1\xD2 |0
 <U00F8> \xC8\xFB |0
-<U00FA> \xFA |0 # LATIN SMALL LETTER U WITH ACUTE
-<U00FB> \xFC |0 # LATIN SMALL LETTER U WITH CIRCUMFLEX
-<U00FD> \xFD |0 # LATIN SMALL LETTER Y WITH ACUTE
-<U00FE> \xFE |0 # LATIN SMALL LETTER THORN
-<U00FF> \xFF |0 # LATIN SMALL LETTER Y WITH DIAERESIS
 <U014B> \xC8\xFC |0
 <U0153> \xC8\xFA |0
 <U0250> \xC8\xF6 |0
diff -urN ucm~/big5-hkscs.ucm ucm/big5-hkscs.ucm
--- ucm~/big5-hkscs.ucm	Thu Jan 23 23:21:02 2003
+++ ucm/big5-hkscs.ucm	Tue Mar 25 21:37:10 2003
@@ -136,13 +136,6 @@
 <U007E> \x7E |0 # TILDE
 <U007F> \x7F |0 # DELETE
 <U0080> \x80 |0 # <control>
-<U0081> \x81 |0 # <control>
-<U0082> \x82 |0 # BREAK PERMITTED HERE
-<U0083> \x83 |0 # NO BREAK HERE
-<U0084> \x84 |0 # <control>
-<U0085> \x85 |0 # NEXT LINE
-<U0086> \x86 |0 # START OF SELECTED AREA
-<U0087> \x87 |0 # END OF SELECTED AREA
 <U00A7> \xA1\xB1 |0
 <U00A8> \xC6\xD8 |0
 <U00AF> \xA1\xC2 |0
@@ -171,7 +164,6 @@
 <U00F9> \x88\x7B |0
 <U00FA> \x88\x79 |0
 <U00FC> \x88\xA2 |0
-<U00FF> \xFF |0 # LATIN SMALL LETTER Y WITH DIAERESIS
 <U0100> \x88\x56 |0
 <U0101> \x88\x67 |0
 <U0112> \x88\x5A |0

Regards,
SADAHIRO Tomoyuki

> I often encounter lower-ascii codes mixed in with Big5 text, which is fine
> and straightforward to handle. However, a problem arises when upper
> ascii occasionally occur outside of the Big5 range. When such a
> character occurs, this is probably an error or part of a user-defined
> character.
> However, it appears that Encode DOES NOT display warnings for these but
> rather maps individual upper ascii to conventional characters such as
> Roman letters with diacritics commonly found in European languages.
> (It appears that Encode displays warnings for characters that are within
> the Big5 range, but do not have a mapping to Unicode, perhaps because
> these code points are not used in Big5 itself.)
>
> Is there a way to cause Encode to display warnings for upper ascii outside
> of the Big5 range when converting from Big5 to Unicode? If not, could the
> developers consider this for a future fix?
>
> Mark
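
As a rough illustration of the kind of check Mark describes (not how Encode itself works), here is a small Python sketch that scans a byte string and reports high bytes that do not begin a well-formed two-byte sequence. The function name is hypothetical, and the byte ranges are assumptions taken from the commonly cited Big5 layout (lead byte 0x81-0xFE, trail byte 0x40-0x7E or 0xA1-0xFE); a given Big5 variant may use a narrower lead range, so the bounds are parameters. It checks structure only and says nothing about whether a two-byte code actually has a Unicode mapping.

```python
def find_stray_high_bytes(data: bytes, lead_lo: int = 0x81, lead_hi: int = 0xFE):
    """Return (offset, byte) pairs for bytes >= 0x80 that do not start a
    well-formed two-byte sequence.

    The lead/trail ranges are assumptions, not taken from the .ucm files.
    """
    strays = []
    i = 0
    n = len(data)
    while i < n:
        b = data[i]
        if b < 0x80:
            # Plain ASCII: always a single byte.
            i += 1
            continue
        is_lead = lead_lo <= b <= lead_hi
        has_trail = i + 1 < n and (
            0x40 <= data[i + 1] <= 0x7E or 0xA1 <= data[i + 1] <= 0xFE
        )
        if is_lead and has_trail:
            # Structurally valid double-byte code: skip both bytes.
            i += 2
        else:
            # High byte that cannot be read as a lead byte plus trail byte.
            strays.append((i, b))
            i += 1
    return strays
```

Calling `find_stray_high_bytes(raw)` before conversion gives a list of offsets that a caller could turn into warnings; passing `lead_lo=0xA1, lead_hi=0xF9` would restrict the check to a narrower lead-byte range if that is what the target variant uses.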