On Saturday, 15 October 2016 at 19:07:50 UTC, Patrick Schluter wrote:
> At least with that lookup table below, you can detect isolated continuation bytes (192 and 193) and invalid codes (above 244).

192 and 193 can never appear in UTF-8 text; they are overlong lead bytes, not continuation bytes. Continuation bytes are those between 128 and 191. They are not allowed in lead position, so they should be checked.



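__gshared static immutable ubyte[] charWidthTab = [ // indexed by (byte - 192)
            1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, // 192..207: 192 and 193 are invalid (overlong)
            2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, // 208..223: 2-byte sequences
            3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, // 224..239: 3-byte sequences
            4, 4, 4, 4, 4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1  // 240..244: 4-byte sequences; 245..255 are invalid
];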

Lengths 5 and 6 do not need to be tested specifically for your goto, since the table maps every byte above 244 to 1.
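
To make the checks concrete, here is a minimal sketch of how such a table could be used to classify a lead byte. The helper name leadByteWidth and the convention of returning 0 for an invalid lead byte are my assumptions, not something from the code discussed in this thread.

```d
// Same table as above, indexed by (byte - 192).
immutable ubyte[64] charWidthTab = [
    1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
    2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
    3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
    4, 4, 4, 4, 4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
];

// Hypothetical helper: returns the sequence length (1 to 4) announced by
// a lead byte, or 0 if the byte cannot start a UTF-8 sequence.
int leadByteWidth(ubyte c) pure nothrow @nogc @safe
{
    if (c < 128) return 1;  // ASCII
    if (c < 192) return 0;  // 128..191: continuation byte in lead position
    immutable int w = charWidthTab[c - 192];
    // A width of 1 for a byte >= 192 marks the invalid entries:
    // 192, 193 (overlong) and 245..255.
    return w == 1 ? 0 : w;
}

unittest
{
    assert(leadByteWidth(0x41) == 1); // 'A'
    assert(leadByteWidth(0x80) == 0); // isolated continuation byte
    assert(leadByteWidth(0xC0) == 0); // overlong lead byte
    assert(leadByteWidth(0xE2) == 3); // lead byte of a 3-byte sequence
    assert(leadByteWidth(0xF5) == 0); // above 244
}
```

This only validates the lead byte. The following bytes of a multi-byte sequence would still each need to be checked for the 128..191 range.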

