24-Mar-2014 04:44, Simen Kjærås wrote:
On 2014-03-24 00:32, Mike wrote:
On Sunday, 23 March 2014 at 21:23:18 UTC, Andrei Alexandrescu wrote:
Here's a baseline: http://goo.gl/91vIGc. Destroy!
Andrei
This example only considers encodings of up to 4 bytes, but UTF-8 can
encode code points in as many as 6 bytes. Is that not a concern?
Mike
RFC 3629 (http://tools.ietf.org/html/rfc3629) restricted UTF-8 to
conform to constraints in UTF-16, removing all 5- and 6-byte sequences.
More importantly, the Unicode standard explicitly fixed the range of code
points to what is representable in UTF-16 (up to U+10FFFF), starting with
the 5th version of the standard, if memory serves me right.
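
To make the 4-byte limit concrete, here is a minimal sketch in Python (not the code from Andrei's link). The helper names utf8_sequence_length and decodes_as_utf8 are made up for illustration; the lead-byte ranges are the ones RFC 3629 allows, and the decoding check relies on Python's built-in strict UTF-8 codec.

```python
def utf8_sequence_length(lead_byte: int):
    """Return the sequence length implied by a UTF-8 lead byte,
    or None if the byte cannot start a sequence under RFC 3629.
    (Hypothetical helper, for illustration only.)"""
    if 0x00 <= lead_byte <= 0x7F:
        return 1  # ASCII
    if 0xC2 <= lead_byte <= 0xDF:
        return 2
    if 0xE0 <= lead_byte <= 0xEF:
        return 3
    if 0xF0 <= lead_byte <= 0xF4:
        return 4
    # Continuation bytes (0x80-0xBF), overlong leads (0xC0, 0xC1),
    # and 0xF5-0xFF, which includes the old 5- and 6-byte leads.
    return None


def decodes_as_utf8(data: bytes) -> bool:
    """Check whether Python's strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The highest code point, U+10FFFF, needs exactly 4 bytes.
MAX_CODE_POINT = 0x10FFFF
encoded_max = chr(MAX_CODE_POINT).encode("utf-8")

# A 5-byte sequence in the pre-RFC 3629 style, starting with lead byte 0xF8.
old_five_byte = bytes([0xF8, 0x88, 0x80, 0x80, 0x80])

if __name__ == "__main__":
    print(len(encoded_max), encoded_max.hex())
    print(utf8_sequence_length(0xF8), decodes_as_utf8(old_five_byte))
```

So a decoder that handles sequences of 1 to 4 bytes covers every valid code point; anything starting with 0xF5 or above can simply be rejected as invalid input.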
--
Dmitry Olshansky