On 2014-10-05 14:09, Uranuz wrote:
Is there a way to detect just the first code unit of a grapheme without the overhead of using the Grapheme struct? I tried checking whether ch < 128 (for UTF-8), but this doesn't work. How can I check whether a byte is a continuation of the encoding of a single code point or the start of a new sequence?
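Have a look here [1]. For example, a code point between U+0080 and U+07FF is encoded as two bytes: a lead byte of the form 110xxxxx followed by one continuation byte of the form 10xxxxxx. In general, any byte whose top two bits are 10 (that is, 0x80 to 0xBF) is a continuation byte, and any other byte starts a new sequence. Checking ch < 128 only tells you that the byte is ASCII; the lead bytes of multi-byte sequences are all 0xC0 or above, so that test misses them.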
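A minimal sketch of that check in D, assuming the input is valid UTF-8. The helper names (isContinuation, sequenceLength, codePointStarts) are made up for illustration and are not part of Phobos. Note that this finds code point boundaries only; a grapheme may span several code points (for example a base letter plus a combining mark), so it does not replace Grapheme for that purpose.

```d
import std.stdio : writeln;
import std.string : representation;

/// True if `b` is a UTF-8 continuation byte (bit pattern 10xxxxxx).
bool isContinuation(ubyte b)
{
    return (b & 0xC0) == 0x80;
}

/// Number of bytes in the sequence that starts with lead byte `b`,
/// or 0 if `b` cannot start a sequence (continuation byte or 0xF8..0xFF).
size_t sequenceLength(ubyte b)
{
    if (b < 0x80)           return 1; // 0xxxxxxx: ASCII
    if ((b & 0xE0) == 0xC0) return 2; // 110xxxxx
    if ((b & 0xF0) == 0xE0) return 3; // 1110xxxx
    if ((b & 0xF8) == 0xF0) return 4; // 11110xxx
    return 0;
}

/// Byte indices at which a new code point begins in `s`.
size_t[] codePointStarts(string s)
{
    size_t[] starts;
    // representation() views the string as raw ubyte code units.
    foreach (i, b; s.representation)
        if (!isContinuation(b))
            starts ~= i;
    return starts;
}

void main()
{
    // 'a' (1 byte), U+00E9 (2 bytes), U+20AC (3 bytes)
    writeln(codePointStarts("a\u00E9\u20AC")); // prints [0, 1, 3]
}
```

If you only need the length of the sequence at a given index, std.utf.stride does the same job as the hypothetical sequenceLength above and also validates the lead byte. For grapheme boundaries rather than code point boundaries, std.uni.graphemeStride may be a cheaper alternative to building Grapheme values, though I have not measured it.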
[1] http://en.wikipedia.org/wiki/UTF-8#Description

--
/Jacob Carlborg