I see. Thanks for all your replies! BTW I have a further question:

On Wed, Aug 28, 2013 at 1:44 PM, Philippe Verdy <[email protected]> wrote:
> - in UTF-8, you'll need to look backward between 1 to 3 positions before
> your start position to find the leading 8-bit code unit (>= 0xC0).

Why should this be >=0xC0?

--
Best regards, Xue Fuqiao.
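
P.S. For reference, here is a minimal Python sketch of the backward scan described in the quoted message, assuming well-formed UTF-8 input. The function name find_char_start is hypothetical and not taken from any library. It steps back over continuation bytes (0x80..0xBF, bit pattern 10xxxxxx) and stops at the first byte that is not one, which is either a multi-byte lead byte (>= 0xC0) or a single-byte ASCII character (< 0x80).

```python
def find_char_start(buf: bytes, pos: int) -> int:
    """Return the index of the first byte of the UTF-8 sequence containing buf[pos].

    Assumes buf holds well-formed UTF-8 and 0 <= pos < len(buf).
    """
    start = pos
    # A UTF-8 sequence is at most 4 bytes long, so at most 3 backward
    # steps are needed. Continuation bytes match (byte & 0xC0) == 0x80.
    while start > 0 and pos - start < 3 and (buf[start] & 0xC0) == 0x80:
        start -= 1
    return start


# One character each of 1, 2, 3 and 4 bytes: "a", "é", "€", "😀".
data = "a\u00e9\u20ac\U0001F600".encode("utf-8")
start_of_euro = find_char_start(data, 5)   # index 5 is the last byte of "€"
start_of_emoji = find_char_start(data, 9)  # index 9 is the last byte of "😀"
```

In this sketch the byte found at the returned index is >= 0xC0 whenever the starting position was inside a multi-byte sequence, which matches the condition in the quoted text.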

