On 28/08/13 23:29, Xue Fuqiao wrote:
> I see. Thanks for all your replies!
>
> BTW I have a further question:
>
> On Wed, Aug 28, 2013 at 1:44 PM, Philippe Verdy wrote:
>> - in UTF-8, you'll need to look backward between 1 to 3 positions before
>> your start position to find the leading 8-bit code unit (>= 0xC0).
>
> Why should this be >= 0xC0?

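Because the header byte of a well‐formed multi-byte UTF-8 sequence must start with at least two 1 bits, whereas continuation bytes always start with 10. Numerically, the smallest byte that starts with 11 is 16#C0# (2#1100_0000#).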
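
To make the backward scan concrete, here is a minimal sketch in Python. It assumes the input is already well-formed UTF-8, and the function name find_sequence_start is purely illustrative, not taken from any library. It steps back over continuation bytes (10xxxxxx, i.e. 16#80# to 16#BF#), at most 3 times, until it reaches either an ASCII byte or a header byte >= 16#C0#.

```python
def find_sequence_start(data: bytes, pos: int) -> int:
    """Return the index of the first byte of the UTF-8 sequence containing data[pos].

    Illustrative helper; assumes `data` is well-formed UTF-8 and 0 <= pos < len(data).
    """
    start = pos
    # A continuation byte has the bit pattern 10xxxxxx, so masking with 0xC0
    # yields 0x80. A sequence has at most 3 continuation bytes, so never step
    # back more than 3 positions.
    while start > 0 and pos - start < 3 and (data[start] & 0xC0) == 0x80:
        start -= 1
    return start


# "a€b" encodes to 61 E2 82 AC 62; index 3 (AC) lies inside the 3-byte sequence.
sample = "a€b".encode("utf-8")
start = find_sequence_start(sample, 3)
print(start, hex(sample[start]))
```

For ASCII bytes (below 16#80#) the loop does not run at all, so the start position is returned unchanged.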
-- Ian ◎

