On Mon, Oct 17, 2011 at 11:54 PM, Tom Lane <[email protected]> wrote:
> Robert Haas <[email protected]> writes:
>> - Why does the second byte need special handling for 0xED and 0xF4?
>
> http://www.faqs.org/rfcs/rfc3629.html
>
> See section 4 in particular.  The underlying requirement is to disallow
> multiple representations of the same Unicode code point.

I'm still confused.  The input string is already known to be valid
UTF-8, so the second byte (if there is one) must be between 0x80 and
0xBF.  Therefore it will be neither 0xED nor 0xF4.
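
For reference, here is a minimal sketch of how the table in RFC 3629 section 4 can be read: the tighter limits apply to the second byte when the *lead* byte is 0xED or 0xF4 (and likewise 0xE0 and 0xF0), not to a second byte that itself has one of those values. The function name second_byte_range is hypothetical, and this is an illustration of the RFC's ranges, not the code from the patch under discussion.

```python
def second_byte_range(lead):
    """Return (low, high) allowed for the byte following `lead`,
    following the UTF-8 syntax in RFC 3629 section 4.
    Returns None if `lead` does not start a multi-byte sequence.
    (Hypothetical helper, for illustration only.)"""
    if 0xC2 <= lead <= 0xDF:      # 2-byte sequences
        return (0x80, 0xBF)
    if lead == 0xE0:              # excludes overlong 3-byte forms
        return (0xA0, 0xBF)
    if lead == 0xED:              # excludes surrogates U+D800..U+DFFF
        return (0x80, 0x9F)
    if 0xE1 <= lead <= 0xEF:      # remaining 3-byte leads
        return (0x80, 0xBF)
    if lead == 0xF0:              # excludes overlong 4-byte forms
        return (0x90, 0xBF)
    if lead == 0xF4:              # excludes code points above U+10FFFF
        return (0x80, 0x8F)
    if 0xF1 <= lead <= 0xF3:      # remaining 4-byte leads
        return (0x80, 0xBF)
    return None                   # ASCII, continuation, or invalid lead


def second_byte_ok(lead, second):
    """True if `second` may follow `lead` under the ranges above."""
    rng = second_byte_range(lead)
    return rng is not None and rng[0] <= second <= rng[1]
```

Under this reading, a sequence such as 0xED 0xA0 0x80 has a second byte inside 0x80..0xBF yet is still rejected, because it would encode a surrogate.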

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
