[Lots of StrNCopy stuff snipped]

>>  I think this will also work for an encoding such as UTF-8, which might
>>  have three or even four bytes per character. In that case, if the
>>  string you were copying was composed of characters with the following
>>  number of bytes:
>>
>>  <1><1><1><1><1><1><1><3>
>>
>>  And you called StrNCopy(dst, string, 9), then what StrNCopy would do
>>  is copy the first seven bytes, and pad the remaining two bytes with
>>  nulls.
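
For anyone following along, here is a rough sketch of the behavior 
described in the quote above. It is not the Palm OS source: 
copy_whole_chars and seq_len are made-up names, and the code assumes 
the input is UTF-8. It copies only whole characters that fit in n 
bytes and fills the rest with nulls.

```c
#include <stddef.h>
#include <string.h>

/* Byte length of the UTF-8 sequence that starts with lead byte b.
   Hypothetical helper; a stray or invalid byte is treated as length 1. */
size_t seq_len(unsigned char b)
{
    if (b < 0x80) return 1;
    if ((b & 0xE0) == 0xC0) return 2;
    if ((b & 0xF0) == 0xE0) return 3;
    if ((b & 0xF8) == 0xF0) return 4;
    return 1;
}

/* Copy as many whole characters from src as fit in n bytes, then pad
   the rest of the n bytes with nulls. Returns the number of bytes of
   character data copied. Like strncpy, it leaves no terminator if the
   data fills all n bytes. */
size_t copy_whole_chars(char *dst, const char *src, size_t n)
{
    size_t used = 0;

    while (src[used] != '\0') {
        size_t len = seq_len((unsigned char)src[used]);
        size_t k;

        /* Do not read past the terminator if the last sequence is cut short. */
        for (k = 1; k < len; k++) {
            if (src[used + k] == '\0') {
                len = k;
                break;
            }
        }
        if (used + len > n)
            break;              /* next character would not fit whole */
        used += len;
    }

    memcpy(dst, src, used);
    memset(dst + used, 0, n - used);
    return used;
}
```

With seven one-byte characters followed by one three-byte character 
and n = 9, this copies seven bytes and writes two nulls, which matches 
the example in the quote.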

[snip]

>FYI...according to the PHP reference manuals, a UTF-8 character can be
>up to 6 bytes long....I would assume this is a UTF-8 attribute rather
>than something to do with PHP

The reference manual is wrong. The Unicode Standard Version 3.0 
clearly states (page 47 in the printed version) that UTF-8 is the 
Unicode Transformation Format that serializes a Unicode scalar value 
as a sequence of one to four bytes.
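
To make the one-to-four-byte rule concrete, here is a minimal sketch 
of a UTF-8 encoder for a single scalar value. The function names 
utf8_len and utf8_encode are my own, not from any Palm OS or PHP API, 
and the code treats surrogates and anything above 0x10FFFF as invalid.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of bytes UTF-8 needs for one Unicode scalar value.
   Returns 0 if cp is not a scalar value (a surrogate in
   0xD800..0xDFFF, or anything above 0x10FFFF). */
size_t utf8_len(uint32_t cp)
{
    if (cp < 0x80) return 1;
    if (cp < 0x800) return 2;
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;
    if (cp < 0x10000) return 3;
    if (cp <= 0x10FFFF) return 4;
    return 0;
}

/* Encode cp into out, which must have room for 4 bytes.
   Returns the number of bytes written, or 0 if cp is invalid. */
size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    size_t n = utf8_len(cp);

    switch (n) {
    case 1:
        out[0] = (unsigned char)cp;
        break;
    case 2:
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        break;
    case 3:
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        break;
    case 4:
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        break;
    default:
        break;                  /* invalid input: nothing written */
    }
    return n;
}
```

The largest scalar value, 0x10FFFF, comes out as four bytes, so no 
valid character needs more than that.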

The confusion probably comes from various implementations (primarily 
Oracle) that encode Unicode scalar values > 0xFFFF as a surrogate 
pair (two 16-bit Unicode values), where each half of the pair is 
written as its own three-byte sequence, for six bytes in total. I 
think they were pushing for this to be called "UTF-8S".
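
Here is a small sketch of how that surrogate-pair scheme ends up at 
six bytes. This is only my illustration of the idea, not Oracle's 
code; put3 and encode_via_surrogates are made-up names. Note that the 
output is not valid UTF-8, since real UTF-8 never encodes surrogates.

```c
#include <stddef.h>
#include <stdint.h>

/* Write one 16-bit value using the three-byte UTF-8 bit pattern. */
void put3(uint16_t u, unsigned char *out)
{
    out[0] = (unsigned char)(0xE0 | (u >> 12));
    out[1] = (unsigned char)(0x80 | ((u >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (u & 0x3F));
}

/* Encode a scalar value in 0x10000..0x10FFFF by first splitting it
   into a UTF-16 surrogate pair, then writing each surrogate as three
   bytes. Returns 6 on success, or 0 if cp is outside that range. */
size_t encode_via_surrogates(uint32_t cp, unsigned char out[6])
{
    uint32_t v;
    uint16_t hi, lo;

    if (cp < 0x10000 || cp > 0x10FFFF)
        return 0;

    v  = cp - 0x10000;                      /* 20 significant bits */
    hi = (uint16_t)(0xD800 + (v >> 10));    /* top 10 bits */
    lo = (uint16_t)(0xDC00 + (v & 0x3FF));  /* bottom 10 bits */

    put3(hi, out);
    put3(lo, out + 3);
    return 6;
}
```

For 0x10000 this produces ED A0 80 ED B0 80, where standard UTF-8 
produces the four bytes F0 90 80 80.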

-- Ken
-- 
Ken Krugler
TransPac Software, Inc.
<http://www.transpac.com>
+1 530-470-9200

-- 
For information on using the Palm Developer Forums, or to unsubscribe, please see 
http://www.palmos.com/dev/support/forums/
