On 31-8-2013 13:15, Dimitry Sibiryakov wrote:
> 31.08.2013 10:55, Mark Rotteveel wrote:
>> I'd prefer to have an option to use UTF-16 (treated as a 2-byte
>> character set with surrogate pairs) as that will only halve the maximum
>> allowed number of characters.
>
>     Nope. If you take into account surrogates, UTF-16 will have the same
> maximum of 4 bytes per character.
>

You are missing my point. There are two ways to consider UTF-16: one is 
your interpretation, where each character is 2 or 4 bytes; the other is to 
treat it as 2-byte 'characters' (code units), where some codepoints are 
built from a surrogate pair (which essentially means that some codepoints 
require two 'characters' that don't make much sense in isolation).

As most languages don't need those surrogate pairs for their 
codepoints/glyphs, it is easier to consider UTF-16 to be a 2-byte 
encoding. As far as I know, this is how most UTF-16 implementations handle it.
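
To make the distinction concrete, here is a minimal Python sketch of the two ways of counting. The helper names utf16_code_units and code_points are made up for this illustration and are not part of Firebird or any library; the only real call used is Python's str.encode.

```python
def utf16_code_units(text: str) -> int:
    """Count the 2-byte 'characters' (code units) needed to store text as UTF-16."""
    # "utf-16-le" encodes without a byte order mark, so bytes / 2 = code units.
    return len(text.encode("utf-16-le")) // 2


def code_points(text: str) -> int:
    """Count Unicode codepoints; a Python str is a sequence of codepoints."""
    return len(text)


bmp_text = "Firebird"        # every codepoint fits in one 2-byte unit
astral_text = "\U0001F600"   # U+1F600 needs a surrogate pair in UTF-16

print(utf16_code_units(bmp_text), code_points(bmp_text))        # 8 8
print(utf16_code_units(astral_text), code_points(astral_text))  # 2 1
```

Under the 2-byte view, a length limit would be counted in code units, so the second string would take up two 'characters' of that limit even though it is a single codepoint.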

Mark
-- 
Mark Rotteveel

Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel
