On 31-8-2013 13:15, Dimitry Sibiryakov wrote:
> 31.08.2013 10:55, Mark Rotteveel wrote:
>> I'd prefer to have an option to use UTF-16 (treated as a 2-byte
>> character set with surrogate pairs) as that will only halve the maximum
>> allowed number of characters.
>
> Nope. If you take into account surrogates, UTF-16 will have the same
> maximum of 4 bytes per character.
>
You are missing my point. There are two ways to consider UTF-16: one is your interpretation, where each character is 2 to 4 bytes; the other is as 2-byte 'characters', where some codepoints are built from a surrogate pair (which essentially means that some codepoints require two 'characters', which in isolation don't make much sense). As most languages don't need those surrogate pairs for their codepoints/glyphs, it is easier to consider UTF-16 to be 2 bytes per character. As far as I know, this is how most UTF-16 implementations handle it.

Mark
-- 
Mark Rotteveel

------------------------------------------------------------------------------
Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
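
To make the two views of UTF-16 discussed above concrete, here is a minimal Python sketch. It counts the same string once in 2-byte code units (the "2-byte character" view) and once in code points (the "2 to 4 bytes per character" view). The function names and `BYTE_BUDGET` are hypothetical and chosen only for illustration; the budget is not an actual Firebird limit.

```python
def utf16_code_units(text: str) -> int:
    """Count 2-byte UTF-16 code units (the '2-byte character' view)."""
    # utf-16-le emits no byte order mark, so every 2 bytes is one code unit.
    return len(text.encode("utf-16-le")) // 2


def code_points(text: str) -> int:
    """Count Unicode code points (the '2 to 4 bytes per character' view)."""
    # A Python 3 str is a sequence of code points.
    return len(text)


bmp_only = "Firebird"          # every code point fits in one code unit
with_pair = "a\U0001F600b"     # U+1F600 is encoded as a surrogate pair

for s in (bmp_only, with_pair):
    print(ascii(s), "code points:", code_points(s),
          "code units:", utf16_code_units(s))

# Hypothetical byte budget for a column, for illustration only.
BYTE_BUDGET = 64
print("declared length at 2 bytes per unit:", BYTE_BUDGET // 2)
print("declared length at 4 bytes per unit:", BYTE_BUDGET // 4)
```

For `"Firebird"` both counts are 8. For the second string there are 3 code points but 4 code units, because the one code point outside the Basic Multilingual Plane occupies two units. Under the code-unit view a declared length counts units, so a fixed byte budget is divided by 2 rather than by 4, at the cost that a surrogate pair consumes two positions.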