Re: [Firebird-devel] Unicode UTF-16 etc

Ann Harrison Sat, 31 Aug 2013 05:20:09 -0700


On Aug 31, 2013, at 4:55 AM, Mark Rotteveel <m...@lawinegevaar.nl> wrote:


> On 29-8-2013 17:41, Jim Starkey wrote:
>> Paradoxically, Japanese strings tend to be shorter in UTF-8 than in 16 bit
>> Unicode.  The reason is simple: There are enough single byte characters
>> -- punctuation, control characters, and digits -- that stay as single bytes;
>> double byte characters are a wash, and the single byte characters
>> generally balance the number of three byte characters.
>> 
>> UTF-16 is a mess with nasty problems of endians, multi-word characters,
>> and illegal codepoints to worry about.
>> 
> 
> Unfortunately the implementation of UTF-8 in Firebird is annoying 
> because it reduces the maximum allowed number of characters to 1/4 of 
> that for single byte character sets, making it necessary to switch to 
> blobs sooner.

A better solution is to change the implementation of CHAR and VARCHAR to accept 
longer strings.   

Cheers,

Ann
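
The two points quoted above (per-character byte counts in UTF-8 versus UTF-16, and the 1/4 character limit) can be checked with a short Python sketch. It is illustrative only: the helper names are hypothetical, and the 32,765-byte VARCHAR limit and the 4-bytes-per-character reservation for UTF8 are assumptions about Firebird's behaviour, not something stated in this thread.

```python
def encoded_sizes(text: str) -> dict[str, int]:
    """Return the byte length of text in UTF-8 and in UTF-16 (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }


def max_chars(byte_limit: int, max_bytes_per_char: int) -> int:
    """Characters that fit if every character reserves the worst-case width."""
    return byte_limit // max_bytes_per_char


# Assumption: VARCHAR columns are capped at 32,765 bytes.
VARCHAR_BYTE_LIMIT = 32765

if __name__ == "__main__":
    # ASCII stays at 1 byte in UTF-8 but takes 2 in UTF-16;
    # kana and kanji take 3 bytes in UTF-8 and 2 in UTF-16.
    for sample in ("abc 123", "日本語", "価格: 1,500円 (税込)"):
        print(sample, encoded_sizes(sample))

    # Single byte character set versus a 4-byte worst case for UTF-8.
    print(max_chars(VARCHAR_BYTE_LIMIT, 1))
    print(max_chars(VARCHAR_BYTE_LIMIT, 4))
```

Whether a given Japanese string comes out shorter in UTF-8 depends on its mix of ASCII and three byte characters, so it is worth running the first loop on representative data rather than relying on a single sample.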




Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel
