Re: [HACKERS] Fixed length data types issue

mark Fri, 08 Sep 2006 11:33:15 -0700

On Fri, Sep 08, 2006 at 12:57:29PM -0400, Tom Lane wrote:
> Martijn van Oosterhout <[email protected]> writes:
> >> AFAICT, most of the useful operations work on UChar, which is uint16:
> >> http://icu.sourceforge.net/apiref/icu4c/umachine_8h.html#6bb9fad572d65b305324ef288165e2ac
> > Oh, you're confusing UCS-2 with UTF-16,
> Ah, you're right, I did misunderstand that.  However, it's still
> apparently the case that ICU works mostly with UTF16 and handles other
> encodings only via conversion to UTF16.  That's a pretty serious
> mismatch with our needs --- we'll end up converting to UTF16 all the
> time.  We're certainly not going to change to using UTF16 as the actual
> native string representation inside the backend, both because of the
> space penalty and incompatibility with tools like bison.


I think I've been involved in a discussion like this in the past. Was
it mentioned on this list before? Yes, the UTF-8 vs UTF-16 mismatch
means that UTF-8 applications are at a disadvantage when using the
library. UTF-16 is considered more efficient to work with for everybody
except ASCII users. :-)
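
To put rough numbers on the space trade-off, here is a minimal Python
sketch that compares encoded sizes for a few sample strings. The helper
name encoded_sizes and the sample strings are made up for illustration;
nothing here comes from PostgreSQL or ICU.

```python
# Hypothetical helper, for illustration only: compare how many bytes the
# same text occupies in UTF-8 and in UTF-16 (little-endian, no BOM).
def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for the given string."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))

# Sample strings are assumptions chosen to cover different ranges.
samples = {
    "ascii": "SELECT 1",              # plain ASCII
    "latin": "caf\u00e9",             # one non-ASCII Latin letter
    "cjk": "\u65e5\u672c\u8a9e",      # three CJK characters
    "non-bmp": "\U0001D11E",          # one character outside the BMP
}

for name, text in samples.items():
    utf8_len, utf16_len = encoded_sizes(text)
    print(f"{name}: utf-8={utf8_len} bytes, utf-16={utf16_len} bytes")
```

ASCII text doubles in size under UTF-16, CJK text shrinks, and the
non-BMP character needs a surrogate pair (two 16-bit units), which is
the difference between UTF-16 and fixed-width UCS-2.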

No opinion on the matter though. Changing PostgreSQL to UTF-16 would
be an undertaking... :-)

Cheers,
mark

-- 
__________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...

                           http://mark.mielke.cc/


