> -----Original Message-----
> From: Dennis Bjorklund [mailto:[EMAIL PROTECTED]
> Sent: Saturday, August 07, 2004 11:23 PM
> To: John Hansen
> Cc: Takehiko Abe; [EMAIL PROTECTED]
> Subject: RE: [PATCHES] [HACKERS] UNICODE characters above 0x10000
>
> On Sat, 7 Aug 2004, John Hansen wrote:
>
> > Now, is it really 24 bits tho?
> > Afaict, it's really 21 (0 - 10FFFF or 0 - xxx10000 11111111 11111111)
>
> Yes, up to 0x10ffff should be enough.
>
> The 24 is not really important, this is all about what utf-8
> strings to accept as input. The strings are stored as utf-8
> strings and when processed inside pg it uses wchar_t that is
> 32 bit (on some systems at least). By restricting the utf-8
> input to unicode we can in the future store each character as
> 3 bytes if we want.
Which brings us back to something like the attached...

> --
> /Dennis Björklund

Regards,

John Hansen
Attachment: wchar.c.patch
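
For illustration only, and not the contents of the attached patch: a minimal sketch of the kind of input check discussed above, which accepts a UTF-8 sequence only if it decodes to a value in 0 - 0x10FFFF. The function name utf8_decode_checked and its signature are hypothetical and are not part of PostgreSQL. The overlong and surrogate checks go beyond what the thread discusses and are included only as an assumption about what "restricting to unicode" would usually imply.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a PostgreSQL function): decode one UTF-8
 * sequence starting at s, with len bytes available.
 *
 * Returns the number of bytes consumed (1..4) and stores the code point
 * in *cp, or returns 0 if the sequence is malformed or encodes a value
 * outside 0..0x10FFFF.
 */
size_t
utf8_decode_checked(const unsigned char *s, size_t len, uint32_t *cp)
{
    /* Smallest code point that legitimately needs n bytes. */
    static const uint32_t min_for_len[5] = {0, 0, 0x80, 0x800, 0x10000};
    size_t      n;
    size_t      i;
    uint32_t    c;

    if (len == 0)
        return 0;

    /* The lead byte determines the sequence length. */
    if (s[0] < 0x80)
    {
        n = 1;
        c = s[0];
    }
    else if ((s[0] & 0xE0) == 0xC0)
    {
        n = 2;
        c = s[0] & 0x1F;
    }
    else if ((s[0] & 0xF0) == 0xE0)
    {
        n = 3;
        c = s[0] & 0x0F;
    }
    else if ((s[0] & 0xF8) == 0xF0)
    {
        n = 4;
        c = s[0] & 0x07;
    }
    else
        return 0;               /* stray continuation byte, or 5/6-byte lead */

    if (len < n)
        return 0;               /* truncated sequence */

    /* Each continuation byte must look like 10xxxxxx. */
    for (i = 1; i < n; i++)
    {
        if ((s[i] & 0xC0) != 0x80)
            return 0;
        c = (c << 6) | (uint32_t) (s[i] & 0x3F);
    }

    if (c < min_for_len[n])
        return 0;               /* overlong encoding (extra assumption) */
    if (c > 0x10FFFF)
        return 0;               /* above the 21-bit Unicode range */
    if (c >= 0xD800 && c <= 0xDFFF)
        return 0;               /* UTF-16 surrogate (extra assumption) */

    *cp = c;
    return n;
}
```

With this check the bytes F4 8F BF BF (0x10FFFF) are accepted, while F4 90 80 80 (0x110000) and any 5- or 6-byte lead are rejected, so every accepted character fits in 21 bits and therefore in 3 bytes.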