On 7/1/08, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Marko Kreen" <[EMAIL PROTECTED]> writes:
> > On 6/26/08, Tom Lane <[EMAIL PROTECTED]> wrote:
> >> BTW, I don't think you can use that same-length optimization for
> >> citext.  There's no reason to think that upper/lowercase pairs will
> >> have the same length all the time in multibyte encodings.
>
> > What about this code in current str_tolower():
>
> >     /* Output workspace cannot have more codes than input bytes */
> >     workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
>
> That's working with wchars, not bytes.

Ah, I missed the point of the char2wchar() line.  I'm rather unfamiliar
with the various multibyte APIs, sorry.

There's another thing I'm probably missing: does the current code handle
code points that need more than one wchar_t?  Or is it guaranteed that
they don't occur?  (Isn't wchar_t usually a 16-bit value?)

-- 
marko
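
For illustration, the sizing argument in the quoted comment can be sketched with the standard C conversion functions instead of PostgreSQL's char2wchar(). This is only a rough sketch under stated assumptions, not the actual str_tolower() code: lower_via_wchar() is a hypothetical helper, and malloc() stands in for palloc(). The wide-character workspace is bounded by the input byte count, while the output buffer is sized from the character count, since a case-converted string need not have the same byte length as the original.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper (not PostgreSQL code): lowercase a multibyte string
 * in the current LC_CTYPE locale by converting it to wchar_t and back.
 * Returns a malloc'd string, or NULL on allocation failure or if the
 * input is not valid in the current locale.
 */
char *
lower_via_wchar(const char *src)
{
    assert(src != NULL);

    size_t nbytes = strlen(src);

    /*
     * Each wide character consumes at least one input byte, so the
     * workspace never needs more than nbytes slots plus a terminator.
     */
    wchar_t *workspace = malloc((nbytes + 1) * sizeof(wchar_t));
    if (workspace == NULL)
        return NULL;

    size_t nchars = mbstowcs(workspace, src, nbytes + 1);
    if (nchars == (size_t) -1)
    {
        free(workspace);
        return NULL;
    }

    /* Case conversion is done per wchar_t unit. */
    for (size_t i = 0; i < nchars; i++)
        workspace[i] = (wchar_t) towlower((wint_t) workspace[i]);

    /*
     * Size the output from the character count, not from nbytes: the
     * lowercase form of a character may need a different number of bytes.
     */
    size_t outsize = nchars * MB_CUR_MAX + 1;
    char *result = malloc(outsize);
    if (result == NULL)
    {
        free(workspace);
        return NULL;
    }

    size_t written = wcstombs(result, workspace, outsize);
    free(workspace);
    if (written == (size_t) -1)
    {
        free(result);
        return NULL;
    }
    result[written] = '\0';
    return result;
}
```

A caller would normally run setlocale(LC_CTYPE, "") first so that the conversions follow the environment's encoding. Note that the sketch does nothing about the 16-bit question: where wchar_t is 16 bits wide, a code point outside the Basic Multilingual Plane may occupy two wchar_t units, and towlower() would then see each unit separately.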