On Mon, Mar 19, 2012 at 9:23 PM, Martijn van Oosterhout <klep...@svana.org> wrote:
> Ouch. I was holding out hope that you could get a meaningful
> improvement if we could use the first X bytes of the strxfrm output so
> you only need to do a strcoll on strings that actually nearly match.
> But with an information density of 9 bytes for one character it
> doesn't seem worthwhile.

When I was playing with glibc it was 4n. I think what they do is have n
bytes for the high order bits, then n bytes for low order bits like
capitalization or whitespace differences. I suspect they used to use 16
bits for each and have gone to some larger size.

> That and this gem in the strxfrm manpage:
>
> RETURN VALUE
>        The strxfrm() function returns the number of bytes required to
>        store the transformed string in dest excluding the terminating
>        '\0' character. If the value returned is n or more, the
>        contents of dest are indeterminate.
>
> Which means that you have to take the entire transformed string, you
> can't just ask for the first bit. I think that kind of leaves the whole
> idea dead in the water.

I believe the intended API is that you allocate a buffer with your guess
of the right size, call strxfrm, and if it returns a larger number you
realloc your buffer and call it again.

--
greg

--
Sent via pgsql-hackers mailing list (firstname.lastname@example.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
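
For reference, the guess-then-realloc usage of strxfrm described above could look roughly like the sketch below. The helper name xfrm_dup and its guess parameter are made up for illustration; they are not from glibc or PostgreSQL. Per the manpage text quoted above, a return value of n or more means the buffer contents are indeterminate, so the retry test is ">=" rather than ">".

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not a libc or PostgreSQL function): returns a
 * malloc'd buffer holding the strxfrm() transform of src, or NULL on
 * failure. "guess" is the caller's initial estimate of the size needed.
 */
static char *xfrm_dup(const char *src, size_t guess)
{
    size_t cap = guess > 0 ? guess : 1;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    /* First attempt with the guessed size. */
    size_t need = strxfrm(buf, src, cap);
    if (need >= cap) {
        /* Too small: contents are indeterminate, so grow and redo it. */
        char *tmp = realloc(buf, need + 1);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
        cap = need + 1;
        need = strxfrm(buf, src, cap);
        if (need >= cap) {
            /* Should not happen unless the locale changed in between. */
            free(buf);
            return NULL;
        }
    }
    return buf;
}
```

Two buffers produced this way can be compared with plain strcmp, and the result should order the same way strcoll would on the original strings. Note that the sketch always transforms the whole string, which is the cost being discussed in this thread.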