Re: [HACKERS] sortsupport for text

Greg Stark Mon, 19 Mar 2012 18:21:36 -0700

On Mon, Mar 19, 2012 at 9:23 PM, Martijn van Oosterhout
<[email protected]> wrote:
> Ouch. I was holding out hope that you could get a meaningful
> improvement if we could use the first X bytes of the strxfrm output so
> you only need to do a strcoll on strings that actually nearly match.
> But with an information density of 9 bytes for 1 character it
> doesn't seem worthwhile.


When I was playing with glibc the strxfrm output was 4n bytes. I think
what they do is have n bytes for the high-order bits, then n bytes for
low-order bits like capitalization or whitespace differences. I suspect
they used to use 16 bits for each and have since gone to some larger size.


> That and this gem in the strxfrm manpage:
>
> RETURN VALUE
>       The  strxfrm()  function returns the number of bytes required to
>       store the transformed string in dest excluding the terminating
>       '\0' character.  If the value returned is n or more, the
>       contents of dest are indeterminate.
>
> Which means that you have to take the entire transformed string, you
> can't just ask for the first bit. I think that kind of leaves the whole
> idea dead in the water.

I believe the intended API is that you allocate a buffer with your
guess of the right size, call strxfrm and if it returns a larger
number you realloc your buffer and call it again.
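
Something like the following is what I have in mind. It is only a rough
sketch of the guess-then-retry pattern, not code from the tree: the helper
name xfrm_alloc and its "guess" parameter are made up for illustration, and
it uses plain malloc/realloc rather than palloc.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not PostgreSQL code): return a malloc'd buffer
 * holding the strxfrm() transformation of src, or NULL on failure.
 * "guess" is the caller's initial estimate of the needed size in bytes.
 */
static char *
xfrm_alloc(const char *src, size_t guess)
{
    size_t  bufsize = guess > 0 ? guess : 1;
    size_t  needed;
    char   *buf;

    assert(src != NULL);

    buf = malloc(bufsize);
    if (buf == NULL)
        return NULL;

    /* First attempt with the guessed size. */
    needed = strxfrm(buf, src, bufsize);
    if (needed >= bufsize)
    {
        /*
         * The guess was too small, so the contents of buf are
         * indeterminate. Grow to the reported size plus the
         * terminating '\0' and transform again.
         */
        char   *tmp = realloc(buf, needed + 1);

        if (tmp == NULL)
        {
            free(buf);
            return NULL;
        }
        buf = tmp;
        bufsize = needed + 1;

        needed = strxfrm(buf, src, bufsize);
        if (needed >= bufsize)
        {
            /* Should not happen unless the locale changed in between. */
            free(buf);
            return NULL;
        }
    }
    return buf;
}
```

Two transformed strings can then be compared with plain strcmp(). The cost
is at most two strxfrm calls per string, and a good initial guess avoids
the second one.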


-- 
greg
