On Mon, Feb 10, 2014 at 10:59 AM, Alexander Korotkov <aekorot...@gmail.com> wrote:
> Done. Patch is splitted.

I took a quick look at this.

Have you thought about making your new cmpSortSkipCols() function not use real comparisons? In the circumstances in which this optimization is expected to be effective (e.g. your original example), we can also expect a relatively low cardinality for the first n indexed attributes (again, as in your original example). So in general, when cmpSortSkipCols() is called, there is a high chance that it will return true. If any pair of tuples (logically adjacent tuples fed in to cmpSortSkipCols() by an index scan in logical order) is not fully equal (i.e. their leading, indexed attributes are not equal), then we don't care about the details -- we just know that a new sort grouping is required.

The idea here is that you can get away with simple binary equality comparisons, as we do when considering HOT-safety. Of course, you might find that two bitwise unequal values are equal according to their ordinary B-Tree support function 1 comparator (e.g. two numerics that differ only in their display scale). AFAICT this should be okay, since that just means that you have smaller sort groupings than strictly necessary. I'm not sure if it's worth more or less duplicating heap_tuple_attr_equals() to save a "mere" n expensive comparisons, but it's something to think about (actually, there are probably fewer than n comparisons in practice, because there'll be a limit).

A similar idea appears in my SortSupport for text ("Poor man's normalized key"/strxfrm()) patch. When a poor man's key comparison doesn't settle things (it returns 0), there may be further differences that aren't captured in the special simple key representation, so we need to do a "proper comparison" to figure it out for sure. However, within the sortsupport routine comparator, we know that we're being called in this context, as a tie-breaker for a poor man's normalized key comparison that returned 0, and so are optimistic about the two datums being fully equal.
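
Going back to the grouping suggestion for a moment, here is a rough, standalone sketch of what I mean by bitwise equality. It is not code from the patch: RawAttr, leading_attrs_bitwise_equal() and count_sort_groups() are hypothetical names, and a real version would work on heap tuples and Datums rather than raw byte buffers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the raw bytes of one leading (presorted) attribute. */
typedef struct
{
    const void *data;   /* raw bytes of the value */
    size_t      len;    /* length in bytes */
} RawAttr;

/*
 * Return true only if every leading attribute is byte-for-byte identical.
 * A false result can be a false negative (e.g. equal numerics with different
 * display scales); that only starts a new sort group earlier than necessary.
 */
static bool
leading_attrs_bitwise_equal(const RawAttr *a, const RawAttr *b, int nattrs)
{
    for (int i = 0; i < nattrs; i++)
    {
        if (a[i].len != b[i].len)
            return false;
        if (a[i].len > 0 && memcmp(a[i].data, b[i].data, a[i].len) != 0)
            return false;
    }
    return true;
}

/*
 * Count the sort groups in a stream of tuples that already arrives ordered
 * by its leading attributes. Tuples are stored back to back, nattrs leading
 * attributes each.
 */
static int
count_sort_groups(const RawAttr *tuples, int ntuples, int nattrs)
{
    int groups = (ntuples > 0) ? 1 : 0;

    for (int i = 1; i < ntuples; i++)
    {
        const RawAttr *prev = tuples + (size_t) (i - 1) * nattrs;
        const RawAttr *cur = tuples + (size_t) i * nattrs;

        /* Any bitwise difference means a new sort grouping is required. */
        if (!leading_attrs_bitwise_equal(prev, cur, nattrs))
            groups++;
    }
    return groups;
}
```

The point of the sketch is only that the grouping check never needs to know which tuple sorts first, so no comparator call is involved.
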
In that tie-breaker case, an optimistic memcmp() is attempted before strcoll() if the lengths also match. I have not actually added special hints so that we're optimistic about keys being equal in other places (places that have nothing to do with the general idea of poor man's normalized keys), but that might not be a bad idea. Actually, it might not be a bad idea to just always have varstr_cmp() attempt a memcmp() first when two texts have equal length, no matter how it's called.

--
Peter Geoghegan
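
A minimal sketch of the memcmp()-first idea, under simplifying assumptions: optimistic_text_cmp() is a hypothetical name, it takes NUL-terminated C strings rather than text datums, and it uses whatever LC_COLLATE locale the process has set. The strcmp() fallback for strings that strcoll() reports as equal reflects my understanding of how varstr_cmp() behaves, so treat that part as an assumption too.

```c
#include <string.h>

/*
 * Compare two NUL-terminated strings under the current LC_COLLATE locale,
 * trying a cheap bytewise equality check before the expensive strcoll().
 */
static int
optimistic_text_cmp(const char *a, const char *b)
{
    size_t alen = strlen(a);
    size_t blen = strlen(b);
    int    result;

    /* Fast path: equal length and identical bytes means fully equal. */
    if (alen == blen && memcmp(a, b, alen) == 0)
        return 0;

    result = strcoll(a, b);

    /*
     * Some locales can report bitwise-different strings as equal; break
     * such ties bytewise so that only identical strings compare as 0.
     */
    if (result == 0)
        result = strcmp(a, b);

    return result;
}
```

Because only bitwise-identical strings can ever compare as 0 in this scheme, the fast path cannot change the result; it only skips the strcoll() call when the optimism pays off.
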