Further cleanup of gistsplit.c.

After further reflection I was unconvinced that the existing coding is guaranteed to return valid union datums in every code path for multi-column indexes. Fix that by forcing a gistunionsubkey() call at the end of the recursion. Having done that, we can remove some clearly redundant calls elsewhere. This should be a little faster for multi-column indexes (since the previous coding would uselessly do such a call for each column while unwinding the recursion), as well as much harder to break.
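The shape of that change can be sketched in a toy model. This is not code from gistsplit.c: Range, Tuple, Split, split_column(), choose_sides(), split_by_key() and union_all_columns() are all hypothetical names, the per-column split rule is invented for illustration, and union_all_columns() only stands in for the role gistunionsubkey() plays in the real code. The point it shows is that the recursion over columns decides only which side each tuple lands on, and the union keys for every column are then computed exactly once, after the recursion has finished, so every code path hands back valid unions.

```c
#include <assert.h>
#include <limits.h>

#define NCOLS      2    /* toy index with two columns */
#define MAX_TUPLES 16

typedef struct { int lo, hi; } Range;           /* toy key: one range per column */
typedef struct { Range col[NCOLS]; } Tuple;
typedef struct {
    Range left[NCOLS], right[NCOLS];            /* union keys of each side */
    int   side[MAX_TUPLES];                     /* 0 = left, 1 = right */
} Split;

/* Smallest range covering both inputs. */
static Range range_union(Range a, Range b)
{
    Range r = { a.lo < b.lo ? a.lo : b.lo, a.hi > b.hi ? a.hi : b.hi };
    return r;
}

/* Recompute both sides' unions for every column from scratch
 * (hypothetical stand-in for the role of gistunionsubkey()). */
static void union_all_columns(const Tuple *t, int n, Split *s)
{
    const Range empty = { INT_MAX, INT_MIN };

    for (int c = 0; c < NCOLS; c++)
        s->left[c] = s->right[c] = empty;
    for (int i = 0; i < n; i++)
        for (int c = 0; c < NCOLS; c++) {
            Range *u = s->side[i] ? &s->right[c] : &s->left[c];
            *u = range_union(*u, t[i].col[c]);
        }
}

/* Invented split rule: tuples whose midpoint on column c is below the
 * average midpoint go left.  Returns 0 if one side ends up empty. */
static int split_column(const Tuple *t, int n, int c, Split *s)
{
    long sum = 0;
    int  nleft = 0;

    for (int i = 0; i < n; i++)
        sum += t[i].col[c].lo + t[i].col[c].hi;
    for (int i = 0; i < n; i++) {
        long mid2 = t[i].col[c].lo + t[i].col[c].hi;
        s->side[i] = (mid2 * n < sum) ? 0 : 1;
        nleft += (s->side[i] == 0);
    }
    return nleft > 0 && nleft < n;
}

/* Recursion over columns: only assigns sides, never computes unions. */
static void choose_sides(const Tuple *t, int n, int c, Split *s)
{
    if (split_column(t, n, c, s))
        return;                                 /* this column decided the split */
    if (c + 1 < NCOLS) {
        choose_sides(t, n, c + 1, s);           /* let the next column decide */
        return;
    }
    for (int i = 0; i < n; i++)                 /* no column helped: split in half */
        s->side[i] = (i >= n / 2);
}

/* Entry point: one union computation at the end, whatever path was taken. */
void split_by_key(const Tuple *t, int n, Split *s)
{
    assert(n > 0 && n <= MAX_TUPLES);
    choose_sides(t, n, 0, s);
    union_all_columns(t, n, s);
}
```

Because union_all_columns() runs once in split_by_key() rather than inside each level of choose_sides(), no early return in the recursion can skip it, and it is not repeated per column while the recursion unwinds.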

Also, simplify the handling of cases where one side or the other of a primary split contains only don't-care tuples. The previous coding used a very ugly hack in removeDontCares() that essentially forced one random tuple to be treated as non-don't-care, providing a random initial choice of seed datum for the secondary split. It seems unlikely that that method will give better-than-random splits. Instead, treat such a split as degenerate and just let the next column determine the split, the same way that we handle fully degenerate cases where the two sides produce identical union datums.

Branch
------
REL9_1_STABLE

Details
-------
http://git.postgresql.org/pg/commitdiff/bffee6c52c7ae618a7f06da023bfdb66deb0bdb1

Modified Files
--------------
src/backend/access/gist/gistsplit.c |  157 +++++++++++++++++++++--------------
1 files changed, 93 insertions(+), 64 deletions(-)
