Hi! New version of patch is attached. On Thu, Dec 22, 2011 at 11:46 AM, Jeff Davis <pg...@j-davis.com> wrote:
> A few comments: > > * In range_gist_picksplit, it would be nice to have a little bit more > intuitive description of what's going on with the nonEmptyCount and > nonInfCount numbers. For instance, it appears to depend on the fact that > a range must either be in nonEmptyCount or in nonInfCount. Also, can you > describe the reason you're multiplying by two and taking the absolute > value? It seems to work, but I am missing the intuition behind those > operations. > total_count - 2*nonEmptyCount = (total_count - nonEmptyCount) - nonEmptyCount = emptyCount - nonEmptyCount So, it's really not evident. I've simplified it. > * The penalty function is fairly hard to read still. At a high level, I > think we're trying to accomplish a few things (in order from most to > least important): > - Keep normal ranges separate. > - Avoid broadening the class of the original predicate (e.g. turning > single-infinite into double-infinite). > - Avoid broadening (as determined by subtype_diff) the original > predicate. > - Favor adding ranges to narrower original predicates. > > Do you agree? If so, perhaps we can attack those more directly and it > might be a little more readable. > > Additionally, the arbitrary numbers might become a problem. Can we > choose better constants there? They would still be arbitrary when > compared with real numbers derived from subtype_diff, but maybe we can > still do better than what's there. > I've changes some comments and add constants for penalty values. > * Regarding the leftover "common" entries that can go to either side: > what is the "delta" measure trying to accomplish? When I work through > some examples, it seems to favor putting larger common ranges on the > left (small delta) and smaller common ranges on the right (smaller > delta). Why is that good? Or did I misread the code? > > Intuitively, I would think that we'd want to push the ranges with lower > upper bounds to the left and higher lower bounds to the right -- in > other words, recurse. Obviously, we'd need to make sure it terminated at > some point, but splitting the common entries does seem like a smaller > version of the original problem. Thoughts? > That was a bug. Actually, no "abs" is needed. Indeed it doesn't affect result significantly. ----- With best regards, Alexander Korotkov.
rangetypegist-0.6.patch.gz
Description: GNU Zip compressed data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers