Re: GiST for range types (was Re: [HACKERS] Range Types - typo + NULL string constructor)

Alexander Korotkov Tue, 24 Jan 2012 04:08:20 -0800

Hi!

New version of patch is attached.
On Thu, Dec 22, 2011 at 11:46 AM, Jeff Davis <pg...@j-davis.com> wrote:


> A few comments:
>
> * In range_gist_picksplit, it would be nice to have a little bit more
> intuitive description of what's going on with the nonEmptyCount and
> nonInfCount numbers. For instance, it appears to depend on the fact that
> a range must either be in nonEmptyCount or in nonInfCount. Also, can you
> describe the reason you're multiplying by two and taking the absolute
> value? It seems to work, but I am missing the intuition behind those
> operations.
>
total_count - 2*nonEmptyCount = (total_count - nonEmptyCount) -
nonEmptyCount = emptyCount - nonEmptyCount
So, it's really not evident. I've simplified it.



> * The penalty function is fairly hard to read still. At a high level, I
> think we're trying to accomplish a few things (in order from most to
> least important):
>  - Keep normal ranges separate.
>  - Avoid broadening the class of the original predicate (e.g. turning
> single-infinite into double-infinite).
>  - Avoid broadening (as determined by subtype_diff) the original
> predicate.
>  - Favor adding ranges to narrower original predicates.
>
> Do you agree? If so, perhaps we can attack those more directly and it
> might be a little more readable.
>
> Additionally, the arbitrary numbers might become a problem. Can we
> choose better constants there? They would still be arbitrary when
> compared with real numbers derived from subtype_diff, but maybe we can
> still do better than what's there.
>
I've changes some comments and add constants for penalty values.


> * Regarding the leftover "common" entries that can go to either side:
> what is the "delta" measure trying to accomplish? When I work through
> some examples, it seems to favor putting larger common ranges on the
> left (small delta) and smaller common ranges on the right (smaller
> delta). Why is that good? Or did I misread the code?
>
> Intuitively, I would think that we'd want to push the ranges with lower
> upper bounds to the left and higher lower bounds to the right -- in
> other words, recurse. Obviously, we'd need to make sure it terminated at
> some point, but splitting the common entries does seem like a smaller
> version of the original problem. Thoughts?
>
That was a bug. Actually, no "abs" is needed. Indeed it doesn't affect
result significantly.

-----
With best regards,
Alexander Korotkov.

rangetypegist-0.6.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: GiST for range types (was Re: [HACKERS] Range Types - typo + NULL string constructor)

Reply via email to