Re: [HACKERS] GiST: interpretation of NaN from penalty function

2016-09-15 Thread Andrew Borodin
> Certainly, "NaN means infinity", as your patch proposes, has no more basis to 
> it than "NaN means zero".
You are absolutley right. Now I see that best interpretation is "NaN
means NaN". Seems like we need only drop a check. Nodes with NaN
penalties will be considered even worser than those with infinity
penalty.
Penalty calculation is CPU performance critical, it is called for
every tuple on page along insertion path. Ommiting this check will
speed this up...a tiny bit.
> If the penalty function doesn't like that interpretation, it shouldn't return 
> NaN.
It may return NaN accidentally. If NaN will pass through union()
function then index will be poisoned.
That's not a good contract to remember for extension developer.


Regards, Andrey Borodin.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST: interpretation of NaN from penalty function

2016-09-14 Thread Tom Lane
Andrew Borodin  writes:
> Currently GiST treats NaN penalty as zero penalty, in terms of
> generalized tree this means "perfect fit". I think that this situation
> should be considered "worst fit" instead.

On what basis?  It seems hard to me to make any principled argument here.
Certainly, "NaN means infinity", as your patch proposes, has no more basis
to it than "NaN means zero".  If the penalty function doesn't like that
interpretation, it shouldn't return NaN.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] GiST: interpretation of NaN from penalty function

2016-09-14 Thread Andrew Borodin
Hi hackers!

Currently GiST treats NaN penalty as zero penalty, in terms of
generalized tree this means "perfect fit". I think that this situation
should be considered "worst fit" instead.
Here is a patch to highlight place in the code.
I could not construct test to generate bad tree, which would be fixed
by this patch. There is not so much of cases when you get NaN. None of
them can be a result of usual additions and multiplications of real
values.

Do I miss something? Is there any case when NaN should be considered good fit?

Greg Stark was talking about this in
BANLkTi=d+bpps1cm4yc8kukhj63hwj4...@mail.gmail.com but that topic
didn't go far (due to triangles). I'm currently messing with floats in
penalties, very close to NaNs, and I think this question can be
settled.

Regrads, Andrey Borodin.
diff --git a/src/backend/access/gist/gistutil.c 
b/src/backend/access/gist/gistutil.c
index fac166d..0a62561 100644
--- a/src/backend/access/gist/gistutil.c
+++ b/src/backend/access/gist/gistutil.c
@@ -675,8 +675,14 @@ gistpenalty(GISTSTATE *giststate, int attno,
  PointerGetDatum(orig),
  PointerGetDatum(add),
  PointerGetDatum());
-   /* disallow negative or NaN penalty */
-   if (isnan(penalty) || penalty < 0.0)
+   /*
+* disallow negative or NaN penalty
+* NaN is considered float infinity - i.e. most inapropriate fit
+* negative is considered penaltiless fix
+*/
+   if (isnan(penalty))
+   penalty = get_float4_infinity();
+   if (penalty < 0.0)
penalty = 0.0;
}
else if (isNullOrig && isNullAdd)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers