> This thread has basically died, so marking as returned with feedback.
Well, not exactly died, but it's kind of in blind lead.
I'll summarize a bit for future references:
1. cube extension for indexing lowered dimensionality data (2d in 3d)
is broken. Queries effectively will degrade to full inde
On Thu, Sep 22, 2016 at 3:45 AM, Andrew Borodin wrote:
> [blah]
>
> Practically, this algorithm cannot be implemented in current GiST API
> (only if we provide non-penalty-based choose subtree function,
> optional for GiST extension), but it certainly has scientific value.
This thread has basical
Hi Heikki!
> Got a link for a description of the RR*-tree? Would be good to reference it
> in the patch comments, too.
Well, as for now, the best way to reach the paper is
DOI 10.1145/1559845.1559929
http://sci-hub.cc/
Authors in conversations clearly stated that they endorse (not sure in
correct
On 09/14/2016 07:57 PM, Andrew Borodin wrote:
Here is v6 of cube penalty patch.
There was a warning about unused edge function under systems without
__STDC_IEC_559__ defined, this patch fixes it.
Thanks!
Reading through this thread again, I think we got a bit side-tracked
with this float bit
Here is v6 of cube penalty patch.
There was a warning about unused edge function under systems without
__STDC_IEC_559__ defined, this patch fixes it.
Regards, Andrey Borodin.
diff --git a/contrib/cube/cube.c b/contrib/cube/cube.c
index 3feddef..ad868ac 100644
--- a/contrib/cube/cube.c
+++ b/contr
Hi hackers!
Here is gist cube penaly patch v5.
Changes:
1. union version of pack_float() is used, without warings and with
minimum asm commands emitted
2. test for __STDC_IEC_559__ is added, this is a standard way of C99
to determine float compliance with IEEE754. ANSI-only compiler+libs,
along wi
>Personally I wouldn't be very happy about an IEEE754 assumption.
Ok, so let's avoid autoconf man. There is no real explanations of the
ground for this assumption, just a reference to paper of David
Goldberg (and there is no metion about IEEE754 is absoulte
everywhere). BTW, may be we can ask David
On Thu, Sep 8, 2016 at 9:29 AM, Andrew Borodin wrote:
>>autoconf check for IEEE 754 floats
> Autoconf man says folowing:
>>it is safe to assume IEEE-754 in most portable code these days
> https://www.gnu.org/software/autoconf/manual/autoconf.html#Floating-Point-Portability
Personally I wouldn't
If you are still interested in. Here are 3 versions of pack-float. The
union version of pack-float should run faster. The code is simpler, the
dependencies are easier.
But it may be less accurate or even wrong, as for signed integers (x>>2)
and (x/4) are not the same. Consider x = -1.
You may try
Yes. You are right, ANSI C allows only load-time initializers. Attached
ANSI compatible version leads to the same assembly.
And let me suggest a bit-twiddling version as well. It gives 12
instructions, instead of 13. 12 is better, as modern x86 CPU will fetch
them at most in 3 cycles, one less tha
Thank you for your attention to details, Mikhail.
pack_float_good() looks good. But I'm not sure inline strict init is allowed
under ansi C. Converting to regular ancient form b.fp = v; won't change compile
result, would it?
Regards, Andrey Borodin.
--
Sent via pgsql-hackers mailing list (pgs
Excuse me for intervention.
It depends. For instance, i run PostgreSQL on the modern MIPS CPU, which
does not have sqrt support.
But you are right, it is supported in most cases. And if execution speed
of this very fuction is of concern, sqrtf(x) should be used instead of
sqrt(x).
Despite this,
>BTW, would someone explain to me why using a float here will not fail
>catastrophically for inputs outside the range of float?
Indeed, it will. And that's one of the reason I'm considering
advancing GiST API instead of advancing extensions.
It won't crash, but will produce terrible index, not s
Heikki Linnakangas writes:
> BTW, I would be OK with the bit-twiddling hack, if we had an autoconf
> check for IEEE 754 floats, and a graceful fallback for other systems.
I would still object to the version submitted, on the grounds of the
compiler warnings it causes. Possibly those could be a
>autoconf check for IEEE 754 floats
Autoconf man says folowing:
>it is safe to assume IEEE-754 in most portable code these days
https://www.gnu.org/software/autoconf/manual/autoconf.html#Floating-Point-Portability
> A union might be more readable
Here is union version of the patch. It's slower 10%
On 09/08/2016 09:39 AM, Михаил Бахтерев wrote:
Excuse me for intervention.
It depends. For instance, i run PostgreSQL on the modern MIPS CPU, which
does not have sqrt support.
But you are right, it is supported in most cases. And if execution speed
of this very fuction is of concern, sqrtf(x) s
Heikki Linnakangas writes:
> Unfortunately, sqrt(x) isn't very cheap.
You'd be surprised: sqrt is built-in on most modern hardware. On my
three-year-old workstation, sqrt(x) seems to take about 2.6ns. For
comparison, the pack_float version posted in
takes 3.9ns (and draws multiple compiler war
On 09/07/2016 09:20 PM, Andrew Borodin wrote:
Well, arithmetics is too fragile.
This version of float packing with arithmetical packaging
static float
pack_float(float actualValue, int realm)
{
double max,min;
max = FLT_MAX / ( 8 >> realm );
min = FLT_MAX / ( 16 >> realm );
if( realm == 0 )
min
Well, arithmetics is too fragile.
This version of float packing with arithmetical packaging
static float
pack_float(float actualValue, int realm)
{
double max,min;
max = FLT_MAX / ( 8 >> realm );
min = FLT_MAX / ( 16 >> realm );
if( realm == 0 )
min = 0;
/* squeeze the actual value between min and
Oh, sorry, made one more attemp and now I see your algorithm differently.
So you propose to use oE and oV as a markers of borders for what I call Realm.
But there may be very little significant bits in one of this ranges.
pg_sphere and PostGiS extensions tried to use 1 as a marker, with
alike form
On 09/07/2016 09:42 AM, Andrew Borodin wrote:
2. Your algorithm, among loosing some bits of precision (which is
absolutely acceptable - we have like 31 of them and that’s a lot) rely on
false assumption. We compare tuples on page not by MBR of inserted tuple,
but by MBR of tuple on page, which is
Hi Heikki!
Thank you for reviewing the code, it's always inspiring when a work is
noted (:
>Looking at the code, by "margin", you mean the sum of all "edges", i.e. of
each dimension, of the cube.
Indeed. As far as I remember, this is a terminology of old R*-tree paper.
Now they use the word "perim
On 08/29/2016 09:04 AM, Andrew Borodin wrote:
In this message I address only first problem. I want to construct a
penalty function that will choose:
1.Entry with a zero volume and smallest margin, that can
accommodate insertion tuple without extension, if one exists;
2.Entry with smallest
Here is new version of the patch.
Now function pack_float is commented better.
All inline keywords are removed. I haven't found any performance loss for that.
Remove of static's showed 1%-7% performance loss in SELECT computation
(3 test runs), so I left statics where they are.
Actually, to avoid
Thank you for your coments, Tom.
> Modern compilers are generally able to make their own decisions about it, and
> trying to put your thumb on the scales this heavily is not likely to improve
> the code.
Well, I have tested that combination of "static inline" affets
performance of index build on
Andrew Borodin writes:
> Does every supported Postgres platform conforms to IEEE 754 floating
> point specification?
Possibly, but it is against project policy to allow code to assume so.
That pack_float function is absolutely not acceptable IMV, and would
still be if it were adequately documente
Hi hackers!
Here is the new patch version.
With the help of Mikhail Bakhterev I've optimized subroutines of the
penalty function. Index build time now is roughly equivalent to build
time before patch (test attached to thread start).
Time of SELECT statement execution is reduced by 40%.
Changes in
Hi, hackers!
Some time ago I put a patch to commitfest that optimizes slightly GiST
inserts and updates. It's effect in general is quite modest (15% perf
improved), but for sorted data inserts are 3 times quicker. This
effect I attribute to mitigation of deficiencies of old split
algorithm.
Alexa
28 matches
Mail list logo