Re: [HACKERS] Box type equality

2015-09-29 Thread Jeff Anton

The Box type is the oldest non-linear type in the Postgres system.
We used it as the template for extensibility in the original system 
about thirty years ago.  We had R-trees for box indexing.  If you want 
fuzzy box matching, that seems possible with R-trees and some creativity: 
say, search the R-tree with a slightly larger box using containment, and 
maybe also require that the match not be contained by a smaller box.  
This is the idea behind strategies: you can use existing operations to 
build a new operation.
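
The containment idea above can be sketched in a few lines of Python.  This is 
only an illustration under stated assumptions: the names Box, contains, grow, 
fuzzy_equal and fuzzy_matches are hypothetical, a plain list scan stands in 
for the R-tree, and the second check is the stricter "contains the shrunken 
box" rather than "not contained by a smaller box".

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    x1: float  # low x edge
    y1: float  # low y edge
    x2: float  # high x edge
    y2: float  # high y edge


def contains(outer: Box, inner: Box) -> bool:
    """True if inner lies entirely within outer (edges may touch)."""
    return (outer.x1 <= inner.x1 and outer.y1 <= inner.y1
            and outer.x2 >= inner.x2 and outer.y2 >= inner.y2)


def grow(b: Box, eps: float) -> Box:
    """Push every edge outward by eps (a negative eps shrinks the box)."""
    return Box(b.x1 - eps, b.y1 - eps, b.x2 + eps, b.y2 + eps)


def fuzzy_equal(candidate: Box, target: Box, eps: float) -> bool:
    """Fuzzy equality built only from containment tests."""
    # Step 1: the candidate must fit inside the slightly larger box.
    # This is the part an R-tree containment search could answer.
    if not contains(grow(target, eps), candidate):
        return False
    # Step 2: the candidate must in turn contain the slightly smaller box.
    return contains(candidate, grow(target, -eps))


def fuzzy_matches(boxes: List[Box], target: Box, eps: float) -> List[Box]:
    """Linear scan standing in for an index search plus recheck."""
    return [b for b in boxes if fuzzy_equal(b, target, eps)]


if __name__ == "__main__":
    target = Box(0.0, 0.0, 10.0, 10.0)
    boxes = [
        Box(0.001, -0.002, 10.004, 9.999),  # every edge within eps
        Box(0.0, 0.0, 1.0, 1.0),            # inside the big box, far too small
        Box(0.0, 0.0, 10.0, 10.5),          # one edge too far out
    ]
    print(fuzzy_matches(boxes, target, 0.01))
```

Taken together, the two containment tests amount to requiring every edge of 
the candidate to lie within eps of the matching edge of the target.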


If you have to force this onto B-trees, I think you will have to choose 
one edge to index on (i.e. one of the four values) then fuzzy match that 
through the index and have a secondary condition to further restrict the 
matches.
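
A rough Python sketch of that one-edge approach follows.  It is an assumption 
about how the idea might look, not anything from the Postgres source: a sorted 
list searched with bisect stands in for the B-tree, boxes are plain 
(x1, y1, x2, y2) tuples, and build_index and fuzzy_lookup are hypothetical 
names.

```python
import bisect
from typing import List, Tuple

BoxT = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def build_index(boxes: List[BoxT]) -> List[BoxT]:
    """Order boxes by one edge only (x1); this plays the role of the B-tree."""
    return sorted(boxes, key=lambda b: b[0])


def fuzzy_lookup(index: List[BoxT], target: BoxT, eps: float) -> List[BoxT]:
    # Keys are rebuilt on every call to keep the sketch short.
    keys = [b[0] for b in index]
    # Range scan on the indexed edge: x1 within eps of the target's x1.
    lo = bisect.bisect_left(keys, target[0] - eps)
    hi = bisect.bisect_right(keys, target[0] + eps)
    # Secondary condition: all four values must be within eps.
    return [b for b in index[lo:hi]
            if all(abs(c - t) <= eps for c, t in zip(b, target))]


if __name__ == "__main__":
    index = build_index([
        (3.0, 3.0, 4.0, 4.0),
        (0.0, 0.0, 10.0, 10.0),
        (0.001, 0.0, 10.0, 10.0),
        (0.001, 5.0, 10.0, 10.0),  # passes the x1 range scan, fails the recheck
    ])
    print(fuzzy_lookup(index, (0.0, 0.0, 10.0, 10.0), 0.01))
```

The index narrows the search on a single coordinate only, so how much work is 
left for the secondary condition depends on how selective that edge is.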


As with all the geometric types, you can use containment boxes for them 
and have the secondary condition checks.


It's all just a few lines of code, as Stonebraker used to say.

Jeff Anton


On 09/29/15 08:43, Tom Lane wrote:

Stanislav Kelvich <s.kelv...@postgrespro.ru> writes:

I've faced an issue with Box type comparison that has existed for almost five 
years.


Try twenty-five years.  The code's been like that since Berkeley.


   That can be fixed by b-tree equality for boxes, but we need some
   decisions there.


The problem with inventing a btree opclass for boxes is much more
fundamental than fuzzy comparisons, unfortunately.  Btree requires a
linear sort order, and there's no plausible linear ordering of boxes,
unless you compare areas, which won't give the equality semantics you want.

We could perhaps invent an exact-equality operator and construct just
a hash opclass for it, no btree.

In any case I think it would be a mistake to consider only boxes; all
the built-in geometric types have related issues.

regards, tom lane





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-29 Thread Jeff Anton

Seems to me that there are a bunch of agendas here.

I read about not wanting to be trapped into a proprietary system.  You 
can be trapped in any software you depend upon.  Compilers, platforms, 
SCM, and issue tracking are all places to be trapped.


Postgres and PostgreSQL have been around a very long time for the 
software world.  They have outlived several of the systems they have 
depended upon over those many years.  I hope and expect that Postgres 
will continue to outlive some of these platforms.


So do not get hung up on having been 'burned' in the past.  Expect to be 
'burned' again.  Take steps to minimize that pain in the future.


For an issue tracker, open source or proprietary, I would want raw 
database dumps and schema information.  Postgres is a database after 
all.  If you truly have the data, you should be able to migrate it.


Also, does the system you adopt use Postgres?  You are your best open 
source software.  If performance starts to go downhill, you are in the 
best position to fix it if you understand and control it.  How 
responsive will whatever system you choose be to your changing needs?  If you 
partner with an external group, the two groups will benefit from each 
other if they are truly sharing the technologies.


Jeff Anton




Re: [HACKERS] Zero-padding and zero-masking fixes for to_char(float)

2015-03-24 Thread Jeff Anton

The issue of significant (decimal) digits to and from floating point 
representation is a complex one.


What 'significant' means may depend upon the intent.

There are (at least) two different tests which may need to be used.

* How many digits can be stored and then accurately returned?
or
* How many decimal digits are needed to recreate a floating point value? 
 Or in longer form, if you have a floating point value, you may want to 
print it in decimal form and then later scan that to recreate the exact 
bit pattern from the original floating point value.  How many decimal 
digits do you need?


The first question produces a smaller number of digits than the second one!
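
A small Python check illustrates the gap between the two questions.  It is a 
sketch that assumes IEEE 754 double precision, where the two counts are 15 and 
17 digits; the helper names survives_store and recreates_float are 
hypothetical.

```python
import sys


def survives_store(decimal_text: str, digits: int) -> bool:
    """First question: does a decimal string come back unchanged after
    being stored as a float and printed with the same number of digits?"""
    return format(float(decimal_text), f".{digits}g") == decimal_text


def recreates_float(x: float, digits: int) -> bool:
    """Second question: does printing x with this many significant digits
    and scanning the text back give the identical float?"""
    return float(format(x, f".{digits}g")) == x


if __name__ == "__main__":
    # Digits guaranteed to survive decimal -> float -> decimal.
    print(sys.float_info.dig)

    # A 15-digit decimal value is stored and returned intact.
    print(survives_store("1.12345678912345", 15))

    # 15 digits are not enough to recreate an arbitrary float; 17 are.
    x = 0.1 + 0.2
    print(format(x, ".15g"), recreates_float(x, 15))
    print(format(x, ".17g"), recreates_float(x, 17))
```

Here 0.1 + 0.2 prints as 0.3 at 15 digits, and scanning 0.3 back yields a 
different bit pattern; at 17 digits the text scans back to the original value.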

Zero padding is, IMO, a bad idea altogether.  It makes people feel 
better, but it adds inaccuracy.  I've lost this argument so many times 
now that I only mention it for the real number geeks out there.


PostgreSQL seems to be using the first interpretation and reporting 
fewer digits.  I've noticed this with pg_dump: a dump and restore 
of floating point values does not produce the same floating point 
values.  To me, that is inexcusable.  Using the -Fc format, real values 
are preserved.  I have a large database of security prices.  I want 
accuracy above all.


I do not have time right now to produce the needed evidence for all 
these cases of floating point values.  If there is interest I can 
produce this in a day or so.


Jeff Anton

BTW:  This is my first posting to this list.  I should introduce myself.
I'm Jeff Anton.  I was the first Postgres project lead programmer 
working for Michael Stonebraker at U.C. Berkeley a very long time ago.
The first version was never released.  I've since worked for several db 
companies.



On 03/24/15 06:47, Noah Misch wrote:

On Sun, Mar 22, 2015 at 10:53:12PM -0400, Bruce Momjian wrote:

On Sun, Mar 22, 2015 at 04:41:19PM -0400, Noah Misch wrote:

On Wed, Mar 18, 2015 at 05:52:44PM -0400, Bruce Momjian wrote:

This junk digit zeroing matches the Oracle behavior:

SELECT to_char(1.123456789123456789123456789d, '9.9') as x from dual;
--
1.12345678912345680

Our output with the patch would be:

SELECT to_char(float8 '1.123456789123456789123456789', '9.9');
--
1.12345678912345000



These outputs show Oracle treating 17 digits as significant while PostgreSQL
treats 15 digits as significant.  Should we match Oracle in this respect while
we're breaking compatibility anyway?  I tend to think yes.


Uh, I am hesitant to adjust our precision to match Oracle as I don't
know what they are using internally.


http://sqlfiddle.com/#!4/8b4cf/5 strongly implies 17 significant digits for
float8 and 9 digits for float4.




