Re: [HACKERS] Re: Adding IEEE 754:2008 decimal floating point and hardware support for it

2013-06-20 Thread Simon Riggs
On 20 June 2013 06:45, Craig Ringer cr...@2ndquadrant.com wrote:

 I think a good starting point would be to use the Intel and IBM
 libraries to implement basic DECIMAL32/64/128 to see if they perform
 better than the gcc builtins tested by Pavel by adapting his extension.

 If the performance isn't interesting it may still be worth adding for
 compliance reasons, but if we can only add IEEE-compliant decimal FP by
 using non-SQL-standard type names I don't think that's super useful.

I think we should add an IEEE-compliant datatype even if it doesn't
offer space and/or performance advantages. We might hope it does, but
if it doesn't now, it may in the future.

It seems almost certain that the SQL standard would adopt the IEEE
datatypes in the future.

 If there are significant performance/space gains to be had, we could
 consider introducing DECIMAL32/64/128 types with the same names used by
 DB2, so people could explicitly choose to use them where appropriate.

Type names are easily set up if compatibility is required, so that's not a problem.

We'd want to use the name the SQL std people assign.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Adding IEEE 754:2008 decimal floating point and hardware support for it

2013-06-20 Thread Thomas Munro
On 20 June 2013 06:45, Craig Ringer cr...@2ndquadrant.com wrote:

 I think a good starting point would be to use the Intel and IBM
 libraries to implement basic DECIMAL32/64/128 to see if they perform
 better than the gcc builtins tested by Pavel by adapting his extension.


Just a few notes:

Not sure if this has already been mentioned, but GCC is using the IBM
decNumber library to implement those built-ins so the performance should be
nearly identical.

Unfortunately, many GCC builds shipped by Linux distributions don't
actually seem to have those built-ins configured anyway!

Also, the IBM 'xlc' compiler supports those built-ins (IBM being behind all
of this stuff...), and generates code using hardware instructions for
POWER6/POWER7, or software otherwise (quite possibly the same code again).

One further (undeveloped) thought: the IBM decNumber library doesn't just
support the 754-2008 types, it also supports a more general decNumber type
with arbitrary precision (well, up to 999,999,999 significant figures), so
if it were to finish up being used by core PG then it could also have other
uses.  I have no idea how decNumber (which encodes significant figures in
an integer coefficient, so one decimal digit per 3.2(?) bits) compares to
PG's DECIMAL (which encodes each digit in 4 bits, BCD style), in terms of
arithmetic performance and other trade-offs.
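To put rough numbers on the storage-density side of that question, here is
a minimal Python sketch comparing bits per decimal digit for a few generic
encodings. The list of encodings and the helper name bits_per_digit are
illustrative assumptions; nothing here measures the actual in-memory layout
of decNumber or of PG's DECIMAL, and it says nothing about arithmetic speed.

```python
import math

# Information-theoretic minimum: bits needed to represent one decimal digit.
MIN_BITS_PER_DIGIT = math.log2(10)

# (label, decimal digits stored, bits used). Generic encodings chosen for
# illustration only; they are assumptions, not a description of any library.
ENCODINGS = [
    ("BCD nibble", 1, 4),
    ("densely packed decimal declet", 3, 10),
    ("base-10000 digit in a 16-bit integer", 4, 16),
]


def bits_per_digit(digits: int, bits: int) -> float:
    """Return the storage cost in bits of one decimal digit."""
    return bits / digits


for label, digits, bits in ENCODINGS:
    print(f"{label}: {bits_per_digit(digits, bits):.2f} bits/digit")
print(f"theoretical minimum: {MIN_BITS_PER_DIGIT:.2f} bits/digit")
```

Density is only one of the trade-offs; packing digits more tightly generally
means more work to unpack them before doing arithmetic.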


 If the performance isn't interesting it may still be worth adding for
 compliance reasons, but if we can only add IEEE-compliant decimal FP by
 using non-SQL-standard type names I don't think that's super useful. If
 there are significant performance/space gains to be had, we could
 consider introducing DECIMAL32/64/128 types with the same names used by
 DB2, so people could explicitly choose to use them where appropriate.


+1 for using the DB2 names.

I am interested in this topic as a user of both Postgres and DB2, and an
early adopter of 754-2008 in various software.  Actually I had started
working on my own DECFLOAT types for Postgres using decNumber in 2010 as I
mentioned on one of the lists, but life got in the way.  I had a very basic
extension sort of working though, and core support didn't seem necessary,
although I hadn't started on what I considered to be the difficult bit:
interactions with the other numerical types (i.e. deciding which conversions
and promotions would make sense and be safe).

Finally, I recently ran into a 3rd software implementation of 754-2008:
libmpdec (the other two being IBM decNumber and Intel's library), but I
haven't looked into it yet.
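For anyone wanting to experiment with libmpdec without writing C: CPython's
decimal module has been built on libmpdec since Python 3.3, so a quick
sketch like the one below exercises it indirectly. The DECIMAL64 context is
an approximation set up by hand (16 digits, Emax 384), not a storage format,
and the name is made up for this example.

```python
import decimal
from decimal import Context, Decimal, ROUND_HALF_EVEN

# This attribute is present only when the C (libmpdec-backed) implementation
# is in use; the pure-Python fallback does not define it.
print("libmpdec:", getattr(decimal, "__libmpdec_version__", "not available"))

# Hand-built approximation of decimal64 arithmetic limits (an assumption for
# illustration): 16 significant digits, exponent range of the 64-bit format.
DECIMAL64 = Context(prec=16, Emax=384, Emin=-383,
                    rounding=ROUND_HALF_EVEN, clamp=1, traps=[])

third = DECIMAL64.divide(Decimal(1), Decimal(3))
total = DECIMAL64.add(Decimal("0.1"), Decimal("0.2"))

print(third)                    # rounded to 16 significant digits
print(total == Decimal("0.3"))  # decimal fractions are represented exactly
```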

Thomas Munro


Re: [HACKERS] Re: Adding IEEE 754:2008 decimal floating point and hardware support for it

2013-06-20 Thread Thomas Munro
On 20 June 2013 08:05, Thomas Munro mu...@ip9.org wrote:

 On 20 June 2013 06:45, Craig Ringer cr...@2ndquadrant.com wrote:

 If the performance isn't interesting it may still be worth adding for
 compliance reasons, but if we can only add IEEE-compliant decimal FP by
 using non-SQL-standard type names I don't think that's super useful. If
 there are significant performance/space gains to be had, we could
 consider introducing DECIMAL32/64/128 types with the same names used by
 DB2, so people could explicitly choose to use them where appropriate.


 +1 for using the DB2 names.


On reflection, I should offer more than +1.  I think that the IBM name
DECFLOAT(16) is better than DECIMAL64 because:

1)  The number of significant decimal digits is probably of greater
importance to a typical end user than the number of binary digits used to
store it.
2)  Other SQL types are parameterised with this notation, such as
VARCHAR(6) and DECIMAL(6, 2).
3)  IEEE 754 has rather different semantics to SQL DECIMAL (I'm thinking
mainly of the behaviour of special values), so using a name like DECFLOAT(n)
instead of DECIMAL64 would draw greater attention to that fact (i.e. it's
not just a fixed-size DECIMAL).

Also, IBM was here first, and I *guess* they will propose DECFLOAT for
standardisation (they are behind proposals to add support to many other
languages), though I have no information on that.
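Points 1 and 3 can be illustrated with a short Python sketch that
parameterises a context by decimal digits rather than by storage bits and
then produces the special values. The decfloat() helper and the two lookup
tables are hypothetical names for this example, the 7-digit entry is
included only for completeness (DECFLOAT as discussed here takes 16), and
Python's decimal module stands in for whatever a PG type would actually use.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

# Significant digits -> maximum exponent for the three IEEE 754-2008
# decimal interchange formats, and the matching storage widths in bits.
EMAX = {7: 96, 16: 384, 34: 6144}
STORAGE_BITS = {7: 32, 16: 64, 34: 128}


def decfloat(digits: int) -> Context:
    """Return a context approximating DECFLOAT(digits) arithmetic.

    No traps are enabled, so exceptional results come back as special
    values (Infinity, NaN) instead of raising.
    """
    emax = EMAX[digits]
    return Context(prec=digits, Emax=emax, Emin=1 - emax,
                   rounding=ROUND_HALF_EVEN, clamp=1, traps=[])


ctx = decfloat(16)
inf = ctx.divide(Decimal(1), Decimal(0))           # division by zero
nan = ctx.divide(Decimal(0), Decimal(0))           # invalid operation
neg_zero = ctx.multiply(Decimal("-1"), Decimal(0))  # signed zero

print(STORAGE_BITS[16], inf, nan, neg_zero)
print(nan == nan)  # a NaN does not compare equal to itself
```

None of these values has a counterpart in a plain fixed-size DECIMAL, which
is the distinction the DECFLOAT(n) name is meant to flag.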


Re: [HACKERS] Re: Adding IEEE 754:2008 decimal floating point and hardware support for it

2013-06-20 Thread Andres Freund
On 2013-06-20 13:45:24 +0800, Craig Ringer wrote:
 On 06/12/2013 07:51 PM, Andres Freund wrote:
  On 2013-06-12 19:47:46 +0800, Craig Ringer wrote:
  On 06/12/2013 05:55 PM, Greg Stark wrote:
  On Wed, Jun 12, 2013 at 12:56 AM, Craig Ringer cr...@2ndquadrant.com wrote:
  The main thing I'm wondering is how/if to handle backward compatibility
  with the existing NUMERIC and its DECIMAL alias
  If it were 100% functionally equivalent you could just hide the
  implementation internally. Have a bit that indicates which
  representation was stored and call the right function depending.
  That's what I was originally wondering about, but as Tom pointed out it
  won't work. We'd still need to handle scale and precision greater than
  that offered by _Decimal128 and wouldn't know in advance how much
  scale/precision they wanted to preserve. So we'd land up upcasting
  everything to NUMERIC whenever we did anything with it anyway, only to
  then convert it back into the appropriate fixed size decimal type for
  storage.

  Well, you can limit the upcasting to the cases where we would exceed
  the precision.

 How do you determine that for, say, DECIMAL '4'/ DECIMAL '3'? Or
 sqrt(DECIMAL '2') ?

Well, the suggestion above was not to actually implement them as
separate types. If you only store the precision inside the Datum you can
limit the upcasting to whatever you need.
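A rough way to picture "upcast only when the precision would be exceeded"
is the Python sketch below. It is not PG code: FIXED approximates decimal128
limits, WIDE is merely a stand-in for the arbitrary-precision NUMERIC path,
and compute() is a hypothetical helper. The fixed context traps Inexact, so
any result that would need rounding raises and is recomputed on the wide
path.

```python
from decimal import Context, Decimal, Inexact, ROUND_HALF_EVEN

# decimal128-like limits (34 digits). Trapping Inexact makes any operation
# whose exact result does not fit raise instead of silently rounding.
FIXED = Context(prec=34, Emax=6144, Emin=-6143,
                rounding=ROUND_HALF_EVEN, traps=[Inexact])

# Stand-in for the arbitrary-precision path (assumption: 1000 digits is
# "wide enough" for this illustration).
WIDE = Context(prec=1000, traps=[])


def compute(op: str, *args: Decimal):
    """Run a Context method by name, falling back to WIDE if FIXED is inexact."""
    try:
        return "fixed", getattr(FIXED, op)(*args)
    except Inexact:
        return "upcast", getattr(WIDE, op)(*args)


print(compute("multiply", Decimal("1.11"), Decimal("2.5")))  # fits in 34 digits
print(compute("divide", Decimal(4), Decimal(3))[0])          # never exact
print(compute("sqrt", Decimal(2))[0])                        # never exact
```

The division and square root cases take the fallback path every time, since
their exact results have no finite decimal expansion, which is the concern
raised in the quoted question.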

 I think a good starting point would be to use the Intel and IBM
 libraries to implement basic DECIMAL32/64/128 to see if they perform
 better than the gcc builtins tested by Pavel by adapting his extension.

Another good thing to investigate early on is whether there's actually a
need for the feature beyond complying with standards.

  Pretty pointless, and made doubly so by the fact that if we're
  not using a nice fixed-width type and have to support VARLENA we miss
  out on a whole bunch of performance benefits.
  I rather doubt that using a 1byte varlena - which it will be for
  reasonably sized Datums - will be a relevant bottleneck here. Maybe if
  you only have 'NOT NULL', fixed width columns, but even then...
 That's good to know - if I've overestimated the cost of using VARLENA
 for this, that's really quite good news.

From what I remember seeing in profiles, the biggest overhead is that the
short varlenas (not long ones though) frequently need to be copied
around so that they are placed at an aligned address. I think that, with
some care, numeric.c could be made to avoid that for the most common
cases, which should speed things up nicely.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Re: Adding IEEE 754:2008 decimal floating point and hardware support for it

2013-06-19 Thread Craig Ringer
On 06/12/2013 07:51 PM, Andres Freund wrote:
 On 2013-06-12 19:47:46 +0800, Craig Ringer wrote:
 On 06/12/2013 05:55 PM, Greg Stark wrote:
 On Wed, Jun 12, 2013 at 12:56 AM, Craig Ringer cr...@2ndquadrant.com wrote:
 The main thing I'm wondering is how/if to handle backward compatibility
 with the existing NUMERIC and its DECIMAL alias
 If it were 100% functionally equivalent you could just hide the
 implementation internally. Have a bit that indicates which
 representation was stored and call the right function depending.
 That's what I was originally wondering about, but as Tom pointed out it
 won't work. We'd still need to handle scale and precision greater than
 that offered by _Decimal128 and wouldn't know in advance how much
 scale/precision they wanted to preserve. So we'd land up upcasting
 everything to NUMERIC whenever we did anything with it anyway, only to
 then convert it back into the appropriate fixed size decimal type for
 storage.
 Well, you can limit the upcasting to the cases where we would exceed
 the precision.
How do you determine that for, say, DECIMAL '4'/ DECIMAL '3'? Or
sqrt(DECIMAL '2') ?

... actually, in all those cases Pg currently arbitrarily limits the
precision to 17 digits. Interesting. Not true for multiplication though:

regress=> select (NUMERIC '4' / NUMERIC '3') * NUMERIC
'3.141592653589793238462643383279502884197169';
                           ?column?
--------------------------------------------------------------
 4.1887902047863908798971027247128958968414458906832371934277
(1 row)


so simple operations like:

SELECT (DECIMAL '4'/ DECIMAL '3') * (DECIMAL '1.11');

would exceed the precision currently provided and be upcast. We'd
quickly land up getting to full NUMERIC internally no matter what type
we started with.
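How quickly exact products outgrow a fixed format can be sketched in
Python. This uses the decimal module with a deliberately wide context so
every product is exact; the helper names and the 34-digit limit (the
decimal128 precision) are choices made for this illustration, not anything
taken from PG.

```python
from decimal import Context, Decimal

DECIMAL128_DIGITS = 34

# Wide enough that every product computed below is exact (an assumption
# that holds for the small factors used here).
EXACT = Context(prec=200, traps=[])


def significant_digits(value: Decimal) -> int:
    """Number of digits in the coefficient of value."""
    return len(value.as_tuple().digits)


def multiplications_until_overflow(factor: Decimal,
                                   limit: int = DECIMAL128_DIGITS) -> int:
    """Count exact multiplications by factor until the result needs
    more than limit significant digits."""
    value = Decimal(1)
    count = 0
    while significant_digits(value) <= limit:
        value = EXACT.multiply(value, factor)
        count += 1
    return count


print(multiplications_until_overflow(Decimal("1.11")))
```

Each exact multiplication by a three-digit factor adds two or three digits
to the coefficient, so a chain of such operations leaves the 34-digit range
after relatively few steps even without any division involved.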

I think a good starting point would be to use the Intel and IBM
libraries to implement basic DECIMAL32/64/128 to see if they perform
better than the gcc builtins tested by Pavel by adapting his extension.

If the performance isn't interesting it may still be worth adding for
compliance reasons, but if we can only add IEEE-compliant decimal FP by
using non-SQL-standard type names I don't think that's super useful. If
there are significant performance/space gains to be had, we could
consider introducing DECIMAL32/64/128 types with the same names used by
DB2, so people could explicitly choose to use them where appropriate.

 Pretty pointless, and made doubly so by the fact that if we're
 not using a nice fixed-width type and have to support VARLENA we miss
 out on a whole bunch of performance benefits.
 I rather doubt that using a 1byte varlena - which it will be for
 reasonably sized Datums - will be a relevant bottleneck here. Maybe if
 you only have 'NOT NULL', fixed width columns, but even then...
That's good to know - if I've overestimated the cost of using VARLENA
for this, that's really quite good news.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





Re: [HACKERS] Re: Adding IEEE 754:2008 decimal floating point and hardware support for it

2013-06-12 Thread Andres Freund
On 2013-06-12 19:47:46 +0800, Craig Ringer wrote:
 On 06/12/2013 05:55 PM, Greg Stark wrote:
  On Wed, Jun 12, 2013 at 12:56 AM, Craig Ringer cr...@2ndquadrant.com wrote:
  The main thing I'm wondering is how/if to handle backward compatibility
  with the existing NUMERIC and its DECIMAL alias
  If it were 100% functionally equivalent you could just hide the
  implementation internally. Have a bit that indicates which
  representation was stored and call the right function depending.
 
 That's what I was originally wondering about, but as Tom pointed out it
 won't work. We'd still need to handle scale and precision greater than
 that offered by _Decimal128 and wouldn't know in advance how much
 scale/precision they wanted to preserve. So we'd land up upcasting
 everything to NUMERIC whenever we did anything with it anyway, only to
 then convert it back into the appropriate fixed size decimal type for
 storage.

Well, you can limit the upcasting to the cases where we would exceed
the precision.

 Pretty pointless, and made doubly so by the fact that if we're
 not using a nice fixed-width type and have to support VARLENA we miss
 out on a whole bunch of performance benefits.

I rather doubt that using a 1byte varlena - which it will be for
reasonably sized Datums - will be a relevant bottleneck here. Maybe if
you only have 'NOT NULL', fixed width columns, but even then...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

