Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-09-26 Thread Zdenek . Kotala

Andrew Dunstan wrote:




Zdenek Kotala wrote:


Tom Lane wrote:


Gregory Stark [EMAIL PROTECTED] writes:

We previously discussed compressing the numeric data type for small 
values:

http://archives.postgresql.org/pgsql-hackers/2007-06/msg00715.php



We didn't do this for 8.3 but in any case Tom did suggest we ought 
to reverse
the weight and sign/dscale so we could do this sometime without 
introducing

another incompatibility.



I had forgotten about that, but it does seem like a good idea to do 
it now.

Any objections?



For in-place upgrade purpose It would be good change also OID for 
numeric type and preserve current OID for current implementation on 
updated system.



   



If we want to get into that game we need a better way of allocating 
Oids. Right now anything not currently used is liable to be grabbed, 
so there's a high risk of reuse.


Yes, it will be necessary. Or maybe second way is create only really 
base types (name, int, bool ...) on bootstrap and others types will be 
created in standard manner by CREATE TYPE, CREATE OPERATOR ... 
commands.  Or third way is not remove old datatypes and only rename them 
to e.g. numeric_old1 ...


  Zdenek


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-09-26 Thread Merlin Moncure
On 9/24/07, Gregory Stark [EMAIL PROTECTED] wrote:


 We previously discussed compressing the numeric data type for small values:

 http://archives.postgresql.org/pgsql-hackers/2007-06/msg00715.php

 We didn't do this for 8.3 but in any case Tom did suggest we ought to reverse
 the weight and sign/dscale so we could do this sometime without introducing
 another incompatibility.

 I think we also should move the NumericData and declaration to numeric.c and
 make the Numeric type an opaque pointer for the rest of the source tree. That
 will catch any contrib modules or third-party modules which would be broken by
 any new data representation.


 --- numeric.h   27 Feb 2007 23:48:10 +  1.24
 +++ numeric.h   24 Sep 2007 16:07:24 +0100
 @@ -63,8 +63,8 @@
  typedef struct NumericData
  {
 int32   vl_len_;/* varlena header (do not 
 touch directly!) */
 -   int16   n_weight;   /* Weight of 1st digit  */
 uint16  n_sign_dscale;  /* Sign + display scale */
 +   int16   n_weight;   /* Weight of 1st digit  */
 charn_data[1];  /* Digits (really array of 
 NumericDigit) */
  } NumericData;

would this break any application pulling a numeric field as binary
over the protocol?

merlin

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-09-26 Thread Tom Lane
Merlin Moncure [EMAIL PROTECTED] writes:
 would this break any application pulling a numeric field as binary
 over the protocol?

No.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-09-25 Thread Zdenek Kotala

Tom Lane wrote:

Gregory Stark [EMAIL PROTECTED] writes:

We previously discussed compressing the numeric data type for small values:
http://archives.postgresql.org/pgsql-hackers/2007-06/msg00715.php



We didn't do this for 8.3 but in any case Tom did suggest we ought to reverse
the weight and sign/dscale so we could do this sometime without introducing
another incompatibility.


I had forgotten about that, but it does seem like a good idea to do it now.
Any objections?


For in-place upgrade purpose It would be good change also OID for 
numeric type and preserve current OID for current implementation on 
updated system.



Zdenek

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-09-25 Thread Andrew Dunstan



Zdenek Kotala wrote:

Tom Lane wrote:

Gregory Stark [EMAIL PROTECTED] writes:
We previously discussed compressing the numeric data type for small 
values:

http://archives.postgresql.org/pgsql-hackers/2007-06/msg00715.php


We didn't do this for 8.3 but in any case Tom did suggest we ought 
to reverse
the weight and sign/dscale so we could do this sometime without 
introducing

another incompatibility.


I had forgotten about that, but it does seem like a good idea to do 
it now.

Any objections?


For in-place upgrade purpose It would be good change also OID for 
numeric type and preserve current OID for current implementation on 
updated system.



   


If we want to get into that game we need a better way of allocating 
Oids. Right now anything not currently used is liable to be grabbed, so 
there's a high risk of reuse.


cheers

andrew

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-09-24 Thread Gregory Stark


We previously discussed compressing the numeric data type for small values:

http://archives.postgresql.org/pgsql-hackers/2007-06/msg00715.php

We didn't do this for 8.3 but in any case Tom did suggest we ought to reverse
the weight and sign/dscale so we could do this sometime without introducing
another incompatibility.

I think we also should move the NumericData and declaration to numeric.c and
make the Numeric type an opaque pointer for the rest of the source tree. That
will catch any contrib modules or third-party modules which would be broken by
any new data representation.


--- numeric.h   27 Feb 2007 23:48:10 +  1.24
+++ numeric.h   24 Sep 2007 16:07:24 +0100  
@@ -63,8 +63,8 @@
 typedef struct NumericData
 {
int32   vl_len_;/* varlena header (do not touch 
directly!) */
-   int16   n_weight;   /* Weight of 1st digit  */
uint16  n_sign_dscale;  /* Sign + display scale */
+   int16   n_weight;   /* Weight of 1st digit  */
charn_data[1];  /* Digits (really array of 
NumericDigit) */
 } NumericData;
 


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-09-24 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 We previously discussed compressing the numeric data type for small values:
 http://archives.postgresql.org/pgsql-hackers/2007-06/msg00715.php

 We didn't do this for 8.3 but in any case Tom did suggest we ought to reverse
 the weight and sign/dscale so we could do this sometime without introducing
 another incompatibility.

I had forgotten about that, but it does seem like a good idea to do it now.
Any objections?

 I think we also should move the NumericData and declaration to numeric.c and
 make the Numeric type an opaque pointer for the rest of the source
 tree.

I don't agree with that; we are not in the habit of doing it that way
for any other on-disk data type.  All it will accomplish is to force
people to make private copies of the struct declaration, thereby
entirely guaranteeing that they fail to track changes.  There will
always be legitimate reasons for external code to want to look at
on-disk bits.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-09-24 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

Tom Lane [EMAIL PROTECTED] writes:

 Gregory Stark [EMAIL PROTECTED] writes:

 I think we also should move the NumericData and declaration to numeric.c and
 make the Numeric type an opaque pointer for the rest of the source
 tree.

 I don't agree with that; we are not in the habit of doing it that way
 for any other on-disk data type.  All it will accomplish is to force
 people to make private copies of the struct declaration, thereby
 entirely guaranteeing that they fail to track changes.  There will
 always be legitimate reasons for external code to want to look at
 on-disk bits.

Well the macros to do so would become quite a bit more complex. I imagine they
would become functions instead. I suppose a reasonable simple interface could
be ginned up. But anyone currently accessing the data directly would have to
go through the functions to access the bits.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-07-17 Thread Bruce Momjian

This has been saved for the 8.4 release:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---

Simon Riggs wrote:
 We've changed the on-disk database format in 8.3, so we have an
 opportunity to change other things also. There is a patch thats been on
 the patch queue for some time called numeric508, submitted Dec 2005;
 I've updated this patch now for 8.3 to remove bit rot (an hour's work).
 This is posted to pgsql-patches now and it works.
 
 The benefit of the patch is that it reduces each NUMERIC value by 2
 bytes, so will speed up things considerably. This is now especially
 important if we are looking to reduce the speed of numeric division by a
 factor of 4 (recent WIP patch).
 
 The objections to applying this patch originally were:
 1. it changes on-disk format (we've done this, so argument is void)
 2. it would restrict number of digits to 508 and there are allegedly
 some people that want to store  508 digits.
 
 The current patch passes all regression tests, but currently fails
 numeric_big.sql since this explicitly checks for support of
 numeric(1000,800).
 
 We could:
 a) accept the patch as-is and restrict NUMERIC to 508 digits
 b) refine the patch somewhat to allow 1000 digits
 
 (b) is possible in a couple of ways, both fairly quick:
 - extend the patch so that one of the spare bits from the second digit
 is used to represent dscale 508-1000. 
 - extend the patch so that if weight  127 or dscale  127 we would use
 the first byte in the digits as an extra indicator byte holding the high
 bits of both fields.
 Neither change makes any difference to numbers below
 1,000,000,000,000,000(127 zeroes in total)...000 which probably
 covers the vast majority of people's usage.
 
 Objections: True, we are passed feature freeze, but this patch has been
 on the queue for 14 months prior to freeze and has been waiting on disk
 format changes to make patch application acceptable. We definitely want
 to reduce the size of Numeric by 2 bytes at some point. The question in
 my mind is: When is the best time to make this change? If we put this
 off until 8.4, then it will get rejected again because we won't want to
 change the disk format again. So the best time to do this is now,
 otherwise we'll put it off forever.
 
 Can I get somebody other than Tom to agree to review the patch? Clearly
 waiting for Tom to review this is just going to delay release, which I
 don't want to do.
 
 -- 
   Simon Riggs 
   EnterpriseDB   http://www.enterprisedb.com
 
 
 
 ---(end of broadcast)---
 TIP 6: explain analyze is your friend

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)

2007-06-27 Thread Josh Berkus
Tom,

 That sounds good, but there are corner cases where it wouldn't work ---
 consider a page containing a single maximum-length tuple.

Certainly any mature upgrade-in-place tool will require a checker which you 
run first which determines if you have a prohibitive corner case.  

Besides, I thought we didn't allow tuples to grow to more than 1/2 page size?  
Or am I thinking indexes here?

 In general I don't see us accepting changes that would increase data
 size, but there's at least one troubling exception on the horizon:
 per-column or per-value locale support.  Another problem is that a strict
 rule of no data size increase ever might forbid acceptance of changes
 that achieve average space savings at the cost of increasing the size of
 some lesser-used cases.

Our attitude at the meeting was let's burn that bridge when we come to it.  
If we can develop a solid in-place upgrade too which will work for 8.1-8.2 
and 8.2-8.3, then we've done something worthwhile even if we break it in 8.4 
or 8.5.  It's possible that we won't implement anything that breaks it for 
five years or that someone will invent another solution before then, in which 
case we'd feel pretty dumb for having kept it on the drawing board all that 
time. 

Or to put it another way, the page-grows-on-upgrade problem is hard enough 
that it will probably take as much effort as the whole rest of the upgrade 
process to solve.  So let's tackle one problem at a time.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)

2007-06-22 Thread Zdenek Kotala

Heikki Linnakangas wrote:
Since we're discussing upgrades, let me summarize the discussions we had 
over dinner in Ottawa for the benefit of all:




Thanks for summary.


As before, someone just needs to step up and do it.


I'm now working on proposal. I hope that it will ready soon.

Zdenek

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)

2007-06-21 Thread Zdenek Kotala

Andrew Sullivan wrote:

On Wed, Jun 20, 2007 at 12:34:21PM -0400, Robert Treat wrote:
FWIW pg_migrator is a pretty good swing at an in-place upgrade tool for 
8.1-8.2.   Unfortunately until the PGDG decides that in-place upgrade is a 
constraint their willing to place on development, I see them a good 
chicken/egg away from making it a continually usefull tool. 


Or maybe cart/horse.  It seems to me that the rule more likely needs
to be that the migrator follow the development of the database than
that the database engine be strongly constrained by the needs of an
upgrade tool.  I agree that some commitment is needed, though.


I don't think that upgrade will strongly constrain a database engine. 
But some constrains we can expect. Any new development in some area will 
have to take care about upgrade. If you break some upgrade coding 
rules you will not to able create upgrade procedure or upgrade process 
will be risky.


For example upgrade between 8.1 - 8.2 is risky, because you are not 
able easy determine what format of cidr/inet is used on the page. If 
something fails during table structure upgrade, then it is not easy to 
recovery from this situation.



Zdenek

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)

2007-06-21 Thread Heikki Linnakangas
Since we're discussing upgrades, let me summarize the discussions we had 
over dinner in Ottawa for the benefit of all:


* pg_migrator is a sound approach to handling catalog changes.

* Handling any page format change that doesn't grow the space taken by 
data is straightforward. Note that all 8.2 - 8.3 changes fall in this 
category.


* Page format changes that grow data size are problematic, because there 
can be pages that can't be expanded to the new format because there's 
not enough space. However, it would be possible to write a pre-upgrade 
program to look for the problematic pages, and do a dummy UPDATE on some 
tuples to move them away from such pages, to make room. The pre-upgrade 
program could be run while the old database is still up, so it doesn't 
cause downtime.


* Page format conversion can be done at upgrade time, scanning all 
relations and rewriting them. Or it can be done at runtime in ReadBuffer 
whenever a page in the old format is accessed.


We don't know what changes upcoming releases will bring, but 8.2 - 8.3 
upgrade is feasible. Any future changes will need to be considered on a 
case-by-case basis, but we have a pretty good basic strategy that 
doesn't impose any strict restrictions on future development. Even if we 
can't guarantee that we can do a non dump/restore upgrade between every 
release, if we can do it say between every other release, that's *much* 
better than none.


As before, someone just needs to step up and do it.

Zdenek Kotala wrote:

Andrew Sullivan wrote:

On Wed, Jun 20, 2007 at 12:34:21PM -0400, Robert Treat wrote:
FWIW pg_migrator is a pretty good swing at an in-place upgrade tool 
for 8.1-8.2.   Unfortunately until the PGDG decides that in-place 
upgrade is a constraint their willing to place on development, I see 
them a good chicken/egg away from making it a continually usefull tool. 


Or maybe cart/horse.  It seems to me that the rule more likely needs
to be that the migrator follow the development of the database than
that the database engine be strongly constrained by the needs of an
upgrade tool.  I agree that some commitment is needed, though.


I don't think that upgrade will strongly constrain a database engine. 
But some constrains we can expect. Any new development in some area will 
have to take care about upgrade. If you break some upgrade coding 
rules you will not to able create upgrade procedure or upgrade process 
will be risky.


For example upgrade between 8.1 - 8.2 is risky, because you are not 
able easy determine what format of cidr/inet is used on the page. If 
something fails during table structure upgrade, then it is not easy to 
recovery from this situation.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)

2007-06-21 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 * Page format changes that grow data size are problematic, because there 
 can be pages that can't be expanded to the new format because there's 
 not enough space. However, it would be possible to write a pre-upgrade 
 program to look for the problematic pages, and do a dummy UPDATE on some 
 tuples to move them away from such pages, to make room. The pre-upgrade 
 program could be run while the old database is still up, so it doesn't 
 cause downtime.

That sounds good, but there are corner cases where it wouldn't work ---
consider a page containing a single maximum-length tuple.

In general I don't see us accepting changes that would increase data
size, but there's at least one troubling exception on the horizon:
per-column or per-value locale support.  Another problem is that a strict
rule of no data size increase ever might forbid acceptance of changes
that achieve average space savings at the cost of increasing the size of
some lesser-used cases.

In short, this point seems to need more thought.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)

2007-06-20 Thread Robert Treat
On Tuesday 19 June 2007 10:15, Tom Lane wrote:
 Zdenek Kotala [EMAIL PROTECTED] writes:
  I'm little bit confused when we introduce new page layout version? I
  expect that new version become with changes with pageheader, tuple
  header or data encoding (varlen/TOAST ...). But in case when there is
  new data type internal implementation, there was not reason to update
  version (see inet/cidr between 8.1 - 8.2). Can me somebody clarify this?

 Well, we've changed it when there was a benefit to an existing tool to
 do so.  So far that's meant page header and tuple header changes.  If
 we ever had a working in-place upgrade solution, I think we'd be willing
 to make the page version account for datatype format changes too.


FWIW pg_migrator is a pretty good swing at an in-place upgrade tool for 
8.1-8.2.   Unfortunately until the PGDG decides that in-place upgrade is a 
constraint their willing to place on development, I see them a good 
chicken/egg away from making it a continually usefull tool. 

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)

2007-06-20 Thread Andrew Sullivan
On Wed, Jun 20, 2007 at 12:34:21PM -0400, Robert Treat wrote:
 FWIW pg_migrator is a pretty good swing at an in-place upgrade tool for 
 8.1-8.2.   Unfortunately until the PGDG decides that in-place upgrade is a 
 constraint their willing to place on development, I see them a good 
 chicken/egg away from making it a continually usefull tool. 

Or maybe cart/horse.  It seems to me that the rule more likely needs
to be that the migrator follow the development of the database than
that the database engine be strongly constrained by the needs of an
upgrade tool.  I agree that some commitment is needed, though.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The whole tendency of modern prose is away from concreteness.
--George Orwell

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)

2007-06-19 Thread Zdenek Kotala

Simon Riggs wrote:



The objections to applying this patch originally were:
1. it changes on-disk format (we've done this, so argument is void)


I'm little bit confused when we introduce new page layout version? I 
expect that new version become with changes with pageheader, tuple 
header or data encoding (varlen/TOAST ...). But in case when there is 
new data type internal implementation, there was not reason to update 
version (see inet/cidr between 8.1 - 8.2). Can me somebody clarify this?



However, In upgrade point of view. It will be necessary to keep 
information which postgreSQL version creates page to correctly track 
data type implementation (or special structure). Information in 
pg_control file and PG_VERSION files are unusable for upgrade purpose.


Zdenek

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)

2007-06-19 Thread Tom Lane
Zdenek Kotala [EMAIL PROTECTED] writes:
 I'm little bit confused when we introduce new page layout version? I 
 expect that new version become with changes with pageheader, tuple 
 header or data encoding (varlen/TOAST ...). But in case when there is 
 new data type internal implementation, there was not reason to update 
 version (see inet/cidr between 8.1 - 8.2). Can me somebody clarify this?

Well, we've changed it when there was a benefit to an existing tool to
do so.  So far that's meant page header and tuple header changes.  If
we ever had a working in-place upgrade solution, I think we'd be willing
to make the page version account for datatype format changes too.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-06-18 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 We've changed the on-disk database format in 8.3, so we have an
 opportunity to change other things also. There is a patch thats been on
 the patch queue for some time called numeric508, submitted Dec 2005;

I thought that idea had been rejected long since as being an unreasonable
reduction in the flexibility of numeric arithmetic.

 The benefit of the patch is that it reduces each NUMERIC value by 2
 bytes, so will speed up things considerably. This is now especially
 important if we are looking to reduce the speed of numeric division by a
 factor of 4 (recent WIP patch).

The only way that patch would make division faster is by making the
slowest cases (longest input numbers) impossible.  Which is hardly a
solution.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-06-18 Thread Andreas Pflug
Simon Riggs wrote:
 The objections to applying this patch originally were:
 2. it would restrict number of digits to 508 and there are allegedly
 some people that want to store  508 digits.
   
If 508 digits are not enough, are1000 digits be sufficient? Both limits
appear quite arbitrary to me.

Regards,
Andreas


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-06-18 Thread Simon Riggs
On Mon, 2007-06-18 at 16:56 +0200, Andreas Pflug wrote:
 Simon Riggs wrote:
  The objections to applying this patch originally were:
  2. it would restrict number of digits to 508 and there are allegedly
  some people that want to store  508 digits.

 If 508 digits are not enough, are1000 digits be sufficient? Both limits
 appear quite arbitrary to me.

Thats the current limit; I agree, but I didn't choose it. IIRC if you
don't specify a limit then you can have arbitrarily long numbers.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-06-18 Thread Simon Riggs
On Mon, 2007-06-18 at 10:54 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  We've changed the on-disk database format in 8.3, so we have an
  opportunity to change other things also. There is a patch thats been on
  the patch queue for some time called numeric508, submitted Dec 2005;
 
 I thought that idea had been rejected long since as being an unreasonable
 reduction in the flexibility of numeric arithmetic.

I've re-read all of the posts from Dec 2005 and I see everyone in favour
of committing the patch, subject to review of whether limitation to 508
was acceptable. For me that was never finalised, at least I can't see a
clear decision. The important point is that the patch can be easily
modified to overcome these restrictions, a line of thought that was
never explored. 

The current patch can be easily modified to allow 1000 digits, yet still
saving 2 bytes per value when storing more common values. Or we can use
the same techniques to represent arbitrary number of digits, again still
saving 2 bytes on common values.

We have the ground work done, we just need to agree further changes.

  The benefit of the patch is that it reduces each NUMERIC value by 2
  bytes, so will speed up things considerably. This is now especially
  important if we are looking to reduce the speed of numeric division by a
  factor of 4 (recent WIP patch).
 
 The only way that patch would make division faster is by making the
 slowest cases (longest input numbers) impossible.  Which is hardly a
 solution.

It isn't directly related to division, but is a speed up nonetheless
which can offset the loss.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-06-18 Thread Michael Paesold

Andreas Pflug wrote:

Simon Riggs wrote:

The objections to applying this patch originally were:
2. it would restrict number of digits to 508 and there are allegedly
some people that want to store  508 digits.
  

If 508 digits are not enough, are1000 digits be sufficient? Both limits
appear quite arbitrary to me.


This 1000 is just a restriction on the typmod of numeric.

You can still use a much higher number of digits, if you use 
unconstrained numeric:


test= create table test (n numeric); 


CREATE TABLE
test= insert into test values (10::numeric ^ 9);
INSERT 0 1
test= select length(n) from test;
 length

 100017
(1 row)

Best Regards
Michael Paesold


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-06-18 Thread Tom Lane
Andreas Pflug [EMAIL PROTECTED] writes:
 Simon Riggs wrote:
 The objections to applying this patch originally were:
 2. it would restrict number of digits to 508 and there are allegedly
 some people that want to store  508 digits.
 
 If 508 digits are not enough, are1000 digits be sufficient? Both limits
 appear quite arbitrary to me.

As per the recent discussion about factorial, the current limit of
numeric format is 10^131071 --- there is a whole lot of daylight between
that and 10^508.

I had a thought though: it's possible to reduce the header overhead for
typical-size numbers without giving up the ability to store large ones.
This is because the POS/NEG/NAN sign possibilities leave one unused bit
pattern.  Hence:

1. Switch the positions of the n_sign_dscale and n_weight fields in the
long format, so that the sign bits are in the first word.

2. Reserve the fourth sign bit pattern to denote a compressed-header
format in which there's just one uint16 header word and the
NumericDigits start right after that.  The header word could contain:
2 bits: sign distinguishing this from the two-word-header format
1 bit: actual number sign (POS or NEG, disallow NaN)
6 bits: weight, room for -32 .. 31
7 bits: dscale, room for 0 .. 127

3. When packing a NumericVar into a Numeric, use this short format when
it's not a NaN and the weight and dscale are in range, else use the long
format.

Since the weight is in base-1 digits, this bit allocation allows a
dynamic range of about +- 10^127 which fits well with the dscale range.
But I suspect that most of the use-cases for long numerics involve large
integers, so it might be more useful to shave another bit or two from
dscale and give 'em to weight.

In any case, no capability is lost, unlike the original proposal; and
this would be much less invasive than the original patch since there's
no need to play tricks with the content of the digit array.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-06-18 Thread Andreas Pflug
Tom Lane wrote:
 Andreas Pflug [EMAIL PROTECTED] writes:
   
 Simon Riggs wrote:
 
 The objections to applying this patch originally were:
 2. it would restrict number of digits to 508 and there are allegedly
 some people that want to store  508 digits.

   
 If 508 digits are not enough, are1000 digits be sufficient? Both limits
 appear quite arbitrary to me.
 

 As per the recent discussion about factorial, the current limit of
 numeric format is 10^131071 --- there is a whole lot of daylight between
 that and 10^508.

 I had a thought though: it's possible to reduce the header overhead for
 typical-size numbers without giving up the ability to store large ones.
 This is because the POS/NEG/NAN sign possibilities leave one unused bit
 pattern.  Hence:

 1. Switch the positions of the n_sign_dscale and n_weight fields in the
 long format, so that the sign bits are in the first word.

 2. Reserve the fourth sign bit pattern to denote a compressed-header
 format in which there's just one uint16 header word and the
 NumericDigits start right after that.  The header word could contain:
   2 bits: sign distinguishing this from the two-word-header format
   1 bit: actual number sign (POS or NEG, disallow NaN)
   6 bits: weight, room for -32 .. 31
   7 bits: dscale, room for 0 .. 127

 3. When packing a NumericVar into a Numeric, use this short format when
 it's not a NaN and the weight and dscale are in range, else use the long
 format.

 Since the weight is in base-1 digits, this bit allocation allows a
 dynamic range of about +- 10^127 which fits well with the dscale range.
 But I suspect that most of the use-cases for long numerics involve large
 integers, so it might be more useful to shave another bit or two from
 dscale and give 'em to weight.

 In any case, no capability is lost, unlike the original proposal; and
 this would be much less invasive than the original patch since there's
 no need to play tricks with the content of the digit array.
   

I wonder if the currently waiting patch isn't Good Enough for
999. % of use cases, and all others can use numeric
instead of numeric(1000,800) or so. Especially since there are many
patches waiting that do need further investigation and refining.

Regards,
Andreas


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-06-18 Thread Simon Riggs
On Mon, 2007-06-18 at 11:32 -0400, Tom Lane wrote:
 Andreas Pflug [EMAIL PROTECTED] writes:
  Simon Riggs wrote:
  The objections to applying this patch originally were:
  2. it would restrict number of digits to 508 and there are allegedly
  some people that want to store  508 digits.
  
  If 508 digits are not enough, are1000 digits be sufficient? Both limits
  appear quite arbitrary to me.
 
 As per the recent discussion about factorial, the current limit of
 numeric format is 10^131071 --- there is a whole lot of daylight between
 that and 10^508.
 
 I had a thought though: it's possible to reduce the header overhead for
 typical-size numbers without giving up the ability to store large ones.
 This is because the POS/NEG/NAN sign possibilities leave one unused bit
 pattern.  Hence:
 
 1. Switch the positions of the n_sign_dscale and n_weight fields in the
 long format, so that the sign bits are in the first word.
 
 2. Reserve the fourth sign bit pattern to denote a compressed-header
 format in which there's just one uint16 header word and the
 NumericDigits start right after that.  The header word could contain:
   2 bits: sign distinguishing this from the two-word-header format
   1 bit: actual number sign (POS or NEG, disallow NaN)
   6 bits: weight, room for -32 .. 31
   7 bits: dscale, room for 0 .. 127
 
 3. When packing a NumericVar into a Numeric, use this short format when
 it's not a NaN and the weight and dscale are in range, else use the long
 format.
 
 Since the weight is in base-1 digits, this bit allocation allows a
 dynamic range of about +- 10^127 which fits well with the dscale range.
 But I suspect that most of the use-cases for long numerics involve large
 integers, so it might be more useful to shave another bit or two from
 dscale and give 'em to weight.
 
 In any case, no capability is lost, unlike the original proposal; and
 this would be much less invasive than the original patch since there's
 no need to play tricks with the content of the digit array.

Sounds good. I thought there'd be a way.

Since this is your idea, would you like to do this, or should I?

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-06-18 Thread Tom Lane
Andreas Pflug [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 In any case, no capability is lost, unlike the original proposal; and
 this would be much less invasive than the original patch since there's
 no need to play tricks with the content of the digit array.

 I wonder if the currently waiting patch isn't Good Enough for
 999. % of use cases, and all others can use numeric
 instead of numeric(1000,800) or so.

Apparently you misunderstand that patch: it takes capability away from
unconstrained numeric too.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-06-18 Thread Simon Riggs
On Mon, 2007-06-18 at 17:49 +0200, Andreas Pflug wrote:

 I wonder if the currently waiting patch isn't Good Enough for
 999. % of use cases, and all others can use numeric
 instead of numeric(1000,800) or so. Especially since there are many
 patches waiting that do need further investigation and refining.

That still has problems.

Another approach would be to restrict the existing datatype NUMERIC to
508 digits, but introduce a new datatype of LONGNUM which allows
arbitrary length numerics for those that *do* care. It might be more
beneficial in the longer run to separate the use cases so we can further
tune them (not sure how just yet...)

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-06-18 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 Since this is your idea, would you like to do this, or should I?

Go for it.  I'm not actually convinced this is worth spending time on,
as Greg Stark's 1-byte-varlena patch already saved more for typical
numerics than this will.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-06-18 Thread Simon Riggs
On Mon, 2007-06-18 at 11:55 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  Since this is your idea, would you like to do this, or should I?
 
 Go for it.  

OK

 I'm not actually convinced this is worth spending time on,
 as Greg Stark's 1-byte-varlena patch already saved more for typical
 numerics than this will.

Understood, thats why I left this till last. This will save even more on
top of those savings. My concern is to make this change now while we
can, since we will be aiming for 8.3-8.4 to be a binary upgrade.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-06-18 Thread Gregory Stark

Tom Lane [EMAIL PROTECTED] writes:

 I had a thought though: it's possible to reduce the header overhead for
 typical-size numbers without giving up the ability to store large ones.
 This is because the POS/NEG/NAN sign possibilities leave one unused bit
 pattern.  Hence:

I had a whack and doing something similar to this a while back -- though I was
more concerned with optimizing very small numeric variables, especially small
integers and typical currency-sized quantities.

The bottleneck I ran into was desupporting the macros which access the sign,
display scale, etc. Those are currently in numeric.h making them--in theory--
publicly available outside the internal functions. Afaik nothing actively uses
them outside of numeric.c though.

If we're happy desupporting those macros and not replacing them with anything
comparable then there are additional bits of state hidden in the varlena size.
Anything shorter than the shortest possible numeric representation can
implicitly be interpreted as some alternate compact representation. I already
had a patch that stored small integers in a single NumericDigit without any
numeric header at all.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-06-18 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Anything shorter than the shortest possible numeric representation can
 implicitly be interpreted as some alternate compact representation. I already
 had a patch that stored small integers in a single NumericDigit without any
 numeric header at all.

That's getting well beyond the realm of reason IMHO.  I doubt you can
merge it with this proposal anyway --- how will you disambiguate from
zero with a positive dscale (0.00)?

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-06-18 Thread Tom Lane
I wrote:
 Gregory Stark [EMAIL PROTECTED] writes:
 Anything shorter than the shortest possible numeric representation can
 implicitly be interpreted as some alternate compact representation. I
 already had a patch that stored small integers in a single
 NumericDigit without any numeric header at all.

 That's getting well beyond the realm of reason IMHO.  I doubt you can
 merge it with this proposal anyway --- how will you disambiguate from
 zero with a positive dscale (0.00)?

Hmm, I take that back: it could be merged with this proposal pretty
easily.  Again assuming that we move the POS/NEG/NAN/SHORT sign
flags to the first header word, we can have:

Sign=NAN: it's a NAN, regardless of anything else.  We may as well
store NAN as just 2 bytes.

Sign=SHORT: it's a non-NAN with limited weight and dscale range,
as per my proposal.  A zero value would still be only 2 bytes,
but anything else is longer.  (This would be needed only for zero
with nonzero dscale, though.)

Sign=POS or NEG: if length == 2 bytes then interpret the remaining
14 bits as a single NBASE digit, with assumed weight and dscale 0.
This allows storing integers up to +/- in 2 bytes (+ 1 byte
varlena header).  If length  2 then it is a traditional-format
numeric.

I'm not entirely convinced this is worth the extra pack/unpack
logic, since said integers would be 4 bytes (+1 byte header) in
the sign=SHORT representation, which is not that much more.
Also, this means at least three different representations of some
values, which is going to be a headache for comparison and hashing.
But the bit-space is there.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-06-18 Thread Simon Riggs
On Mon, 2007-06-18 at 12:44 -0400, Tom Lane wrote:
 I wrote:
  Gregory Stark [EMAIL PROTECTED] writes:
  Anything shorter than the shortest possible numeric representation can
  implicitly be interpreted as some alternate compact representation. I
  already had a patch that stored small integers in a single
  NumericDigit without any numeric header at all.
 
  That's getting well beyond the realm of reason IMHO.  I doubt you can
  merge it with this proposal anyway --- how will you disambiguate from
  zero with a positive dscale (0.00)?
 
 Hmm, I take that back: it could be merged with this proposal pretty
 easily.  Again assuming that we move the POS/NEG/NAN/SHORT sign
 flags to the first header word, we can have:
 
 Sign=NAN: it's a NAN, regardless of anything else.  We may as well
 store NAN as just 2 bytes.

 I'm not entirely convinced this is worth the extra pack/unpack
 logic, since said integers would be 4 bytes (+1 byte header) in
 the sign=SHORT representation, which is not that much more.
 Also, this means at least three different representations of some
 values, which is going to be a headache for comparison and hashing.
 But the bit-space is there.

I think representing zero as compactly as possible is worth the trouble,
so that in itself is enough for me to say Yes. Rest seems good too.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-06-18 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 I think representing zero as compactly as possible is worth the trouble,

Either of these proposals can do that in 2 bytes.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-06-18 Thread Bruce Momjian
Simon Riggs wrote:
 On Mon, 2007-06-18 at 11:55 -0400, Tom Lane wrote:
  Simon Riggs [EMAIL PROTECTED] writes:
   Since this is your idea, would you like to do this, or should I?
  
  Go for it.  
 
 OK
 
  I'm not actually convinced this is worth spending time on,
  as Greg Stark's 1-byte-varlena patch already saved more for typical
  numerics than this will.
 
 Understood, thats why I left this till last. This will save even more on
 top of those savings. My concern is to make this change now while we
 can, since we will be aiming for 8.3-8.4 to be a binary upgrade.

Would someone please explain why we are considering this so far past
features freeze, and who suggtested that the 8.3-8.4 upgrade being a binary
upgrade was anything more than a pipe dream?

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-06-18 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Would someone please explain why we are considering this so far past
 features freeze, and who suggtested that the 8.3-8.4 upgrade being a binary
 upgrade was anything more than a pipe dream?

Well, Greg had left further squeezing of numerics as an open item in his
varlena patch, so it's not totally unreasonable to consider a patch for
that now --- as long as it's pretty small and simple.

I agree that in-place upgrade is a pipe dream until we see someone
actually step forward and do the work to build a usable pg_upgrade
utility.

If nothing else, we should consider swapping the n_sign_dscale and
n_weight fields now, since that would enable upward-compatible
implementation of these ideas later.  Otherwise any such patch
would probably get rejected if pg_upgrade did happen to emerge out
of nowhere.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-06-18 Thread Gregory Stark
Bruce Momjian [EMAIL PROTECTED] writes:

 Would someone please explain why we are considering this so far past
 features freeze, and who suggtested that the 8.3-8.4 upgrade being a binary
 upgrade was anything more than a pipe dream?

Simon just updated a patch he had originally submitted over a year ago.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings