Re: [PATCHES] [HACKERS] TRUNCATE TABLE with IDENTITY

2008-06-08 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 2. I had first dismissed Neil's idea of transactional sequence updates
 as impossible, but on second look it could be done.  Suppose RESTART
 IDENTITY does this for each sequence:
 
 * obtain AccessExclusiveLock;
 * assign a new relfilenode;
 * insert a sequence row with all parameters copied except
 last_value copies start_value;
 * hold AccessExclusiveLock till commit.

 Hmm, this kills the idea of moving sequence data to a single
 non-transactional catalog :-(

Well, there are a number of holes in our ideas of how to do that anyway.
But offhand I don't see why we couldn't distinguish regular heap_update
from update_in_place on single rows within a catalog.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics

2008-06-08 Thread Nathan Boley
Currently eqsel assumes that, except for the values stored as mcv's,
the number of times that a value appears in a table column is
independent of its value. Unfortunately, this often yields
inappropriate selectivity estimates and frequently leads to
inappropriate plans.

As an example, consider an insurance company that keeps a record of
patient heights. Assume there are 1,000,000 patient heights in this
column, and that they are distributed normally with a mean of 1.7526 m and a
standard deviation of 0.0762 m. Furthermore, assume that the heights are
only measured to the nearest centimeter. Then we'd expect there to be
about 73 distinct heights, with an SD of 1.5.

Ignoring the effects of MCV's, the planner expects
  SELECT height FROM heights WHERE height = 1.75;
to yield roughly 13000 results (1,000,000 rows / 73 distinct values).
However, given that we know the underlying distribution, we would expect
to see ~52000 results.

Similarly, the planner expects to see 13000 results from
  SELECT height FROM heights WHERE height = 2.05;
while we would expect to see ~2.7.
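
If you want to see the misestimate first hand, something along these lines
should reproduce it (the twelve-uniform sum is just a crude stand-in for a
normal distribution; the table and column names are the ones used above):

  CREATE TABLE heights AS
    SELECT round((1.7526 + 0.0762 *
                  ((random()+random()+random()+random()+random()+random()
                   +random()+random()+random()+random()+random()+random()) - 6.0)
                 )::numeric, 2) AS height
      FROM generate_series(1, 1000000) g;
  ANALYZE heights;
  EXPLAIN ANALYZE SELECT height FROM heights WHERE height = 1.75;

The EXPLAIN ANALYZE output shows the planner's ~13000-row estimate next to
an actual count of roughly 52000.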

Obviously this example is not totally convincing: if I were to post
this to pg-general looking for advice I'm sure that everyone would
tell me to just increase the size of my mcv stats. However, in cases
where the number of distinct values is higher, this isn't always
feasible. Also, why store a list of 50 values and their frequencies
when 10 extra would provide the same plans without bloating
pg_statistics?

To combat this problem, I have two different proposals.

Idea 1: Keep an array of stadistinct that correspond to each bucket size.

In the example above, ( again ignoring mcv's ) the quantile data is

0%     10%    20%    30%    40%    50%    60%    70%    80%    90%    100%
1.38   1.66   1.69   1.71   1.73   1.75   1.77   1.79   1.82   1.85   2.12

with numdistinct values of ( respectively )

29  2  2  2  2  2  2  3  3 25
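
For reference, the bucket boundaries above are just what ANALYZE already
stores and exposes via pg_stats; only the per-bucket numdistinct values
would be new. E.g., for the example table from before:

  SELECT n_distinct, histogram_bounds
    FROM pg_stats
   WHERE tablename = 'heights' AND attname = 'height';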

For the two above examples, this new approach would yield selectivity
estimates of

(1,000,000/10)/2 = 50,000   ( vs an actual expected count of ~52,000 )
and
(1,000,000/10)/25 = 4,000   ( vs an actual expected count of ~2.7 )

Furthermore, this is done without mcvs. Since mcv's would make the
histogram more sensitive to the edges, the estimates with mcv's should
be correspondingly better.


There are two potential problems that I see with this approach:

1) It assumes that = is equivalent to <= and >= combined (i.e. a = b iff
a <= b and a >= b). This is certainly true for real numbers, but is it
true for every equality relation that eqsel predicts for?

2) It bloats the stats table.

Idea 2: Keep a correlation statistic between ndistinct and bucket size

This addresses problem #2.

In lieu of keeping an actual list of ndistinct per histogram bucket,
we store the linear scaling coefficient between histogram bucket width
and ndistinct/(avg ndistinct). To visualize this, it is easiest to
imagine plotting bucket width versus ndistinct. The scaling coefficient
is the slope of the line through the origin that minimizes the squared
difference between its estimate of ndistinct and the actual value.

When I apply this method to the above data I find a coefficient of
13.63 for an average ndistinct of 72/10 = 7.2. The per-bucket estimate
is then (rows per bucket) / (coefficient * avg ndistinct * bucket width),
which for the above two examples gives
(1,000,000/10)/( 13.63*7.2*(1.77 - 1.75) ) = 50,950 ( vs an actual expected count of ~52,000 )
and
(1,000,000/10)/( 13.63*7.2*(2.12 - 1.85) ) = 3,774  ( vs an actual expected count of ~2.7 )

Although this yields better results than idea 1 for this particular
example, it will be much more sensitive to weird distributions.

Obviously there are some special cases to consider: we wouldn't want
the stats to be skewed such that they provide really bad plans.
However, with some carefully designed caps I believe that we could
ensure that the estimates are at least as good as they are now. In
fact, I'm not certain that an R^2 penalty is the correct loss
function. Ideally, we want to minimize the extra time that the db
spends by choosing an incorrect plan. Maybe slight overestimations are
better than slight underestimations? Maybe the cost of the occasional
(really) bad plan is less than the cost of a bunch of kinda bad plans?

Finally, we aren't limited to just one coefficient. We could also
store multiple coefficients to improve our estimates, providing a
compromise between ideas 1 and 2.

Food for future thought...

In addition to the previous benefits, I think that this method has the
potential to make the process by which MCVs are chosen (or not chosen)
smarter. Currently the planner chooses a value to be an mcv candidate if
its frequency is greater than 1.25 * the average frequency. Given
that this improved selectivity estimate is implemented, maybe a better
way would be to include a value as an mcv only if a) it's above a certain
threshold and b) the histogram selectivity estimator does a poor job
on it.

What are people's thoughts on idea 1 vs idea 2?

Am I missing any relevant details about the planner's operation?

Do people think that the improved estimates would be worth the
additional overhead?

-Nathan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics

2008-06-08 Thread Tom Lane
Nathan Boley [EMAIL PROTECTED] writes:
 ... There are two potential problems that I see with this approach:

 1) It assumes that = is equivalent to <= and >= combined. This is certainly
 true for real numbers, but is it true for every equality relation that
 eqsel predicts for?

The cases that compute_scalar_stats is used in have that property, since
the < and = operators are taken from the same btree opclass.

 Do people think that the improved estimates would be worth the
 additional overhead?

Your argument seems to consider only columns having a normal
distribution.  How badly does it fall apart for non-normal
distributions?  (For instance, Zipfian distributions seem to be pretty
common in database work, from what I've seen.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] handling TOAST tables in autovacuum

2008-06-08 Thread Alvaro Herrera
Hi,

We've been making noises about dealing with TOAST tables as separate
entities in autovacuum for some time now.  So here's a proposal:

Let's do it.

That's about it :-)

The only change of some consideration is that we will need two passes
over pg_class to get the list of relations to vacuum, instead of one as
we do currently.  The problem is that we first need to fetch the
(heap relid, toast relid) mapping before attempting to figure out if any
given TOAST table needs vacuuming.  This is because we want to be using
the main table's pg_autovacuum, and we can't get at that unless we know
the main relid.
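
(In SQL terms, the first pass boils down to something like the query below;
autovacuum of course does it with a plain scan of pg_class rather than SQL.)

  SELECT c.oid AS heap_relid, c.reltoastrelid AS toast_relid
    FROM pg_class c
   WHERE c.relkind = 'r' AND c.reltoastrelid <> 0;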

Another open question is whether the TOAST table should be processed at
all if the main table is vacuumed.  My opinion is we don't -- if we're
going to deal with them separately, let's go the whole nine yards.
Autovacuum will only process a toast table when, by itself, it shows
that it needs processing.  (Obviously this doesn't mean we change the
semantics of user-invoked VACUUM -- those will continue to vacuum the
TOAST table along with the main table.)

Should we display TOAST tables separately in pg_stat_*_tables?  (Maybe
pg_stat_toast_tables?)

Thoughts?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] math error or rounding problem Money type

2008-06-08 Thread Gregory Stark
Mark Kirkwood [EMAIL PROTECTED] writes:

 AFAIK (dimly recalling numerical analysis courses at university) SUM and ROUND
 can *never* be commuted. In general the recommended approach is to round as
 late as possible and as few times as possible - so your 1st query is the
 correct or best way to go.

I don't think "as late as possible" applies with money. If you were dealing
with approximate measurements you would want to round as late as possible,
because rounding is throwing away precision. But if you're dealing with money
you're dealing with exact quantities.

There is only going to be one correct time to round and that's whenever you're
creating an actual ledger item or order line item or whatever. Once you've
calculated how much interest to credit or whatever you have to make that
credit an exact number of cents and the $0.004 you lost or gained in rounding
never comes up again.
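
To make that concrete: summing amounts that have already been fixed to whole
cents is exact, whereas rounding a total computed from unrounded values can
come out a cent different:

  SELECT round(sum(x), 2) AS round_late,    -- 1.00
         sum(round(x, 2)) AS round_early    -- 0.99
    FROM (VALUES (0.333::numeric), (0.333), (0.333)) AS t(x);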

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Overhauling GUCS

2008-06-08 Thread Gregory Stark
Joshua D. Drake [EMAIL PROTECTED] writes:

 On Fri, 2008-06-06 at 20:19 -0400, Tom Lane wrote:
 Robert Treat [EMAIL PROTECTED] writes:

 Actually, the reason it's still 10 is that the effort expended to get it
 changed has been *ZERO*.  I keep asking for someone to make some
 measurements, do some benchmarking, anything to make a plausible case
 for a specific higher value as being a reasonable place to set it.

 The silence has been deafening.

 Not surprising really. It is a simple adjustment to make and it also is
 easy to spot when it's a problem. However it is not trivial to test for
 (in terms of time and effort). I know 10 is wrong and so do you. If you
 don't, I am curious why I see so many posts from you saying, "Your
 estimates are off, what is your default_statistics_target?" with yet
 even more responses saying, "Uhh 10."

Ah, but we only ever hear about the cases where it's wrong of course. In other
words even if we raised it to some optimal value we would still have precisely
the same experience of seeing only posts on list about it being insufficient.

What's needed are some speed benchmarks for complex queries with varying
statistics sizes, so we can see how badly large statistics arrays hurt
planning time.

The flip side of seeing how much larger tables help planning accuracy is much
harder to measure. Offhand I don't see any systematic way to go about it.
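
Something along these lines would at least get at the planning-time half
(the table and query are just placeholders, and \timing is psql's):

  CREATE TABLE bench (a int, b int, c text);
  INSERT INTO bench
    SELECT i % 1000, i % 37, md5(i::text) FROM generate_series(1, 100000) i;
  SET default_statistics_target = 10;   -- repeat with 100, 1000, ...
  ANALYZE bench;
  \timing
  EXPLAIN SELECT * FROM bench b1 JOIN bench b2 USING (a) WHERE b1.b < 10;

Plain EXPLAIN rather than EXPLAIN ANALYZE, so the time reported is
essentially parse plus plan.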

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] handling TOAST tables in autovacuum

2008-06-08 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 The only change of some consideration is that we will need two passes
 over pg_class to get the list of relations to vacuum, instead of one as
 we do currently.  The problem is that we first need to fetch the
 (heap relid, toast relid) mapping before attempting to figure out if any
 given TOAST table needs vacuuming.  This is because we want to be using
 the main table's pg_autovacuum, and we can't get at that unless we know
 the main relid.

Umm ... is it chiseled in stone someplace that toast tables shouldn't
have their own pg_autovacuum entries?  Seems like that might be a
reasonable component of a "whole nine yards" approach.

 Should we display TOAST tables separately in pg_stat_*_tables?  (Maybe
 pg_stat_toast_tables?)

+1 for pg_stat_toast_tables, I think.  If you separate them out then
there will need to be some kind of smarts to help the user figure out
which main table a toast table belongs to.  This would be easy with a
separate view.
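
Even a thin view would do; sketch only, but something like

  CREATE VIEW pg_stat_toast_tables AS
    SELECT m.oid AS main_relid, m.relname AS main_relname, s.*
      FROM pg_stat_all_tables s
      JOIN pg_class m ON m.reltoastrelid = s.relid;

would give the user the toast-to-main mapping for free.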

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump restore time and Foreign Keys

2008-06-08 Thread Robert Treat
On Saturday 07 June 2008 16:22:56 Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  Perhaps we need a GUC that says expert_mode = on. ...  Another idea
  might be to make such command options superuser only, to ensure the
  power is available, yet only in the hands of, by-definition, the trusted
  few.

 This all seems pretty useless, as the sort of user most likely to shoot
 himself in the foot will also always be running as superuser.


yeah, i'm not a big fan of set enable_footgun=true since the people likely 
to get tripped up are going to blindly enable these modes. 

otoh, if we do such a thing, i would be a big fan of calling 
it enable_footgun :-)

 I'd much rather see us expend more effort on speeding up the checks
 than open holes in the system.


and i'm sure no one is against that idea, but you're never going to be able to 
match the performance of just avoiding the check. 

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Overhauling GUCS

2008-06-08 Thread Robert Treat
On Sunday 08 June 2008 19:07:21 Gregory Stark wrote:
 Joshua D. Drake [EMAIL PROTECTED] writes:
  On Fri, 2008-06-06 at 20:19 -0400, Tom Lane wrote:
  Robert Treat [EMAIL PROTECTED] writes:
 
  Actually, the reason it's still 10 is that the effort expended to get it
  changed has been *ZERO*.  I keep asking for someone to make some
  measurements, do some benchmarking, anything to make a plausible case
  for a specific higher value as being a reasonable place to set it.
 
  The silence has been deafening.
 
  Not surprising really. It is a simple adjustment to make and it also is
  easy to spot when it's a problem. However it is not trivial to test for
  (in terms of time and effort). I know 10 is wrong and so do you. If you
  don't, I am curious why I see so many posts from you saying, "Your
  estimates are off, what is your default_statistics_target?" with yet
  even more responses saying, "Uhh 10."

 Ah, but we only ever hear about the cases where it's wrong of course. In
 other words even if we raised it to some optimal value we would still have
 precisely the same experience of seeing only posts on list about it being
 insufficient.


The flip side to this is that we're not trying to find the perfect setting, 
we're just trying to determine a number that will cause more benefit than 
harm compared to the number we have now. While I am sure there are cases 
where 100 is too low as well, I cannot recall ever having seen someone 
suggest lowering the default_stats_target to something less than 100.  (I 
know: sit back and wait for someone to comb the archives, just to find that 
one time.)

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] libpq support for arrays and composites

2008-06-08 Thread Andrew Dunstan


One of the areas where libpq seems to be severely lacking is in handling 
arrays and composites in query results. I'd like to set about rectifying 
that.


Ideally this would mean that drivers using libpq could easily and 
reliably deliver such objects suitably structured in their particular 
languages (e.g. lists and hashes in Perl).


One complicating factor I see is that there is no protocol level support 
for anything other than simple objects - each data value is simply a 
stream of bytes of a known length. We would therefore need some pretty 
robust processing to pick apart structured objects.


We'll need a new API to handle such objects. I'm thinking of something like:

PQarray * PQgetArray(const PGresult *res, int row_number, int column_number);

int PQgetArrayNDims(PQarray * array);
int PQgetArrayLower(PQarray * array, int dim);
int PQgetArrayUpper(PQarray * array, int dim);
int PQgetArrayElementLength(PQarray * array, int dim1, ...);
bool PQgetArrayElementIsNull(PQarray * array, int dim1, ...);
char * PQgetArrayElement(PQarray * array, int dim1, ...);

PQcomposite * PQgetComposite(const PGresult *res, int row_number, int column_number);

PQcomposite * PQgetArrayElementComposite(PQarray * array, int dim1, ...);
int PQgetCompositeNFields(PQcomposite * composite);
char * PQgetCompositeFName(PQcomposite * composite, int fnumber);
int PQgetCompositeFNumber(PQcomposite * composite, char * fname);
Oid PQgetCompositeFType(PQcomposite * composite, int fnumber);
int PQgetCompositeFieldLength(PQcomposite * composite, int fnumber);
bool PQgetCompositeFieldIsNull(PQcomposite * composite, int fnumber);
char * PQgetCompositeField(PQcomposite * composite, int fnumber);

Not sure if we need analogs for PQfformat, PQfmod or PQfsize - I suspect 
not, but right now I'm just thinking out loud.


Thoughts? Is this worth doing?

cheers

andrew







--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] libpq support for arrays and composites

2008-06-08 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 One complicating factor I see is that there is no protocol level support 
 for anything other than simple objects - each data value is simply a 
 stream of bytes of a known length. We would therefore need some pretty 
 robust processing to pick apart structured objects.

Well, it's not that hard; the quoting/escaping rules for array_out and
record_out are well defined.
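
For instance, the text representations a parser has to cope with look like
this (output shown in the comments):

  SELECT ARRAY[1, NULL, 3]   AS a,   -- {1,NULL,3}
         ARRAY['a b', 'c,d'] AS t,   -- {"a b","c,d"}
         ROW(1, 'x,y', NULL) AS r;   -- (1,"x,y",)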

Are you intending that these operations support both text and binary
results?

The array accessors with ... parameter lists strike me as a bit
dangerous, because there is no way at all to verify that the caller is
passing the expected number of dimensions.  Can't that be made tighter?
Also you need to spell out the error handling conventions for each of
these.

I think you missed some free() operations.

It might also be useful to provide some functions that form an array or
composite value from per-element strings, ie, the converse of the
de-construction routines.  Here I'd be happy to skip the binary case.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump restore time and Foreign Keys

2008-06-08 Thread Tom Lane
Robert Treat [EMAIL PROTECTED] writes:
 and i'm sure no one is against that idea, but you're never going to be able
 to match the performance of just avoiding the check.

We'll never be able to match the performance of not having transactions,
either, but the community has never for a moment considered having a
"no transactions" mode.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] handling TOAST tables in autovacuum

2008-06-08 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  The only change of some consideration is that we will need two passes
  over pg_class to get the list of relations to vacuum, instead of one as
  we do currently.  The problem is that we first need to fetch the
  (heap relid, toast relid) mapping before attempting to figure out if any
  given TOAST table needs vacuuming.  This is because we want to be using
  the main table's pg_autovacuum, and we can't get at that unless we know
  the main relid.
 
 Umm ... is it chiseled in stone someplace that toast tables shouldn't
 have their own pg_autovacuum entries?  Seems like that might be a
 reasonable component of a whole nine yards approach.

No, but I think it's a bit awkward for users if we follow _only_ the toast
table's own entry.  I forgot to mention that in the patch I currently have,
what autovacuum does is try to get the TOAST table's own pg_autovacuum entry,
and if that fails, fall back to the main rel's entry.
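
In SQL terms the lookup order is roughly this (just a sketch -- autovacuum
does the equivalent in C against pg_autovacuum.vacrelid):

  SELECT t.relname AS toast_table,
         coalesce(t_av.enabled, m_av.enabled) AS autovac_enabled
    FROM pg_class t                                -- the toast table
    JOIN pg_class m  ON m.reltoastrelid = t.oid    -- its main table
    LEFT JOIN pg_autovacuum t_av ON t_av.vacrelid = t.oid
    LEFT JOIN pg_autovacuum m_av ON m_av.vacrelid = m.oid
   WHERE t.relkind = 't';

(A NULL result just means "fall back to the global autovacuum setting".)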

The point here is that if the user disables autovac for the main table,
then it's expected that it is automagically disabled for the toast table
as well, for the usual case where they are disabling it because the
table is too big.  Automatically processing the toast table would be
completely unexpected, and most likely unwelcome.

Of course, for the even rarer cases when you want to disable it for the
main rel and enable it for the toast table, you can do that too.  (I
can't think of a case where this would be useful though.)

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] handling TOAST tables in autovacuum

2008-06-08 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 The point here is that if the user disables autovac for the main table,
 then it's expected that it is automagically disabled for the toast table
 as well, for the usual case where they are disabling it because the
 table is too big.

Hmm, good point.  OK, two passes it is.  (I thought about remembering
the toast table rows in memory so as not to scan the catalog twice,
but I'm not sure you really save much that way.)

Another thing to think about here is locking: I believe you need to get
a vacuum-type lock on the parent table, not only on the toast table, so
vacuuming a toast table without any knowledge of which table is its
parent ain't gonna fly anyway.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] libpq support for arrays and composites

2008-06-08 Thread Andrew Dunstan



Tom Lane wrote:

Andrew Dunstan [EMAIL PROTECTED] writes:
  
One complicating factor I see is that there is no protocol level support 
for anything other than simple objects - each data value is simply a 
stream of bytes of a known length. We would therefore need some pretty 
robust processing to pick apart structured objects.



Well, it's not that hard, the quoting/escaping rules for array_out and
record_out are well defined.

Are you intending that these operations support both text and binary
results?
  


I'm a bit open on that.


The array accessors with ... parameter lists strike me as a bit
dangerous, because there is no way at all to verify that the caller is
passing the expected number of dimensions.  Can't that be made tighter?
  


Well, the only alternative I can think of is to make the client walk the 
array one dimension at a time. Something like:


PQarray * PQgetInnerArray(PQarray * array, int dim);

then when we're down to the leaf level, we could have:

int PQgetArrayElementLength(PQarray * array, int dim);
bool PQgetArrayElementIsNull(PQarray * array, int dim);
char * PQgetArrayElement(PQarray * array, int dim);

That strikes me as somewhat more cumbersome, so I guess the question is 
whether it's worth it.  It probably fits the slightly clunky feel of libpq.



Also you need to spell out the error handling conventions for each of
these.

I think you missed some free() operations.
  



Oh, yes, both of these are certainly true. This isn't really even a 
proposal yet, more a sketch that would lead to a proposal. I'm hoping to 
get some other input too, before settling this down, especially from 
driver writers.

It might also be useful to provide some functions that form an array or
composite value from per-element strings, ie, the converse of the
de-construction routines.  Here I'd be happy to skip the binary case.


  


Yeah, that had occurred to me. Will think about it more, although it 
could possibly be done as a separate project, too.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump restore time and Foreign Keys

2008-06-08 Thread Robert Treat
On Sunday 08 June 2008 20:12:15 Tom Lane wrote:
 Robert Treat [EMAIL PROTECTED] writes:
  and i'm sure no one is against that idea, but you're never going to be
  able to match the performance of just avoiding the check.

 We'll never be able to match the performance of not having transactions,
 either, but the community has never for a moment considered having a
 no transactions mode.


it's unclear what a "no transactions" mode would mean, but I'd be willing to 
guess some people have considered aspects of it (we've just never had 
agreement)

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Automating our version-stamping a bit better

2008-06-08 Thread Tom Lane
So while tagging the upcoming releases, I got annoyed once again about
what a tedious, error-prone bit of donkeywork it is.  You've got to find
and update the sub-version numbers, and *not* any chance occurrence of
the same strings (eg s/20/21/g for version 7.4.21 would've mangled some
copyright dates).  And the changes tend to move around a little bit in
each back branch, making it even easier to blow it.  ISTM we should get
the machine to do it for us.

I propose to write a little perl script to be used like this:

cd top-level-of-tree
sh src/tools/version_stamp 22
cvs commit -m "Stamp release 7.4.22"

The script takes just one argument, which could be "devel", "betaN",
"rcN", or just a minor version number N.  Note the assumption that the
script knows the major version.  Since we expect to adjust the script
from time to time for version changes anyway, I don't see why we
shouldn't have the major version stored right in the script.  Tagging a
new development branch after a release is split off would then look like

cd src/tools
edit version_stamp, update a variable assignment at its head
cvs commit -m "Update version_stamp for 8.5"
cd ../..
sh src/tools/version_stamp devel
cvs commit -m "Stamp CVS HEAD as 8.5devel"

Note that this is not all that helpful if we just do it in CVS HEAD.
I propose adding the script to all active branches back to 7.4, with
suitable adjustments for each branch as needed.

I think we should probably include configure.in in the set of files
that this script updates, and get rid of the current two-step
arrangement where Marc stamps configure.in/configure after somebody
else stamps everything else.  Marc's tarball-wrapping process would
thus look roughly like

sh src/tools/version_stamp 4
autoconf
cvs commit -m "Stamp release 8.3.4"
cvs tag REL8_3_4
... build tarball ...

I'm tempted to suggest letting the script invoke autoconf, too,
but that would require standardizing where to find the correct
version of autoconf for each branch; so it might not be such a
great idea. 

Thoughts, objections?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Automating our version-stamping a bit better

2008-06-08 Thread Andrew Dunstan



Tom Lane wrote:


I'm tempted to suggest letting the script invoke autoconf, too,
but that would require standardizing where to find the correct
version of autoconf for each branch; so it might not be such a
great idea. 

  


Unfortunately that's true. Maybe we could agree on using an alias for 
the right version of autoconf, but it seems likely to be error prone.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Automating our version-stamping a bit better

2008-06-08 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 I'm tempted to suggest letting the script invoke autoconf, too,
 but that would require standardizing where to find the correct
 version of autoconf for each branch; so it might not be such a
 great idea.  

 Unfortunately that's true. Maybe we could agree on using an alias for 
 the right version of autoconf, but it seems likely to be error prone.

Actually, the way I do things is that my setup script for working
with each particular version tree includes adjusting $PATH so that
the right autoconf gets found just by saying "autoconf".  If everyone
who might tag releases wanted to do it the same way, then we could
just let the script say "autoconf".  But I'm not sure anybody else
likes that plan.  What I was thinking was just to have the script
print out something like

    Tagged tree as 8.3.4
    Don't forget to run autoconf 2.59 before committing

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] handling TOAST tables in autovacuum

2008-06-08 Thread Euler Taveira de Oliveira

Alvaro Herrera wrote:


We've been making noises about dealing with TOAST tables as separate
entities in autovacuum for some time now.  So here's a proposal:


Let's keep it simple. Why not just add a toast_enabled flag (disabled
by default) to pg_autovacuum? If it's set, then both the main and toast
tables are processed by autovac.

FYI, I have a WIP patch to remove pg_autovacuum in favor of reloptions.
And based on your proposal, reloptions would need to be added to toast
tables too. IMO, we should keep that code as simple as possible.



--
  Euler Taveira de Oliveira
  http://www.timbira.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [CORE] [HACKERS] Automating our version-stamping a bit better

2008-06-08 Thread Marc G. Fournier

--On Sunday, June 08, 2008 21:27:03 -0400 Tom Lane [EMAIL PROTECTED] wrote:

 Andrew Dunstan [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 I'm tempted to suggest letting the script invoke autoconf, too,
 but that would require standardizing where to find the correct
 version of autoconf for each branch; so it might not be such a
 great idea. 

 Unfortunately that's true. Maybe we could agree on using an alias for
 the right version of autoconf, but it seems likely to be error prone.

 Actually, the way I do things is that my setup script for working
 with each particular version tree includes adjusting $PATH so that
 the right autoconf gets found just by saying autoconf.  If everyone
 who might tag releases wanted to do it the same way, then we could
 just let the script say autoconf.  But I'm not sure anybody else
 likes that plan.  What I was thinking was just to have the script
 print out something like

   Tagged tree as 8.3.4
   Don't forget to run autoconf 2.59 before committing

I like that one ...

-- 
Marc G. Fournier        Hub.Org Hosting Solutions S.A. (http://www.hub.org)
Email . [EMAIL PROTECTED]              MSN . [EMAIL PROTECTED]
Yahoo . yscrappy              Skype: hub.org        ICQ . 7615664


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] handling TOAST tables in autovacuum

2008-06-08 Thread Alvaro Herrera
Euler Taveira de Oliveira wrote:

 FYI, I have a WIP patch to remove pg_autovacuum in favor of reloptions.  

Really?  Please send it my way to review/apply as soon as you have it
ready, independently of what we do with toast tables.

 Let's keep it simple. Why not just adding a toast_enabled flag (disabled
 by default) in pg_autovacuum? If it's set then main and toast tables are
 processed by autovac.

Actually I think your proposal is more cumbersome to use and less
flexible, because you can't set specific values for the other options
for toast tables.

 And based on your proposal, reloptions would need to be added to toast
 tables too. IMO, we should keep that code as simple as possible.

Sure, what's the problem with that?  We only need to make sure that
ALTER TABLE works for setting reloptions for toast tables.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] libpq support for arrays and composites

2008-06-08 Thread Andrew Chernow

Andrew Dunstan wrote:


One of the areas where libpq seems to be severely lacking is in handling 
arrays and composites in query results. I'd like to set about rectifying 
that.


Ideally this would mean that drivers using libpq could easily and 
reliably deliver such objects suitably structured in their particular 
languages (e.g. lists and hashes in Perl).


One complicating factor I see is that there is no protocol level support 
for anything other than simple objects - each data value is simply a 
stream of bytes of a known length. We would therefore need some pretty 
robust processing to pick apart structured objects.


We'll need a new API to handle such objects. I'm thinking of something 
like:


PQarray * PQgetArray( const PGresult *res, int row_number, int 
column_number);

int PQgetArrayNDims(PQarray * array);
int PQgetArrayLower(PQarray * array, int dim);
int PQgetArrayUpper(PQarray * array, int dim);
int PQgetArrayElementLength(PQarray * array, int dim1, ...);
bool PQgetArrayElementIsNull(PQarray * array, int dim1, ...);
char * PQgetArrayElement(PQarray * array, int dim1, ...);

PQcomposite * PQgetComposite(const PGresult *res, int row_number, int 
column_number);

PQcomposite * PQgetArrayElementComposite(PQarray * array, int dim1, ...);
int PQgetCompositeNFields(PQcomposite * composite);
char * PQgetCompositeFName(PQcomposite * composite, int fnumber);
int PQgetCompositeFNumber(PQcomposite * composite, char * fname);
Oid PQgetCompositeFType(PQcomposite * composite, int fnumber);
int PQgetCompositeFieldLength(PQcomposite * composite, int fnumber);
bool PQgetCompositeFieldIsNull(PQcomposite * composite, int fnumber);
char * PQgetCompositeField(PQcomposite * composite, int fnumber);

Not sure if we need analogs for PQfformat, PQfmod or PQfsize - I suspect 
not, but right now I'm just thinking out loud.


Thoughts? Is this worth doing?

cheers

andrew





libpqtypes has already implemented this.  It takes a different approach but
provides the same functionality, and in addition can handle every data type.
libpqtypes uses the PGresult API for composites and arrays, rather than
adding a new set of functions.


To support this, one must be able to convert all data types (unless you are only 
supporting text results) because composites can be made up of any data type.


Simple arrays:
http://libpqtypes.esilo.com/man3/pqt-specs.html#array

Composite arrays:
http://libpqtypes.esilo.com/man3/pqt-composites.html

EXAMPLE OF GETTING A COMPOSITE:
(taken from http://libpqtypes.esilo.com/ home page)

/* Let's get a composite.
 * CREATE TYPE simple AS (a int4, t text);
 */
PGint4 i4;
PGtext text;
PGresult *res, *simple;
int resultFormat = 1;

/* Your composites need to be registered */
PQregisterTypeHandler(conn, "simple", NULL, NULL);

/* 2nd arg, PGparam, can be NULL if there are no query params.
 * Composites require binary results, so we can't use PQexec().
 */
res = PQparamExec(conn, NULL, "SELECT my_simple FROM t", resultFormat);
if(!res)
  fprintf(stderr, "ERROR: %s\n", PQgeterror());

/* Get the simple composite, which is exposed as a PGresult. */
PQgetf(res, 0, "%simple", 0, &simple);
PQclear(res); /* no longer needed */

/* Get the simple composite attributes from the simple result.
 * Reference fields by name by using a '#' rather than a '%'.
 * The field names are the composite attribute names.
 */
PQgetf(simple, 0, "#int4 #text", "a", &i4, "t", &text);
PQclear(simple);

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] handling TOAST tables in autovacuum

2008-06-08 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Euler Taveira de Oliveira wrote:
 And based on your proposal, it'll be needed to add reloptions to toast  
 tables too. IMO, we should keep that code as simple as possible.

 Sure, what's the problem with that?  We only need to make sure that
 ALTER TABLE works for setting reloptions for toast tables.

... actually, the problem is going to be how do you get pg_dump to dump
and reload such settings?  The toast tables are not going to have the
same names after dump/reload.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers