Re: [PATCHES] [DOCS] OS/X startup scripts

2007-05-13 Thread Neil Conway
On Sun, 2007-05-13 at 18:58 -0700, David Fetter wrote:
> cvs diff works just great until you want to add or remove a file
> without write permissions to the CVS repository, i.e. when you've
> checked out as anonymous.

Personally, I usually work against a checkout from a local mirror of the
CVS repository (which you can create via cvsup or rsync). With that
setup, "cvs add" and "cvs diff -N" work fine, since you can arrange for
write access to the local mirror.

(I'm always surprised to hear that anyone does a non-trivial amount of
work on Postgres without setting up a CVS mirror...)

-Neil



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] [DOCS] Autovacuum and XID wraparound

2007-05-13 Thread Neil Conway
On Sun, 2007-05-13 at 22:06 -0400, Tom Lane wrote:
> This fact is already documented in at least three places; do we really
> need two more?

I think we need to at least modify the documentation for the autovacuum
GUC parameter, which currently states only that it "controls whether the
server should run the autovacuum launcher daemon" -- this is not
strictly true, and in any case, it isn't the whole story.

> The proposed addition to postgresql.conf seems particularly
> over-the-top

I agree that this information doesn't really belong in postgresql.conf.

-Neil





Re: [PATCHES] Updateable cursors patch

2007-05-13 Thread Jaime Casanova

On 4/4/07, FAST PostgreSQL <[EMAIL PROTECTED]> wrote:
> Attached is a working updateable cursors patch. The core functionality has
> been implemented and the patch also contains the regression tests and
> documentation.

This one doesn't apply cleanly to HEAD because of the changes in
http://archives.postgresql.org/pgsql-committers/2007-04/msg00447.php

--
regards,
Jaime Casanova

"Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning."
  Richard Cook



Re: [PATCHES] [DOCS] OS/X startup scripts

2007-05-13 Thread Mark Kirkwood

David Fetter wrote:
> cvs diff works just great until you want to add or remove a file
> without write permissions to the CVS repository, i.e. when you've
> checked out as anonymous.

I usually save an untouched version of the tree to compare against, so
something like:

$ diff -Nacr pgsql.orig pgsql

gives a complete patch including added/deleted files. It is a bit
primitive, but is pretty easy to do!


Cheers

Mark



Re: [PATCHES] [DOCS] Autovacuum and XID wraparound

2007-05-13 Thread David Fetter
On Sun, May 13, 2007 at 10:06:40PM -0400, Tom Lane wrote:
> David Fetter <[EMAIL PROTECTED]> writes:
> > Per Neil Conway, here's some doc patches re: the autovacuum
> > daemon's behavior.  Should this be back-patched to 8.2x?
> 
> This fact is already documented in at least three places; do we
> really need two more?

Yes.

> The proposed addition to postgresql.conf seems particularly
> over-the-top, since there is no entry in that file that even
> pretends to offer a complete description of the associated behavior.

I think that a boolean that doesn't do what you expect booleans to do,
i.e. turn the thing all the way off, is worth a mention.

Cheers,
David.
-- 
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
phone: +1 415 235 3778    AIM: dfetter666
  Skype: davidfetter

Remember to vote!
Consider donating to PostgreSQL: http://www.postgresql.org/about/donate



Re: [PATCHES] [DOCS] Autovacuum and XID wraparound

2007-05-13 Thread Tom Lane
David Fetter <[EMAIL PROTECTED]> writes:
> Per Neil Conway, here's some doc patches re: the autovacuum daemon's
> behavior.  Should this be back-patched to 8.2x?

This fact is already documented in at least three places; do we really
need two more?  The proposed addition to postgresql.conf seems
particularly over-the-top, since there is no entry in that file that
even pretends to offer a complete description of the associated
behavior.

regards, tom lane



Re: [PATCHES] [DOCS] OS/X startup scripts

2007-05-13 Thread David Fetter
On Sun, May 13, 2007 at 09:51:53PM -0400, Tom Lane wrote:
> David Fetter <[EMAIL PROTECTED]> writes:
> > On Sun, May 13, 2007 at 07:04:44PM -0400, Andrew Dunstan wrote:
> >> Tom Lane wrote:
> >>> Strange, it works fine for everyone else.
> >> 
> >> Especially if you have cvsutils installed (can be found in many
> >> places including fedora extras).
> 
> > I didn't know about those, so I'll prepare a patch to the
> > developer and documenter docs that mentions this utility :)
> 
> I dunno what cvsutils is, but I do know that plain old "cvs diff"
> works fine whether you have commit privs or not.  Rather than
> preparing a patch to our docs, perhaps you should spend some time
> reading the CVS docs.

cvs diff works just great until you want to add or remove a file
without write permissions to the CVS repository, i.e. when you've
checked out as anonymous.

Cheers,
David.
-- 
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
phone: +1 415 235 3778    AIM: dfetter666
  Skype: davidfetter

Remember to vote!
Consider donating to PostgreSQL: http://www.postgresql.org/about/donate



[PATCHES] On patching without write access to CVS

2007-05-13 Thread David Fetter
Folks,

Thanks to Andrew Dunstan for pointing me toward cvsutils.  As not
everybody knows about them, here's a small patch which lets people
know at least in theory where they are.

Cheers,
D
-- 
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
phone: +1 415 235 3778    AIM: dfetter666
  Skype: davidfetter

Remember to vote!
Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
Index: doc/FAQ_DEV
===
RCS file: /projects/cvsroot/pgsql/doc/FAQ_DEV,v
retrieving revision 1.135
diff -c -r1.135 FAQ_DEV
*** doc/FAQ_DEV 5 May 2007 14:33:55 -   1.135
--- doc/FAQ_DEV 14 May 2007 01:56:11 -
***
*** 108,114 
 work. Failure to do so might mean your patch is rejected. If your work
 is being sponsored by a company, read this article for tips on being
 more effective.
!
 A web site is maintained for patches awaiting review,
 http://momjian.postgresql.org/cgi-bin/pgpatches, and those that are
 being kept for the next release,
--- 108,120 
 work. Failure to do so might mean your patch is rejected. If your work
 is being sponsored by a company, read this article for tips on being
 more effective.
! 
!To create patches which would otherwise require that you have write
!access to the CVS repository, for example ones that add or remove
!files, you can use cvsutils.  The cvsutils toolchain is packaged
!for many operating systems and available in source form at
!http://www.red-bean.com/cvsutils/
! 
 A web site is maintained for patches awaiting review,
 http://momjian.postgresql.org/cgi-bin/pgpatches, and those that are
 being kept for the next release,
Index: doc/src/sgml/cvs.sgml
===
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/cvs.sgml,v
retrieving revision 1.42
diff -c -r1.42 cvs.sgml
*** doc/src/sgml/cvs.sgml   27 Mar 2007 01:45:22 -  1.42
--- doc/src/sgml/cvs.sgml   14 May 2007 01:56:11 -
***
*** 15,20 
--- 15,24 
  Thomas
  Lockhart
 
+
+ David
+ Fetter
+

1999-05-20
   
***
*** 150,155 
--- 154,166 
 comes with CVS, or see the online
 documentation at http://www.nongnu.org/cvs/";>.

+   
+For those things which CVS does not do
+by itself, such as letting you create patches without write access,
+you can use cvsutils, which is packaged
+for many operating systems, or available in source form at http://www.red-bean.com/cvsutils/";>.
+   
   
  
   



Re: [PATCHES] Concurrent psql patch

2007-05-13 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> So would you prefer \g& as Jim Nasby suggested? I hadn't even considered that
> previously since I'm not accustomed to using \g but it does seem kind of
> pretty. I normally use ; but I suppose there's nothing wrong with just
> declaring that asynchronous commands must be issued using \g& rather than use
> the semicolon to fire them off.

It makes sense to me... but what is the state of the session afterward?
Should this be combined with switching to another connection?

regards, tom lane



Re: [PATCHES] [DOCS] OS/X startup scripts

2007-05-13 Thread Tom Lane
David Fetter <[EMAIL PROTECTED]> writes:
> On Sun, May 13, 2007 at 07:04:44PM -0400, Andrew Dunstan wrote:
>> Tom Lane wrote:
>>> Strange, it works fine for everyone else.
>> 
>> Especially if you have cvsutils installed (can be found in many
>> places including fedora extras).

> I didn't know about those, so I'll prepare a patch to the developer
> and documenter docs that mentions this utility :)

I dunno what cvsutils is, but I do know that plain old "cvs diff" works
fine whether you have commit privs or not.  Rather than preparing a
patch to our docs, perhaps you should spend some time reading the CVS
docs.

regards, tom lane



Re: [PATCHES] [DOCS] OS/X startup scripts

2007-05-13 Thread David Fetter
On Sun, May 13, 2007 at 07:04:44PM -0400, Andrew Dunstan wrote:
> 
> 
> Tom Lane wrote:
> >David Fetter <[EMAIL PROTECTED]> writes:
> >  
> >>I haven't included the customary diffs.  This points me to some of
> >>the many deficiencies of CVS, namely that I would need write
> >>access in order to have it create a diff,
> >
> >Strange, it works fine for everyone else.
> 
> Especially if you have cvsutils installed (can be found in many
> places including fedora extras).

I didn't know about those, so I'll prepare a patch to the developer
and documenter docs that mentions this utility :)

Cheers,
D
-- 
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
phone: +1 415 235 3778    AIM: dfetter666
  Skype: davidfetter

Remember to vote!
Consider donating to PostgreSQL: http://www.postgresql.org/about/donate



[PATCHES] Autovacuum and XID wraparound

2007-05-13 Thread David Fetter
Folks,

Per Neil Conway, here's some doc patches re: the autovacuum daemon's
behavior.  Should this be back-patched to 8.2x?

Cheers,
D
-- 
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
phone: +1 415 235 3778    AIM: dfetter666
  Skype: davidfetter

Remember to vote!
Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
Index: doc/src/sgml/config.sgml
===
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/config.sgml,v
retrieving revision 1.122
diff -c -r1.122 config.sgml
*** doc/src/sgml/config.sgml20 Apr 2007 02:37:37 -  1.122
--- doc/src/sgml/config.sgml14 May 2007 01:16:02 -
***
*** 3172,3177 
--- 3172,3185 
  This parameter can only be set in the postgresql.conf
  file or on the server command line.
 
+
+ 
+   Even when this variable is set to off, the autovacuum daemon
+   will run periodically in order to prevent transaction_id
+   wraparound.  See  for
+   more information.
+ 
+

   
  
Index: src/backend/utils/misc/postgresql.conf.sample
===
RCS file: 
/projects/cvsroot/pgsql/src/backend/utils/misc/postgresql.conf.sample,v
retrieving revision 1.215
diff -c -r1.215 postgresql.conf.sample
*** src/backend/utils/misc/postgresql.conf.sample   18 Apr 2007 16:44:18 
-  1.215
--- src/backend/utils/misc/postgresql.conf.sample   14 May 2007 01:16:02 
-
***
*** 372,377 
--- 372,379 
  #---
  # AUTOVACUUM PARAMETERS
  #---
+ # Note: even when autovacuum is turned off, the autovacuum daemon will
+ # run in order to prevent transaction id wraparound.
  
  #autovacuum = on  # enable autovacuum subprocess?
# 'on' requires stats_start_collector



Re: [PATCHES] Have vacuum emit a warning when it runs out of maintenance_work_mem

2007-05-13 Thread Tom Lane
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
>>> Or we could switch to a more compact representation of the dead tuples, 
>>> and not need such a big maintenance_work_mem in the first place.

> One idea is to use a compressed bitmap like in the bitmap index patch, 
> and a tree of block numbers or ranges to allow random access to it.

I thought a bit about that but it doesn't seem tremendously appealing,
at least not as the only representation, because it's not more compact
for small numbers of dead tuples per page.  (And we don't have the "out"
of switching to lossy representation.)

Here's a design sketch that works if we are willing to limit VACUUM's
usable maintenance_work_mem to 4GB:

1. Store an array of page numbers plus offsets into a second working
array of uint16 (the offsets are 32 bits, whence the 4GB limitation).
This requires 8 bytes per page-with-dead-tuples, and since it will be
built in order as a byproduct of our scanning order, it can be
binary-searched on the page number.

2. The offset part of the per-page entry points at a segment of the
uint16 array belonging to this page.  It can have one of 2 formats.
For a small number of dead tuples on the page, we just store an
array of line numbers.  For a larger number, we store a bitmap
showing the positions of dead tuples.  While we scan a page, we
accumulate dead tuple line numbers in a small working array, and
then either copy those to the large array or build a bitmap from
them, depending on which will be smaller.  Since the offsets into
the uint16 array will always be even, we can usurp the low-order
bit of the pointer word to distinguish which representation is
stored.

3. You might think we need to spend an additional word storing
how many line numbers or bitmap words there are per page, but
we can save that space by comparing offsets of adjacent entries
in the per-page array, since we know they're stored adjacently.

I envision the per-page array as being built upwards from the bottom of
a single large maintenance_work_mem-sized array, and the uint16 array
data as being filled from the top down, and whenever the two pointers
are in danger of crossing, we stop and do an index vacuum cycle, just
like in the current logic.  This lets us avoid having to guess in
advance how much per-page versus per-tuple space we will need.  Note
this means the end of a page entry's uint16 data is determined by
looking at the prior page entry's offset instead of the next one,
but that seems no big problem.

So the lookup process involves a binary search on the page number only,
and then either a scan of the tuple line numbers or a single bitmap
probe.  (We could make the scan be a binary search, but since that
representation will only be used with small numbers of tuples, it'd
probably not be any faster than a simple search loop.)  AFAICS that
ought to be as fast or faster than the current lookup methodology;
significantly faster where there are many dead tuples per page.

The space requirements are:

No dead tuples on page   0 bytes  (same as now)
1 dead tuple on page    10 bytes  (vs 6 now)
2 dead tuples           12 bytes  (same as now)
3 dead tuples           14 bytes  (vs 18 now)

and so on, except that for typical table densities of say 100 tuples per
page, we will switch over to the bitmap representation at 6 or so dead
tuples per page, and so the requirement will never go beyond about 20
bytes per page whereas the current method could get as bad as 600 bytes
for an all-dead page.

What this says is that it's not worth changing if you expect low
dead-tuple densities (which IIRC was the assumption I made when I
designed the current representation).  In a table with an average of
less than 1 dead tuple per page, this way is a loser.  OTOH, with very
low dead-tuple densities it may hardly matter, since you're going to
have to scan many GB of heap before you fill maintenance_work_mem
anyway.

If you assume that a more typical scenario is vacuuming after 10%
or so tuple "churn", then there would be 10 or so dead tuples per
page, which makes this a good win: about 20 vs about 60 bytes per
page, with the win going up the longer vacuum is delayed.

HOT would take away some of the win, probably, but I'm not sure
how much.

Comments?  Can anyone invent a better data structure?

regards, tom lane



Re: [PATCHES] Concurrent psql patch

2007-05-13 Thread Gregory Stark
"Tom Lane" <[EMAIL PROTECTED]> writes:

> Gregory Stark <[EMAIL PROTECTED]> writes:
>> "David Fetter" <[EMAIL PROTECTED]> writes:
>>> What's the reasoning behind \c&?  Does it "send things into the
>>> background" the way & does in the shell?
>
>> Sort of. It sends the *subsequent* command to the background...
>
> That sounds just bizarre.  Existing backslash commands that do something
> to a SQL command are typed *after* the command they affect (\g for
> instance).  I don't think you should randomly change that.

So would you prefer \g& as Jim Nasby suggested? I hadn't even considered that
previously since I'm not accustomed to using \g but it does seem kind of
pretty. I normally use ; but I suppose there's nothing wrong with just
declaring that asynchronous commands must be issued using \g& rather than use
the semicolon to fire them off.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [PATCHES] [DOCS] OS/X startup scripts

2007-05-13 Thread Andrew Dunstan



Tom Lane wrote:
> David Fetter <[EMAIL PROTECTED]> writes:
>> I haven't included the customary diffs.  This points me to some of the
>> many deficiencies of CVS, namely that I would need write access in
>> order to have it create a diff,
>
> Strange, it works fine for everyone else.

Especially if you have cvsutils installed (can be found in many places
including fedora extras).


cheers

andrew



Re: [PATCHES] [DOCS] OS/X startup scripts

2007-05-13 Thread Tom Lane
David Fetter <[EMAIL PROTECTED]> writes:
> I haven't included the customary diffs.  This points me to some of the
> many deficiencies of CVS, namely that I would need write access in
> order to have it create a diff,

Strange, it works fine for everyone else.

regards, tom lane



Re: [PATCHES] Concurrent psql patch

2007-05-13 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> "David Fetter" <[EMAIL PROTECTED]> writes:
>> What's the reasoning behind \c&?  Does it "send things into the
>> background" the way & does in the shell?

> Sort of. It sends the *subsequent* command to the background...

That sounds just bizarre.  Existing backslash commands that do something
to a SQL command are typed *after* the command they affect (\g for
instance).  I don't think you should randomly change that.

regards, tom lane



Re: [PATCHES] Concurrent psql patch

2007-05-13 Thread Gregory Stark

"Jim C. Nasby" <[EMAIL PROTECTED]> writes:

> On Sun, May 13, 2007 at 02:39:45PM +0100, Gregory Stark wrote:
>> 
>> I was originally thinking \c1, \c2, ... for \cswitch and \c& for \cnowait.
>> I'm not sure if going for cryptic short commands is better or worse here.
>
> \c& sounds fine (as do \c1...\c9). \g& would probably be helpful as well
> (send query buffer to server in nowait mode).

Er, I just realized I typed the wrong thing there. It can't be \c& since I
already assign a meaning to that: "make a new connection to the same place as
this one".

I meant \&

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [PATCHES] Concurrent psql patch

2007-05-13 Thread Jim C. Nasby
On Sun, May 13, 2007 at 02:39:45PM +0100, Gregory Stark wrote:
> "Jim Nasby" <[EMAIL PROTECTED]> writes:
> 
> > I don't see how we could make the names shorter without moving away from a
> > backslash command (which I'm guessing would be painful).
> >
> > Assuming we're stuck with a backslash command \cs[witch] and \cn
> > [owait] seem to be about as good as we could get.
> 
> I don't have \cs or \cn set up as abbreviations. 
> 
> I was originally thinking \c1, \c2, ... for \cswitch and \c& for \cnowait. I'm
> not sure if going for cryptic short commands is better or worse here.

Would \c# limit us to 9 concurrent connections? Might want

\cs[witch] [session]

which would switch to the specified session. If none specified, it would
switch back to whatever session was previously active.

\c& sounds fine (as do \c1...\c9). \g& would probably be helpful as well
(send query buffer to server in nowait mode).
-- 
Jim Nasby  [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



Re: [PATCHES] Have vacuum emit a warning when it runs out of maintenance_work_mem

2007-05-13 Thread Jim C. Nasby
On Sun, May 13, 2007 at 11:19:07AM +0100, Heikki Linnakangas wrote:
> Jim C. Nasby wrote:
> >On Sat, May 12, 2007 at 07:57:44PM +0100, Heikki Linnakangas wrote:
> >>Or we could switch to a more compact representation of the dead tuples, 
> >>and not need such a big maintenance_work_mem in the first place.
> >
> >Sure, but even with a more compact representation you can still run out
> >of maintenance_work_mem... unless we allow this to spill to disk. At
> >first guess that sounds insane, but if you've got a large enough set of
> >indexes it *might* actually be faster.
> 
> It would only make sense if the table is clustered on an index, so that 
> you'd in practice only need to keep part of the array in memory at a 
> time. It's pretty narrow use case, not worth spending time on I think.

There might be ways to get around that. For example, instead of testing
every index entry one at a time, you could read in several pages of
index entries, sort the entries based on ctid, and then use that to do
the lookups. Might be worth looking at one of these days...
-- 
Jim Nasby  [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



[PATCHES] OS/X startup scripts

2007-05-13 Thread David Fetter
Folks,

The attached tarball should be unpacked in contrib/startup-scripts,
and supplants PostgreSQL.darwin and StartupParameters.plist.darwin.
Thanks to Mark Cotner of Reflectr
 fame for this update :)

I haven't included the customary diffs.  This points me to some of the
many deficiencies of CVS, namely that I would need write access in
order to have it create a diff, and write access is boolean, which
means that I can't get write access only to the parts of the tree that
make sense for me to have write access to.

What say we consider using Git, which includes a CVS interface,
starting after 8.3 gets out the door?

Cheers,
David.
-- 
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!
Consider donating to PostgreSQL: http://www.postgresql.org/about/donate


PostgreSQL_startup.tgz
Description: GNU Zip compressed data



Re: [PATCHES] Concurrent psql patch

2007-05-13 Thread Gregory Stark
"David Fetter" <[EMAIL PROTECTED]> writes:

>> I was originally thinking \c1, \c2, ... for \cswitch and \c& for
>> \cnowait. I'm not sure if going for cryptic short commands is better
>> or worse here.
>
> +1 for \c1, \c2, etc.
>
> What's the reasoning behind \c&?  Does it "send things into the
> background" the way & does in the shell?

Sort of. It sends the *subsequent* command to the background... And unlike the
shell you can't usefully do anything more in the current session while the
command is in the background, you have to manually switch sessions before
issuing subsequent commands.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [PATCHES] Automatic adjustment of bgwriter_lru_maxpages

2007-05-13 Thread Greg Smith

On Sun, 13 May 2007, Heikki Linnakangas wrote:

> StrategyReportWrite increments numClientWrites without holding the
> BufFreeListLock, that's a race condition. The terminology needs some
> adjustment; clients don't write buffers, backends do.

That was another piece of debugging code I moved into the main path
without thinking too hard about it, good catch.  I have a
documentation/naming patch I've started on that revises a lot of the
pg_stat_bgwriter names to be more consistent and easier to understand (as
well as re-ordering the view); the underlying code is still fluid enough
that I was trying to nail that down first.


> That algorithm seems decent, but I wonder why the simple fudge factor
> wasn't good enough? I would've thought that a 2x or even bigger fudge
> factor would still be only a tiny fraction of shared_buffers, and
> wouldn't really affect performance.

I like the way the smoothing evens out the I/O rates.  I saw occasional
spots where the buffer allocations drop to 0 for a few intervals while
something everybody is waiting for is going on, and I didn't want all
LRU cleanup to come to a halt just because there's a fraction of a second
where nothing happened in the middle of a very busy period.


As for why not overestimate, if you get into a situation where the buffer 
cache is very dirty with much of the data being recently used (I normally 
see this with bulk UPDATEs on indexed tables), you can end up scanning 
many buffers for each one you find that can be written out.  In this kind 
of situation, deciding that you actually need to write out twice as many 
just because you don't trust your estimate is very inefficient.


I was able to simulate most of the bad behavior I look for with the 
pgbench schema using "update accounts set abalance=abalance+1;".  To throw 
some sample numbers out, on my test server I was just doing final work on 
last night, I was seeing peaks of about 600-1200 buffers allocated per 
200ms interval doing that simple UPDATE with shared_buffers=32768.


Let's call it 2% of the pool.  If 50% of the pool is either dirty or can't 
be reused yet, that means I'll average having to scan 2%/50%=4% of the 
pool to find enough buffers to reuse per interval.  I wouldn't describe 
that as a tiny fraction, and doubling it is not an insignificant load 
increase.  I'd like to be able to increase the LRU percentage scanned 
without being concerned that I'm wasting resources because of this 
situation.


The fact that this problem exists is what got me digging into the 
background writer code in the first place, because it's way worse on my 
production server than this example suggests.  The buffer cache is bigger, 
but the ability of the server to dirty it under heavy load is far better. 
Returning to the theme discussed in the -hackers thread I referenced: 
you can't try to make the background writer LRU do all the writes without 
exposing yourself to issues like this, because it doesn't touch the usage 
counts.  Therefore it's vulnerable to breakdowns if your buffer pool 
shifts toward dirty and non-reusable.


Having the background writer run amok when reusable buffers are rare can 
really pull down the performance of the other backends (as well as delay 
checkpoints), both in terms of CPU usage and locking issues.  I don't feel 
it's a good idea to try and push it too hard unless some of these 
underlying issues are fixed first; I'd rather err on the side of letting 
it do less rather than more than it has to.


> The normal way to return multiple values is to pass a pointer as an
> argument, though that can get ugly as well if there's a lot of return
> values.

I'm open to better suggestions, but after tinkering with this interface 
for over a month now--including pointers and enums--this is the first 
implementation I was happy with.


There are four things I eventually need returned here, to support the
fully automatic BGW tuning. My 1st implementation passed in pointers, and
in addition to being ugly I found consistently checking for null pointers
and data consistency a drag, both from the coding and the overhead
perspective.



> What combinations of the flags are valid? Would an enum be better?

And my 2nd generation code used an enum.  There are five possible return 
code states:


CLEAN + REUSABLE + !WRITTEN
CLEAN + !REUSABLE + !WRITTEN
!CLEAN + !REUSABLE + WRITTEN (all-scan only)
!CLEAN + !REUSABLE + !WRITTEN (rejected by skip)
!CLEAN + REUSABLE + WRITTEN

!CLEAN + REUSABLE + !WRITTEN isn't possible (all paths will write dirty 
reusable buffers)


I found the enum-based code more confusing, both reading it and making 
sure it was correct when writing it, than the current form.  Right now I 
have lines like:


 if (buffer_state & BUF_REUSABLE)

With an enum this has to be something like

  if (buffer_state == BUF_CLEAN_REUSABLE || buffer_state == 
BUF_REUSABLE_WRITTEN)


And that was a pain all around; I kept having to stare at the table above 
to make sure the code w

Re: [PATCHES] Concurrent psql patch

2007-05-13 Thread David Fetter
On Sun, May 13, 2007 at 02:39:45PM +0100, Gregory Stark wrote:
> "Jim Nasby" <[EMAIL PROTECTED]> writes:
> 
> > I don't see how we could make the names shorter without moving
> > away from a backslash command (which I'm guessing would be
> > painful).
> >
> > Assuming we're stuck with a backslash command \cs[witch] and \cn
> > [owait] seem to be about as good as we could get.
> 
> I don't have \cs or \cn set up as abbreviations. 
> 
> I was originally thinking \c1, \c2, ... for \cswitch and \c& for
> \cnowait. I'm not sure if going for cryptic short commands is better
> or worse here.

+1 for \c1, \c2, etc.

What's the reasoning behind \c&?  Does it "send things into the
background" the way & does in the shell?

Cheers,
David.
-- 
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!
Consider donating to PostgreSQL: http://www.postgresql.org/about/donate



Re: [PATCHES] Automatic adjustment of bgwriter_lru_maxpages

2007-05-13 Thread Heikki Linnakangas

Greg Smith wrote:
The original code came from before there was a pg_stat_bgwriter.  The 
first patch (buf-alloc-stats) takes the two most interesting pieces of 
data the original patch collected, the number of buffers allocated 
recently and the number that the clients wrote out, and ties all that 
into the new stats structure.  With this patch applied, you can get a 
feel for things like churn/turnover in the buffer pool that were very 
hard to quantify before.  Also, it makes it easy to measure how well 
your background writer is doing at writing buffers so the clients don't 
have to.  Applying this would complete one of my personal goals for the 
8.3 release, which was having stats to track every type of buffer write.


I split this out because I think it's very useful to have regardless of 
whether the automatic tuning portion is accepted, and I think these 
smaller patches make the review easier.  The main thing I would 
recommend someone check is how am_bg_writer is (mis?)used here.  I 
spliced some of the debugging-only code from the original patch, and I 
can't tell if the result is a robust enough approach to solving the 
problem of having every client indirectly report their activity to the 
background writer.  Other than that, I think this code is ready for 
review and potentially committing.


This looks good to me in principle. StrategyReportWrite increments 
numClientWrites without holding the BufFreeListLock; that's a race 
condition. The terminology needs some adjustment; clients don't write 
buffers, backends do.


Splitting the patch to two is a good idea.

The second patch (limit-lru) adds on top of that a constraint of the LRU 
writer so that it doesn't do any more work than it has to.  Note that I 
left verbose debugging code in here because I'm much less confident this 
patch is complete.


It predicts upcoming buffer allocations using a 16-period weighted 
moving average of recent activity, which you can think of as the last 
3.2 seconds at the default interval.  After testing on a few systems, that 
seemed a decent compromise of smoothing in both directions.  I found the 
2X overallocation fudge factor of the original patch way too aggressive, 
so I just pick the larger of the most recent allocation amount and the 
smoothed value.  The main thing that throws off the allocation 
estimation is when you hit a checkpoint, which can give a big spike 
after the background writer returns to BgBufferSync and notices all the 
buffers that were allocated during the checkpoint write; the code then 
tries to find more buffers it can recycle than it needs to.  Since the 
checkpoint itself normally leaves a large wake of reusable buffers 
behind it, I didn't find this to be a serious problem.


Can you tell more about the tests you performed? That algorithm seems 
decent, but I wonder why the simple fudge factor wasn't good enough. I 
would've thought that a 2x or even bigger fudge factor would still be 
only a tiny fraction of shared_buffers, and wouldn't really affect 
performance.


The load distributed checkpoint patch should mitigate the checkpoint 
spike problem by continuing the LRU scan throughout the checkpoint.


There's another communication issue here, which is that SyncOneBuffer 
needs to return more information about the buffer than it currently does 
once it gets it locked.  The background writer needs to know more than 
just if it was written to tune itself.  The original patch used a clever 
trick for this which worked but I found confusing.  I happen to have a 
bunch of other background writer tuning code I'm working on, and I had 
to come up with a more robust way to communicate buffer internals back 
via this channel.  I used that code here, it's a bitmask setup similar 
to how flags like BM_DIRTY are used.  It's overkill for solving this 
particular problem, but I think the interface is clean and it helps 
support future enhancements in intelligent background writing.


Uh, that looks pretty ugly to me. The normal way to return multiple 
values is to pass a pointer as an argument, though that can get ugly as 
well if there are a lot of return values. What combinations of the flags 
are valid? Would an enum be better? Or how about moving the checks for 
dirty and pinned buffers from SyncOneBuffer to the callers?


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [PATCHES] Doc update for back-branches, CLUSTER and MVCC-safety

2007-05-13 Thread Magnus Hagander
Heikki Linnakangas wrote:
> I posted a doc update a while ago to mention that CLUSTER isn't MVCC-safe:
> 
> http://archives.postgresql.org/pgsql-patches/2007-03/msg00137.php
> 
> CLUSTER is now MVCC-safe in CVS HEAD, but the documentation for older
> releases should still be updated.
> 
Applied to back-branches, thanks for the reminder.

//Magnus




Re: [PATCHES] Concurrent psql patch

2007-05-13 Thread Gregory Stark
"Jim Nasby" <[EMAIL PROTECTED]> writes:

> I don't see how we could make the names shorter without moving away from a
> backslash command (which I'm guessing would be painful).
>
> Assuming we're stuck with a backslash command \cs[witch] and \cn
> [owait] seem to be about as good as we could get.

I don't have \cs or \cn set up as abbreviations. 

I was originally thinking \c1, \c2, ... for \cswitch and \c& for \cnowait. I'm
not sure if going for cryptic short commands is better or worse here.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [PATCHES] ECPG patch to use prepare for improved performance

2007-05-13 Thread Bruce Momjian

This has been saved for the 8.4 release:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---

William Lawrance wrote:
> 
> This updated patch for ECPG uses the current routines by 
> default. If an environment variable (ECPGUSEPREPARE) is set 
> to "yes", it uses the new routine that prepares and 
> caches each statement.
> 
> 
> 
> 
> -Original Message-
> From: Michael Meskes [mailto:[EMAIL PROTECTED]
> Sent: Thursday, May 10, 2007 3:01 AM
> To: William Lawrance
> Cc: Michael Meskes; Pgsql-Patches
> Subject: Re: [PATCHES] ECPG patch to use prepare for improved
> performance
> 
> 
> On Wed, May 09, 2007 at 01:12:17PM -0700, William Lawrance wrote:
> > 2. The performance was improved by about 1 hour in the 3 hour 
> >elapsed time of the application. This is important to the 
> >customer in terms of accomplishing his work load in the
> >time that has been allotted, based on his experience with DB2.
> >Without this improvement, he is likely to consider it too slow.
> 
> But this only holds for one customer. I don't think this will hold for
> every single application. At least I do not see a reason why this
> should hold every time.
> 
> > I would like to emphasize that we aren't measuring an artificial
> > test program; this is a real customer's application. We loaded
> > 7 million rows into 217 tables to run the application. I believe 
> > it is representative of many real batch applications.
> 
> But how about non-batch applications?
> 
> > Is there reason not to prepare each statement?  
> 
> I'm completely against forcing such a design decision on the programmer.
> Hopefully I will be able to add a real prepare statement soon.
> 
> > Could it be predicated upon a user supplied option ?  
> 
> Yes, this is fine with me. If you could rearrange the patch I will test
> and commit it.
> 
> Michael
> -- 
> Michael Meskes
> Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
> ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED]
> Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!
[ Attachment, skipping... ]

> 
> ---(end of broadcast)---
> TIP 5: don't forget to increase your free space map settings

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [PATCHES] updated WIP: arrays of composites

2007-05-13 Thread Bruce Momjian

TODO marked as done:

o -Add support for arrays of complex types

I assume this is _not_ done, as stated below:

o Add support for arrays of domains

I will add a URL for this item:

http://archives.postgresql.org/pgsql-patches/2007-05/msg00114.php

---

Tom Lane wrote:
> Andrew Dunstan <[EMAIL PROTECTED]> writes:
> >>> Attached is my rework of David Fetter's array of composites patch. It 
> >>> has all the agreed modifications and checks, except altering the name 
> >>> mangling.
> 
> Applied with revisions.  There are some loose ends yet:
> 
> * I didn't do anything about arrays of domains.  Although I think they'd
> basically work, there's one nasty fly in the ointment, which is ALTER
> DOMAIN ADD CONSTRAINT.  get_rels_with_domain() is not smart enough to
> detect arrays of domains, and its callers are not nearly smart enough to
> apply their checks to arrays.  So I think this had better wait for 8.4.
> 
> BTW, I realized there's an existing bug here as of 8.2: when I enabled
> domains over domains I didn't do anything with get_rels_with_domain().
> Fortunately this is a relatively easy thing to deal with, we can just
> recurse to find columns of derived domain types, which the callers don't
> really need to treat any differently than they do now.  I'll go fix
> that part.
> 
> * The feature leaves something to be desired in terms of usability,
> because array[row()] doesn't work:
> 
> regression=# create type foo as (f1 int, f2 int);
> CREATE TYPE
> regression=# create table bar (ff1 foo[]);
> CREATE TABLE
> regression=# insert into bar values(array[row(1,2),row(3,4)]);
> ERROR:  could not find array type for data type record
> regression=#
> 
> You can only get it to work if you plaster ::foo onto *each* row()
> construct.  Ugh.  This didn't seem trivial to improve.
> 
> * I'm a bit concerned about dump order.  If a user wants to create
> types named "foo" and "_foo", he can, but it will only work if he
> makes "_foo" first --- else the derived type for foo is in the way.
> Since pg_dump has no clue about that constraint, it might easily
> dump "foo" first leading to an unrestorable dump.  The most usable
> solution would be to auto-rename previously created array types,
> but I dunno how implementable that would be.
> 
>   regards, tom lane
> 
> ---(end of broadcast)---
> TIP 2: Don't 'kill -9' the postmaster

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [PATCHES] Have vacuum emit a warning when it runs out of maintenance_work_mem

2007-05-13 Thread Heikki Linnakangas

Jim C. Nasby wrote:

On Sat, May 12, 2007 at 07:57:44PM +0100, Heikki Linnakangas wrote:
Or we could switch to a more compact representation of the dead tuples, 
and not need such a big maintenance_work_mem in the first place.


Sure, but even with a more compact representation you can still run out
of maintenance_work_mem... unless we allow this to spill to disk. At
first guess that sounds insane, but if you've got a large enough set of
indexes it *might* actually be faster.


It would only make sense if the table is clustered on an index, so that 
you'd in practice only need to keep part of the array in memory at a 
time. It's a pretty narrow use case, not worth spending time on, I think.



Either way, as long as maintenance_work_mem is an issue I think we need
a way to warn users.


I agree.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [PATCHES] Have vacuum emit a warning when it runs out of maintenance_work_mem

2007-05-13 Thread Heikki Linnakangas

Tom Lane wrote:

Heikki Linnakangas <[EMAIL PROTECTED]> writes:
Or we could switch to a more compact representation of the dead tuples, 
and not need such a big maintenance_work_mem in the first place.


Hm, you got any ideas?  One constraint is that it doesn't seem
acceptable to make the search function any slower.


That's the biggest problem.

One idea is to use a compressed bitmap like in the bitmap index patch, 
and a tree of block numbers or ranges to allow random access to it.


Another idea is to use the current array representation, but instead of 
storing a item pointer on every slot, you store either a normal item 
pointer, or three extra offsets on the previous block. To make the 
binary search work, set the high bit (which isn't used otherwise) of the 
extra offsets to tell them apart from normal item pointers. When the 
binary search lands on those extra offsets, scan backwards to the 
closest normal item to get the block number.


Performance is a bit hard to predict. A more compact representation 
might be more cache-efficient, which might offset the cost of a more 
complex search function.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
