Re: [PATCHES] SIGPIPE handling

2003-11-16 Thread Manfred Spraul
Bruce Momjian wrote:

> Here is my logic --- 99% of apps don't install a SIGPIPE signal handler,
> and 90% will not add a SIGPIPE/SIG_IGN call to their applications.  I
> guess I am looking for something that would allow the performance
> benefit of not doing a pgsignal() call around every send() for the
> majority of our apps.  What was the speed improvement?

Around 10% for a heavily multithreaded app on an 8-way Xeon server. Far 
less for a single threaded app and far less for uniprocessor systems: 
the kernel must update the pending queue of all threads and that causes 
lots of contention for the (per-process) spinlock that protects the 
signal handlers.


> Granted, we need to do something because our current setup isn't even
> thread-safe.  Also, how is your patch more thread-safe than the old one?
> The detection is thread-safe, but I don't see how the use is.

First function in main():

signal(SIGPIPE, SIG_IGN);
PQsetsighandling(1);
This results in perfectly thread-safe SIGPIPE handling. If it's a
multithreaded app that needs correct per-thread delivery of
SIGPIPE signals for console IO, then the libpq user must implement the
sequence I describe below.

> If you
> still pgsignal around the calls, I don't see how two threads couldn't
> do:
>
>     thread 1                        thread 2
>
>     pgsignal(SIGPIPE, SIG_IGN);
>                                     pgsignal(SIGPIPE, SIG_DFL);
>     send();
>     pgsignal(SIGPIPE, SIG_DFL);
>                                     send();
>                                     pgsignal(SIGPIPE, SIG_DFL);
>
> This runs thread1 with SIGPIPE as SIG_DFL.

Correct. A thread safe sequence might be something like:

pthread_sigmask(SIG_BLOCK,{SIGPIPE});
send();
if (sigpending(SIGPIPE)) {
   sigwait({SIGPIPE},);
}
pthread_sigmask(SIG_UNBLOCK,{SIGPIPE});
But this sequence only works for users that link against libpthread. And 
the same sequence with sigprocmask is undefined for multithreaded apps.

--
   Manfred
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [pgsql-hackers-win32] [PATCHES] SRA Win32 sync() code

2003-11-16 Thread Bruce Momjian
Tom Lane wrote:
> > Do we know that having the background writer fsync a file that was
> > written by a backend causes all the data to fsync?  I think I could write
> > a program to test this by timing each of these tests:
> 
> That might prove something about the particular platform you tested it
> on; but it would not speak to the real problem, which is what we can
> assume is true on every platform...

The attached program does test if fsync can be used on a file descriptor
after the file is closed and then reopened.  I see:

write  0.000613
write & fsync  0.001727
write, close & fsync   0.001633

This shows that fsync works even after the file is closed and reopened. 
I could test by writing using a subprocess, but I don't see how that
would be different, and it would mess up my timings.

Anyway, if we find all our platforms can pass this test, we might be
able to allow backends to do their own writes and just record the file
name somewhere for the checkpointer to fsync.  It also shows write/fsync
was 3x slower than simple write.

Does anyone have a platform where the last duration is significantly
different from the middle timing?

I am keeping this discussion on patches because of the C program
attachment.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

/*
 *  test_fsync.c
 *  tests if fsync can be done from another process than the original write
 */

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/time.h>

void die(char *str);
void print_elapse(struct timeval start_t, struct timeval elapse_t);

int main(int argc, char *argv[])
{
    struct timeval start_t;
    struct timeval elapse_t;
    int tmpfile;
    int i;
    char charout = 44;

    /* write only */
    gettimeofday(&start_t, NULL);
    /* O_CREAT needs a mode argument; 0600 is used here */
    if ((tmpfile = open("/var/tmp/test_fsync.out", O_RDWR | O_CREAT, 0600)) == -1)
        die("can't open /var/tmp/test_fsync.out");
    for (i = 0; i < 200; i++)
        write(tmpfile, &charout, 1);
    close(tmpfile);
    gettimeofday(&elapse_t, NULL);
    unlink("/var/tmp/test_fsync.out");
    printf("write                  ");
    print_elapse(start_t, elapse_t);
    printf("\n");

    /* write & fsync */
    gettimeofday(&start_t, NULL);
    if ((tmpfile = open("/var/tmp/test_fsync.out", O_RDWR | O_CREAT, 0600)) == -1)
        die("can't open /var/tmp/test_fsync.out");
    for (i = 0; i < 200; i++)
        write(tmpfile, &charout, 1);
    fsync(tmpfile);
    close(tmpfile);
    gettimeofday(&elapse_t, NULL);
    unlink("/var/tmp/test_fsync.out");
    printf("write & fsync          ");
    print_elapse(start_t, elapse_t);
    printf("\n");

    /* write, close & fsync */
    gettimeofday(&start_t, NULL);
    if ((tmpfile = open("/var/tmp/test_fsync.out", O_RDWR | O_CREAT, 0600)) == -1)
        die("can't open /var/tmp/test_fsync.out");
    for (i = 0; i < 200; i++)
        write(tmpfile, &charout, 1);
    close(tmpfile);
    /* reopen file */
    if ((tmpfile = open("/var/tmp/test_fsync.out", O_RDWR | O_CREAT, 0600)) == -1)
        die("can't open /var/tmp/test_fsync.out");
    fsync(tmpfile);
    close(tmpfile);
    gettimeofday(&elapse_t, NULL);
    unlink("/var/tmp/test_fsync.out");
    printf("write, close & fsync   ");
    print_elapse(start_t, elapse_t);
    printf("\n");

    return 0;
}

void print_elapse(struct timeval start_t, struct timeval elapse_t)
{
    /* borrow one second (1000000 usec) when the usec field would go negative */
    if (elapse_t.tv_usec < start_t.tv_usec)
    {
        elapse_t.tv_sec--;
        elapse_t.tv_usec += 1000000;
    }

    printf("%ld.%06ld", (long) (elapse_t.tv_sec - start_t.tv_sec),
           (long) (elapse_t.tv_usec - start_t.tv_usec));
}

void die(char *str)
{
    fprintf(stderr, "%s\n", str);
    exit(1);
}



Re: [pgsql-hackers-win32] [PATCHES] SRA Win32 sync() code

2003-11-16 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Where am I wrong?
> 
> I don't think any of this is relevant.  There are a certain number of
> blocks we have to get down to disk before we can declare a transaction
> committed, and there are a certain number that we have to get down to
> disk before we can declare a checkpoint complete.  You are focusing too
> much on the question of whether a particular process performs an fsync
> operation, and ignoring the fact that ultimately it's got to wait for
> I/O to complete --- directly or indirectly.  If it blocks waiting for
> some other process to declare a buffer clean, rather than writing for
> itself, what's the difference?

The difference is two-fold.  First, there might be 10 other backends
asking for writes, so it isn't clear that asking someone else to do the
write is as fast.  Second, that other writer is doing fsync, so it is
100x or 1000x slower than our current setup.

> Sure, fsync serializes the particular process that's doing it, but we
> can deal with that by spreading the fsyncs across multiple processes,
> and trying to ensure that they are mostly background processes rather
> than foreground ones.

How many?  That was my point, that it might require 1000 backend
processes _and_ it would be slower because we are write/fsync rather
than write.  However, I think we could fix that by doing the write and
returning OK to the backend, then doing the fsync whenever we want ---
perhaps that was already your plan.

> I don't claim that immediate-fsync-on-write is the only answer, but
> I cannot follow your reasoning for dismissing it out of hand ... and I
> certainly cannot buy *any* logic that says that sync() is a good answer
> to any of these issues.  AFAICS sync() means that we abandon
> responsibility.

sync() means we group the fsyncs into one massive one that syncs all
other processes' I/O too --- clearly bad, but I am hoping for something as
good as what we currently have because that sync hopefully happens only
every few minutes.

> > Do we know that having the background writer fsync a file that was
> > written by a backend causes all the data to fsync?  I think I could write
> > a program to test this by timing each of these tests:
> 
> That might prove something about the particular platform you tested it
> on; but it would not speak to the real problem, which is what we can
> assume is true on every platform...

Yes, it would only be per platform.  I was thinking we could have a
platform test and enable this behavior on platforms that support it
(all?) and use sync on the others.

Also, let me say I am glad we are delving into this.  Our buffer system
has needed an overhaul for a while, and right now we already have some
major improvements for 7.5, and this discussion is just a continuation
of those improvements.



Re: [pgsql-hackers-win32] [PATCHES] SRA Win32 sync() code

2003-11-16 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Where am I wrong?

I don't think any of this is relevant.  There are a certain number of
blocks we have to get down to disk before we can declare a transaction
committed, and there are a certain number that we have to get down to
disk before we can declare a checkpoint complete.  You are focusing too
much on the question of whether a particular process performs an fsync
operation, and ignoring the fact that ultimately it's got to wait for
I/O to complete --- directly or indirectly.  If it blocks waiting for
some other process to declare a buffer clean, rather than writing for
itself, what's the difference?

Sure, fsync serializes the particular process that's doing it, but we
can deal with that by spreading the fsyncs across multiple processes,
and trying to ensure that they are mostly background processes rather
than foreground ones.

I don't claim that immediate-fsync-on-write is the only answer, but
I cannot follow your reasoning for dismissing it out of hand ... and I
certainly cannot buy *any* logic that says that sync() is a good answer
to any of these issues.  AFAICS sync() means that we abandon
responsibility.

> Do we know that having the background writer fsync a file that was
> written by a backend causes all the data to fsync?  I think I could write
> a program to test this by timing each of these tests:

That might prove something about the particular platform you tested it
on; but it would not speak to the real problem, which is what we can
assume is true on every platform...

regards, tom lane



Re: [pgsql-hackers-win32] [PATCHES] SRA Win32 sync() code

2003-11-16 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Tom Lane wrote:
> >> Seriously though, if we can move the bulk of the writing work into
> >> background processes then I don't believe that there will be any
> >> significant penalty for regular backends.
> 
> > If the background writer starts using fsync(), we can have normal
> > backends that do a write() set a shared memory boolean.  We can then
> > test that boolean and do sync() only if other backends had to do their
> > own writes.
> 
> That seems like the worst of both worlds --- you still are depending on
> sync() for correctness.
> 
> Also, as long as backends only *seldom* do writes, making them fsync a
> write when they do make one will be less of an impact on overall system
> performance than having a sync() ensue shortly afterwards.  I think you
> are focusing too narrowly on the idea that backends shouldn't ever wait
> for writes, and failing to see the bigger picture.  What we need to
> optimize is overall system performance, not an arbitrary restriction
> that certain processes never wait for certain things.

OK, let me give you my logic and you can tell me where I am wrong.

First, how many backends can a single writer process support if all the
backends are doing insert/update/deletes?  5?  10?  Let's assume 10. 
Second, once we change write to write/fsync, how much slower will that
be?  100x, 1000x?  Let's say 10x.

So, by my logic, if we have 100 backends all doing updates, we will need
10 * 100 or 1000 writer processes or threads to keep up with that load. 
That seems quite excessive to me from a context switching and process
overhead perspective.

Where am I wrong?

Also, if we go with the fsync only at checkpoint, we are doing fsync's
once every minute (at checkpoint time) rather than several times a
second potentially.

Do we know that having the background writer fsync a file that was
written by a backend causes all the data to fsync?  I think I could write
a program to test this by timing each of these tests:

create an empty file
open file
time fsync
close

open file
write 2mb into the file
time fsync
close

open file
write 2mb into the file
close
open file
time fsync
close

If I do the write via system(), I am doing it in a separate process so
the test should work.  Should I try this?
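
A minimal sketch of the third case with the write done in a separate process, using fork() rather than system(). The helper name and the temporary path are hypothetical. Note that timing alone only suggests whether the fsync had work to do; it does not prove that the child's data reached the disk.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: a child process writes `bytes` bytes to `path`,
 * then the parent reopens the file and times fsync() on it.
 * Returns the elapsed seconds, or -1.0 on any error.
 */
static double time_fsync_after_child_write(const char *path, size_t bytes)
{
    struct timeval start, end;
    int status;
    int fd;
    int rc;
    pid_t pid = fork();

    if (pid < 0)
        return -1.0;

    if (pid == 0)
    {
        /* child: write the data and exit without calling fsync() */
        char block[8192];
        size_t left = bytes;

        fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
        if (fd < 0)
            _exit(1);
        memset(block, 'x', sizeof(block));
        while (left > 0)
        {
            size_t chunk = left < sizeof(block) ? left : sizeof(block);

            if (write(fd, block, chunk) != (ssize_t) chunk)
                _exit(1);
            left -= chunk;
        }
        close(fd);
        _exit(0);
    }

    /* parent: wait for the writer, then fsync through a fresh descriptor */
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1.0;

    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1.0;

    gettimeofday(&start, NULL);
    rc = fsync(fd);
    gettimeofday(&end, NULL);
    close(fd);

    if (rc != 0)
        return -1.0;
    return (double) (end.tv_sec - start.tv_sec) +
           (double) (end.tv_usec - start.tv_usec) / 1e6;
}
```

Calling it with 2 * 1024 * 1024 bytes and comparing the result with the same-process timings would give the per-platform data point asked for above.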



Re: [PATCHES] SIGPIPE handling

2003-11-16 Thread Bruce Momjian
Manfred Spraul wrote:
> Bruce Momjian wrote:
> 
> >I thought it should be global too, basically testing on the first
> >connection request.
> >
> What if two PQconnect calls happen at the same time?
> I would really prefer the manual approach with a new PQsetsighandler 
> function - the autodetection is fragile, it's trivial to find a special 
> case where it breaks.
> Bruce, you wrote that a new function would be overdesign. Are you sure? 
> Your simpler proposals all fail with multithreaded apps.
> I've attached the patch that implements the global flag with two special 
> functions that access it.

Here is my logic --- 99% of apps don't install a SIGPIPE signal handler,
and 90% will not add a SIGPIPE/SIG_IGN call to their applications.  I
guess I am looking for something that would allow the performance
benefit of not doing a pgsignal() call around every send() for the
majority of our apps.  What was the speed improvement?

Just the fact you had to add the SIG_IGN call to pgbench shows that most
apps need some special handling to get this performance benefit, and I
would like to avoid that.

Your PQsetsighandler() idea --- would that be for SIGPIPE only?  Would
it be acceptable to tell application developers they have to use
PQsetsig*pipe*handler() call to register a SIGPIPE handler?  If so, that
would be great because we would do the pgsignal call around send() only
when it was needed.  It might be the cleanest way and the most reliable.

Granted, we need to do something because our current setup isn't even
thread-safe.  Also, how is your patch more thread-safe than the old one?
The detection is thread-safe, but I don't see how the use is.  If you
still pgsignal around the calls, I don't see how two threads couldn't
do:

    thread 1                        thread 2

    pgsignal(SIGPIPE, SIG_IGN);
                                    pgsignal(SIGPIPE, SIG_DFL);
    send();
    pgsignal(SIGPIPE, SIG_DFL);

                                    send();
                                    pgsignal(SIGPIPE, SIG_DFL);

This runs thread1 with SIGPIPE as SIG_DFL.  

What are we ignoring the SIGPIPE for on send anyway?  Is this in case
the backend crashed?



Re: [PATCHES] SIGPIPE handling

2003-11-16 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Yes, I was afraid of that.  Here's another idea.  If the signal handler
> > is SIG_DFL, we install our own signal handler for SIGPIPE, and set/clear a
> > global variable before/after we send().
> 
> That would address the speed issue but not the multithread correctness
> issue.  Also, what happens if the app replaces the signal handler later?

Well, our current setup doesn't do multithreaded properly either.  In
fact, I am starting to worry about libpq's thread-safety.   Should I?



Re: [PATCHES] SRA Win32 sync() code

2003-11-16 Thread Tom Lane
Jan Wieck <[EMAIL PROTECTED]> writes:
> Well, the bgwriter has basically the same chance the checkpointer has 
> ... mdblindwrt() in the end, because he doesn't have the relcache handy. 

We could easily get rid of mdblindwrt --- there is no very good reason
that we use the relcache for I/O.  There could and should be a
lower-level notion of "open relation" that bgwriter and checkpoint
could use.  See recent discussion with Neil, for example.  Vadim had
always wanted to do that, IIRC.

regards, tom lane



[PATCHES] Alter Table phase 1 -- Please apply to 7.5

2003-11-16 Thread Rod Taylor
Completes:
ALTER TABLE ADD COLUMN does not honour DEFAULT and non-CHECK
CONSTRAINT

ALTER TABLE ADD COLUMN column DEFAULT should fill existing rows
with DEFAULT value

ALTER TABLE ADD COLUMN column SERIAL doesn't create sequence
because of the item above

Previously described reorganization of all ALTER TABLE commands.

Most of the way through column type change. I need to supply a followup
patch which deals with logical attribute numbers.

ALTER TABLE table ALTER [COLUMN] column TYPE type USING expression;

Syntax documentation updates only. Content to come later.


altertable.patch.gz
Description: GNU Zip compressed data



Re: [PATCHES] SRA Win32 sync() code

2003-11-16 Thread Jan Wieck
Tom Lane wrote:

> Jan Wieck <[EMAIL PROTECTED]> writes:
> > Removing sync() entirely requires very accurate fsync()'ing in the
> > background writer, the checkpointer and the backends. Basically none of
> > them can mark a block "clean" if he fails to fsync() the relation later!
> > This will be a mess to code.
>
> Not really.  The O_SYNC solution for example would be trivial to code.

Well, the bgwriter has basically the same chance the checkpointer has
... mdblindwrt() in the end, because he doesn't have the relcache handy.
So you want to open(O_SYNC), write(), close() every single block? I
don't expect that to be much better than a global sync().

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [PATCHES] SRA Win32 sync() code

2003-11-16 Thread Tom Lane
Jan Wieck <[EMAIL PROTECTED]> writes:
> Removing sync() entirely requires very accurate fsync()'ing in the 
> background writer, the checkpointer and the backends. Basically none of 
> them can mark a block "clean" if he fails to fsync() the relation later! 
> This will be a mess to code.

Not really.  The O_SYNC solution for example would be trivial to code.

regards, tom lane



Re: [PATCHES] SRA Win32 sync() code

2003-11-16 Thread Jan Wieck
Tom Lane wrote:

> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Tom Lane wrote:
> >> Seriously though, if we can move the bulk of the writing work into
> >> background processes then I don't believe that there will be any
> >> significant penalty for regular backends.
>
> > If the background writer starts using fsync(), we can have normal
> > backends that do a write() set a shared memory boolean.  We can then
> > test that boolean and do sync() only if other backends had to do their
> > own writes.
>
> That seems like the worst of both worlds --- you still are depending on
> sync() for correctness.
>
> Also, as long as backends only *seldom* do writes, making them fsync a
> write when they do make one will be less of an impact on overall system
> performance than having a sync() ensue shortly afterwards.  I think you
> are focusing too narrowly on the idea that backends shouldn't ever wait
> for writes, and failing to see the bigger picture.  What we need to
> optimize is overall system performance, not an arbitrary restriction
> that certain processes never wait for certain things.

Removing sync() entirely requires very accurate fsync()'ing in the
background writer, the checkpointer and the backends. Basically none of
them can mark a block "clean" if he fails to fsync() the relation later!
This will be a mess to code.

Jan



Re: [PATCHES] SIGPIPE handling

2003-11-16 Thread Tom Lane
Manfred Spraul <[EMAIL PROTECTED]> writes:
> + extern void PQsetsighandling(int internal_sigign);

These sorts of things are commonly designed so that the set() operation
incidentally returns the previous setting.  I'm not sure if anyone would
care, but it's only a couple more lines of code to make that happen, so
I'd suggest doing so just in case.
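
A minimal sketch of that convention, using a hypothetical stand-in for the global flag (the names below are made up for illustration and are not the ones in the patch):

```c
/*
 * Hypothetical stand-in for the global flag in the patch:
 * 1 = the library ignores SIGPIPE itself around each send().
 */
static int sketch_internal_sigign = 1;

/* Set the flag and return the value it had before the call. */
static int sketch_set_sighandling(int new_value)
{
    int previous = sketch_internal_sigign;

    sketch_internal_sigign = new_value;
    return previous;
}
```

A caller can then change the setting temporarily and put back whatever was there before, without needing a separate get call. Like a plain global flag, this sketch is not protected against concurrent callers.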

Otherwise I think this is a good patch.  The documentation could use a
little more wordsmithing, perhaps.

regards, tom lane



Re: [PATCHES] SRA Win32 sync() code

2003-11-16 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> Seriously though, if we can move the bulk of the writing work into
>> background processes then I don't believe that there will be any
>> significant penalty for regular backends.

> If the background writer starts using fsync(), we can have normal
> backends that do a write() set a shared memory boolean.  We can then
> test that boolean and do sync() only if other backends had to do their
> own writes.

That seems like the worst of both worlds --- you still are depending on
sync() for correctness.

Also, as long as backends only *seldom* do writes, making them fsync a
write when they do make one will be less of an impact on overall system
performance than having a sync() ensue shortly afterwards.  I think you
are focusing too narrowly on the idea that backends shouldn't ever wait
for writes, and failing to see the bigger picture.  What we need to
optimize is overall system performance, not an arbitrary restriction
that certain processes never wait for certain things.

regards, tom lane



Re: [PATCHES] SIGPIPE handling

2003-11-16 Thread Manfred Spraul
Bruce Momjian wrote:

> I thought it should be global too, basically testing on the first
> connection request.

What if two PQconnect calls happen at the same time?
I would really prefer the manual approach with a new PQsetsighandler
function - the autodetection is fragile, it's trivial to find a special
case where it breaks.
Bruce, you wrote that a new function would be overdesign. Are you sure?
Your simpler proposals all fail with multithreaded apps.
I've attached the patch that implements the global flag with two special
functions that access it.

--
   Manfred
Index: contrib/pgbench/README.pgbench
===
RCS file: /projects/cvsroot/pgsql-server/contrib/pgbench/README.pgbench,v
retrieving revision 1.9
diff -c -r1.9 README.pgbench
*** contrib/pgbench/README.pgbench  10 Jun 2003 09:07:15 -  1.9
--- contrib/pgbench/README.pgbench  8 Nov 2003 21:43:53 -
***
*** 112,117 
--- 112,121 
might be a security hole since ps command will
show the password. Use this for TESTING PURPOSE ONLY.
  
+   -a
+   Disable SIGPIPE delivery globally instead of within each
+   libpq operation.
+ 
-n
No vacuuming and cleaning the history table prior to the
test is performed.
Index: contrib/pgbench/pgbench.c
===
RCS file: /projects/cvsroot/pgsql-server/contrib/pgbench/pgbench.c,v
retrieving revision 1.27
diff -c -r1.27 pgbench.c
*** contrib/pgbench/pgbench.c   27 Sep 2003 19:15:34 -  1.27
--- contrib/pgbench/pgbench.c   8 Nov 2003 21:43:54 -
***
*** 28,33 
--- 28,34 
  #else
  #include 
  #include 
+ #include <signal.h>
  
  #ifdef HAVE_GETOPT_H
  #include <getopt.h>
***
*** 105,112 
  static void
  usage()
  {
!   fprintf(stderr, "usage: pgbench [-h hostname][-p port][-c nclients][-t ntransactions][-s scaling_factor][-n][-C][-v][-S][-N][-l][-U login][-P password][-d][dbname]\n");
!   fprintf(stderr, "(initialize mode): pgbench -i [-h hostname][-p port][-s scaling_factor][-U login][-P password][-d][dbname]\n");
  }
  
  /* random number generator */
--- 106,113 
  static void
  usage()
  {
!   fprintf(stderr, "usage: pgbench [-h hostname][-p port][-c nclients][-t ntransactions][-s scaling_factor][-n][-C][-v][-S][-N][-l][-a][-U login][-P password][-d][dbname]\n");
!   fprintf(stderr, "(initialize mode): pgbench -i [-h hostname][-p port][-s scaling_factor][-U login][-P password][-d][dbname][-a]\n");
  }
  
  /* random number generator */
***
*** 703,712 
else if ((env = getenv("PGUSER")) != NULL && *env != '\0')
login = env;
  
!   while ((c = getopt(argc, argv, "ih:nvp:dc:t:s:U:P:CNSl")) != -1)
{
switch (c)
{
case 'i':
is_init_mode++;
break;
--- 704,719 
else if ((env = getenv("PGUSER")) != NULL && *env != '\0')
login = env;
  
!   while ((c = getopt(argc, argv, "aih:nvp:dc:t:s:U:P:CNSl")) != -1)
{
switch (c)
{
+   case 'a':
+ #ifndef WIN32
+   signal(SIGPIPE, SIG_IGN);
+ #endif
+   PQsetsighandling(0);
+   break;
case 'i':
is_init_mode++;
break;
Index: doc/src/sgml/libpq.sgml
===
RCS file: /projects/cvsroot/pgsql-server/doc/src/sgml/libpq.sgml,v
retrieving revision 1.141
diff -c -r1.141 libpq.sgml
*** doc/src/sgml/libpq.sgml 1 Nov 2003 01:56:29 -   1.141
--- doc/src/sgml/libpq.sgml 8 Nov 2003 21:43:56 -
***
*** 645,650 
--- 645,693 

   
  
+  
+   
PQsetsighandlingPQsetsighandling
+   
PQgetsighandlingPQgetsighandling
+   
+
+Set/query SIGPIPE signal handling.
+ 
+ void PQsetsighandling(int internal_sigign);
+ 
+ 
+ int PQgetsighandling(void);
+ 
+ 
+ 
+ 
+ These functions allow to query and set the SIGPIPE signal handling
+ of libpq: by default, Unix systems generate a (fatal) SIGPIPE signal
+ on write attempts to a disconnected socket. Most callers expect a
+ normal error return instead of the signal. A normal error return can
+ be achieved by blocking or ignoring the SIGPIPE signal. This can be
+ done either globally in the application or inside libpq.
+
+
+ If internal signal handling is enabled (this is the default), then
+ libpq sets the SIGPIPE handler to SIG_IGN before every socket send
+ operation and restores it afterwards. This prevents libpq from
+ killing the application, at the cost of a slight performance
+ dec

Re: [PATCHES] SRA Win32 sync() code

2003-11-16 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Tom Lane wrote:
> >> One reason I like the idea of adopting a sync-when-you-write policy is
> >> that it eliminates the need for anything as messy as that.
> 
> > Yes, but can we do it without causing a performance degradation, and I
> > would hate to change something to make things easier on Win32 while
> > penalizing all platforms.
> 
> Having to keep a list of modified files in shared memory isn't a penalty?
> 
> Seriously though, if we can move the bulk of the writing work into
> background processes then I don't believe that there will be any
> significant penalty for regular backends.  And I believe that it would
> be a huge advantage from a correctness point of view if we could stop
> depending on sync().  The fact that Windows hasn't got sync() is merely
> another reason we should stop using it.

If the background writer starts using fsync(), we can have normal
backends that do a write() set a shared memory boolean.  We can then
test that boolean and do sync() only if other backends had to do their
own writes.



Re: [PATCHES] SIGPIPE handling

2003-11-16 Thread Bruce Momjian
Tom Lane wrote:
> Manfred Spraul <[EMAIL PROTECTED]> writes:
> > But how should libpq notice that the caller handles sigpipe signals?
> > a) autodetection - if the sigpipe handler is not the default, then the 
> > caller knows what he's doing.
> > b) a new PGsetsignalhandler() function.
> > c) an additional flag passed to PGconnectdb.
> 
> > Tom preferred a). One problem is that the autodetection is not perfect: 
> > an app could block the signal with sigprocmask, or it could install a 
> > handler that doesn't expect sigpipe signals from within libpq.
> > I would prefer b), because it guarantees that the patch has no effect on 
> > existing apps.
> 
> I have no particular objection to (b) either, but IIRC there was some
> dispute about whether it sets a global or per-connection flag.  ISTM
> that "I have a correct signal handler" is a global assertion (within one
> process) and so a global flag is appropriate.  Someone else (Bruce?)
> didn't like that though.

I thought it should be global too, basically testing on the first
connection request.



Re: [PATCHES] SIGPIPE handling

2003-11-16 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Yes, I was afraid of that.  Here's another idea.  If the signal handler
> is SIG_DFL, we install our own signal handler for SIGPIPE, and set/clear a
> global variable before/after we send().

That would address the speed issue but not the multithread correctness
issue.  Also, what happens if the app replaces the signal handler later?

regards, tom lane



Re: [PATCHES] SIGPIPE handling

2003-11-16 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Is running the rest of the
> > application with SIGPIPE <= SIG_IGN a problem?
> 
> That is NOT an acceptable thing for a library to do.

Yes, I was afraid of that.  Here's another idea.  If the signal handler
is SIG_DFL, we install our own signal handler for SIGPIPE, and set/clear a
global variable before/after we send().  When our signal handler is
called, we check to see if our global variable is set, and we either
ignore or exit().  Can we do that safely?  Seems it only fails when they
register a signal handler after establishing a database connection.

How would this work in a threaded app --- not too well, I think.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [PATCHES] SIGPIPE handling

2003-11-16 Thread Tom Lane
Kurt Roeckx <[EMAIL PROTECTED]> writes:
> On Sun, Nov 16, 2003 at 06:28:06PM +0100, Kurt Roeckx wrote:
>> Is there a reason we don't make use of the MSG_NOSIGNAL flag to
>> send()?  Or is the problem in case of SSL?

> Oh, seems to be a Linux only thing?

That and the SSL problem.  I wouldn't object to implementing it as a
platform-specific optimization if we could get it to handle the SSL
case, but without SSL support I think it's too limited.

regards, tom lane



Re: [PATCHES] SRA Win32 sync() code

2003-11-16 Thread Andrew Dunstan


Manfred Spraul wrote:

> Tom Lane wrote:
>
>> Seriously though, if we can move the bulk of the writing work into
>> background processes then I don't believe that there will be any
>> significant penalty for regular backends.  And I believe that it would
>> be a huge advantage from a correctness point of view if we could stop
>> depending on sync().
>
> Which function guarantees that renames of WAL files arrived on the
> disk? AFAIK sync() is the only function that guarantees that.
>
> What about the sync app from sysinternals? It seems Mark Russinovich
> figured out how to implement sync on Win32:
> http://www.sysinternals.com/ntw2k/source/misc.shtml#Sync
>
> It requires administrative privileges, but it shouldn't be that
> difficult to write a tiny service that runs in the LocalSystem
> account, listens to a pipe and syncs all disks when asked.


I think we'd have to do it from scratch, because of these license terms:

---

There is no charge to use any of the software published on this Web site 
at home or at work, so long as each user downloads and installs the 
product directly from www.sysinternals.com.

A commercial license is required to redistribute any of these utilities 
directly (whether by computer media, a file server, an email attachment, 
etc.) or to embed them in- or link them to- another program.
--

Also, do we want to force a broad brush sync() or just fsync our own files?

cheers

andrew



Re: [PATCHES] SIGPIPE handling

2003-11-16 Thread Tom Lane
Manfred Spraul <[EMAIL PROTECTED]> writes:
> But how should libpq notice that the caller handles sigpipe signals?
> a) autodetection - if the sigpipe handler is not the default, then the 
> caller knows what he's doing.
> b) a new PGsetsignalhandler() function.
> c) an additional flag passed to PGconnectdb.

> Tom preferred a). One problem is that the autodetection is not perfect: 
> an app could block the signal with sigprocmask, or it could install a 
> handler that doesn't expect sigpipe signals from within libpq.
> I would prefer b), because it guarantees that the patch has no effect on 
> existing apps.

I have no particular objection to (b) either, but IIRC there was some
dispute about whether it sets a global or per-connection flag.  ISTM
that "I have a correct signal handler" is a global assertion (within one
process) and so a global flag is appropriate.  Someone else (Bruce?)
didn't like that though.

regards, tom lane



Re: [PATCHES] SIGPIPE handling

2003-11-16 Thread Kurt Roeckx
On Sun, Nov 16, 2003 at 06:28:06PM +0100, Kurt Roeckx wrote:
> On Sun, Nov 16, 2003 at 12:56:10PM +0100, Manfred Spraul wrote:
> > Hi,
> > 
> > attached is an update of my automatic sigaction patch: I've moved the 
> > actual sigaction calls into pqsignal.c and added a helper function 
> > (pqsignalinquire(signo)). I couldn't remove the include <signal.h> from 
> > fe-connect.c: it's required for the SIGPIPE definition.
> > Additionally I've added a -a flag for pgbench that sets the signal 
> > handler before calling PQconnectdb.
> 
> Is there a reason we don't make use of the MSG_NOSIGNAL flag to
> send()?  Or is the problem in case of SSL?

Oh, seems to be a Linux only thing?


Kurt




Re: [PATCHES] SRA Win32 sync() code

2003-11-16 Thread Tom Lane
Manfred Spraul <[EMAIL PROTECTED]> writes:
> Which function guarantees that renames of WAL files arrived on the disk? 

The OS itself is supposed to guarantee that; that's what a journaling
file system is for.  In any case, I don't think we care.  Renaming would
apply only to WAL segments that are not currently needed where they are,
and would only be needed under their new names at some future time.
If the rename gets lost shortly after it's done, it can be redone.

regards, tom lane



Re: [PATCHES] SIGPIPE handling

2003-11-16 Thread Kurt Roeckx
On Sun, Nov 16, 2003 at 12:56:10PM +0100, Manfred Spraul wrote:
> Hi,
> 
> attached is an update of my automatic sigaction patch: I've moved the 
> actual sigaction calls into pqsignal.c and added a helper function 
> (pqsignalinquire(signo)). I couldn't remove the include <signal.h> from 
> fe-connect.c: it's required for the SIGPIPE definition.
> Additionally I've added a -a flag for pgbench that sets the signal 
> handler before calling PQconnectdb.

Is there a reason we don't make use of the MSG_NOSIGNAL flag to
send()?  Or is the problem in case of SSL?


Kurt




Re: [PATCHES] improve overcommit docs

2003-11-16 Thread Andrew Dunstan
That covers it extremely well.

cheers

andrew

Tom Lane wrote:

> Andrew Dunstan <[EMAIL PROTECTED]> writes:
>
>> At the time I wrote the original 2.6 was not out even in prerelease,
>> which is why I was deliberately somewhat vague about it. It is still in
>> prerelease, and it will in fact work slightly differently from what was
>> in some 2.4 kernels - there are 2 settings that govern this instead of
>> 1.
>
> Okay, I revised that section yet again based on this info:
> http://candle.pha.pa.us/main/writings/pgsql/sgml/kernel-resources.html#AEN17043
> Thanks for the update.
>
> regards, tom lane



Re: [PATCHES] SRA Win32 sync() code

2003-11-16 Thread Manfred Spraul
Tom Lane wrote:

> Seriously though, if we can move the bulk of the writing work into
> background processes then I don't believe that there will be any
> significant penalty for regular backends.  And I believe that it would
> be a huge advantage from a correctness point of view if we could stop
> depending on sync().

Which function guarantees that renames of WAL files arrived on the disk? 
AFAIK sync() is the only function that guarantees that.

What about the sync app from sysinternals? It seems Mark Russinovich 
figured out how to implement sync on Win32:
http://www.sysinternals.com/ntw2k/source/misc.shtml#Sync

It requires administrative privileges, but it shouldn't be that 
difficult to write a tiny service that runs in the LocalSystem account, 
listens to a pipe and syncs all disks when asked.

--
   Manfred


Re: [PATCHES] improve overcommit docs

2003-11-16 Thread Tom Lane
Andrew Dunstan <[EMAIL PROTECTED]> writes:
> At the time I wrote the original 2.6 was not out even in prerelease, 
> which is why I was deliberately somewhat vague about it. It is still in 
> prerelease, and it will in fact work slightly differently from what was 
> in some 2.4 kernels - there are 2 settings that govern this instead of 
> 1.

Okay, I revised that section yet again based on this info:
http://candle.pha.pa.us/main/writings/pgsql/sgml/kernel-resources.html#AEN17043
Thanks for the update.

regards, tom lane



Re: [PATCHES] SIGPIPE handling

2003-11-16 Thread Manfred Spraul
Bruce Momjian wrote:

> Better.  However, I am confused over when we do sigaction.  I thought we
> were going to do it only if they had a signal handler defined, meaning
>
>     if (pipehandler != SIG_DFL &&
>         pipehandler != SIG_IGN &&
>         pipehandler != SIG_ERR)
>         conn->do_sigaction = true;
>     else
>         conn->do_sigaction = false;
>
> By doing this, we don't do sigaction in the default case where no
> handler was defined.

No. If no handler was defined, then libpq must define a handler. 
Without a handler, a network disconnect would result in a SIGPIPE that 
kills the app.

> I thought we would just set the entire application
> to SIGPIPE <= SIG_IGN.  This gives us good performance in all cases
> except when a signal handler is defined.

I don't want to change the whole app - perhaps someone expects that 
sigpipe works? Perhaps psql for the console input, or something similar?

> Is running the rest of the
> application with SIGPIPE <= SIG_IGN a problem?

I think that depends on the application, and libpq shouldn't mandate 
that SIGPIPE must be SIG_IGN. Right now libpq tries to catch sigpipe 
signals by manually installing/restoring a signal handler around send() 
calls. This doesn't work for multithreaded apps, because the signal 
handlers are per-process, not per-thread.

Thus for multithreaded apps, the libpq user is responsible for handling 
sigpipe. The API change shouldn't be a big problem - the current system 
doesn't work, and there shouldn't be many multithreaded apps.

But how should libpq notice that the caller handles sigpipe signals?
a) autodetection - if the sigpipe handler is not the default, then the 
caller knows what he's doing.
b) a new PGsetsignalhandler() function.
c) an additional flag passed to PGconnectdb.

Tom preferred a). One problem is that the autodetection is not perfect: 
an app could block the signal with sigprocmask, or it could install a 
handler that doesn't expect sigpipe signals from within libpq.
I would prefer b), because it guarantees that the patch has no effect on 
existing apps.
c) is bad, Tom explained that the connect string is often directly 
specified by the user.

--
   Manfred


Re: [PATCHES] SIGPIPE handling

2003-11-16 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Is running the rest of the
> application with SIGPIPE <= SIG_IGN a problem?

That is NOT an acceptable thing for a library to do.

regards, tom lane



Re: [PATCHES] SRA Win32 sync() code

2003-11-16 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> One reason I like the idea of adopting a sync-when-you-write policy is
>> that it eliminates the need for anything as messy as that.

> Yes, but can we do it without causing a performance degradation, and I
> would hate to change something to make things easier on Win32 while
> penalizing all platforms.

Having to keep a list of modified files in shared memory isn't a penalty?

Seriously though, if we can move the bulk of the writing work into
background processes then I don't believe that there will be any
significant penalty for regular backends.  And I believe that it would
be a huge advantage from a correctness point of view if we could stop
depending on sync().  The fact that Windows hasn't got sync() is merely
another reason we should stop using it.

regards, tom lane



Re: [PATCHES] SRA Win32 sync() code

2003-11-16 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Not sure how we are going to do this in Win32, but somehow we will have
> > to record all open files between checkpoints in an area that the
> > checkpoint process can read during a checkpoint.
> 
> One reason I like the idea of adopting a sync-when-you-write policy is
> that it eliminates the need for anything as messy as that.

Yes, but can we do it without causing a performance degradation, and I
would hate to change something to make things easier on Win32 while
penalizing all platforms.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [PATCHES] SRA Win32 sync() code

2003-11-16 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Not sure how we are going to do this in Win32, but somehow we will have
> to record all open files between checkpoints in an area that the
> checkpoint process can read during a checkpoint.

One reason I like the idea of adopting a sync-when-you-write policy is
that it eliminates the need for anything as messy as that.

regards, tom lane
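
One possible reading of a sync-when-you-write policy, sketched for
illustration: whoever writes a file also forces that file to disk with
fsync(), so no process has to track which files were touched and no
global sync() is needed. The helper name write_and_sync is
hypothetical, and whether the fsync() would happen in a backend or in a
background writer is left open here.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: write the whole buffer, then fsync() only this
 * file descriptor. Returns 0 on success, -1 on error (errno is set).
 */
int
write_and_sync(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0)
    {
        ssize_t n = write(fd, p, len);

        if (n < 0)
        {
            if (errno == EINTR)
                continue;       /* interrupted before writing: retry */
            return -1;
        }
        p += n;
        len -= (size_t) n;
    }

    /* Flush just this file instead of calling sync() for everything. */
    return fsync(fd);
}
```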



Re: [PATCHES] SIGPIPE handling

2003-11-16 Thread Bruce Momjian

Better.  However, I am confused over when we do sigaction.  I thought we
were going to do it only if they had a signal handler defined, meaning

    if (pipehandler != SIG_DFL &&
        pipehandler != SIG_IGN &&
        pipehandler != SIG_ERR)
        conn->do_sigaction = true;
    else
        conn->do_sigaction = false;

By doing this, we don't do sigaction in the default case where no
handler was defined.  I thought we would just set the entire application
to SIGPIPE <= SIG_IGN.  This gives us good performance in all cases
except when a signal handler is defined.  Is running the rest of the
application with SIGPIPE <= SIG_IGN a problem?

However, the code patch is:

    if (pipehandler == SIG_DFL || pipehandler == SIG_ERR)
        conn->do_sigaction = true;
    else
        conn->do_sigaction = false;

This gives us good performance only if SIGPIPE <= SIG_IGN has been set
by the application or a sigaction function has been defined.
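
To make the difference between the two conditions easier to see, here
they are as two small predicates that can be evaluated for each
possible SIGPIPE disposition. The function names and sample_handler are
hypothetical and exist only for this comparison.

```c
#include <signal.h>
#include <stdbool.h>

typedef void (*sig_handler_fn) (int);

/* Stand-in for a handler installed by the application (illustration only). */
void
sample_handler(int signo)
{
    (void) signo;
}

/* First condition: wrap send() only if the app installed a real handler. */
bool
wrap_send_expected(sig_handler_fn h)
{
    return h != SIG_DFL && h != SIG_IGN && h != SIG_ERR;
}

/* Condition in the posted patch: wrap send() if nothing was set up. */
bool
wrap_send_patch(sig_handler_fn h)
{
    return h == SIG_DFL || h == SIG_ERR;
}
```

For SIG_DFL the first predicate is false and the second is true; for an
application handler it is the other way around; for SIG_IGN both are
false. The first condition is therefore only safe if the default case
is covered some other way, such as setting SIGPIPE to SIG_IGN for the
whole application.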

---

Manfred Spraul wrote:
> Hi,
> 
> attached is an update of my automatic sigaction patch: I've moved the 
> actual sigaction calls into pqsignal.c and added a helper function 
> (pqsignalinquire(signo)). I couldn't remove the include <signal.h> from 
> fe-connect.c: it's required for the SIGPIPE definition.
> Additionally I've added a -a flag for pgbench that sets the signal 
> handler before calling PQconnectdb.
> 
> Tested on Fedora Core 1 (Redhat Linux) with pgbench.
> 
> --
> Manfred

> Index: src/interfaces/libpq/fe-connect.c
> ===
> RCS file: /projects/cvsroot/pgsql-server/src/interfaces/libpq/fe-connect.c,v
> retrieving revision 1.263
> diff -c -r1.263 fe-connect.c
> *** src/interfaces/libpq/fe-connect.c 18 Oct 2003 05:02:06 -  1.263
> --- src/interfaces/libpq/fe-connect.c 16 Nov 2003 11:44:47 -
> ***
> *** 41,46 
> --- 41,48 
>   #include 
>   #endif
>   #include 
> + #include 
> + #include "pqsignal.h"
>   #endif
>   
>   #include "libpq/ip.h"
> ***
> *** 881,886 
> --- 883,891 
>   struct addrinfo hint;
>   const char *node = NULL;
>   int ret;
> + #ifndef WIN32
> + pqsigfunc pipehandler;
> + #endif
>   
>   if (!conn)
>   return 0;
> ***
> *** 950,955 
> --- 955,976 
>   conn->allow_ssl_try = false;
>   else if (conn->sslmode[0] == 'a')   /* "allow" */
>   conn->wait_ssl_try = true;
> + #endif
> + #ifndef WIN32
> + /* 
> +  * Autodetect SIGPIPE signal handling:
> +  * The default action per Unix spec is kill current process and
> +  * that's not acceptable. If the current setting is not the default,
> +  * then assume that the caller knows what he's doing and leave the
> +  * signal handler unchanged. Otherwise set the signal handler to
> +  * SIG_IGN around each send() syscall. Unfortunately this is both
> +  * unreliable and slow for multithreaded apps.
> +  */
> + pipehandler = pqsignalinquire(SIGPIPE);
> + if (pipehandler == SIG_DFL || pipehandler == SIG_ERR)
> + conn->do_sigaction = true;
> + else
> + conn->do_sigaction = false;
>   #endif
>   
>   /*
> Index: src/interfaces/libpq/fe-secure.c
> ===
> RCS file: /projects/cvsroot/pgsql-server/src/interfaces/libpq/fe-secure.c,v
> retrieving revision 1.32
> diff -c -r1.32 fe-secure.c
> *** src/interfaces/libpq/fe-secure.c  29 Sep 2003 16:38:04 -  1.32
> --- src/interfaces/libpq/fe-secure.c  16 Nov 2003 11:44:47 -
> ***
> *** 348,354 
>   ssize_t n;
>   
>   #ifndef WIN32
> ! pqsigfunc   oldsighandler = pqsignal(SIGPIPE, SIG_IGN);
>   #endif
>   
>   #ifdef USE_SSL
> --- 348,357 
>   ssize_t n;
>   
>   #ifndef WIN32
> ! pqsigfunc   oldsighandler = NULL;
> ! 
> ! if (conn->do_sigaction)
> ! oldsighandler = pqsignal(SIGPIPE, SIG_IGN);
>   #endif
>   
>   #ifdef USE_SSL
> ***
> *** 408,414 
>   n = send(conn->sock, ptr, len, 0);
>   
>   #ifndef WIN32
> ! pqsignal(SIGPIPE, oldsighandler);
>   #endif
>   
>   return n;
> --- 411,418 
>   n = send(conn->sock, ptr, len, 0);
>   
>   #ifndef WIN32
> ! if (conn->do_sigaction)
> ! pqsignal(SIGPIPE, oldsighandler);
>   #endif
>   
>   return n;
> Index: src/interfaces/libpq/libpq-int.h
> ===
> RCS file: /projects/cvsroot/pgsql-server/src/interfaces/libpq/libpq-int.h,v
> retrieving revision 1.82
> diff -c -r1.82 libpq-int.h
> *** src/interfaces/libpq/libpq-int.h  5 Sep 2003 02:08:36 -   1.82
> --- src/interfaces/libpq/libpq-int.h  16 Nov 2003 11:44:48 -
> *

Re: [PATCHES] ALTER TABLE modifications

2003-11-16 Thread Hannu Krosing
Rod Taylor wrote on Sat, 08.11.2003 at 18:55:
> A general re-organization of Alter Table. Node wise, it is a
> AlterTableStmt with a list of AlterTableCmds.  The Cmds are the
> individual actions to be completed (Add constraint, drop constraint, add
> column, etc.)
> 
> Processing is done in 2 phases. The first phase updates the system
> catalogs and creates a work queue for the table scan. The second phase
> is to conduct the actual table scan evaluating all constraints and other
> per tuple processing simultaneously, as required. This has no effect on
> single step operations, but has a large benefit for combinational logic
> where multiple table scans would otherwise be required.

...

> ALTER TABLE tab ALTER COLUMN col TYPE text TRANSFORM ...; 
> Currently migrates indexes, check constraints, defaults, and the
> column definition to the new type with optional transform. If
> the transform is not supplied, a standard assignment cast is
> attempted.

Do you have special cases for type changes which don't need data
transforms?

I mean things like changing VARCHAR(10) to VARCHAR(20), dropping the NOT
NULL constraint or changing CHECK A < 3 to CHECK A < 4. 

All these could be done with no data migration or extra checking.

So how much of it should PG attempt to detect automatically, and should
there be a NOSCAN option for when the programmer knows better
(changing CHECK ABS(A) < 3 into CHECK 9 > (A*A))?


Hannu




[PATCHES] SIGPIPE handling

2003-11-16 Thread Manfred Spraul
Hi,

attached is an update of my automatic sigaction patch: I've moved the 
actual sigaction calls into pqsignal.c and added a helper function 
(pqsignalinquire(signo)). I couldn't remove the include <signal.h> from 
fe-connect.c: it's required for the SIGPIPE definition.
Additionally I've added a -a flag for pgbench that sets the signal 
handler before calling PQconnectdb.

Tested on Fedora Core 1 (Redhat Linux) with pgbench.

--
   Manfred
Index: src/interfaces/libpq/fe-connect.c
===
RCS file: /projects/cvsroot/pgsql-server/src/interfaces/libpq/fe-connect.c,v
retrieving revision 1.263
diff -c -r1.263 fe-connect.c
*** src/interfaces/libpq/fe-connect.c   18 Oct 2003 05:02:06 -  1.263
--- src/interfaces/libpq/fe-connect.c   16 Nov 2003 11:44:47 -
***
*** 41,46 
--- 41,48 
  #include 
  #endif
  #include 
+ #include 
+ #include "pqsignal.h"
  #endif
  
  #include "libpq/ip.h"
***
*** 881,886 
--- 883,891 
struct addrinfo hint;
const char *node = NULL;
int ret;
+ #ifndef WIN32
+   pqsigfunc pipehandler;
+ #endif
  
if (!conn)
return 0;
***
*** 950,955 
--- 955,976 
conn->allow_ssl_try = false;
else if (conn->sslmode[0] == 'a')   /* "allow" */
conn->wait_ssl_try = true;
+ #endif
+ #ifndef WIN32
+   /* 
+* Autodetect SIGPIPE signal handling:
+* The default action per Unix spec is kill current process and
+* that's not acceptable. If the current setting is not the default,
+* then assume that the caller knows what he's doing and leave the
+* signal handler unchanged. Otherwise set the signal handler to
+* SIG_IGN around each send() syscall. Unfortunately this is both
+* unreliable and slow for multithreaded apps.
+*/
+   pipehandler = pqsignalinquire(SIGPIPE);
+   if (pipehandler == SIG_DFL || pipehandler == SIG_ERR)
+   conn->do_sigaction = true;
+   else
+   conn->do_sigaction = false;
  #endif
  
/*
Index: src/interfaces/libpq/fe-secure.c
===
RCS file: /projects/cvsroot/pgsql-server/src/interfaces/libpq/fe-secure.c,v
retrieving revision 1.32
diff -c -r1.32 fe-secure.c
*** src/interfaces/libpq/fe-secure.c    29 Sep 2003 16:38:04 -  1.32
--- src/interfaces/libpq/fe-secure.c    16 Nov 2003 11:44:47 -
***
*** 348,354 
ssize_t n;
  
  #ifndef WIN32
!   pqsigfunc   oldsighandler = pqsignal(SIGPIPE, SIG_IGN);
  #endif
  
  #ifdef USE_SSL
--- 348,357 
ssize_t n;
  
  #ifndef WIN32
!   pqsigfunc   oldsighandler = NULL;
! 
!   if (conn->do_sigaction)
!   oldsighandler = pqsignal(SIGPIPE, SIG_IGN);
  #endif
  
  #ifdef USE_SSL
***
*** 408,414 
n = send(conn->sock, ptr, len, 0);
  
  #ifndef WIN32
!   pqsignal(SIGPIPE, oldsighandler);
  #endif
  
return n;
--- 411,418 
n = send(conn->sock, ptr, len, 0);
  
  #ifndef WIN32
!   if (conn->do_sigaction)
!   pqsignal(SIGPIPE, oldsighandler);
  #endif
  
return n;
Index: src/interfaces/libpq/libpq-int.h
===
RCS file: /projects/cvsroot/pgsql-server/src/interfaces/libpq/libpq-int.h,v
retrieving revision 1.82
diff -c -r1.82 libpq-int.h
*** src/interfaces/libpq/libpq-int.h    5 Sep 2003 02:08:36 -   1.82
--- src/interfaces/libpq/libpq-int.h    16 Nov 2003 11:44:48 -
***
*** 329,334 
--- 329,337 
    char        peer_dn[256 + 1];   /* peer distinguished name */
    char        peer_cn[SM_USER + 1];   /* peer common name */
  #endif
+ #ifndef WIN32
+   bool        do_sigaction;   /* set SIGPIPE to SIG_IGN around every send() call */
+ #endif
  
/* Buffer for current error message */
PQExpBufferData errorMessage;   /* expansible string */
Index: src/interfaces/libpq/pqsignal.c
===
RCS file: /projects/cvsroot/pgsql-server/src/interfaces/libpq/pqsignal.c,v
retrieving revision 1.17
diff -c -r1.17 pqsignal.c
*** src/interfaces/libpq/pqsignal.c 4 Aug 2003 02:40:20 -   1.17
--- src/interfaces/libpq/pqsignal.c 16 Nov 2003 11:44:48 -
***
*** 40,42 
--- 40,61 
return oact.sa_handler;
  #endif   /* !HAVE_POSIX_SIGNALS */
  }
+ 
+ pqsigfunc
+ pqsignalinquire(int signo)
+ {
+ #if !defined(HAVE_POSIX_SIGNALS)
+   pqsigfunc old;
+   old = signal(SIGPIPE, SIG_IGN);
+   signal(SIGPIPE, old);
+   return old;
+ #else
+   struct sigaction oact;
+ 
+   if (sigaction(SIGPIPE, NULL, &oact) != 0)
+  return SIG_ERR;
+

Re: [PATCHES] improve overcommit docs

2003-11-16 Thread Andrew Dunstan
At the time I wrote the original 2.6 was not out even in prerelease, 
which is why I was deliberately somewhat vague about it. It is still in 
prerelease, and it will in fact work slightly differently from what was 
in some 2.4 kernels - there are 2 settings that govern this instead of 
1. Here is the 2.6 description straight from 
linux-2.6.0-test9/Documentation/vm/overcommit-accounting:

---
The Linux kernel supports three overcommit handling modes
0   -   Heuristic overcommit handling. Obvious overcommits of
   address space are refused. Used for a typical system. It
   ensures a seriously wild allocation fails while allowing
   overcommit to reduce swap usage.  root is allowed to
   allocate slightly more memory in this mode. This is the
   default.
1   -   No overcommit handling. Appropriate for some scientific
   applications.
2   -   (NEW) strict overcommit. The total address space commit
   for the system is not permitted to exceed swap + a
   configurable percentage (default is 50) of physical RAM.
   Depending on the percentage you use, in most situations
   this means a process will not be killed while accessing
   pages but will receive errors on memory allocation as
   appropriate.
The overcommit policy is set via the sysctl `vm.overcommit_memory'.

The overcommit percentage is set via `vm.overcommit_ratio'.
-
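
As a worked example of the mode 2 rule quoted above (swap plus a
configurable percentage of physical RAM), here is a small sketch. The
function name strict_commit_limit_kb is hypothetical, and the formula
follows only the description in the quoted text; the limit the kernel
actually computes may take further details into account.

```c
#include <stdint.h>

/*
 * Commit limit for overcommit mode 2 as described in the quoted text:
 * swap + ratio_percent of physical RAM. All sizes are in kB.
 */
uint64_t
strict_commit_limit_kb(uint64_t ram_kb, uint64_t swap_kb,
                       unsigned ratio_percent)
{
    return swap_kb + (ram_kb * ratio_percent) / 100;
}
```

With 2 GB of RAM (2097152 kB), 1 GB of swap (1048576 kB) and the
default ratio of 50, this gives 2097152 kB, i.e. 2 GB of committable
address space. A ratio of 0 gives just the swap size.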


Also note that this is wrong for 2.4:

 run the machine out of memory. If your kernel supports the strict
! paranoid modes of overcommit handling, you can also relieve this
There are 2 modes: strict (allow commit up to sizeof(swap plus 1/2 RAM) ) and paranoid (allow commit up to sizeof(swap) ).

Wordsmith it however you like

cheers

andrew

Neil Conway wrote:

This patch makes some improvements to the section of the documentation
that describes the Linux 2.4 memory overcommit behavior.
I removed the almost content-free assertion that "You will need enough
swap space to cover your memory needs." If this is intended to
communicate anything meaningful, can someone rephrase it, please?
This patch also includes a fix for a typo noticed by Robert Treat.

Is this suitable for 7.4 (either the whole patch, or just the typo
fix)?
-Neil
 



Index: doc/src/sgml/runtime.sgml
===
RCS file: /var/lib/cvs/pgsql-server/doc/src/sgml/runtime.sgml,v
retrieving revision 1.218
diff -c -r1.218 runtime.sgml
*** doc/src/sgml/runtime.sgml	14 Nov 2003 15:43:22 -	1.218
--- doc/src/sgml/runtime.sgml	16 Nov 2003 02:07:42 -
***
*** 1294,1300 
 
  Unfortunately, there is no well-defined method for determining
  ideal values for the family of cost variables that
!  below. You are encouraged to experiment and share
  your findings.
 

--- 1294,1300 
 
  Unfortunately, there is no well-defined method for determining
  ideal values for the family of cost variables that
!  appear below. You are encouraged to experiment and share
  your findings.
 

***
*** 3267,3301 
Linux Memory Overcommit
 

! Linux kernels of version 2.4.* have a poor default memory
! overcommit behavior, which can result in the PostgreSQL server
! (postmaster process) being killed by the
! kernel if the memory demands of another process cause the system
! to run out of memory.

 

! If this happens, you will see a kernel message looking like this
! (consult your system documentation and configuration on where to
! look for such a message):
 
 Out of Memory: Killed process 12345 (postmaster). 
 
! And, of course, you will find that your database server has
! disappeared.

 

 To avoid this situation, run PostgreSQL
 on a machine where you can be sure that other processes will not
 run the machine out of memory. If your kernel supports the strict
! and/or paranoid modes of overcommit handling, you can also relieve
! this problem by altering the system's default behaviour. This can
! be determined by examining the function
! vm_enough_memory in the file mm/mmap.c
! in the kernel source. If this file reveals that the strict and/or
! paranoid modes are supported by your kernel, turn one of these
! modes on by using
 
 sysctl -w vm.overcommit_memory=2
 
--- 3267,3302 
Linux Memory Overcommit
 

! In Linux 2.4, the default virtual memory configuration is not
! optimal for PostgreSQL. Because of the
! way that the kernel implements memory overcommit, the kernel may
! terminate the PostgreSQL server (the
! postmaster process) if the