Re: [PATCHES] SIGPIPE handling
Bruce Momjian wrote:

> Here is my logic --- 99% of apps don't install a SIGPIPE signal handler,
> and 90% will not add a SIGPIPE/SIG_IGN call to their applications. I
> guess I am looking for something that would allow the performance
> benefit of not doing a pgsignal() call around every send() for the
> majority of our apps. What was the speed improvement?

Around 10% for a heavily multithreaded app on an 8-way Xeon server. Far less for a single-threaded app and far less for uniprocessor systems: the kernel must update the pending queue of all threads, and that causes lots of contention for the (per-process) spinlock that protects the signal handlers.

> Granted, we need to do something because our current setup isn't even
> thread-safe. Also, how is your patch more thread-safe than the old one?
> The detection is thread-safe, but I don't see how the use is.

First function in main():

    signal(SIGPIPE, SIG_IGN);
    PQsetsighandling(1);

This results in perfectly thread-safe sigpipe handling. If it's a multithreaded app that needs correct per-thread delivery of SIGPIPE signals for console IO, then the libpq user must implement the sequence I describe below.

> If you still pgsignal around the calls, I don't see how two threads
> couldn't do (thread 1 and thread 2, calls in the order they run):
>
>     pgsignal(SIGPIPE, SIG_IGN);
>     pgsignal(SIGPIPE, SIG_DFL);
>     send();
>     pgsignal(SIGPIPE, SIG_DFL);
>     send();
>     pgsignal(SIGPIPE, SIG_DFL);
>
> This runs thread1 with SIGPIPE as SIG_DFL.

Correct. A thread-safe sequence might be something like:

    pthread_sigmask(SIG_BLOCK, {SIGPIPE});
    send();
    if (sigpending(SIGPIPE)) {
        sigwait({SIGPIPE});
    }
    pthread_sigmask(SIG_UNBLOCK, {SIGPIPE});

But this sequence only works for users that link against libpthread. And the same sequence with sigprocmask is undefined for multithreaded apps.

--
    Manfred

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [pgsql-hackers-win32] [PATCHES] SRA Win32 sync() code
Tom Lane wrote:
> > Do we know that having the background writer fsync a file that was
> > written by a backend causes all the data to fsync? I think I could write
> > a program to test this by timing each of these tests:
>
> That might prove something about the particular platform you tested it
> on; but it would not speak to the real problem, which is what we can
> assume is true on every platform...

The attached program tests whether fsync can be used on a file descriptor after the file is closed and then reopened. I see:

    write                     0.000613
    write & fsync             0.001727
    write, close & fsync      0.001633

This shows that fsync works even after the file is closed and reopened. I could test by writing using a subprocess, but I don't see how that would be different, and it would mess up my timings.

Anyway, if we find all our platforms can pass this test, we might be able to allow backends to do their own writes and just record the file name somewhere for the checkpointer to fsync.

It also shows write/fsync was 3x slower than simple write.

Does anyone have a platform where the last duration is significantly different from the middle timing?

I am keeping this discussion on patches because of the C program attachment.
--
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                    |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

/*
 * test_fsync.c
 *    tests if fsync can be done on a file after it was closed and reopened
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/time.h>

void die(char *str);
void print_elapse(struct timeval start_t, struct timeval elapse_t);

int
main(int argc, char *argv[])
{
    struct timeval start_t;
    struct timeval elapse_t;
    int         tmpfile;
    int         i;
    char        charout = 44;

    /* write only */
    gettimeofday(&start_t, NULL);
    /* O_CREAT requires an explicit file mode */
    if ((tmpfile = open("/var/tmp/test_fsync.out", O_RDWR | O_CREAT, 0600)) == -1)
        die("can't open /var/tmp/test_fsync.out");
    for (i = 0; i < 200; i++)
        write(tmpfile, &charout, 1);
    close(tmpfile);
    gettimeofday(&elapse_t, NULL);
    unlink("/var/tmp/test_fsync.out");
    printf("write                  ");
    print_elapse(start_t, elapse_t);
    printf("\n");

    /* write & fsync */
    gettimeofday(&start_t, NULL);
    if ((tmpfile = open("/var/tmp/test_fsync.out", O_RDWR | O_CREAT, 0600)) == -1)
        die("can't open /var/tmp/test_fsync.out");
    for (i = 0; i < 200; i++)
        write(tmpfile, &charout, 1);
    fsync(tmpfile);
    close(tmpfile);
    gettimeofday(&elapse_t, NULL);
    unlink("/var/tmp/test_fsync.out");
    printf("write & fsync          ");
    print_elapse(start_t, elapse_t);
    printf("\n");

    /* write, close & fsync */
    gettimeofday(&start_t, NULL);
    if ((tmpfile = open("/var/tmp/test_fsync.out", O_RDWR | O_CREAT, 0600)) == -1)
        die("can't open /var/tmp/test_fsync.out");
    for (i = 0; i < 200; i++)
        write(tmpfile, &charout, 1);
    close(tmpfile);
    /* reopen file */
    if ((tmpfile = open("/var/tmp/test_fsync.out", O_RDWR | O_CREAT, 0600)) == -1)
        die("can't open /var/tmp/test_fsync.out");
    fsync(tmpfile);
    close(tmpfile);
    gettimeofday(&elapse_t, NULL);
    unlink("/var/tmp/test_fsync.out");
    printf("write, close & fsync   ");
    print_elapse(start_t, elapse_t);
    printf("\n");

    return 0;
}

void
print_elapse(struct timeval start_t, struct timeval elapse_t)
{
    /* borrow one second (1000000 microseconds) when tv_usec would go negative */
    if (elapse_t.tv_usec < start_t.tv_usec)
    {
        elapse_t.tv_sec--;
        elapse_t.tv_usec += 1000000;
    }

    printf("%ld.%06ld", (long) (elapse_t.tv_sec - start_t.tv_sec),
           (long) (elapse_t.tv_usec - start_t.tv_usec));
}

void
die(char *str)
{
    fprintf(stderr, "%s\n", str);
    exit(1);
}
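The message above mentions that the same check could be made with the write done by a subprocess. A minimal sketch of that variant follows, with a child process doing the write() calls and the parent doing only the fsync(). The function name and the idea of passing the path as a parameter are assumptions for illustration, not part of the attached program.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: a child process writes nbytes single bytes to path
 * and exits; the parent then opens the same file and calls fsync() on its
 * own descriptor. Returns 0 on success, -1 on any failure.
 */
int write_in_child_fsync_in_parent(const char *path, int nbytes)
{
    pid_t pid = fork();
    int   status, fd, rc;

    if (pid < 0)
        return -1;

    if (pid == 0)
    {
        /* Child: plays the role of the backend that does the write(). */
        char c = 44;
        int  i;
        int  cfd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

        if (cfd < 0)
            _exit(1);
        for (i = 0; i < nbytes; i++)
            if (write(cfd, &c, 1) != 1)
                _exit(1);
        close(cfd);
        _exit(0);
    }

    /* Parent: plays the role of the process that only does the fsync(). */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;
    rc = fsync(fd);
    close(fd);
    return rc;
}
```

A zero return only says that the fsync() call was accepted; as in the attached program, it is the timing of that call that hints at whether data written by the other process was actually flushed.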
Re: [pgsql-hackers-win32] [PATCHES] SRA Win32 sync() code
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Where am I wrong?
>
> I don't think any of this is relevant. There are a certain number of
> blocks we have to get down to disk before we can declare a transaction
> committed, and there are a certain number that we have to get down to
> disk before we can declare a checkpoint complete. You are focusing too
> much on the question of whether a particular process performs an fsync
> operation, and ignoring the fact that ultimately it's got to wait for
> I/O to complete --- directly or indirectly. If it blocks waiting for
> some other process to declare a buffer clean, rather than writing for
> itself, what's the difference?

The difference is two-fold. First, there might be 10 other backends asking for writes, so it isn't clear that asking someone else to do the write is as fast. Second, that other writer is doing fsync, so it is 100x or 1000x slower than our current setup.

> Sure, fsync serializes the particular process that's doing it, but we
> can deal with that by spreading the fsyncs across multiple processes,
> and trying to ensure that they are mostly background processes rather
> than foreground ones.

How many? That was my point: it might require 1000 backend processes _and_ it would be slower because we are doing write/fsync rather than write. However, I think we could fix that by doing the write and returning OK to the backend, then doing the fsync whenever we want --- perhaps that was already your plan.

> I don't claim that immediate-fsync-on-write is the only answer, but
> I cannot follow your reasoning for dismissing it out of hand ... and I
> certainly cannot buy *any* logic that says that sync() is a good answer
> to any of these issues. AFAICS sync() means that we abandon
> responsibility.

sync() means we group the fsyncs into one massive one that syncs all other processes' I/O too --- clearly bad, but I am hoping for something as good as what we currently have, because that sync hopefully happens only every few minutes.

> > Do we know that having the background writer fsync a file that was
> > written by a backend causes all the data to fsync? I think I could write
> > a program to test this by timing each of these tests:
>
> That might prove something about the particular platform you tested it
> on; but it would not speak to the real problem, which is what we can
> assume is true on every platform...

Yes, it would only be per platform. I was thinking we could have a platform test and enable this behavior on platforms that support it (all?) and use sync on the others.

Also, let me say I am glad we are delving into this. Our buffer system has needed an overhaul for a while, and right now we already have some major improvements for 7.5, and this discussion is just a continuation of those improvements.

--
  Bruce Momjian
Re: [pgsql-hackers-win32] [PATCHES] SRA Win32 sync() code
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Where am I wrong?

I don't think any of this is relevant. There are a certain number of blocks we have to get down to disk before we can declare a transaction committed, and there are a certain number that we have to get down to disk before we can declare a checkpoint complete. You are focusing too much on the question of whether a particular process performs an fsync operation, and ignoring the fact that ultimately it's got to wait for I/O to complete --- directly or indirectly. If it blocks waiting for some other process to declare a buffer clean, rather than writing for itself, what's the difference?

Sure, fsync serializes the particular process that's doing it, but we can deal with that by spreading the fsyncs across multiple processes, and trying to ensure that they are mostly background processes rather than foreground ones.

I don't claim that immediate-fsync-on-write is the only answer, but I cannot follow your reasoning for dismissing it out of hand ... and I certainly cannot buy *any* logic that says that sync() is a good answer to any of these issues. AFAICS sync() means that we abandon responsibility.

> Do we know that having the background writer fsync a file that was
> written by a backend causes all the data to fsync? I think I could write
> a program to test this by timing each of these tests:

That might prove something about the particular platform you tested it on; but it would not speak to the real problem, which is what we can assume is true on every platform...

			regards, tom lane
Re: [pgsql-hackers-win32] [PATCHES] SRA Win32 sync() code
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Tom Lane wrote:
> >> Seriously though, if we can move the bulk of the writing work into
> >> background processes then I don't believe that there will be any
> >> significant penalty for regular backends.
>
> > If the background writer starts using fsync(), we can have normal
> > backends that do a write() set a shared memory boolean. We can then
> > test that boolean and do sync() only if other backends had to do their
> > own writes.
>
> That seems like the worst of both worlds --- you still are depending on
> sync() for correctness.
>
> Also, as long as backends only *seldom* do writes, making them fsync a
> write when they do make one will be less of an impact on overall system
> performance than having a sync() ensue shortly afterwards. I think you
> are focusing too narrowly on the idea that backends shouldn't ever wait
> for writes, and failing to see the bigger picture. What we need to
> optimize is overall system performance, not an arbitrary restriction
> that certain processes never wait for certain things.

OK, let me give you my logic and you can tell me where I am wrong.

First, how many backends can a single write process support if all the backends are doing insert/update/deletes? 5? 10? Let's assume 10. Second, once we change write to write/fsync, how much slower will that be? 100x, 1000x? Let's say 10x.

So, by my logic, if we have 100 backends all doing updates, we will need 10 * 100 or 1000 writer processes or threads to keep up with that load. That seems quite excessive to me from a context switching and process overhead perspective.

Where am I wrong?

Also, if we go with the fsync only at checkpoint, we are doing fsyncs once every minute (at checkpoint time) rather than potentially several times a second.

Do we know that having the background writer fsync a file that was written by a backend causes all the data to fsync? I think I could write a program to test this by timing each of these tests:

    create an empty file
    open file
    time fsync
    close

    open file
    write 2mb into the file
    time fsync
    close

    open file
    write 2mb into the file
    close
    open file
    time fsync
    close

If I do the write via system(), I am doing it in a separate process so the test should work. Should I try this?

--
  Bruce Momjian
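One way to turn the sizing assumptions in the message above into arithmetic is sketched below. The function name and the exact model are assumptions, not something taken from the thread: one writer serves `backends_per_writer` backends with plain write(), and write plus fsync makes every writer `slowdown` times slower. With the stated figures (100 backends, 10 per writer, 10x slowdown) this reading gives 100 writers; 1000 is what results when each writer is assumed to serve only a single backend.

```c
/*
 * Hypothetical sizing helper: number of writer processes needed, rounded
 * up. Assumes backends_per_writer > 0 and that the slowdown applies
 * uniformly to every writer.
 */
long writers_needed(long backends, long backends_per_writer, long slowdown)
{
    long work = backends * slowdown;    /* load in "plain write" units */

    return (work + backends_per_writer - 1) / backends_per_writer;
}
```

Under either reading the estimate scales linearly with the slowdown factor, which is the part of the argument the replies in this thread push back on.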
Re: [PATCHES] SIGPIPE handling
Manfred Spraul wrote:
> Bruce Momjian wrote:
>
> >I thought it should be global too, basically testing on the first
> >connection request.
> >
> What if two PQconnect calls happen at the same time?
> I would really prefer the manual approach with a new PQsetsighandler
> function - the autodetection is fragile, it's trivial to find a special
> case where it breaks.
> Bruce, you wrote that a new function would be overdesign. Are you sure?
> Your simpler proposals all fail with multithreaded apps.
> I've attached the patch that implements the global flag with two special
> function that access it.

Here is my logic --- 99% of apps don't install a SIGPIPE signal handler, and 90% will not add a SIGPIPE/SIG_IGN call to their applications. I guess I am looking for something that would allow the performance benefit of not doing a pgsignal() call around every send() for the majority of our apps. What was the speed improvement?

Just the fact you had to add the SIG_IGN call to pgbench shows that most apps need some special handling to get this performance benefit, and I would like to avoid that.

Your PQsetsighandler() idea --- would that be for SIGPIPE only? Would it be acceptable to tell application developers they have to use a PQsetsig*pipe*handler() call to register a SIGPIPE handler? If so, that would be great because we would do the pgsignal call around send() only when it was needed. It might be the cleanest way and the most reliable.

Granted, we need to do something because our current setup isn't even thread-safe. Also, how is your patch more thread-safe than the old one? The detection is thread-safe, but I don't see how the use is. If you still pgsignal around the calls, I don't see how two threads couldn't do (thread 1 and thread 2, calls in the order they run):

    pgsignal(SIGPIPE, SIG_IGN);
    pgsignal(SIGPIPE, SIG_DFL);
    send();
    pgsignal(SIGPIPE, SIG_DFL);
    send();
    pgsignal(SIGPIPE, SIG_DFL);

This runs thread1 with SIGPIPE as SIG_DFL.

What are we ignoring the SIGPIPE for on send anyway? Is this in case the backend crashed?

--
  Bruce Momjian
Re: [PATCHES] SIGPIPE handling
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Yes, I was afraid of that. Here's another idea. If the signal handler
> > is SIG_DFL, we install our own signal handler for SIGPIPE, and set/clear a
> > global variable before/after we send().
>
> That would address the speed issue but not the multithread correctness
> issue. Also, what happens if the app replaces the signal handler later?

Well, our current setup doesn't handle multithreading properly either. In fact, I am starting to worry about libpq's thread-safety. Should I?

--
  Bruce Momjian
Re: [PATCHES] SRA Win32 sync() code
Jan Wieck <[EMAIL PROTECTED]> writes:
> Well, the bgwriter has basically the same chance the checkpointer has
> ... mdblindwrt() in the end, because he doesn't have the relcache handy.

We could easily get rid of mdblindwrt --- there is no very good reason that we use the relcache for I/O. There could and should be a lower-level notion of "open relation" that bgwriter and checkpoint could use. See recent discussion with Neil, for example. Vadim had always wanted to do that, IIRC.

			regards, tom lane
[PATCHES] Alter Table phase 1 -- Please apply to 7.5
Completes:

    ALTER TABLE ADD COLUMN does not honour DEFAULT and non-CHECK CONSTRAINT
    ALTER TABLE ADD COLUMN column DEFAULT should fill existing rows with DEFAULT value
    ALTER TABLE ADD COLUMN column SERIAL doesn't create sequence because of the item above

Previously described reorganization of all ALTER TABLE commands.

Most of the way through column type change. I need to supply a followup patch which deals with logical attribute numbers.

    ALTER TABLE table ALTER [COLUMN] column TYPE type USING expression;

Syntax documentation updates only. Content to come later.

Attachment: altertable.patch.gz (GNU Zip compressed data)
Re: [PATCHES] SRA Win32 sync() code
Tom Lane wrote:
> Jan Wieck <[EMAIL PROTECTED]> writes:
> > Removing sync() entirely requires very accurate fsync()'ing in the
> > background writer, the checkpointer and the backends. Basically none of
> > them can mark a block "clean" if he fails to fsync() the relation later!
> > This will be a mess to code.
>
> Not really. The O_SYNC solution for example would be trivial to code.

Well, the bgwriter has basically the same chance the checkpointer has ... mdblindwrt() in the end, because he doesn't have the relcache handy. So you want to open(O_SYNC), write(), close() every single block? I don't expect that to be much better than a global sync().

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== [EMAIL PROTECTED] #
Re: [PATCHES] SRA Win32 sync() code
Jan Wieck <[EMAIL PROTECTED]> writes:
> Removing sync() entirely requires very accurate fsync()'ing in the
> background writer, the checkpointer and the backends. Basically none of
> them can mark a block "clean" if he fails to fsync() the relation later!
> This will be a mess to code.

Not really. The O_SYNC solution for example would be trivial to code.

			regards, tom lane
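As a rough illustration of what an O_SYNC-based write could look like, here is a minimal sketch. The helper name and the open/write/close-per-block shape are assumptions made for the example; the thread does not show how this would actually be wired into the storage manager.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write one block at the given offset through a
 * descriptor opened with O_SYNC, so that write completion also means the
 * OS reports the data as written to stable storage.
 * Returns 0 on success, -1 on failure.
 */
int write_block_osync(const char *path, off_t offset,
                      const void *block, size_t len)
{
    int     fd = open(path, O_WRONLY | O_CREAT | O_SYNC, 0600);
    ssize_t n;

    if (fd < 0)
        return -1;

    /* pwrite() writes at an explicit offset without a separate lseek(). */
    n = pwrite(fd, block, len, offset);

    if (close(fd) != 0)
        return -1;
    return (n == (ssize_t) len) ? 0 : -1;
}
```

Because every call waits for the device, the cost per block is much higher than a buffered write(); keeping one O_SYNC descriptor open per file instead of reopening it for each block would be a natural refinement.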
Re: [PATCHES] SRA Win32 sync() code
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Tom Lane wrote:
> >> Seriously though, if we can move the bulk of the writing work into
> >> background processes then I don't believe that there will be any
> >> significant penalty for regular backends.
>
> > If the background writer starts using fsync(), we can have normal
> > backends that do a write() set a shared memory boolean. We can then
> > test that boolean and do sync() only if other backends had to do their
> > own writes.
>
> That seems like the worst of both worlds --- you still are depending on
> sync() for correctness.
>
> Also, as long as backends only *seldom* do writes, making them fsync a
> write when they do make one will be less of an impact on overall system
> performance than having a sync() ensue shortly afterwards. I think you
> are focusing too narrowly on the idea that backends shouldn't ever wait
> for writes, and failing to see the bigger picture. What we need to
> optimize is overall system performance, not an arbitrary restriction
> that certain processes never wait for certain things.

Removing sync() entirely requires very accurate fsync()'ing in the background writer, the checkpointer and the backends. Basically none of them can mark a block "clean" if he fails to fsync() the relation later! This will be a mess to code.

Jan
Re: [PATCHES] SIGPIPE handling
Manfred Spraul <[EMAIL PROTECTED]> writes:
> + extern void PQsetsighandling(int internal_sigign);

These sorts of things are commonly designed so that the set() operation incidentally returns the previous setting. I'm not sure if anyone would care, but it's only a couple more lines of code to make that happen, so I'd suggest doing so just in case.

Otherwise I think this is a good patch. The documentation could use a little more wordsmithing, perhaps.

			regards, tom lane
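The "set returns the previous setting" convention suggested above could look roughly like this. The `demo_` names and the static flag are hypothetical stand-ins for whatever the patch keeps inside libpq; this is a sketch of the convention, not the patch's code.

```c
/* Hypothetical stand-in for the global flag; 1 = library handles SIGPIPE. */
static int internal_sigign_flag = 1;

/* Sets the flag and returns the value it had before the call. */
int demo_setsighandling(int internal_sigign)
{
    int previous = internal_sigign_flag;

    internal_sigign_flag = internal_sigign;
    return previous;
}

/* Returns the current value of the flag. */
int demo_getsighandling(void)
{
    return internal_sigign_flag;
}
```

A caller can then change the setting temporarily and restore it with `old = demo_setsighandling(0); ... demo_setsighandling(old);`. The sketch does no locking, so it assumes the flag is set before any threads are started.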
Re: [PATCHES] SRA Win32 sync() code
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> Seriously though, if we can move the bulk of the writing work into
>> background processes then I don't believe that there will be any
>> significant penalty for regular backends.

> If the background writer starts using fsync(), we can have normal
> backends that do a write() set a shared memory boolean. We can then
> test that boolean and do sync() only if other backends had to do their
> own writes.

That seems like the worst of both worlds --- you still are depending on sync() for correctness.

Also, as long as backends only *seldom* do writes, making them fsync a write when they do make one will be less of an impact on overall system performance than having a sync() ensue shortly afterwards. I think you are focusing too narrowly on the idea that backends shouldn't ever wait for writes, and failing to see the bigger picture. What we need to optimize is overall system performance, not an arbitrary restriction that certain processes never wait for certain things.

			regards, tom lane
Re: [PATCHES] SIGPIPE handling
Bruce Momjian wrote:

> I thought it should be global too, basically testing on the first
> connection request.

What if two PQconnect calls happen at the same time?
I would really prefer the manual approach with a new PQsetsighandler function - the autodetection is fragile, it's trivial to find a special case where it breaks.
Bruce, you wrote that a new function would be overdesign. Are you sure? Your simpler proposals all fail with multithreaded apps.
I've attached the patch that implements the global flag with two special functions that access it.

--
    Manfred

Index: contrib/pgbench/README.pgbench
===
RCS file: /projects/cvsroot/pgsql-server/contrib/pgbench/README.pgbench,v
retrieving revision 1.9
diff -c -r1.9 README.pgbench
*** contrib/pgbench/README.pgbench 10 Jun 2003 09:07:15 - 1.9
--- contrib/pgbench/README.pgbench 8 Nov 2003 21:43:53 -
***
*** 112,117
--- 112,121
          might be a security hole since ps command will show the password. Use this for TESTING PURPOSE ONLY.
+     -a
+         Disable SIGPIPE delivery globally instead of within each
+         libpq operation.
+
      -n
          No vacuuming and cleaning the history table prior to the test is performed.
Index: contrib/pgbench/pgbench.c
===
RCS file: /projects/cvsroot/pgsql-server/contrib/pgbench/pgbench.c,v
retrieving revision 1.27
diff -c -r1.27 pgbench.c
*** contrib/pgbench/pgbench.c 27 Sep 2003 19:15:34 - 1.27
--- contrib/pgbench/pgbench.c 8 Nov 2003 21:43:54 -
***
*** 28,33
--- 28,34
  #else
  #include
  #include
+ #include

  #ifdef HAVE_GETOPT_H
  #include
***
*** 105,112
  static void
  usage()
  {
!     fprintf(stderr, "usage: pgbench [-h hostname][-p port][-c nclients][-t ntransactions][-s scaling_factor][-n][-C][-v][-S][-N][-l][-U login][-P password][-d][dbname]\n");
!     fprintf(stderr, "(initialize mode): pgbench -i [-h hostname][-p port][-s scaling_factor][-U login][-P password][-d][dbname]\n");
  }

  /* random number generator */
--- 106,113
  static void
  usage()
  {
!     fprintf(stderr, "usage: pgbench [-h hostname][-p port][-c nclients][-t ntransactions][-s scaling_factor][-n][-C][-v][-S][-N][-l][-a][-U login][-P password][-d][dbname]\n");
!     fprintf(stderr, "(initialize mode): pgbench -i [-h hostname][-p port][-s scaling_factor][-U login][-P password][-d][dbname][-a]\n");
  }

  /* random number generator */
***
*** 703,712
      else if ((env = getenv("PGUSER")) != NULL && *env != '\0')
          login = env;

!     while ((c = getopt(argc, argv, "ih:nvp:dc:t:s:U:P:CNSl")) != -1)
      {
          switch (c)
          {
              case 'i':
                  is_init_mode++;
                  break;
--- 704,719
      else if ((env = getenv("PGUSER")) != NULL && *env != '\0')
          login = env;

!     while ((c = getopt(argc, argv, "aih:nvp:dc:t:s:U:P:CNSl")) != -1)
      {
          switch (c)
          {
+             case 'a':
+ #ifndef WIN32
+                 signal(SIGPIPE, SIG_IGN);
+ #endif
+                 PQsetsighandling(0);
+                 break;
              case 'i':
                  is_init_mode++;
                  break;
Index: doc/src/sgml/libpq.sgml
===
RCS file: /projects/cvsroot/pgsql-server/doc/src/sgml/libpq.sgml,v
retrieving revision 1.141
diff -c -r1.141 libpq.sgml
*** doc/src/sgml/libpq.sgml 1 Nov 2003 01:56:29 - 1.141
--- doc/src/sgml/libpq.sgml 8 Nov 2003 21:43:56 -
***
*** 645,650
--- 645,693
+ PQsetsighandling PQsetsighandling
+ PQgetsighandling PQgetsighandling
+
+ Set/query SIGPIPE signal handling.
+
+ void PQsetsighandling(int internal_sigign);
+
+ int PQgetsighandling(void);
+
+ These functions allow to query and set the SIGPIPE signal handling
+ of libpq: by default, Unix systems generate a (fatal) SIGPIPE signal
+ on write attempts to a disconnected socket. Most callers expect a
+ normal error return instead of the signal. A normal error return can
+ be achieved by blocking or ignoring the SIGPIPE signal. This can be
+ done either globally in the application or inside libpq.
+
+ If internal signal handling is enabled (this is the default), then
+ libpq sets the SIGPIPE handler to SIG_IGN before every socket send
+ operation and restores it afterwards. This prevents libpq from
+ killing the application, at the cost of a slight performance
+ dec
Re: [PATCHES] SRA Win32 sync() code
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Tom Lane wrote:
> >> One reason I like the idea of adopting a sync-when-you-write policy is
> >> that it eliminates the need for anything as messy as that.
>
> > Yes, but can we do it without causing a performance degradation, and I
> > would hate to change something to make things easier on Win32 while
> > penalizing all platforms.
>
> Having to keep a list of modified files in shared memory isn't a penalty?
>
> Seriously though, if we can move the bulk of the writing work into
> background processes then I don't believe that there will be any
> significant penalty for regular backends. And I believe that it would
> be a huge advantage from a correctness point of view if we could stop
> depending on sync(). The fact that Windows hasn't got sync() is merely
> another reason we should stop using it.

If the background writer starts using fsync(), we can have normal backends that do a write() set a shared memory boolean. We can then test that boolean and do sync() only if other backends had to do their own writes.

--
  Bruce Momjian
Re: [PATCHES] SIGPIPE handling
Tom Lane wrote:
> Manfred Spraul <[EMAIL PROTECTED]> writes:
> > But how should libpq notice that the caller handles sigpipe signals?
> > a) autodetection - if the sigpipe handler is not the default, then the
> > caller knows what he's doing.
> > b) a new PGsetsignalhandler() function.
> > c) an additional flag passed to PQconnectdb.
>
> > Tom preferred a). One problem is that the autodetection is not perfect:
> > an app could block the signal with sigprocmask, or it could install a
> > handler that doesn't expect sigpipe signals from within libpq.
> > I would prefer b), because it guarantees that the patch has no effect on
> > existing apps.
>
> I have no particular objection to (b) either, but IIRC there was some
> dispute about whether it sets a global or per-connection flag. ISTM
> that "I have a correct signal handler" is a global assertion (within one
> process) and so a global flag is appropriate. Someone else (Bruce?)
> didn't like that though.

I thought it should be global too, basically testing on the first connection request.

--
  Bruce Momjian
Re: [PATCHES] SIGPIPE handling
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Yes, I was afraid of that. Here's another idea. If the signal handler
> is SIG_DFL, we install our own signal handler for SIGPIPE, and set/clear a
> global variable before/after we send().

That would address the speed issue but not the multithread correctness issue. Also, what happens if the app replaces the signal handler later?

			regards, tom lane
Re: [PATCHES] SIGPIPE handling
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Is running the rest of the
> > application with SIGPIPE <= SIG_IGN a problem?
>
> That is NOT an acceptable thing for a library to do.

Yes, I was afraid of that. Here's another idea. If the signal handler is SIG_DFL, we install our own signal handler for SIGPIPE, and set/clear a global variable before/after we send(). When our signal handler is called, we check to see if our global variable is set, and we either ignore or exit(). Can we do that safely?

Seems it only fails when they register a signal handler after establishing a database connection.

How would this work in a threaded app --- not too well, I think.

--
  Bruce Momjian
Re: [PATCHES] SIGPIPE handling
Kurt Roeckx <[EMAIL PROTECTED]> writes:
> On Sun, Nov 16, 2003 at 06:28:06PM +0100, Kurt Roeckx wrote:
>> Is there a reason we don't make use of the MSG_NOSIGNAL flag to
>> send()? Or is the problem in case of SSL?

> Oh, seems to be a Linux only thing?

That and the SSL problem. I wouldn't object to implementing it as a platform-specific optimization if we could get it to handle the SSL case, but without SSL support I think it's too limited.

			regards, tom lane
Re: [PATCHES] SRA Win32 sync() code
Manfred Spraul wrote:
> Tom Lane wrote:
> > Seriously though, if we can move the bulk of the writing work into
> > background processes then I don't believe that there will be any
> > significant penalty for regular backends. And I believe that it would
> > be a huge advantage from a correctness point of view if we could stop
> > depending on sync().
>
> Which function guarantees that renames of WAL files arrived on the disk?
> AFAIK sync() is the only function that guarantees that.
>
> What about the sync app from sysinternals? It seems Mark Russinovich
> figured out how to implement sync on Win32:
> http://www.sysinternals.com/ntw2k/source/misc.shtml#Sync
> It requires administrative privileges, but it shouldn't be that
> difficult to write a tiny service that runs in the LocalSystem account,
> listens to a pipe and syncs all disks when asked.

I think we'd have to do it from scratch, because of these license terms:

---
There is no charge to use any of the software published on this Web site at home or at work, so long as each user downloads and installs the product directly from www.sysinternals.com. A commercial license is required to redistribute any of these utilities directly (whether by computer media, a file server, an email attachment, etc.) or to embed them in- or link them to- another program.
---

Also, do we want to force a broad-brush sync() or just fsync our own files?

cheers

andrew
Re: [PATCHES] SIGPIPE handling
Manfred Spraul <[EMAIL PROTECTED]> writes:
> But how should libpq notice that the caller handles sigpipe signals?
> a) autodetection - if the sigpipe handler is not the default, then the
> caller knows what he's doing.
> b) a new PGsetsignalhandler() function.
> c) an additional flag passed to PGconnectdb.

> Tom preferred a). One problem is that the autodetection is not perfect:
> an app could block the signal with sigprocmask, or it could install a
> handler that doesn't expect sigpipe signals from within libpq.
> I would prefer b), because it guarantees that the patch has no effect on
> existing apps.

I have no particular objection to (b) either, but IIRC there was some
dispute about whether it sets a global or per-connection flag. ISTM
that "I have a correct signal handler" is a global assertion (within
one process) and so a global flag is appropriate. Someone else (Bruce?)
didn't like that though.

regards, tom lane
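To make the "global flag" variant of option (b) concrete, here is a rough sketch of what a process-wide setting could look like. Both function names are hypothetical and invented for this illustration; no such API is defined anywhere in this thread's patch. The sketch assumes the setter is called once at startup, before any threads are created, since the flag is a plain variable with no locking.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Process-wide assertion by the application: "I handle SIGPIPE
 * myself".  Illustrative only; the name is not from libpq.
 */
static bool app_handles_sigpipe = false;

/* Hypothetical setter, meant to be called once early in main(). */
void
example_set_sigpipe_handled(bool handled)
{
	app_handles_sigpipe = handled;
}

/*
 * What the library would consult before each send(): true means it
 * still has to protect the call against SIGPIPE itself.
 */
bool
example_need_sigpipe_guard(void)
{
	return !app_handles_sigpipe;
}
```

Because the default is "not handled", existing applications that never call the setter would see no change in behavior, which is the property Manfred wants from (b).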
Re: [PATCHES] SIGPIPE handling
On Sun, Nov 16, 2003 at 06:28:06PM +0100, Kurt Roeckx wrote:
> On Sun, Nov 16, 2003 at 12:56:10PM +0100, Manfred Spraul wrote:
> > Hi,
> >
> > attached is an update of my automatic sigaction patch: I've moved the
> > actual sigaction calls into pqsignal.c and added a helper function
> > (pgsignalinquire(signo)). I couldn't remove the include from
> > fe-connect.c: it's required for the SIGPIPE definition.
> > Additionally I've added a -a flag for pgbench that sets the signal
> > handler before calling PQconnectdb.
>
> Is there a reason we don't make use of the MSG_NOSIGNAL flag to
> send()? Or is the problem in case of SSL?

Oh, seems to be a Linux only thing?

Kurt
Re: [PATCHES] SRA Win32 sync() code
Manfred Spraul <[EMAIL PROTECTED]> writes:
> Which function guarantees that renames of WAL files arrived on the disk?

The OS itself is supposed to guarantee that; that's what a journaling
file system is for. In any case, I don't think we care. Renaming would
apply only to WAL segments that are not currently needed where they
are, and would only be needed under their new names at some future
time. If the rename gets lost shortly after it's done, it can be
redone.

regards, tom lane
Re: [PATCHES] SIGPIPE handling
On Sun, Nov 16, 2003 at 12:56:10PM +0100, Manfred Spraul wrote:
> Hi,
>
> attached is an update of my automatic sigaction patch: I've moved the
> actual sigaction calls into pqsignal.c and added a helper function
> (pgsignalinquire(signo)). I couldn't remove the include from
> fe-connect.c: it's required for the SIGPIPE definition.
> Additionally I've added a -a flag for pgbench that sets the signal
> handler before calling PQconnectdb.

Is there a reason we don't make use of the MSG_NOSIGNAL flag to
send()? Or is the problem in case of SSL?

Kurt
Re: [PATCHES] improve overcommit docs
That covers it extremely well.

cheers

andrew

Tom Lane wrote:
> Andrew Dunstan <[EMAIL PROTECTED]> writes:
>> At the time I wrote the original 2.6 was not out even in prerelease,
>> which is why I was deliberately somewhat vague about it. It is still in
>> prerelease, and it will in fact work slightly differently from what was
>> in some 2.4 kernels - there are 2 settings that govern this instead of
>> 1.
>
> Okay, I revised that section yet again based on this info:
> http://candle.pha.pa.us/main/writings/pgsql/sgml/kernel-resources.html#AEN17043
> Thanks for the update.
>
> regards, tom lane
Re: [PATCHES] SRA Win32 sync() code
Tom Lane wrote:
> Seriously though, if we can move the bulk of the writing work into
> background processes then I don't believe that there will be any
> significant penalty for regular backends. And I believe that it would
> be a huge advantage from a correctness point of view if we could stop
> depending on sync().

Which function guarantees that renames of WAL files arrived on the
disk? AFAIK sync() is the only function that guarantees that.

What about the sync app from sysinternals? It seems Mark Russinovich
figured out how to implement sync on Win32:
http://www.sysinternals.com/ntw2k/source/misc.shtml#Sync
It requires administrative privileges, but it shouldn't be that
difficult to write a tiny service that runs in the LocalSystem account,
listens to a pipe and syncs all disks when asked.

--
Manfred
Re: [PATCHES] improve overcommit docs
Andrew Dunstan <[EMAIL PROTECTED]> writes:
> At the time I wrote the original 2.6 was not out even in prerelease,
> which is why I was deliberately somewhat vague about it. It is still in
> prerelease, and it will in fact work slightly differently from what was
> in some 2.4 kernels - there are 2 settings that govern this instead of
> 1.

Okay, I revised that section yet again based on this info:
http://candle.pha.pa.us/main/writings/pgsql/sgml/kernel-resources.html#AEN17043
Thanks for the update.

regards, tom lane
Re: [PATCHES] SIGPIPE handling
Bruce Momjian wrote:
> Better. However, I am confused over when we do sigaction. I thought we
> were going to do it only if they had a signal handler defined, meaning
>
>     if (pipehandler != SIG_DFL &&
>         pipehandler != SIG_IGN &&
>         pipehandler != SIG_ERR)
>         conn->do_sigaction = true;
>     else
>         conn->do_sigaction = false;
>
> By doing this, we don't do sigaction in the default case where no
> handler was defined.

No. If no handler was defined, then libpq must define a handler.
Without a handler, a network disconnect would result in a SIGPIPE that
kills the app.

> I thought we would just set the entire application to SIGPIPE <=
> SIG_IGN. This gives us good performance in all cases except when a
> signal handler is defined.

I don't want to change the whole app - perhaps someone expects that
sigpipe works? Perhaps psql for the console input, or something
similar?

> Is running the rest of the application with SIGPIPE <= SIG_IGN a
> problem?

I think that depends on the application, and libpq shouldn't mandate
that SIGPIPE must be SIG_IGN.

Right now libpq tries to catch sigpipe signals by manually
installing/restoring a signal handler around send() calls. This doesn't
work for multithreaded apps, because the signal handlers are
per-process, not per-thread. Thus for multithreaded apps, the libpq
user is responsible for handling sigpipe. The API change shouldn't be a
big problem - the current system doesn't work, and there shouldn't be
many multithreaded apps.

But how should libpq notice that the caller handles sigpipe signals?
a) autodetection - if the sigpipe handler is not the default, then the
   caller knows what he's doing.
b) a new PGsetsignalhandler() function.
c) an additional flag passed to PGconnectdb.

Tom preferred a). One problem is that the autodetection is not perfect:
an app could block the signal with sigprocmask, or it could install a
handler that doesn't expect sigpipe signals from within libpq.
I would prefer b), because it guarantees that the patch has no effect
on existing apps.
c) is bad, Tom explained that the connect string is often directly
specified by the user.

--
Manfred
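As an illustration of option a), the following sketch queries the current SIGPIPE disposition without changing it and decides whether the library still has to protect its own send() calls. It follows the same idea as the POSIX branch of the helper in the patch, but the names inquire_handler() and library_must_guard_sigpipe() are invented here. As noted above, this check cannot see a signal that is merely blocked with sigprocmask, and it cannot tell whether an installed handler expects SIGPIPE from library code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*sig_handler_fn) (int);

/*
 * Return the handler currently installed for signo without modifying
 * it, or SIG_ERR if the query fails.  Hypothetical helper name.
 */
static sig_handler_fn
inquire_handler(int signo)
{
	struct sigaction oact;

	assert(signo > 0);
	if (sigaction(signo, NULL, &oact) != 0)
		return SIG_ERR;
	return oact.sa_handler;
}

/*
 * Autodetection rule from the discussion: if SIGPIPE is still at its
 * default (or the query failed), the caller has done nothing, so the
 * library must guard send() itself.  SIG_IGN or a custom handler is
 * taken to mean the caller knows what it is doing.
 */
static bool
library_must_guard_sigpipe(void)
{
	sig_handler_fn h = inquire_handler(SIGPIPE);

	return h == SIG_DFL || h == SIG_ERR;
}
```

An application that runs signal(SIGPIPE, SIG_IGN) before connecting would make library_must_guard_sigpipe() return false, so the per-send() handler switching could be skipped.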
Re: [PATCHES] SIGPIPE handling
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Is running the rest of the
> application with SIGPIPE <= SIG_IGN a problem?

That is NOT an acceptable thing for a library to do.

regards, tom lane
Re: [PATCHES] SRA Win32 sync() code
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> One reason I like the idea of adopting a sync-when-you-write policy is
>> that it eliminates the need for anything as messy as that.

> Yes, but can we do it without causing a performance degradation, and I
> would hate to change something to make things easier on Win32 while
> penalizing all platforms.

Having to keep a list of modified files in shared memory isn't a
penalty?

Seriously though, if we can move the bulk of the writing work into
background processes then I don't believe that there will be any
significant penalty for regular backends. And I believe that it would
be a huge advantage from a correctness point of view if we could stop
depending on sync(). The fact that Windows hasn't got sync() is merely
another reason we should stop using it.

regards, tom lane
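A sync-when-you-write policy can be pictured roughly as follows: whoever writes a block also forces that one file to disk, so nobody has to remember which files were touched and no global sync() is needed. This is only a sketch of the general idea under that assumption, not the backend's actual storage code. write_and_sync() and demo_write_and_sync() are made-up names, and the demo writes to a temporary file under /tmp.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Write the whole buffer to fd, then fsync() that descriptor.
 * Returns 0 on success, -1 on failure (errno is set).
 * Illustrative helper, not PostgreSQL code.
 */
static int
write_and_sync(int fd, const char *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	while (len > 0)
	{
		ssize_t		n = write(fd, buf, len);

		if (n < 0)
		{
			if (errno == EINTR)
				continue;		/* interrupted, retry */
			return -1;
		}
		buf += n;
		len -= (size_t) n;
	}
	/* Flush only this file; no system-wide sync() involved. */
	return fsync(fd);
}

/* Demo: write five bytes to a temporary file and sync it. */
static int
demo_write_and_sync(void)
{
	char		path[] = "/tmp/sync_demo_XXXXXX";
	int			fd = mkstemp(path);
	int			rc;

	if (fd < 0)
		return -1;
	rc = write_and_sync(fd, "hello", 5);
	close(fd);
	unlink(path);
	return rc;
}
```

The cost concern raised in the thread is visible here: every caller pays for an fsync(), which is why the proposal pairs this policy with moving most writes into background processes.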
Re: [PATCHES] SRA Win32 sync() code
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Not sure how we are going to do this in Win32, but somehow we will have
> > to record all open files between checkpoints in an area that the
> > checkpoint process can read during a checkpoint.
>
> One reason I like the idea of adopting a sync-when-you-write policy is
> that it eliminates the need for anything as messy as that.

Yes, but can we do it without causing a performance degradation, and I
would hate to change something to make things easier on Win32 while
penalizing all platforms.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                    |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Re: [PATCHES] SRA Win32 sync() code
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Not sure how we are going to do this in Win32, but somehow we will have
> to record all open files between checkpoints in an area that the
> checkpoint process can read during a checkpoint.

One reason I like the idea of adopting a sync-when-you-write policy is
that it eliminates the need for anything as messy as that.

regards, tom lane
Re: [PATCHES] SIGPIPE handling
Better. However, I am confused over when we do sigaction. I thought we
were going to do it only if they had a signal handler defined, meaning

    if (pipehandler != SIG_DFL &&
        pipehandler != SIG_IGN &&
        pipehandler != SIG_ERR)
        conn->do_sigaction = true;
    else
        conn->do_sigaction = false;

By doing this, we don't do sigaction in the default case where no
handler was defined. I thought we would just set the entire application
to SIGPIPE <= SIG_IGN. This gives us good performance in all cases
except when a signal handler is defined. Is running the rest of the
application with SIGPIPE <= SIG_IGN a problem?

However, the code patch is:

    if (pipehandler == SIG_DFL || pipehandler == SIG_ERR)
        conn->do_sigaction = true;
    else
        conn->do_sigaction = false;

This gives us good performance only if SIGPIPE <= SIG_IGN has been set
by the application or a sigaction function has been defined.

---

Manfred Spraul wrote:
> Hi,
>
> attached is an update of my automatic sigaction patch: I've moved the
> actual sigaction calls into pqsignal.c and added a helper function
> (pgsignalinquire(signo)). I couldn't remove the include from
> fe-connect.c: it's required for the SIGPIPE definition.
> Additionally I've added a -a flag for pgbench that sets the signal
> handler before calling PQconnectdb.
>
> Tested on Fedora Core 1 (Redhat Linux) with pgbench.
> > -- > Manfred > Index: src/interfaces/libpq/fe-connect.c > === > RCS file: /projects/cvsroot/pgsql-server/src/interfaces/libpq/fe-connect.c,v > retrieving revision 1.263 > diff -c -r1.263 fe-connect.c > *** src/interfaces/libpq/fe-connect.c 18 Oct 2003 05:02:06 - 1.263 > --- src/interfaces/libpq/fe-connect.c 16 Nov 2003 11:44:47 - > *** > *** 41,46 > --- 41,48 > #include > #endif > #include > + #include > + #include "pqsignal.h" > #endif > > #include "libpq/ip.h" > *** > *** 881,886 > --- 883,891 > struct addrinfo hint; > const char *node = NULL; > int ret; > + #ifndef WIN32 > + pqsigfunc pipehandler; > + #endif > > if (!conn) > return 0; > *** > *** 950,955 > --- 955,976 > conn->allow_ssl_try = false; > else if (conn->sslmode[0] == 'a') /* "allow" */ > conn->wait_ssl_try = true; > + #endif > + #ifndef WIN32 > + /* > + * Autodetect SIGPIPE signal handling: > + * The default action per Unix spec is kill current process and > + * that's not acceptable. If the current setting is not the default, > + * then assume that the caller knows what he's doing and leave the > + * signal handler unchanged. Otherwise set the signal handler to > + * SIG_IGN around each send() syscall. Unfortunately this is both > + * unreliable and slow for multithreaded apps. > + */ > + pipehandler = pqsignalinquire(SIGPIPE); > + if (pipehandler == SIG_DFL || pipehandler == SIG_ERR) > + conn->do_sigaction = true; > + else > + conn->do_sigaction = false; > #endif > > /* > Index: src/interfaces/libpq/fe-secure.c > === > RCS file: /projects/cvsroot/pgsql-server/src/interfaces/libpq/fe-secure.c,v > retrieving revision 1.32 > diff -c -r1.32 fe-secure.c > *** src/interfaces/libpq/fe-secure.c 29 Sep 2003 16:38:04 - 1.32 > --- src/interfaces/libpq/fe-secure.c 16 Nov 2003 11:44:47 - > *** > *** 348,354 > ssize_t n; > > #ifndef WIN32 > ! pqsigfunc oldsighandler = pqsignal(SIGPIPE, SIG_IGN); > #endif > > #ifdef USE_SSL > --- 348,357 > ssize_t n; > > #ifndef WIN32 > ! pqsigfunc oldsighandler = NULL; > ! 
> ! if (conn->do_sigaction) > ! oldsighandler = pqsignal(SIGPIPE, SIG_IGN); > #endif > > #ifdef USE_SSL > *** > *** 408,414 > n = send(conn->sock, ptr, len, 0); > > #ifndef WIN32 > ! pqsignal(SIGPIPE, oldsighandler); > #endif > > return n; > --- 411,418 > n = send(conn->sock, ptr, len, 0); > > #ifndef WIN32 > ! if (conn->do_sigaction) > ! pqsignal(SIGPIPE, oldsighandler); > #endif > > return n; > Index: src/interfaces/libpq/libpq-int.h > === > RCS file: /projects/cvsroot/pgsql-server/src/interfaces/libpq/libpq-int.h,v > retrieving revision 1.82 > diff -c -r1.82 libpq-int.h > *** src/interfaces/libpq/libpq-int.h 5 Sep 2003 02:08:36 - 1.82 > --- src/interfaces/libpq/libpq-int.h 16 Nov 2003 11:44:48 - > *
Re: [PATCHES] ALTER TABLE modifications
Rod Taylor wrote on Sat, 08.11.2003 at 18:55:
> A general re-organization of Alter Table. Node wise, it is a
> AlterTableStmt with a list of AlterTableCmds. The Cmds are the
> individual actions to be completed (Add constraint, drop constraint, add
> column, etc.)
>
> Processing is done in 2 phases. The first phase updates the system
> catalogs and creates a work queue for the table scan. The second phase
> is to conduct the actual table scan evaluating all constraints and other
> per tuple processing simultaneously, as required. This has no effect on
> single step operations, but has a large benefit for combinational logic
> where multiple table scans would otherwise be required.
...
> ALTER TABLE tab ALTER COLUMN col TYPE text TRANSFORM ...;
> Currently migrates indexes, check constraints, defaults, and the
> column definition to the new type with optional transform. If
> the transform is not supplied, a standard assignment cast is
> attempted.

Do you have special cases for type changes which don't need data
transforms?

I mean things like changing VARCHAR(10) to VARCHAR(20), dropping the
NOT NULL constraint or changing CHECK A < 3 to CHECK A < 4.

All these could be done with no data migration or extra checking.

So how much of it should PG attempt to detect automatically, and should
there be a NOSCAN option for when the programmer knows better (changing
CHECK ABS(A) < 3 into CHECK 9 > (A*A))?

Hannu
[PATCHES] SIGPIPE handling
Hi, attached is an update of my automatic sigaction patch: I've moved the actual sigaction calls into pqsignal.c and added a helper function (pgsignalinquire(signo)). I couldn't remove the include from fe-connect.c: it's required for the SIGPIPE definition. Additionally I've added a -a flag for pgbench that sets the signal handler before calling PQconnectdb. Tested on Fedora Core 1 (Redhat Linux) with pgbench. -- Manfred Index: src/interfaces/libpq/fe-connect.c === RCS file: /projects/cvsroot/pgsql-server/src/interfaces/libpq/fe-connect.c,v retrieving revision 1.263 diff -c -r1.263 fe-connect.c *** src/interfaces/libpq/fe-connect.c 18 Oct 2003 05:02:06 - 1.263 --- src/interfaces/libpq/fe-connect.c 16 Nov 2003 11:44:47 - *** *** 41,46 --- 41,48 #include #endif #include + #include + #include "pqsignal.h" #endif #include "libpq/ip.h" *** *** 881,886 --- 883,891 struct addrinfo hint; const char *node = NULL; int ret; + #ifndef WIN32 + pqsigfunc pipehandler; + #endif if (!conn) return 0; *** *** 950,955 --- 955,976 conn->allow_ssl_try = false; else if (conn->sslmode[0] == 'a') /* "allow" */ conn->wait_ssl_try = true; + #endif + #ifndef WIN32 + /* +* Autodetect SIGPIPE signal handling: +* The default action per Unix spec is kill current process and +* that's not acceptable. If the current setting is not the default, +* then assume that the caller knows what he's doing and leave the +* signal handler unchanged. Otherwise set the signal handler to +* SIG_IGN around each send() syscall. Unfortunately this is both +* unreliable and slow for multithreaded apps. 
+*/ + pipehandler = pqsignalinquire(SIGPIPE); + if (pipehandler == SIG_DFL || pipehandler == SIG_ERR) + conn->do_sigaction = true; + else + conn->do_sigaction = false; #endif /* Index: src/interfaces/libpq/fe-secure.c === RCS file: /projects/cvsroot/pgsql-server/src/interfaces/libpq/fe-secure.c,v retrieving revision 1.32 diff -c -r1.32 fe-secure.c *** src/interfaces/libpq/fe-secure.c29 Sep 2003 16:38:04 - 1.32 --- src/interfaces/libpq/fe-secure.c16 Nov 2003 11:44:47 - *** *** 348,354 ssize_t n; #ifndef WIN32 ! pqsigfunc oldsighandler = pqsignal(SIGPIPE, SIG_IGN); #endif #ifdef USE_SSL --- 348,357 ssize_t n; #ifndef WIN32 ! pqsigfunc oldsighandler = NULL; ! ! if (conn->do_sigaction) ! oldsighandler = pqsignal(SIGPIPE, SIG_IGN); #endif #ifdef USE_SSL *** *** 408,414 n = send(conn->sock, ptr, len, 0); #ifndef WIN32 ! pqsignal(SIGPIPE, oldsighandler); #endif return n; --- 411,418 n = send(conn->sock, ptr, len, 0); #ifndef WIN32 ! if (conn->do_sigaction) ! pqsignal(SIGPIPE, oldsighandler); #endif return n; Index: src/interfaces/libpq/libpq-int.h === RCS file: /projects/cvsroot/pgsql-server/src/interfaces/libpq/libpq-int.h,v retrieving revision 1.82 diff -c -r1.82 libpq-int.h *** src/interfaces/libpq/libpq-int.h5 Sep 2003 02:08:36 - 1.82 --- src/interfaces/libpq/libpq-int.h16 Nov 2003 11:44:48 - *** *** 329,334 --- 329,337 charpeer_dn[256 + 1]; /* peer distinguished name */ charpeer_cn[SM_USER + 1]; /* peer common name */ #endif + #ifndef WIN32 + booldo_sigaction; /* set SIGPIPE to SIG_IGN around every send() call */ + #endif /* Buffer for current error message */ PQExpBufferData errorMessage; /* expansible string */ Index: src/interfaces/libpq/pqsignal.c === RCS file: /projects/cvsroot/pgsql-server/src/interfaces/libpq/pqsignal.c,v retrieving revision 1.17 diff -c -r1.17 pqsignal.c *** src/interfaces/libpq/pqsignal.c 4 Aug 2003 02:40:20 - 1.17 --- src/interfaces/libpq/pqsignal.c 16 Nov 2003 11:44:48 - *** *** 40,42 --- 40,61 return oact.sa_handler; #endif /* 
!HAVE_POSIX_SIGNALS */ } + + pqsigfunc + pqsignalinquire(int signo) + { + #if !defined(HAVE_POSIX_SIGNALS) + pqsigfunc old; + old = signal(SIGPIPE, SIG_IGN); + signal(SIGPIPE, old); + return old; + #else + struct sigaction oact; + + if (sigaction(SIGPIPE, NULL, &oact) != 0) + return SIG_ERR; +
Re: [PATCHES] improve overcommit docs
At the time I wrote the original 2.6 was not out even in prerelease,
which is why I was deliberately somewhat vague about it. It is still in
prerelease, and it will in fact work slightly differently from what was
in some 2.4 kernels - there are 2 settings that govern this instead of
1.

Here is the 2.6 description straight from
linux-2.6.0-test9/Documentation/vm/overcommit-accounting:

---
The Linux kernel supports three overcommit handling modes

0 - Heuristic overcommit handling. Obvious overcommits of
    address space are refused. Used for a typical system. It
    ensures a seriously wild allocation fails while allowing
    overcommit to reduce swap usage. root is allowed to
    allocate slightly more memory in this mode. This is the
    default.

1 - No overcommit handling. Appropriate for some scientific
    applications.

2 - (NEW) strict overcommit. The total address space commit
    for the system is not permitted to exceed swap + a
    configurable percentage (default is 50) of physical RAM.
    Depending on the percentage you use, in most situations
    this means a process will not be killed while accessing
    pages but will receive errors on memory allocation as
    appropriate.

The overcommit policy is set via the sysctl `vm.overcommit_memory'.

The overcommit percentage is set via `vm.overcommit_ratio'.
---

Also note that this is wrong for 2.4:

  run the machine out of memory. If your kernel supports the strict
! paranoid modes of overcommit handling, you can also relieve this

There are 2 modes: strict (allow commit up to sizeof(swap plus 1/2
RAM)) and paranoid (allow commit up to sizeof(swap)).

Wordsmith it however you like

cheers

andrew

Neil Conway wrote:

This patch makes some improvements to the section of the documentation
that describes the Linux 2.4 memory overcommit behavior. I removed the
almost content-free assertion that "You will need enough swap space to
cover your memory needs." If this is intended to communicate anything
meaningful, can someone rephrase it, please?
This patch also includes a fix for a typo noticed by Robert Treat. Is this suitable for 7.4 (either the whole patch, or just the typo fix)? -Neil Index: doc/src/sgml/runtime.sgml === RCS file: /var/lib/cvs/pgsql-server/doc/src/sgml/runtime.sgml,v retrieving revision 1.218 diff -c -r1.218 runtime.sgml *** doc/src/sgml/runtime.sgml 14 Nov 2003 15:43:22 - 1.218 --- doc/src/sgml/runtime.sgml 16 Nov 2003 02:07:42 - *** *** 1294,1300 Unfortunately, there is no well-defined method for determining ideal values for the family of cost variables that ! below. You are encouraged to experiment and share your findings. --- 1294,1300 Unfortunately, there is no well-defined method for determining ideal values for the family of cost variables that ! appear below. You are encouraged to experiment and share your findings. *** *** 3267,3301 Linux Memory Overcommit ! Linux kernels of version 2.4.* have a poor default memory ! overcommit behavior, which can result in the PostgreSQL server ! (postmaster process) being killed by the ! kernel if the memory demands of another process cause the system ! to run out of memory. ! If this happens, you will see a kernel message looking like this ! (consult your system documentation and configuration on where to ! look for such a message): Out of Memory: Killed process 12345 (postmaster). ! And, of course, you will find that your database server has ! disappeared. To avoid this situation, run PostgreSQL on a machine where you can be sure that other processes will not run the machine out of memory. If your kernel supports the strict ! and/or paranoid modes of overcommit handling, you can also relieve ! this problem by altering the system's default behaviour. This can ! be determined by examining the function ! vm_enough_memory in the file mm/mmap.c ! in the kernel source. If this file reveals that the strict and/or ! paranoid modes are supported by your kernel, turn one of these ! 
modes on by using sysctl -w vm.overcommit_memory=2 --- 3267,3302 Linux Memory Overcommit ! In Linux 2.4, the default virtual memory configuration is not ! optimal for PostgreSQL. Because of the ! way that the kernel implements memory overcommit, the kernel may ! terminate the PostgreSQL server (the ! postmaster process) if the