Re: [PATCHES] ecpg threading vs win32
On Mon, Mar 19, 2007 at 09:33:54AM +0900, ITAGAKI Takahiro wrote: Magnus Hagander [EMAIL PROTECTED] wrote: This patch replaces the pthreads code in ecpg with native win32 threads, in order to make it threadsafe. The idea is not to have to download the non-standard pthreads library on windows. Does it seem like it should be doing the right thing? Does somebody have a good test-case where ecpg breaks when not built thread-safe? (which would then also break when built thread-safe with a broken implementation) I have two questions about thread-safe ecpg. Q1. Don't you use CRITICAL_SECTION instead of Mutex (CreateMutex)? I've heard there is a performance benefit in CRITICAL_SECTION. If the mutex is shared only in one process, CS might be a better solution. http://japan.internet.com/developer/img/article/873/17801.gif http://world.std.com/~jmhart/csmutx.htm Yes, CS can be slightly faster. Though under this use-pattern, I think it will not make a measurable difference at all. The reason I went with Mutex is that I wanted it to be as similar as possible to the pthreads code. Q2. Do we need to use PQescapeStringConn() instead of PQescapeString()? PQescapeString() is used to escape literals, and the documentation says PQescapeStringConn() should be used in multi-threaded client programs. http://momjian.us/main/writings/pgsql/sgml/libpq-exec.html#LIBPQ-EXEC-ESCAPE-STRING | PQescapeString can be used safely in single-threaded client programs | that work with only one PostgreSQL connection at a time Seems so, but that's unrelated to this patch ;-) I'll leave the final comment on that up to Michael. //Magnus ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[PATCHES] Win32 shmem
Attached is a first attempt at a native win32 shmem implementatio, based on previous discussions. Instead of emulating the sysv stuff. The base stuff (using mapped pagefile) is still the same, of course. Needs more testing, but has survived my tests so far. And unlike the previous implementation, it does properly refuse to start a second postmaster in a data directory where there is one or more backends still running. Does it seem like I've overlooked anything obvious in this? I do get the feeling that this is too simple, but I don't know exactly where the problem is :-) //Magnus /*- * * win32_shmem.c * Implement shared memory using win32 facilities * * Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California * * IDENTIFICATION * $PostgreSQL$ * *- */ #include postgres.h #include miscadmin.h #include storage/ipc.h #include storage/pg_shmem.h typedef HANDLE IpcMemoryKey; /* shared memory key passed to shmget(2) */ unsigned long UsedShmemSegID = 0; void *UsedShmemSegAddr = NULL; /* * Generate shared memory segment name. Expand the data directory, to generate * an identifier unique for this data directory. Then replace all backslashes * with forward slashes, since backslashes aren't permitted in global object names. * * XXX: What happens with junctions? It's only someone breaking things on purpose, * and this is still better than before, but we might want to do something about * that sometime in the future. */ static char * GetShareMemName(void) { char *retptr; DWORD bufsize; bufsize = GetFullPathName(DataDir, 0, NULL, NULL); if (bufsize 0) { retptr = malloc(bufsize+1+11); // 1 NULL and 11 for PostgreSQL if (retptr) { DWORD r; strcpy(retptr,PostgreSQL:); r = GetFullPathName(DataDir, bufsize, retptr+11, NULL); if (r = bufsize r != 0) { char *cp; for (cp = retptr; *cp; cp++) if (*cp == '\\') *cp = '/'; return retptr; } } } elog(FATAL, could not generate full pathname for datadir %s: %lu, DataDir, GetLastError()); /* Silence compiler */ return NULL; } /* * PGSharedMemoryIsInUse * * Is a previously-existing shmem segment still existing and in use? * * The point of this exercise is to detect the case where a prior postmaster * crashed, but it left child backends that are still running. Therefore * we only care about shmem segments that are associated with the intended * DataDir. This is an important consideration since accidental matches of * shmem segment IDs are reasonably common. * */ bool PGSharedMemoryIsInUse(unsigned long id1, unsigned long id2) { char *szShareMem; HANDLE hmap; szShareMem = GetShareMemName(); printf(Attempting duplicate check on '%s'\n, szShareMem); hmap = OpenFileMapping(FILE_MAP_READ, FALSE, szShareMem); free(szShareMem); if (hmap == NULL) return false; CloseHandle(hmap); return true; } /* * PGSharedMemoryCreate * * Create a shared memory segment of the given size and initialize its * standard header. * * makePrivate means to always create a new segment, rather than attach to * or recycle any existing segment. On win32, we always create a new segment, * since there is no need for recycling (segments go away automatically * when the last backend exits) * */ PGShmemHeader * PGSharedMemoryCreate(Size size, bool makePrivate, int port) { void *memAddress; PGShmemHeader *hdr; HANDLE hmap, hmap2; char *szShareMem; /* Room for a header? */ Assert(size MAXALIGN(sizeof(PGShmemHeader))); szShareMem = GetShareMemName(); /* Make sure PGSharedMemoryAttach doesn't fail without need */ UsedShmemSegAddr = NULL; hmap = CreateFileMapping((HANDLE) 0x, /* Use the swap file */ NULL, PAGE_READWRITE, /* Memory is Read/Write */ 0L, /* Size Upper 32 Bits */ (DWORD) size, /* Size Lower 32 bits */ szShareMem); if (!hmap) ereport(FATAL, (errmsg(could not create shared memory segment: %lu, GetLastError()), errdetail(Failed system call was CreateFileMapping(size=%lu, name=%s), size, szShareMem))); if (GetLastError() == ERROR_ALREADY_EXISTS) ereport(FATAL, (errmsg(pre-existing shared memory block is still in use), errhint(Check if there are any old server processes still running, and terminate them.))); /* * Make the handle inheritable */ if (!DuplicateHandle(GetCurrentProcess(), hmap, GetCurrentProcess(), hmap2, 0, TRUE, DUPLICATE_SAME_ACCESS)) ereport(FATAL, (errmsg(could not create shared memory segment: %lu, GetLastError()), errdetail(Failed system call was DuplicateHandle))); CloseHandle(hmap); memAddress = MapViewOfFileEx(hmap2, FILE_MAP_WRITE | FILE_MAP_READ, 0, 0, 0, NULL); if (!memAddress) {
[PATCHES] bgwriter stats
I want to be able to pull some stats out of the bgwriter to be able to track things. One thing is the total number of buffers written out. Other things are the number of checkpoints and such. Anyway. Attached patch adds this to the bgwriter shared memory. Is it safe to do this, and then just have a regular function running in a normal backend pulling out the value and returning it to the user, without locking? Given that only the bgwriter can write to it? Patch of course entirely incomplete, just illustrating the main approach I've been thinking of taking. Just want to get this question asked and answered before I go ahead and code more... //Magnus Index: src/backend/postmaster/bgwriter.c === RCS file: /cvsroot/pgsql/src/backend/postmaster/bgwriter.c,v retrieving revision 1.36 diff -c -r1.36 bgwriter.c *** src/backend/postmaster/bgwriter.c 17 Jan 2007 16:25:01 - 1.36 --- src/backend/postmaster/bgwriter.c 19 Mar 2007 19:38:27 - *** *** 119,124 --- 119,125 int num_requests; /* current # of requests */ int max_requests; /* allocated array size */ + int64 buffers_written; /* number of buffers written */ BgWriterRequest requests[1];/* VARIABLE LENGTH ARRAY */ } BgWriterShmemStruct; *** *** 427,433 last_checkpoint_time = now; } else ! BgBufferSync(); /* * Check for archive_timeout, if so, switch xlog files. First we do a --- 428,434 last_checkpoint_time = now; } else ! BgWriterShmem-buffers_written += BgBufferSync(); /* * Check for archive_timeout, if so, switch xlog files. First we do a Index: src/backend/storage/buffer/bufmgr.c === RCS file: /cvsroot/pgsql/src/backend/storage/buffer/bufmgr.c,v retrieving revision 1.215 diff -c -r1.215 bufmgr.c *** src/backend/storage/buffer/bufmgr.c 1 Feb 2007 19:10:27 - 1.215 --- src/backend/storage/buffer/bufmgr.c 19 Mar 2007 19:38:27 - *** *** 986,998 * * This is called periodically by the background writer process. */ ! void BgBufferSync(void) { static int buf_id1 = 0; int buf_id2; int num_to_scan; int num_written; /* Make sure we can handle the pin inside SyncOneBuffer */ ResourceOwnerEnlargeBuffers(CurrentResourceOwner); --- 986,999 * * This is called periodically by the background writer process. */ ! int BgBufferSync(void) { static int buf_id1 = 0; int buf_id2; int num_to_scan; int num_written; + int total_written = 0; /* Make sure we can handle the pin inside SyncOneBuffer */ ResourceOwnerEnlargeBuffers(CurrentResourceOwner); *** *** 1030,1035 --- 1031,1037 break; } } + total_written += num_written; } /* *** *** 1053,1059 --- 1055,1064 if (++buf_id2 = NBuffers) buf_id2 = 0; } + total_written += num_written; } + + return total_written; } /* Index: src/include/storage/bufmgr.h === RCS file: /cvsroot/pgsql/src/include/storage/bufmgr.h,v retrieving revision 1.102 diff -c -r1.102 bufmgr.h *** src/include/storage/bufmgr.h5 Jan 2007 22:19:57 - 1.102 --- src/include/storage/bufmgr.h19 Mar 2007 19:38:28 - *** *** 151,157 extern void BufmgrCommit(void); extern void BufferSync(void); ! extern void BgBufferSync(void); extern void AtProcExit_LocalBuffers(void); --- 151,157 extern void BufmgrCommit(void); extern void BufferSync(void); ! extern int BgBufferSync(void); extern void AtProcExit_LocalBuffers(void); ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] bgwriter stats
Magnus Hagander wrote: Anyway. Attached patch adds this to the bgwriter shared memory. Is it safe to do this, and then just have a regular function running in a normal backend pulling out the value and returning it to the user, without locking? If the variable is an int64, I don't believe so: the architecture might not implement atomic read/writes of int64 values. -Neil ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PATCHES] bgwriter stats
Neil Conway wrote: Magnus Hagander wrote: Anyway. Attached patch adds this to the bgwriter shared memory. Is it safe to do this, and then just have a regular function running in a normal backend pulling out the value and returning it to the user, without locking? If the variable is an int64, I don't believe so: the architecture might not implement atomic read/writes of int64 values. Ok. But it should be safe if it's int32? Actually, since it's just statistics data, it wouldn't be a problem that it's not atomic, I think. If we really unlucky, we'll get the wrong value once. But most systems that poll such statistics can deal with that, in my experience. Then again, a normal int shouldn't be a problem either. //Magnus ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] bgwriter stats
Magnus Hagander wrote: Ok. But it should be safe if it's int32? You should probably use sig_atomic_t, to be safe. Although I believe that read/writes to int are atomic on most platforms, in any case. Actually, since it's just statistics data, it wouldn't be a problem that it's not atomic, I think. If we really unlucky, we'll get the wrong value once. I don't think that's the right attitude to take, at all. Why not just use a lock? It's not like the overhead will be noticeable. Alternatively, you can get a consistent read from an int64 variable using a sig_atomic_t counter, with a little thought. Off the top of my head, something like the following should work: have the writer increment the sig_atomic_t counter, adjust the int64 stats value, and then increment the sig_atomic_t again. Have the reader save a local copy of the sig_atomic_t counter aside, then read from the int64 counter, and then recheck the sig_atomic_t counter. Repeat until the local pre-read and post-read snapshots of the sig_atomic_t counter are identical. -Neil ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] bgwriter stats
Neil Conway wrote: Magnus Hagander wrote: Ok. But it should be safe if it's int32? You should probably use sig_atomic_t, to be safe. Although I believe that read/writes to int are atomic on most platforms, in any case. Ok. That's an easy enough change. Actually, since it's just statistics data, it wouldn't be a problem that it's not atomic, I think. If we really unlucky, we'll get the wrong value once. I don't think that's the right attitude to take, at all. Why not just use a lock? It's not like the overhead will be noticeable. Probably, but none of the other code appears to take a lock out on it :) Alternatively, you can get a consistent read from an int64 variable using a sig_atomic_t counter, with a little thought. Off the top of my head, something like the following should work: have the writer increment the sig_atomic_t counter, adjust the int64 stats value, and then increment the sig_atomic_t again. Have the reader save a local copy of the sig_atomic_t counter aside, then read from the int64 counter, and then recheck the sig_atomic_t counter. Repeat until the local pre-read and post-read snapshots of the sig_atomic_t counter are identical. Thinking more about it, I think that's unnecessary. 32 bits is quite enough - if you're graphing it (for example), those tools deal with wraps already. They're usually mdae to deal with things like number of bytes on a router interface, which is certainly 32 bit a lot faster than us. But I'll take note of that for some time when I actually *need* a 64-bit value.- //Magnus ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] bgwriter stats
Magnus Hagander [EMAIL PROTECTED] writes: I want to be able to pull some stats out of the bgwriter to be able to track things. One thing is the total number of buffers written out. Other things are the number of checkpoints and such. Anyway. Attached patch adds this to the bgwriter shared memory. Is it safe to do this, and then just have a regular function running in a normal backend pulling out the value and returning it to the user, without locking? Given that only the bgwriter can write to it? This seems quite a bizarre way to do things. Why wouldn't you implement this functionality by shipping messages to the stats collector? regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] bgwriter stats
Magnus Hagander [EMAIL PROTECTED] writes: Neil Conway wrote: I don't think that's the right attitude to take, at all. Why not just use a lock? It's not like the overhead will be noticeable. Probably, but none of the other code appears to take a lock out on it :) Huh? It doesn't use a lock for touching the checkpoint counters, but that's OK because they're sig_atomic_t. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] Win32 shmem
Magnus Hagander [EMAIL PROTECTED] writes: Attached is a first attempt at a native win32 shmem implementatio, based on previous discussions. Instead of emulating the sysv stuff. The base stuff (using mapped pagefile) is still the same, of course. Needs more testing, but has survived my tests so far. And unlike the previous implementation, it does properly refuse to start a second postmaster in a data directory where there is one or more backends still running. That's good... Does it seem like I've overlooked anything obvious in this? I do get the feeling that this is too simple, but I don't know exactly where the problem is :-) I think you do still need the on_shmem_exit detach callback. Consider the situation where the postmaster is trying to reinitialize after a child crash. The Unix code is designed to detach and destroy the old segment then create a new one. If that's not what you want to do then this code still seems not right. The coding of GetShareMemName seems involuted. I'd normally expect success exit to be at the bottom of the routine, not deeply nested like this. You may be saving one elog(FATAL) call this way, but I'd argue that the two cases could do with different message texts anyway. (Also, GetSharedMemName seems to read better.) There seem to be a lot of system calls not checked for failure here. Do they really not have any failure possibilities? errdetail(Failed system call was MapViewOfFileEx, size, szShareMem))); Doesn't your compiler validate sprintf arguments? I trust the committed file isn't going to have DOS line endings. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PATCHES] bgwriter stats
On Monday 19 March 2007 15:32, Tom Lane wrote: Magnus Hagander [EMAIL PROTECTED] writes: I want to be able to pull some stats out of the bgwriter to be able to track things. One thing is the total number of buffers written out. Other things are the number of checkpoints and such. Anyway. Attached patch adds this to the bgwriter shared memory. Is it safe to do this, and then just have a regular function running in a normal backend pulling out the value and returning it to the user, without locking? Given that only the bgwriter can write to it? This seems quite a bizarre way to do things. Why wouldn't you implement this functionality by shipping messages to the stats collector? I'm with Tom on this one.. All of our current stats are done via the stats collector, we should continue that way. While we are on the subject of stats, does anybody else feel there is merrit in haveing block level writes tracked on a relation by relation bases? regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PATCHES] bgwriter stats
Tom Lane wrote: This seems quite a bizarre way to do things. Why wouldn't you implement this functionality by shipping messages to the stats collector? Would that have any benefits over the shmem approach? -Neil ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] bgwriter stats
Neil Conway [EMAIL PROTECTED] writes: Tom Lane wrote: This seems quite a bizarre way to do things. Why wouldn't you implement this functionality by shipping messages to the stats collector? Would that have any benefits over the shmem approach? Well, for one thing, it would fit naturally into the existing stats structure instead of being a wart on the side. The problem of atomic access to an int64 would go away, yet we'd still be able to keep a running int64 total of the reports. You wouldn't lose the total over a shutdown/restart. The value would obey the transactional-snapshot rules we've established for stats output, making it safe to try to correlate it with other stats. Probably a few other things I'm not thinking of... regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings