Re: [PATCHES] ecpg threading vs win32

2007-03-19 Thread Magnus Hagander
On Mon, Mar 19, 2007 at 09:33:54AM +0900, ITAGAKI Takahiro wrote:
 
 Magnus Hagander [EMAIL PROTECTED] wrote:
 
  This patch replaces the pthreads code in ecpg with native win32 threads,
  in order to make it threadsafe. The idea is not to have to download the
  non-standard pthreads library on windows.
  
  Does it seem like it should be doing the right thing? Does somebody have
  a good test-case where ecpg breaks when not built thread-safe? (which
  would then also break when built thread-safe with a broken implementation)
 
 I have two questions about thread-safe ecpg.
 
 Q1. Don't you use CRITICAL_SECTION instead of Mutex (CreateMutex)?
I've heard there is a performance benefit in CRITICAL_SECTION.
If the mutex is shared only in one process, CS might be a better solution.
 http://japan.internet.com/developer/img/article/873/17801.gif
 http://world.std.com/~jmhart/csmutx.htm

Yes, CS can be slightly faster. Though under this use-pattern, I think
it will not make a measurable difference at all.
The reason I went with Mutex is that I wanted it to be as similar as
possible to the pthreads code.

 Q2. Do we need to use PQescapeStringConn() instead of PQescapeString()?
PQescapeString() is used to escape literals, and the documentation says
PQescapeStringConn() should be used in multi-threaded client programs.
 
 http://momjian.us/main/writings/pgsql/sgml/libpq-exec.html#LIBPQ-EXEC-ESCAPE-STRING
| PQescapeString can be used safely in single-threaded client programs
| that work with only one PostgreSQL connection at a time 

Seems so, but that's unrelated to this patch ;-) I'll leave the final
comment on that up to Michael.

//Magnus


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[PATCHES] Win32 shmem

2007-03-19 Thread Magnus Hagander
Attached is a first attempt at a native win32 shmem implementatio,
based on previous discussions. Instead of emulating the sysv stuff. The
base stuff (using mapped pagefile) is still the same, of course.

Needs more testing, but has survived my tests so far. And unlike the
previous implementation, it does properly refuse to start a second
postmaster in a data directory where there is one or more backends still
running.

Does it seem like I've overlooked anything obvious in this? I do get the
feeling that this is too simple, but I don't know exactly where the
problem is :-)

//Magnus

/*-
 *
 * win32_shmem.c
 *	  Implement shared memory using win32 facilities
 *
 * Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 * IDENTIFICATION
 *	  $PostgreSQL$
 *
 *-
 */
#include postgres.h

#include miscadmin.h
#include storage/ipc.h
#include storage/pg_shmem.h

typedef HANDLE IpcMemoryKey;		/* shared memory key passed to shmget(2) */

unsigned long UsedShmemSegID = 0;
void	   *UsedShmemSegAddr = NULL;


/*
 * Generate shared memory segment name. Expand the data directory, to generate
 * an identifier unique for this data directory. Then replace all backslashes
 * with forward slashes, since backslashes aren't permitted in global object names.
 *
 * XXX: What happens with junctions? It's only someone breaking things on purpose,
 *  and this is still better than before, but we might want to do something about
 *  that sometime in the future.
 */
static char *
GetShareMemName(void)
{
	char	   *retptr;
	DWORD		bufsize;

	bufsize = GetFullPathName(DataDir, 0, NULL, NULL);
	if (bufsize  0)
	{
		retptr = malloc(bufsize+1+11); // 1 NULL and 11 for PostgreSQL
		if (retptr)
		{
			DWORD r;
			
			strcpy(retptr,PostgreSQL:);
			r = GetFullPathName(DataDir, bufsize, retptr+11, NULL);
			if (r = bufsize  r != 0)
			{
char *cp;

for (cp = retptr; *cp; cp++)
	if (*cp == '\\')
		*cp = '/';
return retptr;
			}
		}
	}
	elog(FATAL, could not generate full pathname for datadir %s: %lu,
		 DataDir, GetLastError());
	
	/* Silence compiler */
	return NULL;
}


/*
 * PGSharedMemoryIsInUse
 *
 * Is a previously-existing shmem segment still existing and in use?
 *
 * The point of this exercise is to detect the case where a prior postmaster
 * crashed, but it left child backends that are still running.	Therefore
 * we only care about shmem segments that are associated with the intended
 * DataDir.  This is an important consideration since accidental matches of
 * shmem segment IDs are reasonably common.
 *
 */
bool
PGSharedMemoryIsInUse(unsigned long id1, unsigned long id2)
{
	char	   *szShareMem;
	HANDLE		hmap;

	szShareMem = GetShareMemName();

	printf(Attempting duplicate check on '%s'\n, szShareMem);
	hmap = OpenFileMapping(FILE_MAP_READ, FALSE, szShareMem);
	free(szShareMem);

	if (hmap == NULL)
		return false;

	CloseHandle(hmap);
	return true;
}


/*
 * PGSharedMemoryCreate
 *
 * Create a shared memory segment of the given size and initialize its
 * standard header.  
 *
 * makePrivate means to always create a new segment, rather than attach to
 * or recycle any existing segment. On win32, we always create a new segment,
 * since there is no need for recycling (segments go away automatically
 * when the last backend exits)
 *
 */
PGShmemHeader *
PGSharedMemoryCreate(Size size, bool makePrivate, int port)
{
	void	   *memAddress;
	PGShmemHeader *hdr;
	HANDLE hmap, hmap2;
	char	  *szShareMem;

	/* Room for a header? */
	Assert(size  MAXALIGN(sizeof(PGShmemHeader)));

	szShareMem = GetShareMemName();

	/* Make sure PGSharedMemoryAttach doesn't fail without need */
	UsedShmemSegAddr = NULL;

	hmap = CreateFileMapping((HANDLE) 0x,	/* Use the swap file	*/
			 NULL,
			 PAGE_READWRITE,		/* Memory is Read/Write */
			 0L,	/* Size Upper 32 Bits	*/
			 (DWORD) size,		/* Size Lower 32 bits */
			 szShareMem);
	if (!hmap)
		ereport(FATAL,
(errmsg(could not create shared memory segment: %lu, GetLastError()),
errdetail(Failed system call was CreateFileMapping(size=%lu, name=%s), size, szShareMem)));
	if (GetLastError() == ERROR_ALREADY_EXISTS)
		ereport(FATAL,
 (errmsg(pre-existing shared memory block is still in use),
 errhint(Check if there are any old server processes still running, and terminate them.)));

	/*
	 * Make the handle inheritable
	 */
	if (!DuplicateHandle(GetCurrentProcess(), hmap, GetCurrentProcess(), hmap2, 0, TRUE, DUPLICATE_SAME_ACCESS))
		ereport(FATAL,
(errmsg(could not create shared memory segment: %lu, GetLastError()),
errdetail(Failed system call was DuplicateHandle)));

	CloseHandle(hmap);

	memAddress = MapViewOfFileEx(hmap2, FILE_MAP_WRITE | FILE_MAP_READ, 0, 0, 0, NULL);
	if (!memAddress)
	{
		

[PATCHES] bgwriter stats

2007-03-19 Thread Magnus Hagander
I want to be able to pull some stats out of the bgwriter to be able to
track things. One thing is the total number of buffers written out.
Other things are the number of checkpoints and such.

Anyway. Attached patch adds this to the bgwriter shared memory. Is it
safe to do this, and then just have a regular function running in a
normal backend pulling out the value and returning it to the user,
without locking? Given that only the bgwriter can write to it?

Patch of course entirely incomplete, just illustrating the main approach
I've been thinking of taking. Just want to get this question asked and
answered before I go ahead and code more...

//Magnus

Index: src/backend/postmaster/bgwriter.c
===
RCS file: /cvsroot/pgsql/src/backend/postmaster/bgwriter.c,v
retrieving revision 1.36
diff -c -r1.36 bgwriter.c
*** src/backend/postmaster/bgwriter.c   17 Jan 2007 16:25:01 -  1.36
--- src/backend/postmaster/bgwriter.c   19 Mar 2007 19:38:27 -
***
*** 119,124 
--- 119,125 
  
int num_requests;   /* current # of requests */
int max_requests;   /* allocated array size */
+   int64   buffers_written; /* number of buffers written */
BgWriterRequest requests[1];/* VARIABLE LENGTH ARRAY */
  } BgWriterShmemStruct;
  
***
*** 427,433 
last_checkpoint_time = now;
}
else
!   BgBufferSync();
  
/*
 * Check for archive_timeout, if so, switch xlog files.  First 
we do a
--- 428,434 
last_checkpoint_time = now;
}
else
!   BgWriterShmem-buffers_written += BgBufferSync();
  
/*
 * Check for archive_timeout, if so, switch xlog files.  First 
we do a
Index: src/backend/storage/buffer/bufmgr.c
===
RCS file: /cvsroot/pgsql/src/backend/storage/buffer/bufmgr.c,v
retrieving revision 1.215
diff -c -r1.215 bufmgr.c
*** src/backend/storage/buffer/bufmgr.c 1 Feb 2007 19:10:27 -   1.215
--- src/backend/storage/buffer/bufmgr.c 19 Mar 2007 19:38:27 -
***
*** 986,998 
   *
   * This is called periodically by the background writer process.
   */
! void
  BgBufferSync(void)
  {
static int  buf_id1 = 0;
int buf_id2;
int num_to_scan;
int num_written;
  
/* Make sure we can handle the pin inside SyncOneBuffer */
ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
--- 986,999 
   *
   * This is called periodically by the background writer process.
   */
! int
  BgBufferSync(void)
  {
static int  buf_id1 = 0;
int buf_id2;
int num_to_scan;
int num_written;
+   int total_written = 0;
  
/* Make sure we can handle the pin inside SyncOneBuffer */
ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
***
*** 1030,1035 
--- 1031,1037 
break;
}
}
+   total_written += num_written;
}
  
/*
***
*** 1053,1059 
--- 1055,1064 
if (++buf_id2 = NBuffers)
buf_id2 = 0;
}
+   total_written += num_written;
}
+ 
+   return total_written;
  }
  
  /*
Index: src/include/storage/bufmgr.h
===
RCS file: /cvsroot/pgsql/src/include/storage/bufmgr.h,v
retrieving revision 1.102
diff -c -r1.102 bufmgr.h
*** src/include/storage/bufmgr.h5 Jan 2007 22:19:57 -   1.102
--- src/include/storage/bufmgr.h19 Mar 2007 19:38:28 -
***
*** 151,157 
  
  extern void BufmgrCommit(void);
  extern void BufferSync(void);
! extern void BgBufferSync(void);
  
  extern void AtProcExit_LocalBuffers(void);
  
--- 151,157 
  
  extern void BufmgrCommit(void);
  extern void BufferSync(void);
! extern int BgBufferSync(void);
  
  extern void AtProcExit_LocalBuffers(void);
  

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] bgwriter stats

2007-03-19 Thread Neil Conway

Magnus Hagander wrote:

Anyway. Attached patch adds this to the bgwriter shared memory. Is it
safe to do this, and then just have a regular function running in a
normal backend pulling out the value and returning it to the user,
without locking?
If the variable is an int64, I don't believe so: the architecture might 
not implement atomic read/writes of int64 values.


-Neil


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PATCHES] bgwriter stats

2007-03-19 Thread Magnus Hagander
Neil Conway wrote:
 Magnus Hagander wrote:
 Anyway. Attached patch adds this to the bgwriter shared memory. Is it
 safe to do this, and then just have a regular function running in a
 normal backend pulling out the value and returning it to the user,
 without locking?
 If the variable is an int64, I don't believe so: the architecture might
 not implement atomic read/writes of int64 values.

Ok. But it should be safe if it's int32?

Actually, since it's just statistics data, it wouldn't be a problem that
it's not atomic, I think. If we really unlucky, we'll get the wrong
value once. But most systems that poll such statistics can deal with
that, in my experience.

Then again, a normal int shouldn't be a problem either.

//Magnus

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] bgwriter stats

2007-03-19 Thread Neil Conway

Magnus Hagander wrote:

Ok. But it should be safe if it's int32?
  
You should probably use sig_atomic_t, to be safe. Although I believe 
that read/writes to int are atomic on most platforms, in any case.



Actually, since it's just statistics data, it wouldn't be a problem that
it's not atomic, I think. If we really unlucky, we'll get the wrong
value once.
  
I don't think that's the right attitude to take, at all. Why not just 
use a lock? It's not like the overhead will be noticeable.


Alternatively, you can get a consistent read from an int64 variable 
using a sig_atomic_t counter, with a little thought. Off the top of my 
head, something like the following should work: have the writer 
increment the sig_atomic_t counter, adjust the int64 stats value, and 
then increment the sig_atomic_t again. Have the reader save a local copy 
of the sig_atomic_t counter aside, then read from the int64 counter, and 
then recheck the sig_atomic_t counter. Repeat until the local pre-read 
and post-read snapshots of the sig_atomic_t counter are identical.


-Neil


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] bgwriter stats

2007-03-19 Thread Magnus Hagander
Neil Conway wrote:
 Magnus Hagander wrote:
 Ok. But it should be safe if it's int32?
   
 You should probably use sig_atomic_t, to be safe. Although I believe
 that read/writes to int are atomic on most platforms, in any case.

Ok. That's an easy enough change.


 Actually, since it's just statistics data, it wouldn't be a problem that
 it's not atomic, I think. If we really unlucky, we'll get the wrong
 value once.
   
 I don't think that's the right attitude to take, at all. Why not just
 use a lock? It's not like the overhead will be noticeable.

Probably, but none of the other code appears to take a lock out on it :)


 Alternatively, you can get a consistent read from an int64 variable
 using a sig_atomic_t counter, with a little thought. Off the top of my
 head, something like the following should work: have the writer
 increment the sig_atomic_t counter, adjust the int64 stats value, and
 then increment the sig_atomic_t again. Have the reader save a local copy
 of the sig_atomic_t counter aside, then read from the int64 counter, and
 then recheck the sig_atomic_t counter. Repeat until the local pre-read
 and post-read snapshots of the sig_atomic_t counter are identical.

Thinking more about it, I think that's unnecessary. 32 bits is quite
enough - if you're graphing it (for example), those tools deal with
wraps already. They're usually mdae to deal with things like number of
bytes on a router interface, which is certainly  32 bit a lot faster
than us.

But I'll take note of that for some time when I actually *need* a 64-bit
value.-

//Magnus

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] bgwriter stats

2007-03-19 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 I want to be able to pull some stats out of the bgwriter to be able to
 track things. One thing is the total number of buffers written out.
 Other things are the number of checkpoints and such.

 Anyway. Attached patch adds this to the bgwriter shared memory. Is it
 safe to do this, and then just have a regular function running in a
 normal backend pulling out the value and returning it to the user,
 without locking? Given that only the bgwriter can write to it?

This seems quite a bizarre way to do things.  Why wouldn't you implement
this functionality by shipping messages to the stats collector?

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] bgwriter stats

2007-03-19 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 Neil Conway wrote:
 I don't think that's the right attitude to take, at all. Why not just
 use a lock? It's not like the overhead will be noticeable.

 Probably, but none of the other code appears to take a lock out on it :)

Huh?  It doesn't use a lock for touching the checkpoint counters, but
that's OK because they're sig_atomic_t.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] Win32 shmem

2007-03-19 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 Attached is a first attempt at a native win32 shmem implementatio,
 based on previous discussions. Instead of emulating the sysv stuff. The
 base stuff (using mapped pagefile) is still the same, of course.

 Needs more testing, but has survived my tests so far. And unlike the
 previous implementation, it does properly refuse to start a second
 postmaster in a data directory where there is one or more backends still
 running.

That's good...

 Does it seem like I've overlooked anything obvious in this? I do get the
 feeling that this is too simple, but I don't know exactly where the
 problem is :-)

I think you do still need the on_shmem_exit detach callback.  Consider
the situation where the postmaster is trying to reinitialize after a
child crash.  The Unix code is designed to detach and destroy the old
segment then create a new one.  If that's not what you want to do then
this code still seems not right.

The coding of GetShareMemName seems involuted.  I'd normally expect
success exit to be at the bottom of the routine, not deeply nested
like this.  You may be saving one elog(FATAL) call this way, but
I'd argue that the two cases could do with different message texts
anyway.  (Also, GetSharedMemName seems to read better.)

There seem to be a lot of system calls not checked for failure here.
Do they really not have any failure possibilities?

   errdetail(Failed system call was 
 MapViewOfFileEx, size, szShareMem)));

Doesn't your compiler validate sprintf arguments?

I trust the committed file isn't going to have DOS line endings.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PATCHES] bgwriter stats

2007-03-19 Thread Darcy Buskermolen
On Monday 19 March 2007 15:32, Tom Lane wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
  I want to be able to pull some stats out of the bgwriter to be able to
  track things. One thing is the total number of buffers written out.
  Other things are the number of checkpoints and such.
 
  Anyway. Attached patch adds this to the bgwriter shared memory. Is it
  safe to do this, and then just have a regular function running in a
  normal backend pulling out the value and returning it to the user,
  without locking? Given that only the bgwriter can write to it?

 This seems quite a bizarre way to do things.  Why wouldn't you implement
 this functionality by shipping messages to the stats collector?

I'm with Tom on this one.. All of our current stats are done via the stats 
collector, we should continue that way.  While we are on the subject of 
stats, does anybody else feel there is merrit in haveing block level writes 
tracked on a relation by relation bases? 


   regards, tom lane

 ---(end of broadcast)---
 TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] bgwriter stats

2007-03-19 Thread Neil Conway

Tom Lane wrote:

This seems quite a bizarre way to do things.  Why wouldn't you implement
this functionality by shipping messages to the stats collector?
  


Would that have any benefits over the shmem approach?

-Neil


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PATCHES] bgwriter stats

2007-03-19 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 This seems quite a bizarre way to do things.  Why wouldn't you implement
 this functionality by shipping messages to the stats collector?

 Would that have any benefits over the shmem approach?

Well, for one thing, it would fit naturally into the existing stats
structure instead of being a wart on the side.  The problem of atomic
access to an int64 would go away, yet we'd still be able to keep a
running int64 total of the reports.  You wouldn't lose the total over a
shutdown/restart.  The value would obey the transactional-snapshot rules
we've established for stats output, making it safe to try to correlate
it with other stats.  Probably a few other things I'm not thinking of...

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings