Re: [PATCHES] Feature: POSIX Shared memory support, round 2

2007-02-09 Thread Tom Lane
Chris Marcellino [EMAIL PROTECTED] writes:
> Here is a new patch that uses the POSIX APIs. It encodes the
> canonical path (see 'man realpath') of the database's data directory
> into the shared memory segment name using a strong hash function to
> make it fit in the shared memory segment name in all cases,
> without risk of key collision.

I find this patch utterly unreadable, because of your cavalier disregard
for making the comments match the truth.  You have copied-and-pasted the
original SysV code and fixed some small fraction of the comments, and I
cannot tell which ones still reflect reality --- but I can tell that a
lot of them don't.

Also, I don't see where this implements any sort of detection of live
backends attached to an existing segment, so I don't think you have
responded to that objection.  Magnus' idea for Windows was to use a
segment set up to automatically go away as soon as the last attacher
died, but AFAICT that isn't how this works.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


[PATCHES] Feature: POSIX Shared memory support, round 2

2007-02-08 Thread Chris Marcellino

As discussed earlier, using POSIX shared memory can solve a few issues.
On Mac OS X and other BSDs, the default System V shared memory
limits are often very low and require adjustment for acceptable
performance. In particular, when Postgres is included as part of
larger end-user-friendly software products, these kernel settings
are often difficult to change, for two reasons:


1. The (arbitrarily) limited resources must be shared by all  
programs that use System V shared memory. For example, on my Mac OS
X computer, I have Postgres running a standalone database, but also  
as part of Apple Remote Desktop. Without manual adjustment, running  
both simultaneously causes one of them to fail. Correcting this in  
any robust way is challenging to automate for consumer-style (i.e.  
Mac) installers.


2. On these BSDs, System V shared memory is wired down and
cannot be swapped out for any reason. If Postgres is running as
part of another software program or has a lower priority, other
programs cannot use the potentially limited memory. This places the
user or developer in a tricky position of having to minimize  
overall system impact, while permitting enough shared memory for  
Postgres to perform well.


Also, the SysV code is complex since it needs to deal with the
likelihood that a shmid will collide with that of another program
or postmaster.


Here is a new patch that uses the POSIX APIs. It encodes the
canonical path (see 'man realpath') of the database's data directory
into the shared memory segment name using a strong hash function to
make it fit in the shared memory segment name in all cases,
without risk of key collision.
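
To make the naming scheme concrete, here is a minimal sketch of the
idea, not the code from the attached patch. It canonicalizes the data
directory with realpath(), hashes the result, and prints the hash as a
fixed-length segment name. The function names, the "/PG." prefix, and
the use of 64-bit FNV-1a are all assumptions made for illustration;
FNV-1a is not a strong hash, so a real implementation would substitute
one that is.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for realpath() under strict -std= modes */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * FNV-1a, 64-bit. Stand-in only: the post asks for a strong hash and
 * FNV-1a is not one. It is used here to keep the sketch short.
 */
static uint64_t
fnv1a64(const char *data, size_t len)
{
    uint64_t    h = 14695981039346656037ULL;    /* FNV offset basis */
    size_t      i;

    for (i = 0; i < len; i++)
    {
        h ^= (unsigned char) data[i];
        h *= 1099511628211ULL;                  /* FNV prime */
    }
    return h;
}

/*
 * Hypothetical helper: write "/PG.<16 hex digits>" for the canonical
 * form of datadir into buf. Returns 0 on success, -1 if realpath()
 * fails or buf is too small.
 */
int
shmem_name_for_datadir(const char *datadir, char *buf, size_t buflen)
{
    char       *resolved;
    int         n;

    assert(datadir != NULL && buf != NULL);

    resolved = realpath(datadir, NULL);     /* malloc'd canonical path */
    if (resolved == NULL)
        return -1;

    n = snprintf(buf, buflen, "/PG.%016llx",
                 (unsigned long long) fnv1a64(resolved, strlen(resolved)));
    free(resolved);
    return (n > 0 && (size_t) n < buflen) ? 0 : -1;
}
```

Because the path is canonicalized first, two spellings of the same
directory (for example "/" and "/.") map to the same name, and the
name length does not depend on the length of the path.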


I have taken a new, simpler approach to handling databases that have
been killed with kill -9 or have crashed. It is described in the
comments, but essentially, since all collisions on the shared memory
key must be from orphaned backends or crashed postmasters from the
current data directory, the colliding segments can be freed. A
two-character identifier field is prepended to the data directory
hash, and it is incremented after freeing an orphan, so that the new
postmaster need not wait for the backends to die. This approach works
equally well on Windows and on Unixen. The comments also describe
some of the portability concerns (which have been handled). Please
see the code (PGSharedMemoryCreate and its helpers) for more
information on this point.
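
The orphan handling described above can be sketched roughly as
follows. This is an illustration under stated assumptions, not the
code from the attached patch: the function names are hypothetical, the
two-character identifier is assumed to be two hex digits, and the
segment is assumed to be created with shm_open() using O_EXCL. Note
that the sketch frees any colliding segment unconditionally and makes
no attempt to detect whether live backends are still attached to it.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: build "/<2 hex digits><hash>" in buf.
 * Returns 0 on success, -1 if buf is too small.
 */
int
format_segment_name(unsigned id, const char *hash_hex,
                    char *buf, size_t buflen)
{
    int         n;

    assert(id <= 0xff);
    n = snprintf(buf, buflen, "/%02x%s", id, hash_hex);
    return (n > 0 && (size_t) n < buflen) ? 0 : -1;
}

/*
 * Hypothetical helper: create a new segment for this data directory.
 * On a name collision, unlink the old name (assumed to be an orphan of
 * the same data directory) and retry with the next identifier.
 * Returns an open descriptor, or -1 with errno set.
 */
int
create_fresh_segment(const char *hash_hex, off_t size,
                     char *name, size_t namelen)
{
    unsigned    id;

    for (id = 0; id <= 0xff; id++)
    {
        int         fd;

        if (format_segment_name(id, hash_hex, name, namelen) != 0)
        {
            errno = ENAMETOOLONG;
            return -1;
        }

        fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, 0600);
        if (fd >= 0)
        {
            if (ftruncate(fd, size) != 0)
            {
                int         saved = errno;

                close(fd);
                shm_unlink(name);
                errno = saved;
                return -1;
            }
            return fd;          /* caller mmap()s this descriptor */
        }
        if (errno != EEXIST)
            return -1;

        /*
         * Name already exists. Remove it; processes that still have
         * the old segment mapped keep their mapping until they exit.
         */
        (void) shm_unlink(name);
    }
    errno = EEXIST;
    return -1;
}
```

A caller would pass the hex hash of the data directory as hash_hex
and then mmap() the returned descriptor with MAP_SHARED.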


To build/test this, place the attached file in src/backend/port/ and  
change the symbolic link pg_shmem.c to point to this file. If this  
gets used on BSDs, keep in mind that shared memory is no longer
drawn from the SysV pool, so the SysV settings (SHMMAX, etc.) can be  
set to their default values to recover the memory that was wired down  
for the SysV pool.

I don't have access to any Linux machines to test this.

Thanks for your feedback,
Chris Marcellino




posix_shmem.c
Description: Binary data
