I am seeing Postgres 8.3.7 running as a service on Windows Server 2003 repeatedly fail to restart after a backend crash because of the following code in port/win32_shmem.c:


   /*
    * If the segment already existed, CreateFileMapping() will return a
    * handle to the existing one.
    */
   if (GetLastError() == ERROR_ALREADY_EXISTS)
   {
       /*
        * When recycling a shared memory segment, it may take a short while
        * before it gets dropped from the global namespace. So re-try after
        * sleeping for a second.
        */
       CloseHandle(hmap);      /* Close the old handle, since we got a valid
                                * one to the previous segment. */

       Sleep(1000);

       hmap = CreateFileMapping((HANDLE) 0xFFFFFFFF, NULL, PAGE_READWRITE, 0L,
                                (DWORD) size, szShareMem);
       if (!hmap)
           ereport(FATAL,
                   (errmsg("could not create shared memory segment: %lu", GetLastError()),
                    errdetail("Failed system call was CreateFileMapping(size=%lu, name=%s).",
                              (unsigned long) size, szShareMem)));

       if (GetLastError() == ERROR_ALREADY_EXISTS)
           ereport(FATAL,
                   (errmsg("pre-existing shared memory block is still in use"),
                    errhint("Check if there are any old server processes still running, and terminate them.")));
   }


It strikes me that we really need to try reconnecting to the shared memory here several times rather than just once, and maybe the backoff needs to increase each time. On a loaded server this causes Postgres to fail to restart fairly reliably.
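
Roughly, the retry loop I have in mind would look something like the sketch below. This is only an illustration, not a patch: every name in it (create_with_backoff, ShmemAttempt, the callbacks, the Fake* stand-ins) is made up for the example, the callbacks stand in for the real CreateFileMapping()/GetLastError() and Sleep() calls so that it compiles off Windows, and the attempt count and delays are placeholders rather than tested values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcome of one attempt to create the segment. */
typedef enum
{
    SHMEM_CREATED,          /* got a fresh segment */
    SHMEM_STILL_EXISTS,     /* ERROR_ALREADY_EXISTS: old segment not gone yet */
    SHMEM_FAILED            /* any other error; waiting will not help */
} ShmemAttempt;

/* Stand-ins for "CreateFileMapping() + GetLastError()" and "Sleep()". */
typedef ShmemAttempt (*attempt_fn) (void *arg);
typedef void (*sleep_fn) (unsigned int ms, void *arg);

/*
 * Try up to max_attempts times. After each ERROR_ALREADY_EXISTS-style
 * result, sleep and then double the delay, capped at max_delay_ms.
 * Returns the result of the last attempt made.
 */
static ShmemAttempt
create_with_backoff(attempt_fn attempt, sleep_fn do_sleep, void *arg,
                    int max_attempts, unsigned int first_delay_ms,
                    unsigned int max_delay_ms)
{
    unsigned int delay = first_delay_ms;
    ShmemAttempt result = SHMEM_FAILED;
    int         i;

    assert(max_attempts > 0);
    if (delay > max_delay_ms)
        delay = max_delay_ms;

    for (i = 0; i < max_attempts; i++)
    {
        result = attempt(arg);
        if (result != SHMEM_STILL_EXISTS)
            break;              /* success, or an error we cannot wait out */

        if (i + 1 < max_attempts)
        {
            do_sleep(delay, arg);
            /* double the delay without overshooting the cap */
            delay = (delay > max_delay_ms / 2) ? max_delay_ms : delay * 2;
        }
    }
    return result;
}

/* Fake environment for trying the loop out without Windows. */
typedef struct
{
    int          calls;         /* attempts made so far */
    int          succeed_on;    /* attempt number that succeeds; 0 = never */
    unsigned int slept_ms;      /* total time "slept" */
} FakeState;

static ShmemAttempt
fake_attempt(void *arg)
{
    FakeState  *st = arg;

    st->calls++;
    return (st->calls == st->succeed_on) ? SHMEM_CREATED : SHMEM_STILL_EXISTS;
}

static void
fake_sleep(unsigned int ms, void *arg)
{
    FakeState  *st = arg;

    st->slept_ms += ms;
}
```

In win32_shmem.c the attempt callback would presumably close the stale handle and call CreateFileMapping() again, and only if the loop still comes back with the "already exists" result would we raise the existing FATAL "pre-existing shared memory block is still in use" error.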

thoughts?

cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
