Tom Lane wrote:
Magnus Hagander mag...@hagander.net writes:
Passes my tests, but I can't really reproduce the requirement to retry,
so I haven't been able to test that part :(
The patch looks sane to me. If you want to test, perhaps reducing the
sleep to 1 msec or so would reproduce the
Tom Lane wrote:
Andrew Dunstan and...@dunslane.net writes:
Now presumably we sleep for 1 sec between the CloseHandle() call and the
CreateFileMapping() call in that code for a reason.
I'm not sure. Magnus never did answer my question about why the sleep
and retry was put in at all; it
Tom Lane wrote:
Magnus Hagander mag...@hagander.net writes:
Tom Lane wrote:
It says here:
http://msdn.microsoft.com/en-us/library/ms885627.aspx
FWIW, this is the Windows CE documentation. The one for win32 is at:
http://msdn.microsoft.com/en-us/library/ms679360(VS.85).aspx
Sorry, that
Magnus Hagander wrote:
Tom Lane wrote:
Andrew Dunstan and...@dunslane.net writes:
Now presumably we sleep for 1 sec between the CloseHandle() call and the
CreateFileMapping() call in that code for a reason.
I'm not sure. Magnus never did answer my question about why the
Andrew Dunstan and...@dunslane.net writes:
Magnus Hagander wrote:
The actual 1 second value was completely random - it fixed all the
issues on my test VM at the time. I don't recall exactly the details,
but I do recall having to run a lot of tests before I managed to provoke
an error, and
Tom Lane wrote:
I still think there's absolutely no evidence suggesting that a variable
backoff is necessary. Given how little this code is going to be
exercised in the real world, how long will it take till we find out
if you get it wrong? Use a simple retry loop and be done with it.
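For illustration only, the fixed-delay retry Tom describes might look like the sketch below. It is portable C rather than the actual win32_shmem.c change: the names retry_fixed, attempt_fn, sleep_fn and countdown_attempt are hypothetical, and the attempt and sleep steps are passed in as callbacks so that the loop itself carries no Windows dependency.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback types: one attempt at the operation, and a sleep. */
typedef bool (*attempt_fn) (void *arg);
typedef void (*sleep_fn) (unsigned int msec);

/*
 * Try the operation up to max_tries times, sleeping the same fixed delay
 * between tries (no variable backoff).  Returns the 1-based number of the
 * attempt that succeeded, or 0 if every attempt failed.
 */
int
retry_fixed(attempt_fn attempt, void *arg, int max_tries,
            unsigned int delay_msec, sleep_fn do_sleep)
{
    for (int i = 1; i <= max_tries; i++)
    {
        if (attempt(arg))
            return i;
        /* Sleep only if another attempt will follow. */
        if (i < max_tries && do_sleep != NULL)
            do_sleep(delay_msec);
    }
    return 0;
}

/* Demo attempt (an assumption for testing): fails until the count is zero. */
bool
countdown_attempt(void *arg)
{
    int *remaining_failures = arg;

    if (*remaining_failures > 0)
    {
        (*remaining_failures)--;
        return false;
    }
    return true;
}
```

In the real code the attempt would presumably wrap the CreateFileMapping() call and the sleep would be the existing delay; both are left to the caller here.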
Tom Lane wrote:
Andrew Dunstan and...@dunslane.net writes:
Magnus Hagander wrote:
The actual 1 second value was completely random - it fixed all the
issues on my test VM at the time. I don't recall exactly the details,
but I do recall having to run a lot of tests before I managed to provoke
Magnus Hagander wrote:
Andrew, you want to write up a patch or do you want me to do it?
This is going to be backpatched, I assume?
--
Alvaro Herrera                        http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
--
Sent via pgsql-hackers mailing
Alvaro Herrera alvhe...@commandprompt.com writes:
This is going to be backpatched, I assume?
Yeah, back to 8.2 I suppose.
regards, tom lane
Magnus Hagander mag...@hagander.net writes:
Tom Lane wrote:
I still think there's absolutely no evidence suggesting that a variable
backoff is necessary. Given how little this code is going to be
exercised in the real world, how long will it take till we find out
if you get it wrong? Use a
Magnus Hagander wrote:
Andrew, you want to write up a patch or do you want me to do it?
Go for it.
cheers
andrew
Andrew Dunstan wrote:
Magnus Hagander wrote:
Andrew, you want to write up a patch or do you want me to do it?
Go for it.
How does this look?
Passes my tests, but I can't really reproduce the requirement to retry,
so I haven't been able to test that part :(
//Magnus
***
Magnus Hagander wrote:
How does this look?
Passes my tests, but I can't really reproduce the requirement to retry,
so I haven't been able to test that part :(
I'm disappointed :-( I thought this thread (without reading it too
deeply) was about fixing the problem that backends sometimes
Alvaro Herrera alvhe...@commandprompt.com writes:
I'm disappointed :-( I thought this thread (without reading it too
deeply) was about fixing the problem that backends sometimes fail to
connect to shmem, on a system that's been running for a while.
Nobody knows yet what's wrong there or how
Magnus Hagander mag...@hagander.net writes:
Passes my tests, but I can't really reproduce the requirement to retry,
so I haven't been able to test that part :(
The patch looks sane to me. If you want to test, perhaps reducing the
sleep to 1 msec or so would reproduce the need to go around the
Tom Lane wrote:
Andrew Dunstan and...@dunslane.net writes:
I am seeing Postgres 8.3.7 running as a service on Windows Server 2003
repeatedly fail to restart after a backend crash because of the
following code in port/win32_shmem.c:
On further review, I see an entirely different
Andrew Dunstan wrote:
Tom Lane wrote:
Now this would only explain problems if there were some code path
through the postmaster that could leave the errno set to
ERROR_ALREADY_EXISTS (a/k/a EEXIST) when this code is reached. I'm not
sure there is one, and I have even less of a theory as
Magnus Hagander wrote:
Andrew, just to confirm: you've found a case where this happens
*repeatably*? That's what we've failed to do before - it's happened now
and then, but never during testing...
Well, it happened several times to my client within a matter of hours. I
didn't see any
Magnus Hagander mag...@hagander.net writes:
Tom Lane wrote:
It says here:
http://msdn.microsoft.com/en-us/library/ms885627.aspx
FWIW, this is the Windows CE documentation. The one for win32 is at:
http://msdn.microsoft.com/en-us/library/ms679360(VS.85).aspx
Sorry, that was the one that came
Tom Lane wrote:
The quick try would be to stick a SetLastError(0) in there, just to be
sure... Could be worth a try?
I kinda think we should do that whether or not it can be proven to
have anything to do with Andrew's report. It's just like errno = 0
for Unix --- sometimes you have to
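The Unix side of that comparison can be sketched in portable C. This is an illustration, not PostgreSQL code, and parse_long is a hypothetical helper. strtol() reports overflow only through errno while still returning a usable-looking value, so the caller has to clear errno first or a stale value from an earlier call can be misread. A SetLastError(0) ahead of CreateFileMapping() would play the same role, since that call returns a valid handle even when GetLastError() reports ERROR_ALREADY_EXISTS.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a base-10 long, returning false on empty
 * input, trailing junk, or overflow.
 */
bool
parse_long(const char *text, long *result)
{
    char *end;
    long  value;

    errno = 0;                  /* clear any stale error from earlier calls */
    value = strtol(text, &end, 10);

    if (end == text || *end != '\0')
        return false;           /* no digits at all, or trailing characters */
    if (errno == ERANGE)
        return false;           /* out of range: only errno tells us so */

    *result = value;
    return true;
}
```

Without the errno = 0 line, a leftover ERANGE from unrelated code would make a perfectly good parse look like an overflow.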
Andrew Dunstan and...@dunslane.net writes:
Now presumably we sleep for 1 sec between the CloseHandle() call and the
CreateFileMapping() call in that code for a reason.
I'm not sure. Magnus never did answer my question about why the sleep
and retry was put in at all; it seems not unlikely from
Tom Lane wrote:
Now this would only explain problems if there were some code path
through the postmaster that could leave the errno set to
ERROR_ALREADY_EXISTS (a/k/a EEXIST) when this code is reached. I'm not
sure there is one, and I have even less of a theory as to why system
load might
Andrew Dunstan and...@dunslane.net writes:
Maybe we need to look at all the places we call GetLastError(). There
are quite a few of them.
It would only be an issue with syscalls that have badly designed APIs
like this one. Most of the time you know that the function has failed
and is supposed
I am seeing Postgres 8.3.7 running as a service on Windows Server 2003
repeatedly fail to restart after a backend crash because of the
following code in port/win32_shmem.c:
/*
* If the segment already existed, CreateFileMapping() will return a
* handle to the existing one.
*/
On Fri, May 1, 2009 at 12:59 AM, Andrew Dunstan and...@dunslane.net wrote:
It strikes me that we really need to try reconnecting to the shared memory
here several times, and maybe the backoff needs to increase each time. On a
loaded server this causes postgres to fail to restart fairly reliably.
On Fri, May 1, 2009 at 8:42 AM, Dave Page dp...@pgadmin.org wrote:
On Fri, May 1, 2009 at 12:59 AM, Andrew Dunstan and...@dunslane.net wrote:
It strikes me that we really need to try reconnecting to the shared memory
here several times, and maybe the backoff needs to increase each time. On a
On Fri, May 1, 2009 at 11:05 AM, Greg Stark st...@enterprisedb.com wrote:
Do we have any idea why it may take a short while before it gets
dropped from the global namespace? Is there some daemon running which
only wakes up periodically? Or any specific reason it takes so long?
That might give
Dave Page wrote:
On Fri, May 1, 2009 at 12:59 AM, Andrew Dunstan and...@dunslane.net wrote:
It strikes me that we really need to try reconnecting to the shared memory
here several times, and maybe the backoff needs to increase each time. On a
loaded server this causes postgres to fail to restart
On Fri, May 1, 2009 at 4:10 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
Dave Page wrote:
On Fri, May 1, 2009 at 12:59 AM, Andrew Dunstan and...@dunslane.net
wrote:
It strikes me that we really need to try reconnecting to the shared
memory
here several times, and maybe
Heikki Linnakangas wrote:
Dave Page wrote:
On Fri, May 1, 2009 at 12:59 AM, Andrew Dunstan and...@dunslane.net
wrote:
It strikes me that we really need to try reconnecting to the shared
memory
here several times, and maybe the backoff needs to increase each
time. On a
loaded server this
Dave Page wrote:
On Fri, May 1, 2009 at 4:10 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
Dave Page wrote:
On Fri, May 1, 2009 at 12:59 AM, Andrew Dunstan and...@dunslane.net
wrote:
It strikes me that we really need to try reconnecting to the shared
memory
here several
Andrew Dunstan and...@dunslane.net writes:
It strikes me that we really need to try reconnecting to the shared
memory here several times, and maybe the backoff needs to increase each
time.
Adding a backoff would make the code significantly more complex, with
no gain that I can see. Just loop
Tom Lane wrote:
Andrew Dunstan and...@dunslane.net writes:
It strikes me that we really need to try reconnecting to the shared
memory here several times, and maybe the backoff needs to increase each
time.
Adding a backoff would make the code significantly more complex, with
no gain
Andrew Dunstan and...@dunslane.net writes:
We've seen similar things with other Windows file operations, IIRC. What
bothers me is that the problem might be precisely because the 1 second
sleep between the CloseHandle() call and the CreateFileMapping() call
might not be enough due to system
Tom Lane wrote:
Andrew Dunstan and...@dunslane.net writes:
We've seen similar things with other Windows file operations, IIRC. What
bothers me is that the problem might be precisely because the 1 second
sleep between the CloseHandle() call and the CreateFileMapping() call
might not be
Andrew Dunstan and...@dunslane.net writes:
I am seeing Postgres 8.3.7 running as a service on Windows Server 2003
repeatedly fail to restart after a backend crash because of the
following code in port/win32_shmem.c:
On further review, I see an entirely different explanation for possible