On 9/5/13 11:37 AM, Robert Haas wrote:
>ISTM that at some point we'll want to look at putting top-level shared
>memory into this system (ie: allowing dynamic resizing of GUCs that affect
>shared memory size).
A lot of people want that, but being able to resize the shared memory
chunk itself is only the beginning of the problem.  So I wouldn't hold
my breath.

<starts breathing again>

>Wouldn't it protect against a crash while writing the file? I realize the
>odds of that are pretty remote, but AFAIK it wouldn't cost that much to
>write a new file and do an atomic mv...
If there's an OS-level crash, we don't need the state file; the shared
memory will be gone anyway.  And if it's a PostgreSQL-level failure,
this game neither helps nor hurts.

>>Sure.  A messed-up backend can clobber the control segment just as it
>>can clobber anything else in shared memory.  There's really no way
>>around that problem.  If the control segment has been overwritten by a
>>memory stomp, we can't use it to clean up.  There's no way around that
>>problem except to not use the control segment, which wouldn't be better.
>
>Are we trying to protect against "memory stomps" when we restart after a
>backend dies? I thought we were just trying to ensure that all shared data
>structures were correct and consistent. If that's the case, then I was
>thinking that by using a pointer that can be updated in a CPU-atomic fashion
>we know we'd never end up with a corrupted entry that was in use; the
>partial write would be to a slot with nothing pointing at it so it could be
>safely reused.
When we restart after a backend dies, shared memory contents are
completely reset, from scratch.  This is true of both the fixed size
shared memory segment and of the dynamic shared memory control
segment.  The only difference is that, with the dynamic shared memory
control segment, we need to use the segment for cleanup before
throwing it out and starting over.  Extra caution is required because
we're examining memory that could hypothetically have been stomped on;
we must not let the postmaster do anything suicidal.

Not doing something suicidal is what I'm worried about (that and not cleaning 
up as well as possible).

The specific scenario I'm worried about is something like a PANIC in the middle
of the snprintf call in dsm_write_state_file(). That would leave the file in a
completely unknown state, so who knows what would then happen on restart. ISTM
that writing a temp file and then doing a filesystem mv (i.e. rename()) would
eliminate that issue.

Or is it safe to assume that the snprintf call will be atomic since we're just 
spitting out a long?
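
To be concrete, here's a minimal sketch of the temp-file-plus-rename idea. It
is not the code from the patch: write_state_file_atomically() and the ".tmp"
suffix are hypothetical names used only for illustration, and it assumes the
state being saved is a single long. The point is just that rename() swaps the
file in one step, so a reader sees either the old contents or the complete new
contents, never a partial write.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL code): store "value" in "path" by
 * writing a temporary file first and then renaming it into place.
 * Returns 0 on success, -1 on failure.
 */
int
write_state_file_atomically(const char *path, long value)
{
    char tmppath[1024];
    char buf[32];
    int  fd;
    int  len;

    /* Keep the temp file next to the target so both are on one filesystem. */
    len = snprintf(tmppath, sizeof(tmppath), "%s.tmp", path);
    if (len < 0 || len >= (int) sizeof(tmppath))
    {
        errno = ENAMETOOLONG;
        return -1;
    }

    /* Format the whole payload in memory before touching the disk. */
    len = snprintf(buf, sizeof(buf), "%ld\n", value);

    fd = open(tmppath, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* A short write counts as failure; fsync before the rename. */
    if (write(fd, buf, (size_t) len) != (ssize_t) len || fsync(fd) != 0)
    {
        close(fd);
        unlink(tmppath);
        return -1;
    }

    if (close(fd) != 0)
    {
        unlink(tmppath);
        return -1;
    }

    /* rename() atomically replaces any existing file at "path". */
    if (rename(tmppath, path) != 0)
    {
        unlink(tmppath);
        return -1;
    }

    return 0;
}
```

If the process dies anywhere before the rename(), the old file is untouched
and at worst a stray ".tmp" file is left behind. For full durability across an
OS crash you'd also want to fsync the containing directory, which this sketch
skips.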
--
Jim C. Nasby, Data Architect                       j...@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net


