Re: [PATCHES] Improve shutdown during online backup, take 4

2008-04-28 Thread Albe Laurenz
Tom Lane wrote:
  I've attached a patch that works for me. I hope I got it right.
 
 Applied with additional cleanup.  You hadn't thought very carefully
 about additional state transitions that would have to be introduced
 into the postmaster state machine to support a new state --- for
 example, as coded a SIGINT delivered to the postmaster after SIGTERM
 would fail to do anything at all, when of course it really ought to
 force us into fast shutdown.  Also, it's not really that hard to
 disallow non-superusers from connecting in PM_WAIT_BACKUP state.

Thank you for helping.
You know, this is my first patch for server code. I know that I
still have a lot to learn until I grok how all that works
together.

Yours,
Laurenz Albe

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Improve shutdown during online backup, take 4

2008-04-26 Thread Tom Lane
Albe Laurenz [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 If you actually want the behavior you propose, then the only correct way
 to implement it is to embed it into the state machine logic, ie, do the
 CancelBackup inside PostmasterStateMachine in some state transition
 taken after the last child is gone.

 I've attached a patch that works for me. I hope I got it right.

Applied with additional cleanup.  You hadn't thought very carefully
about additional state transitions that would have to be introduced
into the postmaster state machine to support a new state --- for
example, as coded a SIGINT delivered to the postmaster after SIGTERM
would fail to do anything at all, when of course it really ought to
force us into fast shutdown.  Also, it's not really that hard to
disallow non-superusers from connecting in PM_WAIT_BACKUP state.

regards, tom lane

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Improve shutdown during online backup, take 4

2008-04-25 Thread Tom Lane
Albe Laurenz [EMAIL PROTECTED] writes:
 That should work, but isn't it better if backup_label is removed
 only if we know we're going to shutdown cleanly?

Why?  That seems like an entirely arbitrary specification.

regards, tom lane

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Improve shutdown during online backup, take 4

2008-04-25 Thread Albe Laurenz
Tom Lane wrote:
 Why?  That seems like an entirely arbitrary specification.

My resoning is that I think of smart/fast/immediate shutdown
as three different things.

For an immediate shutdown/crash thought it was best not to modify
anything to facilitate an analysis of the problem.

A fast shutdown that fails will end up as a crash or immediate
shutdown.


If you think that is is not important to only cancel backup mode
if we are sure that the shutdown will be clean, we might as well
also cancel online backup mode during an immediate shutdown.

In that case, I'd agree that the call to CancelBackup() could be moved
to WAIT_BACKUP (and executed only if it is no smart shutdown).
It would have the advantage of having all backup mode related
code in postmaster.c concentrated in one spot.


Make a suggestion, and I'll implement it that way.

Yours,
Laurenz Albe

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Improve shutdown during online backup, take 4

2008-04-24 Thread Albe Laurenz
Tom Lane wrote:
 I concur that the messages added to pg_ctl are bizarrely formatted.
 Why would you put a newline in the middle of a sentence, when you
 could equally well emit something like
 
 WARNING: online backup mode is active.
 Shutdown will not complete until pg_stop_backup() is called.
 
 While we're on the subject, the messages added to xlog.c do not
 follow the style guidelines: in particular, errdetail should be
 a complete sentence, and the WARNING is trying to stuff independent
 thoughts into one message.  I'd probably do
 
 errmsg(online backup mode cancelled),
 errdetail(\%s\ was renamed to \%s\., ...
 
 errmsg(online backup mode was not cancelled),
 errdetail(Failed to rename \%s\ to \%s\: %m, ...

Attached is a patch that changes the messages along these lines.
Thanks!

 Lastly, the changes to pmdie's SIGINT handling seem quite bogus.
 Don't you need to transition into WAIT_BACKUP rather than WAIT_BACKENDS
 state in that case too?  Shouldn't you do CancelBackup *before*
 PostmasterStateMachine?  The thing screams of race conditions.

I suspect there must be a misunderstanding.
You cannot really mean that the postmaster should enter WAIT_BACKUP
state on a fast shutdown request.

Sure, CancelBackup could be called earlier. It doesn't do much more than
rename a file.
My reason for calling it late was that backup mode should really only be
cancelled if we manage to shutdown cleanly, and this is not clear until
the last child is gone.

I cannot see a race condition, except maybe the quite unlikely case
that two instances of CancelBackup might run concurrently, and it
*might* happen that one of them finds that backup_label is present but
fails to remove it, because the other one already did.
That would lead to a bogus backup mode not canceled message.

Are you referring to that or is there something more fundamental
I fail to see?

Yours,
Laurenz Albe


backup-shut.msg.patch
Description: backup-shut.msg.patch

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Improve shutdown during online backup, take 4

2008-04-24 Thread Magnus Hagander
Albe Laurenz wrote:
 Tom Lane wrote:
  I concur that the messages added to pg_ctl are bizarrely formatted.
  Why would you put a newline in the middle of a sentence, when you
  could equally well emit something like
  
  WARNING: online backup mode is active.
  Shutdown will not complete until pg_stop_backup() is called.
  
  While we're on the subject, the messages added to xlog.c do not
  follow the style guidelines: in particular, errdetail should be
  a complete sentence, and the WARNING is trying to stuff independent
  thoughts into one message.  I'd probably do
  
  errmsg(online backup mode cancelled),
  errdetail(\%s\ was renamed to \%s\., ...
  
  errmsg(online backup mode was not cancelled),
  errdetail(Failed to rename \%s\ to \%s\: %m, ...
 
 Attached is a patch that changes the messages along these lines.
 Thanks!

Hmm. I've preivously been told not to use Failed to but instead use
Could not... Didn't notice that Tom used the other one in his
suggestion.

Tom (or someone else) - can you comment on if I misunderstood that
recommendation earlier, or if it still holds? I'll hold back a commit
until someone has commented on it :-)

Also, from this patch, you removed the %m part - I have re-added that
before commit.

(I will not for now comment on the rest of the mail, I'll leave that to
Tom)

//Magnus

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Improve shutdown during online backup, take 4

2008-04-24 Thread Magnus Hagander
Tom Lane wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
  Hmm. I've preivously been told not to use Failed to but instead
  use Could not... Didn't notice that Tom used the other one in his
  suggestion.
 
  Tom (or someone else) - can you comment on if I misunderstood that
  recommendation earlier, or if it still holds?
 
 Oh, yes, could not is better.  My mistake.

Ok, thanks. Committed new error messages as such then.

//Magnus

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Improve shutdown during online backup, take 4

2008-04-24 Thread Tom Lane
Albe Laurenz [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Lastly, the changes to pmdie's SIGINT handling seem quite bogus.
 Don't you need to transition into WAIT_BACKUP rather than WAIT_BACKENDS
 state in that case too?  Shouldn't you do CancelBackup *before*
 PostmasterStateMachine?  The thing screams of race conditions.

 I suspect there must be a misunderstanding.
 You cannot really mean that the postmaster should enter WAIT_BACKUP
 state on a fast shutdown request.

Why not?  It'll fall out of the state again immediately in
PostmasterStateMachine, no, if we do a CancelBackup here?  I don't think
these two paths of control should be any more different than really
necessary.

 Sure, CancelBackup could be called earlier. It doesn't do much more than
 rename a file.
 My reason for calling it late was that backup mode should really only be
 cancelled if we manage to shutdown cleanly, and this is not clear until
 the last child is gone.

Well, if there were anything conditional about calling it, then maybe
that argument would hold some water, but the way you've got it here it
*will* get called anyway, just after the PostmasterStateMachine call
... except in the corner case where there were no child processes left
and so PostmasterStateMachine manages to decide it's ready to exit().
So far from implementing the logic you suggest, this arrangement never
gets it right, and has a race condition case in which it gets it exactly
backward.

The other reason for the remark about race conditions is that the
PostmasterStateMachine call should absolutely be the last thing that
pmdie() does --- putting anything after it is wrong, especially things
that might alter the PM state, as indeed CancelBackup could.  The reason
for having that in the signal handler is to cover the possibility that
no such call will occur immediately when we return to the wait loop.
In general all of the condition handlers in postmaster.c should be of
the form respond to the immediate condition, and then let
PostmasterStateMachine decide if there's anything else to do.

If you actually want the behavior you propose, then the only correct way
to implement it is to embed it into the state machine logic, ie, do the
CancelBackup inside PostmasterStateMachine in some state transition
taken after the last child is gone.

regards, tom lane

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Improve shutdown during online backup, take 4

2008-04-24 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 Hmm. I've preivously been told not to use Failed to but instead use
 Could not... Didn't notice that Tom used the other one in his
 suggestion.

 Tom (or someone else) - can you comment on if I misunderstood that
 recommendation earlier, or if it still holds?

Oh, yes, could not is better.  My mistake.

regards, tom lane

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Improve shutdown during online backup, take 4

2008-04-24 Thread Albe Laurenz
Tom Lane wrote:
 Lastly, the changes to pmdie's SIGINT handling seem quite bogus.
 Don't you need to transition into WAIT_BACKUP rather than WAIT_BACKENDS
 state in that case too?  Shouldn't you do CancelBackup *before*
 PostmasterStateMachine?  The thing screams of race conditions.
 
 I suspect there must be a misunderstanding.
 You cannot really mean that the postmaster should enter WAIT_BACKUP
 state on a fast shutdown request.
 
 Why not?  It'll fall out of the state again immediately in
 PostmasterStateMachine, no, if we do a CancelBackup here?  I don't think
 these two paths of control should be any more different than really
 necessary.

We cannot call CancelBackup there because that's exactly the state
in which a smart shutdown waits for a superuser to issue pg_stop_backup().

 Well, if there were anything conditional about calling it, then maybe
 that argument would hold some water, but the way you've got it here it
 *will* get called anyway, just after the PostmasterStateMachine call
[...]

I see.

 The other reason for the remark about race conditions is that the
 PostmasterStateMachine call should absolutely be the last thing that
 pmdie() does --- putting anything after it is wrong, especially things
 that might alter the PM state, as indeed CancelBackup could.  

I see that too. Thanks for explaining.

 If you actually want the behavior you propose, then the only correct way
 to implement it is to embed it into the state machine logic, ie, do the
 CancelBackup inside PostmasterStateMachine in some state transition
 taken after the last child is gone.

I've attached a patch that works for me. I hope I got it right.

Yours,
Laurenz Albe


cancel_backup.patch
Description: cancel_backup.patch

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Improve shutdown during online backup, take 4

2008-04-24 Thread Tom Lane
Albe Laurenz [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Why not?  It'll fall out of the state again immediately in
 PostmasterStateMachine, no, if we do a CancelBackup here?

 We cannot call CancelBackup there because that's exactly the state
 in which a smart shutdown waits for a superuser to issue pg_stop_backup().

Er, I was complaining about the fast-shutdown code path, not the
smart-shutdown one.

regards, tom lane

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Improve shutdown during online backup, take 4

2008-04-23 Thread Magnus Hagander
Albe Laurenz wrote:
 Magnus Hagander wrote:
  This doesn't look like our normal coding standards, and should
  probably be changed:
  +   if (0 != stat(BACKUP_LABEL_FILE, stat_buf))
  
  (there's a number of similar places)
 
 I see. Lacking guidelines, I now copied how stat(2) is used in
 other parts of the code.

Really? I thought we didn't do that :), and I recall having my own
patches bounced for the same reason.

Just to be sure, we are talking about the
if (0 == foo)
vs
if (foo == 0)

thing, right? If not, then I wasn't clear enough in my note :-(

//Magnus

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Improve shutdown during online backup, take 4

2008-04-23 Thread Magnus Hagander
Albe Laurenz wrote:
 Magnus Hagander wrote:
  This doesn't look like our normal coding standards, and should
  probably be changed:
  +   if (0 != stat(BACKUP_LABEL_FILE, stat_buf))
  
  (there's a number of similar places)
 
 I see. Lacking guidelines, I now copied how stat(2) is used in
 other parts of the code.

Applied with some minor changes to the error messages to make them
closer to our guidelines. (With my track record, it's not entirely
unlikely that someone will fix them further though :-P)

//Magnus

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Improve shutdown during online backup, take 4

2008-04-23 Thread Alvaro Herrera
Magnus Hagander wrote:

 Applied with some minor changes to the error messages to make them
 closer to our guidelines. (With my track record, it's not entirely
 unlikely that someone will fix them further though :-P)

I think the messages should not have a newline in the middle.

Also, I am wondering if in PM_WAIT_BACKUP mode we should accept new
connections from superusers only.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Improve shutdown during online backup, take 4

2008-04-23 Thread Magnus Hagander
Alvaro Herrera wrote:
 Magnus Hagander wrote:
 
  Applied with some minor changes to the error messages to make them
  closer to our guidelines. (With my track record, it's not entirely
  unlikely that someone will fix them further though :-P)
 
 I think the messages should not have a newline in the middle.

Are you talking about the one in pg_ctl? We have other messages in
pg_ctl that already do this, so I figured it was ok there...


 Also, I am wondering if in PM_WAIT_BACKUP mode we should accept new
 connections from superusers only.

Oh yeah, bleh, that reminds me that I should send off the couple of
comments for further improvements I thought of.

One was just this.

Another was, should we expose the cancel backup as a SQL level command?

//Magnus

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Improve shutdown during online backup, take 4

2008-04-23 Thread Albe Laurenz
Alvaro Herrera wrote:
 I think the messages should not have a newline in the middle.
 
 Also, I am wondering if in PM_WAIT_BACKUP mode we should accept new
 connections from superusers only.

I spent some thought on that.
You'd need to wait until the user is authenticated before you can determine
if he/she is a superuser and may connect (otherwise I think it would be a
security leak that enables any attacker to find out whether a given user is
a superuser without knowing the password).

By that time the server process is already forked.
I couldn't see a way to check the postmaster state at that point,
so I decided not to try and keep it simple.

If you have any ideas how I could do such a check reasonably,
I'd be happy to try it, because basically I think it would be the
right thing.

Yours,
Laurenz Albe

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Improve shutdown during online backup, take 4

2008-04-23 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 Alvaro Herrera wrote:
 I think the messages should not have a newline in the middle.

 Are you talking about the one in pg_ctl? We have other messages in
 pg_ctl that already do this, so I figured it was ok there...

I concur that the messages added to pg_ctl are bizarrely formatted.
Why would you put a newline in the middle of a sentence, when you
could equally well emit something like

WARNING: online backup mode is active.
Shutdown will not complete until pg_stop_backup() is called.

While we're on the subject, the messages added to xlog.c do not
follow the style guidelines: in particular, errdetail should be
a complete sentence, and the WARNING is trying to stuff independent
thoughts into one message.  I'd probably do

errmsg(online backup mode cancelled),
errdetail(\%s\ was renamed to \%s\., ...

errmsg(online backup mode was not cancelled),
errdetail(Failed to rename \%s\ to \%s\: %m, ...

Lastly, the changes to pmdie's SIGINT handling seem quite bogus.
Don't you need to transition into WAIT_BACKUP rather than WAIT_BACKENDS
state in that case too?  Shouldn't you do CancelBackup *before*
PostmasterStateMachine?  The thing screams of race conditions.

regards, tom lane

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches