Re: [HACKERS] recovery getting interrupted is not so unusual as it used to be

2010-06-04 Thread Greg Stark
On Sat, Jun 5, 2010 at 2:20 AM, Robert Haas  wrote:
>> I've tried to keep this as similar as possible to the existing message while 
>> making it less ambiguous about cause and effect.
>>
>> "If this has occurred more than once corrupt data might be the cause and you 
>> might need to choose an earlier recovery target".

> If the database system is exiting unexpectedly during archive
> recovery, some data might be corrupted and you might need to choose an
> earlier recovery target.

I think you've missed the key addition in Florian's suggestions. The
"might be the cause" tips the user off to what's going on. Your
statement is just as ambiguous as the original message in that it
could be (and usually would be) read as saying that the interruption
of recovery could cause the corruption.

I would probably write it as "If this is happening repeatedly it might
be caused by corrupt data. Try choosing an earlier recovery target
prior to the corruption.". Florian's phrasing seemed ok to me too
though.


-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] recovery getting interrupted is not so unusual as it used to be

2010-06-04 Thread Robert Haas
On Fri, Jun 4, 2010 at 8:21 PM, Florian Pflug  wrote:
> On Jun 3, 2010, at 5:25 , Robert Haas wrote:
>> On Wed, Jun 2, 2010 at 10:34 PM, Florian Pflug  wrote:
>>>> Oh.  Well, if that's the case, then I guess I lean toward applying the
>>>> patch as-is.  Then there's no need for the caveat "and without manual
>>>> intervention".
>>>
>>> That still leaves the messages awfully ambiguous concerning the cause (data 
>>> corruption) and the effect (crash during recovery).
>>>
>>> How about
>>> "If this has occurred more than once, it is probably caused by corrupt data 
>>> and you have to use the latest backup for recovery"
>>> for the crash recovery case and
>>> "If this has occurred more than once, it is probably caused by corrupt data 
>>> and you have to choose an earlier recovery target"
>>> for the PITR case.
>>>
>>> I don't see why currently only the PITR-case includes the "more than once" 
>>> clause. It's probably supposed to prevent unnecessarily alarming the user if 
>>> the "crash" was in fact a stray SIGKILL or an out-of-memory condition, 
>>> which seems equally likely in both cases.
>>
>> I've applied the patch for now - we can fix the wording of the other
>> messages with a follow-on patch if we agree on what they should say.
>> I don't like the use of the phrase "you have to", particularly...  I
>> would tend to leave the archive recovery message alone and change the
>> crash recovery message to be more like it.
>
> Since a loose log of this shed gave me quite a bump on my forehead once, one 
> last attempt at fixing it.
>
> I've tried to keep this as similar as possible to the existing message while 
> making it less ambiguous about cause and effect.
>
> "If this has occurred more than once corrupt data might be the cause and you 
> might need to choose an earlier recovery target".
> and
> "If this has occurred more than once corrupt data might be the cause and you 
> might need to restore from backup".

How about:

If the database system is exiting unexpectedly during archive
recovery, some data might be corrupted and you might need to choose an
earlier recovery target.
If the database system is exiting unexpectedly during crash recovery,
some data might be corrupted and you might need to restore from
backup.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] recovery getting interrupted is not so unusual as it used to be

2010-06-04 Thread Florian Pflug
On Jun 3, 2010, at 5:25 , Robert Haas wrote:
> On Wed, Jun 2, 2010 at 10:34 PM, Florian Pflug  wrote:
>>> Oh.  Well, if that's the case, then I guess I lean toward applying the
>>> patch as-is.  Then there's no need for the caveat "and without manual
>>> intervention".
>> 
>> That still leaves the messages awfully ambiguous concerning the cause (data 
>> corruption) and the effect (crash during recovery).
>> 
>> How about
>> "If this has occurred more than once, it is probably caused by corrupt data 
>> and you have to use the latest backup for recovery"
>> for the crash recovery case and
>> "If this has occurred more than once, it is probably caused by corrupt data 
>> and you have to choose an earlier recovery target"
>> for the PITR case.
>> 
>> I don't see why currently only the PITR-case includes the "more than once" 
>> clause. It's probably supposed to prevent unnecessarily alarming the user if 
>> the "crash" was in fact a stray SIGKILL or an out-of-memory condition, which 
>> seems equally likely in both cases.
> 
> I've applied the patch for now - we can fix the wording of the other
> messages with a follow-on patch if we agree on what they should say.
> I don't like the use of the phrase "you have to", particularly...  I
> would tend to leave the archive recovery message alone and change the
> crash recovery message to be more like it.

Since a loose log of this shed gave me quite a bump on my forehead once, one 
last attempt at fixing it.

I've tried to keep this as similar as possible to the existing message while 
making it less ambiguous about cause and effect.

"If this has occurred more than once corrupt data might be the cause and you 
might need to choose an earlier recovery target".
and
"If this has occurred more than once corrupt data might be the cause and you 
might need to restore from backup".
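
To make the pairing of the two hints concrete, here is a minimal sketch that selects a hint by recovery type, using the wording proposed just above. It is an illustration only: `RecoveryKind` and `recovery_hint` are hypothetical names, and the actual server passes such text to `errhint()` rather than returning a string.

```c
#include <stddef.h>

/* Hypothetical: the two kinds of recovery discussed in this thread. */
typedef enum
{
    RECOVERY_CRASH,    /* recovery after an unclean shutdown */
    RECOVERY_ARCHIVE   /* archive recovery / PITR */
} RecoveryKind;

/*
 * Return the hint for an interrupted recovery of the given kind.
 * Both hints name corrupt data as the possible *cause* of the repeated
 * interruption; only the suggested remedy differs.
 */
const char *
recovery_hint(RecoveryKind kind)
{
    switch (kind)
    {
        case RECOVERY_ARCHIVE:
            return "If this has occurred more than once corrupt data might be "
                   "the cause and you might need to choose an earlier recovery target.";
        case RECOVERY_CRASH:
            return "If this has occurred more than once corrupt data might be "
                   "the cause and you might need to restore from backup.";
    }
    return "";   /* not reached for valid input */
}
```

Keeping the first clause identical in both branches is the point of the proposal: the cause is stated the same way, and only the remedy varies.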

best regards,
Florian Pflug




Re: [HACKERS] recovery getting interrupted is not so unusual as it used to be

2010-06-02 Thread Robert Haas
On Wed, Jun 2, 2010 at 10:34 PM, Florian Pflug  wrote:
>> Oh.  Well, if that's the case, then I guess I lean toward applying the
>> patch as-is.  Then there's no need for the caveat "and without manual
>> intervention".
>
> That still leaves the messages awfully ambiguous concerning the cause (data 
> corruption) and the effect (crash during recovery).
>
> How about
> "If this has occurred more than once, it is probably caused by corrupt data 
> and you have to use the latest backup for recovery"
> for the crash recovery case and
> "If this has occurred more than once, it is probably caused by corrupt data 
> and you have to choose an earlier recovery target"
> for the PITR case.
>
> I don't see why currently only the PITR-case includes the "more than once" 
> clause. It's probably supposed to prevent unnecessarily alarming the user if 
> the "crash" was in fact a stray SIGKILL or an out-of-memory condition, which 
> seems equally likely in both cases.

I've applied the patch for now - we can fix the wording of the other
messages with a follow-on patch if we agree on what they should say.
I don't like the use of the phrase "you have to", particularly...  I
would tend to leave the archive recovery message alone and change the
crash recovery message to be more like it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] recovery getting interrupted is not so unusual as it used to be

2010-06-02 Thread Florian Pflug
On Jun 3, 2010, at 3:31 , Robert Haas wrote:
> On Wed, Jun 2, 2010 at 9:07 PM, Florian Pflug  wrote:
>> On Jun 3, 2010, at 0:58 , Robert Haas wrote:
>>> But maybe the message isn't right the first time either.  After all
>>> the point of having a write-ahead log in the first place is that we
>>> should be able to prevent corruption in the event of an unexpected
>>> shutdown.  Maybe the right thing to do is to forget about adding a new
>>> state and just remove or change the errhint from these messages:
>> 
>> You've fallen prey to a (very common) misinterpretation of this message. It 
>> is not about corruption *caused* by a crash during recovery, it's about 
>> corruption *causing* the crash.
>> 
>> I'm not in favor of getting rid of that message entirely, since it produces a 
>> worthwhile hint if the crash was really caused by corrupt data. But it 
>> desperately needs a better wording that makes cause and effect perfectly 
>> clear. That even you misread it conclusively proves that.
>> 
>> How about
>> "If this has happened repeatedly and without manual intervention, it was 
>> probably caused by corrupted data and you may need to restore from backup"
>> for the crash recovery case and
>> "If this has happened repeatedly and without manual intervention, it was 
>> probably caused by corrupted data and you may need to choose an earlier 
>> recovery target"
>> for the PITR case.
> 
> Oh.  Well, if that's the case, then I guess I lean toward applying the
> patch as-is.  Then there's no need for the caveat "and without manual
> intervention".

That still leaves the messages awfully ambiguous concerning the cause (data 
corruption) and the effect (crash during recovery).

How about
"If this has occurred more than once, it is probably caused by corrupt data and 
you have to use the latest backup for recovery"
for the crash recovery case and
"If this has occurred more than once, it is probably caused by corrupt data and 
you have to choose an earlier recovery target"
for the PITR case.

I don't see why currently only the PITR-case includes the "more than once" 
clause. It's probably supposed to prevent unnecessarily alarming the user if the 
"crash" was in fact a stray SIGKILL or an out-of-memory condition, which seems 
equally likely in both cases.

best regards,
Florian Pflug




Re: [HACKERS] recovery getting interrupted is not so unusual as it used to be

2010-06-02 Thread Robert Haas
On Wed, Jun 2, 2010 at 9:07 PM, Florian Pflug  wrote:
> On Jun 3, 2010, at 0:58 , Robert Haas wrote:
>> But maybe the message isn't right the first time either.  After all
>> the point of having a write-ahead log in the first place is that we
>> should be able to prevent corruption in the event of an unexpected
>> shutdown.  Maybe the right thing to do is to forget about adding a new
>> state and just remove or change the errhint from these messages:
>
> You've fallen prey to a (very common) misinterpretation of this message. It 
> is not about corruption *caused* by a crash during recovery, it's about 
> corruption *causing* the crash.
>
> I'm not in favor of getting rid of that message entirely, since it produces a 
> worthwhile hint if the crash was really caused by corrupt data. But it 
> desperately needs a better wording that makes cause and effect perfectly 
> clear. That even you misread it conclusively proves that.
>
> How about
> "If this has happened repeatedly and without manual intervention, it was 
> probably caused by corrupted data and you may need to restore from backup"
> for the crash recovery case and
> "If this has happened repeatedly and without manual intervention, it was 
> probably caused by corrupted data and you may need to choose an earlier 
> recovery target"
> for the PITR case.

Oh.  Well, if that's the case, then I guess I lean toward applying the
patch as-is.  Then there's no need for the caveat "and without manual
intervention".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] recovery getting interrupted is not so unusual as it used to be

2010-06-02 Thread Florian Pflug
On Jun 3, 2010, at 0:58 , Robert Haas wrote:
> But maybe the message isn't right the first time either.  After all
> the point of having a write-ahead log in the first place is that we
> should be able to prevent corruption in the event of an unexpected
> shutdown.  Maybe the right thing to do is to forget about adding a new
> state and just remove or change the errhint from these messages:

You've fallen prey to a (very common) misinterpretation of this message. It is 
not about corruption *caused* by a crash during recovery, it's about corruption 
*causing* the crash.

I'm not in favor of getting rid of that message entirely, since it produces a 
worthwhile hint if the crash was really caused by corrupt data. But it 
desperately needs a better wording that makes cause and effect perfectly clear. 
That even you misread it conclusively proves that.

How about
"If this has happened repeatedly and without manual intervention, it was 
probably caused by corrupted data and you may need to restore from backup"
for the crash recovery case and
"If this has happened repeatedly and without manual intervention, it was 
probably caused by corrupted data and you may need to choose an earlier 
recovery target"
for the PITR case.

best regards,
Florian Pflug




Re: [HACKERS] recovery getting interrupted is not so unusual as it used to be

2010-06-02 Thread Robert Haas
On Wed, Jun 2, 2010 at 5:39 PM, Heikki Linnakangas wrote:
> On 02/06/10 23:50, Robert Haas wrote:
>>
>> First, is it appropriate to set the control file state to
>> DB_SHUTDOWNED_IN_RECOVERY even when we're in crash recovery (as
>> opposed to archive recovery/SR)?  My vote is no, but Heikki thought it
>> might be OK.
>
> My logic on that is:
>
> If the database is known to be in good shape, i.e. not corrupt, after
> shutdown during crash recovery, then we should not print the warning at
> restart saying "This probably means that some data is corrupted". There's no
> reason to believe the database is corrupt if it's a controlled shutdown, so
> setting control file state to DB_SHUTDOWNED_IN_RECOVERY is OK. But if it's
> not OK for some reason, then we really shouldn't allow the shut down in the
> first place until we hit the end of WAL.
>
> So the option "allow shutdown, but warn at restart that your data is
> probably corrupt" does not make sense in any case.

Well, the point is, we emit that message every time we go to recover
from a crash.  Presumably the message is as valid after a restart of
crash recovery as it was the first time around.



But maybe the message isn't right the first time either.  After all
the point of having a write-ahead log in the first place is that we
should be able to prevent corruption in the event of an unexpected
shutdown.  Maybe the right thing to do is to forget about adding a new
state and just remove or change the errhint from these messages:

ereport(LOG,
        (errmsg("database system was interrupted while in recovery at %s",
                str_time(ControlFile->time)),
         errhint("This probably means that some data is corrupted and"
                 " you will have to use the last backup for recovery.")));

ereport(LOG,
        (errmsg("database system was interrupted while in recovery at log time %s",
                str_time(ControlFile->checkPointCopy.time)),
         errhint("If this has occurred more than once some data might be corrupted"
                 " and you might need to choose an earlier recovery target.")));

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] recovery getting interrupted is not so unusual as it used to be

2010-06-02 Thread Heikki Linnakangas

On 02/06/10 23:50, Robert Haas wrote:

> First, is it appropriate to set the control file state to
> DB_SHUTDOWNED_IN_RECOVERY even when we're in crash recovery (as
> opposed to archive recovery/SR)?  My vote is no, but Heikki thought it
> might be OK.


My logic on that is:

If the database is known to be in good shape, i.e. not corrupt, after 
shutdown during crash recovery, then we should not print the warning at 
restart saying "This probably means that some data is corrupted". 
There's no reason to believe the database is corrupt if it's a 
controlled shutdown, so setting control file state to 
DB_SHUTDOWNED_IN_RECOVERY is OK. But if it's not OK for some reason, 
then we really shouldn't allow the shut down in the first place until we 
hit the end of WAL.


So the option "allow shutdown, but warn at restart that your data is 
probably corrupt" does not make sense in any case.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] recovery getting interrupted is not so unusual as it used to be

2010-06-02 Thread Robert Haas
On Mon, May 17, 2010 at 4:33 AM, Fujii Masao  wrote:
> On Sat, May 15, 2010 at 3:20 AM, Robert Haas  wrote:
>> Hmm, OK, I think that makes sense.  Would you care to propose a patch?
>
> Yep. Here is the patch.
>
> This patch distinguishes normal shutdown from unexpected exit, while the
> server is in recovery. That is, when smart or fast shutdown is requested
> during recovery, the bgwriter sets ControlFile->state to the newly introduced
> DB_SHUTDOWNED_IN_RECOVERY state.
>
> When recovery starts from the DB_SHUTDOWNED_IN_RECOVERY state, the startup
> process emits
>
>    LOG:  database system was shut down in recovery at 2010-05-12 20:35:24 EDT
>
> instead of
>
>    LOG:  database system was interrupted while in recovery at log
> time 2010-05-12 20:35:24 EDT
>    HINT:  If this has occurred more than once some data might be
> corrupted and you might need to choose an earlier recovery target.

Heikki and I discussed this over IM today and came away with two questions.

First, is it appropriate to set the control file state to
DB_SHUTDOWNED_IN_RECOVERY even when we're in crash recovery (as
opposed to archive recovery/SR)?  My vote is no, but Heikki thought it
might be OK.

Second, one of the places where this patch updates the control file
immediately follows a call to UpdateMinRecoveryPoint().  That can lead
to fsync-ing the control file twice in a row.  Should we worry about
this or just let it go?
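
For readers wondering what avoiding the double fsync could look like, here is a minimal sketch under stated assumptions. It is not how the patch or PostgreSQL handles this; `ToyControlFile` and the `toy_*` functions are hypothetical, and the actual write and fsync are only simulated by a counter. The idea is simply to change both fields in memory and flush once.

```c
#include <stdint.h>

/* Hypothetical in-memory copy of a control file. */
typedef struct
{
    int      state;               /* stand-in for ControlFile->state */
    uint64_t min_recovery_point;  /* stand-in for the minimum recovery point */
    int      dirty;               /* set when memory differs from disk */
    int      flush_count;         /* counts simulated write+fsync cycles */
} ToyControlFile;

/* Advance the minimum recovery point in memory only. */
void
toy_set_min_recovery_point(ToyControlFile *cf, uint64_t lsn)
{
    if (lsn > cf->min_recovery_point)
    {
        cf->min_recovery_point = lsn;
        cf->dirty = 1;
    }
}

/* Change the state in memory only. */
void
toy_set_state(ToyControlFile *cf, int state)
{
    if (cf->state != state)
    {
        cf->state = state;
        cf->dirty = 1;
    }
}

/* Flush once if anything changed; a second call with no changes is a no-op. */
void
toy_flush(ToyControlFile *cf)
{
    if (!cf->dirty)
        return;
    /* A real implementation would write() and fsync() the file here. */
    cf->flush_count++;
    cf->dirty = 0;
}
```

Whether this is worth the restructuring is exactly the open question: shutdown is rare, so the second fsync may simply not matter.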

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] recovery getting interrupted is not so unusual as it used to be

2010-05-25 Thread Robert Haas
On Tue, May 25, 2010 at 12:36 PM, Simon Riggs  wrote:
> On Tue, 2010-05-25 at 19:12 +0900, Fujii Masao wrote:
>> On Mon, May 17, 2010 at 5:33 PM, Fujii Masao  wrote:
>> > On Sat, May 15, 2010 at 3:20 AM, Robert Haas  wrote:
>> >> Hmm, OK, I think that makes sense.  Would you care to propose a patch?
>> >
>> > Yep. Here is the patch.
>> >
>> > This patch distinguishes normal shutdown from unexpected exit, while the
>> > server is in recovery. That is, when smart or fast shutdown is requested
>> > during recovery, the bgwriter sets ControlFile->state to the newly introduced
>> > DB_SHUTDOWNED_IN_RECOVERY state.
>>
>> Is this patch worth applying for 9.0? If not, I'll add it to
>> the next CF for 9.1.
>
> Presumably Robert will be applying the patch? It seems to address the
> concern raised on the thread.

Yes, I was planning to review it.  But if you or someone else would
like to cut in, that's OK too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] recovery getting interrupted is not so unusual as it used to be

2010-05-25 Thread Simon Riggs
On Tue, 2010-05-25 at 19:12 +0900, Fujii Masao wrote:
> On Mon, May 17, 2010 at 5:33 PM, Fujii Masao  wrote:
> > On Sat, May 15, 2010 at 3:20 AM, Robert Haas  wrote:
> >> Hmm, OK, I think that makes sense.  Would you care to propose a patch?
> >
> > Yep. Here is the patch.
> >
> > This patch distinguishes normal shutdown from unexpected exit, while the
> > server is in recovery. That is, when smart or fast shutdown is requested
> > during recovery, the bgwriter sets ControlFile->state to the newly introduced
> > DB_SHUTDOWNED_IN_RECOVERY state.
> 
> Is this patch worth applying for 9.0? If not, I'll add it to
> the next CF for 9.1.

Presumably Robert will be applying the patch? It seems to address the
concern raised on the thread.

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] recovery getting interrupted is not so unusual as it used to be

2010-05-25 Thread Fujii Masao
On Mon, May 17, 2010 at 5:33 PM, Fujii Masao  wrote:
> On Sat, May 15, 2010 at 3:20 AM, Robert Haas  wrote:
>> Hmm, OK, I think that makes sense.  Would you care to propose a patch?
>
> Yep. Here is the patch.
>
> This patch distinguishes normal shutdown from unexpected exit, while the
> server is in recovery. That is, when smart or fast shutdown is requested
> during recovery, the bgwriter sets ControlFile->state to the newly introduced
> DB_SHUTDOWNED_IN_RECOVERY state.

Is this patch worth applying for 9.0? If not, I'll add it to
the next CF for 9.1.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] recovery getting interrupted is not so unusual as it used to be

2010-05-17 Thread Fujii Masao
On Sat, May 15, 2010 at 3:20 AM, Robert Haas  wrote:
> Hmm, OK, I think that makes sense.  Would you care to propose a patch?

Yep. Here is the patch.

This patch distinguishes normal shutdown from unexpected exit, while the
server is in recovery. That is, when smart or fast shutdown is requested
during recovery, the bgwriter sets ControlFile->state to the newly introduced
DB_SHUTDOWNED_IN_RECOVERY state.

When recovery starts from the DB_SHUTDOWNED_IN_RECOVERY state, the startup
process emits

LOG:  database system was shut down in recovery at 2010-05-12 20:35:24 EDT

instead of

LOG:  database system was interrupted while in recovery at log time 2010-05-12 20:35:24 EDT
HINT:  If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


db_state_for_shutdown_in_recovery_v1.patch
Description: Binary data
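
A minimal sketch of the behavior described in this message, for readers without the attachment. It is not the patch itself: the enum is a cut-down stand-in for the control file state (only the three states discussed here), and `startup_log_prefix` and `needs_corruption_hint` are hypothetical helper names. The server itself reports these through `ereport()`.

```c
#include <stddef.h>

/* Cut-down stand-in for the control file state. */
typedef enum
{
    DB_SHUTDOWNED_IN_RECOVERY,  /* clean shutdown requested during recovery */
    DB_IN_CRASH_RECOVERY,       /* state left behind if crash recovery dies */
    DB_IN_ARCHIVE_RECOVERY      /* state left behind if archive recovery dies */
} DBState;

/* Start of the LOG line emitted at startup for each state. */
const char *
startup_log_prefix(DBState state)
{
    switch (state)
    {
        case DB_SHUTDOWNED_IN_RECOVERY:
            return "database system was shut down in recovery";
        case DB_IN_CRASH_RECOVERY:
        case DB_IN_ARCHIVE_RECOVERY:
            return "database system was interrupted while in recovery";
    }
    return "unknown state";   /* not reached for valid input */
}

/* Only an unexpected exit during recovery warrants the corruption hint. */
int
needs_corruption_hint(DBState state)
{
    return state == DB_IN_CRASH_RECOVERY || state == DB_IN_ARCHIVE_RECOVERY;
}
```

With the new state recorded at shutdown, a cleanly stopped standby no longer looks the same as one that crashed mid-recovery, so the hint can be reserved for the second case.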



Re: [HACKERS] recovery getting interrupted is not so unusual as it used to be

2010-05-14 Thread Robert Haas
On Thu, May 13, 2010 at 1:28 AM, Fujii Masao  wrote:
> On Thu, May 13, 2010 at 12:10 PM, Robert Haas  wrote:
>> Hmm, it seems this is my night to rediscover the wisdom of your
>> previous proposals.  I think that state would only be appropriate when
>> we shutdown after reaching consistency, not any shutdown during
>> recovery.  Do you agree?
>
> No. When shutdown happens before reaching consistency, the database might
> be inconsistent, but that doesn't mean that some data might be corrupted.
> We can get a consistent (not corrupted) database by applying the WAL records
> to the inconsistent one.
>
>> HINT:  If this has occurred more than once some data might be
>> corrupted and you might need to choose an earlier recovery target.
>
> I think that the hint message indicates the data corruption which prevents
> recovery from completing, rather than the inconsistency of the database.

Hmm, OK, I think that makes sense.  Would you care to propose a patch?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] recovery getting interrupted is not so unusual as it used to be

2010-05-12 Thread Fujii Masao
On Thu, May 13, 2010 at 12:10 PM, Robert Haas  wrote:
> Hmm, it seems this is my night to rediscover the wisdom of your
> previous proposals.  I think that state would only be appropriate when
> we shutdown after reaching consistency, not any shutdown during
> recovery.  Do you agree?

No. When shutdown happens before reaching consistency, the database might
be inconsistent, but that doesn't mean that some data might be corrupted.
We can get a consistent (not corrupted) database by applying the WAL records
to the inconsistent one.

> HINT:  If this has occurred more than once some data might be
> corrupted and you might need to choose an earlier recovery target.

I think that the hint message indicates the data corruption which prevents
recovery from completing, rather than the inconsistency of the database.
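
The distinction between "inconsistent" and "corrupted" can be shown with a toy model. This is a sketch under simplifying assumptions, not PostgreSQL's redo code: `ToyWalRecord` and `toy_replay` are hypothetical, pages are plain integers, and each record overwrites a whole page, so replaying a record twice is harmless.

```c
#include <stddef.h>

#define TOY_NPAGES 4

/* Hypothetical WAL record: set one page to a value. */
typedef struct
{
    size_t page;
    int    value;
} ToyWalRecord;

/*
 * Apply records [from, to) to the pages and return the position reached.
 * Stopping early leaves the pages inconsistent (some changes applied, some
 * not), but nothing is lost: running the replay again finishes the job.
 */
size_t
toy_replay(int pages[TOY_NPAGES], const ToyWalRecord *wal,
           size_t from, size_t to)
{
    size_t i;

    for (i = from; i < to; i++)
        pages[wal[i].page] = wal[i].value;
    return i;
}
```

An interrupted replay followed by a restart from the same starting point ends in the same page contents as one uninterrupted replay. Corruption is the different case where the replay itself cannot get past some record, which is what the hint is meant to describe.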

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] recovery getting interrupted is not so unusual as it used to be

2010-05-12 Thread Robert Haas
On Wed, May 12, 2010 at 11:07 PM, Fujii Masao  wrote:
> On Thu, May 13, 2010 at 10:01 AM, Robert Haas  wrote:
>> When firing up a properly shut down HS slave, I get:
>>
>> LOG:  database system was interrupted while in recovery at log time
>> 2010-05-12 20:35:24 EDT
>> HINT:  If this has occurred more than once some data might be
>> corrupted and you might need to choose an earlier recovery target.
>>
>> But this is kind of an alarming hint for what is now a normal and
>> expected condition.  Can we detect the difference between the case
>> where the HINT is really accurate and the case where it's not in some
>> way, and display a better message?
>
> How about my previous proposal: adding a new system status like
> DB_SHUTDOWNED_IN_RECOVERY, setting the status to it when the shutdown
> is performed during recovery, and reporting a suitable message
> when starting up the server from it?
> http://archives.postgresql.org/pgsql-hackers/2010-02/msg00337.php

Hmm, it seems this is my night to rediscover the wisdom of your
previous proposals.  I think that state would only be appropriate when
we shutdown after reaching consistency, not any shutdown during
recovery.  Do you agree?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] recovery getting interrupted is not so unusual as it used to be

2010-05-12 Thread Fujii Masao
On Thu, May 13, 2010 at 10:01 AM, Robert Haas  wrote:
> When firing up a properly shut down HS slave, I get:
>
> LOG:  database system was interrupted while in recovery at log time
> 2010-05-12 20:35:24 EDT
> HINT:  If this has occurred more than once some data might be
> corrupted and you might need to choose an earlier recovery target.
>
> But this is kind of an alarming hint for what is now a normal and
> expected condition.  Can we detect the difference between the case
> where the HINT is really accurate and the case where it's not in some
> way, and display a better message?

How about my previous proposal: adding a new system status like
DB_SHUTDOWNED_IN_RECOVERY, setting the status to it when the shutdown
is performed during recovery, and reporting a suitable message
when starting up the server from it?
http://archives.postgresql.org/pgsql-hackers/2010-02/msg00337.php

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



[HACKERS] recovery getting interrupted is not so unusual as it used to be

2010-05-12 Thread Robert Haas
When firing up a properly shut down HS slave, I get:

LOG:  database system was interrupted while in recovery at log time 2010-05-12 20:35:24 EDT
HINT:  If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.

But this is kind of an alarming hint for what is now a normal and
expected condition.  Can we detect the difference between the case
where the HINT is really accurate and the case where it's not in some
way, and display a better message?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
