Re: [HACKERS] Problem with PITR recovery

2005-04-21 Thread Simon Riggs
On Thu, 2005-04-21 at 08:57 -0400, Bruce Momjian wrote: > Michael Paesold wrote: > > Tom Lane wrote: > > > Bruce Momjian wrote: > > >> OK, makes sense. Could we give them a command to archive it before they > > >> shut down? That would make sense. > > > > > > Not if the idea is to be certain you

Re: [HACKERS] Problem with PITR recovery

2005-04-21 Thread Bruce Momjian
Michael Paesold wrote: > Tom Lane wrote: > > > Bruce Momjian wrote: > >> OK, makes sense. Could we give them a command to archive it before they > >> shut down? That would make sense. > > > > Not if the idea is to be certain you got everything ... I think what we > > have to do is document a man

Re: [HACKERS] Problem with PITR recovery

2005-04-20 Thread Michael Paesold
Tom Lane wrote: Bruce Momjian wrote: OK, makes sense. Could we give them a command to archive it before they shut down? That would make sense. Not if the idea is to be certain you got everything ... I think what we have to do is document a manual procedure for archiving the last XLOG file. What B

Re: [HACKERS] Problem with PITR recovery

2005-04-20 Thread Tom Lane
> OK, makes sense. Could we give them a command to archive it before they > shut down? That would make sense. Not if the idea is to be certain you got everything ... I think what we have to do is document a manual procedure for archiving the last XLOG file. But really my question is "what's the

Re: [HACKERS] Problem with PITR recovery

2005-04-20 Thread Bruce Momjian
Tom Lane wrote: > Bruce Momjian writes: > > Tom Lane wrote: > >> However, this still begs the question of why we are bothering. > >> I disagree with the goal in this particular case anyhow: I do not > >> think it's necessary, safe, nor sane for a shutdown to try to archive > >> the last XLOG segme

Re: [HACKERS] Problem with PITR recovery

2005-04-20 Thread Tom Lane
Bruce Momjian writes: > Tom Lane wrote: >> However, this still begs the question of why we are bothering. >> I disagree with the goal in this particular case anyhow: I do not >> think it's necessary, safe, nor sane for a shutdown to try to archive >> the last XLOG segment. Even if we fixed the xl

Re: [HACKERS] Problem with PITR recovery

2005-04-20 Thread Bruce Momjian
Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > Treating shutdown checkpoint markers as xlog switches is possible but > > gives problems since archive_command is a SUSET variable. On replay we > > wouldn't necessarily know whether a shutdown checkpoint was treated as > > an xlog switc

Re: [HACKERS] Problem with PITR recovery

2005-04-20 Thread Simon Riggs
On Wed, 2005-04-20 at 15:51 -0400, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > AFAICS this is the only case of unconditionally acquiring all 3 locks. > > You just lost me ... I think the above is certainly a bad idea from a > concurrency standpoint, and very possibly a deadlock r

Re: [HACKERS] Problem with PITR recovery

2005-04-20 Thread Simon Riggs
On Wed, 2005-04-20 at 15:59 -0400, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > Treating shutdown checkpoint markers as xlog switches is possible but > > gives problems since archive_command is a SUSET variable. On replay we > > wouldn't necessarily know whether a shutdown checkpoi

Re: [HACKERS] Problem with PITR recovery

2005-04-20 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes: > Treating shutdown checkpoint markers as xlog switches is possible but > gives problems since archive_command is a SUSET variable. On replay we > wouldn't necessarily know whether a shutdown checkpoint was treated as > an xlog switch when it was written, so

Re: [HACKERS] Problem with PITR recovery

2005-04-20 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes: > AFAICS this is the only case of unconditionally acquiring all 3 locks. You just lost me ... I think the above is certainly a bad idea from a concurrency standpoint, and very possibly a deadlock risk. In any case you are thinking about it the wrong way. I

Re: [HACKERS] Problem with PITR recovery

2005-04-20 Thread Simon Riggs
On Mon, 2005-04-18 at 23:20 +0100, Simon Riggs wrote: > My plan would be to write a special xlog record for xlog switching. This > would be a special processing instruction, rather than a data/redo > instructions. This would be implemented as another xlog info value on > the xlog_redo resource mana

Re: [HACKERS] Problem with PITR recovery

2005-04-20 Thread Simon Riggs
On Wed, 2005-04-20 at 09:28 +0200, Klaus Naumann wrote: > > > Actually, me too. Never saw the need for the Oracle command myself. > > It actually has. If you want to move your redo logs to a new disk, you > create a new redo log file and then issue an ALTER SYSTEM SWITCH LOGFILE; > to switch to th

Re: [HACKERS] Problem with PITR recovery

2005-04-20 Thread Andrew Rawnsley
It is also recommended when creating new standby control files, when Oracle can't automatically expand the data file capacity on a standby like it does with a live database. Nothing like seeing the 'Didn't restore from sufficiently old backup' message when Oracle is confused (which seems to

Re: [HACKERS] Problem with PITR recovery

2005-04-20 Thread Klaus Naumann
Hi Simon, Actually, me too. Never saw the need for the Oracle command myself. It actually has. If you want to move your redo logs to a new disk, you create a new redo log file and then issue an ALTER SYSTEM SWITCH LOGFILE; to switch to the new logfile. Then you can remove the "old" one (speaking jus

Re: [HACKERS] Problem with PITR recovery

2005-04-19 Thread Oleg Bartunov
On Tue, 19 Apr 2005, Jeff Davis wrote: Unless I misunderstand something, I think you're overreacting a bit. The You're right. It's all emotions :) Regards, Oleg

Re: [HACKERS] Problem with PITR recovery

2005-04-19 Thread Bruce Momjian
Jeff Davis wrote: > Unless I misunderstand something, I think you're overreacting a bit. The > failure case is that the machine on which the database resides vaporizes > after you've done "pg_stop_backup()" but before the archiver archives > the WAL segments used during the backup procedure. > > I

Re: [HACKERS] Problem with PITR recovery

2005-04-19 Thread Jeff Davis
On Tue, 2005-04-19 at 15:23 +0400, Oleg Bartunov wrote: > This is not an argument ! It's shame we still don't understand do we really > have reliable online backup or just hype with a lot of restriction and > caution. I'm not experienced Oracle DBA but I don't want to be a blind user. > I read sem

Re: [HACKERS] Problem with PITR recovery

2005-04-19 Thread Bruce Momjian
Alvaro Herrera wrote: > On Tue, Apr 19, 2005 at 11:05:32AM -0400, Bruce Momjian wrote: > > Simon Riggs wrote: > > > > The disk would only fill if the archiver doesn't keep up with > > > transmitting xlog files to the archive. The archive can fill up if it is > > > not correctly sized, even now. Sw

Re: [HACKERS] Problem with PITR recovery

2005-04-19 Thread Alvaro Herrera
On Tue, Apr 19, 2005 at 11:05:32AM -0400, Bruce Momjian wrote: > Simon Riggs wrote: > > The disk would only fill if the archiver doesn't keep up with > > transmitting xlog files to the archive. The archive can fill up if it is > > not correctly sized, even now. Switching log files every N seconds

Re: [HACKERS] Problem with PITR recovery

2005-04-19 Thread Tom Lane
Bruce Momjian writes: > I was thinking of the archiver filling because of lots of almost-empty > 16mb files. If you archive every five seconds, it is 11 Gigs/hour, > which is not too bad, I guess, but I would bet compression would save > space and I/O load too. If you wanted to archive every few
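As a quick check of the 11 Gigs/hour figure quoted above, here is a minimal Python sketch. It assumes that every forced switch archives one full 16 MB segment and that nothing is compressed; the function name is illustrative only and is not a PostgreSQL API.

```python
SEGMENT_MB = 16  # WAL segment size quoted in the thread


def archive_gib_per_hour(switch_interval_s, segment_mb=SEGMENT_MB):
    """Archive volume per hour if one full segment is archived every interval."""
    segments_per_hour = 3600 / switch_interval_s
    return segments_per_hour * segment_mb / 1024


# One segment every five seconds: 720 segments * 16 MB = 11520 MB
print(archive_gib_per_hour(5))  # 11.25
```

Under the same assumptions, a switch once a minute would come to a little under 1 GB per hour, which is why the interval matters more than the segment size here.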

Re: [HACKERS] Problem with PITR recovery

2005-04-19 Thread Bruce Momjian
Simon Riggs wrote: > On Mon, 2005-04-18 at 21:25 -0400, Bruce Momjian wrote: > > Tom Lane wrote: > > > Simon Riggs <[EMAIL PROTECTED]> writes: > > > > The wal file could be truncated after the log switch record, though I'd > > > > want to make sure that didn't cause other problems. > > > > > > Whi

Re: [HACKERS] Problem with PITR recovery

2005-04-19 Thread Oleg Bartunov
On Tue, 19 Apr 2005, Simon Riggs wrote: On Tue, 2005-04-19 at 08:55 +0400, Oleg Bartunov wrote: On Mon, 18 Apr 2005, Simon Riggs wrote: but I'm not sure it's best practice to delete them at that point. I would recommend that users keep at least the last 3 backups. So, I'd prefer the wording ...all

Re: [HACKERS] Problem with PITR recovery

2005-04-19 Thread Simon Riggs
On Tue, 2005-04-19 at 08:55 +0400, Oleg Bartunov wrote: > On Mon, 18 Apr 2005, Simon Riggs wrote: > > but I'm not sure it's best practice to delete them at that point. I > > would recommend that users keep at least the last 3 backups. So, I'd > > prefer the wording > > > > ...all archived WAL segme

Re: [HACKERS] Problem with PITR recovery

2005-04-19 Thread Simon Riggs
On Mon, 2005-04-18 at 21:25 -0400, Bruce Momjian wrote: > Tom Lane wrote: > > Simon Riggs <[EMAIL PROTECTED]> writes: > > > The wal file could be truncated after the log switch record, though I'd > > > want to make sure that didn't cause other problems. > > > > Which it would: that would break WAL

Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread Oleg Bartunov
On Tue, 19 Apr 2005, Simon Riggs wrote: I'd suggest this as a backpatch for 8.0.x, when completed. Not a chance --- it's a new feature, not a bug fix, and has substantial risk of breaking things. No problem for me personally; I only request it, according to users wishes. Users wish deterministic pr

Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread Oleg Bartunov
On Mon, 18 Apr 2005, Simon Riggs wrote: On Mon, 2005-04-18 at 13:41 -0400, Bruce Momjian wrote: Tom Lane wrote: Bruce Momjian writes: I guess I didn't see the connection between the file system backup and the WAL files, when in fact you need the WAL files that go with the file system backup to do

Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread Bruce Momjian
Simon Riggs wrote: > On Mon, 2005-04-18 at 13:41 -0400, Bruce Momjian wrote: > > Tom Lane wrote: > > > Bruce Momjian writes: > > > > I guess I didn't see the connection between the file system backup and > > > > the WAL files, when in fact you need the WAL files that go with the file > > > > syste

Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread Bruce Momjian
Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > The wal file could be truncated after the log switch record, though I'd > > want to make sure that didn't cause other problems. > > Which it would: that would break WAL file recycling. Good point. I don't see non-full WAL archiving as

Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread Simon Riggs
On Mon, 2005-04-18 at 19:21 -0400, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > The wal file could be truncated after the log switch record, though I'd > > want to make sure that didn't cause other problems. > > Which it would: that would break WAL file recycling. Yeh, there's ju

Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes: > The wal file could be truncated after the log switch record, though I'd > want to make sure that didn't cause other problems. Which it would: that would break WAL file recycling. > That would be initiated through a single function pg_walfile_switch() > wh

Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread Simon Riggs
On Mon, 2005-04-18 at 13:41 -0400, Bruce Momjian wrote: > Tom Lane wrote: > > Bruce Momjian writes: > > > I guess I didn't see the connection between the file system backup and > > > the WAL files, when in fact you need the WAL files that go with the file > > > system backup to do the recovery.

Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread Simon Riggs
On Mon, 2005-04-18 at 16:44 +0200, [EMAIL PROTECTED] wrote: > Rob Butler <[EMAIL PROTECTED]> wrote on 18.04.2005, 15:05:20: > > > > > I'd say it's very not cool :) It's not we all > > > expected from PITR. > > > I recall now Simon mentioned about that and have it > > > in his TODO. > > > Other thi

Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread Bruce Momjian
Tom Lane wrote: > Bruce Momjian writes: > > Tom Lane wrote: > >> Archive on stop is right out. The common reason for a stop is that the > >> system is being shut down, and we don't have time to archive a WAL file > >> before init will kill -9 us. > > > Ah, good point. Can we do it for 'smart' s

Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread Bruce Momjian
Tom Lane wrote: > Bruce Momjian writes: > > I guess I didn't see the connection between the file system backup and > > the WAL files, when in fact you need the WAL files that go with the file > > system backup to do the recovery. Do you have new suggested text? > > I think it probably needs to

Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread Tom Lane
Bruce Momjian writes: > Tom Lane wrote: >> Archive on stop is right out. The common reason for a stop is that the >> system is being shut down, and we don't have time to archive a WAL file >> before init will kill -9 us. > Ah, good point. Can we do it for 'smart' shutdown mode, which is the > d

Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread Tom Lane
Bruce Momjian writes: > I guess I didn't see the connection between the file system backup and > the WAL files, when in fact you need the WAL files that go with the file > system backup to do the recovery. Do you have new suggested text? I think it probably needs to mention *both* the tar dump

Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread Bruce Momjian
Tom Lane wrote: > Bruce Momjian writes: > > Ragnar Hafstað wrote: > >> On Sat, 2005-04-16 at 23:06 -0400, Bruce Momjian wrote: > >>> I am not clear on what the "backup dump file" is? I assume it means > >>> 0001123455CD. It is called "WAL segment file" above. I > >>> will rename tha

Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread Bruce Momjian
Tom Lane wrote: > Bruce Momjian writes: > > OK, I updated the two current TODO items: > > * Automatically force archiving of partially-filled WAL files when > > pg_stop_backup() is called or the server is stopped > > > Is this OK? > > Archive on stop is right out. The common reason fo

Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread Tom Lane
Bruce Momjian writes: > OK, I updated the two current TODO items: > * Automatically force archiving of partially-filled WAL files when > pg_stop_backup() is called or the server is stopped > Is this OK? Archive on stop is right out. The common reason for a stop is that the system

Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread Tom Lane
Bruce Momjian writes: > Ragnar Hafstað wrote: >> On Sat, 2005-04-16 at 23:06 -0400, Bruce Momjian wrote: >>> I am not clear on what the "backup dump file" is? I assume it means >>> 0001123455CD. It is called "WAL segment file" above. I >>> will rename that phrase to match the above

Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread Greg Stark
Bruce Momjian writes: > You mean don't force the archive copy but just have pg_stop_backup() > hang until the files fill? Yea, we could do that, but there is no way > to know how long the hang might take. Actually I meant both. -- greg

Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread Bruce Momjian
Greg Stark wrote: > Bruce Momjian writes: > > > I see your point. New text is: > > > > 4 Again connect to the database as a superuser, and issue the command > > > > SELECT pg_stop_backup(); > > > > This should return successfully. > > > > 5 Once the W

Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread Bruce Momjian
Oleg Bartunov wrote: > On Mon, 18 Apr 2005, Rob Butler wrote: > > > > >> I'd say it's very not cool :) It's not we all > >> expected from PITR. > >> I recall now Simon mentioned about that and have it > >> in his TODO. > >> Other thing I don't understand what's the problem to > >> generate WAL fil

Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread Bruce Momjian
OK, I updated the two current TODO items: * Allow point-in-time recovery to archive partially filled write-ahead logs Currently only full WAL files are archived. This means that the most recent transactions aren't available for recovery in ca

Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread Greg Stark
Bruce Momjian writes: > I see your point. New text is: > > 4 Again connect to the database as a superuser, and issue the command > > SELECT pg_stop_backup(); > > This should return successfully. > > 5 Once the WAL segment files used du

Re: Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread simon
Rob Butler <[EMAIL PROTECTED]> wrote on 18.04.2005, 15:05:20: > > > I'd say it's very not cool :) It's not we all > > expected from PITR. > > I recall now Simon mentioned about that and have it > > in his TODO. > > Other thing I don't understand what's the problem to > > generate WAL file > > by

Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread Bruce Momjian
Oleg Bartunov wrote: > > Is there something in the current wording that needs clarification? > > I'd say it's very not cool :) It's not we all expected from PITR. > I recall now Simon mentioned about that and have it in his TODO. > Other thing I don't understand what's the problem to generate WAL

Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread Oleg Bartunov
On Mon, 18 Apr 2005, Rob Butler wrote: I'd say it's very not cool :) It's not we all expected from PITR. I recall now Simon mentioned about that and have it in his TODO. Other thing I don't understand what's the problem to generate WAL file by demand ? Probably, TODO should says about this. This w

Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread Bruce Momjian
Jeff Davis wrote: > On Mon, 2005-04-18 at 00:20 -0400, Bruce Momjian wrote: > > Jeff Davis wrote: > > > > > > Can you sort of run through the failure case again, and how to prevent > > > it? > > > > The failure case in the original docs is that you do your > > pg_stop_backup(), and then delete al

Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread Rob Butler
> I'd say it's very not cool :) It's not we all > expected from PITR. > I recall now Simon mentioned about that and have it > in his TODO. > Other thing I don't understand what's the problem to > generate WAL file > by demand ? Probably, TODO should says about this. This would definitely be a good

Re: [HACKERS] Problem with PITR recovery

2005-04-18 Thread Oleg Bartunov
On Mon, 18 Apr 2005, Bruce Momjian wrote: Jeff Davis wrote: I could still use a little clarification. It seems sort of like there is an extra step, like: (1) start archiving (2) pg_start_backup() (3) copy PGDATA directory with tar (4) pg_stop_backup() (5) ?? And the text you have at http://candle.p

Re: [HACKERS] Problem with PITR recovery

2005-04-17 Thread Jeff Davis
On Mon, 2005-04-18 at 00:20 -0400, Bruce Momjian wrote: > Jeff Davis wrote: > > > > Can you sort of run through the failure case again, and how to prevent > > it? > > The failure case in the original docs is that you do your > pg_stop_backup(), and then delete all the WAL file before the *.backup

Re: [HACKERS] Problem with PITR recovery

2005-04-17 Thread Bruce Momjian
Jeff Davis wrote: > > I could still use a little clarification. It seems sort of like there is > an extra step, like: > > (1) start archiving > (2) pg_start_backup() > (3) copy PGDATA directory with tar > (4) pg_stop_backup() > (5) ?? > > And the text you have at > http://candle.pha.pa.us/main/w

Re: [HACKERS] Problem with PITR recovery

2005-04-17 Thread Jeff Davis
I could still use a little clarification. It seems sort of like there is an extra step, like:

(1) start archiving
(2) pg_start_backup()
(3) copy PGDATA directory with tar
(4) pg_stop_backup()
(5) ??

And the text you have at http://candle.pha.pa.us/main/writings/pgsql/sgml/backup-online.html say
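One way to read the open step (5), going by the replies elsewhere in this thread, is "wait until the WAL segments used during the backup have been archived". A minimal Python sketch of that check follows. The helper names and the segment file names are hypothetical, and how the list of required segments is obtained is left as an assumption.

```python
def missing_segments(required, archived):
    """Return the required WAL segment names not yet in the archive, in order."""
    archived_set = set(archived)
    return [name for name in required if name not in archived_set]


def backup_is_restorable(required, archived):
    """A base backup is usable only once every segment it needs is archived."""
    return not missing_segments(required, archived)


# Hypothetical segment names, for illustration only
required = ["000000010000000000000003", "000000010000000000000004"]
archived = ["000000010000000000000002", "000000010000000000000003"]

print(missing_segments(required, archived))      # ['000000010000000000000004']
print(backup_is_restorable(required, archived))  # False
```

In this example the backup would not yet be safe to rely on, because the last segment written during the backup has not reached the archive.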

Re: [HACKERS] Problem with PITR recovery

2005-04-17 Thread Bruce Momjian
pgman wrote: > I figured that part of the goal of PITR was that you could recover from > just the tar backup and archived WAL files --- using the pg_xlog > contents is nice, but not something we can require. > > I understood the last missing WAL log would cause missing information, > but not that

Re: [HACKERS] Problem with PITR recovery

2005-04-17 Thread Bruce Momjian
Bruce Momjian wrote: > I figured that part of the goal of PITR was that you could recover from > just the tar backup and archived WAL files --- using the pg_xlog > contents is nice, but not something we can require. > > I understood the last missing WAL log would cause missing information, > but n

Re: [HACKERS] Problem with PITR recovery

2005-04-17 Thread Bruce Momjian
Ragnar Hafstað wrote: > On Sat, 2005-04-16 at 23:06 -0400, Bruce Momjian wrote: > [about backup procedure with PITR documentation > > > I see in the docs: > > > > To make use of this backup, you will need to keep around all the WAL > > segment files generated at or after the starting time

Re: [HACKERS] Problem with PITR recovery

2005-04-17 Thread Ragnar Hafstað
On Sat, 2005-04-16 at 23:06 -0400, Bruce Momjian wrote: [about backup procedure with PITR documentation > I see in the docs: > > To make use of this backup, you will need to keep around all the WAL > segment files generated at or after the starting time of the backup. To > aid y

Re: [HACKERS] Problem with PITR recovery

2005-04-16 Thread Bruce Momjian
Tom Lane wrote: > Bruce Momjian writes: > > The problem is that we don't archive the partially written xlog file, > > and in this case that xlog file contains the information needed to make > > the tar file consistent. > > > Is this a known problem? Do we document this? If so, I can't find it.

Re: [HACKERS] Problem with PITR recovery

2005-04-16 Thread Tom Lane
Bruce Momjian writes: > The problem is that we don't archive the partially written xlog file, > and in this case that xlog file contains the information needed to make > the tar file consistent. > Is this a known problem? Do we document this? If so, I can't find it. Yes, and yes. You did not

[HACKERS] Problem with PITR recovery

2005-04-15 Thread Bruce Momjian
I had a problem using PITR recovery just now. If I do:

    SELECT pg_start_backup('label');
    do my tar
    SELECT pg_stop_backup();

and stop the server, delete /data, then recover from the tar, delete files in pg_xlog, then set recovery.conf to restore, it fails, I think because n
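The sequence in this post can be sketched as a small Python helper that only builds the commands, without running them. The paths, the tar file name and the function name are assumptions for illustration. The label is interpolated without escaping, so this is a sketch and not something to run against untrusted input.

```python
def backup_commands(label, pgdata, tar_path):
    """Build (but do not run) the three steps described in the post."""
    return [
        ["psql", "-c", f"SELECT pg_start_backup('{label}');"],
        ["tar", "-cf", tar_path, pgdata],
        ["psql", "-c", "SELECT pg_stop_backup();"],
    ]


# Hypothetical paths, for illustration only
for cmd in backup_commands("label", "/data", "/backup/base.tar"):
    print(" ".join(cmd))
```

As the replies in the thread point out, these three steps alone do not guarantee a restorable backup: the WAL segment that was being written during the backup also has to reach the archive.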