Re: [HACKERS] Forcing current WAL file to be archived

2006-07-31 Thread Hannu Krosing
Ühel kenal päeval, T, 2006-07-25 kell 17:05, kirjutas Simon Riggs:
 On Tue, 2006-07-25 at 11:53 -0400, Tom Lane wrote:
  That's fine, but feature freeze is in a week and we don't even have
  the
  basic function for manually doing a log file switch.  Let's get that
  done first and then think about automatic switches.
 
 Agreed.

Simon, did you (or anybody else) manage to complete the patch for adding
the (wal_filename, offset) returning function ?


-- 

Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-31 Thread Simon Riggs
On Tue, 2006-08-01 at 00:40 +0300, Hannu Krosing wrote:
 Ühel kenal päeval, T, 2006-07-25 kell 17:05, kirjutas Simon Riggs:
  On Tue, 2006-07-25 at 11:53 -0400, Tom Lane wrote:
   That's fine, but feature freeze is in a week and we don't even have
   the
   basic function for manually doing a log file switch.  Let's get that
   done first and then think about automatic switches.
  
  Agreed.
 
 Simon, did you (or anybody else) manage to complete the patch for adding
 the (wal_filename, offset) returning function ?

Just wrapping now.

I tried to add archive_timeout also, though am still fiddling with that,
so I've taken that back out for now.

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-31 Thread Simon Riggs
On Tue, 2006-08-01 at 00:40 +0300, Hannu Krosing wrote:
 Ühel kenal päeval, T, 2006-07-25 kell 17:05, kirjutas Simon Riggs:
  On Tue, 2006-07-25 at 11:53 -0400, Tom Lane wrote:
   That's fine, but feature freeze is in a week and we don't even have
   the
   basic function for manually doing a log file switch.  Let's get that
   done first and then think about automatic switches.
  
  Agreed.

So: automatic switching of xlogs

I've written a patch to implement archive_timeout, apart from the
infrastructure required to allow archiver to use LWLocks.

If we do this, it will allow the archiver to write to shared memory and
log files in particular. People may have a robustness issue with that,
so I'd like to check before doing this.

As a result, I'm thinking: What's the minimum infrastructure I can get
away with?

I'll post to -patches what I've got, to further this discussion.

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Bruce Momjian
Where are we on these TODO items:

  o Allow point-in-time recovery to archive partially filled
write-ahead logs [pitr]

Currently only full WAL files are archived. This means that the
most recent transactions aren't available for recovery in case
of a disk failure. This could be triggered by a user command or
a timer.

  o Automatically force archiving of partially-filled WAL files when
pg_stop_backup() is called or the server is stopped

Doing this will allow administrators to know more easily when
the archive contains all the files needed for point-in-time
recovery.
http://archives.postgresql.org/pgsql-patches/2005-04/msg00121.php

  o Add reporting of the current WAL file, perhaps as part of
partial log file archiving

Seems they should be completed for 8.2.  I have only a /contrib version for
the last one.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Where are we on these TODO items:

   o Allow point-in-time recovery to archive partially filled
 write-ahead logs [pitr]

I believe we'd agreed that the necessary infrastructure for this is
just a function to tell the current WAL segment name and offset.

   o Automatically force archiving of partially-filled WAL files when
 pg_stop_backup() is called or the server is stopped

I see no need for that to be automatic.  I'd vote for a simple
function pg_finish_wal_segment() or something like that, which you
call just after pg_stop_backup() if you want this behavior.  Trying
to tie it into pg_stop_backup() will only make things more complicated
and less flexible.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Where are we on these TODO items:
 
o Allow point-in-time recovery to archive partially filled
  write-ahead logs [pitr]
 
 I believe we'd agreed that the necessary infrastructure for this is
 just a function to tell the current WAL segment name and offset.

Yes, perhaps, though I can envision a GUC that does regularly partial
archiving.  I will add a question mark to the item.  In fact, the
description has more details:

  o Allow point-in-time recovery to archive partially filled
write-ahead logs? [pitr]

Currently only full WAL files are archived. This means that the
most recent transactions aren't available for recovery in case
of a disk failure. This could be triggered by a user command or
a timer.

o Automatically force archiving of partially-filled WAL files when
  pg_stop_backup() is called or the server is stopped
 
 I see no need for that to be automatic.  I'd vote for a simple
 function pg_finish_wal_segment() or something like that, which you
 call just after pg_stop_backup() if you want this behavior.  Trying
 to tie it into pg_stop_backup() will only make things more complicated
 and less flexible.

I assumed we would have a function like pg_finish_wal_segment(), and
server stop and stop_backup would call it too, the reason being, it
would greatly simplify our documentation on how to use PITR if these
were done automatically.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 I assumed we would have a function like pg_finish_wal_segment(), and
 server stop and stop_backup would call it too,

That idea is *exactly* what I'm objecting to.

 the reason being, it
 would greatly simplify our documentation on how to use PITR if these
 were done automatically.

No it wouldn't, it'd just bloat the already excessive WAL volume.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Hannu Krosing
Ühel kenal päeval, T, 2006-07-25 kell 10:51, kirjutas Bruce Momjian:
 Where are we on these TODO items:
 

   o Add reporting of the current WAL file, perhaps as part of
 partial log file archiving

It would be nice to have a function that tells both filename and offset
of current WAL file, so it would be possible to do live async streaming
of up-to-subsecond changes without too much overhead.

This could be used to solve all the above problems with some extra work
on side of WAL-shipping framework.

Marko Kreen thought he might also do some work on such a function, but
he is on a vacation, so I'm not sure he will be able to get it done by
feature freeze.

It would be extra nice if postgres would nudge some external process on
each WAL write via a signal or UDP packet (after write and before flush)
so there would be virtually no delay between WAL write and notification,
but just the function would also go a long way.

 Seems they should be completed for 8.2.  I have only a /contrib version for
 the last one.

-- 

Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  I assumed we would have a function like pg_finish_wal_segment(), and
  server stop and stop_backup would call it too,
 
 That idea is *exactly* what I'm objecting to.
 
  the reason being, it
  would greatly simplify our documentation on how to use PITR if these
  were done automatically.
 
 No it wouldn't, it'd just bloat the already excessive WAL volume.

Well, it only would happen when you have PITR enabled.

For example, if you do pg_stop_backup(), in what cases would you not
also call pg_finish_wal_segment()?  I can't think of one.  Maybe the
server restart case isn't necessary.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Bruce Momjian
Hannu Krosing wrote:
 ?hel kenal p?eval, T, 2006-07-25 kell 10:51, kirjutas Bruce Momjian:
  Where are we on these TODO items:
  
 
o Add reporting of the current WAL file, perhaps as part of
  partial log file archiving
 
 It would be nice to have a function that tells both filename and offset
 of current WAL file, so it would be possible to do live async streaming
 of up-to-subsecond changes without too much overhead.

OK, offset added to TODO item.  What would the offset give us?

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 For example, if you do pg_stop_backup(), in what cases would you not
 also call pg_finish_wal_segment()?  I can't think of one.

I can't see why you would need to, unless your intention is not to run
PITR at all but only to make a filesystem backup instead of using
pg_dump.  Normally you'd be running a continuing archival process and
there's no particular need to force the current WAL segment off to
archive at that exact instant.

My point here is that forcing the current segment to archive is a
function of whatever your continuous-archiving process is, and it's
not necessarily tied to backups.  We should not prejudge when people
want that fairly-expensive function to be invoked.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Simon Riggs
On Tue, 2006-07-25 at 11:20 -0400, Bruce Momjian wrote:
 Tom Lane wrote:
  Bruce Momjian [EMAIL PROTECTED] writes:
   Where are we on these TODO items:
  
 o Allow point-in-time recovery to archive partially filled
   write-ahead logs [pitr]
  
  I believe we'd agreed that the necessary infrastructure for this is
  just a function to tell the current WAL segment name and offset.
 
 Yes, perhaps, though I can envision a GUC that does regularly partial
 archiving.  I will add a question mark to the item. 

I was planning to add a new GUC

archive_timeout (integer) = max # secs between log file switches

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Simon Riggs
On Tue, 2006-07-25 at 11:07 -0400, Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Where are we on these TODO items:
 
o Allow point-in-time recovery to archive partially filled
  write-ahead logs [pitr]
 
 I believe we'd agreed that the necessary infrastructure for this is
 just a function to tell the current WAL segment name and offset.

Not that I remember. That was just a proposal for backpatching to
8.1/8.0 so that it would be easier to cope with PITR at those releases.

o Automatically force archiving of partially-filled WAL files when
  pg_stop_backup() is called or the server is stopped
 
 I see no need for that to be automatic.  I'd vote for a simple
 function pg_finish_wal_segment() or something like that, which you
 call just after pg_stop_backup() if you want this behavior.  Trying
 to tie it into pg_stop_backup() will only make things more complicated
 and less flexible.

Putting it into pg_stop_backup was what we previously agreed.

Where is the loss of flexibility?

I need to get this straight because I was actually intending to do this
for 8.2, i.e. next few days.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Hannu Krosing
Ühel kenal päeval, T, 2006-07-25 kell 11:27, kirjutas Bruce Momjian:
 Hannu Krosing wrote:
  ?hel kenal p?eval, T, 2006-07-25 kell 10:51, kirjutas Bruce Momjian:
   Where are we on these TODO items:
   
  
 o Add reporting of the current WAL file, perhaps as part of
   partial log file archiving
  
  It would be nice to have a function that tells both filename and offset
  of current WAL file, so it would be possible to do live async streaming
  of up-to-subsecond changes without too much overhead.
 
 OK, offset added to TODO item.  What would the offset give us?

the offset returned by lseek() on the WAL file, that is the end of the
part of the WAL file which has actually been written to.

-- 

Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Bruce Momjian
Hannu Krosing wrote:
 ?hel kenal p?eval, T, 2006-07-25 kell 11:27, kirjutas Bruce Momjian:
  Hannu Krosing wrote:
   ?hel kenal p?eval, T, 2006-07-25 kell 10:51, kirjutas Bruce Momjian:
Where are we on these TODO items:

   
  o Add reporting of the current WAL file, perhaps as part of
partial log file archiving
   
   It would be nice to have a function that tells both filename and offset
   of current WAL file, so it would be possible to do live async streaming
   of up-to-subsecond changes without too much overhead.
  
  OK, offset added to TODO item.  What would the offset give us?
 
 the offset returned by lseek() on the WAL file, that is the end of the
 part of the WAL file which has actually been written to.

Sorry, I was actually asking what use the offset would be to a user.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 On Tue, 2006-07-25 at 11:07 -0400, Tom Lane wrote:
 I see no need for that to be automatic.  I'd vote for a simple
 function pg_finish_wal_segment() or something like that, which you
 call just after pg_stop_backup() if you want this behavior.  Trying
 to tie it into pg_stop_backup() will only make things more complicated
 and less flexible.

 Putting it into pg_stop_backup was what we previously agreed.
 Where is the loss of flexibility?

I don't see why you think this function should be tied to making a
backup.  There are other reasons for wanting to force a WAL switch
than that, and there are scenarios in which you don't need a WAL
switch at the end of a backup.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Simon Riggs
On Tue, 2006-07-25 at 11:31 -0400, Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  For example, if you do pg_stop_backup(), in what cases would you not
  also call pg_finish_wal_segment()?  I can't think of one.
 
 I can't see why you would need to, unless your intention is not to run
 PITR at all but only to make a filesystem backup instead of using
 pg_dump. 

If thats all you want you can set 
archive_command = 'echo %f %p  /dev/null'

  Normally you'd be running a continuing archival process and
 there's no particular need to force the current WAL segment off to
 archive at that exact instant.

That's exactly the point of contention. When we originally completed
PITR we thought that was acceptable. It isn't and many people have stuck
pins in effigies of me since then. :-/

 My point here is that forcing the current segment to archive is a
 function of whatever your continuous-archiving process is, and it's
 not necessarily tied to backups.  We should not prejudge when people
 want that fairly-expensive function to be invoked.

The point is until that last WAL file is backed up, the whole backup is
useless. It isn't good policy to have a backup's value be contingent on
some future event.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Bruce Momjian
Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  On Tue, 2006-07-25 at 11:07 -0400, Tom Lane wrote:
  I see no need for that to be automatic.  I'd vote for a simple
  function pg_finish_wal_segment() or something like that, which you
  call just after pg_stop_backup() if you want this behavior.  Trying
  to tie it into pg_stop_backup() will only make things more complicated
  and less flexible.
 
  Putting it into pg_stop_backup was what we previously agreed.
  Where is the loss of flexibility?
 
 I don't see why you think this function should be tied to making a
 backup.  There are other reasons for wanting to force a WAL switch
 than that, and there are scenarios in which you don't need a WAL

Yes, that is why we would have a separate function too.

 switch at the end of a backup.

Well, I figured if you just did a backup, you would want a switch in
_most_ cases, and since you just did a backup, I figured an extra WAL
file would be minimal additional overhead.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 I was planning to add a new GUC
   archive_timeout (integer) = max # secs between log file switches

That's fine, but feature freeze is in a week and we don't even have the
basic function for manually doing a log file switch.  Let's get that
done first and then think about automatic switches.

Also, forcing another 16M of WAL out every few seconds is a pretty crude
and inefficient way of making sure your hot standby server is up to
date.  As Hannu noted, an archiving script can do better than that if it
can access the current WAL file name and offset.  So we really want to
get the function to return that info done as well.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Bruce Momjian
Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  I was planning to add a new GUC
  archive_timeout (integer) = max # secs between log file switches
 
 That's fine, but feature freeze is in a week and we don't even have the
 basic function for manually doing a log file switch.  Let's get that
 done first and then think about automatic switches.
 
 Also, forcing another 16M of WAL out every few seconds is a pretty crude
 and inefficient way of making sure your hot standby server is up to
 date.  As Hannu noted, an archiving script can do better than that if it
 can access the current WAL file name and offset.  So we really want to
 get the function to return that info done as well.

Agreed.  One concern I have is that we have waited for Simon to complete
this for 1.5 years, and now with a week left he is still working on it
(or starting on it).  I am wondering if someone else needs to take this
on, because if Simon doesn't complete it in a week, we don't have it for
8.2, and we would then have to sit through another year of PITR
complaints.  :-(

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Stephen Frost
* Simon Riggs ([EMAIL PROTECTED]) wrote:
 On Tue, 2006-07-25 at 11:20 -0400, Bruce Momjian wrote:
  Yes, perhaps, though I can envision a GUC that does regularly partial
  archiving.  I will add a question mark to the item. 
 
 I was planning to add a new GUC
 
   archive_timeout (integer) = max # secs between log file switches

I'd love to see both this GUC and the function itself make it into 8.2..

I'm tempted to agree with Bruce about running the wal-archive-function
after pg_stop_backup().  The backup isn't any good without all the WALs
which were used during the backup anyway (iirc) so I can't really think
why you'd want any time at all between backup happening and backup
actually usable.

Also, compared to the backup itself I'd tend to doubt there would be
much of a performance hit.  It may be expensive compared to other
regular queries/operations but an rsync across the entire database isn't
exactly cheap.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Bruce Momjian
Simon Riggs wrote:
 On Tue, 2006-07-25 at 11:31 -0400, Tom Lane wrote:
  Bruce Momjian [EMAIL PROTECTED] writes:
   For example, if you do pg_stop_backup(), in what cases would you not
   also call pg_finish_wal_segment()?  I can't think of one.
  
  I can't see why you would need to, unless your intention is not to run
  PITR at all but only to make a filesystem backup instead of using
  pg_dump. 
 
 If thats all you want you can set 
   archive_command = 'echo %f %p  /dev/null'

Uh, what good is a file system backup without the WAL files modified
during the backup?

   Normally you'd be running a continuing archival process and
  there's no particular need to force the current WAL segment off to
  archive at that exact instant.
 
 That's exactly the point of contention. When we originally completed
 PITR we thought that was acceptable. It isn't and many people have stuck
 pins in effigies of me since then. :-/
 
  My point here is that forcing the current segment to archive is a
  function of whatever your continuous-archiving process is, and it's
  not necessarily tied to backups.  We should not prejudge when people
  want that fairly-expensive function to be invoked.
 
 The point is until that last WAL file is backed up, the whole backup is
 useless. It isn't good policy to have a backup's value be contingent on
 some future event.

Good analysis.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Csaba Nagy
 OK, offset added to TODO item.  What would the offset give us?

The last offset could be remembered by the external program, and it only
has to transfer from the last offset to the new one. It allows
incremental streaming of the WAL files... of course the external program
will be a lot more complex than the current shell scripts which can be
used for WAL archiving...

The problems I see with this is if in this case the normal postgres WAL
archiving won't conflict with this streaming ? And if yes, wouldn't it
be better to have a separate mechanism for the stream based archiving ?
I mean what will happen if postgres successfully archives the WAL file
and then moves it out of way before the streaming process finishes with
it, the streaming process will have a problem... 

A few months ago I spent some time thinking about a solution where a WAL
based standby could be built using only normal data base connections to
the master server, and one of the ideas was to create a WAL subscription
mechanism where the standby subscribes for getting WAL files, and
updates it's subscription status with the last processed WAL file after
each processed file. The master can then recycle the WAL files only
after they were confirmed by all current subscriptions... and to avoid
excessive WAL file bloat if a slave goes offline, the subscription could
be canceled automatically if it gets too much behind.

If this mechanism is in place, it would be also nice if the slave could
ask for the WAL records to be streamed on a normal data base connection.
The function which would do it could be smart enough to stream the
current WAL file too up to the current offset and then wait for new
records. The slave would invoke the function for each WAL file it needs
to transfer, and then when finished it would update it's subscription
status and continue with the next one. The streaming function should not
update the subscription status as this way the slave can ask for the
file again if something goes wrong with the transfer.

The third thing needed to create a facility for one-connection-standby
building is to be able to stream the OS files of the DB through a DB
connection - I guess that can be done with a relatively simple C
function...

With all these things in place, a program could be written which would
run on the standby machine and completely automatically set up the
standby, only needing a simple connection string to the master...

Cheers,
Csaba.



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Bruce Momjian

OK, makes sense.  That is much more sophisticated that I imagined.

---

Csaba Nagy wrote:
  OK, offset added to TODO item.  What would the offset give us?
 
 The last offset could be remembered by the external program, and it only
 has to transfer from the last offset to the new one. It allows
 incremental streaming of the WAL files... of course the external program
 will be a lot more complex than the current shell scripts which can be
 used for WAL archiving...
 
 The problems I see with this is if in this case the normal postgres WAL
 archiving won't conflict with this streaming ? And if yes, wouldn't it
 be better to have a separate mechanism for the stream based archiving ?
 I mean what will happen if postgres successfully archives the WAL file
 and then moves it out of way before the streaming process finishes with
 it, the streaming process will have a problem... 
 
 A few months ago I spent some time thinking about a solution where a WAL
 based standby could be built using only normal data base connections to
 the master server, and one of the ideas was to create a WAL subscription
 mechanism where the standby subscribes for getting WAL files, and
 updates it's subscription status with the last processed WAL file after
 each processed file. The master can then recycle the WAL files only
 after they were confirmed by all current subscriptions... and to avoid
 excessive WAL file bloat if a slave goes offline, the subscription could
 be canceled automatically if it gets too much behind.
 
 If this mechanism is in place, it would be also nice if the slave could
 ask for the WAL records to be streamed on a normal data base connection.
 The function which would do it could be smart enough to stream the
 current WAL file too up to the current offset and then wait for new
 records. The slave would invoke the function for each WAL file it needs
 to transfer, and then when finished it would update it's subscription
 status and continue with the next one. The streaming function should not
 update the subscription status as this way the slave can ask for the
 file again if something goes wrong with the transfer.
 
 The third thing needed to create a facility for one-connection-standby
 building is to be able to stream the OS files of the DB through a DB
 connection - I guess that can be done with a relatively simple C
 function...
 
 With all these things in place, a program could be written which would
 run on the standby machine and completely automatically set up the
 standby, only needing a simple connection string to the master...
 
 Cheers,
 Csaba.
 

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Simon Riggs
On Tue, 2006-07-25 at 11:53 -0400, Tom Lane wrote:
 That's fine, but feature freeze is in a week and we don't even have
 the
 basic function for manually doing a log file switch.  Let's get that
 done first and then think about automatic switches.

Agreed.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Hannu Krosing
Ühel kenal päeval, T, 2006-07-25 kell 11:48, kirjutas Bruce Momjian:
 Hannu Krosing wrote:
  ?hel kenal p?eval, T, 2006-07-25 kell 11:27, kirjutas Bruce Momjian:
   Hannu Krosing wrote:
?hel kenal p?eval, T, 2006-07-25 kell 10:51, kirjutas Bruce Momjian:
 Where are we on these TODO items:
 

   o Add reporting of the current WAL file, perhaps as part of
 partial log file archiving

It would be nice to have a function that tells both filename and offset
of current WAL file, so it would be possible to do live async streaming
of up-to-subsecond changes without too much overhead.
   
   OK, offset added to TODO item.  What would the offset give us?
  
  the offset returned by lseek() on the WAL file, that is the end of the
  part of the WAL file which has actually been written to.
 
 Sorry, I was actually asking what use the offset would be to a user.

There would be an external async process, which continuously polls the
offset and pushes everything written between the polls to slave site.

so when this process starts up it gets (file = wal1 and
offset=1) and it sends first 1 bytes to slave site, at next
rountd it gets (file = wal1 and offset=15000) and it sends bytes
10001-15000 to remote and so on. this way the slave has a lag no more
than the poll interval in usable WAL data.

-- 

Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com




---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Simon Riggs
On Tue, 2006-07-25 at 11:45 -0400, Tom Lane wrote:
 there are scenarios in which you don't need a WAL
 switch at the end of a backup. 

My mind's blank today, so forgive me that I cannot see what that might
be.

Assuming such a case, would it be possible to have two functions?

pg_stop_backup()
pg_stop_backup(boolean); --parameter says log switch or not

Most people use the existing parameter-less function,

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 On Tue, 2006-07-25 at 11:31 -0400, Tom Lane wrote:
 My point here is that forcing the current segment to archive is a
 function of whatever your continuous-archiving process is, and it's
 not necessarily tied to backups.  We should not prejudge when people
 want that fairly-expensive function to be invoked.

 The point is until that last WAL file is backed up, the whole backup is
 useless. It isn't good policy to have a backup's value be contingent on
 some future event.

You are assuming here that the continuous archiving process is identical
to the WAL part of the base-backup process.  If what you want is an
identifiable self-contained base backup then you copy off the WAL files
along with the tar dump; there's no need to force a switch of the
current WAL file before you copy it.

I don't disagree that in many scenarios the switch is needful.  What I'm
saying is that we should provide a separately accessible function for it.
PG's PITR support is basically designed as a toolkit that lets you build
a PITR solution, not as do-everything, one-size-fits-all monolithic
functionality, and I want to stay in that spirit.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Hannu Krosing
Ühel kenal päeval, T, 2006-07-25 kell 17:52, kirjutas Csaba Nagy:
  OK, offset added to TODO item.  What would the offset give us?
 
 The last offset could be remembered by the external program, and it
 only
 has to transfer from the last offset to the new one. It allows
 incremental streaming of the WAL files... of course the external
 program
 will be a lot more complex than the current shell scripts which can be
 used for WAL archiving...
 
 The problems I see with this is if in this case the normal postgres
 WAL
 archiving won't conflict with this streaming ? 

You are not forced to use it if your shell scripts do conflict.

What I envisioned, was that the current WAL archiving shell script would
just do some CRC check over the WAL's already shipped, or as we
currently use rsync to do the actual shipping this is what happens
automatically.

 And if yes, wouldn't it
 be better to have a separate mechanism for the stream based
 archiving ?

why separate ? I'm a great believer in doing the minimum useful change,
at least in systems used in production. We already have a working
solution for full file shipping, so why not just augment it with
streaming the currently-written-to file.

 I mean what will happen if postgres successfully archives the WAL file
 and then moves it out of way before the streaming process finishes
 with
 it, the streaming process will have a problem... 

This should not happen. your streaming process should be smart enought
to guarantee that.

-- 

Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com

NOTICE: This communication contains privileged or other confidential
information. If you have received it in error, please advise the sender
by reply email and immediately delete the message and any attachments
without copying or disclosing the contents.


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Simon Riggs
On Tue, 2006-07-25 at 11:57 -0400, Bruce Momjian wrote:
 Tom Lane wrote:
  Simon Riggs [EMAIL PROTECTED] writes:
   I was planning to add a new GUC
 archive_timeout (integer) = max # secs between log file switches
  
  That's fine, but feature freeze is in a week and we don't even have the
  basic function for manually doing a log file switch.  Let's get that
  done first and then think about automatic switches.
  
  Also, forcing another 16M of WAL out every few seconds is a pretty crude
  and inefficient way of making sure your hot standby server is up to
  date.  As Hannu noted, an archiving script can do better than that if it
  can access the current WAL file name and offset.  So we really want to
  get the function to return that info done as well.
 
 Agreed.  One concern I have is that we have waited for Simon to complete
 this for 1.5 years, and now with a week left he is still working on it
 (or starting on it).  

Since we cannot agree even now on what should be done, you'll forgive me
for not having completed it sooner, especially since you know more about
my schedule now than others.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Bruce Momjian
Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  On Tue, 2006-07-25 at 11:31 -0400, Tom Lane wrote:
  My point here is that forcing the current segment to archive is a
  function of whatever your continuous-archiving process is, and it's
  not necessarily tied to backups.  We should not prejudge when people
  want that fairly-expensive function to be invoked.
 
  The point is until that last WAL file is backed up, the whole backup is
  useless. It isn't good policy to have a backup's value be contingent on
  some future event.
 
 You are assuming here that the continuous archiving process is identical
 to the WAL part of the base-backup process.  If what you want is an
 identifiable self-contained base backup then you copy off the WAL files
 along with the tar dump; there's no need to force a switch of the
 current WAL file before you copy it.

If you are doing that, I think for consistency you would want a WAL file
that is completely archived, rather than pulling the current one while
it is being written to.

 I don't disagree that in many scenarios the switch is needful.  What I'm
 saying is that we should provide a separately accessible function for it.
 PG's PITR support is basically designed as a toolkit that lets you build
 a PITR solution, not as do-everything, one-size-fits-all monolithic
 functionality, and I want to stay in that spirit.

I don't think we want people wiring their own calculator.  Sure we can
give them wires and have them do it themselves, but if we can make it
easier for 99% of the cases (with little downside), we should do it.
PITR has become more of a toolkit only because the partial WAL file
writes were not completed in the original implementation.  PITR is hard
enough --- we need to make it easier if possible.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Csaba Nagy
  The problems I see with this is if in this case the normal postgres
  WAL
  archiving won't conflict with this streaming ? 
 
 You are not forced to use it if your shell scripts do conflict.
 
 What I envisioned, was that the current WAL archiving shell script would
 just do some CRC check over the WAL's already shipped, or as we
 currently use rsync to do the actual shipping this is what happens
 automatically.

Hmm, that sounds pretty smart... the archive process rsyncing over the
file which was previously streamed... I guess this will mean very little
overhead (in fact it only means the WAL archive to be read once more
than absolutely necessary, and the CPU power to compute the CRCs).

  And if yes, wouldn't it
  be better to have a separate mechanism for the stream based
  archiving ?
 
 why separate ? I'm a great believer in doing the minimum useful change,
 at least in systems used in production. We already have a working
 solution for full file shipping, so why not just augment it with
 streaming the currently-written-to file.

That's good so, I also have a working script, so I'm also not very
motivated to do anything more complicated... but 6 months ago I would
have been really glad to have a stand-alone program which I could
install along postgres on the slave, point it to the master, and get a
working WAL shipping based stand-by. Instead I spent a few days setting
up our standby scripts and testing it under load... and never being
certain it really works and it won't break exactly when I need it
most...

  I mean what will happen if postgres successfully archives the WAL file
  and then moves it out of way before the streaming process finishes
  with
  it, the streaming process will have a problem... 
 
 This should not happen. your streaming process should be smart enought
 to guarantee that.

OK, true, the streaming script should always stream only the current
file. If the last offset was from a previous WAL, it can be safely reset
to 0, and stream the new WAL from the beginning. So the streaming script
needs to remember the last WAL and offset, not just the offset.

Cheers,
Csaba.



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Bruce Momjian
Simon Riggs wrote:
 On Tue, 2006-07-25 at 11:57 -0400, Bruce Momjian wrote:
  Tom Lane wrote:
   Simon Riggs [EMAIL PROTECTED] writes:
I was planning to add a new GUC
archive_timeout (integer) = max # secs between log file switches
   
   That's fine, but feature freeze is in a week and we don't even have the
   basic function for manually doing a log file switch.  Let's get that
   done first and then think about automatic switches.
   
   Also, forcing another 16M of WAL out every few seconds is a pretty crude
   and inefficient way of making sure your hot standby server is up to
   date.  As Hannu noted, an archiving script can do better than that if it
   can access the current WAL file name and offset.  So we really want to
   get the function to return that info done as well.
  
  Agreed.  One concern I have is that we have waited for Simon to complete
  this for 1.5 years, and now with a week left he is still working on it
  (or starting on it).  
 
 Since we cannot agree even now on what should be done, you'll forgive me
 for not having completed it sooner, especially since you know more about
 my schedule now than others.

It is not a blame issue.  The issue is I would like this completed for
8.2 and I want to minimize the possibility it will not be done.

I think we do know what we want done.  It is just that we are not
certain of the user interface.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 Assuming such a case, would it be possible to have two functions?

 pg_stop_backup()
 pg_stop_backup(boolean); --parameter says log switch or not

Well, it seems everyone but me thinks that pg_stop_backup should
force a WAL switch, so I'll yield on that point.  But we still
need the separate function too, so that people can manually force
a WAL switch --- just the same as we still have a manual CHECKPOINT
command.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Bruce Momjian
Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  Assuming such a case, would it be possible to have two functions?
 
  pg_stop_backup()
  pg_stop_backup(boolean); --parameter says log switch or not
 
 Well, it seems everyone but me thinks that pg_stop_backup should
 force a WAL switch, so I'll yield on that point.  But we still
 need the separate function too, so that people can manually force
 a WAL switch --- just the same as we still have a manual CHECKPOINT
 command.

Agreed, with separate function too.  No sense in limiting the toolkit,
as you explained.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Albe Laurenz
Tom Lane wrote:
 The point is until that last WAL file is backed up, the whole backup
is
 useless. It isn't good policy to have a backup's value be contingent
on
 some future event.
 
 You are assuming here that the continuous archiving process is
identical
 to the WAL part of the base-backup process.  If what you want is an
 identifiable self-contained base backup then you copy off the WAL
files
 along with the tar dump; there's no need to force a switch of the
 current WAL file before you copy it.

I think you are right.

 I don't disagree that in many scenarios the switch is needful.  What
I'm
 saying is that we should provide a separately accessible function for
it.
 PG's PITR support is basically designed as a toolkit that lets you
build
 a PITR solution, not as do-everything, one-size-fits-all monolithic
 functionality, and I want to stay in that spirit.

I agree that it is enough to have a separate pg_finish_wal_segment().

Adding that in your backup script between pg_stop_backup() and tarring
of the archived WAL files would by a simple enough step.

Yours,
Laurenz Albe

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Forcing current WAL file to be archived

2006-07-25 Thread Stephen Frost
* Bruce Momjian ([EMAIL PROTECTED]) wrote:
 Tom Lane wrote:
  You are assuming here that the continuous archiving process is identical
  to the WAL part of the base-backup process.  If what you want is an
  identifiable self-contained base backup then you copy off the WAL files
  along with the tar dump; there's no need to force a switch of the
  current WAL file before you copy it.
 
 If you are doing that, I think for consistency you would want a WAL file
 that is completely archived, rather than pulling the current one while
 it is being written to.

I've never been terribly pleased with having to copy the current WAL
while it's being written to.  The setup we're using is basically:

On the source system:
pg_start_backup()
rsync
pg_stop_backup()
Figure out the current WAL and do a fake archive of it

On the backup server:
Grab the start/end WAL logs of the backup
Verify that all the WAL logs archived during the backup are available

It sounds like I'd be changing do a fake-archive of the current WAL to
call the archive_wal function.  In either case I worry some about a
possible race-condition or something going wrong which invalidates the
backup.

I think it would actually be really nice to have a 'verify_backup' tool
which could be non-interactively run against a backup to check that the
backup was successful.  The one we hacked up really just checks that
there are files available with the right names.  Something more
substantial than that (but without affecting the actual backup) would be
really nice since it would improve confidence that the backup really can
be restored from.

Thanks,

Stephen


signature.asc
Description: Digital signature