On Thu, Sep 30, 2010 at 15:45, Kevin Grittner <kevin.gritt...@wicourts.gov> wrote:
> Magnus Hagander <mag...@hagander.net> wrote:
>
>>> If you could keep the development "friendly" to such features, I
>>> may get around to adding them to support our needs....
>>
>> Would it be enough to have kind of an "archive_command" switch
>> that says "whenever you've finished a complete wal segment, run
>> this command on it"?
>
> That would allow some nice options.  I've been thinking about what
> would be the ideal use of this with our backup scheme, and the best
> I've thought up would be that each WAL segment file would be a
> single output stream, with the option of calling an executable
> (which could be a script) with the target file name and then piping
> the stream to it.  At 16MB or a forced xlog switch, it would close
> the stream and call the executable again with a new file name.  You
> could have a default executable for the default behavior, or just
> build in a default if no executable is specified.
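[Editorial sketch: the per-segment hook Kevin describes could look something like the shell function below. The receiver would invoke the executable with the target segment file name as its argument and pipe the segment's bytes to its stdin. The function name `wal_sink` and the `STAGING` path are illustrative assumptions, not anything pg_streamrecv defines; Kevin's actual pipeline would insert pg_clearxlogtail before gzip, which is omitted here to keep the sketch dependency-free.]

```shell
# Hypothetical sink for one WAL segment stream. In the proposal it would
# be a standalone executable; a function is used here so the sketch is
# self-contained. $1 is the segment file name the receiver passes in,
# and the segment's bytes arrive on stdin.
wal_sink() {
    : "${STAGING:=/var/backup/wal-staging}"   # assumed staging path
    mkdir -p "$STAGING"
    # Kevin's version: pg_clearxlogtail | gzip -c
    # (pg_clearxlogtail zeroes the unused tail of a switched-out segment
    # so it compresses well); plain gzip stands in for it here.
    gzip -c > "$STAGING/$1.gz"
}
```

[A periodic rsync of `$STAGING` would then ship the compressed segments incrementally, as in Kevin's existing scheme.]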
The problem with that one (which I'm sure is solvable somehow) is how
to deal with restarts: both restarts in the middle of a segment (which
happens all the time if you don't have an archive_timeout set), and
restarts between segments. How would the tool know where to begin
streaming again? Right now, it looks at the files, but with your
suggestion there are no files to look at. We'd need a second
script/command to call to figure out where to restart from in that
case, no?

> The reason I like this is that I could pipe the stream through
> pg_clearxlogtail and gzip pretty much "as is" to the locations on
> the database server currently used for rsync to the two targets,
> and the rsync commands would send the incremental changes once per
> minute to both targets.  I haven't thought of another solution
> which provides incremental transmission of the WAL to the local
> backup location, which would be a nice thing to have, since this is
> most crucial when the WAN is down and not only is WAL data not
> coming back to our central location, but our application framework
> based replication stream isn't making it back, either.

It should be safe to just rsync the archive directory as it's being
written by pg_streamrecv. Doesn't that give you the property you're
looking for: the local machine gets the data streamed in live, and
the remote machine gets it rsynced every minute?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
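[Editorial sketch: the "looks at the files" restart logic Magnus refers to amounts to scanning the archive directory for the highest complete segment name and resuming from the next one. The function below is illustrative, not pg_streamrecv's actual code; note the naive increment ignores the historical rule that segment number 0xFF is never created in 9.x-era WAL naming.]

```shell
# Sketch: decide where to resume streaming by inspecting archived files.
# WAL segment names are 24 upper-case hex digits: 8 for the timeline ID,
# then 16 for the log/segment position.
next_segment() {
    dir=$1
    last=$(ls "$dir" | grep -E '^[0-9A-F]{24}$' | sort | tail -n 1)
    [ -n "$last" ] || return 1        # empty archive: start from scratch
    tli=${last%????????????????}      # leading 8 hex digits (timeline)
    pos=${last#????????}              # trailing 16 hex digits (log+seg)
    # Treat log+seg as one 64-bit counter and advance it. (A faithful
    # version would also skip segment number 0xFF.)
    printf '%s%016X\n' "$tli" $(( 0x$pos + 1 ))
}
```

[This also shows why the pipe-based proposal needs a second command: file inspection can only restart at a segment boundary, and with no files left behind there is nothing to inspect at all.]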