Re: [HACKERS] Hot standby, recovery infra

2009-02-05 Thread Heikki Linnakangas
Simon Riggs wrote: On Thu, 2009-02-05 at 11:46 +0200, Heikki Linnakangas wrote: Simon Riggs wrote: So we might end up flushing more often *and* we will be doing it potentially in the code path of other users. For example, imagine a database that fits completely in shared buffers. If we

Re: [HACKERS] Hot standby, recovery infra

2009-02-05 Thread Simon Riggs
On Thu, 2009-02-05 at 13:18 +0200, Heikki Linnakangas wrote: Simon Riggs wrote: On Thu, 2009-02-05 at 11:46 +0200, Heikki Linnakangas wrote: Simon Riggs wrote: So we might end up flushing more often *and* we will be doing it potentially in the code path of other users. For example,

Re: [HACKERS] Hot standby, recovery infra

2009-02-05 Thread Heikki Linnakangas
Simon Riggs wrote: On Thu, 2009-02-05 at 13:18 +0200, Heikki Linnakangas wrote: Simon Riggs wrote: On Thu, 2009-02-05 at 11:46 +0200, Heikki Linnakangas wrote: Simon Riggs wrote: So we might end up flushing more often *and* we will be doing it potentially in the code path of other users. For

Re: [HACKERS] Hot standby, recovery infra

2009-02-05 Thread Simon Riggs
On Thu, 2009-02-05 at 14:18 +0200, Heikki Linnakangas wrote: when the control file is updated in XLogFlush, it's typically the bgwriter doing it as it cleans buffers ahead of the clock hand, not the startup process. That is the key point. Let's do it your way. -- Simon Riggs

Re: [HACKERS] Hot standby, recovery infra

2009-02-05 Thread Simon Riggs
On Thu, 2009-02-05 at 21:54 +0200, Heikki Linnakangas wrote: - If bgwriter is performing a restartpoint when recovery ends, the startup checkpoint will be queued up behind the restartpoint. And since it uses the same smoothing logic as checkpoints, it can take quite some time for that to

Re: [HACKERS] Hot standby, recovery infra

2009-02-04 Thread Heikki Linnakangas
Fujii Masao wrote: On Fri, Jan 30, 2009 at 11:55 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: The startup process now catches SIGTERM, and calls proc_exit() at the next WAL record. That's what will happen in a fast shutdown. Unexpected death of the startup process is

Re: [HACKERS] Hot standby, recovery infra

2009-02-04 Thread Heikki Linnakangas
Simon Riggs wrote: * I think we are now renaming the recovery.conf file too early. The comment says We have already restored all the WAL segments we need from the archive, and we trust that they are not going to go away even if we crash. We have, but the files overwrite each other as they

Re: [HACKERS] Hot standby, recovery infra

2009-02-04 Thread Simon Riggs
On Wed, 2009-02-04 at 19:03 +0200, Heikki Linnakangas wrote: Simon Riggs wrote: * I think we are now renaming the recovery.conf file too early. The comment says We have already restored all the WAL segments we need from the archive, and we trust that they are not going to go away even if

Re: [HACKERS] Hot standby, recovery infra

2009-02-04 Thread Fujii Masao
Hi, On Wed, Feb 4, 2009 at 8:35 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Yes, and in fact I ran into it myself yesterday while testing. It seems that we should reset FatalError earlier, ie. when the recovery starts and bgwriter is launched. I'm not sure why in CVS

Re: [HACKERS] Hot standby, recovery infra

2009-02-04 Thread Tom Lane
Fujii Masao masao.fu...@gmail.com writes: On Wed, Feb 4, 2009 at 8:35 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: ... I'm not sure why in CVS HEAD we don't reset FatalError until after the startup process is finished. Which may repeat the recovery crash and

Re: [HACKERS] Hot standby, recovery infra

2009-02-04 Thread Heikki Linnakangas
Tom Lane wrote: Fujii Masao masao.fu...@gmail.com writes: On Wed, Feb 4, 2009 at 8:35 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: ... I'm not sure why in CVS HEAD we don't reset FatalError until after the startup process is finished. Which may repeat the recovery

Re: [HACKERS] Hot standby, recovery infra

2009-02-04 Thread Heikki Linnakangas
Simon Riggs wrote: We could avoid that by performing a good old startup checkpoint, but I quite like the fast failover time we get without it. ISTM it's either slow failover or (fast failover, but restart archive recovery if crashes). I would suggest that at end of recovery we write the last

Re: [HACKERS] Hot standby, recovery infra

2009-02-04 Thread Simon Riggs
On Thu, 2009-02-05 at 09:28 +0200, Heikki Linnakangas wrote: I've changed the way minRecoveryPoint is updated now anyway, so it no longer happens every XLogFileRead(). Care to elucidate? I got rid of minSafeStartPoint, advancing minRecoveryPoint instead. And it's advanced in

Re: [HACKERS] Hot standby, recovery infra

2009-02-04 Thread Simon Riggs
On Thu, 2009-02-05 at 09:28 +0200, Heikki Linnakangas wrote: Simon Riggs wrote: We could avoid that by performing a good old startup checkpoint, but I quite like the fast failover time we get without it. ISTM it's either slow failover or (fast failover, but restart archive recovery

Re: [HACKERS] Hot Standby (v9d)

2009-02-03 Thread Andrew Dunstan
Hannu Krosing wrote: Actually we came up with a solution to this - use filesystem level snapshots (like LVM2+XFS or ZFS), and redirect backends with long-running queries to use fs snapshot mounted to a different mountpoint. I don't think Simon has yet put full support for it in code, but it

Re: [HACKERS] Hot Standby (v9d)

2009-02-03 Thread Gregory Stark
Hannu Krosing ha...@krosing.net writes: Actually we came up with a solution to this - use filesystem level snapshots (like LVM2+XFS or ZFS), and redirect backends with long-running queries to use fs snapshot mounted to a different mountpoint. Uhm, how do you determine which snapshot to

Re: [HACKERS] Hot Standby (v9d)

2009-02-03 Thread Robert Haas
I don't see any way around the fact that when a tuple is removed, it's gone and can't be accessed by queries. Either you don't remove it, or you kill the query. Actually we came up with a solution to this - use filesystem level snapshots (like LVM2+XFS or ZFS), and redirect backends with

Re: [HACKERS] Hot Standby (v9d)

2009-02-03 Thread Hannu Krosing
On Tue, 2009-02-03 at 08:40 -0500, Andrew Dunstan wrote: Hannu Krosing wrote: Actually we came up with a solution to this - use filesystem level snapshots (like LVM2+XFS or ZFS), and redirect backends with long-running queries to use fs snapshot mounted to a different mountpoint. I

Re: [HACKERS] Hot Standby (v9d)

2009-02-03 Thread Hannu Krosing
On Tue, 2009-02-03 at 09:14 -0500, Robert Haas wrote: I don't see any way around the fact that when a tuple is removed, it's gone and can't be accessed by queries. Either you don't remove it, or you kill the query. Actually we came up with a solution to this - use filesystem level

Re: [HACKERS] Hot Standby (v9d)

2009-02-03 Thread Simon Riggs
On Tue, 2009-02-03 at 09:14 -0500, Robert Haas wrote: I think _the_ solution is to notice when you're about to vacuum a page that is still visible to a running backend on the standby, and save that page off to a separate cache of old page versions (perhaps using the relation fork mechanism).

Re: [HACKERS] Hot Standby (v9d)

2009-02-03 Thread Hannu Krosing
On Tue, 2009-02-03 at 13:50 +0000, Gregory Stark wrote: Hannu Krosing ha...@krosing.net writes: Actually we came up with a solution to this - use filesystem level snapshots (like LVM2+XFS or ZFS), and redirect backends with long-running queries to use fs snapshot mounted to a different

Re: [HACKERS] Hot Standby (v9d)

2009-02-03 Thread Simon Riggs
On Tue, 2009-02-03 at 08:40 -0500, Andrew Dunstan wrote: Hannu Krosing wrote: Actually we came up with a solution to this - use filesystem level snapshots (like LVM2+XFS or ZFS), and redirect backends with long-running queries to use fs snapshot mounted to a different mountpoint. I

Re: [HACKERS] Hot Standby (v9d)

2009-02-03 Thread Robert Haas
On Tue, Feb 3, 2009 at 9:40 AM, Simon Riggs si...@2ndquadrant.com wrote: On Tue, 2009-02-03 at 09:14 -0500, Robert Haas wrote: I think _the_ solution is to notice when you're about to vacuum a page that is still visible to a running backend on the standby, and save that page off to a separate

Re: [HACKERS] Hot Standby (v9d)

2009-02-03 Thread Simon Riggs
On Tue, 2009-02-03 at 15:55 +0100, Andres Freund wrote: Hi, On 02/03/2009 02:26 PM, Hannu Krosing wrote: I don't see any way around the fact that when a tuple is removed, it's gone and can't be accessed by queries. Either you don't remove it, or you kill the query. Actually we came up

Re: [HACKERS] Hot Standby (v9d)

2009-02-03 Thread Hannu Krosing
On Tue, 2009-02-03 at 10:19 -0500, Robert Haas wrote: On Tue, Feb 3, 2009 at 9:40 AM, Simon Riggs si...@2ndquadrant.com wrote: On Tue, 2009-02-03 at 09:14 -0500, Robert Haas wrote: I think _the_ solution is to notice when you're about to vacuum a page that is still visible to a running

Re: [HACKERS] Hot Standby (v9d)

2009-02-03 Thread Hannu Krosing
On Tue, 2009-02-03 at 14:28 +0000, Simon Riggs wrote: On Tue, 2009-02-03 at 08:40 -0500, Andrew Dunstan wrote: Hannu Krosing wrote: Actually we came up with a solution to this - use filesystem level snapshots (like LVM2+XFS or ZFS), and redirect backends with long-running queries to

Re: [HACKERS] Hot Standby (v9d)

2009-02-03 Thread Andres Freund
Hi, On 02/03/2009 02:26 PM, Hannu Krosing wrote: I don't see any way around the fact that when a tuple is removed, it's gone and can't be accessed by queries. Either you don't remove it, or you kill the query. Actually we came up with a solution to this - use filesystem level snapshots (like

Re: [HACKERS] Hot Standby (v9d)

2009-02-03 Thread Simon Riggs
On Tue, 2009-02-03 at 18:09 +0200, Hannu Krosing wrote: On Tue, 2009-02-03 at 14:28 +0000, Simon Riggs wrote: On Tue, 2009-02-03 at 08:40 -0500, Andrew Dunstan wrote: Hannu Krosing wrote: Actually we came up with a solution to this - use filesystem level snapshots (like LVM2+XFS

Re: [HACKERS] Hot Standby (v9d)

2009-02-03 Thread Hannu Krosing
On Wed, 2009-01-28 at 22:19 +0200, Heikki Linnakangas wrote: Tom Lane wrote: ... Well, those unexpectedly cancelled queries could have represented critical functionality too. I think this argument calls the entire approach into question. If there is no safe setting for the parameter

Re: [HACKERS] Hot standby, recovery infra

2009-02-03 Thread Fujii Masao
Hi, On Fri, Jan 30, 2009 at 11:55 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: The startup process now catches SIGTERM, and calls proc_exit() at the next WAL record. That's what will happen in a fast shutdown. Unexpected death of the startup process is treated the same as

Re: [HACKERS] Hot standby, recovery infra

2009-02-01 Thread Simon Riggs
On Sat, 2009-01-31 at 22:32 +0200, Heikki Linnakangas wrote: If you poison your WAL archive with a XLOG_CRASH_RECOVERY record, recovery will never be able to proceed over that point. There would have to be a switch to ignore those records, at the very least. Definitely in assert mode only.

Re: [HACKERS] Hot standby, recovery infra

2009-02-01 Thread Simon Riggs
On Sat, 2009-01-31 at 22:41 +0200, Heikki Linnakangas wrote: I like this way because it means we might in the future get Startup process to perform post-recovery actions also. Yeah, it does. Do you have something in mind already? Yes, but nothing that needs to be discussed yet. --

Re: [HACKERS] Hot standby, recovery infra

2009-01-31 Thread Simon Riggs
On Fri, 2009-01-30 at 13:15 +0200, Heikki Linnakangas wrote: Simon Riggs wrote: I'm thinking to add a new function that will allow crash testing easier. pg_crash_standby() will issue a new xlog record, XLOG_CRASH_STANDBY, which when replayed will just throw a FATAL error and crash

Re: [HACKERS] Hot standby, recovery infra

2009-01-31 Thread Simon Riggs
On Fri, 2009-01-30 at 13:25 +0200, Heikki Linnakangas wrote: That whole area was something I was leaving until last, since immediate shutdown doesn't work either, even in HEAD. (Fujii-san and I discussed this before Christmas, briefly). We must handle shutdown gracefully, can't just

Re: [HACKERS] Hot standby, recovery infra

2009-01-31 Thread Simon Riggs
On Fri, 2009-01-30 at 16:55 +0200, Heikki Linnakangas wrote: Ok, here's an attempt to make shutdown work gracefully. Startup process now signals postmaster three times during startup: first when it has done all the initialization, and starts redo. At that point, postmaster launches

Re: [HACKERS] Hot standby, recovery infra

2009-01-31 Thread Heikki Linnakangas
Simon Riggs wrote: On Fri, 2009-01-30 at 13:15 +0200, Heikki Linnakangas wrote: Simon Riggs wrote: I'm thinking to add a new function that will allow crash testing easier. pg_crash_standby() will issue a new xlog record, XLOG_CRASH_STANDBY, which when replayed will just throw a FATAL error

Re: [HACKERS] Hot standby, recovery infra

2009-01-31 Thread Heikki Linnakangas
Simon Riggs wrote: On Fri, 2009-01-30 at 16:55 +0200, Heikki Linnakangas wrote: Ok, here's an attempt to make shutdown work gracefully. Startup process now signals postmaster three times during startup: first when it has done all the initialization, and starts redo. At that point, postmaster

Re: [HACKERS] Hot standby, recovery infra

2009-01-30 Thread Heikki Linnakangas
I just realized that the new minSafeStartPoint is actually exactly the same concept as the existing minRecoveryPoint. As the recovery progresses, we could advance minRecoveryPoint just as well as the new minSafeStartPoint. Perhaps it's a good idea to keep them separate anyway though, the

Re: [HACKERS] Hot standby, recovery infra

2009-01-30 Thread Simon Riggs
On Thu, 2009-01-29 at 20:35 +0200, Heikki Linnakangas wrote: Hmm, another point of consideration is how this interacts with the pause/continue. In particular, it was suggested earlier that you could put an option into recovery.conf to start in paused mode. If you pause recovery, and then

Re: [HACKERS] Hot standby, recovery infra

2009-01-30 Thread Simon Riggs
On Thu, 2009-01-29 at 19:20 +0200, Heikki Linnakangas wrote: Heikki Linnakangas wrote: It looks like if you issue a fast shutdown during recovery, postmaster doesn't kill bgwriter. Hmm, seems like we haven't thought through how shutdown during consistent recovery is supposed to behave

Re: [HACKERS] Hot standby, recovery infra

2009-01-30 Thread Simon Riggs
On Fri, 2009-01-30 at 11:33 +0200, Heikki Linnakangas wrote: I just realized that the new minSafeStartPoint is actually exactly the same concept as the existing minRecoveryPoint. As the recovery progresses, we could advance minRecoveryPoint just as well as the new minSafeStartPoint.

Re: [HACKERS] Hot standby, recovery infra

2009-01-30 Thread Simon Riggs
On Thu, 2009-01-29 at 14:21 +0200, Heikki Linnakangas wrote: It looks like if you issue a fast shutdown during recovery, postmaster doesn't kill bgwriter. Thanks for the report. I'm thinking to add a new function that will allow crash testing easier. pg_crash_standby() will issue a new xlog

Re: [HACKERS] Hot standby, recovery infra

2009-01-30 Thread Heikki Linnakangas
Simon Riggs wrote: I'm thinking to add a new function that will allow crash testing easier. pg_crash_standby() will issue a new xlog record, XLOG_CRASH_STANDBY, which when replayed will just throw a FATAL error and crash Startup process. We won't be adding that to the user docs... This will

Re: [HACKERS] Hot standby, recovery infra

2009-01-30 Thread Heikki Linnakangas
Simon Riggs wrote: On Thu, 2009-01-29 at 19:20 +0200, Heikki Linnakangas wrote: Hmm, seems like we haven't thought through how shutdown during consistent recovery is supposed to behave in general. Right now, smart shutdown doesn't do anything during consistent recovery, because the startup

Re: [HACKERS] Hot standby, recovery infra

2009-01-29 Thread Simon Riggs
On Thu, 2009-01-29 at 09:34 +0200, Heikki Linnakangas wrote: It does *during recovery*, before InitXLogAccess is called. Yeah, it's harmless currently. It would be pretty hard to keep it up-to-date in bgwriter and other processes. I think it's better to keep it at 0, which is clearly an

Re: [HACKERS] Hot standby, recovery infra

2009-01-29 Thread Heikki Linnakangas
Simon Riggs wrote: On Thu, 2009-01-29 at 10:36 +0900, Fujii Masao wrote: Hi, On Wed, Jan 28, 2009 at 11:19 PM, Fujii Masao masao.fu...@gmail.com wrote: I feel quite good about this patch now. Given the amount of code churn, it requires testing, and I'll read it through one more time after

Re: [HACKERS] Hot standby, recovery infra

2009-01-29 Thread Simon Riggs
On Thu, 2009-01-29 at 11:20 +0200, Heikki Linnakangas wrote: Simon Riggs wrote: On Thu, 2009-01-29 at 10:36 +0900, Fujii Masao wrote: Hi, On Wed, Jan 28, 2009 at 11:19 PM, Fujii Masao masao.fu...@gmail.com wrote: I feel quite good about this patch now. Given the amount of code

Re: [HACKERS] Hot standby, recovery infra

2009-01-29 Thread Heikki Linnakangas
Simon Riggs wrote: My proposed fix for Fujii-san's minSafeStartPoint bug is to introduce another control file state DB_IN_ARCHIVE_RECOVERY_BASE. This would show that we are still recovering up to the point of the end of the base backup. Once we reach minSafeStartPoint we then switch state to

Re: [HACKERS] Hot standby, recovery infra

2009-01-29 Thread Heikki Linnakangas
It looks like if you issue a fast shutdown during recovery, postmaster doesn't kill bgwriter. ... LOG: restored log file 00010028 from archive LOG: restored log file 00010029 from archive LOG: consistent recovery state reached at 0/295C ... LOG: restored

Re: [HACKERS] Hot standby, recovery infra

2009-01-29 Thread Simon Riggs
On Thu, 2009-01-29 at 12:22 +0200, Heikki Linnakangas wrote: Simon Riggs wrote: My proposed fix for Fujii-san's minSafeStartPoint bug is to introduce another control file state DB_IN_ARCHIVE_RECOVERY_BASE. This would show that we are still recovering up to the point of the end of the base

Re: [HACKERS] Hot standby, recovery infra

2009-01-29 Thread Heikki Linnakangas
Simon Riggs wrote: On Thu, 2009-01-29 at 12:22 +0200, Heikki Linnakangas wrote: It comes from the fact that we set minSafeStartPoint beyond the actual end of WAL, if the last WAL segment is only partially filled (= fails CRC check at some point). If we crash after setting minSafeStartPoint

Re: [HACKERS] Hot standby, recovery infra

2009-01-29 Thread Simon Riggs
On Thu, 2009-01-29 at 15:31 +0200, Heikki Linnakangas wrote: Now when we restart the recovery, we will never reach minSafeStartPoint, which is now 0/400, and we'll fail with the error that Fujii-san pointed out. We're already way past the min recovery point of base backup by then. The

Re: [HACKERS] Hot standby, recovery infra

2009-01-29 Thread Heikki Linnakangas
Simon Riggs wrote: On Thu, 2009-01-29 at 15:31 +0200, Heikki Linnakangas wrote: Now when we restart the recovery, we will never reach minSafeStartPoint, which is now 0/400, and we'll fail with the error that Fujii-san pointed out. We're already way past the min recovery point of base

Re: [HACKERS] Hot standby, recovery infra

2009-01-29 Thread Heikki Linnakangas
Heikki Linnakangas wrote: It looks like if you issue a fast shutdown during recovery, postmaster doesn't kill bgwriter. Hmm, seems like we haven't thought through how shutdown during consistent recovery is supposed to behave in general. Right now, smart shutdown doesn't do anything during

Re: [HACKERS] Hot standby, recovery infra

2009-01-29 Thread Heikki Linnakangas
Heikki Linnakangas wrote: Simon Riggs wrote: On Thu, 2009-01-29 at 15:31 +0200, Heikki Linnakangas wrote: Now when we restart the recovery, we will never reach minSafeStartPoint, which is now 0/400, and we'll fail with the error that Fujii-san pointed out. We're already way past the min

Re: [HACKERS] Hot standby, recovery infra

2009-01-28 Thread Simon Riggs
On Wed, 2009-01-28 at 12:04 +0200, Heikki Linnakangas wrote: I've been reviewing and massaging the so called recovery infra patch. Thanks. I feel quite good about this patch now. Given the amount of code churn, it requires testing, and I'll read it through one more time after sleeping over

Re: [HACKERS] Hot standby, recovery infra

2009-01-28 Thread Fujii Masao
Hi, On Wed, Jan 28, 2009 at 7:04 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: I've been reviewing and massaging the so called recovery infra patch. Great! I feel quite good about this patch now. Given the amount of code churn, it requires testing, and I'll read it

Re: [HACKERS] Hot standby, recovery infra

2009-01-28 Thread Simon Riggs
On Wed, 2009-01-28 at 23:19 +0900, Fujii Masao wrote: @@ -355,6 +359,27 @@ BackgroundWriterMain(void) */ PG_SETMASK(UnBlockSig); + BgWriterRecoveryMode = IsRecoveryProcessingMode(); + + if (BgWriterRecoveryMode) + elog(DEBUG1, bgwriter starting during

Re: [HACKERS] Hot standby, recovery infra

2009-01-28 Thread Fujii Masao
Hi, On Wed, Jan 28, 2009 at 11:47 PM, Simon Riggs si...@2ndquadrant.com wrote: On Wed, 2009-01-28 at 23:19 +0900, Fujii Masao wrote: @@ -355,6 +359,27 @@ BackgroundWriterMain(void) */ PG_SETMASK(UnBlockSig); + BgWriterRecoveryMode = IsRecoveryProcessingMode(); + + if

Re: [HACKERS] Hot standby, recovery infra

2009-01-28 Thread Simon Riggs
On Wed, 2009-01-28 at 23:54 +0900, Fujii Masao wrote: Why is InitXLOGAccess() called also here when bgwriter is started after recovery? That is already called by AuxiliaryProcessMain(). InitXLOGAccess() sets the timeline and also gets the latest record pointer. If the bgwriter is

Re: [HACKERS] Hot standby, recovery infra

2009-01-28 Thread Heikki Linnakangas
Fujii Masao wrote: On Wed, Jan 28, 2009 at 7:04 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: I feel quite good about this patch now. Given the amount of code churn, it requires testing, and I'll read it through one more time after sleeping over it. Simon, do you see

Re: [HACKERS] Hot Standby (v9d)

2009-01-28 Thread Gregory Stark
I skimmed through the Hot Standby patch for a preliminary review. I noted the following things, some minor tweaks, some just questions. None of the things I noted are big issues unless some of the questions uncover issues. 1) This code is obviously a cut-pasto: + else if (strcmp(tok1,

Re: [HACKERS] Hot Standby (v9d)

2009-01-28 Thread Simon Riggs
On Wed, 2009-01-28 at 18:55 +0000, Gregory Stark wrote: I skimmed through the Hot Standby patch for a preliminary review. I noted the following things, some minor tweaks, some just questions. None of the things I noted are big issues unless some of the questions uncover issues. Thanks for

Re: [HACKERS] Hot Standby (v9d)

2009-01-28 Thread Joshua D. Drake
On Wed, 2009-01-28 at 19:27 +0000, Simon Riggs wrote: On Wed, 2009-01-28 at 18:55 +0000, Gregory Stark wrote: Agreed. As explained when I published that patch it is deliberately severe to allow testing of conflict resolution and feedback on it. I still *strongly* feel the default has to be

Re: [HACKERS] Hot Standby (v9d)

2009-01-28 Thread Heikki Linnakangas
Gregory Stark wrote: 6) I still don't understand why you need unobserved_xids. We don't need this in normal running, an xid we don't know for certain is committed is exactly the same as a transaction we know is currently running or aborted. So why do you need it during HS? In normal operation,

Re: [HACKERS] Hot Standby (v9d)

2009-01-28 Thread Simon Riggs
On Wed, 2009-01-28 at 18:55 +0000, Gregory Stark wrote: I still don't understand why you need unobserved_xids. We don't need this in normal running, an xid we don't know for certain is committed is exactly the same as a transaction we know is currently running or aborted. So why do you need

Re: [HACKERS] Hot Standby (v9d)

2009-01-28 Thread Tom Lane
Joshua D. Drake j...@commandprompt.com writes: On Wed, 2009-01-28 at 19:27 +0000, Simon Riggs wrote: On Wed, 2009-01-28 at 18:55 +0000, Gregory Stark wrote: I still *strongly* feel the default has to be the non-destructive conservative -1. I don't. Primarily, we must support high

Re: [HACKERS] Hot Standby (v9d)

2009-01-28 Thread Aidan Van Dyk
* Tom Lane t...@sss.pgh.pa.us [090128 15:02]: Well, those unexpectedly cancelled queries could have represented critical functionality too. I think this argument calls the entire approach into question. If there is no safe setting for the parameter then we need to find a way to not have

Re: [HACKERS] Hot Standby (v9d)

2009-01-28 Thread Greg Stark
(sorry for top posting -- blame apple) I don't see anything dangerous with either setting. For use cases where the backup is the primary purpose then killing queries is fine. For use cases where the machine is a reporting machine then saving large amounts of archived logs is fine.

Re: [HACKERS] Hot Standby (v9d)

2009-01-28 Thread Greg Stark
Put another way: your characterization is no more true than claiming there's no safe setting for statement_timeout since a large value means clog could overflow your disk and your tables could bloat. (I note we default statement_timeout to off though) -- Greg On 28 Jan 2009, at 19:56, Tom

Re: [HACKERS] Hot Standby (v9d)

2009-01-28 Thread Simon Riggs
On Wed, 2009-01-28 at 21:41 +0200, Heikki Linnakangas wrote: So, you can think of the unobserved xids array as an extension of ProcArray. The entries are like light-weight PGPROC entries. In fact I proposed earlier to simply create dummy PGPROC entries instead. Which we don't do because we

Re: [HACKERS] Hot Standby (v9d)

2009-01-28 Thread Jeff Davis
On Wed, 2009-01-28 at 19:27 +0000, Simon Riggs wrote: my failover was 12 hours behind when I needed it to be 10 seconds behind and I lost a $1 million because of downtime of Postgres The same could be said for warm standby right now. Or Slony-I, for that matter. I think that we can reasonably

Re: [HACKERS] Hot Standby (v9d)

2009-01-28 Thread Heikki Linnakangas
Tom Lane wrote: Joshua D. Drake j...@commandprompt.com writes: On Wed, 2009-01-28 at 19:27 +0000, Simon Riggs wrote: On Wed, 2009-01-28 at 18:55 +0000, Gregory Stark wrote: I still *strongly* feel the default has to be the non-destructive conservative -1. I don't. Primarily, we must support

Re: [HACKERS] Hot Standby (v9d)

2009-01-28 Thread Simon Riggs
On Wed, 2009-01-28 at 11:33 -0800, Joshua D. Drake wrote: On Wed, 2009-01-28 at 19:27 +0000, Simon Riggs wrote: On Wed, 2009-01-28 at 18:55 +0000, Gregory Stark wrote: Agreed. As explained when I published that patch it is deliberately severe to allow testing of conflict resolution and

Re: [HACKERS] Hot Standby (v9d)

2009-01-28 Thread Simon Riggs
On Wed, 2009-01-28 at 14:56 -0500, Tom Lane wrote: Well, those unexpectedly cancelled queries could have represented critical functionality too. I think this argument calls the entire approach into question. If there is no safe setting for the parameter then we need to find a way to not

Re: [HACKERS] Hot Standby (v9d)

2009-01-28 Thread Heikki Linnakangas
Simon Riggs wrote: The essential choice is What would you like the max failover time to be?. Some users want one server with max 5 mins behind, some want two servers, one with 0 seconds behind, one with 12 hours behind It's not quite that simple. Setting max_standby_delay=5mins means that

Re: [HACKERS] Hot Standby (v9d)

2009-01-28 Thread Simon Riggs
On Wed, 2009-01-28 at 22:47 +0200, Heikki Linnakangas wrote: Simon Riggs wrote: The essential choice is What would you like the max failover time to be?. Some users want one server with max 5 mins behind, some want two servers, one with 0 seconds behind, one with 12 hours behind It's

Re: [HACKERS] Hot Standby (v9d)

2009-01-28 Thread Jeff Davis
On Wed, 2009-01-28 at 22:47 +0200, Heikki Linnakangas wrote: It's not quite that simple. Setting max_standby_delay=5mins means that you're willing to wait 5 minutes for each query to die. Which means that in worst case you have to stop for 5 minutes at every single vacuum record, and fall

Re: [HACKERS] Hot Standby (v9d)

2009-01-28 Thread Robert Haas
I don't. Primarily, we must support high availability. It is much better if we get people saying I get my queries cancelled and we say RTFM and change parameter X, than if people say my failover was 12 hours behind when I needed it to be 10 seconds behind and I lost a $1 million because of

Re: [HACKERS] Hot Standby (v9d)

2009-01-28 Thread Gregory Stark
Simon Riggs si...@2ndquadrant.com writes: On Wed, 2009-01-28 at 14:56 -0500, Tom Lane wrote: Well, those unexpectedly cancelled queries could have represented critical functionality too. I think this argument calls the entire approach into question. If there is no safe setting for the

Re: [HACKERS] Hot Standby (v9d)

2009-01-28 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: I vote with Simon. The thing is that if you get some queries cancelled, you'll realize you have a problem. ... or if you don't, they couldn't have been all that critical. Having your failover be 12 hours behind (or 12 months behind) is something

Re: [HACKERS] Hot standby, recovery infra

2009-01-28 Thread Fujii Masao
Hi, On Wed, Jan 28, 2009 at 11:19 PM, Fujii Masao masao.fu...@gmail.com wrote: I feel quite good about this patch now. Given the amount of code churn, it requires testing, and I'll read it through one more time after sleeping over it. Simon, do you see anything wrong with this? I also read

Re: [HACKERS] Hot standby, recovery infra

2009-01-28 Thread Fujii Masao
Hi, On Wed, Jan 28, 2009 at 11:19 PM, Fujii Masao masao.fu...@gmail.com wrote: I feel quite good about this patch now. Given the amount of code churn, it requires testing, and I'll read it through one more time after sleeping over it. Simon, do you see anything wrong with this? I also read

Re: [HACKERS] Hot standby, recovery infra

2009-01-28 Thread Simon Riggs
On Thu, 2009-01-29 at 12:18 +0900, Fujii Masao wrote: Hi, On Wed, Jan 28, 2009 at 11:19 PM, Fujii Masao masao.fu...@gmail.com wrote: I feel quite good about this patch now. Given the amount of code churn, it requires testing, and I'll read it through one more time after sleeping over

Re: [HACKERS] Hot standby, recovery infra

2009-01-28 Thread Simon Riggs
On Thu, 2009-01-29 at 10:36 +0900, Fujii Masao wrote: Hi, On Wed, Jan 28, 2009 at 11:19 PM, Fujii Masao masao.fu...@gmail.com wrote: I feel quite good about this patch now. Given the amount of code churn, it requires testing, and I'll read it through one more time after sleeping over

Re: [HACKERS] Hot standby, recovery infra

2009-01-28 Thread Heikki Linnakangas
Simon Riggs wrote: On Thu, 2009-01-29 at 12:18 +0900, Fujii Masao wrote: Though this is a matter of taste, I think that it's weird that bgwriter runs with ThisTimeLineID = 0 during recovery. This is because XLogCtl->ThisTimeLineID is set at the end of recovery. ISTM this will be a cause of bug

Re: [HACKERS] Hot standby, recovery infrastructure

2009-01-27 Thread Simon Riggs
On Tue, 2009-01-27 at 15:59 +0200, Heikki Linnakangas wrote: Regarding this comment: + /* +* Prior to 8.4 we wrote a Shutdown Checkpoint at the end of recovery. +* This could add minutes to the startup time, so we want bgwriter +* to perform it. This then frees the

Re: [HACKERS] Hot standby, recovery infrastructure

2009-01-27 Thread Heikki Linnakangas
Simon Riggs wrote: On Tue, 2009-01-27 at 15:59 +0200, Heikki Linnakangas wrote: Regarding this comment: + /* +* Prior to 8.4 we wrote a Shutdown Checkpoint at the end of recovery. +* This could add minutes to the startup time, so we want bgwriter +* to perform it. This then

Re: [HACKERS] Hot standby, recovery infrastructure

2009-01-27 Thread Simon Riggs
On Tue, 2009-01-27 at 17:50 +0200, Heikki Linnakangas wrote: Hmm, I think we have small issue if the last WAL segment restored from the archive is an incomplete one: All normal archive recoveries have complete WAL files, since an xlog switch record jumps past missing entries at the end of the

Re: [HACKERS] Hot standby, recovery infrastructure

2009-01-27 Thread Simon Riggs
On Tue, 2009-01-27 at 17:50 +0200, Heikki Linnakangas wrote: Just think standard-online-checkpoint and it all fits. Exactly that made me wonder why the first checkpoint needs to be any different. Not really following you to be honest, assuming this was a separate point to the other part

Re: [HACKERS] Hot standby, recovery infrastructure

2009-01-27 Thread Heikki Linnakangas
Simon Riggs wrote: On Tue, 2009-01-27 at 17:50 +0200, Heikki Linnakangas wrote: Hmm, I think we have small issue if the last WAL segment restored from the archive is an incomplete one: All normal archive recoveries have complete WAL files, since an xlog switch record jumps past missing

Re: [HACKERS] Hot standby, recovery infrastructure

2009-01-27 Thread Simon Riggs
On Tue, 2009-01-27 at 20:11 +0200, Heikki Linnakangas wrote: Simon Riggs wrote: On Tue, 2009-01-27 at 17:50 +0200, Heikki Linnakangas wrote: Hmm, I think we have small issue if the last WAL segment restored from the archive is an incomplete one: All normal archive recoveries have

Re: [HACKERS] Hot standby, recovery infrastructure

2009-01-27 Thread Heikki Linnakangas
Simon Riggs wrote: On Tue, 2009-01-27 at 20:11 +0200, Heikki Linnakangas wrote: Simon Riggs wrote: On Tue, 2009-01-27 at 17:50 +0200, Heikki Linnakangas wrote: Hmm, I think we have small issue if the last WAL segment restored from the archive is an incomplete one: All normal archive

Re: [HACKERS] Hot Standby (v9d)

2009-01-27 Thread Simon Riggs
On Sat, 2009-01-24 at 17:24 +1300, Mark Kirkwood wrote: I'm doing some test runs with this now. I notice an old flatfiles related bug has reappeared: Bug fix patch against git repo. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support ***

Re: [HACKERS] Hot standby, dropping a tablespace

2009-01-26 Thread Simon Riggs
On Sun, 2009-01-25 at 19:56 +0200, Heikki Linnakangas wrote: Simon Riggs wrote: 2. Kill all connections by given user. Hmm, not used for anything, actually. Should remove the roleId argument from GetConflictingVirtualXIDs. No, because we still need to add code to kill-connected-users

Re: [HACKERS] Hot standby, dropping a tablespace

2009-01-26 Thread Simon Riggs
On Sun, 2009-01-25 at 12:13 +0000, Grzegorz Jaskiewicz wrote: On 2009-01-25, at 09:04, Simon Riggs wrote: On Sat, 2009-01-24 at 21:58 +0200, Heikki Linnakangas wrote: When replaying a DROP TABLESPACE, you first try to remove the directory, and if that fails, you assume that it's

Re: [HACKERS] Hot standby, dropping a tablespace

2009-01-26 Thread Andres Freund
Hi, Simon Riggs wrote: On Sun, 2009-01-25 at 19:56 +0200, Heikki Linnakangas wrote: Simon Riggs wrote: 2. Kill all connections by given user. Hmm, not used for anything, actually. Should remove the roleId argument from GetConflictingVirtualXIDs. No, because we still need to add code to

Re: [HACKERS] Hot standby, conflict resolution

2009-01-26 Thread Simon Riggs
On Sun, 2009-01-25 at 16:19 +0000, Simon Riggs wrote: On Fri, 2009-01-23 at 21:30 +0200, Heikki Linnakangas wrote: Ok, then I think we have a little race condition. The startup process doesn't get any reply indicating that the target backend has processed the SIGINT and set the

Re: [HACKERS] Hot standby, conflict resolution

2009-01-26 Thread Heikki Linnakangas
Simon Riggs wrote: Rather than signalling, we could use a hasconflict boolean for each proc in a shared data structure. It can be read without spinlock, but should only be written while holding spinlock. Each time we read a block we check if hasconflict is set. If it is, we grab spinlock,

Re: [HACKERS] Hot standby, dropping a tablespace

2009-01-25 Thread Simon Riggs
On Sat, 2009-01-24 at 21:58 +0200, Heikki Linnakangas wrote: When replaying a DROP TABLESPACE, you first try to remove the directory, and if that fails, you assume that it's because it's in use as a temp tablespace in a read-only transaction. That sounds like you think there is another
