Simon Riggs wrote:
On Thu, 2009-02-05 at 11:46 +0200, Heikki Linnakangas wrote:
Simon Riggs wrote:
So we might end up flushing more often *and* we will be doing it
potentially in the code path of other users.
For example, imagine a database that fits completely in shared buffers.
If we
On Thu, 2009-02-05 at 13:18 +0200, Heikki Linnakangas wrote:
Simon Riggs wrote:
On Thu, 2009-02-05 at 11:46 +0200, Heikki Linnakangas wrote:
Simon Riggs wrote:
So we might end up flushing more often *and* we will be doing it
potentially in the code path of other users.
For example,
Simon Riggs wrote:
On Thu, 2009-02-05 at 13:18 +0200, Heikki Linnakangas wrote:
Simon Riggs wrote:
On Thu, 2009-02-05 at 11:46 +0200, Heikki Linnakangas wrote:
Simon Riggs wrote:
So we might end up flushing more often *and* we will be doing it
potentially in the code path of other users.
For
On Thu, 2009-02-05 at 14:18 +0200, Heikki Linnakangas wrote:
when the control file is updated in XLogFlush, it's
typically the bgwriter doing it as it cleans buffers ahead of the
clock hand, not the startup process
That is the key point. Let's do it your way.
--
Simon Riggs
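For reference, a minimal sketch of the mechanism being agreed here, with simplified stand-in types and names (UpdateMinRecoveryPoint and the control-file helper are placeholders, not the actual patch): during recovery XLogFlush has nothing to flush, so it only makes sure pg_control's minRecoveryPoint covers the requested LSN, and since buffer cleaning mostly happens in the bgwriter, that is usually where the control file update happens.

/* Illustrative sketch only; simplified stand-ins for the 8.4-era types. */
#include <stdint.h>
#include <stdbool.h>

typedef struct XLogRecPtr
{
    uint32_t    xlogid;
    uint32_t    xrecoff;
} XLogRecPtr;

static bool       InRecovery = true;   /* true while WAL replay is running */
static XLogRecPtr minRecoveryPoint;    /* cached copy of the pg_control field */

static bool
XLByteLT(XLogRecPtr a, XLogRecPtr b)
{
    return a.xlogid < b.xlogid ||
           (a.xlogid == b.xlogid && a.xrecoff < b.xrecoff);
}

/* Placeholder: in the server this would write and fsync pg_control under
 * ControlFileLock. */
static void
UpdateControlFileMinRecoveryPoint(XLogRecPtr lsn)
{
    minRecoveryPoint = lsn;
}

static void
UpdateMinRecoveryPoint(XLogRecPtr lsn)
{
    /* Advance only; never move the safe starting point backwards. */
    if (XLByteLT(minRecoveryPoint, lsn))
        UpdateControlFileMinRecoveryPoint(lsn);
}

void
XLogFlush(XLogRecPtr record)
{
    if (InRecovery)
    {
        /*
         * Nothing to flush during replay; just record that it is not safe to
         * start up before 'record'.  Typically reached from the bgwriter as
         * it writes out dirty buffers ahead of the clock hand, not from the
         * startup process.
         */
        UpdateMinRecoveryPoint(record);
        return;
    }
    /* ... the normal write-and-fsync-WAL path goes here ... */
}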
On Thu, 2009-02-05 at 21:54 +0200, Heikki Linnakangas wrote:
- If bgwriter is performing a restartpoint when recovery ends, the
startup checkpoint will be queued up behind the restartpoint. And since
it uses the same smoothing logic as checkpoints, it can take quite some
time for that to
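One possible way to shorten that wait, sketched here only as an illustration (the request function and flags below are the existing bgwriter checkpoint-request interface of that era; whether the patch actually does this is not shown in the excerpt): request the startup checkpoint as an immediate one, so neither it nor an in-progress restartpoint keeps applying checkpoint_completion_target smoothing.

/* Sketch: at the end of recovery, ask for the startup checkpoint without
 * smoothing, so completion is not throttled behind a restartpoint. */
RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_FORCE | CHECKPOINT_WAIT);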
Fujii Masao wrote:
On Fri, Jan 30, 2009 at 11:55 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
The startup process now catches SIGTERM, and calls proc_exit() at the next
WAL record. That's what will happen in a fast shutdown. Unexpected death of
the startup process is
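A standalone toy sketch of the behaviour described above (illustrative names, not the patch itself): the SIGTERM handler only sets a flag, and the replay loop checks it between records, so a fast shutdown stops at a record boundary rather than in the middle of applying one.

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>

static volatile sig_atomic_t shutdown_requested = 0;

static void
handle_sigterm(int signo)
{
    (void) signo;
    shutdown_requested = 1;     /* do no real work inside the handler */
}

int
main(void)
{
    signal(SIGTERM, handle_sigterm);

    for (long recno = 0;; recno++)      /* stand-in for the WAL redo loop */
    {
        if (shutdown_requested)
        {
            /* the equivalent of proc_exit(): stop cleanly at a record boundary */
            printf("fast shutdown requested, stopping before record %ld\n", recno);
            exit(0);
        }
        /* ... replay one WAL record here ... */
    }
}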
Simon Riggs wrote:
* I think we are now renaming the recovery.conf file too early. The
comment says "We have already restored all the WAL segments we need from
the archive, and we trust that they are not going to go away even if we
crash." We have, but the files overwrite each other as they
On Wed, 2009-02-04 at 19:03 +0200, Heikki Linnakangas wrote:
Simon Riggs wrote:
* I think we are now renaming the recovery.conf file too early. The
comment says "We have already restored all the WAL segments we need from
the archive, and we trust that they are not going to go away even if
Hi,
On Wed, Feb 4, 2009 at 8:35 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
Yes, and in fact I ran into it myself yesterday while testing. It seems that
we should reset FatalError earlier, i.e. when the recovery starts and
bgwriter is launched. I'm not sure why in CVS
Fujii Masao masao.fu...@gmail.com writes:
On Wed, Feb 4, 2009 at 8:35 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
... I'm not sure why in CVS HEAD we don't reset
FatalError until after the startup process is finished.
Which may repeat the recovery crash and
Tom Lane wrote:
Fujii Masao masao.fu...@gmail.com writes:
On Wed, Feb 4, 2009 at 8:35 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
... I'm not sure why in CVS HEAD we don't reset
FatalError until after the startup process is finished.
Which may repeat the recovery
Simon Riggs wrote:
We could avoid that by performing a good old startup checkpoint, but I
quite like the fast failover time we get without it.
ISTM it's either slow failover or (fast failover, but restart archive
recovery if it crashes).
I would suggest that at end of recovery we write the last
On Thu, 2009-02-05 at 09:28 +0200, Heikki Linnakangas wrote:
I've changed the way minRecoveryPoint is updated now anyway, so it no
longer happens every XLogFileRead().
Care to elucidate?
I got rid of minSafeStartPoint, advancing minRecoveryPoint instead. And
it's advanced in
On Thu, 2009-02-05 at 09:28 +0200, Heikki Linnakangas wrote:
Simon Riggs wrote:
We could avoid that by performing a good old startup checkpoint, but I
quite like the fast failover time we get without it.
ISTM it's either slow failover or (fast failover, but restart archive
recovery
Hannu Krosing wrote:
Actually we came up with a solution to this - use filesystem level
snapshots (like LVM2+XFS or ZFS), and redirect backends with
long-running queries to use fs snapshot mounted to a different
mountpoint.
I don't think Simon has yet put full support for it in code, but it
Hannu Krosing ha...@krosing.net writes:
Actually we came up with a solution to this - use filesystem level
snapshots (like LVM2+XFS or ZFS), and redirect backends with
long-running queries to use fs snapshot mounted to a different
mountpoint.
Uhm, how do you determine which snapshot to
I don't see any way around the fact that when a tuple is removed, it's
gone and can't be accessed by queries. Either you don't remove it, or
you kill the query.
Actually we came up with a solution to this - use filesystem level
snapshots (like LVM2+XFS or ZFS), and redirect backends with
On Tue, 2009-02-03 at 08:40 -0500, Andrew Dunstan wrote:
Hannu Krosing wrote:
Actually we came up with a solution to this - use filesystem level
snapshots (like LVM2+XFS or ZFS), and redirect backends with
long-running queries to use fs snapshot mounted to a different
mountpoint.
I
On Tue, 2009-02-03 at 09:14 -0500, Robert Haas wrote:
I don't see any way around the fact that when a tuple is removed, it's
gone and can't be accessed by queries. Either you don't remove it, or
you kill the query.
Actually we came up with a solution to this - use filesystem level
On Tue, 2009-02-03 at 09:14 -0500, Robert Haas wrote:
I think _the_ solution is to notice when you're about to vacuum a page
that is still visible to a running backend on the standby, and save
that page off to a separate cache of old page versions (perhaps using
the relation fork mechanism).
On Tue, 2009-02-03 at 13:50 +0000, Gregory Stark wrote:
Hannu Krosing ha...@krosing.net writes:
Actually we came up with a solution to this - use filesystem level
snapshots (like LVM2+XFS or ZFS), and redirect backends with
long-running queries to use fs snapshot mounted to a different
On Tue, 2009-02-03 at 08:40 -0500, Andrew Dunstan wrote:
Hannu Krosing wrote:
Actually we came up with a solution to this - use filesystem level
snapshots (like LVM2+XFS or ZFS), and redirect backends with
long-running queries to use fs snapshot mounted to a different
mountpoint.
I
On Tue, Feb 3, 2009 at 9:40 AM, Simon Riggs si...@2ndquadrant.com wrote:
On Tue, 2009-02-03 at 09:14 -0500, Robert Haas wrote:
I think _the_ solution is to notice when you're about to vacuum a page
that is still visible to a running backend on the standby, and save
that page off to a separate
On Tue, 2009-02-03 at 15:55 +0100, Andres Freund wrote:
Hi,
On 02/03/2009 02:26 PM, Hannu Krosing wrote:
I don't see any way around the fact that when a tuple is removed, it's
gone and can't be accessed by queries. Either you don't remove it, or
you kill the query.
Actually we came up
On Tue, 2009-02-03 at 10:19 -0500, Robert Haas wrote:
On Tue, Feb 3, 2009 at 9:40 AM, Simon Riggs si...@2ndquadrant.com wrote:
On Tue, 2009-02-03 at 09:14 -0500, Robert Haas wrote:
I think _the_ solution is to notice when you're about to vacuum a page
that is still visible to a running
On Tue, 2009-02-03 at 14:28 +0000, Simon Riggs wrote:
On Tue, 2009-02-03 at 08:40 -0500, Andrew Dunstan wrote:
Hannu Krosing wrote:
Actually we came up with a solution to this - use filesystem level
snapshots (like LVM2+XFS or ZFS), and redirect backends with
long-running queries to
Hi,
On 02/03/2009 02:26 PM, Hannu Krosing wrote:
I don't see any way around the fact that when a tuple is removed, it's
gone and can't be accessed by queries. Either you don't remove it, or
you kill the query.
Actually we came up with a solution to this - use filesystem level
snapshots (like
On Tue, 2009-02-03 at 18:09 +0200, Hannu Krosing wrote:
On Tue, 2009-02-03 at 14:28 +0000, Simon Riggs wrote:
On Tue, 2009-02-03 at 08:40 -0500, Andrew Dunstan wrote:
Hannu Krosing wrote:
Actually we came up with a solution to this - use filesystem level
snapshots (like LVM2+XFS
On Wed, 2009-01-28 at 22:19 +0200, Heikki Linnakangas wrote:
Tom Lane wrote:
...
Well, those unexpectedly cancelled queries could have represented
critical functionality too. I think this argument calls the entire
approach into question. If there is no safe setting for the parameter
Hi,
On Fri, Jan 30, 2009 at 11:55 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
The startup process now catches SIGTERM, and calls proc_exit() at the next
WAL record. That's what will happen in a fast shutdown. Unexpected death of
the startup process is treated the same as
On Sat, 2009-01-31 at 22:32 +0200, Heikki Linnakangas wrote:
If you poison your WAL archive with a XLOG_CRASH_RECOVERY record,
recovery will never be able to proceed over that point. There would have
to be a switch to ignore those records, at the very least.
Definitely in assert mode only.
On Sat, 2009-01-31 at 22:41 +0200, Heikki Linnakangas wrote:
I like this way because it means we might in the future get Startup
process to perform post-recovery actions also.
Yeah, it does. Do you have something in mind already?
Yes, but nothing that needs to be discussed yet.
--
On Fri, 2009-01-30 at 13:15 +0200, Heikki Linnakangas wrote:
Simon Riggs wrote:
I'm thinking to add a new function that will make crash testing easier.
pg_crash_standby() will issue a new xlog record, XLOG_CRASH_STANDBY,
which when replayed will just throw a FATAL error and crash
On Fri, 2009-01-30 at 13:25 +0200, Heikki Linnakangas wrote:
That whole area was something I was leaving until last, since
immediate
shutdown doesn't work either, even in HEAD. (Fujii-san and I
discussed
this before Christmas, briefly).
We must handle shutdown gracefully, can't just
On Fri, 2009-01-30 at 16:55 +0200, Heikki Linnakangas wrote:
Ok, here's an attempt to make shutdown work gracefully.
Startup process now signals postmaster three times during startup: first
when it has done all the initialization, and starts redo. At that point,
postmaster launches
Simon Riggs wrote:
On Fri, 2009-01-30 at 13:15 +0200, Heikki Linnakangas wrote:
Simon Riggs wrote:
I'm thinking to add a new function that will make crash testing easier.
pg_crash_standby() will issue a new xlog record, XLOG_CRASH_STANDBY,
which when replayed will just throw a FATAL error
Simon Riggs wrote:
On Fri, 2009-01-30 at 16:55 +0200, Heikki Linnakangas wrote:
Ok, here's an attempt to make shutdown work gracefully.
Startup process now signals postmaster three times during startup: first
when it has done all the initialization, and starts redo. At that point,
postmaster
I just realized that the new minSafeStartPoint is actually exactly the
same concept as the existing minRecoveryPoint. As the recovery
progresses, we could advance minRecoveryPoint just as well as the new
minSafeStartPoint.
Perhaps it's a good idea to keep them separate anyway though, the
On Thu, 2009-01-29 at 20:35 +0200, Heikki Linnakangas wrote:
Hmm, another point of consideration is how this interacts with the
pause/continue. In particular, it was suggested earlier that you
could
put an option into recovery.conf to start in paused mode. If you
pause
recovery, and then
On Thu, 2009-01-29 at 19:20 +0200, Heikki Linnakangas wrote:
Heikki Linnakangas wrote:
It looks like if you issue a fast shutdown during recovery, postmaster
doesn't kill bgwriter.
Hmm, seems like we haven't thought through how shutdown during
consistent recovery is supposed to behave
On Fri, 2009-01-30 at 11:33 +0200, Heikki Linnakangas wrote:
I just realized that the new minSafeStartPoint is actually exactly the
same concept as the existing minRecoveryPoint. As the recovery
progresses, we could advance minRecoveryPoint just as well as the new
minSafeStartPoint.
On Thu, 2009-01-29 at 14:21 +0200, Heikki Linnakangas wrote:
It looks like if you issue a fast shutdown during recovery, postmaster
doesn't kill bgwriter.
Thanks for the report.
I'm thinking to add a new function that will make crash testing easier.
pg_crash_standby() will issue a new xlog
Simon Riggs wrote:
I'm thinking to add a new function that will make crash testing easier.
pg_crash_standby() will issue a new xlog record, XLOG_CRASH_STANDBY,
which when replayed will just throw a FATAL error and crash Startup
process. We won't be adding that to the user docs...
This will
Simon Riggs wrote:
On Thu, 2009-01-29 at 19:20 +0200, Heikki Linnakangas wrote:
Hmm, seems like we haven't thought through how shutdown during
consistent recovery is supposed to behave in general. Right now, smart
shutdown doesn't do anything during consistent recovery, because the
startup
On Thu, 2009-01-29 at 09:34 +0200, Heikki Linnakangas wrote:
It does *during recovery*, before InitXLOGAccess is called. Yeah, it's
harmless currently. It would be pretty hard to keep it up-to-date in
bgwriter and other processes. I think it's better to keep it at 0,
which is clearly an
Simon Riggs wrote:
On Thu, 2009-01-29 at 10:36 +0900, Fujii Masao wrote:
Hi,
On Wed, Jan 28, 2009 at 11:19 PM, Fujii Masao masao.fu...@gmail.com wrote:
I feel quite good about this patch now. Given the amount of code churn, it
requires testing, and I'll read it through one more time after
On Thu, 2009-01-29 at 11:20 +0200, Heikki Linnakangas wrote:
Simon Riggs wrote:
On Thu, 2009-01-29 at 10:36 +0900, Fujii Masao wrote:
Hi,
On Wed, Jan 28, 2009 at 11:19 PM, Fujii Masao masao.fu...@gmail.com
wrote:
I feel quite good about this patch now. Given the amount of code
Simon Riggs wrote:
My proposed fix for Fujii-san's minSafeStartPoint bug is to introduce
another control file state DB_IN_ARCHIVE_RECOVERY_BASE. This would show
that we are still recovering up to the point of the end of the base
backup. Once we reach minSafeStartPoint we then switch state to
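For illustration, the proposal amounts to adding one value to the DBState enum stored in pg_control (the surrounding values follow the pg_control.h of that era; the new entry is only the suggestion above, not committed code):

typedef enum DBState
{
    DB_STARTUP = 0,
    DB_SHUTDOWNED,
    DB_SHUTDOWNING,
    DB_IN_CRASH_RECOVERY,
    DB_IN_ARCHIVE_RECOVERY_BASE,    /* proposed: still replaying the base backup */
    DB_IN_ARCHIVE_RECOVERY,         /* past minSafeStartPoint */
    DB_IN_PRODUCTION
} DBState;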
It looks like if you issue a fast shutdown during recovery, postmaster
doesn't kill bgwriter.
...
LOG: restored log file 00010028 from archive
LOG: restored log file 00010029 from archive
LOG: consistent recovery state reached at 0/295C
...
LOG: restored
On Thu, 2009-01-29 at 12:22 +0200, Heikki Linnakangas wrote:
Simon Riggs wrote:
My proposed fix for Fujii-san's minSafeStartPoint bug is to introduce
another control file state DB_IN_ARCHIVE_RECOVERY_BASE. This would show
that we are still recovering up to the point of the end of the base
Simon Riggs wrote:
On Thu, 2009-01-29 at 12:22 +0200, Heikki Linnakangas wrote:
It
comes from the fact that we set minSafeStartPoint beyond the actual end
of WAL, if the last WAL segment is only partially filled (= fails CRC
check at some point). If we crash after setting minSafeStartPoint
On Thu, 2009-01-29 at 15:31 +0200, Heikki Linnakangas wrote:
Now when we restart the recovery, we will never reach
minSafeStartPoint, which is now 0/400, and we'll fail with the
error that Fujii-san pointed out. We're already way past the min
recovery point of base backup by then.
The
Simon Riggs wrote:
On Thu, 2009-01-29 at 15:31 +0200, Heikki Linnakangas wrote:
Now when we restart the recovery, we will never reach
minSafeStartPoint, which is now 0/400, and we'll fail with the
error that Fujii-san pointed out. We're already way past the min
recovery point of base
Heikki Linnakangas wrote:
It looks like if you issue a fast shutdown during recovery, postmaster
doesn't kill bgwriter.
Hmm, seems like we haven't thought through how shutdown during
consistent recovery is supposed to behave in general. Right now, smart
shutdown doesn't do anything during
Heikki Linnakangas wrote:
Simon Riggs wrote:
On Thu, 2009-01-29 at 15:31 +0200, Heikki Linnakangas wrote:
Now when we restart the recovery, we will never reach
minSafeStartPoint, which is now 0/400, and we'll fail with the
error that Fujii-san pointed out. We're already way past the min
On Wed, 2009-01-28 at 12:04 +0200, Heikki Linnakangas wrote:
I've been reviewing and massaging the so called recovery infra patch.
Thanks.
I feel quite good about this patch now. Given the amount of code
churn, it requires testing, and I'll read it through one more time
after sleeping over
Hi,
On Wed, Jan 28, 2009 at 7:04 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
I've been reviewing and massaging the so called recovery infra patch.
Great!
I feel quite good about this patch now. Given the amount of code churn, it
requires testing, and I'll read it
On Wed, 2009-01-28 at 23:19 +0900, Fujii Masao wrote:
@@ -355,6 +359,27 @@ BackgroundWriterMain(void)
*/
PG_SETMASK(UnBlockSig);
+ BgWriterRecoveryMode = IsRecoveryProcessingMode();
+
+ if (BgWriterRecoveryMode)
+ elog(DEBUG1, "bgwriter starting during
Hi,
On Wed, Jan 28, 2009 at 11:47 PM, Simon Riggs si...@2ndquadrant.com wrote:
On Wed, 2009-01-28 at 23:19 +0900, Fujii Masao wrote:
@@ -355,6 +359,27 @@ BackgroundWriterMain(void)
*/
PG_SETMASK(UnBlockSig);
+ BgWriterRecoveryMode = IsRecoveryProcessingMode();
+
+ if
On Wed, 2009-01-28 at 23:54 +0900, Fujii Masao wrote:
Why is InitXLOGAccess() called also here when bgwriter is started after
recovery? That is already called by AuxiliaryProcessMain().
InitXLOGAccess() sets the timeline and also gets the latest record
pointer. If the bgwriter is
Fujii Masao wrote:
On Wed, Jan 28, 2009 at 7:04 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
I feel quite good about this patch now. Given the amount of code churn, it
requires testing, and I'll read it through one more time after sleeping over
it. Simon, do you see
I skimmed through the Hot Standby patch for a preliminary review. I noted the
following things, some minor tweaks, some just questions. None of the things I
noted are big issues unless some of the questions uncover issues.
1) This code is obviously a cut-pasto:
+ else if (strcmp(tok1,
On Wed, 2009-01-28 at 18:55 +0000, Gregory Stark wrote:
I skimmed through the Hot Standby patch for a preliminary review. I noted the
following things, some minor tweaks, some just questions. None of the things I
noted are big issues unless some of the questions uncover issues.
Thanks for
On Wed, 2009-01-28 at 19:27 +0000, Simon Riggs wrote:
On Wed, 2009-01-28 at 18:55 +0000, Gregory Stark wrote:
Agreed. As explained when I published that patch it is deliberately
severe to allow testing of conflict resolution and feedback on it.
I still *strongly* feel the default has to be
Gregory Stark wrote:
6) I still don't understand why you need unobserved_xids. We don't need this
in normal running, an xid we don't know for certain is committed is exactly
the same as a transaction we know is currently running or aborted. So why do
you need it during HS?
In normal operation,
On Wed, 2009-01-28 at 18:55 +0000, Gregory Stark wrote:
I still don't understand why you need unobserved_xids. We don't need
this in normal running, an xid we don't know for certain is committed
is exactly the same as a transaction we know is currently running or
aborted. So why do you need
Joshua D. Drake j...@commandprompt.com writes:
On Wed, 2009-01-28 at 19:27 +0000, Simon Riggs wrote:
On Wed, 2009-01-28 at 18:55 +0000, Gregory Stark wrote:
I still *strongly* feel the default has to be the
non-destructive conservative -1.
I don't. Primarily, we must support high
* Tom Lane t...@sss.pgh.pa.us [090128 15:02]:
Well, those unexpectedly cancelled queries could have represented
critical functionality too. I think this argument calls the entire
approach into question. If there is no safe setting for the parameter
then we need to find a way to not have
(sorry for top posting -- blame apple)
I don't see anything dangerous with either setting. For use cases
where the backup is the primary purpose then killing queries is fine.
For use cases where the machine is a reporting machine then saving
large amounts of archived logs is fine.
Put another way: your characterization is no more true than claiming
there's no safe setting for statement_timeout since a large value
means clog could overflow your disk and your tables could bloat.
(I note we default statement_timeout to off though)
--
Greg
On 28 Jan 2009, at 19:56, Tom
On Wed, 2009-01-28 at 21:41 +0200, Heikki Linnakangas wrote:
So, you can think of the unobserved xids array as an extension of
ProcArray. The entries are like light-weight PGPROC entries. In fact I
proposed earlier to simply create dummy PGPROC entries instead.
Which we don't do because we
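A self-contained sketch of the data structure being discussed (the array size, names and linear searches are illustrative assumptions; the real thing would live in shared memory alongside the ProcArray): xids the standby has seen referenced in WAL, but whose commit or abort has not yet been replayed, are tracked here, and snapshot code treats them exactly like running transactions.

#include <stdint.h>
#include <stdbool.h>

typedef uint32_t TransactionId;

#define MAX_UNOBSERVED_XIDS 1024

static TransactionId unobserved_xids[MAX_UNOBSERVED_XIDS];
static int      num_unobserved = 0;

/* Called during replay when a WAL record mentions an xid we haven't seen. */
static bool
RecordUnobservedXid(TransactionId xid)
{
    for (int i = 0; i < num_unobserved; i++)
        if (unobserved_xids[i] == xid)
            return true;                        /* already tracked */
    if (num_unobserved >= MAX_UNOBSERVED_XIDS)
        return false;                           /* caller must handle overflow */
    unobserved_xids[num_unobserved++] = xid;
    return true;
}

/* Called when the commit or abort record for the xid is replayed. */
static void
ForgetUnobservedXid(TransactionId xid)
{
    for (int i = 0; i < num_unobserved; i++)
        if (unobserved_xids[i] == xid)
        {
            unobserved_xids[i] = unobserved_xids[--num_unobserved];
            return;
        }
}

/* Snapshot code treats an unobserved xid as still in progress. */
static bool
XidIsUnobserved(TransactionId xid)
{
    for (int i = 0; i < num_unobserved; i++)
        if (unobserved_xids[i] == xid)
            return true;
    return false;
}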
On Wed, 2009-01-28 at 19:27 +0000, Simon Riggs wrote:
"my failover was 12 hours behind when I needed it to be 10 seconds
behind and I lost $1 million because of downtime of Postgres"
The same could be said for warm standby right now. Or Slony-I, for that
matter. I think that we can reasonably
Tom Lane wrote:
Joshua D. Drake j...@commandprompt.com writes:
On Wed, 2009-01-28 at 19:27 +0000, Simon Riggs wrote:
On Wed, 2009-01-28 at 18:55 +0000, Gregory Stark wrote:
I still *strongly* feel the default has to be the
non-destructive conservative -1.
I don't. Primarily, we must support
On Wed, 2009-01-28 at 11:33 -0800, Joshua D. Drake wrote:
On Wed, 2009-01-28 at 19:27 +0000, Simon Riggs wrote:
On Wed, 2009-01-28 at 18:55 +0000, Gregory Stark wrote:
Agreed. As explained when I published that patch it is deliberately
severe to allow testing of conflict resolution and
On Wed, 2009-01-28 at 14:56 -0500, Tom Lane wrote:
Well, those unexpectedly cancelled queries could have represented
critical functionality too. I think this argument calls the entire
approach into question. If there is no safe setting for the parameter
then we need to find a way to not
Simon Riggs wrote:
The essential choice is "What would you like the max failover time to
be?". Some users want one server with max 5 mins behind, some want two
servers, one with 0 seconds behind, one with 12 hours behind
It's not quite that simple. Setting max_standby_delay=5mins means that
On Wed, 2009-01-28 at 22:47 +0200, Heikki Linnakangas wrote:
Simon Riggs wrote:
The essential choice is "What would you like the max failover time to
be?". Some users want one server with max 5 mins behind, some want two
servers, one with 0 seconds behind, one with 12 hours behind
It's
On Wed, 2009-01-28 at 22:47 +0200, Heikki Linnakangas wrote:
It's not quite that simple. Setting max_standby_delay=5mins means that
you're willing to wait 5 minutes for each query to die. Which means that
in worst case you have to stop for 5 minutes at every single vacuum
record, and fall
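The trade-off reads roughly like the following sketch (purely illustrative; the helper functions are hypothetical placeholders, not the patch): replay waits up to max_standby_delay for each conflicting query, so a large value protects queries at the cost of falling behind, while a small value keeps the standby current at the cost of cancellations.

#include <stdbool.h>
#include <unistd.h>

static int max_standby_delay = 300;     /* seconds; -1 could mean "wait forever" */

/* Hypothetical stand-ins for the real conflict bookkeeping. */
static bool ConflictingQueryStillRunning(void) { return false; }
static void CancelConflictingQuery(void)       { }

static void
ResolveRecoveryConflict(void)
{
    int     waited = 0;

    while (ConflictingQueryStillRunning())
    {
        if (max_standby_delay >= 0 && waited >= max_standby_delay)
        {
            CancelConflictingQuery();   /* replay wins: the query is cancelled */
            break;
        }
        sleep(1);                       /* the query wins: replay falls behind */
        waited++;
    }
}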
I don't. Primarily, we must support high availability. It is much better
if we get people saying "I get my queries cancelled" and we say "RTFM and
change parameter X", than if people say my failover was 12 hours behind
when I needed it to be 10 seconds behind and I lost $1 million because
of
Simon Riggs si...@2ndquadrant.com writes:
On Wed, 2009-01-28 at 14:56 -0500, Tom Lane wrote:
Well, those unexpectedly cancelled queries could have represented
critical functionality too. I think this argument calls the entire
approach into question. If there is no safe setting for the
Robert Haas robertmh...@gmail.com writes:
I vote with Simon. The thing is that if you get some queries
cancelled, you'll realize you have a problem.
... or if you don't, they couldn't have been all that critical.
Having your failover be 12 hours
behind (or 12 months behind) is something
Hi,
On Wed, Jan 28, 2009 at 11:19 PM, Fujii Masao masao.fu...@gmail.com wrote:
I feel quite good about this patch now. Given the amount of code churn, it
requires testing, and I'll read it through one more time after sleeping over
it. Simon, do you see anything wrong with this?
I also read
Hi,
On Wed, Jan 28, 2009 at 11:19 PM, Fujii Masao masao.fu...@gmail.com wrote:
I feel quite good about this patch now. Given the amount of code churn, it
requires testing, and I'll read it through one more time after sleeping over
it. Simon, do you see anything wrong with this?
I also read
On Thu, 2009-01-29 at 12:18 +0900, Fujii Masao wrote:
Hi,
On Wed, Jan 28, 2009 at 11:19 PM, Fujii Masao masao.fu...@gmail.com wrote:
I feel quite good about this patch now. Given the amount of code churn, it
requires testing, and I'll read it through one more time after sleeping
over
On Thu, 2009-01-29 at 10:36 +0900, Fujii Masao wrote:
Hi,
On Wed, Jan 28, 2009 at 11:19 PM, Fujii Masao masao.fu...@gmail.com wrote:
I feel quite good about this patch now. Given the amount of code churn, it
requires testing, and I'll read it through one more time after sleeping
over
Simon Riggs wrote:
On Thu, 2009-01-29 at 12:18 +0900, Fujii Masao wrote:
Though this is a matter of taste, I think that it's weird that bgwriter
runs with ThisTimeLineID = 0 during recovery. This is because
XLogCtl->ThisTimeLineID is set at the end of recovery. ISTM this will
be a cause of bug
On Tue, 2009-01-27 at 15:59 +0200, Heikki Linnakangas wrote:
Regarding this comment:
+ /*
+* Prior to 8.4 we wrote a Shutdown Checkpoint at the end of recovery.
+* This could add minutes to the startup time, so we want bgwriter
+* to perform it. This then frees the
Simon Riggs wrote:
On Tue, 2009-01-27 at 15:59 +0200, Heikki Linnakangas wrote:
Regarding this comment:
+ /*
+* Prior to 8.4 we wrote a Shutdown Checkpoint at the end of recovery.
+* This could add minutes to the startup time, so we want bgwriter
+* to perform it. This then
On Tue, 2009-01-27 at 17:50 +0200, Heikki Linnakangas wrote:
Hmm, I think we have a small issue if the last WAL segment restored from
the archive is an incomplete one:
All normal archive recoveries have complete WAL files, since an xlog
switch record jumps past missing entries at the end of the
On Tue, 2009-01-27 at 17:50 +0200, Heikki Linnakangas wrote:
Just think standard-online-checkpoint and it all fits.
Exactly that made me wonder why the first checkpoint needs to be any
different.
Not really following you to be honest, assuming this was a separate
point to the other part
Simon Riggs wrote:
On Tue, 2009-01-27 at 17:50 +0200, Heikki Linnakangas wrote:
Hmm, I think we have a small issue if the last WAL segment restored from
the archive is an incomplete one:
All normal archive recoveries have complete WAL files, since an xlog
switch record jumps past missing
On Tue, 2009-01-27 at 20:11 +0200, Heikki Linnakangas wrote:
Simon Riggs wrote:
On Tue, 2009-01-27 at 17:50 +0200, Heikki Linnakangas wrote:
Hmm, I think we have a small issue if the last WAL segment restored from
the archive is an incomplete one:
All normal archive recoveries have
Simon Riggs wrote:
On Tue, 2009-01-27 at 20:11 +0200, Heikki Linnakangas wrote:
Simon Riggs wrote:
On Tue, 2009-01-27 at 17:50 +0200, Heikki Linnakangas wrote:
Hmm, I think we have a small issue if the last WAL segment restored from
the archive is an incomplete one:
All normal archive
On Sat, 2009-01-24 at 17:24 +1300, Mark Kirkwood wrote:
I'm doing some test runs with this now. I notice an old flatfiles
related bug has reappeared:
Bug fix patch against git repo.
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
***
On Sun, 2009-01-25 at 19:56 +0200, Heikki Linnakangas wrote:
Simon Riggs wrote:
2. Kill all connections by given user. Hmm, not used for anything,
actually. Should remove the roleId argument from GetConflictingVirtualXIDs.
No, because we still need to add code to kill-connected-users
On Sun, 2009-01-25 at 12:13 +0000, Grzegorz Jaskiewicz wrote:
On 2009-01-25, at 09:04, Simon Riggs wrote:
On Sat, 2009-01-24 at 21:58 +0200, Heikki Linnakangas wrote:
When replaying a DROP TABLESPACE, you first try to remove the
directory, and if that fails, you assume that it's
Hi,
Simon Riggs wrote:
On Sun, 2009-01-25 at 19:56 +0200, Heikki Linnakangas wrote:
Simon Riggs wrote:
2. Kill all connections by given user. Hmm, not used for anything,
actually. Should remove the roleId argument from GetConflictingVirtualXIDs.
No, because we still need to add code to
On Sun, 2009-01-25 at 16:19 +0000, Simon Riggs wrote:
On Fri, 2009-01-23 at 21:30 +0200, Heikki Linnakangas wrote:
Ok, then I think we have a little race condition. The startup process
doesn't get any reply indicating that the target backend has
processed
the SIGINT and set the
Simon Riggs wrote:
Rather than signalling, we could use a hasconflict boolean for each proc
in a shared data structure. It can be read without spinlock, but should
only be written while holding spinlock.
Each time we read a block we check if hasconflict is set. If it is, we
grab spinlock,
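In code, the scheme sketches out something like this (single-file illustration with the spinlock calls shown only as comments; the real flag would live in each backend's shared-memory proc entry): readers pay nothing on the common path and only take the lock once the flag is seen set.

#include <stdbool.h>

typedef struct
{
    volatile bool   hasconflict;        /* written only while holding the spinlock */
    int             conflict_reason;
    /* slock_t mutex;  -- the spinlock protecting writes, omitted here */
} ProcConflictSlot;

/* Startup process: mark a backend as conflicting. */
static void
SetConflict(ProcConflictSlot *slot, int reason)
{
    /* SpinLockAcquire(&slot->mutex); */
    slot->conflict_reason = reason;
    slot->hasconflict = true;
    /* SpinLockRelease(&slot->mutex); */
}

/* Backend: cheap unlocked check on every block read; grab the lock only
 * if the flag is actually set. */
static bool
CheckForConflict(ProcConflictSlot *slot, int *reason)
{
    if (!slot->hasconflict)
        return false;                   /* fast path: no lock taken */

    /* SpinLockAcquire(&slot->mutex); */
    *reason = slot->conflict_reason;
    slot->hasconflict = false;
    /* SpinLockRelease(&slot->mutex); */
    return true;
}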
On Sat, 2009-01-24 at 21:58 +0200, Heikki Linnakangas wrote:
When replaying a DROP TABLESPACE, you first try to remove the
directory, and if that fails, you assume that it's because it's in use
as a temp tablespace in a read-only transaction.
That sounds like you think there is another