Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2014-02-15 Thread Andres Freund
On 2014-02-12 13:33:31 +0100, Andres Freund wrote:
 On 2014-02-12 21:23:54 +0900, MauMau wrote:
  Maybe we could consider moving in that direction, but there is a problem.  Archive
  recovery slows down compared to 9.1, because of repeated restartpoints.
  Archive recovery should be as fast as possible, because it typically applies
  dozens or hundreds of WAL files, and the DBA desires immediate resumption of
  operation.
  
  So, I think we should restore 9.1 behavior for archive recovery.
 
 It's easy to be fast, if it's not correct. I don't see how that
 behaviour is acceptable, imo the previous behaviour simply was a bug
 whose fix was too invasive to backpatch.
 
 I don't think you can convince me (but maybe others) that the old
 behaviour is acceptable.

I'm going to mark this patch as Rejected. Please speak up if that's
not accurate.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2014-02-12 Thread MauMau

From: Andres Freund and...@2ndquadrant.com

On 2014-02-02 23:50:40 +0900, Fujii Masao wrote:

Right. If standby_mode is enabled, checkpoint_segment can trigger
the restartpoint. But the problem is that the timing of restartpoint
depends on not only the checkpoint parameters (i.e.,
checkpoint_timeout and checkpoint_segments) that are used during
archive recovery but also the checkpoint WAL that was generated
by the master.


Sure. But we really *need* all the WAL since the last checkpoint's redo
location locally to be safe.


For example, could you imagine the case where the master generated
only one checkpoint WAL since the last backup and it crashed with
database corruption. Then DBA decided to perform normal archive
recovery by using the last backup. In this case, even if DBA reduces
both checkpoint_timeout and checkpoint_segments, only one
restartpoint can occur during recovery. This low frequency of
restartpoint might fill up the disk space with lots of WAL files.


I am not sure I understand the point of this scenario. If the primary
crashed after a checkpoint, there won't be that much WAL since it
happened...


 If the issue is that you're not using standby_mode (if so, why?), then
 maybe the fix is to make that apply to a wider range of situations.

I guess that he is not using standby_mode because, according to
his first email in this thread, he said he would like to prevent WAL
from accumulating in pg_xlog during normal archive recovery (i.e., PITR).


Well, that doesn't necessarily prevent you from using
standby_mode... But yes, that might be the case.

I wonder if we shouldn't just always look at checkpoint segments during
!crash recovery.


Maybe we could consider moving in that direction, but there is a problem.  Archive 
recovery slows down compared to 9.1, because of repeated restartpoints. 
Archive recovery should be as fast as possible, because it typically applies 
dozens or hundreds of WAL files, and the DBA desires immediate resumption of 
operation.


So, I think we should restore 9.1 behavior for archive recovery.  The 
attached patch keeps restored archived WAL in pg_xlog/ only during standby 
recovery.  It is based on Fujii-san's revision of the patch, with the 
AllowCascadeReplication() condition removed from two if statements.


Regards
MauMau


wal_increase_in_pitr_v4.patch
Description: Binary data



Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2014-02-12 Thread Andres Freund
On 2014-02-12 21:23:54 +0900, MauMau wrote:
 Maybe we could consider moving in that direction, but there is a problem.  Archive
 recovery slows down compared to 9.1, because of repeated restartpoints.
 Archive recovery should be as fast as possible, because it typically applies
 dozens or hundreds of WAL files, and the DBA desires immediate resumption of
 operation.
 
 So, I think we should restore 9.1 behavior for archive recovery.

It's easy to be fast, if it's not correct. I don't see how that
behaviour is acceptable, imo the previous behaviour simply was a bug
whose fix was too invasive to backpatch.

I don't think you can convince me (but maybe others) that the old
behaviour is acceptable.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2014-02-02 Thread Fujii Masao
On Sun, Feb 2, 2014 at 5:49 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-01-24 22:31:17 +0900, MauMau wrote:
 From: Fujii Masao masao.fu...@gmail.com
 On Wed, Jan 22, 2014 at 6:37 AM, Heikki Linnakangas
 Thanks! The patch looks good to me. Attached is the updated version of
 the patch. I added the comments.

 Thank you very much.  Your comment looks great.  I tested some recovery
 situations, and confirmed that WAL segments were kept/removed as intended.
 I'll update the CommitFest entry with this patch.

 You haven't updated the patch's status so far, so I've now marked it as
 returned with feedback, as several people voiced serious doubts about the
 approach. If that's not accurate please speak up.

 The problem is, we might not be able to perform restartpoints more
 aggressively even if we reduce checkpoint_timeout in the server under the
 archive recovery. Because the frequency of occurrence of restartpoints
 depends on not only that checkpoint_timeout but also the checkpoints which
 happened while the server was running.

 I haven't tried reducing checkpoint_timeout.

 Did you try reducing checkpoint_segments? As I pointed out, at least if
 standby_mode is enabled, it will also trigger checkpoints, independently
 from checkpoint_timeout.

Right. If standby_mode is enabled, checkpoint_segment can trigger
the restartpoint. But the problem is that the timing of restartpoint
depends on not only the checkpoint parameters (i.e.,
checkpoint_timeout and checkpoint_segments) that are used during
archive recovery but also the checkpoint WAL that was generated
by the master.

For example, could you imagine the case where the master generated
only one checkpoint WAL since the last backup and it crashed with
database corruption. Then DBA decided to perform normal archive
recovery by using the last backup. In this case, even if DBA reduces
both checkpoint_timeout and checkpoint_segments, only one
restartpoint can occur during recovery. This low frequency of
restartpoint might fill up the disk space with lots of WAL files.
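Fujii-san's point can be put in a small model (an illustration only, not PostgreSQL code): a restartpoint can only be established where the master's WAL stream contains a checkpoint record, so lowering the recovering server's own settings cannot create more opportunities than the master left behind.

```python
# Illustrative model only: restartpoints during recovery can only be taken
# at checkpoint records the master wrote into the WAL stream, no matter how
# low the recovering server sets checkpoint_timeout / checkpoint_segments.

def possible_restartpoints(master_checkpoint_lsns, replayed_upto):
    """Positions at which recovery could establish a restartpoint."""
    return [lsn for lsn in master_checkpoint_lsns if lsn <= replayed_upto]

# Master wrote a single checkpoint record after the base backup, then a lot
# of WAL: at most one restartpoint is possible, however much WAL piles up.
print(possible_restartpoints([120], 100000))   # -> [120]
```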

This would be harmless if the server on which we are performing recovery
has enough disk space. But I can imagine that some users want to recover
the database and restart the service temporarily on a server with barely
enough disk space until they can procure a sufficient one. In this case,
accumulating lots of WAL files in pg_xlog might be harmful.

 If the issue is that you're not using standby_mode (if so, why?), then
 maybe the fix is to make that apply to a wider range of situations.

I guess that he is not using standby_mode because, according to
his first email in this thread, he said he would like to prevent WAL
from accumulating in pg_xlog during normal archive recovery (i.e., PITR).
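For reference, a minimal sketch of the recovery.conf distinction being discussed here (the archive path and target time are illustrative, not from this thread):

```ini
# Plain archive recovery (PITR) -- standby_mode off, which is the case where
# <= 9.1 reused a single RECOVERYXLOG file instead of filling pg_xlog:
restore_command = 'cp /mnt/archive/%f "%p"'    # illustrative archive path
recovery_target_time = '2014-01-01 00:00:00'   # illustrative target

# Standby recovery -- restored segments are kept in pg_xlog, as needed for
# cascading replication and for crash recovery after fast promotion:
# standby_mode = 'on'
```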

Regards,

-- 
Fujii Masao




Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2014-02-02 Thread Andres Freund
On 2014-02-02 23:50:40 +0900, Fujii Masao wrote:
 On Sun, Feb 2, 2014 at 5:49 AM, Andres Freund and...@2ndquadrant.com wrote:
  On 2014-01-24 22:31:17 +0900, MauMau wrote:
  I haven't tried reducing checkpoint_timeout.
 
  Did you try reducing checkpoint_segments? As I pointed out, at least if
  standby_mode is enabled, it will also trigger checkpoints, independently
  from checkpoint_timeout.

 Right. If standby_mode is enabled, checkpoint_segment can trigger
 the restartpoint. But the problem is that the timing of restartpoint
 depends on not only the checkpoint parameters (i.e.,
 checkpoint_timeout and checkpoint_segments) that are used during
 archive recovery but also the checkpoint WAL that was generated
 by the master.

Sure. But we really *need* all the WAL since the last checkpoint's redo
location locally to be safe.

 For example, could you imagine the case where the master generated
 only one checkpoint WAL since the last backup and it crashed with
 database corruption. Then DBA decided to perform normal archive
 recovery by using the last backup. In this case, even if DBA reduces
 both checkpoint_timeout and checkpoint_segments, only one
 restartpoint can occur during recovery. This low frequency of
 restartpoint might fill up the disk space with lots of WAL files.

I am not sure I understand the point of this scenario. If the primary
crashed after a checkpoint, there won't be that much WAL since it
happened...

  If the issue is that you're not using standby_mode (if so, why?), then
  maybe the fix is to make that apply to a wider range of situations.
 
 I guess that he is not using standby_mode because, according to
 his first email in this thread, he said he would like to prevent WAL
 from accumulating in pg_xlog during normal archive recovery (i.e., PITR).

Well, that doesn't necessarily prevent you from using
standby_mode... But yes, that might be the case.

I wonder if we shouldn't just always look at checkpoint segments during
!crash recovery.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2014-02-01 Thread Andres Freund
On 2014-01-21 23:37:43 +0200, Heikki Linnakangas wrote:
 On 01/21/2014 07:31 PM, Fujii Masao wrote:
 On Fri, Dec 20, 2013 at 9:21 PM, MauMau maumau...@gmail.com wrote:
 From: Fujii Masao masao.fu...@gmail.com
 
 ! if (source == XLOG_FROM_ARCHIVE && StandbyModeRequested)
 
 Even when standby_mode is not enabled, we can use cascade replication and
 it needs the accumulated WAL files. So I think that
 AllowCascadeReplication()
 should be added into this condition.
 
 !   snprintf(recoveryPath, MAXPGPATH, XLOGDIR "/RECOVERYXLOG");
 !   XLogFilePath(xlogpath, ThisTimeLineID, endLogSegNo);
 !
 !   if (restoredFromArchive)
 
 Don't we need to check !StandbyModeRequested and
 !AllowCascadeReplication()
 here?
 
 Oh, you are correct.  Okay, done.
 
 Thanks! The patch looks good to me. Attached is the updated version of
 the patch. I added the comments.
 
 Sorry for reacting so slowly, but I'm not sure I like this patch. It's a
 quite useful property that all the WAL files that are needed for recovery
 are copied into pg_xlog, even when restoring from archive, even when not
 doing cascading replication. It guarantees that you can restart the standby,
 even if the connection to the archive is lost for some reason. I
 intentionally changed the behavior for archive recovery too, when it was
 introduced for cascading replication. Also, I think it's good that the
 behavior does not depend on whether cascading replication is enabled - it's
 a quite subtle difference.
 
 So, IMHO this is not a bug, it's a feature.

Very much seconded. With the old behaviour it's possible to get into the
situation that you cannot get your standby productive anymore if the
archive server died. Since the archive server is often the primary
that's imo unacceptable.

I'd even go as far as saying the previous behaviour is a bug.

 To solve the original problem of running out of disk space in archive
 recovery, I wonder if we should perform restartpoints more aggressively. We
 intentionally don't trigger restartpoints by checkpoint_segments, only
 checkpoint_timeout, but I wonder if there should be an option for that.
 MauMau, did you try simply reducing checkpoint_timeout, while doing
 recovery?

Hm, don't we actually trigger restartpoints based on checkpoint
segments?

static int
XLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr, int reqLen,
             XLogRecPtr targetRecPtr, char *readBuf, TimeLineID *readTLI)
{
...

    if (readFile >= 0 && !XLByteInSeg(targetPagePtr, readSegNo))
    {
        /*
         * Request a restartpoint if we've replayed too much xlog since the
         * last one.
         */
        if (StandbyModeRequested && bgwriterLaunched)
        {
            if (XLogCheckpointNeeded(readSegNo))
            {
                (void) GetRedoRecPtr();
                if (XLogCheckpointNeeded(readSegNo))
                    RequestCheckpoint(CHECKPOINT_CAUSE_XLOG);
            }
        }
...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2014-02-01 Thread Andres Freund
On 2014-01-24 22:31:17 +0900, MauMau wrote:
 From: Fujii Masao masao.fu...@gmail.com
 On Wed, Jan 22, 2014 at 6:37 AM, Heikki Linnakangas
 Thanks! The patch looks good to me. Attached is the updated version of
 the patch. I added the comments.
 
 Thank you very much.  Your comment looks great.  I tested some recovery
 situations, and confirmed that WAL segments were kept/removed as intended.
 I'll update the CommitFest entry with this patch.

You haven't updated the patch's status so far, so I've now marked it as
returned with feedback, as several people voiced serious doubts about the
approach. If that's not accurate please speak up.

 The problem is, we might not be able to perform restartpoints more
 aggressively even if we reduce checkpoint_timeout in the server under the
 archive recovery. Because the frequency of occurrence of restartpoints
 depends on not only that checkpoint_timeout but also the checkpoints which
 happened while the server was running.
 
 I haven't tried reducing checkpoint_timeout.

Did you try reducing checkpoint_segments? As I pointed out, at least if
standby_mode is enabled, it will also trigger checkpoints, independently
from checkpoint_timeout.

If the issue is that you're not using standby_mode (if so, why?), then
maybe the fix is to make that apply to a wider range of situations.

 I think we cannot take that
 approach, because we cannot suggest appropriate checkpoint_timeout to users.
 That is, what checkpoint_timeout setting can we suggest so that WAL doesn't
 accumulate in pg_xlog/ more than in 9.1?

Well, <= 9.1's behaviour can lead to loss of a consistent database, so I
don't think it's what we need to compare the current state to.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2014-01-24 Thread MauMau

From: Fujii Masao masao.fu...@gmail.com

On Wed, Jan 22, 2014 at 6:37 AM, Heikki Linnakangas

Thanks! The patch looks good to me. Attached is the updated version of
the patch. I added the comments.


Thank you very much.  Your comment looks great.  I tested some recovery 
situations, and confirmed that WAL segments were kept/removed as intended. 
I'll update the CommitFest entry with this patch.



hlinnakan...@vmware.com wrote:

Sorry for reacting so slowly, but I'm not sure I like this patch. It's a
quite useful property that all the WAL files that are needed for recovery
are copied into pg_xlog, even when restoring from archive, even when not
doing cascading replication. It guarantees that you can restart the
standby, even if the connection to the archive is lost for some reason. I
intentionally changed the behavior for archive recovery too, when it was
introduced for cascading replication. Also, I think it's good that the
behavior does not depend on whether cascading replication is enabled -
it's a quite subtle difference.

So, IMHO this is not a bug, it's a feature.


Yep.


I understood the benefit for the standby recovery.


To solve the original problem of running out of disk space in archive
recovery, I wonder if we should perform restartpoints more aggressively.
We intentionally don't trigger restartpoints by checkpoint_segments, only
checkpoint_timeout, but I wonder if there should be an option for that.


That's an option.


MauMau, did you try simply reducing checkpoint_timeout, while doing
recovery?


The problem is, we might not be able to perform restartpoints more
aggressively even if we reduce checkpoint_timeout in the server under the
archive recovery. Because the frequency of occurrence of restartpoints
depends on not only that checkpoint_timeout but also the checkpoints which
happened while the server was running.


I haven't tried reducing checkpoint_timeout.  I think we cannot take that 
approach, because we cannot suggest appropriate checkpoint_timeout to users. 
That is, what checkpoint_timeout setting can we suggest so that WAL doesn't 
accumulate in pg_xlog/ more than in 9.1?


In addition, as Fujii-san said, it doesn't seem we can control restartpoints
completely.  Plus, if we cause restartpoints frequently, the recovery
would take (much?) longer, because shared buffers are flushed more
frequently.


So, how about just removing the AllowCascadeReplication() condition from the 
patch?  That allows WAL to accumulate in pg_xlog/ during standby recovery 
but not during archive recovery.


Regards
MauMau






Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2014-01-21 Thread Fujii Masao
On Fri, Dec 20, 2013 at 9:21 PM, MauMau maumau...@gmail.com wrote:
 From: Fujii Masao masao.fu...@gmail.com

 ! if (source == XLOG_FROM_ARCHIVE && StandbyModeRequested)

 Even when standby_mode is not enabled, we can use cascade replication and
 it needs the accumulated WAL files. So I think that
 AllowCascadeReplication()
 should be added into this condition.

 !   snprintf(recoveryPath, MAXPGPATH, XLOGDIR "/RECOVERYXLOG");
 !   XLogFilePath(xlogpath, ThisTimeLineID, endLogSegNo);
 !
 !   if (restoredFromArchive)

 Don't we need to check !StandbyModeRequested and
 !AllowCascadeReplication()
 here?


 Oh, you are correct.  Okay, done.

Thanks! The patch looks good to me. Attached is the updated version of
the patch. I added the comments.

Did you test whether this patch works properly in several recovery cases?

Regards,

-- 
Fujii Masao
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***
*** 3772,3781  XLogFileRead(XLogSegNo segno, int emode, TimeLineID tli,
  	}
  
  	/*
! 	 * If the segment was fetched from archival storage, replace the existing
! 	 * xlog segment (if any) with the archival version.
  	 */
! 	if (source == XLOG_FROM_ARCHIVE)
  	{
  		KeepFileRestoredFromArchive(path, xlogfname);
  
--- 3772,3792 
  	}
  
  	/*
! 	 * If the segment was fetched from archival storage and either cascading
! 	 * replication or standby_mode is enabled, replace the existing xlog
! 	 * segment (if any) with the archival version. Cascading replication needs
! 	 * this replacement so that cascading walsender can send the xlog segment
! 	 * which was restored from the archive. When standby_mode is enabled,
! 	 * fast promotion is performed at the end of recovery, and also needs this
! 	 * replacement so that the crash recovery just after fast promotion can
! 	 * replay all the required segments from pg_xlog.
! 	 *
! 	 * If the replacement is not required, the segments are always restored onto
! 	 * the same file named RECOVERYXLOG from the archive. This prevents the
! 	 * large increase of segments in pg_xlog.
  	 */
! 	if (source == XLOG_FROM_ARCHIVE &&
! 		(AllowCascadeReplication() || StandbyModeRequested))
  	{
  		KeepFileRestoredFromArchive(path, xlogfname);
  
***
*** 5572,5593  exitArchiveRecovery(TimeLineID endTLI, XLogSegNo endLogSegNo)
  	}
  
  	/*
! 	 * If we are establishing a new timeline, we have to copy data from the
! 	 * last WAL segment of the old timeline to create a starting WAL segment
! 	 * for the new timeline.
  	 *
! 	 * Notify the archiver that the last WAL segment of the old timeline is
! 	 * ready to copy to archival storage. Otherwise, it is not archived for a
! 	 * while.
  	 */
! 	if (endTLI != ThisTimeLineID)
  	{
! 		XLogFileCopy(endLogSegNo, endTLI, endLogSegNo);
  
! 		if (XLogArchivingActive())
  		{
! 			XLogFileName(xlogpath, endTLI, endLogSegNo);
! 			XLogArchiveNotify(xlogpath);
  		}
  	}
  
--- 5583,5646 
  	}
  
  	/*
! 	 * If the segment was fetched from archival storage and neither cascading
! 	 * replication nor fast promotion is enabled, we replace the existing xlog
! 	 * segment (if any) with the archival version named RECOVERYXLOG. This is
! 	 * because whatever is in XLOGDIR is very possibly older than what we have
! 	 * from the archives, since it could have come from restoring a PGDATA
! 	 * backup.	In any case, the archival version certainly is more
! 	 * descriptive of what our current database state is, because that is what
! 	 * we replayed from.
! 	 *
! 	 * Note that if we are establishing a new timeline, ThisTimeLineID is
! 	 * already set to the new value, and so we will create a new file instead
! 	 * of overwriting any existing file.  (This is, in fact, always the case
! 	 * at present.)
  	 *
! 	 * If either cascading replication or fast promotion is enabled, we don't
! 	 * need the replacement here because it has already done just after the
! 	 * segment was restored from the archive.
  	 */
! 	snprintf(recoveryPath, MAXPGPATH, XLOGDIR "/RECOVERYXLOG");
! 	XLogFilePath(xlogpath, ThisTimeLineID, endLogSegNo);
! 
! 	if (restoredFromArchive &&
! 		!AllowCascadeReplication() && !StandbyModeRequested)
  	{
! 		unlink(xlogpath);		/* might or might not exist */
! 		if (rename(recoveryPath, xlogpath) != 0)
! 			ereport(FATAL,
! 	(errcode_for_file_access(),
! 	 errmsg("could not rename file \"%s\" to \"%s\": %m",
! 			recoveryPath, xlogpath)));
! 		/* XXX might we need to fix permissions on the file? */
! 	}
! 	else
! 	{
! 		/*
! 		 * Since there might be a partial WAL segment named RECOVERYXLOG,
! 		 * get rid of it.
! 		 */
! 		unlink(recoveryPath);	/* ignore any error */
  
! 		/*
! 		 * If we are establishing a new timeline, we have to copy data from
! 		 * the last WAL segment of the old timeline to create a starting WAL
! 		 * segment for the new timeline.
! 		 *
! 		 * Notify the archiver that the last WAL segment of the old timeline
! 		 * is ready to copy to archival 

Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2014-01-21 Thread Heikki Linnakangas

On 01/21/2014 07:31 PM, Fujii Masao wrote:

On Fri, Dec 20, 2013 at 9:21 PM, MauMau maumau...@gmail.com wrote:

From: Fujii Masao masao.fu...@gmail.com


! if (source == XLOG_FROM_ARCHIVE && StandbyModeRequested)

Even when standby_mode is not enabled, we can use cascade replication and
it needs the accumulated WAL files. So I think that
AllowCascadeReplication()
should be added into this condition.

!   snprintf(recoveryPath, MAXPGPATH, XLOGDIR "/RECOVERYXLOG");
!   XLogFilePath(xlogpath, ThisTimeLineID, endLogSegNo);
!
!   if (restoredFromArchive)

Don't we need to check !StandbyModeRequested and
!AllowCascadeReplication()
here?


Oh, you are correct.  Okay, done.


Thanks! The patch looks good to me. Attached is the updated version of
the patch. I added the comments.


Sorry for reacting so slowly, but I'm not sure I like this patch. It's a 
quite useful property that all the WAL files that are needed for 
recovery are copied into pg_xlog, even when restoring from archive, even 
when not doing cascading replication. It guarantees that you can restart 
the standby, even if the connection to the archive is lost for some 
reason. I intentionally changed the behavior for archive recovery too, 
when it was introduced for cascading replication. Also, I think it's 
good that the behavior does not depend on whether cascading replication 
is enabled - it's a quite subtle difference.


So, IMHO this is not a bug, it's a feature.

To solve the original problem of running out of disk space in archive 
recovery, I wonder if we should perform restartpoints more aggressively. 
We intentionally don't trigger restartpoints by checkpoint_segments, only 
checkpoint_timeout, but I wonder if there should be an option for that. 
MauMau, did you try simply reducing checkpoint_timeout, while doing 
recovery?


- Heikki




Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2014-01-21 Thread Fujii Masao
On Wed, Jan 22, 2014 at 6:37 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 On 01/21/2014 07:31 PM, Fujii Masao wrote:

 On Fri, Dec 20, 2013 at 9:21 PM, MauMau maumau...@gmail.com wrote:

 From: Fujii Masao masao.fu...@gmail.com

  ! if (source == XLOG_FROM_ARCHIVE && StandbyModeRequested)

 Even when standby_mode is not enabled, we can use cascade replication
 and
 it needs the accumulated WAL files. So I think that
 AllowCascadeReplication()
 should be added into this condition.

  !   snprintf(recoveryPath, MAXPGPATH, XLOGDIR "/RECOVERYXLOG");
 !   XLogFilePath(xlogpath, ThisTimeLineID, endLogSegNo);
 !
 !   if (restoredFromArchive)

 Don't we need to check !StandbyModeRequested and
 !AllowCascadeReplication()
 here?


 Oh, you are correct.  Okay, done.


 Thanks! The patch looks good to me. Attached is the updated version of
 the patch. I added the comments.


 Sorry for reacting so slowly, but I'm not sure I like this patch. It's a
 quite useful property that all the WAL files that are needed for recovery
 are copied into pg_xlog, even when restoring from archive, even when not
 doing cascading replication. It guarantees that you can restart the standby,
 even if the connection to the archive is lost for some reason. I
 intentionally changed the behavior for archive recovery too, when it was
 introduced for cascading replication. Also, I think it's good that the
 behavior does not depend on whether cascading replication is enabled - it's
 a quite subtle difference.

 So, IMHO this is not a bug, it's a feature.

Yep.

 To solve the original problem of running out of disk space in archive
 recovery, I wonder if we should perform restartpoints more aggressively. We
 intentionally don't trigger restartpoints by checkpoint_segments, only
 checkpoint_timeout, but I wonder if there should be an option for that.

That's an option.

 MauMau, did you try simply reducing checkpoint_timeout, while doing
 recovery?

The problem is, we might not be able to perform restartpoints more aggressively
even if we reduce checkpoint_timeout in the server under the archive recovery.
Because the frequency of occurrence of restartpoints depends on not only that
checkpoint_timeout but also the checkpoints which happened while the server
was running.

Regards,

-- 
Fujii Masao




Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2013-12-20 Thread MauMau

From: Fujii Masao masao.fu...@gmail.com

! if (source == XLOG_FROM_ARCHIVE && StandbyModeRequested)

Even when standby_mode is not enabled, we can use cascade replication and
it needs the accumulated WAL files. So I think that
AllowCascadeReplication() should be added into this condition.

!   snprintf(recoveryPath, MAXPGPATH, XLOGDIR "/RECOVERYXLOG");
!   XLogFilePath(xlogpath, ThisTimeLineID, endLogSegNo);
!
!   if (restoredFromArchive)

Don't we need to check !StandbyModeRequested and
!AllowCascadeReplication() here?


Oh, you are correct.  Okay, done.



!   /*
!    * If the latest segment is not archival, but there's still a
!    * RECOVERYXLOG laying about, get rid of it.
!    */
!   unlink(recoveryPath);   /* ignore any error */

The similar line exists in the lower side of exitArchiveRecovery(), so
ISTM that you can refactor that.


That's already done in the previous patch: deletion of RECOVERYXLOG was 
moved into else clause, as in:


-  /*
-   * Since there might be a partial WAL segment named RECOVERYXLOG, get rid
-   * of it.
-   */
-  snprintf(recoveryPath, MAXPGPATH, XLOGDIR "/RECOVERYXLOG");
-  unlink(recoveryPath);  /* ignore any error */
-



Regards
MauMau


wal_increase_in_pitr_v2.patch
Description: Binary data



Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2013-12-18 Thread Fujii Masao
On Mon, Dec 16, 2013 at 9:22 PM, MauMau maumau...@gmail.com wrote:
 Hi, Fujii san,

 On Wed, Aug 7, 2013 at 7:03 AM, Fujii Masao masao.fu...@gmail.com wrote:

 On second thought, probably we cannot remove the restored WAL files early
 because they might be required for fast promotion, which is a new feature
 in 9.3. In fast promotion, an end-of-recovery checkpoint is not executed.
 After the end of recovery, a normal online checkpoint starts. If the
 server crashes before such an online checkpoint completes, the server
 needs to replay again all the WAL files which it replayed before the
 promotion. Since this is crash recovery, all those WAL files need to
 exist in the pg_xlog directory. So if we remove the restored WAL files
 from pg_xlog early, such a crash recovery might fail.

 So even if cascade replication is disabled, if standby_mode = on, i.e.,
 fast promotion can be performed, we cannot remove the restored WAL files
 early.


 Following Fujii-san's advice, I've made the attached patch.

Thanks for the patch!

! if (source == XLOG_FROM_ARCHIVE && StandbyModeRequested)

Even when standby_mode is not enabled, we can use cascade replication and
it needs the accumulated WAL files. So I think that AllowCascadeReplication()
should be added into this condition.

!   snprintf(recoveryPath, MAXPGPATH, XLOGDIR "/RECOVERYXLOG");
!   XLogFilePath(xlogpath, ThisTimeLineID, endLogSegNo);
!
!   if (restoredFromArchive)

Don't we need to check !StandbyModeRequested and !AllowCascadeReplication()
here?

!   /*
!* If the latest segment is not archival, but there's still a
!* RECOVERYXLOG laying about, get rid of it.
!*/
!   unlink(recoveryPath);   /* ignore any error */

The similar line exists in the lower side of exitArchiveRecovery(), so ISTM that
you can refactor that.

Regards,

-- 
Fujii Masao




Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2013-12-16 Thread MauMau

Hi, Fujii san,


On Wed, Aug 7, 2013 at 7:03 AM, Fujii Masao masao.fu...@gmail.com wrote:

On second thought, probably we cannot remove the restored WAL files early
because they might be required for fast promotion, which is a new feature in
9.3.
In fast promotion, an end-of-recovery checkpoint is not executed. After 
the end
of recovery, normal online checkpoint starts. If the server crashes 
before such
an online checkpoint completes, the server needs to replay again all the 
WAL
files which it replayed before the promotion. Since this is the crash 
recovery,
all those WAL files need to exist in pg_xlog directory. So if we remove 
the

restored WAL file from pg_xlog early, such a crash recovery might fail.

So even if cascade replication is disabled, if standby_mode = on, i.e., 
fast
promotion can be performed, we cannot remove the restored WAL files 
early.


Following Fujii-san's advice, I've made the attached patch.  This prevents 
unnecessary WAL accumulation in pg_xlog/ during archive recovery (not 
standby recovery).  To do this, I had to revive some code in 
exitArchiveRecovery() from 9.1.


Could you commit this to 9.2 and later?


Regards
MauMau


wal_increase_in_pitr.patch



Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2013-08-07 Thread Fujii Masao
On Fri, Aug 2, 2013 at 10:52 AM, Michael Paquier
michael.paqu...@gmail.com wrote:



 On Fri, Aug 2, 2013 at 12:24 AM, MauMau maumau...@gmail.com wrote:

 From: Fujii Masao masao.fu...@gmail.com

 However, isn't StandbyModeRequested true (= standby_mode set to on) to
 enable
 warm standby?


 We can set up warm-standby by using pg_standby even if standby_mode =
 off.


 I see.  However, I understand that pg_standby is a legacy feature, and the
 current way to set up a warm standby is to set standby_mode=on and restore_command
 in recovery.conf.  So I think it might not be necessary to support cascading
 replication with pg_standby, if supporting it would prevent better
 implementation of new features.

 You are right about that, you should stick with the core features as much as
 you can and limit the use of wrapper utilities. Since 9.1 and the appearance
 of a built-in standby mode inside Postgres (with the introduction of the GUC
 parameter hot_standby), it seems better to use that instead of pg_standby,
 which would likely (personal opinion, feel free to scream at me) be removed
 in a future release.

I'm sure that there are some users who use pg_standby for warm-standby
for some reasons, for example, the document (*1) explains such a configuration,
a user can specify the file-check-interval (by using -s option), and so on.
I agree that using pg_standby + cascade replication would be very rare.
But I'm not confident that *no one* uses pg_standby + cascade replication.

(*1) http://www.postgresql.org/docs/devel/static/log-shipping-alternative.html

Regards,

-- 
Fujii Masao




Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2013-08-07 Thread Fujii Masao
On Fri, Aug 2, 2013 at 12:24 AM, MauMau maumau...@gmail.com wrote:
 From: Fujii Masao masao.fu...@gmail.com

 However, isn't StandbyModeRequested true (= standby_mode set to on) to enable
 warm standby?


 We can set up warm-standby by using pg_standby even if standby_mode = off.


 I see.  However, I understand that pg_standby is a legacy feature, and the
 current way to set up a warm standby is to set standby_mode=on and restore_command
 in recovery.conf.  So I think it might not be necessary to support cascading
 replication with pg_standby, if supporting it would prevent better
 implementation of new features.



  I'm afraid people set max_wal_senders > 0 and hot_standby=on
 even on the primary server to make the contents of postgresql.conf
 identical
 on both the primary and the standby for easier configuration.  If so,
 normal
 archive recovery (PITR, not the standby recovery) would face the original
 problem -- unnecessary WAL accumulation in pg_xlog/.  So I wonder if
 AllowCascadeReplication() is enough.


 One idea is to add new GUC parameter like enable_cascade_replication
 and then prevent WAL file from being kept in pg_xlog if that parameter is
 off.
 But we cannot apply such change into the already-released version 9.2.


 Yes, I thought the same, too.  Adding a new GUC to enable cascading
 replication now would be a usability degradation.  But I have no idea of any
 better way.  It seems we need to take either v1 or v2 of the patch I sent.
 If we consider that we don't have to support cascading replication for a
 legacy pg_standby, v1 patch is definitely better because the standby server
 would not keep restored archive WAL in pg_xlog when cascading replication is
 not used.  Otherwise, we have to take v2 patch.

 Could you choose either and commit it?

On second thought, probably we cannot remove the restored WAL files early
because they might be required for fast promotion, which is a new feature in 9.3.
In fast promotion, an end-of-recovery checkpoint is not executed. After the end
of recovery, normal online checkpoint starts. If the server crashes before such
an online checkpoint completes, the server needs to replay again all the WAL
files which it replayed before the promotion. Since this is the crash recovery,
all those WAL files need to exist in pg_xlog directory. So if we remove the
restored WAL file from pg_xlog early, such a crash recovery might fail.

So even if cascade replication is disabled, if standby_mode = on, i.e., fast
promotion can be performed, we cannot remove the restored WAL files early.

Regards,

-- 
Fujii Masao




Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2013-08-07 Thread Jeff Janes
On Wed, Aug 7, 2013 at 7:03 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Fri, Aug 2, 2013 at 12:24 AM, MauMau maumau...@gmail.com wrote:
 From: Fujii Masao masao.fu...@gmail.com

  However, isn't StandbyModeRequested true (= standby_mode set to on) to enable
 warm standby?


 We can set up warm-standby by using pg_standby even if standby_mode = off.


 I see.  However, I understand that pg_standby is a legacy feature, and the
  current way to set up a warm standby is to set standby_mode=on and restore_command
 in recovery.conf.  So I think it might not be necessary to support cascading
 replication with pg_standby, if supporting it would prevent better
 implementation of new features.



   I'm afraid people set max_wal_senders > 0 and hot_standby=on
 even on the primary server to make the contents of postgresql.conf
 identical
 on both the primary and the standby for easier configuration.  If so,
 normal
 archive recovery (PITR, not the standby recovery) would face the original
  problem -- unnecessary WAL accumulation in pg_xlog/.  So I wonder if
 AllowCascadeReplication() is enough.


 One idea is to add new GUC parameter like enable_cascade_replication
 and then prevent WAL file from being kept in pg_xlog if that parameter is
 off.
 But we cannot apply such change into the already-released version 9.2.


 Yes, I thought the same, too.  Adding a new GUC to enable cascading
 replication now would be a usability degradation.  But I have no idea of any
 better way.  It seems we need to take either v1 or v2 of the patch I sent.
 If we consider that we don't have to support cascading replication for a
 legacy pg_standby, v1 patch is definitely better because the standby server
 would not keep restored archive WAL in pg_xlog when cascading replication is
 not used.  Otherwise, we have to take v2 patch.

 Could you choose either and commit it?

 On second thought, probably we cannot remove the restored WAL files early
 because they might be required for fast promotion, which is a new feature in 9.3.
 In fast promotion, an end-of-recovery checkpoint is not executed. After the 
 end
 of recovery, normal online checkpoint starts. If the server crashes before 
 such
 an online checkpoint completes, the server needs to replay again all the WAL
 files which it replayed before the promotion. Since this is the crash 
 recovery,
 all those WAL files need to exist in pg_xlog directory. So if we remove the
 restored WAL file from pg_xlog early, such a crash recovery might fail.

 So even if cascade replication is disabled, if standby_mode = on, i.e., fast
 promotion can be performed, we cannot remove the restored WAL files early.

But we should still be able to remove files older than the latest
restart point, right?

Cheers,

Jeff




Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2013-08-02 Thread Dimitri Fontaine
Michael Paquier michael.paqu...@gmail.com writes:
 By reading this thread, -1 for the addition of a new GUC parameter related
 to cascading, it looks like overkill for the possible gain. And +1 for
 the removal of WAL file once it is replayed in archive recovery if
 cascading replication is not allowed. However, what about

Well, we still don't register standby servers, so I vote against early
removal of files that you have no possible way to know if they are still
useful or not. We need something smarter here. Please.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2013-08-01 Thread MauMau

From: Fujii Masao masao.fu...@gmail.com

However, isn't StandbyModeRequested true (= standby_mode set to on) to enable
warm standby?


We can set up warm-standby by using pg_standby even if standby_mode = off.


I see.  However, I understand that pg_standby is a legacy feature, and the 
current way to set up a warm standby is to set standby_mode=on and restore_command
in recovery.conf.  So I think it might not be necessary to support cascading 
replication with pg_standby, if supporting it would prevent better 
implementation of new features.




 I'm afraid people set max_wal_senders > 0 and hot_standby=on
even on the primary server to make the contents of postgresql.conf 
identical
on both the primary and the standby for easier configuration.  If so, 
normal

archive recovery (PITR, not the standby recovery) would face the original
problem -- unnecessary WAL accumulation in pg_xlog/.  So I wonder if
AllowCascadeReplication() is enough.


One idea is to add new GUC parameter like enable_cascade_replication
and then prevent WAL file from being kept in pg_xlog if that parameter is 
off.

But we cannot apply such change into the already-released version 9.2.


Yes, I thought the same, too.  Adding a new GUC to enable cascading 
replication now would be a usability degradation.  But I have no idea of any 
better way.  It seems we need to take either v1 or v2 of the patch I sent. 
If we consider that we don't have to support cascading replication for a 
legacy pg_standby, v1 patch is definitely better because the standby server 
would not keep restored archive WAL in pg_xlog when cascading replication is 
not used.  Otherwise, we have to take v2 patch.


Could you choose either and commit it?

Regards
MauMau





Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2013-08-01 Thread Michael Paquier
On Fri, Aug 2, 2013 at 12:24 AM, MauMau maumau...@gmail.com wrote:

 From: Fujii Masao masao.fu...@gmail.com

  However, isn't StandbyModeRequested true (= standby_mode set to on) to enable
 warm standby?


 We can set up warm-standby by using pg_standby even if standby_mode = off.


 I see.  However, I understand that pg_standby is a legacy feature, and the
 current way to set up a warm standby is to set standby_mode=on and
 restore_command in recovery.conf.  So I think it might not be necessary to
 support cascading replication with pg_standby, if supporting it would
 prevent better implementation of new features.

You are right about that, you should stick with the core features as much
as you can and limit the use of wrapper utilities. Since 9.1 and the
appearance of a built-in standby mode inside Postgres (with the introduction
of the GUC parameter hot_standby), it seems better to use that instead of
pg_standby, which would likely (personal opinion, feel free to scream at
me) be removed in a future release.




   I'm afraid people set max_wal_senders > 0 and hot_standby=on
 even on the primary server to make the contents of postgresql.conf
 identical
 on both the primary and the standby for easier configuration.  If so,
 normal
 archive recovery (PITR, not the standby recovery) would face the original
 problem -- unnecessary WAL accumulation in pg_xlog/.  So I'm wonder if
 AllowCascadeReplication() is enough.


 One idea is to add new GUC parameter like enable_cascade_replication
 and then prevent WAL file from being kept in pg_xlog if that parameter is
 off.
 But we cannot apply such change into the already-released version 9.2.


 Yes, I thought the same, too.  Adding a new GUC to enable cascading
 replication now would be a usability degradation.  But I have no idea of
 any better way.  It seems we need to take either v1 or v2 of the patch I
 sent. If we consider that we don't have to support cascading replication
 for a legacy pg_standby, v1 patch is definitely better because the standby
 server would not keep restored archive WAL in pg_xlog when cascading
 replication is not used.  Otherwise, we have to take v2 patch.

By reading this thread, -1 for the addition of a new GUC parameter related
to cascading, it looks like overkill for the possible gain. And +1 for
the removal of WAL file once it is replayed in archive recovery if
cascading replication is not allowed. However, what about
wal_keep_segments? Doesn't the server keep segments even if replication is
not allowed? If yes, the immediate WAL file drop after replay needs to take
care of that...
-- 
Michael


Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2013-07-31 Thread MauMau

Hello, Fujii san, all,

From: Fujii Masao masao.fu...@gmail.com

On Sun, Jul 28, 2013 at 7:59 AM, MauMau maumau...@gmail.com wrote:
Do you think this should be fixed?

I think so.


How should it be fixed?


What about removing the restored archived file as soon as it's replayed
if cascading replication is not enabled (i.e., max_wal_senders = 0 or
hot_standby = off)? This doesn't seem to break the existing behavior
in 9.2.


Please find attached the patch to fix the problem.  I changed to keep 
restored WAL files in pg_xlog/ only on a cascading standby server.  Could 
you review and commit this?


BTW, KeepFileRestoredFromArchive() is also called to keep timeline history 
files in pg_xlog/.  What is this for?  Is this necessary for recovery other 
than cascading standbys?  This seems to accumulate timeline history files 
forever in pg_xlog/.


Regards
MauMau



skip_wal_save.patch



Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2013-07-31 Thread Fujii Masao
On Wed, Jul 31, 2013 at 10:43 PM, MauMau maumau...@gmail.com wrote:
 Hello, Fujii san, all,

 From: Fujii Masao masao.fu...@gmail.com

 On Sun, Jul 28, 2013 at 7:59 AM, MauMau maumau...@gmail.com wrote:
 Do you think this should be fixed?

 I think so.

 How should it be fixed?


 What about removing the restored archived file as soon as it's replayed
 if cascading replication is not enabled (i.e., max_wal_senders = 0 or
 hot_standby = off)? This doesn't seem to break the existing behavior
 in 9.2.


 Please find attached the patch to fix the problem.  I changed to keep
 restored WAL files in pg_xlog/ only on a cascading standby server.  Could
 you review and commit this?

-   if (source == XLOG_FROM_ARCHIVE)
+   if (source == XLOG_FROM_ARCHIVE &&
+   StandbyModeRequested && AllowCascadeReplication())

I think that the condition of StandbyModeRequested should be removed
because someone might want to set up the cascade standby from the standby
of a warm-standby configuration.

 BTW, KeepFileRestoredFromArchive() is also called to keep timeline history
 files in pg_xlog/.  What is this for?  Is this necessary for recovery other
 than cascading standbys?

Yes. Please see 60df192aea0e6458f20301546e11f7673c102101

 This seems to accumulate timeline history files
 forever in pg_xlog/.

Yes, because basically there is no way to delete the timeline history files
from pg_xlog.

Regards,

-- 
Fujii Masao




Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2013-07-31 Thread MauMau

From: Fujii Masao masao.fu...@gmail.com

- if (source == XLOG_FROM_ARCHIVE)
+ if (source == XLOG_FROM_ARCHIVE &&
+ StandbyModeRequested && AllowCascadeReplication())

I think that the condition of StandbyModeRequested should be removed
because someone might want to set up the cascade standby from the standby
of a warm-standby configuration.


Fixed and attached the revised patch.

However, isn't StandbyModeRequested true (= standby_mode set to on) to enable
warm standby?  I'm afraid people set max_wal_senders > 0 and hot_standby=on
even on the primary server to make the contents of postgresql.conf identical 
on both the primary and the standby for easier configuration.  If so, normal 
archive recovery (PITR, not the standby recovery) would face the original 
problem -- unnecessary WAL accumulation in pg_xlog/.  So I wonder if
AllowCascadeReplication() is enough.


Please take either this patch or the previous one.

Regards
MauMau


skip_wal_save_v2.patch



Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2013-07-31 Thread Fujii Masao
On Thu, Aug 1, 2013 at 7:37 AM, MauMau maumau...@gmail.com wrote:
 From: Fujii Masao masao.fu...@gmail.com

 - if (source == XLOG_FROM_ARCHIVE)
 + if (source == XLOG_FROM_ARCHIVE &&
 + StandbyModeRequested && AllowCascadeReplication())

 I think that the condition of StandbyModeRequested should be removed
 because someone might want to set up the cascade standby from the standby
 of a warm-standby configuration.


 Fixed and attached the revised patch.

  However, isn't StandbyModeRequested true (= standby_mode set to on) to enable
 warm standby?

We can set up warm-standby by using pg_standby even if standby_mode = off.

  I'm afraid people set max_wal_senders > 0 and hot_standby=on
 even on the primary server to make the contents of postgresql.conf identical
 on both the primary and the standby for easier configuration.  If so, normal
 archive recovery (PITR, not the standby recovery) would face the original
 problem -- unnecessary WAL accumulation in pg_xlog/.  So I wonder if
 AllowCascadeReplication() is enough.

One idea is to add new GUC parameter like enable_cascade_replication
and then prevent WAL file from being kept in pg_xlog if that parameter is off.
But we cannot apply such change into the already-released version 9.2.

Regards,

-- 
Fujii Masao




Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2013-07-30 Thread MauMau

From: Fujii Masao masao.fu...@gmail.com
So, I think it is a bug that the disk space increases if not using cascading
replication.  Those who migrated from 9.1 and do not use 9.2 features would
be surprised like me.


Do you think this should be fixed?


I think so.


Thanks for your agreement.  I'll try to submit the patch as soon as possible 
so that the fix will be included in the next minor release and 9.3.  If it 
seems unexpectedly difficult, I'd like to consult you.




How should it be fixed?


What about removing the restored archived file as soon as it's replayed
if cascading replication is not enabled (i.e., max_wal_senders = 0 or
hot_standby = off)? This doesn't seem to break the existing behavior
in 9.2.


I'll consider this condition, but I wonder if 9.1 users are setting 
max_wal_senders > 0 and hot_standby=on even when just performing archive
recovery (e.g. to use the same recovery.conf for all cases).  What do you 
think?  Is there any other good condition to judge if cascading replication 
is used?


Regards
MauMau





Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2013-07-29 Thread Fujii Masao
On Sun, Jul 28, 2013 at 7:59 AM, MauMau maumau...@gmail.com wrote:
 Hello,

 I'm sorry I've been touching several things recently before fixing any of
 them.

 I've noticed undesirable disk space increase while performing archive
 recovery with PostgreSQL 9.3.  This happens with 9.2, too.

 I just performed archived recovery with the following parameters in
 recovery.conf.  I'm not using replication.

 restore_command = 'cp ...'
 recovery_target_timeline = 'latest'

 As the archive recovery progresses, the disk space used by $PGDATA/pg_xlog
 increases.  It seems that restored archive WAL files are accumulated there.
 This is considerable amount depending on the number of archived WAL files.
 In my case, the recovery failed because of the shortage of disk space.  This
 did not happen with 9.1.

 The cause appears to be KeepFileRestoredFromArchive().  This function saves
 restored archive WAL files in pg_xlog/.  I guess this is for cascading
 replication, a new feature added in 9.2.

Yes. Those accumulated WAL files will basically be removed when a
restartpoint ends. But if restartpoints are slow compared to WAL replay,
the problem you described would happen.

 So, I think it is a bug that the disk space increases if not using cascading
 replication.  Those who migrated from 9.1 and do not use 9.2 features would
 be surprised like me.


 Do you think this should be fixed?

I think so.

 How should it be fixed?

What about removing the restored archived file as soon as it's replayed
if cascading replication is not enabled (i.e., max_wal_senders = 0 or
hot_standby = off)? This doesn't seem to break the existing behavior
in 9.2.

Regards,

-- 
Fujii Masao

