Re: [HACKERS] Re: [BUGS] BUG #4566: pg_stop_backup() reports incorrect STOP WAL LOCATION

2010-02-17 Thread Takahiro Itagaki
I'd like to apply the patch to HEAD and previous releases because
the issue seems to be a bug in the core. Any comments or objections?

Some users actually use STOP WAL LOCATION in their backup script,
and they've countered the bug with 1/256 probability in recent days.


Fujii Masao masao.fu...@gmail.com wrote:

 On Fri, Feb 5, 2010 at 9:08 AM, Takahiro Itagaki
 itagaki.takah...@oss.ntt.co.jp wrote:
 
  Fujii Masao masao.fu...@gmail.com wrote:
 
  On Fri, Dec 5, 2008 at 11:41 PM, Randy Isbell jisb...@cisco.com wrote:
   An inconsistency exists between the segment name reported by
   pg_stop_backup() and the actual WAL file name.
  
   START WAL LOCATION: 10/FE1E2BAC (file 0002001000FE)
   STOP WAL LOCATION: 10/FF00 (file 0002001000FF)
 
  But it was rejected because its change might break the existing app.
 
  It might break existing applications if it returns FE instead of FF,
  but never-used filename surprises users. (IMO, the existing apps probably
  crash if FF returned, i.e, 1/256 of the time.)
 
  Should it return the *next* reasonable log filename instead of FF?
  For example, 00020020 for the above case.
 
 Here is the patch that avoids a nonexistent file name, according to
 Itagaki-san's suggestion. If we are crossing a logid boundary, the
 next reasonable file name is used instead of a nonexistent one.


Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [BUGS] BUG #4566: pg_stop_backup() reports incorrect STOP WAL LOCATION

2010-02-17 Thread Tom Lane
Takahiro Itagaki itagaki.takah...@oss.ntt.co.jp writes:
 I'd like to apply the patch to HEAD and previous releases because
 the issue seems to be a bug in the core. Any comments or objections?

The proposed patch seems quite ugly to me; not only the messy coding,
but the fact that it might return either the segment containing the
XLOG_BACKUP_END record or the next one.

I think an appropriate fix might just be s/XLByteToSeg/XLByteToPrevSeg/,
so that the result is always the segment containing the XLOG_BACKUP_END
record even when the record ends exactly at a segment boundary.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [BUGS] BUG #4566: pg_stop_backup() reports incorrect STOP WAL LOCATION

2010-02-15 Thread Fujii Masao
On Thu, Feb 4, 2010 at 4:28 PM, Fujii Masao masao.fu...@gmail.com wrote:
 Sorry for resurrecting an old argument.
 http://archives.postgresql.org/message-id/200812051441.mb5efg1m007...@wwwmaster.postgresql.org

 I got the complaint about this behavior of the current pg_stop_backup()
 in this morning. I thought that this is the bug, and created the patch.
 But it was rejected because its change might break the existing app.
 Though I'm not sure if there is really such an app. Anyway I think that
 something like the following statements should be added into the document.
 Thought?

 
 Note that the WAL file name in the backup history file cannot be used
 to determine which WAL files are required for the backup. Because it
 indicates the subsequent WAL file of the starting or ending one for
 the backup, when its location is exactly at a WAL file boundary (What
 is worse, sometimes it indicates a nonexistent WAL file).
 

Here is the patch that adds the above-mentioned note. I think this
should be back-patched up to 8.0. Thought?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
*** a/doc/src/sgml/backup.sgml
--- b/doc/src/sgml/backup.sgml
***
*** 859,864  SELECT pg_stop_backup();
--- 859,869 
  If you used the label to identify the associated dump file,
  then the archived history file is enough to tell you which dump file to
  restore.
+ Note that the WAL file name in the backup history file cannot be used
+ to determine which WAL files are required for the backup. Because it
+ indicates the subsequent WAL file of the starting or ending one for
+ the backup, when its location is exactly at a WAL file boundary (What
+ is worse, sometimes it indicates a nonexistent WAL file).
 /para
  
 para

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [BUGS] BUG #4566: pg_stop_backup() reports incorrect STOP WAL LOCATION

2010-02-15 Thread Fujii Masao
On Fri, Feb 5, 2010 at 9:08 AM, Takahiro Itagaki
itagaki.takah...@oss.ntt.co.jp wrote:

 Fujii Masao masao.fu...@gmail.com wrote:

 On Fri, Dec 5, 2008 at 11:41 PM, Randy Isbell jisb...@cisco.com wrote:
  An inconsistency exists between the segment name reported by
  pg_stop_backup() and the actual WAL file name.
 
  START WAL LOCATION: 10/FE1E2BAC (file 0002001000FE)
  STOP WAL LOCATION: 10/FF00 (file 0002001000FF)

 But it was rejected because its change might break the existing app.

 It might break existing applications if it returns FE instead of FF,
 but never-used filename surprises users. (IMO, the existing apps probably
 crash if FF returned, i.e, 1/256 of the time.)

 Should it return the *next* reasonable log filename instead of FF?
 For example, 00020020 for the above case.

Here is the patch that avoids a nonexistent file name, according to
Itagaki-san's suggestion. If we are crossing a logid boundary, the
next reasonable file name is used instead of a nonexistent one.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***
*** 8057,8063  pg_stop_backup(PG_FUNCTION_ARGS)
  	 */
  	RequestXLogSwitch();
  
! 	XLByteToSeg(stoppoint, _logId, _logSeg);
  	XLogFileName(stopxlogfilename, ThisTimeLineID, _logId, _logSeg);
  
  	/* Use the log timezone here, not the session timezone */
--- 8057,8078 
  	 */
  	RequestXLogSwitch();
  
! 	if (stoppoint.xrecoff = XLogSegSize)
! 	{
! 		XLogRecPtr	recptr = stoppoint;
! 
! 		/*
! 		 * Since xlog segment file name is calculated by using XLByteToSeg,
! 		 * it might indicate a nonexistent file (i.e., which ends in FF)
! 		 * when we are crossing a logid boundary. In this case, we use the
! 		 * next reasonable file name instead of nonexistent one.
! 		 */
! 		recptr.xlogid += 1;
! 		recptr.xrecoff = XLOG_BLCKSZ;
! 		XLByteToSeg(recptr, _logId, _logSeg);
! 	}
! 	else
! 		XLByteToSeg(stoppoint, _logId, _logSeg);
  	XLogFileName(stopxlogfilename, ThisTimeLineID, _logId, _logSeg);
  
  	/* Use the log timezone here, not the session timezone */

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [BUGS] BUG #4566: pg_stop_backup() reports incorrect STOP WAL LOCATION

2010-02-04 Thread Takahiro Itagaki

Fujii Masao masao.fu...@gmail.com wrote:

 On Fri, Dec 5, 2008 at 11:41 PM, Randy Isbell jisb...@cisco.com wrote:
  An inconsistency exists between the segment name reported by
  pg_stop_backup() and the actual WAL file name.
 
  START WAL LOCATION: 10/FE1E2BAC (file 0002001000FE)
  STOP WAL LOCATION: 10/FF00 (file 0002001000FF)

 But it was rejected because its change might break the existing app.

It might break existing applications if it returns FE instead of FF,
but never-used filename surprises users. (IMO, the existing apps probably
crash if FF returned, i.e, 1/256 of the time.)

Should it return the *next* reasonable log filename instead of FF?
For example, 00020020 for the above case.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [BUGS] BUG #4566: pg_stop_backup() reports incorrect STOP WAL LOCATION

2010-02-04 Thread Fujii Masao
On Fri, Feb 5, 2010 at 9:08 AM, Takahiro Itagaki
itagaki.takah...@oss.ntt.co.jp wrote:
 But it was rejected because its change might break the existing app.

 It might break existing applications if it returns FE instead of FF,
 but never-used filename surprises users. (IMO, the existing apps probably
 crash if FF returned, i.e, 1/256 of the time.)

 Should it return the *next* reasonable log filename instead of FF?
 For example, 00020020 for the above case.

I wonder if that change also breaks the existing app. But since
I've never seen the app that doesn't use that filename at face
value, I agree to change the existing (odd for me) behavior of
pg_stop_backup().

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [BUGS] BUG #4566: pg_stop_backup() reports incorrect STOP WAL LOCATION

2010-02-03 Thread Fujii Masao
On Fri, Dec 5, 2008 at 11:41 PM, Randy Isbell jisb...@cisco.com wrote:
 An inconsistency exists between the segment name reported by
 pg_stop_backup() and the actual WAL file name.


 SELECT pg_start_backup('filename');
         pg_start_backup
        -
         10/FE1E2BAC
        (1 row)

 Later:
 SELECT pg_stop_backup();
         pg_stop_backup
        
         10/FF00
        (1 row)

 The resulting *.backup file:

 START WAL LOCATION: 10/FE1E2BAC (file 0002001000FE)
 STOP WAL LOCATION: 10/FF00 (file 0002001000FF)
 CHECKPOINT LOCATION: 10/FE1E2BAC
 START TIME: 2008-11-09 01:15:06 CST
 LABEL: /bck/db/sn200811090115.tar.gz
 STOP TIME: 2008-11-09 01:15:48 CST

 In my 8.3.4 instance, WAL file naming occurs as:

 ...
 0001000300FD
 0001000300FE
 00010004
 000100040001
 ...

 WAL files never end in 'FF'.  This causes a problem when trying to collect
 the ending WAL file for backup.

Sorry for resurrecting an old argument.
http://archives.postgresql.org/message-id/200812051441.mb5efg1m007...@wwwmaster.postgresql.org

I got the complaint about this behavior of the current pg_stop_backup()
in this morning. I thought that this is the bug, and created the patch.
But it was rejected because its change might break the existing app.
Though I'm not sure if there is really such an app. Anyway I think that
something like the following statements should be added into the document.
Thought?


Note that the WAL file name in the backup history file cannot be used
to determine which WAL files are required for the backup. Because it
indicates the subsequent WAL file of the starting or ending one for
the backup, when its location is exactly at a WAL file boundary (What
is worse, sometimes it indicates a nonexistent WAL file).


Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers