Re: [HACKERS] Your review of pg_receivexlog/pg_basebackup

2011-11-03 Thread Magnus Hagander
On Tue, Nov 1, 2011 at 05:53, Fujii Masao masao.fu...@gmail.com wrote:
 On Tue, Nov 1, 2011 at 3:08 AM, Magnus Hagander mag...@hagander.net wrote:
 On Fri, Oct 28, 2011 at 08:46, Fujii Masao masao.fu...@gmail.com wrote:
 On Thu, Oct 27, 2011 at 11:14 PM, Magnus Hagander mag...@hagander.net 
 wrote:
 Here's a version that does this. Turns out this requires a lot less
 code than what was previously in there, which is always nice.

 We still need to solve the other part which is how to deal with the
 partial files on restore. But this is definitely a cleaner way from a
 pure pg_receivexlog perspective.

 Comments/reviews?

 Looks good.

 Minor comment:
 the source code comment of FindStreamingStart() seems to need to be updated.

 Here's an updated patch that both includes this update to the comment,
 and also the functionality to pre-pad files to 16Mb. This also seems
 to have simplified the code, which is a nice bonus.

 Here are the comments:

 In open_walfile(), zerobuf needs to be free'd after use of it.

Ooops, yes.


 +       f = open(fn, O_WRONLY | O_CREAT | PG_BINARY, 0666);

 We should use S_IRUSR | S_IWUSR instead of 0666 as a file access modes?

Agreed, changed.


 +               if (write(f, zerobuf, XLOG_BLCKSZ) != XLOG_BLCKSZ)
 +               {
 +                       fprintf(stderr, _(%s: could not pad WAL segment %s: 
 %s\n),
 +                                       progname, fn, strerror(errno));
 +                       close(f);
 +                       return -1;
 +               }

 When write() fails, we should delete the partial WAL file, like
 XLogFileInit() does?

Yes, that's probably a good idae. Added a simple unlink() call
directly after the close().

 If not, subsequent pg_receivexlog always fails unless a user deletes
 it manually.
 Because open_walfile() always fails when it finds an existing partial WAL 
 file.

 When open_walfile() fails, pg_receivexlog exits without closing the 
 connection.
 I don't think this is good error handling. But this issue itself is
 not what we're
 trying to address now. So I think you can commit separately from current 
 patch.

Wow, when looking into that, there was a nice bug in open_walfile -
when the file failed to open, it would write that error message but
not return - then proceed to write a second error message from fstat.
Oops.

Anyway - yes, the return value of ReceiveXLogStream isn't checked at
all. That's certainly not very nice. I'll go fix that too.

I'll apply the patch with the fixes you've mentioned above. Please
check master again in a few minutes. Thanks!


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Your review of pg_receivexlog/pg_basebackup

2011-10-31 Thread Magnus Hagander
On Fri, Oct 28, 2011 at 08:46, Fujii Masao masao.fu...@gmail.com wrote:
 On Thu, Oct 27, 2011 at 11:14 PM, Magnus Hagander mag...@hagander.net wrote:
 Here's a version that does this. Turns out this requires a lot less
 code than what was previously in there, which is always nice.

 We still need to solve the other part which is how to deal with the
 partial files on restore. But this is definitely a cleaner way from a
 pure pg_receivexlog perspective.

 Comments/reviews?

 Looks good.

 Minor comment:
 the source code comment of FindStreamingStart() seems to need to be updated.

Here's an updated patch that both includes this update to the comment,
and also the functionality to pre-pad files to 16Mb. This also seems
to have simplified the code, which is a nice bonus.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
*** a/src/bin/pg_basebackup/pg_receivexlog.c
--- b/src/bin/pg_basebackup/pg_receivexlog.c
***
*** 71,104  usage(void)
  static bool
  segment_callback(XLogRecPtr segendpos, uint32 timeline)
  {
- 	char		fn[MAXPGPATH];
- 	struct stat statbuf;
- 
  	if (verbose)
  		fprintf(stderr, _(%s: finished segment at %X/%X (timeline %u)\n),
  progname, segendpos.xlogid, segendpos.xrecoff, timeline);
  
  	/*
- 	 * Check if there is a partial file for the name we just finished, and if
- 	 * there is, remove it under the assumption that we have now got all the
- 	 * data we need.
- 	 */
- 	segendpos.xrecoff /= XLOG_SEG_SIZE;
- 	PrevLogSeg(segendpos.xlogid, segendpos.xrecoff);
- 	snprintf(fn, sizeof(fn), %s/%08X%08X%08X.partial,
- 			 basedir, timeline,
- 			 segendpos.xlogid,
- 			 segendpos.xrecoff);
- 	if (stat(fn, statbuf) == 0)
- 	{
- 		/* File existed, get rid of it */
- 		if (verbose)
- 			fprintf(stderr, _(%s: removing file \%s\\n),
- 	progname, fn);
- 		unlink(fn);
- 	}
- 
- 	/*
  	 * Never abort from this - we handle all aborting in continue_streaming()
  	 */
  	return false;
--- 71,81 
***
*** 119,127  continue_streaming(void)
  /*
   * Determine starting location for streaming, based on:
   * 1. If there are existing xlog segments, start at the end of the last one
!  * 2. If the last one is a partial segment, rename it and start over, since
!  *	  we don't sync after every write.
!  * 3. If no existing xlog exists, start from the beginning of the current
   *	  WAL segment.
   */
  static XLogRecPtr
--- 96,103 
  /*
   * Determine starting location for streaming, based on:
   * 1. If there are existing xlog segments, start at the end of the last one
!  *that is complete (size matches XLogSegSize)
!  * 2. If no valid xlog exists, start from the beginning of the current
   *	  WAL segment.
   */
  static XLogRecPtr
***
*** 133,139  FindStreamingStart(XLogRecPtr currentpos, uint32 currenttimeline)
  	bool		b;
  	uint32		high_log = 0;
  	uint32		high_seg = 0;
- 	bool		partial = false;
  
  	dir = opendir(basedir);
  	if (dir == NULL)
--- 109,114 
***
*** 195,201  FindStreamingStart(XLogRecPtr currentpos, uint32 currenttimeline)
  			disconnect_and_exit(1);
  		}
  
! 		if (statbuf.st_size == 16 * 1024 * 1024)
  		{
  			/* Completed segment */
  			if (log  high_log ||
--- 170,176 
  			disconnect_and_exit(1);
  		}
  
! 		if (statbuf.st_size == XLOG_SEG_SIZE)
  		{
  			/* Completed segment */
  			if (log  high_log ||
***
*** 208,244  FindStreamingStart(XLogRecPtr currentpos, uint32 currenttimeline)
  		}
  		else
  		{
! 			/*
! 			 * This is a partial file. Rename it out of the way.
! 			 */
! 			char		newfn[MAXPGPATH];
! 
! 			fprintf(stderr, _(%s: renaming partial file \%s\ to \%s.partial\\n),
! 	progname, dirent-d_name, dirent-d_name);
! 
! 			snprintf(newfn, sizeof(newfn), %s/%s.partial,
! 	 basedir, dirent-d_name);
! 
! 			if (stat(newfn, statbuf) == 0)
! 			{
! /*
!  * XXX: perhaps we should only error out if the existing file
!  * is larger?
!  */
! fprintf(stderr, _(%s: file \%s\ already exists. Check and clean up manually.\n),
! 		progname, newfn);
! disconnect_and_exit(1);
! 			}
! 			if (rename(fullpath, newfn) != 0)
! 			{
! fprintf(stderr, _(%s: could not rename \%s\ to \%s\: %s\n),
! 		progname, fullpath, newfn, strerror(errno));
! disconnect_and_exit(1);
! 			}
! 
! 			/* Don't continue looking for more, we assume this is the last */
! 			partial = true;
! 			break;
  		}
  	}
  
--- 183,191 
  		}
  		else
  		{
! 			fprintf(stderr, _(%s: segment file '%s' is incorrect size %d, skipping\n),
! 	progname, dirent-d_name, (int) statbuf.st_size);
! 			continue;
  		}
  	}
  
***
*** 247,263  FindStreamingStart(XLogRecPtr currentpos, uint32 currenttimeline)
  	if (high_log  0 || high_seg  0)
  	{
  		XLogRecPtr	high_ptr;
! 
! 		if (!partial)
! 		{
! 			/*
! 			 * If the segment was partial, the pointer is already at the right
! 			 * location since we want to 

Re: [HACKERS] Your review of pg_receivexlog/pg_basebackup

2011-10-31 Thread Fujii Masao
On Tue, Nov 1, 2011 at 3:08 AM, Magnus Hagander mag...@hagander.net wrote:
 On Fri, Oct 28, 2011 at 08:46, Fujii Masao masao.fu...@gmail.com wrote:
 On Thu, Oct 27, 2011 at 11:14 PM, Magnus Hagander mag...@hagander.net 
 wrote:
 Here's a version that does this. Turns out this requires a lot less
 code than what was previously in there, which is always nice.

 We still need to solve the other part which is how to deal with the
 partial files on restore. But this is definitely a cleaner way from a
 pure pg_receivexlog perspective.

 Comments/reviews?

 Looks good.

 Minor comment:
 the source code comment of FindStreamingStart() seems to need to be updated.

 Here's an updated patch that both includes this update to the comment,
 and also the functionality to pre-pad files to 16Mb. This also seems
 to have simplified the code, which is a nice bonus.

Here are the comments:

In open_walfile(), zerobuf needs to be free'd after use of it.

+   f = open(fn, O_WRONLY | O_CREAT | PG_BINARY, 0666);

We should use S_IRUSR | S_IWUSR instead of 0666 as a file access modes?

+   if (write(f, zerobuf, XLOG_BLCKSZ) != XLOG_BLCKSZ)
+   {
+   fprintf(stderr, _(%s: could not pad WAL segment %s: 
%s\n),
+   progname, fn, strerror(errno));
+   close(f);
+   return -1;
+   }

When write() fails, we should delete the partial WAL file, like
XLogFileInit() does?
If not, subsequent pg_receivexlog always fails unless a user deletes
it manually.
Because open_walfile() always fails when it finds an existing partial WAL file.

When open_walfile() fails, pg_receivexlog exits without closing the connection.
I don't think this is good error handling. But this issue itself is
not what we're
trying to address now. So I think you can commit separately from current patch.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Your review of pg_receivexlog/pg_basebackup

2011-10-28 Thread Fujii Masao
On Thu, Oct 27, 2011 at 11:14 PM, Magnus Hagander mag...@hagander.net wrote:
 Here's a version that does this. Turns out this requires a lot less
 code than what was previously in there, which is always nice.

 We still need to solve the other part which is how to deal with the
 partial files on restore. But this is definitely a cleaner way from a
 pure pg_receivexlog perspective.

 Comments/reviews?

Looks good.

Minor comment:
the source code comment of FindStreamingStart() seems to need to be updated.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Your review of pg_receivexlog/pg_basebackup

2011-10-27 Thread Magnus Hagander
On Wed, Oct 26, 2011 at 09:52, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 (CC'ing pgsql-hackers, this started as an IM discussion yesterday but really
 belongs in the archives)

 On 25.10.2011 23:52, Magnus Hagander wrote:

 There's a tiny chance to get incomplete xlog files with pg_receivexlog if
 you crash:
 1. pg_receivexlog finishes write()ing a file but system crashes before
 fsync() finishes.
 2. When pg_receivexlog restarts after crash, the last WAL file was not
 fully flushed to disk, with
 holes in the middle, but it has the right length. pg_receivexlog will
 continue streaming from the next file.
 not sure if we care about such a narrow window, but maybe we do

 So how would we go about fixing that?  Always unlink the last file in
 the directory and try from there would seem dangerous too - what if
 it's not available on the master anymore, then we might have given up
 on data...

 Start streaming from the beginning of the last segment, but don't unlink it
 first. Just overwrite it as you receive the data.

 Or, always create new xlog file as 0001000100D3.partial, and
 only when it's fully written, fsync it, and then rename it to
 0001000100D3. Then you know that if a file doesn't have the
 .partial suffix, it's complete. The fact that the last partial file always
 has the .partial suffix needs some extra pushups in the restore_command,
 though.

Here's a version that does this. Turns out this requires a lot less
code than what was previously in there, which is always nice.

We still need to solve the other part which is how to deal with the
partial files on restore. But this is definitely a cleaner way from a
pure pg_receivexlog perspective.

Comments/reviews?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
*** a/src/bin/pg_basebackup/pg_receivexlog.c
--- b/src/bin/pg_basebackup/pg_receivexlog.c
***
*** 71,104  usage(void)
  static bool
  segment_callback(XLogRecPtr segendpos, uint32 timeline)
  {
- 	char		fn[MAXPGPATH];
- 	struct stat statbuf;
- 
  	if (verbose)
  		fprintf(stderr, _(%s: finished segment at %X/%X (timeline %u)\n),
  progname, segendpos.xlogid, segendpos.xrecoff, timeline);
  
  	/*
- 	 * Check if there is a partial file for the name we just finished, and if
- 	 * there is, remove it under the assumption that we have now got all the
- 	 * data we need.
- 	 */
- 	segendpos.xrecoff /= XLOG_SEG_SIZE;
- 	PrevLogSeg(segendpos.xlogid, segendpos.xrecoff);
- 	snprintf(fn, sizeof(fn), %s/%08X%08X%08X.partial,
- 			 basedir, timeline,
- 			 segendpos.xlogid,
- 			 segendpos.xrecoff);
- 	if (stat(fn, statbuf) == 0)
- 	{
- 		/* File existed, get rid of it */
- 		if (verbose)
- 			fprintf(stderr, _(%s: removing file \%s\\n),
- 	progname, fn);
- 		unlink(fn);
- 	}
- 
- 	/*
  	 * Never abort from this - we handle all aborting in continue_streaming()
  	 */
  	return false;
--- 71,81 
***
*** 133,139  FindStreamingStart(XLogRecPtr currentpos, uint32 currenttimeline)
  	bool		b;
  	uint32		high_log = 0;
  	uint32		high_seg = 0;
- 	bool		partial = false;
  
  	dir = opendir(basedir);
  	if (dir == NULL)
--- 110,115 
***
*** 195,201  FindStreamingStart(XLogRecPtr currentpos, uint32 currenttimeline)
  			disconnect_and_exit(1);
  		}
  
! 		if (statbuf.st_size == 16 * 1024 * 1024)
  		{
  			/* Completed segment */
  			if (log  high_log ||
--- 171,177 
  			disconnect_and_exit(1);
  		}
  
! 		if (statbuf.st_size == XLOG_SEG_SIZE)
  		{
  			/* Completed segment */
  			if (log  high_log ||
***
*** 208,244  FindStreamingStart(XLogRecPtr currentpos, uint32 currenttimeline)
  		}
  		else
  		{
! 			/*
! 			 * This is a partial file. Rename it out of the way.
! 			 */
! 			char		newfn[MAXPGPATH];
! 
! 			fprintf(stderr, _(%s: renaming partial file \%s\ to \%s.partial\\n),
! 	progname, dirent-d_name, dirent-d_name);
! 
! 			snprintf(newfn, sizeof(newfn), %s/%s.partial,
! 	 basedir, dirent-d_name);
! 
! 			if (stat(newfn, statbuf) == 0)
! 			{
! /*
!  * XXX: perhaps we should only error out if the existing file
!  * is larger?
!  */
! fprintf(stderr, _(%s: file \%s\ already exists. Check and clean up manually.\n),
! 		progname, newfn);
! disconnect_and_exit(1);
! 			}
! 			if (rename(fullpath, newfn) != 0)
! 			{
! fprintf(stderr, _(%s: could not rename \%s\ to \%s\: %s\n),
! 		progname, fullpath, newfn, strerror(errno));
! disconnect_and_exit(1);
! 			}
! 
! 			/* Don't continue looking for more, we assume this is the last */
! 			partial = true;
! 			break;
  		}
  	}
  
--- 184,192 
  		}
  		else
  		{
! 			fprintf(stderr, _(%s: segment file '%s' is incorrect size %d, skipping\n),
! 	progname, dirent-d_name, (int) statbuf.st_size);
! 			continue;
  		}
  	}
  
***
*** 247,263  FindStreamingStart(XLogRecPtr currentpos, uint32 currenttimeline)
  	if (high_log  

Re: [HACKERS] Your review of pg_receivexlog/pg_basebackup

2011-10-26 Thread Heikki Linnakangas
(CC'ing pgsql-hackers, this started as an IM discussion yesterday but 
really belongs in the archives)


On 25.10.2011 23:52, Magnus Hagander wrote:

There's a tiny chance to get incomplete xlog files with pg_receivexlog if you 
crash:
1. pg_receivexlog finishes write()ing a file but system crashes before fsync() 
finishes.
2. When pg_receivexlog restarts after crash, the last WAL file was not fully 
flushed to disk, with
holes in the middle, but it has the right length. pg_receivexlog will continue 
streaming from the next file.
not sure if we care about such a narrow window, but maybe we do


So how would we go about fixing that?  Always unlink the last file in
the directory and try from there would seem dangerous too - what if
it's not available on the master anymore, then we might have given up
on data...


Start streaming from the beginning of the last segment, but don't unlink 
it first. Just overwrite it as you receive the data.


Or, always create new xlog file as 0001000100D3.partial, 
and only when it's fully written, fsync it, and then rename it to 
0001000100D3. Then you know that if a file doesn't have 
the .partial suffix, it's complete. The fact that the last partial file 
always has the .partial suffix needs some extra pushups in the 
restore_command, though.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers