On Fri, Jul 26, 2013 at 01:27:34PM -0400, Bruce Momjian wrote:
> On Thu, Jul 25, 2013 at 10:57:28AM -0400, Bruce Momjian wrote:
> > Everyone should be aware that the 9.3 pg_upgrade -j/--jobs option on
> > Windows is currently broken, and hopefully will be fixed by the next
> > beta.
> > 
> > Someone at PGDay UK told me they were getting pg_upgrade -j crashes on
> > Windows.  Andrew Dunstan was able to reproduce the crash, and that has
> > been fixed, but there is still a race condition that I am trying to
> > diagnose.
> 
> After three days of testing and creating a Windows MinGW build
> environment (with Andrew's help), I have come up with the attached patch
> which fixes the pg_upgrade -j race condition on Windows.  In summary,
> creating a file with fopen() from a non-primary thread and then calling
> system() soon after can result in a file-in-use error.  The solution was
> to issue the fopen() after system() in such cases.
> 
> This would be applied to head and 9.3.

Sorry, patch attached.

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/contrib/pg_upgrade/exec.c b/contrib/pg_upgrade/exec.c
new file mode 100644
index ef123a8..3ec6e1c
*** a/contrib/pg_upgrade/exec.c
--- b/contrib/pg_upgrade/exec.c
*************** static void validate_exec(const char *di
*** 21,26 ****
--- 21,27 ----
  
  #ifdef WIN32
  static int	win32_check_directory_write_permissions(void);
+ DWORD       mainThreadId = 0;
  #endif
  
  
*************** static int	win32_check_directory_write_p
*** 37,48 ****
   * If throw_error is true, this raises a PG_FATAL error and pg_upgrade
   * terminates; otherwise it is just reported as PG_REPORT and exec_prog()
   * returns false.
   */
  bool
  exec_prog(const char *log_file, const char *opt_log_file,
  		  bool throw_error, const char *fmt,...)
  {
! 	int			result;
  	int			written;
  
  #define MAXCMDLEN (2 * MAXPGPATH)
--- 38,51 ----
   * If throw_error is true, this raises a PG_FATAL error and pg_upgrade
   * terminates; otherwise it is just reported as PG_REPORT and exec_prog()
   * returns false.
+  *
+  * The code requires it be called first from the primary thread on Windows.
   */
  bool
  exec_prog(const char *log_file, const char *opt_log_file,
  		  bool throw_error, const char *fmt,...)
  {
! 	int			result = 0;
  	int			written;
  
  #define MAXCMDLEN (2 * MAXPGPATH)
*************** exec_prog(const char *log_file, const ch
*** 50,55 ****
--- 53,64 ----
  	FILE	   *log;
  	va_list		ap;
  
+ #ifdef WIN32
+ 	/* We assume we are called from the primary thread first */
+ 	if (mainThreadId == 0)
+ 		mainThreadId = GetCurrentThreadId();
+ #endif
+ 
  	written = strlcpy(cmd, SYSTEMQUOTE, sizeof(cmd));
  	va_start(ap, fmt);
  	written += vsnprintf(cmd + written, MAXCMDLEN - written, fmt, ap);
*************** exec_prog(const char *log_file, const ch
*** 61,66 ****
--- 70,91 ----
  	if (written >= MAXCMDLEN)
  		pg_log(PG_FATAL, "command too long\n");
  
+ 	pg_log(PG_VERBOSE, "%s\n", cmd);
+ 
+ #ifdef WIN32
+ 	/*
+ 	 * For some reason, Windows issues a file-in-use error if we write data
+ 	 * to the log file from a non-primary thread just before we create a
+ 	 * subprocess that also writes to the same log file.  One fix is to
+ 	 * sleep for 100ms.  A cleaner fix is to write to the log file _after_
+ 	 * the subprocess has completed, so we do this only when writing from
+ 	 * a non-primary thread.  fflush(), running system() twice, and
+ 	 * pre-creating the file do not see to help.
+ 	 */
+ 	if (mainThreadId != GetCurrentThreadId())
+ 		result = system(cmd);
+ #endif
+ 
  	log = fopen(log_file, "a");
  
  #ifdef WIN32
*************** exec_prog(const char *log_file, const ch
*** 84,94 ****
  
  	if (log == NULL)
  		pg_log(PG_FATAL, "cannot write to log file %s\n", log_file);
  #ifdef WIN32
! 	fprintf(log, "\n\n");
  #endif
- 	pg_log(PG_VERBOSE, "%s\n", cmd);
  	fprintf(log, "command: %s\n", cmd);
  
  	/*
  	 * In Windows, we must close the log file at this point so the file is not
--- 109,126 ----
  
  	if (log == NULL)
  		pg_log(PG_FATAL, "cannot write to log file %s\n", log_file);
+ 
  #ifdef WIN32
! 	/* Are we printing "command:" before its output? */
! 	if (mainThreadId == GetCurrentThreadId())
! 		fprintf(log, "\n\n");
  #endif
  	fprintf(log, "command: %s\n", cmd);
+ #ifdef WIN32
+ 	/* Are we printing "command:" after its output? */
+ 	if (mainThreadId != GetCurrentThreadId())
+ 		fprintf(log, "\n\n");
+ #endif
  
  	/*
  	 * In Windows, we must close the log file at this point so the file is not
*************** exec_prog(const char *log_file, const ch
*** 96,102 ****
  	 */
  	fclose(log);
  
! 	result = system(cmd);
  
  	if (result != 0)
  	{
--- 128,138 ----
  	 */
  	fclose(log);
  
! #ifdef WIN32
! 	/* see comment above */
! 	if (mainThreadId == GetCurrentThreadId())
! #endif
! 		result = system(cmd);
  
  	if (result != 0)
  	{
*************** exec_prog(const char *log_file, const ch
*** 118,124 ****
  	}
  
  #ifndef WIN32
- 
  	/*
  	 * We can't do this on Windows because it will keep the "pg_ctl start"
  	 * output filename open until the server stops, so we do the \n\n above on
--- 154,159 ----
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to