Re: [HACKERS] [CORE] 7.4RC2 regression failur and not running stats collector process

2003-11-14 Thread Christopher Browne
[EMAIL PROTECTED] (Josh Berkus) writes:
>> Too bad we didn't figure this out yesterday.  We are now in code freeze
>> for 7.4 release, and I'm hesitant to apply a fix for what is arguably a
>> broken platform.  Core guys, time for a vote ... do we fix, or hold this
>> for 7.4.1?
>
> One thing I've not seen an answer to:  does Postgres run acceptably on other
> people's Solaris boxes?   If this bug is preventing running on Solaris at
> all, I'd say fix it ... Solaris is a major platform.   If it only affects
> users of one particular Solaris patch version, then we do a big warning and
> save it for 7.4.1.

For what it's worth, I have been running regression on Solaris with
numerous of the betas, and RC1 and [just now] RC2, with NO problems.

If the patch is deemed vital for others, it's possible that all I'm
reporting is one of the statistics that will be outnumbered by others.
(And in that case, I would be quick to test the patch to ensure it
causes no adverse side-effects.)

But it's not apparent that it is _vital_ here right now.
-- 
let name=cbbrowne and tld=libertyrms.info in name ^ @ ^ tld;;
http://dev6.int.libertyrms.com/
Christopher Browne
(416) 646 3304 x124 (land)



Re: [HACKERS] [CORE] 7.4RC2 regression failur and not running stats collector process

2003-11-14 Thread Tom Lane
Christopher Browne [EMAIL PROTECTED] writes:
> For what it's worth, I have been running regression on Solaris with
> numerous of the betas, and RC1 and [just now] RC2, with NO problems.

It seems clear that some Solaris installations are affected and some
are not.  Presumably there is some version difference or some local
configuration difference ... but since we don't know what the critical
factor is, we have no basis for guessing what fraction of Solaris
installations will see the problem.

> (And in that case, I would be quick to test the patch to ensure it
> causes no adverse side-effects.)

Here is the proposed patch --- please test it ASAP if you can.
This is against RC2.

regards, tom lane

*** src/backend/postmaster/pgstat.c.orig Fri Nov  7 16:55:50 2003
--- src/backend/postmaster/pgstat.c Fri Nov 14 15:02:14 2003
***************
*** 203,208 ****
--- 203,216 ----
          goto startup_failed;
      }
  
+     /*
+      * On some platforms, getaddrinfo_all() may return multiple addresses
+      * only one of which will actually work (eg, both IPv6 and IPv4 addresses
+      * when kernel will reject IPv6).  Worse, the failure may occur at the
+      * bind() or perhaps even connect() stage.  So we must loop through the
+      * results till we find a working combination.  We will generate LOG
+      * messages, but no error, for bogus combinations.
+      */
      for (addr = addrs; addr; addr = addr->ai_next)
      {
  #ifdef HAVE_UNIX_SOCKETS
***************
*** 210,262 ****
          if (addr->ai_family == AF_UNIX)
              continue;
  #endif
!         if ((pgStatSock = socket(addr->ai_family, SOCK_DGRAM, 0)) >= 0)
!             break;
!     }
  
!     if (!addr || pgStatSock < 0)
!     {
!         ereport(LOG,
!                 (errcode_for_socket_access(),
!                  errmsg("could not create socket for statistics collector: %m")));
!         goto startup_failed;
!     }
  
!     /*
!      * Bind it to a kernel assigned port on localhost and get the assigned
!      * port via getsockname().
!      */
!     if (bind(pgStatSock, addr->ai_addr, addr->ai_addrlen) < 0)
!     {
!         ereport(LOG,
!                 (errcode_for_socket_access(),
!                  errmsg("could not bind socket for statistics collector: %m")));
!         goto startup_failed;
!     }
  
!     freeaddrinfo_all(hints.ai_family, addrs);
!     addrs = NULL;
  
!     alen = sizeof(pgStatAddr);
!     if (getsockname(pgStatSock, (struct sockaddr *) &pgStatAddr, &alen) < 0)
!     {
!         ereport(LOG,
!                 (errcode_for_socket_access(),
!                  errmsg("could not get address of socket for statistics collector: %m")));
!         goto startup_failed;
      }
  
!     /*
!      * Connect the socket to its own address.  This saves a few cycles by
!      * not having to respecify the target address on every send. This also
!      * provides a kernel-level check that only packets from this same
!      * address will be received.
!      */
!     if (connect(pgStatSock, (struct sockaddr *) &pgStatAddr, alen) < 0)
      {
          ereport(LOG,
                  (errcode_for_socket_access(),
!                  errmsg("could not connect socket for statistics collector: %m")));
          goto startup_failed;
      }
  
--- 218,285 ----
          if (addr->ai_family == AF_UNIX)
              continue;
  #endif
!         /*
!          * Create the socket.
!          */
!         if ((pgStatSock = socket(addr->ai_family, SOCK_DGRAM, 0)) < 0)
!         {
!             ereport(LOG,
!                     (errcode_for_socket_access(),
!                      errmsg("could not create socket for statistics collector: %m")));
!             continue;
!         }
  
!         /*
!          * Bind it to a kernel assigned port on localhost and get the assigned
!          * port via getsockname().
!          */
!         if (bind(pgStatSock, addr->ai_addr, addr->ai_addrlen) < 0)
!         {
!             ereport(LOG,
!                     (errcode_for_socket_access(),
!                      errmsg("could not bind socket for statistics collector: %m")));
!             closesocket(pgStatSock);
!             pgStatSock = -1;
!             continue;
!         }
  
!         alen = sizeof(pgStatAddr);
!         if (getsockname(pgStatSock, (struct sockaddr *) &pgStatAddr, &alen) < 0)
!         {
!             ereport(LOG,
!                     (errcode_for_socket_access(),
!  
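
[Editorial sketch, not part of the original message.] The idea described in the patch comment, trying each address returned by the resolver until one survives socket(), bind(), getsockname() and connect(), can be illustrated outside PostgreSQL roughly as follows. This is a minimal sketch under stated assumptions: it uses plain POSIX getaddrinfo()/freeaddrinfo() and perror() instead of PostgreSQL's getaddrinfo_all(), freeaddrinfo_all() and ereport(), and the function name open_self_connected_udp_socket is hypothetical.

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL code): return a UDP socket bound to a
 * kernel-assigned port on localhost and connected to its own address, or -1
 * if none of the addresses returned by getaddrinfo() can be made to work.
 */
int
open_self_connected_udp_socket(void)
{
    struct addrinfo hints;
    struct addrinfo *addrs = NULL;
    struct addrinfo *addr;
    struct sockaddr_storage self;
    socklen_t   alen;
    int         sock = -1;
    int         rc;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;    /* may yield both IPv6 and IPv4 entries */
    hints.ai_socktype = SOCK_DGRAM;

    rc = getaddrinfo("localhost", NULL, &hints, &addrs);
    if (rc != 0)
    {
        fprintf(stderr, "getaddrinfo failed: %s\n", gai_strerror(rc));
        return -1;
    }

    for (addr = addrs; addr != NULL; addr = addr->ai_next)
    {
        /* A failure here is reported but is not fatal: try the next one. */
        sock = socket(addr->ai_family, SOCK_DGRAM, 0);
        if (sock < 0)
        {
            perror("could not create socket");
            continue;
        }

        /* Bind to port 0, read back the assigned address, connect to it. */
        alen = sizeof(self);
        if (bind(sock, addr->ai_addr, addr->ai_addrlen) < 0 ||
            getsockname(sock, (struct sockaddr *) &self, &alen) < 0 ||
            connect(sock, (struct sockaddr *) &self, alen) < 0)
        {
            perror("could not set up socket");
            close(sock);
            sock = -1;
            continue;
        }

        break;                      /* found a working combination */
    }

    freeaddrinfo(addrs);
    return sock;
}
```

A caller would treat a return value of -1 as "no usable address" and otherwise own the descriptor, closing it when done. The point of the structure is that the socket is closed and reset to -1 on every failed attempt, so a later candidate (for example the IPv4 entry after a rejected IPv6 one) starts from a clean state.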

Re: [HACKERS] [CORE] 7.4RC2 regression failur and not running stats collector process

2003-11-14 Thread Glenn Wiorek
Hmm, I know it's been a while since I used patch, but I seem to be having
problems applying it.  Perhaps my patch is outdated?

patch -b pgstat.c < patchfile
Looks like a new-style context diff.
Hunk #1 failed at line 203.
Hunk #2 failed at line 210.
Hunk #3 failed at line 284.
3 out of 3 hunks failed: saving reject to pgstat.c.rej

----- Original Message -----
From: Tom Lane [EMAIL PROTECTED]
To: Christopher Browne [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Friday, November 14, 2003 2:42 PM
Subject: Re: [HACKERS] [CORE] 7.4RC2 regression failur and not running stats collector process


> Christopher Browne [EMAIL PROTECTED] writes:
> > For what it's worth, I have been running regression on Solaris with
> > numerous of the betas, and RC1 and [just now] RC2, with NO problems.
>
> It seems clear that some Solaris installations are affected and some
> are not.  Presumably there is some version difference or some local
> configuration difference ... but since we don't know what the critical
> factor is, we have no basis for guessing what fraction of Solaris
> installations will see the problem.
>
> > (And in that case, I would be quick to test the patch to ensure it
> > causes no adverse side-effects.)
>
> Here is the proposed patch --- please test it ASAP if you can.
> This is against RC2.
>
> regards, tom lane
>
> *** src/backend/postmaster/pgstat.c.orig Fri Nov  7 16:55:50 2003
> --- src/backend/postmaster/pgstat.c Fri Nov 14 15:02:14 2003
> ***************
> *** 203,208 ****
> --- 203,216 ----
>           goto startup_failed;
>       }
>
> +     /*
> +      * On some platforms, getaddrinfo_all() may return multiple addresses
> +      * only one of which will actually work (eg, both IPv6 and IPv4 addresses
> +      * when kernel will reject IPv6).  Worse, the failure may occur at the
> +      * bind() or perhaps even connect() stage.  So we must loop through the
> +      * results till we find a working combination.  We will generate LOG
> +      * messages, but no error, for bogus combinations.
> +      */
>       for (addr = addrs; addr; addr = addr->ai_next)
>       {
>   #ifdef HAVE_UNIX_SOCKETS
> ***************
> *** 210,262 ****
>           if (addr->ai_family == AF_UNIX)
>               continue;
>   #endif
> !         if ((pgStatSock = socket(addr->ai_family, SOCK_DGRAM, 0)) >= 0)
> !             break;
> !     }
>
> !     if (!addr || pgStatSock < 0)
> !     {
> !         ereport(LOG,
> !                 (errcode_for_socket_access(),
> !                  errmsg("could not create socket for statistics collector: %m")));
> !         goto startup_failed;
> !     }
>
> !     /*
> !      * Bind it to a kernel assigned port on localhost and get the assigned
> !      * port via getsockname().
> !      */
> !     if (bind(pgStatSock, addr->ai_addr, addr->ai_addrlen) < 0)
> !     {
> !         ereport(LOG,
> !                 (errcode_for_socket_access(),
> !                  errmsg("could not bind socket for statistics collector: %m")));
> !         goto startup_failed;
> !     }
>
> !     freeaddrinfo_all(hints.ai_family, addrs);
> !     addrs = NULL;
>
> !     alen = sizeof(pgStatAddr);
> !     if (getsockname(pgStatSock, (struct sockaddr *) &pgStatAddr, &alen) < 0)
> !     {
> !         ereport(LOG,
> !                 (errcode_for_socket_access(),
> !                  errmsg("could not get address of socket for statistics collector: %m")));
> !         goto startup_failed;
>       }
>
> !     /*
> !      * Connect the socket to its own address.  This saves a few cycles by
> !      * not having to respecify the target address on every send. This also
> !      * provides a kernel-level check that only packets from this same
> !      * address will be received.
> !      */
> !     if (connect(pgStatSock, (struct sockaddr *) &pgStatAddr, alen) < 0)
>       {
>           ereport(LOG,
>                   (errcode_for_socket_access(),
> !                  errmsg("could not connect socket for statistics collector: %m")));
>           goto startup_failed;
>       }

> --- 218,285 ----
>           if (addr->ai_family == AF_UNIX)
>               continue;
>   #endif
> !         /*
> !          * Create the socket.
> !          */
> !         if ((pgStatSock = socket(addr->ai_family, SOCK_DGRAM, 0)) < 0)
> !         {
> !             ereport(LOG,
> !                     (errcode_for_socket_access(),
> !                      errmsg("could not create socket for statistics collector: %m")));
> !             continue;
> !         }
>
> !         /*
> !          * Bind it to a kernel assigned port on localhost and get the assigned
> !          * port via getsockname().
> !          */
> !         if (bind(pgStatSock, addr->ai_addr, addr->ai_addrlen) < 0)
> !         {
> !             ereport(LOG,
> !                     (errcode_for_socket_access(),
> !                      errmsg("could not bind socket for statistics collector: %m")));
> !             closesocket(pgStatSock);
> !             pgStatSock = -1;
> !             continue;
> !         }
>
> !         alen = sizeof(pgStatAddr);
> !         if (getsockname(pgStatSock, (struct sockaddr *) &pgStatAddr, &alen) < 0)
> !         {
> !             ereport(LOG,
> !                     (errcode_for_socket_access(),
> !                      errmsg("could not get address of socket for statistics collector: %m")));
> !             closesocket(pgStatSock);
> !             pgStatSock = -1;
> !             continue;
> !         }
>
> !         /*
> !          * Connect the socket to its own address.  This saves a few cycles by
> !          * not having to respecify the target address on every send. This also
> !          * provides a kernel-level check that only packets from this same
> !          * address will be received.
> !          */
> !         if (connect(pgStatSock, (struct sockaddr *) &pgStatAddr, alen) < 0)
> !         {
> !             ereport(LOG,
> !                     (errcode_for_socket_access(),
> !                      errmsg("could not connect socket