Re: [HACKERS] [CORE] 7.4RC2 regression failur and not running stats collector process
[EMAIL PROTECTED] (Josh Berkus) writes:
> Too bad we didn't figure this out yesterday. We are now in code freeze
> for 7.4 release, and I'm hesitant to apply a fix for what is arguably
> a broken platform. Core guys, time for a vote ... do we fix, or hold
> this for 7.4.1?
>
> One thing I've not seen an answer to: does Postgres run acceptably on
> other people's Solaris boxes? If this bug is preventing running on
> Solaris at all, I'd say fix it ... Solaris is a major platform. If it
> only affects users of one particular Solaris patch version, then we do
> a big warning and save it for 7.4.1.

For what it's worth, I have been running regression on Solaris with
numerous of the betas, and RC1 and [just now] RC2, with NO problems.

If the patch is deemed vital for others, it's possible that all I'm
reporting is one of the statistics that will be outnumbered by others.
(And in that case, I would be quick to test the patch to ensure it
causes no adverse side-effects.)

But it's not apparent that it is _vital_ here right now.
--
let name="cbbrowne" and tld="libertyrms.info" in name ^ "@" ^ tld;;
http://dev6.int.libertyrms.com/
Christopher Browne
(416) 646 3304 x124 (land)

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] [CORE] 7.4RC2 regression failur and not running stats collector process
Christopher Browne [EMAIL PROTECTED] writes:
> For what it's worth, I have been running regression on Solaris with
> numerous of the betas, and RC1 and [just now] RC2, with NO problems.

It seems clear that some Solaris installations are affected and some are
not. Presumably there is some version difference or some local
configuration difference ... but since we don't know what the critical
factor is, we have no basis for guessing what fraction of Solaris
installations will see the problem.

> (And in that case, I would be quick to test the patch to ensure it
> causes no adverse side-effects.)

Here is the proposed patch --- please test it ASAP if you can.
This is against RC2.

regards, tom lane

*** src/backend/postmaster/pgstat.c.orig	Fri Nov  7 16:55:50 2003
--- src/backend/postmaster/pgstat.c	Fri Nov 14 15:02:14 2003
***************
*** 203,208 ****
--- 203,216 ----
          goto startup_failed;
      }
+     /*
+      * On some platforms, getaddrinfo_all() may return multiple addresses
+      * only one of which will actually work (eg, both IPv6 and IPv4 addresses
+      * when kernel will reject IPv6).  Worse, the failure may occur at the
+      * bind() or perhaps even connect() stage.  So we must loop through the
+      * results till we find a working combination.  We will generate LOG
+      * messages, but no error, for bogus combinations.
+      */
      for (addr = addrs; addr; addr = addr->ai_next)
      {
  #ifdef HAVE_UNIX_SOCKETS
***************
*** 210,262 ****
          if (addr->ai_family == AF_UNIX)
              continue;
  #endif
!         if ((pgStatSock = socket(addr->ai_family, SOCK_DGRAM, 0)) >= 0)
!             break;
!     }
!     if (!addr || pgStatSock < 0)
!     {
!         ereport(LOG,
!                 (errcode_for_socket_access(),
!                  errmsg("could not create socket for statistics collector: %m")));
!         goto startup_failed;
!     }
!     /*
!      * Bind it to a kernel assigned port on localhost and get the assigned
!      * port via getsockname().
!      */
!     if (bind(pgStatSock, addr->ai_addr, addr->ai_addrlen) < 0)
!     {
!         ereport(LOG,
!                 (errcode_for_socket_access(),
!                  errmsg("could not bind socket for statistics collector: %m")));
!         goto startup_failed;
!     }
!     freeaddrinfo_all(hints.ai_family, addrs);
!     addrs = NULL;
!     alen = sizeof(pgStatAddr);
!     if (getsockname(pgStatSock, (struct sockaddr *) &pgStatAddr, &alen) < 0)
!     {
!         ereport(LOG,
!                 (errcode_for_socket_access(),
!                  errmsg("could not get address of socket for statistics collector: %m")));
!         goto startup_failed;
      }
!     /*
!      * Connect the socket to its own address.  This saves a few cycles by
!      * not having to respecify the target address on every send.  This also
!      * provides a kernel-level check that only packets from this same
!      * address will be received.
!      */
!     if (connect(pgStatSock, (struct sockaddr *) &pgStatAddr, alen) < 0)
      {
          ereport(LOG,
                  (errcode_for_socket_access(),
!                  errmsg("could not connect socket for statistics collector: %m")));
          goto startup_failed;
      }
--- 218,285 ----
          if (addr->ai_family == AF_UNIX)
              continue;
  #endif
!         /*
!          * Create the socket.
!          */
!         if ((pgStatSock = socket(addr->ai_family, SOCK_DGRAM, 0)) < 0)
!         {
!             ereport(LOG,
!                     (errcode_for_socket_access(),
!                      errmsg("could not create socket for statistics collector: %m")));
!             continue;
!         }
!         /*
!          * Bind it to a kernel assigned port on localhost and get the assigned
!          * port via getsockname().
!          */
!         if (bind(pgStatSock, addr->ai_addr, addr->ai_addrlen) < 0)
!         {
!             ereport(LOG,
!                     (errcode_for_socket_access(),
!                      errmsg("could not bind socket for statistics collector: %m")));
!             closesocket(pgStatSock);
!             pgStatSock = -1;
!             continue;
!         }
!         alen = sizeof(pgStatAddr);
!         if (getsockname(pgStatSock, (struct sockaddr *) &pgStatAddr, &alen) < 0)
!         {
!             ereport(LOG,
!                     (errcode_for_socket_access(),
Re: [HACKERS] [CORE] 7.4RC2 regression failur and not running stats collector process
Hmm, I know it's been a while since I used patch, but I seem to be having
problems applying it. Perhaps my patch is outdated?

patch -b pgstat.c patchfile
Looks like a new-style context diff.
Hunk #2 failed at line 203.
Hunk #2 failed at line 210.
Hunk #3 failed at line 284.
3 out of 3 hunks failed: saving reject to pgstat.c.rej

----- Original Message -----
From: Tom Lane [EMAIL PROTECTED]
To: Christopher Browne [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Friday, November 14, 2003 2:42 PM
Subject: Re: [HACKERS] [CORE] 7.4RC2 regression failur and not running stats collector process

> Christopher Browne [EMAIL PROTECTED] writes:
> > For what it's worth, I have been running regression on Solaris with
> > numerous of the betas, and RC1 and [just now] RC2, with NO problems.
>
> It seems clear that some Solaris installations are affected and some are
> not. Presumably there is some version difference or some local
> configuration difference ... but since we don't know what the critical
> factor is, we have no basis for guessing what fraction of Solaris
> installations will see the problem.
>
> > (And in that case, I would be quick to test the patch to ensure it
> > causes no adverse side-effects.)
>
> Here is the proposed patch --- please test it ASAP if you can.
> This is against RC2.
>
> regards, tom lane
>
> *** src/backend/postmaster/pgstat.c.orig	Fri Nov  7 16:55:50 2003
> --- src/backend/postmaster/pgstat.c	Fri Nov 14 15:02:14 2003
> ***************
> *** 203,208 ****
> --- 203,216 ----
>           goto startup_failed;
>       }
> +     /*
> +      * On some platforms, getaddrinfo_all() may return multiple addresses
> +      * only one of which will actually work (eg, both IPv6 and IPv4 addresses
> +      * when kernel will reject IPv6).  Worse, the failure may occur at the
> +      * bind() or perhaps even connect() stage.  So we must loop through the
> +      * results till we find a working combination.  We will generate LOG
> +      * messages, but no error, for bogus combinations.
> +      */
>       for (addr = addrs; addr; addr = addr->ai_next)
>       {
>   #ifdef HAVE_UNIX_SOCKETS
> ***************
> *** 210,262 ****
>           if (addr->ai_family == AF_UNIX)
>               continue;
>   #endif
> !         if ((pgStatSock = socket(addr->ai_family, SOCK_DGRAM, 0)) >= 0)
> !             break;
> !     }
> !     if (!addr || pgStatSock < 0)
> !     {
> !         ereport(LOG,
> !                 (errcode_for_socket_access(),
> !                  errmsg("could not create socket for statistics collector: %m")));
> !         goto startup_failed;
> !     }
> !     /*
> !      * Bind it to a kernel assigned port on localhost and get the assigned
> !      * port via getsockname().
> !      */
> !     if (bind(pgStatSock, addr->ai_addr, addr->ai_addrlen) < 0)
> !     {
> !         ereport(LOG,
> !                 (errcode_for_socket_access(),
> !                  errmsg("could not bind socket for statistics collector: %m")));
> !         goto startup_failed;
> !     }
> !     freeaddrinfo_all(hints.ai_family, addrs);
> !     addrs = NULL;
> !     alen = sizeof(pgStatAddr);
> !     if (getsockname(pgStatSock, (struct sockaddr *) &pgStatAddr, &alen) < 0)
> !     {
> !         ereport(LOG,
> !                 (errcode_for_socket_access(),
> !                  errmsg("could not get address of socket for statistics collector: %m")));
> !         goto startup_failed;
>       }
> !     /*
> !      * Connect the socket to its own address.  This saves a few cycles by
> !      * not having to respecify the target address on every send.  This also
> !      * provides a kernel-level check that only packets from this same
> !      * address will be received.
> !      */
> !     if (connect(pgStatSock, (struct sockaddr *) &pgStatAddr, alen) < 0)
>       {
>           ereport(LOG,
>                   (errcode_for_socket_access(),
> !                  errmsg("could not connect socket for statistics collector: %m")));
>           goto startup_failed;
>       }
> --- 218,285 ----
>           if (addr->ai_family == AF_UNIX)
>               continue;
>   #endif
> !         /*
> !          * Create the socket.
> !          */
> !         if ((pgStatSock = socket(addr->ai_family, SOCK_DGRAM, 0)) < 0)
> !         {
> !             ereport(LOG,
> !                     (errcode_for_socket_access(),
> !                      errmsg("could not create socket for statistics collector: %m")));
> !             continue;
> !         }
> !         /*
> !          * Bind it to a kernel assigned port on localhost and get the assigned
> !          * port via getsockname().
> !          */
> !         if (bind(pgStatSock, addr->ai_addr, addr->ai_addrlen) < 0)
> !         {
> !             ereport(LOG,
> !                     (errcode_for_socket_access(),
> !                      errmsg("could not bind socket for statistics collector: %m")));
> !             closesocket(pgStatSock);
> !             pgStatSock = -1;
> !             continue;
> !         }
> !         alen = sizeof(pgStatAddr);
> !         if (getsockname(pgStatSock, (struct sockaddr *) &pgStatAddr, &alen) < 0)
> !         {
> !             ereport(LOG,
> !                     (errcode_for_socket_access(),
> !                      errmsg("could not get address of socket for statistics collector: %m")));
> !             closesocket(pgStatSock);
> !             pgStatSock = -1;
> !             continue;
> !         }
> !         /*
> !          * Connect the socket to its own address.  This saves a few cycles by
> !          * not having to respecify the target address on every send.  This also
> !          * provides a kernel-level check that only packets from this same
> !          * address will be received.
> !          */
> !         if (connect(pgStatSock, (struct sockaddr *) &pgStatAddr, alen) < 0)
> !         {
> !             ereport(LOG,
> !                     (errcode_for_socket_access(),
> !                      errmsg("could not connect socket
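
The retry loop described in the patch comment (try each resolved address in turn until socket(), bind(), getsockname() and connect() all succeed, logging failures without aborting) can be sketched outside the PostgreSQL tree roughly as follows. This is a minimal standalone sketch, not the patch itself. It assumes plain POSIX getaddrinfo() in place of PostgreSQL's getaddrinfo_all(), perror() in place of ereport(), close() in place of closesocket(), and a NULL node name (which yields the loopback addresses) in place of a "localhost" lookup. The function name open_loopback_dgram is hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* expose getaddrinfo() under strict -std modes */
#endif

#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not PostgreSQL code: open a UDP socket bound to a
 * kernel-assigned port on the loopback interface and connected to its own
 * address.  Returns the descriptor, or -1 if no candidate address works.
 */
int
open_loopback_dgram(void)
{
    struct addrinfo hints;
    struct addrinfo *addrs = NULL;
    struct addrinfo *addr;
    int sock = -1;
    int rc;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* may yield both IPv6 and IPv4 */
    hints.ai_socktype = SOCK_DGRAM;

    /* NULL node without AI_PASSIVE means loopback; port "0" = kernel picks. */
    rc = getaddrinfo(NULL, "0", &hints, &addrs);
    if (rc != 0)
    {
        fprintf(stderr, "could not resolve loopback: %s\n", gai_strerror(rc));
        return -1;
    }

    for (addr = addrs; addr != NULL; addr = addr->ai_next)
    {
        struct sockaddr_storage self;
        socklen_t alen = sizeof(self);

        /* Any step may fail for a given family; log it and try the next. */
        sock = socket(addr->ai_family, SOCK_DGRAM, 0);
        if (sock < 0)
        {
            perror("could not create socket");
            continue;
        }
        if (bind(sock, addr->ai_addr, addr->ai_addrlen) < 0)
        {
            perror("could not bind socket");
            close(sock);
            sock = -1;
            continue;
        }
        if (getsockname(sock, (struct sockaddr *) &self, &alen) < 0)
        {
            perror("could not get address of socket");
            close(sock);
            sock = -1;
            continue;
        }
        /* Connect to our own address so only our own packets are accepted. */
        if (connect(sock, (struct sockaddr *) &self, alen) < 0)
        {
            perror("could not connect socket");
            close(sock);
            sock = -1;
            continue;
        }
        break;                        /* found a working combination */
    }

    freeaddrinfo(addrs);
    return sock;                      /* still -1 if every candidate failed */
}
```

A caller only needs to check the return value against -1. On a host whose kernel rejects IPv6, the sketch should print one or more "could not ..." lines for the IPv6 candidate and then succeed on IPv4, which is the behaviour the patch comment describes.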