Hello,

This patch fixes peer reviving mechanism for DNS refreshes.

Every hour, peerRefreshDNS() performs a DNS lookup of all cache_peer
addresses. Before this patch, even if DNS lookup results did not change,
the associated peerDNSConfigure() code silently cleared dead peer
marking, if any (CachePeer::tcp_up counter).  Forcefully reviving dead
peers every hour can lead to transaction delays (and delays may lead to
failures) due to connection timeouts when using a still dead peer.

This patch starts standard TCP probing instead of pointless dead peer
reviving, correctly refreshing peer state.  The primary goal is to cover
situation when DNS refresh changes peer addresses list. However TCP
probing may be useful for other situations either, without much overhead
(that is why it starts unconditionally).  For example, we need it when
DNS refresh returns the same addresses list but in different order. Also
it should help with dead idle dead peers detection.


Thanks,
Eduard.
Do not revive unconditionally dead peers after DNS refresh.

Every hour, peerRefreshDNS() performs a DNS lookup of all cache_peer
addresses. Before this patch, even if DNS lookup results did not change,
the associated peerDNSConfigure() code silently cleared dead peer
marking, if any (CachePeer::tcp_up counter).  Forcefully reviving dead
peers every hour can lead to transaction delays (and delays may lead to
failures) due to connection timeouts when using a still dead peer.

This patch starts standard TCP probing instead of pointless dead peer
reviving, correctly refreshing peer state.  The primary goal is to cover
situation when DNS refresh changes peer address list. However TCP
probing may be useful for other situations either, without much overhead
(that is why it starts unconditionally).  For example, we need it when
DNS refresh returns the same addresses list but in different order. Also
it should help with dead idle peers detection.

=== modified file 'src/neighbors.cc'
--- src/neighbors.cc	2017-04-12 23:34:50 +0000
+++ src/neighbors.cc	2017-04-27 16:30:22 +0000
@@ -1177,72 +1177,74 @@ positiveTimeout(const time_t timeout)
 
 static void
 peerDNSConfigure(const ipcache_addrs *ia, const Dns::LookupDetails &, void *data)
 {
     // TODO: connections to no-longer valid IP addresses should be
     // closed when we can detect such IP addresses.
 
     CachePeer *p = (CachePeer *)data;
 
     int j;
 
     if (p->n_addresses == 0) {
         debugs(15, DBG_IMPORTANT, "Configuring " << neighborTypeStr(p) << " " << p->host << "/" << p->http_port << "/" << p->icp.port);
 
         if (p->type == PEER_MULTICAST)
             debugs(15, DBG_IMPORTANT, "    Multicast TTL = " << p->mcast.ttl);
     }
 
     p->n_addresses = 0;
 
     if (ia == NULL) {
         debugs(0, DBG_CRITICAL, "WARNING: DNS lookup for '" << p->host << "' failed!");
         return;
     }
 
     if ((int) ia->count < 1) {
         debugs(0, DBG_CRITICAL, "WARNING: No IP address found for '" << p->host << "'!");
         return;
     }
 
-    p->tcp_up = p->connect_fail_limit;
-
     for (j = 0; j < (int) ia->count && j < PEER_MAX_ADDRESSES; ++j) {
         p->addresses[j] = ia->in_addrs[j];
         debugs(15, 2, "--> IP address #" << j << ": " << p->addresses[j]);
         ++ p->n_addresses;
     }
 
     p->in_addr.setEmpty();
     p->in_addr = p->addresses[0];
     p->in_addr.port(p->icp.port);
 
+    // always start probing in order to effectively detect
+    // dead or revived peers
+    (void)peerProbeConnect(p);
+
     if (p->type == PEER_MULTICAST)
         peerCountMcastPeersSchedule(p, 10);
 
 #if USE_ICMP
     if (p->type != PEER_MULTICAST && IamWorkerProcess())
         if (!p->options.no_netdb_exchange)
             eventAddIsh("netdbExchangeStart", netdbExchangeStart, p, 30.0, 1);
 #endif
 
     if (p->standby.mgr.valid())
         PeerPoolMgr::Checkpoint(p->standby.mgr, "resolved peer");
 }
 
 static void
 peerRefreshDNS(void *data)
 {
     CachePeer *p = NULL;
 
     if (eventFind(peerRefreshDNS, NULL))
         eventDelete(peerRefreshDNS, NULL);
 
     if (!data && 0 == stat5minClientRequests()) {
         /* no recent client traffic, wait a bit */
         eventAddIsh("peerRefreshDNS", peerRefreshDNS, NULL, 180.0, 1);
         return;
     }
 
     for (p = Config.peers; p; p = p->next)
         ipcache_nbgethostbyname(p->host, peerDNSConfigure, p);
 

_______________________________________________
squid-dev mailing list
[email protected]
http://lists.squid-cache.org/listinfo/squid-dev

Reply via email to