Hello,

On Tue, 5 Feb 2013, Dmitry Akindinov wrote:

> Hello,
>
> We have met a quite troublesome situation which causes an internal SYN
> storm.
>
> The simplified version of the configuration consists of 2 servers - A
> and B, both running Linux kernel 3.7.4-20.
>
> Both have the IPVS software enabled, A is acting as the active load
> balancer, B as a backup.
> Both servers act as real servers also.
>
> At some point, there is an incoming TCP connection from IPpair
> (address:port) I.
> The load balancer A decides to process it locally. Connection is
> established, and the balancer status is distributed to server B via
> syncing broadcast.
>
> The client closes connection, and again the status is updated on B via
> the broadcast - the connection is now in the "TCP_WAIT" state.
>
> Pretty soon (within 10 seconds) the client opens the new TCP connection
> using the same IP pair I.
> It is not a good TCP practice, but nevertheless, some clients work this way.
>
> This time the load balancer A decides that the connection is to be
> handled on the server B (persistence is switched off).

If the connection still exists in balancer A, A is not going to
select a new real server. Perhaps only if expire_nodest_conn is set and
the current real server becomes unavailable can a new real server be
selected for the next packets (2nd SYN).

> The SYN packet is relayed to the server B, which finds an existing
> routing record for that pair I.
> And that record (in the CLOSE state) - points to the server A, and the
> SYN packet is relayed there.
>
> The server A processes it again, directs it to the server B again, and
> the loop spirals, since the server B does not have the new connection
> table element I synced.

More likely, the SYN arrives shortly after the conn in server A
has expired but while the synced conn in server B has not expired yet.
This can happen often because the sync protocol is not perfect: conns in
the backup tend to expire later.

> We can send packet dumps illustrating the problem.
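
To make the loop described above concrete, here is a minimal user-space sketch in C. It is not kernel code and does not use any IPVS API: `enum node`, `struct conn_entry` and `simulate_syn()` are hypothetical names invented for this illustration. It assumes the case discussed above, where the conn on A has already expired while the synced conn on B still points to A, and that A schedules the new conn to B.

```c
#include <assert.h>

/* Hypothetical model of the two servers; not kernel code. */
enum node { NODE_A = 0, NODE_B = 1 };

struct conn_entry {
	int present;      /* 1 if a conn entry exists for the client pair */
	enum node dest;   /* real server the entry points to */
};

/*
 * Follow one SYN from the director A and count how many times it is
 * forwarded. Returns the hop count when the packet is delivered
 * locally, or -1 if it is still bouncing after max_hops forwards.
 */
static int simulate_syn(int stale_entry_on_b, int max_hops)
{
	struct conn_entry table[2] = {
		{ 0, NODE_A },                 /* A: old conn already expired */
		{ stale_entry_on_b, NODE_A },  /* B: synced conn, points to A */
	};
	enum node at = NODE_A;
	int hops = 0;

	assert(max_hops > 0);

	while (hops < max_hops) {
		if (!table[at].present) {
			if (at == NODE_B)
				return hops;   /* no entry on B: deliver locally */
			/* Assumption: A schedules the new conn to real server B. */
			table[at].present = 1;
			table[at].dest = NODE_B;
		}
		if (table[at].dest == at)
			return hops;           /* entry points to this node */
		at = table[at].dest;       /* forward to the other server */
		hops++;
	}
	return -1;                     /* forwarding loop */
}
```

In this model, `simulate_syn(1, 16)` never delivers the packet and returns -1, while `simulate_syn(0, 16)` delivers it on B after a single forward. The only difference between the two runs is the stale synced entry on the backup.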
>
> If our analysis is correct, what are the available workarounds?

I see that we discussed this problem in August 2012.
I assume this is the DR method and all IPVS rules are present in
the backup? Are you using the old "sync_threshold" algorithm or the new
one with sync_refresh_period=10?

> a) we can always use "persistent" option with time larger than CLOSE
> (TIME_WAIT?) state time.

Maybe the problem will just move from the normal conns to
the persistent conn templates.

The simplest solution is to use:

	if (ipvs->sync_state & IP_VS_STATE_BACKUP)
		return NF_ACCEPT;

in all hooks. This will stop IPVS from handling any traffic.
The problem is that we do not know if the backup function is used
for only part of the virtual services; other virtual services can
be in normal mode, possibly using IP_VS_STATE_MASTER. I assume
that was the reason the master and backup functions are able to
run together, with different sync_id.

I'm not sure what is more appropriate to apply here.
One option is a sysctl var that stops the forwarding mode while
the backup function is enabled. This solution works better if we
don't want to change the tools that manage IPVS rules, and it is
easier to implement, e.g. "backup_only=1" to activate such a mode.
The result would be:

	if (ipvs->sync_state & IP_VS_STATE_BACKUP &&
	    ipvs->backup_only)
		return NF_ACCEPT;	/* Try local server */

Another solution would be to add an optional syncid
attribute to the virtual server. This way we will know whether a
received packet matches the backup_syncid (we are used as a real
server by the director) or else is directed from a client to our
virtual server. In the second case, if the packet matches
master_syncid (as an additional check to the IP_VS_STATE_MASTER
flag), a sync message would be sent.

> b) on the server B we can remove the iptables records marking incoming
> packets with a flag used with the IPVS uses.
> We can insert those iptable rule(s) only when the server B becomes the
> main load balancer.
> But will it stop IPVS from running all incoming
> packets via its (synced) connections table?

What is the case now, do you have IPVS rules on server B
while the backup function is enabled?

> --
> Best regards,
> Dmitry Akindinov

Regards

--
Julian Anastasov <j...@ssi.bg>

_______________________________________________
Please read the documentation before posting - it's available at:
http://www.linuxvirtualserver.org/

LinuxVirtualServer.org mailing list - lvs-users@LinuxVirtualServer.org
Send requests to lvs-users-requ...@linuxvirtualserver.org
or go to http://lists.graemef.net/mailman/listinfo/lvs-users
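
For reference, the "backup_only" check proposed in the reply above can be tried out in isolation with a small user-space sketch. Everything prefixed with `sketch_` or `SKETCH_` is a hypothetical stand-in invented here, including the flag values and the verdict enum; the real kernel uses its own `IP_VS_STATE_*` and `NF_*` definitions, and this is only one way the proposed condition might be modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in flag values for this sketch only; not the kernel's values. */
#define SKETCH_STATE_MASTER 0x0001u
#define SKETCH_STATE_BACKUP 0x0002u

enum sketch_verdict {
	SKETCH_ACCEPT,    /* skip IPVS handling, try the local server */
	SKETCH_CONTINUE   /* go on with normal IPVS processing */
};

/*
 * Model of the proposed early return at the top of each hook:
 * handling is skipped only when the backup function is running
 * AND the (proposed) backup_only switch is set.
 */
static enum sketch_verdict sketch_hook_verdict(unsigned int sync_state,
					       bool backup_only)
{
	/* Only the two stand-in flags are valid in this model. */
	assert((sync_state & ~(SKETCH_STATE_MASTER | SKETCH_STATE_BACKUP)) == 0);

	if ((sync_state & SKETCH_STATE_BACKUP) && backup_only)
		return SKETCH_ACCEPT;
	return SKETCH_CONTINUE;
}
```

With this model, a node that runs only the master function, or a backup with the switch left off, keeps forwarding as before. A node that runs the master and backup functions together with the switch on also takes the early return, which is the limitation noted in the reply: a single global switch cannot tell which virtual services the backup function applies to.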