The point about adjusting TCP MSS and MTU with iptables rules is well-taken for
production deployments. But my scale test is running in a controlled
environment, all Ethernet with no intervening provider networks or public
internet, and stable, jumbo MTUs. So I’m certain it cannot be the cause of the
issue I’m seeing.
I have this on both server (responder) and client (initiator) in charon.conf:
ikesa_table_segments = 16
ikesa_table_size = 1024
Since the client is simulating connections from what would be many thousands of
individual clients in a real situation, I did not think the following setting
was relevant and did not apply it per the comments in the IKE SA table article.
reuse_ikesa = no
The test is being run between Fedora 25 boxes with a very recent kernel:
4.11.12-200.fc25.x86_64
Aside from DPD settings and keyingtries=%forever, are there any other settings
which would help initially failed connections to keep retrying until they
successfully establish? Or other settings which would need tuning for
large-scale deployments? I’m no IKE/IPsec expert.
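As a point of reference for the retry question, here is a minimal Python sketch of how charon's retransmission back-off works out. It assumes the documented strongswan.conf defaults (retransmit_timeout = 4.0, retransmit_base = 1.8, retransmit_tries = 5) and that the n-th wait is retransmit_timeout * retransmit_base^n. The function name is hypothetical, and retransmit_jitter and retransmit_limit are ignored:

```python
def retransmit_schedule(timeout=4.0, base=1.8, tries=5):
    """Return (waits, total) for one unanswered IKE request.

    waits[n] is the time in seconds waited before retransmit n+1;
    the final entry is the wait before the exchange is abandoned.
    The default values are assumed from the strongswan.conf documentation.
    """
    waits = [timeout * base ** n for n in range(tries + 1)]
    return waits, sum(waits)


waits, total = retransmit_schedule()
for n, wait in enumerate(waits):
    label = f"retransmit {n + 1}" if n < len(waits) - 1 else "give up"
    print(f"wait {wait:6.2f}s -> {label}")
print(f"total before giving up: {total:.2f}s")
```

Under those assumed defaults an unanswered exchange is abandoned after roughly 165 seconds, and keyingtries then decides whether another attempt is made.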
Thanks.
> On Oct 4, 2017, at 5:55 PM, Noel Kuntze
> <[email protected]> wrote:
>
> You do not need to explicitly accept frag-needed. It is included in ctstate
> RELATED.
>
> dpddelay sets the interval between dpd packets, not when dpdaction is taken.
> dpdtimeout controls when the action is taken.
>
> The firewall rules you mentioned are needed anyway and do not deserve the
> term optimization. Not using them commonly breaks scenarios,
> and they are vital to having working tunnels.
>
> strongSwan is specifically optimized for multi core CPUs. You probably have
> problems because the CPU scheduler moves the threads around a lot.
> You can try working around that by tuning the scheduler (or upgrading your
> kernel in the hope that fixes it) or by changing the code to pin the threads
> to certain CPUs.
>
> I hope you optimized the strongSwan settings to make efficient use of
> parallelism by using hashtables[1].
>
> [1] https://wiki.strongswan.org/projects/strongswan/wiki/IkeSaTable
>
> On 04.10.2017 08:55, Anvar Kuchkartaev wrote:
>> The TCPMSS parameters in the firewall are required for proper routing of the
>> client's TCP connections within the IPsec tunnel, but this rule:
>> iptables -A INPUT -p icmp --icmp-type fragmentation-needed -j ACCEPT
>>
>> can help UDP connections when the MTU changes. The same thing happened to me
>> when connections from a client's ISP were being throttled and dropped
>> silently. Use:
>>
>> dpddelay=300s
>> dpdaction=clear
>>
>> on the server side. This checks for dead peers every 300 seconds and removes
>> them; in your case, if a client disappears, it can reconnect after at most
>> 300s. You might decrease 300s to find the optimal time.
>>
>> And use:
>>
>> dpddelay=5s
>> dpdaction=restart
>>
>> on the client side (the client checks every 5s and restarts the connection
>> automatically if it drops).
>> In this case the server drops connections that have been completely
>> disconnected within at most 300s, and the client restarts the connection
>> within 5s if a temporary failure occurred due to packet loss.
>>
>> Also, adding mobike=yes to the ipsec.conf connections and changing reuse_ikesa
>> to yes in strongswan.d/charon.conf will help the connection remain active even
>> if the IP changes or there are temporary disruptions (e.g. if the client uses
>> a mobile 3G connection with high latency and low bandwidth).
>>
>> Anvar Kuchkartaev
>> [email protected]
>> Original Message
>> From: Stephen Scheck
>> Sent: Tuesday, October 3, 2017 09:18 p.m.
>> To: Anvar Kuchkartaev
>> Cc: Jamie Stuart; [email protected]
>> Subject: Re: [strongSwan] Timeout on poor connection
>>
>>
>> Thanks for the configs.
>>
>> I added the dpd* parameters to my configurations. My situation is a little
>> different in that my traffic is primarily UDP, so the TCP MSS settings are
>> not needed. I also need to use IKEv1. Furthermore, I’m running a scale test
>> in which there’s low latency and plenty of bandwidth, which may nonetheless
>> be saturated by the number of simultaneous connections which are being
>> attempted.
>>
>> Unfortunately, the dpd* parameters did not help. I still notice a small
>> number of connections (25-50) out of several thousand which fail to
>> establish, and stay that way until the StrongSwan processes are restarted.
>>
>> Does anybody know of any further parameters which may influence connection
>> attempts and retries?
>>
>> One thing that I’ve noted is that if I run both the client and server
>> StrongSwan processes on single core machines, or with the StrongSwan threads
>> pinned to a single CPU, the success rate is *decidedly better* than with
>> multiple cores available (although, occasionally, even then a couple of them
>> fail to establish and stay “stuck”).
>>
>> I’m beginning to think there may be some troublesome concurrency bugs in the
>> StrongSwan IKEv1 routines.
>>
>> Any help appreciated!
>>
>>
>>
>>> On Sep 30, 2017, at 7:14 PM, Anvar Kuchkartaev <[email protected]> wrote:
>>>
>>> ipsec.conf
>>>
>>> keyexchange=ikev2
>>> type=tunnel
>>> dpdaction=clear
>>> dpddelay=300s
>>> rekey=yes
>>> left=%any
>>> right=%any
>>> fragmentation=yes
>>> compress=yes
>>>
>>> are the parameters on the server side, and:
>>>
>>> dpdtimeout=20s
>>> dpddelay=5s
>>> dpdaction=restart
>>>
>>> on the client side are, I think, the most important.
>>>
>>> You also have to do several server optimizations, such as:
>>>
>>>
>>> firewall:
>>>
>>> iptables -A INPUT -p esp -j ACCEPT
>>>
>>> iptables -A INPUT -p udp -m multiport --dport 500,4500 -j ACCEPT
>>>
>>> iptables -A INPUT -p icmp --icmp-type fragmentation-needed -j ACCEPT
>>>
>>> iptables -I FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS
>>> --clamp-mss-to-pmtu
>>>
>>> sysctl.conf
>>>
>>> net.ipv4.ip_forward_use_pmtu=1 (I assume you have done the rest of the
>>> sysctl configuration, such as ip_forward, etc.)
>>>
>>>
>>>
>>> On 30/09/17 19:37, Jamie Stuart wrote:
>>>> Could you post your (redacted) strongswan config Anvar?
>>>>
>>>>> On 30 Sep 2017, at 00:59, Anvar Kuchkartaev <[email protected]> wrote:
>>>>>
>>>>> I also have some clients connecting from Central Asia, where the internet
>>>>> is very poor and restricted. The main optimizations must be done in the
>>>>> server OS and firewall, not in strongSwan. In strongSwan, try to
>>>>> authenticate the server with a 2048-bit certificate or higher, and watch
>>>>> the IKE ciphers and the dos_protection, ikesa_table_size,
>>>>> ikesa_table_segments and ikesa_hashtable_size parameters. Allow only
>>>>> IKEv2 if possible, decrease DPD requests, and set dpdaction=restart to
>>>>> restart the connection automatically if the tunnel fails. In the
>>>>> operating system, watch out for MTU changes: in my case there were a lot
>>>>> of MTU decreases within the provider network in the region where the
>>>>> client is located. Allow ICMP fragmentation-needed messages through the
>>>>> firewall and apply the TCP MSS optimizations. It is also recommended to
>>>>> install a proxy server behind the VPN server that can only be reached
>>>>> through the VPN tunnel (so the client can configure its browser to use
>>>>> the proxy server to enhance connection stability).
>>>>>
>>>>> Anvar Kuchkartaev
>>>>> [email protected]
>>>>> Original Message
>>>>> From: Jamie Stuart
>>>>> Sent: Friday, September 29, 2017 05:59 p.m.
>>>>> To: [email protected]
>>>>> Subject: [strongSwan] Timeout on poor connection
>>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> We have a client (running on LEDE) connecting to a server (Ubuntu). The
>>>>> client is connecting from rural Africa over 2G/3G with high latency and
>>>>> low speed.
>>>>> Often, the connection does not come up, timing out after 5 retransmits
>>>>> like the log below:
>>>>>
>>>>>
>>>>> ipsec up {connection}
>>>>> initiating IKE_SA {connection}[2] to {serverip}
>>>>> generating IKE_SA_INIT request 0 [ SA KE No N(NATD_S_IP) N(NATD_S_IP)
>>>>> N(FRAG_SUP) N(HASH_ALG) N(REDIR_SUP)]
>>>>> sending packet: from {clientip}[500] to {serverip}[500] (378 bytes)
>>>>> retransmit 1 of request with message ID 0
>>>>> sending packet: from {clientip}[500] to {serverip}[500] (378 bytes)
>>>>> retransmit 2 of request with message ID 0
>>>>> sending packet: from {clientip}[500] to {serverip}[500] (378 bytes)
>>>>> retransmit 3 of request with message ID 0
>>>>> sending packet: from {clientip}[500] to {serverip}[500] (378 bytes)
>>>>>
>>>>>
>>>>> Is there anything more we can do to make the connection 1) establish more
>>>>> reliably 2) remain ’up’ even over a poor-quality connection (using
>>>>> MOBIKE already)?
>>>>>
>>>>>
>>>>> Thanks in advance!
>>>>>
>>>>> Jamie, onebillion
>>>>>
>>>
>>>
>>
>>
>