Re: [strongSwan] problems with charon in 4.5.2 (was: 4.4.1)
Hi!

I have now run strongSwan 4.5.2 for two days and it looks more stable than 4.4.1 on our testbed. However, even 4.5.2 died tonight: the connection between alvina and sarah went down and attempts to reinitiate it failed. I attach the output of grep alvina /var/log/daemon (on sarah) and vice versa. It seems alvina gives up on sarah when it does not respond. sarah in turn has some issues where it becomes unresponsive for several seconds and freezes; it could be that the rekeying interval fell into such a freeze. I have not gotten to the bottom of those freezes yet, but I would think this could happen in real-life situations, too.

Of course I would like the two hosts to try harder to re-establish their connection. Did something go wrong at that point? How can I increase the number of reconnection attempts in case of loss of an SA? %forever sounds long to me, but hey. Should I just put a really big number here?

For the record, this is how my config file for 4.5.2 looks. Is there anything else I can do for resilience?

config setup
        plutostart=no           # pluto is used for IKEv1

conn %default
        ikelifetime=3h          # strongSwan default
        lifetime=1h             # strongSwan default
        margintime=9m           # strongSwan default
        keyingtries=%forever    # strongSwan default
        mobike=no               # mobike is used for NAT traversal
        keyexchange=ikev2
        ike=aes128-sha1-modp2048
        esp=aes128-sha1-modp2048
        left=%defaultroute
        leftcert=host_server.crt
        type=transport          # should work just as well as tunnel, but with less overhead
        reauth=no               # recommended so that SAs are rekeyed, not reauthenticated

# Begin connection section
# For all connections, the peer with the host name
# that is first in a lexicographical sorting
# is selected as the initiator of the connection.
#for $peer in $peers
conn $host-$peer.name
        right=$peer.ip
        rightid=C=SE, O=Spotify, CN=$peer.name
        auto=start
#end for

2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 04[NET] alvina.ash.spotify.net-sarah.sto.spotify.net|322 received packet: from 78.31.14.56[500] to 193.182.12.31[500]
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 04[ENC] alvina.ash.spotify.net-sarah.sto.spotify.net|322 parsed CREATE_CHILD_SA response 0 [ N(USE_TRANSP) SA No KE TSi TSr ]
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 04[IKE] alvina.ash.spotify.net-sarah.sto.spotify.net|322 received USE_TRANSPORT_MODE notify
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 04[IKE] alvina.ash.spotify.net-sarah.sto.spotify.net|322 CHILD_SA alvina.ash.spotify.net-sarah.sto.spotify.net{19} established with SPIs c68e90b3_i c1afe29b_o and TS 193.182.12.31/32 === 78.31.14.56/32
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 04[IKE] alvina.ash.spotify.net-sarah.sto.spotify.net|322 reinitiating already active tasks
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 04[IKE] alvina.ash.spotify.net-sarah.sto.spotify.net|322 CHILD_REKEY task
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 04[IKE] alvina.ash.spotify.net-sarah.sto.spotify.net|322 closing CHILD_SA alvina.ash.spotify.net-sarah.sto.spotify.net{19} with SPIs cd8f72b1_i (44485206 bytes) c20dc52a_o (25465004 bytes) and TS 193.182.12.31/32 === 78.31.14.56/32
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 04[IKE] alvina.ash.spotify.net-sarah.sto.spotify.net|322 sending DELETE for ESP CHILD_SA with SPI cd8f72b1
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 04[ENC] alvina.ash.spotify.net-sarah.sto.spotify.net|322 generating INFORMATIONAL request 1 [ D ]
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 04[NET] alvina.ash.spotify.net-sarah.sto.spotify.net|322 sending packet: from 193.182.12.31[500] to 78.31.14.56[500]
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 14[NET] alvina.ash.spotify.net-sarah.sto.spotify.net|322 received packet: from 78.31.14.56[500] to 193.182.12.31[500]
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 14[ENC] alvina.ash.spotify.net-sarah.sto.spotify.net|322 parsed INFORMATIONAL response 1 [ D ]
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 14[IKE] alvina.ash.spotify.net-sarah.sto.spotify.net|322 received DELETE for ESP CHILD_SA with SPI c20dc52a
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 14[IKE] alvina.ash.spotify.net-sarah.sto.spotify.net|322 CHILD_SA closed
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 14[IKE] alvina.ash.spotify.net-sarah.sto.spotify.net|322 activating new tasks
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 14[IKE] alvina.ash.spotify.net-sarah.sto.spotify.net|322 nothing to initiate
2011-05-31T23:21:47.000+00:00 alvina.ash.spotify.net charon: 14[NET] alvina.ash.spotify.net-sarah.sto.spotify.net|322 received packet: from 78.31.14.56[500] to
Re: [strongSwan] problems with charon in 4.5.2 (was: 4.4.1)
Hi,

> Of course i would like the two hosts to try harder to re-establish
> their connection again. did something go wrong at that point? how can i
> increase the number of reconnection attempts in case of loss of SA?
> %forever sounds long to me, but hey. should i just put a really big
> number here?

The keyingtries parameter in IKEv2 applies to the initial connection setup only, not to ordinary exchanges. To retry establishing the tunnel, you could enable dpdaction=restart (optionally with dpddelay=0). The keyingtries parameter will then apply when the tunnel is restarted after an ordinary exchange times out.

The problem with dpdaction is: we currently use the same value for the so-called close-action, which defines what we should do if the remote end closes the tunnel. This is problematic if you enforce duplicate checking, i.e. with uniqueids. We really should split these up into two separately configurable parameters. If you'd like to try this approach, you should consider the attached patch, which disables the close-action for now.

Another variant to realize always-up tunnels is to have a routed policy and establish tunnels on traffic. If the tunnel fails, it gets reestablished. There are some problems with policy management (the kernel does not support identical policies), hence I currently can't recommend it. Tobias has tried to solve these issues at [1]; I'll do some testing with his work to see if this could be an option for you.
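A minimal sketch of the suggested settings in ipsec.conf form (the connection name and the combination of options are illustrative, not taken verbatim from the thread):

```
conn my-peer
        keyexchange=ikev2
        keyingtries=%forever    # retried indefinitely when the tunnel is restarted
        dpdaction=restart       # restart the tunnel when the peer stops responding
        dpddelay=0              # as suggested above: no extra DPD probes, rely on ordinary exchanges
        auto=start
```

With this, a timed-out ordinary exchange triggers a restart, and keyingtries then governs how long charon keeps trying.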
Best regards
Martin

[1] http://git.strongswan.org/?p=strongswan.git;a=shortlog;h=refs/heads/policy-history

From 7cb42fc160e4ab77638955c88440f8e70adb51b6 Mon Sep 17 00:00:00 2001
From: Martin Willi mar...@revosec.ch
Date: Wed, 1 Jun 2011 16:38:52 +0200
Subject: [PATCH] Disable CHILD_SA close_action

---
 src/libcharon/plugins/stroke/stroke_config.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/src/libcharon/plugins/stroke/stroke_config.c b/src/libcharon/plugins/stroke/stroke_config.c
index 2b31643..ba7d9c2 100644
--- a/src/libcharon/plugins/stroke/stroke_config.c
+++ b/src/libcharon/plugins/stroke/stroke_config.c
@@ -824,7 +824,7 @@ static child_cfg_t *build_child_cfg(private_stroke_config_t *this,
 	child_cfg = child_cfg_create(
 			msg->add_conn.name, lifetime,
 			msg->add_conn.me.updown, msg->add_conn.me.hostaccess,
-			msg->add_conn.mode, ACTION_NONE, dpd, dpd, msg->add_conn.ipcomp,
+			msg->add_conn.mode, ACTION_NONE, dpd, ACTION_NONE, msg->add_conn.ipcomp,
 			msg->add_conn.inactivity, msg->add_conn.reqid, mark_in, mark_out,
 			msg->add_conn.tfc);
 	child_cfg->set_mipv6_options(child_cfg, msg->add_conn.proxy_mode,
-- 
1.7.4.1

___
Users mailing list
Users@lists.strongswan.org
https://lists.strongswan.org/mailman/listinfo/users
Re: [strongSwan] problems with charon in 4.4.1
After the test setup survived the night (I don't know if there were problems during the night, but if there were, they self-healed, which is almost as good), this morning there were again several hosts without an SA in ESTABLISHED state (according to ipsec statusall). It centered around fiona again. After running ipsec down $connection-name; ipsec up $connection-name things worked again, except for the connection between fiona and grazyna.lon.spotify.net (which are in the same local network, where I don't expect any packet loss and expect very low latency). The SA was set up again, but both hosts were unable to ping each other or transfer test data. We don't fiddle with iptables, so I expect xfrm policy or state went south. So today I uploaded even the output of

( ip xfrm policy show; ip xfrm state show ) > /tmp/xfrm-policy-and-state.dump

Unfortunately I forgot to set the charon logging of the cfg module to 2, as I had intended. charon 4.4.1 does not seem to know the keyword in the config file, but I had intended to set it with stroke. Doh!

I did, however, change the configuration to use auto=route and dpdaction=hold, in order to make the setup more resilient and somewhat self-healing. Could you please check whether you can see any fishiness in the logs, regardless? Note that this time we don't just have a failure to negotiate an SA, but even a failure to transmit any payload after it's there, so this is something new and somewhat more scary (that depends on your view, I guess...).

The logs and dumps are at
http://origin.scdn.co/u/wp/fiona.lon.spotify.net-charon.log
http://origin.scdn.co/u/wp/fiona.lon.spotify.net-xfrm-policy-and-state.dump
http://origin.scdn.co/u/wp/grazyna.lon.spotify.net-charon.log
http://origin.scdn.co/u/wp/grazyna.lon.spotify.net-xfrm-policy-and-state.dump

thanks!
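The auto=route / dpdaction=hold change mentioned above can be sketched as an ipsec.conf fragment (the connection name is a placeholder; the exact per-peer options in the real template differ):

```
conn my-peer
        type=transport
        auto=route              # install the xfrm policy only; the SA is negotiated on first matching traffic
        dpdaction=hold          # on peer timeout, keep the policy and re-negotiate when the next packet arrives
```

The idea is that a lost SA no longer needs an explicit restart: the trap policy stays in the kernel, and the next packet that matches it triggers a fresh negotiation.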
/andreas

On Thu, May 26, 2011 at 12:51 PM, Andreas Schuldei schuldei+strongs...@spotify.com wrote:

On Wed, May 25, 2011 at 8:49 AM, Andreas Schuldei schuldei+strongs...@spotify.com wrote:

Now I uploaded new logs from taylor and aldona. The two dropped their SA sometime after 2011-05-24T21:48:21 (that is the last good SA negotiation I can see in the logs) and didn't manage to establish a new one. Could someone please look at the logs and tell me if I can do anything about this failure (by choosing different config options)? The configuration files are unchanged since my first mail.

Please find the log files at
http://origin.scdn.co/u/wp/aldona.ash.spotify.net-charon.log
http://origin.scdn.co/u/wp/taylor.sto.spotify.net-charon.log
They are smaller this time, and the timestamps might make things easier, too. :-)

Yesterday I changed (per the suggestions on IRC) the ipsec.conf to say reauth=no, to make the connections less prone to reauthentication issues (and also switched to transport mode). Then I restarted everything and also extended the testbed a little, so that we have 23 machines sending random traffic to each other through IPsec continuously (253 host-to-host connections :-).

I uploaded new log files of a failure now, which centers around fiona. aldona, alejandra, alvina, amber and annmarie failed to re-establish their SAs to fiona. annmarie, for example, set up its last SA at 21:37 and then stopped talking to fiona altogether. Other traffic to other hosts goes on as before.
Please check out
http://origin.scdn.co/u/wp/aldona.ash.spotify.net-charon.log
http://origin.scdn.co/u/wp/alejandra.ash.spotify.net-charon.log
http://origin.scdn.co/u/wp/alvina.ash.spotify.net-charon.log
http://origin.scdn.co/u/wp/amber.lon.spotify.net-charon.log
http://origin.scdn.co/u/wp/annmarie.ash.spotify.net-charon.log
http://origin.scdn.co/u/wp/fiona.lon.spotify.net-charon.log
as well as their IPsec config files, which are now generated with this template:

$comment
config setup
        plutostart=no           # pluto is used for IKEv1

conn %default
        ikelifetime=3h          # strongSwan default
        lifetime=1h             # strongSwan default
        margintime=9m           # strongSwan default
        keyingtries=%forever    # strongSwan default
        mobike=no               # mobike is used for NAT traversal
        keyexchange=ikev2
        ike=aes128-sha1-modp2048
        esp=aes128-sha1-modp2048
        left=%defaultroute
        leftcert=host_server.crt
        type=transport          # should work just as well as tunnel, but with less overhead
        reauth=no               # recommended so that SAs are rekeyed, not reauthenticated

# Begin connection section
# For all connections, the peer with the host name
# that is first in a lexicographical sorting
# is selected as the initiator of the connection.
#for $peer in $peers
conn $host-$peer.name
        right=$peer.ip
        rightid=C=SE, O=Spotify, CN=$peer.name
#if $peer.initiator
        auto=start
        dpdaction=restart
#else
        auto=add
        dpdaction=clear
#end if
#end for

regarding the remnants
Re: [strongSwan] problems with charon in 4.4.1
On Wed, May 25, 2011 at 8:49 AM, Andreas Schuldei schuldei+strongs...@spotify.com wrote:

Now I uploaded new logs from taylor and aldona. The two dropped their SA sometime after 2011-05-24T21:48:21 (that is the last good SA negotiation I can see in the logs) and didn't manage to establish a new one. Could someone please look at the logs and tell me if I can do anything about this failure (by choosing different config options)? The configuration files are unchanged since my first mail.

Please find the log files at
http://origin.scdn.co/u/wp/aldona.ash.spotify.net-charon.log
http://origin.scdn.co/u/wp/taylor.sto.spotify.net-charon.log
They are smaller this time, and the timestamps might make things easier, too. :-)

Yesterday I changed (per the suggestions on IRC) the ipsec.conf to say reauth=no, to make the connections less prone to reauthentication issues (and also switched to transport mode). Then I restarted everything and also extended the testbed a little, so that we have 23 machines sending random traffic to each other through IPsec continuously (253 host-to-host connections :-).

I uploaded new log files of a failure now, which centers around fiona. aldona, alejandra, alvina, amber and annmarie failed to re-establish their SAs to fiona. annmarie, for example, set up its last SA at 21:37 and then stopped talking to fiona altogether. Other traffic to other hosts goes on as before.
Please check out
http://origin.scdn.co/u/wp/aldona.ash.spotify.net-charon.log
http://origin.scdn.co/u/wp/alejandra.ash.spotify.net-charon.log
http://origin.scdn.co/u/wp/alvina.ash.spotify.net-charon.log
http://origin.scdn.co/u/wp/amber.lon.spotify.net-charon.log
http://origin.scdn.co/u/wp/annmarie.ash.spotify.net-charon.log
http://origin.scdn.co/u/wp/fiona.lon.spotify.net-charon.log
as well as their IPsec config files, which are now generated with this template:

$comment
config setup
        plutostart=no           # pluto is used for IKEv1

conn %default
        ikelifetime=3h          # strongSwan default
        lifetime=1h             # strongSwan default
        margintime=9m           # strongSwan default
        keyingtries=%forever    # strongSwan default
        mobike=no               # mobike is used for NAT traversal
        keyexchange=ikev2
        ike=aes128-sha1-modp2048
        esp=aes128-sha1-modp2048
        left=%defaultroute
        leftcert=host_server.crt
        type=transport          # should work just as well as tunnel, but with less overhead
        reauth=no               # recommended so that SAs are rekeyed, not reauthenticated

# Begin connection section
# For all connections, the peer with the host name
# that is first in a lexicographical sorting
# is selected as the initiator of the connection.
#for $peer in $peers
conn $host-$peer.name
        right=$peer.ip
        rightid=C=SE, O=Spotify, CN=$peer.name
#if $peer.initiator
        auto=start
        dpdaction=restart
#else
        auto=add
        dpdaction=clear
#end if
#end for

Regarding the remnants of xfrm policy after /etc/init.d/ipsec stop: is that a sign that the cleanup of charon at shutdown went wrong? I also see that the xfrm kernel modules are still heavily used (with a usage count of ~60 on some machines) when charon was stopped and no SAs are active any more. How can I see with lsof (or similar tools) what userspace (or kernel) stuff uses them?
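The initiator rule from the template header (for each pair, the host whose name sorts first lexicographically initiates) can be sketched in shell; the host names are real testbed hosts, the variable names are mine:

```shell
host=alvina.ash.spotify.net
peer=sarah.sto.spotify.net

# the peer whose host name comes first in a lexicographical sort initiates
initiator=$(printf '%s\n%s\n' "$host" "$peer" | sort | head -n 1)
echo "$initiator initiates"   # alvina.ash.spotify.net initiates
```

Because the rule is symmetric and deterministic, both ends of a pair compute the same initiator without any coordination, so exactly one side gets auto=start / dpdaction=restart.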
Re: [strongSwan] problems with charon in 4.4.1
On Tue, May 24, 2011 at 8:48 AM, Andreas Schuldei schuldei+strongs...@spotify.com wrote:

On Mon, May 23, 2011 at 11:44 PM, Andreas Steffen andreas.stef...@strongswan.org wrote:

Hello Andreas, debugging these many connections might be easier using the condensed /var/log/auth.log, which has entries like the following:
http://www.strongswan.org/uml/testresults45/ikev2/dpd-restart/carol.auth.log

The auth.log was still huge on taylor. I attempted to start from a clean slate today and did this on all machines in the testbed:

/etc/init.d/ipsec stop
rm -f /var/run/charon.pid /var/run/starter.pid /var/run/charon.ctl
/etc/init.d/ipsec stop
logrotate -f /etc/logrotate.conf
ip xfrm policy flush
/etc/network/if-up.d/ssh-outside-ipsec   # this adds xfrm policies so that port 500/UDP and ssh traffic do NOT go through ipsec
/etc/init.d/ipsec start

And again taylor got immediate problems with the three hosts, just like yesterday. We don't have additional firewall rules that limit traffic between these hosts, and other hosts in the ash.spotify.net domain don't have problems either. Can something else get confused? Is there more state somewhere? Do I need to unload the xfrm modules?

The connections between hosts, once turned bad, remained bad until I rebooted the machines in question. Since then (the last few hours) it works nicely. But rebooting is not a real option, of course, and connections going into an unrecoverable state are not so good, either.
Re: [strongSwan] problems with charon in 4.4.1
Hello Andreas,

I just analyzed the first part of the alvina.ash.spotify.net log file and I see that of the 15 initiated IKE_SAs only 4 succeed in the first round. Are there connection problems to the other 11 hosts, are some of the peers not online yet, or is the computing power of the hosts so small that they cannot handle more than 4 IKE_SAs without multiple retransmission rounds?

Regards
Andreas

On 05/23/2011 08:14 PM, Andreas Schuldei wrote:

The charon log files for these four hosts are available for download here:
http://origin.scdn.co/u/wp/alvina.ash.spotify.net-charon.log.gz
http://origin.scdn.co/u/wp/annalise.ash.spotify.net-charon.log.gz
http://origin.scdn.co/u/wp/annmarie.ash.spotify.net-charon.log.gz
http://origin.scdn.co/u/wp/taylor.sto.spotify.net-charon.log.gz

On Mon, May 23, 2011 at 2:46 PM, Andreas Schuldei schuldei+strongs...@spotify.com wrote:

Hi! I seem to be experiencing problems with charon in strongSwan 4.4.1. One problem is that charon sometimes fails to reinitiate SAs once they expire. I set up a testbed with 17 hosts to reproduce and track down the issue, as it takes some time for it to manifest. Since every host has several connections to the other peers in this IPsec setup, it is tricky to see which log entry is caused by which connection. How can I single out the log entries from the affected/failing connections? How can I get a verbose status dump from charon showing what it thinks the status of all its connections is? I don't want to attach 16M of log files here; please advise which parts are useful, and I would appreciate tips on how to extract them.

The hosts that I currently see problems with are up:

root@taylor:~# fping annalise.ash.spotify.net annmarie.ash.spotify.net alvina.ash.spotify.net
annalise.ash.spotify.net is alive
annmarie.ash.spotify.net is alive
alvina.ash.spotify.net is alive

but ipsec statusall has no SA for them (see ipsec-statusall.txt). Please also find attached annalise's and taylor's ipsec.conf.
The other hosts' ipsec.conf is equivalent. There is always one initiator for each connection.

--
==
Andreas Steffen                         andreas.stef...@strongswan.org
strongSwan - the Linux VPN Solution!                www.strongswan.org
Institute for Internet Technologies and Applications
University of Applied Sciences Rapperswil
CH-8640 Rapperswil (Switzerland)
===[ITA-HSR]==

[  ] alvina.ash.spotify.net-aldona.ash.spotify.net
[  ] alvina.ash.spotify.net-alejandra.ash.spotify.net
[ 1] alvina.ash.spotify.net-amber.lon.spotify.net    78.31.10.34
[ 2] alvina.ash.spotify.net-annalise.ash.spotify.net 193.182.12.46
[ 3] alvina.ash.spotify.net-annmarie.ash.spotify.net 193.182.12.49
[ 4] alvina.ash.spotify.net-dorothy.ash.spotify.net  193.182.12.147
[ 5] alvina.ash.spotify.net-fiona.lon.spotify.net    78.31.10.48
[ 6] alvina.ash.spotify.net-gordana.sto.spotify.net  78.31.14.162 {1}
[ 7] alvina.ash.spotify.net-grazyna.lon.spotify.net  78.31.10.243
[ 8] alvina.ash.spotify.net-lillian.lon.spotify.net  78.31.10.84 {2}
[ 9] alvina.ash.spotify.net-marissa.sto.spotify.net  78.31.14.98 {3}
[10] alvina.ash.spotify.net-parody.sto.spotify.net   78.31.14.164
[11] alvina.ash.spotify.net-renate.lon.spotify.net   78.31.10.210
[12] alvina.ash.spotify.net-sarah.sto.spotify.net    78.31.14.56 {4}
[13] alvina.ash.spotify.net-savannah.sto.spotify.net 78.31.14.58
[14] alvina.ash.spotify.net-sibylla.sto.spotify.net  78.31.14.131
[15] alvina.ash.spotify.net-taylor.sto.spotify.net   78.31.14.85
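One way to single out the log entries of a specific connection: charon prefixes each message with the connection name and IKE_SA unique id (e.g. alvina.ash.spotify.net-sarah.sto.spotify.net|322), so a plain grep on that name isolates one pair. A minimal sketch against a fabricated two-line sample (the file path and log lines are illustrative, not taken from the real logs):

```shell
# fabricated sample of charon log lines (the real entries live in /var/log/daemon.log)
cat > /tmp/charon-sample.log <<'EOF'
charon: 04[IKE] alvina.ash.spotify.net-sarah.sto.spotify.net|322 CHILD_SA established
charon: 05[IKE] alvina.ash.spotify.net-fiona.lon.spotify.net|301 giving up after retransmits
EOF

# keep only the entries belonging to the alvina<->sarah connection
grep 'alvina.ash.spotify.net-sarah.sto.spotify.net' /tmp/charon-sample.log
```

Grepping on the full name|id tag (e.g. '...|322') narrows it further, to one IKE_SA incarnation rather than all SAs of that pair.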
Re: [strongSwan] problems with charon in 4.4.1
Hello Andreas,

debugging these many connections might be easier using the condensed /var/log/auth.log, which has entries like the following:
http://www.strongswan.org/uml/testresults45/ikev2/dpd-restart/carol.auth.log

Regards
Andreas

On 05/23/2011 08:14 PM, Andreas Schuldei wrote:

The charon log files for these four hosts are available for download here:
http://origin.scdn.co/u/wp/alvina.ash.spotify.net-charon.log.gz
http://origin.scdn.co/u/wp/annalise.ash.spotify.net-charon.log.gz
http://origin.scdn.co/u/wp/annmarie.ash.spotify.net-charon.log.gz
http://origin.scdn.co/u/wp/taylor.sto.spotify.net-charon.log.gz

On Mon, May 23, 2011 at 2:46 PM, Andreas Schuldei schuldei+strongs...@spotify.com wrote:

Hi! I seem to be experiencing problems with charon in strongSwan 4.4.1. One problem is that charon sometimes fails to reinitiate SAs once they expire. I set up a testbed with 17 hosts to reproduce and track down the issue, as it takes some time for it to manifest. Since every host has several connections to the other peers in this IPsec setup, it is tricky to see which log entry is caused by which connection. How can I single out the log entries from the affected/failing connections? How can I get a verbose status dump from charon showing what it thinks the status of all its connections is? I don't want to attach 16M of log files here; please advise which parts are useful, and I would appreciate tips on how to extract them.

The hosts that I currently see problems with are up:

root@taylor:~# fping annalise.ash.spotify.net annmarie.ash.spotify.net alvina.ash.spotify.net
annalise.ash.spotify.net is alive
annmarie.ash.spotify.net is alive
alvina.ash.spotify.net is alive

but ipsec statusall has no SA for them (see ipsec-statusall.txt). Please also find attached annalise's and taylor's ipsec.conf. The other hosts' ipsec.conf is equivalent. There is always one initiator for each connection.
Re: [strongSwan] problems with charon in 4.4.1
ipsec was started by puppet. That means the connections are initiated over an interval of about 30 minutes. When I checked later on, I discovered that some hosts did not get their initial puppet trigger for some reason.

Our physical nets are quite good; we don't see packet loss or the like within our sites. Between the sites (ash, lon, sto) we go through the wild internet, and occasional connection issues can happen. They are not the rule, though. All servers are real high-powered servers; none of them is too puny to get IPsec negotiations right on the first try. :-)

On Mon, May 23, 2011 at 11:38 PM, Andreas Steffen andreas.stef...@strongswan.org wrote:

Hello Andreas,

I just analyzed the first part of the alvina.ash.spotify.net log file and I see that of the 15 initiated IKE_SAs only 4 succeed in the first round. Are there connection problems to the other 11 hosts, are some of the peers not online yet, or is the computing power of the hosts so small that they cannot handle more than 4 IKE_SAs without multiple retransmission rounds?

Regards
Andreas

On 05/23/2011 08:14 PM, Andreas Schuldei wrote:

The charon log files for these four hosts are available for download here:
http://origin.scdn.co/u/wp/alvina.ash.spotify.net-charon.log.gz
http://origin.scdn.co/u/wp/annalise.ash.spotify.net-charon.log.gz
http://origin.scdn.co/u/wp/annmarie.ash.spotify.net-charon.log.gz
http://origin.scdn.co/u/wp/taylor.sto.spotify.net-charon.log.gz

On Mon, May 23, 2011 at 2:46 PM, Andreas Schuldei schuldei+strongs...@spotify.com wrote:

Hi! I seem to be experiencing problems with charon in strongSwan 4.4.1. One problem is that charon sometimes fails to reinitiate SAs once they expire. I set up a testbed with 17 hosts to reproduce and track down the issue, as it takes some time for it to manifest. Since every host has several connections to the other peers in this IPsec setup, it is tricky to see which log entry is caused by which connection.
How can I single out the log entries from the affected/failing connections? How can I get a verbose status dump from charon showing what it thinks the status of all its connections is? I don't want to attach 16M of log files here; please advise which parts are useful, and I would appreciate tips on how to extract them.

The hosts that I currently see problems with are up:

root@taylor:~# fping annalise.ash.spotify.net annmarie.ash.spotify.net alvina.ash.spotify.net
annalise.ash.spotify.net is alive
annmarie.ash.spotify.net is alive
alvina.ash.spotify.net is alive

but ipsec statusall has no SA for them (see ipsec-statusall.txt). Please also find attached annalise's and taylor's ipsec.conf. The other hosts' ipsec.conf is equivalent. There is always one initiator for each connection.
[strongSwan] Problems with Charon
I've got a host-to-host connection that should be kept alive 24/7.

machine 1:

config setup
        plutostart=no           # IKEv1
        charonstart=yes         # IKEv2
        nat_traversal=no

# Add connections here.
# Sample VPN connections
conn %default
        ikelifetime=60m
        keylife=20m
        rekeymargin=3m
        keyingtries=%forever
        keyexchange=ikev2
        dpdaction=hold
        mobike=no

conn server1
        left=XX.X.XX.XX
        leftcert=server1-cert.pem
        left...@server1.xxx.com
        right=YY.YY.YY.YY
        right...@server2.xxx.com
        auto=start

server2:

config setup
        plutostart=no           # IKEv1
        charonstart=yes         # IKEv2
        nat_traversal=no

# Add connections here.
# Sample VPN connections
conn %default
        ikelifetime=60m
        keylife=20m
        rekeymargin=3m
        keyingtries=%forever
        keyexchange=ikev2
        dpdaction=clear
        mobike=no

conn server12
        left=YY.YY.YY.YY
        leftcert=server2-cert.pem
        left...@server2.xxx.com
        right=XX.XX.XX.XX
        right...@server1.xxx.com
        auto=add

When I start ipsec on both sides it works for a few minutes, then it simply stops, although the SAs are still alive:

server2[2]: ESTABLISHED 11 minutes ago, XX.XX.XX.XX[server1.XXX.com]...YY.YY.YY.YY[server2.XXX.com]
server2{2}:  INSTALLED, TUNNEL, ESP SPIs: cb043689_i c4ecff51_o
server2{2}:  XX.XX.XX.XX/32 === YY.YY.YY.YY/32

But no traffic flow can be established. The log gives me errors like these:

Sep  2 02:44:30 server1 charon: 11[KNL] querying policy failed: No such file or directory (2)

I have to restart the whole daemon on server1 to get the traffic flowing again... for a few minutes. Any ideas?
Re: [strongSwan] Problems with Charon
Hi,

are you running strongSwan on CentOS or RedHat? There is an issue with these Linux kernels where IPsec policies get deleted when they are queried, e.g. by ipsec statusall or DPD. I think this kernel bug was fixed recently by RedHat.

Best regards
Andreas

ServerAlex wrote:

I've got a host-to-host connection that should be kept alive 24/7.

machine 1:

config setup
        plutostart=no           # IKEv1
        charonstart=yes         # IKEv2
        nat_traversal=no

# Add connections here.
# Sample VPN connections
conn %default
        ikelifetime=60m
        keylife=20m
        rekeymargin=3m
        keyingtries=%forever
        keyexchange=ikev2
        dpdaction=hold
        mobike=no

conn server1
        left=XX.X.XX.XX
        leftcert=server1-cert.pem
        left...@server1.xxx.com
        right=YY.YY.YY.YY
        right...@server2.xxx.com
        auto=start

server2:

config setup
        plutostart=no           # IKEv1
        charonstart=yes         # IKEv2
        nat_traversal=no

# Add connections here.
# Sample VPN connections
conn %default
        ikelifetime=60m
        keylife=20m
        rekeymargin=3m
        keyingtries=%forever
        keyexchange=ikev2
        dpdaction=clear
        mobike=no

conn server12
        left=YY.YY.YY.YY
        leftcert=server2-cert.pem
        left...@server2.xxx.com
        right=XX.XX.XX.XX
        right...@server1.xxx.com
        auto=add

When I start ipsec on both sides it works for a few minutes, then it simply stops, although the SAs are still alive:

server2[2]: ESTABLISHED 11 minutes ago, XX.XX.XX.XX[server1.XXX.com]...YY.YY.YY.YY[server2.XXX.com]
server2{2}:  INSTALLED, TUNNEL, ESP SPIs: cb043689_i c4ecff51_o
server2{2}:  XX.XX.XX.XX/32 === YY.YY.YY.YY/32

But no traffic flow can be established. The log gives me errors like these:

Sep  2 02:44:30 server1 charon: 11[KNL] querying policy failed: No such file or directory (2)

I have to restart the whole daemon on server1 to get the traffic flowing again... for a few minutes. Any ideas?