Re: [strongSwan] problems with charon in 4.5.2 (was: 4.4.1)

2011-06-01 Thread Andreas Schuldei
Hi!

I have now run strongSwan 4.5.2 for two days, and it looks more stable than
4.4.1 on our testbed.

However, even 4.5.2 died tonight. The connection between alvina and
sarah went down, and attempts to reinitiate it failed.

I attach the output of grep alvina /var/log/daemon (on sarah) and vice versa.
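
As an alternative to plain grep, here is a minimal sketch of how the log lines could be grouped per connection. It assumes lines of the form shown in the excerpt further down in this mail, where every charon message carries a `name|id` prefix. Some setups may print that prefix inside angle brackets, so the pattern treats them as optional. The function name and the regular expression are illustrative only, not part of strongSwan.

```python
import re
from collections import defaultdict

# Assumed line layout, taken from the excerpt in this mail:
#   <timestamp> <host> charon: <thread>[<GROUP>] <conn-name>|<ike-sa-id> <message>
# The angle brackets around "conn-name|id" are optional (assumption).
LINE_RE = re.compile(
    r"^(?P<ts>\S+) (?P<host>\S+) charon: (?P<thread>\d+)\[(?P<group>[A-Z]+)\] "
    r"<?(?P<conn>[^\s|<>]+)\|(?P<sa>\d+)>? (?P<msg>.*)$"
)

def group_by_connection(lines):
    """Return {(conn_name, ike_sa_id): [(timestamp, group, message), ...]}."""
    grouped = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if m is None:
            continue  # not a charon line with a connection prefix
        key = (m["conn"], int(m["sa"]))
        grouped[key].append((m["ts"], m["group"], m["msg"]))
    return dict(grouped)
```

Feeding it an open log file (for example `group_by_connection(open("/var/log/daemon"))`) would give one list of messages per connection and IKE_SA id, which makes a single failing peer easier to follow.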

It seems alvina gives up on sarah when sarah does not respond. sarah in
turn has some issues where it freezes and becomes unresponsive for several
seconds. It could be that a rekeying fell into such a freeze. I haven't
got to the bottom of those freezes yet, but I would think that this could
happen in real-life situations, too. Of course I would like the two hosts
to try harder to re-establish their connection. Did something go wrong at
that point? How can I increase the number of reconnection attempts after
the loss of an SA? %forever sounds long to me, but hey. Should I just put
a really big number here?

For the record, this is what my config file for 4.5.2 looks like. Is
there anything else I can do for resilience?

config setup
        plutostart=no            # pluto is used for IKEv1

conn %default
        ikelifetime=3h           # strongSwan default
        lifetime=1h              # strongSwan default
        margintime=9m            # strongSwan default
        keyingtries=%forever     # strongSwan default
        mobike=no                # mobike is used for NAT traversal
        keyexchange=ikev2
        ike=aes128-sha1-modp2048
        esp=aes128-sha1-modp2048
        left=%defaultroute
        leftcert=host_server.crt
        type=transport           # should work just as well as tunnel, but with less overhead
        reauth=no                # recommended so that SAs are rekeyed, not reauthenticated

# Begin connection section

# For all connections, the peer with the host name
# that is first in a lexicographical sorting
# is selected as the initiator of the connection.
#for $peer in $peers

conn $host-$peer.name
        right=$peer.ip
        rightid=C=SE, O=Spotify, CN=$peer.name
        auto=start
#end for
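
To see when rekeying is due with these values, here is a small sketch of the arithmetic. It assumes the formula from the ipsec.conf documentation, rekeytime = lifetime - (margintime + random(0, margintime * rekeyfuzz)), and the documented default rekeyfuzz=100%, which the config above does not set explicitly. The function name is made up for illustration.

```python
def rekey_window(lifetime_s, margintime_s, rekeyfuzz=1.0):
    """Return (earliest, latest) start of rekeying, in seconds after SA setup.

    Assumes rekeytime = lifetime - (margintime + random(0, margintime * rekeyfuzz)).
    rekeyfuzz=1.0 corresponds to the assumed default of 100%.
    """
    earliest = lifetime_s - margintime_s * (1.0 + rekeyfuzz)
    latest = lifetime_s - margintime_s
    return earliest, latest

# CHILD_SA: lifetime=1h, margintime=9m
print(rekey_window(3600, 540))    # (2520.0, 3060), i.e. between 42 and 51 minutes
# IKE_SA: ikelifetime=3h, margintime=9m
print(rekey_window(10800, 540))   # (9720.0, 10260), i.e. between 162 and 171 minutes
```

Under these assumptions a host that freezes for a few seconds inside the CHILD_SA window could miss a rekey exchange.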


2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 04[NET] alvina.ash.spotify.net-sarah.sto.spotify.net|322 received packet: from 78.31.14.56[500] to 193.182.12.31[500]
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 04[ENC] alvina.ash.spotify.net-sarah.sto.spotify.net|322 parsed CREATE_CHILD_SA response 0 [ N(USE_TRANSP) SA No KE TSi TSr ]
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 04[IKE] alvina.ash.spotify.net-sarah.sto.spotify.net|322 received USE_TRANSPORT_MODE notify
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 04[IKE] alvina.ash.spotify.net-sarah.sto.spotify.net|322 CHILD_SA alvina.ash.spotify.net-sarah.sto.spotify.net{19} established with SPIs c68e90b3_i c1afe29b_o and TS 193.182.12.31/32 === 78.31.14.56/32
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 04[IKE] alvina.ash.spotify.net-sarah.sto.spotify.net|322 reinitiating already active tasks
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 04[IKE] alvina.ash.spotify.net-sarah.sto.spotify.net|322   CHILD_REKEY task
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 04[IKE] alvina.ash.spotify.net-sarah.sto.spotify.net|322 closing CHILD_SA alvina.ash.spotify.net-sarah.sto.spotify.net{19} with SPIs cd8f72b1_i (44485206 bytes) c20dc52a_o (25465004 bytes) and TS 193.182.12.31/32 === 78.31.14.56/32
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 04[IKE] alvina.ash.spotify.net-sarah.sto.spotify.net|322 sending DELETE for ESP CHILD_SA with SPI cd8f72b1
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 04[ENC] alvina.ash.spotify.net-sarah.sto.spotify.net|322 generating INFORMATIONAL request 1 [ D ]
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 04[NET] alvina.ash.spotify.net-sarah.sto.spotify.net|322 sending packet: from 193.182.12.31[500] to 78.31.14.56[500]
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 14[NET] alvina.ash.spotify.net-sarah.sto.spotify.net|322 received packet: from 78.31.14.56[500] to 193.182.12.31[500]
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 14[ENC] alvina.ash.spotify.net-sarah.sto.spotify.net|322 parsed INFORMATIONAL response 1 [ D ]
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 14[IKE] alvina.ash.spotify.net-sarah.sto.spotify.net|322 received DELETE for ESP CHILD_SA with SPI c20dc52a
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 14[IKE] alvina.ash.spotify.net-sarah.sto.spotify.net|322 CHILD_SA closed
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 14[IKE] alvina.ash.spotify.net-sarah.sto.spotify.net|322 activating new tasks
2011-05-31T22:39:26.000+00:00 alvina.ash.spotify.net charon: 14[IKE] alvina.ash.spotify.net-sarah.sto.spotify.net|322 nothing to initiate
2011-05-31T23:21:47.000+00:00 alvina.ash.spotify.net charon: 14[NET] alvina.ash.spotify.net-sarah.sto.spotify.net|322 received packet: from 78.31.14.56[500] to

Re: [strongSwan] problems with charon in 4.5.2 (was: 4.4.1)

2011-06-01 Thread Martin Willi
Hi,

> Of course i would like the two hosts to try harder to re-establish
> their connection again. did something go wrong at that point? how can i
> increase the number of reconnection attempts in case of loss of SA?
> %forever sounds long to me, but hey. should i just put a really big
> number here?

The keyingtries parameter in IKEv2 applies to initial connection setup
only, but not for ordinary exchanges. To try reestablishing the tunnel,
you could enable dpdaction=restart (but optionally with a dpddelay=0).
Then the keyingtries parameter will apply when the tunnel is restarted
after an ordinary exchange times out.

The problem with dpdaction is: We currently use the same value as the
so-called close-action, defining what we should do if the remote end
closes the tunnel. And this is problematic if you enforce duplicate
checking, i.e. with uniqueids. We really should split up these two
parameters to be separately configurable.

If you'd like to try this approach, you should consider the attached
patch to disable the close-action for now.
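
As a sketch of the suggestion above, the two options could be added to the conn %default section of the ipsec.conf quoted in this thread. Only the two dpd lines are new; everything else is unchanged. Whether dpddelay=0 is the right choice for a given setup is left open, since it is described above as optional.

```
conn %default
        keyingtries=%forever     # applies again once the tunnel is restarted
        dpdaction=restart        # restart the tunnel when an exchange times out
        dpddelay=0               # optional, as described above
```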


Another variant to realize always-up tunnels is to have a routed
policy, and establish tunnels on traffic. If the tunnel fails, it will
get reestablished. There are some problems with policy management (as
the kernel does not support identical policies), hence I currently can't
recommend it. Tobias has tried to solve these issues at [1], I'll do
some testing with his work to see if this could be an option for you.

Best regards
Martin

[1] http://git.strongswan.org/?p=strongswan.git;a=shortlog;h=refs/heads/policy-history

From 7cb42fc160e4ab77638955c88440f8e70adb51b6 Mon Sep 17 00:00:00 2001
From: Martin Willi mar...@revosec.ch
Date: Wed, 1 Jun 2011 16:38:52 +0200
Subject: [PATCH] Disable CHILD_SA close_action

---
 src/libcharon/plugins/stroke/stroke_config.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/src/libcharon/plugins/stroke/stroke_config.c b/src/libcharon/plugins/stroke/stroke_config.c
index 2b31643..ba7d9c2 100644
--- a/src/libcharon/plugins/stroke/stroke_config.c
+++ b/src/libcharon/plugins/stroke/stroke_config.c
@@ -824,7 +824,7 @@ static child_cfg_t *build_child_cfg(private_stroke_config_t *this,
 	child_cfg = child_cfg_create(
 				msg->add_conn.name, lifetime,
 				msg->add_conn.me.updown, msg->add_conn.me.hostaccess,
-				msg->add_conn.mode, ACTION_NONE, dpd, dpd, msg->add_conn.ipcomp,
+				msg->add_conn.mode, ACTION_NONE, dpd, ACTION_NONE, msg->add_conn.ipcomp,
 				msg->add_conn.inactivity, msg->add_conn.reqid,
 				mark_in, mark_out, msg->add_conn.tfc);
 	child_cfg->set_mipv6_options(child_cfg, msg->add_conn.proxy_mode,
-- 
1.7.4.1

___
Users mailing list
Users@lists.strongswan.org
https://lists.strongswan.org/mailman/listinfo/users

Re: [strongSwan] problems with charon in 4.4.1

2011-05-27 Thread Andreas Schuldei
The test setup survived the night (I don't know if there were
problems during the night, but if there were, they self-healed, which
is almost as good). This morning, though, there were again several hosts
without an SA in ESTABLISHED state (according to ipsec statusall).

It centered around fiona again. After running ipsec down
$connection-name; ipsec up $connection-name things worked again,
except for the connection between fiona and grazyna.lon.spotify.net
(which are in the same local network, where I don't expect any packet
loss and latency is very low). The SA was set up again, but the two hosts
were unable to ping each other or transfer test data. We don't fiddle
with iptables, so I expect the xfrm policy or state went south. So today
I also uploaded the output of ( ip xfrm policy show; ip xfrm state
show ) > /tmp/xfrm-policy-and-state.dump

Unfortunately I forgot to set the charon logging of the cfg module to
2, as I had intended. charon 4.4.1 does not seem to know the keyword
in the config file, but I had intended to set it with stroke. Doh! I
did, however, change the configuration to use auto=route and
dpdaction=hold, in order to make the setup more resilient and somewhat
self-healing.

Could you please check whether you can see anything fishy in the logs
regardless? Note that this time we don't just have a failure to
negotiate an SA, but even a failure to transmit any payload once it is
there, so this is something new and somewhat more scary (that depends on
your view, I guess...).

the logs and dumps are at

http://origin.scdn.co/u/wp/fiona.lon.spotify.net-charon.log
http://origin.scdn.co/u/wp/fiona.lon.spotify.net-xfrm-policy-and-state.dump
http://origin.scdn.co/u/wp/grazyna.lon.spotify.net-charon.log
http://origin.scdn.co/u/wp/grazyna.lon.spotify.net-xfrm-policy-and-state.dump


thanks!
/andreas




Re: [strongSwan] problems with charon in 4.4.1

2011-05-26 Thread Andreas Schuldei
On Wed, May 25, 2011 at 8:49 AM, Andreas Schuldei
schuldei+strongs...@spotify.com wrote:
> Now I have uploaded new logs from taylor and aldona. The two dropped their
> SA sometime after 2011-05-24T21:48:21 (that is the last good SA
> negotiation I can see in the logs) and didn't manage to establish a new
> one.
>
> Could someone please look at the logs and tell me if I can do anything
> about this failure (by choosing different config options)? The
> configuration files are unchanged since my first mail.
>
> Please find the log files at
> http://origin.scdn.co/u/wp/aldona.ash.spotify.net-charon.log
> http://origin.scdn.co/u/wp/taylor.sto.spotify.net-charon.log
>
> They are smaller this time, and the timestamps might make it easier, too. :-)

Yesterday I changed (per the suggestions on IRC) the ipsec.conf to say
reauth=no, to make the connections less prone to reauthentication
issues (and also switched to transport mode). Then I restarted
everything and also extended the testbed a little, so that we have 23
machines sending random traffic to each other through ipsec
continuously. (-> 253 host-to-host connections :-)
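
The connection count is simply the number of unordered host pairs, n * (n - 1) / 2. A quick sketch to check the figure; the host names are placeholders, not the real testbed names:

```python
from itertools import combinations

def host_to_host_connections(hosts):
    """Return every unordered pair of hosts, i.e. one connection per pair."""
    return list(combinations(sorted(hosts), 2))

# Placeholder names standing in for the 23 testbed machines.
hosts = [f"host{i:02d}" for i in range(23)]
print(len(host_to_host_connections(hosts)))  # 253 == 23 * 22 // 2
```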

i uploaded new log files of a failure now, which centers around fiona.
aldona, alejandra, alvina, amber and annmarie failed to re-establish
their SAs to fiona. annmarie for example set up its last SA at 21:37
and then stopped talking to fiona altogether. other traffic to other
hosts goes on as before.

please check out

http://origin.scdn.co/u/wp/aldona.ash.spotify.net-charon.log
http://origin.scdn.co/u/wp/alejandra.ash.spotify.net-charon.log
http://origin.scdn.co/u/wp/alvina.ash.spotify.net-charon.log
http://origin.scdn.co/u/wp/amber.lon.spotify.net-charon.log
http://origin.scdn.co/u/wp/annmarie.ash.spotify.net-charon.log
http://origin.scdn.co/u/wp/fiona.lon.spotify.net-charon.log

as well as their IPsec config files that are now generated with this template:

$comment

config setup
        plutostart=no            # pluto is used for IKEv1

conn %default
        ikelifetime=3h           # strongSwan default
        lifetime=1h              # strongSwan default
        margintime=9m            # strongSwan default
        keyingtries=%forever     # strongSwan default
        mobike=no                # mobike is used for NAT traversal
        keyexchange=ikev2
        ike=aes128-sha1-modp2048
        esp=aes128-sha1-modp2048
        left=%defaultroute
        leftcert=host_server.crt
        type=transport           # should work just as well as tunnel, but with less overhead
        reauth=no                # recommended so that SAs are rekeyed, not reauthenticated


# Begin connection section

# For all connections, the peer with the host name
# that is first in a lexicographical sorting
# is selected as the initiator of the connection.
#for $peer in $peers

conn $host-$peer.name
        right=$peer.ip
        rightid=C=SE, O=Spotify, CN=$peer.name
        #if $peer.initiator
        auto=start
        dpdaction=restart
        #else
        auto=add
        dpdaction=clear
        #end if
#end for
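
For readers without the template engine at hand, the per-peer logic can be sketched as follows. This is only an illustration of the rule in the comment, under the assumption that "first in a lexicographical sorting" means a plain string comparison of the two host names and that the initiating side gets auto=start. The function name and the way the initiator flag is derived are made up for this sketch and are not how the real template computes $peer.initiator.

```python
def render_conn(host, peer_name, peer_ip):
    """Render one conn section of ipsec.conf for `host` talking to `peer_name`."""
    # Assumption: the host whose name sorts first initiates the connection.
    local_initiates = host < peer_name
    if local_initiates:
        auto, dpdaction = "start", "restart"
    else:
        auto, dpdaction = "add", "clear"
    return (
        f"conn {host}-{peer_name}\n"
        f"        right={peer_ip}\n"
        f"        rightid=C=SE, O=Spotify, CN={peer_name}\n"
        f"        auto={auto}\n"
        f"        dpdaction={dpdaction}\n"
    )

print(render_conn("alvina.ash.spotify.net", "sarah.sto.spotify.net", "78.31.14.56"))
```

With this rule exactly one side of each pair ends up with auto=start, so there is always one initiator per connection.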


Regarding the remnants of xfrm policy after /etc/init.d/ipsec stop: is
that a sign that charon's cleanup at shutdown went wrong? I also
see that the xfrm kernel modules are still heavily used (with a usage
count of ~60 on some machines) when charon was stopped and no SAs are
active any more. How can I see with lsof (or similar tools) what
userspace (or kernel) stuff uses them?



Re: [strongSwan] problems with charon in 4.4.1

2011-05-24 Thread Andreas Schuldei
On Tue, May 24, 2011 at 8:48 AM, Andreas Schuldei
schuldei+strongs...@spotify.com wrote:
> On Mon, May 23, 2011 at 11:44 PM, Andreas Steffen
> andreas.stef...@strongswan.org wrote:
>> Hello Andreas,
>>
>> debugging this many connections might be easier using the
>> condensed /var/log/auth.log which has the following entries:
>>
>> http://www.strongswan.org/uml/testresults45/ikev2/dpd-restart/carol.auth.log
>
> The auth.log was still huge on taylor.
>
> I attempted to start from a clean slate today and did this on all
> machines in the testbed:
>
>         /etc/init.d/ipsec stop
>         rm -f /var/run/charon.pid /var/run/starter.pid /var/run/charon.ctl
>         /etc/init.d/ipsec stop
>         logrotate -f /etc/logrotate.conf
>         ip xfrm policy flush
>         /etc/network/if-up.d/ssh-outside-ipsec  # adds xfrm policies so that UDP port 500 and ssh traffic do NOT go through ipsec
>         /etc/init.d/ipsec start
>
> And again taylor immediately got problems with the three hosts, just
> like yesterday. We don't have additional firewall rules that limit
> traffic between these hosts. Other hosts in the ash.spotify.net domain
> don't have problems either.
> Can something else get confused?
> Is there more state somewhere?

Do I need to unload the xfrm modules?
The connections between hosts, once turned bad, remained bad until I
rebooted the machines in question. Since then (the last few hours) it
works nicely. But rebooting is not a real option, of course, and
connections going into an unrecoverable state is not so good
either.


Re: [strongSwan] problems with charon in 4.4.1

2011-05-23 Thread Andreas Steffen
Hello Andreas,

I just analyzed the first part of the alvina.ash.spotify.net log
file and I see that of the 15 initiated IKE_SAs only 4 succeed in
the first round. Are there connection problems to the other 11 hosts,
are some of the peers not online yet or is the computing power of the
hosts so small that they cannot handle more than 4 IKE_SAs without
multiple retransmission rounds?

Regards

Andreas

On 05/23/2011 08:14 PM, Andreas Schuldei wrote:
> The charon log files for these four hosts are available for download here:
> http://origin.scdn.co/u/wp/alvina.ash.spotify.net-charon.log.gz
> http://origin.scdn.co/u/wp/annalise.ash.spotify.net-charon.log.gz
> http://origin.scdn.co/u/wp/annmarie.ash.spotify.net-charon.log.gz
> http://origin.scdn.co/u/wp/taylor.sto.spotify.net-charon.log.gz
>
>
> On Mon, May 23, 2011 at 2:46 PM, Andreas Schuldei
> schuldei+strongs...@spotify.com wrote:
>> Hi!
>>
>> I seem to be experiencing problems with charon in strongSwan 4.4.1.
>>
>> One problem is that charon sometimes fails to reinitiate SAs once
>> they expire. I set up a testbed with 17 hosts to reproduce and track
>> down the issue, as it takes some time for it to manifest.
>>
>> Since every host has several connections to the other peers in this
>> ipsec setup, it is tricky to see which log entry is caused by which
>> connection. How can I single out the log entries from the
>> affected/failing connections? How can I get a verbose status dump from
>> charon showing what it thinks the status is of all the connections it
>> keeps track of?
>> I don't want to attach 16M of log files here. Please advise which parts
>> are useful, and I would appreciate tips on how to extract those.
>>
>> The hosts that I currently see problems with are up:
>>
>> root@taylor:~# fping annalise.ash.spotify.net annmarie.ash.spotify.net alvina.ash.spotify.net
>> annalise.ash.spotify.net is alive
>> annmarie.ash.spotify.net is alive
>> alvina.ash.spotify.net is alive
>>
>> but ipsec statusall has no SA for them. (see ipsec-statusall.txt)
>>
>> Please also find attached annalise's and taylor's ipsec.conf. The other
>> hosts' ipsec.conf is equivalent. There is always one initiator for
>> each connection.
 


-- 
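======================================================================
Andreas Steffen                         andreas.stef...@strongswan.org
strongSwan - the Linux VPN Solution!                www.strongswan.org
Institute for Internet Technologies and Applications
University of Applied Sciences Rapperswil
CH-8640 Rapperswil (Switzerland)
===========================================================[ITA-HSR]==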
[  ] alvina.ash.spotify.net-aldona.ash.spotify.net
[  ] alvina.ash.spotify.net-alejandra.ash.spotify.net
[ 1] alvina.ash.spotify.net-amber.lon.spotify.net      78.31.10.34
[ 2] alvina.ash.spotify.net-annalise.ash.spotify.net   193.182.12.46
[ 3] alvina.ash.spotify.net-annmarie.ash.spotify.net   193.182.12.49
[ 4] alvina.ash.spotify.net-dorothy.ash.spotify.net    193.182.12.147
[ 5] alvina.ash.spotify.net-fiona.lon.spotify.net      78.31.10.48
[ 6] alvina.ash.spotify.net-gordana.sto.spotify.net    78.31.14.162    {1}
[ 7] alvina.ash.spotify.net-grazyna.lon.spotify.net    78.31.10.243
[ 8] alvina.ash.spotify.net-lillian.lon.spotify.net    78.31.10.84     {2}
[ 9] alvina.ash.spotify.net-marissa.sto.spotify.net    78.31.14.98     {3}
[10] alvina.ash.spotify.net-parody.sto.spotify.net     78.31.14.164
[11] alvina.ash.spotify.net-renate.lon.spotify.net     78.31.10.210
[12] alvina.ash.spotify.net-sarah.sto.spotify.net      78.31.14.56     {4}
[13] alvina.ash.spotify.net-savannah.sto.spotify.net   78.31.14.58
[14] alvina.ash.spotify.net-sibylla.sto.spotify.net    78.31.14.131
[15] alvina.ash.spotify.net-taylor.sto.spotify.net     78.31.14.85


Re: [strongSwan] problems with charon in 4.4.1

2011-05-23 Thread Andreas Steffen
Hello Andreas,

debugging this many connections might be easier using the
condensed /var/log/auth.log which has the following entries:

http://www.strongswan.org/uml/testresults45/ikev2/dpd-restart/carol.auth.log

Regards

Andreas



-- 
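======================================================================
Andreas Steffen                         andreas.stef...@strongswan.org
strongSwan - the Linux VPN Solution!                www.strongswan.org
Institute for Internet Technologies and Applications
University of Applied Sciences Rapperswil
CH-8640 Rapperswil (Switzerland)
===========================================================[ITA-HSR]==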



Re: [strongSwan] problems with charon in 4.4.1

2011-05-23 Thread Andreas Schuldei
ipsec was started by puppet. That means that the connections were
initiated over an interval of about 30 min.
When I checked later on, I discovered that some hosts did not get their
initial puppet trigger for some reason.
Our physical networks are quite good; we don't see packet loss or the like
within our sites. Between the sites (ash, lon, sto) we go through the
wild internet, and occasional connection issues could happen. They are
not the rule, though. All servers are real, high-powered servers;
none of them is too puny to get ipsec negotiations right on the first
try. :-)


On Mon, May 23, 2011 at 11:38 PM, Andreas Steffen
andreas.stef...@strongswan.org wrote:
> Hello Andreas,
>
> I just analyzed the first part of the alvina.ash.spotify.net log
> file and I see that of the 15 initiated IKE_SAs only 4 succeed in
> the first round. Are there connection problems to the other 11 hosts,
> are some of the peers not online yet or is the computing power of the
> hosts so small that they cannot handle more than 4 IKE_SAs without
> multiple retransmission rounds?
>
> Regards
>
> Andreas


[strongSwan] Problems with Charon

2009-09-01 Thread ServerAlex
I've got a host-to-host connection that should be kept alive 24/7.

machine 1:
config setup
        plutostart=no   # IKEv1
        charonstart=yes # IKEv2
        nat_traversal=no

# Add connections here.

# Sample VPN connections
conn %default
        ikelifetime=60m
        keylife=20m
        rekeymargin=3m
        keyingtries=%forever
        keyexchange=ikev2
        dpdaction=hold
        mobike=no

conn server1
        left=XX.X.XX.XX
        leftcert=server1-cert.pem
        left...@server1.xxx.com
        right=YY.YY.YY.YY
        right...@server2.xxx.com
        auto=start

server2:
config setup
        plutostart=no   # IKEv1
        charonstart=yes # IKEv2
        nat_traversal=no

# Add connections here.

# Sample VPN connections
conn %default
        ikelifetime=60m
        keylife=20m
        rekeymargin=3m
        keyingtries=%forever
        keyexchange=ikev2
        dpdaction=clear
        mobike=no

conn server12
        left=YY.YY.YY.YY
        leftcert=server2-cert.pem
        left...@server2.xxx.com
        right=XX.XX.XX.XX
        right...@server1.xxx.com
        auto=add


When I start ipsec on both sides it works for a few minutes, then it
just stops working, although the SAs are still alive:
server2[2]: ESTABLISHED 11 minutes ago, XX.XX.XX.XX[server1.XXX.com]...YY.YY.YY.YY[server2.XXX.com]
server2{2}:  INSTALLED, TUNNEL, ESP SPIs: cb043689_i c4ecff51_o
server2{2}:   XX.XX.XX.XX/32 === YY.YY.YY.YY/32

But no traffic flows. The logs give me errors like this one:
Sep  2 02:44:30 server1 charon: 11[KNL] querying policy failed: No such file or directory (2)

I have to restart the whole daemon on server1 to get the traffic
flowing again... for a few minutes.

Any ideas?


Re: [strongSwan] Problems with Charon

2009-09-01 Thread Andreas Steffen
Hi,

are you running strongSwan on CentOS or RedHat? There is an issue with
these Linux kernels where IPsec policies get deleted when they are
queried e.g. by ipsec statusall or DPD. I think this kernel bug was
fixed recently by RedHat.

Best regards

Andreas


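======================================================================
Andreas Steffen                         andreas.stef...@strongswan.org
strongSwan - the Linux VPN Solution!                www.strongswan.org
Institute for Internet Technologies and Applications
University of Applied Sciences Rapperswil
CH-8640 Rapperswil (Switzerland)
===========================================================[ITA-HSR]==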
