
Re: [strongSwan] Best practices regarding monitoring

2017-06-18 Thread Martin Willi

Hi Peter

> So, am I correct to assume that you guys usually evaluate the output
> of `ipsec statusall`

Preferably I'd do that over vici [1], as it provides a much better
interface for various languages to query tunnel status or re-initiate
tunnels.
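
A minimal sketch of what such a vici-based check could look like in Python. It assumes the `vici` package from PyPI and that `list_sas()` yields one dictionary per IKE_SA with a `child-sas` section carrying `state` and `use-in` (seconds since the last inbound packet); `stale_child_sas`, `query_charon` and the 300 second threshold are hypothetical names and values chosen for illustration.

```python
from typing import Any, Dict, Iterable, List


def _num(value: Any) -> int:
    """vici values usually arrive as bytes (e.g. b"1145"); accept str/int too."""
    if isinstance(value, bytes):
        value = value.decode()
    return int(value)


def stale_child_sas(sas: Iterable[Dict[str, Any]], max_idle_in: int = 300) -> List[str]:
    """Names of INSTALLED CHILD_SAs without inbound traffic for max_idle_in seconds."""
    stale = []
    for entry in sas:                       # one dict per IKE_SA, keyed by connection name
        for ike in entry.values():
            for name, child in ike.get("child-sas", {}).items():
                state = child.get("state", b"")
                if isinstance(state, bytes):
                    state = state.decode()
                if state != "INSTALLED":
                    continue
                # Assumption: "use-in" is missing if no inbound packet was ever seen.
                if "use-in" not in child or _num(child["use-in"]) > max_idle_in:
                    stale.append(name)
    return stale


def query_charon() -> List[str]:
    """Query the running daemon; needs the vici plugin and socket access."""
    import vici  # imported lazily so stale_child_sas() works without the package
    return stale_child_sas(vici.Session().list_sas())
```

An idle inbound counter only indicates a problem if the peer is expected to send traffic, so a check like this probably makes most sense combined with some periodic probe through the tunnel.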

> Do you simply send pings to remote systems "behind" the VPN?

Actually, out-of-sync state is quite uncommon, at least with IKEv2. If
your peer loses CHILD_SAs but happily answers DPD/liveness checks
on IKE, there is probably a bug somewhere. If a peer deletes a
CHILD_SA, it must signal that over IKE, hence its peer should notice
that. Even complex rekey collisions are actually defined, but probably
not all implementations handle them correctly. Also, you might consider
updating to 5.5.x, which brings some additional improvements regarding
collision exchanges.

> If there is no DPD that uses CHILD_SAs, there might be nothing else
> that you can do.

There isn't, as from a protocol level this is not needed in IKEv2 due
to the strict state synchronization it provides. Of course you could
use a short CHILD_SA rekeying interval to check its liveness, but that
isn't an optimal solution, either.

Regards
Martin

[1]https://wiki.strongswan.org/projects/strongswan/wiki/Vici

Re: [strongSwan] Best practices regarding monitoring

2017-06-14 Thread Peter Hofmann

Hi,

On Fri, Jun 09, 2017 at 09:11:27PM +0200, Noel Kuntze wrote:
> Besides DPD, there's no standard that charon implements for that. I am
> also not aware of any that uses CHILD_SAs.

alright, too bad. :-/

So, am I correct to assume that you guys usually evaluate the output of
`ipsec statusall` and maybe `ip xfrm {state,policy}` to implement
monitoring? Do you simply send pings to remote systems "behind" the VPN?

(If there is no DPD that uses CHILD_SAs, there might be nothing else
that you can do.)

> Huh? Check `ip xfrm state` and `ip xfrm policy`, they give you the SAD and 
> SPD.
> Also check if you receive any ESP packets and what their SPIs are.

`ip xfrm state` shows the same SPIs as `ipsec statusall` does. Policies
look fine, too. With tcpdump, I can see outgoing encrypted traffic that
uses the correct SPIs (and we can decrypt that traffic using Wireshark
and the keys shown by `ip xfrm state`). No incoming ESP traffic, though.

> I think the much more plausible cases are the following:
> 1) Kernel does not send expiration messages to charon when an SA soft or hard expires
> 2) Something in between drops the ESP traffic. Maybe there's a problem with a stateful firewall? iptables rules?

As for #1: How can I check that? I assume that `ip xfrm state` would not
show any SAs but `ipsec statusall` still shows them, right?

As for #2: Totally possible. We always check our firewalls, but traffic
may still be dropped on the remote end.
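
The comparison for #1 could be automated roughly like this: a sketch that extracts the ESP SPIs from `ip xfrm state` and reports any SPI the daemon lists that the kernel no longer has. The function names are hypothetical, the SPI values in the usage note are made up, and the regular expression assumes the usual `proto esp spi 0x...` line format of iproute2.

```python
import re
import subprocess
from typing import Iterable, List, Set

# Matches lines such as "proto esp spi 0xc0112233 reqid 1 mode tunnel"
SPI_RE = re.compile(r"\bproto\s+esp\s+spi\s+0x([0-9a-fA-F]+)")


def kernel_esp_spis(xfrm_output: str) -> Set[int]:
    """All ESP SPIs found in the text printed by `ip xfrm state`."""
    return {int(m.group(1), 16) for m in SPI_RE.finditer(xfrm_output)}


def missing_in_kernel(daemon_spis: Iterable[str], xfrm_output: str) -> List[str]:
    """SPIs (hex strings as shown by the daemon) that are absent from the kernel SAD."""
    wanted = {int(spi, 16) for spi in daemon_spis}
    return sorted(format(spi, "08x") for spi in wanted - kernel_esp_spis(xfrm_output))


def read_xfrm_state() -> str:
    """Run `ip xfrm state`; usually requires root."""
    return subprocess.run(["ip", "xfrm", "state"],
                          capture_output=True, text=True, check=True).stdout
```

For example, `missing_in_kernel(["c0112233", "c0445566"], read_xfrm_state())` would return an empty list as long as both SAs are still installed in the kernel.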

Don't get me wrong, though. I only posted one exemplary scenario that we
see with one of our IPSec peers. It illustrates nicely that our
strongswan/charon/kernel looks like it's working fine, but still, no
response from the remote peer until we do a "service strongswan
restart". I understand that I may not have posted all required
information to debug this particular issue, simply because that's not
what I'm after. :-)

At the end of the day, we have to work closely with the admins of our
remote peers to fix the individual issues. We're not able to reliably
*detect* them, though. Any suggestions are highly appreciated.

Thanks!
Peter

Re: [strongSwan] Best practices regarding monitoring

2017-06-09 Thread Noel Kuntze

Hello Peter,

On 09.06.2017 11:46, Peter Hofmann wrote:
> Hi,
> 
> we're running various Ubuntu systems with StrongSwan 5.1 or 5.3. Each
> system connects to exactly one IPSec/IKE peer. We usually don't know
> what kind of peer that is -- is it also running StrongSwan, is it a
> hardware firewall, does it run OpenBSD, ... ? No idea. No way of
> retrieving log files. They're all black boxes to us. Okay.
> 
> Now, the big question is: How to monitor IPSec connectivity?

Ask the administrator of the remote peer for some service that you can use to
check connectivity. Besides DPD, there's no standard that charon implements for
that. I am also not aware of any that uses CHILD_SAs.

> 
> It's easy to check if there are IKE SAs. It's also not a big deal to
> check if there are CHILD SAs. We can do that. However, checking that is
> not enough.
> 
> Let me give you an example.
> 
> Here's some output of "ipsec statusall":
> 
> Status of IKE charon daemon (strongSwan 5.1.2, Linux 3.13.0-67-generic, x86_64):
>   uptime: 5 days, since Jun 02 11:51:14 2017
>   malloc: sbrk 1511424, mmap 0, used 343856, free 1167568
>   worker threads: 11 of 16 idle, 5/0/0/0 working, job queue: 0/0/0/0, scheduled: 84
>   loaded plugins: charon test-vectors aes rc2 sha1 sha2 md4 md5 rdrand random nonce x509 revocation constraints pkcs1 pkcs7 pkcs8 pkcs12 pem openssl xcbc cmac hmac ctr ccm gcm attr kernel-netlink resolve socket-default stroke updown eap-identity addrblock
> Listening IP addresses:
>   10.1.2.3
>   $public_IP
> Connections:
>   peer_1:  $public_IP...$peer_IP  IKEv2
>   peer_1:   local:  [$public_IP] uses pre-shared key authentication
>   peer_1:   remote: [$peer_IP] uses pre-shared key authentication
>   peer_1:   child:  192.168.23.24/32 === 192.168.100.200/32 TUNNEL
> Routed Connections:
>   peer_1{1}:  ROUTED, TUNNEL
>   peer_1{1}:   192.168.23.24/32 === 192.168.100.200/32
> Security Associations (1 up, 0 connecting):
>   peer_1[79]: ESTABLISHED 82 minutes ago, $public_IP[$public_IP]...$peer_IP[$peer_IP]
>   peer_1[79]: IKEv2 SPIs: 1234567890_i abcdefghi_r*, rekeying disabled
>   peer_1[79]: IKE proposal: AES_CBC_256/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/MODP_8192
>   peer_1{1}:  INSTALLED, TUNNEL, ESP SPIs: c112233_i c445566_o
>   peer_1{1}:  AES_CBC_256/HMAC_SHA2_256_128, 49208 bytes_i (239 pkts, 1145s ago), 59836 bytes_o (491 pkts, 14s ago), rekeying disabled
>   peer_1{1}:   192.168.23.24/32 === 192.168.100.200/32
> 
> Looks fine, doesn't it? Except 192.168.100.200 does not respond.
> tcpdump shows that we properly encrypt our traffic using those exact
> SPIs and everything. On our end, everything looks fine. But our peer
> simply ignores our encrypted traffic. It's as if our peer has
> "forgotten" about those SPIs.

Huh? Check `ip xfrm state` and `ip xfrm policy`, they give you the SAD and SPD.
Also check if you receive any ESP packets and what their SPIs are.
I think the much more plausible cases are the following:
1) Kernel does not send expiration messages to charon when an SA soft or hard expires
2) Something in between drops the ESP traffic. Maybe there's a problem with a stateful firewall? iptables rules?

See above.

Kind regards

Noel




[strongSwan] Best practices regarding monitoring

2017-06-09 Thread Peter Hofmann

Hi,

we're running various Ubuntu systems with StrongSwan 5.1 or 5.3. Each
system connects to exactly one IPSec/IKE peer. We usually don't know
what kind of peer that is -- is it also running StrongSwan, is it a
hardware firewall, does it run OpenBSD, ... ? No idea. No way of
retrieving log files. They're all black boxes to us. Okay.

Now, the big question is: How to monitor IPSec connectivity?

It's easy to check if there are IKE SAs. It's also not a big deal to
check if there are CHILD SAs. We can do that. However, checking that is
not enough.

Let me give you an example.

Here's some output of "ipsec statusall":

Status of IKE charon daemon (strongSwan 5.1.2, Linux 3.13.0-67-generic, x86_64):
  uptime: 5 days, since Jun 02 11:51:14 2017
  malloc: sbrk 1511424, mmap 0, used 343856, free 1167568
  worker threads: 11 of 16 idle, 5/0/0/0 working, job queue: 0/0/0/0, scheduled: 84
  loaded plugins: charon test-vectors aes rc2 sha1 sha2 md4 md5 rdrand random nonce x509 revocation constraints pkcs1 pkcs7 pkcs8 pkcs12 pem openssl xcbc cmac hmac ctr ccm gcm attr kernel-netlink resolve socket-default stroke updown eap-identity addrblock
Listening IP addresses:
  10.1.2.3
  $public_IP
Connections:
  peer_1:  $public_IP...$peer_IP  IKEv2
  peer_1:   local:  [$public_IP] uses pre-shared key authentication
  peer_1:   remote: [$peer_IP] uses pre-shared key authentication
  peer_1:   child:  192.168.23.24/32 === 192.168.100.200/32 TUNNEL
Routed Connections:
  peer_1{1}:  ROUTED, TUNNEL
  peer_1{1}:   192.168.23.24/32 === 192.168.100.200/32
Security Associations (1 up, 0 connecting):
  peer_1[79]: ESTABLISHED 82 minutes ago, $public_IP[$public_IP]...$peer_IP[$peer_IP]
  peer_1[79]: IKEv2 SPIs: 1234567890_i abcdefghi_r*, rekeying disabled
  peer_1[79]: IKE proposal: AES_CBC_256/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/MODP_8192
  peer_1{1}:  INSTALLED, TUNNEL, ESP SPIs: c112233_i c445566_o
  peer_1{1}:  AES_CBC_256/HMAC_SHA2_256_128, 49208 bytes_i (239 pkts, 1145s ago), 59836 bytes_o (491 pkts, 14s ago), rekeying disabled
  peer_1{1}:   192.168.23.24/32 === 192.168.100.200/32

Looks fine, doesn't it? Except 192.168.100.200 does not respond.
tcpdump shows that we properly encrypt our traffic using those exact
SPIs and everything. On our end, everything looks fine. But our peer
simply ignores our encrypted traffic. It's as if our peer has
"forgotten" about those SPIs.

If you look closely, you can see that there's outgoing traffic, but no
incoming traffic:

peer_1{1}: ... 49208 bytes_i (239 pkts, 1145s ago), 59836 bytes_o (491 pkts, 14s ago)
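
This asymmetry can be picked up mechanically. The following is only a sketch: the regular expression is inferred from the sample line above (the format may differ between strongSwan versions), and `last_use`, `looks_one_way` and both thresholds are hypothetical.

```python
import re
from typing import Dict, Optional, Tuple

# "<n> bytes_i (<n> pkts, <n>s ago)"; the parenthesised part is treated as optional
COUNTER_RE = re.compile(r"(\d+) bytes_([io])(?: \((\d+) pkts?, (\d+)s ago\))?")


def last_use(line: str) -> Tuple[Optional[int], Optional[int]]:
    """Seconds since the last inbound and outbound packet; None if not reported."""
    ages: Dict[str, Optional[int]] = {"i": None, "o": None}
    for _bytes, direction, _pkts, age in COUNTER_RE.findall(line):
        ages[direction] = int(age) if age else None
    return ages["i"], ages["o"]


def looks_one_way(line: str, max_in_age: int = 300, max_out_age: int = 60) -> bool:
    """True if we sent recently but have not received anything for a while."""
    age_in, age_out = last_use(line)
    if age_out is None or age_out > max_out_age:
        return False  # nothing sent recently, so inbound silence proves nothing
    return age_in is None or age_in > max_in_age
```

With the line above, `last_use()` yields 1145 and 14 seconds, which `looks_one_way()` flags with these thresholds. Sensible thresholds depend on how chatty the traffic through the tunnel normally is.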

Reinitiating the entire connection (essentially, doing "service
strongswan restart") fixes the problem and we can immediately reach
192.168.100.200.

(Yes, in this specific case, it might be worth a try to reenable
rekeying on our end. Still, my question is not about fixing this problem
at hand. :-))

What do you guys do in such situations? What are best practices for
monitoring? How do you detect "dead" CHILD SAs? Is that even possible?

There's the obvious idea: Try to ping a system "behind" the VPN. In the
example above, we could issue pings to 192.168.100.200 and, if that
system does not respond, consider the IPSec connection to be "down". We
would like to avoid that, though. Ideally, we could find a way to
directly check whether all CHILD SAs are "healthy". Pinging
192.168.100.200 would be "indirect" monitoring: It's a different system
and *that* system could be down, not the IPSec connection.
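
To make the "indirect" nature of this check explicit, here is a rough sketch of such a probe. It assumes iputils `ping` (the `-c`, `-W` and `-I` options) and that the ping must be sourced from 192.168.23.24 to match the traffic selector of the CHILD SA; all function names are hypothetical.

```python
import subprocess
from typing import List


def ping_command(target: str, source: str, timeout_s: int = 2) -> List[str]:
    """Build a single-packet ping; -I sets the source address so that the
    packet matches the local traffic selector and is sent through the tunnel."""
    return ["ping", "-c", "1", "-W", str(timeout_s), "-I", source, target]


def probe(target: str, source: str) -> bool:
    """Run the ping and report whether a reply arrived."""
    result = subprocess.run(ping_command(target, source),
                            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def classify(child_sa_installed: bool, probe_ok: bool) -> str:
    """Combine the local SA state with the probe result."""
    if probe_ok:
        return "up"
    if not child_sa_installed:
        return "no CHILD_SA"
    # The ambiguous case: the probe alone cannot tell these two apart.
    return "CHILD_SA installed but target unreachable (tunnel or remote host)"
```

The last branch of `classify()` is exactly the ambiguity described above: a failed ping cannot distinguish a dead tunnel from a dead 192.168.100.200.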

In other words, maybe there's something like DPD in IKEv2, but operating
on the level of CHILD SAs?

Thank you very much in advance!
Peter
