Re: [Swan] Valid packets dropping in the kernel

2018-11-07 Thread Dharma Indurthy
..n.3...+..[..
0x0030:  0000 84d4 0b00 0000 0000 1011 1213 1415  ................
0x0040:  1617 1819 1a1b 1c1d 1e1f 2021 2223 2425  .......... !"#$%
0x0050:  2627 2829 2a2b 2c2d 2e2f 3031 3233 3435  &'()*+,-./012345
0x0060:  3637                                     67
01:32:27.812447 IP 10.153.32.166 > 172.20.75.204: ICMP echo reply, id
13229, seq 16, length 64
0x0000:  0e82 073f 73ab 0eef 4216 5634 0800 4500  ...?s...B.V4..E.
0x0010:  0054 05a8 0000 7d01 14e2 0a99 20a6 ac14  .T....}..... ...
0x0020:  4bcc 0000 6ead 33ad 0010 2b92 e35b 0000  K...n.3...+..[..
0x0030:  0000 84d4 0b00 0000 0000 1011 1213 1415  ................
0x0040:  1617 1819 1a1b 1c1d 1e1f 2021 2223 2425  .......... !"#$%
0x0050:  2627 2829 2a2b 2c2d 2e2f 3031 3233 3435  &'()*+,-./012345
0x0060:  3637                                     67
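
For anyone who wants to double-check a dump like this against the tcpdump summary line, here is a rough Python sketch that decodes the Ethernet/IPv4/ICMP headers from the hex words. It assumes the words missing from the archived dump were all-zero (0000), which is consistent with the summary line (echo reply, id 13229, seq 16) but is still an assumption. The function name decode_icmp is made up for this example.

```python
import socket
import struct

# First 64 bytes of the echo reply as printed by tcpdump -X.
# Assumption: words missing from the archived dump were 0000.
HEX_WORDS = """
0e82 073f 73ab 0eef 4216 5634 0800 4500
0054 05a8 0000 7d01 14e2 0a99 20a6 ac14
4bcc 0000 6ead 33ad 0010 2b92 e35b 0000
0000 84d4 0b00 0000 0000 1011 1213 1415
"""


def decode_icmp(frame: bytes) -> dict:
    """Decode an Ethernet II frame that carries IPv4 + ICMP echo."""
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != 0x0800:
        raise ValueError("not an IPv4 frame")
    ip = frame[14:]
    ihl = (ip[0] & 0x0F) * 4          # IPv4 header length in bytes
    if ip[9] != 1:
        raise ValueError("not ICMP")
    icmp_type, code, _cksum, ident, seq = struct.unpack(
        "!BBHHH", ip[ihl:ihl + 8]
    )
    return {
        "src": socket.inet_ntoa(ip[12:16]),
        "dst": socket.inet_ntoa(ip[16:20]),
        "ttl": ip[8],
        "icmp_type": icmp_type,       # 0 = echo reply, 8 = echo request
        "icmp_code": code,
        "id": ident,
        "seq": seq,
    }


frame = bytes.fromhex("".join(HEX_WORDS.split()))
info = decode_icmp(frame)
print(info)
```

If the decoded addresses, id and seq agree with the summary line, the capture is at least internally consistent, so the drop happens after the packet is seen on the wire.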

Definitely seems interesting, but no idea what causes the reqid to get out
of sync.  Actually, the reqid matches the policies for the 1x3 connection:
src 10.50.36.4/32 dst 10.253.0.1/32
dir fwd priority 1040351
tmpl src 12.131.93.13 dst 172.20.109.76
proto esp reqid 20137 mode tunnel
src 10.50.36.4/32 dst 10.253.0.1/32
dir in priority 1040351
tmpl src 12.131.93.13 dst 172.20.109.76
proto esp reqid 20137 mode tunnel
src 10.253.0.1/32 dst 10.50.36.4/32
dir out priority 1040351
tmpl src 172.20.109.76 dst 12.131.93.13
proto esp reqid 20137 mode tunnel

But not the others.
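
To compare reqids across all policies at once rather than by eye, something like the following Python sketch could group `ip xfrm policy` output by reqid. The sample text is the three policies quoted above; the function name policies_by_reqid is made up for this example, and the parser only looks at the selector, dir and reqid fields.

```python
import re
from collections import defaultdict

# The three policies quoted above, as printed by `ip xfrm policy`
# (indentation was lost in the mail, so the parser strips each line).
SAMPLE = """\
src 10.50.36.4/32 dst 10.253.0.1/32
dir fwd priority 1040351
tmpl src 12.131.93.13 dst 172.20.109.76
proto esp reqid 20137 mode tunnel
src 10.50.36.4/32 dst 10.253.0.1/32
dir in priority 1040351
tmpl src 12.131.93.13 dst 172.20.109.76
proto esp reqid 20137 mode tunnel
src 10.253.0.1/32 dst 10.50.36.4/32
dir out priority 1040351
tmpl src 172.20.109.76 dst 12.131.93.13
proto esp reqid 20137 mode tunnel
"""


def policies_by_reqid(text):
    """Return {reqid: [(dir, selector_src, selector_dst), ...]}."""
    groups = defaultdict(list)
    selector = None
    direction = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("src "):
            # Selector line: "src A dst B"
            parts = line.split()
            selector = (parts[1], parts[3])
        elif line.startswith("dir "):
            direction = line.split()[1]
        elif selector is not None:
            m = re.search(r"\breqid (\d+)", line)
            if m:
                groups[int(m.group(1))].append((direction, *selector))
    return dict(groups)


groups = policies_by_reqid(SAMPLE)
for reqid, entries in groups.items():
    print(reqid, entries)
```

The same grouping applied to `ip xfrm state` output (which also prints a reqid per SA) would show whether an SA carries a reqid that no policy for its selector uses.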

On Tue, Nov 6, 2018 at 10:59 AM Dharma Indurthy 
wrote:

> Hey, Paul.  I appreciate your response.
>
>> Do not use leftsourceip= if you specify more than one leftsubnet. Also,
>> leftsourceip= must be an IP address within the (single) leftsubnet=
>>
>> > right=12.131.93.13
>> > rightsubnets=" 10.50.32.166/32 10.50.32.239/32 10.50.36.4/32 "
>> > rightsourceip=12.131.93.13
>>
>> The same applies here.
>>
>
> Good to know, but I don't think it's getting used.  We'll clean up the
> config.
>
>
>> > SAs come up, and we can ping their side.
>>
>> > 000 #3166924: "orthooklahoma3937/1x1":4500 STATE_QUICK_I2 (sent QI2,
>> IPsec SA established); EVENT_SA_REPLACE in 918s; newest IPSEC; eroute
>> owner; isakmp#3166786; idle; import:admin initiate
>> > 000 #3166924: "orthooklahoma3937/1x1" esp.815a3ae9@12.131.93.13
>> esp.618dd3ad@172.20.109.76 ref=0 refhim=0 Traffic: ESPin=0B ESPout=0B!
>> ESPmax=4194303B
>> > 000 #3167825: "orthooklahoma3937/1x2":4500 STATE_QUICK_I2 (sent QI2,
>> IPsec SA established); EVENT_SA_REPLACE in 1148s; newest IPSEC; eroute
>> owner; isakmp#3166786; idle; import:admin initiate
>> > 000 #3167825: "orthooklahoma3937/1x2" esp.73c12328@12.131.93.13
>> esp.b76a1e64@172.20.109.76 ref=0 refhim=0 Traffic: ESPin=0B ESPout=0B!
>> ESPmax=4194303B
>> > 000 #3165167: "orthooklahoma3937/1x3":4500 STATE_QUICK_I2 (sent QI2,
>> IPsec SA established); EVENT_SA_REPLACE in 82s; newest IPSEC; eroute owner;
>> isakmp#3136241; idle; import:admin initiate
>> > 000 #3165167: "orthooklahoma3937/1x3" esp.33a967a1@12.131.93.13
>> esp.72596d49@172.20.109.76 ref=0 refhim=0 Traffic: ESPin=0B ESPout=0B!
>> ESPmax=4194303B
>> > 000 #3166787: "orthooklahoma3937/2x1":4500 STATE_QUICK_I2 (sent QI2,
>> IPsec SA established); EVENT_SA_REPLACE in 891s; newest IPSEC; eroute
>> owner; isakmp#3166786; idle; import:admin initiate
>> > 000 #3166787: "orthooklahoma3937/2x1" esp.970dcc23@12.131.93.13
>> esp.207c2a70@172.20.109.76 ref=0 refhim=0 Traffic: ESPin=0B ESPout=0B!
>> ESPmax=4194303B
>> > 000 #3166964: "orthooklahoma3937/2x2":4500 STATE_QUICK_I2 (sent QI2,
>> IPsec SA established); EVENT_SA_REPLACE in 602s; newest IPSEC; eroute
>> owner; isakmp#3166786; idle; import:admin initiate
>> > 000 #3166964: "orthooklahoma3937/2x2" esp.61180b3@12.131.93.13
>> esp.50ff9d05@172.20.109.76 ref=0 refhim=0 Traffic: ESPin=1KB ESPout=1KB!
>> ESPmax=4194303B
>> > 000 #3162278: "orthooklahoma3937/2x3":4500 STATE_QUICK_I2 (sent QI2,
>> IPsec SA established); EVENT_SA_EXPIRE in 437s; isakmp#3136241; idle;
>> import:admin initiate
>> > 000 #3162278: "orthooklahoma3937/2x3" esp.e4c24f90@12.131.93.13
>> esp.cadf8591@172.20.109.76 ref=0 refhim=0 Traffic: ESPin=0B ESPout=0B!
>> ESPmax=4194303B
>> > 000 #3162955: "orthooklahoma3937/2x3":4500 STATE_QUICK_R2 (IPsec SA
>> established); EVENT_SA_REPLACE in 399s; newest IPSEC; eroute owner;
>> isakmp#3136241; idle; import:admin initiate
>> > 000 #3162955: "orthooklahoma3937/2x3" esp.d783e492@12.131.93.13
>> esp.1d0a885d@172.20.109.76 ref=0 refhim=0 Traffic: ESPin=42KB ESPout=0B!
>> ESPmax=4

[Swan] One more hi-rekey cycling issue

2018-11-06 Thread Dharma Indurthy
Previously, we mentioned this issue:
https://lists.libreswan.org/pipermail/swan/2018/002759.html which
more-or-less appears to be working as designed, although I have not seen
the specific pattern since our 3.25 upgrade.

However, we have a new infinite loop that appears to occur completely on
our side, with no delete payload/re-initiate prompted by the other side.

We started with a connection that looked like this:
000 #439457: "essentia342/2x3":500 STATE_QUICK_I1 (sent QI1, expecting
QR1); EVENT_v1_RETRANSMIT in 6s; lastdpd=-1s(seq in:0 out:0); idle;
import:admin initiate
000 #439452: "essentia342/2x4":500 STATE_QUICK_I1 (sent QI1, expecting
QR1); EVENT_v1_RETRANSMIT in 6s; lastdpd=-1s(seq in:0 out:0); idle;
import:admin initiate
000 #439459: "essentia342/2x5":500 STATE_QUICK_I1 (sent QI1, expecting
QR1); EVENT_v1_RETRANSMIT in 6s; lastdpd=-1s(seq in:0 out:0); idle;
import:admin initiate
000 #439463: "essentia342/2x6":500 STATE_QUICK_I1 (sent QI1, expecting
QR1); EVENT_v1_RETRANSMIT in 6s; lastdpd=-1s(seq in:0 out:0); idle;
import:admin initiate
000 #439455: "essentia342/2x7":500 STATE_QUICK_I1 (sent QI1, expecting
QR1); EVENT_v1_RETRANSMIT in 6s; lastdpd=-1s(seq in:0 out:0); idle;
import:admin initiate
000 #336954: "essentia342/2x8":500 STATE_MAIN_I4 (ISAKMP SA established);
EVENT_SA_REPLACE in 51115s; newest ISAKMP; lastdpd=34438s(seq in:18617
out:18616); idle; import:admin initiate
000 #439406: "essentia342/2x8":500 STATE_QUICK_I1 (sent QI1, expecting
QR1); EVENT_v1_RETRANSMIT in 30s; lastdpd=-1s(seq in:0 out:0); idle;
import:admin initiate
000 #439464: "essentia342/2x8":500 STATE_QUICK_I1 (sent QI1, expecting
QR1); EVENT_v1_RETRANSMIT in 6s; lastdpd=-1s(seq in:0 out:0); idle;
import:admin initiate
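
With a couple hundred connections, duplicates like the two 2x8 QUICK_I1 states are easy to miss. A small Python sketch along these lines could flag them from `ipsec status` output; it assumes the state lines start with `000 #<serial>: "<conn>":<port> STATE_...` as in the listing above, and the function name duplicate_states is made up for this example.

```python
import re
from collections import defaultdict

# Matches the first line of each state entry in `ipsec status` output.
STATE_RE = re.compile(
    r'^000 #(\d+): "([^"]+)":\d+\s+(STATE_\w+)', re.MULTILINE
)

# A few entries from the listing above (wrapped as in the mail).
SAMPLE = '''\
000 #439455: "essentia342/2x7":500 STATE_QUICK_I1 (sent QI1, expecting
QR1); EVENT_v1_RETRANSMIT in 6s; lastdpd=-1s(seq in:0 out:0); idle;
import:admin initiate
000 #336954: "essentia342/2x8":500 STATE_MAIN_I4 (ISAKMP SA established);
EVENT_SA_REPLACE in 51115s; newest ISAKMP; lastdpd=34438s(seq in:18617
out:18616); idle; import:admin initiate
000 #439406: "essentia342/2x8":500 STATE_QUICK_I1 (sent QI1, expecting
QR1); EVENT_v1_RETRANSMIT in 30s; lastdpd=-1s(seq in:0 out:0); idle;
import:admin initiate
000 #439464: "essentia342/2x8":500 STATE_QUICK_I1 (sent QI1, expecting
QR1); EVENT_v1_RETRANSMIT in 6s; lastdpd=-1s(seq in:0 out:0); idle;
import:admin initiate
'''


def duplicate_states(status_text):
    """Return {(conn, state): [serials]} for pairs seen more than once."""
    seen = defaultdict(list)
    for serial, conn, state in STATE_RE.findall(status_text):
        seen[(conn, state)].append(int(serial))
    return {key: serials for key, serials in seen.items() if len(serials) > 1}


dups = duplicate_states(SAMPLE)
print(dups)
```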

We did a delete -- which deleted the duplicate 2x8 SAs:
root@ip-172-20-109-76(vpn):/etc/ipsec.d# ipsec auto --delete essentia342
002 "essentia342/1x1": deleting non-instance connection
002 "essentia342/1x1" #439986: deleting state (STATE_QUICK_I1) and NOT
sending notification
002 "essentia342/1x2": deleting non-instance connection
002 "essentia342/1x2" #439988: deleting state (STATE_QUICK_I1) and NOT
sending notification
002 "essentia342/1x3": deleting non-instance connection
002 "essentia342/1x3" #439996: deleting state (STATE_QUICK_I1) and NOT
sending notification
002 "essentia342/1x4": deleting non-instance connection
002 "essentia342/1x4" #439990: deleting state (STATE_QUICK_I1) and NOT
sending notification
002 "essentia342/1x5": deleting non-instance connection
002 "essentia342/1x5" #439984: deleting state (STATE_QUICK_I1) and NOT
sending notification
002 "essentia342/1x6": deleting non-instance connection
002 "essentia342/1x6" #439995: deleting state (STATE_QUICK_I1) and NOT
sending notification
002 "essentia342/1x7": deleting non-instance connection
002 "essentia342/1x7" #439982: deleting state (STATE_QUICK_I1) and NOT
sending notification
002 "essentia342/1x8": deleting non-instance connection
002 "essentia342/1x8" #439983: deleting state (STATE_QUICK_I1) and NOT
sending notification
002 "essentia342/2x1": deleting non-instance connection
002 "essentia342/2x1" #439992: deleting state (STATE_QUICK_I1) and NOT
sending notification
002 "essentia342/2x2": deleting non-instance connection
002 "essentia342/2x2" #43: deleting state (STATE_QUICK_I1) and NOT
sending notification
002 "essentia342/2x3": deleting non-instance connection
002 "essentia342/2x3" #439994: deleting state (STATE_QUICK_I1) and NOT
sending notification
002 "essentia342/2x4": deleting non-instance connection
002 "essentia342/2x4" #439987: deleting state (STATE_QUICK_I1) and NOT
sending notification
002 "essentia342/2x5": deleting non-instance connection
002 "essentia342/2x5" #439998: deleting state (STATE_QUICK_I1) and NOT
sending notification
002 "essentia342/2x6": deleting non-instance connection
002 "essentia342/2x6" #439989: deleting state (STATE_QUICK_I1) and NOT
sending notification
002 "essentia342/2x7": deleting non-instance connection
002 "essentia342/2x7" #439991: deleting state (STATE_QUICK_I1) and NOT
sending notification
002 "essentia342/2x8": deleting non-instance connection
002 "essentia342/2x8" #439997: deleting state (STATE_QUICK_I1) and NOT
sending notification
002 "essentia342/2x8" #439941: deleting state (STATE_QUICK_I1) and NOT
sending notification
002 "essentia342/2x8" #336954: deleting state (STATE_MAIN_I4) and sending
notification

And then an add + up.  Then we see this:

000 initiating all conns with alias='essentia342'
002 "essentia342/2x8" #440083: initiating Main Mode
104 "essentia342/2x8" #440083: STATE_MAIN_I1: initiate
106 "essentia342/2x8" #440083: STATE_MAIN_I2: sent MI2, expecting MR2
003 "essentia342/2x8" #440083: ignoring unknown Vendor ID payload
[407f3135484ae73200fd5aea12860ac1]
108 "essentia342/2x8" #440083: STATE_MAIN_I3: sent MI3, expecting MR3
002 "essentia342/2x8" #440083: Peer ID is ID_IPV4_ADDR: '208.72.50.5'
004 "essentia342/2x8" #440083: STATE_MAIN_I4: ISAKMP SA established

Re: [Swan] Valid packets dropping in the kernel

2018-11-06 Thread Dharma Indurthy
Hey, Paul.  I appreciate your response.

> Do not use leftsourceip= if you specify more than one leftsubnet. Also,
> leftsourceip= must be an IP address within the (single) leftsubnet=
>
> > right=12.131.93.13
> > rightsubnets=" 10.50.32.166/32 10.50.32.239/32 10.50.36.4/32 "
> > rightsourceip=12.131.93.13
>
> The same applies here.
>

Good to know, but I don't think it's getting used.  We'll clean up the
config.


> > SAs come up, and we can ping their side.
>
> > 000 #3166924: "orthooklahoma3937/1x1":4500 STATE_QUICK_I2 (sent QI2,
> IPsec SA established); EVENT_SA_REPLACE in 918s; newest IPSEC; eroute
> owner; isakmp#3166786; idle; import:admin initiate
> > 000 #3166924: "orthooklahoma3937/1x1" esp.815a3ae9@12.131.93.13
> esp.618dd3ad@172.20.109.76 ref=0 refhim=0 Traffic: ESPin=0B ESPout=0B!
> ESPmax=4194303B
> > 000 #3167825: "orthooklahoma3937/1x2":4500 STATE_QUICK_I2 (sent QI2,
> IPsec SA established); EVENT_SA_REPLACE in 1148s; newest IPSEC; eroute
> owner; isakmp#3166786; idle; import:admin initiate
> > 000 #3167825: "orthooklahoma3937/1x2" esp.73c12328@12.131.93.13
> esp.b76a1e64@172.20.109.76 ref=0 refhim=0 Traffic: ESPin=0B ESPout=0B!
> ESPmax=4194303B
> > 000 #3165167: "orthooklahoma3937/1x3":4500 STATE_QUICK_I2 (sent QI2,
> IPsec SA established); EVENT_SA_REPLACE in 82s; newest IPSEC; eroute owner;
> isakmp#3136241; idle; import:admin initiate
> > 000 #3165167: "orthooklahoma3937/1x3" esp.33a967a1@12.131.93.13
> esp.72596d49@172.20.109.76 ref=0 refhim=0 Traffic: ESPin=0B ESPout=0B!
> ESPmax=4194303B
> > 000 #3166787: "orthooklahoma3937/2x1":4500 STATE_QUICK_I2 (sent QI2,
> IPsec SA established); EVENT_SA_REPLACE in 891s; newest IPSEC; eroute
> owner; isakmp#3166786; idle; import:admin initiate
> > 000 #3166787: "orthooklahoma3937/2x1" esp.970dcc23@12.131.93.13
> esp.207c2a70@172.20.109.76 ref=0 refhim=0 Traffic: ESPin=0B ESPout=0B!
> ESPmax=4194303B
> > 000 #3166964: "orthooklahoma3937/2x2":4500 STATE_QUICK_I2 (sent QI2,
> IPsec SA established); EVENT_SA_REPLACE in 602s; newest IPSEC; eroute
> owner; isakmp#3166786; idle; import:admin initiate
> > 000 #3166964: "orthooklahoma3937/2x2" esp.61180b3@12.131.93.13
> esp.50ff9d05@172.20.109.76 ref=0 refhim=0 Traffic: ESPin=1KB ESPout=1KB!
> ESPmax=4194303B
> > 000 #3162278: "orthooklahoma3937/2x3":4500 STATE_QUICK_I2 (sent QI2,
> IPsec SA established); EVENT_SA_EXPIRE in 437s; isakmp#3136241; idle;
> import:admin initiate
> > 000 #3162278: "orthooklahoma3937/2x3" esp.e4c24f90@12.131.93.13
> esp.cadf8591@172.20.109.76 ref=0 refhim=0 Traffic: ESPin=0B ESPout=0B!
> ESPmax=4194303B
> > 000 #3162955: "orthooklahoma3937/2x3":4500 STATE_QUICK_R2 (IPsec SA
> established); EVENT_SA_REPLACE in 399s; newest IPSEC; eroute owner;
> isakmp#3136241; idle; import:admin initiate
> > 000 #3162955: "orthooklahoma3937/2x3" esp.d783e492@12.131.93.13
> esp.1d0a885d@172.20.109.76 ref=0 refhim=0 Traffic: ESPin=42KB ESPout=0B!
> ESPmax=4194303B
> > 000 #3166786: "orthooklahoma3937/2x3":4500 STATE_MAIN_R3 (sent MR3,
> ISAKMP SA established); EVENT_SA_REPLACE in 26486s; newest ISAKMP; nodpd;
> idle; import:admin initiate
> >
> > We have duplicate SAs for some reason -- you can see that for 2x3, not
> sure if that matters.
>
> It should not matter. What seems to have happened is that when you
> established the IKE SA, and you were in the process of establishing all
> the IPsec SA's, the other end also started doing the same IPsec SA's.
> So you ended up with one connection which was initiated by you and
> responded to by you. One of them should vanish after a little while.
>
Yeah, that's what I thought.  They do come and go, but we consistently
have two:
000 #439432: "orthooklahoma3937/2x3":4500 STATE_QUICK_I2 (sent QI2, IPsec
SA established); EVENT_SA_EXPIRE in 47s; isakmp#430186; idle; import:admin
initiate
000 #439432: "orthooklahoma3937/2x3" esp.16ea20ad@12.131.93.13
esp.6916d827@172.20.109.76 ref=0 refhim=0 Traffic: ESPin=0B ESPout=0B!
ESPmax=4194303B
000 #449005: "orthooklahoma3937/2x3":4500 STATE_QUICK_I2 (sent QI2, IPsec
SA established); EVENT_SA_REPLACE in 1873s; newest IPSEC; eroute owner;
isakmp#430186; idle; import:admin initiate
000 #449005: "orthooklahoma3937/2x3" esp.523917e3@12.131.93.13
esp.51b2fd1a@172.20.109.76 ref=0 refhim=0 Traffic: ESPin=0B ESPout=0B!
ESPmax=4194303B

^At the moment, we have two that our side has initiated.  Still, as far as
I can see, no big deal.  Seems to be valid on both sides.

> > It's the 1x1 SA that's pertinent.  We NAT the source and target IPs via
> > PREROUTING and POSTROUTING rules, and I can see traffic initiated by the
> > customer hitting PREROUTING but never hitting POSTROUTING and never
> > leaving the box.
>
> Are you using the policy matching for ipsec? See:
>

We don't use policy matching, but we've never had to before.  For inbound
customer traffic, we PREROUTE to match the config, and then we POSTROUTE to
NAT the traffic past our gateway.  You can see the pings match the config
and disappear.  We do this for all 

[Swan] Valid packets dropping in the kernel

2018-11-02 Thread Dharma Indurthy
Hey, folks.

I have a conundrum.  It looks very similar to
https://lists.libreswan.org/pipermail/swan/2018/002834.html, which doesn't
have an outcome yet, I don't think.

We have the following connection, one of a couple hundred -- the rest of
which seem to work fine as far as we can tell.  I can't be sure, because I
can't detect the issue from my side.

conn customer
type=tunnel
authby=secret
left="172.20.109.76"
leftid=52.205.166.91
leftsourceip="172.20.109.76"
leftsubnets=" 10.253.1.53/32 10.253.0.1/32 "
right=12.131.93.13
rightsubnets=" 10.50.32.166/32 10.50.32.239/32 10.50.36.4/32 "
rightsourceip=12.131.93.13
auto=start
ike=aes256-sha1;modp1024
phase2alg=aes256-sha1;modp1024
ikelifetime=28800
salifetime=3600
dpdaction=restart
dpddelay=30
dpdtimeout=120
pfs=yes
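
For readers mapping this config to the connection names in the status output, here is a small Python sketch of how leftsubnets=/rightsubnets= appear to expand into one connection per subnet pair. The name/IxJ numbering (I = index into leftsubnets, J = index into rightsubnets) is inferred from the 1x1 and 1x2 entries in the status output, not taken from the libreswan source, and the function name expand_subnets is made up for this example.

```python
from itertools import product


def expand_subnets(name, leftsubnets, rightsubnets):
    """Map each 'name/IxJ' connection to its (left, right) subnet pair.

    Assumes whitespace-separated subnet lists, as in the config above.
    """
    lefts = leftsubnets.split()
    rights = rightsubnets.split()
    return {
        f"{name}/{i}x{j}": (left, right)
        for (i, left), (j, right) in product(
            enumerate(lefts, 1), enumerate(rights, 1)
        )
    }


conns = expand_subnets(
    "customer",
    " 10.253.1.53/32 10.253.0.1/32 ",
    " 10.50.32.166/32 10.50.32.239/32 10.50.36.4/32 ",
)
for conn_name, (left, right) in conns.items():
    print(conn_name, left, "<->", right)
```

Two left subnets and three right subnets give six connections, 1x1 through 2x3, each with its own IPsec SA pair and kernel policies.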

SAs come up, and we can ping their side.

000 "orthooklahoma3937/1x1": 10.253.1.53/32===172.20.109.76
<172.20.109.76>[52.205.166.91]...12.131.93.13<12.131.93.13>===
10.50.32.166/32; erouted; eroute owner: #3166924
000 "orthooklahoma3937/1x1": oriented; my_ip=172.20.109.76;
their_ip=12.131.93.13; my_updown=ipsec _updown;
000 "orthooklahoma3937/1x1":   xauth us:none, xauth them:none,
my_username=[any]; their_username=[any]
000 "orthooklahoma3937/1x1":   our auth:secret, their auth:secret
000 "orthooklahoma3937/1x1":   modecfg info: us:none, them:none, modecfg
policy:push, dns:unset, domains:unset, banner:unset, cat:unset;
000 "orthooklahoma3937/1x1":   labeled_ipsec:no;
000 "orthooklahoma3937/1x1":   policy_label:unset;
000 "orthooklahoma3937/1x1":   ike_life: 28800s; ipsec_life: 3600s;
replay_window: 32; rekey_margin: 540s; rekey_fuzz: 100%; keyingtries: 0;
000 "orthooklahoma3937/1x1":   retransmit-interval: 500ms;
retransmit-timeout: 60s;
000 "orthooklahoma3937/1x1":   initial-contact:no; cisco-unity:no;
fake-strongswan:no; send-vendorid:no; send-no-esp-tfc:no;
000 "orthooklahoma3937/1x1":   policy:
PSK+ENCRYPT+TUNNEL+PFS+UP+IKEV1_ALLOW+IKEV2_ALLOW+SAREF_TRACK+IKE_FRAG_ALLOW+ESN_NO;
000 "orthooklahoma3937/1x1":   conn_prio: 32,32; interface: ens5; metric:
0; mtu: unset; sa_prio:auto; sa_tfc:none;
000 "orthooklahoma3937/1x1":   nflog-group: unset; mark: unset;
vti-iface:unset; vti-routing:no; vti-shared:no; nic-offload:auto;
000 "orthooklahoma3937/1x1":   our idtype: ID_IPV4_ADDR; our
id=52.205.166.91; their idtype: ID_IPV4_ADDR; their id=12.131.93.13
000 "orthooklahoma3937/1x1":   dpd: action:restart; delay:30; timeout:120;
nat-t: encaps:auto; nat_keepalive:yes; ikev1_natt:both
000 "orthooklahoma3937/1x1":   newest ISAKMP SA: #0; newest IPsec SA:
#3166924;
000 "orthooklahoma3937/1x1":   aliases: orthooklahoma3937
000 "orthooklahoma3937/1x1":   IKE algorithms:
AES_CBC_256-HMAC_SHA1-MODP1024
000 "orthooklahoma3937/1x1":   ESP algorithms:
AES_CBC_256-HMAC_SHA1_96-MODP1024
000 "orthooklahoma3937/1x1":   ESP algorithm newest:
AES_CBC_256-HMAC_SHA1_96; pfsgroup=MODP1024
000 "orthooklahoma3937/1x2": 10.253.1.53/32===172.20.109.76
<172.20.109.76>[52.205.166.91]...12.131.93.13<12.131.93.13>===
10.50.32.239/32; erouted; eroute owner: #3167825
000 "orthooklahoma3937/1x2": oriented; my_ip=172.20.109.76;
their_ip=12.131.93.13; my_updown=ipsec _updown;
000 "orthooklahoma3937/1x2":   xauth us:none, xauth them:none,
my_username=[any]; their_username=[any]
000 "orthooklahoma3937/1x2":   our auth:secret, their auth:secret
000 "orthooklahoma3937/1x2":   modecfg info: us:none, them:none, modecfg
policy:push, dns:unset, domains:unset, banner:unset, cat:unset;
000 "orthooklahoma3937/1x2":   labeled_ipsec:no;
000 "orthooklahoma3937/1x2":   policy_label:unset;
000 "orthooklahoma3937/1x2":   ike_life: 28800s; ipsec_life: 3600s;
replay_window: 32; rekey_margin: 540s; rekey_fuzz: 100%; keyingtries: 0;
000 "orthooklahoma3937/1x2":   retransmit-interval: 500ms;
retransmit-timeout: 60s;
000 "orthooklahoma3937/1x2":   initial-contact:no; cisco-unity:no;
fake-strongswan:no; send-vendorid:no; send-no-esp-tfc:no;
000 "orthooklahoma3937/1x2":   policy:
PSK+ENCRYPT+TUNNEL+PFS+UP+IKEV1_ALLOW+IKEV2_ALLOW+SAREF_TRACK+IKE_FRAG_ALLOW+ESN_NO;
000 "orthooklahoma3937/1x2":   conn_prio: 32,32; interface: ens5; metric:
0; mtu: unset; sa_prio:auto; sa_tfc:none;
000 "orthooklahoma3937/1x2":   nflog-group: unset; mark: unset;
vti-iface:unset; vti-routing:no; vti-shared:no; nic-offload:auto;
000 "orthooklahoma3937/1x2":   our idtype: ID_IPV4_ADDR; our
id=52.205.166.91; their idtype: ID_IPV4_ADDR; their id=12.131.93.13
000 "orthooklahoma3937/1x2":   dpd: action:restart; delay:30; timeout:120;
nat-t: encaps:auto; nat_keepalive:yes; ikev1_natt:both
000 "orthooklahoma3937/1x2":   newest ISAKMP SA: #0; newest IPsec SA:
#3167825;
000 "orthooklahoma3937/1x2":   aliases: orthooklahoma3937
000 "orthooklahoma3937/1x2":   IKE algorithms:
AES_CBC_256-HMAC_SHA1-MODP1024
000 "orthooklahoma3937/1x2":   ESP algorithms:
AES_CBC_256-HMAC_SHA1_96-MODP1024
000 "orthooklahoma3937/1x2":   ESP algorithm newest:
AES_CBC_256-HMAC_SHA1_96; 

[Swan] mis-matched phase 2 settings cause infinite rekeys, high load, and broad failure across unrelated tunnels

2018-10-19 Thread Dharma Indurthy
Hey, folks.

My colleague Terell described this issue about a month ago.  For
background, we have a libreswan server running that supports ~150
connections.  We proceeded with a libreswan upgrade to 3.25.

ipsec verify:
Verifying installed system and configuration files

Version check and ipsec on-path  [OK]
Libreswan 3.25 (netkey) on 4.15.0-1020-aws
Checking for IPsec support in kernel  [OK]

The upgrade seemed to be successful.  However, we just encountered the
infinite-loop rekey problem.  What appeared to happen is that the re-keys
looped like crazy and persisted until pluto became unresponsive, and
systemd then killed the process.  Here's the gist.

We added this config:
conn baycare4059
type=tunnel
authby=secret
left=
leftid=
leftsourceip=
leftsubnets="   "
right=
rightsubnets="   "
rightsourceip=
auto=start
ike=aes256-sha1;modp1536
phase2alg=aes256-sha1;modp1536
ikelifetime=86400
salifetime=3600
dpdaction=restart
dpddelay=30
dpdtimeout=120
pfs=yes

*Here's the beginning of the logs.  We haven't reread secrets, so we can't
connect:*
Oct 18 14:50:37 ip-172-20-109-76 pluto[23193]: "baycare4059/2x2" #4360027:
STATE_MAIN_I1: retransmission; will wait 16 seconds for response
Oct 18 14:50:37 ip-172-20-109-76 pluto[23193]: "baycare4059/2x2" #4360027:
Can't authenticate: no preshared key found for `52.205.166.91' and
`204.76.135.13'.  Attribute OAKLEY_AUTHENTICATION_METHOD
Oct 18 14:50:37 ip-172-20-109-76 pluto[23193]: "baycare4059/2x2" #4360027:
no acceptable Oakley Transform
Oct 18 14:50:37 ip-172-20-109-76 pluto[23193]: "baycare4059/2x2" #4360027:
sending notification NO_PROPOSAL_CHOSEN to 204.76.135.13:500

*So then we fix the secret, reread, and encounter an infinite loop.  We
still don't know what, if any, configuration mismatch there is.  The
connection logs like crazy.  The logs below represent a fraction of a
second:*
Oct 18 14:51:25 ip-172-20-109-76 pluto[23193]: "baycare4059/2x2" #4360091:
Peer ID is ID_IPV4_ADDR: '204.76.135.13'
Oct 18 14:51:25 ip-172-20-109-76 pluto[23193]: "baycare4059/2x2" #4360091:
STATE_MAIN_I4: ISAKMP SA established {auth=PRESHARED_KEY cipher=aes_256
integ=sha group=MODP1536}
Oct 18 14:51:25 ip-172-20-109-76 pluto[23193]: "baycare4059/1x1" #4360113:
initiating Quick Mode
PSK+ENCRYPT+TUNNEL+PFS+UP+IKEV1_ALLOW+IKEV2_ALLOW+SAREF_TRACK+IKE_FRAG_ALLOW+ESN_NO
{using isakmp#4360091 msgid:62ca303f
proposal=AES_CBC_256-HMAC_SHA1_96-MODP1536 pfsgroup=MODP1536}
Oct 18 14:51:25 ip-172-20-109-76 pluto[23193]: "baycare4059/1x2" #4360114:
initiating Quick Mode
PSK+ENCRYPT+TUNNEL+PFS+UP+IKEV1_ALLOW+IKEV2_ALLOW+SAREF_TRACK+IKE_FRAG_ALLOW+ESN_NO
{using isakmp#4360091 msgid:ceca2e6d
proposal=AES_CBC_256-HMAC_SHA1_96-MODP1536 pfsgroup=MODP1536}
Oct 18 14:51:25 ip-172-20-109-76 pluto[23193]: "baycare4059/2x1" #4360115:
initiating Quick Mode
PSK+ENCRYPT+TUNNEL+PFS+UP+IKEV1_ALLOW+IKEV2_ALLOW+SAREF_TRACK+IKE_FRAG_ALLOW+ESN_NO
{using isakmp#4360091 msgid:cbbfdcb9
proposal=AES_CBC_256-HMAC_SHA1_96-MODP1536 pfsgroup=MODP1536}
Oct 18 14:51:25 ip-172-20-109-76 pluto[23193]: "baycare4059/2x2" #4360116:
initiating Quick Mode
PSK+ENCRYPT+TUNNEL+PFS+UP+IKEV1_ALLOW+IKEV2_ALLOW+SAREF_TRACK+IKE_FRAG_ALLOW+ESN_NO
{using isakmp#4360091 msgid:d616b6b4
proposal=AES_CBC_256-HMAC_SHA1_96-MODP1536 pfsgroup=MODP1536}
Oct 18 14:51:25 ip-172-20-109-76 pluto[23193]: "southdenvergastro3801/2x3"
#4307776: ignoring informational payload INVALID_ID_INFORMATION,
msgid=, length=60
Oct 18 14:51:25 ip-172-20-109-76 pluto[23193]: | ISAKMP Notification Payload
Oct 18 14:51:25 ip-172-20-109-76 pluto[23193]: |   00 00 00 3c  00 00 00
01  03 04 00 12
Oct 18 14:51:25 ip-172-20-109-76 pluto[23193]: "southdenvergastro3801/2x3"
#4307776: received and ignored informational message
Oct 18 14:51:25 ip-172-20-109-76 pluto[23193]: "baycare4059/2x2" #4360091:
ignoring informational payload INVALID_ID_INFORMATION, msgid=,
length=352
Oct 18 14:51:25 ip-172-20-109-76 pluto[23193]: | ISAKMP Notification Payload
Oct 18 14:51:25 ip-172-20-109-76 pluto[23193]: |   00 00 01 60  00 00 00
01  03 04 00 12
Oct 18 14:51:25 ip-172-20-109-76 pluto[23193]: "baycare4059/2x2" #4360091:
received and ignored informational message
Oct 18 14:51:25 ip-172-20-109-76 pluto[23193]: "baycare4059/2x2" #4360091:
received Delete SA payload: self-deleting ISAKMP State #4360091
Oct 18 14:51:25 ip-172-20-109-76 pluto[23193]: "baycare4059/2x2" #4360091:
deleting state (STATE_MAIN_I4) and sending notification
Oct 18 14:51:25 ip-172-20-109-76 pluto[23193]: "baycare4059/2x2" #4360091:
reschedule pending child #4360116 STATE_QUICK_I1 of connection
"baycare4059/2x2" - the parent is going away
Oct 18 14:51:25 ip-172-20-109-76 pluto[23193]: "baycare4059/2x2" #4360091:
reschedule pending child #4360115 STATE_QUICK_I1 of connection
"baycare4059/2x1" - the parent is going away
Oct 18 14:51:25 ip-172-20-109-76 pluto[23193]: "baycare4059/2x2" #4360091:
reschedule pending