On 11/1/24 11:37, Eelco Chaudron wrote:
> 
> 
> On 1 Nov 2024, at 2:23, Ilya Maximets wrote:
> 
>> Normally ovs-monitor-ipsec will start all the connections it manages.
>> This is required, because we do not generally know if the other side of
>> the tunnel is going to initiate the IPsec connection or not.
>> For example, the other side might not belong to an OVS setup, so it may
>> not be managed by the other instance of ovs-monitor-ipsec.  There are
>> also issues in Libreswan that may cause the other side to fail the
>> connection initiation in a way that it will not try again.
>>
>> However, in many cases the other side is managed by ovs-monitor-ipsec.
>> And in that scenario there is a high chance that both sides will try
>> to initiate the connection at the same time.  This is known as
>> crossing streams.  Unfortunately, Libreswan, 4.x in particular, doesn't
>> handle this well and either crashes or ends up in a state where
>> connections are reported as active, but no traffic can actually go through.
>>
>> For tunnels, where we create separate incoming and outgoing connections
>> (geneve), we may start (add + up) the outgoing connection and only add
>> the incoming one.  This would give the other side some time to initiate,
>> avoiding the crossing streams and giving Libreswan a higher chance to
>> survive.
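
As a rough illustration of the "start outgoing, only add incoming" idea above, here is a minimal Python sketch. The naming scheme (`<tunnel>-in-<version>` / `<tunnel>-out-<version>`), the regex, and the function name `plan_commands` are assumptions made for this example and are not taken from the actual ovs-monitor-ipsec code.

```python
import re

# Assumption: incoming connections are named "<tunnel>-in-<version>".
# The real daemon may use a different pattern.
IN_CONN_RE = re.compile(r"^.+-in-\d+$")


def plan_commands(conn_names):
    """Return a list of (action, name) pairs for the given connections.

    Every connection is added (loaded).  Only connections that do not
    look like an incoming half are also brought up, so the peer gets a
    chance to initiate the incoming side first.
    """
    plan = []
    for name in conn_names:
        plan.append(("add", name))
        if not IN_CONN_RE.match(name):
            plan.append(("up", name))
    return plan
```

With this sketch, a connection without `-in-` in its name (for example a GRE tunnel) is still added and brought up immediately, which mirrors the behaviour described for GRE further down.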
>>
>> We still have to try to bring the incoming connections up at some point
>> if they do not become active.  Reconciliation logic will take care of
>> this.  Next time we check the active connections, we'll try to reconcile
>> and will bring all the loaded but not active connections up.  So, we're
>> losing at most 15 seconds if something goes wrong.
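
The reconciliation step described above can be pictured as a set difference between loaded and active connections. The sketch below is only an illustration; `connections_to_bring_up` and its inputs are hypothetical names, not the daemon's real interface.

```python
def connections_to_bring_up(loaded, active):
    """Return the connections that are loaded but not active.

    'loaded' and 'active' are iterables of connection names, as they
    might be collected during a periodic status check.  The result is
    sorted only to make the output deterministic.
    """
    return sorted(set(loaded) - set(active))
```

If the peer never initiates the incoming connection, it shows up in this list on the next check and is brought up then, which is where the "at most 15 seconds" delay comes from.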
>>
>> This change greatly improves stability with Libreswan 4.x.  It's still
>> not enough to enable the ping test for it, but hopefully enough for
>> real world setups to not hit the Libreswan issues often.
>>
>> GRE connections will still be started from both sides.  We do already
>> have some issues in case users name their tunnels with -in- or -out-
>> in the name, so it's not a new problem, but if the regex accidentally
>> matches on such a GRE tunnel, we'll again lose at most 15 seconds
>> before they will be brought up during reconciliation.  So, should not
>> be a big deal.
>>
>> Note: ipsec auto in Libreswan < 5 accepts --asynchronous together with
>> --add, even though the --asynchronous flag is only for up/down/start,
>> but Libreswan 5 fails the command, so we need to add it conditionally.
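
One way to read "add it conditionally" is to pass --asynchronous only for the actions it applies to. The helper below is a sketch under that assumption; the function name `ipsec_auto_args` and the argument layout are made up for illustration, and the real script may pass additional options.

```python
# Actions for which --asynchronous is meaningful, per the note above.
ASYNC_ACTIONS = ("up", "down", "start")


def ipsec_auto_args(action, conn, asynchronous=True):
    """Build an 'ipsec auto' argument list for one connection.

    --asynchronous is included only for up/down/start, never for add,
    so the same code path works with Libreswan versions that reject
    the flag when combined with --add.
    """
    args = ["ipsec", "auto"]
    if asynchronous and action in ASYNC_ACTIONS:
        args.append("--asynchronous")
    args += ["--" + action, conn]
    return args
```

The resulting list could be handed to something like `subprocess.run()`; that part is left out here to keep the sketch free of side effects.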
>>
>> Signed-off-by: Ilya Maximets <[email protected]>
> 
> Don’t you love these workarounds for specific versions ;)

Yeah, I know...  This will make the life of end users much easier though,
so I think it is valuable.

I was also discussing these issues with the Libreswan developers and it
seems that from their point of view initiating connection from both sides
is a rare abnormal use case and normally only one side would initiate...

That's maybe why such scenarios didn't get a lot of testing in Libreswan
itself.

I do completely agree that the way Libreswan handles crossing streams is
a disaster and has to be fixed on their side.  But being more in line with
how Libreswan expects to be used should be a good thing for OVS users in
the end, I think.

Would love to see all distributions move to Libreswan 5.1+ that handles
crossing streams well enough, AFAICT.  But it's likely not happening any
time soon.

> However, the change looks good to me.
> 
> This concludes my review of the series; all patches should be acked!
> But just in case for the series:
> 
> Acked-by: Eelco Chaudron <[email protected]>
> 

Thanks!

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev