On 6/20/25 10:26 AM, Eelco Chaudron wrote:
> 
> 
> On 17 Jun 2025, at 22:49, Ilya Maximets wrote:
> 
>> If started with --no-restart-ike-daemon, ovs-monitor-ipsec doesn't
>> clear the NSS database.  This is not a problem if the certificates do
>> not change while the monitor is down, because completely duplicate
>> entries cannot be added to the NSS database.  However, if the monitor
>> is stopped, then certificates change on disk and then the monitor is
>> started back, it will add new tunnel certificates alongside the old
>> ones and will fail to add the new CA certificate.  So, we'll end up
>> with multiple certificates for the same tunnel and the outdated CA
>> certificate.  This will not allow creating new connections as we'll
>> not be able to verify certificates of the new CA:
>>
>>   # certutil -L -d sql:/var/lib/ipsec/nss
>>
>>   Certificate Nickname             Trust Attributes
>>                                    SSL,S/MIME,JAR/XPI
>>
>>   ovs_certkey_c04c352b             u,u,u
>>   ovs_cert_cacert                  CT,,
>>   ovs_certkey_c04c352b             u,u,u
>>   ovs_certkey_c04c352b             u,u,u
>>   ovs_certkey_c04c352b             u,u,u
>>   ovs_certkey_c04c352b             u,u,u
>>   ovs_certkey_c04c352b             u,u,u
>>   ovs_certkey_c04c352b             u,u,u
>>
>>   pluto: "ovn-c04c35-0-out-1" #459: processing decrypted
>>    IKE_AUTH request containing SK{IDi,CERT,CERTREQ,IDr,AUTH,SA,
>>    TSi,TSr,N(USE_TRANSPORT_MODE)}
>>   pluto: "ovn-c04c35-0-out-1" #459: NSS: ERROR:
>>    IPsec certificate CN=c04c352b,OU=kind,O=ovnkubernetes,C=US invalid:
>>    SEC_ERROR_UNKNOWN_ISSUER: Peer's Certificate issuer is not recognized.
>>   pluto: "ovn-c04c35-0-out-1" #459: NSS: end certificate invalid
>>
>> Fix that by always trying to remove the certificate from the NSS
>> database before importing the new one.  This doesn't affect Libreswan,
>> because it will not access the database until we ask it to re-read
>> the secrets.
>>
>> We have to call deletion multiple times in order to remove all the
>> potential duplicates from previous runs.  This will be useful on
>> upgrade, but also may save us if one of the deletions ever fail for
>> any reason and we'll end up with a duplicate entry anyway.
>>
>> One alternative might be to always clear the database, even if the
>> --no-restart-ike-daemon option is set, but there is a chance that
>> we'll refresh and ask to re-read secrets before we got all the tunnel
>> information from the database.  That may affect dataplane.  Even if
>> this is really not possible, the logic seems too far apart to rely on.
>>
>> The clearing may seem redundant now, but it may still be useful to
>> clean up certificates for tunnels that disappeared while the monitor
>> was down.  Approach taken in this change doesn't cover this case.
>>
>> Test is added to check the described scenario.  The 'on_exit' command
>> is converted to obtain the monitor PID at exit, since we're now killing
>> one monitor and starting another.
>>
>> Fixes: fe5ff26a49f6 ("ovs-monitor-ipsec: Add option to not restart IKE 
>> daemon.")
>> Reported-at: https://issues.redhat.com/browse/FDP-1473
>> Signed-off-by: Ilya Maximets <i.maximets@ov
> 
> The approach and changes look good to me.
> 
> Acked-by: Eelco Chaudron <echau...@redhat.com>
> 

Thanks, Mike and Eelco.  Unfortunately, there seems to be some bug in
libreswan 4.6 specifically :( that makes it impossible to re-add removed
connection if we remove and re-add the same certificate...  Newer versions
of libreswan work just fine.  I'll probably need to work around that
somehow, e.g. by dumping out the current cert and comparing in order to
avoid removing and adding back the same cert.  Investigating...

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to