On 31 August 2017 at 19:40, Noel Kuntze <noel.kuntze+strongswan-users-ml@thermi.consulting> wrote:
> The aborting of the initiation is a deliberate design decision. That is
> because this is a configuration error of the remote peer.
> Use auto=route to get the kernel and charon to try to establish a matching
> CHILD_SA for the traffic matching the TS.

Hi Noel,

Thanks for the explanation. I guess the swanctl equivalent of
auto=route is start_action=trap.

What do you mean by "remote peer"? The initiator carol or the
responder moon?

I actually tried setting start_action=trap on carol before but got the
same NO_PROPOSAL_CHOSEN error after a few successful test runs.

I just tried setting start_action=trap on moon as well and I haven't
been able to reproduce the error after many test runs. So this might
indeed fix the problem!

I am surprised this setting is not set on the same test in the
strongSwan project:

https://github.com/strongswan/strongswan/blob/master/testing/tests/swanctl/rw-psk-ipv4/hosts/moon/etc/swanctl/swanctl.conf

Maybe the machines in the strongSwan test suite are booted
sequentially instead of in parallel like in my NixOS test so the error
doesn't appear.

> There are many more failure cases than just that that would need to be
> considered to make charon try to keep a CHILD_SA up at all times.

Is there any documentation on how to configure strongSwan for systems
that need to work reliably and fully autonomously?

Thanks a lot for the start_action=trap tip. I think I'm also going to
set that on my company VPN because it seems to make things more
reliable.

Bas

> On 31.08.2017 19:31, Bas van Dijk wrote:
>> I've now changed the testScript[1] to first start moon, wait for the
>> strongswan-swanctl service to start and then start carol. Using this
>> setup it's almost guaranteed that moon has loaded the connection
>> before carol initiates the connection.
>>
>> In the process of debugging this I did discover the option:
>> charon.retry_initiate_interval "Interval in seconds to use when
>> retrying to initiate an IKE_SA (e.g. if
>> DNS resolution failed)". Would it make sense to extend the behavior of
>> this option to also retry an IKE_SA if a previous attempt failed *for
>> any reason* (so not just on DNS failures)?
>> If it works like that it will solve my problem because carol will just
>> retry initiating the connection after it gets the NO_PROP message. It
>> will make initiation more automatic and robust.
>>
>> Bas
>>
>> [1]
>> https://github.com/LumiGuide/nixpkgs/blob/c16b7285fe9cc379227a255f955b38c6830a7b24/nixos/tests/strongswan-swanctl.nix#L150
>>
>> On 31 August 2017 at 11:04, Bas van Dijk <v.dijk....@gmail.com> wrote:
>>> Ok after studying this part of the log a bit further:
>>>
>>> https://gist.github.com/basvandijk/a2de93d8c93ce925838c1dbf2ee1d925#file-strongswan-swanctl-test-failure-log-L1428:L1459
>>>
>>> I see that the following is going on:
>>>
>>> 1. moon has started charon-systemd but hasn't loaded the connection yet
>>> 2. carol sends an IKE_SA_INIT request to moon
>>> 3. since moon hasn't loaded the connection yet it can't find an IKE
>>> config for 192.168.1.3...192.168.1.2 and sends a NO_PROP response back
>>> to carol
>>> 4. moon loads the connection
>>> 5. carol warns about the "received NO_PROPOSAL_CHOSEN notify error"
>>> 6. pings from carol to alice fail continuously because the VPN is not
>>> established
>>>
>>> Is there a way for carol to keep trying to establish a connection
>>> until it succeeds?
>>>
>>> Bas
>>>
>>>
>>> On 31 August 2017 at 09:14, Bas van Dijk <v.dijk....@gmail.com> wrote:
>>>> I also included the log of a successful test run:
>>>>
>>>> https://gist.github.com/basvandijk/a2de93d8c93ce925838c1dbf2ee1d925#file-strongswan-swanctl-test-success-log
>>>>
>>>> On 31 August 2017 at 09:09, Bas van Dijk <v.dijk....@gmail.com> wrote:
>>>>> I noticed that my test succeeds most of the time but I just observed a
>>>>> test run where carol keeps trying to ping alice but fails each time.
>>>>> The following line from the test log[1] seems suspect:
>>>>>
>>>>> carol# [ 4.538963] charon-systemd[716]: received NO_PROPOSAL_CHOSEN
>>>>> notify error
>>>>>
>>>>> I haven't looked into this error yet but I suspect it's a concurrency
>>>>> issue.
>>>>> Note that all machines start up at the same time[2]. I think if
>>>>> I first start moon, wait for the strongswan-swanctl.service to start
>>>>> and then start carol it always succeeds. But I'd rather not introduce
>>>>> that sequentialism and I suspect that strongSwan should be able to
>>>>> handle not fully booted gateways and that I just forgot to configure
>>>>> some option somewhere.
>>>>>
>>>>> Any ideas why the test sometimes fails?
>>>>>
>>>>> [1] https://gist.github.com/basvandijk/a2de93d8c93ce925838c1dbf2ee1d925
>>>>> [2]
>>>>> https://github.com/LumiGuide/nixpkgs/blob/b1bab8cff348ac743ecc6734f1852a16db41a9a2/nixos/tests/strongswan-swanctl.nix#L151
>>>>>
>>>>> On 30 August 2017 at 11:52, Bas van Dijk <v.dijk....@gmail.com> wrote:
>>>>>> The test now succeeds[1].
>>>>>>
>>>>>> Thanks for your help.
>>>>>>
>>>>>> Bas
>>>>>>
>>>>>> [1] https://groups.google.com/d/msg/nix-devel/X-0T97MLR7I/cGUCWjXQAAAJ
>>>>>>
>>>>>> On 30 August 2017 at 02:57, Bas van Dijk <v.dijk....@gmail.com> wrote:
>>>>>>> On 30 August 2017 at 02:29, Noel Kuntze
>>>>>>> <noel.kuntze+strongswan-users-ml@thermi.consulting> wrote:
>>>>>>>> Two things:
>>>>>>>> - Please don't pipe stuff from the web into bash, it just asks for
>>>>>>>> trouble and especially don't advertise or advise people to do it.
>>>>>>>
>>>>>>> Hi Noel, good point. This should probably be removed from nixos.org/nix.
>>>>>>>
>>>>>>>> - Try enforcing UDP encapsulation. If the FW rules actually change
>>>>>>>> something, then currently only IKE is allowed, but there's no NAT, so
>>>>>>>> ESP is used as transport protocol.
>>>>>>>
>>>>>>> Something similar was suggested[1] on the nix-devel mailing list. I
>>>>>>> will see how to get that to work.
>>>>>>>
>>>>>>> Bas
>>>>>>>
>>>>>>> [1]
>>>>>>> https://groups.google.com/forum/#!msg/nix-devel/X-0T97MLR7I/jbPQucPOAAAJ
>>>>>>>
>>>>>>>> Kind regards
>>>>>>>>
>>>>>>>> Noel
>>>>>>>>
>>>>>>>> On 30.08.2017 02:18, Bas van Dijk wrote:
>>>>>>>>> I've created a PR for the NixOS Linux distribution that adds a module
>>>>>>>>> for strongswan-swanctl:
>>>>>>>>>
>>>>>>>>> https://github.com/NixOS/nixpkgs/pull/27958
>>>>>>>>>
>>>>>>>>> Although the new module works on our company VPN I would also like to
>>>>>>>>> add a NixOS test to ensure it keeps working. I've mimicked one of the
>>>>>>>>> swanctl tests from the strongswan project:
>>>>>>>>>
>>>>>>>>> https://github.com/LumiGuide/nixpkgs/blob/strongswan-swanctl-test/nixos/tests/strongswan-swanctl.nix
>>>>>>>>>
>>>>>>>>> Although SAs get established successfully between gateway moon and
>>>>>>>>> roadwarrior carol I can't seem to ping alice from carol. Since I'm no
>>>>>>>>> networking expert I'm probably missing something obvious. It would be
>>>>>>>>> great if somebody could give me a tip or point me in the right
>>>>>>>>> direction.
>>>>>>>>>
>>>>>>>>> To run the test for yourself you don't need to install NixOS, you only
>>>>>>>>> need the Nix package manager (which is easy to uninstall later on;
>>>>>>>>> just rm -r /nix):
>>>>>>>>>
>>>>>>>>> $ curl https://nixos.org/nix/install | sh
>>>>>>>>>
>>>>>>>>> Then clone my nixpkgs fork and check out the right branch:
>>>>>>>>>
>>>>>>>>> $ git clone https://github.com/LumiGuide/nixpkgs.git
>>>>>>>>> $ cd nixpkgs
>>>>>>>>> $ git checkout strongswan-swanctl-test
>>>>>>>>>
>>>>>>>>> Look in nixos/tests/strongswan-swanctl.nix to see how to run the test
>>>>>>>>> but the following should get you started:
>>>>>>>>>
>>>>>>>>> $ nix-build nixos/tests/strongswan-swanctl.nix
>>>>>>>>>
>>>>>>>>> Note that I also asked this question on the nix-devel mailing list:
>>>>>>>>>
>>>>>>>>> https://groups.google.com/forum/#!topic/nix-devel/X-0T97MLR7I
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>>
>>>>>>>>> Bas
>>>>>>>>
>
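
For reference, here is a minimal sketch of where the start_action=trap setting from the reply at the top of this thread would sit in swanctl.conf. The connection name "rw", the child name "net" and the traffic selector are assumptions chosen for illustration, not copied from the test in this thread. The local/remote authentication sections and the secrets are left out.

```
connections {
    rw {                                  # hypothetical connection name
        children {
            net {                         # hypothetical CHILD_SA name
                local_ts = 10.1.0.0/16    # assumed subnet, adjust to your setup
                # Install trap policies when the config is loaded, so that
                # matching traffic triggers negotiation of the CHILD_SA.
                start_action = trap
            }
        }
    }
}
```

After editing the file, the configuration would be reloaded with swanctl --load-all (or by restarting the strongswan-swanctl service, as the NixOS test does).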