[Touch-packages] [Bug 1853956] Re: 34 wireguard peers result in invalid peer configuration

2019-12-05 Thread Joshua Sjoding
It turns out the fix for this issue was backported to systemd v240:

https://github.com/systemd/systemd-stable/pull/37

I performed a release upgrade on one of our affected servers, bringing
it up from Ubuntu 18.04 to Ubuntu 19.04 (which uses systemd v240), and I
can confirm that the peers are being configured correctly now.

So this issue affects Ubuntu 18.04 LTS but not any later supported
releases. 18.10 was also affected, but it's EOL.
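
For anyone wanting to check a host quickly, here is a rough sketch in Python that reads the first line of `systemctl --version` output and flags versions in the range this thread describes as affected. The function names and the two version constants are my own illustration: 237 is where networkd gained WireGuard support, and 240 is the release the thread reports as fixed. A distribution package can carry backported patches, so the version number alone is only a hint.

```python
import re
from typing import Optional

# Assumptions taken from this thread, not from an authoritative list:
FIRST_WITH_WIREGUARD = 237  # networkd gained WireGuard support here
FIRST_FIXED = 240           # fix reported as backported to v240


def parse_systemd_version(version_output: str) -> Optional[int]:
    """Extract the major version from `systemctl --version` style output."""
    match = re.search(r"^systemd (\d+)", version_output, re.MULTILINE)
    return int(match.group(1)) if match else None


def is_likely_affected(version_output: str) -> bool:
    """Return True if the version falls in the range reported as affected."""
    version = parse_systemd_version(version_output)
    if version is None:
        return False  # unknown format, make no claim
    return FIRST_WITH_WIREGUARD <= version < FIRST_FIXED
```

The output text can be captured with `systemctl --version` and passed in as a string.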

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1853956

Title:
  34 wireguard peers result in invalid peer configuration

Status in systemd package in Ubuntu:
  New

Bug description:
  ubuntu server 18.04.3 LTS
  systemd 237-3ubuntu10.31
  wireguard 0.0.20191012-wg1~bionic from PPA.

  We're using systemd-networkd to configure wireguard via
  wireguard.netdev and wireguard.network files in /etc/systemd/network/.
  All endpoints have IPv4 addresses.

  When we include 34, 35, or 36 [WireGuardPeer] entries in the netdev
  file some peers are configured incorrectly. The affected peers seem to
  be related to the total number of peers (counting from 0 here):

  33 peers: No issue
  34 peers: Peer 1 and 2 fail
  35 peers: Peer 2 and 3 fail
  36 peers: Peer 3 and 4 fail
  37 peers: No issue

  In all cases peer 0 is functional. For an affected pair of peers A and
  B, peer A ends up with the allowed IP address range of peer B. Peer B
  ends up with no allowed IP addresses. This can be seen in the output
  of wg. The connections to both peers fail because of incorrect address
  range assignments.

  We first encountered this issue in a production environment when we
  moved from 33 to 34 unique peers on each server. The issue was
  reproduced on 3 different physical servers with similar configuration
  by adding and removing peer 34.

  The [WireGuardPeer] entries do not need to be unique to reproduce the
  issue. In my testing I used 6 distinct peers and then used 28 or more
  identical copies of a 7th peer. The results were the same.

  In January 2019 a bug was reported that was also related to the number
  of wireguard peers, but the description seems sufficiently different
  from our case that I felt I should file a distinct bug report. Here's a
  link to that report in case I'm wrong about that:
  https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1811149

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1853956/+subscriptions

-- 
Mailing list: https://launchpad.net/~touch-packages
Post to : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp


[Touch-packages] [Bug 1811149] Re: 23 wireguard peers hang systemd-networkd

2019-12-04 Thread Joshua Sjoding
As near as I can tell the fix for this was never backported from systemd
v241 to bionic. I recently filed a related bug report here:

https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1853956

My symptoms are a little different (misconfiguration instead of an
infinite loop), but I have a strong suspicion that the underlying cause
is the same.

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1811149

Title:
  23 wireguard peers hang systemd-networkd

Status in systemd:
  Fix Released
Status in systemd package in Ubuntu:
  New

Bug description:
  I'm running Ubuntu 18.04.1 LTS with systemd=237-3ubuntu10.9.
  Linux kernel version is 4.15.0-32-generic #35-Ubuntu SMP.
  wireguard=0.0.20181218-wg1~bionic from PPA.

  I have a Wireguard-based VPN server that has several peers. As long as
  the number of peers is 22 or lower, everything works okay. As soon as
  I add the 23rd peer, restarting the `systemd-networkd` service fails
  with a timeout while systemd-networkd hogs the CPU.

  Moreover, if I reboot the box while wireguard configuration is
  "broken", systemd-networkd fails to apply network settings on boot and
  the box is no longer accessible over the network.

  Configuration is structured in the following way (keys are fake):

  ==> wg0.netdev
  [NetDev]
  Name=wg0
  Kind=wireguard
  Description=Wireguard VPN server
  [WireGuard]
  ListenPort=4500
  PrivateKey=kNl7tkhCM1Crj8RhUIn8xvwcg+UoOkw26kQjQEtZk1k=
  [WireGuardPeer]
  PublicKey=AfM1AN4IIUe5AVypFg2pcNrQmqOtZQIJLgusbkDYXkI=
  AllowedIPs=fd6f:b446:a2ca:0400:cb6f:b446:a2ca:bd0b/128
  AllowedIPs=fd6f:b446:a2ca:cb6f:b446:a2ca::/96
  # and 22 more [WireGuardPeer] like that

  ==> wg0.network
  [Match]
  Name=wg0
  [Network]
  Address=fd6f:b446:a2ca:0400::1/64
  [Route]
  Destination=fd6f:b446:a2ca:cb6f:b446:a2ca::/96
  # and 22 more [Route] sections like that

  syslog logs are attached both for "good" and "bad" cases, sample of
  strace logs is also attached for "bad" case.

  I'm filing the issue here as the aforementioned systemd version is
  already out of scope of the upstream bug tracker per
  https://github.com/systemd/systemd/blob/master/docs/CONTRIBUTING.md#filing-issues

To manage notifications about this bug go to:
https://bugs.launchpad.net/systemd/+bug/1811149/+subscriptions



[Touch-packages] [Bug 1853956] Re: 34 wireguard peers result in invalid peer configuration

2019-12-04 Thread Joshua Sjoding
I think the underlying problem is improper fragmentation of netlink
messages sent to the WireGuard device by systemd v237 in the
set_wireguard_interface function:

https://github.com/systemd/systemd/blob/v237/src/network/netdev/wireguard.c#L107

Appending netlink message data can fail if the message size limit has
been exceeded. This can happen if there are too many peers or ip masks
in the netdev file, and the v237 code doesn't seem to handle this
properly. It's supposed to split the data up into message fragments, but
instead it can end up writing incoherent data to the netlink socket or
end up in an infinite loop.

This issue was fixed in systemd v241 by reworking the code over a few
commits:

https://github.com/systemd/systemd/pull/11418
https://github.com/systemd/systemd/pull/11580 (this fixed issues with
the first PR)

I found some comments (now resolved) on one of the commits illuminating:

https://github.com/systemd/systemd/pull/11418/commits/e1f717d4a02e15ae11a191dd4962b2f4d117678d

Mic92 on 2019-01-15:

> The idea is that netlink's messages are limited in size. If an
> interface has many peers, addresses or ip masks then the configuration
> might not fit into one message and has to be split across different
> messages.

yuwata on 2019-01-15:

> Yeah. I guess there was some bug in the cancellation logic, and it
> causes infinite loop with the magic number 23.

The infinite loop with 23 peers yuwata mentions is a reference to
Leonid's bug report from January:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1811149

I expect that backporting these fixes from v241 to bionic's systemd v237
branch would resolve both my issue and the issue reported by Leonid.

I realize this is a non-trivial change and there's a regression risk.



[Touch-packages] [Bug 1853956] Re: 34 wireguard peers result in invalid peer configuration

2019-12-04 Thread Joshua Sjoding
I now believe the dmesg complaint in my last comment to be a separate
issue. A fix for it was backported to systemd v238 in this commit:

https://github.com/systemd/systemd-stable/commit/7db3fe08c5eb83584f3a3d356876b4acaa797585#diff-f29d1bfc98e548dc0eb497c3d17cbefa

It was not backported to systemd v237:

https://github.com/systemd/systemd-stable/commits/v237-stable/src/network/netdev/wireguard.c



[Touch-packages] [Bug 1853956] Re: 34 wireguard peers result in invalid peer configuration

2019-12-04 Thread Joshua Sjoding
On two systems with 33 peers I noticed that this shows up in dmesg after
a reboot:

netlink: 'systemd-network': attribute type 5 has an invalid length.

These lines also show up whenever I run
`sudo systemctl restart systemd-networkd` now. They didn't show up
before the reboot.

This suggests that there may be issues I haven't noticed yet even with
fewer than 34 peers. In our production environment not all of our peers
are online all the time, so an issue affecting a few of them could go
unnoticed.
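
Since a few bad peers could go unnoticed, a small filter over saved kernel log text can help spot the complaint. This is a rough Python sketch; the function name is hypothetical and the pattern is modelled only on the single dmesg line quoted above, so other kernels may word the message differently.

```python
import re
from typing import List, Tuple

# Pattern modelled on the one dmesg line quoted in this comment.
INVALID_LENGTH = re.compile(
    r"netlink: '(?P<process>[^']+)': "
    r"attribute type (?P<type>\d+) has an invalid length\."
)


def find_invalid_length_warnings(log_text: str) -> List[Tuple[str, int]]:
    """Return (process name, attribute type) for each matching log line."""
    found = []
    for line in log_text.splitlines():
        match = INVALID_LENGTH.search(line)
        if match:
            found.append((match.group("process"), int(match.group("type"))))
    return found
```

The input is plain text, for example the saved output of `dmesg`.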



[Touch-packages] [Bug 1853956] [NEW] 34 wireguard peers result in invalid peer configuration

2019-11-25 Thread Joshua Sjoding
Public bug reported:

ubuntu server 18.04.3 LTS
systemd 237-3ubuntu10.31
wireguard 0.0.20191012-wg1~bionic from PPA.

We're using systemd-networkd to configure wireguard via wireguard.netdev
and wireguard.network files in /etc/systemd/network/. All endpoints have
IPv4 addresses.

When we include 34, 35, or 36 [WireGuardPeer] entries in the netdev file
some peers are configured incorrectly. The affected peers seem to be
related to the total number of peers (counting from 0 here):

33 peers: No issue
34 peers: Peer 1 and 2 fail
35 peers: Peer 2 and 3 fail
36 peers: Peer 3 and 4 fail
37 peers: No issue

In all cases peer 0 is functional. For an affected pair of peers A and
B, peer A ends up with the allowed IP address range of peer B. Peer B
ends up with no allowed IP addresses. This can be seen in the output of
wg. The connections to both peers fail because of incorrect address
range assignments.
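
A sketch of how the symptom could be checked without reading the wg output by eye. It is written in Python and assumes the text comes from `wg show <interface> allowed-ips`, where each line is a public key, a tab, and then the allowed IP ranges, with `(none)` shown for a peer that has none. That output format and the function name are assumptions on my part, so adjust the parsing if your wg version prints something different.

```python
from typing import List


def peers_without_allowed_ips(output: str) -> List[str]:
    """Return the public keys of peers that list no allowed IP ranges.

    Assumed input format, one peer per line: "<public key>\t<ranges>",
    where <ranges> is "(none)" or empty for a peer with no allowed IPs.
    """
    missing = []
    for line in output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, _, ranges = line.partition("\t")
        if ranges.strip() in ("", "(none)"):
            missing.append(key)
    return missing
```

In the failure described above, peer B of each affected pair would be expected to appear in the returned list.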

We first encountered this issue in a production environment when we
moved from 33 to 34 unique peers on each server. The issue was
reproduced on 3 different physical servers with similar configuration by
adding and removing peer 34.

The [WireGuardPeer] entries do not need to be unique to reproduce the
issue. In my testing I used 6 distinct peers and then used 28 or more
identical copies of a 7th peer. The results were the same.
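
For reproducing this, a test netdev file with a chosen number of peers can be generated instead of edited by hand. The following Python sketch is an illustration only: the function names, the default interface name, the listen port, and the placeholder keys are all made up, and the section layout follows the usual [NetDev], [WireGuard], [WireGuardPeer] structure of a networkd netdev file.

```python
from typing import List, Sequence, Tuple

Peer = Tuple[str, Sequence[str]]  # (public key, allowed IP ranges)


def build_netdev(peers: Sequence[Peer], name: str = "wg0",
                 listen_port: int = 51820,
                 private_key: str = "PLACEHOLDER_PRIVATE_KEY") -> str:
    """Render a WireGuard netdev file with one section per peer."""
    lines = ["[NetDev]", f"Name={name}", "Kind=wireguard", "",
             "[WireGuard]", f"ListenPort={listen_port}",
             f"PrivateKey={private_key}"]
    for public_key, allowed_ips in peers:
        lines += ["", "[WireGuardPeer]", f"PublicKey={public_key}"]
        lines += [f"AllowedIPs={ip}" for ip in allowed_ips]
    return "\n".join(lines) + "\n"


def repeated_peers(distinct: Sequence[Peer], filler: Peer,
                   total: int) -> List[Peer]:
    """Return `total` peers: the distinct ones, then copies of `filler`."""
    if total < len(distinct):
        raise ValueError("total is smaller than the number of distinct peers")
    return list(distinct) + [filler] * (total - len(distinct))
```

Calling `repeated_peers` with 6 distinct peers, a 7th filler peer, and a total of 34 mirrors the test setup described above; real keys have to be substituted before the file is usable.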

In January 2019 a bug was reported that was also related to the number
of wireguard peers, but the description seems sufficiently different
from our case that I felt I should file a distinct bug report. Here's a
link to that report in case I'm wrong about that:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1811149

** Affects: systemd (Ubuntu)
 Importance: Undecided
 Status: New


** Tags: networkd systemd-networkd wireguard
