[Touch-packages] [Bug 1881972] Re: systemd-networkd crashes with invalid pointer

2020-07-21 Thread John Nielsen
No crashes on my test machine for 12 days. Push it!

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1881972

Title:
  systemd-networkd crashes with invalid pointer

Status in systemd package in Ubuntu:
  Fix Released
Status in systemd source package in Bionic:
  In Progress

Bug description:
  [impact]

  systemd-networkd double-free causes crash under some circumstances,
  such as adding/removing ip rules

  [test case]

  see original description

  [regression potential]

  the fix strdup()s strings when routing policy rules are added, so any
  regression would likely occur when adding/modifying/removing ip rules,
  possibly as a networkd segfault or a failure to add/remove/modify
  ip rules.

  [scope]

  this is needed for bionic.

  this is fixed by upstream commit
  eeab051b28ba6e1b4a56d369d4c6bf7cfa71947c which is included starting in
  v240, so this is already included in Focal and later.

  I did not research what original commit introduced the problem, but
  the reporter indicates this did not happen for Xenial so it's unlikely
  this is a problem in Xenial or earlier.

  [original description]

  This is a serious regression with systemd-networkd that I ran into
  while setting up a NAT router in AWS. The AWS AMI
  ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20200131 with
  systemd 237-3ubuntu10.33 does NOT have the problem, but the next most
  recent AWS AMI
  ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20200311, which
  includes systemd 237-3ubuntu10.39, does.

  Also, a system booted from the (good) 20200131 AMI starts showing the
  problem after updating only systemd (to 237-3ubuntu10.41) and its
  direct dependencies (e.g. 'apt-get install systemd'). So I'm fairly
  confident that a change to the systemd package between
  237-3ubuntu10.33 and 237-3ubuntu10.39 introduced the problem and it is
  still present.
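
The version reasoning above can be sketched as a quick check. This is only an illustration: version_gt is a hypothetical helper, the installed version is hard-coded here, and GNU `sort -V` is used as an approximation of Debian version ordering (`dpkg --compare-versions` is the authoritative comparison on a real system).

```shell
#!/bin/bash
# Last version reported good in the description above.
LAST_GOOD="237-3ubuntu10.33"

# version_gt A B (hypothetical helper): succeeds if A sorts strictly
# after B under GNU `sort -V`.
version_gt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Hard-coded for the sketch. On a real system this could be read with:
#   dpkg-query -W -f='${Version}' systemd
installed="237-3ubuntu10.39"

if version_gt "${installed}" "${LAST_GOOD}"; then
  echo "systemd ${installed} is newer than ${LAST_GOOD}; check for the crash"
else
  echo "systemd ${installed} is not newer than ${LAST_GOOD}"
fi
```

A version newer than the last known-good one is not necessarily affected; the check only narrows down which machines are worth watching.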

  On the NAT router I use three interfaces and have separate routing
  tables for admin and forwarded traffic. Things come up fine initially
  but every 30-60 minutes (DHCP lease renewal time?) one or more
  interfaces are reconfigured, and most of the time systemd-networkd
  crashes and needs to be restarted. Eventually the system becomes
  unreachable when the default crash loop backoff logic prevents the
  network service from being restarted at all. The log excerpt attached
  illustrates the crash loop.

  Also including the netplan and networkd config files below.

  # grep . /etc/netplan/*
  /etc/netplan/50-cloud-init.yaml:# This file is generated from information provided by the datasource.  Changes
  /etc/netplan/50-cloud-init.yaml:# to it will not persist across an instance reboot.  To disable cloud-init's
  /etc/netplan/50-cloud-init.yaml:# network configuration capabilities, write a file
  /etc/netplan/50-cloud-init.yaml:# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
  /etc/netplan/50-cloud-init.yaml:# network: {config: disabled}
  /etc/netplan/50-cloud-init.yaml:network:
  /etc/netplan/50-cloud-init.yaml:    version: 2
  /etc/netplan/50-cloud-init.yaml:    ethernets:
  /etc/netplan/50-cloud-init.yaml:        ens5:
  /etc/netplan/50-cloud-init.yaml:            dhcp4: true
  /etc/netplan/50-cloud-init.yaml:            match:
  /etc/netplan/50-cloud-init.yaml:                macaddress: xx:xx:xx:xx:xx:xx
  /etc/netplan/50-cloud-init.yaml:            set-name: ens5
  /etc/netplan/99_config.yaml:network:
  /etc/netplan/99_config.yaml:  version: 2
  /etc/netplan/99_config.yaml:  renderer: networkd
  /etc/netplan/99_config.yaml:  ethernets:
  /etc/netplan/99_config.yaml:    ens6:
  /etc/netplan/99_config.yaml:      match:
  /etc/netplan/99_config.yaml:        macaddress: yy:yy:yy:yy:yy:yy
  /etc/netplan/99_config.yaml:      dhcp4: true
  /etc/netplan/99_config.yaml:      dhcp4-overrides:
  /etc/netplan/99_config.yaml:        use-routes: false
  /etc/netplan/99_config.yaml:    ens7:
  /etc/netplan/99_config.yaml:      match:
  /etc/netplan/99_config.yaml:        macaddress: zz:zz:zz:zz:zz:zz
  /etc/netplan/99_config.yaml:      mtu: 1500
  /etc/netplan/99_config.yaml:      dhcp4: true
  /etc/netplan/99_config.yaml:      dhcp4-overrides:
  /etc/netplan/99_config.yaml:        use-mtu: false
  /etc/netplan/99_config.yaml:        use-routes: false

  # grep . /etc/networkd-dispatcher/*/*
  /etc/networkd-dispatcher/configured.d/nat:#!/bin/bash
  /etc/networkd-dispatcher/configured.d/nat:# Do additional configuration for the inside and outside interfaces
  /etc/networkd-dispatcher/configured.d/nat:# route table used for forwarded/routed/natted traffic
  /etc/networkd-dispatcher/configured.d/nat:FWD_TABLE=99
  /etc/networkd-dispatcher/configured.d/nat:if [ "${IFACE}" = "ens6" ]; then
  /etc/networkd-dispatcher/configured.d/nat:  # delete link-local route for inside in default table
  

[Touch-packages] [Bug 1881972] Re: systemd-networkd crashes with invalid pointer

2020-07-29 Thread John Nielsen
The scripts for configured.d and configuring.d to add and remove IP
rules (included above) are likely the culprit. @ddstreet would you like
me to write that up more compactly?

Status in systemd package in Ubuntu:
  Fix Released
Status in systemd source package in Bionic:
  Incomplete


[Touch-packages] [Bug 1881972] Re: systemd-networkd crashes with invalid pointer

2020-07-29 Thread John Nielsen
** Description changed:

  [impact]
  
  systemd-networkd double-free causes crash under some circumstances, such
  as adding/removing ip rules
  
  [test case]
  
- see original description
+ Use networkd-dispatcher events to add and remove IP rules. The example
+ scripts below are contrived (and by themselves likely to break access to
+ a machine) but would be adequate to trigger the bug. Put scripts like
+ these in place, reboot or run "netplan apply", and then leave the
+ machine running for a few DHCP renewal cycles.
+ 
+ === /etc/networkd-dispatcher/configured.d/test.sh ===
+ #!/bin/bash
+ 
+ /sbin/ip rule add iif lo lookup 99
+ /sbin/ip rule add to 10.0.0.0/8 iif lo lookup main
+ === END ===
+ === /etc/networkd-dispatcher/configuring.d/test.sh ===
+ #!/bin/bash
+ 
+ # Tear down existing ip rules so they aren't duplicated
+ OLDIFS="${IFS}"
+ IFS="
+ "
+ for rule in `ip rule show|grep "iif lo" | cut -d: -f2-`; do
+   IFS="${OLDIFS}"
+   ip rule delete ${rule}
+ done
+ IFS="${OLDIFS}"
+ === END ===
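
The teardown loop in configuring.d/test.sh can also be sketched without swapping IFS back and forth, by reading the rule list line by line. This is a dry-run illustration, not the reporter's script: matching_rules and sample_rules are hypothetical names, and the sample text is an assumption about what `ip rule show` would print once the test rules are in place.

```shell
#!/bin/bash
# matching_rules (hypothetical helper): read `ip rule show` output on stdin
# and print the selector of every rule mentioning "iif lo", one per line.
matching_rules() {
  # Lines look like "32765:<TAB>from all iif lo lookup 99".
  # Strip the leading "priority:" field and surrounding whitespace.
  grep 'iif lo' | sed -e 's/^[0-9]*:[[:space:]]*//' -e 's/[[:space:]]*$//'
}

# Sample rule list, assumed for illustration only.
sample_rules() {
  printf '0:\tfrom all lookup local\n'
  printf '32764:\tfrom all to 10.0.0.0/8 iif lo lookup main\n'
  printf '32765:\tfrom all iif lo lookup 99\n'
  printf '32766:\tfrom all lookup main\n'
  printf '32767:\tfrom all lookup default\n'
}

# Dry run: print the delete commands instead of executing them.
sample_rules | matching_rules | while IFS= read -r rule; do
  # ${rule} is deliberately unquoted so it splits into separate arguments.
  echo ip rule delete ${rule}
done
```

On a live machine the input would come from `ip rule show` and the `echo` would be dropped, with the same caveat as above: removing these rules can break access to the machine.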
  

[Touch-packages] [Bug 1881972] Re: systemd-networkd crashes with invalid pointer

2020-07-09 Thread John Nielsen
So far so good running the latest package for 10 hours. I'll let it run
another day or two but previously I would have seen the issue by now.


[Touch-packages] [Bug 1881972] Re: systemd-networkd crashes with invalid pointer

2020-07-08 Thread John Nielsen
Here's one of the new coredumps I'm getting at boot now. Note that I don't have 
debugging symbols installed for the PPA version of systemd.
# coredumpctl gdb 714
           PID: 714 (systemd-network)
           UID: 100 (systemd-network)
           GID: 102 (systemd-network)
        Signal: 11 (SEGV)
     Timestamp: Wed 2020-07-08 17:17:01 UTC (8min ago)
  Command Line: /lib/systemd/systemd-networkd
    Executable: /lib/systemd/systemd-networkd
 Control Group: /system.slice/systemd-networkd.service
          Unit: systemd-networkd.service
         Slice: system.slice
       Boot ID: df33bbaec4134b45aaabe8b3fca7dade
    Machine ID: ec267b3475883f9edb99f554607bb456
      Hostname: ip-10-0-4-251
       Storage: /var/lib/systemd/coredump/core.systemd-network.100.df33bbaec4134b45aaabe8b3fca7dade.714.159422862100.lz4
       Message: Process 714 (systemd-network) of user 100 dumped core.

                Stack trace of thread 714:
                #0  0x5627a7f3425c n/a (systemd-networkd)
                #1  0x5627a7fb5760 n/a (systemd-networkd)
                #2  0x5627a7f26526 sd_netlink_process (systemd-networkd)
                #3  0x5627a7f267c3 n/a (systemd-networkd)
                #4  0x5627a7f2b6be n/a (systemd-networkd)
                #5  0x5627a7f2b93a sd_event_dispatch (systemd-networkd)
                #6  0x5627a7f2bac9 sd_event_run (systemd-networkd)
                #7  0x5627a7f2bd0b sd_event_loop (systemd-networkd)
                #8  0x5627a7eff3d6 n/a (systemd-networkd)
                #9  0x7f6d90500b97 __libc_start_main (libc.so.6)
                #10 0x5627a7effaba n/a (systemd-networkd)


** Attachment added: "core.systemd-network.100.df33bbaec4134b45aaabe8b3fca7dade.714.159422862100.lz4"
   https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1881972/+attachment/5390826/+files/core.systemd-network.100.df33bbaec4134b45aaabe8b3fca7dade.714.159422862100.lz4


[Touch-packages] [Bug 1881972] Re: systemd-networkd crashes with invalid pointer

2020-07-08 Thread John Nielsen
I added the ppa and did a dist-upgrade then rebooted. systemd-networkd
now consistently crashes once at boot (looks like a different crash
though). But then everything appears to work after networkd restarts
once. I will let it run and see if the invalid pointer crash happens.

# journalctl -l -u systemd-networkd -b 0
-- Logs begin at Wed 2020-06-03 15:00:29 UTC, end at Wed 2020-07-08 17:20:23 UTC. --
Jul 08 17:17:01 ip-10-0-4-251 systemd[1]: Starting Network Service...
Jul 08 17:17:01 ip-10-0-4-251 systemd-networkd[714]: Enumeration completed
Jul 08 17:17:01 ip-10-0-4-251 systemd[1]: Started Network Service.
Jul 08 17:17:01 ip-10-0-4-251 systemd-networkd[714]: ens5: Link UP
Jul 08 17:17:01 ip-10-0-4-251 systemd-networkd[714]: ens5: Gained carrier
Jul 08 17:17:01 ip-10-0-4-251 systemd-networkd[714]: ens5: Link DOWN
Jul 08 17:17:01 ip-10-0-4-251 systemd-networkd[714]: ens5: Lost carrier
Jul 08 17:17:01 ip-10-0-4-251 systemd[1]: systemd-networkd.service: Main process exited, code=dumped, status=11/SEGV
Jul 08 17:17:01 ip-10-0-4-251 systemd[1]: systemd-networkd.service: Failed with result 'core-dump'.
Jul 08 17:17:01 ip-10-0-4-251 systemd[1]: systemd-networkd.service: Service has no hold-off time, scheduling restart.
Jul 08 17:17:01 ip-10-0-4-251 systemd[1]: systemd-networkd.service: Scheduled restart job, restart counter is at 1.
Jul 08 17:17:01 ip-10-0-4-251 systemd[1]: Stopped Network Service.
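
To keep an eye on how often the service dumps core while a test build runs, the journal text can be counted with something like the following. This is a minimal sketch: count_segv and sample_log are hypothetical names, and the match string is copied from the log excerpt above, so it only covers the SEGV case.

```shell
#!/bin/bash
# count_segv (hypothetical helper): count SEGV core-dump exits of
# systemd-networkd in journal text read from stdin.
count_segv() {
  grep -cF 'systemd-networkd.service: Main process exited, code=dumped, status=11/SEGV'
}

# Sample lines taken from the journal excerpt above.
sample_log() {
  cat <<'EOF'
Jul 08 17:17:01 ip-10-0-4-251 systemd-networkd[714]: ens5: Lost carrier
Jul 08 17:17:01 ip-10-0-4-251 systemd[1]: systemd-networkd.service: Main process exited, code=dumped, status=11/SEGV
Jul 08 17:17:01 ip-10-0-4-251 systemd[1]: systemd-networkd.service: Failed with result 'core-dump'.
EOF
}

# Print the number of SEGV exits found in the sample.
sample_log | count_segv
```

On the affected machine the input would be `journalctl -u systemd-networkd -b 0` piped into count_segv.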


[Touch-packages] [Bug 1881972] Re: systemd-networkd crashes with invalid pointer

2020-07-09 Thread John Nielsen
I was not able to reproduce the original issue on
237-3ubuntu10.42~202007071725~ubuntu18.04.1 after letting it run for 12+
hours. I have now installed the newer
237-3ubuntu10.42~202007081907~ubuntu18.04.1 from the same PPA. I no
longer see a SEGV when the service first starts at boot, thanks! I will
let it run a few hours again to confirm that the original issue has been
addressed.

[Touch-packages] [Bug 1881972] Re: systemd-networkd crashes with invalid pointer

2020-06-16 Thread John Nielsen
# coredumpctl gdb 28819
           PID: 28819 (systemd-network)
           UID: 100 (systemd-network)
           GID: 102 (systemd-network)
        Signal: 6 (ABRT)
     Timestamp: Tue 2020-06-16 19:36:22 UTC (16min ago)
  Command Line: /lib/systemd/systemd-networkd
    Executable: /lib/systemd/systemd-networkd
 Control Group: /system.slice/systemd-networkd.service
          Unit: systemd-networkd.service
         Slice: system.slice
       Boot ID: 578e8b2c2e1a43afbd27211be1a4f531
    Machine ID: ec267b3475883f9edb99f554607bb456
      Hostname: ip-10-0-4-251
       Storage: /var/lib/systemd/coredump/core.systemd-network.100.578e8b2c2e1a43afbd27211be1a4f531.28819.159233618200.lz4
       Message: Process 28819 (systemd-network) of user 100 dumped core.

Stack trace of thread 28819:
#0  0x7f740d023e97 raise (libc.so.6)
#1  0x7f740d025801 abort (libc.so.6)
#2  0x7f740d06e897 n/a (libc.so.6)
#3  0x7f740d07590a n/a (libc.so.6)
#4  0x7f740d07ce1c cfree (libc.so.6)
#5  0x55fa5c16276b routing_policy_rule_free (systemd-networkd)
#6  0x55fa5c1f69e2 manager_rtnl_process_rule (systemd-networkd)
#7  0x55fa5c1731d6 process_match (systemd-networkd)
#8  0x55fa5c173413 io_callback (systemd-networkd)
#9  0x55fa5c178350 source_dispatch (systemd-networkd)
#10 0x55fa5c1785ea sd_event_dispatch (systemd-networkd)
#11 0x55fa5c178779 sd_event_run (systemd-networkd)
#12 0x55fa5c1789bb sd_event_loop (systemd-networkd)
#13 0x55fa5c1413a6 main (systemd-networkd)
#14 0x7f740d006b97 __libc_start_main (libc.so.6)
#15 0x55fa5c141a8a _start (systemd-networkd)


** Attachment added: "core dump file"
   https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1881972/+attachment/5384495/+files/core.systemd-network.100.578e8b2c2e1a43afbd27211be1a4f531.28819.159233618200.lz4

Title:
  systemd-networkd crashes with invalid pointer

Status in systemd package in Ubuntu:
  Incomplete


[Touch-packages] [Bug 1881972] Re: systemd-networkd crashes with invalid pointer

2020-06-16 Thread John Nielsen
I've gathered several core dumps now but the stacktraces are identical.
The logged reason for dumping core is "free(): invalid pointer".

I wonder if there is a race condition between whatever networkd itself is
doing when it reconfigures the interface (which it seems to do more
aggressively for DHCP renewals than it used to) and what my scripts in
/etc/networkd-dispatcher/configured.d and
/etc/networkd-dispatcher/configuring.d are doing. Assuming the
"routing_policy_rule_free" function is equivalent to "ip rule delete
..." there could be a conflict with my "configuring.d" script in
particular. The "configured.d" script adds some extra routing policy
rules and the "configuring.d" script deletes them so they aren't
duplicated every time the network is configured.
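
For illustration only, here is a minimal sketch of how the add/delete pair
could be collapsed into a single idempotent step, so that rules are only
added when missing and nothing has to be deleted on each reconfiguration.
The helper names (rule_present, ensure_rule) and the IP override variable
are hypothetical, and the selector text is assumed to match what
"ip rule show" prints on the system in question. This would not fix a
crash inside networkd itself; it only reduces how often rules are removed
and re-added.

```shell
#!/bin/bash
# Hypothetical helpers, not part of the reporter's scripts.
# IP can be overridden so the logic can be exercised without root.
IP="${IP:-/sbin/ip}"

rule_present() {
  # $1 = selector as printed by `ip rule show` (assumed format, e.g. "iif ens6")
  # $2 = routing table number
  # -F: fixed string, -w: whole words, so table 99 does not match table 990
  "$IP" rule show | grep -qFw -- "$1 lookup $2"
}

ensure_rule() {
  # usage: ensure_rule <printed-selector> <table> <selector args for `ip rule add`>
  local printed="$1" table="$2"
  shift 2
  rule_present "$printed" "$table" || "$IP" rule add "$@" lookup "$table"
}
```

With FWD_TABLE=99 as in the configured.d script, the three rules could
then be requested as "ensure_rule 'iif ens6' 99 iif ens6",
"ensure_rule 'oif ens6' 99 oif ens6" and
"ensure_rule 'from 10.0.3.171' 99 from 10.0.3.171/32" (the last one
assumes that a /32 source is printed without its prefix length).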

** Changed in: systemd (Ubuntu)
   Status: Incomplete => New

[Touch-packages] [Bug 1881972] Re: systemd-networkd crashes with invalid pointer

2020-06-03 Thread John Nielsen
The system I pulled the networkd log from had not yet become unreachable
but here's a snippet from one that did:

May 27 07:38:18 ip-10-0-4-228 systemd[1]: systemd-networkd.service: Failed with result 'core-dump'.
May 27 07:38:18 ip-10-0-4-228 systemd[1]: Failed to start Network Service.
May 27 07:38:18 ip-10-0-4-228 systemd[1]: Dependency failed for Wait for Network to be Configured.
May 27 07:38:18 ip-10-0-4-228 systemd[1]: systemd-networkd-wait-online.service: Job systemd-networkd-wait-online.service/start failed with result 'dependency'
May 27 07:38:18 ip-10-0-4-228 systemd[1]: systemd-networkd.socket: Failed with result 'service-start-limit-hit'.
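
As a rough sketch of how this state could be detected and cleared by hand:
the 'service-start-limit-hit' result means systemd's start rate limit has
stopped further restart attempts. The function names below are
hypothetical; "systemctl reset-failed" and "systemctl start" are the
standard commands, but whether clearing the failed state is enough on a
given system is an assumption, and it does nothing about the underlying
crash.

```shell
#!/bin/bash
# Returns 0 if the log text on stdin shows a unit that hit its start limit.
# Matches both 'start-limit-hit' and 'service-start-limit-hit'.
hit_start_limit() {
  grep -qE "Failed with result '(service-)?start-limit-hit'"
}

# Hypothetical recovery step; needs root and a running systemd.
recover_networkd() {
  systemctl reset-failed systemd-networkd.service systemd-networkd.socket
  systemctl start systemd-networkd.socket systemd-networkd.service
}

# Example wiring (not run automatically):
#   journalctl -b --no-pager -u systemd-networkd.socket | hit_start_limit && recover_networkd
```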

[Touch-packages] [Bug 1881972] Re: systemd-networkd crashes with invalid pointer

2020-06-03 Thread John Nielsen
Also of note is that on systems with systemd-237-3ubuntu10.33 or older I
don't see the "ens5: Configured" log messages at all after initial
configuration, even though I'm sure DHCP renewals are happening.

[Touch-packages] [Bug 1881972] [NEW] systemd-networkd crashes with invalid pointer

2020-06-03 Thread John Nielsen
Public bug reported:

This is a serious regression with systemd-networkd that I ran into
while setting up a NAT router in AWS. The AWS AMI
ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20200131 with
systemd 237-3ubuntu10.33 does NOT have the problem, but the next most
recent AWS AMI
ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20200311, which
includes systemd 237-3ubuntu10.39, does.

Also, a system booted from the (good) 20200131 AMI starts showing the
problem after updating only systemd (to 237-3ubuntu10.41) and its direct
dependencies (e.g. 'apt-get install systemd'). So I'm fairly confident
that a change to the systemd package between 237-3ubuntu10.33 and
237-3ubuntu10.39 introduced the problem and it is still present.

On the NAT router I use three interfaces and have separate routing
tables for admin and forwarded traffic. Things come up fine initially
but every 30-60 minutes (DHCP lease renewal time?) one or more
interfaces are reconfigured and most of the time systemd-networkd will
crash and need to be restarted. Eventually the system becomes
unreachable when the default crash loop backoff logic prevents the
network service from being restarted at all. The log excerpt attached
illustrates the crash loop.

Also including the netplan and networkd config files below.

# grep . /etc/netplan/*
/etc/netplan/50-cloud-init.yaml:# This file is generated from information provided by the datasource.  Changes
/etc/netplan/50-cloud-init.yaml:# to it will not persist across an instance reboot.  To disable cloud-init's
/etc/netplan/50-cloud-init.yaml:# network configuration capabilities, write a file
/etc/netplan/50-cloud-init.yaml:# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
/etc/netplan/50-cloud-init.yaml:# network: {config: disabled}
/etc/netplan/50-cloud-init.yaml:network:
/etc/netplan/50-cloud-init.yaml:    version: 2
/etc/netplan/50-cloud-init.yaml:    ethernets:
/etc/netplan/50-cloud-init.yaml:        ens5:
/etc/netplan/50-cloud-init.yaml:            dhcp4: true
/etc/netplan/50-cloud-init.yaml:            match:
/etc/netplan/50-cloud-init.yaml:                macaddress: xx:xx:xx:xx:xx:xx
/etc/netplan/50-cloud-init.yaml:            set-name: ens5
/etc/netplan/99_config.yaml:network:
/etc/netplan/99_config.yaml:  version: 2
/etc/netplan/99_config.yaml:  renderer: networkd
/etc/netplan/99_config.yaml:  ethernets:
/etc/netplan/99_config.yaml:    ens6:
/etc/netplan/99_config.yaml:      match:
/etc/netplan/99_config.yaml:        macaddress: yy:yy:yy:yy:yy:yy
/etc/netplan/99_config.yaml:      dhcp4: true
/etc/netplan/99_config.yaml:      dhcp4-overrides:
/etc/netplan/99_config.yaml:        use-routes: false
/etc/netplan/99_config.yaml:    ens7:
/etc/netplan/99_config.yaml:      match:
/etc/netplan/99_config.yaml:        macaddress: zz:zz:zz:zz:zz:zz
/etc/netplan/99_config.yaml:      mtu: 1500
/etc/netplan/99_config.yaml:      dhcp4: true
/etc/netplan/99_config.yaml:      dhcp4-overrides:
/etc/netplan/99_config.yaml:        use-mtu: false
/etc/netplan/99_config.yaml:        use-routes: false

# grep . /etc/networkd-dispatcher/*/*
/etc/networkd-dispatcher/configured.d/nat:#!/bin/bash
/etc/networkd-dispatcher/configured.d/nat:# Do additional configuration for the inside and outside interfaces
/etc/networkd-dispatcher/configured.d/nat:# route table used for forwarded/routed/natted traffic
/etc/networkd-dispatcher/configured.d/nat:FWD_TABLE=99
/etc/networkd-dispatcher/configured.d/nat:if [ "${IFACE}" = "ens6" ]; then
/etc/networkd-dispatcher/configured.d/nat:  # delete link-local route for inside in default table
/etc/networkd-dispatcher/configured.d/nat:  /sbin/ip route delete 10.0.3.0/24 2>/dev/null || true
/etc/networkd-dispatcher/configured.d/nat:  # add link-local route for inside in table 99
/etc/networkd-dispatcher/configured.d/nat:  /sbin/ip route replace 10.0.3.0/24 dev ens6 scope link src 10.0.3.171 table ${FWD_TABLE}
/etc/networkd-dispatcher/configured.d/nat:  # add routes to VPC cidrs via inside gateway in table 99
/etc/networkd-dispatcher/configured.d/nat:  /sbin/ip route replace 10.0.0.0/16 via 10.0.3.1 table ${FWD_TABLE}
/etc/networkd-dispatcher/configured.d/nat:  # add rules to use table 99
/etc/networkd-dispatcher/configured.d/nat:  /sbin/ip rule add iif  ens6 lookup ${FWD_TABLE}
/etc/networkd-dispatcher/configured.d/nat:  /sbin/ip rule add oif  ens6 lookup ${FWD_TABLE}
/etc/networkd-dispatcher/configured.d/nat:  /sbin/ip rule add from 10.0.3.171/32 lookup ${FWD_TABLE}
/etc/networkd-dispatcher/configured.d/nat:elif [ "${IFACE}" = "ens7" ]; then
/etc/networkd-dispatcher/configured.d/nat:  # delete link-local route for outside in default table
/etc/networkd-dispatcher/configured.d/nat:  /sbin/ip route delete 10.0.2.0/24 2>/dev/null || true
/etc/networkd-dispatcher/configured.d/nat:  # add link-local route for outside in table 99
/etc/networkd-dispatcher/configured.d/nat:  /sbin/ip route replace 10.0.2.0/24 dev ens7 scope link src