Re: [tor-bugs] #34185 [Internal Services/Tor Sysadmin Team]: ganeti clusters don't like automatic upgrades

2020-06-10 Thread Tor Bug Tracker & Wiki
#34185: ganeti clusters don't like automatic upgrades
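-------------------------------------------------+--------------------------
 Reporter:  anarcat                              |          Owner:  hiro
     Type:  defect                               |         Status:  assigned
 Priority:  High                                 |      Milestone:
Component:  Internal Services/Tor Sysadmin Team  |        Version:
 Severity:  Major                                |     Resolution:
 Keywords:  tpa-roadmap-june                     |  Actual Points:
Parent ID:                                       |         Points:
 Reviewer:                                       |        Sponsor:
-------------------------------------------------+--------------------------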
Changes (by hiro):

 * keywords:  tpa-roadmap-may => tpa-roadmap-june


--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs


Re: [tor-bugs] #34185 [Internal Services/Tor Sysadmin Team]: ganeti clusters don't like automatic upgrades

2020-05-29 Thread Tor Bug Tracker & Wiki
#34185: ganeti clusters don't like automatic upgrades
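-------------------------------------------------+--------------------------
 Reporter:  anarcat                              |          Owner:  hiro
     Type:  defect                               |         Status:  assigned
 Priority:  High                                 |      Milestone:
Component:  Internal Services/Tor Sysadmin Team  |        Version:
 Severity:  Major                                |     Resolution:
 Keywords:  tpa-roadmap-may                      |  Actual Points:
Parent ID:                                       |         Points:
 Reviewer:                                       |        Sponsor:
-------------------------------------------------+--------------------------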

Comment (by hiro):

 Opened a bug with upstream yesterday:
 https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=961746



Re: [tor-bugs] #34185 [Internal Services/Tor Sysadmin Team]: ganeti clusters don't like automatic upgrades

2020-05-27 Thread Tor Bug Tracker & Wiki
#34185: ganeti clusters don't like automatic upgrades
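-------------------------------------------------+--------------------------
 Reporter:  anarcat                              |          Owner:  hiro
     Type:  defect                               |         Status:  assigned
 Priority:  High                                 |      Milestone:
Component:  Internal Services/Tor Sysadmin Team  |        Version:
 Severity:  Major                                |     Resolution:
 Keywords:  tpa-roadmap-may                      |  Actual Points:
Parent ID:                                       |         Points:
 Reviewer:                                       |        Sponsor:
-------------------------------------------------+--------------------------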

Comment (by hiro):

 Migrating VMs between nodes brings them back online.



Re: [tor-bugs] #34185 [Internal Services/Tor Sysadmin Team]: ganeti clusters don't like automatic upgrades

2020-05-27 Thread Tor Bug Tracker & Wiki
#34185: ganeti clusters don't like automatic upgrades
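-------------------------------------------------+--------------------------
 Reporter:  anarcat                              |          Owner:  hiro
     Type:  defect                               |         Status:  assigned
 Priority:  High                                 |      Milestone:
Component:  Internal Services/Tor Sysadmin Team  |        Version:
 Severity:  Major                                |     Resolution:
 Keywords:  tpa-roadmap-may                      |  Actual Points:
Parent ID:                                       |         Points:
 Reviewer:                                       |        Sponsor:
-------------------------------------------------+--------------------------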

Comment (by hiro):

 I tested reinstalling openvswitch on fsn-node-06 with:

 {{{
 apt install --reinstall openvswitch-switch
 }}}

 This caused openvswitch-switch to restart:


 {{{
 Active: active (exited) since Wed 2020-05-27 17:09:27 UTC; 2min 44s ago
 }}}

 I think openvswitch should be upgraded manually for the time being.
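
 A rough sketch of how that could look on a node, assuming the Debian unit
 name openvswitch-switch.service and that holding the packages with apt-mark
 is acceptable. active_since is a hypothetical helper, not part of systemd;
 it only pulls the start time out of "systemctl status" output so that an
 unexpected restart is easy to spot:

```shell
#!/bin/sh
# Print the "since" timestamp from `systemctl status` output read on stdin.
# active_since is a hypothetical helper name used for illustration only.
active_since() {
    sed -n 's/^ *Active: .* since \(.*\); .*$/\1/p'
}

# On a node (not run here; assumes systemd and the Debian unit name):
#   systemctl status openvswitch-switch.service | active_since
#
# Holding the packages so that apt leaves them alone until a manual upgrade:
#   apt-mark hold openvswitch-common openvswitch-switch
#   apt-mark showhold
```

 The hold would have to be lifted again with apt-mark unhold before each
 manual upgrade.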



Re: [tor-bugs] #34185 [Internal Services/Tor Sysadmin Team]: ganeti clusters don't like automatic upgrades

2020-05-27 Thread Tor Bug Tracker & Wiki
#34185: ganeti clusters don't like automatic upgrades
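-------------------------------------------------+--------------------------
 Reporter:  anarcat                              |          Owner:  hiro
     Type:  defect                               |         Status:  assigned
 Priority:  High                                 |      Milestone:
Component:  Internal Services/Tor Sysadmin Team  |        Version:
 Severity:  Major                                |     Resolution:
 Keywords:  tpa-roadmap-may                      |  Actual Points:
Parent ID:                                       |         Points:
 Reviewer:                                       |        Sponsor:
-------------------------------------------------+--------------------------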

Comment (by hiro):

 Openvswitch was updated together with the following group of packages:

 {{{
 2020-05-10 06:12:53,754 INFO Packages that will be upgraded: base-files
 distro-info-data iputils-arping iputils-ping iputils-tracepath
 libbrlapi0.6 libfuse2 libpam-systemd libsystemd0 libudev1
 linux-compiler-gcc-8-x86 linux-headers-amd64 linux-image-amd64
 linux-kbuild-4.19 openvswitch-common openvswitch-switch postfix
 postfix-cdb rake rubygems-integration systemd systemd-sysv tzdata udev
 }}}
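
 To check other nodes for the same thing, something like the following
 could list what unattended-upgrades touched. This is only a sketch: it
 assumes the default Debian log location
 (/var/log/unattended-upgrades/unattended-upgrades.log) and that the package
 list sits on a single line in the real log, unlike the wrapped excerpt
 above. upgraded_packages is a hypothetical helper name.

```shell
#!/bin/sh
# Read unattended-upgrades log lines on stdin and print one upgraded
# package name per line. upgraded_packages is a hypothetical helper.
upgraded_packages() {
    sed -n 's/^.*Packages that will be upgraded: //p' | tr -s ' ' '\n'
}

# On a node (not run here; the path is the assumed Debian default):
#   upgraded_packages < /var/log/unattended-upgrades/unattended-upgrades.log \
#       | grep '^openvswitch'
```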

 Checking the openvswitch status shows it has not been restarted since the
 10th of May:

 {{{
 Loaded: loaded (/lib/systemd/system/openvswitch-switch.service; enabled; vendor preset: enabled)
 Active: active (exited) since Sun 2020-05-10 14:05:11 UTC; 2 weeks 3 days ago
 }}}

 And from the log on that day I actually see it died twice:

 {{{
 2020-05-10T06:13:16.534Z|3|vlog(monitor)|INFO|opened log file /var/log/openvswitch/ovs-vswitchd.log
 2020-05-10T06:13:16.534Z|4|daemon_unix(monitor)|INFO|pid 3211 died, exit status 0, exiting
 2020-05-10T06:13:16.787Z|1|vlog|INFO|opened log file /var/log/openvswitch/ovs-vswitchd.log
 2020-05-10T06:13:16.788Z|2|ovs_numa|INFO|Discovered 12 CPU cores on NUMA node 0
 2020-05-10T06:13:16.788Z|3|ovs_numa|INFO|Discovered 1 NUMA nodes and 12 CPU cores
 2020-05-10T06:13:16.788Z|4|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
 2020-05-10T06:13:16.788Z|5|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
 2020-05-10T06:13:16.791Z|6|bridge|INFO|ovs-vswitchd (Open vSwitch) 2.10.1
 2020-05-10T06:13:17.332Z|2|daemon_unix(monitor)|INFO|pid 29781 died, exit status 0, exiting
 2020-05-10T06:13:17.621Z|1|vlog|INFO|opened log file /var/log/openvswitch/ovs-vswitchd.log
 2020-05-10T06:13:17.623Z|2|ovs_numa|INFO|Discovered 12 CPU cores on NUMA node 0
 2020-05-10T06:13:17.623Z|3|ovs_numa|INFO|Discovered 1 NUMA nodes and 12 CPU cores
 2020-05-10T06:13:17.623Z|4|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
 2020-05-10T06:13:17.623Z|5|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
 2020-05-10T06:13:17.630Z|6|bridge|INFO|ovs-vswitchd (Open vSwitch) 2.10.1

 ...

 2020-05-10T14:02:23.078Z|00036|bridge|INFO|bridge br0: using datapath ID 7eb83553f345
 2020-05-10T14:02:23.398Z|00037|bridge|INFO|bridge br0: deleted interface br0 on port 65534
 2020-05-10T14:02:23.578Z|2|daemon_unix(monitor)|INFO|pid 29951 died, exit status 0, exiting
 2020-05-10T14:05:05.241Z|1|vlog|INFO|opened log file /var/log/openvswitch/ovs-vswitchd.log
 2020-05-10T14:05:05.247Z|2|ovs_numa|INFO|Discovered 12 CPU cores on NUMA node 0
 2020-05-10T14:05:05.247Z|3|ovs_numa|INFO|Discovered 1 NUMA nodes and 12 CPU cores
 2020-05-10T14:05:05.247Z|4|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
 2020-05-10T14:05:05.247Z|5|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
 2020-05-10T14:05:05.250Z|6|bridge|INFO|ovs-vswitchd (Open vSwitch) 2.10.1
 2020-05-10T14:05:09.384Z|7|ofproto_dpif|INFO|system@ovs-system: Datapath supports recirculation
 2020-05-10T14:05:09.384Z|8|ofproto_dpif|INFO|system@ovs-system: VLAN header stack length probed as 2
 2020-05-10T14:05:09.384Z|9|ofproto_dpif|INFO|system@ovs-system: MPLS label stack length probed as 1
 2020-05-10T14:05:09.384Z|00010|ofproto_dpif|INFO|system@ovs-system: Datapath supports truncate action
 2020-05-10T14:05:09.384Z|00011|ofproto_dpif|INFO|system@ovs-system: Datapath supports unique flow ids
 }}}
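
 Picking the "died" entries out of the full log by eye is error-prone, so
 here is a small sketch that prints only those. It assumes the
 "|"-separated layout shown in the excerpt (timestamp, sequence number,
 module, level, message) and the log path that appears in it; list_deaths is
 a hypothetical helper name.

```shell
#!/bin/sh
# Print the timestamp and PID of every "died" event logged by the
# ovs-vswitchd monitor process. Reads the log on stdin.
# list_deaths is a hypothetical helper name used for illustration only.
list_deaths() {
    awk -F'|' '$3 == "daemon_unix(monitor)" && $5 ~ /^pid [0-9]+ died/ {
        split($5, words, " ")   # words[2] is the PID
        print $1, words[2]
    }'
}

# On a node (not run here):
#   list_deaths < /var/log/openvswitch/ovs-vswitchd.log
```

 Comparing these timestamps with the unattended-upgrades log should show
 whether each restart lines up with a package upgrade.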



Re: [tor-bugs] #34185 [Internal Services/Tor Sysadmin Team]: ganeti clusters don't like automatic upgrades

2020-05-26 Thread Tor Bug Tracker & Wiki
#34185: ganeti clusters don't like automatic upgrades
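-------------------------------------------------+--------------------------
 Reporter:  anarcat                              |          Owner:  hiro
     Type:  defect                               |         Status:  assigned
 Priority:  High                                 |      Milestone:
Component:  Internal Services/Tor Sysadmin Team  |        Version:
 Severity:  Major                                |     Resolution:
 Keywords:  tpa-roadmap-may                      |  Actual Points:
Parent ID:                                       |         Points:
 Reviewer:                                       |        Sponsor:
-------------------------------------------------+--------------------------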

Comment (by hiro):

 On the 10th of May there was an unattended upgrade. The kernel was updated
 and the system restarted.
 Openvswitch was updated and restarted, so maybe the blacklist didn't work.

 According to the unattended-upgrades logs, the following services were
 restarted by needrestart:

 {{{

 Restarting services...
  /etc/needrestart/restart.d/dbus.service
  systemctl restart apt-daily-upgrade.service ganeti.service
 smartmontools.service ssh.service strongswan.service syslog-ng.service
 systemd-logind.service unattended-upgrades.service unbound.service

 }}}



Re: [tor-bugs] #34185 [Internal Services/Tor Sysadmin Team]: ganeti clusters don't like automatic upgrades

2020-05-11 Thread Tor Bug Tracker & Wiki
#34185: ganeti clusters don't like automatic upgrades
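-------------------------------------------------+--------------------------
 Reporter:  anarcat                              |          Owner:  hiro
     Type:  defect                               |         Status:  assigned
 Priority:  High                                 |      Milestone:
Component:  Internal Services/Tor Sysadmin Team  |        Version:
 Severity:  Major                                |     Resolution:
 Keywords:  tpa-roadmap-may                      |  Actual Points:
Parent ID:                                       |         Points:
 Reviewer:                                       |        Sponsor:
-------------------------------------------------+--------------------------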

Comment (by anarcat):

 This is the mail I sent on Sunday:

 > There was a ~8h ganeti outage until about now. It seems the buster point
 > release broke things in our automated upgrade procedure. I didn't have
 > time to diagnose the issue (I was running out) and figured it was more
 > urgent to restore the service.
 >
 > I rebooted all gnt-fsn nodes by hand (without migrating). Some instances
 > returned with a state of "ERROR_down", so I manually started them (with
 > gnt-instance start). Everything now seems to be back up.
 >
 > I haven't looked at Nagios in detail, but everything is mostly
 > "yellow" now so I'll assume we're good.
 >
 > It would be great if someone could look at the logs and see what
 > happened. I suspect the openvswitch fix didn't work, or maybe there are
 > other servers we need to block from needrestart's automation (or maybe
 > even unattended-upgrades).
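
 For next time, the manual "start everything in ERROR_down" step could
 probably be scripted. A minimal sketch, assuming it runs on the Ganeti
 master and that "gnt-instance list --no-headers -o name,status" prints one
 "name status" pair per line; down_instances is a hypothetical helper name
 and the loop in the comment is untested:

```shell
#!/bin/sh
# Print the names of instances whose status is ERROR_down. Expects
# "name status" pairs on stdin, one instance per line.
# down_instances is a hypothetical helper name used for illustration only.
down_instances() {
    awk '$2 == "ERROR_down" { print $1 }'
}

# On the Ganeti master (not run here):
#   gnt-instance list --no-headers -o name,status | down_instances |
#       while read -r name; do gnt-instance start "$name"; done
```

 Running only the listing part first and reviewing the names before
 starting anything would be the safer way to use this.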
