[Bug 1830427] Re: Instance may loose network connectivity after resuming the 2nd time

2019-06-20 Thread Balint Reczey
I tried verification on Cosmic and Disco, too, but they did not hibernate a 
second time.
Since the contents of _this_ package are the same, I suspect a kernel issue, but I 
need to do more debugging.

Since getting the fix in for Bionic and Xenial is quite urgent, I propose 
accepting the SRUs for those releases even though later releases do not get the 
fix yet.
The Xenial and Bionic packages have already been heavily tested by CPC, too.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1830427

Title:
  Instance may lose network connectivity after resuming the 2nd time

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ec2-hibinit-agent/+bug/1830427/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1830427] Re: Instance may loose network connectivity after resuming the 2nd time

2019-06-20 Thread Balint Reczey
Verified 1.0.0-0ubuntu4~18.04.2 on Bionic:

[  104.950957] PM: hibernation entry
[  105.058661] PM: Syncing filesystems ... 
[  105.070422] PM: done.
[  105.070423] Freezing user space processes ... (elapsed 0.001 seconds) done.
[  105.071662] OOM killer disabled.
[  105.071811] PM: Marking nosave pages: [mem 0x-0x0fff]
[  105.071812] PM: Marking nosave pages: [mem 0x0009f000-0x000f]
[  105.071814] PM: Marking nosave pages: [mem 0xbfffa000-0x]
[  105.072162] PM: Basic memory bitmaps created
[  105.072170] PM: Preallocating image memory... done (allocated 155518 pages)
[  105.182089] PM: Allocated 622072 kbytes in 0.10 seconds (6220.72 MB/s)
[  105.182090] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[  105.318242] ACPI: Preparing to enter system sleep state S4
[  105.318300] PM: Saving platform NVS memory
[  105.318302] Disabling non-boot CPUs ...
[  105.332476] Unregister pv shared memory for cpu 1
[  105.333622] smpboot: CPU 1 is now offline
[  105.356694] Unregister pv shared memory for cpu 2
[  105.357833] smpboot: CPU 2 is now offline
[  105.380437] Unregister pv shared memory for cpu 3
[  105.381598] smpboot: CPU 3 is now offline
[  105.383096] PM: Creating hibernation image:
[  105.472763] PM: Need to copy 149179 pages
[  105.472766] PM: Normal pages needed: 149179 + 1024, available pages: 1856580
[4.344618] kvm-clock: cpu 0, msr 2:29b53001, primary cpu clock, resume
[4.344682] PM: Restoring platform NVS memory
[4.346041] Enabling non-boot CPUs ...
[4.346093] x86: Booting SMP configuration:
[4.346094] smpboot: Booting Node 0 Processor 1 APIC 0x2
[4.346187] kvm-clock: cpu 1, msr 2:29b53041, secondary cpu clock
[4.346443] KVM setup async PF for cpu 1
[4.346447] kvm-stealtime: cpu 1, msr 2298a4040
[4.346496]  cache: parent cpu1 should not be sleeping
[4.346674] CPU1 is up
[4.346695] smpboot: Booting Node 0 Processor 2 APIC 0x1
[4.346784] kvm-clock: cpu 2, msr 2:29b53081, secondary cpu clock
[4.347078] KVM setup async PF for cpu 2
[4.347082] kvm-stealtime: cpu 2, msr 229924040
[4.347126]  cache: parent cpu2 should not be sleeping
[4.347541] CPU2 is up
[4.347562] smpboot: Booting Node 0 Processor 3 APIC 0x3
[4.347647] kvm-clock: cpu 3, msr 2:29b530c1, secondary cpu clock
[4.347890] KVM setup async PF for cpu 3
[4.347894] kvm-stealtime: cpu 3, msr 2299a4040
[4.347927]  cache: parent cpu3 should not be sleeping
[4.348115] CPU3 is up
[4.348268] ACPI: Waking up from system sleep state S4
[4.387725] ena: ena device version: 0.10
[4.387726] ena: ena controller version: 0.0.1 implementation version 1
[4.519157] ena :00:05.0: Device reset completed successfully, Driver info: Elastic Network Adapter (ENA) v2.0.3K
[4.524466] PM: Basic memory bitmaps freed
[4.524467] OOM killer enabled.
[4.524467] Restarting tasks ... done.
[4.632437] PM: hibernation exit
[  284.358553] Adding 7807992k swap on /swap-hibinit.  Priority:-2 extents:13 across:10101752k SSFS
[  284.367131] PM: Starting manual resume from disk
[  284.367485] PM: Image not found (code -22)
[  284.429876] PM: hibernation entry
[  284.536165] PM: Syncing filesystems ... 
[  284.543916] PM: done.
[  284.543918] Freezing user space processes ... (elapsed 0.001 seconds) done.
[  284.545175] OOM killer disabled.
[  284.545336] PM: Marking nosave pages: [mem 0x-0x0fff]
[  284.545337] PM: Marking nosave pages: [mem 0x0009f000-0x000f]
[  284.545339] PM: Marking nosave pages: [mem 0xbfffa000-0x]
[  284.545727] PM: Basic memory bitmaps created
[  284.545734] PM: Preallocating image memory... done (allocated 155510 pages)
[  284.655191] PM: Allocated 622040 kbytes in 0.10 seconds (6220.40 MB/s)
[  284.655192] Freezing remaining freezable tasks ... (elapsed 0.000 seconds) done.
[  284.793867] ACPI: Preparing to enter system sleep state S4
[  284.793924] PM: Saving platform NVS memory
[  284.793925] Disabling non-boot CPUs ...
[  284.808205] Unregister pv shared memory for cpu 1
[  284.809301] smpboot: CPU 1 is now offline
[  284.832403] Unregister pv shared memory for cpu 2
[  284.833479] smpboot: CPU 2 is now offline
[  284.856162] Unregister pv shared memory for cpu 3
[  284.857233] smpboot: CPU 3 is now offline
[  284.858329] PM: Creating hibernation image:
[  284.948195] PM: Need to copy 150004 pages
[  284.948198] PM: Normal pages needed: 150004 + 1024, available pages: 1855757
[4.182500] kvm-clock: cpu 0, msr 2:29b53001, primary cpu clock, resume
[4.182566] PM: Restoring platform NVS memory
[4.183927] Enabling non-boot CPUs ...
[4.183987] x86: Booting SMP configuration:
[4.183988] smpboot: Booting Node 0 Processor 1 APIC 0x2
[4.184079] kvm-clock: cpu 1, msr 2:29b53041, secondary cpu clock
[4.184338] KVM setup async PF for cpu 1
[4.184341] kvm-stealtime: cpu 1, msr 2298a4040
[4.184390]  cache: parent cpu1 should 

[Bug 1830427] Re: Instance may loose network connectivity after resuming the 2nd time

2019-06-20 Thread Balint Reczey
Verified ec2-hibinit-agent 1.0.0-0ubuntu4~16.04.2 on Xenial:

[  195.098817] done.
[  470.982689] Adding 4095996k swap on /swap-hibinit.  Priority:-1 extents:5 across:4382716k SSFS
[  471.001029] PM: Hibernation mode set to 'platform'
[  471.089553] PM: Syncing filesystems ... done.
[  471.092020] Freezing user space processes ... (elapsed 0.001 seconds) done.
[  471.093318] PM: Marking nosave pages: [mem 0x-0x0fff]
[  471.093320] PM: Marking nosave pages: [mem 0x0009e000-0x000f]
[  471.093322] PM: Basic memory bitmaps created
[  471.093329] PM: Preallocating image memory... done (allocated 147573 pages)
[  471.163207] PM: Allocated 590292 kbytes in 0.06 seconds (9838.20 MB/s)
[  471.163209] Freezing remaining freezable tasks ... (elapsed 0.000 seconds) done.
[  471.199361] PM: freeze of devices complete after 35.000 msecs
[  471.199527] PM: late freeze of devices complete after 0.162 msecs
[  471.204345] PM: noirq freeze of devices complete after 4.815 msecs
[  471.204349] ACPI: Preparing to enter system sleep state S4
[  471.204475] PM: Saving platform NVS memory
[  471.204477] Disabling non-boot CPUs ...
[  471.204966] Broke affinity for irq 1
[  471.204970] Broke affinity for irq 4
[  471.204974] Broke affinity for irq 8
[  471.204977] Broke affinity for irq 9
[  471.204981] Broke affinity for irq 12
[  471.205074] Broke affinity for irq 60
[  471.206093] smpboot: CPU 1 is now offline
[  471.217186] PM: Creating hibernation image:
[  471.220057] PM: Need to copy 146085 pages
[  471.220057] PM: Normal pages needed: 146085 + 1024, available pages: 836780
[  471.220057] PM: Restoring platform NVS memory
[  471.220057] xen:grant_table: Grant tables using version 1 layout
[  471.220057] Enabling non-boot CPUs ...
[  471.220057] installing Xen timer for CPU 1
[  471.236234] x86: Booting SMP configuration:
[  471.236236] smpboot: Booting Node 0 Processor 1 APIC 0x1
[  471.237520] Skipped synchronization checks as TSC is reliable.
[  471.237537] cpu 1 spinlock event irq 59
[  471.237904]  cache: parent cpu1 should not be sleeping
[  471.238000] CPU1 is up
[  471.238064] ACPI: Waking up from system sleep state S4
[  471.242499] PM: noirq restore of devices complete after 4.348 msecs
[  471.242675] PM: early restore of devices complete after 0.110 msecs
[  471.263920] rtc_cmos 00:02: System wakeup disabled by ACPI
[  471.268813] Setting capacity to 41943040
[  471.285823] PM: restore of devices complete after 25.502 msecs
[  471.285999] PM: Image restored successfully.
[  471.286009] PM: Basic memory bitmaps freed
[  471.286010] Restarting tasks ... 
[  471.289490] ixgbevf :00:03.0: NIC Link is Up 10 Gbps
[  471.292384] done.
ubuntu@ip-172-31-1-3:~$ cat /etc/issue
Ubuntu 16.04.6 LTS \n \l

ubuntu@ip-172-31-1-3:~$ screen -ls
There is a screen on:
1745..ip-172-31-1-3 (06/20/2019 08:32:23 PM)(Detached)
1 Socket in /var/run/screen/S-ubuntu.
ubuntu@ip-172-31-1-3:~$ dpkg -l ec2-hibinit-agent 
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name               Version          Architecture Description
+++-==================-================-============-============================
ii  ec2-hibinit-agent  1.0.0-0ubuntu4~1 all          Amazon EC2 hibernation agent
ubuntu@ip-172-31-1-3:~$ dpkg -l ec2-hibinit-agent  | cat
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name              Version                Architecture Description
+++-=================-======================-============-============================
ii  ec2-hibinit-agent 1.0.0-0ubuntu4~16.04.2 all          Amazon EC2 hibernation agent
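
The number of hibernation cycles in logs like the ones quoted in this thread can be counted with a small helper. This is only a sketch: the function name `count_hibernation_cycles` is hypothetical, and it assumes that every cycle logs exactly one "PM: Creating hibernation image" line, as both the Bionic and the Xenial logs here do. The count does not prove that the resume succeeded; the screen session check in the test case covers that.

```shell
#!/bin/sh
# Count hibernation cycles in kernel log text read from stdin.
# Assumption: each cycle logs one "PM: Creating hibernation image" line,
# as seen in the Bionic and Xenial logs quoted in this thread.
count_hibernation_cycles() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps the function's status clean.
    grep -c 'PM: Creating hibernation image' || true
}
```

For the test case, `dmesg | count_hibernation_cycles` should print 2 or more after the second resume, provided the earlier messages are still in the kernel ring buffer.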


[Bug 1830427] Re: Instance may loose network connectivity after resuming the 2nd time

2019-06-07 Thread Steve Langasek
Hello Balint, or anyone else affected,

Accepted ec2-hibinit-agent into xenial-proposed. The package will build
now and be available at
https://launchpad.net/ubuntu/+source/ec2-hibinit-agent/1.0.0-0ubuntu4~16.04.2
in a few hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed.  Your feedback will aid us in getting this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested and change the tag from
verification-needed-xenial to verification-done-xenial. If it does not
fix the bug for you, please add a comment stating that, and change the
tag to verification-failed-xenial. In either case, without details of
your testing we will not be able to proceed.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance for helping!

N.B. The updated package will be released to -updates after the bug(s)
fixed by this package have been verified and the package has been in
-proposed for a minimum of 7 days.

** Changed in: ec2-hibinit-agent (Ubuntu Xenial)
   Status: New => Fix Committed

** Tags added: verification-needed-xenial


[Bug 1830427] Re: Instance may loose network connectivity after resuming the 2nd time

2019-05-27 Thread Balint Reczey
** Description changed:

  [Impact]
  
   * Some hibernated, then started instances don't restore network connectivity, 
leaving the instance unreachable.
   * The fix is restarting systemd-networkd on resume.
  
  [Test Case]
  
-  0. Start a  instance from an encrypted EBS-backed AMI, with
+  0. Start an m5.large instance from an encrypted EBS-backed AMI, with
  hibernation enabled.
  
   1. Install ec2-hibinit-agent
  
   2. Start a long running process on the instance, like top in screen.
  
   3. Hibernate, then after it has finished, start the instance from the EC2 console
  
   4. Log in to the instance and observe top still running in screen (to
  prove that the instance resumed and had not been restarted).
  
   5. Hibernate, then after it has finished, start the instance from the EC2 console
  
   6. Log in to the instance and observe top still running in screen.
     (This second cycle ensures that hibernation works more than once.)
  
  [Regression Potential]
  
   * Restarting systemd-networkd may cause disturbances in complex networking 
setups, but since the system was hibernated, networking was down anyway.
   * The hook in /lib/systemd/system-sleep/ is run in parallel with other hooks 
in the same directory, and restarting networking may break them. In Bionic the 
following packages use similar hooks:
  
   $ apt-file search /lib/systemd/system-sleep/
  atop: /lib/systemd/system-sleep/atop-pm
  battery-stats: /lib/systemd/system-sleep/battery-stats
  ec2-hibinit-agent: /lib/systemd/system-sleep/hibinit-agent
  hdparm: /lib/systemd/system-sleep/hdparm
  lizardfs-chunkserver: /lib/systemd/system-sleep/lizardfs-chunkserver
  tuxonice-userui: /lib/systemd/system-sleep/tuxonice
  unattended-upgrades: /lib/systemd/system-sleep/unattended-upgrades
  
  Only lizardfs-chunkserver may be affected, because it starts
  lizardfs-chunkserver.service on resume, but its description claims it is
  reliable, so a networking restart is probably tolerated, too. Also, it
  has a popcon count of ~50 in Debian, which may not warrant an extensive
  investigation or adding Breaks: without being sure that it breaks.
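
For illustration, a hook of the kind described above could look like the following minimal sketch. It is not the script shipped in ec2-hibinit-agent. The function name `on_sleep_event` and the `SYSTEMCTL` override are hypothetical, introduced only so the logic can be dry-run without touching a real system. The sketch relies on the systemd-sleep convention that executables in /lib/systemd/system-sleep/ are called with two arguments: `pre` or `post`, followed by the sleep action (for example `hibernate`).

```shell
#!/bin/sh
# Sketch of a systemd-sleep hook; NOT the script shipped in ec2-hibinit-agent.
# systemd-sleep calls each hook with: $1 = pre|post, $2 = suspend|hibernate|...

# SYSTEMCTL is a hypothetical override for dry runs (e.g. SYSTEMCTL=echo).
on_sleep_event() {
    phase="$1"
    action="$2"
    # Act only after resuming from hibernation; ignore every other event.
    if [ "$phase" = "post" ] && [ "$action" = "hibernate" ]; then
        "${SYSTEMCTL:-systemctl}" restart systemd-networkd
    fi
}

on_sleep_event "$@"
```

Saved under a hypothetical name such as hook.sh, `SYSTEMCTL=echo sh hook.sh post hibernate` prints the command that would be run instead of executing it.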


[Bug 1830427] Re: Instance may loose network connectivity after resuming the 2nd time

2019-05-25 Thread Francis Ginther
** Tags added: id-5c000da0aa62bc2994611bd2


[Bug 1830427] Re: Instance may loose network connectivity after resuming the 2nd time

2019-05-24 Thread Steve Langasek
Hello Balint, or anyone else affected,

Accepted ec2-hibinit-agent into bionic-proposed. The package will build
now and be available at
https://launchpad.net/ubuntu/+source/ec2-hibinit-agent/1.0.0-0ubuntu4~18.04.1
in a few hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed.  Your feedback will aid us in getting this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested and change the tag from
verification-needed-bionic to verification-done-bionic. If it does not
fix the bug for you, please add a comment stating that, and change the
tag to verification-failed-bionic. In either case, without details of
your testing we will not be able to proceed.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance for helping!

N.B. The updated package will be released to -updates after the bug(s)
fixed by this package have been verified and the package has been in
-proposed for a minimum of 7 days.

** Changed in: ec2-hibinit-agent (Ubuntu Bionic)
   Status: New => Fix Committed

** Tags added: verification-needed-bionic


[Bug 1830427] Re: Instance may loose network connectivity after resuming the 2nd time

2019-05-24 Thread Steve Langasek
Hello Balint, or anyone else affected,

Accepted ec2-hibinit-agent into disco-proposed. The package will build
now and be available at
https://launchpad.net/ubuntu/+source/ec2-hibinit-agent/1.0.0-0ubuntu4.19.04.0
in a few hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed.  Your feedback will aid us in getting this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested and change the tag from
verification-needed-disco to verification-done-disco. If it does not fix
the bug for you, please add a comment stating that, and change the tag
to verification-failed-disco. In either case, without details of your
testing we will not be able to proceed.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance for helping!

N.B. The updated package will be released to -updates after the bug(s)
fixed by this package have been verified and the package has been in
-proposed for a minimum of 7 days.

** Description changed:

  [Impact]
  
   * Some hibernated, then started instances don't restore network connectivity 
keeping the instance unreachable.
-  * The fix is restarting systemd-networkd on resume.
+  * The fix is restarting systemd-networkd on resume.
  
  [Test Case]
  
   0. Start a  instance from an encrypted EBS-backed AMI, with
  hibernation enabled.
  
   1. Install ec2-hibinit-agent
  
   2. Start a long running process on the instance, like top in screen.
  
   3. Hibernate, then after it finished start the instance on EC2 console
  
-  4. Log in to the instance and observe top still running in screen.
+  4. Log in to the instance and observe top still running in screen (to
+ prove that the instance resumed and had not been restarted).
  
   5. Hibernate, then after it finished start the instance on EC2 console
  
   6. Log in to the instance and observe top still running in screen.
     (This second cycle ensures that hibernation works more than once.)
  
  [Regression Potential]
  
   * Restarting systemd-networkd may cause disturbances in complex networking 
setups, but since the system was hibernated networking was down anyway.
-  * The hook in /lib/systemd/system-sleep/ is ran in parallel to other hooks 
in the same directory and restarting networking may break them. In Bionic the 
following packages use similar hooks:
+  * The hook in /lib/systemd/system-sleep/ is ran in parallel to other hooks 
in the same directory and restarting networking may break them. In Bionic the 
following packages use similar hooks:
  
-  $ apt-file search /lib/systemd/system-sleep/
+  $ apt-file search /lib/systemd/system-sleep/
  atop: /lib/systemd/system-sleep/atop-pm
  battery-stats: /lib/systemd/system-sleep/battery-stats
  ec2-hibinit-agent: /lib/systemd/system-sleep/hibinit-agent
  hdparm: /lib/systemd/system-sleep/hdparm
  lizardfs-chunkserver: /lib/systemd/system-sleep/lizardfs-chunkserver
  tuxonice-userui: /lib/systemd/system-sleep/tuxonice
  unattended-upgrades: /lib/systemd/system-sleep/unattended-upgrades
  
  Only lizardfs-chunkserver may be affected because it starts lizardfs-
  chunkserver.service on resume but by the description it claims to be
  reliable thus a networking restart is probably tolerated, too. Also it
  has ~50 popcon count in Debian which may not warrant an extensive
  investigation nor adding Breaks: without being sure that it breaks.

** Changed in: ec2-hibinit-agent (Ubuntu Disco)
   Status: New => Fix Committed

** Tags added: verification-needed verification-needed-disco

** Changed in: ec2-hibinit-agent (Ubuntu Cosmic)
   Status: New => Fix Committed

** Tags added: verification-needed-cosmic


[Bug 1830427] Re: Instance may loose network connectivity after resuming the 2nd time

2019-05-24 Thread Launchpad Bug Tracker
This bug was fixed in the package ec2-hibinit-agent - 1.0.0-0ubuntu5

---
ec2-hibinit-agent (1.0.0-0ubuntu5) eoan; urgency=medium

  * debian/gbp.conf: Fix packaging branch name
  * Restart systemd-networkd on resuming from hibernation.
On resume the system sometimes does not restore network connections
and this is a way of reliably triggering the restoration. (LP: #1830427)

 -- Balint Reczey   Fri, 24 May 2019 21:48:20 +0200

** Changed in: ec2-hibinit-agent (Ubuntu)
   Status: New => Fix Released


[Bug 1830427] Re: Instance may loose network connectivity after resuming the 2nd time

2019-05-24 Thread Balint Reczey
** Description changed:

  [Impact]
  
-  * Some hibernated, then started instances don't restore network
- connectivity keeping the instance unreachable.
+  * Some hibernated, then started instances don't restore network connectivity 
keeping the instance unreachable.
+  * The fix is restarting systemd-networkd on resume.
  
  [Test Case]
  
-  0. Start a  instance from an encrypted EBS-backed AMI, with
+  0. Start a  instance from an encrypted EBS-backed AMI, with
  hibernation enabled.
  
-  1. Install ec2-hibinit-agent
+  1. Install ec2-hibinit-agent
  
-  2. Start a long running process on the instance, like top in screen.
+  2. Start a long running process on the instance, like top in screen.
  
-  3. Hibernate, then after it finished start the instance on EC2 console
+  3. Hibernate, then after it finished start the instance on EC2 console
  
-  4. Log in to the instance and observe top still running in screen.
+  4. Log in to the instance and observe top still running in screen.
  
-  5. Hibernate, then after it finished start the instance on EC2 console
+  5. Hibernate, then after it finished start the instance on EC2 console
  
-  6. Log in to the instance and observe top still running in screen. 
-(This second cycle ensures that hibernation works more than once.)
+  6. Log in to the instance and observe top still running in screen.
+    (This second cycle ensures that hibernation works more than once.)
  
  [Regression Potential]
  
-  TODO
+  * Restarting systemd-networkd may cause disturbances in complex networking 
setups, but since the system was hibernated networking was down anyway.
+  * The hook in /lib/systemd/system-sleep/ is ran in parallel to other hooks 
in the same directory and restarting networking may break them. In Bionic the 
following packages use similar hooks:
+ 
+  $ apt-file search /lib/systemd/system-sleep/
+ atop: /lib/systemd/system-sleep/atop-pm
+ battery-stats: /lib/systemd/system-sleep/battery-stats
+ ec2-hibinit-agent: /lib/systemd/system-sleep/hibinit-agent
+ hdparm: /lib/systemd/system-sleep/hdparm
+ lizardfs-chunkserver: /lib/systemd/system-sleep/lizardfs-chunkserver
+ tuxonice-userui: /lib/systemd/system-sleep/tuxonice
+ unattended-upgrades: /lib/systemd/system-sleep/unattended-upgrades
+ 
+ Only lizardfs-chunkserver may be affected because it starts lizardfs-
+ chunkserver.service on resume but by the description it claims to be
+ reliable thus a networking restart is probably tolerated, too. Also it
+ has ~50 popcon count in Debian which may not warrant an extensive
+ investigation nor adding Breaks: without being sure that it breaks.


[Bug 1830427] Re: Instance may loose network connectivity after resuming the 2nd time

2019-05-24 Thread Balint Reczey
** Summary changed:

- Instance may loosw network connectivity after resuming the 2nd time
+ Instance may loose network connectivity after resuming the 2nd time
