[Bug 1948995] Re: Allow reverting to older revisions of a snap
I think this bug should be followed up with a change to retain=3 on Ubuntu servers.

** Changed in: snapd (Ubuntu)
       Status: Expired => New

--
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1948995

Title:
  Allow reverting to older revisions of a snap

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/snapd/+bug/1948995/+subscriptions

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1900438] Re: Bcache bypass writeback on caching device with fragmentation
** Summary changed:

- Bcache bypasse writeback on caching device with fragmentation
+ Bcache bypass writeback on caching device with fragmentation
[Bug 1902960] Re: Upgrade from 245.4-4ubuntu3.3 to 245.4-4ubuntu3.2 appears to break DNS resolution in some cases
I confirm I got it working at first boot on Azure with systemd-245.4-4ubuntu3.4

```
ubuntu@machine-3:~$ sudo networkctl
IDX LINK TYPE     OPERATIONAL SETUP
  1 lo   loopback carrier     unmanaged
  2 eth0 ether    routable    configured

2 links listed.
ubuntu@machine-3:~$ sudo apt update
Hit:1 http://ppa.launchpad.net/telegraf-devs/ppa/ubuntu focal InRelease
Hit:2 http://us.archive.ubuntu.com/ubuntu focal InRelease
Get:3 http://us.archive.ubuntu.com/ubuntu focal-updates InRelease [114 kB]
Get:4 http://us.archive.ubuntu.com/ubuntu focal-backports InRelease [101 kB]
Get:5 http://us.archive.ubuntu.com/ubuntu focal-security InRelease [109 kB]
Get:6 http://us.archive.ubuntu.com/ubuntu focal-proposed InRelease [267 kB]
Fetched 591 kB in 3s (225 kB/s)
Reading package lists... Done
Building dependency tree
Reading state information... Done
All packages are up to date.
ubuntu@machine-3:~$ dpkg -l systemd
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name    Version          Architecture Description
+++-=======-================-============-==========================
ii  systemd 245.4-4ubuntu3.4 amd64        system and service manager
```
[Bug 1904549] Re: MTU is not set on vlan interface
Sadly, the journalctl logs no longer go back that far. It failed on a bare-metal server, not in a cloud.
[Bug 1898786] Re: bcache: Issues with large IO wait in bch_mca_scan() when shrinker is enabled
Ack! I'll check with the Launchpad team then; I think they would indeed prefer to wait for -updates. Thanks again for your work and Dan's.

Cheers,
[Bug 1898786] Re: bcache: Issues with large IO wait in bch_mca_scan() when shrinker is enabled
Hello Matthew, sorry for the lack of response. I'll check with the Launchpad people whether we can justify a reboot of the server soon, and will keep you posted!

Regards,
[Bug 1904549] Re: MTU is not set on vlan interface
Hello,

I certainly don't need it on bond-manlan (MTU 1500 on this one and its VLANs), but I read in several places that setting the MTU on the bond also sets it on the member interfaces. And that indeed worked correctly.

As I said, I think the issue lies more in the udevd/networkd part (it reminds me of the Azure issue seen in https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1902960), because the /run/systemd/network files were correct after the netplan apply.
[Bug 1902960] Re: Upgrade from 245.4-4ubuntu3.3 to 245.4-4ubuntu3.2 appears to break DNS resolution in some cases
Thanks for the explanation. I confirm that the workaround using "systemctl restart systemd-udev-trigger && systemctl restart systemd-networkd" does the trick.

@Dan Watkins: did you do anything specific to reproduce the issue on your local VM? It would be interesting to see the full logs of what happens there. We could possibly hijack the image to add a

| udevadm control --log-priority=debug

and see what happens.
[Bug 1902960] Re: Upgrade from 245.4-4ubuntu3.3 to 245.4-4ubuntu3.2 appears to break DNS resolution in some cases
Here is a pastebin of the situation and of how I tried to resolve it: https://pastebin.ubuntu.com/p/c6cfKqvBmN/

Unfortunately, the interface stays "unmanaged". Looking at the netplan source (https://github.com/CanonicalLtd/netplan/blob/master/netplan/cli/commands/apply.py#L128), it just stops the systemd-networkd service, then starts it again after generating the file.
[Bug 1900438] Re: Bcache bypasse writeback on caching device with fragmentation
I tried to run apport-collect on one of the servers where we can see the issue:

  Waiting to hear from Launchpad about your decision...

  *** Collecting problem information

  The collected information can be sent to the developers to improve the
  application. This might take a few minutes.
  ..dpkg-query: no packages found matching linux
  ..
  Traceback (most recent call last):
    File "/usr/bin/apport-cli", line 370, in <module>
      if not app.run_argv():
    File "/usr/lib/python2.7/dist-packages/apport/ui.py", line 666, in run_argv
      return self.run_update_report()
    File "/usr/lib/python2.7/dist-packages/apport/ui.py", line 564, in run_update_report
      self.report.add_proc_environ()
    File "/usr/lib/python2.7/dist-packages/apport/report.py", line 577, in add_proc_environ
      proc_pid_fd = os.open('/proc/%s' % pid, os.O_RDONLY | os.O_PATH | os.O_DIRECTORY)
  AttributeError: 'module' object has no attribute 'O_PATH'
  Error in sys.excepthook:
  Traceback (most recent call last):
    File "/usr/lib/python2.7/dist-packages/apport_python_hook.py", line 109, in apport_excepthook
      pr.add_proc_info(extraenv=['PYTHONPATH', 'PYTHONHOME'])
    File "/usr/lib/python2.7/dist-packages/apport/report.py", line 507, in add_proc_info
      proc_pid_fd = os.open('/proc/%s' % pid, os.O_RDONLY | os.O_PATH | os.O_DIRECTORY)
  AttributeError: 'module' object has no attribute 'O_PATH'

That said, the bug has already been marked as "Confirmed".
[Bug 1900438] [NEW] Bcache bypasse writeback on caching device with fragmentation
Public bug reported:

Hello,

An upstream bug has been open on this matter for quite some time now [0].

I can reproduce it easily on our production compute nodes, which are trusty hosts with xenial HWE kernels (4.15.0-101-generic). However, because of the heavy backporting there, doing real tracing is a bit hard. I was able to reproduce the behavior on a hwe-bionic kernel as well.

Since most of our critical deployments use bcache, I think this is a rather nasty bug to have.

Reproducing the issue is relatively easy with the script provided in the bug [1]. The script used to capture the stats is this one [2].

[0]: https://bugzilla.kernel.org/show_bug.cgi?id=206767
[1]: https://pastebin.ubuntu.com/p/YnnvvSRhXK/
[2]: https://pastebin.ubuntu.com/p/XfVpzg32sN/

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New
[Bug 1899852] Re: Cannot assign requested address: AH00072: make_sock: could not bind to address
** Description changed:

  Hello,

  Let's first list my configuration items:
  * apache2 2.4.29-1ubuntu4.14
  * release: Ubuntu 18.04.5 LTS

  Upon reboot, the following message is seen in apache2.service logs:

  -- Unit apache2.service has begun starting up.
  Oct 14 12:18:32 SERVER apachectl[3833]: (99)Cannot assign requested address: AH00072: make_sock: could not bind to address [REDACTED IPV6.33]:443
  Oct 14 12:18:32 SERVER apachectl[3833]: no listening sockets available, shutting down
  Oct 14 12:18:32 SERVER apachectl[3833]: AH00015: Unable to open logs
  Oct 14 12:18:32 SERVER apachectl[3833]: Action 'start' failed.
  Oct 14 12:18:32 SERVER apachectl[3833]: The Apache error log may have more information.
  Oct 14 12:18:33 SERVER systemd[1]: apache2.service: Control process exited, code=exited status=1
  Oct 14 12:18:33 SERVER systemd[1]: apache2.service: Failed with result 'exit-code'.
  Oct 14 12:18:33 SERVER systemd[1]: Failed to start The Apache HTTP Server.

  The apache2 configuration is using the ipv4 and ipv6 present on the server:

  /etc/apache2/ports.conf:Listen :443
  /etc/apache2/ports.conf:Listen :443
  /etc/apache2/ports.conf:Listen [REDACTED IPV6::33]:443
  /etc/apache2/ports.conf:Listen [REDACTED IPV6::35]:443

- and the /etc/network/interfaces look as this (no netplan):
+ and the /etc/network/interfaces looks like this (no netplan):

  # Additional IPs that are used to serve https traffic for
  # releases.ubuntu.com so that archive doesn't respond on 443.
  auto bond0:1
  iface bond0:1 inet static
-     address .247/32
-     # Using up/down to avoid LP:1347246.
-     up /sbin/ip addr add REDACTED IPV6::33/128 dev $IFACE preferred_lft 0
-     down /bin/ip addr del REDACTED IPV6::33/128 dev $IFACE preferred_lft 0
+     address .247/32
+     # Using up/down to avoid LP:1347246.
+     up /sbin/ip addr add REDACTED IPV6::33/128 dev $IFACE preferred_lft 0
+     down /bin/ip addr del REDACTED IPV6::33/128 dev $IFACE preferred_lft 0

  # Additional IPs that are used to serve *.clouds.archive.ubuntu.com
  # with HTTPProtocolOptions unsafe, which is needed to work around
  # cloud-init bug LP:1868232 (cRT#125271).
  auto bond0:2
  iface bond0:2 inet static
-     address .245/32
-     # Using up/down to avoid LP:1347246.
-     up /sbin/ip addr add REDACTED IPV6::35/128 dev $IFACE preferred_lft 0
-     down /bin/ip addr del REDACTED IPV6::35/128 dev $IFACE preferred_lft 0
+     address .245/32
+     # Using up/down to avoid LP:1347246.
+     up /sbin/ip addr add REDACTED IPV6::35/128 dev $IFACE preferred_lft 0
+     down /bin/ip addr del REDACTED IPV6::35/128 dev $IFACE preferred_lft 0

- I was surprised that the apache2.service does not contain a
+ I was surprised that the apache2.service does not contain a
  After=network-online.target

  $ systemctl show apache2.service | grep -E '(Wants|Require|After|Before)'
  RemainAfterExit=no
  Requires=system.slice sysinit.target -.mount
  Before=multi-user.target shutdown.target
  After=basic.target sysinit.target systemd-journald.socket system.slice network.target nss-lookup.target systemd-tmpfiles-setup.service remote-fs.target -.mount
  RequiresMountsFor=/var/tmp /tmp

  $ systemctl show network.target | grep "^After"
  After=network-pre.target ifup@bond0.service ifup@ens2f0.service ifup@ens2f1.service systemd-resolved.service ufw.service networking.service systemd-networkd.service

  So I was wondering if the "ifup@bond0" was enough as a dependency here, to be sure to have the ipv6 up and running or if we would need something like "ifup@bond0:2" and "ifup@bond0:1" as part of the list of the services in the network.target "After" list.
[Bug 1899852] [NEW] Cannot assign requested address: AH00072: make_sock: could not bind to address
Public bug reported:

Hello,

Let's first list my configuration items:
* apache2 2.4.29-1ubuntu4.14
* release: Ubuntu 18.04.5 LTS

Upon reboot, the following message is seen in the apache2.service logs:

-- Unit apache2.service has begun starting up.
Oct 14 12:18:32 SERVER apachectl[3833]: (99)Cannot assign requested address: AH00072: make_sock: could not bind to address [REDACTED IPV6.33]:443
Oct 14 12:18:32 SERVER apachectl[3833]: no listening sockets available, shutting down
Oct 14 12:18:32 SERVER apachectl[3833]: AH00015: Unable to open logs
Oct 14 12:18:32 SERVER apachectl[3833]: Action 'start' failed.
Oct 14 12:18:32 SERVER apachectl[3833]: The Apache error log may have more information.
Oct 14 12:18:33 SERVER systemd[1]: apache2.service: Control process exited, code=exited status=1
Oct 14 12:18:33 SERVER systemd[1]: apache2.service: Failed with result 'exit-code'.
Oct 14 12:18:33 SERVER systemd[1]: Failed to start The Apache HTTP Server.

The apache2 configuration uses the IPv4 and IPv6 addresses present on the server:

/etc/apache2/ports.conf:Listen :443
/etc/apache2/ports.conf:Listen :443
/etc/apache2/ports.conf:Listen [REDACTED IPV6::33]:443
/etc/apache2/ports.conf:Listen [REDACTED IPV6::35]:443

and /etc/network/interfaces looks like this (no netplan):

# Additional IPs that are used to serve https traffic for
# releases.ubuntu.com so that archive doesn't respond on 443.
auto bond0:1
iface bond0:1 inet static
    address .247/32
    # Using up/down to avoid LP:1347246.
    up /sbin/ip addr add REDACTED IPV6::33/128 dev $IFACE preferred_lft 0
    down /bin/ip addr del REDACTED IPV6::33/128 dev $IFACE preferred_lft 0

# Additional IPs that are used to serve *.clouds.archive.ubuntu.com
# with HTTPProtocolOptions unsafe, which is needed to work around
# cloud-init bug LP:1868232 (cRT#125271).
auto bond0:2
iface bond0:2 inet static
    address .245/32
    # Using up/down to avoid LP:1347246.
    up /sbin/ip addr add REDACTED IPV6::35/128 dev $IFACE preferred_lft 0
    down /bin/ip addr del REDACTED IPV6::35/128 dev $IFACE preferred_lft 0

I was surprised that apache2.service does not contain an After=network-online.target:

$ systemctl show apache2.service | grep -E '(Wants|Require|After|Before)'
RemainAfterExit=no
Requires=system.slice sysinit.target -.mount
Before=multi-user.target shutdown.target
After=basic.target sysinit.target systemd-journald.socket system.slice network.target nss-lookup.target systemd-tmpfiles-setup.service remote-fs.target -.mount
RequiresMountsFor=/var/tmp /tmp

$ systemctl show network.target | grep "^After"
After=network-pre.target ifup@bond0.service ifup@ens2f0.service ifup@ens2f1.service systemd-resolved.service ufw.service networking.service systemd-networkd.service

So I was wondering whether "ifup@bond0" is enough as a dependency here to be sure that the IPv6 addresses are up and running, or whether we would need something like "ifup@bond0:2" and "ifup@bond0:1" in the network.target "After" list.

** Affects: apache2 (Ubuntu)
     Importance: Undecided
         Status: New
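The ordering check done by hand in the report above can be scripted. The following is only a minimal sketch: `has_after_dep` is a hypothetical helper name, not part of systemd, and it merely parses the `After=` line of `systemctl show` output read from standard input.

```shell
# Hypothetical helper (not a systemd tool): succeed if the unit named in $1
# appears in the After= list of `systemctl show UNIT` output read from stdin.
has_after_dep() {
    grep '^After=' |            # keep only the After= line (not RemainAfterExit=)
        sed 's/^After=//' |     # drop the key, leaving the space-separated units
        tr ' ' '\n' |           # one unit name per line
        grep -qxF -- "$1"       # exact, fixed-string match on the whole name
}
```

On the affected host it could be used as `systemctl show -p After apache2.service | has_after_dep network-online.target`. Note that `After=` only orders units; for network-online.target to be started at all, a unit normally also needs `Wants=network-online.target`.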
[Bug 1898786] Re: Issue with bcache bch_mca_scan causing huge IO wait
The testing of the new kernel looks very promising. We don't observe any of the latency/IO wait we had before, even with the btree shrinker enabled. We'll probably give it a week, but having a backport of those patches would be fantastic for sure.
[Bug 1898786] Re: Issue with bcache bch_mca_scan causing huge IO wait
Thanks Matthew, we'll try this kernel tomorrow. I tested it on an OpenStack instance; the only downside is that it uninstalls the official 4.15.0-118-generic one.

Regards,
[Bug 1898786] Re: Issue with bcache bch_mca_scan causing huge IO wait
Hi Mauricio, The tests in [5][6] and [7] have been done with a 44GB memory VM. This VM usually has 64GB of allocated memory. The goal was to verify that the load of the whole btree in RAM was not prevented (like in mca-reap -> down_write_trylock). All this kernel stuff is rather new to me, so I may be following a wrong lead. ** Description changed: Hello, In short, we faced an issue with a huge IO wait on a bionic Ubuntu 4.15.0-118.119-generic kernel. This is the full list of process and the kernel function they were stuck in [0]. The main issue can probably be summarized by this perf reports * first identify that the cpu are stuck in idle because of something[1] * second, see what kernel function seems to stuck the process kswapd0 and kswapd1 [2]. We could see that this seems to be the mutex_lock in the bch_mca_scan function [3]. After running the command: | sudo bash -c "echo 1 > /sys/fs/bcache/f1a1e8cb-3e6b-40ea-852e- 583c48d0c2b8/internal/btree_shrinker_disabled" The server started to respond normally and the IO wait dropped significantly Here is a trace of the bcache event related lock in the kernel obtained with some bpfcc-tools [4]. klockstat-bpfcc -c bch_ -i 5 -s 3 The trace has been run in parallel with the following command line echo "Shrinker disabled: $(date)"; sleep 60; echo "Enabling shrinker: $(date)"; echo 0 | sudo tee /sys/block/bcache0/bcache/cache/internal/btree_shrinker_disabled ; sleep 60; echo "Disabling shrinker: $(date)"; echo 1 | sudo tee /sys/block/bcache0/bcache/cache/internal/btree_shrinker_disabled; sleep 60; echo "End of test: $(date)" Trying to dig more, we reduced by 20 GB the memory allocated to a VM on the server. * The bcache btree size fluctuation seems "normal" [5] - * I noticed that, when the shrinker was enabled,a lot of time was spent in the locks during "bch_btree_insert_node". + * I noticed that, when the shrinker was enabled,a lot of time was spent in the locks during "bch_btree_insert_node". 
[6] I decided to check if one of the function called during bch_btree_insert_node was taking longer than usual when the shrinker was enabled. I finally found the "funclatency" tool and tried do have the same approach I had with the klockstat [7]. However, that was inconclusive. I could see there that the bch_btree_insert_node was barely called during the whole duration of the test. Which made me think it's amount of time spent in lock is more due to another process acquiring the lock. I'm going to try to have another go with some perf/klockstat/funclatency focused on bch_mca_scan and the function called there. Also, here are some memory related metrics [8]. - Now another perf stacktrace with the command used [9]. Strangely this one doesn't show any bch_mca_scan at all. I enabled the shrinker again, hoping to get more traces, but apparently the timeframe was not right. Not enough load to trigger the cliff resulting in a 1sec IOwait plateau. Which is interesting because that means that without the maximal workload, the kernel can cope with the shrinker. 
- [0]: https://pastebin.ubuntu.com/p/QYXPdsMCWC/ [1]: https://pastebin.ubuntu.com/p/BFsvF7H54r/ [2]: https://pastebin.ubuntu.com/p/35qdsHYHf5/ [3]: https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/bionic/tree/drivers/md/bcache/btree.c?h=Ubuntu-4.15.0-118.119#n674 [4]: https://pastebin.ubuntu.com/p/qhyqP35fCw/ [5]: https://pastebin.ubuntu.com/p/McjxxqTVjn/ [6]: https://pastebin.ubuntu.com/p/KmrnW4Ng8F/ [7]: https://pastebin.ubuntu.com/p/fSX4c7tTFV/ [8]: https://pastebin.ubuntu.com/p/CZgXkgKhmJ/ [9]: https://pastebin.ubuntu.com/p/DzKCP8NGdf/ $ cat /proc/version_signature Ubuntu 4.15.0-118.119-generic 4.15.18 ProblemType: Bug DistroRelease: Ubuntu 18.04 Package: linux-image-4.15.0-118-generic 4.15.0-118.119 ProcVersionSignature: User Name 4.15.0-118.119-generic 4.15.18 Uname: Linux 4.15.0-118-generic x86_64 AlsaDevices: total 0 crw-rw 1 root audio 116, 1 Sep 29 10:04 seq crw-rw 1 root audio 116, 33 Sep 29 10:04 timer AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay' ApportVersion: 2.20.9-0ubuntu7.16 Architecture: amd64 ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord' AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1: Date: Tue Oct 6 20:36:18 2020 IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig' MachineType: HP ProLiant DL380 G7 PciMultimedia: ProcFB: 0 radeondrmfb ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-118-generic root=UUID=c6ad1629-a506-4043-a339-6d57f0708d12 ro console=ttyS1,115200 nosplash RelatedPackageVersions: linux-restricted-modules-4.15.0-118-generic N/A
[Bug 1898786] Re: Issue with bcache bch_mca_scan causing huge IO wait
** Description changed: Hello, In short, we faced an issue with a huge IO wait on a bionic Ubuntu 4.15.0-118.119-generic kernel. This is the full list of process and the kernel function they were stuck in [0]. The main issue can probably be summarized by this perf reports * first identify that the cpu are stuck in idle because of something[1] * second, see what kernel function seems to stuck the process kswapd0 and kswapd1 [2]. We could see that this seems to be the mutex_lock in the bch_mca_scan function [3]. After running the command: | sudo bash -c "echo 1 > /sys/fs/bcache/f1a1e8cb-3e6b-40ea-852e- 583c48d0c2b8/internal/btree_shrinker_disabled" The server started to respond normally and the IO wait dropped significantly Here is a trace of the bcache event related lock in the kernel obtained with some bpfcc-tools [4]. klockstat-bpfcc -c bch_ -i 5 -s 3 The trace has been run in parallel with the following command line echo "Shrinker disabled: $(date)"; sleep 60; echo "Enabling shrinker: $(date)"; echo 0 | sudo tee /sys/block/bcache0/bcache/cache/internal/btree_shrinker_disabled ; sleep 60; echo "Disabling shrinker: $(date)"; echo 1 | sudo tee /sys/block/bcache0/bcache/cache/internal/btree_shrinker_disabled; sleep 60; echo "End of test: $(date)" Trying to dig more, we reduced by 20 GB the memory allocated to a VM on the server. * The bcache btree size fluctuation seems "normal" [5] * I noticed that, when the shrinker was enabled,a lot of time was spent in the locks during "bch_btree_insert_node". I decided to check if one of the function called during bch_btree_insert_node was taking longer than usual when the shrinker was enabled. I finally found the "funclatency" tool and tried do have the same approach I had with the klockstat [7]. However, that was inconclusive. I could see there that the bch_btree_insert_node was barely called during the whole duration of the test. 
Which made me think it's amount of time spent in lock is more due to another process acquiring the lock. I'm going to try to have another go with some perf/klockstat/funclatency focused on bch_mca_scan and the function called there. Also, here are some memory related metrics [8]. + Now another perf stacktrace with the command used [9]. + Strangely this one doesn't show any bch_mca_scan at all. + + I enabled the shrinker again, hoping to get more traces, but apparently the timeframe was not right. Not enough load to trigger the cliff resulting in a 1sec IOwait plateau. + Which is interesting because that means that without the maximal workload, the kernel can cope with the shrinker. + + [0]: https://pastebin.ubuntu.com/p/QYXPdsMCWC/ [1]: https://pastebin.ubuntu.com/p/BFsvF7H54r/ [2]: https://pastebin.ubuntu.com/p/35qdsHYHf5/ [3]: https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/bionic/tree/drivers/md/bcache/btree.c?h=Ubuntu-4.15.0-118.119#n674 [4]: https://pastebin.ubuntu.com/p/qhyqP35fCw/ [5]: https://pastebin.ubuntu.com/p/McjxxqTVjn/ [6]: https://pastebin.ubuntu.com/p/KmrnW4Ng8F/ [7]: https://pastebin.ubuntu.com/p/fSX4c7tTFV/ [8]: https://pastebin.ubuntu.com/p/CZgXkgKhmJ/ + [9]: https://pastebin.ubuntu.com/p/DzKCP8NGdf/ $ cat /proc/version_signature Ubuntu 4.15.0-118.119-generic 4.15.18 ProblemType: Bug DistroRelease: Ubuntu 18.04 Package: linux-image-4.15.0-118-generic 4.15.0-118.119 ProcVersionSignature: User Name 4.15.0-118.119-generic 4.15.18 Uname: Linux 4.15.0-118-generic x86_64 AlsaDevices: total 0 crw-rw 1 root audio 116, 1 Sep 29 10:04 seq crw-rw 1 root audio 116, 33 Sep 29 10:04 timer AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay' ApportVersion: 2.20.9-0ubuntu7.16 Architecture: amd64 ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord' AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1: Date: Tue Oct 6 20:36:18 2020 IwConfig: 
Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig' MachineType: HP ProLiant DL380 G7 PciMultimedia: ProcFB: 0 radeondrmfb ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-118-generic root=UUID=c6ad1629-a506-4043-a339-6d57f0708d12 ro console=ttyS1,115200 nosplash RelatedPackageVersions: linux-restricted-modules-4.15.0-118-generic N/A linux-backports-modules-4.15.0-118-generic N/A linux-firmware 1.173.18 RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill' SourcePackage: linux UpgradeStatus: Upgraded to bionic on 2019-09-27 (375 days ago) dmi.bios.date: 05/05/2011 dmi.bios.vendor: HP dmi.bios.version: P67 dmi.chassis.type: 23 dmi.chassis.vendor: HP dmi.modalias: dmi:bvnHP:bvrP67:bd05/05/2011:svnHP:pnProLiantDL380G7:pvr:cvnHP:ct23:cvr:
[Bug 1898786] Re: Issue with bcache bch_mca_scan causing huge IO wait
** Description changed:

  Hello,

  In short, we faced an issue with a huge IO wait on a bionic Ubuntu
  4.15.0-118.119-generic kernel.

  Here is the full list of processes and the kernel functions they were
  stuck in [0].

  The main issue can probably be summarized by these perf reports:

  * first, identify that the CPUs are stuck in idle because of something [1]
  * second, see which kernel function seems to block the kswapd0 and
    kswapd1 processes [2].

  We could see that this seems to be the mutex_lock in the bch_mca_scan
  function [3].

  After running the command:
  | sudo bash -c "echo 1 > /sys/fs/bcache/f1a1e8cb-3e6b-40ea-852e-583c48d0c2b8/internal/btree_shrinker_disabled"

  The server started to respond normally and the IO wait dropped
  significantly

  Here is a trace of the bcache-related locks in the kernel, obtained
  with bpfcc-tools [4].

  klockstat-bpfcc -c bch_ -i 5 -s 3

  The trace was run in parallel with the following command line:

  echo "Shrinker disabled: $(date)"; sleep 60;
  echo "Enabling shrinker: $(date)";
  echo 0 | sudo tee /sys/block/bcache0/bcache/cache/internal/btree_shrinker_disabled; sleep 60;
  echo "Disabling shrinker: $(date)";
  echo 1 | sudo tee /sys/block/bcache0/bcache/cache/internal/btree_shrinker_disabled; sleep 60;
  echo "End of test: $(date)"

+ Digging further, we reduced the memory allocated to a VM on the server by 20 GB.
+ * The bcache btree size fluctuation seems "normal" [5]
+ * I noticed that, when the shrinker was enabled, a lot of time was spent in the locks during "bch_btree_insert_node".
+
+ I decided to check whether one of the functions called during
+ bch_btree_insert_node was taking longer than usual when the shrinker was
+ enabled.
+
+ I finally found the "funclatency" tool and tried to take the same approach I had taken with klockstat [7]. However, that was inconclusive. I could see that bch_btree_insert_node was barely called during the whole duration of the test.
+ This made me think that the time it spends in the lock is more likely due to another process holding the lock.
+
+ I'm going to have another go with some perf/klockstat/funclatency
+ focused on bch_mca_scan and the functions called there.
+
+ Also, here are some memory-related metrics [8].
+
  [0]: https://pastebin.ubuntu.com/p/QYXPdsMCWC/
  [1]: https://pastebin.ubuntu.com/p/BFsvF7H54r/
  [2]: https://pastebin.ubuntu.com/p/35qdsHYHf5/
  [3]: https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/bionic/tree/drivers/md/bcache/btree.c?h=Ubuntu-4.15.0-118.119#n674
  [4]: https://pastebin.ubuntu.com/p/qhyqP35fCw/
+ [5]: https://pastebin.ubuntu.com/p/McjxxqTVjn/
+ [6]: https://pastebin.ubuntu.com/p/KmrnW4Ng8F/
+ [7]: https://pastebin.ubuntu.com/p/fSX4c7tTFV/
+ [8]: https://pastebin.ubuntu.com/p/CZgXkgKhmJ/

  $ cat /proc/version_signature
  Ubuntu 4.15.0-118.119-generic 4.15.18

  ProblemType: Bug
  DistroRelease: Ubuntu 18.04
  Package: linux-image-4.15.0-118-generic 4.15.0-118.119
  ProcVersionSignature: Ubuntu 4.15.0-118.119-generic 4.15.18
  Uname: Linux 4.15.0-118-generic x86_64
  AlsaDevices:
   total 0
   crw-rw 1 root audio 116, 1 Sep 29 10:04 seq
   crw-rw 1 root audio 116, 33 Sep 29 10:04 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
  ApportVersion: 2.20.9-0ubuntu7.16
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  Date: Tue Oct 6 20:36:18 2020
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
  MachineType: HP ProLiant DL380 G7
  PciMultimedia:

  ProcFB: 0 radeondrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-118-generic root=UUID=c6ad1629-a506-4043-a339-6d57f0708d12 ro console=ttyS1,115200 nosplash
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-118-generic N/A
   linux-backports-modules-4.15.0-118-generic N/A
   linux-firmware 1.173.18
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
  SourcePackage: linux
  UpgradeStatus: Upgraded to bionic on 2019-09-27 (375 days ago)
  dmi.bios.date: 05/05/2011
  dmi.bios.vendor: HP
  dmi.bios.version: P67
  dmi.chassis.type: 23
  dmi.chassis.vendor: HP
  dmi.modalias: dmi:bvnHP:bvrP67:bd05/05/2011:svnHP:pnProLiantDL380G7:pvr:cvnHP:ct23:cvr:
  dmi.product.family: ProLiant
  dmi.product.name: ProLiant DL380 G7
  dmi.sys.vendor: HP

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1898786

Title:
  Issue with bcache bch_mca_scan causing huge IO wait

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1898786/+subscriptions

-- 
ubuntu-bugs mailing list
[Bug 1898786] Re: Issue with bcache bch_mca_scan causing huge IO wait
Hello Mauricio,

That was also one of the conclusions we reached yesterday. I followed a wrong lead yesterday, after getting another, more detailed output with klockstat. I'll update the description of the bug at the end, adding the bits I found.

The system is not really under memory pressure, but to confirm this we reduced the memory allocated to a VM on the server from 64GB to 44GB and looked at the shrinker behavior. It didn't change a thing, and the IO wait started to rise again.

As for the IO load, I'm not sure. I would probably need to record it with blktrace (I have never done that, but I read that you can then pass the result to fio to reproduce the load).
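The blktrace-to-fio idea mentioned in the comment above could look roughly like the sketch below. It is only an illustration, not something that was run for this bug: the function names, the device names and the trace file names are hypothetical, and it assumes the blktrace, blkparse and fio tools are installed and that the functions are run as root.

```shell
#!/bin/sh
# Sketch only: record block IO on a device, then replay it with fio.
# Function names and arguments are hypothetical placeholders.

# capture_io_trace DEV SECONDS BASENAME
# Records block-layer events on DEV for SECONDS seconds, then merges the
# per-CPU trace files into a single binary file, BASENAME.bin.
capture_io_trace() {
    cit_dev="$1"
    cit_seconds="$2"
    cit_base="$3"
    blktrace -d "$cit_dev" -w "$cit_seconds" -o "$cit_base" || return 1
    # -o /dev/null drops the text report, -d writes the binary dump
    blkparse -i "$cit_base" -o /dev/null -d "$cit_base.bin"
}

# replay_io_trace BINFILE TARGET_DEV
# Replays the recorded IO pattern against TARGET_DEV.
# Caution: a trace that contains writes will write to TARGET_DEV,
# so this should point at a scratch device.
replay_io_trace() {
    rit_bin="$1"
    rit_target="$2"
    fio --name=replay --read_iolog="$rit_bin" --replay_redirect="$rit_target"
}
```

A possible use would be `capture_io_trace /dev/sdX 60 trace` on the affected server, followed by `replay_io_trace trace.bin /dev/sdY` on a test machine, where `/dev/sdX` and `/dev/sdY` stand for the real and the scratch devices.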
[Bug 1898786] Re: Issue with bcache bch_mca_scan causing huge IO wait
** Description changed:

  Hello,

  In short, we faced an issue with a huge IO wait on a bionic Ubuntu
  4.15.0-118.119-generic kernel.

  Here is the full list of processes and the kernel functions they were
  stuck in [0].

  The main issue can probably be summarized by these perf reports:

  * first, identify that the CPUs are stuck in idle because of something [1]
  * second, see which kernel function seems to block the kswapd0 and
    kswapd1 processes [2].

  We could see that this seems to be the mutex_lock in the bch_mca_scan
  function [3].

  After running the command:
  | sudo bash -c "echo 1 > /sys/fs/bcache/f1a1e8cb-3e6b-40ea-852e-583c48d0c2b8/internal/btree_shrinker_disabled"

  The server started to respond normally and the IO wait dropped
  significantly

- [0]: https://pastebin.canonical.com/p/wYYKwHdRXk/
- [1]: https://pastebin.canonical.com/p/n2Tw57QyBC/
- [2]: https://pastebin.canonical.com/p/3QqFTfdHhX/
+ [0]: https://pastebin.ubuntu.com/p/QYXPdsMCWC/
+ [1]: https://pastebin.ubuntu.com/p/BFsvF7H54r/
+ [2]: https://pastebin.ubuntu.com/p/35qdsHYHf5/
  [3]: https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/bionic/tree/drivers/md/bcache/btree.c?h=Ubuntu-4.15.0-118.119#n674

  $ cat /proc/version_signature
  Ubuntu 4.15.0-118.119-generic 4.15.18

  ProblemType: Bug
  DistroRelease: Ubuntu 18.04
  Package: linux-image-4.15.0-118-generic 4.15.0-118.119
  ProcVersionSignature: Ubuntu 4.15.0-118.119-generic 4.15.18
  Uname: Linux 4.15.0-118-generic x86_64
  AlsaDevices:
   total 0
   crw-rw 1 root audio 116, 1 Sep 29 10:04 seq
   crw-rw 1 root audio 116, 33 Sep 29 10:04 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
  ApportVersion: 2.20.9-0ubuntu7.16
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  Date: Tue Oct 6 20:36:18 2020
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
  MachineType: HP ProLiant DL380 G7
  PciMultimedia:

  ProcFB: 0 radeondrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-118-generic root=UUID=c6ad1629-a506-4043-a339-6d57f0708d12 ro console=ttyS1,115200 nosplash
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-118-generic N/A
   linux-backports-modules-4.15.0-118-generic N/A
   linux-firmware 1.173.18
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
  SourcePackage: linux
  UpgradeStatus: Upgraded to bionic on 2019-09-27 (375 days ago)
  dmi.bios.date: 05/05/2011
  dmi.bios.vendor: HP
  dmi.bios.version: P67
  dmi.chassis.type: 23
  dmi.chassis.vendor: HP
  dmi.modalias: dmi:bvnHP:bvrP67:bd05/05/2011:svnHP:pnProLiantDL380G7:pvr:cvnHP:ct23:cvr:
  dmi.product.family: ProLiant
  dmi.product.name: ProLiant DL380 G7
  dmi.sys.vendor: HP

** Description changed:

  Hello,

  In short, we faced an issue with a huge IO wait on a bionic Ubuntu
  4.15.0-118.119-generic kernel.

  Here is the full list of processes and the kernel functions they were
  stuck in [0].

  The main issue can probably be summarized by these perf reports:

  * first, identify that the CPUs are stuck in idle because of something [1]
  * second, see which kernel function seems to block the kswapd0 and
    kswapd1 processes [2].

  We could see that this seems to be the mutex_lock in the bch_mca_scan
  function [3].

  After running the command:
  | sudo bash -c "echo 1 > /sys/fs/bcache/f1a1e8cb-3e6b-40ea-852e-583c48d0c2b8/internal/btree_shrinker_disabled"

- The server started to respond normally and the IO wait dropped significantly
+ The server started to respond normally and the IO wait dropped
+ significantly
+
+ Here is a trace of the bcache-related locks in the kernel, obtained
+ with bpfcc-tools [4].
+
+ klockstat-bpfcc -c bch_ -i 5 -s 3
+
+ The trace was run in parallel with the following command line:
+
+ echo "Shrinker disabled: $(date)"; sleep 60; echo "Enabling shrinker:
+ $(date)"; echo 0 | sudo tee
+ /sys/block/bcache0/bcache/cache/internal/btree_shrinker_disabled ; sleep
+ 60; echo "Disabling shrinker: $(date)"; echo 1 | sudo tee
+ /sys/block/bcache0/bcache/cache/internal/btree_shrinker_disabled; sleep
+ 60; echo "End of test: $(date)"
+
+
  [0]: https://pastebin.ubuntu.com/p/QYXPdsMCWC/
  [1]: https://pastebin.ubuntu.com/p/BFsvF7H54r/
  [2]: https://pastebin.ubuntu.com/p/35qdsHYHf5/
  [3]: https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/bionic/tree/drivers/md/bcache/btree.c?h=Ubuntu-4.15.0-118.119#n674
+ [4]: https://pastebin.ubuntu.com/p/qhyqP35fCw/

  $ cat /proc/version_signature
  Ubuntu 4.15.0-118.119-generic 4.15.18

  ProblemType: Bug
  DistroRelease: Ubuntu 18.04
  Package: linux-image-4.15.0-118-generic 4.15.0-118.119
  ProcVersionSignature: Ubuntu
[Bug 1898786] Re: Issue with bcache bch_mca_scan causing huge IO wait
Here is a trace of the bcache-related locks in the kernel, obtained with bpfcc-tools.

klockstat-bpfcc -c bch_ -i 5 -s 3

The trace was run in parallel with the following command line:

echo "Shrinker disabled: $(date)"; sleep 60;
echo "Enabling shrinker: $(date)";
echo 0 | sudo tee /sys/block/bcache0/bcache/cache/internal/btree_shrinker_disabled; sleep 60;
echo "Disabling shrinker: $(date)";
echo 1 | sudo tee /sys/block/bcache0/bcache/cache/internal/btree_shrinker_disabled; sleep 60;
echo "End of test: $(date)"

The logs are here: https://pastebin.canonical.com/p/jVKdbV3RrK/
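For repeated runs, the shrinker toggle test from the comment above can be wrapped in a small helper. This is a minimal sketch, not the command that produced the logs: the function name `toggle_shrinker_test` and its two parameters are hypothetical, and it writes to the sysfs knob directly instead of going through `sudo tee`, so it assumes it is run as root.

```shell
#!/bin/sh
# Sketch only: run the shrinker on/off test with timestamps.
# toggle_shrinker_test KNOB [INTERVAL]
#   KNOB     path to the btree_shrinker_disabled sysfs file
#   INTERVAL seconds to wait in each phase (default: 60)
toggle_shrinker_test() {
    tst_knob="$1"
    tst_interval="${2:-60}"

    # Phase 1: the shrinker is assumed to be disabled already
    echo "Shrinker disabled: $(date)"
    sleep "$tst_interval"

    # Phase 2: writing 0 enables the shrinker
    echo "Enabling shrinker: $(date)"
    echo 0 > "$tst_knob" || return 1
    sleep "$tst_interval"

    # Phase 3: writing 1 disables it again
    echo "Disabling shrinker: $(date)"
    echo 1 > "$tst_knob" || return 1
    sleep "$tst_interval"

    echo "End of test: $(date)"
}
```

With klockstat-bpfcc running in another terminal, it could be called as `toggle_shrinker_test /sys/block/bcache0/bcache/cache/internal/btree_shrinker_disabled 60`. Taking the path as an argument also makes it possible to try the helper against an ordinary file first.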
[Bug 1898786] Re: Issue with bcache bch_mca_scan causing huge IO wait
Actually, no, we cannot be sure. The server didn't have any metrics until a few days ago, and the issue was already there.

It's worth noting that few servers have this bcache configuration, because the cache mode is configured as writethrough and the load is pretty significant.

So there is no last known "good" version.

Actually, we have various IO wait issues on another platform (but running a xenial-hwe kernel), and we already suspected bcache there. I mention it to show that bcache behavior seems to have been related to disk performance problems for quite some time.
[Bug 1898786] [NEW] Issue with bcache bch_mca_scan causing huge IO wait
Public bug reported:

Hello,

In short, we faced an issue with a huge IO wait on a bionic Ubuntu
4.15.0-118.119-generic kernel.

Here is the full list of processes and the kernel functions they were
stuck in [0].

The main issue can probably be summarized by these perf reports:

* first, identify that the CPUs are stuck in idle because of something [1]
* second, see which kernel function seems to block the kswapd0 and
  kswapd1 processes [2].

We could see that this seems to be the mutex_lock in the bch_mca_scan
function [3].

After running the command:
| sudo bash -c "echo 1 > /sys/fs/bcache/f1a1e8cb-3e6b-40ea-852e-583c48d0c2b8/internal/btree_shrinker_disabled"

The server started to respond normally and the IO wait dropped
significantly

[0]: https://pastebin.canonical.com/p/wYYKwHdRXk/
[1]: https://pastebin.canonical.com/p/n2Tw57QyBC/
[2]: https://pastebin.canonical.com/p/3QqFTfdHhX/
[3]: https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/bionic/tree/drivers/md/bcache/btree.c?h=Ubuntu-4.15.0-118.119#n674

$ cat /proc/version_signature
Ubuntu 4.15.0-118.119-generic 4.15.18

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-4.15.0-118-generic 4.15.0-118.119
ProcVersionSignature: Ubuntu 4.15.0-118.119-generic 4.15.18
Uname: Linux 4.15.0-118-generic x86_64
AlsaDevices:
 total 0
 crw-rw 1 root audio 116, 1 Sep 29 10:04 seq
 crw-rw 1 root audio 116, 33 Sep 29 10:04 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.9-0ubuntu7.16
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
Date: Tue Oct 6 20:36:18 2020
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
MachineType: HP ProLiant DL380 G7
PciMultimedia:

ProcFB: 0 radeondrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-118-generic root=UUID=c6ad1629-a506-4043-a339-6d57f0708d12 ro console=ttyS1,115200 nosplash
RelatedPackageVersions:
 linux-restricted-modules-4.15.0-118-generic N/A
 linux-backports-modules-4.15.0-118-generic N/A
 linux-firmware 1.173.18
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
SourcePackage: linux
UpgradeStatus: Upgraded to bionic on 2019-09-27 (375 days ago)
dmi.bios.date: 05/05/2011
dmi.bios.vendor: HP
dmi.bios.version: P67
dmi.chassis.type: 23
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:bvrP67:bd05/05/2011:svnHP:pnProLiantDL380G7:pvr:cvnHP:ct23:cvr:
dmi.product.family: ProLiant
dmi.product.name: ProLiant DL380 G7
dmi.sys.vendor: HP

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

** Tags: amd64 apport-bug bionic
[Bug 1873352] UdevLog.txt
apport information

** Attachment added: "UdevLog.txt"
   https://bugs.launchpad.net/bugs/1873352/+attachment/5355748/+files/UdevLog.txt

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1873352

Title:
  unregister_netdevice: waiting for tape3e33cb9-d6 to become free. Usage
  count = 2

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1873352/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1873352] ProcCpuinfo.txt
apport information

** Attachment added: "ProcCpuinfo.txt"
   https://bugs.launchpad.net/bugs/1873352/+attachment/5355742/+files/ProcCpuinfo.txt
[Bug 1873352] ProcInterrupts.txt
apport information

** Attachment added: "ProcInterrupts.txt"
   https://bugs.launchpad.net/bugs/1873352/+attachment/5355745/+files/ProcInterrupts.txt
[Bug 1873352] WifiSyslog.txt
apport information

** Attachment added: "WifiSyslog.txt"
   https://bugs.launchpad.net/bugs/1873352/+attachment/5355749/+files/WifiSyslog.txt
[Bug 1873352] UdevDb.txt
apport information

** Attachment added: "UdevDb.txt"
   https://bugs.launchpad.net/bugs/1873352/+attachment/5355747/+files/UdevDb.txt
[Bug 1873352] ProcCpuinfoMinimal.txt
apport information

** Attachment added: "ProcCpuinfoMinimal.txt"
   https://bugs.launchpad.net/bugs/1873352/+attachment/5355743/+files/ProcCpuinfoMinimal.txt
[Bug 1873352] ProcModules.txt
apport information

** Attachment added: "ProcModules.txt"
   https://bugs.launchpad.net/bugs/1873352/+attachment/5355746/+files/ProcModules.txt
[Bug 1873352] NonfreeKernelModules.txt
apport information

** Attachment added: "NonfreeKernelModules.txt"
   https://bugs.launchpad.net/bugs/1873352/+attachment/5355741/+files/NonfreeKernelModules.txt
[Bug 1873352] CurrentDmesg.txt
apport information

** Attachment added: "CurrentDmesg.txt"
   https://bugs.launchpad.net/bugs/1873352/+attachment/5355738/+files/CurrentDmesg.txt
[Bug 1873352] ProcEnviron.txt
apport information

** Attachment added: "ProcEnviron.txt"
   https://bugs.launchpad.net/bugs/1873352/+attachment/5355744/+files/ProcEnviron.txt
[Bug 1873352] Lsusb.txt
apport information

** Attachment added: "Lsusb.txt"
   https://bugs.launchpad.net/bugs/1873352/+attachment/5355740/+files/Lsusb.txt
[Bug 1873352] BootDmesg.txt
apport information

** Attachment added: "BootDmesg.txt"
   https://bugs.launchpad.net/bugs/1873352/+attachment/5355736/+files/BootDmesg.txt
[Bug 1873352] CRDA.txt
apport information

** Attachment added: "CRDA.txt"
   https://bugs.launchpad.net/bugs/1873352/+attachment/5355737/+files/CRDA.txt
[Bug 1873352] [NEW] unregister_netdevice: waiting for tape3e33cb9-d6 to become free. Usage count = 2
Public bug reported:

Hello,

We recently encountered an issue when trying to stop nova instances.

| Apr 16 09:35:10 ginzel kernel: [6925958.071665] unregister_netdevice: waiting for tape3e33cb9-d6 to become free. Usage count = 2

When I try to stop the qemu process manually, it fails with an error:

"""
ballot@ginzel:~$ sudo virsh destroy instance-000718af
setlocale: No such file or directory
error: Failed to destroy domain instance-000718af
error: Failed to terminate process 20953 with SIGKILL: Device or resource busy
"""

The qemu process is in D state because of this:

"""
libvirt+ 20953 12.9 0.5 9001400 2925908 ? D Apr15 84:25 /usr/bin/qemu-system-x86_64 -name instance-000718af -S -machine pc-i440fx-trusty,accel=kvm,usb=off -cpu EPYC-IBPB-2.0,+perfctr_nb,+perfctr_core,+topoext,+tce,+wdt,+skinit,+extapic,+cmp_legacy,+osxsave,+ht -m 4096 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 020ecf87-4c96-47af-8d3f-d86d51b445f4 -smbios type=1,manufacturer=OpenStack Foundation,product=OpenStack Nova,version=2014.1.5,serial=993710e9-0a44-428e-88ec-9210c6dfed55,uuid=020ecf87-4c96-47af-8d3f-d86d51b445f4 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/instance-000718af.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/srv/nova/instances/020ecf87-4c96-47af-8d3f-d86d51b445f4/disk,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=29 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:eb:99:5e,bus=pci.0,addr=0x3 -chardev file,id=charserial0,path=/srv/nova/instances/020ecf87-4c96-47af-8d3f-d86d51b445f4/console.log -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0 -vnc 0.0.0.0:4 -k en-us -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
ballot@ginzel:~$ sudo cat /proc/20953/sta
stack   stat    statm   status
"""

and here is the stack of the process:

"""
ballot@ginzel:~$ sudo cat /proc/20953/stack
[<0>] msleep+0x2d/0x40
[<0>] netdev_run_todo+0x11c/0x320
[<0>] rtnl_unlock+0xe/0x10
[<0>] tun_chr_close+0x28/0x30
[<0>] __fput+0xea/0x220
[<0>] fput+0xe/0x10
[<0>] task_work_run+0x9d/0xc0
[<0>] exit_to_usermode_loop+0xc0/0xd0
[<0>] do_syscall_64+0x115/0x130
[<0>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[<0>] 0x
"""

I will attach the relevant files.
---
AlsaDevices:
 total 0
 crw-rw 1 root audio 116, 1 Jan 27 05:43 seq
 crw-rw 1 root audio 116, 33 Jan 27 05:43 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.14.1-0ubuntu3.29
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 14.04
IwConfig: Error: [Errno 2] No such file or directory
MachineType: HPE ProLiant DL385 Gen10
Package: linux (not installed)
PciMultimedia:

ProcFB: 0 mgadrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-42-generic root=UUID=06e712a9-0adb-4078-8976-0c41ce346aa5 ro console=tty0 console=ttyS0,115200
ProcVersionSignature: Ubuntu 4.15.0-42.45-generic 4.15.18
RelatedPackageVersions:
 linux-restricted-modules-4.15.0-42-generic N/A
 linux-backports-modules-4.15.0-42-generic N/A
 linux-firmware 1.127.24
RfKill: Error: [Errno 2] No such file or directory
Tags: trusty uec-images
Uname: Linux 4.15.0-42-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

_MarkForUpload: True
dmi.bios.date: 10/02/2018
dmi.bios.vendor: HPE
dmi.bios.version: A40
dmi.board.name: ProLiant DL385 Gen10
dmi.board.vendor: HPE
dmi.chassis.type: 23
dmi.chassis.vendor: HPE
dmi.modalias: dmi:bvnHPE:bvrA40:bd10/02/2018:svnHPE:pnProLiantDL385Gen10:pvr:rvnHPE:rnProLiantDL385Gen10:rvr:cvnHPE:ct23:cvr:
[Bug 1873352] Lspci.txt
apport information

** Attachment added: "Lspci.txt"
   https://bugs.launchpad.net/bugs/1873352/+attachment/5355739/+files/Lspci.txt