[Kernel-packages] [Bug 1717224] Comment bridged from LTC Bugzilla
--- Comment From heinz-werner_se...@de.ibm.com 2018-02-27 05:21 EDT---
IBM bugzilla status -> closed, Fix Released and verified

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1717224

Title:
  virsh start of virtual guest domain fails with internal error due to
  low default aio-max-nr sysctl value

Status in Ubuntu on IBM z Systems: Fix Released
Status in kvm package in Ubuntu: Won't Fix
Status in linux package in Ubuntu: Fix Released
Status in procps package in Ubuntu: New
Status in linux source package in Xenial: Fix Released
Status in procps source package in Xenial: New
Status in linux source package in Artful: Fix Released
Status in procps source package in Artful: New

Bug description:
  Starting virtual guests on Ubuntu 16.04.2 LTS installed with its KVM
  hypervisor on an IBM Z14 system LPAR fails on the 18th guest with the
  following error:

  root@zm93k8:/rawimages/ubu1604qcow2# virsh start zs93kag70038
  error: Failed to start domain zs93kag70038
  error: internal error: process exited while connecting to monitor: 2017-07-26T01:48:26.352534Z qemu-kvm: -drive file=/guestimages/data1/zs93kag70038.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,aio=native: Could not open backing file: Could not set AIO state: Inappropriate ioctl for device

  The previous 17 guests started fine:

  root@zm93k8# virsh start zs93kag70020
  Domain zs93kag70020 started

  root@zm93k8# virsh start zs93kag70021
  Domain zs93kag70021 started
  .
  .
  root@zm93k8:/rawimages/ubu1604qcow2# virsh start zs93kag70036
  Domain zs93kag70036 started

  We ended up fixing the issue by adding the following line to
  /etc/sysctl.conf :

  fs.aio-max-nr = 4194304

  ... then, reload the sysctl config file:

  root@zm93k8:/etc# sysctl -p /etc/sysctl.conf
  fs.aio-max-nr = 4194304

  Now, we're able to start more guests...

  root@zm93k8:/etc# virsh start zs93kag70036
  Domain zs93kag70036 started

  The default value was originally set to 65535:

  root@zm93k8:/rawimages/ubu1604qcow2# cat /proc/sys/fs/aio-max-nr
  65536

  Note, we chose the 4194304 value, because this is what our KVM on
  System Z hypervisor ships as its default value. E.g. on our zKVM
  system:

  [root@zs93ka ~]# cat /proc/sys/fs/aio-max-nr
  4194304

  ubuntu@zm93k8:/etc$ lsb_release -a
  No LSB modules are available.
  Distributor ID: Ubuntu
  Description:    Ubuntu 16.04.2 LTS
  Release:        16.04
  Codename:       xenial
  ubuntu@zm93k8:/etc$

  ubuntu@zm93k8:/etc$ dpkg -s qemu-kvm |grep Version
  Version: 1:2.5+dfsg-5ubuntu10.8

  Is something already documented for Ubuntu KVM users warning them
  about the low default value, and some guidance as to how to select an
  appropriate value? Also, would you consider increasing the default
  aio-max-nr value to something much higher, to accommodate
  significantly more virtual guests?

  Thanks!

  ---uname output---
  ubuntu@zm93k8:/etc$ uname -a
  Linux zm93k8 4.4.0-62-generic #83-Ubuntu SMP Wed Jan 18 14:12:54 UTC 2017 s390x s390x s390x GNU/Linux

  Machine Type = z14

  ---Debugger---
  A debugger is not configured

  ---Steps to Reproduce---
  See Problem Description.

  The problem was happening a week ago, so this may not reflect that
  activity.

  This file was collected on Aug 7, one week after we were hitting the
  problem. If I need to reproduce the problem and get fresh data,
  please let me know.

  /var/log/messages doesn't exist on this system, so I provided syslog
  output instead.

  All data have been collected too late after the problem was observed
  over a week ago. If you need me to reproduce the problem and get new
  data, please let me know. That's not a problem.

  Also, we would have to make special arrangements for login access to
  these systems. I'm happy to run traces and data collection for you
  as needed. If that's not sufficient, then we'll explore log in
  access for you.

  Thanks... - Scott G.

  I was able to successfully recreate the problem and captured /
  attached new debug docs.

  Recreate procedure:

  # Started out with no virtual guests running.

  ubuntu@zm93k8:/home/scottg$ virsh list
   Id    Name    State

  # Set fs.aio-max-nr back to original Ubuntu "out of the box" value
  in /etc/sysctl.conf

  ubuntu@zm93k8:~$ tail -1 /etc/sysctl.conf
  fs.aio-max-nr = 65536

  ## sysctl -a shows:

  fs.aio-max-nr = 4194304

  ## Reload sysctl.

  ubuntu@zm93k8:~$ sudo sysctl -p /etc/sysctl.conf
  fs.aio-max-nr = 65536
  ubuntu@zm93k8:~$

  ubuntu@zm93k8:~$ sudo sysctl -a |grep fs.aio-max-nr
  fs.aio-max-nr = 65536

  ubuntu@zm93k8:~$ cat /proc/sys/fs/aio-max-nr
  65536

  # Attempt to start more than 17 qcow2 virtual guests on the Ubuntu
  host. Fails on the 18th XML.

  Script used to start
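
As a side note to the workaround in the description above: the remaining headroom can be checked before more guests are started. The following is only a minimal sketch and is not part of the original report. The function name aio_headroom and its two optional file arguments are hypothetical; by default it reads /proc/sys/fs/aio-nr and /proc/sys/fs/aio-max-nr, the procfs counterparts of the fs.aio-nr and fs.aio-max-nr sysctls discussed in this bug.

```sh
#!/bin/sh
# aio_headroom [NR_FILE] [MAX_FILE]
# Print fs.aio-max-nr minus fs.aio-nr, i.e. how many AIO events are nominally
# still available system-wide. The optional arguments (hypothetical, for
# experimenting only) replace the procfs files with ordinary files.
aio_headroom() {
    nr_file=${1:-/proc/sys/fs/aio-nr}
    max_file=${2:-/proc/sys/fs/aio-max-nr}
    nr=$(cat "$nr_file") || return 1
    max=$(cat "$max_file") || return 1
    echo $((max - nr))
}

# Typical use on the KVM host, before a further 'virsh start':
#   aio_headroom
```

A small or negative result suggests that the next guest with aio=native disks may fail to start in the way shown in the description.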
[Kernel-packages] [Bug 1717224] Comment bridged from LTC Bugzilla
--- Comment From cborn...@de.ibm.com 2018-02-27 04:03 EDT---
I can confirm that this is fixed in 4.13.0-36-generic and 4.4.0-116-generic

--- Comment From cborn...@de.ibm.com 2018-02-27 04:04 EDT---
Tested HWE and LTS kernels, both contain this fix.

Status in Ubuntu on IBM z Systems: In Progress
Status in kvm package in Ubuntu: Won't Fix
Status in linux package in Ubuntu: Fix Released
Status in procps package in Ubuntu: New
Status in linux source package in Xenial: Fix Released
Status in procps source package in Xenial: New
Status in linux source package in Artful: Fix Released
Status in procps source package in Artful: New
[Kernel-packages] [Bug 1717224] Comment bridged from LTC Bugzilla
--- Comment From heinz-werner_se...@de.ibm.com 2018-01-26 03:21 EDT---
Canonical, are any updates available for this LP?

Status in Ubuntu on IBM z Systems: In Progress
Status in kvm package in Ubuntu: Confirmed
Status in linux package in Ubuntu: In Progress
Status in procps package in Ubuntu: New
Status in kvm source package in Xenial: New
Status in linux source package in Xenial: In Progress
Status in procps source package in Xenial: New
Status in kvm source package in Zesty: New
Status in linux source package in Zesty: In Progress
Status in procps source package in Zesty: New
Status in kvm source package in Artful: Confirmed
Status in linux source package in Artful: In Progress
Status in procps source package in Artful: New
[Kernel-packages] [Bug 1717224] Comment bridged from LTC Bugzilla
--- Comment From swgre...@us.ibm.com 2017-11-10 14:39 EDT---
(In reply to comment #31)
> Hi Scott,
> the howto is mixed for Desktop users, Server users and selective upgrades.
> For your case you only need the most simple case which would be:
>
> Essentially you want to:
>
> # Check - all other updates done (to clear the view)
> $ apt list --upgradable
> Listing... Done
>
> # Enable proposed for z on Server
> $ echo "deb http://ports.ubuntu.com/ubuntu-ports/ xenial-proposed main
> restricted universe multiverse" | sudo tee
> /etc/apt/sources.list.d/enable-proposed.list
> $ sudo apt update
> $ apt list --upgradable
> [...]
> linux-headers-generic/xenial-proposed 4.4.0.100.105 s390x [upgradable from:
> 4.4.0.98.103]
> linux-headers-virtual/xenial-proposed 4.4.0.100.105 s390x [upgradable from:
> 4.4.0.98.103]
> linux-image-virtual/xenial-proposed 4.4.0.100.105 s390x [upgradable from:
> 4.4.0.98.103]
>
> # Install just the kernels from proposed
> $ sudo apt install linux-generic
>
> No need to set apt prefs if you only do a selective install.
> If you'd do a global "sudo apt upgrade" you'd get all, but that is likely
> not what you want in your case. After you have done so you can just
> enable/disable the line in /etc/apt/sources.list.d/enable-proposed.list as
> needed.
>
> Hope that helps

Yes, your instructions were immensely useful, thanks for the explanation.
With the proposed fix applied, I am now able to start over 100 virtual guests, even with aio-max-nr set to 64K:

root@zm93k8:~# cat /proc/sys/fs/aio-max-nr
65535

root@zm93k8:/tmp# virsh list |grep running
86    zs93kag70041    running
87    zs93kag70042    running
88    zs93kag70055    running
89    zs93kag70056    running
90    zs93kag70057    running
91    zs93kag70058    running
92    zs93kag70059    running
93    zs93kag70060    running
94    zs93kag70061    running
95    zs93kag70062    running
96    zs93kag70063    running
97    zs93kag70064    running
98    zs93kag70065    running
99    zs93kag70066    running
100   zs93kag70067    running
101   zs93kag70068    running
102   zs93kag70069    running
103   zs93kag70070    running
104   zs93kag70071    running
105   zs93kag70072    running
106   zs93kag70073    running
107   zs93kag70074    running
108   zs93kag70075    running
109   zs93kag70077    running
110   zs93kag70078    running
111   zs93kag70079    running
112   zs93kag70080    running
113   zs93kag70081    running
114   zs93kag70082    running
115   zs93kag70083    running
116   zs93kag70084    running
117   zs93kag70085    running
118   zs93kag70086    running
119   zs93kag70087    running
120   zs93kag70088    running
121   zs93kag70089    running
122   zs93kag70090    running
123   zs93kag70091    running
124   zs93kag70092    running
125   zs93kag70093    running
126   zs93kag70094    running
127   zs93kag70095    running
128   zs93kag70096    running
129   zs93kag70097    running
130   zs93kag70098    running
131   zs93kag70099    running
132   zs93kag70100    running
133   zs93kag70101    running
134   zs93kag70102    running
135   zs93kag70103    running
136   zs93kag70104    running
137   zs93kag70105    running
138   zs93kag70106    running
139   zs93kag70107    running
140   zs93kag70108    running
141   zs93kag70109    running
142   zs93kag70110    running
143   zs93kag70111    running
144   zs93kag70112    running
145   zs93kag70113    running
146   zs93kag70114    running
147   zs93kag70115    running
148   zs93kag70116    running
149   zs93kag70117    running
150   zs93kag70118    running
151   zs93kag70119    running
152   zs93kag70120    running
153   zs93kag70121    running
154   zs93kag70122    running
155   zs93kag70123    running
156   zs93kag70124    running
157   zs93kag70125    running
158   zs93kag70126    running
159   zs93kag70127    running
160   zs93kag70128    running
161   zs93kag70129
[Kernel-packages] [Bug 1717224] Comment bridged from LTC Bugzilla
--- Comment From swgre...@us.ibm.com 2017-11-09 16:56 EDT---
Heinz, I'm trying to pick up the kernel update using the -proposed method at the URL you provided.

I added this line to the sources.list ...

root@zm93k8:/home/scottg# grep proposed /etc/apt/sources.list
deb http://ports.ubuntu.com/ubuntu-ports xenial-proposed restricted main multiverse universe

Created proposed-updates file ...

root@zm93k8:/home/scottg# cat /etc/apt/preferences.d/proposed-updates
Package: *
Pin: release a=xenial-proposed
Pin-Priority: 400

The simulated update doesn't show any available updates (as expected / desired):

root@zm93k8:/home/scottg# sudo apt-get upgrade -s
Reading package lists... Done
Building dependency tree
Reading state information... Done
Calculating upgrade... Done
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

However, I can't figure out how to actually install the specific kernel package ... I tried a simulation and I get:

root@zm93k8:/home/scottg# sudo apt-get install Ubuntu-4.4.0-98.121/xenial-proposed -s
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package Ubuntu-4.4.0-98.121
E: Couldn't find any package by glob 'Ubuntu-4.4.0-98.121'
E: Couldn't find any package by regex 'Ubuntu-4.4.0-98.121'
root@zm93k8:/home/scottg#

Do I have incorrect syntax? Incorrect package name? Not sure why I'm not finding the update.

Note, I also had to comment out these URLs to prevent from finding updates:

#deb http://ports.ubuntu.com/ xenial main restricted universe multiverse
#deb http://ports.ubuntu.com/ xenial-updates main restricted universe multiverse
#deb http://ports.ubuntu.com/ xenial-security main universe

This is unfamiliar territory... thanks for your help.

Status in Ubuntu on IBM z Systems: In Progress
Status in kvm package in Ubuntu: Confirmed
Status in linux package in Ubuntu: In Progress
Status in procps package in Ubuntu: New
Status in kvm source package in Xenial: New
Status in linux source package in Xenial: In Progress
Status in procps source package in Xenial: New
Status in kvm source package in Zesty: New
Status in linux source package in Zesty: In Progress
Status in procps source package in Zesty: New
Status in kvm source package in Artful: Confirmed
Status in linux source package in Artful: In Progress
Status in procps source package in Artful: New
[Kernel-packages] [Bug 1717224] Comment bridged from LTC Bugzilla
--- Comment From swgre...@us.ibm.com 2017-11-09 11:30 EDT---
(In reply to comment #27)
> Commit 2a8a98673c13 is in Xenial -proposed, kernel version:
> Ubuntu-4.4.0-98.121. It looks like you tested with version
> 4.4.0-62-generic. Can you try testing with the -proposed Xenial kernel:
>
> See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to
> enable and use -proposed.
>
> Another option is to test with the current -updates Artful kernel.

Heinz - I should be able to get time on the Ubuntu test system soon. I'll try the proposed kernel level and report results ASAP.

Thanks... - Scott

Status in Ubuntu on IBM z Systems: In Progress
Status in kvm package in Ubuntu: Confirmed
Status in linux package in Ubuntu: In Progress
Status in procps package in Ubuntu: New
Status in kvm source package in Xenial: New
Status in linux source package in Xenial: In Progress
Status in procps source package in Xenial: New
Status in kvm source package in Zesty: New
Status in linux source package in Zesty: In Progress
Status in procps source package in Zesty: New
Status in kvm source package in Artful: Confirmed
Status in linux source package in Artful: In Progress
Status in procps source package in Artful: New
[Kernel-packages] [Bug 1717224] Comment bridged from LTC Bugzilla
--- Comment From swgre...@us.ibm.com 2017-10-17 18:34 EDT---
Hi folks.

Good news! We got a test window on the Ubuntu KVM host today.

We provisioned a collection of 24 new virtual Ubuntu guests for this test. Each virtual domain uses a single qcow2 virtual boot volume. All guests are configured exactly the same, except that guests zs93kag100080, zs93kag100081 and zs93kag100082 are on a macvtap interface.

Here's a sample of one (running) guest's XML:

ubuntu@zm93k8:/home/scottg$ virsh dumpxml zs93kag100080
zs93kag100080 6bd4ebad-414b-4e1e-9995-7d061331ec01 4194304 4194304 2 /machine hvm destroy restart preserve /usr/bin/qemu-kvm libvirt-6bd4ebad-414b-4e1e-9995-7d061331ec01 libvirt-6bd4ebad-414b-4e1e-9995-7d061331ec01

To set up the test, we shut down all virtual domains, and then ran a script which simply starts the guests one at a time and captures fs.aio-nr before / after each 'virsh start'.

After attempting to start all guests in the list, it goes into a loop, checking fs.aio-nr once every minute for 10 minutes to see if that value changes (which it does not).

ubuntu@zm93k8:/home/scottg$ ./start_macvtaps_debug.sh
Test started at Tue Oct 17 17:48:29 EDT 2017
cat /proc/sys/fs/aio-max-nr
65535
fs.aio-nr = 0
Starting zs93kag100080 ; Count = 1
zs93kag100080 started succesfully ...
fs.aio-nr = 6144
Starting zs93kag100081 ; Count = 2
zs93kag100081 started succesfully ...
fs.aio-nr = 12288
Starting zs93kag100082 ; Count = 3
zs93kag100082 started succesfully ...
fs.aio-nr = 18432
Starting zs93kag100083 ; Count = 4
zs93kag100083 started succesfully ...
fs.aio-nr = 24576
Starting zs93kag100084 ; Count = 5
zs93kag100084 started succesfully ...
fs.aio-nr = 30720
Starting zs93kag100085 ; Count = 6
zs93kag100085 started succesfully ...
fs.aio-nr = 36864
Starting zs93kag70024 ; Count = 7
zs93kag70024 started succesfully ...
fs.aio-nr = 43008
Starting zs93kag70025 ; Count = 8
zs93kag70025 started succesfully ...
fs.aio-nr = 49152
Starting zs93kag70026 ; Count = 9
zs93kag70026 started succesfully ...
fs.aio-nr = 55296
Starting zs93kag70027 ; Count = 10
zs93kag70027 started succesfully ...
fs.aio-nr = 61440
Starting zs93kag70038 ; Count = 11
zs93kag70038 started succesfully ...
fs.aio-nr = 67584
Starting zs93kag70039 ; Count = 12
zs93kag70039 started succesfully ...
fs.aio-nr = 73728
Starting zs93kag70040 ; Count = 13
zs93kag70040 started succesfully ...
fs.aio-nr = 79872
Starting zs93kag70043 ; Count = 14
zs93kag70043 started succesfully ...
fs.aio-nr = 86016
Starting zs93kag70045 ; Count = 15
zs93kag70045 started succesfully ...
fs.aio-nr = 92160
Starting zs93kag70046 ; Count = 16
zs93kag70046 started succesfully ...
fs.aio-nr = 98304
Starting zs93kag70047 ; Count = 17
zs93kag70047 started succesfully ...
fs.aio-nr = 104448
Starting zs93kag70048 ; Count = 18
zs93kag70048 started succesfully ...
fs.aio-nr = 110592
Starting zs93kag70049 ; Count = 19
zs93kag70049 started succesfully ...
fs.aio-nr = 116736
Starting zs93kag70050 ; Count = 20
zs93kag70050 started succesfully ...
fs.aio-nr = 122880
Starting zs93kag70051 ; Count = 21
zs93kag70051 started succesfully ...
fs.aio-nr = 129024
Starting zs93kag70052 ; Count = 22
Error starting guest zs93kag70052 .
error: Failed to start domain zs93kag70052
error: internal error: process exited while connecting to monitor: 2017-10-17T21:49:06.68Z qemu-kvm: -drive file=/guestimages/data1/zs93kag70052.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,aio=native: Could not refresh total sector count: Bad file descriptor
fs.aio-nr = 129024
Starting zs93kag70053 ; Count = 23
Error starting guest zs93kag70053 .
error: Failed to start domain zs93kag70053
error: internal error: process exited while connecting to monitor: 2017-10-17T21:49:07.933457Z qemu-kvm: -drive file=/guestimages/data1/zs93kag70053.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,aio=native: Could not refresh total sector count: Bad file descriptor
fs.aio-nr = 129024
Starting zs93kag70054 ; Count = 24
Error starting guest zs93kag70054 .
error: Failed to start domain zs93kag70054
error: internal error: process exited while connecting to monitor: 2017-10-17T21:49:09.084863Z qemu-kvm: -drive file=/guestimages/data1/zs93kag70054.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,aio=native: Could not refresh total sector count: Bad file descriptor
fs.aio-nr = 129024

Monitor fs.aio-nr for 10 minutes, capture value every 60 seconds...
Sleeping 60 seconds. Loop count = 1
fs.aio-nr = 129024
Sleeping 60 seconds. Loop count = 2
fs.aio-nr = 129024
Sleeping 60 seconds. Loop count = 3
fs.aio-nr = 129024
Sleeping 60 seconds. Loop count = 4
fs.aio-nr = 129024
Sleeping 60 seconds. Loop count = 5
fs.aio-nr = 129024
Sleeping 60 seconds. Loop count = 6
fs.aio-nr = 129024
Sleeping 60 seconds. Loop count = 7
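
The per-guest cost in the log above can be read off mechanically. The following is a rough sketch, not the script used in the test: the function name aio_deltas is hypothetical, and it assumes the log has been saved with one entry per line so that each reading appears as a line of the form "fs.aio-nr = N".

```sh
#!/bin/sh
# aio_deltas: read a log on stdin and, for every line of the form
# "fs.aio-nr = N" after the first, print N minus the previous reading.
aio_deltas() {
    awk '$1 == "fs.aio-nr" && $2 == "=" {
        if (seen) print $3 - prev
        prev = $3
        seen = 1
    }'
}

# Example with the first three readings from the log above.
printf 'fs.aio-nr = 0\nfs.aio-nr = 6144\nfs.aio-nr = 12288\n' | aio_deltas
# prints 6144 twice, one value per line
```

Applied to the whole log, every successful start adds 6144, the last successful reading is 21 x 6144 = 129024, and the failed starts add 0. A 22nd guest would have brought the counter to 135168. That the guests kept starting well past the configured 65535 is an observation from these numbers only; the thread itself does not state where the effective cutoff lies.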
[Kernel-packages] [Bug 1717224] Comment bridged from LTC Bugzilla
--- Comment From cborn...@de.ibm.com 2017-10-17 07:09 EDT---
FWIW, I checked on my system, and aio-nr only increases for disks with io='native' (libvirt) or aio='native' (qemu command line).

QEMU does an io_setup of 128 events per disk, but with current kernels each such io_setup increases aio-nr by 2048 on my system. Looks like we need

commit 2a8a98673c13cb2a61a6476153acf8344adfa992
Author: Mauricio Faria de Oliveira
AuthorDate: Wed Jul 5 10:53:16 2017 -0300
Commit: Benjamin LaHaise
CommitDate: Thu Sep 7 12:28:28 2017 -0400

fs: aio: fix the increment of aio-nr and counting against aio-max-nr

to fix the accounting. Even with that fix, the default still limits us to 64k/128 = 512 disks for ALL guests, and only if nothing else uses aio contexts. Since aio contexts do not preallocate any resources, what about
- applying the above fix
- increasing the limit to 128k to have enough capacity?
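The disk limits discussed in the 2017-10-17 07:09 comment from cborn...@de.ibm.com can be checked with shell arithmetic. This is only a sketch under the figures quoted there: 128 events requested per disk, 2048 counted per disk before the accounting fix, and limits of 64k or 128k. The function name `max_aio_disks` is hypothetical, and the calculation assumes nothing else on the host uses aio contexts.

```shell
#!/bin/sh
# Hypothetical helper: how many aio=native disks fit under fs.aio-max-nr
# if every disk is counted as a fixed number of events.
max_aio_disks() {
    limit=$1       # value of fs.aio-max-nr
    per_disk=$2    # events counted against the limit for each disk
    echo $(( limit / per_disk ))
}

max_aio_disks 65536 128      # 64k limit, 128 counted per disk: prints 512
max_aio_disks 65536 2048     # 64k limit, 2048 counted per disk: prints 32
max_aio_disks 131072 128     # 128k limit, 128 counted per disk: prints 1024
```

The middle case shows why the accounting matters: at 2048 per disk, the 64k default is used up after 32 disks across all guests.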
[Kernel-packages] [Bug 1717224] Comment bridged from LTC Bugzilla
--- Comment From swgre...@us.ibm.com 2017-10-16 11:27 EDT---
(In reply to comment #19)
> Any updates? Have you been able to monitor fs.aio-nr exhaustion?

Sorry, no. Is this data critical for debug? I would have to disrupt our current configuration in order to bring up the Ubuntu KVM host. If so, I'll see what I can do. Thanks...

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1717224

Title: virsh start of virtual guest domain fails with internal error due to low default aio-max-nr sysctl value

Status in Ubuntu on IBM z Systems: Confirmed
Status in kvm package in Ubuntu: Confirmed
Status in linux package in Ubuntu: New
Status in procps package in Ubuntu: New

Bug description:

Starting virtual guests via on Ubuntu 16.04.2 LTS installed with its KVM hypervisor on an IBM Z14 system LPAR fails on the 18th guest with the following error:

root@zm93k8:/rawimages/ubu1604qcow2# virsh start zs93kag70038
error: Failed to start domain zs93kag70038
error: internal error: process exited while connecting to monitor: 2017-07-26T01:48:26.352534Z qemu-kvm: -drive file=/guestimages/data1/zs93kag70038.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,aio=native: Could not open backing file: Could not set AIO state: Inappropriate ioctl for device

The previous 17 guests started fine:

root@zm93k8# virsh start zs93kag70020
Domain zs93kag70020 started

root@zm93k8# virsh start zs93kag70021
Domain zs93kag70021 started
.
.
root@zm93k8:/rawimages/ubu1604qcow2# virsh start zs93kag70036
Domain zs93kag70036 started

We ended up fixing the issue by adding the following line to /etc/sysctl.conf :

fs.aio-max-nr = 4194304

... then, reload the sysctl config file:

root@zm93k8:/etc# sysctl -p /etc/sysctl.conf
fs.aio-max-nr = 4194304

Now, we're able to start more guests...

root@zm93k8:/etc# virsh start zs93kag70036
Domain zs93kag70036 started

The default value was originally set to 65536:

root@zm93k8:/rawimages/ubu1604qcow2# cat /proc/sys/fs/aio-max-nr
65536

Note, we chose the 4194304 value, because this is what our KVM on System Z hypervisor ships as its default value. Eg. on our zKVM system:

[root@zs93ka ~]# cat /proc/sys/fs/aio-max-nr
4194304

ubuntu@zm93k8:/etc$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.2 LTS
Release: 16.04
Codename: xenial
ubuntu@zm93k8:/etc$

ubuntu@zm93k8:/etc$ dpkg -s qemu-kvm |grep Version
Version: 1:2.5+dfsg-5ubuntu10.8

Is something already documented for Ubuntu KVM users warning them about the low default value, and some guidance as to how to select an appropriate value? Also, would you consider increasing the default aio-max-nr value to something much higher, to accommodate significantly more virtual guests?

Thanks!

---uname output---
ubuntu@zm93k8:/etc$ uname -a
Linux zm93k8 4.4.0-62-generic #83-Ubuntu SMP Wed Jan 18 14:12:54 UTC 2017 s390x s390x s390x GNU/Linux

Machine Type = z14

---Debugger---
A debugger is not configured

---Steps to Reproduce---
See Problem Description.

The problem was happening a week ago, so this may not reflect that activity.

This file was collected on Aug 7, one week after we were hitting the problem. If I need to reproduce the problem and get fresh data, please let me know.

/var/log/messages doesn't exist on this system, so I provided syslog output instead.

All data have been collected too late after the problem was observed over a week ago. If you need me to reproduce the problem and get new data, please let me know. That's not a problem.

Also, we would have to make special arrangements for login access to these systems. I'm happy to run traces and data collection for you as needed. If that's not sufficient, then we'll explore log in access for you. Thanks... - Scott G.

I was able to successfully recreate the problem and captured / attached new debug docs.

Recreate procedure:

# Started out with no virtual guests running.

ubuntu@zm93k8:/home/scottg$ virsh list
 Id    Name    State

# Set fs.aio-max-nr back to original Ubuntu "out of the box" value in /etc/sysctl.conf

ubuntu@zm93k8:~$ tail -1 /etc/sysctl.conf
fs.aio-max-nr = 65536

## sysctl -a shows:
fs.aio-max-nr = 4194304

## Reload sysctl.

ubuntu@zm93k8:~$ sudo sysctl -p /etc/sysctl.conf
fs.aio-max-nr = 65536
ubuntu@zm93k8:~$

ubuntu@zm93k8:~$ sudo sysctl -a |grep fs.aio-max-nr
fs.aio-max-nr = 65536

ubuntu@zm93k8:~$ cat /proc/sys/fs/aio-max-nr
65536

# Attempt to start more than 17 qcow2 virtual guests on the Ubuntu host. Fails on the 18th XML.

Script used to start guests..
[Kernel-packages] [Bug 1717224] Comment bridged from LTC Bugzilla
--- Comment From swgre...@us.ibm.com 2017-09-14 10:08 EDT---
(In reply to comment #17)
> OTOH I wonder to some extent how you exceeded the 65k with "just" 18 guests.
> Not strictly required, but it might be very interesting before your system
> becomes unavailable to track:
> $ sysctl fs.aio-nr
> while starting the guests. How much gets added per guest, and would it settle
> down after the guest is done with initial rampup?

Thank you all for investigating this issue...

Unfortunately, my Ubuntu KVM hosts have been shut down temporarily to use those LPARs for some other testing. I'll provide the fs.aio-nr data just as soon as those resources free up again.
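The tracking requested in the quoted comment #17 (watching fs.aio-nr while guests start) could be done with a small sampling loop. This is a sketch, not the script used in this bug: the function name `sample_aio_nr` and its output format are hypothetical. It reads /proc/sys/fs/aio-nr by default, which is the procfs view of the fs.aio-nr sysctl, and accepts another file so it can be tried without touching a live host.

```shell
#!/bin/sh
# Hypothetical helper: print the aio-nr counter COUNT times,
# INTERVAL seconds apart. FILE defaults to /proc/sys/fs/aio-nr.
sample_aio_nr() {
    count=${1:-10}
    interval=${2:-60}
    file=${3:-/proc/sys/fs/aio-nr}
    i=1
    while [ "$i" -le "$count" ]; do
        echo "Loop count = $i fs.aio-nr = $(cat "$file")"
        i=$(( i + 1 ))
        # Sleep between samples, but not after the last one.
        if [ "$i" -le "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

Running `sample_aio_nr 10 60` in one terminal while starting guests in another would record one reading per minute for about ten minutes, which is enough to see how much each guest adds and whether the value settles afterwards.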