Jon,
Looks like a qemu issue for the Server Team to take a look at.
Michael
On 03/31/2017 02:59 PM, Launchpad Bug Tracker wrote:
> bugproxy (bugproxy) has assigned this bug to you for Ubuntu:
>
> ---Problem Description---
> I am trying to hotplug-attach a Mellanox CX3 card to a guest, but the attach
> fails.
> virsh attach-device powerio-le12-ubuntu-17.04 ./add_cx3.xml
> error: Failed to attach device from ./add_cx3.xml
> error: internal error: unable to execute QEMU command 'device_add': vfio
> error: 0044:01:00.0: failed to setup container for group 6: RAM memory
> listener initialization failed for container
>
>
> In the QEMU log file I see this:
> 2017-02-14T22:55:40.721108Z qemu-system-ppc64: backend does not support BE
> vnet headers; falling back on userspace virtio
>
> This is with kernel 4.9.0-15-generic and qemu level:
> dpkg --list| grep qemu
> ii ipxe-qemu
> 1.0.0+git-20150424.a25a16d-1ubuntu2 all PXE boot firmware - ROM
> images for qemu
> ii qemu 1:2.8+dfsg-2ubuntu1
> ppc64el fast processor emulator
> ii qemu-block-extra:ppc64el 1:2.8+dfsg-2ubuntu1
> ppc64el extra block backend modules for qemu-system and
> qemu-utils
> ii qemu-kvm 1:2.8+dfsg-2ubuntu1
> ppc64el QEMU Full virtualization
> ii qemu-slof 20161019+dfsg-1
> all Slimline Open Firmware -- QEMU PowerPC version
> ii qemu-system 1:2.8+dfsg-2ubuntu1
> ppc64el QEMU full system emulation binaries
> ii qemu-system-arm 1:2.8+dfsg-2ubuntu1
> ppc64el QEMU full system emulation binaries (arm)
> ii qemu-system-common 1:2.8+dfsg-2ubuntu1
> ppc64el QEMU full system emulation binaries (common files)
> ii qemu-system-mips 1:2.8+dfsg-2ubuntu1
> ppc64el QEMU full system emulation binaries (mips)
> ii qemu-system-misc 1:2.8+dfsg-2ubuntu1
> ppc64el QEMU full system emulation binaries (miscellaneous)
> ii qemu-system-ppc 1:2.8+dfsg-2ubuntu1
> ppc64el QEMU full system emulation binaries (ppc)
> ii qemu-system-sparc 1:2.8+dfsg-2ubuntu1
> ppc64el QEMU full system emulation binaries (sparc)
> ii qemu-system-x86 1:2.8+dfsg-2ubuntu1
> ppc64el QEMU full system emulation binaries (x86)
> ii qemu-user 1:2.8+dfsg-2ubuntu1
> ppc64el QEMU user mode emulation binaries
> ii qemu-user-binfmt 1:2.8+dfsg-2ubuntu1
> ppc64el QEMU user mode binfmt registration for qemu-user
> ii qemu-utils 1:2.8+dfsg-2ubuntu1
> ppc64el QEMU utilities
>
>
> ---uname output---
> 4.9.0-15-generic #16-Ubuntu SMP Fri Jan 20 15:28:49 UTC 2017 ppc64le ppc64le
> ppc64le GNU/Linux
>
> Machine Type = P8
>
> ---Steps to Reproduce---
> bring up a guest and then try to attach device like this:
> virsh attach-device powerio-le12-ubuntu-17.04 ./add_cx3.xml --live
> error: Failed to attach device from ./add_cx3.xml
> error: internal error: unable to execute QEMU command 'device_add': vfio
> error: 0044:01:00.0: failed to setup container for group 6: RAM memory
> listener initialization failed for container
>
> When I retried the steps for add_cx3.xml on the same machine I noticed
> the following in the host logs:
>
> [ 1374.276210] KVM guest htab at c000001e56000000 (order 26), LPID 1
> [ 1383.824281] hrtimer: interrupt took 923 ns
> [ 1447.479194] audit_printk_skb: 15 callbacks suppressed
> [ 1447.479198] audit: type=1400 audit(1487194729.006:17): apparmor="DENIED"
> operation="setrlimit" profile="/usr/sbin/libvirtd" pid=6853 comm="libvirtd"
> rlimit=memlock value=8694792192
> [ 1447.481927] pci 0044:01 : [PE# 002] Disabling 64-bit DMA bypass
> [ 1447.481935] pci 0044:01 : [PE# 002] Removing DMA window #0
> [ 1447.481978] pci 0044:01 : [PE# 002] Removing DMA window #0
> [ 1447.481980] pci 0044:01 : [PE# 002] Removing DMA window #1
> [ 1447.485667] pci 0044:01 : [PE# 002] Setting up window#0 0..7fffffff
> pg=1000
> [ 1447.485670] pci 0044:01 : [PE# 002] Enabling 64-bit DMA bypass
> [ 1517.030701] audit: type=1400 audit(1487194798.559:18): apparmor="DENIED"
> operation="setrlimit" profile="/usr/sbin/libvirtd" pid=6853 comm="libvirtd"
> rlimit=memlock value=8694792192
> [ 1517.033286] pci 0044:01 : [PE# 002] Disabling 64-bit DMA bypass
> [ 1517.033290] pci 0044:01 : [PE# 002] Removing DMA window #0
> [ 1517.033322] pci 0044:01 : [PE# 002] Removing DMA window #0
> [ 1517.033325] pci 0044:01 : [PE# 002] Removing DMA window #1
> [ 1517.036971] pci 0044:01 : [PE# 002] Setting up window#0 0..7fffffff
> pg=1000
> [ 1517.036974] pci 0044:01 : [PE# 002] Enabling 64-bit DMA bypass
>
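The size of the denied request in the audit records above can be worked out with a little arithmetic. A minimal sketch, assuming the record is a flat list of key=value pairs as printed in the log; `parse_audit_fields` is a hypothetical helper, not part of any AppArmor tooling:

```python
import re

# One of the audit records from the host log above, joined onto a single line.
RECORD = (
    'audit: type=1400 audit(1487194729.006:17): apparmor="DENIED" '
    'operation="setrlimit" profile="/usr/sbin/libvirtd" pid=6853 '
    'comm="libvirtd" rlimit=memlock value=8694792192'
)

def parse_audit_fields(record: str) -> dict:
    """Collect key=value pairs; quoted values have their quotes stripped."""
    pairs = re.findall(r'(\w+)=("[^"]*"|\S+)', record)
    return {key: value.strip('"') for key, value in pairs}

fields = parse_audit_fields(RECORD)
requested_mib = int(fields["value"]) // (1024 * 1024)

print(fields["apparmor"], fields["operation"], fields["rlimit"])
print(f"requested memlock limit: {requested_mib} MiB")
```

The requested limit comes to 8292 MiB, which is 100 MiB more than the 8192 MiB the guest is started with (-m 8192 in the QEMU command line quoted further down).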
> I'm not sure if the apparmor issues are affecting functionality or not.
> That may be worth looking into in a separate bug, or may be a dupe of
> https://bugzilla.linux.ibm.com/show_bug.cgi?id=146192
>
> As noted there I did the following to work around it:
>
> sudo aa-complain /usr/sbin/libvirtd
> sudo aa-complain
> /etc/apparmor.d/libvirt/libvirt-????????-????-????-????-????????????
>
> I still got the VFIO memory listener error, however. If I install QEMU
> 2.7.0 I no longer see the VFIO error and things seem to succeed from a
> host perspective:
>
> root@powerio-le11:/etc/libvirt/qemu# virsh attach-device
> powerio-le12-ubuntu-17.04 ./add_cx3.xml --live
> Device attached successfully
>
> root@powerio-le11:/etc/libvirt/qemu# dmesg | tail -6
> [ 3880.813971] KVM guest htab at c000001e56000000 (order 26), LPID 1
> [ 3917.656384] audit: type=1400 audit(1487197199.210:26): apparmor="ALLOWED"
> operation="setrlimit" profile="/usr/sbin/libvirtd" pid=6853 comm="libvirtd"
> rlimit=memlock value=8694792192
> [ 3917.659276] pci 0044:01 : [PE# 002] Disabling 64-bit DMA bypass
> [ 3917.659284] pci 0044:01 : [PE# 002] Removing DMA window #0
> [ 3917.688803] vfio-pci 0044:01:00.0: enabling device (0400 -> 0402)
> [ 3917.800106] vfio_ecap_init: 0044:01:00.0 hiding ecap 0x19@0x18c
>
> In the guest things look okay initially:
>
> [ 28.797667] RTAS: event: 1, Type: Unknown, Severity: 1
> [ 29.062821] pci 0000:00:05.0: [15b3:1007] type 00 class 0x020000
> [ 29.063118] pci 0000:00:05.0: reg 0x10: [mem 0x100a0000000-0x100a00fffff
> 64bit]
> [ 29.063341] pci 0000:00:05.0: reg 0x18: [mem 0x2c0200000000-0x2c0201ffffff
> 64bit pref]
> [ 29.063701] pci 0000:00:05.0: reg 0x30: [mem 0x00000000-0x000fffff pref]
> [ 29.065237] iommu: Adding device 0000:00:05.0 to group 0
> [ 29.065332] pci 0000:00:05.0: BAR 2: assigned [mem
> 0x10122000000-0x10123ffffff 64bit pref]
> [ 29.065675] pci 0000:00:05.0: BAR 0: assigned [mem
> 0x10121800000-0x101218fffff 64bit]
> [ 29.066010] pci 0000:00:05.0: BAR 6: assigned [mem
> 0x100a0000000-0x100a00fffff pref]
> [ 29.066105] mlx4_core: Mellanox ConnectX core driver v4.0-1.0.1 (29 Jan
> 2017)
> [ 29.066127] mlx4_core: Initializing 0000:00:05.0
> [ 29.066210] mlx4_core 0000:00:05.0: enabling device (0000 -> 0002)
> [ 29.076273] mlx4_core 0000:00:05.0: Using 64-bit direct DMA at offset
> 800000000000000
>
>
> but eventually I see the following error:
>
>
> [ 89.925954] mlx4_core 0000:00:05.0: device is going to be reset
> [ 99.923755] mlx4_core 0000:00:05.0: Failed to obtain HW semaphore, aborting
> [ 99.924052] mlx4_core 0000:00:05.0: Fail to reset HCA
> [ 99.924305] kernel BUG at
> /var/lib/dkms/mlnx-ofed-kernel/4.0/build/drivers/net/ethernet/mellanox/mlx4/catas.c:193!
> [ 99.924643] Oops: Exception in kernel mode, sig: 5 [#1]
> [ 99.924811] SMP NR_CPUS=2048 [ 99.924889] NUMA
> [ 99.924968] pSeries
> [ 99.925048] Modules linked in: rdma_ucm(OE) ib_ucm(OE) ib_ipoib(OE)
> ib_uverbs(OE) ib_umad(OE) mlx5_ib(OE) mlx5_core(OE) mlx4_ib(OE) mlx4_en(OE)
> mlx4_core(OE) devlink vmx_crypto ib_iser rdma_cm(OE) iw_cm(OE) ib_cm(OE)
> ib_core(OE) mlx_compat(OE) configfs iscsi_tcp libiscsi_tcp libiscsi
> scsi_transport_iscsi knem(OE) ip_tables x_tables autofs4 btrfs raid10 raid456
> async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq
> libcrc32c raid1 raid0 multipath linear ibmvscsi crc32c_vpmsum virtio_net
> virtio_blk
> [ 99.927029] CPU: 10 PID: 4600 Comm: drmgr Tainted: G OE
> 4.9.0-12-generic #13-Ubuntu
> [ 99.927316] task: c0000001dfc27e00 task.stack: c0000001dd630000
> [ 99.927515] NIP: d000000003c62794 LR: d000000003c6277c CTR:
> c0000000006c4a80
> [ 99.927752] REGS: c0000001dd6332a0 TRAP: 0700 Tainted: G OE
> (4.9.0-12-generic)
> [ 99.928029] MSR: 800000010282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>[
> 99.928645] CR: 48022222 XER: 20000000
> [ 99.928764] CFAR: c000000000710c28 SOFTE: 1
> GPR00: d000000003c6277c c0000001dd633520 d000000003cbca7c 0000000000000029
> GPR04: 0000000000000001 000000000000014d 6573657420484341 0d0a6c20746f2072
> GPR08: 0000000000000007 0000000000000001 c0000001e2a94300 0000000000000006
> GPR12: 0000000000002200 c00000000fb85a00 c0000001dd9a0060 0000000000000000
> GPR16: 00000000024080c0 00000000024202c2 0000000000000000 d000000003cb7418
> GPR20: 0000000000000000 d0000800802c0680 c0000001dd9a04e8 c0000001dd9a0518
> GPR24: 000000000000ea60 0000000000000000 c000000001443a00 c0000001dd6336f0
> GPR28: 0000000000000000 0000000000000004 c0000001e2a94360 c0000001dd9a0060
> NIP [d000000003c62794] mlx4_enter_error_state.part.0+0x35c/0x460 [mlx4_core]
> [ 99.931952] LR [d000000003c6277c]
> mlx4_enter_error_state.part.0+0x344/0x460 [mlx4_core]
> [ 99.932190] Call Trace:
> [ 99.932278] [c0000001dd633520] [d000000003c6277c]
> mlx4_enter_error_state.part.0+0x344/0x460 [mlx4_core] (unreliable)
> [ 99.932647] [c0000001dd6335b0] [d000000003c66df8] __mlx4_cmd+0x720/0x970
> [mlx4_core]
> [ 99.932946] [c0000001dd633680] [d000000003c73d88] mlx4_QUERY_FW+0x90/0x420
> [mlx4_core]
> [ 99.933238] [c0000001dd633730] [d000000003c7fd28]
> mlx4_load_one+0x440/0x1ac0 [mlx4_core]
> [ 99.933520] [c0000001dd633850] [d000000003c81a40]
> mlx4_init_one+0x698/0x7c0 [mlx4_core]
> [ 99.933922] [c0000001dd633960] [c00000000063049c]
> local_pci_probe+0x6c/0x140
> [ 99.934171] [c0000001dd6339f0] [c0000000006312e8]
> pci_device_probe+0x178/0x200
> [ 99.934430] [c0000001dd633a50] [c000000000716970]
> driver_probe_device+0x240/0x540
> [ 99.934657] [c0000001dd633ae0] [c00000000071344c]
> bus_for_each_drv+0x8c/0xf0
> [ 99.934848] [c0000001dd633b30] [c0000000007164f0]
> __device_attach+0x140/0x210
> [ 99.935057] [c0000001dd633bc0] [c000000000621d38]
> pci_bus_add_device+0x78/0x100
> [ 99.935270] [c0000001dd633c30] [c000000000621e20]
> pci_bus_add_devices+0x60/0xe0
> [ 99.935488] [c0000001dd633c70] [c000000000625b44] pci_rescan_bus+0x44/0x70
> [ 99.935666] [c0000001dd633ca0] [c000000000631ee4]
> bus_rescan_store+0x84/0xb0
> [ 99.935840] [c0000001dd633ce0] [c000000000712fb4] bus_attr_store+0x44/0x70
> [ 99.936039] [c0000001dd633d00] [c0000000003d52b8] sysfs_kf_write+0x68/0xa0
> [ 99.936210] [c0000001dd633d20] [c0000000003d417c]
> kernfs_fop_write+0x17c/0x250
> [ 99.936407] [c0000001dd633d70] [c00000000031924c] __vfs_write+0x3c/0x70
> [ 99.936583] [c0000001dd633d90] [c00000000031a4b4] vfs_write+0xd4/0x240
> [ 99.936760] [c0000001dd633de0] [c00000000031c018] SyS_write+0x68/0x110
> [ 99.936934] [c0000001dd633e30] [c00000000000bd84] system_call+0x38/0xe0
> [ 99.937102] Instruction dump:
> [ 99.937188] e93f0000 3d020000 e8888078 e8690000 386300a0 4803f8f1 e8410018
> e95f0000
> [ 99.937472] e92a0000 81290098 2f890001 409efea0 <0fe00000> 60000000
> 60420000 e93f0000
> [ 99.937726] ---[ end trace 66826e43e8c8b7ba ]---
> [ 99.937832]
>
> It's not clear to me if this new guest issue is specific to QEMU 2.7, or
> something that would also be present on 2.8 if not for the VFIO issue
> originally noted in this bug. First step I think will be to root-cause
> the VFIO issue, fix it, and see if the guest issue remains afterward. If
> it does we can track that as a separate bug (or perhaps we've already seen
> this somewhere? seems vaguely familiar).
>
> Need to hop off the machine for today, but can look at it more tomorrow.
>
> (In reply to comment #10)
>
>> [ 1517.030701] audit: type=1400 audit(1487194798.559:18): apparmor="DENIED"
>> operation="setrlimit" profile="/usr/sbin/libvirtd" pid=6853 comm="libvirtd"
>> rlimit=memlock value=8694792192
>> I'm not sure if the apparmor issues are affecting functionality or not. That
>> may be worth looking into in a separate bug, or may be a dupe of
>> https://bugzilla.linux.ibm.com/show_bug.cgi?id=146192
>>
> Let me check the Ubuntu 16.10 system again, because I followed the same steps
> to update /etc/libvirt/qemu.conf on Ubuntu 17.04 as I did on 16.10 but
> still see it. Not sure if I did something else.
>
>> It's not clear to me if this new guest issue is specific to QEMU 2.7, or
>> something that would also be present on 2.8 if not for the VFIO issue
>> originally noted in this bug. First step I think will be to root-cause the
>> VFIO issue, fix it, and see if the guest issue remains afterward. If it does
>> we can track that as a separate bug (or perhaps we've already seen this
>> somewhere? seems vaguely familiar).
>>
>> Need to hop off the machine for today, but can look at it more tomorrow.
> For this, I see it with Ubuntu 16.10 KVM, and the issue is that the commands
> are timing out as if the DMAs are not getting to the HW. I can see this with
> any Mellanox card I have tried. I can open a separate bug more specific to
> 16.10 if you want.
>
> == Comment: #15 - MICHAEL D. ROTH <[email protected]> - 2017-02-22 13:22:53 ==
> I tried a bisect between 2.7.0 and 2.8.0/hostos to find the origin of these
> errors:
>
> root@powerio-le11:/etc/libvirt/qemu# virsh attach-device
> powerio-le12-ubuntu-17.04 ./add_cx3.xml --live
> error: Failed to attach device from ./add_cx3.xml
> error: internal error: unable to execute QEMU command 'device_add': Device
> initialization failed
>
> The commit that caused the "breakage" was:
>
> root@powerio-le11:~/mdroth/qemu.git# git bisect good
> 01905f58f166646619c35a2ebfc3ca3ed4cad62d is the first bad commit
> commit 01905f58f166646619c35a2ebfc3ca3ed4cad62d
> Author: Eric Auger <[email protected]>
> Date: Mon Oct 17 10:57:59 2016 -0600
>
> vfio: Pass an Error object to vfio_connect_container
>
>
> However all that does is turn vfio init errors into fatal errors that are
> passed on to libvirt, as opposed to just logging them in background and
> continuing execution. If I go back to 2.7.0 and re-test, I find that while
> libvirt reports the attach is successful, the log file still shows:
>
> LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
> QEMU_AUDIO_DRV=none /usr/bin/kvm -name
> guest=powerio-le12-ubuntu-17.04,debug-threads=on -S -object
> secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-17-powerio-le12-ubuntu-/master-key.aes
> -machine pseries-2.7,accel=kvm,usb=off,dump-guest-core=off -m 8192 -realtime
> mlock=off -smp 16,sockets=1,cores=2,threads=8 -uuid
> bd3248c2-5686-4e18-b86e-799292bf4ad3 -display none -no-user-config
> -nodefaults -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-17-powerio-le12-ubuntu-/monitor.sock,server,nowait
> -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown
> -boot strict=on -device pci-ohci,id=usb,bus=pci.0,addr=0x2 -device
> spapr-vscsi,id=scsi0,reg=0x2000 -drive
> file=/var/lib/libvirt/images/powerio-le12-ubuntu-17.04.qcow2,format=qcow2,if=none,id=drive-virtio-disk0
> -device
> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
> -drive if=none,id=drive-scsi0-0-0-0,readonly=on -device
> scsi-cd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0
> -netdev tap,fd=25,id=hostnet0,vhost=on,vhostfd=27 -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:eb:a9:da,bus=pci.0,addr=0x1
> -chardev pty,id=charserial0 -device
> spapr-vty,chardev=charserial0,reg=0x30000000 -device
> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 -msg timestamp=on
> Domain id=17 is tainted: high-privileges
> char device redirected to /dev/pts/5 (label charserial0)
> vfio: RAM memory listener initialization failed for container
>
> So this issue seems to have existed since before 2.7.0, assuming it is
> stemming from QEMU and not related to kernel. Will look into it more.
>
> == Comment: #16 - MICHAEL D. ROTH <[email protected]> - 2017-02-22 18:02:36 ==
> I think this is some sort of permissions/rlimit issue after all.
>
> If I invoke QEMU directly without libvirt, then do the attach from the
> QEMU monitor, I see the device added successfully with no error, and I
> also don't see the subsequent crashes within the guest relating to
> mlx4_QUERY_FW:
>
> root@powerio-le11:~/mdroth/qemu-build# ppc64-softmmu/qemu-system-ppc64
> -object
> secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-2-powerio-le12-ubuntu-/master-key.aes
> -machine pseries-2.7,accel=kvm,usb=off,dump-guest-core=off -m 8192 -realtime
> mlock=off -smp 16,sockets=1,cores=2,threads=8 -uuid
> bd3248c2-5686-4e18-b86e-799292bf4ad3 -display none -no-user-config
> -nodefaults -rtc base=utc -no-shutdown -boot strict=on -device
> pci-ohci,id=usb,bus=pci.0,addr=0x2 -device spapr-vscsi,id=scsi0,reg=0x2000
> -drive
> file=/var/lib/libvirt/images/powerio-le12-ubuntu-17.04.qcow2,format=qcow2,if=none,id=drive-virtio-disk0
> -device
> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
> -drive if=none,id=drive-scsi0-0-0-0,readonly=on -device
> scsi-cd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0
> -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:eb:a9:da,bus=pci.0,addr=0x1
> -device spapr-vty,chardev=charserial0,reg=0x30000000 -device
> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 -msg timestamp=on -vga none
> -nographic -chardev stdio,mux=on,id=charserial0 -monitor
> chardev:charserial0
>
> root@powerio-le11:~/mdroth# ./vfio-bind 0044:01:00.0
> unbinding 0044:01:00.0 via /sys/bus/pci/devices/0044:01:00.0/driver/unbind
> binding 0044:01:00.0
> echo 0x15b3 0x1007 >/sys/bus/pci/drivers/vfio-pci/new_id
>
> (qemu) device_add vfio-pci,host=0044:01:00.0,id=hp0
>
> root@powerio-le12:~# dmesg | tail -36
> [ 236.294903] RTAS: event: 1, Type: Unknown, Severity: 1
> [ 236.574958] pci 0000:00:00.0: [15b3:1007] type 00 class 0x020000
> [ 236.575630] pci 0000:00:00.0: reg 0x10: [mem 0x00000000-0x000fffff 64bit]
> [ 236.575986] pci 0000:00:00.0: reg 0x18: [mem 0x00000000-0x01ffffff 64bit
> pref]
> [ 236.576592] pci 0000:00:00.0: reg 0x30: [mem 0x00000000-0x000fffff pref]
> [ 236.578890] iommu: Adding device 0000:00:00.0 to group 0
> [ 236.578985] pci 0000:00:00.0: BAR 2: assigned [mem
> 0x10122000000-0x10123ffffff 64bit pref]
> [ 236.580466] pci 0000:00:00.0: BAR 0: assigned [mem
> 0x10121800000-0x101218fffff 64bit]
> [ 236.580921] pci 0000:00:00.0: BAR 6: assigned [mem
> 0x100a0000000-0x100a00fffff pref]
> [ 236.581011] mlx4_core: Mellanox ConnectX core driver v4.0-1.0.1 (29 Jan
> 2017)
> [ 236.581162] mlx4_core: Initializing 0000:00:00.0
> [ 236.581272] mlx4_core 0000:00:00.0: enabling device (0000 -> 0002)
> [ 236.583876] mlx4_core 0000:00:00.0: Using 64-bit direct DMA at offset
> 800000000000000
> [ 242.122882] mlx4_core: device is working in RoCE mode: Roce V1
> [ 242.122884] mlx4_core: UD QP Gid type is: V1
> [ 243.652901] mlx4_core 0000:00:00.0: PCIe link speed is 8.0GT/s, device
> supports 8.0GT/s
> [ 243.652904] mlx4_core 0000:00:00.0: PCIe link width is x8, device supports
> x8
> [ 243.877392] mlx4_en: Mellanox ConnectX HCA Ethernet driver v4.0-1.0.1 (29
> Jan 2017)
> [ 243.877592] mlx4_en 0000:00:00.0: Activating port:1
> [ 243.904087] mlx4_en: 0000:00:00.0: Port 1: Using 128 TX rings
> [ 243.904090] mlx4_en: 0000:00:00.0: Port 1: Using 8 RX rings
> [ 243.904093] mlx4_en: 0000:00:00.0: Port 1: frag:0 - size:1522 prefix:0
> stride:1536
> [ 243.904770] mlx4_en: 0000:00:00.0: Port 1: Initializing port
> [ 243.905354] mlx4_en 0000:00:00.0: registered PHC clock
> [ 243.906985] mlx4_en 0000:00:00.0: Activating port:2
> [ 243.917716] mlx4_core 0000:00:00.0 enp0s0: renamed from eth0
> [ 243.919899] mlx4_en: 0000:00:00.0: Port 2: Using 128 TX rings
> [ 243.919901] mlx4_en: 0000:00:00.0: Port 2: Using 8 RX rings
> [ 243.919903] mlx4_en: 0000:00:00.0: Port 2: frag:0 - size:1522 prefix:0
> stride:1536
> [ 243.920694] mlx4_en: 0000:00:00.0: Port 2: Initializing port
> [ 243.941713] <mlx4_ib> mlx4_ib_add: mlx4_ib: Mellanox ConnectX InfiniBand
> driver v4.0-1.0.1 (29 Jan 2017)
> [ 244.039494] <mlx4_ib> mlx4_ib_add: counter index 2 for port 1 allocated 1
> [ 244.039520] <mlx4_ib> mlx4_ib_add: counter index 3 for port 2 allocated 1
> [ 244.098796] mlx4_core 0000:00:00.0 enp0s0d1: renamed from eth0
> [ 245.266775] mlx4_en: enp0s0: Link Up
> [ 245.266891] mlx4_en: enp0s0d1: Link Up
>
> Everything appears to be functioning. Also worth noting, the host
> doesn't report any apparmor messages:
>
> [ 3683.945997] KVM guest htab at c000001e5a000000 (order 26), LPID 2
> [ 3878.433033] br0: port 2(vnet0) entered disabled state
> [ 3878.436993] device vnet0 left promiscuous mode
> [ 3878.436995] br0: port 2(vnet0) entered disabled state
> [ 3927.505181] pci 0044:01 : [PE# 02] Disabling 64-bit DMA bypass
> [ 3927.505188] pci 0044:01 : [PE# 02] Removing DMA window #0
> [ 3928.018862] pci 0044:01 : [PE# 02] Setting up window#0 0..3fffffff
> pg=1000
> [ 3928.024266] pci 0044:01 : [PE# 02] Setting up window#1
> 800000000000000..8000001ffffffff pg=10000
> [ 3928.403651] vfio-pci 0044:01:00.0: enabling device (0400 -> 0402)
> [ 3928.514975] vfio_ecap_init: 0044:01:00.0 hiding ecap 0x19@0x18c
>
> If I try to hotplug the device via libvirt, I see the vfio listener
> registration failure originally noted. If I enable traces in QEMU, I
> see where that listener failure is stemming from:
>
> LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
> QEMU_AUDIO_DRV=none /usr/bin/kvm -name
> guest=powerio-le12-ubuntu-17.04,debug-threads=on -S -object
> secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-2-powerio-le12-ubuntu-/master-key.aes
> -machine pseries-2.7,accel=kvm,usb=off,dump-guest-core=off -m 8192 -realtime
> mlock=off -smp 16,sockets=1,cores=2,threads=8 -uuid
> bd3248c2-5686-4e18-b86e-799292bf4ad3 -display none -no-user-config
> -nodefaults -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-2-powerio-le12-ubuntu-/monitor.sock,server,nowait
> -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown
> -boot strict=on -device pci-ohci,id=usb,bus=pci.0,addr=0x2 -device
> spapr-vscsi,id=scsi0,reg=0x2000 -drive
> file=/var/lib/libvirt/images/powerio-le12-ubuntu-17.04.qcow2,format=qcow2,if=none,id=drive-virtio-disk0
> -device
> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
> -drive if=none,id=drive-scsi0-0-0-0,readonly=on -device
> scsi-cd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0
> -netdev tap,fd=25,id=hostnet0,vhost=on,vhostfd=27 -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:eb:a9:da,bus=pci.0,addr=0x1
> -chardev pty,id=charserial0 -device
> spapr-vty,chardev=charserial0,reg=0x30000000 -device
> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 -msg timestamp=on
> Domain id=2 is tainted: high-privileges
> 2017-02-22T23:01:35.080908Z qemu-system-ppc64: -chardev pty,id=charserial0:
> char device redirected to /dev/pts/6 (label charserial0)
> [email protected]:vfio_realize (0044:01:00.0) group 6
> [email protected]:vfio_prereg_register va=3ffd2bff0000 size=200000000
> ret=-12
> [email protected]:vfio_prereg_listener_region_add_skip 10080000020 -
> 1008000003f
> [email protected]:vfio_prereg_listener_region_add_skip 10080000040 -
> 1008000007f
> [email protected]:vfio_prereg_listener_region_add_skip 10080000080 -
> 1008000009f
> [email protected]:vfio_prereg_listener_region_add_skip 100e0000000 -
> 100e000001f
> [email protected]:vfio_prereg_listener_region_add_skip 100e0000800 -
> 100e0000807
> [email protected]:vfio_prereg_listener_region_add_skip 100e0001000 -
> 100e00010ff
> [email protected]:vfio_prereg_listener_region_add_skip 100e0002000 -
> 100e000202f
> [email protected]:vfio_prereg_listener_region_add_skip 100e0002800 -
> 100e0002807
> [email protected]:vfio_prereg_listener_region_add_skip 10120000000 -
> 10120000fff
> [email protected]:vfio_prereg_listener_region_add_skip 10120001000 -
> 10120001fff
> [email protected]:vfio_prereg_listener_region_add_skip 10120002000 -
> 10120002fff
> [email protected]:vfio_prereg_listener_region_add_skip 10120003000 -
> 10120402fff
> [email protected]:vfio_prereg_listener_region_add_skip 10120800000 -
> 10120800fff
> [email protected]:vfio_prereg_listener_region_add_skip 10120801000 -
> 10120801fff
> [email protected]:vfio_prereg_listener_region_add_skip 10120802000 -
> 10120802fff
> [email protected]:vfio_prereg_listener_region_add_skip 10120803000 -
> 10120c02fff
> [email protected]:vfio_prereg_listener_region_add_skip 10121000000 -
> 10121000fff
> [email protected]:vfio_prereg_listener_region_add_skip 10121001000 -
> 10121001fff
> [email protected]:vfio_prereg_listener_region_add_skip 10121002000 -
> 10121002fff
> [email protected]:vfio_prereg_listener_region_add_skip 10121003000 -
> 10121402fff
>
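The first vfio_prereg_register line in the trace carries the essential information. A minimal sketch of decoding it, assuming the va and size fields are printed in hexadecimal (va clearly is; size is assumed to follow the same convention) and that ret is a negated errno value:

```python
import errno
import os
import re

# The trace line from the log above, joined onto a single line.
TRACE = "vfio_prereg_register va=3ffd2bff0000 size=200000000 ret=-12"

match = re.search(r"va=([0-9a-f]+) size=([0-9a-f]+) ret=(-?\d+)", TRACE)
va = int(match.group(1), 16)
size = int(match.group(2), 16)   # assumption: hexadecimal, like va
ret = int(match.group(3))

size_mib = size // (1024 * 1024)
err_name = errno.errorcode[-ret]   # symbolic name for errno 12

print(f"size = {size} bytes = {size_mib} MiB")
print(f"ret = {ret} -> {err_name}: {os.strerror(-ret)}")
```

Under that assumption the failed registration covers 8192 MiB, which is the whole of the guest's RAM (-m 8192), and errno 12 is ENOMEM.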
> vfio_prereg_register's ret=-12 is the errno value set by:
>
> ret = ioctl(container->fd, VFIO_IOMMU_SPAPR_REGISTER_MEMORY, &reg);
>
> which indicates that VFIO_IOMMU_SPAPR_REGISTER_MEMORY is failing with
> "Cannot allocate memory". In the host, I see an apparmor message:
>
> [ 1607.260426] KVM guest htab at c000001e56000000 (order 26), LPID 1
> [ 1745.761165] audit: type=1400 audit(1487804633.611:18): apparmor="ALLOWED"
> operation="setrlimit" profile="/usr/sbin/libvirtd" pid=5329 comm="libvirtd"
> rlimit=memlock value=8694792192
> [ 1745.763764] pci 0044:01 : [PE# 02] Disabling 64-bit DMA bypass
> [ 1745.763771] pci 0044:01 : [PE# 02] Removing DMA window #0
> [ 1745.763864] pci 0044:01 : [PE# 02] Removing DMA window #0
> [ 1745.763867] pci 0044:01 : [PE# 02] Removing DMA window #1
> [ 1745.767676] pci 0044:01 : [PE# 02] Setting up window#0 0..7fffffff
> pg=1000
> [ 1745.767679] pci 0044:01 : [PE# 02] Enabling 64-bit DMA bypass
>
> Originally these were "DENIED" errors, but in comment #10 I noted I'd
> worked around that via:
>
> sudo aa-complain /usr/sbin/libvirtd
> sudo aa-complain
> /etc/apparmor.d/libvirt/libvirt-????????-????-????-????-????????????
>
> as noted in https://bugzilla.linux.ibm.com/show_bug.cgi?id=146192
>
> But either that workaround is insufficient, or there's some other issue
> relating to libvirt privilege levels at play, given that QEMU doesn't
> have any issues when run directly as root.
>
>
> Can you try now? I was using the system over the weekend and the card was
> dead; the guest was also doing PCI passthrough of the card. So I took the
> card out of the guest XML and I can recreate the problem again.
> virsh attach-device powerio-le12-ubuntu-17.04 ./add_hydepark.xml --live
> error: Failed to attach device from ./add_hydepark.xml
> error: internal error: unable to execute QEMU command 'device_add': vfio
> error: 0040:01:00.0: failed to setup container for group 5: RAM memory
> listener initialization failed for container
>
> This is because of the memlock hard limit that libvirt sets. Upstream
> 2.5.0 doesn't have the problem.
>
> libvirt starts with a certain value for max memlock and adjusts it during
> the hotplug. Upstream 2.5.0 adjusts it correctly for my guest, which has
> <memory unit='KiB'>16777216</memory>
> to Max locked memory 17368612864 17368612864 bytes
> on hotplug, whereas the Ubuntu libvirt does not.
>
> This can be worked around by hard-coding the max limits with the tag below
> for the guest powerio-le14-ubuntu-17.04:
> <memtune>
> <hard_limit unit='KiB'>16961536</hard_limit>
> <soft_limit unit='KiB'>16961536</soft_limit>
> </memtune>
>
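The numbers quoted above are consistent with one another, which a short calculation makes visible. A minimal sketch using only the values from this comment; the variable names are illustrative:

```python
KIB = 1024
MIB = 1024 * 1024

guest_memory_kib = 16777216      # <memory unit='KiB'> from the guest XML
max_locked_bytes = 17368612864   # "Max locked memory" reported after hotplug
hard_limit_kib = 16961536        # <hard_limit unit='KiB'> used in the workaround

guest_memory_bytes = guest_memory_kib * KIB
headroom_mib = (max_locked_bytes - guest_memory_bytes) // MIB
hard_limit_bytes = hard_limit_kib * KIB

print(f"guest memory      : {guest_memory_bytes // MIB} MiB")
print(f"memlock headroom  : {headroom_mib} MiB above guest memory")
print(f"hard_limit matches: {hard_limit_bytes == max_locked_bytes}")
```

The memtune workaround therefore pins the limit to exactly the value that the upstream build arrives at on hotplug, 180 MiB above the guest's 16384 MiB of memory.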
> Trying to figure out which patch might be missing from the Ubuntu libvirt.
>
> I went through the code and found that the required patches are all there.
> The package apparmor-profiles was missing, so I installed it.
>
> I had to add #include <abstractions/libvirt-qemu> to
> /etc/apparmor.d/usr.bin.libvirt and add /dev/vfio/vfio rw, to
> /etc/apparmor.d/abstractions/libvirt-qemu so I could get the hotplug
> working.
>
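Because it is unclear which change mattered, it may help to check mechanically whether a given profile already carries the two edits. A minimal sketch, assuming the file contents have been read into strings; `has_include` and `has_vfio_rule` are hypothetical helpers that only inspect text and do not consult AppArmor's loaded or cached profiles:

```python
import re

INCLUDE_LINE = "#include <abstractions/libvirt-qemu>"
VFIO_RULE = re.compile(r"^\s*/dev/vfio/vfio\s+rw,\s*$", re.MULTILINE)

def has_include(profile_text: str) -> bool:
    """True if the profile pulls in the libvirt-qemu abstraction."""
    return any(line.strip() == INCLUDE_LINE for line in profile_text.splitlines())

def has_vfio_rule(abstraction_text: str) -> bool:
    """True if the abstraction grants read/write access to /dev/vfio/vfio."""
    return VFIO_RULE.search(abstraction_text) is not None

# Hypothetical file contents, for illustration only.
sample_profile = "profile example {\n  #include <abstractions/libvirt-qemu>\n}\n"
sample_abstraction = "  /dev/vfio/vfio rw,\n"

print(has_include(sample_profile), has_vfio_rule(sample_abstraction))
```

Running such a check on a freshly imaged system before and after each step would show which files actually changed, independently of what the AppArmor cache holds.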
> I did the above three things together to get it working and am not sure
> which of them actually fixed it (most likely including libvirt-qemu), as
> apparmor keeps the profiles in a cache and reinstalling
> libvirt-daemon-system (which provides /etc/apparmor.d/usr.bin.libvirt)
> didn't reinstall the file(!!).
>
> Apparmor seems to keep the profiles in a cache somewhere and reloading is
> not helping. Everything is working fine now, which makes it hard to say
> exactly which of the two steps fixed it, or whether installing
> apparmor-profiles did the trick.
>
> Carol, let me know if you are planning a re-image sometime so we can
> see exactly which of the 3 helps get rid of the problem.
>
> Would it be sufficient to just document this issue?
>
> For now maybe we can document the steps.
>
> All steps except step 3 (add /dev/vfio/vfio rw to
> abstractions/libvirt-qemu) are unavoidable. Step 3 can be avoided if we
> can make changes to the default libvirt-qemu file in the distro.
>
> ** Affects: ubuntu
> Importance: Undecided
> Assignee: Taco Screen team (taco-screen-team)
> Status: New
>
>
> ** Tags: architecture-ppc64le bugnameltc-151486 severity-high
> targetmilestone-inin1704
--
Michael Hohnbaum
OIL Program Manager
Power (ppc64el) Development Project Manager
Canonical, Ltd.
** Bug watch added: bugzilla.linux.ibm.com/ #146192
https://bugzilla.linux.ibm.com/show_bug.cgi?id=146192
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1678322
Title:
Ubuntu 17.04 KVM: Can not do hotplug attach
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1678322/+subscriptions
--
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs