After comparing the sysfs data, I don't see any differences in the
physical paths in sysfs for the thunder NIC.

I wonder if there is something that detects "xenial" and does one
thing vs. "bionic", despite the xenial host using the same kernel
level.
The apparmor DENIED messages on the namespaces only show up under bionic,
but both kernels are at the same level, so we should be seeing the
same errors if both stacks were using the same cgroups.

Can we check the charms, juju, or lxd w.r.t. how those cgroups are
mounted?  That may not be related, but we're running out of
differences.
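One quick way to compare, assuming shell access to both hosts (the
container name below is taken from the apparmor DENIED logs further down;
substitute your own):

```shell
# List cgroup and cgroup2 mounts as seen by the current mount namespace.
# Run this on the xenial host, the bionic host, and inside a container.
grep -E ' cgroup2? ' /proc/self/mounts || echo "no cgroup mounts visible"

# Inside one of the juju-managed lxd containers (name from the DENIED
# logs; substitute your own):
#   lxc exec juju-657fe9-1-lxd-1 -- grep -E ' cgroup2? ' /proc/self/mounts
```

If bionic shows a `cgroup2` mount at /sys/fs/cgroup/unified and xenial
doesn't, that would line up with the fstype="cgroup2" DENIED messages only
appearing on bionic.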

On Tue, May 22, 2018 at 9:21 PM, Jason Hobbs <[email protected]> wrote:
> ls -alR /sys on bionic http://paste.ubuntu.com/p/nrxyRGP3By/
>
> The bionic kernel has also bumped:
> Linux aurorus 4.15.0-22-generic #24-Ubuntu SMP Wed May 16 12:14:36 UTC
> 2018 aarch64 aarch64 aarch64 GNU/Linux
>
> On Tue, May 22, 2018 at 7:10 PM, Ryan Harper <[email protected]> 
> wrote:
>> Looks like the ls -alR output contains more data; we can compare with bionic.
>>
>> On Tue, May 22, 2018 at 6:53 PM, Jason Hobbs <[email protected]> 
>> wrote:
>>> cd /sys/bus/pci/devices && grep -nr . *
>>>
>>> xenial:
>>> http://paste.ubuntu.com/p/F5qyvN2Qrr/
>>>
>>> On Tue, May 22, 2018 at 5:27 PM, Jason Hobbs <[email protected]> 
>>> wrote:
>>>> Do you really want a tar? How about ls -alR? xenial:
>>>>
>>>> http://paste.ubuntu.com/p/wyQ3kTsyBB/
>>>>
>>>> On Tue, May 22, 2018 at 5:14 PM, Jason Hobbs <[email protected]> 
>>>> wrote:
>>>>> ok; looks like 4.15.0-22-generic was just released and wasn't what I
>>>>> used in the first reproduction... I doubt that's it.
>>>>>
>>>>> On Tue, May 22, 2018 at 4:58 PM, Ryan Harper <[email protected]> 
>>>>> wrote:
>>>>>> Comparing the kernel logs, on Xenial, the second nic comes up:
>>>>>>
>>>>>> May 22 15:00:27 aurorus kernel: [   24.840500] IPv6:
>>>>>> ADDRCONF(NETDEV_UP): enP2p1s0f2: link is not ready
>>>>>> May 22 15:00:27 aurorus kernel: [   25.472391] thunder-nicvf
>>>>>> 0002:01:00.2 enP2p1s0f2: Link is Up 10000 Mbps Full duplex
>>>>>>
>>>>>> But on bionic, only f3 ever comes up.  Note this isn't a network
>>>>>> configuration issue, but rather the state of the NIC and the switch.
>>>>>> It doesn't appear to matter, since 0f3 is what gets bridged by juju
>>>>>> anyhow, but it does suggest that something is different.
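For what it's worth, the link state can be read straight from sysfs on both
hosts rather than inferred from dmesg; a quick sketch (interface names taken
from the dmesg lines above):

```shell
# Print operstate ("up"/"down") for the two thunder-nicvf interfaces.
for nic in enP2p1s0f2 enP2p1s0f3; do
    state=$(cat "/sys/class/net/${nic}/operstate" 2>/dev/null || echo "absent")
    echo "${nic}: ${state}"
done
```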
>>>>>>
>>>>>> There is a slight kernel version variance as well:
>>>>>>
>>>>>> Xenial:
>>>>>> May 22 15:00:27 aurorus kernel: [    0.000000] Linux version
>>>>>> 4.15.0-22-generic (buildd@bos02-arm64-038) (gcc version 5.4.0 20160609
>>>>>> (Ubuntu/Lin
>>>>>>
>>>>>> Bionic:
>>>>>> May 17 18:03:47 aurorus kernel: [    0.000000] Linux version
>>>>>> 4.15.0-20-generic (buildd@bos02-arm64-029) (gcc version 7.3.0
>>>>>> (Ubuntu/Linaro 7.3.
>>>>>>
>>>>>> Looks like Xenial does not use unified cgroup namespaces; not sure
>>>>>> what effect this may have on what's running in those lxd juju
>>>>>> containers.
>>>>>>
>>>>>> % grep DENIED *.log
>>>>>> bionic.log:May 17 18:19:33 aurorus kernel: [  983.592228] audit:
>>>>>> type=1400 audit(1526581173.043:70): apparmor="DENIED"
>>>>>> operation="mount" info="failed flags match" error=-13
>>>>>> profile="lxd-juju-657fe9-1-lxd-1_</var/lib/lxd>"
>>>>>> name="/sys/fs/cgroup/unified/" pid=24143 comm="systemd"
>>>>>> fstype="cgroup2" srcname="cgroup" flags="rw, nosuid, nodev, noexec"
>>>>>> bionic.log:May 17 18:19:33 aurorus kernel: [  983.592476] audit:
>>>>>> type=1400 audit(1526581173.043:71): apparmor="DENIED"
>>>>>> operation="mount" info="failed flags match" error=-13
>>>>>> profile="lxd-juju-657fe9-1-lxd-1_</var/lib/lxd>"
>>>>>> name="/sys/fs/cgroup/unified/" pid=24143 comm="systemd"
>>>>>> fstype="cgroup2" srcname="cgroup" flags="rw, nosuid, nodev, noexec"
>>>>>> bionic.log:May 17 18:19:41 aurorus kernel: [  991.818402] audit:
>>>>>> type=1400 audit(1526581181.267:88): apparmor="DENIED"
>>>>>> operation="mount" info="failed flags match" error=-13
>>>>>> profile="lxd-juju-657fe9-1-lxd-1_</var/lib/lxd>"
>>>>>> name="/run/systemd/unit-root/var/lib/lxcfs/" pid=24757
>>>>>> comm="(networkd)" flags="ro, nosuid, nodev, remount, bind"
>>>>>> bionic.log:May 17 18:19:46 aurorus kernel: [  997.271203] audit:
>>>>>> type=1400 audit(1526581186.719:90): apparmor="DENIED"
>>>>>> operation="mount" info="failed flags match" error=-13
>>>>>> profile="lxd-juju-657fe9-1-lxd-2_</var/lib/lxd>"
>>>>>> name="/sys/fs/cgroup/unified/" pid=25227 comm="systemd"
>>>>>> fstype="cgroup2" srcname="cgroup" flags="rw, nosuid, nodev, noexec"
>>>>>> bionic.log:May 17 18:19:46 aurorus kernel: [  997.271425] audit:
>>>>>> type=1400 audit(1526581186.723:91): apparmor="DENIED"
>>>>>> operation="mount" info="failed flags match" error=-13
>>>>>> profile="lxd-juju-657fe9-1-lxd-2_</var/lib/lxd>"
>>>>>> name="/sys/fs/cgroup/unified/" pid=25227 comm="systemd"
>>>>>> fstype="cgroup2" srcname="cgroup" flags="rw, nosuid, nodev, noexec"
>>>>>> bionic.log:May 17 18:19:55 aurorus kernel: [ 1006.285863] audit:
>>>>>> type=1400 audit(1526581195.735:108): apparmor="DENIED"
>>>>>> operation="mount" info="failed flags match" error=-13
>>>>>> profile="lxd-juju-657fe9-1-lxd-2_</var/lib/lxd>"
>>>>>> name="/run/systemd/unit-root/" pid=26209 comm="(networkd)" flags="ro,
>>>>>> remount, bind"
>>>>>> bionic.log:May 17 18:20:12 aurorus kernel: [ 1022.760512] audit:
>>>>>> type=1400 audit(1526581212.211:110): apparmor="DENIED"
>>>>>> operation="mount" info="failed flags match" error=-13
>>>>>> profile="lxd-juju-657fe9-1-lxd-0_</var/lib/lxd>"
>>>>>> name="/sys/fs/cgroup/unified/" pid=28344 comm="systemd"
>>>>>> fstype="cgroup2" srcname="cgroup" flags="rw, nosuid, nodev, noexec"
>>>>>> bionic.log:May 17 18:20:12 aurorus kernel: [ 1022.760713] audit:
>>>>>> type=1400 audit(1526581212.211:111): apparmor="DENIED"
>>>>>> operation="mount" info="failed flags match" error=-13
>>>>>> profile="lxd-juju-657fe9-1-lxd-0_</var/lib/lxd>"
>>>>>> name="/sys/fs/cgroup/unified/" pid=28344 comm="systemd"
>>>>>> fstype="cgroup2" srcname="cgroup" flags="rw, nosuid, nodev, noexec"
>>>>>> bionic.log:May 17 18:20:20 aurorus kernel: [ 1031.256448] audit:
>>>>>> type=1400 audit(1526581220.707:128): apparmor="DENIED"
>>>>>> operation="mount" info="failed flags match" error=-13
>>>>>> profile="lxd-juju-657fe9-1-lxd-0_</var/lib/lxd>"
>>>>>> name="/run/systemd/unit-root/" pid=29205 comm="(networkd)" flags="ro,
>>>>>> remount, bind"
>>>>>> bionic.log:May 17 18:30:03 aurorus kernel: [ 1613.787782] audit:
>>>>>> type=1400 audit(1526581803.277:151): apparmor="DENIED"
>>>>>> operation="mount" info="failed flags match" error=-13
>>>>>> profile="lxd-juju-657fe9-1-lxd-0_</var/lib/lxd>" name="/bin/"
>>>>>> pid=91926 comm="(arter.sh)" flags="ro, remount, bind"
>>>>>> bionic.log:May 17 18:30:03 aurorus kernel: [ 1613.832621] audit:
>>>>>> type=1400 audit(1526581803.321:152): apparmor="DENIED"
>>>>>> operation="mount" info="failed flags match" error=-13
>>>>>> profile="lxd-juju-657fe9-1-lxd-0_</var/lib/lxd>" name="/bin/"
>>>>>> pid=91949 comm="(y-helper)" flags="ro, remount, bind"
>>>>>>
>>>>>>
>>>>>> xenial.log:May 22 15:15:10 aurorus kernel: [  918.311740] audit:
>>>>>> type=1400 audit(1527002110.131:109): apparmor="DENIED"
>>>>>> operation="file_mmap"
>>>>>> namespace="root//lxd-juju-878ab5-1-lxd-1_<var-lib-lxd>"
>>>>>> profile="/usr/lib/lxd/lxd-bridge-proxy"
>>>>>> name="/usr/lib/lxd/lxd-bridge-proxy" pid=40973 comm="lxd-bridge-prox"
>>>>>> requested_mask="m" denied_mask="m" fsuid=100000 ouid=100000
>>>>>> xenial.log:May 22 15:15:11 aurorus kernel: [  919.605481] audit:
>>>>>> type=1400 audit(1527002111.427:115): apparmor="DENIED"
>>>>>> operation="file_mmap"
>>>>>> namespace="root//lxd-juju-878ab5-1-lxd-2_<var-lib-lxd>"
>>>>>> profile="/usr/lib/lxd/lxd-bridge-proxy"
>>>>>> name="/usr/lib/lxd/lxd-bridge-proxy" pid=41233 comm="lxd-bridge-prox"
>>>>>> requested_mask="m" denied_mask="m" fsuid=100000 ouid=100000
>>>>>>
>>>>>> Looking at the nova.pci.utils code, the different errors seem to be
>>>>>> related to sysfs entries:
>>>>>>
>>>>>> https://git.openstack.org/cgit/openstack/nova/tree/nova/pci/utils.py?id=e919720e08fae5c07cecda00ac2d51b0a09f533e#n196
>>>>>>
>>>>>> If the sysfs path exists, then we go "further down" the hole and get
>>>>>> an error like the one on bionic, but if the sysfs path does not exist,
>>>>>> then we get the exception we see on Xenial.
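Roughly (a sketch only — the actual nova code is Python and does more), the
probe boils down to whether this sysfs entry exists for the interface:

```shell
# probe_sysfs: rough shell analogue of the sysfs lookup nova.pci.utils
# makes for an interface name before deciding which code path to take.
probe_sysfs() {
    dev_path="/sys/class/net/$1/device"
    if [ -e "$dev_path" ]; then
        # entry exists: nova walks further down and hits the bionic-style error
        echo "exists: $(readlink -f "$dev_path")"
    else
        # entry missing: nova raises the exception we see on xenial
        echo "missing: $dev_path"
    fi
}

probe_sysfs enP2p1s0f1   # the interface named in the libvirtError
```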
>>>>>>
>>>>>> Can we get a tar of /sys for both to see if this confirms the
>>>>>> suspicion that we're taking different paths due to differing kernels?
>>>>>>
>>>>>>
>>>>>> On Tue, May 22, 2018 at 3:27 PM, Jason Hobbs <[email protected]> 
>>>>>> wrote:
>>>>>>> marked new on nova-compute-charm due to rharper's comment #18, and new
>>>>>>> on libvirt because I've posted all the requested logs now.
>>>>>>>
>>>>>>> --
>>>>>>> You received this bug notification because you are subscribed to the bug
>>>>>>> report.
>>>>>>> https://bugs.launchpad.net/bugs/1771662
>>>>>>>
>>>>>>> Title:
>>>>>>>   libvirtError: Node device not found: no node device with matching name
>>>>>>>
>>>>>>> To manage notifications about this bug go to:
>>>>>>> https://bugs.launchpad.net/charm-nova-compute/+bug/1771662/+subscriptions
>>>>>>
>>>>>> Status in OpenStack nova-compute charm:
>>>>>>   New
>>>>>> Status in libvirt package in Ubuntu:
>>>>>>   New
>>>>>>
>>>>>> Bug description:
>>>>>>   After deploying openstack on arm64 using bionic and queens, no
>>>>>>   hypervisors show up. On my compute nodes, I have an error like:
>>>>>>
>>>>>>   2018-05-16 19:23:08.165 282170 ERROR nova.compute.manager
>>>>>>   libvirtError: Node device not found: no node device with matching name
>>>>>>   'net_enP2p1s0f1_40_8d_5c_ba_b8_d2'
>>>>>>
>>>>>>   In my /var/log/nova/nova-compute.log
>>>>>>
>>>>>>   I'm not sure why this is happening - I don't use enP2p1s0f1 for
>>>>>>   anything.
>>>>>>
>>>>>>   There are a lot of interesting messages about that interface in syslog:
>>>>>>   http://paste.ubuntu.com/p/8WT8NqCbCf/
>>>>>>
>>>>>>   Here is my bundle: http://paste.ubuntu.com/p/fWWs6r8Nr5/
>>>>>>
>>>>>>   The same bundle works fine for xenial-queens, with the source changed
>>>>>>   to the cloud-archive, and using stable charms rather than -next. I hit
>>>>>>   this same issue on bionic queens using either stable or next charms.
>>>>>>
>>>>>>   This thread has some related info, I think:
>>>>>>   https://www.spinics.net/linux/fedora/libvir/msg160975.html
>>>>>>
>>>>>>   This is with juju 2.4 beta 2.
>>>>>>
>>>>>>   Package versions on affected system:
>>>>>>   http://paste.ubuntu.com/p/yfQH3KJzng/
>>>>>>
>>>
>>
>


-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
