[Bug 1857074] Re: Cavium ThunderX CN88XX Panic : Unknown reason

2020-01-15 Thread Alexandru Avadanii
Hi,
Not sure this is useful (since it might be obvious), but adding `nopti` to 
kernel parameters works around the issue, indicating this is indeed related to 
kpti.
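
To confirm that the parameter actually reached the running kernel, a minimal Python sketch follows. It only parses a command-line string such as the contents of /proc/cmdline. The helper name `has_kernel_param` and the sample string are hypothetical and are not part of any tool mentioned in this report.

```python
def has_kernel_param(cmdline, name):
    """Return True if `name` appears as a bare flag or as `name=value`."""
    for token in cmdline.split():
        # Compare only the part before '=', so "console=ttyS0" matches "console".
        key = token.split("=", 1)[0]
        if key == name:
            return True
    return False


# Illustrative sample only; on a real system, read /proc/cmdline instead.
sample = "root=/dev/mapper/os-root ro console=ttyAMA0,115200 nopti"
print(has_kernel_param(sample, "nopti"))
```

On a live system the string would come from `open("/proc/cmdline").read()` rather than from a hard-coded sample.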

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1857074

Title:
  Cavium ThunderX CN88XX Panic : Unknown reason

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1857074/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1797332] Re: qemu nested virtualization is not working with Ubuntu16.04 + Intel CPU

2019-02-22 Thread Alexandru Avadanii
FWIW, bumping the kernel on the host (and most likely on the L1 VMs too) should 
work.
The HWE kernel in Xenial is the same version (4.15) as the kernel used by 
Bionic (18.04), so this should fix the problem:
$ apt install linux-generic-hwe-16.04
$ reboot

BR,
Alex

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1797332

Title:
  qemu nested virtualization is not working with Ubuntu16.04 + Intel CPU

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1797332/+subscriptions


[Bug 1813371] [NEW] OVS 2.9+ systemd integration issues

2019-01-25 Thread Alexandru Avadanii
Public bug reported:

For a few months now, we have been using OVS 2.9 (or newer) on Ubuntu Xenial in 
OPNFV, both with and without DPDK.
A while ago, we observed a couple of rare race conditions when multiple Linux 
interfaces/bridges are mixed with OVS ports/bridges. We also observed races 
between DPDK binding and openvswitch-switch (actually openvswitch-switch-dpdk 
configured using alternatives).
We worked around those issues by using a solution derived from the official OVS 
Debian readme, which recommends avoiding using `auto` for OVS bridges. Instead, 
we used `auto` for OVS bridges, but omitted the `auto` for the OVS ports in 
them. That worked almost perfectly for a while.

However, we recently bumped a few unrelated software components (since
we migrated from Queens to Rocky in OPNFV) and we started experiencing
race conditions again.

So I dug a bit and found a couple of things:

1. Broken dependency between ovsdb-server/ovs-vswitchd systemd services and 
networking.service
This is probably a copy-paste error from [1]: `Before=network.service` 
should probably be `Before=networking.service` on Debian systems.
The consequence is quite serious - on Debian systems, the OVS services start 
*after* networking.service.
Changing this leads to a service order change, which turns out to be quite the 
rabbit hole ...

2. Outdated ifupdown scripts
For example /etc/network/if-pre-up.d/openvswitch still references the old 
`openvswitch-nonetwork.service`.
Luckily, this is not critical, as the fallback uses `service openvswitch-switch 
[...]`, so I'm not sure this should be changed, but I thought it's worth 
mentioning.

3. Debian OVS does *not* handle OVS bridges without `auto`
Upstream OVS readme recommends omitting `auto` for OVS bridges, as mentioned 
earlier, to avoid exactly the race conditions we saw.
Although following the recommendation in the upstream readme leads to a working 
system (`networking.service` no longer fails to start due to missing OVS 
bridges and/or vice-versa - ovs services no longer complain about Linux 
interfaces being in down state when trying to add them to OVS bridges), OVS 
bridges end up in DOWN state since nobody bothers to ifup them.
Imo, networking.service (or some *other* mechanism) should call `/sbin/ifup 
--allow=ovs -a --read-environment` *after* the initial `/sbin/ifup -a 
--read-environment` (provided the ordering issue #1 was changed to start OVS 
first, of course).

4. ovsdb-server should never start before DPDK service if DPDK is installed
This should actually be easy to fix and I have to admit I haven't run into it 
lately, although I remember it being an issue a while ago.
Anyway, a simple `After=dpdk.service` wouldn't hurt.

5. If OVS starts before networking.service, cloud-init causes cyclic 
dependencies
If we configure OVS services to start first, systemd might decide to randomly 
remove some units to break the following circular dependency:
  ovs-vswitchd --> ovsdb-server -(default dep)-> sysinit.target -->
  cloud-init.service --> networking.service --> ovs-vswitchd
In my tests, I just set `DefaultDependencies=no` for OVS services, although 
this might require explicitly adding back some of the indirect dependencies of 
`sysinit.target`, so this change should be treated with care.
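
The circular dependency in item 5 can be modelled with a small Python sketch. It is only an illustration of the ordering problem, not of how systemd resolves it: the edges are copied from the chain listed above, the function name `find_cycle` is hypothetical, and dropping the `sysinit.target` edge is a simplified stand-in for `DefaultDependencies=no`.

```python
def find_cycle(graph):
    """Return one cycle as a list of nodes (first node repeated last), or None."""
    WHITE, GREY, BLACK = 0, 1, 2
    state = {}
    stack = []

    def visit(node):
        state[node] = GREY
        stack.append(node)
        for nxt in graph.get(node, ()):
            seen = state.get(nxt, WHITE)
            if seen == GREY:
                # nxt is still on the current path, so the path loops back to it.
                return stack[stack.index(nxt):] + [nxt]
            if seen == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        state[node] = BLACK
        return None

    for node in graph:
        if state.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Edges follow the "A --> B" chain quoted in item 5.
EDGES = {
    "ovs-vswitchd": ["ovsdb-server"],
    "ovsdb-server": ["sysinit.target"],  # the default dependency
    "sysinit.target": ["cloud-init.service"],
    "cloud-init.service": ["networking.service"],
    "networking.service": ["ovs-vswitchd"],
}

# Assumption: DefaultDependencies=no is modelled as removing that one edge.
WITHOUT_DEFAULT_DEP = {**EDGES, "ovsdb-server": []}

print(find_cycle(EDGES))
print(find_cycle(WITHOUT_DEFAULT_DEP))
```

With the default dependency in place the search reports the loop through all five units. With that edge removed it finds none, which matches the behaviour described for the drop-ins.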

On my test systems, I didn't bother handling #2; for the others I
have some systemd drop-ins (see below), which so far seem to produce
reproducible working environments.

# cat /etc/systemd/system/ovsdb-server.service.d/override.conf
[Unit]
After=dpdk.service
Before=networking.service
DefaultDependencies=no

# cat /etc/systemd/system/networking.service.d/ovs_workaround.conf
[Service]
ExecStart=/sbin/ifup --allow=ovs -a --read-environment

# cat /etc/systemd/system/ovs-vswitchd.service.d/override.conf
[Unit]
Before=networking.service
DefaultDependencies=no

# lsb_release -rd
Description:    Ubuntu 16.04.5 LTS
Release:        16.04

# apt-cache policy openvswitch-switch
openvswitch-switch:
  Installed: 2.9.0-0ubuntu1~cloud0
  Candidate: 2.9.0-0ubuntu1~cloud0
  Version table:
 *** 2.9.0-0ubuntu1~cloud0 500
        500 http://ubuntu-cloud.archive.canonical.com/ubuntu xenial-updates/queens/main amd64 Packages
        100 /var/lib/dpkg/status

[1] https://github.com/openvswitch/ovs/blob/master/rhel/usr_lib_systemd_system_ovsdb-server.service#L4

** Affects: openvswitch (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1813371

Title:
  OVS 2.9+ systemd integration issues

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/openvswitch/+bug/1813371/+subscriptions


[Bug 1582181] Re: AArch64: slow cpuinfo due to redundant loop

2018-03-19 Thread Alexandru Avadanii
Upstream PR merged.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1582181

Title:
  AArch64: slow cpuinfo due to redundant loop

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/lshw/+bug/1582181/+subscriptions


[Bug 1582181] Re: AArch64: slow cpuinfo due to redundant loop

2017-12-28 Thread Alexandru Avadanii
I see nobody acted on this, so I sent a PR [1] upstream.
Will update this ticket if it gets pulled.

[1] https://github.com/lyonel/lshw/pull/36


[Bug 1582181] Re: AArch64: slow cpuinfo due to redundant loop

2017-09-08 Thread Alexandru Avadanii
Hi,
If it helps, we have an old DEB package at [1].
I think it's based on the lshw version that was used by Trusty or Xenial at 
that time.

[1] http://linux.enea.com/mos-repos/ubuntu/10.0/pool/main/l/lshw/


[Bug 1673564] Re: ThunderX: soft lockup on 4.8+ kernels when running qemu-efi with vhost=on

2017-03-27 Thread Alexandru Avadanii
Hi, Dann,
Thanks for looking into this!
One more thing: we blacklisted the module "vhost_net", and that bypasses the 
issue.
I know it's not the right direction for finding a fix, but maybe it helps with 
the debug.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1673564

Title:
  ThunderX: soft lockup on 4.8+ kernels when running qemu-efi with
  vhost=on

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/edk2/+bug/1673564/+subscriptions



[Bug 1674837] Re: thunder nic: RX_PACKET_DIS fix regression with Extreme Networks switch

2017-03-21 Thread Alexandru Avadanii
** Changed in: linux (Ubuntu)
   Status: Incomplete => Confirmed

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1674837

Title:
  thunder nic: RX_PACKET_DIS fix regression with Extreme Networks switch

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1674837/+subscriptions



[Bug 1630038] Re: thunder nic: avoid link delays due to RX_PACKET_DIS

2017-03-21 Thread Alexandru Avadanii
Hi, Dann,
I created a new bug and pasted the same info as above at [1].
Afaict, there is no useful information in the logs when link training fails.

[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1674837

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1630038

Title:
  thunder nic: avoid link delays due to RX_PACKET_DIS

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1630038/+subscriptions



[Bug 1674837] [NEW] thunder nic: RX_PACKET_DIS fix regression with Extreme Networks switch

2017-03-21 Thread Alexandru Avadanii
Public bug reported:

Upstream backport [3] introduced a regression with ThunderX nodes (CRB-1S, 
CRB-2S) and our 10G switch (Extreme Networks x670 10GE L3).
We have opened a downstream bug report [1], where we temporarily bypassed this 
by pinning the kernel to 4.4.0-45.
I also tested 4.8 (multiple builds), 4.10 and 4.11-rc1 (vanilla); all are still 
affected by link training issues with our switch, with 4.11-rc1 not working at 
all and reporting more issues (logs attached in a different LP comment [2]).

I also confirmed that reverting the commit in question fixes the issues
in our setup (tested on top of 4.10.0-13 linux-image-generic-hwe-edge
package from Xenial).

BR,
Alex

[1] https://jira.opnfv.org/browse/ARMBAND-168
[2] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1672521/comments/17
[3] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1630038

** Affects: linux (Ubuntu)
 Importance: Undecided
 Status: New



[Bug 1674837] Re: thunder nic: RX_PACKET_DIS fix regression with Extreme Networks switch

2017-03-21 Thread Alexandru Avadanii
Let me know if I should attach any logs, although there are *no* traces
anywhere, at least with default log levels (without recompiling).



[Bug 1672521] Re: ThunderX: soft lockup on 4.8+ kernels

2017-03-21 Thread Alexandru Avadanii
Hi, Dann,
First of all, I think the bug title is misleading, as this issue happens on all 
kernels we tested (4.4.0-45..66, 4.8.0-x, 4.10.0-x etc).

To be fair, we haven't hit this exact bug (or at least I don't think we did)
in practice, i.e. without running stress-ng, 4.4.0-x never ever crashed.

The VM use case turned out to be a different bug [1], triggered 100% by
AAVMF + vhost.

Let me know if I can provide anything else.
I consider this particular bug minor (if we don't poke it with stress-ng, 
everything works well), compared to AAVMF + vhost [1].

Thanks,
Alex

[1] https://bugs.launchpad.net/ubuntu/+source/edk2/+bug/1673564

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1672521

Title:
  ThunderX: soft lockup on 4.8+ kernels

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1672521/+subscriptions



[Bug 1630038] Re: thunder nic: avoid link delays due to RX_PACKET_DIS

2017-03-21 Thread Alexandru Avadanii
Hi,

1) We tested different models (CRB-1S, CRB-2S) - all behave the same.
2) Please check the logs "ThunderX 4.11-rc1 console log" in [2] linked above. I 
don't think firmware version makes a difference for this issue (we saw the same 
bug with firmwares: T22, T27, T31).

All in all, this issue seems pretty tied to the switch we use, and all
firmware/board model combinations behaved the same ...

Thanks,
Alex



[Bug 1630038] Re: thunder nic: avoid link delays due to RX_PACKET_DIS

2017-03-19 Thread Alexandru Avadanii
Hi,
This fix introduced a regression with ThunderX nodes (CRB-1S, CRB-2S) and our 
10G switch (Extreme Networks x670 10GE L3).
We have opened a downstream bug report [1], where we temporarily bypassed this 
by pinning the kernel to 4.4.0-45.
I also tested 4.8 (multiple builds), 4.10 and 4.11-rc1 (vanilla); all are still 
affected by link training issues with our switch, with 4.11-rc1 not working at 
all and reporting more issues (logs attached in a different LP comment [2]).

BR,
Alex

[1] https://jira.opnfv.org/browse/ARMBAND-168
[2] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1672521/comments/17



[Bug 1672521] Re: ThunderX: soft lockup on 4.8+ kernels

2017-03-14 Thread Alexandru Avadanii
4.11-rc1 console log attached.
Board firmware is latest available on Gigabyte's site (T31).

1. Install 4.11-rc1 (`make modules_install install`) and reboot
2. Observe networking driver issues in boot log
   Dmesg: 4.11-rc1_dmesg_on_clean_boot.log [3]
3. Try `ping google.com`, obviously not working
4. `modprobe -r nicpf` (leads to multiple oopses in dmesg)
   Console log: 4.11-rc1_modprobe_r_nicpf_output.log [1]
   Dmesg: 4.11-rc1_dmesg_after_modprobe_r_nicpf.log [2]
5. `modprobe nicpf` (this usually works, and afterwards network is up and 
running - not sure whether ALL interfaces are ok, as not all of them are 
connected) - however, this time it led to a soft lockup (see full logs attached 
here).

[1] http://paste.ubuntu.com/24178311/
[2] http://paste.ubuntu.com/24178312/
[3] http://paste.ubuntu.com/24178313/

** Attachment added: "ThunderX 4.11-rc1 console log"
   
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1672521/+attachment/4837770/+files/thunderx_4.11_rc1_console_log.txt



[Bug 1672521] Re: ThunderX: soft lockup on 4.8+ kernels

2017-03-14 Thread Alexandru Avadanii
Hi,
I tried out 4.11-rc1 a few days ago. Unfortunately, I did not get the board to 
boot properly from the start, since ThunderX networking drivers failed to 
allocate MSI-X/MSI interrupts, and polling on some registers also failed ...

So, with 4.11-rc1, at least one networking interface never came
online due to unmapped interrupts/failed polling, but unloading `nicpf`
and reloading it seemed to work (networking worked after this). After
this, the soft lockup happened, but I can't be sure I did not mess
something else up.

Let me try this again and get back to you with some proper logs, but off
the top of my head, things got worse with 4.11-rc1 ...

Thanks,
Alex



[Bug 1672521] ProcModules.txt

2017-03-13 Thread Alexandru Avadanii
apport information

** Attachment added: "ProcModules.txt"
   
https://bugs.launchpad.net/bugs/1672521/+attachment/4837219/+files/ProcModules.txt



[Bug 1672521] ProcInterrupts.txt

2017-03-13 Thread Alexandru Avadanii
apport information

** Attachment added: "ProcInterrupts.txt"
   
https://bugs.launchpad.net/bugs/1672521/+attachment/4837218/+files/ProcInterrupts.txt



[Bug 1672521] UdevDb.txt

2017-03-13 Thread Alexandru Avadanii
apport information

** Attachment added: "UdevDb.txt"
   https://bugs.launchpad.net/bugs/1672521/+attachment/4837220/+files/UdevDb.txt



[Bug 1672521] Lsusb.txt

2017-03-13 Thread Alexandru Avadanii
apport information

** Attachment added: "Lsusb.txt"
   https://bugs.launchpad.net/bugs/1672521/+attachment/4837216/+files/Lsusb.txt



[Bug 1672521] Lspci.txt

2017-03-13 Thread Alexandru Avadanii
apport information

** Attachment added: "Lspci.txt"
   https://bugs.launchpad.net/bugs/1672521/+attachment/4837215/+files/Lspci.txt



[Bug 1672521] ProcCpuinfo.txt

2017-03-13 Thread Alexandru Avadanii
apport information

** Attachment added: "ProcCpuinfo.txt"
   
https://bugs.launchpad.net/bugs/1672521/+attachment/4837217/+files/ProcCpuinfo.txt



[Bug 1672521] WifiSyslog.txt

2017-03-13 Thread Alexandru Avadanii
apport information

** Attachment added: "WifiSyslog.txt"
   
https://bugs.launchpad.net/bugs/1672521/+attachment/4837221/+files/WifiSyslog.txt



[Bug 1672521] Re: ThunderX: soft lockup on 4.8+ kernels

2017-03-13 Thread Alexandru Avadanii
apport information

** Tags added: apport-collected xenial

** Description changed:

  I have been trying to easily reproduce this for days.
  We initially observed it in OPNFV Armband, when we tried to upgrade our 
Ubuntu Xenial installation kernel to linux-image-generic-hwe-16.04 (4.8).
  
  In our environment, this was easily triggered on compute nodes, when 
launching multiple VMs (we suspected OVS, QEMU etc.).
  However, in order to rule out our specifics, we looked for a simple way to 
reproduce it on all ThunderX nodes we have access to, and we finally found it:
  
  $ apt-get install stress-ng
  $ stress-ng --hdd 1024
  
  We tested different FW versions, provided by both chip/board manufacturers, 
and with all of them the result is 100% reproducible, leading to a kernel Oops 
[1]:
  [  726.070531] INFO: task kworker/0:1:312 blocked for more than 120 seconds.
  [  726.077908]   Tainted: GW I 4.8.0-41-generic #44~16.04.1-Ubuntu
  [  726.085850] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [  726.094383] kworker/0:1 D 080861bc 0   312  2 0x
  [  726.094401] Workqueue: events vmstat_shepherd
  [  726.094404] Call trace:
  [  726.094411] [] __switch_to+0x94/0xa8
  [  726.094418] [] __schedule+0x224/0x718
  [  726.094421] [] schedule+0x38/0x98
  [  726.094425] [] schedule_preempt_disabled+0x14/0x20
  [  726.094428] [] __mutex_lock_slowpath+0xd4/0x168
  [  726.094431] [] mutex_lock+0x58/0x70
  [  726.094437] [] get_online_cpus+0x44/0x70
  [  726.094440] [] vmstat_shepherd+0x3c/0xe8
  [  726.094446] [] process_one_work+0x150/0x478
  [  726.094449] [] worker_thread+0x50/0x4b8
  [  726.094453] [] kthread+0xec/0x100
  [  726.094456] [] ret_from_fork+0x10/0x40
  
  
  Over the last few days, I tested all 4.8-* and 4.10 (zesty backport), the 
soft lockup happens with each and every one of them.
  On the other hand, 4.4.0-45-generic seems to work perfectly fine (probably 
newer 4.4.0-* too, but due to a regression in the ethernet drivers after 
4.4.0-45, we can't test those with ease) under normal conditions, yet running 
stress-ng leads to the same oops.
  
  [1] http://paste.ubuntu.com/24172516/
+ --- 
+ AlsaDevices:
+  total 0
+  crw-rw 1 root audio 116,  1 Mar 13 19:27 seq
+  crw-rw 1 root audio 116, 33 Mar 13 19:27 timer
+ AplayDevices: Error: [Errno 2] No such file or directory
+ ApportVersion: 2.20.1-0ubuntu2.5
+ Architecture: arm64
+ ArecordDevices: Error: [Errno 2] No such file or directory
+ AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', 
'/dev/snd/timer'] failed with exit code 1:
+ DistroRelease: Ubuntu 16.04
+ IwConfig: Error: [Errno 2] No such file or directory
+ MachineType: GIGABYTE R120-T30
+ Package: linux (not installed)
+ PciMultimedia:
+  
+ ProcEnviron:
+  TERM=vt220
+  PATH=(custom, no user)
+  XDG_RUNTIME_DIR=
+  LANG=en_US.UTF-8
+  SHELL=/bin/bash
+ ProcFB: 0 astdrmfb
+ ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.8.0-41-generic 
root=/dev/mapper/os-root ro console=tty0 console=ttyS0,115200 
console=ttyAMA0,115200 net.ifnames=1 biosdevname=0 rootdelay=90 nomodeset quiet 
splash vt.handoff=7
+ ProcVersionSignature: Ubuntu 4.8.0-41.44~16.04.1-generic 4.8.17
+ RelatedPackageVersions:
+  linux-restricted-modules-4.8.0-41-generic N/A
+  linux-backports-modules-4.8.0-41-generic  N/A
+  linux-firmware 1.157.8
+ RfKill: Error: [Errno 2] No such file or directory
+ Tags:  xenial
+ Uname: Linux 4.8.0-41-generic aarch64
+ UpgradeStatus: No upgrade log present (probably fresh install)
+ UserGroups:
+  
+ _MarkForUpload: True
+ dmi.bios.date: 11/22/2016
+ dmi.bios.vendor: GIGABYTE
+ dmi.bios.version: T22
+ dmi.board.asset.tag: 01234567890123456789AB
+ dmi.board.name: MT30-GS0
+ dmi.board.vendor: GIGABYTE
+ dmi.board.version: 01234567
+ dmi.chassis.asset.tag: 01234567890123456789AB
+ dmi.chassis.type: 17
+ dmi.chassis.vendor: GIGABYTE
+ dmi.chassis.version: 01234567
+ dmi.modalias: 
dmi:bvnGIGABYTE:bvrT22:bd11/22/2016:svnGIGABYTE:pnR120-T30:pvr0100:rvnGIGABYTE:rnMT30-GS0:rvr01234567:cvnGIGABYTE:ct17:cvr01234567:
+ dmi.product.name: R120-T30
+ dmi.product.version: 0100
+ dmi.sys.vendor: GIGABYTE

** Attachment added: "CRDA.txt"
   https://bugs.launchpad.net/bugs/1672521/+attachment/4837212/+files/CRDA.txt



[Bug 1672521] CurrentDmesg.txt

2017-03-13 Thread Alexandru Avadanii
apport information

** Attachment added: "CurrentDmesg.txt"
   
https://bugs.launchpad.net/bugs/1672521/+attachment/4837213/+files/CurrentDmesg.txt



[Bug 1672521] JournalErrors.txt

2017-03-13 Thread Alexandru Avadanii
apport information

** Attachment added: "JournalErrors.txt"
   
https://bugs.launchpad.net/bugs/1672521/+attachment/4837214/+files/JournalErrors.txt



[Bug 1672521] [NEW] ThunderX: soft lockup on 4.8+ kernels

2017-03-13 Thread Alexandru Avadanii
Public bug reported:

I have been trying to easily reproduce this for days.
We initially observed it in OPNFV Armband, when we tried to upgrade our Ubuntu 
Xenial installation kernel to linux-image-generic-hwe-16.04 (4.8).

In our environment, this was easily triggered on compute nodes, when launching 
multiple VMs (we suspected OVS, QEMU etc.).
However, in order to rule out our specifics, we looked for a simple way to 
reproduce it on all ThunderX nodes we have access to, and we finally found it:

$ apt-get install stress-ng
$ stress-ng --hdd 1024

We tested different FW versions, provided by both chip/board manufacturers, and 
with all of them the result is 100% reproducible, leading to a kernel Oops [1]:
[  726.070531] INFO: task kworker/0:1:312 blocked for more than 120 seconds.
[  726.077908]   Tainted: GW I 4.8.0-41-generic #44~16.04.1-Ubuntu
[  726.085850] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  726.094383] kworker/0:1 D 080861bc 0   312  2 0x
[  726.094401] Workqueue: events vmstat_shepherd
[  726.094404] Call trace:
[  726.094411] [] __switch_to+0x94/0xa8
[  726.094418] [] __schedule+0x224/0x718
[  726.094421] [] schedule+0x38/0x98
[  726.094425] [] schedule_preempt_disabled+0x14/0x20
[  726.094428] [] __mutex_lock_slowpath+0xd4/0x168
[  726.094431] [] mutex_lock+0x58/0x70
[  726.094437] [] get_online_cpus+0x44/0x70
[  726.094440] [] vmstat_shepherd+0x3c/0xe8
[  726.094446] [] process_one_work+0x150/0x478
[  726.094449] [] worker_thread+0x50/0x4b8
[  726.094453] [] kthread+0xec/0x100
[  726.094456] [] ret_from_fork+0x10/0x40


Over the last few days, I tested all 4.8-* and 4.10 (zesty backport), the soft 
lockup happens with each and every one of them.
On the other hand, 4.4.0-45-generic seems to work perfectly fine (probably 
newer 4.4.0-* too, but due to a regression in the ethernet drivers after 
4.4.0-45, we can't test those with ease) under normal conditions, yet running 
stress-ng leads to the same oops.

[1] http://paste.ubuntu.com/24172516/

** Affects: linux (Ubuntu)
 Importance: Undecided
 Status: New



[Bug 1582181] [NEW] AArch64: slow cpuinfo due to redundant loop

2016-05-16 Thread Alexandru Avadanii
Public bug reported:

lshw on AArch64 hardware is painfully slow.
This affects both lshw in current Ubuntu releases and vanilla upstream.

For a 48 core node, cpuinfo parsing added up to 30 seconds (8 lines
per core in /proc/cpuinfo add up to 384 lines to parse).

For a 96 core node, parsing took up to 5 minutes (!).

I think the problem was introduced by [1], and can be summarized as:
- CPU capabilities should be added only to the current CPU core,
  and NOT to all previous CPU cores parsed.

My suggestion is dropping the loop in [1], thus calling the 
and  only for currentcpu.

I put together a small patch (basically removing the for loop in question)
at [2] (or see attachment), which should be applied on top of version
"02.16-2ubuntu1.3" from Ubuntu Trusty 14.04.

After applying the patch at [2], parsing for the above system (48 cores)
takes less than 1 second (instead of 30s), with the exact same results ...
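
The cost of the redundant loop can be illustrated with a small Python model. This is a sketch under assumptions, not lshw's actual code: the function names, the capability strings and the update counter are all hypothetical. It only shows that re-applying capabilities to every previously parsed core gives the same end result with far more work.

```python
def parse_with_redundant_loop(num_cores, caps):
    """For each core parsed, re-apply its capabilities to every core seen so far."""
    cores = []
    updates = 0
    for _ in range(num_cores):
        cores.append(set())
        for cpu in cores:  # redundant: also touches all previously parsed cores
            for cap in caps:
                cpu.add(cap)
                updates += 1
    return cores, updates


def parse_current_only(num_cores, caps):
    """Apply capabilities only to the core currently being parsed."""
    cores = []
    updates = 0
    for _ in range(num_cores):
        current = set()
        cores.append(current)
        for cap in caps:
            current.add(cap)
            updates += 1
    return cores, updates


caps = ["fp", "asimd", "crc32"]  # illustrative capability names
slow, slow_updates = parse_with_redundant_loop(48, caps)
fast, fast_updates = parse_current_only(48, caps)
print(slow == fast, slow_updates, fast_updates)
```

In this model the redundant variant's update count grows quadratically with the number of cores, while the current-core-only variant grows linearly. The model makes no attempt to reproduce the measured timings above.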

[1]
https://github.com/lyonel/lshw/commit/beb89de5a3c10449fe73f1c77b2486d868e5bc9a#diff-f4010714738fa4cdd5999499579da2b3R217

[2] http://paste.ubuntu.com/16456620/

# lsb_release -rd
Description:    Ubuntu 14.04.4 LTS
Release:        14.04

BR,
Alex

** Affects: lshw (Ubuntu)
 Importance: Undecided
 Status: New

** Patch added: "AArch64-cpuinfo-Remove-redundant-cpu-caps-loop.patch"
   
https://bugs.launchpad.net/bugs/1582181/+attachment/4663771/+files/AArch64-cpuinfo-Remove-redundant-cpu-caps-loop.patch
