Re: [Kernel-packages] [Bug 1857074] Re: Cavium ThunderX CN88XX Panic : Unknown reason

2020-01-16 Thread dann frazier
On Wed, Jan 15, 2020 at 11:28 PM Juerg Haefliger wrote: > > On Thu, 16 Jan 2020 02:14:16 - > dann frazier wrote: > > > I built a kernel with the proposed patches[*] and ran a reboot/kernel > > compile test on 4 systems. The tests survived 46 total iterations > > (~12/system) before I

[Kernel-packages] [Bug 1857074] Re: Cavium ThunderX CN88XX Panic : Unknown reason

2020-01-16 Thread Juerg Haefliger
The root cause of this problem is that the order in which errata and cpu features are evaluated and enabled is reversed. On ThunderX boxes that have erratum 27456 enabled, KPTI needs to be turned off to prevent I-cache clobbering. But due to the reversed order, the callback to enable KPTI is
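A toy model of the ordering problem described above, as a rough sketch only. All names here (`cpu_state_sketch`, `boot_sketch`, and so on) are hypothetical and are not kernel symbols. It assumes that the KPTI decision consults a flag recording whether the erratum 27456 workaround is active, so the decision is only correct if the errata pass has already run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified boot-time CPU state. */
struct cpu_state_sketch {
    bool erratum_27456;  /* workaround has been detected and recorded */
    bool kpti_enabled;   /* kernel page-table isolation is switched on */
};

/* Errata pass: record whether this CPU needs the workaround. */
static void enable_errata_sketch(struct cpu_state_sketch *s, bool affected)
{
    assert(s != NULL);
    s->erratum_27456 = affected;
}

/* Feature pass: KPTI must stay off when the erratum flag is set.
 * Assumption: the decision only sees what the errata pass recorded so far. */
static void enable_features_sketch(struct cpu_state_sketch *s)
{
    assert(s != NULL);
    s->kpti_enabled = !s->erratum_27456;
}

/* Run both passes in the requested order and return the resulting state. */
static struct cpu_state_sketch boot_sketch(bool affected, bool errata_first)
{
    struct cpu_state_sketch s = { false, false };

    if (errata_first) {
        enable_errata_sketch(&s, affected);
        enable_features_sketch(&s);
    } else {
        /* Reversed order: the flag is still clear when KPTI is decided. */
        enable_features_sketch(&s);
        enable_errata_sketch(&s, affected);
    }
    return s;
}
```

In this model, an affected CPU booted with `errata_first` set to `false` ends up with both the erratum flag and KPTI set, which is the combination the message says must be avoided.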

Re: [Kernel-packages] [Bug 1857074] Re: Cavium ThunderX CN88XX Panic : Unknown reason

2020-01-15 Thread Juerg Haefliger
On Thu, 16 Jan 2020 02:14:16 - dann frazier wrote: > I built a kernel with the proposed patches[*] and ran a reboot/kernel > compile test on 4 systems. The tests survived 46 total iterations > (~12/system) before I interrupted. Two systems failed with "Synchronous > External Abort:

[Kernel-packages] [Bug 1857074] Re: Cavium ThunderX CN88XX Panic : Unknown reason

2020-01-15 Thread Juerg Haefliger
@alexandru-avadanii Thanks for mentioning 'nopti', which made me realize that there was something fishy about KPTI that I hadn't paid much attention to before. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.

[Kernel-packages] [Bug 1857074] Re: Cavium ThunderX CN88XX Panic : Unknown reason

2020-01-15 Thread dann frazier
I've now seen an occurrence of the SEA/ECC issue on a system w/ the 4.15.0-70 kernel, so I think we can safely assume this is not a regression related to this bug.

[Kernel-packages] [Bug 1857074] Re: Cavium ThunderX CN88XX Panic : Unknown reason

2020-01-15 Thread dann frazier
I built a kernel with the proposed patches[*] and ran a reboot/kernel compile test on 4 systems. The tests survived 46 total iterations (~12/system) before I interrupted. Two systems failed with "Synchronous External Abort: synchronous parity or ECC error" errors. I've reverted the systems back

[Kernel-packages] [Bug 1857074] Re: Cavium ThunderX CN88XX Panic : Unknown reason

2020-01-15 Thread Juerg Haefliger
Yeah I've noticed that as well but at the time thought that nopti just changes the timing somehow and masks the problem. But I'm not so sure anymore so thanks for mentioning it!

[Kernel-packages] [Bug 1857074] Re: Cavium ThunderX CN88XX Panic : Unknown reason

2020-01-15 Thread Alexandru Avadanii
Hi, Not sure this is useful (since it might be obvious), but adding `nopti` to kernel parameters works around the issue, indicating this is indeed related to kpti.

[Kernel-packages] [Bug 1857074] Re: Cavium ThunderX CN88XX Panic : Unknown reason

2020-01-14 Thread Ubuntu Foundations Team Bug Bot
** Tags added: patch https://bugs.launchpad.net/bugs/1857074 Title: Cavium ThunderX CN88XX Panic : Unknown reason Status in linux package in Ubuntu: Confirmed Status

Re: [Kernel-packages] [Bug 1857074] Re: Cavium ThunderX CN88XX Panic : Unknown reason

2020-01-14 Thread dann frazier
On Tue, Jan 14, 2020 at 8:35 AM Juerg Haefliger <1857...@bugs.launchpad.net> wrote: > > We certainly want this: > > commit 71c751f2a43fa03fae3cf5f0067ed3001a397013 > Author: Mark Rutland > Date: Mon Apr 23 11:41:33 2018 +0100 > > arm64: add sentinel to kpti_safe_list Agreed, nice catch.

[Kernel-packages] [Bug 1857074] Re: Cavium ThunderX CN88XX Panic : Unknown reason

2020-01-14 Thread Juerg Haefliger
We certainly want this: commit 71c751f2a43fa03fae3cf5f0067ed3001a397013 Author: Mark Rutland Date: Mon Apr 23 11:41:33 2018 +0100 arm64: add sentinel to kpti_safe_list We're missing a sentinel entry in kpti_safe_list. Thus is_midr_in_range_list() can walk past the end of
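The role of the sentinel can be illustrated with a minimal sketch. The struct, the function, and the list contents below are hypothetical stand-ins, not the kernel's actual definitions; the only assumption carried over from the message is that the list walker stops at a terminating entry, so a list without one is read past its end.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a CPU model/revision range entry. */
struct midr_range_sketch {
    uint32_t model;   /* 0 is reserved here to mark the end of the list */
    uint32_t rv_min;  /* lowest matching revision */
    uint32_t rv_max;  /* highest matching revision */
};

/* Walk the list until the sentinel (model == 0) is reached.
 * Without a sentinel entry this loop would keep reading whatever
 * memory happens to follow the array. */
static bool midr_in_range_list_sketch(uint32_t model, uint32_t rv,
                                      const struct midr_range_sketch *list)
{
    assert(list != NULL);
    for (; list->model != 0; list++) {
        if (list->model == model &&
            rv >= list->rv_min && rv <= list->rv_max)
            return true;
    }
    return false;
}

/* Made-up model numbers, for illustration only. */
static const struct midr_range_sketch safe_list_sketch[] = {
    { 0x101, 0x0, 0xF },
    { 0x202, 0x0, 0xF },
    { 0, 0, 0 },  /* sentinel: terminates the walk */
};
```

Dropping the last entry from `safe_list_sketch` would leave the loop with no defined stopping point for a model that is not in the list, which is the class of bug the referenced commit title describes.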

[Kernel-packages] [Bug 1857074] Re: Cavium ThunderX CN88XX Panic : Unknown reason

2020-01-14 Thread Juerg Haefliger
Hrmm. Never mind the Qualcomm errata, this is a Cavium box :-( But I get consistent failures so I do believe that commit cce360b54ce6 causes issues although the errata are indeed enabled: [0.00] CPU features: enabling workaround for Cavium erratum 27456 [0.00] CPU features:

Re: [Kernel-packages] [Bug 1857074] Re: Cavium ThunderX CN88XX Panic : Unknown reason

2020-01-14 Thread Juerg Haefliger
On Mon, 13 Jan 2020 23:51:13 - dann frazier wrote: > v4.14.151 upstream fails as well - but with a different symptom (see > below). v4.14.150 seems fine, so I'll try and bisect between the two. Of > course, that's really just a shot in the dark, as we know this issue is > finicky.. The

[Kernel-packages] [Bug 1857074] Re: Cavium ThunderX CN88XX Panic : Unknown reason

2020-01-13 Thread dann frazier
v4.14.151 upstream fails as well - but with a different symptom (see below). v4.14.150 seems fine, so I'll try and bisect between the two. Of course, that's really just a shot in the dark, as we know this issue is finicky.. [ 34.896151] Unable to handle kernel paging request at virtual address

[Kernel-packages] [Bug 1857074] Re: Cavium ThunderX CN88XX Panic : Unknown reason

2020-01-13 Thread dann frazier
** Attachment added: "4.14.164.config" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1857074/+attachment/5319910/+files/4.14.164.config

[Kernel-packages] [Bug 1857074] Re: Cavium ThunderX CN88XX Panic : Unknown reason

2020-01-13 Thread dann frazier
fyi, I was able to reproduce this w/ upstream 4.14.164, built w/ an Ubuntu-based config. ** Attachment added: "4.14.164.dmesg" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1857074/+attachment/5319909/+files/4.14.164.dmesg

[Kernel-packages] [Bug 1857074] Re: Cavium ThunderX CN88XX Panic : Unknown reason

2020-01-07 Thread dann frazier
** Changed in: linux (Ubuntu Bionic) Status: New => Confirmed

[Kernel-packages] [Bug 1857074] Re: Cavium ThunderX CN88XX Panic : Unknown reason

2020-01-06 Thread dann frazier
** Attachment added: "anuchin-4.15.0-74.84-bionic-oops.log" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1857074/+attachment/5317960/+files/anuchin-4.15.0-74.84-bionic-oops.log

[Kernel-packages] [Bug 1857074] Re: Cavium ThunderX CN88XX Panic : Unknown reason

2020-01-06 Thread dann frazier
** Attachment added: "seidel-4.15.0-74.84-bionic-oops.log" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1857074/+attachment/5317962/+files/seidel-4.15.0-74.84-bionic-oops.log

[Kernel-packages] [Bug 1857074] Re: Cavium ThunderX CN88XX Panic : Unknown reason

2020-01-06 Thread dann frazier
** Attachment added: "seidel-4.15.0-74.84-xenial-no-oops.log" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1857074/+attachment/5317963/+files/seidel-4.15.0-74.84-xenial-no-oops.log

[Kernel-packages] [Bug 1857074] Re: Cavium ThunderX CN88XX Panic : Unknown reason

2020-01-06 Thread dann frazier
** Attachment added: "anuchin-4.15.0-74.84-xenial-no-oops.log" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1857074/+attachment/5317961/+files/anuchin-4.15.0-74.84-xenial-no-oops.log

[Kernel-packages] [Bug 1857074] Re: Cavium ThunderX CN88XX Panic : Unknown reason

2020-01-06 Thread dann frazier
Our testing found the same issue on 2 other CN88XX systems (seidel & anuchin); I'll attach the logs here. Interestingly, while both systems hit the oops when booting the bionic 4.15.0-74.84, neither system had a problem with the xenial hwe 4.15.0-74.84.

[Kernel-packages] [Bug 1857074] Re: Cavium ThunderX CN88XX Panic : Unknown reason

2019-12-19 Thread Sean Feole
Full console output for starmie ** Attachment added: "starmie.txt" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1857074/+attachment/5314163/+files/starmie.txt ** Also affects: linux (Ubuntu Bionic) Importance: Undecided Status: New