[Bug 1788035] Re: nvme: avoid cqe corruption

2019-10-03 Thread Po-Hsu Lin
Fix can be found in Eoan, mark this as fix-released. ** Changed in: linux (Ubuntu) Status: In Progress => Fix Released -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1788035 Title: nvme:

[Bug 1788035] Re: nvme: avoid cqe corruption

2019-07-24 Thread Brad Figg
** Tags added: cscc

[Bug 1788035] Re: nvme: avoid cqe corruption

2019-01-23 Thread Guilherme G. Piccoli
Some updates here: the patch was released to the -proposed pocket and is available in kernel 4.4.0-1075-aws. To enable the proposed repository, please see https://wiki.ubuntu.com/Testing/EnableProposed. The plan is to have this kernel released in the first week of February, after all
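As a rough illustration of the version mentioned above, the following sketch checks whether a kernel release string (as printed by `uname -r`) is an -aws kernel at or beyond 4.4.0-1075. The helper names `parse_release` and `at_least_proposed_aws` are hypothetical, and the check assumes that later ABI numbers in the same 4.4.0 -aws series also carry the patch, which this thread does not confirm.

```python
import re

# Matches release strings of the form "4.4.0-1069-aws" (version-ABI-flavour).
RELEASE = re.compile(r"^(\d+)\.(\d+)\.(\d+)-(\d+)-(\w+)$")


def parse_release(release):
    """Split a release string into ((major, minor, patch), abi, flavour)."""
    m = RELEASE.match(release.strip())
    if m is None:
        return None
    major, minor, patch, abi, flavour = m.groups()
    return (int(major), int(minor), int(patch)), int(abi), flavour


def at_least_proposed_aws(release, base=(4, 4, 0), abi=1075):
    """True if release is an -aws kernel on the same base with ABI >= abi.

    Assumption: a higher ABI number in the same series includes the patch.
    """
    parsed = parse_release(release)
    if parsed is None:
        return False
    found_base, found_abi, flavour = parsed
    return flavour == "aws" and found_base == base and found_abi >= abi


if __name__ == "__main__":
    for rel in ("4.4.0-1069-aws", "4.4.0-1075-aws"):
        print(rel, at_least_proposed_aws(rel))
```

Strings that do not follow the version-ABI-flavour layout (for example a generic kernel such as 4.4.0-135-generic) are reported as not matching rather than guessed at.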

[Bug 1788035] Re: nvme: avoid cqe corruption

2018-12-04 Thread Guilherme G. Piccoli
I'm investigating this issue, and built a kernel with the following two patches: a) https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7776db1ccc1 b) A debug patch present in http://lists.infradead.org/pipermail/linux-nvme/2017-February/008498.html The idea of the

[Bug 1788035] Re: nvme: avoid cqe corruption

2018-10-31 Thread Brian Moyles
We encountered an instance that had an nvme failure very early in boot today. I've updated our internal Canonical case as well as our Amazon case on this, but I am posting the relevant details here as well for consistency: # uname -a Linux XXX 4.4.0-1069-aws #79-Ubuntu SMP Mon Sep 24 15:01:41 UTC 2018

[Bug 1788035] Re: nvme: avoid cqe corruption

2018-10-24 Thread Marco
As this issue seems far from being solved, and I don't see any progress coming from either Canonical or AWS (which I find quite annoying considering the impact for them), today we switched our instances back to m4.

[Bug 1788035] Re: nvme: avoid cqe corruption

2018-10-18 Thread Scott Emmons
We can confirm that this patch does not solve the issue, as we are still seeing the same dmesg pattern with the 4.4.0-1069-aws kernel.

[Bug 1788035] Re: nvme: avoid cqe corruption

2018-10-17 Thread Greg Frank
We did not have a series of steps to reproduce this. We just left a server running without much happening, and boom.

[Bug 1788035] Re: nvme: avoid cqe corruption

2018-10-17 Thread Marco
So far I have not found a way to reproduce the issue systematically. It doesn't seem load-related to me, as nodes with lower load crash more often than ones with high load. But I can confirm that the fix released with the Ubuntu 4.4.0-135 kernel doesn't fix the issue. As this morning (17/10/28) we faced

[Bug 1788035] Re: nvme: avoid cqe corruption

2018-10-12 Thread Marco
So far the best I could achieve is to get the kernel call trace, but without crashing the node yet. Oct 12 15:33:41 ip-10-16-21-10 kernel: [10919.306845] INFO: task java:1932 blocked for more than 120 seconds. Oct 12 15:33:41 ip-10-16-21-10 kernel: [10919.308573] Not tainted 4.4.0-1069-aws
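For anyone scanning syslog for the same symptom, here is a minimal sketch that picks out hung-task warnings of the form quoted above. The function name `find_hung_tasks` is hypothetical, and the pattern assumes the task name contains no spaces, as in the `java:1932` line in this message.

```python
import re

# Pattern for the hung-task warning quoted in the message above.
# Assumption: the task name has no embedded spaces.
HUNG_TASK = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(lines):
    """Return a (command, pid, seconds) tuple for each hung-task warning."""
    hits = []
    for line in lines:
        m = HUNG_TASK.search(line)
        if m:
            hits.append((m.group("comm"), int(m.group("pid")), int(m.group("secs"))))
    return hits


if __name__ == "__main__":
    sample = [
        "Oct 12 15:33:41 ip-10-16-21-10 kernel: [10919.306845] "
        "INFO: task java:1932 blocked for more than 120 seconds.",
    ]
    print(find_hung_tasks(sample))
```

In practice the same function could be fed the lines of a syslog file; a hung-task warning only shows that a task was blocked, and does not by itself confirm the nvme issue tracked in this bug.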

[Bug 1788035] Re: nvme: avoid cqe corruption

2018-10-12 Thread Marco
I see the same. The less loaded instances in my real environment are crashing, while the environment where I am stressing instances to reproduce the issue is not. I am still looking for a way to reproduce it. How did you reproduce it?

[Bug 1788035] Re: nvme: avoid cqe corruption

2018-10-12 Thread Greg Frank
We were reproducing this multiple times a day on several of our EC2 M5 instances. An interesting anecdote: our least loaded instances produced the bug more often than our heavily loaded instances. We've since switched to M4 servers and do not have time to flip back and help test this right now.

[Bug 1788035] Re: nvme: avoid cqe corruption

2018-10-12 Thread Marco
I am trying to reproduce it. Has anyone tried it?

[Bug 1788035] Re: nvme: avoid cqe corruption

2018-09-26 Thread Greg Frank
I am also still seeing this bug after the fix. ** Attachment added: "syslog" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1788035/+attachment/5193195/+files/syslog

[Bug 1788035] Re: nvme: avoid cqe corruption

2018-09-26 Thread Tanguy Moal
Hello, it seems that this issue or a similar one still occurs despite the fix. Please find the syslog output attached. Best regards ** Attachment added: "Bug still present on 4.4.0-135" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1788035/+attachment/5193088/+files/nvme_4.4.0.log

[Bug 1788035] Re: nvme: avoid cqe corruption

2018-09-10 Thread Launchpad Bug Tracker
This bug was fixed in the package linux - 4.4.0-135.161 --- linux (4.4.0-135.161) xenial; urgency=medium * linux: 4.4.0-135.161 -proposed tracker (LP: #1788766) * [Regression] APM Merlin boards fail to recover link after interface down/up (LP: #1785739) - net: phylib:

[Bug 1788035] Re: nvme: avoid cqe corruption

2018-08-29 Thread Brad Figg
This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag

[Bug 1788035] Re: nvme: avoid cqe corruption

2018-08-27 Thread Kleber Sacilotto de Souza
** Changed in: linux (Ubuntu Xenial) Status: In Progress => Fix Committed

[Bug 1788035] Re: nvme: avoid cqe corruption

2018-08-22 Thread Joseph Salisbury
** Tags added: kernel-da-key xenial

[Bug 1788035] Re: nvme: avoid cqe corruption

2018-08-22 Thread Kleber Sacilotto de Souza
** Also affects: linux (Ubuntu Xenial) Importance: Undecided Status: New ** Changed in: linux (Ubuntu Xenial) Status: New => In Progress