Re: Which came first, hard kernel lockup or SATA errors?

2017-10-10 Thread Ed Swierk
Continuing the conversation with the voices in my head...

On Mon, Oct 9, 2017 at 10:45 PM, Ed Swierk wrote:
> Based on the addresses in the stack and registers, here's what I think
> happened.
>
> On cpu 13:
>
> - task_numa_fault() calls task_numa_migrate(), which selects the task
> on cpu 0

Re: Which came first, hard kernel lockup or SATA errors?

2017-10-09 Thread Ed Swierk
On Fri, Oct 6, 2017 at 6:25 PM, Ed Swierk wrote:
> I'm trying to untangle a series of problems that suddenly occurred on
> a dual-socket Xeon server system that had been running a bunch of KVM
> workloads just fine for over 6 weeks (4.4.52-grsec kernel,
> Debian-derived userspace).

I think I've n

Which came first, hard kernel lockup or SATA errors?

2017-10-06 Thread Ed Swierk
I'm trying to untangle a series of problems that suddenly occurred on a dual-socket Xeon server system that had been running a bunch of KVM workloads just fine for over 6 weeks (4.4.52-grsec kernel, Debian-derived userspace). Here are the highlights, with timestamps in seconds: [3851435] NMI watc