On Mon, Nov 28, 2016 at 11:10:14AM +0800, Ming Lei wrote:
> When I run stress-ng via the following steps on one ARM64 dual
> socket system(Cavium Thunder), the kernel oops can often be
> triggered after running the stress test for several hours(sometimes
> it may take longer):
> - git clone git://kernel.ubuntu.com/cking/stress-ng.git
> - apply the attachment patch which just makes the posix file
> lock stress test more aggressive
> - run the test via '~/git/stress-ng$./stress-ng --lockf 128 --aggressive'
> From the oops log, looks one garbage file_lock node is got
> from the linked list of 'ctx->flc_posix' when the issue happens.
> BTW, the issue isn't observed on single socket Cavium Thunder yet,
> and the same issue can be seen on Ubuntu Xenial(v4.4 based kernel)
FWIW, I've been running this on Seattle for 24 hours with your patch applied
and not seen any problems yet. That said, Thomas did just fix an rt_mutex
race which only seemed to pop up on Thunder, so you could give those
patches a try.