------- Comment From balb...@au1.ibm.com 2016-07-17 21:43 EDT-------
I tried the kernel at
http://people.canonical.com/~kamal/lp1573062/lp1573062.1/
and it worked fine for me.

------- Comment From balb...@au1.ibm.com 2016-07-19 01:04 EDT-------
Looks like I got a failure with the run on
http://people.canonical.com/~kamal/lp1573062/lp1573062.1/

But with my diff + 4.4.0 source from apt-source, I can always get the
following command to succeed.

timeout -s 9 $end_time stress-ng --aggressive --verify --timeout $runtime --brk 0

I've tried three times with my diff (all successes) and twice with the
kernel @ ~kamal (one failure and one success). I've not tried the longer
7 hour run.

------- Comment From balb...@au1.ibm.com 2016-07-19 01:37 EDT-------
In the kern.log posted, it looks like the problem has moved to

rwsem_wake+0xcc/0x110
up_write+0x78/0x90
unlink_anon_vmas+0x15c/0x2c0

A bunch of threads are stuck in rwsem_wake, spinning on
sem->wait_lock. I can see a whole bunch of exiting stress-ng-mmapf
processes stuck spinning on this lock. I'll double check this. Can we
get a build with lockdep enabled? I am unable to reproduce this issue at
my end with the diff applied on my machine at the moment.

------- Comment From balb...@au1.ibm.com 2016-07-19 19:51 EDT-------
I am cloning the sources to debug further.

------- Comment From balb...@au1.ibm.com 2016-07-19 23:52 EDT-------
I cloned the kernel from
https://git.launchpad.net/~kamalmostafa/ubuntu/+source/linux/+git/xenial/log/?h=lp1573062
and built it with the machine config specified from /boot/config. I also
verified the diff matches my changes.

I ran

timeout -s 9 $end_time stress-ng --aggressive --verify --timeout $runtime --brk 0

twice

Both times, the test did the right thing. Could someone verify whether

(a) The smaller subset works fine?
(b) The larger test fails? If so, can we get a run with lockdep enabled?

I was only testing the command line above, and I could see a
difference with those patches.

------- Comment From balb...@au1.ibm.com 2016-07-21 20:14 EDT-------
No, the diff matches. Sorry for the confusion, but here is what I said:

"I also verified the diff matches my changes"

In summary, here is what I did:

1. Cloned the sources
2. Built locally on my machine
3. Ran stress-ng with the recommended parameters
4. The test succeeded and I got back the console

I did four runs and got back the console each time.

However, with the provided binaries, step 3 (stress-ng) failed for me
once in two runs.

------- Comment From balb...@au1.ibm.com 2016-07-25 08:08 EDT-------
Strange, I am able to reproduce the issue with the provided binaries, but not
when I build the kernel myself. I am not doing a deb build, just a make -j64
with the config from /boot for 4.4.0-28. The problem could be at my end, but
I am a little concerned.

I also noticed that the run succeeds if I interact with the system while
it is in progress, frequently checking whether the console is still
active (enters and control-o-h). I am going to see if I can get a repro
again and debug further.

------- Comment From balb...@au1.ibm.com 2016-07-25 09:09 EDT-------
In the meantime, any updates on the bisect? I was hoping we could do both
things (RCA and bisect) in parallel.

Thanks,
Balbir

------- Comment From balb...@au1.ibm.com 2016-07-25 23:37 EDT-------
I've been working off the assumption that the bug was fixed in mainline :)

I tried a few runs, including 4.5
(4.5.0-040500-generic_4.5.0-040500.201605161244), and it worked for me as
well (comment #25). I presume I should go by comment #92 and assume
that the bug is still present in mainline.

------- Comment From balb...@au1.ibm.com 2016-07-31 21:17 EDT-------
Does this succeed on your system? Could you please try three runs?

timeout -s 9 $end_time stress-ng --aggressive --verify --timeout $runtime --brk 0
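
To keep a tally across repeated runs, something like the following could be used. This is only a minimal sketch: run_trials is a hypothetical helper, not part of stress-ng or of the test suite discussed in this bug, and it counts any non-zero exit status (including a kill by timeout) as a failure.

```shell
#!/bin/sh
# run_trials N CMD [ARGS...]
# Hypothetical helper: run CMD N times in sequence and print how many
# runs exited with status 0 (pass) and how many did not (fail).
run_trials() {
    n=$1
    shift
    pass=0
    fail=0
    i=1
    while [ "$i" -le "$n" ]; do
        if "$@"; then
            pass=$((pass + 1))
        else
            fail=$((fail + 1))
        fi
        i=$((i + 1))
    done
    echo "pass=$pass fail=$fail"
}
```

With end_time and runtime set as in the earlier runs, it could be invoked as: run_trials 3 timeout -s 9 "$end_time" stress-ng --aggressive --verify --timeout "$runtime" --brk 0. If the machine hangs outright, nothing is printed, so the tally is only useful when the console comes back.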

------- Comment From balb...@au1.ibm.com 2016-08-09 21:29 EDT-------
Could the team please try the patch I posted to linux-mm? It is at
http://marc.info/?l=linux-mm&m=147071635030062&w=2
and is under discussion at the moment. I've tried it a few times at my end
on the xenial git tree, on top of the oom reaper changes. More testing is in
progress at my end.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1573062

Title:
  memory_stress_ng failing for IBM Power S812LC(TN71-BP012) for 16.04