Re: Machine crashes right *after* ~successful resume

2014-11-02 Thread Wilmer van der Gaast
On 01-11-14 02:10, Yinghai Lu wrote: Patch #1 worked after a simple s/&&/)/. And patch #2 seems to fix the problem as well! updated first #1. Works as well! Wilmer v/d Gaast.

Re: Machine crashes right *after* ~successful resume

2014-10-31 Thread Wilmer van der Gaast
Hello, Patch #1 worked after a simple s/&&/)/. And patch #2 seems to fix the problem as well! Wilmer v/d Gaast.

Re: Machine crashes right *after* ~successful resume

2014-10-31 Thread Wilmer van der Gaast
On 31-10-14 16:11, Yinghai Lu wrote: Good. Please check if attached one on top of 3.17 only would work too. No luck, sadly. :-( Unsuccessful third resume. I forgot to set up the serial console, would that still be useful? Wilmer v/d Gaast.

Re: Machine crashes right *after* ~successful resume

2014-10-31 Thread Wilmer van der Gaast
-at-boottime-only patch wasn't just working by accident yesterday, I tested it twice more with the same effect. Thanks, Wilmer van der Gaast.

Re: Machine crashes right *after* ~successful resume

2014-10-30 Thread Wilmer van der Gaast
On 30-10-14 23:02, Yinghai Lu wrote: http://gaast.net/~wilmer/.lkml/good3.17-patched-megadebug.txt no difference except on 00:1c.3 --- before.txt 2014-10-30 15:20:35.782886485 -0700 +++ after.txt 2014-10-30 15:21:37.034882515 -0700 @@ -49,10 +49,10 @@ 02f0: 00 00 00 00 00 00 00 00
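
The before/after comparison quoted above can be reproduced without a line-based diff. What follows is only a minimal sketch, assuming each dump consists of rows shaped like "02f0: 00 00 00 ..." (a hex offset, a colon, then hex bytes), as in the excerpt. The names parse_dump and diff_dumps are hypothetical and do not come from the thread.

```python
import re

# Matches rows such as "02f0: 00 00 00 00 ..." (hex offset, then hex bytes).
# Device header lines like "00:1c.3 PCI bridge" do not match, because no
# whitespace follows the first colon.
_ROW = re.compile(r"^([0-9a-fA-F]+):((?:\s+[0-9a-fA-F]{2})+)\s*$")


def parse_dump(text):
    """Return {offset: byte_value} for every hex row found in the dump text."""
    data = {}
    for line in text.splitlines():
        match = _ROW.match(line.strip())
        if not match:
            continue  # skip headers, blank lines and anything else
        base = int(match.group(1), 16)
        for index, token in enumerate(match.group(2).split()):
            data[base + index] = int(token, 16)
    return data


def diff_dumps(before, after):
    """List (offset, old, new) for every offset whose byte value differs."""
    old, new = parse_dump(before), parse_dump(after)
    return [
        (offset, old.get(offset), new.get(offset))
        for offset in sorted(set(old) | set(new))
        if old.get(offset) != new.get(offset)
    ]
```

Reporting byte offsets rather than whole rows makes it easier to map a difference back to a specific configuration register.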

Re: Machine crashes right *after* ~successful resume

2014-10-30 Thread Wilmer van der Gaast
Hello, On 30-10-14 16:57, Yinghai Lu wrote: Sadly, with that patch (applied against a vanilla 3.17 tree like all the others) the second resume fails already. :-( oh, no. Really want to know which bit causes the problem. Good question. And I think you will find my new finding even more

Re: Machine crashes right *after* ~successful resume

2014-10-30 Thread Wilmer van der Gaast
Hello, On 30-10-14 00:53, Yinghai Lu wrote: Done, and that did work! Four suspend+resume cycles later and it's still stable. Then can you test attached simplified one. Sadly, with that patch (applied against a vanilla 3.17 tree like all the others) the second resume fails already. :-(

Re: Machine crashes right *after* ~successful resume

2014-10-29 Thread Wilmer van der Gaast
Hello, On 29-10-14 05:17, Yinghai Lu wrote: (Diff is in the Intel device, not the ITE one.) That is strange. I did wonder later, why was I not seeing the ff* dump anymore after the resume.. Anyway please try attached patched on top of 3.17. Done, and that did work! Four suspend+resume

Re: Machine crashes right *after* ~successful resume

2014-10-28 Thread Wilmer van der Gaast
Hello, On 28-10-14 01:12, Yinghai Lu wrote: lspci -vv -s 00:1c.3 lspci -vv -s 04:00.0 before reverting enable bridge early patch http://gaast.net/~wilmer/.lkml/lspcixx-nopatch.txt (So that's 3.17 + your revert patch) and after reverting on 3.17+?

Re: Machine crashes right *after* ~successful resume

2014-10-28 Thread Wilmer van der Gaast
On 28-10-14 04:03, Yinghai Lu wrote: Please check if attached patch could fix the problem on your setup. Sadly it looks like it did not. :-( Applied your patch on a vanilla 3.17 tree, still seeing the same crash. I'll get more debugging output and the output you asked for in your previous

Re: Machine crashes right *after* ~successful resume

2014-10-27 Thread Wilmer van der Gaast
On 27-10-14 23:41, Yinghai Lu wrote: Can you only apply the patch that revert enable bridge early and two pci dump patches to see if 04:00.0 readout is 0xff? I was curious about that already, did that with a 3.16.6 that I think just had your revert applied (and using lspci - to get the
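
The 0xff question above can be checked mechanically. A readout consisting only of 0xff bytes usually suggests that the configuration reads never reached the device. The sketch below assumes the dump uses rows of the form "00: ff ff ff ...". The function names are hypothetical and are not taken from the debug patches discussed in the thread.

```python
import re


def config_bytes(dump_text):
    """Collect byte values from rows shaped like '00: ff ff ff ...'."""
    values = []
    for line in dump_text.splitlines():
        # Header lines such as "04:00.0 ..." fail to match because the
        # colon is not followed by whitespace.
        match = re.match(r"^\s*[0-9a-fA-F]+:((?:\s+[0-9a-fA-F]{2})+)\s*$", line)
        if match:
            values.extend(int(token, 16) for token in match.group(1).split())
    return values


def reads_all_ff(dump_text):
    """True when the dump holds at least one byte and every byte is 0xff."""
    values = config_bytes(dump_text)
    return bool(values) and all(value == 0xFF for value in values)
```

An empty dump returns False on purpose, so a missing log is not mistaken for an unresponsive device.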

Re: Machine crashes right *after* ~successful resume

2014-10-27 Thread Wilmer van der Gaast
ces/\:04\:00.0/remove echo 1 > /sys/bus/pci/devices/\:00\:1c.3/pcie_link_disable before suspend/resume test. That worked! Resumed properly now. Full log in http://gaast.net/~wilmer/.lkml/good3.17.txt . Including the PCI dump at boot time, where that device doesn't dump just ff's. Wilm

Re: Machine crashes right *after* ~successful resume

2014-10-27 Thread Wilmer van der Gaast
g/buffered? Anyway, dumps are in: http://gaast.net/~wilmer/.lkml/bad3.17-pcidumps-no_console_suspend.txt http://gaast.net/~wilmer/.lkml/bad3.17-pcidumps.txt Cheers, Wilmer van der Gaast.

Re: Machine crashes right *after* ~successful resume

2014-10-22 Thread Wilmer van der Gaast
Hello Yinghai, This looks more promising! Yinghai Lu (ying...@kernel.org) wrote: > > > > And then nothing, and it's hung. Looks the same to me (apart from the tsc > > issues + hpet switch) as a successful resume: > > then it stuck in pm_restore_console()? > That seems to be the case yes: [

Re: Machine crashes right *after* ~successful resume

2014-10-21 Thread Wilmer van der Gaast
Hello, Sorry for the delay, finally poked at this again. It looks like the no_console_suspend flag was causing troubles, which I didn't really need anyway with logging going to my serial port. This is what I get now on the failing resume: [ 112.879390] PM: resume of devices complete after

Re: Machine crashes right *after* ~successful resume

2014-10-19 Thread Wilmer van der Gaast
Hello, On 19-10-14 05:29, Yinghai Lu wrote: Please try to "debug ignore_loglevel no_console_suspend". Same thing. :-( [ 72.572354] Restarting tasks ... done. [ 72.576554] PM: calling nb rcu_pm_notify+0x0/0x60 [ 72.581277] PM: ... nb rcu_pm_notify+0x0/0x60 done [ 72.586115] PM:

Re: Machine crashes right *after* ~successful resume

2014-10-18 Thread Wilmer van der Gaast
(Resending, forgot to hit reply-to-all.) Hello Yinghai, On 18-10-14 22:28, Yinghai Lu wrote: > > Please apply attached debug patch on top of v3.17 and boot with > "debug ignore_loglevel initcall_debug no_console_suspend". > > Hope we can find out which nb notifier cause problem. > Did that.
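
Before collecting logs it can help to confirm that the requested boot parameters actually took effect. This is a minimal sketch, assuming the kernel command line is available as a single string (on Linux it can be read from /proc/cmdline). The helper name missing_boot_flags is hypothetical.

```python
def missing_boot_flags(
    cmdline,
    wanted=("debug", "ignore_loglevel", "initcall_debug", "no_console_suspend"),
):
    """Return the wanted flags that do not appear as whole words in cmdline."""
    present = set(cmdline.split())
    return [flag for flag in wanted if flag not in present]
```

On a running system, calling it as missing_boot_flags(open("/proc/cmdline").read()) and getting an empty list back would indicate that all four flags were passed.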

Re: Machine crashes right *after* ~successful resume

2014-10-16 Thread Wilmer van der Gaast
Hello, I have filed a bug now: https://bugzilla.kernel.org/show_bug.cgi?id=86421 We should probably continue the discussion there now? I've added just you to the CC field, not sure who else on this thread is still interested at this point. On 16-10-14 17:36, Yinghai Lu wrote: Can you put

Re: Machine crashes right *after* ~successful resume

2014-10-16 Thread Wilmer van der Gaast
Hello, On 16-10-14 05:32, Yinghai Lu wrote: Can you please try attached patch? that should workaround the problem. Sadly, no luck. (I do assume you meant me to use the patch against a clean 3.17 tree *without* yesterday's revert patch applied.) Back to a crash at/after the third resume: [

Re: Machine crashes right *after* ~successful resume

2014-10-15 Thread Wilmer van der Gaast
Hello Yinghai, On 15-10-14 19:39, Yinghai Lu wrote: so third resume will not work? that is strange. second and third should not use same code path... Always exactly the third time, yes. Seems strange indeed. :-( I was under the impression that on each resume, completion time of device

Re: Machine crashes right *after* ~successful resume

2014-10-15 Thread Wilmer van der Gaast
Hello Rafael, Rafael J. Wysocki (r...@rjwysocki.net) wrote: > > Would it be feasible to revert 2e8b... to see if it fixes it on 3.17? > That's a merge, isn't it? > Correct, it was, and I did try to figure out which of its parents was the guilty one, but then I found out the real problem is

Re: Machine crashes right *after* ~successful resume

2014-10-12 Thread Wilmer van der Gaast
situation, I'll post again. This is done: Still seeing the same issue. (And I'm using raw echo mem>/proc/... for all testing now.) Same for a "make defconfig" kernel. Wilmer van der Gaast.

Re: Machine crashes right *after* ~successful resume

2014-10-12 Thread Wilmer van der Gaast
. If this improves the situation, I'll post again. Wilmer van der Gaast.

Machine crashes right *after* ~successful resume

2014-10-07 Thread Wilmer van der Gaast
I'd love some ideas for troubleshooting an issue like this. "Attachments" in http://roy.gaast.net/~wilmer/.lkml/ since I just realised >200KB of attachments might not be appreciated. :-) Cheers, Wilmer van der Gaast.