Re: Machine crashes right *after* ~successful resume

2014-11-02 Thread Wilmer van der Gaast
On 01-11-14 02:10, Yinghai Lu wrote: Patch #1 worked after a simple s/&&/)/. And patch #2 seems to fix the problem as well! updated first #1. Works as well! Wilmer v/d Gaast. -- + .''`. - -- ---+ +- -- --- - --+ | wilmer : :' : gaast.net | | OSS

Re: Machine crashes right *after* ~successful resume

2014-11-02 Thread Wilmer van der Gaast
On 01-11-14 02:10, Yinghai Lu wrote: Patch #1 worked after a simple s/&&/)/. And patch #2 seems to fix the problem as well! updated first #1. Works as well! Wilmer v/d Gaast. -- + .''`. - -- ---+ +- -- --- - --+ | wilmer : :' : gaast.net | | OSS

Re: Machine crashes right *after* ~successful resume

2014-10-31 Thread Yinghai Lu
On Fri, Oct 31, 2014 at 5:00 PM, Wilmer van der Gaast wrote: > Hello, > > Patch #1 worked after a simple s/&&/)/. And patch #2 seems to fix the > problem as well! updated first #1. --- drivers/pci/pci.c | 18 ++ 1 file changed, 18 insertions(+) Index:

Re: Machine crashes right *after* ~successful resume

2014-10-31 Thread Wilmer van der Gaast
Hello, Patch #1 worked after a simple s/&&/)/. And patch #2 seems to fix the problem as well! Wilmer v/d Gaast. -- + .''`. - -- ---+ +- -- --- - --+ | wilmer : :' : gaast.net | | OSS Programmer www.bitlbee.org | | lintux `. `~' debian.org | |

Re: Machine crashes right *after* ~successful resume

2014-10-31 Thread Yinghai Lu
On Fri, Oct 31, 2014 at 2:22 PM, Yinghai Lu wrote: > On Fri, Oct 31, 2014 at 2:13 PM, Wilmer van der Gaast > wrote: >> On 31-10-14 16:11, Yinghai Lu wrote: >>> >>> >>> Good. Please check if attached one on top of 3.17 only would work too. >>> >> No luck, sadly. :-( Unsuccessful third resume.

Re: Machine crashes right *after* ~successful resume

2014-10-31 Thread Yinghai Lu
On Fri, Oct 31, 2014 at 2:13 PM, Wilmer van der Gaast wrote: > On 31-10-14 16:11, Yinghai Lu wrote: >> >> >> Good. Please check if attached one on top of 3.17 only would work too. >> > No luck, sadly. :-( Unsuccessful third resume. > > I forgot to set up the serial console, would that still be

Re: Machine crashes right *after* ~successful resume

2014-10-31 Thread Wilmer van der Gaast
On 31-10-14 16:11, Yinghai Lu wrote: Good. Please check if attached one on top of 3.17 only would work too. No luck, sadly. :-( Unsuccessful third resume. I forgot to set up the serial console, would that still be useful? Wilmer v/d Gaast. -- + .''`. - -- ---+ +- --

Re: Machine crashes right *after* ~successful resume

2014-10-31 Thread Yinghai Lu
On Fri, Oct 31, 2014 at 2:39 AM, Wilmer van der Gaast wrote: > Hello Yinghai, > > On 31-10-14 02:13, Yinghai Lu wrote: >> >> Last try: >> >> Please check attached patch that will keep state consistent. > > > Good news: This last patch worked! For good measure, I ran my test twice > with a reboot

Re: Machine crashes right *after* ~successful resume

2014-10-31 Thread Wilmer van der Gaast
Hello Yinghai, On 31-10-14 02:13, Yinghai Lu wrote: Last try: Please check attached patch that will keep state consistent. Good news: This last patch worked! For good measure, I ran my test twice with a reboot in between. Worked consistently. And similarly, to ensure that your

Re: Machine crashes right *after* ~successful resume

2014-10-31 Thread Yinghai Lu
On Fri, Oct 31, 2014 at 2:39 AM, Wilmer van der Gaast wil...@gaast.net wrote: Hello Yinghai, On 31-10-14 02:13, Yinghai Lu wrote: Last try: Please check attached patch that will keep state consistent. Good news: This last patch worked! For good measure, I ran my test twice with a reboot

Re: Machine crashes right *after* ~successful resume

2014-10-31 Thread Yinghai Lu
On Fri, Oct 31, 2014 at 2:13 PM, Wilmer van der Gaast wil...@gaast.net wrote: On 31-10-14 16:11, Yinghai Lu wrote: Good. Please check if attached one on top of 3.17 only would work too. No luck, sadly. :-( Unsuccessful third resume. I forgot to set up the serial console, would that still

Re: Machine crashes right *after* ~successful resume

2014-10-31 Thread Yinghai Lu
On Fri, Oct 31, 2014 at 2:22 PM, Yinghai Lu ying...@kernel.org wrote: On Fri, Oct 31, 2014 at 2:13 PM, Wilmer van der Gaast wil...@gaast.net wrote: On 31-10-14 16:11, Yinghai Lu wrote: Good. Please check if attached one on top of 3.17 only would work too. No luck, sadly. :-( Unsuccessful

Re: Machine crashes right *after* ~successful resume

2014-10-31 Thread Wilmer van der Gaast
Hello, Patch #1 worked after a simple s/&&/)/. And patch #2 seems to fix the problem as well! Wilmer v/d Gaast. -- + .''`. - -- ---+ +- -- --- - --+ | wilmer : :' : gaast.net | | OSS Programmer www.bitlbee.org | | lintux `. `~' debian.org | | Full-time

Re: Machine crashes right *after* ~successful resume

2014-10-31 Thread Yinghai Lu
On Fri, Oct 31, 2014 at 5:00 PM, Wilmer van der Gaast wil...@gaast.net wrote: Hello, Patch #1 worked after a simple s/&&/)/. And patch #2 seems to fix the problem as well! updated first #1. --- drivers/pci/pci.c | 18 ++ 1 file changed, 18 insertions(+) Index:

Re: Machine crashes right *after* ~successful resume

2014-10-30 Thread Yinghai Lu
On Thu, Oct 30, 2014 at 5:43 PM, Yinghai Lu wrote: > On Thu, Oct 30, 2014 at 4:24 PM, Wilmer van der Gaast > wrote: >> >> >> Same problem like this morning: Failure after the second resume already. :-( >> > can not find out any magic line in pci_enable_bridge that could cause > the difference.

Re: Machine crashes right *after* ~successful resume

2014-10-30 Thread Yinghai Lu
On Thu, Oct 30, 2014 at 4:24 PM, Wilmer van der Gaast wrote: > > > Same problem like this morning: Failure after the second resume already. :-( > can not find out any magic line in pci_enable_bridge that could cause the difference. so either use attached pcie_enable_bridge_ite.patch or just

Re: Machine crashes right *after* ~successful resume

2014-10-30 Thread Wilmer van der Gaast
On 30-10-14 23:02, Yinghai Lu wrote: http://gaast.net/~wilmer/.lkml/good3.17-patched-megadebug.txt no difference except on 00:1c.3 --- before.txt 2014-10-30 15:20:35.782886485 -0700 +++ after.txt 2014-10-30 15:21:37.034882515 -0700 @@ -49,10 +49,10 @@ 02f0: 00 00 00 00 00 00 00 00

Re: Machine crashes right *after* ~successful resume

2014-10-30 Thread Yinghai Lu
On Thu, Oct 30, 2014 at 2:54 PM, Wilmer van der Gaast wrote: > http://gaast.net/~wilmer/.lkml/good3.17-patched-megadebug.txt no difference except on 00:1c.3 --- before.txt 2014-10-30 15:20:35.782886485 -0700 +++ after.txt 2014-10-30 15:21:37.034882515 -0700 @@ -49,10 +49,10 @@ 02f0: 00

Re: Machine crashes right *after* ~successful resume

2014-10-30 Thread Wilmer van der Gaast
Hello, On 30-10-14 16:57, Yinghai Lu wrote: Sadly, with that patch (applied against a vanilla 3.17 tree like all the others) the second resume fails already. :-( oh, no. Really want to know which bit causes the problem. Good question. And I think you will find my new finding even more

Re: Machine crashes right *after* ~successful resume

2014-10-30 Thread Yinghai Lu
On Thu, Oct 30, 2014 at 3:36 AM, Wilmer van der Gaast wrote: > Sadly, with that patch (applied against a vanilla 3.17 tree like all the > others) the second resume fails already. :-( oh, no. Really want to know which bit causes the problem. Please check debug patch...that will print out pci

Re: Machine crashes right *after* ~successful resume

2014-10-30 Thread Wilmer van der Gaast
Hello, On 30-10-14 00:53, Yinghai Lu wrote: Done, and that did work! Four suspend+resume cycles later and it's still stable. Then can you test attached simplified one. Sadly, with that patch (applied against a vanilla 3.17 tree like all the others) the second resume fails already. :-(

Re: Machine crashes right *after* ~successful resume

2014-10-30 Thread Yinghai Lu
On Thu, Oct 30, 2014 at 3:36 AM, Wilmer van der Gaast wil...@gaast.net wrote: Sadly, with that patch (applied against a vanilla 3.17 tree like all the others) the second resume fails already. :-( oh, no. Really want to know which bit causes the problem. Please check debug patch...that will

Re: Machine crashes right *after* ~successful resume

2014-10-30 Thread Yinghai Lu
On Thu, Oct 30, 2014 at 2:54 PM, Wilmer van der Gaast wil...@gaast.net wrote: http://gaast.net/~wilmer/.lkml/good3.17-patched-megadebug.txt no difference except on 00:1c.3 --- before.txt 2014-10-30 15:20:35.782886485 -0700 +++ after.txt 2014-10-30 15:21:37.034882515 -0700 @@ -49,10

Re: Machine crashes right *after* ~successful resume

2014-10-30 Thread Yinghai Lu
On Thu, Oct 30, 2014 at 4:24 PM, Wilmer van der Gaast wil...@gaast.net wrote: Same problem like this morning: Failure after the second resume already. :-( can not find out any magic line in pci_enable_bridge that could cause the difference. so either use attached pcie_enable_bridge_ite.patch

Re: Machine crashes right *after* ~successful resume

2014-10-30 Thread Yinghai Lu
On Thu, Oct 30, 2014 at 5:43 PM, Yinghai Lu ying...@kernel.org wrote: On Thu, Oct 30, 2014 at 4:24 PM, Wilmer van der Gaast wil...@gaast.net wrote: Same problem like this morning: Failure after the second resume already. :-( can not find out any magic line in pci_enable_bridge that could

Re: Machine crashes right *after* ~successful resume

2014-10-29 Thread Yinghai Lu
On Wed, Oct 29, 2014 at 2:37 AM, Wilmer van der Gaast wrote: > >> Anyway please try attached patched on top of 3.17. >> > Done, and that did work! Four suspend+resume cycles later and it's still > stable. Then can you test attached simplified one. --- drivers/pci/pci.c | 13 + 1

Re: Machine crashes right *after* ~successful resume

2014-10-29 Thread Wilmer van der Gaast
Hello, On 29-10-14 05:17, Yinghai Lu wrote: (Diff is in the Intel device, not the ITE one.) That is strange. I did wonder later, why was I not seeing the ff* dump anymore after the resume.. Anyway please try attached patched on top of 3.17. Done, and that did work! Four suspend+resume

Re: Machine crashes right *after* ~successful resume

2014-10-29 Thread Yinghai Lu
On Wed, Oct 29, 2014 at 2:37 AM, Wilmer van der Gaast wil...@gaast.net wrote: Anyway please try attached patched on top of 3.17. Done, and that did work! Four suspend+resume cycles later and it's still stable. Then can you test attached simplified one. --- drivers/pci/pci.c | 13

Re: Machine crashes right *after* ~successful resume

2014-10-28 Thread Yinghai Lu
On Tue, Oct 28, 2014 at 4:34 PM, Wilmer van der Gaast wrote: > > I've run the commands twice, once before and once after a single > suspend+resume cycle. Small difference and only before that cycle: > > ruby:~/crashit# diff -u lspcixx-* > --- lspcixx-nopatch.txt 2014-10-28 23:26:08.679690828
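
The before/after comparison described in this message (capture the lspci output once before and once after a suspend+resume cycle, then diff the two files) can also be scripted. What follows is only a minimal sketch in Python, not something posted in the thread; the function name diff_dumps and the default file labels are hypothetical, and it assumes the two dumps have already been read into strings.

```python
import difflib


def diff_dumps(before, after, before_name="before.txt", after_name="after.txt"):
    """Return unified-diff lines between two text dumps (empty list if equal).

    The file names are only labels for the diff header; nothing is read
    from disk here.
    """
    return list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile=before_name,
            tofile=after_name,
            lineterm="",
        )
    )
```

An empty result means the two captures are identical, which is the quick answer to "did the resume change this device's registers at all?".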

Re: Machine crashes right *after* ~successful resume

2014-10-28 Thread Wilmer van der Gaast
Hello, On 28-10-14 01:12, Yinghai Lu wrote: lspci -vv -s 00:1c.3 lspci -vv -s 04:00.0 before reverting enable bridge early patch http://gaast.net/~wilmer/.lkml/lspcixx-nopatch.txt (So that's 3.17 + your revert patch) and after reverting on 3.17+?

Re: Machine crashes right *after* ~successful resume

2014-10-28 Thread Wilmer van der Gaast
On 28-10-14 04:03, Yinghai Lu wrote: Please check if attached patch could fix the problem on your setup. Sadly it looks like it did not. :-( Applied your patch on a vanilla 3.17 tree, still seeing the same crash. I'll get more debugging output and the output you asked for in your previous

Re: Machine crashes right *after* ~successful resume

2014-10-28 Thread Yinghai Lu
On Tue, Oct 28, 2014 at 4:34 PM, Wilmer van der Gaast wil...@gaast.net wrote: I've run the commands twice, once before and once after a single suspend+resume cycle. Small difference and only before that cycle: ruby:~/crashit# diff -u lspcixx-* --- lspcixx-nopatch.txt 2014-10-28

Re: Machine crashes right *after* ~successful resume

2014-10-27 Thread Yinghai Lu
On Mon, Oct 27, 2014 at 6:12 PM, Yinghai Lu wrote: > On Mon, Oct 27, 2014 at 5:03 PM, Wilmer van der Gaast > wrote: >> I was curious about that already, did that with a 3.16.6 that I think just >> had your revert applied (and using lspci - to get the dump which I >> assumed would be the

Re: Machine crashes right *after* ~successful resume

2014-10-27 Thread Yinghai Lu
On Mon, Oct 27, 2014 at 5:03 PM, Wilmer van der Gaast wrote: > I was curious about that already, did that with a 3.16.6 that I think just > had your revert applied (and using lspci - to get the dump which I > assumed would be the same): No changes to 04:00 at all. > > Confirmed that this is

Re: Machine crashes right *after* ~successful resume

2014-10-27 Thread Wilmer van der Gaast
On 27-10-14 23:41, Yinghai Lu wrote: Can you only apply the patch that revert enable bridge early and two pci dump patches to see if 04:00.0 readout is 0xff? I was curious about that already, did that with a 3.16.6 that I think just had your revert applied (and using lspci - to get the

Re: Machine crashes right *after* ~successful resume

2014-10-27 Thread Yinghai Lu
On Mon, Oct 27, 2014 at 3:22 PM, Wilmer van der Gaast wrote: > Hello, > > On 27-10-14 18:23, Yinghai Lu wrote: >> >> >> 04:00.0 PCI bridge: Integrated Technology Express, Inc. Device 8892 >> >> So that ITE will not work after suspend/resume? >> > Even after the first one already, you mean? Yes.

Re: Machine crashes right *after* ~successful resume

2014-10-27 Thread Wilmer van der Gaast
Hello, On 27-10-14 18:23, Yinghai Lu wrote: 04:00.0 PCI bridge: Integrated Technology Express, Inc. Device 8892 So that ITE will not work after suspend/resume? Even after the first one already, you mean? Honestly, I don't really know what its purpose is, and it doesn't have any child

Re: Machine crashes right *after* ~successful resume

2014-10-27 Thread Pavel Machek
On Mon 2014-10-27 10:50:04, Wilmer van der Gaast wrote: > Hello Yinghai, > > Thanks again for your time! > > I've applied your two patches, and as a wild guess also added pci=dump to my > kernel cmdline though I guess that just gave me a boot-time dump - which > mostly didn't make it into my

Re: Machine crashes right *after* ~successful resume

2014-10-27 Thread Yinghai Lu
On Mon, Oct 27, 2014 at 3:50 AM, Wilmer van der Gaast wrote: > http://gaast.net/~wilmer/.lkml/bad3.17-pcidumps.txt [ 252.028142] PCI: :04:00.0 : ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0010: ff ff ff ff ff ff ff ff 04:00.0 PCI bridge: Integrated Technology Express, Inc.
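
A configuration-space dump that reads back as nothing but ff bytes, as quoted here for 04:00.0, generally means the device did not answer the read at all. A small check for that condition might look like the sketch below. It is written in Python as an illustration only: the row layout ("offset: byte byte ...") is assumed from the snippet above, and the names ROW, parse_config_dump and looks_unresponsive are hypothetical, not part of any patch in this thread.

```python
import re

# One hex-dump row such as "0010: ff ff ff ff ff ff ff ff".
# The layout is an assumption based on the dump quoted in the thread.
ROW = re.compile(r"^\s*([0-9a-fA-F]{2,4}):((?:\s+[0-9a-fA-F]{2})+)\s*$")


def parse_config_dump(text):
    """Return the bytes found in the hex-dump rows of `text`, in order."""
    data = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            data.extend(int(b, 16) for b in match.group(2).split())
    return data


def looks_unresponsive(data):
    """True when at least one byte was read and every byte is 0xff."""
    return bool(data) and all(byte == 0xFF for byte in data)
```

Lines that do not look like a dump row (for example dmesg headers) are simply skipped, so the whole log excerpt for one device can be passed in as-is.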

Re: Machine crashes right *after* ~successful resume

2014-10-27 Thread Wilmer van der Gaast
Hello Yinghai, Thanks again for your time! I've applied your two patches, and as a wild guess also added pci=dump to my kernel cmdline though I guess that just gave me a boot-time dump - which mostly didn't make it into my dmesg. I accidentally booted with no_console_suspend on the first

Re: Machine crashes right *after* ~successful resume

2014-10-27 Thread Yinghai Lu
On Mon, Oct 27, 2014 at 3:50 AM, Wilmer van der Gaast wil...@gaast.net wrote: http://gaast.net/~wilmer/.lkml/bad3.17-pcidumps.txt [ 252.028142] PCI: :04:00.0 : ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0010: ff ff ff ff ff ff ff ff 04:00.0 PCI bridge: Integrated Technology

Re: Machine crashes right *after* ~successful resume

2014-10-27 Thread Pavel Machek
On Mon 2014-10-27 10:50:04, Wilmer van der Gaast wrote: Hello Yinghai, Thanks again for your time! I've applied your two patches, and as a wild guess also added pci=dump to my kernel cmdline though I guess that just gave me a boot-time dump - which mostly didn't make it into my dmesg.

Re: Machine crashes right *after* ~successful resume

2014-10-27 Thread Yinghai Lu
On Mon, Oct 27, 2014 at 3:22 PM, Wilmer van der Gaast wil...@gaast.net wrote: Hello, On 27-10-14 18:23, Yinghai Lu wrote: 04:00.0 PCI bridge: Integrated Technology Express, Inc. Device 8892 So that ITE will not work after suspend/resume? Even after the first one already, you mean? Yes.

Re: Machine crashes right *after* ~successful resume

2014-10-27 Thread Yinghai Lu
On Mon, Oct 27, 2014 at 5:03 PM, Wilmer van der Gaast wil...@gaast.net wrote: I was curious about that already, did that with a 3.16.6 that I think just had your revert applied (and using lspci - to get the dump which I assumed would be the same): No changes to 04:00 at all. Confirmed

Re: Machine crashes right *after* ~successful resume

2014-10-27 Thread Yinghai Lu
On Mon, Oct 27, 2014 at 6:12 PM, Yinghai Lu ying...@kernel.org wrote: On Mon, Oct 27, 2014 at 5:03 PM, Wilmer van der Gaast wil...@gaast.net wrote: I was curious about that already, did that with a 3.16.6 that I think just had your revert applied (and using lspci - to get the dump which I

Re: Machine crashes right *after* ~successful resume

2014-10-26 Thread Yinghai Lu
On Wed, Oct 22, 2014 at 5:53 AM, Wilmer van der Gaast wrote: > That seems to be the case yes: > > [ 106.661152] PM: ... nb fw_pm_notify+0x0/0x150 done > [ 106.665939] PM: calling nb bsp_pm_callback+0x0/0x50 > [ 106.670814] PM: ... nb bsp_pm_callback+0x0/0x50 done > [ 106.675775]

Re: Machine crashes right *after* ~successful resume

2014-10-26 Thread Yinghai Lu
On Wed, Oct 22, 2014 at 5:53 AM, Wilmer van der Gaast wil...@gaast.net wrote: That seems to be the case yes: [ 106.661152] PM: ... nb fw_pm_notify+0x0/0x150 done [ 106.665939] PM: calling nb bsp_pm_callback+0x0/0x50 [ 106.670814] PM: ... nb bsp_pm_callback+0x0/0x50 done [ 106.675775]

Re: Machine crashes right *after* ~successful resume

2014-10-22 Thread Wilmer van der Gaast
Hello Yinghai, This looks more promising! Yinghai Lu (ying...@kernel.org) wrote: > > > > And then nothing, and it's hung. Looks the same to me (apart from the tsc > > issues + hpet switch) as a successful resume: > > then it stuck in pm_restore_console()? > That seems to be the case yes: [

Re: Machine crashes right *after* ~successful resume

2014-10-22 Thread Wilmer van der Gaast
Hello Yinghai, This looks more promising! Yinghai Lu (ying...@kernel.org) wrote: And then nothing, and it's hung. Looks the same to me (apart from the tsc issues + hpet switch) as a successful resume: then it stuck in pm_restore_console()? That seems to be the case yes: [

Re: Machine crashes right *after* ~successful resume

2014-10-21 Thread Yinghai Lu
On Tue, Oct 21, 2014 at 2:40 PM, Wilmer van der Gaast wrote: > Hello, > > Sorry for the delay, finally poked at this again. It looks like the > no_console_suspend flag was causing troubles, which I didn't really need > anyway with logging going to my serial port. > > This is what I get now on the

Re: Machine crashes right *after* ~successful resume

2014-10-21 Thread Wilmer van der Gaast
Hello, Sorry for the delay, finally poked at this again. It looks like the no_console_suspend flag was causing troubles, which I didn't really need anyway with logging going to my serial port. This is what I get now on the failing resume: [ 112.879390] PM: resume of devices complete after

Re: Machine crashes right *after* ~successful resume

2014-10-21 Thread Yinghai Lu
On Tue, Oct 21, 2014 at 2:40 PM, Wilmer van der Gaast wil...@gaast.net wrote: Hello, Sorry for the delay, finally poked at this again. It looks like the no_console_suspend flag was causing troubles, which I didn't really need anyway with logging going to my serial port. This is what I get

Re: Machine crashes right *after* ~successful resume

2014-10-19 Thread Wilmer van der Gaast
Hello, On 19-10-14 05:29, Yinghai Lu wrote: Please try to "debug ignore_loglevel no_console_suspend". Same thing. :-( [ 72.572354] Restarting tasks ... done. [ 72.576554] PM: calling nb rcu_pm_notify+0x0/0x60 [ 72.581277] PM: ... nb rcu_pm_notify+0x0/0x60 done [ 72.586115] PM:
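
The "PM: calling nb ..." and "PM: ... nb ... done" lines in this log appear to come from the debug patch discussed in the thread; they bracket each PM notifier call. When a resume hangs, the interesting entry is a notifier that was called but never reported done. The sketch below shows one way to pull that out of a captured serial log. It is an illustration in Python under that assumption about the line format; the names CALLING, DONE and unfinished_notifiers are hypothetical.

```python
import re

# Line shapes taken from the log excerpt quoted in the thread.
CALLING = re.compile(r"PM: calling nb (\S+)")
DONE = re.compile(r"PM: \.\.\. nb (\S+) done")


def unfinished_notifiers(log_text):
    """Return notifier symbols that were called but never reported done."""
    pending = []
    for line in log_text.splitlines():
        called = CALLING.search(line)
        if called:
            pending.append(called.group(1))
            continue
        finished = DONE.search(line)
        if finished and finished.group(1) in pending:
            # Drop the earliest outstanding call for this symbol.
            pending.remove(finished.group(1))
    return pending
```

If the function returns an empty list, every notifier in the capture completed, and the hang has to be after the notifier chain.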

Re: Machine crashes right *after* ~successful resume

2014-10-19 Thread Pavel Machek
On Sun 2014-10-19 00:57:12, Wilmer van der Gaast wrote: > (Resending, forgot to hit reply-to-all.) > > Hello Yinghai, > > On 18-10-14 22:28, Yinghai Lu wrote: > > > > Please apply attached debug patch on top of v3.17 and boot with > > "debug ignore_loglevel initcall_debug no_console_suspend". >

Re: Machine crashes right *after* ~successful resume

2014-10-19 Thread Pavel Machek
On Sun 2014-10-19 00:57:12, Wilmer van der Gaast wrote: (Resending, forgot to hit reply-to-all.) Hello Yinghai, On 18-10-14 22:28, Yinghai Lu wrote: Please apply attached debug patch on top of v3.17 and boot with debug ignore_loglevel initcall_debug no_console_suspend. Hope we can

Re: Machine crashes right *after* ~successful resume

2014-10-19 Thread Wilmer van der Gaast
Hello, On 19-10-14 05:29, Yinghai Lu wrote: Please try to debug ignore_loglevel no_console_suspend. Same thing. :-( [ 72.572354] Restarting tasks ... done. [ 72.576554] PM: calling nb rcu_pm_notify+0x0/0x60 [ 72.581277] PM: ... nb rcu_pm_notify+0x0/0x60 done [ 72.586115] PM: calling

Re: Machine crashes right *after* ~successful resume

2014-10-18 Thread Yinghai Lu
On Sat, Oct 18, 2014 at 4:57 PM, Wilmer van der Gaast wrote: > On 18-10-14 22:28, Yinghai Lu wrote: >> >> Please apply attached debug patch on top of v3.17 and boot with >> "debug ignore_loglevel initcall_debug no_console_suspend". >> >> Hope we can find out which nb notifier cause problem. >> >

Re: Machine crashes right *after* ~successful resume

2014-10-18 Thread Wilmer van der Gaast
(Resending, forgot to hit reply-to-all.) Hello Yinghai, On 18-10-14 22:28, Yinghai Lu wrote: > > Please apply attached debug patch on top of v3.17 and boot with > "debug ignore_loglevel initcall_debug no_console_suspend". > > Hope we can find out which nb notifier cause problem. > Did that.

Re: Machine crashes right *after* ~successful resume

2014-10-18 Thread Yinghai Lu
On Thu, Oct 16, 2014 at 2:08 PM, Wilmer van der Gaast wrote: > Did that on this run, no difference either. For full completeness, I > reproduced this problem with no modules loaded (done from initramfs) at all, > with a kernel with your workaround included, logs are here: >

Re: Machine crashes right *after* ~successful resume

2014-10-18 Thread Yinghai Lu
On Thu, Oct 16, 2014 at 2:08 PM, Wilmer van der Gaast wil...@gaast.net wrote: Did that on this run, no difference either. For full completeness, I reproduced this problem with no modules loaded (done from initramfs) at all, with a kernel with your workaround included, logs are here:

Re: Machine crashes right *after* ~successful resume

2014-10-18 Thread Wilmer van der Gaast
(Resending, forgot to hit reply-to-all.) Hello Yinghai, On 18-10-14 22:28, Yinghai Lu wrote: Please apply attached debug patch on top of v3.17 and boot with debug ignore_loglevel initcall_debug no_console_suspend. Hope we can find out which nb notifier cause problem. Did that. Strangely,

Re: Machine crashes right *after* ~successful resume

2014-10-18 Thread Yinghai Lu
On Sat, Oct 18, 2014 at 4:57 PM, Wilmer van der Gaast wil...@gaast.net wrote: On 18-10-14 22:28, Yinghai Lu wrote: Please apply attached debug patch on top of v3.17 and boot with debug ignore_loglevel initcall_debug no_console_suspend. Hope we can find out which nb notifier cause problem.

Re: Machine crashes right *after* ~successful resume

2014-10-16 Thread Wilmer van der Gaast
Hello, I have filed a bug now: https://bugzilla.kernel.org/show_bug.cgi?id=86421 We should probably continue the discussion there now? I've added just you to the CC field, not sure who else on this thread is still interested at this point. On 16-10-14 17:36, Yinghai Lu wrote: Can you put

Re: Machine crashes right *after* ~successful resume

2014-10-16 Thread Yinghai Lu
On Thu, Oct 16, 2014 at 2:36 AM, Wilmer van der Gaast wrote: > Hello, > > On 16-10-14 05:32, Yinghai Lu wrote: >> >> >> Can you please try attached patch? that should workaround the problem. >> > Sadly, no luck. (I do assume you meant me to use the patch against a clean > 3.17 tree *without*

Re: Machine crashes right *after* ~successful resume

2014-10-16 Thread Wilmer van der Gaast
Hello, On 16-10-14 05:32, Yinghai Lu wrote: Can you please try attached patch? that should workaround the problem. Sadly, no luck. (I do assume you meant me to use the patch against a clean 3.17 tree *without* yesterday's revert patch applied.) Back to a crash at/after the third resume: [

Re: Machine crashes right *after* ~successful resume

2014-10-16 Thread Yinghai Lu
On Thu, Oct 16, 2014 at 2:36 AM, Wilmer van der Gaast wil...@gaast.net wrote: Hello, On 16-10-14 05:32, Yinghai Lu wrote: Can you please try attached patch? that should workaround the problem. Sadly, no luck. (I do assume you meant me to use the patch against a clean 3.17 tree *without*

Re: Machine crashes right *after* ~successful resume

2014-10-15 Thread Yinghai Lu
On Wed, Oct 15, 2014 at 4:34 PM, Wilmer van der Gaast wrote: > > Is there anything I can do now to find out why your change is causing my > machine to crash? Can you please try attached patch? that should workaround the problem. as some driver is using pci_enable_device in .resume instead of

Re: Machine crashes right *after* ~successful resume

2014-10-15 Thread Wilmer van der Gaast
Hello Yinghai, On 15-10-14 19:39, Yinghai Lu wrote: so third resume will not work? that is strange. second and third should not use same code path... Always exactly the third time, yes. Seems strange indeed. :-( I was under the impression that on each resume, completion time of device

Re: Machine crashes right *after* ~successful resume

2014-10-15 Thread Yinghai Lu
On Wed, Oct 15, 2014 at 6:58 AM, Bjorn Helgaas wrote: > [+cc Yinghai, author of 928bea964827 ("PCI: Delay enabling bridges > until they're needed")] > > On Wed, Oct 15, 2014 at 5:16 AM, Wilmer van der Gaast >> Not sure why 2e8b... was initially found guilty by git bisect, I fear >> that my

Re: Machine crashes right *after* ~successful resume

2014-10-15 Thread Bjorn Helgaas
[+cc Yinghai, author of 928bea964827 ("PCI: Delay enabling bridges until they're needed")] On Wed, Oct 15, 2014 at 5:16 AM, Wilmer van der Gaast wrote: > Hello Rafael, > > Rafael J. Wysocki (r...@rjwysocki.net) wrote: >> > Would it be feasible to revert 2e8b... to see if it fixes it on 3.17? >>

Re: Machine crashes right *after* ~successful resume

2014-10-15 Thread Wilmer van der Gaast
Hello Rafael, Rafael J. Wysocki (r...@rjwysocki.net) wrote: > > Would it be feasible to revert 2e8b... to see if it fixes it on 3.17? > That's a merge, isn't it? > Correct, it was, and I did try to figure out which of its parents was the guilty one, but then I found out the real problem is

Re: Machine crashes right *after* ~successful resume

2014-10-13 Thread Rafael J. Wysocki
On Sunday, October 12, 2014 10:40:32 PM Pavel Machek wrote:
> Bjorn, any ideas?
>
> Would it be feasible to revert 2e8b... to see if it fixes it on 3.17?

That's a merge, isn't it? I'd rather check what the pci/misc branch was based on and then bisect that branch. If you do $ git show fed2451

Re: Machine crashes right *after* ~successful resume

2014-10-12 Thread Wilmer van der Gaast
On 12-10-14 21:40, Pavel Machek wrote: Bjorn, any ideas? Would it be feasible to revert 2e8b... to see if it fixes it on 3.17? I've tried this, too many conflicts unfortunately. Just noticed this message appear during failing resumes by the way: [ 54.203072] Clocksource tsc unstable

Re: Machine crashes right *after* ~successful resume

2014-10-12 Thread Pavel Machek
Bjorn, any ideas? Would it be feasible to revert 2e8b... to see if it fixes it on 3.17?

Thanks, Pavel

On Sun 2014-10-12 16:49:18, Wilmer van der Gaast wrote:
> Hello,
>
> Many thanks for your response!
>
> On 12-10-14

Re: Machine crashes right *after* ~successful resume

2014-10-12 Thread Wilmer van der Gaast
Hello, Many thanks for your response! On 12-10-14 15:30, Pavel Machek wrote: Has it ever worked ok? ...aha, in 3.10, ok. Correct. And I've tried a few more kernels now, compiled on my own. 3.17 still has this issue, 3.10 is completely fine all the way up to 3.10.57 (I've tested just under

Re: Machine crashes right *after* ~successful resume

2014-10-12 Thread Pavel Machek
Hi!

> Rafael, including you on this since
> http://linuxconcloudopenna2013.sched.org/event/d708f47d07cd44b9669610778c024708#.VDRzTDS_EUF
> mentions you as the maintainer for Linux + power management. I hope this is
> still accurate.
>
> Since Linux 3.12 (Debian version 3.12.9-1~bpo70+1) and all
