Re: 2.6.21rc suspend to ram regression on Lenovo X60
Pavel Machek wrote:
> Hi!
>
> > > Seeing a couple of MSI changes in there, on a hunch I booted latest tree with
> > > pci=nomsi, and it resumed again.
> > >
> > > Any ideas how to further debug this?
> > > I'll try backing out individual changes from that merge tomorrow.
> >
> > Thanks.
> >
> > Of those msi patches you have identified I don't see anything really
> > obvious. And you actually marked them as good in your bisect so
> > I don't expect it is a core problem.
> >
> > We do have a known e1000 regression, with msi and suspend/resume.

still? I tested this against rc3 and it's mostly just fine. even with msi enabled.

> > So it is possible the nomsi avoided a driver problem. Especially
> > as we have a number of driver changes on Linus's side of
> > that merge.
> >
> > I also know we have some known issues with pci_save_state and
> > pci_restore_state that require them to be paired for correct
> > operation. For suspend and resume that is not generally a problem.
> >
> > I have fixes for the pci_save_state and pci_restore_state in the -mm
> > and gregkh trees. Since they also happen to fix the e1000 driver as
> > a side effect they are worth looking at, at least if you have an
> > e1000.

hey, please include me on those!

> > I don't have a clue which hardware the x60 has so I don't know which
> > drivers it would be using.
>
> x60 indeed has e1000.

yup.

Auke

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.21rc suspend to ram regression on Lenovo X60
Hi!

> > Seeing a couple of MSI changes in there, on a hunch I booted latest tree with
> > pci=nomsi, and it resumed again.
> >
> > Any ideas how to further debug this?
> > I'll try backing out individual changes from that merge tomorrow.
>
> Thanks.
>
> Of those msi patches you have identified I don't see anything really
> obvious. And you actually marked them as good in your bisect so
> I don't expect it is a core problem.
>
> We do have a known e1000 regression, with msi and suspend/resume.
> So it is possible the nomsi avoided a driver problem. Especially
> as we have a number of driver changes on Linus's side of
> that merge.
>
> I also know we have some known issues with pci_save_state and
> pci_restore_state that require them to be paired for correct
> operation. For suspend and resume that is not generally a problem.
>
> I have fixes for the pci_save_state and pci_restore_state in the -mm
> and gregkh trees. Since they also happen to fix the e1000 driver as
> a side effect they are worth looking at, at least if you have an
> e1000.
>
> I don't have a clue which hardware the x60 has so I don't know which
> drivers it would be using.

x60 indeed has e1000.

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: 2.6.21rc suspend to ram regression on Lenovo X60
On Thu, Mar 15, 2007 at 12:45:20PM -0700, Jeremy Fitzhardinge wrote:
> Dave Jones wrote:
> > I just did a build of top of tree, including those commits, and
> > it's still broken. Booting with pci=nomsi no longer 'fixes' it
> > though, which may indicate that the MSI changes were a red herring.
> > (Or that the subsequent changes have regressed it even more,
> > which seems unlikely looking at the changes).
>
> I just found the same thing on my X60. Current top-of-tree with
> pci=nomsi does not improve things. When it resumes, the CPU is working
> (capslock toggles, sysreq-b reboots), but the screen is blank.

Yeah, I noticed the capslock works. Networking doesn't come back up
though, and it doesn't seem to respond to commands that I type blindly.
Even trying to do something like..

pm-suspend ; dmesg >dmesg.out; /sbin/reboot

doesn't seem to execute the commands on resume.
Switching ttys to X with alt-f7 seems to lock it up to the point
that even capslock doesn't work any more.

I'll try and hook up a usb serial cable and see if I'm lucky enough
to get something useful out of it in the absence of a serial port..

Dave

--
http://www.codemonkey.org.uk
Re: 2.6.21rc suspend to ram regression on Lenovo X60
Dave Jones wrote:
> I just did a build of top of tree, including those commits, and
> it's still broken. Booting with pci=nomsi no longer 'fixes' it
> though, which may indicate that the MSI changes were a red herring.
> (Or that the subsequent changes have regressed it even more,
> which seems unlikely looking at the changes).

I just found the same thing on my X60. Current top-of-tree with
pci=nomsi does not improve things. When it resumes, the CPU is working
(capslock toggles, sysreq-b reboots), but the screen is blank.

I was about to try 2.6.21-rc3-mm2; I'll see if that's any different.

J
Re: 2.6.21rc suspend to ram regression on Lenovo X60
Dave Jones <[EMAIL PROTECTED]> writes:

> I just did a build of top of tree, including those commits, and
> it's still broken. Booting with pci=nomsi no longer 'fixes' it
> though, which may indicate that the MSI changes were a red herring.
> (Or that the subsequent changes have regressed it even more,
> which seems unlikely looking at the changes).
>
> .. or it could be something else introduced between rc3 (which is
> what my bisect was based on) and today's tree.
>
> sigh. I'll do more bisecting after lunch.

Thanks. It is good to know that things are worse, even if that isn't good news.

Eric
Re: 2.6.21rc suspend to ram regression on Lenovo X60
On Thu, Mar 15, 2007 at 10:11:01AM -0600, Eric W. Biederman wrote:
> Dave Jones <[EMAIL PROTECTED]> writes:
>
> > On Tue, Mar 13, 2007 at 10:22:53AM +0100, Rafael J. Wysocki wrote:
> > > On Tuesday, 13 March 2007 05:08, Dave Jones wrote:
> > > > I spent considerable time over the last day or so bisecting to
> > > > find out why an X60 stopped resuming somewhen between 2.6.20 and current -git.
> > > > (Total lockup, black screen of death).
> > >
> > > Do you have CONFIG_TICK_ONESHOT or CONFIG_NO_HZ set? If you do, could you
> > > please unset them and retest?
> >
> > I did try with NO_HZ unset, made no difference, I don't recall TICK_ONESHOT.
> > I'm in meetings all day, but I'll check when I get home.
>
> I haven't heard anything more on this thread.
>
> I just wanted to double check. The tree that failed did it include
> commits:
> 392ee1e6dd901db6c4504617476f6442ed91f72d and
> 9f35575dfc172f0a93fb464761883c8f49599b7a
>
> Mostly I was wondering if any of my later work to sort out msi
> suspend/resume actually solved anything.

I just did a build of top of tree, including those commits, and
it's still broken. Booting with pci=nomsi no longer 'fixes' it
though, which may indicate that the MSI changes were a red herring.
(Or that the subsequent changes have regressed it even more,
which seems unlikely looking at the changes).

.. or it could be something else introduced between rc3 (which is
what my bisect was based on) and today's tree.

sigh. I'll do more bisecting after lunch.

Dave

--
http://www.codemonkey.org.uk
Re: 2.6.21rc suspend to ram regression on Lenovo X60
On Thu, Mar 15, 2007 at 10:11:01AM -0600, Eric W. Biederman wrote:

> I haven't heard anything more on this thread.

Sorry, I've been stuck in meetings the last two days..

> I just wanted to double check. The tree that failed did it include
> commits:
> 392ee1e6dd901db6c4504617476f6442ed91f72d and
> 9f35575dfc172f0a93fb464761883c8f49599b7a
>
> Mostly I was wondering if any of my later work to sort out msi
> suspend/resume actually solved anything.

I'll kick off some compiles and find out.

Dave

--
http://www.codemonkey.org.uk
Re: 2.6.21rc suspend to ram regression on Lenovo X60
Dave Jones <[EMAIL PROTECTED]> writes:

> On Tue, Mar 13, 2007 at 10:22:53AM +0100, Rafael J. Wysocki wrote:
> > On Tuesday, 13 March 2007 05:08, Dave Jones wrote:
> > > I spent considerable time over the last day or so bisecting to
> > > find out why an X60 stopped resuming somewhen between 2.6.20 and current -git.
> > > (Total lockup, black screen of death).
> >
> > Do you have CONFIG_TICK_ONESHOT or CONFIG_NO_HZ set? If you do, could you
> > please unset them and retest?
>
> I did try with NO_HZ unset, made no difference, I don't recall TICK_ONESHOT.
> I'm in meetings all day, but I'll check when I get home.

I haven't heard anything more on this thread.

I just wanted to double check. The tree that failed did it include
commits:
392ee1e6dd901db6c4504617476f6442ed91f72d and
9f35575dfc172f0a93fb464761883c8f49599b7a

Mostly I was wondering if any of my later work to sort out msi
suspend/resume actually solved anything.

Eric
Re: 2.6.21rc suspend to ram regression on Lenovo X60
On Tue, Mar 13, 2007 at 12:08:28AM -0400, Dave Jones wrote:
> I spent considerable time over the last day or so bisecting to
> find out why an X60 stopped resuming somewhen between 2.6.20 and current -git.
> (Total lockup, black screen of death).
>
> The bisect log looked like this.
> ...
> Any ideas how to further debug this?
> I'll try backing out individual changes from that merge tomorrow.

If you've got a tree that looks like:

--a-b-c-d-e-f-g-h->
   \           /
    i-j-k-l-m-n

where h is bad but both g and n are good, you can try testing the
merge of g+k, etc. Which will find half the problem. Then you can do
the same on the other side. Tedious.

The best way to debug resume issues directly seems to be to do a fake
suspend, possibly with filtering out particular devices:

http://lwn.net/Articles/219033/
http://www.uwsg.iu.edu/hypermail/linux/kernel/0701.3/0397.html

--
Mathematics is the supreme nostalgia of our time.
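The "merge of g+k" test suggested above could be set up roughly as follows. This is only a sketch, assuming a reasonably current git (the `--detach`, `--no-edit` and `--abort` options are newer than the git of this thread's era); `try_partial_merge` is a hypothetical helper name, and both arguments are placeholders for real commit ids.

```shell
# Hypothetical helper: build a throwaway merge of one mainline-side commit
# (g in the diagram) and one branch-side commit (k in the diagram) so that
# the combination can be built, booted and tested.
try_partial_merge() {
    mainline_commit=$1
    branch_commit=$2

    # Detach HEAD at the mainline commit so no existing branch is moved.
    git checkout --quiet --detach "$mainline_commit" || return 1

    # Merge the branch-side commit. If it conflicts, undo the attempt and
    # report failure instead of leaving a half-merged tree behind.
    if ! git merge --quiet --no-edit "$branch_commit"; then
        git merge --abort
        return 1
    fi
}

# Usage (commit ids are placeholders):
#   try_partial_merge <g> <k>
```

If the g+k merge resumes correctly, the next candidate would be g+l or g+m; if it fails, g+i or g+j. The same halving can then be repeated with n fixed and the mainline side varied.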
Re: 2.6.21rc suspend to ram regression on Lenovo X60
On Tue, Mar 13, 2007 at 10:22:53AM +0100, Rafael J. Wysocki wrote:
> On Tuesday, 13 March 2007 05:08, Dave Jones wrote:
> > I spent considerable time over the last day or so bisecting to
> > find out why an X60 stopped resuming somewhen between 2.6.20 and current -git.
> > (Total lockup, black screen of death).
>
> Do you have CONFIG_TICK_ONESHOT or CONFIG_NO_HZ set? If you do, could you
> please unset them and retest?

I did try with NO_HZ unset, made no difference, I don't recall TICK_ONESHOT.
I'm in meetings all day, but I'll check when I get home.

Dave

--
http://www.codemonkey.org.uk
Re: 2.6.21rc suspend to ram regression on Lenovo X60
On Tuesday, 13 March 2007 05:08, Dave Jones wrote:
> I spent considerable time over the last day or so bisecting to
> find out why an X60 stopped resuming somewhen between 2.6.20 and current -git.
> (Total lockup, black screen of death).

Do you have CONFIG_TICK_ONESHOT or CONFIG_NO_HZ set? If you do, could you
please unset them and retest?

Thanks,
Rafael
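One quick way to answer the question above is to look the two options up in the kernel config that was used for the failing build. A minimal sketch, assuming the config is available as a plain-text file; the default path `.config` and the helper name `show_tick_options` are assumptions, not something given in the thread.

```shell
# Hypothetical helper: report whether CONFIG_TICK_ONESHOT and CONFIG_NO_HZ
# are enabled (=y) in a kernel config file.
show_tick_options() {
    config_file=${1:-.config}   # assumed default: .config in the build tree
    for opt in CONFIG_TICK_ONESHOT CONFIG_NO_HZ; do
        # Enabled boolean options appear as "NAME=y"; disabled ones appear
        # as a "# NAME is not set" comment or are absent entirely.
        if grep -q "^${opt}=y" "$config_file"; then
            echo "$opt is set"
        else
            echo "$opt is not set"
        fi
    done
}
```

Calling `show_tick_options /path/to/.config` prints one line per option, which is enough to tell whether a retest with both options unset is still needed.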
Re: 2.6.21rc suspend to ram regression on Lenovo X60
Dave Jones <[EMAIL PROTECTED]> writes:

> I spent considerable time over the last day or so bisecting to
> find out why an X60 stopped resuming somewhen between 2.6.20 and current -git.
> (Total lockup, black screen of death).
>
> The bisect log looked like this.
>
> git-bisect start
> # bad: [c8f71b01a50597e298dc3214a2f2be7b8d31170c] Linux 2.6.21-rc1
> git-bisect bad c8f71b01a50597e298dc3214a2f2be7b8d31170c
> # good: [fa285a3d7924a0e3782926e51f16865c5129a2f7] Linux 2.6.20
> git-bisect good fa285a3d7924a0e3782926e51f16865c5129a2f7
> # bad: [574009c1a895aeeb85eaab29c235d75852b09eb8] Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
> git-bisect bad 574009c1a895aeeb85eaab29c235d75852b09eb8
> # bad: [43187902cbfafe73ede0144166b741fb0f7d04e1] Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6
> git-bisect bad 43187902cbfafe73ede0144166b741fb0f7d04e1
> # good: [1545085a28f226b59c243f88b82ea25393b0d63f] drm: Allow for 44 bit user-tokens (or drm_file offsets)
> git-bisect good 1545085a28f226b59c243f88b82ea25393b0d63f
> # good: [c96e2c92072d3e78954c961f53d8c7352f7abbd7] Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/usb-2.6
> git-bisect good c96e2c92072d3e78954c961f53d8c7352f7abbd7
> # good: [31c56d820e03a2fd47f81d6c826f92caf511f9ee] [POWERPC] pasemi: iommu support
> git-bisect good 31c56d820e03a2fd47f81d6c826f92caf511f9ee
> # bad: [78149df6d565c36675463352d0bfeb02b7a7] Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6
> git-bisect bad 78149df6d565c36675463352d0bfeb02b7a7
> # good: [3d9c18872fa1db5c43ab97d8cbca43775998e49c] shpchp: remove CONFIG_HOTPLUG_PCI_SHPC_POLL_EVENT_MODE
> git-bisect good 3d9c18872fa1db5c43ab97d8cbca43775998e49c
> # good: [88187dfa4d8bb565df762f272511d2c91e427e0d] MSI: Replace pci_msi_quirk with calls to pci_no_msi()
> git-bisect good 88187dfa4d8bb565df762f272511d2c91e427e0d
> # good: [866a8c87c4e51046602387953bbef76992107bcb] msi: Fix msi_remove_pci_irq_vectors.
> git-bisect good 866a8c87c4e51046602387953bbef76992107bcb
> # good: [f7feaca77d6ad6bcfcc88ac54e3188970448d6fe] msi: Make MSI useable more architectures
> git-bisect good f7feaca77d6ad6bcfcc88ac54e3188970448d6fe
> # good: [14719f325e1cd4ff757587e9a221ebaf394563ee] Revert "PCI: remove duplicate device id from ata_piix"
> git-bisect good 14719f325e1cd4ff757587e9a221ebaf394563ee
>
> which led me to a final 'bad' commit of 78149df6d565c36675463352d0bfeb02b7a7
> which is a merge changeset of lots of PCI bits.

Ok. This is weird. It looks like you marked the merge bad but its
individual commits as good. Which would indicate a problem on one of the
branches it was merged with, or a problem that only shows up when
both groups of changes are present.

> Seeing a couple of MSI changes in there, on a hunch I booted latest tree with
> pci=nomsi, and it resumed again.
>
> Any ideas how to further debug this?
> I'll try backing out individual changes from that merge tomorrow.

Thanks.

Of those msi patches you have identified I don't see anything really
obvious. And you actually marked them as good in your bisect so
I don't expect it is a core problem.

We do have a known e1000 regression, with msi and suspend/resume.
So it is possible the nomsi avoided a driver problem. Especially
as we have a number of driver changes on Linus's side of
that merge.

I also know we have some known issues with pci_save_state and
pci_restore_state that require them to be paired for correct
operation. For suspend and resume that is not generally a problem.

I have fixes for the pci_save_state and pci_restore_state in the -mm
and gregkh trees. Since they also happen to fix the e1000 driver as
a side effect they are worth looking at, at least if you have an
e1000.

I don't have a clue which hardware the x60 has so I don't know which
drivers it would be using.

Eric
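On the open question of which drivers the X60 is actually using: on a running system this can usually be read from sysfs. The sketch below assumes a kernel that exposes a `driver` symlink for each bound device under /sys/bus/pci/devices (an assumption about the machine being tested, not something stated in this thread); `list_pci_drivers` is a hypothetical helper name.

```shell
# Hypothetical helper: print "<pci address> <driver name>" for every PCI
# device that currently has a driver bound.
list_pci_drivers() {
    sysfs_root=${1:-/sys/bus/pci/devices}   # assumed default sysfs location
    for dev in "$sysfs_root"/*; do
        # Devices without a bound driver have no "driver" symlink; skip them.
        [ -L "$dev/driver" ] || continue
        # The symlink target ends in the driver's name, e.g. .../drivers/e1000
        printf '%s %s\n' "$(basename "$dev")" \
            "$(basename "$(readlink "$dev/driver")")"
    done
}
```

Something like `list_pci_drivers | grep e1000` would then show whether the e1000 driver is bound to a device on the machine.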
Re: 2.6.21rc suspend to ram regression on Lenovo X60
Dave Jones [EMAIL PROTECTED] writes: I spent considerable time over the last day or so bisecting to find out why an X60 stopped resuming somewhen between 2.6.20 and current -git. (Total lockup, black screen of death). The bisect log looked like this. git-bisect start # bad: [c8f71b01a50597e298dc3214a2f2be7b8d31170c] Linux 2.6.21-rc1 git-bisect bad c8f71b01a50597e298dc3214a2f2be7b8d31170c # good: [fa285a3d7924a0e3782926e51f16865c5129a2f7] Linux 2.6.20 git-bisect good fa285a3d7924a0e3782926e51f16865c5129a2f7 # bad: [574009c1a895aeeb85eaab29c235d75852b09eb8] Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus git-bisect bad 574009c1a895aeeb85eaab29c235d75852b09eb8 # bad: [43187902cbfafe73ede0144166b741fb0f7d04e1] Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6 git-bisect bad 43187902cbfafe73ede0144166b741fb0f7d04e1 # good: [1545085a28f226b59c243f88b82ea25393b0d63f] drm: Allow for 44 bit user-tokens (or drm_file offsets) git-bisect good 1545085a28f226b59c243f88b82ea25393b0d63f # good: [c96e2c92072d3e78954c961f53d8c7352f7abbd7] Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/usb-2.6 git-bisect good c96e2c92072d3e78954c961f53d8c7352f7abbd7 # good: [31c56d820e03a2fd47f81d6c826f92caf511f9ee] [POWERPC] pasemi: iommu support git-bisect good 31c56d820e03a2fd47f81d6c826f92caf511f9ee # bad: [78149df6d565c36675463352d0bfeb02b7a7] Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6 git-bisect bad 78149df6d565c36675463352d0bfeb02b7a7 # good: [3d9c18872fa1db5c43ab97d8cbca43775998e49c] shpchp: remove CONFIG_HOTPLUG_PCI_SHPC_POLL_EVENT_MODE git-bisect good 3d9c18872fa1db5c43ab97d8cbca43775998e49c # good: [88187dfa4d8bb565df762f272511d2c91e427e0d] MSI: Replace pci_msi_quirk with calls to pci_no_msi() git-bisect good 88187dfa4d8bb565df762f272511d2c91e427e0d # good: [866a8c87c4e51046602387953bbef76992107bcb] msi: Fix msi_remove_pci_irq_vectors. 
git-bisect good 866a8c87c4e51046602387953bbef76992107bcb # good: [f7feaca77d6ad6bcfcc88ac54e3188970448d6fe] msi: Make MSI useable more architectures git-bisect good f7feaca77d6ad6bcfcc88ac54e3188970448d6fe # good: [14719f325e1cd4ff757587e9a221ebaf394563ee] Revert PCI: remove duplicate device id from ata_piix git-bisect good 14719f325e1cd4ff757587e9a221ebaf394563ee which led me to a final 'bad' commit of 78149df6d565c36675463352d0bfeb02b7a7 which is a merge changeset of lots of PCI bits. Ok. This is weird. It looks like you marked the merge bad but it's individual commits as good Which would indicate a problem on one of the branches it was merged with, or a problem that only shows up when both groups of changes are present. Seeing a couple of MSI changes in there, on a hunch I booted latest tree with pci=nomsi, and it resumed again. Any ideas how to further debug this? I'll try backing out individual changes from that merge tomorrow. Thanks. Of those msi patches you have identified I don't see anything really obvious. And you actually marked them as good in your bisect so I don't expect it is core problem. We do have a known e1000 regression, with msi and suspend/resume. So it is possible the nomsi avoided a driver problem. Especially as we have a number of driver changes on the on Linus's side of that merge. I also know we have some known issues with pci_save_state and pci_restore_state that require them to be paired for correct operation. For suspend and resume that is not generally a problem. I have fixes for the pci_save_state and pci_restore_state in the -mm and gregkh tree's. Since they also happen to fix the e1000 driver as a side effect they are worth looking at, at least if you have an e1000. I don't have a clue which hardware the x60 has so I don't know which drivers it would be using. 
Eric - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.21rc suspend to ram regression on Lenovo X60
On Tuesday, 13 March 2007 05:08, Dave Jones wrote: I spent considerable time over the last day or so bisecting to find out why an X60 stopped resuming somewhen between 2.6.20 and current -git. (Total lockup, black screen of death). Do you have CONFIG_TICK_ONESHOT or CONFIG_NO_HZ set? If you do, could you please unset them and retest? Thanks, Rafael - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.21rc suspend to ram regression on Lenovo X60
On Tue, Mar 13, 2007 at 10:22:53AM +0100, Rafael J. Wysocki wrote: On Tuesday, 13 March 2007 05:08, Dave Jones wrote: I spent considerable time over the last day or so bisecting to find out why an X60 stopped resuming somewhen between 2.6.20 and current -git. (Total lockup, black screen of death). Do you have CONFIG_TICK_ONESHOT or CONFIG_NO_HZ set? If you do, could you please unset them and retest? I did try with NO_HZ unset, made no difference, I don't recall TICK_ONESHOT. I'm in meetings all day, but I'll check when I get home. Dave -- http://www.codemonkey.org.uk - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.21rc suspend to ram regression on Lenovo X60
On Tue, Mar 13, 2007 at 12:08:28AM -0400, Dave Jones wrote:
> I spent considerable time over the last day or so bisecting to find out
> why an X60 stopped resuming somewhen between 2.6.20 and current -git.
> (Total lockup, black screen of death).
>
> The bisect log looked like this.
...
> Any ideas how to further debug this?
> I'll try backing out individual changes from that merge tomorrow.

If you've got a tree that looks like:

--a-b-c-d-e-f-g-h-
   \           /
    i-j-k-l-m-n

where h is bad but both g and n are good, you can try testing the
merge of g+k, etc. Which will find half the problem. Then you can do
the same on the other side. Tedious.

The best way to debug resume issues directly seems to be to do a fake
suspend, possibly with filtering out particular devices:

http://lwn.net/Articles/219033/
http://www.uwsg.iu.edu/hypermail/linux/kernel/0701.3/0397.html

-- 
Mathematics is the supreme nostalgia of our time.
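The "test g+k, then do the same on the other side" procedure described above
amounts to two binary searches. The sketch below only illustrates that
search order; it is not a git feature. The names first_breaking,
bisect_merge and merge_is_bad are hypothetical, and merge_is_bad stands in
for the manual step of merging the two commits, building, booting and
trying a suspend/resume cycle. It also assumes the failure is monotonic
along each side: once a combination breaks, later commits on that side
stay broken.

```python
def first_breaking(candidates, is_bad):
    """Binary search for the first commit in candidates that is_bad flags.

    Assumes candidates[-1] is bad and that every commit after the first
    bad one is bad as well.
    """
    lo, hi = 0, len(candidates) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(candidates[mid]):
            hi = mid          # first bad commit is at mid or earlier
        else:
            lo = mid + 1      # first bad commit is after mid
    return candidates[lo]


def bisect_merge(mainline, branch, merge_is_bad):
    """Find the (mainline, branch) pair whose combination first breaks.

    mainline and branch list the commits on each side of the merge, oldest
    first. merge_is_bad(m, b) reports whether merging m with b fails.
    """
    main_tip = mainline[-1]
    # Half the problem: keep the mainline tip fixed, search along the branch.
    bad_branch = first_breaking(branch, lambda c: merge_is_bad(main_tip, c))
    # Other half: keep that branch commit fixed, search along the mainline.
    bad_main = first_breaking(mainline, lambda c: merge_is_bad(c, bad_branch))
    return bad_main, bad_branch
```

With six commits on each side this needs roughly three test merges per
side rather than one per commit, which is what makes the tedium bearable.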
2.6.21rc suspend to ram regression on Lenovo X60
I spent considerable time over the last day or so bisecting to find out
why an X60 stopped resuming somewhen between 2.6.20 and current -git.
(Total lockup, black screen of death).

The bisect log looked like this.

git-bisect start
# bad: [c8f71b01a50597e298dc3214a2f2be7b8d31170c] Linux 2.6.21-rc1
git-bisect bad c8f71b01a50597e298dc3214a2f2be7b8d31170c
# good: [fa285a3d7924a0e3782926e51f16865c5129a2f7] Linux 2.6.20
git-bisect good fa285a3d7924a0e3782926e51f16865c5129a2f7
# bad: [574009c1a895aeeb85eaab29c235d75852b09eb8] Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
git-bisect bad 574009c1a895aeeb85eaab29c235d75852b09eb8
# bad: [43187902cbfafe73ede0144166b741fb0f7d04e1] Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6
git-bisect bad 43187902cbfafe73ede0144166b741fb0f7d04e1
# good: [1545085a28f226b59c243f88b82ea25393b0d63f] drm: Allow for 44 bit user-tokens (or drm_file offsets)
git-bisect good 1545085a28f226b59c243f88b82ea25393b0d63f
# good: [c96e2c92072d3e78954c961f53d8c7352f7abbd7] Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/usb-2.6
git-bisect good c96e2c92072d3e78954c961f53d8c7352f7abbd7
# good: [31c56d820e03a2fd47f81d6c826f92caf511f9ee] [POWERPC] pasemi: iommu support
git-bisect good 31c56d820e03a2fd47f81d6c826f92caf511f9ee
# bad: [78149df6d565c36675463352d0bfeb02b7a7] Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6
git-bisect bad 78149df6d565c36675463352d0bfeb02b7a7
# good: [3d9c18872fa1db5c43ab97d8cbca43775998e49c] shpchp: remove CONFIG_HOTPLUG_PCI_SHPC_POLL_EVENT_MODE
git-bisect good 3d9c18872fa1db5c43ab97d8cbca43775998e49c
# good: [88187dfa4d8bb565df762f272511d2c91e427e0d] MSI: Replace pci_msi_quirk with calls to pci_no_msi()
git-bisect good 88187dfa4d8bb565df762f272511d2c91e427e0d
# good: [866a8c87c4e51046602387953bbef76992107bcb] msi: Fix msi_remove_pci_irq_vectors.
git-bisect good 866a8c87c4e51046602387953bbef76992107bcb
# good: [f7feaca77d6ad6bcfcc88ac54e3188970448d6fe] msi: Make MSI useable more architectures
git-bisect good f7feaca77d6ad6bcfcc88ac54e3188970448d6fe
# good: [14719f325e1cd4ff757587e9a221ebaf394563ee] Revert "PCI: remove duplicate device id from ata_piix"
git-bisect good 14719f325e1cd4ff757587e9a221ebaf394563ee

which led me to a final 'bad' commit of 78149df6d565c36675463352d0bfeb02b7a7
which is a merge changeset of lots of PCI bits.

Seeing a couple of MSI changes in there, on a hunch I booted latest tree with
pci=nomsi, and it resumed again.

Any ideas how to further debug this?
I'll try backing out individual changes from that merge tomorrow.

	Dave

-- 
http://www.codemonkey.org.uk
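A log like the one in this message can be summarized mechanically to see
where the bisection ended up. The following is a small sketch, assuming the
log has one entry per line in the "# good: [sha] subject" form shown here.
The helper names (summarize_bisect_log, last_bad, is_merge) are invented
for this example, and treating a subject that starts with "Merge " as a
merge commit is only a heuristic.

```python
import re

# Matches the comment lines of a bisect log: "# good: [sha] subject".
ENTRY = re.compile(r"^# (good|bad): \[([0-9a-f]+)\] (.*)$")


def summarize_bisect_log(log_text):
    """Return a list of (verdict, sha, subject) tuples, in log order."""
    entries = []
    for line in log_text.splitlines():
        m = ENTRY.match(line.strip())
        if m:
            entries.append((m.group(1), m.group(2), m.group(3)))
    return entries


def last_bad(entries):
    """Return the most recently marked bad entry, or None if there is none."""
    bad = [e for e in entries if e[0] == "bad"]
    return bad[-1] if bad else None


def is_merge(entry):
    """Heuristic (assumption): merge commits have a subject starting 'Merge '."""
    return entry[2].startswith("Merge ")
```

If last_bad() returns an entry for which is_merge() is true and every later
entry is good, the bisection has ended on a merge with no single commit to
blame, which is the situation reported here.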