Re: [BUG] sdhci regression in 2.6.21-rc2
Andrew Morton wrote:
> Oh, it's just a pain in the ass.  Please don't do it lightly - if there's a
> really good reason then OK.

The mmc code is a mess (mostly my fault for the addition of SD support)
and I'm trying to break things apart to clear the code up. That
unfortunately meant moving files around.

> Plus it helps if the massive file move isn't left sitting in some external
> tree for months.  I mean, it's usually a trivial thing, so do it just a
> week before the pull is due.

My goal with putting it in -mm was to give people a chance to cry bloody
murder before I merged the stuff (which I intend to do immediately post
2.6.21). The changes are all over the place, so to ease my mind I prefer
to have it simmer for a while.

As to making your life easier, send patches to the mmc layer my way and
I'll handle the conflicts. I made the mess so it's only fair that I have
to clean it up.

> But whatever.  What are we going to do about $SUBJECT?

It should go in now, along with a fix to unregister the interrupt. I
have a bunch of fixes I intend to push to Linus today and I'll include
this in that group.

Rgds
--
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer            http://www.rdesktop.org
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG] sdhci regression in 2.6.21-rc2
On Tue, 06 Mar 2007 06:47:32 +0100 Pierre Ossman <[EMAIL PROTECTED]> wrote:

> Andrew Morton wrote:
> >
> > (I'm also inclined to drop the darned mmc tree - am getting rather tired of
> > people moving their files all over the tree all the time).
> >
>
> Fine, I can stop bothering with putting up a test tree and just push the
> stuff to Linus directly if that is what you prefer. Or is moving files
> around barred from the kernel?

Oh, it's just a pain in the ass.  Please don't do it lightly - if there's a
really good reason then OK.

Plus it helps if the massive file move isn't left sitting in some external
tree for months.  I mean, it's usually a trivial thing, so do it just a
week before the pull is due.

But whatever.  What are we going to do about $SUBJECT?
Re: [BUG] sdhci regression in 2.6.21-rc2
Andrew Morton wrote:
>
> (I'm also inclined to drop the darned mmc tree - am getting rather tired of
> people moving their files all over the tree all the time).
>

Fine, I can stop bothering with putting up a test tree and just push the
stuff to Linus directly if that is what you prefer. Or is moving files
around barred from the kernel?

--
     -- Pierre Ossman
Re: [BUG] sdhci regression in 2.6.21-rc2
On Mon, 05 Mar 2007 10:19:55 -0500 Mark Lord <[EMAIL PROTECTED]> wrote:

> The interrupt is shared with another device, which resumes
> earlier than the sdhci controller, and generates an interrupt.
>
> The sdhci interrupt handler runs, sees 0xffffffff in its own
> device's interrupt status, and tries to handle it..
> The reason for the 0xffffffff is that the device is still
> suspended, and *all* regs are reading back 0xffffffff.
>
> So.. the suspend routine should de-register the irq handler,
> and the resume routine should re-register it again.
>
> Or perhaps a simpler kludge like this one, which fixes it for me:
>
> Signed-off-by: Mark Lord <[EMAIL PROTECTED]>
> ---
> --- linux/drivers/mmc/sdhci.c.orig	2007-03-02 15:06:31.0 -0500
> +++ linux/drivers/mmc/sdhci.c	2007-03-05 10:13:51.0 -0500
> @@ -994,7 +994,7 @@
>
>  	intmask = readl(host->ioaddr + SDHCI_INT_STATUS);
>
> -	if (!intmask) {
> +	if (!intmask || intmask == 0xffffffff) {
>  		result = IRQ_NONE;
>  		goto out;
>  	}

This is actually pretty standard handling for a lot of drivers: any device
which can appear on a cardbus or other hot-unpluggable bus and which shares
interrupts needs such treatment.

I don't know whether anything which this driver drives could ever appear on
such a bus, but I'm inclined to just apply it into 2.6.21.

(I'm also inclined to drop the darned mmc tree - am getting rather tired of
people moving their files all over the tree all the time).
Re: [BUG] sdhci regression in 2.6.21-rc2
Pierre Ossman wrote:
> Mark Lord wrote:
>> From linux/Documentation/power/pci.txt:
>
> That conveniently leaves out the part of how to handle when we're not
> getting our stuff back. ;)
>
> But it seems to be the easier route anyway... I'll whip up a patch.

It's probably best all-round.

But another simpler way is to just set a global "am_suspended" bool
in the _suspend() code, and clear it again in the _resume() code,
and check it in the irq handler() code.

And the test for 0xffffffff (my simple patch) is probably good to have
regardless inside the irq handler().

Cheers
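[Editor's note] The flag-based alternative suggested above can be sketched in userspace C. This is only an illustration under stated assumptions, not the driver's code: `struct flag_host`, `flag_suspend()`, `flag_resume()` and `flag_irq()` are hypothetical names, the flag is kept per host rather than as the global the message mentions, and a real kernel handler would also need proper locking or memory ordering around the flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's per-controller state. */
struct flag_host {
	bool am_suspended;    /* set in suspend, cleared in resume */
	uint32_t int_status;  /* models the value read from the status register */
};

/* Mark the device as asleep so the handler stops touching it. */
void flag_suspend(struct flag_host *host)
{
	assert(host != NULL);
	host->am_suspended = true;
}

/* Mark the device as awake again. */
void flag_resume(struct flag_host *host)
{
	assert(host != NULL);
	host->am_suspended = false;
}

/*
 * Returns 1 if the interrupt is claimed, 0 if it is not ours.
 * The suspend flag is checked first; the all-ones test is kept as a
 * second line of defence, as the message suggests.
 */
int flag_irq(const struct flag_host *host)
{
	assert(host != NULL);

	if (host->am_suspended)
		return 0;

	uint32_t intmask = host->int_status;
	if (!intmask || intmask == 0xffffffffu)
		return 0;

	return 1;
}
```

With this shape, an interrupt raised by another device on the shared line while the controller sleeps is declined before any register value is interpreted.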
Re: [BUG] sdhci regression in 2.6.21-rc2
Mark Lord wrote:
> Pierre Ossman wrote:
>>
>> Hmm... I guess it can't be as the interrupt handler isn't associated
>> with a device, just a random pointer.
>>
>> So either release the interrupt (which seems a bit unsafe as then we
>> might not get it back), or handle states at the start of the isr.
>
> From linux/Documentation/power/pci.txt:
>

That conveniently leaves out the part of how to handle when we're not
getting our stuff back. ;)

But it seems to be the easier route anyway... I'll whip up a patch.

Rgds
--
     -- Pierre Ossman
Re: [BUG] sdhci regression in 2.6.21-rc2
Pierre Ossman wrote:
> Pierre Ossman wrote:
>> I'd say it's the kernel calling the interrupt handler of a
>> currently sleeping device. Since we're seeing this problem I assume the
>> kernel's interrupt code isn't aware of PM states?
>
> Hmm... I guess it can't be as the interrupt handler isn't associated with a
> device, just a random pointer.
>
> So either release the interrupt (which seems a bit unsafe as then we might not
> get it back), or handle states at the start of the isr.

From linux/Documentation/power/pci.txt:

A reference implementation
--------------------------
.suspend()
{
	/* driver specific operations */

	/* Disable IRQ */
	free_irq();
	/* If using MSI */
	pci_disable_msi();

	pci_save_state();
	pci_enable_wake();
	/* Disable IO/bus master/irq router */
	pci_disable_device();
	pci_set_power_state(pci_choose_state());
}

.resume()
{
	pci_set_power_state(PCI_D0);
	pci_restore_state();
	/* device's irq possibly is changed, driver should take care */
	pci_enable_device();
	pci_set_master();

	/* if using MSI, device's vector possibly is changed */
	pci_enable_msi();

	request_irq();
	/* driver specific operations; */
}
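[Editor's note] The ordering in the reference implementation (release the IRQ before powering down, request it again only after the device is back) can be modelled in plain userspace C. This is a rough sketch under assumptions, not kernel code: `struct fake_irq_line`, `struct fake_dev`, `fake_suspend()`, `fake_resume()` and `fake_line_fire()` are hypothetical names, and the "line" holds a single handler slot instead of a real shared-handler chain.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*irq_handler_fn)(void *dev_id);

/* Hypothetical one-slot model of an interrupt line. */
struct fake_irq_line {
	irq_handler_fn handler;
	void *dev_id;
};

/* Hypothetical device state. */
struct fake_dev {
	struct fake_irq_line *line;
	int powered;        /* 1 while registers are accessible */
	int handled_count;  /* interrupts the handler has accepted */
};

/* The device's handler: only ever runs while it is registered. */
static int fake_dev_isr(void *dev_id)
{
	struct fake_dev *dev = dev_id;

	dev->handled_count++;
	return 1;
}

/* Deliver an interrupt; returns 1 if a handler claimed it, else 0. */
int fake_line_fire(struct fake_irq_line *line)
{
	assert(line != NULL);
	if (!line->handler)
		return 0;
	return line->handler(line->dev_id);
}

/* Mirrors the suspend order: drop the handler first, then power down. */
void fake_suspend(struct fake_dev *dev)
{
	assert(dev != NULL && dev->line != NULL);
	dev->line->handler = NULL;   /* stands in for free_irq() */
	dev->line->dev_id = NULL;
	dev->powered = 0;            /* stands in for pci_set_power_state() */
}

/* Mirrors the resume order: power up first, then take the IRQ again. */
void fake_resume(struct fake_dev *dev)
{
	assert(dev != NULL && dev->line != NULL);
	dev->powered = 1;                    /* stands in for PCI_D0 + restore */
	dev->line->handler = fake_dev_isr;   /* stands in for request_irq() */
	dev->line->dev_id = dev;
}
```

Because the handler is unregistered for the whole time the device is powered down, an interrupt arriving in that window never reaches code that would read the sleeping device's registers.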
Re: [BUG] sdhci regression in 2.6.21-rc2
Pierre Ossman wrote:
> Mark Lord wrote:
>> But.. in the middle of all of this, we now see the SDHCI code
>> trying to talk to its as-yet-not-restored device, and being rather
>> noisy about it all:
>>
>
> Not quite. I'd say it's the kernel calling the interrupt handler of a
> currently sleeping device. Since we're seeing this problem I assume the
> kernel's interrupt code isn't aware of PM states?
>

Hmm... I guess it can't be as the interrupt handler isn't associated with a
device, just a random pointer.

So either release the interrupt (which seems a bit unsafe as then we might not
get it back), or handle states at the start of the isr.

Rgds
--
     -- Pierre Ossman
Re: [BUG] sdhci regression in 2.6.21-rc2
Mark Lord wrote:
> Pierre Ossman wrote:
>> Mark Lord wrote:
>>> Another regression, for Pierre Ossman this time.
>>>
>>> My syslog gets spammed like this (below) on suspend/resume (to RAM) cycles.
>>> Worked fine, without all of the noise, in all previous kernels up to
>>> 2.6.20+.
>>
>> This looks like a PCI configuration issue.
>
> To me, it looks like a buggy driver "resume" sequence.
> The low-level SDHCI driver is trying to use the device
> before the PM/PCI stuff has restored the pre-suspend state.
...

I've dug a bit deeper, instrumenting the driver, and found/fixed the problem.

The interrupt is shared with another device, which resumes
earlier than the sdhci controller, and generates an interrupt.

The sdhci interrupt handler runs, sees 0xffffffff in its own
device's interrupt status, and tries to handle it..
The reason for the 0xffffffff is that the device is still
suspended, and *all* regs are reading back 0xffffffff.

So.. the suspend routine should de-register the irq handler,
and the resume routine should re-register it again.

Or perhaps a simpler kludge like this one, which fixes it for me:

Signed-off-by: Mark Lord <[EMAIL PROTECTED]>
---
--- linux/drivers/mmc/sdhci.c.orig	2007-03-02 15:06:31.0 -0500
+++ linux/drivers/mmc/sdhci.c	2007-03-05 10:13:51.0 -0500
@@ -994,7 +994,7 @@

 	intmask = readl(host->ioaddr + SDHCI_INT_STATUS);

-	if (!intmask) {
+	if (!intmask || intmask == 0xffffffff) {
 		result = IRQ_NONE;
 		goto out;
 	}
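[Editor's note] The guard added by the patch can be tried outside the kernel with a small model. This is a minimal sketch, assuming that a powered-down PCI function reads back all ones as described in the message; `struct fake_host`, `sketch_irq()` and the local `enum irq_result` are hypothetical stand-ins for the driver's host structure, its interrupt handler, and the kernel's `irqreturn_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's handler return values. */
enum irq_result { IRQ_NONE = 0, IRQ_HANDLED = 1 };

/* Hypothetical model of the controller: only the interrupt status. */
struct fake_host {
	uint32_t int_status;  /* models readl(host->ioaddr + SDHCI_INT_STATUS) */
};

/*
 * Sketch of the check at the top of the handler. A status of zero
 * means nothing is pending for this device; a status of all ones
 * means the registers are not readable (device asleep or gone).
 * In both cases the shared interrupt is left to the other handlers.
 */
enum irq_result sketch_irq(const struct fake_host *host)
{
	assert(host != NULL);

	uint32_t intmask = host->int_status;

	if (!intmask || intmask == 0xffffffffu)
		return IRQ_NONE;

	/* A real handler would decode and acknowledge the bits here. */
	return IRQ_HANDLED;
}
```

The check costs one comparison per interrupt and does not depend on the suspend and resume paths being correct, which is why it is useful on its own.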
Re: [BUG] sdhci regression in 2.6.21-rc2
Mark Lord wrote:
>
> But.. in the middle of all of this, we now see the SDHCI code
> trying to talk to its as-yet-not-restored device, and being rather
> noisy about it all:
>

Not quite. I'd say it's the kernel calling the interrupt handler of a
currently sleeping device. Since we're seeing this problem I assume the
kernel's interrupt code isn't aware of PM states?

Rgds
--
     -- Pierre Ossman
Re: [BUG] sdhci regression in 2.6.21-rc2
Pierre Ossman wrote:
> Mark Lord wrote:
>> Another regression, for Pierre Ossman this time.
>>
>> My syslog gets spammed like this (below) on suspend/resume (to RAM) cycles.
>> Worked fine, without all of the noise, in all previous kernels up to
>> 2.6.20+.
>
> This looks like a PCI configuration issue.

To me, it looks like a buggy driver "resume" sequence.
The low-level SDHCI driver is trying to use the device
before the PM/PCI stuff has restored the pre-suspend state.

> Can you bisect which commit caused the issue?

Nope.  I don't have time to run around that (huge) treadmill, thanks.

> And what's the PCI address of the device?

:03:01.2 0805: Ricoh Co Ltd R5C822 SD/SDIO/MMC/MS/MSPro Host Adapter (rev 17)

So, from the syslog, we see that the machine suspends (to RAM):

kernel: Stopping tasks ... done.
kernel: Suspending console(s)
kernel: pl2303 1-1.3:1.0: no suspend for driver pl2303?
kernel: ACPI: PCI interrupt for device :03:01.2 disabled
kernel: ACPI: PCI interrupt for device :03:00.0 disabled
kernel: ACPI: PCI interrupt for device :00:1f.2 disabled
kernel: ACPI: PCI interrupt for device :00:1e.2 disabled
kernel: ACPI: PCI interrupt for device :00:1d.7 disabled
kernel: ACPI: PCI interrupt for device :00:1d.3 disabled
kernel: ACPI: PCI interrupt for device :00:1d.2 disabled
kernel: ACPI: PCI interrupt for device :00:1d.1 disabled
kernel: ACPI: PCI interrupt for device :00:1d.0 disabled
kernel: Intel machine check architecture supported.
kernel: Intel machine check reporting enabled on CPU#0.

A few seconds later, I manually wake up the machine.
It resumes (from RAM), and the PM code begins restoring
PCI config space for various devices:

kernel: Back to C!
kernel: PM: Writing back config space on device :00:01.0 at offset 7 (was 2000d0d0, writing d0d0)
kernel: PM: Writing back config space on device :00:01.0 at offset 3 (was 1, writing 10010)
kernel: PCI: Setting latency timer of device :00:01.0 to 64
kernel: ACPI: PCI Interrupt :00:1d.0[A] -> GSI 16 (level, low) -> IRQ 16
kernel: PCI: Setting latency timer of device :00:1d.0 to 64
kernel: usb usb2: root hub lost power or was reset
kernel: PCI: Enabling device :00:1d.1 ( -> 0001)
kernel: ACPI: PCI Interrupt :00:1d.1[B] -> GSI 17 (level, low) -> IRQ 18
kernel: PCI: Setting latency timer of device :00:1d.1 to 64
kernel: PM: Writing back config space on device :00:1d.1 at offset f (was 200, writing 20a)
kernel: PM: Writing back config space on device :00:1d.1 at offset 8 (was 1, writing bf61)
kernel: usb usb3: root hub lost power or was reset
kernel: PCI: Enabling device :00:1d.2 ( -> 0001)
kernel: ACPI: PCI Interrupt :00:1d.2[C] -> GSI 18 (level, low) -> IRQ 19
kernel: PCI: Setting latency timer of device :00:1d.2 to 64
kernel: PM: Writing back config space on device :00:1d.2 at offset f (was 300, writing 309)
kernel: PM: Writing back config space on device :00:1d.2 at offset 8 (was 1, writing bf41)
kernel: usb usb4: root hub lost power or was reset
kernel: PCI: Enabling device :00:1d.3 ( -> 0001)
kernel: ACPI: PCI Interrupt :00:1d.3[D] -> GSI 19 (level, low) -> IRQ 17
kernel: PCI: Setting latency timer of device :00:1d.3 to 64
kernel: PM: Writing back config space on device :00:1d.3 at offset f (was 400, writing 407)
kernel: PM: Writing back config space on device :00:1d.3 at offset 8 (was 1, writing bf21)
kernel: usb usb5: root hub lost power or was reset

But.. in the middle of all of this, we now see the SDHCI code
trying to talk to its as-yet-not-restored device, and being rather
noisy about it all:

kernel: sdhci: == REGISTER DUMP ==
kernel: sdhci: Sys addr: 0x | Version: 0x
kernel: sdhci: Blk size: 0x | Blk cnt: 0x
kernel: sdhci: Argument: 0x | Trn mode: 0x
kernel: sdhci: Present: 0x | Host ctl: 0x00ff
kernel: sdhci: Power: 0x00ff | Blk gap: 0x00ff
kernel: sdhci: Wake-up: 0x00ff | Clock: 0x
kernel: sdhci: Timeout: 0x00ff | Int stat: 0x
kernel: sdhci: Int enab: 0x | Sig enab: 0x
kernel: sdhci: AC12 err: 0x | Slot int: 0x
kernel: sdhci: Caps: 0x | Max curr: 0x
kernel: sdhci: ===
kernel: mmc0: Card is consuming too much power!
kernel: mmc0: Unexpected interrupt 0x0080.
kernel: sdhci: == REGISTER DUMP ==
kernel: sdhci: Sys addr: 0x | Version: 0x
kernel: sdhci: Blk size: 0x | Blk cnt: 0x
kernel: sdhci: Argument: 0x | Trn mode: 0x
kernel: sdhci: Present: 0x | Host ctl: 0x00ff
kernel: sdhci: Power: 0x00ff | Blk gap: 0x00ff
kernel: sdhci: Wake-up: 0x00ff | Clock: 0x
kernel: sdhci: Timeout: 0x00ff | Int stat: 0x
kernel: sdhci: Int enab: 0x | Sig enab: 0x
Re: [BUG] sdhci regression in 2.6.21-rc2
Mark Lord wrote:
> Another regression, for Pierre Ossman this time.
>
> My syslog gets spammed like this (below) on suspend/resume (to RAM) cycles.
> Worked fine, without all of the noise, in all previous kernels up to
> 2.6.20+.
>

This looks like a PCI configuration issue. Can you bisect which commit caused the issue? And what's the PCI address of the device?

Rgds
--
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer            http://www.rdesktop.org
[BUG] sdhci regression in 2.6.21-rc2
Another regression, for Pierre Ossman this time.

My syslog gets spammed like this (below) on suspend/resume (to RAM) cycles.
Worked fine, without all of the noise, in all previous kernels up to 2.6.20+.

Mar 4 23:28:45 silvy logger: suspending
Mar 4 23:29:09 silvy kernel: Stopping tasks ... done.
Mar 4 23:29:09 silvy kernel: Suspending console(s)
Mar 4 23:29:09 silvy kernel: pl2303 5-1.3:1.0: no suspend for driver pl2303?
Mar 4 23:29:09 silvy kernel: ACPI: PCI interrupt for device :03:01.2 disabled
Mar 4 23:29:09 silvy kernel: ACPI: PCI interrupt for device :03:00.0 disabled
Mar 4 23:29:09 silvy kernel: ACPI: PCI interrupt for device :00:1f.2 disabled
Mar 4 23:29:09 silvy kernel: ACPI: PCI interrupt for device :00:1e.2 disabled
Mar 4 23:29:09 silvy kernel: ACPI: PCI interrupt for device :00:1d.7 disabled
Mar 4 23:29:09 silvy kernel: ACPI: PCI interrupt for device :00:1d.3 disabled
Mar 4 23:29:09 silvy kernel: ACPI: PCI interrupt for device :00:1d.2 disabled
Mar 4 23:29:09 silvy kernel: ACPI: PCI interrupt for device :00:1d.1 disabled
Mar 4 23:29:09 silvy kernel: ACPI: PCI interrupt for device :00:1d.0 disabled
Mar 4 23:29:09 silvy kernel: Intel machine check architecture supported.
Mar 4 23:29:09 silvy kernel: Intel machine check reporting enabled on CPU#0.
Mar 4 23:29:09 silvy kernel: Back to C!
Mar 4 23:29:09 silvy kernel: PM: Writing back config space on device :00:01.0 at offset 7 (was 2000d0d0, writing d0d0)
Mar 4 23:29:09 silvy kernel: PM: Writing back config space on device :00:01.0 at offset 3 (was 1, writing 10010)
Mar 4 23:29:09 silvy kernel: PCI: Setting latency timer of device :00:01.0 to 64
Mar 4 23:29:09 silvy kernel: ACPI: PCI Interrupt :00:1d.0[A] -> GSI 16 (level, low) -> IRQ 16
Mar 4 23:29:09 silvy kernel: PCI: Setting latency timer of device :00:1d.0 to 64
Mar 4 23:29:09 silvy kernel: usb usb1: root hub lost power or was reset
Mar 4 23:29:09 silvy kernel: PCI: Enabling device :00:1d.1 ( -> 0001)
Mar 4 23:29:09 silvy kernel: ACPI: PCI Interrupt :00:1d.1[B] -> GSI 17 (level, low) -> IRQ 18
Mar 4 23:29:09 silvy kernel: PCI: Setting latency timer of device :00:1d.1 to 64
Mar 4 23:29:09 silvy kernel: PM: Writing back config space on device :00:1d.1 at offset f (was 200, writing 20a)
Mar 4 23:29:09 silvy kernel: PM: Writing back config space on device :00:1d.1 at offset 8 (was 1, writing bf61)
Mar 4 23:29:09 silvy kernel: usb usb2: root hub lost power or was reset
Mar 4 23:29:09 silvy kernel: PCI: Enabling device :00:1d.2 ( -> 0001)
Mar 4 23:29:09 silvy kernel: ACPI: PCI Interrupt :00:1d.2[C] -> GSI 18 (level, low) -> IRQ 19
Mar 4 23:29:09 silvy kernel: PCI: Setting latency timer of device :00:1d.2 to 64
Mar 4 23:29:09 silvy kernel: PM: Writing back config space on device :00:1d.2 at offset f (was 300, writing 309)
Mar 4 23:29:09 silvy kernel: PM: Writing back config space on device :00:1d.2 at offset 8 (was 1, writing bf41)
Mar 4 23:29:09 silvy kernel: usb usb3: root hub lost power or was reset
Mar 4 23:29:09 silvy kernel: PCI: Enabling device :00:1d.3 ( -> 0001)
Mar 4 23:29:09 silvy kernel: ACPI: PCI Interrupt :00:1d.3[D] -> GSI 19 (level, low) -> IRQ 17
Mar 4 23:29:09 silvy kernel: PCI: Setting latency timer of device :00:1d.3 to 64
Mar 4 23:29:09 silvy kernel: PM: Writing back config space on device :00:1d.3 at offset f (was 400, writing 407)
Mar 4 23:29:09 silvy kernel: PM: Writing back config space on device :00:1d.3 at offset 8 (was 1, writing bf21)
Mar 4 23:29:09 silvy kernel: usb usb4: root hub lost power or was reset
Mar 4 23:29:09 silvy kernel: mmc0: Got command interrupt even though no command operation was in progress.
Mar 4 23:29:09 silvy kernel: sdhci: == REGISTER DUMP ==
Mar 4 23:29:09 silvy kernel: sdhci: Sys addr: 0x | Version: 0x
Mar 4 23:29:09 silvy kernel: sdhci: Blk size: 0x | Blk cnt: 0x
Mar 4 23:29:09 silvy kernel: sdhci: Argument: 0x | Trn mode: 0x
Mar 4 23:29:09 silvy kernel: sdhci: Present: 0x | Host ctl: 0x00ff
Mar 4 23:29:09 silvy kernel: sdhci: Power:0x00ff | Blk gap: 0x00ff
Mar 4 23:29:09 silvy kernel: sdhci: Wake-up: 0x00ff | Clock:0x
Mar 4 23:29:09 silvy kernel: sdhci: Timeout: 0x00ff | Int stat: 0x
Mar 4 23:29:09 silvy kernel: sdhci: Int enab: 0x | Sig enab: 0x
Mar 4 23:29:09 silvy kernel: sdhci: AC12 err: 0x | Slot int: 0x
Mar 4 23:29:09 silvy kernel: sdhci: Caps: 0x | Max curr: 0x
Mar 4 23:29:09 silvy kernel: sdhci: ===
Mar 4 23:29:09 silvy kernel: mmc0: Card is consuming too much power!
Mar 4 23:29:09 silvy kernel: mmc0: Unexpected interrupt 0x0080.
Mar 4 23:29:09 silvy kernel: sdhci: == REGISTER DUMP ==
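For anyone triaging a log like the one above, the "PM: Writing back config space" entries are the ones most relevant to a suspected PCI configuration problem. The following is only a rough sketch of how they could be pulled out of a syslog excerpt. The function name `config_writes` and the sample list are made up for illustration and are not part of any kernel tooling; the regular expression assumes the message format shown in this report.

```python
import re

# Assumed message format, copied from the report above:
#   PM: Writing back config space on device :00:1d.1 at offset f (was 200, writing 20a)
PATTERN = re.compile(
    r"PM: Writing back config space on device (?P<dev>\S+) "
    r"at offset (?P<off>[0-9a-f]+) "
    r"\(was (?P<old>[0-9a-f]+), writing (?P<new>[0-9a-f]+)\)"
)


def config_writes(lines):
    """Yield (device, offset, old_value, new_value) for each restore message.

    The offset and both values are printed in hex by the kernel, so they
    are converted to integers here. Lines that do not match are skipped.
    """
    for line in lines:
        m = PATTERN.search(line)
        if m:
            yield (
                m.group("dev"),
                int(m.group("off"), 16),
                int(m.group("old"), 16),
                int(m.group("new"), 16),
            )


# Two lines taken from the log above; only the first one matches.
sample = [
    "Mar 4 23:29:09 silvy kernel: PM: Writing back config space on device "
    ":00:1d.1 at offset f (was 200, writing 20a)",
    "Mar 4 23:29:09 silvy kernel: usb usb2: root hub lost power or was reset",
]

for dev, off, old, new in config_writes(sample):
    print(f"{dev} offset {off:#x}: {old:#x} -> {new:#x}")
```

Feeding it the full log (for example via `open(path)`) would list every device whose config space was rewritten on resume, which may help answer the question of which PCI address is involved.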