Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-30 Thread John Rose
Hi Paul- > I'm suggesting that the rpaphp code has a struct pci_driver whose > id_table and probe function are such that it will claim the EADS > bridges. (It would probably be best to match on vendor=IBM and > class=PCI-PCI bridge and let the probe function figure out which of > the bridges it
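A minimal user-space sketch of the matching suggested in the quoted text above: an id_table entry that matches vendor=IBM and class=PCI-PCI bridge, with the probe function deciding which of the matched bridges to claim. The values 0x1014 and 0x0604 are the kernel's PCI_VENDOR_ID_IBM and PCI_CLASS_BRIDGE_PCI. Everything prefixed sketch_, including the is_eads flag, is hypothetical: it stands in for the kernel's struct pci_device_id and struct pci_dev, and for whatever test rpaphp would actually use to recognise an EADS bridge. It is not code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined in the kernel's pci_ids.h. */
#define PCI_VENDOR_ID_IBM    0x1014
#define PCI_CLASS_BRIDGE_PCI 0x0604

/* Hypothetical stand-in for the kernel's struct pci_device_id. */
struct sketch_pci_id {
    uint16_t vendor;
    uint32_t class_code;  /* 24-bit: base class, subclass, prog-if */
    uint32_t class_mask;  /* which class bits must match */
};

/* Hypothetical stand-in for the kernel's struct pci_dev. */
struct sketch_pci_dev {
    uint16_t vendor;
    uint32_t class_code;
    int is_eads;          /* assumption: set by platform code for EADS bridges */
};

/* Match IBM devices whose base class and subclass say "PCI-PCI bridge";
 * the prog-if byte is ignored by the mask. */
static const struct sketch_pci_id sketch_id_table[] = {
    { PCI_VENDOR_ID_IBM, PCI_CLASS_BRIDGE_PCI << 8, 0xffff00 },
};

/* Same comparison the PCI core applies to the class fields of an id entry. */
int sketch_id_matches(const struct sketch_pci_id *id,
                      const struct sketch_pci_dev *dev)
{
    return id->vendor == dev->vendor &&
           ((id->class_code ^ dev->class_code) & id->class_mask) == 0;
}

/* Probe: the id_table only narrows the candidates; the probe function
 * decides which of the matching bridges the driver really claims.
 * Returns 0 to claim the device, -1 to decline (a real driver would
 * return -ENODEV). */
int sketch_probe(const struct sketch_pci_dev *dev)
{
    assert(dev != NULL);
    if (!sketch_id_matches(&sketch_id_table[0], dev))
        return -1;
    return dev->is_eads ? 0 : -1;
}
```

With this shape, an IBM PCI-PCI bridge that is not an EADS bridge still reaches sketch_probe() but is declined there, which is the "let the probe function figure out which of the bridges it gets" part of the suggestion.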

Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-29 Thread Paul Mackerras
Linas Vepstas writes: > > One way to clean this up would be to make rpaphp the driver for the > > EADS bridges (from the pci code's point of view). > > I guess I don't understand what that means. Are you suggesting moving > pSeries_pci.c into the rpaphp code directory? No, not at all. :)

Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-29 Thread John Rose
> Not sure that I agree with this. Not all PCI hotplug slots have EADS > devices as parents. Ahem, "PCI hotplug" above should read "EEH-enabled". Sorry :) John

Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-29 Thread John Rose
Hi Paul- > > 2) As a result, the code to call hot-unplug is a bit messy. In > >particular, there's a bit of hoop-jumping when hotplug is built > >as a module (and said hoops were wrecked recently when I moved the > >code around, out of the rpaphp directory). > > One way to clean

Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-29 Thread Linas Vepstas
On Mon, Aug 29, 2005 at 04:40:20PM +1000, Paul Mackerras was heard to remark: > Linas Vepstas writes: > > > Actually, no. There are three issues: > > 1) hotplug routines are called from within kernel. GregKH has stated on > >multiple occasions that doing this is wrong/bad/evil. This includes

Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-29 Thread Linas Vepstas
On Fri, Aug 26, 2005 at 09:37:36AM +1000, Benjamin Herrenschmidt was heard to remark: > On Fri, 2005-08-26 at 09:18 +1000, Paul Mackerras wrote: > > Benjamin Herrenschmidt writes: > > > > > Ok, so what is the problem then ? Why do we have to wait at all ? Why > > > not just unplug/replug right

Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-29 Thread Linas Vepstas
On Fri, Aug 26, 2005 at 07:43:57AM +1000, Benjamin Herrenschmidt was heard to remark: > On Thu, 2005-08-25 at 11:21 -0500, Linas Vepstas wrote: > > On Thu, Aug 25, 2005 at 10:49:03AM +1000, Benjamin Herrenschmidt was heard > > to remark: > > > > > > Of course, we'll possibly end up with a

Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-29 Thread Paul Mackerras
Linas Vepstas writes: > Actually, no. There are three issues: > 1) hotplug routines are called from within kernel. GregKH has stated on >multiple occasions that doing this is wrong/bad/evil. This includes >calling hot-unplug. > > 2) As a result, the code to call hot-unplug is a bit

Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-25 Thread Benjamin Herrenschmidt
On Fri, 2005-08-26 at 09:18 +1000, Paul Mackerras wrote: > Benjamin Herrenschmidt writes: > > > Ok, so what is the problem then ? Why do we have to wait at all ? Why > > not just unplug/replug right away ? > > We'd have to be absolutely certain that the driver could not possibly > take another

Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-25 Thread Paul Mackerras
Benjamin Herrenschmidt writes: > Ok, so what is the problem then ? Why do we have to wait at all ? Why > not just unplug/replug right away ? We'd have to be absolutely certain that the driver could not possibly take another interrupt or try to access the device on behalf of the old instance of

Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-25 Thread Benjamin Herrenschmidt
On Thu, 2005-08-25 at 11:21 -0500, Linas Vepstas wrote: > On Thu, Aug 25, 2005 at 10:49:03AM +1000, Benjamin Herrenschmidt was heard to > remark: > > > > Of course, we'll possibly end up with a different ethX or whatever, but > > Yep, but that's not an issue, since all the various device-naming

Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-25 Thread Linas Vepstas
On Thu, Aug 25, 2005 at 10:49:03AM +1000, Benjamin Herrenschmidt was heard to remark: > > Of course, we'll possibly end up with a different ethX or whatever, but Yep, but that's not an issue, since all the various device-naming schemes are supposed to be fixing this. It's a distinct problem; it

Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-25 Thread Linas Vepstas
On Thu, Aug 25, 2005 at 10:10:45AM +1000, Paul Mackerras was heard to remark: > Linas Vepstas writes: > > > The meta-issue that I'd like to reach consensus on first is whether > > there should be any hot-plug recovery attempted at all. Removing > > hot-plug-recovery support will make many of the

Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-24 Thread Benjamin Herrenschmidt
> I think what I'd like to see is that when a slot gets isolated and the > driver doesn't have recovery code, the kernel calls the driver's > unplug function and generates a hotplug event to udev. Ideally this > would be a variant of the remove event which would say "and by the > way, please try

Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-24 Thread Paul Mackerras
Linas Vepstas writes: > The meta-issue that I'd like to reach consensus on first is whether > there should be any hot-plug recovery attempted at all. Removing > hot-plug-recovery support will make many of the issues you raise > to be moot. Yes, this is probably the thorniest issue we have. My

Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-24 Thread Linas Vepstas
On Wed, Aug 24, 2005 at 10:45:31AM -0500, John Rose was heard to remark: > > +++ linux-2.6.13-rc6-git9/arch/ppc64/kernel/eeh_driver.c 2005-08-23 > > 14:34:44.0 -0500 > > +/* > > + * PCI Hot Plug Controller Driver for RPA-compliant PPC64 platform. > > This probably isn't the right

Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-24 Thread John Rose
Hi Linas- I like the idea of splitting the recovery stuff into its own driver. A few comments on the last reorg patch: > Index: linux-2.6.13-rc6-git9/arch/ppc64/kernel/eeh.c ... > +static int > +eeh_slot_availability(struct device_node *dn) ... > +void eeh_restore_bars(struct device_node *dn)

Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-24 Thread Paul Mackerras
I wrote: > Linas Vepstas writes: > > In this patch at least, your mailer seems to have blanked out lines > that match ^[-+]$. Could you send them to me again with a different > mailer or put them on a web or ftp site somewhere? I got 3 copies of each of these mails, one directly, one through

Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-23 Thread Paul Mackerras
Linas Vepstas writes: In this patch at least, your mailer seems to have blanked out lines that match ^[-+]$. Could you send them to me again with a different mailer or put them on a web or ftp site somewhere? Thanks, Paul.

[patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-23 Thread Linas Vepstas
Various PCI bus errors can be signaled by newer PCI controllers. The core error recovery routines are architecture dependent. This patch adds a recovery infrastructure for the PPC64 pSeries systems. Signed-off-by: Linas Vepstas <[EMAIL PROTECTED]> -- arch/ppc64/kernel/Makefile |2
