Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-30 Thread John Rose
Hi Paul-

> I'm suggesting that the rpaphp code has a struct pci_driver whose
> id_table and probe function are such that it will claim the EADS
> bridges.  (It would probably be best to match on vendor=IBM and
> class=PCI-PCI bridge and let the probe function figure out which of
> the bridges it gets asked about are actually EADS bridges.)

Wouldn't this leave out hotplug-capable adapters who have direct PHB
parents, since these parent PHBs don't have pci_devs?  Thoughts?

Thanks-
John

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-30 Thread John Rose
Hi Paul-

 I'm suggesting that the rpaphp code has a struct pci_driver whose
 id_table and probe function are such that it will claim the EADS
 bridges.  (It would probably be best to match on vendor=IBM and
 class=PCI-PCI bridge and let the probe function figure out which of
 the bridges it gets asked about are actually EADS bridges.)

Wouldn't this leave out hotplug-capable adapters who have direct PHB
parents, since these parent PHBs don't have pci_devs?  Thoughts?

Thanks-
John

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-29 Thread Paul Mackerras
Linas Vepstas writes:

> > One way to clean this up would be to make rpaphp the driver for the
> > EADS bridges (from the pci code's point of view).  
> 
> I guess I don't understand what that means. Are you suggesting moving 
> pSeries_pci.c into the rpaphp code directory?

No, not at all. :)

I'm suggesting that the rpaphp code has a struct pci_driver whose
id_table and probe function are such that it will claim the EADS
bridges.  (It would probably be best to match on vendor=IBM and
class=PCI-PCI bridge and let the probe function figure out which of
the bridges it gets asked about are actually EADS bridges.)

> I would prefer to deprecate the hot-plug based recovery scheme.  This
> is for many reasons, including the fact that some devices that can get
> pci errors are soldered onto the planar, and are not hot-pluggable.

Those devices can still be isolated and reset, can they not?  And they
still have an EADS bridge above them, don't they?  Are there any that
can't be dynamically reassigned from one partition to another?  I.e.,
they may not be physically hotpluggable but they are still virtually
hotpluggable as far as the partition is concerned, IIUC.

Paul.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-29 Thread John Rose
> Not sure that I agree with this.  Not all PCI hotplug slots have EADS
> devices as parents. 

Ahem, "PCI hotplug" above should read "EEH-enabled".

Sorry :)

John

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-29 Thread John Rose
Hi Paul-

> > 2) As a result, the code to call hot-unplug is a bit messy. In
> >particular, there's a bit of hoop-jumping when hotplug is built as
> >as a module (and said hoops were wrecked recently when I moved the
> >code around, out of the rpaphp directory).
> 
> One way to clean this up would be to make rpaphp the driver for the
> EADS bridges (from the pci code's point of view).  Then it would
> automatically get included in the error recovery process and could do
> whatever it should.

Not sure that I agree with this.  Not all PCI hotplug slots have EADS
devices as parents.  This sort of topography dependency is something
we're trying to break in rpaphp.  Rpaphp could set this up for devices
that do have EADS parents, but then we've only covered a subset of
EEH-capable devices.  

Anyway, isn't the OS forbidden from performing most of the expected
function of such a driver for EADS devices:
Enable the device
Access device configuration space
Discover resources (addresses and IRQ numbers) provided by the device
Allocate these resources
Communicate with the device
Disable the device

Thanks-
John

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-29 Thread Linas Vepstas
On Mon, Aug 29, 2005 at 04:40:20PM +1000, Paul Mackerras was heard to remark:
> Linas Vepstas writes:
> 
> > Actually, no.  There are three issues:
> > 1) hotplug routines are called from within kernel. GregKH has stated on
> >multiple occasions that doing this is wrong/bad/evil. This includes
> >calling hot-unplug.
> > 
> > 2) As a result, the code to call hot-unplug is a bit messy. In
> >particular, there's a bit of hoop-jumping when hotplug is built as
> >as a module (and said hoops were wrecked recently when I moved the
> >code around, out of the rpaphp directory).
> 
> One way to clean this up would be to make rpaphp the driver for the
> EADS bridges (from the pci code's point of view).  

I guess I don't understand what that means. Are you suggesting moving 
pSeries_pci.c into the rpaphp code directory?

> Then it would
> automatically get included in the error recovery process and could do
> whatever it should.

John Rose, the current maintainer of the rpaphp code, is pretty militant 
about removing things from, not adding things to, the rpaphp code.
Which is a good idea, as chunks of that code are spaghetti, and do need
simplification and cleanup.

> > 3) Hot-unplug causes scripts to run in user-space. There is no way to 
> >know when these scripts are done, so its not clear if we've waited
> >long enough before calling hot-add (or if waiting is even necessary).
> 
> OK, so let's just add a new hotplug event called KOBJ_ERROR or
> something, which tells userspace that an error has occurred which has
> made the device inaccessible.  Greg, would that be OK?

Why do we need such an event?

I would prefer to deprecate the hot-plug based recovery scheme.  This
is for many reasons, including the fact that some devices that can get
pci errors are soldered onto the planar, and are not hot-pluggable.

--linas
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-29 Thread Linas Vepstas
On Fri, Aug 26, 2005 at 09:37:36AM +1000, Benjamin Herrenschmidt was heard to 
remark:
> On Fri, 2005-08-26 at 09:18 +1000, Paul Mackerras wrote:
> > Benjamin Herrenschmidt writes:
> > 
> > > Ok, so what is the problem then ? Why do we have to wait at all ? Why
> > > not just unplug/replug right away ?
> > 
> > We'd have to be absolutely certain that the driver could not possibly
> > take another interrupt or try to access the device on behalf of the
> > old instance of the device by the time it returned from the remove
> > function.  I'm not sure I'd trust most drivers that far...
> 
> Hrm... If a driver gets that wrong, then it will also blow up when
> unloaded as a module. 


:)  We've discovered two, so far, that blow up when unloaded: 
lpfc and e1000. I beleive these are now fixed in mainline.

--linas
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-29 Thread Linas Vepstas
On Fri, Aug 26, 2005 at 07:43:57AM +1000, Benjamin Herrenschmidt was heard to 
remark:
> On Thu, 2005-08-25 at 11:21 -0500, Linas Vepstas wrote:
> > On Thu, Aug 25, 2005 at 10:49:03AM +1000, Benjamin Herrenschmidt was heard 
> > to remark:
> > > 
> > > Of course, we'll possibly end up with a different ethX or whatever, but
> > 
> > Yep, but that's not an issue, since all the various device-naming
> > schemes are supposed to be fixing this. Its a distinct problem;
> > it needs to be solved even across cold-boots. 
> 
> Ok, so what is the problem then ? Why do we have to wait at all ? Why
> not just unplug/replug right away ?

Paranoia + old versions of udev. I beleive that older versions of udev
(such as the ones currently shipping with Red Hat RHEL4 and SuSE SLES9)
failed to serialize events properly.  I beleive that the newer versions
do serialize, but have not verified/tested.

--linas

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-29 Thread Paul Mackerras
Linas Vepstas writes:

> Actually, no.  There are three issues:
> 1) hotplug routines are called from within kernel. GregKH has stated on
>multiple occasions that doing this is wrong/bad/evil. This includes
>calling hot-unplug.
> 
> 2) As a result, the code to call hot-unplug is a bit messy. In
>particular, there's a bit of hoop-jumping when hotplug is built as
>as a module (and said hoops were wrecked recently when I moved the
>code around, out of the rpaphp directory).

One way to clean this up would be to make rpaphp the driver for the
EADS bridges (from the pci code's point of view).  Then it would
automatically get included in the error recovery process and could do
whatever it should.

> 3) Hot-unplug causes scripts to run in user-space. There is no way to 
>know when these scripts are done, so its not clear if we've waited
>long enough before calling hot-add (or if waiting is even necessary).

OK, so let's just add a new hotplug event called KOBJ_ERROR or
something, which tells userspace that an error has occurred which has
made the device inaccessible.  Greg, would that be OK?

The only trouble with that is that if we don't have a hotplug bridge
driver loaded for the slot, we can't tell the driver that its device
has gone away.  I guess that's not a big problem in practice.

Paul.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-29 Thread Paul Mackerras
Linas Vepstas writes:

 Actually, no.  There are three issues:
 1) hotplug routines are called from within kernel. GregKH has stated on
multiple occasions that doing this is wrong/bad/evil. This includes
calling hot-unplug.
 
 2) As a result, the code to call hot-unplug is a bit messy. In
particular, there's a bit of hoop-jumping when hotplug is built as
as a module (and said hoops were wrecked recently when I moved the
code around, out of the rpaphp directory).

One way to clean this up would be to make rpaphp the driver for the
EADS bridges (from the pci code's point of view).  Then it would
automatically get included in the error recovery process and could do
whatever it should.

 3) Hot-unplug causes scripts to run in user-space. There is no way to 
know when these scripts are done, so its not clear if we've waited
long enough before calling hot-add (or if waiting is even necessary).

OK, so let's just add a new hotplug event called KOBJ_ERROR or
something, which tells userspace that an error has occurred which has
made the device inaccessible.  Greg, would that be OK?

The only trouble with that is that if we don't have a hotplug bridge
driver loaded for the slot, we can't tell the driver that its device
has gone away.  I guess that's not a big problem in practice.

Paul.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-29 Thread Linas Vepstas
On Fri, Aug 26, 2005 at 07:43:57AM +1000, Benjamin Herrenschmidt was heard to 
remark:
 On Thu, 2005-08-25 at 11:21 -0500, Linas Vepstas wrote:
  On Thu, Aug 25, 2005 at 10:49:03AM +1000, Benjamin Herrenschmidt was heard 
  to remark:
   
   Of course, we'll possibly end up with a different ethX or whatever, but
  
  Yep, but that's not an issue, since all the various device-naming
  schemes are supposed to be fixing this. Its a distinct problem;
  it needs to be solved even across cold-boots. 
 
 Ok, so what is the problem then ? Why do we have to wait at all ? Why
 not just unplug/replug right away ?

Paranoia + old versions of udev. I beleive that older versions of udev
(such as the ones currently shipping with Red Hat RHEL4 and SuSE SLES9)
failed to serialize events properly.  I beleive that the newer versions
do serialize, but have not verified/tested.

--linas

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-29 Thread Linas Vepstas
On Fri, Aug 26, 2005 at 09:37:36AM +1000, Benjamin Herrenschmidt was heard to 
remark:
 On Fri, 2005-08-26 at 09:18 +1000, Paul Mackerras wrote:
  Benjamin Herrenschmidt writes:
  
   Ok, so what is the problem then ? Why do we have to wait at all ? Why
   not just unplug/replug right away ?
  
  We'd have to be absolutely certain that the driver could not possibly
  take another interrupt or try to access the device on behalf of the
  old instance of the device by the time it returned from the remove
  function.  I'm not sure I'd trust most drivers that far...
 
 Hrm... If a driver gets that wrong, then it will also blow up when
 unloaded as a module. 


:)  We've discovered two, so far, that blow up when unloaded: 
lpfc and e1000. I beleive these are now fixed in mainline.

--linas
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-29 Thread Linas Vepstas
On Mon, Aug 29, 2005 at 04:40:20PM +1000, Paul Mackerras was heard to remark:
 Linas Vepstas writes:
 
  Actually, no.  There are three issues:
  1) hotplug routines are called from within kernel. GregKH has stated on
 multiple occasions that doing this is wrong/bad/evil. This includes
 calling hot-unplug.
  
  2) As a result, the code to call hot-unplug is a bit messy. In
 particular, there's a bit of hoop-jumping when hotplug is built as
 as a module (and said hoops were wrecked recently when I moved the
 code around, out of the rpaphp directory).
 
 One way to clean this up would be to make rpaphp the driver for the
 EADS bridges (from the pci code's point of view).  

I guess I don't understand what that means. Are you suggesting moving 
pSeries_pci.c into the rpaphp code directory?

 Then it would
 automatically get included in the error recovery process and could do
 whatever it should.

John Rose, the current maintainer of the rpaphp code, is pretty militant 
about removing things from, not adding things to, the rpaphp code.
Which is a good idea, as chunks of that code are spaghetti, and do need
simplification and cleanup.

  3) Hot-unplug causes scripts to run in user-space. There is no way to 
 know when these scripts are done, so its not clear if we've waited
 long enough before calling hot-add (or if waiting is even necessary).
 
 OK, so let's just add a new hotplug event called KOBJ_ERROR or
 something, which tells userspace that an error has occurred which has
 made the device inaccessible.  Greg, would that be OK?

Why do we need such an event?

I would prefer to deprecate the hot-plug based recovery scheme.  This
is for many reasons, including the fact that some devices that can get
pci errors are soldered onto the planar, and are not hot-pluggable.

--linas
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-29 Thread John Rose
Hi Paul-

  2) As a result, the code to call hot-unplug is a bit messy. In
 particular, there's a bit of hoop-jumping when hotplug is built as
 as a module (and said hoops were wrecked recently when I moved the
 code around, out of the rpaphp directory).
 
 One way to clean this up would be to make rpaphp the driver for the
 EADS bridges (from the pci code's point of view).  Then it would
 automatically get included in the error recovery process and could do
 whatever it should.

Not sure that I agree with this.  Not all PCI hotplug slots have EADS
devices as parents.  This sort of topography dependency is something
we're trying to break in rpaphp.  Rpaphp could set this up for devices
that do have EADS parents, but then we've only covered a subset of
EEH-capable devices.  

Anyway, isn't the OS forbidden from performing most of the expected
function of such a driver for EADS devices:
Enable the device
Access device configuration space
Discover resources (addresses and IRQ numbers) provided by the device
Allocate these resources
Communicate with the device
Disable the device

Thanks-
John

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-29 Thread John Rose
 Not sure that I agree with this.  Not all PCI hotplug slots have EADS
 devices as parents. 

Ahem, PCI hotplug above should read EEH-enabled.

Sorry :)

John

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-29 Thread Paul Mackerras
Linas Vepstas writes:

  One way to clean this up would be to make rpaphp the driver for the
  EADS bridges (from the pci code's point of view).  
 
 I guess I don't understand what that means. Are you suggesting moving 
 pSeries_pci.c into the rpaphp code directory?

No, not at all. :)

I'm suggesting that the rpaphp code has a struct pci_driver whose
id_table and probe function are such that it will claim the EADS
bridges.  (It would probably be best to match on vendor=IBM and
class=PCI-PCI bridge and let the probe function figure out which of
the bridges it gets asked about are actually EADS bridges.)

 I would prefer to deprecate the hot-plug based recovery scheme.  This
 is for many reasons, including the fact that some devices that can get
 pci errors are soldered onto the planar, and are not hot-pluggable.

Those devices can still be isolated and reset, can they not?  And they
still have an EADS bridge above them, don't they?  Are there any that
can't be dynamically reassigned from one partition to another?  I.e.,
they may not be physically hotpluggable but they are still virtually
hotpluggable as far as the partition is concerned, IIUC.

Paul.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-25 Thread Benjamin Herrenschmidt
On Fri, 2005-08-26 at 09:18 +1000, Paul Mackerras wrote:
> Benjamin Herrenschmidt writes:
> 
> > Ok, so what is the problem then ? Why do we have to wait at all ? Why
> > not just unplug/replug right away ?
> 
> We'd have to be absolutely certain that the driver could not possibly
> take another interrupt or try to access the device on behalf of the
> old instance of the device by the time it returned from the remove
> function.  I'm not sure I'd trust most drivers that far...

Hrm... If a driver gets that wrong, then it will also blow up when
unloaded as a module. All drivers should be fully shut down by the time
they return from remove(). free_irq() is synchronous as is iounmap() and
both of those are usually called as part of remove(). I wouldn't be too
worried here.

Ben.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-25 Thread Paul Mackerras
Benjamin Herrenschmidt writes:

> Ok, so what is the problem then ? Why do we have to wait at all ? Why
> not just unplug/replug right away ?

We'd have to be absolutely certain that the driver could not possibly
take another interrupt or try to access the device on behalf of the
old instance of the device by the time it returned from the remove
function.  I'm not sure I'd trust most drivers that far...

Paul.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-25 Thread Benjamin Herrenschmidt
On Thu, 2005-08-25 at 11:21 -0500, Linas Vepstas wrote:
> On Thu, Aug 25, 2005 at 10:49:03AM +1000, Benjamin Herrenschmidt was heard to 
> remark:
> > 
> > Of course, we'll possibly end up with a different ethX or whatever, but
> 
> Yep, but that's not an issue, since all the various device-naming
> schemes are supposed to be fixing this. Its a distinct problem;
> it needs to be solved even across cold-boots. 

Ok, so what is the problem then ? Why do we have to wait at all ? Why
not just unplug/replug right away ?

> (Didn't I ever tell you about the day I added a new disk controller to
> my system, and /dev/hda became /dev/hde and thus /home was mounted on
> /usr and /var as /etc and all hell broke loose? Owww, device naming
> is a serious issue for home users and even more so for enterprise-class 
> users).
> 
> --linas
> 

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-25 Thread Linas Vepstas
On Thu, Aug 25, 2005 at 10:49:03AM +1000, Benjamin Herrenschmidt was heard to 
remark:
> 
> Of course, we'll possibly end up with a different ethX or whatever, but

Yep, but that's not an issue, since all the various device-naming
schemes are supposed to be fixing this. Its a distinct problem;
it needs to be solved even across cold-boots. 

(Didn't I ever tell you about the day I added a new disk controller to
my system, and /dev/hda became /dev/hde and thus /home was mounted on
/usr and /var as /etc and all hell broke loose? Owww, device naming
is a serious issue for home users and even more so for enterprise-class 
users).

--linas


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-25 Thread Linas Vepstas
On Thu, Aug 25, 2005 at 10:10:45AM +1000, Paul Mackerras was heard to remark:
> Linas Vepstas writes:
> 
> > The meta-issue that I'd like to reach consensus on first is whether
> > there should be any hot-plug recovery attempted at all.  Removing
> > hot-plug-recovery support will make many of the issues you raise 
> > to be moot.
> 
> Yes, this probably the thorniest issue we have.
> 
> My feeling is that the unplug half of it is probably fairly
> uncontroversial, but the replug half is a can of worms.  Would you
> agree with that?

Actually, no.  There are three issues:
1) hotplug routines are called from within kernel. GregKH has stated on
   multiple occasions that doing this is wrong/bad/evil. This includes
   calling hot-unplug.

2) As a result, the code to call hot-unplug is a bit messy. In
   particular, there's a bit of hoop-jumping when hotplug is built as
   as a module (and said hoops were wrecked recently when I moved the
   code around, out of the rpaphp directory).

3) Hot-unplug causes scripts to run in user-space. There is no way to 
   know when these scripts are done, so its not clear if we've waited
   long enough before calling hot-add (or if waiting is even necessary).

> Is it udev that handles the hotplug notifications on the userspace
> side in all cases (do both RHEL and SLES use udev, for instance)?

Yes, and it seems to work fine despite the fact that the current 
sles9/rhel4 use rather oold versions of udev, which is criminal,
according to Kay Seivers.  I have not tested new versions of udev; 
I assume new versions will work "even better".

> How hard is it to add a new sort of notification, on the kernel side
> and in udev?

Why? To acheive what goal?  (Keep in mind that user-space eeh solutions
seem to fail when the affected device is storage, since block devices
and filesystems don't like the underlying storage to wink out.)

> I think what I'd like to see is that when a slot gets isolated and the
> driver doesn't have recovery code, the kernel calls the driver's
> unplug function and generates a hotplug event to udev.  Ideally this
> would be a variant of the remove event which would say "and by the
> way, please try replugging this slot when you've finished handling the
> remove event" or something along those lines.

Ahh, yes, this addresses the timing issue raised in point 3).  However,
I'm thinking that the timing issue is not really an issue, depending on
how udev is designed.  For example, consider a 100% cpu loaded, heavy
i/o loaded PC, and now we rapidly unplug and replug some USB key.
Presumably, udev will handle this gracefully.  The kernel error recovery
essentially looks the same to udev: the burp on the bus looks like a
rapid-fire unplug-replug.

BTW, zSeries has similar concerns, where channels can come and go,
causing thousands of unplug/replug udev events in rapid succession. 
(This was discussed on the udev mailing lists in the past).  It might
be interesting to have the zSeries folks discuss the current EEH design,
as this is something they have far more experience with than any of 
the pc or unix crowd.  I personally have not discussed with any zSeries 
people.

--linas

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-25 Thread Linas Vepstas
On Thu, Aug 25, 2005 at 10:10:45AM +1000, Paul Mackerras was heard to remark:
 Linas Vepstas writes:
 
  The meta-issue that I'd like to reach consensus on first is whether
  there should be any hot-plug recovery attempted at all.  Removing
  hot-plug-recovery support will make many of the issues you raise 
  to be moot.
 
 Yes, this probably the thorniest issue we have.
 
 My feeling is that the unplug half of it is probably fairly
 uncontroversial, but the replug half is a can of worms.  Would you
 agree with that?

Actually, no.  There are three issues:
1) hotplug routines are called from within kernel. GregKH has stated on
   multiple occasions that doing this is wrong/bad/evil. This includes
   calling hot-unplug.

2) As a result, the code to call hot-unplug is a bit messy. In
   particular, there's a bit of hoop-jumping when hotplug is built as
   as a module (and said hoops were wrecked recently when I moved the
   code around, out of the rpaphp directory).

3) Hot-unplug causes scripts to run in user-space. There is no way to 
   know when these scripts are done, so its not clear if we've waited
   long enough before calling hot-add (or if waiting is even necessary).

 Is it udev that handles the hotplug notifications on the userspace
 side in all cases (do both RHEL and SLES use udev, for instance)?

Yes, and it seems to work fine despite the fact that the current 
sles9/rhel4 use rather oold versions of udev, which is criminal,
according to Kay Seivers.  I have not tested new versions of udev; 
I assume new versions will work even better.

 How hard is it to add a new sort of notification, on the kernel side
 and in udev?

Why? To acheive what goal?  (Keep in mind that user-space eeh solutions
seem to fail when the affected device is storage, since block devices
and filesystems don't like the underlying storage to wink out.)

 I think what I'd like to see is that when a slot gets isolated and the
 driver doesn't have recovery code, the kernel calls the driver's
 unplug function and generates a hotplug event to udev.  Ideally this
 would be a variant of the remove event which would say and by the
 way, please try replugging this slot when you've finished handling the
 remove event or something along those lines.

Ahh, yes, this addresses the timing issue raised in point 3).  However,
I'm thinking that the timing issue is not really an issue, depending on
how udev is designed.  For example, consider a 100% cpu loaded, heavy
i/o loaded PC, and now we rapidly unplug and replug some USB key.
Presumably, udev will handle this gracefully.  The kernel error recovery
essentially looks the same to udev: the burp on the bus looks like a
rapid-fire unplug-replug.

BTW, zSeries has similar concerns, where channels can come and go,
causing thousands of unplug/replug udev events in rapid succession. 
(This was discussed on the udev mailing lists in the past).  It might
be interesting to have the zSeries folks discuss the current EEH design,
as this is something they have far more experience with than any of 
the pc or unix crowd.  I personally have not discussed with any zSeries 
people.

--linas

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-25 Thread Linas Vepstas
On Thu, Aug 25, 2005 at 10:49:03AM +1000, Benjamin Herrenschmidt was heard to 
remark:
 
 Of course, we'll possibly end up with a different ethX or whatever, but

Yep, but that's not an issue, since all the various device-naming
schemes are supposed to be fixing this. Its a distinct problem;
it needs to be solved even across cold-boots. 

(Didn't I ever tell you about the day I added a new disk controller to
my system, and /dev/hda became /dev/hde and thus /home was mounted on
/usr and /var as /etc and all hell broke loose? Owww, device naming
is a serious issue for home users and even more so for enterprise-class 
users).

--linas


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-25 Thread Paul Mackerras
Benjamin Herrenschmidt writes:

 Ok, so what is the problem then ? Why do we have to wait at all ? Why
 not just unplug/replug right away ?

We'd have to be absolutely certain that the driver could not possibly
take another interrupt or try to access the device on behalf of the
old instance of the device by the time it returned from the remove
function.  I'm not sure I'd trust most drivers that far...

Paul.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-25 Thread Benjamin Herrenschmidt
On Fri, 2005-08-26 at 09:18 +1000, Paul Mackerras wrote:
 Benjamin Herrenschmidt writes:
 
  Ok, so what is the problem then ? Why do we have to wait at all ? Why
  not just unplug/replug right away ?
 
 We'd have to be absolutely certain that the driver could not possibly
 take another interrupt or try to access the device on behalf of the
 old instance of the device by the time it returned from the remove
 function.  I'm not sure I'd trust most drivers that far...

Hrm... If a driver gets that wrong, then it will also blow up when
unloaded as a module. All drivers should be fully shut down by the time
they return from remove(). free_irq() is synchronous as is iounmap() and
both of those are usually called as part of remove(). I wouldn't be too
worried here.

Ben.


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-24 Thread Benjamin Herrenschmidt

> I think what I'd like to see is that when a slot gets isolated and the
> driver doesn't have recovery code, the kernel calls the driver's
> unplug function and generates a hotplug event to udev.  Ideally this
> would be a variant of the remove event which would say "and by the
> way, please try replugging this slot when you've finished handling the
> remove event" or something along those lines.

I'm still trying to understand why we care. What prevents us from just
uplugging the previous device and re-plugging right away ? After all,
the driver->remove() function is supposed to guarantee that no HW access
will happen after it returns and that everything was unmapped.

Of course, we'll possibly end up with a different ethX or whatever, but
I don't see the problem with that ... It's hopeless to think we might
manage to keep that identical anyway, unless the driver implements
proper error recovery.

Ben.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-24 Thread Paul Mackerras
Linas Vepstas writes:

> The meta-issue that I'd like to reach consensus on first is whether
> there should be any hot-plug recovery attempted at all.  Removing
> hot-plug-recovery support will make many of the issues you raise 
> to be moot.

Yes, this probably the thorniest issue we have.

My feeling is that the unplug half of it is probably fairly
uncontroversial, but the replug half is a can of worms.  Would you
agree with that?

Is it udev that handles the hotplug notifications on the userspace
side in all cases (do both RHEL and SLES use udev, for instance)?
How hard is it to add a new sort of notification, on the kernel side
and in udev?

I think what I'd like to see is that when a slot gets isolated and the
driver doesn't have recovery code, the kernel calls the driver's
unplug function and generates a hotplug event to udev.  Ideally this
would be a variant of the remove event which would say "and by the
way, please try replugging this slot when you've finished handling the
remove event" or something along those lines.

Thoughts?

Paul.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-24 Thread Linas Vepstas
On Wed, Aug 24, 2005 at 10:45:31AM -0500, John Rose was heard to remark:
> > +++ linux-2.6.13-rc6-git9/arch/ppc64/kernel/eeh_driver.c2005-08-23 
> > 14:34:44.0 -0500
> > +/*
> > + * PCI Hot Plug Controller Driver for RPA-compliant PPC64 platform.
> 
> This probably isn't the right header description for this file :)

Yes, this file is a little ball of ugliness that resulted from moving
things out of the rpaphp directory; and, yes, it's rather
un-reconstructed. I released it under the "release early" program.

The meta-issue that I'd like to reach consensus on first is whether
there should be any hot-plug recovery attempted at all.  Removing
hot-plug-recovery support will make many of the issues you raise 
to be moot.

> > +++ linux-2.6.13-rc6-git9/include/asm-ppc64/prom.h  2005-08-23 
> > 13:31:52.0 -0500
> > int busno;  /* for pci devices */
> > int bussubno;   /* for pci devices */
> > int devfn;  /* for pci devices */
> 
> How about a pointer to a struct of EEH fields?  Folks are touchy about
> adding anything PCI-specific to device nodes, especially since most DNs
> aren't PCI at all.

I attempted to remove all of the pci-related stuff from this struct,
and got a crash in very very early boot (before the transition from
real to virtual addressing). Not sure why, I was surprised.  It seems 
related to the flattening of the device ndode tree. I'll try again 
soon.

--linas
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-24 Thread John Rose
Hi Linas-

I like the idea of splitting the recovery stuff into its own driver.  A
few comments on the last reorg patch:

> Index: linux-2.6.13-rc6-git9/arch/ppc64/kernel/eeh.c
...
> +static int
> +eeh_slot_availability(struct device_node *dn)
...
> +void eeh_restore_bars(struct device_node *dn)

Inconsistent spacing in new code... 

> --- /dev/null 1970-01-01 00:00:00.0 +
> +++ linux-2.6.13-rc6-git9/arch/ppc64/kernel/eeh_driver.c  2005-08-23 
> 14:34:44.0 -0500
> @@ -0,0 +1,361 @@
> +/*
> + * PCI Hot Plug Controller Driver for RPA-compliant PPC64 platform.

This probably isn't the right header description for this file :)

> +
> +/**
> + * rpaphp_find_slot - find and return the slot holding the device
> + * @dev: pci device for which we want the slot structure.
> + */
> +static struct slot *rpaphp_find_slot(struct pci_dev *dev)
> +{
> + struct list_head *tmp, *n;
> + struct slot *slot;
> +
> + list_for_each_safe(tmp, n, _slot_head) {
> + struct pci_bus *bus;
> +
> + slot = list_entry(tmp, struct slot, rpaphp_slot_list);
> +
> + /* PHB's don't have bridges. */
> + bus = slot->bus;
> + if (bus == NULL)
> + continue;
> +
> + /* The PCI device could be the slot itself. */
> + if (bus->self == dev)
> + return slot;
> +
> + if (pci_search_bus_for_dev (bus, dev))
> + return slot;
> + }
> + return NULL;
> +}

This function breaks if rpaphp is compiled as a module.  It's probably
bad for kernel code to depend on symbols exported from modules.  This
raises two larger questions: Where should this new driver sit, and
should it be possible to compile it as a module as well?

> +/* --- */
> +/**
> + * handle_eeh_events -- reset a PCI device after hard lockup.
> + *
...
> +int eeh_reset_device (struct pci_dev *dev, struct device_node *dn, int 
> reconfig)

Header doesn't match the function :)

> +{
> + struct hotplug_slot *frozen_slot= NULL;
> + struct hotplug_slot_ops *frozen_ops= NULL;
> +
> + if (!dev)
> + return 1;
> +
> + if (reconfig) {
> + frozen_slot = pci_hp_find_slot(dev);
> + if (frozen_slot)
> + frozen_ops = frozen_slot->ops;
> + }
> +
> + if (frozen_ops) frozen_ops->disable_slot (frozen_slot);
> +
> + /* Reset the pci controller. (Asserts RST#; resets config space).
> +  * Reconfigure bridges and devices */
> + rtas_set_slot_reset (dn->child);
> +
> + /* Walk over all functions on this device */
> + struct device_node *peer = dn->child;
> + while (peer) {
> + rtas_configure_bridge(peer);
> + eeh_restore_bars(peer);
> + peer = peer->sibling;
> + }
> +
> + /* Give the system 5 seconds to finish running the user-space
> +  * hotplug scripts, e.g. ifdown for ethernet.  Yes, this is a hack,
> +  * but if we don't do this, weird things happen.
> +  */
> + if (frozen_ops) {
> + ssleep (5);
> + frozen_ops->enable_slot (frozen_slot);
> + }
> + return 0;
> +}

This dependence on struct hotplug_slot might be problematic as we
restrict the registration of "PCI hotplug" slots to exclude PHBs, VIO,
and embedded slots.  Noticed your comment to this effect.  I can work
with you offline on this.

> +
> +/* The longest amount of time to wait for a pci device
> + * to come back on line, in seconds.
> + */
> +#define MAX_WAIT_FOR_RECOVERY 15
> +
> +int handle_eeh_events (struct eeh_event *event)
> +{
> + 
...
> + frozen_device = pci_bus_to_OF_node(dev->bus);
> + if (!frozen_device)
> + {
> + printk (KERN_ERR "EEH: Cannot find PCI controller for %s\n",
> + pci_name(dev));
> +
> + return 1;
> + }
> + BUG_ON (frozen_device->phb==NULL);
> +
> + /* We get "permanent failure" messages on empty slots.
> +  * These are false alarms. Empty slots have no child dn. */
> + if ((event->state == pci_channel_io_perm_failure) && (frozen_device == 
> NULL))

The second part of this conditional will never be true, as this has just
been checked above.

> +   if (frozen_device)
> +   freeze_count = frozen_device->eeh_freeze_count;

This conditional will always be true, as this has also ben checked
above.

> Index: linux-2.6.13-rc6-git9/include/asm-ppc64/prom.h
> ===
> --- linux-2.6.13-rc6-git9.orig/include/asm-ppc64/prom.h   2005-08-19 
> 15:11:39.0 -0500
> +++ linux-2.6.13-rc6-git9/include/asm-ppc64/prom.h2005-08-23 
> 13:31:52.0 -0500
> @@ -135,11 +135,17 @@
>   int busno;  /* for pci devices */
>   int bussubno;   /* for pci devices */
>   int devfn;   

Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-24 Thread Paul Mackerras
I wrote:

> Linas Vepstas writes:
> 
> In this patch at least, your mailer seems to have blanked out lines
> that match ^[-+]$.  Could you send them to me again with a different
> mailer or put them on a web or ftp site somewhere?

I got 3 copies of each of these mails, one directly, one through
linuxppc64-dev and one through linux-kernel.  It looks like the copies
that came through the mailing lists are OK but the copy that came
directly to me is corrupted.  Weird.

Regards,
Paul.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-24 Thread John Rose
Hi Linas-

I like the idea of splitting the recovery stuff into its own driver.  A
few comments on the last reorg patch:

 Index: linux-2.6.13-rc6-git9/arch/ppc64/kernel/eeh.c
...
 +static int
 +eeh_slot_availability(struct device_node *dn)
...
 +void eeh_restore_bars(struct device_node *dn)

Inconsistent spacing in new code... /nit

 --- /dev/null 1970-01-01 00:00:00.0 +
 +++ linux-2.6.13-rc6-git9/arch/ppc64/kernel/eeh_driver.c  2005-08-23 
 14:34:44.0 -0500
 @@ -0,0 +1,361 @@
 +/*
 + * PCI Hot Plug Controller Driver for RPA-compliant PPC64 platform.

This probably isn't the right header description for this file :)

 +
 +/**
 + * rpaphp_find_slot - find and return the slot holding the device
 + * @dev: pci device for which we want the slot structure.
 + */
 +static struct slot *rpaphp_find_slot(struct pci_dev *dev)
 +{
 + struct list_head *tmp, *n;
 + struct slot *slot;
 +
 + list_for_each_safe(tmp, n, rpaphp_slot_head) {
 + struct pci_bus *bus;
 +
 + slot = list_entry(tmp, struct slot, rpaphp_slot_list);
 +
 + /* PHB's don't have bridges. */
 + bus = slot-bus;
 + if (bus == NULL)
 + continue;
 +
 + /* The PCI device could be the slot itself. */
 + if (bus-self == dev)
 + return slot;
 +
 + if (pci_search_bus_for_dev (bus, dev))
 + return slot;
 + }
 + return NULL;
 +}

This function breaks if rpaphp is compiled as a module.  It's probably
bad for kernel code to depend on symbols exported from modules.  This
raises two larger questions: Where should this new driver sit, and
should it be possible to compile it as a module as well?

 +/* --- */
 +/**
 + * handle_eeh_events -- reset a PCI device after hard lockup.
 + *
...
 +int eeh_reset_device (struct pci_dev *dev, struct device_node *dn, int 
 reconfig)

Header doesn't match the function :)

 +{
 + struct hotplug_slot *frozen_slot= NULL;
 + struct hotplug_slot_ops *frozen_ops= NULL;
 +
 + if (!dev)
 + return 1;
 +
 + if (reconfig) {
 + frozen_slot = pci_hp_find_slot(dev);
 + if (frozen_slot)
 + frozen_ops = frozen_slot-ops;
 + }
 +
 + if (frozen_ops) frozen_ops-disable_slot (frozen_slot);
 +
 + /* Reset the pci controller. (Asserts RST#; resets config space).
 +  * Reconfigure bridges and devices */
 + rtas_set_slot_reset (dn-child);
 +
 + /* Walk over all functions on this device */
 + struct device_node *peer = dn-child;
 + while (peer) {
 + rtas_configure_bridge(peer);
 + eeh_restore_bars(peer);
 + peer = peer-sibling;
 + }
 +
 + /* Give the system 5 seconds to finish running the user-space
 +  * hotplug scripts, e.g. ifdown for ethernet.  Yes, this is a hack,
 +  * but if we don't do this, weird things happen.
 +  */
 + if (frozen_ops) {
 + ssleep (5);
 + frozen_ops-enable_slot (frozen_slot);
 + }
 + return 0;
 +}

This dependence on struct hotplug_slot might be problematic as we
restrict the registration of PCI hotplug slots to exclude PHBs, VIO,
and embedded slots.  Noticed your comment to this effect.  I can work
with you offline on this.

 +
 +/* The longest amount of time to wait for a pci device
 + * to come back on line, in seconds.
 + */
 +#define MAX_WAIT_FOR_RECOVERY 15
 +
 +int handle_eeh_events (struct eeh_event *event)
 +{
 + 
...
 + frozen_device = pci_bus_to_OF_node(dev-bus);
 + if (!frozen_device)
 + {
 + printk (KERN_ERR EEH: Cannot find PCI controller for %s\n,
 + pci_name(dev));
 +
 + return 1;
 + }
 + BUG_ON (frozen_device-phb==NULL);
 +
 + /* We get permanent failure messages on empty slots.
 +  * These are false alarms. Empty slots have no child dn. */
 + if ((event-state == pci_channel_io_perm_failure)  (frozen_device == 
 NULL))

The second part of this conditional will never be true, as this has just
been checked above.

 +   if (frozen_device)
 +   freeze_count = frozen_device-eeh_freeze_count;

This conditional will always be true, as this has also ben checked
above.

 Index: linux-2.6.13-rc6-git9/include/asm-ppc64/prom.h
 ===
 --- linux-2.6.13-rc6-git9.orig/include/asm-ppc64/prom.h   2005-08-19 
 15:11:39.0 -0500
 +++ linux-2.6.13-rc6-git9/include/asm-ppc64/prom.h2005-08-23 
 13:31:52.0 -0500
 @@ -135,11 +135,17 @@
   int busno;  /* for pci devices */
   int bussubno;   /* for pci devices */
   int devfn;  /* for pci devices */
 - int eeh_mode;   /* See eeh.h for possible EEH_MODEs */
 - int 

Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-24 Thread Linas Vepstas
On Wed, Aug 24, 2005 at 10:45:31AM -0500, John Rose was heard to remark:
  +++ linux-2.6.13-rc6-git9/arch/ppc64/kernel/eeh_driver.c2005-08-23 
  14:34:44.0 -0500
  +/*
  + * PCI Hot Plug Controller Driver for RPA-compliant PPC64 platform.
 
 This probably isn't the right header description for this file :)

Yes, this file is a little ball of ugliness that resulted from moving
things out of the rpaphp directory; and, yes, it's rather
un-reconstructed. I released it under the release early program.

The meta-issue that I'd like to reach consensus on first is whether
there should be any hot-plug recovery attempted at all.  Removing
hot-plug-recovery support will make many of the issues you raise 
to be moot.

  +++ linux-2.6.13-rc6-git9/include/asm-ppc64/prom.h  2005-08-23 
  13:31:52.0 -0500
  int busno;  /* for pci devices */
  int bussubno;   /* for pci devices */
  int devfn;  /* for pci devices */
 
 How about a pointer to a struct of EEH fields?  Folks are touchy about
 adding anything PCI-specific to device nodes, especially since most DNs
 aren't PCI at all.

I attempted to remove all of the pci-related stuff from this struct,
and got a crash in very very early boot (before the transition from
real to virtual addressing). Not sure why, I was surprised.  It seems 
related to the flattening of the device ndode tree. I'll try again 
soon.

--linas
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-24 Thread Paul Mackerras
Linas Vepstas writes:

 The meta-issue that I'd like to reach consensus on first is whether
 there should be any hot-plug recovery attempted at all.  Removing
 hot-plug-recovery support will make many of the issues you raise 
 to be moot.

Yes, this probably the thorniest issue we have.

My feeling is that the unplug half of it is probably fairly
uncontroversial, but the replug half is a can of worms.  Would you
agree with that?

Is it udev that handles the hotplug notifications on the userspace
side in all cases (do both RHEL and SLES use udev, for instance)?
How hard is it to add a new sort of notification, on the kernel side
and in udev?

I think what I'd like to see is that when a slot gets isolated and the
driver doesn't have recovery code, the kernel calls the driver's
unplug function and generates a hotplug event to udev.  Ideally this
would be a variant of the remove event which would say and by the
way, please try replugging this slot when you've finished handling the
remove event or something along those lines.

Thoughts?

Paul.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-24 Thread Benjamin Herrenschmidt

 I think what I'd like to see is that when a slot gets isolated and the
 driver doesn't have recovery code, the kernel calls the driver's
 unplug function and generates a hotplug event to udev.  Ideally this
 would be a variant of the remove event which would say and by the
 way, please try replugging this slot when you've finished handling the
 remove event or something along those lines.

I'm still trying to understand why we care. What prevents us from just
uplugging the previous device and re-plugging right away ? After all,
the driver-remove() function is supposed to guarantee that no HW access
will happen after it returns and that everything was unmapped.

Of course, we'll possibly end up with a different ethX or whatever, but
I don't see the problem with that ... It's hopeless to think we might
manage to keep that identical anyway, unless the driver implements
proper error recovery.

Ben.


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-24 Thread Paul Mackerras
I wrote:

 Linas Vepstas writes:
 
 In this patch at least, your mailer seems to have blanked out lines
 that match ^[-+]$.  Could you send them to me again with a different
 mailer or put them on a web or ftp site somewhere?

I got 3 copies of each of these mails, one directly, one through
linuxppc64-dev and one through linux-kernel.  It looks like the copies
that came through the mailing lists are OK but the copy that came
directly to me is corrupted.  Weird.

Regards,
Paul.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-23 Thread Paul Mackerras
Linas Vepstas writes:

In this patch at least, your mailer seems to have blanked out lines
that match ^[-+]$.  Could you send them to me again with a different
mailer or put them on a web or ftp site somewhere?

Thanks,
Paul.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-23 Thread Linas Vepstas

Various PCI bus errors can be signaled by newer PCI controllers.  The
core error recovery routines are architecture dependent.  This patch adds
a recovery infrastructure for the  PPC64 pSeries systems.

Signed-off-by: Linas Vepstas <[EMAIL PROTECTED]>

--
 arch/ppc64/kernel/Makefile |2 
 arch/ppc64/kernel/eeh.c|  543 +
 arch/ppc64/kernel/eeh_driver.c |  361 +++
 arch/ppc64/kernel/eeh_event.c  |  116 
 arch/ppc64/kernel/eeh_event.h  |   52 +++
 arch/ppc64/kernel/rtas_pci.c   |5 
 include/asm-ppc64/eeh.h|  105 +--
 include/asm-ppc64/prom.h   |   10 
 include/asm-ppc64/rtas.h   |2 
 9 files changed, 1001 insertions(+), 195 deletions(-)

Index: linux-2.6.13-rc6-git9/arch/ppc64/kernel/eeh.c
===
--- linux-2.6.13-rc6-git9.orig/arch/ppc64/kernel/eeh.c  2005-08-19 
12:52:31.0 -0500
+++ linux-2.6.13-rc6-git9/arch/ppc64/kernel/eeh.c   2005-08-23 
16:53:05.0 -0500
@@ -1,32 +1,33 @@
 /*
+ *
  * eeh.c
  * Copyright (C) 2001 Dave Engebretsen & Todd Inglett IBM Corporation
- * 
+ *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
  * the Free Software Foundation; either version 2 of the License, or
  * (at your option) any later version.
- * 
+ *
  * This program is distributed in the hope that it will be useful,
  * but WITHOUT ANY WARRANTY; without even the implied warranty of
  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  * GNU General Public License for more details.
- * 
+ *
  * You should have received a copy of the GNU General Public License
  * along with this program; if not, write to the Free Software
  * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
  */
 
-#include 
+#include 
 #include 
+#include 
 #include 
-#include 
-#include 
 #include 
 #include 
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -34,6 +35,7 @@
 #include 
 #include 
 #include "pci.h"
+#include "eeh_event.h"
 
 #undef DEBUG
 
@@ -49,8 +51,8 @@
  *  were "empty": all reads return 0xff's and all writes are silently
  *  ignored.  EEH slot isolation events can be triggered by parity
  *  errors on the address or data busses (e.g. during posted writes),
- *  which in turn might be caused by dust, vibration, humidity,
- *  radioactivity or plain-old failed hardware.
+ *  which in turn might be caused by low voltage on the bus, dust,
+ *  vibration, humidity, radioactivity or plain-old failed hardware.
  *
  *  Note, however, that one of the leading causes of EEH slot
  *  freeze events are buggy device drivers, buggy device microcode,
@@ -75,22 +77,13 @@
 #define BUID_HI(buid) ((buid) >> 32)
 #define BUID_LO(buid) ((buid) & 0x)
 
-/* EEH event workqueue setup. */
-static DEFINE_SPINLOCK(eeh_eventlist_lock);
-LIST_HEAD(eeh_eventlist);
-static void eeh_event_handler(void *);
-DECLARE_WORK(eeh_event_wq, eeh_event_handler, NULL);
-
-static struct notifier_block *eeh_notifier_chain;
-
 /*
  * If a device driver keeps reading an MMIO register in an interrupt
  * handler after a slot isolation event has occurred, we assume it
  * is broken and panic.  This sets the threshold for how many read
  * attempts we allow before panicking.
  */
-#define EEH_MAX_FAILS  1000
-static atomic_t eeh_fail_count;
+#define EEH_MAX_FAILS  10
 
 /* RTAS tokens */
 static int ibm_set_eeh_option;
@@ -107,6 +100,10 @@
 static int eeh_error_buf_size;
 
 /* System monitoring statistics */
+static DEFINE_PER_CPU(unsigned long, no_device);
+static DEFINE_PER_CPU(unsigned long, no_dn);
+static DEFINE_PER_CPU(unsigned long, no_cfg_addr);
+static DEFINE_PER_CPU(unsigned long, ignored_check);
 static DEFINE_PER_CPU(unsigned long, total_mmio_ffs);
 static DEFINE_PER_CPU(unsigned long, false_positives);
 static DEFINE_PER_CPU(unsigned long, ignored_failures);
@@ -224,9 +221,9 @@
while (*p) {
parent = *p;
piar = rb_entry(parent, struct pci_io_addr_range, rb_node);
-   if (alo < piar->addr_lo) {
+   if (ahi < piar->addr_lo) {
p = >rb_left;
-   } else if (ahi > piar->addr_hi) {
+   } else if (alo > piar->addr_hi) {
p = >rb_right;
} else {
if (dev != piar->pcidev ||
@@ -245,6 +242,11 @@
piar->pcidev = dev;
piar->flags = flags;
 
+#ifdef DEBUG
+   printk (KERN_DEBUG "PIAR: insert range=[%lx:%lx] dev=%s\n",
+  alo, ahi, pci_name (dev));
+#endif
+
rb_link_node(>rb_node, parent, p);
rb_insert_color(>rb_node, _io_addr_cache_root.rb_root);
 
@@ -268,8 +270,8 @@
if (!(dn->eeh_mode & EEH_MODE_SUPPORTED) ||
dn->eeh_mode & EEH_MODE_NOCHECK) {
 #ifdef DEBUG
-   

[patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-23 Thread Linas Vepstas

Various PCI bus errors can be signaled by newer PCI controllers.  The
core error recovery routines are architecture dependent.  This patch adds
a recovery infrastructure for the  PPC64 pSeries systems.

Signed-off-by: Linas Vepstas [EMAIL PROTECTED]

--
 arch/ppc64/kernel/Makefile |2 
 arch/ppc64/kernel/eeh.c|  543 +
 arch/ppc64/kernel/eeh_driver.c |  361 +++
 arch/ppc64/kernel/eeh_event.c  |  116 
 arch/ppc64/kernel/eeh_event.h  |   52 +++
 arch/ppc64/kernel/rtas_pci.c   |5 
 include/asm-ppc64/eeh.h|  105 +--
 include/asm-ppc64/prom.h   |   10 
 include/asm-ppc64/rtas.h   |2 
 9 files changed, 1001 insertions(+), 195 deletions(-)

Index: linux-2.6.13-rc6-git9/arch/ppc64/kernel/eeh.c
===
--- linux-2.6.13-rc6-git9.orig/arch/ppc64/kernel/eeh.c  2005-08-19 
12:52:31.0 -0500
+++ linux-2.6.13-rc6-git9/arch/ppc64/kernel/eeh.c   2005-08-23 
16:53:05.0 -0500
@@ -1,32 +1,33 @@
 /*
+ *
  * eeh.c
  * Copyright (C) 2001 Dave Engebretsen  Todd Inglett IBM Corporation
- * 
+ *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
  * the Free Software Foundation; either version 2 of the License, or
  * (at your option) any later version.
- * 
+ *
  * This program is distributed in the hope that it will be useful,
  * but WITHOUT ANY WARRANTY; without even the implied warranty of
  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  * GNU General Public License for more details.
- * 
+ *
  * You should have received a copy of the GNU General Public License
  * along with this program; if not, write to the Free Software
  * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
  */
 
-#include linux/bootmem.h
+#include linux/delay.h
 #include linux/init.h
+#include linux/irq.h
 #include linux/list.h
-#include linux/mm.h
-#include linux/notifier.h
 #include linux/pci.h
 #include linux/proc_fs.h
 #include linux/rbtree.h
 #include linux/seq_file.h
 #include linux/spinlock.h
+#include asm/atomic.h
 #include asm/eeh.h
 #include asm/io.h
 #include asm/machdep.h
@@ -34,6 +35,7 @@
 #include asm/atomic.h
 #include asm/systemcfg.h
 #include pci.h
+#include eeh_event.h
 
 #undef DEBUG
 
@@ -49,8 +51,8 @@
  *  were empty: all reads return 0xff's and all writes are silently
  *  ignored.  EEH slot isolation events can be triggered by parity
  *  errors on the address or data busses (e.g. during posted writes),
- *  which in turn might be caused by dust, vibration, humidity,
- *  radioactivity or plain-old failed hardware.
+ *  which in turn might be caused by low voltage on the bus, dust,
+ *  vibration, humidity, radioactivity or plain-old failed hardware.
  *
  *  Note, however, that one of the leading causes of EEH slot
  *  freeze events are buggy device drivers, buggy device microcode,
@@ -75,22 +77,13 @@
 #define BUID_HI(buid) ((buid)  32)
 #define BUID_LO(buid) ((buid)  0x)
 
-/* EEH event workqueue setup. */
-static DEFINE_SPINLOCK(eeh_eventlist_lock);
-LIST_HEAD(eeh_eventlist);
-static void eeh_event_handler(void *);
-DECLARE_WORK(eeh_event_wq, eeh_event_handler, NULL);
-
-static struct notifier_block *eeh_notifier_chain;
-
 /*
  * If a device driver keeps reading an MMIO register in an interrupt
  * handler after a slot isolation event has occurred, we assume it
  * is broken and panic.  This sets the threshold for how many read
  * attempts we allow before panicking.
  */
-#define EEH_MAX_FAILS  1000
-static atomic_t eeh_fail_count;
+#define EEH_MAX_FAILS  10
 
 /* RTAS tokens */
 static int ibm_set_eeh_option;
@@ -107,6 +100,10 @@
 static int eeh_error_buf_size;
 
 /* System monitoring statistics */
+static DEFINE_PER_CPU(unsigned long, no_device);
+static DEFINE_PER_CPU(unsigned long, no_dn);
+static DEFINE_PER_CPU(unsigned long, no_cfg_addr);
+static DEFINE_PER_CPU(unsigned long, ignored_check);
 static DEFINE_PER_CPU(unsigned long, total_mmio_ffs);
 static DEFINE_PER_CPU(unsigned long, false_positives);
 static DEFINE_PER_CPU(unsigned long, ignored_failures);
@@ -224,9 +221,9 @@
while (*p) {
parent = *p;
piar = rb_entry(parent, struct pci_io_addr_range, rb_node);
-   if (alo  piar-addr_lo) {
+   if (ahi  piar-addr_lo) {
p = parent-rb_left;
-   } else if (ahi  piar-addr_hi) {
+   } else if (alo  piar-addr_hi) {
p = parent-rb_right;
} else {
if (dev != piar-pcidev ||
@@ -245,6 +242,11 @@
piar-pcidev = dev;
piar-flags = flags;
 
+#ifdef DEBUG
+   printk (KERN_DEBUG PIAR: insert range=[%lx:%lx] dev=%s\n,
+  alo, ahi, pci_name (dev));
+#endif
+
rb_link_node(piar-rb_node, 

Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-23 Thread Paul Mackerras
Linas Vepstas writes:

In this patch at least, your mailer seems to have blanked out lines
that match ^[-+]$.  Could you send them to me again with a different
mailer or put them on a web or ftp site somewhere?

Thanks,
Paul.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/