Re: [patch] MSI-X: fix resume crash

2007-03-28 Thread Eric W. Biederman
Len Brown <[EMAIL PROTECTED]> writes:

>> Tony, Len the way pci_disable_device is being used in a suspend/resume 
>> path by a few drivers is completely incompatible with the way irqs are 
>> allocated on ia64.  In particular people the following sequence occurs 
>> in several drivers.
>> 
>> probe:
>>   pci_enable_device(pdev);
>>   request_irq(pdev->irq);
>> suspend:
>>   pci_disable_device(pdev);
>> resume:
>>   pci_enable_device(pdev);
>> remove:
>>   free_irq(pdev->irq);
>>   pci_disable_device(pdev);
>
> There are no IA64 machines that support system suspend/resume today --
> so you have 0 chance of breaking the IA64 suspend/resume installed base.

Ok.  So that is why the inconsistency persists...

> My understanding is that Luming Yu has cobbled IA64 S4 support
> together for a future release though.
>
>> What I'm proposing we do is move the irq allocation code out of 
>> pci_enable_device and the irq freeing code out of pci_disable_device in 
>> the future.  If we move ia64 to a model where the irq number equal the 
>> gsi like we have for x86_64 and are in the middle of for i386 that 
>> should be pretty straight forward. It would even be relatively simple  
>> to delay vector allocation in that context until request_irq, if we 
>> needed the delayed allocation benefit.  Do you two have any problems 
>> with moving in that direction?
>
> I think consistency here would be _wonderful_.
> Of course the beauty of having identity GSI=IRQ and a /proc/interrupts
> that tells you what IOAPIC pin you are using become moot with MSI --
> but hey, showing the IRQ number rather than the vector number
> is consistent and makes sense.

Yes.  It also allows for bigger machines.  And I can get a consistent
number out of MSI if we allocate irq numbers in a sufficiently non-sparse
way.  Something like bus|device|func|irq which is 8+5+3+12 or 28 bits...
I'll never get there though if i keep unearthing this long standing bugs.

>> If fixing the arch code is unacceptable for some reason I'm not aware of 
>> we need to audit the 10-20 drivers that call pci_disable_device in their 
>> suspend/resume processing and ensure that they have freed all of the 
>> irqs before that point.  Given that I have bug reports on the msi path I 
>> know that isn't true.
>
> I think the suspend/resume interrupt logic needs some serious attention.
> We've had several schemes for suspend/resume of interrupts, several
> changes in strategy, and right now I think we are inconsistent,
> and frankly, I'm amazed it works at all.

What I have been doing lately is to aim at consistency in how a function
is called (and thus how it is expected to be used) and how it is actually
implemented.  When I have a choice I try to pick a forgiving implementation
so that driver writers don't have to follow a magic correct path for
things to work correctly.  

Removing the irq assignment from pci_enable_device is something that
matches implementation with use.

As for the rest it seems reasonable to me to allow an irq to be held
requested over suspend/resume and to save and restore apic and msi
capability state.  Especially since irq numbers are a kernel
abstraction we should be able to do with them what we need to.

Honestly the whole suspend/resume thing is beyond me at this point I'm
laptop free...  But I do know how to make code consistent with itself.

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] MSI-X: fix resume crash

2007-03-28 Thread Len Brown
> Tony, Len the way pci_disable_device is being used in a suspend/resume 
> path by a few drivers is completely incompatible with the way irqs are 
> allocated on ia64.  In particular people the following sequence occurs 
> in several drivers.
> 
> probe:
>   pci_enable_device(pdev);
>   request_irq(pdev->irq);
> suspend:
>   pci_disable_device(pdev);
> resume:
>   pci_enable_device(pdev);
> remove:
>   free_irq(pdev->irq);
>   pci_disable_device(pdev);

There are no IA64 machines that support system suspend/resume today --
so you have 0 chance of breaking the IA64 suspend/resume installed base.

My understanding is that Luming Yu has cobbled IA64 S4 support
together for a future release though.

> What I'm proposing we do is move the irq allocation code out of 
> pci_enable_device and the irq freeing code out of pci_disable_device in 
> the future.  If we move ia64 to a model where the irq number equal the 
> gsi like we have for x86_64 and are in the middle of for i386 that 
> should be pretty straight forward. It would even be relatively simple  
> to delay vector allocation in that context until request_irq, if we 
> needed the delayed allocation benefit.  Do you two have any problems 
> with moving in that direction?

I think consistency here would be _wonderful_.
Of course the beauty of having identity GSI=IRQ and a /proc/interrupts
that tells you what IOAPIC pin you are using become moot with MSI --
but hey, showing the IRQ number rather than the vector number
is consistent and makes sense.

> If fixing the arch code is unacceptable for some reason I'm not aware of 
> we need to audit the 10-20 drivers that call pci_disable_device in their 
> suspend/resume processing and ensure that they have freed all of the 
> irqs before that point.  Given that I have bug reports on the msi path I 
> know that isn't true.

I think the suspend/resume interrupt logic needs some serious attention.
We've had several schemes for suspend/resume of interrupts, several
changes in strategy, and right now I think we are inconsistent,
and frankly, I'm amazed it works at all.

-Len


> From: Eric W. Biederman <[EMAIL PROTECTED]>
> Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
> ---
>  arch/cris/arch-v32/drivers/pci/bios.c |4 +++-
>  arch/frv/mb93090-mb00/pci-vdk.c   |3 ++-
>  arch/i386/pci/common.c|6 --
>  arch/ia64/pci/pci.c   |8 ++--
>  4 files changed, 15 insertions(+), 6 deletions(-)
> 
> Index: linux/arch/cris/arch-v32/drivers/pci/bios.c
> ===
> --- linux.orig/arch/cris/arch-v32/drivers/pci/bios.c
> +++ linux/arch/cris/arch-v32/drivers/pci/bios.c
> @@ -100,7 +100,9 @@ int pcibios_enable_device(struct pci_dev
>   if ((err = pcibios_enable_resources(dev, mask)) < 0)
>   return err;
>  
> - return pcibios_enable_irq(dev);
> + if (!dev->msi_enabled)
> + pcibios_enable_irq(dev);
> + return 0;
>  }
>  
>  int pcibios_assign_resources(void)
> Index: linux/arch/frv/mb93090-mb00/pci-vdk.c
> ===
> --- linux.orig/arch/frv/mb93090-mb00/pci-vdk.c
> +++ linux/arch/frv/mb93090-mb00/pci-vdk.c
> @@ -466,6 +466,7 @@ int pcibios_enable_device(struct pci_dev
>  
>   if ((err = pcibios_enable_resources(dev, mask)) < 0)
>   return err;
> - pcibios_enable_irq(dev);
> + if (!dev->msi_enabled)
> + pcibios_enable_irq(dev);
>   return 0;
>  }
> Index: linux/arch/i386/pci/common.c
> ===
> --- linux.orig/arch/i386/pci/common.c
> +++ linux/arch/i386/pci/common.c
> @@ -434,11 +434,13 @@ int pcibios_enable_device(struct pci_dev
>   if ((err = pcibios_enable_resources(dev, mask)) < 0)
>   return err;
>  
> - return pcibios_enable_irq(dev);
> + if (!dev->msi_enabled)
> + return pcibios_enable_irq(dev);
> + return 0;
>  }
>  
>  void pcibios_disable_device (struct pci_dev *dev)
>  {
> - if (pcibios_disable_irq)
> + if (!dev->msi_enabled && pcibios_disable_irq)
>   pcibios_disable_irq(dev);
>  }
> Index: linux/arch/ia64/pci/pci.c
> ===
> --- linux.orig/arch/ia64/pci/pci.c
> +++ linux/arch/ia64/pci/pci.c
> @@ -557,14 +557,18 @@ pcibios_enable_device (struct pci_dev *d
>   if (ret < 0)
>   return ret;
>  
> - return acpi_pci_irq_enable(dev);
> + if (!dev->msi_enabled)
> + return acpi_pci_irq_enable(dev);
> + return 0;
>  }
>  
>  void
>  pcibios_disable_device (struct pci_dev *dev)
>  {
>   BUG_ON(atomic_read(>enable_cnt));
> - acpi_pci_irq_disable(dev);
> + if (!dev->msi_enabled)
> + acpi_pci_irq_disable(dev);
> + return 0;
>  }
>  
>  void
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a 

Re: [patch] MSI-X: fix resume crash

2007-03-28 Thread Ingo Molnar

* Eric W. Biederman <[EMAIL PROTECTED]> wrote:

> > Eric's patch seems to have done the trick on my T60: i've done 10 
> > suspend+resumes and each worked fine. I've tidied up the description 
> > part of Eric's patch a bit for upstream application - find it below.
> 
> Thanks.  Tidying up the description has been on my todo list for the 
> last little bit but I just haven't gotten there.
> 
> I've gotten at least Tony's sign off on the architectural direction so 
> there is nothing to prevent this patch from going in.  Unless Linus or 
> someone wants a more thorough patch this late in the release cycle.
> 
> Signed-off-by: "Eric W. Biederman" <[EMAIL PROTECTED]>

i've updated the patch below with your sign-off, and trimmed the 
description to the relevant bits, for easy upstream application.

    Ingo

->
Subject: [patch] MSI-X: fix resume crash
From: Eric W. Biederman <[EMAIL PROTECTED]>

So I think the right solution is to simply make pci_enable_device just 
flip enable bits and move the rest of the work someplace else.

However a thorough cleanup is a little extreme for this point in the 
release cycle, so I think a quick hack that makes the code not stomp the 
irq when msi irq's are enabled should be the first fix.  Then we can 
later make the code not change the irqs at all.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 arch/cris/arch-v32/drivers/pci/bios.c |4 +++-
 arch/frv/mb93090-mb00/pci-vdk.c   |3 ++-
 arch/i386/pci/common.c|6 --
 arch/ia64/pci/pci.c   |8 ++--
 4 files changed, 15 insertions(+), 6 deletions(-)

Index: linux/arch/cris/arch-v32/drivers/pci/bios.c
===
--- linux.orig/arch/cris/arch-v32/drivers/pci/bios.c
+++ linux/arch/cris/arch-v32/drivers/pci/bios.c
@@ -100,7 +100,9 @@ int pcibios_enable_device(struct pci_dev
if ((err = pcibios_enable_resources(dev, mask)) < 0)
return err;
 
-   return pcibios_enable_irq(dev);
+   if (!dev->msi_enabled)
+   pcibios_enable_irq(dev);
+   return 0;
 }
 
 int pcibios_assign_resources(void)
Index: linux/arch/frv/mb93090-mb00/pci-vdk.c
===
--- linux.orig/arch/frv/mb93090-mb00/pci-vdk.c
+++ linux/arch/frv/mb93090-mb00/pci-vdk.c
@@ -466,6 +466,7 @@ int pcibios_enable_device(struct pci_dev
 
if ((err = pcibios_enable_resources(dev, mask)) < 0)
return err;
-   pcibios_enable_irq(dev);
+   if (!dev->msi_enabled)
+   pcibios_enable_irq(dev);
return 0;
 }
Index: linux/arch/i386/pci/common.c
===
--- linux.orig/arch/i386/pci/common.c
+++ linux/arch/i386/pci/common.c
@@ -434,11 +434,13 @@ int pcibios_enable_device(struct pci_dev
if ((err = pcibios_enable_resources(dev, mask)) < 0)
return err;
 
-   return pcibios_enable_irq(dev);
+   if (!dev->msi_enabled)
+   return pcibios_enable_irq(dev);
+   return 0;
 }
 
 void pcibios_disable_device (struct pci_dev *dev)
 {
-   if (pcibios_disable_irq)
+   if (!dev->msi_enabled && pcibios_disable_irq)
pcibios_disable_irq(dev);
 }
Index: linux/arch/ia64/pci/pci.c
===
--- linux.orig/arch/ia64/pci/pci.c
+++ linux/arch/ia64/pci/pci.c
@@ -557,14 +557,18 @@ pcibios_enable_device (struct pci_dev *d
if (ret < 0)
return ret;
 
-   return acpi_pci_irq_enable(dev);
+   if (!dev->msi_enabled)
+   return acpi_pci_irq_enable(dev);
+   return 0;
 }
 
 void
 pcibios_disable_device (struct pci_dev *dev)
 {
BUG_ON(atomic_read(>enable_cnt));
-   acpi_pci_irq_disable(dev);
+   if (!dev->msi_enabled)
+   acpi_pci_irq_disable(dev);
+   return 0;
 }
 
 void
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] MSI-X: fix resume crash

2007-03-28 Thread Eric W. Biederman
Ingo Molnar <[EMAIL PROTECTED]> writes:

> * Ingo Molnar <[EMAIL PROTECTED]> wrote:
>
>> [...] I'll now re-test Eric's MSI patch.
>
> Eric's patch seems to have done the trick on my T60: i've done 10 
> suspend+resumes and each worked fine. I've tidied up the description 
> part of Eric's patch a bit for upstream application - find it below.

Thanks.  Tidying up the description has been on my todo list for the
last little bit but I just haven't gotten there.

I've gotten at least Tony's sign off on the architectural direction
so there is nothing to prevent this patch from going in.  Unless
Linus or someone wants a more thorough patch this late in the
release cycle.

Signed-off-by: "Eric W. Biederman" <[EMAIL PROTECTED]>


> ------>
> Subject: [patch] MSI-X: fix resume crash
> From: Eric W. Biederman <[EMAIL PROTECTED]>
>
> I think the right solution is to simply make pci_enable_device just flip 
> enable bits and move the rest of the work someplace else.
>
> However a thorough cleanup is a little extreme for this point in the 
> release cycle, so I think a quick hack that makes the code not stomp the 
> irq when msi irq's are enabled should be the first fix.  Then we can 
> later make the code not change the irqs at all.
>
> Tony, Len the way pci_disable_device is being used in a suspend/resume 
> path by a few drivers is completely incompatible with the way irqs are 
> allocated on ia64.  In particular people the following sequence occurs 
> in several drivers.
>
> probe:
>   pci_enable_device(pdev);
>   request_irq(pdev->irq);
> suspend:
>   pci_disable_device(pdev);
> resume:
>   pci_enable_device(pdev);
> remove:
>   free_irq(pdev->irq);
>   pci_disable_device(pdev);
>
> What I'm proposing we do is move the irq allocation code out of 
> pci_enable_device and the irq freeing code out of pci_disable_device in 
> the future.  If we move ia64 to a model where the irq number equal the 
> gsi like we have for x86_64 and are in the middle of for i386 that 
> should be pretty straight forward.  It would even be relatively simple 
> to delay vector allocation in that context until request_irq, if we 
> needed the delayed allocation benefit.  Do you two have any problems 
> with moving in that direction?
>
> If fixing the arch code is unacceptable for some reason I'm not aware of 
> we need to audit the 10-20 drivers that call pci_disable_device in their 
> suspend/resume processing and ensure that they have freed all of the 
> irqs before that point.  Given that I have bug reports on the msi path I 
> know that isn't true.
>
> From: Eric W. Biederman <[EMAIL PROTECTED]>
> Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
> ---
>  arch/cris/arch-v32/drivers/pci/bios.c |4 +++-
>  arch/frv/mb93090-mb00/pci-vdk.c   |3 ++-
>  arch/i386/pci/common.c|6 --
>  arch/ia64/pci/pci.c   |8 ++--
>  4 files changed, 15 insertions(+), 6 deletions(-)
>
> Index: linux/arch/cris/arch-v32/drivers/pci/bios.c
> ===
> --- linux.orig/arch/cris/arch-v32/drivers/pci/bios.c
> +++ linux/arch/cris/arch-v32/drivers/pci/bios.c
> @@ -100,7 +100,9 @@ int pcibios_enable_device(struct pci_dev
>   if ((err = pcibios_enable_resources(dev, mask)) < 0)
>   return err;
>  
> - return pcibios_enable_irq(dev);
> + if (!dev->msi_enabled)
> + pcibios_enable_irq(dev);
> + return 0;
>  }
>  
>  int pcibios_assign_resources(void)
> Index: linux/arch/frv/mb93090-mb00/pci-vdk.c
> ===
> --- linux.orig/arch/frv/mb93090-mb00/pci-vdk.c
> +++ linux/arch/frv/mb93090-mb00/pci-vdk.c
> @@ -466,6 +466,7 @@ int pcibios_enable_device(struct pci_dev
>  
>   if ((err = pcibios_enable_resources(dev, mask)) < 0)
>   return err;
> - pcibios_enable_irq(dev);
> + if (!dev->msi_enabled)
> + pcibios_enable_irq(dev);
>   return 0;
>  }
> Index: linux/arch/i386/pci/common.c
> ===
> --- linux.orig/arch/i386/pci/common.c
> +++ linux/arch/i386/pci/common.c
> @@ -434,11 +434,13 @@ int pcibios_enable_device(struct pci_dev
>   if ((err = pcibios_enable_resources(dev, mask)) < 0)
>   return err;
>  
> - return pcibios_enable_irq(dev);
> + if (!dev->msi_enabled)
> + return pcibios_enable_irq(dev);
> + return 0;
>  }
>  
>  void pcibios_disable_device (struct pci_dev *dev)
>  {
> - if (pcibios_disable_irq)
> + if (!dev

[patch] MSI-X: fix resume crash

2007-03-28 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> [...] I'll now re-test Eric's MSI patch.

Eric's patch seems to have done the trick on my T60: i've done 10 
suspend+resumes and each worked fine. I've tidied up the description 
part of Eric's patch a bit for upstream application - find it below.

Ingo

-->
Subject: [patch] MSI-X: fix resume crash
From: Eric W. Biederman <[EMAIL PROTECTED]>

I think the right solution is to simply make pci_enable_device just flip 
enable bits and move the rest of the work someplace else.

However a thorough cleanup is a little extreme for this point in the 
release cycle, so I think a quick hack that makes the code not stomp the 
irq when msi irq's are enabled should be the first fix.  Then we can 
later make the code not change the irqs at all.

Tony, Len the way pci_disable_device is being used in a suspend/resume 
path by a few drivers is completely incompatible with the way irqs are 
allocated on ia64.  In particular people the following sequence occurs 
in several drivers.

probe:
  pci_enable_device(pdev);
  request_irq(pdev->irq);
suspend:
  pci_disable_device(pdev);
resume:
  pci_enable_device(pdev);
remove:
  free_irq(pdev->irq);
  pci_disable_device(pdev);

What I'm proposing we do is move the irq allocation code out of 
pci_enable_device and the irq freeing code out of pci_disable_device in 
the future.  If we move ia64 to a model where the irq number equal the 
gsi like we have for x86_64 and are in the middle of for i386 that 
should be pretty straight forward.  It would even be relatively simple 
to delay vector allocation in that context until request_irq, if we 
needed the delayed allocation benefit.  Do you two have any problems 
with moving in that direction?

If fixing the arch code is unacceptable for some reason I'm not aware of 
we need to audit the 10-20 drivers that call pci_disable_device in their 
suspend/resume processing and ensure that they have freed all of the 
irqs before that point.  Given that I have bug reports on the msi path I 
know that isn't true.

From: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 arch/cris/arch-v32/drivers/pci/bios.c |4 +++-
 arch/frv/mb93090-mb00/pci-vdk.c   |3 ++-
 arch/i386/pci/common.c|6 --
 arch/ia64/pci/pci.c   |8 ++--
 4 files changed, 15 insertions(+), 6 deletions(-)

Index: linux/arch/cris/arch-v32/drivers/pci/bios.c
===
--- linux.orig/arch/cris/arch-v32/drivers/pci/bios.c
+++ linux/arch/cris/arch-v32/drivers/pci/bios.c
@@ -100,7 +100,9 @@ int pcibios_enable_device(struct pci_dev
if ((err = pcibios_enable_resources(dev, mask)) < 0)
return err;
 
-   return pcibios_enable_irq(dev);
+   if (!dev->msi_enabled)
+   pcibios_enable_irq(dev);
+   return 0;
 }
 
 int pcibios_assign_resources(void)
Index: linux/arch/frv/mb93090-mb00/pci-vdk.c
===
--- linux.orig/arch/frv/mb93090-mb00/pci-vdk.c
+++ linux/arch/frv/mb93090-mb00/pci-vdk.c
@@ -466,6 +466,7 @@ int pcibios_enable_device(struct pci_dev
 
if ((err = pcibios_enable_resources(dev, mask)) < 0)
return err;
-   pcibios_enable_irq(dev);
+   if (!dev->msi_enabled)
+   pcibios_enable_irq(dev);
return 0;
 }
Index: linux/arch/i386/pci/common.c
===
--- linux.orig/arch/i386/pci/common.c
+++ linux/arch/i386/pci/common.c
@@ -434,11 +434,13 @@ int pcibios_enable_device(struct pci_dev
if ((err = pcibios_enable_resources(dev, mask)) < 0)
return err;
 
-   return pcibios_enable_irq(dev);
+   if (!dev->msi_enabled)
+   return pcibios_enable_irq(dev);
+   return 0;
 }
 
 void pcibios_disable_device (struct pci_dev *dev)
 {
-   if (pcibios_disable_irq)
+   if (!dev->msi_enabled && pcibios_disable_irq)
pcibios_disable_irq(dev);
 }
Index: linux/arch/ia64/pci/pci.c
===
--- linux.orig/arch/ia64/pci/pci.c
+++ linux/arch/ia64/pci/pci.c
@@ -557,14 +557,18 @@ pcibios_enable_device (struct pci_dev *d
if (ret < 0)
return ret;
 
-   return acpi_pci_irq_enable(dev);
+   if (!dev->msi_enabled)
+   return acpi_pci_irq_enable(dev);
+   return 0;
 }
 
 void
 pcibios_disable_device (struct pci_dev *dev)
 {
BUG_ON(atomic_read(>enable_cnt));
-   acpi_pci_irq_disable(dev);
+   if (!dev->msi_enabled)
+   acpi_pci_irq_disable(dev);
+   return 0;
 }
 
 void
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More maj

[patch] MSI-X: fix resume crash

2007-03-28 Thread Ingo Molnar

* Ingo Molnar [EMAIL PROTECTED] wrote:

 [...] I'll now re-test Eric's MSI patch.

Eric's patch seems to have done the trick on my T60: i've done 10 
suspend+resumes and each worked fine. I've tidied up the description 
part of Eric's patch a bit for upstream application - find it below.

Ingo

--
Subject: [patch] MSI-X: fix resume crash
From: Eric W. Biederman [EMAIL PROTECTED]

I think the right solution is to simply make pci_enable_device just flip 
enable bits and move the rest of the work someplace else.

However a thorough cleanup is a little extreme for this point in the 
release cycle, so I think a quick hack that makes the code not stomp the 
irq when msi irq's are enabled should be the first fix.  Then we can 
later make the code not change the irqs at all.

Tony, Len the way pci_disable_device is being used in a suspend/resume 
path by a few drivers is completely incompatible with the way irqs are 
allocated on ia64.  In particular people the following sequence occurs 
in several drivers.

probe:
  pci_enable_device(pdev);
  request_irq(pdev-irq);
suspend:
  pci_disable_device(pdev);
resume:
  pci_enable_device(pdev);
remove:
  free_irq(pdev-irq);
  pci_disable_device(pdev);

What I'm proposing we do is move the irq allocation code out of 
pci_enable_device and the irq freeing code out of pci_disable_device in 
the future.  If we move ia64 to a model where the irq number equal the 
gsi like we have for x86_64 and are in the middle of for i386 that 
should be pretty straight forward.  It would even be relatively simple 
to delay vector allocation in that context until request_irq, if we 
needed the delayed allocation benefit.  Do you two have any problems 
with moving in that direction?

If fixing the arch code is unacceptable for some reason I'm not aware of 
we need to audit the 10-20 drivers that call pci_disable_device in their 
suspend/resume processing and ensure that they have freed all of the 
irqs before that point.  Given that I have bug reports on the msi path I 
know that isn't true.

From: Eric W. Biederman [EMAIL PROTECTED]
Signed-off-by: Ingo Molnar [EMAIL PROTECTED]
---
 arch/cris/arch-v32/drivers/pci/bios.c |4 +++-
 arch/frv/mb93090-mb00/pci-vdk.c   |3 ++-
 arch/i386/pci/common.c|6 --
 arch/ia64/pci/pci.c   |8 ++--
 4 files changed, 15 insertions(+), 6 deletions(-)

Index: linux/arch/cris/arch-v32/drivers/pci/bios.c
===
--- linux.orig/arch/cris/arch-v32/drivers/pci/bios.c
+++ linux/arch/cris/arch-v32/drivers/pci/bios.c
@@ -100,7 +100,9 @@ int pcibios_enable_device(struct pci_dev
if ((err = pcibios_enable_resources(dev, mask))  0)
return err;
 
-   return pcibios_enable_irq(dev);
+   if (!dev-msi_enabled)
+   pcibios_enable_irq(dev);
+   return 0;
 }
 
 int pcibios_assign_resources(void)
Index: linux/arch/frv/mb93090-mb00/pci-vdk.c
===
--- linux.orig/arch/frv/mb93090-mb00/pci-vdk.c
+++ linux/arch/frv/mb93090-mb00/pci-vdk.c
@@ -466,6 +466,7 @@ int pcibios_enable_device(struct pci_dev
 
if ((err = pcibios_enable_resources(dev, mask))  0)
return err;
-   pcibios_enable_irq(dev);
+   if (!dev-msi_enabled)
+   pcibios_enable_irq(dev);
return 0;
 }
Index: linux/arch/i386/pci/common.c
===
--- linux.orig/arch/i386/pci/common.c
+++ linux/arch/i386/pci/common.c
@@ -434,11 +434,13 @@ int pcibios_enable_device(struct pci_dev
if ((err = pcibios_enable_resources(dev, mask))  0)
return err;
 
-   return pcibios_enable_irq(dev);
+   if (!dev-msi_enabled)
+   return pcibios_enable_irq(dev);
+   return 0;
 }
 
 void pcibios_disable_device (struct pci_dev *dev)
 {
-   if (pcibios_disable_irq)
+   if (!dev-msi_enabled  pcibios_disable_irq)
pcibios_disable_irq(dev);
 }
Index: linux/arch/ia64/pci/pci.c
===
--- linux.orig/arch/ia64/pci/pci.c
+++ linux/arch/ia64/pci/pci.c
@@ -557,14 +557,18 @@ pcibios_enable_device (struct pci_dev *d
if (ret  0)
return ret;
 
-   return acpi_pci_irq_enable(dev);
+   if (!dev-msi_enabled)
+   return acpi_pci_irq_enable(dev);
+   return 0;
 }
 
 void
 pcibios_disable_device (struct pci_dev *dev)
 {
BUG_ON(atomic_read(dev-enable_cnt));
-   acpi_pci_irq_disable(dev);
+   if (!dev-msi_enabled)
+   acpi_pci_irq_disable(dev);
+   return 0;
 }
 
 void
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] MSI-X: fix resume crash

2007-03-28 Thread Eric W. Biederman
Ingo Molnar [EMAIL PROTECTED] writes:

 * Ingo Molnar [EMAIL PROTECTED] wrote:

 [...] I'll now re-test Eric's MSI patch.

 Eric's patch seems to have done the trick on my T60: i've done 10 
 suspend+resumes and each worked fine. I've tidied up the description 
 part of Eric's patch a bit for upstream application - find it below.

Thanks.  Tidying up the description has been on my todo list for the
last little bit but I just haven't gotten there.

I've gotten at least Tony's sign off on the architectural direction
so there is nothing to prevent this patch from going in.  Unless
Linus or someone wants a more thorough patch this late in the
release cycle.

Signed-off-by: Eric W. Biederman [EMAIL PROTECTED]


 --
 Subject: [patch] MSI-X: fix resume crash
 From: Eric W. Biederman [EMAIL PROTECTED]

 I think the right solution is to simply make pci_enable_device just flip 
 enable bits and move the rest of the work someplace else.

 However a thorough cleanup is a little extreme for this point in the 
 release cycle, so I think a quick hack that makes the code not stomp the 
 irq when msi irq's are enabled should be the first fix.  Then we can 
 later make the code not change the irqs at all.

 Tony, Len the way pci_disable_device is being used in a suspend/resume 
 path by a few drivers is completely incompatible with the way irqs are 
 allocated on ia64.  In particular people the following sequence occurs 
 in several drivers.

 probe:
   pci_enable_device(pdev);
   request_irq(pdev-irq);
 suspend:
   pci_disable_device(pdev);
 resume:
   pci_enable_device(pdev);
 remove:
   free_irq(pdev-irq);
   pci_disable_device(pdev);

 What I'm proposing we do is move the irq allocation code out of 
 pci_enable_device and the irq freeing code out of pci_disable_device in 
 the future.  If we move ia64 to a model where the irq number equal the 
 gsi like we have for x86_64 and are in the middle of for i386 that 
 should be pretty straight forward.  It would even be relatively simple 
 to delay vector allocation in that context until request_irq, if we 
 needed the delayed allocation benefit.  Do you two have any problems 
 with moving in that direction?

 If fixing the arch code is unacceptable for some reason I'm not aware of 
 we need to audit the 10-20 drivers that call pci_disable_device in their 
 suspend/resume processing and ensure that they have freed all of the 
 irqs before that point.  Given that I have bug reports on the msi path I 
 know that isn't true.

 From: Eric W. Biederman [EMAIL PROTECTED]
 Signed-off-by: Ingo Molnar [EMAIL PROTECTED]
 ---
  arch/cris/arch-v32/drivers/pci/bios.c |4 +++-
  arch/frv/mb93090-mb00/pci-vdk.c   |3 ++-
  arch/i386/pci/common.c|6 --
  arch/ia64/pci/pci.c   |8 ++--
  4 files changed, 15 insertions(+), 6 deletions(-)

 Index: linux/arch/cris/arch-v32/drivers/pci/bios.c
 ===
 --- linux.orig/arch/cris/arch-v32/drivers/pci/bios.c
 +++ linux/arch/cris/arch-v32/drivers/pci/bios.c
 @@ -100,7 +100,9 @@ int pcibios_enable_device(struct pci_dev
   if ((err = pcibios_enable_resources(dev, mask))  0)
   return err;
  
 - return pcibios_enable_irq(dev);
 + if (!dev-msi_enabled)
 + pcibios_enable_irq(dev);
 + return 0;
  }
  
  int pcibios_assign_resources(void)
 Index: linux/arch/frv/mb93090-mb00/pci-vdk.c
 ===
 --- linux.orig/arch/frv/mb93090-mb00/pci-vdk.c
 +++ linux/arch/frv/mb93090-mb00/pci-vdk.c
 @@ -466,6 +466,7 @@ int pcibios_enable_device(struct pci_dev
  
   if ((err = pcibios_enable_resources(dev, mask))  0)
   return err;
 - pcibios_enable_irq(dev);
 + if (!dev-msi_enabled)
 + pcibios_enable_irq(dev);
   return 0;
  }
 Index: linux/arch/i386/pci/common.c
 ===
 --- linux.orig/arch/i386/pci/common.c
 +++ linux/arch/i386/pci/common.c
 @@ -434,11 +434,13 @@ int pcibios_enable_device(struct pci_dev
   if ((err = pcibios_enable_resources(dev, mask))  0)
   return err;
  
 - return pcibios_enable_irq(dev);
 + if (!dev-msi_enabled)
 + return pcibios_enable_irq(dev);
 + return 0;
  }
  
  void pcibios_disable_device (struct pci_dev *dev)
  {
 - if (pcibios_disable_irq)
 + if (!dev-msi_enabled  pcibios_disable_irq)
   pcibios_disable_irq(dev);
  }
 Index: linux/arch/ia64/pci/pci.c
 ===
 --- linux.orig/arch/ia64/pci/pci.c
 +++ linux/arch/ia64/pci/pci.c
 @@ -557,14 +557,18 @@ pcibios_enable_device (struct pci_dev *d
   if (ret  0)
   return ret;
  
 - return acpi_pci_irq_enable(dev);
 + if (!dev-msi_enabled)
 + return acpi_pci_irq_enable(dev);
 + return 0;
  }
  
  void

Re: [patch] MSI-X: fix resume crash

2007-03-28 Thread Ingo Molnar

* Eric W. Biederman [EMAIL PROTECTED] wrote:

  Eric's patch seems to have done the trick on my T60: i've done 10 
  suspend+resumes and each worked fine. I've tidied up the description 
  part of Eric's patch a bit for upstream application - find it below.
 
 Thanks.  Tidying up the description has been on my todo list for the 
 last little bit but I just haven't gotten there.
 
 I've gotten at least Tony's sign off on the architectural direction so 
 there is nothing to prevent this patch from going in.  Unless Linus or 
 someone wants a more thorough patch this late in the release cycle.
 
 Signed-off-by: Eric W. Biederman [EMAIL PROTECTED]

i've updated the patch below with your sign-off, and trimmed the 
description to the relevant bits, for easy upstream application.

Ingo

-
Subject: [patch] MSI-X: fix resume crash
From: Eric W. Biederman [EMAIL PROTECTED]

So I think the right solution is to simply make pci_enable_device just 
flip enable bits and move the rest of the work someplace else.

However a thorough cleanup is a little extreme for this point in the 
release cycle, so I think a quick hack that makes the code not stomp the 
irq when msi irq's are enabled should be the first fix.  Then we can 
later make the code not change the irqs at all.

Signed-off-by: Eric W. Biederman [EMAIL PROTECTED]
Signed-off-by: Ingo Molnar [EMAIL PROTECTED]
---
 arch/cris/arch-v32/drivers/pci/bios.c |4 +++-
 arch/frv/mb93090-mb00/pci-vdk.c   |3 ++-
 arch/i386/pci/common.c|6 --
 arch/ia64/pci/pci.c   |8 ++--
 4 files changed, 15 insertions(+), 6 deletions(-)

Index: linux/arch/cris/arch-v32/drivers/pci/bios.c
===
--- linux.orig/arch/cris/arch-v32/drivers/pci/bios.c
+++ linux/arch/cris/arch-v32/drivers/pci/bios.c
@@ -100,7 +100,9 @@ int pcibios_enable_device(struct pci_dev
if ((err = pcibios_enable_resources(dev, mask))  0)
return err;
 
-   return pcibios_enable_irq(dev);
+   if (!dev-msi_enabled)
+   pcibios_enable_irq(dev);
+   return 0;
 }
 
 int pcibios_assign_resources(void)
Index: linux/arch/frv/mb93090-mb00/pci-vdk.c
===
--- linux.orig/arch/frv/mb93090-mb00/pci-vdk.c
+++ linux/arch/frv/mb93090-mb00/pci-vdk.c
@@ -466,6 +466,7 @@ int pcibios_enable_device(struct pci_dev
 
if ((err = pcibios_enable_resources(dev, mask))  0)
return err;
-   pcibios_enable_irq(dev);
+   if (!dev-msi_enabled)
+   pcibios_enable_irq(dev);
return 0;
 }
Index: linux/arch/i386/pci/common.c
===
--- linux.orig/arch/i386/pci/common.c
+++ linux/arch/i386/pci/common.c
@@ -434,11 +434,13 @@ int pcibios_enable_device(struct pci_dev
if ((err = pcibios_enable_resources(dev, mask))  0)
return err;
 
-   return pcibios_enable_irq(dev);
+   if (!dev-msi_enabled)
+   return pcibios_enable_irq(dev);
+   return 0;
 }
 
 void pcibios_disable_device (struct pci_dev *dev)
 {
-   if (pcibios_disable_irq)
+   if (!dev-msi_enabled  pcibios_disable_irq)
pcibios_disable_irq(dev);
 }
Index: linux/arch/ia64/pci/pci.c
===
--- linux.orig/arch/ia64/pci/pci.c
+++ linux/arch/ia64/pci/pci.c
@@ -557,14 +557,18 @@ pcibios_enable_device (struct pci_dev *d
if (ret  0)
return ret;
 
-   return acpi_pci_irq_enable(dev);
+   if (!dev-msi_enabled)
+   return acpi_pci_irq_enable(dev);
+   return 0;
 }
 
 void
 pcibios_disable_device (struct pci_dev *dev)
 {
BUG_ON(atomic_read(dev-enable_cnt));
-   acpi_pci_irq_disable(dev);
+   if (!dev-msi_enabled)
+   acpi_pci_irq_disable(dev);
+   return 0;
 }
 
 void
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] MSI-X: fix resume crash

2007-03-28 Thread Len Brown
 Tony, Len the way pci_disable_device is being used in a suspend/resume 
 path by a few drivers is completely incompatible with the way irqs are 
 allocated on ia64.  In particular people the following sequence occurs 
 in several drivers.
 
 probe:
   pci_enable_device(pdev);
   request_irq(pdev-irq);
 suspend:
   pci_disable_device(pdev);
 resume:
   pci_enable_device(pdev);
 remove:
   free_irq(pdev-irq);
   pci_disable_device(pdev);

There are no IA64 machines that support system suspend/resume today --
so you have 0 chance of breaking the IA64 suspend/resume installed base.

My understanding is that Luming Yu has cobbled IA64 S4 support
together for a future release though.

 What I'm proposing we do is move the irq allocation code out of 
 pci_enable_device and the irq freeing code out of pci_disable_device in 
 the future.  If we move ia64 to a model where the irq number equal the 
 gsi like we have for x86_64 and are in the middle of for i386 that 
 should be pretty straight forward. It would even be relatively simple  
 to delay vector allocation in that context until request_irq, if we 
 needed the delayed allocation benefit.  Do you two have any problems 
 with moving in that direction?

I think consistency here would be _wonderful_.
Of course the beauty of having identity GSI=IRQ and a /proc/interrupts
that tells you what IOAPIC pin you are using become moot with MSI --
but hey, showing the IRQ number rather than the vector number
is consistent and makes sense.

 If fixing the arch code is unacceptable for some reason I'm not aware of 
 we need to audit the 10-20 drivers that call pci_disable_device in their 
 suspend/resume processing and ensure that they have freed all of the 
 irqs before that point.  Given that I have bug reports on the msi path I 
 know that isn't true.

I think the suspend/resume interrupt logic needs some serious attention.
We've had several schemes for suspend/resume of interrupts, several
changes in strategy, and right now I think we are inconsistent,
and frankly, I'm amazed it works at all.

-Len


 From: Eric W. Biederman [EMAIL PROTECTED]
 Signed-off-by: Ingo Molnar [EMAIL PROTECTED]
 ---
  arch/cris/arch-v32/drivers/pci/bios.c |4 +++-
  arch/frv/mb93090-mb00/pci-vdk.c   |3 ++-
  arch/i386/pci/common.c|6 --
  arch/ia64/pci/pci.c   |8 ++--
  4 files changed, 15 insertions(+), 6 deletions(-)
 
 Index: linux/arch/cris/arch-v32/drivers/pci/bios.c
 ===
 --- linux.orig/arch/cris/arch-v32/drivers/pci/bios.c
 +++ linux/arch/cris/arch-v32/drivers/pci/bios.c
 @@ -100,7 +100,9 @@ int pcibios_enable_device(struct pci_dev
   if ((err = pcibios_enable_resources(dev, mask))  0)
   return err;
  
 - return pcibios_enable_irq(dev);
 + if (!dev-msi_enabled)
 + pcibios_enable_irq(dev);
 + return 0;
  }
  
  int pcibios_assign_resources(void)
 Index: linux/arch/frv/mb93090-mb00/pci-vdk.c
 ===
 --- linux.orig/arch/frv/mb93090-mb00/pci-vdk.c
 +++ linux/arch/frv/mb93090-mb00/pci-vdk.c
 @@ -466,6 +466,7 @@ int pcibios_enable_device(struct pci_dev
  
   if ((err = pcibios_enable_resources(dev, mask))  0)
   return err;
 - pcibios_enable_irq(dev);
 + if (!dev-msi_enabled)
 + pcibios_enable_irq(dev);
   return 0;
  }
 Index: linux/arch/i386/pci/common.c
 ===
 --- linux.orig/arch/i386/pci/common.c
 +++ linux/arch/i386/pci/common.c
 @@ -434,11 +434,13 @@ int pcibios_enable_device(struct pci_dev
   if ((err = pcibios_enable_resources(dev, mask))  0)
   return err;
  
 - return pcibios_enable_irq(dev);
 + if (!dev-msi_enabled)
 + return pcibios_enable_irq(dev);
 + return 0;
  }
  
  void pcibios_disable_device (struct pci_dev *dev)
  {
 - if (pcibios_disable_irq)
 + if (!dev-msi_enabled  pcibios_disable_irq)
   pcibios_disable_irq(dev);
  }
 Index: linux/arch/ia64/pci/pci.c
 ===
 --- linux.orig/arch/ia64/pci/pci.c
 +++ linux/arch/ia64/pci/pci.c
 @@ -557,14 +557,18 @@ pcibios_enable_device (struct pci_dev *d
   if (ret  0)
   return ret;
  
 - return acpi_pci_irq_enable(dev);
 + if (!dev-msi_enabled)
 + return acpi_pci_irq_enable(dev);
 + return 0;
  }
  
  void
  pcibios_disable_device (struct pci_dev *dev)
  {
   BUG_ON(atomic_read(dev-enable_cnt));
 - acpi_pci_irq_disable(dev);
 + if (!dev-msi_enabled)
 + acpi_pci_irq_disable(dev);
 + return 0;
  }
  
  void
 -
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to [EMAIL PROTECTED]
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  

Re: [patch] MSI-X: fix resume crash

2007-03-28 Thread Eric W. Biederman
Len Brown [EMAIL PROTECTED] writes:

 Tony, Len the way pci_disable_device is being used in a suspend/resume 
 path by a few drivers is completely incompatible with the way irqs are 
 allocated on ia64.  In particular people the following sequence occurs 
 in several drivers.
 
 probe:
   pci_enable_device(pdev);
   request_irq(pdev-irq);
 suspend:
   pci_disable_device(pdev);
 resume:
   pci_enable_device(pdev);
 remove:
   free_irq(pdev-irq);
   pci_disable_device(pdev);

 There are no IA64 machines that support system suspend/resume today --
 so you have 0 chance of breaking the IA64 suspend/resume installed base.

Ok.  So that is why the inconsistency persists...

 My understanding is that Luming Yu has cobbled IA64 S4 support
 together for a future release though.

 What I'm proposing we do is move the irq allocation code out of 
 pci_enable_device and the irq freeing code out of pci_disable_device in 
 the future.  If we move ia64 to a model where the irq number equal the 
 gsi like we have for x86_64 and are in the middle of for i386 that 
 should be pretty straight forward. It would even be relatively simple  
 to delay vector allocation in that context until request_irq, if we 
 needed the delayed allocation benefit.  Do you two have any problems 
 with moving in that direction?

 I think consistency here would be _wonderful_.
 Of course the beauty of having identity GSI=IRQ and a /proc/interrupts
 that tells you what IOAPIC pin you are using become moot with MSI --
 but hey, showing the IRQ number rather than the vector number
 is consistent and makes sense.

Yes.  It also allows for bigger machines.  And I can get a consistent
number out of MSI if we allocate irq numbers in a sufficiently non-sparse
way.  Something like bus|device|func|irq which is 8+5+3+12 or 28 bits...
I'll never get there though if i keep unearthing this long standing bugs.

 If fixing the arch code is unacceptable for some reason I'm not aware of 
 we need to audit the 10-20 drivers that call pci_disable_device in their 
 suspend/resume processing and ensure that they have freed all of the 
 irqs before that point.  Given that I have bug reports on the msi path I 
 know that isn't true.

 I think the suspend/resume interrupt logic needs some serious attention.
 We've had several schemes for suspend/resume of interrupts, several
 changes in strategy, and right now I think we are inconsistent,
 and frankly, I'm amazed it works at all.

What I have been doing lately is to aim at consistency in how a function
is called (and thus how it is expected to be used) and how it is actually
implemented.  When I have a choice I try to pick a forgiving implementation
so that driver writers don't have to follow a magic correct path for
things to work correctly.  

Removing the irq assignment from pci_enable_device is something that
matches implementation with use.

As for the rest it seems reasonable to me to allow an irq to be held
requested over suspend/resume and to save and restore apic and msi
capability state.  Especially since irq numbers are a kernel
abstraction we should be able to do with them what we need to.

Honestly the whole suspend/resume thing is beyond me at this point I'm
laptop free...  But I do know how to make code consistent with itself.

Eric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/