Re: [PATCH] PCI: Add parameter @mmio_force_on to pci_update_resource()

2016-09-29 Thread Gavin Shan
On Wed, Sep 28, 2016 at 11:14:08AM +1000, Gavin Shan wrote:
>On Wed, Sep 28, 2016 at 10:06:44AM +1000, Benjamin Herrenschmidt wrote:
>>On Wed, 2016-09-28 at 09:37 +1000, Gavin Shan wrote:
>>> 
>>> Yeah, it's safe to update it with memory decoding on. As shown by the function
>>> call flow I listed in the changelog (as below), nobody should access the IOV BAR
>>> when pci_update_resource() is called. However, the PF's memory BARs might
>>> be accessed at that time and it's not safe to disable the PF's memory decoding.
>>
>>The problem isn't so much whether anybody accesses the IOV BAR while
>>it's updated but whether the IOV BAR will decode at all.
>>
>>I.e., the BAR is updated in two steps, 32 bits each. That means that there
>>is a window where it contains a "bogus" value.
>>
>>If that bogus value conflicts with another BAR (another BAR of the PF
>>or another PF of the same device for example) then there is a risk of
>>something bad happening if the driver accesses that conflicting
>>resource during that window.
>>
>>On the other hand, if the IOV BAR doesn't decode at all while the
>>update is done, which I think is the case as I believe SR-IOV isn't
>>enabled during the update (please verify), then we are safe.
>>
>
>I assumed that SRIOV and its memory space aren't enabled when the IOV BARs
>are updated, but unfortunately they are already enabled at that point. I think
>the pcibios_sriov_enable() call should be moved before SRIOV is enabled. Note
>that pcibios_sriov_enable() is used by PowerNV only.
>
>static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
>{
>	:
>	pci_iov_set_numvfs(dev, nr_virtfn);
>	iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
>	pci_cfg_access_lock(dev);
>	/* SRIOV and its memory space are enabled here */
>	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
>	msleep(100);
>	pci_cfg_access_unlock(dev);
>
>	iov->initial_VFs = initial;
>	if (nr_virtfn < initial)
>		initial = nr_virtfn;
>
>	/* IOV BARs are updated inside this call */
>	rc = pcibios_sriov_enable(dev, initial);
>
>	:
>}
>

I will add a patch in v2 to call pcibios_sriov_enable() before the IOV BARs
are enabled. v2 will be posted shortly.
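
To make the ordering issue concrete, here is a minimal sketch in plain C. It
does not use any kernel API: struct mock_pf and the mock_* helpers are
hypothetical stand-ins for the PF's SR-IOV control state, the PCI_SRIOV_CTRL
write and the platform hook. The model only records whether an IOV BAR was
rewritten while VF memory decoding was already on.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the PF's SR-IOV state (not struct pci_sriov). */
struct mock_pf {
	bool vf_mse;			/* models PCI_SRIOV_CTRL_MSE being set */
	bool bar_moved_while_decoding;	/* an IOV BAR was rewritten with vf_mse on */
};

/* Models the platform hook that shifts the IOV BARs. */
static void mock_platform_shift_bars(struct mock_pf *pf)
{
	assert(pf);
	if (pf->vf_mse)
		pf->bar_moved_while_decoding = true;
}

/* Models writing VFE | MSE to the SR-IOV control register. */
static void mock_enable_vfs(struct mock_pf *pf)
{
	assert(pf);
	pf->vf_mse = true;
}

/* Order described above: enable SRIOV first, then run the platform hook. */
static bool current_order_is_racy(void)
{
	struct mock_pf pf = { false, false };

	mock_enable_vfs(&pf);
	mock_platform_shift_bars(&pf);
	return pf.bar_moved_while_decoding;
}

/* Proposed order: run the platform hook first, then enable SRIOV. */
static bool proposed_order_is_racy(void)
{
	struct mock_pf pf = { false, false };

	mock_platform_shift_bars(&pf);
	mock_enable_vfs(&pf);
	return pf.bar_moved_while_decoding;
}
```

In this toy model only the first ordering flags a BAR rewrite under active
decoding, which is the situation the reordering is meant to avoid.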

Thanks,
Gavin



Re: [PATCH] PCI: Add parameter @mmio_force_on to pci_update_resource()

2016-09-27 Thread Gavin Shan
On Wed, Sep 28, 2016 at 10:06:44AM +1000, Benjamin Herrenschmidt wrote:
>On Wed, 2016-09-28 at 09:37 +1000, Gavin Shan wrote:
>> 
>> Yeah, it's safe to update it with memory decoding on. As shown by the function
>> call flow I listed in the changelog (as below), nobody should access the IOV BAR
>> when pci_update_resource() is called. However, the PF's memory BARs might
>> be accessed at that time and it's not safe to disable the PF's memory decoding.
>
>The problem isn't so much whether anybody accesses the IOV BAR while
>it's updated but whether the IOV BAR will decode at all.
>
>I.e., the BAR is updated in two steps, 32 bits each. That means that there
>is a window where it contains a "bogus" value.
>
>If that bogus value conflicts with another BAR (another BAR of the PF
>or another PF of the same device for example) then there is a risk of
>something bad happening if the driver accesses that conflicting
>resource during that window.
>
>On the other hand, if the IOV BAR doesn't decode at all while the
>update is done, which I think is the case as I believe SR-IOV isn't
>enabled during the update (please verify), then we are safe.
>

I assumed that SRIOV and its memory space aren't enabled when the IOV BARs
are updated, but unfortunately they are already enabled at that point. I think
the pcibios_sriov_enable() call should be moved before SRIOV is enabled. Note
that pcibios_sriov_enable() is used by PowerNV only.

static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
{
	:
	pci_iov_set_numvfs(dev, nr_virtfn);
	iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
	pci_cfg_access_lock(dev);
	/* SRIOV and its memory space are enabled here */
	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
	msleep(100);
	pci_cfg_access_unlock(dev);

	iov->initial_VFs = initial;
	if (nr_virtfn < initial)
		initial = nr_virtfn;

	/* IOV BARs are updated inside this call */
	rc = pcibios_sriov_enable(dev, initial);

	:
}

Thanks,
Gavin

>Cheers,
>Ben.
>



Re: [PATCH] PCI: Add parameter @mmio_force_on to pci_update_resource()

2016-09-27 Thread Benjamin Herrenschmidt
On Wed, 2016-09-28 at 09:37 +1000, Gavin Shan wrote:
> 
> Yeah, it's safe to update it with memory decoding on. As shown by the function
> call flow I listed in the changelog (as below), nobody should access the IOV BAR
> when pci_update_resource() is called. However, the PF's memory BARs might
> be accessed at that time and it's not safe to disable the PF's memory decoding.

The problem isn't so much whether anybody accesses the IOV BAR while
it's updated but whether the IOV BAR will decode at all.

I.e., the BAR is updated in two steps, 32 bits each. That means that there
is a window where it contains a "bogus" value.

If that bogus value conflicts with another BAR (another BAR of the PF
or another PF of the same device for example) then there is a risk of
something bad happening if the driver accesses that conflicting
resource during that window.

On the other hand, if the IOV BAR doesn't decode at all while the
update is done, which I think is the case as I believe SR-IOV isn't
enabled during the update (please verify), then we are safe.
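
As a rough illustration of that window, the sketch below models a 64-bit
memory BAR as two 32-bit registers and computes the address the BAR would
hold between the two writes. It is a plain C model, not kernel code:
struct bar64 and the helper names are made up for the example, and it
assumes the low dword is written before the high dword.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a 64-bit memory BAR as two 32-bit config registers. */
struct bar64 {
	uint32_t lo;	/* address bits 31:4 plus the low 4 flag bits */
	uint32_t hi;	/* address bits 63:32 */
};

/* Base address the BAR currently decodes (flag bits masked off). */
static uint64_t bar64_addr(const struct bar64 *bar)
{
	return ((uint64_t)bar->hi << 32) | (bar->lo & ~0xfu);
}

/* True if [a, a + size) and [b, b + size) overlap. */
static bool ranges_overlap(uint64_t a, uint64_t b, uint64_t size)
{
	assert(size > 0);
	return a < b + size && b < a + size;
}

/*
 * Address held after the first 32-bit write only: new low half, old high
 * half. Assumes the low dword is written first.
 */
static uint64_t bar64_transient_addr(const struct bar64 *old_bar,
				     uint64_t new_addr)
{
	struct bar64 tmp = *old_bar;

	tmp.lo = ((uint32_t)new_addr & ~0xfu) | (old_bar->lo & 0xfu);
	return bar64_addr(&tmp);
}
```

For example, moving a BAR from 0x100000000 to 0x280000000 passes through
0x180000000 in this model, so a neighbour sitting at 0x180000000 would
conflict only during the window, not before or after the update.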

Cheers,
Ben.



Re: [PATCH] PCI: Add parameter @mmio_force_on to pci_update_resource()

2016-09-27 Thread Gavin Shan
On Wed, Sep 28, 2016 at 07:45:32AM +1000, Benjamin Herrenschmidt wrote:
>On Tue, 2016-09-27 at 14:20 -0500, Bjorn Helgaas wrote:
>> On Mon, Sep 19, 2016 at 09:53:30AM +1000, Gavin Shan wrote:
>> > In pci_update_resource(), the PCI device's memory decoding (0x2 in
>> > PCI_COMMAND) is disabled while a 64-bit memory BAR is updated, unless
>> > the device's memory space was asked to be always on through
>> > @pdev->mmio_always_on. As a result, the PF's memory decoding might be
>> > disabled when its IOV BARs are updated in the following path. The PF's
>> > memory decoding shouldn't be disabled in this scenario, as the PF has
>> > already started to provide services:
>> 
>> The reason we disable memory decoding while updating a 64-bit BAR is
>> because we can't do the update atomically, and a half-updated BAR might
>> conflict with other devices.
>> 
>> You need to explain what is special about these SR-IOV BARs that makes it
>> safe to update them non-atomically while decoding is enabled.
>
>The IOV BAR won't decode until SR-IOV is enabled, right? Gavin, I don't
>think we update it "live", so it should be safe...
>

Yeah, it's safe to update it with memory decoding on. As shown by the function
call flow I listed in the changelog (as below), nobody should access the IOV BAR
when pci_update_resource() is called. However, the PF's memory BARs might
be accessed at that time and it's not safe to disable the PF's memory decoding.

   sriov_numvfs_store
   pdev->driver->sriov_configure
   mlx5_core_sriov_configure
   pci_enable_sriov
   sriov_enable
   pcibios_sriov_enable
   pnv_pci_sriov_enable
   pnv_pci_vf_resource_shift
   pci_update_resource
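
For reference, the decision the patch changes can be written as a small
standalone function. This is only a sketch of the condition from the diff,
outside the kernel: SKETCH_MEM_64 stands in for IORESOURCE_MEM_64 (the value
here is only illustrative) and should_disable_decoding() is a hypothetical
name, not an existing helper.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel's IORESOURCE_MEM_64 flag; value is illustrative. */
#define SKETCH_MEM_64 0x00100000UL

/*
 * Mirrors the patched condition: decoding is turned off only for a 64-bit
 * memory BAR, and only if neither the device (mmio_always_on) nor the
 * caller (mmio_force_on) asked for memory space to stay on.
 */
static bool should_disable_decoding(unsigned long flags, bool mmio_always_on,
				    bool mmio_force_on)
{
	return (flags & SKETCH_MEM_64) && !mmio_force_on && !mmio_always_on;
}

/* Self-check covering the interesting combinations. */
static void sketch_self_check(void)
{
	assert(should_disable_decoding(SKETCH_MEM_64, false, false));
	assert(!should_disable_decoding(SKETCH_MEM_64, false, true));
	assert(!should_disable_decoding(SKETCH_MEM_64, true, false));
	assert(!should_disable_decoding(0UL, false, false));
}
```

With mmio_force_on set, as in the pnv_pci_vf_resource_shift() call, the PF's
memory decoding is left alone even for a 64-bit IOV BAR.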

Thanks,
Gavin



Re: [PATCH] PCI: Add parameter @mmio_force_on to pci_update_resource()

2016-09-27 Thread Benjamin Herrenschmidt
On Tue, 2016-09-27 at 14:20 -0500, Bjorn Helgaas wrote:
> On Mon, Sep 19, 2016 at 09:53:30AM +1000, Gavin Shan wrote:
> > In pci_update_resource(), the PCI device's memory decoding (0x2 in
> > PCI_COMMAND) is disabled while a 64-bit memory BAR is updated, unless
> > the device's memory space was asked to be always on through
> > @pdev->mmio_always_on. As a result, the PF's memory decoding might be
> > disabled when its IOV BARs are updated in the following path. The PF's
> > memory decoding shouldn't be disabled in this scenario, as the PF has
> > already started to provide services:
> 
> The reason we disable memory decoding while updating a 64-bit BAR is
> because we can't do the update atomically, and a half-updated BAR might
> conflict with other devices.
> 
> You need to explain what is special about these SR-IOV BARs that makes it
> safe to update them non-atomically while decoding is enabled.

The IOV BAR won't decode until SR-IOV is enabled, right? Gavin, I don't
think we update it "live", so it should be safe...

Cheers,
Ben.



Re: [PATCH] PCI: Add parameter @mmio_force_on to pci_update_resource()

2016-09-27 Thread Bjorn Helgaas
Hi Gavin,

On Mon, Sep 19, 2016 at 09:53:30AM +1000, Gavin Shan wrote:
> In pci_update_resource(), the PCI device's memory decoding (0x2 in
> PCI_COMMAND) is disabled while a 64-bit memory BAR is updated, unless
> the device's memory space was asked to be always on through
> @pdev->mmio_always_on. As a result, the PF's memory decoding might be
> disabled when its IOV BARs are updated in the following path. The PF's
> memory decoding shouldn't be disabled in this scenario, as the PF has
> already started to provide services:

The reason we disable memory decoding while updating a 64-bit BAR is
because we can't do the update atomically, and a half-updated BAR might
conflict with other devices.

You need to explain what is special about these SR-IOV BARs that makes it
safe to update them non-atomically while decoding is enabled.

>sriov_numvfs_store
>pdev->driver->sriov_configure
>mlx5_core_sriov_configure
>pci_enable_sriov
>sriov_enable
>pcibios_sriov_enable
>pnv_pci_sriov_enable
>pnv_pci_vf_resource_shift
>pci_update_resource
> 
> This keeps the PF's memory decoding unchanged in this path by introducing
> an additional parameter (@mmio_force_on) to pci_update_resource().
> 
> Reported-by: Carol Soto 
> Signed-off-by: Gavin Shan 
> Tested-by: Carol Soto 
> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 2 +-
>  drivers/pci/iov.c                         | 2 +-
>  drivers/pci/pci.c                         | 2 +-
>  drivers/pci/setup-res.c                   | 9 +++++----
>  include/linux/pci.h                       | 2 +-
>  5 files changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index bc0c91e..2d6a2b7 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -999,7 +999,7 @@ static int pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset)
>   dev_info(&dev->dev, "VF BAR%d: %pR shifted to %pR (%sabling %d VFs shifted by %d)\n",
>    i, &res2, res, (offset > 0) ? "En" : "Dis",
>    num_vfs, offset);
> - pci_update_resource(dev, i + PCI_IOV_RESOURCES);
> + pci_update_resource(dev, i + PCI_IOV_RESOURCES, true);
>   }
>   return 0;
>  }
> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
> index 2194b44..117aae6 100644
> --- a/drivers/pci/iov.c
> +++ b/drivers/pci/iov.c
> @@ -511,7 +511,7 @@ static void sriov_restore_state(struct pci_dev *dev)
>   return;
>  
>   for (i = PCI_IOV_RESOURCES; i <= PCI_IOV_RESOURCE_END; i++)
> - pci_update_resource(dev, i);
> + pci_update_resource(dev, i, false);
>  
>   pci_write_config_dword(dev, iov->pos + PCI_SRIOV_SYS_PGSIZE, iov->pgsz);
>   pci_iov_set_numvfs(dev, iov->num_VFs);
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index aab9d51..87a33c0 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -545,7 +545,7 @@ static void pci_restore_bars(struct pci_dev *dev)
>   return;
>  
>   for (i = 0; i < PCI_BRIDGE_RESOURCES; i++)
> - pci_update_resource(dev, i);
> + pci_update_resource(dev, i, false);
>  }
>  
>  static const struct pci_platform_pm_ops *pci_platform_pm;
> diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
> index 66c4d8f..e8a50ff 100644
> --- a/drivers/pci/setup-res.c
> +++ b/drivers/pci/setup-res.c
> @@ -26,7 +26,7 @@
>  #include "pci.h"
>  
>  
> -void pci_update_resource(struct pci_dev *dev, int resno)
> +void pci_update_resource(struct pci_dev *dev, int resno, bool mmio_force_on)
>  {
>   struct pci_bus_region region;
>   bool disable;
> @@ -81,7 +81,8 @@ void pci_update_resource(struct pci_dev *dev, int resno)
>* disable decoding so that a half-updated BAR won't conflict
>* with another device.
>*/
> - disable = (res->flags & IORESOURCE_MEM_64) && !dev->mmio_always_on;
> + disable = (res->flags & IORESOURCE_MEM_64) &&
> +   !mmio_force_on && !dev->mmio_always_on;
>   if (disable) {
>   pci_read_config_word(dev, PCI_COMMAND, &cmd);
>   pci_write_config_word(dev, PCI_COMMAND,
> @@ -310,7 +311,7 @@ int pci_assign_resource(struct pci_dev *dev, int resno)
>   res->flags &= ~IORESOURCE_STARTALIGN;
>   dev_info(&dev->dev, "BAR %d: assigned %pR\n", resno, res);
>   if (resno < PCI_BRIDGE_RESOURCES)
> - pci_update_resource(dev, resno);
> + pci_update_resource(dev, resno, false);
>  
>   return 0;
>  }
> @@ -350,7 +351,7 @@ int pci_reassign_resource(struct pci_dev *dev, int resno, resource_size_t addsiz
>   dev_info(&dev->dev, "BAR %d: reassigned %pR (expanded by %#llx)\n",
>resno, res, (unsigned long long) addsize);
>   if (resno < PCI_BRIDGE_RESOURCES)
> - pci_update_resource(dev, 

Re: [PATCH] PCI: Add parameter @mmio_force_on to pci_update_resource()

2016-09-26 Thread Gavin Shan
On Mon, Sep 19, 2016 at 09:53:30AM +1000, Gavin Shan wrote:
>In pci_update_resource(), the PCI device's memory decoding (0x2 in
>PCI_COMMAND) is disabled while a 64-bit memory BAR is updated, unless
>the device's memory space was asked to be always on through
>@pdev->mmio_always_on. As a result, the PF's memory decoding might be
>disabled when its IOV BARs are updated in the following path. The PF's
>memory decoding shouldn't be disabled in this scenario, as the PF has
>already started to provide services:
>
>   sriov_numvfs_store
>   pdev->driver->sriov_configure
>   mlx5_core_sriov_configure
>   pci_enable_sriov
>   sriov_enable
>   pcibios_sriov_enable
>   pnv_pci_sriov_enable
>   pnv_pci_vf_resource_shift
>   pci_update_resource
>
>This keeps the PF's memory decoding unchanged in this path by introducing
>an additional parameter (@mmio_force_on) to pci_update_resource().
>
>Reported-by: Carol Soto 
>Signed-off-by: Gavin Shan 
>Tested-by: Carol Soto 
>---

Bjorn, could you please take a quick look at this when you have time?
We're running into an SRIOV issue that is fixed by this patch.

Thanks,
Gavin 

> arch/powerpc/platforms/powernv/pci-ioda.c | 2 +-
> drivers/pci/iov.c                         | 2 +-
> drivers/pci/pci.c                         | 2 +-
> drivers/pci/setup-res.c                   | 9 +++++----
> include/linux/pci.h                       | 2 +-
> 5 files changed, 9 insertions(+), 8 deletions(-)
>
>diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>index bc0c91e..2d6a2b7 100644
>--- a/arch/powerpc/platforms/powernv/pci-ioda.c
>+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>@@ -999,7 +999,7 @@ static int pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset)
>   dev_info(&dev->dev, "VF BAR%d: %pR shifted to %pR (%sabling %d VFs shifted by %d)\n",
>    i, &res2, res, (offset > 0) ? "En" : "Dis",
>    num_vfs, offset);
>-  pci_update_resource(dev, i + PCI_IOV_RESOURCES);
>+  pci_update_resource(dev, i + PCI_IOV_RESOURCES, true);
>   }
>   return 0;
> }
>diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>index 2194b44..117aae6 100644
>--- a/drivers/pci/iov.c
>+++ b/drivers/pci/iov.c
>@@ -511,7 +511,7 @@ static void sriov_restore_state(struct pci_dev *dev)
>   return;
>
>   for (i = PCI_IOV_RESOURCES; i <= PCI_IOV_RESOURCE_END; i++)
>-  pci_update_resource(dev, i);
>+  pci_update_resource(dev, i, false);
>
>   pci_write_config_dword(dev, iov->pos + PCI_SRIOV_SYS_PGSIZE, iov->pgsz);
>   pci_iov_set_numvfs(dev, iov->num_VFs);
>diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>index aab9d51..87a33c0 100644
>--- a/drivers/pci/pci.c
>+++ b/drivers/pci/pci.c
>@@ -545,7 +545,7 @@ static void pci_restore_bars(struct pci_dev *dev)
>   return;
>
>   for (i = 0; i < PCI_BRIDGE_RESOURCES; i++)
>-  pci_update_resource(dev, i);
>+  pci_update_resource(dev, i, false);
> }
>
> static const struct pci_platform_pm_ops *pci_platform_pm;
>diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
>index 66c4d8f..e8a50ff 100644
>--- a/drivers/pci/setup-res.c
>+++ b/drivers/pci/setup-res.c
>@@ -26,7 +26,7 @@
> #include "pci.h"
>
>
>-void pci_update_resource(struct pci_dev *dev, int resno)
>+void pci_update_resource(struct pci_dev *dev, int resno, bool mmio_force_on)
> {
>   struct pci_bus_region region;
>   bool disable;
>@@ -81,7 +81,8 @@ void pci_update_resource(struct pci_dev *dev, int resno)
>* disable decoding so that a half-updated BAR won't conflict
>* with another device.
>*/
>-  disable = (res->flags & IORESOURCE_MEM_64) && !dev->mmio_always_on;
>+  disable = (res->flags & IORESOURCE_MEM_64) &&
>+!mmio_force_on && !dev->mmio_always_on;
>   if (disable) {
>   pci_read_config_word(dev, PCI_COMMAND, &cmd);
>   pci_write_config_word(dev, PCI_COMMAND,
>@@ -310,7 +311,7 @@ int pci_assign_resource(struct pci_dev *dev, int resno)
>   res->flags &= ~IORESOURCE_STARTALIGN;
>   dev_info(&dev->dev, "BAR %d: assigned %pR\n", resno, res);
>   if (resno < PCI_BRIDGE_RESOURCES)
>-  pci_update_resource(dev, resno);
>+  pci_update_resource(dev, resno, false);
>
>   return 0;
> }
>@@ -350,7 +351,7 @@ int pci_reassign_resource(struct pci_dev *dev, int resno, resource_size_t addsiz
>   dev_info(&dev->dev, "BAR %d: reassigned %pR (expanded by %#llx)\n",
>resno, res, (unsigned long long) addsize);
>   if (resno < PCI_BRIDGE_RESOURCES)
>-  pci_update_resource(dev, resno);
>+  pci_update_resource(dev, resno, false);
>
>   return 0;
> }
>diff --git a/include/linux/pci.h b/include/linux/pci.h
>index 0ab8359..99231d1 100644
>--- a/include/linux/pci.h
>+++ b/include/linux/pci.h
>@@ -1039,7