Re: [PATCH v10 3/5] PCI: Disable Relaxed Ordering Attributes for AMD A1100

2017-08-14 Thread Raj, Ashok
On Mon, Aug 14, 2017 at 11:44:57PM +0800, Ding Tianhong wrote:
> Casey reported that the AMD ARM A1100 SoC has a bug in its PCIe
> Root Port where Upstream Transaction Layer Packets with the Relaxed
> Ordering Attribute clear are allowed to bypass earlier TLPs with
> Relaxed Ordering set, it would cause Data Corruption, so we need
> to disable Relaxed Ordering Attribute when Upstream TLPs to the
> Root Port.
> 
> Signed-off-by: Casey Leedom <lee...@chelsio.com>
> Signed-off-by: Ding Tianhong <dingtianh...@huawei.com>
> Acked-by: Alexander Duyck <alexander.h.du...@intel.com>
> Acked-by: Ashok Raj <ashok@intel.com>

I can't ack this patch :-).. must be someone from AMD. Please remove my
signature from this.

> ---
>  drivers/pci/quirks.c | 16 
>  1 file changed, 16 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 1272f7e..1407604 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -4089,6 +4089,22 @@ static void quirk_relaxedordering_disable(struct 
> pci_dev *dev)
> quirk_relaxedordering_disable);
>  
>  /*
> + * The AMD ARM A1100 (AKA "SEATTLE") SoC has a bug in its PCIe Root Complex
> + * where Upstream Transaction Layer Packets with the Relaxed Ordering
> + * Attribute clear are allowed to bypass earlier TLPs with Relaxed Ordering
> + * set.  This is a violation of the PCIe 3.0 Transaction Ordering Rules
> + * outlined in Section 2.4.1 (PCI Express(r) Base Specification Revision 3.0
> + * November 10, 2010).  As a result, on this platform we can't use Relaxed
> + * Ordering for Upstream TLPs.
> + */
> +DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_AMD, 0x1a00, 
> PCI_CLASS_NOT_DEFINED, 8,
> +   quirk_relaxedordering_disable);
> +DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_AMD, 0x1a01, 
> PCI_CLASS_NOT_DEFINED, 8,
> +   quirk_relaxedordering_disable);
> +DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_AMD, 0x1a02, 
> PCI_CLASS_NOT_DEFINED, 8,
> +   quirk_relaxedordering_disable);
> +
> +/*
>   * Per PCIe r3.0, sec 2.2.9, "Completion headers must supply the same
>   * values for the Attribute as were supplied in the header of the
>   * corresponding Request, except as explicitly allowed when IDO is used."
> -- 
> 1.8.3.1
> 
> 


Re: [PATCH v9 1/4] PCI: Add new PCIe Fabric End Node flag, PCI_DEV_FLAGS_NO_RELAXED_ORDERING

2017-08-09 Thread Raj, Ashok
On Wed, Aug 09, 2017 at 04:46:07PM +, Casey Leedom wrote:
> | From: Raj, Ashok <ashok@intel.com>
> | Sent: Wednesday, August 9, 2017 8:58 AM
> | ...
> | As Casey pointed out in an earlier thread, we choose the heavy hammer
> | approach because there are some that can lead to data-corruption as opposed
> | to perf degradation.
> 
> Careful.  As far as I'm aware, there is no Data Corruption problem
> whatsoever with Intel Root Ports and processing of Transaction Layer Packets
> with and without the Relaxed Ordering Attribute set.

That's right.. no data-corruption on Intel parts :-).. It was with
other vendor. Only performance issue with intel root-ports in the parts
identified by the optimization guide. 

Cheers,
AShok



Re: [PATCH v9 1/4] PCI: Add new PCIe Fabric End Node flag, PCI_DEV_FLAGS_NO_RELAXED_ORDERING

2017-08-09 Thread Raj, Ashok
Hi Bjorn

On Tue, Aug 08, 2017 at 06:22:00PM -0500, Bjorn Helgaas wrote:
> On Sat, Aug 05, 2017 at 03:15:10PM +0800, Ding Tianhong wrote:
> > From: Casey Leedom 
> > 
> > Root complexes don't obey PCIe 3.0 ordering rules, hence could lead to
> > data-corruption.
> 
> This needs to include a link to the Intel spec
> (https://software.intel.com/sites/default/files/managed/9e/bc/64-ia-32-architectures-optimization-manual.pdf,
> sec 3.9.1).
> 
> It should also include a pointer to the AMD erratum, if available, or
> at least some reference to how we know it doesn't obey the rules.
> 
> Ashok, thanks for chiming in.  Now that you have, I have a few more
> questions for you:
> 
>   - Is the above doc the one you mentioned as being now public?

Yes. 
>   
>   - Is this considered a hardware erratum?

I would think so. I have tried to pursue the publication in that direction
but it morphed into the optimization guide instead. Once it got into some
open doc i stopped pushing.. but will continue to get this into erratum. i do
agree that's the right place holder for this.

>   
>   - If so, is there a pointer to that as well?
>   
>   - If this is not considered an erratum, can you provide any guidance
> about how an OS should determine when it should use RO?

The optimization guide states that it only applies to transactions targetting
system memory. For peer-2-peer RO is allowed and has perf upside.

As Casey pointed out in an earlier thread, we choose the heavy hammer approach
because there are some that can lead to data-corruption as opposed to perf
degradation. 

This looks ugly, but maybe we can have 2 flags. one that indicates its a strict
no-no, and one that says no to system memory only. That way driver can
determine when the device would turn the hint on in the TLP.

> 
> Relying on a list of device IDs in an optimization manual is OK for an
> erratum, but if it's *not* an erratum, it seems like a hole in the

Good point.. for this specific case its really an erratum, but for some
reason they made the decision to use this doc vs. the generic errata
data-sheet that would have been the preferred way to document.

> specs because as far as I know there's no generic way for the OS to
> discover whether to use RO.
> 

Cheers,
Ashok


Re: [PATCH v7 2/3] PCI: Enable PCIe Relaxed Ordering if supported

2017-08-04 Thread Raj, Ashok
On Fri, Aug 04, 2017 at 08:20:37PM +, Casey Leedom wrote:
> | From: Raj, Ashok <ashok@intel.com>
> | Sent: Thursday, August 3, 2017 1:31 AM
> |
> | I don't understand this completely.. So your driver would know not to send
> | RO TLP's to root complex. But you want to send RO to the NVMe device? This
> | is the peer-2-peer case correct?
> 
> Yes, this is the "heavy hammer" issue which you alluded to later.  There are
> applications where a device will want to send TLPs to a Root Complex without
> Relaxed Ordering set, but will want to use it when sending TLPs to a Peer
> device (say, an NVMe storage device).  The current approach doesn't make
> that easy ... and in fact, I still don't kow how to code a solution for this
> with the proposed APIs.  This means that we may be trading off one
> performance problem for another and that Relaxed Ordering may be doomed for
> use under Linux for the foreseeable future.
> 
> As I've noted a number of times, it would be great if the Intel Hardware
> Engineers who attempted to implement the Relaxed Ordering semantics in the
> current generation of Root Complexes had left the ability to turn off the
> logic which is obviously not working.  If there was a way to disable the
> logic via an undocumented register, then we could have the Linux PCI Quirk
> do that.  Since Relaxed Ordering is just a hint, it's completely legitimate
> to completely ignore it.

Suppose you are looking for the existence of a chicken bit to instruct the
port to ignore RO traffic. So all we would do is turn the chicken bit on
but would permit p2p traffic to be allowed since we won't turn off the
capability as currently proposed.

Let me look into that keep you posted.

Cheers,
Ashok
> 
> Casey


Re: [PATCH v7 2/3] PCI: Enable PCIe Relaxed Ordering if supported

2017-08-03 Thread Raj, Ashok
Hi Ding

patch looks good, except would reword the patch description for clarity

here is my crack at it, feel free to use.

On Thu, Jul 13, 2017 at 10:21:31PM +0800, Ding Tianhong wrote:
> The PCIe Device Control Register use the bit 4 to indicate that
> whether the device is permitted to enable relaxed ordering or not.
> But relaxed ordering is not safe for some platform which could only
> use strong write ordering, so devices are allowed (but not required)
> to enable relaxed ordering bit by default.
> 
> If a PCIe device didn't enable the relaxed ordering attribute default,
> we should not do anything in the PCIe configuration, otherwise we
> should check if any of the devices above us do not support relaxed
> ordering by the PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag, then base on
> the result if we get a return that indicate that the relaxed ordering
> is not supported we should update our device to disable relaxed ordering
> in configuration space. If the device above us doesn't exist or isn't
> the PCIe device, we shouldn't do anything and skip updating relaxed ordering
> because we are probably running in a guest machine.

When bit4 is set in the PCIe Device Control register, it indicates
whether the device is permitted to use relaxed ordering.
On some platforms using relaxed ordering can have performance issues or
due to erratum can cause data-corruption. In such cases devices must avoid
using relaxed ordering.

This patch checks if there is any node in the hierarchy that indicates that
using relaxed ordering is not safe. In such cases the patch turns off the
relaxed ordering by clearing the eapability for this device.

> 
> Signed-off-by: Ding Tianhong 
> ---
>  drivers/pci/pci.c   | 29 +
>  drivers/pci/probe.c | 37 +
>  include/linux/pci.h |  2 ++
>  3 files changed, 68 insertions(+)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index d88edf5..7a6b32f 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4854,6 +4854,35 @@ int pcie_set_mps(struct pci_dev *dev, int mps)
>  EXPORT_SYMBOL(pcie_set_mps);
>  
>  /**
> + * pcie_clear_relaxed_ordering - clear PCI Express relaxed ordering bit
> + * @dev: PCI device to query
> + *
> + * If possible clear relaxed ordering
> + */
> +int pcie_clear_relaxed_ordering(struct pci_dev *dev)
> +{
> + return pcie_capability_clear_word(dev, PCI_EXP_DEVCTL,
> +   PCI_EXP_DEVCTL_RELAX_EN);
> +}
> +EXPORT_SYMBOL(pcie_clear_relaxed_ordering);
> +
> +/**
> + * pcie_relaxed_ordering_supported - Probe for PCIe relexed ordering support
> + * @dev: PCI device to query
> + *
> + * Returns true if the device support relaxed ordering attribute.
> + */
> +bool pcie_relaxed_ordering_supported(struct pci_dev *dev)
> +{
> + u16 v;
> +
> + pcie_capability_read_word(dev, PCI_EXP_DEVCTL, );
> +
> + return !!(v & PCI_EXP_DEVCTL_RELAX_EN);
> +}
> +EXPORT_SYMBOL(pcie_relaxed_ordering_supported);
> +
> +/**
>   * pcie_get_minimum_link - determine minimum link settings of a PCI device
>   * @dev: PCI device to query
>   * @speed: storage for minimum speed
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index c31310d..48df012 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1762,6 +1762,42 @@ static void pci_configure_extended_tags(struct pci_dev 
> *dev)
>PCI_EXP_DEVCTL_EXT_TAG);
>  }
>  
> +/**
> + * pci_dev_should_disable_relaxed_ordering - check if the PCI device
> + * should disable the relaxed ordering attribute.
> + * @dev: PCI device
> + *
> + * Return true if any of the PCI devices above us do not support
> + * relaxed ordering.
> + */
> +static bool pci_dev_should_disable_relaxed_ordering(struct pci_dev *dev)
> +{
> + while (dev) {
> + if (dev->dev_flags & PCI_DEV_FLAGS_NO_RELAXED_ORDERING)
> + return true;
> +
> + dev = dev->bus->self;
> + }
> +
> + return false;
> +}
> +
> +static void pci_configure_relaxed_ordering(struct pci_dev *dev)
> +{
> + /* We should not alter the relaxed ordering bit for the VF */
> + if (dev->is_virtfn)
> + return;
> +
> + /* If the releaxed ordering enable bit is not set, do nothing. */
> + if (!pcie_relaxed_ordering_supported(dev))
> + return;
> +
> + if (pci_dev_should_disable_relaxed_ordering(dev)) {
> + pcie_clear_relaxed_ordering(dev);
> + dev_info(>dev, "Disable Relaxed Ordering\n");
> + }
> +}
> +
>  static void pci_configure_device(struct pci_dev *dev)
>  {
>   struct hotplug_params hpp;
> @@ -1769,6 +1805,7 @@ static void pci_configure_device(struct pci_dev *dev)
>  
>   pci_configure_mps(dev);
>   pci_configure_extended_tags(dev);
> + pci_configure_relaxed_ordering(dev);
>  
>   memset(, 0, sizeof(hpp));
>   ret = pci_get_hp_params(dev, );
> diff --git 

Re: [PATCH v7 1/3] PCI: Add new PCIe Fabric End Node flag, PCI_DEV_FLAGS_NO_RELAXED_ORDERING

2017-08-03 Thread Raj, Ashok
Hi Ding

Not sure if V7 is the last version.

can you consider rewording this just to make it a little bit more
readable? My suggestion below, feel free to use/modify

Otherwise its all good and you can add my Ack.

Acked-by: Ashok Raj <ashok@intel.com>

On Thu, Jul 13, 2017 at 10:21:30PM +0800, Ding Tianhong wrote:
> From: Casey Leedom <lee...@chelsio.com>


> 
> The new flag PCI_DEV_FLAGS_NO_RELAXED_ORDERING indicates that the Relaxed
> Ordering Attribute should not be used on Transaction Layer Packets destined
> for the PCIe End Node so flagged.  Initially flagged this way are Intel
> E5-26xx Root Complex Ports which suffer from a Flow Control Credit
> Performance Problem and AMD A1100 ARM ("SEATTLE") Root Complex Ports which
> don't obey PCIe 3.0 ordering rules which can lead to Data Corruption.

The patch adds a new flag PCI_DEV_FLAGS_NO_RELAXED_ORDERING to indicate that
Relaxed Ordering (RO) attribute should not be used for Transaction Layer
Packets (TLP) targetted towards these affected root complexes. Current list
of affected parts include Intel E5-26xx root complex which suffers from 
flow control credits that result in performance issues. On these affected
parts RO can still be used for peer-2-peer traffic. AMD A1100 ARM ("SEATTLE")
Root complexes don't obey PCIe 3.0 ordering rules, hence could lead to
data-corruption.
> 
> Signed-off-by: Casey Leedom <lee...@chelsio.com>
> Signed-off-by: Ding Tianhong <dingtianh...@huawei.com>
> ---
>  drivers/pci/quirks.c | 38 ++
>  include/linux/pci.h  |  2 ++
>  2 files changed, 40 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 6967c6b..1e1cdbe 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -4016,6 +4016,44 @@ static void quirk_tw686x_class(struct pci_dev *pdev)
> quirk_tw686x_class);
>  
>  /*
> + * Some devices have problems with Transaction Layer Packets with the Relaxed
> + * Ordering Attribute set.  Such devices should mark themselves and other
> + * Device Drivers should check before sending TLPs with RO set.
> + */
> +static void quirk_relaxedordering_disable(struct pci_dev *dev)
> +{
> + dev->dev_flags |= PCI_DEV_FLAGS_NO_RELAXED_ORDERING;
> +}
> +
> +/*
> + * Intel E5-26xx Root Complex has a Flow Control Credit issue which can
> + * cause performance problems with Upstream Transaction Layer Packets with
> + * Relaxed Ordering set.
> + */
> +DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f02, 
> PCI_CLASS_NOT_DEFINED, 8,
> +   quirk_relaxedordering_disable);
> +DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f04, 
> PCI_CLASS_NOT_DEFINED, 8,
> +   quirk_relaxedordering_disable);
> +DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f08, 
> PCI_CLASS_NOT_DEFINED, 8,
> +   quirk_relaxedordering_disable);
> +
> +/*
> + * The AMD ARM A1100 (AKA "SEATTLE") SoC has a bug in its PCIe Root Complex
> + * where Upstream Transaction Layer Packets with the Relaxed Ordering
> + * Attribute clear are allowed to bypass earlier TLPs with Relaxed Ordering
> + * set.  This is a violation of the PCIe 3.0 Transaction Ordering Rules
> + * outlined in Section 2.4.1 (PCI Express(r) Base Specification Revision 3.0
> + * November 10, 2010).  As a result, on this platform we can't use Relaxed
> + * Ordering for Upstream TLPs.
> + */
> +DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_AMD, 0x1a00, 
> PCI_CLASS_NOT_DEFINED, 8,
> +   quirk_relaxedordering_disable);
> +DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_AMD, 0x1a01, 
> PCI_CLASS_NOT_DEFINED, 8,
> +   quirk_relaxedordering_disable);
> +DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_AMD, 0x1a02, 
> PCI_CLASS_NOT_DEFINED, 8,
> +   quirk_relaxedordering_disable);
> +
> +/*
>   * Per PCIe r3.0, sec 2.2.9, "Completion headers must supply the same
>   * values for the Attribute as were supplied in the header of the
>   * corresponding Request, except as explicitly allowed when IDO is used."
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 4869e66..412ec1c 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -188,6 +188,8 @@ enum pci_dev_flags {
>* the direct_complete optimization.
>*/
>   PCI_DEV_FLAGS_NEEDS_RESUME = (__force pci_dev_flags_t) (1 << 11),
> + /* Don't use Relaxed Ordering for TLPs directed at this device */
> + PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 12),
>  };
>  
>  enum pci_irq_reroute_variant {
> -- 
> 1.8.3.1
> 
> 


Re: [PATCH v7 2/3] PCI: Enable PCIe Relaxed Ordering if supported

2017-08-03 Thread Raj, Ashok
Hi Casey

On Wed, Aug 02, 2017 at 05:53:52PM +, Casey Leedom wrote:
>   Okay, here you go.  As you can tell, it's almost a trivial copy of the
> cxgb4 patch.
>  
>   By the way, I realized that we have yet another hole which is likely not
> to be fixable.  If we're dealing with a problematic Root Complex, and we
> instantiate Virtual Functions and attach them to a Virtual Machine along
> with an NVMe device which can deal with Relaxed Ordering TLPs, the VF driver
> in the VM will be able to tell that it shouldn't attempt to send RO TLPs to
> the RC because it will see the state of its own PCIe Capability Device
> Control[Relaxed Ordering Enable] (a copy of the setting in the VF's
> corresponding PF), but it won't be able to change that and send non-RO TLPs
> to the RC, and RO TLPs to the NVMe device.  Oh well.

I don't understand this completely.. So your driver would know not to send RO
TLP's to root complex. But you want to send RO to the NVMe device? This is the
peer-2-peer case correct?

The issue in the current patchset is that we device to turn off the device 
capability for all devices in the hierarchy so one would expect that 
the NVMe also would have RO turned off i suppose. 

The other approach is to not turn off the device capabilty, but let the
driver do the right thing. i.e for transactions towards system memory vs. 
peer-2-peer? But since we wanted to take a big hammer approach because
some platforms there can be data-corruption and we can't let trust guest
drivers to do the right thing. This isn't something we can fix in this 
current version.

One possible approach is to provide a strict flag, where we use this heavy 
hammer approach only on platforms that have a serious implication, and the 
other is we let the driver do the right thing depending on the platform.

Worst case if the driver doesn't do the right thing, you would see perf issues
but nothing bad would happen. It would allow you to select when to turn on 
RO and when to turn it off.

Cheers,
Ashok


Re: [PATCH v7 2/3] PCI: Enable PCIe Relaxed Ordering if supported

2017-07-27 Thread Raj, Ashok
Hi Casey

> | Still no Intel and AMD guys has ack this, this is what I am worried about,
> | should I ping some man again ?


I can ack the patch set for Intel specific changes. Now that the doc is made
public :-).

Can you/Ding resend the patch series, i do have the most recent v7, some
of the commit message wasn't easy to ready. Seems like this patch has
gotten bigger than originally intended, but seems to be for the overall
good :-).

Sorry for staying silent up until now.

Cheers,
Ashok


Re: [PATCH 1/2] PCI: Add new PCIe Fabric End Node flag, PCI_DEV_FLAGS_NO_RELAXED_ORDERING

2017-05-02 Thread Raj, Ashok
On Tue, May 02, 2017 at 11:10:22AM -0700, Alexander Duyck wrote:
> On Tue, May 2, 2017 at 9:53 AM, Raj, Ashok <ashok@intel.com> wrote:
> > On Tue, May 02, 2017 at 09:39:34AM -0700, Alexander Duyck wrote:
> >> On Mon, May 1, 2017 at 4:13 PM, Casey Leedom <lee...@chelsio.com> wrote:
> >> > The new flag PCI_DEV_FLAGS_NO_RELAXED_ORDERING indicates that the Relaxed
> >> > Ordering Attribute should not be used on Transaction Layer Packets 
> >> > destined
> >> > for the PCIe End Node so flagged.  Initially flagged this way are Intel
> >> > E5-26xx Root Complex Ports which suffer from a Flow Control Credit
> >> > Performance Problem and AMD A1100 ARM ("SEATTLE") Root Complex Ports 
> >> > which
> >> > don't obey PCIe 3.0 ordering rules which can lead to Data Corruption.
> >>
> >> So this is a good first step though might I suggest one other change.
> >>
> >> We may want to add logic to the core PCIe code that clears the "Enable
> >> Relaxed Ordering" bit in the device control register for all devices
> >> hanging off of this root complex. Assuming the devices conform to the
> >> PCIe spec doing that should disable relaxed ordering in a device
> >> agnostic way that then enables us at a driver level to just enable the
> >> feature always without having to perform any checks for your flag. We
> >> could probably do that as a part of probing the PCIe interfaces
> >> hanging off of these devices.
> >
> > I suppose you don't want to turn off RO completely on the device. When
> > traffic is targetted to mmio for peer to peer reasons RO has performance
> > upside. The specific issue with these root ports indicate limitation using
> > RO for traffic targetting coherent memory.
> 
> Actually my main concern here is virtualization. If I take the PCIe
> function and direct assign it I have no way of seeing the root complex
> flag as it is now virtualized away. In the meantime the guest now has
> the ability to enable the function and sees nothing that says you
> can't enable relaxed ordering which in turn ends up potentially
> causing data corruption on the system. I want relaxed ordering
> disabled before I even consider assigning it to the guest on the
> systems where this would be an issue.
> 
> I prefer to err on the side of caution with this. Enabling Relaxed
> Ordering is technically a performance enhancement, so we function but
> not as well as we would like, while having it enabled when there are
> issues can lead to data corruption. I would weigh the risk of data
> corruption the thing to be avoided and of much higher priority than
> enabling improved performance. As such I say we should default the
> relaxed ordering attribute to off in general and look at
> "white-listing" it in for various architectures and/or chipsets that
> support/need it rather than having it enabled by default and trying to
> switch it off after the fact when we find some new issue.

I agree, after thinking about it a bit more.. even for transactions going to
p2p, i'm just reading the pcie spec and some sections aren't super clear
about completion redirect and ACS rules for p2p.

Also it appears the device control default value is 1 for enabling
Relaxed Ordering. This means we should probably save these states across
resets/FLR for e.g. To ensure perf isn't affected after a FLR.
> 
> So for example, in the case of x86 it seems like there are multiple
> root complexes that have issues, and the gains for enabling it with
> standard DMA to host memory are small. As such we may want to default
> it to off via the architecture specific PCIe code and then look at
> having "white-list" cases where we enable it for things like
> peer-to-peer accesses. In the case of SPARC we could look at
> defaulting it to on, and only "black-list" any cases where there might
> be issues since SPARC relies on this in a significant way for
> performance. In the case of ARM and other architectures it is a bit of
> a toss-up. I would say we could just default it to on for now and
> "black-list" anything that doesn't work, or we could go the other way
> like I suggested for x86. It all depends on what the ARM community
> would want to agree on for this. I would say unless it makes a
> significant difference like it does in the case of SPARC we are
> probably better off just defaulting it to off.
> 
> - Alex


Re: [PATCH 1/2] PCI: Add new PCIe Fabric End Node flag, PCI_DEV_FLAGS_NO_RELAXED_ORDERING

2017-05-02 Thread Raj, Ashok
On Tue, May 02, 2017 at 09:39:34AM -0700, Alexander Duyck wrote:
> On Mon, May 1, 2017 at 4:13 PM, Casey Leedom  wrote:
> > The new flag PCI_DEV_FLAGS_NO_RELAXED_ORDERING indicates that the Relaxed
> > Ordering Attribute should not be used on Transaction Layer Packets destined
> > for the PCIe End Node so flagged.  Initially flagged this way are Intel
> > E5-26xx Root Complex Ports which suffer from a Flow Control Credit
> > Performance Problem and AMD A1100 ARM ("SEATTLE") Root Complex Ports which
> > don't obey PCIe 3.0 ordering rules which can lead to Data Corruption.
> 
> So this is a good first step though might I suggest one other change.
> 
> We may want to add logic to the core PCIe code that clears the "Enable
> Relaxed Ordering" bit in the device control register for all devices
> hanging off of this root complex. Assuming the devices conform to the
> PCIe spec doing that should disable relaxed ordering in a device
> agnostic way that then enables us at a driver level to just enable the
> feature always without having to perform any checks for your flag. We
> could probably do that as a part of probing the PCIe interfaces
> hanging off of these devices.

I suppose you don't want to turn off RO completely on the device. When
traffic is targetted to mmio for peer to peer reasons RO has performance
upside. The specific issue with these root ports indicate limitation using 
RO for traffic targetting coherent memory.

> 
> > ---
> >  drivers/pci/quirks.c | 38 ++
> >  include/linux/pci.h  |  2 ++
> >  2 files changed, 40 insertions(+)
> >
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index f754453..4ae78b3 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -3979,6 +3979,44 @@ static void quirk_tw686x_class(struct pci_dev *pdev)
> >   quirk_tw686x_class);
> >
> >  /*
> > + * Some devices have problems with Transaction Layer Packets with the 
> > Relaxed
> > + * Ordering Attribute set.  Such devices should mark themselves and other
> > + * Device Drivers should check before sending TLPs with RO set.
> > + */
> > +static void quirk_relaxedordering_disable(struct pci_dev *dev)
> > +{
> > +   dev->dev_flags |= PCI_DEV_FLAGS_NO_RELAXED_ORDERING;
> > +}
> > +
> > +/*
> > + * Intel E5-26xx Root Complex has a Flow Control Credit issue which can
> > + * cause performance problems with Upstream Transaction Layer Packets with
> > + * Relaxed Ordering set.
> > + */
> > +DECLARE_PCI_FIXUP_CLASS_EARLY(0x8086, 0x6f02, PCI_CLASS_NOT_DEFINED, 8,
> > + quirk_relaxedordering_disable);
> > +DECLARE_PCI_FIXUP_CLASS_EARLY(0x8086, 0x6f04, PCI_CLASS_NOT_DEFINED, 8,
> > + quirk_relaxedordering_disable);
> > +DECLARE_PCI_FIXUP_CLASS_EARLY(0x8086, 0x6f08, PCI_CLASS_NOT_DEFINED, 8,
> > + quirk_relaxedordering_disable);
> > +
> > +/*
> > + * The AMD ARM A1100 (AKA "SEATTLE") SoC has a bug in its PCIe Root Complex
> > + * where Upstream Transaction Layer Packets with the Relaxed Ordering
> > + * Attribute clear are allowed to bypass earlier TLPs with Relaxed Ordering
> > + * set.  This is a violation of the PCIe 3.0 Transaction Ordering Rules
> > + * outlined in Section 2.4.1 (PCI Express(r) Base Specification Revision 
> > 3.0
> > + * November 10, 2010).  As a result, on this platform we can't use Relaxed
> > + * Ordering for Upstream TLPs.
> > + */
> > +DECLARE_PCI_FIXUP_CLASS_EARLY(0x1022, 0x1a00, PCI_CLASS_NOT_DEFINED, 8,
> > + quirk_relaxedordering_disable);
> > +DECLARE_PCI_FIXUP_CLASS_EARLY(0x1022, 0x1a01, PCI_CLASS_NOT_DEFINED, 8,
> > + quirk_relaxedordering_disable);
> > +DECLARE_PCI_FIXUP_CLASS_EARLY(0x1022, 0x1a02, PCI_CLASS_NOT_DEFINED, 8,
> > + quirk_relaxedordering_disable);
> > +
> > +/*
> >   * Per PCIe r3.0, sec 2.2.9, "Completion headers must supply the same
> >   * values for the Attribute as were supplied in the header of the
> >   * corresponding Request, except as explicitly allowed when IDO is used."
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index eb3da1a..6764f66 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -178,6 +178,8 @@ enum pci_dev_flags {
> > PCI_DEV_FLAGS_NO_PM_RESET = (__force pci_dev_flags_t) (1 << 7),
> > /* Get VPD from function 0 VPD */
> > PCI_DEV_FLAGS_VPD_REF_F0 = (__force pci_dev_flags_t) (1 << 8),
> > +   /* Don't use Relaxed Ordering for TLPs directed at this device */
> > +   PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 
> > 9),
> >  };
> >
> >  enum pci_irq_reroute_variant {
> > --
> > 1.9.1
> >


Re: [PATCH 1/2] PCI: Add new PCIe Fabric End Node flag, PCI_DEV_FLAGS_NO_RELAXED_ORDERING

2017-05-02 Thread Raj, Ashok
Hi Casey


On Mon, May 01, 2017 at 04:13:50PM -0700, Casey Leedom wrote:
> The new flag PCI_DEV_FLAGS_NO_RELAXED_ORDERING indicates that the Relaxed
> Ordering Attribute should not be used on Transaction Layer Packets destined
> for the PCIe End Node so flagged.  Initially flagged this way are Intel
> E5-26xx Root Complex Ports which suffer from a Flow Control Credit
> Performance Problem and AMD A1100 ARM ("SEATTLE") Root Complex Ports which
> don't obey PCIe 3.0 ordering rules which can lead to Data Corruption.
> ---
>  drivers/pci/quirks.c | 38 ++
>  include/linux/pci.h  |  2 ++
>  2 files changed, 40 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index f754453..4ae78b3 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -3979,6 +3979,44 @@ static void quirk_tw686x_class(struct pci_dev *pdev)
> quirk_tw686x_class);
>  
>  /*
> + * Some devices have problems with Transaction Layer Packets with the Relaxed
> + * Ordering Attribute set.  Such devices should mark themselves and other
> + * Device Drivers should check before sending TLPs with RO set.
> + */
> +static void quirk_relaxedordering_disable(struct pci_dev *dev)
> +{
> + dev->dev_flags |= PCI_DEV_FLAGS_NO_RELAXED_ORDERING;
> +}
> +
> +/*
> + * Intel E5-26xx Root Complex has a Flow Control Credit issue which can
> + * cause performance problems with Upstream Transaction Layer Packets with
> + * Relaxed Ordering set.
> + */
> +DECLARE_PCI_FIXUP_CLASS_EARLY(0x8086, 0x6f02, PCI_CLASS_NOT_DEFINED, 8,
> +   quirk_relaxedordering_disable);
> +DECLARE_PCI_FIXUP_CLASS_EARLY(0x8086, 0x6f04, PCI_CLASS_NOT_DEFINED, 8,
> +   quirk_relaxedordering_disable);
> +DECLARE_PCI_FIXUP_CLASS_EARLY(0x8086, 0x6f08, PCI_CLASS_NOT_DEFINED, 8,
> +   quirk_relaxedordering_disable);
> +

You might want to add the RP ID's for both HSX/BDX. Tne entire range 
is 2F01H-2F0EH & 6F01H-6F0EH.

Cheers,
Ashok