Sorry again for the delay.  This time at least partially caused by a 
Chelsio-internal Customer Support request to simply disable Relaxed Ordering 
entirely due to the performance issues with our 100Gb/s product and relatively 
recent Intel Root Complexes.  Our Customer Support people are tired of asking 
customers to try turning off Relaxed Ordering. (sigh)

  So, first off, I've mentioned a couple of times that the current cxgb4 driver 
hardwired the PCIe Capability Device Control[Relaxed Ordering Enable] on.  
Here's the code which does it:

    drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:4657:

    static void enable_pcie_relaxed_ordering(struct pci_dev *dev)
    {
        pcie_capability_set_word(dev, PCI_EXP_DEVCTL, PCI_EXP_DEVCTL_RELAX_EN);
    }

This is called from the PCI Probe() routine init_one() later in that file.  I 
just wanted to make sure people knew about this.  Obviously given our current 
very difficult thread, this would either need to be diked out or changed or a 
completely different mechanism put in place.

  Second, just to make sure everyone's on the same page, the above simply 
allows the device to send TLPs with the Relaxed Ordering Attribute.  It doesn't 
cause TLPs to suddenly all be sent with RO set.  The use of Relaxed Ordering is 
selective.  For instance, in our hardware we can configure the RX Path to use 
RO on Ingress Packet Data delivery to Free List Buffers, but not use RO for 
delivery of messages noting newly delivered Ingress Packet Data.  Doing this 
allows the destination PCIe target to [potentially] optimize the DMA Writes to 
it based on local conditions (memory controller channel availability, etc.), 
but ensure that the message noting newly delivered Ingress Packet Data isn't 
processed till all of the preceding TLPs with RO set containing Ingress Packet 
Data have been processed.  (This by the way is the essence of the AMD A1100 ARM 
SoC bug: its Root Complex isn't obeying that PCIe ordering rule.)

  Third, as noted above, I'm getting a lot of pressure to get this addressed 
sooner than later, so I think that we should go with something fairly simple 
along the lines that you guys are proposing and I'll stop whining about the 
problem of needing to handle Peer-to-Peer with Relaxed Ordering while not using 
it for deliveries to the Root Complex.  We can just wait for that kettle of 
fish to explode on us and deal with the mess then.  (Hhmmm, the mixed metaphor 
landed in an entirely different place than I originally intended ... :-))

  If we try to stick as closely to Ding's latest patch set as possible, then we 
can probably just add the diff to remove the enable_pcie_relaxed_ordering() 
code in cxgb4_main.c.

Casey

Reply via email to