Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2019-01-09 Thread Jason Gunthorpe
On Wed, Jan 09, 2019 at 04:09:02PM +1100, Benjamin Herrenschmidt wrote: > > POWER 8 firmware is good? If the link does eventually come back, is > > the POWER8's D3 resumption timeout long enough? > > > > If this doesn't lead to an obvious conclusion you'll probably need to > > connect to IBM's

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2019-01-09 Thread Alexey Kardashevskiy
On 09/01/2019 18:24, Benjamin Herrenschmidt wrote: > On Wed, 2019-01-09 at 15:53 +1100, Alexey Kardashevskiy wrote: >> "A PCI completion timeout occurred for an outstanding PCI-E transaction" >> it is. >> >> This is how I bind the device to vfio: >> >> echo vfio-pci >

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2019-01-09 Thread Alexey Kardashevskiy
On 09/01/2019 18:25, Benjamin Herrenschmidt wrote: > On Wed, 2019-01-09 at 17:32 +1100, Alexey Kardashevskiy wrote: >> I have just moved the "Mellanox Technologies MT27700 Family >> [ConnectX-4]" from garrison to firestone machine and there it does not >> produce an EEH, with the same kernel

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2019-01-08 Thread Benjamin Herrenschmidt
On Wed, 2019-01-09 at 17:32 +1100, Alexey Kardashevskiy wrote: > I have just moved the "Mellanox Technologies MT27700 Family > [ConnectX-4]" from garrison to firestone machine and there it does not > produce an EEH, with the same kernel and skiboot (both upstream + my > debug). Hm. I cannot really

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2019-01-08 Thread Benjamin Herrenschmidt
On Wed, 2019-01-09 at 15:53 +1100, Alexey Kardashevskiy wrote: > "A PCI completion timeout occurred for an outstanding PCI-E transaction" > it is. > > This is how I bind the device to vfio: > > echo vfio-pci > '/sys/bus/pci/devices/:01:00.0/driver_override' > echo vfio-pci >

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2019-01-08 Thread Alexey Kardashevskiy
On 09/01/2019 16:30, David Gibson wrote: > On Wed, Jan 09, 2019 at 04:09:02PM +1100, Benjamin Herrenschmidt wrote: >> On Mon, 2019-01-07 at 21:01 -0700, Jason Gunthorpe wrote: >>> In a very cryptic way that requires manual parsing using non-public docs sadly but yes. From the look of

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2019-01-08 Thread David Gibson
On Wed, Jan 09, 2019 at 04:09:02PM +1100, Benjamin Herrenschmidt wrote: > On Mon, 2019-01-07 at 21:01 -0700, Jason Gunthorpe wrote: > > > > > In a very cryptic way that requires manual parsing using non-public > > > docs sadly but yes. From the look of it, it's a completion timeout. > > > > > >

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2019-01-08 Thread Benjamin Herrenschmidt
On Mon, 2019-01-07 at 21:01 -0700, Jason Gunthorpe wrote: > > > In a very cryptic way that requires manual parsing using non-public > > docs sadly but yes. From the look of it, it's a completion timeout. > > > > Looks to me like we don't get a response to a config space access > > during the

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2019-01-08 Thread Alexey Kardashevskiy
On 06/01/2019 09:43, Benjamin Herrenschmidt wrote: > On Sat, 2019-01-05 at 10:51 -0700, Jason Gunthorpe wrote: >> >>> Interesting. I've investigated this further, though I don't have as >>> many new clues as I'd like. The problem occurs reliably, at least on >>> one particular type of machine

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2019-01-07 Thread Leon Romanovsky
On Mon, Jan 07, 2019 at 09:01:29PM -0700, Jason Gunthorpe wrote: > On Sun, Jan 06, 2019 at 09:43:46AM +1100, Benjamin Herrenschmidt wrote: > > On Sat, 2019-01-05 at 10:51 -0700, Jason Gunthorpe wrote: > > > > > > > Interesting. I've investigated this further, though I don't have as > > > > many

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2019-01-07 Thread Jason Gunthorpe
On Sun, Jan 06, 2019 at 09:43:46AM +1100, Benjamin Herrenschmidt wrote: > On Sat, 2019-01-05 at 10:51 -0700, Jason Gunthorpe wrote: > > > > > Interesting. I've investigated this further, though I don't have as > > > many new clues as I'd like. The problem occurs reliably, at least on > > > one

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2019-01-05 Thread Benjamin Herrenschmidt
On Sat, 2019-01-05 at 10:51 -0700, Jason Gunthorpe wrote: > > > Interesting. I've investigated this further, though I don't have as > > many new clues as I'd like. The problem occurs reliably, at least on > > one particular type of machine (a POWER8 "Garrison" with ConnectX-4). > > I don't yet

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2019-01-05 Thread Jason Gunthorpe
On Fri, Jan 04, 2019 at 02:44:01PM +1100, David Gibson wrote: > On Thu, Dec 06, 2018 at 08:45:09AM +0200, Leon Romanovsky wrote: > > On Thu, Dec 06, 2018 at 03:19:51PM +1100, David Gibson wrote: > > > Mellanox ConnectX-5 IB cards (MT27800) seem to cause a call trace when > > > unbound from their

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2019-01-03 Thread David Gibson
On Thu, Dec 06, 2018 at 08:45:09AM +0200, Leon Romanovsky wrote: > On Thu, Dec 06, 2018 at 03:19:51PM +1100, David Gibson wrote: > > Mellanox ConnectX-5 IB cards (MT27800) seem to cause a call trace when > > unbound from their regular driver and attached to vfio-pci in order to pass > > them

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2018-12-11 Thread Bjorn Helgaas
On Tue, Dec 11, 2018 at 6:38 PM David Gibson wrote: > > On Tue, Dec 11, 2018 at 08:01:43AM -0600, Bjorn Helgaas wrote: > > Hi David, > > > > I see you're still working on this, but if you do end up going this > > direction eventually, would you mind splitting this into two patches: > > 1) rename

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2018-12-11 Thread David Gibson
On Tue, Dec 11, 2018 at 08:01:43AM -0600, Bjorn Helgaas wrote: > Hi David, > > I see you're still working on this, but if you do end up going this > direction eventually, would you mind splitting this into two patches: > 1) rename the quirk to make it more generic (but not changing any >

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2018-12-11 Thread Bjorn Helgaas
Hi David, I see you're still working on this, but if you do end up going this direction eventually, would you mind splitting this into two patches: 1) rename the quirk to make it more generic (but not changing any behavior), and 2) add the ConnectX devices to the quirk. That way the ConnectX

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2018-12-10 Thread David Gibson
On Thu, Dec 06, 2018 at 08:45:09AM +0200, Leon Romanovsky wrote: > On Thu, Dec 06, 2018 at 03:19:51PM +1100, David Gibson wrote: > > Mellanox ConnectX-5 IB cards (MT27800) seem to cause a call trace when > > unbound from their regular driver and attached to vfio-pci in order to pass > > them

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2018-12-05 Thread Leon Romanovsky
On Thu, Dec 06, 2018 at 03:19:51PM +1100, David Gibson wrote: > Mellanox ConnectX-5 IB cards (MT27800) seem to cause a call trace when > unbound from their regular driver and attached to vfio-pci in order to pass > them through to a guest. > > This goes away if the disable_idle_d3 option is used,