Re: [Xen-devel] [PATCH V2] Xen/pciback: Implement PCI slot or bus reset with 'do_flr' SysFS attribute

2017-11-09 Thread Jan Beulich
>>> On 09.11.17 at 00:06,  wrote:
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -244,6 +244,91 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
>   return found_dev;
>  }
>  
> +struct pcistub_args {
> + struct pci_dev *dev;

Please don't ignore prior review comments: Either carry out what
was requested, or explain why the request can't be fulfilled. You
saying "This field will point to first device that is not owned by
pcistub" to Roger's request to make this a pointer to const is not a
valid reason to not do the adjustment; in fact your reply is entirely
unrelated to the request.

> +static int pcistub_search_dev(struct pci_dev *dev, void *data)
> +{
> + struct pcistub_device *psdev;
> + struct pcistub_args *arg = data;
> + bool found_dev = false;

Purely cosmetic, but anyway: Why not just "found"? What else
could be (not) found here other than the device in question?

> + unsigned long flags;
> +
> + spin_lock_irqsave(&pcistub_devices_lock, flags);
> +
> + list_for_each_entry(psdev, &pcistub_devices, dev_list) {
> +     if (psdev->dev == dev) {
> +         found_dev = true;
> +         arg->dcount++;
> +         break;
> +     }
> + }
> +
> + spin_unlock_irqrestore(&pcistub_devices_lock, flags);
> +
> + /* Device not owned by pcistub, someone owns it. Abort the walk */
> + if (!found_dev)
> +     arg->dev = dev;
> +
> + return found_dev ? 0 : 1;

Despite the function needing to return int, this can be simplified to
"return !found_dev". I'd also like to note that the part of the
earlier comment related to this is sort of disconnected. How about

    /* Device not owned by pcistub, someone owns it. Abort the walk */
    if (!found_dev) {
        arg->dev = dev;
        return 1;
    }

    return 0;

And finally - I don't think the comment is entirely correct - the
device not being owned by pciback doesn't necessarily mean it's
owned by another driver. It could as well be unowned.
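
To illustrate the restructured return path, here is a user-space sketch
(not kernel code). struct fake_dev, the owned[] table and walk() are
hypothetical stand-ins for struct pci_dev, the pcistub_devices list and
pci_walk_bus(); the locking is left out. The args field is a pointer to
const on the assumption that it is only read for reporting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev. */
struct fake_dev {
	const char *name;
};

struct search_args {
	const struct fake_dev *dev;	/* first device not owned by the stub */
	unsigned int dcount;		/* number of owned devices seen so far */
};

/* Hypothetical ownership table standing in for pcistub_devices. */
static const struct fake_dev *owned[8];
static size_t nr_owned;

static int search_dev(const struct fake_dev *dev, void *data)
{
	struct search_args *arg = data;
	bool found = false;
	size_t i;

	assert(arg != NULL);

	for (i = 0; i < nr_owned; i++) {
		if (owned[i] == dev) {
			found = true;
			arg->dcount++;
			break;
		}
	}

	/*
	 * Device not owned by the stub (it may be unowned or bound to
	 * another driver): record it and abort the walk.
	 */
	if (!found) {
		arg->dev = dev;
		return 1;
	}

	return 0;
}

/* Calls cb for each device and stops at the first non-zero return. */
static void walk(const struct fake_dev *devs, size_t n,
		 int (*cb)(const struct fake_dev *, void *), void *data)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (cb(&devs[i], data))
			break;
}
```

With the assignment and the non-zero return in one block, the comment sits
right next to the code that actually aborts the walk.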

> +static int pcistub_reset_dev(struct pci_dev *dev)
> +{
> + struct xen_pcibk_dev_data *dev_data;
> + bool slot = false, bus = false;
> + struct pcistub_args arg = {};
> +
> + if (!dev)
> +     return -EINVAL;
> +
> + dev_dbg(&dev->dev, "[%s]\n", __func__);
> +
> + if (!pci_probe_reset_slot(dev->slot))
> +     slot = true;
> + else if ((!pci_probe_reset_bus(dev->bus)) &&
> +          (!pci_is_root_bus(dev->bus)))
> +     bus = true;
> +
> + if (!bus && !slot)
> +     return -EOPNOTSUPP;
> +
> + /*
> +  * Make sure all devices on this bus are owned by the
> +  * PCI backend so that we can safely reset the whole bus.
> +  */

Is that really the case when you mean to do a slot reset? It was for
a reason that I had asked about a missing "else" in v1 review,
rather than questioning the conditional around the logic.
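
One possible reading of this, as a user-space sketch under assumptions: for
a slot reset only the devices in the same slot would need to be owned,
while a bus reset needs every device on the bus to be owned. struct
fake_dev and first_foreign() are hypothetical; whether the patch should
behave this way is exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev. */
struct fake_dev {
	int slot;	/* stands in for dev->slot */
	bool owned;	/* stands in for a lookup in pcistub_devices */
};

/*
 * Returns the first device within the scope of the reset that is not
 * owned, or NULL if none is found. With slot_only set, devices in other
 * slots are assumed to be unaffected by the reset and are skipped.
 */
static const struct fake_dev *first_foreign(const struct fake_dev *devs,
					    size_t n,
					    const struct fake_dev *target,
					    bool slot_only)
{
	size_t i;

	assert(devs != NULL && target != NULL);

	for (i = 0; i < n; i++) {
		if (slot_only && devs[i].slot != target->slot)
			continue;
		if (!devs[i].owned)
			return &devs[i];
	}

	return NULL;
}
```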

> + pci_walk_bus(dev->bus, pcistub_search_dev, &arg);
> +
> + /* All devices under the bus should be part of pcistub! */
> + if (arg.dev) {
> +     dev_err(&dev->dev, "%s device on bus 0x%x is not owned by pcistub\n",

%#x

Yet then, thinking about what would be useful information should the
situation really arise, I'm not convinced printing a bare bus number
here is useful either. Especially for the case of multiple parallel
requests you want to make it possible to match each message to the
original request (guest start or whatever). Hence I think you want
something like

"%s on the same bus as %s is not owned by " DRV_NAME "\n"

> +             pci_name(arg.dev), dev->bus->number);
> +
> +     return -EBUSY;
> + }
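
As a user-space sketch of the message format suggested above:
format_not_owned() is a hypothetical helper, snprintf() stands in for
dev_err(), and the value given to DRV_NAME here is an assumption made for
the sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed value for this sketch; the driver defines its own DRV_NAME. */
#define DRV_NAME "xen-pciback"

/*
 * Formats the error so that it names both the offending device and the
 * device the request was made for, which lets each message be matched
 * to the request that triggered it.
 */
static int format_not_owned(char *buf, size_t len,
			    const char *foreign, const char *requested)
{
	assert(buf != NULL && len > 0);

	return snprintf(buf, len,
			"%s on the same bus as %s is not owned by " DRV_NAME "\n",
			foreign, requested);
}
```

Both device names would come from pci_name() in the driver, so they already
carry segment, bus, device and function.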
> +
> + dev_dbg(&dev->dev, "pcistub owns %d devices on bus 0x%x\n",
> + arg.dcount, dev->bus->number);

While here the original device is perhaps not necessary to print,
the bare bus number doesn't carry enough information: You'll
want to prefix it by the segment number. Plus you'll want to use
canonical formatting (:bb), so one can get matches when
suitably grep-ing the log. Perhaps bus->name is what you're
after.
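
A user-space sketch of what segment-prefixed formatting buys you:
format_bus() is a hypothetical helper, and the four-digit segment plus
two-digit bus layout is assumed to match the leading part of PCI device
names such as 0000:03:00.1.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Formats a bus as zero-padded hex segment and bus number, so the result
 * is a prefix of the names of all devices on that bus.
 */
static int format_bus(char *buf, size_t len,
		      unsigned int segment, unsigned int bus)
{
	assert(buf != NULL && len > 0);

	return snprintf(buf, len, "%04x:%02x", segment, bus);
}
```

A log line carrying such a string can be found with the same grep pattern
that finds the devices on that bus.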

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH V2] Xen/pciback: Implement PCI slot or bus reset with 'do_flr' SysFS attribute

2017-11-08 Thread Govinda Tatti
The life-cycle of a PCI device in Xen pciback is complex and is constrained
by the generic PCI locking mechanism.

- It starts with the device being bound to us, for which we do a function
  reset (done via SysFS so the PCI lock is held).
- If the device is unbound from us, we also do a function reset
  (done via SysFS so the PCI lock is held).
- If the device is un-assigned from a guest - we do a function reset
  (no PCI lock is held).

All reset operations are done on the individual PCI function level
(so bus:device:function).

Resetting an individual PCI function requires the device to support FLR
(PCIe or AF), a PM reset (D3hot->D0), a device-specific reset, or a secondary
bus reset for a singleton device on a bus. However, FLR is not widely
supported and is unreliable in some cases, so we need to give users an
alternate mechanism to perform a slot- or bus-level reset.

Currently, a slot or bus reset is not exposed in SysFS as there is no good
way of exposing a bus topology there. This is due to the complexity -
we MUST know that the different functions of a PCIe device are not in use
by other drivers, or if they are in use (say one of them is assigned to a
guest and the other is idle) - it is still OK to reset the slot (assuming
both of them are owned by Xen pciback).

This patch does that by doing a slot or bus reset (if slot not supported)
if all of the functions of a PCIe device belong to Xen PCIback.

Due to the complexity of the PCI lock, we cannot do the reset when a
device is bound ('echo $BDF > bind') or when unbound ('echo $BDF > unbind'),
as pci_[slot|bus]_reset also takes the same lock, resulting in a
deadlock.

Putting the reset function in a work-queue or thread won't work either -
as we have to do the reset function outside the 'unbind' context (it holds
the PCI lock). But once you 'unbind' a device the device is no longer under
the ownership of Xen pciback and the pci_set_drvdata has been reset, so
we cannot use a thread for this.

Instead of doing all this complex dance, we depend on the tool-stack doing
the right thing. As such, we implement the 'do_flr' SysFS attribute which
'xl' uses when a device is detached or attached from/to a guest. It
bypasses the need to worry about the PCI lock.

To not inadvertently do a bus reset that would affect devices that are in
use by other drivers (other than Xen pciback) prior to the reset, we check
that all of the devices under the bridge are owned by Xen pciback. If they
are not, we refrain from executing the bus (or slot) reset.

Signed-off-by: Govinda Tatti 
Signed-off-by: Konrad Rzeszutek Wilk 
Reviewed-by: Boris Ostrovsky 
---
 Documentation/ABI/testing/sysfs-driver-pciback |  12 +++
 drivers/xen/xen-pciback/pci_stub.c | 119 +
 2 files changed, 131 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-driver-pciback b/Documentation/ABI/testing/sysfs-driver-pciback
index 6a733bf..ccf7dc0 100644
--- a/Documentation/ABI/testing/sysfs-driver-pciback
+++ b/Documentation/ABI/testing/sysfs-driver-pciback
@@ -11,3 +11,15 @@ Description:
 #echo 00:19.0-E0:2:FF > /sys/bus/pci/drivers/pciback/quirks
 will allow the guest to read and write to the configuration
 register 0x0E.
+
+What:           /sys/bus/pci/drivers/pciback/do_flr
+Date:           Nov 2017
+KernelVersion:  4.15
+Contact:        xen-de...@lists.xenproject.org
+Description:
+                An option to perform a slot or bus reset when a PCI device
+                is owned by Xen PCI backend. Writing a string of :BB:DD.F
+                will cause the pciback driver to perform a slot or bus reset
+                if the device supports it. It also checks to make sure that
+                all of the devices under the bridge are owned by Xen PCI
+                backend.
diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 6331a95..e2677a6 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -244,6 +244,91 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
return found_dev;
 }
 
+struct pcistub_args {
+   struct pci_dev *dev;
+   unsigned int dcount;
+};
+
+static int pcistub_search_dev(struct pci_dev *dev, void *data)
+{
+   struct pcistub_device *psdev;
+   struct pcistub_args *arg = data;
+   bool found_dev = false;
+   unsigned long flags;
+
+   spin_lock_irqsave(&pcistub_devices_lock, flags);
+
+   list_for_each_entry(psdev, &pcistub_devices, dev_list) {
+       if (psdev->dev == dev) {
+           found_dev = true;
+           arg->dcount++;
+           break;
+       }
+   }
+
+   spin_unlock_irqrestore(&pcistub_devices_lock, flags);
+
+   /* Device not owned by pcistub, someone owns it. Abort the walk */
+   if (!found_dev)
+       arg->dev = dev;
+
+   return found_