Re: [Xen-devel] [PATCH v6 11/11] vpci/msix: add MSI-X handlers

2017-10-11 Thread Jan Beulich
>>> On 10.10.17 at 17:04,  wrote:
> On Wed, Oct 04, 2017 at 08:34:43AM +, Jan Beulich wrote:
>> >>> On 19.09.17 at 17:29,  wrote:
>> > --- a/xen/include/xen/vpci.h
>> > +++ b/xen/include/xen/vpci.h
>> > @@ -100,6 +100,40 @@ struct vpci {
>> >  /* 64-bit address capable? */
>> >  bool address64;
>> >  } *msi;
>> > +
>> > +/* MSI-X data. */
>> > +struct vpci_msix {
>> > +struct pci_dev *pdev;
>> > +/* List link. */
>> > +struct list_head next;
>> > +/* Table information. */
>> > +struct vpci_msix_mem {
>> > +/* MSI-X table offset. */
>> > +unsigned int offset;
>> > +/* MSI-X table BIR. */
>> > +unsigned int bir;
>> > +/* Table size. */
>> > +unsigned int size;
>> > +#define VPCI_MSIX_TABLE 0
>> > +#define VPCI_MSIX_PBA   1
>> > +#define VPCI_MSIX_MEM_NUM   2
>> > +} mem[VPCI_MSIX_MEM_NUM];
>> > +/* Maximum number of vectors supported by the device. */
>> > +unsigned int max_entries;
>> > +/* MSI-X enabled? */
>> > +bool enabled;
>> > +/* Masked? */
>> > +bool masked;
>> > +/* Entries. */
>> > +struct vpci_msix_entry {
>> > +uint64_t addr;
>> > +uint32_t data;
>> > +unsigned int nr;
>> > +struct vpci_arch_msix_entry arch;
>> > +bool masked;
>> > +bool updated;
>> > +} entries[];
>> > +} *msix;
>> 
>> Same remark as for MSI regarding optimizing structure size.
> 
> Going over the fields, bir can be turned into a uint8_t, and size into
> a uint16_t. max_entries can also be converted to a uint16_t together
> with nr.
> 
> Apart from that I don't see much more optimization, unless we start
> packaging fields (ie: offset and bir could reside in a uint32_t), but
> IMHO that's going to make the code harder to parse for little gain,
> and will involve more calculations in the handlers.

The more instances of a structure there may be, the more
relevant it is to pack them tightly. I.e. primary focus needs
to be on struct vpci_msix_entry, but since - as indicated -
there may be many devices supporting MSI-X, struct vpci_msix
as a whole should be reasonably well packed as well. I don't
think more calculation in the handlers is an argument - the
compiler will do it for you, and the affected code shouldn't
really be performance critical (it's involved in setting up
interrupts, not delivering them).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v6 11/11] vpci/msix: add MSI-X handlers

2017-10-10 Thread Roger Pau Monné
On Wed, Oct 04, 2017 at 08:34:43AM +, Jan Beulich wrote:
> >>> On 19.09.17 at 17:29,  wrote:
> > +const struct vpci_bar *bars;
> > +struct vpci_msix *msix;
> > +const struct vpci_msix_entry *entry;
> > +unsigned int offset;
> > +
> > +*data = ~0ul;
> > +
> > +msix = vpci_msix_find(d, addr);
> > +if ( !msix || !vpci_msix_access_allowed(msix->pdev, addr, len) )
> > +return X86EMUL_OKAY;
> 
> In the !msix case I'm once again not convinced returning OKAY is correct
> here.

From what we have spoken in the mmcfg case, for the !msix case Xen
should return _RETRY. This error can only happen when the msix table
is unmapped in between a accept and read/write call, so calling the
accept handler again will return the correct value.

> > --- a/xen/include/xen/vpci.h
> > +++ b/xen/include/xen/vpci.h
> > @@ -100,6 +100,40 @@ struct vpci {
> >  /* 64-bit address capable? */
> >  bool address64;
> >  } *msi;
> > +
> > +/* MSI-X data. */
> > +struct vpci_msix {
> > +struct pci_dev *pdev;
> > +/* List link. */
> > +struct list_head next;
> > +/* Table information. */
> > +struct vpci_msix_mem {
> > +/* MSI-X table offset. */
> > +unsigned int offset;
> > +/* MSI-X table BIR. */
> > +unsigned int bir;
> > +/* Table size. */
> > +unsigned int size;
> > +#define VPCI_MSIX_TABLE 0
> > +#define VPCI_MSIX_PBA   1
> > +#define VPCI_MSIX_MEM_NUM   2
> > +} mem[VPCI_MSIX_MEM_NUM];
> > +/* Maximum number of vectors supported by the device. */
> > +unsigned int max_entries;
> > +/* MSI-X enabled? */
> > +bool enabled;
> > +/* Masked? */
> > +bool masked;
> > +/* Entries. */
> > +struct vpci_msix_entry {
> > +uint64_t addr;
> > +uint32_t data;
> > +unsigned int nr;
> > +struct vpci_arch_msix_entry arch;
> > +bool masked;
> > +bool updated;
> > +} entries[];
> > +} *msix;
> 
> Same remark as for MSI regarding optimizing structure size.

Going over the fields, bir can be turned into a uint8_t, and size into
a uint16_t. max_entries can also be converted to a uint16_t together
with nr.

Apart from that I don't see much more optimization, unless we start
packaging fields (ie: offset and bir could reside in a uint32_t), but
IMHO that's going to make the code harder to parse for little gain,
and will involve more calculations in the handlers.

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v6 11/11] vpci/msix: add MSI-X handlers

2017-10-04 Thread Jan Beulich
>>> On 19.09.17 at 17:29,  wrote:
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -152,6 +152,7 @@ static int vpci_check_bar_overlap(const struct pci_dev 
> *pdev,
>  static void vpci_modify_bars(const struct pci_dev *pdev, bool map)
>  {
>  struct vpci_header *header = &pdev->vpci->header;
> +struct vpci_msix *msix = pdev->vpci->msix;

const and please fetch the value only right before you first need it.

> --- a/xen/drivers/vpci/msi.c
> +++ b/xen/drivers/vpci/msi.c
> @@ -320,13 +320,17 @@ void vpci_dump_msi(void)
>  if ( !has_vpci(d) )
>  continue;
>  
> -printk("vPCI MSI information for d%d\n", d->domain_id);
> +printk("vPCI MSI/MSI-X information for d%d\n", d->domain_id);
>  
>  list_for_each_entry ( pdev, &d->arch.pdev_list, domain_list )
>  {
>  uint8_t seg = pdev->seg, bus = pdev->bus;
>  uint8_t slot = PCI_SLOT(pdev->devfn), func = 
> PCI_FUNC(pdev->devfn);
>  const struct vpci_msi *msi = pdev->vpci->msi;
> +const struct vpci_msix *msix = pdev->vpci->msix;
> +
> +if ( msi || msix )
> +printk("Device %04x:%02x:%02x.%u\n", seg, bus, slot, func);
>  
>  if ( !spin_trylock(&pdev->vpci->lock) )
>  {
> @@ -336,7 +340,7 @@ void vpci_dump_msi(void)
>  
>  if ( msi )
>  {
> -printk("Device %04x:%02x:%02x.%u\n", seg, bus, slot, func);
> +printk(" MSI\n");
>  
>  printk("  Enabled: %u Supports masking: %u 64-bit addresses: 
> %u\n",
> msi->enabled, msi->masking, msi->address64);
> @@ -349,6 +353,20 @@ void vpci_dump_msi(void)
>  printk("  mask=%08x\n", msi->mask);
>  }
>  
> +if ( msix )
> +{
> +unsigned int i;
> +
> +printk(" MSI-X\n");
> +
> +printk("  Max entries: %u maskall: %u enabled: %u\n",
> +   msix->max_entries, msix->masked, msix->enabled);
> +
> +printk("  Table entries:\n");
> +for ( i = 0; i < msix->max_entries; i++ )
> +vpci_msix_arch_print_entry(&msix->entries[i]);
> +}
> +

Again, please try to reduce the amount of overall output.

> +static void vpci_msix_control_write(const struct pci_dev *pdev,
> +unsigned int reg, uint32_t val, void 
> *data)
> +{
> +uint8_t seg = pdev->seg, bus = pdev->bus;
> +uint8_t slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn);
> +struct vpci_msix *msix = data;
> +bool new_masked, new_enabled;
> +unsigned int i;
> +int rc;
> +
> +new_masked = val & PCI_MSIX_FLAGS_MASKALL;
> +new_enabled = val & PCI_MSIX_FLAGS_ENABLE;
> +
> +/*
> + * According to the PCI 3.0 specification, switching the enable bit
> + * to 1 or the function mask bit to 0 should cause all the cached
> + * addresses and data fields to be recalculated. Xen implements this
> + * as disabling and enabling the entries.
> + *
> + * Note that the disable/enable sequence is only performed when the
> + * guest has written to the entry (ie: updated field set) or MSIX is
> + * enabled.
> + */
> +if ( new_enabled && !new_masked && (!msix->enabled || msix->masked) )
> +{
> +paddr_t table_base =
> +pdev->vpci->header.bars[msix->mem[VPCI_MSIX_TABLE].bir].addr;
> +
> +for ( i = 0; i < msix->max_entries; i++ )
> +{
> +if ( msix->entries[i].masked ||
> + (new_enabled && msix->enabled && !msix->entries[i].updated) 
> )
> +continue;

This doesn't look to match up with the earlier comment.

> +static int vpci_msix_read(struct vcpu *v, unsigned long addr,
> +  unsigned int len, unsigned long *data)
> +{
> +struct domain *d = v->domain;

const?

> +const struct vpci_bar *bars;
> +struct vpci_msix *msix;
> +const struct vpci_msix_entry *entry;
> +unsigned int offset;
> +
> +*data = ~0ul;
> +
> +msix = vpci_msix_find(d, addr);
> +if ( !msix || !vpci_msix_access_allowed(msix->pdev, addr, len) )
> +return X86EMUL_OKAY;

In the !msix case I'm once again not convinced returning OKAY is correct
here.

> --- a/xen/include/xen/vpci.h
> +++ b/xen/include/xen/vpci.h
> @@ -100,6 +100,40 @@ struct vpci {
>  /* 64-bit address capable? */
>  bool address64;
>  } *msi;
> +
> +/* MSI-X data. */
> +struct vpci_msix {
> +struct pci_dev *pdev;
> +/* List link. */
> +struct list_head next;
> +/* Table information. */
> +struct vpci_msix_mem {
> +/* MSI-X table offset. */
> +unsigned int offset;
> +/* MSI-X table BIR. */
> +unsigned int bir;
> +/* Table size. */
> +unsigned int size

[Xen-devel] [PATCH v6 11/11] vpci/msix: add MSI-X handlers

2017-09-19 Thread Roger Pau Monne
Add handlers for accesses to the MSI-X message control field on the
PCI configuration space, and traps for accesses to the memory region
that contains the MSI-X table and PBA. This traps detect attempts from
the guest to configure MSI-X interrupts and properly sets them up.

Note that accesses to the Table Offset, Table BIR, PBA Offset and PBA
BIR are not trapped by Xen at the moment.

Finally, turn the panic in the Dom0 PVH builder into a warning.

Signed-off-by: Roger Pau Monné 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
Changes since v5:
 - Update lock usage.
 - Unbind/unmap PIRQs when MSIX is disabled.
 - Share the arch-specific MSIX code with the MSI functions.
 - Do not reference the MSIX memory areas from the PCI BARs fields,
   instead fetch the BIR and offset each time needed.
 - Add the '_entry' suffix to the MSIX arch functions.
 - Prefix the vMSIX macros with 'V'.
 - s/gdprintk/gprintk/ in msix.c
 - Make vpci_msix_access_check return bool, and change it's name to
   vpci_msix_access_allowed.
 - Join the first two ifs in vpci_msix_{read/write} into a single one.
 - Allow Dom0 to write to the PBA area.
 - Add a note that reads from the PBA area will need to be translated
   if the PBA it's not identity mapped.

Changes since v4:
 - Remove parentheses around offsetof.
 - Add "being" to MSI-X enabling comment.
 - Use INVALID_PIRQ.
 - Add a simple sanity check to vpci_msix_arch_enable in order to
   detect wrong MSI-X entries more quickly.
 - Constify vpci_msix_arch_print entry argument.
 - s/cpu/fixed/ in vpci_msix_arch_print.
 - Dump the MSI-X info together with the MSI info.
 - Fix vpci_msix_control_write to take into account changes to the
   address and data fields when switching the function mask bit.
 - Only disable/enable the entries if the address or data fields have
   been updated.
 - Usew the BAR enable field to check if a BAR is mapped or not
   (instead of reading the command register for each device).
 - Fix error path in vpci_msix_read to set the return data to ~0.
 - Simplify mask usage in vpci_msix_write.
 - Cast data to uint64_t when shifting it 32 bits.
 - Fix writes to the table entry control register to take into account
   if the mask-all bit is set.
 - Add some comments to clarify the intended behavior of the code.
 - Align the PBA size to 64-bits.
 - Remove the error label in vpci_init_msix.
 - Try to compact the layout of the vpci_msix structure.
 - Remove the local table_bar and pba_bar variables from
   vpci_init_msix, they are used only once.

Changes since v3:
 - Propagate changes from previous versions: remove xen_ prefix, use
   the new fields in vpci_val and remove the return value from
   handlers.
 - Remove the usage of GENMASK.
 - Mave the arch-specific parts of the dump routine to the
   x86/hvm/vmsi.c dump handler.
 - Chain the MSI-X dump handler to the 'M' debug key.
 - Fix the header BAR mappings so that the MSI-X regions inside of
   BARs are unmapped from the domain p2m in order for the handlers to
   work properly.
 - Unconditionally trap and forward accesses to the PBA MSI-X area.
 - Simplify the conditionals in vpci_msix_control_write.
 - Fix vpci_msix_accept to use a bool type.
 - Allow all supported accesses as described in the spec to the MSI-X
   table.
 - Truncate the returned address when the access is a 32b read.
 - Always return X86EMUL_OKAY from the handlers, returning ~0 in the
   read case if the access is not supported, or ignoring writes.
 - Do not check that max_entries is != 0 in the init handler.
 - Use trylock in the dump handler.

Changes since v2:
 - Split out arch-specific code.

This patch has been tested with devices using both a single MSI-X
entry and multiple ones.
---
 xen/arch/x86/hvm/dom0_build.c|   2 +-
 xen/arch/x86/hvm/hvm.c   |   1 +
 xen/arch/x86/hvm/vmsi.c  | 133 --
 xen/drivers/vpci/Makefile|   2 +-
 xen/drivers/vpci/header.c|  16 ++
 xen/drivers/vpci/msi.c   |  22 +-
 xen/drivers/vpci/msix.c  | 506 +++
 xen/include/asm-x86/hvm/domain.h |   3 +
 xen/include/asm-x86/hvm/io.h |   5 +
 xen/include/xen/vpci.h   |  45 
 10 files changed, 705 insertions(+), 30 deletions(-)
 create mode 100644 xen/drivers/vpci/msix.c

diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c
index 17d77137d6..8fa92bc5b6 100644
--- a/xen/arch/x86/hvm/dom0_build.c
+++ b/xen/arch/x86/hvm/dom0_build.c
@@ -,7 +,7 @@ int __init dom0_construct_pvh(struct domain *d, const 
module_t *image,
 
 pvh_setup_mmcfg(d);
 
-panic("Building a PVHv2 Dom0 is not yet supported.");
+printk("WARNING: PVH is an experimental mode with limited 
functionality\n");
 return 0;
 }
 
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index b1064413fc..042b7c6a31 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -585,6 +585,7 @@ int hvm_domain_initialise(struct domain *d, unsigned long 
domcr_flags