Re: [PATCH] uapi/virtio_scsi: allow overriding CDB/SENSE size

2015-03-12 Thread Rusty Russell
"Michael S. Tsirkin"  writes:
> On Wed, Mar 11, 2015 at 02:19:03PM +0100, Michael S. Tsirkin wrote:
>> QEMU wants to use virtio scsi structures with
>> a different VIRTIO_SCSI_CDB_SIZE/VIRTIO_SCSI_SENSE_SIZE,
>> let's add ifdefs to allow overriding them.
>> 
>> Keep the old defines under new names:
>> VIRTIO_SCSI_CDB_DEFAULT_SIZE/VIRTIO_SCSI_SENSE_DEFAULT_SIZE,
>> since that's what these values really are:
>> defaults for cdb/sense size fields.
>> 
>> Suggested-by: Paolo Bonzini 
>> Signed-off-by: Michael S. Tsirkin 
>> ---
>
> Rusty, pls note, if possible I'd like this one
> merged for 4.0 because it's important for QEMU
> (which now imports linux headers).

OK, applied.

Thanks,
Rusty.


>
>
>>  include/uapi/linux/virtio_scsi.h | 12 ++--
>>  1 file changed, 10 insertions(+), 2 deletions(-)
>> 
>> diff --git a/include/uapi/linux/virtio_scsi.h 
>> b/include/uapi/linux/virtio_scsi.h
>> index 42b9370..cc18ef8 100644
>> --- a/include/uapi/linux/virtio_scsi.h
>> +++ b/include/uapi/linux/virtio_scsi.h
>> @@ -29,8 +29,16 @@
>>  
>>  #include 
>>  
>> -#define VIRTIO_SCSI_CDB_SIZE   32
>> -#define VIRTIO_SCSI_SENSE_SIZE 96
>> +/* Default values of the CDB and sense data size configuration fields */
>> +#define VIRTIO_SCSI_CDB_DEFAULT_SIZE   32
>> +#define VIRTIO_SCSI_SENSE_DEFAULT_SIZE 96
>> +
>> +#ifndef VIRTIO_SCSI_CDB_SIZE
>> +#define VIRTIO_SCSI_CDB_SIZE VIRTIO_SCSI_CDB_DEFAULT_SIZE
>> +#endif
>> +#ifndef VIRTIO_SCSI_SENSE_SIZE
>> +#define VIRTIO_SCSI_SENSE_SIZE VIRTIO_SCSI_SENSE_DEFAULT_SIZE
>> +#endif
>>  
>>  /* SCSI command request, followed by data-out */
>>  struct virtio_scsi_cmd_req {
>> -- 
>> MST
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2 log fixed] virtio_mmio: fix endian-ness for mmio

2015-03-12 Thread Rusty Russell
"Michael S. Tsirkin"  writes:
> Subject: [PATCH] virtio_mmio: fix access width for mmio

Just for the record:

Applied.

Thanks,
Rusty.

> Going over the virtio mmio code, I noticed that it doesn't correctly
> access modern device config values using "natural" accessors: it uses
> readb to get/set them byte by byte, while the virtio 1.0 spec explicitly 
> states:
>
> 4.2.2.2 Driver Requirements: MMIO Device Register Layout
>
> ...
>
> The driver MUST only use 32 bit wide and aligned reads and writes to
> access the control registers described in table 4.1.
> For the device-specific configuration space, the driver MUST use
> 8 bit wide accesses for 8 bit wide fields, 16 bit wide and aligned
> accesses for 16 bit wide fields and 32 bit wide and aligned accesses 
> for
> 32 and 64 bit wide fields.
>
> Borrow code from virtio_pci_modern to do this correctly.
>
> Signed-off-by: Michael S. Tsirkin 
>
> ---
>
> This is exactly the same as PATCH virtio_mmio: fix endian-ness for mmio
> but with corrected subject and commit log, to make
> it easier to apply.
>
> Note: untested: QEMU doesn't support virtio 1.0 for virtio-mmio,
> but seems pretty obvious. Let's apply ASAP so we don't ship
> incompliant drivers for virtio 1.0?
>
>  drivers/virtio/virtio_mmio.c | 79 
> +++-
>  1 file changed, 71 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
> index cad5698..0375456 100644
> --- a/drivers/virtio/virtio_mmio.c
> +++ b/drivers/virtio/virtio_mmio.c
> @@ -156,22 +156,85 @@ static void vm_get(struct virtio_device *vdev, unsigned 
> offset,
>  void *buf, unsigned len)
>  {
>   struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> - u8 *ptr = buf;
> - int i;
> + void __iomem *base = vm_dev->base + VIRTIO_MMIO_CONFIG;
> + u8 b;
> + __le16 w;
> + __le32 l;
>  
> - for (i = 0; i < len; i++)
> - ptr[i] = readb(vm_dev->base + VIRTIO_MMIO_CONFIG + offset + i);
> + if (vm_dev->version == 1) {
> + u8 *ptr = buf;
> + int i;
> +
> + for (i = 0; i < len; i++)
> + ptr[i] = readb(base + offset + i);
> + return;
> + }
> +
> + switch (len) {
> + case 1:
> + b = readb(base + offset);
> + memcpy(buf, &b, sizeof b);
> + break;
> + case 2:
> + w = cpu_to_le16(readw(base + offset));
> + memcpy(buf, &w, sizeof w);
> + break;
> + case 4:
> + l = cpu_to_le32(readl(base + offset));
> + memcpy(buf, &l, sizeof l);
> + break;
> + case 8:
> + l = cpu_to_le32(readl(base + offset));
> + memcpy(buf, &l, sizeof l);
> + l = cpu_to_le32(ioread32(base + offset + sizeof l));
> + memcpy(buf + sizeof l, &l, sizeof l);
> + break;
> + default:
> + BUG();
> + }
>  }
>  
>  static void vm_set(struct virtio_device *vdev, unsigned offset,
>  const void *buf, unsigned len)
>  {
>   struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> - const u8 *ptr = buf;
> - int i;
> + void __iomem *base = vm_dev->base + VIRTIO_MMIO_CONFIG;
> + u8 b;
> + __le16 w;
> + __le32 l;
>  
> - for (i = 0; i < len; i++)
> - writeb(ptr[i], vm_dev->base + VIRTIO_MMIO_CONFIG + offset + i);
> + if (vm_dev->version == 1) {
> + const u8 *ptr = buf;
> + int i;
> +
> + for (i = 0; i < len; i++)
> + writeb(ptr[i], base + offset + i);
> +
> + return;
> + }
> +
> + switch (len) {
> + case 1:
> + memcpy(&b, buf, sizeof b);
> + writeb(b, base + offset);
> + break;
> + case 2:
> + memcpy(&w, buf, sizeof w);
> + writew(le16_to_cpu(w), base + offset);
> + break;
> + case 4:
> + memcpy(&l, buf, sizeof l);
> + writel(le32_to_cpu(l), base + offset);
> + break;
> + case 8:
> + memcpy(&l, buf, sizeof l);
> + writel(le32_to_cpu(l), base + offset);
> + memcpy(&l, buf + sizeof l, sizeof l);
> + writel(le32_to_cpu(l), base + offset + sizeof l);
> + break;
> + default:
> + BUG();
> + }
>  }
>  
>  static u8 vm_get_status(struct virtio_device *vdev)
> -- 
> MST
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] 9p/trans_virtio: fix hot-unplug

2015-03-12 Thread Rusty Russell
"Michael S. Tsirkin"  writes:
> On Thu, Mar 12, 2015 at 11:54:10AM +1030, Rusty Russell wrote:
>> "Michael S. Tsirkin"  writes:
>> > On device hot-unplug, 9p/virtio currently will kfree channel while
>> > it might still be in use.
>> >
>> > Of course, it might stay used forever, so it's an extremely ugly hack,
>> > but it seems better than use-after-free that we have now.
>> >
>> > Signed-off-by: Michael S. Tsirkin 
>> 
>> I'll apply it, but it looks like a bandaid.
>
>
> Absolutely.

Applied this, too:


Signed-off-by: Rusty Russell 

diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
index 73fab71397ce..781315d97ed5 100644
--- a/net/9p/trans_virtio.c
+++ b/net/9p/trans_virtio.c
@@ -659,7 +659,6 @@ static void p9_virtio_remove(struct virtio_device *vdev)
 {
struct virtio_chan *chan = vdev->priv;
unsigned long warning_time;
-   bool inuse;
 
mutex_lock(&virtio_9p_lock);
 
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio_mmio: fix endian-ness for mmio

2015-03-12 Thread Rusty Russell
"Michael S. Tsirkin"  writes:
> On Thu, Mar 12, 2015 at 12:33:36PM +1030, Rusty Russell wrote:
>> "Michael S. Tsirkin"  writes:
>> > Going over the virtio mmio code, I noticed that it doesn't correctly
>> > return device config values in LE format when using virtio 1.0.
>> > Borrow code from virtio_pci_modern to do this correctly.
>> 
>> AFAICT, it doesn't need to.  The endian correction is done by the
>> callers.
>> 
>> The only reason that virtio_pci_modern() does it is because readl() etc
>> do endian conversion, and we don't want them to.  And the PCI part of
>> the spec says to use "natural" accessors, so we don't do byte-at-a-time.
>> 
>> Cheers,
>> Rusty.
>
> You are right, the endina-ness is not an issue, so the commit log I
> wrote is wrong, but I still think the patch is required, because MMIO
> spec says the same as PCI.
>   The driver MUST only use 32 bit wide and aligned reads and writes to
>   access the control registers described
>   in table 4.1. For the device-specific configuration space, the driver
>   MUST use 8 bit wide accesses for 8 bit
>   wide fields, 16 bit wide and aligned accesses for 16 bit wide fields and
>   32 bit wide and aligned accesses for
>   32 and 64 bit wide fields.

Ah, good point.

I'm sure Pawel is on a beach somewhere sipping cocktails, so I'll apply
this immediately (with your updated commit message) to my
pending-rebases and give him the weekend to Ack.

Thanks,
Rusty.


>
>
> Here's a better commit log:
>
> --->
>
> virtio_mmio: fix access width for mmio
>
> Going over the virtio mmio code, I noticed that it doesn't correctly
> access modern device config values using "natural" accessors: it uses
> readb to get/set them byte by byte, while the virtio 1.0 spec explicitly 
> states:
>
>   4.2.2.2 Driver Requirements: MMIO Device Register Layout
>
>   ...
>
>   The driver MUST only use 32 bit wide and aligned reads and writes to
>   access the control registers described in table 4.1.
>   For the device-specific configuration space, the driver MUST use
>   8 bit wide accesses for 8 bit wide fields, 16 bit wide and aligned
>   accesses for 16 bit wide fields and 32 bit wide and aligned accesses for
>   32 and 64 bit wide fields.
>
> Borrow code from virtio_pci_modern to do this correctly.
>
> Signed-off-by: Michael S. Tsirkin 
>
> Makes sense now, right?
> Want me to repost or can you just tweak the commit log?
>
>
>> > Signed-off-by: Michael S. Tsirkin 
>> > ---
>> >
>> > Note: untested: QEMU doesn't support virtio 1.0 for virtio-mmio.
>> > Pawel, could you please confirm this patch makes sense?
>> >
>> >  drivers/virtio/virtio_mmio.c | 79 
>> > +++-
>> >  1 file changed, 71 insertions(+), 8 deletions(-)
>> >
>> > diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
>> > index cad5698..0375456 100644
>> > --- a/drivers/virtio/virtio_mmio.c
>> > +++ b/drivers/virtio/virtio_mmio.c
>> > @@ -156,22 +156,85 @@ static void vm_get(struct virtio_device *vdev, 
>> > unsigned offset,
>> >   void *buf, unsigned len)
>> >  {
>> >struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
>> > -  u8 *ptr = buf;
>> > -  int i;
>> > +  void __iomem *base = vm_dev->base + VIRTIO_MMIO_CONFIG;
>> > +  u8 b;
>> > +  __le16 w;
>> > +  __le32 l;
>> >  
>> > -  for (i = 0; i < len; i++)
>> > -  ptr[i] = readb(vm_dev->base + VIRTIO_MMIO_CONFIG + offset + i);
>> > +  if (vm_dev->version == 1) {
>> > +  u8 *ptr = buf;
>> > +  int i;
>> > +
>> > +  for (i = 0; i < len; i++)
>> > +  ptr[i] = readb(base + offset + i);
>> > +  return;
>> > +  }
>> > +
>> > +  switch (len) {
>> > +  case 1:
>> > +  b = readb(base + offset);
>> > +  memcpy(buf, &b, sizeof b);
>> > +  break;
>> > +  case 2:
>> > +  w = cpu_to_le16(readw(base + offset));
>> > +  memcpy(buf, &w, sizeof w);
>> > +  break;
>> > +  case 4:
>> > +  l = cpu_to_le32(readl(base + offset));
>> > +  memcpy(buf, &l, sizeof l);
>> > +  break;
>> > +  case 8:
>> > +  l = cpu_to_le32(readl(base + offset));
>> > +  memcpy(buf, &l, sizeof l);
>> > +  l = cpu_to_le32(ioread32(base + offset + sizeof l));
>> > +  memcpy(buf + sizeof l, &l, sizeof l);
>> > +  break;
>> > +  default:
>> > +  BUG();
>> > +  }
>> >  }
>> >  
>> >  static void vm_set(struct virtio_device *vdev, unsigned offset,
>> >   const void *buf, unsigned len)
>> >  {
>> >struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
>> > -  const u8 *ptr = buf;
>> > -  int i;
>> > +  void __iomem *base = vm_dev->base + VIRTIO_MMIO_CONFIG;
>> > +  u8 b;
>> > +  __le16 w;
>> > +  __le32 l;
>> >  
>> > -  for (i = 0; i < len; i++)
>> > -  writeb(ptr[i], vm_dev->base + VIRTIO_MMIO_CONFIG + offset + i);
>> > +  if (vm_dev->version == 1) {
>> > +  

Re: [PATCH] virtio: Remove virtio device during shutdown

2015-03-12 Thread Fam Zheng
On Thu, 03/12 17:22, Michael S. Tsirkin wrote:
> On Wed, Mar 11, 2015 at 06:11:35PM +0800, Fam Zheng wrote:
> > On Wed, 03/11 10:06, Michael S. Tsirkin wrote:
> > > On Wed, Mar 11, 2015 at 04:09:17PM +0800, Fam Zheng wrote:
> > > > Currently shutdown is nop for virtio devices, but the core code could
> > > > remove things behind us such as MSI-X handler etc. For example in the
> > > > case of virtio-scsi-pci, the device may still try to send interupts,
> > > > which will be on IRQ lines seeing MSI-X disabled. Those interrupts will
> > > > be unhandled, and may cause flood.
> > 
> > Here is the problem I want to solve - file system driver hang:
> > 
> > If a fs code happen to hit __wait_on_buffer right after pci 
> > pci_device_shutdown
> > disabled msix, it will never make progress because the requests it waits for
> > will never be completed. So the system hangs.
> 
> Paolo says that pci reset of virtio scsi device guarantees
> that all outstanding requests complete.
> 
> If true and implemented correctly, I don't see what else
> needs to be done.
> 
> You will need to debug this some more.

First of all I was wrong about the fs driver above, scratch that, I'm sorry for
the misleading.

Regarding the hang in shutdown, Ulrich Obergfell has already pointed out that
the vcpu is "busy/stuck in interrupt processing":

https://bugzilla.redhat.com/attachment.cgi?id=998391 (RHBZ 1199155)

Summary: The reason it is stuck is that an IRQ from virtio-scsi-pci is not
handled. Why is there that IRQ? Because pci core code disabled msix. Why is it
not handled?  Because it's done behind virtio-scsi, who still is waiting for
msix.

"Hence, the interrupt will not be acknowledged and the guest becomes flooded
with IRQ 11 interrupt."

Fortunately it's not a livelock for upstream, because of:

commit 184564efae4d775225c8fe3b762a56956fb1f827
Author: Zhang Haoyu 
Date:   Thu Sep 11 16:47:04 2014 +0800

kvm: ioapic: conditionally delay irq delivery duringeoi broadcast

But we still should do the shutdown right.

I also propose to not shutdown msix from pci core shutdown if the device
doesn't have shutdown function:

http://www.spinics.net/lists/kernel/msg1944041.html

With that patch is applied, the "nop" .shutdown in virtio-pci shouldn't hurt
much.

Regarding handing the requests, now I don't know if we really care about them
at shutdown. As you said, waiting for requests may cause more hang.

Ideas?

Fam
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC PATCH] PCI: Disable MSI/MSI-X only if device is shutdown

2015-03-12 Thread Fam Zheng
On Thu, 03/12 17:21, Paolo Bonzini wrote:
> On 12/03/2015 06:21, Fam Zheng wrote:
> > If the device doesn't support shutdown, disabling interrupts may cause
> > trouble. For example, virtio-scsi-pci doesn't implement shutdown, and
> > after we disable MSI-X, futher notifications from device will be
> > delivered to IRQ, which is unexpected. This IRQ will not be cleared, and
> > may prevent us from making progress, by keep triggering interrupts.
> > 
> > Signed-off-by: Fam Zheng 
> > ---
> >  drivers/pci/pci-driver.c | 7 ---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> > index 3cb2210..fb29c96 100644
> > --- a/drivers/pci/pci-driver.c
> > +++ b/drivers/pci/pci-driver.c
> > @@ -448,10 +448,11 @@ static void pci_device_shutdown(struct device *dev)
> >  
> > pm_runtime_resume(dev);
> >  
> > -   if (drv && drv->shutdown)
> > +   if (drv && drv->shutdown) {
> > drv->shutdown(pci_dev);
> > -   pci_msi_shutdown(pci_dev);
> > -   pci_msix_shutdown(pci_dev);
> > +   pci_msi_shutdown(pci_dev);
> > +   pci_msix_shutdown(pci_dev);
> > +   }
> >  
> >  #ifdef CONFIG_KEXEC
> > /*
> > 
> 
> The patch may be okay, but I think the bug here is also that virtio-pci
> is not defining a .shutdown callback.  It should define one and call
> free_irq (for INTX) and pci_disable_msix.

It's not enough. The device has to know we disabled msix, otherwise it will
send us IRQ, which is wrong.

> 
> How is this related to the virtio-scsi patch that you posted?  Do you
> need both to fix the problem you reported?
> 

This one should be enough. I was wrong in saying virtio-scsi hanging the
shutdown is because requests not being completed, it's more because of the
unexpected IRQ as explained in [1].

[1]: https://bugzilla.redhat.com/attachment.cgi?id=998391

Fam
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH net] virtio-net: correctly delete napi hash

2015-03-12 Thread David Miller
From: "Michael S. Tsirkin" 
Date: Thu, 12 Mar 2015 08:25:34 +0100

> On Thu, Mar 12, 2015 at 01:57:44PM +0800, Jason Wang wrote:
>> We don't delete napi from hash list during module exit. This will
>> cause the following panic when doing module load and unload:
 ...
>> This patch fixes this by doing this in virtnet_free_queues(). And also
>> don't delete napi in virtnet_freeze() since it will call
>> virtnet_free_queues() which has already did this.
>> 
>> Fixes 91815639d880 ("virtio-net: rx busy polling support")
>> Cc: Rusty Russell 
>> Cc: Michael S. Tsirkin 
>> Signed-off-by: Jason Wang 
 ...
> Dave, can you pick this up pls?
> Looks like a stable candidate too.

Applied and queued up for -stable, thanks.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio: Remove virtio device during shutdown

2015-03-12 Thread Paolo Bonzini


On 12/03/2015 17:22, Michael S. Tsirkin wrote:
> On Wed, Mar 11, 2015 at 06:11:35PM +0800, Fam Zheng wrote:
>> On Wed, 03/11 10:06, Michael S. Tsirkin wrote:
>>> On Wed, Mar 11, 2015 at 04:09:17PM +0800, Fam Zheng wrote:
 Currently shutdown is nop for virtio devices, but the core code could
 remove things behind us such as MSI-X handler etc. For example in the
 case of virtio-scsi-pci, the device may still try to send interupts,
 which will be on IRQ lines seeing MSI-X disabled. Those interrupts will
 be unhandled, and may cause flood.
>>
>> Here is the problem I want to solve - file system driver hang:
>>
>> If a fs code happen to hit __wait_on_buffer right after pci 
>> pci_device_shutdown
>> disabled msix, it will never make progress because the requests it waits for
>> will never be completed. So the system hangs.
> 
> Paolo says that pci reset of virtio scsi device guarantees
> that all outstanding requests complete.

For what it's worth, see here:

static void virtio_scsi_reset(VirtIODevice *vdev)
{
...
s->resetting++;
qbus_reset_all(&s->bus.qbus);
s->resetting--;
...
}

static void scsi_disk_reset(DeviceState *dev)
{
SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev.qdev, dev);
uint64_t nb_sectors;

scsi_device_purge_requests(&s->qdev, SENSE_CODE(RESET));
...
}

Paolo

> If true and implemented correctly, I don't see what else
> needs to be done.
> 
> You will need to debug this some more.
> 
> 
>> In other words we will want to reset virtio device before pci_device_shutdown
>> AND wake up all waiters.
>>
>> Unfortunately, neither your patch nor mine does that, because virtio bus can 
>> be
>> shutdown after pci bus (thanks to Jason for pointing out this). In that case,
>> any completion after disabling msix is lost.
>>
>> Maybe we need both the pci shutdown handler to reset the device and the 
>> virtio
>> shutdown handler to remove the device?
>>
>> Fam
>>
>>>
>>> This sounds very tentative. Do you, in fact, observe some problems
>>> with virtio scsi? How to reproduce them? this needs to go
>>> into the commit messages.
>>
>> OK, my bad.
>>
>>>
 Remove the device in "shutdown" callback to allow device drivers clean
 up things.

 Signed-off-by: Fam Zheng 
>>>
>>> I'm concerned this will cause more hangs on shutdown: one
>>> of the reasons for reboot is device mal-functioning.
>>> How about we just reset devices instead? Something like
>>> the below (untested).
>>>
>>> Signed-off-by: Michael S. Tsirkin 
>>>
>>> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
>>> index 5ce2aa4..0769941 100644
>>> --- a/drivers/virtio/virtio.c
>>> +++ b/drivers/virtio/virtio.c
>>> @@ -269,6 +269,17 @@ static int virtio_dev_remove(struct device *_d)
>>> return 0;
>>>  }
>>>  
>>> +static void virtio_dev_shutdown(struct device *_d)
>>> +{
>>> +   struct virtio_device *dev = dev_to_virtio(_d);
>>> +   /*
>>> +* Reset the device to make it stop sending interrupts, DMA, etc.
>>> +* We are shutting down, no need for full cleanup.
>>> +*/
>>> +   dev->config->reset(dev);
>>> +
>>> +}
>>> +
>>>  static struct bus_type virtio_bus = {
>>> .name  = "virtio",
>>> .match = virtio_dev_match,
>>> @@ -276,6 +288,7 @@ static struct bus_type virtio_bus = {
>>> .uevent = virtio_uevent,
>>> .probe = virtio_dev_probe,
>>> .remove = virtio_dev_remove,
>>> +   .shutdown = virtio_dev_shutdown,
>>>  };
>>>  
>>>  bool virtio_device_is_legacy_only(struct virtio_device_id id)
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio-pci: fix host notifiers on bi-endian architectures

2015-03-12 Thread Paolo Bonzini


On 12/03/2015 08:08, Michael S. Tsirkin wrote:
> But common header format is simple, it's always LE.
> It does not depend on target.
> To me this looks like a bug in memory_region_add_eventfd,
> it should do the right thing depending on device
> endian-ness.

I agree it seems to be a QEMU bug.

Paolo
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio: Remove virtio device during shutdown

2015-03-12 Thread Michael S. Tsirkin
On Wed, Mar 11, 2015 at 06:11:35PM +0800, Fam Zheng wrote:
> On Wed, 03/11 10:06, Michael S. Tsirkin wrote:
> > On Wed, Mar 11, 2015 at 04:09:17PM +0800, Fam Zheng wrote:
> > > Currently shutdown is nop for virtio devices, but the core code could
> > > remove things behind us such as MSI-X handler etc. For example in the
> > > case of virtio-scsi-pci, the device may still try to send interupts,
> > > which will be on IRQ lines seeing MSI-X disabled. Those interrupts will
> > > be unhandled, and may cause flood.
> 
> Here is the problem I want to solve - file system driver hang:
> 
> If a fs code happen to hit __wait_on_buffer right after pci 
> pci_device_shutdown
> disabled msix, it will never make progress because the requests it waits for
> will never be completed. So the system hangs.

Paolo says that pci reset of virtio scsi device guarantees
that all outstanding requests complete.

If true and implemented correctly, I don't see what else
needs to be done.

You will need to debug this some more.


> In other words we will want to reset virtio device before pci_device_shutdown
> AND wake up all waiters.
> 
> Unfortunately, neither your patch nor mine does that, because virtio bus can 
> be
> shutdown after pci bus (thanks to Jason for pointing out this). In that case,
> any completion after disabling msix is lost.
> 
> Maybe we need both the pci shutdown handler to reset the device and the virtio
> shutdown handler to remove the device?
> 
> Fam
> 
> > 
> > This sounds very tentative. Do you, in fact, observe some problems
> > with virtio scsi? How to reproduce them? this needs to go
> > into the commit messages.
> 
> OK, my bad.
> 
> > 
> > > Remove the device in "shutdown" callback to allow device drivers clean
> > > up things.
> > > 
> > > Signed-off-by: Fam Zheng 
> > 
> > I'm concerned this will cause more hangs on shutdown: one
> > of the reasons for reboot is device mal-functioning.
> > How about we just reset devices instead? Something like
> > the below (untested).
> > 
> > Signed-off-by: Michael S. Tsirkin 
> > 
> > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > index 5ce2aa4..0769941 100644
> > --- a/drivers/virtio/virtio.c
> > +++ b/drivers/virtio/virtio.c
> > @@ -269,6 +269,17 @@ static int virtio_dev_remove(struct device *_d)
> > return 0;
> >  }
> >  
> > +static void virtio_dev_shutdown(struct device *_d)
> > +{
> > +   struct virtio_device *dev = dev_to_virtio(_d);
> > +   /*
> > +* Reset the device to make it stop sending interrupts, DMA, etc.
> > +* We are shutting down, no need for full cleanup.
> > +*/
> > +   dev->config->reset(dev);
> > +
> > +}
> > +
> >  static struct bus_type virtio_bus = {
> > .name  = "virtio",
> > .match = virtio_dev_match,
> > @@ -276,6 +288,7 @@ static struct bus_type virtio_bus = {
> > .uevent = virtio_uevent,
> > .probe = virtio_dev_probe,
> > .remove = virtio_dev_remove,
> > +   .shutdown = virtio_dev_shutdown,
> >  };
> >  
> >  bool virtio_device_is_legacy_only(struct virtio_device_id id)
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC PATCH] PCI: Disable MSI/MSI-X only if device is shutdown

2015-03-12 Thread Paolo Bonzini


On 12/03/2015 06:21, Fam Zheng wrote:
> If the device doesn't support shutdown, disabling interrupts may cause
> trouble. For example, virtio-scsi-pci doesn't implement shutdown, and
> after we disable MSI-X, futher notifications from device will be
> delivered to IRQ, which is unexpected. This IRQ will not be cleared, and
> may prevent us from making progress, by keep triggering interrupts.
> 
> Signed-off-by: Fam Zheng 
> ---
>  drivers/pci/pci-driver.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 3cb2210..fb29c96 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -448,10 +448,11 @@ static void pci_device_shutdown(struct device *dev)
>  
>   pm_runtime_resume(dev);
>  
> - if (drv && drv->shutdown)
> + if (drv && drv->shutdown) {
>   drv->shutdown(pci_dev);
> - pci_msi_shutdown(pci_dev);
> - pci_msix_shutdown(pci_dev);
> + pci_msi_shutdown(pci_dev);
> + pci_msix_shutdown(pci_dev);
> + }
>  
>  #ifdef CONFIG_KEXEC
>   /*
> 

The patch may be okay, but I think the bug here is also that virtio-pci
is not defining a .shutdown callback.  It should define one and call
free_irq (for INTX) and pci_disable_msix.

How is this related to the virtio-scsi patch that you posted?  Do you
need both to fix the problem you reported?

Paolo
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] uapi/virtio_scsi: allow overriding CDB/SENSE size

2015-03-12 Thread Michael S. Tsirkin
On Wed, Mar 11, 2015 at 02:19:03PM +0100, Michael S. Tsirkin wrote:
> QEMU wants to use virtio scsi structures with
> a different VIRTIO_SCSI_CDB_SIZE/VIRTIO_SCSI_SENSE_SIZE,
> let's add ifdefs to allow overriding them.
> 
> Keep the old defines under new names:
> VIRTIO_SCSI_CDB_DEFAULT_SIZE/VIRTIO_SCSI_SENSE_DEFAULT_SIZE,
> since that's what these values really are:
> defaults for cdb/sense size fields.
> 
> Suggested-by: Paolo Bonzini 
> Signed-off-by: Michael S. Tsirkin 
> ---

Rusty, pls note, if possible I'd like this one
merged for 4.0 because it's important for QEMU
(which now imports linux headers).


>  include/uapi/linux/virtio_scsi.h | 12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/include/uapi/linux/virtio_scsi.h 
> b/include/uapi/linux/virtio_scsi.h
> index 42b9370..cc18ef8 100644
> --- a/include/uapi/linux/virtio_scsi.h
> +++ b/include/uapi/linux/virtio_scsi.h
> @@ -29,8 +29,16 @@
>  
>  #include 
>  
> -#define VIRTIO_SCSI_CDB_SIZE   32
> -#define VIRTIO_SCSI_SENSE_SIZE 96
> +/* Default values of the CDB and sense data size configuration fields */
> +#define VIRTIO_SCSI_CDB_DEFAULT_SIZE   32
> +#define VIRTIO_SCSI_SENSE_DEFAULT_SIZE 96
> +
> +#ifndef VIRTIO_SCSI_CDB_SIZE
> +#define VIRTIO_SCSI_CDB_SIZE VIRTIO_SCSI_CDB_DEFAULT_SIZE
> +#endif
> +#ifndef VIRTIO_SCSI_SENSE_SIZE
> +#define VIRTIO_SCSI_SENSE_SIZE VIRTIO_SCSI_SENSE_DEFAULT_SIZE
> +#endif
>  
>  /* SCSI command request, followed by data-out */
>  struct virtio_scsi_cmd_req {
> -- 
> MST
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v2 log fixed] virtio_mmio: fix endian-ness for mmio

2015-03-12 Thread Michael S. Tsirkin
Subject: [PATCH] virtio_mmio: fix access width for mmio

Going over the virtio mmio code, I noticed that it doesn't correctly
access modern device config values using "natural" accessors: it uses
readb to get/set them byte by byte, while the virtio 1.0 spec explicitly states:

4.2.2.2 Driver Requirements: MMIO Device Register Layout

...

The driver MUST only use 32 bit wide and aligned reads and writes to
access the control registers described in table 4.1.
For the device-specific configuration space, the driver MUST use
8 bit wide accesses for 8 bit wide fields, 16 bit wide and aligned
accesses for 16 bit wide fields and 32 bit wide and aligned accesses for
32 and 64 bit wide fields.

Borrow code from virtio_pci_modern to do this correctly.

Signed-off-by: Michael S. Tsirkin 

---

This is exactly the same as PATCH virtio_mmio: fix endian-ness for mmio
but with corrected subject and commit log, to make
it easier to apply.

Note: untested: QEMU doesn't support virtio 1.0 for virtio-mmio,
but seems pretty obvious. Let's apply ASAP so we don't ship
incompliant drivers for virtio 1.0?

 drivers/virtio/virtio_mmio.c | 79 +++-
 1 file changed, 71 insertions(+), 8 deletions(-)

diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
index cad5698..0375456 100644
--- a/drivers/virtio/virtio_mmio.c
+++ b/drivers/virtio/virtio_mmio.c
@@ -156,22 +156,85 @@ static void vm_get(struct virtio_device *vdev, unsigned 
offset,
   void *buf, unsigned len)
 {
struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
-   u8 *ptr = buf;
-   int i;
+   void __iomem *base = vm_dev->base + VIRTIO_MMIO_CONFIG;
+   u8 b;
+   __le16 w;
+   __le32 l;
 
-   for (i = 0; i < len; i++)
-   ptr[i] = readb(vm_dev->base + VIRTIO_MMIO_CONFIG + offset + i);
+   if (vm_dev->version == 1) {
+   u8 *ptr = buf;
+   int i;
+
+   for (i = 0; i < len; i++)
+   ptr[i] = readb(base + offset + i);
+   return;
+   }
+
+   switch (len) {
+   case 1:
+   b = readb(base + offset);
+   memcpy(buf, &b, sizeof b);
+   break;
+   case 2:
+   w = cpu_to_le16(readw(base + offset));
+   memcpy(buf, &w, sizeof w);
+   break;
+   case 4:
+   l = cpu_to_le32(readl(base + offset));
+   memcpy(buf, &l, sizeof l);
+   break;
+   case 8:
+   l = cpu_to_le32(readl(base + offset));
+   memcpy(buf, &l, sizeof l);
+   l = cpu_to_le32(ioread32(base + offset + sizeof l));
+   memcpy(buf + sizeof l, &l, sizeof l);
+   break;
+   default:
+   BUG();
+   }
 }
 
 static void vm_set(struct virtio_device *vdev, unsigned offset,
   const void *buf, unsigned len)
 {
struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
-   const u8 *ptr = buf;
-   int i;
+   void __iomem *base = vm_dev->base + VIRTIO_MMIO_CONFIG;
+   u8 b;
+   __le16 w;
+   __le32 l;
 
-   for (i = 0; i < len; i++)
-   writeb(ptr[i], vm_dev->base + VIRTIO_MMIO_CONFIG + offset + i);
+   if (vm_dev->version == 1) {
+   const u8 *ptr = buf;
+   int i;
+
+   for (i = 0; i < len; i++)
+   writeb(ptr[i], base + offset + i);
+
+   return;
+   }
+
+   switch (len) {
+   case 1:
+   memcpy(&b, buf, sizeof b);
+   writeb(b, base + offset);
+   break;
+   case 2:
+   memcpy(&w, buf, sizeof w);
+   writew(le16_to_cpu(w), base + offset);
+   break;
+   case 4:
+   memcpy(&l, buf, sizeof l);
+   writel(le32_to_cpu(l), base + offset);
+   break;
+   case 8:
+   memcpy(&l, buf, sizeof l);
+   writel(le32_to_cpu(l), base + offset);
+   memcpy(&l, buf + sizeof l, sizeof l);
+   writel(le32_to_cpu(l), base + offset + sizeof l);
+   break;
+   default:
+   BUG();
+   }
 }
 
 static u8 vm_get_status(struct virtio_device *vdev)
-- 
MST
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH net] virtio-net: correctly delete napi hash

2015-03-12 Thread Michael S. Tsirkin
On Thu, Mar 12, 2015 at 01:57:44PM +0800, Jason Wang wrote:
> We don't delete napi from hash list during module exit. This will
> cause the following panic when doing module load and unload:
> 
> BUG: unable to handle kernel paging request at 004e0075
> IP: [] napi_hash_add+0x6b/0xf0
> PGD 3c5d5067 PUD 0
> Oops:  [#1] SMP
> ...
> Call Trace:
> [] init_vqs+0x107/0x490 [virtio_net]
> [] virtnet_probe+0x562/0x791815639d880be [virtio_net]
> [] virtio_dev_probe+0x137/0x200
> [] driver_probe_device+0x7a/0x250
> [] __driver_attach+0x93/0xa0
> [] ? __device_attach+0x40/0x40
> [] bus_for_each_dev+0x63/0xa0
> [] driver_attach+0x19/0x20
> [] bus_add_driver+0x170/0x220
> [] ? 0xa0a6
> [] driver_register+0x5f/0xf0
> [] register_virtio_driver+0x1b/0x30
> [] virtio_net_driver_init+0x10/0x12 [virtio_net]
> 
> This patch fixes this by doing this in virtnet_free_queues(). And also
> don't delete napi in virtnet_freeze() since it will call
> virtnet_free_queues() which has already did this.
> 
> Fixes 91815639d880 ("virtio-net: rx busy polling support")
> Cc: Rusty Russell 
> Cc: Michael S. Tsirkin 
> Signed-off-by: Jason Wang 

Good catch!

Acked-by: Michael S. Tsirkin 

or should it be

Reviewed-by: Michael S. Tsirkin 

I'm not sure which one is stronger :)

Dave, can you pick this up pls?
Looks like a stable candidate too.

> ---
>  drivers/net/virtio_net.c | 9 -
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index f1ff366..59b0e97 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1448,8 +1448,10 @@ static void virtnet_free_queues(struct virtnet_info 
> *vi)
>  {
>   int i;
>  
> - for (i = 0; i < vi->max_queue_pairs; i++)
> + for (i = 0; i < vi->max_queue_pairs; i++) {
> + napi_hash_del(&vi->rq[i].napi);
>   netif_napi_del(&vi->rq[i].napi);
> + }
>  
>   kfree(vi->rq);
>   kfree(vi->sq);
> @@ -1948,11 +1950,8 @@ static int virtnet_freeze(struct virtio_device *vdev)
>   cancel_delayed_work_sync(&vi->refill);
>  
>   if (netif_running(vi->dev)) {
> - for (i = 0; i < vi->max_queue_pairs; i++) {
> + for (i = 0; i < vi->max_queue_pairs; i++)
>   napi_disable(&vi->rq[i].napi);
> - napi_hash_del(&vi->rq[i].napi);
> - netif_napi_del(&vi->rq[i].napi);
> - }
>   }
>  
>   remove_vq_common(vi);
> -- 
> 2.1.0
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio_mmio: fix endian-ness for mmio

2015-03-12 Thread Michael S. Tsirkin
On Thu, Mar 12, 2015 at 12:33:36PM +1030, Rusty Russell wrote:
> "Michael S. Tsirkin"  writes:
> > Going over the virtio mmio code, I noticed that it doesn't correctly
> > return device config values in LE format when using virtio 1.0.
> > Borrow code from virtio_pci_modern to do this correctly.
> 
> AFAICT, it doesn't need to.  The endian correction is done by the
> callers.
> 
> The only reason that virtio_pci_modern() does it is because readl() etc
> do endian conversion, and we don't want them to.  And the PCI part of
> the spec says to use "natural" accessors, so we don't do byte-at-a-time.
> 
> Cheers,
> Rusty.

You are right, the endina-ness is not an issue, so the commit log I
wrote is wrong, but I still think the patch is required, because MMIO
spec says the same as PCI.
The driver MUST only use 32 bit wide and aligned reads and writes to
access the control registers described
in table 4.1. For the device-specific configuration space, the driver
MUST use 8 bit wide accesses for 8 bit
wide fields, 16 bit wide and aligned accesses for 16 bit wide fields and
32 bit wide and aligned accesses for
32 and 64 bit wide fields.


Here's a better commit log:

--->

virtio_mmio: fix access width for mmio

Going over the virtio mmio code, I noticed that it doesn't correctly
access modern device config values using "natural" accessors: it uses
readb to get/set them byte by byte, while the virtio 1.0 spec explicitly states:

4.2.2.2 Driver Requirements: MMIO Device Register Layout

...

The driver MUST only use 32 bit wide and aligned reads and writes to
access the control registers described in table 4.1.
For the device-specific configuration space, the driver MUST use
8 bit wide accesses for 8 bit wide fields, 16 bit wide and aligned
accesses for 16 bit wide fields and 32 bit wide and aligned accesses for
32 and 64 bit wide fields.

Borrow code from virtio_pci_modern to do this correctly.

Signed-off-by: Michael S. Tsirkin 

Makes sense now, right?
Want me to repost or can you just tweak the commit log?


> > Signed-off-by: Michael S. Tsirkin 
> > ---
> >
> > Note: untested: QEMU doesn't support virtio 1.0 for virtio-mmio.
> > Pawel, could you please confirm this patch makes sense?
> >
> >  drivers/virtio/virtio_mmio.c | 79 
> > +++-
> >  1 file changed, 71 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
> > index cad5698..0375456 100644
> > --- a/drivers/virtio/virtio_mmio.c
> > +++ b/drivers/virtio/virtio_mmio.c
> > @@ -156,22 +156,85 @@ static void vm_get(struct virtio_device *vdev, 
> > unsigned offset,
> >void *buf, unsigned len)
> >  {
> > struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> > -   u8 *ptr = buf;
> > -   int i;
> > +   void __iomem *base = vm_dev->base + VIRTIO_MMIO_CONFIG;
> > +   u8 b;
> > +   __le16 w;
> > +   __le32 l;
> >  
> > -   for (i = 0; i < len; i++)
> > -   ptr[i] = readb(vm_dev->base + VIRTIO_MMIO_CONFIG + offset + i);
> > +   if (vm_dev->version == 1) {
> > +   u8 *ptr = buf;
> > +   int i;
> > +
> > +   for (i = 0; i < len; i++)
> > +   ptr[i] = readb(base + offset + i);
> > +   return;
> > +   }
> > +
> > +   switch (len) {
> > +   case 1:
> > +   b = readb(base + offset);
> > +   memcpy(buf, &b, sizeof b);
> > +   break;
> > +   case 2:
> > +   w = cpu_to_le16(readw(base + offset));
> > +   memcpy(buf, &w, sizeof w);
> > +   break;
> > +   case 4:
> > +   l = cpu_to_le32(readl(base + offset));
> > +   memcpy(buf, &l, sizeof l);
> > +   break;
> > +   case 8:
> > +   l = cpu_to_le32(readl(base + offset));
> > +   memcpy(buf, &l, sizeof l);
> > +   l = cpu_to_le32(ioread32(base + offset + sizeof l));
> > +   memcpy(buf + sizeof l, &l, sizeof l);
> > +   break;
> > +   default:
> > +   BUG();
> > +   }
> >  }
> >  
> >  static void vm_set(struct virtio_device *vdev, unsigned offset,
> >const void *buf, unsigned len)
> >  {
> > struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> > -   const u8 *ptr = buf;
> > -   int i;
> > +   void __iomem *base = vm_dev->base + VIRTIO_MMIO_CONFIG;
> > +   u8 b;
> > +   __le16 w;
> > +   __le32 l;
> >  
> > -   for (i = 0; i < len; i++)
> > -   writeb(ptr[i], vm_dev->base + VIRTIO_MMIO_CONFIG + offset + i);
> > +   if (vm_dev->version == 1) {
> > +   const u8 *ptr = buf;
> > +   int i;
> > +
> > +   for (i = 0; i < len; i++)
> > +   writeb(ptr[i], base + offset + i);
> > +
> > +   return;
> > +   }
> > +
> > +   switch (len) {
> > +   case 1:
> > +   memcpy(&b, buf, sizeof b);
> > +   writeb(b, base + offset);

Re: [Qemu-ppc] [PATCH] virtio-pci: fix host notifiers on bi-endian architectures

2015-03-12 Thread Michael S. Tsirkin
On Wed, Mar 11, 2015 at 11:52:13PM +0100, Greg Kurz wrote:
> On Thu, 12 Mar 2015 09:18:38 +1100
> Benjamin Herrenschmidt  wrote:
> 
> > On Wed, 2015-03-11 at 23:03 +0100, Greg Kurz wrote:
> > 
> > > /* The host notifier will be swapped in adjust_endianness() according to 
> > > the
> > >  * target default endianness. We need to negate this swap if the device 
> > > uses
> > >  * an endianness that is not the default (ppc64le for example).
> > >  */
> > > 
> > > > > +static uint16_t cpu_to_host_notifier16(VirtIODevice *vdev, uint16_t 
> > > > > val)
> > > > > +{
> > > > > +return virtio_is_big_endian(vdev) ? val : bswap16(val);
> > > > > +}
> > 
> > But but but ... The above ... won't it do things like break x86 ? Ie,
> > shouldn't we swap only if TARGET_BIG_ENDIAN and !virtio_is_big_endian ?
> > Or better, "fixed target endian" ^ "virtio endian" to cover all cases ?
> > 
> 
> Yeah you're right, it's a mess :)
> 
> To avoid virtio-pci.o being built per target, we can use 
> virtio_default_endian()
> instead (to be exported from virtio.c):
> 
> return vdev->device_endian() != virtio_default_endian() ? val : bswap16(val);
> 
> I shall test on x86 and post a v2.
> 
> Thanks.
> 
> --
> G

It's a mess because the issue is not in virtio I think.
virtio common region is just a simple device, using
fixed LE format.
It's some bug in ioeventfd support. Need to understand
whether it's in kernel (and maybe do a work-around)
or in qemu memory core.


> > Cheers,
> > Ben.
> > 
> > > > >  static int virtio_pci_set_host_notifier_internal(VirtIOPCIProxy 
> > > > > *proxy,
> > > > >   int n, bool assign, 
> > > > > bool set_handler)
> > > > >  {
> > > > > @@ -150,10 +155,12 @@ static int 
> > > > > virtio_pci_set_host_notifier_internal(VirtIOPCIProxy *proxy,
> > > > >  }
> > > > >  virtio_queue_set_host_notifier_fd_handler(vq, true, 
> > > > > set_handler);
> > > > >  memory_region_add_eventfd(&proxy->bar, 
> > > > > VIRTIO_PCI_QUEUE_NOTIFY, 2,
> > > > > -  true, n, notifier);
> > > > > +  true, cpu_to_host_notifier16(vdev, 
> > > > > n),
> > > > > +  notifier);
> > > > >  } else {
> > > > >  memory_region_del_eventfd(&proxy->bar, 
> > > > > VIRTIO_PCI_QUEUE_NOTIFY, 2,
> > > > > -  true, n, notifier);
> > > > > +  true, cpu_to_host_notifier16(vdev, 
> > > > > n),
> > > > > +  notifier);
> > > > >  virtio_queue_set_host_notifier_fd_handler(vq, false, false);
> > > > >  event_notifier_cleanup(notifier);
> > > > >  }
> > > > 
> > > 
> > 
> > 
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio-pci: fix host notifiers on bi-endian architectures

2015-03-12 Thread Michael S. Tsirkin
On Wed, Mar 11, 2015 at 11:03:14PM +0100, Greg Kurz wrote:
> On Wed, 11 Mar 2015 21:06:05 +0100
> "Michael S. Tsirkin"  wrote:
> 
> > On Wed, Mar 11, 2015 at 07:04:38PM +0100, Greg Kurz wrote:
> > > vhost is seriously broken with ppc64le guests, even in the supposedly
> > > supported case where the host is ppc64le and we don't need cross-endian
> > > support.
> > > 
> > > The TX virtqueue fails to be handled by vhost and falls back to QEMU.
> > > Despite this unexpected scenario where RX is vhost and TX is QEMU, the
> > > guest runs well with reduced upload performances... until you reboot,
> > > migrate, managed save or in fact any operation that causes vhost_net
> > > to be re-started. Network connectivity is then permanantly lost for
> > > the guest.
> > > 
> > > TX falling back to QEMU is the result of a failed MMIO store emulation
> > > in KVM. Debugging shows that:
> > > 
> > > kvmppc_emulate_mmio()
> > >  |
> > >  +-> kvmppc_handle_store()
> > >   |
> > >   +-> kvm_io_bus_write()
> > >|
> > >+-> __kvm_io_bus_write() returns -EOPNOTSUPP
> > > 
> > > This happens because no matching device was found:
> > > 
> > > __kvm_io_bus_write()
> > >  |
> > >  +->kvm_iodevice_write()
> > >  |
> > >  +->ioeventfd_write()
> > >  |
> > >  +->ioeventfd_in_range() returns false for all registered vrings
> > > 
> > > Extra debugging shows that the TX vring number (16-bit) is supposed to
> > > be 0x0100 but QEMU passes 0x0001 to KVM... This happens *again* because
> > > QEMU still assumes powerpc is big endian (TARGET_WORDS_BIGENDIAN) by
> > > default.
> > > 
> > > This patch adds an extra swap in virtio_pci_set_host_notifier_internal()
> > > to negate the one that is done in adjust_endianness(). Since this is not
> > > a hot path and we want to keep virtio-pci.o in common-obj, we don't care
> > > whether the guest is bi-endian or not.
> > > 
> > > Reported-by: Cédric Le Goater 
> > > Suggested-by: Michael Roth 
> > > Signed-off-by: Greg Kurz 
> > 
> > I am confused.
> > The value that notifications use is always LE.
> 
> True but adjust_endianness() does swap unconditionally for ppc64
> because of TARGET_WORDS_BIGENDIAN.
> 
> > Can't we avoid multiple swaps?
> 
> That would mean adding an extra endianness argument down to
> memory_region_wrong_endianness()... not sure we want to do that.
> 
> > They make my head spin.
> > 
> 
> I understand that the current fixed target endianness paradigm
> is best suited for most architectures. Extra swaps in specific
> non-critical locations allows to support odd beasts like ppc64le
> and arm64be without trashing more common paths. Maybe I can add a
> comment for better clarity (see below).

But common header format is simple, it's always LE.
It does not depend on target.
To me this looks like a bug in memory_region_add_eventfd,
it should do the right thing depending on device
endian-ness.



> > > ---
> > > 
> > > I guess it is also a fix for virtio-1 but I didn't check.
> > > 
> > >  hw/virtio/virtio-pci.c |   11 +--
> > >  1 file changed, 9 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > > index e7baf7b..62b04c9 100644
> > > --- a/hw/virtio/virtio-pci.c
> > > +++ b/hw/virtio/virtio-pci.c
> > > @@ -133,6 +133,11 @@ static int virtio_pci_load_queue(DeviceState *d, int 
> > > n, QEMUFile *f)
> > >  return 0;
> > >  }
> > >  
> 
> /* The host notifier will be swapped in adjust_endianness() according to the
>  * target default endianness. We need to negate this swap if the device uses
>  * an endianness that is not the default (ppc64le for example).
>  */
> 
> > > +static uint16_t cpu_to_host_notifier16(VirtIODevice *vdev, uint16_t val)
> > > +{
> > > +return virtio_is_big_endian(vdev) ? val : bswap16(val);
> > > +}
> > > +
> > >  static int virtio_pci_set_host_notifier_internal(VirtIOPCIProxy *proxy,
> > >   int n, bool assign, 
> > > bool set_handler)
> > >  {
> > > @@ -150,10 +155,12 @@ static int 
> > > virtio_pci_set_host_notifier_internal(VirtIOPCIProxy *proxy,
> > >  }
> > >  virtio_queue_set_host_notifier_fd_handler(vq, true, set_handler);
> > >  memory_region_add_eventfd(&proxy->bar, VIRTIO_PCI_QUEUE_NOTIFY, 
> > > 2,
> > > -  true, n, notifier);
> > > +  true, cpu_to_host_notifier16(vdev, n),
> > > +  notifier);
> > >  } else {
> > >  memory_region_del_eventfd(&proxy->bar, VIRTIO_PCI_QUEUE_NOTIFY, 
> > > 2,
> > > -  true, n, notifier);
> > > +  true, cpu_to_host_notifier16(vdev, n),
> > > +  notifier);
> > >  virtio_queue_set_host_notifier_fd_handler(vq, false, false);
> > >  event_notifier_cleanup(notifier);
> > >  }
> > 
___