Re: [PATCH v2 2/3] PCI: Add DMA configuration for virtual platforms

2020-03-18 Thread Bjorn Helgaas
On Fri, Feb 28, 2020 at 06:25:37PM +0100, Jean-Philippe Brucker wrote:
> Hardware platforms usually describe the IOMMU topology using either
> device-tree pointers or vendor-specific ACPI tables.  For virtual
> platforms that don't provide a device-tree, the virtio-iommu device
> contains a description of the endpoints it manages.  That information
> allows us to probe endpoints after the IOMMU is probed (possibly as late
> as userspace modprobe), provided it is discovered early enough.
> 
> Add a hook to pci_dma_configure(), which returns -EPROBE_DEFER if the
> endpoint is managed by a vIOMMU that will be loaded later, or 0 in any
> other case to avoid disturbing the normal DMA configuration methods.
> When CONFIG_VIRTIO_IOMMU_TOPOLOGY isn't selected, the call to
> virt_dma_configure() is compiled out.
> 
> As long as the information is consistent, platforms can provide both a
> device-tree and a built-in topology, and the IOMMU infrastructure is
> able to deal with multiple DMA configuration methods.
> 
> Signed-off-by: Jean-Philippe Brucker 

Acked-by: Bjorn Helgaas 

> ---
>  drivers/pci/pci-driver.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 0454ca0e4e3f..69303a814f21 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -18,6 +18,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include "pci.h"
>  #include "pcie/portdrv.h"
>  
> @@ -1602,6 +1603,10 @@ static int pci_dma_configure(struct device *dev)
>  	struct device *bridge;
>  	int ret = 0;
>  
> +	ret = virt_dma_configure(dev);
> +	if (ret)
> +		return ret;
> +
>  	bridge = pci_get_host_bridge_device(to_pci_dev(dev));
>  
>  	if (IS_ENABLED(CONFIG_OF) && bridge->parent &&
> -- 
> 2.25.0
> 
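
For readers of the archive, here is a minimal userspace sketch of the pattern the commit message describes: a hook that either defers the probe or returns 0, with a stub that compiles to nothing when the option is off. Everything named demo_* or DEMO_* is hypothetical and only stands in for the kernel symbols; EPROBE_DEFER is defined locally because it is a kernel-internal value.

```c
#include <stdbool.h>

/* Kernel-internal errno, defined here so the sketch builds in userspace. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

/* Hypothetical stand-in for CONFIG_VIRTIO_IOMMU_TOPOLOGY; remove the
 * define to model a kernel built without the option. */
#define DEMO_VIRTIO_IOMMU_TOPOLOGY 1

/* Hypothetical stand-in for struct device. */
struct demo_device {
	bool managed_by_viommu;	/* endpoint listed in the built-in topology */
	bool viommu_ready;	/* the vIOMMU driver has already probed */
	bool dma_configured;	/* set by the "normal" configuration path */
};

#ifdef DEMO_VIRTIO_IOMMU_TOPOLOGY
static int demo_virt_dma_configure(struct demo_device *dev)
{
	/* Defer only when a vIOMMU manages the endpoint but is not loaded yet. */
	if (dev->managed_by_viommu && !dev->viommu_ready)
		return -EPROBE_DEFER;
	return 0;
}
#else
/* Stub: always 0, so the caller's check is optimised away. */
static inline int demo_virt_dma_configure(struct demo_device *dev)
{
	(void)dev;
	return 0;
}
#endif

static int demo_pci_dma_configure(struct demo_device *dev)
{
	int ret = demo_virt_dma_configure(dev);

	if (ret)
		return ret;

	/* Stands in for the existing OF/ACPI configuration, left untouched. */
	dev->dma_configured = true;
	return 0;
}
```

A caller that sees -EPROBE_DEFER simply retries later, which is why the hook can return 0 in every other case without disturbing the existing paths.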
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH AUTOSEL 4.14 20/28] virtio-blk: fix hw_queue stopped on arbitrary error

2020-03-18 Thread Sasha Levin
From: Halil Pasic 

[ Upstream commit f5f6b95c72f7f8bb46eace8c5306c752d0133daa ]

Since nobody else is going to restart our hw_queue for us, the
blk_mq_start_stopped_hw_queues() call in virtblk_done() is not
necessarily sufficient to ensure that the queue will get started again.
In case of a global resource outage (-ENOMEM because of a mapping
failure, because the swiotlb is full) our virtqueue may be empty and we
can get stuck with a stopped hw_queue.

Let us not stop the queue on arbitrary errors, but only on -ENOSPC, which
indicates a full virtqueue, where the hw_queue is guaranteed to get
started by virtblk_done() when it makes sense to carry on
submitting requests. Let us also remove a stale comment.

Signed-off-by: Halil Pasic 
Cc: Jens Axboe 
Fixes: f7728002c1c7 ("virtio_ring: fix return code on DMA mapping fails")
Link: https://lore.kernel.org/r/20200213123728.61216-2-pa...@linux.ibm.com
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Sasha Levin 
---
 drivers/block/virtio_blk.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 8767401f75e04..19d226ff15ef8 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -271,10 +271,12 @@ static blk_status_t virtio_queue_rq(struct blk_mq_hw_ctx *hctx,
 	err = virtblk_add_req(vblk->vqs[qid].vq, vbr, vbr->sg, num);
 	if (err) {
 		virtqueue_kick(vblk->vqs[qid].vq);
-		blk_mq_stop_hw_queue(hctx);
+		/* Don't stop the queue if -ENOMEM: we may have failed to
+		 * bounce the buffer due to global resource outage.
+		 */
+		if (err == -ENOSPC)
+			blk_mq_stop_hw_queue(hctx);
 		spin_unlock_irqrestore(&vblk->vqs[qid].lock, flags);
-		/* Out of mem doesn't actually happen, since we fall back
-		 * to direct descriptors */
 		if (err == -ENOMEM || err == -ENOSPC)
 			return BLK_STS_RESOURCE;
 		return BLK_STS_IOERR;
-- 
2.20.1
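
The decision this patch makes can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: demo_hw_queue, demo_status and demo_handle_add_error() are hypothetical names, not the driver's, and the status values merely mirror the shape of the hunk above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the block-layer status codes. */
enum demo_status { DEMO_STS_OK, DEMO_STS_RESOURCE, DEMO_STS_IOERR };

/* Hypothetical stand-in for the hardware queue context. */
struct demo_hw_queue {
	bool stopped;
};

/*
 * Handle the result of adding a request to the virtqueue.
 * The queue is stopped only for -ENOSPC: a full ring implies requests
 * in flight, so a completion will come along and restart the queue.
 * For -ENOMEM the ring may be empty, so nothing would ever restart it.
 */
static enum demo_status demo_handle_add_error(struct demo_hw_queue *q, int err)
{
	if (!err)
		return DEMO_STS_OK;

	if (err == -ENOSPC)
		q->stopped = true;

	/* Both cases are transient: ask the caller to retry later. */
	if (err == -ENOMEM || err == -ENOSPC)
		return DEMO_STS_RESOURCE;

	return DEMO_STS_IOERR;
}
```

With this shape, an -ENOMEM failure still reports a resource shortage to the caller, but it leaves the queue running so that the retry is not dependent on a completion that may never arrive.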



[PATCH AUTOSEL 4.9 10/15] virtio-blk: fix hw_queue stopped on arbitrary error

2020-03-18 Thread Sasha Levin
From: Halil Pasic 

[ Upstream commit f5f6b95c72f7f8bb46eace8c5306c752d0133daa ]

Since nobody else is going to restart our hw_queue for us, the
blk_mq_start_stopped_hw_queues() call in virtblk_done() is not
necessarily sufficient to ensure that the queue will get started again.
In case of a global resource outage (-ENOMEM because of a mapping
failure, because the swiotlb is full) our virtqueue may be empty and we
can get stuck with a stopped hw_queue.

Let us not stop the queue on arbitrary errors, but only on -ENOSPC, which
indicates a full virtqueue, where the hw_queue is guaranteed to get
started by virtblk_done() when it makes sense to carry on
submitting requests. Let us also remove a stale comment.

Signed-off-by: Halil Pasic 
Cc: Jens Axboe 
Fixes: f7728002c1c7 ("virtio_ring: fix return code on DMA mapping fails")
Link: https://lore.kernel.org/r/20200213123728.61216-2-pa...@linux.ibm.com
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Sasha Levin 
---
 drivers/block/virtio_blk.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 44ef1d66caa68..f287eec36b282 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -215,10 +215,12 @@ static int virtio_queue_rq(struct blk_mq_hw_ctx *hctx,
 	err = __virtblk_add_req(vblk->vqs[qid].vq, vbr, vbr->sg, num);
 	if (err) {
 		virtqueue_kick(vblk->vqs[qid].vq);
-		blk_mq_stop_hw_queue(hctx);
+		/* Don't stop the queue if -ENOMEM: we may have failed to
+		 * bounce the buffer due to global resource outage.
+		 */
+		if (err == -ENOSPC)
+			blk_mq_stop_hw_queue(hctx);
 		spin_unlock_irqrestore(&vblk->vqs[qid].lock, flags);
-		/* Out of mem doesn't actually happen, since we fall back
-		 * to direct descriptors */
 		if (err == -ENOMEM || err == -ENOSPC)
 			return BLK_MQ_RQ_QUEUE_BUSY;
 		return BLK_MQ_RQ_QUEUE_ERROR;
-- 
2.20.1



[PATCH AUTOSEL 4.19 23/37] virtio-blk: fix hw_queue stopped on arbitrary error

2020-03-18 Thread Sasha Levin
From: Halil Pasic 

[ Upstream commit f5f6b95c72f7f8bb46eace8c5306c752d0133daa ]

Since nobody else is going to restart our hw_queue for us, the
blk_mq_start_stopped_hw_queues() call in virtblk_done() is not
necessarily sufficient to ensure that the queue will get started again.
In case of a global resource outage (-ENOMEM because of a mapping
failure, because the swiotlb is full) our virtqueue may be empty and we
can get stuck with a stopped hw_queue.

Let us not stop the queue on arbitrary errors, but only on -ENOSPC, which
indicates a full virtqueue, where the hw_queue is guaranteed to get
started by virtblk_done() when it makes sense to carry on
submitting requests. Let us also remove a stale comment.

Signed-off-by: Halil Pasic 
Cc: Jens Axboe 
Fixes: f7728002c1c7 ("virtio_ring: fix return code on DMA mapping fails")
Link: https://lore.kernel.org/r/20200213123728.61216-2-pa...@linux.ibm.com
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Sasha Levin 
---
 drivers/block/virtio_blk.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index dd64f586679e1..728c9a9609f0c 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -271,10 +271,12 @@ static blk_status_t virtio_queue_rq(struct blk_mq_hw_ctx *hctx,
 	err = virtblk_add_req(vblk->vqs[qid].vq, vbr, vbr->sg, num);
 	if (err) {
 		virtqueue_kick(vblk->vqs[qid].vq);
-		blk_mq_stop_hw_queue(hctx);
+		/* Don't stop the queue if -ENOMEM: we may have failed to
+		 * bounce the buffer due to global resource outage.
+		 */
+		if (err == -ENOSPC)
+			blk_mq_stop_hw_queue(hctx);
 		spin_unlock_irqrestore(&vblk->vqs[qid].lock, flags);
-		/* Out of mem doesn't actually happen, since we fall back
-		 * to direct descriptors */
 		if (err == -ENOMEM || err == -ENOSPC)
 			return BLK_STS_DEV_RESOURCE;
 		return BLK_STS_IOERR;
-- 
2.20.1



[PATCH AUTOSEL 5.4 38/73] virtio_balloon: Adjust label in virtballoon_probe

2020-03-18 Thread Sasha Levin
From: Nathan Chancellor 

[ Upstream commit 6ae4edab2fbf86ec92fbf0a8f0c60b857d90d50f ]

Clang warns when CONFIG_BALLOON_COMPACTION is unset:

../drivers/virtio/virtio_balloon.c:963:1: warning: unused label
'out_del_vqs' [-Wunused-label]
out_del_vqs:
^~~~
1 warning generated.

Move the label within the preprocessor block since it is only used when
CONFIG_BALLOON_COMPACTION is set.

Fixes: 1ad6f58ea936 ("virtio_balloon: Fix memory leaks on errors in virtballoon_probe()")
Link: https://github.com/ClangBuiltLinux/linux/issues/886
Signed-off-by: Nathan Chancellor 
Link: https://lore.kernel.org/r/20200216004039.23464-1-natechancel...@gmail.com
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: David Hildenbrand 
Signed-off-by: Sasha Levin 
---
 drivers/virtio/virtio_balloon.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index d2c4eb9efd70b..7aaf150f89ba0 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -958,8 +958,8 @@ static int virtballoon_probe(struct virtio_device *vdev)
 	iput(vb->vb_dev_info.inode);
 out_kern_unmount:
 	kern_unmount(balloon_mnt);
-#endif
 out_del_vqs:
+#endif
 	vdev->config->del_vqs(vdev);
 out_free_vb:
 	kfree(vb);
-- 
2.20.1
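
To see why moving the label silences the warning, here is a small userspace sketch of the same layout. The names (demo_probe, demo_trace, DEMO_COMPACTION) are hypothetical; only the placement of the label relative to the preprocessor block mirrors the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for CONFIG_BALLOON_COMPACTION; remove the
 * define and the out_del_vqs label disappears together with its only
 * user, so no unused-label warning can be raised. */
#define DEMO_COMPACTION 1

/* Records which cleanup steps ran. */
struct demo_trace {
	bool late_undone;
	bool vqs_deleted;
	bool freed;
};

static int demo_probe(int fail_step, struct demo_trace *t)
{
	if (fail_step == 1)
		goto out_free;		/* failed before any queue existed */
#ifdef DEMO_COMPACTION
	if (fail_step == 2)
		goto out_del_vqs;	/* the label's only user */
#endif
	if (fail_step == 3)
		goto out_late;		/* unconditional late failure */
	return 0;

out_late:
	t->late_undone = true;
#ifdef DEMO_COMPACTION
out_del_vqs:
#endif
	/* Still reached by fall-through when the label is compiled out. */
	t->vqs_deleted = true;
out_free:
	t->freed = true;
	return -1;
}
```

The cleanup statement after the label stays outside the conditional block, so the unwind order is the same in both configurations.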



[PATCH AUTOSEL 5.4 36/73] virtio_ring: Fix mem leak with vring_new_virtqueue()

2020-03-18 Thread Sasha Levin
From: Suman Anna 

[ Upstream commit f13f09a12cbd0c7b776e083c5d008b6c6a9c4e0b ]

The functions vring_new_virtqueue() and __vring_new_virtqueue() are used
with split rings, and any allocations within these functions are managed
outside of the .we_own_ring flag. The commit cbeedb72b97a ("virtio_ring:
allocate desc state for split ring separately") allocates the desc state
within the __vring_new_virtqueue() but frees it only when the .we_own_ring
flag is set. This leads to a memory leak when freeing such allocated
virtqueues with the vring_del_virtqueue() function.

Fix this by moving the desc_state free code outside the flag and only
for split rings. Issue was discovered during testing with remoteproc
and virtio_rpmsg.

Fixes: cbeedb72b97a ("virtio_ring: allocate desc state for split ring separately")
Signed-off-by: Suman Anna 
Link: https://lore.kernel.org/r/20200224212643.30672-1-s-a...@ti.com
Signed-off-by: Michael S. Tsirkin 
Acked-by: Jason Wang 
Signed-off-by: Sasha Levin 
---
 drivers/virtio/virtio_ring.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 867c7ebd3f107..58b96baa8d488 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -2203,10 +2203,10 @@ void vring_del_virtqueue(struct virtqueue *_vq)
 					 vq->split.queue_size_in_bytes,
 					 vq->split.vring.desc,
 					 vq->split.queue_dma_addr);
-
-			kfree(vq->split.desc_state);
 		}
 	}
+	if (!vq->packed_ring)
+		kfree(vq->split.desc_state);
 	list_del(&_vq->list);
 	kfree(vq);
 }
-- 
2.20.1
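
The leak and its fix can be reproduced with a toy allocator in userspace. This is a rough model, assuming hypothetical demo_* names and a simple live-allocation counter; it only covers split rings and does not model the packed-ring path.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the virtqueue bookkeeping. */
struct demo_vq {
	bool packed_ring;
	bool we_own_ring;
	void *ring;		/* allocated by us only when we_own_ring is set */
	void *desc_state;	/* always allocated for split rings */
};

/* Number of allocations not yet freed; 0 means no leak. */
static int demo_live_allocs;

static void *demo_alloc(size_t n)
{
	void *p = malloc(n);

	if (p)
		demo_live_allocs++;
	return p;
}

static void demo_free(void *p)
{
	if (p) {
		demo_live_allocs--;
		free(p);
	}
}

/* Models creating a split ring, with or without caller-provided memory. */
static struct demo_vq *demo_new_split_vq(bool we_own_ring)
{
	struct demo_vq *vq = demo_alloc(sizeof(*vq));

	if (!vq)
		return NULL;
	vq->packed_ring = false;
	vq->we_own_ring = we_own_ring;
	vq->ring = we_own_ring ? demo_alloc(64) : NULL;
	vq->desc_state = demo_alloc(64);
	return vq;
}

static void demo_del_vq(struct demo_vq *vq)
{
	if (vq->we_own_ring)
		demo_free(vq->ring);
	/* The fix: free desc_state outside the ownership check. */
	if (!vq->packed_ring)
		demo_free(vq->desc_state);
	demo_free(vq);
}
```

Moving the desc_state free back inside the we_own_ring branch makes the counter stay at 1 after deleting a queue created with we_own_ring unset, which is the leak described in the commit message.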



[PATCH AUTOSEL 5.4 37/73] virtio-blk: fix hw_queue stopped on arbitrary error

2020-03-18 Thread Sasha Levin
From: Halil Pasic 

[ Upstream commit f5f6b95c72f7f8bb46eace8c5306c752d0133daa ]

Since nobody else is going to restart our hw_queue for us, the
blk_mq_start_stopped_hw_queues() call in virtblk_done() is not
necessarily sufficient to ensure that the queue will get started again.
In case of a global resource outage (-ENOMEM because of a mapping
failure, because the swiotlb is full) our virtqueue may be empty and we
can get stuck with a stopped hw_queue.

Let us not stop the queue on arbitrary errors, but only on -ENOSPC, which
indicates a full virtqueue, where the hw_queue is guaranteed to get
started by virtblk_done() when it makes sense to carry on
submitting requests. Let us also remove a stale comment.

Signed-off-by: Halil Pasic 
Cc: Jens Axboe 
Fixes: f7728002c1c7 ("virtio_ring: fix return code on DMA mapping fails")
Link: https://lore.kernel.org/r/20200213123728.61216-2-pa...@linux.ibm.com
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Sasha Levin 
---
 drivers/block/virtio_blk.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 7ffd719d89def..c2ed3e9128e3a 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -339,10 +339,12 @@ static blk_status_t virtio_queue_rq(struct blk_mq_hw_ctx *hctx,
 	err = virtblk_add_req(vblk->vqs[qid].vq, vbr, vbr->sg, num);
 	if (err) {
 		virtqueue_kick(vblk->vqs[qid].vq);
-		blk_mq_stop_hw_queue(hctx);
+		/* Don't stop the queue if -ENOMEM: we may have failed to
+		 * bounce the buffer due to global resource outage.
+		 */
+		if (err == -ENOSPC)
+			blk_mq_stop_hw_queue(hctx);
 		spin_unlock_irqrestore(&vblk->vqs[qid].lock, flags);
-		/* Out of mem doesn't actually happen, since we fall back
-		 * to direct descriptors */
 		if (err == -ENOMEM || err == -ENOSPC)
 			return BLK_STS_DEV_RESOURCE;
 		return BLK_STS_IOERR;
-- 
2.20.1



Re: [PATCH V6 3/8] vringh: IOTLB support

2020-03-18 Thread kbuild test robot
Hi Jason,

I love your patch! Yet something to improve:

[auto build test ERROR on vhost/linux-next]
[also build test ERROR on linux/master linus/master v5.6-rc6 next-20200317]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:
https://github.com/0day-ci/linux/commits/Jason-Wang/vDPA-support/20200318-191435
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git linux-next
config: nds32-randconfig-a001-20200318 (attached as .config)
compiler: nds32le-linux-gcc (GCC) 9.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=9.2.0 make.cross ARCH=nds32 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot 

All errors (new ones prefixed by >>):

   nds32le-linux-ld: arch/nds32/kernel/ex-entry.o: in function `common_exception_handler':
   (.text+0xfe): undefined reference to `__trace_hardirqs_off'
   (.text+0xfe): relocation truncated to fit: R_NDS32_25_PCREL_RELA against undefined symbol `__trace_hardirqs_off'
   nds32le-linux-ld: arch/nds32/kernel/ex-exit.o: in function `no_work_pending':
   (.text+0xce): undefined reference to `__trace_hardirqs_off'
   nds32le-linux-ld: (.text+0xd2): undefined reference to `__trace_hardirqs_off'
   nds32le-linux-ld: (.text+0xd6): undefined reference to `__trace_hardirqs_on'
   nds32le-linux-ld: (.text+0xda): undefined reference to `__trace_hardirqs_on'
   nds32le-linux-ld: drivers/vhost/vringh.o: in function `iotlb_translate.isra.0':
>> vringh.c:(.text+0xa68): undefined reference to `vhost_iotlb_itree_first'
>> nds32le-linux-ld: vringh.c:(.text+0xa6c): undefined reference to `vhost_iotlb_itree_first'

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip

Re: [PATCH] iommu/virtio: Reject IOMMU page granule larger than PAGE_SIZE

2020-03-18 Thread Jean-Philippe Brucker
On Wed, Mar 18, 2020 at 05:13:59PM +, Robin Murphy wrote:
> On 2020-03-18 4:14 pm, Auger Eric wrote:
> > Hi,
> > 
> > On 3/18/20 1:00 PM, Robin Murphy wrote:
> > > On 2020-03-18 11:40 am, Jean-Philippe Brucker wrote:
> > > > We don't currently support IOMMUs with a page granule larger than the
> > > > system page size. The IOVA allocator has a BUG_ON() in this case, and
> > > > VFIO has a WARN_ON().
> > 
> > Adding Alex in CC in case he has time to jump in. At the moment I don't
> > get why this WARN_ON() is here.
> > 
> > This was introduced in
> > c8dbca165bb090f926996a572ea2b5b577b34b70 vfio/iommu_type1: Avoid overflow
> > 
> > > > 
> > > > It might be possible to remove these obstacles if necessary. If the host
> > > > uses 64kB pages and the guest uses 4kB, then a device driver calling
> > > > alloc_page() followed by dma_map_page() will create a 64kB mapping for a
> > > > 4kB physical page, allowing the endpoint to access the neighbouring 60kB
> > > > of memory. This problem could be worked around with bounce buffers.
> > > 
> > > FWIW the fundamental issue is that callers of iommu_map() may expect to
> > > be able to map two or more page-aligned regions directly adjacent to
> > > each other for scatter-gather purposes (or ring buffer tricks), and
> > > that's just not possible if the IOMMU granule is too big. Bounce
> > > buffering would be a viable workaround for the streaming DMA API and
> > > certain similar use-cases, but not in general (e.g. coherent DMA, VFIO,
> > > GPUs, etc.)
> > > 
> > > Robin.
> > > 
> > > > For the moment, rather than triggering the IOVA BUG_ON() on mismatched
> > > > page sizes, abort the virtio-iommu probe with an error message.
> > 
> > I understand this is introduced as a temporary solution, but it
> > sounds like an important limitation to me. For instance, it will prevent
> > running a Fedora guest exposed with a virtio-iommu on a RHEL host.
> 
> As above, even if you bypassed all the warnings it wouldn't really work
> properly anyway. In all cases that wouldn't be considered broken, the
> underlying hardware IOMMUs should support the same set of granules as the
> CPUs (or at least the smallest one), so is it actually appropriate for RHEL
> to (presumably) expose a 64K granule in the first place, rather than "works
> with anything" 4K? And/or more generally is there perhaps a hole in the
> virtio-iommu spec WRT being able to negotiate page_size_mask for a
> particular granule if multiple options are available?

That could be added (by allowing config write). The larger problems are:
1) Supporting granularity smaller than the host's PAGE_SIZE in host VFIO.
   At the moment it restricts the exposed page mask and rejects DMA_MAP
   requests not aligned on that granularity.
2) Propagating this negotiation all the way to io-pgtable and the SMMU
   driver in the host.

Thanks,
Jean
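
Point 1) can be illustrated with a short userspace sketch of the two checks it mentions: restricting the advertised page mask to sizes the host can honour, and rejecting map requests that are not aligned on the granule. The demo_* helpers are hypothetical and are not the VFIO implementation; they only show the arithmetic.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Drop every page size smaller than the host page size from a bitmap
 * of supported sizes (bit n set means 2^n bytes is supported).
 * host_page_size must be a power of two.
 */
static uint64_t demo_restrict_mask(uint64_t pgsize_bitmap,
				   uint64_t host_page_size)
{
	return pgsize_bitmap & ~(host_page_size - 1);
}

/*
 * A map request is acceptable only if both its start address and its
 * size are multiples of the granule (a power of two).
 */
static bool demo_map_request_ok(uint64_t iova, uint64_t size,
				uint64_t granule)
{
	if (!granule || !size)
		return false;
	return ((iova | size) & (granule - 1)) == 0;
}
```

With a 64kB host page size, a guest that only ever maps 4kB-aligned buffers has most of its requests rejected by the second check, which is why supporting a smaller granularity in the host is listed as the larger problem.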



Re: [PATCH] iommu/virtio: Reject IOMMU page granule larger than PAGE_SIZE

2020-03-18 Thread Robin Murphy

On 2020-03-18 4:14 pm, Auger Eric wrote:
> Hi,
>
> On 3/18/20 1:00 PM, Robin Murphy wrote:
>> On 2020-03-18 11:40 am, Jean-Philippe Brucker wrote:
>>> We don't currently support IOMMUs with a page granule larger than the
>>> system page size. The IOVA allocator has a BUG_ON() in this case, and
>>> VFIO has a WARN_ON().
>
> Adding Alex in CC in case he has time to jump in. At the moment I don't
> get why this WARN_ON() is here.
>
> This was introduced in
> c8dbca165bb090f926996a572ea2b5b577b34b70 vfio/iommu_type1: Avoid overflow
>
>>>
>>> It might be possible to remove these obstacles if necessary. If the host
>>> uses 64kB pages and the guest uses 4kB, then a device driver calling
>>> alloc_page() followed by dma_map_page() will create a 64kB mapping for a
>>> 4kB physical page, allowing the endpoint to access the neighbouring 60kB
>>> of memory. This problem could be worked around with bounce buffers.
>>
>> FWIW the fundamental issue is that callers of iommu_map() may expect to
>> be able to map two or more page-aligned regions directly adjacent to
>> each other for scatter-gather purposes (or ring buffer tricks), and
>> that's just not possible if the IOMMU granule is too big. Bounce
>> buffering would be a viable workaround for the streaming DMA API and
>> certain similar use-cases, but not in general (e.g. coherent DMA, VFIO,
>> GPUs, etc.)
>>
>> Robin.
>>
>>> For the moment, rather than triggering the IOVA BUG_ON() on mismatched
>>> page sizes, abort the virtio-iommu probe with an error message.
>
> I understand this is introduced as a temporary solution, but it
> sounds like an important limitation to me. For instance, it will prevent
> running a Fedora guest exposed with a virtio-iommu on a RHEL host.

As above, even if you bypassed all the warnings it wouldn't really work
properly anyway. In all cases that wouldn't be considered broken, the
underlying hardware IOMMUs should support the same set of granules as
the CPUs (or at least the smallest one), so is it actually appropriate
for RHEL to (presumably) expose a 64K granule in the first place, rather
than "works with anything" 4K? And/or more generally is there perhaps a
hole in the virtio-iommu spec WRT being able to negotiate page_size_mask
for a particular granule if multiple options are available?

Robin.

>
> Thanks
>
> Eric
>>>
>>> Reported-by: Bharat Bhushan 
>>> Signed-off-by: Jean-Philippe Brucker 
>>> ---
>>>   drivers/iommu/virtio-iommu.c | 9 +++++++++
>>>   1 file changed, 9 insertions(+)
>>>
>>> diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
>>> index 6d4e3c2a2ddb..80d5d8f621ab 100644
>>> --- a/drivers/iommu/virtio-iommu.c
>>> +++ b/drivers/iommu/virtio-iommu.c
>>> @@ -998,6 +998,7 @@ static int viommu_probe(struct virtio_device *vdev)
>>>       struct device *parent_dev = vdev->dev.parent;
>>>       struct viommu_dev *viommu = NULL;
>>>       struct device *dev = &vdev->dev;
>>> +    unsigned long viommu_page_size;
>>>       u64 input_start = 0;
>>>       u64 input_end = -1UL;
>>>       int ret;
>>> @@ -1028,6 +1029,14 @@ static int viommu_probe(struct virtio_device *vdev)
>>>           goto err_free_vqs;
>>>       }
>>>
>>> +    viommu_page_size = 1UL << __ffs(viommu->pgsize_bitmap);
>>> +    if (viommu_page_size > PAGE_SIZE) {
>>> +        dev_err(dev, "granule 0x%lx larger than system page size 0x%lx\n",
>>> +            viommu_page_size, PAGE_SIZE);
>>> +        ret = -EINVAL;
>>> +        goto err_free_vqs;
>>> +    }
>>> +
>>>       viommu->map_flags = VIRTIO_IOMMU_MAP_F_READ | VIRTIO_IOMMU_MAP_F_WRITE;
>>>       viommu->last_domain = ~0U;


Re: [PATCH] iommu/virtio: Reject IOMMU page granule larger than PAGE_SIZE

2020-03-18 Thread Jean-Philippe Brucker
On Wed, Mar 18, 2020 at 05:14:05PM +0100, Auger Eric wrote:
> Hi,
> 
> On 3/18/20 1:00 PM, Robin Murphy wrote:
> > On 2020-03-18 11:40 am, Jean-Philippe Brucker wrote:
> >> We don't currently support IOMMUs with a page granule larger than the
> >> system page size. The IOVA allocator has a BUG_ON() in this case, and
> >> VFIO has a WARN_ON().
> 
> Adding Alex in CC in case he has time to jump in. At the moment I don't
> get why this WARN_ON() is here.
> 
> This was introduced in
> c8dbca165bb090f926996a572ea2b5b577b34b70 vfio/iommu_type1: Avoid overflow
> 
> >>
> >> It might be possible to remove these obstacles if necessary. If the host
> >> uses 64kB pages and the guest uses 4kB, then a device driver calling
> >> alloc_page() followed by dma_map_page() will create a 64kB mapping for a
> >> 4kB physical page, allowing the endpoint to access the neighbouring 60kB
> >> of memory. This problem could be worked around with bounce buffers.
> > 
> > FWIW the fundamental issue is that callers of iommu_map() may expect to
> > be able to map two or more page-aligned regions directly adjacent to
> > each other for scatter-gather purposes (or ring buffer tricks), and
> > that's just not possible if the IOMMU granule is too big. Bounce
> > buffering would be a viable workaround for the streaming DMA API and
> > certain similar use-cases, but not in general (e.g. coherent DMA, VFIO,
> > GPUs, etc.)
> > 
> > Robin.
> > 
> >> For the moment, rather than triggering the IOVA BUG_ON() on mismatched
> >> page sizes, abort the virtio-iommu probe with an error message.
> 
> I understand this is introduced as a temporary solution, but it
> sounds like an important limitation to me. For instance, it will prevent
> running a Fedora guest exposed with a virtio-iommu on a RHEL host.

Looks like you have another argument for nested translation :) We could
probably enable 64k-4k for VFIO, but I don't think we can check and fix
all uses of iommu_map() across the kernel, even if we disallow
IOMMU_DOMAIN_DMA for this case.

Thanks,
Jean
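
To put numbers on the 64kB/4kB case quoted above, here is a small worked example in userspace C. demo_exposed_bytes() is a hypothetical helper, not a kernel function; it assumes the mapping is simply rounded out to the IOMMU granule on both sides.

```c
#include <stdint.h>

/*
 * Number of bytes outside [addr, addr + len) that become reachable by
 * the endpoint when the mapping is rounded out to the IOMMU granule.
 * granule must be a power of two.
 */
static uint64_t demo_exposed_bytes(uint64_t addr, uint64_t len,
				   uint64_t granule)
{
	uint64_t start = addr & ~(granule - 1);
	uint64_t end = (addr + len + granule - 1) & ~(granule - 1);

	return (end - start) - len;
}
```

For a 4kB buffer and a 64kB granule the helper returns 0xf000, the neighbouring 60kB mentioned in the commit message; with a 4kB granule it returns 0.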


Re: [PATCH V6 3/8] vringh: IOTLB support

2020-03-18 Thread kbuild test robot
Hi Jason,

I love your patch! Yet something to improve:

[auto build test ERROR on vhost/linux-next]
[also build test ERROR on linux/master linus/master v5.6-rc6 next-20200317]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:
https://github.com/0day-ci/linux/commits/Jason-Wang/vDPA-support/20200318-191435
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git linux-next
config: h8300-randconfig-a001-20200318 (attached as .config)
compiler: h8300-linux-gcc (GCC) 9.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=9.2.0 make.cross ARCH=h8300 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot 

All errors (new ones prefixed by >>):

   h8300-linux-ld: drivers/vhost/vringh.o: in function `iotlb_translate':
>> drivers/vhost/vringh.c:1079: undefined reference to `vhost_iotlb_itree_first'

vim +1079 drivers/vhost/vringh.c

  1061  
  1062  static int iotlb_translate(const struct vringh *vrh,
  1063 u64 addr, u64 len, struct bio_vec iov[],
  1064 int iov_size, u32 perm)
  1065  {
  1066  struct vhost_iotlb_map *map;
  1067  struct vhost_iotlb *iotlb = vrh->iotlb;
  1068  int ret = 0;
  1069  u64 s = 0;
  1070  
  1071  while (len > s) {
  1072  u64 size, pa, pfn;
  1073  
  1074  if (unlikely(ret >= iov_size)) {
  1075  ret = -ENOBUFS;
  1076  break;
  1077  }
  1078  
> 1079  map = vhost_iotlb_itree_first(iotlb, addr,
  1080addr + len - 1);
  1081  if (!map || map->start > addr) {
  1082  ret = -EINVAL;
  1083  break;
  1084  } else if (!(map->perm & perm)) {
  1085  ret = -EPERM;
  1086  break;
  1087  }
  1088  
  1089  size = map->size - addr + map->start;
  1090  pa = map->addr + addr - map->start;
  1091  pfn = pa >> PAGE_SHIFT;
  1092  iov[ret].bv_page = pfn_to_page(pfn);
  1093  iov[ret].bv_len = min(len - s, size);
  1094  iov[ret].bv_offset = pa & (PAGE_SIZE - 1);
  1095  s += size;
  1096  addr += size;
  1097  ++ret;
  1098  }
  1099  
  1100  return ret;
  1101  }
  1102  
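
For readers who want to experiment with the translation loop outside the kernel, the following is a simplified userspace model. It is a sketch under loud assumptions: demo_first() is a linear scan standing in for the interval-tree lookup, demo_map and demo_seg are hypothetical types, and the loop tracks the remaining length itself rather than copying the listing above line for line.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapping: [start, start + size) translates to addr. */
struct demo_map {
	uint64_t start, size, addr;
	uint32_t perm;
};

/* Hypothetical output segment (physical address and length). */
struct demo_seg {
	uint64_t pa, len;
};

/* Linear stand-in for an interval-tree lookup: first map overlapping
 * [first, last]. Assumes maps[] is sorted by start and non-overlapping. */
static const struct demo_map *demo_first(const struct demo_map *maps, size_t n,
					 uint64_t first, uint64_t last)
{
	for (size_t i = 0; i < n; i++) {
		uint64_t map_last = maps[i].start + maps[i].size - 1;

		if (maps[i].start <= last && map_last >= first)
			return &maps[i];
	}
	return NULL;
}

/* Returns the number of segments written, or a negative errno. */
static int demo_translate(const struct demo_map *maps, size_t n,
			  uint64_t addr, uint64_t len,
			  struct demo_seg *out, int out_size, uint32_t perm)
{
	uint64_t done = 0;
	int ret = 0;

	while (done < len) {
		const struct demo_map *map;
		uint64_t avail, chunk;

		if (ret >= out_size)
			return -ENOBUFS;	/* caller's array is full */

		map = demo_first(maps, n, addr, addr + (len - done) - 1);
		if (!map || map->start > addr)
			return -EINVAL;		/* hole in the address space */
		if (!(map->perm & perm))
			return -EPERM;

		/* Bytes left in this map from addr, capped to what remains. */
		avail = map->size - (addr - map->start);
		chunk = avail < len - done ? avail : len - done;

		out[ret].pa = map->addr + (addr - map->start);
		out[ret].len = chunk;
		done += chunk;
		addr += chunk;
		ret++;
	}
	return ret;
}
```

A request that straddles two adjacent maps comes back as two segments, one per map, which is the behaviour the bio_vec array in the listing is sized for.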

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip

Re: [PATCH] iommu/virtio: Reject IOMMU page granule larger than PAGE_SIZE

2020-03-18 Thread Auger Eric
Hi,

On 3/18/20 1:00 PM, Robin Murphy wrote:
> On 2020-03-18 11:40 am, Jean-Philippe Brucker wrote:
>> We don't currently support IOMMUs with a page granule larger than the
>> system page size. The IOVA allocator has a BUG_ON() in this case, and
>> VFIO has a WARN_ON().

Adding Alex in CC in case he has time to jump in. At the moment I don't
get why this WARN_ON() is here.

This was introduced in
c8dbca165bb090f926996a572ea2b5b577b34b70 vfio/iommu_type1: Avoid overflow

>>
>> It might be possible to remove these obstacles if necessary. If the host
>> uses 64kB pages and the guest uses 4kB, then a device driver calling
>> alloc_page() followed by dma_map_page() will create a 64kB mapping for a
>> 4kB physical page, allowing the endpoint to access the neighbouring 60kB
>> of memory. This problem could be worked around with bounce buffers.
> 
> FWIW the fundamental issue is that callers of iommu_map() may expect to
> be able to map two or more page-aligned regions directly adjacent to
> each other for scatter-gather purposes (or ring buffer tricks), and
> that's just not possible if the IOMMU granule is too big. Bounce
> buffering would be a viable workaround for the streaming DMA API and
> certain similar use-cases, but not in general (e.g. coherent DMA, VFIO,
> GPUs, etc.)
> 
> Robin.
> 
>> For the moment, rather than triggering the IOVA BUG_ON() on mismatched
>> page sizes, abort the virtio-iommu probe with an error message.

I understand this is introduced as a temporary solution, but it
sounds like an important limitation to me. For instance, it will prevent
running a Fedora guest exposed with a virtio-iommu on a RHEL host.

Thanks

Eric
>>
>> Reported-by: Bharat Bhushan 
>> Signed-off-by: Jean-Philippe Brucker 
>> ---
>>   drivers/iommu/virtio-iommu.c | 9 +++++++++
>>   1 file changed, 9 insertions(+)
>>
>> diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
>> index 6d4e3c2a2ddb..80d5d8f621ab 100644
>> --- a/drivers/iommu/virtio-iommu.c
>> +++ b/drivers/iommu/virtio-iommu.c
>> @@ -998,6 +998,7 @@ static int viommu_probe(struct virtio_device *vdev)
>>       struct device *parent_dev = vdev->dev.parent;
>>       struct viommu_dev *viommu = NULL;
>>       struct device *dev = &vdev->dev;
>> +    unsigned long viommu_page_size;
>>       u64 input_start = 0;
>>       u64 input_end = -1UL;
>>       int ret;
>> @@ -1028,6 +1029,14 @@ static int viommu_probe(struct virtio_device *vdev)
>>           goto err_free_vqs;
>>       }
>>
>> +    viommu_page_size = 1UL << __ffs(viommu->pgsize_bitmap);
>> +    if (viommu_page_size > PAGE_SIZE) {
>> +        dev_err(dev, "granule 0x%lx larger than system page size 0x%lx\n",
>> +            viommu_page_size, PAGE_SIZE);
>> +        ret = -EINVAL;
>> +        goto err_free_vqs;
>> +    }
>> +
>>       viommu->map_flags = VIRTIO_IOMMU_MAP_F_READ | VIRTIO_IOMMU_MAP_F_WRITE;
>>       viommu->last_domain = ~0U;
>>
> 
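
The check added by the quoted hunk boils down to "take the lowest set bit of the page-size bitmap and compare it with the system page size". A userspace sketch of that arithmetic, with a hypothetical demo_check_granule() helper and the page size passed in as a parameter instead of PAGE_SIZE:

```c
#include <errno.h>
#include <stdint.h>

/*
 * Returns 0 when the smallest page size in pgsize_bitmap fits within
 * page_size, -EINVAL otherwise. Bit n set in the bitmap means a page
 * size of 2^n bytes is supported.
 */
static int demo_check_granule(uint64_t pgsize_bitmap, uint64_t page_size)
{
	uint64_t granule;

	/* The kernel's __ffs() is undefined for 0, so reject it up front. */
	if (!pgsize_bitmap)
		return -EINVAL;

	/* Isolate the lowest set bit, equivalent to 1UL << __ffs(bitmap). */
	granule = pgsize_bitmap & (~pgsize_bitmap + 1);

	return granule > page_size ? -EINVAL : 0;
}
```

A device offering only 64kB and 2MB pages fails the check on a 4kB system, while a device that also offers 4kB passes, whatever larger sizes it advertises.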


Re: [PATCH V6 8/8] virtio: Intel IFC VF driver for VDPA

2020-03-18 Thread kbuild test robot
Hi Jason,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on vhost/linux-next]
[also build test ERROR on linux/master linus/master v5.6-rc6 next-20200317]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:
https://github.com/0day-ci/linux/commits/Jason-Wang/vDPA-support/20200318-191435
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git linux-next
config: sh-allmodconfig (attached as .config)
compiler: sh4-linux-gcc (GCC) 9.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=9.2.0 make.cross ARCH=sh 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot 

All errors (new ones prefixed by >>):

   drivers/virtio/vdpa/ifcvf/ifcvf_main.c: In function 'ifcvf_probe':
   drivers/virtio/vdpa/ifcvf/ifcvf_main.c:409:30: error: implicit declaration 
of function 'pci_iomap_range'; did you mean 'pci_unmap_page'? 
[-Werror=implicit-function-declaration]
 409 |   vf->mem_resource[i].addr = pci_iomap_range(pdev, i, 0,
 |  ^~~
 |  pci_unmap_page
   drivers/virtio/vdpa/ifcvf/ifcvf_main.c:409:28: warning: assignment to 'void 
*' from 'int' makes pointer from integer without a cast [-Wint-conversion]
 409 |   vf->mem_resource[i].addr = pci_iomap_range(pdev, i, 0,
 |^
   drivers/virtio/vdpa/ifcvf/ifcvf_main.c:443:2: error: implicit declaration of 
function 'pci_free_irq_vectors'; did you mean 'pci_alloc_irq_vectors'? 
[-Werror=implicit-function-declaration]
 443 |  pci_free_irq_vectors(pdev);
 |  ^~~~
 |  pci_alloc_irq_vectors
   drivers/virtio/vdpa/ifcvf/ifcvf_main.c: In function 'ifcvf_remove':
>> drivers/virtio/vdpa/ifcvf/ifcvf_main.c:463:4: error: implicit declaration of 
>> function 'pci_iounmap'; did you mean 'pcim_iounmap'? 
>> [-Werror=implicit-function-declaration]
 463 |pci_iounmap(pdev, vf->mem_resource[i].addr);
 |^~~
 |pcim_iounmap
   drivers/virtio/vdpa/ifcvf/ifcvf_main.c: At top level:
   drivers/virtio/vdpa/ifcvf/ifcvf_main.c:491:1: warning: data definition has 
no type or storage class
 491 | module_pci_driver(ifcvf_driver);
 | ^
   drivers/virtio/vdpa/ifcvf/ifcvf_main.c:491:1: error: type defaults to 'int' 
in declaration of 'module_pci_driver' [-Werror=implicit-int]
   drivers/virtio/vdpa/ifcvf/ifcvf_main.c:491:1: warning: parameter names 
(without types) in function declaration
   drivers/virtio/vdpa/ifcvf/ifcvf_main.c:484:26: warning: 'ifcvf_driver' 
defined but not used [-Wunused-variable]
 484 | static struct pci_driver ifcvf_driver = {
 |  ^~~~
   cc1: some warnings being treated as errors

vim +463 drivers/virtio/vdpa/ifcvf/ifcvf_main.c

   452  
   453  static void ifcvf_remove(struct pci_dev *pdev)
   454  {
   455  struct ifcvf_adapter *adapter = pci_get_drvdata(pdev);
   456  struct ifcvf_hw *vf;
   457  int i;
   458  
   459  ifcvf_vdpa_detach(adapter);
   460  vf = &adapter->vf;
   461  for (i = 0; i < IFCVF_PCI_MAX_RESOURCE; i++) {
   462  if (vf->mem_resource[i].addr) {
 > 463  pci_iounmap(pdev, vf->mem_resource[i].addr);
   464  vf->mem_resource[i].addr = NULL;
   465  }
   466  }
   467  
   468  ifcvf_destroy_adapter(adapter);
   469  pci_free_irq_vectors(pdev);
   470  pci_release_regions(pdev);
   471  pci_disable_device(pdev);
   472  kfree(adapter);
   473  }
   474  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org



Re: [PATCH V6 8/8] virtio: Intel IFC VF driver for VDPA

2020-03-18 Thread kbuild test robot
Hi Jason,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on vhost/linux-next]
[also build test ERROR on linux/master linus/master v5.6-rc6 next-20200317]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:
https://github.com/0day-ci/linux/commits/Jason-Wang/vDPA-support/20200318-191435
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git linux-next
config: c6x-allyesconfig (attached as .config)
compiler: c6x-elf-gcc (GCC) 9.2.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=9.2.0 make.cross ARCH=c6x 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot 

All error/warnings (new ones prefixed by >>):

   drivers/virtio/vdpa/ifcvf/ifcvf_main.c: In function 'ifcvf_probe':
>> drivers/virtio/vdpa/ifcvf/ifcvf_main.c:409:30: error: implicit declaration 
>> of function 'pci_iomap_range'; did you mean 'pci_unmap_page'? 
>> [-Werror=implicit-function-declaration]
 409 |   vf->mem_resource[i].addr = pci_iomap_range(pdev, i, 0,
 |  ^~~
 |  pci_unmap_page
>> drivers/virtio/vdpa/ifcvf/ifcvf_main.c:409:28: warning: assignment to 'void 
>> *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
 409 |   vf->mem_resource[i].addr = pci_iomap_range(pdev, i, 0,
 |^
>> drivers/virtio/vdpa/ifcvf/ifcvf_main.c:443:2: error: implicit declaration of 
>> function 'pci_free_irq_vectors'; did you mean 'pci_alloc_irq_vectors'? 
>> [-Werror=implicit-function-declaration]
 443 |  pci_free_irq_vectors(pdev);
 |  ^~~~
 |  pci_alloc_irq_vectors
   drivers/virtio/vdpa/ifcvf/ifcvf_main.c: At top level:
>> drivers/virtio/vdpa/ifcvf/ifcvf_main.c:491:1: warning: data definition has 
>> no type or storage class
 491 | module_pci_driver(ifcvf_driver);
 | ^
>> drivers/virtio/vdpa/ifcvf/ifcvf_main.c:491:1: error: type defaults to 'int' 
>> in declaration of 'module_pci_driver' [-Werror=implicit-int]
>> drivers/virtio/vdpa/ifcvf/ifcvf_main.c:491:1: warning: parameter names 
>> (without types) in function declaration
   drivers/virtio/vdpa/ifcvf/ifcvf_main.c:484:26: warning: 'ifcvf_driver' 
defined but not used [-Wunused-variable]
 484 | static struct pci_driver ifcvf_driver = {
 |  ^~~~
   cc1: some warnings being treated as errors
--
   drivers/virtio/vdpa/ifcvf/ifcvf_base.c: In function 'ifcvf_init_hw':
>> drivers/virtio/vdpa/ifcvf/ifcvf_base.c:110:8: warning: 'pos' is used 
>> uninitialized in this function [-Wuninitialized]
 110 |  while (pos) {
 |^

vim +409 drivers/virtio/vdpa/ifcvf/ifcvf_main.c

   365  
   366  static int ifcvf_probe(struct pci_dev *pdev, const struct pci_device_id 
*id)
   367  {
   368  struct device *dev = &pdev->dev;
   369  struct ifcvf_adapter *adapter;
   370  struct ifcvf_hw *vf;
   371  int ret, i;
   372  
   373  adapter = kzalloc(sizeof(struct ifcvf_adapter), GFP_KERNEL);
   374  if (adapter == NULL) {
   375  ret = -ENOMEM;
   376  goto fail;
   377  }
   378  
   379  adapter->dev = dev;
   380  pci_set_drvdata(pdev, adapter);
   381  ret = pci_enable_device(pdev);
   382  if (ret) {
   383  IFCVF_ERR(adapter->dev, "Failed to enable device\n");
   384  goto free_adapter;
   385  }
   386  
   387  ret = pci_request_regions(pdev, IFCVF_DRIVER_NAME);
   388  if (ret) {
   389  IFCVF_ERR(adapter->dev, "Failed to request MMIO 
region\n");
   390  goto disable_device;
   391  }
   392  
   393  pci_set_master(pdev);
   394  ret = ifcvf_init_msix(adapter);
   395  if (ret) {
   396  IFCVF_ERR(adapter->dev, "Failed to initialize MSI-X\n");
   397  goto free_msix;
   398  }
   399  
   400  vf = &adapter->vf;
   401  for (i = 0; i < IFCVF_PCI_MAX_RESOURCE; i++) {
   402  vf->mem_resource[i].phys_addr = 
pci_resource_start(pdev, i);
   403  vf->mem_resource[i].len = pci_resource_len(pdev, i);
   404  if (!vf->mem_resource[i].len) {
   405  vf->mem_resource[i].addr

Re: [PATCH V6 8/8] virtio: Intel IFC VF driver for VDPA

2020-03-18 Thread Jason Gunthorpe
On Wed, Mar 18, 2020 at 04:03:27PM +0800, Jason Wang wrote:
> From: Zhu Lingshan 
> +
> +static int ifcvf_vdpa_attach(struct ifcvf_adapter *adapter)
> +{
> + int ret;
> +
> + adapter->vdpa_dev  = vdpa_alloc_device(adapter->dev, adapter->dev,
> +_vdpa_ops);
> + if (IS_ERR(adapter->vdpa_dev)) {
> + IFCVF_ERR(adapter->dev, "Failed to init ifcvf on vdpa bus");
> + put_device(&adapter->vdpa_dev->dev);
> + return -ENODEV;
> + }

The point of having an alloc call is so that the driver's
ifcvf_adapter memory could be placed in the same struct - e.g. use
container_of to flip between them, and have a kref for both memories.

It seems really weird to have an alloc followed immediately by
register.
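
To illustrate the layout suggested above, here is a minimal userspace
sketch. All demo_* names are hypothetical and this is not the vDPA API
from the patch; it only assumes that the allocator receives the size of
the driver structure and that the bus-level object is its first member,
so one allocation and one reference count cover both.

```c
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical bus-level object; not the real struct vdpa_device. */
struct demo_vdpa_device {
	int refcount;
};

/* Hypothetical driver object that embeds the bus-level object. */
struct demo_adapter {
	struct demo_vdpa_device vdpa;	/* must be the first member */
	int vf_id;
};

/*
 * Allocate 'size' zeroed bytes for the whole driver structure and
 * initialise the embedded device, which the caller guarantees sits at
 * offset 0 of that structure.
 */
static struct demo_vdpa_device *demo_vdpa_alloc(size_t size)
{
	struct demo_vdpa_device *vdpa;

	if (size < sizeof(*vdpa))
		return NULL;
	vdpa = calloc(1, size);
	if (vdpa)
		vdpa->refcount = 1;
	return vdpa;
}

/* Flip from the embedded device back to the driver structure. */
static struct demo_adapter *to_demo_adapter(struct demo_vdpa_device *vdpa)
{
	return container_of(vdpa, struct demo_adapter, vdpa);
}

/* Drop one reference; returns 1 when the shared allocation was freed. */
static int demo_vdpa_put(struct demo_vdpa_device *vdpa)
{
	if (--vdpa->refcount > 0)
		return 0;
	free(vdpa);	/* valid because vdpa is at offset 0 of the allocation */
	return 1;
}
```

A driver would call demo_vdpa_alloc(sizeof(struct demo_adapter)) and
use to_demo_adapter() wherever it needs its private data, so no second
allocation has to be kept in sync with the device lifetime.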

> diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> index c30eb55030be..de64b88ee7e4 100644
> +++ b/drivers/virtio/virtio_vdpa.c
> @@ -362,6 +362,7 @@ static int virtio_vdpa_probe(struct vdpa_device *vdpa)
>   goto err;
>  
>   vdpa_set_drvdata(vdpa, vd_dev);
> + dev_info(vd_dev->vdev.dev.parent, "device attached to VDPA bus\n");
>  
>   return 0;

This hunk seems out of place

Jason  


Re: [PATCH] iommu/virtio: Reject IOMMU page granule larger than PAGE_SIZE

2020-03-18 Thread Robin Murphy

On 2020-03-18 11:40 am, Jean-Philippe Brucker wrote:

We don't currently support IOMMUs with a page granule larger than the
system page size. The IOVA allocator has a BUG_ON() in this case, and
VFIO has a WARN_ON().

It might be possible to remove these obstacles if necessary. If the host
uses 64kB pages and the guest uses 4kB, then a device driver calling
alloc_page() followed by dma_map_page() will create a 64kB mapping for a
4kB physical page, allowing the endpoint to access the neighbouring 60kB
of memory. This problem could be worked around with bounce buffers.


FWIW the fundamental issue is that callers of iommu_map() may expect to 
be able to map two or more page-aligned regions directly adjacent to 
each other for scatter-gather purposes (or ring buffer tricks), and 
that's just not possible if the IOMMU granule is too big. Bounce 
buffering would be a viable workaround for the streaming DMA API and 
certain similar use-cases, but not in general (e.g. coherent DMA, VFIO, 
GPUs, etc.)
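
The arithmetic behind both points can be sketched in a few lines of
plain C. The helper names are hypothetical and the sketch assumes that
page sizes and granules are powers of two; it is not code from the
IOMMU core.

```c
#include <stdint.h>

/* Round 'len' up to a multiple of 'granule' (a power of two). */
static uint64_t granule_align_up(uint64_t len, uint64_t granule)
{
	return (len + granule - 1) & ~(granule - 1);
}

/* Bytes beyond the caller's buffer that the device can reach anyway. */
static uint64_t extra_exposed(uint64_t len, uint64_t granule)
{
	return granule_align_up(len, granule) - len;
}

/*
 * Two page-aligned buffers can be placed back to back in IOVA space
 * only if every page boundary is also a granule boundary.
 */
static int can_map_adjacent_pages(uint64_t page_size, uint64_t granule)
{
	return page_size % granule == 0;
}
```

With a 4kB page and a 64kB granule, extra_exposed() gives the 60kB
mentioned in the commit message, and can_map_adjacent_pages() returns
0, which is the scatter-gather case described above.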


Robin.


For the moment, rather than triggering the IOVA BUG_ON() on mismatched
page sizes, abort the virtio-iommu probe with an error message.

Reported-by: Bharat Bhushan 
Signed-off-by: Jean-Philippe Brucker 
---
  drivers/iommu/virtio-iommu.c | 9 +
  1 file changed, 9 insertions(+)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 6d4e3c2a2ddb..80d5d8f621ab 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -998,6 +998,7 @@ static int viommu_probe(struct virtio_device *vdev)
struct device *parent_dev = vdev->dev.parent;
struct viommu_dev *viommu = NULL;
struct device *dev = &vdev->dev;
+   unsigned long viommu_page_size;
u64 input_start = 0;
u64 input_end = -1UL;
int ret;
@@ -1028,6 +1029,14 @@ static int viommu_probe(struct virtio_device *vdev)
goto err_free_vqs;
}
  
+   viommu_page_size = 1UL << __ffs(viommu->pgsize_bitmap);
+   if (viommu_page_size > PAGE_SIZE) {
+   dev_err(dev, "granule 0x%lx larger than system page size 
0x%lx\n",
+   viommu_page_size, PAGE_SIZE);
+   ret = -EINVAL;
+   goto err_free_vqs;
+   }
+
viommu->map_flags = VIRTIO_IOMMU_MAP_F_READ | VIRTIO_IOMMU_MAP_F_WRITE;
viommu->last_domain = ~0U;
  




[PATCH] iommu/virtio: Reject IOMMU page granule larger than PAGE_SIZE

2020-03-18 Thread Jean-Philippe Brucker
We don't currently support IOMMUs with a page granule larger than the
system page size. The IOVA allocator has a BUG_ON() in this case, and
VFIO has a WARN_ON().

It might be possible to remove these obstacles if necessary. If the host
uses 64kB pages and the guest uses 4kB, then a device driver calling
alloc_page() followed by dma_map_page() will create a 64kB mapping for a
4kB physical page, allowing the endpoint to access the neighbouring 60kB
of memory. This problem could be worked around with bounce buffers.

For the moment, rather than triggering the IOVA BUG_ON() on mismatched
page sizes, abort the virtio-iommu probe with an error message.

Reported-by: Bharat Bhushan 
Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/virtio-iommu.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 6d4e3c2a2ddb..80d5d8f621ab 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -998,6 +998,7 @@ static int viommu_probe(struct virtio_device *vdev)
struct device *parent_dev = vdev->dev.parent;
struct viommu_dev *viommu = NULL;
struct device *dev = &vdev->dev;
+   unsigned long viommu_page_size;
u64 input_start = 0;
u64 input_end = -1UL;
int ret;
@@ -1028,6 +1029,14 @@ static int viommu_probe(struct virtio_device *vdev)
goto err_free_vqs;
}
 
+   viommu_page_size = 1UL << __ffs(viommu->pgsize_bitmap);
+   if (viommu_page_size > PAGE_SIZE) {
+   dev_err(dev, "granule 0x%lx larger than system page size 
0x%lx\n",
+   viommu_page_size, PAGE_SIZE);
+   ret = -EINVAL;
+   goto err_free_vqs;
+   }
+
viommu->map_flags = VIRTIO_IOMMU_MAP_F_READ | VIRTIO_IOMMU_MAP_F_WRITE;
viommu->last_domain = ~0U;
 
-- 
2.25.1
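
For readers who want to try the check outside the kernel, a minimal
userspace sketch follows. DEMO_PAGE_SIZE and the demo_* helpers are
assumptions made for the example; __builtin_ctzl() (a GCC/Clang
builtin) stands in for the kernel's __ffs(), and the empty-bitmap test
is an extra guard that is not part of the patch.

```c
#include <errno.h>

/* Assumed guest page size for the example; the kernel uses PAGE_SIZE. */
#define DEMO_PAGE_SIZE 4096UL

/*
 * Smallest page size advertised in a non-zero page-size bitmap,
 * mirroring 1UL << __ffs(bitmap) from the patch.
 */
static unsigned long demo_smallest_granule(unsigned long pgsize_bitmap)
{
	return 1UL << __builtin_ctzl(pgsize_bitmap);
}

/* Return 0 if the granule fits in a page, -EINVAL otherwise. */
static int demo_check_granule(unsigned long pgsize_bitmap)
{
	if (!pgsize_bitmap)
		return -EINVAL;	/* ctz of 0 is undefined; treat as invalid */
	if (demo_smallest_granule(pgsize_bitmap) > DEMO_PAGE_SIZE)
		return -EINVAL;
	return 0;
}
```

A bitmap of 0x11000 (4kB and 64kB pages) passes because the smallest
granule is 4kB, while 0x10000 (64kB only) is rejected.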



[PATCH V6 8/8] virtio: Intel IFC VF driver for VDPA

2020-03-18 Thread Jason Wang
From: Zhu Lingshan 

This commit introduces two layers to drive IFC VF:

(1) ifcvf_base layer, which handles IFC VF NIC hardware operations and
configurations.

(2) ifcvf_main layer, which complies with the VDPA bus framework,
implements device operations for the VDPA bus, and handles device probe,
bus attaching, vring operations, etc.

Signed-off-by: Zhu Lingshan 
Signed-off-by: Bie Tiwei 
Signed-off-by: Wang Xiao 
Signed-off-by: Jason Wang 
---
 drivers/virtio/vdpa/Kconfig|  10 +
 drivers/virtio/vdpa/Makefile   |   1 +
 drivers/virtio/vdpa/ifcvf/Makefile |   3 +
 drivers/virtio/vdpa/ifcvf/ifcvf_base.c | 386 +++
 drivers/virtio/vdpa/ifcvf/ifcvf_base.h | 133 +++
 drivers/virtio/vdpa/ifcvf/ifcvf_main.c | 494 +
 drivers/virtio/virtio_vdpa.c   |   1 +
 7 files changed, 1028 insertions(+)
 create mode 100644 drivers/virtio/vdpa/ifcvf/Makefile
 create mode 100644 drivers/virtio/vdpa/ifcvf/ifcvf_base.c
 create mode 100644 drivers/virtio/vdpa/ifcvf/ifcvf_base.h
 create mode 100644 drivers/virtio/vdpa/ifcvf/ifcvf_main.c

diff --git a/drivers/virtio/vdpa/Kconfig b/drivers/virtio/vdpa/Kconfig
index 9baa1d8da002..9f0b0fc72514 100644
--- a/drivers/virtio/vdpa/Kconfig
+++ b/drivers/virtio/vdpa/Kconfig
@@ -22,4 +22,14 @@ config VDPA_SIM
  to RX. This device is used for testing, prototyping and
  development of vDPA.
 
+config IFCVF
+   tristate "Intel IFC VF VDPA driver"
+   depends on VDPA
+   default n
+   help
+ This kernel module can drive Intel IFC VF NIC to offload
+ virtio dataplane traffic to hardware.
+ To compile this driver as a module, choose M here: the module will
+ be called ifcvf.
+
 endif # VDPA_MENU
diff --git a/drivers/virtio/vdpa/Makefile b/drivers/virtio/vdpa/Makefile
index 3814af8e097b..8bbb686ca7a2 100644
--- a/drivers/virtio/vdpa/Makefile
+++ b/drivers/virtio/vdpa/Makefile
@@ -1,3 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_VDPA) += vdpa.o
 obj-$(CONFIG_VDPA_SIM) += vdpa_sim/
+obj-$(CONFIG_IFCVF)+= ifcvf/
diff --git a/drivers/virtio/vdpa/ifcvf/Makefile 
b/drivers/virtio/vdpa/ifcvf/Makefile
new file mode 100644
index ..d709915995ab
--- /dev/null
+++ b/drivers/virtio/vdpa/ifcvf/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_IFCVF) += ifcvf.o
+ifcvf-$(CONFIG_IFCVF) += ifcvf_main.o ifcvf_base.o
diff --git a/drivers/virtio/vdpa/ifcvf/ifcvf_base.c 
b/drivers/virtio/vdpa/ifcvf/ifcvf_base.c
new file mode 100644
index ..ebfc0453f21a
--- /dev/null
+++ b/drivers/virtio/vdpa/ifcvf/ifcvf_base.c
@@ -0,0 +1,386 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Intel IFC VF NIC driver for virtio dataplane offloading
+ *
+ * Copyright (C) 2020 Intel Corporation.
+ *
+ * Author: Zhu Lingshan 
+ *
+ */
+
+#include "ifcvf_base.h"
+
+static inline u8 ifc_ioread8(u8 __iomem *addr)
+{
+   return ioread8(addr);
+}
+static inline u16 ifc_ioread16 (__le16 __iomem *addr)
+{
+   return ioread16(addr);
+}
+
+static inline u32 ifc_ioread32(__le32 __iomem *addr)
+{
+   return ioread32(addr);
+}
+
+static inline void ifc_iowrite8(u8 value, u8 __iomem *addr)
+{
+   iowrite8(value, addr);
+}
+
+static inline void ifc_iowrite16(u16 value, __le16 __iomem *addr)
+{
+   iowrite16(value, addr);
+}
+
+static inline void ifc_iowrite32(u32 value, __le32 __iomem *addr)
+{
+   iowrite32(value, addr);
+}
+
+static void ifc_iowrite64_twopart(u64 val,
+ __le32 __iomem *lo, __le32 __iomem *hi)
+{
+   ifc_iowrite32((u32)val, lo);
+   ifc_iowrite32(val >> 32, hi);
+}
+
+struct ifcvf_adapter *vf_to_adapter(struct ifcvf_hw *hw)
+{
+   return container_of(hw, struct ifcvf_adapter, vf);
+}
+
+static void __iomem *get_cap_addr(struct ifcvf_hw *hw,
+ struct virtio_pci_cap *cap)
+{
+   struct ifcvf_adapter *ifcvf;
+   u32 length, offset;
+   u8 bar;
+
+   length = le32_to_cpu(cap->length);
+   offset = le32_to_cpu(cap->offset);
+   bar = cap->bar;
+
+   ifcvf = vf_to_adapter(hw);
+   if (bar >= IFCVF_PCI_MAX_RESOURCE) {
+   IFCVF_DBG(ifcvf->dev,
+ "Invalid bar number %u to get capabilities\n", bar);
+   return NULL;
+   }
+
+   if (offset + length > hw->mem_resource[bar].len) {
+   IFCVF_DBG(ifcvf->dev,
+ "offset(%u) + len(%u) overflows bar%u to get 
capabilities\n",
+ offset, length, bar);
+   return NULL;
+   }
+
+   return hw->mem_resource[bar].addr + offset;
+}
+
+static int ifcvf_read_config_range(struct pci_dev *dev,
+  uint32_t *val, int size, int where)
+{
+   int ret, i;
+
+   for (i = 0; i < size; i += 4) {
+   ret = pci_read_config_dword(dev, where + i, val + i / 4);
+   if (ret < 0)
+   
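
As a side note on ifc_iowrite64_twopart() earlier in this patch: the
split of a 64-bit value into two 32-bit writes can be sketched in
userspace as below. struct demo_reg_pair and both helpers are
hypothetical stand-ins for a pair of MMIO registers; real MMIO access
would go through iowrite32() as the patch does.

```c
#include <stdint.h>

/* Hypothetical stand-in for a pair of 32-bit device registers. */
struct demo_reg_pair {
	uint32_t lo;
	uint32_t hi;
};

/* Store a 64-bit value as two 32-bit halves, low half first. */
static void demo_write64_twopart(uint64_t val, struct demo_reg_pair *regs)
{
	regs->lo = (uint32_t)val;
	regs->hi = (uint32_t)(val >> 32);
}

/* Reassemble the 64-bit value from the two halves. */
static uint64_t demo_read64_twopart(const struct demo_reg_pair *regs)
{
	return ((uint64_t)regs->hi << 32) | regs->lo;
}
```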

[PATCH V6 6/8] vhost: introduce vDPA-based backend

2020-03-18 Thread Jason Wang
From: Tiwei Bie 

This patch introduces a vDPA-based vhost backend. This backend is
built on top of the same interface defined in virtio-vDPA and provides
a generic vhost interface for userspace to accelerate the virtio
devices in guest.

This backend is implemented as a vDPA device driver on top of the same
ops used in virtio-vDPA. It will create char device entry named
vhost-vdpa-$index for userspace to use. Userspace can use vhost ioctls
on top of this char device to setup the backend.

Vhost ioctls are extended to make it type agnostic and behave like a
virtio device, which helps to eliminate type-specific APIs like those
in vhost_net/scsi/vsock:

- VHOST_VDPA_GET_DEVICE_ID: get the virtio device ID, which is defined
  by the virtio specification to distinguish between device types
- VHOST_VDPA_GET_VRING_NUM: get the maximum size of virtqueue
  supported by the vDPA device
- VHOST_VDPA_SET/GET_STATUS: set and get virtio status of vDPA device
- VHOST_VDPA_SET/GET_CONFIG: access virtio config space
- VHOST_VDPA_SET_VRING_ENABLE: enable a specific virtqueue

For memory mapping, IOTLB API is mandated for vhost-vDPA which means
userspace drivers are required to use
VHOST_IOTLB_UPDATE/VHOST_IOTLB_INVALIDATE to add or remove mapping for
a specific userspace memory region.

The vhost-vDPA API is designed to be type agnostic, but it allows only
net devices at the current stage. Due to the lack of control virtqueue
support, some features are filtered out by vhost-vdpa.

We will enable more features and devices in the near future.
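
A rough sketch of what such feature filtering can look like, assuming
it is a plain bitwise AND against an allow-list. The DEMO_* macros and
demo_filter_features() are hypothetical names; the bit numbers are the
ones assigned by the virtio specification, and the allow-list here is
far shorter than the one in the patch.

```c
#include <stdint.h>

/* Feature bit numbers as assigned by the virtio specification. */
#define DEMO_VIRTIO_NET_F_CSUM		0
#define DEMO_VIRTIO_NET_F_MAC		5
#define DEMO_VIRTIO_NET_F_CTRL_VQ	17
#define DEMO_VIRTIO_F_VERSION_1		32

/* Hypothetical allow-list; note that CTRL_VQ is deliberately absent. */
#define DEMO_BACKEND_FEATURES \
	((1ULL << DEMO_VIRTIO_NET_F_CSUM) | \
	 (1ULL << DEMO_VIRTIO_NET_F_MAC) | \
	 (1ULL << DEMO_VIRTIO_F_VERSION_1))

/* Keep only the device features that the backend is willing to expose. */
static uint64_t demo_filter_features(uint64_t device_features)
{
	return device_features & DEMO_BACKEND_FEATURES;
}
```

A device offering CSUM, CTRL_VQ and VERSION_1 would be presented to
userspace with CSUM and VERSION_1 only.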

Signed-off-by: Tiwei Bie 
Signed-off-by: Eugenio Pérez 
Signed-off-by: Jason Wang 
---
 drivers/vhost/Kconfig|  11 +
 drivers/vhost/Makefile   |   3 +
 drivers/vhost/vdpa.c | 883 +++
 include/uapi/linux/vhost.h   |  24 +
 include/uapi/linux/vhost_types.h |   8 +
 5 files changed, 929 insertions(+)
 create mode 100644 drivers/vhost/vdpa.c

diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
index e76a72490563..0e0bc5292631 100644
--- a/drivers/vhost/Kconfig
+++ b/drivers/vhost/Kconfig
@@ -34,6 +34,17 @@ config VHOST_VSOCK
To compile this driver as a module, choose M here: the module will be 
called
vhost_vsock.
 
+config VHOST_VDPA
+   tristate "Vhost driver for vDPA-based backend"
+   depends on EVENTFD && VDPA
+   select VHOST
+   help
+ This kernel module can be loaded in host kernel to accelerate
+ guest virtio devices with the vDPA-based backends.
+
+ To compile this driver as a module, choose M here: the module
+ will be called vhost_vdpa.
+
 config VHOST
tristate
select VHOST_IOTLB
diff --git a/drivers/vhost/Makefile b/drivers/vhost/Makefile
index fb831002bcf0..f3e1897cce85 100644
--- a/drivers/vhost/Makefile
+++ b/drivers/vhost/Makefile
@@ -10,6 +10,9 @@ vhost_vsock-y := vsock.o
 
 obj-$(CONFIG_VHOST_RING) += vringh.o
 
+obj-$(CONFIG_VHOST_VDPA) += vhost_vdpa.o
+vhost_vdpa-y := vdpa.o
+
 obj-$(CONFIG_VHOST)+= vhost.o
 
 obj-$(CONFIG_VHOST_IOTLB) += vhost_iotlb.o
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
new file mode 100644
index ..421f02a8530a
--- /dev/null
+++ b/drivers/vhost/vdpa.c
@@ -0,0 +1,883 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2018-2020 Intel Corporation.
+ * Copyright (C) 2020 Red Hat, Inc.
+ *
+ * Author: Tiwei Bie 
+ * Jason Wang 
+ *
+ * Thanks Michael S. Tsirkin for the valuable comments and
+ * suggestions.  And thanks to Cunming Liang and Zhihong Wang for all
+ * their supports.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "vhost.h"
+
+enum {
+   VHOST_VDPA_FEATURES =
+   (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) |
+   (1ULL << VIRTIO_F_ANY_LAYOUT) |
+   (1ULL << VIRTIO_F_VERSION_1) |
+   (1ULL << VIRTIO_F_IOMMU_PLATFORM) |
+   (1ULL << VIRTIO_F_RING_PACKED) |
+   (1ULL << VIRTIO_F_ORDER_PLATFORM) |
+   (1ULL << VIRTIO_RING_F_INDIRECT_DESC) |
+   (1ULL << VIRTIO_RING_F_EVENT_IDX),
+
+   VHOST_VDPA_NET_FEATURES = VHOST_VDPA_FEATURES |
+   (1ULL << VIRTIO_NET_F_CSUM) |
+   (1ULL << VIRTIO_NET_F_GUEST_CSUM) |
+   (1ULL << VIRTIO_NET_F_MTU) |
+   (1ULL << VIRTIO_NET_F_MAC) |
+   (1ULL << VIRTIO_NET_F_GUEST_TSO4) |
+   (1ULL << VIRTIO_NET_F_GUEST_TSO6) |
+   (1ULL << VIRTIO_NET_F_GUEST_ECN) |
+   (1ULL << VIRTIO_NET_F_GUEST_UFO) |
+   (1ULL << VIRTIO_NET_F_HOST_TSO4) |
+   (1ULL << VIRTIO_NET_F_HOST_TSO6) |
+   (1ULL << VIRTIO_NET_F_HOST_ECN) |
+   (1ULL << VIRTIO_NET_F_HOST_UFO) |
+   (1ULL << VIRTIO_NET_F_MRG_RXBUF) |
+   (1ULL << VIRTIO_NET_F_STATUS) |
+   (1ULL << VIRTIO_NET_F_SPEED_DUPLEX),
+};
+
+/* 

[PATCH V6 5/8] virtio: introduce a vDPA based transport

2020-03-18 Thread Jason Wang
This patch introduces a vDPA transport for virtio. It allows the
kernel virtio driver to drive a vDPA device that is capable
of populating virtqueues directly.

A new virtio-vdpa driver will be registered to the vDPA bus. When a
new virtio-vdpa device is probed, it will register the device with
vdpa-based config ops. This means it is a software transport between
vDPA driver and vDPA device. The transport is implemented through
bus_ops of vDPA parent.

Signed-off-by: Jason Wang 
---
 drivers/virtio/Kconfig   |  13 ++
 drivers/virtio/Makefile  |   1 +
 drivers/virtio/virtio_vdpa.c | 396 +++
 3 files changed, 410 insertions(+)
 create mode 100644 drivers/virtio/virtio_vdpa.c

diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index 9c4fdb64d9ac..99e424570644 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -43,6 +43,19 @@ config VIRTIO_PCI_LEGACY
 
  If unsure, say Y.
 
+config VIRTIO_VDPA
+   tristate "vDPA driver for virtio devices"
+   select VDPA
+   select VIRTIO
+   help
+ This driver provides support for virtio based paravirtual
+ device driver over vDPA bus. For this to be useful, you need
+ an appropriate vDPA device implementation that operates on a
+ physical device to allow the datapath of virtio to be
+ offloaded to hardware.
+
+ If unsure, say M.
+
 config VIRTIO_PMEM
tristate "Support for virtio pmem driver"
depends on VIRTIO
diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
index fdf5eacd0d0a..3407ac03fe60 100644
--- a/drivers/virtio/Makefile
+++ b/drivers/virtio/Makefile
@@ -6,4 +6,5 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
 virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o
 obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o
 obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o
+obj-$(CONFIG_VIRTIO_VDPA) += virtio_vdpa.o
 obj-$(CONFIG_VDPA) += vdpa/
diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
new file mode 100644
index ..c30eb55030be
--- /dev/null
+++ b/drivers/virtio/virtio_vdpa.c
@@ -0,0 +1,396 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * VIRTIO based driver for vDPA device
+ *
+ * Copyright (c) 2020, Red Hat. All rights reserved.
+ * Author: Jason Wang 
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define MOD_VERSION  "0.1"
+#define MOD_AUTHOR   "Jason Wang "
+#define MOD_DESC "vDPA bus driver for virtio devices"
+#define MOD_LICENSE  "GPL v2"
+
+struct virtio_vdpa_device {
+   struct virtio_device vdev;
+   struct vdpa_device *vdpa;
+   u64 features;
+
+   /* The lock to protect virtqueue list */
+   spinlock_t lock;
+   /* List of virtio_vdpa_vq_info */
+   struct list_head virtqueues;
+};
+
+struct virtio_vdpa_vq_info {
+   /* the actual virtqueue */
+   struct virtqueue *vq;
+
+   /* the list node for the virtqueues list */
+   struct list_head node;
+};
+
+static inline struct virtio_vdpa_device *
+to_virtio_vdpa_device(struct virtio_device *dev)
+{
+   return container_of(dev, struct virtio_vdpa_device, vdev);
+}
+
+static struct vdpa_device *vd_get_vdpa(struct virtio_device *vdev)
+{
+   return to_virtio_vdpa_device(vdev)->vdpa;
+}
+
+static void virtio_vdpa_get(struct virtio_device *vdev, unsigned offset,
+   void *buf, unsigned len)
+{
+   struct vdpa_device *vdpa = vd_get_vdpa(vdev);
+   const struct vdpa_config_ops *ops = vdpa->config;
+
+   ops->get_config(vdpa, offset, buf, len);
+}
+
+static void virtio_vdpa_set(struct virtio_device *vdev, unsigned offset,
+   const void *buf, unsigned len)
+{
+   struct vdpa_device *vdpa = vd_get_vdpa(vdev);
+   const struct vdpa_config_ops *ops = vdpa->config;
+
+   ops->set_config(vdpa, offset, buf, len);
+}
+
+static u32 virtio_vdpa_generation(struct virtio_device *vdev)
+{
+   struct vdpa_device *vdpa = vd_get_vdpa(vdev);
+   const struct vdpa_config_ops *ops = vdpa->config;
+
+   if (ops->get_generation)
+   return ops->get_generation(vdpa);
+
+   return 0;
+}
+
+static u8 virtio_vdpa_get_status(struct virtio_device *vdev)
+{
+   struct vdpa_device *vdpa = vd_get_vdpa(vdev);
+   const struct vdpa_config_ops *ops = vdpa->config;
+
+   return ops->get_status(vdpa);
+}
+
+static void virtio_vdpa_set_status(struct virtio_device *vdev, u8 status)
+{
+   struct vdpa_device *vdpa = vd_get_vdpa(vdev);
+   const struct vdpa_config_ops *ops = vdpa->config;
+
+   return ops->set_status(vdpa, status);
+}
+
+static void virtio_vdpa_reset(struct virtio_device *vdev)
+{
+   struct vdpa_device *vdpa = vd_get_vdpa(vdev);
+   const struct vdpa_config_ops *ops = vdpa->config;
+
+   return ops->set_status(vdpa, 0);
+}
+
+static bool 

[PATCH V6 4/8] vDPA: introduce vDPA bus

2020-03-18 Thread Jason Wang
vDPA device is a device that uses a datapath which complies with the
virtio specifications with vendor specific control path. vDPA devices
can be both physically located on the hardware or emulated by
software. vDPA hardware devices are usually implemented through PCIE
with the following types:

- PF (Physical Function) - A single Physical Function
- VF (Virtual Function) - Device that supports single root I/O
  virtualization (SR-IOV). Its Virtual Function (VF) represents a
  virtualized instance of the device that can be assigned to different
  partitions
- ADI (Assignable Device Interface) and its equivalents - With
  technologies such as Intel Scalable IOV, a virtual device (VDEV)
  composed by host OS utilizing one or more ADIs. Or its equivalent
  like SF (Sub function) from Mellanox.

From a driver's perspective, depending on how and where the DMA
translation is done, vDPA devices are split into two types:

- Platform specific DMA translation - From the driver's perspective,
  the device can be used on a platform where device access to data in
  memory is limited and/or translated. An example is a PCIE vDPA whose
  DMA request was tagged via a bus (e.g PCIE) specific way. DMA
  translation and protection are done at PCIE bus IOMMU level.
- Device specific DMA translation - The device implements DMA
  isolation and protection through its own logic. An example is a vDPA
  device which uses on-chip IOMMU.

To hide the differences and complexity of the above types for a vDPA
device/IOMMU options and in order to present a generic virtio device
to the upper layer, a device agnostic framework is required.

This patch introduces a software vDPA bus which abstracts the
common attributes of vDPA device, vDPA bus driver and the
communication method (vdpa_config_ops) between the vDPA device
abstraction and the vDPA bus driver. This allows multiple types of
drivers, such as virtio_vdpa and vhost_vdpa, to operate on the bus,
so that a vDPA device can be used by either kernel virtio drivers or
userspace vhost drivers:

   virtio drivers      vhost drivers
          |                  |
    [virtio bus]        [vhost uAPI]
          |                  |
   virtio device       vhost device
   virtio_vdpa drv     vhost_vdpa drv
             \            /
              [vDPA bus]
                   |
              vDPA device
              hardware drv
                   |
             [hardware bus]
                   |
              vDPA hardware

With the abstraction of vDPA bus and vDPA bus operations, the
differences and complexity of the underlying hardware are hidden from
the upper layer. The vDPA bus drivers on top can use a unified
vdpa_config_ops to control different types of vDPA device.
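
The idea of one ops table serving different device types can be
sketched in userspace as follows. Everything prefixed demo_, sim_ or
hw_ is hypothetical, and the table is reduced to two hooks; it is not
the vdpa_config_ops definition from include/linux/vdpa.h.

```c
#include <stdint.h>

struct demo_vdpa_dev;

/* Hypothetical, reduced ops table with only two hooks. */
struct demo_config_ops {
	uint8_t (*get_status)(struct demo_vdpa_dev *dev);
	void (*set_status)(struct demo_vdpa_dev *dev, uint8_t status);
};

struct demo_vdpa_dev {
	const struct demo_config_ops *config;
	uint8_t status;		/* backing store for the status byte */
	unsigned int hw_writes;	/* register writes seen by the "hardware" */
};

/* Software device: status lives in plain memory. */
static uint8_t sim_get_status(struct demo_vdpa_dev *dev)
{
	return dev->status;
}

static void sim_set_status(struct demo_vdpa_dev *dev, uint8_t status)
{
	dev->status = status;
}

/* "Hardware" device: same contract, but every write is also counted. */
static uint8_t hw_get_status(struct demo_vdpa_dev *dev)
{
	return dev->status;
}

static void hw_set_status(struct demo_vdpa_dev *dev, uint8_t status)
{
	dev->hw_writes++;
	dev->status = status;
}

static const struct demo_config_ops sim_ops = {
	.get_status = sim_get_status,
	.set_status = sim_set_status,
};

static const struct demo_config_ops hw_ops = {
	.get_status = hw_get_status,
	.set_status = hw_set_status,
};

/* Bus-driver side: resets any device without knowing its type. */
static void demo_bus_reset(struct demo_vdpa_dev *dev)
{
	dev->config->set_status(dev, 0);
}
```

demo_bus_reset() plays the role of a bus driver: it only ever touches
the ops table, so the same code path works for both devices.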

Signed-off-by: Jason Wang 
---
 MAINTAINERS  |   1 +
 drivers/virtio/Kconfig   |   2 +
 drivers/virtio/Makefile  |   1 +
 drivers/virtio/vdpa/Kconfig  |   7 ++
 drivers/virtio/vdpa/Makefile |   2 +
 drivers/virtio/vdpa/vdpa.c   | 174 ++
 include/linux/vdpa.h | 232 +++
 7 files changed, 419 insertions(+)
 create mode 100644 drivers/virtio/vdpa/Kconfig
 create mode 100644 drivers/virtio/vdpa/Makefile
 create mode 100644 drivers/virtio/vdpa/vdpa.c
 create mode 100644 include/linux/vdpa.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 0fb645b5a7df..2b8d9fa38d9a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17701,6 +17701,7 @@ F:  tools/virtio/
 F: drivers/net/virtio_net.c
 F: drivers/block/virtio_blk.c
 F: include/linux/virtio*.h
+F: include/linux/vdpa.h
 F: include/uapi/linux/virtio_*.h
 F: drivers/crypto/virtio/
 F: mm/balloon_compaction.c
diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index 078615cf2afc..9c4fdb64d9ac 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -96,3 +96,5 @@ config VIRTIO_MMIO_CMDLINE_DEVICES
 If unsure, say 'N'.
 
 endif # VIRTIO_MENU
+
+source "drivers/virtio/vdpa/Kconfig"
diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
index 3a2b5c5dcf46..fdf5eacd0d0a 100644
--- a/drivers/virtio/Makefile
+++ b/drivers/virtio/Makefile
@@ -6,3 +6,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
 virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o
 obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o
 obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o
+obj-$(CONFIG_VDPA) += vdpa/
diff --git a/drivers/virtio/vdpa/Kconfig b/drivers/virtio/vdpa/Kconfig
new file mode 100644
index ..351617723d12
--- /dev/null
+++ b/drivers/virtio/vdpa/Kconfig
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0-only
+config VDPA
+   tristate
+   help
+ Enable this module to support vDPA device that uses a
+ datapath which complies with virtio specifications with
+ vendor specific control path.
diff --git a/drivers/virtio/vdpa/Makefile b/drivers/virtio/vdpa/Makefile
new file mode 100644
index ..ee6a35e8a4fb
--- 

[PATCH V6 7/8] vdpasim: vDPA device simulator

2020-03-18 Thread Jason Wang
This patch implements a software vDPA networking device. The datapath
is implemented through vringh and workqueue. The device has an on-chip
IOMMU which translates IOVA to PA. For kernel virtio drivers, the vDPA
simulator driver provides dma_ops. For vhost drivers, the set_map()
method of vdpa_config_ops is implemented to accept mappings from vhost.

Currently, the vDPA device simulator loops TX traffic back to RX, so
the main use case for the device is vDPA feature testing, prototyping
and development.
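
As a rough userspace illustration of the loopback behaviour described
above (not the simulator's actual work function, which goes through
vringh), the sketch below copies one TX frame into an RX buffer. The
name sketch_loopback() and the choice to drop frames that do not fit
are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: deliver one transmitted frame to the receive
 * buffer. Returns the number of bytes delivered, or 0 when the frame
 * does not fit and is dropped (an assumption of this sketch).
 */
static size_t sketch_loopback(const void *tx, size_t tx_len,
			      void *rx, size_t rx_cap)
{
	if (tx_len > rx_cap)
		return 0;

	/* TX payload becomes RX payload unchanged. */
	memcpy(rx, tx, tx_len);
	return tx_len;
}
```

A caller would hand it the buffer pulled from the TX queue and the
buffer pulled from the RX queue, then complete both descriptors.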

Note that no management API is implemented yet: a vDPA device is
registered as soon as the module is probed. This needs to be handled in
future development.

Signed-off-by: Jason Wang 
---
 drivers/virtio/vdpa/Kconfig |  18 +
 drivers/virtio/vdpa/Makefile|   1 +
 drivers/virtio/vdpa/vdpa_sim/Makefile   |   2 +
 drivers/virtio/vdpa/vdpa_sim/vdpa_sim.c | 646 
 4 files changed, 667 insertions(+)
 create mode 100644 drivers/virtio/vdpa/vdpa_sim/Makefile
 create mode 100644 drivers/virtio/vdpa/vdpa_sim/vdpa_sim.c

diff --git a/drivers/virtio/vdpa/Kconfig b/drivers/virtio/vdpa/Kconfig
index 351617723d12..9baa1d8da002 100644
--- a/drivers/virtio/vdpa/Kconfig
+++ b/drivers/virtio/vdpa/Kconfig
@@ -5,3 +5,21 @@ config VDPA
  Enable this module to support vDPA device that uses a
  datapath which complies with virtio specifications with
  vendor specific control path.
+
+menuconfig VDPA_MENU
+   bool "VDPA drivers"
+   default n
+
+if VDPA_MENU
+
+config VDPA_SIM
+   tristate "vDPA device simulator"
+   select VDPA
+   depends on RUNTIME_TESTING_MENU && VHOST_RING
+   default n
+   help
+ vDPA networking device simulator which loops TX traffic back
+ to RX. This device is used for testing, prototyping and
+ development of vDPA.
+
+endif # VDPA_MENU
diff --git a/drivers/virtio/vdpa/Makefile b/drivers/virtio/vdpa/Makefile
index ee6a35e8a4fb..3814af8e097b 100644
--- a/drivers/virtio/vdpa/Makefile
+++ b/drivers/virtio/vdpa/Makefile
@@ -1,2 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_VDPA) += vdpa.o
+obj-$(CONFIG_VDPA_SIM) += vdpa_sim/
diff --git a/drivers/virtio/vdpa/vdpa_sim/Makefile 
b/drivers/virtio/vdpa/vdpa_sim/Makefile
new file mode 100644
index ..b40278f65e04
--- /dev/null
+++ b/drivers/virtio/vdpa/vdpa_sim/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_VDPA_SIM) += vdpa_sim.o
diff --git a/drivers/virtio/vdpa/vdpa_sim/vdpa_sim.c 
b/drivers/virtio/vdpa/vdpa_sim/vdpa_sim.c
new file mode 100644
index ..da2e9964c48f
--- /dev/null
+++ b/drivers/virtio/vdpa/vdpa_sim/vdpa_sim.c
@@ -0,0 +1,646 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * VDPA networking device simulator.
+ *
+ * Copyright (c) 2020, Red Hat Inc. All rights reserved.
+ * Author: Jason Wang 
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define DRV_VERSION  "0.1"
+#define DRV_AUTHOR   "Jason Wang "
+#define DRV_DESC "vDPA Device Simulator"
+#define DRV_LICENSE  "GPL v2"
+
+struct vdpasim_virtqueue {
+   struct vringh vring;
+   struct vringh_kiov iov;
+   unsigned short head;
+   bool ready;
+   u64 desc_addr;
+   u64 device_addr;
+   u64 driver_addr;
+   u32 num;
+   void *private;
+   irqreturn_t (*cb)(void *data);
+};
+
+#define VDPASIM_QUEUE_ALIGN PAGE_SIZE
+#define VDPASIM_QUEUE_MAX 256
+#define VDPASIM_DEVICE_ID 0x1
+#define VDPASIM_VENDOR_ID 0
+#define VDPASIM_VQ_NUM 0x2
+#define VDPASIM_NAME "vdpasim-netdev"
+
+static u64 vdpasim_features = (1ULL << VIRTIO_F_ANY_LAYOUT) |
+ (1ULL << VIRTIO_F_VERSION_1)  |
+ (1ULL << VIRTIO_F_IOMMU_PLATFORM);
+
+/* State of each vdpasim device */
+struct vdpasim {
+   struct vdpasim_virtqueue vqs[2];
+   struct work_struct work;
+   /* spinlock to synchronize virtqueue state */
+   spinlock_t lock;
+   struct vdpa_device *vdpa;
+   struct device dev;
+   struct virtio_net_config config;
+   struct vhost_iotlb *iommu;
+   void *buffer;
+   u32 status;
+   u32 generation;
+   u64 features;
+};
+
+static struct vdpasim *vdpasim_dev;
+
+static struct vdpasim *dev_to_sim(struct device *dev)
+{
+   return container_of(dev, struct vdpasim, dev);
+}
+
+static struct vdpasim *vdpa_to_sim(struct vdpa_device *vdpa)
+{
+   struct device *d = &vdpa->dev;
+
+   return dev_to_sim(d->parent);
+}
+
+static void vdpasim_queue_ready(struct vdpasim *vdpasim, unsigned int idx)
+{
+   struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx];
+   int ret;
+
+   ret = vringh_init_iotlb(&vq->vring, vdpasim_features,
+   VDPASIM_QUEUE_MAX, false,
+   (struct vring_desc 

[PATCH V6 2/8] vhost: factor out IOTLB

2020-03-18 Thread Jason Wang
This patch factors out IOTLB into a dedicated module so that it can be
reused by other modules like vringh. Users may enable automatic
retiring by specifying the VHOST_IOTLB_FLAG_RETIRE flag, which fits
the vhost device IOTLB implementation.
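
The retiring behaviour can be pictured with a small userspace sketch.
It is only an illustration under simplifying assumptions: a fixed array
in insertion order stands in for the interval tree and list used by the
patch, and every sketch_* / SKETCH_* name is hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_FLAG_RETIRE 0x1	/* stand-in for VHOST_IOTLB_FLAG_RETIRE */
#define SKETCH_MAX_MAPS    8	/* capacity of this toy table */

struct sketch_map {
	uint64_t start, last, addr;
};

struct sketch_iotlb {
	struct sketch_map maps[SKETCH_MAX_MAPS];	/* oldest entry first */
	unsigned int nmaps;
	unsigned int limit;	/* 0 means no retire threshold */
	unsigned int flags;
};

/* Returns 0 on success, -1 on an invalid range or a full table. */
static int sketch_add_range(struct sketch_iotlb *t, uint64_t start,
			    uint64_t last, uint64_t addr)
{
	if (last < start)
		return -1;

	/* At the limit with retiring enabled: drop the oldest mapping. */
	if (t->limit && t->nmaps == t->limit &&
	    (t->flags & SKETCH_FLAG_RETIRE)) {
		memmove(&t->maps[0], &t->maps[1],
			(t->nmaps - 1) * sizeof(t->maps[0]));
		t->nmaps--;
	}

	if (t->nmaps == SKETCH_MAX_MAPS)
		return -1;

	t->maps[t->nmaps].start = start;
	t->maps[t->nmaps].last = last;
	t->maps[t->nmaps].addr = addr;
	t->nmaps++;
	return 0;
}
```

With limit set to 2 and the flag enabled, a third insertion evicts the
first mapping and the table keeps the two most recent ranges.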

Signed-off-by: Jason Wang 
---
 MAINTAINERS |   1 +
 drivers/vhost/Kconfig   |   6 +
 drivers/vhost/Makefile  |   3 +
 drivers/vhost/iotlb.c   | 177 +
 drivers/vhost/net.c |   2 +-
 drivers/vhost/vhost.c   | 221 +++-
 drivers/vhost/vhost.h   |  39 ++-
 include/linux/vhost_iotlb.h |  47 
 8 files changed, 315 insertions(+), 181 deletions(-)
 create mode 100644 drivers/vhost/iotlb.c
 create mode 100644 include/linux/vhost_iotlb.h

diff --git a/MAINTAINERS b/MAINTAINERS
index c74e4ea714a5..0fb645b5a7df 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17768,6 +17768,7 @@ T:  git 
git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git
 S: Maintained
 F: drivers/vhost/
 F: include/uapi/linux/vhost.h
+F: include/linux/vhost_iotlb.h
 
 VIRTIO INPUT DRIVER
 M: Gerd Hoffmann 
diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
index 3d03ccbd1adc..e76a72490563 100644
--- a/drivers/vhost/Kconfig
+++ b/drivers/vhost/Kconfig
@@ -36,6 +36,7 @@ config VHOST_VSOCK
 
 config VHOST
tristate
+   select VHOST_IOTLB
---help---
  This option is selected by any driver which needs to access
  the core of vhost.
@@ -54,3 +55,8 @@ config VHOST_CROSS_ENDIAN_LEGACY
  adds some overhead, it is disabled by default.
 
  If unsure, say "N".
+
+config VHOST_IOTLB
+   tristate
+   help
+ Generic IOTLB implementation for vhost and vringh.
diff --git a/drivers/vhost/Makefile b/drivers/vhost/Makefile
index 6c6df24f770c..fb831002bcf0 100644
--- a/drivers/vhost/Makefile
+++ b/drivers/vhost/Makefile
@@ -11,3 +11,6 @@ vhost_vsock-y := vsock.o
 obj-$(CONFIG_VHOST_RING) += vringh.o
 
 obj-$(CONFIG_VHOST)+= vhost.o
+
+obj-$(CONFIG_VHOST_IOTLB) += vhost_iotlb.o
+vhost_iotlb-y := iotlb.o
diff --git a/drivers/vhost/iotlb.c b/drivers/vhost/iotlb.c
new file mode 100644
index ..1f0ca6e44410
--- /dev/null
+++ b/drivers/vhost/iotlb.c
@@ -0,0 +1,177 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (C) 2020 Red Hat, Inc.
+ * Author: Jason Wang 
+ *
+ * IOTLB implementation for vhost.
+ */
+#include 
+#include 
+#include 
+
+#define MOD_VERSION  "0.1"
+#define MOD_DESC "VHOST IOTLB"
+#define MOD_AUTHOR   "Jason Wang "
+#define MOD_LICENSE  "GPL v2"
+
+#define START(map) ((map)->start)
+#define LAST(map) ((map)->last)
+
+INTERVAL_TREE_DEFINE(struct vhost_iotlb_map,
+rb, __u64, __subtree_last,
+START, LAST, static inline, vhost_iotlb_itree);
+
+/**
+ * vhost_iotlb_map_free - remove a map node and free it
+ * @iotlb: the IOTLB
+ * @map: the map to be removed and freed
+ */
+void vhost_iotlb_map_free(struct vhost_iotlb *iotlb,
+ struct vhost_iotlb_map *map)
+{
+   vhost_iotlb_itree_remove(map, &iotlb->root);
+   list_del(&map->link);
+   kfree(map);
+   iotlb->nmaps--;
+}
+EXPORT_SYMBOL_GPL(vhost_iotlb_map_free);
+
+/**
+ * vhost_iotlb_add_range - add a new range to vhost IOTLB
+ * @iotlb: the IOTLB
+ * @start: start of the IOVA range
+ * @last: last of IOVA range
+ * @addr: the address that is mapped to @start
+ * @perm: access permission of this range
+ *
+ * Returns an error if last is smaller than start or memory allocation
+ * fails
+ */
+int vhost_iotlb_add_range(struct vhost_iotlb *iotlb,
+ u64 start, u64 last,
+ u64 addr, unsigned int perm)
+{
+   struct vhost_iotlb_map *map;
+
+   if (last < start)
+   return -EFAULT;
+
+   if (iotlb->limit &&
+   iotlb->nmaps == iotlb->limit &&
+   iotlb->flags & VHOST_IOTLB_FLAG_RETIRE) {
+   map = list_first_entry(&iotlb->list, typeof(*map), link);
+   vhost_iotlb_map_free(iotlb, map);
+   }
+
+   map = kmalloc(sizeof(*map), GFP_ATOMIC);
+   if (!map)
+   return -ENOMEM;
+
+   map->start = start;
+   map->size = last - start + 1;
+   map->last = last;
+   map->addr = addr;
+   map->perm = perm;
+
+   iotlb->nmaps++;
+   vhost_iotlb_itree_insert(map, &iotlb->root);
+
+   INIT_LIST_HEAD(&map->link);
+   list_add_tail(&map->link, &iotlb->list);
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(vhost_iotlb_add_range);
+
+/**
+ * vhost_iotlb_del_range - delete overlapped ranges from vhost IOTLB
+ * @iotlb: the IOTLB
+ * @start: start of the IOVA range
+ * @last: last of IOVA range
+ */
+void vhost_iotlb_del_range(struct vhost_iotlb *iotlb, u64 start, u64 last)
+{
+   struct vhost_iotlb_map *map;
+
+   while ((map = vhost_iotlb_itree_iter_first(&iotlb->root,
+  start, last)))
+   

[PATCH V6 3/8] vringh: IOTLB support

2020-03-18 Thread Jason Wang
This patch implements the third memory accessor for vringh besides
the current kernel and userspace accessors. The idea is to allow vringh
to do the address translation through an IOTLB which is implemented
via the vhost_map interval tree. Users should set up an IOVA to PA
mapping in this IOTLB.

This allows us to:

- Use vringh to access virtqueues with vIOMMU
- Use vringh to implement software virtqueues for vDPA devices
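
To make the translation step concrete, here is a minimal userspace
sketch of an IOVA to PA lookup. It is not the vringh code: a linear
scan over an array replaces the interval tree, and the sketch_* names
are hypothetical. It also reports how many bytes are contiguous within
the matching mapping, since an access may cross a mapping boundary.

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_range {
	uint64_t start, last;	/* inclusive IOVA range */
	uint64_t addr;		/* address that 'start' maps to */
};

/*
 * Translate 'iova' through the first mapping that covers it.
 * On success returns 0, stores the translated address in *pa and the
 * number of bytes (at most 'len') that are contiguous in that mapping
 * in *avail. Returns -1 when no mapping covers 'iova'.
 */
static int sketch_translate(const struct sketch_range *maps, size_t n,
			    uint64_t iova, uint64_t len,
			    uint64_t *pa, uint64_t *avail)
{
	for (size_t i = 0; i < n; i++) {
		const struct sketch_range *m = &maps[i];
		uint64_t room_minus1;

		if (iova < m->start || iova > m->last)
			continue;

		*pa = m->addr + (iova - m->start);

		/* Compare against room - 1 to avoid overflow at 2^64. */
		room_minus1 = m->last - iova;
		if (len == 0)
			*avail = 0;
		else if (len - 1 <= room_minus1)
			*avail = len;
		else
			*avail = room_minus1 + 1;
		return 0;
	}
	return -1;
}
```

A caller that gets back fewer bytes than requested would translate
again starting at iova + *avail to pick up the next mapping.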

Signed-off-by: Jason Wang 
---
 drivers/vhost/Kconfig.vringh |   1 +
 drivers/vhost/vringh.c   | 421 +--
 include/linux/vringh.h   |  36 +++
 3 files changed, 435 insertions(+), 23 deletions(-)

diff --git a/drivers/vhost/Kconfig.vringh b/drivers/vhost/Kconfig.vringh
index c1fe36a9b8d4..a8d4dd0cb06e 100644
--- a/drivers/vhost/Kconfig.vringh
+++ b/drivers/vhost/Kconfig.vringh
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 config VHOST_RING
tristate
+   select VHOST_IOTLB
---help---
  This option is selected by any driver which needs to access
  the host side of a virtio ring.
diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
index a0a2d74967ef..ee0491f579ac 100644
--- a/drivers/vhost/vringh.c
+++ b/drivers/vhost/vringh.c
@@ -13,6 +13,9 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
 #include 
 
 static __printf(1,2) __cold void vringh_bad(const char *fmt, ...)
@@ -71,9 +74,11 @@ static inline int __vringh_get_head(const struct vringh *vrh,
 }
 
 /* Copy some bytes to/from the iovec.  Returns num copied. */
-static inline ssize_t vringh_iov_xfer(struct vringh_kiov *iov,
+static inline ssize_t vringh_iov_xfer(struct vringh *vrh,
+ struct vringh_kiov *iov,
  void *ptr, size_t len,
- int (*xfer)(void *addr, void *ptr,
+ int (*xfer)(const struct vringh *vrh,
+ void *addr, void *ptr,
  size_t len))
 {
int err, done = 0;
@@ -82,7 +87,7 @@ static inline ssize_t vringh_iov_xfer(struct vringh_kiov *iov,
size_t partlen;
 
partlen = min(iov->iov[iov->i].iov_len, len);
-   err = xfer(iov->iov[iov->i].iov_base, ptr, partlen);
+   err = xfer(vrh, iov->iov[iov->i].iov_base, ptr, partlen);
if (err)
return err;
done += partlen;
@@ -96,6 +101,7 @@ static inline ssize_t vringh_iov_xfer(struct vringh_kiov 
*iov,
/* Fix up old iov element then increment. */
iov->iov[iov->i].iov_len = iov->consumed;
iov->iov[iov->i].iov_base -= iov->consumed;
+

iov->consumed = 0;
iov->i++;
@@ -227,7 +233,8 @@ static int slow_copy(struct vringh *vrh, void *dst, const 
void *src,
  u64 addr,
  struct vringh_range *r),
 struct vringh_range *range,
-int (*copy)(void *dst, const void *src, size_t len))
+int (*copy)(const struct vringh *vrh,
+void *dst, const void *src, size_t len))
 {
size_t part, len = sizeof(struct vring_desc);
 
@@ -241,7 +248,7 @@ static int slow_copy(struct vringh *vrh, void *dst, const 
void *src,
if (!rcheck(vrh, addr, &part, range, getrange))
return -EINVAL;
 
-   err = copy(dst, src, part);
+   err = copy(vrh, dst, src, part);
if (err)
return err;
 
@@ -262,7 +269,8 @@ __vringh_iov(struct vringh *vrh, u16 i,
 struct vringh_range *)),
 bool (*getrange)(struct vringh *, u64, struct vringh_range *),
 gfp_t gfp,
-int (*copy)(void *dst, const void *src, size_t len))
+int (*copy)(const struct vringh *vrh,
+void *dst, const void *src, size_t len))
 {
int err, count = 0, up_next, desc_max;
struct vring_desc desc, *descs;
@@ -291,7 +299,7 @@ __vringh_iov(struct vringh *vrh, u16 i,
err = slow_copy(vrh, &desc, &descs[i], rcheck, getrange,
&slowrange, copy);
else
-   err = copy(&desc, &descs[i], sizeof(desc));
+   err = copy(vrh, &desc, &descs[i], sizeof(desc));
if (unlikely(err))
goto fail;
 
@@ -404,7 +412,8 @@ static inline int __vringh_complete(struct vringh *vrh,
unsigned int num_used,
int (*putu16)(const struct vringh *vrh,
  __virtio16 *p, u16 val),
-   

[PATCH V6 0/8] vDPA support

2020-03-18 Thread Jason Wang
Hi all:

This is an updated version of vDPA support in the kernel.

A vDPA device is a device that uses a datapath which complies with the
virtio specifications together with a vendor specific control path. vDPA
devices can be either physically located on the hardware or emulated by
software. vDPA hardware devices are usually implemented through PCIE
with the following types:

- PF (Physical Function) - A single Physical Function
- VF (Virtual Function) - Device that supports single root I/O
  virtualization (SR-IOV). Its Virtual Function (VF) represents a
  virtualized instance of the device that can be assigned to different
  partitions
- ADI (Assignable Device Interface) and its equivalents - With
  technologies such as Intel Scalable IOV, a virtual device (VDEV) is
  composed by the host OS utilizing one or more ADIs. An equivalent is
  the SF (Sub function) from Mellanox.

From a driver's perspective, depending on how and where the DMA
translation is done, vDPA devices are split into two types:

- Platform specific DMA translation - From the driver's perspective,
  the device can be used on a platform where device access to data in
  memory is limited and/or translated. An example is a PCIE vDPA whose
  DMA requests are tagged in a bus (e.g. PCIE) specific way. DMA
  translation and protection are done at the PCIE bus IOMMU level.
- Device specific DMA translation - The device implements DMA
  isolation and protection through its own logic. An example is a vDPA
  device which uses an on-chip IOMMU.

To hide the differences and complexity of the above vDPA device and
IOMMU options, and to present a generic virtio device to the upper
layer, a device agnostic framework is required.

This series introduces a software vDPA bus which abstracts the
common attributes of a vDPA device, the vDPA bus driver and the
communication method between them, the bus operations
(vdpa_config_ops). This allows multiple types of drivers, such as the
virtio_vdpa and vhost_vdpa drivers, to operate on the bus, so that a
vDPA device can be used by either kernel virtio drivers or userspace
vhost drivers as:

   virtio drivers  vhost drivers
  | |
[virtio bus]   [vhost uAPI]
  | |
   virtio device   vhost device
   virtio_vdpa drv vhost_vdpa drv
 \   /
[vDPA bus]
 |
vDPA device
hardware drv
 |
[hardware bus]
 |
vDPA hardware

The virtio_vdpa driver is a transport implementation for kernel virtio
drivers on top of vDPA bus operations. An alternative is to refactor
the virtio bus, which is sub-optimal: since the bus and drivers are
designed to be used by kernel subsystems, a non-trivial major
refactoring would be needed, which may impact a bunch of driver and
device implementations inside the kernel. Using a new transport may
greatly simplify both the design and the changes.

The vhost_vdpa driver is a new type of vhost device which allows userspace
vhost drivers to use vDPA devices via the vhost uAPI (with minor
extensions). This helps to minimize the changes to existing vhost drivers
for using vDPA devices.

With the abstraction of the vDPA bus and vDPA bus operations, the
differences and complexity of the underlying hardware are hidden from
the upper layer. The vDPA bus drivers on top can use a unified
vdpa_config_ops to control different types of vDPA devices.
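
The layering can be sketched in plain C as an operations table that the
bus driver calls without knowing what sits underneath. This is a much
reduced illustration, not the vdpa_config_ops defined by this series:
the structure, its two callbacks and all sketch_* / sim_* names are
hypothetical.

```c
#include <stdint.h>

#define SKETCH_STATUS_FEATURES_OK 8	/* illustrative status bit */

struct sketch_vdpa_device;

/* Hypothetical, reduced stand-in for a config ops table. */
struct sketch_config_ops {
	uint64_t (*get_features)(struct sketch_vdpa_device *dev);
	void (*set_status)(struct sketch_vdpa_device *dev, uint8_t status);
};

struct sketch_vdpa_device {
	const struct sketch_config_ops *ops;
	uint64_t features;	/* what the device offers */
	uint8_t status;
};

/* Device side: a software device that keeps its state in memory. */
static uint64_t sim_get_features(struct sketch_vdpa_device *dev)
{
	return dev->features;
}

static void sim_set_status(struct sketch_vdpa_device *dev, uint8_t status)
{
	dev->status = status;
}

static const struct sketch_config_ops sim_ops = {
	.get_features = sim_get_features,
	.set_status = sim_set_status,
};

/* Bus driver side: touches the device only through its ops. */
static uint64_t sketch_bus_negotiate(struct sketch_vdpa_device *dev,
				     uint64_t driver_features)
{
	uint64_t common = dev->ops->get_features(dev) & driver_features;

	dev->ops->set_status(dev, SKETCH_STATUS_FEATURES_OK);
	return common;
}
```

A hardware-backed device would supply a different ops table whose
callbacks access device registers, while sketch_bus_negotiate() stays
the same.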

Two drivers were implemented with the framework introduced in this
series:

- Intel IFC VF driver which depends on the platform IOMMU for DMA
  translation
- vDPA simulator which is a software test device with an emulated
  on-chip IOMMU

Future work:

- direct doorbell mapping support
- control virtqueue support
- dirty page tracking support
- direct interrupt support
- management API (devlink)

Please review.

Thanks

Changes from V5:

- include Intel IFCVF driver and vhost-vdpa drivers
- add the platform IOMMU support for vhost-vdpa
- check the return value of dev_set_name() (Jason)
- various tweaks and fixes

Changes from V4:

- use put_device() instead of kfree() when failing to register the
  virtio device (Jason)
- simplify the error handling when allocating vdpasim device (Jason)
- don't use device_for_each_child() during module exit (Jason)
- correct the error checking for vdpa_alloc_device() (Harpreet, Lingshan)

Changes from V3:

- various Kconfig fixes (Randy)

Changes from V2:

- release idr in the release function for put_device() unwind (Jason)
- don't panic when failing to register the vdpa bus (Jason)
- use unsigned int instead of int for ida (Jason)
- fix the wrong commit log in virtio_vdpa patches (Jason)
- make vdpa_sim depend on RUNTIME_TESTING_MENU (Michael)
- provide a bus release function for vDPA device (Jason)
- fix the wrong unwind when creating devices for vDPA simulator (Jason)
- move vDPA simulator to a dedicated directory (Lingshan)
- cancel the work before releasing the vDPA simulator

Changes 

[PATCH V6 1/8] vhost: allow per device message handler

2020-03-18 Thread Jason Wang
This patch allows a device to register its own message handler during
vhost_dev_init(). vDPA devices will use it to implement their own DMA
mapping logic.
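
The pattern is an optional callback with a built-in fallback. The
userspace sketch below shows only that dispatch shape; the sketch_*
types and the call counters are hypothetical and exist just to make the
behaviour observable.

```c
#include <stddef.h>

struct sketch_dev;

struct sketch_msg {
	int type;
};

typedef int (*sketch_msg_handler_t)(struct sketch_dev *dev,
				    struct sketch_msg *msg);

struct sketch_dev {
	sketch_msg_handler_t msg_handler;	/* NULL selects the default */
	int default_calls;
	int custom_calls;
};

/* Generic processing used when the device registers no handler. */
static int sketch_default_handler(struct sketch_dev *dev,
				  struct sketch_msg *msg)
{
	(void)msg;
	dev->default_calls++;
	return 0;
}

/* Example of a device specific handler. */
static int sketch_custom_handler(struct sketch_dev *dev,
				 struct sketch_msg *msg)
{
	(void)msg;
	dev->custom_calls++;
	return 0;
}

static void sketch_dev_init(struct sketch_dev *dev,
			    sketch_msg_handler_t handler)
{
	dev->msg_handler = handler;
	dev->default_calls = 0;
	dev->custom_calls = 0;
}

/* Prefer the per-device handler, otherwise fall back to the default. */
static int sketch_dispatch(struct sketch_dev *dev, struct sketch_msg *msg)
{
	if (dev->msg_handler)
		return dev->msg_handler(dev, msg);
	return sketch_default_handler(dev, msg);
}
```

Existing callers keep their behaviour by passing NULL at init time,
which is what the net, scsi and vsock changes in this patch do.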

Signed-off-by: Jason Wang 
---
 drivers/vhost/net.c   |  3 ++-
 drivers/vhost/scsi.c  |  2 +-
 drivers/vhost/vhost.c | 12 ++--
 drivers/vhost/vhost.h |  6 +-
 drivers/vhost/vsock.c |  2 +-
 5 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index e158159671fa..c8ab8d83b530 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1324,7 +1324,8 @@ static int vhost_net_open(struct inode *inode, struct 
file *f)
}
vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX,
   UIO_MAXIOV + VHOST_NET_BATCH,
-  VHOST_NET_PKT_WEIGHT, VHOST_NET_WEIGHT);
+  VHOST_NET_PKT_WEIGHT, VHOST_NET_WEIGHT,
+  NULL);
 
vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, EPOLLOUT, 
dev);
vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, EPOLLIN, dev);
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 0b949a14bce3..7653667a8cdc 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -1628,7 +1628,7 @@ static int vhost_scsi_open(struct inode *inode, struct 
file *f)
vs->vqs[i].vq.handle_kick = vhost_scsi_handle_kick;
}
vhost_dev_init(&vs->dev, vqs, VHOST_SCSI_MAX_VQ, UIO_MAXIOV,
-  VHOST_SCSI_WEIGHT, 0);
+  VHOST_SCSI_WEIGHT, 0, NULL);
 
vhost_scsi_init_inflight(vs, NULL);
 
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index f44340b41494..8e9e2341e40a 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -457,7 +457,9 @@ static size_t vhost_get_desc_size(struct vhost_virtqueue 
*vq,
 
 void vhost_dev_init(struct vhost_dev *dev,
struct vhost_virtqueue **vqs, int nvqs,
-   int iov_limit, int weight, int byte_weight)
+   int iov_limit, int weight, int byte_weight,
+   int (*msg_handler)(struct vhost_dev *dev,
+  struct vhost_iotlb_msg *msg))
 {
struct vhost_virtqueue *vq;
int i;
@@ -473,6 +475,7 @@ void vhost_dev_init(struct vhost_dev *dev,
dev->iov_limit = iov_limit;
dev->weight = weight;
dev->byte_weight = byte_weight;
+   dev->msg_handler = msg_handler;
init_llist_head(&dev->work_list);
init_waitqueue_head(&dev->wait);
INIT_LIST_HEAD(&dev->read_list);
@@ -1178,7 +1181,12 @@ ssize_t vhost_chr_write_iter(struct vhost_dev *dev,
ret = -EINVAL;
goto done;
}
-   if (vhost_process_iotlb_msg(dev, &msg)) {
+
+   if (dev->msg_handler)
+   ret = dev->msg_handler(dev, &msg);
+   else
+   ret = vhost_process_iotlb_msg(dev, &msg);
+   if (ret) {
ret = -EFAULT;
goto done;
}
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index a123fd70847e..f9d1a03dd153 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -174,11 +174,15 @@ struct vhost_dev {
int weight;
int byte_weight;
u64 kcov_handle;
+   int (*msg_handler)(struct vhost_dev *dev,
+  struct vhost_iotlb_msg *msg);
 };
 
 bool vhost_exceeds_weight(struct vhost_virtqueue *vq, int pkts, int total_len);
 void vhost_dev_init(struct vhost_dev *, struct vhost_virtqueue **vqs,
-   int nvqs, int iov_limit, int weight, int byte_weight);
+   int nvqs, int iov_limit, int weight, int byte_weight,
+   int (*msg_handler)(struct vhost_dev *dev,
+  struct vhost_iotlb_msg *msg));
 long vhost_dev_set_owner(struct vhost_dev *dev);
 bool vhost_dev_has_owner(struct vhost_dev *dev);
 long vhost_dev_check_owner(struct vhost_dev *);
diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index c2d7d57e98cf..97669484a3f6 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -621,7 +621,7 @@ static int vhost_vsock_dev_open(struct inode *inode, struct 
file *file)
 
vhost_dev_init(&vsock->dev, vqs, ARRAY_SIZE(vsock->vqs),
   UIO_MAXIOV, VHOST_VSOCK_PKT_WEIGHT,
-  VHOST_VSOCK_WEIGHT);
+  VHOST_VSOCK_WEIGHT, NULL);
 
file->private_data = vsock;
spin_lock_init(&vsock->send_pkt_list_lock);
-- 
2.20.1



Re: [PATCH v3] drm/bochs: downgrade pci_request_region failure from error to warning

2020-03-18 Thread Gerd Hoffmann
On Tue, Mar 17, 2020 at 05:49:41PM +0100, Daniel Vetter wrote:
> On Fri, Mar 13, 2020 at 09:41:52AM +0100, Gerd Hoffmann wrote:
> > Shutdown of firmware framebuffer has a bunch of problems.  Because
> > of this the framebuffer region might still be reserved even after
> > drm_fb_helper_remove_conflicting_pci_framebuffers() returned.
> 
> Is that still the fbdev lifetime fun where the cleanup might be delayed if
> the char device node is still open?

Yes.

cheers,
  Gerd
