Re: [RFC PATCH] vring: Force use of DMA API for ARM-based systems

2017-01-06 Thread Andy Lutomirski
On Fri, Jan 6, 2017 at 10:32 AM, Robin Murphy  wrote:
> On 06/01/17 17:48, Jean-Philippe Brucker wrote:
>> Hi Will,
>>
>> On 20/12/16 15:14, Will Deacon wrote:
>>> Booting Linux on an ARM fastmodel containing an SMMU emulation results
>>> in an unexpected I/O page fault from the legacy virtio-blk PCI device:
>>>
>>> [1.211721] arm-smmu-v3 2b40.smmu: event 0x10 received:
>>> [1.211800] arm-smmu-v3 2b40.smmu:0xf010
>>> [1.211880] arm-smmu-v3 2b40.smmu:0x0208
>>> [1.211959] arm-smmu-v3 2b40.smmu:0x0008fa081002
>>> [1.212075] arm-smmu-v3 2b40.smmu:0x
>>> [1.212155] arm-smmu-v3 2b40.smmu: event 0x10 received:
>>> [1.212234] arm-smmu-v3 2b40.smmu:0xf010
>>> [1.212314] arm-smmu-v3 2b40.smmu:0x0208
>>> [1.212394] arm-smmu-v3 2b40.smmu:0x0008fa081000
>>> [1.212471] arm-smmu-v3 2b40.smmu:0x
>>>
>>> 
>>>
>>> This is because the virtio-blk is behind an SMMU, so we have consequently
>>> swizzled its DMA ops and configured the SMMU to translate accesses. This
>>> then requires the vring code to use the DMA API to establish translations,
>>> otherwise all transactions will result in fatal faults and termination.
>>>
>>> Given that ARM-based systems only see an SMMU if one is really present
>>> (the topology is all described by firmware tables such as device-tree or
>>> IORT), then we can safely use the DMA API for all virtio devices.
>>
>> There is a problem with the platform block device on that same model.
>> Since it's not behind the SMMU, the DMA ops fall back to swiotlb, which
>> limits the number of mappings.
>>
>> It used to work with 4.9, but since 9491ae4 ("mm: don't cap request size
>> based on read-ahead setting") unlocked read-ahead, we quickly run into
>> the limit of swiotlb and panic:
>>
>> [5.382359] virtio-mmio 1c13.virtio_block: swiotlb buffer is full
>> (sz: 491520 bytes)
>> [5.382452] virtio-mmio 1c13.virtio_block: DMA: Out of SW-IOMMU
>> space for 491520 bytes
>> [5.382531] Kernel panic - not syncing: DMA: Random memory could be
>> DMA written
>> ...
>> [5.383148] [] swiotlb_map_page+0x194/0x1a0
>> [5.383226] [] __swiotlb_map_page+0x20/0x88
>> [5.383320] [] vring_map_one_sg.isra.1+0x70/0x88
>> [5.383417] [] virtqueue_add_sgs+0x2ec/0x4e8
>> [5.383505] [] __virtblk_add_req+0x9c/0x1a8
>> ...
>> [5.384449] [] ondemand_readahead+0xfc/0x2b8
>>
>> Commit 9491ae4 caps the read-ahead request to a limit set by the backing
>> device. For virtio-blk, it is infinite (as set by the call to
>> blk_queue_max_hw_sectors in virtblk_probe).
>>
>> I'm not sure how to fix this. Setting an arbitrary sector limit in the
>> virtio-blk driver seems unfair to other users. Maybe we should check if
>> the device is behind a hardware IOMMU before using the DMA API?
>
> Hmm, this looks more like the virtio_block device simply has the wrong
> DMA mask to begin with. For virtio-pci we set the streaming DMA mask to
> 64 bits - should a platform device not be similarly capable?

If it's not, then turning off DMA API will cause random corruption.
ISTM one way or another the bug is in either the DMA ops or in the
driver initialization.

--Andy
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH V4 net-next 1/3] vhost: better detection of available buffers

2017-01-06 Thread Michael S. Tsirkin
On Fri, Jan 06, 2017 at 10:13:15AM +0800, Jason Wang wrote:
> This patch tries to do several tweaks on vhost_vq_avail_empty() for a
> better performance:
> 
> - check cached avail index first which could avoid userspace memory access.
> - using unlikely() for the failure of userspace access
> - check vq->last_avail_idx instead of cached avail index as the last
>   step.
> 
> This patch is needed for batching support, which needs to peek whether
> or not there are still available buffers in the ring.
> 
> Reviewed-by: Stefan Hajnoczi 
> Signed-off-by: Jason Wang 
> ---
>  drivers/vhost/vhost.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index d643260..9f11838 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -2241,11 +2241,15 @@ bool vhost_vq_avail_empty(struct vhost_dev *dev, 
> struct vhost_virtqueue *vq)
>   __virtio16 avail_idx;
>   int r;
>  
> + if (vq->avail_idx != vq->last_avail_idx)
> + return false;
> +
>   r = vhost_get_user(vq, avail_idx, &vq->avail->idx);
> - if (r)
> + if (unlikely(r))
>   return false;
> + vq->avail_idx = vhost16_to_cpu(vq, avail_idx);
>  
> - return vhost16_to_cpu(vq, avail_idx) == vq->avail_idx;
> + return vq->avail_idx == vq->last_avail_idx;
>  }
>  EXPORT_SYMBOL_GPL(vhost_vq_avail_empty);

So again, this did not address the issue I pointed out in v1:
if we have 1 buffer in RX queue and
that is not enough to store the whole packet,
vhost_vq_avail_empty returns false, then we re-read
the descriptors again and again.

You have saved a single index access but not the more expensive
descriptor access.

I think that a way to address this could be to have this
return current index for the caller. Then as long as that
index isn't changed, you don't poke at descriptor ring.
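
To make the suggestion concrete, here is a minimal user-space sketch, not
vhost code. The struct, the function names and the desc_reads counter are
all hypothetical; the only idea taken from the text above is that the
caller remembers the avail index it last saw and skips the descriptor
walk until that index changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a virtqueue; not the vhost structs. */
struct ring_sim {
	uint16_t avail_idx;       /* index published by the guest */
	uint16_t last_avail_idx;  /* next entry the host will consume */
	unsigned int desc_reads;  /* counts the expensive descriptor walks */
};

/*
 * Returns true when no buffers are available, and reports the avail index
 * that was observed so the caller can remember it.
 */
static bool ring_avail_empty(const struct ring_sim *r, uint16_t *seen_idx)
{
	*seen_idx = r->avail_idx;
	return r->avail_idx == r->last_avail_idx;
}

/*
 * Try to receive a packet that needs `need` buffers. The descriptor ring is
 * only walked when the avail index differs from the one recorded by the
 * previous failed attempt (*cached_idx, valid while *have_cache is set).
 */
static bool ring_try_rx(struct ring_sim *r, uint16_t need,
			uint16_t *cached_idx, bool *have_cache)
{
	uint16_t seen;

	assert(need > 0);
	if (ring_avail_empty(r, &seen))
		return false;
	if (*have_cache && seen == *cached_idx)
		return false;           /* nothing new: skip the descriptor walk */

	r->desc_reads++;                /* expensive descriptor access */
	if ((uint16_t)(seen - r->last_avail_idx) < need) {
		*cached_idx = seen;     /* remember the index that was not enough */
		*have_cache = true;
		return false;
	}
	r->last_avail_idx = (uint16_t)(r->last_avail_idx + need);
	*have_cache = false;
	return true;
}
```

With one buffer available and a packet that needs two, only the first
attempt touches the descriptors; later attempts return early until the
guest publishes a new index.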

> -- 
> 2.7.4


Re: [PATCH V4 net-next 3/3] tun: rx batching

2017-01-06 Thread Michael S. Tsirkin
On Fri, Jan 06, 2017 at 10:13:17AM +0800, Jason Wang wrote:
> We can only process 1 packet at one time during sendmsg(). This often
> leads to bad cache utilization under heavy load. So this patch tries to do
> some batching during rx before submitting packets to the host network
> stack. This is done by accepting MSG_MORE as a hint from the
> sendmsg() caller: if it is set, batch the packet temporarily in a
> linked list and submit them all once MSG_MORE is cleared.
> 
> Tests were done by pktgen (burst=128) in guest over mlx4(noqueue) on host:
> 
>                               Mpps   -+%
> rx-frames = 0                 0.91   +0%
> rx-frames = 4                 1.00   +9.8%
> rx-frames = 8                 1.00   +9.8%
> rx-frames = 16                1.01   +10.9%
> rx-frames = 32                1.07   +17.5%
> rx-frames = 48                1.07   +17.5%
> rx-frames = 64                1.08   +18.6%
> rx-frames = 64 (no MSG_MORE)  0.91   +0%
> 
> Users are allowed to change the per-device number of batched packets through
> ethtool -C rx-frames. NAPI_POLL_WEIGHT is used as the upper limit
> to prevent bh from being disabled for too long.
> 
> Signed-off-by: Jason Wang 
> ---
>  drivers/net/tun.c | 76 
> ++-
>  1 file changed, 70 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index cd8e02c..6c93926 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -218,6 +218,7 @@ struct tun_struct {
>   struct list_head disabled;
>   void *security;
>   u32 flow_count;
> + u32 rx_batched;
>   struct tun_pcpu_stats __percpu *pcpu_stats;
>  };
>  
> @@ -522,6 +523,7 @@ static void tun_queue_purge(struct tun_file *tfile)
>   while ((skb = skb_array_consume(&tfile->tx_array)) != NULL)
>   kfree_skb(skb);
>  
> + skb_queue_purge(&tfile->sk.sk_write_queue);
>   skb_queue_purge(&tfile->sk.sk_error_queue);
>  }
>  
> @@ -1140,10 +1142,45 @@ static struct sk_buff *tun_alloc_skb(struct tun_file 
> *tfile,
>   return skb;
>  }
>  
> +static void tun_rx_batched(struct tun_struct *tun, struct tun_file *tfile,
> +struct sk_buff *skb, int more)
> +{
> + struct sk_buff_head *queue = &tfile->sk.sk_write_queue;
> + struct sk_buff_head process_queue;
> + u32 rx_batched = tun->rx_batched;
> + bool rcv = false;
> +
> + if (!rx_batched || (!more && skb_queue_empty(queue))) {
> + local_bh_disable();
> + netif_receive_skb(skb);
> + local_bh_enable();
> + return;
> + }
> +
> + spin_lock(&queue->lock);
> + if (!more || skb_queue_len(queue) == rx_batched) {
> + __skb_queue_head_init(&process_queue);
> + skb_queue_splice_tail_init(queue, &process_queue);
> + rcv = true;
> + } else {
> + __skb_queue_tail(queue, skb);
> + }
> + spin_unlock(&queue->lock);
> +
> + if (rcv) {
> + struct sk_buff *nskb;
> + local_bh_disable();
> + while ((nskb = __skb_dequeue(&process_queue)))
> + netif_receive_skb(nskb);
> + netif_receive_skb(skb);
> + local_bh_enable();
> + }
> +}
> +
>  /* Get packet from user space buffer */
>  static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
>   void *msg_control, struct iov_iter *from,
> - int noblock)
> + int noblock, bool more)
>  {
>   struct tun_pi pi = { 0, cpu_to_be16(ETH_P_IP) };
>   struct sk_buff *skb;
> @@ -1283,10 +1320,9 @@ static ssize_t tun_get_user(struct tun_struct *tun, 
> struct tun_file *tfile,
>   skb_probe_transport_header(skb, 0);
>  
>   rxhash = skb_get_hash(skb);
> +
>  #ifndef CONFIG_4KSTACKS
> - local_bh_disable();
> - netif_receive_skb(skb);
> - local_bh_enable();
> + tun_rx_batched(tun, tfile, skb, more);
>  #else
>   netif_rx_ni(skb);
>  #endif
> @@ -1312,7 +1348,8 @@ static ssize_t tun_chr_write_iter(struct kiocb *iocb, 
> struct iov_iter *from)
>   if (!tun)
>   return -EBADFD;
>  
> - result = tun_get_user(tun, tfile, NULL, from, file->f_flags & 
> O_NONBLOCK);
> + result = tun_get_user(tun, tfile, NULL, from,
> +   file->f_flags & O_NONBLOCK, false);
>  
>   tun_put(tun);
>   return result;
> @@ -1570,7 +1607,8 @@ static int tun_sendmsg(struct socket *sock, struct 
> msghdr *m, size_t total_len)
>   return -EBADFD;
>  
>   ret = tun_get_user(tun, tfile, m->msg_control, &m->msg_iter,
> -m->msg_flags & MSG_DONTWAIT);
> +m->msg_flags & MSG_DONTWAIT,
> +m->msg_flags & MSG_MORE);
>   tun_put(tun);
>   return ret;
>  }
> @@ -1771,6 +1809,7 @@ static int tun_set_iff(struct net *net, struct file 
> *file, struct ifreq *ifr)
>   tun->align = NET_SKB_PAD;
>   
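
The decision logic of tun_rx_batched() quoted above can be modelled in
user space. This is a rough sketch only: packets are plain ints, there is
no locking, and struct batch_sim, batch_rx() and SIM_MAX_BATCH are
hypothetical names. It only mirrors the three cases in the patch
(deliver directly, hold back, flush).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_MAX_BATCH 64   /* hypothetical cap, in the spirit of NAPI_POLL_WEIGHT */

/* Hypothetical stand-in for the per-file queue; packets are plain ints. */
struct batch_sim {
	int queue[SIM_MAX_BATCH];
	size_t len;          /* packets currently held back */
	size_t rx_batched;   /* configured batch size, 0 disables batching */
	size_t delivered;    /* packets handed to the "stack" so far */
	size_t flushes;      /* number of delivery bursts */
};

static void batch_rx(struct batch_sim *b, int pkt, bool more)
{
	assert(b->rx_batched <= SIM_MAX_BATCH);

	/* Fast path: batching is off, or last packet with nothing queued. */
	if (b->rx_batched == 0 || (!more && b->len == 0)) {
		b->delivered++;
		b->flushes++;
		return;
	}

	/* Hold the packet back while more are expected and there is room. */
	if (more && b->len < b->rx_batched) {
		b->queue[b->len++] = pkt;
		return;
	}

	/* Flush: deliver everything queued, then the current packet. */
	b->delivered += b->len + 1;
	b->len = 0;
	b->flushes++;
}
```

With rx_batched set to 4, four packets sent with the hint set are held
back and the fifth triggers one burst of five deliveries.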

Re: [RFC PATCH] vring: Force use of DMA API for ARM-based systems

2017-01-06 Thread Robin Murphy
On 06/01/17 17:48, Jean-Philippe Brucker wrote:
> Hi Will,
> 
> On 20/12/16 15:14, Will Deacon wrote:
>> Booting Linux on an ARM fastmodel containing an SMMU emulation results
>> in an unexpected I/O page fault from the legacy virtio-blk PCI device:
>>
>> [1.211721] arm-smmu-v3 2b40.smmu: event 0x10 received:
>> [1.211800] arm-smmu-v3 2b40.smmu:0xf010
>> [1.211880] arm-smmu-v3 2b40.smmu:0x0208
>> [1.211959] arm-smmu-v3 2b40.smmu:0x0008fa081002
>> [1.212075] arm-smmu-v3 2b40.smmu:0x
>> [1.212155] arm-smmu-v3 2b40.smmu: event 0x10 received:
>> [1.212234] arm-smmu-v3 2b40.smmu:0xf010
>> [1.212314] arm-smmu-v3 2b40.smmu:0x0208
>> [1.212394] arm-smmu-v3 2b40.smmu:0x0008fa081000
>> [1.212471] arm-smmu-v3 2b40.smmu:0x
>>
>> 
>>
>> This is because the virtio-blk is behind an SMMU, so we have consequently
>> swizzled its DMA ops and configured the SMMU to translate accesses. This
>> then requires the vring code to use the DMA API to establish translations,
>> otherwise all transactions will result in fatal faults and termination.
>>
>> Given that ARM-based systems only see an SMMU if one is really present
>> (the topology is all described by firmware tables such as device-tree or
>> IORT), then we can safely use the DMA API for all virtio devices.
> 
> There is a problem with the platform block device on that same model.
> Since it's not behind the SMMU, the DMA ops fall back to swiotlb, which
> limits the number of mappings.
> 
> It used to work with 4.9, but since 9491ae4 ("mm: don't cap request size
> based on read-ahead setting") unlocked read-ahead, we quickly run into
> the limit of swiotlb and panic:
> 
> [5.382359] virtio-mmio 1c13.virtio_block: swiotlb buffer is full
> (sz: 491520 bytes)
> [5.382452] virtio-mmio 1c13.virtio_block: DMA: Out of SW-IOMMU
> space for 491520 bytes
> [5.382531] Kernel panic - not syncing: DMA: Random memory could be
> DMA written
> ...
> [5.383148] [] swiotlb_map_page+0x194/0x1a0
> [5.383226] [] __swiotlb_map_page+0x20/0x88
> [5.383320] [] vring_map_one_sg.isra.1+0x70/0x88
> [5.383417] [] virtqueue_add_sgs+0x2ec/0x4e8
> [5.383505] [] __virtblk_add_req+0x9c/0x1a8
> ...
> [5.384449] [] ondemand_readahead+0xfc/0x2b8
> 
> Commit 9491ae4 caps the read-ahead request to a limit set by the backing
> device. For virtio-blk, it is infinite (as set by the call to
> blk_queue_max_hw_sectors in virtblk_probe).
> 
> I'm not sure how to fix this. Setting an arbitrary sector limit in the
> virtio-blk driver seems unfair to other users. Maybe we should check if
> the device is behind a hardware IOMMU before using the DMA API?

Hmm, this looks more like the virtio_block device simply has the wrong
DMA mask to begin with. For virtio-pci we set the streaming DMA mask to
64 bits - should a platform device not be similarly capable?
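
As a rough illustration of why the mask matters, the check below models
whether a buffer is reachable under a given streaming DMA mask. It is a
user-space sketch with hypothetical helper names, not the kernel's
implementation, and it ignores address wrap-around; the assumption is
only that a buffer ending above the mask has to be bounced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask with the low `bits` bits set, in the style of the kernel's DMA_BIT_MASK(). */
static uint64_t dma_bit_mask(unsigned int bits)
{
	return bits >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << bits) - 1;
}

/*
 * Hypothetical helper: true if [addr, addr + size) is not fully reachable
 * under `mask`, i.e. the mapping would have to go through a bounce buffer.
 */
static bool needs_bounce(uint64_t addr, uint64_t size, uint64_t mask)
{
	assert(size > 0);
	return addr + size - 1 > mask;
}
```

For an illustrative buffer at 0x880000000, needs_bounce() is true with
dma_bit_mask(32) and false with dma_bit_mask(64).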

Robin.

> 
> Thanks,
> Jean-Philippe
> 
>> Cc: Andy Lutomirski 
>> Cc: Michael S. Tsirkin 
>> Signed-off-by: Will Deacon 
>> ---
>>  drivers/virtio/virtio_ring.c | 4 
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
>> index ed9c9eeedfe5..06b91e29d1b7 100644
>> --- a/drivers/virtio/virtio_ring.c
>> +++ b/drivers/virtio/virtio_ring.c
>> @@ -159,6 +159,10 @@ static bool vring_use_dma_api(struct virtio_device 
>> *vdev)
>>  if (xen_domain())
>>  return true;
>>  
>> +/* On ARM-based machines, the DMA ops will do the right thing */
>> +if (IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64))
>> +return true;
>> +
>>  return false;
>>  }
>>  
>>
> 
> 
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 



Re: [PATCH] virtio_blk: avoid DMA to stack for the sense buffer

2017-01-06 Thread 王金浦
Hi Christoph,

2017-01-04 6:25 GMT+01:00 Christoph Hellwig :
> Most users of BLOCK_PC requests allocate the sense buffer on the stack,
> so to avoid DMA to the stack copy them to a field in the heap allocated
> virtblk_req structure.  Without that any attempt at SCSI passthrough I/O,
> including the SG_IO ioctl from userspace will crash the kernel.  Note that
> this includes running tools like hdparm even when the host does not have
> SCSI passthrough enabled.

This sounds scary.
Could you share how to reproduce it? This should go into stable if
that's the case.

Thanks,
Jinpu

>
> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/block/virtio_blk.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index 5545a67..3c3b8f6 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -56,6 +56,7 @@ struct virtblk_req {
> struct virtio_blk_outhdr out_hdr;
> struct virtio_scsi_inhdr in_hdr;
> u8 status;
> +   u8 sense[SCSI_SENSE_BUFFERSIZE];
> struct scatterlist sg[];
>  };
>
> @@ -102,7 +103,8 @@ static int __virtblk_add_req(struct virtqueue *vq,
> }
>
> if (type == cpu_to_virtio32(vq->vdev, VIRTIO_BLK_T_SCSI_CMD)) {
> -   sg_init_one(&sense, vbr->req->sense, SCSI_SENSE_BUFFERSIZE);
> +   memcpy(vbr->sense, vbr->req->sense, SCSI_SENSE_BUFFERSIZE);
> +   sg_init_one(&sense, vbr->sense, SCSI_SENSE_BUFFERSIZE);
> sgs[num_out + num_in++] = &sense;
> sg_init_one(&inhdr, &vbr->in_hdr, sizeof(vbr->in_hdr));
> sgs[num_out + num_in++] = &inhdr;
> --
> 2.1.4
>


Re: [PATCH] virtio_blk: avoid DMA to stack for the sense buffer

2017-01-06 Thread 王金浦
2017-01-05 10:57 GMT+01:00 Christoph Hellwig :
> On Wed, Jan 04, 2017 at 04:47:03PM +0100, 王金浦 wrote:
>> This sounds scary.
>> Could you share how to reproduce it? This should go into stable if
>> that's the case.
>
> Step 1: Build your kernel with CONFIG_VMAP_STACK=y
> Step 2: issue a SG_IO ioctl, e.g. sg_inq /dev/vda
>
Thanks, so it's only relevant to kernels >= 4.9, as CONFIG_VMAP_STACK
was only introduced in the 4.9 kernel.

Regards,
Jinpu

[PATCH] net: Use kmemdup instead of kmalloc and memcpy

2017-01-06 Thread Shyam Saini
Use kmemdup when some other buffer is immediately copied into the
allocated region. Replace calls to kmalloc followed by a memcpy with a
direct call to kmemdup.

Signed-off-by: Shyam Saini 
---
 drivers/net/virtio_net.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index ba1aa24..dde4bc4 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1206,10 +1206,9 @@ static int virtnet_set_mac_address(struct net_device 
*dev, void *p)
struct sockaddr *addr;
struct scatterlist sg;
 
-   addr = kmalloc(sizeof(*addr), GFP_KERNEL);
+   addr = kmemdup(p, sizeof(*addr), GFP_KERNEL);
if (!addr)
return -ENOMEM;
-   memcpy(addr, p, sizeof(*addr));
 
ret = eth_prepare_mac_addr_change(dev, addr);
if (ret)
-- 
2.7.4
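
For readers outside the kernel tree, the pattern this patch applies can
be sketched in user space. memdup_sketch() is a hypothetical name; it
only illustrates the allocate-then-copy shape that kmemdup() folds into
one call, using malloc() in place of kmalloc().

```c
#include <stdlib.h>
#include <string.h>

/*
 * User-space analogue of kmemdup(): allocate `len` bytes and copy `src`
 * into them in a single call. The caller frees the result.
 */
static void *memdup_sketch(const void *src, size_t len)
{
	void *p = malloc(len);

	if (p)
		memcpy(p, src, len);
	return p;   /* NULL on allocation failure, as with kmemdup() */
}
```

The caller keeps a single NULL check, which is the same simplification
the diff above makes in virtnet_set_mac_address().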



Re: [PATCH net 1/9] virtio-net: remove the warning before XDP linearizing

2017-01-06 Thread Daniel Borkmann

Hi Jason,

On 12/23/2016 03:37 PM, Jason Wang wrote:

Since we use EWMA to estimate the size of the rx buffer, it's usual to
have a packet with more than one buffer when the rx buffer size is
underestimated. As this is not a bug, remove the warning and correct
the comment before XDP linearizing.

Cc: John Fastabend 
Signed-off-by: Jason Wang 
---
  drivers/net/virtio_net.c | 8 +---
  1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 08327e0..1067253 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -552,14 +552,8 @@ static struct sk_buff *receive_mergeable(struct net_device 
*dev,
struct page *xdp_page;
u32 act;

-   /* No known backend devices should send packets with
-* more than a single buffer when XDP conditions are
-* met. However it is not strictly illegal so the case
-* is handled as an exception and a warning is thrown.
-*/
+   /* This happens when rx buffer size is underestimated */
if (unlikely(num_buf > 1)) {
-   bpf_warn_invalid_xdp_buffer();


Could you also remove the bpf_warn_invalid_xdp_buffer(), which got added
just for this?

Thanks.


/* linearize data for XDP */
xdp_page = xdp_linearize_page(rq, num_buf,
  page, offset, &len);





Re: [PATCH 5/8] linux: drop __bitwise__ everywhere

2017-01-06 Thread Luca Coelho
On Thu, 2016-12-15 at 07:15 +0200, Michael S. Tsirkin wrote:
> __bitwise__ used to mean "yes, please enable sparse checks
> unconditionally", but now that we dropped __CHECK_ENDIAN__
> __bitwise is exactly the same.
> There aren't many users, replace it by __bitwise everywhere.
> 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  arch/arm/plat-samsung/include/plat/gpio-cfg.h| 2 +-
>  drivers/md/dm-cache-block-types.h| 6 +++---
>  drivers/net/ethernet/sun/sunhme.h| 2 +-
>  drivers/net/wireless/intel/iwlwifi/iwl-fw-file.h | 4 ++--

For drivers/net/wireless/intel/iwlwifi/iwl-fw-file.h:

Acked-by: Luca Coelho 

--
Luca.


Re: [RFC PATCH] vring: Force use of DMA API for ARM-based systems

2017-01-06 Thread Marc Zyngier
On 20/12/16 15:14, Will Deacon wrote:
> Booting Linux on an ARM fastmodel containing an SMMU emulation results
> in an unexpected I/O page fault from the legacy virtio-blk PCI device:
> 
> [1.211721] arm-smmu-v3 2b40.smmu: event 0x10 received:
> [1.211800] arm-smmu-v3 2b40.smmu: 0xf010
> [1.211880] arm-smmu-v3 2b40.smmu: 0x0208
> [1.211959] arm-smmu-v3 2b40.smmu: 0x0008fa081002
> [1.212075] arm-smmu-v3 2b40.smmu: 0x
> [1.212155] arm-smmu-v3 2b40.smmu: event 0x10 received:
> [1.212234] arm-smmu-v3 2b40.smmu: 0xf010
> [1.212314] arm-smmu-v3 2b40.smmu: 0x0208
> [1.212394] arm-smmu-v3 2b40.smmu: 0x0008fa081000
> [1.212471] arm-smmu-v3 2b40.smmu: 0x
> 
> 
> 
> This is because the virtio-blk is behind an SMMU, so we have consequently
> swizzled its DMA ops and configured the SMMU to translate accesses. This
> then requires the vring code to use the DMA API to establish translations,
> otherwise all transactions will result in fatal faults and termination.
> 
> Given that ARM-based systems only see an SMMU if one is really present
> (the topology is all described by firmware tables such as device-tree or
> IORT), then we can safely use the DMA API for all virtio devices.
> 
> Cc: Andy Lutomirski 
> Cc: Michael S. Tsirkin 
> Signed-off-by: Will Deacon 
> ---
>  drivers/virtio/virtio_ring.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index ed9c9eeedfe5..06b91e29d1b7 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -159,6 +159,10 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
>   if (xen_domain())
>   return true;
>  
> + /* On ARM-based machines, the DMA ops will do the right thing */
> + if (IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64))
> + return true;
> +
>   return false;
>  }
>  
> 

This patch makes my model usable again, so FWIW:

Acked-by: Marc Zyngier 

M.
-- 
Jazz is not dead. It just smells funny...


Re: [RFC PATCH] vring: Force use of DMA API for ARM-based systems

2017-01-06 Thread Jean-Philippe Brucker
Hi Will,

On 20/12/16 15:14, Will Deacon wrote:
> Booting Linux on an ARM fastmodel containing an SMMU emulation results
> in an unexpected I/O page fault from the legacy virtio-blk PCI device:
> 
> [1.211721] arm-smmu-v3 2b40.smmu: event 0x10 received:
> [1.211800] arm-smmu-v3 2b40.smmu: 0xf010
> [1.211880] arm-smmu-v3 2b40.smmu: 0x0208
> [1.211959] arm-smmu-v3 2b40.smmu: 0x0008fa081002
> [1.212075] arm-smmu-v3 2b40.smmu: 0x
> [1.212155] arm-smmu-v3 2b40.smmu: event 0x10 received:
> [1.212234] arm-smmu-v3 2b40.smmu: 0xf010
> [1.212314] arm-smmu-v3 2b40.smmu: 0x0208
> [1.212394] arm-smmu-v3 2b40.smmu: 0x0008fa081000
> [1.212471] arm-smmu-v3 2b40.smmu: 0x
> 
> 
> 
> This is because the virtio-blk is behind an SMMU, so we have consequently
> swizzled its DMA ops and configured the SMMU to translate accesses. This
> then requires the vring code to use the DMA API to establish translations,
> otherwise all transactions will result in fatal faults and termination.
> 
> Given that ARM-based systems only see an SMMU if one is really present
> (the topology is all described by firmware tables such as device-tree or
> IORT), then we can safely use the DMA API for all virtio devices.

There is a problem with the platform block device on that same model.
Since it's not behind the SMMU, the DMA ops fall back to swiotlb, which
limits the number of mappings.

It used to work with 4.9, but since 9491ae4 ("mm: don't cap request size
based on read-ahead setting") unlocked read-ahead, we quickly run into
the limit of swiotlb and panic:

[5.382359] virtio-mmio 1c13.virtio_block: swiotlb buffer is full
(sz: 491520 bytes)
[5.382452] virtio-mmio 1c13.virtio_block: DMA: Out of SW-IOMMU
space for 491520 bytes
[5.382531] Kernel panic - not syncing: DMA: Random memory could be
DMA written
...
[5.383148] [] swiotlb_map_page+0x194/0x1a0
[5.383226] [] __swiotlb_map_page+0x20/0x88
[5.383320] [] vring_map_one_sg.isra.1+0x70/0x88
[5.383417] [] virtqueue_add_sgs+0x2ec/0x4e8
[5.383505] [] __virtblk_add_req+0x9c/0x1a8
...
[5.384449] [] ondemand_readahead+0xfc/0x2b8

Commit 9491ae4 caps the read-ahead request to a limit set by the backing
device. For virtio-blk, it is infinite (as set by the call to
blk_queue_max_hw_sectors in virtblk_probe).

I'm not sure how to fix this. Setting an arbitrary sector limit in the
virtio-blk driver seems unfair to other users. Maybe we should check if
the device is behind a hardware IOMMU before using the DMA API?
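
The failing size in the log can be checked with some quick arithmetic.
The sketch below assumes the swiotlb constants IO_TLB_SHIFT = 11 (2 KiB
slots) and IO_TLB_SEGSIZE = 128 (maximum contiguous slots per mapping);
those values are an assumption about the kernel in use, and the
SKETCH_-prefixed names and helpers are hypothetical.

```c
#include <stddef.h>

/* Assumed values of the kernel's IO_TLB_SHIFT and IO_TLB_SEGSIZE. */
#define SKETCH_IO_TLB_SHIFT   11    /* 2 KiB per slot */
#define SKETCH_IO_TLB_SEGSIZE 128   /* max contiguous slots per mapping */

/* Number of bounce-buffer slots a mapping of `bytes` would occupy. */
static size_t swiotlb_slots_needed(size_t bytes)
{
	size_t slot = (size_t)1 << SKETCH_IO_TLB_SHIFT;

	return (bytes + slot - 1) / slot;
}

/* Largest single mapping that fits in one segment, in bytes. */
static size_t swiotlb_max_mapping(void)
{
	return (size_t)SKETCH_IO_TLB_SEGSIZE << SKETCH_IO_TLB_SHIFT;
}
```

Under those assumptions the 491520-byte request needs 240 slots while a
single mapping is capped at 128 slots (262144 bytes), so it could not be
satisfied even by an otherwise empty bounce buffer.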

Thanks,
Jean-Philippe

> Cc: Andy Lutomirski 
> Cc: Michael S. Tsirkin 
> Signed-off-by: Will Deacon 
> ---
>  drivers/virtio/virtio_ring.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index ed9c9eeedfe5..06b91e29d1b7 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -159,6 +159,10 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
>   if (xen_domain())
>   return true;
>  
> + /* On ARM-based machines, the DMA ops will do the right thing */
> + if (IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64))
> + return true;
> +
>   return false;
>  }
>  
> 



Re: [PATCH net-next] net: make ndo_get_stats64 a void function

2017-01-06 Thread kbuild test robot
Hi Stephen,

[auto build test WARNING on net-next/master]

url:
https://github.com/0day-ci/linux/commits/Stephen-Hemminger/net-make-ndo_get_stats64-a-void-function/20170106-160123
config: xtensa-allmodconfig (attached as .config)
compiler: xtensa-linux-gcc (GCC) 4.9.0
reproduce:
wget 
https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross
 -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=xtensa 

All warnings (new ones prefixed by >>):

   drivers/net/macsec.c: In function 'macsec_get_stats64':
>> drivers/net/macsec.c:2897:3: warning: 'return' with a value, in function 
>> returning void
  return s;
  ^

vim +/return +2897 drivers/net/macsec.c

c09440f7 Sabrina Dubroca   2016-03-11  2881 unsigned int extra = 
macsec->secy.icv_len + macsec_extra_len(true);
c09440f7 Sabrina Dubroca   2016-03-11  2882  
c09440f7 Sabrina Dubroca   2016-03-11  2883 if (macsec->real_dev->mtu - 
extra < new_mtu)
c09440f7 Sabrina Dubroca   2016-03-11  2884 return -ERANGE;
c09440f7 Sabrina Dubroca   2016-03-11  2885  
c09440f7 Sabrina Dubroca   2016-03-11  2886 dev->mtu = new_mtu;
c09440f7 Sabrina Dubroca   2016-03-11  2887  
c09440f7 Sabrina Dubroca   2016-03-11  2888 return 0;
c09440f7 Sabrina Dubroca   2016-03-11  2889  }
c09440f7 Sabrina Dubroca   2016-03-11  2890  
1e665d95 Stephen Hemminger 2017-01-05  2891  static void 
macsec_get_stats64(struct net_device *dev,
c09440f7 Sabrina Dubroca   2016-03-11  2892struct 
rtnl_link_stats64 *s)
c09440f7 Sabrina Dubroca   2016-03-11  2893  {
c09440f7 Sabrina Dubroca   2016-03-11  2894 int cpu;
c09440f7 Sabrina Dubroca   2016-03-11  2895  
c09440f7 Sabrina Dubroca   2016-03-11  2896 if (!dev->tstats)
c09440f7 Sabrina Dubroca   2016-03-11 @2897 return s;
c09440f7 Sabrina Dubroca   2016-03-11  2898  
c09440f7 Sabrina Dubroca   2016-03-11  2899 for_each_possible_cpu(cpu) {
c09440f7 Sabrina Dubroca   2016-03-11  2900 struct pcpu_sw_netstats 
*stats;
c09440f7 Sabrina Dubroca   2016-03-11  2901 struct pcpu_sw_netstats 
tmp;
c09440f7 Sabrina Dubroca   2016-03-11  2902 int start;
c09440f7 Sabrina Dubroca   2016-03-11  2903  
c09440f7 Sabrina Dubroca   2016-03-11  2904 stats = 
per_cpu_ptr(dev->tstats, cpu);
c09440f7 Sabrina Dubroca   2016-03-11  2905 do {

:: The code at line 2897 was first introduced by commit
:: c09440f7dcb304002dfced8c0fea289eb25f2da0 macsec: introduce IEEE 802.1AE 
driver

:: TO: Sabrina Dubroca <s...@queasysnail.net>
:: CC: David S. Miller <da...@davemloft.net>
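
The warning above comes from an early exit that still returns the stats
pointer. A reduced, user-space sketch of the shape a void callback would
take is shown below; struct stats_sketch, struct dev_sketch and
get_stats64_sketch() are hypothetical stand-ins, not the macsec code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-ins for the kernel structures. */
struct stats_sketch {
	uint64_t rx_packets;
};

struct dev_sketch {
	const uint64_t *percpu_rx;  /* one counter per CPU, may be NULL */
	size_t ncpus;
};

/* The callback returns void, so the early exit is a bare return. */
static void get_stats64_sketch(const struct dev_sketch *dev,
			       struct stats_sketch *s)
{
	if (!dev->percpu_rx)
		return;             /* previously: return s; */

	for (size_t cpu = 0; cpu < dev->ncpus; cpu++)
		s->rx_packets += dev->percpu_rx[cpu];
}
```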

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation



Re: [PATCH net-next] net: make ndo_get_stats64 a void function

2017-01-06 Thread kbuild test robot
Hi Stephen,

[auto build test WARNING on net-next/master]

url:
https://github.com/0day-ci/linux/commits/Stephen-Hemminger/net-make-ndo_get_stats64-a-void-function/20170106-160123
config: x86_64-acpi-redef (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All warnings (new ones prefixed by >>):

   drivers/net/ethernet/broadcom/bnx2.c: In function 'bnx2_get_stats64':
>> drivers/net/ethernet/broadcom/bnx2.c:6830:10: warning: 'return' with a 
>> value, in function returning void
  return net_stats;
 ^
   drivers/net/ethernet/broadcom/bnx2.c:6825:1: note: declared here
bnx2_get_stats64(struct net_device *dev, struct rtnl_link_stats64 
*net_stats)
^~~~

vim +/return +6830 drivers/net/ethernet/broadcom/bnx2.c

5d07bf26 drivers/net/bnx2.c   Eric Dumazet  2010-07-08  
6814(((u64) (ctr##_hi) << 32) + (u64) (ctr##_lo))
b6016b76 drivers/net/bnx2.c   Michael Chan  2005-05-26  
6815  
a4743058 drivers/net/bnx2.c   Michael Chan  2010-01-17  
6816  #define GET_64BIT_NET_STATS(ctr)  \
354fcd77 drivers/net/bnx2.c   Michael Chan  2010-01-17  
6817GET_64BIT_NET_STATS64(bp->stats_blk->ctr) + \
354fcd77 drivers/net/bnx2.c   Michael Chan  2010-01-17  
6818GET_64BIT_NET_STATS64(bp->temp_stats_blk->ctr)
b6016b76 drivers/net/bnx2.c   Michael Chan  2005-05-26  
6819  
a4743058 drivers/net/bnx2.c   Michael Chan  2010-01-17  
6820  #define GET_32BIT_NET_STATS(ctr)  \
354fcd77 drivers/net/bnx2.c   Michael Chan  2010-01-17  
6821(unsigned long) (bp->stats_blk->ctr +   \
354fcd77 drivers/net/bnx2.c   Michael Chan  2010-01-17  
6822 bp->temp_stats_blk->ctr)
a4743058 drivers/net/bnx2.c   Michael Chan  2010-01-17  
6823  
1e665d95 drivers/net/ethernet/broadcom/bnx2.c Stephen Hemminger 2017-01-05  
6824  static void
5d07bf26 drivers/net/bnx2.c   Eric Dumazet  2010-07-08  
6825  bnx2_get_stats64(struct net_device *dev, struct rtnl_link_stats64 
*net_stats)
b6016b76 drivers/net/bnx2.c   Michael Chan  2005-05-26  
6826  {
972ec0d4 drivers/net/bnx2.c   Michael Chan  2006-01-23  
6827struct bnx2 *bp = netdev_priv(dev);
b6016b76 drivers/net/bnx2.c   Michael Chan  2005-05-26  
6828  
5d07bf26 drivers/net/bnx2.c   Eric Dumazet  2010-07-08  
6829if (bp->stats_blk == NULL)
b6016b76 drivers/net/bnx2.c   Michael Chan  2005-05-26 
@6830return net_stats;
5d07bf26 drivers/net/bnx2.c   Eric Dumazet  2010-07-08  
6831  
b6016b76 drivers/net/bnx2.c   Michael Chan  2005-05-26  
6832net_stats->rx_packets =
a4743058 drivers/net/bnx2.c   Michael Chan  2010-01-17  
6833GET_64BIT_NET_STATS(stat_IfHCInUcastPkts) +
a4743058 drivers/net/bnx2.c   Michael Chan  2010-01-17  
6834GET_64BIT_NET_STATS(stat_IfHCInMulticastPkts) +
a4743058 drivers/net/bnx2.c   Michael Chan  2010-01-17  
6835GET_64BIT_NET_STATS(stat_IfHCInBroadcastPkts);
b6016b76 drivers/net/bnx2.c   Michael Chan  2005-05-26  
6836  
b6016b76 drivers/net/bnx2.c   Michael Chan  2005-05-26  
6837net_stats->tx_packets =
a4743058 drivers/net/bnx2.c   Michael Chan  2010-01-17  
6838GET_64BIT_NET_STATS(stat_IfHCOutUcastPkts) +

:: The code at line 6830 was first introduced by commit
:: b6016b767397258b58163494a869f8f1199e6897 [BNX2]: New Broadcom gigabit 
network driver.

:: TO: Michael Chan <mc...@broadcom.com>
:: CC: David S. Miller <da...@davemloft.net>

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation

