Re: [PATCH v3 2/3] vhost: Use vhost_get_used_size() in vhost_vring_set_addr()

2020-10-10 Thread Michael S. Tsirkin
On Sat, Oct 10, 2020 at 10:32:13AM +0800, Jason Wang wrote:
> 
> On 2020/10/3 下午6:02, Greg Kurz wrote:
> > The open-coded computation of the used size doesn't take the event
> > into account when the VIRTIO_RING_F_EVENT_IDX feature is present.
> > Fix that by using vhost_get_used_size().
> > 
> > Signed-off-by: Greg Kurz 
> > ---
> >   drivers/vhost/vhost.c |3 +--
> >   1 file changed, 1 insertion(+), 2 deletions(-)
> > 
> > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > index c3b49975dc28..9d2c225fb518 100644
> > --- a/drivers/vhost/vhost.c
> > +++ b/drivers/vhost/vhost.c
> > @@ -1519,8 +1519,7 @@ static long vhost_vring_set_addr(struct vhost_dev *d,
> > /* Also validate log access for used ring if enabled. */
> > if ((a.flags & (0x1 << VHOST_VRING_F_LOG)) &&
> > !log_access_ok(vq->log_base, a.log_guest_addr,
> > -   sizeof *vq->used +
> > -   vq->num * sizeof *vq->used->ring))
> > +  vhost_get_used_size(vq, vq->num)))
> > return -EINVAL;
> > }
> > 
> > 
> 
> Acked-by: Jason Wang 

Linus already merged this, I can't add your ack, sorry!

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v3 1/2] vhost-vdpa: fix vhost_vdpa_map() on error condition

2020-10-10 Thread Michael S. Tsirkin
On Sat, Oct 10, 2020 at 09:48:42AM +0800, Jason Wang wrote:
> 
> On 2020/10/3 下午1:02, Si-Wei Liu wrote:
> > vhost_vdpa_map() should remove the iotlb entry just added
> > if the corresponding mapping fails to set up properly.
> > 
> > Fixes: 4c8cf31885f6 ("vhost: introduce vDPA-based backend")
> > Signed-off-by: Si-Wei Liu 
> > ---
> >   drivers/vhost/vdpa.c | 3 +++
> >   1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > index 796fe97..0f27919 100644
> > --- a/drivers/vhost/vdpa.c
> > +++ b/drivers/vhost/vdpa.c
> > @@ -565,6 +565,9 @@ static int vhost_vdpa_map(struct vhost_vdpa *v,
> >   perm_to_iommu_flags(perm));
> > }
> > +   if (r)
> > +   vhost_iotlb_del_range(dev->iotlb, iova, iova + size - 1);
> > +
> > return r;
> >   }
> 
> 
> Acked-by: Jason Wang 

Linus already merged this, I can't add your ack, sorry!

-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH net-next v2] virtio-net: ethtool configurable RXCSUM

2020-10-10 Thread Willem de Bruijn
On Fri, Oct 9, 2020 at 10:10 PM Tonghao Zhang  wrote:
>
> On Fri, Oct 9, 2020 at 9:49 PM Willem de Bruijn
>  wrote:
> >
> > On Thu, Oct 8, 2020 at 9:19 PM Tonghao Zhang  
> > wrote:
> > >
> > > On Wed, Sep 30, 2020 at 6:05 PM Willem de Bruijn
> > >  wrote:
> > > >
> > > > On Wed, Sep 30, 2020 at 4:05 AM  wrote:
> > > > >
> > > > > From: Tonghao Zhang 
> > > > >
> > > > > Allow user configuring RXCSUM separately with ethtool -K,
> > > > > reusing the existing virtnet_set_guest_offloads helper
> > > > > that configures RXCSUM for XDP. This is conditional on
> > > > > VIRTIO_NET_F_CTRL_GUEST_OFFLOADS.
> > > > >
> > > > > If Rx checksum is disabled, LRO should also be disabled.
> > > > >
> > > > > Cc: Michael S. Tsirkin 
> > > > > Cc: Jason Wang 
> > > > > Signed-off-by: Tonghao Zhang 
> > > >
> > > > The first patch was merged into net.
> > > >
> > > > Please wait until that is merged into net-next, as this depends on the
> > > > other patch.
> > > >
> > > > > ---
> > > > > v2:
> > > > > * LRO depends the rx csum
> > > > > * remove the unnecessary check
> > > > > ---
> > > > >  drivers/net/virtio_net.c | 49 
> > > > > ++--
> > > > >  1 file changed, 37 insertions(+), 12 deletions(-)
> > > > >
> > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > > > index 21b71148c532..5407a0106771 100644
> > > > > --- a/drivers/net/virtio_net.c
> > > > > +++ b/drivers/net/virtio_net.c
> > > > > @@ -68,6 +68,8 @@ static const unsigned long guest_offloads[] = {
> > > > > (1ULL << VIRTIO_NET_F_GUEST_ECN)  | \
> > > > > (1ULL << VIRTIO_NET_F_GUEST_UFO))
> > > > >
> > > > > +#define GUEST_OFFLOAD_CSUM_MASK (1ULL << VIRTIO_NET_F_GUEST_CSUM)
> > > > > +
> > > > >  struct virtnet_stat_desc {
> > > > > char desc[ETH_GSTRING_LEN];
> > > > > size_t offset;
> > > > > @@ -2522,29 +2524,49 @@ static int virtnet_get_phys_port_name(struct 
> > > > > net_device *dev, char *buf,
> > > > > return 0;
> > > > >  }
> > > > >
> > > > > +static netdev_features_t virtnet_fix_features(struct net_device 
> > > > > *netdev,
> > > > > + netdev_features_t 
> > > > > features)
> > > > > +{
> > > > > +   /* If Rx checksum is disabled, LRO should also be disabled.
> > > > > +* That is life. :)
> > > >
> > > > Please remove this second line.
> > > OK
> > > > > +*/
> > > > > +   if (!(features & NETIF_F_RXCSUM))
> > > > > +   features &= ~NETIF_F_LRO;
> > > > > +
> > > > > +   return features;
> > > > > +}
> > > > > +
> > > > >  static int virtnet_set_features(struct net_device *dev,
> > > > > netdev_features_t features)
> > > > >  {
> > > > > struct virtnet_info *vi = netdev_priv(dev);
> > > > > -   u64 offloads;
> > > > > +   u64 offloads = vi->guest_offloads &
> > > > > +  vi->guest_offloads_capable;
> > > >
> > > > When can vi->guest_offloads have bits set outside the mask of
> > > > vi->guest_offloads_capable?
> > > In this case, we can use only vi->guest_offloads. and
> > > guest_offloads_capable will not be used any more.
> > > so should we remove guest_offloads_capable ?
> >
> > That bitmap stores the capabilities of the device as learned at
> > initial device probe. I don't see how it can be removed.
> Hi Willem
> guest_offloads_capable was introduced by
> a02e8964eaf9 ("virtio-net: ethtool configurable LRO")
> and used only for LRO. Now we don't use it anymore, right ?
> because this patch (virtio-net: ethtool configurable RXCSUM)
> doesn't use it.

It is needed, because it serves as an upper bound on configurable options.

virtnet_set_features cannot enable LRO unless the LRO flags are
captured by the feature negotiation at probe time.

I think on enable you need something like

-   offloads = vi->guest_offloads_capable;
+   offloads |= vi->guest_offloads_capable &
GUEST_OFFLOAD_LRO_MASK;

instead of unconditional

+   offloads |= GUEST_OFFLOAD_LRO_MASK;
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] mm: proc: add Sock to /proc/meminfo

2020-10-10 Thread Randy Dunlap
Hi,

On 10/10/20 3:38 AM, Muchun Song wrote:
> The amount of memory allocated to sockets buffer can become significant.
> However, we do not display the amount of memory consumed by sockets
> buffer. In this case, knowing where the memory is consumed by the kernel
> is very difficult. On our server with 500GB RAM, sometimes we can see
> 25GB disappear through /proc/meminfo. After our analysis, we found the
> following memory allocation path which consumes the memory with page_owner
> enabled.
> 
>   849698 times:
>   Page allocated via order 3, mask 
> 0x4052c0(GFP_NOWAIT|__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP)
>__alloc_pages_nodemask+0x11d/0x290
>skb_page_frag_refill+0x68/0xf0
>sk_page_frag_refill+0x19/0x70
>tcp_sendmsg_locked+0x2f4/0xd10
>tcp_sendmsg+0x29/0xa0
>sock_sendmsg+0x30/0x40
>sock_write_iter+0x8f/0x100
>__vfs_write+0x10b/0x190
>vfs_write+0xb0/0x190
>ksys_write+0x5a/0xd0
>do_syscall_64+0x5d/0x110
>entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> Signed-off-by: Muchun Song 
> ---
>  drivers/base/node.c  |  2 ++
>  drivers/net/virtio_net.c |  3 +--
>  fs/proc/meminfo.c|  1 +
>  include/linux/mmzone.h   |  1 +
>  include/linux/skbuff.h   | 43 ++--
>  kernel/exit.c|  3 +--
>  mm/page_alloc.c  |  7 +--
>  mm/vmstat.c  |  1 +
>  net/core/sock.c  |  8 
>  net/ipv4/tcp.c   |  3 +--
>  net/xfrm/xfrm_state.c|  3 +--
>  11 files changed, 59 insertions(+), 16 deletions(-)

Thanks for finding that.

Please update Documentation/filesystems/proc.rst "meminfo" section also.

-- 
~Randy

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] vdpa/mlx5: should keep avail_index despite device status

2020-10-10 Thread Jason Wang


On 2020/10/2 上午4:18, Si-Wei Liu wrote:

A VM with mlx5 vDPA has below warnings while being reset:

vhost VQ 0 ring restore failed: -1: Resource temporarily unavailable (11)
vhost VQ 1 ring restore failed: -1: Resource temporarily unavailable (11)

We should allow userspace emulating the virtio device be
able to get to vq's avail_index, regardless of vDPA device
status. Save the index that was last seen when virtq was
stopped, so that userspace doesn't complain.

Signed-off-by: Si-Wei Liu 



Acked-by: Jason Wang 



---
  drivers/vdpa/mlx5/net/mlx5_vnet.c | 20 ++--
  1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c 
b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 70676a6..74264e59 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1133,15 +1133,17 @@ static void suspend_vq(struct mlx5_vdpa_net *ndev, 
struct mlx5_vdpa_virtqueue *m
if (!mvq->initialized)
return;
  
-	if (query_virtqueue(ndev, mvq, &attr)) {

-   mlx5_vdpa_warn(&ndev->mvdev, "failed to query virtqueue\n");
-   return;
-   }
if (mvq->fw_state != MLX5_VIRTIO_NET_Q_OBJECT_STATE_RDY)
return;
  
  	if (modify_virtqueue(ndev, mvq, MLX5_VIRTIO_NET_Q_OBJECT_STATE_SUSPEND))

mlx5_vdpa_warn(&ndev->mvdev, "modify to suspend failed\n");
+
+   if (query_virtqueue(ndev, mvq, &attr)) {
+   mlx5_vdpa_warn(&ndev->mvdev, "failed to query virtqueue\n");
+   return;
+   }
+   mvq->avail_idx = attr.available_index;
  }
  
  static void suspend_vqs(struct mlx5_vdpa_net *ndev)

@@ -1411,8 +1413,14 @@ static int mlx5_vdpa_get_vq_state(struct vdpa_device 
*vdev, u16 idx, struct vdpa
struct mlx5_virtq_attr attr;
int err;
  
-	if (!mvq->initialized)

-   return -EAGAIN;
+   /* If the virtq object was destroyed, use the value saved at
+* the last minute of suspend_vq. This caters for userspace
+* that cares about emulating the index after vq is stopped.
+*/
+   if (!mvq->initialized) {
+   state->avail_index = mvq->avail_idx;
+   return 0;
+   }
  
  	err = query_virtqueue(ndev, mvq, &attr);

if (err) {


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization