Re: [PATCH] virtio_net: Remove BUG() to aviod machine dead

2021-06-01 Thread Leon Romanovsky
On Tue, May 25, 2021 at 02:19:03PM +0800, Jason Wang wrote:
> 
> 在 2021/5/19 下午10:18, Xianting Tian 写道:
> > thanks, I submit the patch as commented by Andrew
> > https://lkml.org/lkml/2021/5/18/256
> > 
> > Actually, if xmit_skb() returns error, below code will give a warning
> > with error code.
> > 
> > /* Try to transmit */
> > err = xmit_skb(sq, skb);
> > 
> > /* This should not happen! */
> > if (unlikely(err)) {
> >     dev->stats.tx_fifo_errors++;
> >     if (net_ratelimit())
> >     dev_warn(>dev,
> >  "Unexpected TXQ (%d) queue failure: %d\n",
> >  qnum, err);
> >     dev->stats.tx_dropped++;
> >     dev_kfree_skb_any(skb);
> >     return NETDEV_TX_OK;
> > }
> > 
> > 
> > 
> > 
> > 
> > 在 2021/5/18 下午5:54, Michael S. Tsirkin 写道:
> > > typo in subject
> > > 
> > > On Tue, May 18, 2021 at 05:46:56PM +0800, Xianting Tian wrote:
> > > > When met error, we output a print to avoid a BUG().
> 
> 
> So you don't explain why you need to remove BUG(). I think it deserve a
> BUG().

BUG() will crash the machine and virtio_net is not kernel core
functionality that must stop the machine to prevent anything truly
harmful and basic.

I would argue that code in drivers/* shouldn't call BUG() macros at all.

If it is impossible, don't check for that or add WARN_ON() and recover,
but don't crash whole system.

Thanks
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH] virtio-net: fix the kzalloc/kfree mismatch problem

2021-06-01 Thread Leon Romanovsky
On Mon, May 24, 2021 at 10:37:14AM +0800, Jason Wang wrote:
> 
> 在 2021/5/24 上午10:06, Xuan Zhuo 写道:
> > On Mon, 24 May 2021 01:48:53 +, Guodeqing (A)  
> > wrote:
> > > 
> > > > -Original Message-
> > > > From: Max Gurtovoy [mailto:mgurto...@nvidia.com]
> > > > Sent: Sunday, May 23, 2021 15:25
> > > > To: Guodeqing (A) ; m...@redhat.com
> > > > Cc: jasow...@redhat.com; da...@davemloft.net; k...@kernel.org;
> > > > virtualization@lists.linux-foundation.org; net...@vger.kernel.org
> > > > Subject: Re: [PATCH] virtio-net: fix the kzalloc/kfree mismatch problem
> > > > 
> > > > 
> > > > On 5/22/2021 11:02 AM, guodeqing wrote:
> > > > > If the virtio_net device does not suppurt the ctrl queue feature, the
> > > > > vi->ctrl was not allocated, so there is no need to free it.
> > > > you don't need this check.
> > > > 
> > > > from kfree doc:
> > > > 
> > > > "If @objp is NULL, no operation is performed."
> > > > 
> > > > This is not a bug. I've set vi->ctrl to be NULL in case !vi->has_cvq.
> > > > 
> > > > 
> > >yes,  this is not a bug, the patch is just a optimization, because the 
> > > vi->ctrl maybe
> > >be freed which  was not allocated, this may give people a 
> > > misunderstanding.
> > >Thanks.
> > 
> > I think it may be enough to add a comment, and the code does not need to be
> > modified.
> > 
> > Thanks.
> 
> 
> Or even just leave the current code as is. A lot of kernel codes was wrote
> under the assumption that kfree() should deal with NULL.

It is not assumption but standard practice that can be seen as side
effect of "7) Centralized exiting of functions" section of coding-style.rst.

Thanks
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH V2 RESEND 4/4] virtio/vdpa: clear the virtqueue state during probe

2021-06-01 Thread Jason Wang
From: Eli Cohen 

Clear the available index as part of the initialization process to
clear and values that might be left from previous usage of the device.
For example, if the device was previously used by vhost_vdpa and now
probed by vhost_vdpa, you want to start with indices.

Fixes: c043b4a8cf3b ("virtio: introduce a vDPA based transport")
Signed-off-by: Eli Cohen 
Signed-off-by: Jason Wang 
---
 drivers/virtio/virtio_vdpa.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
index e28acf482e0c..e1a141135992 100644
--- a/drivers/virtio/virtio_vdpa.c
+++ b/drivers/virtio/virtio_vdpa.c
@@ -142,6 +142,8 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned 
int index,
struct vdpa_callback cb;
struct virtqueue *vq;
u64 desc_addr, driver_addr, device_addr;
+   /* Assume split virtqueue, switch to packed if necessary */
+   struct vdpa_vq_state state = {0};
unsigned long flags;
u32 align, num;
int err;
@@ -191,6 +193,19 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned 
int index,
goto err_vq;
}
 
+   /* reset virtqueue state index */
+   if (virtio_has_feature(vdev, VIRTIO_F_RING_PACKED)) {
+   struct vdpa_vq_state_packed *s = 
+
+   s->last_avail_counter = 1;
+   s->last_avail_idx = 0;
+   s->last_used_counter = 1;
+   s->last_used_idx = 0;
+   }
+   err = ops->set_vq_state(vdpa, index, );
+   if (err)
+   goto err_vq;
+
ops->set_vq_ready(vdpa, index, 1);
 
vq->priv = info;
-- 
2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH V2 RESEND 3/4] vp_vdpa: allow set vq state to initial state after reset

2021-06-01 Thread Jason Wang
We used to fail the set_vq_state() since it was not supported yet by
the virtio spec. But if the bus tries to set the state which is equal
to the device initial state after reset, we can let it go.

This is a must for virtio_vdpa() to set vq state during probe which is
required for some vDPA parents.

Signed-off-by: Jason Wang 
---
 drivers/vdpa/virtio_pci/vp_vdpa.c | 42 ---
 1 file changed, 39 insertions(+), 3 deletions(-)

diff --git a/drivers/vdpa/virtio_pci/vp_vdpa.c 
b/drivers/vdpa/virtio_pci/vp_vdpa.c
index c76ebb531212..18bf4a422772 100644
--- a/drivers/vdpa/virtio_pci/vp_vdpa.c
+++ b/drivers/vdpa/virtio_pci/vp_vdpa.c
@@ -210,13 +210,49 @@ static int vp_vdpa_get_vq_state(struct vdpa_device *vdpa, 
u16 qid,
return -EOPNOTSUPP;
 }
 
+static int vp_vdpa_set_vq_state_split(struct vdpa_device *vdpa,
+ const struct vdpa_vq_state *state)
+{
+   const struct vdpa_vq_state_split *split = >split;
+
+   if (split->avail_index == 0)
+   return 0;
+
+   return -EOPNOTSUPP;
+}
+
+static int vp_vdpa_set_vq_state_packed(struct vdpa_device *vdpa,
+  const struct vdpa_vq_state *state)
+{
+   const struct vdpa_vq_state_packed *packed = >packed;
+
+   if (packed->last_avail_counter == 1 &&
+   packed->last_avail_idx == 0 &&
+   packed->last_used_counter == 1 &&
+   packed->last_used_idx == 0)
+   return 0;
+
+   return -EOPNOTSUPP;
+}
+
 static int vp_vdpa_set_vq_state(struct vdpa_device *vdpa, u16 qid,
const struct vdpa_vq_state *state)
 {
-   /* Note that this is not supported by virtio specification, so
-* we return -ENOPOTSUPP here. This means we can't support live
-* migration, vhost device start/stop.
+   struct virtio_pci_modern_device *mdev = vdpa_to_mdev(vdpa);
+
+   /* Note that this is not supported by virtio specification.
+* But if the state is by chance equal to the device initial
+* state, we can let it go.
 */
+   if ((vp_modern_get_status(mdev) & VIRTIO_CONFIG_S_FEATURES_OK) &&
+   !vp_modern_get_queue_enable(mdev, qid)) {
+   if (vp_modern_get_driver_features(mdev) &
+   BIT_ULL(VIRTIO_F_RING_PACKED))
+   return vp_vdpa_set_vq_state_packed(vdpa, state);
+   else
+   return vp_vdpa_set_vq_state_split(vdpa, state);
+   }
+
return -EOPNOTSUPP;
 }
 
-- 
2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH V2 RESEND 2/4] virtio-pci library: introduce vp_modern_get_driver_features()

2021-06-01 Thread Jason Wang
This patch introduce a helper to get driver/guest features from the
device.

Signed-off-by: Jason Wang 
---
 drivers/virtio/virtio_pci_modern_dev.c | 21 +
 include/linux/virtio_pci_modern.h  |  1 +
 2 files changed, 22 insertions(+)

diff --git a/drivers/virtio/virtio_pci_modern_dev.c 
b/drivers/virtio/virtio_pci_modern_dev.c
index 54f297028586..e11ed748e661 100644
--- a/drivers/virtio/virtio_pci_modern_dev.c
+++ b/drivers/virtio/virtio_pci_modern_dev.c
@@ -383,6 +383,27 @@ u64 vp_modern_get_features(struct virtio_pci_modern_device 
*mdev)
 }
 EXPORT_SYMBOL_GPL(vp_modern_get_features);
 
+/*
+ * vp_modern_get_driver_features - get driver features from device
+ * @mdev: the modern virtio-pci device
+ *
+ * Returns the driver features read from the device
+ */
+u64 vp_modern_get_driver_features(struct virtio_pci_modern_device *mdev)
+{
+   struct virtio_pci_common_cfg __iomem *cfg = mdev->common;
+
+   u64 features;
+
+   vp_iowrite32(0, >guest_feature_select);
+   features = vp_ioread32(>guest_feature);
+   vp_iowrite32(1, >guest_feature_select);
+   features |= ((u64)vp_ioread32(>guest_feature) << 32);
+
+   return features;
+}
+EXPORT_SYMBOL_GPL(vp_modern_get_driver_features);
+
 /*
  * vp_modern_set_features - set features to device
  * @mdev: the modern virtio-pci device
diff --git a/include/linux/virtio_pci_modern.h 
b/include/linux/virtio_pci_modern.h
index 6a95b58fd0f4..eb2bd9b4077d 100644
--- a/include/linux/virtio_pci_modern.h
+++ b/include/linux/virtio_pci_modern.h
@@ -79,6 +79,7 @@ static inline void vp_iowrite64_twopart(u64 val,
 }
 
 u64 vp_modern_get_features(struct virtio_pci_modern_device *mdev);
+u64 vp_modern_get_driver_features(struct virtio_pci_modern_device *mdev);
 void vp_modern_set_features(struct virtio_pci_modern_device *mdev,
 u64 features);
 u32 vp_modern_generation(struct virtio_pci_modern_device *mdev);
-- 
2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH V2 RESEND 1/4] vdpa: support packed virtqueue for set/get_vq_state()

2021-06-01 Thread Jason Wang
This patch extends the vdpa_vq_state to support packed virtqueue
state which is basically the device/driver ring wrap counters and the
avail and used index. This will be used for the virito-vdpa support
for the packed virtqueue and the future vhost/vhost-vdpa support for
the packed virtqueue.

Signed-off-by: Jason Wang 
---
 drivers/vdpa/ifcvf/ifcvf_main.c   |  4 ++--
 drivers/vdpa/mlx5/net/mlx5_vnet.c |  8 
 drivers/vdpa/vdpa_sim/vdpa_sim.c  |  4 ++--
 drivers/vhost/vdpa.c  |  4 ++--
 include/linux/vdpa.h  | 25 +++--
 5 files changed, 33 insertions(+), 12 deletions(-)

diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
index ab0ab5cf0f6e..5d3891b1ca28 100644
--- a/drivers/vdpa/ifcvf/ifcvf_main.c
+++ b/drivers/vdpa/ifcvf/ifcvf_main.c
@@ -264,7 +264,7 @@ static int ifcvf_vdpa_get_vq_state(struct vdpa_device 
*vdpa_dev, u16 qid,
 {
struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev);
 
-   state->avail_index = ifcvf_get_vq_state(vf, qid);
+   state->split.avail_index = ifcvf_get_vq_state(vf, qid);
return 0;
 }
 
@@ -273,7 +273,7 @@ static int ifcvf_vdpa_set_vq_state(struct vdpa_device 
*vdpa_dev, u16 qid,
 {
struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev);
 
-   return ifcvf_set_vq_state(vf, qid, state->avail_index);
+   return ifcvf_set_vq_state(vf, qid, state->split.avail_index);
 }
 
 static void ifcvf_vdpa_set_vq_cb(struct vdpa_device *vdpa_dev, u16 qid,
diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c 
b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 189e4385df40..e5505d760bca 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1427,8 +1427,8 @@ static int mlx5_vdpa_set_vq_state(struct vdpa_device 
*vdev, u16 idx,
return -EINVAL;
}
 
-   mvq->used_idx = state->avail_index;
-   mvq->avail_idx = state->avail_index;
+   mvq->used_idx = state->split.avail_index;
+   mvq->avail_idx = state->split.avail_index;
return 0;
 }
 
@@ -1449,7 +1449,7 @@ static int mlx5_vdpa_get_vq_state(struct vdpa_device 
*vdev, u16 idx, struct vdpa
 * Since both values should be identical, we take the value of
 * used_idx which is reported correctly.
 */
-   state->avail_index = mvq->used_idx;
+   state->split.avail_index = mvq->used_idx;
return 0;
}
 
@@ -1458,7 +1458,7 @@ static int mlx5_vdpa_get_vq_state(struct vdpa_device 
*vdev, u16 idx, struct vdpa
mlx5_vdpa_warn(mvdev, "failed to query virtqueue\n");
return err;
}
-   state->avail_index = attr.used_index;
+   state->split.avail_index = attr.used_index;
return 0;
 }
 
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 98f793bc9376..14e024de5cbf 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -374,7 +374,7 @@ static int vdpasim_set_vq_state(struct vdpa_device *vdpa, 
u16 idx,
struct vringh *vrh = >vring;
 
spin_lock(>lock);
-   vrh->last_avail_idx = state->avail_index;
+   vrh->last_avail_idx = state->split.avail_index;
spin_unlock(>lock);
 
return 0;
@@ -387,7 +387,7 @@ static int vdpasim_get_vq_state(struct vdpa_device *vdpa, 
u16 idx,
struct vdpasim_virtqueue *vq = >vqs[idx];
struct vringh *vrh = >vring;
 
-   state->avail_index = vrh->last_avail_idx;
+   state->split.avail_index = vrh->last_avail_idx;
return 0;
 }
 
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index fb41db3da611..210ab35a7ebf 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -383,7 +383,7 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, 
unsigned int cmd,
if (r)
return r;
 
-   vq->last_avail_idx = vq_state.avail_index;
+   vq->last_avail_idx = vq_state.split.avail_index;
break;
}
 
@@ -401,7 +401,7 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, 
unsigned int cmd,
break;
 
case VHOST_SET_VRING_BASE:
-   vq_state.avail_index = vq->last_avail_idx;
+   vq_state.split.avail_index = vq->last_avail_idx;
if (ops->set_vq_state(vdpa, idx, _state))
r = -EINVAL;
break;
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index f311d227aa1b..3357ac98878d 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -28,13 +28,34 @@ struct vdpa_notification_area {
 };
 
 /**
- * struct vdpa_vq_state - vDPA vq_state definition
+ * struct vdpa_vq_state_split - vDPA split virtqueue state
  * @avail_index: available index
  */
-struct vdpa_vq_state {
+struct vdpa_vq_state_split {
u16 avail_index;
 };
 
+/**
+ * struct vdpa_vq_state_packed - vDPA packed virtqueue state
+ * 

[PATCH V2 RESEND 0/4] Packed virtqueue state support for vDPA

2021-06-01 Thread Jason Wang
Hi:

This series implements the packed virtqueue state support for
vDPA. This is done via extending the vdpa_vq_state to support packed
virtqueue.

For virtio-vDPA, an initial state required by the virtio spec is set.

For vhost-vDPA, the packed virtqueue support still needs to be done at
both vhost core and vhost-vDPA in the future.

Please review.

Tested with packed=on/off via virtio_vdpa driver.

Change since V1:
- unbreak mlx5_vdpa build

Thanks

Eli Cohen (1):
  virtio/vdpa: clear the virtqueue state during probe

Jason Wang (3):
  vdpa: support packed virtqueue for set/get_vq_state()
  virtio-pci library: introduce vp_modern_get_driver_features()
  vp_vdpa: allow set vq state to initial state after reset

 drivers/vdpa/ifcvf/ifcvf_main.c|  4 +--
 drivers/vdpa/mlx5/net/mlx5_vnet.c  |  8 ++---
 drivers/vdpa/vdpa_sim/vdpa_sim.c   |  4 +--
 drivers/vdpa/virtio_pci/vp_vdpa.c  | 42 --
 drivers/vhost/vdpa.c   |  4 +--
 drivers/virtio/virtio_pci_modern_dev.c | 21 +
 drivers/virtio/virtio_vdpa.c   | 15 +
 include/linux/vdpa.h   | 25 +--
 include/linux/virtio_pci_modern.h  |  1 +
 9 files changed, 109 insertions(+), 15 deletions(-)

-- 
2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH V2 0/4]

2021-06-01 Thread Jason Wang


在 2021/6/2 上午10:10, Jason Wang 写道:

*** BLURB HERE ***



Missing blurb...

Will resend a new version.

Thanks




Eli Cohen (1):
   virtio/vdpa: clear the virtqueue state during probe

Jason Wang (3):
   vdpa: support packed virtqueue for set/get_vq_state()
   virtio-pci library: introduce vp_modern_get_driver_features()
   vp_vdpa: allow set vq state to initial state after reset

  drivers/vdpa/ifcvf/ifcvf_main.c|  4 +--
  drivers/vdpa/mlx5/net/mlx5_vnet.c  |  8 ++---
  drivers/vdpa/vdpa_sim/vdpa_sim.c   |  4 +--
  drivers/vdpa/virtio_pci/vp_vdpa.c  | 42 --
  drivers/vhost/vdpa.c   |  4 +--
  drivers/virtio/virtio_pci_modern_dev.c | 21 +
  drivers/virtio/virtio_vdpa.c   | 15 +
  include/linux/vdpa.h   | 25 +--
  include/linux/virtio_pci_modern.h  |  1 +
  9 files changed, 109 insertions(+), 15 deletions(-)



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 1/4] vdpa: support packed virtqueue for set/get_vq_state()

2021-06-01 Thread Jason Wang


在 2021/6/1 下午6:47, Eli Cohen 写道:

On Tue, Jun 01, 2021 at 04:45:00PM +0800, Jason Wang wrote:

This patch extends the vdpa_vq_state to support packed virtqueue
state which is basically the device/driver ring wrap counters and the
avail and used index. This will be used for the virito-vdpa support
for the packed virtqueue and the future vhost/vhost-vdpa support for
the packed virtqueue.

Signed-off-by: Jason Wang 

You changed interface but did not modify mlx5. Does this compile on your
system?



Yes but I'm using a minimal config without mlx5 enabled :(

V2 is posted.

Thanks





---
  drivers/vdpa/ifcvf/ifcvf_main.c  |  4 ++--
  drivers/vdpa/vdpa_sim/vdpa_sim.c |  4 ++--
  drivers/vhost/vdpa.c |  4 ++--
  include/linux/vdpa.h | 25 +++--
  4 files changed, 29 insertions(+), 8 deletions(-)

diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
index ab0ab5cf0f6e..5d3891b1ca28 100644
--- a/drivers/vdpa/ifcvf/ifcvf_main.c
+++ b/drivers/vdpa/ifcvf/ifcvf_main.c
@@ -264,7 +264,7 @@ static int ifcvf_vdpa_get_vq_state(struct vdpa_device 
*vdpa_dev, u16 qid,
  {
struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev);
  
-	state->avail_index = ifcvf_get_vq_state(vf, qid);

+   state->split.avail_index = ifcvf_get_vq_state(vf, qid);
return 0;
  }
  
@@ -273,7 +273,7 @@ static int ifcvf_vdpa_set_vq_state(struct vdpa_device *vdpa_dev, u16 qid,

  {
struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev);
  
-	return ifcvf_set_vq_state(vf, qid, state->avail_index);

+   return ifcvf_set_vq_state(vf, qid, state->split.avail_index);
  }
  
  static void ifcvf_vdpa_set_vq_cb(struct vdpa_device *vdpa_dev, u16 qid,

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 98f793bc9376..14e024de5cbf 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -374,7 +374,7 @@ static int vdpasim_set_vq_state(struct vdpa_device *vdpa, 
u16 idx,
struct vringh *vrh = >vring;
  
  	spin_lock(>lock);

-   vrh->last_avail_idx = state->avail_index;
+   vrh->last_avail_idx = state->split.avail_index;
spin_unlock(>lock);
  
  	return 0;

@@ -387,7 +387,7 @@ static int vdpasim_get_vq_state(struct vdpa_device *vdpa, 
u16 idx,
struct vdpasim_virtqueue *vq = >vqs[idx];
struct vringh *vrh = >vring;
  
-	state->avail_index = vrh->last_avail_idx;

+   state->split.avail_index = vrh->last_avail_idx;
return 0;
  }
  
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c

index fb41db3da611..210ab35a7ebf 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -383,7 +383,7 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, 
unsigned int cmd,
if (r)
return r;
  
-		vq->last_avail_idx = vq_state.avail_index;

+   vq->last_avail_idx = vq_state.split.avail_index;
break;
}
  
@@ -401,7 +401,7 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,

break;
  
  	case VHOST_SET_VRING_BASE:

-   vq_state.avail_index = vq->last_avail_idx;
+   vq_state.split.avail_index = vq->last_avail_idx;
if (ops->set_vq_state(vdpa, idx, _state))
r = -EINVAL;
break;
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index f311d227aa1b..3357ac98878d 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -28,13 +28,34 @@ struct vdpa_notification_area {
  };
  
  /**

- * struct vdpa_vq_state - vDPA vq_state definition
+ * struct vdpa_vq_state_split - vDPA split virtqueue state
   * @avail_index: available index
   */
-struct vdpa_vq_state {
+struct vdpa_vq_state_split {
u16 avail_index;
  };
  
+/**

+ * struct vdpa_vq_state_packed - vDPA packed virtqueue state
+ * @last_avail_counter: last driver ring wrap counter observed by device
+ * @last_avail_idx: device available index
+ * @last_used_counter: device ring wrap counter
+ * @last_used_idx: used index
+ */
+struct vdpa_vq_state_packed {
+u16last_avail_counter:1;
+u16last_avail_idx:15;
+u16last_used_counter:1;
+u16last_used_idx:15;
+};
+
+struct vdpa_vq_state {
+ union {
+  struct vdpa_vq_state_split split;
+  struct vdpa_vq_state_packed packed;
+ };
+};
+
  struct vdpa_mgmt_dev;
  
  /**

--
2.25.1



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH V2 4/4] virtio/vdpa: clear the virtqueue state during probe

2021-06-01 Thread Jason Wang
From: Eli Cohen 

Clear the available index as part of the initialization process to
clear and values that might be left from previous usage of the device.
For example, if the device was previously used by vhost_vdpa and now
probed by vhost_vdpa, you want to start with indices.

Fixes: c043b4a8cf3b ("virtio: introduce a vDPA based transport")
Signed-off-by: Eli Cohen 
Signed-off-by: Jason Wang 
---
 drivers/virtio/virtio_vdpa.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
index e28acf482e0c..e1a141135992 100644
--- a/drivers/virtio/virtio_vdpa.c
+++ b/drivers/virtio/virtio_vdpa.c
@@ -142,6 +142,8 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned 
int index,
struct vdpa_callback cb;
struct virtqueue *vq;
u64 desc_addr, driver_addr, device_addr;
+   /* Assume split virtqueue, switch to packed if necessary */
+   struct vdpa_vq_state state = {0};
unsigned long flags;
u32 align, num;
int err;
@@ -191,6 +193,19 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned 
int index,
goto err_vq;
}
 
+   /* reset virtqueue state index */
+   if (virtio_has_feature(vdev, VIRTIO_F_RING_PACKED)) {
+   struct vdpa_vq_state_packed *s = 
+
+   s->last_avail_counter = 1;
+   s->last_avail_idx = 0;
+   s->last_used_counter = 1;
+   s->last_used_idx = 0;
+   }
+   err = ops->set_vq_state(vdpa, index, );
+   if (err)
+   goto err_vq;
+
ops->set_vq_ready(vdpa, index, 1);
 
vq->priv = info;
-- 
2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH V2 3/4] vp_vdpa: allow set vq state to initial state after reset

2021-06-01 Thread Jason Wang
We used to fail the set_vq_state() since it was not supported yet by
the virtio spec. But if the bus tries to set the state which is equal
to the device initial state after reset, we can let it go.

This is a must for virtio_vdpa() to set vq state during probe which is
required for some vDPA parents.

Signed-off-by: Jason Wang 
---
 drivers/vdpa/virtio_pci/vp_vdpa.c | 42 ---
 1 file changed, 39 insertions(+), 3 deletions(-)

diff --git a/drivers/vdpa/virtio_pci/vp_vdpa.c 
b/drivers/vdpa/virtio_pci/vp_vdpa.c
index c76ebb531212..18bf4a422772 100644
--- a/drivers/vdpa/virtio_pci/vp_vdpa.c
+++ b/drivers/vdpa/virtio_pci/vp_vdpa.c
@@ -210,13 +210,49 @@ static int vp_vdpa_get_vq_state(struct vdpa_device *vdpa, 
u16 qid,
return -EOPNOTSUPP;
 }
 
+static int vp_vdpa_set_vq_state_split(struct vdpa_device *vdpa,
+ const struct vdpa_vq_state *state)
+{
+   const struct vdpa_vq_state_split *split = >split;
+
+   if (split->avail_index == 0)
+   return 0;
+
+   return -EOPNOTSUPP;
+}
+
+static int vp_vdpa_set_vq_state_packed(struct vdpa_device *vdpa,
+  const struct vdpa_vq_state *state)
+{
+   const struct vdpa_vq_state_packed *packed = >packed;
+
+   if (packed->last_avail_counter == 1 &&
+   packed->last_avail_idx == 0 &&
+   packed->last_used_counter == 1 &&
+   packed->last_used_idx == 0)
+   return 0;
+
+   return -EOPNOTSUPP;
+}
+
 static int vp_vdpa_set_vq_state(struct vdpa_device *vdpa, u16 qid,
const struct vdpa_vq_state *state)
 {
-   /* Note that this is not supported by virtio specification, so
-* we return -ENOPOTSUPP here. This means we can't support live
-* migration, vhost device start/stop.
+   struct virtio_pci_modern_device *mdev = vdpa_to_mdev(vdpa);
+
+   /* Note that this is not supported by virtio specification.
+* But if the state is by chance equal to the device initial
+* state, we can let it go.
 */
+   if ((vp_modern_get_status(mdev) & VIRTIO_CONFIG_S_FEATURES_OK) &&
+   !vp_modern_get_queue_enable(mdev, qid)) {
+   if (vp_modern_get_driver_features(mdev) &
+   BIT_ULL(VIRTIO_F_RING_PACKED))
+   return vp_vdpa_set_vq_state_packed(vdpa, state);
+   else
+   return vp_vdpa_set_vq_state_split(vdpa, state);
+   }
+
return -EOPNOTSUPP;
 }
 
-- 
2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH V2 2/4] virtio-pci library: introduce vp_modern_get_driver_features()

2021-06-01 Thread Jason Wang
This patch introduce a helper to get driver/guest features from the
device.

Signed-off-by: Jason Wang 
---
 drivers/virtio/virtio_pci_modern_dev.c | 21 +
 include/linux/virtio_pci_modern.h  |  1 +
 2 files changed, 22 insertions(+)

diff --git a/drivers/virtio/virtio_pci_modern_dev.c 
b/drivers/virtio/virtio_pci_modern_dev.c
index 54f297028586..e11ed748e661 100644
--- a/drivers/virtio/virtio_pci_modern_dev.c
+++ b/drivers/virtio/virtio_pci_modern_dev.c
@@ -383,6 +383,27 @@ u64 vp_modern_get_features(struct virtio_pci_modern_device 
*mdev)
 }
 EXPORT_SYMBOL_GPL(vp_modern_get_features);
 
+/*
+ * vp_modern_get_driver_features - get driver features from device
+ * @mdev: the modern virtio-pci device
+ *
+ * Returns the driver features read from the device
+ */
+u64 vp_modern_get_driver_features(struct virtio_pci_modern_device *mdev)
+{
+   struct virtio_pci_common_cfg __iomem *cfg = mdev->common;
+
+   u64 features;
+
+   vp_iowrite32(0, >guest_feature_select);
+   features = vp_ioread32(>guest_feature);
+   vp_iowrite32(1, >guest_feature_select);
+   features |= ((u64)vp_ioread32(>guest_feature) << 32);
+
+   return features;
+}
+EXPORT_SYMBOL_GPL(vp_modern_get_driver_features);
+
 /*
  * vp_modern_set_features - set features to device
  * @mdev: the modern virtio-pci device
diff --git a/include/linux/virtio_pci_modern.h 
b/include/linux/virtio_pci_modern.h
index 6a95b58fd0f4..eb2bd9b4077d 100644
--- a/include/linux/virtio_pci_modern.h
+++ b/include/linux/virtio_pci_modern.h
@@ -79,6 +79,7 @@ static inline void vp_iowrite64_twopart(u64 val,
 }
 
 u64 vp_modern_get_features(struct virtio_pci_modern_device *mdev);
+u64 vp_modern_get_driver_features(struct virtio_pci_modern_device *mdev);
 void vp_modern_set_features(struct virtio_pci_modern_device *mdev,
 u64 features);
 u32 vp_modern_generation(struct virtio_pci_modern_device *mdev);
-- 
2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH V2 1/4] vdpa: support packed virtqueue for set/get_vq_state()

2021-06-01 Thread Jason Wang
This patch extends the vdpa_vq_state to support packed virtqueue
state which is basically the device/driver ring wrap counters and the
avail and used index. This will be used for the virito-vdpa support
for the packed virtqueue and the future vhost/vhost-vdpa support for
the packed virtqueue.

Signed-off-by: Jason Wang 
---
 drivers/vdpa/ifcvf/ifcvf_main.c   |  4 ++--
 drivers/vdpa/mlx5/net/mlx5_vnet.c |  8 
 drivers/vdpa/vdpa_sim/vdpa_sim.c  |  4 ++--
 drivers/vhost/vdpa.c  |  4 ++--
 include/linux/vdpa.h  | 25 +++--
 5 files changed, 33 insertions(+), 12 deletions(-)

diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
index ab0ab5cf0f6e..5d3891b1ca28 100644
--- a/drivers/vdpa/ifcvf/ifcvf_main.c
+++ b/drivers/vdpa/ifcvf/ifcvf_main.c
@@ -264,7 +264,7 @@ static int ifcvf_vdpa_get_vq_state(struct vdpa_device 
*vdpa_dev, u16 qid,
 {
struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev);
 
-   state->avail_index = ifcvf_get_vq_state(vf, qid);
+   state->split.avail_index = ifcvf_get_vq_state(vf, qid);
return 0;
 }
 
@@ -273,7 +273,7 @@ static int ifcvf_vdpa_set_vq_state(struct vdpa_device 
*vdpa_dev, u16 qid,
 {
struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev);
 
-   return ifcvf_set_vq_state(vf, qid, state->avail_index);
+   return ifcvf_set_vq_state(vf, qid, state->split.avail_index);
 }
 
 static void ifcvf_vdpa_set_vq_cb(struct vdpa_device *vdpa_dev, u16 qid,
diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c 
b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 189e4385df40..e5505d760bca 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1427,8 +1427,8 @@ static int mlx5_vdpa_set_vq_state(struct vdpa_device 
*vdev, u16 idx,
return -EINVAL;
}
 
-   mvq->used_idx = state->avail_index;
-   mvq->avail_idx = state->avail_index;
+   mvq->used_idx = state->split.avail_index;
+   mvq->avail_idx = state->split.avail_index;
return 0;
 }
 
@@ -1449,7 +1449,7 @@ static int mlx5_vdpa_get_vq_state(struct vdpa_device 
*vdev, u16 idx, struct vdpa
 * Since both values should be identical, we take the value of
 * used_idx which is reported correctly.
 */
-   state->avail_index = mvq->used_idx;
+   state->split.avail_index = mvq->used_idx;
return 0;
}
 
@@ -1458,7 +1458,7 @@ static int mlx5_vdpa_get_vq_state(struct vdpa_device 
*vdev, u16 idx, struct vdpa
mlx5_vdpa_warn(mvdev, "failed to query virtqueue\n");
return err;
}
-   state->avail_index = attr.used_index;
+   state->split.avail_index = attr.used_index;
return 0;
 }
 
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 98f793bc9376..14e024de5cbf 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -374,7 +374,7 @@ static int vdpasim_set_vq_state(struct vdpa_device *vdpa, 
u16 idx,
struct vringh *vrh = >vring;
 
spin_lock(>lock);
-   vrh->last_avail_idx = state->avail_index;
+   vrh->last_avail_idx = state->split.avail_index;
spin_unlock(>lock);
 
return 0;
@@ -387,7 +387,7 @@ static int vdpasim_get_vq_state(struct vdpa_device *vdpa, 
u16 idx,
struct vdpasim_virtqueue *vq = >vqs[idx];
struct vringh *vrh = >vring;
 
-   state->avail_index = vrh->last_avail_idx;
+   state->split.avail_index = vrh->last_avail_idx;
return 0;
 }
 
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index fb41db3da611..210ab35a7ebf 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -383,7 +383,7 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, 
unsigned int cmd,
if (r)
return r;
 
-   vq->last_avail_idx = vq_state.avail_index;
+   vq->last_avail_idx = vq_state.split.avail_index;
break;
}
 
@@ -401,7 +401,7 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, 
unsigned int cmd,
break;
 
case VHOST_SET_VRING_BASE:
-   vq_state.avail_index = vq->last_avail_idx;
+   vq_state.split.avail_index = vq->last_avail_idx;
if (ops->set_vq_state(vdpa, idx, _state))
r = -EINVAL;
break;
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index f311d227aa1b..3357ac98878d 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -28,13 +28,34 @@ struct vdpa_notification_area {
 };
 
 /**
- * struct vdpa_vq_state - vDPA vq_state definition
+ * struct vdpa_vq_state_split - vDPA split virtqueue state
  * @avail_index: available index
  */
-struct vdpa_vq_state {
+struct vdpa_vq_state_split {
u16 avail_index;
 };
 
+/**
+ * struct vdpa_vq_state_packed - vDPA packed virtqueue state
+ * 

[PATCH V2 0/4]

2021-06-01 Thread Jason Wang
*** BLURB HERE ***

Eli Cohen (1):
  virtio/vdpa: clear the virtqueue state during probe

Jason Wang (3):
  vdpa: support packed virtqueue for set/get_vq_state()
  virtio-pci library: introduce vp_modern_get_driver_features()
  vp_vdpa: allow set vq state to initial state after reset

 drivers/vdpa/ifcvf/ifcvf_main.c|  4 +--
 drivers/vdpa/mlx5/net/mlx5_vnet.c  |  8 ++---
 drivers/vdpa/vdpa_sim/vdpa_sim.c   |  4 +--
 drivers/vdpa/virtio_pci/vp_vdpa.c  | 42 --
 drivers/vhost/vdpa.c   |  4 +--
 drivers/virtio/virtio_pci_modern_dev.c | 21 +
 drivers/virtio/virtio_vdpa.c   | 15 +
 include/linux/vdpa.h   | 25 +--
 include/linux/virtio_pci_modern.h  |  1 +
 9 files changed, 109 insertions(+), 15 deletions(-)

-- 
2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: virtio-net: kernel panic in virtio_net.c

2021-06-01 Thread Xuan Zhuo
On Tue, 1 Jun 2021 19:47:44 +0200, Eric Dumazet  wrote:
> On Tue, Jun 1, 2021 at 7:09 PM Corentin Noël
>  wrote:
> >
> > Le mardi 01 juin 2021 à 19:07 +0200, Greg KH a écrit :
> > > On Tue, Jun 01, 2021 at 06:06:50PM +0200, Corentin Noël wrote:
> > > > I've been experiencing crashes with 5.13 that do not occur with
> > > > 5.12,
> > > > here is the crash trace:
> > > >
> > > > [   47.713713] skbuff: skb_over_panic: text:b73a8354
> > > > len:3762
> > > > put:3762 head:9e1e1e48e000 data:9e1e1e48e010 tail:0xec2
> > > > end:0xec0 dev:
> > > > [   47.716267] kernel BUG at net/core/skbuff.c:110!
> > > > [   47.717197] invalid opcode:  [#1] SMP PTI
> > > > [   47.718049] CPU: 2 PID: 730 Comm: llvmpipe-0 Not tainted 5.13.0-
> > > > rc3linux-v5.13-rc3-for-mesa-ci-87614d7f3282.tar.bz2 #1
> > > > [   47.719739] Hardware name: ChromiumOS crosvm, BIOS 0
> > > > [   47.720656] RIP: 0010:skb_panic+0x43/0x45
> > > > [   47.721426] Code: 4f 70 50 8b 87 bc 00 00 00 50 8b 87 b8 00 00
> > > > 00 50
> > > > ff b7 c8 00 00 00 4c 8b 8f c0 00 00 00 48 c7 c7 78 ae ef b7 e8 7f
> > > > 4c fb
> > > > ff <0f> 0b 48 8b 14 24 48 c7 c1 a0 22 d1 b7 e8 ab ff ff ff 48 c7 c6
> > > > e0
> > > > [   47.725944] RSP: :acec01347c20 EFLAGS: 00010246
> > > > [   47.726735] RAX: 008b RBX: 0010 RCX:
> > > > dfff
> > > > [   47.727820] RDX:  RSI: ffea RDI:
> > > > 
> > > > [   47.729096] RBP: eb2700792380 R08: b8144b08 R09:
> > > > 9ffb
> > > > [   47.730260] R10: e000 R11: 3fff R12:
> > > > 9e1e1e95b300
> > > > [   47.731411] R13:  R14: 9e1e1e48e000 R15:
> > > > 0eb2
> > > > [   47.732541] FS:  7f3a82b53700()
> > > > GS:9e1f2bd0()
> > > > knlGS:
> > > > [   47.733858] CS:  0010 DS:  ES:  CR0: 80050033
> > > > [   47.734813] CR2: 010d24f8 CR3: 12d6e004 CR4:
> > > > 00370ee0
> > > > [   47.735968] DR0:  DR1:  DR2:
> > > > 
> > > > [   47.737091] DR3:  DR6: fffe0ff0 DR7:
> > > > 0400
> > > > [   47.738318] Call Trace:
> > > > [   47.738812]  skb_put.cold+0x10/0x10
> > > > [   47.739450]  page_to_skb+0xe4/0x400
> > > > [   47.740072]  receive_buf+0x86/0x1660
> > > > [   47.740693]  ? inet_gro_receive+0x54/0x2c0
> > > > [   47.741279]  ? dev_gro_receive+0x194/0x6a0
> > > > [   47.741846]  virtnet_poll+0x2b8/0x3c0
> > > > [   47.742357]  __napi_poll+0x25/0x150
> > > > [   47.742844]  net_rx_action+0x22f/0x280
> > > > [   47.743388]  __do_softirq+0xba/0x264
> > > > [   47.743947]  irq_exit_rcu+0x90/0xb0
> > > > [   47.744435]  common_interrupt+0x40/0xa0
> > > > [   47.744978]  ? asm_common_interrupt+0x8/0x40
> > > > [   47.745582]  asm_common_interrupt+0x1e/0x40
> > > > [   47.746182] RIP: 0033:0x7f3a7a276ed4
> > > > [   47.746708] Code: a0 03 00 00 c5 fc 29 84 24 40 0f 00 00 c5 bc
> > > > 54 c8
> > > > c5 7c 28 84 24 80 01 00 00 c5 bc 59 e9 c5 fe 5b ed c5 fd 76 c0 c5
> > > > d5 fa
> > > > c0  fd db ec c5 fd 7f 84 24 20 0f 00 00 c5 fc 5b ed c4 e2 55 b8
> > > > cb
> > > > [   47.749292] RSP: 002b:7f3a82b4dba0 EFLAGS: 0212
> > > > [   47.750006] RAX: 7f3a8c210324 RBX:  RCX:
> > > > 
> > > > [   47.750964] RDX: 7f3a8c210348 RSI: 7f3a8c21034c RDI:
> > > > 7f3a7c0575a0
> > > > [   47.752049] RBP: 7f3a82b52ca0 R08: 7f3a8c210350 R09:
> > > > 7f3a8c210354
> > > > [   47.753161] R10: 7f3a8c210358 R11: ffef R12:
> > > > 7f3a8c210340
> > > > [   47.754260] R13: 7f3a8c210344 R14: 7f3a7c057580 R15:
> > > > 7f3a8c21033c
> > > > [   47.755354] Modules linked in:
> > > > [   47.755871] ---[ end trace a8b692ea99c9cd9e ]---
> > > > [   47.756606] RIP: 0010:skb_panic+0x43/0x45
> > > > [   47.757297] Code: 4f 70 50 8b 87 bc 00 00 00 50 8b 87 b8 00 00
> > > > 00 50
> > > > ff b7 c8 00 00 00 4c 8b 8f c0 00 00 00 48 c7 c7 78 ae ef b7 e8 7f
> > > > 4c fb
> > > > ff <0f> 0b 48 8b 14 24 48 c7 c1 a0 22 d1 b7 e8 ab ff ff ff 48 c7 c6
> > > > e0
> > > > [   47.760168] RSP: :acec01347c20 EFLAGS: 00010246
> > > > [   47.760896] RAX: 008b RBX: 0010 RCX:
> > > > dfff
> > > > [   47.761903] RDX:  RSI: ffea RDI:
> > > > 
> > > > [   47.762945] RBP: eb2700792380 R08: b8144b08 R09:
> > > > 9ffb
> > > > [   47.764059] R10: e000 R11: 3fff R12:
> > > > 9e1e1e95b300
> > > > [   47.765169] R13:  R14: 9e1e1e48e000 R15:
> > > > 0eb2
> > > > [   47.766261] FS:  7f3a82b53700()
> > > > GS:9e1f2bd0()
> > > > knlGS:
> > > > [   47.767512] CS:  0010 DS:  ES:  CR0: 80050033
> > > > [   47.768389] CR2: 010d24f8 CR3: 12d6e004 CR4:

Re: [PATCH 1/4] vdpa: support packed virtqueue for set/get_vq_state()

2021-06-01 Thread kernel test robot
Hi Jason,

I love your patch! Yet something to improve:

[auto build test ERROR on vhost/linux-next]
[also build test ERROR on linus/master v5.13-rc4 next-20210601]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Jason-Wang/Packed-virtqueue-state-support-for-vDPA/20210601-164715
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git linux-next
config: alpha-allyesconfig (attached as .config)
compiler: alpha-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/0day-ci/linux/commit/eccc56e52d8c89dd93da5df0362931151417eb6a
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Jason-Wang/Packed-virtqueue-state-support-for-vDPA/20210601-164715
git checkout eccc56e52d8c89dd93da5df0362931151417eb6a
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross 
ARCH=alpha 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   drivers/vdpa/mlx5/net/mlx5_vnet.c: In function 'mlx5_vdpa_set_vq_state':
>> drivers/vdpa/mlx5/net/mlx5_vnet.c:1430:23: error: 'const struct 
>> vdpa_vq_state' has no member named 'avail_index'
1430 |  mvq->used_idx = state->avail_index;
 |   ^~
   drivers/vdpa/mlx5/net/mlx5_vnet.c:1431:24: error: 'const struct 
vdpa_vq_state' has no member named 'avail_index'
1431 |  mvq->avail_idx = state->avail_index;
 |^~
   drivers/vdpa/mlx5/net/mlx5_vnet.c: In function 'mlx5_vdpa_get_vq_state':
>> drivers/vdpa/mlx5/net/mlx5_vnet.c:1452:8: error: 'struct vdpa_vq_state' has 
>> no member named 'avail_index'
1452 |   state->avail_index = mvq->used_idx;
 |^~
   drivers/vdpa/mlx5/net/mlx5_vnet.c:1461:7: error: 'struct vdpa_vq_state' has 
no member named 'avail_index'
1461 |  state->avail_index = attr.used_index;
 |   ^~


vim +1430 drivers/vdpa/mlx5/net/mlx5_vnet.c

1a86b377aa2147 Eli Cohen  2020-08-04  1417  
1a86b377aa2147 Eli Cohen  2020-08-04  1418  static int 
mlx5_vdpa_set_vq_state(struct vdpa_device *vdev, u16 idx,
1a86b377aa2147 Eli Cohen  2020-08-04  1419const 
struct vdpa_vq_state *state)
1a86b377aa2147 Eli Cohen  2020-08-04  1420  {
1a86b377aa2147 Eli Cohen  2020-08-04  1421  struct mlx5_vdpa_dev *mvdev = 
to_mvdev(vdev);
1a86b377aa2147 Eli Cohen  2020-08-04  1422  struct mlx5_vdpa_net *ndev = 
to_mlx5_vdpa_ndev(mvdev);
1a86b377aa2147 Eli Cohen  2020-08-04  1423  struct mlx5_vdpa_virtqueue *mvq 
= >vqs[idx];
1a86b377aa2147 Eli Cohen  2020-08-04  1424  
1a86b377aa2147 Eli Cohen  2020-08-04  1425  if (mvq->fw_state == 
MLX5_VIRTIO_NET_Q_OBJECT_STATE_RDY) {
1a86b377aa2147 Eli Cohen  2020-08-04  1426  mlx5_vdpa_warn(mvdev, 
"can't modify available index\n");
1a86b377aa2147 Eli Cohen  2020-08-04  1427  return -EINVAL;
1a86b377aa2147 Eli Cohen  2020-08-04  1428  }
1a86b377aa2147 Eli Cohen  2020-08-04  1429  
bc04d93ea30a0a Eli Cohen  2021-04-08 @1430  mvq->used_idx = 
state->avail_index;
1a86b377aa2147 Eli Cohen  2020-08-04  1431  mvq->avail_idx = 
state->avail_index;
1a86b377aa2147 Eli Cohen  2020-08-04  1432  return 0;
1a86b377aa2147 Eli Cohen  2020-08-04  1433  }
1a86b377aa2147 Eli Cohen  2020-08-04  1434  
1a86b377aa2147 Eli Cohen  2020-08-04  1435  static int 
mlx5_vdpa_get_vq_state(struct vdpa_device *vdev, u16 idx, struct vdpa_vq_state 
*state)
1a86b377aa2147 Eli Cohen  2020-08-04  1436  {
1a86b377aa2147 Eli Cohen  2020-08-04  1437  struct mlx5_vdpa_dev *mvdev = 
to_mvdev(vdev);
1a86b377aa2147 Eli Cohen  2020-08-04  1438  struct mlx5_vdpa_net *ndev = 
to_mlx5_vdpa_ndev(mvdev);
1a86b377aa2147 Eli Cohen  2020-08-04  1439  struct mlx5_vdpa_virtqueue *mvq 
= >vqs[idx];
1a86b377aa2147 Eli Cohen  2020-08-04  1440  struct mlx5_virtq_attr attr;
1a86b377aa2147 Eli Cohen  2020-08-04  1441  int err;
1a86b377aa2147 Eli Cohen  2020-08-04  1442  
3176e974a750d6 Si-Wei Liu 2020-10-01  1443  /* If the virtq object was 
destroyed, use the value saved at
3176e974a750d6 Si-Wei Liu 2020-10-01  1444   * the last minute of 
suspend_vq. This caters for userspace
3176e974a750d6 Si-Wei Liu 2020-10-01  1445   * that cares about emulating 
the index after vq is stopped.
3176e974a750d6 Si-Wei Liu 2020-10-01  1446   */
3176e974a750d6 Si-Wei Liu 2020-10-01  1447  if (!mvq->initialized) {
bc04d93ea30a0a Eli Cohen  2021-04-08  1448   

Re: virtio-net: kernel panic in virtio_net.c

2021-06-01 Thread Greg KH
On Tue, Jun 01, 2021 at 06:06:50PM +0200, Corentin Noël wrote:
> I've been experiencing crashes with 5.13 that do not occur with 5.12,
> here is the crash trace:
> 
> [   47.713713] skbuff: skb_over_panic: text:b73a8354 len:3762
> put:3762 head:9e1e1e48e000 data:9e1e1e48e010 tail:0xec2
> end:0xec0 dev:
> [   47.716267] kernel BUG at net/core/skbuff.c:110!
> [   47.717197] invalid opcode:  [#1] SMP PTI
> [   47.718049] CPU: 2 PID: 730 Comm: llvmpipe-0 Not tainted 5.13.0-
> rc3linux-v5.13-rc3-for-mesa-ci-87614d7f3282.tar.bz2 #1
> [   47.719739] Hardware name: ChromiumOS crosvm, BIOS 0 
> [   47.720656] RIP: 0010:skb_panic+0x43/0x45
> [   47.721426] Code: 4f 70 50 8b 87 bc 00 00 00 50 8b 87 b8 00 00 00 50
> ff b7 c8 00 00 00 4c 8b 8f c0 00 00 00 48 c7 c7 78 ae ef b7 e8 7f 4c fb
> ff <0f> 0b 48 8b 14 24 48 c7 c1 a0 22 d1 b7 e8 ab ff ff ff 48 c7 c6 e0
> [   47.725944] RSP: :acec01347c20 EFLAGS: 00010246
> [   47.726735] RAX: 008b RBX: 0010 RCX:
> dfff
> [   47.727820] RDX:  RSI: ffea RDI:
> 
> [   47.729096] RBP: eb2700792380 R08: b8144b08 R09:
> 9ffb
> [   47.730260] R10: e000 R11: 3fff R12:
> 9e1e1e95b300
> [   47.731411] R13:  R14: 9e1e1e48e000 R15:
> 0eb2
> [   47.732541] FS:  7f3a82b53700() GS:9e1f2bd0()
> knlGS:
> [   47.733858] CS:  0010 DS:  ES:  CR0: 80050033
> [   47.734813] CR2: 010d24f8 CR3: 12d6e004 CR4:
> 00370ee0
> [   47.735968] DR0:  DR1:  DR2:
> 
> [   47.737091] DR3:  DR6: fffe0ff0 DR7:
> 0400
> [   47.738318] Call Trace:
> [   47.738812]  skb_put.cold+0x10/0x10
> [   47.739450]  page_to_skb+0xe4/0x400
> [   47.740072]  receive_buf+0x86/0x1660
> [   47.740693]  ? inet_gro_receive+0x54/0x2c0
> [   47.741279]  ? dev_gro_receive+0x194/0x6a0
> [   47.741846]  virtnet_poll+0x2b8/0x3c0
> [   47.742357]  __napi_poll+0x25/0x150
> [   47.742844]  net_rx_action+0x22f/0x280
> [   47.743388]  __do_softirq+0xba/0x264
> [   47.743947]  irq_exit_rcu+0x90/0xb0
> [   47.744435]  common_interrupt+0x40/0xa0
> [   47.744978]  ? asm_common_interrupt+0x8/0x40
> [   47.745582]  asm_common_interrupt+0x1e/0x40
> [   47.746182] RIP: 0033:0x7f3a7a276ed4
> [   47.746708] Code: a0 03 00 00 c5 fc 29 84 24 40 0f 00 00 c5 bc 54 c8
> c5 7c 28 84 24 80 01 00 00 c5 bc 59 e9 c5 fe 5b ed c5 fd 76 c0 c5 d5 fa
> c0  fd db ec c5 fd 7f 84 24 20 0f 00 00 c5 fc 5b ed c4 e2 55 b8 cb
> [   47.749292] RSP: 002b:7f3a82b4dba0 EFLAGS: 0212
> [   47.750006] RAX: 7f3a8c210324 RBX:  RCX:
> 
> [   47.750964] RDX: 7f3a8c210348 RSI: 7f3a8c21034c RDI:
> 7f3a7c0575a0
> [   47.752049] RBP: 7f3a82b52ca0 R08: 7f3a8c210350 R09:
> 7f3a8c210354
> [   47.753161] R10: 7f3a8c210358 R11: ffef R12:
> 7f3a8c210340
> [   47.754260] R13: 7f3a8c210344 R14: 7f3a7c057580 R15:
> 7f3a8c21033c
> [   47.755354] Modules linked in:
> [   47.755871] ---[ end trace a8b692ea99c9cd9e ]---
> [   47.756606] RIP: 0010:skb_panic+0x43/0x45
> [   47.757297] Code: 4f 70 50 8b 87 bc 00 00 00 50 8b 87 b8 00 00 00 50
> ff b7 c8 00 00 00 4c 8b 8f c0 00 00 00 48 c7 c7 78 ae ef b7 e8 7f 4c fb
> ff <0f> 0b 48 8b 14 24 48 c7 c1 a0 22 d1 b7 e8 ab ff ff ff 48 c7 c6 e0
> [   47.760168] RSP: :acec01347c20 EFLAGS: 00010246
> [   47.760896] RAX: 008b RBX: 0010 RCX:
> dfff
> [   47.761903] RDX:  RSI: ffea RDI:
> 
> [   47.762945] RBP: eb2700792380 R08: b8144b08 R09:
> 9ffb
> [   47.764059] R10: e000 R11: 3fff R12:
> 9e1e1e95b300
> [   47.765169] R13:  R14: 9e1e1e48e000 R15:
> 0eb2
> [   47.766261] FS:  7f3a82b53700() GS:9e1f2bd0()
> knlGS:
> [   47.767512] CS:  0010 DS:  ES:  CR0: 80050033
> [   47.768389] CR2: 010d24f8 CR3: 12d6e004 CR4:
> 00370ee0
> [   47.769381] DR0:  DR1:  DR2:
> 
> [   47.770362] DR3:  DR6: fffe0ff0 DR7:
> 0400
> [   47.771339] Kernel panic - not syncing: Fatal exception in interrupt
> [   47.772814] Kernel Offset: 0x35c0 from 0x8100
> (relocation range: 0x8000-0xbfff)
> 
> I've been able to bisect the issue a little bit and the issue
> disappeared after reverting the 4 following commits:
>  * fb32856b16ad9d5bcd75b76a274e2c515ac7b9d7
>  * af39c8f72301b268ad8b04bae646b6025918b82b
>  * f5d7872a8b8a3176e65dc6f7f0705ce7e9a699e6
>  * f80bd740cb7c954791279590b2e810ba6c214e52
> 
> Here is my kernel config: 
> 

Re: [PATCH net v2 0/2] virtio-net: fix for build_skb()

2021-06-01 Thread Michael S. Tsirkin
On Tue, Jun 01, 2021 at 02:39:58PM +0800, Xuan Zhuo wrote:
> #1 Fixed a serious error.
> #2 Fixed a logical error, but this error did not cause any serious 
> consequences.
> 
> The logic of this piece is really messy. Fortunately, my refactored patch can 
> be
> completed with a small amount of testing.

Looks good, thanks!
Also needed for stable I think.

Acked-by: Michael S. Tsirkin 

> Thanks.
> 
> Xuan Zhuo (2):
>   virtio-net: fix for unable to handle page fault for address
>   virtio_net: get build_skb() buf by data ptr
> 
>  drivers/net/virtio_net.c | 20 
>  1 file changed, 8 insertions(+), 12 deletions(-)
> 
> --
> 2.31.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH V3 2/2] vDPA/ifcvf: implement doorbell mapping for ifcvf

2021-06-01 Thread Jason Wang


在 2021/6/1 下午4:56, Zhu, Lingshan 写道:



On 6/1/2021 4:50 PM, Jason Wang wrote:


在 2021/6/1 下午2:28, Zhu Lingshan 写道:

This commit implements doorbell mapping feature for ifcvf.
This feature maps the notify page to userspace, to eliminate
vmexit when kick a vq.

Signed-off-by: Zhu Lingshan 
---
  drivers/vdpa/ifcvf/ifcvf_main.c | 21 +
  1 file changed, 21 insertions(+)

diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c 
b/drivers/vdpa/ifcvf/ifcvf_main.c

index ab0ab5cf0f6e..d41db042612c 100644
--- a/drivers/vdpa/ifcvf/ifcvf_main.c
+++ b/drivers/vdpa/ifcvf/ifcvf_main.c
@@ -413,6 +413,26 @@ static int ifcvf_vdpa_get_vq_irq(struct 
vdpa_device *vdpa_dev,

  return vf->vring[qid].irq;
  }
  +static struct vdpa_notification_area 
ifcvf_get_vq_notification(struct vdpa_device *vdpa_dev,

+   u16 idx)
+{
+    struct ifcvf_adapter *adapter = vdpa_to_adapter(vdpa_dev);
+    struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev);
+    struct pci_dev *pdev = adapter->pdev;
+    struct vdpa_notification_area area;
+
+    area.addr = vf->vring[idx].notify_pa;
+    if (!vf->notify_off_multiplier)
+    area.size = PAGE_SIZE;
+    else
+    area.size = vf->notify_off_multiplier;
+
+    if (area.addr % PAGE_SIZE)
+    IFCVF_DBG(pdev, "vq %u doorbell address is not PAGE_SIZE 
aligned\n", idx);



I don't see the reason to keep this, or get_notification is not the 
proper place to do this kind of warning.


Thanks
some customers have ever complained have troubles to enable such 
features with their IP,

I think this can help them debug.



If you want to do this, the ifcvf_init_hw() is the proper place.

Note that this function is called by userspace.

Thanks




Thanks




+
+    return area;
+}
+
  /*
   * IFCVF currently does't have on-chip IOMMU, so not
   * implemented set_map()/dma_map()/dma_unmap()
@@ -440,6 +460,7 @@ static const struct vdpa_config_ops ifc_vdpa_ops 
= {

  .get_config    = ifcvf_vdpa_get_config,
  .set_config    = ifcvf_vdpa_set_config,
  .set_config_cb  = ifcvf_vdpa_set_config_cb,
+    .get_vq_notification = ifcvf_get_vq_notification,
  };
    static int ifcvf_probe(struct pci_dev *pdev, const struct 
pci_device_id *id)






___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH net v2 2/2] virtio_net: get build_skb() buf by data ptr

2021-06-01 Thread Jason Wang


在 2021/6/1 下午2:40, Xuan Zhuo 写道:

In the case of merge, the page passed into page_to_skb() may be a head
page, not the page where the current data is located. So when trying to
get the buf where the data is located, we should get buf based on
headroom instead of offset.

This patch solves this problem. But if you don't use this patch, the
original code can also run, because if the page is not the page of the
current data, the calculated tailroom will be less than 0, and will not
enter the logic of build_skb() . The significance of this patch is to
modify this logical problem, allowing more situations to use
build_skb().

Signed-off-by: Xuan Zhuo 
---
  drivers/net/virtio_net.c | 17 ++---
  1 file changed, 6 insertions(+), 11 deletions(-)



Acked-by: Jason Wang 




diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 6b929aca155a..fa407eb8b457 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -401,18 +401,13 @@ static struct sk_buff *page_to_skb(struct virtnet_info 
*vi,
/* If headroom is not 0, there is an offset between the beginning of the
 * data and the allocated space, otherwise the data and the allocated
 * space are aligned.
+*
+* Buffers with headroom use PAGE_SIZE as alloc size, see
+* add_recvbuf_mergeable() + get_mergeable_buf_len()
 */
-   if (headroom) {
-   /* Buffers with headroom use PAGE_SIZE as alloc size,
-* see add_recvbuf_mergeable() + get_mergeable_buf_len()
-*/
-   truesize = PAGE_SIZE;
-   tailroom = truesize - len - offset;
-   buf = page_address(page);
-   } else {
-   tailroom = truesize - len;
-   buf = p;
-   }
+   truesize = headroom ? PAGE_SIZE : truesize;
+   tailroom = truesize - len - headroom;
+   buf = p - headroom;
  
  	len -= hdr_len;

offset += hdr_padded_len;


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH net v2 1/2] virtio-net: fix for unable to handle page fault for address

2021-06-01 Thread Jason Wang


在 2021/6/1 下午2:39, Xuan Zhuo 写道:

In merge mode, when xdp is enabled, if the headroom of buf is smaller
than virtnet_get_headroom(), xdp_linearize_page() will be called but the
variable of "headroom" is still 0, which leads to wrong logic after
entering page_to_skb().

[   16.600944] BUG: unable to handle page fault for address: ecbfff7b43c8[  
 16.602175] #PF: supervisor read access in kernel mode
[   16.603350] #PF: error_code(0x) - not-present page
[   16.604200] PGD 0 P4D 0
[   16.604686] Oops:  [#1] SMP PTI
[   16.605306] CPU: 4 PID: 715 Comm: sh Tainted: GB 5.12.0+ #312
[   16.606429] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/04
[   16.608217] RIP: 0010:unmap_page_range+0x947/0xde0
[   16.609014] Code: 00 00 08 00 48 83 f8 01 45 19 e4 41 f7 d4 41 83 e4 03 e9 
a4 fd ff ff e8 b7 63 ed ff 4c 89 e0 48 c1 e0 065
[   16.611863] RSP: 0018:c90002503c58 EFLAGS: 00010286
[   16.612720] RAX: ecbfff7b43c0 RBX: 7f19f7203000 RCX: 812ff359
[   16.613853] RDX: 888107778000 RSI:  RDI: 0005
[   16.614976] RBP: ea000425e000 R08:  R09: 3030303030303030
[   16.616124] R10: 82ed7d94 R11: 6637303030302052 R12: 7c0afffded0f
[   16.617276] R13: 0001 R14: 888119ee7010 R15: 7f19f7202000
[   16.618423] FS:  () GS:88842fd0() 
knlGS:
[   16.619738] CS:  0010 DS:  ES:  CR0: 80050033
[   16.620670] CR2: ecbfff7b43c8 CR3: 000103220005 CR4: 00370ee0
[   16.621792] DR0:  DR1:  DR2: 
[   16.622920] DR3:  DR6: fffe0ff0 DR7: 0400
[   16.624047] Call Trace:
[   16.624525]  ? release_pages+0x24d/0x730
[   16.625209]  unmap_single_vma+0xa9/0x130
[   16.625885]  unmap_vmas+0x76/0xf0
[   16.626480]  exit_mmap+0xa0/0x210
[   16.627129]  mmput+0x67/0x180
[   16.627673]  do_exit+0x3d1/0xf10
[   16.628259]  ? do_user_addr_fault+0x231/0x840
[   16.629000]  do_group_exit+0x53/0xd0
[   16.629631]  __x64_sys_exit_group+0x1d/0x20
[   16.630354]  do_syscall_64+0x3c/0x80
[   16.630988]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   16.631828] RIP: 0033:0x7f1a043d0191
[   16.632464] Code: Unable to access opcode bytes at RIP 0x7f1a043d0167.
[   16.633502] RSP: 002b:7ffe3d993308 EFLAGS: 0246 ORIG_RAX: 
00e7
[   16.634737] RAX: ffda RBX: 7f1a044c9490 RCX: 7f1a043d0191
[   16.635857] RDX: 003c RSI: 00e7 RDI: 
[   16.636986] RBP:  R08: ff88 R09: 0001
[   16.638120] R10: 0008 R11: 0246 R12: 7f1a044c9490
[   16.639245] R13: 0001 R14: 7f1a044c9968 R15: 
[   16.640408] Modules linked in:
[   16.640958] CR2: ecbfff7b43c8
[   16.641557] ---[ end trace bc4891c6ce46354c ]---
[   16.642335] RIP: 0010:unmap_page_range+0x947/0xde0
[   16.643135] Code: 00 00 08 00 48 83 f8 01 45 19 e4 41 f7 d4 41 83 e4 03 e9 
a4 fd ff ff e8 b7 63 ed ff 4c 89 e0 48 c1 e0 065
[   16.645983] RSP: 0018:c90002503c58 EFLAGS: 00010286
[   16.646845] RAX: ecbfff7b43c0 RBX: 7f19f7203000 RCX: 812ff359
[   16.647970] RDX: 888107778000 RSI:  RDI: 0005
[   16.649091] RBP: ea000425e000 R08:  R09: 3030303030303030
[   16.650250] R10: 82ed7d94 R11: 6637303030302052 R12: 7c0afffded0f
[   16.651394] R13: 0001 R14: 888119ee7010 R15: 7f19f7202000
[   16.652529] FS:  () GS:88842fd0() 
knlGS:
[   16.653887] CS:  0010 DS:  ES:  CR0: 80050033
[   16.654841] CR2: ecbfff7b43c8 CR3: 000103220005 CR4: 00370ee0
[   16.655992] DR0:  DR1:  DR2: 
[   16.657150] DR3:  DR6: fffe0ff0 DR7: 0400
[   16.658290] Kernel panic - not syncing: Fatal exception
[   16.659613] Kernel Offset: disabled
[   16.660234] ---[ end Kernel panic - not syncing: Fatal exception ]---

Fixes: fb32856b16ad ("virtio-net: page_to_skb() use build_skb when there's 
sufficient tailroom")
Signed-off-by: Xuan Zhuo 



Acked-by: Jason Wang 



---
  drivers/net/virtio_net.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 9b6a4a875c55..6b929aca155a 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -958,7 +958,8 @@ static struct sk_buff *receive_mergeable(struct net_device 
*dev,
put_page(page);
head_skb = page_to_skb(vi, rq, xdp_page, offset,
   len, PAGE_SIZE, false,
-  metasize, 

Re: [PATCH V3 1/2] vDPA/ifcvf: record virtio notify base

2021-06-01 Thread Jason Wang


在 2021/6/1 下午2:28, Zhu Lingshan 写道:

This commit records virtio notify base physical addr and
calculate doorbell physical address for vqs.

Signed-off-by: Zhu Lingshan 



Acked-by: Jason Wang 



---
  drivers/vdpa/ifcvf/ifcvf_base.c | 4 
  drivers/vdpa/ifcvf/ifcvf_base.h | 2 ++
  2 files changed, 6 insertions(+)

diff --git a/drivers/vdpa/ifcvf/ifcvf_base.c b/drivers/vdpa/ifcvf/ifcvf_base.c
index 1a661ab45af5..6e197fe0fcf9 100644
--- a/drivers/vdpa/ifcvf/ifcvf_base.c
+++ b/drivers/vdpa/ifcvf/ifcvf_base.c
@@ -133,6 +133,8 @@ int ifcvf_init_hw(struct ifcvf_hw *hw, struct pci_dev *pdev)
  >notify_off_multiplier);
hw->notify_bar = cap.bar;
hw->notify_base = get_cap_addr(hw, );
+   hw->notify_base_pa = pci_resource_start(pdev, cap.bar) +
+   le32_to_cpu(cap.offset);
IFCVF_DBG(pdev, "hw->notify_base = %p\n",
  hw->notify_base);
break;
@@ -161,6 +163,8 @@ int ifcvf_init_hw(struct ifcvf_hw *hw, struct pci_dev *pdev)
notify_off = ifc_ioread16(>common_cfg->queue_notify_off);
hw->vring[i].notify_addr = hw->notify_base +
notify_off * hw->notify_off_multiplier;
+   hw->vring[i].notify_pa = hw->notify_base_pa +
+   notify_off * hw->notify_off_multiplier;
}
  
  	hw->lm_cfg = hw->base[IFCVF_LM_BAR];

diff --git a/drivers/vdpa/ifcvf/ifcvf_base.h b/drivers/vdpa/ifcvf/ifcvf_base.h
index 0111bfdeb342..447f4ad9c0bf 100644
--- a/drivers/vdpa/ifcvf/ifcvf_base.h
+++ b/drivers/vdpa/ifcvf/ifcvf_base.h
@@ -73,6 +73,7 @@ struct vring_info {
u16 last_avail_idx;
bool ready;
void __iomem *notify_addr;
+   phys_addr_t notify_pa;
u32 irq;
struct vdpa_callback cb;
char msix_name[256];
@@ -87,6 +88,7 @@ struct ifcvf_hw {
u8 notify_bar;
/* Notificaiton bar address */
void __iomem *notify_base;
+   phys_addr_t notify_base_pa;
u32 notify_off_multiplier;
u64 req_features;
u64 hw_features;


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH 4/4] virtio/vdpa: clear the virtqueue state during probe

2021-06-01 Thread Jason Wang
From: Eli Cohen 

Clear the available index as part of the initialization process to
clear and values that might be left from previous usage of the device.
For example, if the device was previously used by vhost_vdpa and now
probed by vhost_vdpa, you want to start with indices.

Fixes: c043b4a8cf3b ("virtio: introduce a vDPA based transport")
Signed-off-by: Eli Cohen 
Signed-off-by: Jason Wang 
---
 drivers/virtio/virtio_vdpa.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
index e28acf482e0c..e1a141135992 100644
--- a/drivers/virtio/virtio_vdpa.c
+++ b/drivers/virtio/virtio_vdpa.c
@@ -142,6 +142,8 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned 
int index,
struct vdpa_callback cb;
struct virtqueue *vq;
u64 desc_addr, driver_addr, device_addr;
+   /* Assume split virtqueue, switch to packed if necessary */
+   struct vdpa_vq_state state = {0};
unsigned long flags;
u32 align, num;
int err;
@@ -191,6 +193,19 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned 
int index,
goto err_vq;
}
 
+   /* reset virtqueue state index */
+   if (virtio_has_feature(vdev, VIRTIO_F_RING_PACKED)) {
+   struct vdpa_vq_state_packed *s = 
+
+   s->last_avail_counter = 1;
+   s->last_avail_idx = 0;
+   s->last_used_counter = 1;
+   s->last_used_idx = 0;
+   }
+   err = ops->set_vq_state(vdpa, index, );
+   if (err)
+   goto err_vq;
+
ops->set_vq_ready(vdpa, index, 1);
 
vq->priv = info;
-- 
2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH 3/4] vp_vdpa: allow set vq state to initial state after reset

2021-06-01 Thread Jason Wang
We used to fail the set_vq_state() since it was not supported yet by
the virtio spec. But if the bus tries to set the state which is equal
to the device initial state after reset, we can let it go.

This is a must for virtio_vdpa() to set vq state during probe which is
required for some vDPA parents.

Signed-off-by: Jason Wang 
---
 drivers/vdpa/virtio_pci/vp_vdpa.c | 42 ---
 1 file changed, 39 insertions(+), 3 deletions(-)

diff --git a/drivers/vdpa/virtio_pci/vp_vdpa.c 
b/drivers/vdpa/virtio_pci/vp_vdpa.c
index c76ebb531212..18bf4a422772 100644
--- a/drivers/vdpa/virtio_pci/vp_vdpa.c
+++ b/drivers/vdpa/virtio_pci/vp_vdpa.c
@@ -210,13 +210,49 @@ static int vp_vdpa_get_vq_state(struct vdpa_device *vdpa, 
u16 qid,
return -EOPNOTSUPP;
 }
 
+static int vp_vdpa_set_vq_state_split(struct vdpa_device *vdpa,
+ const struct vdpa_vq_state *state)
+{
+   const struct vdpa_vq_state_split *split = >split;
+
+   if (split->avail_index == 0)
+   return 0;
+
+   return -EOPNOTSUPP;
+}
+
+static int vp_vdpa_set_vq_state_packed(struct vdpa_device *vdpa,
+  const struct vdpa_vq_state *state)
+{
+   const struct vdpa_vq_state_packed *packed = >packed;
+
+   if (packed->last_avail_counter == 1 &&
+   packed->last_avail_idx == 0 &&
+   packed->last_used_counter == 1 &&
+   packed->last_used_idx == 0)
+   return 0;
+
+   return -EOPNOTSUPP;
+}
+
 static int vp_vdpa_set_vq_state(struct vdpa_device *vdpa, u16 qid,
const struct vdpa_vq_state *state)
 {
-   /* Note that this is not supported by virtio specification, so
-* we return -ENOPOTSUPP here. This means we can't support live
-* migration, vhost device start/stop.
+   struct virtio_pci_modern_device *mdev = vdpa_to_mdev(vdpa);
+
+   /* Note that this is not supported by virtio specification.
+* But if the state is by chance equal to the device initial
+* state, we can let it go.
 */
+   if ((vp_modern_get_status(mdev) & VIRTIO_CONFIG_S_FEATURES_OK) &&
+   !vp_modern_get_queue_enable(mdev, qid)) {
+   if (vp_modern_get_driver_features(mdev) &
+   BIT_ULL(VIRTIO_F_RING_PACKED))
+   return vp_vdpa_set_vq_state_packed(vdpa, state);
+   else
+   return vp_vdpa_set_vq_state_split(vdpa, state);
+   }
+
return -EOPNOTSUPP;
 }
 
-- 
2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH 2/4] virtio-pci library: introduce vp_modern_get_driver_features()

2021-06-01 Thread Jason Wang
This patch introduce a helper to get driver/guest features from the
device.

Signed-off-by: Jason Wang 
---
 drivers/virtio/virtio_pci_modern_dev.c | 21 +
 include/linux/virtio_pci_modern.h  |  1 +
 2 files changed, 22 insertions(+)

diff --git a/drivers/virtio/virtio_pci_modern_dev.c 
b/drivers/virtio/virtio_pci_modern_dev.c
index 54f297028586..e11ed748e661 100644
--- a/drivers/virtio/virtio_pci_modern_dev.c
+++ b/drivers/virtio/virtio_pci_modern_dev.c
@@ -383,6 +383,27 @@ u64 vp_modern_get_features(struct virtio_pci_modern_device 
*mdev)
 }
 EXPORT_SYMBOL_GPL(vp_modern_get_features);
 
+/*
+ * vp_modern_get_driver_features - get driver features from device
+ * @mdev: the modern virtio-pci device
+ *
+ * Returns the driver features read from the device
+ */
+u64 vp_modern_get_driver_features(struct virtio_pci_modern_device *mdev)
+{
+   struct virtio_pci_common_cfg __iomem *cfg = mdev->common;
+
+   u64 features;
+
+   vp_iowrite32(0, >guest_feature_select);
+   features = vp_ioread32(>guest_feature);
+   vp_iowrite32(1, >guest_feature_select);
+   features |= ((u64)vp_ioread32(>guest_feature) << 32);
+
+   return features;
+}
+EXPORT_SYMBOL_GPL(vp_modern_get_driver_features);
+
 /*
  * vp_modern_set_features - set features to device
  * @mdev: the modern virtio-pci device
diff --git a/include/linux/virtio_pci_modern.h 
b/include/linux/virtio_pci_modern.h
index 6a95b58fd0f4..eb2bd9b4077d 100644
--- a/include/linux/virtio_pci_modern.h
+++ b/include/linux/virtio_pci_modern.h
@@ -79,6 +79,7 @@ static inline void vp_iowrite64_twopart(u64 val,
 }
 
 u64 vp_modern_get_features(struct virtio_pci_modern_device *mdev);
+u64 vp_modern_get_driver_features(struct virtio_pci_modern_device *mdev);
 void vp_modern_set_features(struct virtio_pci_modern_device *mdev,
 u64 features);
 u32 vp_modern_generation(struct virtio_pci_modern_device *mdev);
-- 
2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH 1/4] vdpa: support packed virtqueue for set/get_vq_state()

2021-06-01 Thread Jason Wang
This patch extends the vdpa_vq_state to support packed virtqueue
state which is basically the device/driver ring wrap counters and the
avail and used index. This will be used for the virito-vdpa support
for the packed virtqueue and the future vhost/vhost-vdpa support for
the packed virtqueue.

Signed-off-by: Jason Wang 
---
 drivers/vdpa/ifcvf/ifcvf_main.c  |  4 ++--
 drivers/vdpa/vdpa_sim/vdpa_sim.c |  4 ++--
 drivers/vhost/vdpa.c |  4 ++--
 include/linux/vdpa.h | 25 +++--
 4 files changed, 29 insertions(+), 8 deletions(-)

diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
index ab0ab5cf0f6e..5d3891b1ca28 100644
--- a/drivers/vdpa/ifcvf/ifcvf_main.c
+++ b/drivers/vdpa/ifcvf/ifcvf_main.c
@@ -264,7 +264,7 @@ static int ifcvf_vdpa_get_vq_state(struct vdpa_device 
*vdpa_dev, u16 qid,
 {
struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev);
 
-   state->avail_index = ifcvf_get_vq_state(vf, qid);
+   state->split.avail_index = ifcvf_get_vq_state(vf, qid);
return 0;
 }
 
@@ -273,7 +273,7 @@ static int ifcvf_vdpa_set_vq_state(struct vdpa_device 
*vdpa_dev, u16 qid,
 {
struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev);
 
-   return ifcvf_set_vq_state(vf, qid, state->avail_index);
+   return ifcvf_set_vq_state(vf, qid, state->split.avail_index);
 }
 
 static void ifcvf_vdpa_set_vq_cb(struct vdpa_device *vdpa_dev, u16 qid,
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 98f793bc9376..14e024de5cbf 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -374,7 +374,7 @@ static int vdpasim_set_vq_state(struct vdpa_device *vdpa, 
u16 idx,
struct vringh *vrh = >vring;
 
spin_lock(>lock);
-   vrh->last_avail_idx = state->avail_index;
+   vrh->last_avail_idx = state->split.avail_index;
spin_unlock(>lock);
 
return 0;
@@ -387,7 +387,7 @@ static int vdpasim_get_vq_state(struct vdpa_device *vdpa, 
u16 idx,
struct vdpasim_virtqueue *vq = >vqs[idx];
struct vringh *vrh = >vring;
 
-   state->avail_index = vrh->last_avail_idx;
+   state->split.avail_index = vrh->last_avail_idx;
return 0;
 }
 
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index fb41db3da611..210ab35a7ebf 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -383,7 +383,7 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, 
unsigned int cmd,
if (r)
return r;
 
-   vq->last_avail_idx = vq_state.avail_index;
+   vq->last_avail_idx = vq_state.split.avail_index;
break;
}
 
@@ -401,7 +401,7 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, 
unsigned int cmd,
break;
 
case VHOST_SET_VRING_BASE:
-   vq_state.avail_index = vq->last_avail_idx;
+   vq_state.split.avail_index = vq->last_avail_idx;
if (ops->set_vq_state(vdpa, idx, _state))
r = -EINVAL;
break;
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index f311d227aa1b..3357ac98878d 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -28,13 +28,34 @@ struct vdpa_notification_area {
 };
 
 /**
- * struct vdpa_vq_state - vDPA vq_state definition
+ * struct vdpa_vq_state_split - vDPA split virtqueue state
  * @avail_index: available index
  */
-struct vdpa_vq_state {
+struct vdpa_vq_state_split {
u16 avail_index;
 };
 
+/**
+ * struct vdpa_vq_state_packed - vDPA packed virtqueue state
+ * @last_avail_counter: last driver ring wrap counter observed by device
+ * @last_avail_idx: device available index
+ * @last_used_counter: device ring wrap counter
+ * @last_used_idx: used index
+ */
+struct vdpa_vq_state_packed {
+u16last_avail_counter:1;
+u16last_avail_idx:15;
+u16last_used_counter:1;
+u16last_used_idx:15;
+};
+
+struct vdpa_vq_state {
+ union {
+  struct vdpa_vq_state_split split;
+  struct vdpa_vq_state_packed packed;
+ };
+};
+
 struct vdpa_mgmt_dev;
 
 /**
-- 
2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH 0/4] Packed virtqueue state support for vDPA

2021-06-01 Thread Jason Wang
Hi:

This series implements the packed virtqueue state support for
vDPA. This is done via extending the vdpa_vq_state to support packed
virtqueue.

For virtio-vDPA, an initial state required by the virtio spec is set.

For vhost-vDPA, the packed virtqueue support still needs to be done at
both vhost core and vhost-vDPA in the future.

Please review.

Eli Cohen (1):
  virtio/vdpa: clear the virtqueue state during probe

Jason Wang (3):
  vdpa: support packed virtqueue for set/get_vq_state()
  virtio-pci library: introduce vp_modern_get_driver_features()
  vp_vdpa: allow set vq state to initial state after reset

 drivers/vdpa/ifcvf/ifcvf_main.c|  4 +--
 drivers/vdpa/vdpa_sim/vdpa_sim.c   |  4 +--
 drivers/vdpa/virtio_pci/vp_vdpa.c  | 42 --
 drivers/vhost/vdpa.c   |  4 +--
 drivers/virtio/virtio_pci_modern_dev.c | 21 +
 drivers/virtio/virtio_vdpa.c   | 15 +
 include/linux/vdpa.h   | 25 +--
 include/linux/virtio_pci_modern.h  |  1 +
 8 files changed, 105 insertions(+), 11 deletions(-)

-- 
2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH net v2 1/2] virtio-net: fix for unable to handle page fault for address

2021-06-01 Thread Xuan Zhuo
In merge mode, when xdp is enabled, if the headroom of buf is smaller
than virtnet_get_headroom(), xdp_linearize_page() will be called but the
variable of "headroom" is still 0, which leads to wrong logic after
entering page_to_skb().

[   16.600944] BUG: unable to handle page fault for address: ecbfff7b43c8[  
 16.602175] #PF: supervisor read access in kernel mode
[   16.603350] #PF: error_code(0x) - not-present page
[   16.604200] PGD 0 P4D 0
[   16.604686] Oops:  [#1] SMP PTI
[   16.605306] CPU: 4 PID: 715 Comm: sh Tainted: GB 5.12.0+ #312
[   16.606429] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/04
[   16.608217] RIP: 0010:unmap_page_range+0x947/0xde0
[   16.609014] Code: 00 00 08 00 48 83 f8 01 45 19 e4 41 f7 d4 41 83 e4 03 e9 
a4 fd ff ff e8 b7 63 ed ff 4c 89 e0 48 c1 e0 065
[   16.611863] RSP: 0018:c90002503c58 EFLAGS: 00010286
[   16.612720] RAX: ecbfff7b43c0 RBX: 7f19f7203000 RCX: 812ff359
[   16.613853] RDX: 888107778000 RSI:  RDI: 0005
[   16.614976] RBP: ea000425e000 R08:  R09: 3030303030303030
[   16.616124] R10: 82ed7d94 R11: 6637303030302052 R12: 7c0afffded0f
[   16.617276] R13: 0001 R14: 888119ee7010 R15: 7f19f7202000
[   16.618423] FS:  () GS:88842fd0() 
knlGS:
[   16.619738] CS:  0010 DS:  ES:  CR0: 80050033
[   16.620670] CR2: ecbfff7b43c8 CR3: 000103220005 CR4: 00370ee0
[   16.621792] DR0:  DR1:  DR2: 
[   16.622920] DR3:  DR6: fffe0ff0 DR7: 0400
[   16.624047] Call Trace:
[   16.624525]  ? release_pages+0x24d/0x730
[   16.625209]  unmap_single_vma+0xa9/0x130
[   16.625885]  unmap_vmas+0x76/0xf0
[   16.626480]  exit_mmap+0xa0/0x210
[   16.627129]  mmput+0x67/0x180
[   16.627673]  do_exit+0x3d1/0xf10
[   16.628259]  ? do_user_addr_fault+0x231/0x840
[   16.629000]  do_group_exit+0x53/0xd0
[   16.629631]  __x64_sys_exit_group+0x1d/0x20
[   16.630354]  do_syscall_64+0x3c/0x80
[   16.630988]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   16.631828] RIP: 0033:0x7f1a043d0191
[   16.632464] Code: Unable to access opcode bytes at RIP 0x7f1a043d0167.
[   16.633502] RSP: 002b:7ffe3d993308 EFLAGS: 0246 ORIG_RAX: 
00e7
[   16.634737] RAX: ffda RBX: 7f1a044c9490 RCX: 7f1a043d0191
[   16.635857] RDX: 003c RSI: 00e7 RDI: 
[   16.636986] RBP:  R08: ff88 R09: 0001
[   16.638120] R10: 0008 R11: 0246 R12: 7f1a044c9490
[   16.639245] R13: 0001 R14: 7f1a044c9968 R15: 
[   16.640408] Modules linked in:
[   16.640958] CR2: ecbfff7b43c8
[   16.641557] ---[ end trace bc4891c6ce46354c ]---
[   16.642335] RIP: 0010:unmap_page_range+0x947/0xde0
[   16.643135] Code: 00 00 08 00 48 83 f8 01 45 19 e4 41 f7 d4 41 83 e4 03 e9 
a4 fd ff ff e8 b7 63 ed ff 4c 89 e0 48 c1 e0 065
[   16.645983] RSP: 0018:c90002503c58 EFLAGS: 00010286
[   16.646845] RAX: ecbfff7b43c0 RBX: 7f19f7203000 RCX: 812ff359
[   16.647970] RDX: 888107778000 RSI:  RDI: 0005
[   16.649091] RBP: ea000425e000 R08:  R09: 3030303030303030
[   16.650250] R10: 82ed7d94 R11: 6637303030302052 R12: 7c0afffded0f
[   16.651394] R13: 0001 R14: 888119ee7010 R15: 7f19f7202000
[   16.652529] FS:  () GS:88842fd0() 
knlGS:
[   16.653887] CS:  0010 DS:  ES:  CR0: 80050033
[   16.654841] CR2: ecbfff7b43c8 CR3: 000103220005 CR4: 00370ee0
[   16.655992] DR0:  DR1:  DR2: 
[   16.657150] DR3:  DR6: fffe0ff0 DR7: 0400
[   16.658290] Kernel panic - not syncing: Fatal exception
[   16.659613] Kernel Offset: disabled
[   16.660234] ---[ end Kernel panic - not syncing: Fatal exception ]---

Fixes: fb32856b16ad ("virtio-net: page_to_skb() use build_skb when there's 
sufficient tailroom")
Signed-off-by: Xuan Zhuo 
---
 drivers/net/virtio_net.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 9b6a4a875c55..6b929aca155a 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -958,7 +958,8 @@ static struct sk_buff *receive_mergeable(struct net_device 
*dev,
put_page(page);
head_skb = page_to_skb(vi, rq, xdp_page, offset,
   len, PAGE_SIZE, false,
-  metasize, headroom);
+  

[PATCH net v2 2/2] virtio_net: get build_skb() buf by data ptr

2021-06-01 Thread Xuan Zhuo
In the case of merge, the page passed into page_to_skb() may be a head
page, not the page where the current data is located. So when trying to
get the buf where the data is located, we should get buf based on
headroom instead of offset.

This patch solves this problem. But if you don't use this patch, the
original code can also run, because if the page is not the page of the
current data, the calculated tailroom will be less than 0, and will not
enter the logic of build_skb() . The significance of this patch is to
modify this logical problem, allowing more situations to use
build_skb().

Signed-off-by: Xuan Zhuo 
---
 drivers/net/virtio_net.c | 17 ++---
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 6b929aca155a..fa407eb8b457 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -401,18 +401,13 @@ static struct sk_buff *page_to_skb(struct virtnet_info 
*vi,
/* If headroom is not 0, there is an offset between the beginning of the
 * data and the allocated space, otherwise the data and the allocated
 * space are aligned.
+*
+* Buffers with headroom use PAGE_SIZE as alloc size, see
+* add_recvbuf_mergeable() + get_mergeable_buf_len()
 */
-   if (headroom) {
-   /* Buffers with headroom use PAGE_SIZE as alloc size,
-* see add_recvbuf_mergeable() + get_mergeable_buf_len()
-*/
-   truesize = PAGE_SIZE;
-   tailroom = truesize - len - offset;
-   buf = page_address(page);
-   } else {
-   tailroom = truesize - len;
-   buf = p;
-   }
+   truesize = headroom ? PAGE_SIZE : truesize;
+   tailroom = truesize - len - headroom;
+   buf = p - headroom;
 
len -= hdr_len;
offset += hdr_padded_len;
-- 
2.31.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH net v2 0/2] virtio-net: fix for build_skb()

2021-06-01 Thread Xuan Zhuo
#1 Fixed a serious error.
#2 Fixed a logical error, but this error did not cause any serious consequences.

The logic of this piece is really messy. Fortunately, my refactored patch can be
completed with a small amount of testing.

Thanks.

Xuan Zhuo (2):
  virtio-net: fix for unable to handle page fault for address
  virtio_net: get build_skb() buf by data ptr

 drivers/net/virtio_net.c | 20 
 1 file changed, 8 insertions(+), 12 deletions(-)

--
2.31.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH net 2/2] virtio-net: get build_skb() buf by data ptr

2021-06-01 Thread Xuan Zhuo
On Tue, 1 Jun 2021 14:17:27 +0800, Jason Wang  wrote:
>
> 在 2021/6/1 下午1:59, Xuan Zhuo 写道:
> > On Tue, 1 Jun 2021 11:27:12 +0800, Jason Wang  wrote:
> >> 在 2021/6/1 上午11:08, Xuan Zhuo 写道:
> >>> On Tue, 1 Jun 2021 11:03:37 +0800, Jason Wang  wrote:
>  在 2021/5/31 下午6:58, Xuan Zhuo 写道:
> > On Mon, 31 May 2021 14:10:55 +0800, Jason Wang  
> > wrote:
> >> 在 2021/5/14 下午11:16, Xuan Zhuo 写道:
> >>> In the case of merge, the page passed into page_to_skb() may be a head
> >>> page, not the page where the current data is located.
> >> I don't get how this can happen?
> >>
> >> Maybe you can explain a little bit more?
> >>
> >> receive_mergeable() call page_to_skb() in two places:
> >>
> >> 1) XDP_PASS for linearized page , in this case we use xdp_page
> >> 2) page_to_skb() for "normal" page, in this case the page contains the 
> >> data
> > The offset may be greater than PAGE_SIZE, because page is obtained by
> > virt_to_head_page(), not the page where buf is located. And "offset" is 
> > the offset
> > of buf relative to page.
> >
> > tailroom = truesize - len - offset;
> >
> > In this case, the tailroom must be less than 0. Although there may be 
> > enough
> > content on this page to save skb_shared_info.
>  Interesting, I think we don't use compound pages for virtio-net. (We
>  don't define SKB_FRAG_PAGE_ORDER).
> 
>  Am I wrong?
> >>> It seems to me that it seems to be a fixed setting, not for us to 
> >>> configure
> >>> independently
> >>
> >> Looks like you are right.
> >>
> >> See comments below.
> >>
> >>
> >>> Thanks.
> >>>
> >>> ==
> >>>
> >>> net/sock.c
> >>>
> >>> #define SKB_FRAG_PAGE_ORDER   get_order(32768)
> >>> DEFINE_STATIC_KEY_FALSE(net_high_order_alloc_disable_key);
> >>>
> >>> /**
> >>>* skb_page_frag_refill - check that a page_frag contains enough room
> >>>* @sz: minimum size of the fragment we want to get
> >>>* @pfrag: pointer to page_frag
> >>>* @gfp: priority for memory allocation
> >>>*
> >>>* Note: While this allocator tries to use high order pages, there is
> >>>* no guarantee that allocations succeed. Therefore, @sz MUST be
> >>>* less or equal than PAGE_SIZE.
> >>>*/
> >>> bool skb_page_frag_refill(unsigned int sz, struct page_frag *pfrag, gfp_t 
> >>> gfp)
> >>> {
> >>>   if (pfrag->page) {
> >>>   if (page_ref_count(pfrag->page) == 1) {
> >>>   pfrag->offset = 0;
> >>>   return true;
> >>>   }
> >>>   if (pfrag->offset + sz <= pfrag->size)
> >>>   return true;
> >>>   put_page(pfrag->page);
> >>>   }
> >>>
> >>>   pfrag->offset = 0;
> >>>   if (SKB_FRAG_PAGE_ORDER &&
> >>>   !static_branch_unlikely(_high_order_alloc_disable_key)) {
> >>>   /* Avoid direct reclaim but allow kswapd to wake */
> >>>   pfrag->page = alloc_pages((gfp & ~__GFP_DIRECT_RECLAIM) |
> >>> __GFP_COMP | __GFP_NOWARN |
> >>> __GFP_NORETRY,
> >>> SKB_FRAG_PAGE_ORDER);
> >>>   if (likely(pfrag->page)) {
> >>>   pfrag->size = PAGE_SIZE << SKB_FRAG_PAGE_ORDER;
> >>>   return true;
> >>>   }
> >>>   }
> >>>   pfrag->page = alloc_page(gfp);
> >>>   if (likely(pfrag->page)) {
> >>>   pfrag->size = PAGE_SIZE;
> >>>   return true;
> >>>   }
> >>>   return false;
> >>> }
> >>> EXPORT_SYMBOL(skb_page_frag_refill);
> >>>
> >>>
>  Thanks
> 
> 
> > Thanks.
> >
> >> Thanks
> >>
> >>
> >>> So when trying to
> >>> get the buf where the data is located, you should directly use the
> >>> pointer(p) to get the address corresponding to the page.
> >>>
> >>> At the same time, the offset of the data in the page should also be
> >>> obtained using offset_in_page().
> >>>
> >>> This patch solves this problem. But if you don’t use this patch, the
> >>> original code can also run, because if the page is not the page of the
> >>> current data, the calculated tailroom will be less than 0, and will 
> >>> not
> >>> enter the logic of build_skb() . The significance of this patch is to
> >>> modify this logical problem, allowing more situations to use
> >>> build_skb().
> >>>
> >>> Signed-off-by: Xuan Zhuo 
> >>> Acked-by: Michael S. Tsirkin 
> >>> ---
> >>>  drivers/net/virtio_net.c | 8 ++--
> >>>  1 file changed, 6 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> >>> index 3e46c12dde08..073fec4c0df1 100644
> >>> --- a/drivers/net/virtio_net.c
> >>> +++ b/drivers/net/virtio_net.c
> >>> @@ -407,8 +407,12 @@ static struct sk_buff *page_to_skb(struct 
> >>> 

Re: [PATCH net 2/2] virtio-net: get build_skb() buf by data ptr

2021-06-01 Thread Jason Wang


在 2021/6/1 下午1:59, Xuan Zhuo 写道:

On Tue, 1 Jun 2021 11:27:12 +0800, Jason Wang  wrote:

在 2021/6/1 上午11:08, Xuan Zhuo 写道:

On Tue, 1 Jun 2021 11:03:37 +0800, Jason Wang  wrote:

在 2021/5/31 下午6:58, Xuan Zhuo 写道:

On Mon, 31 May 2021 14:10:55 +0800, Jason Wang  wrote:

在 2021/5/14 下午11:16, Xuan Zhuo 写道:

In the case of merge, the page passed into page_to_skb() may be a head
page, not the page where the current data is located.

I don't get how this can happen?

Maybe you can explain a little bit more?

receive_mergeable() call page_to_skb() in two places:

1) XDP_PASS for linearized page , in this case we use xdp_page
2) page_to_skb() for "normal" page, in this case the page contains the data

The offset may be greater than PAGE_SIZE, because page is obtained by
virt_to_head_page(), not the page where buf is located. And "offset" is the 
offset
of buf relative to page.

tailroom = truesize - len - offset;

In this case, the tailroom must be less than 0. Although there may be enough
content on this page to save skb_shared_info.

Interesting, I think we don't use compound pages for virtio-net. (We
don't define SKB_FRAG_PAGE_ORDER).

Am I wrong?

It seems to me that it seems to be a fixed setting, not for us to configure
independently


Looks like you are right.

See comments below.



Thanks.

==

net/sock.c

#define SKB_FRAG_PAGE_ORDER get_order(32768)
DEFINE_STATIC_KEY_FALSE(net_high_order_alloc_disable_key);

/**
   * skb_page_frag_refill - check that a page_frag contains enough room
   * @sz: minimum size of the fragment we want to get
   * @pfrag: pointer to page_frag
   * @gfp: priority for memory allocation
   *
   * Note: While this allocator tries to use high order pages, there is
   * no guarantee that allocations succeed. Therefore, @sz MUST be
   * less or equal than PAGE_SIZE.
   */
bool skb_page_frag_refill(unsigned int sz, struct page_frag *pfrag, gfp_t gfp)
{
if (pfrag->page) {
if (page_ref_count(pfrag->page) == 1) {
pfrag->offset = 0;
return true;
}
if (pfrag->offset + sz <= pfrag->size)
return true;
put_page(pfrag->page);
}

pfrag->offset = 0;
if (SKB_FRAG_PAGE_ORDER &&
!static_branch_unlikely(_high_order_alloc_disable_key)) {
/* Avoid direct reclaim but allow kswapd to wake */
pfrag->page = alloc_pages((gfp & ~__GFP_DIRECT_RECLAIM) |
  __GFP_COMP | __GFP_NOWARN |
  __GFP_NORETRY,
  SKB_FRAG_PAGE_ORDER);
if (likely(pfrag->page)) {
pfrag->size = PAGE_SIZE << SKB_FRAG_PAGE_ORDER;
return true;
}
}
pfrag->page = alloc_page(gfp);
if (likely(pfrag->page)) {
pfrag->size = PAGE_SIZE;
return true;
}
return false;
}
EXPORT_SYMBOL(skb_page_frag_refill);



Thanks



Thanks.


Thanks



So when trying to
get the buf where the data is located, you should directly use the
pointer(p) to get the address corresponding to the page.

At the same time, the offset of the data in the page should also be
obtained using offset_in_page().

This patch solves this problem. But if you don’t use this patch, the
original code can also run, because if the page is not the page of the
current data, the calculated tailroom will be less than 0, and will not
enter the logic of build_skb() . The significance of this patch is to
modify this logical problem, allowing more situations to use
build_skb().

Signed-off-by: Xuan Zhuo 
Acked-by: Michael S. Tsirkin 
---
 drivers/net/virtio_net.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 3e46c12dde08..073fec4c0df1 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -407,8 +407,12 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
 * see add_recvbuf_mergeable() + get_mergeable_buf_len()
 */
truesize = PAGE_SIZE;
-   tailroom = truesize - len - offset;
-   buf = page_address(page);
+
+   /* page maybe head page, so we should get the buf by p, not the
+* page
+*/
+   tailroom = truesize - len - offset_in_page(p);


I wonder why offset_in_page(p) is correct? I guess it should be:

tailroom = truesize - len - headroom;

The reason is that the buffer is not necessarily allocated at the page
boundary.

In my understanding, offset_in_page(p) is the offset of p in the page where it
is located. In this case, the two should be equal.



I think not, if the frag is not page aligned. offset_in_page(p) doesn't 
equal 

Re: [PATCH net 2/2] virtio-net: get build_skb() buf by data ptr

2021-06-01 Thread Xuan Zhuo
On Tue, 1 Jun 2021 11:27:12 +0800, Jason Wang  wrote:
>
> 在 2021/6/1 上午11:08, Xuan Zhuo 写道:
> > On Tue, 1 Jun 2021 11:03:37 +0800, Jason Wang  wrote:
> >> 在 2021/5/31 下午6:58, Xuan Zhuo 写道:
> >>> On Mon, 31 May 2021 14:10:55 +0800, Jason Wang  
> >>> wrote:
>  在 2021/5/14 下午11:16, Xuan Zhuo 写道:
> > In the case of merge, the page passed into page_to_skb() may be a head
> > page, not the page where the current data is located.
>  I don't get how this can happen?
> 
>  Maybe you can explain a little bit more?
> 
>  receive_mergeable() call page_to_skb() in two places:
> 
>  1) XDP_PASS for linearized page , in this case we use xdp_page
>  2) page_to_skb() for "normal" page, in this case the page contains the 
>  data
> >>> The offset may be greater than PAGE_SIZE, because page is obtained by
> >>> virt_to_head_page(), not the page where buf is located. And "offset" is 
> >>> the offset
> >>> of buf relative to page.
> >>>
> >>>   tailroom = truesize - len - offset;
> >>>
> >>> In this case, the tailroom must be less than 0. Although there may be 
> >>> enough
> >>> content on this page to save skb_shared_info.
> >>
> >> Interesting, I think we don't use compound pages for virtio-net. (We
> >> don't define SKB_FRAG_PAGE_ORDER).
> >>
> >> Am I wrong?
> >
> > It seems to me that it seems to be a fixed setting, not for us to configure
> > independently
>
>
> Looks like you are right.
>
> See comments below.
>
>
> >
> > Thanks.
> >
> > ==
> >
> > net/sock.c
> >
> > #define SKB_FRAG_PAGE_ORDER get_order(32768)
> > DEFINE_STATIC_KEY_FALSE(net_high_order_alloc_disable_key);
> >
> > /**
> >   * skb_page_frag_refill - check that a page_frag contains enough room
> >   * @sz: minimum size of the fragment we want to get
> >   * @pfrag: pointer to page_frag
> >   * @gfp: priority for memory allocation
> >   *
> >   * Note: While this allocator tries to use high order pages, there is
> >   * no guarantee that allocations succeed. Therefore, @sz MUST be
> >   * less or equal than PAGE_SIZE.
> >   */
> > bool skb_page_frag_refill(unsigned int sz, struct page_frag *pfrag, gfp_t 
> > gfp)
> > {
> > if (pfrag->page) {
> > if (page_ref_count(pfrag->page) == 1) {
> > pfrag->offset = 0;
> > return true;
> > }
> > if (pfrag->offset + sz <= pfrag->size)
> > return true;
> > put_page(pfrag->page);
> > }
> >
> > pfrag->offset = 0;
> > if (SKB_FRAG_PAGE_ORDER &&
> > !static_branch_unlikely(_high_order_alloc_disable_key)) {
> > /* Avoid direct reclaim but allow kswapd to wake */
> > pfrag->page = alloc_pages((gfp & ~__GFP_DIRECT_RECLAIM) |
> >   __GFP_COMP | __GFP_NOWARN |
> >   __GFP_NORETRY,
> >   SKB_FRAG_PAGE_ORDER);
> > if (likely(pfrag->page)) {
> > pfrag->size = PAGE_SIZE << SKB_FRAG_PAGE_ORDER;
> > return true;
> > }
> > }
> > pfrag->page = alloc_page(gfp);
> > if (likely(pfrag->page)) {
> > pfrag->size = PAGE_SIZE;
> > return true;
> > }
> > return false;
> > }
> > EXPORT_SYMBOL(skb_page_frag_refill);
> >
> >
> >> Thanks
> >>
> >>
> >>> Thanks.
> >>>
>  Thanks
> 
> 
> > So when trying to
> > get the buf where the data is located, you should directly use the
> > pointer(p) to get the address corresponding to the page.
> >
> > At the same time, the offset of the data in the page should also be
> > obtained using offset_in_page().
> >
> > This patch solves this problem. But if you don’t use this patch, the
> > original code can also run, because if the page is not the page of the
> > current data, the calculated tailroom will be less than 0, and will not
> > enter the logic of build_skb() . The significance of this patch is to
> > modify this logical problem, allowing more situations to use
> > build_skb().
> >
> > Signed-off-by: Xuan Zhuo 
> > Acked-by: Michael S. Tsirkin 
> > ---
> > drivers/net/virtio_net.c | 8 ++--
> > 1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index 3e46c12dde08..073fec4c0df1 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -407,8 +407,12 @@ static struct sk_buff *page_to_skb(struct 
> > virtnet_info *vi,
> >  * see add_recvbuf_mergeable() + get_mergeable_buf_len()
> >  */
> > truesize = PAGE_SIZE;
> > -   tailroom = truesize - len - offset;
> > -   buf = page_address(page);
> > +
> > +   /* page maybe head