[PATCH v5 3/3] vdpa: Allow VIRTIO_NET_F_RSS in SVQ

2023-10-24 Thread Hawkins Jiawei
Enable SVQ with the VIRTIO_NET_F_RSS feature.

Signed-off-by: Hawkins Jiawei 
---
v5:
  - no changes

v4: 
https://lore.kernel.org/all/4ee7f3f339469f41626ca2c3ac7b1c574ebce901.1697904740.git.yin31...@gmail.com/
  - no code changes

v3: 
https://lore.kernel.org/all/2d2a378291bfac4144a0c0c473cf80415bb580b3.1693299194.git.yin31...@gmail.com/

 net/vhost-vdpa.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index a4cc1381fc..d0614d7954 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -122,6 +122,7 @@ static const uint64_t vdpa_svq_device_features =
 /* VHOST_F_LOG_ALL is exposed by SVQ */
 BIT_ULL(VHOST_F_LOG_ALL) |
 BIT_ULL(VIRTIO_NET_F_HASH_REPORT) |
+BIT_ULL(VIRTIO_NET_F_RSS) |
 BIT_ULL(VIRTIO_NET_F_RSC_EXT) |
 BIT_ULL(VIRTIO_NET_F_STANDBY) |
 BIT_ULL(VIRTIO_NET_F_SPEED_DUPLEX);
-- 
2.25.1




[PATCH v5 1/3] vdpa: Add SetSteeringEBPF method for NetClientState

2023-10-24 Thread Hawkins Jiawei
At present, to enable the VIRTIO_NET_F_RSS feature, eBPF must
be loaded for the vhost backend.

Given that vhost-vdpa is one of the vhost backends, we need to
implement the SetSteeringEBPF method to support RSS for vhost-vdpa,
even though vhost-vdpa calculates the RSS hash in the hardware device
instead of in the kernel via eBPF.

Although this requires QEMU to be compiled with the `--enable-bpf`
option even if the vdpa device does not use eBPF to calculate the
RSS hash, this avoids adding vDPA-specific conditional statements
to enable the VIRTIO_NET_F_RSS feature, which improves code
maintainability.
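
For illustration, a minimal sketch of the call-site shape this hook
plugs into; the helper below is an assumption for context, not code
from this patch. The device model only consults the hook's boolean
result when attaching the steering program:

```c
/*
 * Assumed sketch of the virtio-net call site: only the boolean result
 * is checked, so a backend that hashes in hardware can report success
 * without loading any eBPF program.
 */
static bool attach_steering_ebpf(NetClientState *nc, int prog_fd)
{
    if (!nc->info->set_steering_ebpf) {
        return false;   /* backend cannot steer; RSS is not offered */
    }
    return nc->info->set_steering_ebpf(nc, prog_fd);
}
```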

Suggested-by: Eugenio Pérez 
Signed-off-by: Hawkins Jiawei 
---
v5:
  - no changes

v4: 
https://lore.kernel.org/all/1c6faf4c5c3304c0bf14929143ccedb2e90dbcb2.1697904740.git.yin31...@gmail.com/
  - no code changes

v3: 
https://lore.kernel.org/all/30509e3c3b07bcadd95d5932aeb16820cb022902.1693299194.git.yin31...@gmail.com/

 net/vhost-vdpa.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 54f748d49d..3466936b87 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -241,6 +241,12 @@ static void vhost_vdpa_cleanup(NetClientState *nc)
 }
 }
 
+/** Dummy SetSteeringEBPF to support RSS for vhost-vdpa backend  */
+static bool vhost_vdpa_set_steering_ebpf(NetClientState *nc, int prog_fd)
+{
+return true;
+}
+
 static bool vhost_vdpa_has_vnet_hdr(NetClientState *nc)
 {
 assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
@@ -423,6 +429,7 @@ static NetClientInfo net_vhost_vdpa_info = {
 .has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
 .has_ufo = vhost_vdpa_has_ufo,
 .check_peer_type = vhost_vdpa_check_peer_type,
+.set_steering_ebpf = vhost_vdpa_set_steering_ebpf,
 };
 
 static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index,
@@ -1258,6 +1265,7 @@ static NetClientInfo net_vhost_vdpa_cvq_info = {
 .has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
 .has_ufo = vhost_vdpa_has_ufo,
 .check_peer_type = vhost_vdpa_check_peer_type,
+.set_steering_ebpf = vhost_vdpa_set_steering_ebpf,
 };
 
 /*
-- 
2.25.1




[PATCH v5 0/3] Vhost-vdpa Shadow Virtqueue RSS Support

2023-10-24 Thread Hawkins Jiawei
This series enables the shadowed CVQ to intercept the RSS command,
updates the virtio NIC device model so QEMU sends it in a
migration, and restores that RSS state in the destination.

Note that this series is based on the series
"Vhost-vdpa Shadow Virtqueue Hash calculation Support" at [1].

[1]. https://lore.kernel.org/all/cover.1698194366.git.yin31...@gmail.com/

ChangeLog
=
v5: 
  - resolve conflict with the updated patch 
"Vhost-vdpa Shadow Virtqueue Hash calculation Support" at [1]

v4: https://lore.kernel.org/all/cover.1697904740.git.yin31...@gmail.com/
  - add do_rss argument and related code in vhost_vdpa_net_load_rss()

v3: https://lore.kernel.org/all/cover.1693299194.git.yin31...@gmail.com/
  - resolve conflict with updated patch
"Vhost-vdpa Shadow Virtqueue Hash calculation Support" in patch
"vdpa: Restore receive-side scaling state"

RFC v2: https://lore.kernel.org/all/cover.1691926415.git.yin31...@gmail.com/
  - Correct the feature usage to VIRTIO_NET_F_HASH_REPORT when
loading the hash calculation state in
patch "vdpa: Restore receive-side scaling state"

RFC v1: https://lore.kernel.org/all/cover.1691766252.git.yin31...@gmail.com/

TestStep
=
1. regression testing using vp-vdpa device
  - For L0 guest, boot QEMU with two virtio-net-pci net devices with
`ctrl_vq`, `mq`, `hash` features on, command line like:
-netdev tap,...
-device virtio-net-pci,disable-legacy=on,disable-modern=off,
iommu_platform=on,mq=on,ctrl_vq=on,hash=on,guest_announce=off,
indirect_desc=off,queue_reset=off,guest_uso4=off,guest_uso6=off,
host_uso=off,...

  - For L1 guest, apply the relevant patch series, compile the
source code, and start QEMU with two vdpa devices in SVQ mode with
the `ctrl_vq`, `mq`, `hash` features enabled (an assembled example
follows this list), command line like:
  -netdev type=vhost-vdpa,x-svq=true,...
  -device virtio-net-pci,mq=on,guest_announce=off,ctrl_vq=on,
hash=on,...

  - For L2 source guest, run the following bash command:
```bash
#!/bin/sh

ethtool -K eth0 rxhash on
```
  - Attach gdb to the destination VM and set a breakpoint at
vhost_vdpa_net_load_rss()

  - Execute the live migration in L2 source monitor

  - Result
* with this series, gdb hits the breakpoint and execution continues
without triggering any error or warning.
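
For reference, the command-line fragments above assemble into
something like the following sketch; the vdpa device node, machine
type, and memory options are assumptions for illustration, not taken
from the original posting:

```bash
# Hypothetical assembled L1 invocation (paths and IDs are placeholders)
qemu-system-x86_64 -M q35 -smp 2 -m 4G \
    -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=vdpa0,x-svq=true \
    -device virtio-net-pci,netdev=vdpa0,mq=on,guest_announce=off,\
ctrl_vq=on,hash=on,...
```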




2. test the migration using vp-vdpa device
  - For L0 guest, boot QEMU with two virtio-net-pci net devices with
`in-qemu` RSS, command line like:
-netdev tap,vhost=off...
-device virtio-net-pci,disable-legacy=on,disable-modern=off,
iommu_platform=on,mq=on,ctrl_vq=on,hash=on,rss=on,guest_announce=off,
indirect_desc=off,queue_reset=off,guest_uso4=off,guest_uso6=off,
host_uso=off,...

  - For L1 guest, apply the relevant patch series, compile the
source code, and start QEMU with two vdpa devices in SVQ mode with
the `ctrl_vq`, `mq`, `rss` features enabled, command line like:
  -netdev type=vhost-vdpa,x-svq=true,...
  -device virtio-net-pci,mq=on,guest_announce=off,ctrl_vq=on,
rss=on,...

  - For L2 source guest, run the following bash command:
```bash
#!/bin/sh

ethtool -K eth0 rxhash on
```

  - Execute the live migration in L2 source monitor

  - Result
* with this series, L2 QEMU executes without triggering any
error or warning, and L0 QEMU prints
"Can't load eBPF RSS - fallback to software RSS".

Hawkins Jiawei (3):
  vdpa: Add SetSteeringEBPF method for NetClientState
  vdpa: Restore receive-side scaling state
  vdpa: Allow VIRTIO_NET_F_RSS in SVQ

 net/vhost-vdpa.c | 76 +---
 1 file changed, 53 insertions(+), 23 deletions(-)

-- 
2.25.1




[PATCH v5 2/3] vdpa: Restore receive-side scaling state

2023-10-24 Thread Hawkins Jiawei
This patch reuses vhost_vdpa_net_load_rss() with some
refactoring to restore the receive-side scaling state
at the device's startup.

Signed-off-by: Hawkins Jiawei 
---
v5:
  - resolve conflict with the updated patch 
"Vhost-vdpa Shadow Virtqueue Hash calculation Support"

v4: 
https://lore.kernel.org/all/79caf9bf05778ed5279e11bdd1f26b49baf373ce.1697904740.git.yin31...@gmail.com/
  - add do_rss argument and related code in vhost_vdpa_net_load_rss()

v3: 
https://lore.kernel.org/all/47b17e160ba4e55b24790b7d73b22d2b437ebe3c.1693299194.git.yin31...@gmail.com/
  - resolve conflict with updated patch
"Vhost-vdpa Shadow Virtqueue Hash calculation Support"

RFC v2: 
https://lore.kernel.org/all/af33aa80bc4ef0b2cec6c21b9448866c517fde80.1691926415.git.yin31...@gmail.com/
  - Correct the feature usage to VIRTIO_NET_F_HASH_REPORT when
loading the hash calculation state

RFC v1: 
https://lore.kernel.org/all/93d5d82f0a5df71df326830033e50358c8b6be7a.1691766252.git.yin31...@gmail.com/

 net/vhost-vdpa.c | 67 +++-
 1 file changed, 44 insertions(+), 23 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 3466936b87..a4cc1381fc 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -828,7 +828,7 @@ static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet *n,
 
 static int vhost_vdpa_net_load_rss(VhostVDPAState *s, const VirtIONet *n,
struct iovec *out_cursor,
-   struct iovec *in_cursor)
+   struct iovec *in_cursor, bool do_rss)
 {
 struct virtio_net_rss_config cfg = {};
 ssize_t r;
@@ -854,21 +854,35 @@ static int vhost_vdpa_net_load_rss(VhostVDPAState *s, const VirtIONet *n,
sizeof(n->rss_data.indirections_table[0]));
 cfg.hash_types = cpu_to_le32(n->rss_data.hash_types);
 
-/*
- * According to VirtIO standard, "Field reserved MUST contain zeroes.
- * It is defined to make the structure to match the layout of
- * virtio_net_rss_config structure, defined in 5.1.6.5.7.".
- *
- * Therefore, we need to zero the fields in
- * struct virtio_net_rss_config, which corresponds to the
- * `reserved` field in struct virtio_net_hash_config.
- *
- * Note that all other fields are zeroed at their definitions,
- * except for the `indirection_table` field, where the actual data
- * is stored in the `table` variable to ensure compatibility
- * with RSS case. Therefore, we need to zero the `table` variable here.
- */
-table[0] = 0;
+if (do_rss) {
+/*
+ * According to VirtIO standard, "Number of entries in 
indirection_table
+ * is (indirection_table_mask + 1)".
+ */
+cfg.indirection_table_mask = cpu_to_le16(n->rss_data.indirections_len -
+ 1);
+cfg.unclassified_queue = cpu_to_le16(n->rss_data.default_queue);
+for (int i = 0; i < n->rss_data.indirections_len; ++i) {
+table[i] = cpu_to_le16(n->rss_data.indirections_table[i]);
+}
+cfg.max_tx_vq = cpu_to_le16(n->curr_queue_pairs);
+} else {
+/*
+ * According to VirtIO standard, "Field reserved MUST contain zeroes.
+ * It is defined to make the structure to match the layout of
+ * virtio_net_rss_config structure, defined in 5.1.6.5.7.".
+ *
+ * Therefore, we need to zero the fields in
+ * struct virtio_net_rss_config, which corresponds to the
+ * `reserved` field in struct virtio_net_hash_config.
+ *
+ * Note that all other fields are zeroed at their definitions,
+ * except for the `indirection_table` field, where the actual data
+ * is stored in the `table` variable to ensure compatibility
+ * with RSS case. Therefore, we need to zero the `table` variable here.
+ */
+table[0] = 0;
+}
 
 /*
  * Considering that virtio_net_handle_rss() currently does not restore
@@ -899,6 +913,7 @@ static int vhost_vdpa_net_load_rss(VhostVDPAState *s, const VirtIONet *n,
 
 r = vhost_vdpa_net_load_cmd(s, out_cursor, in_cursor,
 VIRTIO_NET_CTRL_MQ,
+do_rss ? VIRTIO_NET_CTRL_MQ_RSS_CONFIG :
 VIRTIO_NET_CTRL_MQ_HASH_CONFIG,
 data, ARRAY_SIZE(data));
 if (unlikely(r < 0)) {
@@ -933,13 +948,19 @@ static int vhost_vdpa_net_load_mq(VhostVDPAState *s,
 return r;
 }
 
-if (!virtio_vdev_has_feature(&n->parent_obj, VIRTIO_NET_F_HASH_REPORT)) {
-return 0;
-}
-
-r = vhost_vdpa_net_load_rss(s, n, out_cursor, in_cursor);
-if (unlikely(r < 0)) {
-return r;
+if (virtio_vdev_has_feature(&n->parent_obj, VIRTIO_NET_F_RSS)) {
+/* load the receive-side scaling state */
+r = vhost_vdpa_net_load_rss(s, n, out_cursor, in_cursor, true);
+if (unlikely(r < 0)) {
+return r;
+}
+} else if (virtio_vdev_has_feature(&n->parent_obj,
+   VIRTIO_NET_F_HASH_REPORT)) {
+/* load the hash calculation state */
+r = vhost_vdpa_net_load_rss(s, n, out_cursor, in_cursor, false);
+if (unlikely(r < 0)) {
+return r;
+}
 }
 
 return 0;

[PATCH v4 1/2] vdpa: Restore hash calculation state

2023-10-24 Thread Hawkins Jiawei
This patch introduces vhost_vdpa_net_load_rss() to restore
the hash calculation state at the device's startup.

Signed-off-by: Hawkins Jiawei 
---
v4:
  - fix some typos pointed out by Michael
  - zero the `cfg` fields at the definition suggested by Michael

v3: 
https://patchwork.kernel.org/project/qemu-devel/patch/b7cd0c8d6a58b16b086f11714d2908ad35c67caa.1697902949.git.yin31...@gmail.com/
  - remove the `do_rss` argument in vhost_vdpa_net_load_rss()
  - zero reserved fields in "cfg" manually instead of using memset()
to prevent compiler "array-bounds" warning

v2: 
https://lore.kernel.org/all/f5ffad10699001107022851e0560cb394039d6b0.1693297766.git.yin31...@gmail.com/
  - resolve conflict with updated patch
"vdpa: Send all CVQ state load commands in parallel"
  - move the `table` declaration at the beginning of the
vhost_vdpa_net_load_rss()

RFC: 
https://lore.kernel.org/all/a54ca70b12ebe2f3c391864e41241697ab1aba30.1691762906.git.yin31...@gmail.com/

 net/vhost-vdpa.c | 91 
 1 file changed, 91 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 7a226c93bc..e59d40b8ae 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -818,6 +818,88 @@ static int vhost_vdpa_net_load_mac(VhostVDPAState *s, 
const VirtIONet *n,
 return 0;
 }
 
+static int vhost_vdpa_net_load_rss(VhostVDPAState *s, const VirtIONet *n,
+   struct iovec *out_cursor,
+   struct iovec *in_cursor)
+{
+struct virtio_net_rss_config cfg = {};
+ssize_t r;
+g_autofree uint16_t *table = NULL;
+
+/*
+ * According to VirtIO standard, "Initially the device has all hash
+ * types disabled and reports only VIRTIO_NET_HASH_REPORT_NONE.".
+ *
+ * Therefore, there is no need to send this CVQ command if the
+ * driver disables all hash types, which aligns with
+ * the device's defaults.
+ *
+ * Note that the device's defaults can mismatch the driver's
+ * configuration only at live migration.
+ */
+if (!n->rss_data.enabled ||
+n->rss_data.hash_types == VIRTIO_NET_HASH_REPORT_NONE) {
+return 0;
+}
+
+table = g_malloc_n(n->rss_data.indirections_len,
+   sizeof(n->rss_data.indirections_table[0]));
+cfg.hash_types = cpu_to_le32(n->rss_data.hash_types);
+
+/*
+ * According to VirtIO standard, "Field reserved MUST contain zeroes.
+ * It is defined to make the structure to match the layout of
+ * virtio_net_rss_config structure, defined in 5.1.6.5.7.".
+ *
+ * Therefore, we need to zero the fields in
+ * struct virtio_net_rss_config, which corresponds to the
+ * `reserved` field in struct virtio_net_hash_config.
+ *
+ * Note that all other fields are zeroed at their definitions,
+ * except for the `indirection_table` field, where the actual data
+ * is stored in the `table` variable to ensure compatibility
+ * with RSS case. Therefore, we need to zero the `table` variable here.
+ */
+table[0] = 0;
+
+/*
+ * Considering that virtio_net_handle_rss() currently does not restore
+ * the hash key length parsed from the CVQ command sent from the guest
+ * into n->rss_data and uses the maximum key length in other code, so
+ * we also employ the maximum key length here.
+ */
+cfg.hash_key_length = sizeof(n->rss_data.key);
+
+const struct iovec data[] = {
+{
+.iov_base = &cfg,
+.iov_len = offsetof(struct virtio_net_rss_config,
+indirection_table),
+}, {
+.iov_base = table,
+.iov_len = n->rss_data.indirections_len *
+   sizeof(n->rss_data.indirections_table[0]),
+}, {
+.iov_base = &cfg.max_tx_vq,
+.iov_len = offsetof(struct virtio_net_rss_config, hash_key_data) -
+   offsetof(struct virtio_net_rss_config, max_tx_vq),
+}, {
+.iov_base = (void *)n->rss_data.key,
+.iov_len = sizeof(n->rss_data.key),
+}
+};
+
+r = vhost_vdpa_net_load_cmd(s, out_cursor, in_cursor,
+VIRTIO_NET_CTRL_MQ,
+VIRTIO_NET_CTRL_MQ_HASH_CONFIG,
+data, ARRAY_SIZE(data));
+if (unlikely(r < 0)) {
+return r;
+}
+
+return 0;
+}
+
 static int vhost_vdpa_net_load_mq(VhostVDPAState *s,
   const VirtIONet *n,
   struct iovec *out_cursor,
@@ -843,6 +925,15 @@ static int vhost_vdpa_net_load_mq(VhostVDPAState *s,
 return r;
 }
 
+if (!virtio_vdev_has_feature(&n->parent_obj, VIRTIO_NET_F_HASH_REPORT)) {
+return 0;
+}
+
+r = vhost_vdpa_net_load_rss(s, n, out_cursor, in_cursor);
+if (unlikely(r < 0)) {
+return r;
+}
+
 return 0;
 }
 
-- 
2.25.1




[PATCH v4 0/2] Vhost-vdpa Shadow Virtqueue Hash calculation Support

2023-10-24 Thread Hawkins Jiawei
This series enables the shadowed CVQ to intercept the
VIRTIO_NET_CTRL_MQ_HASH_CONFIG command, updates the virtio NIC
device model so QEMU sends it in a migration, and restores that
hash calculation state in the destination.

ChangeLog
=
v4:
  - fix some typos pointed out by Michael
  - zero the `cfg` fields at the definition suggested by Michael

v3: https://lore.kernel.org/all/cover.1697902949.git.yin31...@gmail.com/
  - remove the `do_rss` argument in vhost_vdpa_net_load_rss()
  - zero reserved fields in "cfg" manually instead of using memset()
to prevent compiler "array-bounds" warning

v2: https://lore.kernel.org/all/cover.1693297766.git.yin31...@gmail.com/
  - resolve conflict with updated patch
"vdpa: Send all CVQ state load commands in parallel", move the
`table` declaration at the beginning of the vhost_vdpa_net_load_rss()
in patch
"vdpa: Restore hash calculation state"

RFC: https://lore.kernel.org/all/cover.1691762906.git.yin31...@gmail.com/#t

TestStep
=
1. test the migration using vp-vdpa device
  - For L0 guest, boot QEMU with two virtio-net-pci net devices with
`ctrl_vq`, `mq`, `hash` features on, command line like:
-netdev tap,...
-device virtio-net-pci,disable-legacy=on,disable-modern=off,
iommu_platform=on,mq=on,ctrl_vq=on,hash=on,guest_announce=off,
indirect_desc=off,queue_reset=off,guest_uso4=off,guest_uso6=off,
host_uso=off,...

  - For L1 guest, apply the relevant patch series, compile the
source code, and start QEMU with two vdpa devices in SVQ mode with
the `ctrl_vq`, `mq`, `hash` features enabled, command line like:
  -netdev type=vhost-vdpa,x-svq=true,...
  -device virtio-net-pci,mq=on,guest_announce=off,ctrl_vq=on,
hash=on,...

  - For L2 source guest, run the following bash command:
```bash
#!/bin/sh

ethtool -K eth0 rxhash on
```
  - Attach gdb to the destination VM and set a breakpoint at
vhost_vdpa_net_load_rss()

  - Execute the live migration in L2 source monitor

  - Result
* with this series, gdb hits the breakpoint and execution continues
without triggering any error or warning.

Hawkins Jiawei (2):
  vdpa: Restore hash calculation state
  vdpa: Allow VIRTIO_NET_F_HASH_REPORT in SVQ

 net/vhost-vdpa.c | 92 
 1 file changed, 92 insertions(+)

-- 
2.25.1




[PATCH v4 2/2] vdpa: Allow VIRTIO_NET_F_HASH_REPORT in SVQ

2023-10-24 Thread Hawkins Jiawei
Enable SVQ with the VIRTIO_NET_F_HASH_REPORT feature.

Signed-off-by: Hawkins Jiawei 
---
v4:
  - no changes

v3: 
https://lore.kernel.org/all/c3b69f0a65600722c1e4d3aa14d53a71e8ffb888.1697902949.git.yin31...@gmail.com/
  - no code changes

v2: 
https://lore.kernel.org/all/a67d4abc2c8c5c7636addc729daa5432fa8193bd.1693297766.git.yin31...@gmail.com/

 net/vhost-vdpa.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index e59d40b8ae..54f748d49d 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -121,6 +121,7 @@ static const uint64_t vdpa_svq_device_features =
 BIT_ULL(VIRTIO_NET_F_CTRL_MAC_ADDR) |
 /* VHOST_F_LOG_ALL is exposed by SVQ */
 BIT_ULL(VHOST_F_LOG_ALL) |
+BIT_ULL(VIRTIO_NET_F_HASH_REPORT) |
 BIT_ULL(VIRTIO_NET_F_RSC_EXT) |
 BIT_ULL(VIRTIO_NET_F_STANDBY) |
 BIT_ULL(VIRTIO_NET_F_SPEED_DUPLEX);
-- 
2.25.1




Re: [PATCH v3 1/2] vdpa: Restore hash calculation state

2023-10-22 Thread Hawkins Jiawei
On 2023/10/22 18:00, Michael S. Tsirkin wrote:
> On Sun, Oct 22, 2023 at 10:00:48AM +0800, Hawkins Jiawei wrote:
>> This patch introduces vhost_vdpa_net_load_rss() to restore
>> the hash calculation state at device's startup.
>>
>> Signed-off-by: Hawkins Jiawei 
>> ---
>> v3:
>>- remove the `do_rss` argument in vhost_vdpa_net_load_rss()
>>- zero reserved fields in "cfg" manually instead of using memset()
>> to prevent compiler "array-bounds" warning
>>
>> v2: 
>> https://lore.kernel.org/all/f5ffad10699001107022851e0560cb394039d6b0.1693297766.git.yin31...@gmail.com/
>>- resolve conflict with updated patch
>> "vdpa: Send all CVQ state load commands in parallel"
>>- move the `table` declaration at the beginning of the
>> vhost_vdpa_net_load_rss()
>>
>> RFC: 
>> https://lore.kernel.org/all/a54ca70b12ebe2f3c391864e41241697ab1aba30.1691762906.git.yin31...@gmail.com/
>>
>>   net/vhost-vdpa.c | 89 
>>   1 file changed, 89 insertions(+)
>>
>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>> index 4b7c3b81b8..2e4bad65b4 100644
>> --- a/net/vhost-vdpa.c
>> +++ b/net/vhost-vdpa.c
>> @@ -817,6 +817,86 @@ static int vhost_vdpa_net_load_mac(VhostVDPAState *s, 
>> const VirtIONet *n,
>>   return 0;
>>   }
>>
>> +static int vhost_vdpa_net_load_rss(VhostVDPAState *s, const VirtIONet *n,
>> +   struct iovec *out_cursor,
>> +   struct iovec *in_cursor)
>> +{
>> +struct virtio_net_rss_config cfg;
>> +ssize_t r;
>> +g_autofree uint16_t *table = NULL;
>> +
>> +/*
>> + * According to VirtIO standard, "Initially the device has all hash
>> + * types disabled and reports only VIRTIO_NET_HASH_REPORT_NONE.".
>> + *
>> + * Therefore, there is no need to send this CVQ command if the
>> + * driver disable the all hash types, which aligns with
>
> disables? or disabled

It should be "disables".
I will correct this in the v4 patch.

>
>> + * the device's defaults.
>> + *
>> + * Note that the device's defaults can mismatch the driver's
>> + * configuration only at live migration.
>> + */
>> +if (!n->rss_data.enabled ||
>> +n->rss_data.hash_types == VIRTIO_NET_HASH_REPORT_NONE) {
>> +return 0;
>> +}
>> +
>> +table = g_malloc_n(n->rss_data.indirections_len,
>> +   sizeof(n->rss_data.indirections_table[0]));
>> +cfg.hash_types = cpu_to_le32(n->rss_data.hash_types);
>> +
>> +/*
>> + * According to VirtIO standard, "Field reserved MUST contain zeroes.
>> + * It is defined to make the structure to match the layout of
>> + * virtio_net_rss_config structure, defined in 5.1.6.5.7.".
>> + *
>> + * Therefore, we need to zero the fields in struct 
>> virtio_net_rss_config,
>> + * which corresponds the `reserved` field in
>
> corresponds to

I will correct this in the v4 patch.

>
>> + * struct virtio_net_hash_config.
>> + */
>> +cfg.indirection_table_mask = 0;
>> +cfg.unclassified_queue = 0;
>> +table[0] = 0; /* the actual indirection table for cfg */
>> +cfg.max_tx_vq = 0;
>
> Wouldn't it be easier to just do cfg = {} where it is defined?

Normally, it should follow your pattern, but there are two reasons
why I'm doing it differently here.

Firstly, in the subsequent patchset, both hash calculation and RSS
will reuse vhost_vdpa_net_load_rss() to restore their state. Given
the similarity of their CVQ commands, if we only explicitly handle
the field assignments for the RSS case, while placing the hash
calculation field assignments at the definition site, it would
disperse the logic within the function, making it look odd.

Secondly, to ensure compatibility for the RSS case, we cannot use
the `indirection_table` field in the cfg. Instead, we need to
allocate a separate `table` variable here. Even if we initialize
the other fields of the hash calculation case at the definition
site, we still need to manually set `table` to 0 here. Hence, it
makes more sense to set everything together at this point.

But I am okay with it if you think it is better to place the field
assignments for the hash calculation case at the definition site.

>
>> +
>> +/*
>> + * Consider that virtio_net_handle_rss() currently does not restore the
>> + * hash key length parsed from the CVQ command sent from the guest into

[PATCH v4 1/3] vdpa: Add SetSteeringEBPF method for NetClientState

2023-10-21 Thread Hawkins Jiawei
At present, to enable the VIRTIO_NET_F_RSS feature, eBPF must
be loaded for the vhost backend.

Given that vhost-vdpa is one of the vhost backends, we need to
implement the SetSteeringEBPF method to support RSS for vhost-vdpa,
even though vhost-vdpa calculates the RSS hash in the hardware device
instead of in the kernel via eBPF.

Although this requires QEMU to be compiled with the `--enable-bpf`
option even if the vdpa device does not use eBPF to calculate the
RSS hash, this avoids adding vDPA-specific conditional statements
to enable the VIRTIO_NET_F_RSS feature, which improves code
maintainability.

Suggested-by: Eugenio Pérez 
Signed-off-by: Hawkins Jiawei 
---
v4: no code changes

v3: 
https://lore.kernel.org/all/30509e3c3b07bcadd95d5932aeb16820cb022902.1693299194.git.yin31...@gmail.com/

 net/vhost-vdpa.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 4c65c53fd2..c4b89f5119 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -241,6 +241,12 @@ static void vhost_vdpa_cleanup(NetClientState *nc)
 }
 }
 
+/** Dummy SetSteeringEBPF to support RSS for vhost-vdpa backend  */
+static bool vhost_vdpa_set_steering_ebpf(NetClientState *nc, int prog_fd)
+{
+return true;
+}
+
 static bool vhost_vdpa_has_vnet_hdr(NetClientState *nc)
 {
 assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
@@ -422,6 +428,7 @@ static NetClientInfo net_vhost_vdpa_info = {
 .has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
 .has_ufo = vhost_vdpa_has_ufo,
 .check_peer_type = vhost_vdpa_check_peer_type,
+.set_steering_ebpf = vhost_vdpa_set_steering_ebpf,
 };
 
 static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index,
@@ -1255,6 +1262,7 @@ static NetClientInfo net_vhost_vdpa_cvq_info = {
 .has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
 .has_ufo = vhost_vdpa_has_ufo,
 .check_peer_type = vhost_vdpa_check_peer_type,
+.set_steering_ebpf = vhost_vdpa_set_steering_ebpf,
 };
 
 /*
-- 
2.25.1




[PATCH v4 0/3] Vhost-vdpa Shadow Virtqueue RSS Support

2023-10-21 Thread Hawkins Jiawei
This series enables the shadowed CVQ to intercept the RSS command,
updates the virtio NIC device model so QEMU sends it in a
migration, and restores that RSS state in the destination.

Note that this series is based on the series
"Vhost-vdpa Shadow Virtqueue Hash calculation Support" at [1].

[1]. https://lore.kernel.org/all/cover.1697902949.git.yin31...@gmail.com/

ChangeLog
=
v4:
  - add do_rss argument and related code in vhost_vdpa_net_load_rss()

v3: https://lore.kernel.org/all/cover.1693299194.git.yin31...@gmail.com/
  - resolve conflict with updated patch
"Vhost-vdpa Shadow Virtqueue Hash calculation Support" in patch
"vdpa: Restore receive-side scaling state"

RFC v2: https://lore.kernel.org/all/cover.1691926415.git.yin31...@gmail.com/
  - Correct the feature usage to VIRTIO_NET_F_HASH_REPORT when
loading the hash calculation state in
patch "vdpa: Restore receive-side scaling state"

RFC v1: https://lore.kernel.org/all/cover.1691766252.git.yin31...@gmail.com/

TestStep
=
1. test the migration using vp-vdpa device
  - For L0 guest, boot QEMU with two virtio-net-pci net devices with
`in-qemu` RSS, command line like:
-netdev tap,vhost=off...
-device virtio-net-pci,disable-legacy=on,disable-modern=off,
iommu_platform=on,mq=on,ctrl_vq=on,hash=on,rss=on,guest_announce=off,
indirect_desc=off,queue_reset=off,guest_uso4=off,guest_uso6=off,
host_uso=off,...

  - For L1 guest, apply the relevant patch series, compile the
source code, and start QEMU with two vdpa devices in SVQ mode with
the `ctrl_vq`, `mq`, `rss` features enabled, command line like:
  -netdev type=vhost-vdpa,x-svq=true,...
  -device virtio-net-pci,mq=on,guest_announce=off,ctrl_vq=on,
rss=on,...

  - For L2 source guest, run the following bash command:
```bash
#!/bin/sh

ethtool -K eth0 rxhash on
```

  - Execute the live migration in L2 source monitor

  - Result
* with this series, L2 QEMU executes without triggering any
error or warning, and L0 QEMU prints
"Can't load eBPF RSS - fallback to software RSS".

Hawkins Jiawei (3):
  vdpa: Add SetSteeringEBPF method for NetClientState
  vdpa: Restore receive-side scaling state
  vdpa: Allow VIRTIO_NET_F_RSS in SVQ

 net/vhost-vdpa.c | 72 ++--
 1 file changed, 51 insertions(+), 21 deletions(-)

-- 
2.25.1




[PATCH v4 2/3] vdpa: Restore receive-side scaling state

2023-10-21 Thread Hawkins Jiawei
This patch reuses vhost_vdpa_net_load_rss() with some
refactoring to restore the receive-side scaling state
at the device's startup.

Signed-off-by: Hawkins Jiawei 
---
v4:
  - add do_rss argument and related code in vhost_vdpa_net_load_rss()

v3: 
https://lore.kernel.org/all/47b17e160ba4e55b24790b7d73b22d2b437ebe3c.1693299194.git.yin31...@gmail.com/
  - resolve conflict with updated patch
"Vhost-vdpa Shadow Virtqueue Hash calculation Support"

RFC v2: 
https://lore.kernel.org/all/af33aa80bc4ef0b2cec6c21b9448866c517fde80.1691926415.git.yin31...@gmail.com/
  - Correct the feature usage to VIRTIO_NET_F_HASH_REPORT when
loading the hash calculation state

RFC v1: 
https://lore.kernel.org/all/93d5d82f0a5df71df326830033e50358c8b6be7a.1691766252.git.yin31...@gmail.com/

 net/vhost-vdpa.c | 63 
 1 file changed, 42 insertions(+), 21 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index c4b89f5119..5de01aa851 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -827,7 +827,7 @@ static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet *n,
 
 static int vhost_vdpa_net_load_rss(VhostVDPAState *s, const VirtIONet *n,
struct iovec *out_cursor,
-   struct iovec *in_cursor)
+   struct iovec *in_cursor, bool do_rss)
 {
 struct virtio_net_rss_config cfg;
 ssize_t r;
@@ -853,19 +853,33 @@ static int vhost_vdpa_net_load_rss(VhostVDPAState *s, const VirtIONet *n,
sizeof(n->rss_data.indirections_table[0]));
 cfg.hash_types = cpu_to_le32(n->rss_data.hash_types);
 
-/*
- * According to VirtIO standard, "Field reserved MUST contain zeroes.
- * It is defined to make the structure to match the layout of
- * virtio_net_rss_config structure, defined in 5.1.6.5.7.".
- *
- * Therefore, we need to zero the fields in struct virtio_net_rss_config,
- * which corresponds the `reserved` field in
- * struct virtio_net_hash_config.
- */
-cfg.indirection_table_mask = 0;
-cfg.unclassified_queue = 0;
-table[0] = 0; /* the actual indirection table for cfg */
-cfg.max_tx_vq = 0;
+if (do_rss) {
+/*
+ * According to VirtIO standard, "Number of entries in 
indirection_table
+ * is (indirection_table_mask + 1)".
+ */
+cfg.indirection_table_mask = cpu_to_le16(n->rss_data.indirections_len -
+ 1);
+cfg.unclassified_queue = cpu_to_le16(n->rss_data.default_queue);
+for (int i = 0; i < n->rss_data.indirections_len; ++i) {
+table[i] = cpu_to_le16(n->rss_data.indirections_table[i]);
+}
+cfg.max_tx_vq = cpu_to_le16(n->curr_queue_pairs);
+} else {
+/*
+ * According to VirtIO standard, "Field reserved MUST contain zeroes.
+ * It is defined to make the structure to match the layout of
+ * virtio_net_rss_config structure, defined in 5.1.6.5.7.".
+ *
+ * Therefore, we need to zero the fields in
+ * struct virtio_net_rss_config, which corresponds to the `reserved`
+ * field in struct virtio_net_hash_config.
+ */
+cfg.indirection_table_mask = 0;
+cfg.unclassified_queue = 0;
+table[0] = 0; /* the actual indirection table for cfg */
+cfg.max_tx_vq = 0;
+}
 
 /*
  * Consider that virtio_net_handle_rss() currently does not restore the
@@ -896,6 +910,7 @@ static int vhost_vdpa_net_load_rss(VhostVDPAState *s, const VirtIONet *n,
 
 r = vhost_vdpa_net_load_cmd(s, out_cursor, in_cursor,
 VIRTIO_NET_CTRL_MQ,
+do_rss ? VIRTIO_NET_CTRL_MQ_RSS_CONFIG :
 VIRTIO_NET_CTRL_MQ_HASH_CONFIG,
 data, ARRAY_SIZE(data));
 if (unlikely(r < 0)) {
@@ -930,13 +945,19 @@ static int vhost_vdpa_net_load_mq(VhostVDPAState *s,
 return r;
 }
 
-if (!virtio_vdev_has_feature(&n->parent_obj, VIRTIO_NET_F_HASH_REPORT)) {
-return 0;
-}
-
-r = vhost_vdpa_net_load_rss(s, n, out_cursor, in_cursor);
-if (unlikely(r < 0)) {
-return r;
+if (virtio_vdev_has_feature(&n->parent_obj, VIRTIO_NET_F_RSS)) {
+/* load the receive-side scaling state */
+r = vhost_vdpa_net_load_rss(s, n, out_cursor, in_cursor, true);
+if (unlikely(r < 0)) {
+return r;
+}
+} else if (virtio_vdev_has_feature(&n->parent_obj,
+   VIRTIO_NET_F_HASH_REPORT)) {
+/* load the hash calculation state */
+r = vhost_vdpa_net_load_rss(s, n, out_cursor, in_cursor, false);
+if (unlikely(r < 0)) {
+return r;
+}
 }
 
 return 0;
-- 
2.25.1




[PATCH v4 3/3] vdpa: Allow VIRTIO_NET_F_RSS in SVQ

2023-10-21 Thread Hawkins Jiawei
Enable SVQ with the VIRTIO_NET_F_RSS feature.

Signed-off-by: Hawkins Jiawei 
---
v4: no code changes

v3: 
https://lore.kernel.org/all/2d2a378291bfac4144a0c0c473cf80415bb580b3.1693299194.git.yin31...@gmail.com/

 net/vhost-vdpa.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 5de01aa851..66133408d5 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -122,6 +122,7 @@ static const uint64_t vdpa_svq_device_features =
 /* VHOST_F_LOG_ALL is exposed by SVQ */
 BIT_ULL(VHOST_F_LOG_ALL) |
 BIT_ULL(VIRTIO_NET_F_HASH_REPORT) |
+BIT_ULL(VIRTIO_NET_F_RSS) |
 BIT_ULL(VIRTIO_NET_F_RSC_EXT) |
 BIT_ULL(VIRTIO_NET_F_STANDBY) |
 BIT_ULL(VIRTIO_NET_F_SPEED_DUPLEX);
-- 
2.25.1




[PATCH v3 1/2] vdpa: Restore hash calculation state

2023-10-21 Thread Hawkins Jiawei
This patch introduces vhost_vdpa_net_load_rss() to restore
the hash calculation state at the device's startup.

Signed-off-by: Hawkins Jiawei 
---
v3:
  - remove the `do_rss` argument in vhost_vdpa_net_load_rss()
  - zero reserved fields in "cfg" manually instead of using memset()
to prevent compiler "array-bounds" warning

v2: 
https://lore.kernel.org/all/f5ffad10699001107022851e0560cb394039d6b0.1693297766.git.yin31...@gmail.com/
  - resolve conflict with updated patch
"vdpa: Send all CVQ state load commands in parallel"
  - move the `table` declaration at the beginning of the
vhost_vdpa_net_load_rss()

RFC: 
https://lore.kernel.org/all/a54ca70b12ebe2f3c391864e41241697ab1aba30.1691762906.git.yin31...@gmail.com/

 net/vhost-vdpa.c | 89 
 1 file changed, 89 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 4b7c3b81b8..2e4bad65b4 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -817,6 +817,86 @@ static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet *n,
 return 0;
 }
 
+static int vhost_vdpa_net_load_rss(VhostVDPAState *s, const VirtIONet *n,
+   struct iovec *out_cursor,
+   struct iovec *in_cursor)
+{
+struct virtio_net_rss_config cfg;
+ssize_t r;
+g_autofree uint16_t *table = NULL;
+
+/*
+ * According to VirtIO standard, "Initially the device has all hash
+ * types disabled and reports only VIRTIO_NET_HASH_REPORT_NONE.".
+ *
+ * Therefore, there is no need to send this CVQ command if the
+ * driver disable the all hash types, which aligns with
+ * the device's defaults.
+ *
+ * Note that the device's defaults can mismatch the driver's
+ * configuration only at live migration.
+ */
+if (!n->rss_data.enabled ||
+n->rss_data.hash_types == VIRTIO_NET_HASH_REPORT_NONE) {
+return 0;
+}
+
+table = g_malloc_n(n->rss_data.indirections_len,
+   sizeof(n->rss_data.indirections_table[0]));
+cfg.hash_types = cpu_to_le32(n->rss_data.hash_types);
+
+/*
+ * According to VirtIO standard, "Field reserved MUST contain zeroes.
+ * It is defined to make the structure to match the layout of
+ * virtio_net_rss_config structure, defined in 5.1.6.5.7.".
+ *
+ * Therefore, we need to zero the fields in struct virtio_net_rss_config,
+ * which corresponds the `reserved` field in
+ * struct virtio_net_hash_config.
+ */
+cfg.indirection_table_mask = 0;
+cfg.unclassified_queue = 0;
+table[0] = 0; /* the actual indirection table for cfg */
+cfg.max_tx_vq = 0;
+
+/*
+ * Consider that virtio_net_handle_rss() currently does not restore the
+ * hash key length parsed from the CVQ command sent from the guest into
+ * n->rss_data and uses the maximum key length in other code, so we also
+ * employ the maximum key length here.
+ */
+cfg.hash_key_length = sizeof(n->rss_data.key);
+
+const struct iovec data[] = {
+{
+.iov_base = &cfg,
+.iov_len = offsetof(struct virtio_net_rss_config,
+indirection_table),
+}, {
+.iov_base = table,
+.iov_len = n->rss_data.indirections_len *
+   sizeof(n->rss_data.indirections_table[0]),
+}, {
+.iov_base = &cfg.max_tx_vq,
+.iov_len = offsetof(struct virtio_net_rss_config, hash_key_data) -
+   offsetof(struct virtio_net_rss_config, max_tx_vq),
+}, {
+.iov_base = (void *)n->rss_data.key,
+.iov_len = sizeof(n->rss_data.key),
+}
+};
+
+r = vhost_vdpa_net_load_cmd(s, out_cursor, in_cursor,
+VIRTIO_NET_CTRL_MQ,
+VIRTIO_NET_CTRL_MQ_HASH_CONFIG,
+data, ARRAY_SIZE(data));
+if (unlikely(r < 0)) {
+return r;
+}
+
+return 0;
+}
+
 static int vhost_vdpa_net_load_mq(VhostVDPAState *s,
   const VirtIONet *n,
   struct iovec *out_cursor,
@@ -842,6 +922,15 @@ static int vhost_vdpa_net_load_mq(VhostVDPAState *s,
 return r;
 }
 
+if (!virtio_vdev_has_feature(&n->parent_obj, VIRTIO_NET_F_HASH_REPORT)) {
+return 0;
+}
+
+r = vhost_vdpa_net_load_rss(s, n, out_cursor, in_cursor);
+if (unlikely(r < 0)) {
+return r;
+}
+
 return 0;
 }
 
-- 
2.25.1




[PATCH v3 0/2] Vhost-vdpa Shadow Virtqueue Hash calculation Support

2023-10-21 Thread Hawkins Jiawei
This series enables the shadowed CVQ to intercept the
VIRTIO_NET_CTRL_MQ_HASH_CONFIG command, updates the virtio NIC
device model so QEMU sends it in a migration, and restores that
hash calculation state in the destination.

ChangeLog
=
v3:
  - remove the `do_rss` argument in vhost_vdpa_net_load_rss()
  - zero reserved fields in "cfg" manually instead of using memset()
to prevent compiler "array-bounds" warning

v2: https://lore.kernel.org/all/cover.1693297766.git.yin31...@gmail.com/
  - resolve conflict with updated patch
"vdpa: Send all CVQ state load commands in parallel", move the
`table` declaration at the beginning of the vhost_vdpa_net_load_rss()
in patch
"vdpa: Restore hash calculation state"

RFC: https://lore.kernel.org/all/cover.1691762906.git.yin31...@gmail.com/#t

TestStep
=
1. test the migration using vp-vdpa device
  - For L0 guest, boot QEMU with two virtio-net-pci net devices with
`ctrl_vq`, `mq`, `hash` features on, command line like:
-netdev tap,...
-device virtio-net-pci,disable-legacy=on,disable-modern=off,
iommu_platform=on,mq=on,ctrl_vq=on,hash=on,guest_announce=off,
indirect_desc=off,queue_reset=off,guest_uso4=off,guest_uso6=off,
host_uso=off,...

  - For L1 guest, apply the relevant patch series, compile the
source code, and start QEMU with two vdpa devices in SVQ mode with
the `ctrl_vq`, `mq`, `hash` features enabled, command line like:
  -netdev type=vhost-vdpa,x-svq=true,...
  -device virtio-net-pci,mq=on,guest_announce=off,ctrl_vq=on,
hash=on,...

  - For L2 source guest, run the following bash command:
```bash
#!/bin/sh

ethtool -K eth0 rxhash on
```
  - Attach gdb to the destination VM and set a breakpoint at
vhost_vdpa_net_load_rss()

  - Execute the live migration in L2 source monitor

  - Result
* with this series, gdb hits the breakpoint and execution continues
without triggering any error or warning.

Hawkins Jiawei (2):
  vdpa: Restore hash calculation state
  vdpa: Allow VIRTIO_NET_F_HASH_REPORT in SVQ

 net/vhost-vdpa.c | 90 
 1 file changed, 90 insertions(+)

-- 
2.25.1




[PATCH v3 2/2] vdpa: Allow VIRTIO_NET_F_HASH_REPORT in SVQ

2023-10-21 Thread Hawkins Jiawei
Enable SVQ with the VIRTIO_NET_F_HASH_REPORT feature.

Signed-off-by: Hawkins Jiawei 
---
v3: no code changes

v2: 
https://lore.kernel.org/all/a67d4abc2c8c5c7636addc729daa5432fa8193bd.1693297766.git.yin31...@gmail.com/

 net/vhost-vdpa.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 2e4bad65b4..4c65c53fd2 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -121,6 +121,7 @@ static const uint64_t vdpa_svq_device_features =
 BIT_ULL(VIRTIO_NET_F_CTRL_MAC_ADDR) |
 /* VHOST_F_LOG_ALL is exposed by SVQ */
 BIT_ULL(VHOST_F_LOG_ALL) |
+BIT_ULL(VIRTIO_NET_F_HASH_REPORT) |
 BIT_ULL(VIRTIO_NET_F_RSC_EXT) |
 BIT_ULL(VIRTIO_NET_F_STANDBY) |
 BIT_ULL(VIRTIO_NET_F_SPEED_DUPLEX);
-- 
2.25.1




Re: [PULL 08/83] vdpa: Restore hash calculation state

2023-10-19 Thread Hawkins Jiawei
On 2023/10/20 02:07, Michael S. Tsirkin wrote:
> On Thu, Oct 19, 2023 at 09:32:28AM -0700, Stefan Hajnoczi wrote:
>> On Wed, 18 Oct 2023 at 08:56, Michael S. Tsirkin  wrote:
>>>
>>> From: Hawkins Jiawei 
>>>
>>> This patch introduces vhost_vdpa_net_load_rss() to restore
>>> the hash calculation state at device's startup.
>>>
>>> Note that vhost_vdpa_net_load_rss() has `do_rss` argument,
>>> which allows future code to reuse this function to restore
>>> the receive-side scaling state when the VIRTIO_NET_F_RSS
>>> feature is enabled in SVQ. Currently, vhost_vdpa_net_load_rss()
>>> could only be invoked when `do_rss` is set to false.
>>>
>>> Signed-off-by: Hawkins Jiawei 
>>> Message-Id: 
>>> 
>>> Reviewed-by: Michael S. Tsirkin 
>>> Signed-off-by: Michael S. Tsirkin 
>>> ---
>>>   net/vhost-vdpa.c | 91 
>>>   1 file changed, 91 insertions(+)
>>>
>>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>>> index 4b7c3b81b8..40d0bcbc0b 100644
>>> --- a/net/vhost-vdpa.c
>>> +++ b/net/vhost-vdpa.c
>>> @@ -817,6 +817,88 @@ static int vhost_vdpa_net_load_mac(VhostVDPAState *s, 
>>> const VirtIONet *n,
>>>   return 0;
>>>   }
>>>
>>> +static int vhost_vdpa_net_load_rss(VhostVDPAState *s, const VirtIONet *n,
>>> +   struct iovec *out_cursor,
>>> +   struct iovec *in_cursor, bool do_rss)
>>> +{
>>> +struct virtio_net_rss_config cfg;
>>> +ssize_t r;
>>> +g_autofree uint16_t *table = NULL;
>>> +
>>> +/*
>>> + * According to VirtIO standard, "Initially the device has all hash
>>> + * types disabled and reports only VIRTIO_NET_HASH_REPORT_NONE.".
>>> + *
>>> + * Therefore, there is no need to send this CVQ command if the
>>> + * driver disable the all hash types, which aligns with
>>> + * the device's defaults.
>>> + *
>>> + * Note that the device's defaults can mismatch the driver's
>>> + * configuration only at live migration.
>>> + */
>>> +if (!n->rss_data.enabled ||
>>> +n->rss_data.hash_types == VIRTIO_NET_HASH_REPORT_NONE) {
>>> +return 0;
>>> +}
>>> +
>>> +cfg.hash_types = cpu_to_le32(n->rss_data.hash_types);
>>> +
>>> +/*
>>> + * According to VirtIO standard, "Field reserved MUST contain zeroes.
>>> + * It is defined to make the structure to match the layout of
>>> + * virtio_net_rss_config structure, defined in 5.1.6.5.7.".
>>> + *
>>> + * Therefore, we need to zero the fields in struct 
>>> virtio_net_rss_config,
>>> + * which corresponds the `reserved` field in
>>> + * struct virtio_net_hash_config.
>>> + */
>> +memset(&cfg.indirection_table_mask, 0,
>>> +   sizeof_field(struct virtio_net_hash_config, reserved));
>>
>> Please take a look at the following CI failure:
>>
>> In file included from /usr/include/string.h:495,
>> from 
>> /home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/include/qemu/osdep.h:116,
>> from ../net/vhost-vdpa.c:12:
>> In function ‘memset’,
>> inlined from ‘vhost_vdpa_net_load_rss’ at ../net/vhost-vdpa.c:874:9:
>> /usr/include/s390x-linux-gnu/bits/string_fortified.h:71:10: error:
>> ‘__builtin_memset’ offset [7, 12] from the object at ‘cfg’ is out of
>> the bounds of referenced subobject ‘indirection_table_mask’ with type
>> ‘short unsigned int’ at offset 4 [-Werror=array-bounds]
>> 71 | return __builtin___memset_chk (__dest, __ch, __len, __bos0 (__dest));
>> | ^
>> cc1: all warnings being treated as errors
>>
>> https://gitlab.com/qemu-project/qemu/-/jobs/5329820077
>
> Hmm yes - the trick it's trying to implement is this:
>
>
> struct virtio_net_rss_config {
>  uint32_t hash_types;
>  uint16_t indirection_table_mask;
>  uint16_t unclassified_queue;
>  uint16_t indirection_table[1/* + indirection_table_mask */];
>  uint16_t max_tx_vq;
>  uint8_t hash_key_length;
>  uint8_t hash_key_data[/* hash_key_length */];
> };
>
>
> ...
>
> struct virtio_net_hash_config {
>  uint32_t hash_types;
>  /* for compatibility with virtio_net_rss_config */
>  uint16_t reserved[4];
>  uint8_t hash_key_length;
>  uint8_t hash_key_data[/* hash_key_length */];
> };
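
To make the CI failure concrete, here is a minimal standalone sketch
(with stand-in struct names, not the real virtio headers) of why a
fortified memset spanning member boundaries trips -Warray-bounds, and
the zero-at-definition alternative adopted in v4:

```c
#include <stdint.h>
#include <string.h>

struct hash_config {                /* stand-in for virtio_net_hash_config */
    uint32_t hash_types;
    uint16_t reserved[4];
};

struct rss_config {                 /* stand-in for virtio_net_rss_config */
    uint32_t hash_types;
    uint16_t indirection_table_mask;
    uint16_t unclassified_queue;
    uint16_t max_tx_vq;             /* these three overlay `reserved` above */
};

void zero_reserved_by_memset(struct rss_config *cfg)
{
    /*
     * Zeroing sizeof(reserved) bytes starting at one uint16_t member
     * writes past that member's bounds, so GCC's fortified memset
     * reports -Warray-bounds even though the enclosing struct has room.
     */
    memset(&cfg->indirection_table_mask, 0,
           sizeof(((struct hash_config *)0)->reserved));
}

void zero_at_definition(void)
{
    struct rss_config cfg = {};     /* warning-free: zero the whole struct */
    (void)cfg;
}
```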

[PATCH v5 2/7] vdpa: Avoid using vhost_vdpa_net_load_*() outside vhost_vdpa_net_load()

2023-10-13 Thread Hawkins Jiawei
Next patches in this series will refactor vhost_vdpa_net_load_cmd()
to iterate through the control commands shadow buffers, allowing QEMU
to send CVQ state load commands in parallel at device startup.

Considering that QEMU always forwards CVQ commands in a serialized
manner outside of vhost_vdpa_net_load(), it is more elegant to send
the CVQ commands directly without invoking the vhost_vdpa_net_load_*()
helpers.

Signed-off-by: Hawkins Jiawei 
Acked-by: Eugenio Pérez 
---
v5:
  - remove redundant initialization statement suggested by Eugenio
  - remove assertion suggested by Eugenio

v4: 
https://lore.kernel.org/all/a56d91c3cc2ab46f9be1770074c920f5375ddb5e.1693287885.git.yin31...@gmail.com/
  - pack CVQ command by iov_from_buf() instead of accessing
`out` directly suggested by Eugenio

v3: 
https://lore.kernel.org/all/428a8fac2a29b37757fa15ca747be93c0226cb1f.1689748694.git.yin31...@gmail.com/

 net/vhost-vdpa.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 618758596a..86b8d31244 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -1114,12 +1114,14 @@ static NetClientInfo net_vhost_vdpa_cvq_info = {
  */
 static int vhost_vdpa_net_excessive_mac_filter_cvq_add(VhostVDPAState *s,
VirtQueueElement *elem,
-   struct iovec *out)
+   struct iovec *out,
+   const struct iovec *in)
 {
 struct virtio_net_ctrl_mac mac_data, *mac_ptr;
 struct virtio_net_ctrl_hdr *hdr_ptr;
 uint32_t cursor;
 ssize_t r;
+uint8_t on = 1;
 
 /* parse the non-multicast MAC address entries from CVQ command */
 cursor = sizeof(*hdr_ptr);
@@ -1167,7 +1169,13 @@ static int vhost_vdpa_net_excessive_mac_filter_cvq_add(VhostVDPAState *s,
  * filter table to the vdpa device, it should send the
  * VIRTIO_NET_CTRL_RX_PROMISC CVQ command to enable promiscuous mode
  */
-r = vhost_vdpa_net_load_rx_mode(s, VIRTIO_NET_CTRL_RX_PROMISC, 1);
+hdr_ptr = out->iov_base;
+out->iov_len = sizeof(*hdr_ptr) + sizeof(on);
+
+hdr_ptr->class = VIRTIO_NET_CTRL_RX;
+hdr_ptr->cmd = VIRTIO_NET_CTRL_RX_PROMISC;
+iov_from_buf(out, 1, sizeof(*hdr_ptr), &on, sizeof(on));
+r = vhost_vdpa_net_cvq_add(s, out, 1, in, 1);
 if (unlikely(r < 0)) {
 return r;
 }
@@ -1285,7 +1293,7 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
  * the CVQ command directly.
  */
 dev_written = vhost_vdpa_net_excessive_mac_filter_cvq_add(s, elem,
-  &out);
+  &out, &vdpa_in);
 if (unlikely(dev_written < 0)) {
 goto out;
 }
-- 
2.25.1




[PATCH v5 0/7] vdpa: Send all CVQ state load commands in parallel

2023-10-13 Thread Hawkins Jiawei
One can apply an additional
patch to record the load time in microseconds as follows:
```diff
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 6b958d6363..501b510fd2 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -295,7 +295,10 @@ static int vhost_net_start_one(struct vhost_net *net,
 }
 
 if (net->nc->info->load) {
+int64_t start_us = g_get_monotonic_time();
 r = net->nc->info->load(net->nc);
+error_report("vhost_vdpa_net_load() = %ld us",
+ g_get_monotonic_time() - start_us);
 if (r < 0) {
 goto fail;
 }
```

  - For L1 guest, compile the code and start QEMU with two vdpa devices
in SVQ mode with the `ctrl_vq`, `ctrl_vlan` features enabled,
command line like:
  -netdev type=vhost-vdpa,x-svq=true,...
  -device virtio-net-pci,mq=on,guest_announce=off,ctrl_vq=on,
ctrl_vlan=on...

  - For L2 source guest, run the following bash command:
```bash
#!/bin/sh

for idx in {1..4094}
do
  ip link add link eth0 name vlan$idx type vlan id $idx
done
```

  - execute the live migration in L2 source monitor

  - Result
* with this series, QEMU should not trigger any warning
or error except something like "vhost_vdpa_net_load() = 13543 us"
* without this series, QEMU should not trigger any warning
or error except something like "vhost_vdpa_net_load() = 20848 us"

Hawkins Jiawei (7):
  vdpa: Use iovec for vhost_vdpa_net_cvq_add()
  vdpa: Avoid using vhost_vdpa_net_load_*() outside
vhost_vdpa_net_load()
  vdpa: Check device ack in vhost_vdpa_net_load_rx_mode()
  vdpa: Move vhost_svq_poll() to the caller of vhost_vdpa_net_cvq_add()
  vdpa: Introduce cursors to vhost_vdpa_net_loadx()
  vhost: Expose vhost_svq_available_slots()
  vdpa: Send cvq state load commands in parallel

 hw/virtio/vhost-shadow-virtqueue.c |   2 +-
 hw/virtio/vhost-shadow-virtqueue.h |   1 +
 net/vhost-vdpa.c   | 374 +++--
 3 files changed, 244 insertions(+), 133 deletions(-)

-- 
2.25.1




[PATCH v5 3/7] vdpa: Check device ack in vhost_vdpa_net_load_rx_mode()

2023-10-13 Thread Hawkins Jiawei
Considering that vhost_vdpa_net_load_rx_mode() is only called
within vhost_vdpa_net_load_rx() now, this patch refactors
vhost_vdpa_net_load_rx_mode() to include a check for the
device's ack, simplifying the code and improving its maintainability.

Signed-off-by: Hawkins Jiawei 
Acked-by: Eugenio Pérez 
---
v5:
  - no change

v4: 
https://lore.kernel.org/all/be0e39e2c76e1ef39a76839b4a4ce90c8e54a98e.1693287885.git.yin31...@gmail.com/

 net/vhost-vdpa.c | 76 
 1 file changed, 31 insertions(+), 45 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 86b8d31244..36a4e57c0d 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -827,14 +827,24 @@ static int vhost_vdpa_net_load_rx_mode(VhostVDPAState *s,
 .iov_base = &on,
 .iov_len = sizeof(on),
 };
-return vhost_vdpa_net_load_cmd(s, VIRTIO_NET_CTRL_RX,
-   cmd, &data, 1);
+ssize_t dev_written;
+
+dev_written = vhost_vdpa_net_load_cmd(s, VIRTIO_NET_CTRL_RX,
+  cmd, &data, 1);
+if (unlikely(dev_written < 0)) {
+return dev_written;
+}
+if (*s->status != VIRTIO_NET_OK) {
+return -EIO;
+}
+
+return 0;
 }
 
 static int vhost_vdpa_net_load_rx(VhostVDPAState *s,
   const VirtIONet *n)
 {
-ssize_t dev_written;
+ssize_t r;
 
 if (!virtio_vdev_has_feature(>parent_obj, VIRTIO_NET_F_CTRL_RX)) {
 return 0;
@@ -859,13 +869,9 @@ static int vhost_vdpa_net_load_rx(VhostVDPAState *s,
  * configuration only at live migration.
  */
 if (!n->mac_table.uni_overflow && !n->promisc) {
-dev_written = vhost_vdpa_net_load_rx_mode(s,
-VIRTIO_NET_CTRL_RX_PROMISC, 0);
-if (unlikely(dev_written < 0)) {
-return dev_written;
-}
-if (*s->status != VIRTIO_NET_OK) {
-return -EIO;
+r = vhost_vdpa_net_load_rx_mode(s, VIRTIO_NET_CTRL_RX_PROMISC, 0);
+if (unlikely(r < 0)) {
+return r;
 }
 }
 
@@ -887,13 +893,9 @@ static int vhost_vdpa_net_load_rx(VhostVDPAState *s,
  * configuration only at live migration.
  */
 if (n->mac_table.multi_overflow || n->allmulti) {
-dev_written = vhost_vdpa_net_load_rx_mode(s,
-VIRTIO_NET_CTRL_RX_ALLMULTI, 1);
-if (unlikely(dev_written < 0)) {
-return dev_written;
-}
-if (*s->status != VIRTIO_NET_OK) {
-return -EIO;
+r = vhost_vdpa_net_load_rx_mode(s, VIRTIO_NET_CTRL_RX_ALLMULTI, 1);
+if (unlikely(r < 0)) {
+return r;
 }
 }
 
@@ -912,13 +914,9 @@ static int vhost_vdpa_net_load_rx(VhostVDPAState *s,
  * configuration only at live migration.
  */
 if (n->alluni) {
-dev_written = vhost_vdpa_net_load_rx_mode(s,
-VIRTIO_NET_CTRL_RX_ALLUNI, 1);
-if (dev_written < 0) {
-return dev_written;
-}
-if (*s->status != VIRTIO_NET_OK) {
-return -EIO;
+r = vhost_vdpa_net_load_rx_mode(s, VIRTIO_NET_CTRL_RX_ALLUNI, 1);
+if (r < 0) {
+return r;
 }
 }
 
@@ -933,13 +931,9 @@ static int vhost_vdpa_net_load_rx(VhostVDPAState *s,
  * configuration only at live migration.
  */
 if (n->nomulti) {
-dev_written = vhost_vdpa_net_load_rx_mode(s,
-VIRTIO_NET_CTRL_RX_NOMULTI, 1);
-if (dev_written < 0) {
-return dev_written;
-}
-if (*s->status != VIRTIO_NET_OK) {
-return -EIO;
+r = vhost_vdpa_net_load_rx_mode(s, VIRTIO_NET_CTRL_RX_NOMULTI, 1);
+if (r < 0) {
+return r;
 }
 }
 
@@ -954,13 +948,9 @@ static int vhost_vdpa_net_load_rx(VhostVDPAState *s,
  * configuration only at live migration.
  */
 if (n->nouni) {
-dev_written = vhost_vdpa_net_load_rx_mode(s,
-VIRTIO_NET_CTRL_RX_NOUNI, 1);
-if (dev_written < 0) {
-return dev_written;
-}
-if (*s->status != VIRTIO_NET_OK) {
-return -EIO;
+r = vhost_vdpa_net_load_rx_mode(s, VIRTIO_NET_CTRL_RX_NOUNI, 1);
+if (r < 0) {
+return r;
 }
 }
 
@@ -975,13 +965,9 @@ static int vhost_vdpa_net_load_rx(VhostVDPAState *s,
  * configuration only at live migration.
  */
 if (n->nobcast) {
-dev_written = vhost_vdpa_net_load_rx_mode(s,
-VIRTIO_NET_CTRL_RX_NOBCAST, 1);
-if (dev_written < 0) {
-return dev_written;
-}
-if (*s->status != VIRTIO_NET_OK) {
-return -EIO;
+r = vhost_vdpa_net_load_rx_mode(s, VIRTIO_NET_CTRL_RX_NOBCAST, 1);
+if (r < 0) {
+return r;
 }
 }

[PATCH v5 1/7] vdpa: Use iovec for vhost_vdpa_net_cvq_add()

2023-10-13 Thread Hawkins Jiawei
Next patches in this series will no longer perform an
immediate poll and check of the device's used buffers
for each CVQ state load command. Consequently, there
will be multiple pending buffers in the shadow VirtQueue,
making it a must for every control command to have its
own buffer.

To achieve this, this patch refactors vhost_vdpa_net_cvq_add()
to accept `struct iovec`, which eliminates the coupling of
control commands to `s->cvq_cmd_out_buffer` and `s->status`,
allowing them to use their own buffers.

Signed-off-by: Hawkins Jiawei 
Acked-by: Eugenio Pérez 
---
v5:
  - no change

v4: 
https://lore.kernel.org/all/5e090c2af922192f5897ba7072df4d9e4754e1e0.1693287885.git.yin31...@gmail.com/
  - split `in` into `vdpa_in` and `model_in` instead of reusing `in`
in vhost_vdpa_net_handle_ctrl_avail() suggested by Eugenio

v3: 
https://lore.kernel.org/all/b1d473772ec4bcb254ab0d12430c9b1efe758606.1689748694.git.yin31...@gmail.com/

 net/vhost-vdpa.c | 39 ++-
 1 file changed, 22 insertions(+), 17 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 939c984d5b..618758596a 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -618,22 +618,14 @@ static void vhost_vdpa_net_cvq_stop(NetClientState *nc)
 vhost_vdpa_net_client_stop(nc);
 }
 
-static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s, size_t out_len,
-  size_t in_len)
+static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s,
+const struct iovec *out_sg, size_t out_num,
+const struct iovec *in_sg, size_t in_num)
 {
-/* Buffers for the device */
-const struct iovec out = {
-.iov_base = s->cvq_cmd_out_buffer,
-.iov_len = out_len,
-};
-const struct iovec in = {
-.iov_base = s->status,
-.iov_len = sizeof(virtio_net_ctrl_ack),
-};
 VhostShadowVirtqueue *svq = g_ptr_array_index(s->vhost_vdpa.shadow_vqs, 0);
 int r;
 
-r = vhost_svq_add(svq, &out, 1, &in, 1, NULL);
+r = vhost_svq_add(svq, out_sg, out_num, in_sg, in_num, NULL);
 if (unlikely(r != 0)) {
 if (unlikely(r == -ENOSPC)) {
 qemu_log_mask(LOG_GUEST_ERROR, "%s: No space on device queue\n",
@@ -659,6 +651,15 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, uint8_t class,
 .cmd = cmd,
 };
 size_t data_size = iov_size(data_sg, data_num);
+/* Buffers for the device */
+const struct iovec out = {
+.iov_base = s->cvq_cmd_out_buffer,
+.iov_len = sizeof(ctrl) + data_size,
+};
+const struct iovec in = {
+.iov_base = s->status,
+.iov_len = sizeof(*s->status),
+};
 
 assert(data_size < vhost_vdpa_net_cvq_cmd_page_len() - sizeof(ctrl));
 
@@ -669,8 +670,7 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, uint8_t class,
 iov_to_buf(data_sg, data_num, 0,
s->cvq_cmd_out_buffer + sizeof(ctrl), data_size);
 
-return vhost_vdpa_net_cvq_add(s, data_size + sizeof(ctrl),
-  sizeof(virtio_net_ctrl_ack));
+return vhost_vdpa_net_cvq_add(s, &out, 1, &in, 1);
 }
 
 static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet *n)
@@ -1248,10 +1248,15 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
 .iov_base = s->cvq_cmd_out_buffer,
 };
 /* in buffer used for device model */
-const struct iovec in = {
+const struct iovec model_in = {
 .iov_base = &status,
 .iov_len = sizeof(status),
 };
+/* in buffer used for vdpa device */
+const struct iovec vdpa_in = {
+.iov_base = s->status,
+.iov_len = sizeof(*s->status),
+};
 ssize_t dev_written = -EINVAL;
 
 out.iov_len = iov_to_buf(elem->out_sg, elem->out_num, 0,
@@ -1285,7 +1290,7 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
 goto out;
 }
 } else {
-dev_written = vhost_vdpa_net_cvq_add(s, out.iov_len, sizeof(status));
+dev_written = vhost_vdpa_net_cvq_add(s, &out, 1, &vdpa_in, 1);
 if (unlikely(dev_written < 0)) {
 goto out;
 }
@@ -1301,7 +1306,7 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
 }
 
 status = VIRTIO_NET_ERR;
-virtio_net_handle_ctrl_iov(svq->vdev, &in, 1, &out, 1);
+virtio_net_handle_ctrl_iov(svq->vdev, &model_in, 1, &out, 1);
 if (status != VIRTIO_NET_OK) {
 error_report("Bad CVQ processing in model");
 }
-- 
2.25.1




[PATCH v5 6/7] vhost: Expose vhost_svq_available_slots()

2023-10-13 Thread Hawkins Jiawei
Next patches in this series will delay the polling
and checking of buffers until either the SVQ is
full or the control commands' shadow buffers are full,
instead of performing an immediate poll and check of
the device's used buffers for each CVQ state load command.

To achieve this, this patch exposes
vhost_svq_available_slots(), allowing QEMU to know
whether the SVQ is full.

Signed-off-by: Hawkins Jiawei 
Acked-by: Eugenio Pérez 
---
v5:
  - inline the vhost_svq_available_slots() in the caller and remove the
helper function from this patch suggested by Eugenio

v4: 
https://lore.kernel.org/all/13b3a36cc33c443a47525957ea38e80594d90595.1693287885.git.yin31...@gmail.com/

 hw/virtio/vhost-shadow-virtqueue.c | 2 +-
 hw/virtio/vhost-shadow-virtqueue.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index e731b1d2ea..fc5f408f77 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -66,7 +66,7 @@ bool vhost_svq_valid_features(uint64_t features, Error **errp)
  *
  * @svq: The svq
  */
-static uint16_t vhost_svq_available_slots(const VhostShadowVirtqueue *svq)
+uint16_t vhost_svq_available_slots(const VhostShadowVirtqueue *svq)
 {
 return svq->num_free;
 }
diff --git a/hw/virtio/vhost-shadow-virtqueue.h 
b/hw/virtio/vhost-shadow-virtqueue.h
index 5bce67837b..19c842a15b 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -114,6 +114,7 @@ typedef struct VhostShadowVirtqueue {
 
 bool vhost_svq_valid_features(uint64_t features, Error **errp);
 
+uint16_t vhost_svq_available_slots(const VhostShadowVirtqueue *svq);
 void vhost_svq_push_elem(VhostShadowVirtqueue *svq,
  const VirtQueueElement *elem, uint32_t len);
 int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
-- 
2.25.1




[PATCH v5 5/7] vdpa: Introduce cursors to vhost_vdpa_net_loadx()

2023-10-13 Thread Hawkins Jiawei
This patch introduces two new arguments, `out_cursor`
and `in_cursor`, to vhost_vdpa_net_loadx(). Additionally,
it includes a helper function,
vhost_vdpa_net_load_cursor_reset(), for resetting these
cursors.

Furthermore, this patch refactors vhost_vdpa_net_load_cmd()
so that it prepares buffers for the device using the
cursor arguments, instead of directly accessing the
`s->cvq_cmd_out_buffer` and `s->status` fields.

By making these changes, the next patches in this series
can refactor vhost_vdpa_net_load_cmd() directly to
iterate through the control commands' shadow buffers,
allowing QEMU to send CVQ state load commands in parallel
at device startup.
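
As an illustrative sketch (not part of the patch), a caller resets the
cursors once and then threads them through each load helper:

```c
struct iovec out_cursor, in_cursor;
int r;

/* point both cursors at the start of the shadow buffers */
vhost_vdpa_net_load_cursor_reset(s, &out_cursor, &in_cursor);

/* each packed command consumes a slice of the shadow buffers */
r = vhost_vdpa_net_load_mac(s, n, &out_cursor, &in_cursor);
```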

Signed-off-by: Hawkins Jiawei 
---
v5:
  - move iov_copy() call before vhost_vdpa_net_cvq_add()
and add comments for iov_copy() to improve readability
  - fix conflicts with master branch

v4: 
https://lore.kernel.org/all/0e2af3ed5695a8044877911df791417fe0ba87af.1693287885.git.yin31...@gmail.com/
  - use `struct iovec` instead of `void **` as cursor
suggested by Eugenio
  - add vhost_vdpa_net_load_cursor_reset() helper function
to reset the cursors
  - refactor vhost_vdpa_net_load_cmd() to prepare buffers
by iov_copy() instead of accessing `in` and `out` directly
suggested by Eugenio

v3: 
https://lore.kernel.org/all/bf390934673f2b613359ea9d7ac6c89199c31384.1689748694.git.yin31...@gmail.com/

 net/vhost-vdpa.c | 111 ---
 1 file changed, 75 insertions(+), 36 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index ea73e3c410..ef4d242811 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -648,7 +648,22 @@ static ssize_t vhost_vdpa_net_svq_poll(VhostVDPAState *s, 
size_t cmds_in_flight)
 return vhost_svq_poll(svq, cmds_in_flight);
 }
 
-static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, uint8_t class,
+static void vhost_vdpa_net_load_cursor_reset(VhostVDPAState *s,
+ struct iovec *out_cursor,
+ struct iovec *in_cursor)
+{
+/* reset the cursor of the output buffer for the device */
+out_cursor->iov_base = s->cvq_cmd_out_buffer;
+out_cursor->iov_len = vhost_vdpa_net_cvq_cmd_page_len();
+
+/* reset the cursor of the in buffer for the device */
+in_cursor->iov_base = s->status;
+in_cursor->iov_len = vhost_vdpa_net_cvq_cmd_page_len();
+}
+
+static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s,
+   struct iovec *out_cursor,
+   struct iovec *in_cursor, uint8_t class,
uint8_t cmd, const struct iovec *data_sg,
size_t data_num)
 {
@@ -657,25 +672,21 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, 
uint8_t class,
 .cmd = cmd,
 };
 size_t data_size = iov_size(data_sg, data_num);
-/* Buffers for the device */
-const struct iovec out = {
-.iov_base = s->cvq_cmd_out_buffer,
-.iov_len = sizeof(ctrl) + data_size,
-};
-const struct iovec in = {
-.iov_base = s->status,
-.iov_len = sizeof(*s->status),
-};
+struct iovec out, in;
 ssize_t r;
 
 assert(data_size < vhost_vdpa_net_cvq_cmd_page_len() - sizeof(ctrl));
 
 /* pack the CVQ command header */
-memcpy(s->cvq_cmd_out_buffer, &ctrl, sizeof(ctrl));
-
+iov_from_buf(out_cursor, 1, 0, &ctrl, sizeof(ctrl));
 /* pack the CVQ command command-specific-data */
 iov_to_buf(data_sg, data_num, 0,
-   s->cvq_cmd_out_buffer + sizeof(ctrl), data_size);
+   out_cursor->iov_base + sizeof(ctrl), data_size);
+
+/* extract the required buffer from the cursor for output */
+iov_copy(&out, 1, out_cursor, 1, 0, sizeof(ctrl) + data_size);
+/* extract the required buffer from the cursor for input */
+iov_copy(&in, 1, in_cursor, 1, 0, sizeof(*s->status));
 
 r = vhost_vdpa_net_cvq_add(s, &out, 1, &in, 1);
 if (unlikely(r < 0)) {
@@ -689,14 +700,17 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, 
uint8_t class,
 return vhost_vdpa_net_svq_poll(s, 1);
 }
 
-static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet *n)
+static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet *n,
+   struct iovec *out_cursor,
+   struct iovec *in_cursor)
 {
 if (virtio_vdev_has_feature(>parent_obj, VIRTIO_NET_F_CTRL_MAC_ADDR)) {
 const struct iovec data = {
 .iov_base = (void *)n->mac,
 .iov_len = sizeof(n->mac),
 };
-ssize_t dev_written = vhost_vdpa_net_load_cmd(s, VIRTIO_NET_CTRL_MAC,
+ssize_t dev_written = vhost_vdpa_net_load_cmd(s, out_cursor, in_cursor,
+  VIRTIO_NET_CTRL_MAC,
   

[PATCH v5 7/7] vdpa: Send cvq state load commands in parallel

2023-10-13 Thread Hawkins Jiawei
This patch enables sending CVQ state load commands
in parallel at device startup through the following steps:

  * Refactor vhost_vdpa_net_load_cmd() to iterate through
the control commands shadow buffers. This allows different
CVQ state load commands to use their own unique buffers.

  * Delay the polling and checking of buffers until either
the SVQ is full or control commands shadow buffers are full.
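
For illustration of the second step above, the number of commands still
awaiting an ack can be recovered from the in cursor by pointer
arithmetic, since the cursor always points one slot past the last status
byte handed to the device:

```c
/* pending acks == distance of in_cursor from the start of s->status */
size_t cmds_in_flight = in_cursor->iov_base - (void *)s->status;
ssize_t r = vhost_vdpa_net_svq_flush(s, cmds_in_flight);
```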

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1578
Signed-off-by: Hawkins Jiawei 
Acked-by: Eugenio Pérez 
---
v5:
  - remove the assertion suggested by Eugenio
  - inline the vhost_svq_available_slots() in vhost_vdpa_net_load_cmd()
suggested by Eugenio
  - fix conflicts with master branch

v4: 
https://lore.kernel.org/all/f25fea0b0aed78bad2dd5744a4cc5538243672e6.1693287885.git.yin31...@gmail.com/
  - refactor argument `cmds_in_flight` to `len` for
vhost_vdpa_net_svq_full()
  - check the return value of vhost_vdpa_net_svq_poll()
in vhost_vdpa_net_svq_flush() suggested by Eugenio
  - use iov_size(), vhost_vdpa_net_load_cursor_reset()
and iov_discard_front() to update the cursors instead of
accessing it directly according to Eugenio

 net/vhost-vdpa.c | 165 +--
 1 file changed, 102 insertions(+), 63 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index ef4d242811..4b7c3b81b8 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -661,6 +661,31 @@ static void 
vhost_vdpa_net_load_cursor_reset(VhostVDPAState *s,
 in_cursor->iov_len = vhost_vdpa_net_cvq_cmd_page_len();
 }
 
+/*
+ * Poll SVQ for multiple pending control commands and check the device's ack.
+ *
+ * Caller should hold the BQL when invoking this function.
+ *
+ * @s: The VhostVDPAState
+ * @len: The length of the pending status shadow buffer
+ */
+static ssize_t vhost_vdpa_net_svq_flush(VhostVDPAState *s, size_t len)
+{
+/* device uses a one-byte length ack for each control command */
+ssize_t dev_written = vhost_vdpa_net_svq_poll(s, len);
+if (unlikely(dev_written != len)) {
+return -EIO;
+}
+
+/* check the device's ack */
+for (int i = 0; i < len; ++i) {
+if (s->status[i] != VIRTIO_NET_OK) {
+return -EIO;
+}
+}
+return 0;
+}
+
 static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s,
struct iovec *out_cursor,
struct iovec *in_cursor, uint8_t class,
@@ -671,11 +696,31 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s,
 .class = class,
 .cmd = cmd,
 };
-size_t data_size = iov_size(data_sg, data_num);
+size_t data_size = iov_size(data_sg, data_num), cmd_size;
 struct iovec out, in;
 ssize_t r;
+unsigned dummy_cursor_iov_cnt;
+VhostShadowVirtqueue *svq = g_ptr_array_index(s->vhost_vdpa.shadow_vqs, 0);
 
 assert(data_size < vhost_vdpa_net_cvq_cmd_page_len() - sizeof(ctrl));
+cmd_size = sizeof(ctrl) + data_size;
+if (vhost_svq_available_slots(svq) < 2 ||
+iov_size(out_cursor, 1) < cmd_size) {
+/*
+ * It is time to flush all pending control commands if SVQ is full
+ * or control commands shadow buffers are full.
+ *
+ * We can poll here since we've had BQL from the time
+ * we sent the descriptor.
+ */
+r = vhost_vdpa_net_svq_flush(s, in_cursor->iov_base -
+ (void *)s->status);
+if (unlikely(r < 0)) {
+return r;
+}
+
+vhost_vdpa_net_load_cursor_reset(s, out_cursor, in_cursor);
+}
 
 /* pack the CVQ command header */
 iov_from_buf(out_cursor, 1, 0, &ctrl, sizeof(ctrl));
@@ -684,7 +729,7 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s,
out_cursor->iov_base + sizeof(ctrl), data_size);
 
 /* extract the required buffer from the cursor for output */
-iov_copy(&out, 1, out_cursor, 1, 0, sizeof(ctrl) + data_size);
+iov_copy(&out, 1, out_cursor, 1, 0, cmd_size);
 /* extract the required buffer from the cursor for input */
 iov_copy(&in, 1, in_cursor, 1, 0, sizeof(*s->status));
 
@@ -693,11 +738,13 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s,
 return r;
 }
 
-/*
- * We can poll here since we've had BQL from the time
- * we sent the descriptor.
- */
-return vhost_vdpa_net_svq_poll(s, 1);
+/* iterate the cursors */
+dummy_cursor_iov_cnt = 1;
+iov_discard_front(&out_cursor, &dummy_cursor_iov_cnt, cmd_size);
+dummy_cursor_iov_cnt = 1;
+iov_discard_front(&in_cursor, &dummy_cursor_iov_cnt, sizeof(*s->status));
+
+return 0;
 }
 
 static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet *n,
@@ -709,15 +756,12 @@ static int vhost_vdpa_net_load_mac(VhostVDPAState *s, 
const VirtIONet *n,
 .iov_base = (void *)n->mac,
 .iov_len = sizeof(n->mac),
 };
-ssize_t dev_wr

[PATCH v5 4/7] vdpa: Move vhost_svq_poll() to the caller of vhost_vdpa_net_cvq_add()

2023-10-13 Thread Hawkins Jiawei
This patch moves vhost_svq_poll() to the caller of
vhost_vdpa_net_cvq_add() and introduces a helper function.

By making this change, the next patches in this series are
able to refactor vhost_vdpa_net_load_x() to delay
the polling and checking process until either the SVQ
is full or the control commands' shadow buffers are full.
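
As a minimal sketch (illustration only), callers now enqueue and poll in
two separate steps:

```c
ssize_t r = vhost_vdpa_net_cvq_add(s, &out, 1, &in, 1);
if (unlikely(r < 0)) {
    return r;
}
/* polling is safe: the BQL has been held since the descriptor was sent */
return vhost_vdpa_net_svq_poll(s, 1);
```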

Signed-off-by: Hawkins Jiawei 
---
v5:
  - no change

v4: 
https://lore.kernel.org/all/496c542c22ae1b4222175d5576c949621c7c2fc0.1693287885.git.yin31...@gmail.com/
  - always check the return value of vhost_vdpa_net_svq_poll()
suggested by Eugenio

v3: 
https://lore.kernel.org/all/152177c4e7082236fba9d31d535e40f8c2984349.1689748694.git.yin31...@gmail.com/

 net/vhost-vdpa.c | 53 +++-
 1 file changed, 43 insertions(+), 10 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 36a4e57c0d..ea73e3c410 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -631,15 +631,21 @@ static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s,
 qemu_log_mask(LOG_GUEST_ERROR, "%s: No space on device queue\n",
   __func__);
 }
-return r;
 }
 
-/*
- * We can poll here since we've had BQL from the time we sent the
- * descriptor. Also, we need to take the answer before SVQ pulls by itself,
- * when BQL is released
- */
-return vhost_svq_poll(svq, 1);
+return r;
+}
+
+/*
+ * Convenience wrapper to poll SVQ for multiple control commands.
+ *
+ * Caller should hold the BQL when invoking this function, and should take
+ * the answer before SVQ pulls by itself when BQL is released.
+ */
+static ssize_t vhost_vdpa_net_svq_poll(VhostVDPAState *s, size_t cmds_in_flight)
+{
+VhostShadowVirtqueue *svq = g_ptr_array_index(s->vhost_vdpa.shadow_vqs, 0);
+return vhost_svq_poll(svq, cmds_in_flight);
 }
 
 static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, uint8_t class,
@@ -660,6 +666,7 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, 
uint8_t class,
 .iov_base = s->status,
 .iov_len = sizeof(*s->status),
 };
+ssize_t r;
 
 assert(data_size < vhost_vdpa_net_cvq_cmd_page_len() - sizeof(ctrl));
 
@@ -670,7 +677,16 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, 
uint8_t class,
 iov_to_buf(data_sg, data_num, 0,
s->cvq_cmd_out_buffer + sizeof(ctrl), data_size);
 
-return vhost_vdpa_net_cvq_add(s, &out, 1, &in, 1);
+r = vhost_vdpa_net_cvq_add(s, &out, 1, &in, 1);
+if (unlikely(r < 0)) {
+return r;
+}
+
+/*
+ * We can poll here since we've had BQL from the time
+ * we sent the descriptor.
+ */
+return vhost_vdpa_net_svq_poll(s, 1);
 }
 
 static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet *n)
@@ -1165,6 +1181,15 @@ static int 
vhost_vdpa_net_excessive_mac_filter_cvq_add(VhostVDPAState *s,
 if (unlikely(r < 0)) {
 return r;
 }
+
+/*
+ * We can poll here since we've had BQL from the time
+ * we sent the descriptor.
+ */
+r = vhost_vdpa_net_svq_poll(s, 1);
+if (unlikely(r < sizeof(*s->status))) {
+return r;
+}
 if (*s->status != VIRTIO_NET_OK) {
 return sizeof(*s->status);
 }
@@ -1284,10 +1309,18 @@ static int 
vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
 goto out;
 }
 } else {
-dev_written = vhost_vdpa_net_cvq_add(s, &out, 1, &vdpa_in, 1);
-if (unlikely(dev_written < 0)) {
+ssize_t r;
+r = vhost_vdpa_net_cvq_add(s, &out, 1, &vdpa_in, 1);
+if (unlikely(r < 0)) {
+dev_written = r;
 goto out;
 }
+
+/*
+ * We can poll here since we've had BQL from the time
+ * we sent the descriptor.
+ */
+dev_written = vhost_vdpa_net_svq_poll(s, 1);
 }
 
 if (unlikely(dev_written < sizeof(status))) {
-- 
2.25.1




Re: [PATCH v4 8/8] vdpa: Send cvq state load commands in parallel

2023-10-07 Thread Hawkins Jiawei
On 2023/10/4 15:33, Eugenio Perez Martin wrote:
> On Tue, Aug 29, 2023 at 7:55 AM Hawkins Jiawei  wrote:
>>
>> This patch enables sending CVQ state load commands
>> in parallel at device startup by following steps:
>>
>>* Refactor vhost_vdpa_net_load_cmd() to iterate through
>> the control commands shadow buffers. This allows different
>> CVQ state load commands to use their own unique buffers.
>>
>>* Delay the polling and checking of buffers until either
>> the SVQ is full or control commands shadow buffers are full.
>>
>> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1578
>> Signed-off-by: Hawkins Jiawei 
>> ---
>> v4:
>>- refactor argument `cmds_in_flight` to `len` for
>> vhost_vdpa_net_svq_full()
>>- check the return value of vhost_vdpa_net_svq_poll()
>> in vhost_vdpa_net_svq_flush() suggested by Eugenio
>>- use iov_size(), vhost_vdpa_net_load_cursor_reset()
>> and iov_discard_front() to update the cursors instead of
>> accessing it directly according to Eugenio
>>
>> v3: 
>> https://lore.kernel.org/all/3a002790e6c880af928c6470ecbf03e7c65a68bb.1689748694.git.yin31...@gmail.com/
>>
>>   net/vhost-vdpa.c | 155 +--
>>   1 file changed, 97 insertions(+), 58 deletions(-)
>>
>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>> index a71e8c9090..818464b702 100644
>> --- a/net/vhost-vdpa.c
>> +++ b/net/vhost-vdpa.c
>> @@ -646,6 +646,31 @@ static void 
>> vhost_vdpa_net_load_cursor_reset(VhostVDPAState *s,
>>   in_cursor->iov_len = vhost_vdpa_net_cvq_cmd_page_len();
>>   }
>>
>> +/*
>> + * Poll SVQ for multiple pending control commands and check the device's 
>> ack.
>> + *
>> + * Caller should hold the BQL when invoking this function.
>> + *
>> + * @s: The VhostVDPAState
>> + * @len: The length of the pending status shadow buffer
>> + */
>> +static ssize_t vhost_vdpa_net_svq_flush(VhostVDPAState *s, size_t len)
>> +{
>> +/* Device uses a one-byte length ack for each control command */
>> +ssize_t dev_written = vhost_vdpa_net_svq_poll(s, len);
>> +if (unlikely(dev_written != len)) {
>> +return -EIO;
>> +}
>> +
>> +/* check the device's ack */
>> +for (int i = 0; i < len; ++i) {
>> +if (s->status[i] != VIRTIO_NET_OK) {
>> +return -EIO;
>> +}
>> +}
>> +return 0;
>> +}
>> +
>>   static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s,
>>  struct iovec *out_cursor,
>>  struct iovec *in_cursor, uint8_t 
>> class,
>> @@ -660,10 +685,30 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState 
>> *s,
>>  cmd_size = sizeof(ctrl) + data_size;
>>   struct iovec out, in;
>>   ssize_t r;
>> +unsigned dummy_cursor_iov_cnt;
>>
>>   assert(data_size < vhost_vdpa_net_cvq_cmd_page_len() - sizeof(ctrl));
>> +if (vhost_vdpa_net_svq_available_slots(s) < 2 ||
>> +iov_size(out_cursor, 1) < cmd_size) {
>> +/*
>> + * It is time to flush all pending control commands if SVQ is full
>> + * or control commands shadow buffers are full.
>> + *
>> + * We can poll here since we've had BQL from the time
>> + * we sent the descriptor.
>> + */
>> +r = vhost_vdpa_net_svq_flush(s, in_cursor->iov_base -
>> + (void *)s->status);
>> +if (unlikely(r < 0)) {
>> +return r;
>> +}
>> +
>> +vhost_vdpa_net_load_cursor_reset(s, out_cursor, in_cursor);
>> +}
>> +
>
> It would be great to merge this flush with the one at
> vhost_vdpa_net_load. We would need to return ENOSPC or similar and
> handle it there.
>
> But it would make it more difficult to iterate through the loading of
> the different parameters, so I think it can be done on top.
>

Hi Eugenio,

Please forgive my poor English; do you mean the approach in my
patch is acceptable to you?

>>   /* Each CVQ command has one out descriptor and one in descriptor */
>>   assert(vhost_vdpa_net_svq_available_slots(s) >= 2);
>> +assert(iov_size(out_cursor, 1) >= cmd_size);
>>
>
> Same here, I think we can avoid the assertion, right?

You are right, I will remove this assertion.

Thanks!


>
> Apart from that,
>
> Acked-by:

Re: [PATCH v4 7/8] vdpa: Introduce cursors to vhost_vdpa_net_loadx()

2023-10-07 Thread Hawkins Jiawei
On 2023/10/4 15:21, Eugenio Perez Martin wrote:
> On Tue, Aug 29, 2023 at 7:55 AM Hawkins Jiawei  wrote:
>>
>> This patch introduces two new arugments, `out_cursor`
>> and `in_cursor`, to vhost_vdpa_net_loadx(). Addtionally,
>> it includes a helper function
>> vhost_vdpa_net_load_cursor_reset() for resetting these
>> cursors.
>>
>> Furthermore, this patch refactors vhost_vdpa_net_load_cmd()
>> so that vhost_vdpa_net_load_cmd() prepares buffers
>> for the device using the cursors arguments, instead
>> of directly accesses `s->cvq_cmd_out_buffer` and
>> `s->status` fields.
>>
>> By making these change, next patches in this series
>> can refactor vhost_vdpa_net_load_cmd() directly to
>> iterate through the control commands shadow buffers,
>> allowing QEMU to send CVQ state load commands in parallel
>> at device startup.
>>
>> Signed-off-by: Hawkins Jiawei 
>> ---
>> v4:
>>- use `struct iovec` instead of `void **` as cursor
>> suggested by Eugenio
>>- add vhost_vdpa_net_load_cursor_reset() helper function
>> to reset the cursors
>>- refactor vhost_vdpa_net_load_cmd() to prepare buffers
>> by iov_copy() instead of accessing `in` and `out` directly
>> suggested by Eugenio
>>
>> v3: 
>> https://lore.kernel.org/all/bf390934673f2b613359ea9d7ac6c89199c31384.1689748694.git.yin31...@gmail.com/
>>
>>   net/vhost-vdpa.c | 114 ---
>>   1 file changed, 77 insertions(+), 37 deletions(-)
>>
>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>> index d9b8b3cf6c..a71e8c9090 100644
>> --- a/net/vhost-vdpa.c
>> +++ b/net/vhost-vdpa.c
>> @@ -633,7 +633,22 @@ static uint16_t 
>> vhost_vdpa_net_svq_available_slots(VhostVDPAState *s)
>>   return vhost_svq_available_slots(svq);
>>   }
>>
>> -static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, uint8_t class,
>> +static void vhost_vdpa_net_load_cursor_reset(VhostVDPAState *s,
>> + struct iovec *out_cursor,
>> + struct iovec *in_cursor)
>> +{
>> +/* reset the cursor of the output buffer for the device */
>> +out_cursor->iov_base = s->cvq_cmd_out_buffer;
>> +out_cursor->iov_len = vhost_vdpa_net_cvq_cmd_page_len();
>> +
>> +/* reset the cursor of the in buffer for the device */
>> +in_cursor->iov_base = s->status;
>> +in_cursor->iov_len = vhost_vdpa_net_cvq_cmd_page_len();
>> +}
>> +
>> +static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s,
>> +   struct iovec *out_cursor,
>> +   struct iovec *in_cursor, uint8_t 
>> class,
>>  uint8_t cmd, const struct iovec 
>> *data_sg,
>>  size_t data_num)
>>   {
>> @@ -641,28 +656,25 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState 
>> *s, uint8_t class,
>>   .class = class,
>>   .cmd = cmd,
>>   };
>> -size_t data_size = iov_size(data_sg, data_num);
>> -/* Buffers for the device */
>> -const struct iovec out = {
>> -.iov_base = s->cvq_cmd_out_buffer,
>> -.iov_len = sizeof(ctrl) + data_size,
>> -};
>> -const struct iovec in = {
>> -.iov_base = s->status,
>> -.iov_len = sizeof(*s->status),
>> -};
>> +size_t data_size = iov_size(data_sg, data_num),
>> +   cmd_size = sizeof(ctrl) + data_size;
>> +struct iovec out, in;
>>   ssize_t r;
>>
>>   assert(data_size < vhost_vdpa_net_cvq_cmd_page_len() - sizeof(ctrl));
>>   /* Each CVQ command has one out descriptor and one in descriptor */
>>   assert(vhost_vdpa_net_svq_available_slots(s) >= 2);
>>
>> -/* pack the CVQ command header */
>> -memcpy(s->cvq_cmd_out_buffer, &ctrl, sizeof(ctrl));
>> +/* Prepare the buffer for out descriptor for the device */
>> +iov_copy(&out, 1, out_cursor, 1, 0, cmd_size);
>
> I may be missing something here, but cmd_size should be the bytes
> available in "out", so we don't overrun it.
>
>> +/* Prepare the buffer for in descriptor for the device. */
>> +iov_copy(&in, 1, in_cursor, 1, 0, sizeof(*s->status));
>>
>
> Same here, although it is impossible for the moment to overrun it as
> all cmds only return one byte.
>

Here we just use iov_copy() to initialize the `o

Re: [PATCH v4 4/8] vdpa: Avoid using vhost_vdpa_net_load_*() outside vhost_vdpa_net_load()

2023-10-07 Thread Hawkins Jiawei
On 2023/10/4 01:48, Eugenio Perez Martin wrote:
> On Tue, Aug 29, 2023 at 7:55 AM Hawkins Jiawei  wrote:
>>
>> Next patches in this series will refactor vhost_vdpa_net_load_cmd()
>> to iterate through the control commands shadow buffers, allowing QEMU
>> to send CVQ state load commands in parallel at device startup.
>>
>> Considering that QEMU always forwards the CVQ command serialized
>> outside of vhost_vdpa_net_load(), it is more elegant to send the
>> CVQ commands directly without invoking vhost_vdpa_net_load_*() helpers.
>>
>> Signed-off-by: Hawkins Jiawei 
>> ---
>> v4:
>>- pack CVQ command by iov_from_buf() instead of accessing
>> `out` directly suggested by Eugenio
>>
>> v3: 
>> https://lore.kernel.org/all/428a8fac2a29b37757fa15ca747be93c0226cb1f.1689748694.git.yin31...@gmail.com/
>>
>>   net/vhost-vdpa.c | 16 +---
>>   1 file changed, 13 insertions(+), 3 deletions(-)
>>
>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>> index e6342b213f..7c67063469 100644
>> --- a/net/vhost-vdpa.c
>> +++ b/net/vhost-vdpa.c
>> @@ -1097,12 +1097,14 @@ static NetClientInfo net_vhost_vdpa_cvq_info = {
>>*/
>>   static int vhost_vdpa_net_excessive_mac_filter_cvq_add(VhostVDPAState *s,
>>  VirtQueueElement 
>> *elem,
>> -   struct iovec *out)
>> +   struct iovec *out,
>> +   const struct iovec 
>> *in)
>>   {
>>   struct virtio_net_ctrl_mac mac_data, *mac_ptr;
>>   struct virtio_net_ctrl_hdr *hdr_ptr;
>>   uint32_t cursor;
>>   ssize_t r;
>> +uint8_t on = 1;
>>
>>   /* parse the non-multicast MAC address entries from CVQ command */
>>   cursor = sizeof(*hdr_ptr);
>> @@ -1150,7 +1152,15 @@ static int 
>> vhost_vdpa_net_excessive_mac_filter_cvq_add(VhostVDPAState *s,
>>* filter table to the vdpa device, it should send the
>>* VIRTIO_NET_CTRL_RX_PROMISC CVQ command to enable promiscuous mode
>>*/
>> -r = vhost_vdpa_net_load_rx_mode(s, VIRTIO_NET_CTRL_RX_PROMISC, 1);
>> +cursor = 0;
>
> I think this is redundant, as "cursor" is not used by the new code and
> it is already set to 0 before its usage, isn't it?
>

You are right, I will remove this code in the v5 patch.

>> +hdr_ptr = out->iov_base;
>> +out->iov_len = sizeof(*hdr_ptr) + sizeof(on);
>> +assert(out->iov_len < vhost_vdpa_net_cvq_cmd_page_len());
>> +
>
> I think we can assume this assertion is never hit, as out->iov_len is
> controlled by this function.
>

Thanks for your suggestion, I will remove this assertion.

Thanks!


> Apart from these nits,
>
> Acked-by: Eugenio Pérez 
>
>> +hdr_ptr->class = VIRTIO_NET_CTRL_RX;
>> +hdr_ptr->cmd = VIRTIO_NET_CTRL_RX_PROMISC;
>> +iov_from_buf(out, 1, sizeof(*hdr_ptr), &on, sizeof(on));
>> +r = vhost_vdpa_net_cvq_add(s, out, 1, in, 1);
>>   if (unlikely(r < 0)) {
>>   return r;
>>   }
>> @@ -1268,7 +1278,7 @@ static int 
>> vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
>>* the CVQ command direclty.
>>*/
>>   dev_written = vhost_vdpa_net_excessive_mac_filter_cvq_add(s, elem,
>> -  &out);
>> +  &out, &vdpa_in);
>>   if (unlikely(dev_written < 0)) {
>>   goto out;
>>   }
>> --
>> 2.25.1
>>
>



Re: [PATCH v4 3/8] vhost: Expose vhost_svq_available_slots()

2023-10-07 Thread Hawkins Jiawei
On 2023/10/4 01:44, Eugenio Perez Martin wrote:
> On Tue, Aug 29, 2023 at 7:55 AM Hawkins Jiawei  wrote:
>>
>> Next patches in this series will delay the polling
>> and checking of buffers until either the SVQ is
>> full or control commands shadow buffers are full,
>> no longer perform an immediate poll and check of
>> the device's used buffers for each CVQ state load command.
>>
>> To achieve this, this patch exposes
>> vhost_svq_available_slots() and introduces a helper function,
>> allowing QEMU to know whether the SVQ is full.
>>
>> Signed-off-by: Hawkins Jiawei 
>> Acked-by: Eugenio Pérez 
>> ---
>>   hw/virtio/vhost-shadow-virtqueue.c | 2 +-
>>   hw/virtio/vhost-shadow-virtqueue.h | 1 +
>>   net/vhost-vdpa.c   | 9 +
>>   3 files changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/virtio/vhost-shadow-virtqueue.c 
>> b/hw/virtio/vhost-shadow-virtqueue.c
>> index e731b1d2ea..fc5f408f77 100644
>> --- a/hw/virtio/vhost-shadow-virtqueue.c
>> +++ b/hw/virtio/vhost-shadow-virtqueue.c
>> @@ -66,7 +66,7 @@ bool vhost_svq_valid_features(uint64_t features, Error 
>> **errp)
>>*
>>* @svq: The svq
>>*/
>> -static uint16_t vhost_svq_available_slots(const VhostShadowVirtqueue *svq)
>> +uint16_t vhost_svq_available_slots(const VhostShadowVirtqueue *svq)
>>   {
>>   return svq->num_free;
>>   }
>> diff --git a/hw/virtio/vhost-shadow-virtqueue.h 
>> b/hw/virtio/vhost-shadow-virtqueue.h
>> index 5bce67837b..19c842a15b 100644
>> --- a/hw/virtio/vhost-shadow-virtqueue.h
>> +++ b/hw/virtio/vhost-shadow-virtqueue.h
>> @@ -114,6 +114,7 @@ typedef struct VhostShadowVirtqueue {
>>
>>   bool vhost_svq_valid_features(uint64_t features, Error **errp);
>>
>> +uint16_t vhost_svq_available_slots(const VhostShadowVirtqueue *svq);
>>   void vhost_svq_push_elem(VhostShadowVirtqueue *svq,
>>const VirtQueueElement *elem, uint32_t len);
>>   int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
>
> I think it is ok to split this export in its own patch. If you decide
> to do it that way, you can add my Acked-by.

I will split this in its own patch, thanks for your suggestion!

>
>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>> index a875767ee9..e6342b213f 100644
>> --- a/net/vhost-vdpa.c
>> +++ b/net/vhost-vdpa.c
>> @@ -620,6 +620,13 @@ static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s,
>>   return vhost_svq_poll(svq, 1);
>>   }
>>
>> +/* Convenience wrapper to get number of available SVQ descriptors */
>> +static uint16_t vhost_vdpa_net_svq_available_slots(VhostVDPAState *s)
>> +{
>> +VhostShadowVirtqueue *svq = g_ptr_array_index(s->vhost_vdpa.shadow_vqs, 
>> 0);
>
> This is not really generic enough for all VhostVDPAState, as dataplane
> ones have two svqs.
>
> I think the best is to just inline the function in the caller, as
> there is only one, isn't it? If not, would it work to just replace
> _net_ by _cvq_ or similar?
>

Yes, there should be only one user for this function, I will inline
the function in the caller.

>> +return vhost_svq_available_slots(svq);
>> +}
>> +
>>   static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, uint8_t class,
>>  uint8_t cmd, const struct iovec 
>> *data_sg,
>>  size_t data_num)
>> @@ -640,6 +647,8 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState 
>> *s, uint8_t class,
>>   };
>>
>>   assert(data_size < vhost_vdpa_net_cvq_cmd_page_len() - sizeof(ctrl));
>> +/* Each CVQ command has one out descriptor and one in descriptor */
>> +assert(vhost_vdpa_net_svq_available_slots(s) >= 2);
>>
>
> I think we should remove this assertion. By the end of the series
> there is an "if" checks explicitly for the opposite condition, and
> flushing the queue in that case, so the code can never reach it.
>

Yes, you are right. I will remove this assertion.

Thanks!


>>   /* pack the CVQ command header */
>>   memcpy(s->cvq_cmd_out_buffer, &ctrl, sizeof(ctrl));
>> --
>> 2.25.1
>>
>



Re: [PATCH v4 0/8] vdpa: Send all CVQ state load commands in parallel

2023-08-29 Thread Hawkins Jiawei
On 2023/8/29 13:54, Hawkins Jiawei wrote:
> This patchset allows QEMU to delay polling and checking the device
> used buffer until either the SVQ is full or control commands shadow
> buffers are full, instead of polling and checking immediately after
> sending each SVQ control command, so that QEMU can send all the SVQ
> control commands in parallel, which have better performance improvement.
>
> I use vp_vdpa device to simulate vdpa device, and create 4094 VLANS in
> guest to build a test environment for sending multiple CVQ state load
> commands. This patch series can improve latency from 20455 us to
> 13732 us for about 4099 CVQ state load commands, about 1.64 us per command.
>
> Note that this patch should be based on
> patch "Vhost-vdpa Shadow Virtqueue VLAN support" at [1].
>
> [1]. https://lore.kernel.org/all/cover.1690100802.git.yin31...@gmail.com/

Sorry for the outdated link. The correct link for this patch should
be https://lore.kernel.org/all/cover.1690106284.git.yin31...@gmail.com/

Thanks!


>
> TestStep
> 
> 1. regression testing using vp-vdpa device
>- For L0 guest, boot QEMU with two virtio-net-pci net device with
> `ctrl_vq`, `ctrl_rx`, `ctrl_rx_extra` features on, command line like:
>-device virtio-net-pci,disable-legacy=on,disable-modern=off,
> iommu_platform=on,mq=on,ctrl_vq=on,guest_announce=off,
> indirect_desc=off,queue_reset=off,ctrl_rx=on,ctrl_rx_extra=on,...
>
>- For L1 guest, apply the patch series and compile the source code,
> start QEMU with two vdpa device with svq mode on, enable the `ctrl_vq`,
> `ctrl_rx`, `ctrl_rx_extra` features on, command line like:
>-netdev type=vhost-vdpa,x-svq=true,...
>-device virtio-net-pci,mq=on,guest_announce=off,ctrl_vq=on,
> ctrl_rx=on,ctrl_rx_extra=on...
>
>- For L2 source guest, run the following bash command:
> ```bash
> #!/bin/sh
>
> for idx1 in {0..9}
> do
>for idx2 in {0..9}
>do
>  for idx3 in {0..6}
>  do
>ip link add macvlan$idx1$idx2$idx3 link eth0
> address 4a:30:10:19:$idx1$idx2:1$idx3 type macvlan mode bridge
>ip link set macvlan$idx1$idx2$idx3 up
>  done
>done
> done
> ```
>- Execute the live migration in L2 source monitor
>
>- Result
>  * with this series, QEMU should not trigger any error or warning.
>
>
>
> 2. perf using vp-vdpa device
>- For L0 guest, boot QEMU with two virtio-net-pci net device with
> `ctrl_vq`, `ctrl_vlan` features on, command line like:
>-device virtio-net-pci,disable-legacy=on,disable-modern=off,
> iommu_platform=on,mq=on,ctrl_vq=on,guest_announce=off,
> indirect_desc=off,queue_reset=off,ctrl_vlan=on,...
>
>- For L1 guest, apply the patch series, then apply an addtional
> patch to record the load time in microseconds as following:
> ```diff
> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> index 6b958d6363..501b510fd2 100644
> --- a/hw/net/vhost_net.c
> +++ b/hw/net/vhost_net.c
> @@ -295,7 +295,10 @@ static int vhost_net_start_one(struct vhost_net *net,
>   }
>
>   if (net->nc->info->load) {
> +int64_t start_us = g_get_monotonic_time();
>   r = net->nc->info->load(net->nc);
> +error_report("vhost_vdpa_net_load() = %ld us",
> + g_get_monotonic_time() - start_us);
>   if (r < 0) {
>   goto fail;
>   }
> ```
>
>- For L1 guest, compile the code, and start QEMU with two vdpa device
> with svq mode on, enable the `ctrl_vq`, `ctrl_vlan` features on,
> command line like:
>-netdev type=vhost-vdpa,x-svq=true,...
>-device virtio-net-pci,mq=on,guest_announce=off,ctrl_vq=on,
> ctrl_vlan=on...
>
>- For L2 source guest, run the following bash command:
> ```bash
> #!/bin/sh
>
> for idx in {1..4094}
> do
>ip link add link eth0 name vlan$idx type vlan id $idx
> done
> ```
>
>- execute the live migration in L2 source monitor
>
>- Result
>  * with this series, QEMU should not trigger any warning
> or error except something like "vhost_vdpa_net_load() = 13732 us"
>  * without this series, QEMU should not trigger any warning
> or error except something like "vhost_vdpa_net_load() = 20455 us"
>
> ChangeLog
> =
> v4:
>- refactor subject line suggested by Eugenio in patch
> "vhost: Add count argument to vhost_svq_poll()"
>- split `in` to `vdpa_in` and `model_in` instead of reusing `in`
> in vhost_vdpa_net_handle_ctrl_avail() suggested by Eugenio in patch
> "vdpa: Use iovec for vhost_vdpa_net_cvq_add()"
>- pack CVQ command by iov_from_buf() ins

[PATCH v3 2/3] vdpa: Restore receive-side scaling state

2023-08-29 Thread Hawkins Jiawei
This patch reuses vhost_vdpa_net_load_rss(), with some
refactoring, to restore the receive-side scaling state
at the device's startup.
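
As a worked example (illustration only; the 128-entry table size is an
assumption), the VirtIO rule quoted in the diff below encodes the table
length as (entries - 1):

```c
uint16_t indirections_len = 128;                         /* assumed size */
cfg.indirection_table_mask = cpu_to_le16(indirections_len - 1);  /* 127 */
```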

Signed-off-by: Hawkins Jiawei 
---
v3:
  - resolve conflict with updated patch
"Vhost-vdpa Shadow Virtqueue Hash calculation Support"

v2: 
https://lore.kernel.org/all/af33aa80bc4ef0b2cec6c21b9448866c517fde80.1691926415.git.yin31...@gmail.com/
  - Correct the feature usage to VIRTIO_NET_F_HASH_REPORT when
loading the hash calculation state

v1: 
https://lore.kernel.org/all/93d5d82f0a5df71df326830033e50358c8b6be7a.1691766252.git.yin31...@gmail.com/

 net/vhost-vdpa.c | 54 
 1 file changed, 36 insertions(+), 18 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 11f89e7032..85547b7bbb 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -839,17 +839,28 @@ static int vhost_vdpa_net_load_rss(VhostVDPAState *s, 
const VirtIONet *n,
 
 cfg.hash_types = cpu_to_le32(n->rss_data.hash_types);
 
-/*
- * According to VirtIO standard, "Field reserved MUST contain zeroes.
- * It is defined to make the structure to match the layout of
- * virtio_net_rss_config structure, defined in 5.1.6.5.7.".
- *
- * Therefore, we need to zero the fields in struct virtio_net_rss_config,
- * which corresponds the `reserved` field in
- * struct virtio_net_hash_config.
- */
-memset(&cfg.indirection_table_mask, 0,
-   sizeof_field(struct virtio_net_hash_config, reserved));
+if (do_rss) {
+/*
+ * According to VirtIO standard, "Number of entries in 
indirection_table
+ * is (indirection_table_mask + 1)".
+ */
+cfg.indirection_table_mask = cpu_to_le16(n->rss_data.indirections_len -
+ 1);
+cfg.unclassified_queue = cpu_to_le16(n->rss_data.default_queue);
+cfg.max_tx_vq = cpu_to_le16(n->curr_queue_pairs);
+} else {
+/*
+ * According to VirtIO standard, "Field reserved MUST contain zeroes.
+ * It is defined to make the structure to match the layout of
+ * virtio_net_rss_config structure, defined in 5.1.6.5.7.".
+ *
+ * Therefore, we need to zero the fields in
+ * struct virtio_net_rss_config, which correspond to the `reserved` field
+ * in struct virtio_net_hash_config.
+ */
+memset(&cfg.indirection_table_mask, 0,
+   sizeof_field(struct virtio_net_hash_config, reserved));
+}
 
 table = g_malloc_n(n->rss_data.indirections_len,
sizeof(n->rss_data.indirections_table[0]));
@@ -886,6 +897,7 @@ static int vhost_vdpa_net_load_rss(VhostVDPAState *s, const 
VirtIONet *n,
 
 r = vhost_vdpa_net_load_cmd(s, out_cursor, in_cursor,
 VIRTIO_NET_CTRL_MQ,
+do_rss ? VIRTIO_NET_CTRL_MQ_RSS_CONFIG :
 VIRTIO_NET_CTRL_MQ_HASH_CONFIG,
 data, ARRAY_SIZE(data));
 if (unlikely(r < 0)) {
@@ -920,13 +932,19 @@ static int vhost_vdpa_net_load_mq(VhostVDPAState *s,
 return r;
 }
 
-if (!virtio_vdev_has_feature(>parent_obj, VIRTIO_NET_F_HASH_REPORT)) {
-return 0;
-}
-
-r = vhost_vdpa_net_load_rss(s, n, out_cursor, in_cursor, false);
-if (unlikely(r < 0)) {
-return r;
+if (virtio_vdev_has_feature(>parent_obj, VIRTIO_NET_F_RSS)) {
+/* load the receive-side scaling state */
+r = vhost_vdpa_net_load_rss(s, n, out_cursor, in_cursor, true);
+if (unlikely(r < 0)) {
+return r;
+}
+} else if (virtio_vdev_has_feature(>parent_obj,
+   VIRTIO_NET_F_HASH_REPORT)) {
+/* load the hash calculation state */
+r = vhost_vdpa_net_load_rss(s, n, out_cursor, in_cursor, false);
+if (unlikely(r < 0)) {
+return r;
+}
 }
 
 return 0;
-- 
2.25.1




[PATCH v3 3/3] vdpa: Allow VIRTIO_NET_F_RSS in SVQ

2023-08-29 Thread Hawkins Jiawei
Enable SVQ with VIRTIO_NET_F_RSS feature.

Signed-off-by: Hawkins Jiawei 
---
 net/vhost-vdpa.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 85547b7bbb..13da60aeda 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -119,6 +119,7 @@ static const uint64_t vdpa_svq_device_features =
 /* VHOST_F_LOG_ALL is exposed by SVQ */
 BIT_ULL(VHOST_F_LOG_ALL) |
 BIT_ULL(VIRTIO_NET_F_HASH_REPORT) |
+BIT_ULL(VIRTIO_NET_F_RSS) |
 BIT_ULL(VIRTIO_NET_F_RSC_EXT) |
 BIT_ULL(VIRTIO_NET_F_STANDBY) |
 BIT_ULL(VIRTIO_NET_F_SPEED_DUPLEX);
-- 
2.25.1




[PATCH v3 1/3] vdpa: Add SetSteeringEBPF method for NetClientState

2023-08-29 Thread Hawkins Jiawei
At present, to enable the VIRTIO_NET_F_RSS feature, eBPF must
be loaded for the vhost backend.

Given that vhost-vdpa is one of the vhost backends, we need to
implement the SetSteeringEBPF method to support RSS for vhost-vdpa,
even if vhost-vdpa calculates the rss hash in the hardware device
instead of in the kernel by eBPF.

Although this requires QEMU to be compiled with the `--enable-bpf`
configuration even if the vdpa device does not use eBPF to
calculate the rss hash, this avoids adding vDPA-specific
conditional statements to enable the VIRTIO_NET_F_RSS
feature, which improves code maintainability.

Suggested-by: Eugenio Pérez 
Signed-off-by: Hawkins Jiawei 
---
 net/vhost-vdpa.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index b29f84f54c..11f89e7032 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -238,6 +238,12 @@ static void vhost_vdpa_cleanup(NetClientState *nc)
 }
 }
 
+/** Dummy SetSteeringEBPF to support RSS for vhost-vdpa backend  */
+static bool vhost_vdpa_set_steering_ebpf(NetClientState *nc, int prog_fd)
+{
+return true;
+}
+
 static bool vhost_vdpa_has_vnet_hdr(NetClientState *nc)
 {
 assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
@@ -400,6 +406,7 @@ static NetClientInfo net_vhost_vdpa_info = {
 .has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
 .has_ufo = vhost_vdpa_has_ufo,
 .check_peer_type = vhost_vdpa_check_peer_type,
+.set_steering_ebpf = vhost_vdpa_set_steering_ebpf,
 };
 
 static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index,
@@ -1241,6 +1248,7 @@ static NetClientInfo net_vhost_vdpa_cvq_info = {
 .has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
 .has_ufo = vhost_vdpa_has_ufo,
 .check_peer_type = vhost_vdpa_check_peer_type,
+.set_steering_ebpf = vhost_vdpa_set_steering_ebpf,
 };
 
 /*
-- 
2.25.1




[PATCH v3 0/3] Vhost-vdpa Shadow Virtqueue RSS Support

2023-08-29 Thread Hawkins Jiawei
This series enables the shadowed CVQ to intercept the RSS command,
updates the virtio NIC device model so QEMU sends it in a
migration, and restores that RSS state in the destination.

Note that this patch should be based on
patch "Vhost-vdpa Shadow Virtqueue Hash calculation Support" at [1].

[1]. https://lore.kernel.org/all/cover.1693297766.git.yin31...@gmail.com/

TestStep

1. test the migration using vp-vdpa device
  - For L0 guest, boot QEMU with two virtio-net-pci net device with
`in-qemu` RSS, command line like:
-netdev tap,vhost=off...
-device virtio-net-pci,disable-legacy=on,disable-modern=off,
iommu_platform=on,mq=on,ctrl_vq=on,hash=on,rss=on,guest_announce=off,
indirect_desc=off,queue_reset=off,...

  - For L1 guest, apply the relative patch series and compile the
source code, start QEMU with two vdpa device with svq mode on,
enable the `ctrl_vq`, `mq`, `rss` features on, command line like:
  -netdev type=vhost-vdpa,x-svq=true,...
  -device virtio-net-pci,mq=on,guest_announce=off,ctrl_vq=on,
rss=on,...

  - For L2 source guest, run the following bash command:
```bash
#!/bin/sh

ethtool -K eth0 rxhash on
```

  - Execute the live migration in L2 source monitor

  - Result
* with this series, L2 QEMU can execute without
triggering any error or warning. L0 QEMU echoes
"Can't load eBPF RSS - fallback to software RSS".

ChangeLog
=
v3:
  - resolve conflict with updated patch
"Vhost-vdpa Shadow Virtqueue Hash calculation Support" in patch
"vdpa: Restore receive-side scaling state"

RFC v2: https://lore.kernel.org/all/cover.1691926415.git.yin31...@gmail.com/
  - Correct the feature usage to VIRTIO_NET_F_HASH_REPORT when
loading the hash calculation state in
patch "vdpa: Restore receive-side scaling state"

RFC v1: https://lore.kernel.org/all/cover.1691766252.git.yin31...@gmail.com/

Hawkins Jiawei (3):
  vdpa: Add SetSteeringEBPF method for NetClientState
  vdpa: Restore receive-side scaling state
  vdpa: Allow VIRTIO_NET_F_RSS in SVQ

 net/vhost-vdpa.c | 63 ++--
 1 file changed, 45 insertions(+), 18 deletions(-)

-- 
2.25.1




[PATCH v2 2/2] vdpa: Allow VIRTIO_NET_F_HASH_REPORT in SVQ

2023-08-29 Thread Hawkins Jiawei
Enable SVQ with VIRTIO_NET_F_HASH_REPORT feature.

Signed-off-by: Hawkins Jiawei 
---
 net/vhost-vdpa.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 139bb79468..b29f84f54c 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -118,6 +118,7 @@ static const uint64_t vdpa_svq_device_features =
 BIT_ULL(VIRTIO_NET_F_CTRL_MAC_ADDR) |
 /* VHOST_F_LOG_ALL is exposed by SVQ */
 BIT_ULL(VHOST_F_LOG_ALL) |
+BIT_ULL(VIRTIO_NET_F_HASH_REPORT) |
 BIT_ULL(VIRTIO_NET_F_RSC_EXT) |
 BIT_ULL(VIRTIO_NET_F_STANDBY) |
 BIT_ULL(VIRTIO_NET_F_SPEED_DUPLEX);
-- 
2.25.1




[PATCH v2 1/2] vdpa: Restore hash calculation state

2023-08-29 Thread Hawkins Jiawei
This patch introduces vhost_vdpa_net_load_rss() to restore
the hash calculation state at the device's startup.

Note that vhost_vdpa_net_load_rss() has a `do_rss` argument,
which allows future code to reuse this function to restore
the receive-side scaling state when the VIRTIO_NET_F_RSS
feature is enabled in SVQ. Currently, vhost_vdpa_net_load_rss()
can only be invoked with `do_rss` set to false.
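
As a sketch of the intended reuse (illustration only; the actual wiring
lands in a follow-up patch), `do_rss` would select which MQ subcommand
the shared helper emits:

```c
uint8_t cmd = do_rss ? VIRTIO_NET_CTRL_MQ_RSS_CONFIG
                     : VIRTIO_NET_CTRL_MQ_HASH_CONFIG;
```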

Signed-off-by: Hawkins Jiawei 
---
Question:

It seems that virtio_net_handle_rss() currently does not restore the
hash key length parsed from the CVQ command sent from the guest into
n->rss_data, and uses the maximum key length elsewhere in the code.

So for the `hash_key_length` field in the VIRTIO_NET_CTRL_MQ_HASH_CONFIG
command sent to the device, is it okay to also use the maximum key length
as its value? Or should we introduce a `hash_key_length` field in the
n->rss_data structure to record the key length from the guest and use
that value?

ChangeLog:

v2:
  - resolve conflict with updated patch
"vdpa: Send all CVQ state load commands in parallel"
  - move the `table` declaration to the beginning of
vhost_vdpa_net_load_rss()

RFC: 
https://lore.kernel.org/all/a54ca70b12ebe2f3c391864e41241697ab1aba30.1691762906.git.yin31...@gmail.com/

 net/vhost-vdpa.c | 91 
 1 file changed, 91 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 818464b702..139bb79468 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -805,6 +805,88 @@ static int vhost_vdpa_net_load_mac(VhostVDPAState *s, 
const VirtIONet *n,
 return 0;
 }
 
+static int vhost_vdpa_net_load_rss(VhostVDPAState *s, const VirtIONet *n,
+   struct iovec *out_cursor,
+   struct iovec *in_cursor, bool do_rss)
+{
+struct virtio_net_rss_config cfg;
+ssize_t r;
+g_autofree uint16_t *table = NULL;
+
+/*
+ * According to VirtIO standard, "Initially the device has all hash
+ * types disabled and reports only VIRTIO_NET_HASH_REPORT_NONE.".
+ *
+ * Therefore, there is no need to send this CVQ command if the
+ * driver disables all hash types, which aligns with
+ * the device's defaults.
+ *
+ * Note that the device's defaults can mismatch the driver's
+ * configuration only at live migration.
+ */
+if (!n->rss_data.enabled ||
+n->rss_data.hash_types == VIRTIO_NET_HASH_REPORT_NONE) {
+return 0;
+}
+
+cfg.hash_types = cpu_to_le32(n->rss_data.hash_types);
+
+/*
+ * According to VirtIO standard, "Field reserved MUST contain zeroes.
+ * It is defined to make the structure to match the layout of
+ * virtio_net_rss_config structure, defined in 5.1.6.5.7.".
+ *
+ * Therefore, we need to zero the fields in struct virtio_net_rss_config,
+ * which correspond to the `reserved` field in
+ * struct virtio_net_hash_config.
+ */
+memset(&cfg.indirection_table_mask, 0,
+   sizeof_field(struct virtio_net_hash_config, reserved));
+
+table = g_malloc_n(n->rss_data.indirections_len,
+   sizeof(n->rss_data.indirections_table[0]));
+for (int i = 0; i < n->rss_data.indirections_len; ++i) {
+table[i] = cpu_to_le16(n->rss_data.indirections_table[i]);
+}
+
+/*
+ * Consider that virtio_net_handle_rss() currently does not restore the
+ * hash key length parsed from the CVQ command sent from the guest into
+ * n->rss_data and uses the maximum key length in other code, so we also
+ * employ the maximum key length here.
+ */
+cfg.hash_key_length = sizeof(n->rss_data.key);
+
+const struct iovec data[] = {
+{
+.iov_base = &cfg,
+.iov_len = offsetof(struct virtio_net_rss_config,
+indirection_table),
+}, {
+.iov_base = table,
+.iov_len = n->rss_data.indirections_len *
+   sizeof(n->rss_data.indirections_table[0]),
+}, {
+.iov_base = &cfg.max_tx_vq,
+.iov_len = offsetof(struct virtio_net_rss_config, hash_key_data) -
+   offsetof(struct virtio_net_rss_config, max_tx_vq),
+}, {
+.iov_base = (void *)n->rss_data.key,
+.iov_len = sizeof(n->rss_data.key),
+}
+};
+
+r = vhost_vdpa_net_load_cmd(s, out_cursor, in_cursor,
+VIRTIO_NET_CTRL_MQ,
+VIRTIO_NET_CTRL_MQ_HASH_CONFIG,
+data, ARRAY_SIZE(data));
+if (unlikely(r < 0)) {
+return r;
+}
+
+return 0;
+}
+
 static int vhost_vdpa_net_load_mq(VhostVDPAState *s,
   const VirtIONet *n,
   struct iovec *out_cursor,
@@ -830,6 +912,15 @@ static int vhost_vdpa_net_load_mq(VhostVDPAState *s,
 return r;

[PATCH v2 0/2] Vhost-vdpa Shadow Virtqueue Hash calculation Support

2023-08-29 Thread Hawkins Jiawei
This series enables the shadowed CVQ to intercept the
VIRTIO_NET_CTRL_MQ_HASH_CONFIG command, updates the virtio NIC
device model so QEMU sends it in a migration, and restores that
hash calculation state in the destination.

Note that this patch should be based on
patch "vdpa: Send all CVQ state load commands in parallel" at [1].

[1]. https://lore.kernel.org/all/cover.1693287885.git.yin31...@gmail.com/

TestStep

1. test the migration using vp-vdpa device
  - For L0 guest, boot QEMU with two virtio-net-pci net device with
`ctrl_vq`, `mq`, `hash` features on, command line like:
-netdev tap,...
-device virtio-net-pci,disable-legacy=on,disable-modern=off,
iommu_platform=on,mq=on,ctrl_vq=on,hash=on,guest_announce=off,
indirect_desc=off,queue_reset=off,...

  - For L1 guest, apply the relative patch series and compile the
source code, start QEMU with two vdpa device with svq mode on,
enable the `ctrl_vq`, `mq`, `hash` features on, command line like:
  -netdev type=vhost-vdpa,x-svq=true,...
  -device virtio-net-pci,mq=on,guest_announce=off,ctrl_vq=on,
hash=on,...

  - For L2 source guest, run the following bash command:
```bash
#!/bin/sh

ethtool -K eth0 rxhash on
```
  - Attach gdb to the destination VM and set a breakpoint at
vhost_vdpa_net_load_rss()

  - Execute the live migration in L2 source monitor

  - Result
* with this series, gdb can hit the breakpoint and continue
the executing without triggering any error or warning.


ChangeLog
=
v2:
  - resolve conflict with updated patch
"vdpa: Send all CVQ state load commands in parallel", move the
`table` declaration to the beginning of vhost_vdpa_net_load_rss()
in patch
"vdpa: Restore hash calculation state"

RFC: https://lore.kernel.org/all/cover.1691762906.git.yin31...@gmail.com/#t

Hawkins Jiawei (2):
  vdpa: Restore hash calculation state
  vdpa: Allow VIRTIO_NET_F_HASH_REPORT in SVQ

 net/vhost-vdpa.c | 92 
 1 file changed, 92 insertions(+)

-- 
2.25.1




[PATCH v4 3/8] vhost: Expose vhost_svq_available_slots()

2023-08-29 Thread Hawkins Jiawei
Next patches in this series will delay the polling
and checking of buffers until either the SVQ is
full or the control commands' shadow buffers are full,
instead of performing an immediate poll and check of
the device's used buffers for each CVQ state load command.

To achieve this, this patch exposes
vhost_svq_available_slots() and introduces a helper function,
allowing QEMU to know whether the SVQ is full.

Signed-off-by: Hawkins Jiawei 
Acked-by: Eugenio Pérez 
---
 hw/virtio/vhost-shadow-virtqueue.c | 2 +-
 hw/virtio/vhost-shadow-virtqueue.h | 1 +
 net/vhost-vdpa.c   | 9 +
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c 
b/hw/virtio/vhost-shadow-virtqueue.c
index e731b1d2ea..fc5f408f77 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -66,7 +66,7 @@ bool vhost_svq_valid_features(uint64_t features, Error **errp)
  *
  * @svq: The svq
  */
-static uint16_t vhost_svq_available_slots(const VhostShadowVirtqueue *svq)
+uint16_t vhost_svq_available_slots(const VhostShadowVirtqueue *svq)
 {
 return svq->num_free;
 }
diff --git a/hw/virtio/vhost-shadow-virtqueue.h 
b/hw/virtio/vhost-shadow-virtqueue.h
index 5bce67837b..19c842a15b 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -114,6 +114,7 @@ typedef struct VhostShadowVirtqueue {
 
 bool vhost_svq_valid_features(uint64_t features, Error **errp);
 
+uint16_t vhost_svq_available_slots(const VhostShadowVirtqueue *svq);
 void vhost_svq_push_elem(VhostShadowVirtqueue *svq,
  const VirtQueueElement *elem, uint32_t len);
 int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index a875767ee9..e6342b213f 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -620,6 +620,13 @@ static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s,
 return vhost_svq_poll(svq, 1);
 }
 
+/* Convenience wrapper to get number of available SVQ descriptors */
+static uint16_t vhost_vdpa_net_svq_available_slots(VhostVDPAState *s)
+{
+VhostShadowVirtqueue *svq = g_ptr_array_index(s->vhost_vdpa.shadow_vqs, 0);
+return vhost_svq_available_slots(svq);
+}
+
 static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, uint8_t class,
uint8_t cmd, const struct iovec *data_sg,
size_t data_num)
@@ -640,6 +647,8 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, 
uint8_t class,
 };
 
 assert(data_size < vhost_vdpa_net_cvq_cmd_page_len() - sizeof(ctrl));
+/* Each CVQ command has one out descriptor and one in descriptor */
+assert(vhost_vdpa_net_svq_available_slots(s) >= 2);
 
 /* pack the CVQ command header */
 memcpy(s->cvq_cmd_out_buffer, &ctrl, sizeof(ctrl));
-- 
2.25.1




[PATCH v4 5/8] vdpa: Check device ack in vhost_vdpa_net_load_rx_mode()

2023-08-28 Thread Hawkins Jiawei
Considering that vhost_vdpa_net_load_rx_mode() is now only called
within vhost_vdpa_net_load_rx(), this patch refactors
vhost_vdpa_net_load_rx_mode() to include a check for the
device's ack, simplifying the code and improving its maintainability.
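
As a small sketch (illustration only), callers now rely on the helper
for the ack check instead of repeating it:

```c
r = vhost_vdpa_net_load_rx_mode(s, VIRTIO_NET_CTRL_RX_PROMISC, 0);
if (unlikely(r < 0)) {
    /* r is already -EIO when the device did not ack with VIRTIO_NET_OK */
    return r;
}
```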

Signed-off-by: Hawkins Jiawei 
Acked-by: Eugenio Pérez 
---
 net/vhost-vdpa.c | 76 
 1 file changed, 31 insertions(+), 45 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 7c67063469..116a06cc45 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -814,14 +814,24 @@ static int vhost_vdpa_net_load_rx_mode(VhostVDPAState *s,
 .iov_base = &on,
 .iov_len = sizeof(on),
 };
-return vhost_vdpa_net_load_cmd(s, VIRTIO_NET_CTRL_RX,
-   cmd, &data, 1);
+ssize_t dev_written;
+
+dev_written = vhost_vdpa_net_load_cmd(s, VIRTIO_NET_CTRL_RX,
+  cmd, &data, 1);
+if (unlikely(dev_written < 0)) {
+return dev_written;
+}
+if (*s->status != VIRTIO_NET_OK) {
+return -EIO;
+}
+
+return 0;
 }
 
 static int vhost_vdpa_net_load_rx(VhostVDPAState *s,
   const VirtIONet *n)
 {
-ssize_t dev_written;
+ssize_t r;
 
 if (!virtio_vdev_has_feature(>parent_obj, VIRTIO_NET_F_CTRL_RX)) {
 return 0;
@@ -846,13 +856,9 @@ static int vhost_vdpa_net_load_rx(VhostVDPAState *s,
  * configuration only at live migration.
  */
 if (!n->mac_table.uni_overflow && !n->promisc) {
-dev_written = vhost_vdpa_net_load_rx_mode(s,
-VIRTIO_NET_CTRL_RX_PROMISC, 0);
-if (unlikely(dev_written < 0)) {
-return dev_written;
-}
-if (*s->status != VIRTIO_NET_OK) {
-return -EIO;
+r = vhost_vdpa_net_load_rx_mode(s, VIRTIO_NET_CTRL_RX_PROMISC, 0);
+if (unlikely(r < 0)) {
+return r;
 }
 }
 
@@ -874,13 +880,9 @@ static int vhost_vdpa_net_load_rx(VhostVDPAState *s,
  * configuration only at live migration.
  */
 if (n->mac_table.multi_overflow || n->allmulti) {
-dev_written = vhost_vdpa_net_load_rx_mode(s,
-VIRTIO_NET_CTRL_RX_ALLMULTI, 1);
-if (unlikely(dev_written < 0)) {
-return dev_written;
-}
-if (*s->status != VIRTIO_NET_OK) {
-return -EIO;
+r = vhost_vdpa_net_load_rx_mode(s, VIRTIO_NET_CTRL_RX_ALLMULTI, 1);
+if (unlikely(r < 0)) {
+return r;
 }
 }
 
@@ -899,13 +901,9 @@ static int vhost_vdpa_net_load_rx(VhostVDPAState *s,
  * configuration only at live migration.
  */
 if (n->alluni) {
-dev_written = vhost_vdpa_net_load_rx_mode(s,
-VIRTIO_NET_CTRL_RX_ALLUNI, 1);
-if (dev_written < 0) {
-return dev_written;
-}
-if (*s->status != VIRTIO_NET_OK) {
-return -EIO;
+r = vhost_vdpa_net_load_rx_mode(s, VIRTIO_NET_CTRL_RX_ALLUNI, 1);
+if (r < 0) {
+return r;
 }
 }
 
@@ -920,13 +918,9 @@ static int vhost_vdpa_net_load_rx(VhostVDPAState *s,
  * configuration only at live migration.
  */
 if (n->nomulti) {
-dev_written = vhost_vdpa_net_load_rx_mode(s,
-VIRTIO_NET_CTRL_RX_NOMULTI, 1);
-if (dev_written < 0) {
-return dev_written;
-}
-if (*s->status != VIRTIO_NET_OK) {
-return -EIO;
+r = vhost_vdpa_net_load_rx_mode(s, VIRTIO_NET_CTRL_RX_NOMULTI, 1);
+if (r < 0) {
+return r;
 }
 }
 
@@ -941,13 +935,9 @@ static int vhost_vdpa_net_load_rx(VhostVDPAState *s,
  * configuration only at live migration.
  */
 if (n->nouni) {
-dev_written = vhost_vdpa_net_load_rx_mode(s,
-VIRTIO_NET_CTRL_RX_NOUNI, 1);
-if (dev_written < 0) {
-return dev_written;
-}
-if (*s->status != VIRTIO_NET_OK) {
-return -EIO;
+r = vhost_vdpa_net_load_rx_mode(s, VIRTIO_NET_CTRL_RX_NOUNI, 1);
+if (r < 0) {
+return r;
 }
 }
 
@@ -962,13 +952,9 @@ static int vhost_vdpa_net_load_rx(VhostVDPAState *s,
  * configuration only at live migration.
  */
 if (n->nobcast) {
-dev_written = vhost_vdpa_net_load_rx_mode(s,
-VIRTIO_NET_CTRL_RX_NOBCAST, 1);
-if (dev_written < 0) {
-return dev_written;
-}
-if (*s->status != VIRTIO_NET_OK) {
-return -EIO;
+r = vhost_vdpa_net_load_rx_mode(s, VIRTIO_NET_CTRL_RX_NOBCAST, 1);
+if (r < 0) {
+return r;
 }
 }
 
-- 
2.25.1




[PATCH v4 8/8] vdpa: Send cvq state load commands in parallel

2023-08-28 Thread Hawkins Jiawei
This patch enables sending CVQ state load commands
in parallel at device startup through the following steps:

  * Refactor vhost_vdpa_net_load_cmd() to iterate through
the control commands shadow buffers. This allows different
CVQ state load commands to use their own unique buffers.

  * Delay the polling and checking of buffers until either
the SVQ is full or control commands shadow buffers are full.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1578
Signed-off-by: Hawkins Jiawei 
---
v4:
  - refactor argument `cmds_in_flight` to `len` for
vhost_vdpa_net_svq_full()
  - check the return value of vhost_vdpa_net_svq_poll()
in vhost_vdpa_net_svq_flush() suggested by Eugenio
  - use iov_size(), vhost_vdpa_net_load_cursor_reset()
and iov_discard_front() to update the cursors instead of
accessing it directly according to Eugenio

v3: 
https://lore.kernel.org/all/3a002790e6c880af928c6470ecbf03e7c65a68bb.1689748694.git.yin31...@gmail.com/

 net/vhost-vdpa.c | 155 +--
 1 file changed, 97 insertions(+), 58 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index a71e8c9090..818464b702 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -646,6 +646,31 @@ static void 
vhost_vdpa_net_load_cursor_reset(VhostVDPAState *s,
 in_cursor->iov_len = vhost_vdpa_net_cvq_cmd_page_len();
 }
 
+/*
+ * Poll SVQ for multiple pending control commands and check the device's ack.
+ *
+ * Caller should hold the BQL when invoking this function.
+ *
+ * @s: The VhostVDPAState
+ * @len: The length of the pending status shadow buffer
+ */
+static ssize_t vhost_vdpa_net_svq_flush(VhostVDPAState *s, size_t len)
+{
+/* Device uses a one-byte length ack for each control command */
+ssize_t dev_written = vhost_vdpa_net_svq_poll(s, len);
+if (unlikely(dev_written != len)) {
+return -EIO;
+}
+
+/* check the device's ack */
+for (int i = 0; i < len; ++i) {
+if (s->status[i] != VIRTIO_NET_OK) {
+return -EIO;
+}
+}
+return 0;
+}
+
 static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s,
struct iovec *out_cursor,
struct iovec *in_cursor, uint8_t class,
@@ -660,10 +685,30 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s,
cmd_size = sizeof(ctrl) + data_size;
 struct iovec out, in;
 ssize_t r;
+unsigned dummy_cursor_iov_cnt;
 
 assert(data_size < vhost_vdpa_net_cvq_cmd_page_len() - sizeof(ctrl));
+if (vhost_vdpa_net_svq_available_slots(s) < 2 ||
+iov_size(out_cursor, 1) < cmd_size) {
+/*
+ * It is time to flush all pending control commands if SVQ is full
+ * or control commands shadow buffers are full.
+ *
+ * We can poll here since we've had BQL from the time
+ * we sent the descriptor.
+ */
+r = vhost_vdpa_net_svq_flush(s, in_cursor->iov_base -
+ (void *)s->status);
+if (unlikely(r < 0)) {
+return r;
+}
+
+vhost_vdpa_net_load_cursor_reset(s, out_cursor, in_cursor);
+}
+
 /* Each CVQ command has one out descriptor and one in descriptor */
 assert(vhost_vdpa_net_svq_available_slots(s) >= 2);
+assert(iov_size(out_cursor, 1) >= cmd_size);
 
 /* Prepare the buffer for out descriptor for the device */
 iov_copy(&out, 1, out_cursor, 1, 0, cmd_size);
@@ -681,11 +726,13 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s,
 return r;
 }
 
-/*
- * We can poll here since we've had BQL from the time
- * we sent the descriptor.
- */
-return vhost_vdpa_net_svq_poll(s, 1);
+/* iterate the cursors */
+dummy_cursor_iov_cnt = 1;
+iov_discard_front(&out_cursor, &dummy_cursor_iov_cnt, cmd_size);
+dummy_cursor_iov_cnt = 1;
+iov_discard_front(&in_cursor, &dummy_cursor_iov_cnt, sizeof(*s->status));
+
+return 0;
 }
 
 static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet *n,
@@ -697,15 +744,12 @@ static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet *n,
 .iov_base = (void *)n->mac,
 .iov_len = sizeof(n->mac),
 };
-ssize_t dev_written = vhost_vdpa_net_load_cmd(s, out_cursor, in_cursor,
-  VIRTIO_NET_CTRL_MAC,
-  VIRTIO_NET_CTRL_MAC_ADDR_SET,
-  &data, 1);
-if (unlikely(dev_written < 0)) {
-return dev_written;
-}
-if (*s->status != VIRTIO_NET_OK) {
-return -EIO;
+ssize_t r = vhost_vdpa_net_load_cmd(s, out_cursor, in_cursor,
+   VIRTIO_NET_CTRL_MAC,
+   VIRTIO_NET_CTRL_MAC_ADDR_SET,
+

[PATCH v4 4/8] vdpa: Avoid using vhost_vdpa_net_load_*() outside vhost_vdpa_net_load()

2023-08-28 Thread Hawkins Jiawei
Next patches in this series will refactor vhost_vdpa_net_load_cmd()
to iterate through the control commands shadow buffers, allowing QEMU
to send CVQ state load commands in parallel at device startup.

Considering that QEMU always forwards the CVQ command serialized
outside of vhost_vdpa_net_load(), it is more elegant to send the
CVQ commands directly without invoking vhost_vdpa_net_load_*() helpers.
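
As a rough illustration of "sending the command directly", this sketch
packs a control header plus a one-byte payload into a flat buffer, in
the spirit of what the PROMISC path below does with iov_from_buf(). The
struct and the zero class/command values are simplified stand-ins, not
the real virtio-net definitions.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct ctrl_hdr {          /* stand-in for struct virtio_net_ctrl_hdr */
    uint8_t class;
    uint8_t cmd;
} __attribute__((packed));

int main(void)
{
    uint8_t out_buf[64];                             /* pretend CVQ out buffer */
    struct ctrl_hdr hdr = { .class = 0, .cmd = 0 };  /* pretend RX / PROMISC */
    uint8_t on = 1;

    memcpy(out_buf, &hdr, sizeof(hdr));              /* command header  */
    memcpy(out_buf + sizeof(hdr), &on, sizeof(on));  /* command payload */

    printf("CVQ command occupies %zu byte(s)\n", sizeof(hdr) + sizeof(on));
    return 0;
}
```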

Signed-off-by: Hawkins Jiawei 
---
v4:
  - pack the CVQ command with iov_from_buf() instead of accessing
`out` directly, as suggested by Eugenio

v3: 
https://lore.kernel.org/all/428a8fac2a29b37757fa15ca747be93c0226cb1f.1689748694.git.yin31...@gmail.com/

 net/vhost-vdpa.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index e6342b213f..7c67063469 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -1097,12 +1097,14 @@ static NetClientInfo net_vhost_vdpa_cvq_info = {
  */
 static int vhost_vdpa_net_excessive_mac_filter_cvq_add(VhostVDPAState *s,
VirtQueueElement *elem,
-   struct iovec *out)
+   struct iovec *out,
+   const struct iovec *in)
 {
 struct virtio_net_ctrl_mac mac_data, *mac_ptr;
 struct virtio_net_ctrl_hdr *hdr_ptr;
 uint32_t cursor;
 ssize_t r;
+uint8_t on = 1;
 
 /* parse the non-multicast MAC address entries from CVQ command */
 cursor = sizeof(*hdr_ptr);
@@ -1150,7 +1152,15 @@ static int vhost_vdpa_net_excessive_mac_filter_cvq_add(VhostVDPAState *s,
  * filter table to the vdpa device, it should send the
  * VIRTIO_NET_CTRL_RX_PROMISC CVQ command to enable promiscuous mode
  */
-r = vhost_vdpa_net_load_rx_mode(s, VIRTIO_NET_CTRL_RX_PROMISC, 1);
+cursor = 0;
+hdr_ptr = out->iov_base;
+out->iov_len = sizeof(*hdr_ptr) + sizeof(on);
+assert(out->iov_len < vhost_vdpa_net_cvq_cmd_page_len());
+
+hdr_ptr->class = VIRTIO_NET_CTRL_RX;
+hdr_ptr->cmd = VIRTIO_NET_CTRL_RX_PROMISC;
+iov_from_buf(out, 1, sizeof(*hdr_ptr), &on, sizeof(on));
+r = vhost_vdpa_net_cvq_add(s, out, 1, in, 1);
 if (unlikely(r < 0)) {
 return r;
 }
@@ -1268,7 +1278,7 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
  * the CVQ command direclty.
  */
 dev_written = vhost_vdpa_net_excessive_mac_filter_cvq_add(s, elem,
-  &out);
+  &out, &vdpa_in);
 if (unlikely(dev_written < 0)) {
 goto out;
 }
-- 
2.25.1




[PATCH v4 6/8] vdpa: Move vhost_svq_poll() to the caller of vhost_vdpa_net_cvq_add()

2023-08-28 Thread Hawkins Jiawei
This patch moves vhost_svq_poll() to the caller of
vhost_vdpa_net_cvq_add() and introduces a helper function.

By making this change, the next patches in this series are
able to refactor vhost_vdpa_net_load_x() only to delay
the polling and checking process until either the SVQ
is full or control commands shadow buffers are full.
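
The shape of the split is easy to see in isolation. Below is a
compilable toy model, with stub functions and illustrative names only
(none of this is QEMU's API): the fused helper both adds and polls,
while the new arrangement lets the caller add several descriptors and
poll once.

```c
#include <stdio.h>

static int svq_add(void)   { puts("descriptor queued"); return 0; }
static int svq_poll(int n) { printf("polled %d ack(s)\n", n); return n; }

/* old shape: enqueue and poll fused in one helper */
static int cvq_add_and_poll(void)
{
    if (svq_add() < 0) {
        return -1;
    }
    return svq_poll(1);
}

/* new shape: callers enqueue now and poll later, possibly in bulk */
int main(void)
{
    (void)cvq_add_and_poll;   /* kept only for comparison */
    svq_add();
    svq_add();
    return svq_poll(2) == 2 ? 0 : 1;
}
```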

Signed-off-by: Hawkins Jiawei 
---
v4:
  - always check the return value of vhost_vdpa_net_svq_poll()
suggested by Eugenio

v3: 
https://lore.kernel.org/all/152177c4e7082236fba9d31d535e40f8c2984349.1689748694.git.yin31...@gmail.com/

 net/vhost-vdpa.c | 53 +++-
 1 file changed, 43 insertions(+), 10 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 116a06cc45..d9b8b3cf6c 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -609,15 +609,21 @@ static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s,
 qemu_log_mask(LOG_GUEST_ERROR, "%s: No space on device queue\n",
   __func__);
 }
-return r;
 }
 
-/*
- * We can poll here since we've had BQL from the time we sent the
- * descriptor. Also, we need to take the answer before SVQ pulls by itself,
- * when BQL is released
- */
-return vhost_svq_poll(svq, 1);
+return r;
+}
+
+/*
+ * Convenience wrapper to poll SVQ for multiple control commands.
+ *
+ * Caller should hold the BQL when invoking this function, and should take
+ * the answer before SVQ pulls by itself when BQL is released.
+ */
+static ssize_t vhost_vdpa_net_svq_poll(VhostVDPAState *s, size_t cmds_in_flight)
+{
+VhostShadowVirtqueue *svq = g_ptr_array_index(s->vhost_vdpa.shadow_vqs, 0);
+return vhost_svq_poll(svq, cmds_in_flight);
 }
 
 /* Convenience wrapper to get number of available SVQ descriptors */
@@ -645,6 +651,7 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, uint8_t class,
 .iov_base = s->status,
 .iov_len = sizeof(*s->status),
 };
+ssize_t r;
 
 assert(data_size < vhost_vdpa_net_cvq_cmd_page_len() - sizeof(ctrl));
 /* Each CVQ command has one out descriptor and one in descriptor */
@@ -657,7 +664,16 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, uint8_t class,
 iov_to_buf(data_sg, data_num, 0,
s->cvq_cmd_out_buffer + sizeof(ctrl), data_size);
 
-return vhost_vdpa_net_cvq_add(s, &out, 1, &in, 1);
+r = vhost_vdpa_net_cvq_add(s, &out, 1, &in, 1);
+if (unlikely(r < 0)) {
+return r;
+}
+
+/*
+ * We can poll here since we've had BQL from the time
+ * we sent the descriptor.
+ */
+return vhost_vdpa_net_svq_poll(s, 1);
 }
 
 static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet *n)
@@ -1150,6 +1166,15 @@ static int vhost_vdpa_net_excessive_mac_filter_cvq_add(VhostVDPAState *s,
 if (unlikely(r < 0)) {
 return r;
 }
+
+/*
+ * We can poll here since we've had BQL from the time
+ * we sent the descriptor.
+ */
+r = vhost_vdpa_net_svq_poll(s, 1);
+if (unlikely(r < sizeof(*s->status))) {
+return r;
+}
 if (*s->status != VIRTIO_NET_OK) {
 return sizeof(*s->status);
 }
@@ -1269,10 +1294,18 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
 goto out;
 }
 } else {
-dev_written = vhost_vdpa_net_cvq_add(s, &out, 1, &vdpa_in, 1);
-if (unlikely(dev_written < 0)) {
+ssize_t r;
+r = vhost_vdpa_net_cvq_add(s, &out, 1, &vdpa_in, 1);
+if (unlikely(r < 0)) {
+dev_written = r;
 goto out;
 }
+
+/*
+ * We can poll here since we've had BQL from the time
+ * we sent the descriptor.
+ */
+dev_written = vhost_vdpa_net_svq_poll(s, 1);
 }
 
 if (unlikely(dev_written < sizeof(status))) {
-- 
2.25.1




[PATCH v4 1/8] vhost: Add count argument to vhost_svq_poll()

2023-08-28 Thread Hawkins Jiawei
Next patches in this series will no longer perform an
immediate poll and check of the device's used buffers
for each CVQ state load command. Instead, they will
send CVQ state load commands in parallel by polling
multiple pending buffers at once.

To achieve this, this patch refactors vhost_svq_poll()
to accept a new argument `num`, which allows vhost_svq_poll()
to wait for the device to use multiple elements,
rather than polling for a single element.
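
The per-element timeout loop added below can be modeled in plain POSIX
C. In this sketch the "device" is a stub that is always ready; the
~10 second budget mirrors the patch, but everything else is an
assumption for illustration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

static int device_has_used(void)      { return 1; }  /* stub: always ready */
static uint32_t device_used_len(void) { return 1; }  /* stub: 1 byte each  */

/* Wait for `num` used elements; on a slow device, give up and return
 * however many bytes were collected so far. */
static size_t poll_used(size_t num)
{
    size_t len = 0;

    while (num--) {
        struct timespec start, now;

        clock_gettime(CLOCK_MONOTONIC, &start);
        while (!device_has_used()) {
            clock_gettime(CLOCK_MONOTONIC, &now);
            if (now.tv_sec - start.tv_sec > 10) {   /* ~10 s per element */
                return len;                         /* partial result */
            }
        }
        len += device_used_len();
    }
    return len;
}

int main(void)
{
    printf("total bytes written by device: %zu\n", poll_used(3));
    return 0;
}
```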

Signed-off-by: Hawkins Jiawei 
Acked-by: Eugenio Pérez 
---
v4:
  - refactor subject line suggested by Eugenio

v3: 
https://lore.kernel.org/all/77c1d8b358644b49992e6dbca55a5c9e62c941a8.1689748694.git.yin31...@gmail.com/

 hw/virtio/vhost-shadow-virtqueue.c | 36 ++
 hw/virtio/vhost-shadow-virtqueue.h |  2 +-
 net/vhost-vdpa.c   |  2 +-
 3 files changed, 24 insertions(+), 16 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index 49e5aed931..e731b1d2ea 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -514,29 +514,37 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
 }
 
 /**
- * Poll the SVQ for one device used buffer.
+ * Poll the SVQ to wait for the device to use the specified number
+ * of elements and return the total length written by the device.
  *
  * This function race with main event loop SVQ polling, so extra
  * synchronization is needed.
  *
- * Return the length written by the device.
+ * @svq: The svq
+ * @num: The number of elements that need to be used
  */
-size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
+size_t vhost_svq_poll(VhostShadowVirtqueue *svq, size_t num)
 {
-int64_t start_us = g_get_monotonic_time();
-uint32_t len = 0;
+size_t len = 0;
+uint32_t r;
 
-do {
-if (vhost_svq_more_used(svq)) {
-break;
-}
+while (num--) {
+int64_t start_us = g_get_monotonic_time();
 
-if (unlikely(g_get_monotonic_time() - start_us > 10e6)) {
-return 0;
-}
-} while (true);
+do {
+if (vhost_svq_more_used(svq)) {
+break;
+}
+
+if (unlikely(g_get_monotonic_time() - start_us > 10e6)) {
+return len;
+}
+} while (true);
+
+vhost_svq_get_buf(svq, &r);
+len += r;
+}
 
-vhost_svq_get_buf(svq, &len);
 return len;
 }
 
diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index 6efe051a70..5bce67837b 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -119,7 +119,7 @@ void vhost_svq_push_elem(VhostShadowVirtqueue *svq,
 int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
   size_t out_num, const struct iovec *in_sg, size_t in_num,
   VirtQueueElement *elem);
-size_t vhost_svq_poll(VhostShadowVirtqueue *svq);
+size_t vhost_svq_poll(VhostShadowVirtqueue *svq, size_t num);
 
 void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd);
 void vhost_svq_set_svq_call_fd(VhostShadowVirtqueue *svq, int call_fd);
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 73e9063fa0..3acda8591a 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -625,7 +625,7 @@ static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s, 
size_t out_len,
  * descriptor. Also, we need to take the answer before SVQ pulls by itself,
  * when BQL is released
  */
-return vhost_svq_poll(svq);
+return vhost_svq_poll(svq, 1);
 }
 
 static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, uint8_t class,
-- 
2.25.1




[PATCH v4 7/8] vdpa: Introduce cursors to vhost_vdpa_net_loadx()

2023-08-28 Thread Hawkins Jiawei
This patch introduces two new arguments, `out_cursor`
and `in_cursor`, to vhost_vdpa_net_loadx(). Additionally,
it includes a helper function
vhost_vdpa_net_load_cursor_reset() for resetting these
cursors.

Furthermore, this patch refactors vhost_vdpa_net_load_cmd()
so that it prepares buffers for the device using the
cursor arguments, instead of directly accessing the
`s->cvq_cmd_out_buffer` and `s->status` fields.

By making these changes, the next patches in this series
can refactor vhost_vdpa_net_load_cmd() directly to
iterate through the control commands shadow buffers,
allowing QEMU to send CVQ state load commands in parallel
at device startup.
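
A toy model of the cursor idea, compilable on any POSIX system: one
flat page is handed out slice by slice, and the cursor is advanced past
each command's private slice. QEMU does the advance with iov_copy() and
iov_discard_front(); here plain pointer arithmetic stands in, and all
names are illustrative assumptions.

```c
#include <stdio.h>
#include <string.h>
#include <sys/uio.h>

static void cursor_reset(struct iovec *cur, void *buf, size_t len)
{
    cur->iov_base = buf;
    cur->iov_len = len;
}

/* Carve the next `len` bytes off the cursor as this command's slice. */
static int cursor_take(struct iovec *cur, const void *src, size_t len,
                       struct iovec *slice)
{
    if (cur->iov_len < len) {
        return -1;                    /* page exhausted: flush first */
    }
    slice->iov_base = cur->iov_base;
    slice->iov_len = len;
    memcpy(slice->iov_base, src, len);
    cur->iov_base = (char *)cur->iov_base + len;   /* advance cursor */
    cur->iov_len -= len;
    return 0;
}

int main(void)
{
    char page[16];
    struct iovec cur, slice;

    cursor_reset(&cur, page, sizeof(page));
    for (int i = 0; cursor_take(&cur, "cmd", 4, &slice) == 0; i++) {
        printf("command %d at offset %td\n", i,
               (char *)slice.iov_base - page);
    }
    return 0;
}
```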

Signed-off-by: Hawkins Jiawei 
---
v4:
  - use `struct iovec` instead of `void **` as cursor
suggested by Eugenio
  - add vhost_vdpa_net_load_cursor_reset() helper function
to reset the cursors
  - refactor vhost_vdpa_net_load_cmd() to prepare buffers
by iov_copy() instead of accessing `in` and `out` directly
suggested by Eugenio

v3: 
https://lore.kernel.org/all/bf390934673f2b613359ea9d7ac6c89199c31384.1689748694.git.yin31...@gmail.com/

 net/vhost-vdpa.c | 114 ---
 1 file changed, 77 insertions(+), 37 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index d9b8b3cf6c..a71e8c9090 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -633,7 +633,22 @@ static uint16_t vhost_vdpa_net_svq_available_slots(VhostVDPAState *s)
 return vhost_svq_available_slots(svq);
 }
 
-static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, uint8_t class,
+static void vhost_vdpa_net_load_cursor_reset(VhostVDPAState *s,
+ struct iovec *out_cursor,
+ struct iovec *in_cursor)
+{
+/* reset the cursor of the output buffer for the device */
+out_cursor->iov_base = s->cvq_cmd_out_buffer;
+out_cursor->iov_len = vhost_vdpa_net_cvq_cmd_page_len();
+
+/* reset the cursor of the in buffer for the device */
+in_cursor->iov_base = s->status;
+in_cursor->iov_len = vhost_vdpa_net_cvq_cmd_page_len();
+}
+
+static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s,
+   struct iovec *out_cursor,
+   struct iovec *in_cursor, uint8_t class,
                                       uint8_t cmd, const struct iovec *data_sg,
size_t data_num)
 {
@@ -641,28 +656,25 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, uint8_t class,
 .class = class,
 .cmd = cmd,
 };
-size_t data_size = iov_size(data_sg, data_num);
-/* Buffers for the device */
-const struct iovec out = {
-.iov_base = s->cvq_cmd_out_buffer,
-.iov_len = sizeof(ctrl) + data_size,
-};
-const struct iovec in = {
-.iov_base = s->status,
-.iov_len = sizeof(*s->status),
-};
+size_t data_size = iov_size(data_sg, data_num),
+   cmd_size = sizeof(ctrl) + data_size;
+struct iovec out, in;
 ssize_t r;
 
 assert(data_size < vhost_vdpa_net_cvq_cmd_page_len() - sizeof(ctrl));
 /* Each CVQ command has one out descriptor and one in descriptor */
 assert(vhost_vdpa_net_svq_available_slots(s) >= 2);
 
-/* pack the CVQ command header */
-memcpy(s->cvq_cmd_out_buffer, , sizeof(ctrl));
+/* Prepare the buffer for out descriptor for the device */
+iov_copy(&out, 1, out_cursor, 1, 0, cmd_size);
+/* Prepare the buffer for in descriptor for the device. */
+iov_copy(&in, 1, in_cursor, 1, 0, sizeof(*s->status));
 
+/* pack the CVQ command header */
+iov_from_buf(&out, 1, 0, &ctrl, sizeof(ctrl));
 /* pack the CVQ command command-specific-data */
 iov_to_buf(data_sg, data_num, 0,
-   s->cvq_cmd_out_buffer + sizeof(ctrl), data_size);
+   out.iov_base + sizeof(ctrl), data_size);
 
 r = vhost_vdpa_net_cvq_add(s, &out, 1, &in, 1);
 if (unlikely(r < 0)) {
@@ -676,14 +688,17 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, uint8_t class,
 return vhost_vdpa_net_svq_poll(s, 1);
 }
 
-static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet *n)
+static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet *n,
+   struct iovec *out_cursor,
+   struct iovec *in_cursor)
 {
 if (virtio_vdev_has_feature(&n->parent_obj, VIRTIO_NET_F_CTRL_MAC_ADDR)) {
 const struct iovec data = {
 .iov_base = (void *)n->mac,
 .iov_len = sizeof(n->mac),
 };
-ssize_t dev_written = vhost_vdpa_net_load_cmd(s, VIRTIO_NET_CTRL_MAC,
+ssize_t dev_written = vhost_vdpa_net_load_cmd(s, out_cursor, in_cursor,
+  VIRTIO_NET_CTRL_MAC,
   

[PATCH v4 0/8] vdpa: Send all CVQ state load commands in parallel

2023-08-28 Thread Hawkins Jiawei
In vhost_vdpa_net_svq_flush(),
use iov_size(), vhost_vdpa_net_load_cursor_reset()
and iov_discard_front() to update the cursors instead of
accessing them directly, as suggested by Eugenio, in patch
"vdpa: Send cvq state load commands in parallel"

v3: https://lore.kernel.org/all/cover.1689748694.git.yin31...@gmail.com/
  - refactor vhost_svq_poll() to accept cmds_in_flight
suggested by Jason and Eugenio
  - refactor vhost_vdpa_net_cvq_add() to make control commands buffers
is not tied to `s->cvq_cmd_out_buffer` and `s->status`, so we can reuse
it suggested by Eugenio
  - poll and check when SVQ is full or control commands shadow buffers are
full

v2: https://lore.kernel.org/all/cover.1683371965.git.yin31...@gmail.com/
  - recover accidentally deleted rows
  - remove extra newline
  - refactor `need_poll_len` to `cmds_in_flight`
  - return -EINVAL when vhost_svq_poll() return 0 or check
on buffers written by device fails
  - change the type of `in_cursor`, and refactor the
code for updating cursor
  - return directly when vhost_vdpa_net_load_{mac,mq}()
returns a failure in vhost_vdpa_net_load()

v1: https://lore.kernel.org/all/cover.1681732982.git.yin31...@gmail.com/

Hawkins Jiawei (8):
  vhost: Add count argument to vhost_svq_poll()
  vdpa: Use iovec for vhost_vdpa_net_cvq_add()
  vhost: Expose vhost_svq_available_slots()
  vdpa: Avoid using vhost_vdpa_net_load_*() outside
vhost_vdpa_net_load()
  vdpa: Check device ack in vhost_vdpa_net_load_rx_mode()
  vdpa: Move vhost_svq_poll() to the caller of vhost_vdpa_net_cvq_add()
  vdpa: Introduce cursors to vhost_vdpa_net_loadx()
  vdpa: Send cvq state load commands in parallel

 hw/virtio/vhost-shadow-virtqueue.c |  38 +--
 hw/virtio/vhost-shadow-virtqueue.h |   3 +-
 net/vhost-vdpa.c   | 380 +++--
 3 files changed, 276 insertions(+), 145 deletions(-)

-- 
2.25.1




[PATCH v4 2/8] vdpa: Use iovec for vhost_vdpa_net_cvq_add()

2023-08-28 Thread Hawkins Jiawei
Next patches in this series will no longer perform an
immediate poll and check of the device's used buffers
for each CVQ state load command. Consequently, there
will be multiple pending buffers in the shadow VirtQueue,
making it a must for every control command to have its
own buffer.

To achieve this, this patch refactors vhost_vdpa_net_cvq_add()
to accept `struct iovec`, which eliminates the coupling of
control commands to `s->cvq_cmd_out_buffer` and `s->status`,
allowing them to use their own buffer.
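
The signature change is the heart of this patch, so here is a
compilable toy version of it: the transport takes caller-supplied
scatter/gather lists rather than lengths into fixed shared buffers.
The stub queue_add() and the command bytes are invented for
illustration.

```c
#include <stdio.h>
#include <sys/uio.h>

/* Stand-in transport: just measures what would be written out. */
static long queue_add(const struct iovec *out_sg, size_t out_num,
                      const struct iovec *in_sg, size_t in_num)
{
    size_t total = 0;

    for (size_t i = 0; i < out_num; i++) {
        total += out_sg[i].iov_len;
    }
    (void)in_sg;
    (void)in_num;
    return (long)total;
}

int main(void)
{
    char cmd[8] = "MAC?";              /* pretend control command */
    unsigned char ack;                 /* pretend one-byte device ack */
    const struct iovec out = { .iov_base = cmd,  .iov_len = sizeof(cmd) };
    const struct iovec in  = { .iov_base = &ack, .iov_len = sizeof(ack) };

    printf("queued %ld byte(s)\n", queue_add(&out, 1, &in, 1));
    return 0;
}
```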

Signed-off-by: Hawkins Jiawei 
---
v4:
  - split `in` to `vdpa_in` and `model_in` instead of reusing `in`
in vhost_vdpa_net_handle_ctrl_avail() suggested by Eugenio

v3: 
https://lore.kernel.org/all/b1d473772ec4bcb254ab0d12430c9b1efe758606.1689748694.git.yin31...@gmail.com/

 net/vhost-vdpa.c | 39 ++-
 1 file changed, 22 insertions(+), 17 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 3acda8591a..a875767ee9 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -596,22 +596,14 @@ static void vhost_vdpa_net_cvq_stop(NetClientState *nc)
 vhost_vdpa_net_client_stop(nc);
 }
 
-static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s, size_t out_len,
-  size_t in_len)
+static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s,
+const struct iovec *out_sg, size_t out_num,
+const struct iovec *in_sg, size_t in_num)
 {
-/* Buffers for the device */
-const struct iovec out = {
-.iov_base = s->cvq_cmd_out_buffer,
-.iov_len = out_len,
-};
-const struct iovec in = {
-.iov_base = s->status,
-.iov_len = sizeof(virtio_net_ctrl_ack),
-};
 VhostShadowVirtqueue *svq = g_ptr_array_index(s->vhost_vdpa.shadow_vqs, 0);
 int r;
 
-r = vhost_svq_add(svq, &out, 1, &in, 1, NULL);
+r = vhost_svq_add(svq, out_sg, out_num, in_sg, in_num, NULL);
 if (unlikely(r != 0)) {
 if (unlikely(r == -ENOSPC)) {
 qemu_log_mask(LOG_GUEST_ERROR, "%s: No space on device queue\n",
@@ -637,6 +629,15 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, uint8_t class,
 .cmd = cmd,
 };
 size_t data_size = iov_size(data_sg, data_num);
+/* Buffers for the device */
+const struct iovec out = {
+.iov_base = s->cvq_cmd_out_buffer,
+.iov_len = sizeof(ctrl) + data_size,
+};
+const struct iovec in = {
+.iov_base = s->status,
+.iov_len = sizeof(*s->status),
+};
 
 assert(data_size < vhost_vdpa_net_cvq_cmd_page_len() - sizeof(ctrl));
 
@@ -647,8 +648,7 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, uint8_t class,
 iov_to_buf(data_sg, data_num, 0,
s->cvq_cmd_out_buffer + sizeof(ctrl), data_size);
 
-return vhost_vdpa_net_cvq_add(s, data_size + sizeof(ctrl),
-  sizeof(virtio_net_ctrl_ack));
+return vhost_vdpa_net_cvq_add(s, &out, 1, &in, 1);
 }
 
 static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet *n)
@@ -1222,10 +1222,15 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
 .iov_base = s->cvq_cmd_out_buffer,
 };
 /* in buffer used for device model */
-const struct iovec in = {
+const struct iovec model_in = {
 .iov_base = &status,
 .iov_len = sizeof(status),
 };
+/* in buffer used for vdpa device */
+const struct iovec vdpa_in = {
+.iov_base = s->status,
+.iov_len = sizeof(*s->status),
+};
 ssize_t dev_written = -EINVAL;
 
 out.iov_len = iov_to_buf(elem->out_sg, elem->out_num, 0,
@@ -1259,7 +1264,7 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
 goto out;
 }
 } else {
-dev_written = vhost_vdpa_net_cvq_add(s, out.iov_len, sizeof(status));
+dev_written = vhost_vdpa_net_cvq_add(s, &out, 1, &vdpa_in, 1);
 if (unlikely(dev_written < 0)) {
 goto out;
 }
@@ -1275,7 +1280,7 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
 }
 
 status = VIRTIO_NET_ERR;
-virtio_net_handle_ctrl_iov(svq->vdev, &in, 1, &out, 1);
+virtio_net_handle_ctrl_iov(svq->vdev, &model_in, 1, &out, 1);
 if (status != VIRTIO_NET_OK) {
 error_report("Bad CVQ processing in model");
 }
-- 
2.25.1




Re: [PATCH v3 8/8] vdpa: Send cvq state load commands in parallel

2023-08-19 Thread Hawkins Jiawei
On 2023/8/19 01:27, Eugenio Perez Martin wrote:
> On Wed, Jul 19, 2023 at 9:54 AM Hawkins Jiawei  wrote:
>>
>> This patch enables sending CVQ state load commands
>> in parallel at device startup by following steps:
>>
>>* Refactor vhost_vdpa_net_load_cmd() to iterate through
>> the control commands shadow buffers. This allows different
>> CVQ state load commands to use their own unique buffers.
>>
>>* Delay the polling and checking of buffers until either
>> the SVQ is full or control commands shadow buffers are full.
>>
>> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1578
>> Signed-off-by: Hawkins Jiawei 
>> ---
>>   net/vhost-vdpa.c | 157 +--
>>   1 file changed, 96 insertions(+), 61 deletions(-)
>>
>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>> index 795c9c1fd2..1ebb58f7f6 100644
>> --- a/net/vhost-vdpa.c
>> +++ b/net/vhost-vdpa.c
>> @@ -633,6 +633,26 @@ static uint16_t 
>> vhost_vdpa_net_svq_available_slots(VhostVDPAState *s)
>>   return vhost_svq_available_slots(svq);
>>   }
>>
>> +/*
>> + * Poll SVQ for multiple pending control commands and check the device's 
>> ack.
>> + *
>> + * Caller should hold the BQL when invoking this function.
>> + */
>> +static ssize_t vhost_vdpa_net_svq_flush(VhostVDPAState *s,
>> +size_t cmds_in_flight)
>> +{
>> +vhost_vdpa_net_svq_poll(s, cmds_in_flight);
>> +
>> +/* Device should and must use only one byte ack each control command */
>> +assert(cmds_in_flight < vhost_vdpa_net_cvq_cmd_page_len());
>> +for (int i = 0; i < cmds_in_flight; ++i) {
>> +if (s->status[i] != VIRTIO_NET_OK) {
>> +return -EIO;
>> +}
>> +}
>> +return 0;
>> +}
>> +
>>   static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, void 
>> **out_cursor,
>>  void **in_cursor, uint8_t class,
>>  uint8_t cmd, const struct iovec 
>> *data_sg,
>> @@ -642,19 +662,41 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState 
>> *s, void **out_cursor,
>>   .class = class,
>>   .cmd = cmd,
>>   };
>> -size_t data_size = iov_size(data_sg, data_num);
>> +size_t data_size = iov_size(data_sg, data_num),
>> +   left_bytes = vhost_vdpa_net_cvq_cmd_page_len() -
>> +(*out_cursor - s->cvq_cmd_out_buffer);
>>   /* Buffers for the device */
>>   struct iovec out = {
>> -.iov_base = *out_cursor,
>>   .iov_len = sizeof(ctrl) + data_size,
>>   };
>>   struct iovec in = {
>> -.iov_base = *in_cursor,
>>   .iov_len = sizeof(*s->status),
>>   };
>>   ssize_t r;
>>
>> -assert(data_size < vhost_vdpa_net_cvq_cmd_page_len() - sizeof(ctrl));
>> +if (sizeof(ctrl) > left_bytes || data_size > left_bytes - sizeof(ctrl) 
>> ||
>
> I'm ok with this code, but maybe we can simplify the code if we use
> two struct iovec as cursors instead of a void **? I think functions
> like iov_size and iov_copy already take care of a few checks here.

Hi Eugenio,

Thanks for the explanation, I will refactor the patch according to your
suggestion!

>
> Apart from that it would be great to merge this call to
> vhost_vdpa_net_svq_flush, but I find it very hard to do unless we
> scatter it through all callers of vhost_vdpa_net_load_cmd.

Yes, I agree with you. Maybe we can consider refactoring like this in
the future if needed.

>
> Apart from the minor comments I think the series is great, thanks!

Thanks for your review:)!


>
>> +vhost_vdpa_net_svq_available_slots(s) < 2) {
>> +/*
>> + * It is time to flush all pending control commands if SVQ is full
>> + * or control commands shadow buffers are full.
>> + *
>> + * We can poll here since we've had BQL from the time
>> + * we sent the descriptor.
>> + */
>> +r = vhost_vdpa_net_svq_flush(s, *in_cursor - (void *)s->status);
>> +if (unlikely(r < 0)) {
>> +return r;
>> +}
>> +
>> +*out_cursor = s->cvq_cmd_out_buffer;
>> +*in_cursor = s->status;
>> +left_bytes = vhost_vdpa_net_cvq_cmd_page_len();
>> +}
>> +
>> +out.iov_base = *out_cursor;
>> +in.iov_base = *in_cursor;
>> +
>

Re: [PATCH v3 6/8] vdpa: Move vhost_svq_poll() to the caller of vhost_vdpa_net_cvq_add()

2023-08-19 Thread Hawkins Jiawei
On 2023/8/18 23:48, Eugenio Perez Martin wrote:
> On Wed, Jul 19, 2023 at 9:54 AM Hawkins Jiawei  wrote:
>>
>> This patch moves vhost_svq_poll() to the caller of
>> vhost_vdpa_net_cvq_add() and introduces a helper funtion.
>>
>> By making this change, next patches in this series is
>> able to refactor vhost_vdpa_net_load_x() only to delay
>> the polling and checking process until either the SVQ
>> is full or control commands shadow buffers are full.
>>
>> Signed-off-by: Hawkins Jiawei 
>> ---
>>   net/vhost-vdpa.c | 50 ++--
>>   1 file changed, 40 insertions(+), 10 deletions(-)
>>
>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>> index fe0ba19724..d06f38403f 100644
>> --- a/net/vhost-vdpa.c
>> +++ b/net/vhost-vdpa.c
>> @@ -609,15 +609,21 @@ static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState 
>> *s,
>>   qemu_log_mask(LOG_GUEST_ERROR, "%s: No space on device 
>> queue\n",
>> __func__);
>>   }
>> -return r;
>>   }
>>
>> -/*
>> - * We can poll here since we've had BQL from the time we sent the
>> - * descriptor. Also, we need to take the answer before SVQ pulls by 
>> itself,
>> - * when BQL is released
>> - */
>> -return vhost_svq_poll(svq, 1);
>> +return r;
>> +}
>> +
>> +/*
>> + * Convenience wrapper to poll SVQ for multiple control commands.
>> + *
>> + * Caller should hold the BQL when invoking this function, and should take
>> + * the answer before SVQ pulls by itself when BQL is released.
>> + */
>> +static ssize_t vhost_vdpa_net_svq_poll(VhostVDPAState *s, size_t 
>> cmds_in_flight)
>> +{
>> +VhostShadowVirtqueue *svq = g_ptr_array_index(s->vhost_vdpa.shadow_vqs, 
>> 0);
>> +return vhost_svq_poll(svq, cmds_in_flight);
>>   }
>>
>>   /* Convenience wrapper to get number of available SVQ descriptors */
>> @@ -645,6 +651,7 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState 
>> *s, uint8_t class,
>>   .iov_base = s->status,
>>   .iov_len = sizeof(*s->status),
>>   };
>> +ssize_t r;
>>
>>   assert(data_size < vhost_vdpa_net_cvq_cmd_page_len() - sizeof(ctrl));
>>   /* Each CVQ command has one out descriptor and one in descriptor */
>> @@ -657,7 +664,16 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState 
>> *s, uint8_t class,
>>   iov_to_buf(data_sg, data_num, 0,
>>  s->cvq_cmd_out_buffer + sizeof(ctrl), data_size);
>>
>> -return vhost_vdpa_net_cvq_add(s, &out, 1, &in, 1);
>> +r = vhost_vdpa_net_cvq_add(s, &out, 1, &in, 1);
>> +if (unlikely(r < 0)) {
>> +return r;
>> +}
>> +
>> +/*
>> + * We can poll here since we've had BQL from the time
>> + * we sent the descriptor.
>> + */
>> +return vhost_vdpa_net_svq_poll(s, 1);
>>   }
>>
>>   static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet *n)
>> @@ -1152,6 +1168,12 @@ static int 
>> vhost_vdpa_net_excessive_mac_filter_cvq_add(VhostVDPAState *s,
>>   if (unlikely(r < 0)) {
>>   return r;
>>   }
>> +
>> +/*
>> + * We can poll here since we've had BQL from the time
>> + * we sent the descriptor.
>> + */
>> +vhost_vdpa_net_svq_poll(s, 1);
>
> Don't we need to check the return value of vhost_vdpa_net_svq_poll here?

Hi Eugenio,

Yes, we should always check the return value of
vhost_vdpa_net_svq_poll(). I will fix this problem
in the v4 patch.

Thanks!


>
>>   if (*s->status != VIRTIO_NET_OK) {
>>   return sizeof(*s->status);
>>   }
>> @@ -1266,10 +1288,18 @@ static int 
>> vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
>>   goto out;
>>   }
>>   } else {
>> -dev_written = vhost_vdpa_net_cvq_add(s, &out, 1, &in, 1);
>> -if (unlikely(dev_written < 0)) {
>> +ssize_t r;
>> +r = vhost_vdpa_net_cvq_add(s, &out, 1, &in, 1);
>> +if (unlikely(r < 0)) {
>> +dev_written = r;
>>   goto out;
>>   }
>> +
>> +/*
>> + * We can poll here since we've had BQL from the time
>> + * we sent the descriptor.
>> + */
>> +dev_written = vhost_vdpa_net_svq_poll(s, 1);
>>   }
>>
>>   if (unlikely(dev_written < sizeof(status))) {
>> --
>> 2.25.1
>>
>



Re: [PATCH v3 4/8] vdpa: Avoid using vhost_vdpa_net_load_*() outside vhost_vdpa_net_load()

2023-08-19 Thread Hawkins Jiawei
On 2023/8/18 23:39, Eugenio Perez Martin wrote:
> On Wed, Jul 19, 2023 at 9:54 AM Hawkins Jiawei  wrote:
>>
>> Next patches in this series will refactor vhost_vdpa_net_load_cmd()
>> to iterate through the control commands shadow buffers, allowing QEMU
>> to send CVQ state load commands in parallel at device startup.
>>
>> Considering that QEMU always forwards the CVQ command serialized
>> outside of vhost_vdpa_net_load(), it is more elegant to send the
>> CVQ commands directly without invoking vhost_vdpa_net_load_*() helpers.
>>
>> Signed-off-by: Hawkins Jiawei 
>> ---
>>   net/vhost-vdpa.c | 17 ++---
>>   1 file changed, 14 insertions(+), 3 deletions(-)
>>
>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>> index dd71008e08..ae8f59adaa 100644
>> --- a/net/vhost-vdpa.c
>> +++ b/net/vhost-vdpa.c
>> @@ -1098,12 +1098,14 @@ static NetClientInfo net_vhost_vdpa_cvq_info = {
>>*/
>>   static int vhost_vdpa_net_excessive_mac_filter_cvq_add(VhostVDPAState *s,
>>  VirtQueueElement 
>> *elem,
>> -   struct iovec *out)
>> +   struct iovec *out,
>> +   struct iovec *in)
>>   {
>>   struct virtio_net_ctrl_mac mac_data, *mac_ptr;
>>   struct virtio_net_ctrl_hdr *hdr_ptr;
>>   uint32_t cursor;
>>   ssize_t r;
>> +uint8_t on = 1;
>>
>>   /* parse the non-multicast MAC address entries from CVQ command */
>>   cursor = sizeof(*hdr_ptr);
>> @@ -1151,7 +1153,16 @@ static int 
>> vhost_vdpa_net_excessive_mac_filter_cvq_add(VhostVDPAState *s,
>>* filter table to the vdpa device, it should send the
>>* VIRTIO_NET_CTRL_RX_PROMISC CVQ command to enable promiscuous mode
>>*/
>> -r = vhost_vdpa_net_load_rx_mode(s, VIRTIO_NET_CTRL_RX_PROMISC, 1);
>> +cursor = 0;
>> +hdr_ptr = out->iov_base;
>> +out->iov_len = sizeof(*hdr_ptr) + sizeof(on);
>> +assert(out->iov_len < vhost_vdpa_net_cvq_cmd_page_len());
>> +
>> +hdr_ptr->class = VIRTIO_NET_CTRL_RX;
>> +hdr_ptr->cmd = VIRTIO_NET_CTRL_RX_PROMISC;
>> +cursor += sizeof(*hdr_ptr);
>> +*(uint8_t *)(out->iov_base + cursor) = on;
>> +r = vhost_vdpa_net_cvq_add(s, out, 1, in, 1);
>
> Can this be done with iov_from_buf?

Hi Eugenio,

Yes, this should be done by iov_from_buf(), I will refactor the code
according to your suggestion.

Thanks!


>
>>   if (unlikely(r < 0)) {
>>   return r;
>>   }
>> @@ -1264,7 +1275,7 @@ static int 
>> vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
>>* the CVQ command direclty.
>>*/
>>   dev_written = vhost_vdpa_net_excessive_mac_filter_cvq_add(s, elem,
>> -  &out);
>> +  &out, &in);
>>   if (unlikely(dev_written < 0)) {
>>   goto out;
>>   }
>> --
>> 2.25.1
>>
>



Re: [PATCH v3 2/8] vdpa: Use iovec for vhost_vdpa_net_cvq_add()

2023-08-19 Thread Hawkins Jiawei
On 2023/8/18 23:23, Eugenio Perez Martin wrote:
> On Wed, Jul 19, 2023 at 9:54 AM Hawkins Jiawei  wrote:
>>
>> Next patches in this series will no longer perform an
>> immediate poll and check of the device's used buffers
>> for each CVQ state load command. Consequently, there
>> will be multiple pending buffers in the shadow VirtQueue,
>> making it a must for every control command to have its
>> own buffer.
>>
>> To achieve this, this patch refactor vhost_vdpa_net_cvq_add()
>> to accept `struct iovec`, which eliminates the coupling of
>> control commands to `s->cvq_cmd_out_buffer` and `s->status`,
>> allowing them to use their own buffer.
>>
>> Signed-off-by: Hawkins Jiawei 
>> ---
>>   net/vhost-vdpa.c | 38 --
>>   1 file changed, 20 insertions(+), 18 deletions(-)
>>
>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>> index d1dd140bf6..6b16c8ece0 100644
>> --- a/net/vhost-vdpa.c
>> +++ b/net/vhost-vdpa.c
>> @@ -596,22 +596,14 @@ static void vhost_vdpa_net_cvq_stop(NetClientState *nc)
>>   vhost_vdpa_net_client_stop(nc);
>>   }
>>
>> -static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s, size_t out_len,
>> -  size_t in_len)
>> +static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s,
>> +  struct iovec *out_sg, size_t out_num,
>> +  struct iovec *in_sg, size_t in_num)
>>   {
>> -/* Buffers for the device */
>> -const struct iovec out = {
>> -.iov_base = s->cvq_cmd_out_buffer,
>> -.iov_len = out_len,
>> -};
>> -const struct iovec in = {
>> -.iov_base = s->status,
>> -.iov_len = sizeof(virtio_net_ctrl_ack),
>> -};
>>   VhostShadowVirtqueue *svq = 
>> g_ptr_array_index(s->vhost_vdpa.shadow_vqs, 0);
>>   int r;
>>
>> -r = vhost_svq_add(svq, &out, 1, &in, 1, NULL);
>> +r = vhost_svq_add(svq, out_sg, out_num, in_sg, in_num, NULL);
>>   if (unlikely(r != 0)) {
>>   if (unlikely(r == -ENOSPC)) {
>>   qemu_log_mask(LOG_GUEST_ERROR, "%s: No space on device 
>> queue\n",
>> @@ -637,6 +629,15 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState 
>> *s, uint8_t class,
>>   .cmd = cmd,
>>   };
>>   size_t data_size = iov_size(data_sg, data_num);
>> +/* Buffers for the device */
>> +struct iovec out = {
>> +.iov_base = s->cvq_cmd_out_buffer,
>> +.iov_len = sizeof(ctrl) + data_size,
>> +};
>> +struct iovec in = {
>> +.iov_base = s->status,
>> +.iov_len = sizeof(*s->status),
>> +};
>>
>>   assert(data_size < vhost_vdpa_net_cvq_cmd_page_len() - sizeof(ctrl));
>>
>> @@ -647,8 +648,7 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState 
>> *s, uint8_t class,
>>   iov_to_buf(data_sg, data_num, 0,
>>  s->cvq_cmd_out_buffer + sizeof(ctrl), data_size);
>>
>> -return vhost_vdpa_net_cvq_add(s, data_size + sizeof(ctrl),
>> -  sizeof(virtio_net_ctrl_ack));
>> -return vhost_vdpa_net_cvq_add(s, &out, 1, &in, 1);
>>   }
>>
>>   static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet *n)
>> @@ -1222,9 +1222,7 @@ static int 
>> vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
>>   struct iovec out = {
>>   .iov_base = s->cvq_cmd_out_buffer,
>>   };
>> -/* in buffer used for device model */
>> -const struct iovec in = {
>> -.iov_base = &status,
>> +struct iovec in = {
>
> What if instead of reusing "in" we declare a new struct iovec in the
> condition that calls vhost_vdpa_net_cvq_add? Something in the line of
> "device_in".
>
> I'm also ok with this code, but splitting them would reduce the
> possibility of sending the wrong one to the device / virtio device
> model by mistake.

Hi Eugenio,

Ok, I will refactor this part of code according to your suggestion in
the v4 patch.

Thanks!


>
> Thanks!
>
>>   .iov_len = sizeof(status),
>>   };
>>   ssize_t dev_written = -EINVAL;
>> @@ -1232,6 +1230,8 @@ static int 
>> vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
>>   out.iov_len = iov_to_buf(elem->out_sg, elem->out_num, 0,
>>s->cvq_cmd_out_buffer,
>>v

Re: [PATCH v3 1/8] vhost: Add argument to vhost_svq_poll()

2023-08-19 Thread Hawkins Jiawei
On 2023/8/18 23:08, Eugenio Perez Martin wrote:
> On Wed, Jul 19, 2023 at 9:54 AM Hawkins Jiawei  wrote:
>>
>
> The subject could be more explicit. What about "add count argument to
> vhost_svq_poll"?

Hi Eugenio,

Thanks for reviewing.
You are right, I will use this new subject in the v4 patch.

Thanks!


>
> Apart from that:
> Acked-by: Eugenio Pérez 
>
>> Next patches in this series will no longer perform an
>> immediate poll and check of the device's used buffers
>> for each CVQ state load command. Instead, they will
>> send CVQ state load commands in parallel by polling
>> multiple pending buffers at once.
>>
>> To achieve this, this patch refactoring vhost_svq_poll()
>> to accept a new argument `num`, which allows vhost_svq_poll()
>> to wait for the device to use multiple elements,
>> rather than polling for a single element.
>>
>> Signed-off-by: Hawkins Jiawei 
>> ---
>>   hw/virtio/vhost-shadow-virtqueue.c | 36 ++
>>   hw/virtio/vhost-shadow-virtqueue.h |  2 +-
>>   net/vhost-vdpa.c   |  2 +-
>>   3 files changed, 24 insertions(+), 16 deletions(-)
>>
>> diff --git a/hw/virtio/vhost-shadow-virtqueue.c 
>> b/hw/virtio/vhost-shadow-virtqueue.c
>> index 49e5aed931..e731b1d2ea 100644
>> --- a/hw/virtio/vhost-shadow-virtqueue.c
>> +++ b/hw/virtio/vhost-shadow-virtqueue.c
>> @@ -514,29 +514,37 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
>>   }
>>
>>   /**
>> - * Poll the SVQ for one device used buffer.
>> + * Poll the SVQ to wait for the device to use the specified number
>> + * of elements and return the total length written by the device.
>>*
>>* This function race with main event loop SVQ polling, so extra
>>* synchronization is needed.
>>*
>> - * Return the length written by the device.
>> + * @svq: The svq
>> + * @num: The number of elements that need to be used
>>*/
>> -size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
>> +size_t vhost_svq_poll(VhostShadowVirtqueue *svq, size_t num)
>>   {
>> -int64_t start_us = g_get_monotonic_time();
>> -uint32_t len = 0;
>> +size_t len = 0;
>> +uint32_t r;
>>
>> -do {
>> -if (vhost_svq_more_used(svq)) {
>> -break;
>> -}
>> +while (num--) {
>> +int64_t start_us = g_get_monotonic_time();
>>
>> -if (unlikely(g_get_monotonic_time() - start_us > 10e6)) {
>> -return 0;
>> -}
>> -} while (true);
>> +do {
>> +if (vhost_svq_more_used(svq)) {
>> +break;
>> +}
>> +
>> +if (unlikely(g_get_monotonic_time() - start_us > 10e6)) {
>> +return len;
>> +}
>> +} while (true);
>> +
>> +vhost_svq_get_buf(svq, &r);
>> +len += r;
>> +}
>>
>> -vhost_svq_get_buf(svq, &len);
>>   return len;
>>   }
>>
>> diff --git a/hw/virtio/vhost-shadow-virtqueue.h 
>> b/hw/virtio/vhost-shadow-virtqueue.h
>> index 6efe051a70..5bce67837b 100644
>> --- a/hw/virtio/vhost-shadow-virtqueue.h
>> +++ b/hw/virtio/vhost-shadow-virtqueue.h
>> @@ -119,7 +119,7 @@ void vhost_svq_push_elem(VhostShadowVirtqueue *svq,
>>   int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
>> size_t out_num, const struct iovec *in_sg, size_t in_num,
>> VirtQueueElement *elem);
>> -size_t vhost_svq_poll(VhostShadowVirtqueue *svq);
>> +size_t vhost_svq_poll(VhostShadowVirtqueue *svq, size_t num);
>>
>>   void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd);
>>   void vhost_svq_set_svq_call_fd(VhostShadowVirtqueue *svq, int call_fd);
>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>> index dfd271c456..d1dd140bf6 100644
>> --- a/net/vhost-vdpa.c
>> +++ b/net/vhost-vdpa.c
>> @@ -625,7 +625,7 @@ static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s, 
>> size_t out_len,
>>* descriptor. Also, we need to take the answer before SVQ pulls by 
>> itself,
>>* when BQL is released
>>*/
>> -return vhost_svq_poll(svq);
>> +return vhost_svq_poll(svq, 1);
>>   }
>>
>>   static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, uint8_t class,
>> --
>> 2.25.1
>>
>



Re: [PATCH v3 1/7] vdpa: Use iovec for vhost_vdpa_net_load_cmd()

2023-08-17 Thread Hawkins Jiawei
On 2023/8/17 22:05, Eugenio Perez Martin wrote:
> On Thu, Aug 17, 2023 at 2:42 PM Hawkins Jiawei  wrote:
>>
>> On 2023/8/17 17:23, Eugenio Perez Martin wrote:
>>> On Fri, Jul 7, 2023 at 5:27 PM Hawkins Jiawei  wrote:
>>>>
>>>> According to VirtIO standard, "The driver MUST follow
>>>> the VIRTIO_NET_CTRL_MAC_TABLE_SET command by a le32 number,
>>>> followed by that number of non-multicast MAC addresses,
>>>> followed by another le32 number, followed by that number
>>>> of multicast addresses."
>>>>
>>>> Considering that these data is not stored in contiguous memory,
>>>> this patch refactors vhost_vdpa_net_load_cmd() to accept
>>>> scattered data, eliminating the need for an addtional data copy or
>>>> packing the data into s->cvq_cmd_out_buffer outside of
>>>> vhost_vdpa_net_load_cmd().
>>>>
>>>> Signed-off-by: Hawkins Jiawei 
>>>> ---
>>>> v3:
>>>> - rename argument name to `data_sg` and `data_num`
>>>> - use iov_to_buf() suggested by Eugenio
>>>>
>>>> v2: 
>>>> https://lore.kernel.org/all/6d3dc0fc076564a03501e222ef1102a6a7a643af.1688051252.git.yin31...@gmail.com/
>>>> - refactor vhost_vdpa_load_cmd() to accept iovec suggested by
>>>> Eugenio
>>>>
>>>>net/vhost-vdpa.c | 33 +
>>>>1 file changed, 25 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>>>> index 373609216f..31ef6ad6ec 100644
>>>> --- a/net/vhost-vdpa.c
>>>> +++ b/net/vhost-vdpa.c
>>>> @@ -620,29 +620,38 @@ static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState 
>>>> *s, size_t out_len,
>>>>}
>>>>
>>>>static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, uint8_t class,
>>>> -   uint8_t cmd, const void *data,
>>>> -   size_t data_size)
>>>> +   uint8_t cmd, const struct iovec 
>>>> *data_sg,
>>>> +   size_t data_num)
>>>>{
>>>>const struct virtio_net_ctrl_hdr ctrl = {
>>>>.class = class,
>>>>.cmd = cmd,
>>>>};
>>>> +size_t data_size = iov_size(data_sg, data_num);
>>>>
>>>>assert(data_size < vhost_vdpa_net_cvq_cmd_page_len() - 
>>>> sizeof(ctrl));
>>>>
>>>> +/* pack the CVQ command header */
>>>>memcpy(s->cvq_cmd_out_buffer, &ctrl, sizeof(ctrl));
>>>> -memcpy(s->cvq_cmd_out_buffer + sizeof(ctrl), data, data_size);
>>>>
>>>> -return vhost_vdpa_net_cvq_add(s, sizeof(ctrl) + data_size,
>>>> +/* pack the CVQ command command-specific-data */
>>>> +iov_to_buf(data_sg, data_num, 0,
>>>> +   s->cvq_cmd_out_buffer + sizeof(ctrl), data_size);
>>>> +
>>>> +return vhost_vdpa_net_cvq_add(s, data_size + sizeof(ctrl),
>>>
>>> Nit, any reason for changing the order of the addends? sizeof(ctrl) +
>>> data_size ?
>>
>> Hi Eugenio,
>>
>> Here the code should be changed to `sizeof(ctrl) + data_size` as you
>> point out.
>>
>> Since this patch series has already been merged into master, I will
>> submit a separate patch to correct this problem.
>>
>
> Ouch, I didn't realize that. No need to make it back again, I was just
> trying to reduce lines changed.

Ok, I got it. Regardless, thank you for your review!


>
>>>
>>>>  sizeof(virtio_net_ctrl_ack));
>>>>}
>>>>
>>>>static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet 
>>>> *n)
>>>>{
>>>>if (virtio_vdev_has_feature(>parent_obj, 
>>>> VIRTIO_NET_F_CTRL_MAC_ADDR)) {
>>>> +const struct iovec data = {
>>>> +.iov_base = (void *)n->mac,
>>>
>>> Assign to void should always be valid, no need for casting here.
>>
>> Yes, assign to void should be valid for normal pointers.
>>
>> However, `n->mac` is an array and is treated as a const pointer. It will
>> trigger the warning "error: initialization discards ‘const’ qualifier
>> from pointer 

Re: [PATCH v3 2/7] vdpa: Restore MAC address filtering state

2023-08-17 Thread Hawkins Jiawei
On 2023/8/17 18:18, Eugenio Perez Martin wrote:
> On Fri, Jul 7, 2023 at 5:27 PM Hawkins Jiawei  wrote:
>>
>> This patch refactors vhost_vdpa_net_load_mac() to
>> restore the MAC address filtering state at device's startup.
>>
>> Signed-off-by: Hawkins Jiawei 
>> ---
>> v3:
>>- return early if mismatch the condition suggested by Eugenio
>>
>> v2: 
>> https://lore.kernel.org/all/2f2560f749186c0eb1055f9926f464587e419eeb.1688051252.git.yin31...@gmail.com/
>>- use iovec suggested by Eugenio
>>- avoid sending CVQ command in default state
>>
>> v1: 
>> https://lore.kernel.org/all/00f72fe154a882fd6dc15bc39e3a1ac63f9dadce.1687402580.git.yin31...@gmail.com/
>>
>>   net/vhost-vdpa.c | 52 
>>   1 file changed, 52 insertions(+)
>>
>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>> index 31ef6ad6ec..7189ccafaf 100644
>> --- a/net/vhost-vdpa.c
>> +++ b/net/vhost-vdpa.c
>> @@ -660,6 +660,58 @@ static int vhost_vdpa_net_load_mac(VhostVDPAState *s, 
>> const VirtIONet *n)
>>   }
>>   }
>>
>> +/*
>> + * According to VirtIO standard, "The device MUST have an
>> + * empty MAC filtering table on reset.".
>> + *
>> + * Therefore, there is no need to send this CVQ command if the
>> + * driver also sets an empty MAC filter table, which aligns with
>> + * the device's defaults.
>> + *
>> + * Note that the device's defaults can mismatch the driver's
>> + * configuration only at live migration.
>> + */
>> +if (!virtio_vdev_has_feature(>parent_obj, VIRTIO_NET_F_CTRL_RX) ||
>> +n->mac_table.in_use == 0) {
>> +return 0;
>> +}
>> +
>> +uint32_t uni_entries = n->mac_table.first_multi,
>
> QEMU coding style prefers declarations at the beginning of the code
> block. Previous uses of these variable names would need to be
> refactored to met this rule.

Hi Eugenio,

Thanks for the detailed explanation.

Since this patch series has already been merged into master, I will
submit a separate patch to correct this problem.

I will take care of this problem in the future.

Thanks!


>
> Apart from that,
>
> Acked-by: Eugenio Pérez 
>
>> + uni_macs_size = uni_entries * ETH_ALEN,
>> + mul_entries = n->mac_table.in_use - uni_entries,
>> + mul_macs_size = mul_entries * ETH_ALEN;
>> +struct virtio_net_ctrl_mac uni = {
>> +.entries = cpu_to_le32(uni_entries),
>> +};
>> +struct virtio_net_ctrl_mac mul = {
>> +.entries = cpu_to_le32(mul_entries),
>> +};
>> +const struct iovec data[] = {
>> +{
>> +.iov_base = &uni,
>> +.iov_len = sizeof(uni),
>> +}, {
>> +.iov_base = n->mac_table.macs,
>> +.iov_len = uni_macs_size,
>> +}, {
>> +.iov_base = &mul,
>> +.iov_len = sizeof(mul),
>> +}, {
>> +.iov_base = &n->mac_table.macs[uni_macs_size],
>> +.iov_len = mul_macs_size,
>> +},
>> +};
>> +ssize_t dev_written = vhost_vdpa_net_load_cmd(s,
>> +VIRTIO_NET_CTRL_MAC,
>> +VIRTIO_NET_CTRL_MAC_TABLE_SET,
>> +data, ARRAY_SIZE(data));
>> +if (unlikely(dev_written < 0)) {
>> +return dev_written;
>> +}
>> +if (*s->status != VIRTIO_NET_OK) {
>> +return -EIO;
>> +}
>> +
>>   return 0;
>>   }
>>
>> --
>> 2.25.1
>>
>



Re: [PATCH v3 1/7] vdpa: Use iovec for vhost_vdpa_net_load_cmd()

2023-08-17 Thread Hawkins Jiawei
On 2023/8/17 17:23, Eugenio Perez Martin wrote:
> On Fri, Jul 7, 2023 at 5:27 PM Hawkins Jiawei  wrote:
>>
>> According to VirtIO standard, "The driver MUST follow
>> the VIRTIO_NET_CTRL_MAC_TABLE_SET command by a le32 number,
>> followed by that number of non-multicast MAC addresses,
>> followed by another le32 number, followed by that number
>> of multicast addresses."
>>
>> Considering that these data is not stored in contiguous memory,
>> this patch refactors vhost_vdpa_net_load_cmd() to accept
>> scattered data, eliminating the need for an addtional data copy or
>> packing the data into s->cvq_cmd_out_buffer outside of
>> vhost_vdpa_net_load_cmd().
>>
>> Signed-off-by: Hawkins Jiawei 
>> ---
>> v3:
>>- rename argument name to `data_sg` and `data_num`
>>- use iov_to_buf() suggested by Eugenio
>>
>> v2: 
>> https://lore.kernel.org/all/6d3dc0fc076564a03501e222ef1102a6a7a643af.1688051252.git.yin31...@gmail.com/
>>- refactor vhost_vdpa_load_cmd() to accept iovec suggested by
>> Eugenio
>>
>>   net/vhost-vdpa.c | 33 +
>>   1 file changed, 25 insertions(+), 8 deletions(-)
>>
>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>> index 373609216f..31ef6ad6ec 100644
>> --- a/net/vhost-vdpa.c
>> +++ b/net/vhost-vdpa.c
>> @@ -620,29 +620,38 @@ static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState 
>> *s, size_t out_len,
>>   }
>>
>>   static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, uint8_t class,
>> -   uint8_t cmd, const void *data,
>> -   size_t data_size)
>> +   uint8_t cmd, const struct iovec 
>> *data_sg,
>> +   size_t data_num)
>>   {
>>   const struct virtio_net_ctrl_hdr ctrl = {
>>   .class = class,
>>   .cmd = cmd,
>>   };
>> +size_t data_size = iov_size(data_sg, data_num);
>>
>>   assert(data_size < vhost_vdpa_net_cvq_cmd_page_len() - sizeof(ctrl));
>>
>> +/* pack the CVQ command header */
>>   memcpy(s->cvq_cmd_out_buffer, &ctrl, sizeof(ctrl));
>> -memcpy(s->cvq_cmd_out_buffer + sizeof(ctrl), data, data_size);
>>
>> -return vhost_vdpa_net_cvq_add(s, sizeof(ctrl) + data_size,
>> +/* pack the CVQ command command-specific-data */
>> +iov_to_buf(data_sg, data_num, 0,
>> +   s->cvq_cmd_out_buffer + sizeof(ctrl), data_size);
>> +
>> +return vhost_vdpa_net_cvq_add(s, data_size + sizeof(ctrl),
>
> Nit, any reason for changing the order of the addends? sizeof(ctrl) +
> data_size ?

Hi Eugenio,

Here the code should be changed to `sizeof(ctrl) + data_size` as you
point out.

Since this patch series has already been merged into master, I will
submit a separate patch to correct this problem.

>
>> sizeof(virtio_net_ctrl_ack));
>>   }
>>
>>   static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet *n)
>>   {
>>   if (virtio_vdev_has_feature(>parent_obj, 
>> VIRTIO_NET_F_CTRL_MAC_ADDR)) {
>> +const struct iovec data = {
>> +.iov_base = (void *)n->mac,
>
> Assign to void should always be valid, no need for casting here.

Yes, assign to void should be valid for normal pointers.

However, `n->mac` is an array and is treated as a const pointer. It will
trigger the warning "error: initialization discards ‘const’ qualifier
from pointer target type" if we don't add this cast.

Thanks!


>
>> +.iov_len = sizeof(n->mac),
>> +};
>>   ssize_t dev_written = vhost_vdpa_net_load_cmd(s, 
>> VIRTIO_NET_CTRL_MAC,
>> 
>> VIRTIO_NET_CTRL_MAC_ADDR_SET,
>> -  n->mac, sizeof(n->mac));
>> +  &data, 1);
>>   if (unlikely(dev_written < 0)) {
>>   return dev_written;
>>   }
>> @@ -665,9 +674,13 @@ static int vhost_vdpa_net_load_mq(VhostVDPAState *s,
>>   }
>>
>>   mq.virtqueue_pairs = cpu_to_le16(n->curr_queue_pairs);
>> +const struct iovec data = {
>> +.iov_base = &mq,
>> +.iov_len = sizeof(mq),
>> +};
>>   dev_written = vhost_vdpa_net_load_cmd(s, VIRTIO_NET_CTRL_MQ,
>> -  VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET, &mq,
>

[RFC PATCH v2 0/3] Vhost-vdpa Shadow Virtqueue RSS Support

2023-08-13 Thread Hawkins Jiawei
This series enables shadowed CVQ to intercept RSS command
through shadowed CVQ, update the virtio NIC device model
so qemu send it in a migration, and the restore of that
RSS state in the destination.

Note that this patch should be based on
patch "Vhost-vdpa Shadow Virtqueue Hash calculation Support" at [1].

[1]. https://lore.kernel.org/all/cover.1691762906.git.yin31...@gmail.com/

ChangeLog
=
v2:
  - Correct the feature usage to VIRTIO_NET_F_HASH_REPORT when
loading the hash calculation state in
patch "vdpa: Restore receive-side scaling state"

v1: https://lore.kernel.org/all/cover.1691766252.git.yin31...@gmail.com/

TestStep

1. test the migration using vp-vdpa device
  - For L0 guest, boot QEMU with two virtio-net-pci net device with
`in-qemu` RSS, command line like:
-netdev tap,vhost=off...
-device virtio-net-pci,disable-legacy=on,disable-modern=off,
iommu_platform=on,mq=on,ctrl_vq=on,hash=on,rss=on,guest_announce=off,
indirect_desc=off,queue_reset=off,...

  - For L1 guest, apply the relative patch series and compile the
source code, start QEMU with two vdpa device with svq mode on,
enable the `ctrl_vq`, `mq`, `rss` features on, command line like:
  -netdev type=vhost-vdpa,x-svq=true,...
  -device virtio-net-pci,mq=on,guest_announce=off,ctrl_vq=on,
rss=on,...

  - For L2 source guest, run the following bash command:
```bash
#!/bin/sh

ethtool -K eth0 rxhash on
```

  - Execute the live migration in L2 source monitor

  - Result
* with this series, L2 QEMU can execute without
triggering any error or warning. L0 QEMU echo
"Can't load eBPF RSS - fallback to software RSS".

Hawkins Jiawei (3):
  vdpa: Add SetSteeringEBPF method for NetClientState
  vdpa: Restore receive-side scaling state
  vdpa: Allow VIRTIO_NET_F_RSS in SVQ

 net/vhost-vdpa.c | 63 ++--
 1 file changed, 45 insertions(+), 18 deletions(-)

-- 
2.25.1




[RFC PATCH v2 3/3] vdpa: Allow VIRTIO_NET_F_RSS in SVQ

2023-08-13 Thread Hawkins Jiawei
Enable SVQ with VIRTIO_NET_F_RSS feature.

Signed-off-by: Hawkins Jiawei 
---
 net/vhost-vdpa.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index e21b3ac67a..2a276ef528 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -119,6 +119,7 @@ static const uint64_t vdpa_svq_device_features =
 /* VHOST_F_LOG_ALL is exposed by SVQ */
 BIT_ULL(VHOST_F_LOG_ALL) |
 BIT_ULL(VIRTIO_NET_F_HASH_REPORT) |
+BIT_ULL(VIRTIO_NET_F_RSS) |
 BIT_ULL(VIRTIO_NET_F_RSC_EXT) |
 BIT_ULL(VIRTIO_NET_F_STANDBY) |
 BIT_ULL(VIRTIO_NET_F_SPEED_DUPLEX);
-- 
2.25.1




[RFC PATCH v2 1/3] vdpa: Add SetSteeringEBPF method for NetClientState

2023-08-13 Thread Hawkins Jiawei
At present, to enable the VIRTIO_NET_F_RSS feature, eBPF must
be loaded for the vhost backend.

Given that vhost-vdpa is one of the vhost backend, we need to
implement the SetSteeringEBPF method to support RSS for vhost-vdpa,
even if vhost-vdpa calculates the rss hash in the hardware device
instead of in the kernel by eBPF.

Although this requires QEMU to be compiled with `--enable-bpf`
configuration even if the vdpa device does not use eBPF to
calculate the rss hash, this can avoid adding the specific
conditional statements for vDPA case to enable the VIRTIO_NET_F_RSS
feature, which reduces code maintainbility.

Suggested-by: Eugenio Pérez 
Signed-off-by: Hawkins Jiawei 
---
 net/vhost-vdpa.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index a13b267250..4c8e4b19f6 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -238,6 +238,12 @@ static void vhost_vdpa_cleanup(NetClientState *nc)
 }
 }
 
+/** Dummy SetSteeringEBPF to support RSS for vhost-vdpa backend  */
+static bool vhost_vdpa_set_steering_ebpf(NetClientState *nc, int prog_fd)
+{
+return true;
+}
+
 static bool vhost_vdpa_has_vnet_hdr(NetClientState *nc)
 {
 assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
@@ -400,6 +406,7 @@ static NetClientInfo net_vhost_vdpa_info = {
 .has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
 .has_ufo = vhost_vdpa_has_ufo,
 .check_peer_type = vhost_vdpa_check_peer_type,
+.set_steering_ebpf = vhost_vdpa_set_steering_ebpf,
 };
 
 static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index,
@@ -1215,6 +1222,7 @@ static NetClientInfo net_vhost_vdpa_cvq_info = {
 .has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
 .has_ufo = vhost_vdpa_has_ufo,
 .check_peer_type = vhost_vdpa_check_peer_type,
+.set_steering_ebpf = vhost_vdpa_set_steering_ebpf,
 };
 
 /*
-- 
2.25.1




[RFC PATCH v2 2/3] vdpa: Restore receive-side scaling state

2023-08-13 Thread Hawkins Jiawei
This patch reuses vhost_vdpa_net_load_rss() with some
refactorings to restore the receive-side scaling state
at device's startup.
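
For orientation, the RSS-specific fields that the do_rss branch below
fills in can be sketched with a trimmed stand-in structure. The values
are invented, and real code would additionally convert each field to
little-endian (cpu_to_le16()/cpu_to_le32()) as the patch does.

```c
#include <stdint.h>
#include <stdio.h>

/* Trimmed stand-in for virtio_net_rss_config; not the full layout. */
struct rss_cfg {
    uint32_t hash_types;
    uint16_t indirection_table_mask;
    uint16_t unclassified_queue;
    uint16_t max_tx_vq;
};

int main(void)
{
    unsigned indirections_len = 128;   /* table length, a power of two */
    struct rss_cfg cfg = {
        .hash_types = 0x3f,            /* pretend hash type bitmap */
        /* spec: number of table entries == indirection_table_mask + 1 */
        .indirection_table_mask = (uint16_t)(indirections_len - 1),
        .unclassified_queue = 0,
        .max_tx_vq = 4,
    };

    printf("mask=0x%x entries=%u\n", cfg.indirection_table_mask,
           cfg.indirection_table_mask + 1);
    return 0;
}
```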

Signed-off-by: Hawkins Jiawei 
---
v2:
  - Correct the feature usage to VIRTIO_NET_F_HASH_REPORT when
loading the hash calculation state

v1: 
https://lore.kernel.org/all/93d5d82f0a5df71df326830033e50358c8b6be7a.1691766252.git.yin31...@gmail.com/

 net/vhost-vdpa.c | 54 
 1 file changed, 36 insertions(+), 18 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 4c8e4b19f6..e21b3ac67a 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -820,17 +820,28 @@ static int vhost_vdpa_net_load_rss(VhostVDPAState *s, const VirtIONet *n,
 }
 
 cfg.hash_types = cpu_to_le32(n->rss_data.hash_types);
-/*
- * According to VirtIO standard, "Field reserved MUST contain zeroes.
- * It is defined to make the structure to match the layout of
- * virtio_net_rss_config structure, defined in 5.1.6.5.7.".
- *
- * Therefore, we need to zero the fields in struct virtio_net_rss_config,
- * which corresponds the `reserved` field in
- * struct virtio_net_hash_config.
- */
-memset(&cfg.indirection_table_mask, 0,
-   sizeof_field(struct virtio_net_hash_config, reserved));
+if (do_rss) {
+/*
+ * According to VirtIO standard, "Number of entries in 
indirection_table
+ * is (indirection_table_mask + 1)".
+ */
+cfg.indirection_table_mask = cpu_to_le16(n->rss_data.indirections_len -
+ 1);
+cfg.unclassified_queue = cpu_to_le16(n->rss_data.default_queue);
+cfg.max_tx_vq = cpu_to_le16(n->curr_queue_pairs);
+} else {
+/*
+ * According to VirtIO standard, "Field reserved MUST contain zeroes.
+ * It is defined to make the structure to match the layout of
+ * virtio_net_rss_config structure, defined in 5.1.6.5.7.".
+ *
+ * Therefore, we need to zero the fields in
+ * struct virtio_net_rss_config, which corresponds to the `reserved` field
+ * in struct virtio_net_hash_config.
+ */
+memset(&cfg.indirection_table_mask, 0,
+   sizeof_field(struct virtio_net_hash_config, reserved));
+}
 /*
  * Consider that virtio_net_handle_rss() currently does not restore the
  * hash key length parsed from the CVQ command sent from the guest into
@@ -866,6 +877,7 @@ static int vhost_vdpa_net_load_rss(VhostVDPAState *s, const VirtIONet *n,
 
 r = vhost_vdpa_net_load_cmd(s, out_cursor, in_cursor,
 VIRTIO_NET_CTRL_MQ,
+do_rss ? VIRTIO_NET_CTRL_MQ_RSS_CONFIG :
 VIRTIO_NET_CTRL_MQ_HASH_CONFIG,
 data, ARRAY_SIZE(data));
 if (unlikely(r < 0)) {
@@ -899,13 +911,19 @@ static int vhost_vdpa_net_load_mq(VhostVDPAState *s,
 return r;
 }
 
-if (!virtio_vdev_has_feature(&n->parent_obj, VIRTIO_NET_F_HASH_REPORT)) {
-return 0;
-}
-
-r = vhost_vdpa_net_load_rss(s, n, out_cursor, in_cursor, false);
-if (unlikely(r < 0)) {
-return r;
+if (virtio_vdev_has_feature(&n->parent_obj, VIRTIO_NET_F_RSS)) {
+/* Load the receive-side scaling state */
+r = vhost_vdpa_net_load_rss(s, n, out_cursor, in_cursor, true);
+if (unlikely(r < 0)) {
+return r;
+}
+} else if (virtio_vdev_has_feature(&n->parent_obj,
+   VIRTIO_NET_F_HASH_REPORT)) {
+/* Load the hash calculation state */
+r = vhost_vdpa_net_load_rss(s, n, out_cursor, in_cursor, false);
+if (unlikely(r < 0)) {
+return r;
+}
 }
 
 return 0;
-- 
2.25.1




Re: [RFC PATCH 2/3] vdpa: Restore receive-side scaling state

2023-08-13 Thread Hawkins Jiawei
On 2023/8/11 23:28, Hawkins Jiawei wrote:
> This patch reuses vhost_vdpa_net_load_rss() with some
> refactorings to restore the receive-side scaling state
> at the device's startup.
>
> Signed-off-by: Hawkins Jiawei 
> ---
>   net/vhost-vdpa.c | 53 
>   1 file changed, 35 insertions(+), 18 deletions(-)
>
> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> index 4c8e4b19f6..7870cbe142 100644
> --- a/net/vhost-vdpa.c
> +++ b/net/vhost-vdpa.c
> @@ -820,17 +820,28 @@ static int vhost_vdpa_net_load_rss(VhostVDPAState *s, const VirtIONet *n,
>   }
>
>   cfg.hash_types = cpu_to_le32(n->rss_data.hash_types);
> -/*
> - * According to VirtIO standard, "Field reserved MUST contain zeroes.
> - * It is defined to make the structure to match the layout of
> - * virtio_net_rss_config structure, defined in 5.1.6.5.7.".
> - *
> - * Therefore, we need to zero the fields in struct virtio_net_rss_config,
> - * which corresponds to the `reserved` field in
> - * struct virtio_net_hash_config.
> - */
> -memset(&cfg.indirection_table_mask, 0,
> -   sizeof_field(struct virtio_net_hash_config, reserved));
> +if (do_rss) {
> +/*
> + * According to VirtIO standard, "Number of entries in indirection_table
> + * is (indirection_table_mask + 1)".
> + */
> +cfg.indirection_table_mask = cpu_to_le16(n->rss_data.indirections_len -
> + 1);
> +cfg.unclassified_queue = cpu_to_le16(n->rss_data.default_queue);
> +cfg.max_tx_vq = cpu_to_le16(n->curr_queue_pairs);
> +} else {
> +/*
> + * According to VirtIO standard, "Field reserved MUST contain zeroes.
> + * It is defined to make the structure to match the layout of
> + * virtio_net_rss_config structure, defined in 5.1.6.5.7.".
> + *
> + * Therefore, we need to zero the fields in
> + * struct virtio_net_rss_config, which corresponds to the `reserved` field
> + * in struct virtio_net_hash_config.
> + */
> +memset(&cfg.indirection_table_mask, 0,
> +   sizeof_field(struct virtio_net_hash_config, reserved));
> +}
>   /*
>* Consider that virtio_net_handle_rss() currently does not restore the
>* hash key length parsed from the CVQ command sent from the guest into
> @@ -866,6 +877,7 @@ static int vhost_vdpa_net_load_rss(VhostVDPAState *s, const VirtIONet *n,
>
>   r = vhost_vdpa_net_load_cmd(s, out_cursor, in_cursor,
>   VIRTIO_NET_CTRL_MQ,
> +do_rss ? VIRTIO_NET_CTRL_MQ_RSS_CONFIG :
>   VIRTIO_NET_CTRL_MQ_HASH_CONFIG,
>   data, ARRAY_SIZE(data));
>   if (unlikely(r < 0)) {
> @@ -899,13 +911,18 @@ static int vhost_vdpa_net_load_mq(VhostVDPAState *s,
>   return r;
>   }
>
> -if (!virtio_vdev_has_feature(&n->parent_obj, VIRTIO_NET_F_HASH_REPORT)) {
> -return 0;
> -}
> -
> -r = vhost_vdpa_net_load_rss(s, n, out_cursor, in_cursor, false);
> -if (unlikely(r < 0)) {
> -return r;
> +if (virtio_vdev_has_feature(&n->parent_obj, VIRTIO_NET_F_RSS)) {
> +/* Load the receive-side scaling state */
> +r = vhost_vdpa_net_load_rss(s, n, out_cursor, in_cursor, true);
> +if (unlikely(r < 0)) {
> +return r;
> +}
> +} else if (virtio_vdev_has_feature(&n->parent_obj, VIRTIO_NET_F_RSS)) {

The correct feature to be used here is VIRTIO_NET_F_HASH_REPORT, rather
than VIRTIO_NET_F_RSS. I will correct this in the v2 patch.

Thanks!


> +/* Load the hash calculation state */
> +r = vhost_vdpa_net_load_rss(s, n, out_cursor, in_cursor, false);
> +if (unlikely(r < 0)) {
> +return r;
> +}
>   }
>
>   return 0;



[RFC PATCH 3/3] vdpa: Allow VIRTIO_NET_F_RSS in SVQ

2023-08-11 Thread Hawkins Jiawei
Enable SVQ with VIRTIO_NET_F_RSS feature.

Signed-off-by: Hawkins Jiawei 
---
 net/vhost-vdpa.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 7870cbe142..eb08530396 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -119,6 +119,7 @@ static const uint64_t vdpa_svq_device_features =
 /* VHOST_F_LOG_ALL is exposed by SVQ */
 BIT_ULL(VHOST_F_LOG_ALL) |
 BIT_ULL(VIRTIO_NET_F_HASH_REPORT) |
+BIT_ULL(VIRTIO_NET_F_RSS) |
 BIT_ULL(VIRTIO_NET_F_RSC_EXT) |
 BIT_ULL(VIRTIO_NET_F_STANDBY) |
 BIT_ULL(VIRTIO_NET_F_SPEED_DUPLEX);
-- 
2.25.1




[RFC PATCH 2/3] vdpa: Restore receive-side scaling state

2023-08-11 Thread Hawkins Jiawei
This patch reuses vhost_vdpa_net_load_rss() with some
refactorings to restore the receive-side scaling state
at the device's startup.

Signed-off-by: Hawkins Jiawei 
---
 net/vhost-vdpa.c | 53 
 1 file changed, 35 insertions(+), 18 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 4c8e4b19f6..7870cbe142 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -820,17 +820,28 @@ static int vhost_vdpa_net_load_rss(VhostVDPAState *s, const VirtIONet *n,
 }
 
 cfg.hash_types = cpu_to_le32(n->rss_data.hash_types);
-/*
- * According to VirtIO standard, "Field reserved MUST contain zeroes.
- * It is defined to make the structure to match the layout of
- * virtio_net_rss_config structure, defined in 5.1.6.5.7.".
- *
- * Therefore, we need to zero the fields in struct virtio_net_rss_config,
- * which corresponds to the `reserved` field in
- * struct virtio_net_hash_config.
- */
-memset(&cfg.indirection_table_mask, 0,
-   sizeof_field(struct virtio_net_hash_config, reserved));
+if (do_rss) {
+/*
+ * According to VirtIO standard, "Number of entries in indirection_table
+ * is (indirection_table_mask + 1)".
+ */
+cfg.indirection_table_mask = cpu_to_le16(n->rss_data.indirections_len -
+ 1);
+cfg.unclassified_queue = cpu_to_le16(n->rss_data.default_queue);
+cfg.max_tx_vq = cpu_to_le16(n->curr_queue_pairs);
+} else {
+/*
+ * According to VirtIO standard, "Field reserved MUST contain zeroes.
+ * It is defined to make the structure to match the layout of
+ * virtio_net_rss_config structure, defined in 5.1.6.5.7.".
+ *
+ * Therefore, we need to zero the fields in
+ * struct virtio_net_rss_config, which corresponds to the `reserved` field
+ * in struct virtio_net_hash_config.
+ */
+memset(&cfg.indirection_table_mask, 0,
+   sizeof_field(struct virtio_net_hash_config, reserved));
+}
 /*
  * Consider that virtio_net_handle_rss() currently does not restore the
  * hash key length parsed from the CVQ command sent from the guest into
@@ -866,6 +877,7 @@ static int vhost_vdpa_net_load_rss(VhostVDPAState *s, const VirtIONet *n,
 
 r = vhost_vdpa_net_load_cmd(s, out_cursor, in_cursor,
 VIRTIO_NET_CTRL_MQ,
+do_rss ? VIRTIO_NET_CTRL_MQ_RSS_CONFIG :
 VIRTIO_NET_CTRL_MQ_HASH_CONFIG,
 data, ARRAY_SIZE(data));
 if (unlikely(r < 0)) {
@@ -899,13 +911,18 @@ static int vhost_vdpa_net_load_mq(VhostVDPAState *s,
 return r;
 }
 
-if (!virtio_vdev_has_feature(&n->parent_obj, VIRTIO_NET_F_HASH_REPORT)) {
-return 0;
-}
-
-r = vhost_vdpa_net_load_rss(s, n, out_cursor, in_cursor, false);
-if (unlikely(r < 0)) {
-return r;
+if (virtio_vdev_has_feature(&n->parent_obj, VIRTIO_NET_F_RSS)) {
+/* Load the receive-side scaling state */
+r = vhost_vdpa_net_load_rss(s, n, out_cursor, in_cursor, true);
+if (unlikely(r < 0)) {
+return r;
+}
+} else if (virtio_vdev_has_feature(&n->parent_obj, VIRTIO_NET_F_RSS)) {
+/* Load the hash calculation state */
+r = vhost_vdpa_net_load_rss(s, n, out_cursor, in_cursor, false);
+if (unlikely(r < 0)) {
+return r;
+}
 }
 
 return 0;
-- 
2.25.1




[RFC PATCH 0/3] Vhost-vdpa Shadow Virtqueue RSS Support

2023-08-11 Thread Hawkins Jiawei
This series enables the shadowed CVQ to intercept the RSS command,
updates the virtio NIC device model so QEMU sends it in a
migration, and restores that RSS state in the destination.

Note that this patch should be based on
patch "Vhost-vdpa Shadow Virtqueue Hash calculation Support" at [1].

[1]. https://lore.kernel.org/all/cover.1691762906.git.yin31...@gmail.com/

TestStep

1. test the migration using vp-vdpa device
  - For L0 guest, boot QEMU with two virtio-net-pci net device with
`in-qemu` RSS, command line like:
-netdev tap,vhost=off...
-device virtio-net-pci,disable-legacy=on,disable-modern=off,
iommu_platform=on,mq=on,ctrl_vq=on,hash=on,rss=on,guest_announce=off,
indirect_desc=off,queue_reset=off,...

  - For L1 guest, apply the relative patch series and compile the
source code, start QEMU with two vdpa device with svq mode on,
enable the `ctrl_vq`, `mq`, `rss` features on, command line like:
  -netdev type=vhost-vdpa,x-svq=true,...
  -device virtio-net-pci,mq=on,guest_announce=off,ctrl_vq=on,
rss=on,...

  - For L2 source guest, run the following bash command:
```bash
#!/bin/sh

ethtool -K eth0 rxhash on
```

  - Execute the live migration in L2 source monitor

  - Result
* with this series, L2 QEMU can execute without
triggering any error or warning. L0 QEMU echoes
"Can't load eBPF RSS - fallback to software RSS".

Hawkins Jiawei (3):
  vdpa: Add SetSteeringEBPF method for NetClientState
  vdpa: Restore receive-side scaling state
  vdpa: Allow VIRTIO_NET_F_RSS in SVQ

 net/vhost-vdpa.c | 62 ++--
 1 file changed, 44 insertions(+), 18 deletions(-)

-- 
2.25.1




[RFC PATCH 1/3] vdpa: Add SetSteeringEBPF method for NetClientState

2023-08-11 Thread Hawkins Jiawei
At present, to enable the VIRTIO_NET_F_RSS feature, eBPF must
be loaded for the vhost backend.

Given that vhost-vdpa is one of the vhost backends, we need to
implement the SetSteeringEBPF method to support RSS for vhost-vdpa,
even though vhost-vdpa calculates the RSS hash in the hardware device
instead of in the kernel by eBPF.

Although this requires QEMU to be compiled with the `--enable-bpf`
configuration even if the vdpa device does not use eBPF to
calculate the RSS hash, this avoids adding vDPA-specific
conditional statements to enable the VIRTIO_NET_F_RSS
feature, which improves code maintainability.

Suggested-by: Eugenio Pérez 
Signed-off-by: Hawkins Jiawei 
---
 net/vhost-vdpa.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index a13b267250..4c8e4b19f6 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -238,6 +238,12 @@ static void vhost_vdpa_cleanup(NetClientState *nc)
 }
 }
 
+/** Dummy SetSteeringEBPF to support RSS for vhost-vdpa backend  */
+static bool vhost_vdpa_set_steering_ebpf(NetClientState *nc, int prog_fd)
+{
+return true;
+}
+
 static bool vhost_vdpa_has_vnet_hdr(NetClientState *nc)
 {
 assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
@@ -400,6 +406,7 @@ static NetClientInfo net_vhost_vdpa_info = {
 .has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
 .has_ufo = vhost_vdpa_has_ufo,
 .check_peer_type = vhost_vdpa_check_peer_type,
+.set_steering_ebpf = vhost_vdpa_set_steering_ebpf,
 };
 
 static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index,
@@ -1215,6 +1222,7 @@ static NetClientInfo net_vhost_vdpa_cvq_info = {
 .has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
 .has_ufo = vhost_vdpa_has_ufo,
 .check_peer_type = vhost_vdpa_check_peer_type,
+.set_steering_ebpf = vhost_vdpa_set_steering_ebpf,
 };
 
 /*
-- 
2.25.1




[RFC PATCH 2/2] vdpa: Allow VIRTIO_NET_F_HASH_REPORT in SVQ

2023-08-11 Thread Hawkins Jiawei
Enable SVQ with VIRTIO_NET_F_HASH_REPORT feature.

Signed-off-by: Hawkins Jiawei 
---
 net/vhost-vdpa.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index bd51020771..a13b267250 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -118,6 +118,7 @@ static const uint64_t vdpa_svq_device_features =
 BIT_ULL(VIRTIO_NET_F_CTRL_MAC_ADDR) |
 /* VHOST_F_LOG_ALL is exposed by SVQ */
 BIT_ULL(VHOST_F_LOG_ALL) |
+BIT_ULL(VIRTIO_NET_F_HASH_REPORT) |
 BIT_ULL(VIRTIO_NET_F_RSC_EXT) |
 BIT_ULL(VIRTIO_NET_F_STANDBY) |
 BIT_ULL(VIRTIO_NET_F_SPEED_DUPLEX);
-- 
2.25.1




[RFC PATCH 1/2] vdpa: Restore hash calculation state

2023-08-11 Thread Hawkins Jiawei
This patch introduces vhost_vdpa_net_load_rss() to restore
the hash calculation state at device's startup.

Note that vhost_vdpa_net_load_rss() has a `do_rss` argument,
which allows future code to reuse this function to restore
the receive-side scaling state when the VIRTIO_NET_F_RSS
feature is enabled in SVQ. Currently, vhost_vdpa_net_load_rss()
can only be invoked when `do_rss` is set to false.

Signed-off-by: Hawkins Jiawei 
---
Question:

It seems that virtio_net_handle_rss() currently does not restore the
hash key length parsed from the CVQ command sent from the guest into
n->rss_data and uses the maximum key length in other code.

So for `hash_key_length` field in VIRTIO_NET_CTRL_MQ_HASH_CONFIG command
sent to device, is it okay to also use the maximum key length as its value?
Or should we introduce the `hash_key_length` field in n->rss_data
structure to record the key length from guest and use this value?
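
For illustration, the second alternative might look roughly like this
(the `hash_key_length` field and the helper are hypothetical, not
existing QEMU code):

```c
#include <stdint.h>
#include <string.h>

#define RSS_MAX_KEY_SIZE 40   /* assumed maximum hash key size */

/* Hypothetical: record the key length the guest actually supplied,
 * instead of always reporting the maximum on restore. */
typedef struct RssData {
    uint8_t key[RSS_MAX_KEY_SIZE];
    uint8_t hash_key_length;          /* proposed new field */
} RssData;

static void rss_store_key(RssData *rss, const uint8_t *key, uint8_t len)
{
    if (len > sizeof(rss->key)) {
        len = sizeof(rss->key);       /* clamp to the supported maximum */
    }
    memcpy(rss->key, key, len);
    rss->hash_key_length = len;       /* value to restore at device load */
}
```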

 net/vhost-vdpa.c | 88 
 1 file changed, 88 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 7bb29f6009..bd51020771 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -788,6 +788,85 @@ static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet *n,
 return 0;
 }
 
+static int vhost_vdpa_net_load_rss(VhostVDPAState *s, const VirtIONet *n,
+   void **out_cursor, void **in_cursor,
+   bool do_rss)
+{
+struct virtio_net_rss_config cfg;
+ssize_t r;
+
+/*
+ * According to VirtIO standard, "Initially the device has all hash
+ * types disabled and reports only VIRTIO_NET_HASH_REPORT_NONE.".
+ *
+ * Therefore, there is no need to send this CVQ command if the
+ * driver disables all the hash types, which aligns with
+ * the device's defaults.
+ *
+ * Note that the device's defaults can mismatch the driver's
+ * configuration only at live migration.
+ */
+if (!n->rss_data.enabled ||
+n->rss_data.hash_types == VIRTIO_NET_HASH_REPORT_NONE) {
+return 0;
+}
+
+cfg.hash_types = cpu_to_le32(n->rss_data.hash_types);
+/*
+ * According to VirtIO standard, "Field reserved MUST contain zeroes.
+ * It is defined to make the structure to match the layout of
+ * virtio_net_rss_config structure, defined in 5.1.6.5.7.".
+ *
+ * Therefore, we need to zero the fields in struct virtio_net_rss_config,
+ * which corresponds to the `reserved` field in
+ * struct virtio_net_hash_config.
+ */
+memset(&cfg.indirection_table_mask, 0,
+   sizeof_field(struct virtio_net_hash_config, reserved));
+/*
+ * Consider that virtio_net_handle_rss() currently does not restore the
+ * hash key length parsed from the CVQ command sent from the guest into
+ * n->rss_data and uses the maximum key length in other code, so we also
+ * employ the maximum key length here.
+ */
+cfg.hash_key_length = sizeof(n->rss_data.key);
+
+g_autofree uint16_t *table = g_malloc_n(n->rss_data.indirections_len,
+sizeof(n->rss_data.indirections_table[0]));
+for (int i = 0; i < n->rss_data.indirections_len; ++i) {
+table[i] = cpu_to_le16(n->rss_data.indirections_table[i]);
+}
+
+const struct iovec data[] = {
+{
+.iov_base = &cfg,
+.iov_len = offsetof(struct virtio_net_rss_config,
+indirection_table),
+}, {
+.iov_base = table,
+.iov_len = n->rss_data.indirections_len *
+   sizeof(n->rss_data.indirections_table[0]),
+}, {
+.iov_base = &cfg.max_tx_vq,
+.iov_len = offsetof(struct virtio_net_rss_config, hash_key_data) -
+   offsetof(struct virtio_net_rss_config, max_tx_vq),
+}, {
+.iov_base = (void *)n->rss_data.key,
+.iov_len = sizeof(n->rss_data.key),
+}
+};
+
+r = vhost_vdpa_net_load_cmd(s, out_cursor, in_cursor,
+VIRTIO_NET_CTRL_MQ,
+VIRTIO_NET_CTRL_MQ_HASH_CONFIG,
+data, ARRAY_SIZE(data));
+if (unlikely(r < 0)) {
+return r;
+}
+
+return 0;
+}
+
 static int vhost_vdpa_net_load_mq(VhostVDPAState *s,
   const VirtIONet *n,
   void **out_cursor, void **in_cursor)
@@ -812,6 +891,15 @@ static int vhost_vdpa_net_load_mq(VhostVDPAState *s,
 return r;
 }
 
+if (!virtio_vdev_has_feature(&n->parent_obj, VIRTIO_NET_F_HASH_REPORT)) {
+return 0;
+}
+
+r = vhost_vdpa_net_load_rss(s, n, out_cursor, in_cursor, false);
+if (unlikely(r < 0)) {
+return r;
+}
+
 return 0;
 }
 
-- 
2.25.1




[RFC PATCH 0/2] Vhost-vdpa Shadow Virtqueue Hash calculation Support

2023-08-11 Thread Hawkins Jiawei
This series enables the shadowed CVQ to intercept the
VIRTIO_NET_CTRL_MQ_HASH_CONFIG command, updates the virtio NIC
device model so QEMU sends it in a migration, and restores that
hash calculation state in the destination.

Note that this patch should be based on
patch "vdpa: Send all CVQ state load commands in parallel" at [1].

[1]. https://lore.kernel.org/all/cover.1689748694.git.yin31...@gmail.com/

TestStep

1. test the migration using vp-vdpa device
  - For L0 guest, boot QEMU with two virtio-net-pci net device with
`ctrl_vq`, `mq`, `hash` features on, command line like:
-netdev tap,...
-device virtio-net-pci,disable-legacy=on,disable-modern=off,
iommu_platform=on,mq=on,ctrl_vq=on,hash=on,guest_announce=off,
indirect_desc=off,queue_reset=off,...

  - For L1 guest, apply the relative patch series and compile the
source code, start QEMU with two vdpa device with svq mode on,
enable the `ctrl_vq`, `mq`, `hash` features on, command line like:
  -netdev type=vhost-vdpa,x-svq=true,...
  -device virtio-net-pci,mq=on,guest_announce=off,ctrl_vq=on,
hash=on,...

  - For L2 source guest, run the following bash command:
```bash
#!/bin/sh

ethtool -K eth0 rxhash on
```
  - Gdb attach the destination VM and break at the
vhost_vdpa_net_load_rss()

  - Execute the live migration in L2 source monitor

  - Result
* with this series, gdb can hit the breakpoint and continue
executing without triggering any error or warning.

Hawkins Jiawei (2):
  vdpa: Restore hash calculation state
  vdpa: Allow VIRTIO_NET_F_HASH_REPORT in SVQ

 net/vhost-vdpa.c | 89 
 1 file changed, 89 insertions(+)

-- 
2.25.1




Re: [PATCH v3 0/3] vdpa: Return -EIO if device ack is VIRTIO_NET_ERR

2023-08-05 Thread Hawkins Jiawei
On 2023/8/5 14:15, Michael Tokarev wrote:
> 04.07.2023 06:34, Hawkins Jiawei wrote:
>> According to VirtIO standard, "The class, command and
>> command-specific-data are set by the driver,
>> and the device sets the ack byte.
>> There is little it can do except issue a diagnostic
>> if ack is not VIRTIO_NET_OK."
>>
>> Therefore, QEMU should stop sending the queued SVQ commands and
>> cancel the device startup if the device's ack is not VIRTIO_NET_OK.
>>
>> Yet the problem is that, vhost_vdpa_net_load_x() returns 1 based on
>> `*s->status != VIRTIO_NET_OK` when the device's ack is VIRTIO_NET_ERR.
>> As a result, net->nc->info->load() also returns 1, this makes
>> vhost_net_start_one() incorrectly assume the device state is
>> successfully loaded by vhost_vdpa_net_load() and return 0, instead of
>> goto `fail` label to cancel the device startup, as vhost_net_start_one()
>> only cancels the device startup when net->nc->info->load() returns a
>> negative value.
>>
>> This patchset fixes this problem by returning -EIO when the device's
>> ack is not VIRTIO_NET_OK.
>>
>> Changelog
>> =
>> v3:
>>   - split the fixes suggested by Eugenio
>>   - return -EIO suggested by Michael
>>
>> v2:
>> https://lore.kernel.org/all/69010e9ebb5e3729aef595ed92840f43e48e53e5.1687875592.git.yin31...@gmail.com/
>>   - fix the same bug in vhost_vdpa_net_load_offloads()
>>
>> v1: https://lore.kernel.org/all/cover.1686746406.git.yin31...@gmail.com/
>>
>> Hawkins Jiawei (3):
>>vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mac()
>>vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mq()
>>vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_offloads()
>
> Hi!
>
> I don't remember why, but this patch series is marked as "check later" in
> my qemu-stable-to-apply email folder.  Does it make sense to back-port this
> series to stable-8.0?
>
> 6f34807116 vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in
> _load_offloads()
> f45fd95ec9 vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mq()
> b479bc3c9d vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mac()
>

Hi Michael,

Yes, this bug exists in stable-8.0, so it makes sense to back-port this
series.

Commit f45fd95ec9("vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in
_load_mq()") and
commit b479bc3c9d("vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in
_load_mac()") can be back-ported directly.

> Patch 6f34807116 also needs
>
> b58d3686a0 vdpa: Add vhost_vdpa_net_load_offloads()

As you point out, patch 6f34807116("vdpa: Return -EIO if device ack is
VIRTIO_NET_ERR in _load_offloads()") is a fix to the commit
b58d3686a0("vdpa: Add vhost_vdpa_net_load_offloads()"), which was
introduced by patch series "Vhost-vdpa Shadow Virtqueue Offloads
support" at [1].

This mentioned patch series introduces a new feature for QEMU and
has not been merged into stable-8.0 yet, so I think we do not need to
apply the patch 6f34807116("vdpa: Return -EIO if device ack is
VIRTIO_NET_ERR in _load_offloads()") to stable-8.0.

Sorry for not mentioning this information in the cover letter.

Thanks!

[1]. https://lore.kernel.org/all/cover.1685704856.git.yin31...@gmail.com/

>
> for 8.0.
>
> Thanks,
>
> /mjt



Re: [PATCH v2 3/4] vdpa: Restore vlan filtering state

2023-07-25 Thread Hawkins Jiawei
On 2023/7/25 14:47, Jason Wang wrote:
> On Sun, Jul 23, 2023 at 5:28 PM Hawkins Jiawei  wrote:
>>
>> This patch introduces vhost_vdpa_net_load_single_vlan()
>> and vhost_vdpa_net_load_vlan() to restore the vlan
>> filtering state at the device's startup.
>>
>> Co-developed-by: Eugenio Pérez 
>> Signed-off-by: Eugenio Pérez 
>> Signed-off-by: Hawkins Jiawei 
>
> Acked-by: Jason Wang 
>
> But this seems to be a source of latency killer as it may at most send
> 1024 commands.
>
> As discussed in the past, we need a better cvq command to do this: for
> example, a single command to carry a bitmap.

Hi Jason,

Thanks for your review.

You are right, we need some improvement here.

Therefore, I have submitted another patch series titled "vdpa: Send all
CVQ state load commands in parallel" at [1], which allows QEMU to delay
polling and checking the device used buffer until either the SVQ is full
or control commands shadow buffers are full, so that QEMU can send all
the SVQ control commands in parallel, which yields a better
performance improvement.

To test that patch series, I created 4094 VLANs in the guest to build an
environment for sending multiple CVQ state load commands. According to
the result on the real vdpa device at [2], this patch series can improve
latency from 23296 us to 6539 us.

Thanks!

[1]. https://lists.gnu.org/archive/html/qemu-devel/2023-07/msg03726.html
[2]. https://lists.gnu.org/archive/html/qemu-devel/2023-07/msg03947.html
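
For illustration, the kind of bitmap-carrying command suggested above
might look roughly like this (a purely hypothetical layout, not part of
the VirtIO spec or of QEMU):

```c
#include <stdint.h>

#define MAX_VLAN (1 << 12)   /* 4096 VLAN IDs, per 802.1Q */

/* Hypothetical CVQ payload that sets the whole VLAN filter table in
 * one command: a 4096-bit map, one bit per VLAN ID. */
struct virtio_net_ctrl_vlan_bitmap {
    uint32_t vlans[MAX_VLAN >> 5];   /* bit v set => VLAN v is filtered */
};

/* Mark one VLAN ID, mirroring the (i << 5) + j indexing that
 * vhost_vdpa_net_load_vlan() walks when replaying per-VLAN commands. */
static inline void vlan_bitmap_add(struct virtio_net_ctrl_vlan_bitmap *b,
                                   uint16_t vid)
{
    b->vlans[vid >> 5] |= 1U << (vid & 0x1f);
}
```

With such a command, the restore path would send one CVQ message
instead of up to 4094, avoiding the latency discussed above.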


>
> Thanks
>
>> ---
>> v2:
>>   - remove the extra line pointed out by Eugenio
>>
>> v1: 
>> https://lore.kernel.org/all/0a568cc8a8d2b750c2e09b2237e9f05cece07c3f.1689690854.git.yin31...@gmail.com/
>>
>>   net/vhost-vdpa.c | 48 
>>   1 file changed, 48 insertions(+)
>>
>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>> index 9795306742..347241796d 100644
>> --- a/net/vhost-vdpa.c
>> +++ b/net/vhost-vdpa.c
>> @@ -965,6 +965,50 @@ static int vhost_vdpa_net_load_rx(VhostVDPAState *s,
>>   return 0;
>>   }
>>
>> +static int vhost_vdpa_net_load_single_vlan(VhostVDPAState *s,
>> +   const VirtIONet *n,
>> +   uint16_t vid)
>> +{
>> +const struct iovec data = {
>> +.iov_base = &vid,
>> +.iov_len = sizeof(vid),
>> +};
>> +ssize_t dev_written = vhost_vdpa_net_load_cmd(s, VIRTIO_NET_CTRL_VLAN,
>> +  VIRTIO_NET_CTRL_VLAN_ADD,
>> +  &data, 1);
>> +if (unlikely(dev_written < 0)) {
>> +return dev_written;
>> +}
>> +if (unlikely(*s->status != VIRTIO_NET_OK)) {
>> +return -EIO;
>> +}
>> +
>> +return 0;
>> +}
>> +
>> +static int vhost_vdpa_net_load_vlan(VhostVDPAState *s,
>> +const VirtIONet *n)
>> +{
>> +int r;
>> +
>> +if (!virtio_vdev_has_feature(&n->parent_obj, VIRTIO_NET_F_CTRL_VLAN)) {
>> +return 0;
>> +}
>> +
>> +for (int i = 0; i < MAX_VLAN >> 5; i++) {
>> +for (int j = 0; n->vlans[i] && j <= 0x1f; j++) {
>> +if (n->vlans[i] & (1U << j)) {
>> +r = vhost_vdpa_net_load_single_vlan(s, n, (i << 5) + j);
>> +if (unlikely(r != 0)) {
>> +return r;
>> +}
>> +}
>> +}
>> +}
>> +
>> +return 0;
>> +}
>> +
>>   static int vhost_vdpa_net_load(NetClientState *nc)
>>   {
>>   VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
>> @@ -995,6 +1039,10 @@ static int vhost_vdpa_net_load(NetClientState *nc)
>>   if (unlikely(r)) {
>>   return r;
>>   }
>> +r = vhost_vdpa_net_load_vlan(s, n);
>> +if (unlikely(r)) {
>> +return r;
>> +}
>>
>>   return 0;
>>   }
>> --
>> 2.25.1
>>
>



[PATCH v3 2/4] virtio-net: Expose MAX_VLAN

2023-07-23 Thread Hawkins Jiawei
vhost-vdpa shadowed CVQ needs to know the maximum number of
vlans supported by the virtio-net device, so QEMU can restore
the VLAN state in a migration.
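
As a quick illustration of the arithmetic this constant drives (a
sketch, not code from this patch):

```c
#include <stdint.h>

#define MAX_VLAN (1 << 12)   /* 4096 VLAN IDs, per 802.1Q */

/* The device model keeps one filter bit per VLAN ID:
 * 4096 bits = (MAX_VLAN >> 3) = 512 bytes = (MAX_VLAN >> 5) uint32_t. */
static uint32_t vlans[MAX_VLAN >> 5];

/* Test whether a given VLAN ID is in the filter table; the restore
 * loop in net/vhost-vdpa.c walks the same (i << 5) + j layout. */
static inline int vlan_is_set(uint16_t vid)
{
    return (vlans[vid >> 5] >> (vid & 0x1f)) & 1;
}
```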

Co-developed-by: Eugenio Pérez 
Signed-off-by: Eugenio Pérez 
Signed-off-by: Hawkins Jiawei 
---
 hw/net/virtio-net.c| 2 --
 include/hw/virtio/virtio-net.h | 6 ++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index d20d5a63cd..a32672039d 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -49,8 +49,6 @@
 
 #define VIRTIO_NET_VM_VERSION    11
 
-#define MAX_VLAN    (1 << 12)   /* Per 802.1Q definition */
-
 /* previously fixed value */
 #define VIRTIO_NET_RX_QUEUE_DEFAULT_SIZE 256
 #define VIRTIO_NET_TX_QUEUE_DEFAULT_SIZE 256
diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
index 5f5dcb4572..93f3bb5d97 100644
--- a/include/hw/virtio/virtio-net.h
+++ b/include/hw/virtio/virtio-net.h
@@ -38,6 +38,12 @@ OBJECT_DECLARE_SIMPLE_TYPE(VirtIONet, VIRTIO_NET)
 /* Maximum VIRTIO_NET_CTRL_MAC_TABLE_SET unicast + multicast entries. */
 #define MAC_TABLE_ENTRIES    64
 
+/*
+ * The maximum number of VLANs in the VLAN filter table
+ * added by VIRTIO_NET_CTRL_VLAN_ADD
+ */
+#define MAX_VLAN    (1 << 12)   /* Per 802.1Q definition */
+
 typedef struct virtio_net_conf
 {
 uint32_t txtimer;
-- 
2.25.1




[PATCH v3 3/4] vdpa: Restore vlan filtering state

2023-07-23 Thread Hawkins Jiawei
This patch introduces vhost_vdpa_net_load_single_vlan()
and vhost_vdpa_net_load_vlan() to restore the vlan
filtering state at the device's startup.

Co-developed-by: Eugenio Pérez 
Signed-off-by: Eugenio Pérez 
Signed-off-by: Hawkins Jiawei 
---
v2:
 - remove the extra line pointed out by Eugenio

v1: 
https://lore.kernel.org/all/0a568cc8a8d2b750c2e09b2237e9f05cece07c3f.1689690854.git.yin31...@gmail.com/

 net/vhost-vdpa.c | 48 
 1 file changed, 48 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 9795306742..347241796d 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -965,6 +965,50 @@ static int vhost_vdpa_net_load_rx(VhostVDPAState *s,
 return 0;
 }
 
+static int vhost_vdpa_net_load_single_vlan(VhostVDPAState *s,
+   const VirtIONet *n,
+   uint16_t vid)
+{
+const struct iovec data = {
+.iov_base = &vid,
+.iov_len = sizeof(vid),
+};
+ssize_t dev_written = vhost_vdpa_net_load_cmd(s, VIRTIO_NET_CTRL_VLAN,
+  VIRTIO_NET_CTRL_VLAN_ADD,
+  &data, 1);
+if (unlikely(dev_written < 0)) {
+return dev_written;
+}
+if (unlikely(*s->status != VIRTIO_NET_OK)) {
+return -EIO;
+}
+
+return 0;
+}
+
+static int vhost_vdpa_net_load_vlan(VhostVDPAState *s,
+const VirtIONet *n)
+{
+int r;
+
+if (!virtio_vdev_has_feature(&n->parent_obj, VIRTIO_NET_F_CTRL_VLAN)) {
+return 0;
+}
+
+for (int i = 0; i < MAX_VLAN >> 5; i++) {
+for (int j = 0; n->vlans[i] && j <= 0x1f; j++) {
+if (n->vlans[i] & (1U << j)) {
+r = vhost_vdpa_net_load_single_vlan(s, n, (i << 5) + j);
+if (unlikely(r != 0)) {
+return r;
+}
+}
+}
+}
+
+return 0;
+}
+
 static int vhost_vdpa_net_load(NetClientState *nc)
 {
 VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
@@ -995,6 +1039,10 @@ static int vhost_vdpa_net_load(NetClientState *nc)
 if (unlikely(r)) {
 return r;
 }
+r = vhost_vdpa_net_load_vlan(s, n);
+if (unlikely(r)) {
+return r;
+}
 
 return 0;
 }
-- 
2.25.1




[PATCH v3 4/4] vdpa: Allow VIRTIO_NET_F_CTRL_VLAN in SVQ

2023-07-23 Thread Hawkins Jiawei
Enable SVQ with VIRTIO_NET_F_CTRL_VLAN feature.

Co-developed-by: Eugenio Pérez 
Signed-off-by: Eugenio Pérez 
Signed-off-by: Hawkins Jiawei 
---
 net/vhost-vdpa.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 347241796d..73e9063fa0 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -111,6 +111,7 @@ static const uint64_t vdpa_svq_device_features =
 BIT_ULL(VIRTIO_NET_F_STATUS) |
 BIT_ULL(VIRTIO_NET_F_CTRL_VQ) |
 BIT_ULL(VIRTIO_NET_F_CTRL_RX) |
+BIT_ULL(VIRTIO_NET_F_CTRL_VLAN) |
 BIT_ULL(VIRTIO_NET_F_CTRL_RX_EXTRA) |
 BIT_ULL(VIRTIO_NET_F_MQ) |
 BIT_ULL(VIRTIO_F_ANY_LAYOUT) |
-- 
2.25.1




[PATCH v3 1/4] virtio-net: do not reset vlan filtering at set_features

2023-07-23 Thread Hawkins Jiawei
This function is called after virtio_load, so all vlan configuration is
lost in the migration case.

Just allow all the vlan-tagged packets if vlan is not configured, and
trust device reset to clear all filtered vlans.

Fixes: 0b1eaa8803 ("virtio-net: Do not filter VLANs without F_CTRL_VLAN")
Signed-off-by: Eugenio Pérez 
Reviewed-by: Hawkins Jiawei 
Signed-off-by: Hawkins Jiawei 
---
v3:
 - remove the extra "From" line

v2: 
https://lore.kernel.org/all/95af0d013281282f48ad3f47f6ad1ac4ca9e52eb.1690100802.git.yin31...@gmail.com/

 hw/net/virtio-net.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 7102ec4817..d20d5a63cd 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1006,9 +1006,7 @@ static void virtio_net_set_features(VirtIODevice *vdev, uint64_t features)
 vhost_net_save_acked_features(nc->peer);
 }
 
-if (virtio_has_feature(features, VIRTIO_NET_F_CTRL_VLAN)) {
-memset(n->vlans, 0, MAX_VLAN >> 3);
-} else {
+if (!virtio_has_feature(features, VIRTIO_NET_F_CTRL_VLAN)) {
 memset(n->vlans, 0xff, MAX_VLAN >> 3);
 }
 
-- 
2.25.1




[PATCH v3 0/4] Vhost-vdpa Shadow Virtqueue VLAN support

2023-07-23 Thread Hawkins Jiawei
This series enables the shadowed CVQ to intercept VLAN commands,
updates the virtio NIC device model so QEMU sends them in a
migration, and restores that VLAN state in the destination.

ChangeLog
=
v3:
 - remove the extra "From" line in patch 1
"virtio-net: do not reset vlan filtering at set_features"

v2: https://lore.kernel.org/all/cover.1690100802.git.yin31...@gmail.com/
 - remove the extra line pointed out by Eugenio in patch 3
"vdpa: Restore vlan filtering state"

v1: https://lore.kernel.org/all/cover.1689690854.git.yin31...@gmail.com/
 - based on patch "[PATCH 0/3] Vhost-vdpa Shadow Virtqueue VLAN support"
at https://lists.gnu.org/archive/html/qemu-devel/2022-09/msg01016.html
 - move `MAX_VLAN` macro to include/hw/virtio/virtio-net.h
instead of net/vhost-vdpa.c
 - fix conflicts with the master branch


TestStep

1. test the migration using vp-vdpa device
  - For L0 guest, boot QEMU with two virtio-net-pci net device with
`ctrl_vq`, `ctrl_vlan` features on, command line like:
  -device virtio-net-pci,disable-legacy=on,disable-modern=off,
iommu_platform=on,mq=on,ctrl_vq=on,guest_announce=off,
indirect_desc=off,queue_reset=off,ctrl_vlan=on,...

  - For L1 guest, apply the patch series and compile the source code,
start QEMU with two vdpa device with svq mode on, enable the `ctrl_vq`,
`ctrl_vlan` features on, command line like:
  -netdev type=vhost-vdpa,x-svq=true,...
  -device virtio-net-pci,mq=on,guest_announce=off,ctrl_vq=on,
ctrl_vlan=on,...

  - For L2 source guest, run the following bash command:
```bash
#!/bin/sh

for idx in {1..4094}
do
  ip link add link eth0 name vlan$idx type vlan id $idx
done
```

  - gdb attaches the L2 dest VM and break at the
vhost_vdpa_net_load_single_vlan(), and execute the following
gdbscript
```gdbscript
ignore 1 4094
c
```

  - Execute the live migration in L2 source monitor

  - Result
* with this series, gdb can hit the breakpoint and continue
executing without triggering any error or warning.

Eugenio Pérez (1):
  virtio-net: do not reset vlan filtering at set_features

Hawkins Jiawei (3):
  virtio-net: Expose MAX_VLAN
  vdpa: Restore vlan filtering state
  vdpa: Allow VIRTIO_NET_F_CTRL_VLAN in SVQ

 hw/net/virtio-net.c|  6 +
 include/hw/virtio/virtio-net.h |  6 +
 net/vhost-vdpa.c   | 49 ++
 3 files changed, 56 insertions(+), 5 deletions(-)

-- 
2.25.1




Re: [PATCH v2 1/4] virtio-net: do not reset vlan filtering at set_features

2023-07-23 Thread Hawkins Jiawei
On 2023/7/23 17:26, Hawkins Jiawei wrote:
> From: Eugenio Pérez 

There was a wrong "From" line by mistake, I will send the v3 patch to
fix this.

Thanks!


>
> This function is called after virtio_load, so all vlan configuration is
> lost in the migration case.
>
> Just allow all the vlan-tagged packets if vlan is not configured, and
> trust device reset to clear all filtered vlans.
>
> Fixes: 0b1eaa8803 ("virtio-net: Do not filter VLANs without F_CTRL_VLAN")
> Signed-off-by: Eugenio Pérez 
> Reviewed-by: Hawkins Jiawei 
> Signed-off-by: Hawkins Jiawei 
> ---
>   hw/net/virtio-net.c | 4 +---
>   1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 7102ec4817..d20d5a63cd 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -1006,9 +1006,7 @@ static void virtio_net_set_features(VirtIODevice *vdev, uint64_t features)
>   vhost_net_save_acked_features(nc->peer);
>   }
>
> -if (virtio_has_feature(features, VIRTIO_NET_F_CTRL_VLAN)) {
> -memset(n->vlans, 0, MAX_VLAN >> 3);
> -} else {
> +if (!virtio_has_feature(features, VIRTIO_NET_F_CTRL_VLAN)) {
>   memset(n->vlans, 0xff, MAX_VLAN >> 3);
>   }
>



[PATCH v2 3/4] vdpa: Restore vlan filtering state

2023-07-23 Thread Hawkins Jiawei
This patch introduces vhost_vdpa_net_load_single_vlan()
and vhost_vdpa_net_load_vlan() to restore the vlan
filtering state at the device's startup.

Co-developed-by: Eugenio Pérez 
Signed-off-by: Eugenio Pérez 
Signed-off-by: Hawkins Jiawei 
---
v2:
 - remove the extra line pointed out by Eugenio

v1: 
https://lore.kernel.org/all/0a568cc8a8d2b750c2e09b2237e9f05cece07c3f.1689690854.git.yin31...@gmail.com/

 net/vhost-vdpa.c | 48 
 1 file changed, 48 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 9795306742..347241796d 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -965,6 +965,50 @@ static int vhost_vdpa_net_load_rx(VhostVDPAState *s,
 return 0;
 }
 
+static int vhost_vdpa_net_load_single_vlan(VhostVDPAState *s,
+   const VirtIONet *n,
+   uint16_t vid)
+{
+const struct iovec data = {
+.iov_base = &vid,
+.iov_len = sizeof(vid),
+};
+ssize_t dev_written = vhost_vdpa_net_load_cmd(s, VIRTIO_NET_CTRL_VLAN,
+  VIRTIO_NET_CTRL_VLAN_ADD,
+  &data, 1);
+if (unlikely(dev_written < 0)) {
+return dev_written;
+}
+if (unlikely(*s->status != VIRTIO_NET_OK)) {
+return -EIO;
+}
+
+return 0;
+}
+
+static int vhost_vdpa_net_load_vlan(VhostVDPAState *s,
+const VirtIONet *n)
+{
+int r;
+
+if (!virtio_vdev_has_feature(&n->parent_obj, VIRTIO_NET_F_CTRL_VLAN)) {
+return 0;
+}
+
+for (int i = 0; i < MAX_VLAN >> 5; i++) {
+for (int j = 0; n->vlans[i] && j <= 0x1f; j++) {
+if (n->vlans[i] & (1U << j)) {
+r = vhost_vdpa_net_load_single_vlan(s, n, (i << 5) + j);
+if (unlikely(r != 0)) {
+return r;
+}
+}
+}
+}
+
+return 0;
+}
+
 static int vhost_vdpa_net_load(NetClientState *nc)
 {
 VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
@@ -995,6 +1039,10 @@ static int vhost_vdpa_net_load(NetClientState *nc)
 if (unlikely(r)) {
 return r;
 }
+r = vhost_vdpa_net_load_vlan(s, n);
+if (unlikely(r)) {
+return r;
+}
 
 return 0;
 }
-- 
2.25.1




[PATCH v2 4/4] vdpa: Allow VIRTIO_NET_F_CTRL_VLAN in SVQ

2023-07-23 Thread Hawkins Jiawei
Enable SVQ with VIRTIO_NET_F_CTRL_VLAN feature.

Co-developed-by: Eugenio Pérez 
Signed-off-by: Eugenio Pérez 
Signed-off-by: Hawkins Jiawei 
---
 net/vhost-vdpa.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 347241796d..73e9063fa0 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -111,6 +111,7 @@ static const uint64_t vdpa_svq_device_features =
 BIT_ULL(VIRTIO_NET_F_STATUS) |
 BIT_ULL(VIRTIO_NET_F_CTRL_VQ) |
 BIT_ULL(VIRTIO_NET_F_CTRL_RX) |
+BIT_ULL(VIRTIO_NET_F_CTRL_VLAN) |
 BIT_ULL(VIRTIO_NET_F_CTRL_RX_EXTRA) |
 BIT_ULL(VIRTIO_NET_F_MQ) |
 BIT_ULL(VIRTIO_F_ANY_LAYOUT) |
-- 
2.25.1




[PATCH v2 1/4] virtio-net: do not reset vlan filtering at set_features

2023-07-23 Thread Hawkins Jiawei
From: Eugenio Pérez 

This function is called after virtio_load, so all vlan configuration is
lost in the migration case.

Just allow all the vlan-tagged packets if vlan is not configured, and
trust device reset to clear all filtered vlans.

Fixes: 0b1eaa8803 ("virtio-net: Do not filter VLANs without F_CTRL_VLAN")
Signed-off-by: Eugenio Pérez 
Reviewed-by: Hawkins Jiawei 
Signed-off-by: Hawkins Jiawei 
---
 hw/net/virtio-net.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 7102ec4817..d20d5a63cd 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1006,9 +1006,7 @@ static void virtio_net_set_features(VirtIODevice *vdev, uint64_t features)
 vhost_net_save_acked_features(nc->peer);
 }
 
-if (virtio_has_feature(features, VIRTIO_NET_F_CTRL_VLAN)) {
-memset(n->vlans, 0, MAX_VLAN >> 3);
-} else {
+if (!virtio_has_feature(features, VIRTIO_NET_F_CTRL_VLAN)) {
 memset(n->vlans, 0xff, MAX_VLAN >> 3);
 }
 
-- 
2.25.1




[PATCH v2 2/4] virtio-net: Expose MAX_VLAN

2023-07-23 Thread Hawkins Jiawei
vhost-vdpa shadowed CVQ needs to know the maximum number of
vlans supported by the virtio-net device, so QEMU can restore
the VLAN state in a migration.

Co-developed-by: Eugenio Pérez 
Signed-off-by: Eugenio Pérez 
Signed-off-by: Hawkins Jiawei 
---
 hw/net/virtio-net.c| 2 --
 include/hw/virtio/virtio-net.h | 6 ++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index d20d5a63cd..a32672039d 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -49,8 +49,6 @@
 
 #define VIRTIO_NET_VM_VERSION    11
 
-#define MAX_VLAN    (1 << 12)   /* Per 802.1Q definition */
-
 /* previously fixed value */
 #define VIRTIO_NET_RX_QUEUE_DEFAULT_SIZE 256
 #define VIRTIO_NET_TX_QUEUE_DEFAULT_SIZE 256
diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
index 5f5dcb4572..93f3bb5d97 100644
--- a/include/hw/virtio/virtio-net.h
+++ b/include/hw/virtio/virtio-net.h
@@ -38,6 +38,12 @@ OBJECT_DECLARE_SIMPLE_TYPE(VirtIONet, VIRTIO_NET)
 /* Maximum VIRTIO_NET_CTRL_MAC_TABLE_SET unicast + multicast entries. */
 #define MAC_TABLE_ENTRIES    64
 
+/*
+ * The maximum number of VLANs in the VLAN filter table
+ * added by VIRTIO_NET_CTRL_VLAN_ADD
+ */
+#define MAX_VLAN    (1 << 12)   /* Per 802.1Q definition */
+
 typedef struct virtio_net_conf
 {
 uint32_t txtimer;
-- 
2.25.1




[PATCH v2 0/4] Vhost-vdpa Shadow Virtqueue VLAN support

2023-07-23 Thread Hawkins Jiawei
This series enables the shadowed CVQ to intercept VLAN commands,
updates the virtio NIC device model so QEMU sends them in a
migration, and restores that VLAN state in the destination.

ChangeLog
=
v2:
 - remove the extra line pointed out by Eugenio in patch 3
"vdpa: Restore vlan filtering state"

v1: https://lore.kernel.org/all/cover.1689690854.git.yin31...@gmail.com/
 - based on patch "[PATCH 0/3] Vhost-vdpa Shadow Virtqueue VLAN support"
at https://lists.gnu.org/archive/html/qemu-devel/2022-09/msg01016.html
 - move `MAX_VLAN` macro to include/hw/virtio/virtio-net.h
instead of net/vhost-vdpa.c
 - fix conflicts with the master branch


TestStep

1. test the migration using vp-vdpa device
  - For L0 guest, boot QEMU with two virtio-net-pci net device with
`ctrl_vq`, `ctrl_vlan` features on, command line like:
  -device virtio-net-pci,disable-legacy=on,disable-modern=off,
iommu_platform=on,mq=on,ctrl_vq=on,guest_announce=off,
indirect_desc=off,queue_reset=off,ctrl_vlan=on,...

  - For L1 guest, apply the patch series and compile the source code,
start QEMU with two vdpa device with svq mode on, enable the `ctrl_vq`,
`ctrl_vlan` features on, command line like:
  -netdev type=vhost-vdpa,x-svq=true,...
  -device virtio-net-pci,mq=on,guest_announce=off,ctrl_vq=on,
ctrl_vlan=on,...

  - For L2 source guest, run the following bash command:
```bash
#!/bin/sh

for idx in {1..4094}
do
  ip link add link eth0 name vlan$idx type vlan id $idx
done
```

  - gdb attaches the L2 dest VM and break at the
vhost_vdpa_net_load_single_vlan(), and execute the following
gdbscript
```gdbscript
ignore 1 4094
c
```

  - Execute the live migration in L2 source monitor

  - Result
* with this series, gdb can hit the breakpoint and continue
executing without triggering any error or warning.

Eugenio Pérez (1):
  virtio-net: do not reset vlan filtering at set_features

Hawkins Jiawei (3):
  virtio-net: Expose MAX_VLAN
  vdpa: Restore vlan filtering state
  vdpa: Allow VIRTIO_NET_F_CTRL_VLAN in SVQ

 hw/net/virtio-net.c|  6 +
 include/hw/virtio/virtio-net.h |  6 +
 net/vhost-vdpa.c   | 49 ++
 3 files changed, 56 insertions(+), 5 deletions(-)

-- 
2.25.1




Re: [PATCH 3/4] vdpa: Restore vlan filtering state

2023-07-21 Thread Hawkins Jiawei
On 2023/7/21 19:57, Eugenio Perez Martin wrote:
> On Wed, Jul 19, 2023 at 9:48 AM Hawkins Jiawei  wrote:
>>
>> This patch introduces vhost_vdpa_net_load_single_vlan()
>> and vhost_vdpa_net_load_vlan() to restore the vlan
>> filtering state at the device's startup.
>>
>> Co-developed-by: Eugenio Pérez 
>> Signed-off-by: Eugenio Pérez 
>> Signed-off-by: Hawkins Jiawei 
>> ---
>>   net/vhost-vdpa.c | 49 
>>   1 file changed, 49 insertions(+)
>>
>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>> index 9795306742..0787dd933b 100644
>> --- a/net/vhost-vdpa.c
>> +++ b/net/vhost-vdpa.c
>> @@ -965,6 +965,51 @@ static int vhost_vdpa_net_load_rx(VhostVDPAState *s,
>>   return 0;
>>   }
>>
>> +static int vhost_vdpa_net_load_single_vlan(VhostVDPAState *s,
>> +   const VirtIONet *n,
>> +   uint16_t vid)
>> +{
>> +const struct iovec data = {
>> +.iov_base = &vid,
>> +.iov_len = sizeof(vid),
>> +};
>> +ssize_t dev_written = vhost_vdpa_net_load_cmd(s, VIRTIO_NET_CTRL_VLAN,
>> +  VIRTIO_NET_CTRL_VLAN_ADD,
>> +  &data, 1);
>> +if (unlikely(dev_written < 0)) {
>> +return dev_written;
>> +}
>> +if (unlikely(*s->status != VIRTIO_NET_OK)) {
>> +return -EIO;
>> +}
>> +
>> +return 0;
>> +}
>> +
>> +static int vhost_vdpa_net_load_vlan(VhostVDPAState *s,
>> +const VirtIONet *n)
>> +{
>> +int r;
>> +
>> +if (!virtio_vdev_has_feature(&n->parent_obj, VIRTIO_NET_F_CTRL_VLAN)) {
>> +return 0;
>> +}
>> +
>> +for (int i = 0; i < MAX_VLAN >> 5; i++) {
>> +for (int j = 0; n->vlans[i] && j <= 0x1f; j++) {
>> +if (n->vlans[i] & (1U << j)) {
>> +r = vhost_vdpa_net_load_single_vlan(s, n, (i << 5) + j);
>> +if (unlikely(r != 0)) {
>> +return r;
>> +}
>> +}
>> +}
>> +}
>> +
>> +return 0;
>> +}
>> +
>> +
>
> Nit: I'm not sure if it was here originally, but there is an extra newline 
> here.

Hi Eugenio,

It was not here originally; it was introduced by mistake during the
refactoring process.

I will fix this in the v2 version.

Thanks!


>
>>   static int vhost_vdpa_net_load(NetClientState *nc)
>>   {
>>   VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
>> @@ -995,6 +1040,10 @@ static int vhost_vdpa_net_load(NetClientState *nc)
>>   if (unlikely(r)) {
>>   return r;
>>   }
>> +r = vhost_vdpa_net_load_vlan(s, n);
>> +if (unlikely(r)) {
>> +return r;
>> +}
>>
>>   return 0;
>>   }
>> --
>> 2.25.1
>>
>



Re: [PATCH v3 0/8] vdpa: Send all CVQ state load commands in parallel

2023-07-20 Thread Hawkins Jiawei
On 2023/7/20 16:53, Lei Yang wrote:
> Following the steps Hawkins provided, I tested two cases, one with
> this patch series applied and one without it. All tests are based on
> real hardware.
> Case 1, without  this series
> Source: qemu-system-x86_64: vhost_vdpa_net_load() = 23308 us
> Dest: qemu-system-x86_64: vhost_vdpa_net_load() = 23296 us
>
> Case 2, applied  this series
> Source: qemu-system-x86_64: vhost_vdpa_net_load() = 6558 us
> Dest: qemu-system-x86_64: vhost_vdpa_net_load() = 6539 us
>
> Tested-by: Lei Yang 
>
>
> On Thu, Jul 20, 2023 at 6:54 AM Lei Yang  wrote:
>>
>> On Wed, Jul 19, 2023 at 11:25 PM Hawkins Jiawei  wrote:
>>>
>>> On 2023/7/19 20:44, Lei Yang wrote:
>>>> Hello Hawkins and Michael
>>>>
>>>> Looks like there are big changes about vp_vdpa, therefore, if needed,
>>>> QE can test this series in QE's environment before the patch is
>>>
>>> Hi Lei,
>>>
>>> This patch series does not modify the code of vp_vdpa. Instead, it only
>>> modifies how QEMU sends SVQ control commands to the vdpa device.
>>>
>> Hi Hawkins
>>
>>> Considering that the behavior of the vp_vdpa device differs from that
>>> of real vdpa hardware, would it be possible for you to test this patch
>>> series on a real vdpa device?
>>
>> Yes, there is a hardware device to test it , I will update the test
>> results ASAP.
>>
>> BR
>> Lei
>>>
>>> Thanks!
>>>
>>>
>>>> merged, and provide the result.
>>>>
>>>> BR
>>>> Lei
>>>>
>>>>
>>>> On Wed, Jul 19, 2023 at 8:37 PM Hawkins Jiawei  wrote:
>>>>>
>>>>> On 2023/7/19 17:11, Michael S. Tsirkin wrote:
>>>>>> On Wed, Jul 19, 2023 at 03:53:45PM +0800, Hawkins Jiawei wrote:
>>>>>>> This patchset allows QEMU to delay polling and checking the device
>>>>>>> used buffer until either the SVQ is full or control commands shadow
>>>>>>> buffers are full, instead of polling and checking immediately after
>>>>>>> sending each SVQ control command, so that QEMU can send all the SVQ
>>>>>>> control commands in parallel, which have better performance improvement.
>>>>>>>
>>>>>>> I use vp_vdpa device to simulate vdpa device, and create 4094 VLANS in
>>>>>>> guest to build a test environment for sending multiple CVQ state load
>>>>>>> commands. This patch series can improve latency from 10023 us to
>>>>>>> 8697 us for about 4099 CVQ state load commands, about 0.32 us per 
>>>>>>> command.

It appears that the performance timing with the vp_vdpa device in this
method is not consistently stable.

I retested this patch series with the same steps on the vp_vdpa device.

With this patch series, in the majority of cases, the time for CVQ state
load commands is around 8000 us ~ 10000 us.

Without this patch series, in the majority of cases, the time for CVQ
state load commands is around 14000 us ~ 20000 us.

Thanks!


>>>>>>
>>>>>> Looks like a tiny improvement.
>>>>>> At the same time we have O(n^2) behaviour with memory mappings.
>>>>>
>>>>> Hi Michael,
>>>>>
>>>>> Thanks for your review.
>>>>>
>>>>> I wonder why you say "we have O(n^2) behaviour on memory mappings" here?
>>>>>
>>>>> From my understanding, QEMU maps two page-size buffers as control
>>>>> commands shadow buffers at device startup. These buffers then are used
>>>>> to cache SVQ control commands, where QEMU fills them with multiple SVQ 
>>>>> control
>>>>> commands bytes, flushes them when SVQ descriptors are full or these
>>>>> control commands shadow buffers reach their capacity.
>>>>>
>>>>> QEMU repeats this process until all CVQ state load commands have been
>>>>> sent in loading.
>>>>>
>>>>> In this loading process, only control commands shadow buffers
>>>>> translation should be relative to memory mappings, which should be
>>>>> O(log n) behaviour to my understanding(Please correct me if I am wrong).
>>>>>
>>>>>> Not saying we must not do this but I think it's worth
>>>>>> checking where the bottleneck is. My guess would be
>>>>>> vp_vdpa is not doing things in parallel. Want to try fixing that
>>>>>> to see how far it can be pushed?

Re: [PATCH v3 0/8] vdpa: Send all CVQ state load commands in parallel

2023-07-19 Thread Hawkins Jiawei
On 2023/7/19 20:44, Lei Yang wrote:
> Hello Hawkins and Michael
>
> Looks like there are big changes about vp_vdpa, therefore, if needed,
> QE can test this series in QE's environment before the patch is

Hi Lei,

This patch series does not modify the code of vp_vdpa. Instead, it only
modifies how QEMU sends SVQ control commands to the vdpa device.

Considering that the behavior of the vp_vdpa device differs from that
of real vdpa hardware, would it be possible for you to test this patch
series on a real vdpa device?

Thanks!


> merged, and provide the result.
>
> BR
> Lei
>
>
> On Wed, Jul 19, 2023 at 8:37 PM Hawkins Jiawei  wrote:
>>
>> On 2023/7/19 17:11, Michael S. Tsirkin wrote:
>>> On Wed, Jul 19, 2023 at 03:53:45PM +0800, Hawkins Jiawei wrote:
>>>> This patchset allows QEMU to delay polling and checking the device
>>>> used buffer until either the SVQ is full or control commands shadow
>>>> buffers are full, instead of polling and checking immediately after
>>>> sending each SVQ control command, so that QEMU can send all the SVQ
>>>> control commands in parallel, which have better performance improvement.
>>>>
>>>> I use vp_vdpa device to simulate vdpa device, and create 4094 VLANS in
>>>> guest to build a test environment for sending multiple CVQ state load
>>>> commands. This patch series can improve latency from 10023 us to
>>>> 8697 us for about 4099 CVQ state load commands, about 0.32 us per command.
>>>
>>> Looks like a tiny improvement.
>>> At the same time we have O(n^2) behaviour with memory mappings.
>>
>> Hi Michael,
>>
>> Thanks for your review.
>>
>> I wonder why you say "we have O(n^2) behaviour on memory mappings" here?
>>
>>   From my understanding, QEMU maps two page-size buffers as control
>> commands shadow buffers at device startup. These buffers then are used
>> to cache SVQ control commands, where QEMU fills them with multiple SVQ 
>> control
>> commands bytes, flushes them when SVQ descriptors are full or these
>> control commands shadow buffers reach their capacity.
>>
>> QEMU repeats this process until all CVQ state load commands have been
>> sent in loading.
>>
>> In this loading process, only control commands shadow buffers
>> translation should be relative to memory mappings, which should be
>> O(log n) behaviour to my understanding(Please correct me if I am wrong).
>>
>>> Not saying we must not do this but I think it's worth
>>> checking where the bottleneck is. My guess would be
>>> vp_vdpa is not doing things in parallel. Want to try fixing that
>>
>> As for "vp_vdpa is not doing things in parallel.", do you mean
>> the vp_vdpa device cannot process QEMU's SVQ control commands
>> in parallel?
>>
>> In this situation, I will try to use real vdpa hardware to
>> test the patch series performance.
>>
>>> to see how far it can be pushed?
>>
>> Currently, I am involved in the "Add virtio-net Control Virtqueue state
>> restore support" project in Google Summer of Code now. Because I am
>> uncertain about the time it will take to fix that problem in the vp_vdpa
>> device, I prefer to complete the gsoc project first.
>>
>> Thanks!
>>
>>
>>>
>>>
>>>> Note that this patch should be based on
>>>> patch "Vhost-vdpa Shadow Virtqueue VLAN support" at [1].
>>>>
>>>> [1]. https://lists.gnu.org/archive/html/qemu-devel/2023-07/msg03719.html
>>>>
>>>> TestStep
>>>> 
>>>> 1. regression testing using vp-vdpa device
>>>> - For L0 guest, boot QEMU with two virtio-net-pci net device with
>>>> `ctrl_vq`, `ctrl_rx`, `ctrl_rx_extra` features on, command line like:
>>>> -device virtio-net-pci,disable-legacy=on,disable-modern=off,
>>>> iommu_platform=on,mq=on,ctrl_vq=on,guest_announce=off,
>>>> indirect_desc=off,queue_reset=off,ctrl_rx=on,ctrl_rx_extra=on,...
>>>>
>>>> - For L1 guest, apply the patch series and compile the source code,
>>>> start QEMU with two vdpa device with svq mode on, enable the `ctrl_vq`,
>>>> `ctrl_rx`, `ctrl_rx_extra` features on, command line like:
>>>> -netdev type=vhost-vdpa,x-svq=true,...
>>>> -device virtio-net-pci,mq=on,guest_announce=off,ctrl_vq=on,
>>>> ctrl_rx=on,ctrl_rx_extra=on...
>>>>
>>>> - For L2 source guest, run the following bash command:
>

Re: [PATCH v3 0/8] vdpa: Send all CVQ state load commands in parallel

2023-07-19 Thread Hawkins Jiawei
On 2023/7/19 20:46, Michael S. Tsirkin wrote:
> On Wed, Jul 19, 2023 at 08:35:50PM +0800, Hawkins Jiawei wrote:
>> 在 2023/7/19 17:11, Michael S. Tsirkin 写道:
>>> On Wed, Jul 19, 2023 at 03:53:45PM +0800, Hawkins Jiawei wrote:
>>>> This patchset allows QEMU to delay polling and checking the device
>>>> used buffer until either the SVQ is full or control commands shadow
>>>> buffers are full, instead of polling and checking immediately after
>>>> sending each SVQ control command, so that QEMU can send all the SVQ
>>>> control commands in parallel, which have better performance improvement.
>>>>
>>>> I use vp_vdpa device to simulate vdpa device, and create 4094 VLANS in
>>>> guest to build a test environment for sending multiple CVQ state load
>>>> commands. This patch series can improve latency from 10023 us to
>>>> 8697 us for about 4099 CVQ state load commands, about 0.32 us per command.
>>>
>>> Looks like a tiny improvement.
>>> At the same time we have O(n^2) behaviour with memory mappings.
>>
>> Hi Michael,
>>
>> Thanks for your review.
>>
>> I wonder why you say "we have O(n^2) behaviour on memory mappings" here?
>
> it's not specific to virtio - it's related to device init.
> generally each device has some memory. during boot bios
> enables each individually O(n) where n is # of devices.
> memory maps has to be updated and in qemu this update
> is at least superlinear with n (more like O(n log n) I think).
> This gets up > O(n^2) with n number of devices.

Thanks for your explanation.


>
>>   From my understanding, QEMU maps two page-size buffers as control
>> commands shadow buffers at device startup. These buffers then are used
>> to cache SVQ control commands, where QEMU fills them with multiple SVQ 
>> control
>> commands bytes, flushes them when SVQ descriptors are full or these
>> control commands shadow buffers reach their capacity.
>>
>> QEMU repeats this process until all CVQ state load commands have been
>> sent in loading.
>>
>> In this loading process, only control commands shadow buffers
>> translation should be relative to memory mappings, which should be
>> O(log n) behaviour to my understanding(Please correct me if I am wrong).
>>
>>> Not saying we must not do this but I think it's worth
>>> checking where the bottleneck is. My guess would be
>>> vp_vdpa is not doing things in parallel. Want to try fixing that
>>
>> As for "vp_vdpa is not doing things in parallel.", do you mean
>> the vp_vdpa device cannot process QEMU's SVQ control commands
>> in parallel?
>>
>> In this situation, I will try to use real vdpa hardware to
>> test the patch series performance.
>
> yea, pls do that.
>
>>> to see how far it can be pushed?
>>
>> Currently, I am involved in the "Add virtio-net Control Virtqueue state
>> restore support" project in Google Summer of Code now. Because I am
>> uncertain about the time it will take to fix that problem in the vp_vdpa
>> device, I prefer to complete the gsoc project first.
>>

Re: [PATCH v3 0/8] vdpa: Send all CVQ state load commands in parallel

2023-07-19 Thread Hawkins Jiawei
On 2023/7/19 17:11, Michael S. Tsirkin wrote:
> On Wed, Jul 19, 2023 at 03:53:45PM +0800, Hawkins Jiawei wrote:
>> This patchset allows QEMU to delay polling and checking the device's
>> used buffers until either the SVQ is full or the control commands shadow
>> buffers are full, instead of polling and checking immediately after
>> sending each SVQ control command, so that QEMU can send all the SVQ
>> control commands in parallel, which yields a better performance
>> improvement.
>>
>> I used the vp_vdpa device to simulate a vdpa device, and created 4094
>> VLANs in the guest to build a test environment for sending multiple CVQ
>> state load commands. This patch series improves the latency from
>> 10023 us to 8697 us for about 4099 CVQ state load commands, about
>> 0.32 us per command.
>
> Looks like a tiny improvement.
> At the same time we have O(n^2) behaviour with memory mappings.

Hi Michael,

Thanks for your review.

I wonder why you say "we have O(n^2) behaviour with memory mappings" here?

 From my understanding, QEMU maps two page-size buffers as control
commands shadow buffers at device startup. These buffers are then used
to cache SVQ control commands: QEMU fills them with the bytes of
multiple SVQ control commands and flushes them when the SVQ
descriptors are full or the shadow buffers reach their capacity.

QEMU repeats this process until all the CVQ state load commands have
been sent.

In this loading process, only the translation of the control commands
shadow buffers involves the memory mappings, which should be O(log n)
behaviour to my understanding (please correct me if I am wrong).
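
Expressed as a toy program, my mental model of the loading loop looks
like this (a deliberately simplified sketch with made-up sizes and
stubbed-out SVQ helpers, not the actual QEMU code):

```c
#include <stdio.h>
#include <string.h>

#define SHADOW_BUF_LEN 64   /* stands in for the page-size shadow buffer */
#define SVQ_SLOTS      8    /* stands in for the SVQ descriptor capacity */

static char   shadow_buf[SHADOW_BUF_LEN];
static size_t used;         /* bytes of shadow_buf already packed */
static size_t in_flight;    /* commands queued but not yet polled */

/* Toy stand-ins for the real SVQ operations */
static size_t svq_available_slots(void) { return SVQ_SLOTS - 2 * in_flight; }
static void svq_add(const void *b, size_t l) { (void)b; (void)l; in_flight++; }

static void svq_flush_and_check_acks(void)
{
    printf("flush: poll %zu commands, check one ack byte each\n", in_flight);
    used = 0;
    in_flight = 0;
}

/* Pack one CVQ command; flush first if the SVQ or the buffer is full */
static void load_one_cmd(const void *cmd, size_t len)
{
    if (svq_available_slots() < 2 || used + len > SHADOW_BUF_LEN) {
        svq_flush_and_check_acks();
    }
    memcpy(shadow_buf + used, cmd, len);
    svq_add(shadow_buf + used, len);
    used += len;
}

int main(void)
{
    char cmd[24] = { 0 };                /* a fake CVQ command */
    for (int i = 0; i < 10; i++) {
        load_one_cmd(cmd, sizeof(cmd));  /* batches until full, then flushes */
    }
    svq_flush_and_check_acks();          /* drain whatever is left */
    return 0;
}
```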

> Not saying we must not do this but I think it's worth
> checking where the bottleneck is. My guess would be
> vp_vdpa is not doing things in parallel. Want to try fixing that

As for "vp_vdpa is not doing things in parallel", do you mean that
the vp_vdpa device cannot process QEMU's SVQ control commands
in parallel?

If so, I will try to use real vdpa hardware to test the performance
of this patch series.

> to see how far it can be pushed?

Currently, I am involved in the "Add virtio-net Control Virtqueue state
restore support" project in Google Summer of Code. Because I am
uncertain how long it will take to fix that problem in the vp_vdpa
device, I prefer to complete the GSoC project first.

Thanks!



Re: [PATCH 0/4] Vhost-vdpa Shadow Virtqueue VLAN support

2023-07-19 Thread Hawkins Jiawei
On 2023/7/19 15:47, Hawkins Jiawei wrote:
> This series enables the shadowed CVQ to intercept VLAN commands,
> updates the virtio NIC device model so that QEMU sends the VLAN state
> in a migration, and restores that state in the destination.

This patch series is based on
"[PATCH 0/3] Vhost-vdpa Shadow Virtqueue VLAN support" at [1],
with these changes:

  - move `MAX_VLAN` macro to include/hw/virtio/virtio-net.h
instead of net/vhost-vdpa.c
  - fix conflicts with the master branch

Thanks!

[1]. https://lists.gnu.org/archive/html/qemu-devel/2022-09/msg01016.html





[PATCH v3 7/8] vdpa: Add cursors to vhost_vdpa_net_loadx()

2023-07-19 Thread Hawkins Jiawei
This patch adds `out_cursor` and `in_cursor` arguments
to vhost_vdpa_net_loadx().

By making this change, the next patches in this series
can refactor vhost_vdpa_net_load_cmd() to
iterate through the control commands shadow buffers,
allowing QEMU to send CVQ state load commands in parallel
at device startup.

Signed-off-by: Hawkins Jiawei 
---
 net/vhost-vdpa.c | 79 ++--
 1 file changed, 50 insertions(+), 29 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index d06f38403f..795c9c1fd2 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -633,7 +633,8 @@ static uint16_t 
vhost_vdpa_net_svq_available_slots(VhostVDPAState *s)
 return vhost_svq_available_slots(svq);
 }
 
-static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, uint8_t class,
+static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, void **out_cursor,
+   void **in_cursor, uint8_t class,
uint8_t cmd, const struct iovec 
*data_sg,
size_t data_num)
 {
@@ -644,11 +645,11 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, 
uint8_t class,
 size_t data_size = iov_size(data_sg, data_num);
 /* Buffers for the device */
 struct iovec out = {
-.iov_base = s->cvq_cmd_out_buffer,
+.iov_base = *out_cursor,
 .iov_len = sizeof(ctrl) + data_size,
 };
 struct iovec in = {
-.iov_base = s->status,
+.iov_base = *in_cursor,
 .iov_len = sizeof(*s->status),
 };
 ssize_t r;
@@ -658,11 +659,11 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, 
uint8_t class,
 assert(vhost_vdpa_net_svq_available_slots(s) >= 2);
 
 /* pack the CVQ command header */
-memcpy(s->cvq_cmd_out_buffer, &ctrl, sizeof(ctrl));
+memcpy(out.iov_base, &ctrl, sizeof(ctrl));
 
 /* pack the CVQ command command-specific-data */
 iov_to_buf(data_sg, data_num, 0,
-   s->cvq_cmd_out_buffer + sizeof(ctrl), data_size);
+   out.iov_base + sizeof(ctrl), data_size);
 
 r = vhost_vdpa_net_cvq_add(s, &out, 1, &in, 1);
 if (unlikely(r < 0)) {
@@ -676,14 +677,16 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, 
uint8_t class,
 return vhost_vdpa_net_svq_poll(s, 1);
 }
 
-static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet *n)
+static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet *n,
+   void **out_cursor, void **in_cursor)
 {
 if (virtio_vdev_has_feature(&n->parent_obj, VIRTIO_NET_F_CTRL_MAC_ADDR)) {
 const struct iovec data = {
 .iov_base = (void *)n->mac,
 .iov_len = sizeof(n->mac),
 };
-ssize_t dev_written = vhost_vdpa_net_load_cmd(s, VIRTIO_NET_CTRL_MAC,
+ssize_t dev_written = vhost_vdpa_net_load_cmd(s, out_cursor, in_cursor,
+  VIRTIO_NET_CTRL_MAC,
   VIRTIO_NET_CTRL_MAC_ADDR_SET,
   &data, 1);
 if (unlikely(dev_written < 0)) {
@@ -735,7 +738,7 @@ static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const 
VirtIONet *n)
 .iov_len = mul_macs_size,
 },
 };
-ssize_t dev_written = vhost_vdpa_net_load_cmd(s,
+ssize_t dev_written = vhost_vdpa_net_load_cmd(s, out_cursor, in_cursor,
 VIRTIO_NET_CTRL_MAC,
 VIRTIO_NET_CTRL_MAC_TABLE_SET,
 data, ARRAY_SIZE(data));
@@ -750,7 +753,8 @@ static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const 
VirtIONet *n)
 }
 
 static int vhost_vdpa_net_load_mq(VhostVDPAState *s,
-  const VirtIONet *n)
+  const VirtIONet *n,
+  void **out_cursor, void **in_cursor)
 {
 struct virtio_net_ctrl_mq mq;
 ssize_t dev_written;
@@ -764,7 +768,8 @@ static int vhost_vdpa_net_load_mq(VhostVDPAState *s,
 .iov_base = &mq,
 .iov_len = sizeof(mq),
 };
-dev_written = vhost_vdpa_net_load_cmd(s, VIRTIO_NET_CTRL_MQ,
+dev_written = vhost_vdpa_net_load_cmd(s, out_cursor, in_cursor,
+  VIRTIO_NET_CTRL_MQ,
   VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET,
   &mq, 1);
 if (unlikely(dev_written < 0)) {
@@ -778,7 +783,8 @@ static int vhost_vdpa_net_load_mq(VhostVDPAState *s,
 }
 
 static int vhost_vdpa_net_load_offloads(VhostVDPAState *s,
-const VirtIONet *n)
+const VirtIONet *n,
+void **out_cursor, void **in_cursor)
 {
 uint64_t offloads;
 ssize_t dev_written;
@@ -809,7 +815,8 @@ sta

[PATCH v3 5/8] vdpa: Check device ack in vhost_vdpa_net_load_rx_mode()

2023-07-19 Thread Hawkins Jiawei
Considering that vhost_vdpa_net_load_rx_mode() is only called
within vhost_vdpa_net_load_rx() now, this patch refactors
vhost_vdpa_net_load_rx_mode() to include a check for the
device's ack, simplifying the code and improving its maintainability.

Signed-off-by: Hawkins Jiawei 
---
 net/vhost-vdpa.c | 76 
 1 file changed, 31 insertions(+), 45 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index ae8f59adaa..fe0ba19724 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -814,14 +814,24 @@ static int vhost_vdpa_net_load_rx_mode(VhostVDPAState *s,
 .iov_base = &on,
 .iov_len = sizeof(on),
 };
-return vhost_vdpa_net_load_cmd(s, VIRTIO_NET_CTRL_RX,
-   cmd, &data, 1);
+ssize_t dev_written;
+
+dev_written = vhost_vdpa_net_load_cmd(s, VIRTIO_NET_CTRL_RX,
+  cmd, &data, 1);
+if (unlikely(dev_written < 0)) {
+return dev_written;
+}
+if (*s->status != VIRTIO_NET_OK) {
+return -EIO;
+}
+
+return 0;
 }
 
 static int vhost_vdpa_net_load_rx(VhostVDPAState *s,
   const VirtIONet *n)
 {
-ssize_t dev_written;
+ssize_t r;
 
 if (!virtio_vdev_has_feature(>parent_obj, VIRTIO_NET_F_CTRL_RX)) {
 return 0;
@@ -846,13 +856,9 @@ static int vhost_vdpa_net_load_rx(VhostVDPAState *s,
  * configuration only at live migration.
  */
 if (!n->mac_table.uni_overflow && !n->promisc) {
-dev_written = vhost_vdpa_net_load_rx_mode(s,
-VIRTIO_NET_CTRL_RX_PROMISC, 0);
-if (unlikely(dev_written < 0)) {
-return dev_written;
-}
-if (*s->status != VIRTIO_NET_OK) {
-return -EIO;
+r = vhost_vdpa_net_load_rx_mode(s, VIRTIO_NET_CTRL_RX_PROMISC, 0);
+if (unlikely(r < 0)) {
+return r;
 }
 }
 
@@ -874,13 +880,9 @@ static int vhost_vdpa_net_load_rx(VhostVDPAState *s,
  * configuration only at live migration.
  */
 if (n->mac_table.multi_overflow || n->allmulti) {
-dev_written = vhost_vdpa_net_load_rx_mode(s,
-VIRTIO_NET_CTRL_RX_ALLMULTI, 1);
-if (unlikely(dev_written < 0)) {
-return dev_written;
-}
-if (*s->status != VIRTIO_NET_OK) {
-return -EIO;
+r = vhost_vdpa_net_load_rx_mode(s, VIRTIO_NET_CTRL_RX_ALLMULTI, 1);
+if (unlikely(r < 0)) {
+return r;
 }
 }
 
@@ -899,13 +901,9 @@ static int vhost_vdpa_net_load_rx(VhostVDPAState *s,
  * configuration only at live migration.
  */
 if (n->alluni) {
-dev_written = vhost_vdpa_net_load_rx_mode(s,
-VIRTIO_NET_CTRL_RX_ALLUNI, 1);
-if (dev_written < 0) {
-return dev_written;
-}
-if (*s->status != VIRTIO_NET_OK) {
-return -EIO;
+r = vhost_vdpa_net_load_rx_mode(s, VIRTIO_NET_CTRL_RX_ALLUNI, 1);
+if (r < 0) {
+return r;
 }
 }
 
@@ -920,13 +918,9 @@ static int vhost_vdpa_net_load_rx(VhostVDPAState *s,
  * configuration only at live migration.
  */
 if (n->nomulti) {
-dev_written = vhost_vdpa_net_load_rx_mode(s,
-VIRTIO_NET_CTRL_RX_NOMULTI, 1);
-if (dev_written < 0) {
-return dev_written;
-}
-if (*s->status != VIRTIO_NET_OK) {
-return -EIO;
+r = vhost_vdpa_net_load_rx_mode(s, VIRTIO_NET_CTRL_RX_NOMULTI, 1);
+if (r < 0) {
+return r;
 }
 }
 
@@ -941,13 +935,9 @@ static int vhost_vdpa_net_load_rx(VhostVDPAState *s,
  * configuration only at live migration.
  */
 if (n->nouni) {
-dev_written = vhost_vdpa_net_load_rx_mode(s,
-VIRTIO_NET_CTRL_RX_NOUNI, 1);
-if (dev_written < 0) {
-return dev_written;
-}
-if (*s->status != VIRTIO_NET_OK) {
-return -EIO;
+r = vhost_vdpa_net_load_rx_mode(s, VIRTIO_NET_CTRL_RX_NOUNI, 1);
+if (r < 0) {
+return r;
 }
 }
 
@@ -962,13 +952,9 @@ static int vhost_vdpa_net_load_rx(VhostVDPAState *s,
  * configuration only at live migration.
  */
 if (n->nobcast) {
-dev_written = vhost_vdpa_net_load_rx_mode(s,
-VIRTIO_NET_CTRL_RX_NOBCAST, 1);
-if (dev_written < 0) {
-return dev_written;
-}
-if (*s->status != VIRTIO_NET_OK) {
-return -EIO;
+r = vhost_vdpa_net_load_rx_mode(s, VIRTIO_NET_CTRL_RX_NOBCAST, 1);
+if (r < 0) {
+return r;
 }
 }
 
-- 
2.25.1




[PATCH v3 6/8] vdpa: Move vhost_svq_poll() to the caller of vhost_vdpa_net_cvq_add()

2023-07-19 Thread Hawkins Jiawei
This patch moves vhost_svq_poll() to the caller of
vhost_vdpa_net_cvq_add() and introduces a helper function.

By making this change, the next patches in this series are
able to refactor vhost_vdpa_net_load_x() to delay
the polling and checking process until either the SVQ
is full or the control commands shadow buffers are full.

Signed-off-by: Hawkins Jiawei 
---
 net/vhost-vdpa.c | 50 ++--
 1 file changed, 40 insertions(+), 10 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index fe0ba19724..d06f38403f 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -609,15 +609,21 @@ static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s,
 qemu_log_mask(LOG_GUEST_ERROR, "%s: No space on device queue\n",
   __func__);
 }
-return r;
 }
 
-/*
- * We can poll here since we've had BQL from the time we sent the
- * descriptor. Also, we need to take the answer before SVQ pulls by itself,
- * when BQL is released
- */
-return vhost_svq_poll(svq, 1);
+return r;
+}
+
+/*
+ * Convenience wrapper to poll SVQ for multiple control commands.
+ *
+ * Caller should hold the BQL when invoking this function, and should take
+ * the answer before SVQ pulls by itself when BQL is released.
+ */
+static ssize_t vhost_vdpa_net_svq_poll(VhostVDPAState *s, size_t 
cmds_in_flight)
+{
+VhostShadowVirtqueue *svq = g_ptr_array_index(s->vhost_vdpa.shadow_vqs, 0);
+return vhost_svq_poll(svq, cmds_in_flight);
 }
 
 /* Convenience wrapper to get number of available SVQ descriptors */
@@ -645,6 +651,7 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, 
uint8_t class,
 .iov_base = s->status,
 .iov_len = sizeof(*s->status),
 };
+ssize_t r;
 
 assert(data_size < vhost_vdpa_net_cvq_cmd_page_len() - sizeof(ctrl));
 /* Each CVQ command has one out descriptor and one in descriptor */
@@ -657,7 +664,16 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, 
uint8_t class,
 iov_to_buf(data_sg, data_num, 0,
s->cvq_cmd_out_buffer + sizeof(ctrl), data_size);
 
-return vhost_vdpa_net_cvq_add(s, &out, 1, &in, 1);
+r = vhost_vdpa_net_cvq_add(s, , 1, , 1);
+if (unlikely(r < 0)) {
+return r;
+}
+
+/*
+ * We can poll here since we've had BQL from the time
+ * we sent the descriptor.
+ */
+return vhost_vdpa_net_svq_poll(s, 1);
 }
 
 static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet *n)
@@ -1152,6 +1168,12 @@ static int 
vhost_vdpa_net_excessive_mac_filter_cvq_add(VhostVDPAState *s,
 if (unlikely(r < 0)) {
 return r;
 }
+
+/*
+ * We can poll here since we've had BQL from the time
+ * we sent the descriptor.
+ */
+vhost_vdpa_net_svq_poll(s, 1);
 if (*s->status != VIRTIO_NET_OK) {
 return sizeof(*s->status);
 }
@@ -1266,10 +1288,18 @@ static int 
vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
 goto out;
 }
 } else {
-dev_written = vhost_vdpa_net_cvq_add(s, &out, 1, &in, 1);
-if (unlikely(dev_written < 0)) {
+ssize_t r;
+r = vhost_vdpa_net_cvq_add(s, &out, 1, &in, 1);
+if (unlikely(r < 0)) {
+dev_written = r;
 goto out;
 }
+
+/*
+ * We can poll here since we've had BQL from the time
+ * we sent the descriptor.
+ */
+dev_written = vhost_vdpa_net_svq_poll(s, 1);
 }
 
 if (unlikely(dev_written < sizeof(status))) {
-- 
2.25.1




[PATCH v3 3/8] vhost: Expose vhost_svq_available_slots()

2023-07-19 Thread Hawkins Jiawei
Next patches in this series will delay the polling
and checking of buffers until either the SVQ is
full or the control commands shadow buffers are full,
instead of performing an immediate poll and check of
the device's used buffers for each CVQ state load command.

To achieve this, this patch exposes
vhost_svq_available_slots() and introduces a helper function,
allowing QEMU to know whether the SVQ is full.

Signed-off-by: Hawkins Jiawei 
---
 hw/virtio/vhost-shadow-virtqueue.c | 2 +-
 hw/virtio/vhost-shadow-virtqueue.h | 1 +
 net/vhost-vdpa.c   | 9 +
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c 
b/hw/virtio/vhost-shadow-virtqueue.c
index e731b1d2ea..fc5f408f77 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -66,7 +66,7 @@ bool vhost_svq_valid_features(uint64_t features, Error **errp)
  *
  * @svq: The svq
  */
-static uint16_t vhost_svq_available_slots(const VhostShadowVirtqueue *svq)
+uint16_t vhost_svq_available_slots(const VhostShadowVirtqueue *svq)
 {
 return svq->num_free;
 }
diff --git a/hw/virtio/vhost-shadow-virtqueue.h 
b/hw/virtio/vhost-shadow-virtqueue.h
index 5bce67837b..19c842a15b 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -114,6 +114,7 @@ typedef struct VhostShadowVirtqueue {
 
 bool vhost_svq_valid_features(uint64_t features, Error **errp);
 
+uint16_t vhost_svq_available_slots(const VhostShadowVirtqueue *svq);
 void vhost_svq_push_elem(VhostShadowVirtqueue *svq,
  const VirtQueueElement *elem, uint32_t len);
 int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 6b16c8ece0..dd71008e08 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -620,6 +620,13 @@ static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s,
 return vhost_svq_poll(svq, 1);
 }
 
+/* Convenience wrapper to get number of available SVQ descriptors */
+static uint16_t vhost_vdpa_net_svq_available_slots(VhostVDPAState *s)
+{
+VhostShadowVirtqueue *svq = g_ptr_array_index(s->vhost_vdpa.shadow_vqs, 0);
+return vhost_svq_available_slots(svq);
+}
+
 static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, uint8_t class,
uint8_t cmd, const struct iovec 
*data_sg,
size_t data_num)
@@ -640,6 +647,8 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, 
uint8_t class,
 };
 
 assert(data_size < vhost_vdpa_net_cvq_cmd_page_len() - sizeof(ctrl));
+/* Each CVQ command has one out descriptor and one in descriptor */
+assert(vhost_vdpa_net_svq_available_slots(s) >= 2);
 
 /* pack the CVQ command header */
 memcpy(s->cvq_cmd_out_buffer, &ctrl, sizeof(ctrl));
-- 
2.25.1




[PATCH v3 8/8] vdpa: Send cvq state load commands in parallel

2023-07-19 Thread Hawkins Jiawei
This patch enables sending CVQ state load commands
in parallel at device startup by the following steps:

  * Refactor vhost_vdpa_net_load_cmd() to iterate through
the control commands shadow buffers. This allows different
CVQ state load commands to use their own unique buffers.

  * Delay the polling and checking of buffers until either
the SVQ is full or control commands shadow buffers are full.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1578
Signed-off-by: Hawkins Jiawei 
---
 net/vhost-vdpa.c | 157 +--
 1 file changed, 96 insertions(+), 61 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 795c9c1fd2..1ebb58f7f6 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -633,6 +633,26 @@ static uint16_t 
vhost_vdpa_net_svq_available_slots(VhostVDPAState *s)
 return vhost_svq_available_slots(svq);
 }
 
+/*
+ * Poll SVQ for multiple pending control commands and check the device's ack.
+ *
+ * Caller should hold the BQL when invoking this function.
+ */
+static ssize_t vhost_vdpa_net_svq_flush(VhostVDPAState *s,
+size_t cmds_in_flight)
+{
+vhost_vdpa_net_svq_poll(s, cmds_in_flight);
+
+/* The device must use only one byte to ack each control command */
+assert(cmds_in_flight < vhost_vdpa_net_cvq_cmd_page_len());
+for (int i = 0; i < cmds_in_flight; ++i) {
+if (s->status[i] != VIRTIO_NET_OK) {
+return -EIO;
+}
+}
+return 0;
+}
+
 static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, void **out_cursor,
void **in_cursor, uint8_t class,
uint8_t cmd, const struct iovec 
*data_sg,
@@ -642,19 +662,41 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, 
void **out_cursor,
 .class = class,
 .cmd = cmd,
 };
-size_t data_size = iov_size(data_sg, data_num);
+size_t data_size = iov_size(data_sg, data_num),
+   left_bytes = vhost_vdpa_net_cvq_cmd_page_len() -
+(*out_cursor - s->cvq_cmd_out_buffer);
 /* Buffers for the device */
 struct iovec out = {
-.iov_base = *out_cursor,
 .iov_len = sizeof(ctrl) + data_size,
 };
 struct iovec in = {
-.iov_base = *in_cursor,
 .iov_len = sizeof(*s->status),
 };
 ssize_t r;
 
-assert(data_size < vhost_vdpa_net_cvq_cmd_page_len() - sizeof(ctrl));
+if (sizeof(ctrl) > left_bytes || data_size > left_bytes - sizeof(ctrl) ||
+vhost_vdpa_net_svq_available_slots(s) < 2) {
+/*
+ * It is time to flush all pending control commands if SVQ is full
+ * or control commands shadow buffers are full.
+ *
+ * We can poll here since we've had BQL from the time
+ * we sent the descriptor.
+ */
+r = vhost_vdpa_net_svq_flush(s, *in_cursor - (void *)s->status);
+if (unlikely(r < 0)) {
+return r;
+}
+
+*out_cursor = s->cvq_cmd_out_buffer;
+*in_cursor = s->status;
+left_bytes = vhost_vdpa_net_cvq_cmd_page_len();
+}
+
+out.iov_base = *out_cursor;
+in.iov_base = *in_cursor;
+
+assert(data_size <= left_bytes - sizeof(ctrl));
 /* Each CVQ command has one out descriptor and one in descriptor */
 assert(vhost_vdpa_net_svq_available_slots(s) >= 2);
 
@@ -670,11 +712,11 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, 
void **out_cursor,
 return r;
 }
 
-/*
- * We can poll here since we've had BQL from the time
- * we sent the descriptor.
- */
-return vhost_vdpa_net_svq_poll(s, 1);
+/* iterate the cursors */
+*out_cursor += out.iov_len;
+*in_cursor += in.iov_len;
+
+return 0;
 }
 
 static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet *n,
@@ -685,15 +727,12 @@ static int vhost_vdpa_net_load_mac(VhostVDPAState *s, 
const VirtIONet *n,
 .iov_base = (void *)n->mac,
 .iov_len = sizeof(n->mac),
 };
-ssize_t dev_written = vhost_vdpa_net_load_cmd(s, out_cursor, in_cursor,
-  VIRTIO_NET_CTRL_MAC,
-  VIRTIO_NET_CTRL_MAC_ADDR_SET,
-  &data, 1);
-if (unlikely(dev_written < 0)) {
-return dev_written;
-}
-if (*s->status != VIRTIO_NET_OK) {
-return -EIO;
+ssize_t r = vhost_vdpa_net_load_cmd(s, out_cursor, in_cursor,
+   VIRTIO_NET_CTRL_MAC,
+   VIRTIO_NET_CTRL_MAC_ADDR_SET,
+   &data, 1);
+if (unlikely(r < 0)) {
+return r;
 }
 }
 
@@ -738,15 +777,12 @@ static int vh

[PATCH v3 4/8] vdpa: Avoid using vhost_vdpa_net_load_*() outside vhost_vdpa_net_load()

2023-07-19 Thread Hawkins Jiawei
Next patches in this series will refactor vhost_vdpa_net_load_cmd()
to iterate through the control commands shadow buffers, allowing QEMU
to send CVQ state load commands in parallel at device startup.

Considering that QEMU always forwards these CVQ commands one at a time
(serialized) outside of vhost_vdpa_net_load(), it is more elegant to
send the CVQ commands directly without invoking the
vhost_vdpa_net_load_*() helpers.

Signed-off-by: Hawkins Jiawei 
---
 net/vhost-vdpa.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index dd71008e08..ae8f59adaa 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -1098,12 +1098,14 @@ static NetClientInfo net_vhost_vdpa_cvq_info = {
  */
 static int vhost_vdpa_net_excessive_mac_filter_cvq_add(VhostVDPAState *s,
VirtQueueElement *elem,
-   struct iovec *out)
+   struct iovec *out,
+   struct iovec *in)
 {
 struct virtio_net_ctrl_mac mac_data, *mac_ptr;
 struct virtio_net_ctrl_hdr *hdr_ptr;
 uint32_t cursor;
 ssize_t r;
+uint8_t on = 1;
 
 /* parse the non-multicast MAC address entries from CVQ command */
 cursor = sizeof(*hdr_ptr);
@@ -1151,7 +1153,16 @@ static int 
vhost_vdpa_net_excessive_mac_filter_cvq_add(VhostVDPAState *s,
  * filter table to the vdpa device, it should send the
  * VIRTIO_NET_CTRL_RX_PROMISC CVQ command to enable promiscuous mode
  */
-r = vhost_vdpa_net_load_rx_mode(s, VIRTIO_NET_CTRL_RX_PROMISC, 1);
+cursor = 0;
+hdr_ptr = out->iov_base;
+out->iov_len = sizeof(*hdr_ptr) + sizeof(on);
+assert(out->iov_len < vhost_vdpa_net_cvq_cmd_page_len());
+
+hdr_ptr->class = VIRTIO_NET_CTRL_RX;
+hdr_ptr->cmd = VIRTIO_NET_CTRL_RX_PROMISC;
+cursor += sizeof(*hdr_ptr);
+*(uint8_t *)(out->iov_base + cursor) = on;
+r = vhost_vdpa_net_cvq_add(s, out, 1, in, 1);
 if (unlikely(r < 0)) {
 return r;
 }
@@ -1264,7 +1275,7 @@ static int 
vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
 * the CVQ command directly.
  */
 dev_written = vhost_vdpa_net_excessive_mac_filter_cvq_add(s, elem,
-  &out);
+  &out, &in);
 if (unlikely(dev_written < 0)) {
 goto out;
 }
-- 
2.25.1




[PATCH v3 2/8] vdpa: Use iovec for vhost_vdpa_net_cvq_add()

2023-07-19 Thread Hawkins Jiawei
Next patches in this series will no longer perform an
immediate poll and check of the device's used buffers
for each CVQ state load command. Consequently, there
will be multiple pending buffers in the shadow VirtQueue,
making it a must for every control command to have its
own buffer.

To achieve this, this patch refactors vhost_vdpa_net_cvq_add()
to accept `struct iovec`, which eliminates the coupling of
control commands to `s->cvq_cmd_out_buffer` and `s->status`,
allowing them to use their own buffers.

Signed-off-by: Hawkins Jiawei 
---
 net/vhost-vdpa.c | 38 --
 1 file changed, 20 insertions(+), 18 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index d1dd140bf6..6b16c8ece0 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -596,22 +596,14 @@ static void vhost_vdpa_net_cvq_stop(NetClientState *nc)
 vhost_vdpa_net_client_stop(nc);
 }
 
-static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s, size_t out_len,
-  size_t in_len)
+static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s,
+  struct iovec *out_sg, size_t out_num,
+  struct iovec *in_sg, size_t in_num)
 {
-/* Buffers for the device */
-const struct iovec out = {
-.iov_base = s->cvq_cmd_out_buffer,
-.iov_len = out_len,
-};
-const struct iovec in = {
-.iov_base = s->status,
-.iov_len = sizeof(virtio_net_ctrl_ack),
-};
 VhostShadowVirtqueue *svq = g_ptr_array_index(s->vhost_vdpa.shadow_vqs, 0);
 int r;
 
-r = vhost_svq_add(svq, &out, 1, &in, 1, NULL);
+r = vhost_svq_add(svq, out_sg, out_num, in_sg, in_num, NULL);
 if (unlikely(r != 0)) {
 if (unlikely(r == -ENOSPC)) {
 qemu_log_mask(LOG_GUEST_ERROR, "%s: No space on device queue\n",
@@ -637,6 +629,15 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, 
uint8_t class,
 .cmd = cmd,
 };
 size_t data_size = iov_size(data_sg, data_num);
+/* Buffers for the device */
+struct iovec out = {
+.iov_base = s->cvq_cmd_out_buffer,
+.iov_len = sizeof(ctrl) + data_size,
+};
+struct iovec in = {
+.iov_base = s->status,
+.iov_len = sizeof(*s->status),
+};
 
 assert(data_size < vhost_vdpa_net_cvq_cmd_page_len() - sizeof(ctrl));
 
@@ -647,8 +648,7 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, 
uint8_t class,
 iov_to_buf(data_sg, data_num, 0,
s->cvq_cmd_out_buffer + sizeof(ctrl), data_size);
 
-return vhost_vdpa_net_cvq_add(s, data_size + sizeof(ctrl),
-  sizeof(virtio_net_ctrl_ack));
+return vhost_vdpa_net_cvq_add(s, &out, 1, &in, 1);
 }
 
 static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet *n)
@@ -1222,9 +1222,7 @@ static int 
vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
 struct iovec out = {
 .iov_base = s->cvq_cmd_out_buffer,
 };
-/* in buffer used for device model */
-const struct iovec in = {
-.iov_base = &status,
+struct iovec in = {
 .iov_len = sizeof(status),
 };
 ssize_t dev_written = -EINVAL;
@@ -1232,6 +1230,8 @@ static int 
vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
 out.iov_len = iov_to_buf(elem->out_sg, elem->out_num, 0,
  s->cvq_cmd_out_buffer,
  vhost_vdpa_net_cvq_cmd_page_len());
+/* In buffer used for the vdpa device */
+in.iov_base = s->status;
 
 ctrl = s->cvq_cmd_out_buffer;
 if (ctrl->class == VIRTIO_NET_CTRL_ANNOUNCE) {
@@ -1260,7 +1260,7 @@ static int 
vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
 goto out;
 }
 } else {
-dev_written = vhost_vdpa_net_cvq_add(s, out.iov_len, sizeof(status));
+dev_written = vhost_vdpa_net_cvq_add(s, &out, 1, &in, 1);
 if (unlikely(dev_written < 0)) {
 goto out;
 }
@@ -1276,6 +1276,8 @@ static int 
vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
 }
 
 status = VIRTIO_NET_ERR;
+/* In buffer used for the device model */
+in.iov_base = &status;
 virtio_net_handle_ctrl_iov(svq->vdev, &in, 1, &out, 1);
 if (status != VIRTIO_NET_OK) {
 error_report("Bad CVQ processing in model");
-- 
2.25.1




[PATCH v3 1/8] vhost: Add argument to vhost_svq_poll()

2023-07-19 Thread Hawkins Jiawei
Next patches in this series will no longer perform an
immediate poll and check of the device's used buffers
for each CVQ state load command. Instead, they will
send CVQ state load commands in parallel by polling
multiple pending buffers at once.

To achieve this, this patch refactors vhost_svq_poll()
to accept a new argument `num`, which allows vhost_svq_poll()
to wait for the device to use multiple elements,
rather than polling for a single element.

Signed-off-by: Hawkins Jiawei 
---
 hw/virtio/vhost-shadow-virtqueue.c | 36 ++
 hw/virtio/vhost-shadow-virtqueue.h |  2 +-
 net/vhost-vdpa.c   |  2 +-
 3 files changed, 24 insertions(+), 16 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c 
b/hw/virtio/vhost-shadow-virtqueue.c
index 49e5aed931..e731b1d2ea 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -514,29 +514,37 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
 }
 
 /**
- * Poll the SVQ for one device used buffer.
+ * Poll the SVQ to wait for the device to use the specified number
+ * of elements and return the total length written by the device.
  *
  * This function race with main event loop SVQ polling, so extra
  * synchronization is needed.
  *
- * Return the length written by the device.
+ * @svq: The svq
+ * @num: The number of elements that need to be used
  */
-size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
+size_t vhost_svq_poll(VhostShadowVirtqueue *svq, size_t num)
 {
-int64_t start_us = g_get_monotonic_time();
-uint32_t len = 0;
+size_t len = 0;
+uint32_t r;
 
-do {
-if (vhost_svq_more_used(svq)) {
-break;
-}
+while (num--) {
+int64_t start_us = g_get_monotonic_time();
 
-if (unlikely(g_get_monotonic_time() - start_us > 10e6)) {
-return 0;
-}
-} while (true);
+do {
+if (vhost_svq_more_used(svq)) {
+break;
+}
+
+if (unlikely(g_get_monotonic_time() - start_us > 10e6)) {
+return len;
+}
+} while (true);
+
+vhost_svq_get_buf(svq, &r);
+len += r;
+}
 
-vhost_svq_get_buf(svq, &len);
 return len;
 }
 
diff --git a/hw/virtio/vhost-shadow-virtqueue.h 
b/hw/virtio/vhost-shadow-virtqueue.h
index 6efe051a70..5bce67837b 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -119,7 +119,7 @@ void vhost_svq_push_elem(VhostShadowVirtqueue *svq,
 int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
   size_t out_num, const struct iovec *in_sg, size_t in_num,
   VirtQueueElement *elem);
-size_t vhost_svq_poll(VhostShadowVirtqueue *svq);
+size_t vhost_svq_poll(VhostShadowVirtqueue *svq, size_t num);
 
 void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd);
 void vhost_svq_set_svq_call_fd(VhostShadowVirtqueue *svq, int call_fd);
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index dfd271c456..d1dd140bf6 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -625,7 +625,7 @@ static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s, 
size_t out_len,
  * descriptor. Also, we need to take the answer before SVQ pulls by itself,
  * when BQL is released
  */
-return vhost_svq_poll(svq);
+return vhost_svq_poll(svq, 1);
 }
 
 static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, uint8_t class,
-- 
2.25.1




[PATCH v3 0/8] vdpa: Send all CVQ state load commands in parallel

2023-07-19 Thread Hawkins Jiawei
This patchset allows QEMU to delay polling and checking the device's
used buffers until either the SVQ is full or the control commands shadow
buffers are full, instead of polling and checking immediately after
sending each SVQ control command, so that QEMU can send all the SVQ
control commands in parallel, which yields a better performance
improvement.

I used the vp_vdpa device to simulate a vdpa device, and created 4094
VLANs in the guest to build a test environment for sending multiple CVQ
state load commands. This patch series improves the latency from
10023 us to 8697 us for about 4099 CVQ state load commands, about
0.32 us per command.
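
(That is, (10023 - 8697) us saved over ~4099 commands, which works out
to roughly 0.32 us per command.)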

Note that this patch should be based on
patch "Vhost-vdpa Shadow Virtqueue VLAN support" at [1].

[1]. https://lists.gnu.org/archive/html/qemu-devel/2023-07/msg03719.html

TestStep

1. regression testing using vp-vdpa device
  - For L0 guest, boot QEMU with two virtio-net-pci net devices with the
`ctrl_vq`, `ctrl_rx`, `ctrl_rx_extra` features on, command line like:
  -device virtio-net-pci,disable-legacy=on,disable-modern=off,
iommu_platform=on,mq=on,ctrl_vq=on,guest_announce=off,
indirect_desc=off,queue_reset=off,ctrl_rx=on,ctrl_rx_extra=on,...

  - For L1 guest, apply the patch series and compile the source code,
start QEMU with two vdpa devices with svq mode on and the `ctrl_vq`,
`ctrl_rx`, `ctrl_rx_extra` features on, command line like:
  -netdev type=vhost-vdpa,x-svq=true,...
  -device virtio-net-pci,mq=on,guest_announce=off,ctrl_vq=on,
ctrl_rx=on,ctrl_rx_extra=on...

  - For L2 source guest, run the following bash command:
```bash
#!/bin/sh

for idx1 in {0..9}
do
  for idx2 in {0..9}
  do
for idx3 in {0..6}
do
  ip link add macvlan$idx1$idx2$idx3 link eth0
address 4a:30:10:19:$idx1$idx2:1$idx3 type macvlan mode bridge
  ip link set macvlan$idx1$idx2$idx3 up
done
  done
done
```
  - Execute the live migration in L2 source monitor

  - Result
* with this series, QEMU should not trigger any error or warning.



2. perf using vp-vdpa device
  - For L0 guest, boot QEMU with two virtio-net-pci net devices with the
`ctrl_vq`, `ctrl_vlan` features on, command line like:
  -device virtio-net-pci,disable-legacy=on,disable-modern=off,
iommu_platform=on,mq=on,ctrl_vq=on,guest_announce=off,
indirect_desc=off,queue_reset=off,ctrl_vlan=on,...

  - For L1 guest, apply the patch series, then apply an additional
patch to record the load time in microseconds as follows:
```diff
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 6b958d6363..501b510fd2 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -295,7 +295,10 @@ static int vhost_net_start_one(struct vhost_net *net,
 }
 
 if (net->nc->info->load) {
+int64_t start_us = g_get_monotonic_time();
 r = net->nc->info->load(net->nc);
+error_report("vhost_vdpa_net_load() = %ld us",
+ g_get_monotonic_time() - start_us);
 if (r < 0) {
 goto fail;
 }
```

  - For L1 guest, compile the code, and start QEMU with two vdpa devices
with svq mode on and the `ctrl_vq`, `ctrl_vlan` features on,
command line like:
  -netdev type=vhost-vdpa,x-svq=true,...
  -device virtio-net-pci,mq=on,guest_announce=off,ctrl_vq=on,
ctrl_vlan=on...

  - For L2 source guest, run the following bash command:
```bash
#!/bin/sh

for idx in {1..4094}
do
  ip link add link eth0 name vlan$idx type vlan id $idx
done
```

  - wait for some time, then execute the live migration in L2 source monitor

  - Result
* with this series, QEMU should not trigger any warning
or error except something like "vhost_vdpa_net_load() = 8697 us"
* without this series, QEMU should not trigger any warning
or error except something like "vhost_vdpa_net_load() = 10023 us"

ChangeLog
=
v3:
  - refactor vhost_svq_poll() to accept cmds_in_flight,
suggested by Jason and Eugenio
  - refactor vhost_vdpa_net_cvq_add() so that the control commands buffers
are not tied to `s->cvq_cmd_out_buffer` and `s->status` and can be reused,
suggested by Eugenio
  - poll and check when the SVQ is full or the control commands shadow
buffers are full

v2: https://lore.kernel.org/all/cover.1683371965.git.yin31...@gmail.com/
  - recover accidentally deleted rows
  - remove extra newline
  - refactor `need_poll_len` to `cmds_in_flight`
  - return -EINVAL when vhost_svq_poll() return 0 or check
on buffers written by device fails
  - change the type of `in_cursor`, and refactor the
code for updating cursor
  - return directly when vhost_vdpa_net_load_{mac,mq}()
returns a failure in vhost_vdpa_net_load()

v1: https://lore.kernel.org/all/cover.1681732982.git.yin31...@gmail.com/

Hawkins Jiawei (8):
  vhost: Add argument to vhost_svq_poll()
  vdpa: Use iovec for vhost_vdpa_net_cvq_add()
  vhost: Expose vhost_svq_available_slots()
  vdpa: Avoid using vhost_vdpa_net_load_*() outside
vhost_vdpa_net_load()
  vdpa: Check d

[PATCH 2/4] virtio-net: Expose MAX_VLAN

2023-07-19 Thread Hawkins Jiawei
The vhost-vdpa shadowed CVQ needs to know the maximum number of
VLANs supported by the virtio-net device, so QEMU can restore
the VLAN state in a migration.

Co-developed-by: Eugenio Pérez 
Signed-off-by: Eugenio Pérez 
Signed-off-by: Hawkins Jiawei 
---
 hw/net/virtio-net.c| 2 --
 include/hw/virtio/virtio-net.h | 6 ++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index d20d5a63cd..a32672039d 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -49,8 +49,6 @@
 
 #define VIRTIO_NET_VM_VERSION    11
 
-#define MAX_VLAN    (1 << 12)   /* Per 802.1Q definition */
-
 /* previously fixed value */
 #define VIRTIO_NET_RX_QUEUE_DEFAULT_SIZE 256
 #define VIRTIO_NET_TX_QUEUE_DEFAULT_SIZE 256
diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
index 5f5dcb4572..93f3bb5d97 100644
--- a/include/hw/virtio/virtio-net.h
+++ b/include/hw/virtio/virtio-net.h
@@ -38,6 +38,12 @@ OBJECT_DECLARE_SIMPLE_TYPE(VirtIONet, VIRTIO_NET)
 /* Maximum VIRTIO_NET_CTRL_MAC_TABLE_SET unicast + multicast entries. */
 #define MAC_TABLE_ENTRIES    64
 
+/*
+ * The maximum number of VLANs in the VLAN filter table
+ * added by VIRTIO_NET_CTRL_VLAN_ADD
+ */
+#define MAX_VLAN    (1 << 12)   /* Per 802.1Q definition */
+
 typedef struct virtio_net_conf
 {
 uint32_t txtimer;
-- 
2.25.1




[PATCH 1/4] virtio-net: do not reset vlan filtering at set_features

2023-07-19 Thread Hawkins Jiawei
From: Eugenio Pérez 

This function is called after virtio_load, so all the vlan configuration
is lost in the migration case.

Just allow all the vlan-tagged packets if vlan is not configured, and
trust the device reset to clear all the filtered vlans.
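
(In the device model, a set bit in n->vlans means packets with that VLAN
id are accepted, so filling the bitmap with 0xff lets everything through.)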

Fixes: 0b1eaa8803 ("virtio-net: Do not filter VLANs without F_CTRL_VLAN")
Signed-off-by: Eugenio Pérez 
Reviewed-by: Hawkins Jiawei 
Signed-off-by: Hawkins Jiawei 
---
 hw/net/virtio-net.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 7102ec4817..d20d5a63cd 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1006,9 +1006,7 @@ static void virtio_net_set_features(VirtIODevice *vdev, 
uint64_t features)
 vhost_net_save_acked_features(nc->peer);
 }
 
-if (virtio_has_feature(features, VIRTIO_NET_F_CTRL_VLAN)) {
-memset(n->vlans, 0, MAX_VLAN >> 3);
-} else {
+if (!virtio_has_feature(features, VIRTIO_NET_F_CTRL_VLAN)) {
 memset(n->vlans, 0xff, MAX_VLAN >> 3);
 }
 
-- 
2.25.1




[PATCH 3/4] vdpa: Restore vlan filtering state

2023-07-19 Thread Hawkins Jiawei
This patch introduces vhost_vdpa_net_load_single_vlan()
and vhost_vdpa_net_load_vlan() to restore the vlan
filtering state at the device's startup.
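
(n->vlans is that same bitmap: bit j of 32-bit word i stands for VLAN id
(i << 5) + j, which is what the double loop in the diff below walks.)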

Co-developed-by: Eugenio Pérez 
Signed-off-by: Eugenio Pérez 
Signed-off-by: Hawkins Jiawei 
---
 net/vhost-vdpa.c | 49 
 1 file changed, 49 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 9795306742..0787dd933b 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -965,6 +965,51 @@ static int vhost_vdpa_net_load_rx(VhostVDPAState *s,
 return 0;
 }
 
+static int vhost_vdpa_net_load_single_vlan(VhostVDPAState *s,
+   const VirtIONet *n,
+   uint16_t vid)
+{
+const struct iovec data = {
+.iov_base = &vid,
+.iov_len = sizeof(vid),
+};
+ssize_t dev_written = vhost_vdpa_net_load_cmd(s, VIRTIO_NET_CTRL_VLAN,
+  VIRTIO_NET_CTRL_VLAN_ADD,
+  &data, 1);
+if (unlikely(dev_written < 0)) {
+return dev_written;
+}
+if (unlikely(*s->status != VIRTIO_NET_OK)) {
+return -EIO;
+}
+
+return 0;
+}
+
+static int vhost_vdpa_net_load_vlan(VhostVDPAState *s,
+const VirtIONet *n)
+{
+int r;
+
+if (!virtio_vdev_has_feature(&n->parent_obj, VIRTIO_NET_F_CTRL_VLAN)) {
+return 0;
+}
+
+for (int i = 0; i < MAX_VLAN >> 5; i++) {
+for (int j = 0; n->vlans[i] && j <= 0x1f; j++) {
+if (n->vlans[i] & (1U << j)) {
+r = vhost_vdpa_net_load_single_vlan(s, n, (i << 5) + j);
+if (unlikely(r != 0)) {
+return r;
+}
+}
+}
+}
+
+return 0;
+}
+
+
 static int vhost_vdpa_net_load(NetClientState *nc)
 {
 VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
@@ -995,6 +1040,10 @@ static int vhost_vdpa_net_load(NetClientState *nc)
 if (unlikely(r)) {
 return r;
 }
+r = vhost_vdpa_net_load_vlan(s, n);
+if (unlikely(r)) {
+return r;
+}
 
 return 0;
 }
-- 
2.25.1




[PATCH 0/4] Vhost-vdpa Shadow Virtqueue VLAN support

2023-07-19 Thread Hawkins Jiawei
This series enables the shadowed CVQ to intercept VLAN commands,
updates the virtio NIC device model so that QEMU sends the VLAN state
in a migration, and restores that state in the destination.

TestStep

1. test the migration using vp-vdpa device
  - For L0 guest, boot QEMU with two virtio-net-pci net devices with the
`ctrl_vq`, `ctrl_vlan` features on, command line like:
  -device virtio-net-pci,disable-legacy=on,disable-modern=off,
iommu_platform=on,mq=on,ctrl_vq=on,guest_announce=off,
indirect_desc=off,queue_reset=off,ctrl_vlan=on,...

  - For L1 guest, apply the patch series and compile the source code,
start QEMU with two vdpa devices with svq mode on and the `ctrl_vq`,
`ctrl_vlan` features on, command line like:
  -netdev type=vhost-vdpa,x-svq=true,...
  -device virtio-net-pci,mq=on,guest_announce=off,ctrl_vq=on,
ctrl_vlan=on,...

  - For L2 source guest, run the following bash command:
```bash
#!/bin/sh

for idx in {1..4094}
do
  ip link add link eth0 name vlan$idx type vlan id $idx
done
```

  - Attach gdb to the L2 dest VM, break at
vhost_vdpa_net_load_single_vlan(), and execute the following
gdbscript:
```gdbscript
ignore 1 4094
c
```
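
(`ignore 1 4094` tells gdb to skip the next 4094 hits of breakpoint 1,
one per restored VLAN, so the reload can run through without stopping
at every VLAN.)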

  - Execute the live migration in L2 source monitor

  - Result
* with this series, gdb can hit the breakpoint and continue
the execution without triggering any error or warning.

Eugenio Pérez (1):
  virtio-net: do not reset vlan filtering at set_features

Hawkins Jiawei (3):
  virtio-net: Expose MAX_VLAN
  vdpa: Restore vlan filtering state
  vdpa: Allow VIRTIO_NET_F_CTRL_VLAN in SVQ

 hw/net/virtio-net.c|  6 +---
 include/hw/virtio/virtio-net.h |  6 
 net/vhost-vdpa.c   | 50 ++
 3 files changed, 57 insertions(+), 5 deletions(-)

-- 
2.25.1




[PATCH 4/4] vdpa: Allow VIRTIO_NET_F_CTRL_VLAN in SVQ

2023-07-19 Thread Hawkins Jiawei
Enable SVQ with VIRTIO_NET_F_CTRL_VLAN feature.

Co-developed-by: Eugenio Pérez 
Signed-off-by: Eugenio Pérez 
Signed-off-by: Hawkins Jiawei 
---
 net/vhost-vdpa.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 0787dd933b..dfd271c456 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -111,6 +111,7 @@ static const uint64_t vdpa_svq_device_features =
 BIT_ULL(VIRTIO_NET_F_STATUS) |
 BIT_ULL(VIRTIO_NET_F_CTRL_VQ) |
 BIT_ULL(VIRTIO_NET_F_CTRL_RX) |
+BIT_ULL(VIRTIO_NET_F_CTRL_VLAN) |
 BIT_ULL(VIRTIO_NET_F_CTRL_RX_EXTRA) |
 BIT_ULL(VIRTIO_NET_F_MQ) |
 BIT_ULL(VIRTIO_F_ANY_LAYOUT) |
-- 
2.25.1




Re: [PATCH] vdpa: Increase out buffer size for CVQ commands

2023-07-12 Thread Hawkins Jiawei
On 2023/7/12 18:45, Michael Tokarev wrote:
> 11.07.2023 04:48, Hawkins Jiawei wrote:
> ..
>> Sorry for not mentioning that I have moved the patch to the patch series
>> titled "Vhost-vdpa Shadow Virtqueue _F_CTRL_RX commands support" at [1].
>> The reason for this move is that the bug in question should not be
>> triggered until the VIRTIO_NET_CTRL_MAC_TABLE_SET command is exposed by
>> this patch series.
>
> Does this mean this particular change is not supposed to be applied to
> -stable,
> as the other change which exposes the bug isn't in any stable series?

Yes, you are right.

This bug is related to the VIRTIO_NET_CTRL_MAC_TABLE_SET command in SVQ,
and this command is not exposed in SVQ in any stable branch, so we do
not need to apply the patch to the -stable branch.

Thanks!

>
> Thanks,
>
> /mjt



Re: [PATCH] vdpa: Increase out buffer size for CVQ commands

2023-07-10 Thread Hawkins Jiawei
On 2023/7/11 2:52, Michael S. Tsirkin wrote:
> On Mon, Jun 26, 2023 at 04:26:04PM +0800, Hawkins Jiawei wrote:
>> It appears that my commit message and comments did not take this into
>> account. I will refactor them in the v2 patch.
>
> does not look like you ever sent v2.
>

Sorry for not mentioning that I have moved the patch to the patch series
titled "Vhost-vdpa Shadow Virtqueue _F_CTRL_RX commands support" at [1].
The reason for this move is that the bug in question should not be
triggered until the VIRTIO_NET_CTRL_MAC_TABLE_SET command is exposed by
this patch series.

I will take care of this in my future patch series.

[1]. https://lists.gnu.org/archive/html/qemu-devel/2023-07/msg01577.html

Thanks!


