From: Tonghao Zhang
Allow the user to configure RXCSUM separately with ethtool -K,
reusing the existing virtnet_set_guest_offloads helper
that configures RXCSUM for XDP. This is conditional on
VIRTIO_NET_F_CTRL_GUEST_OFFLOADS.
If Rx checksum is disabled, LRO should also be disabled.
Cc: Michael S.
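A rough sketch of the mechanism, in the context of drivers/net/virtio_net.c: the vi->guest_offloads bitmap is the one already cached for the XDP path, everything else here is illustrative rather than the exact upstream hunk. The .ndo_set_features hook could gate guest csum like this:

/* Sketch only: toggle RXCSUM from .ndo_set_features, conditional on the
 * device having negotiated VIRTIO_NET_F_CTRL_GUEST_OFFLOADS. */
static int virtnet_set_features(struct net_device *dev,
                                netdev_features_t features)
{
        struct virtnet_info *vi = netdev_priv(dev);
        u64 offloads = vi->guest_offloads;
        int err;

        if ((dev->features ^ features) & NETIF_F_RXCSUM) {
                if (!virtio_has_feature(vi->vdev,
                                        VIRTIO_NET_F_CTRL_GUEST_OFFLOADS))
                        return -EOPNOTSUPP;

                if (features & NETIF_F_RXCSUM)
                        offloads |= 1ULL << VIRTIO_NET_F_GUEST_CSUM;
                else
                        offloads &= ~(1ULL << VIRTIO_NET_F_GUEST_CSUM);

                err = virtnet_set_guest_offloads(vi, offloads);
                if (err)
                        return err;

                vi->guest_offloads = offloads;
        }

        return 0;
}

The user-visible knob is then "ethtool -K <dev> rx on|off", with LRO expected to be turned off together with Rx checksumming as noted above.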
From: Tonghao Zhang
Allow the user to configure RXCSUM separately with ethtool -K,
reusing the existing virtnet_set_guest_offloads helper
that configures RXCSUM for XDP. This is conditional on
VIRTIO_NET_F_CTRL_GUEST_OFFLOADS.
If Rx checksum is disabled, LRO should also be disabled.
Cc: Michael S.
From: Tonghao Zhang
Open vSwitch and the Linux bridge disable LRO on an interface
when it is added to them. Currently, disabling LRO also disables
the virtio-net guest csum, which hurts forwarding performance.
Fixes: a02e8964eaf9 ("virtio-net: ethtool configurable LRO")
Cc: Michael S.
From: Tonghao Zhang
Open vSwitch and the Linux bridge disable LRO on an interface
when it is added to them. Currently, disabling LRO also disables
the virtio-net guest csum, which hurts forwarding performance.
Fixes: e59ff2c49ae1 ("virtio-net: disable guest csum during XDP set")
Cc:
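A sketch of the direction of the fix; the mask and helper names below are illustrative assumptions, not claimed to match the final upstream identifiers. The point is that toggling LRO should only touch the LRO-related guest offload bits and leave VIRTIO_NET_F_GUEST_CSUM alone:

/* Illustrative only: disabling LRO must not clear the guest csum bit. */
#define GUEST_OFFLOAD_LRO_MASK ((1ULL << VIRTIO_NET_F_GUEST_TSO4) | \
                                (1ULL << VIRTIO_NET_F_GUEST_TSO6) | \
                                (1ULL << VIRTIO_NET_F_GUEST_ECN)  | \
                                (1ULL << VIRTIO_NET_F_GUEST_UFO))

static int virtnet_set_lro(struct virtnet_info *vi, bool enable)
{
        u64 offloads = vi->guest_offloads;

        if (enable)
                offloads |= GUEST_OFFLOAD_LRO_MASK;
        else
                offloads &= ~GUEST_OFFLOAD_LRO_MASK; /* csum stays on */

        return virtnet_set_guest_offloads(vi, offloads);
}

With that, a bridge or Open vSwitch port that loses LRO keeps device-assisted receive checksumming, which is what the forwarding path benefits from.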
From: Tonghao Zhang
Allow the user to configure RXCSUM separately with ethtool -K,
reusing the existing virtnet_set_guest_offloads helper
that configures RXCSUM for XDP. This is conditional on
VIRTIO_NET_F_CTRL_GUEST_OFFLOADS.
Cc: Michael S. Tsirkin
Cc: Jason Wang
Signed-off-by: Tonghao Zhang
---
From: Tonghao Zhang
This patch improves the guest receive performance.
On the handle_tx side, we poll the sock receive queue at the
same time; handle_rx does the same.
We set poll-us=100us and use netperf to test throughput
and mean latency. When running the tests, the
From: Tonghao Zhang
Factor out the generic busy polling logic; it will be
used in the tx path in the next patch. With this patch,
qemu can set the busyloop_timeout differently for the rx queue.
To avoid duplicating code, introduce the helper functions:
* sock_has_rx_data (renamed from sk_has_rx_data)
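The renamed helper is roughly the sketch below; the peek_len branch covers tap sockets that report pending data through sock->ops->peek_len:

/* Sketch of the helper described above. */
static bool sock_has_rx_data(struct socket *sock)
{
        if (unlikely(!sock))
                return false;

        if (sock->ops->peek_len)
                return sock->ops->peek_len(sock);

        return !skb_queue_empty(&sock->sk->sk_receive_queue);
}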
From: Tonghao Zhang
Use the VHOST_NET_VQ_XXX as a subclass for mutex_lock_nested.
Signed-off-by: Tonghao Zhang
Acked-by: Jason Wang
---
drivers/vhost/net.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index
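The subclass tells lockdep that the tx and rx vq mutexes are distinct nesting levels even though they share one lock class, so taking both in the busy-poll path is not flagged as recursive locking. A minimal sketch of the pattern (the function itself is illustrative, not the real handler):

/* Sketch only: a handle_tx-style path that also takes the rx vq mutex,
 * using the vq index as the lockdep subclass. */
static void busy_poll_both_vqs(struct vhost_net *net)
{
        struct vhost_virtqueue *tvq = &net->vqs[VHOST_NET_VQ_TX].vq;
        struct vhost_virtqueue *rvq = &net->vqs[VHOST_NET_VQ_RX].vq;

        mutex_lock_nested(&tvq->mutex, VHOST_NET_VQ_TX);
        mutex_lock_nested(&rvq->mutex, VHOST_NET_VQ_RX);

        /* ... busy poll the socket receive queue here ... */

        mutex_unlock(&rvq->mutex);
        mutex_unlock(&tvq->mutex);
}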
From: Tonghao Zhang
Instead of locking all vqs at the same time, this
patch locks them one by one. This will be used by
the next patch to avoid a deadlock.
Signed-off-by: Tonghao Zhang
Acked-by: Jason Wang
Signed-off-by: Jason Wang
---
drivers/vhost/vhost.c | 24 +++-
1
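In drivers/vhost/vhost.c the change amounts to roughly the following: each vq mutex is taken with its own lockdep subclass, so a later patch can nest an individual vq mutex elsewhere without lockdep complaints.

static void vhost_dev_lock_vqs(struct vhost_dev *d)
{
        int i;

        /* One subclass per vq instead of locking everything at subclass 0. */
        for (i = 0; i < d->nvqs; ++i)
                mutex_lock_nested(&d->vqs[i]->mutex, i);
}

static void vhost_dev_unlock_vqs(struct vhost_dev *d)
{
        int i;

        for (i = 0; i < d->nvqs; ++i)
                mutex_unlock(&d->vqs[i]->mutex);
}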
From: Tonghao Zhang
This patch series improves the guest receive performance.
On the handle_tx side, we poll the sock receive queue
at the same time; handle_rx does the same.
For the detailed performance report, see patch 4
Tonghao Zhang (4):
net: vhost: lock the vqs one by one
net: vhost:
From: Tonghao Zhang
This patch improves the guest receive performance.
On the handle_tx side, we poll the sock receive queue at the
same time; handle_rx does the same.
We set poll-us=100us and use netperf to test throughput
and mean latency. When running the tests, the
From: Tonghao Zhang
Factor out the generic busy polling logic; it will be
used in the tx path in the next patch. With this patch,
qemu can set the busyloop_timeout differently for the rx queue.
To avoid duplicating code, introduce the helper functions:
* sock_has_rx_data (renamed from sk_has_rx_data)
From: Tonghao Zhang
Use the VHOST_NET_VQ_XXX as a subclass for mutex_lock_nested.
Signed-off-by: Tonghao Zhang
Acked-by: Jason Wang
---
drivers/vhost/net.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index
From: Tonghao Zhang
Instead of locking all vqs at the same time, this
patch locks them one by one. This will be used by
the next patch to avoid a deadlock.
Signed-off-by: Tonghao Zhang
Acked-by: Jason Wang
Signed-off-by: Jason Wang
---
drivers/vhost/vhost.c | 24 +++-
1
From: Tonghao Zhang
This patch series improves the guest receive performance.
On the handle_tx side, we poll the sock receive queue
at the same time; handle_rx does the same.
For the detailed performance report, see patches 4, 5, 6
Tonghao Zhang (6):
net: vhost: lock the vqs one by one
net:
From: Tonghao Zhang
This patch uses vhost_has_work_pending() to check whether
the specified handler is scheduled, because in most cases
vhost_has_work() returns true when the other side's handler has been
added to the worker list. Use vhost_has_work_pending() instead of
vhost_has_work().
Topology:
[Host]
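One plausible shape for that check, keyed on the existing per-work VHOST_WORK_QUEUED flag rather than scanning the whole worker list; this is an assumption about the series, not a merged kernel API:

/* Assumed sketch: report whether this particular work item is queued,
 * unlike vhost_has_work() which reports any pending work at all. */
static bool vhost_has_work_pending(struct vhost_work *work)
{
        return test_bit(VHOST_WORK_QUEUED, &work->flags);
}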
From: Tonghao Zhang
In handle_tx, the busy poll path calls vhost_net_disable/enable_vq
because we poll the sock directly. This can improve performance.
This was suggested by Toshiaki Makita and Jason Wang.
If the rx handler is already scheduled, we will not enable the vq, because
it's not necessary. We do it not
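Sketched out below; the function name and the simplifications are mine, the loop omits the busyloop timeout handling and assumes the caller holds the relevant vq mutexes:

/* Sketch: handle_tx polls the socket itself, so the socket wakeup that
 * would schedule handle_rx is redundant while we busy poll. */
static void busy_poll_rx_sock(struct vhost_net *net, struct socket *sock)
{
        struct vhost_virtqueue *rvq = &net->vqs[VHOST_NET_VQ_RX].vq;

        vhost_net_disable_vq(net, rvq);

        while (!sock_has_rx_data(sock) && !need_resched() &&
               !vhost_has_work(&net->dev))
                cpu_relax();

        /* If the rx handler got scheduled anyway, leave the vq poll off;
         * handle_rx will take care of it when it runs. */
        if (!vhost_has_work(&net->dev))
                vhost_net_enable_vq(net, rvq);
}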
From: Tonghao Zhang
A bitmap in vhost_dev lets us check whether the
specified poll is scheduled. This patch will be used
by the next two patches.
Signed-off-by: Tonghao Zhang
---
drivers/vhost/net.c | 11 +--
drivers/vhost/vhost.c | 17 +++--
drivers/vhost/vhost.h | 7
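As a hedged sketch of the idea (all names below are illustrative assumptions; this part of the series was not merged in this form), the device would carry a bitmap with one bit per vq work, set when the work is queued and cleared when its handler runs:

/* Illustrative only: a pending-work bitmap keyed by vq index. */
#define VHOST_MAX_VQS   128

struct vhost_pending_map {
        DECLARE_BITMAP(pending, VHOST_MAX_VQS);
};

static void vhost_work_mark_queued(struct vhost_pending_map *m, int vq)
{
        set_bit(vq, m->pending);        /* work queued to the worker */
}

static void vhost_work_mark_run(struct vhost_pending_map *m, int vq)
{
        clear_bit(vq, m->pending);      /* handler has started running */
}

static bool vhost_poll_is_scheduled(struct vhost_pending_map *m, int vq)
{
        return test_bit(vq, m->pending);
}

set_bit/clear_bit/test_bit are atomic bitops, so the queueing context and the worker thread can update the map without extra locking.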
From: Tonghao Zhang
This patch improves the guest receive performance.
On the handle_tx side, we poll the sock receive queue at the
same time; handle_rx does the same.
We set poll-us=100us and use netperf to test throughput
and mean latency. When running the tests, the
From: Tonghao Zhang
Factor out the generic busy polling logic; it will be
used in the tx path in the next patch. With this patch,
qemu can set the busyloop_timeout differently for the rx queue.
To avoid duplicating code, introduce the helper functions:
* sock_has_rx_data (renamed from sk_has_rx_data)
From: Tonghao Zhang
Use the VHOST_NET_VQ_XXX as a subclass for mutex_lock_nested.
Signed-off-by: Tonghao Zhang
Acked-by: Jason Wang
---
drivers/vhost/net.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index
From: Tonghao Zhang
Instead of locking all vqs at the same time, this
patch locks them one by one. This will be used by
the next patch to avoid a deadlock.
Signed-off-by: Tonghao Zhang
Acked-by: Jason Wang
Signed-off-by: Jason Wang
---
drivers/vhost/vhost.c | 24 +++-
1
From: Tonghao Zhang
This patch series improves the guest receive performance.
On the handle_tx side, we poll the sock receive queue
at the same time; handle_rx does the same.
For the detailed performance report, see patches 4, 6, 7
Tonghao Zhang (7):
net: vhost: lock the vqs one by one
net:
From: Tonghao Zhang
Factor out the generic busy polling logic; it will be
used in the tx path in the next patch. With this patch,
qemu can set the busyloop_timeout differently for the rx queue.
In handle_tx, the busy poll path calls vhost_net_disable/enable_vq
because we poll the sock directly. This can
From: Tonghao Zhang
This patch improves the guest receive performance.
On the handle_tx side, we poll the sock receive queue at the
same time; handle_rx does the same.
We set poll-us=100us and use netperf to test throughput
and mean latency. When running the tests, the
From: Tonghao Zhang
Instead of locking all vqs at the same time, this
patch locks them one by one. This will be used by
the next patch to avoid a deadlock.
Signed-off-by: Tonghao Zhang
Acked-by: Jason Wang
Signed-off-by: Jason Wang
---
drivers/vhost/vhost.c | 24 +++-
1
From: Tonghao Zhang
Use the VHOST_NET_VQ_XXX as a subclass for mutex_lock_nested.
Signed-off-by: Tonghao Zhang
Acked-by: Jason Wang
---
drivers/vhost/net.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index
From: Tonghao Zhang
This patch series improves the guest receive performance.
On the handle_tx side, we poll the sock receive queue
at the same time; handle_rx does the same.
For the detailed performance report, see patch 4.
v6->v7:
fix issues and rebase the code:
1. on tx, busypoll will
From: Tonghao Zhang
This patch improves the guest receive performance.
On the handle_tx side, we poll the sock receive queue at the
same time; handle_rx does the same.
We set poll-us=100us and use netperf to test throughput
and mean latency. When running the tests, the
From: Tonghao Zhang
Factor out the generic busy polling logic; it will be
used in the tx path in the next patch. With this patch,
qemu can set the busyloop_timeout differently for the rx queue.
Signed-off-by: Tonghao Zhang
---
drivers/vhost/net.c | 114
From: Tonghao Zhang
Use the VHOST_NET_VQ_XXX as a subclass for mutex_lock_nested.
Signed-off-by: Tonghao Zhang
Acked-by: Jason Wang
---
drivers/vhost/net.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index
From: Tonghao Zhang
Instead of locking all vqs at the same time, this
patch locks them one by one. This will be used by
the next patch to avoid a deadlock.
Signed-off-by: Tonghao Zhang
Acked-by: Jason Wang
Signed-off-by: Jason Wang
---
drivers/vhost/vhost.c | 24 +++-
1
From: Tonghao Zhang
This patch series improves the guest receive performance.
On the handle_tx side, we poll the sock receive queue
at the same time; handle_rx does the same.
For the detailed performance report, see patch 4.
v5->v6:
rebase the code.
Tonghao Zhang (4):
net: vhost: lock the vqs
From: Tonghao Zhang
This patch improves the guest receive and transmit performance.
On the handle_tx side, we poll the sock receive queue at the
same time; handle_rx does the same.
We set poll-us=100us and use iperf3 to test
bandwidth, and netperf to test throughput
From: Tonghao Zhang
Factor out the generic busy polling logic; it will be
used in the tx path in the next patch. With this patch,
qemu can set the busyloop_timeout differently for the rx queue.
Signed-off-by: Tonghao Zhang
---
drivers/vhost/net.c | 94
From: Tonghao Zhang
Use the VHOST_NET_VQ_XXX as a subclass for mutex_lock_nested.
Signed-off-by: Tonghao Zhang
Acked-by: Jason Wang
---
drivers/vhost/net.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index
From: Tonghao Zhang
This patch series improves the guest receive and transmit performance.
On the handle_tx side, we poll the sock receive queue at the same time;
handle_rx does the same.
For the detailed performance report, see patch 4.
v4 -> v5:
fix some issues
v3 -> v4:
fix some issues
v2 ->
From: Tonghao Zhang
Instead of locking all vqs at the same time, this
patch locks them one by one. This will be used by
the next patch to avoid a deadlock.
Signed-off-by: Tonghao Zhang
Acked-by: Jason Wang
Signed-off-by: Jason Wang
---
drivers/vhost/vhost.c | 24 +++-
1
From: Tonghao Zhang
This patch improves the guest receive and transmit performance.
On the handle_tx side, we poll the sock receive queue at the
same time; handle_rx does the same.
We set poll-us=100us and use iperf3 to test
bandwidth, and netperf to test throughput
From: Tonghao Zhang
Factor out the generic busy polling logic; it will be
used in the tx path in the next patch. With this patch,
qemu can set the busyloop_timeout differently for the rx queue.
Signed-off-by: Tonghao Zhang
---
drivers/vhost/net.c | 94
From: Tonghao Zhang
Use the VHOST_NET_VQ_XXX as a subclass for mutex_lock_nested.
Signed-off-by: Tonghao Zhang
Acked-by: Jason Wang
---
drivers/vhost/net.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index
From: Tonghao Zhang
Instead of locking all vqs at the same time, this
patch locks them one by one. This will be used by
the next patch to avoid a deadlock.
Signed-off-by: Tonghao Zhang
Acked-by: Jason Wang
Signed-off-by: Jason Wang
---
drivers/vhost/vhost.c | 24 +++-
1
From: Tonghao Zhang
This patch series improves the guest receive and transmit performance.
On the handle_tx side, we poll the sock receive queue at the same time;
handle_rx does the same.
For the detailed performance report, see patch 4.
v3 -> v4:
fix some issues
v2 -> v3:
This patch series was
From: Tonghao Zhang
This patch improves the guest receive and transmit performance.
On the handle_tx side, we poll the sock receive queue at the
same time; handle_rx does the same.
We set poll-us=100us and use iperf3 to test
bandwidth, and netperf to test throughput
From: Tonghao Zhang
Factor out the generic busy polling logic; it will be
used in the tx path in the next patch. With this patch,
qemu can set the busyloop_timeout differently for the rx queue.
Signed-off-by: Tonghao Zhang
---
drivers/vhost/net.c | 92
From: Tonghao Zhang
Use the VHOST_NET_VQ_XXX as a subclass for mutex_lock_nested.
Signed-off-by: Tonghao Zhang
---
drivers/vhost/net.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index e7cf7d2..62bb8e8 100644
---
From: Tonghao Zhang
Instead of locking all vqs at the same time, this
patch locks them one by one. This will be used by
the next patch to avoid a deadlock.
Signed-off-by: Tonghao Zhang
---
drivers/vhost/vhost.c | 24 +++-
1 file changed, 7 insertions(+), 17 deletions(-)
From: Tonghao Zhang
This patch series improves the guest receive and transmit performance.
On the handle_tx side, we poll the sock receive queue at the same time;
handle_rx does the same.
This series was split from the previous big patch:
http://patchwork.ozlabs.org/patch/934673/
For more
From: Tonghao Zhang
This patch improves the guest receive performance from
the host. On the handle_tx side, we poll the sock receive
queue at the same time; handle_rx does the same.
To avoid a deadlock, change the code to lock the vqs one
by one and use VHOST_NET_VQ_XX as a subclass