Hi Jan,
On 14/02/2022 10:54, Jan Scheurich wrote:
We do acknowledge the benefit of non-pinned polling of phy rx queues by
PMD threads on all NUMA nodes. It gives the auto-load balancer much better
options to utilize spare capacity on PMDs on all NUMA nodes.
Our patch proposed in
https://mail.openvswitch.org/pipermail/ovs-dev/2021-June/384547.html
indeed covers the difference between phy and vhu ports.
One has to explicitly enable cross-NUMA-polling for individual interfaces
with:
ovs-vsctl set interface <Name> other_config:cross-numa-polling=true
This would typically only be done by static configuration for the fixed set of
physical ports. There is no code in OpenStack's os-vif handler to apply such
configuration to dynamically created vhu ports.
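For illustration, with hypothetical physical port names dpdk0 and dpdk1 (the
option only exists with our patch applied), such a static configuration would
simply be:

  # dpdk0/dpdk1 are example port names
  ovs-vsctl set interface dpdk0 other_config:cross-numa-polling=true
  ovs-vsctl set interface dpdk1 other_config:cross-numa-polling=true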
I would strongly suggest that cross-numa-polling be introduced as a per-
interface option as in our patch rather than as a per-datapath option as in your
patch. Why not adapt our original patch to the latest OVS code base? We can
help you with that.
BR, Jan
Hi, Jan Scheurich
We can achieve the static pinning of a phy port by combining pmd-rxq-isolate
and pmd-rxq-affinity. This setting gives the same result, and we have seen the
benefits.
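For example (dpdk0, rxq 0 and core 20 below are just placeholder names for our
setup), a phy rxq can be pinned to a PMD core on the remote NUMA node with:

  # dpdk0 and core 20 are example names; core 20 is a PMD on the other NUMA node
  ovs-vsctl set Interface dpdk0 other_config:pmd-rxq-affinity="0:20"

and pmd-rxq-isolate then controls whether core 20 stays reserved for that
pinned rxq or can also take other rxqs.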
The new issue is the polling of vhu ports on one NUMA node. Under heavy
traffic, PMDs polling vhu + phy reach 100% usage, while the PMDs on the other
NUMA node polling only phy ports reach 70% usage. Enabling cross-NUMA polling
for a vhu port would give us more benefit in this case, as the load would be
balanced across the PMDs on both NUMA nodes.
As you have mentioned, there is no code to apply this config to vhu ports when
they are created. A global setting would save us from having to detect vhu
names dynamically or react to newly created ports.
Hi Wan Junjie,
We have done extensive benchmarking and found that we get better overall PMD
load balance and resulting OVS performance when we do not statically pin any rx
queues and instead let the auto-load-balancing find the optimal distribution of
phy rx queues over both NUMA nodes to balance an asymmetric load of vhu rx
queues (polled only on the local NUMA node).
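For reference, in that setup nothing is pinned per interface; the
auto-load-balancer is simply enabled globally, roughly like this:

  # no per-interface pinning; phy rxqs become movable across NUMA nodes via our cross-numa-polling option
  ovs-vsctl set Open_vSwitch . other_config:pmd-auto-lb=true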
Cross-NUMA polling of vhu rx queues comes with a very high latency cost due to
cross-NUMA access to volatile virtio ring pointers in every iteration (not only
when actually copying packets). Cross-NUMA polling of phy rx queues doesn't
have a similar issue.
I agree that cross-numa polling of vhost rxqs always causes a performance
penalty.
For polling phy rxq, when phy and vhost are in different numas, I don't
see any additional penalty for cross-numa polling the phy rxq.
For the case where phy and vhost are both in the same numa, if I change
to poll the phy rxq cross-numa, then I see about a >20% tput drop for
traffic from phy -> vhost. Are you seeing that too?
Also, the fact that a different numa can poll the phy rxq after every
rebalance means that the ability of the auto-load-balancer to estimate
and trigger a rebalance is impacted.
It seems like simply pinning some phy rxqs cross-numa would avoid all
the issues above and give most of the benefit of cross-numa polling for
phy rxqs.
With the pmd-rxq-assign=group and pmd-rxq-isolate=false options, OVS
could still assign other rxqs to those cores that have pinned phy
rxqs and properly adjust the assignments based on the load from the
pinned rxqs.
New assignments or auto-load-balance would not change the NUMA node polling
those rxqs, so it would have no impact on the ALB or on the ability to assign
based on load.
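As a rough sketch of what I mean (again with example names; dpdk0's rxq 0
pinned to core 20 on the remote NUMA node):

  # group assignment with non-isolated pinned cores
  ovs-vsctl set Open_vSwitch . other_config:pmd-rxq-assign=group
  ovs-vsctl set Open_vSwitch . other_config:pmd-rxq-isolate=false
  ovs-vsctl set Interface dpdk0 other_config:pmd-rxq-affinity="0:20"

With isolate=false, core 20 is not isolated, so vhu and other rxqs can still be
assigned to it based on load, while the phy rxq keeps its polling NUMA node
across rebalances.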
thanks,
Kevin.