On 1/11/2019 4:14 PM, Ilya Maximets wrote:
Nothing significantly changed since the previous versions.
This patch set effectively breaks following cases if multi-segment
mbufs enabled:
Hi Ilya, thanks for your feedback. A few queries and clarifications for
discussion below.
From reading the mail at first I was under the impression that all
these features were being broken by default after applying the patch?
To clarify, in your testing, after applying these patches, are the use
cases below broken if you are not using multiseg/TSO? (by default these
are disabled).
The intention of this work, as an experimental feature, would be that it
is disabled by default and will not introduce new regressions to users
who do not enable it. So there would be no impact on a user who does not
wish to enable this and use OVS DPDK as is today.
The series would be the first steps to enable OVS DPDK to use TSO in
both inter VM and Host to Host environments using DPDK interfaces that
support the feature. Kernel interfaces such as tap devices are not
catered for as of yet.
The use cases that are catered for are
- Inter VM communication on the same host using DPDK Interfaces;
- Inter VM communication on different hosts DPDK Interfaces;
- The same two use cases above, but on a VLAN network.
Again these are the first steps showing TSO in DPDK with multiseg feature.
1. Host <-> VM communication broken.
With this patch applied VMs has offloading enabled by default.
This makes all the packets that goes out from the VM to have
incorrect checksums (beside possible multi-segmenting). As a
result any packet that reaches host interfaces dropped by the
host kernel as corrupted. Could be easily reproduced by trying
to use ssh from host to VM, which is the common case for usual
OpenStack installations.
I feel it's important to stress that by default however the patches
don't break this functionality. Both Multi-segment mbufs and TSO are
disabled by default, and thus, TSO won't be negotiated and the VMs won't
be sending multi-segment mbufs to OvS.
From Tiagos testing if you decide to enable multi-segments mbuf and
TSO, *which is considered experimental* at the moment, the VM-Host
communication will be broken only if your VM negotiates TSO enabled by
default. This is because with TSO enabled checksum offloading comes into
play, which means packets from the VM will arrive to the host with bad
checksums.
Otherwise, if your VM doesn't enable TSO by default, or if you disable
with `ethtool -K <iface> tx off`, everything should work as before, but
using multi-segment mbufs. That was my impression.
@Tiago can you confirm?
The only way to circumvent that, as you mention below, is "fallback to
full software TSO implementation".
As an experimental feature I would be surprised to see Open Stack
provide support for this. Again by default it should not impact an Open
Stack deployment unless it is explicitly enabled and support for that
would be some way off as of yet.
If checksums will be fixed, segmented packets will be dropped
anyway in netdev-linux/bsd/dummy.
By segmented I assume you are referring to TCP segment rather than multi
segment, can you clarify?
This is a limitation, but one to be factored in by a user when choosing
to enable or disable TSO with OVS DPDK until it is resolved.
2. Broken VM to VM communication in case of not negotiated offloading
in one of VMs.
If one of VMs does not negotiate offloading, all the packets
will be dropped for the same reason as in Host to VM case. This
is a really bad case because virtio driver could change the
feature set in runtime and this should be handled in OVS.
Note that linux kernel virtio driver always has rx checksum
offloading enabled, i.e. it doesn't check/recalculates the
checksums in software if needed. So, you need the dpdk driver or
a FreeBSD machine to reproduce checksumming issue.
Again by default I believe this still functions as expected. Tiago can
you comment any further on this from your testing?
3. Broken support of all the virtual and HW NICs that doen't support TSO.
It's actually big number of NICs. More than a half of PMDs in DPDK
does not support TSO.
All the packets just dropped by netdev-dpdk before send.
There are a number of NICS/VDEVs that do not support TSO in DPDK.
However there are a number of NICs that do support it and this will
increase in future DPDK releases.
I don't think should be a blocking factor.
In the case where multi segment is enabled in OVS DPDK we discussed
simply not allowing a NIC to be added if it did not support multi seg
mbufs. This can be extended to TSO also, as currently multi seg is a
prerequirement for TSO.
This avoids a case for the moment where someone is mixing devices where
support does and does not exist. Again this occurs only if the feature
is explicitly enabled.
For mseg and TSO we can can clearly call out the devices that have been
tested in OVS DPDK and do support the feature in DPDK (similar to what
has been done for Hardware offload).
This is most probably not a full list of issues. To fix most of them
fallback to full software TSO implementation required.
I agree, the full software TSO fallback would be a requirement for the
future. This is something that will be implemented as a follow on from
this over the course of the year. But I don't believe it should block
users who have DPDK devices that support TSO now.
Until this is done I prefer not to merge the feature because it
breaks the basic functionality. I understand that you're going to make
it 'experimental', but, IMHO, this doesn't worth even that. Patches
are fairly small and there is nothing actually we need to test before
the full support. The main part is not implemented yet.
Can you clarify basic functionality? I assume you mean the use cases
you've outlined above, and only when the feature is enabled?
It's an interesting point as experimental features. What is enough to
consider it experimental and non experimental? When should experimental
features be added?
If this feature was added (and no doubt it would be experimental), I
believe what you've outlined above is a good start for defining what
would be required for the experimental tag to be removed in the future.
In terms of when an experimental feature should be added, the use cases
you've outlined above are not the only use cases. The use cases it
currently addresses come from feedback from users on the mailing list
and outside of it. They can be built upon incrementally, the feature can
be completed while allowing users to deploy and test. This will give the
benefit of providing more feedback and usecases. Another benefit being
we avoid a 'big bang' implementation in the future with large patch series.
I'm also more than happy to increase the documentation clearly outlining
the limitations above as well as a how to guide showing how to deploy
the supported usecases step by step. It will provide clarity to users
who wish to use the feature.
NACK for the series. Sorry.
No need to apologize, you've raised valid points and I hope we can
discuss these further before making a final decision on the series.
Ian
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev