On 14.01.2019 15:51, Ian Stokes wrote: > On 1/11/2019 7:37 PM, Ian Stokes wrote: >> On 1/11/2019 4:14 PM, Ilya Maximets wrote: >>> Nothing significantly changed since the previous versions. >>> This patch set effectively breaks following cases if multi-segment >>> mbufs enabled: >>> >> >> Hi Ilya, thanks for your feedback. A few queries and clarifications for >> discussion below. >> >> From reading the mail at first I was under the impression that all these >> features were being broken by default after applying the patch? >> >> To clarify, in your testing, after applying these patches, are the use cases >> below broken if you are not using multiseg/TSO? (by default these are >> disabled). >> >> The intention of this work, as an experimental feature, would be that it is >> disabled by default and will not introduce new regressions to users who do >> not enable it. So there would be no impact on a user who does not wish to >> enable this and use OVS DPDK as is today. >> >> The series would be the first steps to enable OVS DPDK to use TSO in both >> inter VM and Host to Host environments using DPDK interfaces that support >> the feature. Kernel interfaces such as tap devices are not catered for as of >> yet. >> >> The use cases that are catered for are >> >> - Inter VM communication on the same host using DPDK Interfaces; >> - Inter VM communication on different hosts DPDK Interfaces; >> - The same two use cases above, but on a VLAN network. >> >> Again these are the first steps showing TSO in DPDK with multiseg feature. >> > > Hi all, > > with code freeze tomorrow I'd like to come to a consensus on this as there > has bee no response to the counter points made below.
Sorry for responses taking so long. I'm just trying to perform some testing. > > In summary the status of the feature is: > > It does not cause the usecases outlined below to break while mutltiseg is > disabled (which by default it is disabled). There is no impact to a user who > does not wish to enable the feature. Disagree. There is a performance impact. I just made some performance testing on current master with and without patch-sets applied (mseg and TSO) with my usual "PVP with bonded PHY" scenario. And I see 5% average performance drop for the 512B UDP packets if multi-segment related patches just applied and *not enabled*. Unfortunately, have no time for more testing of other cases as you're hurrying me up. > > It is considered an experimental feature. > > When enabled it supports the following usecase: > 1. Inter VM communication on the same host using DPDK Interfaces; > 2. Inter VM communication on different hosts DPDK Interfaces; > 3. The same two use cases above, but on a VLAN network. > > The known cases it does not support: > 1. Host <-> VM communication using kernel interfaces. > 2. VM <-> VM communication where TSO is not supported by one VM. > 3. Tunneled connections (i.e. VXLAN) currently not supported. > > For issues 1 and 2 above, GSO (the software fallback for segmenting packets > in DPDK) would address the issue. This work is planned as a follow on for > this feature already by Intel. * not only segmenting, but also checksumming. > > While GSO is not in place, it is the case that a DPDK device being added that > does not support multiseg will fail to add with the user informed why, this > is only if multiseg has been enabled.> > For issue 3 there is already the capability to support this is DPDK 18.11. > There will be a follow on patch for OVS to enable this, just not within the > scope of the 2.11 code freeze. Calculating of inner checksums on TX is not supported by much more devices. And virtio, I think, does not support this. Also, does the current patch-set check/drop packets before encapsulation? > > My position is that the series are at a high enough revision to be added (v14 > for multi seg, v3 for the TSO) as long as the known issues above are > documented. > > Intel intends to develop the feature further with the use cases outlined > above as the next steps in incremental development of the feature. > > It's quite a late stage to receive a NACK on the series based on the > unsupported use cases. With all the respect, I'd like to raise the issue once again. All the "high revisions" and "long time development" is most of the time *last-minute development*. I mean is that there was no any progress since early October and a 3 revisions at once within the last week. And it always happens like this. Just look at most of the DPDK related features in OVS for the last few years. The workflow is same: 1 RFC 2 no progress for a months 3 release soon --> a lot of new revisions in a few weeks 4 didn't meet the release deadline 5 goto step #2 It'll be OK if it was only one feature, but we have 3-4 big, not tested, not reviewed enough features all the time one week before the release and only few people who cares about that. Moreover, most of the time original developers just disappears from the list right after merge and never going to fix/enhance the functionality. > > Based on the cases it does support, and given there is a clear path for > incremental steps to enable the feature completely, I feel experimental > status of the feature is warranted and I'd like to see the is the 2.11 > release. > > Ian. > >>> 1. Host <-> VM communication broken. >>> >>> With this patch applied VMs has offloading enabled by default. >>> This makes all the packets that goes out from the VM to have >>> incorrect checksums (beside possible multi-segmenting). As a >>> result any packet that reaches host interfaces dropped by the >>> host kernel as corrupted. Could be easily reproduced by trying >>> to use ssh from host to VM, which is the common case for usual >>> OpenStack installations. >>> >> I feel it's important to stress that by default however the patches don't >> break this functionality. Both Multi-segment mbufs and TSO are disabled by >> default, and thus, TSO won't be negotiated and the VMs won't be sending >> multi-segment mbufs to OvS. >> >> From Tiagos testing if you decide to enable multi-segments mbuf and TSO, >> *which is considered experimental* at the moment, the VM-Host communication >> will be broken only if your VM negotiates TSO enabled by default. This is >> because with TSO enabled checksum offloading comes into play, which means >> packets from the VM will arrive to the host with bad checksums. >> >> Otherwise, if your VM doesn't enable TSO by default, or if you disable with >> `ethtool -K <iface> tx off`, everything should work as before, but using >> multi-segment mbufs. That was my impression. >> >> @Tiago can you confirm? >> >> The only way to circumvent that, as you mention below, is "fallback to full >> software TSO implementation". >> >> As an experimental feature I would be surprised to see Open Stack provide >> support for this. Again by default it should not impact an Open Stack >> deployment unless it is explicitly enabled and support for that would be >> some way off as of yet. >> >>> If checksums will be fixed, segmented packets will be dropped >>> anyway in netdev-linux/bsd/dummy. >> >> By segmented I assume you are referring to TCP segment rather than multi >> segment, can you clarify? >> >> This is a limitation, but one to be factored in by a user when choosing to >> enable or disable TSO with OVS DPDK until it is resolved. >> >>> >>> 2. Broken VM to VM communication in case of not negotiated offloading >>> in one of VMs. >>> >>> If one of VMs does not negotiate offloading, all the packets >>> will be dropped for the same reason as in Host to VM case. This >>> is a really bad case because virtio driver could change the >>> feature set in runtime and this should be handled in OVS. >>> >>> Note that linux kernel virtio driver always has rx checksum >>> offloading enabled, i.e. it doesn't check/recalculates the >>> checksums in software if needed. So, you need the dpdk driver or >>> a FreeBSD machine to reproduce checksumming issue. >>> >> >> Again by default I believe this still functions as expected. Tiago can you >> comment any further on this from your testing? >> >>> 3. Broken support of all the virtual and HW NICs that doen't support TSO. >>> >>> It's actually big number of NICs. More than a half of PMDs in DPDK >>> does not support TSO. >>> All the packets just dropped by netdev-dpdk before send. >> >> There are a number of NICS/VDEVs that do not support TSO in DPDK. However >> there are a number of NICs that do support it and this will increase in >> future DPDK releases. >> >> I don't think should be a blocking factor. >> >> In the case where multi segment is enabled in OVS DPDK we discussed simply >> not allowing a NIC to be added if it did not support multi seg mbufs. This >> can be extended to TSO also, as currently multi seg is a prerequirement for >> TSO. >> >> This avoids a case for the moment where someone is mixing devices where >> support does and does not exist. Again this occurs only if the feature is >> explicitly enabled. >> >> For mseg and TSO we can can clearly call out the devices that have been >> tested in OVS DPDK and do support the feature in DPDK (similar to what has >> been done for Hardware offload). >> >>> >>> This is most probably not a full list of issues. To fix most of them >>> fallback to full software TSO implementation required. >>> >> I agree, the full software TSO fallback would be a requirement for the >> future. This is something that will be implemented as a follow on from this >> over the course of the year. But I don't believe it should block users who >> have DPDK devices that support TSO now. >> >>> Until this is done I prefer not to merge the feature because it >>> breaks the basic functionality. I understand that you're going to make >>> it 'experimental', but, IMHO, this doesn't worth even that. Patches >>> are fairly small and there is nothing actually we need to test before >>> the full support. The main part is not implemented yet. >> >> Can you clarify basic functionality? I assume you mean the use cases you've >> outlined above, and only when the feature is enabled? >> >> It's an interesting point as experimental features. What is enough to >> consider it experimental and non experimental? When should experimental >> features be added? >> >> If this feature was added (and no doubt it would be experimental), I believe >> what you've outlined above is a good start for defining what would be >> required for the experimental tag to be removed in the future. >> >> In terms of when an experimental feature should be added, the use cases >> you've outlined above are not the only use cases. The use cases it currently >> addresses come from feedback from users on the mailing list and outside of >> it. They can be built upon incrementally, the feature can be completed while >> allowing users to deploy and test. This will give the benefit of providing >> more feedback and usecases. Another benefit being we avoid a 'big bang' >> implementation in the future with large patch series. >> >> I'm also more than happy to increase the documentation clearly outlining the >> limitations above as well as a how to guide showing how to deploy the >> supported usecases step by step. It will provide clarity to users who wish >> to use the feature. >> >>> >>> NACK for the series. Sorry. >> >> No need to apologize, you've raised valid points and I hope we can discuss >> these further before making a final decision on the series. >> >> Ian > > > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
