>From: Stokes, Ian >Sent: Monday, January 22, 2018 11:16 AM >To: Kavanagh, Mark B <[email protected]>; Flavio Leitner ><[email protected]> >Cc: [email protected] >Subject: RE: [ovs-dev] [RFC PATCH v2 1/1] netdev-dpdk: Fix requested MTU >size validation. > >> Hey Ian, >> >> Thanks for this patch. >> >> Apart from the issues that Flavio pointed out, I have a few additional >> comments inline. > >Thanks Mark, replies inline.
Thanks Ian - one clarification on the final comment inline ;) -Mark > >> >> Cheers, >> Mark >> >> >From: Stokes, Ian >> >Sent: Friday, January 19, 2018 2:47 PM >> >To: Flavio Leitner <[email protected]> >> >Cc: [email protected]; Kavanagh, Mark B <[email protected]> >> >Subject: RE: [ovs-dev] [RFC PATCH v2 1/1] netdev-dpdk: Fix requested >> >MTU size validation. >> > >> >> Hi Ian, >> >> >> >> On Thu, Jan 18, 2018 at 05:32:22PM +0000, Ian Stokes wrote: >> >> > This commit replaces MTU_TO_FRAME_LEN(mtu) with >> >> > MTU_TO_MAX_FRAME_LEN(mtu) when validating if an MTU will exceed >> >> > NETDEV_DPDK_MAX_PKT_LEN in netdev_dpdk_set_mtu(). >> >> > >> >> > When setting an MTU we first check if the requested MTU frame size >> >> > will exceed the maximum packet frame size supported in >> >> > netdev_dpdk_set_mtu(). The MTU frame length is calculated as MTU + >> >> > ETHER_HEADER + ETHER_CRC. The MTU for the device will be set at a >> >> > later stage in dpdk_ethdev_init() using rte_ethdev_set_mtu(mtu). >> >> >> >> Perhaps? >> >> dpdk_eth_dev_init() >> >> rte_eth_dev_set_mtu() >> >> >> >> >> >> > However when using rte_ethdev_set_mtu(mtu) the calculation used to >> >> > check that the MTU does not exceed the max frame length for that >> >> > device varies between DPDK device drivers. For example ixgbe driver >> >> > calculates MTU frame length as >> >> > >> >> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN >> >> > >> >> > i40e driver calculates it as >> >> > >> >> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + I40E_VLAN_TAG_SIZE * 2 >> >> > >> >> > em driver calculates it as >> >> > >> >> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + VLAN_TAG_SIZE >> >> > >> >> > Currently it is possible to set an MTU for a netdev_dpdk device >> >> > that exceeds the upper limit MTU for that devices DPDK driver. This >> >> > leads to a segfault. This is because the MTU frame length >> >> > comparison as is, does not take into account the addition of the >> >> > vlan tag overhead expected in the drivers. The >> >> > netdev_dpdk_set_mtu() call will incorrectly succeed but the >> >> > subsequent dpdk_ethdev_init() will fail before the queues have been >> >> > created for the DPDK device. This coupled with assumptions >> >> > regarding reconfiguration requirements for the netdev will lead to a >> segfault when the rxq is polled for this device. >> >> > >> >> > A simple way to avoid this is by using MTU_TO_MAX_FRAME_LEN(mtu) >> >> > when validating a requested MTU in netdev_dpdk_set_mtu(). >> >> > MTU_TO_MAX_FRAME_LEN(mtu) is equivalent to the following: >> >> > >> >> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + (2 * VLAN_HEADER_LEN) >> >> > >> >> > It's use accounts for the potential of up to two vlan headers being >> >> Since you're spinning a V3 anyway... ;) >> <micronit>Its, not "It's"</micronit> > >Sure. > >> >> >> > included in the overhead required by some devices in MTU frame >> >> > length at the netdev_dpdk_set_mtu() stage. This allows OVS to flag >> >> > an error rather than the DPDK driver if the MTU exceeds the max >> >> > DPDK frame length. OVS can fail gracefully at this point by falling >> >> > back to a >> >> I think 'falling back' here is a bit misleading, as it suggest some kind >> of rollback, or previous state restoration. >> In truth, the MTU in this case never changes; it was initialized to 1500, >> as stays as such, since the new value was rejected. > >Ok, will remove falling back and replace with 'use default'. > >> >> >> > default MTU of 1500 and continue to configure the port. >> >> > >> >> > Note: this fix is a work around, a better approach would be if DPDK >> >> > devices could report the maximum MTU value that can be requested on >> >> > a per device basis. This capability however is not currently >> available. >> >> > A downside of this patch is that the MTU upper limit will be >> >> > reduced by 8 bytes for DPDK devices that do not need to account for >> >> > vlan tags in MTU frame length driver calculations e.g. ixgbe >> >> > devices upper MTU limit is reduced from the OVS point of view from >> 9710 to 9702. >> >> > >> >> > CC: Mark Kavanagh <[email protected]> >> >> > Fixes: 0072e931 ("netdev-dpdk: add support for jumbo frames") >> >> > Signed-off-by: Ian Stokes <[email protected]> >> >> > >> >> > --- >> >> > v1->v2 >> >> > * Add note to limitations in DPDK documentation regarding MTU. >> >> > * Correct MTU calculation in commit message. >> >> > * Flag that we provision 8 bytes (2 x vlan header) by using >> >> > MTU_TO_MAX_FRAME_LEN in commit message and code comments. >> >> > --- >> >> > Documentation/intro/install/dpdk.rst | 12 ++++++++++++ >> >> > lib/netdev-dpdk.c | 13 ++++++++++++- >> >> > 2 files changed, 24 insertions(+), 1 deletions(-) >> >> > >> >> > diff --git a/Documentation/intro/install/dpdk.rst >> >> > b/Documentation/intro/install/dpdk.rst >> >> > index 3fecb5c..5800096 100644 >> >> > --- a/Documentation/intro/install/dpdk.rst >> >> > +++ b/Documentation/intro/install/dpdk.rst >> >> > @@ -585,6 +585,18 @@ Limitations >> >> > >> >> > .. _DPDK release notes: >> >> > http://dpdk.org/doc/guides/rel_notes/release_17_11.html >> >> > >> >> > +- Upper bound MTU: DPDK device drivers differ in how MTU overhead >> >> > +is >> >> > + calculated e.g. i40e driver includes 2 x vlan headers in MTU >> >> > +overhead, >> >> As a general point, this is technically frame (i.e L2) overhead, whereas >> MTU is an L3 concept. I'd rephrase it as 'how the L2 overhead for a given >> MTU value is calculated'. > >Will fix in v3. > >> >> >> > + em driver includes 1 x vlan header, ixgbe driver does not >> >> > +include a vlan >> >> > + header in overhead. Currently it is not possible for OVS DPDK to >> >> > +know what >> >> > + upper bound MTU value is supported for a given device. As such >> >> > +OVS DPDK >> >> > + must provision for the case where the maximum MTU value >> >> > +includes 2 x vlan >> >> > + headers. This reduces the upper bound MTU value for devices that >> >> > +do not >> >> > + include vlan headers in their overhead by 8 bytes e.g. ixgbe >> >> > +devices upper >> >> >> >> two spaces >> >> >> >> > + bound MTU is reduced from 9710 to 9702. This work around is >> >> > + temporary and is expected to be removed once a method is >> >> > + provided by DPDK to query the maximum MTU for a given device. >> >> Has it been confirmed that DPDK will provide such functionality? If not, >> you may want to rephrase the previous sentence, so as not to set >> unrealistic expectations. > >I don't think it's unrealistic, there's agreement from both the OVS and DPDK >community at the community calls that a fix like this is required and would >not be controversial. I can rephrase above to be: No need to rephrase Ian - I just wanted to know if DPDK had confirmed that they will provide a fix; since they have, it's perfectly fine to set that expectation here. > >'This work around is intended to be temporary until DPDK provides a method >to query the maximum MTU for a given device.' > >If that helps? > >> >> >> > + >> >> > Reporting Bugs >> >> > -------------- >> >> > >> >> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index >> >> > e32c7f6..2ac76e3 100644 >> >> > --- a/lib/netdev-dpdk.c >> >> > +++ b/lib/netdev-dpdk.c >> >> > @@ -2134,7 +2134,18 @@ netdev_dpdk_set_mtu(struct netdev *netdev, >> >> > int >> >> > mtu) { >> >> > struct netdev_dpdk *dev = netdev_dpdk_cast(netdev); >> >> > >> >> > - if (MTU_TO_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN >> >> > + /* XXX: We need to ensure the requested MTU frame length does >> >> > + not >> >> I'd probably rephrase to something along these lines: "ensure that the >> overall frame length of the requested MTU...". > >Sure. > >> >> >> > + * surpass the NETDEV_DPDK_MAX_PKT_LEN. DPDK device drivers >> differ >> >> > + * in how the MTU frame length is calculated when >> >> rte_ethdev_set_mtu(mtu) >> >> > + * is called e.g. i40e driver includes 2 x vlan headers in MTU >> >> overhead >> >> > + * the em driver includes 1 x vlan tag in MTU overhead, the >> >> > + ixgbe >> >> driver >> >> > + * does not include vlan tags in MTU overhead. As such we >> >> > + should >> >> use >> >> > + * MTU_TO_MAX_FRAME_LEN(mtu) which includes an additional 2 x >> >> > + vlan >> >> headers >> >> > + * (8 bytes) for comparison. This avoids a failure later with >> >> > + * rte_ethdev_set_mtu(). This approach should be used until >> >> > + DPDK provides >> >> >> >> rte_eth_dev_set_mtu >> >> >> >> > + * a method to retrieve maximum MTU frame length for a given >> >> device. >> >> > + */ >> >> > + if (MTU_TO_MAX_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN >> >> > || mtu < ETHER_MIN_MTU) { >> >> > VLOG_WARN("%s: unsupported MTU %d\n", dev->up.name, mtu); >> >> > return EINVAL; >> >> >> >> We got a bug report for another related MTU issue because in older >> >> OVS/DPDK, the i40e driver didn't include the 2 VLAN headers overhead, >> >> so if the packet length is the maximum allowed, it can't forward >> >> packets with VLAN headers included. So, I agree that this needs >> >> improvements on the DPDK side to report not only the maximum MTU but >> >> also to standardize what means MTU. Today we don't know the overhead >> each PMD will account. >> >> >> >> In the hope that the committer can fix the small typos above or if >> >> there is another spin: >> >> >> >> Acked-by: Flavio Leitner <[email protected]> >> > >> >Thanks Flavio, >> > >> >I agree, we can add this to requirements for DPDK MTU capabilities. >> > >> >I'll re-spin a v3 and remove the RFC, once there's an ACK from Mark >> >I'll add this to the DPDK_MERGE over the weekend. >> > >> >Thanks >> >Ian >> >> >> >> Thanks Ian! >> >> >> >> >> >> > -- >> >> > 1.7.0.7 >> >> > >> >> > _______________________________________________ >> >> > dev mailing list >> >> > [email protected] >> >> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev >> >> >> >> -- >> >> Flavio _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
