>From: Stokes, Ian
>Sent: Monday, January 22, 2018 11:16 AM
>To: Kavanagh, Mark B <[email protected]>; Flavio Leitner
><[email protected]>
>Cc: [email protected]
>Subject: RE: [ovs-dev] [RFC PATCH v2 1/1] netdev-dpdk: Fix requested MTU
>size validation.
>
>> Hey Ian,
>>
>> Thanks for this patch.
>>
>> Apart from the issues that Flavio pointed out, I have a few additional
>> comments inline.
>
>Thanks Mark, replies inline.

Thanks Ian - one clarification on the final comment inline ;)
-Mark

>
>>
>> Cheers,
>> Mark
>>
>> >From: Stokes, Ian
>> >Sent: Friday, January 19, 2018 2:47 PM
>> >To: Flavio Leitner <[email protected]>
>> >Cc: [email protected]; Kavanagh, Mark B <[email protected]>
>> >Subject: RE: [ovs-dev] [RFC PATCH v2 1/1] netdev-dpdk: Fix requested
>> >MTU size validation.
>> >
>> >> Hi Ian,
>> >>
>> >> On Thu, Jan 18, 2018 at 05:32:22PM +0000, Ian Stokes wrote:
>> >> > This commit replaces MTU_TO_FRAME_LEN(mtu) with
>> >> > MTU_TO_MAX_FRAME_LEN(mtu) when validating if an MTU will exceed
>> >> > NETDEV_DPDK_MAX_PKT_LEN in netdev_dpdk_set_mtu().
>> >> >
>> >> > When setting an MTU we first check if the requested MTU frame size
>> >> > will exceed the maximum packet frame size supported in
>> >> > netdev_dpdk_set_mtu(). The MTU frame length is calculated as MTU +
>> >> > ETHER_HEADER + ETHER_CRC. The MTU for the device will be set at a
>> >> > later stage in dpdk_ethdev_init() using rte_ethdev_set_mtu(mtu).
>> >>
>> >> Perhaps?
>> >> dpdk_eth_dev_init()
>> >> rte_eth_dev_set_mtu()
>> >>
>> >>
>> >> > However when using rte_ethdev_set_mtu(mtu) the calculation used to
>> >> > check that the MTU does not exceed the max frame length for that
>> >> > device varies between DPDK device drivers. For example ixgbe driver
>> >> > calculates MTU frame length  as
>> >> >
>> >> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN
>> >> >
>> >> > i40e driver calculates it as
>> >> >
>> >> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + I40E_VLAN_TAG_SIZE * 2
>> >> >
>> >> > em driver calculates it as
>> >> >
>> >> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + VLAN_TAG_SIZE
>> >> >
>> >> > Currently it is possible to set an MTU for a netdev_dpdk device
>> >> > that exceeds the upper limit MTU for that devices DPDK driver. This
>> >> > leads to a segfault. This is because the MTU frame length
>> >> > comparison as is, does not take into account the addition of the
>> >> > vlan tag overhead expected in the drivers. The
>> >> > netdev_dpdk_set_mtu() call will incorrectly succeed but the
>> >> > subsequent dpdk_ethdev_init() will fail before the queues have been
>> >> > created for the DPDK device. This coupled with assumptions
>> >> > regarding reconfiguration requirements for the netdev will lead to a
>> segfault when the rxq is polled for this device.
>> >> >
>> >> > A simple way to avoid this is by using MTU_TO_MAX_FRAME_LEN(mtu)
>> >> > when validating a requested MTU in netdev_dpdk_set_mtu().
>> >> > MTU_TO_MAX_FRAME_LEN(mtu) is equivalent to the following:
>> >> >
>> >> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + (2 * VLAN_HEADER_LEN)
>> >> >
>> >> > It's use accounts for the potential of up to two vlan headers being
>>
>> Since you're spinning a V3 anyway... ;)
>> <micronit>Its, not "It's"</micronit>
>
>Sure.
>
>>
>> >> > included in the overhead required by some devices in MTU frame
>> >> > length at the netdev_dpdk_set_mtu() stage. This allows OVS to flag
>> >> > an error rather than the DPDK driver if the MTU exceeds the max
>> >> > DPDK frame length. OVS can fail gracefully at this point by falling
>> >> > back to a
>>
>> I think 'falling back' here is a bit misleading, as it suggest some kind
>> of rollback, or previous state restoration.
>> In truth, the MTU in this case never changes; it was initialized to 1500,
>> as stays as such, since the new value was rejected.
>
>Ok, will remove falling back and replace with 'use default'.
>
>>
>> >> > default MTU of 1500 and continue to configure the port.
>> >> >
>> >> > Note: this fix is a work around, a better approach would be if DPDK
>> >> > devices could report the maximum MTU value that can be requested on
>> >> > a per device basis. This capability however is not currently
>> available.
>> >> > A downside of this patch is that the MTU upper limit will be
>> >> > reduced by 8 bytes for DPDK devices that do not need to account for
>> >> > vlan tags in MTU frame length driver calculations e.g. ixgbe
>> >> > devices upper MTU limit is reduced from the OVS point of view from
>> 9710 to 9702.
>> >> >
>> >> > CC: Mark Kavanagh <[email protected]>
>> >> > Fixes: 0072e931 ("netdev-dpdk: add support for jumbo frames")
>> >> > Signed-off-by: Ian Stokes <[email protected]>
>> >> >
>> >> > ---
>> >> > v1->v2
>> >> > * Add note to limitations in DPDK documentation regarding MTU.
>> >> > * Correct MTU calculation in commit message.
>> >> > * Flag that we provision 8 bytes (2 x vlan header) by using
>> >> >   MTU_TO_MAX_FRAME_LEN in commit message and code comments.
>> >> > ---
>> >> >  Documentation/intro/install/dpdk.rst |   12 ++++++++++++
>> >> >  lib/netdev-dpdk.c                    |   13 ++++++++++++-
>> >> >  2 files changed, 24 insertions(+), 1 deletions(-)
>> >> >
>> >> > diff --git a/Documentation/intro/install/dpdk.rst
>> >> > b/Documentation/intro/install/dpdk.rst
>> >> > index 3fecb5c..5800096 100644
>> >> > --- a/Documentation/intro/install/dpdk.rst
>> >> > +++ b/Documentation/intro/install/dpdk.rst
>> >> > @@ -585,6 +585,18 @@ Limitations
>> >> >
>> >> >  .. _DPDK release notes:
>> >> > http://dpdk.org/doc/guides/rel_notes/release_17_11.html
>> >> >
>> >> > +- Upper bound MTU: DPDK device drivers differ in how MTU overhead
>> >> > +is
>> >> > +  calculated e.g. i40e driver includes 2 x vlan headers in MTU
>> >> > +overhead,
>>
>> As a general point, this is technically frame (i.e L2) overhead, whereas
>> MTU is an L3 concept. I'd rephrase it as 'how the L2 overhead for a given
>> MTU value is calculated'.
>
>Will fix in v3.
>
>>
>> >> > +  em driver includes 1 x vlan header, ixgbe driver does not
>> >> > +include a vlan
>> >> > +  header in overhead. Currently it is not possible for OVS DPDK to
>> >> > +know what
>> >> > +  upper bound MTU value is supported for a given device. As such
>> >> > +OVS DPDK
>> >> > +  must provision for the case where the  maximum MTU value
>> >> > +includes 2 x vlan
>> >> > +  headers. This reduces the upper bound MTU value for devices that
>> >> > +do not
>> >> > +  include vlan headers in their overhead by 8 bytes e.g. ixgbe
>> >> > +devices  upper
>> >>
>> >> two spaces
>> >>
>> >> > +  bound MTU is reduced from 9710 to 9702. This work around is
>> >> > + temporary and  is expected to be removed once a method is
>> >> > + provided by DPDK to query the  maximum MTU for a given device.
>>
>> Has it been confirmed that DPDK will provide such functionality? If not,
>> you may want to rephrase the previous sentence, so as not to set
>> unrealistic expectations.
>
>I don't think it's unrealistic, there's agreement from both the OVS and DPDK
>community at the community calls that a fix like this is required and would
>not be controversial. I can rephrase above to be:

No need to rephrase Ian - I just wanted to know if DPDK had confirmed that they 
will provide a fix; since they have, it's perfectly fine to set that 
expectation here.
 
>
>'This work around is intended to be temporary until DPDK provides a method
>to query the  maximum MTU for a given device.'
>
>If that helps?
>
>>
>> >> > +
>> >> >  Reporting Bugs
>> >> >  --------------
>> >> >
>> >> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index
>> >> > e32c7f6..2ac76e3 100644
>> >> > --- a/lib/netdev-dpdk.c
>> >> > +++ b/lib/netdev-dpdk.c
>> >> > @@ -2134,7 +2134,18 @@ netdev_dpdk_set_mtu(struct netdev *netdev,
>> >> > int
>> >> > mtu)  {
>> >> >      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>> >> >
>> >> > -    if (MTU_TO_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
>> >> > +    /* XXX: We need to ensure the requested MTU frame length does
>> >> > + not
>>
>> I'd probably rephrase to something along these lines: "ensure that the
>> overall frame length of the requested MTU...".
>
>Sure.
>
>>
>> >> > +     * surpass the NETDEV_DPDK_MAX_PKT_LEN. DPDK device drivers
>> differ
>> >> > +     * in how the MTU frame length is calculated when
>> >> rte_ethdev_set_mtu(mtu)
>> >> > +     * is called e.g. i40e driver includes 2 x vlan headers in MTU
>> >> overhead
>> >> > +     * the em driver includes 1 x vlan tag in MTU overhead, the
>> >> > + ixgbe
>> >> driver
>> >> > +     * does not include vlan tags in MTU overhead. As such we
>> >> > + should
>> >> use
>> >> > +     * MTU_TO_MAX_FRAME_LEN(mtu) which includes an additional 2 x
>> >> > + vlan
>> >> headers
>> >> > +     * (8 bytes) for comparison. This avoids a failure later with
>> >> > +     * rte_ethdev_set_mtu(). This approach should be used until
>> >> > + DPDK provides
>> >>
>> >> rte_eth_dev_set_mtu
>> >>
>> >> > +     * a method to retrieve maximum MTU frame length for a given
>> >> device.
>> >> > +     */
>> >> > +    if (MTU_TO_MAX_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
>> >> >          || mtu < ETHER_MIN_MTU) {
>> >> >          VLOG_WARN("%s: unsupported MTU %d\n", dev->up.name, mtu);
>> >> >          return EINVAL;
>> >>
>> >> We got a bug report for another related MTU issue because in older
>> >> OVS/DPDK, the i40e driver didn't include the 2 VLAN headers overhead,
>> >> so if the packet length is the maximum allowed, it can't forward
>> >> packets with VLAN headers included. So, I agree that this needs
>> >> improvements on the DPDK side to report not only the maximum MTU but
>> >> also to standardize what means MTU. Today we don't know the overhead
>> each PMD will account.
>> >>
>> >> In the hope that the committer can fix the small typos above or if
>> >> there is another spin:
>> >>
>> >> Acked-by: Flavio Leitner <[email protected]>
>> >
>> >Thanks Flavio,
>> >
>> >I agree, we can add this to requirements for DPDK MTU capabilities.
>> >
>> >I'll re-spin a v3 and remove the RFC, once there's an ACK from Mark
>> >I'll add this to the DPDK_MERGE over the weekend.
>> >
>> >Thanks
>> >Ian
>> >>
>> >> Thanks Ian!
>> >>
>> >>
>> >> > --
>> >> > 1.7.0.7
>> >> >
>> >> > _______________________________________________
>> >> > dev mailing list
>> >> > [email protected]
>> >> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>> >>
>> >> --
>> >> Flavio

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to