Hi Ilya, Kevin,
Thanks for the remarks, I will fix them.
However I have two questions inline.
On 27/10/20 15:03 +0100, Ilya Maximets wrote:
> On 10/20/20 1:15 PM, Kevin Traynor wrote:
> > On 16/09/2020 16:17, Gaetan Rivet wrote:
> >> In some cloud topologies, using DPDK VF representors in guest requires
> >> configuring a VF before it is assigned to the guest.
> >>
> >> A first basic option for such configuration is setting the VF MAC
> >> address. Add a key 'dpdk-vf-mac' to the 'options' column of the Interface
> >> table.
> >>
> >> This option can be used as such:
> >>
> >> $ ovs-vsctl add-port br0 dpdk-rep0 -- set Interface dpdk-rep0 type=dpdk
> >> \
> >> options:dpdk-vf-mac=00:11:22:33:44:55
> >>
> >
> > If I got it right, it doesn't seem like it solves any issues by limiting
> > to representor, but is just an attempt to limit some of the edge cases
> > around bifurcated devices to the devices where the requested
> > functionality is really needed now.
> >
> > This limitation has some costs...
> >
> > We will now have, mac, mac_in_use, dpdk-vf-mac, configured_hwaddr,
> > requested_hwaddr for the user to understand. I think it can be confusing.
>
> To be clear, 'configured_hwaddr/requested_hwaddr' type of report should not
> exist at least in the output of 'get_config()' function. Same applies
> to all existing '{configured,requested}_something' reports.
> 'get_config()' intended to report things that could be copied and passed
> directly
> to database. OTOH, 'get_status()' should report what is an actual device
> status, i.e. 'dpdk-vf-mac' should be in 'get_config()' and 'configured_hwaddr'
> should be in 'get_status()', but it should be named differently, e.g. just
> 'dpdk-vf-mac' and the callback it placed in already says that it's actually
> 'configured'. In short:
> - 'dpdk-vf-mac' from the 'get_config()' should report dev->requested_hwaddr.
> - 'dpdk-vf-mac' from the 'get_status()' should report dev->hwaddr.
> Of course, all this should be reported only if device is a representor.
>
>
Indeed I had followed the other '{configured|requested}_' fields.
This is not related to this patchset, however I'd like to know how to
proceed to fix those other fields? They were exposed to users possibly,
should there be a deprecation process to warn of the change, or should
they just be fixed directly as the get_config() API was not respected?
> >
> > 'ovs-vsctl show' will show the dpdk-vf-mac, which is the main one we
> > document, but it's not clear if it was successful or not. (yes there is
> > a log entry, but it is a separate place).
> >
> > As it is specific for representors, if we ever want to allow setting of
> > mac on any non representor DPDK ports, we have an awkward user
> > interface. We would need to make another 'dpdk-mac' type option, or we
> > decide then to allow use mac in interface table but that it doesn't work
> > for representors, or there are two ways to set for representors etc.
> > None of this seems great.
> >
> > My feeling is that it is making things complicated for the user with
> > this many knobs, where we could just have mac and mac_in_use, live with
> > the edge cases (as Gaetan pointed out, we already do for MTU) and we
> > know as well if we ever need to extend further, the user interface will
> > stay simple. Just my 2C, maybe there's a good argument why we can't do
> > it like this.
>
> Setting mac for physical device has lots of split-brain issues and unclear
> ways how and when undo changes as we do not know what was original mac or
> what to do if there was no mac at all at the start. Moreover, some devices
> will not allow setting mac to it's original value (e.g. all zeroes).
> One more point is that vf and representor itself are different devices and
> these devices could have different mac addresses. For now we agreed that
> DPDK api works for representors in a way that it always changes parameters
> of a virtual function, not representors', but that is not a very clear thing
> for all users (e.g. for DPDK developers), so 'vf' in 'dpdk-vf-mac'
> specifically
> points to that difference.
>
> After a long discussion we decided to make this option as specific as
> possible,
> so users will not try to use it to configure anything else beside mac of a
> virtual function. All restrictions and downsides of configuration of vf mac
> was/should be documented.
>
>
> >
> > Few comments on the code as it is now below.
> >
> >
> >> Signed-off-by: Gaetan Rivet <[email protected]>
> >> Suggested-by: Ilya Maximets <[email protected]>
> >> Acked-by: Eli Britstein <[email protected]>
> >> ---
> >> Documentation/topics/dpdk/phy.rst | 22 ++++++++++++
> >> NEWS | 2 ++
> >> lib/netdev-dpdk.c | 60 +++++++++++++++++++++++++++++++
> >> vswitchd/vswitch.xml | 13 +++++++
> >> 4 files changed, 97 insertions(+)
> >>
> >> diff --git a/Documentation/topics/dpdk/phy.rst
> >> b/Documentation/topics/dpdk/phy.rst
> >> index 38e52c8de..73dcb7c28 100644
> >> --- a/Documentation/topics/dpdk/phy.rst
> >> +++ b/Documentation/topics/dpdk/phy.rst
> >> @@ -379,6 +379,28 @@ an eth device whose mac address is
> >> ``00:11:22:33:44:55``::
> >> $ ovs-vsctl add-port br0 dpdk-mac -- set Interface dpdk-mac type=dpdk
> >> \
> >> options:dpdk-devargs="class=eth,mac=00:11:22:33:44:55"
> >>
> >> +Representor specific configuration
> >> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >> +
> >> +In some topologies, a VF must be configured before being assigned to a
> >> +guest (VM) machine. This configuration is done through VF-specific fields
> >> +in the ``options`` column of the Interface table.
> >> +
> >> +.. important::
> >> +
> >> + Some DPDK port use `bifurcated drivers <bifurcated-drivers>`__,
> >> + which means that a kernel netdevice remains when Open vSwitch is
> >> stopped.
> >> +
> >> + In such case, any configuration applied to a VF would remain set on the
> >> + kernel netdevice, and be inherited from it when Open vSwitch is
> >> restarted,
> >> + even if the options described in this section are unset from Open
> >> vSwitch.
> >> +
> >> +.. _bifurcated-drivers:
> >> http://doc.dpdk.org/guides/linux_gsg/linux_drivers.html#bifurcated-driver
> >> +
> >> +- Configure the VF MAC address::
> >> +
> >> + $ ovs-vsctl set Interface dpdk-rep0
> >> options:dpdk-vf-mac=00:11:22:33:44:55
> >> +
> >
> > As mentioned above, if for some reason setting the mac fails there will
> > be warning in the logs, but 'ovs-vsctl show' will still show the
> > dpdk-vf-mac that was not applied and it's quite prominent. Also, there
> > are mac, mac_in_use, dpdk-vf-mac, configured_hwaddr, requested_hwaddr.
> >
> > Suggest to add a small summary here for each one as it relates
> > representors and make clear to users,
> > 1. how to find the mac the user requested
> > 2. how to find the current mac
> >
> >> Jumbo Frames
> >> ------------
> >>
> >> diff --git a/NEWS b/NEWS
> >> index 2f67d5047..5ece29474 100644
> >> --- a/NEWS
> >> +++ b/NEWS
> >> @@ -24,6 +24,8 @@ v2.14.0 - 17 Aug 2020
> >
> > 2.15....hopefully :-)
> >
> >> to list and change log levels in DPDK components.
> >> * Vhost-user Dequeue zero-copy support is deprecated and will be
> >> removed
> >> in the next release.
> >> + * New 'options:dpdk-vf-mac' field for DPDK interface of VF ports,
> >> + that allows configuring the MAC address of a VF representor.
> >> - Linux datapath:
> >> * Support for kernel versions up to 5.5.x.
> >> - AF_XDP:
> >> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> >> index 4ef7f78a6..060348369 100644
> >> --- a/lib/netdev-dpdk.c
> >> +++ b/lib/netdev-dpdk.c
> >> @@ -523,6 +523,9 @@ struct netdev_dpdk {
> >> * otherwise interrupt mode is used. */
> >> bool requested_lsc_interrupt_mode;
> >> bool lsc_interrupt_mode;
> >> +
> >> + /* VF configuration. */
> >> + struct eth_addr requested_hwaddr;
> >> );
> >>
> >> PADDED_MEMBERS(CACHE_LINE_SIZE,
> >> @@ -1685,6 +1688,16 @@ out:
> >> return ret;
> >> }
> >>
> >> +static bool
> >> +dpdk_port_is_representor(struct netdev_dpdk *dev)
> >> + OVS_REQUIRES(dev->mutex)
> >> +{
> >> + struct rte_eth_dev_info dev_info;
> >> +
> >> + rte_eth_dev_info_get(dev->port_id, &dev_info);
> >> + return (*dev_info.dev_flags) & RTE_ETH_DEV_REPRESENTOR;
> >> +}
> >> +
> >> static int
> >> netdev_dpdk_get_config(const struct netdev *netdev, struct smap *args)
> >> {
> >> @@ -1719,6 +1732,13 @@ netdev_dpdk_get_config(const struct netdev *netdev,
> >> struct smap *args)
> >> }
> >> smap_add(args, "lsc_interrupt_mode",
> >> dev->lsc_interrupt_mode ? "true" : "false");
> >> +
> >> + if (dpdk_port_is_representor(dev)) {
> >> + smap_add_format(args, "requested_hwaddr", ETH_ADDR_FMT,
> >> + ETH_ADDR_ARGS(dev->requested_hwaddr));
> >> + smap_add_format(args, "configured_hwaddr", ETH_ADDR_FMT,
> >> + ETH_ADDR_ARGS(dev->hwaddr));
> >
> > '*mac*' is used elsewhere (mac, mac_in_use, dpdk-vf-mac), think it's
> > better to keep consistent
>
> Since this is 'config' it should report things in a way they could be passed
> to the datbase. So, it should be 'dpdk-vf-mac' and it should report the
> value that was requested, i.e. dev->requested_hwaddr.
>
> >
> >> + }
> >> }
> >> ovs_mutex_unlock(&dev->mutex);
> >>
> >> @@ -1898,6 +1918,7 @@ netdev_dpdk_set_config(struct netdev *netdev, const
> >> struct smap *args,
> >> {RTE_FC_RX_PAUSE, RTE_FC_FULL }
> >> };
> >> const char *new_devargs;
> >> + const char *vf_mac;
> >> int err = 0;
> >>
> >> ovs_mutex_lock(&dpdk_mutex);
> >> @@ -1968,6 +1989,36 @@ netdev_dpdk_set_config(struct netdev *netdev, const
> >> struct smap *args,
> >> goto out;
> >> }
> >>
> >> + vf_mac = smap_get(args, "dpdk-vf-mac");
> >> + if (vf_mac) {
> >> + struct eth_addr mac;
> >> +
> >> + err = EINVAL;
> >> +
> >> + if (!dpdk_port_is_representor(dev)) {
> >> + VLOG_ERR_BUF(errp, "'%s' is trying to set the VF MAC '%s' "
> >> + "but 'options:dpdk-vf-mac' is only supported for
> >> "
> >> + "VF representors.",
> >> + netdev_get_name(netdev), vf_mac);
> >
> > Warnings here seems more in line with the other configs that cannot be
> > applied.
>
> I agree, if it's not a representor we could just issue a warning and continue.
>
Your phrasing seems to limit the (err -> warn) change to the
representor filtering part. Should the MAC sanitizing messages below remain
errors?
> >
> >> + goto out;
> >> + }
> >> + if (!eth_addr_from_string(vf_mac, &mac)) {
> >> + VLOG_ERR_BUF(errp, "interface '%s': cannot parse VF MAC '%s'",
> >> + netdev_get_name(netdev), vf_mac);
> >
> > as above
> >
> >> + goto out;
> >> + }
> >> + if (eth_addr_is_multicast(mac)) {
> >> + VLOG_ERR_BUF(errp,
> >> + "interface '%s': cannot set VF MAC to multicast "
> >> + "address", netdev_get_name(netdev));
> >
> > as above
> >
> >> + goto out;
> >> + }
> >> + if (!eth_addr_equals(dev->requested_hwaddr, mac)) {
> >> + dev->requested_hwaddr = mac;
> >> + netdev_request_reconfigure(netdev);
> >> + }
> >> + }
> >> +
> >> lsc_interrupt_mode = smap_get_bool(args, "dpdk-lsc-interrupt", false);
> >> if (dev->requested_lsc_interrupt_mode != lsc_interrupt_mode) {
> >> dev->requested_lsc_interrupt_mode = lsc_interrupt_mode;
> >> @@ -4938,6 +4989,7 @@ netdev_dpdk_reconfigure(struct netdev *netdev)
> >> && dev->lsc_interrupt_mode == dev->requested_lsc_interrupt_mode
> >> && dev->rxq_size == dev->requested_rxq_size
> >> && dev->txq_size == dev->requested_txq_size
> >> + && eth_addr_equals(dev->hwaddr, dev->requested_hwaddr)
> >> && dev->socket_id == dev->requested_socket_id
> >> && dev->started && !dev->reset_needed) {
> >> /* Reconfiguration is unnecessary */
> >> @@ -4969,6 +5021,14 @@ netdev_dpdk_reconfigure(struct netdev *netdev)
> >> dev->txq_size = dev->requested_txq_size;
> >>
> >> rte_free(dev->tx_q);
> >> +
> >> + if (!eth_addr_equals(dev->hwaddr, dev->requested_hwaddr)) {
> >
> > Looks like if dpdk-vf-mac is not set, after initialization there will be
> > a configured hwaddr, and requested_hwaddr will be all zeros.
> >
> > If this function is called because of some other config change, then
> > hwaddr != requested_hwaddr, and it will attempt to set the mac to the
> > requested_hwaddr of all zeros. This may also fail causing the
> > reconfigure to fail for other configs items.
> >
> > (I admit, I hacked a phy port dev_info to say it was a representor, when
> > playing with this but the issue seems valid)
> >
> >
> >> + err = netdev_dpdk_set_etheraddr__(dev, dev->requested_hwaddr);
> >> + if (err) {
> >> + goto out;
> >> + }
> >> + }
> >> +
> >> err = dpdk_eth_dev_init(dev);
> >> if (dev->hw_ol_features & NETDEV_TX_TSO_OFFLOAD) {
> >> netdev->ol_flags |= NETDEV_TX_OFFLOAD_TCP_TSO;
> >> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> >> index 81c84927f..c0bf03c95 100644
> >> --- a/vswitchd/vswitch.xml
> >> +++ b/vswitchd/vswitch.xml
> >> @@ -3280,6 +3280,19 @@ ovs-vsctl add-port br0 p0 -- set Interface p0
> >> type=patch options:peer=p1 \
> >> descriptors will be used by default.
> >> </p>
> >> </column>
> >> +
> >> + <column name="options" key="dpdk-vf-mac">
> >> + <p>
> >> + Ethernet address to set for this VF interface. If unset then
> >> the
> >> + default MAC address is used:
> >> + </p>
> >> + <ul>
> >> + <li>For most drivers, the default MAC address assigned by
> >> + their hardware.</li><li>For bifurcated drivers, the MAC
> >> currently
> >> + used by the kernel netdevice.</li>
> >
> > nit: the formatting is a little different compared to the other
> > <li></li> usage, newline/indent etc.
> >
> > thanks,
> > Kevin.
> >
> >> + </ul>
> >> + <p>This option may only be used with dpdk VF representors.</p>
> >> + </column>
> >> </group>
> >>
> >> <group title="EMC (Exact Match Cache) Configuration">
> >>
> >
> >
> >
> >
> >
>
--
Gaëtan
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev