Re: [PATCH RFC 0/3] Support standard SRIOV configuration for IB VFs

Jason Gunthorpe Wed, 27 May 2015 15:42:31 -0700

On Thu, May 28, 2015 at 12:58:59AM +0300, Or Gerlitz wrote:
> On Wed, May 27, 2015 at 8:53 PM, Doug Ledford <[email protected]> wrote:
> 
> > On Wed, 2015-05-27 at 11:11 -0600, Jason Gunthorpe wrote:
> > Well, let's just be clear: netlink/iproute2 screwed the pooch on their
> > implementation of this stuff.  Any solution that doesn't include fixing
> > this up in some way is not really a good solution.
> 
> >>  For instance iproute2 would need IB specific code to format
> >> the 'ip link show' (review print_vfinfo in iproute2) and to length
> >> check 'ip link set vf mac'
> 
> >> If we do use 8, then it would be ideal (and my strong preference) to
> >> also fix the IFLA_VF_MAC message to have a working length. I think
> >> that could be done compatibly with a bit of work. At least that way
> >> iproute2 can be kept clean when it learns to do IB, and we could have
> >> the option again of using 20 someday if we need.
> >>
> 
> > Sounds like a reasonable plan.
> > Or, this is your patch set, are you going to pick up these action items?
> 
> Let see
> 
> >> So to be clear, to go with the 8 byte option I suggest:
> 
> >>  - Engage netdev/iproute and confirm they are philosophically OK
> >>    with IFLA_VF_MAC != IFLA_ADDRESS
> 
> the last thing netdev are is having philosophical views, they are very
> practical (and non-perfect and happy and lively community), the one
> thing before the last  netdev are is caring for the rdma subsystem. If
> something has to change @ their direct part, we should come with
> patches.


Yes, you need to talk in patches, I will suggest you start with this
(totally untested, quickly thrown together):

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 8de36824018d..715b59e9925a 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1172,7 +1172,7 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct 
net_device *dev,
                                nla_nest_cancel(skb, vfinfo);
                                goto nla_put_failure;
                        }
-                       if (nla_put(skb, IFLA_VF_MAC, sizeof(vf_mac), &vf_mac) 
||
+                       if (nla_put(skb, IFLA_VF_MAC, sizeof(u32) + 
dev->addr_len, &vf_mac) ||
                            nla_put(skb, IFLA_VF_VLAN, sizeof(vf_vlan), 
&vf_vlan) ||
                            nla_put(skb, IFLA_VF_RATE, sizeof(vf_rate),
                                    &vf_rate) ||
@@ -1292,7 +1292,8 @@ static const struct nla_policy 
ifla_vfinfo_policy[IFLA_VF_INFO_MAX+1] = {
 };
 
 static const struct nla_policy ifla_vf_policy[IFLA_VF_MAX+1] = {
-       [IFLA_VF_MAC]           = { .len = sizeof(struct ifla_vf_mac) },
+       [IFLA_VF_MAC]           = { .type = NLA_BINARY,
+                                   .len = sizeof(struct ifla_vf_mac) },
        [IFLA_VF_VLAN]          = { .len = sizeof(struct ifla_vf_vlan) },
        [IFLA_VF_TX_RATE]       = { .len = sizeof(struct ifla_vf_tx_rate) },
        [IFLA_VF_SPOOFCHK]      = { .len = sizeof(struct ifla_vf_spoofchk) },
@@ -1447,6 +1448,21 @@ static int do_setvfinfo(struct net_device *dev, struct 
nlattr *attr)
                switch (nla_type(vf)) {
                case IFLA_VF_MAC: {
                        struct ifla_vf_mac *ivm;
+                       int sz;
+
+                       /* Legacy compatability, old iproute2 will pass in 32
+                        * bytes for everything, assume it is a 6 byte MAC
+                        * because that is all old iproute really supported.
+                        */
+                       sz = nla_len(attr);
+                       if (sz == 36)
+                               sz = 10;
+
+                       if (sz != (sizeof(u32) + dev->addr_len)) {
+                               err = -EINVAL;
+                               break;
+                       }
+
                        ivm = nla_data(vf);
                        err = -EOPNOTSUPP;
                        if (ops->ndo_set_vf_mac)

And make the argument why the above is OK.
 - Reader side only ever read 6 bytes anyhow, so it doesn't matter
   that ifla_vf_mac is undersized in the RTA_DATA
 - Confirm any other users of this (openstack?) you know are similarly
   OK.
 - Old write side will always send the full 36 byte struct, so
   assume 6 bytes.
 - Justify this as needing to add IB support next which does not use a
   6 byte addr_len
   New code that wants to work with IB will have to use correct lengths.

Then based on that iproute2 can be fixed in the two areas I pointed
out, doing that at the same time will probably earn some rep.

Finally you can make a patch that switches addr_len to something else
- that would trigger the philosophical question of 'what actually is a
IFLA_VF_MAC', if everyone is OK with changing it then golden.

> >>  - Make a iproute patch to use the IFLA_VF_MAC size in print_vfinfo
> >>    instead of hardcoded ETH_ALEN (using len == 32 mean len 6 for compat)
> 
> I was thinking to patch iproute to sense the link type: if eth print
> six bytes, if ipoib print 8 bytes, simple.

This is against the design ideals of netlink: a tool like iproute
should not need link specific knowledge to parse kernel messages.

Now is the best time to make this fix because we can use the sz == 36
so sz == 10 assumption.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH RFC 0/3] Support standard SRIOV configuration for IB VFs

Reply via email to