Re: [RFC RTNETLINK 04/09]: Link creation API
Stephen Hemminger wrote: On Wed, 06 Jun 2007 01:17:11 +0200 Patrick McHardy [EMAIL PROTECTED] wrote: If you want I'll extend existing bridge netlink to use these. Are you talking about brige-port information or bridge device configuration? So far the API is not suitable for anything that currently uses IFLA_PROTINFO because the sender is not the driver which created the device and doesn't use AF_UNSPEC. For bridge device configuration it would certainly be nice to have, but I'm not sure yet how to handle enslave operations. So far my favourite idea is to add enslave/release operations to rtnl_link_ops and call them when IFLA_MASTER is set (so the netlink message would look like this: ifindex: eth0 master: br0 nlmsg_type: RTM_NETLINK). But I haven't really thought this through yet. Was thinking AF_BRIDGE (we have it already so use it), and both add/remove bridge, and enslave/unslave device. I think we should use two seperate families for bridge devices and bridge ports. I'll think about the enslave operation some more and try to add it next week. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC RTNETLINK 04/09]: Link creation API
+static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) +{ + struct rtnl_link_ops *ops; + struct net_device *dev; + struct ifinfomsg *ifm; + char name[MODULE_NAME_LEN]; + char ifname[IFNAMSIZ]; + struct nlattr *tb[IFLA_MAX+1]; + struct nlattr *linkinfo[IFLA_INFO_MAX+1]; + int err; + + err = nlmsg_parse(nlh, sizeof(*ifm), tb, IFLA_MAX, ifla_policy); + if (err 0) + return err; + + if (tb[IFLA_IFNAME]) + nla_strlcpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ); + else + ifname[0] = '\0'; + + ifm = nlmsg_data(nlh); + if (ifm-ifi_index 0) + dev = __dev_get_by_index(ifm-ifi_index); + else if (ifname[0]) + dev = __dev_get_by_name(ifname); + else + dev = NULL; + + if (tb[IFLA_LINKINFO]) { + err = nla_parse_nested(linkinfo, IFLA_INFO_MAX, +tb[IFLA_LINKINFO], ifla_info_policy); + if (err 0) + return err; + } else + memset(linkinfo, 0, sizeof(linkinfo)); + + if (linkinfo[IFLA_INFO_NAME]) { + nla_strlcpy(name, linkinfo[IFLA_INFO_NAME], sizeof(name)); + ops = rtnl_link_ops_get(name); Ugh. Shouldn't we have the request_module logic here? Otherwise it looks like we can skip the validate method and have other weird interactions. + } else { + name[0] = '\0'; + ops = NULL; + } + + if (1) { + struct nlattr *attr[ops ? ops-maxtype + 1 : 0], **data = NULL; + + if (ops) { + if (ops-maxtype linkinfo[IFLA_INFO_DATA]) { + err = nla_parse_nested(attr, ops-maxtype, + linkinfo[IFLA_INFO_DATA], +ops-policy); + if (err 0) + return err; + data = attr; + } + if (ops-validate) { + err = ops-validate(tb, data); + if (err 0) + return err; + } + } + + if (dev) { + int modified = 0; + + if (nlh-nlmsg_flags NLM_F_EXCL) + return -EEXIST; + if (nlh-nlmsg_flags NLM_F_REPLACE) + return -EOPNOTSUPP; + + if (linkinfo[IFLA_INFO_DATA]) { + if (!ops || ops != dev-rtnl_link_ops || + !ops-changelink) + return -EOPNOTSUPP; + + err = ops-changelink(dev, tb, data); + if (err 0) + return err; + modified = 1; + } + + return do_setlink(dev, ifm, tb, ifname, modified); + } + + if (!(nlh-nlmsg_flags NLM_F_CREATE)) + return -ENODEV; + + if (ifm-ifi_index) + return -EINVAL; + if (tb[IFLA_ADDRESS] || tb[IFLA_BROADCAST] || tb[IFLA_MAP]) + return -EOPNOTSUPP; + +#ifdef CONFIG_KMOD + if (!ops name[0]) { + /* race condition: device may be created while rtnl is + * unlocked, final register_netdevice will catch it. + */ + __rtnl_unlock(); + request_module(rtnl-link-%s, name); + rtnl_lock(); + ops = rtnl_link_ops_get(name); + } +#endif + if (!ops) + return -EOPNOTSUPP; + + if (!ifname[0]) + snprintf(ifname, IFNAMSIZ, %s%%d, ops-name); + dev = alloc_netdev(ops-priv_size, ifname, ops-setup); + if (!dev) + return -ENOMEM; + + if (strchr(dev-name, '%')) { + err = dev_alloc_name(dev, dev-name); + if (err 0) + goto err_free; + } + dev-rtnl_link_ops = ops; + + if (tb[IFLA_MTU]) + dev-mtu = nla_get_u32(tb[IFLA_MTU]); + if (tb[IFLA_TXQLEN]) + dev-tx_queue_len = nla_get_u32(tb[IFLA_TXQLEN]); + if (tb[IFLA_WEIGHT]) + dev-weight = nla_get_u32(tb[IFLA_WEIGHT]); + if (tb[IFLA_OPERSTATE]) + set_operstate(dev, nla_get_u8(tb[IFLA_OPERSTATE])); + if (tb[IFLA_LINKMODE]) + dev-link_mode = nla_get_u8(tb[IFLA_LINKMODE]); + +
Re: [RFC RTNETLINK 04/09]: Link creation API
Eric W. Biederman wrote: + if (linkinfo[IFLA_INFO_NAME]) { + nla_strlcpy(name, linkinfo[IFLA_INFO_NAME], sizeof(name)); + ops = rtnl_link_ops_get(name); Ugh. Shouldn't we have the request_module logic here? Otherwise it looks like we can skip the validate method and have other weird interactions. Good catch. The easiest solution seems be to simply replay the request after successful module load, which also avoids the device lookup race. Something like this (untested). diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 8d2f817..f2868b0 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -930,6 +930,7 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) struct nlattr *linkinfo[IFLA_INFO_MAX+1]; int err; +replay: err = nlmsg_parse(nlh, sizeof(*ifm), tb, IFLA_MAX, ifla_policy); if (err 0) return err; @@ -1012,19 +1013,19 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) if (tb[IFLA_ADDRESS] || tb[IFLA_BROADCAST] || tb[IFLA_MAP]) return -EOPNOTSUPP; + if (!ops) { #ifdef CONFIG_KMOD - if (!ops kind[0]) { - /* race condition: device may be created while rtnl is -* unlocked, final register_netdevice will catch it. -*/ - __rtnl_unlock(); - request_module(rtnl-link-%s, kind); - rtnl_lock(); - ops = rtnl_link_ops_get(kind); - } + if (kind[0]) { + __rtnl_unlock(); + request_module(rtnl-link-%s, kind); + rtnl_lock(); + ops = rtnl_link_ops_get(kind); + if (ops) + goto replay; + } #endif - if (!ops) return -EOPNOTSUPP; + } if (!ifname[0]) snprintf(ifname, IFNAMSIZ, %s%%d, ops-kind);
Re: [RFC RTNETLINK 04/09]: Link creation API
Patrick McHardy [EMAIL PROTECTED] writes: Eric W. Biederman wrote: +if (linkinfo[IFLA_INFO_NAME]) { +nla_strlcpy(name, linkinfo[IFLA_INFO_NAME], sizeof(name)); +ops = rtnl_link_ops_get(name); Ugh. Shouldn't we have the request_module logic here? Otherwise it looks like we can skip the validate method and have other weird interactions. Good catch. The easiest solution seems be to simply replay the request after successful module load, which also avoids the device lookup race. Looks reasonable to me. There might be a variable or two that we need to make certain is initialized but at a quick glance it looks ok. Eric Something like this (untested). diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 8d2f817..f2868b0 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -930,6 +930,7 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) struct nlattr *linkinfo[IFLA_INFO_MAX+1]; int err; +replay: err = nlmsg_parse(nlh, sizeof(*ifm), tb, IFLA_MAX, ifla_policy); if (err 0) return err; @@ -1012,19 +1013,19 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) if (tb[IFLA_ADDRESS] || tb[IFLA_BROADCAST] || tb[IFLA_MAP]) return -EOPNOTSUPP; + if (!ops) { #ifdef CONFIG_KMOD - if (!ops kind[0]) { - /* race condition: device may be created while rtnl is - * unlocked, final register_netdevice will catch it. - */ - __rtnl_unlock(); - request_module(rtnl-link-%s, kind); - rtnl_lock(); - ops = rtnl_link_ops_get(kind); - } + if (kind[0]) { + __rtnl_unlock(); + request_module(rtnl-link-%s, kind); + rtnl_lock(); + ops = rtnl_link_ops_get(kind); + if (ops) + goto replay; + } #endif - if (!ops) return -EOPNOTSUPP; + } if (!ifname[0]) snprintf(ifname, IFNAMSIZ, %s%%d, ops-kind); - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC RTNETLINK 04/09]: Link creation API
[RTNETLINK]: Link creation API Add rtnetlink API for creating, changing and deleting software devices. Signed-off-by: Patrick McHardy [EMAIL PROTECTED] --- commit 0323e7d1e7d5042492684264cfcba6d7ff55c473 tree 161530836d43b39ddef42a2c2b48b82187580e3c parent 40b0b8787c1057d055baa6e3d11ff6db7783c982 author Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 15:40:12 +0200 committer Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 15:40:12 +0200 include/linux/if_link.h | 13 ++ include/linux/netdevice.h |3 include/net/rtnetlink.h | 57 net/core/rtnetlink.c | 338 - 4 files changed, 404 insertions(+), 7 deletions(-) diff --git a/include/linux/if_link.h b/include/linux/if_link.h index 604c243..e46ed94 100644 --- a/include/linux/if_link.h +++ b/include/linux/if_link.h @@ -76,6 +76,8 @@ enum #define IFLA_WEIGHT IFLA_WEIGHT IFLA_OPERSTATE, IFLA_LINKMODE, + IFLA_LINKINFO, +#define IFLA_LINKINFO IFLA_LINKINFO __IFLA_MAX }; @@ -140,4 +142,15 @@ struct ifla_cacheinfo __u32 retrans_time; }; +enum +{ + IFLA_INFO_UNSPEC, + IFLA_INFO_NAME, + IFLA_INFO_DATA, + IFLA_INFO_XSTATS, + __IFLA_INFO_MAX, +}; + +#define IFLA_INFO_MAX (__IFLA_INFO_MAX - 1) + #endif /* _LINUX_IF_LINK_H */ diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 3a70f55..e327ccc 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -540,6 +540,9 @@ struct net_device struct device dev; /* space for optional statistics and wireless sysfs groups */ struct attribute_group *sysfs_groups[3]; + + /* rtnetlink link ops */ + struct rtnl_link_ops*rtnl_link_ops; }; #define to_net_dev(d) container_of(d, struct net_device, dev) diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h index 3b3d474..d744198 100644 --- a/include/net/rtnetlink.h +++ b/include/net/rtnetlink.h @@ -22,4 +22,61 @@ static inline int rtnl_msg_family(struct nlmsghdr *nlh) return AF_UNSPEC; } +/** + * struct rtnl_link_ops - rtnetlink link operations + * + * @list: Used internally + * @name: Identifier + * @maxtype: Highest device specific netlink attribute number + * @policy: Netlink policy for device specific attribute validation + * @validate: Optional validation function for netlink/changelink parameters + * @priv_size: sizeof net_device private space + * @setup: net_device setup function + * @newlink: Function for configuring and registering a new device + * @changelink: Function for changing parameters of an existing device + * @dellink: Function to remove a device + * @get_size: Function to calculate required room for dumping device + *specific netlink attributes + * @fill_info: Function to dump device specific netlink attributes + * @xstats_size: Size of device specific statistics + * @fill_xstats: Function to dump device specific statistics + */ +struct rtnl_link_ops { + struct list_headlist; + + const char *name; + + size_t priv_size; + void(*setup)(struct net_device *dev); + + int maxtype; + const struct nla_policy *policy; + int (*validate)(struct nlattr *tb[], + struct nlattr *data[]); + + int (*newlink)(struct net_device *dev, + struct nlattr *tb[], + struct nlattr *data[]); + int (*changelink)(struct net_device *dev, + struct nlattr *tb[], + struct nlattr *data[]); + void(*dellink)(struct net_device *dev); + + size_t (*get_size)(struct net_device *dev); + int (*fill_info)(struct sk_buff *skb, +struct net_device *dev); + + size_t xstats_size; + int (*fill_xstats)(struct sk_buff *skb, + struct net_device *dev); +}; + +extern int __rtnl_link_register(struct rtnl_link_ops *ops); +extern void__rtnl_link_unregister(struct rtnl_link_ops *ops); + +extern int rtnl_link_register(struct rtnl_link_ops *ops); +extern voidrtnl_link_unregister(struct rtnl_link_ops *ops); + +#define MODULE_ALIAS_RTNL_LINK(name) MODULE_ALIAS(rtnl-link- #name) + #endif diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 25ca219..ed17288 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -243,6 +243,141 @@ void rtnl_unregister_all(int protocol) EXPORT_SYMBOL_GPL(rtnl_unregister_all); +static LIST_HEAD(link_ops); +
Re: [RFC RTNETLINK 04/09]: Link creation API
From: Patrick McHardy [EMAIL PROTECTED] Date: Tue, 5 Jun 2007 16:12:57 +0200 (MEST) [RTNETLINK]: Link creation API Add rtnetlink API for creating, changing and deleting software devices. Signed-off-by: Patrick McHardy [EMAIL PROTECTED] Looks mostly fine, perhaps you can make even more use of 'const' for instances of struct rtnl_link_ops * at least as function arguments deeper in the implementation. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC RTNETLINK 04/09]: Link creation API
David Miller wrote: From: Patrick McHardy [EMAIL PROTECTED] Date: Tue, 5 Jun 2007 16:12:57 +0200 (MEST) [RTNETLINK]: Link creation API Add rtnetlink API for creating, changing and deleting software devices. Signed-off-by: Patrick McHardy [EMAIL PROTECTED] Looks mostly fine, perhaps you can make even more use of 'const' for instances of struct rtnl_link_ops * at least as function arguments deeper in the implementation. Good suggestion. I initially had rtnl_link_ops const everywhere (since the lookup was by family I stored it in an array and didn't need the list_head), then changed it to a list and removed all consts. I'll add them back where possible. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC RTNETLINK 04/09]: Link creation API
On Tue, 5 Jun 2007 16:12:57 +0200 (MEST) Patrick McHardy [EMAIL PROTECTED] wrote: [RTNETLINK]: Link creation API Add rtnetlink API for creating, changing and deleting software devices. Signed-off-by: Patrick McHardy [EMAIL PROTECTED] If you want I'll extend existing bridge netlink to use these. -- Stephen Hemminger [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC RTNETLINK 04/09]: Link creation API
Stephen Hemminger wrote: On Tue, 5 Jun 2007 16:12:57 +0200 (MEST) Patrick McHardy [EMAIL PROTECTED] wrote: [RTNETLINK]: Link creation API Add rtnetlink API for creating, changing and deleting software devices. Signed-off-by: Patrick McHardy [EMAIL PROTECTED] If you want I'll extend existing bridge netlink to use these. Are you talking about brige-port information or bridge device configuration? So far the API is not suitable for anything that currently uses IFLA_PROTINFO because the sender is not the driver which created the device and doesn't use AF_UNSPEC. For bridge device configuration it would certainly be nice to have, but I'm not sure yet how to handle enslave operations. So far my favourite idea is to add enslave/release operations to rtnl_link_ops and call them when IFLA_MASTER is set (so the netlink message would look like this: ifindex: eth0 master: br0 nlmsg_type: RTM_NETLINK). But I haven't really thought this through yet. I would also like to add support for handling secondary device state like bridge port state and others that currently use IFLA_PROTINFO (a lot of the code is very similar to the generic code), but all ideas so far turned out not to work very well. I'm leaving for a short vacation until Sunday tommorrow, so replies may be delayed :) - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC RTNETLINK 04/09]: Link creation API
On Wed, 06 Jun 2007 01:17:11 +0200 Patrick McHardy [EMAIL PROTECTED] wrote: Stephen Hemminger wrote: On Tue, 5 Jun 2007 16:12:57 +0200 (MEST) Patrick McHardy [EMAIL PROTECTED] wrote: [RTNETLINK]: Link creation API Add rtnetlink API for creating, changing and deleting software devices. Signed-off-by: Patrick McHardy [EMAIL PROTECTED] If you want I'll extend existing bridge netlink to use these. Are you talking about brige-port information or bridge device configuration? So far the API is not suitable for anything that currently uses IFLA_PROTINFO because the sender is not the driver which created the device and doesn't use AF_UNSPEC. For bridge device configuration it would certainly be nice to have, but I'm not sure yet how to handle enslave operations. So far my favourite idea is to add enslave/release operations to rtnl_link_ops and call them when IFLA_MASTER is set (so the netlink message would look like this: ifindex: eth0 master: br0 nlmsg_type: RTM_NETLINK). But I haven't really thought this through yet. Was thinking AF_BRIDGE (we have it already so use it), and both add/remove bridge, and enslave/unslave device. I would also like to add support for handling secondary device state like bridge port state and others that currently use IFLA_PROTINFO (a lot of the code is very similar to the generic code), but all ideas so far turned out not to work very well. I'm leaving for a short vacation until Sunday tommorrow, so replies may be delayed :) -- Stephen Hemminger [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html