Re: [PATCH net-next v2 1/2] netlink: ipv4 igmp join notifications
On Mon, 2018-09-03 at 16:12 -0700, Roopa Prabhu wrote: > On Sun, Sep 2, 2018 at 4:18 AM, Patrick Ruddy > wrote: > > Hi Roopa > > > > inline > > > > thx > > > > -pr > > > > On Fri, 2018-08-31 at 09:29 -0700, Roopa Prabhu wrote: > > > On Fri, Aug 31, 2018 at 4:20 AM, Patrick Ruddy > > > wrote: > > > > Some userspace applications need to know about IGMP joins from the > > > > kernel > > > > for 2 reasons > > > > 1. To allow the programming of multicast MAC filters in hardware > > > > 2. To form a multicast FORUS list for non link-local multicast > > > >groups to be sent to the kernel and from there to the interested > > > >party. > > > > (1) can be fulfilled but simply sending the hardware multicast MAC > > > > address to be programmed but (2) requires the L3 address to be sent > > > > since this cannot be constructed from the MAC address whereas the > > > > reverse translation is a standard library function. > > > > > > > > This commit provides addition and deletion of multicast addresses > > > > using the RTM_NEWADDR and RTM_DELADDR messages. It also provides > > > > the RTM_GETADDR extension to allow multicast join state to be read > > > > from the kernel. > > > > > > > > Signed-off-by: Patrick Ruddy > > > > --- > > > > v2: fix kbuild warnings. > > > > > > I am still going through the series, but AFAICT, user-space caches > > > listening to > > > RTNLGRP_IPV4_IFADDR will now also get multicast addresses by default ? > > > > > > > Yes that's the crux of this change. It's unfortunate that I could not > > use IFA_MULTICAST to distinguish the SAFI. I suppose the other option > > would be to create a set of new NEW/DEL/GETMULTICAST messages but the > > partial code for RTM_GETMULTICAST in ipv6/mcast.c complicates that > > slightly. Happy to look at it if you think that would be be better. > > > > yeah, true. Thinking about this some more, you are adding an interface > for multicast entries learnt via igmp. > There is already a netlink channel for layer2 mc addresses via igmp. I > can't see why that cannot be used. > It is RTM_*MDB msgs. It is currently only available for the bridge. > But, I have a requirement for it to be > available via a vxlan dev...so, I am looking at making it available on > other devices. > > Can you check if RTM_*MDB msgs can be made to work for your case ?. > > The reason I think it should be possible is because this is similar to > bridge fdb entries. > The bridge fdb api (RTM_NEWNEIGH with AF_BRIDGE) is overloaded to > notify and dump netdev unicast addresses. > similarly I think the mdb api can be overloaded to notify and dump > netdev multicast addresses (statically added or learnt via igmp) If I'm reading this correctly I think overloading this channel is possible. What you're suggesting is overloading the RTM_***MDB messages with AF_INET and AF_INET6 to carry the per-interfaces joined l3 multicast addresses. I've thrown together a quick test of this and it looks good. I can polish this up and resubmit if you're happy with the approach. FWIW isolating the multicast addresses this was seems safer and it's a smaller patchset. thx -pr
Re: [PATCH net-next v2 1/2] netlink: ipv4 igmp join notifications
On Mon, 2018-09-03 at 16:12 -0700, Roopa Prabhu wrote: > On Sun, Sep 2, 2018 at 4:18 AM, Patrick Ruddy > wrote: > > Hi Roopa > > > > inline > > > > thx > > > > -pr > > > > On Fri, 2018-08-31 at 09:29 -0700, Roopa Prabhu wrote: > > > On Fri, Aug 31, 2018 at 4:20 AM, Patrick Ruddy > > > wrote: > > > > Some userspace applications need to know about IGMP joins from the > > > > kernel > > > > for 2 reasons > > > > 1. To allow the programming of multicast MAC filters in hardware > > > > 2. To form a multicast FORUS list for non link-local multicast > > > >groups to be sent to the kernel and from there to the interested > > > >party. > > > > (1) can be fulfilled but simply sending the hardware multicast MAC > > > > address to be programmed but (2) requires the L3 address to be sent > > > > since this cannot be constructed from the MAC address whereas the > > > > reverse translation is a standard library function. > > > > > > > > This commit provides addition and deletion of multicast addresses > > > > using the RTM_NEWADDR and RTM_DELADDR messages. It also provides > > > > the RTM_GETADDR extension to allow multicast join state to be read > > > > from the kernel. > > > > > > > > Signed-off-by: Patrick Ruddy > > > > --- > > > > v2: fix kbuild warnings. > > > > > > I am still going through the series, but AFAICT, user-space caches > > > listening to > > > RTNLGRP_IPV4_IFADDR will now also get multicast addresses by default ? > > > > > > > Yes that's the crux of this change. It's unfortunate that I could not > > use IFA_MULTICAST to distinguish the SAFI. I suppose the other option > > would be to create a set of new NEW/DEL/GETMULTICAST messages but the > > partial code for RTM_GETMULTICAST in ipv6/mcast.c complicates that > > slightly. Happy to look at it if you think that would be be better. > > > > yeah, true. Thinking about this some more, you are adding an interface > for multicast entries learnt via igmp. > There is already a netlink channel for layer2 mc addresses via igmp. I > can't see why that cannot be used. > It is RTM_*MDB msgs. It is currently only available for the bridge. > But, I have a requirement for it to be > available via a vxlan dev...so, I am looking at making it available on > other devices. > > The reason I think it should be possible is because this is similar to > bridge fdb entries. > The bridge fdb api (RTM_NEWNEIGH with AF_BRIDGE) is overloaded to > notify and dump netdev unicast addresses. > similarly I think the mdb api can be overloaded to notify and dump > netdev multicast addresses (statically added or learnt via igmp) OK I'll take a look at this.
Re: [PATCH net-next v2 1/2] netlink: ipv4 igmp join notifications
On Sun, Sep 2, 2018 at 4:18 AM, Patrick Ruddy wrote: > Hi Roopa > > inline > > thx > > -pr > > On Fri, 2018-08-31 at 09:29 -0700, Roopa Prabhu wrote: >> On Fri, Aug 31, 2018 at 4:20 AM, Patrick Ruddy >> wrote: >> > Some userspace applications need to know about IGMP joins from the kernel >> > for 2 reasons >> > 1. To allow the programming of multicast MAC filters in hardware >> > 2. To form a multicast FORUS list for non link-local multicast >> >groups to be sent to the kernel and from there to the interested >> >party. >> > (1) can be fulfilled but simply sending the hardware multicast MAC >> > address to be programmed but (2) requires the L3 address to be sent >> > since this cannot be constructed from the MAC address whereas the >> > reverse translation is a standard library function. >> > >> > This commit provides addition and deletion of multicast addresses >> > using the RTM_NEWADDR and RTM_DELADDR messages. It also provides >> > the RTM_GETADDR extension to allow multicast join state to be read >> > from the kernel. >> > >> > Signed-off-by: Patrick Ruddy >> > --- >> > v2: fix kbuild warnings. >> >> I am still going through the series, but AFAICT, user-space caches listening >> to >> RTNLGRP_IPV4_IFADDR will now also get multicast addresses by default ? >> > > Yes that's the crux of this change. It's unfortunate that I could not > use IFA_MULTICAST to distinguish the SAFI. I suppose the other option > would be to create a set of new NEW/DEL/GETMULTICAST messages but the > partial code for RTM_GETMULTICAST in ipv6/mcast.c complicates that > slightly. Happy to look at it if you think that would be be better. > yeah, true. Thinking about this some more, you are adding an interface for multicast entries learnt via igmp. There is already a netlink channel for layer2 mc addresses via igmp. I can't see why that cannot be used. It is RTM_*MDB msgs. It is currently only available for the bridge. But, I have a requirement for it to be available via a vxlan dev...so, I am looking at making it available on other devices. Can you check if RTM_*MDB msgs can be made to work for your case ?. The reason I think it should be possible is because this is similar to bridge fdb entries. The bridge fdb api (RTM_NEWNEIGH with AF_BRIDGE) is overloaded to notify and dump netdev unicast addresses. similarly I think the mdb api can be overloaded to notify and dump netdev multicast addresses (statically added or learnt via igmp)
Re: [PATCH net-next v2 1/2] netlink: ipv4 igmp join notifications
Hi Roopa inline thx -pr On Fri, 2018-08-31 at 09:29 -0700, Roopa Prabhu wrote: > On Fri, Aug 31, 2018 at 4:20 AM, Patrick Ruddy > wrote: > > Some userspace applications need to know about IGMP joins from the kernel > > for 2 reasons > > 1. To allow the programming of multicast MAC filters in hardware > > 2. To form a multicast FORUS list for non link-local multicast > >groups to be sent to the kernel and from there to the interested > >party. > > (1) can be fulfilled but simply sending the hardware multicast MAC > > address to be programmed but (2) requires the L3 address to be sent > > since this cannot be constructed from the MAC address whereas the > > reverse translation is a standard library function. > > > > This commit provides addition and deletion of multicast addresses > > using the RTM_NEWADDR and RTM_DELADDR messages. It also provides > > the RTM_GETADDR extension to allow multicast join state to be read > > from the kernel. > > > > Signed-off-by: Patrick Ruddy > > --- > > v2: fix kbuild warnings. > > I am still going through the series, but AFAICT, user-space caches listening > to > RTNLGRP_IPV4_IFADDR will now also get multicast addresses by default ? > Yes that's the crux of this change. It's unfortunate that I could not use IFA_MULTICAST to distinguish the SAFI. I suppose the other option would be to create a set of new NEW/DEL/GETMULTICAST messages but the partial code for RTM_GETMULTICAST in ipv6/mcast.c complicates that slightly. Happy to look at it if you think that would be be better. > > > > > include/linux/igmp.h | 4 ++ > > net/ipv4/devinet.c | 39 +-- > > net/ipv4/igmp.c | 90 > > 3 files changed, 122 insertions(+), 11 deletions(-) > > > > diff --git a/include/linux/igmp.h b/include/linux/igmp.h > > index 119f53941c12..644a548024ed 100644 > > --- a/include/linux/igmp.h > > +++ b/include/linux/igmp.h > > @@ -19,6 +19,8 @@ > > #include > > #include > > #include > > +#include > > +#include > > #include > > > > static inline struct igmphdr *igmp_hdr(const struct sk_buff *skb) > > @@ -130,6 +132,8 @@ extern void ip_mc_unmap(struct in_device *); > > extern void ip_mc_remap(struct in_device *); > > extern void ip_mc_dec_group(struct in_device *in_dev, __be32 addr); > > extern void ip_mc_inc_group(struct in_device *in_dev, __be32 addr); > > +extern int ip_mc_dump_ifaddr(struct sk_buff *skb, struct netlink_callback > > *cb, > > +struct net_device *dev); > > int ip_mc_check_igmp(struct sk_buff *skb, struct sk_buff **skb_trimmed); > > > > #endif > > diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c > > index ea4bd8a52422..42f7dcc4fb5e 100644 > > --- a/net/ipv4/devinet.c > > +++ b/net/ipv4/devinet.c > > @@ -57,6 +57,7 @@ > > #endif > > #include > > #include > > +#include > > > > #include > > #include > > @@ -1651,6 +1652,7 @@ static int inet_dump_ifaddr(struct sk_buff *skb, > > struct netlink_callback *cb) > > int h, s_h; > > int idx, s_idx; > > int ip_idx, s_ip_idx; > > + int multicast, mcast_idx; > > struct net_device *dev; > > struct in_device *in_dev; > > struct in_ifaddr *ifa; > > @@ -1659,6 +1661,8 @@ static int inet_dump_ifaddr(struct sk_buff *skb, > > struct netlink_callback *cb) > > s_h = cb->args[0]; > > s_idx = idx = cb->args[1]; > > s_ip_idx = ip_idx = cb->args[2]; > > + multicast = cb->args[3]; > > + mcast_idx = cb->args[4]; > > > > for (h = s_h; h < NETDEV_HASHENTRIES; h++, s_idx = 0) { > > idx = 0; > > @@ -1675,18 +1679,29 @@ static int inet_dump_ifaddr(struct sk_buff *skb, > > struct netlink_callback *cb) > > if (!in_dev) > > goto cont; > > > > - for (ifa = in_dev->ifa_list, ip_idx = 0; ifa; > > -ifa = ifa->ifa_next, ip_idx++) { > > - if (ip_idx < s_ip_idx) > > - continue; > > - if (inet_fill_ifaddr(skb, ifa, > > -NETLINK_CB(cb->skb).portid, > > -cb->nlh->nlmsg_seq, > > -RTM_NEWADDR, NLM_F_MULTI) < 0) > > { > > - rcu_read_unlock(); > > - goto done; > > + if (!multicast) { > > + for (ifa = in_dev->ifa_list, ip_idx = 0; > > ifa; > > +ifa = ifa->ifa_next, ip_idx++) { > > + if (ip_idx < s_ip_idx) > > + continue; > > + if (inet_fill_ifaddr(skb, ifa, > > +
Re: [PATCH net-next v2 1/2] netlink: ipv4 igmp join notifications
On Fri, Aug 31, 2018 at 4:20 AM, Patrick Ruddy wrote: > Some userspace applications need to know about IGMP joins from the kernel > for 2 reasons > 1. To allow the programming of multicast MAC filters in hardware > 2. To form a multicast FORUS list for non link-local multicast >groups to be sent to the kernel and from there to the interested >party. > (1) can be fulfilled but simply sending the hardware multicast MAC > address to be programmed but (2) requires the L3 address to be sent > since this cannot be constructed from the MAC address whereas the > reverse translation is a standard library function. > > This commit provides addition and deletion of multicast addresses > using the RTM_NEWADDR and RTM_DELADDR messages. It also provides > the RTM_GETADDR extension to allow multicast join state to be read > from the kernel. > > Signed-off-by: Patrick Ruddy > --- > v2: fix kbuild warnings. I am still going through the series, but AFAICT, user-space caches listening to RTNLGRP_IPV4_IFADDR will now also get multicast addresses by default ? > > include/linux/igmp.h | 4 ++ > net/ipv4/devinet.c | 39 +-- > net/ipv4/igmp.c | 90 > 3 files changed, 122 insertions(+), 11 deletions(-) > > diff --git a/include/linux/igmp.h b/include/linux/igmp.h > index 119f53941c12..644a548024ed 100644 > --- a/include/linux/igmp.h > +++ b/include/linux/igmp.h > @@ -19,6 +19,8 @@ > #include > #include > #include > +#include > +#include > #include > > static inline struct igmphdr *igmp_hdr(const struct sk_buff *skb) > @@ -130,6 +132,8 @@ extern void ip_mc_unmap(struct in_device *); > extern void ip_mc_remap(struct in_device *); > extern void ip_mc_dec_group(struct in_device *in_dev, __be32 addr); > extern void ip_mc_inc_group(struct in_device *in_dev, __be32 addr); > +extern int ip_mc_dump_ifaddr(struct sk_buff *skb, struct netlink_callback > *cb, > +struct net_device *dev); > int ip_mc_check_igmp(struct sk_buff *skb, struct sk_buff **skb_trimmed); > > #endif > diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c > index ea4bd8a52422..42f7dcc4fb5e 100644 > --- a/net/ipv4/devinet.c > +++ b/net/ipv4/devinet.c > @@ -57,6 +57,7 @@ > #endif > #include > #include > +#include > > #include > #include > @@ -1651,6 +1652,7 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct > netlink_callback *cb) > int h, s_h; > int idx, s_idx; > int ip_idx, s_ip_idx; > + int multicast, mcast_idx; > struct net_device *dev; > struct in_device *in_dev; > struct in_ifaddr *ifa; > @@ -1659,6 +1661,8 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct > netlink_callback *cb) > s_h = cb->args[0]; > s_idx = idx = cb->args[1]; > s_ip_idx = ip_idx = cb->args[2]; > + multicast = cb->args[3]; > + mcast_idx = cb->args[4]; > > for (h = s_h; h < NETDEV_HASHENTRIES; h++, s_idx = 0) { > idx = 0; > @@ -1675,18 +1679,29 @@ static int inet_dump_ifaddr(struct sk_buff *skb, > struct netlink_callback *cb) > if (!in_dev) > goto cont; > > - for (ifa = in_dev->ifa_list, ip_idx = 0; ifa; > -ifa = ifa->ifa_next, ip_idx++) { > - if (ip_idx < s_ip_idx) > - continue; > - if (inet_fill_ifaddr(skb, ifa, > -NETLINK_CB(cb->skb).portid, > -cb->nlh->nlmsg_seq, > -RTM_NEWADDR, NLM_F_MULTI) < 0) { > - rcu_read_unlock(); > - goto done; > + if (!multicast) { > + for (ifa = in_dev->ifa_list, ip_idx = 0; ifa; > +ifa = ifa->ifa_next, ip_idx++) { > + if (ip_idx < s_ip_idx) > + continue; > + if (inet_fill_ifaddr(skb, ifa, > + > NETLINK_CB(cb->skb).portid, > + > cb->nlh->nlmsg_seq, > +RTM_NEWADDR, > +NLM_F_MULTI) < > 0) { > + rcu_read_unlock(); > + goto done; > + } > + nl_dump_check_consistent(cb, > + > nlmsg_hdr(skb)); > } > -
[PATCH net-next v2 1/2] netlink: ipv4 igmp join notifications
Some userspace applications need to know about IGMP joins from the kernel for 2 reasons 1. To allow the programming of multicast MAC filters in hardware 2. To form a multicast FORUS list for non link-local multicast groups to be sent to the kernel and from there to the interested party. (1) can be fulfilled but simply sending the hardware multicast MAC address to be programmed but (2) requires the L3 address to be sent since this cannot be constructed from the MAC address whereas the reverse translation is a standard library function. This commit provides addition and deletion of multicast addresses using the RTM_NEWADDR and RTM_DELADDR messages. It also provides the RTM_GETADDR extension to allow multicast join state to be read from the kernel. Signed-off-by: Patrick Ruddy --- v2: fix kbuild warnings. include/linux/igmp.h | 4 ++ net/ipv4/devinet.c | 39 +-- net/ipv4/igmp.c | 90 3 files changed, 122 insertions(+), 11 deletions(-) diff --git a/include/linux/igmp.h b/include/linux/igmp.h index 119f53941c12..644a548024ed 100644 --- a/include/linux/igmp.h +++ b/include/linux/igmp.h @@ -19,6 +19,8 @@ #include #include #include +#include +#include #include static inline struct igmphdr *igmp_hdr(const struct sk_buff *skb) @@ -130,6 +132,8 @@ extern void ip_mc_unmap(struct in_device *); extern void ip_mc_remap(struct in_device *); extern void ip_mc_dec_group(struct in_device *in_dev, __be32 addr); extern void ip_mc_inc_group(struct in_device *in_dev, __be32 addr); +extern int ip_mc_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb, +struct net_device *dev); int ip_mc_check_igmp(struct sk_buff *skb, struct sk_buff **skb_trimmed); #endif diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c index ea4bd8a52422..42f7dcc4fb5e 100644 --- a/net/ipv4/devinet.c +++ b/net/ipv4/devinet.c @@ -57,6 +57,7 @@ #endif #include #include +#include #include #include @@ -1651,6 +1652,7 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb) int h, s_h; int idx, s_idx; int ip_idx, s_ip_idx; + int multicast, mcast_idx; struct net_device *dev; struct in_device *in_dev; struct in_ifaddr *ifa; @@ -1659,6 +1661,8 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb) s_h = cb->args[0]; s_idx = idx = cb->args[1]; s_ip_idx = ip_idx = cb->args[2]; + multicast = cb->args[3]; + mcast_idx = cb->args[4]; for (h = s_h; h < NETDEV_HASHENTRIES; h++, s_idx = 0) { idx = 0; @@ -1675,18 +1679,29 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb) if (!in_dev) goto cont; - for (ifa = in_dev->ifa_list, ip_idx = 0; ifa; -ifa = ifa->ifa_next, ip_idx++) { - if (ip_idx < s_ip_idx) - continue; - if (inet_fill_ifaddr(skb, ifa, -NETLINK_CB(cb->skb).portid, -cb->nlh->nlmsg_seq, -RTM_NEWADDR, NLM_F_MULTI) < 0) { - rcu_read_unlock(); - goto done; + if (!multicast) { + for (ifa = in_dev->ifa_list, ip_idx = 0; ifa; +ifa = ifa->ifa_next, ip_idx++) { + if (ip_idx < s_ip_idx) + continue; + if (inet_fill_ifaddr(skb, ifa, + NETLINK_CB(cb->skb).portid, +cb->nlh->nlmsg_seq, +RTM_NEWADDR, +NLM_F_MULTI) < 0) { + rcu_read_unlock(); + goto done; + } + nl_dump_check_consistent(cb, + nlmsg_hdr(skb)); } - nl_dump_check_consistent(cb, nlmsg_hdr(skb)); + /* set for multicast loop */ + multicast++; + } + /* loop over multicast addresses */ + if (ip_mc_dump_ifaddr(skb, cb, dev) < 0) { + rcu_read_unlock(); + goto done;