Re: [PATCH net-next v2 1/2] netlink: ipv4 igmp join notifications

2018-09-04 Thread Patrick Ruddy
On Mon, 2018-09-03 at 16:12 -0700, Roopa Prabhu wrote:
> On Sun, Sep 2, 2018 at 4:18 AM, Patrick Ruddy
>  wrote:
> > Hi Roopa
> > 
> > inline
> > 
> > thx
> > 
> > -pr
> > 
> > On Fri, 2018-08-31 at 09:29 -0700, Roopa Prabhu wrote:
> > > On Fri, Aug 31, 2018 at 4:20 AM, Patrick Ruddy
> > >  wrote:
> > > > Some userspace applications need to know about IGMP joins from the 
> > > > kernel
> > > > for 2 reasons
> > > > 1. To allow the programming of multicast MAC filters in hardware
> > > > 2. To form a multicast FORUS list for non link-local multicast
> > > >groups to be sent to the kernel and from there to the interested
> > > >party.
> > > > (1) can be fulfilled but simply sending the hardware multicast MAC
> > > > address to be programmed but (2) requires the L3 address to be sent
> > > > since this cannot be constructed from the MAC address whereas the
> > > > reverse translation is a standard library function.
> > > > 
> > > > This commit provides addition and deletion of multicast addresses
> > > > using the RTM_NEWADDR and RTM_DELADDR messages. It also provides
> > > > the RTM_GETADDR extension to allow multicast join state to be read
> > > > from the kernel.
> > > > 
> > > > Signed-off-by: Patrick Ruddy 
> > > > ---
> > > > v2: fix kbuild warnings.
> > > 
> > > I am still going through the series, but AFAICT, user-space caches 
> > > listening to
> > > RTNLGRP_IPV4_IFADDR will now also get multicast addresses by default ?
> > > 
> > 
> > Yes that's the crux of this change. It's unfortunate that I could not
> > use IFA_MULTICAST to distinguish the SAFI. I suppose the other option
> > would be to create a set of new NEW/DEL/GETMULTICAST messages but the
> > partial code for RTM_GETMULTICAST in ipv6/mcast.c complicates that
> > slightly. Happy to look at it if you think that would be be better.
> > 
> 
> yeah, true. Thinking about this some more, you are adding an interface
> for multicast entries learnt via igmp.
> There is already a netlink channel for layer2 mc addresses via igmp. I
> can't see why that cannot be used.
> It is RTM_*MDB msgs. It is currently only available for the bridge.
> But, I have a requirement for it to be
> available via a vxlan dev...so, I am looking at making it available on
> other devices.
> 
> Can you check if RTM_*MDB msgs can be made to work for your case ?.
> 
> The reason I think it should be possible is because this is similar to
> bridge fdb entries.
> The bridge fdb api  (RTM_NEWNEIGH with AF_BRIDGE) is overloaded to
> notify and dump netdev unicast addresses.
> similarly I think the mdb api can be overloaded to notify and dump
> netdev multicast addresses (statically added or learnt via igmp)

If I'm reading this correctly I think overloading this channel is
possible. 

What you're suggesting is overloading the RTM_***MDB messages with
AF_INET and AF_INET6 to carry the per-interfaces joined l3 multicast
addresses.

I've thrown together a quick test of this and it looks good. I can
polish this up and resubmit if you're happy with the approach. FWIW
isolating the multicast addresses this was seems safer and it's a
smaller patchset.

thx

-pr


Re: [PATCH net-next v2 1/2] netlink: ipv4 igmp join notifications

2018-09-04 Thread Patrick Ruddy
On Mon, 2018-09-03 at 16:12 -0700, Roopa Prabhu wrote:
> On Sun, Sep 2, 2018 at 4:18 AM, Patrick Ruddy
>  wrote:
> > Hi Roopa
> > 
> > inline
> > 
> > thx
> > 
> > -pr
> > 
> > On Fri, 2018-08-31 at 09:29 -0700, Roopa Prabhu wrote:
> > > On Fri, Aug 31, 2018 at 4:20 AM, Patrick Ruddy
> > >  wrote:
> > > > Some userspace applications need to know about IGMP joins from the 
> > > > kernel
> > > > for 2 reasons
> > > > 1. To allow the programming of multicast MAC filters in hardware
> > > > 2. To form a multicast FORUS list for non link-local multicast
> > > >groups to be sent to the kernel and from there to the interested
> > > >party.
> > > > (1) can be fulfilled but simply sending the hardware multicast MAC
> > > > address to be programmed but (2) requires the L3 address to be sent
> > > > since this cannot be constructed from the MAC address whereas the
> > > > reverse translation is a standard library function.
> > > > 
> > > > This commit provides addition and deletion of multicast addresses
> > > > using the RTM_NEWADDR and RTM_DELADDR messages. It also provides
> > > > the RTM_GETADDR extension to allow multicast join state to be read
> > > > from the kernel.
> > > > 
> > > > Signed-off-by: Patrick Ruddy 
> > > > ---
> > > > v2: fix kbuild warnings.
> > > 
> > > I am still going through the series, but AFAICT, user-space caches 
> > > listening to
> > > RTNLGRP_IPV4_IFADDR will now also get multicast addresses by default ?
> > > 
> > 
> > Yes that's the crux of this change. It's unfortunate that I could not
> > use IFA_MULTICAST to distinguish the SAFI. I suppose the other option
> > would be to create a set of new NEW/DEL/GETMULTICAST messages but the
> > partial code for RTM_GETMULTICAST in ipv6/mcast.c complicates that
> > slightly. Happy to look at it if you think that would be be better.
> > 
> 
> yeah, true. Thinking about this some more, you are adding an interface
> for multicast entries learnt via igmp.
> There is already a netlink channel for layer2 mc addresses via igmp. I
> can't see why that cannot be used.
> It is RTM_*MDB msgs. It is currently only available for the bridge.
> But, I have a requirement for it to be
> available via a vxlan dev...so, I am looking at making it available on
> other devices.
>  
> The reason I think it should be possible is because this is similar to
> bridge fdb entries.
> The bridge fdb api  (RTM_NEWNEIGH with AF_BRIDGE) is overloaded to
> notify and dump netdev unicast addresses.
> similarly I think the mdb api can be overloaded to notify and dump
> netdev multicast addresses (statically added or learnt via igmp)
OK I'll take a look at this.


Re: [PATCH net-next v2 1/2] netlink: ipv4 igmp join notifications

2018-09-03 Thread Roopa Prabhu
On Sun, Sep 2, 2018 at 4:18 AM, Patrick Ruddy
 wrote:
> Hi Roopa
>
> inline
>
> thx
>
> -pr
>
> On Fri, 2018-08-31 at 09:29 -0700, Roopa Prabhu wrote:
>> On Fri, Aug 31, 2018 at 4:20 AM, Patrick Ruddy
>>  wrote:
>> > Some userspace applications need to know about IGMP joins from the kernel
>> > for 2 reasons
>> > 1. To allow the programming of multicast MAC filters in hardware
>> > 2. To form a multicast FORUS list for non link-local multicast
>> >groups to be sent to the kernel and from there to the interested
>> >party.
>> > (1) can be fulfilled but simply sending the hardware multicast MAC
>> > address to be programmed but (2) requires the L3 address to be sent
>> > since this cannot be constructed from the MAC address whereas the
>> > reverse translation is a standard library function.
>> >
>> > This commit provides addition and deletion of multicast addresses
>> > using the RTM_NEWADDR and RTM_DELADDR messages. It also provides
>> > the RTM_GETADDR extension to allow multicast join state to be read
>> > from the kernel.
>> >
>> > Signed-off-by: Patrick Ruddy 
>> > ---
>> > v2: fix kbuild warnings.
>>
>> I am still going through the series, but AFAICT, user-space caches listening 
>> to
>> RTNLGRP_IPV4_IFADDR will now also get multicast addresses by default ?
>>
>
> Yes that's the crux of this change. It's unfortunate that I could not
> use IFA_MULTICAST to distinguish the SAFI. I suppose the other option
> would be to create a set of new NEW/DEL/GETMULTICAST messages but the
> partial code for RTM_GETMULTICAST in ipv6/mcast.c complicates that
> slightly. Happy to look at it if you think that would be be better.
>

yeah, true. Thinking about this some more, you are adding an interface
for multicast entries learnt via igmp.
There is already a netlink channel for layer2 mc addresses via igmp. I
can't see why that cannot be used.
It is RTM_*MDB msgs. It is currently only available for the bridge.
But, I have a requirement for it to be
available via a vxlan dev...so, I am looking at making it available on
other devices.

Can you check if RTM_*MDB msgs can be made to work for your case ?.

The reason I think it should be possible is because this is similar to
bridge fdb entries.
The bridge fdb api  (RTM_NEWNEIGH with AF_BRIDGE) is overloaded to
notify and dump netdev unicast addresses.
similarly I think the mdb api can be overloaded to notify and dump
netdev multicast addresses (statically added or learnt via igmp)


Re: [PATCH net-next v2 1/2] netlink: ipv4 igmp join notifications

2018-09-02 Thread Patrick Ruddy
Hi Roopa

inline

thx

-pr

On Fri, 2018-08-31 at 09:29 -0700, Roopa Prabhu wrote:
> On Fri, Aug 31, 2018 at 4:20 AM, Patrick Ruddy
>  wrote:
> > Some userspace applications need to know about IGMP joins from the kernel
> > for 2 reasons
> > 1. To allow the programming of multicast MAC filters in hardware
> > 2. To form a multicast FORUS list for non link-local multicast
> >groups to be sent to the kernel and from there to the interested
> >party.
> > (1) can be fulfilled but simply sending the hardware multicast MAC
> > address to be programmed but (2) requires the L3 address to be sent
> > since this cannot be constructed from the MAC address whereas the
> > reverse translation is a standard library function.
> > 
> > This commit provides addition and deletion of multicast addresses
> > using the RTM_NEWADDR and RTM_DELADDR messages. It also provides
> > the RTM_GETADDR extension to allow multicast join state to be read
> > from the kernel.
> > 
> > Signed-off-by: Patrick Ruddy 
> > ---
> > v2: fix kbuild warnings.
> 
> I am still going through the series, but AFAICT, user-space caches listening 
> to
> RTNLGRP_IPV4_IFADDR will now also get multicast addresses by default ?
> 

Yes that's the crux of this change. It's unfortunate that I could not
use IFA_MULTICAST to distinguish the SAFI. I suppose the other option
would be to create a set of new NEW/DEL/GETMULTICAST messages but the
partial code for RTM_GETMULTICAST in ipv6/mcast.c complicates that
slightly. Happy to look at it if you think that would be be better.

> 
> > 
> >  include/linux/igmp.h |  4 ++
> >  net/ipv4/devinet.c   | 39 +--
> >  net/ipv4/igmp.c  | 90 
> >  3 files changed, 122 insertions(+), 11 deletions(-)
> > 
> > diff --git a/include/linux/igmp.h b/include/linux/igmp.h
> > index 119f53941c12..644a548024ed 100644
> > --- a/include/linux/igmp.h
> > +++ b/include/linux/igmp.h
> > @@ -19,6 +19,8 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> > +#include 
> >  #include 
> > 
> >  static inline struct igmphdr *igmp_hdr(const struct sk_buff *skb)
> > @@ -130,6 +132,8 @@ extern void ip_mc_unmap(struct in_device *);
> >  extern void ip_mc_remap(struct in_device *);
> >  extern void ip_mc_dec_group(struct in_device *in_dev, __be32 addr);
> >  extern void ip_mc_inc_group(struct in_device *in_dev, __be32 addr);
> > +extern int ip_mc_dump_ifaddr(struct sk_buff *skb, struct netlink_callback 
> > *cb,
> > +struct net_device *dev);
> >  int ip_mc_check_igmp(struct sk_buff *skb, struct sk_buff **skb_trimmed);
> > 
> >  #endif
> > diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
> > index ea4bd8a52422..42f7dcc4fb5e 100644
> > --- a/net/ipv4/devinet.c
> > +++ b/net/ipv4/devinet.c
> > @@ -57,6 +57,7 @@
> >  #endif
> >  #include 
> >  #include 
> > +#include 
> > 
> >  #include 
> >  #include 
> > @@ -1651,6 +1652,7 @@ static int inet_dump_ifaddr(struct sk_buff *skb, 
> > struct netlink_callback *cb)
> > int h, s_h;
> > int idx, s_idx;
> > int ip_idx, s_ip_idx;
> > +   int multicast, mcast_idx;
> > struct net_device *dev;
> > struct in_device *in_dev;
> > struct in_ifaddr *ifa;
> > @@ -1659,6 +1661,8 @@ static int inet_dump_ifaddr(struct sk_buff *skb, 
> > struct netlink_callback *cb)
> > s_h = cb->args[0];
> > s_idx = idx = cb->args[1];
> > s_ip_idx = ip_idx = cb->args[2];
> > +   multicast = cb->args[3];
> > +   mcast_idx = cb->args[4];
> > 
> > for (h = s_h; h < NETDEV_HASHENTRIES; h++, s_idx = 0) {
> > idx = 0;
> > @@ -1675,18 +1679,29 @@ static int inet_dump_ifaddr(struct sk_buff *skb, 
> > struct netlink_callback *cb)
> > if (!in_dev)
> > goto cont;
> > 
> > -   for (ifa = in_dev->ifa_list, ip_idx = 0; ifa;
> > -ifa = ifa->ifa_next, ip_idx++) {
> > -   if (ip_idx < s_ip_idx)
> > -   continue;
> > -   if (inet_fill_ifaddr(skb, ifa,
> > -NETLINK_CB(cb->skb).portid,
> > -cb->nlh->nlmsg_seq,
> > -RTM_NEWADDR, NLM_F_MULTI) < 0) 
> > {
> > -   rcu_read_unlock();
> > -   goto done;
> > +   if (!multicast) {
> > +   for (ifa = in_dev->ifa_list, ip_idx = 0; 
> > ifa;
> > +ifa = ifa->ifa_next, ip_idx++) {
> > +   if (ip_idx < s_ip_idx)
> > +   continue;
> > +   if (inet_fill_ifaddr(skb, ifa,
> > +

Re: [PATCH net-next v2 1/2] netlink: ipv4 igmp join notifications

2018-08-31 Thread Roopa Prabhu
On Fri, Aug 31, 2018 at 4:20 AM, Patrick Ruddy
 wrote:
> Some userspace applications need to know about IGMP joins from the kernel
> for 2 reasons
> 1. To allow the programming of multicast MAC filters in hardware
> 2. To form a multicast FORUS list for non link-local multicast
>groups to be sent to the kernel and from there to the interested
>party.
> (1) can be fulfilled but simply sending the hardware multicast MAC
> address to be programmed but (2) requires the L3 address to be sent
> since this cannot be constructed from the MAC address whereas the
> reverse translation is a standard library function.
>
> This commit provides addition and deletion of multicast addresses
> using the RTM_NEWADDR and RTM_DELADDR messages. It also provides
> the RTM_GETADDR extension to allow multicast join state to be read
> from the kernel.
>
> Signed-off-by: Patrick Ruddy 
> ---
> v2: fix kbuild warnings.

I am still going through the series, but AFAICT, user-space caches listening to
RTNLGRP_IPV4_IFADDR will now also get multicast addresses by default ?


>
>  include/linux/igmp.h |  4 ++
>  net/ipv4/devinet.c   | 39 +--
>  net/ipv4/igmp.c  | 90 
>  3 files changed, 122 insertions(+), 11 deletions(-)
>
> diff --git a/include/linux/igmp.h b/include/linux/igmp.h
> index 119f53941c12..644a548024ed 100644
> --- a/include/linux/igmp.h
> +++ b/include/linux/igmp.h
> @@ -19,6 +19,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  #include 
>
>  static inline struct igmphdr *igmp_hdr(const struct sk_buff *skb)
> @@ -130,6 +132,8 @@ extern void ip_mc_unmap(struct in_device *);
>  extern void ip_mc_remap(struct in_device *);
>  extern void ip_mc_dec_group(struct in_device *in_dev, __be32 addr);
>  extern void ip_mc_inc_group(struct in_device *in_dev, __be32 addr);
> +extern int ip_mc_dump_ifaddr(struct sk_buff *skb, struct netlink_callback 
> *cb,
> +struct net_device *dev);
>  int ip_mc_check_igmp(struct sk_buff *skb, struct sk_buff **skb_trimmed);
>
>  #endif
> diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
> index ea4bd8a52422..42f7dcc4fb5e 100644
> --- a/net/ipv4/devinet.c
> +++ b/net/ipv4/devinet.c
> @@ -57,6 +57,7 @@
>  #endif
>  #include 
>  #include 
> +#include 
>
>  #include 
>  #include 
> @@ -1651,6 +1652,7 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct 
> netlink_callback *cb)
> int h, s_h;
> int idx, s_idx;
> int ip_idx, s_ip_idx;
> +   int multicast, mcast_idx;
> struct net_device *dev;
> struct in_device *in_dev;
> struct in_ifaddr *ifa;
> @@ -1659,6 +1661,8 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct 
> netlink_callback *cb)
> s_h = cb->args[0];
> s_idx = idx = cb->args[1];
> s_ip_idx = ip_idx = cb->args[2];
> +   multicast = cb->args[3];
> +   mcast_idx = cb->args[4];
>
> for (h = s_h; h < NETDEV_HASHENTRIES; h++, s_idx = 0) {
> idx = 0;
> @@ -1675,18 +1679,29 @@ static int inet_dump_ifaddr(struct sk_buff *skb, 
> struct netlink_callback *cb)
> if (!in_dev)
> goto cont;
>
> -   for (ifa = in_dev->ifa_list, ip_idx = 0; ifa;
> -ifa = ifa->ifa_next, ip_idx++) {
> -   if (ip_idx < s_ip_idx)
> -   continue;
> -   if (inet_fill_ifaddr(skb, ifa,
> -NETLINK_CB(cb->skb).portid,
> -cb->nlh->nlmsg_seq,
> -RTM_NEWADDR, NLM_F_MULTI) < 0) {
> -   rcu_read_unlock();
> -   goto done;
> +   if (!multicast) {
> +   for (ifa = in_dev->ifa_list, ip_idx = 0; ifa;
> +ifa = ifa->ifa_next, ip_idx++) {
> +   if (ip_idx < s_ip_idx)
> +   continue;
> +   if (inet_fill_ifaddr(skb, ifa,
> +
> NETLINK_CB(cb->skb).portid,
> +
> cb->nlh->nlmsg_seq,
> +RTM_NEWADDR,
> +NLM_F_MULTI) < 
> 0) {
> +   rcu_read_unlock();
> +   goto done;
> +   }
> +   nl_dump_check_consistent(cb,
> +
> nlmsg_hdr(skb));
> }
> -  

[PATCH net-next v2 1/2] netlink: ipv4 igmp join notifications

2018-08-31 Thread Patrick Ruddy
Some userspace applications need to know about IGMP joins from the kernel
for 2 reasons
1. To allow the programming of multicast MAC filters in hardware
2. To form a multicast FORUS list for non link-local multicast
   groups to be sent to the kernel and from there to the interested
   party.
(1) can be fulfilled but simply sending the hardware multicast MAC
address to be programmed but (2) requires the L3 address to be sent
since this cannot be constructed from the MAC address whereas the
reverse translation is a standard library function.

This commit provides addition and deletion of multicast addresses
using the RTM_NEWADDR and RTM_DELADDR messages. It also provides
the RTM_GETADDR extension to allow multicast join state to be read
from the kernel.

Signed-off-by: Patrick Ruddy 
---
v2: fix kbuild warnings.

 include/linux/igmp.h |  4 ++
 net/ipv4/devinet.c   | 39 +--
 net/ipv4/igmp.c  | 90 
 3 files changed, 122 insertions(+), 11 deletions(-)

diff --git a/include/linux/igmp.h b/include/linux/igmp.h
index 119f53941c12..644a548024ed 100644
--- a/include/linux/igmp.h
+++ b/include/linux/igmp.h
@@ -19,6 +19,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 
 static inline struct igmphdr *igmp_hdr(const struct sk_buff *skb)
@@ -130,6 +132,8 @@ extern void ip_mc_unmap(struct in_device *);
 extern void ip_mc_remap(struct in_device *);
 extern void ip_mc_dec_group(struct in_device *in_dev, __be32 addr);
 extern void ip_mc_inc_group(struct in_device *in_dev, __be32 addr);
+extern int ip_mc_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb,
+struct net_device *dev);
 int ip_mc_check_igmp(struct sk_buff *skb, struct sk_buff **skb_trimmed);
 
 #endif
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index ea4bd8a52422..42f7dcc4fb5e 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -57,6 +57,7 @@
 #endif
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -1651,6 +1652,7 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct 
netlink_callback *cb)
int h, s_h;
int idx, s_idx;
int ip_idx, s_ip_idx;
+   int multicast, mcast_idx;
struct net_device *dev;
struct in_device *in_dev;
struct in_ifaddr *ifa;
@@ -1659,6 +1661,8 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct 
netlink_callback *cb)
s_h = cb->args[0];
s_idx = idx = cb->args[1];
s_ip_idx = ip_idx = cb->args[2];
+   multicast = cb->args[3];
+   mcast_idx = cb->args[4];
 
for (h = s_h; h < NETDEV_HASHENTRIES; h++, s_idx = 0) {
idx = 0;
@@ -1675,18 +1679,29 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct 
netlink_callback *cb)
if (!in_dev)
goto cont;
 
-   for (ifa = in_dev->ifa_list, ip_idx = 0; ifa;
-ifa = ifa->ifa_next, ip_idx++) {
-   if (ip_idx < s_ip_idx)
-   continue;
-   if (inet_fill_ifaddr(skb, ifa,
-NETLINK_CB(cb->skb).portid,
-cb->nlh->nlmsg_seq,
-RTM_NEWADDR, NLM_F_MULTI) < 0) {
-   rcu_read_unlock();
-   goto done;
+   if (!multicast) {
+   for (ifa = in_dev->ifa_list, ip_idx = 0; ifa;
+ifa = ifa->ifa_next, ip_idx++) {
+   if (ip_idx < s_ip_idx)
+   continue;
+   if (inet_fill_ifaddr(skb, ifa,
+
NETLINK_CB(cb->skb).portid,
+cb->nlh->nlmsg_seq,
+RTM_NEWADDR,
+NLM_F_MULTI) < 0) {
+   rcu_read_unlock();
+   goto done;
+   }
+   nl_dump_check_consistent(cb,
+
nlmsg_hdr(skb));
}
-   nl_dump_check_consistent(cb, nlmsg_hdr(skb));
+   /* set for multicast loop */
+   multicast++;
+   }
+   /* loop over multicast addresses */
+   if (ip_mc_dump_ifaddr(skb, cb, dev) < 0) {
+   rcu_read_unlock();
+   goto done;