[PATCH] ath9k: fix data bus crash when setting nf_override via debugfs

2021-02-09 Thread Linus Lüssing
From: Linus Lüssing 

When trying to set the noise floor via debugfs, a "data bus error"
crash like the following can happen:

[   88.433133] Data bus error, epc == 80221c28, ra == 83314e60
[   88.438895] Oops[#1]:
[   88.441246] CPU: 0 PID: 7263 Comm: sh Not tainted 4.14.195 #0
[   88.447174] task: 838a1c20 task.stack: 82d5e000
[   88.451847] $ 0   :  0030 deadc0de 83141de4
[   88.457248] $ 4   : b810a2c4 a2c4 83230fd4 
[   88.462652] $ 8   : 000a  0001 
[   88.468055] $12   : 7f8ef318   77f802a0
[   88.473457] $16   : 83230080 0002 001b 83230080
[   88.478861] $20   : 83a1c3f8 00841000 77f7adb0 ff92
[   88.484263] $24   : 0fa4 77edd860
[   88.489665] $28   : 82d5e000 82d5fda8  83314e60
[   88.495070] Hi: 
[   88.498044] Lo: 
[   88.501040] epc   : 80221c28 ioread32+0x8/0x10
[   88.505671] ra: 83314e60 ath9k_hw_loadnf+0x88/0x520 [ath9k_hw]
[   88.512049] Status: 1000fc03 KERNEL EXL IE
[   88.516369] Cause : 5080801c (ExcCode 07)
[   88.520508] PrId  : 00019374 (MIPS 24Kc)
[   88.524556] Modules linked in: ath9k ath9k_common pppoe ppp_async l2tp_ppp 
cdc_mbim batman_adv ath9k_hw ath sr9700 smsc95xx sierra_net rndis_host qmi_wwan 
pppox ppp_generic pl2303 nf_conntrack_ipv6 mcs7830 mac80211 kalmia iptable_nat 
ipt_REJECT ipt_MASQUERADE huawei_cdc_ncm ftdi_sio dm9601 cfg80211 cdc_subset 
cdc_ncm cdc_ether cdc_eem ax88179_178a asix xt_time xt_tcpudp xt_tcpmss 
xt_statistic xt_state xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length 
xt_hl xt_ecn xt_dscp xt_conntrack xt_comment xt_TCPMSS xt_REDIRECT xt_NETMAP 
xt_LOG xt_HL xt_FLOWOFFLOAD xt_DSCP xt_CLASSIFY usbserial usbnet usbhid slhc 
rtl8150 r8152 pegasus nf_reject_ipv4 nf_nat_redirect nf_nat_masquerade_ipv4 
nf_conntrack_ipv4 nf_nat_ipv4 nf_nat nf_log_ipv4 nf_flow_table_hw nf_flow_table 
nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack
[   88.597894]  libcrc32c kaweth iptable_mangle iptable_filter ipt_ECN ipheth 
ip_tables hso hid_generic crc_ccitt compat cdc_wdm cdc_acm br_netfilter hid 
evdev input_core nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter 
ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 l2tp_netlink l2tp_core 
udp_tunnel ip6_udp_tunnel xfrm6_mode_tunnel xfrm6_mode_transport 
xfrm6_mode_beet ipcomp6 xfrm6_tunnel esp6 ah6 xfrm4_tunnel xfrm4_mode_tunnel 
xfrm4_mode_transport xfrm4_mode_beet ipcomp esp4 ah4 tunnel6 tunnel4 tun 
xfrm_user xfrm_ipcomp af_key xfrm_algo sha256_generic sha1_generic 
jitterentropy_rng drbg md5 hmac echainiv des_generic deflate zlib_inflate 
zlib_deflate cbc authenc crypto_acompress ehci_platform ehci_hcd 
gpio_button_hotplug usbcore nls_base usb_common crc16 mii aead crypto_null 
cryptomgr crc32c_generic
[   88.671671]  crypto_hash
[   88.674292] Process sh (pid: 7263, threadinfo=82d5e000, task=838a1c20, 
tls=77f81efc)
[   88.682279] Stack : 8060 0008 0200    
 0002
[   88.690916] 8050 83230080 82d5fe22 00841000 77f7adb0  
 83156858
[   88.699553]  8352fa00 83ad62b0 835302a8  300a00f8 
0003 82d5fe38
[   88.708190] 82d5fef4 0001 77f54dc4 77f8 77f7adb0 c79fe901 
 
[   88.716828] 8051 0002 00841000 77f54dc4 77f8 801ce4cc 
000b 41824292
[   88.725465] ...
[   88.727994] Call Trace:
[   88.730532] [<80221c28>] ioread32+0x8/0x10
[   88.734765] Code:   8c82  000f <03e8>   08088708 
   aca4  03e8
[   88.744846]
[   88.746464] ---[ end trace db226b2de1b69b9e ]---
[   88.753477] Kernel panic - not syncing: Fatal exception
[   88.759981] Rebooting in 3 seconds..

The "REG_READ(ah, AR_PHY_AGC_CONTROL)" in ath9k_hw_loadnf() does not
like being called when the hardware is asleep, leading to this crash.

The easiest way to reproduce this is trying to set nf_override while
the hardware is down:

  $ ip link set down dev wlan0
  $ echo "-85" > /sys/kernel/debug/ieee80211/phy0/ath9k/nf_override

Fixing this crash by waking the hardware up before trying to set the
noise floor. Similar to what other ath9k debugfs files do.

Tested on a Lima board from 8devices, which has a QCA 4531 chipset.

Fixes: b90189759a7f ("ath9k: add noise floor override option")
Cc: Simon Wunderlich 
Signed-off-by: Linus Lüssing 
---
 drivers/net/wireless/ath/ath9k/debug.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath9k/debug.c 
b/drivers/net/wireless/ath/ath9k/debug.c
index 017a43bc400c..4c81b1d7f417 100644
--- a/drivers/net/wireless/ath/ath9k/debug.c
+++ b/drivers/net/wireless/ath/ath9k/debug.c
@@ -1223,8 +1223,11 @@ static ssize_t write_file_nf_override(struct file *file,
 
ah->nf_override = val;
 
-   if (ah->curchan)
+   if (ah->curchan) {
+   ath9k_ps_wakeup(sc);
ath9k

Re: [PATCH net] bridge: mcast: Fix MLD2 Report IPv6 payload length check

2020-07-06 Thread Linus Lüssing
On Sun, Jul 05, 2020 at 11:18:36PM +0300, Nikolay Aleksandrov wrote:
> > > By the way, I can't verify at the moment, but I think we can drop that 
> > > whole
> > > hunk altogether since skb_header_pointer() is used and it will simply 
> > > return
> > > an error if there isn't enough data for nsrcs.
> > > 
> > 
> > Hm, while unlikely, the IPv6 packet / header payload length might be
> > shorter than the frame / skb size.
> > 
> > And even though it wouldn't crash reading over the IPv6 packet
> > length, especially as skb_header_pointer() is used, I think we should
> > still avoid reading over the size indicated by the IPv6 header payload
> > length field, to stay protocol compliant.
> > 
> > Does that make sense?
> > 
> 
> Sure, I just thought the ipv6_mc_may_pull() call after that includes those 2 
> bytes as well, so
> we're covered. That is, it seems to be doing the same check with the full 
> grec size included.
> 

Ah, okay, that's what you mean! You're right, technically the
ipv6_mc_may_pull() later would cover it, too. And it should work,
even if nsrcs were outside of the IPv6 packet and had a bogus
value. (I think.)

My brain linearly parsing the parser code would probably get a bit
confused initially, as it'd look like a bit like a bug. But the
current check for nsrcs might look confusing, too (q.e.d.).

So as you prefer, I'd be okay with both leaving that check for
consistency or removing it for brevity.


Re: [PATCH net] bridge: mcast: Fix MLD2 Report IPv6 payload length check

2020-07-05 Thread Linus Lüssing
On Sun, Jul 05, 2020 at 10:11:39PM +0300, Nikolay Aleksandrov wrote:
> On 7/5/20 10:08 PM, Linus Lüssing wrote:
> > On Sun, Jul 05, 2020 at 09:33:13PM +0300, Nikolay Aleksandrov wrote:
> > > On 05/07/2020 21:22, Linus Lüssing wrote:
> > > > Commit e57f61858b7c ("net: bridge: mcast: fix stale nsrcs pointer in
> > > > igmp3/mld2 report handling") introduced a small bug which would 
> > > > potentially
> > > > lead to accepting an MLD2 Report with a broken IPv6 header payload 
> > > > length
> > > > field.
> > > > 
> > > > The check needs to take into account the 2 bytes for the "Number of
> > > > Sources" field in the "Multicast Address Record" before reading it.
> > > > And not the size of a pointer to this field.
> > > > 
> > > > Fixes: e57f61858b7c ("net: bridge: mcast: fix stale nsrcs pointer in 
> > > > igmp3/mld2 report handling")
> > > > Signed-off-by: Linus Lüssing 
> > > > ---
> > > >   net/bridge/br_multicast.c | 2 +-
> > > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > 
> > > I'd rather be more concerned with it rejecting a valid report due to 
> > > wrong size. The ptr
> > > size would always be bigger. :)
> > > 
> > > Thanks!
> > > Acked-by: Nikolay Aleksandrov 
> > 
> > Aiy, you're right, it's the other way round. I'll update the
> > commit message and send a v2 in a minute, with your Acked-by
> > included.
> > 
> 
> By the way, I can't verify at the moment, but I think we can drop that whole
> hunk altogether since skb_header_pointer() is used and it will simply return
> an error if there isn't enough data for nsrcs.
> 

Hm, while unlikely, the IPv6 packet / header payload length might be
shorter than the frame / skb size.

And even though it wouldn't crash reading over the IPv6 packet
length, especially as skb_header_pointer() is used, I think we should
still avoid reading over the size indicated by the IPv6 header payload
length field, to stay protocol compliant.

Does that make sense?


[PATCH net v2] bridge: mcast: Fix MLD2 Report IPv6 payload length check

2020-07-05 Thread Linus Lüssing
Commit e57f61858b7c ("net: bridge: mcast: fix stale nsrcs pointer in
igmp3/mld2 report handling") introduced a bug in the IPv6 header payload
length check which would potentially lead to rejecting a valid MLD2 Report:

The check needs to take into account the 2 bytes for the "Number of
Sources" field in the "Multicast Address Record" before reading it.
And not the size of a pointer to this field.

Fixes: e57f61858b7c ("net: bridge: mcast: fix stale nsrcs pointer in igmp3/mld2 
report handling")
Acked-by: Nikolay Aleksandrov 
Signed-off-by: Linus Lüssing 
---
Changelog v2:
* updated commit message, the issue is accidentally rejcting a valid and
  not accepting an invalid MLD2 Report

 net/bridge/br_multicast.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index 83490bf73a13..4c4a93abde68 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -1000,21 +1000,21 @@ static int br_ip6_multicast_mld2_report(struct 
net_bridge *br,
num = ntohs(icmp6h->icmp6_dataun.un_data16[1]);
len = skb_transport_offset(skb) + sizeof(*icmp6h);
 
for (i = 0; i < num; i++) {
__be16 *_nsrcs, __nsrcs;
u16 nsrcs;
 
nsrcs_offset = len + offsetof(struct mld2_grec, grec_nsrcs);
 
if (skb_transport_offset(skb) + ipv6_transport_len(skb) <
-   nsrcs_offset + sizeof(_nsrcs))
+   nsrcs_offset + sizeof(__nsrcs))
return -EINVAL;
 
_nsrcs = skb_header_pointer(skb, nsrcs_offset,
sizeof(__nsrcs), &__nsrcs);
if (!_nsrcs)
return -EINVAL;
 
nsrcs = ntohs(*_nsrcs);
grec_len = struct_size(grec, grec_src, nsrcs);
 
-- 
2.27.0



Re: [PATCH net] bridge: mcast: Fix MLD2 Report IPv6 payload length check

2020-07-05 Thread Linus Lüssing
On Sun, Jul 05, 2020 at 09:33:13PM +0300, Nikolay Aleksandrov wrote:
> On 05/07/2020 21:22, Linus Lüssing wrote:
> > Commit e57f61858b7c ("net: bridge: mcast: fix stale nsrcs pointer in
> > igmp3/mld2 report handling") introduced a small bug which would potentially
> > lead to accepting an MLD2 Report with a broken IPv6 header payload length
> > field.
> > 
> > The check needs to take into account the 2 bytes for the "Number of
> > Sources" field in the "Multicast Address Record" before reading it.
> > And not the size of a pointer to this field.
> > 
> > Fixes: e57f61858b7c ("net: bridge: mcast: fix stale nsrcs pointer in 
> > igmp3/mld2 report handling")
> > Signed-off-by: Linus Lüssing 
> > ---
> >  net/bridge/br_multicast.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> 
> I'd rather be more concerned with it rejecting a valid report due to wrong 
> size. The ptr
> size would always be bigger. :)
> 
> Thanks!
> Acked-by: Nikolay Aleksandrov 

Aiy, you're right, it's the other way round. I'll update the
commit message and send a v2 in a minute, with your Acked-by
included.


[PATCH net] bridge: mcast: Fix MLD2 Report IPv6 payload length check

2020-07-05 Thread Linus Lüssing
Commit e57f61858b7c ("net: bridge: mcast: fix stale nsrcs pointer in
igmp3/mld2 report handling") introduced a small bug which would potentially
lead to accepting an MLD2 Report with a broken IPv6 header payload length
field.

The check needs to take into account the 2 bytes for the "Number of
Sources" field in the "Multicast Address Record" before reading it.
And not the size of a pointer to this field.

Fixes: e57f61858b7c ("net: bridge: mcast: fix stale nsrcs pointer in igmp3/mld2 
report handling")
Signed-off-by: Linus Lüssing 
---
 net/bridge/br_multicast.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index 83490bf73a13..4c4a93abde68 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -1000,21 +1000,21 @@ static int br_ip6_multicast_mld2_report(struct 
net_bridge *br,
num = ntohs(icmp6h->icmp6_dataun.un_data16[1]);
len = skb_transport_offset(skb) + sizeof(*icmp6h);
 
for (i = 0; i < num; i++) {
__be16 *_nsrcs, __nsrcs;
u16 nsrcs;
 
nsrcs_offset = len + offsetof(struct mld2_grec, grec_nsrcs);
 
if (skb_transport_offset(skb) + ipv6_transport_len(skb) <
-   nsrcs_offset + sizeof(_nsrcs))
+   nsrcs_offset + sizeof(__nsrcs))
return -EINVAL;
 
_nsrcs = skb_header_pointer(skb, nsrcs_offset,
sizeof(__nsrcs), &__nsrcs);
if (!_nsrcs)
return -EINVAL;
 
nsrcs = ntohs(*_nsrcs);
grec_len = struct_size(grec, grec_src, nsrcs);
 
-- 
2.27.0



[PATCH net-next v2 3/4] bridge: join all-snoopers multicast address

2019-01-20 Thread Linus Lüssing
Next to snooping IGMP/MLD queries RFC4541, section 2.1.1.a) recommends
to snoop multicast router advertisements to detect multicast routers.

Multicast router advertisements are sent to an "all-snoopers"
multicast address. To be able to receive them reliably, we need to
join this group.

Otherwise other snooping switches might refrain from forwarding these
advertisements to us.

Signed-off-by: Linus Lüssing 
---
 include/uapi/linux/in.h   |  9 +++---
 net/bridge/br_multicast.c | 72 ++-
 net/ipv6/mcast.c  |  2 ++
 3 files changed, 78 insertions(+), 5 deletions(-)

diff --git a/include/uapi/linux/in.h b/include/uapi/linux/in.h
index f6052e70bf40..7ab685cacb8f 100644
--- a/include/uapi/linux/in.h
+++ b/include/uapi/linux/in.h
@@ -292,10 +292,11 @@ struct sockaddr_in {
 #defineIN_LOOPBACK(a)  long int) (a)) & 0xff00) == 
0x7f00)
 
 /* Defines for Multicast INADDR */
-#define INADDR_UNSPEC_GROUP0xe000U /* 224.0.0.0   */
-#define INADDR_ALLHOSTS_GROUP  0xe001U /* 224.0.0.1   */
-#define INADDR_ALLRTRS_GROUP0xe002U/* 224.0.0.2 */
-#define INADDR_MAX_LOCAL_GROUP  0xe0ffU/* 224.0.0.255 */
+#define INADDR_UNSPEC_GROUP0xe000U /* 224.0.0.0   */
+#define INADDR_ALLHOSTS_GROUP  0xe001U /* 224.0.0.1   */
+#define INADDR_ALLRTRS_GROUP   0xe002U /* 224.0.0.2 */
+#define INADDR_ALLSNOOPERS_GROUP   0xe06aU /* 224.0.0.106 */
+#define INADDR_MAX_LOCAL_GROUP 0xe0ffU /* 224.0.0.255 */
 #endif
 
 /*  contains the htonl type stuff.. */
diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index 156c4905639e..2366f4a2780e 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -1780,6 +1780,68 @@ void br_multicast_init(struct net_bridge *br)
INIT_HLIST_HEAD(>mdb_list);
 }
 
+static void br_ip4_multicast_join_snoopers(struct net_bridge *br)
+{
+   struct in_device *in_dev = in_dev_get(br->dev);
+
+   if (!in_dev)
+   return;
+
+   ip_mc_inc_group(in_dev, htonl(INADDR_ALLSNOOPERS_GROUP));
+   in_dev_put(in_dev);
+}
+
+#if IS_ENABLED(CONFIG_IPV6)
+static void br_ip6_multicast_join_snoopers(struct net_bridge *br)
+{
+   struct in6_addr addr;
+
+   ipv6_addr_set(, htonl(0xff02), 0, 0, htonl(0x6a));
+   ipv6_dev_mc_inc(br->dev, );
+}
+#else
+static inline void br_ip6_multicast_join_snoopers(struct net_bridge *br)
+{
+}
+#endif
+
+static void br_multicast_join_snoopers(struct net_bridge *br)
+{
+   br_ip4_multicast_join_snoopers(br);
+   br_ip6_multicast_join_snoopers(br);
+}
+
+static void br_ip4_multicast_leave_snoopers(struct net_bridge *br)
+{
+   struct in_device *in_dev = in_dev_get(br->dev);
+
+   if (WARN_ON(!in_dev))
+   return;
+
+   ip_mc_dec_group(in_dev, htonl(INADDR_ALLSNOOPERS_GROUP));
+   in_dev_put(in_dev);
+}
+
+#if IS_ENABLED(CONFIG_IPV6)
+static void br_ip6_multicast_leave_snoopers(struct net_bridge *br)
+{
+   struct in6_addr addr;
+
+   ipv6_addr_set(, htonl(0xff02), 0, 0, htonl(0x6a));
+   ipv6_dev_mc_dec(br->dev, );
+}
+#else
+static inline void br_ip6_multicast_leave_snoopers(struct net_bridge *br)
+{
+}
+#endif
+
+static void br_multicast_leave_snoopers(struct net_bridge *br)
+{
+   br_ip4_multicast_leave_snoopers(br);
+   br_ip6_multicast_leave_snoopers(br);
+}
+
 static void __br_multicast_open(struct net_bridge *br,
struct bridge_mcast_own_query *query)
 {
@@ -1793,6 +1855,9 @@ static void __br_multicast_open(struct net_bridge *br,
 
 void br_multicast_open(struct net_bridge *br)
 {
+   if (br_opt_get(br, BROPT_MULTICAST_ENABLED))
+   br_multicast_join_snoopers(br);
+
__br_multicast_open(br, >ip4_own_query);
 #if IS_ENABLED(CONFIG_IPV6)
__br_multicast_open(br, >ip6_own_query);
@@ -1808,6 +1873,9 @@ void br_multicast_stop(struct net_bridge *br)
del_timer_sync(>ip6_other_query.timer);
del_timer_sync(>ip6_own_query.timer);
 #endif
+
+   if (br_opt_get(br, BROPT_MULTICAST_ENABLED))
+   br_multicast_leave_snoopers(br);
 }
 
 void br_multicast_dev_del(struct net_bridge *br)
@@ -1943,8 +2011,10 @@ int br_multicast_toggle(struct net_bridge *br, unsigned 
long val)
 
br_mc_disabled_update(br->dev, val);
br_opt_toggle(br, BROPT_MULTICAST_ENABLED, !!val);
-   if (!br_opt_get(br, BROPT_MULTICAST_ENABLED))
+   if (!br_opt_get(br, BROPT_MULTICAST_ENABLED)) {
+   br_multicast_leave_snoopers(br);
goto unlock;
+   }
 
if (!netif_running(br->dev))
goto unlock;
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 21f6deb2aec9..42f3f5cd349f 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -940,6 +940,7 @@ int ipv6_dev_mc_inc(struct net_device *dev, 

[PATCH net-next v2 2/4] bridge: simplify ip_mc_check_igmp() and ipv6_mc_check_mld() internals

2019-01-20 Thread Linus Lüssing
With this patch the internal use of the skb_trimmed is reduced to
the ICMPv6/IGMP checksum verification. And for the length checks
the newly introduced helper functions are used instead of calculating
and checking with skb->len directly.

These changes should hopefully make it easier to verify that length
checks are performed properly.

Signed-off-by: Linus Lüssing 
---
 net/ipv4/igmp.c| 51 ++---
 net/ipv6/mcast_snoop.c | 62 --
 2 files changed, 52 insertions(+), 61 deletions(-)

diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index b1f6d93282d7..a40e48ded10d 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -1493,22 +1493,22 @@ static int ip_mc_check_igmp_reportv3(struct sk_buff 
*skb)
 
len += sizeof(struct igmpv3_report);
 
-   return pskb_may_pull(skb, len) ? 0 : -EINVAL;
+   return ip_mc_may_pull(skb, len) ? 0 : -EINVAL;
 }
 
 static int ip_mc_check_igmp_query(struct sk_buff *skb)
 {
-   unsigned int len = skb_transport_offset(skb);
-
-   len += sizeof(struct igmphdr);
-   if (skb->len < len)
-   return -EINVAL;
+   unsigned int transport_len = ip_transport_len(skb);
+   unsigned int len;
 
/* IGMPv{1,2}? */
-   if (skb->len != len) {
+   if (transport_len != sizeof(struct igmphdr)) {
/* or IGMPv3? */
-   len += sizeof(struct igmpv3_query) - sizeof(struct igmphdr);
-   if (skb->len < len || !pskb_may_pull(skb, len))
+   if (transport_len < sizeof(struct igmpv3_query))
+   return -EINVAL;
+
+   len = skb_transport_offset(skb) + sizeof(struct igmpv3_query);
+   if (!ip_mc_may_pull(skb, len))
return -EINVAL;
}
 
@@ -1544,35 +1544,24 @@ static inline __sum16 ip_mc_validate_checksum(struct 
sk_buff *skb)
return skb_checksum_simple_validate(skb);
 }
 
-static int __ip_mc_check_igmp(struct sk_buff *skb)
-
+static int ip_mc_check_igmp_csum(struct sk_buff *skb)
 {
-   struct sk_buff *skb_chk;
-   unsigned int transport_len;
unsigned int len = skb_transport_offset(skb) + sizeof(struct igmphdr);
-   int ret = -EINVAL;
+   unsigned int transport_len = ip_transport_len(skb);
+   struct sk_buff *skb_chk;
 
-   transport_len = ntohs(ip_hdr(skb)->tot_len) - ip_hdrlen(skb);
+   if (!ip_mc_may_pull(skb, len))
+   return -EINVAL;
 
skb_chk = skb_checksum_trimmed(skb, transport_len,
   ip_mc_validate_checksum);
if (!skb_chk)
-   goto err;
+   return -EINVAL;
 
-   if (!pskb_may_pull(skb_chk, len))
-   goto err;
-
-   ret = ip_mc_check_igmp_msg(skb_chk);
-   if (ret)
-   goto err;
-
-   ret = 0;
-
-err:
-   if (skb_chk && skb_chk != skb)
+   if (skb_chk != skb)
kfree_skb(skb_chk);
 
-   return ret;
+   return 0;
 }
 
 /**
@@ -1600,7 +1589,11 @@ int ip_mc_check_igmp(struct sk_buff *skb)
if (ip_hdr(skb)->protocol != IPPROTO_IGMP)
return -ENOMSG;
 
-   return __ip_mc_check_igmp(skb);
+   ret = ip_mc_check_igmp_csum(skb);
+   if (ret < 0)
+   return ret;
+
+   return ip_mc_check_igmp_msg(skb);
 }
 EXPORT_SYMBOL(ip_mc_check_igmp);
 
diff --git a/net/ipv6/mcast_snoop.c b/net/ipv6/mcast_snoop.c
index 1a917dc80d5e..a72ddfc40eb3 100644
--- a/net/ipv6/mcast_snoop.c
+++ b/net/ipv6/mcast_snoop.c
@@ -77,27 +77,27 @@ static int ipv6_mc_check_mld_reportv2(struct sk_buff *skb)
 
len += sizeof(struct mld2_report);
 
-   return pskb_may_pull(skb, len) ? 0 : -EINVAL;
+   return ipv6_mc_may_pull(skb, len) ? 0 : -EINVAL;
 }
 
 static int ipv6_mc_check_mld_query(struct sk_buff *skb)
 {
+   unsigned int transport_len = ipv6_transport_len(skb);
struct mld_msg *mld;
-   unsigned int len = skb_transport_offset(skb);
+   unsigned int len;
 
/* RFC2710+RFC3810 (MLDv1+MLDv2) require link-local source addresses */
if (!(ipv6_addr_type(_hdr(skb)->saddr) & IPV6_ADDR_LINKLOCAL))
return -EINVAL;
 
-   len += sizeof(struct mld_msg);
-   if (skb->len < len)
-   return -EINVAL;
-
/* MLDv1? */
-   if (skb->len != len) {
+   if (transport_len != sizeof(struct mld_msg)) {
/* or MLDv2? */
-   len += sizeof(struct mld2_query) - sizeof(struct mld_msg);
-   if (skb->len < len || !pskb_may_pull(skb, len))
+   if (transport_len < sizeof(struct mld2_query))
+   return -EINVAL;
+
+   len = skb_transport_offset(skb) + sizeof(struct mld2_query);
+   if (!ipv6_mc_may_pull(skb, len))
return -EINVAL;
}
 
@@ -115,7 +115,13 @@ static

[PATCH net-next v2 4/4] bridge: Snoop Multicast Router Advertisements

2019-01-20 Thread Linus Lüssing
When multiple multicast routers are present in a broadcast domain then
only one of them will be detectable via IGMP/MLD query snooping. The
multicast router with the lowest IP address will become the selected and
active querier while all other multicast routers will then refrain from
sending queries.

To detect such rather silent multicast routers, too, RFC4286
("Multicast Router Discovery") provides a standardized protocol to
detect multicast routers for multicast snooping switches.

This patch implements the necessary MRD Advertisement message parsing
and after successful processing adds such routers to the internal
multicast router list.

Signed-off-by: Linus Lüssing 
---
 include/linux/in.h  |  5 +
 include/net/addrconf.h  | 15 +
 include/uapi/linux/icmpv6.h |  2 ++
 include/uapi/linux/igmp.h   |  1 +
 net/bridge/br_multicast.c   | 55 +
 net/ipv6/mcast_snoop.c  |  5 -
 6 files changed, 82 insertions(+), 1 deletion(-)

diff --git a/include/linux/in.h b/include/linux/in.h
index 31b493734763..435e7f2a513a 100644
--- a/include/linux/in.h
+++ b/include/linux/in.h
@@ -60,6 +60,11 @@ static inline bool ipv4_is_lbcast(__be32 addr)
return addr == htonl(INADDR_BROADCAST);
 }
 
+static inline bool ipv4_is_all_snoopers(__be32 addr)
+{
+   return addr == htonl(INADDR_ALLSNOOPERS_GROUP);
+}
+
 static inline bool ipv4_is_zeronet(__be32 addr)
 {
return (addr & htonl(0xff00)) == htonl(0x);
diff --git a/include/net/addrconf.h b/include/net/addrconf.h
index daf11dcb0f70..20d523ee2fec 100644
--- a/include/net/addrconf.h
+++ b/include/net/addrconf.h
@@ -229,6 +229,7 @@ void ipv6_mc_unmap(struct inet6_dev *idev);
 void ipv6_mc_remap(struct inet6_dev *idev);
 void ipv6_mc_init_dev(struct inet6_dev *idev);
 void ipv6_mc_destroy_dev(struct inet6_dev *idev);
+int ipv6_mc_check_icmpv6(struct sk_buff *skb);
 int ipv6_mc_check_mld(struct sk_buff *skb);
 void addrconf_dad_failure(struct sk_buff *skb, struct inet6_ifaddr *ifp);
 
@@ -499,6 +500,20 @@ static inline bool ipv6_addr_is_solict_mult(const struct 
in6_addr *addr)
 #endif
 }
 
+static inline bool ipv6_addr_is_all_snoopers(const struct in6_addr *addr)
+{
+#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && BITS_PER_LONG == 64
+   __be64 *p = (__be64 *)addr;
+
+   return ((p[0] ^ cpu_to_be64(0xff02UL)) |
+   (p[1] ^ cpu_to_be64(0x6a))) == 0UL;
+#else
+   return ((addr->s6_addr32[0] ^ htonl(0xff02)) |
+   addr->s6_addr32[1] | addr->s6_addr32[2] |
+   (addr->s6_addr32[3] ^ htonl(0x006a))) == 0;
+#endif
+}
+
 #ifdef CONFIG_PROC_FS
 int if6_proc_init(void);
 void if6_proc_exit(void);
diff --git a/include/uapi/linux/icmpv6.h b/include/uapi/linux/icmpv6.h
index caf8dc019250..325395f56bfa 100644
--- a/include/uapi/linux/icmpv6.h
+++ b/include/uapi/linux/icmpv6.h
@@ -108,6 +108,8 @@ struct icmp6hdr {
 #define ICMPV6_MOBILE_PREFIX_SOL   146
 #define ICMPV6_MOBILE_PREFIX_ADV   147
 
+#define ICMPV6_MRDISC_ADV  151
+
 /*
  * Codes for Destination Unreachable
  */
diff --git a/include/uapi/linux/igmp.h b/include/uapi/linux/igmp.h
index 7e44ac02ca18..90c28bc466c6 100644
--- a/include/uapi/linux/igmp.h
+++ b/include/uapi/linux/igmp.h
@@ -93,6 +93,7 @@ struct igmpv3_query {
 #define IGMP_MTRACE_RESP   0x1e
 #define IGMP_MTRACE0x1f
 
+#define IGMP_MRDISC_ADV0x30/* From RFC4286 */
 
 /*
  * Use the BSD names for these for compatibility
diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index 2366f4a2780e..2c46c7aca571 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -29,10 +30,12 @@
 #include 
 #include 
 #if IS_ENABLED(CONFIG_IPV6)
+#include 
 #include 
 #include 
 #include 
 #include 
+#include 
 #endif
 
 #include "br_private.h"
@@ -1583,6 +1586,19 @@ static void br_multicast_pim(struct net_bridge *br,
br_multicast_mark_router(br, port);
 }
 
+static int br_ip4_multicast_mrd_rcv(struct net_bridge *br,
+   struct net_bridge_port *port,
+   struct sk_buff *skb)
+{
+   if (ip_hdr(skb)->protocol != IPPROTO_IGMP ||
+   igmp_hdr(skb)->type != IGMP_MRDISC_ADV)
+   return -ENOMSG;
+
+   br_multicast_mark_router(br, port);
+
+   return 0;
+}
+
 static int br_multicast_ipv4_rcv(struct net_bridge *br,
 struct net_bridge_port *port,
 struct sk_buff *skb,
@@ -1600,7 +1616,15 @@ static int br_multicast_ipv4_rcv(struct net_bridge *br,
} else if (pim_ipv4_all_pim_routers(ip_hdr(skb)->daddr)) {
if (ip_hdr(skb)->protocol == IPPROTO_PIM)
  

[PATCH net-next v2 1/4] bridge: simplify ip_mc_check_igmp() and ipv6_mc_check_mld() calls

2019-01-20 Thread Linus Lüssing
This patch refactors ip_mc_check_igmp(), ipv6_mc_check_mld() and
their callers (more precisely, the Linux bridge) to not rely on
the skb_trimmed parameter anymore.

An skb with its tail trimmed to the IP packet length was initially
introduced for the following three reasons:

1) To be able to verify the ICMPv6 checksum.
2) To be able to distinguish the version of an IGMP or MLD query.
   They are distinguishable only by their size.
3) To avoid parsing data for an IGMPv3 or MLDv2 report that is
   beyond the IP packet but still within the skb.

The first case still uses a cloned and potentially trimmed skb to
verfiy. However, there is no need to propagate it to the caller.
For the second and third case explicit IP packet length checks were
added.

This hopefully makes ip_mc_check_igmp() and ipv6_mc_check_mld() easier
to read and verfiy, as well as easier to use.

Signed-off-by: Linus Lüssing 
---
 include/linux/igmp.h   | 11 -
 include/linux/ip.h |  5 
 include/linux/ipv6.h   |  6 +
 include/net/addrconf.h | 12 +-
 net/batman-adv/multicast.c |  4 ++--
 net/bridge/br_multicast.c  | 57 +++---
 net/ipv4/igmp.c| 23 ---
 net/ipv6/mcast_snoop.c | 24 ---
 8 files changed, 70 insertions(+), 72 deletions(-)

diff --git a/include/linux/igmp.h b/include/linux/igmp.h
index 119f53941c12..8b4348f69bc5 100644
--- a/include/linux/igmp.h
+++ b/include/linux/igmp.h
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -106,6 +107,14 @@ struct ip_mc_list {
 #define IGMPV3_QQIC(value) IGMPV3_EXP(0x80, 4, 3, value)
 #define IGMPV3_MRC(value) IGMPV3_EXP(0x80, 4, 3, value)
 
+static inline int ip_mc_may_pull(struct sk_buff *skb, unsigned int len)
+{
+   if (skb_transport_offset(skb) + ip_transport_len(skb) < len)
+   return -EINVAL;
+
+   return pskb_may_pull(skb, len);
+}
+
 extern int ip_check_mc_rcu(struct in_device *dev, __be32 mc_addr, __be32 
src_addr, u8 proto);
 extern int igmp_rcv(struct sk_buff *);
 extern int ip_mc_join_group(struct sock *sk, struct ip_mreqn *imr);
@@ -130,6 +139,6 @@ extern void ip_mc_unmap(struct in_device *);
 extern void ip_mc_remap(struct in_device *);
 extern void ip_mc_dec_group(struct in_device *in_dev, __be32 addr);
 extern void ip_mc_inc_group(struct in_device *in_dev, __be32 addr);
-int ip_mc_check_igmp(struct sk_buff *skb, struct sk_buff **skb_trimmed);
+int ip_mc_check_igmp(struct sk_buff *skb);
 
 #endif
diff --git a/include/linux/ip.h b/include/linux/ip.h
index 492bc6513533..482b7b7c9f30 100644
--- a/include/linux/ip.h
+++ b/include/linux/ip.h
@@ -34,4 +34,9 @@ static inline struct iphdr *ipip_hdr(const struct sk_buff 
*skb)
 {
return (struct iphdr *)skb_transport_header(skb);
 }
+
+static inline unsigned int ip_transport_len(const struct sk_buff *skb)
+{
+   return ntohs(ip_hdr(skb)->tot_len) - skb_network_header_len(skb);
+}
 #endif /* _LINUX_IP_H */
diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 495e834c1367..6d45ce784bea 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -104,6 +104,12 @@ static inline struct ipv6hdr *ipipv6_hdr(const struct 
sk_buff *skb)
return (struct ipv6hdr *)skb_transport_header(skb);
 }
 
+static inline unsigned int ipv6_transport_len(const struct sk_buff *skb)
+{
+   return ntohs(ipv6_hdr(skb)->payload_len) + sizeof(struct ipv6hdr) -
+  skb_network_header_len(skb);
+}
+
 /* 
This structure contains results of exthdrs parsing
as offsets from skb->nh.
diff --git a/include/net/addrconf.h b/include/net/addrconf.h
index 1656c5978498..daf11dcb0f70 100644
--- a/include/net/addrconf.h
+++ b/include/net/addrconf.h
@@ -49,6 +49,7 @@ struct prefix_info {
struct in6_addr prefix;
 };
 
+#include 
 #include 
 #include 
 #include 
@@ -201,6 +202,15 @@ u32 ipv6_addr_label(struct net *net, const struct in6_addr 
*addr,
 /*
  * multicast prototypes (mcast.c)
  */
+static inline int ipv6_mc_may_pull(struct sk_buff *skb,
+  unsigned int len)
+{
+   if (skb_transport_offset(skb) + ipv6_transport_len(skb) < len)
+   return -EINVAL;
+
+   return pskb_may_pull(skb, len);
+}
+
 int ipv6_sock_mc_join(struct sock *sk, int ifindex,
  const struct in6_addr *addr);
 int ipv6_sock_mc_drop(struct sock *sk, int ifindex,
@@ -219,7 +229,7 @@ void ipv6_mc_unmap(struct inet6_dev *idev);
 void ipv6_mc_remap(struct inet6_dev *idev);
 void ipv6_mc_init_dev(struct inet6_dev *idev);
 void ipv6_mc_destroy_dev(struct inet6_dev *idev);
-int ipv6_mc_check_mld(struct sk_buff *skb, struct sk_buff **skb_trimmed);
+int ipv6_mc_check_mld(struct sk_buff *skb);
 void addrconf_dad_failure(struct sk_buff *skb, struct inet6_ifaddr *ifp);
 
 bool ipv6_chk_mcast_addr(struct net_device *dev, const struct in6_addr *group,
diff --git a/net/batman-adv/mu

[PATCH net-next v2 0/4] bridge: implement Multicast Router Discovery (RFC4286)

2019-01-20 Thread Linus Lüssing
Hi,

This patchset adds initial Multicast Router Discovery support to
the Linux bridge (RFC4286). With MRD it is possible to detect multicast
routers and mark bridge ports and forward multicast packets to such routers
accordingly.

So far, multicast routers are detected via IGMP/MLD queries and PIM
messages in the Linux bridge. As there is only one active, selected
querier at a time RFC4541 ("Considerations for Internet Group Management
Protocol (IGMP) and Multicast Listener Discovery (MLD) Snooping
Switches") section 2.1.1.a) recommends snooping Multicast Router
Advertisements as provided by MRD (RFC4286).


The first two patches are refactoring some existing code which is reused
for parsing the Multicast Router Advertisements later in the fourth
patch. The third patch lets the bridge join the all-snoopers multicast
address to be able to reliably receive the Multicast Router
Advertisements.


What is not implemented yet from RFC4286 yet:

* Sending Multicast Router Solicitations:
  -> RFC4286: "[...] may be sent when [...] an interface is
 (re-)initialized [or] MRD is enabled"
* Snooping Multicast Router Terminations:
  -> currently this only relies on our own timeouts
* Adjusting timeouts with the values provided in the announcements


Regards, Linus

---

Changes in v2:

* rebased to current net-next/master (no conflicts/changes)




[PATCH net-next 2/4] bridge: simplify ip_mc_check_igmp() and ipv6_mc_check_mld() internals

2018-12-21 Thread Linus Lüssing
With this patch the internal use of the skb_trimmed is reduced to
the ICMPv6/IGMP checksum verification. And for the length checks
the newly introduced helper functions are used instead of calculating
and checking with skb->len directly.

These changes should hopefully make it easier to verify that length
checks are performed properly.

Signed-off-by: Linus Lüssing 
---
 net/ipv4/igmp.c| 51 ++---
 net/ipv6/mcast_snoop.c | 62 --
 2 files changed, 52 insertions(+), 61 deletions(-)

diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index b1f6d93282d7..a40e48ded10d 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -1493,22 +1493,22 @@ static int ip_mc_check_igmp_reportv3(struct sk_buff 
*skb)
 
len += sizeof(struct igmpv3_report);
 
-   return pskb_may_pull(skb, len) ? 0 : -EINVAL;
+   return ip_mc_may_pull(skb, len) ? 0 : -EINVAL;
 }
 
 static int ip_mc_check_igmp_query(struct sk_buff *skb)
 {
-   unsigned int len = skb_transport_offset(skb);
-
-   len += sizeof(struct igmphdr);
-   if (skb->len < len)
-   return -EINVAL;
+   unsigned int transport_len = ip_transport_len(skb);
+   unsigned int len;
 
/* IGMPv{1,2}? */
-   if (skb->len != len) {
+   if (transport_len != sizeof(struct igmphdr)) {
/* or IGMPv3? */
-   len += sizeof(struct igmpv3_query) - sizeof(struct igmphdr);
-   if (skb->len < len || !pskb_may_pull(skb, len))
+   if (transport_len < sizeof(struct igmpv3_query))
+   return -EINVAL;
+
+   len = skb_transport_offset(skb) + sizeof(struct igmpv3_query);
+   if (!ip_mc_may_pull(skb, len))
return -EINVAL;
}
 
@@ -1544,35 +1544,24 @@ static inline __sum16 ip_mc_validate_checksum(struct 
sk_buff *skb)
return skb_checksum_simple_validate(skb);
 }
 
-static int __ip_mc_check_igmp(struct sk_buff *skb)
-
+static int ip_mc_check_igmp_csum(struct sk_buff *skb)
 {
-   struct sk_buff *skb_chk;
-   unsigned int transport_len;
unsigned int len = skb_transport_offset(skb) + sizeof(struct igmphdr);
-   int ret = -EINVAL;
+   unsigned int transport_len = ip_transport_len(skb);
+   struct sk_buff *skb_chk;
 
-   transport_len = ntohs(ip_hdr(skb)->tot_len) - ip_hdrlen(skb);
+   if (!ip_mc_may_pull(skb, len))
+   return -EINVAL;
 
skb_chk = skb_checksum_trimmed(skb, transport_len,
   ip_mc_validate_checksum);
if (!skb_chk)
-   goto err;
+   return -EINVAL;
 
-   if (!pskb_may_pull(skb_chk, len))
-   goto err;
-
-   ret = ip_mc_check_igmp_msg(skb_chk);
-   if (ret)
-   goto err;
-
-   ret = 0;
-
-err:
-   if (skb_chk && skb_chk != skb)
+   if (skb_chk != skb)
kfree_skb(skb_chk);
 
-   return ret;
+   return 0;
 }
 
 /**
@@ -1600,7 +1589,11 @@ int ip_mc_check_igmp(struct sk_buff *skb)
if (ip_hdr(skb)->protocol != IPPROTO_IGMP)
return -ENOMSG;
 
-   return __ip_mc_check_igmp(skb);
+   ret = ip_mc_check_igmp_csum(skb);
+   if (ret < 0)
+   return ret;
+
+   return ip_mc_check_igmp_msg(skb);
 }
 EXPORT_SYMBOL(ip_mc_check_igmp);
 
diff --git a/net/ipv6/mcast_snoop.c b/net/ipv6/mcast_snoop.c
index 1a917dc80d5e..a72ddfc40eb3 100644
--- a/net/ipv6/mcast_snoop.c
+++ b/net/ipv6/mcast_snoop.c
@@ -77,27 +77,27 @@ static int ipv6_mc_check_mld_reportv2(struct sk_buff *skb)
 
len += sizeof(struct mld2_report);
 
-   return pskb_may_pull(skb, len) ? 0 : -EINVAL;
+   return ipv6_mc_may_pull(skb, len) ? 0 : -EINVAL;
 }
 
 static int ipv6_mc_check_mld_query(struct sk_buff *skb)
 {
+   unsigned int transport_len = ipv6_transport_len(skb);
struct mld_msg *mld;
-   unsigned int len = skb_transport_offset(skb);
+   unsigned int len;
 
/* RFC2710+RFC3810 (MLDv1+MLDv2) require link-local source addresses */
if (!(ipv6_addr_type(_hdr(skb)->saddr) & IPV6_ADDR_LINKLOCAL))
return -EINVAL;
 
-   len += sizeof(struct mld_msg);
-   if (skb->len < len)
-   return -EINVAL;
-
/* MLDv1? */
-   if (skb->len != len) {
+   if (transport_len != sizeof(struct mld_msg)) {
/* or MLDv2? */
-   len += sizeof(struct mld2_query) - sizeof(struct mld_msg);
-   if (skb->len < len || !pskb_may_pull(skb, len))
+   if (transport_len < sizeof(struct mld2_query))
+   return -EINVAL;
+
+   len = skb_transport_offset(skb) + sizeof(struct mld2_query);
+   if (!ipv6_mc_may_pull(skb, len))
return -EINVAL;
}
 
@@ -115,7 +115,13 @@ static

[PATCH net-next 1/4] bridge: simplify ip_mc_check_igmp() and ipv6_mc_check_mld() calls

2018-12-21 Thread Linus Lüssing
This patch refactors ip_mc_check_igmp(), ipv6_mc_check_mld() and
their callers (more precisely, the Linux bridge) to not rely on
the skb_trimmed parameter anymore.

An skb with its tail trimmed to the IP packet length was initially
introduced for the following three reasons:

1) To be able to verify the ICMPv6 checksum.
2) To be able to distinguish the version of an IGMP or MLD query.
   They are distinguishable only by their size.
3) To avoid parsing data for an IGMPv3 or MLDv2 report that is
   beyond the IP packet but still within the skb.

The first case still uses a cloned and potentially trimmed skb to
verfiy. However, there is no need to propagate it to the caller.
For the second and third case explicit IP packet length checks were
added.

This hopefully makes ip_mc_check_igmp() and ipv6_mc_check_mld() easier
to read and verfiy, as well as easier to use.

Signed-off-by: Linus Lüssing 
---
 include/linux/igmp.h   | 11 -
 include/linux/ip.h |  5 
 include/linux/ipv6.h   |  6 +
 include/net/addrconf.h | 12 +-
 net/batman-adv/multicast.c |  4 ++--
 net/bridge/br_multicast.c  | 57 +++---
 net/ipv4/igmp.c| 23 ---
 net/ipv6/mcast_snoop.c | 24 ---
 8 files changed, 70 insertions(+), 72 deletions(-)

diff --git a/include/linux/igmp.h b/include/linux/igmp.h
index 119f53941c12..8b4348f69bc5 100644
--- a/include/linux/igmp.h
+++ b/include/linux/igmp.h
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -106,6 +107,14 @@ struct ip_mc_list {
 #define IGMPV3_QQIC(value) IGMPV3_EXP(0x80, 4, 3, value)
 #define IGMPV3_MRC(value) IGMPV3_EXP(0x80, 4, 3, value)
 
+static inline int ip_mc_may_pull(struct sk_buff *skb, unsigned int len)
+{
+   if (skb_transport_offset(skb) + ip_transport_len(skb) < len)
+   return -EINVAL;
+
+   return pskb_may_pull(skb, len);
+}
+
 extern int ip_check_mc_rcu(struct in_device *dev, __be32 mc_addr, __be32 
src_addr, u8 proto);
 extern int igmp_rcv(struct sk_buff *);
 extern int ip_mc_join_group(struct sock *sk, struct ip_mreqn *imr);
@@ -130,6 +139,6 @@ extern void ip_mc_unmap(struct in_device *);
 extern void ip_mc_remap(struct in_device *);
 extern void ip_mc_dec_group(struct in_device *in_dev, __be32 addr);
 extern void ip_mc_inc_group(struct in_device *in_dev, __be32 addr);
-int ip_mc_check_igmp(struct sk_buff *skb, struct sk_buff **skb_trimmed);
+int ip_mc_check_igmp(struct sk_buff *skb);
 
 #endif
diff --git a/include/linux/ip.h b/include/linux/ip.h
index 492bc6513533..482b7b7c9f30 100644
--- a/include/linux/ip.h
+++ b/include/linux/ip.h
@@ -34,4 +34,9 @@ static inline struct iphdr *ipip_hdr(const struct sk_buff 
*skb)
 {
return (struct iphdr *)skb_transport_header(skb);
 }
+
+static inline unsigned int ip_transport_len(const struct sk_buff *skb)
+{
+   return ntohs(ip_hdr(skb)->tot_len) - skb_network_header_len(skb);
+}
 #endif /* _LINUX_IP_H */
diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 495e834c1367..6d45ce784bea 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -104,6 +104,12 @@ static inline struct ipv6hdr *ipipv6_hdr(const struct 
sk_buff *skb)
return (struct ipv6hdr *)skb_transport_header(skb);
 }
 
+static inline unsigned int ipv6_transport_len(const struct sk_buff *skb)
+{
+   return ntohs(ipv6_hdr(skb)->payload_len) + sizeof(struct ipv6hdr) -
+  skb_network_header_len(skb);
+}
+
 /* 
This structure contains results of exthdrs parsing
as offsets from skb->nh.
diff --git a/include/net/addrconf.h b/include/net/addrconf.h
index 1656c5978498..daf11dcb0f70 100644
--- a/include/net/addrconf.h
+++ b/include/net/addrconf.h
@@ -49,6 +49,7 @@ struct prefix_info {
struct in6_addr prefix;
 };
 
+#include 
 #include 
 #include 
 #include 
@@ -201,6 +202,15 @@ u32 ipv6_addr_label(struct net *net, const struct in6_addr 
*addr,
 /*
  * multicast prototypes (mcast.c)
  */
+static inline int ipv6_mc_may_pull(struct sk_buff *skb,
+  unsigned int len)
+{
+   if (skb_transport_offset(skb) + ipv6_transport_len(skb) < len)
+   return -EINVAL;
+
+   return pskb_may_pull(skb, len);
+}
+
 int ipv6_sock_mc_join(struct sock *sk, int ifindex,
  const struct in6_addr *addr);
 int ipv6_sock_mc_drop(struct sock *sk, int ifindex,
@@ -219,7 +229,7 @@ void ipv6_mc_unmap(struct inet6_dev *idev);
 void ipv6_mc_remap(struct inet6_dev *idev);
 void ipv6_mc_init_dev(struct inet6_dev *idev);
 void ipv6_mc_destroy_dev(struct inet6_dev *idev);
-int ipv6_mc_check_mld(struct sk_buff *skb, struct sk_buff **skb_trimmed);
+int ipv6_mc_check_mld(struct sk_buff *skb);
 void addrconf_dad_failure(struct sk_buff *skb, struct inet6_ifaddr *ifp);
 
 bool ipv6_chk_mcast_addr(struct net_device *dev, const struct in6_addr *group,
diff --git a/net/batman-adv/mu

[PATCH net-next 0/4] bridge: implement Multicast Router Discovery (RFC4286)

2018-12-21 Thread Linus Lüssing
Hi,

This patchset adds initial Multicast Router Discovery support to
the Linux bridge (RFC4286). With MRD it is possible to detect multicast
routers and mark bridge ports and forward multicast packets to such routers
accordingly.

So far, multicast routers are detected via IGMP/MLD queries and PIM
messages in the Linux bridge. As there is only one active, selected
querier at a time RFC4541 ("Considerations for Internet Group Management
Protocol (IGMP) and Multicast Listener Discovery (MLD) Snooping
Switches") section 2.1.1.a) recommends snooping Multicast Router
Advertisements as provided by MRD (RFC4286).


The first two patches are refactoring some existing code which is reused
for parsing the Multicast Router Advertisements later in the fourth
patch. The third patch lets the bridge join the all-snoopers multicast
address to be able to reliably receive the Multicast Router
Advertisements.


What is not implemented yet from RFC4286 yet:

* Sending Multicast Router Solicitations:
  -> RFC4286: "[...] may be sent when [...] an interface is
 (re-)initialized [or] MRD is enabled"
* Snooping Multicast Router Terminations:
  -> currently this only relies on our own timeouts
* Adjusting timeouts with the values provided in the announcements


Regards, Linus





[PATCH net-next 4/4] bridge: Snoop Multicast Router Advertisements

2018-12-21 Thread Linus Lüssing
When multiple multicast routers are present in a broadcast domain then
only one of them will be detectable via IGMP/MLD query snooping. The
multicast router with the lowest IP address will become the selected and
active querier while all other multicast routers will then refrain from
sending queries.

To detect such rather silent multicast routers, too, RFC4286
("Multicast Router Discovery") provides a standardized protocol to
detect multicast routers for multicast snooping switches.

This patch implements the necessary MRD Advertisement message parsing
and after successful processing adds such routers to the internal
multicast router list.

Signed-off-by: Linus Lüssing 
---
 include/linux/in.h  |  5 +
 include/net/addrconf.h  | 15 +
 include/uapi/linux/icmpv6.h |  2 ++
 include/uapi/linux/igmp.h   |  1 +
 net/bridge/br_multicast.c   | 55 +
 net/ipv6/mcast_snoop.c  |  5 -
 6 files changed, 82 insertions(+), 1 deletion(-)

diff --git a/include/linux/in.h b/include/linux/in.h
index 31b493734763..435e7f2a513a 100644
--- a/include/linux/in.h
+++ b/include/linux/in.h
@@ -60,6 +60,11 @@ static inline bool ipv4_is_lbcast(__be32 addr)
return addr == htonl(INADDR_BROADCAST);
 }
 
+static inline bool ipv4_is_all_snoopers(__be32 addr)
+{
+   return addr == htonl(INADDR_ALLSNOOPERS_GROUP);
+}
+
 static inline bool ipv4_is_zeronet(__be32 addr)
 {
return (addr & htonl(0xff00)) == htonl(0x);
diff --git a/include/net/addrconf.h b/include/net/addrconf.h
index daf11dcb0f70..20d523ee2fec 100644
--- a/include/net/addrconf.h
+++ b/include/net/addrconf.h
@@ -229,6 +229,7 @@ void ipv6_mc_unmap(struct inet6_dev *idev);
 void ipv6_mc_remap(struct inet6_dev *idev);
 void ipv6_mc_init_dev(struct inet6_dev *idev);
 void ipv6_mc_destroy_dev(struct inet6_dev *idev);
+int ipv6_mc_check_icmpv6(struct sk_buff *skb);
 int ipv6_mc_check_mld(struct sk_buff *skb);
 void addrconf_dad_failure(struct sk_buff *skb, struct inet6_ifaddr *ifp);
 
@@ -499,6 +500,20 @@ static inline bool ipv6_addr_is_solict_mult(const struct 
in6_addr *addr)
 #endif
 }
 
+static inline bool ipv6_addr_is_all_snoopers(const struct in6_addr *addr)
+{
+#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && BITS_PER_LONG == 64
+   __be64 *p = (__be64 *)addr;
+
+   return ((p[0] ^ cpu_to_be64(0xff02UL)) |
+   (p[1] ^ cpu_to_be64(0x6a))) == 0UL;
+#else
+   return ((addr->s6_addr32[0] ^ htonl(0xff02)) |
+   addr->s6_addr32[1] | addr->s6_addr32[2] |
+   (addr->s6_addr32[3] ^ htonl(0x006a))) == 0;
+#endif
+}
+
 #ifdef CONFIG_PROC_FS
 int if6_proc_init(void);
 void if6_proc_exit(void);
diff --git a/include/uapi/linux/icmpv6.h b/include/uapi/linux/icmpv6.h
index caf8dc019250..325395f56bfa 100644
--- a/include/uapi/linux/icmpv6.h
+++ b/include/uapi/linux/icmpv6.h
@@ -108,6 +108,8 @@ struct icmp6hdr {
 #define ICMPV6_MOBILE_PREFIX_SOL   146
 #define ICMPV6_MOBILE_PREFIX_ADV   147
 
+#define ICMPV6_MRDISC_ADV  151
+
 /*
  * Codes for Destination Unreachable
  */
diff --git a/include/uapi/linux/igmp.h b/include/uapi/linux/igmp.h
index 7e44ac02ca18..90c28bc466c6 100644
--- a/include/uapi/linux/igmp.h
+++ b/include/uapi/linux/igmp.h
@@ -93,6 +93,7 @@ struct igmpv3_query {
 #define IGMP_MTRACE_RESP   0x1e
 #define IGMP_MTRACE0x1f
 
+#define IGMP_MRDISC_ADV0x30/* From RFC4286 */
 
 /*
  * Use the BSD names for these for compatibility
diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index 2366f4a2780e..2c46c7aca571 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -29,10 +30,12 @@
 #include 
 #include 
 #if IS_ENABLED(CONFIG_IPV6)
+#include 
 #include 
 #include 
 #include 
 #include 
+#include 
 #endif
 
 #include "br_private.h"
@@ -1583,6 +1586,19 @@ static void br_multicast_pim(struct net_bridge *br,
br_multicast_mark_router(br, port);
 }
 
+static int br_ip4_multicast_mrd_rcv(struct net_bridge *br,
+   struct net_bridge_port *port,
+   struct sk_buff *skb)
+{
+   if (ip_hdr(skb)->protocol != IPPROTO_IGMP ||
+   igmp_hdr(skb)->type != IGMP_MRDISC_ADV)
+   return -ENOMSG;
+
+   br_multicast_mark_router(br, port);
+
+   return 0;
+}
+
 static int br_multicast_ipv4_rcv(struct net_bridge *br,
 struct net_bridge_port *port,
 struct sk_buff *skb,
@@ -1600,7 +1616,15 @@ static int br_multicast_ipv4_rcv(struct net_bridge *br,
} else if (pim_ipv4_all_pim_routers(ip_hdr(skb)->daddr)) {
if (ip_hdr(skb)->protocol == IPPROTO_PIM)
  

[PATCH net-next 3/4] bridge: join all-snoopers multicast address

2018-12-21 Thread Linus Lüssing
Next to snooping IGMP/MLD queries RFC4541, section 2.1.1.a) recommends
to snoop multicast router advertisements to detect multicast routers.

Multicast router advertisements are sent to an "all-snoopers"
multicast address. To be able to receive them reliably, we need to
join this group.

Otherwise other snooping switches might refrain from forwarding these
advertisements to us.

Signed-off-by: Linus Lüssing 
---
 include/uapi/linux/in.h   |  9 +++---
 net/bridge/br_multicast.c | 72 ++-
 net/ipv6/mcast.c  |  2 ++
 3 files changed, 78 insertions(+), 5 deletions(-)

diff --git a/include/uapi/linux/in.h b/include/uapi/linux/in.h
index f6052e70bf40..7ab685cacb8f 100644
--- a/include/uapi/linux/in.h
+++ b/include/uapi/linux/in.h
@@ -292,10 +292,11 @@ struct sockaddr_in {
 #defineIN_LOOPBACK(a)  long int) (a)) & 0xff00) == 
0x7f00)
 
 /* Defines for Multicast INADDR */
-#define INADDR_UNSPEC_GROUP0xe000U /* 224.0.0.0   */
-#define INADDR_ALLHOSTS_GROUP  0xe001U /* 224.0.0.1   */
-#define INADDR_ALLRTRS_GROUP0xe002U/* 224.0.0.2 */
-#define INADDR_MAX_LOCAL_GROUP  0xe0ffU/* 224.0.0.255 */
+#define INADDR_UNSPEC_GROUP0xe000U /* 224.0.0.0   */
+#define INADDR_ALLHOSTS_GROUP  0xe001U /* 224.0.0.1   */
+#define INADDR_ALLRTRS_GROUP   0xe002U /* 224.0.0.2 */
+#define INADDR_ALLSNOOPERS_GROUP   0xe06aU /* 224.0.0.106 */
+#define INADDR_MAX_LOCAL_GROUP 0xe0ffU /* 224.0.0.255 */
 #endif
 
 /*  contains the htonl type stuff.. */
diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index 156c4905639e..2366f4a2780e 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -1780,6 +1780,68 @@ void br_multicast_init(struct net_bridge *br)
INIT_HLIST_HEAD(>mdb_list);
 }
 
+static void br_ip4_multicast_join_snoopers(struct net_bridge *br)
+{
+   struct in_device *in_dev = in_dev_get(br->dev);
+
+   if (!in_dev)
+   return;
+
+   ip_mc_inc_group(in_dev, htonl(INADDR_ALLSNOOPERS_GROUP));
+   in_dev_put(in_dev);
+}
+
+#if IS_ENABLED(CONFIG_IPV6)
+static void br_ip6_multicast_join_snoopers(struct net_bridge *br)
+{
+   struct in6_addr addr;
+
+   ipv6_addr_set(, htonl(0xff02), 0, 0, htonl(0x6a));
+   ipv6_dev_mc_inc(br->dev, );
+}
+#else
+static inline void br_ip6_multicast_join_snoopers(struct net_bridge *br)
+{
+}
+#endif
+
+static void br_multicast_join_snoopers(struct net_bridge *br)
+{
+   br_ip4_multicast_join_snoopers(br);
+   br_ip6_multicast_join_snoopers(br);
+}
+
+static void br_ip4_multicast_leave_snoopers(struct net_bridge *br)
+{
+   struct in_device *in_dev = in_dev_get(br->dev);
+
+   if (WARN_ON(!in_dev))
+   return;
+
+   ip_mc_dec_group(in_dev, htonl(INADDR_ALLSNOOPERS_GROUP));
+   in_dev_put(in_dev);
+}
+
+#if IS_ENABLED(CONFIG_IPV6)
+static void br_ip6_multicast_leave_snoopers(struct net_bridge *br)
+{
+   struct in6_addr addr;
+
+   ipv6_addr_set(, htonl(0xff02), 0, 0, htonl(0x6a));
+   ipv6_dev_mc_dec(br->dev, );
+}
+#else
+static inline void br_ip6_multicast_leave_snoopers(struct net_bridge *br)
+{
+}
+#endif
+
+static void br_multicast_leave_snoopers(struct net_bridge *br)
+{
+   br_ip4_multicast_leave_snoopers(br);
+   br_ip6_multicast_leave_snoopers(br);
+}
+
 static void __br_multicast_open(struct net_bridge *br,
struct bridge_mcast_own_query *query)
 {
@@ -1793,6 +1855,9 @@ static void __br_multicast_open(struct net_bridge *br,
 
 void br_multicast_open(struct net_bridge *br)
 {
+   if (br_opt_get(br, BROPT_MULTICAST_ENABLED))
+   br_multicast_join_snoopers(br);
+
__br_multicast_open(br, >ip4_own_query);
 #if IS_ENABLED(CONFIG_IPV6)
__br_multicast_open(br, >ip6_own_query);
@@ -1808,6 +1873,9 @@ void br_multicast_stop(struct net_bridge *br)
del_timer_sync(>ip6_other_query.timer);
del_timer_sync(>ip6_own_query.timer);
 #endif
+
+   if (br_opt_get(br, BROPT_MULTICAST_ENABLED))
+   br_multicast_leave_snoopers(br);
 }
 
 void br_multicast_dev_del(struct net_bridge *br)
@@ -1943,8 +2011,10 @@ int br_multicast_toggle(struct net_bridge *br, unsigned 
long val)
 
br_mc_disabled_update(br->dev, val);
br_opt_toggle(br, BROPT_MULTICAST_ENABLED, !!val);
-   if (!br_opt_get(br, BROPT_MULTICAST_ENABLED))
+   if (!br_opt_get(br, BROPT_MULTICAST_ENABLED)) {
+   br_multicast_leave_snoopers(br);
goto unlock;
+   }
 
if (!netif_running(br->dev))
goto unlock;
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 21f6deb2aec9..42f3f5cd349f 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -940,6 +940,7 @@ int ipv6_dev_mc_inc(struct net_device *dev, 

[PATCH net-next v2] netfilter: ebtables: avoid resetting limit rule state

2018-12-08 Thread Linus Lüssing
So far any changes with ebtables will reset the state of limit rules,
leading to spikes in traffic. This is especially noticeable if changes
are done frequently, for instance via a daemon.

This patch fixes this by bailing out from (re)setting if the limit
rule was initialized before.

When sending packets every 250ms for 600s, with a
"--limit 1/sec --limit-burst 50" rule and a command like this
in the background:

$ ebtables -N VOIDCHAIN
$ while true; do ebtables -F VOIDCHAIN; sleep 30; done

The results are:

Before: ~1600 packets
After: 650 packets

This also aligns the behavior to "xtables-nft-multi ebtables" which uses
nft_limit instead of ebt_limit. In tests nft_limit did not suffer from
this issue and rate limited to 650 just fine.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Linus Lüssing 

---

Changelog v2:

- Adjusted commit message (adjusted title, added test results with
  nft_limit for comparison)
- Excluded rate limiting variables from zeroing when passed to userspace
  by increasing .usersize. This became necessary with 4.11 /
  commit ec2318904965 ("xtables: extend matches and targets with .usersize")
- Retested with 4.20-rc4 and current net-next/master (83af01ba1c2d)

v1 was:

"[net-next] bridge: ebtables: Avoid resetting limit rule state"
-> https://lore.kernel.org/patchwork/patch/854802/
---
 net/bridge/netfilter/ebt_limit.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/bridge/netfilter/ebt_limit.c b/net/bridge/netfilter/ebt_limit.c
index 165b9d678cf1..2cf9861c3bce 100644
--- a/net/bridge/netfilter/ebt_limit.c
+++ b/net/bridge/netfilter/ebt_limit.c
@@ -69,6 +69,10 @@ static int ebt_limit_mt_check(const struct xt_mtchk_param 
*par)
 {
struct ebt_limit_info *info = par->matchinfo;
 
+   /* Do not reset state on unrelated table changes */
+   if (info->prev)
+   return 0;
+
/* Check for overflow. */
if (info->burst == 0 ||
user2credits(info->avg * info->burst) < user2credits(info->avg)) {
@@ -105,7 +109,7 @@ static struct xt_match ebt_limit_mt_reg __read_mostly = {
.match  = ebt_limit_mt,
.checkentry = ebt_limit_mt_check,
.matchsize  = sizeof(struct ebt_limit_info),
-   .usersize   = offsetof(struct ebt_limit_info, prev),
+   .usersize   = sizeof(struct ebt_limit_info),
 #ifdef CONFIG_COMPAT
.compatsize = sizeof(struct ebt_compat_limit_info),
 #endif
-- 
2.11.0



Re: [PATCH net-next] bridge: ebtables: Avoid resetting limit rule state

2017-12-07 Thread Linus Lüssing
On Fri, Dec 08, 2017 at 06:46:06AM +0100, Linus Lüssing wrote:
> Extending the usersize to include info->prev would probably be too
> hackish/ugly, right?

And wouldn't be enough anyway, since
info->{credit,credit_cap,cost} would still be zeroed... Hm.


Re: [PATCH net-next] bridge: ebtables: Avoid resetting limit rule state

2017-12-07 Thread Linus Lüssing
On Fri, Dec 08, 2017 at 06:46:06AM +0100, Linus Lüssing wrote:
> Extending the usersize to include info->prev would probably be too
> hackish/ugly, right?

And wouldn't be enough anyway, since
info->{credit,credit_cap,cost} would still be zeroed... Hm.


Re: [PATCH net-next] bridge: ebtables: Avoid resetting limit rule state

2017-12-07 Thread Linus Lüssing
On Thu, Dec 07, 2017 at 01:26:19AM +0100, Pablo Neira Ayuso wrote:
> > I also had a quick look at a 4.15-rc1 kernel in a VM now. I still
> > end up in ebt_limit_mt_check() with the variables being reset
> > when editing the table somewhere.
> 
> My question is if your fix would work with 4.15-rc1.

You are absoluetly right, it's not working anymore since the
commit you mentioned initially :-(
("xtables: extend matches and targets with .usersize").
info->prev is always 0 since exactly this commit.

That means, trying tricks in ebt_limit_mt_check() is too
late now, the old values are already overwritten? (or is there
some commit scheme which installs the ebt_limit_info provided
by ebt_limit_check() some time after its call?)

Extending the usersize to include info->prev would probably be too
hackish/ugly, right?


Re: [PATCH net-next] bridge: ebtables: Avoid resetting limit rule state

2017-12-07 Thread Linus Lüssing
On Thu, Dec 07, 2017 at 01:26:19AM +0100, Pablo Neira Ayuso wrote:
> > I also had a quick look at a 4.15-rc1 kernel in a VM now. I still
> > end up in ebt_limit_mt_check() with the variables being reset
> > when editing the table somewhere.
> 
> My question is if your fix would work with 4.15-rc1.

You are absoluetly right, it's not working anymore since the
commit you mentioned initially :-(
("xtables: extend matches and targets with .usersize").
info->prev is always 0 since exactly this commit.

That means, trying tricks in ebt_limit_mt_check() is too
late now, the old values are already overwritten? (or is there
some commit scheme which installs the ebt_limit_info provided
by ebt_limit_check() some time after its call?)

Extending the usersize to include info->prev would probably be too
hackish/ugly, right?


Re: [Bridge] [PATCH net-next] bridge: ebtables: Avoid resetting limit rule state

2017-12-03 Thread Linus Lüssing
On Mon, Dec 04, 2017 at 05:53:35AM +0100, Linus Lüssing wrote:
> And so, no I do not have this patch. I looked at it now, but it
> does not seem to have any relation with .matchinfo, does it?

Relation between .usersize and .checkentry I ment, not
.usersize and .matchinfo.


Re: [Bridge] [PATCH net-next] bridge: ebtables: Avoid resetting limit rule state

2017-12-03 Thread Linus Lüssing
On Mon, Dec 04, 2017 at 05:53:35AM +0100, Linus Lüssing wrote:
> And so, no I do not have this patch. I looked at it now, but it
> does not seem to have any relation with .matchinfo, does it?

Relation between .usersize and .checkentry I ment, not
.usersize and .matchinfo.


Re: [PATCH net-next] bridge: ebtables: Avoid resetting limit rule state

2017-12-03 Thread Linus Lüssing
Hi Pablo,

Thanks for your reply!

On Tue, Nov 28, 2017 at 12:30:08AM +0100, Pablo Neira Ayuso wrote:
> [...]
> > diff --git a/net/bridge/netfilter/ebt_limit.c 
> > b/net/bridge/netfilter/ebt_limit.c
> > index 61a9f1be1263..f74b48633feb 100644
> > --- a/net/bridge/netfilter/ebt_limit.c
> > +++ b/net/bridge/netfilter/ebt_limit.c
> > @@ -69,6 +69,10 @@ static int ebt_limit_mt_check(const struct 
> > xt_mtchk_param *par)
> >  {
> > struct ebt_limit_info *info = par->matchinfo;
> >  
> > +   /* Do not reset state on unrelated table changes */
> > +   if (info->prev)
> > +   return 0;
> 
> What kernel version are you using? I suspect you don't have this
> applied?

I'm indeed using a 4.4.102 kernel, as LEDE is still in the process
of updating to 4.14. So 4.4 with LEDE is where I got the measurement
results from.

> 
> commit ec23189049651b16dc2ffab35a4371dc1f491aca
> Author: Willem de Bruijn 
> Date:   Mon Jan 2 17:19:46 2017 -0500
> 
> xtables: extend matches and targets with .usersize

And so, no I do not have this patch. I looked at it now, but it
does not seem to have any relation with .matchinfo, does it?

I also had a quick look at a 4.15-rc1 kernel in a VM now. I still
end up in ebt_limit_mt_check() with the variables being reset
when editing the table somewhere.

Regards, Linus


Re: [PATCH net-next] bridge: ebtables: Avoid resetting limit rule state

2017-12-03 Thread Linus Lüssing
Hi Pablo,

Thanks for your reply!

On Tue, Nov 28, 2017 at 12:30:08AM +0100, Pablo Neira Ayuso wrote:
> [...]
> > diff --git a/net/bridge/netfilter/ebt_limit.c 
> > b/net/bridge/netfilter/ebt_limit.c
> > index 61a9f1be1263..f74b48633feb 100644
> > --- a/net/bridge/netfilter/ebt_limit.c
> > +++ b/net/bridge/netfilter/ebt_limit.c
> > @@ -69,6 +69,10 @@ static int ebt_limit_mt_check(const struct 
> > xt_mtchk_param *par)
> >  {
> > struct ebt_limit_info *info = par->matchinfo;
> >  
> > +   /* Do not reset state on unrelated table changes */
> > +   if (info->prev)
> > +   return 0;
> 
> What kernel version are you using? I suspect you don't have this
> applied?

I'm indeed using a 4.4.102 kernel, as LEDE is still in the process
of updating to 4.14. So 4.4 with LEDE is where I got the measurement
results from.

> 
> commit ec23189049651b16dc2ffab35a4371dc1f491aca
> Author: Willem de Bruijn 
> Date:   Mon Jan 2 17:19:46 2017 -0500
> 
> xtables: extend matches and targets with .usersize

And so, no I do not have this patch. I looked at it now, but it
does not seem to have any relation with .matchinfo, does it?

I also had a quick look at a 4.15-rc1 kernel in a VM now. I still
end up in ebt_limit_mt_check() with the variables being reset
when editing the table somewhere.

Regards, Linus


[PATCH net-next] bridge: ebtables: Avoid resetting limit rule state

2017-11-24 Thread Linus Lüssing
So far any changes with ebtables will reset the state of limit rules,
leading to spikes in traffic. This is especially noticeable if changes
are done frequently, for instance via a daemon.

This patch fixes this by bailing out from (re)setting if the limit
rule was initialized before.

When sending packets every 250ms for 600s, with a
"--limit 1/sec --limit-burst 50" rule and a command like this
in the background:

$ ebtables -N VOIDCHAIN
$ while true; do ebtables -F VOIDCHAIN; sleep 30; done

The results are:

Before: ~1600 packets
After: 650 packets

Signed-off-by: Linus Lüssing <linus.luess...@c0d3.blue>
---
 net/bridge/netfilter/ebt_limit.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/net/bridge/netfilter/ebt_limit.c b/net/bridge/netfilter/ebt_limit.c
index 61a9f1be1263..f74b48633feb 100644
--- a/net/bridge/netfilter/ebt_limit.c
+++ b/net/bridge/netfilter/ebt_limit.c
@@ -69,6 +69,10 @@ static int ebt_limit_mt_check(const struct xt_mtchk_param 
*par)
 {
struct ebt_limit_info *info = par->matchinfo;
 
+   /* Do not reset state on unrelated table changes */
+   if (info->prev)
+   return 0;
+
/* Check for overflow. */
if (info->burst == 0 ||
user2credits(info->avg * info->burst) < user2credits(info->avg)) {
-- 
2.11.0



[PATCH net-next] bridge: ebtables: Avoid resetting limit rule state

2017-11-24 Thread Linus Lüssing
So far any changes with ebtables will reset the state of limit rules,
leading to spikes in traffic. This is especially noticeable if changes
are done frequently, for instance via a daemon.

This patch fixes this by bailing out from (re)setting if the limit
rule was initialized before.

When sending packets every 250ms for 600s, with a
"--limit 1/sec --limit-burst 50" rule and a command like this
in the background:

$ ebtables -N VOIDCHAIN
$ while true; do ebtables -F VOIDCHAIN; sleep 30; done

The results are:

Before: ~1600 packets
After: 650 packets

Signed-off-by: Linus Lüssing 
---
 net/bridge/netfilter/ebt_limit.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/net/bridge/netfilter/ebt_limit.c b/net/bridge/netfilter/ebt_limit.c
index 61a9f1be1263..f74b48633feb 100644
--- a/net/bridge/netfilter/ebt_limit.c
+++ b/net/bridge/netfilter/ebt_limit.c
@@ -69,6 +69,10 @@ static int ebt_limit_mt_check(const struct xt_mtchk_param 
*par)
 {
struct ebt_limit_info *info = par->matchinfo;
 
+   /* Do not reset state on unrelated table changes */
+   if (info->prev)
+   return 0;
+
/* Check for overflow. */
if (info->burst == 0 ||
user2credits(info->avg * info->burst) < user2credits(info->avg)) {
-- 
2.11.0



Re: [PATCH 3.16 076/294] batman-adv: fix TT sync flag inconsistencies

2017-11-12 Thread Linus Lüssing
Hi Ben,

On Tue, Nov 07, 2017 at 01:42:35PM +, Ben Hutchings wrote:
> That function didn't exist in 3.16 (at least not under that name).

Ah, you're right, back then the netlink interface did not
exist in batman-adv yet, only the debugfs one.
batadv_tt_global_print_entry would be the equivalent function
for debugfs. But not worth the effort now, in my opinion.

I'm fine with this proposed patch for 3.16 now. Thanks for the
clarification! And I'm happy to see this patch backported.

Regards, Linus


Re: [PATCH 3.16 076/294] batman-adv: fix TT sync flag inconsistencies

2017-11-12 Thread Linus Lüssing
Hi Ben,

On Tue, Nov 07, 2017 at 01:42:35PM +, Ben Hutchings wrote:
> That function didn't exist in 3.16 (at least not under that name).

Ah, you're right, back then the netlink interface did not
exist in batman-adv yet, only the debugfs one.
batadv_tt_global_print_entry would be the equivalent function
for debugfs. But not worth the effort now, in my opinion.

I'm fine with this proposed patch for 3.16 now. Thanks for the
clarification! And I'm happy to see this patch backported.

Regards, Linus


Re: [PATCH 3.16 076/294] batman-adv: fix TT sync flag inconsistencies

2017-11-06 Thread Linus Lüssing
Hi Ben!

On Mon, Nov 06, 2017 at 11:03:02PM +, Ben Hutchings wrote:
> 3.16.50-rc1 review patch.  If anyone has any objections, please let me know.
> 
> --
> 
> From: Linus Lüssing <linus.luess...@c0d3.blue>
> 
> commit 54e22f265e872ae140755b3318521d400a094605 upstream.
[...]
> [bwh: Backported to 3.16:
>  - Drop changes to batadv_tt_global_dump_subentry()

May I ask, were there specific concerns for stable 3.16 kernel
releases with this change?

It's not bothering me, but I'm currently wondering whether
this could cause some confusion to users.

Regards, Linus


Re: [PATCH 3.16 076/294] batman-adv: fix TT sync flag inconsistencies

2017-11-06 Thread Linus Lüssing
Hi Ben!

On Mon, Nov 06, 2017 at 11:03:02PM +, Ben Hutchings wrote:
> 3.16.50-rc1 review patch.  If anyone has any objections, please let me know.
> 
> --
> 
> From: Linus Lüssing 
> 
> commit 54e22f265e872ae140755b3318521d400a094605 upstream.
[...]
> [bwh: Backported to 3.16:
>  - Drop changes to batadv_tt_global_dump_subentry()

May I ask, were there specific concerns for stable 3.16 kernel
releases with this change?

It's not bothering me, but I'm currently wondering whether
this could cause some confusion to users.

Regards, Linus


Re: [PATCH] ARM: dts: meson8b: add reserved memory zone to fix silent freezes

2017-10-28 Thread Linus Lüssing
On Mon, Oct 23, 2017 at 09:47:21AM +0200, Linus Lüssing wrote:
> I'm currently continuing to bisect which difference in Emiliano's
> and my kernel image makes mine boot successfully but not
> Emiliano's. (And I'm continuing reading and testing with the
> filter-range option to better understand what it's presence - or
> absence - does exactly)

I found the difference between Emiliano's and my kernel image and
could narrow it down to this particular difference via bisecting:

Using the multi_v7_defconfig target, but with the following
two options unselected:

* System Type
  -> Qualcomm Support
 - Enable support for MSM8X60 (disabled)
 - Enable support for MSM8960 (disabled)

Results in the following diff, according to ./scripts/diffconfig:

-CLKSRC_QCOM y
-MSM_IOMMU n
 ARCH_MSM8960 y -> n
 ARCH_MSM8X60 y -> n

Once this is unselected, the kernel hangs for me on boot, too.
Both with or without this 2MB reserved memory region patch.

Removing the "arm,filter-ranges" as tried by Emiliano makes it
boot again.

Finally, what also helps booting again is this diff from the
pending SMP support patch series from Carlo/Martin [0]:

~
diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index 47d3a1ab08d2..82faa958ab88 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -147,6 +147,7 @@  textofs-$(CONFIG_SA) := 0x00208000
 endif
 textofs-$(CONFIG_ARCH_MSM8X60) := 0x00208000
 textofs-$(CONFIG_ARCH_MSM8960) := 0x00208000
+textofs-$(CONFIG_ARCH_MESON) := 0x00208000
 textofs-$(CONFIG_ARCH_AXXIA) := 0x00308000
 
 # Machine directory name.  This list is sorted alphanumerically
~

Note the $(CONFIG_ARCH_MSM8X60) and $(CONFIG_ARCH_MSM8960) just
above. Sounds familiar :-)?

With this textofs diff alone, stress-ng still hangs though. Only
with the 2MB memory zone reserved via DT or the "arm,filter-ranges"
removed, stress-ng succeeds, too.

Regards, Linus

[0]: "[v7,4/6] ARM: meson: Add SMP bringup code for Meson8 and Meson8b"
 -> https://patchwork.kernel.org/patch/9954935/


On Mon, Oct 23, 2017 at 09:47:21AM +0200, Linus Lüssing wrote:
> Subject: Re: [PATCH] ARM: dts: meson8b: add reserved memory zone to fix 
> silent freezes
> To: Kevin Hilman <khil...@baylibre.com>
> Cc: Carlo Caione <ca...@caione.org>, Kevin Hilman <khil...@baylibre.com>, 
> Martin Blumenstingl <martin.blumensti...@googlemail.com>, Emiliano Ingrassia 
> <ingras...@epigenesys.com>, linux-amlo...@lists.infradead.org
> 
> Hi Kevin,
> 
> Just wanted to let you know that Emiliano and I are currently
> debugging further off-list.
> 
> So far I can reproduce that:
> 
> a) For the binary kernel image Emiliano sent me I can reproduce
> his hang ups during boot on my Odroid C1+.
> b) The 2MB reserved memory region this patch adds does not help
> for this image.
> c) Removing the "arm,filter-range" as proposed by Emiliano back
> then instead of adding this reserved memory zone fixes my freezes
> during boot in Emiliano's image and during stress-ng for my kernel
> image, too.
> 
> I'm currently continuing to bisect which difference in Emiliano's
> and my kernel image makes mine boot successfully but not
> Emiliano's. (And I'm continuing reading and testing with the
> filter-range option to better understand what it's presence - or
> absence - does exactly)
> 
> 
> If these observations ring a bell for anyone here, I'd be curious
> to hear what they think.
> 
> Regards, Linus
> 
> ___
> linux-amlogic mailing list
> linux-amlo...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-amlogic


Re: [PATCH] ARM: dts: meson8b: add reserved memory zone to fix silent freezes

2017-10-28 Thread Linus Lüssing
On Mon, Oct 23, 2017 at 09:47:21AM +0200, Linus Lüssing wrote:
> I'm currently continuing to bisect which difference in Emiliano's
> and my kernel image makes mine boot successfully but not
> Emiliano's. (And I'm continuing reading and testing with the
> filter-range option to better understand what it's presence - or
> absence - does exactly)

I found the difference between Emiliano's and my kernel image and
could narrow it down to this particular difference via bisecting:

Using the multi_v7_defconfig target, but with the following
two options unselected:

* System Type
  -> Qualcomm Support
 - Enable support for MSM8X60 (disabled)
 - Enable support for MSM8960 (disabled)

Results in the following diff, according to ./scripts/diffconfig:

-CLKSRC_QCOM y
-MSM_IOMMU n
 ARCH_MSM8960 y -> n
 ARCH_MSM8X60 y -> n

Once this is unselected, the kernel hangs for me on boot, too.
Both with or without this 2MB reserved memory region patch.

Removing the "arm,filter-ranges" as tried by Emiliano makes it
boot again.

Finally, what also helps booting again is this diff from the
pending SMP support patch series from Carlo/Martin [0]:

~
diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index 47d3a1ab08d2..82faa958ab88 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -147,6 +147,7 @@  textofs-$(CONFIG_SA) := 0x00208000
 endif
 textofs-$(CONFIG_ARCH_MSM8X60) := 0x00208000
 textofs-$(CONFIG_ARCH_MSM8960) := 0x00208000
+textofs-$(CONFIG_ARCH_MESON) := 0x00208000
 textofs-$(CONFIG_ARCH_AXXIA) := 0x00308000
 
 # Machine directory name.  This list is sorted alphanumerically
~

Note the $(CONFIG_ARCH_MSM8X60) and $(CONFIG_ARCH_MSM8960) just
above. Sounds familiar :-)?

With this textofs diff alone, stress-ng still hangs though. Only
with the 2MB memory zone reserved via DT or the "arm,filter-ranges"
removed, stress-ng succeeds, too.

Regards, Linus

[0]: "[v7,4/6] ARM: meson: Add SMP bringup code for Meson8 and Meson8b"
 -> https://patchwork.kernel.org/patch/9954935/


On Mon, Oct 23, 2017 at 09:47:21AM +0200, Linus Lüssing wrote:
> Subject: Re: [PATCH] ARM: dts: meson8b: add reserved memory zone to fix 
> silent freezes
> To: Kevin Hilman 
> Cc: Carlo Caione , Kevin Hilman , 
> Martin Blumenstingl , Emiliano Ingrassia 
> , linux-amlo...@lists.infradead.org
> 
> Hi Kevin,
> 
> Just wanted to let you know that Emiliano and I are currently
> debugging further off-list.
> 
> So far I can reproduce that:
> 
> a) For the binary kernel image Emiliano sent me I can reproduce
> his hang ups during boot on my Odroid C1+.
> b) The 2MB reserved memory region this patch adds does not help
> for this image.
> c) Removing the "arm,filter-range" as proposed by Emiliano back
> then instead of adding this reserved memory zone fixes my freezes
> during boot in Emiliano's image and during stress-ng for my kernel
> image, too.
> 
> I'm currently continuing to bisect which difference in Emiliano's
> and my kernel image makes mine boot successfully but not
> Emiliano's. (And I'm continuing reading and testing with the
> filter-range option to better understand what it's presence - or
> absence - does exactly)
> 
> 
> If these observations ring a bell for anyone here, I'd be curious
> to hear what they think.
> 
> Regards, Linus
> 
> ___
> linux-amlogic mailing list
> linux-amlo...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-amlogic


Re: [PATCH 0/4] drm/meson: power domain init related fixes

2017-10-17 Thread Linus Lüssing
On Tue, Oct 17, 2017 at 10:07:40AM +0200, Neil Armstrong wrote:
> A PM Power Domain driver has been pushed at [1] to solve the main issue.

URL to [1] missing?


Re: [PATCH 0/4] drm/meson: power domain init related fixes

2017-10-17 Thread Linus Lüssing
On Tue, Oct 17, 2017 at 10:07:40AM +0200, Neil Armstrong wrote:
> A PM Power Domain driver has been pushed at [1] to solve the main issue.

URL to [1] missing?


[PATCH] ARM: dts: meson8b: add reserved memory zone to fix silent freezes

2017-10-02 Thread Linus Lüssing
So far, the stress-ng tool for instance quickly resulted in a silent
freeze of the system with no prior notice on a serial console when
running its filesystem or memory stressor classes.

Even with a panic-on-OOM and reboot-on-panic (vm.panic_on_oom=1,
kernel.panic=10) configured, the system would neither reboot nor
would the OOM killer get any chance to otherwise do its job.

The Amlogic reference source code uses a 2MB PHYS_OFFSET. With these 2MB
reserved via DT, stress-ng was able to run on an Odroid C1+ just fine for
several hours, the OOM killer was able to kill processes again and if
configured would successfully trigger a reboot of the system.

Fixes: 4a69fcd3a108 ("ARM: meson: Add DTS for Odroid-C1 and Tronfy MXQ boards")
Signed-off-by: Linus Lüssing <linus.luess...@c0d3.blue>

---
The following stress-ng command worked fine now:
$ stress-ng -v --sequential 0 -t 120s --exclude sysfs,opcode --metrics
(5 hours runtime, tested on an Odroid C1+ with an 4.14-rc1 kernel + SMP
+ USB DTS patches)
---
 arch/arm/boot/dts/meson8b.dtsi | 12 
 1 file changed, 12 insertions(+)

diff --git a/arch/arm/boot/dts/meson8b.dtsi b/arch/arm/boot/dts/meson8b.dtsi
index bc278da..d75a5b5 100644
--- a/arch/arm/boot/dts/meson8b.dtsi
+++ b/arch/arm/boot/dts/meson8b.dtsi
@@ -83,6 +83,18 @@
};
};
 
+   reserved-memory {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   ranges;
+
+   /* 2 MiB reserved for Hardware ROM Firmware? */
+   hwrom@0 {
+   reg = <0x0 0x20>;
+   no-map;
+   };
+   };
+
scu@c430 {
compatible = "arm,cortex-a5-scu";
reg = <0xc430 0x100>;
-- 
2.1.4



[PATCH] ARM: dts: meson8b: add reserved memory zone to fix silent freezes

2017-10-02 Thread Linus Lüssing
So far, the stress-ng tool for instance quickly resulted in a silent
freeze of the system with no prior notice on a serial console when
running its filesystem or memory stressor classes.

Even with a panic-on-OOM and reboot-on-panic (vm.panic_on_oom=1,
kernel.panic=10) configured, the system would neither reboot nor
would the OOM killer get any chance to otherwise do its job.

The Amlogic reference source code uses a 2MB PHYS_OFFSET. With these 2MB
reserved via DT, stress-ng was able to run on an Odroid C1+ just fine for
several hours, the OOM killer was able to kill processes again and if
configured would successfully trigger a reboot of the system.

Fixes: 4a69fcd3a108 ("ARM: meson: Add DTS for Odroid-C1 and Tronfy MXQ boards")
Signed-off-by: Linus Lüssing 

---
The following stress-ng command worked fine now:
$ stress-ng -v --sequential 0 -t 120s --exclude sysfs,opcode --metrics
(5 hours runtime, tested on an Odroid C1+ with an 4.14-rc1 kernel + SMP
+ USB DTS patches)
---
 arch/arm/boot/dts/meson8b.dtsi | 12 
 1 file changed, 12 insertions(+)

diff --git a/arch/arm/boot/dts/meson8b.dtsi b/arch/arm/boot/dts/meson8b.dtsi
index bc278da..d75a5b5 100644
--- a/arch/arm/boot/dts/meson8b.dtsi
+++ b/arch/arm/boot/dts/meson8b.dtsi
@@ -83,6 +83,18 @@
};
};
 
+   reserved-memory {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   ranges;
+
+   /* 2 MiB reserved for Hardware ROM Firmware? */
+   hwrom@0 {
+   reg = <0x0 0x20>;
+   no-map;
+   };
+   };
+
scu@c430 {
compatible = "arm,cortex-a5-scu";
reg = <0xc430 0x100>;
-- 
2.1.4



[PATCH v2] ARM: multi_v7_defconfig: Enable USB3503 driver

2017-09-28 Thread Linus Lüssing
The Odroid U3 (Exynos 4412 based) for instance needs this driver,
otherwise its USB hub will not come up.

Also selecting it as built-in to allow booting from USB without
an initrd/initramfs. exynos_defconfig does the same already, too.

Signed-off-by: Linus Lüssing <linus.luess...@c0d3.blue>
---
 arch/arm/configs/multi_v7_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/configs/multi_v7_defconfig 
b/arch/arm/configs/multi_v7_defconfig
index 0cacdbf..c2484ef 100644
--- a/arch/arm/configs/multi_v7_defconfig
+++ b/arch/arm/configs/multi_v7_defconfig
@@ -693,6 +693,7 @@ CONFIG_USB_MUSB_HDRC=m
 CONFIG_USB_MUSB_SUNXI=m
 CONFIG_USB_DWC3=y
 CONFIG_USB_DWC2=y
+CONFIG_USB_HSIC_USB3503=y
 CONFIG_USB_CHIPIDEA=y
 CONFIG_USB_CHIPIDEA_UDC=y
 CONFIG_USB_CHIPIDEA_HOST=y
-- 
2.1.4



[PATCH v2] ARM: multi_v7_defconfig: Enable USB3503 driver

2017-09-28 Thread Linus Lüssing
The Odroid U3 (Exynos 4412 based) for instance needs this driver,
otherwise its USB hub will not come up.

Also selecting it as built-in to allow booting from USB without
an initrd/initramfs. exynos_defconfig does the same already, too.

Signed-off-by: Linus Lüssing 
---
 arch/arm/configs/multi_v7_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/configs/multi_v7_defconfig 
b/arch/arm/configs/multi_v7_defconfig
index 0cacdbf..c2484ef 100644
--- a/arch/arm/configs/multi_v7_defconfig
+++ b/arch/arm/configs/multi_v7_defconfig
@@ -693,6 +693,7 @@ CONFIG_USB_MUSB_HDRC=m
 CONFIG_USB_MUSB_SUNXI=m
 CONFIG_USB_DWC3=y
 CONFIG_USB_DWC2=y
+CONFIG_USB_HSIC_USB3503=y
 CONFIG_USB_CHIPIDEA=y
 CONFIG_USB_CHIPIDEA_UDC=y
 CONFIG_USB_CHIPIDEA_HOST=y
-- 
2.1.4



Re: [PATCH] ARM: multi_v7_defconfig: Enable USB3503 driver

2017-09-28 Thread Linus Lüssing
Hi Krzysztof,

Thanks for your quick reply!

On Thu, Sep 28, 2017 at 08:21:26AM +0200, Krzysztof Kozlowski wrote:
> [...]
> Anyway please define this as a module (unless it can't... but it
> worked in my case).

In that case you used an initrd, right? I see various cases of USB
built-ins, like CONFIG_USB_STORAGE=y, CONFIG_USB_DWC2=y and so on in
multi_v7_defconfig. So my impression was that multi_v7_defconfig was
supposed to allow booting a rootfs from a USB storage device even
without an initrd.

Let me know if that impression is wrong though, I'll happily resend
(and test) the patch with CONFIG_USB_HSIC_USB3503=m then :-).

Regards, Linus


Re: [PATCH] ARM: multi_v7_defconfig: Enable USB3503 driver

2017-09-28 Thread Linus Lüssing
Hi Krzysztof,

Thanks for your quick reply!

On Thu, Sep 28, 2017 at 08:21:26AM +0200, Krzysztof Kozlowski wrote:
> [...]
> Anyway please define this as a module (unless it can't... but it
> worked in my case).

In that case you used an initrd, right? I see various cases of USB
built-ins, like CONFIG_USB_STORAGE=y, CONFIG_USB_DWC2=y and so on in
multi_v7_defconfig. So my impression was that multi_v7_defconfig was
supposed to allow booting a rootfs from a USB storage device even
without an initrd.

Let me know if that impression is wrong though, I'll happily resend
(and test) the patch with CONFIG_USB_HSIC_USB3503=m then :-).

Regards, Linus


[PATCH] ARM: multi_v7_defconfig: Enable USB3503 driver

2017-09-27 Thread Linus Lüssing
The Odroid U3 (Exynos 4412 based) for instance needs this driver,
otherwise its USB hub will not come up.

exynos_defconfig already has this enabled as built-in, too. So
just doing the same here in multi_v7_defconfig.

Signed-off-by: Linus Lüssing <linus.luess...@c0d3.blue>

---
Thanks to Tobias Jakobi, who indirectly made me aware of why USB
did not work for me.

 arch/arm/configs/multi_v7_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/configs/multi_v7_defconfig 
b/arch/arm/configs/multi_v7_defconfig
index 0cacdbf..c2484ef 100644
--- a/arch/arm/configs/multi_v7_defconfig
+++ b/arch/arm/configs/multi_v7_defconfig
@@ -693,6 +693,7 @@ CONFIG_USB_MUSB_HDRC=m
 CONFIG_USB_MUSB_SUNXI=m
 CONFIG_USB_DWC3=y
 CONFIG_USB_DWC2=y
+CONFIG_USB_HSIC_USB3503=y
 CONFIG_USB_CHIPIDEA=y
 CONFIG_USB_CHIPIDEA_UDC=y
 CONFIG_USB_CHIPIDEA_HOST=y
-- 
2.1.4



[PATCH] ARM: multi_v7_defconfig: Enable USB3503 driver

2017-09-27 Thread Linus Lüssing
The Odroid U3 (Exynos 4412 based) for instance needs this driver,
otherwise its USB hub will not come up.

exynos_defconfig already has this enabled as built-in, too. So
just doing the same here in multi_v7_defconfig.

Signed-off-by: Linus Lüssing 

---
Thanks to Tobias Jakobi, who indirectly made me aware of why USB
did not work for me.

 arch/arm/configs/multi_v7_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/configs/multi_v7_defconfig 
b/arch/arm/configs/multi_v7_defconfig
index 0cacdbf..c2484ef 100644
--- a/arch/arm/configs/multi_v7_defconfig
+++ b/arch/arm/configs/multi_v7_defconfig
@@ -693,6 +693,7 @@ CONFIG_USB_MUSB_HDRC=m
 CONFIG_USB_MUSB_SUNXI=m
 CONFIG_USB_DWC3=y
 CONFIG_USB_DWC2=y
+CONFIG_USB_HSIC_USB3503=y
 CONFIG_USB_CHIPIDEA=y
 CONFIG_USB_CHIPIDEA_UDC=y
 CONFIG_USB_CHIPIDEA_HOST=y
-- 
2.1.4



[PATCH net v3] bridge: ebtables: fix reception of frames DNAT-ed to bridge device/port

2017-04-19 Thread Linus Lüssing
When trying to redirect bridged frames to the bridge device itself or
a bridge port (brouting) via the dnat target then this currently fails:

The ethernet destination of the frame is dnat'ed to the MAC address of
the bridge device or port just fine. However, the IP code drops it in
the beginning of ip_input.c/ip_rcv() as the dnat target left
the skb->pkt_type as PACKET_OTHERHOST.

Fixing this by resetting skb->pkt_type to an appropriate type after
dnat'ing.

Signed-off-by: Linus Lüssing <linus.luess...@c0d3.blue>

---

Changelog v3:
- moved pkt_type fixup into ebtable dnat code
  -> v1/v2 only fixed it for prerouting/dnat so far, now tested
 and verified that v3 fixes it for brouting/dnat, too
- updated commit message

Changelog v2:
- refrain from altering pkt_type for multicast packets
  with a unicast destination MAC
---
 net/bridge/netfilter/ebt_dnat.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/net/bridge/netfilter/ebt_dnat.c b/net/bridge/netfilter/ebt_dnat.c
index 4e0b0c3..21acb53 100644
--- a/net/bridge/netfilter/ebt_dnat.c
+++ b/net/bridge/netfilter/ebt_dnat.c
@@ -9,6 +9,7 @@
  */
 #include 
 #include 
+#include "../br_private.h"
 #include 
 #include 
 #include 
@@ -18,11 +19,32 @@ static unsigned int
 ebt_dnat_tg(struct sk_buff *skb, const struct xt_action_param *par)
 {
const struct ebt_nat_info *info = par->targinfo;
+   struct net_device *dev;
 
if (!skb_make_writable(skb, 0))
return EBT_DROP;
 
ether_addr_copy(eth_hdr(skb)->h_dest, info->mac);
+
+   if (is_multicast_ether_addr(info->mac)) {
+   if (is_broadcast_ether_addr(info->mac))
+   skb->pkt_type = PACKET_BROADCAST;
+   else
+   skb->pkt_type = PACKET_MULTICAST;
+   } else {
+   rcu_read_lock();
+   if (xt_hooknum(par) != NF_BR_BROUTING)
+   dev = br_port_get_rcu(xt_in(par))->br->dev;
+   else
+   dev = xt_in(par);
+
+   if (ether_addr_equal(info->mac, dev->dev_addr))
+   skb->pkt_type = PACKET_HOST;
+   else
+   skb->pkt_type = PACKET_OTHERHOST;
+   rcu_read_unlock();
+   }
+
return info->target;
 }
 
-- 
2.1.4



[PATCH net v3] bridge: ebtables: fix reception of frames DNAT-ed to bridge device/port

2017-04-19 Thread Linus Lüssing
When trying to redirect bridged frames to the bridge device itself or
a bridge port (brouting) via the dnat target then this currently fails:

The ethernet destination of the frame is dnat'ed to the MAC address of
the bridge device or port just fine. However, the IP code drops it in
the beginning of ip_input.c/ip_rcv() as the dnat target left
the skb->pkt_type as PACKET_OTHERHOST.

Fixing this by resetting skb->pkt_type to an appropriate type after
dnat'ing.

Signed-off-by: Linus Lüssing 

---

Changelog v3:
- moved pkt_type fixup into ebtable dnat code
  -> v1/v2 only fixed it for prerouting/dnat so far, now tested
 and verified that v3 fixes it for brouting/dnat, too
- updated commit message

Changelog v2:
- refrain from altering pkt_type for multicast packets
  with a unicast destination MAC
---
 net/bridge/netfilter/ebt_dnat.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/net/bridge/netfilter/ebt_dnat.c b/net/bridge/netfilter/ebt_dnat.c
index 4e0b0c3..21acb53 100644
--- a/net/bridge/netfilter/ebt_dnat.c
+++ b/net/bridge/netfilter/ebt_dnat.c
@@ -9,6 +9,7 @@
  */
 #include 
 #include 
+#include "../br_private.h"
 #include 
 #include 
 #include 
@@ -18,11 +19,32 @@ static unsigned int
 ebt_dnat_tg(struct sk_buff *skb, const struct xt_action_param *par)
 {
const struct ebt_nat_info *info = par->targinfo;
+   struct net_device *dev;
 
if (!skb_make_writable(skb, 0))
return EBT_DROP;
 
ether_addr_copy(eth_hdr(skb)->h_dest, info->mac);
+
+   if (is_multicast_ether_addr(info->mac)) {
+   if (is_broadcast_ether_addr(info->mac))
+   skb->pkt_type = PACKET_BROADCAST;
+   else
+   skb->pkt_type = PACKET_MULTICAST;
+   } else {
+   rcu_read_lock();
+   if (xt_hooknum(par) != NF_BR_BROUTING)
+   dev = br_port_get_rcu(xt_in(par))->br->dev;
+   else
+   dev = xt_in(par);
+
+   if (ether_addr_equal(info->mac, dev->dev_addr))
+   skb->pkt_type = PACKET_HOST;
+   else
+   skb->pkt_type = PACKET_OTHERHOST;
+   rcu_read_unlock();
+   }
+
return info->target;
 }
 
-- 
2.1.4



Re: [PATCH v2] bridge: ebtables: fix reception of frames DNAT-ed to bridge device

2017-04-17 Thread Linus Lüssing
On Tue, Mar 21, 2017 at 04:32:45PM -0700, Stephen Hemminger wrote:
> On Tue, 21 Mar 2017 23:28:45 +0100
> Linus Lüssing <linus.luess...@c0d3.blue> wrote:
> 
> > However, the IP code drops it in the beginning of ip_input.c/ip_rcv()
> > as the dnat target did not update the skb->pkt_type. If after
> > dnat'ing the packet is now destined to us then the skb->pkt_type
> > needs to be updated from PACKET_OTHERHOST to PACKET_HOST, too.
> 
> Why not fix DNAT netfilter module rather than hacking bridge code here.

Sorry for the late response. Wanted to do some more testing before
replying.

My assumptions regarding macvlan were wrong:

A) The code my patch adds is not touched in the case of a macvlan
device on top of a bridge - with macvlan there seems to be no FDB entry
for the MAC address of the macvlan device at all, actually, resulting
in the flooding of a frame intended for that device (is this, ehm,
known/intended?).

B) ip_rcv() does not drop for a macvlan device, because macvlan
unconditionally sets skb->pkt_type to PACKET_HOST in macvlan_handle_frame().

(And I guess, maybe I shouldn't actually care that much about
macvlans on top of a bridge, as a veth-pair is much cleaner for that?)


Regarding netdev::dev_addrs, if I understand the kernel code correctly, it
is only, potentially populated with more than netdev::dev_addr for two
drivers, via dev_addr_add(): bnx2x and ixgbe. So for a bridge device,
there should never be more than one item in netdev::dev_addrs
(= netdev::dev_addr), correct?


So, currently I'm happy with moving the fix into the ebtables dnat
code and just checking against the netdev::dev_addr of the bridge
device now. (if anyone has any suggestions regarding upper devices
I should test with that, then please let me know)

(Sorry, Pablo, for me pressing against your earlier suggestion to
put the fix in the ebtables dnat instead of bridge code. :( )

Regards, Linus


Re: [PATCH v2] bridge: ebtables: fix reception of frames DNAT-ed to bridge device

2017-04-17 Thread Linus Lüssing
On Tue, Mar 21, 2017 at 04:32:45PM -0700, Stephen Hemminger wrote:
> On Tue, 21 Mar 2017 23:28:45 +0100
> Linus Lüssing  wrote:
> 
> > However, the IP code drops it in the beginning of ip_input.c/ip_rcv()
> > as the dnat target did not update the skb->pkt_type. If after
> > dnat'ing the packet is now destined to us then the skb->pkt_type
> > needs to be updated from PACKET_OTHERHOST to PACKET_HOST, too.
> 
> Why not fix DNAT netfilter module rather than hacking bridge code here.

Sorry for the late response. Wanted to do some more testing before
replying.

My assumptions regarding macvlan were wrong:

A) The code my patch adds is not touched in the case of a macvlan
device on top of a bridge - with macvlan there seems to be no FDB entry
for the MAC address of the macvlan device at all, actually, resulting
in the flooding of a frame intended for that device (is this, ehm,
known/intended?).

B) ip_rcv() does not drop for a macvlan device, because macvlan
unconditionally sets skb->pkt_type to PACKET_HOST in macvlan_handle_frame().

(And I guess, maybe I shouldn't actually care that much about
macvlans on top of a bridge, as a veth-pair is much cleaner for that?)


Regarding netdev::dev_addrs, if I understand the kernel code correctly, it
is only, potentially populated with more than netdev::dev_addr for two
drivers, via dev_addr_add(): bnx2x and ixgbe. So for a bridge device,
there should never be more than one item in netdev::dev_addrs
(= netdev::dev_addr), correct?


So, currently I'm happy with moving the fix into the ebtables dnat
code and just checking against the netdev::dev_addr of the bridge
device now. (if anyone has any suggestions regarding upper devices
I should test with that, then please let me know)

(Sorry, Pablo, for me pressing against your earlier suggestion to
put the fix in the ebtables dnat instead of bridge code. :( )

Regards, Linus


[PATCH v2] bridge: ebtables: fix reception of frames DNAT-ed to bridge device

2017-03-21 Thread Linus Lüssing
When trying to redirect bridged frames to the bridge device itself
via the ebtables nat-prerouting chain and the dnat target then this
currently fails:

The ethernet destination of the frame is dnat'ed to the MAC address of
the bridge itself just fine and the correctly altered frame can even
be captured via a tcpdump on br0 (with or without promisc mode).

However, the IP code drops it in the beginning of ip_input.c/ip_rcv()
as the dnat target did not update the skb->pkt_type. If after
dnat'ing the packet is now destined to us then the skb->pkt_type
needs to be updated from PACKET_OTHERHOST to PACKET_HOST, too.

Signed-off-by: Linus Lüssing <linus.luess...@c0d3.blue>

---
Changelog v2:
* refrain from altering pkt_type for multicast packets
  with a unicast destination MAC
---
 net/bridge/br_input.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 013f2290b..fd7bc4c 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -198,8 +198,13 @@ int br_handle_frame_finish(struct net *net, struct sock 
*sk, struct sk_buff *skb
if (dst) {
unsigned long now = jiffies;
 
-   if (dst->is_local)
+   if (dst->is_local) {
+   /* fix up potential DNAT inconsistencies */
+   if (skb->pkt_type == PACKET_OTHERHOST)
+   skb->pkt_type = PACKET_HOST;
+
return br_pass_frame_up(skb);
+   }
 
if (now != dst->used)
dst->used = now;
-- 
2.1.4



[PATCH v2] bridge: ebtables: fix reception of frames DNAT-ed to bridge device

2017-03-21 Thread Linus Lüssing
When trying to redirect bridged frames to the bridge device itself
via the ebtables nat-prerouting chain and the dnat target then this
currently fails:

The ethernet destination of the frame is dnat'ed to the MAC address of
the bridge itself just fine and the correctly altered frame can even
be captured via a tcpdump on br0 (with or without promisc mode).

However, the IP code drops it in the beginning of ip_input.c/ip_rcv()
as the dnat target did not update the skb->pkt_type. If after
dnat'ing the packet is now destined to us then the skb->pkt_type
needs to be updated from PACKET_OTHERHOST to PACKET_HOST, too.

Signed-off-by: Linus Lüssing 

---
Changelog v2:
* refrain from altering pkt_type for multicast packets
  with a unicast destination MAC
---
 net/bridge/br_input.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 013f2290b..fd7bc4c 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -198,8 +198,13 @@ int br_handle_frame_finish(struct net *net, struct sock 
*sk, struct sk_buff *skb
if (dst) {
unsigned long now = jiffies;
 
-   if (dst->is_local)
+   if (dst->is_local) {
+   /* fix up potential DNAT inconsistencies */
+   if (skb->pkt_type == PACKET_OTHERHOST)
+   skb->pkt_type = PACKET_HOST;
+
return br_pass_frame_up(skb);
+   }
 
if (now != dst->used)
dst->used = now;
-- 
2.1.4



Re: [Bridge] [PATCH net] bridge: ebtables: fix reception of frames DNAT-ed to bridge device

2017-03-20 Thread Linus Lüssing
On Sun, Mar 19, 2017 at 05:55:06PM +0100, Linus Lüssing wrote:
> On Fri, Mar 17, 2017 at 02:10:44PM +0100, Pablo Neira Ayuso wrote:
> > Wait.
> > 
> > May this break local multicast listener that are bound to the bridge
> > interface? Assuming the bridge interface got an IP address, and that
> > there is local multicast listener.
> > 
> > Missing anything here?
> 
> Hm, for multicast packets usually the code path a few lines
> later in br_handle_frame_finish() should be taken instead.
> 
> But you might be right for IP multicast packets with a unicast MAC
> destination (due to whatever reason, for instance via DNAT'ing
> again).
> 
> Will check that - thanks!

Ok, I tested DNAT'ing an IP multicast packet to the unicast MAC address
of the bridge interface.

Both ping-ing to an IPv4 and IPv6 multicast listener on br0 worked
and was replied to fine, both with or without changing skb->pkt_type
from PACKET_MULTICAST to PACKET_HOST.
("$ ping 224.1.0.123" and "$ ping6 ff02::1:ff40:707c%in0" from a
 network namespace, tied into the bridge via veth)

Also, a DNAT'ed PACKET_BROADCAST worked, with or without changing
it to PACKET_HOST.

I also checked via tcpdump that the destination MAC was changed
successfully.


So, so far I wasn't able to find any bugs with the current
patch. But I think I like the idea of leaving the skb->pkt_type
unaltered for PACKET_MULTICAST and PACKET_BROADCAST, seems cleaner.

I'd just add an "if (skb->pkt_type == PACKET_OTHERHOST)" check
then and resend a PATCH v2.


Re: [Bridge] [PATCH net] bridge: ebtables: fix reception of frames DNAT-ed to bridge device

2017-03-20 Thread Linus Lüssing
On Sun, Mar 19, 2017 at 05:55:06PM +0100, Linus Lüssing wrote:
> On Fri, Mar 17, 2017 at 02:10:44PM +0100, Pablo Neira Ayuso wrote:
> > Wait.
> > 
> > May this break local multicast listener that are bound to the bridge
> > interface? Assuming the bridge interface got an IP address, and that
> > there is local multicast listener.
> > 
> > Missing anything here?
> 
> Hm, for multicast packets usually the code path a few lines
> later in br_handle_frame_finish() should be taken instead.
> 
> But you might be right for IP multicast packets with a unicast MAC
> destination (due to whatever reason, for instance via DNAT'ing
> again).
> 
> Will check that - thanks!

Ok, I tested DNAT'ing an IP multicast packet to the unicast MAC address
of the bridge interface.

Both ping-ing to an IPv4 and IPv6 multicast listener on br0 worked
and was replied to fine, both with or without changing skb->pkt_type
from PACKET_MULTICAST to PACKET_HOST.
("$ ping 224.1.0.123" and "$ ping6 ff02::1:ff40:707c%in0" from a
 network namespace, tied into the bridge via veth)

Also, a DNAT'ed PACKET_BROADCAST worked, with or without changing
it to PACKET_HOST.

I also checked via tcpdump that the destination MAC was changed
successfully.


So, so far I wasn't able to find any bugs with the current
patch. But I think I like the idea of leaving the skb->pkt_type
unaltered for PACKET_MULTICAST and PACKET_BROADCAST, seems cleaner.

I'd just add an "if (skb->pkt_type == PACKET_OTHERHOST)" check
then and resend a PATCH v2.


Re: [PATCH net] bridge: ebtables: fix reception of frames DNAT-ed to bridge device

2017-03-19 Thread Linus Lüssing
On Fri, Mar 17, 2017 at 02:10:44PM +0100, Pablo Neira Ayuso wrote:
> Wait.
> 
> May this break local multicast listener that are bound to the bridge
> interface? Assuming the bridge interface got an IP address, and that
> there is local multicast listener.
> 
> Missing anything here?

Hm, for multicast packets usually the code path a few lines
later in br_handle_frame_finish() should be taken instead.

But you might be right for IP multicast packets with a unicast MAC
destination (due to whatever reason, for instance via DNAT'ing
again).

Will check that - thanks!


Re: [PATCH net] bridge: ebtables: fix reception of frames DNAT-ed to bridge device

2017-03-19 Thread Linus Lüssing
On Fri, Mar 17, 2017 at 02:10:44PM +0100, Pablo Neira Ayuso wrote:
> Wait.
> 
> May this break local multicast listener that are bound to the bridge
> interface? Assuming the bridge interface got an IP address, and that
> there is local multicast listener.
> 
> Missing anything here?

Hm, for multicast packets usually the code path a few lines
later in br_handle_frame_finish() should be taken instead.

But you might be right for IP multicast packets with a unicast MAC
destination (due to whatever reason, for instance via DNAT'ing
again).

Will check that - thanks!


Re: [PATCH net] bridge: ebtables: fix reception of frames DNAT-ed to bridge device

2017-03-15 Thread Linus Lüssing
On Wed, Mar 15, 2017 at 07:15:39PM +0100, Pablo Neira Ayuso wrote:
> Could you update ebtables dnat to check if the ethernet address
> matches the one of the input bridge interface, so we mangle the
> ->pkt_type accordingly from there, instead of doing this from the
> core?

Actually, that was the approach I thought about and went for first
(and it would probably work for me). Just checking against the
bridge device's net_device::dev_addr.

I scratched it though, as I was afraid that the issue might still
exist for people using some other upper device on top of the bridge
device. For instance, macvlan? And iterating over the
net_device::dev_addrs list seemed too costly for fast path to me.


Re: [PATCH net] bridge: ebtables: fix reception of frames DNAT-ed to bridge device

2017-03-15 Thread Linus Lüssing
On Wed, Mar 15, 2017 at 07:15:39PM +0100, Pablo Neira Ayuso wrote:
> Could you update ebtables dnat to check if the ethernet address
> matches the one of the input bridge interface, so we mangle the
> ->pkt_type accordingly from there, instead of doing this from the
> core?

Actually, that was the approach I thought about and went for first
(and it would probably work for me). Just checking against the
bridge device's net_device::dev_addr.

I scratched it though, as I was afraid that the issue might still
exist for people using some other upper device on top of the bridge
device. For instance, macvlan? And iterating over the
net_device::dev_addrs list seemed too costly for fast path to me.


Re: [PATCH net] bridge: ebtables: fix reception of frames DNAT-ed to bridge device

2017-03-15 Thread Linus Lüssing
On Wed, Mar 15, 2017 at 11:42:11AM +0100, Pablo Neira Ayuso wrote:
> I'm missing then why redirect is not then just enough for Linus usecase.

For my usecase, the MAC address is configured by the user from a
Web-UI. It may or may not be the one from the bridge device.

Besides, found it counter intuitive that DNAT did not work here
and took me some time to find out why. At least I didn't read about
any such known limitations of the dnat target in the ebtables
manpage.


Re: [PATCH net] bridge: ebtables: fix reception of frames DNAT-ed to bridge device

2017-03-15 Thread Linus Lüssing
On Wed, Mar 15, 2017 at 11:42:11AM +0100, Pablo Neira Ayuso wrote:
> I'm missing then why redirect is not then just enough for Linus usecase.

For my usecase, the MAC address is configured by the user from a
Web-UI. It may or may not be the one from the bridge device.

Besides, found it counter intuitive that DNAT did not work here
and took me some time to find out why. At least I didn't read about
any such known limitations of the dnat target in the ebtables
manpage.


[PATCH net] bridge: ebtables: fix reception of frames DNAT-ed to bridge device

2017-03-14 Thread Linus Lüssing
When trying to redirect bridged frames to the bridge device itself
via the ebtables nat-prerouting chain and the dnat target then this
currently fails:

The ethernet destination of the frame is dnat'ed to the MAC address of
the bridge itself just fine and the correctly altered frame can even
be captured via a tcpdump on br0 (with or without promisc mode).

However, the IP code drops it in the beginning of ip_input.c/ip_rcv()
as the dnat target did not update the skb->pkt_type. If after
dnat'ing the packet is now destined to us then the skb->pkt_type
needs to be updated from PACKET_OTHERHOST to PACKET_HOST, too.

Signed-off-by: Linus Lüssing <linus.luess...@c0d3.blue>
---
 net/bridge/br_input.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 013f2290b..ec83175 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -198,8 +198,12 @@ int br_handle_frame_finish(struct net *net, struct sock 
*sk, struct sk_buff *skb
if (dst) {
unsigned long now = jiffies;
 
-   if (dst->is_local)
+   if (dst->is_local) {
+   /* fix up potential DNAT mess */
+   skb->pkt_type = PACKET_HOST;
+
return br_pass_frame_up(skb);
+   }
 
if (now != dst->used)
dst->used = now;
-- 
2.1.4



[PATCH net] bridge: ebtables: fix reception of frames DNAT-ed to bridge device

2017-03-14 Thread Linus Lüssing
When trying to redirect bridged frames to the bridge device itself
via the ebtables nat-prerouting chain and the dnat target then this
currently fails:

The ethernet destination of the frame is dnat'ed to the MAC address of
the bridge itself just fine and the correctly altered frame can even
be captured via a tcpdump on br0 (with or without promisc mode).

However, the IP code drops it in the beginning of ip_input.c/ip_rcv()
as the dnat target did not update the skb->pkt_type. If after
dnat'ing the packet is now destined to us then the skb->pkt_type
needs to be updated from PACKET_OTHERHOST to PACKET_HOST, too.

Signed-off-by: Linus Lüssing 
---
 net/bridge/br_input.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 013f2290b..ec83175 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -198,8 +198,12 @@ int br_handle_frame_finish(struct net *net, struct sock 
*sk, struct sk_buff *skb
if (dst) {
unsigned long now = jiffies;
 
-   if (dst->is_local)
+   if (dst->is_local) {
+   /* fix up potential DNAT mess */
+   skb->pkt_type = PACKET_HOST;
+
return br_pass_frame_up(skb);
+   }
 
if (now != dst->used)
dst->used = now;
-- 
2.1.4



Re: [ipv6] a088d1d73a: WARNING:at_net/mac80211/agg-tx.c:#ieee80211_start_tx_ba_cb[mac80211]

2017-03-13 Thread Linus Lüssing
On Wed, Feb 22, 2017 at 10:03:02AM +0800, kernel test robot wrote:
> 
> FYI, we noticed the following commit:
> 
> commit: a088d1d73a4bcfd7bc482f8d08375b9b665dc3e5 ("ipv6: Fix IPv6 packet loss 
> in scenarios involving roaming + snooping switches")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 
> in testcase: hwsim
> with following parameters:
> 
>   group: hwsim-01
[...]
> kern  :warn  : [  146.568070] WARNING: CPU: 12 PID: 669 at 
> net/mac80211/agg-tx.c:770 ieee80211_start_tx_ba_cb+0x139/0x150 [mac80211]
[...]
> kern  :warn  : [  146.568114] CPU: 12 PID: 669 Comm: kworker/u256:1 Not 
> tainted 4.10.0-rc6-00104-ga088d1d #1
> kern  :warn  : [  146.568115] Hardware name: Intel Corporation LH 
> Pass/S4600LH, BIOS SE5C600.86B.99.02.1047.032320122259 03/23/2012
> kern  :warn  : [  146.568127] Workqueue: phy0 ieee80211_iface_work [mac80211]
> kern  :warn  : [  146.568128] Call Trace:
> kern  :warn  : [  146.568141]  dump_stack+0x63/0x8a
> kern  :warn  : [  146.568146]  __warn+0xcb/0xf0
> kern  :warn  : [  146.568147]  warn_slowpath_null+0x1d/0x20
> kern  :warn  : [  146.568156]  ieee80211_start_tx_ba_cb+0x139/0x150 [mac80211]
> kern  :warn  : [  146.568166]  ieee80211_iface_work+0x266/0x470 [mac80211]
> kern  :warn  : [  146.568180]  process_one_work+0x1a3/0x480
> kern  :warn  : [  146.568182]  worker_thread+0x4e/0x4d0
> kern  :warn  : [  146.568185]  kthread+0x10c/0x140
> kern  :warn  : [  146.568187]  ? process_one_work+0x480/0x480
> kern  :warn  : [  146.568187]  ? kthread_create_on_node+0x40/0x40
> kern  :warn  : [  146.568195]  ret_from_fork+0x2c/0x40
> kern  :warn  : [  146.568197] ---[ end trace 0e4afc24db5dbafa ]---

Unfortunately, I wasn't able to reproduce this issue with the
plain mac80211-hwsim test suite yet. The according test runs successfully
in my VMs here...

To me, it looks like the WARN_ON() could be triggered by a race condition
in mac80211 or mac80211-hwsim, too. I'm going to forward this to and
discuss this further on the wireless mailing list.

Thanks for the notification, Xiaolong!

Regards, Linus


Re: [ipv6] a088d1d73a: WARNING:at_net/mac80211/agg-tx.c:#ieee80211_start_tx_ba_cb[mac80211]

2017-03-13 Thread Linus Lüssing
On Wed, Feb 22, 2017 at 10:03:02AM +0800, kernel test robot wrote:
> 
> FYI, we noticed the following commit:
> 
> commit: a088d1d73a4bcfd7bc482f8d08375b9b665dc3e5 ("ipv6: Fix IPv6 packet loss 
> in scenarios involving roaming + snooping switches")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 
> in testcase: hwsim
> with following parameters:
> 
>   group: hwsim-01
[...]
> kern  :warn  : [  146.568070] WARNING: CPU: 12 PID: 669 at 
> net/mac80211/agg-tx.c:770 ieee80211_start_tx_ba_cb+0x139/0x150 [mac80211]
[...]
> kern  :warn  : [  146.568114] CPU: 12 PID: 669 Comm: kworker/u256:1 Not 
> tainted 4.10.0-rc6-00104-ga088d1d #1
> kern  :warn  : [  146.568115] Hardware name: Intel Corporation LH 
> Pass/S4600LH, BIOS SE5C600.86B.99.02.1047.032320122259 03/23/2012
> kern  :warn  : [  146.568127] Workqueue: phy0 ieee80211_iface_work [mac80211]
> kern  :warn  : [  146.568128] Call Trace:
> kern  :warn  : [  146.568141]  dump_stack+0x63/0x8a
> kern  :warn  : [  146.568146]  __warn+0xcb/0xf0
> kern  :warn  : [  146.568147]  warn_slowpath_null+0x1d/0x20
> kern  :warn  : [  146.568156]  ieee80211_start_tx_ba_cb+0x139/0x150 [mac80211]
> kern  :warn  : [  146.568166]  ieee80211_iface_work+0x266/0x470 [mac80211]
> kern  :warn  : [  146.568180]  process_one_work+0x1a3/0x480
> kern  :warn  : [  146.568182]  worker_thread+0x4e/0x4d0
> kern  :warn  : [  146.568185]  kthread+0x10c/0x140
> kern  :warn  : [  146.568187]  ? process_one_work+0x480/0x480
> kern  :warn  : [  146.568187]  ? kthread_create_on_node+0x40/0x40
> kern  :warn  : [  146.568195]  ret_from_fork+0x2c/0x40
> kern  :warn  : [  146.568197] ---[ end trace 0e4afc24db5dbafa ]---

Unfortunately, I wasn't able to reproduce this issue with the
plain mac80211-hwsim test suite yet. The according test runs successfully
in my VMs here...

To me, it looks like the WARN_ON() could be triggered by a race condition
in mac80211 or mac80211-hwsim, too. I'm going to forward this to and
discuss this further on the wireless mailing list.

Thanks for the notification, Xiaolong!

Regards, Linus


Re: [PATCH 3.2 049/126] batman-adv: fix splat on disabling an interface

2017-02-15 Thread Linus Lüssing
On Wed, Feb 15, 2017 at 10:41:34PM +, Ben Hutchings wrote:
> 3.2.85-rc1 review patch.  If anyone has any objections, please let me know.
> 
> --
> 
> From: Linus Lüssing <linus.luess...@c0d3.blue>
> 
> commit 9799c50372b23ed774791bdb87d700f1286ee8a9 upstream.

Hi Ben,

This commit was reverted in
27915aa61060fd8954a68a86657784705955088a
('batman-adv: Revert "fix splat on disabling an interface"').

Greg dropped this patch from his stable queue back then, too [0].

Regards, Linus

[0]: https://marc.info/?l=linux-kernel=147938417410032


Re: [PATCH 3.2 049/126] batman-adv: fix splat on disabling an interface

2017-02-15 Thread Linus Lüssing
On Wed, Feb 15, 2017 at 10:41:34PM +, Ben Hutchings wrote:
> 3.2.85-rc1 review patch.  If anyone has any objections, please let me know.
> 
> --
> 
> From: Linus Lüssing 
> 
> commit 9799c50372b23ed774791bdb87d700f1286ee8a9 upstream.

Hi Ben,

This commit was reverted in
27915aa61060fd8954a68a86657784705955088a
('batman-adv: Revert "fix splat on disabling an interface"').

Greg dropped this patch from his stable queue back then, too [0].

Regards, Linus

[0]: https://marc.info/?l=linux-kernel=147938417410032


[PATCH net] ipv6: Fix IPv6 packet loss in scenarios involving roaming + snooping switches

2017-02-02 Thread Linus Lüssing
When for instance a mobile Linux device roams from one access point to
another with both APs sharing the same broadcast domain and a
multicast snooping switch in between:

1)(c) <~~~> (AP1) <--[SSW]--> (AP2)

2)  (AP1) <--[SSW]--> (AP2) <~~~> (c)

Then currently IPv6 multicast packets will get lost for (c) until an
MLD Querier sends its next query message. The packet loss occurs
because upon roaming the Linux host so far stayed silent regarding
MLD and the snooping switch will therefore be unaware of the
multicast topology change for a while.

This patch fixes this by always resending MLD reports when an interface
change happens, for instance from NO-CARRIER to CARRIER state.

Signed-off-by: Linus Lüssing <linus.luess...@c0d3.blue>

---

Initial problem report was sent to the bridge mailing list a while ago:
- https://lists.linuxfoundation.org/pipermail/bridge/2015-September/009754.html

The RFCs concerning IGMP, MLD and snooping switches seem a have a hole
concerning roaming. A request for clarification to mcast-w...@ietf.org
was left unanswered, unfortunately:
- https://mailarchive.ietf.org/arch/msg/mcast-wifi/Ghn2cGy1oN2ZwG1qaQO9SE13g6g

However, simply resending reports seems to be the straight forward way
to me to fix the issue mentioned above.
---
 net/ipv6/addrconf.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index f60e88e..81f7b4e 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -3386,9 +3386,15 @@ static int addrconf_notify(struct notifier_block *this, 
unsigned long event,
}
 
if (idev) {
-   if (idev->if_flags & IF_READY)
-   /* device is already configured. */
+   if (idev->if_flags & IF_READY) {
+   /* device is already configured -
+* but resend MLD reports, we might
+* have roamed and need to update
+* multicast snooping switches
+*/
+   ipv6_mc_up(idev);
break;
+   }
idev->if_flags |= IF_READY;
}
 
-- 
2.1.4



[PATCH net] ipv6: Fix IPv6 packet loss in scenarios involving roaming + snooping switches

2017-02-02 Thread Linus Lüssing
When for instance a mobile Linux device roams from one access point to
another with both APs sharing the same broadcast domain and a
multicast snooping switch in between:

1)(c) <~~~> (AP1) <--[SSW]--> (AP2)

2)  (AP1) <--[SSW]--> (AP2) <~~~> (c)

Then currently IPv6 multicast packets will get lost for (c) until an
MLD Querier sends its next query message. The packet loss occurs
because upon roaming the Linux host so far stayed silent regarding
MLD and the snooping switch will therefore be unaware of the
multicast topology change for a while.

This patch fixes this by always resending MLD reports when an interface
change happens, for instance from NO-CARRIER to CARRIER state.

Signed-off-by: Linus Lüssing 

---

Initial problem report was sent to the bridge mailing list a while ago:
- https://lists.linuxfoundation.org/pipermail/bridge/2015-September/009754.html

The RFCs concerning IGMP, MLD and snooping switches seem a have a hole
concerning roaming. A request for clarification to mcast-w...@ietf.org
was left unanswered, unfortunately:
- https://mailarchive.ietf.org/arch/msg/mcast-wifi/Ghn2cGy1oN2ZwG1qaQO9SE13g6g

However, simply resending reports seems to be the straight forward way
to me to fix the issue mentioned above.
---
 net/ipv6/addrconf.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index f60e88e..81f7b4e 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -3386,9 +3386,15 @@ static int addrconf_notify(struct notifier_block *this, 
unsigned long event,
}
 
if (idev) {
-   if (idev->if_flags & IF_READY)
-   /* device is already configured. */
+   if (idev->if_flags & IF_READY) {
+   /* device is already configured -
+* but resend MLD reports, we might
+* have roamed and need to update
+* multicast snooping switches
+*/
+   ipv6_mc_up(idev);
break;
+   }
idev->if_flags |= IF_READY;
}
 
-- 
2.1.4



[PATCH net-next v5] bridge: multicast to unicast

2017-01-21 Thread Linus Lüssing
From: Felix Fietkau <n...@nbd.name>

Implements an optional, per bridge port flag and feature to deliver
multicast packets to any host on the according port via unicast
individually. This is done by copying the packet per host and
changing the multicast destination MAC to a unicast one accordingly.

multicast-to-unicast works on top of the multicast snooping feature of
the bridge. Which means unicast copies are only delivered to hosts which
are interested in it and signalized this via IGMP/MLD reports
previously.

This feature is intended for interface types which have a more reliable
and/or efficient way to deliver unicast packets than broadcast ones
(e.g. wifi).

However, it should only be enabled on interfaces where no IGMPv2/MLDv1
report suppression takes place. This feature is disabled by default.

The initial patch and idea is from Felix Fietkau.

Signed-off-by: Felix Fietkau <n...@nbd.name>
[linus.luess...@c0d3.blue: various bug + style fixes, commit message]
Signed-off-by: Linus Lüssing <linus.luess...@c0d3.blue>

---

This feature is used and enabled by default in OpenWRT and LEDE for AP
interfaces for more than a year now to allow both a more robust multicast
delivery and multicast at higher rates (e.g. multicast streaming).

In OpenWRT/LEDE the IGMP/MLD report suppression issue is overcome by
the network daemon enabling AP isolation and by that separating all STAs.
Delivery of STA-to-STA IP mulitcast is made possible again by
enabling and utilizing the bridge hairpin mode, which considers the
incoming port as a potential outgoing port, too.

Hairpin-mode is performed after multicast snooping, therefore leading to
only deliver reports to STAs running a multicast router.

Changes in v5:
* fix a potential pagefault in br_ip6_multicast_mld2_report():
  -> a pskb_may_pull() might reallocate skb->data, therefore perform
 the "src = eth_hdr(skb)->h_source" only afterwards
* simplify code by always adding ether source address to a port group
  and checking the per-port multicast-to-unicast flag instead of a
  per-port-group one (thanks Stephen!)

Changes in v4:
* readd "From: Felix Fietkau [...]" (missed it again in v3)

Changes in v3:
* fix an uninitialized variable bug introduced in br_multicast_flood()
  in v2, found by kbuild test bot

Changes in v2:
* netlink support (thanks Nik!)
* fixed switching between multicast_to_unicast on/off
  -> even after toggling an already existing entry would
 stale in its mode and would never be replaced
  -> new extra check in br_port_group_equal)
* reduced checks in br_multicast_flood() from two to one
  to address fast-path concerns (thanks Nik!)
* rev-christmas tree ordering (thanks Nik!)
* removed "net_bridge_port_group::unicast", using
  ::flags instead (thanks Nik!)
* BR_MULTICAST_TO_UCAST -> BR_MULTICAST_TO_UNICAST
  (BR_MULTICAST_FLAST_LEAVE has the same length anyway)
* simplified maybe_deliver_addr()
  (no return, no "prev" paramater -> was a NOP anyway)
* added "From: Felix Fietkau [...]"
* added "Signed-off-by: Felix Fietkau [...]"
---
 include/linux/if_bridge.h|  1 +
 include/uapi/linux/if_link.h |  1 +
 net/bridge/br_forward.c  | 39 ++-
 net/bridge/br_mdb.c  |  2 +-
 net/bridge/br_multicast.c| 90 
 net/bridge/br_netlink.c  |  5 +++
 net/bridge/br_private.h  |  3 +-
 net/bridge/br_sysfs_if.c |  2 +
 8 files changed, 114 insertions(+), 29 deletions(-)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index c6587c0..debc9d5 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -46,6 +46,7 @@ struct br_ip_list {
 #define BR_LEARNING_SYNC   BIT(9)
 #define BR_PROXYARP_WIFI   BIT(10)
 #define BR_MCAST_FLOOD BIT(11)
+#define BR_MULTICAST_TO_UNICASTBIT(12)
 
 #define BR_DEFAULT_AGEING_TIME (300 * HZ)
 
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 6b13e59..4e59565 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -321,6 +321,7 @@ enum {
IFLA_BRPORT_MULTICAST_ROUTER,
IFLA_BRPORT_PAD,
IFLA_BRPORT_MCAST_FLOOD,
+   IFLA_BRPORT_MCAST_TO_UCAST,
__IFLA_BRPORT_MAX
 };
 #define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1)
diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
index 7cb41ae..a0f9d00 100644
--- a/net/bridge/br_forward.c
+++ b/net/bridge/br_forward.c
@@ -174,6 +174,31 @@ static struct net_bridge_port *maybe_deliver(
return p;
 }
 
+static void maybe_deliver_addr(struct net_bridge_port *p, struct sk_buff *skb,
+  const unsigned char *addr, bool local_orig)
+{
+   struct net_device *dev = BR_INPUT_SKB_CB(skb)->brdev;
+   const unsigned char *src = eth_hdr(skb)->h_source;
+
+   if (!should_deliver(p, skb))
+   return;
+
+

[PATCH net-next v5] bridge: multicast to unicast

2017-01-21 Thread Linus Lüssing
From: Felix Fietkau 

Implements an optional, per bridge port flag and feature to deliver
multicast packets to any host on the according port via unicast
individually. This is done by copying the packet per host and
changing the multicast destination MAC to a unicast one accordingly.

multicast-to-unicast works on top of the multicast snooping feature of
the bridge. Which means unicast copies are only delivered to hosts which
are interested in it and signalized this via IGMP/MLD reports
previously.

This feature is intended for interface types which have a more reliable
and/or efficient way to deliver unicast packets than broadcast ones
(e.g. wifi).

However, it should only be enabled on interfaces where no IGMPv2/MLDv1
report suppression takes place. This feature is disabled by default.

The initial patch and idea is from Felix Fietkau.

Signed-off-by: Felix Fietkau 
[linus.luess...@c0d3.blue: various bug + style fixes, commit message]
Signed-off-by: Linus Lüssing 

---

This feature is used and enabled by default in OpenWRT and LEDE for AP
interfaces for more than a year now to allow both a more robust multicast
delivery and multicast at higher rates (e.g. multicast streaming).

In OpenWRT/LEDE the IGMP/MLD report suppression issue is overcome by
the network daemon enabling AP isolation and by that separating all STAs.
Delivery of STA-to-STA IP mulitcast is made possible again by
enabling and utilizing the bridge hairpin mode, which considers the
incoming port as a potential outgoing port, too.

Hairpin-mode is performed after multicast snooping, therefore leading to
only deliver reports to STAs running a multicast router.

Changes in v5:
* fix a potential pagefault in br_ip6_multicast_mld2_report():
  -> a pskb_may_pull() might reallocate skb->data, therefore perform
 the "src = eth_hdr(skb)->h_source" only afterwards
* simplify code by always adding ether source address to a port group
  and checking the per-port multicast-to-unicast flag instead of a
  per-port-group one (thanks Stephen!)

Changes in v4:
* readd "From: Felix Fietkau [...]" (missed it again in v3)

Changes in v3:
* fix an uninitialized variable bug introduced in br_multicast_flood()
  in v2, found by kbuild test bot

Changes in v2:
* netlink support (thanks Nik!)
* fixed switching between multicast_to_unicast on/off
  -> even after toggling an already existing entry would
 stale in its mode and would never be replaced
  -> new extra check in br_port_group_equal)
* reduced checks in br_multicast_flood() from two to one
  to address fast-path concerns (thanks Nik!)
* rev-christmas tree ordering (thanks Nik!)
* removed "net_bridge_port_group::unicast", using
  ::flags instead (thanks Nik!)
* BR_MULTICAST_TO_UCAST -> BR_MULTICAST_TO_UNICAST
  (BR_MULTICAST_FLAST_LEAVE has the same length anyway)
* simplified maybe_deliver_addr()
  (no return, no "prev" paramater -> was a NOP anyway)
* added "From: Felix Fietkau [...]"
* added "Signed-off-by: Felix Fietkau [...]"
---
 include/linux/if_bridge.h|  1 +
 include/uapi/linux/if_link.h |  1 +
 net/bridge/br_forward.c  | 39 ++-
 net/bridge/br_mdb.c  |  2 +-
 net/bridge/br_multicast.c| 90 
 net/bridge/br_netlink.c  |  5 +++
 net/bridge/br_private.h  |  3 +-
 net/bridge/br_sysfs_if.c |  2 +
 8 files changed, 114 insertions(+), 29 deletions(-)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index c6587c0..debc9d5 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -46,6 +46,7 @@ struct br_ip_list {
 #define BR_LEARNING_SYNC   BIT(9)
 #define BR_PROXYARP_WIFI   BIT(10)
 #define BR_MCAST_FLOOD BIT(11)
+#define BR_MULTICAST_TO_UNICASTBIT(12)
 
 #define BR_DEFAULT_AGEING_TIME (300 * HZ)
 
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 6b13e59..4e59565 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -321,6 +321,7 @@ enum {
IFLA_BRPORT_MULTICAST_ROUTER,
IFLA_BRPORT_PAD,
IFLA_BRPORT_MCAST_FLOOD,
+   IFLA_BRPORT_MCAST_TO_UCAST,
__IFLA_BRPORT_MAX
 };
 #define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1)
diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
index 7cb41ae..a0f9d00 100644
--- a/net/bridge/br_forward.c
+++ b/net/bridge/br_forward.c
@@ -174,6 +174,31 @@ static struct net_bridge_port *maybe_deliver(
return p;
 }
 
+static void maybe_deliver_addr(struct net_bridge_port *p, struct sk_buff *skb,
+  const unsigned char *addr, bool local_orig)
+{
+   struct net_device *dev = BR_INPUT_SKB_CB(skb)->brdev;
+   const unsigned char *src = eth_hdr(skb)->h_source;
+
+   if (!should_deliver(p, skb))
+   return;
+
+   /* Even with hairpin, no soliloquies - prevent breaking IPv6 DAD */
+   if (s

[PATCH net-next v4] bridge: multicast to unicast

2017-01-18 Thread Linus Lüssing
From: Felix Fietkau <n...@nbd.name>

Implements an optional, per bridge port flag and feature to deliver
multicast packets to any host on the according port via unicast
individually. This is done by copying the packet per host and
changing the multicast destination MAC to a unicast one accordingly.

multicast-to-unicast works on top of the multicast snooping feature of
the bridge. Which means unicast copies are only delivered to hosts which
are interested in it and signalized this via IGMP/MLD reports
previously.

This feature is intended for interface types which have a more reliable
and/or efficient way to deliver unicast packets than broadcast ones
(e.g. wifi).

However, it should only be enabled on interfaces where no IGMPv2/MLDv1
report suppression takes place. This feature is disabled by default.

The initial patch and idea is from Felix Fietkau.

Signed-off-by: Felix Fietkau <n...@nbd.name>
[linus.luess...@c0d3.blue: various bug + style fixes, commit message]
Signed-off-by: Linus Lüssing <linus.luess...@c0d3.blue>

---

This feature is used and enabled by default in OpenWRT and LEDE for AP
interfaces for more than a year now to allow both a more robust multicast
delivery and multicast at higher rates (e.g. multicast streaming).

In OpenWRT/LEDE the IGMP/MLD report suppression issue is overcome by
the network daemon enabling AP isolation and by that separating all STAs.
Delivery of STA-to-STA IP mulitcast is made possible again by
enabling and utilizing the bridge hairpin mode, which considers the
incoming port as a potential outgoing port, too.

Hairpin-mode is performed after multicast snooping, therefore leading to
only deliver reports to STAs running a multicast router.

Changes in v4:
* readd "From: Felix Fietkau [...]" (missed it again in v3)

Changes in v3:
* fix an uninitialized variable bug introduced in br_multicast_flood()
  in v2, found by kbuild test bot

Changes in v2:
* netlink support (thanks Nik!)
* fixed switching between multicast_to_unicast on/off
  -> even after toggling an already existing entry would
 stale in its mode and would never be replaced
  -> new extra check in br_port_group_equal)
* reduced checks in br_multicast_flood() from two to one
  to address fast-path concerns (thanks Nik!)
* rev-christmas tree ordering (thanks Nik!)
* removed "net_bridge_port_group::unicast", using
  ::flags instead (thanks Nik!)
* BR_MULTICAST_TO_UCAST -> BR_MULTICAST_TO_UNICAST
  (BR_MULTICAST_FLAST_LEAVE has the same length anyway)
* simplified maybe_deliver_addr()
  (no return, no "prev" paramater -> was a NOP anyway)
* added "From: Felix Fietkau [...]"
* added "Signed-off-by: Felix Fietkau [...]"
---
 include/linux/if_bridge.h|  1 +
 include/uapi/linux/if_link.h |  1 +
 net/bridge/br_forward.c  | 37 -
 net/bridge/br_mdb.c  |  2 +-
 net/bridge/br_multicast.c| 96 
 net/bridge/br_netlink.c  |  5 +++
 net/bridge/br_private.h  |  8 ++--
 net/bridge/br_sysfs_if.c |  2 +
 8 files changed, 121 insertions(+), 31 deletions(-)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index c6587c0..debc9d5 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -46,6 +46,7 @@ struct br_ip_list {
 #define BR_LEARNING_SYNC   BIT(9)
 #define BR_PROXYARP_WIFI   BIT(10)
 #define BR_MCAST_FLOOD BIT(11)
+#define BR_MULTICAST_TO_UNICASTBIT(12)
 
 #define BR_DEFAULT_AGEING_TIME (300 * HZ)
 
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 6b13e59..4e59565 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -321,6 +321,7 @@ enum {
IFLA_BRPORT_MULTICAST_ROUTER,
IFLA_BRPORT_PAD,
IFLA_BRPORT_MCAST_FLOOD,
+   IFLA_BRPORT_MCAST_TO_UCAST,
__IFLA_BRPORT_MAX
 };
 #define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1)
diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
index 7cb41ae..a6c8a27 100644
--- a/net/bridge/br_forward.c
+++ b/net/bridge/br_forward.c
@@ -174,6 +174,29 @@ static struct net_bridge_port *maybe_deliver(
return p;
 }
 
+static void maybe_deliver_addr(struct net_bridge_port *p, struct sk_buff *skb,
+  const unsigned char *addr, bool local_orig)
+{
+   struct net_device *dev = BR_INPUT_SKB_CB(skb)->brdev;
+   const unsigned char *src = eth_hdr(skb)->h_source;
+
+   if (!should_deliver(p, skb))
+   return;
+
+   /* Even with hairpin, no soliloquies - prevent breaking IPv6 DAD */
+   if (skb->dev == p->dev && ether_addr_equal(src, addr))
+   return;
+
+   skb = skb_copy(skb, GFP_ATOMIC);
+   if (!skb) {
+   dev->stats.tx_dropped++;
+   return;
+   }
+
+   memcpy(eth_hdr(skb)->h_dest, addr, ETH_ALEN);
+   __br_forward(p, skb, loc

[PATCH net-next v4] bridge: multicast to unicast

2017-01-18 Thread Linus Lüssing
From: Felix Fietkau 

Implements an optional, per bridge port flag and feature to deliver
multicast packets to any host on the according port via unicast
individually. This is done by copying the packet per host and
changing the multicast destination MAC to a unicast one accordingly.

multicast-to-unicast works on top of the multicast snooping feature of
the bridge. Which means unicast copies are only delivered to hosts which
are interested in it and signalized this via IGMP/MLD reports
previously.

This feature is intended for interface types which have a more reliable
and/or efficient way to deliver unicast packets than broadcast ones
(e.g. wifi).

However, it should only be enabled on interfaces where no IGMPv2/MLDv1
report suppression takes place. This feature is disabled by default.

The initial patch and idea is from Felix Fietkau.

Signed-off-by: Felix Fietkau 
[linus.luess...@c0d3.blue: various bug + style fixes, commit message]
Signed-off-by: Linus Lüssing 

---

This feature is used and enabled by default in OpenWRT and LEDE for AP
interfaces for more than a year now to allow both a more robust multicast
delivery and multicast at higher rates (e.g. multicast streaming).

In OpenWRT/LEDE the IGMP/MLD report suppression issue is overcome by
the network daemon enabling AP isolation and by that separating all STAs.
Delivery of STA-to-STA IP mulitcast is made possible again by
enabling and utilizing the bridge hairpin mode, which considers the
incoming port as a potential outgoing port, too.

Hairpin-mode is performed after multicast snooping, therefore leading to
only deliver reports to STAs running a multicast router.

Changes in v4:
* readd "From: Felix Fietkau [...]" (missed it again in v3)

Changes in v3:
* fix an uninitialized variable bug introduced in br_multicast_flood()
  in v2, found by kbuild test bot

Changes in v2:
* netlink support (thanks Nik!)
* fixed switching between multicast_to_unicast on/off
  -> even after toggling an already existing entry would
 stale in its mode and would never be replaced
  -> new extra check in br_port_group_equal)
* reduced checks in br_multicast_flood() from two to one
  to address fast-path concerns (thanks Nik!)
* rev-christmas tree ordering (thanks Nik!)
* removed "net_bridge_port_group::unicast", using
  ::flags instead (thanks Nik!)
* BR_MULTICAST_TO_UCAST -> BR_MULTICAST_TO_UNICAST
  (BR_MULTICAST_FLAST_LEAVE has the same length anyway)
* simplified maybe_deliver_addr()
  (no return, no "prev" paramater -> was a NOP anyway)
* added "From: Felix Fietkau [...]"
* added "Signed-off-by: Felix Fietkau [...]"
---
 include/linux/if_bridge.h|  1 +
 include/uapi/linux/if_link.h |  1 +
 net/bridge/br_forward.c  | 37 -
 net/bridge/br_mdb.c  |  2 +-
 net/bridge/br_multicast.c| 96 
 net/bridge/br_netlink.c  |  5 +++
 net/bridge/br_private.h  |  8 ++--
 net/bridge/br_sysfs_if.c |  2 +
 8 files changed, 121 insertions(+), 31 deletions(-)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index c6587c0..debc9d5 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -46,6 +46,7 @@ struct br_ip_list {
 #define BR_LEARNING_SYNC   BIT(9)
 #define BR_PROXYARP_WIFI   BIT(10)
 #define BR_MCAST_FLOOD BIT(11)
+#define BR_MULTICAST_TO_UNICASTBIT(12)
 
 #define BR_DEFAULT_AGEING_TIME (300 * HZ)
 
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 6b13e59..4e59565 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -321,6 +321,7 @@ enum {
IFLA_BRPORT_MULTICAST_ROUTER,
IFLA_BRPORT_PAD,
IFLA_BRPORT_MCAST_FLOOD,
+   IFLA_BRPORT_MCAST_TO_UCAST,
__IFLA_BRPORT_MAX
 };
 #define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1)
diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
index 7cb41ae..a6c8a27 100644
--- a/net/bridge/br_forward.c
+++ b/net/bridge/br_forward.c
@@ -174,6 +174,29 @@ static struct net_bridge_port *maybe_deliver(
return p;
 }
 
+static void maybe_deliver_addr(struct net_bridge_port *p, struct sk_buff *skb,
+  const unsigned char *addr, bool local_orig)
+{
+   struct net_device *dev = BR_INPUT_SKB_CB(skb)->brdev;
+   const unsigned char *src = eth_hdr(skb)->h_source;
+
+   if (!should_deliver(p, skb))
+   return;
+
+   /* Even with hairpin, no soliloquies - prevent breaking IPv6 DAD */
+   if (skb->dev == p->dev && ether_addr_equal(src, addr))
+   return;
+
+   skb = skb_copy(skb, GFP_ATOMIC);
+   if (!skb) {
+   dev->stats.tx_dropped++;
+   return;
+   }
+
+   memcpy(eth_hdr(skb)->h_dest, addr, ETH_ALEN);
+   __br_forward(p, skb, local_orig);
+}
+
 /* called under rcu_read_lock */
 void br_flood(struct net_

[PATCH net-next v3] bridge: multicast to unicast

2017-01-18 Thread Linus Lüssing
Implements an optional, per bridge port flag and feature to deliver
multicast packets to any host on the according port via unicast
individually. This is done by copying the packet per host and
changing the multicast destination MAC to a unicast one accordingly.

multicast-to-unicast works on top of the multicast snooping feature of
the bridge. Which means unicast copies are only delivered to hosts which
are interested in it and signalized this via IGMP/MLD reports
previously.

This feature is intended for interface types which have a more reliable
and/or efficient way to deliver unicast packets than broadcast ones
(e.g. wifi).

However, it should only be enabled on interfaces where no IGMPv2/MLDv1
report suppression takes place. This feature is disabled by default.

The initial patch and idea is from Felix Fietkau.

Signed-off-by: Felix Fietkau <n...@nbd.name>
[linus.luess...@c0d3.blue: various bug + style fixes, commit message]
Signed-off-by: Linus Lüssing <linus.luess...@c0d3.blue>

---

This feature is used and enabled by default in OpenWRT and LEDE for AP
interfaces for more than a year now to allow both a more robust multicast
delivery and multicast at higher rates (e.g. multicast streaming).

In OpenWRT/LEDE the IGMP/MLD report suppression issue is overcome by
the network daemon enabling AP isolation and by that separating all STAs.
Delivery of STA-to-STA IP mulitcast is made possible again by
enabling and utilizing the bridge hairpin mode, which considers the
incoming port as a potential outgoing port, too.

Hairpin-mode is performed after multicast snooping, therefore leading to
only deliver reports to STAs running a multicast router.

Changes in v3:
* fix an uninitialized variable bug introduced in br_multicast_flood()
  in v2, found by kbuild test bot

Changes in v2:
* netlink support (thanks Nik!)
* fixed switching between multicast_to_unicast on/off
  -> even after toggling an already existing entry would
 stale in its mode and would never be replaced
  -> new extra check in br_port_group_equal)
* reduced checks in br_multicast_flood() from two to one
  to address fast-path concerns (thanks Nik!)
* rev-christmas tree ordering (thanks Nik!)
* removed "net_bridge_port_group::unicast", using
  ::flags instead (thanks Nik!)
* BR_MULTICAST_TO_UCAST -> BR_MULTICAST_TO_UNICAST
  (BR_MULTICAST_FLAST_LEAVE has the same length anyway)
* simplified maybe_deliver_addr()
  (no return, no "prev" paramater -> was a NOP anyway)
* added "From: Felix Fietkau [...]"
* added "Signed-off-by: Felix Fietkau [...]"
---
 include/linux/if_bridge.h|  1 +
 include/uapi/linux/if_link.h |  1 +
 net/bridge/br_forward.c  | 37 -
 net/bridge/br_mdb.c  |  2 +-
 net/bridge/br_multicast.c| 96 
 net/bridge/br_netlink.c  |  5 +++
 net/bridge/br_private.h  |  8 ++--
 net/bridge/br_sysfs_if.c |  2 +
 8 files changed, 121 insertions(+), 31 deletions(-)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index c6587c0..debc9d5 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -46,6 +46,7 @@ struct br_ip_list {
 #define BR_LEARNING_SYNC   BIT(9)
 #define BR_PROXYARP_WIFI   BIT(10)
 #define BR_MCAST_FLOOD BIT(11)
+#define BR_MULTICAST_TO_UNICASTBIT(12)
 
 #define BR_DEFAULT_AGEING_TIME (300 * HZ)
 
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 6b13e59..4e59565 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -321,6 +321,7 @@ enum {
IFLA_BRPORT_MULTICAST_ROUTER,
IFLA_BRPORT_PAD,
IFLA_BRPORT_MCAST_FLOOD,
+   IFLA_BRPORT_MCAST_TO_UCAST,
__IFLA_BRPORT_MAX
 };
 #define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1)
diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
index 7cb41ae..a6c8a27 100644
--- a/net/bridge/br_forward.c
+++ b/net/bridge/br_forward.c
@@ -174,6 +174,29 @@ static struct net_bridge_port *maybe_deliver(
return p;
 }
 
+static void maybe_deliver_addr(struct net_bridge_port *p, struct sk_buff *skb,
+  const unsigned char *addr, bool local_orig)
+{
+   struct net_device *dev = BR_INPUT_SKB_CB(skb)->brdev;
+   const unsigned char *src = eth_hdr(skb)->h_source;
+
+   if (!should_deliver(p, skb))
+   return;
+
+   /* Even with hairpin, no soliloquies - prevent breaking IPv6 DAD */
+   if (skb->dev == p->dev && ether_addr_equal(src, addr))
+   return;
+
+   skb = skb_copy(skb, GFP_ATOMIC);
+   if (!skb) {
+   dev->stats.tx_dropped++;
+   return;
+   }
+
+   memcpy(eth_hdr(skb)->h_dest, addr, ETH_ALEN);
+   __br_forward(p, skb, local_orig);
+}
+
 /* called under rcu_read_lock */
 void br_flood(struct net_bridge *br, struct sk_buff *skb,
  

[PATCH net-next v3] bridge: multicast to unicast

2017-01-18 Thread Linus Lüssing
Implements an optional, per bridge port flag and feature to deliver
multicast packets to any host on the according port via unicast
individually. This is done by copying the packet per host and
changing the multicast destination MAC to a unicast one accordingly.

multicast-to-unicast works on top of the multicast snooping feature of
the bridge. Which means unicast copies are only delivered to hosts which
are interested in it and signalized this via IGMP/MLD reports
previously.

This feature is intended for interface types which have a more reliable
and/or efficient way to deliver unicast packets than broadcast ones
(e.g. wifi).

However, it should only be enabled on interfaces where no IGMPv2/MLDv1
report suppression takes place. This feature is disabled by default.

The initial patch and idea is from Felix Fietkau.

Signed-off-by: Felix Fietkau 
[linus.luess...@c0d3.blue: various bug + style fixes, commit message]
Signed-off-by: Linus Lüssing 

---

This feature is used and enabled by default in OpenWRT and LEDE for AP
interfaces for more than a year now to allow both a more robust multicast
delivery and multicast at higher rates (e.g. multicast streaming).

In OpenWRT/LEDE the IGMP/MLD report suppression issue is overcome by
the network daemon enabling AP isolation and by that separating all STAs.
Delivery of STA-to-STA IP mulitcast is made possible again by
enabling and utilizing the bridge hairpin mode, which considers the
incoming port as a potential outgoing port, too.

Hairpin-mode is performed after multicast snooping, therefore leading to
only deliver reports to STAs running a multicast router.

Changes in v3:
* fix an uninitialized variable bug introduced in br_multicast_flood()
  in v2, found by kbuild test bot

Changes in v2:
* netlink support (thanks Nik!)
* fixed switching between multicast_to_unicast on/off
  -> even after toggling an already existing entry would
 stale in its mode and would never be replaced
  -> new extra check in br_port_group_equal)
* reduced checks in br_multicast_flood() from two to one
  to address fast-path concerns (thanks Nik!)
* rev-christmas tree ordering (thanks Nik!)
* removed "net_bridge_port_group::unicast", using
  ::flags instead (thanks Nik!)
* BR_MULTICAST_TO_UCAST -> BR_MULTICAST_TO_UNICAST
  (BR_MULTICAST_FLAST_LEAVE has the same length anyway)
* simplified maybe_deliver_addr()
  (no return, no "prev" paramater -> was a NOP anyway)
* added "From: Felix Fietkau [...]"
* added "Signed-off-by: Felix Fietkau [...]"
---
 include/linux/if_bridge.h|  1 +
 include/uapi/linux/if_link.h |  1 +
 net/bridge/br_forward.c  | 37 -
 net/bridge/br_mdb.c  |  2 +-
 net/bridge/br_multicast.c| 96 
 net/bridge/br_netlink.c  |  5 +++
 net/bridge/br_private.h  |  8 ++--
 net/bridge/br_sysfs_if.c |  2 +
 8 files changed, 121 insertions(+), 31 deletions(-)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index c6587c0..debc9d5 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -46,6 +46,7 @@ struct br_ip_list {
 #define BR_LEARNING_SYNC   BIT(9)
 #define BR_PROXYARP_WIFI   BIT(10)
 #define BR_MCAST_FLOOD BIT(11)
+#define BR_MULTICAST_TO_UNICASTBIT(12)
 
 #define BR_DEFAULT_AGEING_TIME (300 * HZ)
 
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 6b13e59..4e59565 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -321,6 +321,7 @@ enum {
IFLA_BRPORT_MULTICAST_ROUTER,
IFLA_BRPORT_PAD,
IFLA_BRPORT_MCAST_FLOOD,
+   IFLA_BRPORT_MCAST_TO_UCAST,
__IFLA_BRPORT_MAX
 };
 #define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1)
diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
index 7cb41ae..a6c8a27 100644
--- a/net/bridge/br_forward.c
+++ b/net/bridge/br_forward.c
@@ -174,6 +174,29 @@ static struct net_bridge_port *maybe_deliver(
return p;
 }
 
+static void maybe_deliver_addr(struct net_bridge_port *p, struct sk_buff *skb,
+  const unsigned char *addr, bool local_orig)
+{
+   struct net_device *dev = BR_INPUT_SKB_CB(skb)->brdev;
+   const unsigned char *src = eth_hdr(skb)->h_source;
+
+   if (!should_deliver(p, skb))
+   return;
+
+   /* Even with hairpin, no soliloquies - prevent breaking IPv6 DAD */
+   if (skb->dev == p->dev && ether_addr_equal(src, addr))
+   return;
+
+   skb = skb_copy(skb, GFP_ATOMIC);
+   if (!skb) {
+   dev->stats.tx_dropped++;
+   return;
+   }
+
+   memcpy(eth_hdr(skb)->h_dest, addr, ETH_ALEN);
+   __br_forward(p, skb, local_orig);
+}
+
 /* called under rcu_read_lock */
 void br_flood(struct net_bridge *br, struct sk_buff *skb,
  enum br_pkt_type pkt_type, bool local_rcv, bool local_orig)
@

[PATCH net-next v2] bridge: multicast to unicast

2017-01-17 Thread Linus Lüssing
From: Felix Fietkau <n...@nbd.name>

Implements an optional, per bridge port flag and feature to deliver
multicast packets to any host on the according port via unicast
individually. This is done by copying the packet per host and
changing the multicast destination MAC to a unicast one accordingly.

multicast-to-unicast works on top of the multicast snooping feature of
the bridge. Which means unicast copies are only delivered to hosts which
are interested in it and signalized this via IGMP/MLD reports
previously.

This feature is intended for interface types which have a more reliable
and/or efficient way to deliver unicast packets than broadcast ones
(e.g. wifi).

However, it should only be enabled on interfaces where no IGMPv2/MLDv1
report suppression takes place. This feature is disabled by default.

The initial patch and idea is from Felix Fietkau.

Signed-off-by: Felix Fietkau <n...@nbd.name>
[linus.luess...@c0d3.blue: various bug + style fixes, commit message]
Signed-off-by: Linus Lüssing <linus.luess...@c0d3.blue>

---

This feature is used and enabled by default in OpenWRT and LEDE for AP
interfaces for more than a year now to allow both a more robust multicast
delivery and multicast at higher rates (e.g. multicast streaming).

In OpenWRT/LEDE the IGMP/MLD report suppression issue is overcome by
the network daemon enabling AP isolation and by that separating all STAs.
Delivery of STA-to-STA IP mulitcast is made possible again by
enabling and utilizing the bridge hairpin mode, which considers the
incoming port as a potential outgoing port, too.

Hairpin-mode is performed after multicast snooping, therefore leading to
only deliver reports to STAs running a multicast router.

Changes in v2:
* netlink support (thanks Nik!)
* fixed switching between multicast_to_unicast on/off
  -> even after toggling an already existing entry would
 stale in its mode and would never be replaced
  -> new extra check in br_port_group_equal)
* reduced checks in br_multicast_flood() from two to one
  to address fast-path concerns (thanks Nik!)
* rev-christmas tree ordering (thanks Nik!)
* removed "net_bridge_port_group::unicast", using
  ::flags instead (thanks Nik!)
* BR_MULTICAST_TO_UCAST -> BR_MULTICAST_TO_UNICAST
  (BR_MULTICAST_FLAST_LEAVE has the same length anyway)
* simplified maybe_deliver_addr()
  (no return, no "prev" paramater -> was a NOP anyway)
* added "From: Felix Fietkau [...]"
* added "Signed-off-by: Felix Fietkau [...]"
---
 include/linux/if_bridge.h|  1 +
 include/uapi/linux/if_link.h |  1 +
 net/bridge/br_forward.c  | 37 -
 net/bridge/br_mdb.c  |  2 +-
 net/bridge/br_multicast.c| 96 
 net/bridge/br_netlink.c  |  5 +++
 net/bridge/br_private.h  |  8 ++--
 net/bridge/br_sysfs_if.c |  2 +
 8 files changed, 121 insertions(+), 31 deletions(-)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index c6587c0..debc9d5 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -46,6 +46,7 @@ struct br_ip_list {
 #define BR_LEARNING_SYNC   BIT(9)
 #define BR_PROXYARP_WIFI   BIT(10)
 #define BR_MCAST_FLOOD BIT(11)
+#define BR_MULTICAST_TO_UNICASTBIT(12)
 
 #define BR_DEFAULT_AGEING_TIME (300 * HZ)
 
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 6b13e59..4e59565 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -321,6 +321,7 @@ enum {
IFLA_BRPORT_MULTICAST_ROUTER,
IFLA_BRPORT_PAD,
IFLA_BRPORT_MCAST_FLOOD,
+   IFLA_BRPORT_MCAST_TO_UCAST,
__IFLA_BRPORT_MAX
 };
 #define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1)
diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
index 7cb41ae..75d041e 100644
--- a/net/bridge/br_forward.c
+++ b/net/bridge/br_forward.c
@@ -174,6 +174,29 @@ static struct net_bridge_port *maybe_deliver(
return p;
 }
 
+static void maybe_deliver_addr(struct net_bridge_port *p, struct sk_buff *skb,
+  const unsigned char *addr, bool local_orig)
+{
+   struct net_device *dev = BR_INPUT_SKB_CB(skb)->brdev;
+   const unsigned char *src = eth_hdr(skb)->h_source;
+
+   if (!should_deliver(p, skb))
+   return;
+
+   /* Even with hairpin, no soliloquies - prevent breaking IPv6 DAD */
+   if (skb->dev == p->dev && ether_addr_equal(src, addr))
+   return;
+
+   skb = skb_copy(skb, GFP_ATOMIC);
+   if (!skb) {
+   dev->stats.tx_dropped++;
+   return;
+   }
+
+   memcpy(eth_hdr(skb)->h_dest, addr, ETH_ALEN);
+   __br_forward(p, skb, local_orig);
+}
+
 /* called under rcu_read_lock */
 void br_flood(struct net_bridge *br, struct sk_buff *skb,
  enum br_pkt_type pkt_type, bool local_rcv, bool local_orig)
@@ -241,10 +264,20 @@ void b

[PATCH net-next v2] bridge: multicast to unicast

2017-01-17 Thread Linus Lüssing
From: Felix Fietkau 

Implements an optional, per bridge port flag and feature to deliver
multicast packets to any host on the according port via unicast
individually. This is done by copying the packet per host and
changing the multicast destination MAC to a unicast one accordingly.

multicast-to-unicast works on top of the multicast snooping feature of
the bridge. Which means unicast copies are only delivered to hosts which
are interested in it and signalized this via IGMP/MLD reports
previously.

This feature is intended for interface types which have a more reliable
and/or efficient way to deliver unicast packets than broadcast ones
(e.g. wifi).

However, it should only be enabled on interfaces where no IGMPv2/MLDv1
report suppression takes place. This feature is disabled by default.

The initial patch and idea is from Felix Fietkau.

Signed-off-by: Felix Fietkau 
[linus.luess...@c0d3.blue: various bug + style fixes, commit message]
Signed-off-by: Linus Lüssing 

---

This feature is used and enabled by default in OpenWRT and LEDE for AP
interfaces for more than a year now to allow both a more robust multicast
delivery and multicast at higher rates (e.g. multicast streaming).

In OpenWRT/LEDE the IGMP/MLD report suppression issue is overcome by
the network daemon enabling AP isolation and by that separating all STAs.
Delivery of STA-to-STA IP mulitcast is made possible again by
enabling and utilizing the bridge hairpin mode, which considers the
incoming port as a potential outgoing port, too.

Hairpin-mode is performed after multicast snooping, therefore leading to
only deliver reports to STAs running a multicast router.

Changes in v2:
* netlink support (thanks Nik!)
* fixed switching between multicast_to_unicast on/off
  -> even after toggling an already existing entry would
 stale in its mode and would never be replaced
  -> new extra check in br_port_group_equal)
* reduced checks in br_multicast_flood() from two to one
  to address fast-path concerns (thanks Nik!)
* rev-christmas tree ordering (thanks Nik!)
* removed "net_bridge_port_group::unicast", using
  ::flags instead (thanks Nik!)
* BR_MULTICAST_TO_UCAST -> BR_MULTICAST_TO_UNICAST
  (BR_MULTICAST_FLAST_LEAVE has the same length anyway)
* simplified maybe_deliver_addr()
  (no return, no "prev" paramater -> was a NOP anyway)
* added "From: Felix Fietkau [...]"
* added "Signed-off-by: Felix Fietkau [...]"
---
 include/linux/if_bridge.h|  1 +
 include/uapi/linux/if_link.h |  1 +
 net/bridge/br_forward.c  | 37 -
 net/bridge/br_mdb.c  |  2 +-
 net/bridge/br_multicast.c| 96 
 net/bridge/br_netlink.c  |  5 +++
 net/bridge/br_private.h  |  8 ++--
 net/bridge/br_sysfs_if.c |  2 +
 8 files changed, 121 insertions(+), 31 deletions(-)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index c6587c0..debc9d5 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -46,6 +46,7 @@ struct br_ip_list {
 #define BR_LEARNING_SYNC   BIT(9)
 #define BR_PROXYARP_WIFI   BIT(10)
 #define BR_MCAST_FLOOD BIT(11)
+#define BR_MULTICAST_TO_UNICASTBIT(12)
 
 #define BR_DEFAULT_AGEING_TIME (300 * HZ)
 
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 6b13e59..4e59565 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -321,6 +321,7 @@ enum {
IFLA_BRPORT_MULTICAST_ROUTER,
IFLA_BRPORT_PAD,
IFLA_BRPORT_MCAST_FLOOD,
+   IFLA_BRPORT_MCAST_TO_UCAST,
__IFLA_BRPORT_MAX
 };
 #define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1)
diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
index 7cb41ae..75d041e 100644
--- a/net/bridge/br_forward.c
+++ b/net/bridge/br_forward.c
@@ -174,6 +174,29 @@ static struct net_bridge_port *maybe_deliver(
return p;
 }
 
+static void maybe_deliver_addr(struct net_bridge_port *p, struct sk_buff *skb,
+  const unsigned char *addr, bool local_orig)
+{
+   struct net_device *dev = BR_INPUT_SKB_CB(skb)->brdev;
+   const unsigned char *src = eth_hdr(skb)->h_source;
+
+   if (!should_deliver(p, skb))
+   return;
+
+   /* Even with hairpin, no soliloquies - prevent breaking IPv6 DAD */
+   if (skb->dev == p->dev && ether_addr_equal(src, addr))
+   return;
+
+   skb = skb_copy(skb, GFP_ATOMIC);
+   if (!skb) {
+   dev->stats.tx_dropped++;
+   return;
+   }
+
+   memcpy(eth_hdr(skb)->h_dest, addr, ETH_ALEN);
+   __br_forward(p, skb, local_orig);
+}
+
 /* called under rcu_read_lock */
 void br_flood(struct net_bridge *br, struct sk_buff *skb,
  enum br_pkt_type pkt_type, bool local_rcv, bool local_orig)
@@ -241,10 +264,20 @@ void br_multicast_flood(struct net_bridge_mdb_entry *md

Re: [PATCH net-next] bridge: multicast to unicast

2017-01-09 Thread Linus Lüssing
On Mon, Jan 09, 2017 at 01:30:32PM -0800, Stephen Hemminger wrote:
> I wonder if MAC80211 should be doing IGMP snooping and not bridge
> in this environment.

In the long term, yes. For now, not quite sure.

I personally like to go for simple solutions first :).


Re: [PATCH net-next] bridge: multicast to unicast

2017-01-09 Thread Linus Lüssing
On Mon, Jan 09, 2017 at 01:30:32PM -0800, Stephen Hemminger wrote:
> I wonder if MAC80211 should be doing IGMP snooping and not bridge
> in this environment.

In the long term, yes. For now, not quite sure.

I personally like to go for simple solutions first :).


Re: [PATCH net-next] bridge: multicast to unicast

2017-01-09 Thread Linus Lüssing
On Mon, Jan 09, 2017 at 10:42:46PM +0100, Johannes Berg wrote:
> On Mon, 2017-01-09 at 22:33 +0100, Linus Lüssing wrote:
> > On Mon, Jan 09, 2017 at 01:44:03PM +0100, Johannes Berg wrote:
> > > 
> > > > >  A host SHOULD silently discard a datagram that is
> > > > > received via
> > > > >  a link-layer broadcast (see Section 2.4) but does not
> > > > > specify
> > > > >  an IP multicast or broadcast destination address.
> > > > 
> > > > This example is the other way round. It specifies how the IP
> > > > destination should look like in case of link-layer broadcast. Not
> > > > how the link-layer destination should look like in case of a
> > > > multicast/broadcast IP destination.
> > > 
> > > You stopped reading too early - snipped the context part for you :)
> > 
> > Sorry for writing to you directly, but I still have some
> > difficulties. In pseudo-code that line says:
> > 
> > -
> > if ll_dst(pkt) == bcast AND ip_dst(pkt) != mcast/bcast:
> > -> drop(pkt)
> > -
> > 
> > But after multicast-to-unicast conversion, we have:
> > 
> > -
> > ll_dst(pkt) == ucast AND ip_dst(pkt) == mcast
> > -
> > 
> > So none of the two requirements for dropping are matched?
> > 
> 
> Exactly. My point is that this is breaking the expectation that hosts
> are actually able to drop such packets.

[readding CCs I removed earlier]

Ah! Thanks. I was worried about creating packetloss :D.

Hm, for this other other way round, I think it does not apply for
the bridge multicast-to-unicast patch if I'm not misreading the bridge code:

For a packet with a link-layer multicast address but a unicast IP
destination, the bridge MDB lookup will fail.
(http://lxr.free-electrons.com/source/net/bridge/br_multicast.c?v=4.8#L178
 returns NULL)

Case A): No multicast router on port:
-> bridge, br_multicast_flood(), will drop the packet already
   (no matter if multicast-to-unicast is enabled or not)

Case B): Multicast router present on port:
-> The new patch does not apply multicast-to-unicast but just floods
   packet unaltered
   ("else { port = rport; addr = NULL; }" branch)

Regards, Linus


Re: [PATCH net-next] bridge: multicast to unicast

2017-01-09 Thread Linus Lüssing
On Mon, Jan 09, 2017 at 10:42:46PM +0100, Johannes Berg wrote:
> On Mon, 2017-01-09 at 22:33 +0100, Linus Lüssing wrote:
> > On Mon, Jan 09, 2017 at 01:44:03PM +0100, Johannes Berg wrote:
> > > 
> > > > >  A host SHOULD silently discard a datagram that is
> > > > > received via
> > > > >  a link-layer broadcast (see Section 2.4) but does not
> > > > > specify
> > > > >  an IP multicast or broadcast destination address.
> > > > 
> > > > This example is the other way round. It specifies how the IP
> > > > destination should look like in case of link-layer broadcast. Not
> > > > how the link-layer destination should look like in case of a
> > > > multicast/broadcast IP destination.
> > > 
> > > You stopped reading too early - snipped the context part for you :)
> > 
> > Sorry for writing to you directly, but I still have some
> > difficulties. In pseudo-code that line says:
> > 
> > -
> > if ll_dst(pkt) == bcast AND ip_dst(pkt) != mcast/bcast:
> > -> drop(pkt)
> > -
> > 
> > But after multicast-to-unicast conversion, we have:
> > 
> > -
> > ll_dst(pkt) == ucast AND ip_dst(pkt) == mcast
> > -
> > 
> > So none of the two requirements for dropping are matched?
> > 
> 
> Exactly. My point is that this is breaking the expectation that hosts
> are actually able to drop such packets.

[readding CCs I removed earlier]

Ah! Thanks. I was worried about creating packetloss :D.

Hm, for this other other way round, I think it does not apply for
the bridge multicast-to-unicast patch if I'm not misreading the bridge code:

For a packet with a link-layer multicast address but a unicast IP
destination, the bridge MDB lookup will fail.
(http://lxr.free-electrons.com/source/net/bridge/br_multicast.c?v=4.8#L178
 returns NULL)

Case A): No multicast router on port:
-> bridge, br_multicast_flood(), will drop the packet already
   (no matter if multicast-to-unicast is enabled or not)

Case B): Multicast router present on port:
-> The new patch does not apply multicast-to-unicast but just floods
   packet unaltered
   ("else { port = rport; addr = NULL; }" branch)

Regards, Linus


Re: [PATCH net-next] bridge: multicast to unicast

2017-01-09 Thread Linus Lüssing
On Mon, Jan 09, 2017 at 12:44:19PM +0100, M. Braun wrote:
> Am 09.01.2017 um 09:08 schrieb Johannes Berg:
> > Does it make sense to implement the two in separate layers though?
> > 
> > Clearly, this part needs to be implemented in the bridge layer due to
> > the snooping knowledge, but the code is very similar to what mac80211
> > has now.
> 
> Does the bridge always know about all stations connected?

The bridge does not always know about all stations, especially the
silent ones like in your DVB-T example.

However, concerning IP multicast, there is IGMP/MLD. So the bridge
does know about all stations which are interested in a specific IP
multicast stream.

(As long as there is a querier on the link, which periodically
queriers for IGMP/MLD reports from any listener. If there is no
querier then the bridge multicast snooping, including the bridge
multicast-to-unicast will fall back to flooding)


So if your television example uses IP multicast properly, it is
completely doable with the bridge multicast-to-unicast, thanks to
IGMP/MLD.


Re: [PATCH net-next] bridge: multicast to unicast

2017-01-09 Thread Linus Lüssing
On Mon, Jan 09, 2017 at 12:44:19PM +0100, M. Braun wrote:
> Am 09.01.2017 um 09:08 schrieb Johannes Berg:
> > Does it make sense to implement the two in separate layers though?
> > 
> > Clearly, this part needs to be implemented in the bridge layer due to
> > the snooping knowledge, but the code is very similar to what mac80211
> > has now.
> 
> Does the bridge always know about all stations connected?

The bridge does not always know about all stations, especially the
silent ones like in your DVB-T example.

However, concerning IP multicast, there is IGMP/MLD. So the bridge
does know about all stations which are interested in a specific IP
multicast stream.

(As long as there is a querier on the link, which periodically
queriers for IGMP/MLD reports from any listener. If there is no
querier then the bridge multicast snooping, including the bridge
multicast-to-unicast will fall back to flooding)


So if your television example uses IP multicast properly, it is
completely doable with the bridge multicast-to-unicast, thanks to
IGMP/MLD.


Re: [PATCH net-next] bridge: multicast to unicast

2017-01-09 Thread Linus Lüssing
On Mon, Jan 09, 2017 at 09:05:49AM +0100, Johannes Berg wrote:
> On Sat, 2017-01-07 at 16:15 +0100, Linus Lüssing wrote:
> 
> > Actually, I do not quite understand that remark in the mac80211
> > multicast-to-unicast patch. IP should not care about the ethernet
> > header?
> 
> But it does, for example RFC 1122 states:
> 
>  When a host sends a datagram to a link-layer broadcast address,
>  the IP destination address MUST be a legal IP broadcast or IP
>  multicast address.
> 
>  A host SHOULD silently discard a datagram that is received via
>  a link-layer broadcast (see Section 2.4) but does not specify
>  an IP multicast or broadcast destination address.

This example is the other way round. It specifies how the IP
destination should look like in case of link-layer broadcast. Not
how the link-layer destination should look like in case of a
multicast/broadcast IP destination.

Any other examples?


Re: [PATCH net-next] bridge: multicast to unicast

2017-01-09 Thread Linus Lüssing
On Mon, Jan 09, 2017 at 09:05:49AM +0100, Johannes Berg wrote:
> On Sat, 2017-01-07 at 16:15 +0100, Linus Lüssing wrote:
> 
> > Actually, I do not quite understand that remark in the mac80211
> > multicast-to-unicast patch. IP should not care about the ethernet
> > header?
> 
> But it does, for example RFC 1122 states:
> 
>  When a host sends a datagram to a link-layer broadcast address,
>  the IP destination address MUST be a legal IP broadcast or IP
>  multicast address.
> 
>  A host SHOULD silently discard a datagram that is received via
>  a link-layer broadcast (see Section 2.4) but does not specify
>  an IP multicast or broadcast destination address.

This example is the other way round. It specifies how the IP
destination should look like in case of link-layer broadcast. Not
how the link-layer destination should look like in case of a
multicast/broadcast IP destination.

Any other examples?


Re: [PATCH net-next] bridge: multicast to unicast

2017-01-07 Thread Linus Lüssing
On Fri, Jan 06, 2017 at 01:47:52PM +0100, Johannes Berg wrote:
> How does this compare and/or relate to the multicast-to-unicast feature
> we were going to add to the wifi stack, particularly mac80211? Do we
> perhaps not need that feature at all, if bridging will have it?
> 
> I suppose that the feature there could apply also to locally generated
> traffic when the AP interface isn't in a bridge, but I think I could
> live with requiring the AP to be put into a bridge to achieve a similar
> configuration?
> 
> Additionally, on an unrelated note, this seems to apply generically to
> all kinds of frames, losing information by replacing the address.
> Shouldn't it have similar limitations as the wifi stack feature has
> then, like only applying to ARP, IPv4, IPv6 and not general protocols?

(should all three be answered with Michael's and my reply to
Michael's mail, I think)

> 
> Also, it should probably come with the same caveat as we documented for
> the wifi feature:
> 
> Note that this may break certain expectations of the receiver,
> such as the ability to drop unicast IP packets received within
> multicast L2 frames, or the ability to not send ICMP destination
> unreachable messages for packets received in L2 multicast (which
> is required, but the receiver can't tell the difference if this
> new option is enabled.)

Actually, I do not quite understand that remark in the mac80211
multicast-to-unicast patch. IP should not care about the ethernet
header?


Re: [PATCH net-next] bridge: multicast to unicast

2017-01-07 Thread Linus Lüssing
On Fri, Jan 06, 2017 at 01:47:52PM +0100, Johannes Berg wrote:
> How does this compare and/or relate to the multicast-to-unicast feature
> we were going to add to the wifi stack, particularly mac80211? Do we
> perhaps not need that feature at all, if bridging will have it?
> 
> I suppose that the feature there could apply also to locally generated
> traffic when the AP interface isn't in a bridge, but I think I could
> live with requiring the AP to be put into a bridge to achieve a similar
> configuration?
> 
> Additionally, on an unrelated note, this seems to apply generically to
> all kinds of frames, losing information by replacing the address.
> Shouldn't it have similar limitations as the wifi stack feature has
> then, like only applying to ARP, IPv4, IPv6 and not general protocols?

(should all three be answered with Michael's and my reply to
Michael's mail, I think)

> 
> Also, it should probably come with the same caveat as we documented for
> the wifi feature:
> 
> Note that this may break certain expectations of the receiver,
> such as the ability to drop unicast IP packets received within
> multicast L2 frames, or the ability to not send ICMP destination
> unreachable messages for packets received in L2 multicast (which
> is required, but the receiver can't tell the difference if this
> new option is enabled.)

Actually, I do not quite understand that remark in the mac80211
multicast-to-unicast patch. IP should not care about the ethernet
header?


Re: [PATCH net-next] bridge: multicast to unicast

2017-01-07 Thread Linus Lüssing
On Fri, Jan 06, 2017 at 07:13:56PM -0800, Stephen Hemminger wrote:
> On Mon,  2 Jan 2017 20:32:14 +0100
> Linus Lüssing <linus.luess...@c0d3.blue> wrote:
> 
> > This feature is intended for interface types which have a more reliable
> > and/or efficient way to deliver unicast packets than broadcast ones
> > (e.g. wifi).
> 
> 
> Why is this not done in MAC80211 rather than  bridge?

Because mac80211 does not have the IGMP/MLD snooping code as
the bridge has?

Reimplementing the snooping in mac80211 does not make sense
because of duplicating code. Moving the snooping code from the
bridge to mac80211 does not make sense either, because we want
multicast snooping, software based, (virtually) wired switches,
too.

The "best way" (TM) would probably be to migrate the IGMP/MLD
snooping from the bridge code to the net device code on the long
run to make such a database usable for any kind of device, without
needing this bridge hack.

But such a migration would also need a way more invasive patchset.

While Felix's idea might look a little "ugly" due it's hacky
nature, I think it is also quite beautiful thanks to it's
simplicity.


Re: [PATCH net-next] bridge: multicast to unicast

2017-01-07 Thread Linus Lüssing
On Fri, Jan 06, 2017 at 07:13:56PM -0800, Stephen Hemminger wrote:
> On Mon,  2 Jan 2017 20:32:14 +0100
> Linus Lüssing  wrote:
> 
> > This feature is intended for interface types which have a more reliable
> > and/or efficient way to deliver unicast packets than broadcast ones
> > (e.g. wifi).
> 
> 
> Why is this not done in MAC80211 rather than  bridge?

Because mac80211 does not have the IGMP/MLD snooping code as
the bridge has?

Reimplementing the snooping in mac80211 does not make sense
because of duplicating code. Moving the snooping code from the
bridge to mac80211 does not make sense either, because we want
multicast snooping, software based, (virtually) wired switches,
too.

The "best way" (TM) would probably be to migrate the IGMP/MLD
snooping from the bridge code to the net device code on the long
run to make such a database usable for any kind of device, without
needing this bridge hack.

But such a migration would also need a way more invasive patchset.

While Felix's idea might look a little "ugly" due it's hacky
nature, I think it is also quite beautiful thanks to it's
simplicity.


Re: [PATCH net-next] bridge: multicast to unicast

2017-01-07 Thread Linus Lüssing
On Sat, Jan 07, 2017 at 11:32:57AM +0100, M. Braun wrote:
> Am 06.01.2017 um 14:54 schrieb Johannes Berg:
> > 
> >> The bridge layer can use IGMP snooping to ensure that the multicast
> >> stream is only transmitted to clients that are actually a member of
> >> the group. Can the mac80211 feature do the same?
> > 
> > No, it'll convert the packet for all clients that are behind that
> > netdev. But that's an argument for dropping the mac80211 feature, which
> > hasn't been merged upstream yet, no?
> 
> But there is multicast/broadcast traffic like e.g. ARP and some IP
> multicast groups that are not covered by IGMP snooping. The mac80211
> patch converts this to unicast as well, which the bridge cannot do.
> 
> That way, these features both complement and overlap each other.

Right, I'd agree with that.

I didn't write it explicitly in the commit message, but yes, the
like anything concerning bridge multicast snooping, bridge
multicast-to-unicast can only affect packets as noted in
RFC4541 ("Considerations for Internet Group Management Protocol (IGMP)
and Multicast Listener Discovery (MLD) Snooping Switches"), too.

So it is only working for IPv4 multicast, excluding link-local
(224.0.0.0/24), and IPv6 multicast, excluding all-host-multicast
(ff02::1).

And does not concern ARP in any way.


The nice complementary effect is, that the bridge can first sieve
out those IP packets thanks to IGMP/MLD snooping knowledge and for
anything else, like ARP, 224.0.0.x or ff02::1, the mac80211
multicast-to-unicast could do its job.


For APs with a small number of STAs (like your private home AP),
you might want to enable both bridge multicast-to-unicast and
mac80211 multicast-to-unicast for this complementary effect. While
on public APs with 30 to 50 STAs with varying distances and bitrates,
you might only one to enable the bridge one, because sending an ARP
packet 50x might actually reduce performance and airtime
significantly.


Re: [PATCH net-next] bridge: multicast to unicast

2017-01-07 Thread Linus Lüssing
On Sat, Jan 07, 2017 at 11:32:57AM +0100, M. Braun wrote:
> Am 06.01.2017 um 14:54 schrieb Johannes Berg:
> > 
> >> The bridge layer can use IGMP snooping to ensure that the multicast
> >> stream is only transmitted to clients that are actually a member of
> >> the group. Can the mac80211 feature do the same?
> > 
> > No, it'll convert the packet for all clients that are behind that
> > netdev. But that's an argument for dropping the mac80211 feature, which
> > hasn't been merged upstream yet, no?
> 
> But there is multicast/broadcast traffic like e.g. ARP and some IP
> multicast groups that are not covered by IGMP snooping. The mac80211
> patch converts this to unicast as well, which the bridge cannot do.
> 
> That way, these features both complement and overlap each other.

Right, I'd agree with that.

I didn't write it explicitly in the commit message, but yes, the
like anything concerning bridge multicast snooping, bridge
multicast-to-unicast can only affect packets as noted in
RFC4541 ("Considerations for Internet Group Management Protocol (IGMP)
and Multicast Listener Discovery (MLD) Snooping Switches"), too.

So it is only working for IPv4 multicast, excluding link-local
(224.0.0.0/24), and IPv6 multicast, excluding all-host-multicast
(ff02::1).

And does not concern ARP in any way.


The nice complementary effect is, that the bridge can first sieve
out those IP packets thanks to IGMP/MLD snooping knowledge and for
anything else, like ARP, 224.0.0.x or ff02::1, the mac80211
multicast-to-unicast could do its job.


For APs with a small number of STAs (like your private home AP),
you might want to enable both bridge multicast-to-unicast and
mac80211 multicast-to-unicast for this complementary effect. While
on public APs with 30 to 50 STAs with varying distances and bitrates,
you might only one to enable the bridge one, because sending an ARP
packet 50x might actually reduce performance and airtime
significantly.


[PATCH net-next] bridge: multicast to unicast

2017-01-02 Thread Linus Lüssing
Implements an optional, per bridge port flag and feature to deliver
multicast packets to any host on the according port via unicast
individually. This is done by copying the packet per host and
changing the multicast destination MAC to a unicast one accordingly.

multicast-to-unicast works on top of the multicast snooping feature of
the bridge. Which means unicast copies are only delivered to hosts which
are interested in it and signalized this via IGMP/MLD reports
previously.

This feature is intended for interface types which have a more reliable
and/or efficient way to deliver unicast packets than broadcast ones
(e.g. wifi).

However, it should only be enabled on interfaces where no IGMPv2/MLDv1
report suppression takes place. This feature is disabled by default.

The initial patch and idea is from Felix Fietkau.

Cc: Felix Fietkau <n...@nbd.name>
Signed-off-by: Linus Lüssing <linus.luess...@c0d3.blue>

---

This feature is used and enabled by default in OpenWRT and LEDE for AP
interfaces for more than a year now to allow both a more robust multicast
delivery and multicast at higher rates (e.g. multicast streaming).

In OpenWRT/LEDE the IGMP/MLD report suppression issue is overcome by
the network daemon enabling AP isolation and by that separating all STAs.
Delivery of STA-to-STA IP mulitcast is made possible again by
enabling and utilizing the bridge hairpin mode, which considers the
incoming port as a potential outgoing port, too.

Hairpin-mode is performed after multicast snooping, therefore leading to
only deliver reports to STAs running a multicast router.
---
 include/linux/if_bridge.h |  1 +
 net/bridge/br_forward.c   | 44 +--
 net/bridge/br_mdb.c   |  2 +-
 net/bridge/br_multicast.c | 92 ++-
 net/bridge/br_private.h   |  4 ++-
 net/bridge/br_sysfs_if.c  |  2 ++
 6 files changed, 115 insertions(+), 30 deletions(-)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index c6587c0..f1b0d78 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -46,6 +46,7 @@ struct br_ip_list {
 #define BR_LEARNING_SYNC   BIT(9)
 #define BR_PROXYARP_WIFI   BIT(10)
 #define BR_MCAST_FLOOD BIT(11)
+#define BR_MULTICAST_TO_UCAST  BIT(12)
 
 #define BR_DEFAULT_AGEING_TIME (300 * HZ)
 
diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
index 7cb41ae..49d742d 100644
--- a/net/bridge/br_forward.c
+++ b/net/bridge/br_forward.c
@@ -174,6 +174,33 @@ static struct net_bridge_port *maybe_deliver(
return p;
 }
 
+static struct net_bridge_port *maybe_deliver_addr(
+   struct net_bridge_port *prev, struct net_bridge_port *p,
+   struct sk_buff *skb, const unsigned char *addr,
+   bool local_orig)
+{
+   struct net_device *dev = BR_INPUT_SKB_CB(skb)->brdev;
+   const unsigned char *src = eth_hdr(skb)->h_source;
+
+   if (!should_deliver(p, skb))
+   return prev;
+
+   /* Even with hairpin, no soliloquies - prevent breaking IPv6 DAD */
+   if (skb->dev == p->dev && ether_addr_equal(src, addr))
+   return prev;
+
+   skb = skb_copy(skb, GFP_ATOMIC);
+   if (!skb) {
+   dev->stats.tx_dropped++;
+   return prev;
+   }
+
+   memcpy(eth_hdr(skb)->h_dest, addr, ETH_ALEN);
+   __br_forward(p, skb, local_orig);
+
+   return prev;
+}
+
 /* called under rcu_read_lock */
 void br_flood(struct net_bridge *br, struct sk_buff *skb,
  enum br_pkt_type pkt_type, bool local_rcv, bool local_orig)
@@ -231,6 +258,7 @@ void br_multicast_flood(struct net_bridge_mdb_entry *mdst,
struct net_bridge_port *prev = NULL;
struct net_bridge_port_group *p;
struct hlist_node *rp;
+   const unsigned char *addr;
 
rp = rcu_dereference(hlist_first_rcu(>router_list));
p = mdst ? rcu_dereference(mdst->ports) : NULL;
@@ -241,10 +269,20 @@ void br_multicast_flood(struct net_bridge_mdb_entry *mdst,
rport = rp ? hlist_entry(rp, struct net_bridge_port, rlist) :
 NULL;
 
-   port = (unsigned long)lport > (unsigned long)rport ?
-  lport : rport;
+   if ((unsigned long)lport > (unsigned long)rport) {
+   port = lport;
+   addr = p->unicast ? p->eth_addr : NULL;
+   } else {
+   port = rport;
+   addr = NULL;
+   }
+
+   if (addr)
+   prev = maybe_deliver_addr(prev, port, skb, addr,
+ local_orig);
+   else
+   prev = maybe_deliver(prev, port, skb, local_orig);
 
-   prev = maybe_deliver(prev, port, skb, local_orig);
if (IS_ERR(prev))
goto out;
if (prev == port

[PATCH net-next] bridge: multicast to unicast

2017-01-02 Thread Linus Lüssing
Implements an optional, per bridge port flag and feature to deliver
multicast packets to any host on the according port via unicast
individually. This is done by copying the packet per host and
changing the multicast destination MAC to a unicast one accordingly.

multicast-to-unicast works on top of the multicast snooping feature of
the bridge. Which means unicast copies are only delivered to hosts which
are interested in it and signalized this via IGMP/MLD reports
previously.

This feature is intended for interface types which have a more reliable
and/or efficient way to deliver unicast packets than broadcast ones
(e.g. wifi).

However, it should only be enabled on interfaces where no IGMPv2/MLDv1
report suppression takes place. This feature is disabled by default.

The initial patch and idea is from Felix Fietkau.

Cc: Felix Fietkau 
Signed-off-by: Linus Lüssing 

---

This feature is used and enabled by default in OpenWRT and LEDE for AP
interfaces for more than a year now to allow both a more robust multicast
delivery and multicast at higher rates (e.g. multicast streaming).

In OpenWRT/LEDE the IGMP/MLD report suppression issue is overcome by
the network daemon enabling AP isolation and by that separating all STAs.
Delivery of STA-to-STA IP mulitcast is made possible again by
enabling and utilizing the bridge hairpin mode, which considers the
incoming port as a potential outgoing port, too.

Hairpin-mode is performed after multicast snooping, therefore leading to
only deliver reports to STAs running a multicast router.
---
 include/linux/if_bridge.h |  1 +
 net/bridge/br_forward.c   | 44 +--
 net/bridge/br_mdb.c   |  2 +-
 net/bridge/br_multicast.c | 92 ++-
 net/bridge/br_private.h   |  4 ++-
 net/bridge/br_sysfs_if.c  |  2 ++
 6 files changed, 115 insertions(+), 30 deletions(-)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index c6587c0..f1b0d78 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -46,6 +46,7 @@ struct br_ip_list {
 #define BR_LEARNING_SYNC   BIT(9)
 #define BR_PROXYARP_WIFI   BIT(10)
 #define BR_MCAST_FLOOD BIT(11)
+#define BR_MULTICAST_TO_UCAST  BIT(12)
 
 #define BR_DEFAULT_AGEING_TIME (300 * HZ)
 
diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
index 7cb41ae..49d742d 100644
--- a/net/bridge/br_forward.c
+++ b/net/bridge/br_forward.c
@@ -174,6 +174,33 @@ static struct net_bridge_port *maybe_deliver(
return p;
 }
 
+static struct net_bridge_port *maybe_deliver_addr(
+   struct net_bridge_port *prev, struct net_bridge_port *p,
+   struct sk_buff *skb, const unsigned char *addr,
+   bool local_orig)
+{
+   struct net_device *dev = BR_INPUT_SKB_CB(skb)->brdev;
+   const unsigned char *src = eth_hdr(skb)->h_source;
+
+   if (!should_deliver(p, skb))
+   return prev;
+
+   /* Even with hairpin, no soliloquies - prevent breaking IPv6 DAD */
+   if (skb->dev == p->dev && ether_addr_equal(src, addr))
+   return prev;
+
+   skb = skb_copy(skb, GFP_ATOMIC);
+   if (!skb) {
+   dev->stats.tx_dropped++;
+   return prev;
+   }
+
+   memcpy(eth_hdr(skb)->h_dest, addr, ETH_ALEN);
+   __br_forward(p, skb, local_orig);
+
+   return prev;
+}
+
 /* called under rcu_read_lock */
 void br_flood(struct net_bridge *br, struct sk_buff *skb,
  enum br_pkt_type pkt_type, bool local_rcv, bool local_orig)
@@ -231,6 +258,7 @@ void br_multicast_flood(struct net_bridge_mdb_entry *mdst,
struct net_bridge_port *prev = NULL;
struct net_bridge_port_group *p;
struct hlist_node *rp;
+   const unsigned char *addr;
 
rp = rcu_dereference(hlist_first_rcu(>router_list));
p = mdst ? rcu_dereference(mdst->ports) : NULL;
@@ -241,10 +269,20 @@ void br_multicast_flood(struct net_bridge_mdb_entry *mdst,
rport = rp ? hlist_entry(rp, struct net_bridge_port, rlist) :
 NULL;
 
-   port = (unsigned long)lport > (unsigned long)rport ?
-  lport : rport;
+   if ((unsigned long)lport > (unsigned long)rport) {
+   port = lport;
+   addr = p->unicast ? p->eth_addr : NULL;
+   } else {
+   port = rport;
+   addr = NULL;
+   }
+
+   if (addr)
+   prev = maybe_deliver_addr(prev, port, skb, addr,
+ local_orig);
+   else
+   prev = maybe_deliver(prev, port, skb, local_orig);
 
-   prev = maybe_deliver(prev, port, skb, local_orig);
if (IS_ERR(prev))
goto out;
if (prev == port)
diff --git a/net/bridge/br_mdb.c b/net/bridge/br_m

Re: [PATCH] mac80211_hwsim: suggest nl80211 instead of wext driver in documentation

2016-10-17 Thread Linus Lüssing
On Mon, Oct 17, 2016 at 11:39:04AM +0200, Johannes Berg wrote:
> On Mon, 2016-10-17 at 00:39 +0200, Linus Lüssing wrote:
> > For mac80211_hwsim interfaces, suggest to use wpa_supplicant with the
> > more modern, netlink based driver instead of wext.
> 
> Makes sense, applied.
> 
> > Actually, I wasn't even able to make a connection with the
> > configuration
> > files and information provided in
> > Documentation/networking/mac80211_hwsim/{README,hostapd.conf/wpa_supp
> > licant.conf}
> > 
> This less so, we even have a few test cases we run regularly, but I
> don't know. Maybe there's something special in those configuration
> files that we don't test for otherwise.

I investigated a little further and the problem is probably that
I'm running a minimal Linux kernel which was compiled without
CONFIG_CFG80211_WEXT :-).

Anyways, thanks for applying the patch!

Regards, Linus


Re: [PATCH] mac80211_hwsim: suggest nl80211 instead of wext driver in documentation

2016-10-17 Thread Linus Lüssing
On Mon, Oct 17, 2016 at 11:39:04AM +0200, Johannes Berg wrote:
> On Mon, 2016-10-17 at 00:39 +0200, Linus Lüssing wrote:
> > For mac80211_hwsim interfaces, suggest to use wpa_supplicant with the
> > more modern, netlink based driver instead of wext.
> 
> Makes sense, applied.
> 
> > Actually, I wasn't even able to make a connection with the
> > configuration
> > files and information provided in
> > Documentation/networking/mac80211_hwsim/{README,hostapd.conf/wpa_supp
> > licant.conf}
> > 
> This less so, we even have a few test cases we run regularly, but I
> don't know. Maybe there's something special in those configuration
> files that we don't test for otherwise.

I investigated a little further and the problem is probably that
I'm running a minimal Linux kernel which was compiled without
CONFIG_CFG80211_WEXT :-).

Anyways, thanks for applying the patch!

Regards, Linus


[PATCH] mac80211_hwsim: suggest nl80211 instead of wext driver in documentation

2016-10-16 Thread Linus Lüssing
For mac80211_hwsim interfaces, suggest to use wpa_supplicant with the more
modern, netlink based driver instead of wext.

Signed-off-by: Linus Lüssing <linus.luess...@c0d3.blue>
---

Actually, I wasn't even able to make a connection with the configuration
files and information provided in
Documentation/networking/mac80211_hwsim/{README,hostapd.conf/wpa_supplicant.conf}

Changing -Dwext to -Dnl80211 helped and made the WPA-PSK connection with
mac80211_hwsim interfaces work for me.
---
 Documentation/networking/mac80211_hwsim/README | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/networking/mac80211_hwsim/README 
b/Documentation/networking/mac80211_hwsim/README
index 24ac91d..3566a72 100644
--- a/Documentation/networking/mac80211_hwsim/README
+++ b/Documentation/networking/mac80211_hwsim/README
@@ -60,7 +60,7 @@ modprobe mac80211_hwsim
 hostapd hostapd.conf
 
 # Run wpa_supplicant (station) for wlan1
-wpa_supplicant -Dwext -iwlan1 -c wpa_supplicant.conf
+wpa_supplicant -Dnl80211 -iwlan1 -c wpa_supplicant.conf
 
 
 More test cases are available in hostap.git:
-- 
2.1.4



[PATCH] mac80211_hwsim: suggest nl80211 instead of wext driver in documentation

2016-10-16 Thread Linus Lüssing
For mac80211_hwsim interfaces, suggest to use wpa_supplicant with the more
modern, netlink based driver instead of wext.

Signed-off-by: Linus Lüssing 
---

Actually, I wasn't even able to make a connection with the configuration
files and information provided in
Documentation/networking/mac80211_hwsim/{README,hostapd.conf/wpa_supplicant.conf}

Changing -Dwext to -Dnl80211 helped and made the WPA-PSK connection with
mac80211_hwsim interfaces work for me.
---
 Documentation/networking/mac80211_hwsim/README | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/networking/mac80211_hwsim/README 
b/Documentation/networking/mac80211_hwsim/README
index 24ac91d..3566a72 100644
--- a/Documentation/networking/mac80211_hwsim/README
+++ b/Documentation/networking/mac80211_hwsim/README
@@ -60,7 +60,7 @@ modprobe mac80211_hwsim
 hostapd hostapd.conf
 
 # Run wpa_supplicant (station) for wlan1
-wpa_supplicant -Dwext -iwlan1 -c wpa_supplicant.conf
+wpa_supplicant -Dnl80211 -iwlan1 -c wpa_supplicant.conf
 
 
 More test cases are available in hostap.git:
-- 
2.1.4



[PATCH] cfg80211: Add stub for cfg80211_get_station()

2016-08-19 Thread Linus Lüssing
This allows modules using this function (currently: batman-adv) to
compile even if cfg80211 is not built at all, thus relaxing
dependencies.

Signed-off-by: Linus Lüssing <linus.luess...@c0d3.blue>
---
 include/net/cfg80211.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h
index 9c23f4d3..beb7610 100644
--- a/include/net/cfg80211.h
+++ b/include/net/cfg80211.h
@@ -1102,6 +1102,7 @@ struct station_info {
struct cfg80211_tid_stats pertid[IEEE80211_NUM_TIDS + 1];
 };
 
+#if IS_ENABLED(CONFIG_CFG80211)
 /**
  * cfg80211_get_station - retrieve information about a given station
  * @dev: the device where the station is supposed to be connected to
@@ -1114,6 +1115,14 @@ struct station_info {
  */
 int cfg80211_get_station(struct net_device *dev, const u8 *mac_addr,
 struct station_info *sinfo);
+#else
+static inline int cfg80211_get_station(struct net_device *dev,
+  const u8 *mac_addr,
+  struct station_info *sinfo)
+{
+   return -ENOENT;
+}
+#endif
 
 /**
  * enum monitor_flags - monitor flags
-- 
2.1.4



[PATCH] cfg80211: Add stub for cfg80211_get_station()

2016-08-19 Thread Linus Lüssing
This allows modules using this function (currently: batman-adv) to
compile even if cfg80211 is not built at all, thus relaxing
dependencies.

Signed-off-by: Linus Lüssing 
---
 include/net/cfg80211.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h
index 9c23f4d3..beb7610 100644
--- a/include/net/cfg80211.h
+++ b/include/net/cfg80211.h
@@ -1102,6 +1102,7 @@ struct station_info {
struct cfg80211_tid_stats pertid[IEEE80211_NUM_TIDS + 1];
 };
 
+#if IS_ENABLED(CONFIG_CFG80211)
 /**
  * cfg80211_get_station - retrieve information about a given station
  * @dev: the device where the station is supposed to be connected to
@@ -1114,6 +1115,14 @@ struct station_info {
  */
 int cfg80211_get_station(struct net_device *dev, const u8 *mac_addr,
 struct station_info *sinfo);
+#else
+static inline int cfg80211_get_station(struct net_device *dev,
+  const u8 *mac_addr,
+  struct station_info *sinfo)
+{
+   return -ENOENT;
+}
+#endif
 
 /**
  * enum monitor_flags - monitor flags
-- 
2.1.4



[PATCH] mailmap: Add Linus Lüssing

2016-07-06 Thread Linus Lüssing
For one thing, summarizes all non-umlaut versions into the umlaut one
(Linus Luessing -> Linus Lüssing).

For another, maps obosolete email addresses to the current @c0d3.blue
one.

Signed-off-by: Linus Lüssing <linus.luess...@c0d3.blue>
---
 .mailmap | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/.mailmap b/.mailmap
index 52489f5..d4e7b66 100644
--- a/.mailmap
+++ b/.mailmap
@@ -91,6 +91,8 @@ Krzysztof Kozlowski <k...@kernel.org> 
<k.kozlowsk...@gmail.com>
 Kuninori Morimoto <kuninori.morimoto...@renesas.com>
 Leonid I Ananiev <leonid.i.anan...@intel.com>
 Linas Vepstas <li...@austin.ibm.com>
+Linus Lüssing <linus.luess...@c0d3.blue> <linus.luess...@web.de>
+Linus Lüssing <linus.luess...@c0d3.blue> <linus.luess...@ascom.ch>
 Mark Brown <broo...@sirena.org.uk>
 Matthieu CASTET <castet.matth...@free.fr>
 Mauro Carvalho Chehab <mche...@kernel.org> <mauroche...@gmail.com> 
<mche...@infradead.org> <mche...@redhat.com> <m.che...@samsung.com> 
<mche...@osg.samsung.com> <mche...@s-opensource.com>
-- 
2.1.4



[PATCH] mailmap: Add Linus Lüssing

2016-07-06 Thread Linus Lüssing
For one thing, summarizes all non-umlaut versions into the umlaut one
(Linus Luessing -> Linus Lüssing).

For another, maps obosolete email addresses to the current @c0d3.blue
one.

Signed-off-by: Linus Lüssing 
---
 .mailmap | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/.mailmap b/.mailmap
index 52489f5..d4e7b66 100644
--- a/.mailmap
+++ b/.mailmap
@@ -91,6 +91,8 @@ Krzysztof Kozlowski  

 Kuninori Morimoto 
 Leonid I Ananiev 
 Linas Vepstas 
+Linus Lüssing  
+Linus Lüssing  
 Mark Brown 
 Matthieu CASTET 
 Mauro Carvalho Chehab   
   
 
-- 
2.1.4



Re: [PATCH net] Bridge: Fix ipv6 mc snooping if bridge has no ipv6 address

2016-06-28 Thread Linus Lüssing
On Tue, Jun 28, 2016 at 08:04:42AM -0400, David Miller wrote:
> From: Linus Lüssing <linus.luess...@c0d3.blue>
> [...]
> > Fixes: 1d81d4c3dd88 ("bridge: check return value of ipv6_dev_get_saddr()")
> 
> You're missing an initial 'd' in that SHA1-ID.
> 
> With that fixed, applied and queued up for -stable.

Sorry :(. Thanks for taking care of it!


Re: [PATCH net] Bridge: Fix ipv6 mc snooping if bridge has no ipv6 address

2016-06-28 Thread Linus Lüssing
On Tue, Jun 28, 2016 at 08:04:42AM -0400, David Miller wrote:
> From: Linus Lüssing 
> [...]
> > Fixes: 1d81d4c3dd88 ("bridge: check return value of ipv6_dev_get_saddr()")
> 
> You're missing an initial 'd' in that SHA1-ID.
> 
> With that fixed, applied and queued up for -stable.

Sorry :(. Thanks for taking care of it!


Re: [PATCH net] Bridge: Fix ipv6 mc snooping if bridge has no ipv6 address

2016-06-25 Thread Linus Lüssing
On Fri, Jun 24, 2016 at 12:35:18PM +0200, Daniel Danzberger wrote:
> The bridge is falsly dropping ipv6 mulitcast packets if there is:
>  1. No ipv6 address assigned on the brigde.
>  2. No external mld querier present.
>  3. The internal querier enabled.
> 
> When the bridge fails to build mld queries, because it has no
> ipv6 address, it slilently returns, but keeps the local querier enabled.
> This specific case causes confusing packet loss.
> 
> Ipv6 multicast snooping can only work if:
>  a) An external querier is present
>  OR
>  b) The bridge has an ipv6 address an is capable of sending own queries
> 
> Otherwise it has to forward/flood the ipv6 multicast traffic,
> because snooping cannot work.
> 
> This patch fixes the issue by adding a flag to the bridge struct that
> indicates that there is currently no ipv6 address assinged to the bridge
> and returns a false state for the local querier in
> __br_multicast_querier_exists().

Fixes: 1d81d4c3dd88 ("bridge: check return value of ipv6_dev_get_saddr()")


  1   2   3   >