from:"Felix Fietkau"

Re: [PATCH 00/12] Ethernet: Add and use ether__addr globals

2018-04-05 Thread Felix Fietkau

On 2018-04-05 15:51, Joe Perches wrote:
>>  You have to factor in
>> not just the .text size, but the fact that referencing an exported
>> symbol needs a .reloc entry as well, which also eats up some space (at
>> least when the code is being built as module).
> 
> Thanks, the modules I built got smaller.
Please post some numbers to show this. By the way, on other
architectures the numbers will probably be different, especially on
ARM/MIPS.
>> In my opinion, your series probably causes more bloat in common
>> configurations instead of reducing it.
>> 
>> You're also touching several places that could easily use
>> eth_broadcast_addr and eth_zero_addr. I think making those changes would
>> be more productive than what you did in this series.
> 
> Doubtful. AFAIK: possible unaligned addresses.
Those two are just memset calls, alignment does not matter.

- Felix

Re: [PATCH 00/12] Ethernet: Add and use ether__addr globals

2018-04-05 Thread Felix Fietkau

On 2018-03-31 09:05, Joe Perches wrote:
> There are many local static and non-static arrays that are used for
> Ethernet broadcast address output or comparison.
> 
> Centralize the array into a single separate file and remove the local
> arrays.
I suspect that for many targets and configurations, the local arrays
might actually be smaller than exporting a global. You have to factor in
not just the .text size, but the fact that referencing an exported
symbol needs a .reloc entry as well, which also eats up some space (at
least when the code is being built as module).

In my opinion, your series probably causes more bloat in common
configurations instead of reducing it.

You're also touching several places that could easily use
eth_broadcast_addr and eth_zero_addr. I think making those changes would
be more productive than what you did in this series.

- Felix

Re: Problem with bridge (mcast-to-ucast + hairpin) and Broadcom's 802.11f in their FullMAC fw

2018-03-13 Thread Felix Fietkau

[resent with fixed typo in linux-wireless address]

On 2018-02-27 11:08, Rafał Miłecki wrote:
> I've problem when using OpenWrt/LEDE on a home router with Broadcom's
> FullMAC WiFi chipset.
> 
> 
> First of all OpenWrt/LEDE uses bridge interface for LAN network with:
> 1) IFLA_BRPORT_MCAST_TO_UCAST
> 2) Clients isolation in hostapd
> 3) Hairpin mode enabled
> 
> For more details please see Linus's patch description:
> https://patchwork.kernel.org/patch/9530669/
> and maybe hairpin mode patch:
> https://lwn.net/Articles/347344/
> 
> Short version: in that setup packets received from a bridged wireless
> interface can be handled back to it for transmission.
> 
> 
> Now, Broadcom's firmware for their FullMAC chipsets in AP mode
> supports an obsoleted 802.11f AKA IAPP standard. It's a roaming
> standard that was replaced by 802.11r.
> 
> Whenever a new station associates, firmware generates a packet like:
> ff ff ff ff  ff ff ec 10  7b 5f ?? ??  00 06 00 01  af 81 01 00
> (just masked 2 bytes of my MAC)
> 
> For mode details you can see discussion in my brcmfmac patch thread:
> https://patchwork.kernel.org/patch/10191451/
> 
> 
> The problem is that bridge (in setup as above) handles such a packet
> back to the device.
> 
> That makes Broadcom's FullMAC firmware believe that a given station
> just connected to another AP in a network (which doesn't even exist).
> As a result firmware immediately disassociates that station. It's
> simply impossible to connect to the router. Every association is
> followed by immediate disassociation.
> 
> 
> Can you see any solution for this problem? Is that an option to stop
> multicast-to-unicast from touching 802.11f packets? Some other ideas?
> Obviously I can't modify Broadcom's firmware and drop that obsoleted
> standard.
Let's look at it from a different angle: Since these packets are
forwarded as normal packets by the bridge, and the Broadcom firmware
reacts to them in this nasty way, that's basically local DoS security
issue. In my opinion that matters a lot more than having support for an
obsolete feature that almost nobody will ever want to use.

I think the right approach to deal with this issue is to drop these
garbage packets in both the receive and transmit path of brcmfmac.

- Felix

Re: Problem with bridge (mcast-to-ucast + hairpin) and Broadcom's 802.11f in their FullMAC fw

2018-03-13 Thread Felix Fietkau

On 2018-02-27 11:08, Rafał Miłecki wrote:
> I've problem when using OpenWrt/LEDE on a home router with Broadcom's
> FullMAC WiFi chipset.
> 
> 
> First of all OpenWrt/LEDE uses bridge interface for LAN network with:
> 1) IFLA_BRPORT_MCAST_TO_UCAST
> 2) Clients isolation in hostapd
> 3) Hairpin mode enabled
> 
> For more details please see Linus's patch description:
> https://patchwork.kernel.org/patch/9530669/
> and maybe hairpin mode patch:
> https://lwn.net/Articles/347344/
> 
> Short version: in that setup packets received from a bridged wireless
> interface can be handled back to it for transmission.
> 
> 
> Now, Broadcom's firmware for their FullMAC chipsets in AP mode
> supports an obsoleted 802.11f AKA IAPP standard. It's a roaming
> standard that was replaced by 802.11r.
> 
> Whenever a new station associates, firmware generates a packet like:
> ff ff ff ff  ff ff ec 10  7b 5f ?? ??  00 06 00 01  af 81 01 00
> (just masked 2 bytes of my MAC)
> 
> For mode details you can see discussion in my brcmfmac patch thread:
> https://patchwork.kernel.org/patch/10191451/
> 
> 
> The problem is that bridge (in setup as above) handles such a packet
> back to the device.
> 
> That makes Broadcom's FullMAC firmware believe that a given station
> just connected to another AP in a network (which doesn't even exist).
> As a result firmware immediately disassociates that station. It's
> simply impossible to connect to the router. Every association is
> followed by immediate disassociation.
> 
> 
> Can you see any solution for this problem? Is that an option to stop
> multicast-to-unicast from touching 802.11f packets? Some other ideas?
> Obviously I can't modify Broadcom's firmware and drop that obsoleted
> standard.
Let's look at it from a different angle: Since these packets are
forwarded as normal packets by the bridge, and the Broadcom firmware
reacts to them in this nasty way, that's basically local DoS security
issue. In my opinion that matters a lot more than having support for an
obsolete feature that almost nobody will ever want to use.

I think the right approach to deal with this issue is to drop these
garbage packets in both the receive and transmit path of brcmfmac.

- Felix

Re: [PATCH 00/30] Netfilter/IPVS updates for net-next

2018-03-12 Thread Felix Fietkau

On 2018-03-12 21:01, David Miller wrote:
> From: Felix Fietkau <n...@nbd.name>
> Date: Mon, 12 Mar 2018 20:30:01 +0100
> 
>> It's not dead and useless. In its current state, it has a software fast
>> path that significantly improves nftables routing/NAT throughput,
>> especially on embedded devices.
>> On some devices, I've seen "only" 20% throughput improvement (along with
>> CPU usage reduction), on others it's quite a bit lot more. This is
>> without any extra drivers or patches aside from what's posted.
> 
> I wonder if this software fast path has the exploitability problems that
> things like the ipv4 routing cache and the per-cpu flow cache both had.
> And the reason for which both were removed.
> 
> I don't see how you can avoid this problem.
> 
> I'm willing to be shown otherwise :-)
I don't think it suffers from the same issues, and if it does, it's a
lot easier to mitigate. The ruleset can easily be configured to only
offload connections that transferred a certain amount of data, handling
only bulk flows.

It's easy to put an upper limit on the number of offloaded connections,
and there's nothing in the code that just creates an offload entry per
packet or per lookup or something like that.

If you have other concerns, I'm sure we can address them with follow-up
patches, but as it stands, I think the code is already quite useful.

- Felix

Re: [PATCH 00/30] Netfilter/IPVS updates for net-next

2018-03-12 Thread Felix Fietkau

On 2018-03-12 19:58, David Miller wrote:
> From: Pablo Neira Ayuso 
> Date: Mon, 12 Mar 2018 18:58:50 +0100
> 
>> The following patchset contains Netfilter/IPVS updates for your net-next
>> tree. This batch comes with more input sanitization for xtables to
>> address bug reports from fuzzers, preparation works to the flowtable
>> infrastructure and assorted updates. In no particular order, they are:
> 
> Sorry, I've seen enough.  I'm not pulling this.
> 
> What is the story with this flow table stuff?  I tried to ask you
> about this before, but the response I was given was extremely vague
> and did not answer my question at all.
> 
> This is a lot of code, and a lot of infrastructure, yet I see
> no device using the infrastructure to offload conntack.
> 
> Nor can I see how this can possibly be even useful for such an
> application.  What conntrack offload needs are things completely
> outside of what the flow table stuff provides.  Mainly, they
> require that the SKB is completely abstracted away from all of
> the contrack code paths, and that the conntrack infrastructure
> operates on an abstract packet metadata concept.
> 
> If you are targetting one specific piece of hardware with TCAMs
> that you are familiar with.  I'd like you to stop right there.
> Because if that is all that this infrastructure can actually
> be used for, it is definitely designed wrong.
> 
> This, as has been the case in the past, is what is wrong with
> netfilter approach to supporting offloading.  We see all of this
> infrastructure before an actual working use case is provided for a
> specific piece of hardware for a specific driver in the tree.
> 
> Nobody can evaluate whether the approach is good or not without
> a clear driver change implementing support for it.
> 
> No other area of networking puts the cart before the horse like this.
> 
> I do not agree at all with the flow table infrastructure and I
> therefore do not want to pull any more flow table changes into my tree
> until there is an actual user of this stuff in that pull request which
> actually works in a way which is useful for people.  It is completely
> dead and useless code currently.
It's not dead and useless. In its current state, it has a software fast
path that significantly improves nftables routing/NAT throughput,
especially on embedded devices.
On some devices, I've seen "only" 20% throughput improvement (along with
CPU usage reduction), on others it's quite a bit lot more. This is
without any extra drivers or patches aside from what's posted.

Within OpenWrt, I'm working on a patch that makes the same available to
legacy netfilter as well. This is the reason for a lot of the core
refactoring that I did.

Hardware offload is still being worked on, not sure when we will have
the first driver ready. But as it stands now, the code is already very
useful and backported to OpenWrt for testing.

I think that in a couple of weeks this code will be ready to be enabled
by default in OpenWrt, which means that a lot of users' setups will get
a lot faster with no configuration change at all.

- Felix

Re: [PATCH RFC] net_sched/codel: do not defer queue length update

2018-03-10 Thread Felix Fietkau

On 2017-08-21 10:14, Konstantin Khlebnikov wrote:
> When codel wants to drop last packet in ->dequeue() it cannot call
> qdisc_tree_reduce_backlog() right away - it will notify parent qdisc
> about zero qlen and HTB/HFSC will deactivate class. The same class will
> be deactivated second time by caller of ->dequeue(). Currently codel and
> fq_codel defer update. This triggers warning in HFSC when it's qlen != 0
> but there is no active classes.
> 
> This patch update parent queue length immediately: just temporary increase
> qlen around qdisc_tree_reduce_backlog() to prevent first class deactivation
> if we have skb to return.
> 
> This might open another problem in HFSC - now operation peek could fail and
> deactivate parent class.
> 
> Signed-off-by: Konstantin Khlebnikov 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=109581
This has been tested by several people and it seems to be working well.
Please make sure this gets merged soon.

Thanks,

- Felix

Re: [PATCH] ath9k: turn on btcoex_enable as default

2018-02-10 Thread Felix Fietkau

On 2018-02-10 14:56, Kai Heng Feng wrote:
> 
>> On 9 Feb 2018, at 3:16 PM, Kalle Valo  wrote:
>> Sure, but we have to make sure that we don't create regressions on
>> existing systems. For example, did you test this with any system which
>> don't support btcoex? (just asking, haven't tested this myself)
> 
> No not really, but I will definitely test it.
> The only module I have that uses ath9k is Dell’s DW1707.
> How do I check if it support btcoex or not?
I just reviewed the code again, and I am sure that we cannot merge this
patch. Enabling the btcoex parameter makes the driver enable a whole
bunch of code starting timers, listening to some GPIOs, etc.

On non-btcoex systems, some of those GPIOs might be floating or even
connected to different things, which could cause a lot of undefined
behavior.

This is simply too big a risk, so there absolutely needs to be a
whitelist for systems that need this, otherwise it has to remain
disabled by default.

- Felix

Re: [PATCH] ath9k: turn on btcoex_enable as default

2018-02-08 Thread Felix Fietkau

On 2018-02-08 06:28, Kai-Heng Feng wrote:
> Without btcoex_enable, WiFi activies make both WiFi and Bluetooth
> unstable if there's a bluetooth connection.
> 
> Enable this option when bt_ant_diversity is disabled.
> 
> BugLink: https://bugs.launchpad.net/bugs/1746164
> Signed-off-by: Kai-Heng Feng 
I think this might cause regressions on devices that don't have
bluetooth. This probably either needs more EEPROM checks, or something
to selectively enable it only on affected platforms.

- Felix

[PATCH 2/2] netfilter: nf_tables: fix flowtable resource leak

2018-02-05 Thread Felix Fietkau

When cleaning up flowtable entries, associated dst and ct entries need
to be released as well

Signed-off-by: Felix Fietkau <n...@nbd.name>
---
 net/netfilter/nf_flow_table.c | 19 +--
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/net/netfilter/nf_flow_table.c b/net/netfilter/nf_flow_table.c
index 20f86091ab98..9925bd3f93c4 100644
--- a/net/netfilter/nf_flow_table.c
+++ b/net/netfilter/nf_flow_table.c
@@ -124,7 +124,7 @@ void flow_offload_free(struct flow_offload *flow)
dst_release(flow->tuplehash[FLOW_OFFLOAD_DIR_ORIGINAL].tuple.dst_cache);
dst_release(flow->tuplehash[FLOW_OFFLOAD_DIR_REPLY].tuple.dst_cache);
e = container_of(flow, struct flow_offload_entry, flow);
-   kfree(e);
+   kfree_rcu(e, rcu_head);
 }
 EXPORT_SYMBOL_GPL(flow_offload_free);
 
@@ -161,7 +161,9 @@ void flow_offload_del(struct nf_flowtable *flow_table,
   *flow_table->type->params);
 
e = container_of(flow, struct flow_offload_entry, flow);
-   kfree_rcu(e, rcu_head);
+   nf_ct_delete(e->ct, 0, 0);
+   nf_ct_put(e->ct);
+   flow_offload_free(flow);
 }
 EXPORT_SYMBOL_GPL(flow_offload_del);
 
@@ -174,15 +176,6 @@ flow_offload_lookup(struct nf_flowtable *flow_table,
 }
 EXPORT_SYMBOL_GPL(flow_offload_lookup);
 
-static void nf_flow_release_ct(const struct flow_offload *flow)
-{
-   struct flow_offload_entry *e;
-
-   e = container_of(flow, struct flow_offload_entry, flow);
-   nf_ct_delete(e->ct, 0, 0);
-   nf_ct_put(e->ct);
-}
-
 int nf_flow_table_iterate(struct nf_flowtable *flow_table,
  void (*iter)(struct flow_offload *flow, void *data),
  void *data)
@@ -276,10 +269,8 @@ void nf_flow_offload_work_gc(struct work_struct *work)
flow = container_of(tuplehash, struct flow_offload, 
tuplehash[0]);
 
if (nf_flow_has_expired(flow) ||
-   nf_flow_is_dying(flow)) {
+   nf_flow_is_dying(flow))
flow_offload_del(flow_table, flow);
-   nf_flow_release_ct(flow);
-   }
}
 out:
rhashtable_walk_stop();
-- 
2.14.2

[PATCH 1/2] netfilter: nf_tables: fix flowtable free

2018-02-05 Thread Felix Fietkau

Every flow_offload entry is added into the table twice. Because of this,
rhashtable_free_and_destroy can't be used, since it would call kfree for
each flow_offload object twice.

Signed-off-by: Felix Fietkau <n...@nbd.name>
---
 include/net/netfilter/nf_flow_table.h   |  2 ++
 net/ipv4/netfilter/nf_flow_table_ipv4.c |  1 +
 net/ipv6/netfilter/nf_flow_table_ipv6.c |  1 +
 net/netfilter/nf_flow_table.c   | 15 +++
 net/netfilter/nf_flow_table_inet.c  |  1 +
 net/netfilter/nf_tables_api.c   |  8 +---
 6 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/include/net/netfilter/nf_flow_table.h 
b/include/net/netfilter/nf_flow_table.h
index b22b22082733..67c61bda6a46 100644
--- a/include/net/netfilter/nf_flow_table.h
+++ b/include/net/netfilter/nf_flow_table.h
@@ -14,6 +14,7 @@ struct nf_flowtable_type {
struct list_headlist;
int family;
void(*gc)(struct work_struct *work);
+   void(*free)(struct nf_flowtable *table);
const struct rhashtable_params  *params;
nf_hookfn   *hook;
struct module   *owner;
@@ -95,6 +96,7 @@ struct flow_offload_tuple_rhash *flow_offload_lookup(struct 
nf_flowtable *flow_t
 int nf_flow_table_iterate(struct nf_flowtable *flow_table,
  void (*iter)(struct flow_offload *flow, void *data),
  void *data);
+void nf_flow_table_free(struct nf_flowtable *flow_table);
 void nf_flow_offload_work_gc(struct work_struct *work);
 extern const struct rhashtable_params nf_flow_offload_rhash_params;
 
diff --git a/net/ipv4/netfilter/nf_flow_table_ipv4.c 
b/net/ipv4/netfilter/nf_flow_table_ipv4.c
index b2d01eb25f2c..25d2975da156 100644
--- a/net/ipv4/netfilter/nf_flow_table_ipv4.c
+++ b/net/ipv4/netfilter/nf_flow_table_ipv4.c
@@ -260,6 +260,7 @@ static struct nf_flowtable_type flowtable_ipv4 = {
.family = NFPROTO_IPV4,
.params = _flow_offload_rhash_params,
.gc = nf_flow_offload_work_gc,
+   .free   = nf_flow_table_free,
.hook   = nf_flow_offload_ip_hook,
.owner  = THIS_MODULE,
 };
diff --git a/net/ipv6/netfilter/nf_flow_table_ipv6.c 
b/net/ipv6/netfilter/nf_flow_table_ipv6.c
index fff21602875a..d346705d6ee6 100644
--- a/net/ipv6/netfilter/nf_flow_table_ipv6.c
+++ b/net/ipv6/netfilter/nf_flow_table_ipv6.c
@@ -253,6 +253,7 @@ static struct nf_flowtable_type flowtable_ipv6 = {
.family = NFPROTO_IPV6,
.params = _flow_offload_rhash_params,
.gc = nf_flow_offload_work_gc,
+   .free   = nf_flow_table_free,
.hook   = nf_flow_offload_ipv6_hook,
.owner  = THIS_MODULE,
 };
diff --git a/net/netfilter/nf_flow_table.c b/net/netfilter/nf_flow_table.c
index 2f5099cb85b8..20f86091ab98 100644
--- a/net/netfilter/nf_flow_table.c
+++ b/net/netfilter/nf_flow_table.c
@@ -221,6 +221,21 @@ int nf_flow_table_iterate(struct nf_flowtable *flow_table,
 }
 EXPORT_SYMBOL_GPL(nf_flow_table_iterate);
 
+static void
+__nf_flow_offload_free(struct flow_offload *flow, void *data)
+{
+   struct nf_flowtable *flow_table = data;
+
+   flow_offload_del(flow_table, flow);
+}
+
+void nf_flow_table_free(struct nf_flowtable *flow_table)
+{
+   nf_flow_table_iterate(flow_table, __nf_flow_offload_free, flow_table);
+   rhashtable_destroy(_table->rhashtable);
+}
+EXPORT_SYMBOL_GPL(nf_flow_table_free);
+
 static inline bool nf_flow_has_expired(const struct flow_offload *flow)
 {
return (__s32)(flow->timeout - (u32)jiffies) <= 0;
diff --git a/net/netfilter/nf_flow_table_inet.c 
b/net/netfilter/nf_flow_table_inet.c
index 281209aeba8f..375a1881d93d 100644
--- a/net/netfilter/nf_flow_table_inet.c
+++ b/net/netfilter/nf_flow_table_inet.c
@@ -24,6 +24,7 @@ static struct nf_flowtable_type flowtable_inet = {
.family = NFPROTO_INET,
.params = _flow_offload_rhash_params,
.gc = nf_flow_offload_work_gc,
+   .free   = nf_flow_table_free,
.hook   = nf_flow_offload_inet_hook,
.owner  = THIS_MODULE,
 };
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 0791813a1e7d..a6c4747c7d5d 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -5399,17 +5399,11 @@ static void nf_tables_flowtable_notify(struct nft_ctx 
*ctx,
nfnetlink_set_err(ctx->net, ctx->portid, NFNLGRP_NFTABLES, -ENOBUFS);
 }
 
-static void nft_flowtable_destroy(void *ptr, void *arg)
-{
-   kfree(ptr);
-}
-
 static void nf_tables_flowtable_destroy(struct nft_flowtable *flowtable)
 {
cancel_delayed_work_sync(>data.gc_work);
kfree(flowtable->name);
-   rhashtable_free_a

Re: [PATCH] net: igmp: fix source address check for IGMPv3 reports

2018-01-23 Thread Felix Fietkau

On 2018-01-22 22:17, David Miller wrote:
> From: Felix Fietkau <n...@nbd.name>
> Date: Fri, 19 Jan 2018 11:50:46 +0100
> 
>> Commit "net: igmp: Use correct source address on IGMPv3 reports"
>> introduced a check to validate the source address of locally generated
>> IGMPv3 packets.
>> Instead of checking the local interface address directly, it uses
>> inet_ifa_match(fl4->saddr, ifa), which checks if the address is on the
>> local subnet (or equal to the point-to-point address if used).
>> 
>> This breaks for point-to-point interfaces, so check against
>> ifa->ifa_local directly.
>> 
>> Cc: Kevin Cernekee <cerne...@chromium.org>
>> Fixes: a46182b00290 ("net: igmp: Use correct source address on IGMPv3 
>> reports")
>> Reported-by: Sebastian Gottschall <s.gottsch...@dd-wrt.com>
>> Signed-off-by: Felix Fietkau <n...@nbd.name>
> 
> Applied, thanks.
Thanks. Please queue it up for stable as well, since the commit fixed by
this patch has already made it to stable kernels.

- Felix

[PATCH] net: igmp: fix source address check for IGMPv3 reports

2018-01-19 Thread Felix Fietkau

Commit "net: igmp: Use correct source address on IGMPv3 reports"
introduced a check to validate the source address of locally generated
IGMPv3 packets.
Instead of checking the local interface address directly, it uses
inet_ifa_match(fl4->saddr, ifa), which checks if the address is on the
local subnet (or equal to the point-to-point address if used).

This breaks for point-to-point interfaces, so check against
ifa->ifa_local directly.

Cc: Kevin Cernekee <cerne...@chromium.org>
Fixes: a46182b00290 ("net: igmp: Use correct source address on IGMPv3 reports")
Reported-by: Sebastian Gottschall <s.gottsch...@dd-wrt.com>
Signed-off-by: Felix Fietkau <n...@nbd.name>
---
 net/ipv4/igmp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 726f6b608274..2d49717a7421 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -332,7 +332,7 @@ static __be32 igmpv3_get_srcaddr(struct net_device *dev,
return htonl(INADDR_ANY);
 
for_ifa(in_dev) {
-   if (inet_ifa_match(fl4->saddr, ifa))
+   if (fl4->saddr == ifa->ifa_local)
return fl4->saddr;
} endfor_ifa(in_dev);
 
-- 
2.14.2

Re: [PATCH] [wireless-next] mt76: fix building without CONFIG_LEDS_CLASS

2018-01-18 Thread Felix Fietkau

On 2018-01-18 14:14, Arnd Bergmann wrote:
> When CONFIG_LEDS_CLASS is disabled, or it is a loadable module while
> mt76 is built-in, we run into a link error:
> 
> drivers/net/wireless/mediatek/mt76/mac80211.o: In function 
> `mt76_register_device':
> mac80211.c:(.text+0xb78): relocation truncated to fit: R_AARCH64_CALL26 
> against undefined symbol `devm_of_led_classdev_register'
> 
> We don't really need a hard dependency here as the driver can presumably
> work just fine without LEDs, so this follows the iwlwifi example and
> adds a separate Kconfig option for the LED support, this will be available
> whenever it will link, and otherwise the respective code gets left out from
> the driver object.
> 
> Fixes: 17f1de56df05 ("mt76: add common code shared between multiple chipsets")
> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/net/wireless/mediatek/mt76/Kconfig   | 5 +
>  drivers/net/wireless/mediatek/mt76/mac80211.c| 8 +---
>  drivers/net/wireless/mediatek/mt76/mt76x2_init.c | 6 --
>  3 files changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/wireless/mediatek/mt76/Kconfig 
> b/drivers/net/wireless/mediatek/mt76/Kconfig
> index fc05d79c80d0..9d12c1f5284e 100644
> --- a/drivers/net/wireless/mediatek/mt76/Kconfig
> +++ b/drivers/net/wireless/mediatek/mt76/Kconfig
> @@ -8,3 +8,8 @@ config MT76x2E
>   depends on PCI
>   ---help---
> This adds support for MT7612/MT7602/MT7662-based wireless PCIe 
> devices.
> +
> +config MT76_LEDS
> + bool "LED support for MT76"
> + depends on MT76_CORE
> + depends on LEDS_CLASS=y || MT76_CORE=LEDS_CLASS
I think this should have a 'default y'

- Felix

Re: [PATCH 1/3] kallsyms: don't leak address when symbol not found

2017-12-18 Thread Felix Fietkau

On 2017-12-18 00:53, Tobin C. Harding wrote:
> Currently if kallsyms_lookup() fails to find the symbol then the address
> is printed. This potentially leaks sensitive information. Instead of
> printing the address we can return an error, giving the calling code the
> option to print the address or print some sanitized message.
> 
> Return error instead of printing address to argument buffer. Leave
> buffer in a sane state.
> 
> Signed-off-by: Tobin C. Harding 
I think there should be a way to keep the old behavior for debugging.

- Felix

Re: [PATCH] mt76: fix memcpy to potential null pointer on failed allocation

2017-12-14 Thread Felix Fietkau

On 2017-12-14 11:13, Colin King wrote:
> From: Colin Ian King <colin.k...@canonical.com>
> 
> Currently if the allocation of skb fails and returns NULL then the
> call to skb_put will cause a null pointer dereference. Fix this by
> checking for a null skb and returning NULL.  Note that calls to
> function mt76x2_mcu_msg_alloc don't directly check the null return
> but instead pass the NULL pointer to mt76x2_mcu_msg_send which
> checks for the NULL and returns ENOMEM in this case.
> 
> Detected by CoverityScan, CID#1462624 ("Dereference null return value")
> 
> Fixes: 7bc04215a66b ("mt76: add driver code for MT76x2e")
> Signed-off-by: Colin Ian King <colin.k...@canonical.com>
Acked-by: Felix Fietkau <n...@nbd.name>

Re: [PATCH net-next 1/3] net: dsa: mediatek: add VLAN support for MT7530

2017-12-12 Thread Felix Fietkau

On 2017-12-07 07:06, sean.w...@mediatek.com wrote:
> From: Sean Wang 
> 
> MT7530 can treat each port as either VLAN-unware port or VLAN-ware port
Shouldn't that be VLAN-unaware/VLAN-aware (in the code as well)?

- Felix

Re: [PATCH v2 net-next] tcp: allow drivers to tweak TSQ logic

2017-11-28 Thread Felix Fietkau

On 2017-11-12 00:54, Eric Dumazet wrote:
> From: Eric Dumazet 
> 
> I had many reports that TSQ logic breaks wifi aggregation.
> 
> Current logic is to allow up to 1 ms of bytes to be queued into qdisc
> and drivers queues.
> 
> But Wifi aggregation needs a bigger budget to allow bigger rates to
> be discovered by various TCP Congestion Controls algorithms.
> 
> This patch adds an extra socket field, allowing wifi drivers to select
> another log scale to derive TCP Small Queue credit from current pacing
> rate.
> 
> Initial value is 10, meaning that this patch does not change current
> behavior.
> 
> We expect wifi drivers to set this field to smaller values (tests have
> been done with values from 6 to 9)
> 
> They would have to use following template :
> 
> if (skb->sk && skb->sk->sk_pacing_shift != MY_PACING_SHIFT)
>  skb->sk->sk_pacing_shift = MY_PACING_SHIFT;
I did some experiments with this approach (with your patch backported to
a 4.9 kernel), and I got some crashes.
After looking at the crashes and code some more, it seems that this
would need some extra checks to ensure that skb->sk is a full struct
sock, instead of just a struct request_sock.
Should this be done by checking for skb->sk->sk_state ==
TCP_ESTABLISHED? It seems to me that this might introduce some extra
overhead.

- Felix

Re: [PATCH RFC,WIP 5/5] netfilter: nft_flow_offload: add ndo hooks for hardware offload

2017-11-11 Thread Felix Fietkau

On 2017-11-03 16:26, Pablo Neira Ayuso wrote:
> This patch adds the infrastructure to offload flows to hardware, in case
> the nic/switch comes with built-in flow tables capabilities.
> 
> If the hardware comes with not hardware flow tables or they have
> limitations in terms of features, this falls back to the software
> generic flow table implementation.
> 
> The software flow table aging thread skips entries that resides in the
> hardware, so the hardware will be responsible for releasing this flow
> table entry too.
> 
> Signed-off-by: Pablo Neira Ayuso 
Hi Pablo,

I'd like to start playing with those patches in OpenWrt/LEDE soon. I'm
also considering making a patch that adds iptables support.
For that to work, I think it would be a good idea to keep the code that
tries to offload flows to hardware in nf_flow_offload.c instead, so that
it can be shared with iptables integration.

By the way, do you have a git tree where you keep the current version of
your patch set?

Thanks,

- Felix

Re: [PATCH] cfg80211: Fix array-bounds warning in fragment copy

2017-03-27 Thread Felix Fietkau

On 2017-03-27 12:47, Johannes Berg wrote:
> On Fri, 2017-03-24 at 18:06 -0700, Matthias Kaehlcke wrote:
>> __ieee80211_amsdu_copy_frag intentionally initializes a pointer to
>> array[-1] to increment it later to valid values. clang rightfully
>> generates an array-bounds warning on the initialization statement.
>> Work around this by initializing the pointer to array[0] and
>> decrementing it later, which allows to leave the rest of the
>> algorithm untouched.
>> 
>> Signed-off-by: Matthias Kaehlcke <m...@chromium.org>
>> ---
>>  net/wireless/util.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>> 
>> diff --git a/net/wireless/util.c b/net/wireless/util.c
>> index 68e5f2ecee1a..d3d459e4a070 100644
>> --- a/net/wireless/util.c
>> +++ b/net/wireless/util.c
>> @@ -659,7 +659,7 @@ __ieee80211_amsdu_copy_frag(struct sk_buff *skb,
>> struct sk_buff *frame,
>>  int offset, int len)
>>  {
>>  struct skb_shared_info *sh = skb_shinfo(skb);
>> -const skb_frag_t *frag = >frags[-1];
>> +const skb_frag_t *frag = >frags[0];
>>  struct page *frag_page;
>>  void *frag_ptr;
>>  int frag_len, frag_size;
>> @@ -669,6 +669,7 @@ __ieee80211_amsdu_copy_frag(struct sk_buff *skb,
>> struct sk_buff *frame,
>>  frag_page = virt_to_head_page(skb->head);
>>  frag_ptr = skb->data;
>>  frag_size = head_size;
>> +frag--;
> 
> Isn't it just a question of time until the compiler will see through
> this trick and warn about it?
Frag is incremented again before being accessed, so there is nothing for
the compiler to see through here.

Acked-by: Felix Fietkau <n...@nbd.name>

- Felix

Re: [PATCH net-next v2 5/5] net-next: dsa: add dsa support for Mediatek MT7530 switch

2017-03-23 Thread Felix Fietkau

On 2017-03-23 15:25, John Crispin wrote:
> 
> 
> On 23/03/17 15:09, Felix Fietkau wrote:
>> On 2017-03-23 09:06, Sean Wang wrote:
>>> Hi Andrew,
>>>
>>> The purpose for the regmap table registered is to
>>>
>>> provide a way which helps us to look up a specific
>>>
>>> register on the switch through regmap-debugfs.
>>>
>>>
>>> And not all ranges of register is defined
>>>
>>> so I only include the meaningful ones in a sparse way
>>>
>>> for the table.
>> I think in that case it might be nice to make regmap support optional in
>> order to avoid pulling in bloat on platforms that don't need it.
>>
>> - Felix
>>
> The 2 relevant platforms are mips/ralink and arm/mediatek. both require 
> regmap for the eth_sysctl syscon if they want to utilize the mtk_soc_eth 
> driver which is a prereq for mt7530. so regmap cannot be optional here.
Makes sense, thanks.

- Felix

Re: [PATCH net-next v2 5/5] net-next: dsa: add dsa support for Mediatek MT7530 switch

2017-03-23 Thread Felix Fietkau

On 2017-03-23 09:06, Sean Wang wrote:
> Hi Andrew,
> 
> The purpose for the regmap table registered is to 
> 
> provide a way which helps us to look up a specific 
> 
> register on the switch through regmap-debugfs.
> 
> 
> And not all ranges of register is defined
> 
> so I only include the meaningful ones in a sparse way 
> 
> for the table.
I think in that case it might be nice to make regmap support optional in
order to avoid pulling in bloat on platforms that don't need it.

- Felix

Re: [PATCH 1/3] net: bgmac: allocate struct bgmac just once & don't copy it

2017-01-27 Thread Felix Fietkau

On 2017-01-27 10:20, Rafał Miłecki wrote:
> From: Rafał Miłecki 
> 
> To share as much code as possible in bgmac we call alloc_etherdev from
> bgmac.c which is used by both: platform and bcma code. The easiest
> solution was to use it for allocating whole struct bgmac but it doesn't
> work well as we already get early-filled struct bgmac as an argument.
> 
> So far we were solving this by copying received struct into newly
> allocated one. The problem is it means storing 2 allocated structs,
> using only 1 of them and non-shared code not having access to it.
> 
> This patch solves it by using alloc_etherdev to allocate *pointer* for
> the already allocated struct. The only downside of this is we have to be
> careful when using netdev_priv.
> 
> Another solution was to call alloc_etherdev in platform/bcma specific
> code but Jon advised against it due to sharing less code that way.
How does that lead to sharing less code?
I find this pointer indirection rather ugly and uncommon, and I think it
would be much cleaner to split the probe into bgmac_enet_alloc and
bgmac_enet_probe (with bgmac_enet_alloc calling alloc_etherdev and doing
basic setup).

- Felix

Re: [net, 6/6] net: korina: version bump

2017-01-22 Thread Felix Fietkau

On 2017-01-22 13:10, Roman Yeryomin wrote:
> On 17 January 2017 at 21:19, Roman Yeryomin <leroi.li...@gmail.com> wrote:
>> On 17 January 2017 at 20:55, Felix Fietkau <n...@nbd.name> wrote:
>>> On 2017-01-17 18:33, Roman Yeryomin wrote:
>>>> Signed-off-by: Roman Yeryomin <ro...@advem.lv>
>>>> ---
>>>>  drivers/net/ethernet/korina.c | 4 ++--
>>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/net/ethernet/korina.c b/drivers/net/ethernet/korina.c
>>>> index 83c994f..c8fed01 100644
>>>> --- a/drivers/net/ethernet/korina.c
>>>> +++ b/drivers/net/ethernet/korina.c
>>>> @@ -66,8 +66,8 @@
>>>>  #include 
>>>>
>>>>  #define DRV_NAME "korina"
>>>> -#define DRV_VERSION  "0.10"
>>>> -#define DRV_RELDATE  "04Mar2008"
>>>> +#define DRV_VERSION  "0.20"
>>>> +#define DRV_RELDATE  "15Jan2017"
>>> I think it would make more sense to remove this version instead of
>>> bumping it. Individual driver versions are rather pointless, the kernel
>>> version is more meaningful anyway.
>>
>> OK, makes sense
> 
> Actually, after thinking a bit more about this, not really...
> How about ethtool, which uses driver name and version?
> I see most ethernet drivers define some version. And it's pretty
> useful, when using backports.
> IMO, it should be kept and bumped.
I don't really care, I just wanted to point out that the exact kernel
version is a much more useful indicator, especially since not all patch
submitters do the useless version bump dance.

- Felix

Re: [net, 6/6] net: korina: version bump

2017-01-17 Thread Felix Fietkau

On 2017-01-17 18:33, Roman Yeryomin wrote:
> Signed-off-by: Roman Yeryomin 
> ---
>  drivers/net/ethernet/korina.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/korina.c b/drivers/net/ethernet/korina.c
> index 83c994f..c8fed01 100644
> --- a/drivers/net/ethernet/korina.c
> +++ b/drivers/net/ethernet/korina.c
> @@ -66,8 +66,8 @@
>  #include 
>  
>  #define DRV_NAME "korina"
> -#define DRV_VERSION  "0.10"
> -#define DRV_RELDATE  "04Mar2008"
> +#define DRV_VERSION  "0.20"
> +#define DRV_RELDATE  "15Jan2017"
I think it would make more sense to remove this version instead of
bumping it. Individual driver versions are rather pointless, the kernel
version is more meaningful anyway.

- Felix

Re: [PATCH net-next] bridge: multicast to unicast

2017-01-11 Thread Felix Fietkau

On 2017-01-11 13:15, IgorMitsyanko wrote:
> On 01/11/2017 02:30 PM, Felix Fietkau wrote:
>> On 2017-01-11 12:26, IgorMitsyanko wrote:
>>> On 01/11/2017 12:27 AM, Felix Fietkau wrote:
>>>> On 2017-01-10 11:56, Johannes Berg wrote:
>>>>> On Tue, 2017-01-10 at 05:18 +0100, Linus Lüssing wrote:
>>>>>> On Mon, Jan 09, 2017 at 01:30:32PM -0800, Stephen Hemminger wrote:
>>>>>>> I wonder if MAC80211 should be doing IGMP snooping and not bridge
>>>>>>> in this environment.
>>>>>> In the long term, yes. For now, not quite sure.
>>>>> There's no "for now" in the kernel. Code added now will have to be
>>>>> maintained essentially forever.
>>>> I'm not sure that putting the IGMP snooping code in mac80211 is a good
>>>> idea, that would be quite a bit of code duplication.
>>>> This implementation works, it's very simple, and it's quite flexible for
>>>> a number of use cases.
>>>>
>>>> Is there any remaining objection to merging this in principle (aside
>>>> from potential issues with the code)?
>>>>
>>>> - Felix
>>>>
>>>
>>> Hi Felix, can we consider two examples configurations with multicast
>>> traffic:
>>>
>>> 1. AP is a source of multicast traffic itself, no bridge on AP. For
>>> example, wireless video server streaming to several clients.
>>> In this situation, we can not make use of possible advantages given by
>>> mc-to-uc conversion?
>> You could simply put the AP interface in a bridge, no need to have any
>> other bridge members present.
>>
>>> 2. A configuration with AP + STA + 3 client devices behind STA.
>>>   |client 1|
>>>  |
>>> |  mc  ||AP||STA|---|---|client 2|
>>> |server||
>>>   |client 3|
>>>
>>> Multicast server behind AP streams MC video traffic. All 3 clients
>>> behind the STA have joined the multicast group.
>>> I'm not sure if this case will be handled correctly with mc-to-uc
>>> conversion in bridge on AP?
>> What do you mean by "3 client devices behind STA"? Are you using a
>> 4-addr STA, multicast routing, or some kind of vendor specific "client
>> bridge" hackery?
> 
> 3 client devices connected by backbone Ethernet network. Generic
> case is probably STA/AP operating in 4-addr mode (more or less standard
> solution as far as I know).
If the AP is running in 4-addr mode, it will need to have a bridge
interface anyway, because the link to the STA will be split out into a
separate virtual interface (AP_VLAN iftype).

In this case you don't actually need any multicast-to-unicast
conversion, because the multicast traffic will be unicast on 802.11
already (due to use of 4-addr mode).

- Felix

Re: [PATCH net-next] bridge: multicast to unicast

2017-01-11 Thread Felix Fietkau

On 2017-01-11 12:26, IgorMitsyanko wrote:
> On 01/11/2017 12:27 AM, Felix Fietkau wrote:
>> On 2017-01-10 11:56, Johannes Berg wrote:
>>> On Tue, 2017-01-10 at 05:18 +0100, Linus Lüssing wrote:
>>>> On Mon, Jan 09, 2017 at 01:30:32PM -0800, Stephen Hemminger wrote:
>>>>> I wonder if MAC80211 should be doing IGMP snooping and not bridge
>>>>> in this environment.
>>>>
>>>> In the long term, yes. For now, not quite sure.
>>>
>>> There's no "for now" in the kernel. Code added now will have to be
>>> maintained essentially forever.
>> I'm not sure that putting the IGMP snooping code in mac80211 is a good
>> idea, that would be quite a bit of code duplication.
>> This implementation works, it's very simple, and it's quite flexible for
>> a number of use cases.
>>
>> Is there any remaining objection to merging this in principle (aside
>> from potential issues with the code)?
>>
>> - Felix
>>
> 
> 
> Hi Felix, can we consider two examples configurations with multicast 
> traffic:
> 
> 1. AP is a source of multicast traffic itself, no bridge on AP. For 
> example, wireless video server streaming to several clients.
> In this situation, we can not make use of possible advantages given by 
> mc-to-uc conversion?
You could simply put the AP interface in a bridge, no need to have any
other bridge members present.

> 2. A configuration with AP + STA + 3 client devices behind STA.
>  |client 1|
>  |
> |  mc  ||AP||STA|---|---|client 2|
> |server||
>  |client 3|
> 
> Multicast server behind AP streams MC video traffic. All 3 clients 
> behind the STA have joined the multicast group.
> I'm not sure if this case will be handled correctly with mc-to-uc 
> conversion in bridge on AP?
What do you mean by "3 client devices behind STA"? Are you using a
4-addr STA, multicast routing, or some kind of vendor specific "client
bridge" hackery?

- Felix

Re: [PATCH net-next] bridge: multicast to unicast

2017-01-10 Thread Felix Fietkau

On 2017-01-10 11:56, Johannes Berg wrote:
> On Tue, 2017-01-10 at 05:18 +0100, Linus Lüssing wrote:
>> On Mon, Jan 09, 2017 at 01:30:32PM -0800, Stephen Hemminger wrote:
>> > I wonder if MAC80211 should be doing IGMP snooping and not bridge
>> > in this environment.
>> 
>> In the long term, yes. For now, not quite sure.
> 
> There's no "for now" in the kernel. Code added now will have to be
> maintained essentially forever.
I'm not sure that putting the IGMP snooping code in mac80211 is a good
idea, that would be quite a bit of code duplication.
This implementation works, it's very simple, and it's quite flexible for
a number of use cases.

Is there any remaining objection to merging this in principle (aside
from potential issues with the code)?

- Felix

Re: [PATCH net-next] bridge: multicast to unicast

2017-01-10 Thread Felix Fietkau

On 2017-01-10 18:17, Dave Taht wrote:
> In the case of wifi I have 3 issues with this line of thought.
> 
> multicast in wifi has generally supposed to be unreliable. This makes
> it reliable. reliability comes at a cost -
> 
> multicast is typically set at a fixed low rate today. unicast is
> retried at different rates until it succeeds - for every station
> listening. If one station is already at the lowest rate, the total
> cost of the transmit increases, rather than decreases.
> 
> unicast gets block acks until it succeeds. Again, more delay.
> 
> I think there is something like 31 soft-retries in the ath9k driver
If I remember correctly, hardware retries are counted here as well.

> what happens to diffserv markings here? for unicast CS1 goes into the
> BE queue, CS6, the VO queue. Do we go from one flat queue for all of
> multicast to punching it through one of the hardware queues based on
> the diffserv mark now with this patch?
> 
> I would like it if there was a way to preserve the unreliability
> (which multiple mesh protocols depend on), send stuff with QoSNoack,
> etc - or dynamically choose (based on the rates of the stations)
> between conventional multicast and unicast.
> 
> Or - better, IMHO, keep sending multicast as is but pick the best of
> the rates available to all the listening stations for it.
The advantage of the multicast-to-unicast conversion goes beyond simply
selecting a better rate - aggregation matters a lot as well, and that is
simply incompatible with normal multicast.

Some multicast streams use lots of small-ish packets, the airtime impact
of those is vastly reduced, even if the transmission has to be
duplicated for a few stations.

- Felix

Re: [PATCH net-next] bridge: multicast to unicast

2017-01-06 Thread Felix Fietkau

On 2017-01-06 14:54, Johannes Berg wrote:
> 
>> The bridge layer can use IGMP snooping to ensure that the multicast
>> stream is only transmitted to clients that are actually a member of
>> the group. Can the mac80211 feature do the same?
> 
> No, it'll convert the packet for all clients that are behind that
> netdev. But that's an argument for dropping the mac80211 feature, which
> hasn't been merged upstream yet, no?
Right.

- Felix

Re: [PATCH net-next] bridge: multicast to unicast

2017-01-06 Thread Felix Fietkau

On 2017-01-06 13:47, Johannes Berg wrote:
> On Mon, 2017-01-02 at 20:32 +0100, Linus Lüssing wrote:
>> Implements an optional, per bridge port flag and feature to deliver
>> multicast packets to any host on the according port via unicast
>> individually. This is done by copying the packet per host and
>> changing the multicast destination MAC to a unicast one accordingly.
> 
> How does this compare and/or relate to the multicast-to-unicast feature
> we were going to add to the wifi stack, particularly mac80211? Do we
> perhaps not need that feature at all, if bridging will have it?
> 
> I suppose that the feature there could apply also to locally generated
> traffic when the AP interface isn't in a bridge, but I think I could
> live with requiring the AP to be put into a bridge to achieve a similar
> configuration?
> 
> Additionally, on an unrelated note, this seems to apply generically to
> all kinds of frames, losing information by replacing the address.
> Shouldn't it have similar limitations as the wifi stack feature has
> then, like only applying to ARP, IPv4, IPv6 and not general protocols?
> 
> Also, it should probably come with the same caveat as we documented for
> the wifi feature:
> 
> Note that this may break certain expectations of the receiver,
> such as the ability to drop unicast IP packets received within
> multicast L2 frames, or the ability to not send ICMP destination
> unreachable messages for packets received in L2 multicast (which
> is required, but the receiver can't tell the difference if this
> new option is enabled.)
> 
> 
> I'll hold off sending my tree in until we see that we really need both
> features, or decide that we want the wifi feature *instead* of the
> bridge feature.
The bridge layer can use IGMP snooping to ensure that the multicast
stream is only transmitted to clients that are actually a member of the
group. Can the mac80211 feature do the same?

- Felix

Re: [PATCH net-next] bridge: multicast to unicast

2017-01-03 Thread Felix Fietkau

On 2017-01-02 20:32, Linus Lüssing wrote:
> Implements an optional, per bridge port flag and feature to deliver
> multicast packets to any host on the according port via unicast
> individually. This is done by copying the packet per host and
> changing the multicast destination MAC to a unicast one accordingly.
> 
> multicast-to-unicast works on top of the multicast snooping feature of
> the bridge. Which means unicast copies are only delivered to hosts which
> are interested in it and signalized this via IGMP/MLD reports
> previously.
> 
> This feature is intended for interface types which have a more reliable
> and/or efficient way to deliver unicast packets than broadcast ones
> (e.g. wifi).
> 
> However, it should only be enabled on interfaces where no IGMPv2/MLDv1
> report suppression takes place. This feature is disabled by default.
> 
> The initial patch and idea is from Felix Fietkau.
> 
> Cc: Felix Fietkau <n...@nbd.name>
> Signed-off-by: Linus Lüssing <linus.luess...@c0d3.blue>
Please add Signed-off-by: Felix Fietkau <n...@nbd.name>
in the next version, and maybe also From:

Thanks,

- Felix

Re: Misalignment, MIPS, and ip_hdr(skb)->version

2016-12-10 Thread Felix Fietkau

On 2016-12-10 21:32, Måns Rullgård wrote:
> Felix Fietkau <n...@nbd.name> writes:
> 
>> On 2016-12-10 14:25, Måns Rullgård wrote:
>>> Felix Fietkau <n...@nbd.name> writes:
>>> 
>>>> On 2016-12-07 19:54, Jason A. Donenfeld wrote:
>>>>> On Wed, Dec 7, 2016 at 7:51 PM, David Miller <da...@davemloft.net> wrote:
>>>>>> It's so much better to analyze properly where the misalignment comes from
>>>>>> and address it at the source, as we have for various cases that trip up
>>>>>> Sparc too.
>>>>> 
>>>>> That's sort of my attitude too, hence starting this thread. Any
>>>>> pointers you have about this would be most welcome, so as not to
>>>>> perpetuate what already seems like an issue in other parts of the
>>>>> stack.
>>>> Hi Jason,
>>>>
>>>> I'm the author of that hackish LEDE/OpenWrt patch that works around the
>>>> misalignment issues. Here's some context regarding that patch:
>>>>
>>>> I intentionally put it in the target specific patches for only one of
>>>> our MIPS targets. There are a few ar71xx devices where the misalignment
>>>> cannot be fixed, because the Ethernet MAC has a 4-byte DMA alignment
>>>> requirement, and does not support inserting 2 bytes of padding to
>>>> correct the IP header misalignment.
>>>>
>>>> With these limitations the choice was between this ugly network stack
>>>> patch or inserting a very expensive memmove in the data path (which is
>>>> better than taking the mis-alignment traps, but still hurts routing
>>>> performance significantly).
>>> 
>>> I solved this problem in an Ethernet driver by copying the initial part
>>> of the packet to an aligned skb and appending the remainder using
>>> skb_add_rx_frag().  The kernel network stack only cares about the
>>> headers, so the alignment of the packet payload doesn't matter.
>>
>> I considered that as well, but it's bad for routing performance if the
>> ethernet MAC does not support scatter/gather for xmit.
>> Unfortunately that limitation is quite common on embedded hardware.
> 
> Yes, I can see that being an issue.  However, if you're doing zero-copy
> routing, the header part of the original buffer should still be there,
> unused, so you could presumably copy the header of the outgoing packet
> there and then do dma as usual.  Maybe there's something in the network
> stack that makes this impossible though.
That still puts more pressure on the ridiculously small dcache sizes
that are typical for embedded MIPS routers.

- Felix

Re: Misalignment, MIPS, and ip_hdr(skb)->version

2016-12-10 Thread Felix Fietkau

On 2016-12-10 14:25, Måns Rullgård wrote:
> Felix Fietkau <n...@nbd.name> writes:
> 
>> On 2016-12-07 19:54, Jason A. Donenfeld wrote:
>>> On Wed, Dec 7, 2016 at 7:51 PM, David Miller <da...@davemloft.net> wrote:
>>>> It's so much better to analyze properly where the misalignment comes from
>>>> and address it at the source, as we have for various cases that trip up
>>>> Sparc too.
>>> 
>>> That's sort of my attitude too, hence starting this thread. Any
>>> pointers you have about this would be most welcome, so as not to
>>> perpetuate what already seems like an issue in other parts of the
>>> stack.
>> Hi Jason,
>>
>> I'm the author of that hackish LEDE/OpenWrt patch that works around the
>> misalignment issues. Here's some context regarding that patch:
>>
>> I intentionally put it in the target specific patches for only one of
>> our MIPS targets. There are a few ar71xx devices where the misalignment
>> cannot be fixed, because the Ethernet MAC has a 4-byte DMA alignment
>> requirement, and does not support inserting 2 bytes of padding to
>> correct the IP header misalignment.
>>
>> With these limitations the choice was between this ugly network stack
>> patch or inserting a very expensive memmove in the data path (which is
>> better than taking the mis-alignment traps, but still hurts routing
>> performance significantly).
> 
> I solved this problem in an Ethernet driver by copying the initial part
> of the packet to an aligned skb and appending the remainder using
> skb_add_rx_frag().  The kernel network stack only cares about the
> headers, so the alignment of the packet payload doesn't matter.
I considered that as well, but it's bad for routing performance if the
ethernet MAC does not support scatter/gather for xmit.
Unfortunately that limitation is quite common on embedded hardware.

- Felix

Re: Misalignment, MIPS, and ip_hdr(skb)->version

2016-12-10 Thread Felix Fietkau

On 2016-12-07 19:54, Jason A. Donenfeld wrote:
> On Wed, Dec 7, 2016 at 7:51 PM, David Miller  wrote:
>> It's so much better to analyze properly where the misalignment comes from
>> and address it at the source, as we have for various cases that trip up
>> Sparc too.
> 
> That's sort of my attitude too, hence starting this thread. Any
> pointers you have about this would be most welcome, so as not to
> perpetuate what already seems like an issue in other parts of the
> stack.
Hi Jason,

I'm the author of that hackish LEDE/OpenWrt patch that works around the
misalignment issues. Here's some context regarding that patch:

I intentionally put it in the target specific patches for only one of
our MIPS targets. There are a few ar71xx devices where the misalignment
cannot be fixed, because the Ethernet MAC has a 4-byte DMA alignment
requirement, and does not support inserting 2 bytes of padding to
correct the IP header misalignment.

With these limitations the choice was between this ugly network stack
patch or inserting a very expensive memmove in the data path (which is
better than taking the mis-alignment traps, but still hurts routing
performance significantly).

There are a lot of places in the network stack that assume full 32 bit
alignment, and you only get to see those once you start using more of
netfilter, play with various tunnel encapsulations, etc.

I think you have 3 options to deal with this properly:
1. add 3 bytes of padding
2. allocate a separate skb for decryption (might be more expensive)
3. save the header and decrypt to the start of the packet data
(overwriting the misaligned header).

I'm not sure what the performance impact of 2 and 3 is, so it's probably
best to stick with the padding.

I've taken a quick look at the wireguard message headers, and my
recommendation would be to insert the 3-byte padding in struct
message_header and remove __packed from your structs.
This will also remove misaligment of your own protocol fields.

- Felix

Re: [PATCH 1/2] add basic register-field manipulation macros

2016-06-13 Thread Felix Fietkau

On 2016-06-13 15:29, Jakub Kicinski wrote:
> C bitfields are problematic and best avoided.  Developers
> interacting with hardware registers find themselves searching
> for easy-to-use alternatives.  Common approach is to define
> structures or sets of macros containing mask and shift pair.
> Operations on the register are then performed as follows:
> 
>  field = (reg >> shift) & mask;
> 
>  reg &= ~(mask << shift);
>  reg |= (field & mask) << shift;
> 
> Defining shift and mask separately is tedious.  Ivo van Doorn
> came up with an idea of computing them at compilation time
> based on a single shifted mask (later refined by Felix) which
> can be used like this:
> 
>  #define X_REG_FIELD 0x000ff000
> 
>  field = FIELD_GET(X_REG_FIELD, reg);
> 
>  reg &= ~X_REG_FIELD;
>  reg |= FIELD_PUT(X_REG_FIELD, field);
> 
> FIELD_{GET,PUT} macros take care of finding out what the
> appropriate shift is based on compilation time ffs operation.
> 
> GENMASK can be used to define registers (which is usually
> less error-prone and easier to match with datasheets).
> 
> This approach is the most convenient I've seen so to limit code
> multiplication let's move the macros to a global header file.
> 
> Compared to Felix Fietkau's implementation from mt76 this one
> uses standard Linux and GCC functions such as is_power_of_2()
> and __builtin_ffsll().
> 
> Signed-off-by: Jakub Kicinski <jakub.kicin...@netronome.com>
> ---
>  include/linux/bitfield.h | 58 
> 
>  include/linux/log2.h |  6 +
>  2 files changed, 64 insertions(+)
>  create mode 100644 include/linux/bitfield.h
> 
> diff --git a/include/linux/bitfield.h b/include/linux/bitfield.h
> new file mode 100644
> index ..ae2224464523
> --- /dev/null
> +++ b/include/linux/bitfield.h
> @@ -0,0 +1,58 @@
> +/*
> + * Copyright (C) 2014 Felix Fietkau <n...@openwrt.org>
Please change my email address to n...@nbd.name here

- Felix

Re: [RFC/RFT] mac80211: implement fq_codel for software queuing

2016-03-07 Thread Felix Fietkau

On 2016-03-07 15:05, Avery Pennarun wrote:
> On Fri, Mar 4, 2016 at 1:32 AM, Michal Kazior  wrote:
>> On 4 March 2016 at 03:48, Tim Shepard  wrote:
>> [...]
>>> (I am interested in knowing what other mac80211 drivers have been
>>>  modified to use the mac80211 intermediate software queues.   I know
>>>  Michal mentioned he has patches for ath10k that are not yet released,
>>>  and I know Felix is finishing up the mt76 driver which uses them.)
>>
>> Patches for ath10k are under review since quite some time now (but are
>> not merged yet). The latest re-spin is:
>>
>>   http://lists.infradead.org/pipermail/ath10k/2016-March/006923.html
> 
> Hi all, on Friday I had a chance to experiment with some of these
> patches, specifically Tim's ath9k patch (to use intermediate queues),
> plus MIchal's patch to use fq_codel with the intermediate queues.  I
> didn't attempt any fine tuning; I just slapped them together to see
> what happens.  (I tried applying Michal's ath10k patches too, but got
> stuck since they seem to be applied against the upstream v4.4 kernel
> and didn't merge cleanly with the latest mac80211 branch.  Maybe I was
> doing something wrong.)
> 
> Test setup:
>AP (ath9k) -> 2x2 strong signal -> STA1 (mwifiex)
> -> attenuator (-40 dB) -> 1x1 weak signal  -> STA2 (mwifiex)
> 
> STA2 generally gets modulation levels around MCS0-2 and STA1 usually
> gets something like MCS12-15.
> 
> With or without this patch, results with TCP iperf were fishy - I
> think packet loss patterns were particularly bad and caused 2-second
> TCP retry timeouts occasionally - so I removed TCP from the test and
> switched the UDP iperf instead.
> 
> I ran isoping 
> (https://gfiber.googlesource.com/vendor/google/platform/+/master/cmds/isoping.c)
> from the AP to both stations to measure two-way latency during all
> tests.  (I used -r2 for two packets/sec in each direction in order not
> to affect the test results too much.)
> 
> Overall results:
> 
> - Running one iperf at a time, I saw ~45 Mbps to STA1 and ~7 Mbps to STA2.
> 
> - Running both iperfs at once, without the patches, latencies got
> extremely high (~600ms sometimes) and results were closer to
> byte-fairness than airtime-fairness (ie. ~7 Mbps each).
> 
> - Running both iperfs at once, with the patches, latencies were still
> high (usually high 2-digit, sometimes low 3-digit latencies) but we
> got closer to airtime-fairness than byte-fairness (~17 Mbps and ~2
> Mbps).
> 
> - With only one iperf running, without the patches, latencies were
> high to both stations.  With the patches, latency was
> mid-double-digits to the non-iperf station (pretty good!) while being
> low-mid triple-digits to the busy iperf station.  This suggests that
> we are getting per-station queuing (yay!) but does make me question
> whether the fq_ in fq_codel was working.
Please change the 'if (flow->txqi)' check in ieee80211_txq_enqueue to:
if (flow->txqi && flow->txqi != txqi)
This should hopefully fix the fq_ part ;)

- Felix

Re: [RFC/RFT] mac80211: implement fq_codel for software queuing

2016-02-26 Thread Felix Fietkau

On 2016-02-26 14:09, Michal Kazior wrote:
> Since 11n aggregation become important to get the
> best out of txops. However aggregation inherently
> requires buffering and queuing. Once variable
> medium conditions to different associated stations
> is considered it became apparent that bufferbloat
> can't be simply fought with qdiscs for wireless
> drivers. 11ac with MU-MIMO makes the problem
> worse because the bandwidth-delay product becomes
> even greater.
> 
> This bases on codel5 and sch_fq_codel.c. It may
> not be the Right Thing yet but it should at least
> provide a framework for more improvements.
Nice work!

> I guess dropping rate could factor in per-station
> rate control info but I don't know how this should
> exactly be done. HW rate control drivers would
> need extra work to take advantage of this.
> 
> This obviously works only with drivers that use
> wake_tx_queue op.
> 
> Note: This uses IFF_NO_QUEUE to get rid of qdiscs
> for wireless drivers that use mac80211 and
> implement wake_tx_queue op.
> 
> Moreover the current txq_limit and latency setting
> might need tweaking. Either from userspace or be
> dynamically scaled with regard to, e.g. number of
> associated stations.
> 
> FWIW This already works nicely with ath10k's (not
> yey merged) pull-push congestion control for
> MU-MIMO as far as throughput is concerned.
> 
> Evaluating latency improvements is a little tricky
> at this point if a driver is using more queue
> layering and/or its firmware controls tx
> scheduling - hence I don't have any solid data on
> this. I'm open for suggestions though.
> 
> It might also be a good idea to do the following
> in the future:
> 
>  - make generic tx scheduling which does some RR
>over per-sta-tid queues and dequeues bursts of
>packets to form a PPDU to fit into designated
>txop timeframe and bytelimit
> 
>This could in theory be shared and used by
>ath9k and (future) mt76.
> 
>Moreover tx scheduling could factor in rate
>control info and keep per-station number of
>queued packets at a sufficient low threshold to
>avoid queue buildup for slow stations. Emmanuel
>already did similar experiment for iwlwifi's
>station mode and got promising results.
> 
>  - make software queueing default internally in
>mac80211. This could help other drivers to get
>at least some benefit from mac80211 smarter
>queueing.
> 
> Signed-off-by: Michal Kazior 
> ---
>  include/net/mac80211.h |  36 -
>  net/mac80211/agg-tx.c  |   8 +-
>  net/mac80211/codel.h   | 260 +++
>  net/mac80211/codel_i.h |  89 +++
>  net/mac80211/ieee80211_i.h |  27 +++-
>  net/mac80211/iface.c   |  25 ++-
>  net/mac80211/main.c|   9 +-
>  net/mac80211/rx.c  |   2 +-
>  net/mac80211/sta_info.c|  10 +-
>  net/mac80211/sta_info.h|  27 
>  net/mac80211/tx.c  | 370 
> -
>  net/mac80211/util.c|  20 ++-
>  12 files changed, 805 insertions(+), 78 deletions(-)
>  create mode 100644 net/mac80211/codel.h
>  create mode 100644 net/mac80211/codel_i.h
> 

> diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
> index af584f7cdd63..f42f898cb8b5 100644
> --- a/net/mac80211/tx.c
> +++ b/net/mac80211/tx.c
> + [...]
> +static void ieee80211_txq_enqueue(struct ieee80211_local *local,
> +   struct txq_info *txqi,
> +   struct sk_buff *skb)
> +{
> + struct ieee80211_fq *fq = >fq;
> + struct ieee80211_hw *hw = >hw;
> + struct txq_flow *flow;
> + struct txq_flow *i;
> + size_t idx = fq_hash(fq, skb);
> +
> + flow = >flows[idx];
> +
> + if (flow->txqi)
> + flow = >flow;
I'm not sure I understand this part correctly, but shouldn't that be:
if (flow->txqi && flow->txqi != txqi)

> +
> + /* The following overwrites `vif` pointer effectively. It is later
> +  * restored using txq structure.
> +  */
> + IEEE80211_SKB_CB(skb)->control.enqueue_time = codel_get_time();
> +
> + flow->txqi = txqi;
> + flow->backlog += skb->len;
> + txqi->backlog_bytes += skb->len;
> + txqi->backlog_packets++;
> + fq->backlog++;
> +
> + if (list_empty(>backlogchain))
> + i = list_last_entry(>backlogs, struct txq_flow, 
> backlogchain);
> + else
> + i = flow;
> +
> + list_for_each_entry_continue_reverse(i, >backlogs, backlogchain)
> + if (i->backlog > flow->backlog)
> + break;
> +
> + list_move(>backlogchain, >backlogchain);
> +
> + if (list_empty(>flowchain)) {
> + flow->deficit = fq->quantum;
> + list_add_tail(>flowchain, >new_flows);
> + }
> +
> + __skb_queue_tail(>queue, skb);
> +
> + if (fq->backlog > hw->txq_limit)
> + fq_drop(local);
> +}

Re: [PATCH V2 03/12] net-next: mediatek: add embedded switch driver (ESW)

2016-02-26 Thread Felix Fietkau

On 2016-02-26 16:18, Andrew Lunn wrote:
> On Fri, Feb 26, 2016 at 03:21:35PM +0100, John Crispin wrote:
>> The ESW is found in many of the old 100mbit MIPS based SoCs. it has 5
>> external ports, 1 cpu port and 1 further port that the internal HW
>> offloading engine connects to.
>> 
>> This driver is very basic and only provides basic init and irq support.
>> The SoC and switch core both have support for a special tag making DSA
>> support possible.
> 
> Hi Crispin
> 
> There was recently a discussion about adding switches without using
> DSA or switchdev. It was pretty much decided we would not accept such
> drivers.
For exactly this reason, the code does not provide any non-standard API
for allowing the user to configure the switch. The hardware needs to be
programmed with some defaults for the driver to be functional (since the
switch logic is built into the SoC).

In my opinion, leaving this part out does not make much sense and
neither does deferring the entire patch series until we have a
switchdev/DSA capable driver. This is just a starting point, which will
be turned into a proper driver with the right APIs later.

- Felix

Re: [PATCH] mac80211: minstrel_ht: fix out-of-bound in minstrel_ht_set_best_prob_rate

2016-01-29 Thread Felix Fietkau

On 2016-01-29 09:35, Konstantin Khlebnikov wrote:
> Patch fixes this splat
> 
> BUG: KASAN: slab-out-of-bounds in minstrel_ht_update_stats.isra.7+0x6e1/0x9e0
> [mac80211] at addr 8800cee640f4 Read of size 4 by task swapper/3/0
> 
> Signed-off-by: Konstantin Khlebnikov <koc...@gmail.com>
> Link: 
> http://lkml.kernel.org/r/calygninyjhsavne35qs6ucgasb2dx1_i5hcravuox14otz2...@mail.gmail.com
Acked-by: Felix Fietkau <n...@openwrt.org>

Re: KASAN splal in minstrel_ht_update_stats()

2016-01-28 Thread Felix Fietkau

On 2016-01-15 06:23, Konstantin Khlebnikov wrote:
> Jan 10 17:56:25 hamm kernel: [184374.378842]
> ==
> Jan 10 17:56:25 hamm kernel: [184374.379001] BUG: KASAN:
> slab-out-of-bounds in minstrel_ht_update_stats.isra.7+0x6e1/0x9e0
[...]
> ==
> 
> out-of-bound in
> 
> if (mrs->prob_ewma > mg->rates[mg->max_group_prob_rate].prob_ewma)
>  mg->max_group_prob_rate = index;
> 
> 
> 
> Fix should be something like this:
> 
> --- a/net/mac80211/rc80211_minstrel_ht.c
> +++ b/net/mac80211/rc80211_minstrel_ht.c
> @@ -414,15 +414,16 @@ minstrel_ht_set_best_prob_rate(struct
> minstrel_ht_sta *mi, u16 index)
> (max_tp_group != MINSTREL_CCK_GROUP))
> return;
> 
> +   max_gpr_group = mg->max_group_prob_rate / MCS_GROUP_RATES;
> +   max_gpr_idx = mg->max_group_prob_rate % MCS_GROUP_RATES;
> +   max_gpr_prob = mi->groups[max_gpr_group].rates[max_gpr_idx].prob_ewma;
> +
> if (mrs->prob_ewma > MINSTREL_FRAC(75, 100)) {
> cur_tp_avg = minstrel_ht_get_tp_avg(mi, cur_group, cur_idx,
> mrs->prob_ewma);
> if (cur_tp_avg > tmp_tp_avg)
> mi->max_prob_rate = index;
> 
> -   max_gpr_group = mg->max_group_prob_rate / MCS_GROUP_RATES;
> -   max_gpr_idx = mg->max_group_prob_rate % MCS_GROUP_RATES;
> -   max_gpr_prob =
> mi->groups[max_gpr_group].rates[max_gpr_idx].prob_ewma;
> max_gpr_tp_avg = minstrel_ht_get_tp_avg(mi, max_gpr_group,
> max_gpr_idx,
> max_gpr_prob);
> @@ -431,7 +432,7 @@ minstrel_ht_set_best_prob_rate(struct
> minstrel_ht_sta *mi, u16 index)
> } else {
> if (mrs->prob_ewma > tmp_prob)
> mi->max_prob_rate = index;
> -   if (mrs->prob_ewma >
> mg->rates[mg->max_group_prob_rate].prob_ewma)
> +   if (mrs->prob_ewma > max_gpr_prob)
> mg->max_group_prob_rate = index;
> }
>  }
Fix looks correct, but does not apply (line wrapped). Please resubmit
with a proper description and your Signed-off-by.

Thanks,

- Felix

Re: [PATCH] net: emac: emac gigabit ethernet controller driver

2015-12-07 Thread Felix Fietkau

On 2015-12-07 23:58, Gilad Avidov wrote:
> +/* RRD (Receive Return Descriptor) */
> +union emac_rrd {
> + struct {
> + /* 32bit word 0 */
> + u32  xsum:16;
> + u32  nor:4;   /* number of RFD */
> + u32  si:12;   /* start index of rfd-ring */
> + /* 32bit word 1 */
> + u32  hash;
> + /* 32bit word 2 */
You should never use bitfields for hardware structs.
I think in general, kernel code should be made endian safe, even if you
only care about one particular endian type for your platform.

- Felix
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] net: remove unnecessary semicolon in netdev_alloc_pcpu_stats()

2015-12-05 Thread Felix Fietkau

This semicolon causes a build error if the function call is wrapped in
parentheses.

Fixes: aabc92bbe3cf ("net: add __netdev_alloc_pcpu_stats() to indicate gfp 
flags")
Reported-by: Imre Kaloz <ka...@openwrt.org>
Signed-off-by: Felix Fietkau <n...@openwrt.org>
---
 include/linux/netdevice.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 7d2d1d7..f2f91d9 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2103,7 +2103,7 @@ struct pcpu_sw_netstats {
 })
 
 #define netdev_alloc_pcpu_stats(type)  \
-   __netdev_alloc_pcpu_stats(type, GFP_KERNEL);
+   __netdev_alloc_pcpu_stats(type, GFP_KERNEL)
 
 #include 
 
-- 
2.2.2

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] packet: Allow packets with only a header (but no payload)

2015-11-09 Thread Felix Fietkau

On 2015-11-09 18:53, Willem de Bruijn wrote:
> On Sat, Nov 7, 2015 at 8:11 AM, Felix Fietkau <n...@openwrt.org> wrote:
>> On 2015-07-31 00:15, Martin Blumenstingl wrote:
>>> On Wed, Jul 29, 2015 at 8:05 AM, Willem de Bruijn <will...@google.com> 
>>> wrote:
>>>> Martin, to return to your initial statement that PPPoE PADI packets can
>>>> have a zero payload: the PPPoE RFC states that PADI packets "MUST
>>>> contain exactly one TAG of TAG_TYPE Service-Name, indicating the
>>>> service the Host is requesting, and any number of other TAG types."
>>>> (RFC 2516, 5.1). Is the observed behavior (no payload) perhaps
>>>> incorrect?
>>> As far as I can see you are right, but the real world seems to be different.
>>> My ISP for example lists the PPPoE connection settings, but they are
>>> nowhere mentioning the "service name".
>>>
>>> I have also re-read pppd's source code again and that seems to confirm
>>> what you are reading in the RFC: Leaving the service name away makes
>>> seems to violate the RFC, but pppd still accepts those configurations.
>>>
>>>> Even if it is, if this is breaking established userspace expectations,
>>>> we should look into it. Ethernet specifies a minimum payload size of
>>>> 46 on the wire, but perhaps that is handled with padding, so that
>>>> 0 length should be valid within the stack. Also, there may be other
>>>> valid uses of 0 length payload on top of link layers that are not Ethernet.
>>> Good catch. I would also like to note that the documentation for
>>> "hard_header_len" describes it as "Hardware header length". When the
>>> purpose of this field we should check whether the documentation should
>>> be updated to "Minimum hardware header length" -> that would mean the
>>> condition has to be a "len < hard_header_len" instead of a "len <=
>>> hard_header_len" (as it is now).
>>>
>>> PS: I have also added the pppd maintainer (Paul Mackerras) to this
>>> thread because I think he should know about this issue (and he can
>>> probably provide more details if required).
>>> As a quick summary for him: linux  >= 3.19 rejects PADI packets when
>>> no service name is configured.
>> Any news on this? Users are complaining about this regression:
>> https://dev.openwrt.org/ticket/20707
> 
> I took another look. This hinges on the question what the contract with
> device drivers is on skb network data and length. Is passing an skb with
> skb->len == 0 to ndo_start_xmit allowed?
> 
> From what I gather from the ethernet spec [1], sending frames with an
> empty head is allowed on that medium, at least.
> 
> A quick scan of a few drivers and the loopback path also does not show
> anything that would break. In some cases, skb_network_header points
> beyond the end of the buffer (ETH_HLEN), but the length is correctly
> reported as 0.
> 
> The tap device can also generate packets consisting of only a link layer
> header: compares len < ETH_HLEN in tun_get_user.
> 
> So, I think that this change should be correct:
> 
>  static bool ll_header_truncated(const struct net_device *dev, int len)
>  {
> -   /* net device doesn't like empty head */
> -   if (unlikely(len <= dev->hard_header_len)) {
> +   if (unlikely(len < dev->hard_header_len)) {
> 
> but a definitive answer would require an audit of all device drivers
> (including bonding, ..) or at least the certainty that it has always
> been correct to send a packet of only link layer header to
> ndo_start_xmit.
> 
> [1] IEEE 802.3™-2012 – Section One, {3.2.8, 4.2.3.3}
Yeah, I agree that such an audit is required. However, I think it's
*much* more important to add this change as soon as possible to fix the
regression. The old code may have had theoretical driver issues, but the
current code breaks real-world user setups.

- Felix
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] packet: Allow packets with only a header (but no payload)

2015-11-07 Thread Felix Fietkau

On 2015-07-31 00:15, Martin Blumenstingl wrote:
> On Wed, Jul 29, 2015 at 8:05 AM, Willem de Bruijn  wrote:
>> Martin, to return to your initial statement that PPPoE PADI packets can
>> have a zero payload: the PPPoE RFC states that PADI packets "MUST
>> contain exactly one TAG of TAG_TYPE Service-Name, indicating the
>> service the Host is requesting, and any number of other TAG types."
>> (RFC 2516, 5.1). Is the observed behavior (no payload) perhaps
>> incorrect?
> As far as I can see you are right, but the real world seems to be different.
> My ISP for example lists the PPPoE connection settings, but they are
> nowhere mentioning the "service name".
> 
> I have also re-read pppd's source code again and that seems to confirm
> what you are reading in the RFC: Leaving the service name away makes
> seems to violate the RFC, but pppd still accepts those configurations.
> 
>> Even if it is, if this is breaking established userspace expectations,
>> we should look into it. Ethernet specifies a minimum payload size of
>> 46 on the wire, but perhaps that is handled with padding, so that
>> 0 length should be valid within the stack. Also, there may be other
>> valid uses of 0 length payload on top of link layers that are not Ethernet.
> Good catch. I would also like to note that the documentation for
> "hard_header_len" describes it as "Hardware header length". When the
> purpose of this field we should check whether the documentation should
> be updated to "Minimum hardware header length" -> that would mean the
> condition has to be a "len < hard_header_len" instead of a "len <=
> hard_header_len" (as it is now).
> 
> PS: I have also added the pppd maintainer (Paul Mackerras) to this
> thread because I think he should know about this issue (and he can
> probably provide more details if required).
> As a quick summary for him: linux  >= 3.19 rejects PADI packets when
> no service name is configured.
Any news on this? Users are complaining about this regression:
https://dev.openwrt.org/ticket/20707

- Felix
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/3] net: dsa: Use devm_ prefixed allocations

2015-10-03 Thread Felix Fietkau

On 2015-10-02 15:30, Neil Armstrong wrote:
> On 10/02/2015 03:29 PM, Sergei Shtylyov wrote:
>> On 10/2/2015 1:48 PM, Neil Armstrong wrote:
>> 
>>> To simplify and prevent memory leakage when unbinding, use
>>> the devm_ memory allocation calls.
>>>
>>> Tested-by: Andrew Lunn 
>>> Tested-by: Florian Fainelli 
>>> Signed-off-by: Neil Armstrong 
>>> ---
>>>   net/dsa/dsa.c | 6 +++---
>>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
>>> index c59fa5d..98f94c2 100644
>>> --- a/net/dsa/dsa.c
>>> +++ b/net/dsa/dsa.c
>>> @@ -305,7 +305,7 @@ static int dsa_switch_setup_one(struct dsa_switch *ds, 
>>> struct device *parent)
>>>   if (ret < 0)
>>>   goto out;
>>>
>>> -ds->slave_mii_bus = mdiobus_alloc();
>>> +ds->slave_mii_bus = devm_mdiobus_alloc(parent);
>>>   if (ds->slave_mii_bus == NULL) {
>>>   ret = -ENOMEM;
>>>   goto out;
>>> @@ -400,7 +400,7 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int index,
>>>   /*
>>>* Allocate and initialise switch state.
>>>*/
>>> -ds = kzalloc(sizeof(*ds) + drv->priv_size, GFP_KERNEL);
>>> +ds = devm_kzalloc(parent, sizeof(*ds) + drv->priv_size, GFP_KERNEL);
>>>   if (ds == NULL)
>>>   return ERR_PTR(-ENOMEM);
>>>
>>> @@ -883,7 +883,7 @@ static int dsa_probe(struct platform_device *pdev)
>>>   goto out;
>>>   }
>>>
>>> -dst = kzalloc(sizeof(*dst), GFP_KERNEL);
>>> +dst = devm_kzalloc(>dev, sizeof(*dst), GFP_KERNEL);
>>>   if (dst == NULL) {
>>>   dev_put(dev);
>>>   ret = -ENOMEM;
>>>
>> 
>>Shouldn't you remove the correspoding kfree(), etc. calls?
>> 
>> MBR, Sergei
>> 
> The corresponding kfree() calls were all missing.
Not in the error handling path. mdiobus_alloc has a corresponding
mdiobus_free in the same function.
The ds kzalloc in dsa_switch_setup has a kfree in dsa_switch_setup_one.

- Felix
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/3] net: dsa: Use devm_ prefixed allocations

2015-10-02 Thread Felix Fietkau

On 2015-10-02 12:48, Neil Armstrong wrote:
> To simplify and prevent memory leakage when unbinding, use
> the devm_ memory allocation calls.
> 
> Tested-by: Andrew Lunn 
> Tested-by: Florian Fainelli 
> Signed-off-by: Neil Armstrong 
I think you also need to get rid of the corresponding free calls in the
error path, otherwise it will probably crash at some point.

- Felix
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] pppoe: drop pppoe device in pppoe_unbind_sock_work

2015-05-09 Thread Felix Fietkau

After receiving a PADT and the socket is closed, user space will no
longer drop the reference to the pppoe device.
This leads to errors like this:

[  488.57] unregister_netdevice: waiting for eth0.2 to become free. Usage 
count = 2

Fixes: 287f3a943fe (pppoe: Use workqueue to die properly when a PADT is 
received)
Signed-off-by: Felix Fietkau n...@openwrt.org
---
 drivers/net/ppp/pppoe.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ppp/pppoe.c b/drivers/net/ppp/pppoe.c
index ff059e1..6a544f2 100644
--- a/drivers/net/ppp/pppoe.c
+++ b/drivers/net/ppp/pppoe.c
@@ -462,6 +462,10 @@ static void pppoe_unbind_sock_work(struct work_struct 
*work)
struct sock *sk = sk_pppox(po);
 
lock_sock(sk);
+   if (po-pppoe_dev) {
+   dev_put(po-pppoe_dev);
+   po-pppoe_dev = NULL;
+   }
pppox_unbind_sock(sk);
release_sock(sk);
sock_put(sk);
-- 
2.2.2

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH][FIX 4.1] bgmac: fix requests for extra polling calls from NAPI

2015-04-23 Thread Felix Fietkau

On 2015-04-23 20:56, Rafał Miłecki wrote:
 After d75b1ade567f (net: less interrupt masking in NAPI) polling
 function has to return whole budget when it wants NAPI to call it again.
 
 Signed-off-by: Rafał Miłecki zaj...@gmail.com
 Cc: Felix Fietkau n...@openwrt.org
 Fixes: eb64e2923a886 (bgmac: leave interrupts disabled as long as there is 
 work to do)
Acked-by: Felix Fietkau n...@openwrt.org

- Felix
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

49 matches

Mail list logo