date:20151122

Re: [RFC 5/8] net-next: ralink: add support for rt3050 family

2015-11-22 Thread John Crispin

On 23/11/2015 00:16, Andrew Lunn wrote:
>> Hi Andrew,
>>
>> we have had a switch layer inside openwrt called swconfig for several
>> years. at the moment i have an add-on patch in openwrt to provide a
>> userland interface via that layer.
> 
> Hi John
> 
> I know of swconfig.  However, it has been NACKed for mainline. So you
> either need to do switchdev or DSA for controlling switches in
> mainline.
> 
>   Andrew
> 

Hi Andrew.

I am not planning to bring the switch support upstream in the near
future. right now my focus is on bringing the core driver upstream. once
that is done i plan to add support for the new DMA core, then the new
ARM SoCs. once those are done i plan to add support for the new media
center SoC and once that is done i am probably going to have a go at the
hw offloading engine. Once all that is done, i will have time to look at
the switch driver. however that won't happen in the near future.

John
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH net-next v2 8/9] net: ipmr: rearrange and cleanup setsockopt

2015-11-22 Thread Cong Wang

On Sat, Nov 21, 2015 at 6:57 AM, Nikolay Aleksandrov
 wrote:
>  net/ipv4/ipmr.c | 191 +++-
>  1 file changed, 107 insertions(+), 84 deletions(-)

Does this really simplify the code? :-/

Re: [PATCH net-next v2 4/9] net: ipmr: fix code and comment style

2015-11-22 Thread Cong Wang

On Sat, Nov 21, 2015 at 6:57 AM, Nikolay Aleksandrov
 wrote:
> -
> -/*
> - * Setup for IP multicast routing
> - */
> +/* Setup for IP multicast routing */
>  static int __net_init ipmr_net_init(struct net *net)

Comments like this one are never useful so can be just removed.

Re: [PATCH net-next v2 2/9] net: ipmr: always define mroute_reg_vif_num

2015-11-22 Thread Cong Wang

On Sat, Nov 21, 2015 at 6:57 AM, Nikolay Aleksandrov
 wrote:
> From: Nikolay Aleksandrov 
>
> Before mroute_reg_vif_num was defined only if any of the CONFIG_PIMSM_
> options were set, but that's not really necessary as the size of the
> struct is the same in both cases (checked with pahole, both cases size
> is 3256 bytes) and we can remove some unnecessary ifdefs to simplify the
> code.
>

Not sure if this really simplifies the code, since mroute_reg_vif_num is now
hidden deeper after your patch and there is still some code under
CONFIG_IP_PIMSM.

If you really care about it, how about introducing helper functions to set
and get mrt->mroute_reg_vif_num?
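
A minimal sketch of what such helpers could look like, written as a standalone userspace mock rather than kernel code. The struct here is a stand-in that models only the one field, and both helper names are hypothetical; neither comes from the patch.

```c
#include <assert.h>

/* Stand-in for the kernel's struct mr_table; only the field under
 * discussion is modelled here. */
struct mr_table {
	int mroute_reg_vif_num;
};

/* Hypothetical getter: callers no longer touch the field directly. */
static inline int mrt_reg_vif_num(const struct mr_table *mrt)
{
	assert(mrt);
	return mrt->mroute_reg_vif_num;
}

/* Hypothetical setter, the counterpart of the getter above. */
static inline void mrt_set_reg_vif_num(struct mr_table *mrt, int vifi)
{
	assert(mrt);
	mrt->mroute_reg_vif_num = vifi;
}
```

With accessors like these, any remaining CONFIG_IP_PIMSM conditionals could live in one place instead of at every use of the field.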

Re: [PATCHSET v3] netfilter, cgroup: implement cgroup2 path match in xt_cgroup

2015-11-22 Thread Daniel Wagner

Hi Tejun,

On 11/21/2015 05:13 PM, Tejun Heo wrote:
> This is v3 of the xt_cgroup2 patchset.  Changes from the last take are
> 
> * Folded cgroup2 path matching into xt_cgroup as a new revision rather
>   than a separate xt_cgroup2 match as suggested by Pablo.
> 
> * Refreshed on top of Nina's net_cls dynamic config update fix patch.
>   I included the fix patch as part of this series to ease reviewing.

I started to play with your patches and was greeted by this:

[3.217648] systemd[1]: tmp.mount: Directory /tmp to mount over is not empty, mounting anyway.
[3.224665] BUG: spinlock bad magic on CPU#1, systemd/1
[3.225653]  lock: cgroup_sk_update_lock+0x0/0x60, .magic: , .owner: systemd/1, .owner_cpu: 1
[3.227034] CPU: 1 PID: 1 Comm: systemd Not tainted 4.4.0-rc1+ #195
[3.227862] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[3.228906]  834a2160 88007c043ad0 81551edc 88007c028000
[3.229512]  88007c043af0 81136868 834a2160 88007aff5940
[3.230105]  88007c043b08 81136b05 834a2160 88007c043b20
[3.230716] Call Trace:
[3.230906]  [] dump_stack+0x4e/0x82
[3.231289]  [] spin_dump+0x78/0xc0
[3.231642]  [] do_raw_spin_unlock+0x75/0xd0
[3.232039]  [] _raw_spin_unlock+0x27/0x50
[3.232431]  [] update_classid_sock+0x68/0x80
[3.232836]  [] iterate_fd+0x71/0x150
[3.233197]  [] update_classid+0x47/0x80
[3.233571]  [] cgrp_attach+0x14/0x20
[3.233929]  [] cgroup_taskset_migrate+0x1e1/0x330
[3.234366]  [] cgroup_migrate+0xf5/0x190
[3.234747]  [] ? cgroup_migrate+0x5/0x190
[3.235130]  [] cgroup_attach_task+0x176/0x200
[3.235543]  [] ? cgroup_attach_task+0x5/0x200
[3.235953]  [] __cgroup_procs_write+0x2ad/0x460
[3.236377]  [] ? __cgroup_procs_write+0x5e/0x460
[3.236805]  [] cgroup_procs_write+0x14/0x20
[3.237205]  [] cgroup_file_write+0x35/0x1c0
[3.237600]  [] kernfs_fop_write+0x141/0x190
[3.237998]  [] __vfs_write+0x28/0xe0
[3.238361]  [] ? percpu_down_read+0x57/0xa0
[3.238761]  [] ? __sb_start_write+0xb4/0xf0
[3.239154]  [] ? __sb_start_write+0xb4/0xf0
[3.239554]  [] vfs_write+0xac/0x1a0
[3.239930]  [] ? __fget_light+0x66/0x90
[3.240308]  [] SyS_write+0x49/0xb0
[3.240656]  [] entry_SYSCALL_64_fastpath+0x12/0x76

I am using a Fedora 23 host with systemd.unified_cgroup_hierarchy=1. The config
is available here:

http://monom.org/cgroup/config-review-xt_cgroup2

Probably completely rubbish, because it's my random test config.

cheers,
daniel

Re: r8169 regression: UDP packets dropped intermittantly

2015-11-22 Thread Jonathan Woithe

On Fri, Nov 20, 2015 at 11:45:34PM +0100, Francois Romieu wrote:
> The hardware stats are not exactly clear. Is the initial Tx - Rx packet
> difference (6) at the hardware stats level expected ?

Sorry, I missed this question earlier.  When speaking to the external
device, each UDP command packet that's sent should be balanced with a
response.  However, if a UDP command is sent to a device that's not on (or
there's a typo with the address) then obviously that transmitted packet will
never be responded to.  I expect something like this could easily have
happened during the session in which I ran that test for you, so in that
case would be explainable.

I note that the difference did increase to 7 at the 1447985726.272550987 
mark.  Since this was during the test I can't quite explain that one, unless
something else on the system happened to send a UDP packet out onto that
interface which was not responded to.

Regards
  jonathan

Re: What's the benefit of large Rx rings?

2015-11-22 Thread Yuval Mintz

>> This might be a dumb question, but I recently touched this
>> and felt like I'm missing something basic -
>>
>> NAPI is being scheduled from soft-interrupt context, and it
>> has a ~strict quota for handling Rx packets [even though we're
>> allowing practically unlimited handling of Tx completions].
>> Given these facts, what's the benefit of having arbitrary large
>> Rx buffer rings? Assuming quota is 64, I would have expected
>> that having more than twice or thrice as many buffers could not
>> help in real traffic scenarios - in any given time-unit
>> [the time between 2 NAPI runs which should be relatively
>> constant] CPU can't handle more than the quota; If HW is
>> generating more packets on a regular basis the buffers are bound
>> to get exhausted, no matter how many there are.
>>
>> While there isn't any obvious downside to allowing drivers to
>> increase ring sizes to be larger [other than memory footprint],
>> I feel like I'm missing the scenarios where having Ks of
>> buffers can actually help.
>> And for the unlikely case that I'm not missing anything,
>> why aren't we supplying some `default' max and min amounts
>> in a common header?

> The main benefit of large Rx rings is that you could theoretically
> support longer delays between device interrupts.  So for example if
> you have a protocol such as UDP that doesn't care about latency then
> you could theoretically set a large ring size, a large interrupt delay
> and process several hundred or possibly even several thousand packets
> per device interrupt instead of just a few.

So we're basically spending hundreds of MBs [at least for high-speed
ethernet devices] on memory that helps us mostly on the first
coalesced interrupt [since later it all goes through napi re-scheduling]?
Sounds a bit... wasteful.
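
For a rough sense of scale, here is a back-of-the-envelope sketch of how long a ring can absorb traffic while nothing is being polled. It is not from the thread: the 10GbE link speed and minimum-size frames are assumptions picked for illustration, and the function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Bits on the wire per minimum-size Ethernet frame:
 * 64-byte frame + 8-byte preamble/SFD + 12-byte inter-frame gap. */
#define MIN_FRAME_WIRE_BITS ((64 + 8 + 12) * 8)

/* Worst-case packet rate for a given link speed in bits per second. */
static uint64_t max_pps(uint64_t link_bps)
{
	return link_bps / MIN_FRAME_WIRE_BITS;
}

/* Nanoseconds until a ring with `entries` free buffers is exhausted
 * when packets arrive at `pps` and nothing is drained in the meantime. */
static uint64_t ring_fill_time_ns(uint64_t entries, uint64_t pps)
{
	return pps ? entries * 1000000000ULL / pps : UINT64_MAX;
}
```

Under those assumptions a 64-entry ring lasts about 4.3 us and a 4096-entry ring about 275 us at the worst-case packet rate, which is the extra slack the quoted reply refers to when it mentions longer delays between device interrupts.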


Re: r8169 regression: UDP packets dropped intermittantly

2015-11-22 Thread Jonathan Woithe

On Mon, Nov 23, 2015 at 12:02:44AM +0100, Francois Romieu wrote:
> Jonathan Woithe  :
> [...]
> > I'm a little confused though as to which patch you want me to apply.  There
> > was an inline patch against r8169.c in your message, and then there was
> > another patch to r8169.c in the form of an attachment.  Both patches removed
> > the include of asm/system.h but the rest of the content differs.  Did you
> > want each tried in turn?  Apologies if I'm missing something obvious.
> 
> Use the inlined one. I forgot to remove the 2.6.35.11 based attachment.

Ok, no problem.

That's done now and it appears to work.  That is, I am not seeing any of the
erroneous UDP packet delivery problems noted previously and therefore the
communication with the external device is working normally.

I dropped the patched r8169.c into a 4.3.0 kernel since that's what I
already had on the system.

Regards
  jonathan

Re: [PATCH] wireless: change cfg80211 regulatory domain info as debug messages

2015-11-22 Thread Dave Young

On 11/20/15 at 12:55pm, Johannes Berg wrote:
> On Sun, 2015-11-15 at 19:25 +0100, Stefan Lippers-Hollmann wrote:
> > Hi
> > 
> > On 2015-11-15, Dave Young wrote:
> > > cfg80211 module prints a lot of messages like below. Actually
> > > printing once is acceptable but sometimes it will print again and
> > > again, it looks very annoying. It is better to change these detail
> > > messages to debugging only.
> > 
> > It is a lot of info, easily repeated 3 times on boot, but it's also
> > the only real chance to determine why you ended up with the
> > regulatory domain settings you got, rather than just the values
> > themselves. Given that a lot (most?) of officially shipping wireless
> > devices are misconfigured (wrong EEPROM regdom settings for the
> > region they're sold in) and considering that the limits can even
> > change at runtime (IEEE 802.11d), it is imho quite important not just
> > to be able to see what the current restrictions (iw reg get) are, but
> > also why the kernel settled on those.
> > 
> 
> Hm. I kinda sympathize with both points of view here, not sure what to
> do.
> 
> Maybe we could skip this for the world regdomain only? It doesn't
> really change, and we typically don't care that much for it? That'd
> probably get rid of most of the lines already.
> 
> Alternatively, perhaps the internal computations should be more
> transparently visible through some other mechanism?
> 

If they are for debugging purposes I would like to see them as pr_debug
or something in debugfs, especially for printks which are not only
called during the initialization phase.

There seem to be a lot of other wireless messages. Should we refactor
them as well? I have not yet had a chance to find where the code is.
(The wireless driver I am using is iwlwifi.)

# uptime
 09:36:31 up 17 days, 19:17, 11 users,  load average: 0.26, 0.25, 0.17

#dmesg|grep wlp3s0|wc
   4868   54014  404187

# dmesg|grep "Limiting TX power"|wc
   4128   49600  360052

Thanks
Dave

Re: r8169 regression: UDP packets dropped intermittantly

2015-11-22 Thread Francois Romieu

Jonathan Woithe  :
[...]
> I'm a little confused though as to which patch you want me to apply.  There
> was an inline patch against r8169.c in your message, and then there was
> another patch to r8169.c in the form of an attachment.  Both patches removed
> the include of asm/system.h but the rest of the content differs.  Did you
> want each tried in turn?  Apologies if I'm missing something obvious.

Use the inlined one. I forgot to remove the 2.6.35.11 based attachment.

-- 
Ueimor

Re: r8169 regression: UDP packets dropped intermittantly

2015-11-22 Thread Jonathan Woithe

On Sat, Nov 21, 2015 at 11:36:27PM +0100, Francois Romieu wrote:
> Francois Romieu  :
> [...]
> 
> If you can crash your system at will, you may apply the patch below to
> da78dbff2e05630921c551dbbc70a4b7981a8fff ("r8169: remove work from irq
> handler.") parent (aka 1e874e041fc7c222cbd85b20c4406070be1f687a) and
> build it in a current tree (say 4.2).

No problem crashing the machine at present.  It is an internal test machine
which I am using to chase this issue at present.

As I understand the above you would like me to get r8169.c from
1e874e041fc7c222cbd85b20c4406070be1f687a, apply the patch, drop it into
kernel 4.2 and run with that.  That's fine.

I'm a little confused though as to which patch you want me to apply.  There
was an inline patch against r8169.c in your message, and then there was
another patch to r8169.c in the form of an attachment.  Both patches removed
the include of asm/system.h but the rest of the content differs.  Did you
want each tried in turn?  Apologies if I'm missing something obvious.

> How much memory and CPU may I rely on in your test computer ?

Since the problem can be triggered without me having to run the full data
acquisition system, the machine is basically unloaded (it doesn't take many
resources to send/receive 6 udp packets at a time :-) ).  The CPU is an
i7-860 CPU at 2.8 GHz, with 4 GB of RAM fitted and 8 GB swap.  The kernel
and userspace are 32-bit, so we have the usual 3 GB per-process memory
limit.

Regards
  jonathan

Re: [RFC 5/8] net-next: ralink: add support for rt3050 family

2015-11-22 Thread Andrew Lunn

> Hi Andrew,
> 
> we have had a switch layer inside openwrt called swconfig for several
> years. at the moment i have an add-on patch in openwrt to provide a
> userland interface via that layer.

Hi John

I know of swconfig.  However, it has been NACKed for mainline. So you
either need to do switchdev or DSA for controlling switches in
mainline.

Andrew

Re: [PATCH net-next 1/2] net: l3mdev: Add master device lookup by index

2015-11-22 Thread David Ahern

On 11/22/15 11:17 AM, David Miller wrote:

> From: David Ahern
> Date: Sun, 22 Nov 2015 10:30:32 -0700
>
>> In this case ...
>
> I understand the problem you are trying to solve, but I am saying
> you can't use sk_bound_dev_if to use it.

I am confused by that response given that sk_bound_dev_if is one of the 
key principles for the VRF implementation. Applications wanting to
communicate over interfaces in a VRF have to set sk_bound_dev_if. If 
sk_bound_dev_if is not set by the kernel when the child socket is 
created the TCP handshake will not complete. It is not something that 
can be deferred until after accept.


Re: [PATCH net-next 1/2] net: l3mdev: Add master device lookup by index

2015-11-22 Thread David Miller

From: David Ahern 
Date: Sun, 22 Nov 2015 21:02:04 -0700

> I am confused by that response given that sk_bound_dev_if is one of
> the key principles for the VRF implementation. Applications wanting to
> communicate over interfaces in a VRF have to set sk_bound_dev_if.

Yes, they have to set it explicitly.

You are setting it for them in response to the connection
creation, and that's what I object to.

[RFC 5/8] net-next: ralink: add support for rt3050 family

2015-11-22 Thread John Crispin

Add support for SoCs from the rt3050 family. This includes rt3050, rt3052,
rt3352 and rt5350. These all have a builtin 5 port 100mbit switch. This patch
includes rudimentary code to power up the switch. There are a lot of magic
values that get written to the switch and the internal phys. These values
come straight from the SDK driver and we do not know the meaning of most of
them.

Signed-off-by: John Crispin 
Signed-off-by: Felix Fietkau 
Signed-off-by: Michael Lee 
---
 drivers/net/ethernet/ralink/esw_rt3050.c |  682 ++
 drivers/net/ethernet/ralink/esw_rt3050.h |   29 ++
 drivers/net/ethernet/ralink/soc_rt3050.c |  154 +++
 3 files changed, 865 insertions(+)
 create mode 100644 drivers/net/ethernet/ralink/esw_rt3050.c
 create mode 100644 drivers/net/ethernet/ralink/esw_rt3050.h
 create mode 100644 drivers/net/ethernet/ralink/soc_rt3050.c

diff --git a/drivers/net/ethernet/ralink/esw_rt3050.c 
b/drivers/net/ethernet/ralink/esw_rt3050.c
new file mode 100644
index 000..aae6dac
--- /dev/null
+++ b/drivers/net/ethernet/ralink/esw_rt3050.c
@@ -0,0 +1,682 @@
+/*   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation; version 2 of the License
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *   GNU General Public License for more details.
+ *
+ *   Copyright (C) 2009-2015 John Crispin 
+ *   Copyright (C) 2009-2015 Felix Fietkau 
+ *   Copyright (C) 2013-2015 Michael Lee 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include "ralink_soc_eth.h"
+
+#include 
+#include 
+
+#include 
+
+/* HW limitations for this switch:
+ * - No large frame support (PKT_MAX_LEN at most 1536)
+ * - Can't have untagged vlan and tagged vlan on one port at the same time,
+ *   though this might be possible using the undocumented PPE.
+ */
+
+#define RT305X_ESW_REG_ISR	0x00
+#define RT305X_ESW_REG_IMR	0x04
+#define RT305X_ESW_REG_FCT0	0x08
+#define RT305X_ESW_REG_PFC1	0x14
+#define RT305X_ESW_REG_ATS	0x24
+#define RT305X_ESW_REG_ATS0	0x28
+#define RT305X_ESW_REG_ATS1	0x2c
+#define RT305X_ESW_REG_ATS2	0x30
+#define RT305X_ESW_REG_PVIDC(_n)	(0x40 + 4 * (_n))
+#define RT305X_ESW_REG_VLANI(_n)	(0x50 + 4 * (_n))
+#define RT305X_ESW_REG_VMSC(_n)	(0x70 + 4 * (_n))
+#define RT305X_ESW_REG_POA	0x80
+#define RT305X_ESW_REG_FPA	0x84
+#define RT305X_ESW_REG_SOCPC	0x8c
+#define RT305X_ESW_REG_POC0	0x90
+#define RT305X_ESW_REG_POC1	0x94
+#define RT305X_ESW_REG_POC2	0x98
+#define RT305X_ESW_REG_SGC	0x9c
+#define RT305X_ESW_REG_STRT	0xa0
+#define RT305X_ESW_REG_PCR0	0xc0
+#define RT305X_ESW_REG_PCR1	0xc4
+#define RT305X_ESW_REG_FPA2	0xc8
+#define RT305X_ESW_REG_FCT2	0xcc
+#define RT305X_ESW_REG_SGC2	0xe4
+#define RT305X_ESW_REG_P0LED	0xa4
+#define RT305X_ESW_REG_P1LED	0xa8
+#define RT305X_ESW_REG_P2LED	0xac
+#define RT305X_ESW_REG_P3LED	0xb0
+#define RT305X_ESW_REG_P4LED	0xb4
+#define RT305X_ESW_REG_PXPC(_x)	(0xe8 + (4 * _x))
+#define RT305X_ESW_REG_P1PC	0xec
+#define RT305X_ESW_REG_P2PC	0xf0
+#define RT305X_ESW_REG_P3PC	0xf4
+#define RT305X_ESW_REG_P4PC	0xf8
+#define RT305X_ESW_REG_P5PC	0xfc
+
+#define RT305X_ESW_LED_LINK	0
+#define RT305X_ESW_LED_100M	1
+#define RT305X_ESW_LED_DUPLEX	2
+#define RT305X_ESW_LED_ACTIVITY	3
+#define RT305X_ESW_LED_COLLISION	4
+#define RT305X_ESW_LED_LINKACT	5
+#define RT305X_ESW_LED_DUPLCOLL	6
+#define RT305X_ESW_LED_10MACT	7
+#define RT305X_ESW_LED_100MACT	8
+/* Additional led states not in datasheet: */
+#define RT305X_ESW_LED_BLINK	10
+#define RT305X_ESW_LED_ON	12
+
+#define RT305X_ESW_LINK_S	25
+#define RT305X_ESW_DUPLEX_S	9
+#define RT305X_ESW_SPD_S	0
+
+#define RT305X_ESW_PCR0_WT_NWAY_DATA_S	16
+#define RT305X_ESW_PCR0_WT_PHY_CMD	BIT(13)
+#define RT305X_ESW_PCR0_CPU_PHY_REG_S	8
+
+#define RT305X_ESW_PCR1_WT_DONE	BIT(0)
+
+#define RT305X_ESW_ATS_TIMEOUT	(5 * HZ)
+#define RT305X_ESW_PHY_TIMEOUT	(5 * HZ)
+
+#define RT305X_ESW_PVIDC_PVID_M	0xfff
+#define RT305X_ESW_PVIDC_PVID_S	12

[RFC 4/8] net-next: ralink: add support for rt2880

2015-11-22 Thread John Crispin

rt2880 is the oldest SoC with this core. It has a single gBit port that will
normally be attached to an external phy or switch. The patch also adds the
code required to drive the mdio bus.

Signed-off-by: John Crispin 
Signed-off-by: Felix Fietkau 
Signed-off-by: Michael Lee 
---
 drivers/net/ethernet/ralink/mdio_rt2880.c |  231 +
 drivers/net/ethernet/ralink/mdio_rt2880.h |   23 +++
 drivers/net/ethernet/ralink/soc_rt2880.c  |   76 ++
 3 files changed, 330 insertions(+)
 create mode 100644 drivers/net/ethernet/ralink/mdio_rt2880.c
 create mode 100644 drivers/net/ethernet/ralink/mdio_rt2880.h
 create mode 100644 drivers/net/ethernet/ralink/soc_rt2880.c

diff --git a/drivers/net/ethernet/ralink/mdio_rt2880.c 
b/drivers/net/ethernet/ralink/mdio_rt2880.c
new file mode 100644
index 000..f3d19f0
--- /dev/null
+++ b/drivers/net/ethernet/ralink/mdio_rt2880.c
@@ -0,0 +1,231 @@
+/*   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation; version 2 of the License
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *   GNU General Public License for more details.
+ *
+ *   Copyright (C) 2009-2015 John Crispin 
+ *   Copyright (C) 2009-2015 Felix Fietkau 
+ *   Copyright (C) 2013-2015 Michael Lee 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "ralink_soc_eth.h"
+#include "mdio_rt2880.h"
+#include "mdio.h"
+
+#define FE_MDIO_RETRY  1000
+
+static unsigned char *rt2880_speed_str(struct fe_priv *priv)
+{
+   switch (priv->phy->speed[0]) {
+   case SPEED_1000:
+   return "1000";
+   case SPEED_100:
+   return "100";
+   case SPEED_10:
+   return "10";
+   }
+
+   return "?";
+}
+
+void rt2880_mdio_link_adjust(struct fe_priv *priv, int port)
+{
+   u32 mdio_cfg;
+
+   if (!priv->link[0]) {
+   netif_carrier_off(priv->netdev);
+   netdev_info(priv->netdev, "link down\n");
+   return;
+   }
+
+   mdio_cfg = FE_MDIO_CFG_TX_CLK_SKEW_200 |
+  FE_MDIO_CFG_RX_CLK_SKEW_200 |
+  FE_MDIO_CFG_GP1_FRC_EN;
+
+   if (priv->phy->duplex[0] == DUPLEX_FULL)
+   mdio_cfg |= FE_MDIO_CFG_GP1_DUPLEX;
+
+   if (priv->phy->tx_fc[0])
+   mdio_cfg |= FE_MDIO_CFG_GP1_FC_TX;
+
+   if (priv->phy->rx_fc[0])
+   mdio_cfg |= FE_MDIO_CFG_GP1_FC_RX;
+
+   switch (priv->phy->speed[0]) {
+   case SPEED_10:
+   mdio_cfg |= FE_MDIO_CFG_GP1_SPEED_10;
+   break;
+   case SPEED_100:
+   mdio_cfg |= FE_MDIO_CFG_GP1_SPEED_100;
+   break;
+   case SPEED_1000:
+   mdio_cfg |= FE_MDIO_CFG_GP1_SPEED_1000;
+   break;
+   default:
+   BUG();
+   }
+
+   fe_w32(mdio_cfg, FE_MDIO_CFG);
+
+   netif_carrier_on(priv->netdev);
+   netdev_info(priv->netdev, "link up (%sMbps/%s duplex)\n",
+   rt2880_speed_str(priv),
+   (priv->phy->duplex[0] == DUPLEX_FULL) ? "Full" : "Half");
+}
+
+static int rt2880_mdio_wait_ready(struct fe_priv *priv)
+{
+   int retries;
+
+   retries = FE_MDIO_RETRY;
+   while (1) {
+   u32 t;
+
+   t = fe_r32(FE_MDIO_ACCESS);
+   if ((t & BIT(31)) == 0)
+   return 0;
+
+   if (retries-- == 0)
+   break;
+
+   udelay(1);
+   }
+
+   dev_err(priv->device, "MDIO operation timed out\n");
+   return -ETIMEDOUT;
+}
+
+int rt2880_mdio_read(struct mii_bus *bus, int phy_addr, int phy_reg)
+{
+   struct fe_priv *priv = bus->priv;
+   int err;
+   u32 t;
+
+   err = rt2880_mdio_wait_ready(priv);
+   if (err)
+   return 0x;
+
+   t = (phy_addr << 24) | (phy_reg << 16);
+   fe_w32(t, FE_MDIO_ACCESS);
+   t |= BIT(31);
+   fe_w32(t, FE_MDIO_ACCESS);
+
+   err = rt2880_mdio_wait_ready(priv);
+   if (err)
+   return 0x;
+
+   pr_debug("%s: addr=%04x, reg=%04x, value=%04x\n", __func__,
+phy_addr, phy_reg, fe_r32(FE_MDIO_ACCESS) & 0x);
+
+   return fe_r32(FE_MDIO_ACCESS) & 0x;
+}
+
+int rt2880_mdio_write(struct mii_bus *bus, int phy_addr, int phy_reg, u16 val)
+{
+   struct fe_priv *priv = bus->priv;
+   int err;
+   u32 t;
+
+   pr_debug("%s: addr=%04x, reg=%04x, value=%04x\n", __func__,
+phy_addr, phy_reg,

[RFC 3/8] net-next: ralink: add the drivers core files

2015-11-22 Thread John Crispin

This patch adds the main chunk of the driver. The ethernet core is used in all
of the Mediatek/Ralink Wireless SoCs. Over the years we have seen various
changes to

* the register layout
* the type of ports (single/dual gbit, internal FE/Gbit switch)
* dma engine

and new offloading features were added, such as

* checksum
* vlan tx/rx
* gso
* lro

However the core functionality has remained the same, allowing us to use the
same core for all SoCs.

The abstraction for the various SoCs uses the typical ops struct pattern which
allows us to extend or override the core's functionality depending on which SoC
we are on. The code to bring up the switches and external ports has also been
split into separate files.
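
As an illustration of the ops struct pattern described above, here is a standalone userspace sketch. Only the hook name mdio_adjust_link appears in the patch; every type, the second hook-less SoC and the other names are hypothetical and heavily simplified.

```c
#include <assert.h>
#include <stddef.h>

struct fe_priv_sketch;

/* Per-SoC hooks; a NULL hook means "nothing SoC-specific needed". */
struct soc_ops_sketch {
	void (*mdio_adjust_link)(struct fe_priv_sketch *priv, int port);
};

struct fe_priv_sketch {
	const struct soc_ops_sketch *soc;
	int link_events;	/* counts hook invocations, for illustration */
};

/* Core code: call the SoC hook only if the SoC provides one. */
static void core_link_changed(struct fe_priv_sketch *priv, int port)
{
	assert(priv && priv->soc);
	if (priv->soc->mdio_adjust_link)
		priv->soc->mdio_adjust_link(priv, port);
}

/* Example SoC implementation that overrides the hook. */
static void rt2880_like_adjust_link(struct fe_priv_sketch *priv, int port)
{
	(void)port;
	priv->link_events++;
}

static const struct soc_ops_sketch rt2880_like_ops = {
	.mdio_adjust_link = rt2880_like_adjust_link,
};

/* A SoC that leaves the hook unset and relies on the core behaviour. */
static const struct soc_ops_sketch plain_ops = { NULL };
```

The core never needs to know which SoC it runs on; it only checks whether a hook is present before calling it.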

Signed-off-by: John Crispin 
Signed-off-by: Felix Fietkau 
Signed-off-by: Michael Lee 
---
 drivers/net/ethernet/ralink/mdio.c   |  268 +
 drivers/net/ethernet/ralink/mdio.h   |   27 +
 drivers/net/ethernet/ralink/ralink_ethtool.c |  235 
 drivers/net/ethernet/ralink/ralink_ethtool.h |   22 +
 drivers/net/ethernet/ralink/ralink_soc_eth.c | 1622 ++
 drivers/net/ethernet/ralink/ralink_soc_eth.h |  519 
 6 files changed, 2693 insertions(+)
 create mode 100644 drivers/net/ethernet/ralink/mdio.c
 create mode 100644 drivers/net/ethernet/ralink/mdio.h
 create mode 100644 drivers/net/ethernet/ralink/ralink_ethtool.c
 create mode 100644 drivers/net/ethernet/ralink/ralink_ethtool.h
 create mode 100644 drivers/net/ethernet/ralink/ralink_soc_eth.c
 create mode 100644 drivers/net/ethernet/ralink/ralink_soc_eth.h

diff --git a/drivers/net/ethernet/ralink/mdio.c 
b/drivers/net/ethernet/ralink/mdio.c
new file mode 100644
index 000..de95ddb
--- /dev/null
+++ b/drivers/net/ethernet/ralink/mdio.c
@@ -0,0 +1,268 @@
+/*   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation; version 2 of the License
+ *
+ *   Copyright (C) 2009-2015 John Crispin 
+ *   Copyright (C) 2009-2015 Felix Fietkau 
+ *   Copyright (C) 2013-2015 Michael Lee 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "ralink_soc_eth.h"
+#include "mdio.h"
+
+static int fe_mdio_reset(struct mii_bus *bus)
+{
+   /* TODO */
+   return 0;
+}
+
+static void fe_phy_link_adjust(struct net_device *dev)
+{
+   struct fe_priv *priv = netdev_priv(dev);
+   unsigned long flags;
+   int i;
+
+   spin_lock_irqsave(&priv->phy->lock, flags);
+   for (i = 0; i < 8; i++) {
+   if (priv->phy->phy_node[i]) {
+   struct phy_device *phydev = priv->phy->phy[i];
+   int status_change = 0;
+
+   if (phydev->link)
+   if (priv->phy->duplex[i] != phydev->duplex ||
+   priv->phy->speed[i] != phydev->speed)
+   status_change = 1;
+
+   if (phydev->link != priv->link[i])
+   status_change = 1;
+
+   switch (phydev->speed) {
+   case SPEED_1000:
+   case SPEED_100:
+   case SPEED_10:
+   priv->link[i] = phydev->link;
+   priv->phy->duplex[i] = phydev->duplex;
+   priv->phy->speed[i] = phydev->speed;
+
+   if (status_change &&
+   priv->soc->mdio_adjust_link)
+   priv->soc->mdio_adjust_link(priv, i);
+   break;
+   }
+   }
+   }
+}
+
+int fe_connect_phy_node(struct fe_priv *priv, struct device_node *phy_node)
+{
+   const __be32 *_port = NULL;
+   struct phy_device *phydev;
+   int phy_mode, port;
+
+   _port = of_get_property(phy_node, "reg", NULL);
+
+   if (!_port || (be32_to_cpu(*_port) >= 0x20)) {
+   pr_err("%s: invalid port id\n", phy_node->name);
+   return -EINVAL;
+   }
+   port = be32_to_cpu(*_port);
+   phy_mode = of_get_phy_mode(phy_node);
+   if (phy_mode < 0) {
+   dev_err(priv->device, "incorrect phy-mode %d\n", phy_mode);
+   priv->phy->phy_node[port] = NULL;
+   return -EINVAL;
+   }
+
+   phydev = of_phy_connect(priv->netdev, phy_node, fe_phy_link_adjust,
+   0, phy_mode);
+   if (IS_ERR(phydev)) {
+   dev_err(priv->device, "could not connect to PHY\n");
+   priv->phy->phy_node[port] = NULL;
+   return PTR_ERR(phydev);
+   }
+
+

[RFC 2/8] net-next: phy: dont auto handle carrier state when multiple phys are attached

2015-11-22 Thread John Crispin

A network core might have more than one phy attached to its cpu port via a
switch. The current code will set the carrier state to on/off whenever a
cable is plugged into any of these ports.

The patch adds a new bool that allows the driver to tell the phy_device to not
set the carrier state. Instead the driver can manually handle the carrier
state.
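
The commit message does not say how a driver would derive the carrier state by hand. One plausible policy, sketched here as plain userspace C with a hypothetical function name, is to report carrier as long as any attached port still has link:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* With several PHYs behind one netdev, keep the carrier on until every
 * port has lost link. Returns the carrier state the driver would then
 * apply itself (assumed policy, not taken from the patch). */
static bool carrier_from_ports(const bool *link, size_t nports)
{
	size_t i;

	for (i = 0; i < nports; i++)
		if (link[i])
			return true;
	return false;
}
```

A driver following this policy would call netif_carrier_on() or netif_carrier_off() from its adjust_link callback based on the returned value, instead of relying on the phy state machine.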

Signed-off-by: John Crispin 
Signed-off-by: Felix Fietkau 
Signed-off-by: Michael Lee 
Cc: Florian Fainelli 
---
 drivers/net/phy/phy.c |9 ++---
 include/linux/phy.h   |1 +
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 48ce6ef..bd2df40 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -843,7 +843,8 @@ void phy_state_machine(struct work_struct *work)
/* If the link is down, give up on negotiation for now */
if (!phydev->link) {
phydev->state = PHY_NOLINK;
-   netif_carrier_off(phydev->attached_dev);
+   if (!phydev->no_auto_carrier_off)
+   netif_carrier_off(phydev->attached_dev);
phydev->adjust_link(phydev->attached_dev);
break;
}
@@ -926,7 +927,8 @@ void phy_state_machine(struct work_struct *work)
netif_carrier_on(phydev->attached_dev);
} else {
phydev->state = PHY_NOLINK;
-   netif_carrier_off(phydev->attached_dev);
+   if (!phydev->no_auto_carrier_off)
+   netif_carrier_off(phydev->attached_dev);
}
 
phydev->adjust_link(phydev->attached_dev);
@@ -938,7 +940,8 @@ void phy_state_machine(struct work_struct *work)
case PHY_HALTED:
if (phydev->link) {
phydev->link = 0;
-   netif_carrier_off(phydev->attached_dev);
+   if (!phydev->no_auto_carrier_off)
+   netif_carrier_off(phydev->attached_dev);
phydev->adjust_link(phydev->attached_dev);
do_suspend = true;
}
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 05fde31..276ab8a 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -377,6 +377,7 @@ struct phy_device {
bool is_pseudo_fixed_link;
bool has_fixups;
bool suspended;
+   bool no_auto_carrier_off;
 
enum phy_state state;
 
-- 
1.7.10.4
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[RFC 1/8] Documentation: DT: net: add docs for ralink/mediatek SoC ethernet binding

2015-11-22 Thread John Crispin

Add three files. ralink,rt2880-net.txt describes the actual frame engine
and the other two describe the switch frontend bindings.

Signed-off-by: John Crispin 
Signed-off-by: Felix Fietkau 
Signed-off-by: Michael Lee 
Cc: devicet...@vger.kernel.org
---
 .../bindings/net/mediatek,mt7620-gsw.txt   |   26 +
 .../devicetree/bindings/net/ralink,rt2880-net.txt  |   61 
 .../devicetree/bindings/net/ralink,rt3050-esw.txt  |   32 ++
 3 files changed, 119 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/mediatek,mt7620-gsw.txt
 create mode 100644 Documentation/devicetree/bindings/net/ralink,rt2880-net.txt
 create mode 100644 Documentation/devicetree/bindings/net/ralink,rt3050-esw.txt

diff --git a/Documentation/devicetree/bindings/net/mediatek,mt7620-gsw.txt b/Documentation/devicetree/bindings/net/mediatek,mt7620-gsw.txt
new file mode 100644
index 000..fb47d8e
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/mediatek,mt7620-gsw.txt
@@ -0,0 +1,26 @@
+Mediatek Gigabit Switch
+===
+
+The mediatek gigabit switch can be found on Mediatek SoCs (mt7620, mt7621).
+
+Required properties:
+- compatible: Should be "mediatek,mt7620-gsw"
+- reg: Address and length of the register set for the device
+- interrupt-parent: Should be the phandle for the interrupt controller
+  that services interrupts for this device
+- interrupts: Should contain the gigabit switches interrupt
+- resets: Should contain the gigabit switches resets
+- reset-names: Should contain the reset names "gsw"
+
+Example:
+
+gsw@1011 {
+   compatible = "ralink,mt7620-gsw";
+   reg = <0x1011 8000>;
+
+   resets = < 23>;
+   reset-names = "gsw";
+
+   interrupt-parent = <>;
+   interrupts = <17>;
+};
diff --git a/Documentation/devicetree/bindings/net/ralink,rt2880-net.txt b/Documentation/devicetree/bindings/net/ralink,rt2880-net.txt
new file mode 100644
index 000..88b095d
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/ralink,rt2880-net.txt
@@ -0,0 +1,61 @@
+Ralink Frame Engine Ethernet controller
+===
+
+The Ralink frame engine ethernet controller can be found on Ralink and
+Mediatek SoCs (RT288x, RT3x5x, RT366x, RT388x, rt5350, mt7620, mt7621, mt76x8).
+
+Depending on the SoC, there is a number of ports connected to the CPU port
+directly and/or via a (gigabit-)switch.
+
+* Ethernet controller node
+
+Required properties:
+- compatible: Should be one of "ralink,rt2880-eth", "ralink,rt3050-eth",
+  "ralink,rt3050-eth", "ralink,rt3883-eth", "ralink,rt5350-eth",
+  "mediatek,mt7620-eth", "mediatek,mt7621-eth"
+- reg: Address and length of the register set for the device
+- interrupt-parent: Should be the phandle for the interrupt controller
+  that services interrupts for this device
+- interrupts: Should contain the frame engines interrupt
+- resets: Should contain the frame engines resets
+- reset-names: Should contain the reset names "fe". If a switch is present
+  "esw" is also required.
+
+
+* Ethernet port node
+
+Required properties:
+- compatible: Should be "ralink,eth-port"
+- reg: The number of the physical port
+- phy-handle: reference to the node describing the phy
+
+Example:
+
+mdio-bus {
+   ...
+   phy0: ethernet-phy@0 {
+   phy-mode = "mii";
+   reg = <0>;
+   };
+};
+
+ethernet@40 {
+   compatible = "ralink,rt2880-eth";
+   reg = <0x0040 1>;
+
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   resets = < 18>;
+   reset-names = "fe";
+
+   interrupt-parent = <>;
+   interrupts = <5>;
+
+   port@0 {
+   compatible = "ralink,eth-port";
+   reg = <0>;
+   phy-handle = <>;
+   };
+
+};
diff --git a/Documentation/devicetree/bindings/net/ralink,rt3050-esw.txt b/Documentation/devicetree/bindings/net/ralink,rt3050-esw.txt
new file mode 100644
index 000..ed32e21
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/ralink,rt3050-esw.txt
@@ -0,0 +1,32 @@
+Ralink Fast Ethernet Embedded Switch
+
+
+The ralink fast ethernet embedded switch can be found on Ralink and Mediatek
+SoCs (RT3x5x, rt5350, mt76x8).
+
+Required properties:
+- compatible: Should be "ralink,rt3050-esw"
+- reg: Address and length of the register set for the device
+- interrupt-parent: Should be the phandle for the interrupt controller
+  that services interrupts for this device
+- interrupts: Should contain the embedded switches interrupt
+- resets: Should contain the embedded switches resets
+- reset-names: Should contain the reset names "esw"
+
+Optional properties:
+- ralink,portmap: can be used to choose if the default switch setup is
+  w or w
+- ralink,led_polarity: override the active high/low settings of the leds
+
+Example:
+
+esw@1011 {
+   compatible =

[RFC 8/8] net-next: ralink: add Kconfig and Makefile

2015-11-22 Thread John Crispin

This patch adds the Makefile and Kconfig required to make the driver build.

Signed-off-by: John Crispin 
Signed-off-by: Felix Fietkau 
Signed-off-by: Michael Lee 
---
 drivers/net/ethernet/Kconfig |1 +
 drivers/net/ethernet/Makefile|1 +
 drivers/net/ethernet/ralink/Kconfig  |   49 ++
 drivers/net/ethernet/ralink/Makefile |   19 +
 4 files changed, 70 insertions(+)
 create mode 100644 drivers/net/ethernet/ralink/Kconfig
 create mode 100644 drivers/net/ethernet/ralink/Makefile

diff --git a/drivers/net/ethernet/Kconfig b/drivers/net/ethernet/Kconfig
index 955d06b..2d24101 100644
--- a/drivers/net/ethernet/Kconfig
+++ b/drivers/net/ethernet/Kconfig
@@ -153,6 +153,7 @@ source "drivers/net/ethernet/packetengines/Kconfig"
 source "drivers/net/ethernet/pasemi/Kconfig"
 source "drivers/net/ethernet/qlogic/Kconfig"
 source "drivers/net/ethernet/qualcomm/Kconfig"
+source "drivers/net/ethernet/ralink/Kconfig"
 source "drivers/net/ethernet/realtek/Kconfig"
 source "drivers/net/ethernet/renesas/Kconfig"
 source "drivers/net/ethernet/rdc/Kconfig"
diff --git a/drivers/net/ethernet/Makefile b/drivers/net/ethernet/Makefile
index 4a2ee98..fba816c 100644
--- a/drivers/net/ethernet/Makefile
+++ b/drivers/net/ethernet/Makefile
@@ -63,6 +63,7 @@ obj-$(CONFIG_NET_PACKET_ENGINE) += packetengines/
 obj-$(CONFIG_NET_VENDOR_PASEMI) += pasemi/
 obj-$(CONFIG_NET_VENDOR_QLOGIC) += qlogic/
 obj-$(CONFIG_NET_VENDOR_QUALCOMM) += qualcomm/
+obj-$(CONFIG_NET_RALINK) += ralink/
 obj-$(CONFIG_NET_VENDOR_REALTEK) += realtek/
 obj-$(CONFIG_NET_VENDOR_RENESAS) += renesas/
 obj-$(CONFIG_NET_VENDOR_RDC) += rdc/
diff --git a/drivers/net/ethernet/ralink/Kconfig b/drivers/net/ethernet/ralink/Kconfig
new file mode 100644
index 000..09639cd
--- /dev/null
+++ b/drivers/net/ethernet/ralink/Kconfig
@@ -0,0 +1,49 @@
+config NET_RALINK
+   tristate "Ralink ethernet driver"
+   depends on RALINK
+   help
+ This driver supports the ethernet mac inside the ralink wisocs
+
+if NET_RALINK
+choice
+   prompt "MAC type"
+
+config NET_RALINK_RT2880
+   bool "RT2882"
+   depends on SOC_RT288X
+
+config NET_RALINK_RT3050
+   bool "RT3050/MT7628"
+   depends on (SOC_RT305X || SOC_MT7620)
+
+config NET_RALINK_RT3883
+   bool "RT3883"
+   depends on SOC_RT3883
+
+config NET_RALINK_MT7620
+   bool "MT7620"
+   depends on (SOC_MT7620 || SOC_MT7621)
+
+endchoice
+
+config NET_RALINK_MDIO
+   def_bool NET_RALINK
+   depends on (NET_RALINK_RT2880 || NET_RALINK_RT3883 || NET_RALINK_MT7620 || NET_RALINK_MT7621)
+   select PHYLIB
+
+config NET_RALINK_MDIO_RT2880
+   def_bool NET_RALINK
+   depends on (NET_RALINK_RT2880 || NET_RALINK_RT3883)
+   select NET_RALINK_MDIO
+
+config NET_RALINK_ESW_RT3050
+   def_bool NET_RALINK
+   depends on NET_RALINK_RT3050
+   select PHYLIB
+
+config NET_RALINK_GSW_MT7620
+   def_bool NET_RALINK
+   depends on NET_RALINK_MT7620 || NET_RALINK_MT7621
+   select NET_RALINK_MDIO
+   select PHYLIB
+endif
diff --git a/drivers/net/ethernet/ralink/Makefile b/drivers/net/ethernet/ralink/Makefile
new file mode 100644
index 000..eb90c56
--- /dev/null
+++ b/drivers/net/ethernet/ralink/Makefile
@@ -0,0 +1,19 @@
+#
+# Makefile for the Ralink SoCs built-in ethernet macs
+#
+
+ralink-eth-y   += ralink_soc_eth.o ralink_ethtool.o
+
+ralink-eth-$(CONFIG_NET_RALINK_MDIO)   += mdio.o
+ralink-eth-$(CONFIG_NET_RALINK_MDIO_RT2880)+= mdio_rt2880.o
+
+ralink-eth-$(CONFIG_NET_RALINK_ESW_RT3050) += esw_rt3050.o
+ralink-eth-$(CONFIG_NET_RALINK_GSW_MT7620) += gsw_mt7620.o
+
+ralink-eth-$(CONFIG_NET_RALINK_RT2880) += soc_rt2880.o
+ralink-eth-$(CONFIG_NET_RALINK_RT3050) += soc_rt3050.o
+ralink-eth-$(CONFIG_NET_RALINK_RT3883) += soc_rt3883.o
+ralink-eth-$(CONFIG_NET_RALINK_MT7620) += soc_mt7620.o
+ralink-eth-$(CONFIG_NET_RALINK_MT7621) += soc_mt7621.o
+
+obj-$(CONFIG_NET_RALINK)   += ralink-eth.o
-- 
1.7.10.4

[RFC 6/8] net-next: ralink: add support for rt3883

2015-11-22 Thread John Crispin

Add support for rt3883 and its smaller version rt3662. They both have a single
gBit port that will normally be attached to an external phy or switch.

Signed-off-by: John Crispin 
Signed-off-by: Felix Fietkau 
Signed-off-by: Michael Lee 
---
 drivers/net/ethernet/ralink/soc_rt3883.c |   75 ++
 1 file changed, 75 insertions(+)
 create mode 100644 drivers/net/ethernet/ralink/soc_rt3883.c

diff --git a/drivers/net/ethernet/ralink/soc_rt3883.c b/drivers/net/ethernet/ralink/soc_rt3883.c
new file mode 100644
index 000..f7b6769
--- /dev/null
+++ b/drivers/net/ethernet/ralink/soc_rt3883.c
@@ -0,0 +1,75 @@
+/*   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation; version 2 of the License
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *   GNU General Public License for more details.
+ *
+ *   Copyright (C) 2009-2015 John Crispin 
+ *   Copyright (C) 2009-2015 Felix Fietkau 
+ *   Copyright (C) 2013-2015 Michael Lee 
+ */
+
+#include 
+
+#include 
+
+#include "ralink_soc_eth.h"
+#include "mdio_rt2880.h"
+
+#define RT3883_RSTCTRL_FE  BIT(21)
+
+static void rt3883_fe_reset(void)
+{
+   fe_reset(RT3883_RSTCTRL_FE);
+}
+
+static int rt3883_fwd_config(struct fe_priv *priv)
+{
+   int ret;
+
+   ret = fe_set_clock_cycle(priv);
+   if (ret)
+   return ret;
+
+   fe_fwd_config(priv);
+   fe_w32(FE_PSE_FQFC_CFG_256Q, FE_PSE_FQ_CFG);
+   fe_csum_config(priv);
+
+   return ret;
+}
+
+static void rt3883_init_data(struct fe_soc_data *data,
+struct net_device *netdev)
+{
+   struct fe_priv *priv = netdev_priv(netdev);
+
+   priv->flags = FE_FLAG_PADDING_64B | FE_FLAG_PADDING_BUG |
+   FE_FLAG_JUMBO_FRAME;
+   netdev->hw_features = NETIF_F_SG | NETIF_F_IP_CSUM |
+   NETIF_F_RXCSUM | NETIF_F_HW_VLAN_CTAG_TX;
+}
+
+static struct fe_soc_data rt3883_data = {
+   .init_data = rt3883_init_data,
+   .reset_fe = rt3883_fe_reset,
+   .fwd_config = rt3883_fwd_config,
+   .pdma_glo_cfg = FE_PDMA_SIZE_8DWORDS,
+   .rx_int = FE_RX_DONE_INT,
+   .tx_int = FE_TX_DONE_INT,
+   .status_int = FE_CNT_GDM_AF,
+   .checksum_bit = RX_DMA_L4VALID,
+   .mdio_read = rt2880_mdio_read,
+   .mdio_write = rt2880_mdio_write,
+   .mdio_adjust_link = rt2880_mdio_link_adjust,
+   .port_init = rt2880_port_init,
+};
+
+const struct of_device_id of_fe_match[] = {
+   { .compatible = "ralink,rt3883-eth", .data = &rt3883_data },
+   {},
+};
+
+MODULE_DEVICE_TABLE(of, of_fe_match);
-- 
1.7.10.4

[RFC 7/8] net-next: ralink: add support for mt7620 family

2015-11-22 Thread John Crispin

Add support for SoCs from the mt7620 family. This includes mt7620 and mt7621.
These all have one dedicated external gbit port and a builtin 5 port 100mbit
switch. Additionally one of the 5 switch ports can be changed to become an
additional gbit port that we can attach a phy to. This patch includes
rudimentary code to power up the switch. There are a lot of magic values
that get written to the switch and the internal phys. These values come
straight from the SDK driver.
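
As a rough illustration of how the internal phys are reached: the diff below
defines a set of GSW_MDIO_* bits for an indirect MDIO access register. The
sketch that follows shows one way such a command word might be assembled from
those bits. The bit values are copied from the patch; the 5-bit address and
register fields, the 16-bit data field and the two helper names are
assumptions made for this example, since the function bodies that use the
bits are not visible in this excerpt.

```c
#include <stdint.h>

#define BIT(n)			(1u << (n))

/* Bit definitions as they appear in the patch. */
#define GSW_MDIO_ACCESS		BIT(31)
#define GSW_MDIO_READ		BIT(19)
#define GSW_MDIO_WRITE		BIT(18)
#define GSW_MDIO_START		BIT(16)
#define GSW_MDIO_ADDR_SHIFT	20
#define GSW_MDIO_REG_SHIFT	25

/* Assumed layout: 5-bit phy address, 5-bit register number. */
static uint32_t gsw_mdio_read_cmd(uint32_t phy_addr, uint32_t phy_reg)
{
	return GSW_MDIO_ACCESS | GSW_MDIO_START | GSW_MDIO_READ |
	       ((phy_reg & 0x1f) << GSW_MDIO_REG_SHIFT) |
	       ((phy_addr & 0x1f) << GSW_MDIO_ADDR_SHIFT);
}

/* Assumed layout: the 16-bit value occupies the low half of the word. */
static uint32_t gsw_mdio_write_cmd(uint32_t phy_addr, uint32_t phy_reg,
				   uint16_t val)
{
	return GSW_MDIO_ACCESS | GSW_MDIO_START | GSW_MDIO_WRITE |
	       ((phy_reg & 0x1f) << GSW_MDIO_REG_SHIFT) |
	       ((phy_addr & 0x1f) << GSW_MDIO_ADDR_SHIFT) | val;
}
```

In a real driver the resulting word would be written to the access register
and the busy bit polled until the transaction completes; that part is left
out here.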

Signed-off-by: John Crispin 
Signed-off-by: Felix Fietkau 
Signed-off-by: Michael Lee 
---
 drivers/net/ethernet/ralink/gsw_mt7620.c |  784 ++
 drivers/net/ethernet/ralink/gsw_mt7620.h |   26 +
 drivers/net/ethernet/ralink/soc_mt7620.c |  273 +++
 3 files changed, 1083 insertions(+)
 create mode 100644 drivers/net/ethernet/ralink/gsw_mt7620.c
 create mode 100644 drivers/net/ethernet/ralink/gsw_mt7620.h
 create mode 100644 drivers/net/ethernet/ralink/soc_mt7620.c

diff --git a/drivers/net/ethernet/ralink/gsw_mt7620.c b/drivers/net/ethernet/ralink/gsw_mt7620.c
new file mode 100644
index 000..24bc312
--- /dev/null
+++ b/drivers/net/ethernet/ralink/gsw_mt7620.c
@@ -0,0 +1,784 @@
+/*   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation; version 2 of the License
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *   GNU General Public License for more details.
+ *
+ *   Copyright (C) 2009-2015 John Crispin 
+ *   Copyright (C) 2009-2015 Felix Fietkau 
+ *   Copyright (C) 2013-2015 Michael Lee 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include "ralink_soc_eth.h"
+
+#include 
+#include 
+
+#include "ralink_soc_eth.h"
+#include "gsw_mt7620.h"
+#include "mdio.h"
+
+#define GSW_REG_PHY_TIMEOUT    (5 * HZ)
+
+#ifdef CONFIG_SOC_MT7621
+#define MT7620A_GSW_REG_PIAC   0x0004
+#else
+#define MT7620A_GSW_REG_PIAC   0x7004
+#endif
+
+#define GSW_NUM_VLANS  16
+#define GSW_NUM_VIDS   4096
+#define GSW_NUM_PORTS  7
+#define GSW_PORT6  6
+
+#define GSW_MDIO_ACCESS        BIT(31)
+#define GSW_MDIO_READ  BIT(19)
+#define GSW_MDIO_WRITE BIT(18)
+#define GSW_MDIO_START BIT(16)
+#define GSW_MDIO_ADDR_SHIFT20
+#define GSW_MDIO_REG_SHIFT 25
+
+#define GSW_REG_PORT_PMCR(x)   (0x3000 + (x * 0x100))
+#define GSW_REG_PORT_STATUS(x) (0x3008 + (x * 0x100))
+#define GSW_REG_SMACCR0        0x3fE4
+#define GSW_REG_SMACCR1        0x3fE8
+#define GSW_REG_CKGCR  0x3ff0
+
+#define GSW_REG_IMR            0x7008
+#define GSW_REG_ISR            0x700c
+#define GSW_REG_GPC1   0x7014
+
+#define SYSC_REG_CHIP_REV_ID   0x0c
+#define SYSC_REG_CFG1  0x14
+#define RST_CTRL_MCM   BIT(2)
+#define SYSC_PAD_RGMII2_MDIO   0x58
+#define SYSC_GPIO_MODE 0x60
+
+#define PORT_IRQ_ST_CHG        0x7f
+
+#ifdef CONFIG_SOC_MT7621
+#define ESW_PHY_POLLING        0x
+#else
+#define ESW_PHY_POLLING        0x7000
+#endif
+
+#define PMCR_IPG               BIT(18)
+#define PMCR_MAC_MODE          BIT(16)
+#define PMCR_FORCE             BIT(15)
+#define PMCR_TX_EN             BIT(14)
+#define PMCR_RX_EN             BIT(13)
+#define PMCR_BACKOFF           BIT(9)
+#define PMCR_BACKPRES          BIT(8)
+#define PMCR_RX_FC             BIT(5)
+#define PMCR_TX_FC             BIT(4)
+#define PMCR_SPEED(_x)         (_x << 2)
+#define PMCR_DUPLEX            BIT(1)
+#define PMCR_LINK              BIT(0)
+
+#define PHY_AN_EN  BIT(31)
+#define PHY_PRE_EN BIT(30)
+#define PMY_MDC_CONF(_x)   ((_x & 0x3f) << 24)
+
+enum {
+   /* Global attributes. */
+   GSW_ATTR_ENABLE_VLAN,
+   /* Port attributes. */
+   GSW_ATTR_PORT_UNTAG,
+};
+
+enum {
+   PORT4_EPHY = 0,
+   PORT4_EXT,
+};
+
+struct mt7620_gsw {
+   struct device   *dev;
+   void __iomem*base;
+   int irq;
+   int port4;
+   unsigned long int   autopoll;
+};
+
+static inline void gsw_w32(struct mt7620_gsw *gsw, u32 val, unsigned reg)
+{
+   iowrite32(val, gsw->base + reg);
+}
+
+static inline u32 gsw_r32(struct mt7620_gsw *gsw, unsigned reg)
+{
+   return ioread32(gsw->base + reg);
+}
+
+static int mt7620_mii_busy_wait(struct mt7620_gsw *gsw)
+{
+   unsigned long t_start =

[PATCH 02/13] net: mvneta: enable IP checksum with jumbo frames for Armada 38x on Port0

2015-11-22 Thread Marcin Wojtas

The Ethernet controller found in the Armada 38x SoC family supports
TCP/IP checksumming with frame sizes larger than 1600 bytes, however
only on port 0.

This commit enables this feature by using 'marvell,armada-xp-neta' in
'ethernet@7' node.

Signed-off-by: Marcin Wojtas 
Cc:  # v3.18+
---
 arch/arm/boot/dts/armada-38x.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/armada-38x.dtsi b/arch/arm/boot/dts/armada-38x.dtsi
index c6a0e9d..b7868b2 100644
--- a/arch/arm/boot/dts/armada-38x.dtsi
+++ b/arch/arm/boot/dts/armada-38x.dtsi
@@ -494,7 +494,7 @@
};
 
eth0: ethernet@7 {
-   compatible = "marvell,armada-370-neta";
+   compatible = "marvell,armada-xp-neta";
reg = <0x7 0x4000>;
interrupts-extended = < 8>;
clocks = < 4>;
-- 
1.8.3.1


[PATCH 03/13] net: mvneta: fix bit assignment in MVNETA_RXQ_CONFIG_REG

2015-11-22 Thread Marcin Wojtas

MVNETA_RXQ_HW_BUF_ALLOC bit which controls enabling hardware buffer
allocation was mistakenly set as BIT(1). This commit fixes the assignment.
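
To make the effect of the wrong mask concrete, here is a small stand-alone
sketch. The BIT() macro and the two helpers are written for this example
only, and the _OLD/_NEW names are made up to contrast the value before and
after the fix; only the bit positions come from the patch.

```c
#include <stdint.h>

#define BIT(n)	(1u << (n))

#define RXQ_HW_BUF_ALLOC_OLD	BIT(1)	/* value before the fix */
#define RXQ_HW_BUF_ALLOC_NEW	BIT(0)	/* value after the fix */

/* Set a flag in a register image, leaving the other bits untouched. */
static uint32_t reg_set(uint32_t reg, uint32_t mask)
{
	return reg | mask;
}

/* Report whether a flag is set in a register image. */
static int reg_test(uint32_t reg, uint32_t mask)
{
	return (reg & mask) != 0;
}
```

A register image built with the old mask has bit 1 set and bit 0 clear, so
the bit this patch identifies as the hardware buffer allocation enable would
never have been set by code using the old definition.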

Signed-off-by: Marcin Wojtas 
Cc:  # v3.8+
---
 drivers/net/ethernet/marvell/mvneta.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 0f30aaa..d12b8c6 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -36,7 +36,7 @@
 
 /* Registers */
 #define MVNETA_RXQ_CONFIG_REG(q)(0x1400 + ((q) << 2))
-#define  MVNETA_RXQ_HW_BUF_ALLOCBIT(1)
+#define  MVNETA_RXQ_HW_BUF_ALLOCBIT(0)
 #define  MVNETA_RXQ_PKT_OFFSET_ALL_MASK (0xf<< 8)
 #define  MVNETA_RXQ_PKT_OFFSET_MASK(offs)   ((offs) << 8)
 #define MVNETA_RXQ_THRESHOLD_REG(q) (0x14c0 + ((q) << 2))
-- 
1.8.3.1


[PATCH v1] net: stmmac: Free rx_skbufs before realloc

2015-11-22 Thread Shunqian Zheng

From: ZhengShunQian 

The init_dma_desc_rings() may reallocate the rx_skbuff[] entries on
suspend and resume. This patch frees the rx_skbuff[] entries before
reallocating memory.
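
The pattern can be modelled in user space as follows. This is only a sketch
of the "free before re-initialising" idea, not the stmmac code: the ring
struct, the ring size, the buffer size and the live_bufs counter are all
hypothetical and exist just to make a leak observable.

```c
#include <stdlib.h>

#define RING_SIZE 4	/* hypothetical ring size */

static int live_bufs;	/* outstanding allocations in this model */

struct ring {
	void *buf[RING_SIZE];
};

/* Release every buffer that is still attached to the ring. */
static void ring_free_bufs(struct ring *r)
{
	int i;

	for (i = 0; i < RING_SIZE; i++) {
		if (r->buf[i]) {
			free(r->buf[i]);
			r->buf[i] = NULL;
			live_bufs--;
		}
	}
}

/* Mirrors the patched flow: drop old buffers, then allocate new ones. */
static int ring_init(struct ring *r)
{
	int i;

	ring_free_bufs(r);

	for (i = 0; i < RING_SIZE; i++) {
		r->buf[i] = malloc(64);
		if (!r->buf[i]) {
			ring_free_bufs(r);
			return -1;
		}
		live_bufs++;
	}
	return 0;
}
```

Calling ring_init() twice in a row, as a suspend/resume cycle would, leaves
exactly RING_SIZE buffers alive. Without the ring_free_bufs() call at the top
of ring_init(), the first set of pointers would be overwritten and lost.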

Signed-off-by: ZhengShunQian 
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 64d8aa4..2af1ed9 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1022,6 +1022,14 @@ static void stmmac_free_rx_buffers(struct stmmac_priv *priv, int i)
priv->rx_skbuff[i] = NULL;
 }
 
+static void dma_free_rx_skbufs(struct stmmac_priv *priv)
+{
+   int i;
+
+   for (i = 0; i < priv->dma_rx_size; i++)
+   stmmac_free_rx_buffers(priv, i);
+}
+
 /**
  * init_dma_desc_rings - init the RX/TX descriptor rings
  * @dev: net device structure
@@ -1058,6 +1066,8 @@ static int init_dma_desc_rings(struct net_device *dev, gfp_t flags)
/* RX INITIALIZATION */
pr_debug("\tSKB addresses:\nskb\t\tskb data\tdma data\n");
}
+
+   dma_free_rx_skbufs(priv);
for (i = 0; i < rxsize; i++) {
struct dma_desc *p;
if (priv->extend_desc)
@@ -1122,14 +1132,6 @@ err_init_rx_buffers:
return ret;
 }
 
-static void dma_free_rx_skbufs(struct stmmac_priv *priv)
-{
-   int i;
-
-   for (i = 0; i < priv->dma_rx_size; i++)
-   stmmac_free_rx_buffers(priv, i);
-}
-
 static void dma_free_tx_skbufs(struct stmmac_priv *priv)
 {
int i;
-- 
1.9.1


[RFC PATCH v1] Trying to fix the stmmac memory leak during suspend/resume

2015-11-22 Thread Shunqian Zheng

From: ZhengShunQian 

When I run a suspend-to-RAM stress test on my Rockchip RK3288 (SoC) board
with the integrated stmmac ethernet, it always runs out of memory (OOM) after
a few iterations; usually 50 iterations are enough to reproduce.

With a kernel compiled with the KMEMLEAK feature, I got the logs below:
unreferenced object 0xed89ac00 (size 192):
  comm "busybox", pid 79, jiffies 2251 (age 54.580s)
  hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
00 00 00 00 00 00 d1 ed 00 00 00 00 00 00 00 00  
  backtrace:
[] kmemleak_alloc+0x44/0x78
[] kmem_cache_alloc+0x1ac/0x264
[] __build_skb+0x38/0x9c
[] __netdev_alloc_skb+0xac/0x118
[] init_dma_desc_rings+0xcc/0x474
[] stmmac_resume+0xc4/0x14c
[] stmmac_pltfr_resume+0x3c/0x40
[] platform_pm_resume+0x3c/0x50
[] dpm_run_callback+0x7c/0x160
[] device_resume+0x174/0x1c0
[] dpm_resume+0x110/0x2cc
[] dpm_resume_end+0x1c/0x28
[] suspend_devices_and_enter+0x53c/0x6ec
[] pm_suspend+0x334/0x478
[] state_store+0xac/0xc8
[] kobj_attr_store+0x1c/0x28

Actually I don't think I know net/stmmac well enough to fix this bug.
I would really appreciate it if the experts of net/stmmac could take it
over if you think it is a bug too.

ZhengShunQian (1):
  net: stmmac: Free rx_skbufs before realloc

 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

-- 
1.9.1


RE: [PATCH V3 net-next 3/5] net:hns: Add Hip06 "TSO(TCP Segment Offload)" support HNS Driver

2015-11-22 Thread Yuval Mintz

> +static void hns_ae_set_tso_stats(struct hnae_handle *handle, int
> +enable) {
> + struct hns_ppe_cb *ppe_cb = hns_get_ppe_cb(handle);
> +
> + hns_ppe_set_tso_enable(ppe_cb, enable); }

Style issues?

> +void hns_ppe_set_tso_enable(struct hns_ppe_cb *ppe_cb, u32 value) {
> + dsaf_set_dev_bit(ppe_cb, PPEV2_CFG_TSO_EN_REG, 0, !!value); }
> +

Likewise

Re: Use-after-free in ppoll

2015-11-22 Thread Dmitry Vyukov

On Sun, Nov 22, 2015 at 3:32 PM, Rainer Weikusat
 wrote:
> Dmitry Vyukov  writes:
>> Hello,
>>
>> On commit f2d10565b9bdbb722bd43e6e1a759eeddb9645c8 (Nov 20).
>>
>> The following program triggers use-after-free:
>>
>> // autogenerated by syzkaller (http://github.com/google/syzkaller)
>> #include 
>> #include 
>> #include 
>> #include 
>>
>> void *thread(void *p)
>> {
>> syscall(SYS_write, (long)p, 0x2000278ful, 0x1ul, 0, 0, 0);
>> return 0;
>> }
>
> [...]
>
>
>> long r1 = syscall(SYS_socketpair, 0x1ul, 0x3ul, 0x0ul,
>
> [...]
>
>> long r5 = syscall(SYS_close, r2, 0, 0, 0, 0, 0);
>> pthread_t th;
>> pthread_create(&th, 0, thread, (void*)(long)r3);
>
> [...]
>
>> long r21 = syscall(SYS_ppoll, 0x2ffful, 0x3ul, 0x2ffcul, 
>> 0x2ffdul, 0x8ul, 0);
>> return 0;
>> }
>
> That's one of the already known sequences for triggering this issue: The
> close will clear the peer pointer of the closed socket, hence, the 2nd
> sock_poll_wait will be called by unix_dgram_poll. The write will
> execute unix_dgram_sendmsg which detects that the peer is dead and
> disconnects from it, causing the corresponding structures to be freed
> even though they're still in use.
>
> NB: I didn't execute this but I spent a fair amount of time with the
> af_unix.c code during the last couple of weeks and consider myself
> "reasonably familiar" with it and that's IMO what should happen here.


I have not read the code. But I just want to point out that all 3
reports are different. For example, in the first one, ppoll both frees
the object and then accesses it. That is, it is not write that frees
the object.

Re: Use-after-free in ppoll

2015-11-22 Thread Rainer Weikusat

Dmitry Vyukov  writes:
> Hello,
>
> On commit f2d10565b9bdbb722bd43e6e1a759eeddb9645c8 (Nov 20).
>
> The following program triggers use-after-free:
>
> // autogenerated by syzkaller (http://github.com/google/syzkaller)
> #include 
> #include 
> #include 
> #include 
>
> void *thread(void *p)
> {
> syscall(SYS_write, (long)p, 0x2000278ful, 0x1ul, 0, 0, 0);
> return 0;
> }

[...]


> long r1 = syscall(SYS_socketpair, 0x1ul, 0x3ul, 0x0ul,

[...]

> long r5 = syscall(SYS_close, r2, 0, 0, 0, 0, 0);
> pthread_t th;
> pthread_create(&th, 0, thread, (void*)(long)r3);

[...]

> long r21 = syscall(SYS_ppoll, 0x2ffful, 0x3ul, 0x2ffcul, 
> 0x2ffdul, 0x8ul, 0);
> return 0;
> }

That's one of the already known sequences for triggering this issue: The
close will clear the peer pointer of the closed socket, hence, the 2nd
sock_poll_wait will be called by unix_dgram_poll. The write will
execute unix_dgram_sendmsg which detects that the peer is dead and
disconnects from it, causing the corresponding structures to be freed
even though they're still in use.

NB: I didn't execute this but I spent a fair amount of time with the
af_unix.c code during the last couple of weeks and consider myself
"reasonably familiar" with it and that's IMO what should happen here.

Use-after-free in ppoll

2015-11-22 Thread Dmitry Vyukov

Hello,

On commit f2d10565b9bdbb722bd43e6e1a759eeddb9645c8 (Nov 20).

The following program triggers use-after-free:

// autogenerated by syzkaller (http://github.com/google/syzkaller)
#include 
#include 
#include 
#include 

void *thread(void *p)
{
syscall(SYS_write, (long)p, 0x2000278ful, 0x1ul, 0, 0, 0);
return 0;
}

int main()
{
long r0 = syscall(SYS_mmap, 0x2000ul, 0x1ul, 0x2ul,
0x32ul, 0xul, 0x0ul);
long r1 = syscall(SYS_socketpair, 0x1ul, 0x3ul, 0x0ul,
0x20001000ul, 0, 0);
long r2 = -1;
if (r1 != -1)
r2 = *(uint32_t*)0x20001000;
long r3 = -1;
if (r1 != -1)
r3 = *(uint32_t*)0x20001004;
//long r4 = syscall(SYS_getuid, 0, 0, 0, 0, 0, 0);
long r5 = syscall(SYS_close, r2, 0, 0, 0, 0, 0);
pthread_t th;
pthread_create(&th, 0, thread, (void*)(long)r3);
long r6 = syscall(SYS_clock_gettime, 0x0ul, 0x2ff0ul, 0, 0, 0, 0);
long r7 = -1;
if (r6 != -1)
r7 = *(uint64_t*)0x2ff0;
long r8 = -1;
if (r6 != -1)
r8 = *(uint64_t*)0x2ff8;
*(uint32_t*)0x2fff = r3;
*(uint16_t*)0x20001003 = 0x8;
*(uint16_t*)0x20001005 = 0x9;
*(uint32_t*)0x20001007 = r3;
*(uint16_t*)0x2000100b = 0x6;
*(uint16_t*)0x2000100d = 0x22b;
*(uint32_t*)0x2000100f = r3;
*(uint16_t*)0x20001013 = 0xe7838d7e9fc50196;
*(uint16_t*)0x20001015 = 0x9c2;
*(uint64_t*)0x2ffc = 0;//r7;
*(uint64_t*)0x20001004 = /*r8+*/1000;
*(uint64_t*)0x2ffd = 0x3;
long r21 = syscall(SYS_ppoll, 0x2ffful, 0x3ul,
0x2ffcul, 0x2ffdul, 0x8ul, 0);
return 0;
}


[ 2672.994366] BUG: KASAN: use-after-free in
do_raw_spin_lock+0x22/0x220 at addr 88003d8829c4
[ 2672.994366] Read of size 4 by task syzkaller_execu/6653
[ 2672.994366] 
=
[ 2672.994366] BUG UNIX (Not tainted): kasan: bad access detected
[ 2672.994366] 
-
[ 2672.994366]
[ 2672.994366] INFO: Allocated in sk_prot_alloc+0x53/0x220 age=11 cpu=1 pid=6653
[ 2672.994366]  __slab_alloc+0x235/0x570
[ 2672.994366]  kmem_cache_alloc+0x131/0x170
[ 2672.994366]  sk_prot_alloc+0x53/0x220
[ 2672.994366]  sk_alloc+0x38/0x1c0
[ 2672.994366]  unix_create1+0x5a/0x260
[ 2672.994366]  unix_create+0xc4/0x110
[ 2672.994366]  __sock_create+0x31c/0x490
[ 2672.994366]  SyS_socketpair+0x14c/0x3c0
[ 2672.994366]  entry_SYSCALL_64_fastpath+0x31/0x9a

[ 2672.994366] INFO: Freed in sk_destruct+0x1b5/0x260 age=12 cpu=1 pid=6653
[ 2672.994366]  __slab_free+0x1ec/0x350
[ 2672.994366]  kmem_cache_free+0x1ed/0x200
[ 2672.994366]  sk_destruct+0x1b5/0x260
[ 2672.994366]  __sk_free+0x61/0x110
[ 2672.994366]  sk_free+0x30/0x40
[ 2672.994366]  unix_dgram_poll+0x352/0x390
[ 2672.994366]  sock_poll+0x13b/0x340
[ 2672.994366]  do_sys_poll+0x405/0x860
[ 2672.994366]  SyS_ppoll+0x1a9/0x310
[ 2672.994366]  entry_SYSCALL_64_fastpath+0x31/0x9a
[ 2672.994366] INFO: Slab 0xeaf62000 objects=17 used=5
fp=0x88003d882440 flags=0x1004080
[ 2672.994366] INFO: Object 0x88003d882440 @offset=9280
fp=0x88003d880e80
[ 2672.994366]
[ 2672.994366] CPU: 1 PID: 6653 Comm: syzkaller_execu Tainted: GB
 4.4.0-rc1+ #66
[ 2672.994366] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS Bochs 01/01/2011
[ 2672.994366]  eaf62000 88004cee76f0 8165b3b7
88003e28fa80
[ 2672.994366]  88003d882440 88003d88 88004cee7720
812c32c4
[ 2672.994366]  88003e28fa80 eaf62000 88003d882440
88003d8829c0
[ 2672.994366] Call Trace:
[ 2672.994366]  [] __asan_load4+0x6a/0x70
[ 2672.994366]  [] do_raw_spin_lock+0x22/0x220
[ 2672.994366]  [] _raw_spin_lock_irqsave+0x51/0x60
[ 2672.994366]  [] remove_wait_queue+0x18/0x80
[ 2672.994366]  [] poll_freewait+0x7b/0x130
[ 2672.994366]  [] do_sys_poll+0x4dc/0x860
[ 2672.994366]  [] SyS_ppoll+0x1a9/0x310
[ 2672.994366] 
==




[   40.882065] BUG: KASAN: use-after-free in
__lock_acquire+0x7ea/0x2600 at addr 88006d145f98
[   40.882065] Read of size 8 by task a.out/13880
[   40.882065] 
=
[   40.887431] BUG UNIX (Not tainted): kasan: bad access detected
[   40.887431] 
-
[   40.887431]
[   40.887431] INFO: Allocated in sk_prot_alloc+0x53/0x220 age=0 cpu=3 pid=13885
[   40.887431] ___slab_alloc+0x489/0x4e0
[   40.896414] __slab_alloc+0x4c/0x90
[   40.896786] kmem_cache_alloc+0x131/0x170
[   40.896786] sk_prot_alloc+0x53/0x220
[   40.896786] sk_alloc+0x38/0x1c0
[   40.896786] unix_create1+0x5a/0x260
[   40.896786] unix_create+0xc4/0x110
[

Re: [PATCH 10/13] ARM: mvebu: add buffer manager nodes to armada-38x.dtsi

2015-11-22 Thread Sergei Shtylyov


Hello.

On 11/22/2015 10:53 AM, Marcin Wojtas wrote:


Armada 38x network controller supports hardware buffer management (BM).
Since it is now enabled in mvneta driver, appropriate nodes can be added
to armada-38x.dtsi - for the actual common BM unit (bm@c8000) and its
internal SRAM (bm-bppi), which is used for indirect access to buffer
pointer ring residing in DRAM.

Pools - ports mapping, bm-bppi entry in 'soc' node's ranges and optional
parameters are supposed to be set in board files.

Signed-off-by: Marcin Wojtas 
---
  arch/arm/boot/dts/armada-38x.dtsi | 18 ++
  1 file changed, 18 insertions(+)

diff --git a/arch/arm/boot/dts/armada-38x.dtsi 
b/arch/arm/boot/dts/armada-38x.dtsi
index b7868b2..b9f4ce2 100644
--- a/arch/arm/boot/dts/armada-38x.dtsi
+++ b/arch/arm/boot/dts/armada-38x.dtsi
@@ -539,6 +539,14 @@
status = "disabled";
};

+   bm: bm@c8000 {


   The ePAPR standard tells us to give generic names to the device nodes, 
hence this should be named "buffer-manager" (?).


[...]

@@ -617,6 +625,16 @@
#size-cells = <1>;
ranges = <0 MBUS_ID(0x09, 0x15) 0 0x800>;
};
+
+   bm_bppi: bm-bppi {


   And this one "memory" (?).

[...]

MBR, Sergei


RE: [PATCH 07/15] net: Add driver helper functions to determine checksum offloadability

2015-11-22 Thread Yuval Mintz

> +struct skb_csum_offl_spec {
...
> + no_encapped_ipv6:1,
...
> +};

I don't think it's actually checked by this patch.

Re: Kernel 4.1.12 crash

2015-11-22 Thread Andrew


22.11.2015 07:17, Alexander Duyck wrote:

> On 11/21/2015 12:16 AM, Andrew wrote:
>> Memory corruption, if happens, IMHO shouldn't be a hardware-related -
>> almost all of these boxes, except H61M-based box from 1st log, works
>> for a long time with uptime more than year; and only software was
>> changed on it; H61M-based box runs memtest86 for a tens of hours w/o
>> any error. If it was caused by hardware - they should crash even
>> earlier.
>
> I wasn't saying it was hardware related.  My thought is that it could
> be some sort of use after free or double free type issue. Basically
> what you end up with is the memory getting corrupted by software that
> is accessing regions it shouldn't be.
>
>> Rarely on different servers I saw 'zram decompression error' messages
>> (in this case I've got such message on H61M-based box).
>>
>> Also, other people that uses accel-ppp as BRAS software, have
>> different kernel panics/bugs/oopses on fresh kernels.
>>
>> I'll try to apply these patches, and I'll try to switch back to
>> kernels that were stable on some boxes.
>
> If you could bisect this it would be useful.  Basically we just need
> to determine where in the git history these issues started popping up
> so that we can then narrow down on the root cause.
>
> - Alex
IMHO bisecting will take too long, because these crashes aren't regular:
one box may work for a month w/o trouble, and then may crash twice
per week under the same load.

Maybe if I create 10-20k sessions in a test environment, this will
cause a crash, but I'm not sure about this. I'll try to check it.


RE: [PATCH V4 net-next 4/5] net:hns: Add support of ethtool TSO set option for Hip06 in HNS

2015-11-22 Thread Yuval Mintz

> +static netdev_features_t hns_nic_fix_features(
> + struct net_device *netdev, netdev_features_t features) {
> + struct hns_nic_priv *priv = netdev_priv(netdev);
> +
> + switch (priv->enet_ver) {
> + case AE_VERSION_1:
> + features &= ~(NETIF_F_TSO | NETIF_F_TSO6 |
> + NETIF_F_HW_VLAN_CTAG_FILTER);
> + break;
> + default:
> + break;
> + }
> + return features;
> +}
> +

Isn't AE_VERSION_1 something fixed once you publish your features?
If it can't be changed, why not simply remove the features from
`hw_features' instead of having to implement this ndo?
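
The question above can be illustrated with a minimal userspace sketch. Everything named model_* or MODEL_* is hypothetical and only stands in for the kernel's feature bits and net_device fields; the masking in model_request_features() is a simplification of how the core restricts user requests to hw_features, not the real implementation. The point is that if the hardware version is fixed at probe time, leaving the unsupported bits out of hw_features once makes a per-request fixup unnecessary.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative feature bits; the real NETIF_F_* values live in kernel headers. */
#define MODEL_F_SG          (1u << 0)
#define MODEL_F_TSO         (1u << 1)
#define MODEL_F_TSO6        (1u << 2)
#define MODEL_F_VLAN_FILTER (1u << 3)

enum model_ae_version { MODEL_AE_VERSION_1 = 1, MODEL_AE_VERSION_2 = 2 };

struct model_netdev {
	uint32_t hw_features; /* bits the user is allowed to toggle */
	uint32_t features;    /* bits currently enabled */
};

/* Decide once, at probe time, which bits this hardware version can offer. */
static void model_init_features(struct model_netdev *dev,
				enum model_ae_version ver)
{
	dev->hw_features = MODEL_F_SG;
	if (ver != MODEL_AE_VERSION_1)
		dev->hw_features |= MODEL_F_TSO | MODEL_F_TSO6 |
				    MODEL_F_VLAN_FILTER;
	dev->features = dev->hw_features;
}

/* Simplified model: a request can only enable bits present in hw_features. */
static uint32_t model_request_features(const struct model_netdev *dev,
				       uint32_t wanted)
{
	return wanted & dev->hw_features;
}
```

With this arrangement a version 1 device never exposes TSO as a toggleable bit, so a request for it is dropped without a dedicated callback.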

MULTIPLE_TABLES and IP_ADVANCED_ROUTER

2015-11-22 Thread Geert Uytterhoeven

Hi,

Is there a reason why IP_MULTIPLE_TABLES and IP_MROUTE_MULTIPLE_TABLES
depend on IP_ADVANCED_ROUTER, while their IPV6 counterparts don't?

net/ipv4/Kconfig:

config IP_MULTIPLE_TABLES
bool "IP: policy routing"
depends on IP_ADVANCED_ROUTER
select FIB_RULES

config IP_MROUTE_MULTIPLE_TABLES
bool "IP: multicast policy routing"
depends on IP_MROUTE && IP_ADVANCED_ROUTER
select FIB_RULES


net/ipv6/Kconfig:

config IPV6_MULTIPLE_TABLES
bool "IPv6: Multiple Routing Tables"
select FIB_RULES

config IPV6_MROUTE_MULTIPLE_TABLES
bool "IPv6: multicast policy routing"
depends on IPV6_MROUTE
select FIB_RULES

Thanks!

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

RE: [PATCH V3 net-next 1/5] net:hns: Add support of Hip06 SoC to the Hislicon Network Subsystem

2015-11-22 Thread Yuval Mintz

> +void hns_rcbv2_int_ctrl_hw(struct hnae_queue *q, u32 flag, u32 mask)
> +{
> + u32 int_mask_en = !!mask;
> +
> + if (flag & RCB_INT_FLAG_TX)
> + dsaf_write_dev(q, RCB_RING_INTMSK_TXWL_REG,
> int_mask_en);
> +
> + if (flag & RCB_INT_FLAG_RX)
> + dsaf_write_dev(q, RCB_RING_INTMSK_RXWL_REG,
> int_mask_en);
> +}
> +
> +void hns_rcbv2_int_clr_hw(struct hnae_queue *q, u32 flag)
> +{
> + u32 clr = 1;
> +
> + if (flag & RCB_INT_FLAG_TX)
> + dsaf_write_dev(q, RCBV2_TX_RING_INT_STS_REG, clr);
> +
> + if (flag & RCB_INT_FLAG_RX)
> + dsaf_write_dev(q, RCBV2_RX_RING_INT_STS_REG, clr);
> +}
> +

Why do you need the int_mask_en, clr variables? Why not directly use values?

> +static void fill_v2_desc(struct hnae_ring *ring, void *priv,

> + hnae_set_field(bn_pid, 0x7, 0, buf_num - 1);

Magic values?

> +int hns_nic_net_xmit_hw(struct net_device *ndev,
> + struct sk_buff *skb,
> + struct hns_nic_ring_data *ring_data)
> +{

> - /* If everything has gone correctly network should be the
> + /**
> +  * If everything has gone correctly network should be the
>* data section of the packet and will be the end of the header.
>* If not then it probably represents the end of the last recognized
>* header.

What happened to the network style comments?

>  static int hns_nic_poll_rx_skb(struct hns_nic_ring_data *ring_data,
>  struct sk_buff **out_skb, int *out_bnum)
> + /**
> +  * we will be copying header into skb->data in
> +  * pskb_may_pull so it is in our interest to prefetch
> +  * it now to avoid a possible cache miss
> +  */
> + prefetchw(skb->data);
> +

Likewise


RE: [PATCH V3 net-next 2/5] net:hns: Add Hip06 "RSS(Receive Side Scaling)" support to HNS Driver

2015-11-22 Thread Yuval Mintz

>  static void hns_ppe_init_hw(struct hns_ppe_cb *ppe_cb)  {
...
> + /* Set default RSS key and indrection table*/
> + const u32 rss_key[HNS_PPEV2_RSS_KEY_NUM] = {
> + 0x6d5a56da, 0x255b0ec2,
> + 0x4167253d, 0x43a38fb0,
> + 0xd0ca2bcb, 0xae7b30b4,
> + 0x77cb2da3, 0x8030f20c,
> + 0x6a42b73b, 0xbeac01fa,
> + };
> +
> + /* set default RSS key and remember it */
> + for (i = 0; i < HNS_PPEV2_RSS_KEY_NUM; i++)
> + ppe_cb->rss_key[i]  = rss_key[i];
> 

Is there any reason for the special default key?
Why not use netdev_rss_key_fill()?


[PATCH net-next] bnx2x: Utilize FW 7.13.1.0.

2015-11-22 Thread Yuval Mintz

Commit 46e8a249423ff "bnx2x: Add FW 7.13.1.0" added said .bin FW to
linux-firmware; This patch incorporates the FW in the bnx2x driver.

This introduces 2 fixes/enhancements:
 - In some management protocols there are outer-vlan configurations
that can be dynamically changed while device is running. This fixes
some corner cases where such a change did not take effect.

 - Prevent VFs from sending MAC control frames; FW would treat a VF
sending such a packet as malicious and block any further communication
done by the VF.

Signed-off-by: Yuval Mintz 
Signed-off-by: Ariel Elior 
---
Hi Dave,

Please consider applying this to `net-next'.

Thanks,
Yuval
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h | 43 +
 1 file changed, 23 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h
index cafd5de..27aa080 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h
@@ -3013,8 +3013,8 @@ struct afex_stats {
 };
 
 #define BCM_5710_FW_MAJOR_VERSION  7
-#define BCM_5710_FW_MINOR_VERSION  12
-#define BCM_5710_FW_REVISION_VERSION   30
+#define BCM_5710_FW_MINOR_VERSION  13
+#define BCM_5710_FW_REVISION_VERSION   1
 #define BCM_5710_FW_ENGINEERING_VERSION  0
 #define BCM_5710_FW_COMPILE_FLAGS  1
 
@@ -3583,7 +3583,7 @@ enum classify_rule {
CLASSIFY_RULE_OPCODE_MAC,
CLASSIFY_RULE_OPCODE_VLAN,
CLASSIFY_RULE_OPCODE_PAIR,
-   CLASSIFY_RULE_OPCODE_VXLAN,
+   CLASSIFY_RULE_OPCODE_IMAC_VNI,
MAX_CLASSIFY_RULE
 };
 
@@ -3826,6 +3826,17 @@ struct eth_classify_header {
__le32 echo;
 };
 
+/*
+ * Command for adding/removing a Inner-MAC/VNI classification rule
+ */
+struct eth_classify_imac_vni_cmd {
+   struct eth_classify_cmd_header header;
+   __le32 vni;
+   __le16 imac_lsb;
+   __le16 imac_mid;
+   __le16 imac_msb;
+   __le16 reserved1;
+};
 
 /*
  * Command for adding/removing a MAC classification rule
@@ -3869,14 +3880,6 @@ struct eth_classify_vlan_cmd {
 /*
  * Command for adding/removing a VXLAN classification rule
  */
-struct eth_classify_vxlan_cmd {
-   struct eth_classify_cmd_header header;
-   __le32 vni;
-   __le16 inner_mac_lsb;
-   __le16 inner_mac_mid;
-   __le16 inner_mac_msb;
-   __le16 reserved1;
-};
 
 /*
  * union for eth classification rule
@@ -3885,7 +3888,7 @@ union eth_classify_rule_cmd {
struct eth_classify_mac_cmd mac;
struct eth_classify_vlan_cmd vlan;
struct eth_classify_pair_cmd pair;
-   struct eth_classify_vxlan_cmd vxlan;
+   struct eth_classify_imac_vni_cmd imac_vni;
 };
 
 /*
@@ -5623,6 +5626,14 @@ enum igu_mode {
MAX_IGU_MODE
 };
 
+/*
+ * Inner Headers Classification Type
+ */
+enum inner_clss_type {
+   INNER_CLSS_DISABLED,
+   INNER_CLSS_USE_VLAN,
+   INNER_CLSS_USE_VNI,
+   MAX_INNER_CLSS_TYPE};
 
 /*
  * IP versions
@@ -5953,14 +5964,6 @@ enum ts_offset_cmd {
MAX_TS_OFFSET_CMD
 };
 
-/* Tunnel Mode */
-enum tunnel_mode {
-   TUNN_MODE_NONE,
-   TUNN_MODE_VXLAN,
-   TUNN_MODE_GRE,
-   MAX_TUNNEL_MODE
-};
-
  /* zone A per-queue data */
 struct ustorm_queue_zone_data {
struct ustorm_eth_rx_producers eth_rx_producers;
-- 
1.9.3


Re: [PATCH 02/13] net: mvneta: enable IP checksum with jumbo frames for Armada 38x on Port0

2015-11-22 Thread Arnd Bergmann

On Sunday 22 November 2015 08:53:48 Marcin Wojtas wrote:
> The Ethernet controller found in the Armada 38x SoC's family support
> TCP/IP checksumming with frame sizes larger than 1600 bytes, however
> only on port 0.
> 
> This commit enables this feature by using 'marvell,armada-xp-neta' in
> 'ethernet@7' node.
> 
> Signed-off-by: Marcin Wojtas 
> Cc:  # v3.18+
> ---
>  arch/arm/boot/dts/armada-38x.dtsi | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm/boot/dts/armada-38x.dtsi b/arch/arm/boot/dts/armada-38x.dtsi
> index c6a0e9d..b7868b2 100644
> --- a/arch/arm/boot/dts/armada-38x.dtsi
> +++ b/arch/arm/boot/dts/armada-38x.dtsi
> @@ -494,7 +494,7 @@
> };
>  
> eth0: ethernet@7 {
> -   compatible = "marvell,armada-370-neta";
> +   compatible = "marvell,armada-xp-neta";
> reg = <0x7 0x4000>;
> interrupts-extended = < 8>;
> clocks = < 4>;
> 

As it's clear that they are not 100% backwards compatible, please
add a SoC specific compatible string here as well, like

compatible = "marvell,armada-380-neta", "marvell,armada-xp-neta";

Maybe also leave the 370 string in place.

Arnd

Re: [PATCH 00/13] mvneta Buffer Management and enhancements

2015-11-22 Thread Arnd Bergmann

On Sunday 22 November 2015 08:53:46 Marcin Wojtas wrote:
> 
> 3. Optimisations - concatenating TX descriptors' flush, basing on
> xmit_more support and combined approach for finalizing egress processing.
> Thanks to HR timer buffers can be released with small latency, which is
> good for low transfer and small queues. Along with the timer, coalescing
> irqs are used, whose threshold could be increased back to 15.
> 
> 

If you are already reworking the TX path, it probably makes sense to 
support BQL as well, see the Marvell skge and sky2 drivers for examples
using netdev_{tx_,}{sent,completed}_queue.

Arnd
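
A rough userspace model of the byte accounting that BQL adds to a TX path, assuming a fixed limit (the kernel's dynamic queue limits adjust the limit at runtime, which this sketch does not attempt). The model_* names are made up for illustration; in a driver the two hooks would correspond to netdev_tx_sent_queue() in the xmit path and netdev_tx_completed_queue() in TX completion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-queue state: bytes handed to hardware but not yet completed. */
struct model_txq {
	unsigned int inflight;
	unsigned int limit;   /* fixed here; dynamic in the real implementation */
	bool stopped;
};

/* Called when a packet is handed to the hardware (xmit path). */
static void model_tx_sent(struct model_txq *q, unsigned int bytes)
{
	q->inflight += bytes;
	if (q->inflight > q->limit)
		q->stopped = true;   /* stack stops feeding this queue */
}

/* Called from TX completion once the hardware is done with 'bytes'. */
static void model_tx_completed(struct model_txq *q, unsigned int bytes)
{
	q->inflight -= bytes;
	if (q->stopped && q->inflight <= q->limit)
		q->stopped = false;  /* queue may be restarted */
}
```

The accounting has to be symmetric: every byte reported as sent must eventually be reported as completed, otherwise the queue stays stopped.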

Re: [PATCH 07/13] bus: mvebu-mbus: provide api for obtaining IO and DRAM window information

2015-11-22 Thread Arnd Bergmann

On Sunday 22 November 2015 08:53:53 Marcin Wojtas wrote:
> This commit enables finding appropriate mbus window and obtaining its
> target id and attribute for given physical address in two separate
> routines, both for IO and DRAM windows. This functionality
> is needed for Armada XP/38x Network Controller's Buffer Manager and
> PnC configuration.
> 
> Signed-off-by: Marcin Wojtas 
> 
> [DRAM window information reference in LKv3.10]
> Signed-off-by: Evan Wang 
> 

It's too long ago to remember all the details, but I thought we
had designed this so the configuration can just be done by
describing it in DT. What am I missing?

Arnd

Re: [PATCH net-next 1/2] net: l3mdev: Add master device lookup by index

2015-11-22 Thread David Miller

From: David Ahern 
Date: Sun, 22 Nov 2015 10:30:32 -0700

> In this case ...

I understand the problem you are trying to solve, but I am saying
you can't use sk_bound_dev_if to use it.

Re: Use-after-free in ppoll

2015-11-22 Thread Rainer Weikusat

Dmitry Vyukov  writes:
> On Sun, Nov 22, 2015 at 3:32 PM, Rainer Weikusat
>  wrote:
>> Dmitry Vyukov  writes:
>>> Hello,
>>>
>>> On commit f2d10565b9bdbb722bd43e6e1a759eeddb9645c8 (Nov 20).
>>>
>>> The following program triggers use-after-free:
>>>
>>> // autogenerated by syzkaller (http://github.com/google/syzkaller)
>>> #include 
>>> #include 
>>> #include 
>>> #include 
>>>
>>> void *thread(void *p)
>>> {
>>> syscall(SYS_write, (long)p, 0x2000278ful, 0x1ul, 0, 0, 0);
>>> return 0;
>>> }
>>
>> [...]
>>
>>
>>> long r1 = syscall(SYS_socketpair, 0x1ul, 0x3ul, 0x0ul,
>>
>> [...]
>>
>>> long r5 = syscall(SYS_close, r2, 0, 0, 0, 0, 0);
>>> pthread_t th;
>>> pthread_create(&th, 0, thread, (void*)(long)r3);
>>
>> [...]
>>
>>> long r21 = syscall(SYS_ppoll, 0x2ffful, 0x3ul, 0x2ffcul, 
>>> 0x2ffdul, 0x8ul, 0);
>>> return 0;
>>> }
>>
>> That's one of the already known sequences for triggering this issue:

[...]

> I have not read the code. But I just want to point out that all 3
> reports are different. For example, in the first one, ppoll both frees
> the object and then accesses it. That is, it is not write that frees
> the object.

The call trace is always the same:

[ 2672.994366]  [] __asan_load4+0x6a/0x70
[ 2672.994366]  [] do_raw_spin_lock+0x22/0x220
[ 2672.994366]  [] _raw_spin_lock_irqsave+0x51/0x60
[ 2672.994366]  [] remove_wait_queue+0x18/0x80
[ 2672.994366]  [] poll_freewait+0x7b/0x130
[ 2672.994366]  [] do_sys_poll+0x4dc/0x860
[ 2672.994366]  [] SyS_ppoll+0x1a9/0x310

And if you look at the poll implementation, the important part is this
(fs/select.c, do_sys_poll)

fdcount = do_poll(nfds, head, &table, end_time);
poll_freewait(&table);

do_poll calls the poll routine of the file descriptors which cause
"enqueuing of something" via poll wait callback. For poll, that's the
__pollwait routine in select.c:

static void __pollwait(struct file *filp, wait_queue_head_t *wait_address,
poll_table *p)
{
struct poll_wqueues *pwq = container_of(p, struct poll_wqueues, pt);
struct poll_table_entry *entry = poll_get_entry(pwq);
if (!entry)
return;
entry->filp = get_file(filp);
entry->wait_address = wait_address;
entry->key = p->_key;
init_waitqueue_func_entry(&entry->wait, pollwake);
entry->wait.private = pwq;
add_wait_queue(wait_address, &entry->wait);
}

because of the close, this routine will be called with the peer_wait
wait_queue_head of the non-closed socket of the socket pair as
wait_address argument. And poll_freewait calls free_poll_entry for all
entries on the poll table which is

static void free_poll_entry(struct poll_table_entry *entry)
{
remove_wait_queue(entry->wait_address, &entry->wait);
fput(entry->filp);
}

but by this time, the wait_address points to freed memory because the
only thing which kept the socket it belonged to alive after the
corresponding file descriptor was closed was the reference the other
socket held. But that was dropped by unix_dgram_sendmsg upon detecting a
dead peer.
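
The lifetime problem described above can be modelled in plain userspace C. This is only a sketch: the model_* types and functions are hypothetical stand-ins for the socket, the poll table entry, __pollwait() and free_poll_entry(), and a 'freed' flag replaces the real free so the stale access can be observed safely instead of actually happening.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for a socket that embeds a wait queue head (peer_wait). */
struct model_sock {
	int refcnt;      /* references keeping the object alive */
	bool freed;      /* set instead of freeing, so the model can inspect it */
	int nr_waiters;  /* entries queued on the embedded wait queue */
};

static void model_sock_put(struct model_sock *sk)
{
	if (--sk->refcnt == 0)
		sk->freed = true;
}

/* Stand-in for a poll table entry: it remembers only an address. */
struct model_poll_entry {
	struct model_sock *wait_owner;
};

/* Like __pollwait(): queues on the wait queue without pinning its owner. */
static void model_pollwait(struct model_poll_entry *e, struct model_sock *owner)
{
	e->wait_owner = owner;
	owner->nr_waiters++;
}

/* Like free_poll_entry(): reports false where the kernel would touch freed memory. */
static bool model_free_poll_entry(struct model_poll_entry *e)
{
	if (e->wait_owner->freed)
		return false;
	e->wait_owner->nr_waiters--;
	return true;
}

/* Replays the sequence: close, poll on the peer, send to the dead peer, poll cleanup. */
static int model_run(void)
{
	struct model_sock closed = { .refcnt = 2 };  /* fd reference + peer reference */
	struct model_poll_entry e;

	model_sock_put(&closed);       /* close() drops the fd reference */
	model_pollwait(&e, &closed);   /* poll queues on the closed socket's peer_wait */
	model_sock_put(&closed);       /* send path sees a dead peer, drops the last ref */
	return model_free_poll_entry(&e) ? 0 : -1;
}
```

In this model model_run() returns -1, because the poll entry still points at an object whose last reference was dropped between queuing and cleanup.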

Re: Use-after-free in ppoll

2015-11-22 Thread Rainer Weikusat

Rainer Weikusat  writes:

[...]


> because of the close, this routine will be called with the peer_wait
> wait_queue_head of the non-closed socket of the socket pair as
> wait_address argument.

This should have been "peer_wait wait_queue_head of the peer of the
non-closed socket, ie, that of the closed socket"...

Re: [RFC 5/8] net-next: ralink: add support for rt3050 family

2015-11-22 Thread John Crispin



On 22/11/2015 17:36, Andrew Lunn wrote:
> On Sun, Nov 22, 2015 at 09:40:55AM +0100, John Crispin wrote:
>> Add support for SoCs from the rt3050 family. This include rt3050, rt3052,
>> rt3352 and rt5350. These all have a builtin 5 port 100mbit switch. This patch
>> includes rudimentary code to power up the switch.
> 
> Hi John
> 
> How do you plan to control this switch?
> 
> Does it make sense to write a DSA driver for it?
> 
>  Andrew
> 

Hi Andrew,

we have had a switch layer inside openwrt called swconfig for several
years. at the moment i have an add-on patch in openwrt to provide an
userland interface via that layer. the driver i have sent will bring up
the switch to a point where traffic flow is possible.

John

Re: [RFC 5/8] net-next: ralink: add support for rt3050 family

2015-11-22 Thread Andrew Lunn

On Sun, Nov 22, 2015 at 09:40:55AM +0100, John Crispin wrote:
> Add support for SoCs from the rt3050 family. This include rt3050, rt3052,
> rt3352 and rt5350. These all have a builtin 5 port 100mbit switch. This patch
> includes rudimentary code to power up the switch.

Hi John

How do you plan to control this switch?

Does it make sense to write a DSA driver for it?

 Andrew

Re: [PATCH net-next] net: IPv6 fib lookup tracepoint

2015-11-22 Thread David Miller

From: David Ahern 
Date: Thu, 19 Nov 2015 12:24:22 -0800

> Add tracepoint to show fib6 table lookups and result.
> 
> Signed-off-by: David Ahern 

Applied, thanks.

Re: [PATCH net-next] bnx2x: Utilize FW 7.13.1.0.

2015-11-22 Thread David Miller

From: Yuval Mintz 
Date: Sun, 22 Nov 2015 15:01:29 +0200

> Commit 46e8a249423ff "bnx2x: Add FW 7.13.1.0" added said .bin FW to
> linux-firmware; This patch incorporates the FW in the bnx2x driver.
> 
> This introduces 2 fixes/enhancements:
>  - In some management protocols there are outer-vlan configurations
> that can be dynamically changed while device is running. This fixes
> some corner cases where such a change did not take effect.
> 
>  - Prevent VFs from sending MAC control frames; FW would treat a VF
> sending such a packet as malicious and block any further communication
> done by the VF.
> 
> Signed-off-by: Yuval Mintz 
> Signed-off-by: Ariel Elior 

Applied.

Re: [PATCH v2 06/27] brcm80211: move under broadcom vendor directory

2015-11-22 Thread Kalle Valo

Arend van Spriel  writes:

> On 11/19/2015 08:48 AM, Kalle Valo wrote:
>> Hauke Mehrtens  writes:
>>
>>> On 11/18/2015 03:45 PM, Kalle Valo wrote:
>>>> Part of reorganising wireless drivers directory and Kconfig. Note that I had to
>>>> edit Makefiles from subdirectories to use the new location.
>>>>
>>>> Signed-off-by: Kalle Valo
>>>> ---
>>>
>>> I would prefer to remove the brcm80211 directory in this process and create:
>>> drivers/net/wireless/broadcom/brcmfmac
>>> drivers/net/wireless/broadcom/brcmsmac
>>> drivers/net/wireless/broadcom/brcmutil
>>> drivers/net/wireless/broadcom/include
>>>
>>> This way we have one directory less.
>>
>> I think this could be done separately. This patchset is big enough
>> already, I would not like to make it anymore complicated.
>>
>> And I actually like the brcm80211 directory, I would not mind keeping it
>> still.
>
> I prefer to keep it as brcmsmac and brcmfmac rely on brcmutil module
> so I want to keep them together under brcm80211.
>
> So does this patch go in before or after the patches I submitted
> before the merge window. I hope after :-p

Sorry, the vendor patches go in first :) It's much safer that way.

But I think that git should be smart enough and your patchset from
before the merge window should still apply without issues.

-- 
Kalle Valo

[PATCH v3] packet: Allow packets with only a header (but no payload)

2015-11-22 Thread Martin Blumenstingl

Commit 9c7077622dd91 ("packet: make packet_snd fail on len smaller
than l2 header") added validation for the packet size in packet_snd.
That commit enforces that every packet needs a header (with at least
hard_header_len bytes) plus a payload with at least one byte. Before
that commit the payload was optional.

This patch fixes PPPoE connections which do not have a "Service" or
"Host-Uniq" configured (which is violating the spec, but is still
widely used in real-world setups). Those are currently failing with the
following message: "pppd: packet size is too short (24 <= 24)"

Signed-off-by: Martin Blumenstingl 
---
v3: Fixed a whitespace error (detected by checkpatch.pl) and updated
the commit message to point to the original commit correctly.
Thanks to Sergei Shtylyov for reporting these.

v2: Simply change the existing logic in ll_header_truncated instead of
splitting it and having multiple checks.

 include/linux/netdevice.h | 3 ++-
 net/packet/af_packet.c| 4 ++--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 67bfac1..9a488ed 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1398,7 +1398,8 @@ enum netdev_priv_flags {
  * @dma:   DMA channel
  * @mtu:   Interface MTU value
  * @type:  Interface hardware type
- * @hard_header_len: Hardware header length
+ * @hard_header_len: Hardware header length, which means that this is the
+ * minimum size of a packet.
  *
  * @needed_headroom: Extra headroom the hardware may need, but not in all
  *   cases can this be guaranteed
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 1cf928f..992396a 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2329,8 +2329,8 @@ static void tpacket_destruct_skb(struct sk_buff *skb)
 static bool ll_header_truncated(const struct net_device *dev, int len)
 {
/* net device doesn't like empty head */
-   if (unlikely(len <= dev->hard_header_len)) {
-   net_warn_ratelimited("%s: packet size is too short (%d <= %d)\n",
+   if (unlikely(len < dev->hard_header_len)) {
+   net_warn_ratelimited("%s: packet size is too short (%d < %d)\n",
 current->comm, len, dev->hard_header_len);
return true;
}
-- 
2.6.2
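
The boundary change in ll_header_truncated can be checked in isolation with a small userspace sketch. The two model_* functions below are hypothetical and only reproduce the comparison before and after the patch; the value 24 is taken from the pppd message quoted in the commit text.

```c
#include <assert.h>
#include <stdbool.h>

/* Comparison before the patch: a header-only packet is rejected. */
static bool model_truncated_old(int len, int hard_header_len)
{
	return len <= hard_header_len;
}

/* Comparison after the patch: only packets shorter than the header are rejected. */
static bool model_truncated_new(int len, int hard_header_len)
{
	return len < hard_header_len;
}
```

A packet of exactly hard_header_len bytes (24 in the reported case) is rejected by the old comparison and accepted by the new one, while anything shorter is still rejected.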


Re: [PATCH net-next 1/2] net: l3mdev: Add master device lookup by index

2015-11-22 Thread David Miller

From: David Ahern 
Date: Thu, 19 Nov 2015 12:32:00 -0800

> Add helper to lookup master index given a device index.
> 
> Signed-off-by: David Ahern 

I don't like where this is going.

sk->sk_bound_dev_if is for device bindings which the user has
explicitly asked for.

We should never, therefore, automatically set it without the user's
consent.

I'm not applying these patches, sorry.

Re: [PATCH net-next 1/2] net: l3mdev: Add master device lookup by index

2015-11-22 Thread David Ahern

On 11/22/15 10:23 AM, David Miller wrote:

> From: David Ahern
> Date: Thu, 19 Nov 2015 12:32:00 -0800
>
>> Add helper to lookup master index given a device index.
>>
>> Signed-off-by: David Ahern
>
> I don't like where this is going.
>
> sk->sk_bound_dev_if is for device bindings which the user has
> explicitly asked for.
>
> We should never, therefore, automatically set it without the user's
> consent.

In this case the user is running a daemon (bgpd) where a single instance
works across all VRFs. The listen socket is not bound to a device, so
this does not override what the user asked for. Child sockets are then
bound to the VRF device the connection originates over, so it narrows
the scope of accepted connections to a single VRF.

If you look at the change, e.g.:

ireq->ir_iif = sk->sk_bound_dev_if ? :
        l3mdev_master_ifindex_by_index(sock_net(sk), skb->skb_iif);

It keeps the user-requested sk_bound_dev_if if it is set. If not, it applies
the limited scope of a VRF device if the skb originated on a device
enslaved to a VRF device.


Re: [PATCH v2 00/27] wireless drivers vendor directories

2015-11-22 Thread Kalle Valo

Kalle Valo  writes:

> Hi,
>
> I started to reorganise drivers/net/wireless directory and follow what
> drivers/net/ethernet has. The major change is that new vendor
> directories are created and most of the drivers are now under those
> vendor directories:
>
> admtek/
> ath/
> atmel/
> broadcom/
> cisco/
> intel/
> intersil/
> marvell/
> mediatek/
> ralink/
> realtek/
> rsi/
> st/
> ti/
> zydas/
>

[...]

> Kalle Valo (27):
>   adm80211: move under admtek vendor directory
>   airo: move under cisco vendor directory
>   atmel: move under atmel vendor directory
>   b43: move under broadcom vendor directory
>   b43legacy: move under broadcom vendor directory
>   brcm80211: move under broadcom vendor directory
>   cw1200: move under st vendor directory
>   ipw2x00: move under intel vendor directory
>   iwlegacy: move under intel directory
>   iwlwifi: move under intel vendor directory
>   libertas: move under marvell vendor directory
>   libertas_tf: move under marvell vendor directory
>   mwifiex: move under marvell vendor directory
>   mwl8k: move under marvell vendor directory
>   zd1201: move under zydas vendor directory
>   zd1211rw: move under zydas vendor directory
>   hostap: move under intersil vendor directory
>   p54: move under intersil vendor directory
>   orinoco: move under intersil vendor directory
>   prism54: move under intersil vendor directory
>   realtek: create separate Kconfig file
>   rsi: add vendor Kconfig entry
>   rt2x00: move under ralink vendor directory
>   mediatek: unify Kconfig with other vendors
>   ti: unify Kconfig with other vendors
>   ath: unify Kconfig with other vendors
>   mac80211_hwsim: move Kconfig entry for sorting alphabetically

Applied to wireless-drivers-next.git.

-- 
Kalle Valo

Re: [PATCH 1/2 iptables] libxt_cgroup: prepare for multi revisions

2015-11-22 Thread Pablo Neira Ayuso

On Sat, Nov 21, 2015 at 11:18:46AM -0500, Tejun Heo wrote:
> libxt_cgroup will grow cgroup2 path based match.  Postfix existing
> symbols with _v0 and prepare for multi revision registration.  While
> at it, rename O_CGROUP to O_CLASSID and fwid to classid.
> 
> Signed-off-by: Tejun Heo 
> Cc: Daniel Borkmann 
> Cc: Jan Engelhardt 
> Cc: Pablo Neira Ayuso 
> ---
>  extensions/libxt_cgroup.c   |   51 
> +++-
>  include/linux/netfilter/xt_cgroup.h |2 -
>  2 files changed, 28 insertions(+), 25 deletions(-)
> 
> --- a/extensions/libxt_cgroup.c
> +++ b/extensions/libxt_cgroup.c
> @@ -3,30 +3,30 @@
>  #include 
>  
>  enum {
> - O_CGROUP = 0,
> + O_CLASSID = 0,
>  };
>  
> -static void cgroup_help(void)
> +static void cgroup_help_v0(void)
>  {
>   printf(
>  "cgroup match options:\n"
> -"[!] --cgroup fwid  Match cgroup fwid\n");
> +"[!] --cgroup classidMatch cgroup classid\n");

We have to keep the old cgroup integer ID around for a while,
otherwise we'll break users with old kernels and new iptables
utilities.

Re: [PATCH 1/2 iptables] libxt_cgroup: prepare for multi revisions

2015-11-22 Thread Pablo Neira Ayuso

On Sun, Nov 22, 2015 at 09:31:28PM +0100, Pablo Neira Ayuso wrote:
> On Sat, Nov 21, 2015 at 11:18:46AM -0500, Tejun Heo wrote:
> > --- a/extensions/libxt_cgroup.c
> > +++ b/extensions/libxt_cgroup.c
> > @@ -3,30 +3,30 @@
> >  #include 
> >  
> >  enum {
> > -   O_CGROUP = 0,
> > +   O_CLASSID = 0,
> >  };
> >  
> > -static void cgroup_help(void)
> > +static void cgroup_help_v0(void)
> >  {
> > printf(
> >  "cgroup match options:\n"
> > -"[!] --cgroup fwid  Match cgroup fwid\n");
> > +"[!] --cgroup classidMatch cgroup classid\n");
> 
> We have to keep the old cgroup integer ID around for a while,
> otherwise we'll break users with old kernels and new iptables
> utilities.

Oh, I see.

This is just a rename to prepare the string based identifier.

Sorry for the noise.

What's the benefit of large Rx rings?

2015-11-22 Thread Yuval Mintz

Hi,

This might be a dumb question, but I recently touched this
and felt like I'm missing something basic -

NAPI is being scheduled from soft-interrupt context, and it
has a ~strict quota for handling Rx packets [even though we're
allowing practically unlimited handling of Tx completions].
Given these facts, what's the benefit of having arbitrary large
Rx buffer rings? Assuming quota is 64, I would have expected
that having more than twice or thrice as many buffers could not
help in real traffic scenarios - in any given time-unit
[the time between 2 NAPI runs which should be relatively
constant] CPU can't handle more than the quota; If HW is
generating more packets on a regular basis the buffers are bound
to get exhausted, no matter how many there are.

While there isn't any obvious downside to allowing drivers to
increase ring sizes to be larger [other than memory footprint],
I feel like I'm missing the scenarios where having Ks of
buffers can actually help.
And for the unlikely case that I'm not missing anything,
why aren't we supplying some `default' max and min amounts
in a common header?

Thanks,
Yuval




Re: [PATCH 02/13] net: mvneta: enable IP checksum with jumbo frames for Armada 38x on Port0

2015-11-22 Thread Marcin Wojtas

Arnd,

2015-11-22 21:00 GMT+01:00 Arnd Bergmann :
> On Sunday 22 November 2015 08:53:48 Marcin Wojtas wrote:
>> The Ethernet controller found in the Armada 38x SoC family supports
>> TCP/IP checksumming with frame sizes larger than 1600 bytes, however
>> only on port 0.
>>
>> This commit enables this feature by using 'marvell,armada-xp-neta' in
>> 'ethernet@7' node.
>>
>> Signed-off-by: Marcin Wojtas 
>> Cc:  # v3.18+
>> ---
>>  arch/arm/boot/dts/armada-38x.dtsi | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/arm/boot/dts/armada-38x.dtsi 
>> b/arch/arm/boot/dts/armada-38x.dtsi
>> index c6a0e9d..b7868b2 100644
>> --- a/arch/arm/boot/dts/armada-38x.dtsi
>> +++ b/arch/arm/boot/dts/armada-38x.dtsi
>> @@ -494,7 +494,7 @@
>> };
>>
>> eth0: ethernet@7 {
>> -   compatible = "marvell,armada-370-neta";
>> +   compatible = "marvell,armada-xp-neta";
>> reg = <0x7 0x4000>;
>> interrupts-extended = < 8>;
>> clocks = < 4>;
>>
>
> As it's clear that they are not 100% backwards compatible, please
> add a SoC specific compatible string here as well, like
>
> compatible = "marvell,armada-380-neta", "marvell,armada-xp-neta";
>

Wouldn't one be sufficient ("marvell,armada-380-neta")?

> Maybe also leave the 370 string in place.
>

Currently the 370 string disables IP checksum for jumbo frames, so I
don't think it's appropriate to keep it for port 0.

Best regards,
Marcin

Re: [PATCH 07/13] bus: mvebu-mbus: provide api for obtaining IO and DRAM window information

2015-11-22 Thread Marcin Wojtas

Arnd,

2015-11-22 21:02 GMT+01:00 Arnd Bergmann :
> On Sunday 22 November 2015 08:53:53 Marcin Wojtas wrote:
>> This commit enables finding appropriate mbus window and obtaining its
>> target id and attribute for given physical address in two separate
>> routines, both for IO and DRAM windows. This functionality
>> is needed for Armada XP/38x Network Controller's Buffer Manager and
>> PnC configuration.
>>
>> Signed-off-by: Marcin Wojtas 
>>
>> [DRAM window information reference in LKv3.10]
>> Signed-off-by: Evan Wang 
>>
>
> It's too long ago to remember all the details, but I thought we
> had designed this so the configuration can just be done by
> describing it in DT. What am I missing?
>

And those functions do not break this approach. They just enable
finding and reading the settings of MBUS windows done during initial
configuration. Please remember that the mvebu-mbus driver fills the MBUS
window registers based on DT, however it only configures CPU access to
DRAM/peripherals.

In this particular case only the physical addresses of buffers are known
and we have to 'open windows' between BM <-> DRAM and NETA <-> BM
internal memory. Hence instead of hardcoding size/target/attribute, we
can take information stored in CPU DRAM/IO windows registers.
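
To illustrate the lookup described above, here is a minimal user-space
sketch, not the mvebu-mbus code itself. The struct layout and the names
mbus_window and find_window are assumptions made for this example; the real
driver reads the window base, size, target and attribute from hardware
registers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of one already-configured window. */
struct mbus_window {
	uint64_t base;   /* physical base address of the window */
	uint64_t size;   /* window length in bytes */
	uint8_t  target; /* target id programmed for the window */
	uint8_t  attr;   /* attribute programmed for the window */
};

/*
 * Return the window that covers the physical address phys,
 * or NULL if no configured window contains it.
 */
const struct mbus_window *find_window(const struct mbus_window *wins,
				      size_t count, uint64_t phys)
{
	for (size_t i = 0; i < count; i++) {
		/* subtract first so base + size cannot overflow */
		if (phys >= wins[i].base && phys - wins[i].base < wins[i].size)
			return &wins[i];
	}
	return NULL;
}
```

A caller that only knows a buffer's physical address could then copy the
target and attribute of the returned window into the BM or NETA window
registers instead of hardcoding them.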

Best regards,
Marcin

Re: [PATCH 02/13] net: mvneta: enable IP checksum with jumbo frames for Armada 38x on Port0

2015-11-22 Thread Arnd Bergmann

On Sunday 22 November 2015 22:04:38 Marcin Wojtas wrote:
> 2015-11-22 21:00 GMT+01:00 Arnd Bergmann :
> > On Sunday 22 November 2015 08:53:48 Marcin Wojtas wrote:
> >> The Ethernet controller found in the Armada 38x SoC family supports
> >> TCP/IP checksumming with frame sizes larger than 1600 bytes, however
> >> only on port 0.
> >>
> >> This commit enables this feature by using 'marvell,armada-xp-neta' in
> >> 'ethernet@7' node.
> >>
> >> Signed-off-by: Marcin Wojtas 
> >> Cc:  # v3.18+
> >> ---
> >>  arch/arm/boot/dts/armada-38x.dtsi | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/arch/arm/boot/dts/armada-38x.dtsi 
> >> b/arch/arm/boot/dts/armada-38x.dtsi
> >> index c6a0e9d..b7868b2 100644
> >> --- a/arch/arm/boot/dts/armada-38x.dtsi
> >> +++ b/arch/arm/boot/dts/armada-38x.dtsi
> >> @@ -494,7 +494,7 @@
> >> };
> >>
> >> eth0: ethernet@7 {
> >> -   compatible = "marvell,armada-370-neta";
> >> +   compatible = "marvell,armada-xp-neta";
> >> reg = <0x7 0x4000>;
> >> interrupts-extended = < 8>;
> >> clocks = < 4>;
> >>
> >
> > As it's clear that they are not 100% backwards compatible, please
> > add a SoC specific compatible string here as well, like
> >
> > compatible = "marvell,armada-380-neta", "marvell,armada-xp-neta";
> >
> 
> > Wouldn't one be sufficient ("marvell,armada-380-neta")?

If they are basically compatible, you want to keep the original one in,
to make sure it keeps running on operating systems that only know
about the older string.

> > Maybe also leave the 370 string in place.
> >
> 
> Now 370 string disables ip checksum for jumbo frames, so I don't think
> it's appropriate to keep it for port 0.

Ok, I see. We should probably have done it the other way round and
kept the default as checksum-disabled and only override it when
the newer compatible string is also present. Basically the device
*is* compatible to an Armada 370, it just has additional features that
work correctly.

If the feature set depends on the port number, we should think about
the way it gets handled again, as this is probably better not described
as something that depends (just) on the SoC, but on the way it gets
integrated. Maybe we can introduce an additional property for the
checksums on jumbo frames and use that if present but fall back to
identifying by compatible string otherwise.
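
A minimal user-space sketch of that decision order, assuming a boolean
property; this is not mvneta driver code. The field has_jumbo_csum_prop
stands in for a hypothetical new DT property whose name has not been
decided in this thread, and struct eth_node is only a model of a parsed
node.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a parsed ethernet node. */
struct eth_node {
	const char *compatible[4]; /* NULL-terminated, most specific first */
	bool has_jumbo_csum_prop;  /* assumed new boolean property */
};

/* True if name appears anywhere in the node's compatible list. */
static bool node_is_compatible(const struct eth_node *n, const char *name)
{
	for (size_t i = 0; i < 4 && n->compatible[i]; i++) {
		if (strcmp(n->compatible[i], name) == 0)
			return true;
	}
	return false;
}

/*
 * The property wins when present; otherwise fall back to the existing
 * behaviour where only the armada-xp string enables the feature.
 */
bool jumbo_csum_supported(const struct eth_node *n)
{
	if (n->has_jumbo_csum_prop)
		return true;
	return node_is_compatible(n, "marvell,armada-xp-neta");
}
```

With this order, port 0 can keep an Armada 370 fallback string and still
get checksumming through the property, while old device trees that carry
only a compatible string behave as before.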

Arnd

Re: [PATCH 00/13] mvneta Buffer Management and enhancements

2015-11-22 Thread Marcin Wojtas

Arnd,

2015-11-22 21:06 GMT+01:00 Arnd Bergmann :
> On Sunday 22 November 2015 08:53:46 Marcin Wojtas wrote:
>>
>> 3. Optimisations - concatenating TX descriptors' flush, basing on
>> xmit_more support and combined approach for finalizing egress processing.
>> Thanks to HR timer buffers can be released with small latency, which is
>> good for low transfer and small queues. Along with the timer, coalescing
>> irqs are used, whose threshold could be increased back to 15.
>>
>>
>
> If you are already reworking the TX path, it probably makes sense to
> support BQL as well, see the Marvell skge and sky2 drivers for examples
> using netdev_{tx_,}{sent,completed}_queue.
>

Good idea, I'll take a look.

Best regards,
Marcin

alternate queueing mechanism (was: [PATCH] unix: avoid use-after-free in ep_remove_wait_queue)

2015-11-22 Thread Rainer Weikusat

Rainer Weikusat  writes:

[AF_UNIX SOCK_DGRAM throughput]

> It may be possible to improve this by tuning/ changing the flow
> control mechanism. Off the top of my head, I'd suggest making the queue longer
> (the default value is 10) and delaying wake ups until the server
> actually did catch up, IOW, the receive queue is empty or almost
> empty. But this ought to be done with a different patch.

Because I was curious about the effects, I implemented this using a
design slightly different from the one I originally suggested, to account
for the different uses of the 'is the receive queue full' check. The
code uses a datagram-specific checking function,

static int unix_dgram_recvq_full(struct sock const *sk)
{
	struct unix_sock *u;

	u = unix_sk(sk);
	/* once marked full, stay full until the receiver clears the flag */
	if (test_bit(UNIX_DG_FULL, &u->flags))
		return 1;

	if (!unix_recvq_full(sk))
		return 0;

	__set_bit(UNIX_DG_FULL, &u->flags);
	return 1;
}

which gets called instead of the other for the n:1 datagram checks and a

	if (test_bit(UNIX_DG_FULL, &u->flags) &&
	    !skb_queue_len(&sk->sk_receive_queue)) {
		__clear_bit(UNIX_DG_FULL, &u->flags);
		wake_up_interruptible_sync_poll(&u->peer_wait,
						POLLOUT | POLLWRNORM |
						POLLWRBAND);
	}

in unix_dgram_recvmsg to delay wakeups until the queued datagrams have
been consumed if the queue overflowed before. This has the additional,
nice side effect that wakeups won't ever be done for 1:1 connected
datagram sockets (both SOCK_DGRAM and SOCK_SEQPACKET) where they're of
no use, anyway.
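
The behaviour can be modelled outside the kernel with a small sketch. This
is an illustration under assumptions, not the patch: struct dgram_queue,
its fields and QUEUE_LIMIT are invented for the example, and the "full"
test assumes the queue counts as full once its length exceeds the limit.

```c
#include <stdbool.h>

#define QUEUE_LIMIT 10 /* assumed limit, after the default of 10 above */

/* Hypothetical stand-in for the socket state involved. */
struct dgram_queue {
	unsigned int len;     /* datagrams currently queued */
	bool full_flag;       /* plays the role of UNIX_DG_FULL */
	unsigned int wakeups; /* sender wakeups issued so far */
};

/* Sender-side check: an overflowed queue stays "full" until drained. */
bool dgram_recvq_full(struct dgram_queue *q)
{
	if (q->full_flag)
		return true;
	if (q->len <= QUEUE_LIMIT)
		return false;
	q->full_flag = true;
	return true;
}

/* Returns false where a real sender would block. */
bool dgram_send(struct dgram_queue *q)
{
	if (dgram_recvq_full(q))
		return false;
	q->len++;
	return true;
}

/* Receiver: wake senders only once the backlog is fully consumed. */
void dgram_recv(struct dgram_queue *q)
{
	if (q->len)
		q->len--;
	if (q->full_flag && q->len == 0) {
		q->full_flag = false;
		q->wakeups++;
	}
}
```

In this model a queue that never overflows never sets the flag, so the
receiver never issues a wakeup for it.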

Compared to a 'stock' 4.3 running the test program I posted (supposed to
make the overhead noticeable by sending lots of small messages), the
average number of bytes sent per second increased by about 782,961.79
(ca 764.61K), about 5.32% of the 4.3 number (14,714,579.91), with a
fairly simple code change.

Re: What's the benefit of large Rx rings?

2015-11-22 Thread Alexander Duyck

On Sun, Nov 22, 2015 at 12:19 PM, Yuval Mintz  wrote:
> Hi,
>
> This might be a dumb question, but I recently touched this
> and felt like I'm missing something basic -
>
> NAPI is being scheduled from soft-interrupt context, and it
> has a ~strict quota for handling Rx packets [even though we're
> allowing practically unlimited handling of Tx completions].
> Given these facts, what's the benefit of having arbitrarily large
> Rx buffer rings? Assuming quota is 64, I would have expected
> that having more than twice or thrice as many buffers could not
> help in real traffic scenarios - in any given time-unit
> [the time between 2 NAPI runs which should be relatively
> constant] CPU can't handle more than the quota; If HW is
> generating more packets on a regular basis the buffers are bound
> to get exhausted, no matter how many there are.
>
> While there isn't any obvious downside to allowing drivers to
> increase ring sizes to be larger [other than memory footprint],
> I feel like I'm missing the scenarios where having Ks of
> buffers can actually help.
> And for the unlikely case that I'm not missing anything,
> why aren't we supplying some `default' max and min amounts
> in a common header?

The main benefit of large Rx rings is that you could theoretically
support longer delays between device interrupts.  So for example if
you have a protocol such as UDP that doesn't care about latency then
you could theoretically set a large ring size, a large interrupt delay
and process several hundred or possibly even several thousand packets
per device interrupt instead of just a few.
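
As a rough back-of-the-envelope sketch of that trade-off: the ring has to
hold whatever arrives while the interrupt is held off. The function names
and the power-of-two rounding below are assumptions for illustration, not
taken from any driver, and the rates are inputs rather than measurements.

```c
#include <stdint.h>

/*
 * Packets that arrive during one interrupt delay, rounded up.
 * rate_pps is the arrival rate in packets per second,
 * delay_us the interrupt delay in microseconds.
 */
uint64_t rx_backlog(uint64_t rate_pps, uint64_t delay_us)
{
	return (rate_pps * delay_us + 999999) / 1000000;
}

/* Smallest power-of-two ring size that can hold that backlog. */
uint64_t rx_ring_needed(uint64_t rate_pps, uint64_t delay_us)
{
	uint64_t need = rx_backlog(rate_pps, delay_us);
	uint64_t size = 1;

	while (size < need)
		size <<= 1;
	return size;
}
```

For example, at 1,000,000 packets per second a 50 us delay needs room for
50 packets, while a 1 ms delay needs room for 1000, which is where rings
with more than a thousand entries start to matter.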

- Alex

Re: [PATCH 02/13] net: mvneta: enable IP checksum with jumbo frames for Armada 38x on Port0

2015-11-22 Thread Marcin Wojtas

Arnd,

>
> If the feature set depends on the port number, we should think about
> the way it gets handled again, as this is probably better not described
> as something that depends (just) on the SoC, but on the way it gets
> integrated. Maybe we can introduce an additional property for the
> checksums on jumbo frames and use that if present but fall back to
> identifying by compatible string otherwise.
>

I think adding a property, while still taking the compatible strings
into account, will be the best solution.

Best regards,
Marcin

[PATCH] staging: rtl8712: Cleanup _io_ops wrappers

2015-11-22 Thread Mauro Dreissig

This removes ugly and unnecessary declarations.

Signed-off-by: Mauro Dreissig 
---
 drivers/staging/rtl8712/rtl8712_io.c | 77 +++-
 1 file changed, 22 insertions(+), 55 deletions(-)

diff --git a/drivers/staging/rtl8712/rtl8712_io.c 
b/drivers/staging/rtl8712/rtl8712_io.c
index 4148d48..391eff3 100644
--- a/drivers/staging/rtl8712/rtl8712_io.c
+++ b/drivers/staging/rtl8712/rtl8712_io.c
@@ -36,109 +36,76 @@
 
 u8 r8712_read8(struct _adapter *adapter, u32 addr)
 {
-   struct io_queue *pio_queue = adapter->pio_queue;
-   struct intf_hdl *pintfhdl = &(pio_queue->intf);
-   u8 (*_read8)(struct intf_hdl *pintfhdl, u32 addr);
+   struct intf_hdl *hdl = &adapter->pio_queue->intf;
 
-   _read8 = pintfhdl->io_ops._read8;
-   return  _read8(pintfhdl, addr);
+   return hdl->io_ops._read8(hdl, addr);
 }
 
 u16 r8712_read16(struct _adapter *adapter, u32 addr)
 {
-   struct io_queue *pio_queue = adapter->pio_queue;
-   struct intf_hdl *pintfhdl = &(pio_queue->intf);
-   u16 (*_read16)(struct intf_hdl *pintfhdl, u32 addr);
+   struct intf_hdl *hdl = &adapter->pio_queue->intf;
 
-   _read16 = pintfhdl->io_ops._read16;
-   return _read16(pintfhdl, addr);
+   return hdl->io_ops._read16(hdl, addr);
 }
 
 u32 r8712_read32(struct _adapter *adapter, u32 addr)
 {
-   struct io_queue *pio_queue = adapter->pio_queue;
-   struct intf_hdl *pintfhdl = &(pio_queue->intf);
-   u32 (*_read32)(struct intf_hdl *pintfhdl, u32 addr);
+   struct intf_hdl *hdl = &adapter->pio_queue->intf;
 
-   _read32 = pintfhdl->io_ops._read32;
-   return _read32(pintfhdl, addr);
+   return hdl->io_ops._read32(hdl, addr);
 }
 
 void r8712_write8(struct _adapter *adapter, u32 addr, u8 val)
 {
-   struct io_queue *pio_queue = adapter->pio_queue;
-   struct intf_hdl *pintfhdl = &(pio_queue->intf);
-   void (*_write8)(struct intf_hdl *pintfhdl, u32 addr, u8 val);
+   struct intf_hdl *hdl = &adapter->pio_queue->intf;
 
-   _write8 = pintfhdl->io_ops._write8;
-   _write8(pintfhdl, addr, val);
+   hdl->io_ops._write8(hdl, addr, val);
 }
 
 void r8712_write16(struct _adapter *adapter, u32 addr, u16 val)
 {
-   struct io_queue *pio_queue = adapter->pio_queue;
-   struct intf_hdl *pintfhdl = &(pio_queue->intf);
-   void (*_write16)(struct intf_hdl *pintfhdl, u32 addr, u16 val);
+   struct intf_hdl *hdl = &adapter->pio_queue->intf;
 
-   _write16 = pintfhdl->io_ops._write16;
-   _write16(pintfhdl, addr, val);
+   hdl->io_ops._write16(hdl, addr, val);
 }
 
 void r8712_write32(struct _adapter *adapter, u32 addr, u32 val)
 {
-   struct io_queue *pio_queue = adapter->pio_queue;
-   struct intf_hdl *pintfhdl  = &(pio_queue->intf);
+   struct intf_hdl *hdl = &adapter->pio_queue->intf;
 
-   void (*_write32)(struct intf_hdl *pintfhdl, u32 addr, u32 val);
-
-   _write32 = pintfhdl->io_ops._write32;
-   _write32(pintfhdl, addr, val);
+   hdl->io_ops._write32(hdl, addr, val);
 }
 
 void r8712_read_mem(struct _adapter *adapter, u32 addr, u32 cnt, u8 *pmem)
 {
-   struct io_queue *pio_queue = adapter->pio_queue;
-   struct intf_hdl *pintfhdl = &(pio_queue->intf);
+   struct intf_hdl *hdl = &adapter->pio_queue->intf;
 
-   void (*_read_mem)(struct intf_hdl *pintfhdl, u32 addr, u32 cnt,
- u8 *pmem);
if (adapter->bDriverStopped || adapter->bSurpriseRemoved)
return;
-   _read_mem = pintfhdl->io_ops._read_mem;
-   _read_mem(pintfhdl, addr, cnt, pmem);
+
+   hdl->io_ops._read_mem(hdl, addr, cnt, pmem);
 }
 
 void r8712_write_mem(struct _adapter *adapter, u32 addr, u32 cnt, u8 *pmem)
 {
-   struct io_queue *pio_queue = adapter->pio_queue;
-   struct intf_hdl *pintfhdl = &(pio_queue->intf);
-   void (*_write_mem)(struct intf_hdl *pintfhdl, u32 addr, u32 cnt,
-  u8 *pmem);
+   struct intf_hdl *hdl = &adapter->pio_queue->intf;
 
-   _write_mem = pintfhdl->io_ops._write_mem;
-   _write_mem(pintfhdl, addr, cnt, pmem);
+   hdl->io_ops._write_mem(hdl, addr, cnt, pmem);
 }
 
 void r8712_read_port(struct _adapter *adapter, u32 addr, u32 cnt, u8 *pmem)
 {
-   struct io_queue *pio_queue = adapter->pio_queue;
-   struct intf_hdl *pintfhdl = &(pio_queue->intf);
+   struct intf_hdl *hdl = &adapter->pio_queue->intf;
 
-   u32 (*_read_port)(struct intf_hdl *pintfhdl, u32 addr, u32 cnt,
- u8 *pmem);
if (adapter->bDriverStopped || adapter->bSurpriseRemoved)
return;
-   _read_port = pintfhdl->io_ops._read_port;
-   _read_port(pintfhdl, addr, cnt, pmem);
+
+   hdl->io_ops._read_port(hdl, addr, cnt, pmem);
 }
 
 void r8712_write_port(struct _adapter *adapter, u32 addr, u32 cnt, u8 *pmem)
 {
-   struct io_queue *pio_queue = adapter->pio_queue;
-   struct intf_hdl *pintfhdl = &(pio_queue->intf);
+   struct intf_hdl *hdl = &adapter->pio_queue->intf;
 
-   u32
