date:20150914

Re: [PATCH v6 1/3] can: Allwinner A10/A20 CAN Controller support - Devicetree bindings

2015-09-14 Thread Marc Kleine-Budde

On 09/14/2015 02:54 PM, Gerhard Bertelsmann wrote:
> Signed-off-by: Gerhard Bertelsmann 
> ---
> 
>  .../devicetree/bindings/net/can/sun4i_can.txt  |  37 +
>  1 files changed, 37 insertions(+)
> 
> 
> diff --git a/Documentation/devicetree/bindings/net/can/sun4i_can.txt 
> b/Documentation/devicetree/bindings/net/can/sun4i_can.txt
> new file mode 100644
> index 000..b572e2b
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/net/can/sun4i_can.txt
> @@ -0,0 +1,37 @@
> +Allwinner A10/A20 CAN controller Device Tree Bindings
> +-
> +
> +Required properties:
> +- compatible: "allwinner,sunxican"
> +- reg: physical base address and size of the Allwinner A10/A20 CAN register 
> map.
> +- interrupts: interrupt specifier for the sole interrupt.
> +- clock: phandle and clock specifier.
> +
> +Example
> +---
> +
> +SoC common .dtsi file:
> +
> + can0_pins_a: can0  0 {

That's supposed to be a proper "@":

can0@0

> + allwinner,pins = "PH20","PH21";
> + allwinner,function = "can";
> + allwinner,drive = <0>;
> + allwinner,pull = <0>;
> + };
> +
> + can0: can  01c2bc00 {

can0@01c2bc00

> + compatible = "allwinner,sunxican";
> + reg = <0x01c2bc00 0x400>;
> + interrupts = <0 26 4>;
> + clocks = <_gates 4>;
> + status = "disabled";
> + };
> +
> +Board specific .dts file:
> +
> + can0: can  01c2bc00 {

can0@01c2bc00

> + pinctrl-names = "default";
> + pinctrl-0 = <_pins_a>;
> + status = "okay";
> + };
> +
> 

Marc

-- 
Pengutronix e.K.  | Marc Kleine-Budde   |
Industrial Linux Solutions| Phone: +49-231-2826-924 |
Vertretung West/Dortmund  | Fax:   +49-5121-206917- |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |




Re: [PATCH v6 2/3] can: Allwinner A10/A20 CAN Controller support - Defconfigs

2015-09-14 Thread Marc Kleine-Budde

On 09/14/2015 02:54 PM, Gerhard Bertelsmann wrote:

You might want to add a commit message here. :D

> Signed-off-by: Gerhard Bertelsmann 

Marc


Re: mvneta: SGMII fixed-link not so fixed

2015-09-14 Thread Stas Sergeev


14.09.2015 13:32, Russell King - ARM Linux wrote:
> I've been bringing up the mainline kernel on a new board, which has
> a Marvell SoC with a mvneta interface connected to a Marvell DSA
> switch.  The DSA switch is a 88E6176.
>
> In the DT block for the interface, I specify:
>
>     ethernet@... {
>         phy-mode = "sgmii";
>         status = "okay";
>
>         fixed-link {
>             speed = <1000>;
>             full-duplex;
>         };
>     };
>
> However, this doesn't work without patching mvneta to disable the
> autonegotiation and forcing the link up:
>
> diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
> index 62e48bc0cb23..e1698731e429 100644
> --- a/drivers/net/ethernet/marvell/mvneta.c
> +++ b/drivers/net/ethernet/marvell/mvneta.c
> @@ -1010,6 +1010,10 @@ static void mvneta_defaults_set(struct mvneta_port *pp)
> val |= MVNETA_GMAC_INBAND_AN_ENABLE |
>    MVNETA_GMAC_AN_SPEED_EN |
>    MVNETA_GMAC_AN_DUPLEX_EN;
> +   /* We appear to need the interface forced for DSA switches */
> +   val &= ~(MVNETA_GMAC_AN_DUPLEX_EN |
> +MVNETA_GMAC_AN_SPEED_EN);
> +   val |= MVNETA_GMAC_FORCE_LINK_PASS;
> mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG, val);
> val = mvreg_read(pp, MVNETA_GMAC_CLOCK_DIVIDER);
> val |= MVNETA_GMAC_1MS_CLOCK_ENABLE;

Hello Russell, just to make sure, aren't you missing this by any chance:
https://lkml.org/lkml/2015/7/20/710
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 00/39] drop null test before destroy functions

2015-09-14 Thread SF Markus Elfring

> Recent commits to kernel/git/torvalds/linux.git have made the following
> functions able to tolerate NULL arguments:
>
> kmem_cache_destroy (commit 3942d29918522)
> mempool_destroy (commit 4e3ca3e033d1)
> dma_pool_destroy (commit 44d7175da6ea)

What do you think about extending another SmPL script?

Related topic:
scripts/coccinelle/free: Delete NULL test before freeing functions
https://systeme.lip6.fr/pipermail/cocci/2015-May/001960.html
https://www.mail-archive.com/cocci@systeme.lip6.fr/msg01855.html


> If these changes are OK, I will address the remainder later.

Would anybody like to reuse my general SmPL approach for similar source
code clean-up?

Regards,
Markus
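A minimal user-space sketch of the cleanup pattern discussed above. It assumes a hypothetical `pool` type in place of the kernel caches and pools; none of the names below come from the patch series. Once the destroy function tolerates NULL, as the quoted commits say kmem_cache_destroy(), mempool_destroy() and dma_pool_destroy() now do, callers can drop their own NULL tests.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kernel cache or pool; not a kernel type. */
struct pool {
	void *backing;
};

struct pool *pool_create(size_t size)
{
	struct pool *p;

	assert(size > 0);	/* malloc(0) may legitimately return NULL */
	p = malloc(sizeof(*p));
	if (!p)
		return NULL;
	p->backing = malloc(size);
	if (!p->backing) {
		free(p);
		return NULL;
	}
	return p;
}

/* Tolerates NULL, like free(): the test lives here, once. */
void pool_destroy(struct pool *p)
{
	if (!p)
		return;
	free(p->backing);
	free(p);
}

struct ctx {
	struct pool *rx;
	struct pool *tx;
};

/* No "if (c->rx)" guards are needed before the destroy calls. */
void ctx_teardown(struct ctx *c)
{
	pool_destroy(c->rx);
	c->rx = NULL;
	pool_destroy(c->tx);
	c->tx = NULL;
}
```

Calling ctx_teardown() on a context where only `rx` was allocated is safe, which is the case the removed NULL tests used to guard.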

Re: [PATCH net 2/2] 8139cp: reset BQL when ring tx ring cleared

2015-09-14 Thread David Woodhouse

On Mon, 2013-05-20 at 17:27 -0700, Stephen Hemminger wrote:
> On Mon, 20 May 2013 23:37:28 +0200
> Francois Romieu  wrote:
> 
> > cp_stop_hw includes netdev_reset_queue.
> > 
> > You have imho exhibited a start_xmit after cp_stop_hw race - not sure if
> > it happens in cp_tx_timeout or cp_change_mtu. Reverting the analysis above,
> > I have not found a place where cp_stop_hw could be called without being
> > followed by a cp_clean_rings. The netdev_reset_queue in cp_stop_hw, now
> > useless, should thus be removed.
> > 
> > Does it make sense ?
> 
> You're right, you could probably remove it.
> 
> It doesn't solve the problem, still seeing transmit timeouts.
> Looks like what happens with DHCP is something else.

Did you ever work this out? I'm seeing something similar on the
inward-facing interface on my home router under high load, and it
doesn't automatically recover.



[308309.340644] [ cut here ]
[308309.345379] WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0x103/0x190()
[308309.352789] Hardware name: Geos
[308309.356020] NETDEV WATCHDOG: eth1 (8139cp): transmit queue 0 timed out
[308309.362733] Modules linked in: sch_fq_codel sch_teql gpio_keys_polled 
leds_gpio geodewdt solos_pci ledtrig_heartbeat gpio_cs5535 cs5535_clockevt 
8139cp ip6t_REJECT ip6t_rt ip6t_hbh ip6t_mh ip6t_ipv6header ip6t_frag 
ip6t_eui64 ip6t_ah ip6table_raw ip6table_mangle ip6table_filter ip6_tables 
nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_irc nf_conntrack_irc nf_nat_ftp 
nf_conntrack_ftp xt_HL xt_hl xt_ecn ipt_ECN xt_CLASSIFY xt_time xt_tcpmss 
xt_statistic xt_mark xt_length xt_DSCP xt_dscp cs5535_mfgpt cs5535_mfd mfd_core 
ipt_MASQUERADE nf_nat xt_recent xt_helper xt_connmark xt_connbytes pptp 
l2tp_ppp pppoe xt_conntrack xt_CT iptable_raw xt_state nf_conntrack_ipv4 
nf_defrag_ipv4 nf_conntrack pppox pppoatm ipt_REJECT xt_TCPMSS xt_comment 
xt_multiport xt_mac xt_limit iptable_mangle iptable_filter ip_tables xt_tcpudp 
x_tables nsc_gpio ip_gre gre sit l2tp_netlink l2tp_core ppp_mppe tunnel4 tun 
ppp_async ppp_generic slhc br2684 atm crc_ccitt ipv6 input_polldev msr 
input_core sha1_generic geode_aes ecb arc4 aes_i586 ohci_hcd ehci_hcd usbcore 
usb_common
[308309.457239] Pid: 0, comm: swapper Not tainted 3.7.1 #1
[308309.462463] Call Trace:
[308309.465020]  [] ? warn_slowpath_common+0x87/0xb0
[308309.470691]  [] ? dev_watchdog+0x103/0x190
[308309.475755]  [] ? warn_slowpath_fmt+0x33/0x40
[308309.481159]  [] ? dev_watchdog+0x103/0x190
[308309.486244]  [] ? pfifo_fast_dequeue+0xd0/0xd0
[308309.491751]  [] ? call_timer_fn.isra.42+0x1c/0x80
[308309.497422]  [] ? process_backlog+0x54/0xe0
[308309.502674]  [] ? run_timer_softirq+0x12a/0x160
[308309.508169]  [] ? pfifo_fast_dequeue+0xd0/0xd0
[308309.513697]  [] ? __do_softirq+0x6d/0x110
[308309.518675]  [] ? __tasklet_schedule+0x40/0x40
[308309.524178][] ? irq_exit+0x31/0x60
[308309.529359]  [] ? do_IRQ+0x8d/0xb0
[308309.533723]  [] ? do_IRQ+0x8d/0xb0
[308309.538201]  [] ? common_interrupt+0x29/0x2e
[308309.543440]  [] ? rt_mutex_adjust_prio_chain+0x180/0x280
[308309.549829]  [] ? default_idle+0x14/0x30
[308309.554719]  [] ? cpu_idle+0x2f/0x50
[308309.559259]  [] ? start_kernel+0x286/0x28b
[308309.564414]  [] ? repair_env_string+0x4d/0x4d
[308309.569729] ---[ end trace 2e18cc211cee6089 ]---
[308309.574551] 8139cp :00:0b.0 eth1: Transmit timeout, status  c   2b0 80ff

-- 
dwmw2
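
The accounting concern named in the subject line can be modelled in a few lines. This is a toy user-space sketch, not the 8139cp code: the `txq` structure, its fixed `limit` and the function names are all made up for illustration, and real BQL adjusts its limit dynamically. It only shows why clearing a TX ring without resetting the byte counter can leave a queue stalled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of byte-queue accounting; all names are hypothetical. */
struct txq {
	unsigned int inflight;	/* bytes handed to hardware, not yet completed */
	unsigned int limit;	/* stop queueing at or above this many bytes */
	unsigned int ring_used;	/* descriptors currently in the TX ring */
};

bool txq_stopped(const struct txq *q)
{
	return q->inflight >= q->limit;
}

/* Rough analogue of start_xmit plus netdev_sent_queue(). */
bool txq_send(struct txq *q, unsigned int bytes)
{
	assert(q->limit > 0);
	if (txq_stopped(q))
		return false;
	q->inflight += bytes;
	q->ring_used++;
	return true;
}

/* Drop everything in the ring; optionally reset the accounting too. */
void txq_clean_ring(struct txq *q, bool reset_accounting)
{
	q->ring_used = 0;
	if (reset_accounting)
		q->inflight = 0;	/* rough analogue of netdev_reset_queue() */
}
```

In this model, cleaning the ring with `reset_accounting` false leaves `txq_stopped()` true even though the ring is empty. Whether that is what causes the timeouts reported here is a separate question.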





[PATCH v6 2/3] can: Allwinner A10/A20 CAN Controller support - Defconfigs

2015-09-14 Thread Gerhard Bertelsmann

Signed-off-by: Gerhard Bertelsmann 
---

 arch/arm/configs/multi_v7_defconfig|   1 +
 arch/arm/configs/sunxi_defconfig   |   2 +
 2 files changed, 3 insertions(+)


diff --git a/arch/arm/configs/multi_v7_defconfig 
b/arch/arm/configs/multi_v7_defconfig
index 03deb7f..14eb6b9 100644
--- a/arch/arm/configs/multi_v7_defconfig
+++ b/arch/arm/configs/multi_v7_defconfig
@@ -153,6 +153,7 @@ CONFIG_CAN_DEV=y
 CONFIG_CAN_AT91=m
 CONFIG_CAN_XILINXCAN=y
 CONFIG_CAN_MCP251X=y
+CONFIG_CAN_SUN4I=y
 CONFIG_BT=m
 CONFIG_BT_MRVL=m
 CONFIG_BT_MRVL_SDIO=m
diff --git a/arch/arm/configs/sunxi_defconfig b/arch/arm/configs/sunxi_defconfig
index 51eea22..fe020a5 100644
--- a/arch/arm/configs/sunxi_defconfig
+++ b/arch/arm/configs/sunxi_defconfig
@@ -31,6 +31,8 @@ CONFIG_IP_PNP_BOOTP=y
 # CONFIG_INET_LRO is not set
 # CONFIG_INET_DIAG is not set
 # CONFIG_IPV6 is not set
+CONFIG_CAN=y
+CONFIG_CAN_SUN4I=y
 # CONFIG_WIRELESS is not set
 CONFIG_DEVTMPFS=y
 CONFIG_DEVTMPFS_MOUNT=y

[PATCH v6 0/3] can: Allwinner A10/A20 CAN Controller support - Summary

2015-09-14 Thread Gerhard Bertelsmann

Hi,

please find attached the next try. Thanks to Marc and Maxime for their reviews.
To be honest, I made the defconfig changes to the best of my knowledge.

Please test and report any bugs you find.

[PATCH v6 1/3] Device Tree Binding Documentation
[PATCH v6 2/3] Defconfigs
[PATCH v6 3/3] Kernel Module

History:
v6: renamed the driver to sun4i as suggested by Maxime Ripard
removed module version
removed suspend and resume
moved clk enable from can_start into open / should be balanced
  between enabling and disabling now
freeing resources on error

v5: fix license
modify prefix to mode select defines
enable and disable clock in sunxican_get_berr_counter
delete set_normal_mode at the end of sunxi_can_start
removed sunxican_id_table
use devm_clk_get instead of clk_get
use devm_ioremap_resource to simplify probe and remove
make set-normal-mode and set-reset-mode more readable

v4: defines prefixed with SUNXI_
sunxi_can_write_cmdreg tweaked
loops in set_xxx_mode reworked
add return value to set_xxx_mode
sunxican_start_xmit reworked
struct platform_driver stripped
moved set_bittiming into open
moved clock start into open
add clock stop to close
suspend reworked
resume reworked
fixed double counting bug

v3: changed error state change handling (thx Andri for the hint)
use bittiming function correctly (no need to call it)
strip down priv (suggested by Marc)
scripts/checkpatch.pl-> no matches anymore
sparse -> no errors or warnings anymore
v2: cleaning
v1: initial

Signed-off-by: Gerhard Bertelsmann 
---

 .../devicetree/bindings/net/can/sun4i_can.txt  |  37 +
 arch/arm/configs/multi_v7_defconfig|   1 +
 arch/arm/configs/sunxi_defconfig   |   2 +
 drivers/net/can/Kconfig|  10 +
 drivers/net/can/Makefile   |   1 +
 drivers/net/can/sun4i_can.c| 824 +
 6 files changed, 875 insertions(+)

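The v6 note about balancing clock enable and disable between open and close can be sketched as follows. This is a user-space model under stated assumptions: `fake_clk`, `dev_open()` and `dev_close()` are invented names, and the `fail_start` flag merely simulates a failing start step. The driver itself uses the kernel clk API, which is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical clock with an enable count, standing in for struct clk. */
struct fake_clk {
	int enable_count;
};

struct fake_dev {
	struct fake_clk clk;
	bool running;
};

static void clk_on(struct fake_clk *c)
{
	c->enable_count++;
}

static void clk_off(struct fake_clk *c)
{
	assert(c->enable_count > 0);	/* catches an unbalanced disable */
	c->enable_count--;
}

/* Returns 0 on success, -1 on failure; the clock is never left on after an error. */
int dev_open(struct fake_dev *d, bool fail_start)
{
	clk_on(&d->clk);
	if (fail_start) {
		clk_off(&d->clk);	/* undo the enable on the error path */
		return -1;
	}
	d->running = true;
	return 0;
}

/* Mirror image of a successful dev_open(). */
void dev_close(struct fake_dev *d)
{
	d->running = false;
	clk_off(&d->clk);
}
```

Every path through dev_open() and dev_close() leaves `enable_count` where it started, which is the property "balanced" refers to.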

[PATCH v6 3/3] can: Allwinner A10/A20 CAN Controller support - Kernel module

2015-09-14 Thread Gerhard Bertelsmann

Signed-off-by: Gerhard Bertelsmann 
---

 drivers/net/can/Kconfig|  10 +
 drivers/net/can/Makefile   |   1 +
 drivers/net/can/sun4i_can.c| 824 +
 3 files changed, 835 insertions(+)


diff --git a/drivers/net/can/Kconfig b/drivers/net/can/Kconfig
index e8c96b8..6d04183 100644
--- a/drivers/net/can/Kconfig
+++ b/drivers/net/can/Kconfig
@@ -129,6 +129,16 @@ config CAN_RCAR
  To compile this driver as a module, choose M here: the module will
  be called rcar_can.
 
+config CAN_SUN4I
+   tristate "Allwinner A10 CAN controller"
+   depends on MACH_SUN4I || MACH_SUN7I || COMPILE_TEST
+   ---help---
+ Say Y here if you want to use CAN controller found on Allwinner
+ A10/A20 SoCs.
+
+ To compile this driver as a module, choose M here: the module will
+ be called sun4i_can.
+
 config CAN_XILINXCAN
tristate "Xilinx CAN"
depends on ARCH_ZYNQ || ARM64 || MICROBLAZE || COMPILE_TEST
diff --git a/drivers/net/can/Makefile b/drivers/net/can/Makefile
index c533c62..1f21cef 100644
--- a/drivers/net/can/Makefile
+++ b/drivers/net/can/Makefile
@@ -27,6 +27,7 @@ obj-$(CONFIG_CAN_FLEXCAN) += flexcan.o
 obj-$(CONFIG_PCH_CAN)  += pch_can.o
 obj-$(CONFIG_CAN_GRCAN)+= grcan.o
 obj-$(CONFIG_CAN_RCAR) += rcar_can.o
+obj-$(CONFIG_CAN_SUN4I)+= sun4i_can.o
 obj-$(CONFIG_CAN_XILINXCAN)+= xilinx_can.o
 
 subdir-ccflags-y += -D__CHECK_ENDIAN__
diff --git a/drivers/net/can/sun4i_can.c b/drivers/net/can/sun4i_can.c
new file mode 100644
index 000..b8cc89f
--- /dev/null
+++ b/drivers/net/can/sun4i_can.c
@@ -0,0 +1,824 @@
+/*
+ * sun4i_can.c - CAN bus controller driver for Allwinner SUN4I based SoCs
+ *
+ * Copyright (C) 2013 Peter Chen
+ * Copyright (C) 2015 Gerhard Bertelsmann
+ * All rights reserved.
+ *
+ * Parts of this software are based on (derived from) the SJA1000 code by:
+ *   Copyright (C) 2014 Oliver Hartkopp 
+ *   Copyright (C) 2007 Wolfgang Grandegger 
+ *   Copyright (C) 2002-2007 Volkswagen Group Electronic Research
+ *   Copyright (C) 2003 Matthias Brukner, Trajet Gmbh, Rebenring 33,
+ *   38106 Braunschweig, GERMANY
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of Volkswagen nor the names of its contributors
+ *may be used to endorse or promote products derived from this software
+ *without specific prior written permission.
+ *
+ * Alternatively, provided that this notice is retained in full, this
+ * software may be distributed under the terms of the GNU General
+ * Public License ("GPL") version 2, in which case the provisions of the
+ * GPL apply INSTEAD OF those given above.
+ *
+ * The provided data structures and external interfaces from this code
+ * are not restricted to be used by modules with a GPL compatible license.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
+ * DAMAGE.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define DRV_NAME "sun4i_can"
+
+/* Registers address (physical base address 0x01C2BC00) */
+#define SUNXI_REG_MSEL_ADDR0x  /* CAN Mode Select */
+#define SUNXI_REG_CMD_ADDR 0x0004  /* CAN Command */
+#define SUNXI_REG_STA_ADDR 0x0008  /* CAN Status */
+#define SUNXI_REG_INT_ADDR 0x000c  /* CAN Interrupt Flag */
+#define SUNXI_REG_INTEN_ADDR   0x0010  /* CAN Interrupt Enable */
+#define SUNXI_REG_BTIME_ADDR   0x0014  /* CAN Bus Timing 0 */
+#define SUNXI_REG_TEWL_ADDR0x0018  /* CAN Tx Error Warning Limit */
+#define

DSA: phy polling

2015-09-14 Thread Russell King - ARM Linux

Andrew,

I think you're the current maintainer of the Marvell DSA code, as being
the most recent author of changes to it. :)

I've noticed in my testing that the Marvell DSA code seems to poll the
internal phy link status in mv88e6xxx_poll_link(), and set the network
device carrier status according to the results.

However, the internal phys are created using phylib, which also polls
the phys for their link status, and controls the associated netdev
carrier status.

The side effect of this is that I see duplicated link status messages in
the kernel log when connecting or disconnecting cables from the switch,
caused by the code in mv88e6xxx_poll_link() racing with the phylib code.
From what I can see, the code in mv88e6xxx_poll_link() is entirely
redundant as the phylib layer will take care of any phy attached to the
switch.

To prove this, I have the following code in my tree, which disables the
polling on a port where we have a phy attached (either an internal or
external phy).  The result is that the per-port network devices are still
updated with the link status even though this code is disabled - thanks
to the phylib polling.

I'm left wondering whether the DSA specific phy polling does anything
useful, or whether the entire polling code both in mv88e6xxx.c and
net/dsa can be removed (mv88e6xxx.c seems to be its only user.)

diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index 26ec2fbfaa89..4c324eafeef2 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -400,6 +400,13 @@ void mv88e6xxx_poll_link(struct dsa_switch *ds)
if (dev == NULL)
continue;
 
+   /*
+* Ignore ports which have a phy; phylib will take care
+* of polling the link status for these.
+*/
+   if (dsa_slave_has_phy(dev))
+   continue;
+
link = 0;
if (dev->flags & IFF_UP) {
port_status = mv88e6xxx_reg_read(ds, REG_PORT(i),
diff --git a/include/net/dsa.h b/include/net/dsa.h
index fbca63ba8f73..b31e9da43ea7 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -176,6 +176,8 @@ static inline bool dsa_is_port_initialized(struct 
dsa_switch *ds, int p)
return ds->phys_port_mask & (1 << p) && ds->ports[p];
 }
 
+extern bool dsa_slave_has_phy(struct net_device *);
+
 static inline u8 dsa_upstream_port(struct dsa_switch *ds)
 {
struct dsa_switch_tree *dst = ds->dst;
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 35c47ddd04f0..a107242816ff 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -10,6 +10,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -873,6 +874,14 @@ int dsa_slave_resume(struct net_device *slave_dev)
return 0;
 }
 
+bool dsa_slave_has_phy(struct net_device *slave_dev)
+{
+   struct dsa_slave_priv *p = netdev_priv(slave_dev);
+
+   return p->phy != NULL;
+}
+EXPORT_SYMBOL_GPL(dsa_slave_has_phy);
+
 int dsa_slave_create(struct dsa_switch *ds, struct device *parent,
 int port, char *name)
 {


-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
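The effect of the hunk in mv88e6xxx_poll_link() can be modelled in user space. In this sketch `struct port` and `switch_poll_link()` are hypothetical and only mimic the control flow: ports with an attached phy are skipped, on the assumption (argued above) that phylib already updates their carrier state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-port state; not the DSA data structures. */
struct port {
	bool present;	/* a net_device exists for this port */
	bool has_phy;	/* a phy is attached, so phylib polls it */
	bool hw_link;	/* link bit as read from the switch */
	bool carrier;	/* carrier state reported to the stack */
};

/* Returns how many ports had their carrier changed by the switch poll. */
int switch_poll_link(struct port *ports, size_t n)
{
	int updated = 0;
	size_t i;

	assert(ports != NULL || n == 0);
	for (i = 0; i < n; i++) {
		struct port *p = &ports[i];

		if (!p->present)
			continue;
		/* Ignore ports which have a phy; phylib handles these. */
		if (p->has_phy)
			continue;
		if (p->carrier != p->hw_link) {
			p->carrier = p->hw_link;
			updated++;
		}
	}
	return updated;
}
```

With the skip in place, only phy-less ports are touched by the switch-level poll, so a port is never updated from two places.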

[PATCH V6 net-next 1/2] net: introduce socket family constants

2015-09-14 Thread Ursula Braun

From: Ursula Braun 

The new socket family is assigned the next available address / protocol
family constant 41.

Signed-off-by: Ursula Braun 
---
 include/linux/socket.h |  4 +++-
 include/net/smc.h  | 12 
 2 files changed, 15 insertions(+), 1 deletion(-)
 create mode 100644 include/net/smc.h

diff --git a/include/linux/socket.h b/include/linux/socket.h
index 5bf59c8..1adcbcc 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -200,7 +200,8 @@ struct ucred {
 #define AF_ALG 38  /* Algorithm sockets*/
 #define AF_NFC 39  /* NFC sockets  */
 #define AF_VSOCK   40  /* vSockets */
-#define AF_MAX 41  /* For now.. */
+#define AF_SMC 41  /* smc sockets  */
+#define AF_MAX 42  /* For now.. */
 
 /* Protocol families, same as address families. */
 #define PF_UNSPEC  AF_UNSPEC
@@ -246,6 +247,7 @@ struct ucred {
 #define PF_ALG AF_ALG
 #define PF_NFC AF_NFC
 #define PF_VSOCK   AF_VSOCK
+#define PF_SMC AF_SMC
 #define PF_MAX AF_MAX
 
 /* Maximum queue length specifiable by listen.  */
diff --git a/include/net/smc.h b/include/net/smc.h
new file mode 100644
index 000..68cdaae
--- /dev/null
+++ b/include/net/smc.h
@@ -0,0 +1,12 @@
+/*
+ * SMC Definitions for the SMC protocol.
+ *
+ * Author: Ursula Braun 
+ */
+#ifndef _SMC_H
+#define _SMC_H
+
+/* SMC socket options - disjunct with TCP socket options */
+#define SMC_KEEPALIVE  99  /* start/stop keepalives */
+
+#endif /* _SMC_H */
-- 
2.3.8


[PATCH V6 net-next 0/2] net: implement SMC-R solution

2015-09-14 Thread Ursula Braun

From: Ursula Braun 

Dave,

this is V6 of my SMC-R patches, addressing your V5 comment:
remove trailing blank lines in include/net/smc.h (and other new files).

Since you are asking for a solution "100% in our own separate module
with our own can of worms", we have to give up the transparent detection
whether a communication peer can do SMC-R or not (this has been the
purpose of the rejected TCP hooks). Instead, we want just the new
self-contained SMC-R socket family added to the kernel.
By the way, since August 2015 the SMC-R Informational RFC is no longer
a draft, but published as RFC7609.

V6 changes:
1. Remove trailing blank lines in new files

V5 changes:
1. No longer invoke tcp_set_keepalive() to implement setsockopt
   SO_KEEPALIVE.
2. Enforce inverse christmas tree ordering for all declarations of local
   function variables.
3. switch back to TCP if IPSEC is needed
4. fix dangling sockets
5. make sure link groups are freed before module unload

V4 changes:
1. Remove tcp patches supporting TCP experimental options
2. Remove references to tcp_sock syn_smc flag in smc-code, since TCP
   experimental options are not supported by the Linux-tcp.
3. clc_wait_msg() simplified

V3 changes:
1. Avoid adding of new space for smc-related bits in the tcp structures.
2. Make the smc feature to be nearly zero cost using Static Keys / jump
   labels
3. Increase / decrease smc static key in the smc-code
4. Make sure the next-to-last patch does not break the build
5. Additional pnet table checking

V2 changes:
1. activate tcp changes for CONFIG_AFSMC only (as suggested by Eric Dumazet)
2. add additional hook in net/core/sock.c
3. fix bitfield endianness problem

Thanks,
Ursula

In 2013, IBM introduced an optimized communications solution for the
IBM zEnterprise EC12 and BC12 (s390 in Linux terminology) that is
comprised of the IBM 10GbE RoCE Express feature with Shared Memory
Communications-RDMA (SMC-R) protocol [1].
SMC-R is designed for the enterprise data center environment and is an open
protocol as specified in the informational RFC7609 [2], which was
published in August 2015. Another implementation of this protocol has been
available since 2013 with IBM z/OS Version 2 Release 1.

SMC-R provides a “sockets over RDMA” solution that leverages industry
standard RDMA over Converged Ethernet (RoCE) technology.

IBM has developed a Linux implementation of the SMC-R standard. A new
socket protocol family AF_SMC is introduced. A preload library can be used
to enable TCP-based applications to use SMC-R without changes. 
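
As a rough sketch of what such a preload library might decide, assuming it interposes on socket(): the function below maps IPv4 TCP stream sockets to the new family and leaves everything else alone. `SKETCH_AF_SMC` and `SKETCH_AF_MAX` (41 and 42, the values proposed in patch 1/2) and `smc_map_family()` are illustrative names, not part of the series, and IPv6 handling and socket type flags are ignored.

```c
#include <assert.h>
#include <sys/socket.h>

/* Values proposed in patch 1/2; treated as assumptions by this sketch. */
#define SKETCH_AF_SMC 41
#define SKETCH_AF_MAX 42

/*
 * Decide which address family a preload shim would pass on to socket().
 * Only plain AF_INET stream sockets are redirected.
 */
int smc_map_family(int domain, int type)
{
	assert(domain >= 0 && domain < SKETCH_AF_MAX);
	if (domain == AF_INET && type == SOCK_STREAM)
		return SKETCH_AF_SMC;
	return domain;
}
```

A real shim would also have to cope with the fallback to TCP discussed in this cover letter. That is out of scope here.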

Key aspects of SMC-R are: 
1. Provides optimized performance compared to standard TCP/IP over Ethernet
   within the data center for both request/response (latency) and streaming
   workloads (CPU savings) [3]. 
   Initial benchmarks on Linux on x86 processors have shown latency
   reduction of up to 52% with a throughput gain of 111% using SMC-R vs TCP
   for request/response message patterns (10 concurrent TCP connections
   with 16KB messages) and CPU savings of up to 69% for streaming data
   patterns (single TCP connection with 20MB of data in one direction).
   [1] is currently being updated to contain more detailed information on
   Linux and performance.
2. In order to preserve the traditional network administrative model the
   SMC-R protocol ties into the existing IP addresses and uses TCP's
   handshake to establish connections. This allows existing management
   tools and security infrastructure to control the creation of SMC
   connections.
3. The SMC-R protocol logically bonds multiple RoCE adapters together
   providing redundancy with transparent fail-over for improved high
   availability, increased bandwidth and load balancing across multiple
   RDMA-capable devices.
Without the rejected TCP Experimental Options the following aspects are
restricted; alternate solutions are in discussion. 
4. Due to its handshake protocol, SMC-R is compatible with (transparent to)
   existing TCP connection load balancers that are commonly used in the
   enterprise data center environment for multi-tier application workloads.
5. SMC-R's handshake protocol allows for transparent fallback to TCP/IP,
   should one of the peers not be capable of the protocol.

Additional SMC-R overview and reference materials are available [1].  

The SMC-R “rendezvous” protocol eliminates the need for RDMA-CM and the
exchange occurs through an initial TCP connection. Building on a TCP
connection to establish an SMC-R connection solves many key requirements.
The rendezvous process now occurs in one phase only:
1. TCP/IP 3-way exchange with TCP experimental options is skipped.
2. SMC-R 3-way exchange:
   It is assumed both partners indicate SMC-R capability. Then at the
   completion of the 3-way TCP handshake the SMC-R layers in each peer take
   control of the TCP connection and exchange their RDMA credentials. If
   this 3-way exchange completes successfully the connection continues

Re: [PATCH] xfrm6: Fix ICMPv6 and MH header checks in _decode_session6

2015-09-14 Thread Steffen Klassert

On Fri, Sep 11, 2015 at 09:57:20AM +0200, Mathias Krause wrote:
> From: Mathias Krause 
> 
> Ensure there's enough data left prior to calling pskb_may_pull(). If
> skb->data was already advanced, we'll call pskb_may_pull() with a
> negative value converted to unsigned int -- leading to a huge
> positive value. That won't matter in practice as pskb_may_pull()
> will likely fail in this case, but it leads to underflow reports on
> kernels handling such kind of over-/underflows, e.g. a PaX enabled
> kernel instrumented with the size_overflow plugin.
> 
> Reported-by: satmd 
> Reported-and-tested-by: Marcin Jurkowski 
> Signed-off-by: Mathias Krause 
> Cc: PaX Team 

Skipping upper-layer information due to a wrong length calculation
may also lead to incorrect policy lookups. Patch applied to the
ipsec tree, thanks!

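The conversion problem described in the commit message can be reproduced in isolation. The sketch below is not the applied patch: `pull_len_unchecked()` and `pull_len_checked()` are hypothetical helpers, and plain integer offsets stand in for the skb pointers. It shows how a negative length wraps to a huge unsigned value, and one possible way of checking first.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Offsets are relative to the start of the packet buffer:
 *   nh     - where the network header starts
 *   data   - where the current data pointer sits
 *   offset - upper-layer header offset from nh
 *   hdrlen - bytes of that header we want to inspect
 */
unsigned int pull_len_unchecked(int nh, int data, int offset, int hdrlen)
{
	/* A negative difference wraps to a huge unsigned value here. */
	return (unsigned int)(nh + offset + hdrlen - data);
}

/* Returns false when the wanted bytes end before the data pointer. */
bool pull_len_checked(int nh, int data, int offset, int hdrlen,
		      unsigned int *len)
{
	int need = nh + offset + hdrlen - data;

	assert(offset >= 0 && hdrlen >= 0);
	if (need < 0)
		return false;
	*len = (unsigned int)need;
	return true;
}
```

With the data pointer already advanced past the wanted header, the unchecked variant yields a length above INT_MAX, which is the kind of value the size_overflow plugin reports.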

mvneta: SGMII fixed-link not so fixed

2015-09-14 Thread Russell King - ARM Linux

I've been bringing up the mainline kernel on a new board, which has
a Marvell SoC with a mvneta interface connected to a Marvell DSA
switch.  The DSA switch is a 88E6176.

In the DT block for the interface, I specify:

ethernet@... {
phy-mode = "sgmii";
status = "okay";

fixed-link {
speed = <1000>;
full-duplex;
};
};

However, this doesn't work without patching mvneta to disable the
autonegotiation and forcing the link up:

diff --git a/drivers/net/ethernet/marvell/mvneta.c 
b/drivers/net/ethernet/marvell/mvneta.c
index 62e48bc0cb23..e1698731e429 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -1010,6 +1010,10 @@ static void mvneta_defaults_set(struct mvneta_port *pp)
val |= MVNETA_GMAC_INBAND_AN_ENABLE |
   MVNETA_GMAC_AN_SPEED_EN |
   MVNETA_GMAC_AN_DUPLEX_EN;
+   /* We appear to need the interface forced for DSA switches */
+   val &= ~(MVNETA_GMAC_AN_DUPLEX_EN |
+MVNETA_GMAC_AN_SPEED_EN);
+   val |= MVNETA_GMAC_FORCE_LINK_PASS;
mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG, val);
val = mvreg_read(pp, MVNETA_GMAC_CLOCK_DIVIDER);
val |= MVNETA_GMAC_1MS_CLOCK_ENABLE;

As Marvell likes to keep all their documentation secret, it's very hard
to know what's going on, and why this would be necessary.  However, when
running the board under a kernel built from Marvell's code base, it
sets this register in a similar way.

Since the link is definitely a fixed-link operating at 1G, full duplex,
with no autonegotiation, I think using the fixed-link in DT is correct,
but the mvneta driver is wrong to abuse fixed-link to mean "SGMII but
with in-band autonegotiation".

The other issue I've seen is that even with the fixed-link settings, I
have seen via ethtool that the link is reported as 10mbit - because
mvneta_fixed_link_update() changes the settings on the fixed link.  So,
fixed-link doesn't seem to be quite as fixed as "fixed" says it should
be.

Shouldn't SGMII (which is always gigabit) be treated as gigabit with
in-band negotiation when phy-mode = "sgmii" and there is no fixed-link,
and as a real fixed link, with fixed speed and other parameters and the
link forced up, when phy-mode = "sgmii" and there is a fixed-link?  It
seems to me that there's been a design mistake here, and my fear is that
as we seem to have had this mistake in the tree since April, it's now
almost impossible to support this setup without breaking DT compatibility.

However, it could be that the switch is misconfigured, and some register
bit somewhere isn't set to indicate that it should provide in-band
signalling on its SGMII interface.  I've trawled the net looking at
various bits of Marvell code for driving their switches, and I see
nothing which would cause them to enable in-band signalling, but maybe
I'm missing something.
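
The fixed-link semantics argued for above can be sketched as a pure function. This is an illustration under stated assumptions, not the mvneta driver: the `AN_*` names and their bit positions are placeholders chosen for the sketch (they are not taken from Marvell documentation), and `sgmii_autoneg_config()` is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit positions; assumptions of this sketch, not datasheet values. */
#define AN_FORCE_LINK_PASS	(1u << 1)
#define AN_INBAND_ENABLE	(1u << 2)
#define AN_SET_MII_SPEED	(1u << 5)
#define AN_SET_GMII_SPEED	(1u << 6)
#define AN_SPEED_EN		(1u << 7)
#define AN_SET_FULL_DUPLEX	(1u << 12)
#define AN_DUPLEX_EN		(1u << 13)

#define AN_ALL	(AN_FORCE_LINK_PASS | AN_INBAND_ENABLE | AN_SET_MII_SPEED | \
		 AN_SET_GMII_SPEED | AN_SPEED_EN | AN_SET_FULL_DUPLEX | \
		 AN_DUPLEX_EN)

/*
 * phy-mode = "sgmii" without fixed-link: in-band autonegotiation.
 * phy-mode = "sgmii" with fixed-link: fixed speed/duplex, link forced up.
 */
uint32_t sgmii_autoneg_config(uint32_t val, bool fixed_link, int speed,
			      bool full_duplex)
{
	val &= ~AN_ALL;
	if (!fixed_link)
		return val | AN_INBAND_ENABLE | AN_SPEED_EN | AN_DUPLEX_EN;

	assert(speed == 10 || speed == 100 || speed == 1000);
	val |= AN_FORCE_LINK_PASS;
	if (speed == 1000)
		val |= AN_SET_GMII_SPEED;
	else if (speed == 100)
		val |= AN_SET_MII_SPEED;
	if (full_duplex)
		val |= AN_SET_FULL_DUPLEX;
	return val;
}
```

The point of the split is that a fixed-link result never has any autonegotiation bit set, so nothing can later renegotiate the link down to 10Mbit.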

Here are the switch registers - the mvneta is connected to port 5:

GLOBAL GLOBAL2   0123456
 0:  c844   0  100f 100f 100f 100f 100f  e09  e07
 1: 0   0 33333   3e3
 2: 0 0000000
 3: 0  1761 1761 1761 1761 1761 1761 1761
 4:  6000 258   430  430  430  430  430 373f  430
 5: 0  ff 0000000
 6: 01f0f7e   7d   7b   77   6f 505f   3f
 7: 0707f 0000000
 8: 07800  2080 2080 2080 2080 2080 2080 2080
 9: 01600 1111111
 a:   148   0 0000000
 b:  60001000 1248   10   20   40
 c: 0  7f 0000000
 d: 0 502 0000000
 e: 0   0 0000000
 f: 0 f00  dada dada dada dada dada dada dada
10: 0   0 0000000
11: 0   0 0000000
12:     0 0000000
13:     0 0000000
14:   400 0000000
15:     0 0000000
16:     033   33   33   33   33   330
17:     0 0000000
18:  fa411884  3210 3210 3210 3210 3210 3210 3210
19: 0 1e1  7654 7654 7654 7654 7654 7654 7654
1a:  5550   0 0000000
1b:   1fff869  8000 8000 8000 8000 8000 8000 8000
1c: 0   0 0000000
1d:  1000   0 0000000
1e: 0   0 0000000
1f: 0   0 0000000

The DSA switch is configured in DT with:

dsa@0 {
compatible = "marvell,dsa";

Re: mvneta: SGMII fixed-link not so fixed

2015-09-14 Thread Russell King - ARM Linux

On Mon, Sep 14, 2015 at 02:06:13PM +0300, Stas Sergeev wrote:
> 14.09.2015 13:32, Russell King - ARM Linux wrote:
> >I've been bringing up the mainline kernel on a new board, which has
> >a Marvell SoC with a mvneta interface connected to a Marvell DSA
> >switch.  The DSA switch is a 88E6176.
> >
> >In the DT block for the interface, I specify:
> >
> > ethernet@... {
> > phy-mode = "sgmii";
> > status = "okay";
> >
> > fixed-link {
> > speed = <1000>;
> > full-duplex;
> > };
> > };
> >
> >However, this doesn't work without patching mvneta to disable the
> >autonegotiation and forcing the link up:
> >
> >diff --git a/drivers/net/ethernet/marvell/mvneta.c 
> >b/drivers/net/ethernet/marvell/mvneta.c
> >index 62e48bc0cb23..e1698731e429 100644
> >--- a/drivers/net/ethernet/marvell/mvneta.c
> >+++ b/drivers/net/ethernet/marvell/mvneta.c
> >@@ -1010,6 +1010,10 @@ static void mvneta_defaults_set(struct mvneta_port 
> >*pp)
> > val |= MVNETA_GMAC_INBAND_AN_ENABLE |
> >MVNETA_GMAC_AN_SPEED_EN |
> >MVNETA_GMAC_AN_DUPLEX_EN;
> >+/* We appear to need the interface forced for DSA switches */
> >+val &= ~(MVNETA_GMAC_AN_DUPLEX_EN |
> >+ MVNETA_GMAC_AN_SPEED_EN);
> >+val |= MVNETA_GMAC_FORCE_LINK_PASS;
> > mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG, val);
> > val = mvreg_read(pp, MVNETA_GMAC_CLOCK_DIVIDER);
> > val |= MVNETA_GMAC_1MS_CLOCK_ENABLE;
> Hello Russell, just to make sure, aren't you missing this by any chance:
> https://lkml.org/lkml/2015/7/20/710

Thanks, I think that will solve it.  I have to wonder why that patch
(f8af8e6eb9509 in mainline) didn't make it into v4.2 though, as it's
billed as a regression that occurred in the previous merge window, and
given that it was sent in July, and we're now in September.  As it
wasn't in v4.2, it looks like it should be a stable candidate.

David, any objections to having the stable guys pick this regression
fix up, if they have not already done so?

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] net: dsa: actually force the speed on the CPU port

2015-09-14 Thread Russell King

Commit 54d792f257c6 ("net: dsa: Centralise global and port setup
code into mv88e6xxx.") merged in the 4.2 merge window broke the link
speed forcing for the CPU port of Marvell DSA switches.  The original
code was:

/* MAC Forcing register: don't force link, speed, duplex
 * or flow control state to any particular values on physical
 * ports, but force the CPU port and all DSA ports to 1000 Mb/s
 * full duplex.
 */
if (dsa_is_cpu_port(ds, p) || ds->dsa_port_mask & (1 << p))
REG_WRITE(addr, 0x01, 0x003e);
else
REG_WRITE(addr, 0x01, 0x0003);

but the new code does a read-modify-write:

reg = _mv88e6xxx_reg_read(ds, REG_PORT(port), PORT_PCS_CTRL);
if (dsa_is_cpu_port(ds, port) ||
ds->dsa_port_mask & (1 << port)) {
reg |= PORT_PCS_CTRL_FORCE_LINK |
PORT_PCS_CTRL_LINK_UP |
PORT_PCS_CTRL_DUPLEX_FULL |
PORT_PCS_CTRL_FORCE_DUPLEX;
if (mv88e6xxx_6065_family(ds))
reg |= PORT_PCS_CTRL_100;
else
reg |= PORT_PCS_CTRL_1000;

The link speed in the PCS control register is a two bit field.  Forcing
the link speed in this way doesn't ensure that the bit field is set to
the correct value - on the hardware I have here, the speed bitfield
remains set to 0x03, resulting in the speed not being forced to gigabit.

We must clear both bits before forcing the link speed.

Fixes: 54d792f257c6 ("net: dsa: Centralise global and port setup code into 
mv88e6xxx.")
Signed-off-by: Russell King 
---
 drivers/net/dsa/mv88e6xxx.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index 561342466076..26ec2fbfaa89 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -1387,6 +1387,7 @@ static int mv88e6xxx_setup_port(struct dsa_switch *ds, 
int port)
reg = _mv88e6xxx_reg_read(ds, REG_PORT(port), PORT_PCS_CTRL);
if (dsa_is_cpu_port(ds, port) ||
ds->dsa_port_mask & (1 << port)) {
+   reg &= ~PORT_PCS_CTRL_UNFORCED;
reg |= PORT_PCS_CTRL_FORCE_LINK |
PORT_PCS_CTRL_LINK_UP |
PORT_PCS_CTRL_DUPLEX_FULL |
-- 
2.1.0
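
The effect described in the commit message can be reproduced in user space. This is only a sketch: the bit values are assumptions inferred from the 0x003e and 0x0003 constants in the original code quoted above (a two-bit speed field in bits 1:0, where 0x3 means "not forced"), and the names are hypothetical rather than taken from the driver header.

```c
#include <stdint.h>

/* Assumed layout, inferred from the old 0x003e / 0x0003 register writes. */
#define PCS_SPEED_MASK  0x0003u /* two-bit speed field; 0x3 = unforced */
#define PCS_SPEED_1000  0x0002u /* speed field value for gigabit */
#define PCS_FORCE_BITS  0x003cu /* force link, link up, full duplex, force duplex */

/* Read-modify-write as in the regressed code: speed bits are only ORed in. */
static uint16_t pcs_force_or_only(uint16_t reg)
{
	return (uint16_t)(reg | PCS_FORCE_BITS | PCS_SPEED_1000);
}

/* Same, but the speed field is cleared first, as the one-line fix does. */
static uint16_t pcs_force_cleared(uint16_t reg)
{
	reg &= (uint16_t)~PCS_SPEED_MASK;
	return (uint16_t)(reg | PCS_FORCE_BITS | PCS_SPEED_1000);
}
```

Starting from a register that reads back 0x0003, the OR-only variant leaves the speed field at 0x3, while the cleared variant ends up with the 0x003e value that the pre-regression code wrote directly.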


Re: [fw filter]: Broken! fw mark based tc class selection not working

2015-09-14 Thread Jamal Hadi Salim


On 09/11/15 20:00, Cong Wang wrote:

On Fri, Sep 11, 2015 at 3:24 PM, Akshat Kakkar  wrote:



Hmm, I didn't know that before either. Looks like my tp->init change
breaks it.

Could you try the following patch?



I would just make init() empty for this classifier (return 0?).
If someone wants to add classids, change() is available.
The most common (efficient) use case is what Akshat shows.
So even the check in classify should optimize for that, i.e.
if (head == NULL)
do old method
else
...

cheers,
jamal
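
The suggestion above can be sketched in user-space C. The structure and function names here are hypothetical, and the sketch assumes that "old method" means using the packet mark directly as the class id; the handle checks done by the real classifier are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified filter entry: maps one fw mark to a class id. */
struct fw_filter {
	uint32_t mark;
	uint32_t classid;
	struct fw_filter *next;
};

/*
 * Returns 0 and stores the class id in *classid on a match, -1 otherwise.
 * With no filters configured (head == NULL) the mark itself is used as the
 * class id, which is the assumed "old method" fast path.
 */
static int fw_classify_sketch(const struct fw_filter *head, uint32_t mark,
			      uint32_t *classid)
{
	const struct fw_filter *f;

	if (head == NULL) {
		if (mark == 0)
			return -1;	/* unmarked packet: nothing to select */
		*classid = mark;
		return 0;
	}

	/* Filters were added through change(): look the mark up explicitly. */
	for (f = head; f != NULL; f = f->next) {
		if (f->mark == mark) {
			*classid = f->classid;
			return 0;
		}
	}
	return -1;
}
```

Testing head first keeps the common case, where no explicit filters exist, down to a single comparison.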

[PATCH v6 1/3] can: Allwinner A10/A20 CAN Controller support - Devicetree bindings

2015-09-14 Thread Gerhard Bertelsmann

Signed-off-by: Gerhard Bertelsmann 
---

 .../devicetree/bindings/net/can/sun4i_can.txt  |  37 +
 1 files changed, 37 insertions(+)


diff --git a/Documentation/devicetree/bindings/net/can/sun4i_can.txt 
b/Documentation/devicetree/bindings/net/can/sun4i_can.txt
new file mode 100644
index 000..b572e2b
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/can/sun4i_can.txt
@@ -0,0 +1,37 @@
+Allwinner A10/A20 CAN controller Device Tree Bindings
+-
+
+Required properties:
+- compatible: "allwinner,sunxican"
+- reg: physical base address and size of the Allwinner A10/A20 CAN register 
map.
+- interrupts: interrupt specifier for the sole interrupt.
+- clock: phandle and clock specifier.
+
+Example
+---
+
+SoC common .dtsi file:
+
+   can0_pins_a: can0  0 {
+   allwinner,pins = "PH20","PH21";
+   allwinner,function = "can";
+   allwinner,drive = <0>;
+   allwinner,pull = <0>;
+   };
+
+   can0: can  01c2bc00 {
+   compatible = "allwinner,sunxican";
+   reg = <0x01c2bc00 0x400>;
+   interrupts = <0 26 4>;
+   clocks = <_gates 4>;
+   status = "disabled";
+   };
+
+Board specific .dts file:
+
+   can0: can  01c2bc00 {
+   pinctrl-names = "default";
+   pinctrl-0 = <&can0_pins_a>;
+   status = "okay";
+   };
+

[PATCHv2 net] cxgb4vf: support for single-threading access to adapter mailbox registers

2015-09-14 Thread Hariprasad Shenai

The issue is that for the Virtual Function Driver, the only way to get the
Virtual Interface statistics is to issue mailbox commands to ask the
firmware for the VI Stats. And, because the VI Stats command can only
retrieve a smallish number of stats per mailbox command, we have to issue
three mailbox commands in quick succession. What we ran into was irqbalance
coming in every 10 seconds and interrogating every network interface in the
system.

Signed-off-by: Hariprasad Shenai 
---
V2: Updated the description and switched to the Linux completion API
instead of a for loop, based on review comments by David Miller

 drivers/net/ethernet/chelsio/cxgb4vf/adapter.h |  9 +
 .../net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c|  4 ++
 drivers/net/ethernet/chelsio/cxgb4vf/t4vf_hw.c | 46 +-
 3 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4vf/adapter.h 
b/drivers/net/ethernet/chelsio/cxgb4vf/adapter.h
index 6049f70..45b2768 100644
--- a/drivers/net/ethernet/chelsio/cxgb4vf/adapter.h
+++ b/drivers/net/ethernet/chelsio/cxgb4vf/adapter.h
@@ -348,6 +348,10 @@ struct sge {
 #define for_each_ethrxq(sge, iter) \
for (iter = 0; iter < (sge)->ethqsets; iter++)
 
+struct mbox_list {
+   struct list_head list;
+};
+
 /*
  * Per-"adapter" (Virtual Function) information.
  */
@@ -381,6 +385,11 @@ struct adapter {
 
/* various locks */
spinlock_t stats_lock;
+
+   /* support for single-threading access to adapter mailbox registers */
+   spinlock_t mbox_lock;
+   struct mbox_list mlist;
+   struct completion mbox_completion;
 };
 
 enum { /* adapter flags */
diff --git a/drivers/net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c 
b/drivers/net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c
index b2b5e5b..bd5799b 100644
--- a/drivers/net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c
@@ -2681,6 +2681,10 @@ static int cxgb4vf_pci_probe(struct pci_dev *pdev,
 */
spin_lock_init(&adapter->stats_lock);
 
+   spin_lock_init(&adapter->mbox_lock);
+   INIT_LIST_HEAD(&adapter->mlist.list);
+   init_completion(&adapter->mbox_completion);
+
/*
 * Map our I/O registers in BAR0.
 */
diff --git a/drivers/net/ethernet/chelsio/cxgb4vf/t4vf_hw.c 
b/drivers/net/ethernet/chelsio/cxgb4vf/t4vf_hw.c
index 63dd5fd..08b730f 100644
--- a/drivers/net/ethernet/chelsio/cxgb4vf/t4vf_hw.c
+++ b/drivers/net/ethernet/chelsio/cxgb4vf/t4vf_hw.c
@@ -34,6 +34,7 @@
  */
 
 #include 
+#include 
 
 #include "t4vf_common.h"
 #include "t4vf_defs.h"
@@ -125,6 +126,7 @@ int t4vf_wr_mbox_core(struct adapter *adapter, const void 
*cmd, int size,
const __be64 *p;
u32 mbox_data = T4VF_MBDATA_BASE_ADDR;
u32 mbox_ctl = T4VF_CIM_BASE_ADDR + CIM_VF_EXT_MAILBOX_CTRL;
+   struct mbox_list entry;
 
/*
 * Commands must be multiples of 16 bytes in length and may not be
@@ -134,6 +136,37 @@ int t4vf_wr_mbox_core(struct adapter *adapter, const void 
*cmd, int size,
size > NUM_CIM_VF_MAILBOX_DATA_INSTANCES * 4)
return -EINVAL;
 
+   /* Queue ourselves onto the mailbox access list.  When our entry is at
+* the front of the list, we have rights to access the mailbox.  So we
+* wait [for a while] till we're at the front [or bail out with an
+* EBUSY] ...
+*/
+   spin_lock(&adapter->mbox_lock);
+   list_add_tail(&entry.list, &adapter->mlist.list);
+   spin_unlock(&adapter->mbox_lock);
+
+   /* If we're at the head, break out and start the mailbox
+* protocol.
+*/
+   if (list_first_entry(&adapter->mlist.list,
+struct mbox_list, list) != &entry) {
+   int ret;
+
+   ret = wait_for_completion_timeout(&adapter->mbox_completion,
+ 4 * FW_CMD_MAX_TIMEOUT);
+   /* If we've waited too long, return a busy indication.  This
+* really ought to be based on our initial position in the
+* mailbox access list but this is a start.  We very rarely
+* contend on access to the mailbox ...
+*/
+   if (!ret) {
+   spin_lock(&adapter->mbox_lock);
+   list_del(&entry.list);
+   spin_unlock(&adapter->mbox_lock);
+   return -EBUSY;
+   }
+   }
+
/*
 * Loop trying to get ownership of the mailbox.  Return an error
 * if we can't gain ownership.
@@ -141,8 +174,12 @@ int t4vf_wr_mbox_core(struct adapter *adapter, const void 
*cmd, int size,
v = MBOWNER_G(t4_read_reg(adapter, mbox_ctl));
for (i = 0; v == MBOX_OWNER_NONE && i < 3; i++)
v = MBOWNER_G(t4_read_reg(adapter, mbox_ctl));
-   if (v != MBOX_OWNER_DRV)
+   if (v != MBOX_OWNER_DRV) {
+   spin_lock(&adapter->mbox_lock);
+   list_del(&entry.list);
+
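
The queueing scheme described in the patch comment (join at the tail, proceed only from the front, drop out on completion or timeout) can be sketched in user space as below. The types and function names are hypothetical stand-ins for the kernel list and the per-call entry, and the locking and completion handling are left out.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel list head and the waiter entry. */
struct mbox_waiter {
	struct mbox_waiter *next;
};

struct mbox_queue {
	struct mbox_waiter *head;
	struct mbox_waiter *tail;
};

/* Append a waiter at the tail (the patch does this under mbox_lock). */
static void mbox_enqueue(struct mbox_queue *q, struct mbox_waiter *w)
{
	w->next = NULL;
	if (q->tail)
		q->tail->next = w;
	else
		q->head = w;
	q->tail = w;
}

/* Only the waiter at the front may start the mailbox protocol. */
static int mbox_may_proceed(const struct mbox_queue *q,
			    const struct mbox_waiter *w)
{
	return q->head == w;
}

/* Remove a waiter wherever it sits, e.g. when done or after a timeout. */
static void mbox_dequeue(struct mbox_queue *q, struct mbox_waiter *w)
{
	struct mbox_waiter **pp = &q->head;
	struct mbox_waiter *prev = NULL;

	while (*pp && *pp != w) {
		prev = *pp;
		pp = &(*pp)->next;
	}
	if (!*pp)
		return;		/* not queued */
	*pp = w->next;
	if (q->tail == w)
		q->tail = prev;
	w->next = NULL;
}
```

A waiter that times out in the middle of the queue has to be removable without disturbing the order of the others, which is why the dequeue helper searches the list instead of only popping the head.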

Re: [PATCH v6 1/3] can: Allwinner A10/A20 CAN Controller support - Devicetree bindings

2015-09-14 Thread Maxime Ripard

Hi,

On Mon, Sep 14, 2015 at 02:58:21PM +0200, Marc Kleine-Budde wrote:
> On 09/14/2015 02:54 PM, Gerhard Bertelsmann wrote:
> > Signed-off-by: Gerhard Bertelsmann 
> > ---
> > 
> >  .../devicetree/bindings/net/can/sun4i_can.txt  |  37 +
> >  1 files changed, 37 insertions(+)
> > 
> > 
> > diff --git a/Documentation/devicetree/bindings/net/can/sun4i_can.txt 
> > b/Documentation/devicetree/bindings/net/can/sun4i_can.txt
> > new file mode 100644
> > index 000..b572e2b
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/net/can/sun4i_can.txt
> > @@ -0,0 +1,37 @@
> > +Allwinner A10/A20 CAN controller Device Tree Bindings
> > +-
> > +
> > +Required properties:
> > +- compatible: "allwinner,sunxican"
> > +- reg: physical base address and size of the Allwinner A10/A20 CAN 
> > register map.
> > +- interrupts: interrupt specifier for the sole interrupt.
> > +- clock: phandle and clock specifier.
> > +
> > +Example
> > +---
> > +
> > +SoC common .dtsi file:
> > +
> > +   can0_pins_a: can0  0 {
> 
> That's supposed to be a proper "@":
> 
> can0@0
> 
> > +   allwinner,pins = "PH20","PH21";
> > +   allwinner,function = "can";
> > +   allwinner,drive = <0>;
> > +   allwinner,pull = <0>;
> > +   };
> > +
> > +   can0: can  01c2bc00 {
> 
> can0@01c2bc00

Actually, beside '' vs '@', he was right.

The name of the node in the DT should be:

<name>@<unit-address>

Which in our case is can@01c2bc00, like he used.
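
As a small illustration of that naming rule, the node name can be derived from the generic device name and the base address in the reg property. The helper below is hypothetical, and the eight-digit zero-padded spelling is chosen only to match the form used in this thread.

```c
#include <stdio.h>
#include <stdint.h>

/*
 * Format "<name>@<unit-address>" into buf. The unit address is the first
 * address in the node's reg property, printed here as 8 hex digits to match
 * the spelling used in this thread. Returns the snprintf() result.
 */
static int dt_node_name(char *buf, size_t len, const char *name,
			uint32_t reg_base)
{
	return snprintf(buf, len, "%s@%08x", name, (unsigned int)reg_base);
}
```

Called with "can" and 0x01c2bc00 (the base address from the reg property in the example), this produces can@01c2bc00; the label in front of the colon (can0) is separate from the node name.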

> 
> > +   compatible = "allwinner,sunxican";

However, the compatible is still not right.

Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com



Re: [PATCH] x86: Wire up 32-bit direct socket calls

2015-09-14 Thread Ingo Molnar


* Andy Lutomirski  wrote:

> On Fri, Sep 11, 2015 at 3:14 AM, Arnd Bergmann  wrote:
> > On Friday 11 September 2015 11:54:50 Geert Uytterhoeven wrote:
> >> To make sure I don't miss any (it seems I missed recvmmsg and sendmmsg for
> >> the socketcall case, sigh), this is the list of ipc syscalls to implement?
> >>
> >> sys_msgget
> >> sys_msgctl
> >> sys_msgrcv
> >> sys_msgsnd
> >> sys_semget
> >> sys_semctl
> >> sys_semtimedop
> >> sys_shmget
> >> sys_shmctl
> >> sys_shmat
> >> sys_shmdt
> >>
> >> sys_semop() seems to be unneeded because it can be implemented using
> >> sys_semtimedop()?
> >>
> >
> > Yes, that list looks right. IPC also includes a set of six sys_mq_*
> > call, but I believe that everyone already has those as they are not
> > covered by sys_ipc.
> >
> > For y2038 compatibility, we will likely add a new variant of
> > semtimedop that takes a 64-bit timespec. While the argument passed
> > there is a relative time that will never need to be longer than 68
> > years, we need to accommodate user space that defines timespec
> > in a sane way, and converting the argument in libc would be awkward.
> >
> 
> I missed sys_ipc entirely.
> 
> Ingo, Thomas, want to just wire those up, too?  I can send a patch
> next week, but it'll be as trivial as the socket one.

Yeah, sure - split out system calls are so much better (and slightly faster)
than omnibus demuxers.

Thanks,

Ingo
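
For readers unfamiliar with the term, the difference between an omnibus demuxer and split-out calls can be sketched with toy functions. Everything below is hypothetical (the sub-call numbers, the handlers and their return values) and only shows the extra dispatch step that a sys_ipc-style entry point adds.

```c
#include <errno.h>

/* Hypothetical sub-call numbers and handlers, for illustration only. */
enum { DEMO_SEMGET = 1, DEMO_MSGGET = 2 };

static long demo_semget(long key) { return key + 100; }
static long demo_msgget(long key) { return key + 200; }

/* Omnibus style: one entry point, the first argument selects the operation. */
static long demo_ipc(unsigned int call, long arg)
{
	switch (call) {
	case DEMO_SEMGET:
		return demo_semget(arg);
	case DEMO_MSGGET:
		return demo_msgget(arg);
	default:
		return -ENOSYS;
	}
}
```

With split-out calls the caller reaches demo_semget() directly, so the switch and the extra selector argument disappear, and each operation can be traced or filtered on its own.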

Re: Netfilter: BUG: unable to handle kernel paging request, RIP: physdev_mt+0xd6/0x160

2015-09-14 Thread Florian Westphal

Sander Eikelenboom  wrote:
> On 2015-09-13 20:06, Florian Westphal wrote:
> >Sander Eikelenboom  wrote:
> >>Using a linux-4.3-rc1 kernel i encountered the splat below:
> >
> >Thanks for reporting this bug.
> >
> >>[  290.200642] BUG: unable to handle kernel paging request at
> >>0484195d
> >>[  290.211702] IP: [] physdev_mt+0xd6/0x160
> >[..]
> >
> >>[  290.444088]  [] ipt_do_table+0x210/0x390
> >>[  290.461951]  [] iptable_filter_hook+0x2e/0x70
> >>[  290.470756]  [] nf_iterate+0x4c/0x80
> >>[  290.479587]  [] nf_hook_slow+0x64/0xc0
> >>[  290.488341]  [] ip_forward+0x369/0x3c0
> >>[  290.496927]  [] ? ip_frag_mem+0x40/0x40
> >>[  290.505365]  [] ip_rcv_finish+0x101/0x330
> >>[  290.513480]  [] ip_rcv+0x291/0x390
> >>[  290.521562]  [] ?
> >
> >Aye, ip forwarding of bridged packets with call-iptables=1 is broken.
> >
> >Please, could you try this patch?  It fixes this bug for me.
> 
> Hi Florian,
> 
> Works for me too, thx for the fix !

Sorry, I made this claim too early.

We cannot use this fix, since it will still cause kernel oops when using
-j NFQUEUE in PRE_ROUTING (We would bump refcnt on ->physoutdev, which is
garbage in this case).

Only option is to undo 72b1e5e4cac as follows:

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -183,7 +183,8 @@ struct nf_bridge_info {
/* prerouting: detect dnat in orig/reply direction */
__be32  ipv4_daddr;
struct in6_addr ipv6_daddr;
-
+   };
+   union {
/* after prerouting + nat detected: store original source
 * mac since neigh resolution overwrites it, only used while
 * skb is out in neigh layer.
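
Why splitting the union matters can be shown with a user-space sketch. The two layouts below are hypothetical and only loosely modelled on nf_bridge_info; which fields actually overlapped before the change is not visible in the excerpt above, so the pairing of daddr with an output-device pointer is an assumption.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layouts; field names only loosely follow nf_bridge_info. */
struct shared_layout {
	union {
		uint32_t daddr;		/* valid during prerouting */
		void *outdev;		/* valid later on */
	} u;
};

struct split_layout {
	union { uint32_t daddr; } a;
	union { void *outdev; } b;
};

/* Returns 1 if storing daddr changed the bytes of the outdev pointer. */
static int shared_daddr_clobbers_outdev(void)
{
	struct shared_layout s;
	unsigned char before[sizeof(void *)], after[sizeof(void *)];

	s.u.outdev = NULL;
	memcpy(before, &s.u.outdev, sizeof(before));
	s.u.daddr = 0xc0a80001u;	/* overwrites the same storage */
	memcpy(after, &s.u.outdev, sizeof(after));
	return memcmp(before, after, sizeof(before)) != 0;
}

/* Same experiment with the members in separate unions. */
static int split_daddr_clobbers_outdev(void)
{
	struct split_layout s;
	unsigned char before[sizeof(void *)], after[sizeof(void *)];

	s.b.outdev = NULL;
	memcpy(before, &s.b.outdev, sizeof(before));
	s.a.daddr = 0xc0a80001u;	/* separate storage, pointer untouched */
	memcpy(after, &s.b.outdev, sizeof(after));
	return memcmp(before, after, sizeof(before)) != 0;
}
```

In the shared layout a reader of the pointer member sees whatever the address write left behind, which matches the "garbage" described above; the cost of the split layout is a larger structure.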


Re: [PATCH v6 3/3] can: Allwinner A10/A20 CAN Controller support - Kernel module

2015-09-14 Thread Marc Kleine-Budde

On 09/14/2015 02:54 PM, Gerhard Bertelsmann wrote:
> Signed-off-by: Gerhard Bertelsmann 
> ---
> 
>  drivers/net/can/Kconfig|  10 +
>  drivers/net/can/Makefile   |   1 +
>  drivers/net/can/sun4i_can.c| 824 
> +
>  3 files changed, 835 insertions(+)
> 
> 
> diff --git a/drivers/net/can/Kconfig b/drivers/net/can/Kconfig
> index e8c96b8..6d04183 100644
> --- a/drivers/net/can/Kconfig
> +++ b/drivers/net/can/Kconfig
> @@ -129,6 +129,16 @@ config CAN_RCAR
> To compile this driver as a module, choose M here: the module will
> be called rcar_can.
>  
> +config CAN_SUN4I
> + tristate "Allwinner A10 CAN controller"
> + depends on MACH_SUN4I || MACH_SUN7I || COMPILE_TEST
> + ---help---
> +   Say Y here if you want to use CAN controller found on Allwinner
> +   A10/A20 SoCs.
> +
> +   To compile this driver as a module, choose M here: the module will
> +   be called sun4i_can.
> +
>  config CAN_XILINXCAN
>   tristate "Xilinx CAN"
>   depends on ARCH_ZYNQ || ARM64 || MICROBLAZE || COMPILE_TEST
> diff --git a/drivers/net/can/Makefile b/drivers/net/can/Makefile
> index c533c62..1f21cef 100644
> --- a/drivers/net/can/Makefile
> +++ b/drivers/net/can/Makefile
> @@ -27,6 +27,7 @@ obj-$(CONFIG_CAN_FLEXCAN)   += flexcan.o
>  obj-$(CONFIG_PCH_CAN)+= pch_can.o
>  obj-$(CONFIG_CAN_GRCAN)  += grcan.o
>  obj-$(CONFIG_CAN_RCAR)   += rcar_can.o
> +obj-$(CONFIG_CAN_SUN4I)  += sun4i_can.o
>  obj-$(CONFIG_CAN_XILINXCAN)  += xilinx_can.o
>  
>  subdir-ccflags-y += -D__CHECK_ENDIAN__
> diff --git a/drivers/net/can/sun4i_can.c b/drivers/net/can/sun4i_can.c
> new file mode 100644
> index 000..b8cc89f
> --- /dev/null
> +++ b/drivers/net/can/sun4i_can.c
> @@ -0,0 +1,824 @@
> +/*
> + * sun4i_can.c - CAN bus controller driver for Allwinner SUN4I based 
> SoCs
> + *
> + * Copyright (C) 2013 Peter Chen
> + * Copyright (C) 2015 Gerhard Bertelsmann
> + * All rights reserved.
> + *
> + * Parts of this software are based on (derived from) the SJA1000 code by:
> + *   Copyright (C) 2014 Oliver Hartkopp 
> + *   Copyright (C) 2007 Wolfgang Grandegger 
> + *   Copyright (C) 2002-2007 Volkswagen Group Electronic Research
> + *   Copyright (C) 2003 Matthias Brukner, Trajet Gmbh, Rebenring 33,
> + *   38106 Braunschweig, GERMANY
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + * 1. Redistributions of source code must retain the above copyright
> + *notice, this list of conditions and the following disclaimer.
> + * 2. Redistributions in binary form must reproduce the above copyright
> + *notice, this list of conditions and the following disclaimer in the
> + *documentation and/or other materials provided with the distribution.
> + * 3. Neither the name of Volkswagen nor the names of its contributors
> + *may be used to endorse or promote products derived from this software
> + *without specific prior written permission.
> + *
> + * Alternatively, provided that this notice is retained in full, this
> + * software may be distributed under the terms of the GNU General
> + * Public License ("GPL") version 2, in which case the provisions of the
> + * GPL apply INSTEAD OF those given above.
> + *
> + * The provided data structures and external interfaces from this code
> + * are not restricted to be used by modules with a GPL compatible license.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
> + * DAMAGE.
> + *
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define DRV_NAME "sun4i_can"
> +
> +/* Registers address (physical base address 0x01C2BC00) */
> +#define SUNXI_REG_MSEL_ADDR  0x0000  /* CAN Mode Select */
> +#define SUNXI_REG_CMD_ADDR   0x0004  /* CAN Command */
> +#define SUNXI_REG_STA_ADDR   0x0008  /* CAN Status */
>

[PATCH] pptp: avoid releasing the sock too early

2015-09-14 Thread Sasha Levin

Since we're using RCU we can't free the sock structure before RCU lets us,
otherwise we're risking getting use-after-frees accessing it:

[982915.329359] BUG: KASan: use after free in pptp_connect+0xbe3/0xc10 at addr 
88006903e540
[982915.333044] Read of size 2 by task trinity-c4/27338
[982915.335176] page:ea0001a40f80 count:0 mapcount:0 mapping:  
(null) index:0x0
[982915.338684] flags: 0x1f8000()
[982915.340331] page dumped because: kasan: bad access detected
[982915.342766] CPU: 1 PID: 27338 Comm: trinity-c4 Not tainted 
4.2.0-next-20150911-sasha-00043-g353d875-dirty #2545
[982915.347043]  07cd 88025625fbf0 a8fc62ba 
88025625fc78
[982915.349843]  88025625fc68 a77e206e ed004c23799b 
8802611bcce0
[982915.353106]  0282  88025625fc60 
a7473789
[982915.357338] Call Trace:
[982915.358451] dump_stack (lib/dump_stack.c:52)
[982915.360649] kasan_report_error (mm/kasan/report.c:132 mm/kasan/report.c:193)
[982915.363184] ? __lock_is_held (kernel/locking/lockdep.c:3491)
[982915.36] __asan_report_load2_noabort (mm/kasan/report.c:249)
[982915.368322] ? rcu_read_lock_held (kernel/rcu/update.c:270)
[982915.370956] ? pptp_connect (drivers/net/ppp/pptp.c:123 
drivers/net/ppp/pptp.c:445)
[982915.373322] pptp_connect (drivers/net/ppp/pptp.c:123 
drivers/net/ppp/pptp.c:445)
[982915.375654] ? pptp_connect (drivers/net/ppp/pptp.c:445)
[982915.378180] ? pptp_bind (drivers/net/ppp/pptp.c:433)
[982915.380581] ? __might_fault (mm/memory.c:3797 (discriminator 1))
[982915.383047] ? __might_fault (./arch/x86/include/asm/current.h:14 
mm/memory.c:3795)
[982915.385407] SYSC_connect (net/socket.c:1549)
[982915.387598] ? SYSC_bind (net/socket.c:1532)
[982915.389677] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2594 
kernel/locking/lockdep.c:2636)
[982915.392286] ? _raw_spin_unlock_irq (./arch/x86/include/asm/preempt.h:95 
include/linux/spinlock_api_smp.h:171 kernel/locking/spinlock.c:199)
[982915.394779] ? alarm_setitimer (kernel/time/itimer.c:271)
[982915.397236] ? do_setitimer (kernel/time/itimer.c:254)
[982915.399620] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
[982915.402366] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2594 
kernel/locking/lockdep.c:2636)
[982915.405302] ? lockdep_sys_exit_thunk (arch/x86/entry/thunk_64.S:44)
[982915.407928] SyS_connect (net/socket.c:1530)
[982915.410141] entry_SYSCALL_64_fastpath (arch/x86/entry/entry_64.S:186)
[982915.413286] Memory state around the buggy address:
[982915.415530]  88006903e400: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
ff
[982915.418656]  88006903e480: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
ff
[982915.421680] >88006903e500: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
ff
[982915.424792]^
[982915.427125]  88006903e580: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
ff
[982915.430027]  88006903e600: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
ff

Signed-off-by: Sasha Levin 
---
 drivers/net/ppp/pptp.c   |9 -
 include/linux/if_pppox.h |1 +
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ppp/pptp.c b/drivers/net/ppp/pptp.c
index 686f37d..cb7a029 100644
--- a/drivers/net/ppp/pptp.c
+++ b/drivers/net/ppp/pptp.c
@@ -517,6 +517,13 @@ static int pptp_getname(struct socket *sock, struct 
sockaddr *uaddr,
return 0;
 }
 
+static void pptp_release_cb(struct rcu_head *rcu)
+{
+   struct pppox_sock *p = container_of(rcu, struct pppox_sock, rcu);
+
+   sock_put(sk_pppox(p));
+}
+
 static int pptp_release(struct socket *sock)
 {
struct sock *sk = sock->sk;
@@ -545,7 +552,7 @@ static int pptp_release(struct socket *sock)
sock->sk = NULL;
 
release_sock(sk);
-   sock_put(sk);
+   call_rcu(>rcu, pptp_release_cb);
 
return error;
 }
diff --git a/include/linux/if_pppox.h b/include/linux/if_pppox.h
index b49cf92..ba9c378 100644
--- a/include/linux/if_pppox.h
+++ b/include/linux/if_pppox.h
@@ -55,6 +55,7 @@ struct pppox_sock {
struct pptp_opt  pptp;
} proto;
__be16  num;
+   struct rcu_head rcu;
 };
 #define pppoe_dev  proto.pppoe.dev
 #define pppoe_ifindex  proto.pppoe.ifindex
-- 
1.7.10.4
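
The callback in the patch only receives a pointer to the embedded rcu member and uses container_of() to get back to the enclosing structure. A minimal user-space sketch of that pattern follows; the macro is a simplified equivalent without the kernel's type checking, and the structures are hypothetical stand-ins for struct rcu_head and struct pppox_sock. The deferral that call_rcu() provides is not modelled here.

```c
#include <stddef.h>

/* User-space equivalent of the kernel's container_of(), without type checks. */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for struct rcu_head and struct pppox_sock. */
struct cb_head {
	void (*func)(struct cb_head *head);
};

struct demo_sock {
	int refcnt;
	struct cb_head rcu;	/* embedded, like the member the patch adds */
};

/* The callback gets only the embedded head and recovers the outer object. */
static void demo_release_cb(struct cb_head *head)
{
	struct demo_sock *p = container_of_sketch(head, struct demo_sock, rcu);

	p->refcnt--;	/* stands in for sock_put() */
}
```

Because the head is embedded in the object, no separate allocation is needed to schedule the deferred release, which is why the patch adds a member to struct pppox_sock.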


Re: [PATCH 14/39] SUNRPC: drop null test before destroy functions

2015-09-14 Thread J. Bruce Fields

ACK, but assuming Trond takes this one.--b.

On Sun, Sep 13, 2015 at 02:15:07PM +0200, Julia Lawall wrote:
> Remove unneeded NULL test.
> 
> The semantic patch that makes this change is as follows:
> (http://coccinelle.lip6.fr/)
> 
> // <smpl>
> @@ expression x; @@
> -if (x != NULL)
>   \(kmem_cache_destroy\|mempool_destroy\|dma_pool_destroy\)(x);
> // </smpl>
> 
> Signed-off-by: Julia Lawall 
> 
> ---
>  net/sunrpc/sched.c |   12 
>  1 file changed, 4 insertions(+), 8 deletions(-)
> 
> diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
> index b140c09..425ca2f 100644
> --- a/net/sunrpc/sched.c
> +++ b/net/sunrpc/sched.c
> @@ -1092,14 +1092,10 @@ void
>  rpc_destroy_mempool(void)
>  {
>   rpciod_stop();
> - if (rpc_buffer_mempool)
> - mempool_destroy(rpc_buffer_mempool);
> - if (rpc_task_mempool)
> - mempool_destroy(rpc_task_mempool);
> - if (rpc_task_slabp)
> - kmem_cache_destroy(rpc_task_slabp);
> - if (rpc_buffer_slabp)
> - kmem_cache_destroy(rpc_buffer_slabp);
> + mempool_destroy(rpc_buffer_mempool);
> + mempool_destroy(rpc_task_mempool);
> + kmem_cache_destroy(rpc_task_slabp);
> + kmem_cache_destroy(rpc_buffer_slabp);
>   rpc_destroy_wait_queue(_queue);
>  }
>  
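
Dropping the caller-side test is only safe when the destroy function tolerates NULL itself, which the patch description implies for the functions it names. A user-space sketch of that convention, with a hypothetical pool type, looks like this:

```c
#include <stdlib.h>

/* Hypothetical pool type, for illustration only. */
struct demo_pool {
	void *elements;
};

static struct demo_pool *demo_pool_create(size_t bytes)
{
	struct demo_pool *p = malloc(sizeof(*p));

	if (!p)
		return NULL;
	p->elements = malloc(bytes);
	if (!p->elements) {
		free(p);
		return NULL;
	}
	return p;
}

/* Like free(), destroying a NULL pool is a no-op, so callers need no test. */
static void demo_pool_destroy(struct demo_pool *p)
{
	if (!p)
		return;
	free(p->elements);
	free(p);
}
```

With the check inside the destroy function, cleanup paths can call it unconditionally for every pool, which is the shape the patch gives rpc_destroy_mempool().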

Re: [RFC PATCH 3/3] devicetree: macb: Add optional property tsu-clk

2015-09-14 Thread Sören Brinkmann

On Mon, 2015-09-14 at 09:58AM +0200, Boris Brezillon wrote:
> Hi Harini,
> 
> On Mon, 14 Sep 2015 09:39:05 +0530
> Harini Katakam  wrote:
> 
> > On Fri, Sep 11, 2015 at 10:22 PM, Sören Brinkmann
> >  wrote:
> > > Hi Harini,
> > >
> > > On Fri, 2015-09-11 at 01:27PM +0530, Harini Katakam wrote:
> > >> Add TSU clock frequency to be used for 1588 support in macb driver.
> > >>
> > >> Signed-off-by: Harini Katakam 
> > >> ---
> > >>  Documentation/devicetree/bindings/net/macb.txt |3 +++
> > >>  1 file changed, 3 insertions(+)
> > >>
> > >> diff --git a/Documentation/devicetree/bindings/net/macb.txt 
> > >> b/Documentation/devicetree/bindings/net/macb.txt
> > >> index b5d7976..f7c0ea8 100644
> > >> --- a/Documentation/devicetree/bindings/net/macb.txt
> > >> +++ b/Documentation/devicetree/bindings/net/macb.txt
> > >> @@ -19,6 +19,9 @@ Required properties:
> > >>   Optional elements: 'tx_clk'
> > >>  - clocks: Phandles to input clocks.
> > >>
> > >> +Optional properties:
> > >> +- tsu-clk: Time stamp unit clock frequency used.
> > >
> > > Why are we not using the CCF and a clk_get_rate() in the driver?
> > >
> > 
> > If the clock source was only internal, we could use this
> > approach as usual. But TSU clock can be configured to
> > come from an external clock source or internal.
> 
> How about declaring a fixed-rate clk [1] if it comes from an external
> clk, and using a clk driver for the internal clk case?
> This way you'll be able to use the clk API (including the
> clk_get_rate() function) instead of introducing a new way to retrieve a
> clk frequency.

Right. Also, Zynq does already support external clock inputs (actually,
every clock originates from some external clock/oscillator at some
point). Maybe that code needs some additions to handle the TSU clock
too. But either way, I can't see why a clock cannot be modeled using the
CCF.

Sören

Re: [PATCH v2 2/5] seccomp: make underlying bpf ref counted as well

2015-09-14 Thread Tycho Andersen

Hi Daniel,

On Fri, Sep 11, 2015 at 08:28:19PM +0200, Daniel Borkmann wrote:
> I think due to the given insns restrictions on classic seccomp, this
> could work for "most cases" (see below) for the time being until pointer
> sanitation is resolved and that seccomp-only restriction from the dump
> could be removed,

Ok, thanks.

> BUT there's one more stone in the road which you still
> need to take care of with this whole 'giving classic seccomp-BPF -> eBPF
> transforms an fd, dumping and restoring that via bpf(2)' approach:
> 
> If you have JIT enabled on ARM32, and add a classic seccomp-BPF filter,
> and dump that via your bpf(2) interface based on the current patches, what
> you'll get is not eBPF opcodes but classic (!) BPF opcodes as ARM32 classic
> JIT supports compilation of seccomp, since commit 24e737c1ebac ("ARM: net:
> add JIT support for loads from struct seccomp_data.").
> 
> So in that case, bpf_prepare_filter() will not call into bpf_migrate_filter()
> as there's simply no need for it, because the classic code could already
> be JITed there. I guess other archs where JIT support for eBPF is not yet
> within near sight might sooner or later support this insn for their classic
> JITs, too ...

Thanks for pointing this out.

What if we legislate that the output of bpf(BPF_PROG_DUMP, ...) is
always eBPF? As near as I can tell there is no way to determine if a
struct bpf_prog is classic or eBPF, so we'd need to add a bit to
indicate whether or not the prog has been converted so that
BPF_PROG_DUMP knows when to convert it.

Tycho

[net-next PATCH] net: sched: document attach_default_qdiscs

2015-09-14 Thread Phil Sutter

The process of selecting an interface's default qdisc is not really
intuitive, not least because there are three different cases to consider.

Signed-off-by: Phil Sutter 
---
 net/sched/sch_generic.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index cb5d4ad..d7eaa51 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -741,6 +741,18 @@ static void attach_one_default_qdisc(struct net_device 
*dev,
dev_queue->qdisc_sleeping = qdisc;
 }
 
+/**
+ * attach_default_qdiscs - set qdiscs back to default
+ * @dev: network device to attach to
+ *
+ * This function attaches the (configurable) default qdisc to
+ * an interface.
+ * The actual default depends on the interface type: non-MQ, physical
+ * interfaces get whatever default_qdisc_ops has been set to. Multiqueue
+ * interfaces get mq, which uses default_qdisc_ops for its
+ * leaves. Finally, virtual interfaces unconditionally
+ * default to noqueue.
+ */
 static void attach_default_qdiscs(struct net_device *dev)
 {
struct netdev_queue *txq;
-- 
2.1.2
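
The three cases named in the kernel-doc comment above can be modelled in
user space. This is a minimal sketch, not the kernel code: struct fake_dev,
struct qdisc_choice and pick_default_qdisc() are hypothetical names, and
checking the virtual case first is an assumption drawn from the word
"unconditionally" in the comment.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few device properties that matter here. */
struct fake_dev {
	bool is_virtual;     /* models a virtual interface without a real TX queue */
	bool is_multiqueue;  /* models a multiqueue (MQ) interface */
};

/* What ends up attached: a root qdisc and, for mq, the per-queue leaf. */
struct qdisc_choice {
	const char *root;
	const char *leaf;    /* NULL when the root has no separate leaves */
};

/*
 * default_qdisc models whatever default_qdisc_ops has been set to.
 * Assumption: the virtual check takes precedence over the MQ check.
 */
static struct qdisc_choice pick_default_qdisc(const struct fake_dev *dev,
					      const char *default_qdisc)
{
	struct qdisc_choice c = { .root = default_qdisc, .leaf = NULL };

	if (dev->is_virtual) {
		c.root = "noqueue";        /* virtual: always noqueue */
	} else if (dev->is_multiqueue) {
		c.root = "mq";             /* MQ: mq root ... */
		c.leaf = default_qdisc;    /* ... with the default as leaves */
	}
	/* non-MQ physical: the configured default stays as root */
	return c;
}
```

With default_qdisc set to "fq_codel", a plain physical device would get
fq_codel as root, a multiqueue device mq with fq_codel leaves, and a
virtual device noqueue.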


Re: [PATCH 14/39] SUNRPC: drop null test before destroy functions

2015-09-14 Thread Trond Myklebust

On Mon, Sep 14, 2015 at 12:07 PM, J. Bruce Fields  wrote:
> ACK, but assuming Trond takes this one.--b.

No problem. I'll pick it up...

Cheers
  Trond

> On Sun, Sep 13, 2015 at 02:15:07PM +0200, Julia Lawall wrote:
>> Remove unneeded NULL test.
>>
>> The semantic patch that makes this change is as follows:
>> (http://coccinelle.lip6.fr/)
>>
>> // <smpl>
>> @@ expression x; @@
>> -if (x != NULL)
>>   \(kmem_cache_destroy\|mempool_destroy\|dma_pool_destroy\)(x);
>> // </smpl>
>>
>> Signed-off-by: Julia Lawall 
>>
>> ---
>>  net/sunrpc/sched.c |   12 
>>  1 file changed, 4 insertions(+), 8 deletions(-)
>>
>> diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
>> index b140c09..425ca2f 100644
>> --- a/net/sunrpc/sched.c
>> +++ b/net/sunrpc/sched.c
>> @@ -1092,14 +1092,10 @@ void
>>  rpc_destroy_mempool(void)
>>  {
>>   rpciod_stop();
>> - if (rpc_buffer_mempool)
>> - mempool_destroy(rpc_buffer_mempool);
>> - if (rpc_task_mempool)
>> - mempool_destroy(rpc_task_mempool);
>> - if (rpc_task_slabp)
>> - kmem_cache_destroy(rpc_task_slabp);
>> - if (rpc_buffer_slabp)
>> - kmem_cache_destroy(rpc_buffer_slabp);
>> + mempool_destroy(rpc_buffer_mempool);
>> + mempool_destroy(rpc_task_mempool);
>> + kmem_cache_destroy(rpc_task_slabp);
>> + kmem_cache_destroy(rpc_buffer_slabp);
>>   rpc_destroy_wait_queue(&delay_queue);
>>  }
>>
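
The quoted patch only makes sense if the destroy functions themselves
tolerate a NULL argument, in the same way free() does. A minimal user-space
sketch of that convention follows; struct toy_pool and the toy_pool_*
helpers are hypothetical and are not the kernel's mempool or kmem_cache
implementation.

```c
#include <stdlib.h>

/* Hypothetical pool object, standing in for a mempool or slab cache. */
struct toy_pool {
	void *storage;
};

/* Counts destroy calls that actually released a pool. */
static int toy_pool_destroy_calls;

static struct toy_pool *toy_pool_create(size_t size)
{
	struct toy_pool *p = malloc(sizeof(*p));

	if (!p)
		return NULL;
	p->storage = malloc(size);
	if (!p->storage) {
		free(p);
		return NULL;
	}
	return p;
}

static void toy_pool_destroy(struct toy_pool *p)
{
	if (!p)		/* NULL is accepted, so callers need no test of their own */
		return;
	free(p->storage);
	free(p);
	toy_pool_destroy_calls++;
}
```

With the check inside the destroy function, a teardown path can call it
for every pool unconditionally, including pools whose creation failed.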

[PATCH] atm: deal with setting entry before mkip was called

2015-09-14 Thread Sasha Levin

If we didn't call ATMARP_MKIP before ATMARP_ENCAP the VCC descriptor is
non-existent and we'll end up dereferencing a NULL ptr:

[1033173.491930] kasan: GPF could be caused by NULL-ptr deref or user memory 
accessirq event stamp: 123386
[1033173.493678] general protection fault:  [#1] PREEMPT SMP 
DEBUG_PAGEALLOC KASAN
[1033173.493689] Modules linked in:
[1033173.493697] CPU: 9 PID: 23815 Comm: trinity-c64 Not tainted 
4.2.0-next-20150911-sasha-00043-g353d875-dirty #2545
[1033173.493706] task: 8800630c4000 ti: 88006311 task.ti: 
88006311
[1033173.493823] RIP: clip_ioctl (net/atm/clip.c:320 net/atm/clip.c:689)
[1033173.493826] RSP: 0018:880063117a88  EFLAGS: 00010203
[1033173.493828] RAX: dc00 RBX:  RCX: 
000c
[1033173.493830] RDX: 0002 RSI: b3f10720 RDI: 
0014
[1033173.493832] RBP: 880063117b80 R08: 88047574d9a4 R09: 

[1033173.493834] R10:  R11:  R12: 
11000c622f53
[1033173.493836] R13: 8800cb905500 R14: 8808d6da2000 R15: 
fdfd
[1033173.493840] FS:  7fa56b92d700() GS:88047800() 
knlGS:
[1033173.493843] CS:  0010 DS:  ES:  CR0: 8005003b
[1033173.493845] CR2:  CR3: 630e8000 CR4: 
06a0
[1033173.493855] Stack:
[1033173.493862]  b0b60444 eaea 41b58ab3 
b3c3ce32
[1033173.493867]  b0b6f3e0 b0b60444 b5ea2e50 
11000c622f5e
[1033173.493873]  8800630c4cd8 000ee09a b3ec4888 
b5ea2de8
[1033173.493874] Call Trace:
[1033173.494108] do_vcc_ioctl (net/atm/ioctl.c:170)
[1033173.494113] vcc_ioctl (net/atm/ioctl.c:189)
[1033173.494116] svc_ioctl (net/atm/svc.c:605)
[1033173.494200] sock_do_ioctl (net/socket.c:874)
[1033173.494204] sock_ioctl (net/socket.c:958)
[1033173.494244] do_vfs_ioctl (fs/ioctl.c:43 fs/ioctl.c:607)
[1033173.494290] SyS_ioctl (fs/ioctl.c:622 fs/ioctl.c:613)
[1033173.494295] entry_SYSCALL_64_fastpath (arch/x86/entry/entry_64.S:186)
[1033173.494362] Code: fa 48 c1 ea 03 80 3c 02 00 0f 85 50 09 00 00 49 8b 9e 60 
06 00 00 48 b8 00 00 00 00 00 fc ff df 48 8d 7b 14 48 89 fa 48 c1 ea 03 <0f> b6 
04 02 48 89 fa 83 e2 07 38 d0 7f 08 84 c0 0f 85 14 09 00
All code

   0:   fa  cli
   1:   48 c1 ea 03 shr$0x3,%rdx
   5:   80 3c 02 00 cmpb   $0x0,(%rdx,%rax,1)
   9:   0f 85 50 09 00 00   jne0x95f
   f:   49 8b 9e 60 06 00 00mov0x660(%r14),%rbx
  16:   48 b8 00 00 00 00 00movabs $0xdc00,%rax
  1d:   fc ff df
  20:   48 8d 7b 14 lea0x14(%rbx),%rdi
  24:   48 89 famov%rdi,%rdx
  27:   48 c1 ea 03 shr$0x3,%rdx
  2b:*  0f b6 04 02 movzbl (%rdx,%rax,1),%eax   <-- 
trapping instruction
  2f:   48 89 famov%rdi,%rdx
  32:   83 e2 07and$0x7,%edx
  35:   38 d0   cmp%dl,%al
  37:   7f 08   jg 0x41
  39:   84 c0   test   %al,%al
  3b:   0f 85 14 09 00 00   jne0x955

Code starting with the faulting instruction
===
   0:   0f b6 04 02 movzbl (%rdx,%rax,1),%eax
   4:   48 89 famov%rdi,%rdx
   7:   83 e2 07and$0x7,%edx
   a:   38 d0   cmp%dl,%al
   c:   7f 08   jg 0x16
   e:   84 c0   test   %al,%al
  10:   0f 85 14 09 00 00   jne0x92a
[1033173.494366] RIP clip_ioctl (net/atm/clip.c:320 net/atm/clip.c:689)
[1033173.494368]  RSP 

Signed-off-by: Sasha Levin 
---
 net/atm/clip.c |3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/atm/clip.c b/net/atm/clip.c
index 17e55df..4407b2f 100644
--- a/net/atm/clip.c
+++ b/net/atm/clip.c
@@ -317,6 +317,9 @@ static int clip_constructor(struct neighbour *neigh)
 
 static int clip_encap(struct atm_vcc *vcc, int mode)
 {
+   if (!CLIP_VCC(vcc))
+   return -EBADF;
+
CLIP_VCC(vcc)->encap = mode;
return 0;
 }
-- 
1.7.10.4


Re: [PATCH] atm: deal with setting entry before mkip was called

2015-09-14 Thread Eric Dumazet

On Mon, 2015-09-14 at 11:48 -0400, Sasha Levin wrote:

> 
> diff --git a/net/atm/clip.c b/net/atm/clip.c
> index 17e55df..4407b2f 100644
> --- a/net/atm/clip.c
> +++ b/net/atm/clip.c
> @@ -317,6 +317,9 @@ static int clip_constructor(struct neighbour *neigh)
>  
>  static int clip_encap(struct atm_vcc *vcc, int mode)
>  {
> + if (!CLIP_VCC(vcc))
> + return -EBADF;
> +
>   CLIP_VCC(vcc)->encap = mode;
>   return 0;
>  }


-EBADF has a very precise meaning : /* Bad file number */

In this case, the file number is correct (and maps to a proper file),
but driver state is not allowing for this particular operation.





Re: [PATCH] atm: deal with setting entry before mkip was called

2015-09-14 Thread Sasha Levin

On 09/14/2015 01:07 PM, Eric Dumazet wrote:
> On Mon, 2015-09-14 at 13:00 -0400, Sasha Levin wrote:
> 
>> I've tried to be consistent with a similar check within clip_mkip() and
>> clip_setentry():
>>
>> if (!vcc->push)
>> return -EBADFD;
>>
>> So calling clip_setentry() before clip_mkip() would also give you -EBADFD.
>>
> 
> Okay, but -EBADF is not the same as -EBADFD

Doh. Sorry about that.


Thanks,
Sasha


Re: DSA: phy polling

2015-09-14 Thread Florian Fainelli

On 14/09/15 03:42, Russell King - ARM Linux wrote:
> Andrew,
> 
> I think you're the current maintainer of the Marvell DSA code, as being
> the most recent author of changes to it. :)
> 
> I've noticed in my testing that the Marvell DSA code seems to poll the
> internal phy link status in mv88e6xxx_poll_link(), and set the network
> device carrier status according to the results.
> 
> However, the internal phys are created using phylib, which also polls
> the phys for their link status, and controls the associated netdev
> carrier status.
> 
> The side effect of this is that I see duplicated link status messages in
> the kernel log when connecting or disconnecting cables from the switch,
> caused by the code in mv88e6xxx_poll_link() racing with the phylib code.
> From what I can see, the code in mv88e6xxx_poll_link() is entirely
> redundant as the phylib layer will take care of any phy attached to the
> switch.
> 
> To prove this, I have the following code in my tree, which disables the
> polling on a port where we have a phy attached (either an internal or
> external phy).  The result is that the per-port network devices are still
> updated with the link status even though this code is disabled - thanks
> to the phylib polling.
> 
> I'm left wondering whether the DSA specific phy polling does anything
> useful, or whether the entire polling code both in mv88e6xxx.c and
> net/dsa can be removed (mv88e6xxx.c seems to be its only user.)

Just my 2 cents here: I suspect the original intention behind this code
was to make use of the switch's built-in PHY polling unit when
available, and have the HW collect the state of all PHYs with fewer
register reads, instead of doing individual (and quite possibly
expensive) MDIO reads to each per-port PHY (at least two reads per PHY
to latch MII_BMSR).
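
The "two reads per PHY" remark refers to the link status bit in MII_BMSR
being latched low: after a link drop the bit reads as 0 until the register
has been read once. A minimal user-space model of that behaviour follows.
struct fake_phy and the fake_* helpers are hypothetical; only the
BMSR_LSTATUS bit value is taken from the standard MII register layout.

```c
#include <stdbool.h>
#include <stdint.h>

#define BMSR_LSTATUS 0x0004	/* link status bit in MII_BMSR */

/* Hypothetical PHY model; a real driver would go through MDIO instead. */
struct fake_phy {
	bool link_up;            /* current physical link state */
	bool link_fail_latched;  /* a link drop happened since the last read */
	unsigned int reads;      /* number of simulated MDIO reads */
};

/* Simulated BMSR read: reports a latched failure once, then clears it. */
static uint16_t fake_bmsr_read(struct fake_phy *phy)
{
	uint16_t val = 0;

	if (phy->link_up && !phy->link_fail_latched)
		val = BMSR_LSTATUS;
	phy->link_fail_latched = false;	/* reading clears the latch */
	phy->reads++;
	return val;
}

/* Read twice: the first read flushes a stale link-down indication. */
static bool fake_phy_link_is_up(struct fake_phy *phy)
{
	(void)fake_bmsr_read(phy);
	return fake_bmsr_read(phy) & BMSR_LSTATUS;
}
```

In this model every link query costs two reads per PHY, which is the
per-port cost a hardware polling unit could help avoid.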

Now, I do agree there is a duplication of functionality here, and a
potential fix would be to avoid starting the PHY state machine if/when
the switch supports such a feature (i.e. not call phy_start*). That
should still get you consistent link partner advertised/status values.
The question is: does that really benefit anybody, though?

> 
> diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
> index 26ec2fbfaa89..4c324eafeef2 100644
> --- a/drivers/net/dsa/mv88e6xxx.c
> +++ b/drivers/net/dsa/mv88e6xxx.c
> @@ -400,6 +400,13 @@ void mv88e6xxx_poll_link(struct dsa_switch *ds)
>   if (dev == NULL)
>   continue;
>  
> + /*
> +  * Ignore ports which have a phy; phylib will take care
> +  * of polling the link status for these.
> +  */
> + if (dsa_slave_has_phy(dev))
> + continue;
> +
>   link = 0;
>   if (dev->flags & IFF_UP) {
>   port_status = mv88e6xxx_reg_read(ds, REG_PORT(i),
> diff --git a/include/net/dsa.h b/include/net/dsa.h
> index fbca63ba8f73..b31e9da43ea7 100644
> --- a/include/net/dsa.h
> +++ b/include/net/dsa.h
> @@ -176,6 +176,8 @@ static inline bool dsa_is_port_initialized(struct 
> dsa_switch *ds, int p)
>   return ds->phys_port_mask & (1 << p) && ds->ports[p];
>  }
>  
> +extern bool dsa_slave_has_phy(struct net_device *);
> +
>  static inline u8 dsa_upstream_port(struct dsa_switch *ds)
>  {
>   struct dsa_switch_tree *dst = ds->dst;
> diff --git a/net/dsa/slave.c b/net/dsa/slave.c
> index 35c47ddd04f0..a107242816ff 100644
> --- a/net/dsa/slave.c
> +++ b/net/dsa/slave.c
> @@ -10,6 +10,7 @@
>  
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -873,6 +874,14 @@ int dsa_slave_resume(struct net_device *slave_dev)
>   return 0;
>  }
>  
> +bool dsa_slave_has_phy(struct net_device *slave_dev)
> +{
> + struct dsa_slave_priv *p = netdev_priv(slave_dev);
> +
> + return p->phy != NULL;
> +}
> +EXPORT_SYMBOL_GPL(dsa_slave_has_phy);
> +
>  int dsa_slave_create(struct dsa_switch *ds, struct device *parent,
>int port, char *name)
>  {
> 
> 


-- 
Florian

Re: [PATCH v2 2/5] seccomp: make underlying bpf ref counted as well

2015-09-14 Thread Tycho Andersen

On Mon, Sep 14, 2015 at 06:48:43PM +0200, Daniel Borkmann wrote:
> On 09/14/2015 06:00 PM, Tycho Andersen wrote:
> >On Fri, Sep 11, 2015 at 08:28:19PM +0200, Daniel Borkmann wrote:
> >>I think due to the given insns restrictions on classic seccomp, this
> >>could work for "most cases" (see below) for the time being until pointer
> >>sanitation is resolved and that seccomp-only restriction from the dump
> >>could be removed,
> >
> >Ok, thanks.
> >
> >>BUT there's one more stone in the road which you still
> >>need to take care of with this whole 'giving classic seccomp-BPF -> eBPF
> >>transforms an fd, dumping and restoring that via bpf(2)' approach:
> >>
> >>If you have JIT enabled on ARM32, and add a classic seccomp-BPF filter,
> >>and dump that via your bpf(2) interface based on the current patches, what
> >>you'll get is not eBPF opcodes but classic (!) BPF opcodes as ARM32 classic
> >>JIT supports compilation of seccomp, since commit 24e737c1ebac ("ARM: net:
> >>add JIT support for loads from struct seccomp_data.").
> >>
> >>So in that case, bpf_prepare_filter() will not call into 
> >>bpf_migrate_filter()
> >>as there's simply no need for it, because the classic code could already
> >>be JITed there. I guess other archs where JIT support for eBPF is not yet
> >>within near sight might sooner or later support this insn for their classic
> >>JITs, too ...
> >
> >Thanks for pointing this out.
> >
> >What if we legislate that the output of bpf(BPF_PROG_DUMP, ...) is
> >always eBPF? As near as I can tell there is no way to determine if a
> >struct bpf_prog is classic or eBPF, so we'd need to add a bit to
> >indicate whether or not the prog has been converted so that
> >BPF_PROG_DUMP knows when to convert it.
> 
> As I said, you have bpf_prog_was_classic() function to determine exactly
> this (so without your type re-assignment you have a way to distinguish it).

I don't think this is the same thing, though. IIUC, when the classic
jit succeeds, bpf_prog_was_classic() will still return true even
though prog->insnsi points to classic instructions instead of eBPF
ones, and (I think) this situation is impossible to distinguish.
Anyway, it sounds like this doesn't matter, as we have...

> Wouldn't it be much easier to rip this set apart into multiple ones, solving
> one individual thing at a time, f.e. starting out simple and 1) only add
> native eBPF support to seccomp, after that 2) add a method to dump native-only
> eBPF programs for criu, then 3) think about a right interface for classic
> BPF seccomp dumping, etc, etc? Currently, it tries to solve everything at
> once, and with some early assumptions that have non-trivial side-effects.

The primary motivation for this set is your bullet 3, c/r of programs
with classic bpf programs (i.e. what seccomp supports now). Initially,
I thought it was best to try and dump the eBPFs directly, but it seems
there are a lot of complications I wasn't aware of. Perhaps I'll look
at a bpf_prog_store_orig_filter() style approach.

Thanks,

Tycho

Re: [PATCH] pptp: avoid releasing the sock too early

2015-09-14 Thread Eric Dumazet

On Mon, 2015-09-14 at 11:40 -0400, Sasha Levin wrote:
> Since we're using RCU we can't free the sock structure before RCU lets us,
> otherwise we're risking getting use-after-frees accessing it:

> 
> Signed-off-by: Sasha Levin 
> ---
>  drivers/net/ppp/pptp.c   |9 -
>  include/linux/if_pppox.h |1 +
>  2 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ppp/pptp.c b/drivers/net/ppp/pptp.c
> index 686f37d..cb7a029 100644
> --- a/drivers/net/ppp/pptp.c
> +++ b/drivers/net/ppp/pptp.c
> @@ -517,6 +517,13 @@ static int pptp_getname(struct socket *sock, struct 
> sockaddr *uaddr,
>   return 0;
>  }
>  
> +static void pptp_release_cb(struct rcu_head *rcu)
> +{
> + struct pppox_sock *p = container_of(rcu, struct pppox_sock, rcu);
> +
> + sock_put(sk_pppox(p));
> +}
> +
>  static int pptp_release(struct socket *sock)
>  {
>   struct sock *sk = sock->sk;
> @@ -545,7 +552,7 @@ static int pptp_release(struct socket *sock)
>   sock->sk = NULL;
>  
>   release_sock(sk);
> - sock_put(sk);
> + call_rcu(&po->rcu, pptp_release_cb);
>  
>   return error;
>  }
> diff --git a/include/linux/if_pppox.h b/include/linux/if_pppox.h
> index b49cf92..ba9c378 100644
> --- a/include/linux/if_pppox.h
> +++ b/include/linux/if_pppox.h
> @@ -55,6 +55,7 @@ struct pppox_sock {
>   struct pptp_opt  pptp;
>   } proto;
>   __be16  num;
> + struct rcu_head rcu;
>  };
>  #define pppoe_devproto.pppoe.dev
>  #define pppoe_ifindexproto.pppoe.ifindex

Hmm, is the synchronize_rcu() in del_chan() still needed, and why was it
not enough?

I believe your patch might reduce the race window, but it is not clear
it is the right fix.
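
For readers unfamiliar with the pattern in the quoted patch, here is a
user-space sketch of deferring the final reference drop to a callback that
recovers the outer object from an embedded head. It says nothing about
whether the patch is the right fix. All toy_* names and struct cb_head are
hypothetical, container_of is reimplemented with offsetof, and
toy_defer()/toy_run_deferred() merely stand in for call_rcu() and the end
of a grace period.

```c
#include <stddef.h>

/* User-space reimplementation of the kernel's container_of() idea. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical callback head, standing in for struct rcu_head. */
struct cb_head {
	void (*func)(struct cb_head *head);
};

struct toy_sock {
	int refcnt;
	struct cb_head rcu;	/* embedded, like the member the patch adds */
};

static void toy_sock_put(struct toy_sock *sk)
{
	sk->refcnt--;
}

/* Deferred callback: recover the socket from the embedded head. */
static void toy_release_cb(struct cb_head *head)
{
	struct toy_sock *sk = container_of(head, struct toy_sock, rcu);

	toy_sock_put(sk);
}

/* Stand-in for call_rcu(): only records the callback, runs nothing yet. */
static void toy_defer(struct cb_head *head, void (*func)(struct cb_head *))
{
	head->func = func;
}

/* Stand-in for the end of a grace period: run the recorded callback. */
static void toy_run_deferred(struct cb_head *head)
{
	head->func(head);
}
```

The reference is still held after toy_defer() and only dropped once
toy_run_deferred() fires, which is the ordering the patch is after.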




Re: [PATCH] atm: deal with setting entry before mkip was called

2015-09-14 Thread Sasha Levin

On 09/14/2015 12:50 PM, Eric Dumazet wrote:
> On Mon, 2015-09-14 at 11:48 -0400, Sasha Levin wrote:
> 
>>
>> diff --git a/net/atm/clip.c b/net/atm/clip.c
>> index 17e55df..4407b2f 100644
>> --- a/net/atm/clip.c
>> +++ b/net/atm/clip.c
>> @@ -317,6 +317,9 @@ static int clip_constructor(struct neighbour *neigh)
>>  
>>  static int clip_encap(struct atm_vcc *vcc, int mode)
>>  {
>> +if (!CLIP_VCC(vcc))
>> +return -EBADF;
>> +
>>  CLIP_VCC(vcc)->encap = mode;
>>  return 0;
>>  }
> 
> 
> -EBADF has a very precise meaning : /* Bad file number */
> 
> In this case, the file number is correct (and maps to a proper file),
> but driver state is not allowing for this particular operation.

I've tried to be consistent with a similar check within clip_mkip() and
clip_setentry():

if (!vcc->push)
return -EBADFD;

So calling clip_setentry() before clip_mkip() would also give you -EBADFD.


Thanks,
Sasha

Re: [PATCH v2 2/5] seccomp: make underlying bpf ref counted as well

2015-09-14 Thread Daniel Borkmann


On 09/14/2015 06:00 PM, Tycho Andersen wrote:

On Fri, Sep 11, 2015 at 08:28:19PM +0200, Daniel Borkmann wrote:

I think due to the given insns restrictions on classic seccomp, this
could work for "most cases" (see below) for the time being until pointer
sanitation is resolved and that seccomp-only restriction from the dump
could be removed,


Ok, thanks.


BUT there's one more stone in the road which you still
need to take care of with this whole 'giving classic seccomp-BPF -> eBPF
transforms an fd, dumping and restoring that via bpf(2)' approach:

If you have JIT enabled on ARM32, and add a classic seccomp-BPF filter,
and dump that via your bpf(2) interface based on the current patches, what
you'll get is not eBPF opcodes but classic (!) BPF opcodes as ARM32 classic
JIT supports compilation of seccomp, since commit 24e737c1ebac ("ARM: net:
add JIT support for loads from struct seccomp_data.").

So in that case, bpf_prepare_filter() will not call into bpf_migrate_filter()
as there's simply no need for it, because the classic code could already
be JITed there. I guess other archs where JIT support for eBPF is not yet
within near sight might sooner or later support this insn for their classic
JITs, too ...


Thanks for pointing this out.

What if we legislate that the output of bpf(BPF_PROG_DUMP, ...) is
always eBPF? As near as I can tell there is no way to determine if a
struct bpf_prog is classic or eBPF, so we'd need to add a bit to
indicate whether or not the prog has been converted so that
BPF_PROG_DUMP knows when to convert it.


As I said, you have bpf_prog_was_classic() function to determine exactly
this (so without your type re-assignment you have a way to distinguish it).

Wouldn't it be much easier to rip this set apart into multiple ones, solving
one individual thing at a time, f.e. starting out simple and 1) only add
native eBPF support to seccomp, after that 2) add a method to dump native-only
eBPF programs for criu, then 3) think about a right interface for classic
BPF seccomp dumping, etc, etc? Currently, it tries to solve everything at
once, and with some early assumptions that have non-trivial side-effects.

Thanks,
Daniel

Re: [PATCH] atm: deal with setting entry before mkip was called

2015-09-14 Thread Eric Dumazet

On Mon, 2015-09-14 at 13:00 -0400, Sasha Levin wrote:

> I've tried to be consistent with a similar check within clip_mkip() and
> clip_setentry():
> 
> if (!vcc->push)
> return -EBADFD;
> 
> So calling clip_setentry() before clip_mkip() would also give you -EBADFD.
> 

Okay, but -EBADF is not the same as -EBADFD
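
The two names differ by one letter but are distinct error codes on Linux:
EBADF is "bad file number", EBADFD is "file descriptor in bad state". A
minimal user-space sketch of the guard under discussion follows, assuming
the check ends up returning -EBADFD to match the existing !vcc->push
checks. struct toy_vcc, struct toy_clip_vcc and toy_clip_encap() are
hypothetical stand-ins for the kernel structures, and EBADFD is
Linux-specific.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-VCC CLIP state, created by the MKIP step. */
struct toy_clip_vcc {
	int encap;
};

/* Hypothetical VCC: clip stays NULL until MKIP has been called. */
struct toy_vcc {
	struct toy_clip_vcc *clip;
};

static int toy_clip_encap(struct toy_vcc *vcc, int mode)
{
	/*
	 * The descriptor itself is valid; only the driver state is not
	 * ready for this operation, hence EBADFD rather than EBADF.
	 */
	if (!vcc->clip)
		return -EBADFD;

	vcc->clip->encap = mode;
	return 0;
}
```

Calling toy_clip_encap() before the CLIP state exists fails cleanly
instead of dereferencing a NULL pointer.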




Re: [RFC PATCH 3/3] devicetree: macb: Add optional property tsu-clk

2015-09-14 Thread Harini Katakam

Hi Soren/Boris,

On Mon, Sep 14, 2015 at 8:14 PM, Sören Brinkmann
 wrote:
> On Mon, 2015-09-14 at 09:58AM +0200, Boris Brezillon wrote:
>> Hi Harini,
>>
>> On Mon, 14 Sep 2015 09:39:05 +0530
>> Harini Katakam  wrote:
>>
>> > On Fri, Sep 11, 2015 at 10:22 PM, Sören Brinkmann
>> >  wrote:
>> > > Hi Harini,
>> > >
>> > > On Fri, 2015-09-11 at 01:27PM +0530, Harini Katakam wrote:
>> > >> Add TSU clock frequency to be used for 1588 support in macb driver.
>> > >>
>> > >> Signed-off-by: Harini Katakam 
>> > >> ---
>> > >>  Documentation/devicetree/bindings/net/macb.txt |3 +++
>> > >>  1 file changed, 3 insertions(+)
>> > >>
>> > >> diff --git a/Documentation/devicetree/bindings/net/macb.txt 
>> > >> b/Documentation/devicetree/bindings/net/macb.txt
>> > >> index b5d7976..f7c0ea8 100644
>> > >> --- a/Documentation/devicetree/bindings/net/macb.txt
>> > >> +++ b/Documentation/devicetree/bindings/net/macb.txt
>> > >> @@ -19,6 +19,9 @@ Required properties:
>> > >>   Optional elements: 'tx_clk'
>> > >>  - clocks: Phandles to input clocks.
>> > >>
>> > >> +Optional properties:
>> > >> +- tsu-clk: Time stamp unit clock frequency used.
>> > >
>> > > Why are we not using the CCF and a clk_get_rate() in the driver?
>> > >
>> >
>> > If the clock source was only internal, we could use this
>> > approach as usual. But TSU clock can be configured to
>> > come from an external clock source or internal.
>>
>> How about declaring a fixed-rate clk [1] if it comes from an external
>> clk, and using a clk driver for the internal clk case?
>> This way you'll be able to use the clk API (including the
>> clk_get_rate() function) instead of introducing a new way to retrieve a
>> clk frequency.
>
> Right. Also, Zynq does already support external clock inputs (actually,
> every clock originates from some external clock/oscillator at some
> point). Maybe that code needs some additions to handle the TSU clock
> too. But either way, I can't see why a clock cannot be modeled using the
> CCF.

Ok, I'm going to try this.

Regards,
Harini

Re: [fw filter]: Broken! fw mark based tc class selection not working

2015-09-14 Thread Cong Wang

On Mon, Sep 14, 2015 at 5:28 AM, Jamal Hadi Salim  wrote:
> On 09/11/15 20:00, Cong Wang wrote:
>>
>> On Fri, Sep 11, 2015 at 3:24 PM, Akshat Kakkar 
>> wrote:
>
>
>> Hmm, I didn't know that before either. Looks like my tp->init change
>> breaks it.
>>
>> Could you try the following patch?
>>
>
> I would just make init() empty for this classifier (return 0?).
> If someone wants to add classids ids, change() is available.
> The most common (efficient) use case is what Akshat shows.
> So even the check in the classify should optimize for that i.e
> if (head == NULL)
> do old method
> else
> ...

That is exactly the original code. But it is not readable at all,
at least I still missed it when I touched the tp->init() part. :(
Having a boolean doesn't harm anything.

Re: [net-next PATCH] net: sched: document attach_default_qdiscs

2015-09-14 Thread Phil Sutter

On Mon, Sep 14, 2015 at 03:07:42PM -0700, Cong Wang wrote:
> On Mon, Sep 14, 2015 at 8:31 AM, Phil Sutter  wrote:
> > The process of selecting an interface's default qdisc is not really
> > intuitive, at least because there are three different cases to consider.
> 
> It is a static function, not an API, so I don't think it is the right
> place to document.

So static functions should never be documented? I'm playing devil's
advocate, but still:

> Maybe update default_qdisc description in Documentation/sysctl/net.txt?

I don't think this is the right place for source code documentation. The
intended audience are users, and I wouldn't expect a developer to search
in there. On the other hand, that description would indeed benefit from
a review: apart from omitting noqueue, it does not mention leaf qdiscs either.
I'll fix this.

Thanks, Phil

Re: [PATCH net] openvswitch: Fix IPv6 exthdr handling with ct helpers.

2015-09-14 Thread Pravin Shelar

On Mon, Sep 14, 2015 at 11:14 AM, Joe Stringer  wrote:
> Static code analysis reveals the following bug:
>
> net/openvswitch/conntrack.c:281 ovs_ct_helper()
> warn: unsigned 'protoff' is never less than zero.
>
> This signedness bug breaks error handling for IPv6 extension headers when
> using conntrack helpers. Fix the error by using a local signed variable.
>
> Fixes:  cae3a2627520: "openvswitch: Allow attaching helpers to ct
> action"
> Reported-by: Dan Carpenter 
> Signed-off-by: Joe Stringer 

Acked-by: Pravin B Shelar 
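
The class of bug described in the quoted commit message is a negative
error code being stored in an unsigned variable, so that a later "< 0"
test can never be true. The following is a minimal sketch of the fix
pattern it describes (a local signed variable), not the actual openvswitch
code: toy_skip_exthdr(), toy_ct_helper() and the offset 48 are
hypothetical.

```c
#include <errno.h>

/* Hypothetical parser: returns a payload offset, or a negative errno. */
static int toy_skip_exthdr(int malformed)
{
	return malformed ? -EINVAL : 48;
}

/*
 * Buggy shape, shown only as a comment:
 *
 *	unsigned int protoff = toy_skip_exthdr(malformed);
 *	if (protoff < 0)	// never true for an unsigned value
 *		return -EINVAL;
 *
 * Fixed shape: keep the result in a signed local, test it, and only
 * then store it in the unsigned offset.
 */
static int toy_ct_helper(int malformed, unsigned int *protoff)
{
	int ofs = toy_skip_exthdr(malformed);

	if (ofs < 0)
		return ofs;	/* the error path is reachable again */

	*protoff = (unsigned int)ofs;
	return 0;
}
```

The unsigned offset is written only after the signed result has passed
the error check, so a failure leaves the caller's value untouched.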

severe regression in alx ethernet driver

2015-09-14 Thread Ldap Tester

There is a serious regression in the alx ethernet driver.  The driver
stopped working after upgrading the kernel from 4.0.x to 4.1.x.
Please see https://bugzilla.redhat.com/show_bug.cgi?id=1251434 and
https://bugzilla.kernel.org/show_bug.cgi?id=70761 This regression is
urgent, as I cannot update my kernel to include the latest security
fixes.

[PATCH net-next] xen-netfront: always set num queues if possible

2015-09-14 Thread Charles (Chas) Williams

The xen store preserves this information across module invocations.
If you insmod netfront with two queues and later insmod again with one
queue, the backend will still believe you asked for two queues.

Signed-off-by: Chas Williams <3ch...@gmail.com>
---
 drivers/net/xen-netfront.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index f821a97..b53a681 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -1819,11 +1819,7 @@ again:
goto destroy_ring;
}
 
-   if (num_queues == 1) {
-   err = write_queue_xenstore_keys(&info->queues[0], &xbt, 0); /* flat */
-   if (err)
-   goto abort_transaction_no_dev_fatal;
-   } else {
+   if (xenbus_exists(xbt, dev->nodename, "multi-queue-num-queues")) {
/* Write the number of queues */
err = xenbus_printf(xbt, dev->nodename, 
"multi-queue-num-queues",
"%u", num_queues);
@@ -1831,7 +1827,13 @@ again:
message = "writing multi-queue-num-queues";
goto abort_transaction_no_dev_fatal;
}
+   }
 
+   if (num_queues == 1) {
+   err = write_queue_xenstore_keys(&info->queues[0], &xbt, 0); /* flat */
+   if (err)
+   goto abort_transaction_no_dev_fatal;
+   } else {
/* Write the keys for each queue */
for (i = 0; i < num_queues; ++i) {
queue = &info->queues[i];
-- 
2.1.0




Re: [PATCH v2] net: mvneta: fix refilling for Rx DMA buffers

2015-09-14 Thread Simon Guinot

Hi Oren,

On Mon, Sep 14, 2015 at 01:22:12PM -0700, Oren Laskin wrote:
> I had to undo this change on my Armada 370 based board.  It was causing
> corrupt data to make it through on large downloads.  I'm using wget to get
> the same 30MB file many times and the SHA would occasionally be different.

During your tests, can you see some "Linux processing - Can't refill"
messages along with the data corruptions ?

> I tracked it down to this commit.  In it, I would find on the order of a
> few hundred bytes to simply be wrong data.

I am a little bit surprised here. For me, this patch is very simple and
does the exact opposite. It does fix kernel crashes and data corruptions
in case of refilling errors. This can happen for example if you run
large data transfers with jumbo frames enabled...

But anyway, I'll try to reproduce the issue tomorrow. I only have to
wget the same file (size 30MB) in a loop and to check its md5sum ?
That's it ? And how long should I wait for the error ?

Thanks,

Simon

> 
> Thanks,
> 
> Oren
> 
> On Tue, Jul 21, 2015 at 12:30 AM, David Miller  wrote:
> 
> > From: Simon Guinot 
> > Date: Sun, 19 Jul 2015 13:00:53 +0200
> >
> > > With the actual code, if a memory allocation error happens while
> > > refilling a Rx descriptor, then the original Rx buffer is both passed
> > > to the networking stack (in a SKB) and let in the Rx ring. This leads
> > > to various kernel oops and crashes.
> > >
> > > As a fix, this patch moves Rx descriptor refilling ahead of building
> > > SKB with the associated Rx buffer. In case of a memory allocation
> > > failure, data is dropped and the original DMA buffer is put back into
> > > the Rx ring.
> > >
> > > Signed-off-by: Simon Guinot 
> > > Fixes: c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP
> > network unit")
> > > Cc:  # v3.8+
> > > Tested-by: Yoann Sculo 
> >
> > Applied, thanks.
> >
> > ___
> > linux-arm-kernel mailing list
> > linux-arm-ker...@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> >



Re: [net-next PATCH] net: sched: document attach_default_qdiscs

2015-09-14 Thread Cong Wang

On Mon, Sep 14, 2015 at 8:31 AM, Phil Sutter  wrote:
> The process of selecting an interface's default qdisc is not really
> intuitive, at least because there are three different cases to consider.

It is a static function, not an API, so I don't think it is the right
place to document.

Maybe update default_qdisc description in Documentation/sysctl/net.txt?

Re: [PATCH net 2/2] 8139cp: reset BQL when ring tx ring cleared

2015-09-14 Thread Francois Romieu

David Woodhouse  :
[...]
> Did you ever work this out ?

Not specifically.

> I'm seeing something similar on the inward-facing interface on my home
> router under high load — and it doesn't automatically recover.
[...]
> [308309.457239] Pid: 0, comm: swapper Not tainted 3.7.1 #1

It's unrelated but you may have already noticed 
7fe0ee099ad5e3dea88d4ee1b6f20246b1ca57c3 ("net: 8139cp: fix a BUG_ON
triggered by wrong bytes_compl").

[...]
> [308309.574551] 8139cp :00:0b.0 eth1: Transmit timeout, status  c   2b
> 0 80ff

Rx and Tx are enabled.

Instant (untested) hack below.

diff --git a/drivers/net/ethernet/realtek/8139cp.c 
b/drivers/net/ethernet/realtek/8139cp.c
index d79e33b..09ee6fd 100644
--- a/drivers/net/ethernet/realtek/8139cp.c
+++ b/drivers/net/ethernet/realtek/8139cp.c
@@ -129,6 +129,9 @@ MODULE_PARM_DESC (multicast_filter_limit, "8139cp: maximum 
number of filtered mu
 /* Time in jiffies before concluding the transmitter is hung. */
 #define TX_TIMEOUT (6*HZ)
 
+/* TODO: calibrate. It ought to be related to the PCI bus frequency. */
+#define CP_EARLY_TIMEOUT   (8 * 1024)
+
 /* hardware minimum and maximum for a single frame's data payload */
 #define CP_MIN_MTU 60  /* TODO: allow lower, but pad */
 #define CP_MAX_MTU 4096
@@ -146,9 +149,11 @@ enum {
TxConfig= 0x40, /* Tx configuration */
ChipVersion = 0x43, /* 8-bit chip version, inside TxConfig */
RxConfig= 0x44, /* Rx configuration */
+   TimerCount  = 0x48, /* 32 bit general purpose timer. */
RxMissed= 0x4C, /* 24 bits valid, write clears */
Cfg9346 = 0x50, /* EEPROM select/control; Cfg reg [un]lock */
Config1 = 0x52, /* Config1 */
+   TimerInt= 0x54, /* TimerCount IRQ triggering timeout value */
Config3 = 0x59, /* Config3 */
Config4 = 0x5A, /* Config4 */
MultiIntr   = 0x5C, /* Multiple interrupt select */
@@ -283,7 +288,8 @@ enum {
LANWake = (1 << 1),  /* Enable LANWake signal */
PMEStatus   = (1 << 0),  /* PME status can be reset by PCI RST# */
 
-   cp_norx_intr_mask = PciErr | LinkChg | TxOK | TxErr | TxEmpty,
+   cp_norx_intr_mask = PciErr | TimerIntr | LinkChg |
+   TxOK | TxErr | TxEmpty,
cp_rx_intr_mask = RxOK | RxErr | RxEmpty | RxFIFOOvr,
cp_intr_mask = cp_rx_intr_mask | cp_norx_intr_mask,
 };
@@ -608,6 +614,15 @@ static irqreturn_t cp_interrupt (int irq, void 
*dev_instance)
 
if (status & (TxOK | TxErr | TxEmpty | SWInt))
cp_tx(cp);
+
+   if ((status & TimerIntr) && (cp->tx_head != cp->tx_tail)) {
+   if (net_ratelimit()) {
+   netdev_info(dev, "Timeout head=%08x, tail=%08x\n",
+   cp->tx_head, cp->tx_tail);
+   }
+   cp_tx(cp);
+   }
+
if (status & LinkChg)
mii_check_media(&cp->mii_if, netif_msg_link(cp), false);
 
@@ -885,6 +900,8 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
 out_unlock:
spin_unlock_irqrestore(&cp->lock, intr_flags);
 
+   cpw32(TimerCount, CP_EARLY_TIMEOUT);
+
cpw8(TxPoll, NormalTxPoll);
 
return NETDEV_TX_OK;
@@ -1064,6 +1081,8 @@ static void cp_init_hw (struct cp_private *cp)
 
cpw16(MultiIntr, 0);
 
+   cpw32(TimerInt, CP_EARLY_TIMEOUT);
+
cpw8_f(Cfg9346, Cfg9346_Lock);
 }
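
The interrupt-side decision in the hack above can be sketched in isolation. This is a minimal userspace sketch, not driver code: the struct, the helper names and the bit position chosen for the timer interrupt are assumptions made for illustration. It only shows the condition the patch adds, namely that a timer interrupt triggers a Tx reclaim only while descriptors are still outstanding.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position for the timer interrupt, for illustration only. */
#define TIMER_INTR (1u << 14)

/* Hypothetical stand-in for the Tx ring indices kept in the private data. */
struct tx_ring {
	unsigned int head;	/* next descriptor to be queued */
	unsigned int tail;	/* next descriptor to be reclaimed */
};

/* Descriptors are outstanding whenever head and tail differ. */
static bool ring_has_pending(const struct tx_ring *r)
{
	return r->head != r->tail;
}

/*
 * Mirrors the added test in the interrupt handler: reclaim on a timer
 * interrupt only when something is still waiting in the Tx ring.
 */
static bool timer_should_reclaim(uint16_t status, const struct tx_ring *r)
{
	return (status & TIMER_INTR) && ring_has_pending(r);
}
```

With an empty ring the timer interrupt is ignored, so an idle interface does not generate log messages or spurious reclaim work.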
 

Re: [PATCH] Remove #ifdef CONFIG_64BIT from all asm-generic/fcntl.h

2015-09-14 Thread Palmer Dabbelt

On Thu, 10 Sep 2015 04:18:05 PDT (-0700), dhowe...@redhat.com wrote:
> David Howells  wrote:
>
>> Rather than iterating through all the rest of your patches and saying the 
>> same
>> thing, if there's something in a UAPI header that needs wrapping in 
>> __KERNEL__
>> to exclude it from userspace's use, then it should be transferred to the
>> non-UAPI variant of that header (which should #include the UAPI variant).
>
> I should mention that there is the odd case where this is difficult to
> achieve.  See include/uapi/linux/acct.h for an example...

OK, sorry about that.  I'm submitting a v3 that should fix these
problems.

[PATCH v7 1/3] can: Allwinner A10/A20 CAN Controller support - Devicetree bindings

2015-09-14 Thread Gerhard Bertelsmann

Devicetree bindings for Allwinner A10/A20 CAN

Signed-off-by: Gerhard Bertelsmann 
---

 .../devicetree/bindings/net/can/sun4i_can.txt  |  38 +
 1 file changed, 38 insertions(+)

diff --git a/Documentation/devicetree/bindings/net/can/sun4i_can.txt 
b/Documentation/devicetree/bindings/net/can/sun4i_can.txt
new file mode 100644
index 000..5819043
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/can/sun4i_can.txt
@@ -0,0 +1,38 @@
+Allwinner A10/A20 CAN controller Device Tree Bindings
+-
+
+Required properties:
+- compatible: "allwinner,sun4ican"
+- reg: physical base address and size of the Allwinner A10/A20 CAN register 
map.
+- interrupts: interrupt specifier for the sole interrupt.
+- clock: phandle and clock specifier.
+
+
+Example
+---
+
+SoC common .dtsi file:
+
+   can0_pins_a: can0@0 {
+   allwinner,pins = "PH20","PH21";
+   allwinner,function = "can";
+   allwinner,drive = <0>;
+   allwinner,pull = <0>;
+   };
+...
+   can0: can@01c2bc00 {
+   compatible = "allwinner,sun4ican";
+   reg = <0x01c2bc00 0x400>;
+   interrupts = <0 26 4>;
+   clocks = <_gates 4>;
+   status = "disabled";
+   };
+
+Board specific .dts file:
+
+   can0: can@01c2bc00 {
+   pinctrl-names = "default";
+   pinctrl-0 = <_pins_a>;
+   status = "okay";
+   };
+

[PATCH v7 3/3] can: Allwinner A10/A20 CAN Controller support - Kernel module

2015-09-14 Thread Gerhard Bertelsmann

Kernel module for Allwinner A10/A20 CAN

Signed-off-by: Gerhard Bertelsmann 
---

 drivers/net/can/Kconfig|  10 +
 drivers/net/can/Makefile   |   1 +
 drivers/net/can/sun4i_can.c| 857 +
 3 files changed, 868 insertions(+)


diff --git a/drivers/net/can/Kconfig b/drivers/net/can/Kconfig
index e8c96b8..6d04183 100644
--- a/drivers/net/can/Kconfig
+++ b/drivers/net/can/Kconfig
@@ -129,6 +129,16 @@ config CAN_RCAR
  To compile this driver as a module, choose M here: the module will
  be called rcar_can.
 
+config CAN_SUN4I
+   tristate "Allwinner A10 CAN controller"
+   depends on MACH_SUN4I || MACH_SUN7I || COMPILE_TEST
+   ---help---
+ Say Y here if you want to use CAN controller found on Allwinner
+ A10/A20 SoCs.
+
+ To compile this driver as a module, choose M here: the module will
+ be called sun4i_can.
+
 config CAN_XILINXCAN
tristate "Xilinx CAN"
depends on ARCH_ZYNQ || ARM64 || MICROBLAZE || COMPILE_TEST
diff --git a/drivers/net/can/Makefile b/drivers/net/can/Makefile
index c533c62..1f21cef 100644
--- a/drivers/net/can/Makefile
+++ b/drivers/net/can/Makefile
@@ -27,6 +27,7 @@ obj-$(CONFIG_CAN_FLEXCAN) += flexcan.o
 obj-$(CONFIG_PCH_CAN)  += pch_can.o
 obj-$(CONFIG_CAN_GRCAN)+= grcan.o
 obj-$(CONFIG_CAN_RCAR) += rcar_can.o
+obj-$(CONFIG_CAN_SUN4I)+= sun4i_can.o
 obj-$(CONFIG_CAN_XILINXCAN)+= xilinx_can.o
 
 subdir-ccflags-y += -D__CHECK_ENDIAN__
diff --git a/drivers/net/can/sun4i_can.c b/drivers/net/can/sun4i_can.c
new file mode 100644
index 000..8e32520
--- /dev/null
+++ b/drivers/net/can/sun4i_can.c
@@ -0,0 +1,857 @@
+/*
+ * sun4i_can.c - CAN bus controller driver for Allwinner SUN4I based SoCs
+ *
+ * Copyright (C) 2013 Peter Chen
+ * Copyright (C) 2015 Gerhard Bertelsmann
+ * All rights reserved.
+ *
+ * Parts of this software are based on (derived from) the SJA1000 code by:
+ *   Copyright (C) 2014 Oliver Hartkopp 
+ *   Copyright (C) 2007 Wolfgang Grandegger 
+ *   Copyright (C) 2002-2007 Volkswagen Group Electronic Research
+ *   Copyright (C) 2003 Matthias Brukner, Trajet Gmbh, Rebenring 33,
+ *   38106 Braunschweig, GERMANY
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of Volkswagen nor the names of its contributors
+ *may be used to endorse or promote products derived from this software
+ *without specific prior written permission.
+ *
+ * Alternatively, provided that this notice is retained in full, this
+ * software may be distributed under the terms of the GNU General
+ * Public License ("GPL") version 2, in which case the provisions of the
+ * GPL apply INSTEAD OF those given above.
+ *
+ * The provided data structures and external interfaces from this code
+ * are not restricted to be used by modules with a GPL compatible license.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
+ * DAMAGE.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define DRV_NAME "sun4i_can"
+
+/* Registers address (physical base address 0x01C2BC00) */
+#define SUNXI_REG_MSEL_ADDR    0x0000  /* CAN Mode Select */
+#define SUNXI_REG_CMD_ADDR 0x0004  /* CAN Command */
+#define SUNXI_REG_STA_ADDR 0x0008  /* CAN Status */
+#define SUNXI_REG_INT_ADDR 0x000c  /* CAN Interrupt Flag */
+#define SUNXI_REG_INTEN_ADDR   0x0010  /* CAN Interrupt Enable */
+#define SUNXI_REG_BTIME_ADDR   0x0014  /* CAN Bus Timing 0 */
+#define SUNXI_REG_TEWL_ADDR0x0018  /*

[PATCH v7 0/3] can: Allwinner A10/A20 CAN Controller support - Summary

2015-09-14 Thread Gerhard Bertelsmann

Hi,

please find attached the next version of my patch set. I have
taken all remarks from Marc into the new version.

Please review, test and report any bugs you find.

The patchset applies to all recent Kernel versions (4.3-rc, 4.2, next etc.).

[PATCH v7 1/3] Device Tree Binding Documentation
[PATCH v7 2/3] Defconfigs
[PATCH v7 3/3] Kernel Module

History:
v7: set_normal_mode: stripped (code inserted in can_stop)
set_reset_mode: stripped (code inserted in can_start)
sunxi_can_start: reworked
sunxi_can_stop: function added
sunxi_can_err: don't skip if skb alloc fails
sunxican_bittiming_const: use netdev_dbg instead of netdev_info
sunxican_probe: CAN_CTRLMODE_PRESUME_ACK

v6: renamed the driver to sun4i as suggested by Maxime Ripard
removed module version
removed suspend and resume
moved clk enable from can_start into open / should be balanced
  between enabling and disabling now
freeing resources on error

v5: fix license
modify prefix to mode select defines
enable and disable clock in sunxican_get_berr_counter
delete set_normal_mode at the end of sunxi_can_start
removed sunxican_id_table
use devm_clk_get instead of clk_get
use devm_ioremap_resource to simplify probe and remove
make set-normal-mode and set-reset-mode more readable

v4: defines prefixed with SUNXI_
sunxi_can_write_cmdreg tweaked
loops in set_xxx_mode reworked
add return value to set_xxx_mode
sunxican_start_xmit reworked
struct platform_driver stripped
moved set_bittiming into open
moved clock start into open
add clock stop to close
suspend reworked
resume reworked
fixed double counting bug

v3: changed error state change handling (thx Andri for the hint)
use bittiming function correctly (no need to call it)
strip down priv (suggested by Marc)
scripts/checkpatch.pl-> no matches anymore
sparse -> no errors or warnings anymore
v2: cleaning
v1: initial

Signed-off-by: Gerhard Bertelsmann 
---

 .../devicetree/bindings/net/can/sun4i_can.txt  |  38 +
 arch/arm/configs/multi_v7_defconfig|   1 +
 arch/arm/configs/sunxi_defconfig   |   2 +
 drivers/net/can/Kconfig|  10 +
 drivers/net/can/Makefile   |   1 +
 drivers/net/can/sun4i_can.c| 857 +
 6 files changed, 909 insertions(+)


[PATCH v7 2/3] can: Allwinner A10/A20 CAN Controller support - Defconfigs

2015-09-14 Thread Gerhard Bertelsmann

Defconfigs for Allwinner A10/A20 CAN driver

Signed-off-by: Gerhard Bertelsmann 
---

 arch/arm/configs/multi_v7_defconfig|   1 +
 arch/arm/configs/sunxi_defconfig   |   2 +
 2 files changed, 3 insertions(+)


diff --git a/arch/arm/configs/multi_v7_defconfig 
b/arch/arm/configs/multi_v7_defconfig
index 03deb7f..14eb6b9 100644
--- a/arch/arm/configs/multi_v7_defconfig
+++ b/arch/arm/configs/multi_v7_defconfig
@@ -153,6 +153,7 @@ CONFIG_CAN_DEV=y
 CONFIG_CAN_AT91=m
 CONFIG_CAN_XILINXCAN=y
 CONFIG_CAN_MCP251X=y
+CONFIG_CAN_SUN4I=y
 CONFIG_BT=m
 CONFIG_BT_MRVL=m
 CONFIG_BT_MRVL_SDIO=m
diff --git a/arch/arm/configs/sunxi_defconfig b/arch/arm/configs/sunxi_defconfig
index 51eea22..fe020a5 100644
--- a/arch/arm/configs/sunxi_defconfig
+++ b/arch/arm/configs/sunxi_defconfig
@@ -31,6 +31,8 @@ CONFIG_IP_PNP_BOOTP=y
 # CONFIG_INET_LRO is not set
 # CONFIG_INET_DIAG is not set
 # CONFIG_IPV6 is not set
+CONFIG_CAN=y
+CONFIG_CAN_SUN4I=y
 # CONFIG_WIRELESS is not set
 CONFIG_DEVTMPFS=y
 CONFIG_DEVTMPFS_MOUNT=y

Re: [PATCH v2] net: mvneta: fix refilling for Rx DMA buffers

2015-09-14 Thread Oren Laskin

I would hit this error on my Armada 370 board about 20% of the time
after downloading a 30MB file to /tmp.  We're running a 1 Gb SGMII
link.  I would hit this in less than a minute before removing this
commit from my tree.  I've now been running this test in a loop for a
few hours with no problems.

It was somewhat hard to diagnose since files I copied with scp didn't show
the issue (or at least not as quickly).  I set up an http program to serve a
file, replicated the problem with wget, and that is how I found it.

Oren

On Mon, Sep 14, 2015 at 3:13 PM, Simon Guinot  wrote:
> Hi Oren,
>
> On Mon, Sep 14, 2015 at 01:22:12PM -0700, Oren Laskin wrote:
>> I had to undo this change on my Armada 370 based board.  It was causing
>> corrupt data to make it through on large downloads.  I'm using wget to get
>> the same 30MB file many times and the SHA would occasionally be different.
>
> During your tests, can you see some "Linux processing - Can't refill"
> messages along with the data corruptions ?
>
>> I tracked it down to this commit.  In it, I would find on the order of a
>> few hundred bytes to simply be wrong data.
>
> I am a little bit surprised here. For me, this patch is very simple and
> does the exact opposite. It does fix kernel crashes and data corruptions
> in case of refilling errors. This can happen for example if you run
> large data transfers with jumbo frames enabled...
>
> But anyway, I'll try to reproduce the issue tomorrow. I only have to
> wget the same file (size 30MB) in a loop and to check its md5sum ?
> That's it ? And how long should I wait for the error ?
>
> Thanks,
>
> Simon
>
>>
>> Thanks,
>>
>> Oren
>>
>> On Tue, Jul 21, 2015 at 12:30 AM, David Miller  wrote:
>>
>> > From: Simon Guinot 
>> > Date: Sun, 19 Jul 2015 13:00:53 +0200
>> >
>> > > With the actual code, if a memory allocation error happens while
>> > > refilling a Rx descriptor, then the original Rx buffer is both passed
>> > > to the networking stack (in a SKB) and let in the Rx ring. This leads
>> > > to various kernel oops and crashes.
>> > >
>> > > As a fix, this patch moves Rx descriptor refilling ahead of building
>> > > SKB with the associated Rx buffer. In case of a memory allocation
>> > > failure, data is dropped and the original DMA buffer is put back into
>> > > the Rx ring.
>> > >
>> > > Signed-off-by: Simon Guinot 
>> > > Fixes: c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP
>> > network unit")
>> > > Cc:  # v3.8+
>> > > Tested-by: Yoann Sculo 
>> >
>> > Applied, thanks.
>> >
>> > ___
>> > linux-arm-kernel mailing list
>> > linux-arm-ker...@lists.infradead.org
>> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>> >
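
The ordering described in the quoted commit message (refill the Rx descriptor first, build the SKB only if the refill succeeded) can be sketched as follows. This is a minimal userspace sketch under stated assumptions: the types, the allocator hooks and the function names are hypothetical and are not taken from the mvneta driver. It illustrates the intended ordering only and says nothing about the corruption reported above.

```c
#include <stddef.h>

/* Hypothetical Rx descriptor: just the buffer it currently owns. */
struct rx_desc {
	void *buf;
};

/* Outcome of handling one received descriptor. */
enum rx_result { RX_PASSED_UP, RX_DROPPED };

/* Illustrative allocator hooks so that a failure can be injected. */
static char spare_buffer[2048];

static void *alloc_ok(void)
{
	return spare_buffer;
}

static void *alloc_fail(void)
{
	return NULL;
}

/*
 * Refill first, hand the old buffer up only if the refill worked.
 * On success *to_stack receives the old buffer and the ring owns the new one.
 * On failure the frame is dropped and the ring keeps the original buffer,
 * so the buffer is never owned by both the stack and the ring.
 */
static enum rx_result rx_one(struct rx_desc *desc, void *(*alloc)(void),
			     void **to_stack)
{
	void *fresh = alloc();

	if (!fresh) {
		*to_stack = NULL;
		return RX_DROPPED;
	}

	*to_stack = desc->buf;
	desc->buf = fresh;
	return RX_PASSED_UP;
}
```

Calling rx_one() with alloc_fail leaves the descriptor untouched, which is the "data is dropped and the original DMA buffer is put back" case from the commit message.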

Re: [net-next PATCH] net: sched: document attach_default_qdiscs

2015-09-14 Thread Cong Wang

On Mon, Sep 14, 2015 at 3:42 PM, Phil Sutter  wrote:
> On Mon, Sep 14, 2015 at 03:07:42PM -0700, Cong Wang wrote:
>> Maybe update default_qdisc description in Documentation/sysctl/net.txt?
>
> I don't think this is the right place for source code documentation. The
> intended audience are users, and I wouldn't expect a developer to search
> in there. On the other hand, that description would indeed benefit from
> a review: Apart from omitting noqueue, it doesn't mention leaf qdiscs either.
> I'll fix this.
>

You are not just documenting the source code, you are documenting
a user-visible behavior. This is why I proposed net.txt.

RE: fjes: update_zone_task

2015-09-14 Thread Izumi, Taku

Dear Dan,

  Thanks for pointing this out!
  I'll check that soon.

  Sincerely,
  Taku Izumi

> -Original Message-
> From: Dan Carpenter [mailto:dan.carpen...@oracle.com]
> Sent: Monday, September 14, 2015 10:32 AM
> To: Izumi, Taku/泉 拓
> Cc: netdev@vger.kernel.org
> Subject: re: fjes: update_zone_task
> 
> Hello Taku Izumi,
> 
> The patch 785f28e061a8: "fjes: update_zone_task" from Aug 21, 2015,
> leads to the following static checker warning:
> 
>   drivers/net/fjes/fjes_hw.c:1016 fjes_hw_update_zone_task()
>   warn: potential off by one 'info[]' limit 'hw->max_epid'
> 
> drivers/net/fjes/fjes_hw.c
>963  case 0:
>964
>965  for (epidx = 0; epidx < hw->max_epid; epidx++) {
>966  if (epidx == hw->my_epid) {
>967  hw->ep_shm_info[epidx].es_status =
>968  info[epidx].es_status;
>969  hw->ep_shm_info[epidx].zone =
>970  info[epidx].zone;
>971  continue;
>972  }
>973
>974  pstatus = fjes_hw_get_partner_ep_status(hw, 
> epidx);
>975  switch (pstatus) {
>976  case EP_PARTNER_UNSHARE:
>977  default:
>978  if ((info[epidx].zone !=
>979  FJES_ZONING_ZONE_TYPE_NONE) &&
>980  (info[epidx].es_status ==
>981  FJES_ZONING_STATUS_ENABLE) &&
>982  (info[epidx].zone ==
>983  info[hw->my_epid].zone))
>984  set_bit(epidx, _bit);
>985  else
>986  set_bit(epidx, _bit);
>987  break;
>988
>989  case EP_PARTNER_COMPLETE:
>990  case EP_PARTNER_WAITING:
>991  if ((info[epidx].zone ==
>992  FJES_ZONING_ZONE_TYPE_NONE) ||
>993  (info[epidx].es_status !=
>994  FJES_ZONING_STATUS_ENABLE) ||
>995  (info[epidx].zone !=
>996  info[hw->my_epid].zone)) {
>997  set_bit(epidx,
>998  
> >unshare_watch_bitmask);
>999  set_bit(epidx,
>   1000  
> >hw_info.buffer_unshare_reserve_bit);
>   1001  }
>   1002  break;
>   1003
>   1004  case EP_PARTNER_SHARED:
>   1005  if ((info[epidx].zone ==
>   1006  FJES_ZONING_ZONE_TYPE_NONE) ||
>   1007  (info[epidx].es_status !=
>   1008  FJES_ZONING_STATUS_ENABLE) ||
>   1009  (info[epidx].zone !=
>   1010  info[hw->my_epid].zone))
>   1011  set_bit(epidx, _bit);
>   1012  break;
>   1013  }
>   1014  }
>   1015
>   1016  hw->ep_shm_info[epidx].es_status = 
> info[epidx].es_status;
>   1017  hw->ep_shm_info[epidx].zone = info[epidx].zone;
> 
> 
> I'm not sure how Smatch is able to generate this warning.  The array is
> allocated using the FJES_DEV_REQ_BUF_SIZE(hw->max_epid) macro.  It
> really has a lot of obfuscation layers so I wasn't able to understand
> it.
> 
> It seems like this might be a real bug though.  I suspect the fix is to
> change the continue on line 971 to a break and delete lines 1016 and
> 1017?
> 
>   1018
>   1019  break;
>   1020  }
> 
> regards,
> dan carpenter
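
The reason a checker flags line 1016 can be shown with a small sketch: once a for loop bounded by "idx < limit" runs to completion, the index equals the limit, so using it as an array subscript afterwards reads one element past an array of "limit" entries. The function names below are hypothetical. Whether info[] really has only hw->max_epid entries is not established in the message above, so this illustrates the pattern only.

```c
#include <stddef.h>

/*
 * Walks 'limit' elements the same way the loop at lines 965-1014 does and
 * returns the value the index holds once the loop has finished.
 */
static size_t index_after_loop(size_t limit)
{
	size_t idx;

	for (idx = 0; idx < limit; idx++) {
		/* per-element work would go here */
	}

	return idx;	/* equals 'limit' when no 'break' is taken */
}

/* An index is valid for an array of 'limit' elements only if idx < limit. */
static int index_in_bounds(size_t idx, size_t limit)
{
	return idx < limit;
}
```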

[PATCH V1 net-next] net: only check perm protocol when register proto

2015-09-14 Thread martinbj2008

From: Junwei Zhang 

The permanent protocol nodes are at the head of the list, so only
those nodes need to be checked.

Insert the new node after the last permanent protocol node,
whether or not the new node is permanent.

If the new protocol conflicts with an existing permanent protocol,
go to out_permanent immediately.

Signed-off-by: Martin Zhang 
---
 net/ipv4/af_inet.c | 14 --
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 1d0c3ad..c61e0b5 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1043,22 +1043,16 @@ void inet_register_protosw(struct inet_protosw *p)
goto out_illegal;
 
/* If we are trying to override a permanent protocol, bail. */
-   answer = NULL;
last_perm = [p->type];
list_for_each(lh, [p->type]) {
answer = list_entry(lh, struct inet_protosw, list);
-
/* Check only the non-wild match. */
-   if (INET_PROTOSW_PERMANENT & answer->flags) {
-   if (protocol == answer->protocol)
+   if ((INET_PROTOSW_PERMANENT & answer->flags) == 0)
break;
-   last_perm = lh;
-   }
-
-   answer = NULL;
+   if (protocol == answer->protocol)
+   goto out_permanent;
+   last_perm = lh;
}
-   if (answer)
-   goto out_permanent;
 
/* Add the new entry after the last permanent entry if any, so that
 * the new entry does not override a permanent entry when matched with
-- 
1.8.3.1
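
The list walk this patch aims for can be sketched outside the kernel. This is a minimal sketch, assuming a plain singly linked list instead of the kernel's list_head; the struct, the flag value and the function name are stand-ins chosen for illustration. The walk stops at the first non-permanent entry, rejects a protocol that matches a permanent entry, and links the new entry in right after the last permanent one.

```c
#include <stddef.h>

#define PROTOSW_PERMANENT 0x02	/* stand-in for INET_PROTOSW_PERMANENT */

/* Hypothetical, simplified protocol switch entry. */
struct protosw {
	int protocol;
	unsigned int flags;
	struct protosw *next;
};

/*
 * Insert 'p' into the list at *head.
 * Permanent entries are assumed to sit at the front of the list.
 * Returns 0 on success, -1 if 'p' would override a permanent protocol.
 */
static int register_protosw(struct protosw **head, struct protosw *p)
{
	struct protosw **link = head;	/* where the new entry gets linked in */

	while (*link) {
		struct protosw *cur = *link;

		/* First non-permanent entry: nothing further to check. */
		if (!(cur->flags & PROTOSW_PERMANENT))
			break;
		/* Conflict with a permanent protocol: bail out immediately. */
		if (cur->protocol == p->protocol)
			return -1;
		link = &cur->next;
	}

	/* Link in after the last permanent entry (or at the head if none). */
	p->next = *link;
	*link = p;
	return 0;
}
```

The sketch relies on the same invariant as the commit message: permanent entries always precede non-permanent ones, so the walk never needs to look past the first non-permanent entry.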


Re: [PATCH v3 net 0/5] ipv6: Fix dst_entry refcnt bugs in ip6_tunnel

2015-09-14 Thread David Miller

From: David Miller 
Date: Mon, 14 Sep 2015 19:49:25 -0700 (PDT)

> Series applied, thanks Martin.

Actually, reverted, this doesn't even compile :-/

In file included from include/linux/srcu.h:33:0,
 from include/linux/notifier.h:15,
 from include/linux/memory_hotplug.h:6,
 from include/linux/mmzone.h:812,
 from include/linux/gfp.h:5,
 from include/linux/kmod.h:22,
 from include/linux/module.h:13,
 from net/ipv6/ip6_tunnel.c:23:
net/ipv6/ip6_tunnel.c: In function ‘ip6_tnl_per_cpu_dst_set’:
net/ipv6/ip6_tunnel.c:135:35: error: invalid type argument of ‘->’ (have 
‘seqlock_t’)
lockdep_is_held(>lock->lock)));
   ^
include/linux/rcupdate.h:569:52: note: in definition of macro ‘RCU_LOCKDEP_WARN’
   if (debug_lockdep_rcu_enabled() && !__warned && (c)) { \
^
include/linux/rcupdate.h:787:2: note: in expansion of macro 
‘__rcu_dereference_protected’
  __rcu_dereference_protected((p), (c), __rcu)
  ^
net/ipv6/ip6_tunnel.c:133:14: note: in expansion of macro 
‘rcu_dereference_protected’
  dst_release(rcu_dereference_protected(
  ^
net/ipv6/ip6_tunnel.c:135:8: note: in expansion of macro ‘lockdep_is_held’
lockdep_is_held(>lock->lock)));
^

Re: [PATCH v3 net 0/5] ipv6: Fix dst_entry refcnt bugs in ip6_tunnel

2015-09-14 Thread David Miller

From: Martin KaFai Lau 
Date: Fri, 11 Sep 2015 11:06:16 -0700

> v3:
> - Merge a 'if else if' test in patch 4
> - Use rcu_dereference_protected in patch 5 to fix a sparse check when
>   CONFIG_SPARSE_RCU_POINTER is enabled
> 
> v2:
> - Add patch 4 and 5 to remove the spinlock
> 
> v1:
> This patch series is to fix the dst refcnt bugs in ip6_tunnel.
> 
> Patch 1 and 2 are the prep works.  Patch 3 is the fix.
> 
> I can reproduce the bug by adding and removing the ip6gre tunnel
> while running a super_netperf TCP_CRR test.  I get the following
> trace by adding WARN_ON_ONCE(newrefcnt < 0) to dst_release():
 ...

Series applied, thanks Martin.

bnx2x - occasional high packet loss (on LAN)

2015-09-14 Thread Nikola Ciprich

Hello,

I'm trying to track down a strange issue with one of our servers and
would like to ask for recommendations.

I've got three node cluster (nodes A..C) interconnected with stacked broadcom
ICX6610. eth0 of each box is connected to first switch, eth1 to second one,
bonding set as follows: "mode=802.3ad lacp_rate=fast xmit_hash_policy=layer2+3 
miimon=100"

It has happened a few times that eth1 on box A suddenly started misbehaving and
communication with the other nodes (e.g. flood ping) started dropping up to 30%
of packets. When this port was shut down on both sides, the problem immediately
vanished.

We've tried replacing the card and the cable and using a different port on the
switch, but the problem occurred again yesterday.

Since it's "only" loss, and not link loss, bonding doesn't help me much..

however during weekend, port also had strange link issue:

Sep 12 15:23:45 remrprv1a kernel: [676373.296786] bnx2x :03:00.1 eth1: NIC 
Link is Down
Sep 12 15:23:46 remrprv1a kernel: [676373.356638] bond0: link status definitely 
down for interface eth1, disabling it
Sep 12 15:23:46 remrprv1a kernel: [676374.299571] bnx2x :03:00.1 eth1: NIC 
Link is Up, 1 Mbps full duplex, Flow control: ON - receive & transmit
Sep 12 15:23:47 remrprv1a kernel: [676374.364428] bond0: link status definitely 
up for interface eth1, 1 Mbps full duplex
Sep 12 15:23:47 remrprv1a kernel: [676374.372902] bond0: first active interface 
up!
Sep 12 15:24:24 remrprv1a kernel: [676411.402511] bnx2x :03:00.1 eth1: NIC 
Link is Down
Sep 12 15:24:24 remrprv1a kernel: [676411.407422] bond0: link status definitely 
down for interface eth1, disabling it
Sep 12 15:24:25 remrprv1a kernel: [676412.405311] bnx2x :03:00.1 eth1: NIC 
Link is Up, 1 Mbps full duplex, Flow control: ON - receive & transmit
Sep 12 15:24:25 remrprv1a kernel: [676412.408123] bond0: link status definitely 
up for interface eth1, 0 Mbps full duplex
Sep 12 15:24:51 remrprv1a kernel: [676438.477641] bnx2x :03:00.1 eth1: NIC 
Link is Down
Sep 12 15:24:51 remrprv1a kernel: [676438.528513] bond0: link status definitely 
down for interface eth1, disabling it
Sep 12 15:24:52 remrprv1a kernel: [676439.480472] bnx2x :03:00.1 eth1: NIC 
Link is Up, 1 Mbps full duplex, Flow control: ON - receive & transmit
Sep 12 15:24:52 remrprv1a kernel: [676439.536282] bond0: link status definitely 
up for interface eth1, 1 Mbps full duplex

0mbps link speed is quite weird I guess..

all three boxes are the same, running centos6 based system, 4.0.5 x86_64 kernel.

The only difference I noticed between them is that irqbalance was enabled on the
problematic box and not on the others, so I disabled it and rebooted the box. The
problem is that I can't really wait for the issue to reappear, so I'd like to ask:
has anybody seen a similar problem? If so, was it fixed in some newer kernel
release? I haven't found any mention in the changelogs. Or does somebody have a
hint on what else I should check?

I'll try to reproduce this on test system (enabling irqbalance and doing some 
network
benchmarks, but I'd be most happy if I could prevent it on this production 
system..)

thanks a lot for any advice

with best regards

nikola ciprich

PS: here's lspci -vv of eths.. should I provide any further information, please 
let me know:

http://nik.lbox.cz/download/lspci.txt

-- 
-
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava

tel.:   +420 591 166 214
fax:+420 596 621 273
mobil:  +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
-



Re: xfrm4_garbage_collect reaching limit

2015-09-14 Thread Dan Streetman

On Fri, Sep 11, 2015 at 5:48 AM, Steffen Klassert
 wrote:
> Hi Dan.
>
> On Thu, Sep 10, 2015 at 05:01:26PM -0400, Dan Streetman wrote:
>> Hi Steffen,
>>
>> I've been working with Jay on a ipsec issue, which I believe he
>> discussed with you.
>
> Yes, we talked about this at the LPC.
>
>> In this case the xfrm4_garbage_collect is
>> returning error because the number of xfrm4 dst entries has exceeded
>> twice the gc_thresh, which causes new allocations of xfrm4 dst objects
>> to fail, thus making the ipsec connection unusable (until dst objects
>> are removed/freed).
>>
>> The main reason the count gets to the limit is because the
>> xfrm4_policy_afinfo.garbage_collect function - which points to
>> flow_cache_flush (indirectly) - doesn't actually guarantee any xfrm4
>> dst will get cleaned up, it only cleans up unused entries.
>>
>> The flow cache hashtable size limit watermark does restrict how many
>> flow cache entries exist (by shrinking the per-cpu hashtable once it
>> has 4k entries), and therefore indirectly controls the total number of
>> xfrm4 dst objects.  However, there's a mismatch between the default
>> xfrm4 gc_thresh - of 32k objects (which sets a 64k max of xfrm4 dst
>> objects) - and the flow cache hashtable limit of 4k objects per cpu.
>> Any system with 16 or less cpus will have a total limit of 64k (or
>> less) flow cache entries, so the 64k xfrm4 dst entry limit will never
>> be reached.  However for any system with more than 16 cpus, the flow
>> cache limit is greater than the xfrm4 dst limit, and so the xfrm4 dst
>> allocation can fail, rendering the ipsec connection unusable.
>>
>> The most obvious solution is for the system admin to increase the
>> xfrm4_gc_thresh value, although it's not really an obvious solution to
>> the end-user what value they should set it to :-)
>
> Yes, a static gc threshold is always wrong for some workloads. So
> the user needs to adjust it to his needs, even if the right value
> is not obvious.
>
>> Possibly the
>> default value of xfrm4_gc_thresh could be set proportional to
>> num_online_cpus(), but that doesn't help when cpus are onlined after
>> boot.
>
> This could be an option, we could change the xfrm4_gc_thresh value with
> a cpu notifier callback if more cpus come up after boot.

the issue there is, if the value is changed by the user, does a cpu
hotplug reset it back to default...

>
>> Also, a warning message indicating the xfrm4_gc_thresh limit
>> was reached, and a suggestion to increase the limit, may help anyone
>> who hits the issue.

what do you think about this?  it's the simplest option; something like

pr_warn_ratelimited("xfrm4_gc_limit exceeded\n");

or with a suggestion...

pr_warn_ratelimited("xfrm4_gc_limit exceeded, you may want to increase to %u or more\n",
                    2048 * num_online_cpus());

>>
>> I'm not sure if something more aggressive is appropriate, like
>> removing active entries during garbage collection.
>
> It would not make too much sense to push an active flow out of the
> fastpath just to add some other flow. If the number of active
> entries is too high, there is no other option than increasing the
> gc threshold.
>
> You could try to reduce the number of active entries by shutting
> down stale security associations frequently.
>
>> Or, removing the
>> failure condition from xfrm4_garbage_collect so xfrm4 dst_ops can
>> always be allocated,
>
> This would open doors for DOS attacks, we can't do this.
>
>> or just increasing it from gc_thresh * 2 up to *
>> 4 or more.
>
> This would just defer the problem, so not a real solution.
>
> That said, whatever we do, we just paper over the real problem,
> that is the flowcache itself. Everything that needs this kind
> of garbage collecting is fundamentally broken. But as long as
> nobody volunteers to work on a replacement, we have to live
> with this situation somehow.
>
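
The arithmetic behind the 16-CPU figure quoted above can be checked with a few lines of C. This is a minimal userspace sketch that uses only the numbers given in this thread (4k flow cache entries per CPU, a default gc_thresh of 32k, allocation failing above twice gc_thresh, and the suggested 2048 * num_online_cpus()); the macro and function names are hypothetical and are not kernel identifiers.

```c
#include <stdbool.h>

/* Values taken from the discussion above; the names are illustrative only. */
#define FLOW_CACHE_PER_CPU_LIMIT 4096u	/* per-cpu hashtable shrinks at 4k entries */
#define XFRM4_GC_THRESH_DEFAULT  32768u	/* default gc_thresh of 32k objects */

/* Hard limit on xfrm4 dst entries: allocation fails above 2 * gc_thresh. */
static unsigned int xfrm4_dst_limit(unsigned int gc_thresh)
{
	return 2u * gc_thresh;
}

/* Upper bound on flow cache entries across all CPUs. */
static unsigned int flow_cache_limit(unsigned int ncpus)
{
	return ncpus * FLOW_CACHE_PER_CPU_LIMIT;
}

/* True when the flow cache can hold more entries than the dst limit allows. */
static bool dst_limit_reachable(unsigned int ncpus, unsigned int gc_thresh)
{
	return flow_cache_limit(ncpus) > xfrm4_dst_limit(gc_thresh);
}

/* Threshold suggested in the proposed warning: 2048 * num_online_cpus(). */
static unsigned int suggested_gc_thresh(unsigned int ncpus)
{
	return 2048u * ncpus;
}
```

With the default threshold the limit cannot be reached at 16 CPUs but can at 17. The suggested value makes twice the threshold equal to the flow cache bound, so by this arithmetic the flow cache alone can no longer push the dst count past the limit.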

Re: DSA: phy polling

2015-09-14 Thread Florian Fainelli

On 14/09/15 11:23, Russell King - ARM Linux wrote:
> On Mon, Sep 14, 2015 at 10:28:55AM -0700, Florian Fainelli wrote:
>> Just my 2 cents here, I suspect the original intention behind this code
>> was to help utilize the switch's built-in PHY polling unit when
>> available, and use the HW to collect the state of all PHYs in fewer
>> register reads, instead of having to do individual (and quite possibly
>> expensive) MDIO reads towards each individual per-port PHY (at least
>> two reads per PHY to latch MII_BMSR).
> 
> Does the Marvell phy have such a register?  Looking at the register
> dump and plugging/unplugging cables seems not to show a register
> reporting whether any particular interface has changed state, and
> I haven't noticed there being any combined register in anything I've
> seen on these switches.

It seemed to me like the PPU was meant to provide that, but I cannot
find any "summary" register which would give you such a status; I must
have conflated that with what Broadcom switches support.

> 
>> Now, I do agree there is a duplication of functionality here, and a
>> potential fix would be to avoid starting the PHY state machine if/when
>> the switch supports such a feature (not call phy_start*), that should
>> still get you consistent link partner advertised/status
>> values, question is, does that really benefit anybody though?
> 
> I disagree - it's the DSA polling that needs to go.  The DSA polling
> only looks at the port status, and derives from it the carrier
> state.  The rest of the information is only turned into a printk().
> The PHY state machine does a lot more, recording the link speed so
> that ethtool works on the interface.

I am fine with that approach, it was not exactly clear to me before
reading the code whether the link polling workqueue was doing anything
useful in mv88e6xxx.c, now it is pretty clear to me, this is as
expensive (more actually because of the PPU get/put) and useless since
the PHY library directly polls the individual per-port PHYs.
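
For context, the "at least two reads per PHY" mentioned earlier in the thread comes from the link status bit in MII_BMSR being latched low: after a link drop the bit reads 0 once, even if the link is already back, so a second read is needed to see the current state. Below is a small userspace model of that behaviour. struct phy_sim and the helpers are made up for illustration; only the BMSR_LSTATUS bit value (0x0004) is the real one.

```c
#include <stdbool.h>
#include <stdint.h>

#define BMSR_LSTATUS 0x0004 /* link status bit in MII_BMSR */

/* Hypothetical model of one PHY; not a real driver structure. */
struct phy_sim {
	bool link_now;      /* current physical link state */
	bool latched_down;  /* set when the link dropped since the last read */
};

/* Models the link going down: the event is remembered until BMSR is read. */
static void phy_sim_link_drop(struct phy_sim *phy)
{
	phy->link_now = false;
	phy->latched_down = true;
}

/* Models an MDIO read of BMSR: reports a latched failure once, then clears it. */
static uint16_t phy_sim_read_bmsr(struct phy_sim *phy)
{
	uint16_t val = (phy->link_now && !phy->latched_down) ? BMSR_LSTATUS : 0;

	phy->latched_down = false;
	return val;
}

/* Reads twice: the first read flushes the latch, the second is current. */
static bool phy_sim_link_is_up(struct phy_sim *phy)
{
	(void)phy_sim_read_bmsr(phy);
	return phy_sim_read_bmsr(phy) & BMSR_LSTATUS;
}
```

This is why per-PHY polling over MDIO costs two register reads per port on every poll interval.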

> 
> If we do want to go the other way, then the phy code needs a rework so
> that it can be properly classed and drivers with non-standard MII
> registers supported without needing to build register emulation layers.
> 

That part is going to be challenging, all the ethtool/PHY library/MII
code is built around the assumption of translating user-configurable
settings into standard MII calls (all the adv_to* etc.), but, then
again, MII is just one possible translation layer here, you could "plug"
another one if your HW supports that (e.g: non-MDIO, but MMIO for
instance which understands basic concepts like speed/link/duplex/pause).

Oh well.
-- 
Florian

Re: Netfilter: BUG: unable to handle kernel paging request, RIP: physdev_mt+0xd6/0x160

2015-09-14 Thread Sander Eikelenboom


On 2015-09-13 20:06, Florian Westphal wrote:
> Sander Eikelenboom  wrote:
>> Using a linux-4.3-rc1 kernel i encountered the splat below:
>
> Thanks for reporting this bug.
>
>> [  290.200642] BUG: unable to handle kernel paging request at 0484195d
>> [  290.211702] IP: [] physdev_mt+0xd6/0x160
>
> [..]
>
>> [  290.444088]  [] ipt_do_table+0x210/0x390
>> [  290.461951]  [] iptable_filter_hook+0x2e/0x70
>> [  290.470756]  [] nf_iterate+0x4c/0x80
>> [  290.479587]  [] nf_hook_slow+0x64/0xc0
>> [  290.488341]  [] ip_forward+0x369/0x3c0
>> [  290.496927]  [] ? ip_frag_mem+0x40/0x40
>> [  290.505365]  [] ip_rcv_finish+0x101/0x330
>> [  290.513480]  [] ip_rcv+0x291/0x390
>> [  290.521562]  [] ?
>
> Aye, ip forwarding of bridged packets with call-iptables=1 is broken.
>
> Please, could you try this patch?  It fixes this bug for me.


Hi Florian,

Works for me too, thx for the fix !

--
Sander


> diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
> --- a/net/bridge/br_netfilter_hooks.c
> +++ b/net/bridge/br_netfilter_hooks.c
> @@ -355,6 +355,7 @@ static int br_nf_pre_routing_finish(struct sock *sk, struct sk_buff *skb)
>  	struct iphdr *iph = ip_hdr(skb);
>  	struct nf_bridge_info *nf_bridge = nf_bridge_info_get(skb);
>  	struct rtable *rt;
> +	bool daddr_changed;
>  	int err;
>  
>  	nf_bridge->frag_max_size = IPCB(skb)->frag_max_size;
> @@ -363,8 +364,15 @@ static int br_nf_pre_routing_finish(struct sock *sk, struct sk_buff *skb)
>  		skb->pkt_type = PACKET_OTHERHOST;
>  		nf_bridge->pkt_otherhost = false;
>  	}
> +
> +	/* set physoutdev to NULL, its set by the bridge forward hook but
> +	 * frame might be routed instead of bridged.
> +	 */
> +	daddr_changed = br_nf_ipv4_daddr_was_changed(skb, nf_bridge);
> +	nf_bridge->physoutdev = NULL;
>  	nf_bridge->in_prerouting = 0;
> -	if (br_nf_ipv4_daddr_was_changed(skb, nf_bridge)) {
> +
> +	if (daddr_changed) {
>  		if ((err = ip_route_input(skb, iph->daddr, iph->saddr, iph->tos, dev))) {
>  			struct in_device *in_dev = __in_dev_get_rcu(dev);
>  
> diff --git a/net/bridge/br_netfilter_ipv6.c b/net/bridge/br_netfilter_ipv6.c
> index 77383bf..77b 100644
> --- a/net/bridge/br_netfilter_ipv6.c
> +++ b/net/bridge/br_netfilter_ipv6.c
> @@ -167,6 +167,7 @@ static int br_nf_pre_routing_finish_ipv6(struct sock *sk, struct sk_buff *skb)
>  	struct rtable *rt;
>  	struct net_device *dev = skb->dev;
>  	const struct nf_ipv6_ops *v6ops = nf_get_ipv6_ops();
> +	bool daddr_changed;
>  
>  	nf_bridge->frag_max_size = IP6CB(skb)->frag_max_size;
>  
> @@ -174,8 +175,12 @@ static int br_nf_pre_routing_finish_ipv6(struct sock *sk, struct sk_buff *skb)
>  		skb->pkt_type = PACKET_OTHERHOST;
>  		nf_bridge->pkt_otherhost = false;
>  	}
> +
> +	daddr_changed = br_nf_ipv6_daddr_was_changed(skb, nf_bridge);
> +	nf_bridge->physoutdev = NULL;
>  	nf_bridge->in_prerouting = 0;
> -	if (br_nf_ipv6_daddr_was_changed(skb, nf_bridge)) {
> +
> +	if (daddr_changed) {
>  		skb_dst_drop(skb);
>  		v6ops->route_input(skb);


Re: [RFC PATCH 3/3] devicetree: macb: Add optional property tsu-clk

2015-09-14 Thread Boris Brezillon

Hi Harini,

On Mon, 14 Sep 2015 09:39:05 +0530
Harini Katakam  wrote:

> On Fri, Sep 11, 2015 at 10:22 PM, Sören Brinkmann
>  wrote:
> > Hi Harini,
> >
> > On Fri, 2015-09-11 at 01:27PM +0530, Harini Katakam wrote:
> >> Add TSU clock frequency to be used for 1588 support in macb driver.
> >>
> >> Signed-off-by: Harini Katakam 
> >> ---
> >>  Documentation/devicetree/bindings/net/macb.txt |3 +++
> >>  1 file changed, 3 insertions(+)
> >>
> >> diff --git a/Documentation/devicetree/bindings/net/macb.txt 
> >> b/Documentation/devicetree/bindings/net/macb.txt
> >> index b5d7976..f7c0ea8 100644
> >> --- a/Documentation/devicetree/bindings/net/macb.txt
> >> +++ b/Documentation/devicetree/bindings/net/macb.txt
> >> @@ -19,6 +19,9 @@ Required properties:
> >>   Optional elements: 'tx_clk'
> >>  - clocks: Phandles to input clocks.
> >>
> >> +Optional properties:
> >> +- tsu-clk: Time stamp unit clock frequency used.
> >
> > Why are we not using the CCF and a clk_get_rate() in the driver?
> >
> 
> If the clock source was only internal, we could use this
> approach as usual. But TSU clock can be configured to
> come from an external clock source or internal.

How about declaring a fixed-rate clk [1] if it comes from an external
clk, and using a clk driver for the internal clk case?
This way you'll be able to use the clk API (including the
clk_get_rate() function) instead of introducing a new way to retrieve a
clk frequency.

Best Regards,

Boris

[1]http://lxr.free-electrons.com/source/Documentation/devicetree/bindings/clock/fixed-clock.txt

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

Re: [PATCH net] ipv6: include NLM_F_REPLACE in route replace notifications

2015-09-14 Thread Nicolas Dichtel


Le 13/09/2015 19:18, Roopa Prabhu a écrit :

From: Roopa Prabhu 

This patch adds NLM_F_REPLACE flag to ipv6 route replace notifications.
This makes nlm_flags in ipv6 replace notifications consistent
with ipv4.

Signed-off-by: Roopa Prabhu 

Acked-by: Nicolas Dichtel 

[PATCH net] openvswitch: Fix IPv6 exthdr handling with ct helpers.

2015-09-14 Thread Joe Stringer

Static code analysis reveals the following bug:

net/openvswitch/conntrack.c:281 ovs_ct_helper()
warn: unsigned 'protoff' is never less than zero.

This signedness bug breaks error handling for IPv6 extension headers when
using conntrack helpers. Fix the error by using a local signed variable.

Fixes:  cae3a2627520: "openvswitch: Allow attaching helpers to ct
action"
Reported-by: Dan Carpenter 
Signed-off-by: Joe Stringer 
---
 net/openvswitch/conntrack.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index e8e524a..002a755 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -275,13 +275,15 @@ static int ovs_ct_helper(struct sk_buff *skb, u16 proto)
 	case NFPROTO_IPV6: {
 		u8 nexthdr = ipv6_hdr(skb)->nexthdr;
 		__be16 frag_off;
+		int ofs;
 
-		protoff = ipv6_skip_exthdr(skb, sizeof(struct ipv6hdr),
-					   &nexthdr, &frag_off);
-		if (protoff < 0 || (frag_off & htons(~0x7)) != 0) {
+		ofs = ipv6_skip_exthdr(skb, sizeof(struct ipv6hdr), &nexthdr,
+				       &frag_off);
+		if (ofs < 0 || (frag_off & htons(~0x7)) != 0) {
 			pr_debug("proto header not found\n");
 			return NF_ACCEPT;
 		}
+		protoff = ofs;
 		break;
 	}
 	default:
-- 
2.1.4
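
For readers who have not hit this class of bug before, a standalone illustration follows. find_offset() is a made-up stand-in for a helper that returns an offset or a negative error; it is not the kernel function, and the two lookup helpers only mirror the before/after shape of the patch.

```c
/* Made-up stand-in: returns a header offset, or -1 when nothing is found. */
static int find_offset(int fail)
{
	return fail ? -1 : 40;
}

/* Buggy pattern: the negative result is stored in an unsigned variable. */
static int lookup_buggy(int fail, unsigned int *protoff)
{
	*protoff = find_offset(fail);
	if (*protoff < 0)	/* never true for an unsigned value */
		return -1;
	return 0;
}

/* Fixed pattern: check a signed local first, then assign. */
static int lookup_fixed(int fail, unsigned int *protoff)
{
	int ofs = find_offset(fail);

	if (ofs < 0)
		return -1;
	*protoff = ofs;
	return 0;
}
```

In the buggy variant the failure is silently treated as success and the caller continues with a huge bogus offset.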


Re: [net-next PATCH] net: bridge: fix for bridging 802.1Q without REORDER_HDR

2015-09-14 Thread Vlad Yasevich

On 09/11/2015 04:20 PM, Phil Sutter wrote:
> On Fri, Sep 11, 2015 at 12:24:45PM -0700, Stephen Hemminger wrote:
>> On Fri, 11 Sep 2015 21:22:03 +0200
>> Phil Sutter  wrote:
>>
>>> When forwarding packets from an 802.1Q interface with REORDER_HDR set to
>>> zero, the VLAN header previously inserted by vlan_do_receive() needs to
>>> be stripped from the packet and the mac_header adjustment undone,
>>> otherwise a tagged frame with first four bytes missing will be
>>> transmitted.
>>>
>>> Signed-off-by: Phil Sutter 
>>> ---
>>>  net/bridge/br_input.c | 10 ++
>>>  1 file changed, 10 insertions(+)
>>>
>>> diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
>>> index f921a5d..e4e3fc7 100644
>>> --- a/net/bridge/br_input.c
>>> +++ b/net/bridge/br_input.c
>>> @@ -288,6 +288,16 @@ rx_handler_result_t br_handle_frame(struct sk_buff 
>>> **pskb)
>>> }
>>>  
>>>  forward:
>>> +   if (is_vlan_dev(skb->dev) &&
>>> +   !(vlan_dev_priv(skb->dev)->flags & VLAN_FLAG_REORDER_HDR)) {
>>> +   unsigned int offset = skb->data - skb_mac_header(skb);
>>> +
>>> +   skb_push(skb, offset);
>>> +   memmove(skb->data + VLAN_HLEN, skb->data, 2 * ETH_ALEN);
>>> +   skb->mac_header += VLAN_HLEN;
>>> +   skb_pull(skb, offset);
>>> +   skb_reset_mac_len(skb);
>>> +   }
>>> switch (p->state) {
>>> case BR_STATE_FORWARDING:
>>> rhook = rcu_dereference(br_should_route_hook);
>>
>> Thanks for finding this. Is this a new thing or has it always been there?
> 
> Sorry, I didn't check if this is a regression or not. Seen initially
> with RHEL7's kernel-3.10.0-229.7.2, which due to the massive backporting
> is by far not as old as it might seem. But it's surely not a brand new
> problem of net-next or so.
> 
> Since nowadays no sane mind touches REORDER_HDR (there was originally a
> bug in NetworkManager which defaulted this to 0), it may very well be
> there for a long time already.
> 
>> Sorry, this looks so special case it doesn't seem like a good idea.
>> Something is broken in VLAN handling if this is required.
> 
> It is so ugly, I wish I had found a better way to fix the problem. Well,
> maybe I miss something:
> 
> - packet enters __netif_receive_skb_core():
>   - skb->protocol is set to ETH_P_8021Q, so:
> - packet is untagged
> - skb->vlan_tci set
> - skb->protocol set to 'real' protocol
>   - skb_vlan_tag_present(skb) == true, so:
> - vlan_do_receive() is called:
>   - tags the packet again
>   - zeroes vlan_tci
> - goto another_round
> - __netif_receive_skb_core(), round 2:
>   - skb->protocol is not ETH_P_8021Q -> no untagging
>   - skb_vlan_tag_present(skb) == false -> no vlan_do_receive()
>   - rx_handler handler (== br_handle_frame) is called
> 
> IMO the root of all evil is the existence of REORDER_HDR itself. It
> causes an skb which should have been untagged to being passed along with
> VLAN header present and code dealing with it needs to clean up the mess.

So the problem here appears to be the code in br_dev_queue_push_xmit().
It assumes that MAC_HLEN worth of data has been removed from the skb,
which is normal in case of normal VLAN processing.  However, without
REORDER_HEADER set this is no longer the case.  In this case, the ethernet
header is shifted 4 bytes, and when we push it back we miss the 4 bytes
of the destination mac address...

I wonder if it would be safe to just use skb->mac_len.
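
To spell the arithmetic out, here is a tiny sketch. It assumes the push in question is a fixed ETH_HLEN (14 bytes) and that skb->mac_len would be 18 for a singly tagged frame whose tag is still in the packet. Offsets are plain integers; nothing below touches a real skb, and xmit_start_offset() is a name invented for this sketch.

```c
#include <stddef.h>

#define ETH_HLEN  14 /* dst MAC + src MAC + ethertype */
#define VLAN_HLEN  4 /* one 802.1Q tag */

/*
 * Offset (from the true start of the frame) at which transmission would
 * begin after pushing push_len bytes back in front of the payload.
 * payload_off is where the data pointer currently sits.
 */
static size_t xmit_start_offset(size_t payload_off, size_t push_len)
{
	return payload_off - push_len;
}
```

With the tag left in place the payload starts at offset 18, so pushing 14 starts the frame 4 bytes into the destination MAC, while pushing 18 would start it at offset 0.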

Of course, looks like vlan filtering also makes this assumption and
could be really broken.  And God forbid, someone creates a bunch of
nested encapsulated vlans (Q-in-Q-in...) with REORDER_HEADER == 0.
We could end up completely leaving the ethernet header out.

Looks like it's been there for a very long while.

-vlad

> 
> Cheers, Phil
> 


Re: Any way to configure a vlan interface to grab ONLY untagged frames?

2015-09-14 Thread Vlad Yasevich

On 09/13/2015 12:49 PM, Nathan Neulinger wrote:
> It seems like running 'vconfig add IFACE 0' and using IFACE.0 would do this, but it
> doesn't actually seem to work that way.
> 
> If I capture on IFACE directly, I'd expect to get all traffic, including the tagged frames
> (with the tag intact). Looking to be able to bridge/capture/etc. and specifically only
> receive the untagged frames that haven't already been pulled out into a vlan specific
> interface.
> 
> Is there any way to accomplish this without using ebtables or other similar hacks?

If you are dealing with a hw interface, any interface that supports vlan
filtering will by default receive only untagged frames.  Only when you put
it into promiscuous mode will you receive all frames.

With bridge, you could configure your vlans adjacent to your bridge:

   vlan0...N      bridge
       |            |
       +--- eth0 ---+

This way, configured vlan traffic will go to vlan devices, while all other
traffic will go to the bridge.  You can even limit this "all other traffic"
further, by turning on vlan filtering on the bridge which will allow
you to run eth0 in non-promiscuous mode thus enforcing HW vlan filters.

-vlad

> 
> -- Nathan
> 
> 
> Nathan Neulinger   nn...@neulinger.org
> Neulinger Consulting   (573) 612-1412


Re: v2 of seccomp filter c/r patches

2015-09-14 Thread Andy Lutomirski

On Sep 11, 2015 10:28 AM, "Tycho Andersen"  wrote:
>
> On Fri, Sep 11, 2015 at 10:00:22AM -0700, Andy Lutomirski wrote:
> > On Fri, Sep 11, 2015 at 9:30 AM, Andy Lutomirski  wrote:
> > > On Sep 10, 2015 5:22 PM, "Tycho Andersen"  wrote:
> > >>
> > >> Hi all,
> > >>
> > >> Here is v2 of the seccomp filter c/r set. The patch notes have individual
> > >> changes from the last series, but there are two points not noted:
> > >>
> > >> * The series still does not allow us to correctly restore state for programs
> > >>   that will use SECCOMP_FILTER_FLAG_TSYNC in the future. Given that we want to
> > >>   keep seccomp_filter's identity, I think something along the lines of another
> > >>   seccomp command like SECCOMP_INHERIT_PARENT is needed (although I'm not sure
> > >>   if this can even be done yet). In addition, we'll need a kcmp command for
> > >>   figuring out if filters are the same, although this too needs to compare
> > >>   seccomp_filter objects, so it's a little screwy. Any thoughts on how to do
> > >>   this nicely are welcome.
> > >
> > > Let's add a concept of a seccompfd.
> > >
> > > For background of what I want to add: I want to be able to create a
> > > seccomp monitor.  A seccomp monitor will be, logically, a pair of a
> > > struct file that represents the monitor and a seccomp_filter that is
> > > controlled by the monitor.  Depending on flags, whoever holds the
> > > monitor fd could change the active filter, intercept syscalls, and
> > > issue syscalls on behalf of a process that is trapped in an
> > > intercepted syscall.
> > >
> > > Seccomp filters would nest properly.
> > >
> > > The interface would probably be (extremely pseudocoded):
> > >
> > > monitor_fd, filter_fd = seccomp(CREATE_MONITOR, flags, ...);
> > >
> > > Then, later:
> > >
> > > seccomp(ATTACH_TO_FILTER, filter_fd);  /* now filtered */
> > >
> > > read(monitor_fd, buf, size); /* returns an intercepted syscall */
> > > write(monitor_fd, buf, size); /* issues a syscall or releases the
> > > trapped task */
> > >
> > > This can't be implemented on x86 without either going insane or
> > > finishing the massive set of pending cleanups to the x86 entry code.
> > > I favor the latter.
> > >
> > > We could, however, add part of it right now: we could have a way to
> > > create a filterfd, we could add kcmp support for it, and we could add
> > > the ATTACH_TO_FILTER thing.  I think that would solve your problem.
> > >
> > > One major open question: does a filter_fd know what its parent is and,
> > > if so, will it just refuse to attach if the caller's parent is wrong?
> > > Or will a filter_fd attach anywhere.
> > >
> >
> > Let me add one more thought:
> >
> > Currently, struct seccomp_filter encodes a strict tree hierarchy: it
> > knows what its parent is.  This only matters as an implementation
> > detail and because TSYNC checks for seccomp_filter equality.
> >
> > We could change this without user-visible effects.  We could say that,
> > for TSYNC purposes, two filter states match if they contain exactly
> > the same layers in the same order where a layer does *not* encode a
> > concept of parent.  We could then say that attaching a classic bpf
> > filter creates a brand new layer that is not equal to any other layer
> > that's been created.
> >
> > This has no effect whatsoever.  The difference would be that we could
> > declare that attaching the same ebpf program twice creates the *same*
> > layer so that, if you fork and both children attach the same ebpf
> > program, then they match for TSYNC purposes.
>
> Would you keep struct seccomp_filter identity here (meaning that you'd
> reach over and grab the seccomp_filter from a sibling thread if it
> existed)? Would it only work for the last filter attached to siblings,
> or for all the filters? This does make my life easier, but I like the
> idea of just using seccompfd directly below as it seems somewhat
> easier (for me at least) to understand,
>

If we did that, it would just be an internal optimization.
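
The matching rule proposed above (same layers, same order, no notion of parent) can be sketched in userspace as follows. Layer identity is modelled as an opaque number; struct filter_state and filter_states_match() are invented for this sketch and do not exist in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: a filter state is an ordered list of opaque layer identities. */
struct filter_state {
	const unsigned long *layers;
	size_t n;
};

/* Two states match only if they hold the same layers in the same order. */
static bool filter_states_match(const struct filter_state *a,
				const struct filter_state *b)
{
	if (a->n != b->n)
		return false;
	for (size_t i = 0; i < a->n; i++)
		if (a->layers[i] != b->layers[i])
			return false;
	return true;
}
```

Under this model, attaching the same program in two forked children would yield equal identities and therefore matching states, whereas a classic bpf attach would always mint a fresh identity.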

> > Similarly, attaching the
> > same hypothetical filterfd would create the same layer.
>
> If we change the api of my current set to have the ptrace commands
> iterate over seccomp fds, it looks something like:
>
> seccompfd = ptrace(GET_FILTER_FD, pid);
> while (ptrace(NEXT_FD, pid, seccompfd) == 0) {
> if (seccomp(CHECK_INHERITED, seccompfd))
> break;
>
> bpffd = seccomp(GET_BPF_FD, seccompfd);
> err = buf(BPF_PROG_DUMP, bpffd, );
> /* save the bpf prog */
> }
>
> then restore can look like:
>
> while (have_noninherited_filters()) {
> filter = load_filter();
> bpffd = bpf(BPF_PROG_LOAD, filter);
> seccompfd = seccomp(SECCOMP_FD_CREATE, bpffd);
>
> filters[n_filters++] = seccompfd;
> }
>
> /* fork any children as necessary and do the rest of

Re: DSA: phy polling

2015-09-14 Thread Russell King - ARM Linux

On Mon, Sep 14, 2015 at 10:28:55AM -0700, Florian Fainelli wrote:
> Just my 2 cents here, I suspect the original intention behind this code
> was to help utilize the switch's built-in PHY polling unit when
> available, and use the HW to collect the state of all PHYs in fewer
> register reads, instead of having to do individual (and quite possibly
> expensive) MDIO reads towards each individual per-port PHY (at least
> two reads per PHY to latch MII_BMSR).

Does the Marvell phy have such a register?  Looking at the register
dump and plugging/unplugging cables seems not to show a register
reporting whether any particular interface has changed state, and
I haven't noticed there being any combined register in anything I've
seen on these switches.

> Now, I do agree there is a duplication of functionality here, and a
> potential fix would be to avoid starting the PHY state machine if/when
> the switch supports such a feature (not call phy_start*), that should
> still get you consistent link partner advertised/status
> values, question is, does that really benefit anybody though?

I disagree - it's the DSA polling that needs to go.  The DSA polling
only looks at the port status, and derives from it the carrier
state.  The rest of the information is only turned into a printk().
The PHY state machine does a lot more, recording the link speed so
that ethtool works on the interface.

If we do want to go the other way, then the phy code needs a rework so
that it can be properly classed and drivers with non-standard MII
registers supported without needing to build register emulation layers.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

Re: Any way to configure a vlan interface to grab ONLY untagged frames?

2015-09-14 Thread Nathan Neulinger


That is a quite elegant solution. I will give that a try!

-- Nathan

On 09/14/2015 01:28 PM, Vlad Yasevich wrote:
> On 09/13/2015 12:49 PM, Nathan Neulinger wrote:
>> It seems like running 'vconfig add IFACE 0' and using IFACE.0 would do this, but it
>> doesn't actually seem to work that way.
>>
>> If I capture on IFACE directly, I'd expect to get all traffic, including the tagged frames
>> (with the tag intact). Looking to be able to bridge/capture/etc. and specifically only
>> receive the untagged frames that haven't already been pulled out into a vlan specific
>> interface.
>>
>> Is there any way to accomplish this without using ebtables or other similar hacks?
>
> If you are dealing with a hw interface, any interface that supports vlan
> filtering will by default receive only untagged frames.  Only when you put
> it into promiscuous mode will you receive all frames.
>
> With bridge, you could configure your vlans adjacent to your bridge:
>
>    vlan0...N      bridge
>        |            |
>        +--- eth0 ---+
>
> This way, configured vlan traffic will go to vlan devices, while all other
> traffic will go to the bridge.  You can even limit this "all other traffic"
> further, by turning on vlan filtering on the bridge which will allow
> you to run eth0 in non-promiscuous mode thus enforcing HW vlan filters.
>
> -vlad
>
>> -- Nathan
>>
>> Nathan Neulinger   nn...@neulinger.org
>> Neulinger Consulting   (573) 612-1412




--

Nathan Neulinger   nn...@neulinger.org
Neulinger Consulting   (573) 612-1412

Re: [PATCH v2 4/5] seccomp: add a way to access filters via bpf fds

2015-09-14 Thread Andy Lutomirski

On Sep 11, 2015 9:44 AM, "Tycho Andersen"  wrote:
>
> On Fri, Sep 11, 2015 at 09:20:55AM -0700, Andy Lutomirski wrote:
> > On Sep 10, 2015 5:22 PM, "Tycho Andersen"  wrote:
> > >
> > > This patch adds a way for a process that is "real root" to access the
> > > seccomp filters of another process. The process first does a
> > > PTRACE_SECCOMP_GET_FILTER_FD to get an fd with that process' seccomp filter
> > > attached, and then iterates on this with PTRACE_SECCOMP_NEXT_FILTER using
> > > bpf(BPF_PROG_DUMP) to dump the actual program at each step.
> > >
> >
> > > +
> > > +   fd = bpf_new_fd(filter->prog, O_RDONLY);
> > > +   if (fd > 0)
> > > +   atomic_inc(&filter->prog->aux->refcnt);
> >
> > Why isn't this folded into bpf_new_fd?
>
> No reason it can't be as far as I can see. I'll make the change for
> the next version.
>
> > > +
> > > +   return fd;
> > > +}
> > > +
> > > +long seccomp_next_filter(struct task_struct *child, u32 fd)
> > > +{
> > > +   struct seccomp_filter *cur;
> > > +   struct bpf_prog *prog;
> > > +   long ret = -ESRCH;
> > > +
> > > +   if (!capable(CAP_SYS_ADMIN))
> > > +   return -EACCES;
> > > +
> > > +   if (child->seccomp.mode != SECCOMP_MODE_FILTER)
> > > +   return -EINVAL;
> > > +
> > > +   prog = bpf_prog_get(fd);
> > > +   if (IS_ERR(prog)) {
> > > +   ret = PTR_ERR(prog);
> > > +   goto out;
> > > +   }
> > > +
> > > +   for (cur = child->seccomp.filter; cur; cur = cur->prev) {
> > > +   if (cur->prog == prog) {
> > > +   if (!cur->prev)
> > > +   ret = -ENOENT;
> > > +   else
> > > +   ret = bpf_prog_set(fd, cur->prev->prog);
> >
> > This lets you take an fd pointing to one prog and point it elsewhere.
> > I'm not sure that's a good idea.
>
> That's how the interface was designed (calling ptrace(NEXT_FILTER, fd) and
> then doing bpf(DUMP, fd)). I suppose we could have NEXT_FILTER return
> a new fd instead if that seems better to you.

It'll be slower, but it avoids a weird side effect.
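
The difference between the two interfaces can be sketched in userspace like this. struct filter_node stands in for the prev-linked filter chain, and both helper names are invented here; the real code would hand out fds rather than pointers.

```c
#include <stddef.h>

/* Stand-in for a prev-linked chain of filters, newest first. */
struct filter_node {
	int prog_id;
	struct filter_node *prev;
};

/* Side-effect style: the caller's handle is repointed at the older filter. */
static int next_filter_in_place(struct filter_node **handle)
{
	if (!(*handle)->prev)
		return -1;	/* no older filter */
	*handle = (*handle)->prev;
	return 0;
}

/* New-handle style: the caller's handle is left alone. */
static struct filter_node *next_filter_new(const struct filter_node *cur)
{
	return cur->prev;
}
```

In the second style anything else that holds the original handle keeps seeing the same program, which is the property at stake in this exchange.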

>
> Tycho

Re: kernel warning in tcp_fragment

2015-09-14 Thread Martin KaFai Lau

Hi Grant,

Thanks for testing it.  I will try to repost the patch.

Thanks,
Martin

On Tue, Sep 01, 2015 at 04:02:33PM -0700, Grant Zhang wrote:
> Hi Martin,
>
> I did try out your v2 patch on our production server and can confirm that
> the patch gets rid of the WARN_ON trace.
>
> I would really like to see the issue fixed upstream (and backported
> to kernel longterm tree 3.14), either by this patch or something else. Is
> there a plan for this?
>
> Thanks,
>
> Grant
>
> On 12/08/2015 20:45, Martin KaFai Lau wrote:
> >On Mon, Aug 10, 2015 at 02:35:37PM -0400, Neal Cardwell wrote:
> >>On Mon, Aug 10, 2015 at 2:10 PM, Jovi Zhangwei  wrote:
> >>>
> >>>Ping?
> >>>
> >>>We saw a lot of these warnings in our production system. It would be
> >>>greatly appreciated if someone could give us a fix for these warnings. :)
> >>
> >>What is your net.ipv4.tcp_mtu_probing setting? If 1, have you tried
> >>setting it to 0?
> >
> >Hi Jovi, If setting net.ipv4.tcp_mtu_probing=0 helps, can you give the
> >patch we posted earlier a try: 
> >https://patchwork.ozlabs.org/patch/481609/
> >It is the same patch that I pointed out earlier. You can click
> >on the download link.
> >
> >We are currently using a similar patch while keeping 
> >net.ipv4.tcp_mtu_probing=1.
> >
> >Thanks,
> >--Martin
> >

Re: kernel warning in tcp_fragment

2015-09-14 Thread Neal Cardwell

On Mon, Sep 14, 2015 at 6:27 AM, Jovi Zhangwei  wrote:
>
> Hi Neal,
>
> After several days testing on your patch, our system crashed. Dmesg attached.

Jovi -- Sorry about that... thank you for the testing and the data point.

neal

Re: [net-next PATCH] net: bridge: fix for bridging 802.1Q without REORDER_HDR

2015-09-14 Thread Phil Sutter

On Mon, Sep 14, 2015 at 02:21:10PM -0400, Vlad Yasevich wrote:
> On 09/11/2015 04:20 PM, Phil Sutter wrote:
> > On Fri, Sep 11, 2015 at 12:24:45PM -0700, Stephen Hemminger wrote:
> >> On Fri, 11 Sep 2015 21:22:03 +0200
> >> Phil Sutter  wrote:
> >>
> >>> When forwarding packets from an 802.1Q interface with REORDER_HDR set to
> >>> zero, the VLAN header previously inserted by vlan_do_receive() needs to
> >>> be stripped from the packet and the mac_header adjustment undone,
> >>> otherwise a tagged frame with first four bytes missing will be
> >>> transmitted.
> >>>
> >>> Signed-off-by: Phil Sutter 
> >>> ---
> >>>  net/bridge/br_input.c | 10 ++
> >>>  1 file changed, 10 insertions(+)
> >>>
> >>> diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
> >>> index f921a5d..e4e3fc7 100644
> >>> --- a/net/bridge/br_input.c
> >>> +++ b/net/bridge/br_input.c
> >>> @@ -288,6 +288,16 @@ rx_handler_result_t br_handle_frame(struct sk_buff 
> >>> **pskb)
> >>>   }
> >>>  
> >>>  forward:
> >>> + if (is_vlan_dev(skb->dev) &&
> >>> + !(vlan_dev_priv(skb->dev)->flags & VLAN_FLAG_REORDER_HDR)) {
> >>> + unsigned int offset = skb->data - skb_mac_header(skb);
> >>> +
> >>> + skb_push(skb, offset);
> >>> + memmove(skb->data + VLAN_HLEN, skb->data, 2 * ETH_ALEN);
> >>> + skb->mac_header += VLAN_HLEN;
> >>> + skb_pull(skb, offset);
> >>> + skb_reset_mac_len(skb);
> >>> + }
> >>>   switch (p->state) {
> >>>   case BR_STATE_FORWARDING:
> >>>   rhook = rcu_dereference(br_should_route_hook);
> >>
> >> Thanks for finding this. Is this a new thing or has it always been there?
> > 
> > Sorry, I didn't check if this is a regression or not. Seen initially
> > with RHEL7's kernel-3.10.0-229.7.2, which due to the massive backporting
> > is by far not as old as it might seem. But it's surely not a brand new
> > problem of net-next or so.
> > 
> > Since nowadays no sane mind touches REORDER_HDR (there was originally a
> > bug in NetworkManager which defaulted this to 0), it may very well be
> > there for a long time already.
> > 
> >> Sorry, this looks so special case it doesn't seem like a good idea.
> >> Something is broken in VLAN handling if this is required.
> > 
> > It is so ugly, I wish I had found a better way to fix the problem. Well,
> > maybe I miss something:
> > 
> > - packet enters __netif_receive_skb_core():
> >   - skb->protocol is set to ETH_P_8021Q, so:
> >     - packet is untagged
> >     - skb->vlan_tci set
> >     - skb->protocol set to 'real' protocol
> >   - skb_vlan_tag_present(skb) == true, so:
> >     - vlan_do_receive() is called:
> >       - tags the packet again
> >       - zeroes vlan_tci
> >     - goto another_round
> > - __netif_receive_skb_core(), round 2:
> >   - skb->protocol is not ETH_P_8021Q -> no untagging
> >   - skb_vlan_tag_present(skb) == false -> no vlan_do_receive()
> >   - rx_handler handler (== br_handle_frame) is called
> > 
> > IMO the root of all evil is the existence of REORDER_HDR itself. It
> > causes an skb which should have been untagged to be passed along with
> > the VLAN header present, and code dealing with it needs to clean up the mess.
> 
> So the problem here appears to be the code in br_dev_queue_push_xmit().
> It assumes that MAC_HLEN worth of data has been removed from the skb,
> which is the case with normal VLAN processing.  However, without
> REORDER_HEADER set this is no longer the case.  In this case, the ethernet
> header is shifted 4 bytes, and when we push it back we miss the first 4 bytes
> of the destination mac address...

Please note that vlan_do_receive() also inserts the VLAN header in
between ethernet header and IP header, therefore:

> I wonder if it would be safe to just use skb->mac_len.

Given this works, the bridge would still forward a tagged frame which
should have been untagged in the first place.
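
To put numbers on the two variants, here is a minimal userspace sketch. It
assumes the bridge pushes a fixed 14-byte ethernet header before transmit;
struct fake_skb and both helpers are hypothetical stand-ins, not kernel code.

```c
#include <assert.h>
#include <stddef.h>

#define ETH_HLEN  14  /* dst MAC + src MAC + EtherType */
#define VLAN_HLEN 4   /* 802.1Q tag re-inserted by vlan_do_receive() */

/* Hypothetical stand-in for the two skb fields that matter here. */
struct fake_skb {
	size_t data_off; /* offset of "skb->data" from the start of the frame */
	size_t mac_len;  /* real length of the link-layer header */
};

/* Frame offset where transmission starts if a fixed ETH_HLEN is pushed. */
static size_t push_fixed(const struct fake_skb *skb)
{
	assert(skb->data_off >= ETH_HLEN);
	return skb->data_off - ETH_HLEN;
}

/* Frame offset where transmission starts if skb->mac_len is pushed. */
static size_t push_mac_len(const struct fake_skb *skb)
{
	assert(skb->data_off >= skb->mac_len);
	return skb->data_off - skb->mac_len;
}
```

With data_off = mac_len = ETH_HLEN + VLAN_HLEN, push_fixed() starts at
offset 4 (the first four bytes of the destination MAC are lost), while
push_mac_len() starts at offset 0 but the frame still carries the tag.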

I just wondered where this added VLAN header is dropped if the interface
does not belong to a bridge, but then realized that further packet
processing simply ignores the ethernet header (and everything following
it). So unless I forget something, this should indeed be a
bridge-specific problem.
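
For anyone wanting to see what the memmove() in the patch does to the bytes,
here is a minimal sketch on a plain buffer instead of an skb.
strip_vlan_tag() is a hypothetical helper for illustration, not a kernel
function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN  6  /* bytes in a MAC address */
#define VLAN_HLEN 4  /* bytes in an 802.1Q tag (TPID + TCI) */

/*
 * Hypothetical userspace helper.
 * frame layout on entry: dst MAC | src MAC | 802.1Q tag | EtherType | payload
 * Moves both MAC addresses forward over the tag, shrinks *len by VLAN_HLEN
 * and returns the start of the now untagged frame (frame + VLAN_HLEN).
 */
static uint8_t *strip_vlan_tag(uint8_t *frame, size_t *len)
{
	/* need at least both MACs, the tag and the inner EtherType */
	assert(*len >= 2 * ETH_ALEN + VLAN_HLEN + 2);

	/* regions overlap, so memmove() rather than memcpy() */
	memmove(frame + VLAN_HLEN, frame, 2 * ETH_ALEN);
	*len -= VLAN_HLEN;
	return frame + VLAN_HLEN;
}
```

The returned pointer plays the role of the adjusted mac_header in the patch:
the MAC addresses end up directly in front of the inner EtherType.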

Cheers, Phil
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH net] ipv6: include NLM_F_REPLACE in route replace notifications

2015-09-14 Thread Michal Kubecek

On Sun, Sep 13, 2015 at 10:18:33AM -0700, Roopa Prabhu wrote:
> From: Roopa Prabhu 
> 
> This patch adds NLM_F_REPLACE flag to ipv6 route replace notifications.
> This makes nlm_flags in ipv6 replace notifications consistent
> with ipv4.
> 
> Signed-off-by: Roopa Prabhu 

Reviewed-by: Michal Kubecek 


Re: [PATCH] iwlwifi: mvm: fix tof.h header guard

2015-09-14 Thread Johannes Berg

On Sat, 2015-09-12 at 12:04 +0200, Nicolas Iooss wrote:
> Commit ce7929186a39 ("iwlwifi: mvm: add basic Time of Flight (802.11mc
> FTM) support") created drivers/net/wireless/iwlwifi/mvm/tof.h with a
> broken header guard:
> 
> #ifndef __tof
> #define __tof_h__
> 
> ...
> 
> #endif /* __tof_h__ */
> 
> Use __tof_h__ in the first line.
> 
Thanks, I applied it to our development tree.
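
For the record, a small sketch of why the mismatched names matter. The
header body is pasted twice to emulate a double #include; __tof_demo_h__,
struct tof_demo and tof_demo_init() are made-up names, not the contents of
tof.h.

```c
#include <assert.h>
#include <stddef.h>

/* first "inclusion": guard macro is undefined, so the body is compiled */
#ifndef __tof_demo_h__
#define __tof_demo_h__
struct tof_demo {
	int enabled;
};
#endif /* __tof_demo_h__ */

/* second "inclusion": #ifndef now tests the macro that was defined above,
 * so the body is skipped and the struct is not redefined */
#ifndef __tof_demo_h__
#define __tof_demo_h__
struct tof_demo {
	int enabled;
};
#endif /* __tof_demo_h__ */

/* hypothetical helper, only here to use the type */
static void tof_demo_init(struct tof_demo *d)
{
	assert(d != NULL);
	d->enabled = 1;
}
```

If the #ifndef tested a different name than the #define sets, as in the
broken guard, the second copy would be compiled too and the struct
redefinition would fail to build.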

johannes
