[PATCH v2 net] bnx2x: Prevent FW assertion when using Vxlan

2015-12-18 Thread Yuval Mintz
FW has a rare corner case in which a fragmented packet using lots
of frags would not be linearized, causing the FW to assert while trying
to transmit the packet.

To prevent this, we need to make sure the window of fragments containing
MSS worth of data contains 1 BD less than for regular packets due to
the additional parsing BD.

Signed-off-by: Yuval Mintz 
Signed-off-by: Ariel Elior 
---
Changes from V1:
  - Corrected indentation [Thanks Sergei].

Hi Dave,

Please consider applying this to `net'.

Thanks,
Yuval
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 22 +-
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index f8d7a2f..3718362 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -3430,25 +3430,29 @@ static u32 bnx2x_xmit_type(struct bnx2x *bp, struct sk_buff *skb)
return rc;
 }
 
-#if (MAX_SKB_FRAGS >= MAX_FETCH_BD - 3)
+/* VXLAN: 4 = 1 (for linear data BD) + 3 (2 for PBD and last BD) */
+#define BNX2X_NUM_VXLAN_TSO_WIN_SUB_BDS 4
+
+/* Regular: 3 = 1 (for linear data BD) + 2 (for PBD and last BD) */
+#define BNX2X_NUM_TSO_WIN_SUB_BDS   3
+
+#if (MAX_SKB_FRAGS >= MAX_FETCH_BD - BDS_PER_TX_PKT)
 /* check if packet requires linearization (packet is too fragmented)
no need to check fragmentation if page size > 8K (there will be no
violation to FW restrictions) */
 static int bnx2x_pkt_req_lin(struct bnx2x *bp, struct sk_buff *skb,
 u32 xmit_type)
 {
-   int to_copy = 0;
-   int hlen = 0;
-   int first_bd_sz = 0;
+   int first_bd_sz = 0, num_tso_win_sub = BNX2X_NUM_TSO_WIN_SUB_BDS;
+   int to_copy = 0, hlen = 0;
 
-   /* 3 = 1 (for linear data BD) + 2 (for PBD and last BD) */
-   if (skb_shinfo(skb)->nr_frags >= (MAX_FETCH_BD - 3)) {
+   if (xmit_type & XMIT_GSO_ENC)
+   num_tso_win_sub = BNX2X_NUM_VXLAN_TSO_WIN_SUB_BDS;
 
+   if (skb_shinfo(skb)->nr_frags >= (MAX_FETCH_BD - num_tso_win_sub)) {
if (xmit_type & XMIT_GSO) {
unsigned short lso_mss = skb_shinfo(skb)->gso_size;
-   /* Check if LSO packet needs to be copied:
-  3 = 1 (for headers BD) + 2 (for PBD and last BD) */
-   int wnd_size = MAX_FETCH_BD - 3;
+   int wnd_size = MAX_FETCH_BD - num_tso_win_sub;
/* Number of windows to check */
int num_wnds = skb_shinfo(skb)->nr_frags - wnd_size;
int wnd_idx = 0;
-- 
1.9.3

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
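
The window arithmetic described in the patch above can be sketched outside the driver as follows. This is only an illustration under stated assumptions: the value 13 for MAX_FETCH_BD is assumed rather than taken from the patch, and the helpers tso_window_size() and needs_window_check() are hypothetical names, not bnx2x functions.

```c
#include <assert.h>

/* Assumed value, for illustration only; the patch does not show it. */
#define MAX_FETCH_BD_ASSUMED		13

/* Regular: 1 linear data BD + 2 (PBD and last BD). */
#define NUM_TSO_WIN_SUB_BDS		3
/* Encapsulated (VXLAN): one more BD for the additional parsing BD. */
#define NUM_VXLAN_TSO_WIN_SUB_BDS	4

/* Number of frags that may carry one MSS worth of data. */
static int tso_window_size(int is_encap_gso)
{
	int sub = is_encap_gso ? NUM_VXLAN_TSO_WIN_SUB_BDS
			       : NUM_TSO_WIN_SUB_BDS;

	assert(sub < MAX_FETCH_BD_ASSUMED);
	return MAX_FETCH_BD_ASSUMED - sub;
}

/* Non-zero when the skb has enough frags that the windows must be checked. */
static int needs_window_check(int nr_frags, int is_encap_gso)
{
	assert(nr_frags >= 0);
	return nr_frags >= tso_window_size(is_encap_gso);
}
```

With these assumed numbers, an skb with 9 frags is examined only when it is an encapsulated GSO packet, which is the case the old fixed "- 3" did not cover.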


Re: [PATCH iproute2] ip, route: fix minor compile warning

2015-12-18 Thread Daniel Borkmann

On 12/18/2015 02:12 AM, Stephen Hemminger wrote:

On Mon, 14 Dec 2015 16:53:20 +0100
Daniel Borkmann  wrote:


On 12/14/2015 04:51 PM, Phil Sutter wrote:

On Mon, Dec 14, 2015 at 04:34:29PM +0100, Daniel Borkmann wrote:

Seems like gcc (4.8.3) doesn't catch this false positive, triggering
after 0f7543322c5f ("route: ignore RTAX_HOPLIMIT of value -1"):

iproute.c: In function 'print_route':
iproute.c:301:12: warning: 'val' may be used uninitialized in this function [-Wmaybe-uninitialized]
   features &= ~RTAX_FEATURE_ECN;
^
iproute.c:575:10: note: 'val' was declared here
__u32 val;
  ^
So just shut it up by initializing to 0.


Hmm. Interestingly, my patch shouldn't have changed anything relevant
for gcc's decision. OTOH, I don't see a warning using gcc-4.9.3.


If I revert it, the warning is gone for me ;) perhaps some heuristic issue
with that gcc version.

Cheers,
Daniel


I don't see this warning on current master with gcc 4.9.2.


Well, in the commit message I wrote 4.8.3 ... but I don't mind if we drop it, 
sure.
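
For readers unfamiliar with this class of warning, one common shape of such a false positive is sketched below. It is not the iproute.c code; the function name, the flag macro and the overall structure are hypothetical and only show why initializing to 0 is harmless.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a feature bit such as RTAX_FEATURE_ECN. */
#define FEATURE_ECN_BIT	0x1u

static uint32_t clear_ecn_if_present(int have_attr, uint32_t attr_value)
{
	/* Written only when have_attr is set and read only under the same
	 * condition, so it is never used uninitialized. Some compiler
	 * versions cannot prove that, hence the explicit 0. */
	uint32_t val = 0;

	assert(have_attr == 0 || have_attr == 1);

	if (have_attr)
		val = attr_value;

	/* ... unrelated work would sit here ... */

	if (have_attr)
		return val & ~FEATURE_ECN_BIT;
	return 0;
}
```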


[PATCH iproute2 v3 0/3] improve lwtunnel route support

2015-12-18 Thread Paolo Abeni
This patch series tries to improve the current route based
lwtunnel support in iproute2, namely adding support for the
COLLECT_METADATA flag in vxlan and gre link, and for ip6
encap type in lwtunnel.

Tunnel devices need to have the COLLECT_METADATA flag
set in order to be used for route based lwtunnel.

Changes from V1:
- the COLLECT_METADATA flag is now controlled via the 'external' keyword
- 'vni' and 'external' arguments are mutually exclusive for the vxlan link

Changes from V2:
- rebased

Paolo Abeni (3):
  vxlan: add support for collect metadata flag
  gre: add support for collect metadata flag
  lwtunnel: implement support for ip6 encap

 ip/iplink_vxlan.c | 19 +--
 ip/iproute_lwtunnel.c | 92 ++-
 ip/link_gre.c | 11 ++
 3 files changed, 119 insertions(+), 3 deletions(-)

-- 
1.8.3.1



[PATCH iproute2 v3 3/3] lwtunnel: implement support for ip6 encap

2015-12-18 Thread Paolo Abeni
Currently ip6 encap support for lwtunnel is missing.
This patch implements it, mostly duplicating the ipv4 parts.

Also be sure to insert a space after the encap type, when
showing lwtunnel, to avoid the tunnel type and the following
argument being merged into a single word.

Signed-off-by: Paolo Abeni 
---
 ip/iproute_lwtunnel.c | 92 ++-
 1 file changed, 91 insertions(+), 1 deletion(-)

diff --git a/ip/iproute_lwtunnel.c b/ip/iproute_lwtunnel.c
index 1243977..7074906 100644
--- a/ip/iproute_lwtunnel.c
+++ b/ip/iproute_lwtunnel.c
@@ -115,6 +115,37 @@ static void print_encap_ila(FILE *fp, struct rtattr *encap)
}
 }
 
+static void print_encap_ip6(FILE *fp, struct rtattr *encap)
+{
+   struct rtattr *tb[LWTUNNEL_IP6_MAX+1];
+   char abuf[256];
+
+   parse_rtattr_nested(tb, LWTUNNEL_IP6_MAX, encap);
+
+   if (tb[LWTUNNEL_IP6_ID])
+   fprintf(fp, "id %llu ", ntohll(rta_getattr_u64(tb[LWTUNNEL_IP6_ID])));
+
+   if (tb[LWTUNNEL_IP6_SRC])
+   fprintf(fp, "src %s ",
+   rt_addr_n2a(AF_INET6,
+   RTA_PAYLOAD(tb[LWTUNNEL_IP6_SRC]),
+   RTA_DATA(tb[LWTUNNEL_IP6_SRC]),
+   abuf, sizeof(abuf)));
+
+   if (tb[LWTUNNEL_IP6_DST])
+   fprintf(fp, "dst %s ",
+   rt_addr_n2a(AF_INET6,
+   RTA_PAYLOAD(tb[LWTUNNEL_IP6_DST]),
+   RTA_DATA(tb[LWTUNNEL_IP6_DST]),
+   abuf, sizeof(abuf)));
+
+   if (tb[LWTUNNEL_IP6_HOPLIMIT])
+   fprintf(fp, "hoplimit %d ", rta_getattr_u8(tb[LWTUNNEL_IP6_HOPLIMIT]));
+
+   if (tb[LWTUNNEL_IP6_TC])
+   fprintf(fp, "tc %d ", rta_getattr_u8(tb[LWTUNNEL_IP6_TC]));
+}
+
 void lwt_print_encap(FILE *fp, struct rtattr *encap_type,
  struct rtattr *encap)
 {
@@ -125,7 +156,7 @@ void lwt_print_encap(FILE *fp, struct rtattr *encap_type,
 
et = rta_getattr_u16(encap_type);
 
-   fprintf(fp, " encap %s", format_encap_type(et));
+   fprintf(fp, " encap %s ", format_encap_type(et));
 
switch (et) {
case LWTUNNEL_ENCAP_MPLS:
@@ -137,6 +168,9 @@ void lwt_print_encap(FILE *fp, struct rtattr *encap_type,
case LWTUNNEL_ENCAP_ILA:
print_encap_ila(fp, encap);
break;
+   case LWTUNNEL_ENCAP_IP6:
+   print_encap_ip6(fp, encap);
+   break;
}
 }
 
@@ -233,6 +267,59 @@ static int parse_encap_ila(struct rtattr *rta, size_t len,
return 0;
 }
 
+static int parse_encap_ip6(struct rtattr *rta, size_t len, int *argcp, char ***argvp)
+{
+   int id_ok = 0, dst_ok = 0, tos_ok = 0, ttl_ok = 0;
+   char **argv = *argvp;
+   int argc = *argcp;
+
+   while (argc > 0) {
+   if (strcmp(*argv, "id") == 0) {
+   __u64 id;
+   NEXT_ARG();
+   if (id_ok++)
+   duparg2("id", *argv);
+   if (get_u64(&id, *argv, 0))
+   invarg("\"id\" value is invalid\n", *argv);
+   rta_addattr64(rta, len, LWTUNNEL_IP6_ID, htonll(id));
+   } else if (strcmp(*argv, "dst") == 0) {
+   inet_prefix addr;
+   NEXT_ARG();
+   if (dst_ok++)
+   duparg2("dst", *argv);
+   get_addr(&addr, *argv, AF_INET6);
+   rta_addattr_l(rta, len, LWTUNNEL_IP6_DST, &addr.data, addr.bytelen);
+   } else if (strcmp(*argv, "tc") == 0) {
+   __u32 tc;
+   NEXT_ARG();
+   if (tos_ok++)
+   duparg2("tc", *argv);
+   if (rtnl_dsfield_a2n(&tc, *argv))
+   invarg("\"tc\" value is invalid\n", *argv);
+   rta_addattr8(rta, len, LWTUNNEL_IP6_TC, tc);
+   } else if (strcmp(*argv, "hoplimit") == 0) {
+   __u8 hoplimit;
+   NEXT_ARG();
+   if (ttl_ok++)
+   duparg2("hoplimit", *argv);
+   if (get_u8(&hoplimit, *argv, 0))
+   invarg("\"hoplimit\" value is invalid\n", *argv);
+   rta_addattr8(rta, len, LWTUNNEL_IP6_HOPLIMIT, hoplimit);
+   } else {
+   break;
+   }
+   argc--; argv++;
+   }
+
+   /* argv is currently the first unparsed argument,
+* but the lwt_parse_encap() caller will move to the next,
+* so step back */
+   *argcp = argc + 1;
+   *argvp = argv - 1;
+
+   return 0;
+}
+
 int lwt_parse_encap(struct rtattr *rta, size_t len, int *argcp, char 
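
The "step back" convention noted in the comment at the end of parse_encap_ip6() can be shown in isolation. The sketch below is a simplified stand-in, not iproute2 code: struct encap_opts and parse_opts() are hypothetical, values are parsed with plain strtoull()/atoi() instead of the iproute2 helpers, and it assumes (as in the real caller) that at least one word precedes the options in the argument vector.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the parsed values. */
struct encap_opts {
	unsigned long long id;
	int hoplimit;
};

/* Consume "id N" and "hoplimit N" pairs and stop at the first unknown
 * word. On return *argvp points at the last consumed word, so that a
 * caller doing "argc--; argv++" lands on the first unparsed one. */
static int parse_opts(struct encap_opts *o, int *argcp, char ***argvp)
{
	char **argv = *argvp;
	int argc = *argcp;

	assert(o != NULL);

	while (argc > 0) {
		if (strcmp(*argv, "id") == 0 && argc > 1) {
			argc--; argv++;
			o->id = strtoull(*argv, NULL, 0);
		} else if (strcmp(*argv, "hoplimit") == 0 && argc > 1) {
			argc--; argv++;
			o->hoplimit = atoi(*argv);
		} else {
			break;
		}
		argc--; argv++;
	}

	/* argv is the first unparsed word; step back one for the caller. */
	*argcp = argc + 1;
	*argvp = argv - 1;
	return 0;
}
```

Given the words "id 42 hoplimit 7 dev eth0", the function returns with argv pointing at "7", and the caller's own increment then moves on to "dev".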

[PATCH] net: phy: adds backplane driver for Freescale's PCS PHY

2015-12-18 Thread shh.xie
From: Shaohui Xie 

Freescale's PCS PHY can support backplane, this patch provides 10GBASE-KR
and 1000BASE-KX support.

Signed-off-by: Shaohui Xie 
---
 drivers/net/phy/Kconfig |7 +
 drivers/net/phy/Makefile|1 +
 drivers/net/phy/fsl_backplane.c | 1201 +++
 3 files changed, 1209 insertions(+)
 create mode 100644 drivers/net/phy/fsl_backplane.c

diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index 60994a8..ce523ea 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -256,6 +256,13 @@ config MDIO_BCM_IPROC
  This module provides a driver for the MDIO busses found in the
  Broadcom iProc SoC's.
 
+config FSL_BACKPLANE
+   tristate "Support for Backplane on Freescale PCS PHYs"
+   depends on OF_MDIO
+   help
+ This module provides a driver for Backplane on Freescale PCS PHYs,
+ it supports 10GBASE-KR and 1000BASE-KX.
+
 endif # PHYLIB
 
 config MICREL_KS8995MA
diff --git a/drivers/net/phy/Makefile b/drivers/net/phy/Makefile
index f31a4e2..fa1fb4d 100644
--- a/drivers/net/phy/Makefile
+++ b/drivers/net/phy/Makefile
@@ -42,3 +42,4 @@ obj-$(CONFIG_MDIO_MOXART) += mdio-moxart.o
 obj-$(CONFIG_MDIO_BCM_UNIMAC)  += mdio-bcm-unimac.o
 obj-$(CONFIG_MICROCHIP_PHY)+= microchip.o
 obj-$(CONFIG_MDIO_BCM_IPROC)   += mdio-bcm-iproc.o
+obj-$(CONFIG_FSL_BACKPLANE)+= fsl_backplane.o
diff --git a/drivers/net/phy/fsl_backplane.c b/drivers/net/phy/fsl_backplane.c
new file mode 100644
index 000..ce0974d
--- /dev/null
+++ b/drivers/net/phy/fsl_backplane.c
@@ -0,0 +1,1201 @@
+/* Freescale backplane driver.
+ *   Author: Shaohui Xie 
+ *
+ * Copyright 2015 Freescale Semiconductor, Inc.
+ *
+ * Licensed under the GPL-2 or later.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define FSL_PCS_PHY_ID 0x0083e400
+
+/* Freescale KR PMD registers */
+#define FSL_KR_PMD_CTRL0x96
+#define FSL_KR_PMD_STATUS  0x97
+#define FSL_KR_LP_CU   0x98
+#define FSL_KR_LP_STATUS   0x99
+#define FSL_KR_LD_CU   0x9a
+#define FSL_KR_LD_STATUS   0x9b
+
+/* Freescale KR PMD defines */
+#define PMD_RESET  0x1
+#define PMD_STATUS_SUP_STAT0x4
+#define PMD_STATUS_FRAME_LOCK  0x2
+#define TRAIN_EN   0x3
+#define TRAIN_DISABLE  0x1
+#define RX_STAT0x1
+
+/* Freescale KX PCS mode register */
+#define FSL_PCS_IF_MODE0x8014
+
+/* Freescale KX PCS mode register init value */
+#define IF_MODE_INIT   0x8
+
+/* Freescale KX/KR AN registers */
+#define FSL_AN_AD1 0x11
+#define FSL_AN_BP_STAT 0x30
+
+/* Freescale KX/KR AN registers defines */
+#define AN_CTRL_INIT   0x1200
+#define KX_AN_AD1_INIT 0x25
+#define KR_AN_AD1_INIT 0x85
+#define REMOTE_FAULT   0x10
+#define AN_LNK_UP_MASK 0x4
+#define KR_AN_MASK 0x8
+#define TRAIN_FAIL 0x8
+
+/* C(-1) */
+#define BIN_M1 0
+/* C(1) */
+#define BIN_LONG   1
+#define BIN_M1_SEL 6
+#define BIN_Long_SEL   7
+#define CDR_SEL_MASK   0x0007
+#define BIN_SNAPSHOT_NUM   5
+#define BIN_M1_THRESHOLD   3
+#define BIN_LONG_THRESHOLD 2
+
+#define PRE_COE_MASK   0x03c0
+#define POST_COE_MASK  0x001f
+#define ZERO_COE_MASK  0x3f00
+#define PRE_COE_SHIFT  22
+#define POST_COE_SHIFT 16
+#define ZERO_COE_SHIFT 8
+
+#define PRE_COE_MAX0x0
+#define PRE_COE_MIN0x8
+#define POST_COE_MAX   0x0
+#define POST_COE_MIN   0x10
+#define ZERO_COE_MAX   0x30
+#define ZERO_COE_MIN   0x0
+
+#define TECR0_INIT 0x2420
+#define RATIO_PREQ 0x3
+#define RATIO_PST1Q0xd
+#define RATIO_EQ   0x20
+
+#define GCR0_RESET_MASK0x60
+#define GCR1_REIDL_TH_MASK 0x0070
+#define GCR1_REIDL_EX_SEL_MASK 0x000c
+#define GCR1_REIDL_ET_MAS_MASK 0x4000
+#define TECR0_AMP_RED_MASK 0x003f
+
+#define GCR1_SNP_START_MASK0x0040
+#define RECR1_SNP_DONE_MASK0x0004
+#define TCSR1_SNP_DATA_MASK0xffc0
+#define TCSR1_SNP_DATA_SHIFT   6
+#define TCSR1_EQ_SNPBIN_SIGN_MASK  0x100
+
+#define 

Re: [PATCH v2 0/3] drivers: net: cpsw: Fix bugs in fixed-link PHY DT parsing

2015-12-18 Thread Daniel Trautmann
On Thu, Dec 17, 2015 at 03:45:08PM -0500, David Miller wrote:
> From: "David Rivshin (Allworx)" 
> Date: Wed, 16 Dec 2015 23:02:08 -0500
> 
> > I have tested on the following hardware configurations:
> >  - (EVMSK) dual emac with two real MDIO-connected phys using RGMII-TXID
> >  - single emac with fixed-link using RGMII
> > Testing of other CPSW emac configurations that folks may have would
> > be appreciated.
> 
> I'm going to wait until some others give some feedback and testing
> results on this one, thanks.

I can confirm that this patch is working on the following hardware setup:
 - single emac using MII connected with fixed-link to Micrel KSZ8895 Switch


Re: rhashtable: Prevent spurious EBUSY errors on insertion

2015-12-18 Thread Xin Long
On Fri, Dec 18, 2015 at 10:26 AM, Herbert Xu
 wrote:
> On Fri, Dec 18, 2015 at 12:07:08AM +0800, Xin Long wrote:
>>
>> I'm just wondering, why do not we handle the genuine double rehash
>> issue inside rhashtable? i mean it's just a temporary error that a
>> simple retry may fix it.
>
> Because a double rehash means that someone has cracked your hash
> function and there is no point in trying anymore.

OK, I get your point. Could it still be triggered by legitimate cases under
heavy insertion load? For example, we use rhashtable in nftables: if a large
batch of sets is inserted, could this issue happen?

>
> Cheers,
> --
> Email: Herbert Xu 
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH net-next] tcp: diag: add support for request sockets to tcp_abort()

2015-12-18 Thread Lorenzo Colitti
On Fri, Dec 18, 2015 at 9:14 AM, Eric Dumazet  wrote:
> Adding support for SYN_RECV request sockets to tcp_abort()
> is quite easy after our tcp listener rewrite.

I added test coverage for this to our tests.

Without this patch, attempting to destroy an SYN_RECV socket using
SOCK_DESTROY results in EOPNOTSUPP. With this patch, SOCK_DESTROY
succeeds, and after it does, sock_diag reports no child sockets.

Tested-by: Lorenzo Colitti 


Re: [PATCH iproute2 v2 3/3] lwtunnel: implement support for ip6 encap

2015-12-18 Thread Paolo Abeni
On Thu, 2015-12-17 at 17:30 -0800, Stephen Hemminger wrote:
> On Wed, 16 Dec 2015 13:22:28 +0100
> Paolo Abeni  wrote:
> 
> > Currently ip6 encap support for lwtunnel is missing.
> > This patch implement it, mostly duplicating the ipv4 parts.
> > 
> > Also be sure to insert a space after the encap type, when
> > showing lwtunnel, to avoid the tunnel type and the following
> > argument being merged into a single word.
> > 
> > Signed-off-by: Paolo Abeni 
> 
> Patch does not apply cleanly. Probably because of recent changes to lwtunnel
> and tunnel related arguement parsing.

Sorry, my bad: my local tree was not very up to date. I'll rebase and
resend.

Cheers,

Paolo



[PATCH iproute2 v3 2/3] gre: add support for collect metadata flag

2015-12-18 Thread Paolo Abeni
This patch adds support for IFLA_GRE_COLLECT_METADATA via the
'external' keyword to the gre link.

Signed-off-by: Paolo Abeni 
---
 ip/link_gre.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/ip/link_gre.c b/ip/link_gre.c
index 58f416c..c85741f 100644
--- a/ip/link_gre.c
+++ b/ip/link_gre.c
@@ -74,6 +74,7 @@ static int gre_parse_opt(struct link_util *lu, int argc, char **argv,
__u16 encapflags = 0;
__u16 encapsport = 0;
__u16 encapdport = 0;
+   __u8 metadata = 0;
 
if (!(n->nlmsg_flags & NLM_F_CREATE)) {
memset(&req, 0, sizeof(req));
@@ -148,6 +149,9 @@ get_failed:
encapsport = rta_getattr_u16(greinfo[IFLA_GRE_ENCAP_SPORT]);
if (greinfo[IFLA_GRE_ENCAP_DPORT])
encapdport = rta_getattr_u16(greinfo[IFLA_GRE_ENCAP_DPORT]);
+
+   if (greinfo[IFLA_GRE_COLLECT_METADATA])
+   metadata = 1;
}
 
while (argc > 0) {
@@ -291,6 +295,8 @@ get_failed:
encapflags |= TUNNEL_ENCAP_FLAG_REMCSUM;
} else if (strcmp(*argv, "noencap-remcsum") == 0) {
encapflags |= ~TUNNEL_ENCAP_FLAG_REMCSUM;
+   } else if (strcmp(*argv, "external") == 0) {
+   metadata = 1;
} else
usage();
argc--; argv++;
@@ -325,6 +331,8 @@ get_failed:
addattr16(n, 1024, IFLA_GRE_ENCAP_FLAGS, encapflags);
addattr16(n, 1024, IFLA_GRE_ENCAP_SPORT, htons(encapsport));
addattr16(n, 1024, IFLA_GRE_ENCAP_DPORT, htons(encapdport));
+   if (metadata)
+   addattr_l(n, 1024, IFLA_GRE_COLLECT_METADATA, NULL, 0);
 
return 0;
 }
@@ -413,6 +421,9 @@ static void gre_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[])
if (oflags & GRE_CSUM)
fputs("ocsum ", f);
 
+   if (tb[IFLA_GRE_COLLECT_METADATA])
+   fputs("external ", f);
+
if (tb[IFLA_GRE_ENCAP_TYPE] &&
*(__u16 *)RTA_DATA(tb[IFLA_GRE_ENCAP_TYPE]) != TUNNEL_ENCAP_NONE) {
__u16 type = rta_getattr_u16(tb[IFLA_GRE_ENCAP_TYPE]);
-- 
1.8.3.1



[PATCH iproute2 v3 1/3] vxlan: add support for collect metadata flag

2015-12-18 Thread Paolo Abeni
This patch adds support for IFLA_VXLAN_COLLECT_METADATA via the
'external' keyword to the vxlan link.

Also enforce mutual exclusion between 'vni' and 'external'.

Signed-off-by: Paolo Abeni 
---
 ip/iplink_vxlan.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/ip/iplink_vxlan.c b/ip/iplink_vxlan.c
index db29bf0..aa4d519 100644
--- a/ip/iplink_vxlan.c
+++ b/ip/iplink_vxlan.c
@@ -31,7 +31,7 @@ static void print_explain(FILE *f)
fprintf(f, " [ ageing SECONDS ] [ maxaddress NUMBER ]\n");
fprintf(f, " [ [no]udpcsum ] [ [no]udp6zerocsumtx ] [ [no]udp6zerocsumrx ]\n");
fprintf(f, " [ [no]remcsumtx ] [ [no]remcsumrx ]\n");
-   fprintf(f, " [ gbp ]\n");
+   fprintf(f, " [ [no]external ] [ gbp ]\n");
fprintf(f, "\n");
fprintf(f, "Where: VNI := 0-16777215\n");
fprintf(f, "   ADDR := { IP_ADDRESS | any }\n");
@@ -72,6 +72,7 @@ static int vxlan_parse_opt(struct link_util *lu, int argc, char **argv,
__u8 udp6zerocsumrx = 0;
__u8 remcsumtx = 0;
__u8 remcsumrx = 0;
+   __u8 metadata = 0;
__u8 gbp = 0;
int dst_port_set = 0;
struct ifla_vxlan_port_range range = { 0, 0 };
@@ -210,6 +211,10 @@ static int vxlan_parse_opt(struct link_util *lu, int argc, char **argv,
remcsumrx = 1;
} else if (!matches(*argv, "noremcsumrx")) {
remcsumrx = 0;
+   } else if (!matches(*argv, "external")) {
+   metadata = 1;
+   } else if (!matches(*argv, "noexternal")) {
+   metadata = 0;
} else if (!matches(*argv, "gbp")) {
gbp = 1;
} else if (matches(*argv, "help") == 0) {
@@ -223,7 +228,12 @@ static int vxlan_parse_opt(struct link_util *lu, int argc, char **argv,
argc--, argv++;
}
 
-   if (!vni_set) {
+   if (metadata && vni_set) {
+   fprintf(stderr, "vxlan: both 'external' and vni cannot be specified\n");
+   return -1;
+   }
+
+   if (!metadata && !vni_set) {
fprintf(stderr, "vxlan: missing virtual network identifier\n");
return -1;
}
@@ -272,6 +282,7 @@ static int vxlan_parse_opt(struct link_util *lu, int argc, char **argv,
addattr8(n, 1024, IFLA_VXLAN_UDP_ZERO_CSUM6_RX, udp6zerocsumrx);
addattr8(n, 1024, IFLA_VXLAN_REMCSUM_TX, remcsumtx);
addattr8(n, 1024, IFLA_VXLAN_REMCSUM_RX, remcsumrx);
+   addattr8(n, 1024, IFLA_VXLAN_COLLECT_METADATA, metadata);
 
if (noage)
addattr32(n, 1024, IFLA_VXLAN_AGEING, 0);
@@ -428,6 +439,10 @@ static void vxlan_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[])
rta_getattr_u8(tb[IFLA_VXLAN_REMCSUM_RX]))
fputs("remcsumrx ", f);
 
+   if (tb[IFLA_VXLAN_COLLECT_METADATA] &&
+   rta_getattr_u8(tb[IFLA_VXLAN_COLLECT_METADATA]))
+   fputs("external ", f);
+
if (tb[IFLA_VXLAN_GBP])
fputs("gbp ", f);
 }
-- 
1.8.3.1
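
The mutual exclusion between 'vni' and 'external' enforced by this patch reduces to the small check sketched below. The helper name vxlan_check_vni() is hypothetical; the messages are the ones used in the diff, and the two flags are assumed to be plain 0/1 values as in vxlan_parse_opt().

```c
#include <assert.h>
#include <stdio.h>

/* Returns 0 when exactly one of 'external' (metadata) and 'vni' was
 * given, -1 otherwise. */
static int vxlan_check_vni(int metadata, int vni_set)
{
	assert(metadata == 0 || metadata == 1);
	assert(vni_set == 0 || vni_set == 1);

	if (metadata && vni_set) {
		fprintf(stderr, "vxlan: both 'external' and vni cannot be specified\n");
		return -1;
	}

	if (!metadata && !vni_set) {
		fprintf(stderr, "vxlan: missing virtual network identifier\n");
		return -1;
	}

	return 0;
}
```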



Re: [PATCH net-next 0/2] Local checksum offload for VXLAN

2015-12-18 Thread Edward Cree
On 17/12/15 18:06, Tom Herbert wrote:
> I'm not sure that we need bits in VXLAN or any other encapsulation. It
> should be sufficient in udp_set_csum that if we already have
> CHECKSUM_PARTIAL that can always be used to do local checksum offload.
My understanding is that otherwise iptunnel_handle_offloads() will do the
 inner checksum in sw, because csum_help will be passed as true.  It will
 call skb_checksum_help().
> This is also should be independent as to whether the device does
> NETIF_F_HW_CSUM or can offload  NETIF_F_IP[V6]_CSUM for encapsulated
> packets.
I was wary of drivers that declare NETIF_F_IP[V6]_CSUM but don't cope with
 encapsulated packets.  Would they do the right thing if the inner_csum bool
 in patch 2 just tested for NETIF_F_CSUM_MASK, or do I need to test things
 like NETIF_F_GSO_UDP_TUNNEL_CSUM?  I'm afraid I don't entirely understand
 the infrastructure here, so I just did the minimal thing I was sure worked,
 i.e. testing for NETIF_F_HW_CSUM.
> It would be nice to have a more formal documentation also. This is a
> very powerful mechanism but the math behind it and requirements are
> subtle.
>
> Tom
What would be a good place to put such documentation?  In
 Documentation/networking, or as part of the big checksums comment at the
 top of skbuff.h?

-ed


[PATCH net] openvswitch: correct encoding of set tunnel action attributes

2015-12-18 Thread Simon Horman
In a set action tunnel attributes should be encoded in a
nested action.

I noticed this because ovs-dpctl was reporting an error
when dumping flows due to the incorrect encoding of tunnel attributes
in a set action.

Fixes: fc4099f17240 ("openvswitch: Fix egress tunnel info.")
Signed-off-by: Simon Horman 

---
* Lightly tested using ovs-dpctl dump-flows
---
 net/openvswitch/flow_netlink.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
index 907d6fd28ede..d1bd4a45ca2d 100644
--- a/net/openvswitch/flow_netlink.c
+++ b/net/openvswitch/flow_netlink.c
@@ -2434,7 +2434,10 @@ static int set_action_to_attr(const struct nlattr *a, struct sk_buff *skb)
if (!start)
return -EMSGSIZE;
 
-   err = ovs_nla_put_tunnel_info(skb, tun_info);
+   err =  ip_tun_to_nlattr(skb, &tun_info->key,
+   ip_tunnel_info_opts(tun_info),
+   tun_info->options_len,
+   ip_tunnel_info_af(tun_info));
if (err)
return err;
nla_nest_end(skb, start);
-- 
2.1.4



Re: [iproute PATCH] ip{,6}tunnel: have a shared stats parser/printer

2015-12-18 Thread Phil Sutter
On Thu, Dec 17, 2015 at 05:14:21PM -0800, Stephen Hemminger wrote:
> I just fixed the sscanf formats and after that this patch caused build error.
> tunnel.c: In function ‘tnl_print_stats’:
> tunnel.c:211:13: error: ‘ptr’ undeclared (first use in this function)
>   if (sscanf(ptr, "%lu%lu%lu%lu%lu%lu%lu%*d%lu%lu%lu%lu%lu%lu%lu",
>  ^
> tunnel.c:211:13: note: each undeclared identifier is reported only once for 
> each function it appears in
> : recipe for target 'tunnel.o' failed

Ah, sorry. I obviously messed up conflict resolution while rebasing this
onto your master. Fixed version follows.

Thanks, Phil


[iproute PATCH v2] ip{,6}tunnel: have a shared stats parser/printer

2015-12-18 Thread Phil Sutter
This has a slight side-effect of not aborting when /proc/net/dev is
malformed, but OTOH stats are not parsed for uninteresting interfaces.

Signed-off-by: Phil Sutter 
---
Changes since v1:
- Fix conflict resolution (sscan from 'buf' instead of 'ptr').
---
 ip/ip6tunnel.c | 21 ++---
 ip/iptunnel.c  | 21 ++---
 ip/tunnel.c| 28 
 ip/tunnel.h|  1 +
 4 files changed, 33 insertions(+), 38 deletions(-)

diff --git a/ip/ip6tunnel.c b/ip/ip6tunnel.c
index 1737d884319ec..7a3cd0461fff1 100644
--- a/ip/ip6tunnel.c
+++ b/ip/ip6tunnel.c
@@ -341,10 +341,6 @@ static int do_tunnels_list(struct ip6_tnl_parm2 *p)
while (fgets(buf, sizeof(buf), fp) != NULL) {
char name[IFNAMSIZ];
int index, type;
-   unsigned long rx_bytes, rx_packets, rx_errs, rx_drops,
-   rx_fifo, rx_frame,
-   tx_bytes, tx_packets, tx_errs, tx_drops,
-   tx_fifo, tx_colls, tx_carrier, rx_multi;
struct ip6_tnl_parm2 p1;
char *ptr;
 
@@ -354,12 +350,6 @@ static int do_tunnels_list(struct ip6_tnl_parm2 *p)
fprintf(stderr, "Wrong format for /proc/net/dev. Giving up.\n");
goto end;
}
-   if (sscanf(ptr, "%lu%lu%lu%lu%lu%lu%lu%*d%lu%lu%lu%lu%lu%lu%lu",
-  &rx_bytes, &rx_packets, &rx_errs, &rx_drops,
-  &rx_fifo, &rx_frame, &rx_multi,
-  &tx_bytes, &tx_packets, &tx_errs, &tx_drops,
-  &tx_fifo, &tx_colls, &tx_carrier) != 14)
-   continue;
if (p->name[0] && strcmp(p->name, name))
continue;
index = ll_name_to_index(name);
@@ -385,15 +375,8 @@ static int do_tunnels_list(struct ip6_tnl_parm2 *p)
if (!ip6_tnl_parm_match(p, &p1))
continue;
print_tunnel(&p1);
-   if (show_stats) {
-   printf("%s", _SL_);
-   printf("RX: PacketsBytesErrors CsumErrs OutOfSeq Mcasts%s", _SL_);
-   printf("%-10ld %-12ld %-6ld %-8ld %-8ld %-8ld%s",
-  rx_packets, rx_bytes, rx_errs, rx_frame, rx_fifo, rx_multi, _SL_);
-   printf("TX: PacketsBytesErrors DeadLoop NoRoute  NoBufs%s", _SL_);
-   printf("%-10ld %-12ld %-6ld %-8ld %-8ld %-6ld",
-  tx_packets, tx_bytes, tx_errs, tx_colls, tx_carrier, tx_drops);
-   }
+   if (show_stats)
+   tnl_print_stats(ptr);
printf("\n");
}
err = 0;
diff --git a/ip/iptunnel.c b/ip/iptunnel.c
index a3ff99bd87eb8..65a4e6e9c1a5a 100644
--- a/ip/iptunnel.c
+++ b/ip/iptunnel.c
@@ -405,10 +405,6 @@ static int do_tunnels_list(struct ip_tunnel_parm *p)
while (fgets(buf, sizeof(buf), fp) != NULL) {
char name[IFNAMSIZ];
int index, type;
-   unsigned long rx_bytes, rx_packets, rx_errs, rx_drops,
-   rx_fifo, rx_frame,
-   tx_bytes, tx_packets, tx_errs, tx_drops,
-   tx_fifo, tx_colls, tx_carrier, rx_multi;
struct ip_tunnel_parm p1;
char *ptr;
 
@@ -419,12 +415,6 @@ static int do_tunnels_list(struct ip_tunnel_parm *p)
fprintf(stderr, "Wrong format for /proc/net/dev. Giving up.\n");
goto end;
}
-   if (sscanf(ptr, "%lu%lu%lu%lu%lu%lu%lu%*d%lu%lu%lu%lu%lu%lu%lu",
-  &rx_bytes, &rx_packets, &rx_errs, &rx_drops,
-  &rx_fifo, &rx_frame, &rx_multi,
-  &tx_bytes, &tx_packets, &tx_errs, &tx_drops,
-  &tx_fifo, &tx_colls, &tx_carrier) != 14)
-   continue;
if (p->name[0] && strcmp(p->name, name))
continue;
index = ll_name_to_index(name);
@@ -447,15 +437,8 @@ static int do_tunnels_list(struct ip_tunnel_parm *p)
(p->i_key && p1.i_key != p->i_key))
continue;
print_tunnel(&p1);
-   if (show_stats) {
-   printf("%s", _SL_);
-   printf("RX: PacketsBytesErrors CsumErrs OutOfSeq Mcasts%s", _SL_);
-   printf("%-10ld %-12ld %-6ld %-8ld %-8ld %-8ld%s",
-  rx_packets, rx_bytes, rx_errs, rx_frame, rx_fifo, rx_multi, _SL_);
-   printf("TX: PacketsBytesErrors DeadLoop NoRoute  NoBufs%s", _SL_);
-   printf("%-10ld %-12ld %-6ld %-8ld %-8ld %-6ld",
-  tx_packets, tx_bytes, tx_errs, tx_colls, tx_carrier, tx_drops);
-   }
+   if 
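
The parsing step that both tunnel listers shared before this patch can be sketched as a standalone helper. The body of tnl_print_stats() is not shown in this excerpt, so the code below is an assumption about its shape: parse_tnl_stats() and struct tnl_stats are hypothetical names, and the format string and field order are copied from the removed lines rather than checked against the /proc/net/dev column layout.

```c
#include <assert.h>
#include <stdio.h>

/* Counters in the order the removed sscanf() call stored them. */
struct tnl_stats {
	unsigned long rx_bytes, rx_packets, rx_errs, rx_drops,
		      rx_fifo, rx_frame, rx_multi,
		      tx_bytes, tx_packets, tx_errs, tx_drops,
		      tx_fifo, tx_colls, tx_carrier;
};

/* Parse the counters that follow "name:" on a /proc/net/dev line.
 * Returns 0 on success and -1 if fewer than 14 fields converted;
 * the eighth input field is read and discarded by "%*d". */
static int parse_tnl_stats(const char *buf, struct tnl_stats *s)
{
	int n;

	assert(buf != NULL && s != NULL);

	n = sscanf(buf, "%lu%lu%lu%lu%lu%lu%lu%*d%lu%lu%lu%lu%lu%lu%lu",
		   &s->rx_bytes, &s->rx_packets, &s->rx_errs, &s->rx_drops,
		   &s->rx_fifo, &s->rx_frame, &s->rx_multi,
		   &s->tx_bytes, &s->tx_packets, &s->tx_errs, &s->tx_drops,
		   &s->tx_fifo, &s->tx_colls, &s->tx_carrier);

	return n == 14 ? 0 : -1;
}
```

A caller that only wants statistics for the interfaces it prints can invoke such a helper after the name filter, which matches the side effect described in the commit message: a malformed counter line no longer causes the interface to be skipped before it is matched.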

Re: [PATCH] net: phy: adds backplane driver for Freescale's PCS PHY

2015-12-18 Thread Andrew Lunn
On Fri, Dec 18, 2015 at 05:30:54PM +0800, shh@gmail.com wrote:
> +static int fsl_backplane_probe(struct phy_device *phydev)
> +{
> + struct fsl_xgkr_inst *xgkr_inst;
> + struct device_node *child, *parent, *lane_node;
> + const char *lane_name;
> + int len;
> + int ret;
> + u32 mode;
> + u32 lane[2];
> +
> + child = phydev->dev.of_node;
> + parent = of_get_parent(child);
> + if (!parent) {
> + dev_err(&phydev->dev, "could not get parent node");
> + return 0;
> + }
> +
> + lane_name = of_get_property(parent, "lane-instance", &len);
> + if (!lane_name)
> + return 0;
> +
> + if (strstr(lane_name, "1000BASE-KX"))
> + mode = BACKPLANE_1G_KX;
> + else
> + mode = BACKPLANE_10G_KR;
> +
> + lane_node = of_parse_phandle(child, "lane-handle", 0);


Hi Shaohui

You are missing the device tree binding Documentation.

Parent will be the mdio bus device and you want 'lane-instance' and
'lane-handle' properties to be in the mdio bus node?

 Andrew


Re: [PATCH] net: phy: adds backplane driver for Freescale's PCS PHY

2015-12-18 Thread kbuild test robot
Hi Shaohui,

[auto build test ERROR on net/master]
[also build test ERROR on v4.4-rc5 next-20151217]

url:
https://github.com/0day-ci/linux/commits/shh-xie-gmail-com/net-phy-adds-backplane-driver-for-Freescale-s-PCS-PHY/20151218-181424
config: um-allmodconfig (attached as .config)
reproduce:
# save the attached .config to linux build tree
make ARCH=um 

All errors (new ones prefixed by >>):

   drivers/net/phy/fsl_backplane.c: In function 'fsl_backplane_remove':
>> drivers/net/phy/fsl_backplane.c:1154:3: error: implicit declaration of function 'iounmap' [-Werror=implicit-function-declaration]
  iounmap(xgkr_inst->reg_base);
  ^
   cc1: some warnings being treated as errors

vim +/iounmap +1154 drivers/net/phy/fsl_backplane.c

  1148  {
  1149  struct fsl_xgkr_inst *xgkr_inst = (struct fsl_xgkr_inst *)phydev->priv;
  1150  
  1151  cancel_delayed_work_sync(&xgkr_inst->xgkr_wk);
  1152  
  1153  if (xgkr_inst->reg_base)
> 1154  iounmap(xgkr_inst->reg_base);
  1155  
  1156  kfree(xgkr_inst);
  1157  }

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation




Re: [PATCH iproute2] iplink: support 'stable-privacy' IPv6 addrgenmode

2015-12-18 Thread Bjørn Mork
Stephen Hemminger  writes:
> On Wed, 16 Dec 2015 16:15:14 +0100
> Bjørn Mork  wrote:
>
>> Signed-off-by: Bjørn Mork 
>
> Does not apply to current code base. Probably because of Hannes recent 
> changes.

Yes, I saw that you applied Hannes' ipaddress.c patch, which conflicts
with this one.  No problem, either way is fine by me.  I'll redo the man
page thing as part of the next round for "addrgenmode random", if/when
that is applied to the kernel.

BTW, I didn't understand your comment regarding Hannes' iplink.c patch.
If the ipaddress patch is OK, then the iplink patch is just the part
missing for show/set symmetry.  Or did I misunderstand something?


Bjørn



Re: [PATCH net-next] tcp: diag: add support for request sockets to tcp_abort()

2015-12-18 Thread Eric Dumazet
On Fri, 2015-12-18 at 17:38 +0900, Lorenzo Colitti wrote:
> On Fri, Dec 18, 2015 at 9:14 AM, Eric Dumazet  wrote:
> > Adding support for SYN_RECV request sockets to tcp_abort()
> > is quite easy after our tcp listener rewrite.
> 
> I added test coverage for this to our tests.
> 
> Without this patch, attempting to destroy an SYN_RECV socket using
> SOCK_DESTROY results in EOPNOTSUPP. With this patch, SOCK_DESTROY
> succeeds, and after it does, sock_diag reports no child sockets.
> 
> Tested-by: Lorenzo Colitti 

I am curious, did you use packetdrill for this ?

I was about to write a packetdrill test as well ;)

Thanks !




Re: Instantaneous Threshold ECN marking for DCTCP

2015-12-18 Thread Florian Westphal
Bryce Cronkite-Ratcliff  wrote:
> Is there a way to achieve this simple threshold-ECN marking AQM with
> tc, or another approach?

Use codel or fq-codel with 'ce_threshold' added by Eric Dumazet, see

https://git.kernel.org/cgit/linux/kernel/git/shemminger/iproute2.git/commit/?id=df1c7d9138eafd5b96e81040b0c1475b6d73d158
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=80ba92fa1a92dea128283f69f55b02242e213650


Re: rhashtable: Kill harmless RCU warning in rhashtable_walk_init

2015-12-18 Thread Eric Dumazet
On Fri, 2015-12-18 at 14:24 +0800, Herbert Xu wrote:
> On Fri, Dec 18, 2015 at 01:34:16PM +0800, Herbert Xu wrote:
> > On Fri, Dec 18, 2015 at 09:39:22AM +0800, kernel test robot wrote:
> > > FYI, we noticed the below changes on
> > > 
> > > https://github.com/0day-ci/linux 
> > > Herbert-Xu/rhashtable-Fix-walker-list-corruption/20151216-164833
> > > commit f9f51b8070be3e829100614a7372b219723b864f ("rhashtable: Fix walker 
> > > list corruption")
> > > 
> > > [8.933376] ===
> > > [8.933376] ===
> > > [8.934629] [ INFO: suspicious RCU usage. ]
> > > [8.934629] [ INFO: suspicious RCU usage. ]
> > > [8.935941] 4.4.0-rc3-00995-gf9f51b8 #2 Not tainted
> > > [8.935941] 4.4.0-rc3-00995-gf9f51b8 #2 Not tainted
> > > [8.937494] ---
> > > [8.937494] ---
> > > [8.938818] lib/rhashtable.c:504 suspicious 
> > > rcu_dereference_protected() usage!
> > > [8.938818] lib/rhashtable.c:504 suspicious 
> > > rcu_dereference_protected() usage!
> > 
> > This is actually a false positive because the new spin lock that
> > we hold prevents ht->tbl from disappearing under us.  So here is
> > a patch to kill the warning with a comment.
> 
> Resent with a proper patch subject and reported-by.
> 
> ---8<---
> The commit f9f51b8070be3e829100614a7372b219723b864f ("rhashtable:
> Fix walker list corruption") causes a suspicious RCU usage warning
> because we no longer hold ht->mutex when we dereference ht->tbl.
> 
> However, this is a false positive because we now hold ht->lock
> which also guarantees that ht->tbl won't disappear from under us.
> 
> This patch kills the warning by using rcu_dereference_raw and
> adding a comment.
> 
> Reported-by: kernel test robot 
> Signed-off-by: Herbert Xu 
> 
> diff --git a/lib/rhashtable.c b/lib/rhashtable.c
> index eb9240c..3404b06 100644
> --- a/lib/rhashtable.c
> +++ b/lib/rhashtable.c
> @@ -519,7 +519,11 @@ int rhashtable_walk_init(struct rhashtable *ht, struct 
> rhashtable_iter *iter)
>   return -ENOMEM;
>  
>   spin_lock(&ht->lock);
> - iter->walker->tbl = rht_dereference(ht->tbl, ht);
> + /* We do not need RCU protection because we hold ht->lock
> +  * which guarantees that if we see ht->tbl then it won't
> +  * die on us.
> +  */
> + iter->walker->tbl = rcu_dereference_raw(ht->tbl);

You can avoid the comment by using the self documented and lockdep
enabled primitive

iter->walker->tbl = rcu_dereference_protected(ht->tbl,
  lockdep_is_held(&ht->lock));

But, storing the ht->tbl and then releasing the lock immediately after
escapes RCU protection.

So why do we store ht->tbl in the first place ?

What exactly prevents it from disappearing after lock is released ?





Re: rhashtable: Kill harmless RCU warning in rhashtable_walk_init

2015-12-18 Thread Herbert Xu
On Fri, Dec 18, 2015 at 04:54:14AM -0800, Eric Dumazet wrote:
>
> You can avoid the comment by using the self documented and lockdep
> enabled primitive
> 
> iter->walker->tbl = rcu_dereference_protected(ht->tbl,
> lockdep_is_held(&ht->lock));

That is just gross.  I think a comment is much better in this case.

If we were to have more places where ht->lock is taken and we had
to do the RCU dereference on ht->tbl then we could add a helper
for it.  For now it's just a single place and I think a comment
is the best way to deal with it.

> But, storing the ht->tbl and then releasing the lock immediately after
> escapes RCU protection.
> 
> So why do we store ht->tbl in the first place ?
> 
> What exactly prevents it from disappearing after lock is released ?

We add ourselves to the walker list before we release the lock.
The only entity that can destroy ht->tbl will take care of all
walkers before doing so.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH 1/1] bonding: delay up state without speed and duplex in 802.3ad mode

2015-12-18 Thread Sergei Shtylyov

Hello.

On 12/18/2015 7:36 AM, zyjzyj2...@gmail.com wrote:


From: yzhu1 

In 802.3ad mode, the speed and duplex are needed. But in some NICs,
there is a time span between NIC up state and getting speed and duplex.
As such, sometimes a slave in 802.3ad mode is in up state without
speed and duplex. This prevents bonding in 802.3ad mode from
working well.

To make bonding driver robust and compatible with more NICs, it is
necessary to delay the up state without speed and duplex in 802.3ad
mode.

Signed-off-by: yzhu1 
---
  drivers/net/bonding/bond_main.c |   34 ++
  1 file changed, 34 insertions(+)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 9e0f8a7..a1d8708 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -419,6 +419,35 @@ const char *bond_slave_link_status(s8 link)
}
  }

+/* This function is to check the speed and duplex of a NIC.
+ * Since the speed and duplex of a slave device are very
+ * important to the bonding in the 802.3ad mode. As such,
+ * it is necessary to check the speed and duplex of a slave
+ * device in 802.3ad mode.
+ *
+ * speed != SPEED_UNKNOWN and duplex == DUPLEX_FULL  :  1
+ *   others  :  0
+ */
+static int __check_speed_duplex(struct net_device *netdev)
+{
+   struct ethtool_cmd ecmd;
+   u32 slave_speed = SPEED_UNKNOWN;
+   int res;
+
+   res = __ethtool_get_settings(netdev, &ecmd);
+   if (res < 0)
+   return 0;
+
+   slave_speed = ethtool_cmd_speed(&ecmd);
+   if (slave_speed == 0 || slave_speed == ((__u32) -1))
+   return 0;
+
+   if (DUPLEX_FULL != ecmd.duplex)


   Please place the immediate operand to the right of the != operator.

[...]

MBR, Sergei
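
The review comment above asks for the variable on the left and the constant on the right of the comparison. A minimal userspace sketch of the same check follows. It assumes a simplified, hypothetical struct link_settings in place of struct ethtool_cmd, and local stand-in constants whose values are assumed to mirror <linux/ethtool.h>; it is not the bonding driver's code.

```c
#include <stdint.h>

/* Local stand-ins for the kernel constants (values assumed to match
 * <linux/ethtool.h>); defined here only so the sketch is self-contained. */
#define SPEED_UNKNOWN ((uint32_t)-1)
#define DUPLEX_HALF   0x00
#define DUPLEX_FULL   0x01

/* Hypothetical, simplified view of a slave's link settings. */
struct link_settings {
	uint32_t speed;
	uint8_t duplex;
};

/* Returns 1 when the speed is known and the duplex is full, 0 otherwise. */
static int speed_duplex_ok(const struct link_settings *ls)
{
	if (ls->speed == 0 || ls->speed == SPEED_UNKNOWN)
		return 0;

	/* Variable on the left, constant on the right. */
	if (ls->duplex != DUPLEX_FULL)
		return 0;

	return 1;
}
```

With this ordering the condition reads as "duplex is not full", which matches the usual kernel coding style.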



[PATCH] netcp: fix regression in receive processing

2015-12-18 Thread Arnd Bergmann
A cleanup patch I did was unfortunately wrong and introduced
multiple serious bugs in the netcp rx processing, as indicated
by these correct gcc warnings:

drivers/net/ethernet/ti/netcp_core.c:776:14: warning: 'buf_ptr' may be used 
uninitialized in this function [-Wuninitialized]
drivers/net/ethernet/ti/netcp_core.c:687:14: warning: 'ptr' may be used 
uninitialized in this function [-Wuninitialized]

I have checked the patch once more and found that a call to
get_pkt_info() accidentally got removed in netcp_free_rx_desc_chain,
and netcp_process_one_rx_packet no longer retrieved the correct
buffer length. This patch should fix all the known problems,
but I did not test on real hardware.

Signed-off-by: Arnd Bergmann 
Fixes: 899077791403 ("netcp: try to reduce type confusion in descriptors")

diff --git a/drivers/net/ethernet/ti/netcp_core.c 
b/drivers/net/ethernet/ti/netcp_core.c
index 92d08eb262c2..c61d66d38634 100644
--- a/drivers/net/ethernet/ti/netcp_core.c
+++ b/drivers/net/ethernet/ti/netcp_core.c
@@ -582,6 +582,7 @@ static void netcp_free_rx_desc_chain(struct netcp_intf 
*netcp,
unsigned int buf_len, dma_sz = sizeof(*ndesc);
void *buf_ptr;
u32 pad[2];
+   u32 tmp;
 
get_words(_desc, 1, >next_desc);
 
@@ -591,6 +592,7 @@ static void netcp_free_rx_desc_chain(struct netcp_intf 
*netcp,
dev_err(netcp->ndev_dev, "failed to unmap Rx desc\n");
break;
}
+   get_pkt_info(_buf, , _desc, ndesc);
get_pad_ptr(_ptr, ndesc);
dma_unmap_page(netcp->dev, dma_buf, PAGE_SIZE, DMA_FROM_DEVICE);
__free_page(buf_ptr);
@@ -637,6 +639,7 @@ static int netcp_process_one_rx_packet(struct netcp_intf 
*netcp)
dma_addr_t dma_desc, dma_buff;
struct netcp_packet p_info;
struct sk_buff *skb;
+   u32 pad[2];
void *org_buf_ptr;
 
dma_desc = knav_queue_pop(netcp->rx_queue, _sz);
@@ -650,7 +653,8 @@ static int netcp_process_one_rx_packet(struct netcp_intf 
*netcp)
}
 
get_pkt_info(_buff, _len, _desc, desc);
-   get_pad_ptr(_buf_ptr, desc);
+   get_pad_info([0], [1], _buf_len, desc);
+   org_buf_ptr = (void *)(uintptr_t)(pad[0] + ((u64)pad[1] << 32));
 
if (unlikely(!org_buf_ptr)) {
dev_err(netcp->ndev_dev, "NULL bufptr in desc\n");
@@ -684,7 +688,7 @@ static int netcp_process_one_rx_packet(struct netcp_intf 
*netcp)
}
 
get_pkt_info(_buff, _len, _desc, ndesc);
-   get_pad_ptr(ptr, ndesc);
+   get_pad_ptr(, ndesc);
page = ptr;
 
if (likely(dma_buff && buf_len && page)) {
@@ -773,7 +777,7 @@ static void netcp_free_rx_buf(struct netcp_intf *netcp, int 
fdq)
}
 
get_org_pkt_info(, _len, desc);
-   get_pad_ptr(buf_ptr, desc);
+   get_pad_ptr(_ptr, desc);
 
if (unlikely(!dma)) {
dev_err(netcp->ndev_dev, "NULL orig_buff in desc\n");
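
The patch rebuilds org_buf_ptr from two 32-bit pad words. As a rough illustration of that round trip, here is a userspace sketch that splits a pointer into low and high words and joins them again. The helper names ptr_to_pad and pad_to_ptr are hypothetical and do not exist in the netcp driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store a pointer as low word pad[0], high word pad[1]. */
static void ptr_to_pad(void *ptr, uint32_t pad[2])
{
	uint64_t v = (uint64_t)(uintptr_t)ptr;

	pad[0] = (uint32_t)(v & 0xffffffffu);
	pad[1] = (uint32_t)(v >> 32);
}

/* Hypothetical helper: rebuild the pointer from the two words.
 * The high word is widened to 64 bits before the shift; shifting a
 * 32-bit value by 32 would be undefined behaviour. */
static void *pad_to_ptr(const uint32_t pad[2])
{
	return (void *)(uintptr_t)(pad[0] + ((uint64_t)pad[1] << 32));
}
```

On a 32-bit build the high word is simply zero, so the same expression works for both pointer sizes.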



[PATCH v2 0/2] net: usb: cdc_ncm: Adding support for two new Dell devices

2015-12-18 Thread Daniele Palmas
This patch series adds support in the cdc_ncm driver for two devices
based on the same platform that differ only in carrier
customization.

V2: Added comment for highlighting FLAG_NOARP usage for those devices

Daniele Palmas (2):
  net: usb: cdc_ncm: Adding Dell DW5812 LTE Verizon Mobile Broadband
Card
  net: usb: cdc_ncm: Adding Dell DW5813 LTE AT Mobile Broadband Card

 drivers/net/usb/cdc_ncm.c | 18 ++
 1 file changed, 18 insertions(+)

-- 
1.9.1



[PATCH v2 1/2] net: usb: cdc_ncm: Adding Dell DW5812 LTE Verizon Mobile Broadband Card

2015-12-18 Thread Daniele Palmas
Unlike DW5550, Dell DW5812 is a mobile broadband card with no ARP
capabilities: this patch makes the device use the wwan_noarp_info struct.

Signed-off-by: Daniele Palmas 
---
 drivers/net/usb/cdc_ncm.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c
index b45e5ca..7be9b73 100644
--- a/drivers/net/usb/cdc_ncm.c
+++ b/drivers/net/usb/cdc_ncm.c
@@ -1593,6 +1593,15 @@ static const struct usb_device_id cdc_devs[] = {
  .driver_info = (unsigned long) _info,
},
 
+   /* DW5812 LTE Verizon Mobile Broadband Card
+* Unlike DW5550 this device requires FLAG_NOARP
+*/
+   { USB_DEVICE_AND_INTERFACE_INFO(0x413c, 0x81bb,
+   USB_CLASS_COMM,
+   USB_CDC_SUBCLASS_NCM, USB_CDC_PROTO_NONE),
+ .driver_info = (unsigned long)&wwan_noarp_info,
+   },
+
/* Dell branded MBM devices like DW5550 */
{ .match_flags = USB_DEVICE_ID_MATCH_INT_INFO
| USB_DEVICE_ID_MATCH_VENDOR,
-- 
1.9.1



[PATCH v2 2/2] net: usb: cdc_ncm: Adding Dell DW5813 LTE AT Mobile Broadband Card

2015-12-18 Thread Daniele Palmas
Unlike DW5550, Dell DW5813 is a mobile broadband card with no ARP
capabilities: this patch makes the device use the wwan_noarp_info struct.

Signed-off-by: Daniele Palmas 
---
 drivers/net/usb/cdc_ncm.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c
index 7be9b73..89536d9 100644
--- a/drivers/net/usb/cdc_ncm.c
+++ b/drivers/net/usb/cdc_ncm.c
@@ -1602,6 +1602,15 @@ static const struct usb_device_id cdc_devs[] = {
  .driver_info = (unsigned long)&wwan_noarp_info,
},
 
+   /* DW5813 LTE AT Mobile Broadband Card
+* Unlike DW5550 this device requires FLAG_NOARP
+*/
+   { USB_DEVICE_AND_INTERFACE_INFO(0x413c, 0x81bc,
+   USB_CLASS_COMM,
+   USB_CDC_SUBCLASS_NCM, USB_CDC_PROTO_NONE),
+ .driver_info = (unsigned long)&wwan_noarp_info,
+   },
+
/* Dell branded MBM devices like DW5550 */
{ .match_flags = USB_DEVICE_ID_MATCH_INT_INFO
| USB_DEVICE_ID_MATCH_VENDOR,
-- 
1.9.1



Re: [PATCH 2/2] [iproute2] tc/q_htb.c: rename b4 buffer to b3 to make its name more consistent

2015-12-18 Thread Jesper Dangaard Brouer
On Fri, 18 Dec 2015 16:16:39 +0300
Dmitrii Shcherbakov  wrote:

> b3 buffer has been deleted previously so b2 is followed by b4 which is not 
> consistent
> 
> Signed-off-by: Dmitrii Shcherbakov 
> ---

Acked-by: Jesper Dangaard Brouer 

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer


[PATCH 11/14] openvswitch: use list_for_each_entry

2015-12-18 Thread Geliang Tang
Use list_for_each_entry() instead of list_for_each() to simplify
the code.

Signed-off-by: Geliang Tang 
---
 net/openvswitch/flow_table.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index d073fff..9b5999ba 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -651,11 +651,9 @@ static bool mask_equal(const struct sw_flow_mask *a,
 static struct sw_flow_mask *flow_mask_find(const struct flow_table *tbl,
   const struct sw_flow_mask *mask)
 {
-   struct list_head *ml;
+   struct sw_flow_mask *m;
 
-   list_for_each(ml, &tbl->mask_list) {
-   struct sw_flow_mask *m;
-   m = container_of(ml, struct sw_flow_mask, list);
+   list_for_each_entry(m, &tbl->mask_list, list) {
if (mask_equal(mask, m))
return m;
}
-- 
2.5.0
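
For readers unfamiliar with the two iteration styles this patch swaps, the sketch below contrasts them in plain userspace C. The list helpers are simplified stand-ins written for this illustration (they assume the GCC/Clang __typeof__ extension) and are not the kernel's <linux/list.h>; struct mask is a hypothetical entry type standing in for struct sw_flow_mask.

```c
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel list helpers. */
struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define list_entry(ptr, type, member) container_of(ptr, type, member)

/* Iterates over raw list_head cursors. */
#define list_for_each(pos, head) \
	for (pos = (head)->next; pos != (head); pos = pos->next)

/* Iterates over typed entries; relies on the __typeof__ extension. */
#define list_for_each_entry(pos, head, member)				\
	for (pos = list_entry((head)->next, __typeof__(*pos), member);	\
	     &pos->member != (head);					\
	     pos = list_entry(pos->member.next, __typeof__(*pos), member))

static void list_init(struct list_head *h)
{
	h->next = h->prev = h;
}

static void list_add_tail(struct list_head *new, struct list_head *head)
{
	new->prev = head->prev;
	new->next = head;
	head->prev->next = new;
	head->prev = new;
}

/* Hypothetical entry type for the illustration. */
struct mask {
	int key;
	struct list_head list;
};

/* Old style: walk list_head cursors and convert each one by hand. */
static struct mask *find_old(struct list_head *head, int key)
{
	struct list_head *ml;

	list_for_each(ml, head) {
		struct mask *m = container_of(ml, struct mask, list);

		if (m->key == key)
			return m;
	}
	return NULL;
}

/* New style: the macro hands back typed entries directly. */
static struct mask *find_new(struct list_head *head, int key)
{
	struct mask *m;

	list_for_each_entry(m, head, list) {
		if (m->key == key)
			return m;
	}
	return NULL;
}
```

Both functions return the same entry for the same key; the second needs neither the extra cursor variable nor the explicit container_of() call.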




[PATCH 13/14] sunrpc: use list_for_each_entry_safe

2015-12-18 Thread Geliang Tang
Use list_for_each_entry_safe() instead of list_for_each_safe() to
simplify the code.

Signed-off-by: Geliang Tang 
---
 net/sunrpc/svc_xprt.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index a6cbb21..fe4f628 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -904,8 +904,7 @@ out:
 static void svc_age_temp_xprts(unsigned long closure)
 {
struct svc_serv *serv = (struct svc_serv *)closure;
-   struct svc_xprt *xprt;
-   struct list_head *le, *next;
+   struct svc_xprt *xprt, *next;
 
dprintk("svc_age_temp_xprts\n");
 
@@ -916,9 +915,7 @@ static void svc_age_temp_xprts(unsigned long closure)
return;
}
 
-   list_for_each_safe(le, next, &serv->sv_tempsocks) {
-   xprt = list_entry(le, struct svc_xprt, xpt_list);
-
+   list_for_each_entry_safe(xprt, next, &serv->sv_tempsocks, xpt_list) {
/* First time through, just mark it OLD. Second time
 * through, close it. */
if (!test_and_set_bit(XPT_OLD, &xprt->xpt_flags))
@@ -926,7 +923,7 @@ static void svc_age_temp_xprts(unsigned long closure)
if (atomic_read(&xprt->xpt_ref.refcount) > 1 ||
test_bit(XPT_BUSY, &xprt->xpt_flags))
continue;
-   list_del_init(le);
+   list_del_init(&xprt->xpt_list);
set_bit(XPT_CLOSE, &xprt->xpt_flags);
dprintk("queuing xprt %p for closing\n", xprt);
 
-- 
2.5.0
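
The "_safe" variant matters because the loop body removes the current entry. The following userspace sketch shows the idea: the macro caches the next entry before the body runs, so deleting and freeing the current one does not break the walk. The list helpers are simplified stand-ins written for this illustration (assuming the GCC/Clang __typeof__ extension), and struct item, item_add and prune_stale are hypothetical names, not sunrpc code.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the kernel list helpers. */
struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define list_entry(ptr, type, member) container_of(ptr, type, member)

/* Keeps the next entry in n so the body may delete pos. */
#define list_for_each_entry_safe(pos, n, head, member)			\
	for (pos = list_entry((head)->next, __typeof__(*pos), member),	\
	     n = list_entry(pos->member.next, __typeof__(*pos), member); \
	     &pos->member != (head);					\
	     pos = n, n = list_entry(n->member.next, __typeof__(*n), member))

static void list_init(struct list_head *h)
{
	h->next = h->prev = h;
}

static void list_add_tail(struct list_head *new, struct list_head *head)
{
	new->prev = head->prev;
	new->next = head;
	head->prev->next = new;
	head->prev = new;
}

static void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* Hypothetical entry type for the illustration. */
struct item {
	int id;
	int stale;
	struct list_head list;
};

/* Allocate an item and append it; returns NULL on allocation failure. */
static struct item *item_add(struct list_head *head, int id, int stale)
{
	struct item *it = malloc(sizeof(*it));

	if (!it)
		return NULL;
	it->id = id;
	it->stale = stale;
	list_add_tail(&it->list, head);
	return it;
}

/* Unlink and free every stale item; returns how many were removed. */
static int prune_stale(struct list_head *head)
{
	struct item *it, *n;
	int removed = 0;

	list_for_each_entry_safe(it, n, head, list) {
		if (!it->stale)
			continue;
		list_del(&it->list);
		free(it);
		removed++;
	}
	return removed;
}
```

Note that n has the entry type rather than struct list_head *, which is why the patch changes the declaration of the "next" cursor as well.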




[PATCH 03/14] ipv4, ipv6: use list_for_each_entry*

2015-12-18 Thread Geliang Tang
Use list_for_each_entry*() instead of list_for_each*() to simplify
the code.

Signed-off-by: Geliang Tang 
---
 net/ipv4/af_inet.c| 6 ++
 net/ipv4/tcp_output.c | 6 ++
 net/ipv6/addrconf.c   | 8 +++-
 net/ipv6/af_inet6.c   | 7 ++-
 4 files changed, 9 insertions(+), 18 deletions(-)

diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 11c4ca1..bb11ec1 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1028,7 +1028,6 @@ static struct inet_protosw inetsw_array[] =
 
 void inet_register_protosw(struct inet_protosw *p)
 {
-   struct list_head *lh;
struct inet_protosw *answer;
int protocol = p->protocol;
struct list_head *last_perm;
@@ -1040,14 +1039,13 @@ void inet_register_protosw(struct inet_protosw *p)
 
/* If we are trying to override a permanent protocol, bail. */
last_perm = &inetsw[p->type];
-   list_for_each(lh, &inetsw[p->type]) {
-   answer = list_entry(lh, struct inet_protosw, list);
+   list_for_each_entry(answer, &inetsw[p->type], list) {
/* Check only the non-wild match. */
if ((INET_PROTOSW_PERMANENT & answer->flags) == 0)
break;
if (protocol == answer->protocol)
goto out_permanent;
-   last_perm = lh;
+   last_perm = &answer->list;
}
 
/* Add the new entry after the last permanent entry if any, so that
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index a800cee..8810694 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -751,16 +751,14 @@ static void tcp_tasklet_func(unsigned long data)
struct tsq_tasklet *tsq = (struct tsq_tasklet *)data;
LIST_HEAD(list);
unsigned long flags;
-   struct list_head *q, *n;
-   struct tcp_sock *tp;
+   struct tcp_sock *tp, *n;
struct sock *sk;
 
local_irq_save(flags);
list_splice_init(&tsq->head, &list);
local_irq_restore(flags);
 
-   list_for_each_safe(q, n, &list) {
-   tp = list_entry(q, struct tcp_sock, tsq_node);
+   list_for_each_entry_safe(tp, n, &list, tsq_node) {
list_del(&tp->tsq_node);
 
sk = (struct sock *)tp;
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 7082fb7..e293647 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -865,21 +865,19 @@ void inet6_ifa_finish_destroy(struct inet6_ifaddr *ifp)
 static void
 ipv6_link_dev_addr(struct inet6_dev *idev, struct inet6_ifaddr *ifp)
 {
-   struct list_head *p;
+   struct inet6_ifaddr *ifa;
int ifp_scope = ipv6_addr_src_scope(&ifp->addr);
 
/*
 * Each device address list is sorted in order of scope -
 * global before linklocal.
 */
-   list_for_each(p, &idev->addr_list) {
-   struct inet6_ifaddr *ifa
-   = list_entry(p, struct inet6_ifaddr, if_list);
+   list_for_each_entry(ifa, &idev->addr_list, if_list) {
if (ifp_scope >= ipv6_addr_src_scope(&ifa->addr))
break;
}
 
-   list_add_tail(&ifp->if_list, p);
+   list_add_tail(&ifp->if_list, &ifa->if_list);
 }
 
 static u32 inet6_addr_hash(const struct in6_addr *addr)
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 8ec0df7..a4fb172 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -568,7 +568,6 @@ static const struct net_proto_family inet6_family_ops = {
 
 int inet6_register_protosw(struct inet_protosw *p)
 {
-   struct list_head *lh;
struct inet_protosw *answer;
struct list_head *last_perm;
int protocol = p->protocol;
@@ -584,14 +583,12 @@ int inet6_register_protosw(struct inet_protosw *p)
answer = NULL;
ret = -EPERM;
last_perm = &inetsw6[p->type];
-   list_for_each(lh, &inetsw6[p->type]) {
-   answer = list_entry(lh, struct inet_protosw, list);
-
+   list_for_each_entry(answer, &inetsw6[p->type], list) {
/* Check only the non-wild match. */
if (INET_PROTOSW_PERMANENT & answer->flags) {
if (protocol == answer->protocol)
break;
-   last_perm = lh;
+   last_perm = &answer->list;
}
 
answer = NULL;
-- 
2.5.0




[PATCH 01/14] Bluetooth: use list_for_each_entry*

2015-12-18 Thread Geliang Tang
Use list_for_each_entry*() instead of list_for_each*() to simplify
the code.

Signed-off-by: Geliang Tang 
---
 net/bluetooth/af_bluetooth.c | 12 ++--
 net/bluetooth/cmtp/capi.c|  8 ++--
 net/bluetooth/hci_core.c |  8 +++-
 net/bluetooth/rfcomm/core.c  | 46 ++--
 4 files changed, 25 insertions(+), 49 deletions(-)

diff --git a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c
index cb4e8d4..955eda9 100644
--- a/net/bluetooth/af_bluetooth.c
+++ b/net/bluetooth/af_bluetooth.c
@@ -174,13 +174,13 @@ EXPORT_SYMBOL(bt_accept_unlink);
 
 struct sock *bt_accept_dequeue(struct sock *parent, struct socket *newsock)
 {
-   struct list_head *p, *n;
+   struct bt_sock *s, *n;
struct sock *sk;
 
BT_DBG("parent %p", parent);
 
-   list_for_each_safe(p, n, &bt_sk(parent)->accept_q) {
-   sk = (struct sock *) list_entry(p, struct bt_sock, accept_q);
+   list_for_each_entry_safe(s, n, &bt_sk(parent)->accept_q, accept_q) {
+   sk = (struct sock *)s;
 
lock_sock(sk);
 
@@ -388,11 +388,11 @@ EXPORT_SYMBOL(bt_sock_stream_recvmsg);
 
 static inline unsigned int bt_accept_poll(struct sock *parent)
 {
-   struct list_head *p, *n;
+   struct bt_sock *s, *n;
struct sock *sk;
 
-   list_for_each_safe(p, n, &bt_sk(parent)->accept_q) {
-   sk = (struct sock *) list_entry(p, struct bt_sock, accept_q);
+   list_for_each_entry_safe(s, n, &bt_sk(parent)->accept_q, accept_q) {
+   sk = (struct sock *)s;
if (sk->sk_state == BT_CONNECTED ||
(test_bit(BT_SK_DEFER_SETUP, &bt_sk(parent)->flags) &&
 sk->sk_state == BT_CONNECT2))
diff --git a/net/bluetooth/cmtp/capi.c b/net/bluetooth/cmtp/capi.c
index 9a503387..46ac686 100644
--- a/net/bluetooth/cmtp/capi.c
+++ b/net/bluetooth/cmtp/capi.c
@@ -100,10 +100,8 @@ static void cmtp_application_del(struct cmtp_session 
*session, struct cmtp_appli
 static struct cmtp_application *cmtp_application_get(struct cmtp_session 
*session, int pattern, __u16 value)
 {
struct cmtp_application *app;
-   struct list_head *p;
 
-   list_for_each(p, &session->applications) {
-   app = list_entry(p, struct cmtp_application, list);
+   list_for_each_entry(app, &session->applications, list) {
switch (pattern) {
case CMTP_MSGNUM:
if (app->msgnum == value)
@@ -511,14 +509,12 @@ static int cmtp_proc_show(struct seq_file *m, void *v)
struct capi_ctr *ctrl = m->private;
struct cmtp_session *session = ctrl->driverdata;
struct cmtp_application *app;
-   struct list_head *p;
 
seq_printf(m, "%s\n\n", cmtp_procinfo(ctrl));
seq_printf(m, "addr %s\n", session->name);
seq_printf(m, "ctrl %d\n", session->num);
 
-   list_for_each(p, &session->applications) {
-   app = list_entry(p, struct cmtp_application, list);
+   list_for_each_entry(app, &session->applications, list) {
seq_printf(m, "appl %d -> %d\n", app->appl, app->mapping);
}
 
diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index 9fb443a..47bcef7 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -2713,12 +2713,10 @@ struct bdaddr_list *hci_bdaddr_list_lookup(struct 
list_head *bdaddr_list,
 
 void hci_bdaddr_list_clear(struct list_head *bdaddr_list)
 {
-   struct list_head *p, *n;
+   struct bdaddr_list *b, *n;
 
-   list_for_each_safe(p, n, bdaddr_list) {
-   struct bdaddr_list *b = list_entry(p, struct bdaddr_list, list);
-
-   list_del(p);
+   list_for_each_entry_safe(b, n, bdaddr_list, list) {
+   list_del(&b->list);
kfree(b);
}
 }
diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
index 29709fb..f7eb02f 100644
--- a/net/bluetooth/rfcomm/core.c
+++ b/net/bluetooth/rfcomm/core.c
@@ -692,11 +692,9 @@ static struct rfcomm_session *rfcomm_session_del(struct 
rfcomm_session *s)
 
 static struct rfcomm_session *rfcomm_session_get(bdaddr_t *src, bdaddr_t *dst)
 {
-   struct rfcomm_session *s;
-   struct list_head *p, *n;
+   struct rfcomm_session *s, *n;
struct l2cap_chan *chan;
-   list_for_each_safe(p, n, &session_list) {
-   s = list_entry(p, struct rfcomm_session, list);
+   list_for_each_entry_safe(s, n, &session_list, list) {
chan = l2cap_pi(s->sock->sk)->chan;
 
if ((!bacmp(src, BDADDR_ANY) || !bacmp(&chan->src, src)) &&
@@ -709,16 +707,14 @@ static struct rfcomm_session *rfcomm_session_get(bdaddr_t 
*src, bdaddr_t *dst)
 static struct rfcomm_session *rfcomm_session_close(struct rfcomm_session *s,
   int err)
 {
-   struct rfcomm_dlc *d;
-   struct list_head *p, *n;
+   struct rfcomm_dlc *d, *n;
 
s->state = BT_CLOSED;
 

[PATCH 14/14] net: pktgen: use list_for_each_entry_safe

2015-12-18 Thread Geliang Tang
Use list_for_each_entry_safe() instead of list_for_each_safe() to
simplify the code.

Signed-off-by: Geliang Tang 
---
 net/core/pktgen.c | 26 --
 1 file changed, 8 insertions(+), 18 deletions(-)

diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index 2be1444..1d8cffb 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -3293,14 +3293,11 @@ static void pktgen_stop(struct pktgen_thread *t)
  */
 static void pktgen_rem_one_if(struct pktgen_thread *t)
 {
-   struct list_head *q, *n;
-   struct pktgen_dev *cur;
+   struct pktgen_dev *cur, *n;
 
func_enter();
 
-   list_for_each_safe(q, n, &t->if_list) {
-   cur = list_entry(q, struct pktgen_dev, list);
-
+   list_for_each_entry_safe(cur, n, &t->if_list, list) {
if (!cur->removal_mark)
continue;
 
@@ -3315,16 +3312,13 @@ static void pktgen_rem_one_if(struct pktgen_thread *t)
 
 static void pktgen_rem_all_ifs(struct pktgen_thread *t)
 {
-   struct list_head *q, *n;
-   struct pktgen_dev *cur;
+   struct pktgen_dev *cur, *n;
 
func_enter();
 
/* Remove all devices, free mem */
 
-   list_for_each_safe(q, n, &t->if_list) {
-   cur = list_entry(q, struct pktgen_dev, list);
-
+   list_for_each_entry_safe(cur, n, &t->if_list, list) {
kfree_skb(cur->skb);
cur->skb = NULL;
 
@@ -3771,12 +3765,10 @@ static int __net_init pktgen_create_thread(int cpu, 
struct pktgen_net *pn)
 static void _rem_dev_from_if_list(struct pktgen_thread *t,
  struct pktgen_dev *pkt_dev)
 {
-   struct list_head *q, *n;
-   struct pktgen_dev *p;
+   struct pktgen_dev *p, *n;
 
if_lock(t);
-   list_for_each_safe(q, n, &t->if_list) {
-   p = list_entry(q, struct pktgen_dev, list);
+   list_for_each_entry_safe(p, n, &t->if_list, list) {
if (p == pkt_dev)
list_del_rcu(&p->list);
}
@@ -3866,8 +3858,7 @@ remove:
 static void __net_exit pg_net_exit(struct net *net)
 {
struct pktgen_net *pn = net_generic(net, pg_net_id);
-   struct pktgen_thread *t;
-   struct list_head *q, *n;
+   struct pktgen_thread *t, *n;
LIST_HEAD(list);
 
/* Stop all interfaces & threads */
@@ -3877,8 +3868,7 @@ static void __net_exit pg_net_exit(struct net *net)
list_splice_init(&pn->pktgen_threads, &list);
mutex_unlock(&pktgen_thread_lock);
 
-   list_for_each_safe(q, n, &list) {
-   t = list_entry(q, struct pktgen_thread, th_list);
+   list_for_each_entry_safe(t, n, &list, th_list) {
list_del(&t->th_list);
kthread_stop(t->tsk);
put_task_struct(t->tsk);
-- 
2.5.0




[PATCH] ila: add NETFILTER dependency

2015-12-18 Thread Arnd Bergmann
The recently added generic ILA translation facility fails to
build when CONFIG_NETFILTER is disabled:

net/ipv6/ila/ila_xlat.c:229:20: warning: 'struct nf_hook_state' declared inside 
parameter list
net/ipv6/ila/ila_xlat.c:235:27: error: array type has incomplete element type 
'struct nf_hook_ops'
 static struct nf_hook_ops ila_nf_hook_ops[] __read_mostly = {

This adds an explicit Kconfig dependency to avoid that case.

Signed-off-by: Arnd Bergmann 
Fixes: 7f00feaf1076 ("ila: Add generic ILA translation facility")

diff --git a/net/ipv6/Kconfig b/net/ipv6/Kconfig
index 983bb999738c..bb7dabe2ebbf 100644
--- a/net/ipv6/Kconfig
+++ b/net/ipv6/Kconfig
@@ -94,6 +94,7 @@ config IPV6_MIP6
 
 config IPV6_ILA
tristate "IPv6: Identifier Locator Addressing (ILA)"
+   depends on NETFILTER
select LWTUNNEL
---help---
  Support for IPv6 Identifier Locator Addressing (ILA).



[PATCH 04/14] x25: use list_for_each_entry*

2015-12-18 Thread Geliang Tang
Use list_for_each_entry*() instead of list_for_each*() to simplify
the code.

Signed-off-by: Geliang Tang 
---
 net/x25/x25_forward.c | 20 ++--
 net/x25/x25_link.c| 23 ++-
 net/x25/x25_route.c   | 29 +++--
 3 files changed, 19 insertions(+), 53 deletions(-)

diff --git a/net/x25/x25_forward.c b/net/x25/x25_forward.c
index cf561f1..4394eb6 100644
--- a/net/x25/x25_forward.c
+++ b/net/x25/x25_forward.c
@@ -24,7 +24,6 @@ int x25_forward_call(struct x25_address *dest_addr, struct 
x25_neigh *from,
 {
struct x25_route *rt;
struct x25_neigh *neigh_new = NULL;
-   struct list_head *entry;
struct x25_forward *x25_frwd, *new_frwd;
struct sk_buff *skbn;
short same_lci = 0;
@@ -51,8 +50,7 @@ int x25_forward_call(struct x25_address *dest_addr, struct 
x25_neigh *from,
 * established LCI? It shouldn't happen, just in case..
 */
read_lock_bh(&x25_forward_list_lock);
-   list_for_each(entry, &x25_forward_list) {
-   x25_frwd = list_entry(entry, struct x25_forward, node);
+   list_for_each_entry(x25_frwd, &x25_forward_list, node) {
if (x25_frwd->lci == lci) {
pr_warn("call request for lci which is already 
registered!, transmitting but not registering new pair\n");
same_lci = 1;
@@ -97,15 +95,13 @@ out_no_route:
 int x25_forward_data(int lci, struct x25_neigh *from, struct sk_buff *skb) {
 
struct x25_forward *frwd;
-   struct list_head *entry;
struct net_device *peer = NULL;
struct x25_neigh *nb;
struct sk_buff *skbn;
int rc = 0;
 
read_lock_bh(_forward_list_lock);
-   list_for_each(entry, _forward_list) {
-   frwd = list_entry(entry, struct x25_forward, node);
+   list_for_each_entry(frwd, _forward_list, node) {
if (frwd->lci == lci) {
/* The call is established, either side can send */
if (from->dev == frwd->dev1) {
@@ -136,13 +132,11 @@ out:
 
 void x25_clear_forward_by_lci(unsigned int lci)
 {
-   struct x25_forward *fwd;
-   struct list_head *entry, *tmp;
+   struct x25_forward *fwd, *tmp;
 
write_lock_bh(_forward_list_lock);
 
-   list_for_each_safe(entry, tmp, _forward_list) {
-   fwd = list_entry(entry, struct x25_forward, node);
+   list_for_each_entry_safe(fwd, tmp, _forward_list, node) {
if (fwd->lci == lci) {
list_del(&fwd->node);
kfree(fwd);
@@ -154,13 +148,11 @@ void x25_clear_forward_by_lci(unsigned int lci)
 
 void x25_clear_forward_by_dev(struct net_device *dev)
 {
-   struct x25_forward *fwd;
-   struct list_head *entry, *tmp;
+   struct x25_forward *fwd, *tmp;
 
write_lock_bh(_forward_list_lock);
 
-   list_for_each_safe(entry, tmp, _forward_list) {
-   fwd = list_entry(entry, struct x25_forward, node);
+   list_for_each_entry_safe(fwd, tmp, _forward_list, node) {
if ((fwd->dev1 == dev) || (fwd->dev2 == dev)){
list_del(&fwd->node);
kfree(fwd);
diff --git a/net/x25/x25_link.c b/net/x25/x25_link.c
index fd5ffb2..61cc8a2 100644
--- a/net/x25/x25_link.c
+++ b/net/x25/x25_link.c
@@ -296,14 +296,11 @@ static void __x25_remove_neigh(struct x25_neigh *nb)
  */
 void x25_link_device_down(struct net_device *dev)
 {
-   struct x25_neigh *nb;
-   struct list_head *entry, *tmp;
+   struct x25_neigh *nb, *tmp;
 
write_lock_bh(_neigh_list_lock);
 
-   list_for_each_safe(entry, tmp, _neigh_list) {
-   nb = list_entry(entry, struct x25_neigh, node);
-
+   list_for_each_entry_safe(nb, tmp, _neigh_list, node) {
if (nb->dev == dev) {
__x25_remove_neigh(nb);
dev_put(dev);
@@ -319,12 +316,9 @@ void x25_link_device_down(struct net_device *dev)
 struct x25_neigh *x25_get_neigh(struct net_device *dev)
 {
struct x25_neigh *nb, *use = NULL;
-   struct list_head *entry;
 
read_lock_bh(_neigh_list_lock);
-   list_for_each(entry, _neigh_list) {
-   nb = list_entry(entry, struct x25_neigh, node);
-
+   list_for_each_entry(nb, _neigh_list, node) {
if (nb->dev == dev) {
use = nb;
break;
@@ -394,18 +388,13 @@ out_dev_put:
  */
 void __exit x25_link_free(void)
 {
-   struct x25_neigh *nb;
-   struct list_head *entry, *tmp;
+   struct x25_neigh *nb, *tmp;
 
write_lock_bh(_neigh_list_lock);
 
-   list_for_each_safe(entry, tmp, _neigh_list) {
-   struct net_device *dev;
-
-   nb = list_entry(entry, struct x25_neigh, node);
-   dev = nb->dev;
+   list_for_each_entry_safe(nb, tmp, _neigh_list, node) {

[PATCH 08/14] caif: use list_for_each_entry_safe

2015-12-18 Thread Geliang Tang
Use list_for_each_entry_safe() instead of list_for_each_safe() to
simplify the code.

Signed-off-by: Geliang Tang 
---
 net/caif/chnl_net.c | 16 +---
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/net/caif/chnl_net.c b/net/caif/chnl_net.c
index 67a4a36..d3db77c 100644
--- a/net/caif/chnl_net.c
+++ b/net/caif/chnl_net.c
@@ -140,13 +140,10 @@ static int delete_device(struct chnl_net *dev)
 
 static void close_work(struct work_struct *work)
 {
-   struct chnl_net *dev = NULL;
-   struct list_head *list_node;
-   struct list_head *_tmp;
+   struct chnl_net *dev = NULL, *tmp;
 
rtnl_lock();
-   list_for_each_safe(list_node, _tmp, _net_list) {
-   dev = list_entry(list_node, struct chnl_net, list_field);
+   list_for_each_entry_safe(dev, tmp, _net_list, list_field) {
if (dev->state == CAIF_SHUTDOWN)
dev_close(dev->netdev);
}
@@ -535,14 +532,11 @@ static int __init chnl_init_module(void)
 
 static void __exit chnl_exit_module(void)
 {
-   struct chnl_net *dev = NULL;
-   struct list_head *list_node;
-   struct list_head *_tmp;
+   struct chnl_net *dev = NULL, *tmp;
rtnl_link_unregister(_link_ops);
rtnl_lock();
-   list_for_each_safe(list_node, _tmp, _net_list) {
-   dev = list_entry(list_node, struct chnl_net, list_field);
-   list_del(list_node);
+   list_for_each_entry_safe(dev, tmp, _net_list, list_field) {
+   list_del(&dev->list_field);
delete_device(dev);
}
rtnl_unlock();
-- 
2.5.0




[PATCH 10/14] lapb: use list_for_each_entry

2015-12-18 Thread Geliang Tang
Use list_for_each_entry() instead of list_for_each() to simplify
the code.

Signed-off-by: Geliang Tang 
---
 net/lapb/lapb_iface.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/net/lapb/lapb_iface.c b/net/lapb/lapb_iface.c
index fc60d9d..49abba7 100644
--- a/net/lapb/lapb_iface.c
+++ b/net/lapb/lapb_iface.c
@@ -86,11 +86,9 @@ static void __lapb_insert_cb(struct lapb_cb *lapb)
 
 static struct lapb_cb *__lapb_devtostruct(struct net_device *dev)
 {
-   struct list_head *entry;
struct lapb_cb *lapb, *use = NULL;
 
-   list_for_each(entry, _list) {
-   lapb = list_entry(entry, struct lapb_cb, node);
+   list_for_each_entry(lapb, _list, node) {
if (lapb->dev == dev) {
use = lapb;
break;
-- 
2.5.0
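
The conversion made in this series can be tried outside the kernel. The following is a minimal userland sketch, not kernel code: the list helpers are simplified stand-ins for those in <linux/list.h>, `struct cb` is a hypothetical record type standing in for `struct lapb_cb`, and `__typeof__` assumes a GCC- or Clang-compatible compiler. It shows that the open-coded `list_for_each()` plus `list_entry()` lookup and the `list_for_each_entry()` form visit the same elements.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel list helpers (assumption: userland). */
struct list_head { struct list_head *next, *prev; };

#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Old style: the cursor is a raw struct list_head pointer. */
#define list_for_each(pos, head) \
	for (pos = (head)->next; pos != (head); pos = pos->next)

/* New style: the cursor is the containing structure itself. */
#define list_for_each_entry(pos, head, member) \
	for (pos = list_entry((head)->next, __typeof__(*pos), member); \
	     &pos->member != (head); \
	     pos = list_entry(pos->member.next, __typeof__(*pos), member))

static void list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/* Hypothetical record type, standing in for struct lapb_cb. */
struct cb { int dev; struct list_head node; };

/* Lookup written the way the code looked before the patch. */
static struct cb *find_old(struct list_head *head, int dev)
{
	struct list_head *entry;
	struct cb *c, *use = NULL;

	list_for_each(entry, head) {
		c = list_entry(entry, struct cb, node);
		if (c->dev == dev) {
			use = c;
			break;
		}
	}
	return use;
}

/* Same lookup after the conversion: no separate cursor, no list_entry(). */
static struct cb *find_new(struct list_head *head, int dev)
{
	struct cb *c, *use = NULL;

	list_for_each_entry(c, head, node) {
		if (c->dev == dev) {
			use = c;
			break;
		}
	}
	return use;
}

/* Build a three-element list and check that both lookups agree. */
static int lookups_agree(void)
{
	struct list_head head = { &head, &head };
	struct cb a = { .dev = 1 }, b = { .dev = 2 }, c = { .dev = 3 };

	list_add_tail(&a.node, &head);
	list_add_tail(&b.node, &head);
	list_add_tail(&c.node, &head);

	return find_old(&head, 2) == &b && find_new(&head, 2) == &b &&
	       find_old(&head, 9) == NULL && find_new(&head, 9) == NULL;
}
```

Note that the result is kept in a separate variable (`use`) in both versions; the loop cursor itself is only meaningful inside the loop body.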




splice-bind deadlock (was: [PATCH] af_unix: Revert 'lock_interruptible' in stream receive code)

2015-12-18 Thread Rainer Weikusat
Rainer Weikusat  writes:
> Hannes Frederic Sowa  writes:
>
> [...]
>
>> There is still a deadlock lingering around
>
> [...]
>
>> http://lists.openwall.net/netdev/2015/11/10/4

[...]

>   (a while ago) A: socketpair()
> 
>   B: splice() from a pipe to /mnt/regular_file
>  does sb_start_write() on /mnt
>
>   C: try to freeze /mnt
>  wait for B to finish with /mnt
>
>   A: bind() try to bind our socket to /mnt/new_socket_name
>  lock our socket, see it not bound yet
>  decide that it needs to create something in /mnt
>  try to do sb_start_write() on /mnt, block (it's
>  waiting for C).
>
>   D: splice() from the same pipe to our socket
>  lock the pipe, see that socket is connected
>  try to lock the socket, block waiting for A
>
>   B: get around to actually feeding a chunk from
>  pipe to file, try to lock the pipe.
[from the page]

[...]

> Given
>   a/b - acquire a block b (eg, get read lock on superblock
>   rwsem)
>
>   b/a - acquire b block a
>
> c - u->readlock
>
> d - pipe lock
>
>   [*y]   - blocks waiting for y
>
> 
> B a/b
>
> C b/a[*B]
>
> A c
> A a/b[*C]
>
> D d
> D c[*A]
>
> B d[*D]

Some more explanations on this: There are two groups of three in the above
(X <- Y is supposed to mean 'Y waits for X'), B <- C <- A and A <- D <-
B. 'B blocking C blocking A' is really the same as if B was holding an
abstract mutex m0 A wants. Likewise, A <- D <- B is equivalent to A
holding an abstract mutex m1 B wants. Conceptually, there are two
threads and two locks here,

B: acquires m0 then m1
A: acquires m1 then m0

and because of the conflicting locking orders, the whole
shoggoth deadlocks sooner or later (Fhtagn!).
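
The wait chains above can be checked mechanically. This is only a sketch of the reasoning, not kernel code: each task is modelled as waiting for at most one other task, and the second scenario assumes that "reversing A" means A obtains superblock write access before taking u->readlock, so that D no longer ends up waiting for A. All names are made up for the illustration.

```c
/* The four tasks from the scenario. */
enum task { A, B, C, D, NTASKS };
#define NOBODY (-1)

/*
 * waits_for[t] is the task that t is blocked on, or NOBODY.
 * Returns 1 if following the chain from some task leads back to it.
 */
static int has_deadlock(const int waits_for[NTASKS])
{
	for (int start = 0; start < NTASKS; start++) {
		int cur = waits_for[start];

		for (int steps = 0; steps < NTASKS && cur != NOBODY; steps++) {
			if (cur == start)
				return 1;
			cur = waits_for[cur];
		}
	}
	return 0;
}

/* The situation as reported: B <- C <- A and A <- D <- B. */
static int reported_scenario(void)
{
	const int waits_for[NTASKS] = {
		[A] = C, /* holds u->readlock, blocked behind the freeze */
		[B] = D, /* holds sb write access, wants the pipe lock */
		[C] = B, /* freezer waits for B to finish with /mnt */
		[D] = A, /* holds the pipe lock, wants u->readlock */
	};

	return has_deadlock(waits_for);
}

/*
 * Assumed effect of reversing A: it blocks behind C without holding
 * u->readlock, so D can take the socket lock and waits for nobody.
 */
static int reversed_a_scenario(void)
{
	const int waits_for[NTASKS] = {
		[A] = C,
		[B] = D,
		[C] = B,
		[D] = NOBODY,
	};

	return has_deadlock(waits_for);
}
```

In the first table the chain A, C, B, D closes on itself; in the second, D can finish, which releases B, then C, then A.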

The obvious idea to fix this is to reverse either A or B. I think A
should be reversed because that's probably easier (unless there's some
technical problem with that I don't yet know of) and because this avoids
a situation where some other thread which wants the readlock mutex has to
wait until some completely unrelated filesystem operations have
completed.

But theory only gets one so far and it would be good if someone capable
of reproducing the problem tested this.


Re: [PATCH net] sctp: sctp should release assoc when sctp_make_abort_user return NULL in sctp_close

2015-12-18 Thread Vlad Yasevich
On 12/17/2015 02:33 PM, Vlad Yasevich wrote:
> On 12/17/2015 02:01 PM, Marcelo Ricardo Leitner wrote:
>> Em 17-12-2015 16:29, Vlad Yasevich escreveu:
>>> On 12/17/2015 09:30 AM, Xin Long wrote:
>>>> In sctp_close, sctp_make_abort_user may return NULL because of memory
>>>> allocation failure. If this happens, it will bypass any state change
>>>> and never free the assoc. The assoc has no chance to be freed and it
>>>> will be kept in memory with the state it had even after the socket is
>>>> closed by sctp_close().
>>>>
>>>> So if sctp_make_abort_user fails to allocate memory, we should just
>>>> free the asoc, as there isn't much else that we can do.
>>>>
>>>> Signed-off-by: Xin Long 
>>>> Acked-by: Marcelo Ricardo Leitner 
>>>> ---
>>>>   net/sctp/socket.c | 6 +-
>>>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
>>>> index 9b6cc6d..267b8f8 100644
>>>> --- a/net/sctp/socket.c
>>>> +++ b/net/sctp/socket.c
>>>> @@ -1513,8 +1513,12 @@ static void sctp_close(struct sock *sk, long timeout)
>>>>   struct sctp_chunk *chunk;
>>>>
>>>>   chunk = sctp_make_abort_user(asoc, NULL, 0);
>>>> -if (chunk)
>>>> +if (chunk) {
>>>>   sctp_primitive_ABORT(net, asoc, chunk);
>>>> +} else {
>>>> +sctp_unhash_established(asoc);
>>>> +sctp_association_free(asoc);
>>>> +}
>>>
>>> I don't think you can do that for an association that has not been closed.
>>>
>>> I think a cleaner approach might be to update abort primitive handlers
>>> to handle a NULL chunk value and unconditionally call the primitive.
>>>
>>> This guarantees that any timers or waitqueues that might be active are
>>> stopped correctly.
>>
>> sctp_association_free() is the one who does that job, even that way. All in between the
>> primitive call and then the call to sctp_association_free() is just status changes and
>> packet xmit, which doing this way we cut out when we are in memory pressure. pkt xmit or
>> ULP events are likely going to fail too anyway.
>>
>> sctp_sf_do_9_1_prm_abort() -> SCTP_CMD_ASSOC_FAILED ->
>>   sctp_cmd_assoc_failed -> ULP events, send abort, and SCTP_CMD_DELETE_TCB ->
>> sctp_cmd_delete_tcb ->
>>   sctp_unhash_established(asoc);
>>   sctp_association_free(asoc);
>> and returns.
>>
>> There is a check on sctp_cmd_delete_tcb() that avoids calling that on temp 
>> assocs on
>> listening sockets, but that condition is false due to the check on 
>> sk_shutdown so it will
>> call those two functions anyway.
> 
> The condition I am a bit concerned about is one thread waiting in 
> sctp_wait_for_sndbuf
> while another does an abort.
> 
> I think this is OK though.  I need to look a bit more...

I think the only time this ends up biting us is if SO_SNDTIMEO was used and we
ran out of send buffer.  It looks to me like schedule_timeout() will wait until
the timer expires and, depending on the timer value, you could wait quite a while.

With this path, since you don't transition state, the asoc->wait wait queue is
never notified and it could be hanging around for quite a while.

-vlad   

> 
> -vlad
> 
> 
>>
>>   Marcelo
>>
> 



[PATCH 12/14] net: sched: use list_for_each_entry

2015-12-18 Thread Geliang Tang
Use list_for_each_entry() instead of list_for_each() to simplify
the code.

Signed-off-by: Geliang Tang 
---
 net/sched/sch_htb.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 15ccd7f..5f7aa74 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -948,10 +948,9 @@ static unsigned int htb_drop(struct Qdisc *sch)
int prio;
 
for (prio = TC_HTB_NUMPRIO - 1; prio >= 0; prio--) {
-   struct list_head *p;
-   list_for_each(p, q->drops + prio) {
-   struct htb_class *cl = list_entry(p, struct htb_class,
- un.leaf.drop_list);
+   struct htb_class *cl;
+
+   list_for_each_entry(cl, q->drops + prio, un.leaf.drop_list) {
unsigned int len;
if (cl->un.leaf.q->ops->drop &&
(len = cl->un.leaf.q->ops->drop(cl->un.leaf.q))) {
-- 
2.5.0




[PATCH 09/14] net: dsa: use list_for_each_entry

2015-12-18 Thread Geliang Tang
Use list_for_each_entry() instead of list_for_each() to simplify
the code.

Signed-off-by: Geliang Tang 
---
 net/dsa/dsa.c | 9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 208d1b2..2ab4a19 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -53,19 +53,14 @@ EXPORT_SYMBOL_GPL(unregister_switch_driver);
 static struct dsa_switch_driver *
 dsa_switch_probe(struct device *host_dev, int sw_addr, char **_name)
 {
-   struct dsa_switch_driver *ret;
-   struct list_head *list;
+   struct dsa_switch_driver *ret, *drv;
char *name;
 
ret = NULL;
name = NULL;
 
mutex_lock(_switch_drivers_mutex);
-   list_for_each(list, _switch_drivers) {
-   struct dsa_switch_driver *drv;
-
-   drv = list_entry(list, struct dsa_switch_driver, list);
-
+   list_for_each_entry(drv, _switch_drivers, list) {
name = drv->probe(host_dev, sw_addr);
if (name != NULL) {
ret = drv;
-- 
2.5.0




[PATCH 07/14] batman-adv: use list_for_each_entry_safe

2015-12-18 Thread Geliang Tang
Use list_for_each_entry_safe() instead of list_for_each_safe() to
simplify the code.

Signed-off-by: Geliang Tang 
---
 net/batman-adv/icmp_socket.c | 22 +-
 1 file changed, 9 insertions(+), 13 deletions(-)

diff --git a/net/batman-adv/icmp_socket.c b/net/batman-adv/icmp_socket.c
index bcabb5e..841239c 100644
--- a/net/batman-adv/icmp_socket.c
+++ b/net/batman-adv/icmp_socket.c
@@ -104,25 +104,21 @@ static int batadv_socket_open(struct inode *inode, struct 
file *file)
 
 static int batadv_socket_release(struct inode *inode, struct file *file)
 {
-   struct batadv_socket_client *socket_client = file->private_data;
-   struct batadv_socket_packet *socket_packet;
-   struct list_head *list_pos, *list_pos_tmp;
+   struct batadv_socket_client *client = file->private_data;
+   struct batadv_socket_packet *packet, *tmp;
 
-   spin_lock_bh(&socket_client->lock);
+   spin_lock_bh(&client->lock);
 
/* for all packets in the queue ... */
-   list_for_each_safe(list_pos, list_pos_tmp, &socket_client->queue_list) {
-   socket_packet = list_entry(list_pos,
-  struct batadv_socket_packet, list);
-
-   list_del(list_pos);
-   kfree(socket_packet);
+   list_for_each_entry_safe(packet, tmp, &client->queue_list, list) {
+   list_del(&packet->list);
+   kfree(packet);
}
 
-   batadv_socket_client_hash[socket_client->index] = NULL;
-   spin_unlock_bh(&socket_client->lock);
+   batadv_socket_client_hash[client->index] = NULL;
+   spin_unlock_bh(&client->lock);
 
-   kfree(socket_client);
+   kfree(client);
module_put(THIS_MODULE);
 
return 0;
-- 
2.5.0
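
Why the "_safe" variant is needed when entries are unlinked and freed inside the loop can be seen in a small userland sketch. The helpers below are simplified stand-ins for the kernel's, `struct packet` is a hypothetical type loosely modelled on the queue drained above, and `__typeof__` assumes GCC or Clang. The macro caches the next entry before the body runs, so the body may free the current one.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins for the kernel list helpers (assumption: userland). */
struct list_head { struct list_head *next, *prev; };

#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* 'n' holds the next entry, so 'pos' may be unlinked and freed in the body. */
#define list_for_each_entry_safe(pos, n, head, member) \
	for (pos = list_entry((head)->next, __typeof__(*pos), member), \
	     n = list_entry(pos->member.next, __typeof__(*pos), member); \
	     &pos->member != (head); \
	     pos = n, n = list_entry(n->member.next, __typeof__(*n), member))

static void list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

static void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* Hypothetical queued item. */
struct packet { int seq; struct list_head list; };

/*
 * Queue 'count' packets, then drain the queue. Returns the number of
 * packets freed, or -1 if an allocation failed or the list is not
 * empty afterwards.
 */
static int queue_and_drain(int count)
{
	struct list_head queue = { &queue, &queue };
	struct packet *p, *tmp;
	int freed = 0, failed = 0;

	for (int i = 0; i < count; i++) {
		p = malloc(sizeof(*p));
		if (!p) {
			failed = 1;
			break;
		}
		p->seq = i;
		list_add_tail(&p->list, &queue);
	}

	list_for_each_entry_safe(p, tmp, &queue, list) {
		list_del(&p->list);
		free(p);
		freed++;
	}

	if (failed || queue.next != &queue || queue.prev != &queue)
		return -1;
	return freed;
}
```

With the plain `list_for_each_entry()` form, the step expression would read `p->list.next` after `p` had been freed.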




Re: [PATCH 1/2] [iproute2] tc/q_htb.c: remove printing of a deprecated overhead value previously encoded as a part of mpu field

2015-12-18 Thread Phil Sutter
On Fri, Dec 18, 2015 at 04:16:38PM +0300, Dmitrii Shcherbakov wrote:
> Remove printing according to the previously used encoding of mpu and overhead 
> values within the tc_ratespec's mpu field. This encoding is no longer being 
> used as a separate 'overhead' field in the ratespec structure has been 
> introduced.
> 
> Signed-off-by: Dmitrii Shcherbakov 

Acked-by: Phil Sutter 


Re: [PATCH 2/2] [iproute2] tc/q_htb.c: rename b4 buffer to b3 to make its name more consistent

2015-12-18 Thread Phil Sutter
On Fri, Dec 18, 2015 at 04:16:39PM +0300, Dmitrii Shcherbakov wrote:
> b3 buffer has been deleted previously so b2 is followed by b4 which is not 
> consistent
> 
> Signed-off-by: Dmitrii Shcherbakov 

Acked-by: Phil Sutter 


Re: [PATCH 1/2] [iproute2] tc/q_htb.c: remove printing of a deprecated overhead value previously encoded as a part of mpu field

2015-12-18 Thread Jesper Dangaard Brouer

On Fri, 18 Dec 2015 16:16:38 +0300 Dmitrii Shcherbakov  
wrote:

> Remove printing according to the previously used encoding of mpu and overhead 
> values within the tc_ratespec's mpu field. This encoding is no longer being 
> used as a separate 'overhead' field in the ratespec structure has been 
> introduced.
> 
> Signed-off-by: Dmitrii Shcherbakov 
> ---

Acked-by: Jesper Dangaard Brouer 

Thank you Dmitrii for cleaning this up :-)

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer


Re: [PATCH 03/14] ipv4, ipv6: use list_for_each_entry*

2015-12-18 Thread Julia Lawall
I don't think this code can work.  list_for_each_entry uses answer to get
from one element to the next.

julia
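
To illustrate the point: list_for_each_entry() computes the next element from the cursor, so a body that overwrites the cursor (as the "answer = NULL" on line 597 does) leaves the macro nothing valid to advance from. The following userland sketch shows one way to avoid that, with a separate cursor and result variable. It is not the af_inet6.c code: `struct protosw`, `PERMANENT` and the helpers are simplified stand-ins, the last_perm bookkeeping is left out, and `__typeof__` assumes GCC or Clang.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel list helpers (assumption: userland). */
struct list_head { struct list_head *next, *prev; };

#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* The step expression dereferences 'pos', so the body must not clobber it. */
#define list_for_each_entry(pos, head, member) \
	for (pos = list_entry((head)->next, __typeof__(*pos), member); \
	     &pos->member != (head); \
	     pos = list_entry(pos->member.next, __typeof__(*pos), member))

static void list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/* Hypothetical stand-ins for the protocol switch entry and its flag. */
#define PERMANENT 0x01
struct protosw { int protocol; int flags; struct list_head list; };

/* 'pos' is only the cursor; 'answer' is set exactly once, on a match. */
static struct protosw *find_permanent(struct list_head *head, int protocol)
{
	struct protosw *pos, *answer = NULL;

	list_for_each_entry(pos, head, list) {
		if ((pos->flags & PERMANENT) && pos->protocol == protocol) {
			answer = pos;
			break;
		}
	}
	return answer;
}

/* One permanent and one non-permanent entry; only the former may match. */
static int demo_ok(void)
{
	struct list_head head = { &head, &head };
	struct protosw tcp = { .protocol = 6, .flags = PERMANENT };
	struct protosw udp = { .protocol = 17, .flags = 0 };

	list_add_tail(&tcp.list, &head);
	list_add_tail(&udp.list, &head);

	return find_permanent(&head, 6) == &tcp &&
	       find_permanent(&head, 17) == NULL &&
	       find_permanent(&head, 99) == NULL;
}
```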

On Sat, 19 Dec 2015, kbuild test robot wrote:

> CC: kbuild-...@01.org
> In-Reply-To: 
> <0167bba2ecf8c4fcb6b0b3135a4e957309986498.1450451516.git.geliangt...@163.com>
> TO: Geliang Tang <geliangt...@163.com>
> CC: "David S. Miller" <da...@davemloft.net>, Alexey Kuznetsov 
> <kuz...@ms2.inr.ac.ru>, James Morris <jmor...@namei.org>, Hideaki YOSHIFUJI 
> <yoshf...@linux-ipv6.org>, Patrick McHardy <ka...@trash.net>
> CC: Geliang Tang <geliangt...@163.com>, netdev@vger.kernel.org, 
> linux-ker...@vger.kernel.org
>
> Hi Geliang,
>
> [auto build test WARNING on net/master]
> [also build test WARNING on v4.4-rc5 next-20151218]
>
> url:
> https://github.com/0day-ci/linux/commits/Geliang-Tang/Bluetooth-use-list_for_each_entry/20151218-234306
> :: branch date: 69 minutes ago
> :: commit date: 69 minutes ago
>
> >> net/ipv6/af_inet6.c:589:1-20: iterator with update on line 597
>
> git remote add linux-review https://github.com/0day-ci/linux
> git remote update linux-review
> git checkout c5e8d791cacac62eeec48e00a1a14a6a350670f4
> vim +589 net/ipv6/af_inet6.c
>
> ^1da177e Linus Torvalds 2005-04-16  583   goto out_illegal;
> ^1da177e Linus Torvalds 2005-04-16  584
> ^1da177e Linus Torvalds 2005-04-16  585   /* If we are trying to override 
> a permanent protocol, bail. */
> ^1da177e Linus Torvalds 2005-04-16  586   answer = NULL;
> 87c3efbf Daniel Lezcano 2007-12-11  587   ret = -EPERM;
> ^1da177e Linus Torvalds 2005-04-16  588   last_perm = [p->type];
> c5e8d791 Geliang Tang   2015-12-18 @589   list_for_each_entry(answer, 
> [p->type], list) {
> ^1da177e Linus Torvalds 2005-04-16  590   /* Check only the 
> non-wild match. */
> ^1da177e Linus Torvalds 2005-04-16  591   if 
> (INET_PROTOSW_PERMANENT & answer->flags) {
> ^1da177e Linus Torvalds 2005-04-16  592   if (protocol == 
> answer->protocol)
> ^1da177e Linus Torvalds 2005-04-16  593   break;
> c5e8d791 Geliang Tang   2015-12-18  594   last_perm = 
> >list;
> ^1da177e Linus Torvalds 2005-04-16  595   }
> ^1da177e Linus Torvalds 2005-04-16  596
> ^1da177e Linus Torvalds 2005-04-16 @597   answer = NULL;
> ^1da177e Linus Torvalds 2005-04-16  598   }
> ^1da177e Linus Torvalds 2005-04-16  599   if (answer)
> ^1da177e Linus Torvalds 2005-04-16  600   goto out_permanent;
>
> ---
> 0-DAY kernel test infrastructureOpen Source Technology Center
> https://lists.01.org/pipermail/kbuild-all   Intel Corporation
>


[PATCH] veth: don't modify ip-summed; doing so treats packets with bad checksums as good.

2015-12-18 Thread Vijay Pandurangan
Packets that arrive from real hardware devices have ip_summed ==
CHECKSUM_UNNECESSARY if the hardware verified the checksums, or
CHECKSUM_NONE if the packet is bad or it was unable to verify it. The
current version of veth will replace CHECKSUM_NONE with
CHECKSUM_UNNECESSARY, which causes corrupt packets routed from hardware to
a veth device to be delivered to the application. This caused applications
at Twitter to receive corrupt data when network hardware was corrupting
packets.

We believe this was added as an optimization to skip computing and
verifying checksums for communication between containers. However, locally
generated packets have ip_summed == CHECKSUM_PARTIAL, so the code as
written does nothing for them. As far as we can tell, after removing this
code, these packets are transmitted from one stack to another unmodified
(tcpdump shows invalid checksums on both sides, as expected), and they are
delivered correctly to applications. We didn’t test every possible network
configuration, but we tried a few common ones such as bridging containers,
using NAT between the host and a container, and routing from hardware
devices to containers. We have effectively deployed this in production at
Twitter (by disabling RX checksum offloading on veth devices).

This code dates back to the first version of the driver, commit
 ("[NET]: Virtual ethernet device driver"), so I
suspect this bug occurred mostly because the driver API has evolved
significantly since then. Commit <0b7967503dc97864f283a> ("net/veth: Fix
packet checksumming") (in December 2010) fixed this for packets that get
created locally and sent to hardware devices, by not changing
CHECKSUM_PARTIAL. However, the same issue still occurs for packets coming
in from hardware devices.

Co-authored-by: Evan Jones 
Signed-off-by: Evan Jones 
Cc: Nicolas Dichtel 
Cc: Phil Sutter 
Cc: Toshiaki Makita 
Cc: netdev@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Vijay Pandurangan 
---
 drivers/net/veth.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 0ef4a5a..ba21d07 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -117,12 +117,6 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb,
struct net_device *dev)
  kfree_skb(skb);
  goto drop;
  }
- /* don't change ip_summed == CHECKSUM_PARTIAL, as that
- * will cause bad checksum on forwarded packets
- */
- if (skb->ip_summed == CHECKSUM_NONE &&
-rcv->features & NETIF_F_RXCSUM)
- skb->ip_summed = CHECKSUM_UNNECESSARY;

  if (likely(dev_forward_skb(rcv, skb) == NET_RX_SUCCESS)) {
  struct pcpu_vstats *stats = this_cpu_ptr(dev->vstats);
-- 
2.5.0
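
The effect of the removed hunk can be modelled in a few lines of userland C. This is a sketch only: the enum mirrors the names of the skb->ip_summed states, `peer_rxcsum` stands for the `rcv->features & NETIF_F_RXCSUM` test, and the two functions are hypothetical helpers that describe the value handed to the receiving side before and after the patch.

```c
#include <stdbool.h>

/* Userland model of the skb->ip_summed states. */
enum ip_summed {
	CHECKSUM_NONE,
	CHECKSUM_UNNECESSARY,
	CHECKSUM_COMPLETE,
	CHECKSUM_PARTIAL,
};

/*
 * Before the patch: a packet whose checksum nobody verified
 * (CHECKSUM_NONE) is marked as already verified when the peer
 * device advertises RX checksum offload.
 */
static enum ip_summed veth_old(enum ip_summed in, bool peer_rxcsum)
{
	if (in == CHECKSUM_NONE && peer_rxcsum)
		return CHECKSUM_UNNECESSARY;
	return in;
}

/* After the patch: the state is passed through untouched. */
static enum ip_summed veth_new(enum ip_summed in, bool peer_rxcsum)
{
	(void)peer_rxcsum;
	return in;
}
```

In this model a corrupt packet forwarded from hardware arrives as CHECKSUM_NONE; `veth_old()` relabels it as CHECKSUM_UNNECESSARY, while `veth_new()` leaves it for the receiving stack to verify. Locally generated CHECKSUM_PARTIAL packets are unaffected either way.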


Re: [PATCH net] sctp: sctp should release assoc when sctp_make_abort_user return NULL in sctp_close

2015-12-18 Thread Marcelo Ricardo Leitner
On Fri, Dec 18, 2015 at 09:08:46AM -0500, Vlad Yasevich wrote:
> On 12/17/2015 02:33 PM, Vlad Yasevich wrote:
> > On 12/17/2015 02:01 PM, Marcelo Ricardo Leitner wrote:
...
> >> There is a check on sctp_cmd_delete_tcb() that avoids calling that on temp 
> >> assocs on
> >> listening sockets, but that condition is false due to the check on 
> >> sk_shutdown so it will
> >> call those two functions anyway.
> > 
> > The condition I am a bit concerned about is one thread waiting in 
> > sctp_wait_for_sndbuf
> > while another does an abort.
> > 
> > I think this is OK though.  I need to look a bit more...
> 
> I think the only time this ends up biting us is if SO_SNDTIMEO was used and 
> we ran out
> of send buffer.  It looks to me like schedule_timeout() will wait until timer 
> expired and
> depending on the timer value, you could wait quite a while.
> 
> With this path, since you don't transition state, the asoc->wait wait queue 
> is never
> notified and it could be hanging around for quite a while.

Yes, agreed. For blocking sockets, it could hang waiting until the
application finally closes. Thanks Vlad.

  Marcelo



Re: [PATCH][iproute2] tc/q_htb.c: Fix the MPU value output in 'tc -d class show dev ' command

2015-12-18 Thread Phil Sutter
On Fri, Dec 18, 2015 at 07:39:25PM +0300, Dmitrii Shcherbakov wrote:
> > Dmitrii, did iproute2 without your change even print the overhead as set
> > by you before? Looking at the code, I'd assume not.
> 
> Tried building iproute2 (as of tag 4.2) and using the upstream linux kernel 
> (also tag 4.2 - 64291f7db5bd8150a74ad2036f1037e6a0428df2):

This is without your patch, right?

[...]

> ~/src/iproute2/tc$ sudo ./tc class add dev eth0 parent 1: classid 1:1 htb 
> rate 100kbps ceil 100kbps mpu 256 overhead 64

Setting an mpu of 256 is suitable to get 0 as output value, as the code
before your patch ANDs it with 0xff.
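
As a quick check of that arithmetic (a sketch, with a hypothetical helper name; only the "& 0xff" is taken from the code being discussed): 256 is 0x100, so its low byte is zero.

```c
/* Models the pre-patch output path: only the low byte of mpu is printed. */
static unsigned int old_printed_mpu(unsigned int mpu_field)
{
	return mpu_field & 0xff;
}
```

Any mpu below 256 would have come through unchanged, so a value such as 128 would be a more telling test input.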

> So it looks like the overhead is being set correctly, but the mpu is not, 
> even though the respective kernel module is loaded judging by what I see.

To really know what is being set, you would have to look at the kernel
variables not what iproute prints. This is nitpicking mostly, but
relevant in this case as your patches to fix iproute's output show.

Cheers, Phil


[PATCH net-next 5/5] sfc: Downgrade or remove some error messages

2015-12-18 Thread Bert Kenward
Depending on configuration the NIC may return errors for unprivileged
functions and/or VFs. Where these are expected and handled, reduce the
level of any output.

Signed-off-by: Bert Kenward 
---
 drivers/net/ethernet/sfc/ef10.c | 20 ++--
 drivers/net/ethernet/sfc/efx.c  |  7 ---
 2 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index edffc9a..9adb895 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -486,10 +486,17 @@ static int efx_ef10_alloc_piobufs(struct efx_nic *efx, 
unsigned int n)
BUILD_BUG_ON(MC_CMD_ALLOC_PIOBUF_IN_LEN != 0);
 
for (i = 0; i < n; i++) {
-   rc = efx_mcdi_rpc(efx, MC_CMD_ALLOC_PIOBUF, NULL, 0,
- outbuf, sizeof(outbuf), &outlen);
-   if (rc)
+   rc = efx_mcdi_rpc_quiet(efx, MC_CMD_ALLOC_PIOBUF, NULL, 0,
+   outbuf, sizeof(outbuf), &outlen);
+   if (rc) {
+   /* Don't display the MC error if we didn't have space
+* for a VF.
+*/
+   if (!(efx_ef10_is_vf(efx) && rc == -ENOSPC))
+   efx_mcdi_display_error(efx, MC_CMD_ALLOC_PIOBUF,
+  0, outbuf, outlen, rc);
break;
+   }
if (outlen < MC_CMD_ALLOC_PIOBUF_OUT_LEN) {
rc = -EIO;
break;
@@ -4027,9 +4034,10 @@ static int efx_ef10_filter_insert_def(struct efx_nic 
*efx, bool multicast,
 
rc = efx_ef10_filter_insert(efx, , true);
if (rc < 0) {
-   netif_warn(efx, drv, efx->net_dev,
-  "%scast mismatch filter insert failed rc=%d\n",
-  multicast ? "Multi" : "Uni", rc);
+   netif_printk(efx, drv, rc == -EPERM ? KERN_DEBUG : KERN_WARNING,
+efx->net_dev,
+"%scast mismatch filter insert failed rc=%d\n",
+multicast ? "Multi" : "Uni", rc);
} else if (multicast) {
table->mcdef_id = efx_ef10_filter_get_unsafe_id(efx, rc);
if (!nic_data->workaround_26807) {
diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
index 6f69743..0705ec86 100644
--- a/drivers/net/ethernet/sfc/efx.c
+++ b/drivers/net/ethernet/sfc/efx.c
@@ -3174,14 +3174,15 @@ static int efx_pci_probe(struct pci_dev *pci_dev,
rtnl_lock();
rc = efx_mtd_probe(efx);
rtnl_unlock();
-   if (rc)
+   if (rc && rc != -EPERM)
netif_warn(efx, probe, efx->net_dev,
   "failed to create MTDs (%d)\n", rc);
 
rc = pci_enable_pcie_error_reporting(pci_dev);
if (rc && rc != -EINVAL)
-   netif_warn(efx, probe, efx->net_dev,
-  "pci_enable_pcie_error_reporting failed (%d)\n", rc);
+   netif_notice(efx, probe, efx->net_dev,
+"PCIE error reporting unavailable (%d).\n",
+rc);
 
return 0;
 
-- 
2.4.3



[PATCH net-next 4/5] sfc: Downgrade EPERM messages from MCDI to debug

2015-12-18 Thread Bert Kenward
From: Tomáš Pilař 

When running in an unprivileged function we expect some MC commands
to fail with permission errors. To avoid log spew downgrade these to
debug only.

Signed-off-by: Bert Kenward 
---
 drivers/net/ethernet/sfc/mcdi.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/sfc/mcdi.c b/drivers/net/ethernet/sfc/mcdi.c
index ab68ff6..7df183a 100644
--- a/drivers/net/ethernet/sfc/mcdi.c
+++ b/drivers/net/ethernet/sfc/mcdi.c
@@ -1084,9 +1084,10 @@ void efx_mcdi_display_error(struct efx_nic *efx, 
unsigned cmd,
code = MCDI_DWORD(outbuf, ERR_CODE);
if (outlen >= MC_CMD_ERR_ARG_OFST + 4)
err_arg = MCDI_DWORD(outbuf, ERR_ARG);
-   netif_err(efx, hw, efx->net_dev,
- "MC command 0x%x inlen %d failed rc=%d (raw=%d) arg=%d\n",
- cmd, (int)inlen, rc, code, err_arg);
+   netif_printk(efx, hw, rc == -EPERM ? KERN_DEBUG : KERN_ERR,
+efx->net_dev,
+"MC command 0x%x inlen %zu failed rc=%d (raw=%d) arg=%d\n",
+cmd, inlen, rc, code, err_arg);
 }
 
 /* Switch to polled MCDI completions.  This can be called in various
-- 
2.4.3
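
The level selection in this patch boils down to a single comparison on the return code. A minimal userland sketch of that decision follows; the enum and function name are hypothetical, and only the `rc == -EPERM` test comes from the patch.

```c
#include <errno.h>

/* Hypothetical stand-ins for KERN_DEBUG and KERN_ERR. */
enum log_level { LEVEL_DEBUG, LEVEL_ERR };

/* Permission failures are expected on unprivileged functions: log quietly. */
static enum log_level mcdi_error_level(int rc)
{
	return rc == -EPERM ? LEVEL_DEBUG : LEVEL_ERR;
}
```

Every other failure code keeps the error level, so unexpected MC command failures remain visible.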




Re: [PATCH] ila: add NETFILTER dependency

2015-12-18 Thread Florian Westphal
Pablo Neira Ayuso  wrote:
> On Fri, Dec 18, 2015 at 03:37:37PM +0100, Arnd Bergmann wrote:
> > The recently added generic ILA translation facility fails to
> > build when CONFIG_NETFILTER is disabled:
> > 
> > net/ipv6/ila/ila_xlat.c:229:20: warning: 'struct nf_hook_state' declared 
> > inside parameter list
> > net/ipv6/ila/ila_xlat.c:235:27: error: array type has incomplete element 
> > type 'struct nf_hook_ops'
> >  static struct nf_hook_ops ila_nf_hook_ops[] __read_mostly = {
> > 
> > This adds an explicit Kconfig dependency to avoid that case.
> 
> I'm afraid this extra Kconfig dependency that Arnd adds to fix this is
> a symptom that there is something that doesn't belong there.
> 
> I overlooked this new hook at priority -1; how does this integrate into
> our infrastructure?

Looks problematic since address changes post ipv6 dnat translations,
it's certainly unexpected for nft since we have magic address mangling
after -2 and 0 prioritized tables...

However ... how is ILA supposed to work?

ila_xlat_outgoing has no callers, so it appears we only do this
stateless nat on ingress...?


[PATCH net-next 2/5] sfc: Handle MCDI proxy authorisation

2015-12-18 Thread Bert Kenward
For unprivileged functions operations can be authorised by an admin
function. Extra steps are introduced to the MCDI protocol in this
situation - the initial response from the MCDI tells us that the
operation has been deferred, and we must retry when told. We then
receive an event telling us to retry.

Note that this provides only the functionality for the unprivileged
functions, not the handling of the administrative side.
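
The deferred-authorisation flow described above can be sketched in plain C
as follows. This is only an illustration under stated assumptions, not the
driver code: every sketch_* name is hypothetical, the MC and the admin
function are simulated by a small struct, and the wait for the proxy event
is replaced by a direct function call.

```c
#include <stdint.h>

/* Hypothetical status codes; not the MCDI error numbers. */
enum sketch_status {
    SKETCH_DONE = 0,
    SKETCH_PROXY_PENDING = 1,
    SKETCH_DENIED = 2
};

/* Simulated management controller plus admin function (assumption). */
struct sketch_mc {
    int authorised;        /* set once the admin side has approved */
    int admin_will_allow;  /* decision the admin side will take */
    uint32_t next_handle;  /* source of proxy handles */
};

/* Issue the command. An unauthorised request is deferred and a handle
 * identifying the pending request is returned to the caller. */
static enum sketch_status sketch_issue(struct sketch_mc *mc, uint32_t *handle)
{
    if (mc->authorised)
        return SKETCH_DONE;
    *handle = ++mc->next_handle;
    return SKETCH_PROXY_PENDING;
}

/* Stand-in for sleeping until the event carrying this handle arrives. */
static enum sketch_status sketch_wait_event(struct sketch_mc *mc,
                                            uint32_t handle)
{
    (void)handle;
    if (!mc->admin_will_allow)
        return SKETCH_DENIED;
    mc->authorised = 1;
    return SKETCH_DONE;
}

/* Full request: issue, and if deferred, wait for the event and retry. */
static enum sketch_status sketch_rpc(struct sketch_mc *mc, int *issues)
{
    uint32_t handle = 0;
    enum sketch_status st;

    *issues = 0;
    for (;;) {
        st = sketch_issue(mc, &handle);
        (*issues)++;
        if (st != SKETCH_PROXY_PENDING)
            return st;          /* completed or failed outright */
        st = sketch_wait_event(mc, handle);
        if (st != SKETCH_DONE)
            return st;          /* authorisation was refused */
        /* Authorised: loop round and issue the command again. */
    }
}
```

In this model an approved request is issued twice (the deferred attempt and
the retry), while a refused one is issued once and returns the refusal.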

Signed-off-by: Bert Kenward 
---
 drivers/net/ethernet/sfc/mcdi.c | 156 +---
 drivers/net/ethernet/sfc/mcdi.h |   9 +++
 2 files changed, 157 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/sfc/mcdi.c b/drivers/net/ethernet/sfc/mcdi.c
index 2509ca9..ab68ff6 100644
--- a/drivers/net/ethernet/sfc/mcdi.c
+++ b/drivers/net/ethernet/sfc/mcdi.c
@@ -82,6 +82,7 @@ int efx_mcdi_init(struct efx_nic *efx)
mcdi->logging_enabled = mcdi_logging_default;
 #endif
init_waitqueue_head(&mcdi->wq);
+   init_waitqueue_head(&mcdi->proxy_rx_wq);
spin_lock_init(&mcdi->iface_lock);
mcdi->state = MCDI_STATE_QUIESCENT;
mcdi->mode = MCDI_MODE_POLL;
@@ -622,10 +623,30 @@ efx_mcdi_check_supported(struct efx_nic *efx, unsigned 
int cmd, size_t inlen)
return 0;
 }
 
-static int _efx_mcdi_rpc_finish(struct efx_nic *efx, unsigned cmd, size_t 
inlen,
+static bool efx_mcdi_get_proxy_handle(struct efx_nic *efx,
+ size_t hdr_len, size_t data_len,
+ u32 *proxy_handle)
+{
+   MCDI_DECLARE_BUF_ERR(testbuf);
+   const size_t buflen = sizeof(testbuf);
+
+   if (!proxy_handle || data_len < buflen)
+   return false;
+
+   efx->type->mcdi_read_response(efx, testbuf, hdr_len, buflen);
+   if (MCDI_DWORD(testbuf, ERR_CODE) == MC_CMD_ERR_PROXY_PENDING) {
+   *proxy_handle = MCDI_DWORD(testbuf, ERR_PROXY_PENDING_HANDLE);
+   return true;
+   }
+
+   return false;
+}
+
+static int _efx_mcdi_rpc_finish(struct efx_nic *efx, unsigned int cmd,
+   size_t inlen,
efx_dword_t *outbuf, size_t outlen,
size_t *outlen_actual, bool quiet,
-   int *raw_rc)
+   u32 *proxy_handle, int *raw_rc)
 {
struct efx_mcdi_iface *mcdi = efx_mcdi(efx);
MCDI_DECLARE_BUF_ERR(errbuf);
@@ -659,6 +680,9 @@ static int _efx_mcdi_rpc_finish(struct efx_nic *efx, 
unsigned cmd, size_t inlen,
spin_unlock_bh(&mcdi->iface_lock);
}
 
+   if (proxy_handle)
+   *proxy_handle = 0;
+
if (rc != 0) {
if (outlen_actual)
*outlen_actual = 0;
@@ -693,6 +717,12 @@ static int _efx_mcdi_rpc_finish(struct efx_nic *efx, 
unsigned cmd, size_t inlen,
netif_err(efx, hw, efx->net_dev, "MC fatal error %d\n",
  -rc);
efx_schedule_reset(efx, RESET_TYPE_MC_FAILURE);
+   } else if (proxy_handle && (rc == -EPROTO) &&
+  efx_mcdi_get_proxy_handle(efx, hdr_len, data_len,
+proxy_handle)) {
+   mcdi->proxy_rx_status = 0;
+   mcdi->proxy_rx_handle = 0;
+   mcdi->state = MCDI_STATE_PROXY_WAIT;
} else if (rc && !quiet) {
efx_mcdi_display_error(efx, cmd, inlen, errbuf, err_len,
   rc);
@@ -705,23 +735,121 @@ static int _efx_mcdi_rpc_finish(struct efx_nic *efx, 
unsigned cmd, size_t inlen,
}
}
 
-   efx_mcdi_release(mcdi);
+   if (!proxy_handle || !*proxy_handle)
+   efx_mcdi_release(mcdi);
return rc;
 }
 
-static int _efx_mcdi_rpc(struct efx_nic *efx, unsigned cmd,
+static void efx_mcdi_proxy_abort(struct efx_mcdi_iface *mcdi)
+{
+   if (mcdi->state == MCDI_STATE_PROXY_WAIT) {
+   /* Interrupt the proxy wait. */
+   mcdi->proxy_rx_status = -EINTR;
+   wake_up(&mcdi->proxy_rx_wq);
+   }
+}
+
+static void efx_mcdi_ev_proxy_response(struct efx_nic *efx,
+  u32 handle, int status)
+{
+   struct efx_mcdi_iface *mcdi = efx_mcdi(efx);
+
+   WARN_ON(mcdi->state != MCDI_STATE_PROXY_WAIT);
+
+   mcdi->proxy_rx_status = efx_mcdi_errno(status);
+   /* Ensure the status is written before we update the handle, since the
+* latter is used to check if we've finished.
+*/
+   wmb();
+   mcdi->proxy_rx_handle = handle;
+   wake_up(&mcdi->proxy_rx_wq);
+}
+
+static int efx_mcdi_proxy_wait(struct efx_nic *efx, u32 handle, bool quiet)
+{
+   struct efx_mcdi_iface *mcdi = efx_mcdi(efx);
+   int rc;
+
+   /* Wait for a proxy event, or timeout. */
+   rc = 

[PATCH net-next 1/5] sfc: Retry MCDI after NO_EVB_PORT error on a VF

2015-12-18 Thread Bert Kenward
After reboot the vswitch configuration from the PF may not be
complete before the VF attempts to restore filters. In that
case we see NO_EVB_PORT errors from the MC. Retry up to a time
limit or until a different result is seen.
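
As a rough sketch of the retry policy described above (retry while the
result is NO_EVB_PORT, with a short doubling delay, until a time limit),
assuming simulated time instead of real sleeps. All sketch_* names are
hypothetical and the result codes are placeholders, not MCDI values.

```c
/* Placeholder result codes (assumption). */
enum { SKETCH_OK = 0, SKETCH_NO_EVB_PORT = 1, SKETCH_OTHER_ERR = 2 };

typedef int (*sketch_rpc_fn)(void *ctx);

/* Retry while the result is NO_EVB_PORT and the time budget has not run
 * out. The delay doubles while it is below 10 units. Elapsed time is
 * accumulated rather than slept, so the sketch runs instantly. */
static int sketch_rpc_evb_retry(sketch_rpc_fn rpc, void *ctx,
                                unsigned int budget_us,
                                unsigned int *attempts)
{
    unsigned int delay_us = 1;
    unsigned int elapsed_us = 0;
    int rc = rpc(ctx);

    *attempts = 1;
    while (rc == SKETCH_NO_EVB_PORT && elapsed_us < budget_us) {
        elapsed_us += delay_us;     /* a driver would sleep here */
        rc = rpc(ctx);
        (*attempts)++;
        if (delay_us < 10)
            delay_us <<= 1;
    }
    return rc;
}

/* Example request: reports NO_EVB_PORT until the counter reaches zero. */
static int sketch_flaky_rpc(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return SKETCH_NO_EVB_PORT;
    }
    return SKETCH_OK;
}
```

Any result other than NO_EVB_PORT ends the loop at once, so unrelated
errors are returned to the caller without delay.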

Signed-off-by: Bert Kenward 
---
 drivers/net/ethernet/sfc/mcdi.c | 99 ++---
 drivers/net/ethernet/sfc/mcdi.h |  1 +
 2 files changed, 85 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/sfc/mcdi.c b/drivers/net/ethernet/sfc/mcdi.c
index 41fb6b6..2509ca9 100644
--- a/drivers/net/ethernet/sfc/mcdi.c
+++ b/drivers/net/ethernet/sfc/mcdi.c
@@ -315,6 +315,7 @@ static void efx_mcdi_read_response_header(struct efx_nic 
*efx)
}
 #endif
 
+   mcdi->resprc_raw = 0;
if (error && mcdi->resp_data_len == 0) {
netif_err(efx, hw, efx->net_dev, "MC rebooted\n");
mcdi->resprc = -EIO;
@@ -325,8 +326,8 @@ static void efx_mcdi_read_response_header(struct efx_nic 
*efx)
mcdi->resprc = -EIO;
} else if (error) {
efx->type->mcdi_read_response(efx, &hdr, mcdi->resp_hdr_len, 4);
-   mcdi->resprc =
-   efx_mcdi_errno(EFX_DWORD_FIELD(hdr, EFX_DWORD_0));
+   mcdi->resprc_raw = EFX_DWORD_FIELD(hdr, EFX_DWORD_0);
+   mcdi->resprc = efx_mcdi_errno(mcdi->resprc_raw);
} else {
mcdi->resprc = 0;
}
@@ -623,7 +624,8 @@ efx_mcdi_check_supported(struct efx_nic *efx, unsigned int 
cmd, size_t inlen)
 
 static int _efx_mcdi_rpc_finish(struct efx_nic *efx, unsigned cmd, size_t 
inlen,
efx_dword_t *outbuf, size_t outlen,
-   size_t *outlen_actual, bool quiet)
+   size_t *outlen_actual, bool quiet,
+   int *raw_rc)
 {
struct efx_mcdi_iface *mcdi = efx_mcdi(efx);
MCDI_DECLARE_BUF_ERR(errbuf);
@@ -669,6 +671,8 @@ static int _efx_mcdi_rpc_finish(struct efx_nic *efx, 
unsigned cmd, size_t inlen,
 * acquiring the iface_lock. */
spin_lock_bh(&mcdi->iface_lock);
rc = mcdi->resprc;
+   if (raw_rc)
+   *raw_rc = mcdi->resprc_raw;
hdr_len = mcdi->resp_hdr_len;
data_len = mcdi->resp_data_len;
err_len = min(sizeof(errbuf), data_len);
@@ -708,27 +712,92 @@ static int _efx_mcdi_rpc_finish(struct efx_nic *efx, 
unsigned cmd, size_t inlen,
 static int _efx_mcdi_rpc(struct efx_nic *efx, unsigned cmd,
 const efx_dword_t *inbuf, size_t inlen,
 efx_dword_t *outbuf, size_t outlen,
-size_t *outlen_actual, bool quiet)
+size_t *outlen_actual, bool quiet, int *raw_rc)
 {
int rc;
 
rc = efx_mcdi_rpc_start(efx, cmd, inbuf, inlen);
-   if (rc) {
-   if (outlen_actual)
-   *outlen_actual = 0;
+   if (rc)
return rc;
-   }
+
return _efx_mcdi_rpc_finish(efx, cmd, inlen, outbuf, outlen,
-   outlen_actual, quiet);
+   outlen_actual, quiet, raw_rc);
 }
 
+static int _efx_mcdi_rpc_evb_retry(struct efx_nic *efx, unsigned cmd,
+  const efx_dword_t *inbuf, size_t inlen,
+  efx_dword_t *outbuf, size_t outlen,
+  size_t *outlen_actual, bool quiet)
+{
+   int raw_rc = 0;
+   int rc;
+
+   rc = _efx_mcdi_rpc(efx, cmd, inbuf, inlen,
+  outbuf, outlen, outlen_actual, true, &raw_rc);
+
+   if ((rc == -EPROTO) && (raw_rc == MC_CMD_ERR_NO_EVB_PORT) &&
+   efx->type->is_vf) {
+   /* If the EVB port isn't available within a VF this may
+* mean the PF is still bringing the switch up. We should
+* retry our request shortly.
+*/
+   unsigned long abort_time = jiffies + MCDI_RPC_TIMEOUT;
+   unsigned int delay_us = 1;
+
+   netif_dbg(efx, hw, efx->net_dev,
+ "%s: NO_EVB_PORT; will retry request\n",
+ __func__);
+
+   do {
+   usleep_range(delay_us, delay_us + 1);
+   rc = _efx_mcdi_rpc(efx, cmd, inbuf, inlen,
+  outbuf, outlen, outlen_actual,
+  true, &raw_rc);
+   if (delay_us < 10)
+   delay_us <<= 1;
+   } while ((rc == -EPROTO) &&
+(raw_rc == MC_CMD_ERR_NO_EVB_PORT) &&
+time_before(jiffies, abort_time));
+   }
+
+   if (rc && !quiet && !(cmd == MC_CMD_REBOOT && rc == -EIO))
+   

[PATCH net-next 3/5] sfc: Make failed filter removal less noisy

2015-12-18 Thread Bert Kenward
There are situations - mostly reset related - where our view of the
filter table differs from the hardware. In this case we may try and
remove filters that aren't actually installed. This isn't that
interesting in most situations, so downgrade the logging.

Signed-off-by: Bert Kenward 
---
 drivers/net/ethernet/sfc/ef10.c | 48 ++---
 1 file changed, 31 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index c4a0e8a..edffc9a 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -3832,13 +3832,12 @@ static void efx_ef10_filter_table_remove(struct efx_nic 
*efx)
   MC_CMD_FILTER_OP_IN_OP_UNSUBSCRIBE);
MCDI_SET_QWORD(inbuf, FILTER_OP_IN_HANDLE,
   table->entry[filter_idx].handle);
-   rc = efx_mcdi_rpc(efx, MC_CMD_FILTER_OP, inbuf, sizeof(inbuf),
- NULL, 0, NULL);
+   rc = efx_mcdi_rpc_quiet(efx, MC_CMD_FILTER_OP, inbuf,
+   sizeof(inbuf), NULL, 0, NULL);
if (rc)
-   netdev_WARN(efx->net_dev,
-   "filter_idx=%#x handle=%#llx\n",
-   filter_idx,
-   table->entry[filter_idx].handle);
+   netif_info(efx, drv, efx->net_dev,
+  "%s: filter %04x remove failed\n",
+  __func__, filter_idx);
kfree(spec);
}
 
@@ -3847,11 +3846,14 @@ static void efx_ef10_filter_table_remove(struct efx_nic 
*efx)
 }
 
 #define EFX_EF10_FILTER_DO_MARK_OLD(id) \
-   if (id != EFX_EF10_FILTER_ID_INVALID) { \
-   filter_idx = efx_ef10_filter_get_unsafe_id(efx, id); \
-   WARN_ON(!table->entry[filter_idx].spec); \
-   table->entry[filter_idx].spec |= 
EFX_EF10_FILTER_FLAG_AUTO_OLD; \
-   }
+   if (id != EFX_EF10_FILTER_ID_INVALID) { \
+   filter_idx = efx_ef10_filter_get_unsafe_id(efx, id); \
+   if (!table->entry[filter_idx].spec) \
+   netif_dbg(efx, drv, efx->net_dev, \
+ "%s: marked null spec old %04x:%04x\n", \
+ __func__, id, filter_idx); \
+   table->entry[filter_idx].spec |= EFX_EF10_FILTER_FLAG_AUTO_OLD;\
+   }
 static void efx_ef10_filter_mark_old(struct efx_nic *efx)
 {
struct efx_ef10_filter_table *table = efx->filter_state;
@@ -4070,19 +4072,31 @@ static int efx_ef10_filter_insert_def(struct efx_nic 
*efx, bool multicast,
 static void efx_ef10_filter_remove_old(struct efx_nic *efx)
 {
struct efx_ef10_filter_table *table = efx->filter_state;
-   bool remove_failed = false;
+   int remove_failed = 0;
+   int remove_noent = 0;
+   int rc;
int i;
 
for (i = 0; i < HUNT_FILTER_TBL_ROWS; i++) {
if (ACCESS_ONCE(table->entry[i].spec) &
EFX_EF10_FILTER_FLAG_AUTO_OLD) {
-   if (efx_ef10_filter_remove_internal(
-   efx, 1U << EFX_FILTER_PRI_AUTO,
-   i, true) < 0)
-   remove_failed = true;
+   rc = efx_ef10_filter_remove_internal(efx,
+   1U << EFX_FILTER_PRI_AUTO, i, true);
+   if (rc == -ENOENT)
+   remove_noent++;
+   else if (rc)
+   remove_failed++;
}
}
-   WARN_ON(remove_failed);
+
+   if (remove_failed)
+   netif_info(efx, drv, efx->net_dev,
+  "%s: failed to remove %d filters\n",
+  __func__, remove_failed);
+   if (remove_noent)
+   netif_info(efx, drv, efx->net_dev,
+  "%s: failed to remove %d non-existent filters\n",
+  __func__, remove_noent);
 }
 
 static int efx_ef10_vport_set_mac_address(struct efx_nic *efx)
-- 
2.4.3




Fw: [Bug 109581] New: WARNING: CPU: 1 PID: 0 at net/sched/sch_hfsc.c:1429 hfsc_dequeue+0x166/0x2da()

2015-12-18 Thread Stephen Hemminger


Begin forwarded message:

Date: Fri, 18 Dec 2015 17:12:15 +
From: "bugzilla-dae...@bugzilla.kernel.org" 

To: "shemmin...@linux-foundation.org" 
Subject: [Bug 109581] New: WARNING: CPU: 1 PID: 0 at net/sched/sch_hfsc.c:1429 
hfsc_dequeue+0x166/0x2da()


https://bugzilla.kernel.org/show_bug.cgi?id=109581

Bug ID: 109581
   Summary: WARNING: CPU: 1 PID: 0 at net/sched/sch_hfsc.c:1429
hfsc_dequeue+0x166/0x2da()
   Product: Networking
   Version: 2.5
Kernel Version: 4.3.3
  Hardware: Intel
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: high
  Priority: P1
 Component: Other
  Assignee: shemmin...@linux-foundation.org
  Reporter: pstaszew...@artcom.pl
Regression: No

Dec 18 18:08:12 rtbgp05 kernel: [63943.791331] WARNING: CPU: 1 PID: 0 at
net/sched/sch_hfsc.c:1429 hfsc_dequeue+0x166/0x2da()
Dec 18 18:08:12 rtbgp05 kernel: [63943.791332] Modules linked in: ip_vs
act_police bonding ipmi_si ipmi_msghandler
Dec 18 18:08:12 rtbgp05 kernel: [63943.791339] CPU: 1 PID: 0 Comm: swapper/1
Tainted: GW   4.3.3 #1
Dec 18 18:08:12 rtbgp05 kernel: [63943.791339]  0595
88021fc43788 812550ec 
Dec 18 18:08:12 rtbgp05 kernel: [63943.791341]  
88021fc437c8 8103c428 880200c340e8
Dec 18 18:08:12 rtbgp05 kernel: [63943.791344]  81528b38
 8802135b3800 8802135b3940
Dec 18 18:08:12 rtbgp05 kernel: [63943.791346] Call Trace:
Dec 18 18:08:12 rtbgp05 kernel: [63943.791347][]
dump_stack+0x48/0x5c
Dec 18 18:08:12 rtbgp05 kernel: [63943.791350]  []
warn_slowpath_common+0x97/0xb1
Dec 18 18:08:12 rtbgp05 kernel: [63943.791352]  [] ?
hfsc_dequeue+0x166/0x2da
Dec 18 18:08:12 rtbgp05 kernel: [63943.791353]  []
warn_slowpath_null+0x15/0x17
Dec 18 18:08:12 rtbgp05 kernel: [63943.791354]  []
hfsc_dequeue+0x166/0x2da
Dec 18 18:08:12 rtbgp05 kernel: [63943.791356]  []
__qdisc_run+0xc0/0x17b
Dec 18 18:08:12 rtbgp05 kernel: [63943.791358]  []
__dev_queue_xmit+0x28d/0x3ee
Dec 18 18:08:12 rtbgp05 kernel: [63943.791360]  [] ?
eth_header+0x28/0xb5
Dec 18 18:08:12 rtbgp05 kernel: [63943.791361]  []
dev_queue_xmit_sk+0xe/0x10
Dec 18 18:08:12 rtbgp05 kernel: [63943.791363]  []
neigh_connected_output+0xa7/0xc5
Dec 18 18:08:12 rtbgp05 kernel: [63943.791364]  []
ip_finish_output2+0x2b3/0x2fb
Dec 18 18:08:12 rtbgp05 kernel: [63943.791365]  [] ?
nf_hook_slow+0x3f/0xb8
Dec 18 18:08:12 rtbgp05 kernel: [63943.791366]  []
ip_finish_output+0x131/0x141
Dec 18 18:08:12 rtbgp05 kernel: [63943.791367]  []
ip_output+0x6e/0x73
Dec 18 18:08:12 rtbgp05 kernel: [63943.791369]  []
ip_forward_finish+0x7b/0x82
Dec 18 18:08:12 rtbgp05 kernel: [63943.791370]  []
ip_forward+0x354/0x3ee
Dec 18 18:08:12 rtbgp05 kernel: [63943.791372]  [] ?
ip_frag_mem+0x43/0x43
Dec 18 18:08:12 rtbgp05 kernel: [63943.791373]  []
ip_rcv_finish+0x278/0x294
Dec 18 18:08:12 rtbgp05 kernel: [63943.791375]  [] ?
pskb_may_pull+0x2d/0x2d
Dec 18 18:08:12 rtbgp05 kernel: [63943.791376]  []
NF_HOOK.clone.11+0x6e/0x77
Dec 18 18:08:12 rtbgp05 kernel: [63943.791377]  [] ?
pskb_may_pull+0x2d/0x2d
Dec 18 18:08:12 rtbgp05 kernel: [63943.791378]  []
ip_rcv+0x27e/0x2c0
Dec 18 18:08:12 rtbgp05 kernel: [63943.791379]  []
__netif_receive_skb_core+0x54f/0x592
Dec 18 18:08:12 rtbgp05 kernel: [63943.791381]  [] ?
udp4_gro_receive+0x1be/0x1df
Dec 18 18:08:12 rtbgp05 kernel: [63943.791382]  [] ?
inet_gro_receive+0x20f/0x238
Dec 18 18:08:12 rtbgp05 kernel: [63943.791383]  [] ?
timekeeping_get_ns+0x12/0x38
Dec 18 18:08:12 rtbgp05 kernel: [63943.791384]  []
__netif_receive_skb+0x52/0x57
Dec 18 18:08:12 rtbgp05 kernel: [63943.791385]  []
netif_receive_skb_internal+0x6b/0x72
Dec 18 18:08:12 rtbgp05 kernel: [63943.791386]  []
napi_gro_receive+0x39/0x7f
Dec 18 18:08:12 rtbgp05 kernel: [63943.791388]  []
ixgbe_clean_rx_irq+0x689/0x700
Dec 18 18:08:12 rtbgp05 kernel: [63943.791389]  []
ixgbe_poll+0x48e/0x5d9
Dec 18 18:08:12 rtbgp05 kernel: [63943.791390]  [] ?
__wake_up+0x3f/0x48
Dec 18 18:08:12 rtbgp05 kernel: [63943.791392]  [] ?
irq_exit+0x43/0x45
Dec 18 18:08:12 rtbgp05 kernel: [63943.791394]  []
net_rx_action+0xdb/0x237
Dec 18 18:08:12 rtbgp05 kernel: [63943.791395]  []
__do_softirq+0xb3/0x1ac
Dec 18 18:08:12 rtbgp05 kernel: [63943.791396]  []
irq_exit+0x37/0x45
Dec 18 18:08:12 rtbgp05 kernel: [63943.791398]  []
do_IRQ+0xa4/0xbd
Dec 18 18:08:12 rtbgp05 kernel: [63943.791400]  []
common_interrupt+0x7f/0x7f
Dec 18 18:08:12 rtbgp05 kernel: [63943.791402][] ?
cpuidle_enter_state+0x12a/0x180
Dec 18 18:08:12 rtbgp05 kernel: [63943.791402]  [] ?
cpuidle_enter_state+0xee/0x180
Dec 18 18:08:12 rtbgp05 kernel: [63943.791403]  []
cpuidle_enter+0x12/0x14
Dec 18 18:08:12 rtbgp05 kernel: [63943.791405]  []
cpu_startup_entry+0x12e/0x1a8
Dec 18 18:08:12 rtbgp05 kernel: [63943.791406]  []

Re: [PATCH 1/2] [iproute2] tc/q_htb.c: remove printing of a deprecated overhead value previously encoded as a part of mpu field

2015-12-18 Thread Dmitrii Shcherbakov
Jesper,

> Thank you Dmitrii for cleaning this up :-)

You are welcome :^)

I should read more carefully: it's what you asked for from the beginning.

Thank you,
Dmitrii Shcherbakov


[PATCH net-next 0/5] sfc: additional virtual function support

2015-12-18 Thread Bert Kenward
This introduces the client side of a mechanism to defer authorisation of
operations, for example multicast subscription. Although primarily aimed at
SRIOV VFs this can also apply to unprivileged PFs.

Also handle reboot ordering corner cases better and reduce the level of some
logging.

Bert Kenward (4):
  sfc: Retry MCDI after NO_EVB_PORT error on a VF
  sfc: Handle MCDI proxy authorisation
  sfc: Make failed filter removal less noisy
  sfc: Downgrade or remove some error messages

Tomáš Pilař (1):
  sfc: Downgrade EPERM messages from MCDI to debug

 drivers/net/ethernet/sfc/ef10.c |  68 +++
 drivers/net/ethernet/sfc/efx.c  |   7 +-
 drivers/net/ethernet/sfc/mcdi.c | 252 
 drivers/net/ethernet/sfc/mcdi.h |  10 ++
 4 files changed, 290 insertions(+), 47 deletions(-)

-- 
2.4.3



commit e34d65696d2e broke stmmac ethernet on socfpga

2015-12-18 Thread Dinh Nguyen
Hi,

It appears that commit e34d65696d2e 'stmmac: create of compatible mdio bus
for stmmac driver' is causing this error on the SoCFPGA platform:

[1.767246] libphy: PHY stmmac-0: not found
[1.772106] eth0: Could not attach to PHY
[1.776129] stmmac_open: Cannot attach to PHY (error: -19)
[1.781590] IP-Config: Failed to open eth0
[1.785681] IP-Config: No network devices available

Doing a revert of this commit fixes the issue. Will try to debug further.

Thanks,
Dinh


Re: [PATCH] ila: add NETFILTER dependency

2015-12-18 Thread Pablo Neira Ayuso
On Fri, Dec 18, 2015 at 03:37:37PM +0100, Arnd Bergmann wrote:
> The recently added generic ILA translation facility fails to
> build when CONFIG_NETFILTER is disabled:
> 
> net/ipv6/ila/ila_xlat.c:229:20: warning: 'struct nf_hook_state' declared 
> inside parameter list
> net/ipv6/ila/ila_xlat.c:235:27: error: array type has incomplete element type 
> 'struct nf_hook_ops'
>  static struct nf_hook_ops ila_nf_hook_ops[] __read_mostly = {
> 
> This adds an explicit Kconfig dependency to avoid that case.

I'm afraid this extra Kconfig dependency that Arnd adds to fix this is
a symptom that there is something that doesn't belong there.

I overlooked this new hook on priority -1, how does this integrate into
our infrastructure?

And mainly, isn't there any better way to integrate this into the
stack?

And why didn't you Cc netfilter-devel for code that involves
Netfilter?

We have to evaluate how this integrates into what we have, if it
breaks when it interacts with other components that we have.

I'm very sorry to say, but none of this has happened so far.


Re: [PATCH 1/1] net: Add SO_REUSEPORT_LISTEN_OFF socket option as drain mode

2015-12-18 Thread Josh Snyder
I was also puzzled that binding succeeded. Looking into the code paths
involved, in inet_csk_get_port, we quickly goto have_snum. From there, we end
up dropping into tb_found. Since !hlist_empty(&tb->owners), we end up checking
that (tb->fastreuseport > 0 && sk->sk_reuseport && uid_eq(tb->fastuid, uid)).
This test passes, so we goto success and bind.

Crucially, we are checking the fastreuseport field on the inet_bind_bucket, and
not the sk_reuseport variable on the other sockets in the bucket. Since this
bit is set based on sk_reuseport at the time the first socket binds (see
tb_not_found), I can see no reason why sockets need to keep SO_REUSEPORT set
beyond initial binding.
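
A simplified model of the check described above, written as a standalone
sketch rather than the kernel code. The sketch_* types and functions are
hypothetical stand-ins for inet_bind_bucket and struct sock, and uids are
plain integers here.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the bind bucket. */
struct sketch_bucket {
    int owners;            /* number of sockets bound to this port */
    int fastreuseport;     /* latched when the first socket binds */
    unsigned int fastuid;  /* uid of the first binder */
};

/* Hypothetical, simplified stand-in for a socket. */
struct sketch_sock {
    bool reuseport;        /* SO_REUSEPORT currently set on the socket */
    unsigned int uid;
};

/* First bind on an empty bucket: latch the reuseport state, as the
 * tb_not_found path is described as doing. */
static void sketch_first_bind(struct sketch_bucket *tb,
                              const struct sketch_sock *sk)
{
    tb->owners = 1;
    tb->fastreuseport = sk->reuseport ? 1 : 0;
    tb->fastuid = sk->uid;
}

/* Later bind on a non-empty bucket: only the bucket fields and the new
 * socket are consulted, never the flags of sockets already bound. */
static bool sketch_fast_bind(struct sketch_bucket *tb,
                             const struct sketch_sock *sk)
{
    if (tb->owners > 0 && tb->fastreuseport > 0 &&
        sk->reuseport && tb->fastuid == sk->uid) {
        tb->owners++;
        return true;
    }
    return false;  /* real code would fall back to the conflict check */
}
```

In this model, clearing reuseport on the first socket after it has bound
does not stop a second socket with the same uid from binding, which matches
the behaviour observed in the thread.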

Given this, I believe Willy's patch elegantly solves the problem at hand.

Josh

On Wed, Dec 16, 2015 at 8:15 AM, Willy Tarreau  wrote:
> Hi Eric,
>
> On Wed, Dec 16, 2015 at 08:38:14AM +0100, Willy Tarreau wrote:
>> On Tue, Dec 15, 2015 at 01:21:15PM -0800, Eric Dumazet wrote:
>> > On Tue, 2015-12-15 at 20:44 +0100, Willy Tarreau wrote:
>> >
>> > > Thus do you think it's worth adding a new option as Tolga proposed ?
>> >
>> >
>> > I thought we tried hard to avoid adding the option but determined
>> > we could not avoid it ;)
>>
>> Not yet, your other proposal of disabling SO_REUSEPORT makes sense if
>> we combine it with the proposal to change the score in my patch. If
>> we say that a socket which has SO_REUSEPORT scores higher, then the
>> connections which don't want to accept new connections anymore will
>> simply have to drop it and not be elected. I find this even cleaner
>> since the sole purpose of the loop is to find the best socket in case
>> of SO_REUSEPORT.
>
> So I tried this and am pretty satisfied with the results, as I couldn't
> see any single reset on 4.4-rc5 with it. On 4.1 I got a few very rare
> resets at the exact moment the new process binds to the socket, because
> I suspect some ACKs end up in the wrong queue exactly there. But
> apparently the changes you did in 4.4 totally got rid of this, which is
> great!
>
> I suspected that I could enter a situation where a new process could
> fail to bind if generations n-1 and n-2 were still present, because
> n-2 would be running without SO_REUSEPORT and that should make this
> test fail in inet_csk_bind_conflict(), but it never failed for me :
>
> if ((!reuse || !sk2->sk_reuse ||
> sk2->sk_state == TCP_LISTEN) &&
> (!reuseport || !sk2->sk_reuseport ||
> (sk2->sk_state != TCP_TIME_WAIT &&
>  !uid_eq(uid, sock_i_uid(sk2))))) {
> ...
>
> So I'm clearly missing something and can't spot what. I mean, I'd
> prefer to see my patch occasionally fail than not understanding why
> it always works! If anyone has a suggestion I'm interested.
>
> Here's the updated patch.
>
> Best regards,
> Willy
>


Re: [PATCH][iproute2] tc/q_htb.c: Fix the MPU value output in 'tc -d class show dev ' command

2015-12-18 Thread Dmitrii Shcherbakov
Phil,

> Dmitrii, did iproute2 without your change even print the overhead as set
> by you before? Looking at the code, I'd assume not.

Tried building iproute2 (as of tag 4.2) and using the upstream linux kernel 
(also tag 4.2 - 64291f7db5bd8150a74ad2036f1037e6a0428df2):

~/src/iproute2/tc$ uname -r
4.2.0-040200-generic

~/src/iproute2/tc$ grep -inP 'htb' /boot/config-4.2.0-040200-generic 
1331:CONFIG_NET_SCH_HTB=m

~/src/iproute2/tc$ lsmod | grep htb
sch_htb                24576  1

~/src/iproute2/tc$ ./tc -d class show dev eth0

~/src/iproute2/tc$ sudo ./tc qdisc add dev eth0 root handle 1: htb default 12

~/src/iproute2/tc$ sudo ./tc class add dev eth0 parent 1: classid 1:1 htb rate 
100kbps ceil 100kbps mpu 256 overhead 64

~/src/iproute2/tc$ tc -d class show dev eth0
class htb 1:1 root prio 0 quantum 1 rate 800Kbit overhead 64 ceil 800Kbit 
linklayer ethernet burst 1600b/1 mpu 0b overhead 0b cburst 1600b/1 mpu 0b 
overhead 0b level 0

~/src/iproute2/tc$ lsmod | grep htb
sch_htb                24576  1

So it looks like the overhead is being set correctly, but the mpu is not, even 
though the respective kernel module is loaded judging by what I see.

Regards,
Dmitrii Shcherbakov


Re: [PATCH] veth: don't modify ip-summed; doing so treats packets with bad checksums as good.

2015-12-18 Thread Cong Wang
(Cc'ing Eric B and Tom)

On Fri, Dec 18, 2015 at 9:54 AM, Vijay Pandurangan  wrote:
> Packets that arrive from real hardware devices have ip_summed ==
> CHECKSUM_UNNECESSARY if the hardware verified the checksums, or
> CHECKSUM_NONE if the packet is bad or it was unable to verify it. The
> current version of veth will replace CHECKSUM_NONE with
> CHECKSUM_UNNECESSARY, which causes corrupt packets routed from hardware to
> a veth device to be delivered to the application. This caused applications
> at Twitter to receive corrupt data when network hardware was corrupting
> packets.

Yeah, https://reviews.apache.org/r/41158/.

This is because normally packets to a veth device are _only_ from its pair
device, Mesos network isolator redirects packets from a hardware interface
to veth, which violates this expectation. This is also why no one else sees
this bug. ;)
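
To make the effect concrete, here is a small standalone model of the
behaviour under discussion. It is a sketch under assumptions, not the veth
source: the sketch_* names are hypothetical, the rx_csum_offload flag is
assumed from the remark about disabling RX checksum offloading, and the
receive side is reduced to "only CHECKSUM_NONE packets get verified".

```c
#include <stdbool.h>

/* Simplified checksum states, named after the kernel's ip_summed values. */
enum sketch_csum {
    SKETCH_CSUM_NONE,
    SKETCH_CSUM_UNNECESSARY,
    SKETCH_CSUM_PARTIAL
};

/* Behaviour described as current: an unverified packet is relabelled as
 * already verified when RX checksum offload is on (assumed condition). */
static enum sketch_csum sketch_veth_current(enum sketch_csum in,
                                            bool rx_csum_offload)
{
    if (in == SKETCH_CSUM_NONE && rx_csum_offload)
        return SKETCH_CSUM_UNNECESSARY;
    return in;
}

/* Behaviour proposed by the patch: leave the state untouched. */
static enum sketch_csum sketch_veth_patched(enum sketch_csum in)
{
    return in;
}

/* In this simplified model the receiving stack verifies the checksum in
 * software only for CHECKSUM_NONE packets. */
static bool sketch_stack_verifies(enum sketch_csum state)
{
    return state == SKETCH_CSUM_NONE;
}
```

With the current behaviour a corrupt packet arriving as CHECKSUM_NONE skips
verification; with the patched behaviour it is verified and can be dropped.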

>
> We believe this was added as an optimization to skip computing and
> verifying checksums for communication between containers. However, locally
> generated packets have ip_summed == CHECKSUM_PARTIAL, so the code as
> written does nothing for them. As far as we can tell, after removing this
> code, these packets are transmitted from one stack to another unmodified
> (tcpdump shows invalid checksums on both sides, as expected), and they are
> delivered correctly to applications. We didn’t test every possible network
> configuration, but we tried a few common ones such as bridging containers,
> using NAT between the host and a container, and routing from hardware
> devices to containers. We have effectively deployed this in production at
> Twitter (by disabling RX checksum offloading on veth devices).


I am wondering if there is any other CHECKSUM_NONE case in the tx
path we could miss here. Mesos case is too special not only because
it redirects packets from hardware to veth, but also because it moves
packets from RX path to TX path.

Eric? Tom?

>
> This code dates back to the first version of the driver, commit
>  ("[NET]: Virtual ethernet device driver"), so I
> suspect this bug occurred mostly because the driver API has evolved
> significantly since then. Commit <0b7967503dc97864f283a> ("net/veth: Fix
> packet checksumming") (in December 2010) fixed this for packets that get
> created locally and sent to hardware devices, by not changing
> CHECKSUM_PARTIAL. However, the same issue still occurs for packets coming
> in from hardware devices.
>
> Co-authored-by: Evan Jones 
> Signed-off-by: Evan Jones 
> Cc: Nicolas Dichtel 
> Cc: Phil Sutter 
> Cc: Toshiaki Makita 
> Cc: netdev@vger.kernel.org
> Cc: linux-ker...@vger.kernel.org
> Signed-off-by: Vijay Pandurangan 

Your patch looks good to me but your email client corrupts your patch,
so please resend.


Re: [PATCH net-next 0/2] Local checksum offload for VXLAN

2015-12-18 Thread Tom Herbert
On Fri, Dec 18, 2015 at 2:41 AM, Edward Cree  wrote:
> On 17/12/15 18:06, Tom Herbert wrote:
>> I'm not sure that we need bits in VXLAN or any other encapsulation. It
>> should be sufficient in udp_set_csum that if we already have
>> CHECKSUM_PARTIAL that can always be used to do local checksum offload.
> My understanding is that otherwise iptunnel_handle_offloads() will do the
>  inner checksum in sw, because csum_help will be passed as true.  It will
>  call skb_checksum_help().
>> This is also should be independent as to whether the device does
>> NETIF_F_HW_CSUM or can offload  NETIF_F_IP[V6]_CSUM for encapsulated
>> packets.
> I was wary of drivers that declare NETIF_F_IP[V6]_CSUM but don't cope with
>  encapsulated packets.  Would they do the right thing if the inner_csum bool
>  in patch 2 just tested for NETIF_F_CSUM_MASK, or do I need to test things
>  like NETIF_F_GSO_UDP_TUNNEL_CSUM?  I'm afraid I don't entirely understand
>  the infrastructure here, so I just did the minimal thing I was sure worked,
>  i.e. testing for NETIF_F_HW_CSUM.

Drivers indicate that they can do NETIF_F_IP[V6]_CSUM for encapsulation by
setting enc_features. This is checked in validate_xmit_skb, so that if the
driver can't handle an encapsulated checksum, skb_checksum_help is called
there.
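
A rough standalone model of that decision, assuming a single feature bit
and hypothetical sketch_* types; it is not the validate_xmit_skb code, only
an illustration of masking the device features with the encapsulation
features before deciding whether software checksumming is needed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder feature bit (assumption; not the kernel's bit value). */
#define SKETCH_F_IP_CSUM (1u << 0)

/* Hypothetical device: general features and features for encapsulation. */
struct sketch_dev {
    uint32_t features;
    uint32_t enc_features;
};

/* Hypothetical packet state. */
struct sketch_pkt {
    bool encapsulated;   /* packet carries an inner header */
    bool csum_partial;   /* checksum still has to be filled in */
};

/* Returns true when the checksum must be computed in software before the
 * packet is handed to the device. */
static bool sketch_needs_sw_csum(const struct sketch_dev *dev,
                                 const struct sketch_pkt *pkt)
{
    uint32_t usable = dev->features;

    /* For encapsulated packets only the encapsulation features count. */
    if (pkt->encapsulated)
        usable &= dev->enc_features;

    return pkt->csum_partial && !(usable & SKETCH_F_IP_CSUM);
}
```

A device that advertises checksum offload but leaves the bit out of its
encapsulation features therefore gets the software fallback for tunnelled
packets only.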

>> It would be nice to have a more formal documentation also. This is a
>> very powerful mechanism but the math behind it and requirements are
>> subtle.
>>
>> Tom
> What would be a good place to put such documentation?  In
>  Documentation/networking, or as part of the big checksums comment at the
>  top of skbuff.h?
>
I don't think this is right for skbuff.h; that should just describe the
interface. LCO has no interface like checksum-unnecessary conversion.
Checksumming, encapsulation, and segmentation offloads are complex enough
now that there should really be a Linux doc on this, maybe modeled after
scaling.txt. That's probably more than you bargained for in this
patch, but if someone wants to learn how this infrastructure really
works, writing a doc is a good way! :-)

Tom


Re: [PATCH net-next] ipv6: addrconf: use stable address generator for ARPHRD_NONE

2015-12-18 Thread David Miller
From: Bjørn Mork 
Date: Wed, 16 Dec 2015 16:44:38 +0100

> Add a new address generator mode, using the stable address generator
> with an automatically generated secret. This is intended as a default
> address generator mode for device types with no EUI64 implementation.
> The new generator is used for ARPHRD_NONE interfaces initially, adding
> default IPv6 autoconf support to e.g. tun interfaces.
> 
> If the addrgenmode is set to 'random', either by default or manually,
> and no stable secret is available, then a random secret is used as
> input for the stable-privacy address generator.  The secret can be
> read and modified like manually configured secrets, using the proc
> interface.  Modifying the secret will change the addrgen mode to
> 'stable-privacy' to indicate that it operates on a known secret.
> 
> Existing behaviour of the 'stable-privacy' mode is kept unchanged. If
> a known secret is available when the device is created, then the mode
> will default to 'stable-privacy' as before.  The mode can be manually
> set to 'random' but it will behave exactly like 'stable-privacy' in
> this case. The secret will not change.
> 
> Cc: Hannes Frederic Sowa 
> Cc: 吉藤英明 
> Signed-off-by: Bjørn Mork 

Applied, thanks!

[PATCH 2/2] can: sja1000: of: add compatibility with Technologic Systems version

2015-12-18 Thread Damien Riegel
Technologic Systems provides an IP compatible with the SJA1000,
instantiated in an FPGA. Because of a bus width issue, access to
registers is made through a "window" that works like this:

base + 0x0: address to read/write
base + 0x2: 8-bit register value

This commit adds a new compatible device, "technologic,sja1000", with
read and write functions using the window mechanism.
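
The window mechanism can be illustrated with a small standalone model. This
is a sketch only: the device is simulated in memory, all sketch_* names are
hypothetical, and the 256-entry register file is an assumption made for the
example.

```c
#include <stdint.h>

/* Simulated device: a register file reachable only through the window. */
struct sketch_window {
    uint8_t regs[256];   /* assumed size, for illustration */
    uint16_t addr;       /* last address latched at offset 0x0 */
};

/* 16-bit bus write: offset 0x0 latches the address, 0x2 stores a value. */
static void sketch_bus_write16(struct sketch_window *w, unsigned int off,
                               uint16_t val)
{
    if (off == 0)
        w->addr = val;
    else if (off == 2)
        w->regs[w->addr & 0xff] = (uint8_t)val;
}

/* 16-bit bus read: offset 0x2 returns the register at the latched address. */
static uint16_t sketch_bus_read16(struct sketch_window *w, unsigned int off)
{
    if (off == 0)
        return w->addr;
    if (off == 2)
        return w->regs[w->addr & 0xff];
    return 0;
}

/* Register read: select the register, then read it through the window. */
static uint8_t sketch_read_reg(struct sketch_window *w, int reg)
{
    sketch_bus_write16(w, 0, (uint16_t)reg);
    return (uint8_t)sketch_bus_read16(w, 2);
}

/* Register write: select the register, then write through the window. */
static void sketch_write_reg(struct sketch_window *w, int reg, uint8_t val)
{
    sketch_bus_write16(w, 0, (uint16_t)reg);
    sketch_bus_write16(w, 2, val);
}
```

Each register access costs two bus cycles, and the two steps must not be
interleaved with another access to the same window.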

Signed-off-by: Damien Riegel 
---
 drivers/net/can/sja1000/sja1000_platform.c | 30 --
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/drivers/net/can/sja1000/sja1000_platform.c 
b/drivers/net/can/sja1000/sja1000_platform.c
index 0552ed4..6cbf251 100644
--- a/drivers/net/can/sja1000/sja1000_platform.c
+++ b/drivers/net/can/sja1000/sja1000_platform.c
@@ -70,6 +70,18 @@ static void sp_write_reg32(const struct sja1000_priv *priv, 
int reg, u8 val)
iowrite8(val, priv->reg_base + reg * 4);
 }
 
+static u8 ts4800_read_reg16(const struct sja1000_priv *priv, int reg)
+{
+   sp_write_reg16(priv, 0,  reg);
+   return sp_read_reg16(priv, 2);
+}
+
+static void ts4800_write_reg16(const struct sja1000_priv *priv, int reg, u8 val)
+{
+   sp_write_reg16(priv, 0, reg);
+   sp_write_reg16(priv, 2, val);
+}
+
 static void sp_populate(struct sja1000_priv *priv,
struct sja1000_platform_data *pdata,
unsigned long resource_mem_flags)
@@ -98,21 +110,34 @@ static void sp_populate(struct sja1000_priv *priv,
 
 static void sp_populate_of(struct sja1000_priv *priv, struct device_node *of)
 {
+   int is_technologic;
int err;
u32 prop;
 
+   is_technologic = of_device_is_compatible(of, "technologic,sja1000");
+
err = of_property_read_u32(of, "reg-io-width", &prop);
if (err)
prop = 1; /* 8 bit is default */
 
+   if (is_technologic && prop != 2) {
+   netdev_warn(priv->dev, "forcing reg-io-width to 2\n");
+   prop = 2;
+   }
+
switch (prop) {
case 4:
priv->read_reg = sp_read_reg32;
priv->write_reg = sp_write_reg32;
break;
case 2:
-   priv->read_reg = sp_read_reg16;
-   priv->write_reg = sp_write_reg16;
+   if (is_technologic) {
+   priv->read_reg = ts4800_read_reg16;
+   priv->write_reg = ts4800_write_reg16;
+   } else {
+   priv->read_reg = sp_read_reg16;
+   priv->write_reg = sp_write_reg16;
+   }
break;
case 1: /* fallthrough */
default:
@@ -244,6 +269,7 @@ static int sp_remove(struct platform_device *pdev)
 
 static const struct of_device_id sp_of_table[] = {
{.compatible = "nxp,sja1000"},
+   {.compatible = "technologic,sja1000"},
{},
 };
 MODULE_DEVICE_TABLE(of, sp_of_table);
-- 
2.5.0
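
For illustration only, the window access described in the commit message
above can be modelled in plain user-space C roughly as follows. This is a
sketch, not the driver code: struct ts_window, win_write16() and
win_read16() are hypothetical stand-ins for the FPGA and the 16-bit bus
accessors, and the 128-byte register file size is an assumption.

```c
#include <stdint.h>

/* Hypothetical model of the FPGA window described in the commit message. */
struct ts_window {
	uint16_t addr;      /* base + 0x0: SJA1000 register address to access */
	uint8_t regs[128];  /* simulated register file behind the window (size assumed) */
};

/* Simulated 16-bit write at byte offset 'off' from the window base. */
static void win_write16(struct ts_window *w, unsigned int off, uint16_t val)
{
	if (off == 0x0)
		w->addr = val;
	else if (off == 0x2)
		w->regs[w->addr % 128] = (uint8_t)val;
}

/* Simulated 16-bit read at byte offset 'off' from the window base. */
static uint16_t win_read16(const struct ts_window *w, unsigned int off)
{
	if (off == 0x0)
		return w->addr;
	if (off == 0x2)
		return w->regs[w->addr % 128];
	return 0;
}

/* Read one 8-bit register: select it, then read the value slot. */
static uint8_t ts_read_reg(struct ts_window *w, int reg)
{
	win_write16(w, 0x0, (uint16_t)reg);
	return (uint8_t)win_read16(w, 0x2);
}

/* Write one 8-bit register: select it, then write the value slot. */
static void ts_write_reg(struct ts_window *w, int reg, uint8_t val)
{
	win_write16(w, 0x0, (uint16_t)reg);
	win_write16(w, 0x2, val);
}
```

Every register access costs two bus cycles in this model, one to select
the register and one to transfer the value, which is why the accessors
cannot simply be an offset calculation like the other reg-io-width cases.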

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net-next] nfp: call netif_carrier_off() during init

2015-12-18 Thread David Miller
From: Jakub Kicinski 
Date: Thu, 17 Dec 2015 14:18:44 +

> Netdevs default to carrier on, we should call netif_carrier_off()
> during initialization since we handle carrier state changes in the
> driver.
> 
> Signed-off-by: Jakub Kicinski 
> Reviewed-by: Rolf Neugebauer 

Applied, thanks.


[PATCH 05/23] netfilter-bridge: brace placement

2015-12-18 Thread Pablo Neira Ayuso
From: Ian Morris 

Change brace placement to eliminate checkpatch error.

No changes detected by objdiff.

Signed-off-by: Ian Morris 
Signed-off-by: Pablo Neira Ayuso 
---
 net/bridge/netfilter/ebt_log.c  | 6 ++
 net/bridge/netfilter/ebtables.c | 3 +--
 2 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/net/bridge/netfilter/ebt_log.c b/net/bridge/netfilter/ebt_log.c
index f22284d..152300d 100644
--- a/net/bridge/netfilter/ebt_log.c
+++ b/net/bridge/netfilter/ebt_log.c
@@ -36,14 +36,12 @@ static int ebt_log_tg_check(const struct xt_tgchk_param *par)
return 0;
 }
 
-struct tcpudphdr
-{
+struct tcpudphdr {
__be16 src;
__be16 dst;
 };
 
-struct arppayload
-{
+struct arppayload {
unsigned char mac_src[ETH_ALEN];
unsigned char ip_src[4];
unsigned char mac_dst[ETH_ALEN];
diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index 62090e2..b13ea69 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -551,8 +551,7 @@ ebt_check_entry_size_and_hooks(const struct ebt_entry *e,
return 0;
 }
 
-struct ebt_cl_stack
-{
+struct ebt_cl_stack {
struct ebt_chainstack cs;
int from;
unsigned int hookmask;
-- 
2.1.4



Re: [PATCH 00/23] Netfilter updates for net-next

2015-12-18 Thread David Miller
From: Pablo Neira Ayuso 
Date: Fri, 18 Dec 2015 21:26:26 +0100

> The following patchset contains the first batch of Netfilter updates for
> the upcoming 4.5 kernel. This batch contains userspace netfilter header
> compilation fixes, support for packet mangling in nf_tables, the new
> tracing infrastructure for nf_tables and cgroup2 support for iptables.
> More specifically, they are:
 ...

Pulled.

> BTW, I need that you pull net into net-next, I have another batch that
> requires changes that I don't yet see in net.

I did that several hours ago.


[PATCH] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.

2015-12-18 Thread Vijay Pandurangan
Packets that arrive from real hardware devices have ip_summed ==
CHECKSUM_UNNECESSARY if the hardware verified the checksums, or
CHECKSUM_NONE if the packet is bad or it was unable to verify it. The
current version of veth will replace CHECKSUM_NONE with
CHECKSUM_UNNECESSARY, which causes corrupt packets routed from hardware to
a veth device to be delivered to the application. This caused applications
at Twitter to receive corrupt data when network hardware was corrupting
packets.

We believe this was added as an optimization to skip computing and
verifying checksums for communication between containers. However, locally
generated packets have ip_summed == CHECKSUM_PARTIAL, so the code as
written does nothing for them. As far as we can tell, after removing this
code, these packets are transmitted from one stack to another unmodified
(tcpdump shows invalid checksums on both sides, as expected), and they are
delivered correctly to applications. We didn’t test every possible network
configuration, but we tried a few common ones such as bridging containers,
using NAT between the host and a container, and routing from hardware
devices to containers. We have effectively deployed this in production at
Twitter (by disabling RX checksum offloading on veth devices).

This code dates back to the first version of the driver, commit
 ("[NET]: Virtual ethernet device driver"), so I
suspect this bug occurred mostly because the driver API has evolved
significantly since then. Commit <0b7967503dc97864f283a> ("net/veth: Fix
packet checksumming") (in December 2010) fixed this for packets that get
created locally and sent to hardware devices, by not changing
CHECKSUM_PARTIAL. However, the same issue still occurs for packets coming
in from hardware devices.

Co-authored-by: Evan Jones 
Signed-off-by: Evan Jones 
Cc: Nicolas Dichtel 
Cc: Phil Sutter 
Cc: Toshiaki Makita 
Cc: netdev@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Vijay Pandurangan 
---
 drivers/net/veth.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 0ef4a5a..ba21d07 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -117,12 +117,6 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev)
kfree_skb(skb);
goto drop;
}
-   /* don't change ip_summed == CHECKSUM_PARTIAL, as that
-* will cause bad checksum on forwarded packets
-*/
-   if (skb->ip_summed == CHECKSUM_NONE &&
-   rcv->features & NETIF_F_RXCSUM)
-   skb->ip_summed = CHECKSUM_UNNECESSARY;
 
if (likely(dev_forward_skb(rcv, skb) == NET_RX_SUCCESS)) {
struct pcpu_vstats *stats = this_cpu_ptr(dev->vstats);
-- 
2.5.0
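
As a rough illustration of the behaviour change argued for in the commit
message above, the following user-space sketch models the ip_summed
handling before and after the patch. The enum values and the helpers
veth_old(), veth_new() and stack_verifies() are hypothetical names, and
the rule that the receiving stack only verifies checksums for
CHECKSUM_NONE is a deliberate simplification.

```c
#include <stdbool.h>

/* Local stand-ins for the kernel's ip_summed states; values are illustrative. */
enum csum_state { CSUM_NONE, CSUM_UNNECESSARY, CSUM_PARTIAL };

/* Before the patch: CHECKSUM_NONE is upgraded when the peer has RXCSUM set. */
static enum csum_state veth_old(enum csum_state in, bool peer_rxcsum)
{
	if (in == CSUM_NONE && peer_rxcsum)
		return CSUM_UNNECESSARY;
	return in;
}

/* After the patch: ip_summed is passed through untouched. */
static enum csum_state veth_new(enum csum_state in, bool peer_rxcsum)
{
	(void)peer_rxcsum;
	return in;
}

/* Simplified model: the receiving stack checks only unverified packets. */
static bool stack_verifies(enum csum_state s)
{
	return s == CSUM_NONE;
}
```

In this model a packet that hardware could not verify (CSUM_NONE) skips
software verification under veth_old() but is checked under veth_new(),
while CSUM_PARTIAL is left alone in both versions.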



Re: [iproute PATCH v2] ip{,6}tunnel: have a shared stats parser/printer

2015-12-18 Thread Stephen Hemminger
On Fri, 18 Dec 2015 11:58:06 +0100
Phil Sutter  wrote:

> This has a slight side-effect of not aborting when /proc/net/dev is
> malformed, but OTOH stats are not parsed for uninteresting interfaces.
> 
> Signed-off-by: Phil Sutter 
> ---
> Changes since v1:
> - Fix conflict resolution (sscan from 'buf' instead of 'ptr').

Applied thanks.


Re: [PATCH v2 0/3] drivers: net: cpsw: Fix bugs in fixed-link PHY DT parsing

2015-12-18 Thread David Miller
From: "David Rivshin (Allworx)" 
Date: Wed, 16 Dec 2015 23:02:08 -0500

> Commit 1f71e8c96fc654724723ce987e0a8b2aeb81746d ("drivers: net: cpsw:
> Add support for fixed-link PHY") added initial fixed-link PHY support
> for CPSW, but missed a few considerations.
> 
> This series is based on the tip of the net tree. The first two patches
> fix user-visible errors in different hardware configurations. The third
> patch is for an internal reference counting issue. They are logically
> independent changes, but in the same function, so must be applied in
> order to apply cleanly.
> 
> The first patch was originally submitted by Pascal Speck on December 4,
> but was not picked up by patchwork. I suspect that is because the patch
> was mangled by the mailer. I fixed the mangling and am including it in
> this series, as I believe it is the correct change.
> 
> I have tested on the following hardware configurations:
>  - (EVMSK) dual emac with two real MDIO-connected phys using RGMII-TXID
>  - single emac with fixed-link using RGMII
> Testing of other CPSW emac configurations that folks may have would
> be appreciated.

Series applied, thanks David.


[PATCH 12/23] netfilter: nf_tables: remove unused struct members

2015-12-18 Thread Pablo Neira Ayuso
From: Florian Westphal 

Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
 include/net/netfilter/nf_tables.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index 4bd7508..101d7d7 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -19,8 +19,6 @@ struct nft_pktinfo {
const struct net_device *out;
u8  pf;
u8  hook;
-   u8  nhoff;
-   u8  thoff;
u8  tprot;
/* for x_tables compatibility */
struct xt_action_param  xt;
-- 
2.1.4



[PATCH 10/23] netfilter: ipv6: avoid nf_iterate recursion

2015-12-18 Thread Pablo Neira Ayuso
From: Florian Westphal 

The previous patch changed nf_ct_frag6_gather() to morph reassembled skb
with the previous one.

This means that the return value is always NULL or the skb argument.
So change it to an err value.

Instead of invoking NF_HOOK recursively with a threshold to skip
already-called hooks, we can now just return NF_ACCEPT to move on to the
next hook, except for -EINPROGRESS (which means the skb has been queued
for reassembly), in which case we return NF_STOLEN.

Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
 include/net/netfilter/ipv6/nf_defrag_ipv6.h |  2 +-
 net/ipv6/netfilter/nf_conntrack_reasm.c | 71 +
 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c   | 14 ++
 net/openvswitch/conntrack.c | 11 ++---
 4 files changed, 42 insertions(+), 56 deletions(-)

diff --git a/include/net/netfilter/ipv6/nf_defrag_ipv6.h b/include/net/netfilter/ipv6/nf_defrag_ipv6.h
index fcd20cf..ddf162f 100644
--- a/include/net/netfilter/ipv6/nf_defrag_ipv6.h
+++ b/include/net/netfilter/ipv6/nf_defrag_ipv6.h
@@ -5,7 +5,7 @@ void nf_defrag_ipv6_enable(void);
 
 int nf_ct_frag6_init(void);
 void nf_ct_frag6_cleanup(void);
-struct sk_buff *nf_ct_frag6_gather(struct net *net, struct sk_buff *skb, u32 user);
+int nf_ct_frag6_gather(struct net *net, struct sk_buff *skb, u32 user);
 
 struct inet_frags_ctl;
 
diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c b/net/ipv6/netfilter/nf_conntrack_reasm.c
index 1a86a08..912bc3a 100644
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -361,14 +361,15 @@ err:
 
 /*
  * Check if this packet is complete.
- * Returns NULL on failure by any reason, and pointer
- * to current nexthdr field in reassembled frame.
  *
  * It is called with locked fq, and caller must check that
  * queue is eligible for reassembly i.e. it is not COMPLETE,
  * the last and the first frames arrived and all the bits are here.
+ *
+ * returns true if *prev skb has been transformed into the reassembled
+ * skb, false otherwise.
  */
-static struct sk_buff *
+static bool
 nf_ct_frag6_reasm(struct frag_queue *fq, struct sk_buff *prev,  struct net_device *dev)
 {
struct sk_buff *fp, *head = fq->q.fragments;
@@ -382,22 +383,21 @@ nf_ct_frag6_reasm(struct frag_queue *fq, struct sk_buff *prev,  struct net_devic
 
ecn = ip_frag_ecn_table[fq->ecn];
if (unlikely(ecn == 0xff))
-   goto out_fail;
+   return false;
 
/* Unfragmented part is taken from the first segment. */
payload_len = ((head->data - skb_network_header(head)) -
   sizeof(struct ipv6hdr) + fq->q.len -
   sizeof(struct frag_hdr));
if (payload_len > IPV6_MAXPLEN) {
-   pr_debug("payload len is too large.\n");
-   goto out_oversize;
+   net_dbg_ratelimited("nf_ct_frag6_reasm: payload len = %d\n",
+   payload_len);
+   return false;
}
 
/* Head of list must not be cloned. */
-   if (skb_unclone(head, GFP_ATOMIC)) {
-   pr_debug("skb is cloned but can't expand head");
-   goto out_oom;
-   }
+   if (skb_unclone(head, GFP_ATOMIC))
+   return false;
 
/* If the first fragment is fragmented itself, we split
 * it to two chunks: the first with data and paged part
@@ -408,7 +408,7 @@ nf_ct_frag6_reasm(struct frag_queue *fq, struct sk_buff *prev,  struct net_devic
 
clone = alloc_skb(0, GFP_ATOMIC);
if (clone == NULL)
-   goto out_oom;
+   return false;
 
clone->next = head->next;
head->next = clone;
@@ -438,7 +438,7 @@ nf_ct_frag6_reasm(struct frag_queue *fq, struct sk_buff *prev,  struct net_devic
 
fp = skb_clone(prev, GFP_ATOMIC);
if (!fp)
-   goto out_oom;
+   return false;
 
fp->next = prev->next;
skb_queue_walk(head, iter) {
@@ -494,16 +494,7 @@ nf_ct_frag6_reasm(struct frag_queue *fq, struct sk_buff *prev,  struct net_devic
fq->q.fragments = NULL;
fq->q.fragments_tail = NULL;
 
-   return head;
-
-out_oversize:
-   net_dbg_ratelimited("nf_ct_frag6_reasm: payload len = %d\n",
-   payload_len);
-   goto out_fail;
-out_oom:
-   net_dbg_ratelimited("nf_ct_frag6_reasm: no memory for reassembly\n");
-out_fail:
-   return NULL;
+   return true;
 }
 
 /*
@@ -569,27 +560,26 @@ find_prev_fhdr(struct sk_buff *skb, u8 *prevhdrp, int *prevhoff, int *fhoff)
return 0;
 }
 
-struct sk_buff *nf_ct_frag6_gather(struct net *net, struct sk_buff *skb, u32 user)
+int nf_ct_frag6_gather(struct net *net, struct sk_buff *skb, u32 user)
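
The caller-side rule stated in the commit message above (NF_ACCEPT for
everything except -EINPROGRESS, which becomes NF_STOLEN) can be sketched
in isolation as below. This is not the hook code from the patch: the enum
and defrag_verdict() are hypothetical names, and the verdict values are
placeholders rather than the kernel's NF_* constants.

```c
#include <errno.h>

/* Placeholder verdicts; the kernel has its own NF_* definitions. */
enum verdict { V_ACCEPT, V_STOLEN };

/*
 * Map the integer result of the gather step to a hook verdict:
 * -EINPROGRESS means the skb was queued for reassembly and now belongs
 * to the fragment queue; any other value lets traversal continue with
 * the next hook.
 */
static enum verdict defrag_verdict(int err)
{
	if (err == -EINPROGRESS)
		return V_STOLEN;
	return V_ACCEPT;
}
```

Note that in this sketch an error such as -ENOMEM also maps to the accept
verdict, matching the "except for -EINPROGRESS" wording of the commit
message.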

[PATCH 00/23] Netfilter updates for net-next

2015-12-18 Thread Pablo Neira Ayuso
Hi David,

The following patchset contains the first batch of Netfilter updates for
the upcoming 4.5 kernel. This batch contains userspace netfilter header
compilation fixes, support for packet mangling in nf_tables, the new
tracing infrastructure for nf_tables and cgroup2 support for iptables.
More specifically, they are:

1) Two patches to include dependencies in our netfilter userspace
   headers to resolve compilation problems, from Mikko Rapeli.

2) Four cosmetic cleanup patches for the ebtables codebase, from Ian Morris.

3) Remove duplicate include in the netfilter reject infrastructure,
   from Stephen Hemminger.

4) Two patches to simplify the netfilter defragmentation code for IPv6,
   from Florian Westphal.

5) Fix root ownership of /proc/net netfilter for unprivileged net
   namespaces, from Philip Whineray.

6) Get rid of unused fields in struct nft_pktinfo, from Florian Westphal.

7) Add mangling support to our nf_tables payload expression, from
   Patrick McHardy.

8) Introduce a new netlink-based tracing infrastructure for nf_tables,
   from Florian Westphal.

9) Change setter functions in nfnetlink_log to be void, from
   Rami Rosen.

10) Add netns support to the cttimeout infrastructure.

11) Add cgroup2 support to iptables, from Tejun Heo.

12) Introduce nfnl_dereference_protected() in nfnetlink, from Florian.

13) Add support for mangling pkttype in the nf_tables meta expression,
    also from Florian.

BTW, I need that you pull net into net-next, I have another batch that
requires changes that I don't yet see in net.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git

Thanks!



The following changes since commit cb4396edd84ed73081635fb933d19c1410fafaf4:

  drivers/net: fix eisa_driver probe section mismatch (2015-12-14 00:24:22 -0500)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git HEAD

for you to fetch changes up to b4aae759c22e71a3c32144f0b3bc4f2fa4aaae98:

  netfilter: meta: add support for setting skb->pkttype (2015-12-18 14:12:56 +0100)


Florian Westphal (9):
  netfilter: ipv6: nf_defrag: avoid/free clone operations
  netfilter: ipv6: avoid nf_iterate recursion
  netfilter: nf_tables: remove unused struct members
  netfilter: nf_tables: extend tracing infrastructure
  netfilter: nf_tables: wrap tracing with a static key
  netfilter: ipv6: nf_defrag: fix NULL deref panic
  netfilter: nf_tables: fix nf_log_trace based tracing
  nfnetlink: add nfnl_dereference_protected helper
  netfilter: meta: add support for setting skb->pkttype

Ian Morris (4):
  netfilter-bridge: Cleanse indentation
  netfilter-bridge: use netdev style comments
  netfilter-bridge: brace placement
  netfilter-bridge: layout of if statements

Marcelo Ricardo Leitner (1):
  netfilter: nf_ct_sctp: move ip_ct_sctp away from UAPI

Mikko Rapeli (2):
  netfilter: ebtables: use __u64 from linux/types.h
  netfilter: fix include files for compilation

Pablo Neira (1):
  netfilter: cttimeout: add netns support

Pablo Neira Ayuso (1):
  Merge branch 'master' of git://git.kernel.org/.../davem/net-next

Patrick McHardy (1):
  netfilter: nft_payload: add packet mangling support

Philip Whineray (1):
  netfilter: Set /proc/net entries owner to root in namespace

Rosen, Rami (1):
  netfilter: nfnetlink_log: Change setter functions to be void

Tejun Heo (2):
  netfilter: prepare xt_cgroup for multi revisions
  netfilter: implement xt_cgroup cgroup2 path match

stephen hemminger (1):
  netfilter: remove duplicate include

 include/linux/netfilter/nf_conntrack_sctp.h|  13 +
 include/net/net_namespace.h|   3 +
 include/net/netfilter/ipv6/nf_defrag_ipv6.h|   3 +-
 include/net/netfilter/nf_conntrack_timeout.h   |   2 +-
 include/net/netfilter/nf_tables.h  |  34 ++-
 include/net/netfilter/nf_tables_core.h |  10 +
 include/net/netfilter/nft_meta.h   |   3 +
 include/uapi/linux/netfilter/ipset/ip_set_bitmap.h |   2 +
 include/uapi/linux/netfilter/ipset/ip_set_hash.h   |   2 +
 include/uapi/linux/netfilter/ipset/ip_set_list.h   |   2 +
 include/uapi/linux/netfilter/nf_conntrack_sctp.h   |  12 +-
 .../linux/netfilter/nf_conntrack_tuple_common.h|   3 +
 include/uapi/linux/netfilter/nf_tables.h   |  69 ++
 include/uapi/linux/netfilter/nfnetlink.h   |   2 +
 include/uapi/linux/netfilter/xt_HMARK.h|   1 +
 include/uapi/linux/netfilter/xt_RATEEST.h  |   1 +
 include/uapi/linux/netfilter/xt_TEE.h  |   2 +
 include/uapi/linux/netfilter/xt_TPROXY.h   |   1 +
 include/uapi/linux/netfilter/xt_cgroup.h   |  15 +-
 

[PATCH 01/23] netfilter: ebtables: use __u64 from linux/types.h

2015-12-18 Thread Pablo Neira Ayuso
From: Mikko Rapeli 

Fixes userspace compilation error:

linux/netfilter_bridge/ebtables.h:38:2: error: unknown type name ‘uint64_t’

Signed-off-by: Mikko Rapeli 
Signed-off-by: Pablo Neira Ayuso 
---
 include/uapi/linux/netfilter_bridge/ebtables.h | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/netfilter_bridge/ebtables.h b/include/uapi/linux/netfilter_bridge/ebtables.h
index fd2ee50..e3cdf9f 100644
--- a/include/uapi/linux/netfilter_bridge/ebtables.h
+++ b/include/uapi/linux/netfilter_bridge/ebtables.h
@@ -12,6 +12,8 @@
 
 #ifndef _UAPI__LINUX_BRIDGE_EFF_H
 #define _UAPI__LINUX_BRIDGE_EFF_H
+#include 
+#include 
 #include 
 
 #define EBT_TABLE_MAXNAMELEN 32
@@ -33,8 +35,8 @@ struct xt_match;
 struct xt_target;
 
 struct ebt_counter {
-   uint64_t pcnt;
-   uint64_t bcnt;
+   __u64 pcnt;
+   __u64 bcnt;
 };
 
 struct ebt_replace {
-- 
2.1.4



pull-request: mac80211-next 2015-12-18

2015-12-18 Thread Johannes Berg
Hi Dave,

Before we all go on vacation/holidays, I have a few bugfixes for
net-next. The remain-on-channel ones are quite necessary since Ilan's
patch broke things quite a bit, causing crashes.

If the issue with the strange mail formatting persists let me know and
I'll send these through some other client in the future.

Enjoy the holidays and happy New Year :)

Thanks,
johannes


The following changes since commit 1b894521e60c1b91db1e8ba1278660e5c89f1b5f:

  mac80211: handle HW ROC expired properly (2015-12-07 11:06:37 +0100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next.git tags/mac80211-next-for-davem-2015-12-18

for you to fetch changes up to c45932df56120a73b05f391be9730185eb51eecf:

  regulatory: fix world regulatory domain data (2015-12-11 17:42:06 +0100)


A few more updates for the next cycle:
 * remove pretty much unused and useless REG_DEBUG option
 * make regdomain messages debugging only
 * fix two bugs in the new remain-on-channel code
 * fix the world regdomain data for consistency


Dave Young (1):
  wireless: change cfg80211 regulatory domain info as debug messages

Johannes Berg (4):
  mac80211: recalculate SW ROC only when needed
  mac80211: fix remain-on-channel cancellation
  cfg80211: remove CFG80211_REG_DEBUG
  regulatory: fix world regulatory domain data

 net/mac80211/offchannel.c |  12 ++--
 net/wireless/Kconfig  |  13 
 net/wireless/reg.c| 167 --
 3 files changed, 64 insertions(+), 128 deletions(-)


Re: [PATCH net-next v4 4/4] ila: Add generic ILA translation facility

2015-12-18 Thread David Miller
From: Florian Westphal 
Date: Fri, 11 Dec 2015 12:19:04 +0100

> So if we do ILA in init ns it & pass such skbs to other netns
> it would be preferable to use nf_register_net_hooks in a namespace
> once the first ila translation is added within that namespace.

Right, the idea is that we want to do the ILA address translation
before most other things in the stack see the headers.

In particular, we want this to happen before early socket demux.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ila: add NETFILTER dependency

2015-12-18 Thread David Miller
From: Arnd Bergmann 
Date: Fri, 18 Dec 2015 15:37:37 +0100

> The recently added generic ILA translation facility fails to
> build when CONFIG_NETFILTER is disabled:
> 
> net/ipv6/ila/ila_xlat.c:229:20: warning: 'struct nf_hook_state' declared 
> inside parameter list
> net/ipv6/ila/ila_xlat.c:235:27: error: array type has incomplete element type 
> 'struct nf_hook_ops'
>  static struct nf_hook_ops ila_nf_hook_ops[] __read_mostly = {
> 
> This adds an explicit Kconfig dependency to avoid that case.
> 
> Signed-off-by: Arnd Bergmann 
> Fixes: 7f00feaf1076 ("ila: Add generic ILA translation facility")

Applied, thanks.


Re: [PATCH iproute2 v3 0/3] improve lwtunnel route support

2015-12-18 Thread Stephen Hemminger
On Fri, 18 Dec 2015 10:50:35 +0100
Paolo Abeni  wrote:

> This patch series try to improve the current route based
> lwtunnel support in iproute2, namely adding support for the
> COLLECT_METADATA flag in vxlan and gre link, and for ip6
> encap type in lwtunnel.
> 
> Tunnel devices need to have the COLLECT_METADATA flag
> set in order to be used for route based lwtunnel.
> 
> Changes from V1:
> - the COLLECT_METADATA flag is now controlled via the 'external' keyword
> - 'vni' and 'external' arguments are mutually exclusive for the vxlan link
> 
> Changes from V2:
> - rebased
> 
> Paolo Abeni (3):
>   vxlan: add support for collect metadata flag
>   gre: add support for collect metadata flag
>   lwtunnel: implement support for ip6 encap
> 
>  ip/iplink_vxlan.c | 19 +--
>  ip/iproute_lwtunnel.c | 92 
> ++-
>  ip/link_gre.c | 11 ++
>  3 files changed, 119 insertions(+), 3 deletions(-)
> 

Applied thanks.


Re: [PATCH net-next v3 0/2] net: Allow accepted sockets to be bound to l3mdev domain

2015-12-18 Thread David Miller
From: David Ahern 
Date: Wed, 16 Dec 2015 13:20:42 -0800

> Allow accepted sockets to derive their sk_bound_dev_if setting from the
> l3mdev domain in which the packets originated. This version adds a sysctl
> to control whether the setting is inherited, making the functionality
> similar to sk_mark and its sysctl_tcp_fwmark_accept setting.
> 
> This effectively allow a process to have a "VRF-global" listen socket,
> with child sockets bound to the VRF device in which the packet originated.
> A similar behavior can be achieved using sk_mark, but a solution using marks
> is incomplete as it does not handle duplicate addresses in different L3
> domains/VRFs. Allowing sockets to inherit the sk_bound_dev_if from l3mdev
> domain provides a complete solution.

Series applied, thanks David.
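
A loose sketch of the inheritance rule described in the quoted cover
letter, for readers who want it spelled out. Everything here is
hypothetical: child_bound_dev_if() and its parameters are illustrative
names, and giving priority to an explicit binding on the listening
socket is an assumption, not something the cover letter states.

```c
#include <stdbool.h>

/*
 * Pick the device binding for an accepted (child) socket.
 *
 * listener_bound_dev_if:   binding of the listening socket, 0 if unbound
 * inherit_enabled:         state of the sysctl that gates inheritance
 * ingress_l3mdev_ifindex:  index of the L3 master device (VRF) the
 *                          connection request arrived through, 0 if none
 */
static int child_bound_dev_if(int listener_bound_dev_if,
			      bool inherit_enabled,
			      int ingress_l3mdev_ifindex)
{
	/* Assumed: an explicit binding on the listener always wins. */
	if (listener_bound_dev_if)
		return listener_bound_dev_if;

	/* Unbound ("VRF-global") listener: inherit only when enabled. */
	if (inherit_enabled)
		return ingress_l3mdev_ifindex;

	return 0;
}
```

With inheritance enabled, one unbound listener yields children bound to
whichever L3 domain each connection came from, which is what allows
duplicate addresses in different VRFs to be told apart.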


[PATCH 1/2] can: sja1000: add documentation for Technologic Systems version

2015-12-18 Thread Damien Riegel
This commit adds documentation for the Technologic Systems version of
SJA1000. The difference with the NXP version is in the way the registers
are accessed.

Signed-off-by: Damien Riegel 
---
 Documentation/devicetree/bindings/net/can/sja1000.txt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/net/can/sja1000.txt b/Documentation/devicetree/bindings/net/can/sja1000.txt
index b4a6d53..7a158d5 100644
--- a/Documentation/devicetree/bindings/net/can/sja1000.txt
+++ b/Documentation/devicetree/bindings/net/can/sja1000.txt
@@ -2,7 +2,7 @@ Memory mapped SJA1000 CAN controller from NXP (formerly Philips)
 
 Required properties:
 
-- compatible : should be "nxp,sja1000".
+- compatible : should be one of "nxp,sja1000", "technologic,sja1000".
 
 - reg : should specify the chip select, address offset and size required
to map the registers of the SJA1000. The size is usually 0x80.
@@ -14,6 +14,7 @@ Optional properties:
 
 - reg-io-width : Specify the size (in bytes) of the IO accesses that
should be performed on the device.  Valid value is 1, 2 or 4.
+   Must be set to 2 for technologic version.
Default to 1 (8 bits).
 
 - nxp,external-clock-frequency : Frequency of the external oscillator
-- 
2.5.0



Re: [PATCH net 0/2] Mellanox mlx4 driver fixes

2015-12-18 Thread David Miller
From: Or Gerlitz 
Date: Thu, 17 Dec 2015 15:35:36 +0200

> Two small fixes from Jenny for code flows that deal with time-stamping.

Series applied, thanks.


[PATCH 19/23] netfilter: cttimeout: add netns support

2015-12-18 Thread Pablo Neira Ayuso
From: Pablo Neira 

Add a per-netns list of timeout objects and adjust code to use it.

Signed-off-by: Pablo Neira Ayuso 
---
 include/net/net_namespace.h  |  3 +
 include/net/netfilter/nf_conntrack_timeout.h |  2 +-
 net/netfilter/nf_conntrack_timeout.c |  2 +-
 net/netfilter/nfnetlink_cttimeout.c  | 82 +---
 net/netfilter/xt_CT.c|  2 +-
 5 files changed, 57 insertions(+), 34 deletions(-)

diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 2dcea63..4089abc 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -121,6 +121,9 @@ struct net {
 #if IS_ENABLED(CONFIG_NETFILTER_NETLINK_ACCT)
struct list_head nfnl_acct_list;
 #endif
+#if IS_ENABLED(CONFIG_NF_CT_NETLINK_TIMEOUT)
+   struct list_head nfct_timeout_list;
+#endif
 #endif
 #ifdef CONFIG_WEXT_CORE
struct sk_buff_head wext_nlevents;
diff --git a/include/net/netfilter/nf_conntrack_timeout.h b/include/net/netfilter/nf_conntrack_timeout.h
index f72be38..5cc5e9e 100644
--- a/include/net/netfilter/nf_conntrack_timeout.h
+++ b/include/net/netfilter/nf_conntrack_timeout.h
@@ -104,7 +104,7 @@ static inline void nf_conntrack_timeout_fini(void)
 #endif /* CONFIG_NF_CONNTRACK_TIMEOUT */
 
 #ifdef CONFIG_NF_CONNTRACK_TIMEOUT
-extern struct ctnl_timeout *(*nf_ct_timeout_find_get_hook)(const char *name);
+extern struct ctnl_timeout *(*nf_ct_timeout_find_get_hook)(struct net *net, const char *name);
 extern void (*nf_ct_timeout_put_hook)(struct ctnl_timeout *timeout);
 #endif
 
diff --git a/net/netfilter/nf_conntrack_timeout.c b/net/netfilter/nf_conntrack_timeout.c
index 93da609..26e7420 100644
--- a/net/netfilter/nf_conntrack_timeout.c
+++ b/net/netfilter/nf_conntrack_timeout.c
@@ -25,7 +25,7 @@
 #include 
 
 struct ctnl_timeout *
-(*nf_ct_timeout_find_get_hook)(const char *name) __read_mostly;
+(*nf_ct_timeout_find_get_hook)(struct net *net, const char *name) __read_mostly;
 EXPORT_SYMBOL_GPL(nf_ct_timeout_find_get_hook);
 
 void (*nf_ct_timeout_put_hook)(struct ctnl_timeout *timeout) __read_mostly;
diff --git a/net/netfilter/nfnetlink_cttimeout.c b/net/netfilter/nfnetlink_cttimeout.c
index c7a2d0e..3921d54 100644
--- a/net/netfilter/nfnetlink_cttimeout.c
+++ b/net/netfilter/nfnetlink_cttimeout.c
@@ -38,8 +38,6 @@ MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Pablo Neira Ayuso ");
 MODULE_DESCRIPTION("cttimeout: Extended Netfilter Connection Tracking timeout tuning");
 
-static LIST_HEAD(cttimeout_list);
-
 static const struct nla_policy cttimeout_nla_policy[CTA_TIMEOUT_MAX+1] = {
[CTA_TIMEOUT_NAME]  = { .type = NLA_NUL_STRING,
.len  = CTNL_TIMEOUT_NAME_MAX - 1},
@@ -90,7 +88,7 @@ cttimeout_new_timeout(struct sock *ctnl, struct sk_buff *skb,
l3num = ntohs(nla_get_be16(cda[CTA_TIMEOUT_L3PROTO]));
l4num = nla_get_u8(cda[CTA_TIMEOUT_L4PROTO]);
 
-   list_for_each_entry(timeout, &cttimeout_list, head) {
+   list_for_each_entry(timeout, &net->nfct_timeout_list, head) {
if (strncmp(timeout->name, name, CTNL_TIMEOUT_NAME_MAX) != 0)
continue;
 
@@ -145,7 +143,7 @@ cttimeout_new_timeout(struct sock *ctnl, struct sk_buff *skb,
timeout->l3num = l3num;
timeout->l4proto = l4proto;
atomic_set(&timeout->refcnt, 1);
-   list_add_tail_rcu(&timeout->head, &cttimeout_list);
+   list_add_tail_rcu(&timeout->head, &net->nfct_timeout_list);
 
return 0;
 err:
@@ -209,6 +207,7 @@ nla_put_failure:
 static int
 ctnl_timeout_dump(struct sk_buff *skb, struct netlink_callback *cb)
 {
+   struct net *net = sock_net(skb->sk);
struct ctnl_timeout *cur, *last;
 
if (cb->args[2])
@@ -219,7 +218,7 @@ ctnl_timeout_dump(struct sk_buff *skb, struct netlink_callback *cb)
cb->args[1] = 0;
 
rcu_read_lock();
-   list_for_each_entry_rcu(cur, &cttimeout_list, head) {
+   list_for_each_entry_rcu(cur, &net->nfct_timeout_list, head) {
if (last) {
if (cur != last)
continue;
@@ -245,6 +244,7 @@ cttimeout_get_timeout(struct sock *ctnl, struct sk_buff 
*skb,
  const struct nlmsghdr *nlh,
  const struct nlattr * const cda[])
 {
+   struct net *net = sock_net(skb->sk);
int ret = -ENOENT;
char *name;
struct ctnl_timeout *cur;
@@ -260,7 +260,7 @@ cttimeout_get_timeout(struct sock *ctnl, struct sk_buff 
*skb,
return -EINVAL;
name = nla_data(cda[CTA_TIMEOUT_NAME]);
 
-   list_for_each_entry(cur, &cttimeout_list, head) {
+   list_for_each_entry(cur, &net->nfct_timeout_list, head) {
struct sk_buff *skb2;
 
if (strncmp(cur->name, name, CTNL_TIMEOUT_NAME_MAX) != 0)
@@ -301,17 +301,17 @@ static void untimeout(struct nf_conntrack_tuple_hash *i,

[PATCH 21/23] netfilter: implement xt_cgroup cgroup2 path match

2015-12-18 Thread Pablo Neira Ayuso
From: Tejun Heo 

This patch implements xt_cgroup path match which matches cgroup2
membership of the associated socket.  The match is recursive and
invertible.
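
As a rough userspace illustration of "recursive and invertible" (not the
kernel code; struct cg_node and both helpers are hypothetical stand-ins),
the match walks up the parent chain from the socket's cgroup and XORs the
result with the invert flag:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cgroup2 node; only the parent link matters. */
struct cg_node {
	const struct cg_node *parent;
};

/* True if cg is the ancestor itself or lies anywhere below it. */
static bool cg_is_descendant(const struct cg_node *cg,
			     const struct cg_node *ancestor)
{
	for (; cg != NULL; cg = cg->parent)
		if (cg == ancestor)
			return true;
	return false;
}

/* Recursive membership test; invert flips the result. */
static bool cg_path_match(const struct cg_node *sock_cg,
			  const struct cg_node *ancestor, bool invert)
{
	assert(ancestor != NULL);	/* the path was resolved at check time */
	return cg_is_descendant(sock_cg, ancestor) ^ invert;
}
```

With a tree root -> a -> b, a rule on "a" matches sockets in both a and b,
and the inverted rule matches everything else.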

For rationales on introducing another cgroup based match, please refer
to a preceding commit "sock, cgroup: add sock->sk_cgroup".

v3: Folded into xt_cgroup as a new revision interface as suggested by
Pablo.

v2: Included linux/limits.h from xt_cgroup2.h for PATH_MAX.  Added
explicit alignment to the priv field.  Both suggested by Jan.

Signed-off-by: Tejun Heo 
Cc: Daniel Borkmann 
Cc: Daniel Wagner 
CC: Neil Horman 
Cc: Jan Engelhardt 
Cc: Pablo Neira Ayuso 
Signed-off-by: Pablo Neira Ayuso 
---
 include/uapi/linux/netfilter/xt_cgroup.h | 13 ++
 net/netfilter/xt_cgroup.c| 69 
 2 files changed, 82 insertions(+)

diff --git a/include/uapi/linux/netfilter/xt_cgroup.h 
b/include/uapi/linux/netfilter/xt_cgroup.h
index 577c9e0..1e4b37b 100644
--- a/include/uapi/linux/netfilter/xt_cgroup.h
+++ b/include/uapi/linux/netfilter/xt_cgroup.h
@@ -2,10 +2,23 @@
 #define _UAPI_XT_CGROUP_H
 
 #include 
+#include 
 
 struct xt_cgroup_info_v0 {
__u32 id;
__u32 invert;
 };
 
+struct xt_cgroup_info_v1 {
+   __u8has_path;
+   __u8has_classid;
+   __u8invert_path;
+   __u8invert_classid;
+   charpath[PATH_MAX];
+   __u32   classid;
+
+   /* kernel internal data */
+   void*priv __attribute__((aligned(8)));
+};
+
 #endif /* _UAPI_XT_CGROUP_H */
diff --git a/net/netfilter/xt_cgroup.c b/net/netfilter/xt_cgroup.c
index 1730025..a086a91 100644
--- a/net/netfilter/xt_cgroup.c
+++ b/net/netfilter/xt_cgroup.c
@@ -34,6 +34,37 @@ static int cgroup_mt_check_v0(const struct xt_mtchk_param 
*par)
return 0;
 }
 
+static int cgroup_mt_check_v1(const struct xt_mtchk_param *par)
+{
+   struct xt_cgroup_info_v1 *info = par->matchinfo;
+   struct cgroup *cgrp;
+
+   if ((info->invert_path & ~1) || (info->invert_classid & ~1))
+   return -EINVAL;
+
+   if (!info->has_path && !info->has_classid) {
+   pr_info("xt_cgroup: no path or classid specified\n");
+   return -EINVAL;
+   }
+
+   if (info->has_path && info->has_classid) {
+   pr_info("xt_cgroup: both path and classid specified\n");
+   return -EINVAL;
+   }
+
+   if (info->has_path) {
+   cgrp = cgroup_get_from_path(info->path);
+   if (IS_ERR(cgrp)) {
+   pr_info("xt_cgroup: invalid path, errno=%ld\n",
+   PTR_ERR(cgrp));
+   return -EINVAL;
+   }
+   info->priv = cgrp;
+   }
+
+   return 0;
+}
+
 static bool
 cgroup_mt_v0(const struct sk_buff *skb, struct xt_action_param *par)
 {
@@ -46,6 +77,31 @@ cgroup_mt_v0(const struct sk_buff *skb, struct 
xt_action_param *par)
info->invert;
 }
 
+static bool cgroup_mt_v1(const struct sk_buff *skb, struct xt_action_param 
*par)
+{
+   const struct xt_cgroup_info_v1 *info = par->matchinfo;
+   struct sock_cgroup_data *skcd = &skb->sk->sk_cgrp_data;
+   struct cgroup *ancestor = info->priv;
+
+   if (!skb->sk || !sk_fullsock(skb->sk))
+   return false;
+
+   if (ancestor)
+   return cgroup_is_descendant(sock_cgroup_ptr(skcd), ancestor) ^
+   info->invert_path;
+   else
+   return (info->classid == sock_cgroup_classid(skcd)) ^
+   info->invert_classid;
+}
+
+static void cgroup_mt_destroy_v1(const struct xt_mtdtor_param *par)
+{
+   struct xt_cgroup_info_v1 *info = par->matchinfo;
+
+   if (info->priv)
+   cgroup_put(info->priv);
+}
+
 static struct xt_match cgroup_mt_reg[] __read_mostly = {
{
.name   = "cgroup",
@@ -59,6 +115,19 @@ static struct xt_match cgroup_mt_reg[] __read_mostly = {
  (1 << NF_INET_POST_ROUTING) |
  (1 << NF_INET_LOCAL_IN),
},
+   {
+   .name   = "cgroup",
+   .revision   = 1,
+   .family = NFPROTO_UNSPEC,
+   .checkentry = cgroup_mt_check_v1,
+   .match  = cgroup_mt_v1,
+   .matchsize  = sizeof(struct xt_cgroup_info_v1),
+   .destroy= cgroup_mt_destroy_v1,
+   .me = THIS_MODULE,
+   .hooks  = (1 << NF_INET_LOCAL_OUT) |
+ (1 << NF_INET_POST_ROUTING) |
+ (1 << NF_INET_LOCAL_IN),
+   },
 };
 
 static int __init cgroup_mt_init(void)
-- 

[PATCH 11/23] netfilter: Set /proc/net entries owner to root in namespace

2015-12-18 Thread Pablo Neira Ayuso
From: Philip Whineray 

Various files are owned by root with 0440 permission. Reading them is
impossible in an unprivileged user namespace, interfering with firewall
tools. For instance, iptables-save relies on /proc/net/ip_tables_names
contents to dump only loaded tables.

This patch assigns ownership of the following files to root in the
current namespace:

- /proc/net/*_tables_names
- /proc/net/*_tables_matches
- /proc/net/*_tables_targets
- /proc/net/nf_conntrack
- /proc/net/nf_conntrack_expect
- /proc/net/netfilter/nfnetlink_log

A mapping for root must be available, so this order should be followed:

unshare(CLONE_NEWUSER);
/* Setup the mapping */
unshare(CLONE_NEWNET);
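
A slightly fuller userspace sketch of that ordering, assuming an
unprivileged, single-threaded caller that maps only its own uid/gid to
root. The helper names (format_id_map, write_file,
enter_userns_then_netns) are made up for illustration:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for unshare() and CLONE_NEW* */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Format a one-entry id map, "<inside> <outside> 1\n".
 * Returns the string length, or -1 if buf is too small. */
static int format_id_map(char *buf, size_t len, unsigned int inside,
			 unsigned int outside)
{
	int n;

	assert(buf != NULL && len > 0);
	n = snprintf(buf, len, "%u %u 1\n", inside, outside);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Write a string to an existing file; returns 0 or a negative errno. */
static int write_file(const char *path, const char *data)
{
	int fd = open(path, O_WRONLY);
	int err = 0;

	if (fd < 0)
		return -errno;
	if (write(fd, data, strlen(data)) < 0)
		err = -errno;
	close(fd);
	return err;
}

/* Returns 0 on success or a negative errno. */
int enter_userns_then_netns(void)
{
	char map[32];
	uid_t uid = getuid();	/* read ids before entering the new userns */
	gid_t gid = getgid();
	int err;

	if (unshare(CLONE_NEWUSER) < 0)
		return -errno;

	/* Set up the mapping: root inside maps to the caller outside. */
	if (format_id_map(map, sizeof(map), 0, uid) < 0)
		return -EINVAL;
	err = write_file("/proc/self/uid_map", map);
	if (err)
		return err;

	/* Unprivileged callers must deny setgroups before writing gid_map
	 * (the file is absent on kernels older than 3.19). */
	err = write_file("/proc/self/setgroups", "deny");
	if (err && err != -ENOENT)
		return err;
	if (format_id_map(map, sizeof(map), 0, gid) < 0)
		return -EINVAL;
	err = write_file("/proc/self/gid_map", map);
	if (err)
		return err;

	/* Only now create the netns, so its owning userns has a root mapping. */
	if (unshare(CLONE_NEWNET) < 0)
		return -errno;
	return 0;
}
```

If the netns were created before the mapping existed, make_kuid() and
make_kgid() would return invalid ids at init time and the proc entries
would keep their default owner.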

Signed-off-by: Philip Whineray 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/nf_conntrack_expect.c |  7 +++
 net/netfilter/nf_conntrack_standalone.c |  7 +++
 net/netfilter/nfnetlink_log.c   | 15 +--
 net/netfilter/x_tables.c| 12 
 4 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/nf_conntrack_expect.c 
b/net/netfilter/nf_conntrack_expect.c
index acf5c7b..278927a 100644
--- a/net/netfilter/nf_conntrack_expect.c
+++ b/net/netfilter/nf_conntrack_expect.c
@@ -596,11 +596,18 @@ static int exp_proc_init(struct net *net)
 {
 #ifdef CONFIG_NF_CONNTRACK_PROCFS
struct proc_dir_entry *proc;
+   kuid_t root_uid;
+   kgid_t root_gid;
 
proc = proc_create("nf_conntrack_expect", 0440, net->proc_net,
   _file_ops);
if (!proc)
return -ENOMEM;
+
+   root_uid = make_kuid(net->user_ns, 0);
+   root_gid = make_kgid(net->user_ns, 0);
+   if (uid_valid(root_uid) && gid_valid(root_gid))
+   proc_set_user(proc, root_uid, root_gid);
 #endif /* CONFIG_NF_CONNTRACK_PROCFS */
return 0;
 }
diff --git a/net/netfilter/nf_conntrack_standalone.c 
b/net/netfilter/nf_conntrack_standalone.c
index 1fb3cac..0f1a45b 100644
--- a/net/netfilter/nf_conntrack_standalone.c
+++ b/net/netfilter/nf_conntrack_standalone.c
@@ -392,11 +392,18 @@ static const struct file_operations ct_cpu_seq_fops = {
 static int nf_conntrack_standalone_init_proc(struct net *net)
 {
struct proc_dir_entry *pde;
+   kuid_t root_uid;
+   kgid_t root_gid;
 
pde = proc_create("nf_conntrack", 0440, net->proc_net, _file_ops);
if (!pde)
goto out_nf_conntrack;
 
+   root_uid = make_kuid(net->user_ns, 0);
+   root_gid = make_kgid(net->user_ns, 0);
+   if (uid_valid(root_uid) && gid_valid(root_gid))
+   proc_set_user(pde, root_uid, root_gid);
+
pde = proc_create("nf_conntrack", S_IRUGO, net->proc_net_stat,
  _cpu_seq_fops);
if (!pde)
diff --git a/net/netfilter/nfnetlink_log.c b/net/netfilter/nfnetlink_log.c
index 740cce4..dea4676 100644
--- a/net/netfilter/nfnetlink_log.c
+++ b/net/netfilter/nfnetlink_log.c
@@ -1064,15 +1064,26 @@ static int __net_init nfnl_log_net_init(struct net *net)
 {
unsigned int i;
struct nfnl_log_net *log = nfnl_log_pernet(net);
+#ifdef CONFIG_PROC_FS
+   struct proc_dir_entry *proc;
+   kuid_t root_uid;
+   kgid_t root_gid;
+#endif
 
for (i = 0; i < INSTANCE_BUCKETS; i++)
INIT_HLIST_HEAD(&log->instance_table[i]);
spin_lock_init(&log->instances_lock);
 
 #ifdef CONFIG_PROC_FS
-   if (!proc_create("nfnetlink_log", 0440,
-net->nf.proc_netfilter, _file_ops))
+   proc = proc_create("nfnetlink_log", 0440,
+  net->nf.proc_netfilter, _file_ops);
+   if (!proc)
return -ENOMEM;
+
+   root_uid = make_kuid(net->user_ns, 0);
+   root_gid = make_kgid(net->user_ns, 0);
+   if (uid_valid(root_uid) && gid_valid(root_gid))
+   proc_set_user(proc, root_uid, root_gid);
 #endif
return 0;
 }
diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
index d4aaad7..c8a0b7d 100644
--- a/net/netfilter/x_tables.c
+++ b/net/netfilter/x_tables.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -1226,6 +1227,8 @@ int xt_proto_init(struct net *net, u_int8_t af)
 #ifdef CONFIG_PROC_FS
char buf[XT_FUNCTION_MAXNAMELEN];
struct proc_dir_entry *proc;
+   kuid_t root_uid;
+   kgid_t root_gid;
 #endif
 
if (af >= ARRAY_SIZE(xt_prefix))
@@ -1233,12 +1236,17 @@ int xt_proto_init(struct net *net, u_int8_t af)
 
 
 #ifdef CONFIG_PROC_FS
+   root_uid = make_kuid(net->user_ns, 0);
+   root_gid = make_kgid(net->user_ns, 0);
+
strlcpy(buf, xt_prefix[af], sizeof(buf));
strlcat(buf, FORMAT_TABLES, sizeof(buf));
proc = proc_create_data(buf, 0440, net->proc_net, _table_ops,
(void *)(unsigned long)af);
if (!proc)
goto out;
+   if 

[PATCH 23/23] netfilter: meta: add support for setting skb->pkttype

2015-12-18 Thread Pablo Neira Ayuso
From: Florian Westphal 

This allows redirecting bridged packets to the local machine:

ether type ip ether daddr set aa:53:08:12:34:56 meta pkttype set unicast
Without 'set unicast', ip stack discards PACKET_OTHERHOST skbs.

It is also useful for adding support for a '-m cluster'-like nft rule
(where a switch floods packets to several nodes, and each cluster node
 processes a subset of packets for load distribution).

Mangling is restricted to HOST/OTHER/BROAD/MULTICAST, i.e. you cannot set
skb->pkt_type to PACKET_KERNEL or change PACKET_LOOPBACK to PACKET_HOST.
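
The restriction can be modelled in userspace roughly as follows. This is
only a sketch: the PKT_* constants are local stand-ins whose values are
meant to mirror PACKET_* in linux/if_packet.h, and pkt_type_set() is a
hypothetical wrapper around the check done in nft_meta_set_eval():

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins mirroring PACKET_* from linux/if_packet.h. */
enum {
	PKT_HOST = 0, PKT_BROADCAST = 1, PKT_MULTICAST = 2, PKT_OTHERHOST = 3,
	PKT_OUTGOING = 4, PKT_LOOPBACK = 5, PKT_USER = 6, PKT_KERNEL = 7,
};

/* Only these four types may be read from or written to. */
static bool pkt_type_ok(uint32_t p)
{
	return p == PKT_HOST || p == PKT_BROADCAST ||
	       p == PKT_MULTICAST || p == PKT_OTHERHOST;
}

/* Returns the packet type after attempting "meta pkttype set <wanted>". */
static uint8_t pkt_type_set(uint8_t current, uint32_t wanted)
{
	assert(current <= PKT_KERNEL);	/* pkt_type is a 3-bit field */
	if (current != wanted && pkt_type_ok(wanted) && pkt_type_ok(current))
		return (uint8_t)wanted;
	return current;			/* request silently ignored */
}
```

Both the old and the new value have to pass the check, which is why a
loopback packet cannot be turned into PACKET_HOST.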

Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/nft_meta.c | 38 ++
 1 file changed, 38 insertions(+)

diff --git a/net/netfilter/nft_meta.c b/net/netfilter/nft_meta.c
index 5bcd1b0..fe885bf 100644
--- a/net/netfilter/nft_meta.c
+++ b/net/netfilter/nft_meta.c
@@ -26,6 +26,8 @@
 #include 
 #include 
 
+#include  /* NF_BR_PRE_ROUTING */
+
 void nft_meta_get_eval(const struct nft_expr *expr,
   struct nft_regs *regs,
   const struct nft_pktinfo *pkt)
@@ -190,6 +192,13 @@ err:
 }
 EXPORT_SYMBOL_GPL(nft_meta_get_eval);
 
+/* don't change or set _LOOPBACK, _USER, etc. */
+static bool pkt_type_ok(u32 p)
+{
+   return p == PACKET_HOST || p == PACKET_BROADCAST ||
+  p == PACKET_MULTICAST || p == PACKET_OTHERHOST;
+}
+
 void nft_meta_set_eval(const struct nft_expr *expr,
   struct nft_regs *regs,
   const struct nft_pktinfo *pkt)
@@ -205,6 +214,11 @@ void nft_meta_set_eval(const struct nft_expr *expr,
case NFT_META_PRIORITY:
skb->priority = value;
break;
+   case NFT_META_PKTTYPE:
+   if (skb->pkt_type != value &&
+   pkt_type_ok(value) && pkt_type_ok(skb->pkt_type))
+   skb->pkt_type = value;
+   break;
case NFT_META_NFTRACE:
skb->nf_trace = 1;
break;
@@ -273,6 +287,24 @@ int nft_meta_get_init(const struct nft_ctx *ctx,
 }
 EXPORT_SYMBOL_GPL(nft_meta_get_init);
 
+static int nft_meta_set_init_pkttype(const struct nft_ctx *ctx)
+{
+   unsigned int hooks;
+
+   switch (ctx->afi->family) {
+   case NFPROTO_BRIDGE:
+   hooks = 1 << NF_BR_PRE_ROUTING;
+   break;
+   case NFPROTO_NETDEV:
+   hooks = 1 << NF_NETDEV_INGRESS;
+   break;
+   default:
+   return -EOPNOTSUPP;
+   }
+
+   return nft_chain_validate_hooks(ctx->chain, hooks);
+}
+
 int nft_meta_set_init(const struct nft_ctx *ctx,
  const struct nft_expr *expr,
  const struct nlattr * const tb[])
@@ -290,6 +322,12 @@ int nft_meta_set_init(const struct nft_ctx *ctx,
case NFT_META_NFTRACE:
len = sizeof(u8);
break;
+   case NFT_META_PKTTYPE:
+   err = nft_meta_set_init_pkttype(ctx);
+   if (err)
+   return err;
+   len = sizeof(u8);
+   break;
default:
return -EOPNOTSUPP;
}
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 14/23] netfilter: nf_tables: extend tracing infrastructure

2015-12-18 Thread Pablo Neira Ayuso
From: Florian Westphal 

nft monitor mode can then decode and display this trace data.

Parts of LL/Network/Transport headers are provided as separate
attributes.

Otherwise, printing IP address data becomes virtually impossible
for userspace since in the case of the netdev family we really don't
want userspace to have to know all the possible link layer types
and/or sizes just to display/print an ip address.

We also don't want userspace to have to follow ipv6 header chains
to get the s/dport info; the kernel already did this work for us.

To avoid bloating nft_do_chain all data required for tracing is
encapsulated in nft_traceinfo.

The structure is initialized unconditionally(!) for each nft_do_chain
invocation.

This unconditional call will be moved under a static key in a
followup patch.

With lots of help from Patrick McHardy and Pablo Neira.

Signed-off-by: Florian Westphal 
Acked-by: Patrick McHardy 
Signed-off-by: Pablo Neira Ayuso 
---
 include/net/netfilter/nf_tables.h|  32 
 include/uapi/linux/netfilter/nf_tables.h |  52 ++
 include/uapi/linux/netfilter/nfnetlink.h |   2 +
 net/netfilter/Makefile   |   2 +-
 net/netfilter/nf_tables_api.c|  12 +-
 net/netfilter/nf_tables_core.c   |  45 +++--
 net/netfilter/nf_tables_trace.c  | 271 +++
 net/netfilter/nfnetlink.c|   1 +
 8 files changed, 398 insertions(+), 19 deletions(-)
 create mode 100644 net/netfilter/nf_tables_trace.c

diff --git a/include/net/netfilter/nf_tables.h 
b/include/net/netfilter/nf_tables.h
index 101d7d7..b313cda 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -888,6 +888,38 @@ void nft_unregister_chain_type(const struct nf_chain_type 
*);
 int nft_register_expr(struct nft_expr_type *);
 void nft_unregister_expr(struct nft_expr_type *);
 
+int nft_verdict_dump(struct sk_buff *skb, int type,
+const struct nft_verdict *v);
+
+/**
+ * struct nft_traceinfo - nft tracing information and state
+ *
+ * @pkt: pktinfo currently processed
+ * @basechain: base chain currently processed
+ * @chain: chain currently processed
+ * @rule:  rule that was evaluated
+ * @verdict: verdict given by rule
+ * @type: event type (enum nft_trace_types)
+ * @packet_dumped: packet headers sent in a previous traceinfo message
+ * @trace: other struct members are initialised
+ */
+struct nft_traceinfo {
+   const struct nft_pktinfo*pkt;
+   const struct nft_base_chain *basechain;
+   const struct nft_chain  *chain;
+   const struct nft_rule   *rule;
+   const struct nft_verdict*verdict;
+   enum nft_trace_typestype;
+   boolpacket_dumped;
+   booltrace;
+};
+
+void nft_trace_init(struct nft_traceinfo *info, const struct nft_pktinfo *pkt,
+   const struct nft_verdict *verdict,
+   const struct nft_chain *basechain);
+
+void nft_trace_notify(struct nft_traceinfo *info);
+
 #define nft_dereference(p) \
nfnl_dereference(p, NFNL_SUBSYS_NFTABLES)
 
diff --git a/include/uapi/linux/netfilter/nf_tables.h 
b/include/uapi/linux/netfilter/nf_tables.h
index 5f3ecec..b48a3ab 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -83,6 +83,7 @@ enum nft_verdicts {
  * @NFT_MSG_DELSETELEM: delete a set element (enum nft_set_elem_attributes)
  * @NFT_MSG_NEWGEN: announce a new generation, only for events (enum 
nft_gen_attributes)
  * @NFT_MSG_GETGEN: get the rule-set generation (enum nft_gen_attributes)
+ * @NFT_MSG_TRACE: trace event (enum nft_trace_attributes)
  */
 enum nf_tables_msg_types {
NFT_MSG_NEWTABLE,
@@ -102,6 +103,7 @@ enum nf_tables_msg_types {
NFT_MSG_DELSETELEM,
NFT_MSG_NEWGEN,
NFT_MSG_GETGEN,
+   NFT_MSG_TRACE,
NFT_MSG_MAX,
 };
 
@@ -987,4 +989,54 @@ enum nft_gen_attributes {
 };
 #define NFTA_GEN_MAX   (__NFTA_GEN_MAX - 1)
 
+/**
+ * enum nft_trace_attributes - nf_tables trace netlink attributes
+ *
+ * @NFTA_TRACE_TABLE: name of the table (NLA_STRING)
+ * @NFTA_TRACE_CHAIN: name of the chain (NLA_STRING)
+ * @NFTA_TRACE_RULE_HANDLE: numeric handle of the rule (NLA_U64)
+ * @NFTA_TRACE_TYPE: type of the event (NLA_U32: nft_trace_types)
+ * @NFTA_TRACE_VERDICT: verdict returned by hook (NLA_NESTED: nft_verdicts)
+ * @NFTA_TRACE_ID: pseudo-id, same for each skb traced (NLA_U32)
+ * @NFTA_TRACE_LL_HEADER: linklayer header (NLA_BINARY)
+ * @NFTA_TRACE_NETWORK_HEADER: network header (NLA_BINARY)
+ * @NFTA_TRACE_TRANSPORT_HEADER: transport header (NLA_BINARY)
+ * @NFTA_TRACE_IIF: indev ifindex (NLA_U32)
+ * @NFTA_TRACE_IIFTYPE: netdev->type of indev (NLA_U16)
+ * @NFTA_TRACE_OIF: outdev 

[PATCH 17/23] netfilter: nfnetlink_log: Change setter functions to be void

2015-12-18 Thread Pablo Neira Ayuso
From: "Rosen, Rami" 

Change return type of nfulnl_set_timeout() and nfulnl_set_qthresh() to
be void.

This patch changes the return type of the static methods
nfulnl_set_timeout() and nfulnl_set_qthresh() to be void, as there is no
justification and no need for these methods to return int.

Signed-off-by: Rami Rosen 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/nfnetlink_log.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/net/netfilter/nfnetlink_log.c b/net/netfilter/nfnetlink_log.c
index dea4676..70b6bd3 100644
--- a/net/netfilter/nfnetlink_log.c
+++ b/net/netfilter/nfnetlink_log.c
@@ -293,24 +293,20 @@ nfulnl_set_nlbufsiz(struct nfulnl_instance *inst, 
u_int32_t nlbufsiz)
return status;
 }
 
-static int
+static void
 nfulnl_set_timeout(struct nfulnl_instance *inst, u_int32_t timeout)
 {
spin_lock_bh(&inst->lock);
inst->flushtimeout = timeout;
spin_unlock_bh(&inst->lock);
-
-   return 0;
 }
 
-static int
+static void
 nfulnl_set_qthresh(struct nfulnl_instance *inst, u_int32_t qthresh)
 {
spin_lock_bh(&inst->lock);
inst->qthreshold = qthresh;
spin_unlock_bh(&inst->lock);
-
-   return 0;
 }
 
 static int
-- 
2.1.4


[PATCH 16/23] netfilter: ipv6: nf_defrag: fix NULL deref panic

2015-12-18 Thread Pablo Neira Ayuso
From: Florian Westphal 

Valdis reports NULL deref in nf_ct_frag6_gather.
Problem is bogus use of skb_queue_walk() -- we miss first skb in the list
since we start with head->next instead of head.

In case the element we're looking for was head->next we won't find
a result and then trip over NULL iter.

(defrag uses plain NULL-terminated list rather than one terminated by
 head-of-list-pointer, which is what skb_queue_walk expects).
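
A small userspace model of the two walks, for illustration only. struct
node stands in for sk_buff, and the buggy variant gets a NULL check that
the real skb_queue_walk() loop does not have, so that the sketch reports
"not found" instead of crashing:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for an skb on a plain NULL-terminated list. */
struct node {
	struct node *next;
};

/* Old shape: starts at head->next, so head is never tried as predecessor. */
static struct node *find_pred_skipping_head(struct node *head,
					    const struct node *target)
{
	struct node *iter;

	assert(head != NULL);
	for (iter = head->next; iter != NULL; iter = iter->next)
		if (iter->next == target)
			return iter;
	return NULL;
}

/* Fixed shape: starts at head and stops at the NULL terminator. */
static struct node *find_pred(struct node *head, const struct node *target)
{
	struct node *iter;

	assert(head != NULL);
	for (iter = head; iter != NULL; iter = iter->next)
		if (iter->next == target)
			return iter;
	return NULL;
}
```

For a list a -> b -> c, looking for the predecessor of b succeeds only with
find_pred(); the head-skipping walk runs off the end of the list.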

Fixes: 029f7f3b8701cc7a ("netfilter: ipv6: nf_defrag: avoid/free clone 
operations")
Reported-by: Valdis Kletnieks 
Tested-by: Valdis Kletnieks 
Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
 net/ipv6/netfilter/nf_conntrack_reasm.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c 
b/net/ipv6/netfilter/nf_conntrack_reasm.c
index 912bc3a..6e5f0e0d 100644
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -441,11 +441,14 @@ nf_ct_frag6_reasm(struct frag_queue *fq, struct sk_buff 
*prev,  struct net_devic
return false;
 
fp->next = prev->next;
-   skb_queue_walk(head, iter) {
-   if (iter->next != prev)
-   continue;
-   iter->next = fp;
-   break;
+
+   iter = head;
+   while (iter) {
+   if (iter->next == prev) {
+   iter->next = fp;
+   break;
+   }
+   iter = iter->next;
}
 
skb_morph(prev, head);
-- 
2.1.4


[PATCH 08/23] netfilter: remove duplicate include

2015-12-18 Thread Pablo Neira Ayuso
From: stephen hemminger 

Signed-off-by: Stephen Hemminger 
Signed-off-by: Pablo Neira Ayuso 
---
 net/ipv4/netfilter/nf_reject_ipv4.c | 1 -
 net/ipv6/netfilter/nf_reject_ipv6.c | 1 -
 2 files changed, 2 deletions(-)

diff --git a/net/ipv4/netfilter/nf_reject_ipv4.c 
b/net/ipv4/netfilter/nf_reject_ipv4.c
index c747b2d..b6ea57e 100644
--- a/net/ipv4/netfilter/nf_reject_ipv4.c
+++ b/net/ipv4/netfilter/nf_reject_ipv4.c
@@ -14,7 +14,6 @@
 #include 
 #include 
 #include 
-#include 
 
 const struct tcphdr *nf_reject_ip_tcphdr_get(struct sk_buff *oldskb,
 struct tcphdr *_oth, int hook)
diff --git a/net/ipv6/netfilter/nf_reject_ipv6.c 
b/net/ipv6/netfilter/nf_reject_ipv6.c
index e0f922b..4709f65 100644
--- a/net/ipv6/netfilter/nf_reject_ipv6.c
+++ b/net/ipv6/netfilter/nf_reject_ipv6.c
@@ -14,7 +14,6 @@
 #include 
 #include 
 #include 
-#include 
 
 const struct tcphdr *nf_reject_ip6_tcphdr_get(struct sk_buff *oldskb,
  struct tcphdr *otcph,
-- 
2.1.4


[PATCH 15/23] netfilter: nf_tables: wrap tracing with a static key

2015-12-18 Thread Pablo Neira Ayuso
From: Florian Westphal 

Only needed when meta nftrace rule(s) were added.
The assumption is that no such rules are active, so the call to
nft_trace_init is "never" needed.

When nftrace rules are active, we always call the nft_trace_* functions,
but will only send netlink messages when all of the following are true:

 - traceinfo structure was initialised
 - skb->nf_trace == 1
 - at least one subscriber to trace group.

Adding an extra conditional
(static_branch ... && skb->nf_trace)
nft_trace_init( ..)

is possible, but results in a larger nft_do_chain footprint.
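
The gating described above can be modelled in userspace like this. It is a
sketch under assumptions: a plain bool stands in for the static key, and
struct pkt, struct traceinfo and the function names are simplified
stand-ins rather than the kernel definitions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the static key; in the kernel this is a patched branch. */
static bool trace_enabled;

struct pkt {
	bool nf_trace;		/* models skb->nf_trace */
};

struct traceinfo {
	bool trace;		/* set once the structure is initialised */
	const struct pkt *pkt;
};

static void trace_init(struct traceinfo *info, const struct pkt *pkt)
{
	info->pkt = pkt;
	info->trace = true;
}

/* All three conditions from the list above must hold. */
static bool trace_would_send(const struct traceinfo *info, int subscribers)
{
	if (!info->trace)
		return false;
	assert(info->pkt != NULL);
	return info->pkt->nf_trace && subscribers > 0;
}

/* Returns true if a trace message would be emitted for this packet. */
bool do_chain(const struct pkt *pkt, int subscribers)
{
	struct traceinfo info;

	info.trace = false;
	info.pkt = NULL;
	if (trace_enabled)
		trace_init(&info, pkt);

	/* ... rule evaluation would happen here ... */

	return trace_enabled && trace_would_send(&info, subscribers);
}
```

With the key off, the only per-packet cost left in this model is the
info.trace = false store.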

Signed-off-by: Florian Westphal 
Acked-by: Patrick McHardy 
Signed-off-by: Pablo Neira Ayuso 
---
 include/net/netfilter/nf_tables_core.h |  1 +
 include/net/netfilter/nft_meta.h   |  3 +++
 net/bridge/netfilter/nft_meta_bridge.c |  1 +
 net/netfilter/nf_tables_core.c |  9 ++---
 net/netfilter/nf_tables_trace.c|  4 
 net/netfilter/nft_meta.c   | 16 
 6 files changed, 31 insertions(+), 3 deletions(-)

diff --git a/include/net/netfilter/nf_tables_core.h 
b/include/net/netfilter/nf_tables_core.h
index 4ff5424..a9060dd 100644
--- a/include/net/netfilter/nf_tables_core.h
+++ b/include/net/netfilter/nf_tables_core.h
@@ -57,6 +57,7 @@ struct nft_payload_set {
 };
 
 extern const struct nft_expr_ops nft_payload_fast_ops;
+extern struct static_key_false nft_trace_enabled;
 
 int nft_payload_module_init(void);
 void nft_payload_module_exit(void);
diff --git a/include/net/netfilter/nft_meta.h b/include/net/netfilter/nft_meta.h
index 711887a..d27588c 100644
--- a/include/net/netfilter/nft_meta.h
+++ b/include/net/netfilter/nft_meta.h
@@ -33,4 +33,7 @@ void nft_meta_set_eval(const struct nft_expr *expr,
   struct nft_regs *regs,
   const struct nft_pktinfo *pkt);
 
+void nft_meta_set_destroy(const struct nft_ctx *ctx,
+ const struct nft_expr *expr);
+
 #endif
diff --git a/net/bridge/netfilter/nft_meta_bridge.c 
b/net/bridge/netfilter/nft_meta_bridge.c
index a21269b..4b901d9 100644
--- a/net/bridge/netfilter/nft_meta_bridge.c
+++ b/net/bridge/netfilter/nft_meta_bridge.c
@@ -84,6 +84,7 @@ static const struct nft_expr_ops nft_meta_bridge_set_ops = {
.size   = NFT_EXPR_SIZE(sizeof(struct nft_meta)),
.eval   = nft_meta_set_eval,
.init   = nft_meta_set_init,
+   .destroy= nft_meta_set_destroy,
.dump   = nft_meta_set_dump,
 };
 
diff --git a/net/netfilter/nf_tables_core.c b/net/netfilter/nf_tables_core.c
index 2395de7..67fa41d 100644
--- a/net/netfilter/nf_tables_core.c
+++ b/net/netfilter/nf_tables_core.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -50,7 +51,7 @@ static noinline void __nft_trace_packet(struct nft_traceinfo 
*info,
 {
const struct nft_pktinfo *pkt = info->pkt;
 
-   if (!pkt->skb->nf_trace)
+   if (!info->trace || !pkt->skb->nf_trace)
return;
 
info->chain = chain;
@@ -70,7 +71,7 @@ static inline void nft_trace_packet(struct nft_traceinfo 
*info,
int rulenum,
enum nft_trace_types type)
 {
-   if (unlikely(info->trace)) {
+   if (static_branch_unlikely(&nft_trace_enabled)) {
info->rule = rule;
__nft_trace_packet(info, chain, rulenum, type);
}
@@ -137,7 +138,9 @@ nft_do_chain(struct nft_pktinfo *pkt, void *priv)
unsigned int gencursor = nft_genmask_cur(net);
struct nft_traceinfo info;
 
-   nft_trace_init(, pkt, , basechain);
+   info.trace = false;
+   if (static_branch_unlikely(&nft_trace_enabled))
+   nft_trace_init(, pkt, , basechain);
 do_chain:
rulenum = 0;
rule = list_entry(>rules, struct nft_rule, list);
diff --git a/net/netfilter/nf_tables_trace.c b/net/netfilter/nf_tables_trace.c
index 36fd7ad..e9e959f 100644
--- a/net/netfilter/nf_tables_trace.c
+++ b/net/netfilter/nf_tables_trace.c
@@ -8,6 +8,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -24,6 +25,9 @@
 #define NFT_TRACETYPE_NETWORK_HSIZE40
 #define NFT_TRACETYPE_TRANSPORT_HSIZE  20
 
+DEFINE_STATIC_KEY_FALSE(nft_trace_enabled);
+EXPORT_SYMBOL_GPL(nft_trace_enabled);
+
 static int trace_fill_id(struct sk_buff *nlskb, struct sk_buff *skb)
 {
__be32 id;
diff --git a/net/netfilter/nft_meta.c b/net/netfilter/nft_meta.c
index 9dfaf4d..85a465b 100644
--- a/net/netfilter/nft_meta.c
+++ b/net/netfilter/nft_meta.c
@@ -18,10 +18,12 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include  /* for TCP_TIME_WAIT */
 #include 
+#include 
 #include 
 
 void nft_meta_get_eval(const struct nft_expr *expr,
@@ -297,6 +299,9 @@ int nft_meta_set_init(const struct nft_ctx *ctx,
if (err < 0)
return err;
 

[PATCH 18/23] netfilter: nf_tables: fix nf_log_trace based tracing

2015-12-18 Thread Pablo Neira Ayuso
From: Florian Westphal 

nf_log_trace() outputs bogus 'TRACE:' strings because I forgot to update
the comments array.
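
To see how the strings go wrong, here is a standalone sketch. The enum
names are local stand-ins, and the numeric order of the new type enum
(UNSPEC, POLICY, RETURN, RULE) is an assumption about nft_trace_types,
which this mail does not show:

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Old private enum that the array was indexed by. */
enum old_trace { OLD_TRACE_RULE, OLD_TRACE_RETURN, OLD_TRACE_POLICY };

/* Assumed order of the new trace types (UNSPEC first). */
enum trace_types {
	TRACETYPE_UNSPEC, TRACETYPE_POLICY, TRACETYPE_RETURN, TRACETYPE_RULE,
	TRACETYPE_MAX
};

static const char *const old_comments[] = {
	[OLD_TRACE_RULE]	= "rule",
	[OLD_TRACE_RETURN]	= "return",
	[OLD_TRACE_POLICY]	= "policy",
};

static const char *const new_comments[TRACETYPE_MAX] = {
	[TRACETYPE_POLICY]	= "policy",
	[TRACETYPE_RETURN]	= "return",
	[TRACETYPE_RULE]	= "rule",
};

/* Bounds-checked lookup; returns NULL for out-of-range or unset slots. */
static const char *lookup(const char *const *tbl, size_t n, unsigned int idx)
{
	assert(tbl != NULL);
	return idx < n ? tbl[idx] : NULL;
}
```

Indexing the old array with the new values swaps "policy" and "return" and
puts "rule" past the end of the array; sizing the array by the enum's MAX
value and using the new designators lines them up again.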

Fixes: 33d5a7b14bfd0 ("netfilter: nf_tables: extend tracing infrastructure")
Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/nf_tables_core.c | 16 +---
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/net/netfilter/nf_tables_core.c b/net/netfilter/nf_tables_core.c
index 67fa41d..e9f8dff 100644
--- a/net/netfilter/nf_tables_core.c
+++ b/net/netfilter/nf_tables_core.c
@@ -23,16 +23,10 @@
 #include 
 #include 
 
-enum nft_trace {
-   NFT_TRACE_RULE,
-   NFT_TRACE_RETURN,
-   NFT_TRACE_POLICY,
-};
-
-static const char *const comments[] = {
-   [NFT_TRACE_RULE]= "rule",
-   [NFT_TRACE_RETURN]  = "return",
-   [NFT_TRACE_POLICY]  = "policy",
+static const char *const comments[__NFT_TRACETYPE_MAX] = {
+   [NFT_TRACETYPE_POLICY]  = "policy",
+   [NFT_TRACETYPE_RETURN]  = "return",
+   [NFT_TRACETYPE_RULE]= "rule",
 };
 
 static struct nf_loginfo trace_loginfo = {
@@ -47,7 +41,7 @@ static struct nf_loginfo trace_loginfo = {
 
 static noinline void __nft_trace_packet(struct nft_traceinfo *info,
const struct nft_chain *chain,
-   int rulenum, enum nft_trace type)
+   int rulenum, enum nft_trace_types type)
 {
const struct nft_pktinfo *pkt = info->pkt;
 
-- 
2.1.4


[PATCH 02/23] netfilter: fix include files for compilation

2015-12-18 Thread Pablo Neira Ayuso
From: Mikko Rapeli 

Add missing header dependencies and other small changes so that each file
compiles alone in userspace.

Signed-off-by: Mikko Rapeli 
Signed-off-by: Pablo Neira Ayuso 
---
 include/uapi/linux/netfilter/ipset/ip_set_bitmap.h   |  2 ++
 include/uapi/linux/netfilter/ipset/ip_set_hash.h |  2 ++
 include/uapi/linux/netfilter/ipset/ip_set_list.h |  2 ++
 include/uapi/linux/netfilter/nf_conntrack_tuple_common.h |  3 +++
 include/uapi/linux/netfilter/xt_HMARK.h  |  1 +
 include/uapi/linux/netfilter/xt_RATEEST.h|  1 +
 include/uapi/linux/netfilter/xt_TEE.h|  2 ++
 include/uapi/linux/netfilter/xt_TPROXY.h |  1 +
 include/uapi/linux/netfilter/xt_hashlimit.h  |  1 +
 include/uapi/linux/netfilter/xt_ipvs.h   |  1 +
 include/uapi/linux/netfilter/xt_mac.h|  2 ++
 include/uapi/linux/netfilter/xt_osf.h|  2 ++
 include/uapi/linux/netfilter/xt_physdev.h|  2 +-
 include/uapi/linux/netfilter/xt_policy.h |  2 ++
 include/uapi/linux/netfilter/xt_rateest.h|  1 +
 include/uapi/linux/netfilter/xt_recent.h |  1 +
 include/uapi/linux/netfilter/xt_sctp.h   | 12 ++--
 include/uapi/linux/netfilter_arp/arp_tables.h|  1 +
 include/uapi/linux/netfilter_bridge.h|  1 +
 include/uapi/linux/netfilter_bridge/ebt_arp.h|  1 +
 include/uapi/linux/netfilter_bridge/ebt_arpreply.h   |  2 ++
 include/uapi/linux/netfilter_bridge/ebt_ip6.h|  1 +
 include/uapi/linux/netfilter_bridge/ebt_nat.h|  2 ++
 include/uapi/linux/netfilter_ipv4/ip_tables.h|  1 +
 include/uapi/linux/netfilter_ipv6/ip6_tables.h   |  1 +
 include/uapi/linux/netfilter_ipv6/ip6t_rt.h  |  2 +-
 26 files changed, 42 insertions(+), 8 deletions(-)

diff --git a/include/uapi/linux/netfilter/ipset/ip_set_bitmap.h 
b/include/uapi/linux/netfilter/ipset/ip_set_bitmap.h
index 6a2c038..fd5024d 100644
--- a/include/uapi/linux/netfilter/ipset/ip_set_bitmap.h
+++ b/include/uapi/linux/netfilter/ipset/ip_set_bitmap.h
@@ -1,6 +1,8 @@
 #ifndef _UAPI__IP_SET_BITMAP_H
 #define _UAPI__IP_SET_BITMAP_H
 
+#include 
+
 /* Bitmap type specific error codes */
 enum {
/* The element is out of the range of the set */
diff --git a/include/uapi/linux/netfilter/ipset/ip_set_hash.h 
b/include/uapi/linux/netfilter/ipset/ip_set_hash.h
index 352eecc..82deeb8 100644
--- a/include/uapi/linux/netfilter/ipset/ip_set_hash.h
+++ b/include/uapi/linux/netfilter/ipset/ip_set_hash.h
@@ -1,6 +1,8 @@
 #ifndef _UAPI__IP_SET_HASH_H
 #define _UAPI__IP_SET_HASH_H
 
+#include 
+
 /* Hash type specific error codes */
 enum {
/* Hash is full */
diff --git a/include/uapi/linux/netfilter/ipset/ip_set_list.h 
b/include/uapi/linux/netfilter/ipset/ip_set_list.h
index a44efaa..84d4303 100644
--- a/include/uapi/linux/netfilter/ipset/ip_set_list.h
+++ b/include/uapi/linux/netfilter/ipset/ip_set_list.h
@@ -1,6 +1,8 @@
 #ifndef _UAPI__IP_SET_LIST_H
 #define _UAPI__IP_SET_LIST_H
 
+#include 
+
 /* List type specific error codes */
 enum {
/* Set name to be added/deleted/tested does not exist. */
diff --git a/include/uapi/linux/netfilter/nf_conntrack_tuple_common.h 
b/include/uapi/linux/netfilter/nf_conntrack_tuple_common.h
index 2f6bbc5..a9c3834 100644
--- a/include/uapi/linux/netfilter/nf_conntrack_tuple_common.h
+++ b/include/uapi/linux/netfilter/nf_conntrack_tuple_common.h
@@ -1,6 +1,9 @@
 #ifndef _NF_CONNTRACK_TUPLE_COMMON_H
 #define _NF_CONNTRACK_TUPLE_COMMON_H
 
+#include 
+#include 
+
 enum ip_conntrack_dir {
IP_CT_DIR_ORIGINAL,
IP_CT_DIR_REPLY,
diff --git a/include/uapi/linux/netfilter/xt_HMARK.h 
b/include/uapi/linux/netfilter/xt_HMARK.h
index 826fc58..3fb48c8 100644
--- a/include/uapi/linux/netfilter/xt_HMARK.h
+++ b/include/uapi/linux/netfilter/xt_HMARK.h
@@ -2,6 +2,7 @@
 #define XT_HMARK_H_
 
 #include 
+#include 
 
 enum {
XT_HMARK_SADDR_MASK,
diff --git a/include/uapi/linux/netfilter/xt_RATEEST.h 
b/include/uapi/linux/netfilter/xt_RATEEST.h
index 6605e20..ec1b570 100644
--- a/include/uapi/linux/netfilter/xt_RATEEST.h
+++ b/include/uapi/linux/netfilter/xt_RATEEST.h
@@ -2,6 +2,7 @@
 #define _XT_RATEEST_TARGET_H
 
 #include 
+#include 
 
 struct xt_rateest_target_info {
charname[IFNAMSIZ];
diff --git a/include/uapi/linux/netfilter/xt_TEE.h 
b/include/uapi/linux/netfilter/xt_TEE.h
index 5c21d5c..0109202 100644
--- a/include/uapi/linux/netfilter/xt_TEE.h
+++ b/include/uapi/linux/netfilter/xt_TEE.h
@@ -1,6 +1,8 @@
 #ifndef _XT_TEE_TARGET_H
 #define _XT_TEE_TARGET_H
 
+#include 
+
 struct xt_tee_tginfo {
union nf_inet_addr gw;
char oif[16];
diff --git a/include/uapi/linux/netfilter/xt_TPROXY.h 

[PATCH 22/23] nfnetlink: add nfnl_dereference_protected helper

2015-12-18 Thread Pablo Neira Ayuso
From: Florian Westphal 

Add the nfnl_dereference_protected() helper to avoid an overly long line
in a follow-up patch.

Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/nfnetlink.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/net/netfilter/nfnetlink.c b/net/netfilter/nfnetlink.c
index 28591fa..aebf5cd 100644
--- a/net/netfilter/nfnetlink.c
+++ b/net/netfilter/nfnetlink.c
@@ -33,6 +33,10 @@ MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Harald Welte ");
 MODULE_ALIAS_NET_PF_PROTO(PF_NETLINK, NETLINK_NETFILTER);
 
+#define nfnl_dereference_protected(id) \
+   rcu_dereference_protected(table[(id)].subsys, \
+ lockdep_nfnl_is_held((id)))
+
 static char __initdata nfversion[] = "0.30";
 
 static struct {
@@ -208,8 +212,7 @@ replay:
} else {
rcu_read_unlock();
nfnl_lock(subsys_id);
-   if (rcu_dereference_protected(table[subsys_id].subsys,
-   lockdep_is_held(&table[subsys_id].mutex)) != ss ||
+   if (nfnl_dereference_protected(subsys_id) != ss ||
nfnetlink_find_client(type, ss) != nc)
err = -EAGAIN;
else if (nc->call)
@@ -299,15 +302,13 @@ replay:
skb->sk = oskb->sk;
 
nfnl_lock(subsys_id);
-   ss = rcu_dereference_protected(table[subsys_id].subsys,
-  lockdep_is_held(&table[subsys_id].mutex));
+   ss = nfnl_dereference_protected(subsys_id);
if (!ss) {
 #ifdef CONFIG_MODULES
nfnl_unlock(subsys_id);
request_module("nfnetlink-subsys-%d", subsys_id);
nfnl_lock(subsys_id);
-   ss = rcu_dereference_protected(table[subsys_id].subsys,
-  lockdep_is_held(&table[subsys_id].mutex));
+   ss = nfnl_dereference_protected(subsys_id);
if (!ss)
 #endif
{
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 20/23] netfilter: prepare xt_cgroup for multi revisions

2015-12-18 Thread Pablo Neira Ayuso
From: Tejun Heo 

xt_cgroup will grow cgroup2 path based match.  Postfix existing
symbols with _v0 and prepare for multi revision registration.

Signed-off-by: Tejun Heo 
Cc: Daniel Borkmann 
Cc: Daniel Wagner 
CC: Neil Horman 
Cc: Jan Engelhardt 
Cc: Pablo Neira Ayuso 
Signed-off-by: Pablo Neira Ayuso 
---
 include/uapi/linux/netfilter/xt_cgroup.h |  2 +-
 net/netfilter/xt_cgroup.c| 36 +---
 2 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/include/uapi/linux/netfilter/xt_cgroup.h 
b/include/uapi/linux/netfilter/xt_cgroup.h
index 43acb7e..577c9e0 100644
--- a/include/uapi/linux/netfilter/xt_cgroup.h
+++ b/include/uapi/linux/netfilter/xt_cgroup.h
@@ -3,7 +3,7 @@
 
 #include 
 
-struct xt_cgroup_info {
+struct xt_cgroup_info_v0 {
__u32 id;
__u32 invert;
 };
diff --git a/net/netfilter/xt_cgroup.c b/net/netfilter/xt_cgroup.c
index 54eaeb4..1730025 100644
--- a/net/netfilter/xt_cgroup.c
+++ b/net/netfilter/xt_cgroup.c
@@ -24,9 +24,9 @@ MODULE_DESCRIPTION("Xtables: process control group matching");
 MODULE_ALIAS("ipt_cgroup");
 MODULE_ALIAS("ip6t_cgroup");
 
-static int cgroup_mt_check(const struct xt_mtchk_param *par)
+static int cgroup_mt_check_v0(const struct xt_mtchk_param *par)
 {
-   struct xt_cgroup_info *info = par->matchinfo;
+   struct xt_cgroup_info_v0 *info = par->matchinfo;
 
if (info->invert & ~1)
return -EINVAL;
@@ -35,9 +35,9 @@ static int cgroup_mt_check(const struct xt_mtchk_param *par)
 }
 
 static bool
-cgroup_mt(const struct sk_buff *skb, struct xt_action_param *par)
+cgroup_mt_v0(const struct sk_buff *skb, struct xt_action_param *par)
 {
-   const struct xt_cgroup_info *info = par->matchinfo;
+   const struct xt_cgroup_info_v0 *info = par->matchinfo;
 
if (skb->sk == NULL || !sk_fullsock(skb->sk))
return false;
@@ -46,27 +46,29 @@ cgroup_mt(const struct sk_buff *skb, struct xt_action_param 
*par)
info->invert;
 }
 
-static struct xt_match cgroup_mt_reg __read_mostly = {
-   .name   = "cgroup",
-   .revision   = 0,
-   .family = NFPROTO_UNSPEC,
-   .checkentry = cgroup_mt_check,
-   .match  = cgroup_mt,
-   .matchsize  = sizeof(struct xt_cgroup_info),
-   .me = THIS_MODULE,
-   .hooks  = (1 << NF_INET_LOCAL_OUT) |
- (1 << NF_INET_POST_ROUTING) |
- (1 << NF_INET_LOCAL_IN),
+static struct xt_match cgroup_mt_reg[] __read_mostly = {
+   {
+   .name   = "cgroup",
+   .revision   = 0,
+   .family = NFPROTO_UNSPEC,
+   .checkentry = cgroup_mt_check_v0,
+   .match  = cgroup_mt_v0,
+   .matchsize  = sizeof(struct xt_cgroup_info_v0),
+   .me = THIS_MODULE,
+   .hooks  = (1 << NF_INET_LOCAL_OUT) |
+ (1 << NF_INET_POST_ROUTING) |
+ (1 << NF_INET_LOCAL_IN),
+   },
 };
 
 static int __init cgroup_mt_init(void)
 {
-   return xt_register_match(&cgroup_mt_reg);
+   return xt_register_matches(cgroup_mt_reg, ARRAY_SIZE(cgroup_mt_reg));
 }
 
 static void __exit cgroup_mt_exit(void)
 {
-   xt_unregister_match(&cgroup_mt_reg);
+   xt_unregister_matches(cgroup_mt_reg, ARRAY_SIZE(cgroup_mt_reg));
 }
 
 module_init(cgroup_mt_init);
-- 
2.1.4



[PATCH 13/23] netfilter: nft_payload: add packet mangling support

2015-12-18 Thread Pablo Neira Ayuso
From: Patrick McHardy 

Add support for mangling packet payload. Checksum for the specified base
header is updated automatically if requested, however no updates for any
kind of pseudo headers are supported, meaning no stateless NAT is supported.

For checksum updates different checksumming methods can be specified. The
currently supported methods are NONE for no checksum updates, and INET for
internet type checksums.
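
As background on the INET method: an internet checksum can be patched in
place when a few bytes change, without re-summing the whole header (the
incremental update described in RFC 1624). The userspace sketch below only
illustrates that arithmetic. It is not the code from this patch, and every
name ending in _sketch is made up for illustration. It assumes the mangled
bytes start at an even offset inside the checksummed area and that the
buffer is small enough for a 32-bit accumulator not to overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add len bytes to a ones' complement accumulator as big-endian 16-bit words. */
static uint32_t csum_add_bytes_sketch(const uint8_t *data, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;	/* pad an odd tail with zero */
	return sum;
}

/* Fold the carries back in until the sum fits in 16 bits. */
static uint16_t csum_fold16_sketch(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Full checksum; the caller zeroes the checksum field beforehand. */
static uint16_t csum_full_sketch(const uint8_t *data, size_t len)
{
	return (uint16_t)~csum_fold16_sketch(csum_add_bytes_sketch(data, len, 0));
}

/* Incremental update: new = ~(~old + ~sum(from) + sum(to)). */
static uint16_t csum_replace_sketch(uint16_t check, const uint8_t *from,
				    const uint8_t *to, size_t len)
{
	uint32_t sum;

	assert(from && to);
	sum = (uint16_t)~check;
	sum += (uint16_t)~csum_fold16_sketch(csum_add_bytes_sketch(from, len, 0));
	sum += csum_fold16_sketch(csum_add_bytes_sketch(to, len, 0));
	return (uint16_t)~csum_fold16_sketch(sum);
}
```

For a 20-byte IPv4 header, calling csum_replace_sketch() with the old and
new contents of the rewritten bytes should give the same value as zeroing
the checksum field and running csum_full_sketch() over the modified header.
Pseudo-header checksums (TCP/UDP) are not covered, which matches the
limitation stated above.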

Signed-off-by: Patrick McHardy 
Signed-off-by: Pablo Neira Ayuso 
---
 include/net/netfilter/nf_tables_core.h   |   9 +++
 include/uapi/linux/netfilter/nf_tables.h |  17 
 net/netfilter/nft_payload.c  | 135 +--
 3 files changed, 155 insertions(+), 6 deletions(-)

diff --git a/include/net/netfilter/nf_tables_core.h 
b/include/net/netfilter/nf_tables_core.h
index c6f400c..4ff5424 100644
--- a/include/net/netfilter/nf_tables_core.h
+++ b/include/net/netfilter/nf_tables_core.h
@@ -47,6 +47,15 @@ struct nft_payload {
enum nft_registers  dreg:8;
 };
 
+struct nft_payload_set {
+   enum nft_payload_bases  base:8;
+   u8  offset;
+   u8  len;
+   enum nft_registers  sreg:8;
+   u8  csum_type;
+   u8  csum_offset;
+};
+
 extern const struct nft_expr_ops nft_payload_fast_ops;
 
 int nft_payload_module_init(void);
diff --git a/include/uapi/linux/netfilter/nf_tables.h 
b/include/uapi/linux/netfilter/nf_tables.h
index d8c8a7c..5f3ecec 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -598,12 +598,26 @@ enum nft_payload_bases {
 };
 
 /**
+ * enum nft_payload_csum_types - nf_tables payload expression checksum types
+ *
+ * @NFT_PAYLOAD_CSUM_NONE: no checksumming
+ * @NFT_PAYLOAD_CSUM_INET: internet checksum (RFC 791)
+ */
+enum nft_payload_csum_types {
+   NFT_PAYLOAD_CSUM_NONE,
+   NFT_PAYLOAD_CSUM_INET,
+};
+
+/**
  * enum nft_payload_attributes - nf_tables payload expression netlink 
attributes
  *
  * @NFTA_PAYLOAD_DREG: destination register to load data into (NLA_U32: 
nft_registers)
  * @NFTA_PAYLOAD_BASE: payload base (NLA_U32: nft_payload_bases)
  * @NFTA_PAYLOAD_OFFSET: payload offset relative to base (NLA_U32)
  * @NFTA_PAYLOAD_LEN: payload length (NLA_U32)
+ * @NFTA_PAYLOAD_SREG: source register to load data from (NLA_U32: 
nft_registers)
+ * @NFTA_PAYLOAD_CSUM_TYPE: checksum type (NLA_U32)
+ * @NFTA_PAYLOAD_CSUM_OFFSET: checksum offset relative to base (NLA_U32)
  */
 enum nft_payload_attributes {
NFTA_PAYLOAD_UNSPEC,
@@ -611,6 +625,9 @@ enum nft_payload_attributes {
NFTA_PAYLOAD_BASE,
NFTA_PAYLOAD_OFFSET,
NFTA_PAYLOAD_LEN,
+   NFTA_PAYLOAD_SREG,
+   NFTA_PAYLOAD_CSUM_TYPE,
+   NFTA_PAYLOAD_CSUM_OFFSET,
__NFTA_PAYLOAD_MAX
 };
 #define NFTA_PAYLOAD_MAX   (__NFTA_PAYLOAD_MAX - 1)
diff --git a/net/netfilter/nft_payload.c b/net/netfilter/nft_payload.c
index 09b4b07..12cd4bf 100644
--- a/net/netfilter/nft_payload.c
+++ b/net/netfilter/nft_payload.c
@@ -107,10 +107,13 @@ err:
 }
 
 static const struct nla_policy nft_payload_policy[NFTA_PAYLOAD_MAX + 1] = {
-   [NFTA_PAYLOAD_DREG] = { .type = NLA_U32 },
-   [NFTA_PAYLOAD_BASE] = { .type = NLA_U32 },
-   [NFTA_PAYLOAD_OFFSET]   = { .type = NLA_U32 },
-   [NFTA_PAYLOAD_LEN]  = { .type = NLA_U32 },
+   [NFTA_PAYLOAD_SREG] = { .type = NLA_U32 },
+   [NFTA_PAYLOAD_DREG] = { .type = NLA_U32 },
+   [NFTA_PAYLOAD_BASE] = { .type = NLA_U32 },
+   [NFTA_PAYLOAD_OFFSET]   = { .type = NLA_U32 },
+   [NFTA_PAYLOAD_LEN]  = { .type = NLA_U32 },
+   [NFTA_PAYLOAD_CSUM_TYPE]= { .type = NLA_U32 },
+   [NFTA_PAYLOAD_CSUM_OFFSET]  = { .type = NLA_U32 },
 };
 
 static int nft_payload_init(const struct nft_ctx *ctx,
@@ -160,6 +163,118 @@ const struct nft_expr_ops nft_payload_fast_ops = {
.dump   = nft_payload_dump,
 };
 
+static void nft_payload_set_eval(const struct nft_expr *expr,
+struct nft_regs *regs,
+const struct nft_pktinfo *pkt)
+{
+   const struct nft_payload_set *priv = nft_expr_priv(expr);
+   struct sk_buff *skb = pkt->skb;
+   const u32 *src = &regs->data[priv->sreg];
+   int offset, csum_offset;
+   __wsum fsum, tsum;
+   __sum16 sum;
+
+   switch (priv->base) {
+   case NFT_PAYLOAD_LL_HEADER:
+   if (!skb_mac_header_was_set(skb))
+   goto err;
+   offset = skb_mac_header(skb) - skb->data;
+   break;
+   case NFT_PAYLOAD_NETWORK_HEADER:
+   offset = skb_network_offset(skb);
+   break;
+   case NFT_PAYLOAD_TRANSPORT_HEADER:
+   offset = pkt->xt.thoff;
+   break;
+   default:
+

[PATCH 06/23] netfilter-bridge: layout of if statements

2015-12-18 Thread Pablo Neira Ayuso
From: Ian Morris 

Eliminate some checkpatch issues by improved layout of if statements.

No changes detected by objdiff.

Signed-off-by: Ian Morris 
Signed-off-by: Pablo Neira Ayuso 
---
 net/bridge/netfilter/ebt_ip6.c  | 4 ++--
 net/bridge/netfilter/ebtables.c | 8 
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/bridge/netfilter/ebt_ip6.c b/net/bridge/netfilter/ebt_ip6.c
index 17fd5f2..98de6e7 100644
--- a/net/bridge/netfilter/ebt_ip6.c
+++ b/net/bridge/netfilter/ebt_ip6.c
@@ -65,8 +65,8 @@ ebt_ip6_mt(const struct sk_buff *skb, struct xt_action_param 
*par)
return false;
if (FWINV(info->protocol != nexthdr, EBT_IP6_PROTO))
return false;
-   if (!(info->bitmask & ( EBT_IP6_DPORT |
-   EBT_IP6_SPORT | EBT_IP6_ICMP6)))
+   if (!(info->bitmask & (EBT_IP6_DPORT |
+  EBT_IP6_SPORT | EBT_IP6_ICMP6)))
return true;
 
/* min icmpv6 headersize is 4, so sizeof(_pkthdr) is ok. */
diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index b13ea69..67b2e27 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -161,7 +161,7 @@ ebt_basic_match(const struct ebt_entry *e, const struct 
sk_buff *skb,
for (i = 0; i < 6; i++)
verdict |= (h->h_source[i] ^ e->sourcemac[i]) &
   e->sourcemsk[i];
-   if (FWINV2(verdict != 0, EBT_ISOURCE) )
+   if (FWINV2(verdict != 0, EBT_ISOURCE))
return 1;
}
if (e->bitmask & EBT_DESTMAC) {
@@ -169,7 +169,7 @@ ebt_basic_match(const struct ebt_entry *e, const struct 
sk_buff *skb,
for (i = 0; i < 6; i++)
verdict |= (h->h_dest[i] ^ e->destmac[i]) &
   e->destmsk[i];
-   if (FWINV2(verdict != 0, EBT_IDEST) )
+   if (FWINV2(verdict != 0, EBT_IDEST))
return 1;
}
return 0;
@@ -673,7 +673,7 @@ ebt_check_entry(struct ebt_entry *e, struct net *net,
BUGPRINT("Unknown flag for inv bitmask\n");
return -EINVAL;
}
-   if ( (e->bitmask & EBT_NOPROTO) && (e->bitmask & EBT_802_3) ) {
+   if ((e->bitmask & EBT_NOPROTO) && (e->bitmask & EBT_802_3)) {
BUGPRINT("NOPROTO & 802_3 not allowed\n");
return -EINVAL;
}
@@ -1370,7 +1370,7 @@ static inline int ebt_make_watchername(const struct 
ebt_entry_watcher *w,
char name[EBT_FUNCTION_MAXNAMELEN] = {};
 
strlcpy(name, w->u.watcher->name, sizeof(name));
-   if (copy_to_user(hlp , name, EBT_FUNCTION_MAXNAMELEN))
+   if (copy_to_user(hlp, name, EBT_FUNCTION_MAXNAMELEN))
return -EFAULT;
return 0;
 }
-- 
2.1.4



Re: [PATCH 2/2] can: sja1000: of: add compatibility with Technologic Systems version

2015-12-18 Thread Marc Kleine-Budde
On 12/18/2015 09:17 PM, Damien Riegel wrote:
> Technologic Systems provides an IP compatible with the SJA1000,
> instantiated in an FPGA. Because of some bus widths issue, access to
> registers is made through a "window" that works like this:
> 
> base + 0x0: address to read/write
> base + 0x2: 8-bit register value
> 
> This commit adds a new compatible device, "technologic,sja1000", with
> read and write functions using the window mechanism.
> 
> Signed-off-by: Damien Riegel 
> ---
>  drivers/net/can/sja1000/sja1000_platform.c | 30 
> --
>  1 file changed, 28 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/can/sja1000/sja1000_platform.c 
> b/drivers/net/can/sja1000/sja1000_platform.c
> index 0552ed4..6cbf251 100644
> --- a/drivers/net/can/sja1000/sja1000_platform.c
> +++ b/drivers/net/can/sja1000/sja1000_platform.c
> @@ -70,6 +70,18 @@ static void sp_write_reg32(const struct sja1000_priv 
> *priv, int reg, u8 val)
>   iowrite8(val, priv->reg_base + reg * 4);
>  }
>  
> +static u8 ts4800_read_reg16(const struct sja1000_priv *priv, int reg)
> +{
> + sp_write_reg16(priv, 0,  reg);
> + return sp_read_reg16(priv, 2);

This is racy, please add a spinlock.

> +}
> +
> +static void ts4800_write_reg16(const struct sja1000_priv *priv, int reg, u8 
> val)
> +{
> + sp_write_reg16(priv, 0, reg);
> + sp_write_reg16(priv, 2, val);

This is racy, too.

Have a look at https://marc.info/?l=linux-can&m=137149497403825&w=2
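
To make the race concrete: the window access is two steps (latch the
address, then touch the data register), so a second caller that latches a
different address between those steps makes the first caller hit the wrong
register. The sketch below is a userspace simulation, not driver code. The
struct and function names are hypothetical, and a C11 atomic_flag stands in
for the kernel spinlock. In the driver the equivalent would presumably be a
spinlock_t taken with spin_lock_irqsave() around both accesses; that
mapping is an assumption, not something stated in this thread.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Simulated device: an address latch plus the registers behind the window. */
struct window_dev_sketch {
	atomic_flag lock;	/* stand-in for a kernel spinlock_t */
	uint8_t addr;		/* models the latch at "base + 0x0" */
	uint8_t regs[256];	/* models what "base + 0x2" exposes */
};

static void window_lock_sketch(struct window_dev_sketch *d)
{
	while (atomic_flag_test_and_set_explicit(&d->lock, memory_order_acquire))
		;	/* spin until the current holder releases the lock */
}

static void window_unlock_sketch(struct window_dev_sketch *d)
{
	atomic_flag_clear_explicit(&d->lock, memory_order_release);
}

static uint8_t window_read_sketch(struct window_dev_sketch *d, uint8_t reg)
{
	uint8_t val;

	window_lock_sketch(d);
	d->addr = reg;			/* step 1: select the register */
	val = d->regs[d->addr];		/* step 2: read it through the window */
	window_unlock_sketch(d);
	return val;
}

static void window_write_sketch(struct window_dev_sketch *d, uint8_t reg,
				uint8_t val)
{
	window_lock_sketch(d);
	d->addr = reg;			/* step 1: select the register */
	d->regs[d->addr] = val;		/* step 2: write it through the window */
	window_unlock_sketch(d);
}
```

Holding the lock across both steps is the point: the address latch is
shared state, so the select and the data access have to be one critical
section. A simulated device would be initialised with
{ .lock = ATOMIC_FLAG_INIT } before use.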

Marc

> +}
> +
>  static void sp_populate(struct sja1000_priv *priv,
>   struct sja1000_platform_data *pdata,
>   unsigned long resource_mem_flags)
> @@ -98,21 +110,34 @@ static void sp_populate(struct sja1000_priv *priv,
>  
>  static void sp_populate_of(struct sja1000_priv *priv, struct device_node *of)
>  {
> + int is_technologic;
>   int err;
>   u32 prop;
>  
> + is_technologic = of_device_is_compatible(of, "technologic,sja1000");
> +
>   err = of_property_read_u32(of, "reg-io-width", &prop);
>   if (err)
>   prop = 1; /* 8 bit is default */
>  
> + if (is_technologic && prop != 2) {
> + netdev_warn(priv->dev, "forcing reg-io-width to 2\n");
> + prop = 2;
> + }
> +
>   switch (prop) {
>   case 4:
>   priv->read_reg = sp_read_reg32;
>   priv->write_reg = sp_write_reg32;
>   break;
>   case 2:
> - priv->read_reg = sp_read_reg16;
> - priv->write_reg = sp_write_reg16;
> + if (is_technologic) {
> + priv->read_reg = ts4800_read_reg16;
> + priv->write_reg = ts4800_write_reg16;
> + } else {
> + priv->read_reg = sp_read_reg16;
> + priv->write_reg = sp_write_reg16;
> + }
>   break;
>   case 1: /* fallthrough */
>   default:
> @@ -244,6 +269,7 @@ static int sp_remove(struct platform_device *pdev)
>  
>  static const struct of_device_id sp_of_table[] = {
>   {.compatible = "nxp,sja1000"},
> + {.compatible = "technologic,sja1000"},
>   {},
>  };
>  MODULE_DEVICE_TABLE(of, sp_of_table);
> 


-- 
Pengutronix e.K.  | Marc Kleine-Budde   |
Industrial Linux Solutions| Phone: +49-231-2826-924 |
Vertretung West/Dortmund  | Fax:   +49-5121-206917- |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |


