date:20160217

Re: [PATCH net-next] ipv6: pass up EMSGSIZE msg for UDP socket in Ipv6

2016-02-17 Thread Eric Dumazet

On mer., 2016-02-17 at 13:58 -0800, Wei Wang wrote:
> From: Wei Wang 
> 
> In IPv4, when the machine receives an ICMP_FRAG_NEEDED message, the
> connected UDP socket will get an EMSGSIZE error on its next read from
> the socket.
> However, this is not the case for IPv6.
> This fix modifies the UDP error handler in IPv6 for ICMP6_PKT_TOOBIG to
> make it similar to the IPv4 behavior. That is, when the machine gets an
> ICMP6_PKT_TOOBIG message, the connected UDP socket will get an EMSGSIZE
> error on its next read from the socket.
> 
> Signed-off-by: Wei Wang 

Acked-by: Eric Dumazet 

Thanks Wei.
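
For readers following the thread, a minimal userspace sketch of how an application might react once a read on a connected UDP socket can fail with EMSGSIZE, as the patch description says. The names `recv_action` and `classify_recv_errno` are hypothetical and only illustrate one possible policy; nothing here is taken from the patch itself.

```c
#include <errno.h>

/* Hypothetical policy for an errno value seen after recv() on a connected UDP socket. */
enum recv_action {
	RECV_RETRY,       /* transient condition: call recv() again */
	RECV_SHRINK_SEND, /* path MTU dropped: send smaller datagrams from now on */
	RECV_FAIL         /* anything else: report the error to the caller */
};

static enum recv_action classify_recv_errno(int err)
{
	switch (err) {
	case EINTR:
	case EAGAIN:
		return RECV_RETRY;
	case EMSGSIZE:
		/* The case this patch makes visible for IPv6 as well. */
		return RECV_SHRINK_SEND;
	default:
		return RECV_FAIL;
	}
}
```

A caller would pass `errno` to `classify_recv_errno()` right after a failed `recv()` and, on `RECV_SHRINK_SEND`, lower its datagram size before sending again.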

Re: [Intel-wired-lan] [next] igb: allow setting MAC address on i211 using a device tree blob V4

2016-02-17 Thread John Holland


> On Feb 18, 2016, at 03:29, David Miller  wrote:
> 
> From: John Holland 
> Date: Thu, 18 Feb 2016 00:49:17 +0100
> 
>> The Intel i211 LOM PCIe Ethernet controllers' iNVM operates as an OTP
>> and has no external EEPROM interface [1]. The following allows the
>> driver to pick up the MAC address from a device tree blob when
>> CONFIG_OF has been enabled.
> 
> Please use the generic eth_platform_get_mac_address(), or
> alternatively structure your code like the ixgbe and other cases so
> that SPARC and other OF platforms get this support as well.

Don't know what you mean. The PCI path in eth_platform_get_mac_address() didn't 
return a devicetree node and I can find no instance of of_ use in 
ixgbe.

John

[RESEND PATCH v3] isdn: divamnt: use y2038-safe ktime_get_ts64() for trace data timestamps

2016-02-17 Thread Alison Schofield

divamnt stores a start_time at module init and uses it to calculate
elapsed time. The elapsed time, stored in secs and usecs, is part of
the trace data the driver maintains for the DIVA Server ISDN cards.
No change to the format of that time data is required.

To avoid overflow on 32-bit systems use ktime_get_ts64() to return
the elapsed monotonic time since system boot.

This is a change from real to monotonic time. Since the driver only
stores elapsed time, monotonic time is sufficient and more robust
against real time clock changes. These new monotonic values can be
more useful for debugging because they can be easily compared to
other monotonic timestamps.

Note elapsed time values will now start at system boot time rather
than module load time, so they will differ slightly from previously
reported values.

Remove declaration and init of previously unused time constants:
start_sec, start_usec.

Signed-off-by: Alison Schofield 
Reviewed-by: Arnd Bergmann 
---
Changes in v3:
  - use elapsed time since system boot in place of
    elapsed time since module load
  - commit message updated
  - changelog updated

Changes in v2:
 - switched to monotonic time
 - removed the unused time constants
 - changelog updated


 drivers/isdn/hardware/eicon/debug.c   |  4 
 drivers/isdn/hardware/eicon/divamnt.c | 30 ++
 2 files changed, 6 insertions(+), 28 deletions(-)

diff --git a/drivers/isdn/hardware/eicon/debug.c b/drivers/isdn/hardware/eicon/debug.c
index b5226af..576b7b4 100644
--- a/drivers/isdn/hardware/eicon/debug.c
+++ b/drivers/isdn/hardware/eicon/debug.c
@@ -192,8 +192,6 @@ static diva_os_spin_lock_t dbg_q_lock;
 static diva_os_spin_lock_t dbg_adapter_lock;
 static int dbg_q_busy;
 static volatile dword  dbg_sequence;
-static dword   start_sec;
-static dword   start_usec;
 
 /*
   INTERFACE:
@@ -215,8 +213,6 @@ int diva_maint_init(byte *base, unsigned long length, int do_init) {
 
dbg_base = base;
 
-   diva_os_get_time(&start_sec, &start_usec);
-
*(dword *)base  = (dword)DBG_MAGIC; /* Store Magic */
base   += sizeof(dword);
length -= sizeof(dword);
diff --git a/drivers/isdn/hardware/eicon/divamnt.c b/drivers/isdn/hardware/eicon/divamnt.c
index 48db08d..0de29b7b 100644
--- a/drivers/isdn/hardware/eicon/divamnt.c
+++ b/drivers/isdn/hardware/eicon/divamnt.c
@@ -45,7 +45,6 @@ char *DRIVERRELEASE_MNT = "2.0";
 
 static wait_queue_head_t msgwaitq;
 static unsigned long opened;
-static struct timeval start_time;
 
 extern int mntfunc_init(int *, void **, unsigned long);
 extern void mntfunc_finit(void);
@@ -88,28 +87,12 @@ int diva_os_copy_from_user(void *os_handle, void *dst, const void __user *src,
  */
 void diva_os_get_time(dword *sec, dword *usec)
 {
-   struct timeval tv;
-
-   do_gettimeofday(&tv);
-
-   if (tv.tv_sec > start_time.tv_sec) {
-   if (start_time.tv_usec > tv.tv_usec) {
-   tv.tv_sec--;
-   tv.tv_usec += 1000000;
-   }
-   *sec = (dword) (tv.tv_sec - start_time.tv_sec);
-   *usec = (dword) (tv.tv_usec - start_time.tv_usec);
-   } else if (tv.tv_sec == start_time.tv_sec) {
-   *sec = 0;
-   if (start_time.tv_usec < tv.tv_usec) {
-   *usec = (dword) (tv.tv_usec - start_time.tv_usec);
-   } else {
-   *usec = 0;
-   }
-   } else {
-   *sec = (dword) tv.tv_sec;
-   *usec = (dword) tv.tv_usec;
-   }
+   struct timespec64 time;
+
+   ktime_get_ts64(&time);
+
+   *sec = (dword) time.tv_sec;
+   *usec = (dword) (time.tv_nsec / NSEC_PER_USEC);
 }
 
 /*
@@ -213,7 +196,6 @@ static int __init maint_init(void)
int ret = 0;
void *buffer = NULL;
 
-   do_gettimeofday(&start_time);
init_waitqueue_head(&msgwaitq);
 
printk(KERN_INFO "%s\n", DRIVERNAME);
-- 
2.1.4
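
As an illustration of the conversion this patch performs, here is a small userspace sketch that reads the monotonic clock and splits it into seconds and microseconds. It assumes `clock_gettime(CLOCK_MONOTONIC, ...)` as the userspace analogue of `ktime_get_ts64()`; the helper names `split_elapsed` and `get_elapsed` are hypothetical and do not appear in the driver.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

#define NSEC_PER_USEC 1000L

/* Split a timespec into whole seconds and microseconds, truncated to 32 bits each. */
static void split_elapsed(const struct timespec *ts, uint32_t *sec, uint32_t *usec)
{
	*sec = (uint32_t)ts->tv_sec;
	*usec = (uint32_t)(ts->tv_nsec / NSEC_PER_USEC);
}

/* Read the monotonic clock and report it as (sec, usec); returns 0 on success. */
static int get_elapsed(uint32_t *sec, uint32_t *usec)
{
	struct timespec ts;

	if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
		return -1;
	split_elapsed(&ts, sec, usec);
	return 0;
}
```

Because the monotonic clock never jumps backwards, no start-time bookkeeping or borrow logic is needed, which is why the patched function is so much shorter than the one it replaces.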

Re: [PATCH net] vxlan: do not use fdb in metadata mode

2016-02-17 Thread Simon Horman

Hi Jiri,

On Tue, Feb 16, 2016 at 10:18:26PM +0100, Jiri Benc wrote:
> In metadata mode, the vxlan interface is not supposed to use the fdb control
> plane but an external one (openvswitch or static routes). With the current
> code, packets may leak into the fdb handling code which usually causes them
> to be dropped anyway but may have strange side effects.
> 
> Just drop the packets directly when in metadata mode if the destination data
> are not correctly provided on egress.
> 
> Signed-off-by: Jiri Benc 

The logic here looks correct to me but I am curious to know
what circumstances would lead to the kfree_skb() case.

> ---
>  drivers/net/vxlan.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
> index db96f3a16f6c..e6944b29588e 100644
> --- a/drivers/net/vxlan.c
> +++ b/drivers/net/vxlan.c
> @@ -2171,9 +2171,11 @@ static netdev_tx_t vxlan_xmit(struct sk_buff *skb, struct net_device *dev)
>  #endif
>   }
>  
> - if (vxlan->flags & VXLAN_F_COLLECT_METADATA &&
> - info && info->mode & IP_TUNNEL_INFO_TX) {
> - vxlan_xmit_one(skb, dev, NULL, false);
> + if (vxlan->flags & VXLAN_F_COLLECT_METADATA) {
> + if (info && info->mode & IP_TUNNEL_INFO_TX)
> + vxlan_xmit_one(skb, dev, NULL, false);
> + else
> + kfree_skb(skb);
>   return NETDEV_TX_OK;
>   }
>  
> -- 
> 1.8.3.1
>
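
The control-flow change in the quoted hunk can be modelled in plain C. This is only a sketch: the bit values and the names `old_path`, `new_path` and `xmit_path` are stand-ins invented for illustration, not kernel definitions. It shows which inputs end up in the new drop branch.

```c
#include <stdbool.h>

/* Stand-in bit values for this sketch only; the kernel's real values may differ. */
#define SKETCH_F_COLLECT_METADATA 0x1u
#define SKETCH_TUNNEL_INFO_TX     0x1u

enum xmit_path { PATH_XMIT_ONE, PATH_DROP, PATH_FDB };

/* Old logic: metadata-mode packets without usable tunnel info fell through to the fdb. */
static enum xmit_path old_path(unsigned int flags, bool has_info, unsigned int mode)
{
	if ((flags & SKETCH_F_COLLECT_METADATA) && has_info &&
	    (mode & SKETCH_TUNNEL_INFO_TX))
		return PATH_XMIT_ONE;
	return PATH_FDB;
}

/* Patched logic: in metadata mode the fdb is never reached; bad packets are dropped. */
static enum xmit_path new_path(unsigned int flags, bool has_info, unsigned int mode)
{
	if (flags & SKETCH_F_COLLECT_METADATA) {
		if (has_info && (mode & SKETCH_TUNNEL_INFO_TX))
			return PATH_XMIT_ONE;
		return PATH_DROP;
	}
	return PATH_FDB;
}
```

In this model the drop branch is taken when metadata mode is on and either no tunnel info is attached or the attached info is not marked for transmit.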

Re: [net] r8169 Remove duplicate command which set RxVlan and RxChksum bits.

2016-02-17 Thread Simon Horman

On Sun, Jan 31, 2016 at 05:33:08PM +0200, Corcodel Marian wrote:
>  These bits are already set at the set-features stage.
> 
> Signed-off-by: Corcodel Marian 
> ---
>  drivers/net/ethernet/realtek/r8169.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
> index 17d5571..a8fb86d 100644
> --- a/drivers/net/ethernet/realtek/r8169.c
> +++ b/drivers/net/ethernet/realtek/r8169.c
> @@ -8368,7 +8368,6 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>   dev->vlan_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO |
>   NETIF_F_HIGHDMA;
>  
> - tp->cp_cmd |= RxChkSum | RxVlan;
>  

To avoid two consecutive blank lines, which seems excessive here,
you may want to consider removing the blank line before or after the
line of code you are removing.

>   /*
>* Pretend we are using VLANs; This bypasses a nasty bug where
> -- 
> 2.5.0
>

Re: [net-next 13/15] i40e/i40evf: use logical operators, not bitwise

2016-02-17 Thread Joe Perches

On Wed, 2016-02-17 at 19:38 -0800, Jeff Kirsher wrote:
> From: Mitch Williams 
> 
> Mr. Spock would certainly raise an eyebrow to see us using bitwise
> operators, when we should clearly be relying on logic. Fascinating.

I think it read better before this change.

Spock might have looked at the type of the
variable before raising that eyebrow.

clean_complete is bool so it's not actually a
bitwise operation,

> diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
[]
> @@ -1996,7 +1996,8 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
>    * budget and be more aggressive about cleaning up the Tx descriptors.
>    */
>   i40e_for_each_ring(ring, q_vector->tx) {
> - clean_complete &= i40e_clean_tx_irq(ring, vsi->work_limit);
> + clean_complete = clean_complete &&
> +  i40e_clean_tx_irq(ring, vsi->work_limit);

etc...
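
One behavioural point worth illustrating: on a bool, `&=` and `&&` produce the same truth value, but `&&` short-circuits, so the right-hand call is skipped once the left side is false. The standalone sketch below counts calls to show the difference; `clean_ring` is a hypothetical stand-in and not the driver's function.

```c
#include <stdbool.h>

static int calls; /* number of times clean_ring() ran */

/* Hypothetical stand-in for a per-ring cleanup routine; ring 0 reports "not done". */
static bool clean_ring(int ring)
{
	calls++;
	return ring != 0;
}

/* "&=" evaluates clean_ring() for every ring, whatever the running result is. */
static int count_calls_bitwise(int nrings)
{
	bool clean_complete = true;

	calls = 0;
	for (int i = 0; i < nrings; i++)
		clean_complete &= clean_ring(i);
	return calls;
}

/* "&&" short-circuits: once clean_complete is false, clean_ring() is skipped. */
static int count_calls_logical(int nrings)
{
	bool clean_complete = true;

	calls = 0;
	for (int i = 0; i < nrings; i++)
		clean_complete = clean_complete && clean_ring(i);
	return calls;
}
```

With four rings, the `&=` loop calls the cleanup routine four times while the `&&` loop calls it only once, so the two forms are not interchangeable when the right-hand side has side effects.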

Re: [net-next 00/15][pull request] 40GbE Intel Wired LAN Driver Updates 2016-02-17

2016-02-17 Thread David Miller

From: Jeff Kirsher 
Date: Wed, 17 Feb 2016 19:38:42 -0800

> This series contains updates to i40e/i40evf only (again).

Spock approves, pulled, thanks a lot.

Re: [PATCH v2 net-next 6/8] net: mvneta: bm: add support for hardware buffer management

2016-02-17 Thread David Miller

From: Gregory CLEMENT 
Date: Tue, 16 Feb 2016 16:33:41 +0100

>   pp->dev = dev;
>   SET_NETDEV_DEV(dev, &pdev->dev);
>  
> + dev->features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO;
> + dev->hw_features |= dev->features;
> + dev->vlan_features |= dev->features;
> + dev->priv_flags |= IFF_UNICAST_FLT;
> + dev->gso_max_segs = MVNETA_MAX_TSO_SEGS;
> +
> + err = register_netdev(dev);
> + if (err < 0) {
> + dev_err(&pdev->dev, "failed to register\n");
> + goto err_free_stats;
> + }
> +
> + pp->id = dev->ifindex;
> +
> + /* Obtain access to BM resources if enabled and already initialized */
> + bm_node = of_parse_phandle(dn, "buffer-manager", 0);
> + if (bm_node && bm_node->data) {

This set of changes has a lot of problems.

First, the exact moment you call register_netdev() your device must be
fully initialized because ->open() can be invoked immediately.  This
means you must take care of all of this buffer manager stuff before
calling register_netdev().

It must precisely be the last thing you invoke in your probe function
for this reason.

Also you are now adding conditionalized code to every fastpath in your
driver, that is ridiculous and is going to hurt performance.

Add separate code paths for the HWBM vs SWBM, and register a unique
set of netdev_ops as appropriate.
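
The two review points above (pick an ops table once instead of branching in the fast path, and register last) can be sketched in a self-contained model. Everything here is hypothetical: `sketch_ops`, `sketch_dev` and `sketch_probe` are illustrative stand-ins, not the mvneta driver or the real netdev_ops structure.

```c
#include <stdbool.h>

/* Hypothetical, much-reduced stand-in for a netdev_ops table. */
struct sketch_ops {
	const char *name;
	int (*rx_refill)(void);
};

static int hwbm_refill(void) { return 1; } /* hardware buffer manager path */
static int swbm_refill(void) { return 2; } /* software buffer manager path */

static const struct sketch_ops hwbm_ops = { "hwbm", hwbm_refill };
static const struct sketch_ops swbm_ops = { "swbm", swbm_refill };

struct sketch_dev {
	const struct sketch_ops *ops;
	bool registered;
};

/* Decide the buffer-management flavour once, finish all setup, register last. */
static void sketch_probe(struct sketch_dev *dev, bool have_bm)
{
	dev->ops = have_bm ? &hwbm_ops : &swbm_ops;
	/* ... all remaining initialization would go here ... */
	dev->registered = true; /* stands in for register_netdev() as the final step */
}
```

After probing, the fast path calls `dev->ops->rx_refill()` with no per-packet check of which buffer manager is in use.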

Re: [PATCH] et131x: check return value of dma_alloc_coherent

2016-02-17 Thread David Miller

From: Insu Yun 
Date: Mon, 15 Feb 2016 21:23:47 -0500

> For error handling, dma_alloc_coherent's return value
> needs to be checked, not argument.
> 
> Signed-off-by: Insu Yun 

Applied, thanks.

Re: [patch net-next v2 00/13] rocker: do world split

2016-02-17 Thread David Miller

From: Jiri Pirko 
Date: Tue, 16 Feb 2016 15:14:38 +0100

> This patchset allows new rocker worlds to be easily added in future.
> Two new worlds are now under development: P4 and eBPF.
> 
> The main part of the patchset is the OF-DPA carve-out. It results in an
> OF-DPA specific file. Clean cut.
> 
> Note this patchset is based on my original attempt in October 2015.
> I had to rebase, included all suggestions and did a lot of small changes.
> The main change is to go with the all-port-one-world approach. The port
> world is set according to what is set up in HW. It is not possible to
> change worlds from the driver.
> ---
> v1->v2:
>   patch 12/13:
>   - split port_init into pre-init and init

Series applied, thanks Jiri.

[PATCH net-next 3/3] samples/bpf: offwaketime example

2016-02-17 Thread Alexei Starovoitov

This is a simplified version of Brendan Gregg's offwaketime:
This program shows kernel stack traces and task names that were blocked and
"off-CPU", along with the stack traces and task names for the threads that woke
them, and the total elapsed time from when they blocked to when they were woken
up. The combined stacks, task names, and total time is summarized in kernel
context for efficiency.

Example:
$ sudo ./offwaketime | flamegraph.pl > demo.svg
Open demo.svg in the browser as FlameGraph visualization.

Signed-off-by: Alexei Starovoitov 
---
 samples/bpf/Makefile   |   4 +
 samples/bpf/bpf_helpers.h  |   2 +
 samples/bpf/offwaketime_kern.c | 131 +
 samples/bpf/offwaketime_user.c | 185 +
 4 files changed, 322 insertions(+)
 create mode 100644 samples/bpf/offwaketime_kern.c
 create mode 100644 samples/bpf/offwaketime_user.c

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index edd638b5825f..c4f8ae0c8afe 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -16,6 +16,7 @@ hostprogs-y += tracex5
 hostprogs-y += tracex6
 hostprogs-y += trace_output
 hostprogs-y += lathist
+hostprogs-y += offwaketime
 
 test_verifier-objs := test_verifier.o libbpf.o
 test_maps-objs := test_maps.o libbpf.o
@@ -32,6 +33,7 @@ tracex5-objs := bpf_load.o libbpf.o tracex5_user.o
 tracex6-objs := bpf_load.o libbpf.o tracex6_user.o
 trace_output-objs := bpf_load.o libbpf.o trace_output_user.o
 lathist-objs := bpf_load.o libbpf.o lathist_user.o
+offwaketime-objs := bpf_load.o libbpf.o offwaketime_user.o
 
 # Tell kbuild to always build the programs
 always := $(hostprogs-y)
@@ -47,6 +49,7 @@ always += tracex6_kern.o
 always += trace_output_kern.o
 always += tcbpf1_kern.o
 always += lathist_kern.o
+always += offwaketime_kern.o
 
 HOSTCFLAGS += -I$(objtree)/usr/include
 
@@ -63,6 +66,7 @@ HOSTLOADLIBES_tracex5 += -lelf
 HOSTLOADLIBES_tracex6 += -lelf
 HOSTLOADLIBES_trace_output += -lelf -lrt
 HOSTLOADLIBES_lathist += -lelf
+HOSTLOADLIBES_offwaketime += -lelf
 
 # point this to your LLVM backend with bpf support
 LLC=$(srctree)/tools/bpf/llvm/bld/Debug+Asserts/bin/llc
diff --git a/samples/bpf/bpf_helpers.h b/samples/bpf/bpf_helpers.h
index 7ad19e1dbaf4..811bcca0f29d 100644
--- a/samples/bpf/bpf_helpers.h
+++ b/samples/bpf/bpf_helpers.h
@@ -39,6 +39,8 @@ static int (*bpf_redirect)(int ifindex, int flags) =
(void *) BPF_FUNC_redirect;
 static int (*bpf_perf_event_output)(void *ctx, void *map, int index, void *data, int size) =
(void *) BPF_FUNC_perf_event_output;
+static int (*bpf_get_stackid)(void *ctx, void *map, int flags) =
+   (void *) BPF_FUNC_get_stackid;
 
 /* llvm builtin functions that eBPF C program may use to
  * emit BPF_LD_ABS and BPF_LD_IND instructions
diff --git a/samples/bpf/offwaketime_kern.c b/samples/bpf/offwaketime_kern.c
new file mode 100644
index 000000000000..c0aa5a9b9c48
--- /dev/null
+++ b/samples/bpf/offwaketime_kern.c
@@ -0,0 +1,131 @@
+/* Copyright (c) 2016 Facebook
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ */
+#include 
+#include "bpf_helpers.h"
+#include 
+#include 
+#include 
+#include 
+
+#define _(P) ({typeof(P) val = 0; bpf_probe_read(&val, sizeof(val), &P); val;})
+
+#define MINBLOCK_US 1
+
+struct key_t {
+   char waker[TASK_COMM_LEN];
+   char target[TASK_COMM_LEN];
+   u32 wret;
+   u32 tret;
+};
+
+struct bpf_map_def SEC("maps") counts = {
+   .type = BPF_MAP_TYPE_HASH,
+   .key_size = sizeof(struct key_t),
+   .value_size = sizeof(u64),
+   .max_entries = 10000,
+};
+
+struct bpf_map_def SEC("maps") start = {
+   .type = BPF_MAP_TYPE_HASH,
+   .key_size = sizeof(u32),
+   .value_size = sizeof(u64),
+   .max_entries = 10000,
+};
+
+struct wokeby_t {
+   char name[TASK_COMM_LEN];
+   u32 ret;
+};
+
+struct bpf_map_def SEC("maps") wokeby = {
+   .type = BPF_MAP_TYPE_HASH,
+   .key_size = sizeof(u32),
+   .value_size = sizeof(struct wokeby_t),
+   .max_entries = 10000,
+};
+
+struct bpf_map_def SEC("maps") stackmap = {
+   .type = BPF_MAP_TYPE_STACK_TRACE,
+   .key_size = sizeof(u32),
+   .value_size = PERF_MAX_STACK_DEPTH * sizeof(u64),
+   .max_entries = 10000,
+};
+
+#define STACKID_FLAGS (0 | BPF_F_FAST_STACK_CMP)
+
+SEC("kprobe/try_to_wake_up")
+int waker(struct pt_regs *ctx)
+{
+   struct task_struct *p = (void *) PT_REGS_PARM1(ctx);
+   struct wokeby_t woke = {};
+   u32 pid;
+
+   pid = _(p->pid);
+
+   bpf_get_current_comm(&woke.name, sizeof(woke.name));
+   woke.ret = bpf_get_stackid(ctx, &stackmap, STACKID_FLAGS);
+
+   bpf_map_update_elem(&wokeby, &pid, &woke, BPF_ANY);
+   return 0;
+}
+
+static inline int update_counts(struct pt_regs *ctx, u32 pid, u64 delta)
+{
+   struct key_t key = {};
+   struct wokeby_t

[PATCH net-next 0/3] bpf_get_stackid() and stack_trace map

2016-02-17 Thread Alexei Starovoitov

This patch set introduces a new map type to store stack traces and the
corresponding bpf_get_stackid() helper.
BPF programs can already walk the stack via an unrolled loop
of bpf_probe_read()s, which is ok for simple analysis, but it's
not efficient and is limited to <30 frames; beyond that the programs
don't fit into MAX_BPF_STACK. With the bpf_get_stackid() helper
the programs can collect up to PERF_MAX_STACK_DEPTH frames of both
user and kernel stacks.
Using stack traces as a key in a map turned out to be very useful
for generating flame graphs, off-cpu graphs, waker and chain graphs.
Patch 3 is a simplified version of 'offwaketime' tool which is
described in detail here:
http://brendangregg.com/blog/2016-02-01/linux-wakeup-offwake-profiling.html

Earlier versions of this patch were using the save_stack_trace() helper,
but 'unreliable' frames add too much noise and two equivalent
stack traces produce different 'stackid's.
Using the lockdep style of storing frames with MAX_STACK_TRACE_ENTRIES is
great for lockdep, but not acceptable for bpf, since the stack_trace
map needs to be freed when the user hits Ctrl-C to stop the tool.
The ftrace style with per_cpu(struct ftrace_stack) is great, but it's
tightly coupled with ftrace ring buffer and has the same 'unreliable'
noise. perf_event's perf_callchain() mechanism is also very efficient
and it only needed minor generalization which is done in patch 1
to be used by bpf stack_trace maps.
Peter, please take a look at patch 1.
If you're ok with it, I'd like to take the whole set via net-next.

Patch 1 - generalization of perf_callchain()
Patch 2 - stack_trace map done as lock-less hashtable without link list
  to avoid spinlock on insertion which is critical path when
  bpf_get_stackid() helper is called for every task switch event
Patch 3 - offwaketime example

After the patch, the 'perf report' for an artificial 'sched_bench'
benchmark that does pthread_cond_wait/signal while the 'offwaketime'
example is running in the background:
 16.35%  swapper      [kernel.vmlinux]    [k] intel_idle
  2.18%  sched_bench  [kernel.vmlinux]    [k] __switch_to
  2.18%  sched_bench  libpthread-2.12.so  [.] pthread_cond_signal@@GLIBC_2.3.2
  1.72%  sched_bench  libpthread-2.12.so  [.] pthread_mutex_unlock
  1.53%  sched_bench  [kernel.vmlinux]    [k] bpf_get_stackid
  1.44%  sched_bench  [kernel.vmlinux]    [k] entry_SYSCALL_64
  1.39%  sched_bench  [kernel.vmlinux]    [k] __call_rcu.constprop.73
  1.13%  sched_bench  libpthread-2.12.so  [.] pthread_mutex_lock
  1.07%  sched_bench  libpthread-2.12.so  [.] pthread_cond_wait@@GLIBC_2.3.2
  1.07%  sched_bench  [kernel.vmlinux]    [k] hash_futex
  1.05%  sched_bench  [kernel.vmlinux]    [k] do_futex
  1.05%  sched_bench  [kernel.vmlinux]    [k] get_futex_key_refs.isra.13

The hottest part of bpf_get_stackid() is the inlined jhash2, so we may consider
using some faster hash in the future, but it's good enough for now.

Alexei Starovoitov (3):
  perf: generalize perf_callchain
  bpf: introduce BPF_MAP_TYPE_STACK_TRACE
  samples/bpf: offwaketime example

 arch/x86/include/asm/stacktrace.h |   2 +-
 arch/x86/kernel/cpu/perf_event.c  |   4 +-
 arch/x86/kernel/dumpstack.c   |   6 +-
 arch/x86/kernel/stacktrace.c  |  18 +--
 arch/x86/oprofile/backtrace.c |   3 +-
 include/linux/bpf.h   |   1 +
 include/linux/perf_event.h|  13 ++-
 include/uapi/linux/bpf.h  |  21 
 kernel/bpf/Makefile   |   3 +
 kernel/bpf/stackmap.c | 237 ++
 kernel/bpf/verifier.c |   6 +-
 kernel/events/callchain.c |  32 +++--
 kernel/events/internal.h  |   2 -
 kernel/trace/bpf_trace.c  |   2 +
 samples/bpf/Makefile  |   4 +
 samples/bpf/bpf_helpers.h |   2 +
 samples/bpf/offwaketime_kern.c| 131 +
 samples/bpf/offwaketime_user.c| 185 +
 18 files changed, 642 insertions(+), 30 deletions(-)
 create mode 100644 kernel/bpf/stackmap.c
 create mode 100644 samples/bpf/offwaketime_kern.c
 create mode 100644 samples/bpf/offwaketime_user.c

-- 
2.4.6
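
To make the "stack trace as a map key" idea concrete, here is a single-threaded userspace model of turning a list of frame addresses into a small integer id by hashing into a fixed bucket array with no linked lists. It is a sketch under stated assumptions: it uses FNV-1a as a stand-in for the jhash2 mentioned above, it is not lock-less, and the names `bucket`, `hash_frames` and `get_stackid` are hypothetical and unrelated to the kernel implementation in patch 2.

```c
#include <stdint.h>
#include <string.h>

#define N_BUCKETS 16u /* power of two, so "hash & (N_BUCKETS - 1)" picks a slot */
#define MAX_DEPTH 8u

struct bucket {
	int used;
	uint32_t hash;
	uint32_t nr;
	uint64_t ip[MAX_DEPTH];
};

static struct bucket table[N_BUCKETS];

/* FNV-1a over the bytes of each frame address; a stand-in, not the kernel's jhash2. */
static uint32_t hash_frames(const uint64_t *ip, uint32_t nr)
{
	uint32_t h = 2166136261u;

	for (uint32_t i = 0; i < nr; i++) {
		for (int b = 0; b < 8; b++) {
			h ^= (uint32_t)((ip[i] >> (8 * b)) & 0xff);
			h *= 16777619u;
		}
	}
	return h;
}

/* Return the slot index for this trace, or -1 if the slot holds a different trace. */
static int get_stackid(const uint64_t *ip, uint32_t nr)
{
	uint32_t hash, id;
	struct bucket *b;

	if (nr == 0 || nr > MAX_DEPTH)
		return -1;
	hash = hash_frames(ip, nr);
	id = hash & (N_BUCKETS - 1);
	b = &table[id];
	if (b->used) {
		if (b->hash == hash && b->nr == nr &&
		    memcmp(b->ip, ip, nr * sizeof(*ip)) == 0)
			return (int)id; /* same trace seen before: same id */
		return -1; /* slot already taken by another trace */
	}
	b->used = 1;
	b->hash = hash;
	b->nr = nr;
	memcpy(b->ip, ip, nr * sizeof(*ip));
	return (int)id;
}
```

The property that matters for flame graphs is that the same trace always yields the same id, so the id can be used as (part of) a key in another map.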

[PATCH net-next 2/3] bpf: introduce BPF_MAP_TYPE_STACK_TRACE

2016-02-17 Thread Alexei Starovoitov

Add a new map type to store stack traces and the corresponding helper:
bpf_get_stackid(ctx, map, flags) - walk user or kernel stack and return id
@ctx: struct pt_regs*
@map: pointer to stack_trace map
@flags: bits 0-7 - number of stack frames to skip
        bit 8 - collect user stack instead of kernel
        bit 9 - compare stacks by hash only
        bit 10 - if two different stacks hash into the same stackid
                 discard old
        other bits - reserved
Return: >= 0 stackid on success or negative error

stackid is a 32-bit integer handle that can be further combined with
other data (including other stackid) and used as a key into maps.

Userspace will access stackmap using standard lookup/delete syscall commands to
retrieve full stack trace for given stackid.
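
As a small illustration of the flags layout described above, the sketch below composes and decodes a flags word. The four macro definitions are copied from the uapi hunk of this patch; the helpers `stackid_flags` and `stackid_skip` are hypothetical and exist only for this example.

```c
#include <stdint.h>

/* Flag layout copied from the uapi hunk in this patch. */
#define BPF_F_SKIP_FIELD_MASK 0xffULL
#define BPF_F_USER_STACK      (1ULL << 8)
#define BPF_F_FAST_STACK_CMP  (1ULL << 9)
#define BPF_F_REUSE_STACKID   (1ULL << 10)

/* Hypothetical helper: build a flags word from a skip count and option bits. */
static uint64_t stackid_flags(unsigned int skip, uint64_t opts)
{
	return ((uint64_t)skip & BPF_F_SKIP_FIELD_MASK) | opts;
}

/* Hypothetical helper: recover the skip count from a flags word. */
static unsigned int stackid_skip(uint64_t flags)
{
	return (unsigned int)(flags & BPF_F_SKIP_FIELD_MASK);
}
```

For example, skipping two frames of the user stack with hash-only comparison would be `stackid_flags(2, BPF_F_USER_STACK | BPF_F_FAST_STACK_CMP)`.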

Signed-off-by: Alexei Starovoitov 
---
 include/linux/bpf.h  |   1 +
 include/uapi/linux/bpf.h |  21 +
 kernel/bpf/Makefile  |   3 +
 kernel/bpf/stackmap.c| 237 +++
 kernel/bpf/verifier.c|   6 +-
 kernel/trace/bpf_trace.c |   2 +
 6 files changed, 269 insertions(+), 1 deletion(-)
 create mode 100644 kernel/bpf/stackmap.c

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 90ee6ab24bc5..0cadbb7456c0 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -237,6 +237,7 @@ extern const struct bpf_func_proto bpf_get_current_uid_gid_proto;
 extern const struct bpf_func_proto bpf_get_current_comm_proto;
 extern const struct bpf_func_proto bpf_skb_vlan_push_proto;
 extern const struct bpf_func_proto bpf_skb_vlan_pop_proto;
+extern const struct bpf_func_proto bpf_get_stackid_proto;
 
 /* Shared helpers among cBPF and eBPF. */
 void bpf_user_rnd_init_once(void);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 2ee0fde1bf96..d3e77da8e9e8 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -83,6 +83,7 @@ enum bpf_map_type {
BPF_MAP_TYPE_PERF_EVENT_ARRAY,
BPF_MAP_TYPE_PERCPU_HASH,
BPF_MAP_TYPE_PERCPU_ARRAY,
+   BPF_MAP_TYPE_STACK_TRACE,
 };
 
 enum bpf_prog_type {
@@ -272,6 +273,20 @@ enum bpf_func_id {
 */
BPF_FUNC_perf_event_output,
BPF_FUNC_skb_load_bytes,
+
+   /**
+* bpf_get_stackid(ctx, map, flags) - walk user or kernel stack and return id
+* @ctx: struct pt_regs*
+* @map: pointer to stack_trace map
+* @flags: bits 0-7 - numer of stack frames to skip
+* bit 8 - collect user stack instead of kernel
+* bit 9 - compare stacks by hash only
+* bit 10 - if two different stacks hash into the same stackid
+*  discard old
+* other bits - reserved
+* Return: >= 0 stackid on success or negative error
+*/
+   BPF_FUNC_get_stackid,
__BPF_FUNC_MAX_ID,
 };
 
@@ -294,6 +309,12 @@ enum bpf_func_id {
 /* BPF_FUNC_skb_set_tunnel_key and BPF_FUNC_skb_get_tunnel_key flags. */
 #define BPF_F_TUNINFO_IPV6 (1ULL << 0)
 
+/* BPF_FUNC_get_stackid flags. */
+#define BPF_F_SKIP_FIELD_MASK  0xffULL
+#define BPF_F_USER_STACK   (1ULL << 8)
+#define BPF_F_FAST_STACK_CMP   (1ULL << 9)
+#define BPF_F_REUSE_STACKID(1ULL << 10)
+
 /* user accessible mirror of in-kernel sk_buff.
  * new fields can only be added to the end of this structure
  */
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index 13272582eee0..8a932d079c24 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -2,3 +2,6 @@ obj-y := core.o
 
 obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o
 obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o
+ifeq ($(CONFIG_PERF_EVENTS),y)
+obj-$(CONFIG_BPF_SYSCALL) += stackmap.o
+endif
diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
new file mode 100644
index 000000000000..8a60ee14a977
--- /dev/null
+++ b/kernel/bpf/stackmap.c
@@ -0,0 +1,237 @@
+/* Copyright (c) 2016 Facebook
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+struct stack_map_bucket {
+   struct rcu_head rcu;
+   u32 hash;
+   u32 nr;
+   u64 ip[];
+};
+
+struct bpf_stack_map {
+   struct bpf_map map;
+   u32 n_buckets;
+   struct stack_map_bucket __rcu *buckets[];
+};
+
+/* Called from syscall */
+static struct bpf_map *stack_map_alloc(union bpf_attr *attr)
+{
+   u32 value_size = attr->value_size;
+   struct bpf_stack_map *smap;
+   u64 cost, n_buckets;
+   int err;
+
+   if (!capable(CAP_SYS_ADMIN))
+   return ERR_PTR(-EPERM);
+
+   /* check sanity of attributes */
+   if (attr->max_entries == 0 || attr->key_size != 4 ||
+   value_size < 8 || value_size % 8 ||
+   value_size / 8 >

[PATCH net-next 1/3] perf: generalize perf_callchain

2016-02-17 Thread Alexei Starovoitov

. avoid walking the stack when there is no room left in the buffer
. generalize get_perf_callchain() to be called from bpf helper
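
The first point can be sketched in isolation: once the per-address callback returns an int, the walker can stop as soon as the callback reports that its buffer is full. This is a userspace model only; `trace_buf`, `store_address` and `walk_frames` are hypothetical names, and the array of frames stands in for a real stack.

```c
#include <stddef.h>

struct trace_buf {
	unsigned long entries[4];
	size_t nr;
};

/* Store one address; a negative return tells the walker to stop. */
static int store_address(void *data, unsigned long addr)
{
	struct trace_buf *buf = data;

	if (buf->nr >= sizeof(buf->entries) / sizeof(buf->entries[0]))
		return -1; /* no room left */
	buf->entries[buf->nr++] = addr;
	return 0;
}

/* Hypothetical walker: visits frames until they run out or the callback objects.
 * Returns the number of frames it actually visited.
 */
static size_t walk_frames(const unsigned long *frames, size_t n,
			  int (*address)(void *data, unsigned long addr),
			  void *data)
{
	size_t visited = 0;

	for (size_t i = 0; i < n; i++) {
		visited++;
		if (address(data, frames[i]))
			break;
	}
	return visited;
}
```

With a four-entry buffer and ten frames, the walk ends on the fifth frame instead of visiting all ten, which is the saving the first bullet describes.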

Signed-off-by: Alexei Starovoitov 
---
 arch/x86/include/asm/stacktrace.h |  2 +-
 arch/x86/kernel/cpu/perf_event.c  |  4 ++--
 arch/x86/kernel/dumpstack.c   |  6 --
 arch/x86/kernel/stacktrace.c  | 18 +++---
 arch/x86/oprofile/backtrace.c |  3 ++-
 include/linux/perf_event.h| 13 +++--
 kernel/events/callchain.c | 32 
 kernel/events/internal.h  |  2 --
 8 files changed, 51 insertions(+), 29 deletions(-)

diff --git a/arch/x86/include/asm/stacktrace.h b/arch/x86/include/asm/stacktrace.h
index 70bbe39043a9..7c247e7404be 100644
--- a/arch/x86/include/asm/stacktrace.h
+++ b/arch/x86/include/asm/stacktrace.h
@@ -37,7 +37,7 @@ print_context_stack_bp(struct thread_info *tinfo,
 /* Generic stack tracer with callbacks */
 
 struct stacktrace_ops {
-   void (*address)(void *data, unsigned long address, int reliable);
+   int (*address)(void *data, unsigned long address, int reliable);
/* On negative return stop dumping */
int (*stack)(void *data, char *name);
walk_stack_t walk_stack;
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 1b443db2db50..d276b31ca473 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -2180,11 +2180,11 @@ static int backtrace_stack(void *data, char *name)
return 0;
 }
 
-static void backtrace_address(void *data, unsigned long addr, int reliable)
+static int backtrace_address(void *data, unsigned long addr, int reliable)
 {
struct perf_callchain_entry *entry = data;
 
-   perf_callchain_store(entry, addr);
+   return perf_callchain_store(entry, addr);
 }
 
 static const struct stacktrace_ops backtrace_ops = {
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 9c30acfadae2..0d1ff4b407d4 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -135,7 +135,8 @@ print_context_stack_bp(struct thread_info *tinfo,
if (!__kernel_text_address(addr))
break;
 
-   ops->address(data, addr, 1);
+   if (ops->address(data, addr, 1))
+   break;
frame = frame->next_frame;
ret_addr = &frame->return_address;
print_ftrace_graph_addr(addr, data, ops, tinfo, graph);
@@ -154,10 +155,11 @@ static int print_trace_stack(void *data, char *name)
 /*
  * Print one address/symbol entries per line.
  */
-static void print_trace_address(void *data, unsigned long addr, int reliable)
+static int print_trace_address(void *data, unsigned long addr, int reliable)
 {
touch_nmi_watchdog();
printk_stack_address(addr, reliable, data);
+   return 0;
 }
 
 static const struct stacktrace_ops print_trace_ops = {
diff --git a/arch/x86/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c
index fdd0c6430e5a..9ee98eefc44d 100644
--- a/arch/x86/kernel/stacktrace.c
+++ b/arch/x86/kernel/stacktrace.c
@@ -14,30 +14,34 @@ static int save_stack_stack(void *data, char *name)
return 0;
 }
 
-static void
+static int
 __save_stack_address(void *data, unsigned long addr, bool reliable, bool nosched)
 {
struct stack_trace *trace = data;
 #ifdef CONFIG_FRAME_POINTER
if (!reliable)
-   return;
+   return 0;
 #endif
if (nosched && in_sched_functions(addr))
-   return;
+   return 0;
if (trace->skip > 0) {
trace->skip--;
-   return;
+   return 0;
}
-   if (trace->nr_entries < trace->max_entries)
+   if (trace->nr_entries < trace->max_entries) {
trace->entries[trace->nr_entries++] = addr;
+   return 0;
+   } else {
+   return -1; /* no more room, stop walking the stack */
+   }
 }
 
-static void save_stack_address(void *data, unsigned long addr, int reliable)
+static int save_stack_address(void *data, unsigned long addr, int reliable)
 {
return __save_stack_address(data, addr, reliable, false);
 }
 
-static void
+static int
 save_stack_address_nosched(void *data, unsigned long addr, int reliable)
 {
return __save_stack_address(data, addr, reliable, true);
diff --git a/arch/x86/oprofile/backtrace.c b/arch/x86/oprofile/backtrace.c
index 4e664bdb535a..cb31a4440e58 100644
--- a/arch/x86/oprofile/backtrace.c
+++ b/arch/x86/oprofile/backtrace.c
@@ -23,12 +23,13 @@ static int backtrace_stack(void *data, char *name)
return 0;
 }
 
-static void backtrace_address(void *data, unsigned long addr, int reliable)
+static int backtrace_address(void *data, unsigned long addr, int reliable)
 {
unsigned int *depth = data;
 
if ((*depth)--)
oprofile_add_trace(addr);
+

[PATCH next v3 0/3] IPvlan misc patches

2016-02-17 Thread Mahesh Bandewar

From: Mahesh Bandewar 

This is a collection of unrelated patches for IPvlan driver.
a. skb scrubbing is added to ensure that the packets hit the
   NF_HOOKS in masters' ns in L3 mode.
b. the u16 change is a bug fix.
c. the third patch groups tx/rx variables in a single cacheline.

Mahesh Bandewar (3):
  ipvlan: scrub skb before routing in L3 mode.
  ipvlan: mode is u16
  ipvlan: misc changes

 drivers/net/ipvlan/ipvlan.h  | 10 +++---
 drivers/net/ipvlan/ipvlan_core.c | 31 ++-
 drivers/net/ipvlan/ipvlan_main.c | 11 +++
 3 files changed, 28 insertions(+), 24 deletions(-)

-- 
2.7.0.rc3.207.g0ac5344

[PATCH next v3 3/3] ipvlan: misc perf changes

2016-02-17 Thread Mahesh Bandewar

From: Mahesh Bandewar 

1. Scope correction for a few functions that are used in a single file.
2. Adjust variables that are used in the fast path to fit into a single cacheline.
3. Update rcv_frame() to skip the shared check for frames coming over the wire.
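
Point 2 can be checked mechanically with `offsetof`. The sketch below assumes a 64-byte cache line and uses a hypothetical `sketch_port` structure, not the real `struct ipvl_port`; it only shows how one might verify that the fields read per packet sit together at the start of a structure.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHELINE 64u /* assumed line size; real hardware varies */

/* Hypothetical layout: fields read on every packet first, rarely used ones last. */
struct sketch_port {
	void *dev;            /* hot: looked up per packet */
	uint16_t mode;        /* hot: checked per packet */
	int count;            /* hot in this sketch */
	char cold_state[128]; /* cold: only touched on setup and teardown */
};

/* True if the field [off, off + size) lies entirely in the first cache line. */
static int in_first_line(size_t off, size_t size)
{
	return off + size <= SKETCH_CACHELINE;
}
```

A build-time variant of the same check could use `_Static_assert` with the same `offsetof` expressions.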

Signed-off-by: Mahesh Bandewar 
---
 drivers/net/ipvlan/ipvlan.h  |  9 +++--
 drivers/net/ipvlan/ipvlan_core.c | 27 ---
 drivers/net/ipvlan/ipvlan_main.c |  2 +-
 3 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ipvlan/ipvlan.h b/drivers/net/ipvlan/ipvlan.h
index 817cab1a7959..695a5dc9ace3 100644
--- a/drivers/net/ipvlan/ipvlan.h
+++ b/drivers/net/ipvlan/ipvlan.h
@@ -84,19 +84,19 @@ struct ipvl_addr {
 #define ip4addr ipu.ip4
struct hlist_node   hlnode;  /* Hash-table linkage */
struct list_headanode;   /* logical-interface linkage */
-   struct rcu_head rcu;
ipvl_hdr_type   atype;
+   struct rcu_head rcu;
 };
 
 struct ipvl_port {
struct net_device   *dev;
struct hlist_head   hlhead[IPVLAN_HASH_SIZE];
struct list_headipvlans;
-   struct rcu_head rcu;
+   u16 mode;
struct work_struct  wq;
struct sk_buff_head backlog;
int count;
-   u16 mode;
+   struct rcu_head rcu;
 };
 
 static inline struct ipvl_port *ipvlan_port_get_rcu(const struct net_device *d)
@@ -114,7 +114,6 @@ static inline struct ipvl_port *ipvlan_port_get_rtnl(const struct net_device *d)
return rtnl_dereference(d->rx_handler_data);
 }
 
-void ipvlan_adjust_mtu(struct ipvl_dev *ipvlan, struct net_device *dev);
 void ipvlan_init_secret(void);
 unsigned int ipvlan_mac_hash(const unsigned char *addr);
 rx_handler_result_t ipvlan_handle_frame(struct sk_buff **pskb);
@@ -124,7 +123,5 @@ void ipvlan_ht_addr_add(struct ipvl_dev *ipvlan, struct ipvl_addr *addr);
 struct ipvl_addr *ipvlan_find_addr(const struct ipvl_dev *ipvlan,
   const void *iaddr, bool is_v6);
 bool ipvlan_addr_busy(struct ipvl_port *port, void *iaddr, bool is_v6);
-struct ipvl_addr *ipvlan_ht_addr_lookup(const struct ipvl_port *port,
-   const void *iaddr, bool is_v6);
 void ipvlan_ht_addr_del(struct ipvl_addr *addr);
 #endif /* __IPVLAN_H */
diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c
index a1f87ec316cf..9864bb45b3ec 100644
--- a/drivers/net/ipvlan/ipvlan_core.c
+++ b/drivers/net/ipvlan/ipvlan_core.c
@@ -53,8 +53,8 @@ static u8 ipvlan_get_v4_hash(const void *iaddr)
   IPVLAN_HASH_MASK;
 }
 
-struct ipvl_addr *ipvlan_ht_addr_lookup(const struct ipvl_port *port,
-   const void *iaddr, bool is_v6)
+static struct ipvl_addr *ipvlan_ht_addr_lookup(const struct ipvl_port *port,
+  const void *iaddr, bool is_v6)
 {
struct ipvl_addr *addr;
u8 hash;
@@ -265,20 +265,25 @@ static int ipvlan_rcv_frame(struct ipvl_addr *addr, 
struct sk_buff **pskb,
struct sk_buff *skb = *pskb;
 
len = skb->len + ETH_HLEN;
-   if (unlikely(!(dev->flags & IFF_UP))) {
-   kfree_skb(skb);
-   goto out;
-   }
+   /* Only packets exchanged between two local slaves need to have
+* device-up check as well as skb-share check.
+*/
+   if (local) {
+   if (unlikely(!(dev->flags & IFF_UP))) {
+   kfree_skb(skb);
+   goto out;
+   }
 
-   skb = skb_share_check(skb, GFP_ATOMIC);
-   if (!skb)
-   goto out;
+   skb = skb_share_check(skb, GFP_ATOMIC);
+   if (!skb)
+   goto out;
 
-   *pskb = skb;
+   *pskb = skb;
+   }
skb->dev = dev;
-   skb->pkt_type = PACKET_HOST;
 
if (local) {
+   skb->pkt_type = PACKET_HOST;
if (dev_forward_skb(ipvlan->dev, skb) == NET_RX_SUCCESS)
success = true;
} else {
diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
index 5bcb852c5500..a7ca1c519a0d 100644
--- a/drivers/net/ipvlan/ipvlan_main.c
+++ b/drivers/net/ipvlan/ipvlan_main.c
@@ -9,7 +9,7 @@
 
 #include "ipvlan.h"
 
-void ipvlan_adjust_mtu(struct ipvl_dev *ipvlan, struct net_device *dev)
+static void ipvlan_adjust_mtu(struct ipvl_dev *ipvlan, struct net_device *dev)
 {
ipvlan->dev->mtu = dev->mtu - ipvlan->mtu_adj;
 }
-- 
2.7.0.rc3.207.g0ac5344

[PATCH next v3 2/3] ipvlan: mode is u16

2016-02-17 Thread Mahesh Bandewar

From: Mahesh Bandewar 

The mode argument was erroneously defined as u32 but it has always
been u16. Also use the ipvlan_set_port_mode() helper to set the mode
instead of assigning directly. This should avoid future erroneous
assignments / updates.

Signed-off-by: Mahesh Bandewar 
---
 drivers/net/ipvlan/ipvlan.h  | 1 -
 drivers/net/ipvlan/ipvlan_main.c | 9 ++---
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ipvlan/ipvlan.h b/drivers/net/ipvlan/ipvlan.h
index 9542b7bac61a..817cab1a7959 100644
--- a/drivers/net/ipvlan/ipvlan.h
+++ b/drivers/net/ipvlan/ipvlan.h
@@ -115,7 +115,6 @@ static inline struct ipvl_port *ipvlan_port_get_rtnl(const 
struct net_device *d)
 }
 
 void ipvlan_adjust_mtu(struct ipvl_dev *ipvlan, struct net_device *dev);
-void ipvlan_set_port_mode(struct ipvl_port *port, u32 nval);
 void ipvlan_init_secret(void);
 unsigned int ipvlan_mac_hash(const unsigned char *addr);
 rx_handler_result_t ipvlan_handle_frame(struct sk_buff **pskb);
diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
index 7a3b41468a55..5bcb852c5500 100644
--- a/drivers/net/ipvlan/ipvlan_main.c
+++ b/drivers/net/ipvlan/ipvlan_main.c
@@ -14,7 +14,7 @@ void ipvlan_adjust_mtu(struct ipvl_dev *ipvlan, struct 
net_device *dev)
ipvlan->dev->mtu = dev->mtu - ipvlan->mtu_adj;
 }
 
-void ipvlan_set_port_mode(struct ipvl_port *port, u32 nval)
+static void ipvlan_set_port_mode(struct ipvl_port *port, u16 nval)
 {
struct ipvl_dev *ipvlan;
 
@@ -442,6 +442,7 @@ static int ipvlan_link_new(struct net *src_net, struct 
net_device *dev,
struct ipvl_port *port;
struct net_device *phy_dev;
int err;
+   u16 mode = IPVLAN_MODE_L3;
 
if (!tb[IFLA_LINK])
return -EINVAL;
@@ -460,10 +461,10 @@ static int ipvlan_link_new(struct net *src_net, struct 
net_device *dev,
return err;
}
 
-   port = ipvlan_port_get_rtnl(phy_dev);
if (data && data[IFLA_IPVLAN_MODE])
-   port->mode = nla_get_u16(data[IFLA_IPVLAN_MODE]);
+   mode = nla_get_u16(data[IFLA_IPVLAN_MODE]);
 
+   port = ipvlan_port_get_rtnl(phy_dev);
ipvlan->phy_dev = phy_dev;
ipvlan->dev = dev;
ipvlan->port = port;
@@ -489,6 +490,8 @@ static int ipvlan_link_new(struct net *src_net, struct 
net_device *dev,
goto ipvlan_destroy_port;
 
list_add_tail_rcu(&ipvlan->pnode, &port->ipvlans);
+   ipvlan_set_port_mode(port, mode);
+
netif_stacked_transfer_operstate(phy_dev, dev);
return 0;
 
-- 
2.7.0.rc3.207.g0ac5344

[PATCH next v3 1/3] ipvlan: scrub skb before routing in L3 mode.

2016-02-17 Thread Mahesh Bandewar

From: Mahesh Bandewar 

Scrub skb before hitting the iptable hooks to ensure packets hit
these hooks in master's namespace.

Signed-off-by: Mahesh Bandewar 
---
 drivers/net/ipvlan/ipvlan_core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c
index 8c48bb2a94ea..a1f87ec316cf 100644
--- a/drivers/net/ipvlan/ipvlan_core.c
+++ b/drivers/net/ipvlan/ipvlan_core.c
@@ -365,7 +365,7 @@ static int ipvlan_process_v4_outbound(struct sk_buff *skb)
ip_rt_put(rt);
goto err;
}
-   skb_dst_drop(skb);
+   skb_scrub_packet(skb, true);
skb_dst_set(skb, &rt->dst);
err = ip_local_out(net, skb->sk, skb);
if (unlikely(net_xmit_eval(err)))
@@ -403,7 +403,7 @@ static int ipvlan_process_v6_outbound(struct sk_buff *skb)
dst_release(dst);
goto err;
}
-   skb_dst_drop(skb);
+   skb_scrub_packet(skb, true);
skb_dst_set(skb, dst);
err = ip6_local_out(net, skb->sk, skb);
if (unlikely(net_xmit_eval(err)))
-- 
2.7.0.rc3.207.g0ac5344

[net-next 02/15] i40e: Make the DCB firmware checks for X710/XL710 only

2016-02-17 Thread Jeff Kirsher

From: Neerav Parikh 

Make the DCB firmware version related checks specific to
X710 and XL710 only. These checks are not required for the
X722 family of devices.

Introduce an inline routine to help determine whether the
MAC type is X710/XL710.

Move the firmware version related checks into i40e_sw_init()
and define flags for the different cases.

Fix the version check to allow using "Set LLDP MIB" AQ
for FW releases beyond FVL4.
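
For readers following along, the three version thresholds in this patch can be modelled in a few lines of user-space C. This is a sketch for illustration only: struct fw_ver, fw_ver_lt(), fw_ver_ge(), demo_fw_flags() and the DEMO_FLAG_* values are made up for the example, and only the X710/XL710 branch is modelled (the RESTART_AUTONEG flag and the X722 path are left out).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits; the driver uses its own I40E_FLAG_* values. */
#define DEMO_FLAG_NO_DCB_SUPPORT    (1u << 0)
#define DEMO_FLAG_STOP_FW_LLDP      (1u << 1)
#define DEMO_FLAG_USE_SET_LLDP_MIB  (1u << 2)

struct fw_ver {
	uint16_t maj;
	uint16_t min;
};

/* True if v is strictly older than maj.min. */
static inline bool fw_ver_lt(struct fw_ver v, uint16_t maj, uint16_t min)
{
	return v.maj < maj || (v.maj == maj && v.min < min);
}

/* True if v is maj.min or newer. */
static inline bool fw_ver_ge(struct fw_ver v, uint16_t maj, uint16_t min)
{
	return !fw_ver_lt(v, maj, min);
}

/* Compute the version-dependent flags once, so later code can test a
 * flag instead of repeating the version comparison.
 */
static inline uint32_t demo_fw_flags(bool is_mac_710, struct fw_ver v)
{
	uint32_t flags = 0;

	if (!is_mac_710)
		return 0;

	if (fw_ver_lt(v, 4, 33))
		flags |= DEMO_FLAG_NO_DCB_SUPPORT;
	if (fw_ver_lt(v, 4, 3))
		flags |= DEMO_FLAG_STOP_FW_LLDP;
	if (fw_ver_ge(v, 4, 40))
		flags |= DEMO_FLAG_USE_SET_LLDP_MIB;

	return flags;
}
```

The comparisons mirror the ones in the diff below: "older than 4.33", "older than 4.3", and "4.40 or newer".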

Change-ID: Ib78288343de983aa0354fc28aa36e99b073662c0
Signed-off-by: Neerav Parikh 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e.h  | 16 
 drivers/net/ethernet/intel/i40e/i40e_main.c | 27 ---
 2 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h 
b/drivers/net/ethernet/intel/i40e/i40e.h
index 05af33e..7bfd062 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -138,6 +138,19 @@
 /* default to trying for four seconds */
 #define I40E_TRY_LINK_TIMEOUT (4 * HZ)
 
+/**
+ * i40e_is_mac_710 - Return true if MAC is X710/XL710
+ * @hw: ptr to the hardware info
+ **/
+static inline bool i40e_is_mac_710(struct i40e_hw *hw)
+{
+   if ((hw->mac.type == I40E_MAC_X710) ||
+   (hw->mac.type == I40E_MAC_XL710))
+   return true;
+
+   return false;
+}
+
 /* driver state flags */
 enum i40e_state_t {
__I40E_TESTING,
@@ -342,6 +355,9 @@ struct i40e_pf {
 #define I40E_FLAG_NO_PCI_LINK_CHECK    BIT_ULL(42)
 #define I40E_FLAG_100M_SGMII_CAPABLE   BIT_ULL(43)
 #define I40E_FLAG_RESTART_AUTONEG  BIT_ULL(44)
+#define I40E_FLAG_NO_DCB_SUPPORT   BIT_ULL(45)
+#define I40E_FLAG_USE_SET_LLDP_MIB BIT_ULL(46)
+#define I40E_FLAG_STOP_FW_LLDP BIT_ULL(47)
 #define I40E_FLAG_PF_MAC   BIT_ULL(50)
 
/* tracks features that get auto disabled by errors */
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
b/drivers/net/ethernet/intel/i40e/i40e_main.c
index e974db3..81b7895 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -5012,8 +5012,7 @@ static int i40e_init_pf_dcb(struct i40e_pf *pf)
int err = 0;
 
/* Do not enable DCB for SW1 and SW2 images even if the FW is capable */
-   if (((pf->hw.aq.fw_maj_ver == 4) && (pf->hw.aq.fw_min_ver < 33)) ||
-   (pf->hw.aq.fw_maj_ver < 4))
+   if (pf->flags & I40E_FLAG_NO_DCB_SUPPORT)
goto out;
 
/* Get the initial DCB configuration */
@@ -8425,11 +8424,25 @@ static int i40e_sw_init(struct i40e_pf *pf)
 pf->hw.func_caps.fd_filters_best_effort;
}
 
-   if (((pf->hw.mac.type == I40E_MAC_X710) ||
-(pf->hw.mac.type == I40E_MAC_XL710)) &&
+   if (i40e_is_mac_710(&pf->hw) &&
(((pf->hw.aq.fw_maj_ver == 4) && (pf->hw.aq.fw_min_ver < 33)) ||
-   (pf->hw.aq.fw_maj_ver < 4)))
+   (pf->hw.aq.fw_maj_ver < 4))) {
pf->flags |= I40E_FLAG_RESTART_AUTONEG;
+   /* No DCB support  for FW < v4.33 */
+   pf->flags |= I40E_FLAG_NO_DCB_SUPPORT;
+   }
+
+   /* Disable FW LLDP if FW < v4.3 */
+   if (i40e_is_mac_710(&pf->hw) &&
+   (((pf->hw.aq.fw_maj_ver == 4) && (pf->hw.aq.fw_min_ver < 3)) ||
+   (pf->hw.aq.fw_maj_ver < 4)))
+   pf->flags |= I40E_FLAG_STOP_FW_LLDP;
+
+   /* Use the FW Set LLDP MIB API if FW >= v4.40 */
+   if (i40e_is_mac_710(&pf->hw) &&
+   (((pf->hw.aq.fw_maj_ver == 4) && (pf->hw.aq.fw_min_ver >= 40)) ||
+   (pf->hw.aq.fw_maj_ver >= 5)))
+   pf->flags |= I40E_FLAG_USE_SET_LLDP_MIB;
 
if (pf->hw.func_caps.vmdq) {
pf->num_vmdq_vsis = I40E_DEFAULT_NUM_VMDQ_VSI;
@@ -8458,6 +8471,7 @@ static int i40e_sw_init(struct i40e_pf *pf)
 I40E_FLAG_WB_ON_ITR_CAPABLE |
 I40E_FLAG_MULTIPLE_TCP_UDP_RSS_PCTYPE |
 I40E_FLAG_100M_SGMII_CAPABLE |
+I40E_FLAG_USE_SET_LLDP_MIB |
 I40E_FLAG_GENEVE_OFFLOAD_CAPABLE;
} else if ((pf->hw.aq.api_maj_ver > 1) ||
   ((pf->hw.aq.api_maj_ver == 1) &&
@@ -10825,8 +10839,7 @@ static int i40e_probe(struct pci_dev *pdev, const 
struct pci_device_id *ent)
 * Ignore error return codes because if it was already disabled via
 * hardware settings this will fail
 */
-   if (((pf->hw.aq.fw_maj_ver == 4) && (pf->hw.aq.fw_min_ver < 3)) ||
-   (pf->hw.aq.fw_maj_ver < 4)) {
+   if (pf->flags & I40E_FLAG_STOP_FW_LLDP) {
dev_info(&pdev->dev, "Stopping firmware LLDP agent.\n");
i40e_aq_stop_lldp(hw, true, NULL);

[net-next 06/15] i40e: Refactor force_wb and WB_ON_ITR functionality code

2016-02-17 Thread Jeff Kirsher

From: Anjali Singhai Jain 

Now that the Force-WriteBack functionality in X710/XL710 devices
has been moved out of the clean routine and into the service task,
we need to make sure WriteBack-On-ITR is separated out since it
is still called from clean.

In the X722 devices, Force-WriteBack implies WriteBack-On-ITR but
without the interrupt, which put the driver into a missed
interrupt scenario and a potential tx-timeout report.

With this patch, we break the two functions out, and call the
appropriate ones at the right place. This will avoid creating missed
interrupt like scenarios for X722 devices.

Also update copyright year in file headers.

Change-ID: Iacbde39f95f332f82be8736864675052c3583a40
Signed-off-by: Anjali Singhai Jain 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c   | 57 ++--
 drivers/net/ethernet/intel/i40evf/i40e_txrx.c | 63 +++
 drivers/net/ethernet/intel/i40evf/i40e_txrx.h |  3 +-
 3 files changed, 72 insertions(+), 51 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c 
b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 6d1dd60..7dfd45e 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -1,7 +1,7 @@
 
/***
  *
  * Intel Ethernet Controller XL710 Family Linux Driver
- * Copyright(c) 2013 - 2014 Intel Corporation.
+ * Copyright(c) 2013 - 2016 Intel Corporation.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,
@@ -774,37 +774,48 @@ static bool i40e_clean_tx_irq(struct i40e_ring *tx_ring, 
int budget)
 }
 
 /**
- * i40e_force_wb - Arm hardware to do a wb on noncache aligned descriptors
+ * i40e_enable_wb_on_itr - Arm hardware to do a wb, interrupts are not enabled
  * @vsi: the VSI we care about
- * @q_vector: the vector  on which to force writeback
+ * @q_vector: the vector on which to enable writeback
  *
  **/
-void i40e_force_wb(struct i40e_vsi *vsi, struct i40e_q_vector *q_vector)
+static void i40e_enable_wb_on_itr(struct i40e_vsi *vsi,
+ struct i40e_q_vector *q_vector)
 {
u16 flags = q_vector->tx.ring[0].flags;
+   u32 val;
 
-   if (flags & I40E_TXR_FLAGS_WB_ON_ITR) {
-   u32 val;
+   if (!(flags & I40E_TXR_FLAGS_WB_ON_ITR))
+   return;
 
-   if (q_vector->arm_wb_state)
-   return;
+   if (q_vector->arm_wb_state)
+   return;
 
-   if (vsi->back->flags & I40E_FLAG_MSIX_ENABLED) {
-   val = I40E_PFINT_DYN_CTLN_WB_ON_ITR_MASK |
- I40E_PFINT_DYN_CTLN_ITR_INDX_MASK; /* set noitr */
+   if (vsi->back->flags & I40E_FLAG_MSIX_ENABLED) {
+   val = I40E_PFINT_DYN_CTLN_WB_ON_ITR_MASK |
+ I40E_PFINT_DYN_CTLN_ITR_INDX_MASK; /* set noitr */
 
-   wr32(&vsi->back->hw,
-I40E_PFINT_DYN_CTLN(q_vector->v_idx +
-vsi->base_vector - 1),
-val);
-   } else {
-   val = I40E_PFINT_DYN_CTL0_WB_ON_ITR_MASK |
- I40E_PFINT_DYN_CTL0_ITR_INDX_MASK; /* set noitr */
+   wr32(&vsi->back->hw,
+I40E_PFINT_DYN_CTLN(q_vector->v_idx + vsi->base_vector - 1),
+val);
+   } else {
+   val = I40E_PFINT_DYN_CTL0_WB_ON_ITR_MASK |
+ I40E_PFINT_DYN_CTL0_ITR_INDX_MASK; /* set noitr */
 
-   wr32(&vsi->back->hw, I40E_PFINT_DYN_CTL0, val);
-   }
-   q_vector->arm_wb_state = true;
-   } else if (vsi->back->flags & I40E_FLAG_MSIX_ENABLED) {
+   wr32(&vsi->back->hw, I40E_PFINT_DYN_CTL0, val);
+   }
+   q_vector->arm_wb_state = true;
+}
+
+/**
+ * i40e_force_wb - Issue SW Interrupt so HW does a wb
+ * @vsi: the VSI we care about
+ * @q_vector: the vector  on which to force writeback
+ *
+ **/
+void i40e_force_wb(struct i40e_vsi *vsi, struct i40e_q_vector *q_vector)
+{
+   if (vsi->back->flags & I40E_FLAG_MSIX_ENABLED) {
u32 val = I40E_PFINT_DYN_CTLN_INTENA_MASK |
  I40E_PFINT_DYN_CTLN_ITR_INDX_MASK | /* set noitr */
  I40E_PFINT_DYN_CTLN_SWINT_TRIG_MASK |
@@ -1946,7 +1957,7 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
 tx_only:
if (arm_wb) {
q_vector->tx.ring[0].tx_stats.tx_force_wb++;
-   i40e_force_wb(vsi, q_vector);
+   i40e_enable_wb_on_itr(vsi, q_vector);
}

[net-next 14/15] i40e: properly show packet split status in debugfs

2016-02-17 Thread Jeff Kirsher

From: Mitch Williams 

Get rid of the unused hsplit field in the ring struct and use the
existing macro to detect packet split enablement. This allows debugfs
dumps of the VSI to properly show which Rx routine is in use.

Change-ID: Ic4e9589e6a788ab196ed0850703f704e30c03781
Signed-off-by: Mitch Williams 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_debugfs.c | 6 +++---
 drivers/net/ethernet/intel/i40e/i40e_txrx.h| 1 -
 drivers/net/ethernet/intel/i40evf/i40e_txrx.h  | 1 -
 3 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c 
b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
index bdac691..34da53b 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
@@ -521,7 +521,7 @@ static void i40e_dbg_dump_vsi_seid(struct i40e_pf *pf, int 
seid)
 rx_ring->dtype);
dev_info(&pf->pdev->dev,
 "rx_rings[%i]: hsplit = %d, next_to_use = %d, 
next_to_clean = %d, ring_active = %i\n",
-i, rx_ring->hsplit,
+i, ring_is_ps_enabled(rx_ring),
 rx_ring->next_to_use,
 rx_ring->next_to_clean,
 rx_ring->ring_active);
@@ -572,8 +572,8 @@ static void i40e_dbg_dump_vsi_seid(struct i40e_pf *pf, int 
seid)
 "tx_rings[%i]: dtype = %d\n",
 i, tx_ring->dtype);
dev_info(&pf->pdev->dev,
-"tx_rings[%i]: hsplit = %d, next_to_use = %d, 
next_to_clean = %d, ring_active = %i\n",
-i, tx_ring->hsplit,
+"tx_rings[%i]: next_to_use = %d, next_to_clean = 
%d, ring_active = %i\n",
+i,
 tx_ring->next_to_use,
 tx_ring->next_to_clean,
 tx_ring->ring_active);
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.h 
b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
index 3b8d147..ae22c4e 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
@@ -256,7 +256,6 @@ struct i40e_ring {
 #define I40E_RX_DTYPE_NO_SPLIT  0
 #define I40E_RX_DTYPE_HEADER_SPLIT  1
 #define I40E_RX_DTYPE_SPLIT_ALWAYS  2
-   u8  hsplit;
 #define I40E_RX_SPLIT_L2  0x1
 #define I40E_RX_SPLIT_IP  0x2
 #define I40E_RX_SPLIT_TCP_UDP 0x4
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_txrx.h 
b/drivers/net/ethernet/intel/i40evf/i40e_txrx.h
index 5f03c44..5467fcdf 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_txrx.h
+++ b/drivers/net/ethernet/intel/i40evf/i40e_txrx.h
@@ -255,7 +255,6 @@ struct i40e_ring {
 #define I40E_RX_DTYPE_NO_SPLIT  0
 #define I40E_RX_DTYPE_HEADER_SPLIT  1
 #define I40E_RX_DTYPE_SPLIT_ALWAYS  2
-   u8  hsplit;
 #define I40E_RX_SPLIT_L2  0x1
 #define I40E_RX_SPLIT_IP  0x2
 #define I40E_RX_SPLIT_TCP_UDP 0x4
-- 
2.5.0

[net-next 15/15] i40e/i40evf: Bump version

2016-02-17 Thread Jeff Kirsher

From: Jesse Brandeburg 

Bump version to i40e-1.4.13 and i40evf-1.4.9

Change-ID: I9db37f9d4899141c3e5455dfb456d45465b8c035
Signed-off-by: Jesse Brandeburg 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 2 +-
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 8bc848f..8d41c6c 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -46,7 +46,7 @@ static const char i40e_driver_string[] =
 
 #define DRV_VERSION_MAJOR 1
 #define DRV_VERSION_MINOR 4
-#define DRV_VERSION_BUILD 12
+#define DRV_VERSION_BUILD 13
 #define DRV_VERSION __stringify(DRV_VERSION_MAJOR) "." \
 __stringify(DRV_VERSION_MINOR) "." \
 __stringify(DRV_VERSION_BUILD)    DRV_KERN
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c 
b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index faa1bca..1d81d57 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -38,7 +38,7 @@ static const char i40evf_driver_string[] =
 
 #define DRV_VERSION_MAJOR 1
 #define DRV_VERSION_MINOR 4
-#define DRV_VERSION_BUILD 8
+#define DRV_VERSION_BUILD 9
 #define DRV_VERSION __stringify(DRV_VERSION_MAJOR) "." \
 __stringify(DRV_VERSION_MINOR) "." \
 __stringify(DRV_VERSION_BUILD) \
-- 
2.5.0

[net-next 03/15] i40e: set shared bit for multicast filters

2016-02-17 Thread Jeff Kirsher

From: Shannon Nelson 

Add the use of the new Shared MAC filter bit for multicast and broadcast
filters in order to make better use of the filters available from the
device.  The FW folks have assured me that setting this bit on older FW
will have no effect, so we don't need a version check.
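
To make the selection rule concrete, here is a small user-space sketch of the loop this patch adds. It is an illustration only: struct demo_macvlan, demo_is_multicast(), demo_mark_shared() and the DEMO_USE_SHARED_MAC bit value are hypothetical, and the little-endian conversion done by the driver is omitted. The one fact it relies on is that an Ethernet address is a group (multicast or broadcast) address when the least significant bit of its first octet is set.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag value, used only for this example. */
#define DEMO_USE_SHARED_MAC 0x0010

/* Hypothetical, simplified filter entry. */
struct demo_macvlan {
	uint8_t  mac_addr[6];
	uint16_t flags;
};

/* Group addresses (multicast, including broadcast) have the low bit
 * of the first octet set.
 */
static inline bool demo_is_multicast(const uint8_t *addr)
{
	return (addr[0] & 0x01) != 0;
}

/* Set the shared bit on every multicast/broadcast entry in the list,
 * leaving unicast entries untouched.
 */
static inline void demo_mark_shared(struct demo_macvlan *list, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++)
		if (demo_is_multicast(list[i].mac_addr))
			list[i].flags |= DEMO_USE_SHARED_MAC;
}
```

Because broadcast (ff:ff:ff:ff:ff:ff) also has the group bit set, a single multicast test covers both cases named in the commit message.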

Also fixed a stray indent problem nearby.

Also update copyright year.

Change-ID: I4c5826a32594382a7937a592a24d228588cee7aa
Signed-off-by: Shannon Nelson 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_common.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_common.c 
b/drivers/net/ethernet/intel/i40e/i40e_common.c
index 976b03f..edfea38 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_common.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_common.c
@@ -1,7 +1,7 @@
 
/***
  *
  * Intel Ethernet Controller XL710 Family Linux Driver
- * Copyright(c) 2013 - 2015 Intel Corporation.
+ * Copyright(c) 2013 - 2016 Intel Corporation.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,
@@ -2435,6 +2435,7 @@ i40e_status i40e_aq_add_macvlan(struct i40e_hw *hw, u16 
seid,
(struct i40e_aqc_macvlan *)
i40e_status status;
u16 buf_size;
+   int i;
 
if (count == 0 || !mv_list || !hw)
return I40E_ERR_PARAM;
@@ -2448,12 +2449,17 @@ i40e_status i40e_aq_add_macvlan(struct i40e_hw *hw, u16 
seid,
cmd->seid[1] = 0;
cmd->seid[2] = 0;
 
+   for (i = 0; i < count; i++)
+   if (is_multicast_ether_addr(mv_list[i].mac_addr))
+   mv_list[i].flags |=
+  cpu_to_le16(I40E_AQC_MACVLAN_ADD_USE_SHARED_MAC);
+
desc.flags |= cpu_to_le16((u16)(I40E_AQ_FLAG_BUF | I40E_AQ_FLAG_RD));
if (buf_size > I40E_AQ_LARGE_BUF)
desc.flags |= cpu_to_le16((u16)I40E_AQ_FLAG_LB);
 
status = i40e_asq_send_command(hw, &desc, mv_list, buf_size,
-   cmd_details);
+  cmd_details);
 
return status;
 }
-- 
2.5.0

[net-next 04/15] i40e: add VEB stat control and remove L2 cloud filter

2016-02-17 Thread Jeff Kirsher

From: Shannon Nelson 

With the latest firmware, statistics gathering can now be enabled and
disabled in the HW switch, so we need to add a parameter to allow the
driver to set it as desired.  At the same time, the L2 cloud filtering
parameter has been removed as it was never used.

Older drivers working with the newer firmware and newer drivers working
with older firmware will not run into problems with these bits as the
defaults are reasonable and there is no overlap in the bit definitions.
Also, newer drivers will be forced to update because of the change in
function call parameters, a reminder that the functionality exists.
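
The inverted sense of the new bit is easy to get wrong, so here is a minimal sketch of how a caller-facing "enable" boolean maps onto a "disable" bit in the command flags. All names and bit values (demo_veb_flags(), DEMO_VEB_*) are hypothetical and chosen for the example; only the reverse logic is taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values, used only for this example. */
#define DEMO_VEB_PORT_TYPE_DEFAULT 0x0002
#define DEMO_VEB_PORT_TYPE_DATA    0x0004
#define DEMO_VEB_DISABLE_STATS     0x0010

/* Build the flags word for an "add VEB" request. The API takes
 * enable_stats, but the wire format carries a disable bit, so the
 * bit is set only when stats are NOT wanted.
 */
static inline uint16_t demo_veb_flags(bool default_port, bool enable_stats)
{
	uint16_t flags = default_port ? DEMO_VEB_PORT_TYPE_DEFAULT
				      : DEMO_VEB_PORT_TYPE_DATA;

	if (!enable_stats)
		flags |= DEMO_VEB_DISABLE_STATS;

	return flags;
}
```

With this shape, a flags word of zero in the stats position means statistics are on, which fits the commit message's remark that the defaults are reasonable.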

Also update copyright year.

Change-ID: I9acb9160b892ca3146f2f11a88fdcd86be3cadcc
Signed-off-by: Shannon Nelson 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_common.c| 11 ++-
 drivers/net/ethernet/intel/i40e/i40e_main.c  |  2 +-
 drivers/net/ethernet/intel/i40e/i40e_prototype.h |  6 +++---
 3 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_common.c 
b/drivers/net/ethernet/intel/i40e/i40e_common.c
index edfea38..354e36c 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_common.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_common.c
@@ -2308,8 +2308,8 @@ i40e_status i40e_update_link_info(struct i40e_hw *hw)
  * @downlink_seid: the VSI SEID
  * @enabled_tc: bitmap of TCs to be enabled
  * @default_port: true for default port VSI, false for control port
- * @enable_l2_filtering: true to add L2 filter table rules to regular 
forwarding rules for cloud support
  * @veb_seid: pointer to where to put the resulting VEB SEID
+ * @enable_stats: true to turn on VEB stats
  * @cmd_details: pointer to command details structure or NULL
  *
  * This asks the FW to add a VEB between the uplink and downlink
@@ -2317,8 +2317,8 @@ i40e_status i40e_update_link_info(struct i40e_hw *hw)
  **/
 i40e_status i40e_aq_add_veb(struct i40e_hw *hw, u16 uplink_seid,
u16 downlink_seid, u8 enabled_tc,
-   bool default_port, bool enable_l2_filtering,
-   u16 *veb_seid,
+   bool default_port, u16 *veb_seid,
+   bool enable_stats,
struct i40e_asq_cmd_details *cmd_details)
 {
struct i40e_aq_desc desc;
@@ -2345,8 +2345,9 @@ i40e_status i40e_aq_add_veb(struct i40e_hw *hw, u16 
uplink_seid,
else
veb_flags |= I40E_AQC_ADD_VEB_PORT_TYPE_DATA;
 
-   if (enable_l2_filtering)
-   veb_flags |= I40E_AQC_ADD_VEB_ENABLE_L2_FILTER;
+   /* reverse logic here: set the bitflag to disable the stats */
+   if (!enable_stats)
+   veb_flags |= I40E_AQC_ADD_VEB_ENABLE_DISABLE_STATS;
 
cmd->veb_flags = cpu_to_le16(veb_flags);
 
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 81b7895..95fb342 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -10075,7 +10075,7 @@ static int i40e_add_veb(struct i40e_veb *veb, struct 
i40e_vsi *vsi)
/* get a VEB from the hardware */
ret = i40e_aq_add_veb(&pf->hw, veb->uplink_seid, vsi->seid,
  veb->enabled_tc, is_default,
- is_cloud, &veb->seid, NULL);
+ &veb->seid, is_cloud, NULL);
if (ret) {
dev_info(&pf->pdev->dev,
 "couldn't add VEB, err %s aq_err %s\n",
diff --git a/drivers/net/ethernet/intel/i40e/i40e_prototype.h 
b/drivers/net/ethernet/intel/i40e/i40e_prototype.h
index 45af29b..e8deabd 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_prototype.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_prototype.h
@@ -1,7 +1,7 @@
 
/***
  *
  * Intel Ethernet Controller XL710 Family Linux Driver
- * Copyright(c) 2013 - 2015 Intel Corporation.
+ * Copyright(c) 2013 - 2016 Intel Corporation.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,
@@ -138,8 +138,8 @@ i40e_status i40e_aq_update_vsi_params(struct i40e_hw *hw,
struct i40e_asq_cmd_details *cmd_details);
 i40e_status i40e_aq_add_veb(struct i40e_hw *hw, u16 uplink_seid,
u16 downlink_seid, u8 enabled_tc,
-   bool default_port, bool enable_l2_filtering,
-   u16 *pveb_seid,
+   bool default_port, u16 *pveb_seid,
+   bool enable_stats,
struct i40e_asq_cmd_details *cmd_details);

[net-next 00/15][pull request] 40GbE Intel Wired LAN Driver Updates 2016-02-17

2016-02-17 Thread Jeff Kirsher

This series contains updates to i40e/i40evf only (again).

Jesse moves sync_vsi_filters() up in the service_task because it may need
to request a reset, and we do not want to wait another round of service
task time.  Refactored the enable_icr0() in order to allow it to be
decided by the caller whether the CLEARPBA (clear pending events) bit will
be set while re-enabling the interrupt.  Also provides the "Don't Give Up"
patch, where the driver will keep polling trying to allocate receive buffers
until it succeeds.  This should keep all receive queues running even in
the face of memory pressure.  Cleans up the debugging helpers by putting
everything in hex to be consistent.

Neerav updates the DCB firmware version related checks to be specific to
X710 and XL710 only, since the checks are not required for X722 devices.

Shannon adds the use of the new shared MAC filter bit for multicast and
broadcast filters in order to make better use of the filters available
from the device.  Added a parameter to allow the driver to set the
enable/disable of statistics gathering in the hardware switch.  Also the
L2 cloud filtering parameter is removed since it was never used.

Anjali refactors the force_wb and WB_ON_ITR functionality since
Force-WriteBack functionality in X710/XL710 devices has been moved out of
the clean routine and into the service task, so we need to make sure
WriteBack-On-ITR is separated out since it is still called from clean.

Catherine changes the VF driver string to reflect all the products that
are supported.

Mitch refactors the packet split receive code to properly use half-pages
for receives.  Also changes the use of bitwise operators to logical
operators on clean_complete variable, while making a witty reference to
Mr. Spock.  Cleans up (i.e. removes) the hsplit field in the ring
structure and uses the existing macro to detect packet split enablement,
which allows debugfs dumps of the VSI to properly show which receive
routine is in use.

The following are changes since commit 36b6f2cf7edd841c0b0eb7a5ec09c22bd6b5018c:
  Merge branch 'inet_lro-remove'
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue 40GbE

Anjali Singhai Jain (1):
  i40e: Refactor force_wb and WB_ON_ITR functionality code

Catherine Sullivan (1):
  i40evf: Change vf driver string to reflect all products i40evf
supports

Jesse Brandeburg (6):
  i40e: move sync_vsi_filters up in service_task
  i40e/i40evf: don't lose interrupts
  i40e/i40evf: try again after failure
  i40e: dump descriptor indexes in hex
  i40e/i40evf: use __GFP_NOWARN
  i40e/i40evf: Bump version

Mitch Williams (3):
  i40e/i40evf: use pages correctly in Rx
  i40e/i40evf: use logical operators, not bitwise
  i40e: properly show packet split status in debugfs

Neerav Parikh (1):
  i40e: Make the DCB firmware checks for X710/XL710 only

Shannon Nelson (3):
  i40e: set shared bit for multicast filters
  i40e: add VEB stat control and remove L2 cloud filter
  i40e: use new add_veb calling with VEB stats control

 drivers/net/ethernet/intel/i40e/i40e.h |  23 +-
 drivers/net/ethernet/intel/i40e/i40e_common.c  |  21 +-
 drivers/net/ethernet/intel/i40e/i40e_debugfs.c |  25 +-
 drivers/net/ethernet/intel/i40e/i40e_ethtool.c |  11 +-
 drivers/net/ethernet/intel/i40e/i40e_main.c|  51 +++--
 drivers/net/ethernet/intel/i40e/i40e_prototype.h   |   6 +-
 drivers/net/ethernet/intel/i40e/i40e_txrx.c| 249 +---
 drivers/net/ethernet/intel/i40e/i40e_txrx.h|   9 +-
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c |   4 +-
 drivers/net/ethernet/intel/i40evf/i40e_txrx.c  | 253 ++---
 drivers/net/ethernet/intel/i40evf/i40e_txrx.h  |  10 +-
 drivers/net/ethernet/intel/i40evf/i40evf_main.c|   6 +-
 12 files changed, 445 insertions(+), 223 deletions(-)

-- 
2.5.0

[net-next 05/15] i40e: use new add_veb calling with VEB stats control

2016-02-17 Thread Jeff Kirsher

From: Shannon Nelson 

The new parameters for add_veb allow us to enable and disable VEB stats,
so let's use them.
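
The ethtool hunk below only requests a reset when the VEB stats setting actually changes. A minimal sketch of that "toggle only on change" pattern follows; demo_set_veb_stats() and DEMO_FLAG_VEB_STATS are hypothetical names for illustration, not driver symbols.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit, used only for this example. */
#define DEMO_FLAG_VEB_STATS (1ull << 0)

/* Bring the flag in line with the requested state. Returns true only
 * if the state changed, in which case the caller would schedule a
 * reset so the new setting takes effect.
 */
static inline bool demo_set_veb_stats(uint64_t *flags, bool want)
{
	bool have = (*flags & DEMO_FLAG_VEB_STATS) != 0;

	if (want == have)
		return false;

	if (want)
		*flags |= DEMO_FLAG_VEB_STATS;
	else
		*flags &= ~DEMO_FLAG_VEB_STATS;

	return true;
}
```

Returning the "changed" result keeps repeated requests for the same state from triggering unnecessary resets.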

Update copyright year.

Change-ID: Ie6e68c68e2d1d459e42168eda661051b56bf0a65
Signed-off-by: Shannon Nelson 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_ethtool.c | 11 ---
 drivers/net/ethernet/intel/i40e/i40e_main.c|  4 ++--
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c 
b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
index 89ad2f7..230fa40 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
@@ -1,7 +1,7 @@
 
/***
  *
  * Intel Ethernet Controller XL710 Family Linux Driver
- * Copyright(c) 2013 - 2015 Intel Corporation.
+ * Copyright(c) 2013 - 2016 Intel Corporation.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,
@@ -2785,10 +2785,15 @@ static int i40e_set_priv_flags(struct net_device *dev, 
u32 flags)
pf->auto_disable_flags |= I40E_FLAG_FD_ATR_ENABLED;
}
 
-   if (flags & I40E_PRIV_FLAGS_VEB_STATS)
+   if ((flags & I40E_PRIV_FLAGS_VEB_STATS) &&
+   !(pf->flags & I40E_FLAG_VEB_STATS_ENABLED)) {
pf->flags |= I40E_FLAG_VEB_STATS_ENABLED;
-   else
+   reset_required = true;
+   } else if (!(flags & I40E_PRIV_FLAGS_VEB_STATS) &&
+  (pf->flags & I40E_FLAG_VEB_STATS_ENABLED)) {
pf->flags &= ~I40E_FLAG_VEB_STATS_ENABLED;
+   reset_required = true;
+   }
 
if ((flags & I40E_PRIV_FLAGS_HW_ATR_EVICT) &&
(pf->flags & I40E_FLAG_HW_ATR_EVICT_CAPABLE))
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 95fb342..0acec51 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -10069,13 +10069,13 @@ static int i40e_add_veb(struct i40e_veb *veb, struct 
i40e_vsi *vsi)
 {
struct i40e_pf *pf = veb->pf;
bool is_default = veb->pf->cur_promisc;
-   bool is_cloud = false;
+   bool enable_stats = !!(pf->flags & I40E_FLAG_VEB_STATS_ENABLED);
int ret;
 
/* get a VEB from the hardware */
ret = i40e_aq_add_veb(&pf->hw, veb->uplink_seid, vsi->seid,
  veb->enabled_tc, is_default,
- &veb->seid, is_cloud, NULL);
+ &veb->seid, enable_stats, NULL);
if (ret) {
dev_info(&pf->pdev->dev,
 "couldn't add VEB, err %s aq_err %s\n",
-- 
2.5.0

[net-next 07/15] i40evf: Change vf driver string to reflect all products i40evf supports

2016-02-17 Thread Jeff Kirsher

From: Catherine Sullivan 

Change the driver string to 40-10 Gigabit instead of XL710/X710 for X722
and all future products.

Also update copyright year in file header.

Change-ID: I57fae656b36dc4eb682b2b7a054f8f48f3589149
Signed-off-by: Catherine Sullivan 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c 
b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index 045cc7f..faa1bca 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -1,7 +1,7 @@
 
/***
  *
  * Intel Ethernet Controller XL710 Family Linux Virtual Function Driver
- * Copyright(c) 2013 - 2015 Intel Corporation.
+ * Copyright(c) 2013 - 2016 Intel Corporation.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,
@@ -32,7 +32,7 @@ static int i40evf_close(struct net_device *netdev);
 
 char i40evf_driver_name[] = "i40evf";
 static const char i40evf_driver_string[] =
-   "Intel(R) XL710/X710 Virtual Function Network Driver";
+   "Intel(R) 40-10 Gigabit Virtual Function Network Driver";
 
 #define DRV_KERN "-k"
 
-- 
2.5.0

[net-next 12/15] i40e/i40evf: use pages correctly in Rx

2016-02-17 Thread Jeff Kirsher

From: Mitch Williams 

Refactor the packet split Rx code to properly use half-pages for
receives. The previous code was doing way more mapping and unmapping
than it needed to, and wasn't properly using half-pages.

Increment the page use count each time we give a half-page to an skb,
knowing that the stack will probably process and release the page before
we need it again. Only free and reallocate pages if the count shows that
both half-pages are in use. Add counters to track reallocations and page
reuse.
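
For readers following along, here is a rough userspace sketch of the half-page bookkeeping described above. It is not the driver code: every model_* name is hypothetical, and the per-half flags stand in for what the kernel tracks with a single page use count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one Rx page split into two halves. */
struct model_page {
	bool half_in_use[2];	/* true while the stack still holds that half */
};

/* Hypothetical counters mirroring the two new statistics. */
struct model_rx_stats {
	unsigned long realloc_count;
	unsigned long page_reuse_count;
};

/* 1 for the ring's own reference, plus 1 per half held by the stack. */
int model_use_count(const struct model_page *page)
{
	return 1 + (page->half_in_use[0] ? 1 : 0) +
		   (page->half_in_use[1] ? 1 : 0);
}

/*
 * Pick a half-page for the next receive.
 * Returns 0 or 1 for the half handed out, or -1 when the count shows
 * both halves are busy and the caller would have to replace the page.
 */
int model_take_half(struct model_page *page, struct model_rx_stats *stats)
{
	int half;

	assert(page != NULL && stats != NULL);

	if (model_use_count(page) == 3) {
		stats->realloc_count++;
		return -1;
	}

	half = page->half_in_use[0] ? 1 : 0;
	page->half_in_use[half] = true;
	stats->page_reuse_count++;	/* page kept rather than replaced */
	return half;
}

/* The stack has finished with one half. */
void model_release_half(struct model_page *page, int half)
{
	assert(half == 0 || half == 1);
	page->half_in_use[half] = false;
}
```

As long as the stack releases a half before the ring comes back around, model_take_half() keeps handing out the same page and never needs a fresh allocation or mapping.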

Change-ID: I534b299196036b64be82b4861a0a4036310a8f22
Signed-off-by: Mitch Williams 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_debugfs.c |   5 ++
 drivers/net/ethernet/intel/i40e/i40e_txrx.c| 118 -
 drivers/net/ethernet/intel/i40e/i40e_txrx.h|   2 +
 drivers/net/ethernet/intel/i40evf/i40e_txrx.c  | 118 -
 drivers/net/ethernet/intel/i40evf/i40e_txrx.h  |   2 +
 5 files changed, 159 insertions(+), 86 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c 
b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
index fcae3c8..bdac691 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
@@ -536,6 +536,11 @@ static void i40e_dbg_dump_vsi_seid(struct i40e_pf *pf, int 
seid)
 rx_ring->rx_stats.alloc_page_failed,
 rx_ring->rx_stats.alloc_buff_failed);
dev_info(&pf->pdev->dev,
+"rx_rings[%i]: rx_stats: realloc_count = %lld, page_reuse_count = %lld\n",
+i,
+rx_ring->rx_stats.realloc_count,
+rx_ring->rx_stats.page_reuse_count);
+   dev_info(&pf->pdev->dev,
 "rx_rings[%i]: size = %i, dma = 0x%08lx\n",
 i, rx_ring->size,
 (unsigned long int)rx_ring->dma);
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c 
b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index baaf093..1abef01 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -1060,7 +1060,7 @@ void i40e_clean_rx_ring(struct i40e_ring *rx_ring)
if (rx_bi->page_dma) {
dma_unmap_page(dev,
   rx_bi->page_dma,
-  PAGE_SIZE / 2,
+  PAGE_SIZE,
   DMA_FROM_DEVICE);
rx_bi->page_dma = 0;
}
@@ -1203,6 +1203,7 @@ bool i40e_alloc_rx_buffers_ps(struct i40e_ring *rx_ring, 
u16 cleaned_count)
u16 i = rx_ring->next_to_use;
union i40e_rx_desc *rx_desc;
struct i40e_rx_buffer *bi;
+   const int current_node = numa_node_id();
 
/* do nothing if no valid netdev defined */
if (!rx_ring->netdev || !cleaned_count)
@@ -1214,39 +1215,50 @@ bool i40e_alloc_rx_buffers_ps(struct i40e_ring 
*rx_ring, u16 cleaned_count)
 
if (bi->skb) /* desc is in use */
goto no_buffers;
+
+   /* If we've been moved to a different NUMA node, release the
+* page so we can get a new one on the current node.
+*/
+   if (bi->page &&  page_to_nid(bi->page) != current_node) {
+   dma_unmap_page(rx_ring->dev,
+  bi->page_dma,
+  PAGE_SIZE,
+  DMA_FROM_DEVICE);
+   __free_page(bi->page);
+   bi->page = NULL;
+   bi->page_dma = 0;
+   rx_ring->rx_stats.realloc_count++;
+   } else if (bi->page) {
+   rx_ring->rx_stats.page_reuse_count++;
+   }
+
if (!bi->page) {
bi->page = alloc_page(GFP_ATOMIC);
if (!bi->page) {
rx_ring->rx_stats.alloc_page_failed++;
goto no_buffers;
}
-   }
-
-   if (!bi->page_dma) {
-   /* use a half page if we're re-using */
-   bi->page_offset ^= PAGE_SIZE / 2;
bi->page_dma = dma_map_page(rx_ring->dev,
bi->page,
-   bi->page_offset,
-   PAGE_SIZE / 2,
+   0,
+   PAGE_SIZE,

[net-next 09/15] i40e/i40evf: try again after failure

2016-02-17 Thread Jeff Kirsher

From: Jesse Brandeburg 

This is the "Don't Give Up" patch.  Previously the
driver could fail an allocation, and then possibly stall
a queue forever, by never coming back to continue receiving
or allocating buffers.

With this patch, the driver will keep polling trying to allocate
receive buffers until it succeeds.  This should keep all receive
queues running even in the face of memory pressure.
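
A minimal userspace sketch of the retry idea, under the assumption that the poll routine reports its full budget whenever buffer allocation failed so that it gets polled again. The model_* names and the ring layout are hypothetical, not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ring: free_buffers models how many allocations succeed. */
struct model_ring {
	int free_buffers;	/* allocations that will still succeed */
	int posted;		/* buffers handed to the imaginary hardware */
};

/* Try to post 'count' buffers. Returns true if any allocation failed. */
bool model_alloc_rx_buffers(struct model_ring *ring, int count)
{
	assert(ring != NULL);

	while (count--) {
		if (ring->free_buffers == 0)
			return true;	/* caller must come back and retry */
		ring->free_buffers--;
		ring->posted++;
	}
	return false;
}

/*
 * Poll routine: 'cleaned' descriptors were processed and need new buffers.
 * Reporting the whole budget keeps the poller scheduled after a failure.
 */
int model_poll(struct model_ring *ring, int cleaned, int budget)
{
	bool failure = model_alloc_rx_buffers(ring, cleaned);

	return failure ? budget : cleaned;
}
```

The point of the bool return is that the failure is no longer silent: the caller can see it and arrange another pass instead of leaving the queue short of buffers.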

Also update copyright year in file header.

Change-ID: I2b103d1ce95b9831288a7222c3343ffa1988b81b
Signed-off-by: Jesse Brandeburg 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c   | 51 ++-
 drivers/net/ethernet/intel/i40e/i40e_txrx.h   |  6 ++--
 drivers/net/ethernet/intel/i40evf/i40e_txrx.c | 51 ++-
 drivers/net/ethernet/intel/i40evf/i40e_txrx.h |  4 +--
 4 files changed, 89 insertions(+), 23 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c 
b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 353e5a0..8049206 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -1195,8 +1195,10 @@ static inline void i40e_release_rx_desc(struct i40e_ring 
*rx_ring, u32 val)
  * i40e_alloc_rx_buffers_ps - Replace used receive buffers; packet split
  * @rx_ring: ring to place buffers on
  * @cleaned_count: number of buffers to replace
+ *
+ * Returns true if any errors on allocation
  **/
-void i40e_alloc_rx_buffers_ps(struct i40e_ring *rx_ring, u16 cleaned_count)
+bool i40e_alloc_rx_buffers_ps(struct i40e_ring *rx_ring, u16 cleaned_count)
 {
u16 i = rx_ring->next_to_use;
union i40e_rx_desc *rx_desc;
@@ -1204,7 +1206,7 @@ void i40e_alloc_rx_buffers_ps(struct i40e_ring *rx_ring, 
u16 cleaned_count)
 
/* do nothing if no valid netdev defined */
if (!rx_ring->netdev || !cleaned_count)
-   return;
+   return false;
 
while (cleaned_count--) {
rx_desc = I40E_RX_DESC(rx_ring, i);
@@ -1251,17 +1253,29 @@ void i40e_alloc_rx_buffers_ps(struct i40e_ring 
*rx_ring, u16 cleaned_count)
i = 0;
}
 
+   if (rx_ring->next_to_use != i)
+   i40e_release_rx_desc(rx_ring, i);
+
+   return false;
+
 no_buffers:
if (rx_ring->next_to_use != i)
i40e_release_rx_desc(rx_ring, i);
+
+   /* make sure to come back via polling to try again after
+* allocation failure
+*/
+   return true;
 }
 
 /**
  * i40e_alloc_rx_buffers_1buf - Replace used receive buffers; single buffer
  * @rx_ring: ring to place buffers on
  * @cleaned_count: number of buffers to replace
+ *
+ * Returns true if any errors on allocation
  **/
-void i40e_alloc_rx_buffers_1buf(struct i40e_ring *rx_ring, u16 cleaned_count)
+bool i40e_alloc_rx_buffers_1buf(struct i40e_ring *rx_ring, u16 cleaned_count)
 {
u16 i = rx_ring->next_to_use;
union i40e_rx_desc *rx_desc;
@@ -1270,7 +1284,7 @@ void i40e_alloc_rx_buffers_1buf(struct i40e_ring 
*rx_ring, u16 cleaned_count)
 
/* do nothing if no valid netdev defined */
if (!rx_ring->netdev || !cleaned_count)
-   return;
+   return false;
 
while (cleaned_count--) {
rx_desc = I40E_RX_DESC(rx_ring, i);
@@ -1297,6 +1311,8 @@ void i40e_alloc_rx_buffers_1buf(struct i40e_ring 
*rx_ring, u16 cleaned_count)
if (dma_mapping_error(rx_ring->dev, bi->dma)) {
rx_ring->rx_stats.alloc_buff_failed++;
bi->dma = 0;
+   dev_kfree_skb(bi->skb);
+   bi->skb = NULL;
goto no_buffers;
}
}
@@ -1308,9 +1324,19 @@ void i40e_alloc_rx_buffers_1buf(struct i40e_ring 
*rx_ring, u16 cleaned_count)
i = 0;
}
 
+   if (rx_ring->next_to_use != i)
+   i40e_release_rx_desc(rx_ring, i);
+
+   return false;
+
 no_buffers:
if (rx_ring->next_to_use != i)
i40e_release_rx_desc(rx_ring, i);
+
+   /* make sure to come back via polling to try again after
+* allocation failure
+*/
+   return true;
 }
 
 /**
@@ -1494,7 +1520,7 @@ static inline void i40e_rx_hash(struct i40e_ring *ring,
  *
  * Returns true if there's any budget left (e.g. the clean is finished)
  **/
-static int i40e_clean_rx_irq_ps(struct i40e_ring *rx_ring, int budget)
+static int i40e_clean_rx_irq_ps(struct i40e_ring *rx_ring, const int budget)
 {
unsigned int total_rx_bytes = 0, total_rx_packets = 0;
u16 rx_packet_len, rx_header_len, rx_sph, rx_hbo;
@@ -1504,6 +1530,7 @@ static int i40e_clean_rx_irq_ps(struct i40e_ring 
*rx_ring, int budget)
u16 i =

[net-next 13/15] i40e/i40evf: use logical operators, not bitwise

2016-02-17 Thread Jeff Kirsher

From: Mitch Williams 

Mr. Spock would certainly raise an eyebrow to see us using bitwise
operators, when we should clearly be relying on logic. Fascinating.
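
A small standalone illustration of the difference, assuming a helper that reports success as some non-zero int rather than strictly 0 or 1. The model_* and combine_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

int model_calls;	/* how many times model_clean() actually ran */

/* Hypothetical cleanup step; "success" is any non-zero int, not only 1. */
int model_clean(int result)
{
	model_calls++;
	return result;
}

/* Bitwise form: 1 & 2 is 0, so a non-zero "success" can be lost. */
bool combine_bitwise(bool clean_complete, int result)
{
	clean_complete &= model_clean(result);
	return clean_complete;
}

/* Logical form: any non-zero value counts as true. */
bool combine_logical(bool clean_complete, int result)
{
	clean_complete = clean_complete && model_clean(result);
	return clean_complete;
}

void model_demo(void)
{
	assert(!combine_bitwise(true, 2));
	assert(combine_logical(true, 2));
}
```

One thing to keep in mind with the logical form: && short-circuits, so once clean_complete is false the right-hand call is not made at all. Whether that matters depends on whether the call has side effects that must run every time.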

Change-ID: Ie338010c016f93e9faa2002c07c90b15134b7477
Signed-off-by: Mitch Williams 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c   | 5 +++--
 drivers/net/ethernet/intel/i40evf/i40e_txrx.c | 5 +++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c 
b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 1abef01..0ffa9a8 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -1996,7 +1996,8 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
 * budget and be more aggressive about cleaning up the Tx descriptors.
 */
i40e_for_each_ring(ring, q_vector->tx) {
-   clean_complete &= i40e_clean_tx_irq(ring, vsi->work_limit);
+   clean_complete = clean_complete &&
+i40e_clean_tx_irq(ring, vsi->work_limit);
arm_wb = arm_wb || ring->arm_wb;
ring->arm_wb = false;
}
@@ -2020,7 +2021,7 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
 
work_done += cleaned;
/* if we didn't clean as many as budgeted, we must be done */
-   clean_complete &= (budget_per_ring != cleaned);
+   clean_complete = clean_complete && (budget_per_ring > cleaned);
}
 
/* If work not completed, return budget and polling will return */
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c 
b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
index 6f739a7..76bad75 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
@@ -1432,7 +1432,8 @@ int i40evf_napi_poll(struct napi_struct *napi, int budget)
 * budget and be more aggressive about cleaning up the Tx descriptors.
 */
i40e_for_each_ring(ring, q_vector->tx) {
-   clean_complete &= i40e_clean_tx_irq(ring, vsi->work_limit);
+   clean_complete = clean_complete &&
+i40e_clean_tx_irq(ring, vsi->work_limit);
arm_wb = arm_wb || ring->arm_wb;
ring->arm_wb = false;
}
@@ -1456,7 +1457,7 @@ int i40evf_napi_poll(struct napi_struct *napi, int budget)
 
work_done += cleaned;
/* if we didn't clean as many as budgeted, we must be done */
-   clean_complete &= (budget_per_ring != cleaned);
+   clean_complete = clean_complete && (budget_per_ring > cleaned);
}
 
/* If work not completed, return budget and polling will return */
-- 
2.5.0

[net-next 08/15] i40e/i40evf: don't lose interrupts

2016-02-17 Thread Jeff Kirsher

From: Jesse Brandeburg 

While re-enabling interrupts the driver would clear all pending
causes. This meant that if an interrupt was generated while the driver
was cleaning or polling with interrupts disabled, then that interrupt
was lost.  This could cause a queue to become dead, especially for
receive.  Refactor the enable_icr0 function so that the caller decides
whether the CLEARPBA (clear pending events) bit is set while
re-enabling the interrupt.
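
A rough model of what the new parameter changes. The bit positions and model_* names below are illustrative assumptions only; the real masks live in the driver's register definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit layout, not the hardware's. */
#define MODEL_INTENA_MASK	(1u << 0)
#define MODEL_CLEARPBA_MASK	(1u << 1)
#define MODEL_ITR_INDX_SHIFT	3
#define MODEL_ITR_NONE		3u

/* Hypothetical interrupt state: one pending event, one enable bit. */
struct model_irq {
	bool pending;
	bool enabled;
};

/* Build the control value; the caller decides about CLEARPBA. */
uint32_t model_build_enable_val(bool clearpba)
{
	return MODEL_INTENA_MASK |
	       (clearpba ? MODEL_CLEARPBA_MASK : 0u) |
	       (MODEL_ITR_NONE << MODEL_ITR_INDX_SHIFT);
}

/* Model of the register write: CLEARPBA drops whatever was pending. */
void model_write_ctl(struct model_irq *irq, uint32_t val)
{
	assert(irq != NULL);

	if (val & MODEL_CLEARPBA_MASK)
		irq->pending = false;
	irq->enabled = (val & MODEL_INTENA_MASK) != 0;
}
```

With clearpba false, an event that arrived while interrupts were off is still pending after the re-enable, which is the case the patch is trying to preserve.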

Also update copyright year in file headers.

Change-ID: Ic1db100a05e13c98919057696db147a258ca365a
Signed-off-by: Jesse Brandeburg 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e.h |  7 +--
 drivers/net/ethernet/intel/i40e/i40e_main.c| 11 ++-
 drivers/net/ethernet/intel/i40e/i40e_txrx.c|  6 --
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c |  4 ++--
 drivers/net/ethernet/intel/i40evf/i40e_txrx.c  |  4 +++-
 5 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h 
b/drivers/net/ethernet/intel/i40e/i40e.h
index 7bfd062..5ea431d 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -1,7 +1,7 @@
 
/***
  *
  * Intel Ethernet Controller XL710 Family Linux Driver
- * Copyright(c) 2013 - 2015 Intel Corporation.
+ * Copyright(c) 2013 - 2016 Intel Corporation.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,
@@ -767,6 +767,9 @@ static inline void i40e_irq_dynamic_enable(struct i40e_vsi 
*vsi, int vector)
struct i40e_hw *hw = &pf->hw;
u32 val;
 
+   /* definitely clear the PBA here, as this function is meant to
+* clean out all previous interrupts AND enable the interrupt
+*/
val = I40E_PFINT_DYN_CTLN_INTENA_MASK |
  I40E_PFINT_DYN_CTLN_CLEARPBA_MASK |
  (I40E_ITR_NONE << I40E_PFINT_DYN_CTLN_ITR_INDX_SHIFT);
@@ -775,7 +778,7 @@ static inline void i40e_irq_dynamic_enable(struct i40e_vsi 
*vsi, int vector)
 }
 
 void i40e_irq_dynamic_disable_icr0(struct i40e_pf *pf);
-void i40e_irq_dynamic_enable_icr0(struct i40e_pf *pf);
+void i40e_irq_dynamic_enable_icr0(struct i40e_pf *pf, bool clearpba);
 #ifdef I40E_FCOE
 struct rtnl_link_stats64 *i40e_get_netdev_stats_struct(
 struct net_device *netdev,
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 0acec51..8bc848f 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -3257,14 +3257,15 @@ void i40e_irq_dynamic_disable_icr0(struct i40e_pf *pf)
 /**
  * i40e_irq_dynamic_enable_icr0 - Enable default interrupt generation for icr0
  * @pf: board private structure
+ * @clearpba: true when all pending interrupt events should be cleared
  **/
-void i40e_irq_dynamic_enable_icr0(struct i40e_pf *pf)
+void i40e_irq_dynamic_enable_icr0(struct i40e_pf *pf, bool clearpba)
 {
struct i40e_hw *hw = &pf->hw;
u32 val;
 
val = I40E_PFINT_DYN_CTL0_INTENA_MASK   |
- I40E_PFINT_DYN_CTL0_CLEARPBA_MASK |
+ (clearpba ? I40E_PFINT_DYN_CTL0_CLEARPBA_MASK : 0) |
  (I40E_ITR_NONE << I40E_PFINT_DYN_CTL0_ITR_INDX_SHIFT);
 
wr32(hw, I40E_PFINT_DYN_CTL0, val);
@@ -3396,7 +3397,7 @@ static int i40e_vsi_enable_irq(struct i40e_vsi *vsi)
for (i = 0; i < vsi->num_q_vectors; i++)
i40e_irq_dynamic_enable(vsi, i);
} else {
-   i40e_irq_dynamic_enable_icr0(pf);
+   i40e_irq_dynamic_enable_icr0(pf, true);
}
 
i40e_flush(&pf->hw);
@@ -3542,7 +3543,7 @@ enable_intr:
wr32(hw, I40E_PFINT_ICR0_ENA, ena_mask);
if (!test_bit(__I40E_DOWN, &pf->state)) {
i40e_service_event_schedule(pf);
-   i40e_irq_dynamic_enable_icr0(pf);
+   i40e_irq_dynamic_enable_icr0(pf, false);
}
 
return ret;
@@ -7858,7 +7859,7 @@ static int i40e_setup_misc_vector(struct i40e_pf *pf)
 
i40e_flush(hw);
 
-   i40e_irq_dynamic_enable_icr0(pf);
+   i40e_irq_dynamic_enable_icr0(pf, true);
 
return err;
 }
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c 
b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 7dfd45e..353e5a0 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -1810,7 +1810,9 @@ static u32 i40e_buildreg_itr(const int type, const u16 
itr)
u32 val;
 
val = I40E_PFINT_DYN_CTLN_INTENA_MASK |
- I40E_PFINT_DYN_CTLN_CLEARPBA_MASK |
+ /*

[net-next 10/15] i40e: dump descriptor indexes in hex

2016-02-17 Thread Jeff Kirsher

From: Jesse Brandeburg 

The debugging helpers for showing descriptor rings were
dumping the indexes in decimal and the offsets in hex.

Put everything in hex and at least be consistent.
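
To make the before/after concrete, here is a standalone sketch using the same format strings as the dump helper, with snprintf standing in for dev_info. The model_* function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Old format: decimal index next to hex contents. */
int model_format_dec(char *buf, size_t len, int i,
		     unsigned long long a, unsigned long long b)
{
	assert(buf != NULL);
	return snprintf(buf, len, "   d[%03i] = 0x%016llx 0x%016llx\n",
			i, a, b);
}

/* New format: index in hex as well, matching the offsets. */
int model_format_hex(char *buf, size_t len, int i,
		     unsigned long long a, unsigned long long b)
{
	assert(buf != NULL);
	return snprintf(buf, len, "   d[%03x] = 0x%016llx 0x%016llx\n",
			(unsigned int)i, a, b);
}
```

For descriptor 26, the first helper prints the index as 026 and the second as 01a.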

Also update copyright year in file header.

Change-ID: Ia35a21411a2ddb713772dffb4e8718889fcfc895
Signed-off-by: Jesse Brandeburg 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_debugfs.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c 
b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
index 3948587..fcae3c8 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
@@ -1,7 +1,7 @@
 
/***
  *
  * Intel Ethernet Controller XL710 Family Linux Driver
- * Copyright(c) 2013 - 2014 Intel Corporation.
+ * Copyright(c) 2013 - 2016 Intel Corporation.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,
@@ -825,20 +825,20 @@ static void i40e_dbg_dump_desc(int cnt, int vsi_seid, int 
ring_id, int desc_n,
if (!is_rx_ring) {
txd = I40E_TX_DESC(ring, i);
dev_info(&pf->pdev->dev,
-"   d[%03i] = 0x%016llx 0x%016llx\n",
+"   d[%03x] = 0x%016llx 0x%016llx\n",
 i, txd->buffer_addr,
 txd->cmd_type_offset_bsz);
} else if (sizeof(union i40e_rx_desc) ==
   sizeof(union i40e_16byte_rx_desc)) {
rxd = I40E_RX_DESC(ring, i);
dev_info(&pf->pdev->dev,
-"   d[%03i] = 0x%016llx 0x%016llx\n",
+"   d[%03x] = 0x%016llx 0x%016llx\n",
 i, rxd->read.pkt_addr,
 rxd->read.hdr_addr);
} else {
rxd = I40E_RX_DESC(ring, i);
dev_info(&pf->pdev->dev,
-"   d[%03i] = 0x%016llx 0x%016llx 0x%016llx 0x%016llx\n",
+"   d[%03x] = 0x%016llx 0x%016llx 0x%016llx 0x%016llx\n",
 i, rxd->read.pkt_addr,
 rxd->read.hdr_addr,
 rxd->read.rsvd1, rxd->read.rsvd2);
@@ -853,20 +853,20 @@ static void i40e_dbg_dump_desc(int cnt, int vsi_seid, int 
ring_id, int desc_n,
if (!is_rx_ring) {
txd = I40E_TX_DESC(ring, desc_n);
dev_info(&pf->pdev->dev,
-"vsi = %02i tx ring = %02i d[%03i] = 0x%016llx 0x%016llx\n",
+"vsi = %02i tx ring = %02i d[%03x] = 0x%016llx 0x%016llx\n",
 vsi_seid, ring_id, desc_n,
 txd->buffer_addr, txd->cmd_type_offset_bsz);
} else if (sizeof(union i40e_rx_desc) ==
   sizeof(union i40e_16byte_rx_desc)) {
rxd = I40E_RX_DESC(ring, desc_n);
dev_info(&pf->pdev->dev,
-"vsi = %02i rx ring = %02i d[%03i] = 0x%016llx 0x%016llx\n",
+"vsi = %02i rx ring = %02i d[%03x] = 0x%016llx 0x%016llx\n",
 vsi_seid, ring_id, desc_n,
 rxd->read.pkt_addr, rxd->read.hdr_addr);
} else {
rxd = I40E_RX_DESC(ring, desc_n);
dev_info(&pf->pdev->dev,
-"vsi = %02i rx ring = %02i d[%03i] = 0x%016llx 0x%016llx 0x%016llx 0x%016llx\n",
+"vsi = %02i rx ring = %02i d[%03x] = 0x%016llx 0x%016llx 0x%016llx 0x%016llx\n",
 vsi_seid, ring_id, desc_n,
 rxd->read.pkt_addr, rxd->read.hdr_addr,
 rxd->read.rsvd1, rxd->read.rsvd2);
-- 
2.5.0

[net-next 11/15] i40e/i40evf: use __GFP_NOWARN

2016-02-17 Thread Jeff Kirsher

From: Jesse Brandeburg 

The i40e and i40evf drivers now cleanly handle allocation
failures and can avoid kernel log spew from the memory allocator
when allocations fail, so set __GFP_NOWARN on Rx buffer alloc.

Change-ID: Ic9e1b83c495e2a3ef6b069ba7fb6e52ce134cd23
Signed-off-by: Jesse Brandeburg 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c   | 12 
 drivers/net/ethernet/intel/i40evf/i40e_txrx.c | 12 
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c 
b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 8049206..baaf093 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -1292,8 +1292,10 @@ bool i40e_alloc_rx_buffers_1buf(struct i40e_ring 
*rx_ring, u16 cleaned_count)
skb = bi->skb;
 
if (!skb) {
-   skb = netdev_alloc_skb_ip_align(rx_ring->netdev,
-   rx_ring->rx_buf_len);
+   skb = __netdev_alloc_skb_ip_align(rx_ring->netdev,
+ rx_ring->rx_buf_len,
+ GFP_ATOMIC |
+ __GFP_NOWARN);
if (!skb) {
rx_ring->rx_stats.alloc_buff_failed++;
goto no_buffers;
@@ -1571,8 +1573,10 @@ static int i40e_clean_rx_irq_ps(struct i40e_ring 
*rx_ring, const int budget)
rx_bi = &rx_ring->rx_bi[i];
skb = rx_bi->skb;
if (likely(!skb)) {
-   skb = netdev_alloc_skb_ip_align(rx_ring->netdev,
-   rx_ring->rx_hdr_len);
+   skb = __netdev_alloc_skb_ip_align(rx_ring->netdev,
+ rx_ring->rx_hdr_len,
+ GFP_ATOMIC |
+ __GFP_NOWARN);
if (!skb) {
rx_ring->rx_stats.alloc_buff_failed++;
failure = true;
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c 
b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
index 616daae..1dbdcf8 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
@@ -764,8 +764,10 @@ bool i40evf_alloc_rx_buffers_1buf(struct i40e_ring 
*rx_ring, u16 cleaned_count)
skb = bi->skb;
 
if (!skb) {
-   skb = netdev_alloc_skb_ip_align(rx_ring->netdev,
-   rx_ring->rx_buf_len);
+   skb = __netdev_alloc_skb_ip_align(rx_ring->netdev,
+ rx_ring->rx_buf_len,
+ GFP_ATOMIC |
+ __GFP_NOWARN);
if (!skb) {
rx_ring->rx_stats.alloc_buff_failed++;
goto no_buffers;
@@ -1034,8 +1036,10 @@ static int i40e_clean_rx_irq_ps(struct i40e_ring 
*rx_ring, const int budget)
rx_bi = &rx_ring->rx_bi[i];
skb = rx_bi->skb;
if (likely(!skb)) {
-   skb = netdev_alloc_skb_ip_align(rx_ring->netdev,
-   rx_ring->rx_hdr_len);
+   skb = __netdev_alloc_skb_ip_align(rx_ring->netdev,
+ rx_ring->rx_hdr_len,
+ GFP_ATOMIC |
+ __GFP_NOWARN);
if (!skb) {
rx_ring->rx_stats.alloc_buff_failed++;
failure = true;
-- 
2.5.0

[net-next 01/15] i40e: move sync_vsi_filters up in service_task

2016-02-17 Thread Jeff Kirsher

From: Jesse Brandeburg 

The sync_vsi_filters function is moved up in the service_task because
it may need to request a reset, and we don't want to wait another round
of service task time.

NOTE: Filters will be replayed by sync_vsi_filters including broadcast
and promiscuous settings.

Also add some error handling here so that, if any of these fail,
the driver will retry correctly.
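
The retry behaviour can be sketched in userspace roughly as follows. This is an assumption-laden model: the flag value, the fail_next field and the model_* names are all hypothetical, and only the "set the changed flag again on error" step comes from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_FLAG_FILTER_CHANGED	(1u << 0)	/* hypothetical value */

/* Hypothetical VSI: fail_next models how many syncs will still fail. */
struct model_vsi {
	unsigned int flags;
	int fail_next;
};

/* One sync pass; returns 0 on success, negative on failure. */
int model_sync_filters(struct model_vsi *vsi)
{
	int retval = 0;

	assert(vsi != NULL);

	if (!(vsi->flags & MODEL_FLAG_FILTER_CHANGED))
		return 0;	/* nothing to do */

	vsi->flags &= ~MODEL_FLAG_FILTER_CHANGED;

	if (vsi->fail_next > 0) {
		vsi->fail_next--;
		retval = -1;
	}

	/* if something went wrong, set the flag again so we try again */
	if (retval)
		vsi->flags |= MODEL_FLAG_FILTER_CHANGED;

	return retval;
}
```

Because the flag is restored on failure, the next service-task pass sees pending work and repeats the sync instead of quietly dropping it.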

Also update copyright year.

Change-ID: I23f3d552100baecea69466339f738f27614efd47
Signed-off-by: Jesse Brandeburg 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 04417e6..e974db3 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -1,7 +1,7 @@
 
/***
  *
  * Intel Ethernet Controller XL710 Family Linux Driver
- * Copyright(c) 2013 - 2015 Intel Corporation.
+ * Copyright(c) 2013 - 2016 Intel Corporation.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,
@@ -2168,6 +2168,10 @@ int i40e_sync_vsi_filters(struct i40e_vsi *vsi)
}
}
 out:
+   /* if something went wrong then set the changed flag so we try again */
+   if (retval)
+   vsi->flags |= I40E_VSI_FLAG_FILTER_CHANGED;
+
clear_bit(__I40E_CONFIG_BUSY, &pf->state);
return retval;
 }
@@ -7113,6 +7117,7 @@ static void i40e_service_task(struct work_struct *work)
}
 
i40e_detect_recover_hung(pf);
+   i40e_sync_filters_subtask(pf);
i40e_reset_subtask(pf);
i40e_handle_mdd_event(pf);
i40e_vc_process_vflr_event(pf);
-- 
2.5.0

Re: [PATCH 0/3] net: thunderx: Miscellaneous fixes

2016-02-17 Thread David Miller

From: sunil.kovv...@gmail.com
Date: Tue, 16 Feb 2016 16:29:48 +0530

> This patch series fixes a couple of issues w.r.t. multiqset mode
> and receive packet statistics.

Series applied, thanks.

Re: [PATCH] tcp: correctly crypto_alloc_hash return check

2016-02-17 Thread David Miller

From: Insu Yun 
Date: Mon, 15 Feb 2016 21:30:33 -0500

> crypto_alloc_hash never returns NULL
> 
> Signed-off-by: Insu Yun 

Applied, thanks.

Re: [PATCH net-next v3 0/4] Add support for Classifier and RSS

2016-02-17 Thread David Miller

From: Iyappan Subramanian 
Date: Wed, 17 Feb 2016 15:00:38 -0800

> This patch set enables,
> 
> (i) Classifier engine that is used for parsing
> through the packet and extracting a search string that is then used
> to search a database to find associative data.
> 
> (ii) Receive Side Scaling (RSS) that does dynamic load
> balancing of the CPUs by controlling the number of messages enqueued
> per CPU through the help of a Toeplitz hash function of the 4-tuple of
> source TCP/UDP port, destination TCP/UDP port, source IPv4 address and
> destination IPv4 address.
> 
> (iii) Multi queue, to take advantage of RSS
 ...

Series applied, thanks.
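
For reference, the Toeplitz hash mentioned in (ii) of the quoted cover letter can be sketched as below. This is a generic software version, not the hardware's or the driver's implementation; the key is an arbitrary illustrative value and the model_* names are hypothetical. The input would be the 4-tuple bytes laid out in whatever order the device uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Arbitrary illustrative key; a real device is programmed with its own. */
const uint8_t model_key[16] = {
	0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
	0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
};

/*
 * For every set bit of the input (most significant bit first), XOR in
 * the 32-bit window of the key that starts at that bit position.
 * Key bits past the end of the key are treated as zero.
 */
uint32_t model_toeplitz_hash(const uint8_t *key, size_t key_len,
			     const uint8_t *data, size_t data_len)
{
	uint32_t hash = 0;
	uint32_t window;
	size_t next_key_bit = 32;	/* next key bit to slide in */
	size_t i;
	int bit;

	assert(key != NULL && key_len >= 4);

	window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
		 ((uint32_t)key[2] << 8) | (uint32_t)key[3];

	for (i = 0; i < data_len; i++) {
		for (bit = 7; bit >= 0; bit--) {
			uint32_t in = 0;
			size_t byte = next_key_bit / 8;

			if (data[i] & (1u << bit))
				hash ^= window;

			/* slide the window left, pulling in one key bit */
			if (byte < key_len)
				in = (key[byte] >> (7 - next_key_bit % 8)) & 1u;
			window = (window << 1) | in;
			next_key_bit++;
		}
	}
	return hash;
}
```

The hash is linear over XOR, so an input with a single set bit simply returns the key window at that position, which makes the function easy to sanity-check.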

Re: [net-next v2] vlan: change return type of vlan_proc_rem_dev

2016-02-17 Thread David Miller

From: Zhang Shengju 
Date: Thu, 18 Feb 2016 02:29:30 +

> Since function vlan_proc_rem_dev() will only return 0, it's better to
> return void instead of int.
> 
> Signed-off-by: Zhang Shengju 

Applied, thanks.

Re: [PATCH net v2] net: dsa: Unregister slave_dev in error path

2016-02-17 Thread David Miller

From: Florian Fainelli 
Date: Wed, 17 Feb 2016 18:43:22 -0800

> With commit 0071f56e46da ("dsa: Register netdev before phy"), we are now
> trying to free a network device that has been previously registered, and
> in case of errors, this will make us hit the
> BUG_ON(dev->reg_state != NETREG_UNREGISTERED) condition.
> 
> Fix this by adding a missing unregister_netdev() before free_netdev().
> 
> Fixes: 0071f56e46da ("dsa: Register netdev before phy")
> Signed-off-by: Florian Fainelli 

Applied, thanks Florian.

Re: [net-next PATCH] net: pack tc_cls_u32_knode struct slighter better

2016-02-17 Thread David Miller

From: John Fastabend 
Date: Wed, 17 Feb 2016 14:59:30 -0800

> By packing the structure we can remove a few holes as Jamal
> suggests.
> 
> before:
...
> after:
 ...
> Suggested-by: Jamal Hadi Salim 
> Signed-off-by: John Fastabend 

Applied.

Re: [net-next PATCH 2/2] ixgbe: fix dates on header of ixgbe_model.h

2016-02-17 Thread David Miller

From: John Fastabend 
Date: Wed, 17 Feb 2016 14:35:23 -0800

> Fixes: 9d35cf062e05 ("net: ixgbe: add minimal parser details for ixgbe")
> Reported-by: Mark Rustad 
> Signed-off-by: John Fastabend 

Applied.

Re: [net-next PATCH 1/2] ixgbe: use u32 instead of __u32 in model header

2016-02-17 Thread David Miller

From: John Fastabend 
Date: Wed, 17 Feb 2016 14:34:53 -0800

> I incorrectly used __u32 types where we should be using u32 types when
> I added the ixgbe_model.h file.
> 
> Fixes: 9d35cf062e05 ("net: ixgbe: add minimal parser details for ixgbe")
> Suggested-by: Jamal Hadi Salim 
> Signed-off-by: John Fastabend 

Applied.

[PATCH net v2] net: dsa: Unregister slave_dev in error path

2016-02-17 Thread Florian Fainelli

With commit 0071f56e46da ("dsa: Register netdev before phy"), we are now trying
to free a network device that has been previously registered, and in case of
errors, this will make us hit the BUG_ON(dev->reg_state != NETREG_UNREGISTERED)
condition.

Fix this by adding a missing unregister_netdev() before free_netdev().
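
The ordering constraint can be modelled in userspace along these lines. This is only a sketch: the model_* types and functions are hypothetical, and model_free() returns false at the point where the kernel check described above would trigger.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

enum model_reg_state {
	MODEL_UNINITIALIZED,
	MODEL_REGISTERED,
	MODEL_UNREGISTERED,
};

/* Hypothetical stand-in for a network device. */
struct model_netdev {
	enum model_reg_state reg_state;
};

struct model_netdev *model_alloc(void)
{
	return calloc(1, sizeof(struct model_netdev));
}

void model_register(struct model_netdev *dev)
{
	dev->reg_state = MODEL_REGISTERED;
}

void model_unregister(struct model_netdev *dev)
{
	dev->reg_state = MODEL_UNREGISTERED;
}

/* Refuses (returns false) to free a device that is still registered. */
bool model_free(struct model_netdev *dev)
{
	assert(dev != NULL);

	if (dev->reg_state == MODEL_REGISTERED)
		return false;

	free(dev);
	return true;
}
```

In the error path of the patch, this corresponds to calling the unregister step first so that the device is in the unregistered state by the time it is freed.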

Fixes: 0071f56e46da ("dsa: Register netdev before phy")
Signed-off-by: Florian Fainelli 
---
Changes in v2:

- fixed commit message to accurately reflect what we are doing s/unregister/free

 net/dsa/slave.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 91e3b2ff364a..ab24521beb4d 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -1204,6 +1204,7 @@ int dsa_slave_create(struct dsa_switch *ds, struct device 
*parent,
ret = dsa_slave_phy_setup(p, slave_dev);
if (ret) {
netdev_err(master, "error %d setting up slave phy\n", ret);
+   unregister_netdev(slave_dev);
free_netdev(slave_dev);
return ret;
}
-- 
2.5.0

Re: [Intel-wired-lan] [next] igb: allow setting MAC address on i211 using a device tree blob V4

2016-02-17 Thread David Miller

From: John Holland 
Date: Thu, 18 Feb 2016 00:49:17 +0100

> The Intel i211 LOM PCIe ethernet controllers' iNVM operates as an OTP
> and has no external EEPROM interface [1]. The following allows the
> driver to pick up the MAC address from a device tree blob when
> CONFIG_OF has been enabled.

Please use the generic eth_platform_get_mac_address(), or
alternatively structure your code like the ixgbe and other cases so
that SPARC and other OF platforms get this support as well.

Re: [PATCH net] net: dsa: Unregister slave_dev in error path

2016-02-17 Thread Florian Fainelli

On 17/02/2016 18:19, Florian Fainelli wrote:
> With commit 0071f56e46da ("dsa: Register netdev before phy"), we are now
> trying to unregister

s/unregister/free/...

> a network device that has been previously registered, and in case
> of errors, this will make us hit the BUG_ON(dev->reg_state !=
> NETREG_UNREGISTERED) condition.
> 
> Fix this by adding a missing unregister_netdev() before free_netdev().
> 
> Fixes: 0071f56e46da ("dsa: Register netdev before phy")
> Signed-off-by: Florian Fainelli 
> ---
>  net/dsa/slave.c |1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
> 
> diff --git a/net/dsa/slave.c b/net/dsa/slave.c
> index 40b9ca7..e685042 100644
> --- a/net/dsa/slave.c
> +++ b/net/dsa/slave.c
> @@ -1205,6 +1205,7 @@ int dsa_slave_create(struct dsa_switch *ds, struct 
> device *parent,
>   ret = dsa_slave_phy_setup(p, slave_dev);
>   if (ret) {
>   netdev_err(master, "error %d setting up slave phy\n", ret);
> + unregister_netdev(slave_dev);
>   free_netdev(slave_dev);
>   return ret;
>   }
> 


-- 
Florian

[net-next v2] vlan: change return type of vlan_proc_rem_dev

2016-02-17 Thread Zhang Shengju

Since function vlan_proc_rem_dev() will only return 0, it's better to
return void instead of int.

Signed-off-by: Zhang Shengju 
---
 net/8021q/vlanproc.c | 3 +--
 net/8021q/vlanproc.h | 4 ++--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/net/8021q/vlanproc.c b/net/8021q/vlanproc.c
index ae63cf7..5f1446c 100644
--- a/net/8021q/vlanproc.c
+++ b/net/8021q/vlanproc.c
@@ -184,12 +184,11 @@ int vlan_proc_add_dev(struct net_device *vlandev)
 /*
  * Delete directory entry for VLAN device.
  */
-int vlan_proc_rem_dev(struct net_device *vlandev)
+void vlan_proc_rem_dev(struct net_device *vlandev)
 {
/** NOTE:  This will consume the memory pointed to by dent, it seems. */
proc_remove(vlan_dev_priv(vlandev)->dent);
vlan_dev_priv(vlandev)->dent = NULL;
-   return 0;
 }
 
 /** Proc filesystem entry points /
diff --git a/net/8021q/vlanproc.h b/net/8021q/vlanproc.h
index 063f60a..8838a2e 100644
--- a/net/8021q/vlanproc.h
+++ b/net/8021q/vlanproc.h
@@ -5,7 +5,7 @@
 struct net;
 
 int vlan_proc_init(struct net *net);
-int vlan_proc_rem_dev(struct net_device *vlandev);
+void vlan_proc_rem_dev(struct net_device *vlandev);
 int vlan_proc_add_dev(struct net_device *vlandev);
 void vlan_proc_cleanup(struct net *net);
 
@@ -14,7 +14,7 @@ void vlan_proc_cleanup(struct net *net);
 #define vlan_proc_init(net)(0)
 #define vlan_proc_cleanup(net) do {} while (0)
 #define vlan_proc_add_dev(dev) ({(void)(dev), 0; })
-#define vlan_proc_rem_dev(dev) ({(void)(dev), 0; })
+#define vlan_proc_rem_dev(dev) do {} while (0)
 #endif
 
 #endif /* !(__BEN_VLAN_PROC_INC__) */
-- 
1.8.3.1

Re: [net-next] vlan: change return type of vlan_proc_rem_dev

2016-02-17 Thread 张胜举

> 
> On Wed, Feb 17, 2016 at 3:44 AM, Zhang Shengju
>  wrote:
> > Since function vlan_proc_rem_dev() will only return 0, it's better to
> > return void instead of int.
> >
> > Signed-off-by: Zhang Shengju 
> > ---
> >  net/8021q/vlanproc.c | 3 +--
> >  net/8021q/vlanproc.h | 2 +-
> >  2 files changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/net/8021q/vlanproc.c b/net/8021q/vlanproc.c index
> > ae63cf7..5f1446c 100644
> > --- a/net/8021q/vlanproc.c
> > +++ b/net/8021q/vlanproc.c
> > @@ -184,12 +184,11 @@ int vlan_proc_add_dev(struct net_device
> > *vlandev)
> >  /*
> >   * Delete directory entry for VLAN device.
> >   */
> > -int vlan_proc_rem_dev(struct net_device *vlandev)
> > +void vlan_proc_rem_dev(struct net_device *vlandev)
> >  {
> > /** NOTE:  This will consume the memory pointed to by dent, it 
> > seems.
> */
> > proc_remove(vlan_dev_priv(vlandev)->dent);
> > vlan_dev_priv(vlandev)->dent = NULL;
> > -   return 0;
> >  }
> >
> >  /** Proc filesystem entry points
> > /
> > diff --git a/net/8021q/vlanproc.h b/net/8021q/vlanproc.h index
> > 063f60a..a9d8734 100644
> > --- a/net/8021q/vlanproc.h
> > +++ b/net/8021q/vlanproc.h
> > @@ -5,7 +5,7 @@
> >  struct net;
> >
> >  int vlan_proc_init(struct net *net);
> > -int vlan_proc_rem_dev(struct net_device *vlandev);
> > +void vlan_proc_rem_dev(struct net_device *vlandev);
> >  int vlan_proc_add_dev(struct net_device *vlandev);  void
> > vlan_proc_cleanup(struct net *net);
> 
> You forgot to change the !PROC_FS case:
> 
> #define vlan_proc_rem_dev(dev)  ({(void)(dev), 0; })

Thanks, I will add the missing part.
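
The shape of the change discussed above can be sketched in userspace C. This is only an illustration: demo_dev, demo_rem_dev and DEMO_PROC_FS are hypothetical stand-ins, not kernel names. The idea is that a function which cannot fail returns void, and that its compiled-out stub becomes an empty statement instead of an expression yielding 0.

```c
#include <stddef.h>

/* Hypothetical stand-in for a device that owns a proc directory entry. */
struct demo_dev {
	void *dent;
};

#define DEMO_PROC_FS 1	/* set to 0 to select the stub, like !CONFIG_PROC_FS */

#if DEMO_PROC_FS
/* Cannot fail, so it returns void instead of a constant 0. */
static void demo_rem_dev(struct demo_dev *dev)
{
	dev->dent = NULL;	/* the real code calls proc_remove() first */
}
#else
/* Stub matching the void signature: an empty statement with no value. */
#define demo_rem_dev(dev) do {} while (0)
#endif
```

With the stub written this way, a caller that still tried to use the result would fail to compile in both configurations, which is probably the behaviour one wants after the return type change.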

Re: [net-next PATCH 1/2] ixgbe: use u32 instead of __u32 in model header

2016-02-17 Thread David Miller

From: Jeff Kirsher 
Date: Wed, 17 Feb 2016 14:41:48 -0800

> On Wed, 2016-02-17 at 14:34 -0800, John Fastabend wrote:
>> I incorrectly used __u32 types where we should be using u32 types
>> when
>> I added the ixgbe_model.h file.
>> 
>> Fixes: 9d35cf062e05 ("net: ixgbe: add minimal parser details for
>> ixgbe")
>> Suggested-by: Jamal Hadi Salim 
>> Signed-off-by: John Fastabend 
>> ---
>>  drivers/net/ethernet/intel/ixgbe/ixgbe_model.h |   18 +-
>> 
>>  1 file changed, 9 insertions(+), 9 deletions(-)
> 
> Acked-by: Jeff Kirsher 
> 
> Dave feel free to pull this series in from John to fix the issues with
> his previous series.

Will do.

[PATCH net] net: dsa: Unregister slave_dev in error path

2016-02-17 Thread Florian Fainelli

With commit 0071f56e46da ("dsa: Register netdev before phy"), the error path
now frees a network device that has already been registered, and in case
of errors, this will make us hit the BUG_ON(dev->reg_state !=
NETREG_UNREGISTERED) condition.

Fix this by adding a missing unregister_netdev() before free_netdev().

Fixes: 0071f56e46da ("dsa: Register netdev before phy")
Signed-off-by: Florian Fainelli 
---
 net/dsa/slave.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 40b9ca7..e685042 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -1205,6 +1205,7 @@ int dsa_slave_create(struct dsa_switch *ds, struct device 
*parent,
ret = dsa_slave_phy_setup(p, slave_dev);
if (ret) {
netdev_err(master, "error %d setting up slave phy\n", ret);
+   unregister_netdev(slave_dev);
free_netdev(slave_dev);
return ret;
}
-- 
1.7.1
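
The ordering rule behind this fix can be modelled with a small userspace sketch. Everything here is hypothetical (demo_netdev, demo_register, demo_unregister, demo_free); it only assumes that freeing a device which is still registered is the illegal case, and it returns -1 where the kernel would hit the BUG_ON().

```c
#include <stdlib.h>

/* Hypothetical, simplified registration states. */
enum demo_reg_state { DEMO_UNINITIALIZED, DEMO_REGISTERED, DEMO_UNREGISTERED };

struct demo_netdev {
	enum demo_reg_state reg_state;
};

static int demo_register(struct demo_netdev *dev)
{
	dev->reg_state = DEMO_REGISTERED;
	return 0;
}

static void demo_unregister(struct demo_netdev *dev)
{
	dev->reg_state = DEMO_UNREGISTERED;
}

/* Returns -1 where the kernel would BUG; 0 when the device was freed. */
static int demo_free(struct demo_netdev *dev)
{
	if (dev->reg_state == DEMO_REGISTERED)
		return -1;
	free(dev);
	return 0;
}
```

An error path written against this model has to call demo_unregister() before demo_free() once demo_register() has succeeded, which is the same ordering the one-line patch restores.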

Re: [iproute2 net-next] vrf: Add support for slave_info

2016-02-17 Thread Stephen Hemminger

On Tue,  2 Feb 2016 07:43:46 -0800
David Ahern  wrote:

> Print VRF slave_info attributes if present.
> 
> Signed-off-by: David Ahern 

Applied to net-next

Re: [PATCH iproute2] netns: Fix an off-by-one strcpy() in netns_map_add().

2016-02-17 Thread Stephen Hemminger

On Fri, 12 Feb 2016 14:47:39 +0100
Nicolas Cavallari  wrote:

> netns_map_add() does a malloc of (sizeof (struct nsid_cache) +
> strlen(name)) and then proceed with strcpy() of name into the
> zero-length member at the end of the nsid_cache structure.  The
> nul-terminator is written outside of the allocated memory and may
> overwrite the allocator's internal structure.
> 
> This can trigger a segmentation fault on i386 uclibc with names of size 8:
> after the corruption occurs, the call to closedir() on netns_map_init()
> crashes while freeing the DIR structure.
> 
> Here is the relevant valgrind output:
> 
> ==1251== Memcheck, a memory error detector
> ==1251== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
> ==1251== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright
> info
> ==1251== Command: ./ip netns
> ==1251==
> ==1251== Invalid write of size 1
> ==1251==at 0x4011975: strcpy (in
> /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
> ==1251==by 0x8058B00: netns_map_add (ipnetns.c:181)
> ==1251==by 0x8058E2A: netns_map_init (ipnetns.c:226)
> ==1251==by 0x8058E79: do_netns (ipnetns.c:776)
> ==1251==by 0x804D9FF: do_cmd (ip.c:110)
> ==1251==by 0x804D814: main (ip.c:300)

Applied, thanks.
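
For reference, the allocation pattern described in the report can be sketched as below. The thread does not show the applied fix, so this is an assumption about its general shape: reserve one extra byte for the terminator. demo_cache, demo_entry_size and demo_entry_new are hypothetical names, and a C99 flexible array member is used in place of the zero-length member mentioned above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical cache entry ending in a flexible array member. */
struct demo_cache {
	int nsid;
	char name[];
};

/* Bytes needed for an entry: header, the name, and its NUL terminator. */
static size_t demo_entry_size(const char *name)
{
	return sizeof(struct demo_cache) + strlen(name) + 1;
}

static struct demo_cache *demo_entry_new(int nsid, const char *name)
{
	struct demo_cache *c = malloc(demo_entry_size(name));

	if (!c)
		return NULL;
	c->nsid = nsid;
	strcpy(c->name, name);	/* fits: the terminator was accounted for */
	return c;
}
```

Without the "+ 1", strcpy() writes its terminator one byte past the allocation, which is the invalid write of size 1 that valgrind reports.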

Re: [PATCH iproute2 1/5] iplink: bridge_slave: export read-only values

2016-02-17 Thread Stephen Hemminger

On Tue, 16 Feb 2016 16:08:51 +0100
Nikolay Aleksandrov  wrote:

> From: Nikolay Aleksandrov 
> 
> Export all the read-only values that get returned about a bridge port
> such as the timers, the ids, designated_port and cost,
> topology_change_ack and config_pending. For the bridge ids the
> br_dump_bridge_id function is exported from iplink_bridge.
> 
> Signed-off-by: Nikolay Aleksandrov 
> ---

Applied the series.

Re: [PATCH 1/2] [iproute2] tc/q_htb.c: remove printing of a deprecated overhead value previously encoded as a part of mpu field

2016-02-17 Thread Stephen Hemminger

On Sat, 19 Dec 2015 18:25:52 +0300
Dmitrii Shcherbakov  wrote:

> Remove printing according to the previously used encoding of mpu and overhead 
> values within the tc_ratespec's mpu field. This encoding is no longer being 
> used as a separate 'overhead' field in the ratespec structure has been 
> introduced.
> 
> Signed-off-by: Dmitrii Shcherbakov 
> Acked-by: Jesper Dangaard Brouer 
> Acked-by: Phil Sutter 

Both applied.

I had to fix up the commit logs.
Many tools don't like long subject/summary lines, and it is standard practice
to wrap text in commit body at 80 characters.

linux-next: build failure after merge of the net-next tree

2016-02-17 Thread Stephen Rothwell

Hi all,

After merging the net-next tree, today's linux-next build (powerpc
ppc64_defconfig) failed like this:

In file included from drivers/net/ethernet/broadcom/bnx2x/bnx2x.h:56:0,
 from drivers/net/ethernet/broadcom/bnx2x/bnx2x_dcb.c:30:
drivers/net/ethernet/broadcom/bnx2x/bnx2x_dcb.c: In function 
'bnx2x_dcbx_get_ap_feature':
drivers/net/ethernet/broadcom/bnx2x/bnx2x_dcb.c:224:11: error: 
'DCBX_APP_SF_DEFAULT' undeclared (first use in this function)
   DCBX_APP_SF_DEFAULT) &&  
   ^
drivers/net/ethernet/broadcom/bnx2x/bnx2x_dcb.h:120:45: note: in definition of 
macro 'GET_FLAGS'
 #define GET_FLAGS(flags, bits)  ((flags) & (bits))
 ^
drivers/net/ethernet/broadcom/bnx2x/bnx2x_dcb.c:224:11: note: each undeclared 
identifier is reported only once for each function it appears in
   DCBX_APP_SF_DEFAULT) &&  
   ^
drivers/net/ethernet/broadcom/bnx2x/bnx2x_dcb.h:120:45: note: in definition of 
macro 'GET_FLAGS'
 #define GET_FLAGS(flags, bits)  ((flags) & (bits))
 ^

Caused by commit

  e5d3a51cefbb ("This adds support for default application priority.")

This build is big endian.

I have used the net-next tree from next-20160217 for today.

-- 
Cheers,
Stephen Rothwell

Re: [PATCH] mwifiex: Use to_delayed_work()

2016-02-17 Thread Julian Calaby

Hi All,

On Wed, Feb 17, 2016 at 11:33 PM, Amitoj Kaur Chawla
 wrote:
> Introduce the use of to_delayed_work() helper function instead of open
> coding it with container_of()
>
> A simplified version of the Coccinelle semantic patch used to make
> this change is:
>
> //
> @@
> expression a;
> symbol work;
> @@
> - container_of(a, struct delayed_work, work)
> + to_delayed_work(a)
> //
>
> Signed-off-by: Amitoj Kaur Chawla 

Looks right to me.

Reviewed-by: Julian Calaby 

Thanks,

> ---
>  drivers/net/wireless/marvell/mwifiex/11h.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/wireless/marvell/mwifiex/11h.c 
> b/drivers/net/wireless/marvell/mwifiex/11h.c
> index 71a1b58..81c60d0 100644
> --- a/drivers/net/wireless/marvell/mwifiex/11h.c
> +++ b/drivers/net/wireless/marvell/mwifiex/11h.c
> @@ -123,8 +123,7 @@ void mwifiex_11h_process_join(struct mwifiex_private 
> *priv, u8 **buffer,
>  void mwifiex_dfs_cac_work_queue(struct work_struct *work)
>  {
> struct cfg80211_chan_def chandef;
> -   struct delayed_work *delayed_work =
> -   container_of(work, struct delayed_work, work);
> +   struct delayed_work *delayed_work = to_delayed_work(work);
> struct mwifiex_private *priv =
> container_of(delayed_work, struct mwifiex_private,
>  dfs_cac_work);
> @@ -289,8 +288,7 @@ int mwifiex_11h_handle_radar_detected(struct 
> mwifiex_private *priv,
>  void mwifiex_dfs_chan_sw_work_queue(struct work_struct *work)
>  {
> struct mwifiex_uap_bss_param *bss_cfg;
> -   struct delayed_work *delayed_work =
> -   container_of(work, struct delayed_work, work);
> +   struct delayed_work *delayed_work = to_delayed_work(work);
> struct mwifiex_private *priv =
> container_of(delayed_work, struct mwifiex_private,
>  dfs_chan_sw_work);
> --
> 1.9.1
>



-- 
Julian Calaby

Email: julian.cal...@gmail.com
Profile: http://www.google.com/profiles/julian.calaby/
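
What the helper does can be shown with a userspace sketch. The demo_* names are hypothetical, and the struct layout is deliberately not the kernel's (an extra field is placed first so the offset is non-zero); the sketch only assumes that the helper wraps the same container_of() arithmetic that the patch removes from the callers.

```c
#include <stddef.h>

/* Userspace re-creation of the container_of() idea. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_work {
	int pending;
};

struct demo_delayed_work {
	long delay;
	struct demo_work work;	/* embedded member named 'work' */
};

/* Helper playing the role of to_delayed_work(). */
static inline struct demo_delayed_work *demo_to_delayed_work(struct demo_work *w)
{
	return demo_container_of(w, struct demo_delayed_work, work);
}

struct demo_priv {
	int id;
	struct demo_delayed_work dfs_work;
};

/* Two steps, as in the patched handlers: work -> delayed work -> private. */
static struct demo_priv *demo_priv_from_work(struct demo_work *w)
{
	struct demo_delayed_work *dw = demo_to_delayed_work(w);

	return demo_container_of(dw, struct demo_priv, dfs_work);
}
```

The second step stays an explicit container_of() in the patch as well, since only the first conversion has a dedicated helper.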

[PATCH nf 3/3] netfilter: ipvs: handle ip_vs_fill_iph_skb_off failure

2016-02-17 Thread Simon Horman

From: Arnd Bergmann 

ip_vs_fill_iph_skb_off() may not find an IP header, and gcc has
determined that ip_vs_sip_fill_param() then incorrectly accesses
the protocol fields:

net/netfilter/ipvs/ip_vs_pe_sip.c: In function 'ip_vs_sip_fill_param':
net/netfilter/ipvs/ip_vs_pe_sip.c:76:5: error: 'iph.protocol' may be used 
uninitialized in this function [-Werror=maybe-uninitialized]
  if (iph.protocol != IPPROTO_UDP)
 ^
net/netfilter/ipvs/ip_vs_pe_sip.c:81:10: error: 'iph.len' may be used 
uninitialized in this function [-Werror=maybe-uninitialized]
  dataoff = iph.len + sizeof(struct udphdr);
  ^

This adds a check for the ip_vs_fill_iph_skb_off() return code
before looking at the ip header data returned from it.

Signed-off-by: Arnd Bergmann 
Fixes: b0e010c527de ("ipvs: replace ip_vs_fill_ip4hdr with 
ip_vs_fill_iph_skb_off")
Acked-by: Julian Anastasov 
Signed-off-by: Simon Horman 
---
 net/netfilter/ipvs/ip_vs_pe_sip.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_pe_sip.c 
b/net/netfilter/ipvs/ip_vs_pe_sip.c
index 1b8d594e493a..c4e9ca016a88 100644
--- a/net/netfilter/ipvs/ip_vs_pe_sip.c
+++ b/net/netfilter/ipvs/ip_vs_pe_sip.c
@@ -70,10 +70,10 @@ ip_vs_sip_fill_param(struct ip_vs_conn_param *p, struct 
sk_buff *skb)
const char *dptr;
int retc;
 
-   ip_vs_fill_iph_skb(p->af, skb, false, &iph);
+   retc = ip_vs_fill_iph_skb(p->af, skb, false, &iph);
 
/* Only useful with UDP */
-   if (iph.protocol != IPPROTO_UDP)
+   if (!retc || iph.protocol != IPPROTO_UDP)
return -EINVAL;
/* todo: IPv6 fragments:
 *   I think this only should be done for the first fragment. /HS
-- 
2.1.4
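
The pattern in this patch, testing the fill function's return code before reading the struct it fills, can be sketched in plain C. All names are hypothetical, the two-byte "header" is invented for the example, and -1 stands in for -EINVAL.

```c
#include <stdbool.h>

#define DEMO_PROTO_UDP 17	/* IANA protocol number for UDP */

struct demo_iphdr {
	int protocol;
	int len;
};

/* Fills *out and returns true only if the buffer holds a header. */
static bool demo_fill_hdr(const unsigned char *buf, int buflen,
			  struct demo_iphdr *out)
{
	if (buflen < 2)
		return false;	/* *out is left untouched on failure */
	out->protocol = buf[0];
	out->len = buf[1];
	return true;
}

/* Returns the payload offset, or -1 (standing in for -EINVAL). */
static int demo_payload_off(const unsigned char *buf, int buflen)
{
	struct demo_iphdr iph;

	/* Test the return value first; short-circuiting keeps iph unread. */
	if (!demo_fill_hdr(buf, buflen, &iph) || iph.protocol != DEMO_PROTO_UDP)
		return -1;
	return iph.len + 8;	/* 8 = size of a UDP header */
}
```

Because the failure test comes first in the condition, the compiler can see that iph is never read uninitialized, which is what silences the maybe-uninitialized diagnostics quoted above.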

[PATCH nf 2/3] netfilter: ipvs: allow rescheduling after RST

2016-02-17 Thread Simon Horman

From: Julian Anastasov 

"RFC 5961, 4.2. Mitigation" describes a mechanism that requests the
client to confirm the restart of a TCP connection with RST
before resending its SYN. As a result, IPVS can see SYNs for an
existing connection in CLOSE state. Add a check to allow
rescheduling in this state.

Signed-off-by: Julian Anastasov 
Signed-off-by: Simon Horman 
---
 net/netfilter/ipvs/ip_vs_core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index 4da560005b0e..0c1d3fef9a7c 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -1089,6 +1089,7 @@ static inline bool is_new_conn_expected(const struct 
ip_vs_conn *cp,
switch (cp->protocol) {
case IPPROTO_TCP:
return (cp->state == IP_VS_TCP_S_TIME_WAIT) ||
+   cp->state == IP_VS_TCP_S_CLOSE ||
((conn_reuse_mode & 2) &&
 (cp->state == IP_VS_TCP_S_FIN_WAIT) &&
 (cp->flags & IP_VS_CONN_F_NOOUTPUT));
-- 
2.1.4
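
The TCP branch of the predicate after this patch can be written as a standalone function for illustration. The enum and flag below are hypothetical reductions of the IP_VS_TCP_S_* states and IP_VS_CONN_F_NOOUTPUT, not the kernel definitions.

```c
#include <stdbool.h>

/* Hypothetical reduced set of TCP states and flags. */
enum demo_tcp_state {
	DEMO_S_ESTABLISHED,
	DEMO_S_FIN_WAIT,
	DEMO_S_TIME_WAIT,
	DEMO_S_CLOSE,
};

#define DEMO_F_NOOUTPUT 0x1

/* Mirrors the TCP branch of the patched predicate. */
static bool demo_new_conn_expected(enum demo_tcp_state state,
				   unsigned int flags, int conn_reuse_mode)
{
	return state == DEMO_S_TIME_WAIT ||
	       state == DEMO_S_CLOSE ||		/* the case the patch adds */
	       ((conn_reuse_mode & 2) &&
		state == DEMO_S_FIN_WAIT &&
		(flags & DEMO_F_NOOUTPUT));
}
```

A connection in CLOSE state now qualifies regardless of the reuse-mode bits, in the same way TIME_WAIT already did.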

[GIT PULL nf 0/3] IPVS Fixes for v4.5

2016-02-17 Thread Simon Horman

Hi Pablo,

please consider these IPVS fixes for v4.5.

* Arnd Bergmann has corrected an error whereby the SIP persistence engine
  may incorrectly access protocol fields
* Julian Anastasov has corrected a problem reported by Jiri Bohac with the
  connection rescheduling mechanism added in 3.10 when new SYNs in
  connection to dead real server can be redirected to another real server.

The following changes since commit 5cc6ce9ff27565949a1001a2889a8dd9fd09e772:

  netfilter: nft_counter: fix erroneous return values (2016-02-08 13:05:02 
+0100)

are available in the git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs.git 
tags/ipvs-fixes-for-v4.5

for you to fetch changes up to 5acaf89f88b97849d550d6fbb10362e3d84b5082:

  netfilter: ipvs: handle ip_vs_fill_iph_skb_off failure (2016-02-18 09:31:48 
+0900)


Arnd Bergmann (1):
  netfilter: ipvs: handle ip_vs_fill_iph_skb_off failure

Julian Anastasov (2):
  netfilter: ipvs: drop first packet to redirect conntrack
  netfilter: ipvs: allow rescheduling after RST

 include/net/ip_vs.h   | 17 +
 net/netfilter/ipvs/ip_vs_core.c   | 38 +-
 net/netfilter/ipvs/ip_vs_pe_sip.c |  4 ++--
 3 files changed, 48 insertions(+), 11 deletions(-)

[PATCH nf 1/3] netfilter: ipvs: drop first packet to redirect conntrack

2016-02-17 Thread Simon Horman

From: Julian Anastasov 

Jiri Bohac reported a problem where the attempt
to reschedule an existing connection to another real server
needs a proper redirect for the conntrack used by the IPVS
connection. For example, when an IPVS connection is created
to a NAT-ed real server we alter the reply direction of the
conntrack. If we later decide to select a different real
server we cannot alter the conntrack again. And if we
expire the old connection, the new connection is left
without conntrack.

So, the only way to redirect both the IPVS connection and
Netfilter's conntrack is to drop the SYN packet that
hits the existing connection, to wait for the next jiffy
to expire the old connection and its conntrack, and to rely
on the client's retransmission to create a new connection as
usual.

Jiri Bohac provided a fix that drops all SYNs on rescheduling,
I extended his patch to do such drops only for connections
that use conntrack. Here is the original report from Jiri Bohac:

Since commit dc7b3eb900aa ("ipvs: Fix reuse connection if real server
is dead"), new connections to dead servers are redistributed
immediately to new servers.  The old connection is expired using
ip_vs_conn_expire_now() which sets the connection timer to expire
immediately.

However, before the timer callback, ip_vs_conn_expire(), is run
to clean the connection's conntrack entry, the new redistributed
connection may already be established and its conntrack removed
instead.

Fix this by dropping the first packet of the new connection
instead, like we do when the destination server is not available.
The timer will have deleted the old conntrack entry long before
the first packet of the new connection is retransmitted.

Fixes: dc7b3eb900aa ("ipvs: Fix reuse connection if real server is dead")
Signed-off-by: Jiri Bohac 
Signed-off-by: Julian Anastasov 
Signed-off-by: Simon Horman 
---
 include/net/ip_vs.h | 17 +
 net/netfilter/ipvs/ip_vs_core.c | 37 -
 2 files changed, 45 insertions(+), 9 deletions(-)

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 0816c872b689..a6cc576fd467 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1588,6 +1588,23 @@ static inline void ip_vs_conn_drop_conntrack(struct 
ip_vs_conn *cp)
 }
 #endif /* CONFIG_IP_VS_NFCT */
 
+/* Really using conntrack? */
+static inline bool ip_vs_conn_uses_conntrack(struct ip_vs_conn *cp,
+struct sk_buff *skb)
+{
+#ifdef CONFIG_IP_VS_NFCT
+   enum ip_conntrack_info ctinfo;
+   struct nf_conn *ct;
+
+   if (!(cp->flags & IP_VS_CONN_F_NFCT))
+   return false;
+   ct = nf_ct_get(skb, &ctinfo);
+   if (ct && !nf_ct_is_untracked(ct))
+   return true;
+#endif
+   return false;
+}
+
 static inline int
 ip_vs_dest_conn_overhead(struct ip_vs_dest *dest)
 {
diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index f57b4dcdb233..4da560005b0e 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -1757,15 +1757,34 @@ ip_vs_in(struct netns_ipvs *ipvs, unsigned int hooknum, 
struct sk_buff *skb, int
cp = pp->conn_in_get(ipvs, af, skb, &iph);
 
conn_reuse_mode = sysctl_conn_reuse_mode(ipvs);
-   if (conn_reuse_mode && !iph.fragoffs &&
-   is_new_conn(skb, &iph) && cp &&
-   ((unlikely(sysctl_expire_nodest_conn(ipvs)) && cp->dest &&
- unlikely(!atomic_read(&cp->dest->weight))) ||
-unlikely(is_new_conn_expected(cp, conn_reuse_mode)))) {
-   if (!atomic_read(&cp->n_control))
-   ip_vs_conn_expire_now(cp);
-   __ip_vs_conn_put(cp);
-   cp = NULL;
+   if (conn_reuse_mode && !iph.fragoffs && is_new_conn(skb, &iph) && cp) {
+   bool uses_ct = false, resched = false;
+
+   if (unlikely(sysctl_expire_nodest_conn(ipvs)) && cp->dest &&
+   unlikely(!atomic_read(&cp->dest->weight))) {
+   resched = true;
+   uses_ct = ip_vs_conn_uses_conntrack(cp, skb);
+   } else if (is_new_conn_expected(cp, conn_reuse_mode)) {
+   uses_ct = ip_vs_conn_uses_conntrack(cp, skb);
+   if (!atomic_read(&cp->n_control)) {
+   resched = true;
+   } else {
+   /* Do not reschedule controlling connection
+* that uses conntrack while it is still
+* referenced by controlled connection(s).
+*/
+   resched = !uses_ct;
+   }
+   }
+
+   if (resched) {
+   if (!atomic_read(&cp->n_control))
+   ip_vs_conn_expire_now(cp);
+   __ip_vs_conn_put(cp);
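
The rescheduling decision in the hunk above can be restated as a pure function. The names are hypothetical, and the drop field is an assumption taken from the commit text (only connections that use conntrack need the SYN dropped), since the remainder of the hunk is not shown here.

```c
#include <stdbool.h>

struct demo_decision {
	bool resched;	/* expire the old connection and schedule anew */
	bool drop;	/* drop this SYN and wait for the retransmission */
};

/*
 * dest_dead:    expire_nodest_conn is set and the destination has weight 0
 * new_expected: the old connection is in a state where a new SYN is expected
 * controlling:  other connections still reference this one (n_control != 0)
 * uses_ct:      the connection really uses conntrack
 */
static struct demo_decision demo_decide(bool dest_dead, bool new_expected,
					bool controlling, bool uses_ct)
{
	struct demo_decision d = { false, false };

	if (dest_dead)
		d.resched = true;
	else if (new_expected)
		d.resched = !controlling || !uses_ct;

	/* Assumption from the commit text: only conntrack users need the drop. */
	d.drop = d.resched && uses_ct;
	return d;
}
```

The one case that is never rescheduled is a controlling connection that uses conntrack while controlled connections still reference it, matching the comment in the patch.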

[PATCH nf-next 2/2] netfilter: ipvs: avoid unused variable warnings

2016-02-17 Thread Simon Horman

From: Arnd Bergmann 

The proc_create() and remove_proc_entry() functions do not reference
their arguments when CONFIG_PROC_FS is disabled, so we get a couple
of warnings about unused variables in IPVS:

ipvs/ip_vs_app.c:608:14: warning: unused variable 'net' [-Wunused-variable]
ipvs/ip_vs_ctl.c:3950:14: warning: unused variable 'net' [-Wunused-variable]
ipvs/ip_vs_ctl.c:3994:14: warning: unused variable 'net' [-Wunused-variable]

This removes the local variables and instead looks them up separately
for each use, which obviously avoids the warning.

Signed-off-by: Arnd Bergmann 
Fixes: 4c50a8ce2b63 ("netfilter: ipvs: avoid unused variable warning")
Acked-by: Julian Anastasov 
Signed-off-by: Simon Horman 
---
 net/netfilter/ipvs/ip_vs_app.c |  8 ++--
 net/netfilter/ipvs/ip_vs_ctl.c | 15 ++-
 2 files changed, 8 insertions(+), 15 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_app.c b/net/netfilter/ipvs/ip_vs_app.c
index 0328f7250693..299edc6add5a 100644
--- a/net/netfilter/ipvs/ip_vs_app.c
+++ b/net/netfilter/ipvs/ip_vs_app.c
@@ -605,17 +605,13 @@ static const struct file_operations ip_vs_app_fops = {
 
 int __net_init ip_vs_app_net_init(struct netns_ipvs *ipvs)
 {
-   struct net *net = ipvs->net;
-
INIT_LIST_HEAD(&ipvs->app_list);
-   proc_create("ip_vs_app", 0, net->proc_net, &ip_vs_app_fops);
+   proc_create("ip_vs_app", 0, ipvs->net->proc_net, &ip_vs_app_fops);
return 0;
 }
 
 void __net_exit ip_vs_app_net_cleanup(struct netns_ipvs *ipvs)
 {
-   struct net *net = ipvs->net;
-
unregister_ip_vs_app(ipvs, NULL /* all */);
-   remove_proc_entry("ip_vs_app", net->proc_net);
+   remove_proc_entry("ip_vs_app", ipvs->net->proc_net);
 }
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index daf4cb746974..404b2a4f4b5b 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -3945,7 +3945,6 @@ static struct notifier_block ip_vs_dst_notifier = {
 
 int __net_init ip_vs_control_net_init(struct netns_ipvs *ipvs)
 {
-   struct net *net = ipvs->net;
int i, idx;
 
/* Initialize rs_table */
@@ -3972,9 +3971,9 @@ int __net_init ip_vs_control_net_init(struct netns_ipvs 
*ipvs)
 
spin_lock_init(&ipvs->tot_stats.lock);
 
-   proc_create("ip_vs", 0, net->proc_net, &ip_vs_info_fops);
-   proc_create("ip_vs_stats", 0, net->proc_net, &ip_vs_stats_fops);
-   proc_create("ip_vs_stats_percpu", 0, net->proc_net,
+   proc_create("ip_vs", 0, ipvs->net->proc_net, &ip_vs_info_fops);
+   proc_create("ip_vs_stats", 0, ipvs->net->proc_net, &ip_vs_stats_fops);
+   proc_create("ip_vs_stats_percpu", 0, ipvs->net->proc_net,
&ip_vs_stats_percpu_fops);
 
if (ip_vs_control_net_init_sysctl(ipvs))
@@ -3989,13 +3988,11 @@ err:
 
 void __net_exit ip_vs_control_net_cleanup(struct netns_ipvs *ipvs)
 {
-   struct net *net = ipvs->net;
-
ip_vs_trash_cleanup(ipvs);
ip_vs_control_net_cleanup_sysctl(ipvs);
-   remove_proc_entry("ip_vs_stats_percpu", net->proc_net);
-   remove_proc_entry("ip_vs_stats", net->proc_net);
-   remove_proc_entry("ip_vs", net->proc_net);
+   remove_proc_entry("ip_vs_stats_percpu", ipvs->net->proc_net);
+   remove_proc_entry("ip_vs_stats", ipvs->net->proc_net);
+   remove_proc_entry("ip_vs", ipvs->net->proc_net);
free_percpu(ipvs->tot_stats.cpustats);
 }
 
-- 
2.1.4
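
Why the warning appears, and why looking the pointer up at each use avoids it, can be sketched as follows. DEMO_PROC_FS, demo_proc_create and the structs are hypothetical; the sketch only assumes that the compiled-out variant is a macro that discards its arguments.

```c
/* Hypothetical stand-ins; DEMO_PROC_FS plays the role of CONFIG_PROC_FS. */
struct demo_net {
	int proc_net;
};

struct demo_ipvs {
	struct demo_net *net;
	int app_count;
};

#ifdef DEMO_PROC_FS
static int demo_entries;
#define demo_proc_create(name, parent) \
	((void)(name), (void)(parent), demo_entries++)
#else
/* Stub: the arguments disappear during preprocessing. */
#define demo_proc_create(name, parent) do {} while (0)
#endif

/*
 * Had this function first stored ipvs->net in a local 'net' and passed
 * net->proc_net to the macro, the stub build would leave 'net' unused.
 * Looking the value up at the point of use avoids the extra local.
 */
static int demo_net_init(struct demo_ipvs *ipvs)
{
	ipvs->app_count = 0;
	demo_proc_create("demo", ipvs->net->proc_net);
	return 0;
}
```

The cost is a slightly longer expression at each call site, which is the trade-off the patch accepts.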

[GIT PULL nf-next 0/2] IPVS Updates for v4.6

2016-02-17 Thread Simon Horman

Hi Pablo,

please consider these cleanups for IPVS for v4.6.

* Arnd Bergmann has resolved a bunch of unused variable warnings and;
* Yannick Brosseau has removed a noisy debug message

The following changes since commit 667f00630ebefc4d73aa105c6ab254e4aec867f8:

  Merge branch 'local-checksum-offload' (2016-02-12 05:52:41 -0500)

are available in the git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-next.git 
tags/ipvs-for-v4.6

for you to fetch changes up to f6ca9f46f6615c3a87529550058d1b468c0cad89:

  netfilter: ipvs: avoid unused variable warnings (2016-02-18 09:17:58 +0900)


Arnd Bergmann (1):
  netfilter: ipvs: avoid unused variable warnings

Yannick Brosseau (1):
  netfilter: ipvs: Remove noisy debug print from ip_vs_del_service

 net/netfilter/ipvs/ip_vs_app.c |  8 ++--
 net/netfilter/ipvs/ip_vs_ctl.c | 17 ++---
 2 files changed, 8 insertions(+), 17 deletions(-)

-- 
2.1.4

[PATCH nf-next 1/2] netfilter: ipvs: Remove noisy debug print from ip_vs_del_service

2016-02-17 Thread Simon Horman

From: Yannick Brosseau 

This has been there for a long time, but does not seem to add value

Signed-off-by: Yannick Brosseau 
Signed-off-by: Simon Horman 
---
 net/netfilter/ipvs/ip_vs_ctl.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index e7c1b052c2a3..daf4cb746974 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -1376,8 +1376,6 @@ static void __ip_vs_del_service(struct ip_vs_service 
*svc, bool cleanup)
struct ip_vs_pe *old_pe;
struct netns_ipvs *ipvs = svc->ipvs;
 
-   pr_info("%s: enter\n", __func__);
-
/* Count only IPv4 services for old get/setsockopt interface */
if (svc->af == AF_INET)
ipvs->num_services--;
-- 
2.1.4

[PATCH] ipv6: Annotate change of locking mechanism for np->opt

2016-02-17 Thread Benjamin Poirier

This follows up on commit 45f6fad84cc3 ("ipv6: add complete rcu protection around
np->opt"), which added mixed rcu/refcount protection to np->opt.

Given the current implementation of rcu_pointer_handoff(), this has no
effect at runtime.

Signed-off-by: Benjamin Poirier 
---
 include/net/ipv6.h | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 6570f37..f3c9857 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -259,8 +259,12 @@ static inline struct ipv6_txoptions *txopt_get(const 
struct ipv6_pinfo *np)
 
rcu_read_lock();
opt = rcu_dereference(np->opt);
-   if (opt && !atomic_inc_not_zero(&opt->refcnt))
-   opt = NULL;
+   if (opt) {
+   if (!atomic_inc_not_zero(&opt->refcnt))
+   opt = NULL;
+   else
+   opt = rcu_pointer_handoff(opt);
+   }
rcu_read_unlock();
return opt;
 }
-- 
2.7.0
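
The refcount half of this "mixed rcu/refcount" scheme can be sketched with C11 atomics. RCU itself is left out, and demo_opts, demo_inc_not_zero and demo_opts_get are hypothetical names; the sketch only shows the "take a reference unless the count is already zero" step that the annotated code relies on.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_opts {
	atomic_int refcnt;
};

/* Take a reference unless the count already dropped to zero. */
static bool demo_inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure 'old' is reloaded and the zero test repeats. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/* Returns opt with a reference held, or NULL if it is being destroyed. */
static struct demo_opts *demo_opts_get(struct demo_opts *opt)
{
	if (opt && !demo_inc_not_zero(&opt->refcnt))
		opt = NULL;
	return opt;
}
```

Once the increment succeeds the pointer is protected by the reference rather than by the read-side critical section, which is the hand-off the patch annotates.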

[PATCH] net: phy: dp83848: Fix sysfs naming collision warning

2016-02-17 Thread Andrew F. Davis

Files in sysfs are created using the name from the phy_driver struct.
When two drivers share a name we may get a duplicate filename warning.
Fix this by giving the NS variant of the DP83848C its own name.

Reported-by: kernel test robot 
Signed-off-by: Andrew F. Davis 
---
 drivers/net/phy/dp83848.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/phy/dp83848.c b/drivers/net/phy/dp83848.c
index 556904f..03d54c4 100644
--- a/drivers/net/phy/dp83848.c
+++ b/drivers/net/phy/dp83848.c
@@ -103,7 +103,7 @@ MODULE_DEVICE_TABLE(mdio, dp83848_tbl);
 
 static struct phy_driver dp83848_driver[] = {
DP83848_PHY_DRIVER(TI_DP83848C_PHY_ID, "TI DP83848C 10/100 Mbps PHY"),
-   DP83848_PHY_DRIVER(NS_DP83848C_PHY_ID, "TI DP83848C 10/100 Mbps PHY"),
+   DP83848_PHY_DRIVER(NS_DP83848C_PHY_ID, "NS DP83848C 10/100 Mbps PHY"),
DP83848_PHY_DRIVER(TLK10X_PHY_ID, "TI TLK10X 10/100 Mbps PHY"),
 };
 module_phy_driver(dp83848_driver);
-- 
2.7.1
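
A check for this kind of collision can be sketched in a few lines of C. demo_phy_driver and demo_first_dup_name are hypothetical, and the IDs used with them would be placeholders; nothing like this helper is claimed to exist in the kernel.

```c
#include <stddef.h>
#include <string.h>

struct demo_phy_driver {
	unsigned int phy_id;
	const char *name;
};

/* Returns the index of the first entry whose name repeats an earlier one,
 * or -1 if all names are unique. */
static int demo_first_dup_name(const struct demo_phy_driver *tbl, size_t n)
{
	for (size_t i = 1; i < n; i++)
		for (size_t j = 0; j < i; j++)
			if (strcmp(tbl[i].name, tbl[j].name) == 0)
				return (int)i;
	return -1;
}
```

Run against the table before the patch, the second entry would be reported, since it reuses the "TI DP83848C 10/100 Mbps PHY" string; after the rename no entry is flagged.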

Re: [ovs-dev] [PATCH nf-next v7 6/7] openvswitch: Delay conntrack helper call for new connections.

2016-02-17 Thread Joe Stringer

On 5 February 2016 at 17:41, Jarno Rajahalme  wrote:
> There is no need to help connections that are not confirmed, so we can
> delay helping new connections to the time when they are confirmed.
> This change is needed for NAT support, and having this as a separate
> patch will make the following NAT patch a bit easier to review.
>
> Signed-off-by: Jarno Rajahalme 
> ---
>  net/openvswitch/conntrack.c | 20 +++-
>  1 file changed, 15 insertions(+), 5 deletions(-)
>
> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> index fa9ab25..fc0ef11 100644
> --- a/net/openvswitch/conntrack.c
> +++ b/net/openvswitch/conntrack.c
> @@ -464,6 +464,7 @@ static bool skb_nfct_cached(struct net *net,
>  /* Pass 'skb' through conntrack in 'net', using zone configured in 'info', if
>   * not done already.  Update key with new CT state after passing the packet
>   * through conntrack.
> + * Note that invalid packets are accepted while the skb->nfct remains unset!
>   */
>  static int __ovs_ct_lookup(struct net *net, struct sw_flow_key *key,
>const struct ovs_conntrack_info *info,

This seems unrelated to the patch.

I *think* that you're trying to say that 'if the packet is deemed
invalid by conntrack, skb->nfct will be set to NULL and 0 will be
returned.' I think that this is a documentation issue separate from
this patch (Maybe could be combined with one of the earlier fragments
in the series?).

> @@ -474,7 +475,11 @@ static int __ovs_ct_lookup(struct net *net, struct 
> sw_flow_key *key,
>  * actually run the packet through conntrack twice unless it's for a
>  * different zone.
>  */
> -   if (!skb_nfct_cached(net, key, info, skb)) {
> +   bool cached = skb_nfct_cached(net, key, info, skb);
> +   enum ip_conntrack_info ctinfo;
> +   struct nf_conn *ct;
> +
> +   if (!cached) {
> struct nf_conn *tmpl = info->ct;
> int err;
>
> @@ -497,11 +502,16 @@ static int __ovs_ct_lookup(struct net *net, struct 
> sw_flow_key *key,
> return -ENOENT;
>
> ovs_ct_update_key(skb, info, key, true);
> +   }
>
> -   if (ovs_ct_helper(skb, info->family) != NF_ACCEPT) {
> -   WARN_ONCE(1, "helper rejected packet");
> -   return -EINVAL;
> -   }
> +   /* Call the helper right after nf_conntrack_in() for confirmed
> +* connections, but only when committing for unconfirmed connections.
> +*/
> +   ct = nf_ct_get(skb, &ctinfo);
> +   if (ct && (nf_ct_is_confirmed(ct) ? !cached : info->commit)
> +   && ovs_ct_helper(skb, info->family) != NF_ACCEPT) {
> +   WARN_ONCE(1, "helper rejected packet");
> +   return -EINVAL;
> }

The comment points out what I think is the more obvious piece here:
that the helper should be executed when the connection is/will be
committed. The piece that I found less obvious was that "cached"
implies that the helper was already executed (and therefore it
shouldn't be run again). I don't know if there's a way to make this
more obvious, for example with a "helper_executed" variable or
similar. Up to you.
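
One possible shape for the "helper_executed" idea is sketched below as a pure predicate. This is only an illustration with hypothetical names; it assumes, as described above, that a cached lookup implies the helper already ran on an earlier pass.

```c
#include <stdbool.h>

/*
 * have_ct:   a conntrack entry is attached to the packet
 * confirmed: that entry is already confirmed
 * cached:    the lookup was skipped because skb->nfct was still valid
 * commit:    the action asks for the connection to be committed
 */
static bool demo_should_call_helper(bool have_ct, bool confirmed,
				    bool cached, bool commit)
{
	/* cached implies the helper already ran on an earlier pass. */
	bool helper_executed = cached;

	if (!have_ct)
		return false;
	if (confirmed)
		return !helper_executed;
	return commit;
}
```

This evaluates to the same result as the ternary in the patch, ct && (confirmed ? !cached : commit), with the intent of the cached test spelled out by the variable name.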

Re: [PATCH nf-next v7 4/7] openvswitch: Find existing conntrack entry after upcall.

2016-02-17 Thread Joe Stringer

On 5 February 2016 at 17:41, Jarno Rajahalme  wrote:

>  /* Determine whether skb->nfct is equal to the result of conntrack lookup. */
> -static bool skb_nfct_cached(const struct net *net, const struct sk_buff *skb,
> -   const struct ovs_conntrack_info *info)
> +static bool skb_nfct_cached(struct net *net,
> +   const struct sw_flow_key *key,
> +   const struct ovs_conntrack_info *info,
> +   struct sk_buff *skb)
>  {
> enum ip_conntrack_info ctinfo;
> struct nf_conn *ct;
>
> ct = nf_ct_get(skb, &ctinfo);
> +   /* If no ct, check if we have evidence that an existing conntrack 
> entry
> +* might be found for this skb.  This happens when we lose a skb->nfct
> +* due to an upcall.  If the connection was not confirmed, it is not
> +* cached and needs to be run through conntrack again. */
> +   if (!ct && key->ct.state & OVS_CS_F_TRACKED
> +   && !(key->ct.state & OVS_CS_F_INVALID)
> +   && key->ct.zone == info->zone.id)
> +   ct = ovs_ct_find_existing(net, &info->zone, info->family, skb,
> + &ctinfo);
> if (!ct)
> return false;

Logically I think that this makes more sense residing within
__ovs_ct_lookup() and actually populating skb->{nfct,nfctinfo} prior
to making this call to skb_nfct_cached() which answers the question
"Is skb->nfct the same as if I did a lookup?". Maybe a better name for
this function is something like ovs_ct_cmp() as it compares the skb's
nfct against the OVS structures.

The call to nf_ct_get() could move out into __ovs_ct_lookup(), then
just pass the 'ct' into here. I see that a later patch already adds
another call to nf_ct_get() into __ovs_ct_lookup(), which should only
be necessary in the case where it is not already cached.
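
If the lookup of an existing entry does move into __ovs_ct_lookup(), the condition guarding it could be factored out roughly as below. The flag values and names are hypothetical placeholders for the OVS_CS_F_* constants and the key and info fields; this is a sketch of the test in the quoted hunk, not a proposed implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values; the real OVS_CS_F_* constants live in the
 * openvswitch uapi header and are not reproduced here. */
#define DEMO_CS_F_INVALID 0x10u
#define DEMO_CS_F_TRACKED 0x20u

/* True when a packet without an attached entry still carries evidence,
 * in its flow key, that an existing entry might be found for it. */
static bool demo_may_have_existing(bool have_ct, uint32_t key_state,
				   uint16_t key_zone, uint16_t info_zone)
{
	return !have_ct &&
	       (key_state & DEMO_CS_F_TRACKED) &&
	       !(key_state & DEMO_CS_F_INVALID) &&
	       key_zone == info_zone;
}
```

Keeping the test separate would leave the comparison function with a single job, which is the point of the renaming suggestion above.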

Re: [PATCH net-next 2/2] phy: marvell: Add support for phy packet generator

2016-02-17 Thread Florian Fainelli

On 17/02/2016 12:32, Andrew Lunn wrote:
> +static int marvell_pkt_gen(struct phy_device *phydev,
> +struct ethtool_phy_pkt_gen *pkt_gen)
> +{
> + int err, oldpage, reg, max_loop = 100;
> + u32 phy_id = phydev->drv->phy_id;
> + bool has_ipg = false;

[snip]


> +
> + do {
> + usleep_range(3000, 4000);
> + reg = phy_read(phydev, MII_88E1540_PKT_GEN);
> + } while (max_loop-- && (reg & MII_88E1540_PKT_GEN_ENABLE));
> +
> + if (!max_loop)
> + err = -ETIMEDOUT;

If I am a HW engineer trying to qualify a PHY after enabling the random
packet generator built into it, I might prefer a "start generation" and
"stop generation" (this echoes back to Ben's comments on the ethtool
API) as opposed to calling the same function multiple times, because the
duration will vary based on link speed, and potentially models of PHYs too.
--
Florian
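
Separately from the API question, the quoted wait loop may be worth a second look: max_loop is post-decremented in the loop condition, so it ends at -1 rather than 0 when every poll still sees the enable bit, and the "if (!max_loop)" test would not fire in that case. A bounded poll that reports its timeout unambiguously could look like the sketch below. demo_reg and demo_read_busy are a hypothetical register model, -1 stands in for -ETIMEDOUT, and a real driver would sleep between polls.

```c
#include <stdbool.h>

/* Hypothetical register model: reads as busy for the first busy_reads polls. */
struct demo_reg {
	int busy_reads;
};

static bool demo_read_busy(struct demo_reg *r)
{
	if (r->busy_reads > 0) {
		r->busy_reads--;
		return true;
	}
	return false;
}

/* Polls at most max_polls times; returns 0 when idle, -1 on timeout
 * (-1 stands in for -ETIMEDOUT). */
static int demo_wait_idle(struct demo_reg *r, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (!demo_read_busy(r))
			return 0;
	}
	return -1;
}
```

Returning from inside the loop avoids having to infer the outcome from the counter's final value.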

[Intel-wired-lan] [next] igb: allow setting MAC address on i211 using a device tree blob V4

2016-02-17 Thread John Holland


Hello,

The Intel i211 LOM pcie ethernet controllers' iNVM operates as an OTP
and has no external EEPROM interface [1]. The following allows the
driver to pickup the MAC address from a device tree blob when CONFIG_OF
has been enabled.

[1]http://www.intel.com/content/www/us/en/embedded/products/networking/i211-ethernet-controller-datasheet.html

Changes V2
- Restrict searching for compatible devices to current pci device.

Changes V3
- Add device tree binding documentation.

Changes V4
- Rebase patch.

Signed-off-by: John Holland
---

 Documentation/devicetree/bindings/net/intel,i210.txt | 36 
++
 drivers/net/ethernet/intel/igb/igb_main.c| 31 
+++

 2 files changed, 67 insertions(+)

diff --git a/Documentation/devicetree/bindings/net/intel,i210.txt 
b/Documentation/devicetree/bindings/net/intel,i210.txt

new file mode 100644
index 000..d6ac8d3
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/intel,i210.txt
@@ -0,0 +1,36 @@
+* Intel I210, I211 PCIe bus controller
+
+Required properties:
+- compatible: must be "intel,i210" as described in
+  Documentation/devicetree/bindings/net/phy.txt;
+
+Optional properties:
+- local-mac-address: as described in
+  Documentation/devicetree/bindings/net/ethernet.txt;
+- mac-address: as described in
+  Documentation/devicetree/bindings/net/ethernet.txt;
+
+Child nodes of this PCIe bus controller node are a subset
+of the standard Ethernet PHY device nodes.
+
+Example:
+
+/*
+ * Set a valid MAC address from the u-boot environment variable eth1addr.
+ * The resulting value may be viewed under
+ * /firmware/devicetree/base/soc/pcie@0x0100/i211@bus1/
+ */
+
+#include "imx6q.dtsi";
+
+/ {
+   aliases {
+   ethernet1 = 
+   };
+};
+
+ {
+   eth1: i211@bus1 {
+   compatible = "intel,i210";
+   };
+};
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index a98f418..a3203ec 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -57,6 +57,11 @@
 #include "igb.h"
 #include "igb_cdev.h"

+#ifdef CONFIG_OF
+#include 
+#include 
+#endif
+
 #define MAJ 5
 #define MIN 3
 #define BUILD 0
@@ -2299,6 +2304,27 @@ static s32 igb_init_i2c(struct igb_adapter *adapter)
return status;
 }

+#ifdef CONFIG_OF
+/**
+ *  igb_read_mac_addr_dts - Read mac address from the device tree blob
+ *  @dev: pointer to device structure
+ *  @mac_addr: pointer to found mac address
+ **/
+static void igb_read_mac_addr_dts(const struct device *dev, u8 *mac_addr)
+{
+   const u8 *mac;
+   struct device_node *dn;
+
+   dn = of_find_compatible_node(dev->of_node, NULL, "intel,i210");
+   if (dn) {
+   mac = of_get_mac_address(dn);
+   if (mac)
+   ether_addr_copy(mac_addr, mac);
+   }
+}
+#endif
+
+
 /**
  *  igb_probe - Device Initialization Routine
  *  @pdev: PCI device information struct
@@ -2511,6 +2537,11 @@ static int igb_probe(struct pci_dev *pdev, const struct pci_device_id *ent)

if (hw->mac.ops.read_mac_addr(hw))
dev_err(>dev, "NVM Read Error\n");

+#ifdef CONFIG_OF
+   if (!is_valid_ether_addr(hw->mac.addr))
+   igb_read_mac_addr_dts(>dev, hw->mac.addr);
+#endif
+
memcpy(netdev->dev_addr, hw->mac.addr, netdev->addr_len);

if (!is_valid_ether_addr(netdev->dev_addr)) {
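
The fallback order in the igb_probe() hunk above (keep the NVM address if it is usable, otherwise try the device tree) can be sketched in userspace C. This is only an illustration: mac_is_valid() mimics the kernel's is_valid_ether_addr() check (not all-zero, not multicast), and mac_select() is a hypothetical helper name that does not exist in the driver.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Userspace stand-in for is_valid_ether_addr(): a station address must be
 * neither all-zero nor a multicast/broadcast address.
 */
bool mac_is_valid(const uint8_t addr[ETH_ALEN])
{
	static const uint8_t zero[ETH_ALEN];

	if (memcmp(addr, zero, ETH_ALEN) == 0)
		return false;
	/* Bit 0 of the first octet is the group (multicast) bit. */
	return (addr[0] & 0x01) == 0;
}

/* Hypothetical helper: keep the address read from NVM when it is valid,
 * otherwise copy the device tree address (if one was found). Returns true
 * when the resulting address is usable.
 */
bool mac_select(uint8_t mac[ETH_ALEN], const uint8_t *dt_mac)
{
	if (!mac_is_valid(mac) && dt_mac)
		memcpy(mac, dt_mac, ETH_ALEN);
	return mac_is_valid(mac);
}
```

With an all-zero NVM address and a device tree address of 00:11:22:33:44:55, mac_select() ends up with the device tree value; with no device tree node it leaves the address untouched and reports it as invalid.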

Re: [PATCH v2 2/5] net: phy: dp83848: Add PHY ID for TI version of DP83848C

2016-02-17 Thread Florian Fainelli

On 17/02/2016 15:37, Andrew F. Davis wrote:
> On 02/07/2016 11:47 AM, Andrew F. Davis wrote:
>> After acquiring National Semiconductor, TI appears to have
>> changed the Vendor Model Number for the DP83848C PHYs,
>> add this new ID to supported IDs.
>>
>> Signed-off-by: Andrew F. Davis 
>> ---
>>   drivers/net/phy/dp83848.c | 9 ++---
>>   1 file changed, 6 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/net/phy/dp83848.c b/drivers/net/phy/dp83848.c
>> index 4e78f54..d4686d5f 100644
>> --- a/drivers/net/phy/dp83848.c
>> +++ b/drivers/net/phy/dp83848.c
>> @@ -16,7 +16,8 @@
>>   #include 
>>   #include 
>>
>> -#define DP83848_PHY_ID0x20005c90
>> +#define TI_DP83848C_PHY_ID0x20005ca0
>> +#define NS_DP83848C_PHY_ID0x20005c90
>>
>>   /* Registers */
>>   #define DP83848_MICR0x11
>> @@ -65,7 +66,8 @@ static int dp83848_config_intr(struct phy_device
>> *phydev)
>>   }
>>
>>   static struct mdio_device_id __maybe_unused dp83848_tbl[] = {
>> -{ DP83848_PHY_ID, 0xfff0 },
>> +{ TI_DP83848C_PHY_ID, 0xfff0 },
>> +{ NS_DP83848C_PHY_ID, 0xfff0 },
>>   { }
>>   };
>>   MODULE_DEVICE_TABLE(mdio, dp83848_tbl);
>> @@ -91,7 +93,8 @@ MODULE_DEVICE_TABLE(mdio, dp83848_tbl);
>>   }
>>
>>   static struct phy_driver dp83848_driver[] = {
>> -DP83848_PHY_DRIVER(DP83848_PHY_ID, "TI DP83848 10/100 Mbps PHY"),
>> +DP83848_PHY_DRIVER(TI_DP83848C_PHY_ID, "TI DP83848C 10/100 Mbps
>> PHY"),
>> +DP83848_PHY_DRIVER(NS_DP83848C_PHY_ID, "TI DP83848C 10/100 Mbps
>> PHY"),
> 
> This seems to be causing a warning about duplicate file names (driver name in
> sysfs), so the bottom one can probably s/TI/NS, can this be changed in-tree
> before the merge or should I submit a patch?

Once the patches are merged by David in his tree, you will need to
provide an incremental patch to fix the problem. I had not noticed the
duplicate name either here, but it sounds like you should indeed fix it.

Thanks!
--
Florian
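
For context on why the quoted patch carries two table entries, here is a small sketch of masked PHY ID matching. It assumes the usual phylib rule that a device matches a driver when the IDs agree under the driver's mask, and it assumes a mask of 0xfffffff0 (the mask value as archived above appears shortened, so treat this one as an assumption). phy_id_matches() is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

#define TI_DP83848C_PHY_ID	0x20005ca0u
#define NS_DP83848C_PHY_ID	0x20005c90u
/* Assumption: only the low nibble (revision) is ignored when matching. */
#define DP83848_ID_MASK		0xfffffff0u

/* A PHY matches a driver entry when both IDs are equal under the mask. */
bool phy_id_matches(uint32_t hw_id, uint32_t drv_id, uint32_t mask)
{
	return (hw_id & mask) == (drv_id & mask);
}
```

Under that mask the two IDs differ in bits 4-7, so one entry cannot cover both parts, and each entry registers its own driver whose name should be unique.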

Re: [PATCH v2 2/5] net: phy: dp83848: Add PHY ID for TI version of DP83848C

2016-02-17 Thread Andrew F. Davis


On 02/07/2016 11:47 AM, Andrew F. Davis wrote:

After acquiring National Semiconductor, TI appears to have
changed the Vendor Model Number for the DP83848C PHYs,
add this new ID to supported IDs.

Signed-off-by: Andrew F. Davis 
---
  drivers/net/phy/dp83848.c | 9 ++---
  1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/net/phy/dp83848.c b/drivers/net/phy/dp83848.c
index 4e78f54..d4686d5f 100644
--- a/drivers/net/phy/dp83848.c
+++ b/drivers/net/phy/dp83848.c
@@ -16,7 +16,8 @@
  #include 
  #include 

-#define DP83848_PHY_ID 0x20005c90
+#define TI_DP83848C_PHY_ID 0x20005ca0
+#define NS_DP83848C_PHY_ID 0x20005c90

  /* Registers */
  #define DP83848_MICR  0x11
@@ -65,7 +66,8 @@ static int dp83848_config_intr(struct phy_device *phydev)
  }

  static struct mdio_device_id __maybe_unused dp83848_tbl[] = {
-   { DP83848_PHY_ID, 0xfff0 },
+   { TI_DP83848C_PHY_ID, 0xfff0 },
+   { NS_DP83848C_PHY_ID, 0xfff0 },
{ }
  };
  MODULE_DEVICE_TABLE(mdio, dp83848_tbl);
@@ -91,7 +93,8 @@ MODULE_DEVICE_TABLE(mdio, dp83848_tbl);
}

  static struct phy_driver dp83848_driver[] = {
-   DP83848_PHY_DRIVER(DP83848_PHY_ID, "TI DP83848 10/100 Mbps PHY"),
+   DP83848_PHY_DRIVER(TI_DP83848C_PHY_ID, "TI DP83848C 10/100 Mbps PHY"),
+   DP83848_PHY_DRIVER(NS_DP83848C_PHY_ID, "TI DP83848C 10/100 Mbps PHY"),


This seems to be causing a warning about duplicate file names (driver name in
sysfs), so the bottom one can probably s/TI/NS, can this be changed in-tree
before the merge or should I submit a patch?

Andrew


  };
  module_phy_driver(dp83848_driver);

[PATCH 1/1] cxgb3: fix up vpd strings for kstrto*()

2016-02-17 Thread Steve Wise

The vpd strings are left justified, in a fixed length array, with possible
trailing white space and no NUL.  So fix them up before calling kstrto*().

This is a recent regression which causes cxgb3 to fail to load.

Fixes:e72c932('cxgb3: Convert simple_strtoul to kstrtox')

Signed-off-by: Steve Wise 
---
 drivers/net/ethernet/chelsio/cxgb3/t3_hw.c | 32 +++---
 1 file changed, 25 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb3/t3_hw.c b/drivers/net/ethernet/chelsio/cxgb3/t3_hw.c
index ee04caa..bfd4a7f 100644
--- a/drivers/net/ethernet/chelsio/cxgb3/t3_hw.c
+++ b/drivers/net/ethernet/chelsio/cxgb3/t3_hw.c
@@ -681,6 +681,24 @@ int t3_seeprom_wp(struct adapter *adapter, int enable)
return t3_seeprom_write(adapter, EEPROM_STAT_ADDR, enable ? 0xc : 0);
 }
 
+static int vpdstrtouint(char *s, int len, unsigned int base, unsigned int *val)
+{
+   char tok[len+1];
+
+   memcpy(tok, s, len);
+   tok[len] = 0;
+   return kstrtouint(strim(tok), base, val);
+}
+
+static int vpdstrtou16(char *s, int len, unsigned int base, u16 *val)
+{
+   char tok[len+1];
+
+   memcpy(tok, s, len);
+   tok[len] = 0;
+   return kstrtou16(strim(tok), base, val);
+}
+
 /**
  * get_vpd_params - read VPD parameters from VPD EEPROM
  * @adapter: adapter to read
@@ -709,19 +727,19 @@ static int get_vpd_params(struct adapter *adapter, struct vpd_params *p)
return ret;
}
 
-   ret = kstrtouint(vpd.cclk_data, 10, >cclk);
+   ret = vpdstrtouint(vpd.cclk_data, vpd.cclk_len, 10, >cclk);
if (ret)
return ret;
-   ret = kstrtouint(vpd.mclk_data, 10, >mclk);
+   ret = vpdstrtouint(vpd.mclk_data, vpd.mclk_len, 10, >mclk);
if (ret)
return ret;
-   ret = kstrtouint(vpd.uclk_data, 10, >uclk);
+   ret = vpdstrtouint(vpd.uclk_data, vpd.uclk_len, 10, >uclk);
if (ret)
return ret;
-   ret = kstrtouint(vpd.mdc_data, 10, >mdc);
+   ret = vpdstrtouint(vpd.mdc_data, vpd.mdc_len, 10, >mdc);
if (ret)
return ret;
-   ret = kstrtouint(vpd.mt_data, 10, >mem_timing);
+   ret = vpdstrtouint(vpd.mt_data, vpd.mt_len, 10, >mem_timing);
if (ret)
return ret;
memcpy(p->sn, vpd.sn_data, SERNUM_LEN);
@@ -733,10 +751,10 @@ static int get_vpd_params(struct adapter *adapter, struct vpd_params *p)
} else {
p->port_type[0] = hex_to_bin(vpd.port0_data[0]);
p->port_type[1] = hex_to_bin(vpd.port1_data[0]);
-   ret = kstrtou16(vpd.xaui0cfg_data, 16, >xauicfg[0]);
+   ret = vpdstrtou16(vpd.xaui0cfg_data, vpd.xaui0cfg_len, 16, 
>xauicfg[0]);
if (ret)
return ret;
-   ret = kstrtou16(vpd.xaui1cfg_data, 16, >xauicfg[1]);
+   ret = vpdstrtou16(vpd.xaui1cfg_data, vpd.xaui1cfg_len, 16, 
>xauicfg[1]);
if (ret)
return ret;
}
-- 
2.7.0
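
The fix-up in the patch above (copy the fixed-length field, NUL-terminate it, trim the padding, then parse strictly) can be tried out in userspace with a sketch like the following. It is not the driver code: strtoul() stands in for kstrtouint(), a fixed buffer with a hypothetical VPD_FIELD_MAX bound replaces the variable-length array, and vpd_field_to_uint() is an illustrative name.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define VPD_FIELD_MAX 32	/* hypothetical upper bound for this sketch */

/* Parse a left-justified, space-padded, non-NUL-terminated field. */
int vpd_field_to_uint(const char *s, size_t len, int base, unsigned int *val)
{
	char tok[VPD_FIELD_MAX + 1];
	char *start, *end;
	unsigned long v;
	size_t n = len;

	if (len == 0 || len > VPD_FIELD_MAX)
		return -EINVAL;

	/* Copy the raw field and terminate it so string helpers are safe. */
	memcpy(tok, s, len);
	tok[len] = '\0';

	/* Strip trailing, then leading, white space. */
	while (n > 0 && isspace((unsigned char)tok[n - 1]))
		tok[--n] = '\0';
	start = tok;
	while (isspace((unsigned char)*start))
		start++;

	/* Like kstrtouint(), reject empty input and a leading minus sign. */
	if (*start == '\0' || *start == '-')
		return -EINVAL;

	errno = 0;
	v = strtoul(start, &end, base);
	if (errno || end == start || *end != '\0' || v > UINT_MAX)
		return -EINVAL;

	*val = (unsigned int)v;
	return 0;
}
```

A field holding the four bytes '1', '2', '5', ' ' parses to 125 in base 10, while a field of only spaces, or one with trailing garbage such as "1fx", is rejected instead of being silently truncated.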

Re: [PATCH net-next 1/2] net: ethtool: Add support for PHY packet generators

2016-02-17 Thread Ben Hutchings

On Wed, 2016-02-17 at 22:55 +0100, Andrew Lunn wrote:
> On Wed, Feb 17, 2016 at 09:06:19PM +, Ben Hutchings wrote:
> > On Wed, 2016-02-17 at 21:32 +0100, Andrew Lunn wrote:
> > > Some PHY devices contain a simple packet generator. Features vary, but
> > > often they can be used to generate packets of different sizes,
> > > different contents, with or without errors, and with different inter
> > > packet gaps. Add support to the core ethtool code to support this.
> > [...]
> > > --- a/include/uapi/linux/ethtool.h
> > > +++ b/include/uapi/linux/ethtool.h
> > > @@ -1167,6 +1167,31 @@ struct ethtool_ts_info {
> > >   __u32   rx_reserved[3];
> > >  };
> > >  
> > > +enum ethtool_phy_pkg_gen_flags {
> > > + ETH_PKT_RANDOM  = (1 << 0),
> > > + ETH_PKT_ERROR   = (1 << 1),
> > 
> > What kind of error?  CRC error, FEC error, symbol error?
> 
> Hi Ben
> 
> I want to try to keep the API generic, since different PHYs have
> different capabilities.  The Marvell phy will generate symbol errors
> and CRC errors. You cannot control it in a finer way than that.

Sure.  But this should be commented, something like "all packets have
layer 1 and/or layer 2 errors".

Similarly the random flag should be commented as something like
"randomise packet header and payload".

> > > +};
> > > +
> > > +/**
> > > + * struct ethtool_phy_pkt_get - command to request the phy to generate 
> > > packets.
> > > + * @cmd: command number = %ETHTOOL_PHY_PKT_GEN
> > > + * @count: number of packets to generate
> > > + * @len: length of generated packets
> > > + * @ipg: inter packet gap in bytes.
> > 
> > What if the PHY doesn't allow varying the IPG?  Should there be a way
> > to find out what its supported IPG is, or to request the default value?
> 
> If you pass 0, it will use the default IPG. If you pass a value other
> than 0, and it is not supported, it returns -EINVAL.

Include that in the comment.
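
The IPG rule discussed just above (0 selects the PHY's default, any other unsupported value is rejected with -EINVAL) could look roughly like this in a driver. Everything here is hypothetical: struct pkt_gen_caps, its fields and pkt_gen_resolve_ipg() are made up for illustration and are not part of the proposed ethtool API.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical per-PHY limits; a real driver would know these from the
 * datasheet.
 */
struct pkt_gen_caps {
	uint32_t default_ipg;	/* used when the caller passes 0 */
	uint32_t min_ipg;
	uint32_t max_ipg;
};

/* Resolve the requested inter packet gap (in bytes) to the value that
 * would be programmed into the hardware.
 */
int pkt_gen_resolve_ipg(const struct pkt_gen_caps *caps, uint32_t requested,
			uint32_t *ipg)
{
	if (requested == 0) {
		*ipg = caps->default_ipg;
		return 0;
	}
	if (requested < caps->min_ipg || requested > caps->max_ipg)
		return -EINVAL;

	*ipg = requested;
	return 0;
}
```

A PHY that cannot vary the IPG at all would simply set min_ipg and max_ipg to its fixed value, so only 0 or that exact value is accepted.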

[...] 
> > Similarly, should there be a way to find out the minimum/maximum length it 
> > supports?
> 
> I'm trying to keep it simple. Do we really want to add a complex
> mechanism to query every available parameter to determine the range of
> values it can take?

That is an important part of making this feature generic.

> I find it better that the driver accepts the
> values of 0 meaning pick a sensible default, and for any value not 0,
> return an error if it cannot be supported by the hardware.
> I tried to emphasise this in the man page patch.

The ethtool API is not a private API for the ethtool utility, despite
its name.  The API must be documented in ethtool.h.

> > > + * @flags: a bitmask of flags from  ethtool_phy_pkg_gen_flags
> > > + *
> > > + * PHY drivers may not support all of these parameters. If the
> > > + * requested parameter value cannot be supported an error should be
> > > + * returned.
> > 
> > Should, or must?
> 
> Must would be better.
>  
> > How does userland tell when the PHY has finished?  Should it be
> > possible to cancel this (similar to ETHTOOL_PHYS_ID)?
> 
> The call is blocking and returns when all the packets are sent.
> For the Marvell hardware, you can send up to 255 packets. At 10Mbps,
> 1518 byte packets and 256 it takes about 0.3 seconds.

OK, then it needs to drop the RTNL lock and the ethtool core should
probably call into the driver multiple times, similarly to
ETHTOOL_PHYS_ID.
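
As a rough check of the 0.3 second figure quoted above, and of how the blocking time scales with link speed, the serialisation time can be computed as below. This is a back-of-the-envelope sketch that ignores preamble and inter packet gap; pkt_gen_duration_us() is a hypothetical helper, not part of the patch.

```c
#include <stdint.h>

/* Time in microseconds to put `count` frames of `len` bytes on the wire
 * at `mbps` megabits per second. Bits divided by Mbit/s gives microseconds.
 * Preamble and inter packet gap are ignored.
 */
uint64_t pkt_gen_duration_us(uint32_t count, uint32_t len, uint32_t mbps)
{
	if (mbps == 0)
		return 0;
	return (uint64_t)count * len * 8 / mbps;
}
```

For 255 packets of 1518 bytes this gives 309672 us at 10 Mbps and about 3 ms at 1000 Mbps, which is why holding a lock for the whole run matters mostly on slow links.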

> I don't really like the idea of making it non-blocking. i.e. set it
> generating packets and sometime later stop it. It can lead to some
> very non-obvious issues. Why is my Ethernet card spamming the net at
> line rate, yet the netdev TX counters are not going up?
>  
> > What should happen if the stack tries to send a packet while the PHY is
> > in this mode?  Is it discarded?  Should the driver indicate carrier-off 
> > so that this is obvious?
> 
> Depends on the hardware. Marvell PHYs will discard any packets coming
> from the MAC.

I know that the PHY behaviour varies but the API should be consistent
across drivers and the drivers will have to do a little work to ensure
that.

> > [...]
> > > --- a/net/core/ethtool.c
> > > +++ b/net/core/ethtool.c
> > > @@ -1541,6 +1541,25 @@ static int ethtool_get_phy_stats(struct net_device 
> > > *dev, void __user *useraddr)
> > >   return ret;
> > >  }
> > >  
> > > +static int ethtool_phy_pkt_gen(struct net_device *dev, void __user 
> > > *useraddr)
> > > +{
> > > + struct phy_device *phydev = dev->phydev;
> > > + struct ethtool_phy_pkt_gen pkt_gen;
> > > + int err;
> > > +
> > > + if (!phydev || !phydev->drv->pkt_gen)
> > > + return -EOPNOTSUPP;
> > [...]
> > 
> > Why should this be tied to phylib?  Nothing else in the ethtool
> > interface is.
> 
> We need the phy lock to be held. We don't want anything else accessing
> phy registers at the same time. For the Marvell hardware, we need to
> change the page. If for example genphy_read_status() was used to poll
> the status of the PHY, it could read from the wrong page and get very
> confused.

So what are net drivers

Re: [net-next] vlan: change return type of vlan_proc_rem_dev

2016-02-17 Thread Cong Wang

On Wed, Feb 17, 2016 at 3:44 AM, Zhang Shengju
 wrote:
> Since function vlan_proc_rem_dev() will only return 0, it's better to
> return void instead of int.
>
> Signed-off-by: Zhang Shengju 
> ---
>  net/8021q/vlanproc.c | 3 +--
>  net/8021q/vlanproc.h | 2 +-
>  2 files changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/net/8021q/vlanproc.c b/net/8021q/vlanproc.c
> index ae63cf7..5f1446c 100644
> --- a/net/8021q/vlanproc.c
> +++ b/net/8021q/vlanproc.c
> @@ -184,12 +184,11 @@ int vlan_proc_add_dev(struct net_device *vlandev)
>  /*
>   * Delete directory entry for VLAN device.
>   */
> -int vlan_proc_rem_dev(struct net_device *vlandev)
> +void vlan_proc_rem_dev(struct net_device *vlandev)
>  {
> /** NOTE:  This will consume the memory pointed to by dent, it seems. 
> */
> proc_remove(vlan_dev_priv(vlandev)->dent);
> vlan_dev_priv(vlandev)->dent = NULL;
> -   return 0;
>  }
>
>  /** Proc filesystem entry points 
> /
> diff --git a/net/8021q/vlanproc.h b/net/8021q/vlanproc.h
> index 063f60a..a9d8734 100644
> --- a/net/8021q/vlanproc.h
> +++ b/net/8021q/vlanproc.h
> @@ -5,7 +5,7 @@
>  struct net;
>
>  int vlan_proc_init(struct net *net);
> -int vlan_proc_rem_dev(struct net_device *vlandev);
> +void vlan_proc_rem_dev(struct net_device *vlandev);
>  int vlan_proc_add_dev(struct net_device *vlandev);
>  void vlan_proc_cleanup(struct net *net);

You forgot to change the !PROC_FS case:

#define vlan_proc_rem_dev(dev)  ({(void)(dev), 0; })
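
To show the mismatch, here is a standalone sketch (GNU C, since the stub uses a statement expression). The _old and _stub suffixes are hypothetical names used only so both variants can sit side by side; the void variant is one possible shape for the !PROC_FS stub, not necessarily what the next version of the patch will use.

```c
struct net_device;	/* opaque here; only a pointer is passed around */

/* The stub as quoted above: the statement expression evaluates to int 0,
 * which no longer matches a void vlan_proc_rem_dev().
 */
#define vlan_proc_rem_dev_old(dev)	({ (void)(dev), 0; })

/* One possible void replacement: evaluates its argument, returns nothing. */
static inline void vlan_proc_rem_dev_stub(struct net_device *dev)
{
	(void)dev;
}
```

Callers that ignore the return value compile with either form, so the old stub would not break the build by itself, but keeping both configurations on the same signature avoids surprises later.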

Re: [net-next PATCH v3 3/8] net: sched: add cls_u32 offload hooks for netdevs

2016-02-17 Thread John Fastabend

[...]

>>
>>> +static void u32_replace_hw_hnode(struct tcf_proto *tp, struct
>>> tc_u_hnode *h)
>>> +{
>>> +struct net_device *dev = tp->q->dev_queue->dev;
>>> +struct tc_cls_u32_offload u32_offload = {0};
>>> +struct tc_to_netdev offload;
>>> +
>>> +offload.type = TC_SETUP_CLSU32;
>>> +offload.cls_u32 = _offload;
>>> +
>>> +if (dev->netdev_ops->ndo_setup_tc) {
>>> +offload.cls_u32->command = TC_CLSU32_NEW_HNODE;
>>
>> TC_CLSU32_REPLACE_HNODE?
>>
> 
> Yep I made this change and will send out v4.
> 
> [...]
> 
>>

Actually, thinking about this a bit more: I wrote this assuming
that some hardware actually cares whether it is a new rule or an
existing rule. For me it doesn't matter. I do the same thing in
the new and replace cases: I just write into the slot in the
hardware table, and if it happens to have something in it, well,
it's overwritten, i.e. "replaced". This works because the cls_u32
layer protects us from doing something unexpected.

I'm wondering (mostly asking the mlx folks): is there hardware
out there that cares about this distinction between new and
replace? Otherwise I can just drop new and always use replace,
or vice versa, which is the case in its current form.

Thanks,
John

[PATCH net-next] hv_netvsc: add software transmit timestamp support

2016-02-17 Thread Simon Xiao

Enable skb_tx_timestamp in hyperv netvsc.

Signed-off-by: Simon Xiao 
Reviewed-by: K. Y. Srinivasan 
Reviewed-by: Haiyang Zhang 
---
 drivers/net/hyperv/netvsc_drv.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index c72e5b8..202e2b1 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -550,6 +550,8 @@ do_send:
packet->page_buf_cnt = init_page_array(rndis_msg, rndis_msg_size,
   skb, packet, );
 
+   /* timestamp packet in software */
+   skb_tx_timestamp(skb);
ret = netvsc_send(net_device_ctx->device_ctx, packet,
  rndis_msg, , skb);
 
@@ -920,6 +922,7 @@ static const struct ethtool_ops ethtool_ops = {
.get_link   = ethtool_op_get_link,
.get_channels   = netvsc_get_channels,
.set_channels   = netvsc_set_channels,
+   .get_ts_info= ethtool_op_get_ts_info,
 };
 
 static const struct net_device_ops device_ops = {
-- 
2.5.0

Re: [PATCH v2 net-next 0/8] API set for HW Buffer management

2016-02-17 Thread Willy Tarreau

Hi Gregory,

On Tue, Feb 16, 2016 at 04:33:35PM +0100, Gregory CLEMENT wrote:
> Hello,
> 
> A few weeks ago I sent a proposal for a API set for HW Buffer
> management, to have a better view of the motivation for this API see
> the cover letter of this proposal:
> http://thread.gmane.org/gmane.linux.kernel/2125152
> 
> Since this version I took into account the review from Florian:
> - The hardware buffer management helpers are no more built by default
>   and now depend on a hidden config symbol which has to be selected
>   by the driver if needed
> - The hwbm_pool_refill() and hwbm_pool_add() now receive a gfp_t as
>   argument allowing the caller to specify the flag it needs.
> - buf_num is now tested to ensure there is no wrapping
> - A spinlock has been added to protect the hwbm_pool_add() function in
>   SMP or irq context.
> 
> I also used pr_warn instead of pr_debug in case of errors.
> 
> I fixed the mvneta implementation by returning the buffer to the pool
> at various places instead of ignoring it.
> 
> About the series itself I tried to make this series easier to merge:
> - Squashed "bus: mvebu-mbus: Fix size test for
>mvebu_mbus_get_dram_win_info" into bus: mvebu-mbus: provide api for
>obtaining IO and DRAM window information.
> - Added my signed-off-by on all the patches as submitter of the series.
> - Renamed the dts patches with the pattern "ARM: dts: platform:"
> - Removed the patch "ARM: mvebu: enable SRAM support in
>   mvebu_v7_defconfig" of this series and already applied it
> - Modified the order of the patches.
> 
> In order to ease the test the branch mvneta-BM-framework-v2 is
> available at g...@github.com:MISL-EBU-System-SW/mainline-public.git.

Well, I tested this patch series on top of latest master (from today)
on my fresh new clearfog board. I compared carefully with and without
the patchset. My workload was haproxy receiving connections and forwarding
them to my PC via the same port. I tested both with short connections
(HTTP GET of an empty file) and long ones (1 MB or more). No trouble
was detected at all, which is pretty good. I noticed a very tiny
performance drop which is more noticeable on short connections (high
packet rates), my forwarded connection rate went down from 17500/s to
17300/s. But I have not checked yet what can be tuned when using the
BM, nor did I compare CPU usage. I remember having run some tests in
the past, I guess it was on the XP-GP board, and noticed that the BM
could save a significant amount of CPU and improve cache efficiency,
so if this is the case here, we don't really care about a possible 1%
performance drop.

I'll try to provide more results as time permits.

In the meantime, if you want (or plan to submit a next batch), feel
free to add a Tested-by: Willy Tarreau .

cheers,
Willy

[PATCH net-next v3 1/4] drivers: net: xgene: Add support for Classifier engine

2016-02-17 Thread Iyappan Subramanian

Signed-off-by: Iyappan Subramanian 
Signed-off-by: Khuong Dinh 
Signed-off-by: Tanmay Inamdar 
Tested-by: Toan Le 
---
 drivers/net/ethernet/apm/xgene/Makefile  |   3 +-
 drivers/net/ethernet/apm/xgene/xgene_enet_cle.c  | 357 +++
 drivers/net/ethernet/apm/xgene/xgene_enet_cle.h  | 254 
 drivers/net/ethernet/apm/xgene/xgene_enet_hw.h   |   1 +
 drivers/net/ethernet/apm/xgene/xgene_enet_main.c |  29 +-
 drivers/net/ethernet/apm/xgene/xgene_enet_main.h |  14 +
 6 files changed, 649 insertions(+), 9 deletions(-)
 create mode 100644 drivers/net/ethernet/apm/xgene/xgene_enet_cle.c
 create mode 100644 drivers/net/ethernet/apm/xgene/xgene_enet_cle.h

diff --git a/drivers/net/ethernet/apm/xgene/Makefile 
b/drivers/net/ethernet/apm/xgene/Makefile
index 700b5ab..f46321f 100644
--- a/drivers/net/ethernet/apm/xgene/Makefile
+++ b/drivers/net/ethernet/apm/xgene/Makefile
@@ -3,5 +3,6 @@
 #
 
 xgene-enet-objs := xgene_enet_hw.o xgene_enet_sgmac.o xgene_enet_xgmac.o \
-  xgene_enet_main.o xgene_enet_ring2.o xgene_enet_ethtool.o
+  xgene_enet_main.o xgene_enet_ring2.o xgene_enet_ethtool.o \
+  xgene_enet_cle.o
 obj-$(CONFIG_NET_XGENE) += xgene-enet.o
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_cle.c 
b/drivers/net/ethernet/apm/xgene/xgene_enet_cle.c
new file mode 100644
index 000..ff24ca9
--- /dev/null
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_cle.c
@@ -0,0 +1,357 @@
+/* Applied Micro X-Gene SoC Ethernet Classifier structures
+ *
+ * Copyright (c) 2016, Applied Micro Circuits Corporation
+ * Authors: Khuong Dinh 
+ *  Tanmay Inamdar 
+ *  Iyappan Subramanian 
+ *
+ * This program is free software; you can redistribute  it and/or modify it
+ * under  the terms of  the GNU General  Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include "xgene_enet_main.h"
+
+static void xgene_cle_dbptr_to_hw(struct xgene_enet_pdata *pdata,
+ struct xgene_cle_dbptr *dbptr, u32 *buf)
+{
+   buf[4] = SET_VAL(CLE_FPSEL, dbptr->fpsel) |
+SET_VAL(CLE_DSTQIDL, dbptr->dstqid);
+
+   buf[5] = SET_VAL(CLE_DSTQIDH, (u32)dbptr->dstqid >> CLE_DSTQIDL_LEN) |
+SET_VAL(CLE_PRIORITY, dbptr->cle_priority);
+}
+
+static void xgene_cle_kn_to_hw(struct xgene_cle_ptree_kn *kn, u32 *buf)
+{
+   u32 i, j = 0;
+   u32 data;
+
+   buf[j++] = SET_VAL(CLE_TYPE, kn->node_type);
+   for (i = 0; i < kn->num_keys; i++) {
+   struct xgene_cle_ptree_key *key = >key[i];
+
+   if (!(i % 2)) {
+   buf[j] = SET_VAL(CLE_KN_PRIO, key->priority) |
+SET_VAL(CLE_KN_RPTR, key->result_pointer);
+   } else {
+   data = SET_VAL(CLE_KN_PRIO, key->priority) |
+  SET_VAL(CLE_KN_RPTR, key->result_pointer);
+   buf[j++] |= (data << 16);
+   }
+   }
+}
+
+static void xgene_cle_dn_to_hw(struct xgene_cle_ptree_ewdn *dn,
+  u32 *buf, u32 jb)
+{
+   struct xgene_cle_ptree_branch *br;
+   u32 i, j = 0;
+   u32 npp;
+
+   buf[j++] = SET_VAL(CLE_DN_TYPE, dn->node_type) |
+  SET_VAL(CLE_DN_LASTN, dn->last_node) |
+  SET_VAL(CLE_DN_HLS, dn->hdr_len_store) |
+  SET_VAL(CLE_DN_EXT, dn->hdr_extn) |
+  SET_VAL(CLE_DN_BSTOR, dn->byte_store) |
+  SET_VAL(CLE_DN_SBSTOR, dn->search_byte_store) |
+  SET_VAL(CLE_DN_RPTR, dn->result_pointer);
+
+   for (i = 0; i < dn->num_branches; i++) {
+   br = >branch[i];
+   npp = br->next_packet_pointer;
+
+   if ((br->jump_rel == JMP_ABS) && (npp < CLE_PKTRAM_SIZE))
+   npp += jb;
+
+   buf[j++] = SET_VAL(CLE_BR_VALID, br->valid) |
+  SET_VAL(CLE_BR_NPPTR, npp) |
+  SET_VAL(CLE_BR_JB, br->jump_bw) |
+  SET_VAL(CLE_BR_JR, br->jump_rel) |
+  SET_VAL(CLE_BR_OP, br->operation) |
+  SET_VAL(CLE_BR_NNODE, br->next_node) |
+  SET_VAL(CLE_BR_NBR, br->next_branch);
+
+   buf[j++] = SET_VAL(CLE_BR_DATA, br->data) |
+

[PATCH net-next v3 4/4] dtb: xgene: Add irqs to support multi queue

2016-02-17 Thread Iyappan Subramanian

Signed-off-by: Iyappan Subramanian 
Signed-off-by: Khuong Dinh 
Signed-off-by: Tanmay Inamdar 
Tested-by: Toan Le 
---
 arch/arm64/boot/dts/apm/apm-shadowcat.dtsi | 8 +++-
 arch/arm64/boot/dts/apm/apm-storm.dtsi | 8 +++-
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/boot/dts/apm/apm-shadowcat.dtsi 
b/arch/arm64/boot/dts/apm/apm-shadowcat.dtsi
index 5d87a3d..278f106 100644
--- a/arch/arm64/boot/dts/apm/apm-shadowcat.dtsi
+++ b/arch/arm64/boot/dts/apm/apm-shadowcat.dtsi
@@ -621,7 +621,13 @@
  <0x0 0x1f60 0x0 0Xd100>,
  <0x0 0x2000 0x0 0X22>;
interrupts = <0 108 4>,
-<0 109 4>;
+<0 109 4>,
+<0 110 4>,
+<0 111 4>,
+<0 112 4>,
+<0 113 4>,
+<0 114 4>,
+<0 115 4>;
port-id = <1>;
dma-coherent;
clocks = < 0>;
diff --git a/arch/arm64/boot/dts/apm/apm-storm.dtsi 
b/arch/arm64/boot/dts/apm/apm-storm.dtsi
index fe30f76..cafb2c2 100644
--- a/arch/arm64/boot/dts/apm/apm-storm.dtsi
+++ b/arch/arm64/boot/dts/apm/apm-storm.dtsi
@@ -958,7 +958,13 @@
  <0x0 0x1800 0x0 0X200>;
reg-names = "enet_csr", "ring_csr", "ring_cmd";
interrupts = <0x0 0x60 0x4>,
-<0x0 0x61 0x4>;
+<0x0 0x61 0x4>,
+<0x0 0x62 0x4>,
+<0x0 0x63 0x4>,
+<0x0 0x64 0x4>,
+<0x0 0x65 0x4>,
+<0x0 0x66 0x4>,
+<0x0 0x67 0x4>;
dma-coherent;
clocks = < 0>;
/* mac address will be overwritten by the bootloader */
-- 
1.9.1

[PATCH net-next v3 3/4] drivers: net: xgene: Add support for multiple queues

2016-02-17 Thread Iyappan Subramanian

Signed-off-by: Iyappan Subramanian 
Signed-off-by: Khuong Dinh 
Signed-off-by: Tanmay Inamdar 
Tested-by: Toan Le 
---
 drivers/net/ethernet/apm/xgene/xgene_enet_cle.c   |  11 +-
 drivers/net/ethernet/apm/xgene/xgene_enet_hw.c|  12 +
 drivers/net/ethernet/apm/xgene/xgene_enet_hw.h|   5 +
 drivers/net/ethernet/apm/xgene/xgene_enet_main.c  | 453 +-
 drivers/net/ethernet/apm/xgene/xgene_enet_main.h  |  21 +-
 drivers/net/ethernet/apm/xgene/xgene_enet_ring2.c |  12 +
 6 files changed, 320 insertions(+), 194 deletions(-)

diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_cle.c 
b/drivers/net/ethernet/apm/xgene/xgene_enet_cle.c
index c007497..b2124886 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_cle.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_cle.c
@@ -331,14 +331,15 @@ static int xgene_cle_set_rss_skeys(struct xgene_enet_cle 
*cle)
 
 static int xgene_cle_set_rss_idt(struct xgene_enet_pdata *pdata)
 {
-   u32 fpsel, dstqid, nfpsel, idt_reg;
+   u32 fpsel, dstqid, nfpsel, idt_reg, idx;
int i, ret = 0;
u16 pool_id;
 
for (i = 0; i < XGENE_CLE_IDT_ENTRIES; i++) {
-   pool_id = pdata->rx_ring->buf_pool->id;
+   idx = i % pdata->rxq_cnt;
+   pool_id = pdata->rx_ring[idx]->buf_pool->id;
fpsel = xgene_enet_ring_bufnum(pool_id) - 0x20;
-   dstqid = xgene_enet_dst_ring_num(pdata->rx_ring);
+   dstqid = xgene_enet_dst_ring_num(pdata->rx_ring[idx]);
nfpsel = 0;
idt_reg = 0;
 
@@ -695,8 +696,8 @@ static int xgene_enet_cle_init(struct xgene_enet_pdata 
*pdata)
br->mask = 0x;
}
 
-   def_qid = xgene_enet_dst_ring_num(pdata->rx_ring);
-   pool_id = pdata->rx_ring->buf_pool->id;
+   def_qid = xgene_enet_dst_ring_num(pdata->rx_ring[0]);
+   pool_id = pdata->rx_ring[0]->buf_pool->id;
def_fpsel = xgene_enet_ring_bufnum(pool_id) - 0x20;
 
memset(dbptr, 0, sizeof(struct xgene_cle_dbptr) * DB_MAX_PTRS);
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c 
b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
index db55c9f..39e081a 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
@@ -204,6 +204,17 @@ static u32 xgene_enet_ring_len(struct xgene_enet_desc_ring 
*ring)
return num_msgs;
 }
 
+static void xgene_enet_setup_coalescing(struct xgene_enet_desc_ring *ring)
+{
+   u32 data = 0x;
+
+   xgene_enet_ring_wr32(ring, CSR_PBM_COAL, 0x8e);
+   xgene_enet_ring_wr32(ring, CSR_PBM_CTICK1, data);
+   xgene_enet_ring_wr32(ring, CSR_PBM_CTICK2, data << 16);
+   xgene_enet_ring_wr32(ring, CSR_THRESHOLD0_SET1, 0x40);
+   xgene_enet_ring_wr32(ring, CSR_THRESHOLD1_SET1, 0x80);
+}
+
 void xgene_enet_parse_error(struct xgene_enet_desc_ring *ring,
struct xgene_enet_pdata *pdata,
enum xgene_enet_err_code status)
@@ -892,4 +903,5 @@ struct xgene_ring_ops xgene_ring1_ops = {
.clear = xgene_enet_clear_ring,
.wr_cmd = xgene_enet_wr_cmd,
.len = xgene_enet_ring_len,
+   .coalesce = xgene_enet_setup_coalescing,
 };
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h 
b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h
index 45725ec..ba7da98 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h
@@ -54,6 +54,11 @@ enum xgene_enet_rm {
 #define IS_BUFFER_POOL BIT(20)
 #define PREFETCH_BUF_ENBIT(21)
 #define CSR_RING_ID_BUF0x000c
+#define CSR_PBM_COAL   0x0014
+#define CSR_PBM_CTICK1 0x001c
+#define CSR_PBM_CTICK2 0x0020
+#define CSR_THRESHOLD0_SET10x0030
+#define CSR_THRESHOLD1_SET10x0034
 #define CSR_RING_NE_INT_MODE   0x017c
 #define CSR_RING_CONFIG0x006c
 #define CSR_RING_WR_BASE   0x0070
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c 
b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
index 0bf3924..8d4c1ad 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
@@ -182,7 +182,6 @@ static int xgene_enet_tx_completion(struct 
xgene_enet_desc_ring *cp_ring,
 static u64 xgene_enet_work_msg(struct sk_buff *skb)
 {
struct net_device *ndev = skb->dev;
-   struct xgene_enet_pdata *pdata = netdev_priv(ndev);
struct iphdr *iph;
u8 l3hlen = 0, l4hlen = 0;
u8 ethhdr, proto = 0, csum_enable = 0;
@@ -228,10 +227,6 @@ static u64 xgene_enet_work_msg(struct sk_buff *skb)
if (!mss || ((skb->len - hdr_len) <= mss))
goto out;
 
-   if (mss != pdata->mss) {
-   pdata->mss = mss;
-

[PATCH net-next v3 0/4] Add support for Classifier and RSS

2016-02-17 Thread Iyappan Subramanian

This patch set enables:

(i) Classifier engine, which parses the packet and extracts a search
string that is then used to search a database for associated data.

(ii) Receive Side Scaling (RSS), which dynamically load-balances
the CPUs by controlling the number of messages enqueued per CPU
with the help of a Toeplitz hash of the 4-tuple of source TCP/UDP
port, destination TCP/UDP port, source IPv4 address and destination
IPv4 address.

(iii) Multi queue, to take advantage of RSS.
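
For readers unfamiliar with item (ii), below is a minimal software sketch of a Toeplitz hash over an IPv4 4-tuple. On X-Gene the hash is computed by the classifier hardware, so this is only an illustration of the technique: the function names are hypothetical, and the field order used by rss_ipv4_tuple() (source address, destination address, source port, destination port, all big-endian) is an assumption about a typical RSS layout, not something stated in this series.

```c
#include <stddef.h>
#include <stdint.h>

/* Toeplitz hash: for every set bit of the input (MSB first), XOR in the
 * 32-bit window of the key that starts at that bit position.
 */
uint32_t toeplitz_hash(const uint8_t *key, size_t key_len,
		       const uint8_t *data, size_t data_len)
{
	uint32_t window, hash = 0;
	size_t next_bit = 32;	/* index of the next key bit to shift in */
	size_t i;
	int b;

	if (key_len < 4)
		return 0;

	window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
		 ((uint32_t)key[2] << 8) | key[3];

	for (i = 0; i < data_len; i++) {
		for (b = 7; b >= 0; b--) {
			if (data[i] & (1u << b))
				hash ^= window;
			/* Slide the window one bit along the key. */
			window <<= 1;
			if (next_bit / 8 < key_len &&
			    (key[next_bit / 8] & (0x80u >> (next_bit % 8))))
				window |= 1;
			next_bit++;
		}
	}
	return hash;
}

/* Serialise the 4-tuple into 12 bytes in network byte order
 * (assumed order: saddr, daddr, sport, dport).
 */
void rss_ipv4_tuple(uint8_t out[12], uint32_t saddr, uint32_t daddr,
		    uint16_t sport, uint16_t dport)
{
	out[0] = (uint8_t)(saddr >> 24);
	out[1] = (uint8_t)(saddr >> 16);
	out[2] = (uint8_t)(saddr >> 8);
	out[3] = (uint8_t)saddr;
	out[4] = (uint8_t)(daddr >> 24);
	out[5] = (uint8_t)(daddr >> 16);
	out[6] = (uint8_t)(daddr >> 8);
	out[7] = (uint8_t)daddr;
	out[8] = (uint8_t)(sport >> 8);
	out[9] = (uint8_t)sport;
	out[10] = (uint8_t)(dport >> 8);
	out[11] = (uint8_t)dport;
}
```

The key should be at least four bytes longer than the hashed input (16 bytes for the 12-byte tuple here) so that every input bit sees a full 32-bit window. Because the hash is a XOR of key windows, it is linear: hashing the XOR of two inputs gives the XOR of their hashes.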

v3: Address review comments from v2
- reordered local variables from longest to shortest line

v2: Address review comments from v1
- fix kbuild warning
- add default coalescing

v1:
- Initial version

Signed-off-by: Iyappan Subramanian 
---

Iyappan Subramanian (4):
  drivers: net: xgene: Add support for Classifier engine
  drivers: net: xgene: Add support for RSS
  drivers: net: xgene: Add support for multiple queues
  dtb: xgene: Add irqs to support multi queue

 arch/arm64/boot/dts/apm/apm-shadowcat.dtsi|   8 +-
 arch/arm64/boot/dts/apm/apm-storm.dtsi|   8 +-
 drivers/net/ethernet/apm/xgene/Makefile   |   3 +-
 drivers/net/ethernet/apm/xgene/xgene_enet_cle.c   | 734 ++
 drivers/net/ethernet/apm/xgene/xgene_enet_cle.h   | 295 +
 drivers/net/ethernet/apm/xgene/xgene_enet_hw.c|  12 +
 drivers/net/ethernet/apm/xgene/xgene_enet_hw.h|   6 +
 drivers/net/ethernet/apm/xgene/xgene_enet_main.c  | 482 --
 drivers/net/ethernet/apm/xgene/xgene_enet_main.h  |  35 +-
 drivers/net/ethernet/apm/xgene/xgene_enet_ring2.c |  12 +
 10 files changed, 1395 insertions(+), 200 deletions(-)
 create mode 100644 drivers/net/ethernet/apm/xgene/xgene_enet_cle.c
 create mode 100644 drivers/net/ethernet/apm/xgene/xgene_enet_cle.h

-- 
1.9.1
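
Patch 3/4 of this series spreads the RSS indirection table over the receive queues with idx = i % rxq_cnt. A standalone sketch of that round-robin fill, and of how a hash could then select a queue, is shown below. The helper names are hypothetical, and indexing the table with hash % entries is an assumption about typical RSS behaviour rather than a description of the X-Gene hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill the indirection table so consecutive entries cycle through the
 * available receive queues.
 */
void rss_fill_idt(uint16_t *idt, size_t entries, uint16_t rxq_cnt)
{
	size_t i;

	if (rxq_cnt == 0)
		return;
	for (i = 0; i < entries; i++)
		idt[i] = (uint16_t)(i % rxq_cnt);
}

/* Assumed lookup: the hash selects a table entry, the entry names a queue. */
uint16_t rss_pick_queue(const uint16_t *idt, size_t entries, uint32_t hash)
{
	return idt[hash % entries];
}
```

With 8 entries and 3 queues the table becomes 0, 1, 2, 0, 1, 2, 0, 1, so the queues get a near-even share of the hash space whenever the entry count is large compared to the queue count.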

[PATCH net-next v3 2/4] drivers: net: xgene: Add support for RSS

2016-02-17 Thread Iyappan Subramanian

Signed-off-by: Iyappan Subramanian 
Signed-off-by: Khuong Dinh 
Signed-off-by: Tanmay Inamdar 
Tested-by: Toan Le 
---
 drivers/net/ethernet/apm/xgene/xgene_enet_cle.c | 386 +++-
 drivers/net/ethernet/apm/xgene/xgene_enet_cle.h |  41 +++
 2 files changed, 422 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_cle.c 
b/drivers/net/ethernet/apm/xgene/xgene_enet_cle.c
index ff24ca9..c007497 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_cle.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_cle.c
@@ -21,6 +21,25 @@
 
 #include "xgene_enet_main.h"
 
+/* interfaces to convert structures to HW recognized bit formats */
+static void xgene_cle_sband_to_hw(u8 frag, enum xgene_cle_prot_version ver,
+ enum xgene_cle_prot_type type, u32 len,
+ u32 *reg)
+{
+   *reg =  SET_VAL(SB_IPFRAG, frag) |
+   SET_VAL(SB_IPPROT, type) |
+   SET_VAL(SB_IPVER, ver) |
+   SET_VAL(SB_HDRLEN, len);
+}
+
+static void xgene_cle_idt_to_hw(u32 dstqid, u32 fpsel,
+   u32 nfpsel, u32 *idt_reg)
+{
+   *idt_reg =  SET_VAL(IDT_DSTQID, dstqid) |
+   SET_VAL(IDT_FPSEL, fpsel) |
+   SET_VAL(IDT_NFPSEL, nfpsel);
+}
+
 static void xgene_cle_dbptr_to_hw(struct xgene_enet_pdata *pdata,
  struct xgene_cle_dbptr *dbptr, u32 *buf)
 {
@@ -257,29 +276,372 @@ static void xgene_cle_setup_def_dbptr(struct 
xgene_enet_pdata *pdata,
}
 }
 
+static int xgene_cle_set_rss_sband(struct xgene_enet_cle *cle)
+{
+   u32 idx = CLE_PKTRAM_SIZE / sizeof(u32);
+   u32 mac_hdr_len = ETH_HLEN;
+   u32 sband, reg = 0;
+   u32 ipv4_ihl = 5;
+   u32 hdr_len;
+   int ret;
+
+   /* Sideband: IPV4/TCP packets */
+   hdr_len = (mac_hdr_len << 5) | ipv4_ihl;
+   xgene_cle_sband_to_hw(0, XGENE_CLE_IPV4, XGENE_CLE_TCP, hdr_len, &reg);
+   sband = reg;
+
+   /* Sideband: IPv4/UDP packets */
+   hdr_len = (mac_hdr_len << 5) | ipv4_ihl;
+   xgene_cle_sband_to_hw(1, XGENE_CLE_IPV4, XGENE_CLE_UDP, hdr_len, &reg);
+   sband |= (reg << 16);
+
+   ret = xgene_cle_dram_wr(cle, &sband, 1, idx, PKT_RAM, CLE_CMD_WR);
+   if (ret)
+   return ret;
+
+   /* Sideband: IPv4/RAW packets */
+   hdr_len = (mac_hdr_len << 5) | ipv4_ihl;
+   xgene_cle_sband_to_hw(0, XGENE_CLE_IPV4, XGENE_CLE_OTHER,
+ hdr_len, &reg);
+   sband = reg;
+
+   /* Sideband: Ethernet II/RAW packets */
+   hdr_len = (mac_hdr_len << 5);
+   xgene_cle_sband_to_hw(0, XGENE_CLE_IPV4, XGENE_CLE_OTHER,
+ hdr_len, &reg);
+   sband |= (reg << 16);
+
+   ret = xgene_cle_dram_wr(cle, &sband, 1, idx + 1, PKT_RAM, CLE_CMD_WR);
+   if (ret)
+   return ret;
+
+   return 0;
+}
+
+static int xgene_cle_set_rss_skeys(struct xgene_enet_cle *cle)
+{
+   u32 secret_key_ipv4[4];  /* 16 Bytes*/
+   int ret = 0;
+
+   get_random_bytes(secret_key_ipv4, 16);
+   ret = xgene_cle_dram_wr(cle, secret_key_ipv4, 4, 0,
+   RSS_IPV4_HASH_SKEY, CLE_CMD_WR);
+   return ret;
+}
+
+static int xgene_cle_set_rss_idt(struct xgene_enet_pdata *pdata)
+{
+   u32 fpsel, dstqid, nfpsel, idt_reg;
+   int i, ret = 0;
+   u16 pool_id;
+
+   for (i = 0; i < XGENE_CLE_IDT_ENTRIES; i++) {
+   pool_id = pdata->rx_ring->buf_pool->id;
+   fpsel = xgene_enet_ring_bufnum(pool_id) - 0x20;
+   dstqid = xgene_enet_dst_ring_num(pdata->rx_ring);
+   nfpsel = 0;
+   idt_reg = 0;
+
+   xgene_cle_idt_to_hw(dstqid, fpsel, nfpsel, &idt_reg);
+   ret = xgene_cle_dram_wr(&pdata->cle, &idt_reg, 1, i,
+   RSS_IDT, CLE_CMD_WR);
+   if (ret)
+   return ret;
+   }
+
+   ret = xgene_cle_set_rss_skeys(&pdata->cle);
+   if (ret)
+   return ret;
+
+   return 0;
+}
+
+static int xgene_cle_setup_rss(struct xgene_enet_pdata *pdata)
+{
+   struct xgene_enet_cle *cle = &pdata->cle;
+   void __iomem *base = cle->base;
+   u32 offset, val = 0;
+   int i, ret = 0;
+
+   offset = CLE_PORT_OFFSET;
+   for (i = 0; i < cle->parsers; i++) {
+   if (cle->active_parser != PARSER_ALL)
+   offset = cle->active_parser * CLE_PORT_OFFSET;
+   else
+   offset = i * CLE_PORT_OFFSET;
+
+   /* enable RSS */
+   val = (RSS_IPV4_12B << 1) | 0x1;
+   writel(val, base + RSS_CTRL0 + offset);
+   }
+
+   /* setup sideband data */
+   ret = xgene_cle_set_rss_sband(cle);
+   if (ret)
+   return ret;
+
+   /* setup indirection table */
+   ret = xgene_cle_set_rss_idt(pdata);
+

[net-next PATCH] net: pack tc_cls_u32_knode struct slighter better

2016-02-17 Thread John Fastabend

By packing the structure we can remove a few holes as Jamal
suggests.

before:

struct tc_cls_u32_knode {
struct tcf_exts *  exts; /* 0 8 */
u8 fshift;   /* 8 1 */

/* XXX 3 bytes hole, try to pack */

u32handle;   /*12 4 */
u32val;  /*16 4 */
u32mask; /*20 4 */
u32link_handle;  /*24 4 */

/* XXX 4 bytes hole, try to pack */

struct tc_u32_sel *sel;  /*32 8 */

/* size: 40, cachelines: 1, members: 7 */
/* sum members: 33, holes: 2, sum holes: 7 */
/* last cacheline: 40 bytes */
};

after:

struct tc_cls_u32_knode {
struct tcf_exts *  exts; /* 0 8 */
struct tc_u32_sel *sel;  /* 8 8 */
u32handle;   /*16 4 */
u32val;  /*20 4 */
u32mask; /*24 4 */
u32link_handle;  /*28 4 */
u8 fshift;   /*32 1 */

/* size: 40, cachelines: 1, members: 7 */
/* padding: 7 */
/* last cacheline: 40 bytes */
};
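
The pahole output above can be approximated in plain C with sizeof and offsetof. The sketch below uses stand-in struct names (knode_before, knode_after) and opaque forward declarations, so it is an illustration of the layout change rather than the kernel header itself. It shows that the reordering leaves the overall size unchanged but turns the interior holes into tail padding.

```c
#include <stddef.h>
#include <stdint.h>

/* Opaque stand-ins; only pointers to these are stored. */
struct tcf_exts;
struct tc_u32_sel;

/* Member order before the patch. */
struct knode_before {
	struct tcf_exts *exts;
	uint8_t fshift;
	uint32_t handle;
	uint32_t val;
	uint32_t mask;
	uint32_t link_handle;
	struct tc_u32_sel *sel;
};

/* Member order after the patch. */
struct knode_after {
	struct tcf_exts *exts;
	struct tc_u32_sel *sel;
	uint32_t handle;
	uint32_t val;
	uint32_t mask;
	uint32_t link_handle;
	uint8_t fshift;
};

/* Sum of the member sizes, identical for both layouts. */
#define KNODE_MEMBER_BYTES \
	(2 * sizeof(void *) + 4 * sizeof(uint32_t) + sizeof(uint8_t))

/* Bytes of padding that follow the last member of each layout. */
size_t knode_before_tail_padding(void)
{
	return sizeof(struct knode_before) -
	       (offsetof(struct knode_before, sel) + sizeof(void *));
}

size_t knode_after_tail_padding(void)
{
	return sizeof(struct knode_after) -
	       (offsetof(struct knode_after, fshift) + sizeof(uint8_t));
}
```

In the old order the last member ends exactly at the end of the struct, so every padding byte sits in a hole between members. In the new order all of the padding is at the tail.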

Suggested-by: Jamal Hadi Salim 
Signed-off-by: John Fastabend 
---
 include/net/pkt_cls.h |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 59789ca..2121df5 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -360,12 +360,12 @@ tcf_match_indev(struct sk_buff *skb, int ifindex)
 
 struct tc_cls_u32_knode {
struct tcf_exts *exts;
-   u8 fshift;
+   struct tc_u32_sel *sel;
u32 handle;
u32 val;
u32 mask;
u32 link_handle;
-   struct tc_u32_sel *sel;
+   u8 fshift;
 };
 
 struct tc_cls_u32_hnode {

Re: [PATCH] rtnl: RTM_GETNETCONF: fix wrong return value

2016-02-17 Thread Cong Wang

On Tue, Feb 16, 2016 at 6:43 PM, Anton Protopopov
 wrote:
> An error response from a RTM_GETNETCONF request can return the positive
> error value EINVAL in the struct nlmsgerr that can mislead userspace.
>
> Signed-off-by: Anton Protopopov 

LGTM,

Acked-by: Cong Wang

Re: [net-next PATCH 2/2] ixgbe: fix dates on header of ixgbe_model.h

2016-02-17 Thread Jeff Kirsher

On Wed, 2016-02-17 at 14:35 -0800, John Fastabend wrote:
> Fixes: 9d35cf062e05 ("net: ixgbe: add minimal parser details for
> ixgbe")
> Reported-by: Mark Rustad 
> Signed-off-by: John Fastabend 
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_model.h |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Acked-by: Jeff Kirsher 

Re: [net-next PATCH 1/2] ixgbe: use u32 instead of __u32 in model header

2016-02-17 Thread Jeff Kirsher

On Wed, 2016-02-17 at 14:34 -0800, John Fastabend wrote:
> I incorrectly used __u32 types where we should be using u32 types
> when
> I added the ixgbe_model.h file.
> 
> Fixes: 9d35cf062e05 ("net: ixgbe: add minimal parser details for
> ixgbe")
> Suggested-by: Jamal Hadi Salim 
> Signed-off-by: John Fastabend 
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_model.h |   18 +-
> 
>  1 file changed, 9 insertions(+), 9 deletions(-)

Acked-by: Jeff Kirsher 

Dave, feel free to pull this series in from John to fix the issues with
his previous series.

[net-next PATCH 2/2] ixgbe: fix dates on header of ixgbe_model.h

2016-02-17 Thread John Fastabend

Fixes: 9d35cf062e05 ("net: ixgbe: add minimal parser details for ixgbe")
Reported-by: Mark Rustad 
Signed-off-by: John Fastabend 
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_model.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_model.h 
b/drivers/net/ethernet/intel/ixgbe/ixgbe_model.h
index 62ea2e7..ce48872 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_model.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_model.h
@@ -1,7 +1,7 @@
 
/***
  *
  * Intel 10 Gigabit PCI Express Linux drive
- * Copyright(c) 2013 - 2015 Intel Corporation.
+ * Copyright(c) 2016 Intel Corporation.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,

[net-next PATCH 1/2] ixgbe: use u32 instead of __u32 in model header

2016-02-17 Thread John Fastabend

I incorrectly used __u32 types where we should be using u32 types when
I added the ixgbe_model.h file.

Fixes: 9d35cf062e05 ("net: ixgbe: add minimal parser details for ixgbe")
Suggested-by: Jamal Hadi Salim 
Signed-off-by: John Fastabend 
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_model.h |   18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_model.h 
b/drivers/net/ethernet/intel/ixgbe/ixgbe_model.h
index 43ebec4..62ea2e7 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_model.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_model.h
@@ -35,13 +35,13 @@ struct ixgbe_mat_field {
unsigned int mask;
int (*val)(struct ixgbe_fdir_filter *input,
   union ixgbe_atr_input *mask,
-  __u32 val, __u32 m);
+  u32 val, u32 m);
unsigned int type;
 };
 
 static inline int ixgbe_mat_prgm_sip(struct ixgbe_fdir_filter *input,
 union ixgbe_atr_input *mask,
-__u32 val, __u32 m)
+u32 val, u32 m)
 {
input->filter.formatted.src_ip[0] = val;
mask->formatted.src_ip[0] = m;
@@ -50,7 +50,7 @@ static inline int ixgbe_mat_prgm_sip(struct ixgbe_fdir_filter 
*input,
 
 static inline int ixgbe_mat_prgm_dip(struct ixgbe_fdir_filter *input,
 union ixgbe_atr_input *mask,
-__u32 val, __u32 m)
+u32 val, u32 m)
 {
input->filter.formatted.dst_ip[0] = val;
mask->formatted.dst_ip[0] = m;
@@ -67,7 +67,7 @@ static struct ixgbe_mat_field ixgbe_ipv4_fields[] = {
 
 static inline int ixgbe_mat_prgm_sport(struct ixgbe_fdir_filter *input,
   union ixgbe_atr_input *mask,
-  __u32 val, __u32 m)
+  u32 val, u32 m)
 {
input->filter.formatted.src_port = val & 0xffff;
mask->formatted.src_port = m & 0xffff;
@@ -76,7 +76,7 @@ static inline int ixgbe_mat_prgm_sport(struct 
ixgbe_fdir_filter *input,
 
 static inline int ixgbe_mat_prgm_dport(struct ixgbe_fdir_filter *input,
   union ixgbe_atr_input *mask,
-  __u32 val, __u32 m)
+  u32 val, u32 m)
 {
input->filter.formatted.dst_port = val & 0xffff;
mask->formatted.dst_port = m & 0xffff;
@@ -94,12 +94,12 @@ static struct ixgbe_mat_field ixgbe_tcp_fields[] = {
 struct ixgbe_nexthdr {
/* offset, shift, and mask of position to next header */
unsigned int o;
-   __u32 s;
-   __u32 m;
+   u32 s;
+   u32 m;
/* match criteria to make this jump*/
unsigned int off;
-   __u32 val;
-   __u32 mask;
+   u32 val;
+   u32 mask;
/* location of jump to make */
struct ixgbe_mat_field *jump;
 };

Re: [net-next PATCH v3 6/8] net: ixgbe: add minimal parser details for ixgbe

2016-02-17 Thread John Fastabend

On 16-02-17 10:01 AM, Rustad, Mark D wrote:
> John Fastabend  wrote:
> 
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_model.h
>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_model.h
>> new file mode 100644
>> index 0000000..43ebec4
>> --- /dev/null
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_model.h
>> @@ -0,0 +1,112 @@
>> +/***
>>
>> + *
>> + * Intel 10 Gigabit PCI Express Linux drive
>> + * Copyright(c) 2013 - 2015 Intel Corporation.
> 
> Those copyright dates don't seem reasonable for a new file, since it
> clearly didn't exist in 2013. IANAL, but it has to be better to at least
> be accurate about when it was created.
> 
> -- 
> Mark Rustad, Networking Division, Intel Corporation

Sure I'll fix this as well. :/ thanks Mark.

Re: [PATCH net-next 2/2] phy: marvell: Add support for phy packet generator

2016-02-17 Thread Lino Sanfilippo

Hi,

On 17.02.2016 21:32, Andrew Lunn wrote:
> +
> + oldpage = phy_read(phydev, MII_MARVELL_PHY_PAGE);
> + if (oldpage < 0) {
> + err = oldpage;
> + goto out;
> + }
> +

Shouldn't this return immediately? Jumping to the out label and writing an
error value to the PHY does not seem to be correct.

Regards,
Lino
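
The point above can be sketched with a toy model. Everything here (struct fake_phy, fake_phy_read(), fake_phy_write(), fake_pkt_gen()) is hypothetical and only stands in for the real phylib accessors; it is not the code from the patch under review. The idea is that a failed read of the current page returns at once, because there is no valid page value to restore.

```c
#include <errno.h>

/* Toy PHY: tracks the selected page and counts register writes. */
struct fake_phy {
	int page;
	int fail_reads;		/* non-zero makes every read fail */
	int writes;
};

int fake_phy_read(struct fake_phy *phy)
{
	return phy->fail_reads ? -EIO : phy->page;
}

int fake_phy_write(struct fake_phy *phy, int val)
{
	phy->writes++;
	phy->page = val;
	return 0;
}

int fake_pkt_gen(struct fake_phy *phy, int gen_page)
{
	int oldpage, err;

	oldpage = fake_phy_read(phy);
	if (oldpage < 0)
		return oldpage;	/* nothing valid to restore, so no goto */

	err = fake_phy_write(phy, gen_page);
	if (err < 0)
		goto out;

	/* ... program the packet generator here ... */

out:
	/* restore the page that was selected on entry */
	fake_phy_write(phy, oldpage);
	return err;
}
```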

[PATCH] USB: cdc_subset: only build when one driver is enabled

2016-02-17 Thread Arnd Bergmann

This avoids a harmless randconfig warning I get when USB_NET_CDC_SUBSET
is enabled, but all of the more specific drivers are not:

drivers/net/usb/cdc_subset.c:241:2: #warning You need to configure some 
hardware for this driver

The current behavior is clearly intentional, giving a warning when
a user picks a configuration that won't do anything good. The only
reason for even addressing this is that I'm getting close to
eliminating all 'randconfig' warnings on ARM, and this came up
a couple of times.

My workaround is to not even build the module when none of the
configurations are enabled.

Alternatively we could simply remove the #warning (nothing wrong
for compile-testing), turn it into a runtime warning, or
change the Kconfig options into a menu to hide CONFIG_USB_NET_CDC_SUBSET.

Signed-off-by: Arnd Bergmann 
---
 drivers/net/usb/Kconfig  | 10 ++
 drivers/net/usb/Makefile |  2 +-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/net/usb/Kconfig b/drivers/net/usb/Kconfig
index 7f83504dfa69..cdde59089f72 100644
--- a/drivers/net/usb/Kconfig
+++ b/drivers/net/usb/Kconfig
@@ -395,6 +395,10 @@ config USB_NET_RNDIS_HOST
  The protocol specification is incomplete, and is controlled by
  (and for) Microsoft; it isn't an "Open" ecosystem or market.
 
+config USB_NET_CDC_SUBSET_ENABLE
+   tristate
+   depends on USB_NET_CDC_SUBSET
+
 config USB_NET_CDC_SUBSET
tristate "Simple USB Network Links (CDC Ethernet subset)"
depends on USB_USBNET
@@ -413,6 +417,7 @@ config USB_NET_CDC_SUBSET
 config USB_ALI_M5632
bool "ALi M5632 based 'USB 2.0 Data Link' cables"
depends on USB_NET_CDC_SUBSET
+   select USB_NET_CDC_SUBSET_ENABLE
help
  Choose this option if you're using a host-to-host cable
  based on this design, which supports USB 2.0 high speed.
@@ -420,6 +425,7 @@ config USB_ALI_M5632
 config USB_AN2720
bool "AnchorChips 2720 based cables (Xircom PGUNET, ...)"
depends on USB_NET_CDC_SUBSET
+   select USB_NET_CDC_SUBSET_ENABLE
help
  Choose this option if you're using a host-to-host cable
  based on this design.  Note that AnchorChips is now a
@@ -428,6 +434,7 @@ config USB_AN2720
 config USB_BELKIN
bool "eTEK based host-to-host cables (Advance, Belkin, ...)"
depends on USB_NET_CDC_SUBSET
+   select USB_NET_CDC_SUBSET_ENABLE
default y
help
  Choose this option if you're using a host-to-host cable
@@ -437,6 +444,7 @@ config USB_BELKIN
 config USB_ARMLINUX
bool "Embedded ARM Linux links (iPaq, ...)"
depends on USB_NET_CDC_SUBSET
+   select USB_NET_CDC_SUBSET_ENABLE
default y
help
  Choose this option to support the "usb-eth" networking driver
@@ -454,6 +462,7 @@ config USB_ARMLINUX
 config USB_EPSON2888
bool "Epson 2888 based firmware (DEVELOPMENT)"
depends on USB_NET_CDC_SUBSET
+   select USB_NET_CDC_SUBSET_ENABLE
help
  Choose this option to support the usb networking links used
  by some sample firmware from Epson.
@@ -461,6 +470,7 @@ config USB_EPSON2888
 config USB_KC2190
bool "KT Technology KC2190 based cables (InstaNet)"
depends on USB_NET_CDC_SUBSET
+   select USB_NET_CDC_SUBSET_ENABLE
help
  Choose this option if you're using a host-to-host cable
  with one of these chips.
diff --git a/drivers/net/usb/Makefile b/drivers/net/usb/Makefile
index b5f04068dbe4..37fb46aee341 100644
--- a/drivers/net/usb/Makefile
+++ b/drivers/net/usb/Makefile
@@ -23,7 +23,7 @@ obj-$(CONFIG_USB_NET_GL620A)  += gl620a.o
 obj-$(CONFIG_USB_NET_NET1080)  += net1080.o
 obj-$(CONFIG_USB_NET_PLUSB)+= plusb.o
 obj-$(CONFIG_USB_NET_RNDIS_HOST)   += rndis_host.o
-obj-$(CONFIG_USB_NET_CDC_SUBSET)   += cdc_subset.o
+obj-$(CONFIG_USB_NET_CDC_SUBSET_ENABLE)+= cdc_subset.o
 obj-$(CONFIG_USB_NET_ZAURUS)   += zaurus.o
 obj-$(CONFIG_USB_NET_MCS7830)  += mcs7830.o
 obj-$(CONFIG_USB_USBNET)   += usbnet.o
-- 
2.7.0

Re: [PATCH nf-next v7 4/7] openvswitch: Find existing conntrack entry after upcall.

2016-02-17 Thread Joe Stringer

On 5 February 2016 at 17:41, Jarno Rajahalme  wrote:
> Add a new function ovs_ct_find_existing() to find an existing
> conntrack entry for which this packet was already applied to.  This is
> only to be called when there is evidence that the packet was already
> tracked and committed, but we lost the ct reference due to an
> userspace upcall.
>
> ovs_ct_find_existing() is called from skb_nfct_cached(), which can now
> hide the fact that the ct reference may have been lost due to an
> upcall.  This allows ovs_ct_commit() to be simplified.
>
> This patch is needed by later "openvswitch: Interface with NAT" patch,
> as we need to be able to pass the packet through NAT using the
> original ct reference also after the reference is lost after an
> upcall.
>
> Signed-off-by: Jarno Rajahalme 

Please run checkpatch.pl against your series; there are various style
issues and also things like we should not hit BUG_ON() in packet
processing path.

> +/* Find an existing conntrack entry for which this packet was already applied
> + * to.  This is only called when there is evidence that the packet was 
> already
> + * tracked and commited, but we lost the ct reference due to an userspace
> + * upcall. This means that on entry skb->nfct is NULL.
> + * On success, returns conntrack ptr, sets skb->nfct and ctinfo.
> + * Must be called rcu_read_lock()ed. */

I think this reads a bit more naturally:

/* Find an existing connection which this packet belongs to without
 * re-attributing statistics or modifying the connection state. During
 * upcall processing, skb->nfct is lost, so this allows it to be recovered
 * during actions execution.
 * Must be called with rcu_read_lock.
 *
 * On success, populates skb->nfct and skb->nfctinfo, and returns the
 * connection. Returns NULL if there is no existing entry.
 */

> +static struct nf_conn *
> +ovs_ct_find_existing(struct net *net, const struct nf_conntrack_zone *zone,
> +u_int8_t l3num, struct sk_buff *skb,
> +enum ip_conntrack_info *ctinfo)

The caller doesn't use ctinfo, so this argument could be dropped?



> ct = nf_ct_get(skb, &ctinfo);
> +   /* If no ct, check if we have evidence that an existing conntrack 
> entry
> +* might be found for this skb.  This happens when we lose a skb->nfct
> +* due to an upcall.  If the connection was not confirmed, it is not
> +* cached and needs to be run through conntrack again. */
> +   if (!ct && key->ct.state & OVS_CS_F_TRACKED
> +   && !(key->ct.state & OVS_CS_F_INVALID)
> +   && key->ct.zone == info->zone.id)
> +   ct = ovs_ct_find_existing(net, &info->zone, info->family, skb,
> + &ctinfo);

Operator style in net is a bit different from OVS userspace.

> if (!ct)
> return false;
> +
> if (!net_eq(net, read_pnet(&ct->ct_net)))
> return false;
> if (!nf_ct_zone_equal_any(info->ct, nf_ct_zone(ct)))

Unrelated whitespace.

> @@ -421,6 +508,13 @@ static int ovs_ct_lookup(struct net *net, struct 
> sw_flow_key *key,
>  {
> struct nf_conntrack_expect *exp;
>
> +   /* If we pass an expected packet through nf_conntrack_in() the
> +* expectiation will be removed, but the packet could still be lost in
> +* upcall processing.  To prevent this from happening we perform an
> +* explicit expectation lookup.  Expected connections are always new,
> +* and will be passed through conntrack only when they are committed,
> +* as it is OK to remove the expectation at that time.
> +*/

The expectation /may/ be removed, but as I understand it depends on
the protocol handler. Minor wording tweak, but this comment makes
sense. This hunk could be a separate patch to document existing
behaviour, but I'm not fussed how it's submitted.

Thanks,
Joe

Re: [PATCH nf-next v7 5/7] openvswitch: Handle NF_REPEAT in conntrack action.

2016-02-17 Thread Joe Stringer

On 5 February 2016 at 17:41, Jarno Rajahalme  wrote:
> Repeat the nf_conntrack_in() call when it returns NF_REPEAT.  This
> avoids dropping a SYN packet re-opening an existing TCP connection.
>
> Signed-off-by: Jarno Rajahalme 

Arguably this is a bugfix.

Acked-by: Joe Stringer

[PATCH net-next] ipv6: pass up EMSGSIZE msg for UDP socket in Ipv6

2016-02-17 Thread Wei Wang

From: Wei Wang 

In IPv4, when the machine receives an ICMP_FRAG_NEEDED message, the
connected UDP socket will get an EMSGSIZE error on its next read from the
socket.
However, this is not the case for IPv6.
This fix modifies the UDP error handler in IPv6 for ICMP6_PKT_TOOBIG to
make it similar to the IPv4 behavior. That is, when the machine gets an
ICMP6_PKT_TOOBIG message, the connected UDP socket will get an EMSGSIZE
error on its next read from the socket.

Signed-off-by: Wei Wang 
---
 net/ipv6/udp.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 22e28a4..a0da656 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -590,6 +590,7 @@ void __udp6_lib_err(struct sk_buff *skb, struct 
inet6_skb_parm *opt,
const struct in6_addr *daddr = &hdr->daddr;
struct udphdr *uh = (struct udphdr *)(skb->data+offset);
struct sock *sk;
+   int harderr;
int err;
struct net *net = dev_net(skb->dev);
 
@@ -601,26 +602,27 @@ void __udp6_lib_err(struct sk_buff *skb, struct 
inet6_skb_parm *opt,
return;
}
 
+   harderr = icmpv6_err_convert(type, code, &err);
+   np = inet6_sk(sk);
+
if (type == ICMPV6_PKT_TOOBIG) {
if (!ip6_sk_accept_pmtu(sk))
goto out;
ip6_sk_update_pmtu(skb, sk, info);
+   if (np->pmtudisc != IPV6_PMTUDISC_DONT)
+   harderr = 1;
}
if (type == NDISC_REDIRECT) {
ip6_sk_redirect(skb, sk);
goto out;
}
 
-   np = inet6_sk(sk);
-
-   if (!icmpv6_err_convert(type, code, &err) && !np->recverr)
-   goto out;
-
-   if (sk->sk_state != TCP_ESTABLISHED && !np->recverr)
-   goto out;
-
-   if (np->recverr)
+   if (!np->recverr) {
+   if (!harderr || sk->sk_state != TCP_ESTABLISHED)
+   goto out;
+   } else {
ipv6_icmp_error(sk, skb, err, uh->dest, ntohl(info), (u8 
*)(uh+1));
+   }
 
sk->sk_err = err;
sk->sk_error_report(sk);
-- 
2.7.0.rc3.207.g0ac5344
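
The decision that the reworked tail of __udp6_lib_err() makes in the patch above can be modelled in userspace as a small predicate. This is a simplified sketch, not kernel code: struct udp6_err_ctx and udp6_reports_error() are made-up names, and the ip6_sk_accept_pmtu() and NDISC_REDIRECT early exits are deliberately left out.

```c
#include <stdbool.h>

/* Inputs to the decision, named after the kernel expressions they model. */
struct udp6_err_ctx {
	bool pkt_toobig;	/* type == ICMPV6_PKT_TOOBIG */
	bool pmtudisc_dont;	/* np->pmtudisc == IPV6_PMTUDISC_DONT */
	bool harderr;		/* return value of icmpv6_err_convert() */
	bool recverr;		/* np->recverr */
	bool established;	/* sk->sk_state == TCP_ESTABLISHED */
};

/* Returns true when sk->sk_err would be set and the socket notified. */
bool udp6_reports_error(struct udp6_err_ctx c)
{
	bool harderr = c.harderr;

	/* New in the patch: a too-big report counts as a hard error
	 * unless path MTU discovery is disabled on the socket.
	 */
	if (c.pkt_toobig && !c.pmtudisc_dont)
		harderr = true;

	if (!c.recverr)
		return harderr && c.established;

	/* With IPV6_RECVERR set the error is always queued and reported. */
	return true;
}
```

Under this model a connected socket without IPV6_RECVERR sees the error for a too-big report, while an unconnected one still does not.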

Re: [PATCH net-next 1/2] net: ethtool: Add support for PHY packet generators

2016-02-17 Thread Andrew Lunn

On Wed, Feb 17, 2016 at 09:06:19PM +, Ben Hutchings wrote:
> On Wed, 2016-02-17 at 21:32 +0100, Andrew Lunn wrote:
> > Some PHY devices contain a simple packet generator. Features vary, but
> > often they can be used to generate packets of different sizes,
> > different contents, with or without errors, and with different inter
> > packet gaps. Add support to the core ethtool code to support this.
> [...]
> > --- a/include/uapi/linux/ethtool.h
> > +++ b/include/uapi/linux/ethtool.h
> > @@ -1167,6 +1167,31 @@ struct ethtool_ts_info {
> >     __u32   rx_reserved[3];
> >  };
> >  
> > +enum ethtool_phy_pkg_gen_flags {
> > +   ETH_PKT_RANDOM  = (1 << 0),
> > +   ETH_PKT_ERROR   = (1 << 1),
> 
> What kind of error?  CRC error, FEC error, symbol error?

Hi Ben

I want to try to keep the API generic, since different PHYs have
different capabilities.  The Marvell PHY will generate symbol errors
and CRC errors. You cannot control it in a finer way than that.

> > +};
> > +
> > +/**
> > + * struct ethtool_phy_pkt_get - command to request the phy to generate 
> > packets.
> > + * @cmd: command number = %ETHTOOL_PHY_PKT_GEN
> > + * @count: number of packets to generate
> > + * @len: length of generated packets
> > + * @ipg: inter packet gap in bytes.
> 
> What if the PHY doesn't allow varying the IPG?  Should there be a way
> to find out what its supported IPG is, or to request the default value?

If you pass 0, it will use the default IPG. If you pass a value other
than 0, and it is not supported, it returns -EINVAL.

For the Marvell PHYs, some don't support setting the IPG; it is fixed
at 12. On those PHYs, passing anything other than 0 gives
-EINVAL. When an IPG is allowed, a value of 0 gives the default 12,
since 0 is invalid, and a value > 256 also gives -EINVAL, since that
is the limit imposed by the hardware.
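
The IPG rules described above can be written down as a small helper. This is only a sketch of the stated behaviour: pkt_gen_resolve_ipg() and the two constants are hypothetical names, and the values 12 and 256 are taken from the description of the Marvell hardware in this mail.

```c
#include <errno.h>

#define PKT_GEN_DEFAULT_IPG	12	/* default inter packet gap, in bytes */
#define PKT_GEN_MAX_IPG		256	/* hardware limit quoted above */

/* Returns the IPG to program, or -EINVAL if the request cannot be met.
 * ipg_configurable is non-zero for PHYs that allow the IPG to be set.
 */
int pkt_gen_resolve_ipg(unsigned int requested, int ipg_configurable)
{
	if (requested == 0)
		return PKT_GEN_DEFAULT_IPG;	/* 0 means "use the default" */
	if (!ipg_configurable)
		return -EINVAL;			/* fixed-IPG PHY, non-zero request */
	if (requested > PKT_GEN_MAX_IPG)
		return -EINVAL;			/* beyond the hardware limit */
	return (int)requested;
}
```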

> Similarly, should there be a way to find out the minimum/maximum length it 
> supports?

I'm trying to keep it simple. Do we really want to add a complex
mechanism to query every available parameter to determine the range of
values it can take? I find it better that the driver accepts a value
of 0 as meaning "pick a sensible default" and, for any value other than
0, returns an error if it cannot be supported by the hardware.
I tried to emphasise this in the man page patch.

> > + * @flags: a bitmask of flags from  ethtool_phy_pkg_gen_flags
> > + *
> > + * PHY drivers may not support all of these parameters. If the
> > + * requested parameter value cannot be supported an error should be
> > + * returned.
> 
> Should, or must?

Must would be better.

> How does userland tell when the PHY has finished?  Should it be
> possible to cancel this (similar to ETHTOOL_PHYS_ID)?

The call is blocking and returns when all the packets are sent.
For the Marvell hardware, you can send up to 255 packets. At 10Mbps,
1518 byte packets and 256 it takes about 0.3 seconds.
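
The 0.3 second figure can be checked with simple arithmetic. The helper below is a back-of-the-envelope sketch with a made-up name (pkt_gen_usecs()); it counts only the frame bytes and ignores preamble and inter packet gap, which is an assumption on my part.

```c
#include <stdint.h>

/* Time in microseconds to put `count` frames of `len` bytes on the wire
 * at `mbps` megabits per second. 1 Mbit/s is one bit per microsecond.
 */
uint64_t pkt_gen_usecs(uint32_t count, uint32_t len, uint32_t mbps)
{
	uint64_t bits = (uint64_t)count * len * 8;

	return bits / mbps;
}
```

For 255 frames of 1518 bytes at 10 Mbit/s this gives 255 * 1518 * 8 = 3096720 bits, or 309672 microseconds, which matches the "about 0.3 seconds" quoted above.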

I don't really like the idea of making it non-blocking. i.e. set it
generating packets and sometime later stop it. It can lead to some
very non-obvious issues. Why is my Ethernet card spamming the net at
line rate, yet the netdev TX counters are not going up?

> What should happen if the stack tries to send a packet while the PHY is
> in this mode?  Is it discarded?  Should the driver indicate carrier-off 
> so that this is obvious?

Depends on the hardware. Marvell PHYs will discard any packets coming
from the MAC.

> [...]
> > --- a/net/core/ethtool.c
> > +++ b/net/core/ethtool.c
> > @@ -1541,6 +1541,25 @@ static int ethtool_get_phy_stats(struct net_device 
> > *dev, void __user *useraddr)
> >     return ret;
> >  }
> >  
> > +static int ethtool_phy_pkt_gen(struct net_device *dev, void __user 
> > *useraddr)
> > +{
> > +   struct phy_device *phydev = dev->phydev;
> > +   struct ethtool_phy_pkt_gen pkt_gen;
> > +   int err;
> > +
> > +   if (!phydev || !phydev->drv->pkt_gen)
> > +   return -EOPNOTSUPP;
> [...]
> 
> Why should this be tied to phylib?  Nothing else in the ethtool
> interface is.

We need the phy lock to be held. We don't want anything else accessing
phy registers at the same time. For the Marvell hardware, we need to
change the page. If for example genphy_read_status() was used to poll
the status of the PHY, it could read from the wrong page and get very
confused.

Andrew

Re: [PATCH v2] gre: Avoid kernel panic by clearing IPCB before dst_link_failure called

2016-02-17 Thread David Miller

From: Bernie Harris 
Date: Tue, 16 Feb 2016 14:10:16 +1300

> skb->cb may contain data from previous layers (in the observed case the
> qdisc layer). In the observed scenario, the data was misinterpreted as
> ip header options, which later caused the ihl to be set to an invalid
> value (<5). This resulted in an infinite loop in the mips implementation
> of ip_fast_csum.
> 
> This patch clears IPCB before dst_link_failure is called from the functions
> ip_tunnel_xmit and ip6gre_xmit2, similar to what commit 11c21a30 does for
> an ipv4 case.
> 
> Signed-off-by: Bernie Harris 

Again, I want to see this implemented in a way which causes things to be
treated consistently across all tunneling types.

Which means fixing the exact problem: IPCB(skb)->opt needing initialization.

Thanks.

Re: [PATCH net-next] store complete hash type information in socket buffer...

2016-02-17 Thread Eric Dumazet

On mer., 2016-02-17 at 15:44 -0500, David Miller wrote:
> From: Paul Durrant 
> Date: Mon, 15 Feb 2016 08:32:08 +
> 
> > ...rather than a boolean merely indicating a canonical L4 hash.
> > 
> > skb_set_hash() takes a hash type (from enum pkt_hash_types) as an
> > argument but information is lost since only a single bit in the skb
> > stores whether that hash type is PKT_HASH_TYPE_L4 or not. By using
> > two bits it's possible to store the complete hash type information.
> > 
> > Signed-off-by: Paul Durrant 
> 
> Tom and/or Eric, please have a look at this.

I guess my question is simply 'why do we need this' ?

Consuming a bit in our precious sk_buff is not something we want for
some obscure feature.
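
For context, the information loss described in the quoted patch can be illustrated with a toy structure. struct toy_skb and toy_set_hash() are invented for this sketch and are not the real sk_buff layout; only enum pkt_hash_types mirrors the kernel's definition.

```c
#include <stdint.h>

enum pkt_hash_types {
	PKT_HASH_TYPE_NONE,	/* undefined type */
	PKT_HASH_TYPE_L2,	/* input: src_MAC, dest_MAC */
	PKT_HASH_TYPE_L3,	/* input: src_IP, dst_IP */
	PKT_HASH_TYPE_L4,	/* input: src_IP, dst_IP, src_port, dst_port */
};

/* Toy buffer carrying both encodings side by side for comparison. */
struct toy_skb {
	uint32_t hash;
	unsigned int l4_hash:1;		/* one bit: "is this an L4 hash?" */
	unsigned int hash_type:2;	/* two bits: the full hash type */
};

void toy_set_hash(struct toy_skb *skb, uint32_t hash,
		  enum pkt_hash_types type)
{
	skb->hash = hash;
	skb->l4_hash = (type == PKT_HASH_TYPE_L4);
	skb->hash_type = type;
}
```

With the single bit, an L2 hash and an L3 hash are indistinguishable after the call; the two-bit field keeps them apart at the cost of one extra bit, which is the trade-off being questioned here.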

Re: [Intel-wired-lan] [next] igb: allow setting MAC address on i211 using a device tree blob V3

2016-02-17 Thread Jeff Kirsher

On Wed, 2016-02-17 at 07:43 +0100, John Holland wrote:
> Hello,
> 
> The Intel i211 LOM PCIe ethernet controllers' iNVM operates as an OTP
> and has no external EEPROM interface [1]. The following allows the
> driver to pickup the MAC address from a device tree blob when
> CONFIG_OF
> has been enabled.
> 
> [1]http://www.intel.com/content/www/us/en/embedded/products/networkin
> g/i211-ethernet-controller-datasheet.html
> 
> Changes V2
> - Restrict searching for compatible devices to current pci device.
> 
> Changes V3
> - Add device tree binding documentation.
> 
> Signed-off-by: John Holland
> ---
> 
>   Documentation/devicetree/bindings/net/intel,i210.txt | 36
> 
>   drivers/net/ethernet/intel/igb/igb_main.c    | 30
> ++
>   2 files changed, 66 insertions(+)

Does not apply cleanly to my tree, please make sure you rebase your
patches based off my dev-queue branch on my next-queue tree.

Re: [PATCH] net-sysfs: remove unused fmt_long_hex

2016-02-17 Thread David Miller

From: Colin King 
Date: Mon, 15 Feb 2016 22:54:47 +

> From: Colin Ian King 
> 
> Ever since commit 04ed3e741d0f133e02bed7fa5c98edba128f90e7
> ("net: change netdev->features to u32") the format string
> fmt_long_hex has not been used, so we may as well remove it.
> 
> Signed-off-by: Colin Ian King 

Applied, thank you.

Re: [PATCH v3] phy: marvell: Fix and unify reg-init behavior

2016-02-17 Thread David Miller

From: Andrew Lunn 
Date: Tue, 16 Feb 2016 16:29:02 +0100

> On Mon, Feb 15, 2016 at 11:46:45PM +0100, Clemens Gruber wrote:
>> For the Marvell 88E1510, marvell_of_reg_init was called too late, in the
>> config_aneg function.
>> Since commit 113c74d83eef ("net: phy: turn carrier off on phy attach"),
>> this lead to the link not coming up at boot anymore, due to the phy
>> state machine being stuck at waiting for interrupts (off by default on
>> the 88E1510).
>> For seven other Marvell PHYs, marvell_of_reg_init was not called at all.
>> 
>> Add a generic marvell_config_init function, which in turn calls
>> marvell_of_reg_init.
>> PHYs, which already have a specific config_init function with a call to
>> marvell_of_reg_init, are left untouched. The generic marvell_config_init
>> function is called for all the others, to get consistent behavior across
>> all Marvell PHYs.
>> 
>> Signed-off-by: Clemens Gruber 
> 
> Hi Clemens
> 
> Thanks for extending your original patch to make things more
> consistent.
> 
> Reviewed-by: Andrew Lunn 
> fixes: 113c74d83eef ("net: phy: turn carrier off on phy attach")

Applied, thanks.

BTW, "Fixes: " should be capitalized.
