Re: [PATCH 1/2] net: ethernet: ti: cpts: convert to use kthread_worker

2017-07-21 Thread Richard Cochran
On Fri, Jul 21, 2017 at 06:49:42PM -0500, Grygorii Strashko wrote:
> There could be significant delay in CPTS work schedule under high system
> load and on -RT which could cause CPTS misbehavior due to internal counter
> overflow. Usage of own kthread_worker allows to avoid such kind of issues
> and makes it possible to tune priority of CPTS kthread_worker thread on -RT
> (thread name "cpts").

I have also seen excessive delays in the time stamp work from the
dp83640 under certain loads.  Can we please implement this time stamp
kthread_worker in the PHC subsystem?  That way, the facility can be
used by other drivers without code duplication.

Thanks,
Richard


Re: [PATCH v7 2/3] PCI: Enable PCIe Relaxed Ordering if supported

2017-07-21 Thread Ding Tianhong
Hi Sinan, Bjorn:

On 2017/7/14 21:54, Sinan Kaya wrote:
> On 7/13/2017 9:26 PM, Ding Tianhong wrote:
>> There is no code to enable the PCIe Relaxed Ordering bit in the 
>> configuration space,
>> it is only be enable by default according to the PCIe Standard 
>> Specification, what we
>> do is to distinguish the RC problematic platform and clear the Relaxed 
>> Ordering bit
>> to tell the PCIe EP don't send any TLPs with Relaxed Ordering Attributes to 
>> the Root
>> Complex.
> 
> Maybe, you should change the patch commit as 
> "Disable PCIe Relaxed Ordering if not supported"...

I agree that to use the new commit title as your suggested, thanks. :)

@Bjorn do you want me to spawn a new patchset with the new commit title
and the Reviewed-by from Casey on the patch 3, or maybe you could pick this
up and modify it own ? thanks.

Ding

> 



Re: 4.12.0: igb: transmit queue 7 timed out

2017-07-21 Thread Cong Wang
(Cc netdev and Intel)

On Tue, Jul 18, 2017 at 1:57 PM, Justin Piszcz  wrote:
> Hello,
>
> Kernel: 4.12.0
> Arch: x86_64
>
> What causes this issue?


It is likely a igb driver issue.


>
> [199141.434449] NETDEV WATCHDOG: eth1 (igb): transmit queue 7 timed out
> [199141.434501] [ cut here ]
> [199141.434515] WARNING: CPU: 10 PID: 0 at net/sched/sch_generic.c:316
> dev_watchdog+0x212/0x220
> [199141.434528] CPU: 10 PID: 0 Comm: swapper/10 Not tainted 4.12.0 #3
> [199141.434533] Hardware name: Supermicro X9SRL-F/X9SRL-F, BIOS 3.2
> 01/16/2015
> [199141.434539] task: 8810385b8180 task.stack: c90b4000
> [199141.434546] RIP: 0010:dev_watchdog+0x212/0x220
> [199141.434551] RSP: 0018:88103f483e68 EFLAGS: 00010286
> [199141.434558] RAX: 0037 RBX: 0007 RCX:
> 083f
> [199141.434563] RDX:  RSI: 00f6 RDI:
> 003f
> [199141.434569] RBP: 88103f483e88 R08: 0006a111 R09:
> 0487
> [199141.434574] R10: 0082 R11: 823f49ee R12:
> 8810348a8000
> [199141.434579] R13: 000a R14: 0008 R15:
> 0082
> [199141.434585] FS:  () GS:88103f48()
> knlGS:
> [199141.434590] CS:  0010 DS:  ES:  CR0: 80050033
> [199141.434596] CR2: 29e25c35b000 CR3: 0220a000 CR4:
> 001406e0
> [199141.434600] Call Trace:
> [199141.434607]  
> [199141.434615]  ? qdisc_rcu_free+0x40/0x40
> [199141.434625]  call_timer_fn.isra.4+0x19/0x90
> [199141.434633]  expire_timers+0x7f/0x90
> [199141.434641]  run_timer_softirq+0x84/0xe0
> [199141.434650]  ? ktime_get+0x3b/0x90
> [199141.434659]  ? clockevents_program_event+0x75/0xf0
> [199141.434666]  __do_softirq+0xdf/0x1f0
> [199141.434673]  irq_exit+0xab/0xb0
> [199141.434681]  smp_trace_apic_timer_interrupt+0x63/0x90
> [199141.434688]  smp_apic_timer_interrupt+0x9/0x10
> [199141.434696]  apic_timer_interrupt+0x86/0x90
> [199141.434705] RIP: 0010:cpuidle_enter_state+0x153/0x1e0
> [199141.434711] RSP: 0018:c90b7e68 EFLAGS: 0286 ORIG_RAX:
> ff10
> [199141.434720] RAX: 88103f499000 RBX: e8c91fd0 RCX:
> 001f
> [199141.434725] RDX: 20c49ba5e353f7cf RSI: 88103f496818 RDI:
> 
> [199141.434730] RBP: c90b7e98 R08: 3f7b R09:
> 0018
> [199141.434735] R10: c90b7e48 R11: 38b6 R12:
> b51e3a7582f1
> [199141.434740] R13: 0004 R14: 8225e2b8 R15:
> 8225e120
> [199141.434746]  
> [199141.434754]  ? cpuidle_enter_state+0x148/0x1e0
> [199141.434762]  cpuidle_enter+0x12/0x20
> [199141.434771]  call_cpuidle+0x1e/0x30
> [199141.434779]  do_idle+0xde/0x180
> [199141.434788]  cpu_startup_entry+0x6c/0x70
> [199141.434796]  start_secondary+0x13c/0x160
> [199141.434808]  secondary_startup_64+0x9f/0x9f
> [199141.434813] Code: 8c 24 a4 03 00 00 eb 8d 4c 89 e7 c6 05 ea 96 81 00 01
> e8 02 f0 fd ff 89 d9 4c 89 e6 48 c7 c7 f0 49 00 82 48 89 c2 e8 bd 08 6e ff
> <0f> ff eb c1 66 2e 0f 1f 84 00 00 00 00 00 48 8b 4e 48 55 44 8b
> [199141.434895] ---[ end trace 0bac15bcef84778a ]---
> [199141.435208] igb :08:00.2 eth1: Reset adapter
> [199144.839995] igb :08:00.2 eth1: igb: eth1 NIC Link is Up 1000 Mbps
> Full Duplex, Flow Control: RX/TX
>
> Justin.
>


[PATCH 1/2] net: ethernet: ti: cpts: convert to use kthread_worker

2017-07-21 Thread Grygorii Strashko
There could be significant delay in CPTS work schedule under high system
load and on -RT which could cause CPTS misbehavior due to internal counter
overflow. Usage of own kthread_worker allows to avoid such kind of issues
and makes it possible to tune priority of CPTS kthread_worker thread on -RT
(thread name "cpts").

Signed-off-by: Grygorii Strashko 
---
 drivers/net/ethernet/ti/cpts.c | 23 +--
 drivers/net/ethernet/ti/cpts.h |  4 +++-
 2 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index 32279d2..6a520ae 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -20,12 +20,12 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
@@ -238,14 +238,15 @@ static struct ptp_clock_info cpts_info = {
.enable = cpts_ptp_enable,
 };
 
-static void cpts_overflow_check(struct work_struct *work)
+static void cpts_overflow_check(struct kthread_work *work)
 {
struct timespec64 ts;
struct cpts *cpts = container_of(work, struct cpts, overflow_work.work);
 
cpts_ptp_gettime(>info, );
pr_debug("cpts overflow check at %lld.%09lu\n", ts.tv_sec, ts.tv_nsec);
-   schedule_delayed_work(>overflow_work, cpts->ov_check_period);
+   kthread_queue_delayed_work(cpts->kworker, >overflow_work,
+  cpts->ov_check_period);
 }
 
 static int cpts_match(struct sk_buff *skb, unsigned int ptp_class,
@@ -378,7 +379,8 @@ int cpts_register(struct cpts *cpts)
}
cpts->phc_index = ptp_clock_index(cpts->clock);
 
-   schedule_delayed_work(>overflow_work, cpts->ov_check_period);
+   kthread_queue_delayed_work(cpts->kworker, >overflow_work,
+  cpts->ov_check_period);
return 0;
 
 err_ptp:
@@ -392,7 +394,7 @@ void cpts_unregister(struct cpts *cpts)
if (WARN_ON(!cpts->clock))
return;
 
-   cancel_delayed_work_sync(>overflow_work);
+   kthread_cancel_delayed_work_sync(>overflow_work);
 
ptp_clock_unregister(cpts->clock);
cpts->clock = NULL;
@@ -476,7 +478,6 @@ struct cpts *cpts_create(struct device *dev, void __iomem 
*regs,
cpts->dev = dev;
cpts->reg = (struct cpsw_cpts __iomem *)regs;
spin_lock_init(>lock);
-   INIT_DELAYED_WORK(>overflow_work, cpts_overflow_check);
 
ret = cpts_of_parse(cpts, node);
if (ret)
@@ -488,6 +489,14 @@ struct cpts *cpts_create(struct device *dev, void __iomem 
*regs,
return ERR_PTR(PTR_ERR(cpts->refclk));
}
 
+   kthread_init_delayed_work(>overflow_work, cpts_overflow_check);
+   cpts->kworker = kthread_create_worker(0, "cpts");
+   if (IS_ERR(cpts->kworker)) {
+   dev_err(dev, "failed to create cpts overflow_work task %ld\n",
+   PTR_ERR(cpts->kworker));
+   return ERR_CAST(cpts->kworker);
+   }
+
clk_prepare(cpts->refclk);
 
cpts->cc.read = cpts_systim_read;
@@ -513,6 +522,8 @@ void cpts_release(struct cpts *cpts)
return;
 
clk_unprepare(cpts->refclk);
+
+   kthread_destroy_worker(cpts->kworker);
 }
 EXPORT_SYMBOL_GPL(cpts_release);
 
diff --git a/drivers/net/ethernet/ti/cpts.h b/drivers/net/ethernet/ti/cpts.h
index 01ea82b..e8128e8 100644
--- a/drivers/net/ethernet/ti/cpts.h
+++ b/drivers/net/ethernet/ti/cpts.h
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -119,13 +120,14 @@ struct cpts {
u32 cc_mult; /* for the nominal frequency */
struct cyclecounter cc;
struct timecounter tc;
-   struct delayed_work overflow_work;
int phc_index;
struct clk *refclk;
struct list_head events;
struct list_head pool;
struct cpts_event pool_data[CPTS_MAX_EVENTS];
unsigned long ov_check_period;
+   struct kthread_worker *kworker;
+   struct kthread_delayed_work overflow_work;
 };
 
 void cpts_rx_timestamp(struct cpts *cpts, struct sk_buff *skb);
-- 
2.10.1



[PATCH 0/2] net: ethernet: ti: cpts: fix tx timestamping timeout

2017-07-21 Thread Grygorii Strashko
With the low Ethernet connection speed cpdma notification about packet
processing can be received before CPTS TX timestamp event, which is set
when packet actually left CPSW while cpdma notification is sent when packet
pushed in CPSW fifo. As result, when connection is slow and CPU is fast
enough TX timestamping is not working properly.
Issue was discovered using timestamping tool on am57x boards with Ethernet link
speed forced to 100M and on am335x-evm with Ethernet link speed forced to 10M.

This series fixes it by introducing TX SKB queue to store PTP SKBs for which
Ethernet Transmit Event hasn't been received yet and then re-check this queue
with new Ethernet Transmit Events by scheduling CPTS overflow
work more often until TX SKB queue is not empty.
The internal CPTS workqueues are also converted to use kthread_worker.

Grygorii Strashko (2):
  net: ethernet: ti: cpts: convert to use kthread_worker
  net: ethernet: ti: cpts: fix tx timestamping timeout

 drivers/net/ethernet/ti/cpts.c | 112 +
 drivers/net/ethernet/ti/cpts.h |   5 +-
 2 files changed, 106 insertions(+), 11 deletions(-)

-- 
2.10.1



[PATCH 2/2] net: ethernet: ti: cpts: fix tx timestamping timeout

2017-07-21 Thread Grygorii Strashko
With the low speed Ethernet connection CPDMA notification about packet
processing can be received before CPTS TX timestamp event, which is set
when packet actually left CPSW while cpdma notification is sent when packet
pushed in CPSW fifo.  As result, when connection is slow and CPU is fast
enough TX timestamping is not working properly.

Fix it, by introducing TX SKB queue to store PTP SKBs for which Ethernet
Transmit Event hasn't been received yet and then re-check this queue
with new Ethernet Transmit Events by scheduling CPTS overflow
work more often (every 1 jiffies) until TX SKB queue is not empty.

Side effect of this change is:
 - User space tools require to take into account possible delay in TX
timestamp processing (for example ptp4l works with tx_timestamp_timeout=400
under net traffic and tx_timestamp_timeout=25 in idle).

Signed-off-by: Grygorii Strashko 
---
 drivers/net/ethernet/ti/cpts.c | 93 +++---
 drivers/net/ethernet/ti/cpts.h |  1 +
 2 files changed, 88 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index 6a520ae..746dd1d 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -31,9 +31,18 @@
 
 #include "cpts.h"
 
+#define CPTS_SKB_TX_WORK_TIMEOUT 1 /* jiffies */
+
+struct cpts_skb_cb_data {
+   unsigned long tmo;
+};
+
 #define cpts_read32(c, r)  readl_relaxed(>reg->r)
 #define cpts_write32(c, v, r)  writel_relaxed(v, >reg->r)
 
+static int cpts_match(struct sk_buff *skb, unsigned int ptp_class,
+ u16 ts_seqid, u8 ts_msgtype);
+
 static int event_expired(struct cpts_event *event)
 {
return time_after(jiffies, event->tmo);
@@ -77,6 +86,47 @@ static int cpts_purge_events(struct cpts *cpts)
return removed ? 0 : -1;
 }
 
+static bool cpts_match_tx_ts(struct cpts *cpts, struct cpts_event *event)
+{
+   struct sk_buff *skb, *tmp;
+   u16 seqid;
+   u8 mtype;
+   bool found = false;
+
+   mtype = (event->high >> MESSAGE_TYPE_SHIFT) & MESSAGE_TYPE_MASK;
+   seqid = (event->high >> SEQUENCE_ID_SHIFT) & SEQUENCE_ID_MASK;
+
+   /* no need to grab txq.lock as access is always done under cpts->lock */
+   skb_queue_walk_safe(>txq, skb, tmp) {
+   struct skb_shared_hwtstamps ssh;
+   unsigned int class = ptp_classify_raw(skb);
+   struct cpts_skb_cb_data *skb_cb =
+   (struct cpts_skb_cb_data *)skb->cb;
+
+   if (cpts_match(skb, class, seqid, mtype)) {
+   u64 ns = timecounter_cyc2time(>tc, event->low);
+
+   memset(, 0, sizeof(ssh));
+   ssh.hwtstamp = ns_to_ktime(ns);
+   skb_tstamp_tx(skb, );
+   found = true;
+   __skb_unlink(skb, >txq);
+   dev_consume_skb_any(skb);
+   dev_dbg(cpts->dev, "match tx timestamp mtype %u seqid 
%04x\n",
+   mtype, seqid);
+   } else if (time_after(jiffies, skb_cb->tmo)) {
+   /* timeout any expired skbs over 1s */
+   dev_dbg(cpts->dev,
+   "expiring tx timestamp mtype %u seqid %04x\n",
+   mtype, seqid);
+   __skb_unlink(skb, >txq);
+   dev_consume_skb_any(skb);
+   }
+   }
+
+   return found;
+}
+
 /*
  * Returns zero if matching event type was found.
  */
@@ -101,9 +151,15 @@ static int cpts_fifo_read(struct cpts *cpts, int match)
event->low = lo;
type = event_type(event);
switch (type) {
+   case CPTS_EV_TX:
+   if (cpts_match_tx_ts(cpts, event)) {
+   /* if the new event matches an existing skb,
+* then don't queue it
+*/
+   break;
+   }
case CPTS_EV_PUSH:
case CPTS_EV_RX:
-   case CPTS_EV_TX:
list_del_init(>list);
list_add_tail(>list, >events);
break;
@@ -240,13 +296,20 @@ static struct ptp_clock_info cpts_info = {
 
 static void cpts_overflow_check(struct kthread_work *work)
 {
-   struct timespec64 ts;
struct cpts *cpts = container_of(work, struct cpts, overflow_work.work);
+   unsigned long delay = cpts->ov_check_period;
+   struct timespec64 ts;
+   unsigned long flags;
+
+   spin_lock_irqsave(>lock, flags);
+   ts = ns_to_timespec64(timecounter_read(>tc));
+
+   if (!skb_queue_empty(>txq))
+   delay = CPTS_SKB_TX_WORK_TIMEOUT;
+   spin_unlock_irqrestore(>lock, flags);
 
-   cpts_ptp_gettime(>info, );

Re: [PATCH iproute2 master 1/2] bpf: improve error reporting around tail calls

2017-07-21 Thread Stephen Hemminger
On Fri, 21 Jul 2017 21:13:06 +0200
Daniel Borkmann  wrote:

> Currently, it's still quite hard to figure out if a prog passed the
> verifier, but later gets rejected due to different tail call ownership.
> Figure out whether that is the case and provide appropriate error
> messages to the user.
> 
> Signed-off-by: Daniel Borkmann 

Sorry, dead code. Please fix and resubmit.

bpf.c:356:12: warning: ‘bpf_derive_prog_from_fdinfo’ defined but not used 
[-Wunused-function]
 static int bpf_derive_prog_from_fdinfo(int fd, struct bpf_prog_data *prog)


SELinux/IP_PASSSEC regression in 4.13-rcX

2017-07-21 Thread Paul Moore
Hello,

I've been seeing a SELinux regression with IP_PASSSEC on the v4.13-rcX
kernels and finally tracked the problem down to the
skb_release_head_state() call in __udp_queue_rcv_skb().  Looking at
the code and the git log it would appear that the likely culprit is
0a463c78d25b ("udp: avoid a cache miss on dequeue
"); it looks similar to IP option problem fixed in 0ddf3fb2c43d2.

>From a SELinux/IP_PASSSEC point of view we need access to the skb->sp
pointer to examine the SAs.  I'm posting this here without a patch
because it isn't clear to me how you would like to fix the problem; my
initial thought would be to simply make the skb_release_head_state()
conditional on the skb->sp pointer, much like the IP options fix, but
I'm not sure if you have a more clever idea.

-- 
paul moore
www.paul-moore.com


Re: [PATCH net] ipv6: no need to return rt->dst.error if it is not null entry.

2017-07-21 Thread Roopa Prabhu
On Fri, Jul 21, 2017 at 11:42 AM, Cong Wang  wrote:
> On Thu, Jul 20, 2017 at 8:23 AM, Hangbin Liu  wrote:
>> 2017-07-20 23:06 GMT+08:00 Hangbin Liu :
 +++ b/net/ipv6/route.c
 @@ -3637,12 +3637,6 @@ static int inet6_rtm_getroute(struct sk_buff 
 *in_skb, struct nlmsghdr *nlh,
 dst = ip6_route_lookup(net, , 0);

 rt = container_of(dst, struct rt6_info, dst);
 -   if (rt->dst.error) {
 -   err = rt->dst.error;
 -   ip6_rt_put(rt);
 -   goto errout;
 -   }
>>>
>>> hmm... or instead of remove this check, should we check all the entry? Like
>>> if ((rt->dst.error && rt != net->ipv6.ip6_null_entry && rt !=
>> ^^ mistake here
>>> net->ipv6.ip6_blk_hole_entry) ||
>>>  rt == net->ipv6.ip6_null_entry )
>>
>> Sorry,  there should be no need to check ip6_null_entry since the
>> error is already
>> -ENETUNREACH. So how about
>
> Hmm? All of these 3 entries have error set, right??
> So we should only check dst.error...

removing the check seems ok to me. We can also make the check
conditional to fibmatch code only
to eliminate any change in behavior introduced by fibmatch.

ie if (fibmatch && rt->dst.error).

Hangbin, can you also pls check that fibmatch works ok for such routes
with your patch applied ?.

ip netns exec client ip -6 route get fibmatch 2003::1 (with latest iproute2)

thank you.


[PATCH] net: stmmac: Adjust dump offset of DMA registers for ethtool

2017-07-21 Thread thor . thayer
From: Thor Thayer 

The commit fbf68229ffe7 ("net: stmmac: unify registers dumps methods")

in the Linux kernel modified the register dump to store the DMA registers
at the DMA register offset (0x1000) but ethtool (stmmac.c) looks for the
DMA registers after the MAC registers which is offset 55.
This patch copies the DMA registers from the higher offset to the offset
where ethtool expects them.

Signed-off-by: Thor Thayer 
---
 drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c  | 2 +-
 drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c   | 2 +-
 drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h  | 3 +++
 drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c | 5 +
 4 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c 
b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
index cc4ea13..ec539f7 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
@@ -205,7 +205,7 @@ static void dwmac1000_dump_dma_regs(void __iomem *ioaddr, 
u32 *reg_space)
 {
int i;
 
-   for (i = 0; i < 23; i++)
+   for (i = 0; i < NUM_DWMAC1000_DMA_REGS; i++)
if ((i < 12) || (i > 17))
reg_space[DMA_BUS_MODE / 4 + i] =
readl(ioaddr + DMA_BUS_MODE + i * 4);
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c 
b/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c
index eef2f22..6502b9a 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c
@@ -70,7 +70,7 @@ static void dwmac100_dump_dma_regs(void __iomem *ioaddr, u32 
*reg_space)
 {
int i;
 
-   for (i = 0; i < 9; i++)
+   for (i = 0; i < NUM_DWMAC100_DMA_REGS; i++)
reg_space[DMA_BUS_MODE / 4 + i] =
readl(ioaddr + DMA_BUS_MODE + i * 4);
 
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h 
b/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
index 56e485f..3107d19 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
@@ -136,6 +136,9 @@
 #define DMA_STATUS_TI  0x0001  /* Transmit Interrupt */
 #define DMA_CONTROL_FTF0x0010  /* Flush transmit FIFO 
*/
 
+#define NUM_DWMAC100_DMA_REGS  9
+#define NUM_DWMAC1000_DMA_REGS 23
+
 void dwmac_enable_dma_transmission(void __iomem *ioaddr);
 void dwmac_enable_dma_irq(void __iomem *ioaddr);
 void dwmac_disable_dma_irq(void __iomem *ioaddr);
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c 
b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
index 690b7c1..2ffb5b1 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
@@ -33,6 +33,8 @@
 #define MAC100_ETHTOOL_NAME"st_mac100"
 #define GMAC_ETHTOOL_NAME  "st_gmac"
 
+#define ETHTOOL_DMA_OFFSET 55
+
 struct stmmac_stats {
char stat_string[ETH_GSTRING_LEN];
int sizeof_stat;
@@ -443,6 +445,9 @@ static void stmmac_ethtool_gregs(struct net_device *dev,
 
priv->hw->mac->dump_regs(priv->hw, reg_space);
priv->hw->dma->dump_regs(priv->ioaddr, reg_space);
+   /* Copy DMA registers to where ethtool expects them */
+   memcpy(_space[ETHTOOL_DMA_OFFSET], _space[DMA_BUS_MODE / 4],
+  NUM_DWMAC1000_DMA_REGS * 4);
 }
 
 static void
-- 
2.7.4



Re: A buggy behavior for Linux TCP Reno and HTCP

2017-07-21 Thread Yuchung Cheng
On Fri, Jul 21, 2017 at 1:46 PM, Neal Cardwell  wrote:
> On Fri, Jul 21, 2017 at 4:27 PM, Lisong Xu  wrote:
>>
>> Hi Yuchung,
>>
>> This test scenario is only one example to trigger this bug. In general, as
>> long as cwnd <4, the undo function has this bug.
>
>
> Yes, personally I agree that this seems like an issue that is general enough
> to be worth fixing. In the sense that, if cwnd <4, then we may well be very
> congested. So we don't want to get hit by this bug wherein an undo of a loss
> recovery can cause cwnd to suddenly jump (from 1, 2, or 3) up to 4.
>
> Seems like any of the several CCs that use tcp_reno_undo_cwnd() have this
> bug.
>
> I guess in my mind the only question is whether we want to add a
> tcp_foo_undo_cwnd() and ca->loss_cwnd to every CC module to handle this
> issue (i.e. make every CC module handle it the way CUBIC does), or (my
I would prefer the former b/c loss_cwnd may not be universal TCP
state, just like ssthresh carries no meaning in some CC (bbr). It also
seems also more consistent with the recent change on undo

commit e97991832a4ea4a5f47d65f068a4c966a2eb5730
Author: Florian Westphal 
Date:   Mon Nov 21 14:18:38 2016 +0100

tcp: make undo_cwnd mandatory for congestion modules

> preference) just add a tp->loss_cwnd field so we can use shared code in
> tcp_reno_undo_cwnd() to get this right across all CC modules.
>
> neal
>


Re: A buggy behavior for Linux TCP Reno and HTCP

2017-07-21 Thread Neal Cardwell
On Fri, Jul 21, 2017 at 4:27 PM, Lisong Xu  wrote:
>
> Hi Yuchung,
>
> This test scenario is only one example to trigger this bug. In general, as 
> long as cwnd <4, the undo function has this bug.

Yes, personally I agree that this seems like an issue that is general
enough to be worth fixing. In the sense that, if cwnd <4, then we may
well be very congested. So we don't want to get hit by this bug
wherein an undo of a loss recovery can cause cwnd to suddenly jump
(from 1, 2, or 3) up to 4.

Seems like any of the several CCs that use tcp_reno_undo_cwnd() have this bug.

I guess in my mind the only question is whether we want to add a
tcp_foo_undo_cwnd() and ca->loss_cwnd to every CC module to handle
this issue (i.e. make every CC module handle it the way CUBIC does),
or (my preference) just add a tp->loss_cwnd field so we can use shared
code in tcp_reno_undo_cwnd() to get this right across all CC modules.

neal


Re: [RFC 1/2] net-next: add a dma_desc element to struct skb_shared_info

2017-07-21 Thread Florian Westphal
John Crispin  wrote:
> When the flow offloading engine forwards a packet to the DMA it will send
> additional info to the sw path. this includes
> * physical switch port
> * internal flow hash - this is required to populate the correct flow table
> entry
> * ppe state - this indicates what state the PPEs internal table is in for
> the flow
> * the reason why the packet was forwarde - these are things like bind,
> unbind, timed out, ...
> 
> once the flow table offloading patches are ready and upstream, the netfilter
> layer will see the SKB and pass it o to the flow table offloading code,

If this is about conntrack offloading, then I prefer if this is done
without changing any core network structure.

What about adding a new conntrack extension to hold whatever info
you need, and then allocate a conntrack entry in the driver?

This would obviously need core changes in conntrack (such as allowing
calls into conntrack from drivers without hard module dependencies,
and a thorough check if this causes backwards problems (e.g.
right now a "-m conntrack" check in the raw table can only succeed for
packets from lo interface).

But I think that could be worked around, esp. if we assume that we
won't see such entries a lot (assuming sw is slowpath and hw handles
most packets).

Thanks,
Florian


Re: A buggy behavior for Linux TCP Reno and HTCP

2017-07-21 Thread Lisong Xu

Hi Yuchung,

This test scenario is only one example to trigger this bug. In 
generally, as long as cwnd <4, the undo function has this bug.


This would not be a problem for a normal network. But might be an issue, 
if the network is highly congested (e.g., many many TCP flows with small 
cwnd <4). In this case, the bug may possibly mistakenly double the 
sending rate of each flow, and make a highly congested network even more 
congested  similar to congestion collapse. This is actually why we 
need the congestion control algorithms in the first place.


Thanks
Lisong

On 7/21/2017 12:59 PM, Yuchung Cheng wrote:

On Thu, Jul 20, 2017 at 2:28 PM, Wei Sun  wrote:

Hi Yuchung,

Sorry for the confusion.  The test case was adapted from an old DSACK
test case (i.e., forget to remove something).

Attached is a new and simple one. Thanks

Note that the test scenario is fairly rare IMO: the connection first
experience timeouts, then its retransmission got acked, then the
original packets get Acked (ack w/ val 1400 ecr 130). It can be really
long reordering, or reordering plus packet loss.

The Linux undo state machines may not handle this perfectly, but it's
probably not worth extra state for such rare events.




On Wed, Jul 19, 2017 at 2:31 PM, Yuchung Cheng  wrote:

On Tue, Jul 18, 2017 at 2:36 PM, Wei Sun  wrote:

Hi there,

We find a buggy behavior when using Linux TCP Reno and HTCP in low
bandwidth or highly congested network environments.

In a simple word, their undo functions may mistakenly double the cwnd,
leading to a more aggressive behavior in a highly congested scenario.


The detailed reason:

The current reno undo function assumes cwnd halving (and thus doubles
the cwnd), but it doesn't consider a corner case condition that
ssthresh is at least 2.

e.g.,
  cwnd  ssth
An initial state: 25
A spurious loss:   12
Undo:   45

Here the cwnd after undo is two times as that before undo. Attached is
a simple script to reproduce it.

the packetdrill script is a bit confusing: it disables SACK but then
the client returns ACK w/ SACKs, also 3 dupacks happen after RTO so
the sender isn't technically going through a fast recovery...

could you provide a better test?


A similar reason for HTCP, so we recommend to store the cwnd on loss
in .ssthresh implementation and restore it again in .undo_cwnd for TCP
Reno and HTCP implementations.

Thanks




Re: A buggy behavior for Linux TCP Reno and HTCP

2017-07-21 Thread Lisong Xu

Hi Yuchung,

This test scenario is only one example to trigger this bug. In general, 
as long as cwnd <4, the undo function has this bug.


This would not be a problem for a normal network. But might be an issue, 
if the network is highly congested (e.g., many many TCP flows with small 
cwnd <4). In this case, the bug may possibly mistakenly double the 
sending rate of each flow, and make a highly congested network even more 
congested  similar to congestion collapse. This is actually why we 
need the congestion control algorithms in the first place.


Thanks
Lisong

On 7/21/2017 12:59 PM, Yuchung Cheng wrote:

On Thu, Jul 20, 2017 at 2:28 PM, Wei Sun  wrote:

Hi Yuchung,

Sorry for the confusion.  The test case was adapted from an old DSACK
test case (i.e., forget to remove something).

Attached is a new and simple one. Thanks

Note that the test scenario is fairly rare IMO: the connection first
experience timeouts, then its retransmission got acked, then the
original packets get Acked (ack w/ val 1400 ecr 130). It can be really
long reordering, or reordering plus packet loss.

The Linux undo state machines may not handle this perfectly, but it's
probably not worth extra state for such rare events.




On Wed, Jul 19, 2017 at 2:31 PM, Yuchung Cheng  wrote:

On Tue, Jul 18, 2017 at 2:36 PM, Wei Sun  wrote:

Hi there,

We find a buggy behavior when using Linux TCP Reno and HTCP in low
bandwidth or highly congested network environments.

In a simple word, their undo functions may mistakenly double the cwnd,
leading to a more aggressive behavior in a highly congested scenario.


The detailed reason:

The current reno undo function assumes cwnd halving (and thus doubles
the cwnd), but it doesn't consider a corner case condition that
ssthresh is at least 2.

e.g.,
  cwnd  ssth
An initial state: 25
A spurious loss:   12
Undo:   45

Here the cwnd after undo is two times as that before undo. Attached is
a simple script to reproduce it.

the packetdrill script is a bit confusing: it disables SACK but then
the client returns ACK w/ SACKs, also 3 dupacks happen after RTO so
the sender isn't technically going through a fast recovery...

could you provide a better test?


A similar reason for HTCP, so we recommend to store the cwnd on loss
in .ssthresh implementation and restore it again in .undo_cwnd for TCP
Reno and HTCP implementations.

Thanks




Re: [RFC 1/2] net-next: add a dma_desc element to struct skb_shared_info

2017-07-21 Thread David Miller
From: John Crispin 
Date: Fri, 21 Jul 2017 19:01:57 +0200

> When the flow offloading engine forwards a packet to the DMA it will
> send additional info to the sw path. this includes
> * physical switch port
> * internal flow hash - this is required to populate the correct flow
> * table entry
> * ppe state - this indicates what state the PPEs internal table is in
> * for the flow
> * the reason why the packet was forwarde - these are things like bind,
> * unbind, timed out, ...
> 
> once the flow table offloading patches are ready and upstream, the
> netfilter layer will see the SKB and pass it o to the flow table
> offloading code, at which point it will finally end up inside the
> offloading driver. this will need to have access to this info sent to
> the sw path inside the rx descriptor to properly work out what state
> the flow is in and which table entry to populate in the HW table for
> offloading to work.

You absolutely must justify any change to a core data structure
alongside the complete and full set of patches that actually make use
of that data structure change.

You can't just say "here is the data structure change and BTW what
actually uses this is somewhere else, and not here on the list yet."

That makes it impossible to 1) evaluate the correctness of your change
and 2) validate the actual use so we can suggest alternative schemes
and/or approaches.

So please don't suggest changes this way.

Thanks.


[PATCH iproute2 master 1/2] bpf: improve error reporting around tail calls

2017-07-21 Thread Daniel Borkmann
Currently, it's still quite hard to figure out if a prog passed the
verifier, but later gets rejected due to different tail call ownership.
Figure out whether that is the case and provide appropriate error
messages to the user.

Signed-off-by: Daniel Borkmann 
---
 lib/bpf.c | 226 ++
 1 file changed, 169 insertions(+), 57 deletions(-)

diff --git a/lib/bpf.c b/lib/bpf.c
index 7eb5cd9..4f6421a 100644
--- a/lib/bpf.c
+++ b/lib/bpf.c
@@ -344,15 +344,50 @@ static void bpf_map_pin_report(const struct bpf_elf_map 
*pin,
fprintf(stderr, "\n");
 }
 
-static int bpf_map_selfcheck_pinned(int fd, const struct bpf_elf_map *map,
-   int length, enum bpf_prog_type type)
+struct bpf_prog_data {
+   unsigned int type;
+   unsigned int jited;
+};
+
+struct bpf_map_ext {
+   struct bpf_prog_data owner;
+};
+
+static int bpf_derive_prog_from_fdinfo(int fd, struct bpf_prog_data *prog)
+{
+   char file[PATH_MAX], buff[4096];
+   unsigned int val;
+   FILE *fp;
+
+   snprintf(file, sizeof(file), "/proc/%d/fdinfo/%d", getpid(), fd);
+   memset(prog, 0, sizeof(*prog));
+
+   fp = fopen(file, "r");
+   if (!fp) {
+   fprintf(stderr, "No procfs support?!\n");
+   return -EIO;
+   }
+
+   while (fgets(buff, sizeof(buff), fp)) {
+   if (sscanf(buff, "prog_type:\t%u", ) == 1)
+   prog->type = val;
+   else if (sscanf(buff, "prog_jited:\t%u", ) == 1)
+   prog->jited = val;
+   }
+
+   fclose(fp);
+   return 0;
+}
+
+static int bpf_derive_elf_map_from_fdinfo(int fd, struct bpf_elf_map *map,
+ struct bpf_map_ext *ext)
 {
+   unsigned int val, owner_type = 0, owner_jited = 0;
char file[PATH_MAX], buff[4096];
-   struct bpf_elf_map tmp = {}, zero = {};
-   unsigned int val, owner_type = 0;
FILE *fp;
 
snprintf(file, sizeof(file), "/proc/%d/fdinfo/%d", getpid(), fd);
+   memset(map, 0, sizeof(*map));
 
fp = fopen(file, "r");
if (!fp) {
@@ -362,27 +397,48 @@ static int bpf_map_selfcheck_pinned(int fd, const struct 
bpf_elf_map *map,
 
while (fgets(buff, sizeof(buff), fp)) {
if (sscanf(buff, "map_type:\t%u", ) == 1)
-   tmp.type = val;
+   map->type = val;
else if (sscanf(buff, "key_size:\t%u", ) == 1)
-   tmp.size_key = val;
+   map->size_key = val;
else if (sscanf(buff, "value_size:\t%u", ) == 1)
-   tmp.size_value = val;
+   map->size_value = val;
else if (sscanf(buff, "max_entries:\t%u", ) == 1)
-   tmp.max_elem = val;
+   map->max_elem = val;
else if (sscanf(buff, "map_flags:\t%i", ) == 1)
-   tmp.flags = val;
+   map->flags = val;
else if (sscanf(buff, "owner_prog_type:\t%i", ) == 1)
owner_type = val;
+   else if (sscanf(buff, "owner_jited:\t%i", ) == 1)
+   owner_jited = val;
}
 
fclose(fp);
+   if (ext) {
+   memset(ext, 0, sizeof(*ext));
+   ext->owner.type  = owner_type;
+   ext->owner.jited = owner_jited;
+   }
+
+   return 0;
+}
+
+static int bpf_map_selfcheck_pinned(int fd, const struct bpf_elf_map *map,
+   struct bpf_map_ext *ext, int length,
+   enum bpf_prog_type type)
+{
+   struct bpf_elf_map tmp, zero = {};
+   int ret;
+
+   ret = bpf_derive_elf_map_from_fdinfo(fd, , ext);
+   if (ret < 0)
+   return ret;
 
/* The decision to reject this is on kernel side eventually, but
 * at least give the user a chance to know what's wrong.
 */
-   if (owner_type && owner_type != type)
+   if (ext->owner.type && ext->owner.type != type)
fprintf(stderr, "Program array map owner types differ: %u (obj) 
!= %u (pin)\n",
-   type, owner_type);
+   type, ext->owner.type);
 
if (!memcmp(, map, length)) {
return 0;
@@ -882,6 +938,7 @@ int bpf_graft_map(const char *map_path, uint32_t *key, int 
argc, char **argv)
.argc   = argc,
.argv   = argv,
};
+   struct bpf_map_ext ext = {};
int ret, prog_fd, map_fd;
enum bpf_mode mode;
uint32_t map_key;
@@ -908,7 +965,7 @@ int bpf_graft_map(const char *map_path, uint32_t *key, int 
argc, char **argv)
goto out_prog;
}
 
-   ret = bpf_map_selfcheck_pinned(map_fd, ,
+   ret = bpf_map_selfcheck_pinned(map_fd, , ,
   

[PATCH iproute2 master 0/2] Minor BPF updates

2017-07-21 Thread Daniel Borkmann
Two minor updates to the BPF code, first one makes use of the
recently exposed owner_jited in fdinfo to report whether a
load issue related to tail calls occured, and second one fixes
up custom mount of bpf fs when passed via env.

Thanks!

Daniel Borkmann (2):
  bpf: improve error reporting around tail calls
  bpf: fix mnt path when from env

 lib/bpf.c | 281 +++---
 1 file changed, 214 insertions(+), 67 deletions(-)

-- 
1.9.3



[PATCH iproute2 master 2/2] bpf: fix mnt path when from env

2017-07-21 Thread Daniel Borkmann
When bpf fs mount path is from env, behavior is currently broken as
we continue to search in default paths, thus fix this up.

Signed-off-by: Daniel Borkmann 
---
 lib/bpf.c | 55 +--
 1 file changed, 45 insertions(+), 10 deletions(-)

diff --git a/lib/bpf.c b/lib/bpf.c
index 4f6421a..851742c 100644
--- a/lib/bpf.c
+++ b/lib/bpf.c
@@ -485,6 +485,24 @@ static int bpf_mnt_fs(const char *target)
return 0;
 }
 
+static int bpf_mnt_check_target(const char *target)
+{
+   struct stat sb = {};
+   int ret;
+
+   ret = stat(target, );
+   if (ret) {
+   ret = mkdir(target, S_IRWXU);
+   if (ret) {
+   fprintf(stderr, "mkdir %s failed: %s\n", target,
+   strerror(errno));
+   return ret;
+   }
+   }
+
+   return 0;
+}
+
 static int bpf_valid_mntpt(const char *mnt, unsigned long magic)
 {
struct statfs st_fs;
@@ -497,6 +515,21 @@ static int bpf_valid_mntpt(const char *mnt, unsigned long 
magic)
return 0;
 }
 
+static const char *bpf_find_mntpt_single(unsigned long magic, char *mnt,
+int len, const char *mntpt)
+{
+   int ret;
+
+   ret = bpf_valid_mntpt(mntpt, magic);
+   if (!ret) {
+   strncpy(mnt, mntpt, len - 1);
+   mnt[len - 1] = 0;
+   return mnt;
+   }
+
+   return NULL;
+}
+
 static const char *bpf_find_mntpt(const char *fstype, unsigned long magic,
  char *mnt, int len,
  const char * const *known_mnts)
@@ -508,11 +541,8 @@ static const char *bpf_find_mntpt(const char *fstype, 
unsigned long magic,
if (known_mnts) {
ptr = known_mnts;
while (*ptr) {
-   if (bpf_valid_mntpt(*ptr, magic) == 0) {
-   strncpy(mnt, *ptr, len - 1);
-   mnt[len - 1] = 0;
+   if (bpf_find_mntpt_single(magic, mnt, len, *ptr))
return mnt;
-   }
ptr++;
}
}
@@ -690,6 +720,7 @@ static const char *bpf_get_work_dir(enum bpf_prog_type type)
static char bpf_wrk_dir[PATH_MAX];
static const char *mnt;
static bool bpf_mnt_cached;
+   const char *mnt_env = getenv(BPF_ENV_MNT);
static const char * const bpf_known_mnts[] = {
BPF_DIR_MNT,
"/bpf",
@@ -708,13 +739,17 @@ static const char *bpf_get_work_dir(enum bpf_prog_type 
type)
return out;
}
 
-   mnt = bpf_find_mntpt("bpf", BPF_FS_MAGIC, bpf_tmp, sizeof(bpf_tmp),
-bpf_known_mnts);
+   if (mnt_env)
+   mnt = bpf_find_mntpt_single(BPF_FS_MAGIC, bpf_tmp,
+   sizeof(bpf_tmp), mnt_env);
+   else
+   mnt = bpf_find_mntpt("bpf", BPF_FS_MAGIC, bpf_tmp,
+sizeof(bpf_tmp), bpf_known_mnts);
if (!mnt) {
-   mnt = getenv(BPF_ENV_MNT);
-   if (!mnt)
-   mnt = BPF_DIR_MNT;
-   ret = bpf_mnt_fs(mnt);
+   mnt = mnt_env ? : BPF_DIR_MNT;
+   ret = bpf_mnt_check_target(mnt);
+   if (!ret)
+   ret = bpf_mnt_fs(mnt);
if (ret) {
mnt = NULL;
goto out;
-- 
1.9.3



Re: [PATCH V2 1/4] net-next: dsa: move struct dsa_device_ops to the global header file

2017-07-21 Thread David Miller

A proper patch series submission must always have a proper
"[PATCH 0/N] ..." header posting that explains at a high level
what the patch series does, how it is doing it, and why it is
doing it that way.

Thank you.


Re: [PATCH net-next 1/2] rxrpc: Expose UAPI definitions to userspace

2017-07-21 Thread Joe Perches
On Fri, 2017-07-21 at 19:29 +0100, David Howells wrote:
> Move UAPI definitions from the internal header and place them in a UAPI
> header file so that userspace can make use of them.

Please use git format-patch -M to make the rename clearer.



Re: commit 16ecba59 breaks 82574L under heavy load.

2017-07-21 Thread Lennart Sorensen
On Thu, Jul 20, 2017 at 04:44:55PM -0700, Benjamin Poirier wrote:
> Could you please test the following patch and let me know if it:
> 1) reduces the interrupt rate of the Other msi-x vector
> 2) avoids the link flaps
> or
> 3) logs some dmesg warnings of the form "Other interrupt with unhandled [...]"
> In this case, please paste icr values printed.

By the way, while at fixing the e1000e, I just noticed that
if you are blasting the port with traffic when it comes up,
you risk getting a transmit queue time out, because the queue
is started before the carrier is up.  ixgbe already fixed that in
cdc04dcce0598fead6029a2f95e95a4d2ea419c2.  igb has the same problem (which
goes away by moving the queue start to the watchdog after carrier_on,
I just haven't got around to sending that patch yet).

I am going to try moving the queue start to the watchdog and try it again.

Trace looked like this:

[ cut here ]
WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:316 dev_watchdog+0x1f9/0x200
NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out
Modules linked in: dpi_drv(PO) ccu_util(PO) ipv4_mb(PO) 
l2bridge_config_util(PO) l2_config_util(PO) route_config_util(PO) 
qos_config_util(PO) sysapp_common(PO) chantry_fwd_eng_2800_config(PO) 
shim_module(PO) sadb_cc(PO) ipsecXformer(PO) libeCrypto(PO) ipmatch_cc(PO) 
l2h_cc(PO) ndproxy_cc(PO) arpint_cc(PO) portinfo_cc(PO) chantryqos_cc(PO) 
redirector_cc(PO) ix_ph(PO) fpm_core_cc(PO) pulse_cc(PO) vnstt_cc(PO) 
vnsap_cc(PO) fm_cc(PO) rutm_cc(PO) mutm_cc(PO) ethernet_tx_cc(PO) stkdrv_cc(PO) 
l2bridge_cc(PO) events_util(PO) sched_cc(PO) qm_cc(PO) ipv4_cc(PO) wred_cc(PO) 
tc_meter_cc(PO) dscp_classifier_cc(PO) classifier_6t_cc(PO) ent586_cc(PO) 
dev_cc_arp(PO) chantry_fwd_eng_2800_tables(PO) ether_arp_lib(PO) rtmv4_lib(PO) 
lkup_lib(PO) l2tm_lib(PO) fragmentation_lib(PO) properties_lib(PO) 
msg_support_lib(PO)
 utilities_lib(PO) cci_lib(PO) rm_lib(PO) libossl(O) vip(O) 
productSpec_x86_dp(PO) e1000e
CPU: 0 PID: 0 Comm: swapper/0 Tainted: P   O4.9.24 #20
Hardware name: Supermicro X7SPA-HF/X7SPA-HF, BIOS 1.2a   06/23/12  
  811cef1b 88007fc03e88 
 81037ade  88007fc03ed8 0001
  0082 0001 81037b4c
Call Trace:
  
 [] ? dump_stack+0x46/0x5b
 [] ? __warn+0xbe/0xe0
 [] ? warn_slowpath_fmt+0x4c/0x50
 [] ? mod_timer+0xf2/0x150
 [] ? dev_watchdog+0x1f9/0x200
 [] ? dev_graft_qdisc+0x70/0x70
 [] ? call_timer_fn.isra.26+0x11/0x80
 [] ? run_timer_softirq+0x128/0x150
 [] ? __do_softirq+0xeb/0x1f0
 [] ? irq_exit+0x55/0x60
 [] ? smp_apic_timer_interrupt+0x39/0x50
 [] ? apic_timer_interrupt+0x7c/0x90
  
 [] ? mwait_idle+0x51/0x80
 [] ? cpu_startup_entry+0xa7/0x130
 [] ? start_kernel+0x306/0x30e
---[ end trace ee759b7a56e1110b ]---

-- 
Len Sorensen


[PATCH] netvsc: fix ptr_ret.cocci warnings

2017-07-21 Thread kbuild test robot
drivers/net/hyperv/netvsc_drv.c:737:8-14: WARNING: PTR_ERR_OR_ZERO can be used


 Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR

Generated by: scripts/coccinelle/api/ptr_ret.cocci

Fixes: 9749fed5d43d ("netvsc: use ERR_PTR to avoid dereference issues")
CC: stephen hemminger 
Signed-off-by: Fengguang Wu 
---

 netvsc_drv.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -734,7 +734,7 @@ static int netvsc_set_queues(struct net_
return ret;
 
net_device = rndis_filter_device_add(dev, _info);
-   return IS_ERR(net_device) ? PTR_ERR(net_device) : 0;
+   return PTR_ERR_OR_ZERO(net_device);
 }
 
 static int netvsc_set_channels(struct net_device *net,


Re: [Patch net] net: check mac address length for dev_set_mac_address()

2017-07-21 Thread Cong Wang
On Thu, Jul 20, 2017 at 11:27 AM, Cong Wang  wrote:
> diff --git a/net/core/dev_ioctl.c b/net/core/dev_ioctl.c
> index 82fd4c9c4a1b..3f41601d7b7c 100644
> --- a/net/core/dev_ioctl.c
> +++ b/net/core/dev_ioctl.c
> @@ -262,6 +262,8 @@ static int dev_ifsioc(struct net *net, struct ifreq *ifr, 
> unsigned int cmd)
> return dev_set_mtu(dev, ifr->ifr_mtu);
>
> case SIOCSIFHWADDR:
> +   if (dev->addr_len > sizeof(struct sockaddr))
> +   return -EINVAL;
> return dev_set_mac_address(dev, >ifr_hwaddr);
>

Thinking a bit more, I should keep this patch simpler for -net and -stable,
so only the above piece is necessary, the rest pieces can be put to
-net-next.

I will resend this, and the "team: use a larger struct for mac address".

Thanks.


Re: [PATCH 4/5] e1000e: Separate signaling for link check/link up

2017-07-21 Thread Lennart Sorensen
On Fri, Jul 21, 2017 at 11:36:26AM -0700, Benjamin Poirier wrote:
> Lennart reported the following race condition:
> 
> \ e1000_watchdog_task
> \ e1000e_has_link
> \ hw->mac.ops.check_for_link() === e1000e_check_for_copper_link
> /* link is up */
> mac->get_link_status = false;
> 
> /* interrupt */
> \ e1000_msix_other
> hw->mac.get_link_status = true;
> 
> link_active = !hw->mac.get_link_status
> /* link_active is false, wrongly */
> 
> This problem arises because the single flag get_link_status is used to
> signal two different states: link status needs checking and link status is
> down.
> 
> Avoid the problem by using the return value of .check_for_link to signal
> the link status to e1000e_has_link().
> 
> Reported-by: Lennart Sorensen 
> Signed-off-by: Benjamin Poirier 

This too seems potentially -stable worthy, although with patch 5, the
problem becomes much much less likely to occur.

-- 
Len Sorensen


Re: [PATCH 5/5] e1000e: Avoid receiver overrun interrupt bursts

2017-07-21 Thread Lennart Sorensen
On Fri, Jul 21, 2017 at 11:36:27AM -0700, Benjamin Poirier wrote:
> When e1000e_poll() is not fast enough to keep up with incoming traffic, the
> adapter (when operating in msix mode) raises the Other interrupt to signal
> Receiver Overrun.
> 
> This is a double problem because 1) at the moment e1000_msix_other()
> assumes that it is only called in case of Link Status Change and 2) if the
> condition persists, the interrupt is repeatedly raised again in quick
> succession.
> 
> Ideally we would configure the Other interrupt to not be raised in case of
> receiver overrun but this doesn't seem possible on this adapter. Instead,
> we handle the first part of the problem by reverting to the practice of
> reading ICR in the other interrupt handler, like before commit 16ecba59bc33
> ("e1000e: Do not read ICR in Other interrupt"). Thanks to commit
> 0a8047ac68e5 ("e1000e: Fix msi-x interrupt automask") which cleared IAME
> from CTRL_EXT, reading ICR doesn't interfere with RxQ0, TxQ0 interrupts
> anymore. We handle the second part of the problem by not re-enabling the
> Other interrupt right away when there is overrun. Instead, we wait until
> traffic subsides, napi polling mode is exited and interrupts are
> re-enabled.
> 
> Reported-by: Lennart Sorensen 
> Fixes: 16ecba59bc33 ("e1000e: Do not read ICR in Other interrupt")
> Signed-off-by: Benjamin Poirier 

Any chance of this fix hitting -stable?  After all adapter reset under
load is not nice.

-- 
Len Sorensen


Re: [PATCH net] ipv6: no need to return rt->dst.error if it is not null entry.

2017-07-21 Thread Cong Wang
On Thu, Jul 20, 2017 at 8:23 AM, Hangbin Liu  wrote:
> 2017-07-20 23:06 GMT+08:00 Hangbin Liu :
>>> +++ b/net/ipv6/route.c
>>> @@ -3637,12 +3637,6 @@ static int inet6_rtm_getroute(struct sk_buff 
>>> *in_skb, struct nlmsghdr *nlh,
>>> dst = ip6_route_lookup(net, , 0);
>>>
>>> rt = container_of(dst, struct rt6_info, dst);
>>> -   if (rt->dst.error) {
>>> -   err = rt->dst.error;
>>> -   ip6_rt_put(rt);
>>> -   goto errout;
>>> -   }
>>
>> hmm... or instead of remove this check, should we check all the entry? Like
>> if ((rt->dst.error && rt != net->ipv6.ip6_null_entry && rt !=
> ^^ mistake here
>> net->ipv6.ip6_blk_hole_entry) ||
>>  rt == net->ipv6.ip6_null_entry )
>
> Sorry,  there should be no need to check ip6_null_entry since the
> error is already
> -ENETUNREACH. So how about

Hmm? All of these 3 entries have error set, right??
So we should only check dst.error...


[PATCH 1/5] e1000e: Fix error path in link detection

2017-07-21 Thread Benjamin Poirier
In case of error from e1e_rphy(), the loop will exit early and "success"
will be set to true erroneously.

Signed-off-by: Benjamin Poirier 
---
 drivers/net/ethernet/intel/e1000e/phy.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/phy.c 
b/drivers/net/ethernet/intel/e1000e/phy.c
index d78d47b41a71..86ff0969efb6 100644
--- a/drivers/net/ethernet/intel/e1000e/phy.c
+++ b/drivers/net/ethernet/intel/e1000e/phy.c
@@ -1744,6 +1744,7 @@ s32 e1000e_phy_has_link_generic(struct e1000_hw *hw, u32 
iterations,
s32 ret_val = 0;
u16 i, phy_status;
 
+   *success = false;
for (i = 0; i < iterations; i++) {
/* Some PHYs require the MII_BMSR register to be read
 * twice due to the link bit being sticky.  No harm doing
@@ -1763,16 +1764,16 @@ s32 e1000e_phy_has_link_generic(struct e1000_hw *hw, 
u32 iterations,
ret_val = e1e_rphy(hw, MII_BMSR, _status);
if (ret_val)
break;
-   if (phy_status & BMSR_LSTATUS)
+   if (phy_status & BMSR_LSTATUS) {
+   *success = true;
break;
+   }
if (usec_interval >= 1000)
msleep(usec_interval / 1000);
else
udelay(usec_interval);
}
 
-   *success = (i < iterations);
-
return ret_val;
 }
 
-- 
2.13.2



[PATCH 3/5] e1000e: Fix return value test

2017-07-21 Thread Benjamin Poirier
All the helpers return -E1000_ERR_PHY.

Signed-off-by: Benjamin Poirier 
---
 drivers/net/ethernet/intel/e1000e/netdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c 
b/drivers/net/ethernet/intel/e1000e/netdev.c
index 58a87134d2e5..fc6a1db2 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -5099,7 +5099,7 @@ static bool e1000e_has_link(struct e1000_adapter *adapter)
break;
}
 
-   if ((ret_val == E1000_ERR_PHY) && (hw->phy.type == e1000_phy_igp_3) &&
+   if ((ret_val == -E1000_ERR_PHY) && (hw->phy.type == e1000_phy_igp_3) &&
(er32(CTRL) & E1000_PHY_CTRL_GBE_DISABLE)) {
/* See e1000_kmrn_lock_loss_workaround_ich8lan() */
e_info("Gigabit has been disabled, downgrading speed\n");
-- 
2.13.2



[PATCH 5/5] e1000e: Avoid receiver overrun interrupt bursts

2017-07-21 Thread Benjamin Poirier
When e1000e_poll() is not fast enough to keep up with incoming traffic, the
adapter (when operating in msix mode) raises the Other interrupt to signal
Receiver Overrun.

This is a double problem because 1) at the moment e1000_msix_other()
assumes that it is only called in case of Link Status Change and 2) if the
condition persists, the interrupt is repeatedly raised again in quick
succession.

Ideally we would configure the Other interrupt to not be raised in case of
receiver overrun but this doesn't seem possible on this adapter. Instead,
we handle the first part of the problem by reverting to the practice of
reading ICR in the other interrupt handler, like before commit 16ecba59bc33
("e1000e: Do not read ICR in Other interrupt"). Thanks to commit
0a8047ac68e5 ("e1000e: Fix msi-x interrupt automask") which cleared IAME
from CTRL_EXT, reading ICR doesn't interfere with RxQ0, TxQ0 interrupts
anymore. We handle the second part of the problem by not re-enabling the
Other interrupt right away when there is overrun. Instead, we wait until
traffic subsides, napi polling mode is exited and interrupts are
re-enabled.

Reported-by: Lennart Sorensen 
Fixes: 16ecba59bc33 ("e1000e: Do not read ICR in Other interrupt")
Signed-off-by: Benjamin Poirier 
---
 drivers/net/ethernet/intel/e1000e/defines.h |  1 +
 drivers/net/ethernet/intel/e1000e/netdev.c  | 33 +++--
 2 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/defines.h 
b/drivers/net/ethernet/intel/e1000e/defines.h
index 0641c0098738..afb7ebe20b24 100644
--- a/drivers/net/ethernet/intel/e1000e/defines.h
+++ b/drivers/net/ethernet/intel/e1000e/defines.h
@@ -398,6 +398,7 @@
 #define E1000_ICR_LSC   0x0004 /* Link Status Change */
 #define E1000_ICR_RXSEQ 0x0008 /* Rx sequence error */
 #define E1000_ICR_RXDMT00x0010 /* Rx desc min. threshold (0) */
+#define E1000_ICR_RXO   0x0040 /* Receiver Overrun */
 #define E1000_ICR_RXT0  0x0080 /* Rx timer intr (ring 0) */
 #define E1000_ICR_ECCER 0x0040 /* Uncorrectable ECC Error */
 /* If this bit asserted, the driver should claim the interrupt */
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c 
b/drivers/net/ethernet/intel/e1000e/netdev.c
index 5a8ab1136566..803edd1a6401 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -1910,12 +1910,30 @@ static irqreturn_t e1000_msix_other(int __always_unused 
irq, void *data)
struct net_device *netdev = data;
struct e1000_adapter *adapter = netdev_priv(netdev);
struct e1000_hw *hw = >hw;
+   u32 icr;
+   bool enable = true;
+
+   icr = er32(ICR);
+   if (icr & E1000_ICR_RXO) {
+   ew32(ICR, E1000_ICR_RXO);
+   enable = false;
+   /* napi poll will re-enable Other, make sure it runs */
+   if (napi_schedule_prep(>napi)) {
+   adapter->total_rx_bytes = 0;
+   adapter->total_rx_packets = 0;
+   __napi_schedule(>napi);
+   }
+   }
+   if (icr & E1000_ICR_LSC) {
+   ew32(ICR, E1000_ICR_LSC);
+   hw->mac.get_link_status = true;
+   /* guard against interrupt when we're going down */
+   if (!test_bit(__E1000_DOWN, >state)) {
+   mod_timer(>watchdog_timer, jiffies + 1);
+   }
+   }
 
-   hw->mac.get_link_status = true;
-
-   /* guard against interrupt when we're going down */
-   if (!test_bit(__E1000_DOWN, >state)) {
-   mod_timer(>watchdog_timer, jiffies + 1);
+   if (enable && !test_bit(__E1000_DOWN, >state)) {
ew32(IMS, E1000_IMS_OTHER);
}
 
@@ -2687,7 +2705,8 @@ static int e1000e_poll(struct napi_struct *napi, int 
weight)
napi_complete_done(napi, work_done);
if (!test_bit(__E1000_DOWN, >state)) {
if (adapter->msix_entries)
-   ew32(IMS, adapter->rx_ring->ims_val);
+   ew32(IMS, adapter->rx_ring->ims_val |
+E1000_IMS_OTHER);
else
e1000_irq_enable(adapter);
}
@@ -4204,7 +4223,7 @@ static void e1000e_trigger_lsc(struct e1000_adapter 
*adapter)
struct e1000_hw *hw = >hw;
 
if (adapter->msix_entries)
-   ew32(ICS, E1000_ICS_OTHER);
+   ew32(ICS, E1000_ICS_LSC | E1000_ICS_OTHER);
else
ew32(ICS, E1000_ICS_LSC);
 }
-- 
2.13.2



[PATCH 2/5] e1000e: Fix wrong comment related to link detection

2017-07-21 Thread Benjamin Poirier
Reading e1000e_check_for_copper_link() shows that get_link_status is set to
false after link has been detected. Therefore, it stays TRUE until then.

Signed-off-by: Benjamin Poirier 
---
 drivers/net/ethernet/intel/e1000e/netdev.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c 
b/drivers/net/ethernet/intel/e1000e/netdev.c
index 2dcb5463d9b8..58a87134d2e5 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -5074,7 +5074,7 @@ static bool e1000e_has_link(struct e1000_adapter *adapter)
 
/* get_link_status is set on LSC (link status) interrupt or
 * Rx sequence error interrupt.  get_link_status will stay
-* false until the check_for_link establishes link
+* true until the check_for_link establishes link
 * for copper adapters ONLY
 */
switch (hw->phy.media_type) {
@@ -5092,7 +5092,7 @@ static bool e1000e_has_link(struct e1000_adapter *adapter)
break;
case e1000_media_type_internal_serdes:
ret_val = hw->mac.ops.check_for_link(hw);
-   link_active = adapter->hw.mac.serdes_has_link;
+   link_active = hw->mac.serdes_has_link;
break;
default:
case e1000_media_type_unknown:
-- 
2.13.2



[PATCH 4/5] e1000e: Separate signaling for link check/link up

2017-07-21 Thread Benjamin Poirier
Lennart reported the following race condition:

\ e1000_watchdog_task
\ e1000e_has_link
\ hw->mac.ops.check_for_link() === e1000e_check_for_copper_link
/* link is up */
mac->get_link_status = false;

/* interrupt */
\ e1000_msix_other
hw->mac.get_link_status = true;

link_active = !hw->mac.get_link_status
/* link_active is false, wrongly */

This problem arises because the single flag get_link_status is used to
signal two different states: link status needs checking and link status is
down.

Avoid the problem by using the return value of .check_for_link to signal
the link status to e1000e_has_link().

Reported-by: Lennart Sorensen 
Signed-off-by: Benjamin Poirier 
---
 drivers/net/ethernet/intel/e1000e/mac.c| 11 ---
 drivers/net/ethernet/intel/e1000e/netdev.c |  2 +-
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/mac.c 
b/drivers/net/ethernet/intel/e1000e/mac.c
index b322011ec282..f457c5703d0c 100644
--- a/drivers/net/ethernet/intel/e1000e/mac.c
+++ b/drivers/net/ethernet/intel/e1000e/mac.c
@@ -410,6 +410,9 @@ void e1000e_clear_hw_cntrs_base(struct e1000_hw *hw)
  *  Checks to see of the link status of the hardware has changed.  If a
  *  change in link status has been detected, then we read the PHY registers
  *  to get the current speed/duplex if link exists.
+ *
+ *  Returns a negative error code (-E1000_ERR_*) or 0 (link down) or 1 (link
+ *  up).
  **/
 s32 e1000e_check_for_copper_link(struct e1000_hw *hw)
 {
@@ -423,7 +426,7 @@ s32 e1000e_check_for_copper_link(struct e1000_hw *hw)
 * Change or Rx Sequence Error interrupt.
 */
if (!mac->get_link_status)
-   return 0;
+   return 1;
 
/* First we want to see if the MII Status Register reports
 * link.  If so, then we want to get the current speed/duplex
@@ -461,10 +464,12 @@ s32 e1000e_check_for_copper_link(struct e1000_hw *hw)
 * different link partner.
 */
ret_val = e1000e_config_fc_after_link_up(hw);
-   if (ret_val)
+   if (ret_val) {
e_dbg("Error configuring flow control\n");
+   return ret_val;
+   }
 
-   return ret_val;
+   return 1;
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c 
b/drivers/net/ethernet/intel/e1000e/netdev.c
index fc6a1db2..5a8ab1136566 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -5081,7 +5081,7 @@ static bool e1000e_has_link(struct e1000_adapter *adapter)
case e1000_media_type_copper:
if (hw->mac.get_link_status) {
ret_val = hw->mac.ops.check_for_link(hw);
-   link_active = !hw->mac.get_link_status;
+   link_active = ret_val > 0;
} else {
link_active = true;
}
-- 
2.13.2



[patch net-next] mlxsw: spectrum_router: Don't batch neighbour deletion

2017-07-21 Thread Jiri Pirko
From: Ido Schimmel 

Current firmware supported by the driver doesn't support batch deletion
of IPv6 neighbours on a given router interface (RIF).

Until a new version that supports this functionality is made available,
delete neighbours one by one.

Signed-off-by: Ido Schimmel 
Signed-off-by: Jiri Pirko 
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c | 15 +++
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c 
b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
index e6d629f..548552c 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -1419,25 +1419,16 @@ static void mlxsw_sp_neigh_fini(struct mlxsw_sp 
*mlxsw_sp)
rhashtable_destroy(_sp->router->neigh_ht);
 }
 
-static int mlxsw_sp_neigh_rif_flush(struct mlxsw_sp *mlxsw_sp,
-   const struct mlxsw_sp_rif *rif)
-{
-   char rauht_pl[MLXSW_REG_RAUHT_LEN];
-
-   mlxsw_reg_rauht_pack(rauht_pl, MLXSW_REG_RAUHT_OP_WRITE_DELETE_ALL,
-rif->rif_index, rif->addr);
-   return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(rauht), rauht_pl);
-}
-
 static void mlxsw_sp_neigh_rif_gone_sync(struct mlxsw_sp *mlxsw_sp,
 struct mlxsw_sp_rif *rif)
 {
struct mlxsw_sp_neigh_entry *neigh_entry, *tmp;
 
-   mlxsw_sp_neigh_rif_flush(mlxsw_sp, rif);
list_for_each_entry_safe(neigh_entry, tmp, >neigh_list,
-rif_list_node)
+rif_list_node) {
+   mlxsw_sp_neigh_entry_update(mlxsw_sp, neigh_entry, false);
mlxsw_sp_neigh_entry_destroy(mlxsw_sp, neigh_entry);
+   }
 }
 
 struct mlxsw_sp_nexthop_key {
-- 
2.9.3



[PATCH net-next 0/2] rxrpc: Rearrange headers

2017-07-21 Thread David Howells

Here's a pair of patches that rearrange some of the AF_RXRPC header files
that are outside of the net/rxrpc/ directory:

 (1) The bits userspace need are moved to uapi/linux/rxrpc.h.  [Should this
 be af_rxrpc.h instead, I wonder - but there doesn't seem to be
 precedent for that in the other net UAPI headers.]

 (2) For the most part, the contents of rxrpc/packet.h are no longer used
 outside of the AF_RXRPC module, so move them to net/rxrpc/protocol.h
 with the exception of the standard abort codes which are exposed to
 userspace when an abort occurs and the security index values which are
 needed when constructing keys.

The patches can be found here also:


http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=rxrpc-rewrite

Tagged thusly:

git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git
rxrpc-rewrite-20170721

David
---
David Howells (2):
  rxrpc: Expose UAPI definitions to userspace
  rxrpc: Move the packet.h include file into net/rxrpc/


 fs/afs/misc.c  |1 
 fs/afs/rxrpc.c |1 
 include/linux/rxrpc.h  |   79 ---
 include/rxrpc/packet.h |  235 
 include/uapi/linux/rxrpc.h |  124 +++
 net/rxrpc/ar-internal.h|2 
 net/rxrpc/protocol.h   |  190 
 7 files changed, 315 insertions(+), 317 deletions(-)
 delete mode 100644 include/linux/rxrpc.h
 delete mode 100644 include/rxrpc/packet.h
 create mode 100644 include/uapi/linux/rxrpc.h
 create mode 100644 net/rxrpc/protocol.h



[PATCH net-next 1/2] rxrpc: Expose UAPI definitions to userspace

2017-07-21 Thread David Howells
Move UAPI definitions from the internal header and place them in a UAPI
header file so that userspace can make use of them.

Signed-off-by: David Howells 
---

 include/linux/rxrpc.h  |   79 ---
 include/uapi/linux/rxrpc.h |   80 
 2 files changed, 80 insertions(+), 79 deletions(-)
 delete mode 100644 include/linux/rxrpc.h
 create mode 100644 include/uapi/linux/rxrpc.h

diff --git a/include/linux/rxrpc.h b/include/linux/rxrpc.h
deleted file mode 100644
index 7343f71783dc..
--- a/include/linux/rxrpc.h
+++ /dev/null
@@ -1,79 +0,0 @@
-/* AF_RXRPC parameters
- *
- * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
- * Written by David Howells (dhowe...@redhat.com)
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version
- * 2 of the License, or (at your option) any later version.
- */
-
-#ifndef _LINUX_RXRPC_H
-#define _LINUX_RXRPC_H
-
-#include 
-#include 
-
-/*
- * RxRPC socket address
- */
-struct sockaddr_rxrpc {
-   sa_family_t srx_family; /* address family */
-   u16 srx_service;/* service desired */
-   u16 transport_type; /* type of transport socket 
(SOCK_DGRAM) */
-   u16 transport_len;  /* length of transport address */
-   union {
-   sa_family_t family; /* transport address family */
-   struct sockaddr_in sin; /* IPv4 transport address */
-   struct sockaddr_in6 sin6;   /* IPv6 transport address */
-   } transport;
-};
-
-/*
- * RxRPC socket options
- */
-#define RXRPC_SECURITY_KEY 1   /* [clnt] set client security 
key */
-#define RXRPC_SECURITY_KEYRING 2   /* [srvr] set ring of server 
security keys */
-#define RXRPC_EXCLUSIVE_CONNECTION 3   /* Deprecated; use 
RXRPC_EXCLUSIVE_CALL instead */
-#define RXRPC_MIN_SECURITY_LEVEL   4   /* minimum security level */
-#define RXRPC_UPGRADEABLE_SERVICE  5   /* Upgrade service[0] -> 
service[1] */
-#define RXRPC_SUPPORTED_CMSG   6   /* Get highest supported 
control message type */
-
-/*
- * RxRPC control messages
- * - If neither abort or accept are specified, the message is a data message.
- * - terminal messages mean that a user call ID tag can be recycled
- * - s/r/- indicate whether these are applicable to sendmsg() and/or recvmsg()
- */
-enum rxrpc_cmsg_type {
-   RXRPC_USER_CALL_ID  = 1,/* sr: user call ID specifier */
-   RXRPC_ABORT = 2,/* sr: abort request / notification 
[terminal] */
-   RXRPC_ACK   = 3,/* -r: [Service] RPC op final ACK 
received [terminal] */
-   RXRPC_NET_ERROR = 5,/* -r: network error received 
[terminal] */
-   RXRPC_BUSY  = 6,/* -r: server busy received [terminal] 
*/
-   RXRPC_LOCAL_ERROR   = 7,/* -r: local error generated [terminal] 
*/
-   RXRPC_NEW_CALL  = 8,/* -r: [Service] new incoming call 
notification */
-   RXRPC_ACCEPT= 9,/* s-: [Service] accept request */
-   RXRPC_EXCLUSIVE_CALL= 10,   /* s-: Call should be on exclusive 
connection */
-   RXRPC_UPGRADE_SERVICE   = 11,   /* s-: Request service upgrade for 
client call */
-   RXRPC_TX_LENGTH = 12,   /* s-: Total length of Tx data */
-   RXRPC__SUPPORTED
-};
-
-/*
- * RxRPC security levels
- */
-#define RXRPC_SECURITY_PLAIN   0   /* plain secure-checksummed packets 
only */
-#define RXRPC_SECURITY_AUTH1   /* authenticated packets */
-#define RXRPC_SECURITY_ENCRYPT 2   /* encrypted packets */
-
-/*
- * RxRPC security indices
- */
-#define RXRPC_SECURITY_NONE0   /* no security protocol */
-#define RXRPC_SECURITY_RXKAD   2   /* kaserver or kerberos 4 */
-#define RXRPC_SECURITY_RXGK4   /* gssapi-based */
-#define RXRPC_SECURITY_RXK55   /* kerberos 5 */
-
-#endif /* _LINUX_RXRPC_H */
diff --git a/include/uapi/linux/rxrpc.h b/include/uapi/linux/rxrpc.h
new file mode 100644
index ..08e2fb9c70ae
--- /dev/null
+++ b/include/uapi/linux/rxrpc.h
@@ -0,0 +1,80 @@
+/* Types and definitions for AF_RXRPC.
+ *
+ * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowe...@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public Licence
+ * as published by the Free Software Foundation; either version
+ * 2 of the Licence, or (at your option) any later version.
+ */
+
+#ifndef _UAPI_LINUX_RXRPC_H
+#define _UAPI_LINUX_RXRPC_H
+
+#include 
+#include 
+#include 
+
+/*
+ * RxRPC socket address
+ */
+struct sockaddr_rxrpc {
+   sa_family_t srx_family; /* address family */
+   u16 

[PATCH net-next 2/2] rxrpc: Move the packet.h include file into net/rxrpc/

2017-07-21 Thread David Howells
Move the protocol description header file into net/rxrpc/ and rename it to
protocol.h.  It's no longer necessary to expose it as packets are no longer
exposed to kernel services (such as AFS) that use the facility.

The abort codes are transferred to the UAPI header instead as we pass these
back to userspace and also to kernel services.

Signed-off-by: David Howells 
---

 fs/afs/misc.c  |1 
 fs/afs/rxrpc.c |1 
 include/rxrpc/packet.h |  235 
 include/uapi/linux/rxrpc.h |   44 
 net/rxrpc/ar-internal.h|2 
 net/rxrpc/protocol.h   |  190 
 6 files changed, 235 insertions(+), 238 deletions(-)
 delete mode 100644 include/rxrpc/packet.h
 create mode 100644 net/rxrpc/protocol.h

diff --git a/fs/afs/misc.c b/fs/afs/misc.c
index 100b207efc9e..c05f1f1c0d41 100644
--- a/fs/afs/misc.c
+++ b/fs/afs/misc.c
@@ -12,7 +12,6 @@
 #include 
 #include 
 #include 
-#include 
 #include "internal.h"
 #include "afs_fs.h"
 
diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
index 02781e78ffb6..10743043d431 100644
--- a/fs/afs/rxrpc.c
+++ b/fs/afs/rxrpc.c
@@ -14,7 +14,6 @@
 
 #include 
 #include 
-#include 
 #include "internal.h"
 #include "afs_cm.h"
 
diff --git a/include/rxrpc/packet.h b/include/rxrpc/packet.h
deleted file mode 100644
index a2dcfb850b9f..
--- a/include/rxrpc/packet.h
+++ /dev/null
@@ -1,235 +0,0 @@
-/* packet.h: Rx packet layout and definitions
- *
- * Copyright (C) 2002, 2007 Red Hat, Inc. All Rights Reserved.
- * Written by David Howells (dhowe...@redhat.com)
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version
- * 2 of the License, or (at your option) any later version.
- */
-
-#ifndef _LINUX_RXRPC_PACKET_H
-#define _LINUX_RXRPC_PACKET_H
-
-typedef u32rxrpc_seq_t;/* Rx message sequence number */
-typedef u32rxrpc_serial_t; /* Rx message serial number */
-typedef __be32 rxrpc_seq_net_t; /* on-the-wire Rx message sequence number */
-typedef __be32 rxrpc_serial_net_t; /* on-the-wire Rx message serial number */
-
-/*/
-/*
- * on-the-wire Rx packet header
- * - all multibyte fields should be in network byte order
- */
-struct rxrpc_wire_header {
-   __be32  epoch;  /* client boot timestamp */
-#define RXRPC_RANDOM_EPOCH 0x8000  /* Random if set, date-based if 
not */
-
-   __be32  cid;/* connection and channel ID */
-#define RXRPC_MAXCALLS 4   /* max active calls per 
conn */
-#define RXRPC_CHANNELMASK  (RXRPC_MAXCALLS-1)  /* mask for channel ID 
*/
-#define RXRPC_CIDMASK  (~RXRPC_CHANNELMASK)/* mask for connection 
ID */
-#define RXRPC_CIDSHIFT ilog2(RXRPC_MAXCALLS)   /* shift for connection 
ID */
-#define RXRPC_CID_INC  (1 << RXRPC_CIDSHIFT)   /* connection ID 
increment */
-
-   __be32  callNumber; /* call ID (0 for connection-level 
packets) */
-   __be32  seq;/* sequence number of pkt in call 
stream */
-   __be32  serial; /* serial number of pkt sent to network 
*/
-
-   uint8_t type;   /* packet type */
-#define RXRPC_PACKET_TYPE_DATA 1   /* data */
-#define RXRPC_PACKET_TYPE_ACK  2   /* ACK */
-#define RXRPC_PACKET_TYPE_BUSY 3   /* call reject */
-#define RXRPC_PACKET_TYPE_ABORT4   /* call/connection 
abort */
-#define RXRPC_PACKET_TYPE_ACKALL   5   /* ACK all outstanding packets 
on call */
-#define RXRPC_PACKET_TYPE_CHALLENGE6   /* connection security 
challenge (SRVR->CLNT) */
-#define RXRPC_PACKET_TYPE_RESPONSE 7   /* connection secutity response 
(CLNT->SRVR) */
-#define RXRPC_PACKET_TYPE_DEBUG8   /* debug info request */
-#define RXRPC_PACKET_TYPE_VERSION  13  /* version string request */
-#define RXRPC_N_PACKET_TYPES   14  /* number of packet types (incl 
type 0) */
-
-   uint8_t flags;  /* packet flags */
-#define RXRPC_CLIENT_INITIATED 0x01/* signifies a packet generated 
by a client */
-#define RXRPC_REQUEST_ACK  0x02/* request an unconditional ACK 
of this packet */
-#define RXRPC_LAST_PACKET  0x04/* the last packet from this 
side for this call */
-#define RXRPC_MORE_PACKETS 0x08/* more packets to come */
-#define RXRPC_JUMBO_PACKET 0x20/* [DATA] this is a jumbo 
packet */
-#define RXRPC_SLOW_START_OK0x20/* [ACK] slow start supported */
-
-   uint8_t userStatus; /* app-layer defined status */
-#define RXRPC_USERSTATUS_SERVICE_UPGRADE 0x01  /* AuriStor service 

Re: [PATCH v2 3/4] net: ethernet: nb8800: Fix RGMII TX clock delay setup

2017-07-21 Thread Florian Fainelli
On 07/21/2017 07:24 AM, Måns Rullgård wrote:
> Marc Gonzalez  writes:
> 
>> On 21/07/2017 15:47, Måns Rullgård wrote:
>>
>>> Marc Gonzalez wrote:
>>>
 On 21/07/2017 15:04, Måns Rullgård wrote:

> Marc Gonzalez wrote:
>
>> According to commit e5f3a4a56ce2a707b2fb8ce37e4414dcac89c672
>> ("Documentation: devicetree: clarify usage of the RGMII phy-modes")
>> there are 4 RGMII modes to handle:
>>
>> "rgmii" (RX and TX delays are added by the MAC when required)
>> "rgmii-id" (RGMII with internal RX and TX delays provided by the PHY,
>>  the MAC should not add the RX or TX delays in this case)
>> "rgmii-rxid" (RGMII with internal RX delay provided by the PHY,
>>  the MAC should not add an RX delay in this case)
>> "rgmii-txid" (RGMII with internal TX delay provided by the PHY,
>>  the MAC should not add an TX delay in this case)
>>
>> Add TX delay in the MAC only for rgmii and rgmii-rxid.
>>
>> Signed-off-by: Marc Gonzalez 
>> ---
>>  drivers/net/ethernet/aurora/nb8800.c | 6 --
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/aurora/nb8800.c 
>> b/drivers/net/ethernet/aurora/nb8800.c
>> index ded041dbafe7..f3ed320eb4ad 100644
>> --- a/drivers/net/ethernet/aurora/nb8800.c
>> +++ b/drivers/net/ethernet/aurora/nb8800.c
>> @@ -1268,11 +1268,13 @@ static int nb8800_tangox_init(struct net_device 
>> *dev)
>>  break;
>>
>>  case PHY_INTERFACE_MODE_RGMII:
>> -pad_mode = PAD_MODE_RGMII;
>> +case PHY_INTERFACE_MODE_RGMII_RXID:
>> +pad_mode = PAD_MODE_RGMII | PAD_MODE_GTX_CLK_DELAY;
>>  break;
>>
>> +case PHY_INTERFACE_MODE_RGMII_ID:
>>  case PHY_INTERFACE_MODE_RGMII_TXID:
>> -pad_mode = PAD_MODE_RGMII | PAD_MODE_GTX_CLK_DELAY;
>> +pad_mode = PAD_MODE_RGMII;
>>  break;
>>
>>  default:
>
> I still don't like this.  Having both the MAC and PHY drivers react to
> the phy-connection-type property is bound to cause trouble somewhere.
>
> The only way out of the current mess is to define new properties for
> both MAC and PHY that override the existing ones if present.

 Do you mean defining 4 new bindings and their corresponding
 phy_interface_t enum values? For example:

 "rgmii-v2"
 "rgmii-id-v2"
 "rgmii-rxid-v2"
 "rgmii-txid-v2"

PHY_INTERFACE_MODE_RGMII_V2,
PHY_INTERFACE_MODE_RGMII_ID_V2,
PHY_INTERFACE_MODE_RGMII_RXID_V2,
PHY_INTERFACE_MODE_RGMII_TXID_V2,

 And then handling these new enums in the at803x and nb8800 drivers?
>>>
>>> It has already been suggested to add new properties specifying desired
>>> delays in picoseconds.  If present on the MAC node, the MAC driver
>>> should attempt to provide the delay, and if present on the PHY node, the
>>> PHY driver is responsible.
>>
>> Sorry, I had already forgotten about Florian's suggestion:
>>> If you introduced PHY and/or MAC specific properties to configure the
>>> delays in the appropriate unit of time (say ps), you could use a
>>> non-compliant 'phy-mode' just to satisfy the driver/PHY library and
>>> still override the delays you need.
>>
>> So we would need two properties (RX and TX).
>> "rgmii_rx_skew_ps" and "rgmii_tx_skew_ps"
>>
>> but it's not clear to me how the MAC probe function communicates
>> the arguments to the phy driver. Looks like we would need to add
>> two fields to struct phy_device, and maybe define a new phy-mode
>> to instruct the PHY driver to look for the two fields.
> 
> There's no need for the drivers to communicate.  The location of the
> properties in the device tree determines which driver should deal with
> it.

Exactly. I really like how the PHY delay properties are defined in
Documentation/devicetree/bindings/net/micrel-ksz90x1.txt because they
are quite generic and for a MAC counterpart those defined in
Documentation/devicetree/bindings/net/dwmac-sun8i.txt and
Documentation/devicetree/bindings/net/meson-dwmac.txt are also good
examples.

> 
>> I don't have time to work on that for now, but I do need to
>> fix the nb8800 driver now. Can we apply the following patch
>> in the interim?
>>
>> diff --git a/drivers/net/ethernet/aurora/nb8800.c 
>> b/drivers/net/ethernet/aurora/nb8800.c
>> index ded041dbafe7..e94159507847 100644
>> --- a/drivers/net/ethernet/aurora/nb8800.c
>> +++ b/drivers/net/ethernet/aurora/nb8800.c
>> @@ -1268,11 +1268,10 @@ static int nb8800_tangox_init(struct net_device *dev)
>> break;
>>
>> case PHY_INTERFACE_MODE_RGMII:
>> -   pad_mode = PAD_MODE_RGMII;
>> -   break;
>> -
>> +   case PHY_INTERFACE_MODE_RGMII_ID:
>> +   case 

Re: [PATCH v2 2/4] net: ethernet: nb8800: Set RGMII_MODE for all RGMII modes

2017-07-21 Thread Florian Fainelli
On 07/21/2017 04:25 AM, Marc Gonzalez wrote:
> There are 4 RGMII phy-modes to handle.

... so make sure that the MAC is configured to set the RGMII_MODE
accordingly for all 4 RGMII mode.

> 
> Signed-off-by: Marc Gonzalez 

Fixes: 52dfc8301248 ("net: ethernet: add driver for Aurora VLSI NB8800
Ethernet controller")

Reviewed-by: Florian Fainelli 
> ---
>  drivers/net/ethernet/aurora/nb8800.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/aurora/nb8800.c 
> b/drivers/net/ethernet/aurora/nb8800.c
> index 041cfb7952f8..ded041dbafe7 100644
> --- a/drivers/net/ethernet/aurora/nb8800.c
> +++ b/drivers/net/ethernet/aurora/nb8800.c
> @@ -609,7 +609,7 @@ static void nb8800_mac_config(struct net_device *dev)
>   mac_mode |= HALF_DUPLEX;
>  
>   if (gigabit) {
> - if (priv->phy_mode == PHY_INTERFACE_MODE_RGMII)
> + if (phy_interface_is_rgmii(dev->phydev))
>   mac_mode |= RGMII_MODE;
>  
>   mac_mode |= GMAC_MODE;
> 


-- 
Florian


Re: A buggy behavior for Linux TCP Reno and HTCP

2017-07-21 Thread Yuchung Cheng
On Thu, Jul 20, 2017 at 2:28 PM, Wei Sun  wrote:
> Hi Yuchung,
>
> Sorry for the confusion.  The test case was adapted from an old DSACK
> test case (i.e., forget to remove something).
>
> Attached is a new and simple one. Thanks
Note that the test scenario is fairly rare IMO: the connection first
experience timeouts, then its retransmission got acked, then the
original packets get Acked (ack w/ val 1400 ecr 130). It can be really
long reordering, or reordering plus packet loss.

The Linux undo state machines may not handle this perfectly, but it's
probably not worth extra state for such rare events.

>
>
>
> On Wed, Jul 19, 2017 at 2:31 PM, Yuchung Cheng  wrote:
>> On Tue, Jul 18, 2017 at 2:36 PM, Wei Sun  wrote:
>>> Hi there,
>>>
>>> We find a buggy behavior when using Linux TCP Reno and HTCP in low
>>> bandwidth or highly congested network environments.
>>>
>>> In a simple word, their undo functions may mistakenly double the cwnd,
>>> leading to a more aggressive behavior in a highly congested scenario.
>>>
>>>
>>> The detailed reason:
>>>
>>> The current reno undo function assumes cwnd halving (and thus doubles
>>> the cwnd), but it doesn't consider a corner case condition that
>>> ssthresh is at least 2.
>>>
>>> e.g.,
>>>  cwnd  ssth
>>> An initial state: 25
>>> A spurious loss:   12
>>> Undo:   45
>>>
>>> Here the cwnd after undo is two times as that before undo. Attached is
>>> a simple script to reproduce it.
>> the packetdrill script is a bit confusing: it disables SACK but then
>> the client returns ACK w/ SACKs, also 3 dupacks happen after RTO so
>> the sender isn't technically going through a fast recovery...
>>
>> could you provide a better test?
>>
>>>
>>> A similar reason for HTCP, so we recommend to store the cwnd on loss
>>> in .ssthresh implementation and restore it again in .undo_cwnd for TCP
>>> Reno and HTCP implementations.
>>>
>>> Thanks


Re: [RFC 1/2] net-next: add a dma_desc element to struct skb_shared_info

2017-07-21 Thread John Crispin



On 21/07/17 17:56, Paolo Abeni wrote:

Hi,

On Fri, 2017-07-21 at 17:20 +0200, John Crispin wrote:

In order to make HW flow offloading work in latest MediaTek silicon we need
to propagate part of the RX DMS descriptor to the upper layers populating
the flow offload engines HW tables. This patch adds an extra element to
struct skb_shared_info allowing the ethernet drivers RX napi code to store
the required information and make it persistent for the lifecycle of the
skb and its clones.

Signed-off-by: John Crispin 
---
  include/linux/skbuff.h | 1 +
  1 file changed, 1 insertion(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4093552be1de..db9576cd946b 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -426,6 +426,7 @@ struct skb_shared_info {
unsigned intgso_type;
u32 tskey;
__be32  ip6_frag_id;
+   u32 dma_desc;
  
  	/*

 * Warning : all fields before dataref are cleared in __alloc_skb()

This will increase the skb_shared_info struct size, which is already
quite large, and can have several kind of performance drawback.
AFAIK this is discouraged.

I don't understand the use case; the driver will set this field, but
who is going to consume it?

Thanks,

Paolo

Hi Paolo,

When the flow offloading engine forwards a packet to the DMA it will 
send additional info to the sw path. this includes

* physical switch port
* internal flow hash - this is required to populate the correct flow 
table entry
* ppe state - this indicates what state the PPEs internal table is in 
for the flow
* the reason why the packet was forwarde - these are things like bind, 
unbind, timed out, ...


once the flow table offloading patches are ready and upstream, the 
netfilter layer will see the SKB and pass it o to the flow table 
offloading code, at which point it will finally end up inside the 
offloading driver. this will need to have access to this info sent to 
the sw path inside the rx descriptor to properly work out what state the 
flow is in and which table entry to populate in the HW table for 
offloading to work.


Hope that is a little clearer. current hackish driver is here [1], the 
patch to the ethernet driver is here [2]


John

[1] 
https://git.lede-project.org/?p=lede/blogic/staging.git;a=tree;f=target/linux/mediatek/files/drivers/net/ethernet/mediatek/mtk_hnat;hb=bc0518b9d928b43d965d8a1f8860281f0ae6a31c
[2] 
https://git.lede-project.org/?p=lede/blogic/staging.git;a=blob;f=target/linux/mediatek/patches-4.9/0310-hwnat.patch;h=57bd0c07b39d2169f3ba08e1aa83b92dffcee025;hb=bc0518b9d928b43d965d8a1f8860281f0ae6a31c




Re: [patch net-next] mlxsw: spectrum_router: Don't ignore IPv6 notifications

2017-07-21 Thread Ido Schimmel
On Fri, Jul 21, 2017 at 06:05:23PM +0200, Jiri Pirko wrote:
> From: Ido Schimmel 
> 
> We now have all the necessary IPv6 infrastructure in place, so stop
> ignoring these notifications.
> 
> Signed-off-by: Ido Schimmel 
> Signed-off-by: Jiri Pirko 

Dave, we sent the wrong patch. Please ignore.

Thanks


[PATCH net] net/socket: fix type in assignment and trim long line

2017-07-21 Thread Paolo Abeni
The commit ffb07550c76f ("copy_msghdr_from_user(): get rid of
field-by-field copyin") introduce a new sparse warning:

net/socket.c:1919:27: warning: incorrect type in assignment (different address 
spaces)
net/socket.c:1919:27:expected void *msg_control
net/socket.c:1919:27:got void [noderef] *[addressable] msg_control

and a line above 80 chars, let's fix them

Fixes: ffb07550c76f ("copy_msghdr_from_user(): get rid of field-by-field 
copyin")
Signed-off-by: Paolo Abeni 
---
I noticed only because I had a very similar commit pending;
replacing the field-by-field copyin with a single copy_from_user()
gives measurable performance improvements - more than 10% under
UDP flood if using recvmsg() to sink the traffic
---
 net/socket.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/socket.c b/net/socket.c
index bf21226..ad22df1 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1916,7 +1916,7 @@ static int copy_msghdr_from_user(struct msghdr *kmsg,
if (copy_from_user(, umsg, sizeof(*umsg)))
return -EFAULT;
 
-   kmsg->msg_control = msg.msg_control;
+   kmsg->msg_control = (void __force *)msg.msg_control;
kmsg->msg_controllen = msg.msg_controllen;
kmsg->msg_flags = msg.msg_flags;
 
@@ -1935,7 +1935,8 @@ static int copy_msghdr_from_user(struct msghdr *kmsg,
 
if (msg.msg_name && kmsg->msg_namelen) {
if (!save_addr) {
-   err = move_addr_to_kernel(msg.msg_name, 
kmsg->msg_namelen,
+   err = move_addr_to_kernel(msg.msg_name,
+ kmsg->msg_namelen,
  kmsg->msg_name);
if (err < 0)
return err;
-- 
2.9.4



Re: [PATCHv2 net] ipv6: should not return rt->dst.error if it is prohibit or blk hole entry.

2017-07-21 Thread kbuild test robot
Hi Hangbin,

[auto build test WARNING on net/master]

url:
https://github.com/0day-ci/linux/commits/Hangbin-Liu/ipv6-should-not-return-rt-dst-error-if-it-is-prohibit-or-blk-hole-entry/20170721-204554
config: x86_64-randconfig-x003-07211556 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All warnings (new ones prefixed by >>):

   In file included from include/uapi/linux/stddef.h:1:0,
from include/linux/stddef.h:4,
from include/uapi/linux/posix_types.h:4,
from include/uapi/linux/types.h:13,
from include/linux/types.h:5,
from include/uapi/linux/capability.h:16,
from include/linux/capability.h:15,
from net/ipv6/route.c:29:
   net/ipv6/route.c: In function 'inet6_rtm_getroute':
   net/ipv6/route.c:3640:38: error: 'struct netns_ipv6' has no member named 
'ip6_prohibit_entry'; did you mean 'ip6_null_entry'?
 if (rt->dst.error && rt != net->ipv6.ip6_prohibit_entry &&
 ^
   include/linux/compiler.h:156:30: note: in definition of macro '__trace_if'
 if (__builtin_constant_p(!!(cond)) ? !!(cond) :   \
 ^~~~
>> net/ipv6/route.c:3640:2: note: in expansion of macro 'if'
 if (rt->dst.error && rt != net->ipv6.ip6_prohibit_entry &&
 ^~
   net/ipv6/route.c:3641:21: error: 'struct netns_ipv6' has no member named 
'ip6_blk_hole_entry'; did you mean 'ip6_null_entry'?
 rt != net->ipv6.ip6_blk_hole_entry) {
^
   include/linux/compiler.h:156:30: note: in definition of macro '__trace_if'
 if (__builtin_constant_p(!!(cond)) ? !!(cond) :   \
 ^~~~
>> net/ipv6/route.c:3640:2: note: in expansion of macro 'if'
 if (rt->dst.error && rt != net->ipv6.ip6_prohibit_entry &&
 ^~
   net/ipv6/route.c:3640:38: error: 'struct netns_ipv6' has no member named 
'ip6_prohibit_entry'; did you mean 'ip6_null_entry'?
 if (rt->dst.error && rt != net->ipv6.ip6_prohibit_entry &&
 ^
   include/linux/compiler.h:156:42: note: in definition of macro '__trace_if'
 if (__builtin_constant_p(!!(cond)) ? !!(cond) :   \
 ^~~~
>> net/ipv6/route.c:3640:2: note: in expansion of macro 'if'
 if (rt->dst.error && rt != net->ipv6.ip6_prohibit_entry &&
 ^~
   net/ipv6/route.c:3641:21: error: 'struct netns_ipv6' has no member named 
'ip6_blk_hole_entry'; did you mean 'ip6_null_entry'?
 rt != net->ipv6.ip6_blk_hole_entry) {
^
   include/linux/compiler.h:156:42: note: in definition of macro '__trace_if'
 if (__builtin_constant_p(!!(cond)) ? !!(cond) :   \
 ^~~~
>> net/ipv6/route.c:3640:2: note: in expansion of macro 'if'
 if (rt->dst.error && rt != net->ipv6.ip6_prohibit_entry &&
 ^~
   net/ipv6/route.c:3640:38: error: 'struct netns_ipv6' has no member named 
'ip6_prohibit_entry'; did you mean 'ip6_null_entry'?
 if (rt->dst.error && rt != net->ipv6.ip6_prohibit_entry &&
 ^
   include/linux/compiler.h:167:16: note: in definition of macro '__trace_if'
  __r = !!(cond); \
   ^~~~
>> net/ipv6/route.c:3640:2: note: in expansion of macro 'if'
 if (rt->dst.error && rt != net->ipv6.ip6_prohibit_entry &&
 ^~
   net/ipv6/route.c:3641:21: error: 'struct netns_ipv6' has no member named 
'ip6_blk_hole_entry'; did you mean 'ip6_null_entry'?
 rt != net->ipv6.ip6_blk_hole_entry) {
^
   include/linux/compiler.h:167:16: note: in definition of macro '__trace_if'
  __r = !!(cond); \
   ^~~~
>> net/ipv6/route.c:3640:2: note: in expansion of macro 'if'
 if (rt->dst.error && rt != net->ipv6.ip6_prohibit_entry &&
 ^~
   net/ipv6/route.c: At top level:
   include/linux/compiler.h:162:4: warning: '__f' is static but declared in 
inline function 'strcpy' which is not static
   __f = { \
   ^
   include/linux/compiler.h:154:23: note: in expansion of macro '__trace_if'
#define if(cond, ...) __trace_if( (cond , ## __VA_ARGS__) )
  ^~
   include/linux/string.h:390:2: note: in expansion of macro 'if'
 if (p_size == (size_t)-1 && q_size == (size_t)-1)
 ^~
   include/linux/compiler.h:162:4: warning: '__f' is static but declared in 
inline function 'kmemdup' which is not static
   __f = { \
   ^
   include/linux/compiler.h:1

pull-request: wireless-drivers 2017-07-21

2017-07-21 Thread Kalle Valo
Hi Dave,

important fixes for net which had accumulated while I was away. I only
applied the brcmfmac and rtlwifi patches only eight hours ago and I
haven't seen the kbuild report yet so they might have some build
breakage in theory. But the patches are so small so the chances that
they would break something are really small and so I send this to you
already now to not delay them any longer.

Please let me know if there are any problems.

Kalle

The following changes since commit 96080f697786e0a30006fcbcc5b53f350fcb3e9f:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2017-07-20 
16:33:39 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers.git 
tags/wireless-drivers-for-davem-2017-07-21

for you to fetch changes up to d755cbc26e8295ae8e5d30425364e093b4247a85:

  Merge tag 'iwlwifi-for-kalle-2017-07-21' of 
git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes (2017-07-21 
14:33:27 +0300)


wireless-drivers fixes for 4.13

Important, but small in size, fixes.

brcmfmac

* fix a regression in SDIO support introduced in v4.13-rc1

rtlwifi

* fix a regression in bluetooth coexistance introduced in v4.13-rc1

iwlwifi

* a few NULL pointer dereferences in the recovery flow

* a small but important fix for IBSS

* a one-liner fix for tracing, which was including too much data


Arend Van Spriel (1):
  brcmfmac: fix regression in brcmf_sdio_txpkt_hdalign()

Dan Carpenter (1):
  iwlwifi: missing error code in iwl_trans_pcie_alloc()

Emmanuel Grumbach (3):
  iwlwifi: dvm: prevent an out of bounds access
  iwlwifi: mvm: fix a NULL pointer dereference of error in recovery
  iwlwifi: fix tracing when tx only is enabled

Johannes Berg (1):
  iwlwifi: mvm: defer setting IWL_MVM_STATUS_IN_HW_RESTART

Kalle Valo (1):
  Merge tag 'iwlwifi-for-kalle-2017-07-21' of 
git://git.kernel.org/.../iwlwifi/iwlwifi-fixes

Larry Finger (1):
  Revert "rtlwifi: btcoex: rtl8723be: fix ant_sel not work"

Luca Coelho (1):
  iwlwifi: mvm: handle IBSS probe_queue in a few missing places

Mordechai Goodstein (1):
  iwlwifi: pcie: fix unused txq NULL pointer dereference

 drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c |  3 ++-
 drivers/net/wireless/intel/iwlwifi/dvm/tx.c |  2 +-
 drivers/net/wireless/intel/iwlwifi/iwl-devtrace.h   |  4 ++--
 drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c   |  8 +++-
 drivers/net/wireless/intel/iwlwifi/mvm/mvm.h|  2 ++
 drivers/net/wireless/intel/iwlwifi/mvm/ops.c|  6 +++---
 drivers/net/wireless/intel/iwlwifi/mvm/sta.c| 15 ++-
 drivers/net/wireless/intel/iwlwifi/pcie/trans.c |  3 ++-
 drivers/net/wireless/intel/iwlwifi/pcie/tx.c|  3 +++
 drivers/net/wireless/realtek/rtlwifi/rtl8723be/hw.c |  3 ---
 drivers/net/wireless/realtek/rtlwifi/wifi.h |  1 -
 11 files changed, 36 insertions(+), 14 deletions(-)

-- 
Kalle Valo


Re: commit 16ecba59 breaks 82574L under heavy load.

2017-07-21 Thread Lennart Sorensen
On Fri, Jul 21, 2017 at 11:27:09AM -0400,  wrote:
> On Thu, Jul 20, 2017 at 04:44:55PM -0700, Benjamin Poirier wrote:
> > Could you please test the following patch and let me know if it:
> > 1) reduces the interrupt rate of the Other msi-x vector
> > 2) avoids the link flaps
> > or
> > 3) logs some dmesg warnings of the form "Other interrupt with unhandled 
> > [...]"
> > In this case, please paste icr values printed.
> 
> I will give it a try.

So test looks excellent.  Seems to only get interrupts when link state
actually changes now.

> Another odd behaviour I see is that the driver will hang in
> napi_synchronize on shutdown if there is traffic at the time (at least
> I think that's the trigger, maybe the trigger is if there has been an
> overload of traffic and the backlog in napi was used).
> 
> From doing some searching, this seems to be a problem that has plagued
> some people for years with this driver.
> 
> I am having trouble figuring out exactly what napi_synchronize is waiting
> for and who is supposed to toggle the flag it is waiting on.  The flag
> appears to work backwards from what I would have expected it to do.
> I see lots of places that can set the bit, but only napi_enable seems
> to clear it again, and I don't see how that would get called for all
> the places that potentially set the bit.

I just realized NAPI_STATE_SCHED and NAPIF_STATE_SCHED are the same
thing and I need to look at both of those.

Still something seems odd in some corner case where napi gets stuck and
you can't close the port anymore due to napi_synchronize never being
able to finish.  Some traffic pattern causes that SCHED state bit to
get into the wrong state and nothing ever clears it.  Even managed to
see it get stuck so it never passed traffic again and hung on shutdown.
The napi poll was never called again.

-- 
Len Sorensen


[patch net] mlxsw: spectrum_router: Don't offload routes next in list

2017-07-21 Thread Jiri Pirko
From: Ido Schimmel 

Each FIB node holds a linked list of routes sharing the same prefix and
length. In the case of IPv4 it's ordered according to table ID, metric
and TOS and only the first route in the list is actually programmed to
the device.

In case a gatewayed route is added somewhere in the list, then after its
nexthop group will be refreshed and become valid (due to the resolution
of its gateway), it'll mistakenly overwrite the existing entry.

Example:
192.168.200.0/24 dev enp3s0np3 scope link metric 1000 offload
192.168.200.0/24 via 192.168.100.1 dev enp3s0np3 metric 1000 offload

Both routes are marked as offloaded despite the fact only the first one
should actually be present in the device's table.

When refreshing the nexthop group, don't write the route to the device's
table unless it's the first in its node.

Fixes: 9aecce1c7d97 ("mlxsw: spectrum_router: Correctly handle identical 
routes")
Signed-off-by: Ido Schimmel 
Signed-off-by: Jiri Pirko 
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c 
b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
index 383fef5a..4b2e0fd 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -1512,6 +1512,10 @@ mlxsw_sp_nexthop_group_mac_update(struct mlxsw_sp 
*mlxsw_sp,
 static int mlxsw_sp_fib_entry_update(struct mlxsw_sp *mlxsw_sp,
 struct mlxsw_sp_fib_entry *fib_entry);
 
+static bool
+mlxsw_sp_fib_node_entry_is_first(const struct mlxsw_sp_fib_node *fib_node,
+const struct mlxsw_sp_fib_entry *fib_entry);
+
 static int
 mlxsw_sp_nexthop_fib_entries_update(struct mlxsw_sp *mlxsw_sp,
struct mlxsw_sp_nexthop_group *nh_grp)
@@ -1520,6 +1524,9 @@ mlxsw_sp_nexthop_fib_entries_update(struct mlxsw_sp 
*mlxsw_sp,
int err;
 
list_for_each_entry(fib_entry, _grp->fib_list, nexthop_group_node) {
+   if (!mlxsw_sp_fib_node_entry_is_first(fib_entry->fib_node,
+ fib_entry))
+   continue;
err = mlxsw_sp_fib_entry_update(mlxsw_sp, fib_entry);
if (err)
return err;
-- 
2.9.3



[patch net-next] mlxsw: spectrum_router: Don't ignore IPv6 notifications

2017-07-21 Thread Jiri Pirko
From: Ido Schimmel 

We now have all the necessary IPv6 infrastructure in place, so stop
ignoring these notifications.

Signed-off-by: Ido Schimmel 
Signed-off-by: Jiri Pirko 
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c 
b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
index 9adb239..fa15f17 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -3804,7 +3804,7 @@ static int mlxsw_sp_router_fib_event(struct 
notifier_block *nb,
struct fib_notifier_info *info = ptr;
struct mlxsw_sp_router *router;
 
-   if (!net_eq(info->net, _net) || info->family != AF_INET)
+   if (!net_eq(info->net, _net))
return NOTIFY_DONE;
 
fib_work = kzalloc(sizeof(*fib_work), GFP_ATOMIC);
-- 
2.9.3



Re: [PATCH net] ipv6: no need to return rt->dst.error if it is not null entry.

2017-07-21 Thread David Ahern
On 7/20/17 9:23 AM, Hangbin Liu wrote:
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index 4d30c96..c290aa4 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -3637,13 +3637,8 @@ static int inet6_rtm_getroute(struct sk_buff
> *in_skb, struct nlmsghdr *nlh,
> dst = ip6_route_lookup(net, , 0);
> 
> rt = container_of(dst, struct rt6_info, dst);
> -   if (rt->dst.error) {
> -   err = rt->dst.error;
> -   ip6_rt_put(rt);
> -   goto errout;
> -   }

The above 5 lines introduced the change in behavior, so removing them
should be sufficient.


Re: [iovisor-dev] [PATCH net 0/2] bpf: fix verifier min/max handling in BPF_SUB

2017-07-21 Thread Nadav Amit
Edward Cree via iovisor-dev  wrote:

> I managed to come up with a test for the swapped bounds in BPF_SUB, so here
> it is along with a patch that fixes it, separated out from my 'rewrite
> everything' series so it can go to -stable.
> 
> Edward Cree (2):
>  selftests/bpf: subtraction bounds test
>  bpf/verifier: fix min/max handling in BPF_SUB
> 
> kernel/bpf/verifier.c   | 21 +++--
> tools/testing/selftests/bpf/test_verifier.c | 28 
> 2 files changed, 43 insertions(+), 6 deletions(-)
> 
> ___
> iovisor-dev mailing list
> iovisor-...@lists.iovisor.org
> https://lists.iovisor.org/mailman/listinfo/iovisor-dev

Thanks for separating it.

Nadav

Re: [RFC 1/2] net-next: add a dma_desc element to struct skb_shared_info

2017-07-21 Thread Paolo Abeni
Hi,

On Fri, 2017-07-21 at 17:20 +0200, John Crispin wrote:
> In order to make HW flow offloading work in latest MediaTek silicon we need
> to propagate part of the RX DMS descriptor to the upper layers populating
> the flow offload engines HW tables. This patch adds an extra element to
> struct skb_shared_info allowing the ethernet drivers RX napi code to store
> the required information and make it persistent for the lifecycle of the
> skb and its clones.
> 
> Signed-off-by: John Crispin 
> ---
>  include/linux/skbuff.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 4093552be1de..db9576cd946b 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -426,6 +426,7 @@ struct skb_shared_info {
>   unsigned intgso_type;
>   u32 tskey;
>   __be32  ip6_frag_id;
> + u32 dma_desc;
>  
>   /*
>* Warning : all fields before dataref are cleared in __alloc_skb()

This will increase the skb_shared_info struct size, which is already
quite large, and can have several kind of performance drawback.
AFAIK this is discouraged. 

I don't understand the use case; the driver will set this field, but
who is going to consume it?

Thanks,

Paolo


Re: [PATCH] net: ethernet: ti: cpsw: Push the request_irq function to the end of probe

2017-07-21 Thread Grygorii Strashko



On 07/20/2017 11:45 PM, Keerthy wrote:



On Friday 21 July 2017 04:14 AM, Grygorii Strashko wrote:



On 07/20/2017 05:28 PM, David Miller wrote:

From: Grygorii Strashko 
Date: Thu, 20 Jul 2017 11:08:09 -0500


In general patch looks good to me, but it's really unexpected to
receive IRQs while CPSW is probing ;(


This is a poor expectation.

Boot loaders and other entities can leave the device in any state
whatsoever.

Furthermore, enabling an IRQ whose handler cannot properly execute
without crashing is wrong fundamentally.  All data structures and
state must be set up properly before the IRQ is requested.

Therefore this patch is correct and I will apply it.



Thanks. Agree (it just has never triggered before, so I meant - unexpected
  from current driver code point of view ;().
And I'm just worry that it might not be enough :(, especially for am335x.


I tried nfs boot on am335x-evm with this patch and it boots fine for me.



Thank you Keerthy, I've also simulated and tested it on am335x and dra7
and see networking working and no crashes.

--
regards,
-grygorii


Re: [PATCHv2 net] ipv6: should not return rt->dst.error if it is prohibit or blk hole entry.

2017-07-21 Thread kbuild test robot
Hi Hangbin,

[auto build test ERROR on net/master]

url:
https://github.com/0day-ci/linux/commits/Hangbin-Liu/ipv6-should-not-return-rt-dst-error-if-it-is-prohibit-or-blk-hole-entry/20170721-204554
config: m68k-sun3_defconfig (attached as .config)
compiler: m68k-linux-gcc (GCC) 4.9.0
reproduce:
wget 
https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=m68k 

All errors (new ones prefixed by >>):

   net//ipv6/route.c: In function 'inet6_rtm_getroute':
>> net//ipv6/route.c:3640:38: error: 'struct netns_ipv6' has no member named 
>> 'ip6_prohibit_entry'
 if (rt->dst.error && rt != net->ipv6.ip6_prohibit_entry &&
 ^
>> net//ipv6/route.c:3641:21: error: 'struct netns_ipv6' has no member named 
>> 'ip6_blk_hole_entry'
 rt != net->ipv6.ip6_blk_hole_entry) {
^

vim +3640 net//ipv6/route.c

  3558  
  3559  static int inet6_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr 
*nlh,
  3560struct netlink_ext_ack *extack)
  3561  {
  3562  struct net *net = sock_net(in_skb->sk);
  3563  struct nlattr *tb[RTA_MAX+1];
  3564  int err, iif = 0, oif = 0;
  3565  struct dst_entry *dst;
  3566  struct rt6_info *rt;
  3567  struct sk_buff *skb;
  3568  struct rtmsg *rtm;
  3569  struct flowi6 fl6;
  3570  bool fibmatch;
  3571  
  3572  err = nlmsg_parse(nlh, sizeof(*rtm), tb, RTA_MAX, 
rtm_ipv6_policy,
  3573extack);
  3574  if (err < 0)
  3575  goto errout;
  3576  
  3577  err = -EINVAL;
  3578  memset(, 0, sizeof(fl6));
  3579  rtm = nlmsg_data(nlh);
  3580  fl6.flowlabel = ip6_make_flowinfo(rtm->rtm_tos, 0);
  3581  fibmatch = !!(rtm->rtm_flags & RTM_F_FIB_MATCH);
  3582  
  3583  if (tb[RTA_SRC]) {
  3584  if (nla_len(tb[RTA_SRC]) < sizeof(struct in6_addr))
  3585  goto errout;
  3586  
  3587  fl6.saddr = *(struct in6_addr *)nla_data(tb[RTA_SRC]);
  3588  }
  3589  
  3590  if (tb[RTA_DST]) {
  3591  if (nla_len(tb[RTA_DST]) < sizeof(struct in6_addr))
  3592  goto errout;
  3593  
  3594  fl6.daddr = *(struct in6_addr *)nla_data(tb[RTA_DST]);
  3595  }
  3596  
  3597  if (tb[RTA_IIF])
  3598  iif = nla_get_u32(tb[RTA_IIF]);
  3599  
  3600  if (tb[RTA_OIF])
  3601  oif = nla_get_u32(tb[RTA_OIF]);
  3602  
  3603  if (tb[RTA_MARK])
  3604  fl6.flowi6_mark = nla_get_u32(tb[RTA_MARK]);
  3605  
  3606  if (tb[RTA_UID])
  3607  fl6.flowi6_uid = make_kuid(current_user_ns(),
  3608 nla_get_u32(tb[RTA_UID]));
  3609  else
  3610  fl6.flowi6_uid = iif ? INVALID_UID : current_uid();
  3611  
  3612  if (iif) {
  3613  struct net_device *dev;
  3614  int flags = 0;
  3615  
  3616  dev = __dev_get_by_index(net, iif);
  3617  if (!dev) {
  3618  err = -ENODEV;
  3619  goto errout;
  3620  }
  3621  
  3622  fl6.flowi6_iif = iif;
  3623  
  3624  if (!ipv6_addr_any())
  3625  flags |= RT6_LOOKUP_F_HAS_SADDR;
  3626  
  3627  if (!fibmatch)
  3628  dst = ip6_route_input_lookup(net, dev, , 
flags);
  3629  } else {
  3630  fl6.flowi6_oif = oif;
  3631  
  3632  if (!fibmatch)
  3633  dst = ip6_route_output(net, NULL, );
  3634  }
  3635  
  3636  if (fibmatch)
  3637  dst = ip6_route_lookup(net, , 0);
  3638  
  3639  rt = container_of(dst, struct rt6_info, dst);
> 3640  if (rt->dst.error && rt != net->ipv6.ip6_prohibit_entry &&
> 3641  rt != net->ipv6.ip6_blk_hole_entry) {
  3642  err = rt->dst.error;
  3643  ip6_rt_put(rt);
  3644  goto errout;
  3645  }
  3646  
  3647  skb = alloc_skb(NLMSG_GOODSIZE, GFP_KERNEL);
  3648  if (!skb) {
  3649  ip6_rt_put(rt);
  3650  err = -ENOBUFS;
  3651  goto errout;
  3652  }
  3653  
  3654  skb_dst_set(skb, >dst);
  3655  if (fibmatch)
  3656  err = rt6_fill_node(net, skb, rt, NULL, NULL, iif,
  3657   

Re: commit 16ecba59 breaks 82574L under heavy load.

2017-07-21 Thread Lennart Sorensen
On Thu, Jul 20, 2017 at 04:44:55PM -0700, Benjamin Poirier wrote:
> Could you please test the following patch and let me know if it:
> 1) reduces the interrupt rate of the Other msi-x vector
> 2) avoids the link flaps
> or
> 3) logs some dmesg warnings of the form "Other interrupt with unhandled [...]"
> In this case, please paste icr values printed.

I will give it a try.

Another odd behaviour I see is that the driver will hang in
napi_synchronize on shutdown if there is traffic at the time (at least
I think that's the trigger, maybe the trigger is if there has been an
overload of traffic and the backlog in napi was used).

>From doing some searching, this seems to be a problem that has plagued
some people for years with this driver.

I am having trouble figuring out exactly what napi_synchronize is waiting
for and who is supposed to toggle the flag it is waiting on.  The flag
appears to work backwards from what I would have expected it to do.
I see lots of places that can set the bit, but only napi_enable seems
to clear it again, and I don't see how that would get called for all
the places that potentially set the bit.

-- 
Len Sorensen


[RFC 1/2] net-next: add a dma_desc element to struct skb_shared_info

2017-07-21 Thread John Crispin
In order to make HW flow offloading work in latest MediaTek silicon we need
to propagate part of the RX DMS descriptor to the upper layers populating
the flow offload engines HW tables. This patch adds an extra element to
struct skb_shared_info allowing the ethernet drivers RX napi code to store
the required information and make it persistent for the lifecycle of the
skb and its clones.

Signed-off-by: John Crispin 
---
 include/linux/skbuff.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4093552be1de..db9576cd946b 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -426,6 +426,7 @@ struct skb_shared_info {
unsigned intgso_type;
u32 tskey;
__be32  ip6_frag_id;
+   u32 dma_desc;
 
/*
 * Warning : all fields before dataref are cleared in __alloc_skb()
-- 
2.11.0



[RFC 2/2] net-next: mediatek: populate the shared

2017-07-21 Thread John Crispin
When enabling HW flow offloading, the traffic coming in on either of the
GMACs is first sent to the PPE for processing. Any traffic not offloaded
at this point will then be forwarded to the normal RX DMA ring for SW path
processing. In this case the PPE will send additional data inside RXD4
that is later required by the upper layers to populate the flow offloading
engines HW tables properly. This patch sets the skb_shared_info's dma_desc
field so that we can use the value later on.

Signed-off-by: John Crispin 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index a455d1b4f1d8..42d162cd6363 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -918,6 +918,7 @@ static void mtk_update_rx_cpu_idx(struct mtk_eth *eth)
 static int mtk_poll_rx(struct napi_struct *napi, int budget,
   struct mtk_eth *eth)
 {
+   struct skb_shared_info *sh;
struct mtk_rx_ring *ring;
int idx;
struct sk_buff *skb;
@@ -1000,6 +1001,9 @@ static int mtk_poll_rx(struct napi_struct *napi, int 
budget,
else
skb_checksum_none_assert(skb);
skb->protocol = eth_type_trans(skb, netdev);
+   sh = skb_shinfo(skb);
+
+   sh->dma_desc = trxd.rxd4;
 
if (netdev->features & NETIF_F_HW_VLAN_CTAG_RX &&
RX_DMA_VID(trxd.rxd3))
-- 
2.11.0



[RFC 0/2] net-next: hw flow offloading

2017-07-21 Thread John Crispin
Hi,

I managed to bring up the flow offloading on latest MedieTek silicon.

When enabling HW flow offloading, the traffic coming in on either of the
GMACs is first sent to the PPE for processing. Any traffic not offloaded
at this point will then be forwarded to the normal RX DMA ring for SW path
processing. In this case the PPE will send additional data inside RXD4
that is later required by the upper layers to populate the flow offloading
engines HW tables properly.

This series is a RFC as i am not sure how to best propagate the additional
info from the RX DMA descriptor. The driver is still using NF hooks and
I plan to rebase it and send it upstream once the flow table offloading
patches that folks are working on are upstream.

I am right now trying to get rid of the remaning hacks in the code and
wanted to know if this series would be a feasible solution.

John

John Crispin (2):
  net-next: add a dma_desc element to struct skb_shared_info
  net-next: mediatek: populate the shared

 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 4 
 include/linux/skbuff.h  | 1 +
 2 files changed, 5 insertions(+)

-- 
2.11.0



[PATCH] lib: test_rhashtable: fix for large entry counts

2017-07-21 Thread Phil Sutter
During concurrent access testing, threadfunc() concatenated thread ID
and object index to create a unique key like so:

| tdata->objs[i].value = (tdata->id << 16) | i;

This breaks if a user passes an entries parameter of 64k or higher,
since 'i' might use more than 16 bits then. Effectively, this will lead
to duplicate keys in the table.

Fix the problem by introducing a struct holding object and thread ID and
using that as key instead of a single integer type field.

Fixes: f4a3e90ba5739 ("rhashtable-test: extend to test concurrency")
Reported by: Manuel Messner 
Signed-off-by: Phil Sutter 
---
 lib/test_rhashtable.c | 53 +++
 1 file changed, 32 insertions(+), 21 deletions(-)

diff --git a/lib/test_rhashtable.c b/lib/test_rhashtable.c
index 64e899b633371..16949d219291a 100644
--- a/lib/test_rhashtable.c
+++ b/lib/test_rhashtable.c
@@ -56,8 +56,13 @@ static bool enomem_retry = false;
 module_param(enomem_retry, bool, 0);
 MODULE_PARM_DESC(enomem_retry, "Retry insert even if -ENOMEM was returned 
(default: off)");
 
+struct test_obj_val {
+   int id;
+   int tid;
+};
+
 struct test_obj {
-   int value;
+   struct test_obj_val value;
struct rhash_head   node;
 };
 
@@ -72,7 +77,7 @@ static struct test_obj array[MAX_ENTRIES];
 static struct rhashtable_params test_rht_params = {
.head_offset = offsetof(struct test_obj, node),
.key_offset = offsetof(struct test_obj, value),
-   .key_len = sizeof(int),
+   .key_len = sizeof(struct test_obj_val),
.hashfn = jhash,
.nulls_base = (3U << RHT_BASE_SHIFT),
 };
@@ -109,24 +114,26 @@ static int __init test_rht_lookup(struct rhashtable *ht)
for (i = 0; i < entries * 2; i++) {
struct test_obj *obj;
bool expected = !(i % 2);
-   u32 key = i;
+   struct test_obj_val key = {
+   .id = i,
+   };
 
-   if (array[i / 2].value == TEST_INSERT_FAIL)
+   if (array[i / 2].value.id == TEST_INSERT_FAIL)
expected = false;
 
obj = rhashtable_lookup_fast(ht, , test_rht_params);
 
if (expected && !obj) {
-   pr_warn("Test failed: Could not find key %u\n", key);
+   pr_warn("Test failed: Could not find key %u\n", key.id);
return -ENOENT;
} else if (!expected && obj) {
pr_warn("Test failed: Unexpected entry found for key 
%u\n",
-   key);
+   key.id);
return -EEXIST;
} else if (expected && obj) {
-   if (obj->value != i) {
+   if (obj->value.id != i) {
pr_warn("Test failed: Lookup value mismatch 
%u!=%u\n",
-   obj->value, i);
+   obj->value.id, i);
return -EINVAL;
}
}
@@ -195,7 +202,7 @@ static s64 __init test_rhashtable(struct rhashtable *ht)
for (i = 0; i < entries; i++) {
struct test_obj *obj = [i];
 
-   obj->value = i * 2;
+   obj->value.id = i * 2;
err = insert_retry(ht, >node, test_rht_params);
if (err > 0)
insert_retries += err;
@@ -218,7 +225,7 @@ static s64 __init test_rhashtable(struct rhashtable *ht)
for (i = 0; i < entries; i++) {
u32 key = i * 2;
 
-   if (array[i].value != TEST_INSERT_FAIL) {
+   if (array[i].value.id != TEST_INSERT_FAIL) {
obj = rhashtable_lookup_fast(ht, , test_rht_params);
BUG_ON(!obj);
 
@@ -242,18 +249,21 @@ static int thread_lookup_test(struct thread_data *tdata)
 
for (i = 0; i < entries; i++) {
struct test_obj *obj;
-   int key = (tdata->id << 16) | i;
+   struct test_obj_val key = {
+   .id = i,
+   .tid = tdata->id,
+   };
 
obj = rhashtable_lookup_fast(, , test_rht_params);
-   if (obj && (tdata->objs[i].value == TEST_INSERT_FAIL)) {
-   pr_err("  found unexpected object %d\n", key);
+   if (obj && (tdata->objs[i].value.id == TEST_INSERT_FAIL)) {
+   pr_err("  found unexpected object %d-%d\n", key.tid, 
key.id);
err++;
-   } else if (!obj && (tdata->objs[i].value != TEST_INSERT_FAIL)) {
-   pr_err("  object %d not found!\n", key);
+   } else if (!obj && (tdata->objs[i].value.id != 
TEST_INSERT_FAIL)) {
+   pr_err("  object %d-%d not 

[PATCH] iwlwifi: Demote messages about fw flags size to info

2017-07-21 Thread João Paulo Rechi Vita
These messages are not reporting a real error, just the fact that the
firmware knows about more flags then the driver.

Currently these messages are presented to the user during boot if there
is no bootsplash covering the console, sometimes even if the boot splash
is enabled but has not started yet by the time this message is shown.

Demoting it to the info level helps having a clean boot process.

Signed-off-by: João Paulo Rechi Vita 
---
 drivers/net/wireless/intel/iwlwifi/iwl-drv.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-drv.c 
b/drivers/net/wireless/intel/iwlwifi/iwl-drv.c
index 6fdb5921e17f..557acd43d705 100644
--- a/drivers/net/wireless/intel/iwlwifi/iwl-drv.c
+++ b/drivers/net/wireless/intel/iwlwifi/iwl-drv.c
@@ -487,9 +487,9 @@ static int iwl_set_ucode_api_flags(struct iwl_drv *drv, 
const u8 *data,
int i;
 
if (api_index >= DIV_ROUND_UP(NUM_IWL_UCODE_TLV_API, 32)) {
-   IWL_ERR(drv,
-   "api flags index %d larger than supported by driver\n",
-   api_index);
+   IWL_INFO(drv,
+"api flags index %d larger than supported by driver\n",
+api_index);
/* don't return an error so we can load FW that has more bits */
return 0;
}
@@ -511,9 +511,9 @@ static int iwl_set_ucode_capabilities(struct iwl_drv *drv, 
const u8 *data,
int i;
 
if (api_index >= DIV_ROUND_UP(NUM_IWL_UCODE_TLV_CAPA, 32)) {
-   IWL_ERR(drv,
-   "capa flags index %d larger than supported by driver\n",
-   api_index);
+   IWL_INFO(drv,
+"capa flags index %d larger than supported by 
driver\n",
+api_index);
/* don't return an error so we can load FW that has more bits */
return 0;
}
-- 
2.13.2



Re: [PATCH v2 1/4] net: phy: at803x: Document RGMII RX and TX clock delay issues

2017-07-21 Thread Marc Gonzalez
On 21/07/2017 16:06, Timur Tabi wrote:

> On 7/21/17 8:29 AM, Marc Gonzalez wrote:
>
>> I don't understand what you're saying.
>>
>> It is a correct observation that the code enabling
>> RGMII RX clock delay is a NOP, since that bit will
>> always be set at that point.
>>
>> The spec for the 8035 (I haven't checked for 8030 and 8031,
>> is that what you meant by "other systems"?) states that
>> Sel_clk125m_dsp, which is described as:
>> "Control bit for rgmii interface rx clock delay"
>> is 1 after HW reset, 1 after SW reset.
>>
>> So my statement "RX clock delay is enabled at reset"
>> is universally true. It's not just on some systems.
> 
> Ok, taken out of context, the comment doesn't really explain why the 
> code is the way it is.  I'm not really happy about the word "assumes". 

If a HW setting defaults to 0 at reset, and some init is called
right after reset, then you know the setting's value is 0.
If you need that value to be 1, all you need is a function
setting it to 1. This is the current situation.

Commit 2e5f9f281ee8 assumes 0 at reset, and provides a function
setting the value to 1.

Reality is different. The HW setting defaults to 1 at reset.
So it turns out that the function setting the value to 1
is pointless, because the value is already 1. There is
however no way to set the value to 0.

Does that explain why I wrote "assume"?

Also the commit message:
"The current code supports enabling RGMII RX and TX clock delays.
The unstated assumption is that these settings are disabled by
default at reset, which is not the case."

> Maybe you should add a sentence explaining when the code is NOT a no-op.

The code is *NEVER* NOT a no-op.
I.e. the code enabling RX clock delay is ALWAYS a no-op.
I don't understand what you think is unclear in my comment.

Regards.



Re: [PATCH net 2/2] bpf/verifier: fix min/max handling in BPF_SUB

2017-07-21 Thread Daniel Borkmann

On 07/21/2017 03:37 PM, Edward Cree wrote:

We have to subtract the src max from the dst min, and vice-versa, since
  (e.g.) the smallest result comes from the largest subtrahend.

Fixes: 484611357c19 ("bpf: allow access into map value arrays")
Signed-off-by: Edward Cree 


LGTM, thanks for the fix!

Acked-by: Daniel Borkmann 


---
  kernel/bpf/verifier.c | 21 +++--
  1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index af9e84a..664d939 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1865,10 +1865,12 @@ static void adjust_reg_min_max_vals(struct 
bpf_verifier_env *env,
 * do our normal operations to the register, we need to set the values
 * to the min/max since they are undefined.
 */
-   if (min_val == BPF_REGISTER_MIN_RANGE)
-   dst_reg->min_value = BPF_REGISTER_MIN_RANGE;
-   if (max_val == BPF_REGISTER_MAX_RANGE)
-   dst_reg->max_value = BPF_REGISTER_MAX_RANGE;
+   if (opcode != BPF_SUB) {
+   if (min_val == BPF_REGISTER_MIN_RANGE)
+   dst_reg->min_value = BPF_REGISTER_MIN_RANGE;
+   if (max_val == BPF_REGISTER_MAX_RANGE)
+   dst_reg->max_value = BPF_REGISTER_MAX_RANGE;
+   }

switch (opcode) {
case BPF_ADD:
@@ -1879,10 +1881,17 @@ static void adjust_reg_min_max_vals(struct 
bpf_verifier_env *env,
dst_reg->min_align = min(src_align, dst_align);
break;
case BPF_SUB:
+   /* If one of our values was at the end of our ranges, then the
+* _opposite_ value in the dst_reg goes to the end of our range.
+*/
+   if (min_val == BPF_REGISTER_MIN_RANGE)
+   dst_reg->max_value = BPF_REGISTER_MAX_RANGE;
+   if (max_val == BPF_REGISTER_MAX_RANGE)
+   dst_reg->min_value = BPF_REGISTER_MIN_RANGE;
if (dst_reg->min_value != BPF_REGISTER_MIN_RANGE)
-   dst_reg->min_value -= min_val;
+   dst_reg->min_value -= max_val;
if (dst_reg->max_value != BPF_REGISTER_MAX_RANGE)
-   dst_reg->max_value -= max_val;
+   dst_reg->max_value -= min_val;
dst_reg->min_align = min(src_align, dst_align);
break;
case BPF_MUL:





Re: [PATCH net 1/2] selftests/bpf: subtraction bounds test

2017-07-21 Thread Daniel Borkmann

On 07/21/2017 03:36 PM, Edward Cree wrote:

There is a bug in the verifier's handling of BPF_SUB: [a,b] - [c,d] yields
  was [a-c, b-d] rather than the correct [a-d, b-c].  So here is a test
  which, with the bogus handling, will produce ranges of [0,0] and thus
  allowed accesses; whereas the correct handling will give a range of
  [-255, 255] (and hence the right-shift will give a range of [0, 255]) and
  the accesses will be rejected.

Signed-off-by: Edward Cree 


Acked-by: Daniel Borkmann 


  tools/testing/selftests/bpf/test_verifier.c | 28 
  1 file changed, 28 insertions(+)

diff --git a/tools/testing/selftests/bpf/test_verifier.c 
b/tools/testing/selftests/bpf/test_verifier.c
index af7d173..addea82 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -5980,6 +5980,34 @@ static struct bpf_test tests[] = {
.result = REJECT,
.result_unpriv = REJECT,
},
+   {
+   "subtraction bounds (map value)",
+   .insns = {
+   BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+   BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+   BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+   BPF_LD_MAP_FD(BPF_REG_1, 0),
+   BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
+BPF_FUNC_map_lookup_elem),
+   BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 9),
+   BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0),
+   BPF_JMP_IMM(BPF_JGT, BPF_REG_1, 0xff, 7),
+   BPF_LDX_MEM(BPF_B, BPF_REG_3, BPF_REG_0, 1),
+   BPF_JMP_IMM(BPF_JGT, BPF_REG_3, 0xff, 5),
+   BPF_ALU64_REG(BPF_SUB, BPF_REG_1, BPF_REG_3),
+   BPF_ALU64_IMM(BPF_RSH, BPF_REG_1, 56),
+   BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1),
+   BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_0, 0),
+   BPF_EXIT_INSN(),
+   BPF_MOV64_IMM(BPF_REG_0, 0),
+   BPF_EXIT_INSN(),
+   },
+   .fixup_map1 = { 3 },
+   .errstr_unpriv = "R0 pointer arithmetic prohibited",
+   .errstr = "R0 min value is negative, either use unsigned index or do a 
if (index >=0) check.",
+   .result = REJECT,
+   .result_unpriv = REJECT,
+   },
  };

  static int probe_filter_length(const struct bpf_insn *fp)





Re: [PATCH v2 3/4] net: ethernet: nb8800: Fix RGMII TX clock delay setup

2017-07-21 Thread Måns Rullgård
Marc Gonzalez  writes:

> On 21/07/2017 15:47, Måns Rullgård wrote:
>
>> Marc Gonzalez wrote:
>> 
>>> On 21/07/2017 15:04, Måns Rullgård wrote:
>>>
 Marc Gonzalez wrote:

> According to commit e5f3a4a56ce2a707b2fb8ce37e4414dcac89c672
> ("Documentation: devicetree: clarify usage of the RGMII phy-modes")
> there are 4 RGMII modes to handle:
>
> "rgmii" (RX and TX delays are added by the MAC when required)
> "rgmii-id" (RGMII with internal RX and TX delays provided by the PHY,
>   the MAC should not add the RX or TX delays in this case)
> "rgmii-rxid" (RGMII with internal RX delay provided by the PHY,
>   the MAC should not add an RX delay in this case)
> "rgmii-txid" (RGMII with internal TX delay provided by the PHY,
>   the MAC should not add an TX delay in this case)
>
> Add TX delay in the MAC only for rgmii and rgmii-rxid.
>
> Signed-off-by: Marc Gonzalez 
> ---
>  drivers/net/ethernet/aurora/nb8800.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/aurora/nb8800.c 
> b/drivers/net/ethernet/aurora/nb8800.c
> index ded041dbafe7..f3ed320eb4ad 100644
> --- a/drivers/net/ethernet/aurora/nb8800.c
> +++ b/drivers/net/ethernet/aurora/nb8800.c
> @@ -1268,11 +1268,13 @@ static int nb8800_tangox_init(struct net_device 
> *dev)
>   break;
>
>   case PHY_INTERFACE_MODE_RGMII:
> - pad_mode = PAD_MODE_RGMII;
> + case PHY_INTERFACE_MODE_RGMII_RXID:
> + pad_mode = PAD_MODE_RGMII | PAD_MODE_GTX_CLK_DELAY;
>   break;
>
> + case PHY_INTERFACE_MODE_RGMII_ID:
>   case PHY_INTERFACE_MODE_RGMII_TXID:
> - pad_mode = PAD_MODE_RGMII | PAD_MODE_GTX_CLK_DELAY;
> + pad_mode = PAD_MODE_RGMII;
>   break;
>
>   default:

 I still don't like this.  Having both the MAC and PHY drivers react to
 the phy-connection-type property is bound to cause trouble somewhere.

 The only way out of the current mess is to define new properties for
 both MAC and PHY that override the existing ones if present.
>>>
>>> Do you mean defining 4 new bindings and their corresponding
>>> phy_interface_t enum values? For example:
>>>
>>> "rgmii-v2"
>>> "rgmii-id-v2"
>>> "rgmii-rxid-v2"
>>> "rgmii-txid-v2"
>>>
>>> PHY_INTERFACE_MODE_RGMII_V2,
>>> PHY_INTERFACE_MODE_RGMII_ID_V2,
>>> PHY_INTERFACE_MODE_RGMII_RXID_V2,
>>> PHY_INTERFACE_MODE_RGMII_TXID_V2,
>>>
>>> And then handling these new enums in the at803x and nb8800 drivers?
>> 
>> It has already been suggested to add new properties specifying desired
>> delays in picoseconds.  If present on the MAC node, the MAC driver
>> should attempt to provide the delay, and if present on the PHY node, the
>> PHY driver is responsible.
>
> Sorry, I had already forgotten about Florian's suggestion:
>> If you introduced PHY and/or MAC specific properties to configure the
>> delays in the appropriate unit of time (say ps), you could use a
>> non-compliant 'phy-mode' just to satisfy the driver/PHY library and
>> still override the delays you need.
>
> So we would need two properties (RX and TX).
> "rgmii_rx_skew_ps" and "rgmii_tx_skew_ps"
>
> but it's not clear to me how the MAC probe function communicates
> the arguments to the phy driver. Looks like we would need to add
> two fields to struct phy_device, and maybe define a new phy-mode
> to instruct the PHY driver to look for the two fields.

There's no need for the drivers to communicate.  The location of the
properties in the device tree determines which driver should deal with
it.

> I don't have time to work on that for now, but I do need to
> fix the nb8800 driver now. Can we apply the following patch
> in the interim?
>
> diff --git a/drivers/net/ethernet/aurora/nb8800.c 
> b/drivers/net/ethernet/aurora/nb8800.c
> index ded041dbafe7..e94159507847 100644
> --- a/drivers/net/ethernet/aurora/nb8800.c
> +++ b/drivers/net/ethernet/aurora/nb8800.c
> @@ -1268,11 +1268,10 @@ static int nb8800_tangox_init(struct net_device *dev)
> break;
>
> case PHY_INTERFACE_MODE_RGMII:
> -   pad_mode = PAD_MODE_RGMII;
> -   break;
> -
> +   case PHY_INTERFACE_MODE_RGMII_ID:
> +   case PHY_INTERFACE_MODE_RGMII_RXID:
> case PHY_INTERFACE_MODE_RGMII_TXID:
> -   pad_mode = PAD_MODE_RGMII | PAD_MODE_GTX_CLK_DELAY;
> +   pad_mode = PAD_MODE_RGMII;
> break;
>
> default:

Simply stop reacting to the delay aspect of the phy-connection-type
property?  Yes, I'm fine with that.

-- 
Måns Rullgård


Re: [PATCH v2 3/4] net: ethernet: nb8800: Fix RGMII TX clock delay setup

2017-07-21 Thread Marc Gonzalez
On 21/07/2017 15:47, Måns Rullgård wrote:

> Marc Gonzalez wrote:
> 
>> On 21/07/2017 15:04, Måns Rullgård wrote:
>>
>>> Marc Gonzalez wrote:
>>>
 According to commit e5f3a4a56ce2a707b2fb8ce37e4414dcac89c672
 ("Documentation: devicetree: clarify usage of the RGMII phy-modes")
 there are 4 RGMII modes to handle:

 "rgmii" (RX and TX delays are added by the MAC when required)
 "rgmii-id" (RGMII with internal RX and TX delays provided by the PHY,
the MAC should not add the RX or TX delays in this case)
 "rgmii-rxid" (RGMII with internal RX delay provided by the PHY,
the MAC should not add an RX delay in this case)
 "rgmii-txid" (RGMII with internal TX delay provided by the PHY,
the MAC should not add an TX delay in this case)

 Add TX delay in the MAC only for rgmii and rgmii-rxid.

 Signed-off-by: Marc Gonzalez 
 ---
  drivers/net/ethernet/aurora/nb8800.c | 6 --
  1 file changed, 4 insertions(+), 2 deletions(-)

 diff --git a/drivers/net/ethernet/aurora/nb8800.c 
 b/drivers/net/ethernet/aurora/nb8800.c
 index ded041dbafe7..f3ed320eb4ad 100644
 --- a/drivers/net/ethernet/aurora/nb8800.c
 +++ b/drivers/net/ethernet/aurora/nb8800.c
 @@ -1268,11 +1268,13 @@ static int nb8800_tangox_init(struct net_device 
 *dev)
break;

case PHY_INTERFACE_MODE_RGMII:
 -  pad_mode = PAD_MODE_RGMII;
 +  case PHY_INTERFACE_MODE_RGMII_RXID:
 +  pad_mode = PAD_MODE_RGMII | PAD_MODE_GTX_CLK_DELAY;
break;

 +  case PHY_INTERFACE_MODE_RGMII_ID:
case PHY_INTERFACE_MODE_RGMII_TXID:
 -  pad_mode = PAD_MODE_RGMII | PAD_MODE_GTX_CLK_DELAY;
 +  pad_mode = PAD_MODE_RGMII;
break;

default:
>>>
>>> I still don't like this.  Having both the MAC and PHY drivers react to
>>> the phy-connection-type property is bound to cause trouble somewhere.
>>>
>>> The only way out of the current mess is to define new properties for
>>> both MAC and PHY that override the existing ones if present.
>>
>> Do you mean defining 4 new bindings and their corresponding
>> phy_interface_t enum values? For example:
>>
>> "rgmii-v2"
>> "rgmii-id-v2"
>> "rgmii-rxid-v2"
>> "rgmii-txid-v2"
>>
>>  PHY_INTERFACE_MODE_RGMII_V2,
>>  PHY_INTERFACE_MODE_RGMII_ID_V2,
>>  PHY_INTERFACE_MODE_RGMII_RXID_V2,
>>  PHY_INTERFACE_MODE_RGMII_TXID_V2,
>>
>> And then handling these new enums in the at803x and nb8800 drivers?
> 
> It has already been suggested to add new properties specifying desired
> delays in picoseconds.  If present on the MAC node, the MAC driver
> should attempt to provide the delay, and if present on the PHY node, the
> PHY driver is responsible.

Sorry, I had already forgotten about Florian's suggestion:
> If you introduced PHY and/or MAC specific properties to configure the
> delays in the appropriate unit of time (say ps), you could use a
> non-compliant 'phy-mode' just to satisfy the driver/PHY library and
> still override the delays you need.

So we would need two properties (RX and TX).
"rgmii_rx_skew_ps" and "rgmii_tx_skew_ps"

but it's not clear to me how the MAC probe function communicates
the arguments to the phy driver. Looks like we would need to add
two fields to struct phy_device, and maybe define a new phy-mode
to instruct the PHY driver to look for the two fields.

I don't have time to work on that for now, but I do need to
fix the nb8800 driver now. Can we apply the following patch
in the interim?

diff --git a/drivers/net/ethernet/aurora/nb8800.c 
b/drivers/net/ethernet/aurora/nb8800.c
index ded041dbafe7..e94159507847 100644
--- a/drivers/net/ethernet/aurora/nb8800.c
+++ b/drivers/net/ethernet/aurora/nb8800.c
@@ -1268,11 +1268,10 @@ static int nb8800_tangox_init(struct net_device *dev)
break;
 
case PHY_INTERFACE_MODE_RGMII:
-   pad_mode = PAD_MODE_RGMII;
-   break;
-
+   case PHY_INTERFACE_MODE_RGMII_ID:
+   case PHY_INTERFACE_MODE_RGMII_RXID:
case PHY_INTERFACE_MODE_RGMII_TXID:
-   pad_mode = PAD_MODE_RGMII | PAD_MODE_GTX_CLK_DELAY;
+   pad_mode = PAD_MODE_RGMII;
break;
 
default:

Regards.



Re: [PATCH v2 1/4] net: phy: at803x: Document RGMII RX and TX clock delay issues

2017-07-21 Thread Timur Tabi

On 7/21/17 8:29 AM, Marc Gonzalez wrote:

I don't understand what you're saying.

It is a correct observation that the code enabling
RGMII RX clock delay is a NOP, since that bit will
always be set at that point.

The spec for the 8035 (I haven't checked for 8030 and 8031,
is that what you meant by "other systems"?) states that
Sel_clk125m_dsp, which is described as:
"Control bit for rgmii interface rx clock delay"
is 1 after HW reset, 1 after SW reset.

So my statement "RX clock delay is enabled at reset"
is universally true. It's not just on some systems.


Ok, taken out of context, the comment doesn't really explain why the 
code is the way it is.  I'm not really happy about the word "assumes". 
Maybe you should add a sentence explaining when the code is NOT a no-op.


--
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, hosted by The Linux Foundation.


[RFC PATCH] IP: do not modify ingress packet IP option in ip_options_echo()

2017-07-21 Thread Paolo Abeni
While computing the response option set for LRSS, ip_options_echo()
also changes the ingress packet LRSS addresses list, setting
the last one to the dst specific address for the ingress packet
- via memset(start[ ...
The only visible effect of such change - beyond possibly damaging
shared/cloned skbs - is modifying the data carried by ICMP replies
changing the header information for reported the ingress packet,
whichi violates RFC1122 3.2.2.6.
All the others call sites just ignore the ingress packet IP options
after calling ip_options_echo().
This behavior predates git history and apparently was present into
the initial ip_options_echo() implementation in linux 1.3.30 but
still looks wrong.

Removing the fib_compute_spec_dst() call will allow completely
removing, with a later patch, skb->dst usage in ip_options_echo(),
so that we can release the dst entry early in the rx path and
drop a couple of related hacks.

This will evenutally target net-next.

Signed-off-by: Paolo Abeni 
---
 net/ipv4/ip_options.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c
index 93157f2..fdda973 100644
--- a/net/ipv4/ip_options.c
+++ b/net/ipv4/ip_options.c
@@ -174,9 +174,6 @@ int __ip_options_echo(struct ip_options *dopt, struct 
sk_buff *skb,
doffset -= 4;
}
if (doffset > 3) {
-   __be32 daddr = fib_compute_spec_dst(skb);
-
-   memcpy([doffset-1], , 4);
dopt->faddr = faddr;
dptr[0] = start[0];
dptr[1] = doffset+3;
-- 
2.9.4



Re: [PATCH v2 3/4] net: ethernet: nb8800: Fix RGMII TX clock delay setup

2017-07-21 Thread Måns Rullgård
Marc Gonzalez  writes:

> On 21/07/2017 15:04, Måns Rullgård wrote:
>
>> Marc Gonzalez wrote:
>> 
>>> According to commit e5f3a4a56ce2a707b2fb8ce37e4414dcac89c672
>>> ("Documentation: devicetree: clarify usage of the RGMII phy-modes")
>>> there are 4 RGMII modes to handle:
>>>
>>> "rgmii" (RX and TX delays are added by the MAC when required)
>>> "rgmii-id" (RGMII with internal RX and TX delays provided by the PHY,
>>> the MAC should not add the RX or TX delays in this case)
>>> "rgmii-rxid" (RGMII with internal RX delay provided by the PHY,
>>> the MAC should not add an RX delay in this case)
>>> "rgmii-txid" (RGMII with internal TX delay provided by the PHY,
>>> the MAC should not add an TX delay in this case)
>>>
>>> Add TX delay in the MAC only for rgmii and rgmii-rxid.
>>>
>>> Signed-off-by: Marc Gonzalez 
>>> ---
>>>  drivers/net/ethernet/aurora/nb8800.c | 6 --
>>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/aurora/nb8800.c 
>>> b/drivers/net/ethernet/aurora/nb8800.c
>>> index ded041dbafe7..f3ed320eb4ad 100644
>>> --- a/drivers/net/ethernet/aurora/nb8800.c
>>> +++ b/drivers/net/ethernet/aurora/nb8800.c
>>> @@ -1268,11 +1268,13 @@ static int nb8800_tangox_init(struct net_device 
>>> *dev)
>>> break;
>>>
>>> case PHY_INTERFACE_MODE_RGMII:
>>> -   pad_mode = PAD_MODE_RGMII;
>>> +   case PHY_INTERFACE_MODE_RGMII_RXID:
>>> +   pad_mode = PAD_MODE_RGMII | PAD_MODE_GTX_CLK_DELAY;
>>> break;
>>>
>>> +   case PHY_INTERFACE_MODE_RGMII_ID:
>>> case PHY_INTERFACE_MODE_RGMII_TXID:
>>> -   pad_mode = PAD_MODE_RGMII | PAD_MODE_GTX_CLK_DELAY;
>>> +   pad_mode = PAD_MODE_RGMII;
>>> break;
>>>
>>> default:
>> 
>> I still don't like this.  Having both the MAC and PHY drivers react to
>> the phy-connection-type property is bound to cause trouble somewhere.
>> 
>> The only way out of the current mess is to define new properties for
>> both MAC and PHY that override the existing ones if present.
>
> Do you mean defining 4 new bindings and their corresponding
> phy_interface_t enum values? For example:
>
> "rgmii-v2"
> "rgmii-id-v2"
> "rgmii-rxid-v2"
> "rgmii-txid-v2"
>
>   PHY_INTERFACE_MODE_RGMII_V2,
>   PHY_INTERFACE_MODE_RGMII_ID_V2,
>   PHY_INTERFACE_MODE_RGMII_RXID_V2,
>   PHY_INTERFACE_MODE_RGMII_TXID_V2,
>
> And then handling these new enums in the at803x and nb8800 drivers?

It has already been suggested to add new properties specifying desired
delays in picoseconds.  If present on the MAC node, the MAC driver
should attempt to provide the delay, and if present on the PHY node, the
PHY driver is responsible.

-- 
Måns Rullgård


Re: [PATCH v2 3/4] net: ethernet: nb8800: Fix RGMII TX clock delay setup

2017-07-21 Thread Marc Gonzalez
On 21/07/2017 15:04, Måns Rullgård wrote:

> Marc Gonzalez wrote:
> 
>> According to commit e5f3a4a56ce2a707b2fb8ce37e4414dcac89c672
>> ("Documentation: devicetree: clarify usage of the RGMII phy-modes")
>> there are 4 RGMII modes to handle:
>>
>> "rgmii" (RX and TX delays are added by the MAC when required)
>> "rgmii-id" (RGMII with internal RX and TX delays provided by the PHY,
>>  the MAC should not add the RX or TX delays in this case)
>> "rgmii-rxid" (RGMII with internal RX delay provided by the PHY,
>>  the MAC should not add an RX delay in this case)
>> "rgmii-txid" (RGMII with internal TX delay provided by the PHY,
>>  the MAC should not add an TX delay in this case)
>>
>> Add TX delay in the MAC only for rgmii and rgmii-rxid.
>>
>> Signed-off-by: Marc Gonzalez 
>> ---
>>  drivers/net/ethernet/aurora/nb8800.c | 6 --
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/aurora/nb8800.c 
>> b/drivers/net/ethernet/aurora/nb8800.c
>> index ded041dbafe7..f3ed320eb4ad 100644
>> --- a/drivers/net/ethernet/aurora/nb8800.c
>> +++ b/drivers/net/ethernet/aurora/nb8800.c
>> @@ -1268,11 +1268,13 @@ static int nb8800_tangox_init(struct net_device *dev)
>>  break;
>>
>>  case PHY_INTERFACE_MODE_RGMII:
>> -pad_mode = PAD_MODE_RGMII;
>> +case PHY_INTERFACE_MODE_RGMII_RXID:
>> +pad_mode = PAD_MODE_RGMII | PAD_MODE_GTX_CLK_DELAY;
>>  break;
>>
>> +case PHY_INTERFACE_MODE_RGMII_ID:
>>  case PHY_INTERFACE_MODE_RGMII_TXID:
>> -pad_mode = PAD_MODE_RGMII | PAD_MODE_GTX_CLK_DELAY;
>> +pad_mode = PAD_MODE_RGMII;
>>  break;
>>
>>  default:
> 
> I still don't like this.  Having both the MAC and PHY drivers react to
> the phy-connection-type property is bound to cause trouble somewhere.
> 
> The only way out of the current mess is to define new properties for
> both MAC and PHY that override the existing ones if present.

Do you mean defining 4 new bindings and their corresponding
phy_interface_t enum values? For example:

"rgmii-v2"
"rgmii-id-v2"
"rgmii-rxid-v2"
"rgmii-txid-v2"

PHY_INTERFACE_MODE_RGMII_V2,
PHY_INTERFACE_MODE_RGMII_ID_V2,
PHY_INTERFACE_MODE_RGMII_RXID_V2,
PHY_INTERFACE_MODE_RGMII_TXID_V2,

And then handling these new enums in the at803x and nb8800 drivers?

FWIW, PAD_MODE_GTX_CLK_DELAY is broken in tango5 (doesn't add any
delay). I'm considering removing MAC-side TX delay altogether.

Regards.



[PATCH net 2/2] bpf/verifier: fix min/max handling in BPF_SUB

2017-07-21 Thread Edward Cree
We have to subtract the src max from the dst min, and vice-versa, since
 (e.g.) the smallest result comes from the largest subtrahend.

Fixes: 484611357c19 ("bpf: allow access into map value arrays")
Signed-off-by: Edward Cree 
---
 kernel/bpf/verifier.c | 21 +++--
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index af9e84a..664d939 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1865,10 +1865,12 @@ static void adjust_reg_min_max_vals(struct 
bpf_verifier_env *env,
 * do our normal operations to the register, we need to set the values
 * to the min/max since they are undefined.
 */
-   if (min_val == BPF_REGISTER_MIN_RANGE)
-   dst_reg->min_value = BPF_REGISTER_MIN_RANGE;
-   if (max_val == BPF_REGISTER_MAX_RANGE)
-   dst_reg->max_value = BPF_REGISTER_MAX_RANGE;
+   if (opcode != BPF_SUB) {
+   if (min_val == BPF_REGISTER_MIN_RANGE)
+   dst_reg->min_value = BPF_REGISTER_MIN_RANGE;
+   if (max_val == BPF_REGISTER_MAX_RANGE)
+   dst_reg->max_value = BPF_REGISTER_MAX_RANGE;
+   }
 
switch (opcode) {
case BPF_ADD:
@@ -1879,10 +1881,17 @@ static void adjust_reg_min_max_vals(struct 
bpf_verifier_env *env,
dst_reg->min_align = min(src_align, dst_align);
break;
case BPF_SUB:
+   /* If one of our values was at the end of our ranges, then the
+* _opposite_ value in the dst_reg goes to the end of our range.
+*/
+   if (min_val == BPF_REGISTER_MIN_RANGE)
+   dst_reg->max_value = BPF_REGISTER_MAX_RANGE;
+   if (max_val == BPF_REGISTER_MAX_RANGE)
+   dst_reg->min_value = BPF_REGISTER_MIN_RANGE;
if (dst_reg->min_value != BPF_REGISTER_MIN_RANGE)
-   dst_reg->min_value -= min_val;
+   dst_reg->min_value -= max_val;
if (dst_reg->max_value != BPF_REGISTER_MAX_RANGE)
-   dst_reg->max_value -= max_val;
+   dst_reg->max_value -= min_val;
dst_reg->min_align = min(src_align, dst_align);
break;
case BPF_MUL:


[PATCH net 1/2] selftests/bpf: subtraction bounds test

2017-07-21 Thread Edward Cree
There is a bug in the verifier's handling of BPF_SUB: [a,b] - [c,d] yields
 was [a-c, b-d] rather than the correct [a-d, b-c].  So here is a test
 which, with the bogus handling, will produce ranges of [0,0] and thus
 allowed accesses; whereas the correct handling will give a range of
 [-255, 255] (and hence the right-shift will give a range of [0, 255]) and
 the accesses will be rejected.

Signed-off-by: Edward Cree 
---
 tools/testing/selftests/bpf/test_verifier.c | 28 
 1 file changed, 28 insertions(+)

diff --git a/tools/testing/selftests/bpf/test_verifier.c 
b/tools/testing/selftests/bpf/test_verifier.c
index af7d173..addea82 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -5980,6 +5980,34 @@ static struct bpf_test tests[] = {
.result = REJECT,
.result_unpriv = REJECT,
},
+   {
+   "subtraction bounds (map value)",
+   .insns = {
+   BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+   BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+   BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+   BPF_LD_MAP_FD(BPF_REG_1, 0),
+   BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
+BPF_FUNC_map_lookup_elem),
+   BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 9),
+   BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0),
+   BPF_JMP_IMM(BPF_JGT, BPF_REG_1, 0xff, 7),
+   BPF_LDX_MEM(BPF_B, BPF_REG_3, BPF_REG_0, 1),
+   BPF_JMP_IMM(BPF_JGT, BPF_REG_3, 0xff, 5),
+   BPF_ALU64_REG(BPF_SUB, BPF_REG_1, BPF_REG_3),
+   BPF_ALU64_IMM(BPF_RSH, BPF_REG_1, 56),
+   BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1),
+   BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_0, 0),
+   BPF_EXIT_INSN(),
+   BPF_MOV64_IMM(BPF_REG_0, 0),
+   BPF_EXIT_INSN(),
+   },
+   .fixup_map1 = { 3 },
+   .errstr_unpriv = "R0 pointer arithmetic prohibited",
+   .errstr = "R0 min value is negative, either use unsigned index 
or do a if (index >=0) check.",
+   .result = REJECT,
+   .result_unpriv = REJECT,
+   },
 };
 
 static int probe_filter_length(const struct bpf_insn *fp)



[PATCH net 0/2] bpf: fix verifier min/max handling in BPF_SUB

2017-07-21 Thread Edward Cree
I managed to come up with a test for the swapped bounds in BPF_SUB, so here
 it is along with a patch that fixes it, separated out from my 'rewrite
 everything' series so it can go to -stable.

Edward Cree (2):
  selftests/bpf: subtraction bounds test
  bpf/verifier: fix min/max handling in BPF_SUB

 kernel/bpf/verifier.c   | 21 +++--
 tools/testing/selftests/bpf/test_verifier.c | 28 
 2 files changed, 43 insertions(+), 6 deletions(-)



Re: [PATCH v2 1/4] net: phy: at803x: Document RGMII RX and TX clock delay issues

2017-07-21 Thread Marc Gonzalez
On 21/07/2017 15:20, Timur Tabi wrote:

> On 7/21/17 6:25 AM, Marc Gonzalez wrote:
>
>> + * NB: This code assumes that RGMII RX clock delay is disabled
>> + * at reset, but actually, RX clock delay is enabled at reset.
> 
> Could we change this to say, "RX clock delay is enabled at reset on some 
> systems."?  Otherwise, it implies that the code is wrong 100% of the 
> time and should be fixed, not documented.

I don't understand what you're saying.

It is a correct observation that the code enabling
RGMII RX clock delay is a NOP, since that bit will
always be set at that point.

The spec for the 8035 (I haven't checked for 8030 and 8031,
is that what you meant by "other systems"?) states that
Sel_clk125m_dsp, which is described as:
"Control bit for rgmii interface rx clock delay"
is 1 after HW reset, 1 after SW reset.

So my statement "RX clock delay is enabled at reset"
is universally true. It's not just on some systems.

What am I missing?

Regards.



Re: [PATCH v2 1/4] net: phy: at803x: Document RGMII RX and TX clock delay issues

2017-07-21 Thread Timur Tabi

On 7/21/17 6:25 AM, Marc Gonzalez wrote:

+* NB: This code assumes that RGMII RX clock delay is disabled
+* at reset, but actually, RX clock delay is enabled at reset.


Could we change this to say, "RX clock delay is enabled at reset on some 
systems."?  Otherwise, it implies that the code is wrong 100% of the 
time and should be fixed, not documented.


--
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, hosted by The Linux Foundation.


Re: [Patch net] team: use a larger struct for mac address

2017-07-21 Thread kbuild test robot
Hi Cong,

[auto build test WARNING on net/master]

url:
https://github.com/0day-ci/linux/commits/Cong-Wang/team-use-a-larger-struct-for-mac-address/20170721-203849
config: xtensa-allyesconfig (attached as .config)
compiler: xtensa-linux-gcc (GCC) 4.9.0
reproduce:
wget 
https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=xtensa 

All warnings (new ones prefixed by >>):

   drivers/net/team/team.c: In function '__set_port_dev_addr':
>> drivers/net/team/team.c:67:9: warning: passing argument 2 of 
>> 'dev_set_mac_address' from incompatible pointer type
 return dev_set_mac_address(port_dev, );
^
   In file included from drivers/net/team/team.c:20:0:
   include/linux/netdevice.h:3290:5: note: expected 'struct sockaddr *' but 
argument is of type 'struct __kernel_sockaddr_storage *'
int dev_set_mac_address(struct net_device *, struct sockaddr *);
^

vim +/dev_set_mac_address +67 drivers/net/team/team.c

3d249d4c Jiri Pirko 2011-11-11  55  
3d249d4c Jiri Pirko 2011-11-11  56  /*
1d76efe1 Jiri Pirko 2012-08-17  57   * Since the ability to change device 
address for open port device is tested in
3d249d4c Jiri Pirko 2011-11-11  58   * team_port_add, this function can be 
called without control of return value
3d249d4c Jiri Pirko 2011-11-11  59   */
1d76efe1 Jiri Pirko 2012-08-17  60  static int __set_port_dev_addr(struct 
net_device *port_dev,
3d249d4c Jiri Pirko 2011-11-11  61 const unsigned 
char *dev_addr)
3d249d4c Jiri Pirko 2011-11-11  62  {
44118243 Cong Wang  2017-07-20  63  struct sockaddr_storage addr;
3d249d4c Jiri Pirko 2011-11-11  64  
44118243 Cong Wang  2017-07-20  65  memcpy(addr.__data, dev_addr, 
port_dev->addr_len);
44118243 Cong Wang  2017-07-20  66  addr.ss_family = port_dev->type;
3d249d4c Jiri Pirko 2011-11-11 @67  return dev_set_mac_address(port_dev, 
);
3d249d4c Jiri Pirko 2011-11-11  68  }
3d249d4c Jiri Pirko 2011-11-11  69  

:: The code at line 67 was first introduced by commit
:: 3d249d4ca7d0ed6629a135ea1ea21c72286c0d80 net: introduce ethernet teaming 
device

:: TO: Jiri Pirko <jpi...@redhat.com>
:: CC: David S. Miller <da...@davemloft.net>

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH v2 0/4] RGMII RX and TX clock delays using Atheros 8035

2017-07-21 Thread Marc Gonzalez
On 21/07/2017 14:47, Marc Gonzalez wrote:

> fping caps at 1000 packets per second, which limits
> its usefulness as a benchmarking tool.

"Normal" ping is slightly faster (5000 packets per second).

time ping -f -c 6 -s 1450 -q 172.27.64.1

--- 172.27.64.1 ping statistics ---
6 packets transmitted, 6 received, 0% packet loss, time 12059ms
rtt min/avg/max/mdev = 0.159/0.167/3.536/0.021 ms, ipg/ewma 0.201/0.166 ms

real0m12.067s
user0m0.899s
sys 0m1.886s

NIC statistics:
 rx_bytes_ok: 89760375
 rx_frames_ok: 60003
 rx_undersize_frames: 0
 rx_fragment_frames: 0
 rx_64_byte_frames: 2
 rx_127_byte_frames: 0
 rx_255_byte_frames: 1
 rx_511_byte_frames: 0
 rx_1023_byte_frames: 0
 rx_max_size_frames: 6
 rx_oversize_frames: 0
 rx_bad_fcs_frames: 0
 rx_broadcast_frames: 1
 rx_multicast_frames: 0
 rx_control_frames: 0
 rx_pause_frames: 0
 rx_unsup_control_frames: 0
 rx_align_error_frames: 0
 rx_overrun_frames: 0
 rx_jabber_frames: 0
 rx_bytes: 89760375
 rx_frames: 60003
 tx_bytes_ok: 89760128
 tx_frames_ok: 60002
 tx_64_byte_frames: 2
 tx_127_byte_frames: 0
 tx_255_byte_frames: 0
 tx_511_byte_frames: 0
 tx_1023_byte_frames: 0
 tx_max_size_frames: 6
 tx_oversize_frames: 0
 tx_broadcast_frames: 1
 tx_multicast_frames: 0
 tx_control_frames: 0
 tx_pause_frames: 0
 tx_underrun_frames: 0
 tx_single_collision_frames: 0
 tx_multi_collision_frames: 0
 tx_deferred_collision_frames: 0
 tx_late_collision_frames: 0
 tx_excessive_collision_frames: 0
 tx_bytes: 89760128
 tx_frames: 60002
 tx_collisions: 0



Re: [Patch net] team: use a larger struct for mac address

2017-07-21 Thread kbuild test robot
Hi Cong,

[auto build test ERROR on net/master]

url:
https://github.com/0day-ci/linux/commits/Cong-Wang/team-use-a-larger-struct-for-mac-address/20170721-203849
config: i386-randconfig-x019-07211017 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All errors (new ones prefixed by >>):

   drivers/net/team/team.c: In function '__set_port_dev_addr':
>> drivers/net/team/team.c:67:39: error: passing argument 2 of 
>> 'dev_set_mac_address' from incompatible pointer type 
>> [-Werror=incompatible-pointer-types]
 return dev_set_mac_address(port_dev, );
  ^
   In file included from drivers/net/team/team.c:20:0:
   include/linux/netdevice.h:3290:5: note: expected 'struct sockaddr *' but 
argument is of type 'struct __kernel_sockaddr_storage *'
int dev_set_mac_address(struct net_device *, struct sockaddr *);
^~~
   cc1: some warnings being treated as errors

vim +/dev_set_mac_address +67 drivers/net/team/team.c

3d249d4c Jiri Pirko 2011-11-11  55  
3d249d4c Jiri Pirko 2011-11-11  56  /*
1d76efe1 Jiri Pirko 2012-08-17  57   * Since the ability to change device 
address for open port device is tested in
3d249d4c Jiri Pirko 2011-11-11  58   * team_port_add, this function can be 
called without control of return value
3d249d4c Jiri Pirko 2011-11-11  59   */
1d76efe1 Jiri Pirko 2012-08-17  60  static int __set_port_dev_addr(struct 
net_device *port_dev,
3d249d4c Jiri Pirko 2011-11-11  61 const unsigned 
char *dev_addr)
3d249d4c Jiri Pirko 2011-11-11  62  {
44118243 Cong Wang  2017-07-20  63  struct sockaddr_storage addr;
3d249d4c Jiri Pirko 2011-11-11  64  
44118243 Cong Wang  2017-07-20  65  memcpy(addr.__data, dev_addr, 
port_dev->addr_len);
44118243 Cong Wang  2017-07-20  66  addr.ss_family = port_dev->type;
3d249d4c Jiri Pirko 2011-11-11 @67  return dev_set_mac_address(port_dev, 
);
3d249d4c Jiri Pirko 2011-11-11  68  }
3d249d4c Jiri Pirko 2011-11-11  69  

:: The code at line 67 was first introduced by commit
:: 3d249d4ca7d0ed6629a135ea1ea21c72286c0d80 net: introduce ethernet teaming 
device

:: TO: Jiri Pirko <jpi...@redhat.com>
:: CC: David S. Miller <da...@davemloft.net>

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH v2 3/4] net: ethernet: nb8800: Fix RGMII TX clock delay setup

2017-07-21 Thread Måns Rullgård
Marc Gonzalez  writes:

> According to commit e5f3a4a56ce2a707b2fb8ce37e4414dcac89c672
> ("Documentation: devicetree: clarify usage of the RGMII phy-modes")
> there are 4 RGMII modes to handle:
>
> "rgmii" (RX and TX delays are added by the MAC when required)
> "rgmii-id" (RGMII with internal RX and TX delays provided by the PHY,
>   the MAC should not add the RX or TX delays in this case)
> "rgmii-rxid" (RGMII with internal RX delay provided by the PHY,
>   the MAC should not add an RX delay in this case)
> "rgmii-txid" (RGMII with internal TX delay provided by the PHY,
>   the MAC should not add an TX delay in this case)
>
> Add TX delay in the MAC only for rgmii and rgmii-rxid.
>
> Signed-off-by: Marc Gonzalez 
> ---
>  drivers/net/ethernet/aurora/nb8800.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/aurora/nb8800.c 
> b/drivers/net/ethernet/aurora/nb8800.c
> index ded041dbafe7..f3ed320eb4ad 100644
> --- a/drivers/net/ethernet/aurora/nb8800.c
> +++ b/drivers/net/ethernet/aurora/nb8800.c
> @@ -1268,11 +1268,13 @@ static int nb8800_tangox_init(struct net_device *dev)
>   break;
>
>   case PHY_INTERFACE_MODE_RGMII:
> - pad_mode = PAD_MODE_RGMII;
> + case PHY_INTERFACE_MODE_RGMII_RXID:
> + pad_mode = PAD_MODE_RGMII | PAD_MODE_GTX_CLK_DELAY;
>   break;
>
> + case PHY_INTERFACE_MODE_RGMII_ID:
>   case PHY_INTERFACE_MODE_RGMII_TXID:
> - pad_mode = PAD_MODE_RGMII | PAD_MODE_GTX_CLK_DELAY;
> + pad_mode = PAD_MODE_RGMII;
>   break;
>
>   default:

I still don't like this.  Having both the MAC and PHY drivers react to
the phy-connection-type property is bound to cause trouble somewhere.

The only way out of the current mess is to define new properties for
both MAC and PHY that override the existing ones if present.

-- 
Måns Rullgård


Re: [PATCH v2 2/4] net: ethernet: nb8800: Set RGMII_MODE for all RGMII modes

2017-07-21 Thread Måns Rullgård
Marc Gonzalez  writes:

> There are 4 RGMII phy-modes to handle.
>
> Signed-off-by: Marc Gonzalez 

Acked-by: Mans Rullgard 

> ---
>  drivers/net/ethernet/aurora/nb8800.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/aurora/nb8800.c 
> b/drivers/net/ethernet/aurora/nb8800.c
> index 041cfb7952f8..ded041dbafe7 100644
> --- a/drivers/net/ethernet/aurora/nb8800.c
> +++ b/drivers/net/ethernet/aurora/nb8800.c
> @@ -609,7 +609,7 @@ static void nb8800_mac_config(struct net_device *dev)
>   mac_mode |= HALF_DUPLEX;
>
>   if (gigabit) {
> - if (priv->phy_mode == PHY_INTERFACE_MODE_RGMII)
> + if (phy_interface_is_rgmii(dev->phydev))
>   mac_mode |= RGMII_MODE;
>
>   mac_mode |= GMAC_MODE;
> -- 
> 2.11.0
>

-- 
Måns Rullgård


Re: [PATCH v2 0/4] RGMII RX and TX clock delays using Atheros 8035

2017-07-21 Thread Marc Gonzalez
On 21/07/2017 13:22, Marc Gonzalez wrote:

> Changes from v1
> - Drop support for disabling RX and TX clock delays
> (it breaks some boards). Document the issues instead.
> - Split the MAC patch in two unrelated parts
> - Fix the vantage 1172 DTS
> 
> Marc Gonzalez (4):
>   net: phy: at803x: Document RGMII RX and TX clock delay issues
>   net: ethernet: nb8800: Set RGMII_MODE for all RGMII modes
>   net: ethernet: nb8800: Fix RGMII TX clock delay setup
>   ARM: dts: tango4: Add RGMII RX and TX clock delays
> 
>  arch/arm/boot/dts/tango4-vantage-1172.dts |  2 +-
>  drivers/net/ethernet/aurora/nb8800.c  |  8 +---
>  drivers/net/phy/at803x.c  | 12 
>  3 files changed, 18 insertions(+), 4 deletions(-)

Rudimentary test:

ip addr add 172.27.64.45/18 brd 172.27.127.255 dev eth0
ip link set eth0 up
sleep 10 ## autoneg should be complete
time fping -c 1 -b 1450 -p 1 -i 1 -q 172.27.64.1
ethtool -S eth0
cat /proc/interrupts

# test_eth.sh 
[   13.276242] ENTER at803x_enable_rx_delay
[   13.280283] BEFORE=82ee
[   13.282735]  AFTER=82ee
[   13.285462] ENTER at803x_enable_tx_delay
[   13.289567] BEFORE=2d47
[   13.292018]  AFTER=2d47
[   16.663266] nb8800 26000.ethernet eth0: Link is Up - 1Gbps/Full - flow 
control rx/tx
172.27.64.1 : xmt/rcv/%loss = 1/1/0%, min/avg/max = 0.18/0.21/3.79

real0m10.688s
user0m0.106s
sys 0m0.472s

NIC statistics:
 rx_bytes_ok: 14960128
 rx_frames_ok: 10002
 rx_undersize_frames: 0
 rx_fragment_frames: 0
 rx_64_byte_frames: 2
 rx_127_byte_frames: 0
 rx_255_byte_frames: 0
 rx_511_byte_frames: 0
 rx_1023_byte_frames: 0
 rx_max_size_frames: 1
 rx_oversize_frames: 0
 rx_bad_fcs_frames: 0
 rx_broadcast_frames: 0
 rx_multicast_frames: 0
 rx_control_frames: 0
 rx_pause_frames: 0
 rx_unsup_control_frames: 0
 rx_align_error_frames: 0
 rx_overrun_frames: 0
 rx_jabber_frames: 0
 rx_bytes: 14960128
 rx_frames: 10002
 tx_bytes_ok: 14960128
 tx_frames_ok: 10002
 tx_64_byte_frames: 2
 tx_127_byte_frames: 0
 tx_255_byte_frames: 0
 tx_511_byte_frames: 0
 tx_1023_byte_frames: 0
 tx_max_size_frames: 1
 tx_oversize_frames: 0
 tx_broadcast_frames: 1
 tx_multicast_frames: 0
 tx_control_frames: 0
 tx_pause_frames: 0
 tx_underrun_frames: 0
 tx_single_collision_frames: 0
 tx_multi_collision_frames: 0
 tx_deferred_collision_frames: 0
 tx_late_collision_frames: 0
 tx_excessive_collision_frames: 0
 tx_bytes: 14960128
 tx_frames: 10002
 tx_collisions: 0

CPU0   CPU1   
  19:  10825983 GIC-0  29 Edge  twd
  20: 85  0  irq0   1 Level ttyS0
  21:  0  0  irq0  60 Level mmc0
  22:228  0  irq0   8 Level mmc1
  25:  20005  0  irq0  38 Level eth0
  28:  1  0  irq0  37 Edge  26000.nb8800-mii:04
IPI0:  0  0  CPU wakeup interrupts
IPI1:  0  0  Timer broadcast interrupts
IPI2:   1183   1725  Rescheduling interrupts
IPI3:  0  1  Function call interrupts
IPI4:  0  0  CPU stop interrupts
IPI5:  1  0  IRQ work interrupts
IPI6:  0  0  completion interrupts
Err:   0


fping caps at 1000 packets per second, which limits
its usefulness as a benchmarking tool.

Regards.



Backwards route when adding IPv6 peer?

2017-07-21 Thread Fabian Pietsch
Hi,

this looks backwards to me:

| root@blackbox:~# ip -6 addr add fd12:3456:789a::1 peer fd99::::1/48 
dev sit0
| root@blackbox:~# ip -6 route show dev sit0
| fd12:3456:789a::/48 proto kernel metric 256  pref medium
| fd99::::1 proto kernel metric 256  pref medium

| root@blackbox:~# ip -6 addr show dev sit0
| 4: sit0@NONE:  mtu 1480 state DOWN qlen 1
| inet6 fd12:3456:789a::1 peer fd99::::1/48 scope global
|valid_lft forever preferred_lft forever

The address+peer pair gets added just fine / appears correctly
in the addr show output. But the routing table has the *local*
address widened to the given prefix len, routing the local
address space to the remote side.

That is, I wanted to get a route to the peer's whole address space
implicitly, but I got a route of *my* address space to the tunnel.

Is this a Linux kernel bug? Or is it just working as expected?


Quoting from http://www.policyrouting.org/iproute2.doc.html#ss9.2.1
| peer ADDRESS--- address of remote endpoint for pointopoint interfaces.
| Again, the ADDRESS may be followed by a slash and decimal number,
| encoding the network prefix length. If a peer address is specified then
| the local address cannot have a network prefix length as the network
| prefix is associated with the peer rather than with the local address.
| In other words, netmasks can only be assigned to peer addresses when
| specifying both peer and local addresses.

This reads to me as if to support that the current kernel behaviour
is wrong.


blackbox is my notebook running something just before release of Debian 9:

| root@blackbox:~# dpkg -l iproute2 | grep ^.i
| ii  iproute2   4.9.0-1  amd64networking and traffic control 
tools
| root@blackbox:~# uname -a
| Linux blackbox 4.9.0-2-amd64 #1 SMP Debian 4.9.13-1 (2017-02-27) x86_64 
GNU/Linux

but I had the same problem initially on my home server/router/firewall
running Debian 8, while tweaking my OpenVPN tunnel configuration.

| root@mandarb:~# dpkg -l iproute2 | grep ^.i
| ii  iproute2   3.16.0-2 amd64networking and traffic control 
tools
| root@mandarb:~# uname -a
| Linux mandarb 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt20-1+deb8u3 (2016-01-17) 
x86_64 GNU/Linux

Regards,
Fabian


[PATCH v2 3/4] net: ethernet: nb8800: Fix RGMII TX clock delay setup

2017-07-21 Thread Marc Gonzalez
According to commit e5f3a4a56ce2a707b2fb8ce37e4414dcac89c672
("Documentation: devicetree: clarify usage of the RGMII phy-modes")
there are 4 RGMII modes to handle:

"rgmii" (RX and TX delays are added by the MAC when required)
"rgmii-id" (RGMII with internal RX and TX delays provided by the PHY,
the MAC should not add the RX or TX delays in this case)
"rgmii-rxid" (RGMII with internal RX delay provided by the PHY,
the MAC should not add an RX delay in this case)
"rgmii-txid" (RGMII with internal TX delay provided by the PHY,
the MAC should not add an TX delay in this case)

Add TX delay in the MAC only for rgmii and rgmii-rxid.

Signed-off-by: Marc Gonzalez 
---
 drivers/net/ethernet/aurora/nb8800.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/aurora/nb8800.c 
b/drivers/net/ethernet/aurora/nb8800.c
index ded041dbafe7..f3ed320eb4ad 100644
--- a/drivers/net/ethernet/aurora/nb8800.c
+++ b/drivers/net/ethernet/aurora/nb8800.c
@@ -1268,11 +1268,13 @@ static int nb8800_tangox_init(struct net_device *dev)
break;
 
case PHY_INTERFACE_MODE_RGMII:
-   pad_mode = PAD_MODE_RGMII;
+   case PHY_INTERFACE_MODE_RGMII_RXID:
+   pad_mode = PAD_MODE_RGMII | PAD_MODE_GTX_CLK_DELAY;
break;
 
+   case PHY_INTERFACE_MODE_RGMII_ID:
case PHY_INTERFACE_MODE_RGMII_TXID:
-   pad_mode = PAD_MODE_RGMII | PAD_MODE_GTX_CLK_DELAY;
+   pad_mode = PAD_MODE_RGMII;
break;
 
default:
-- 
2.11.0



[PATCH v2 1/4] net: phy: at803x: Document RGMII RX and TX clock delay issues

2017-07-21 Thread Marc Gonzalez
The current code supports enabling RGMII RX and TX clock delays.
The unstated assumption is that these settings are disabled by
default at reset, which is not the case.

RX clock delay is enabled at reset. And TX clock delay "survives"
across SW resets. Thus, if the bootloader enables TX clock delay,
it will remain enabled at reset in Linux.

Signed-off-by: Marc Gonzalez 
---
 drivers/net/phy/at803x.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c
index c1e52b9dc58d..7a0954513b91 100644
--- a/drivers/net/phy/at803x.c
+++ b/drivers/net/phy/at803x.c
@@ -283,6 +283,12 @@ static int at803x_config_init(struct phy_device *phydev)
if (ret < 0)
return ret;
 
+   /*
+* NB: This code assumes that RGMII RX clock delay is disabled
+* at reset, but actually, RX clock delay is enabled at reset.
+* Disabling the delay if it has not been explicitly requested
+* breaks boards that rely on the enabled-by-default behavior.
+*/
if (phydev->interface == PHY_INTERFACE_MODE_RGMII_RXID ||
phydev->interface == PHY_INTERFACE_MODE_RGMII_ID) {
ret = at803x_enable_rx_delay(phydev);
@@ -290,6 +296,12 @@ static int at803x_config_init(struct phy_device *phydev)
return ret;
}
 
+   /*
+* NB: This code assumes that RGMII TX clock delay is disabled
+* at reset, but actually, TX clock delay "survives" across SW
+* resets. If the bootloader enables TX clock delay, Linux is
+* stuck with that setting.
+*/
if (phydev->interface == PHY_INTERFACE_MODE_RGMII_TXID ||
phydev->interface == PHY_INTERFACE_MODE_RGMII_ID) {
ret = at803x_enable_tx_delay(phydev);
-- 
2.11.0



[PATCH v2 2/4] net: ethernet: nb8800: Set RGMII_MODE for all RGMII modes

2017-07-21 Thread Marc Gonzalez
There are 4 RGMII phy-modes to handle.

Signed-off-by: Marc Gonzalez 
---
 drivers/net/ethernet/aurora/nb8800.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/aurora/nb8800.c 
b/drivers/net/ethernet/aurora/nb8800.c
index 041cfb7952f8..ded041dbafe7 100644
--- a/drivers/net/ethernet/aurora/nb8800.c
+++ b/drivers/net/ethernet/aurora/nb8800.c
@@ -609,7 +609,7 @@ static void nb8800_mac_config(struct net_device *dev)
mac_mode |= HALF_DUPLEX;
 
if (gigabit) {
-   if (priv->phy_mode == PHY_INTERFACE_MODE_RGMII)
+   if (phy_interface_is_rgmii(dev->phydev))
mac_mode |= RGMII_MODE;
 
mac_mode |= GMAC_MODE;
-- 
2.11.0



[PATCH v2 0/4] RGMII RX and TX clock delays using Atheros 8035

2017-07-21 Thread Marc Gonzalez
Changes from v1
- Drop support for disabling RX and TX clock delays
(it breaks some boards). Document the issues instead.
- Split the MAC patch in two unrelated parts
- Fix the vantage 1172 DTS

Marc Gonzalez (4):
  net: phy: at803x: Document RGMII RX and TX clock delay issues
  net: ethernet: nb8800: Set RGMII_MODE for all RGMII modes
  net: ethernet: nb8800: Fix RGMII TX clock delay setup
  ARM: dts: tango4: Add RGMII RX and TX clock delays

 arch/arm/boot/dts/tango4-vantage-1172.dts |  2 +-
 drivers/net/ethernet/aurora/nb8800.c  |  8 +---
 drivers/net/phy/at803x.c  | 12 
 3 files changed, 18 insertions(+), 4 deletions(-)

-- 
2.11.0



[PATCH v2 4/4] ARM: dts: tango4: Add RGMII RX and TX clock delays

2017-07-21 Thread Marc Gonzalez
RX and TX clock delays are required.

Signed-off-by: Marc Gonzalez 
---
 arch/arm/boot/dts/tango4-vantage-1172.dts | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/tango4-vantage-1172.dts 
b/arch/arm/boot/dts/tango4-vantage-1172.dts
index 86d8df98802f..13bcc460bcb2 100644
--- a/arch/arm/boot/dts/tango4-vantage-1172.dts
+++ b/arch/arm/boot/dts/tango4-vantage-1172.dts
@@ -22,7 +22,7 @@
 };
 
  {
-   phy-connection-type = "rgmii";
+   phy-connection-type = "rgmii-id";
phy-handle = <_phy>;
#address-cells = <1>;
#size-cells = <0>;
-- 
2.11.0



Re: [PATCH] bluetooth: document config options

2017-07-21 Thread Marcel Holtmann
Hi Pavel,

>>> Kernel config options should include useful help text; I had to look
>>> up the terms on wikipedia.
>>> 
>>> Signed-off-by: Pavel Machek 
>>> 
>>> diff --git a/net/bluetooth/Kconfig b/net/bluetooth/Kconfig
>>> index 68f951b..133c8a6 100644
>>> --- a/net/bluetooth/Kconfig
>>> +++ b/net/bluetooth/Kconfig
>>> @@ -45,6 +45,8 @@ config BT_BREDR
>>> bool "Bluetooth Classic (BR/EDR) features"
>>> depends on BT
>>> default y
>>> +   help
>>> + Support Bluetooth version 1 and 2 connections.
>> 
>> so this is actually not correct. Version 5.0 is also Bluetooth BR/EDR.
>> 
>> I am fine if use the terms "Basic Rate" and "Enhanced Data Rate" in the 
>> description, but the version numbers are kinda not how this works. So some 
>> alternative wording needs to be used:
>> 
>> "Bluetooth Classic includes support for Basic Rate (BR) available with 
>> Bluetooth version 1.0b or later and support for Enhanced Data Rate (EDR) 
>> available with Bluetooth version 2.0 or later."
>> 
> 
> Ok, works for me.
> 
>>> +   help
>>> + Bluetooth 3 introduces high-speed mode where bluetooth endpoints
>>> + negotiate fast connection over WIFI. This controls its support.
>> 
>> Again, while Bluetooth version 3.0 introduces HS support, it is not what is 
>> selecting this.
>> 
>> I think here we can write something like this:
>> 
>> "Bluetooth High Speed includes support for off-loading Bluetooth connections 
>> via 802.11 physical layer available with Bluetooth version 3.0 or later."
>> 
> 
> Ok.
> 
>>> config BT_LE
>>> bool "Bluetooth Low Energy (LE) features"
>>> depends on BT
>>> default y
>>> +   help
>>> + Bluetooth 4 introduces special low-energy protocol, designed
>>> + for simple devices. This controls its support.
>> 
>> I think here it is also better to have an alternative text:
>> 
>> "Bluetooth Low Energy includes support low-energy physical layer available 
>> with Bluetooth version 4.0 or later."
>> 
> 
> Ok. Do I understand it correctly that Bluetooth LE basically has
> nothing to do with bluetooth, and that Bluetooth 2 hardware will not
> be able to detect / communicate with Bluetooth LE hardware?

if you have LE only on one side and BR/EDR only on the other side, then they 
can not communicate with each other. You need at least one side with dual-mode 
BR/EDR/LE support.

Regards

Marcel



Re: [PATCH net v2] udp: preserve skb->dst if required for IP options processing

2017-07-21 Thread Artem Savkov
On Tue, Jul 18, 2017 at 11:57:55AM +0200, Paolo Abeni wrote:
> Eric noticed that in udp_recvmsg() we still need to access
> skb->dst while processing the IP options.
> Since commit 0a463c78d25b ("udp: avoid a cache miss on dequeue")
> skb->dst is no more available at recvmsg() time and bad things
> will happen if we enter the relevant code path.
> 
> This commit address the issue, avoid clearing skb->dst if
> any IP options are present into the relevant skb.
> Since the IP CB is contained in the first skb cacheline, we can
> test it to decide to leverage the consume_stateless_skb()
> optimization, without measurable additional cost in the faster
> path.

This doesn't seem to be enough for XFRM case. CMSG gets eaten up in
selinux-testsuite[1]/inet_socket test in ipsec configuration.

[1] https://github.com/SELinuxProject/selinux-testsuite.git

-- 
Regards,
  Artem


[PATCH net-next v2 0/6] Allow to switch off UDP-based tunnel offloads per netdevice

2017-07-21 Thread Sabrina Dubroca
This patchset adds a new netdevice feature to toggle RX offloads of
UDP-based tunnel via ethtool. This is useful if the offload is causing
issues, for example if the hardware is buggy.

The feature is added to all devices providing the ->ndo_udp_tunnel_add
op, and enabled by default to preserve current behavior.

When the administrator disables this feature on a device, all
currently offloaded ports are cleared from the device.  When the
feature is turned on, the stack notifies the device about all current
ports.

v2:
 - rename feature bit to NETIF_F_RX_UDP_TUNNEL_PORT
 - rename ethtool feature to rx-udp_tunnel-port-offload


Sabrina Dubroca (6):
  net: add new netdevice feature for offload of RX port for UDP tunnels
  net: check UDP tunnel RX port offload feature before calling tunnel
ndo ndo
  net: add infrastructure to un-offload UDP tunnel port
  net: call udp_tunnel_get_rx_info when NETIF_F_RX_UDP_TUNNEL_PORT is
toggled
  geneve/vxlan: add support for NETDEV_UDP_TUNNEL_DROP_INFO
  geneve/vxlan: offload ports on register/unregister events

 drivers/net/geneve.c| 24 ++--
 drivers/net/vxlan.c | 31 ++-
 include/linux/netdev_features.h |  2 ++
 include/linux/netdevice.h   |  1 +
 include/net/udp_tunnel.h|  8 
 net/core/dev.c  | 28 +++-
 net/core/ethtool.c  |  1 +
 net/ipv4/udp_tunnel.c   | 25 -
 8 files changed, 103 insertions(+), 17 deletions(-)

-- 
2.13.2



[PATCH net-next v2 1/6] net: add new netdevice feature for offload of RX port for UDP tunnels

2017-07-21 Thread Sabrina Dubroca
This adds a new netdevice feature, so that the offloading of RX port for
UDP tunnels can be disabled by the administrator on some netdevices,
using the "rx-udp_tunnel-port-offload" feature in ethtool.

This feature is set for all devices that provide ndo_udp_tunnel_add.

Signed-off-by: Sabrina Dubroca 
---
 include/linux/netdev_features.h | 2 ++
 net/core/dev.c  | 6 ++
 net/core/ethtool.c  | 1 +
 3 files changed, 9 insertions(+)

diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
index ebd273627334..dc8b4896b77b 100644
--- a/include/linux/netdev_features.h
+++ b/include/linux/netdev_features.h
@@ -75,6 +75,7 @@ enum {
NETIF_F_HW_TC_BIT,  /* Offload TC infrastructure */
NETIF_F_HW_ESP_BIT, /* Hardware ESP transformation offload 
*/
NETIF_F_HW_ESP_TX_CSUM_BIT, /* ESP with TX checksum offload */
+   NETIF_F_RX_UDP_TUNNEL_PORT_BIT, /* Offload of RX port for UDP tunnels */
 
/*
 * Add your fresh new feature above and remember to update
@@ -138,6 +139,7 @@ enum {
 #define NETIF_F_HW_TC  __NETIF_F(HW_TC)
 #define NETIF_F_HW_ESP __NETIF_F(HW_ESP)
 #define NETIF_F_HW_ESP_TX_CSUM __NETIF_F(HW_ESP_TX_CSUM)
+#defineNETIF_F_RX_UDP_TUNNEL_PORT  __NETIF_F(RX_UDP_TUNNEL_PORT)
 
 #define for_each_netdev_feature(mask_addr, bit)\
for_each_set_bit(bit, (unsigned long *)mask_addr, NETDEV_FEATURE_COUNT)
diff --git a/net/core/dev.c b/net/core/dev.c
index 509af6ce8831..9081134adc0d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -7530,6 +7530,12 @@ int register_netdevice(struct net_device *dev)
 */
dev->hw_features |= NETIF_F_SOFT_FEATURES;
dev->features |= NETIF_F_SOFT_FEATURES;
+
+   if (dev->netdev_ops->ndo_udp_tunnel_add) {
+   dev->features |= NETIF_F_RX_UDP_TUNNEL_PORT;
+   dev->hw_features |= NETIF_F_RX_UDP_TUNNEL_PORT;
+   }
+
dev->wanted_features = dev->features & dev->hw_features;
 
if (!(dev->flags & IFF_LOOPBACK))
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 78408ab77a10..b987bc475fc8 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -105,6 +105,7 @@ static const char 
netdev_features_strings[NETDEV_FEATURE_COUNT][ETH_GSTRING_LEN]
[NETIF_F_HW_TC_BIT] ="hw-tc-offload",
[NETIF_F_HW_ESP_BIT] =   "esp-hw-offload",
[NETIF_F_HW_ESP_TX_CSUM_BIT] =   "esp-tx-csum-hw-offload",
+   [NETIF_F_RX_UDP_TUNNEL_PORT_BIT] =   "rx-udp_tunnel-port-offload",
 };
 
 static const char
-- 
2.13.2



[PATCH net-next v2 5/6] geneve/vxlan: add support for NETDEV_UDP_TUNNEL_DROP_INFO

2017-07-21 Thread Sabrina Dubroca
Signed-off-by: Sabrina Dubroca 
---
 drivers/net/geneve.c | 19 +--
 drivers/net/vxlan.c  | 25 +
 2 files changed, 30 insertions(+), 14 deletions(-)

diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index de8156c6b292..ee52233750f7 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -1014,16 +1014,22 @@ static struct device_type geneve_type = {
  * supply the listening GENEVE udp ports. Callers are expected
  * to implement the ndo_udp_tunnel_add.
  */
-static void geneve_push_rx_ports(struct net_device *dev)
+static void geneve_offload_rx_ports(struct net_device *dev, bool push)
 {
struct net *net = dev_net(dev);
struct geneve_net *gn = net_generic(net, geneve_net_id);
struct geneve_sock *gs;
 
rcu_read_lock();
-   list_for_each_entry_rcu(gs, >sock_list, list)
-   udp_tunnel_push_rx_port(dev, gs->sock,
-   UDP_TUNNEL_TYPE_GENEVE);
+   list_for_each_entry_rcu(gs, >sock_list, list) {
+   if (push) {
+   udp_tunnel_push_rx_port(dev, gs->sock,
+   UDP_TUNNEL_TYPE_GENEVE);
+   } else {
+   udp_tunnel_drop_rx_port(dev, gs->sock,
+   UDP_TUNNEL_TYPE_GENEVE);
+   }
+   }
rcu_read_unlock();
 }
 
@@ -1426,8 +1432,9 @@ static int geneve_netdevice_event(struct notifier_block 
*unused,
 {
struct net_device *dev = netdev_notifier_info_to_dev(ptr);
 
-   if (event == NETDEV_UDP_TUNNEL_PUSH_INFO)
-   geneve_push_rx_ports(dev);
+   if (event == NETDEV_UDP_TUNNEL_PUSH_INFO ||
+   event == NETDEV_UDP_TUNNEL_DROP_INFO)
+   geneve_offload_rx_ports(dev, event == 
NETDEV_UDP_TUNNEL_PUSH_INFO);
 
return NOTIFY_DONE;
 }
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 96aa7e6cf214..4642d5be2fa0 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2608,7 +2608,7 @@ static struct device_type vxlan_type = {
  * supply the listening VXLAN udp ports. Callers are expected
  * to implement the ndo_udp_tunnel_add.
  */
-static void vxlan_push_rx_ports(struct net_device *dev)
+static void vxlan_offload_rx_ports(struct net_device *dev, bool push)
 {
struct vxlan_sock *vs;
struct net *net = dev_net(dev);
@@ -2617,11 +2617,19 @@ static void vxlan_push_rx_ports(struct net_device *dev)
 
spin_lock(>sock_lock);
for (i = 0; i < PORT_HASH_SIZE; ++i) {
-   hlist_for_each_entry_rcu(vs, >sock_list[i], hlist)
-   udp_tunnel_push_rx_port(dev, vs->sock,
-   (vs->flags & VXLAN_F_GPE) ?
-   UDP_TUNNEL_TYPE_VXLAN_GPE :
-   UDP_TUNNEL_TYPE_VXLAN);
+   hlist_for_each_entry_rcu(vs, >sock_list[i], hlist) {
+   unsigned short type;
+
+   if (vs->flags & VXLAN_F_GPE)
+   type = UDP_TUNNEL_TYPE_VXLAN_GPE;
+   else
+   type = UDP_TUNNEL_TYPE_VXLAN;
+
+   if (push)
+   udp_tunnel_push_rx_port(dev, vs->sock, type);
+   else
+   udp_tunnel_drop_rx_port(dev, vs->sock, type);
+   }
}
spin_unlock(>sock_lock);
 }
@@ -3632,8 +3640,9 @@ static int vxlan_netdevice_event(struct notifier_block 
*unused,
 
if (event == NETDEV_UNREGISTER)
vxlan_handle_lowerdev_unregister(vn, dev);
-   else if (event == NETDEV_UDP_TUNNEL_PUSH_INFO)
-   vxlan_push_rx_ports(dev);
+   else if (event == NETDEV_UDP_TUNNEL_PUSH_INFO ||
+event == NETDEV_UDP_TUNNEL_DROP_INFO)
+   vxlan_offload_rx_ports(dev, event == 
NETDEV_UDP_TUNNEL_PUSH_INFO);
 
return NOTIFY_DONE;
 }
-- 
2.13.2



[PATCH net-next v2 4/6] net: call udp_tunnel_get_rx_info when NETIF_F_RX_UDP_TUNNEL_PORT is toggled

2017-07-21 Thread Sabrina Dubroca
NETIF_F_RX_UDP_TUNNEL_PORT is special, in that we need to do more than
just flip the bit in dev->features. When disabling we must also clear
currently offloaded ports from the device, and when enabling we must
tell the device to offload the ports it can.

Because vxlan stores its sockets in a hashtable and they are inserted at
the head of per-bucket lists, switching the feature off and then on can
result in a different set of ports being offloaded (depending on the
HW's limits).

Signed-off-by: Sabrina Dubroca 
---
 net/core/dev.c | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 9081134adc0d..8ea6b4b42611 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -144,6 +144,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "net-sysfs.h"
 
@@ -7327,8 +7328,27 @@ int __netdev_update_features(struct net_device *dev)
netdev_for_each_lower_dev(dev, lower, iter)
netdev_sync_lower_features(dev, lower, features);
 
-   if (!err)
+   if (!err) {
+   netdev_features_t diff = features ^ dev->features;
+
+   if (diff & NETIF_F_RX_UDP_TUNNEL_PORT) {
+   /* udp_tunnel_{get,drop}_rx_info both need
+* NETIF_F_RX_UDP_TUNNEL_PORT enabled on the
+* device, or they won't do anything.
+* Thus we need to update dev->features
+* *before* calling udp_tunnel_get_rx_info,
+* but *after* calling udp_tunnel_drop_rx_info.
+*/
+   if (features & NETIF_F_RX_UDP_TUNNEL_PORT) {
+   dev->features = features;
+   udp_tunnel_get_rx_info(dev);
+   } else {
+   udp_tunnel_drop_rx_info(dev);
+   }
+   }
+
dev->features = features;
+   }
 
return err < 0 ? 0 : 1;
 }
-- 
2.13.2



[PATCH net-next v2 6/6] geneve/vxlan: offload ports on register/unregister events

2017-07-21 Thread Sabrina Dubroca
This improves consistency of handling when moving a netdev to another
netns. Most drivers currently do a full reset when the device goes up,
so that will flush the offload state anyway.

Signed-off-by: Sabrina Dubroca 
---
 drivers/net/geneve.c |  7 ++-
 drivers/net/vxlan.c  | 10 +++---
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index ee52233750f7..6c6350b5c201 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -1433,8 +1433,13 @@ static int geneve_netdevice_event(struct notifier_block 
*unused,
struct net_device *dev = netdev_notifier_info_to_dev(ptr);
 
if (event == NETDEV_UDP_TUNNEL_PUSH_INFO ||
-   event == NETDEV_UDP_TUNNEL_DROP_INFO)
+   event == NETDEV_UDP_TUNNEL_DROP_INFO) {
geneve_offload_rx_ports(dev, event == 
NETDEV_UDP_TUNNEL_PUSH_INFO);
+   } else if (event == NETDEV_UNREGISTER) {
+   geneve_offload_rx_ports(dev, false);
+   } else if (event == NETDEV_REGISTER) {
+   geneve_offload_rx_ports(dev, true);
+   }
 
return NOTIFY_DONE;
 }
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 4642d5be2fa0..dbca067540d0 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -3638,11 +3638,15 @@ static int vxlan_netdevice_event(struct notifier_block 
*unused,
struct net_device *dev = netdev_notifier_info_to_dev(ptr);
struct vxlan_net *vn = net_generic(dev_net(dev), vxlan_net_id);
 
-   if (event == NETDEV_UNREGISTER)
+   if (event == NETDEV_UNREGISTER) {
+   vxlan_offload_rx_ports(dev, false);
vxlan_handle_lowerdev_unregister(vn, dev);
-   else if (event == NETDEV_UDP_TUNNEL_PUSH_INFO ||
-event == NETDEV_UDP_TUNNEL_DROP_INFO)
+   } else if (event == NETDEV_REGISTER) {
+   vxlan_offload_rx_ports(dev, true);
+   } else if (event == NETDEV_UDP_TUNNEL_PUSH_INFO ||
+  event == NETDEV_UDP_TUNNEL_DROP_INFO) {
vxlan_offload_rx_ports(dev, event == 
NETDEV_UDP_TUNNEL_PUSH_INFO);
+   }
 
return NOTIFY_DONE;
 }
-- 
2.13.2



[PATCH net-next v2 3/6] net: add infrastructure to un-offload UDP tunnel port

2017-07-21 Thread Sabrina Dubroca
This adds a new NETDEV_UDP_TUNNEL_DROP_INFO event, similar to
NETDEV_UDP_TUNNEL_PUSH_INFO, to signal to un-offload ports.

This also adds udp_tunnel_drop_rx_port(), which calls
ndo_udp_tunnel_del.

Signed-off-by: Sabrina Dubroca 
---
 include/linux/netdevice.h |  1 +
 include/net/udp_tunnel.h  |  8 
 net/ipv4/udp_tunnel.c | 18 ++
 3 files changed, 27 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 614642eb7eb7..3a3cdc1b1f31 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2317,6 +2317,7 @@ struct netdev_lag_lower_state_info {
 #define NETDEV_PRECHANGEUPPER  0x001A
 #define NETDEV_CHANGELOWERSTATE0x001B
 #define NETDEV_UDP_TUNNEL_PUSH_INFO0x001C
+#define NETDEV_UDP_TUNNEL_DROP_INFO0x001D
 #define NETDEV_CHANGE_TX_QUEUE_LEN 0x001E
 
 int register_netdevice_notifier(struct notifier_block *nb);
diff --git a/include/net/udp_tunnel.h b/include/net/udp_tunnel.h
index 02c5be037451..10cce0dd4450 100644
--- a/include/net/udp_tunnel.h
+++ b/include/net/udp_tunnel.h
@@ -115,6 +115,8 @@ struct udp_tunnel_info {
 /* Notify network devices of offloadable types */
 void udp_tunnel_push_rx_port(struct net_device *dev, struct socket *sock,
 unsigned short type);
+void udp_tunnel_drop_rx_port(struct net_device *dev, struct socket *sock,
+unsigned short type);
 void udp_tunnel_notify_add_rx_port(struct socket *sock, unsigned short type);
 void udp_tunnel_notify_del_rx_port(struct socket *sock, unsigned short type);
 
@@ -124,6 +126,12 @@ static inline void udp_tunnel_get_rx_info(struct 
net_device *dev)
call_netdevice_notifiers(NETDEV_UDP_TUNNEL_PUSH_INFO, dev);
 }
 
+static inline void udp_tunnel_drop_rx_info(struct net_device *dev)
+{
+   ASSERT_RTNL();
+   call_netdevice_notifiers(NETDEV_UDP_TUNNEL_DROP_INFO, dev);
+}
+
 /* Transmit the skb using UDP encapsulation. */
 void udp_tunnel_xmit_skb(struct rtable *rt, struct sock *sk, struct sk_buff 
*skb,
 __be32 src, __be32 dst, __u8 tos, __u8 ttl,
diff --git a/net/ipv4/udp_tunnel.c b/net/ipv4/udp_tunnel.c
index 0d3f14cdc524..6539ff15e9a3 100644
--- a/net/ipv4/udp_tunnel.c
+++ b/net/ipv4/udp_tunnel.c
@@ -94,6 +94,24 @@ void udp_tunnel_push_rx_port(struct net_device *dev, struct 
socket *sock,
 }
 EXPORT_SYMBOL_GPL(udp_tunnel_push_rx_port);
 
+void udp_tunnel_drop_rx_port(struct net_device *dev, struct socket *sock,
+unsigned short type)
+{
+   struct sock *sk = sock->sk;
+   struct udp_tunnel_info ti;
+
+   if (!dev->netdev_ops->ndo_udp_tunnel_del ||
+   !(dev->features & NETIF_F_RX_UDP_TUNNEL_PORT))
+   return;
+
+   ti.type = type;
+   ti.sa_family = sk->sk_family;
+   ti.port = inet_sk(sk)->inet_sport;
+
+   dev->netdev_ops->ndo_udp_tunnel_del(dev, );
+}
+EXPORT_SYMBOL_GPL(udp_tunnel_drop_rx_port);
+
 /* Notify netdevs that UDP port started listening */
 void udp_tunnel_notify_add_rx_port(struct socket *sock, unsigned short type)
 {
-- 
2.13.2



[PATCH net-next v2 2/6] net: check UDP tunnel RX port offload feature before calling tunnel ndo ndo

2017-07-21 Thread Sabrina Dubroca
If NETIF_F_RX_UDP_TUNNEL_PORT was disabled on a given netdevice, skip
the tunnel offload ndo call during tunnel port creation and deletion.

Signed-off-by: Sabrina Dubroca 
---
 net/ipv4/udp_tunnel.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/udp_tunnel.c b/net/ipv4/udp_tunnel.c
index 58bd39fb14b4..0d3f14cdc524 100644
--- a/net/ipv4/udp_tunnel.c
+++ b/net/ipv4/udp_tunnel.c
@@ -82,7 +82,8 @@ void udp_tunnel_push_rx_port(struct net_device *dev, struct 
socket *sock,
struct sock *sk = sock->sk;
struct udp_tunnel_info ti;
 
-   if (!dev->netdev_ops->ndo_udp_tunnel_add)
+   if (!dev->netdev_ops->ndo_udp_tunnel_add ||
+   !(dev->features & NETIF_F_RX_UDP_TUNNEL_PORT))
return;
 
ti.type = type;
@@ -109,6 +110,8 @@ void udp_tunnel_notify_add_rx_port(struct socket *sock, 
unsigned short type)
for_each_netdev_rcu(net, dev) {
if (!dev->netdev_ops->ndo_udp_tunnel_add)
continue;
+   if (!(dev->features & NETIF_F_RX_UDP_TUNNEL_PORT))
+   continue;
dev->netdev_ops->ndo_udp_tunnel_add(dev, );
}
rcu_read_unlock();
@@ -131,6 +134,8 @@ void udp_tunnel_notify_del_rx_port(struct socket *sock, 
unsigned short type)
for_each_netdev_rcu(net, dev) {
if (!dev->netdev_ops->ndo_udp_tunnel_del)
continue;
+   if (!(dev->features & NETIF_F_RX_UDP_TUNNEL_PORT))
+   continue;
dev->netdev_ops->ndo_udp_tunnel_del(dev, );
}
rcu_read_unlock();
-- 
2.13.2



Re: [PATCH] bluetooth: document config options

2017-07-21 Thread Pavel Machek
Hi!

> > Kernel config options should include useful help text; I had to look
> > up the terms on wikipedia.
> > 
> > Signed-off-by: Pavel Machek 
> > 
> > diff --git a/net/bluetooth/Kconfig b/net/bluetooth/Kconfig
> > index 68f951b..133c8a6 100644
> > --- a/net/bluetooth/Kconfig
> > +++ b/net/bluetooth/Kconfig
> > @@ -45,6 +45,8 @@ config BT_BREDR
> > bool "Bluetooth Classic (BR/EDR) features"
> > depends on BT
> > default y
> > +   help
> > + Support Bluetooth version 1 and 2 connections.
> 
> so this is actually not correct. Version 5.0 is also Bluetooth BR/EDR.
> 
> I am fine if use the terms "Basic Rate" and "Enhanced Data Rate" in the 
> description, but the version numbers are kinda not how this works. So some 
> alternative wording needs to be used:
> 
> "Bluetooth Classic includes support for Basic Rate (BR) available with 
> Bluetooth version 1.0b or later and support for Enhanced Data Rate (EDR) 
> available with Bluetooth version 2.0 or later."
> 

Ok, works for me.

> > +   help
> > + Bluetooth 3 introduces high-speed mode where bluetooth endpoints
> > + negotiate fast connection over WIFI. This controls its support.
> 
> Again, while Bluetooth version 3.0 introduces HS support, it is not what is 
> selecting this.
> 
> I think here we can write something like this:
> 
> "Bluetooth High Speed includes support for off-loading Bluetooth connections 
> via 802.11 physical layer available with Bluetooth version 3.0 or later."
>

Ok.

> > config BT_LE
> > bool "Bluetooth Low Energy (LE) features"
> > depends on BT
> > default y
> > +   help
> > + Bluetooth 4 introduces special low-energy protocol, designed
> > + for simple devices. This controls its support.
> 
> I think here it is also better to have an alternative text:
> 
> "Bluetooth Low Energy includes support low-energy physical layer available 
> with Bluetooth version 4.0 or later."
> 

Ok. Do I understand it correctly that Bluetooth LE basically has
nothing to do with bluetooth, and that Bluetooth 2 hardware will not
be able to detect / communicate with Bluetooth LE hardware?

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


signature.asc
Description: Digital signature


Re: [PATCH net-next 1/3] bluetooth: 6lowpan dev_close never returns error

2017-07-21 Thread kbuild test robot
Hi Stephen,

[auto build test WARNING on net-next/master]

url:
https://github.com/0day-ci/linux/commits/Stephen-Hemminger/net-make-dev_close-void/20170720-090123
config: x86_64-randconfig-a0-07211734 (attached as .config)
compiler: gcc-4.4 (Debian 4.4.7-8) 4.4.7
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All warnings (new ones prefixed by >>):

   net/bluetooth/6lowpan.c: In function 'ifdown':
>> net/bluetooth/6lowpan.c:621: warning: unused variable 'err'

vim +/err +621 net/bluetooth/6lowpan.c

18722c24 Jukka Rissanen2013-12-11  618  
7f118253 Jukka Rissanen2014-06-18  619  static void ifdown(struct 
net_device *netdev)
7f118253 Jukka Rissanen2014-06-18  620  {
7f118253 Jukka Rissanen2014-06-18 @621  int err;
7f118253 Jukka Rissanen2014-06-18  622  
7f118253 Jukka Rissanen2014-06-18  623  rtnl_lock();
ca74f73e Stephen Hemminger 2017-07-18  624  dev_close(netdev);
7f118253 Jukka Rissanen2014-06-18  625  rtnl_unlock();
7f118253 Jukka Rissanen2014-06-18  626  }
7f118253 Jukka Rissanen2014-06-18  627  

:: The code at line 621 was first introduced by commit
:: 7f118253820fc3ad38659485adb3ebdfe64820e1 Bluetooth: 6LoWPAN: Remove 
network devices when unloading

:: TO: Jukka Rissanen 
:: CC: Marcel Holtmann 

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH] bluetooth: document config options

2017-07-21 Thread Marcel Holtmann
Hi Pavel,

> Kernel config options should include useful help text; I had to look
> up the terms on wikipedia.
> 
> Signed-off-by: Pavel Machek 
> 
> diff --git a/net/bluetooth/Kconfig b/net/bluetooth/Kconfig
> index 68f951b..133c8a6 100644
> --- a/net/bluetooth/Kconfig
> +++ b/net/bluetooth/Kconfig
> @@ -45,6 +45,8 @@ config BT_BREDR
>   bool "Bluetooth Classic (BR/EDR) features"
>   depends on BT
>   default y
> + help
> +   Support Bluetooth version 1 and 2 connections.

so this is actually not correct. Version 5.0 is also Bluetooth BR/EDR.

I am fine if use the terms "Basic Rate" and "Enhanced Data Rate" in the 
description, but the version numbers are kinda not how this works. So some 
alternative wording needs to be used:

"Bluetooth Classic includes support for Basic Rate (BR) available with 
Bluetooth version 1.0b or later and support for Enhanced Data Rate (EDR) 
available with Bluetooth version 2.0 or later."

> source "net/bluetooth/rfcomm/Kconfig"
> 
> @@ -58,11 +60,17 @@ config BT_HS
>   bool "Bluetooth High Speed (HS) features"
>   depends on BT_BREDR
>   default y
> + help
> +   Bluetooth 3 introduces high-speed mode where bluetooth endpoints
> +   negotiate fast connection over WIFI. This controls its support.

Again, while Bluetooth version 3.0 introduces HS support, it is not what is 
selecting this.

I think here we can write something like this:

"Bluetooth High Speed includes support for off-loading Bluetooth connections 
via 802.11 physical layer available with Bluetooth version 3.0 or later."

> 
> config BT_LE
>   bool "Bluetooth Low Energy (LE) features"
>   depends on BT
>   default y
> + help
> +   Bluetooth 4 introduces special low-energy protocol, designed
> +   for simple devices. This controls its support.

I think here it is also better to have an alternative text:

"Bluetooth Low Energy includes support low-energy physical layer available with 
Bluetooth version 4.0 or later."

Regards

Marcel



Re: [PATCH net-next 2/8] netvsc: add some rtnl_dereference annotations

2017-07-21 Thread kbuild test robot
Hi Stephen,

[auto build test ERROR on net-next/master]

url:
https://github.com/0day-ci/linux/commits/Stephen-Hemminger/netvsc-lockdep-and-related-fixes/20170720-191938
config: x86_64-rhel (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

Note: the 
linux-review/Stephen-Hemminger/netvsc-lockdep-and-related-fixes/20170720-191938 
HEAD a5e48342eae178e9ed020d3634cdf92f1f05cb70 builds fine.
  It only hurts bisectibility.

All error/warnings (new ones prefixed by >>):

   drivers/net/hyperv/netvsc.c: In function 'netvsc_switch_datapath':
>> drivers/net/hyperv/netvsc.c:44:33: error: implicit declaration of function 
>> 'rtnl_dereference' [-Werror=implicit-function-declaration]
 struct netvsc_device *nv_dev = rtnl_dereference(net_device_ctx->nvdev);
^~~~
>> drivers/net/hyperv/netvsc.c:44:33: warning: initialization makes pointer 
>> from integer without a cast [-Wint-conversion]
   drivers/net/hyperv/netvsc.c: In function 'netvsc_device_remove':
   drivers/net/hyperv/netvsc.c:553:5: warning: initialization makes pointer 
from integer without a cast [-Wint-conversion]
  = rtnl_dereference(net_device_ctx->nvdev);
^~~~
   cc1: some warnings being treated as errors

vim +/rtnl_dereference +44 drivers/net/hyperv/netvsc.c

35  
36  /*
37   * Switch the data path from the synthetic interface to the VF
38   * interface.
39   */
40  void netvsc_switch_datapath(struct net_device *ndev, bool vf)
41  {
42  struct net_device_context *net_device_ctx = netdev_priv(ndev);
43  struct hv_device *dev = net_device_ctx->device_ctx;
  > 44  struct netvsc_device *nv_dev = 
rtnl_dereference(net_device_ctx->nvdev);
45  struct nvsp_message *init_pkt = _dev->channel_init_pkt;
46  
47  memset(init_pkt, 0, sizeof(struct nvsp_message));
48  init_pkt->hdr.msg_type = NVSP_MSG4_TYPE_SWITCH_DATA_PATH;
49  if (vf)
50  init_pkt->msg.v4_msg.active_dp.active_datapath =
51  NVSP_DATAPATH_VF;
52  else
53  init_pkt->msg.v4_msg.active_dp.active_datapath =
54  NVSP_DATAPATH_SYNTHETIC;
55  
56  vmbus_sendpacket(dev->channel, init_pkt,
57 sizeof(struct nvsp_message),
58 (unsigned long)init_pkt,
59 VM_PKT_DATA_INBAND, 0);
60  
61  net_device_ctx->datapath = vf;
62  }
63  

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH 0/4] Configuring traffic classes via new hardware offload mechanism in tc/mqprio

2017-07-21 Thread Richard Cochran
On Mon, May 22, 2017 at 12:31:12PM -0700, Jeff Kirsher wrote:
> On Fri, 2017-05-19 at 17:58 -0700, Amritha Nambiar wrote:
> > The following series introduces a new harware offload mode in tc/mqprio
> > where the TCs, the queue configurations and bandwidth rate limits are
> > offloaded to the hardware.

...

> This was meant to be sent out as an RFC, but apparently that did not get
> conveyed when these were sent out Friday.

I am looking at tc/mqprio and dcb with the purpose of implementing 

  - Forwarding and Queuing Enhancements for Time-Sensitive Streams (FQTSS)
802.1Q-2014 Clause 34

  - Scheduled Traffic (time based scheduling)
P802.1Qbv

using the HW capabilities of the i210.  This series looks like a
promising avenue for these features.

My question is, did series go anywhere?

I didn't see any follow ups on netdev, but maybe I missed something.

Thanks,
Richard





[PATCH] ceph: kernel client startsync can be removed

2017-07-21 Thread Yanhu Cao
kernel client is still sending write,startsync.
that startsync is a no-op (has been for years) and can probably be removed

Link: http://tracker.ceph.com/issues/20604

Signed-off-by: Yanhu Cao 
---
 fs/ceph/addr.c | 9 ++---
 fs/ceph/file.c | 5 +
 include/linux/ceph/rados.h | 1 -
 net/ceph/osd_client.c  | 5 -
 4 files changed, 3 insertions(+), 17 deletions(-)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 5083628..f9a1805 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -939,7 +939,7 @@ static int ceph_writepages_start(struct address_space 
*mapping,
break;
}
 
-   num_ops = 1 + do_sync;
+   num_ops = 1;
strip_unit_end = page->index +
((len - 1) >> PAGE_SHIFT);
 
@@ -1045,7 +1045,7 @@ static int ceph_writepages_start(struct address_space 
*mapping,
for (i = 0; i < locked_pages; i++) {
u64 cur_offset = page_offset(pages[i]);
if (offset + len != cur_offset) {
-   if (op_idx + do_sync + 1 == req->r_num_ops)
+   if (op_idx + 1 == req->r_num_ops)
break;
osd_req_op_extent_dup_last(req, op_idx,
   cur_offset - offset);
@@ -1082,17 +1082,12 @@ static int ceph_writepages_start(struct address_space 
*mapping,
 0, !!pool, false);
osd_req_op_extent_update(req, op_idx, len);
 
-   if (do_sync) {
-   op_idx++;
-   osd_req_op_init(req, op_idx, CEPH_OSD_OP_STARTSYNC, 0);
-   }
BUG_ON(op_idx + 1 != req->r_num_ops);
 
pool = NULL;
if (i < locked_pages) {
BUG_ON(num_ops <= req->r_num_ops);
num_ops -= req->r_num_ops;
-   num_ops += do_sync;
locked_pages -= i;
 
/* allocate new pages array for next request */
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 3d48c41..405c99b 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -800,7 +800,6 @@ static void ceph_aio_retry_work(struct work_struct *work)
}
 
req->r_ops[0] = orig_req->r_ops[0];
-   osd_req_op_init(req, 1, CEPH_OSD_OP_STARTSYNC, 0);
 
req->r_mtime = aio_req->mtime;
req->r_data_offset = req->r_ops[0].extent.offset;
@@ -874,8 +873,7 @@ static void ceph_aio_retry_work(struct work_struct *work)
vino = ceph_vino(inode);
req = ceph_osdc_new_request(>client->osdc, >i_layout,
vino, pos, , 0,
-   /*include a 'startsync' command*/
-   write ? 2 : 1,
+   1,
write ? CEPH_OSD_OP_WRITE :
CEPH_OSD_OP_READ,
flags, snapc,
@@ -922,7 +920,6 @@ static void ceph_aio_retry_work(struct work_struct *work)
truncate_inode_pages_range(inode->i_mapping, pos,
(pos+len) | (PAGE_SIZE - 1));
 
-   osd_req_op_init(req, 1, CEPH_OSD_OP_STARTSYNC, 0);
req->r_mtime = mtime;
}
 
diff --git a/include/linux/ceph/rados.h b/include/linux/ceph/rados.h
index 385db08..ba5adf0 100644
--- a/include/linux/ceph/rados.h
+++ b/include/linux/ceph/rados.h
@@ -226,7 +226,6 @@ struct ceph_eversion {
\
/* fancy write */   \
f(APPEND,   __CEPH_OSD_OP(WR, DATA, 6), "append")   \
-   f(STARTSYNC,__CEPH_OSD_OP(WR, DATA, 7), "startsync")\
f(SETTRUNC, __CEPH_OSD_OP(WR, DATA, 8), "settrunc") \
f(TRIMTRUNC,__CEPH_OSD_OP(WR, DATA, 9), "trimtrunc")\
\
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index 901bb82..5c9d696 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -863,8 +863,6 @@ static u32 osd_req_encode_op(struct ceph_osd_op *dst,
dst->cls.method_len = src->cls.method_len;
dst->cls.indata_len = cpu_to_le32(src->cls.indata_len);
break;
-   case CEPH_OSD_OP_STARTSYNC:
-   break;
case CEPH_OSD_OP_WATCH:
dst->watch.cookie = 

[PATCH V2 1/4] net-next: dsa: move struct dsa_device_ops to the global header file

2017-07-21 Thread John Crispin
We need to access this struct from within the flow_dissector to fix
dissection for packets coming in on DSA devices.

Signed-off-by: John Crispin 
---
 include/net/dsa.h  | 7 +++
 net/dsa/dsa_priv.h | 7 ---
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 88da272d20d0..a4c0d52abc80 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -101,6 +101,13 @@ struct dsa_platform_data {
 
 struct packet_type;
 
+struct dsa_device_ops {
+   struct sk_buff *(*xmit)(struct sk_buff *skb, struct net_device *dev);
+   struct sk_buff *(*rcv)(struct sk_buff *skb, struct net_device *dev,
+  struct packet_type *pt,
+  struct net_device *orig_dev);
+};
+
 struct dsa_switch_tree {
struct list_headlist;
 
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 55982cc39b24..62ea3663c2c6 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -65,13 +65,6 @@ struct dsa_notifier_vlan_info {
int port;
 };
 
-struct dsa_device_ops {
-   struct sk_buff *(*xmit)(struct sk_buff *skb, struct net_device *dev);
-   struct sk_buff *(*rcv)(struct sk_buff *skb, struct net_device *dev,
-  struct packet_type *pt,
-  struct net_device *orig_dev);
-};
-
 struct dsa_slave_priv {
/* Copy of dp->ds->dst->tag_ops->xmit for faster access in hot path */
struct sk_buff *(*xmit)(struct sk_buff *skb,
-- 
2.11.0



  1   2   >