date:20170922

[GIT] Networking

2017-09-22 Thread David Miller


1) Fix NAPI poll list corruption in emac driver, from Christian
   Lamparter.

2) Fix route use after free, from Eric Dumazet.

3) Fix regression in reuseaddr handling, from Josef Bacik.

4) Assert the size of control messages in compat handling since we
   copy it in from userspace twice.  From Meng Xu.

5) SMC layer bug fixes (missing RCU locking, bad refcounting, etc.)
   from Ursula Braun.

6) Fix races in AF_PACKET fanout handling, from Willem de Bruijn.

7) Don't use ARRAY_SIZE on spinlock array which might have zero
   entries, from Geert Uytterhoeven.

8) Fix miscomputation of checksum in ipv6 udp code, from Subash
   Abhinov Kasiviswanathan.

9) Push the ipv6 header properly in ipv6 GRE tunnel driver, from
   Xin Long.

Please pull, thanks a lot.

The following changes since commit 2bd6bf03f4c1c59381d62c61d03f6cc3fe71f66e:

  Linux 4.14-rc1 (2017-09-16 15:47:51 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 

for you to fetch changes up to 4e683f499a15cd777d3cb51aaebe48d72334c852:

  Merge branch 'net-fix-reuseaddr-regression' (2017-09-22 20:33:18 -0700)


Alex Ng (1):
  hv_netvsc: fix send buffer failure on MTU change

Andreas Gruenbacher (1):
  rhashtable: Documentation tweak

Ariel Elior (1):
  MAINTAINERS: Remove Yuval Mintz from maintainers list

Christian Lamparter (1):
  net: emac: Fix napi poll list corruption

Cong Wang (1):
  net_sched: remove cls_flower idr on failure

Daniel Borkmann (1):
  bpf: fix ri->map_owner pointer on bpf_prog_realloc

David S. Miller (8):
  Merge tag 'mac80211-for-davem-2017-11-19' of git://git.kernel.org/.../jberg/mac80211
  Merge branch 'hns3-bug-fixes'
  Merge git://git.kernel.org/.../pablo/nf
  Merge branch 'hns3-tm-fixes'
  Merge branch 'phylib-xcvr-type'
  Merge branch 'lan78xx-fixes'
  Merge branch 'smc-bug-fixes'
  Merge branch 'net-fix-reuseaddr-regression'

Davide Caratti (1):
  net/sched: cls_matchall: fix crash when used with classful qdisc

Edward Cree (1):
  net: change skb->mac_header when Generic XDP calls adjust_head

Eric Dumazet (4):
  8139too: revisit napi_complete_done() usage
  bpf: do not disable/enable BH in bpf_map_free_id()
  tcp: fastopen: fix on syn-data transmit failure
  net: prevent dst uses after free

Fahad Kunnathadi (1):
  net: phy: Fix mask value write on gmii2rgmii converter speed register

Florian Fainelli (3):
  net: systemport: Fix 64-bit statistics dependency
  net: ethtool: Add back transceiver type
  net: phy: Keep reporting transceiver type

Geert Uytterhoeven (2):
  netfilter: nat: Do not use ARRAY_SIZE() on spinlocks to fix zero div
  net: phy: Fix truncation of large IRQ numbers in phy_attached_print()

Hans Wippel (2):
  net/smc: add missing dev_put
  net/smc: add receive timeout check

Jerome Brunet (1):
  net: phy: Kconfig: Fix PHY infrastructure menu in menuconfig

Johannes Berg (1):
  nl80211: fix null-ptr dereference on invalid mesh configuration

Josef Bacik (3):
  net: set tb->fast_sk_family
  net: use inet6_rcv_saddr to compare sockets
  inet: fix improper empty comparison

Konstantin Khlebnikov (2):
  net_sched: always reset qdisc backlog in qdisc_reset()
  net_sched/hfsc: fix curve activation in hfsc_change_class()

Lipeng (6):
  net: hns3: Fixes initialization of phy address from firmware
  net: hns3: Fixes the command used to unmap ring from vector
  net: hns3: Fixes ring-to-vector map-and-unmap command
  net: hns3: Fixes the initialization of MAC address in hardware
  net: hns3: Fixes the default VLAN-id of PF
  net: hns3: Fixes the premature exit of loop when matching clients

Matteo Croce (1):
  ipv6: fix net.ipv6.conf.all interface DAD handlers

Meng Xu (2):
  net: compat: assert the size of cmsg copied in is as expected
  isdn/i4l: fetch the ppp_write buffer in one shot

Mike Manning (1):
  net: ipv6: fix regression of no RTM_DELADDR sent after DAD failure

Nisar Sayed (3):
  lan78xx: Fix for eeprom read/write when device auto suspend
  lan78xx: Allow EEPROM write for less than MAX_EEPROM_SIZE
  lan78xx: Use default values loaded from EEPROM/OTP after reset

Randy Dunlap (1):
  Documentation: networking: fix ASCII art in switchdev.txt

Salil Mehta (1):
  net: hns3: Fixes the ether address copy with appropriate API

Sathya Perla (1):
  bnxt_en: check for ingress qdisc in flower offload

Stefan Schmidt (1):
  MAINTAINERS: update git tree locations for ieee802154 subsystem

Subash Abhinov Kasiviswanathan (1):
  udpv6: Fix the checksum computation when HW checksum does not apply

Thomas Meyer (1):
  net: stmmac: Cocci spatch "of_table"

Timur Tabi (1):
  net: qcom/emac: add software control for pause frame mode

Tobias Klauser (1):
  bpf: devmap: pass on

Re: [PATCH net-next] bpf/verifier: improve disassembly of BPF_END instructions

2017-09-22 Thread Y Song

On Fri, Sep 22, 2017 at 9:23 AM, Edward Cree  wrote:
> On 22/09/17 16:16, Alexei Starovoitov wrote:
>> looks like we're converging on
>> "be16/be32/be64/le16/le32/le64 #register" for BPF_END.
>> I guess I can live with that. I would prefer more C like syntax
>> to match the rest, but llvm parsing point is a strong one.
> Yep, agreed.  I'll post a v2 once we've settled BPF_NEG.
>> For BPF_NEG I prefer to do it in C syntax like interpreter does:
>> ALU_NEG:
>> DST = (u32) -DST;
>> ALU64_NEG:
>> DST = -DST;
>> Yonghong, does it mean that asmparser will equally suffer?
> Correction to my earlier statements: verifier will currently disassemble
>  neg as:
> (87) r0 neg 0
> (84) (u32) r0 neg (u32) 0
>  because it pretends 'neg' is a compound-assignment operator like +=.
> The analogy with be16 and friends would be to use
> neg64 r0
> neg32 r0
>  whereas the analogy with everything else would be
> r0 = -r0
> r0 = (u32) -r0
>  as Alexei says.
> I'm happy to go with Alexei's version if it doesn't cause problems for llvm.

I got some time to do some prototyping in llvm, and it looks like I am
able to resolve the issue so that we can use a more C-like syntax. That is:
for bswap:
 r1 = (be16) (u16) r1
 or
 r1 = (be16) r1
 or
 r1 = be16 r1
for neg:
 r0 = -r0
 (For 32-bit support, llvm may output "w0 = -w0" in the future. But since
  that is not enabled yet, you can continue to output "r0 = (u32) -r0".)

Not sure which syntax is best for bswap. The "r1 = (be16) (u16) r1" is most
explicit in its intention.
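
For reference, the semantics and the C-like output forms discussed above
can be sketched in a few lines of userspace C. This is only an
illustration and not the verifier's or llvm's code: the helper names
(neg64, neg32, fmt_neg, fmt_bswap) are made up here, and the strings
simply follow the syntax proposed in this thread.

```c
#include <stdint.h>
#include <stdio.h>

/* ALU64_NEG: DST = -DST (modular negation on 64 bits). */
static uint64_t neg64(uint64_t dst)
{
    return -dst;
}

/* ALU_NEG: DST = (u32) -DST, so the upper 32 bits end up zero. */
static uint64_t neg32(uint64_t dst)
{
    return (uint32_t)-dst;
}

/* Hypothetical formatter: "r0 = -r0" or "r0 = (u32) -r0". */
static int fmt_neg(char *buf, size_t len, int reg, int is64)
{
    if (is64)
        return snprintf(buf, len, "r%d = -r%d", reg, reg);
    return snprintf(buf, len, "r%d = (u32) -r%d", reg, reg);
}

/* Hypothetical formatter for the shortest bswap form: "r1 = be16 r1". */
static int fmt_bswap(char *buf, size_t len, int reg, int bits, int to_be)
{
    return snprintf(buf, len, "r%d = %s%d r%d",
                    reg, to_be ? "be" : "le", bits, reg);
}
```

The more explicit "r1 = (be16) (u16) r1" form would only need a different
format string in fmt_bswap.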

Attaching my llvm patch as well and cc'ing Jiong and Jakub so they can see
my implementation and the related discussion here. (In this patch, I did
not implement bswap for little endian yet.) Maybe they can provide
additional comments.

0001-bpf-add-support-for-neg-insn-and-change-format-of-bs.patch
Description: Binary data

Re: pull-request: ieee802154 2017-09-20

2017-09-22 Thread David Miller

From: Stefan Schmidt 
Date: Thu, 21 Sep 2017 22:56:07 +0200

> Here comes a pull request for ieee802154 changes I have queued up for
> this merge window.
> 
> Normally these have been coming through the bluetooth tree but as these
> three have been falling through the cracks so far and I have to review
> and ack all of them anyway I think it makes sense if I save the
> bluetooth people some work and handle them directly.
> 
> It's the first pull request I send to you so please let me know if I did
> something wrong or if you prefer a different format.

Pulled, thanks.

Re: [PATCH net-next v2 0/4] cxgb4: add support to offload tc flower

2017-09-22 Thread David Miller

From: Rahul Lakkireddy 
Date: Thu, 21 Sep 2017 23:41:12 +0530

> This series of patches add support to offload tc flower onto Chelsio
> NICs.
> 
> Patch 1 adds basic skeleton to prepare for offloading tc flower flows.
> 
> Patch 2 adds support to add/remove flows for offload.  Flows can have
> accompanying masks.  Following match and action are currently supported
> for offload:
> Match:  ether-protocol, IPv4/IPv6 addresses, L4 ports (TCP/UDP)
> Action: drop, redirect to another port on the device.
> 
> Patch 3 adds support to offload tc-flower flows having
> vlan actions: pop, push, and modify.
> 
> Patch 4 adds support to fetch stats for the offloaded tc flower flows
> from hardware.
> 
> Support for offloading more match and action types will follow
> in subsequent series.

Series applied, thank you.

Re: [PATCH] net: use 32-bit arithmetic while allocating net device

2017-09-22 Thread David Miller

From: Alexey Dobriyan 
Date: Thu, 21 Sep 2017 23:33:29 +0300

> Private part of allocation is never big enough to warrant size_t.
> 
> Space savings:
> 
>   add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-10 (-10)
>   function                 old     new   delta
>   alloc_netdev_mqs        1120    1110     -10
> 
> Signed-off-by: Alexey Dobriyan 

Applied to net-next.

Re: [PATCH net-next v2] net: Remove useless function skb_header_release

2017-09-22 Thread David Miller

From: gfree.w...@vip.163.com
Date: Fri, 22 Sep 2017 10:25:22 +0800

> From: Gao Feng 
> 
> No caller invokes the function skb_header_release,
> so just remove it now.
> 
> Signed-off-by: Gao Feng 

Applied, thanks.

Re: [PATCH 0/3] fix reuseaddr regression

2017-09-22 Thread David Miller

From: Josef Bacik 
Date: Fri, 22 Sep 2017 20:20:05 -0400

> I introduced a regression when reworking the fastreuse port stuff that allows
> bind conflicts to occur once a reuseaddr successfully opens on an existing tb.
> The root cause is I reversed an if statement which caused us to set the tb as
> if there were no owners on the socket if there were, which obviously is not
> correct.
> 
> Dave could you please queue these changes up for -stable, I've run them
> through the net tests and added another test to check for this problem
> specifically.

Series applied and queued up for -stable, thanks.

Re: [PATCH net v2] net: orphan frags on stand-alone ptype in dev_queue_xmit_nit

2017-09-22 Thread David Miller

From: Willem de Bruijn 
Date: Fri, 22 Sep 2017 19:42:37 -0400

> Zerocopy skbs frags are copied when the skb is looped to a local sock.
> Commit 1080e512d44d ("net: orphan frags on receive") introduced calls
> to skb_orphan_frags to deliver_skb and __netif_receive_skb for this.
> 
> With msg_zerocopy, these skbs can also exist in the tx path and thus
> loop from dev_queue_xmit_nit. This already calls deliver_skb in its
> loop. But it does not orphan before a separate pt_prev->func().
> 
> Add the missing skb_orphan_frags_rx.
> 
> Changes
>   v1->v2: handle skb_orphan_frags_rx failure
> 
> Fixes: 1f8b977ab32d ("sock: enable MSG_ZEROCOPY")
> Signed-off-by: Willem de Bruijn 

Applied and queued up for -stable, thanks.

Re: [PATCH] MAINTAINERS: update git tree locations for ieee802154 subsystem

2017-09-22 Thread David Miller

From: Stefan Schmidt 
Date: Fri, 22 Sep 2017 14:28:46 +0200

> Patches for ieee802154 will go through my new trees towards netdev from
> now on. The 6LoWPAN subsystem will stay as is (shared between ieee802154
> and bluetooth) and go through the bluetooth tree as usual.
> 
> Signed-off-by: Stefan Schmidt 

Applied.

Re: [PATCH] net: stmmac: Meet alignment requirements for DMA

2017-09-22 Thread David Miller

From: Matt Redfearn 
Date: Fri, 22 Sep 2017 12:13:53 +0100

> According to Documentation/DMA-API.txt:
>  Warnings:  Memory coherency operates at a granularity called the cache
>  line width.  In order for memory mapped by this API to operate
>  correctly, the mapped region must begin exactly on a cache line
>  boundary and end exactly on one (to prevent two separately mapped
>  regions from sharing a single cache line).  Since the cache line size
>  may not be known at compile time, the API will not enforce this
>  requirement.  Therefore, it is recommended that driver writers who
>  don't take special care to determine the cache line size at run time
>  only map virtual regions that begin and end on page boundaries (which
>  are guaranteed also to be cache line boundaries).

This is ridiculous.  You're misreading what this document is trying
to explain.

As long as you use the dma_{map,unmap}_single() and sync to/from
device interfaces properly, the cacheline issues will be handled properly
and the cpu and the device will see proper up-to-date memory contents.

It is completely ridiculous to require every driver to stash away two
sets of pointers for every packet, and to DMA map the headroom of the SKB
which is wasteful.

I'm not applying this, fix this problem properly, thanks.

Re: [PATCH 0/5] use setup_timer() helper function.

2017-09-22 Thread David Miller

From: Allen Pais 
Date: Fri, 22 Sep 2017 16:28:17 +0530

> This series uses setup_timer() helper function. The series
> addresses the files under net/*.

There was a recent change to the nfc code in net-next which causes
your patches to not apply.

Please respin against net-next, thanks.

Re: tools: selftests: psock_tpacket: skip un-supported tpacket_v3 test

2017-09-22 Thread David Miller

From: Orson Zhai 
Date: Fri, 22 Sep 2017 18:17:17 +0800

> The TPACKET_V3 test of PACKET_TX_RING will fail with kernel versions
> lower than v4.11. Support for the TX ring was added with commit
> <7f953ab2ba46: af_packet: TX_RING support for TPACKET_V3> on Jan. 3,
> 2017.
> 
> So skip this test item instead of reporting a failure on old kernels.
> 
> Signed-off-by: Orson Zhai 

The whole point is to make sure the kernel in which the selftest
code is present functions properly.

There are many tests in selftests that only work on recent kernels.

I'm not applying this, sorry.

Re: [PATCH net-next] virtio-net: correctly set xdp_xmit for mergeable buffer

2017-09-22 Thread David Miller

From: Jason Wang 
Date: Fri, 22 Sep 2017 14:38:58 +0800

> We should set xdp_xmit only when xdp_do_redirect() succeeds.
> 
> Cc: John Fastabend 
> Signed-off-by: Jason Wang 

Applied, thanks Jason.

Re: [PATCH net-next 10/10] net: hns3: Add mqprio support when interacting with network stack

2017-09-22 Thread Yunsheng Lin

Hi, Jiri

On 2017/9/23 0:03, Jiri Pirko wrote:
> Fri, Sep 22, 2017 at 04:11:51PM CEST, linyunsh...@huawei.com wrote:
>> Hi, Jiri
>>
>>>> - if (!tc) {
>>>> + if (if_running) {
>>>> + (void)hns3_nic_net_stop(netdev);
>>>> + msleep(100);
>>>> + }
>>>> +
>>>> + ret = (kinfo->dcb_ops && kinfo->dcb_ops->setup_tc) ?
>>>> + kinfo->dcb_ops->setup_tc(h, tc, prio_tc) : -EOPNOTSUPP;
>>
>>> This is most odd. Why do you call dcb_ops from ndo_setup_tc callback?
>>> Why are you mixing this together? prio->tc mapping can be done
>>> directly in dcbnl
>>
>> Here is what we do in dcb_ops->setup_tc:
>> Firstly, if current tc num is different from the tc num
>> that user provide, then we setup the queues for each
>> tc.
>>
>> Secondly, we tell hardware the pri to tc mapping that
>> the stack is using. In the rx direction, our hardware needs
>> that mapping to put packets into the queues of different
>> tcs according to the priority of the packet, then
>> rss decides which specific queue in the tc the
>> packet should go to.
>>
>> By mixing, I suppose you meant why we need the
>> pri to tc information?
> 
> by mixing, I mean what I wrote. You are calling dcb_ops callback from
> ndo_setup_tc callback. So you are mixing DCBNL subsystem and TC
> subsystem. Why? Why do you need sch_mqprio? Why DCBNL is not enough for
> all?

When using lldptool, dcbnl is involved.

But when using tc qdisc, dcbnl is not involved. Below are a few key
calls in the kernel when the tc qdisc cmd is executed.

cmd:
tc qdisc add dev eth0 root handle 1:0 mqprio num_tc 4 map 1 2 3 3 1 3 1 1 hw 1

call graph:
rtnetlink_rcv_msg -> tc_modify_qdisc -> qdisc_create -> mqprio_init ->
hns3_nic_setup_tc

When hns3_nic_setup_tc is called, we need to know the tc num and the
prio_tc mapping from the tc_mqprio_qopt which is provided as a parameter
of the ndo_setup_tc function, and dcb_ops is our hardware-specific
method to set up the tc-related parameters in the hardware, so this is why
we call the dcb_ops callback in the ndo_setup_tc callback.

I hope this will answer your question, thanks for your time.
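
The flow described above can be modelled roughly as follows. This is a
userspace sketch only, not the hns3 code: all toy_* names are
hypothetical, the map is limited to the 8 entries given in the tc
command above, and the validation step is an assumption added for
illustration. The optional-callback dispatch follows the ternary in the
quoted hunk.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_PRIO 8  /* the tc command above lists 8 map entries */

/* Stand-in for the hardware-specific ops table. */
struct toy_dcb_ops {
    int (*setup_tc)(uint8_t num_tc, const uint8_t *prio_tc);
};

static uint8_t toy_hw_num_tc;  /* what the fake hardware was last told */

/* Stand-in for the hardware-specific method. */
static int toy_hw_setup_tc(uint8_t num_tc, const uint8_t *prio_tc)
{
    (void)prio_tc;
    toy_hw_num_tc = num_tc;
    return 0;
}

/* Reject a map that points a priority at a tc that does not exist. */
static int toy_map_valid(uint8_t num_tc, const uint8_t *prio_tc)
{
    for (size_t i = 0; i < TOY_MAX_PRIO; i++)
        if (prio_tc[i] >= num_tc)
            return 0;
    return 1;
}

/* Forward the qdisc parameters to the hardware op, if there is one. */
static int toy_setup_tc(const struct toy_dcb_ops *ops, uint8_t num_tc,
                        const uint8_t *prio_tc)
{
    if (!toy_map_valid(num_tc, prio_tc))
        return -EINVAL;
    return (ops && ops->setup_tc) ?
        ops->setup_tc(num_tc, prio_tc) : -EOPNOTSUPP;
}
```

With the map "1 2 3 3 1 3 1 1" and num_tc 4, the call reaches the
hardware stand-in; with no ops registered it returns -EOPNOTSUPP.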

> 
> 
> 
>> I hope I did not misunderstand your question, thanks
>> for your time reviewing.
> 
> .
>

[PATCH net-next] liquidio: pass date and time info to NIC firmware

2017-09-22 Thread Felix Manlunas

From: Veerasenareddy Burru 

Signed-off-by: Veerasenareddy Burru 
Signed-off-by: Manish Awasthi 
Signed-off-by: Felix Manlunas 
---
 .../net/ethernet/cavium/liquidio/octeon_console.c  | 28 +++---
 1 file changed, 25 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/cavium/liquidio/octeon_console.c b/drivers/net/ethernet/cavium/liquidio/octeon_console.c
index ec3dd69..eda799b 100644
--- a/drivers/net/ethernet/cavium/liquidio/octeon_console.c
+++ b/drivers/net/ethernet/cavium/liquidio/octeon_console.c
@@ -803,15 +803,19 @@ static int octeon_console_read(struct octeon_device *oct, u32 console_num,
 }
 
 #define FBUF_SIZE  (4 * 1024 * 1024)
+#define MAX_DATE_SIZE 30
 
 int octeon_download_firmware(struct octeon_device *oct, const u8 *data,
 size_t size)
 {
-   int ret = 0;
+   struct octeon_firmware_file_header *h;
+   char date[MAX_DATE_SIZE];
+   struct timeval time;
u32 crc32_result;
+   struct tm tm_val;
u64 load_addr;
u32 image_len;
-   struct octeon_firmware_file_header *h;
+   int ret = 0;
u32 i, rem;
 
if (size < sizeof(struct octeon_firmware_file_header)) {
@@ -890,11 +894,29 @@ int octeon_download_firmware(struct octeon_device *oct, const u8 *data,
load_addr += size;
}
}
+
+   /* Get time of the day */
+   do_gettimeofday(&time);
+   time_to_tm(time.tv_sec, (-sys_tz.tz_minuteswest) * 60, &tm_val);
+   ret = snprintf(date, MAX_DATE_SIZE,
+  " date=%04ld.%02d.%02d-%02d:%02d:%02d",
+  tm_val.tm_year + 1900, tm_val.tm_mon + 1, tm_val.tm_mday,
+  tm_val.tm_hour, tm_val.tm_min, tm_val.tm_sec);
+   if ((sizeof(h->bootcmd) - strnlen(h->bootcmd, sizeof(h->bootcmd))) <
+   ret) {
+   dev_err(&oct->pci_dev->dev, "Boot command buffer too small\n");
+   return -EINVAL;
+   }
+   strncat(h->bootcmd, date,
+   sizeof(h->bootcmd) - strnlen(h->bootcmd, sizeof(h->bootcmd)));
+
dev_info(&oct->pci_dev->dev, "Writing boot command: %s\n",
 h->bootcmd);
 
/* Invoke the bootcmd */
ret = octeon_console_send_cmd(oct, h->bootcmd, 50);
+   if (ret)
+   dev_info(&oct->pci_dev->dev, "Boot command send failed\n");
 
-   return 0;
+   return ret;
 }
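
As a side note on the append step in the hunk above, here is a minimal
userspace sketch of a bounded append that makes the space check
explicit. It is not the driver's code: toy_append is a hypothetical
helper. It reserves one byte for the terminating NUL, which matters
because strncat() writes a NUL in addition to the n characters it
copies.

```c
#include <errno.h>
#include <string.h>

/*
 * Append src to the NUL-terminated string held in dst (cap bytes total).
 * Returns 0 on success, or -EINVAL if dst is unterminated or src does
 * not fit; dst is left untouched on failure.
 */
static int toy_append(char *dst, size_t cap, const char *src)
{
    const char *end = memchr(dst, '\0', cap);
    size_t used, need;

    if (!end)
        return -EINVAL;
    used = (size_t)(end - dst);
    need = strlen(src) + 1;  /* include the terminator */
    if (need > cap - used)
        return -EINVAL;
    memcpy(dst + used, src, need);
    return 0;
}
```

For example, appending " date=1" to "boot" in a 16-byte buffer succeeds,
and a later append that would need more than the 5 remaining bytes is
refused.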

Re: [PATCH 0/3] fix reuseaddr regression

2017-09-22 Thread Josef Bacik

On Tue, Sep 19, 2017 at 01:50:56PM -0700, David Miller wrote:
> From: jo...@toxicpanda.com
> Date: Mon, 18 Sep 2017 12:28:54 -0400
> 
> > I introduced a regression when reworking the fastreuse port stuff that
> > allows bind conflicts to occur once a reuseaddr socket successfully opens
> > on an existing tb.  The root cause is I reversed an if statement which
> > caused us to set the tb as if there were no owners on the socket if there
> > were, which obviously is not correct.
> > 
> > Dave I have follow up patches that will add a selftest for this case and
> > I ran the other reuseport related tests as well.  These need to go in
> > pretty quickly as it breaks kvm, I've marked them for stable.  Sorry for
> > the regression,
> 
> First, please fix your "From: " field so that it actually has your full
> name rather than just your email address.  This matters when I apply
> your patches.
> 
> Second, remove the stable CC:.  For networking changes, you simply ask
> me to queue the changes up for -stable.
> 

Sorry Dave, I've fixed my git email settings and I dropped the stable cc and sent
a new round.  Didn't see this until just now, my bad.

Josef

[PATCH 1/3] net: set tb->fast_sk_family

2017-09-22 Thread Josef Bacik

From: Josef Bacik 

We need to set the tb->fast_sk_family properly so we can use the proper
comparison function for all subsequent reuseport bind requests.

Fixes: 637bc8bbe6c0 ("inet: reset tb->fastreuseport when adding a reuseport sk")
Reported-and-tested-by: Cole Robinson 
Signed-off-by: Josef Bacik 
---
 net/ipv4/inet_connection_sock.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index b9c64b40a83a..f87f4805e244 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -328,6 +328,7 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
tb->fastuid = uid;
tb->fast_rcv_saddr = sk->sk_rcv_saddr;
tb->fast_ipv6_only = ipv6_only_sock(sk);
+   tb->fast_sk_family = sk->sk_family;
 #if IS_ENABLED(CONFIG_IPV6)
tb->fast_v6_rcv_saddr = sk->sk_v6_rcv_saddr;
 #endif
@@ -354,6 +355,7 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
tb->fastuid = uid;
tb->fast_rcv_saddr = sk->sk_rcv_saddr;
tb->fast_ipv6_only = ipv6_only_sock(sk);
+   tb->fast_sk_family = sk->sk_family;
 #if IS_ENABLED(CONFIG_IPV6)
tb->fast_v6_rcv_saddr = sk->sk_v6_rcv_saddr;
 #endif
-- 
2.7.4

[PATCH 0/3] fix reuseaddr regression

2017-09-22 Thread Josef Bacik

I introduced a regression when reworking the fastreuse port stuff that allows
bind conflicts to occur once a reuseaddr successfully opens on an existing tb.
The root cause is I reversed an if statement which caused us to set the tb as if
there were no owners on the socket if there were, which obviously is not
correct.

Dave could you please queue these changes up for -stable, I've run them through
the net tests and added another test to check for this problem specifically.
Thanks,

Josef

[PATCH 2/3] net: use inet6_rcv_saddr to compare sockets

2017-09-22 Thread Josef Bacik

From: Josef Bacik 

In ipv6_rcv_saddr_equal() we need to use inet6_rcv_saddr(sk) for the
ipv6 compare with the fast socket information to make sure we're doing
the proper comparisons.

Fixes: 637bc8bbe6c0 ("inet: reset tb->fastreuseport when adding a reuseport sk")
Reported-and-tested-by: Cole Robinson 
Signed-off-by: Josef Bacik 
---
 net/ipv4/inet_connection_sock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index f87f4805e244..a1bf30438bc5 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -266,7 +266,7 @@ static inline int sk_reuseport_match(struct inet_bind_bucket *tb,
 #if IS_ENABLED(CONFIG_IPV6)
if (tb->fast_sk_family == AF_INET6)
return ipv6_rcv_saddr_equal(&tb->fast_v6_rcv_saddr,
-   &sk->sk_v6_rcv_saddr,
+   inet6_rcv_saddr(sk),
tb->fast_rcv_saddr,
sk->sk_rcv_saddr,
tb->fast_ipv6_only,
-- 
2.7.4

[PATCH 3/3] inet: fix improper empty comparison

2017-09-22 Thread Josef Bacik

From: Josef Bacik 

When doing my reuseport rework I screwed up and changed a

if (hlist_empty(&tb->owners))

to

if (!hlist_empty(&tb->owners))

This is obviously bad as all of the reuseport/reuse logic was reversed,
which caused weird problems like allowing an ipv4 bind conflict if we
opened an ipv4 only socket on a port followed by an ipv6 only socket on
the same port.

Fixes: b9470c27607b ("inet: kill smallest_size and smallest_port")
Reported-by: Cole Robinson 
Signed-off-by: Josef Bacik 
---
 net/ipv4/inet_connection_sock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index a1bf30438bc5..c039c937ba90 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -321,7 +321,7 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
goto fail_unlock;
}
 success:
-   if (!hlist_empty(&tb->owners)) {
+   if (hlist_empty(&tb->owners)) {
tb->fastreuse = reuse;
if (sk->sk_reuseport) {
tb->fastreuseport = FASTREUSEPORT_ANY;
-- 
2.7.4
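
To see why the inverted test matters, here is a deliberately simplified
userspace model of the bucket bookkeeping. It is a sketch under
assumptions and not the kernel logic: struct toy_bucket and toy_bind are
hypothetical, only the reuseaddr flag is modelled (no reuseport, uid or
address checks), and the "inverted" parameter exists purely to replay
the regression.

```c
#include <stdbool.h>

struct toy_bucket {
    int owners;      /* sockets already bound to this port */
    bool fastreuse;  /* cached hint: every owner asked for reuseaddr */
};

/* Record one successful bind and update the cached hint. */
static void toy_bind(struct toy_bucket *tb, bool reuse, bool inverted)
{
    bool empty = (tb->owners == 0);

    if (inverted)
        empty = !empty;  /* models the reversed if statement */

    if (empty)
        tb->fastreuse = reuse;      /* first owner defines the hint */
    else if (!reuse)
        tb->fastreuse = false;      /* any non-reuse owner clears it */

    tb->owners++;
}
```

In this model, binding a non-reuse socket followed by a reuseaddr socket
leaves fastreuse false with the correct test, but true with the inverted
one, so later binds would wrongly take the fast path.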

[PATCH net-next 2/3] liquidio: verify firmware version when auto-loaded from flash.

2017-09-22 Thread Felix Manlunas

From: Rick Farrington 

Signed-off-by: Rick Farrington 
Signed-off-by: Felix Manlunas 
---
 drivers/net/ethernet/cavium/liquidio/lio_main.c | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/cavium/liquidio/lio_main.c b/drivers/net/ethernet/cavium/liquidio/lio_main.c
index ce08f71..a3c9867 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_main.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_main.c
@@ -3303,7 +3303,7 @@ static int setup_nic_devices(struct octeon_device *octeon_dev)
 {
struct lio *lio = NULL;
struct net_device *netdev;
-   u8 mac[6], i, j;
+   u8 mac[6], i, j, *fw_ver;
struct octeon_soft_command *sc;
struct liquidio_if_cfg_context *ctx;
struct liquidio_if_cfg_resp *resp;
@@ -3414,6 +3414,22 @@ static int setup_nic_devices(struct octeon_device *octeon_dev)
goto setup_nic_dev_fail;
}
 
+   /* Verify f/w version (in case of 'auto' loading from flash) */
+   fw_ver = octeon_dev->fw_info.liquidio_firmware_version;
+   if (memcmp(LIQUIDIO_BASE_VERSION,
+  fw_ver,
+  strlen(LIQUIDIO_BASE_VERSION))) {
+   dev_err(&octeon_dev->pci_dev->dev,
+   "Unmatched firmware version. Expected %s.x, got %s.\n",
+   LIQUIDIO_BASE_VERSION, fw_ver);
+   goto setup_nic_dev_fail;
+   } else if (atomic_read(octeon_dev->adapter_fw_state) ==
+  FW_IS_PRELOADED) {
+   dev_info(&octeon_dev->pci_dev->dev,
+"Using auto-loaded firmware version %s.\n",
+fw_ver);
+   }
+
octeon_swap_8B_data((u64 *)(&resp->cfg_info),
(sizeof(struct liquidio_if_cfg_info)) >> 3);
 
-- 
1.8.3.1
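
The version check in the hunk above is a prefix comparison against the
base version. A minimal userspace sketch of the same idea follows;
toy_fw_version_ok is a hypothetical helper and the "1.6" strings in the
usage note are made-up placeholders, not the real value of
LIQUIDIO_BASE_VERSION. The sketch uses strncmp() so that the comparison
stops at the NUL of a shorter firmware string.

```c
#include <stdbool.h>
#include <string.h>

/* True when fw_ver starts with base, e.g. base "X.Y" and fw_ver "X.Y.z". */
static bool toy_fw_version_ok(const char *base, const char *fw_ver)
{
    return strncmp(base, fw_ver, strlen(base)) == 0;
}
```

With base "1.6", "1.6.1" is accepted and "1.7.0" is rejected. A plain
prefix test also accepts "1.60.0", so a stricter check would have to
look at the character following the prefix.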

[PATCH net-next 3/3] liquidio: update module parameter fw_type to reflect firmware type loaded

2017-09-22 Thread Felix Manlunas

From: Rick Farrington 

Signed-off-by: Rick Farrington 
Signed-off-by: Felix Manlunas 
---
 drivers/net/ethernet/cavium/liquidio/lio_main.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/cavium/liquidio/lio_main.c b/drivers/net/ethernet/cavium/liquidio/lio_main.c
index a3c9867..963803b 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_main.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_main.c
@@ -1934,10 +1934,12 @@ static int load_firmware(struct octeon_device *oct)
char fw_name[LIO_MAX_FW_FILENAME_LEN];
char *tmp_fw_type;
 
-   if (fw_type_is_auto())
+   if (fw_type_is_auto()) {
tmp_fw_type = LIO_FW_NAME_TYPE_NIC;
-   else
+   strncpy(fw_type, tmp_fw_type, sizeof(fw_type));
+   } else {
tmp_fw_type = fw_type;
+   }
 
sprintf(fw_name, "%s%s%s_%s%s", LIO_FW_DIR, LIO_FW_BASE_NAME,
octeon_get_conf(oct)->card_name, tmp_fw_type,
-- 
1.8.3.1

[PATCH net-next 1/3] liquidio: allow override of firmware present in flash

2017-09-22 Thread Felix Manlunas

From: Rick Farrington 

Signed-off-by: Rick Farrington 
Signed-off-by: Felix Manlunas 
---
 drivers/net/ethernet/cavium/liquidio/lio_main.c| 68 ++
 .../net/ethernet/cavium/liquidio/liquidio_image.h  |  1 +
 .../net/ethernet/cavium/liquidio/octeon_device.c   | 11 +++-
 .../net/ethernet/cavium/liquidio/octeon_device.h   | 10 
 4 files changed, 64 insertions(+), 26 deletions(-)

diff --git a/drivers/net/ethernet/cavium/liquidio/lio_main.c b/drivers/net/ethernet/cavium/liquidio/lio_main.c
index e7f5494..ce08f71 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_main.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_main.c
@@ -59,9 +59,9 @@
 module_param(debug, int, 0644);
 MODULE_PARM_DESC(debug, "NETIF_MSG debug bits");
 
-static char fw_type[LIO_MAX_FW_TYPE_LEN] = LIO_FW_NAME_TYPE_NIC;
+static char fw_type[LIO_MAX_FW_TYPE_LEN] = LIO_FW_NAME_TYPE_AUTO;
 module_param_string(fw_type, fw_type, sizeof(fw_type), 0444);
-MODULE_PARM_DESC(fw_type, "Type of firmware to be loaded. Default \"nic\".  Use \"none\" to load firmware from flash.");
+MODULE_PARM_DESC(fw_type, "Type of firmware to be loaded (default is \"auto\"), which uses firmware in flash, if present, else loads \"nic\".");
 
 static u32 console_bitmask;
 module_param(console_bitmask, int, 0644);
@@ -1115,10 +1115,10 @@ static int liquidio_watchdog(void *param)
return 0;
 }
 
-static bool fw_type_is_none(void)
+static bool fw_type_is_auto(void)
 {
-   return strncmp(fw_type, LIO_FW_NAME_TYPE_NONE,
-  sizeof(LIO_FW_NAME_TYPE_NONE)) == 0;
+   return strncmp(fw_type, LIO_FW_NAME_TYPE_AUTO,
+  sizeof(LIO_FW_NAME_TYPE_AUTO)) == 0;
 }
 
 /**
@@ -1302,7 +1302,7 @@ static void octeon_destroy_resources(struct octeon_device *oct)
 * Implementation note: only soft-reset the device
 * if it is a CN6XXX OR the LAST CN23XX device.
 */
-   if (fw_type_is_none())
+   if (atomic_read(oct->adapter_fw_state) == FW_IS_PRELOADED)
octeon_pci_flr(oct);
else if (OCTEON_CN6XXX(oct) || !refcount)
oct->fn_list.soft_reset(oct);
@@ -1934,7 +1934,7 @@ static int load_firmware(struct octeon_device *oct)
char fw_name[LIO_MAX_FW_FILENAME_LEN];
char *tmp_fw_type;
 
-   if (fw_type[0] == '\0')
+   if (fw_type_is_auto())
tmp_fw_type = LIO_FW_NAME_TYPE_NIC;
else
tmp_fw_type = fw_type;
@@ -3882,9 +3882,9 @@ static void nic_starter(struct work_struct *work)
 static int octeon_device_init(struct octeon_device *octeon_dev)
 {
int j, ret;
-   int fw_loaded = 0;
char bootcmd[] = "\n";
char *dbg_enb = NULL;
+   enum lio_fw_state fw_state;
struct octeon_device_priv *oct_priv =
(struct octeon_device_priv *)octeon_dev->priv;
atomic_set(&octeon_dev->status, OCT_DEV_BEGIN_STATE);
@@ -3916,24 +3916,40 @@ static int octeon_device_init(struct octeon_device *octeon_dev)
 
octeon_dev->app_mode = CVM_DRV_INVALID_APP;
 
-   if (OCTEON_CN23XX_PF(octeon_dev)) {
-   if (!cn23xx_fw_loaded(octeon_dev) && !fw_type_is_none()) {
-   fw_loaded = 0;
-   /* Do a soft reset of the Octeon device. */
-   if (octeon_dev->fn_list.soft_reset(octeon_dev))
-   return 1;
-   /* things might have changed */
-   if (!cn23xx_fw_loaded(octeon_dev))
-   fw_loaded = 0;
-   else
-   fw_loaded = 1;
-   } else {
-   fw_loaded = 1;
-   }
-   } else if (octeon_dev->fn_list.soft_reset(octeon_dev)) {
-   return 1;
+   /* CN23XX supports preloaded firmware if the following is true:
+*
+* The adapter indicates that firmware is currently running AND
+* 'fw_type' is 'auto'.
+*
+* (default state is NEEDS_TO_BE_LOADED, override it if appropriate).
+*/
+   if (OCTEON_CN23XX_PF(octeon_dev) &&
+   cn23xx_fw_loaded(octeon_dev) && fw_type_is_auto()) {
+   atomic_cmpxchg(octeon_dev->adapter_fw_state,
+  FW_NEEDS_TO_BE_LOADED, FW_IS_PRELOADED);
}
 
+   /* If loading firmware, only first device of adapter needs to do so. */
+   fw_state = atomic_cmpxchg(octeon_dev->adapter_fw_state,
+ FW_NEEDS_TO_BE_LOADED,
+ FW_IS_BEING_LOADED);
+
+   /* Here, [local variable] 'fw_state' is set to one of:
+*
+*   FW_IS_PRELOADED:   No firmware is to be loaded (see above)
+*   FW_NEEDS_TO_BE_LOADED: The driver's first instance will load
+*

[PATCH net-next 0/3] liquidio: firmware loading

2017-09-22 Thread Felix Manlunas

From: Rick Farrington 

1. Allow host driver parameter to override auto-loaded firmware (in flash).
2. Verify version of firmware that is auto-loaded from flash.
3. Change value of fw_type module parameter to reflect default firmware
   image name that is loaded by host driver (in /sys/module/liquidio/...)

 drivers/net/ethernet/cavium/liquidio/lio_main.c| 90 +++---
 .../net/ethernet/cavium/liquidio/liquidio_image.h  |  1 +
 .../net/ethernet/cavium/liquidio/octeon_device.c   | 11 ++-
 .../net/ethernet/cavium/liquidio/octeon_device.h   | 10 +++
 4 files changed, 84 insertions(+), 28 deletions(-)

-- 
1.8.3.1
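
The "only the first device loads the firmware" step in patch 1/3 relies
on a compare-and-exchange on the shared adapter state. Below is a rough
userspace model using C11 atomics rather than the kernel's atomic_t.
The toy_* names and the numeric values of the states are assumptions
made for this sketch; only the three state names come from the patch.

```c
#include <stdatomic.h>

enum toy_fw_state {
    TOY_FW_NEEDS_TO_BE_LOADED,
    TOY_FW_IS_PRELOADED,
    TOY_FW_IS_BEING_LOADED,
};

/*
 * Like the kernel's atomic_cmpxchg(): store new_val only if *v == old,
 * and return the value that was observed before the attempt.
 */
static int toy_cmpxchg(atomic_int *v, int old, int new_val)
{
    atomic_compare_exchange_strong(v, &old, new_val);
    return old;  /* updated to the current value when the swap fails */
}

/* The caller that gets NEEDS_TO_BE_LOADED back is the one that loads. */
static int toy_claim_fw_load(atomic_int *state)
{
    return toy_cmpxchg(state, TOY_FW_NEEDS_TO_BE_LOADED,
                       TOY_FW_IS_BEING_LOADED);
}
```

The first caller sees TOY_FW_NEEDS_TO_BE_LOADED and proceeds to load;
later callers see TOY_FW_IS_BEING_LOADED, and a preloaded adapter stays
in TOY_FW_IS_PRELOADED because the exchange never matches.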

Re: [PATCH net v2] net: orphan frags on stand-alone ptype in dev_queue_xmit_nit

2017-09-22 Thread Eric Dumazet

On Fri, 2017-09-22 at 19:42 -0400, Willem de Bruijn wrote:
> Zerocopy skbs frags are copied when the skb is looped to a local sock.
> Commit 1080e512d44d ("net: orphan frags on receive") introduced calls
> to skb_orphan_frags to deliver_skb and __netif_receive_skb for this.
> 
> With msg_zerocopy, these skbs can also exist in the tx path and thus
> loop from dev_queue_xmit_nit. This already calls deliver_skb in its
> loop. But it does not orphan before a separate pt_prev->func().
> 
> Add the missing skb_orphan_frags_rx.
> 
> Changes
>   v1->v2: handle skb_orphan_frags_rx failure
> 
> Fixes: 1f8b977ab32d ("sock: enable MSG_ZEROCOPY")
> Signed-off-by: Willem de Bruijn 
> ---
>  net/core/dev.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)

Reviewed-by: Eric Dumazet

[PATCH net v2] net: orphan frags on stand-alone ptype in dev_queue_xmit_nit

2017-09-22 Thread Willem de Bruijn

Zerocopy skbs frags are copied when the skb is looped to a local sock.
Commit 1080e512d44d ("net: orphan frags on receive") introduced calls
to skb_orphan_frags to deliver_skb and __netif_receive_skb for this.

With msg_zerocopy, these skbs can also exist in the tx path and thus
loop from dev_queue_xmit_nit. This already calls deliver_skb in its
loop. But it does not orphan before a separate pt_prev->func().

Add the missing skb_orphan_frags_rx.

Changes
  v1->v2: handle skb_orphan_frags_rx failure

Fixes: 1f8b977ab32d ("sock: enable MSG_ZEROCOPY")
Signed-off-by: Willem de Bruijn 
---
 net/core/dev.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 9a2254f9802f..588b473194a8 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1948,8 +1948,12 @@ void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev)
goto again;
}
 out_unlock:
-   if (pt_prev)
-   pt_prev->func(skb2, skb->dev, pt_prev, skb->dev);
+   if (pt_prev) {
+   if (!skb_orphan_frags_rx(skb2, GFP_ATOMIC))
+   pt_prev->func(skb2, skb->dev, pt_prev, skb->dev);
+   else
+   kfree_skb(skb2);
+   }
rcu_read_unlock();
 }
 EXPORT_SYMBOL_GPL(dev_queue_xmit_nit);
-- 
2.14.1.821.g8fa685d3b7-goog
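
The ownership rule behind the v2 change can be shown outside the kernel. The
following is a minimal userspace sketch, not the kernel code: struct buf,
orphan_frags(), handler() and deliver_last() are hypothetical stand-ins for
the skb clone, skb_orphan_frags_rx(), pt_prev->func() and the out_unlock
path. It assumes the handler consumes the buffer on delivery, so the caller
must free the clone itself whenever delivery is skipped.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a cloned packet buffer; not a kernel type. */
struct buf {
	bool zerocopy;		/* still references user pages */
};

static int delivered, freed;	/* counters, for observation only */

static struct buf *make_buf(bool zerocopy)
{
	struct buf *b = malloc(sizeof(*b));

	if (b)
		b->zerocopy = zerocopy;
	return b;
}

/* Detach the buffer from user pages. Returns 0 on success, -1 when the
 * (simulated) allocation of private copies fails.
 */
static int orphan_frags(struct buf *b, bool alloc_ok)
{
	if (!b->zerocopy)
		return 0;
	if (!alloc_ok)
		return -1;
	b->zerocopy = false;
	return 0;
}

/* The handler takes ownership of the buffer and releases it. */
static void handler(struct buf *b)
{
	delivered++;
	free(b);
}

/* Deliver the clone to the last handler, or free it if orphaning failed.
 * Skipping the handler without freeing would leak the clone.
 */
static void deliver_last(struct buf *clone, bool alloc_ok)
{
	if (!clone)
		return;

	if (!orphan_frags(clone, alloc_ok)) {
		handler(clone);
	} else {
		freed++;
		free(clone);
	}
}
```

In this model every clone is released exactly once, either by the handler or
by the failure branch.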

Re: [PATCH net] net: orphan frags on stand-alone ptype in dev_queue_xmit_nit

2017-09-22 Thread Willem de Bruijn

On Fri, Sep 22, 2017 at 7:04 PM, Eric Dumazet  wrote:
> On Fri, 2017-09-22 at 18:51 -0400, Willem de Bruijn wrote:
>> Zerocopy skbs frags are copied when the skb is looped to a local sock.
>> Commit 1080e512d44d ("net: orphan frags on receive") introduced calls
>> to skb_orphan_frags to deliver_skb and __netif_receive_skb.
>>
>> With msg_zerocopy, these skbs can also exist in the tx path and thus
>> loop from dev_queue_xmit_nit. This already calls deliver_skb in its
>> loop. But it does not orphan before a separate pt_prev->func().
>>
>> Add the missing skb_orphan_frags_rx.
>>
>> Fixes: 1f8b977ab32d ("sock: enable MSG_ZEROCOPY")
>> Signed-off-by: Willem de Bruijn 
>> ---
>>  net/core/dev.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 9a2254f9802f..3f5b26ff4f74 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -1948,7 +1948,7 @@ void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev)
>>   goto again;
>>   }
>>  out_unlock:
>> - if (pt_prev)
>> + if (pt_prev && !skb_orphan_frags_rx(skb2, GFP_ATOMIC))
>>   pt_prev->func(skb2, skb->dev, pt_prev, skb->dev);
>
> Don't you need to kfree_skb(skb2) in case of failure ?

Oh, yes, of course! :/ Will fix right away.

[PATCH net-next v3 0/2] net: dsa: port enabling

2017-09-22 Thread Vivien Didelot

This patchset makes slave open and close symmetrical and provides
helpers for enabling or disabling a given DSA port.

Changes in v3:
  - save the phy_device change for a future patchset

Changes in v2:
  - do not remove the phy argument from port enable/disable

Vivien Didelot (2):
  net: dsa: make slave close symmetrical to open
  net: dsa: add port enable and disable helpers

 net/dsa/dsa_priv.h |  3 ++-
 net/dsa/port.c | 31 ++-
 net/dsa/slave.c| 21 ++---
 3 files changed, 38 insertions(+), 17 deletions(-)

-- 
2.14.1

[PATCH net-next v3 1/2] net: dsa: make slave close symmetrical to open

2017-09-22 Thread Vivien Didelot

The DSA slave open function configures the unicast MAC addresses on the
master device, enables the switch port, changes its STP state, then starts
the PHY device.

Make the close function symmetric, by first stopping the PHY device,
then changing the STP state, disabling the switch port and restoring the
master device.

Signed-off-by: Vivien Didelot 
Reviewed-by: Florian Fainelli 
Reviewed-by: Andrew Lunn 
---
 net/dsa/slave.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 02ace7d462c4..c2bb48579032 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -133,6 +133,11 @@ static int dsa_slave_close(struct net_device *dev)
if (p->phy)
phy_stop(p->phy);
 
+   dsa_port_set_state_now(p->dp, BR_STATE_DISABLED);
+
+   if (ds->ops->port_disable)
+   ds->ops->port_disable(ds, p->dp->index, p->phy);
+
dev_mc_unsync(master, dev);
dev_uc_unsync(master, dev);
if (dev->flags & IFF_ALLMULTI)
@@ -143,11 +148,6 @@ static int dsa_slave_close(struct net_device *dev)
if (!ether_addr_equal(dev->dev_addr, master->dev_addr))
dev_uc_del(master, dev->dev_addr);
 
-   if (ds->ops->port_disable)
-   ds->ops->port_disable(ds, p->dp->index, p->phy);
-
-   dsa_port_set_state_now(p->dp, BR_STATE_DISABLED);
-
return 0;
 }
 
-- 
2.14.1
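
The ordering described in the commit message can be checked with a small
model. This is only a sketch under assumptions: the step names, struct trace
and the model_open()/model_close() functions are hypothetical and merely
record the order of operations listed above; they do not touch any DSA code.

```c
#include <stddef.h>

/* The four steps named in the commit message, in open order. */
enum step {
	STEP_MASTER_ADDR,	/* configure or restore addresses on the master */
	STEP_PORT,		/* enable or disable the switch port */
	STEP_STP_STATE,		/* change the STP state */
	STEP_PHY,		/* start or stop the PHY */
};

#define MAX_STEPS 8

struct trace {
	enum step steps[MAX_STEPS];
	size_t n;
};

static void record(struct trace *t, enum step s)
{
	if (t->n < MAX_STEPS)
		t->steps[t->n++] = s;
}

static void model_open(struct trace *t)
{
	record(t, STEP_MASTER_ADDR);
	record(t, STEP_PORT);
	record(t, STEP_STP_STATE);
	record(t, STEP_PHY);
}

/* Close undoes the steps in the reverse order of open. */
static void model_close(struct trace *t)
{
	record(t, STEP_PHY);
	record(t, STEP_STP_STATE);
	record(t, STEP_PORT);
	record(t, STEP_MASTER_ADDR);
}

/* Returns 1 when the second half of the trace mirrors the first half. */
static int is_mirror(const struct trace *t)
{
	size_t i;

	if (t->n % 2)
		return 0;
	for (i = 0; i < t->n / 2; i++)
		if (t->steps[i] != t->steps[t->n - 1 - i])
			return 0;
	return 1;
}
```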

[PATCH net-next v3 2/2] net: dsa: add port enable and disable helpers

2017-09-22 Thread Vivien Didelot

Provide dsa_port_enable and dsa_port_disable helpers to respectively
enable and disable a switch port. This makes the dsa_port_set_state_now
helper static.

Signed-off-by: Vivien Didelot 
Reviewed-by: Florian Fainelli 
Reviewed-by: Andrew Lunn 
---
 net/dsa/dsa_priv.h |  3 ++-
 net/dsa/port.c | 31 ++-
 net/dsa/slave.c| 19 +--
 3 files changed, 37 insertions(+), 16 deletions(-)

diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 9803952a5b40..0298a0f6a349 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -117,7 +117,8 @@ void dsa_master_ethtool_restore(struct net_device *dev);
 /* port.c */
 int dsa_port_set_state(struct dsa_port *dp, u8 state,
   struct switchdev_trans *trans);
-void dsa_port_set_state_now(struct dsa_port *dp, u8 state);
+int dsa_port_enable(struct dsa_port *dp, struct phy_device *phy);
+void dsa_port_disable(struct dsa_port *dp, struct phy_device *phy);
 int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br);
 void dsa_port_bridge_leave(struct dsa_port *dp, struct net_device *br);
 int dsa_port_vlan_filtering(struct dsa_port *dp, bool vlan_filtering,
diff --git a/net/dsa/port.c b/net/dsa/port.c
index 76d43a82d397..72c8dbd3d3f2 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -56,7 +56,7 @@ int dsa_port_set_state(struct dsa_port *dp, u8 state,
return 0;
 }
 
-void dsa_port_set_state_now(struct dsa_port *dp, u8 state)
+static void dsa_port_set_state_now(struct dsa_port *dp, u8 state)
 {
int err;
 
@@ -65,6 +65,35 @@ void dsa_port_set_state_now(struct dsa_port *dp, u8 state)
pr_err("DSA: failed to set STP state %u (%d)\n", state, err);
 }
 
+int dsa_port_enable(struct dsa_port *dp, struct phy_device *phy)
+{
+   u8 stp_state = dp->bridge_dev ? BR_STATE_BLOCKING : BR_STATE_FORWARDING;
+   struct dsa_switch *ds = dp->ds;
+   int port = dp->index;
+   int err;
+
+   if (ds->ops->port_enable) {
+   err = ds->ops->port_enable(ds, port, phy);
+   if (err)
+   return err;
+   }
+
+   dsa_port_set_state_now(dp, stp_state);
+
+   return 0;
+}
+
+void dsa_port_disable(struct dsa_port *dp, struct phy_device *phy)
+{
+   struct dsa_switch *ds = dp->ds;
+   int port = dp->index;
+
+   dsa_port_set_state_now(dp, BR_STATE_DISABLED);
+
+   if (ds->ops->port_disable)
+   ds->ops->port_disable(ds, port, phy);
+}
+
 int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br)
 {
struct dsa_notifier_bridge_info info = {
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index c2bb48579032..bd51ef56ec5b 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -73,9 +73,7 @@ static int dsa_slave_open(struct net_device *dev)
 {
struct dsa_slave_priv *p = netdev_priv(dev);
struct dsa_port *dp = p->dp;
-   struct dsa_switch *ds = dp->ds;
struct net_device *master = dsa_master_netdev(p);
-   u8 stp_state = dp->bridge_dev ? BR_STATE_BLOCKING : BR_STATE_FORWARDING;
int err;
 
if (!(master->flags & IFF_UP))
@@ -98,13 +96,9 @@ static int dsa_slave_open(struct net_device *dev)
goto clear_allmulti;
}
 
-   if (ds->ops->port_enable) {
-   err = ds->ops->port_enable(ds, p->dp->index, p->phy);
-   if (err)
-   goto clear_promisc;
-   }
-
-   dsa_port_set_state_now(p->dp, stp_state);
+   err = dsa_port_enable(dp, p->phy);
+   if (err)
+   goto clear_promisc;
 
if (p->phy)
phy_start(p->phy);
@@ -128,15 +122,12 @@ static int dsa_slave_close(struct net_device *dev)
 {
struct dsa_slave_priv *p = netdev_priv(dev);
struct net_device *master = dsa_master_netdev(p);
-   struct dsa_switch *ds = p->dp->ds;
+   struct dsa_port *dp = p->dp;
 
if (p->phy)
phy_stop(p->phy);
 
-   dsa_port_set_state_now(p->dp, BR_STATE_DISABLED);
-
-   if (ds->ops->port_disable)
-   ds->ops->port_disable(ds, p->dp->index, p->phy);
+   dsa_port_disable(dp, p->phy);
 
dev_mc_unsync(master, dev);
dev_uc_unsync(master, dev);
-- 
2.14.1
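
The shape of the two helpers (an optional driver callback, error propagation
on enable, a void disable) can be sketched in plain C. The types and names
below (struct port, struct port_ops, port_enable(), port_disable(),
failing_enable()) are hypothetical and only mirror the structure of the patch;
the real helpers take a struct dsa_port and a phy_device.

```c
#include <stddef.h>

enum { STATE_DISABLED, STATE_FORWARDING };

struct port;

/* Driver callbacks; either one may be left NULL. */
struct port_ops {
	int (*enable)(struct port *p);
	void (*disable)(struct port *p);
};

struct port {
	const struct port_ops *ops;
	int state;
};

/* Run the optional driver hook first; only change state if it succeeded. */
static int port_enable(struct port *p)
{
	if (p->ops->enable) {
		int err = p->ops->enable(p);

		if (err)
			return err;
	}

	p->state = STATE_FORWARDING;
	return 0;
}

/* Reverse order of enable: change state first, then call the driver hook. */
static void port_disable(struct port *p)
{
	p->state = STATE_DISABLED;

	if (p->ops->disable)
		p->ops->disable(p);
}

/* Example driver hook that always fails, to exercise the error path. */
static int failing_enable(struct port *p)
{
	(void)p;
	return -5;
}
```

With the callers reduced to a single call each, the error handling in the
open path only has to unwind what was done before the helper was called.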

Re: [PATCH net] net: orphan frags on stand-alone ptype in dev_queue_xmit_nit

2017-09-22 Thread Eric Dumazet

On Fri, 2017-09-22 at 18:51 -0400, Willem de Bruijn wrote:
> Zerocopy skbs frags are copied when the skb is looped to a local sock.
> Commit 1080e512d44d ("net: orphan frags on receive") introduced calls
> to skb_orphan_frags to deliver_skb and __netif_receive_skb.
> 
> With msg_zerocopy, these skbs can also exist in the tx path and thus
> loop from dev_queue_xmit_nit. This already calls deliver_skb in its
> loop. But it does not orphan before a separate pt_prev->func().
> 
> Add the missing skb_orphan_frags_rx.
> 
> Fixes: 1f8b977ab32d ("sock: enable MSG_ZEROCOPY")
> Signed-off-by: Willem de Bruijn 
> ---
>  net/core/dev.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 9a2254f9802f..3f5b26ff4f74 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -1948,7 +1948,7 @@ void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev)
>   goto again;
>   }
>  out_unlock:
> - if (pt_prev)
> + if (pt_prev && !skb_orphan_frags_rx(skb2, GFP_ATOMIC))
>   pt_prev->func(skb2, skb->dev, pt_prev, skb->dev);

Don't you need to kfree_skb(skb2) in case of failure ?

>   rcu_read_unlock();
>  }

[PATCH net] net: orphan frags on stand-alone ptype in dev_queue_xmit_nit

2017-09-22 Thread Willem de Bruijn

Zerocopy skbs frags are copied when the skb is looped to a local sock.
Commit 1080e512d44d ("net: orphan frags on receive") introduced calls
to skb_orphan_frags to deliver_skb and __netif_receive_skb.

With msg_zerocopy, these skbs can also exist in the tx path and thus
loop from dev_queue_xmit_nit. This already calls deliver_skb in its
loop. But it does not orphan before a separate pt_prev->func().

Add the missing skb_orphan_frags_rx.

Fixes: 1f8b977ab32d ("sock: enable MSG_ZEROCOPY")
Signed-off-by: Willem de Bruijn 
---
 net/core/dev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 9a2254f9802f..3f5b26ff4f74 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1948,7 +1948,7 @@ void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev)
goto again;
}
 out_unlock:
-   if (pt_prev)
+   if (pt_prev && !skb_orphan_frags_rx(skb2, GFP_ATOMIC))
pt_prev->func(skb2, skb->dev, pt_prev, skb->dev);
rcu_read_unlock();
 }
-- 
2.14.1.821.g8fa685d3b7-goog

[PATCH net-next] hv_netvsc: Fix the real number of queues of non-vRSS cases

2017-09-22 Thread Haiyang Zhang

From: Haiyang Zhang 

For older hosts without multi-channel (vRSS) support, and some error
cases, we still need to set the real number of queues to one.
This patch adds this missing setting.

Fixes: 8195b1396ec8 ("hv_netvsc: fix deadlock on hotplug")
Signed-off-by: Haiyang Zhang 
---
 drivers/net/hyperv/netvsc_drv.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index d4902ee5f260..68eac12fbf75 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -1929,6 +1929,12 @@ static int netvsc_probe(struct hv_device *dev,
/* We always need headroom for rndis header */
net->needed_headroom = RNDIS_AND_PPI_SIZE;
 
+   /* Initialize the number of queues to be 1, we may change it if more
+* channels are offered later.
+*/
+   netif_set_real_num_tx_queues(net, 1);
+   netif_set_real_num_rx_queues(net, 1);
+
/* Notify the netvsc driver of the new device */
memset(&device_info, 0, sizeof(device_info));
device_info.ring_size = ring_size;
-- 
2.14.1

Re: [RFC PATCH 00/11] udp: full early demux for unconnected sockets

2017-09-22 Thread Eric Dumazet

On Fri, 2017-09-22 at 23:06 +0200, Paolo Abeni wrote:
> This series refactor the UDP early demux code so that:
> 
> * full socket lookup is performed for unicast packets
> * a sk is grabbed even for unconnected socket match
> * a dst cache is used even in such scenario
> 
> To perform this tasks a couple of facilities are added:
> 
> * noref socket references, scoped inside the current RCU section, to be
>   explicitly cleared before leaving such section
> * a dst cache inside the inet and inet6 local addresses tables, caching the
>   related local dst entry
> 
> The measured performance gain under small packet UDP flood is as follow:
> 
> ingress NIC   vanilla   patched   delta
> rx queues     (kpps)    (kpps)    (%)
> [ipv4]
> 1             2177      2414      10
> 2             2527      2892      14
> 3             3050      3733      22


This is a clear sign your program is not using latest SO_REUSEPORT +
[ec]BPF filter [1]

return socket[RX_QUEUE# | or CPU#];

If udp_sink uses SO_REUSEPORT with no extra hint, socket selection is
based on a lazy hash, meaning that you do not have proper siloing.

return socket[hash(skb)];

Multiple CPUs can then:
 - compete on grabbing same socket refcount
 - compete on grabbing the receive queue lock
 - compete for releasing lock and socket refcount
 - skb freeing done on different cpus than where allocated.

You are adding complexity to the kernel because you are using a
sub-optimal user space program, favoring false sharing.

First solve the false sharing issue.

Performance with 2 rx queues should be almost twice the performance with
1 rx queue.

Then we can see if the gains you claim are still applicable.

Thanks

PS: Wei Wan is about to release the IPV6 changes so that the big
differences you showed are going to disappear soon.

Refs [1]

tools/testing/selftests/net/reuseport_bpf.c

6a5ef90c58daada158ba16ba330558efc3471491 Merge branch 'faster-soreuseport'
3ca8e4029969d40ab90e3f1ecd83ab1cadd60fbb soreuseport: BPF selection functional test
538950a1b7527a0a52ccd9337e3fcd304f027f13 soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF
e32ea7e747271a0abcd37e265005e97cc81d9df5 soreuseport: fast reuseport UDP socket selection
ef456144da8ef507c8cf504284b6042e9201a05c soreuseport: define reuseport groups
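
As an illustration of the "return socket[CPU#]" selection mentioned above,
here is a minimal sketch of attaching a classic BPF program to a reuseport
group. It assumes a Linux build environment; the helper names
build_cpu_select_prog() and attach_cpu_select() are made up for this example,
and the fallback #define mirrors what the kernel selftests do when older
headers lack the option. The socket is assumed to have SO_REUSEPORT set
before the program is attached.

```c
#include <linux/filter.h>
#include <sys/socket.h>

#ifndef SO_ATTACH_REUSEPORT_CBPF
#define SO_ATTACH_REUSEPORT_CBPF 51	/* value on most architectures */
#endif

/* Fill a two-instruction classic BPF program that returns the id of the
 * CPU handling the packet. The caller provides storage for the
 * instructions, which must stay valid until the program is attached.
 */
static struct sock_fprog build_cpu_select_prog(struct sock_filter insns[2])
{
	struct sock_fprog prog;

	/* A = id of the CPU processing the packet */
	insns[0] = (struct sock_filter){
		BPF_LD | BPF_W | BPF_ABS, 0, 0,
		(__u32)(SKF_AD_OFF + SKF_AD_CPU)
	};
	/* return A; the kernel uses it as an index into the group */
	insns[1] = (struct sock_filter){ BPF_RET | BPF_A, 0, 0, 0 };

	prog.len = 2;
	prog.filter = insns;
	return prog;
}

/* Attach the program to one socket of the group. Returns the setsockopt()
 * result: 0 on success, -1 with errno set on failure.
 */
static int attach_cpu_select(int fd)
{
	struct sock_filter insns[2];
	struct sock_fprog prog = build_cpu_select_prog(insns);

	return setsockopt(fd, SOL_SOCKET, SO_ATTACH_REUSEPORT_CBPF,
			  &prog, sizeof(prog));
}
```

For the siloing described above to hold, a receiver would typically open one
socket per CPU in a known order and pin each reader thread to the matching
CPU; that part is left out of the sketch.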

[PATCH v4 0/9] bring back stack frame warning with KASAN

2017-09-22 Thread Arnd Bergmann

This is a new version of patches I originally submitted back in March
[1], and last time in June [2]. This time I have basically rewritten
the entire patch series based on a new approach that came out of GCC
PR81715 that I opened[3]. The upcoming gcc-8 release is now much better
at consolidating stack slots for inline function arguments and would
obsolete most of my workaround patches here, but we still need the
workarounds for gcc-5, gcc-6 and gcc-7. Many thanks to Jakub Jelinek
for the analysis and the gcc-8 patch!

This minimal set of patches only makes sure that we do get frame size
warnings in allmodconfig for x86_64 and arm64 again with a 2048 byte
limit, even with KASAN enabled, but without the new KASAN_EXTRA option.

I set the warning limit with KASAN_EXTRA to 3072, limiting the
allmodconfig+KASAN_EXTRA build output to around 50 legitimate warnings.
These are for stack frames up to 31KB that will cause an immediate stack
overflow, and fixing them would require bringing back my older patches
and more. We can debate whether we want to apply those as a follow-up,
or instead remove the option entirely.

Another follow-up series I have reduces the warning limit with
KASAN to 1536, and without KASAN to 1280 for 64-bit architectures.

I hope we can get all patches merged for v4.14 and most of them
backported into stable kernels. Since we no longer have a dependency
on a preparation patch, my preference would be for the respective
subsystem maintainers to pick up the individual patches.
The last patch introduces a couple of "allmodconfig" build warnings
on x86 and arm64 unless the other patches get merged first, I'll
send that again separately once everything else has been taken
care of.

The remaining contents are:
- -fsanitize-address-use-after-scope is moved to a separate
  CONFIG_KASAN_EXTRA option that increases the warning limit
- CONFIG_KASAN_EXTRA is disabled with CONFIG_COMPILE_TEST,
  improving compile speed and disabling code that leads to
  valid warnings on gcc-7.0.1
- KMEMCHECK conflicts with CONFIG_KASAN
- my inline function workaround is applied to netlink, one
  ethernet driver and a few media drivers.
- The rework for the brcmsmac driver from previous versions is
  still there.

Changes since v3:
- I dropped all "noinline_if_stackbloat" annotations and used
  a workaround that introduces additional local variables in the inline
  functions to copy the function arguments, resulting in much better
  object code at the expense of having rather odd-looking functions.
- The v4 patches now don't help with KASAN_EXTRA any more at all,
  CONFIG_KASAN_EXTRA now depends on CONFIG_DEBUG_KERNEL, as it
  is more dangerous in production systems than it was before
- Rewrote the "em28xx" patch to be small enough for a stable backport.
- The rewritten vt-keyboard patches got merged and are now in
  stable kernels as well.

Changes since v2:
- rewrote the vt-keyboard patch based on feedback
- and made KMEMCHECK mutually exclusive with KASAN
  (rather than KASAN_EXTRA)

Changes since v1:
- dropped patches to fix all the CONFIG_KASAN_EXTRA warnings:
 - READ_ONCE/WRITE_ONCE cause problems in lots of code
 - typecheck() causes huge problems in a few places
 - many more uses of noinline_if_stackbloat

 Arnd

[1] https://www.spinics.net/lists/linux-wireless/msg159819.html
[2] https://www.spinics.net/lists/netdev/msg441918.html
[3] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715

Arnd Bergmann (9):
  brcmsmac: make some local variables 'static const' to reduce stack
size
  brcmsmac: split up wlc_phy_workarounds_nphy
  brcmsmac: reindent split functions
  em28xx: fix em28xx_dvb_init for KASAN
  r820t: fix r820t_write_reg for KASAN
  dvb-frontends: fix i2c access helpers for KASAN
  rocker: fix rocker_tlv_put_* functions for KASAN
  netlink: fix nla_put_{u8,u16,u32} for KASAN
  kasan: rework Kconfig settings

 drivers/media/dvb-frontends/ascot2e.c  |4 +-
 drivers/media/dvb-frontends/cxd2841er.c|4 +-
 drivers/media/dvb-frontends/helene.c   |4 +-
 drivers/media/dvb-frontends/horus3a.c  |4 +-
 drivers/media/dvb-frontends/itd1000.c  |5 +-
 drivers/media/dvb-frontends/mt312.c|4 +-
 drivers/media/dvb-frontends/stb0899_drv.c  |3 +-
 drivers/media/dvb-frontends/stb6100.c  |6 +-
 drivers/media/dvb-frontends/stv0367.c  |4 +-
 drivers/media/dvb-frontends/stv090x.c  |4 +-
 drivers/media/dvb-frontends/stv6110x.c |4 +-
 drivers/media/dvb-frontends/zl10039.c  |4 +-
 drivers/media/tuners/r820t.c   |   13 +-
 drivers/media/usb/em28xx/em28xx-dvb.c  |   30 +-
 drivers/net/ethernet/rocker/rocker_tlv.h   |   48 +-
 .../broadcom/brcm80211/brcmsmac/phy/phy_n.c| 1856 ++--
 include/net/netlink.h  |   73 +-
 lib/Kconfig.debug

[PATCH v4 1/9] brcmsmac: make some local variables 'static const' to reduce stack size

2017-09-22 Thread Arnd Bergmann

With KASAN and a couple of other patches applied, this driver is one
of the few remaining ones that actually use more than 2048 bytes of
kernel stack:

broadcom/brcm80211/brcmsmac/phy/phy_n.c: In function 'wlc_phy_workarounds_nphy_gainctrl':
broadcom/brcm80211/brcmsmac/phy/phy_n.c:16065:1: warning: the frame size of 3264 bytes is larger than 2048 bytes [-Wframe-larger-than=]
broadcom/brcm80211/brcmsmac/phy/phy_n.c: In function 'wlc_phy_workarounds_nphy':
broadcom/brcm80211/brcmsmac/phy/phy_n.c:17138:1: warning: the frame size of 2864 bytes is larger than 2048 bytes [-Wframe-larger-than=]

Here, I'm reducing the stack size by marking as many local variables as
'static const' as I can without changing the actual code.

This is the first of three patches to improve the stack usage in this
driver. It would be good to have this backported to stable kernels
to get all drivers in 'allmodconfig' below the 2048 byte limit so
we can turn on the frame warning again globally, but I realize that
the patch is larger than the normal limit for stable backports.

The other two patches do not need to be backported.

Acked-by: Arend van Spriel 
Signed-off-by: Arnd Bergmann 
---
 .../broadcom/brcm80211/brcmsmac/phy/phy_n.c| 197 ++---
 1 file changed, 97 insertions(+), 100 deletions(-)

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c b/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c
index b3aab2fe96eb..ef685465f80a 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c
@@ -14764,8 +14764,8 @@ static void wlc_phy_ipa_restore_tx_digi_filts_nphy(struct brcms_phy *pi)
 }
 
 static void
-wlc_phy_set_rfseq_nphy(struct brcms_phy *pi, u8 cmd, u8 *events, u8 *dlys,
-  u8 len)
+wlc_phy_set_rfseq_nphy(struct brcms_phy *pi, u8 cmd, const u8 *events,
+  const u8 *dlys, u8 len)
 {
u32 t1_offset, t2_offset;
u8 ctr;
@@ -15240,16 +15240,16 @@ static void wlc_phy_workarounds_nphy_gainctrl_2057_rev5(struct brcms_phy *pi)
 static void wlc_phy_workarounds_nphy_gainctrl_2057_rev6(struct brcms_phy *pi)
 {
u16 currband;
-   s8 lna1G_gain_db_rev7[] = { 9, 14, 19, 24 };
-   s8 *lna1_gain_db = NULL;
-   s8 *lna1_gain_db_2 = NULL;
-   s8 *lna2_gain_db = NULL;
-   s8 tiaA_gain_db_rev7[] = { -9, -6, -3, 0, 3, 3, 3, 3, 3, 3 };
-   s8 *tia_gain_db;
-   s8 tiaA_gainbits_rev7[] = { 0, 1, 2, 3, 4, 4, 4, 4, 4, 4 };
-   s8 *tia_gainbits;
-   u16 rfseqA_init_gain_rev7[] = { 0x624f, 0x624f };
-   u16 *rfseq_init_gain;
+   static const s8 lna1G_gain_db_rev7[] = { 9, 14, 19, 24 };
+   const s8 *lna1_gain_db = NULL;
+   const s8 *lna1_gain_db_2 = NULL;
+   const s8 *lna2_gain_db = NULL;
+   static const s8 tiaA_gain_db_rev7[] = { -9, -6, -3, 0, 3, 3, 3, 3, 3, 3 };
+   const s8 *tia_gain_db;
+   static const s8 tiaA_gainbits_rev7[] = { 0, 1, 2, 3, 4, 4, 4, 4, 4, 4 };
+   const s8 *tia_gainbits;
+   static const u16 rfseqA_init_gain_rev7[] = { 0x624f, 0x624f };
+   const u16 *rfseq_init_gain;
u16 init_gaincode;
u16 clip1hi_gaincode;
u16 clip1md_gaincode = 0;
@@ -15310,10 +15310,9 @@ static void wlc_phy_workarounds_nphy_gainctrl_2057_rev6(struct brcms_phy *pi)
 
if ((freq <= 5080) || (freq == 5825)) {
 
-   s8 lna1A_gain_db_rev7[] = { 11, 16, 20, 24 };
-   s8 lna1A_gain_db_2_rev7[] = {
-   11, 17, 22, 25};
-   s8 lna2A_gain_db_rev7[] = { -1, 6, 10, 14 };
+   static const s8 lna1A_gain_db_rev7[] = { 11, 16, 20, 24 };
+   static const s8 lna1A_gain_db_2_rev7[] = { 11, 17, 22, 25};
+   static const s8 lna2A_gain_db_rev7[] = { -1, 6, 10, 14 };
 
crsminu_th = 0x3e;
lna1_gain_db = lna1A_gain_db_rev7;
@@ -15321,10 +15320,9 @@ static void wlc_phy_workarounds_nphy_gainctrl_2057_rev6(struct brcms_phy *pi)
lna2_gain_db = lna2A_gain_db_rev7;
} else if ((freq >= 5500) && (freq <= 5700)) {
 
-   s8 lna1A_gain_db_rev7[] = { 11, 17, 21, 25 };
-   s8 lna1A_gain_db_2_rev7[] = {
-   12, 18, 22, 26};
-   s8 lna2A_gain_db_rev7[] = { 1, 8, 12, 16 };
+   static const s8 lna1A_gain_db_rev7[] = { 11, 17, 21, 25 };
+   static const s8 lna1A_gain_db_2_rev7[] = { 12, 18, 22, 26};
+   static const s8 lna2A_gain_db_rev7[] = { 1, 8, 12, 16 };
 
crsminu_th =
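
The effect of the 'static const' conversion can be seen in a reduced example.
This is a sketch, not driver code: pick_gain() is a hypothetical function, the
table values are copied from the lna1G_gain_db_rev7 initializer in the patch,
and the clamping policy is an assumption made only to keep the example safe.
An automatic array with an initializer occupies a stack slot and is filled on
every call, while a 'static const' array is stored once in read-only data
and is only referenced through a const pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Look up a gain value by index. The table is 'static const', so it does
 * not consume stack space in this function's frame.
 */
static int8_t pick_gain(size_t idx)
{
	static const int8_t gain_db[] = { 9, 14, 19, 24 };
	const int8_t *table = gain_db;	/* pointer must be const-qualified too */
	size_t n = sizeof(gain_db) / sizeof(gain_db[0]);

	/* Clamp out-of-range indices to the last entry (assumed policy). */
	if (idx >= n)
		idx = n - 1;

	return table[idx];
}
```

This also shows why the patch has to add 'const' to the pointer variables and
to the wlc_phy_set_rfseq_nphy() parameters: a pointer to non-const data can no
longer be assigned from the tables without a cast.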

[PATCH v4 2/9] brcmsmac: split up wlc_phy_workarounds_nphy

2017-09-22 Thread Arnd Bergmann

The stack consumption in this driver is still relatively high, with one
remaining warning if the warning level is lowered to 1536 bytes:

drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c:17135:1: error: the frame size of 1880 bytes is larger than 1536 bytes [-Werror=frame-larger-than=]

The affected function is actually a collection of three separate
implementations, and each of them is fairly large by itself. Splitting
them up is done easily and improves readability at the same time.

I'm leaving the original indentation to make the review easier.

Acked-by: Arend van Spriel 
Signed-off-by: Arnd Bergmann 
---
 .../broadcom/brcm80211/brcmsmac/phy/phy_n.c| 178 -
 1 file changed, 104 insertions(+), 74 deletions(-)

This one and the following patch could be merged for either v4.14 or
v4.15 at this point, whichever the maintainers prefer. No need to
backport them to stable kernels.

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c b/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c
index ef685465f80a..ed409a80f3d2 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c
@@ -16061,52 +16061,8 @@ static void wlc_phy_workarounds_nphy_gainctrl(struct brcms_phy *pi)
}
 }
 
-static void wlc_phy_workarounds_nphy(struct brcms_phy *pi)
+static void wlc_phy_workarounds_nphy_rev7(struct brcms_phy *pi)
 {
-   static const u8 rfseq_rx2tx_events[] = {
-   NPHY_RFSEQ_CMD_NOP,
-   NPHY_RFSEQ_CMD_RXG_FBW,
-   NPHY_RFSEQ_CMD_TR_SWITCH,
-   NPHY_RFSEQ_CMD_CLR_HIQ_DIS,
-   NPHY_RFSEQ_CMD_RXPD_TXPD,
-   NPHY_RFSEQ_CMD_TX_GAIN,
-   NPHY_RFSEQ_CMD_EXT_PA
-   };
-   u8 rfseq_rx2tx_dlys[] = { 8, 6, 6, 2, 4, 60, 1 };
-   static const u8 rfseq_tx2rx_events[] = {
-   NPHY_RFSEQ_CMD_NOP,
-   NPHY_RFSEQ_CMD_EXT_PA,
-   NPHY_RFSEQ_CMD_TX_GAIN,
-   NPHY_RFSEQ_CMD_RXPD_TXPD,
-   NPHY_RFSEQ_CMD_TR_SWITCH,
-   NPHY_RFSEQ_CMD_RXG_FBW,
-   NPHY_RFSEQ_CMD_CLR_HIQ_DIS
-   };
-   static const u8 rfseq_tx2rx_dlys[] = { 8, 6, 2, 4, 4, 6, 1 };
-   static const u8 rfseq_tx2rx_events_rev3[] = {
-   NPHY_REV3_RFSEQ_CMD_EXT_PA,
-   NPHY_REV3_RFSEQ_CMD_INT_PA_PU,
-   NPHY_REV3_RFSEQ_CMD_TX_GAIN,
-   NPHY_REV3_RFSEQ_CMD_RXPD_TXPD,
-   NPHY_REV3_RFSEQ_CMD_TR_SWITCH,
-   NPHY_REV3_RFSEQ_CMD_RXG_FBW,
-   NPHY_REV3_RFSEQ_CMD_CLR_HIQ_DIS,
-   NPHY_REV3_RFSEQ_CMD_END
-   };
-   static const u8 rfseq_tx2rx_dlys_rev3[] = { 8, 4, 2, 2, 4, 4, 6, 1 };
-   u8 rfseq_rx2tx_events_rev3[] = {
-   NPHY_REV3_RFSEQ_CMD_NOP,
-   NPHY_REV3_RFSEQ_CMD_RXG_FBW,
-   NPHY_REV3_RFSEQ_CMD_TR_SWITCH,
-   NPHY_REV3_RFSEQ_CMD_CLR_HIQ_DIS,
-   NPHY_REV3_RFSEQ_CMD_RXPD_TXPD,
-   NPHY_REV3_RFSEQ_CMD_TX_GAIN,
-   NPHY_REV3_RFSEQ_CMD_INT_PA_PU,
-   NPHY_REV3_RFSEQ_CMD_EXT_PA,
-   NPHY_REV3_RFSEQ_CMD_END
-   };
-   u8 rfseq_rx2tx_dlys_rev3[] = { 8, 6, 6, 4, 4, 18, 42, 1, 1 };
-
static const u8 rfseq_rx2tx_events_rev3_ipa[] = {
NPHY_REV3_RFSEQ_CMD_NOP,
NPHY_REV3_RFSEQ_CMD_RXG_FBW,
@@ -16120,29 +16076,15 @@ static void wlc_phy_workarounds_nphy(struct brcms_phy *pi)
};
static const u8 rfseq_rx2tx_dlys_rev3_ipa[] = { 8, 6, 6, 4, 4, 16, 43, 1, 1 };
static const u16 rfseq_rx2tx_dacbufpu_rev7[] = { 0x10f, 0x10f };
-
-   s16 alpha0, alpha1, alpha2;
-   s16 beta0, beta1, beta2;
-   u32 leg_data_weights, ht_data_weights, nss1_data_weights,
-   stbc_data_weights;
+   u32 leg_data_weights;
u8 chan_freq_range = 0;
static const u16 dac_control = 0x0002;
u16 aux_adc_vmid_rev7_core0[] = { 0x8e, 0x96, 0x96, 0x96 };
u16 aux_adc_vmid_rev7_core1[] = { 0x8f, 0x9f, 0x9f, 0x96 };
-   u16 aux_adc_vmid_rev4[] = { 0xa2, 0xb4, 0xb4, 0x89 };
-   u16 aux_adc_vmid_rev3[] = { 0xa2, 0xb4, 0xb4, 0x89 };
-   u16 *aux_adc_vmid;
u16 aux_adc_gain_rev7[] = { 0x02, 0x02, 0x02, 0x02 };
-   u16 aux_adc_gain_rev4[] = { 0x02, 0x02, 0x02, 0x00 };
-   u16 aux_adc_gain_rev3[] = { 0x02, 0x02, 0x02, 0x00 };
-   u16 *aux_adc_gain;
-   static const u16 sk_adc_vmid[] = { 0xb4, 0xb4, 0xb4, 0x24 };
-   static const u16 sk_adc_gain[] = { 0x02, 0x02, 0x02, 0x02 };
s32 min_nvar_val = 0x18d;
s32 min_nvar_offset_6mbps = 20;
u8 pdetrange;
-   u8 triso;
-   u16 regval;
u16 afectrl_adc_ctrl1_rev7 = 0x20;
u16 afectrl_adc_ctrl2_rev7 = 0x0;
u16 rfseq_rx2tx_lpf_h_hpc_rev7 = 0x77;
@@ -16171,17 +16113,6 @@ static void

[PATCH v4 8/9] netlink: fix nla_put_{u8,u16,u32} for KASAN

2017-09-22 Thread Arnd Bergmann

When CONFIG_KASAN is enabled, the "--param asan-stack=1" causes rather large
stack frames in some functions. This goes unnoticed normally because
CONFIG_FRAME_WARN is disabled with CONFIG_KASAN by default as of commit
3f181b4d8652 ("lib/Kconfig.debug: disable -Wframe-larger-than warnings with
KASAN=y").

The kernelci.org build bot however has the warning enabled and that led
me to investigate it a little further, as every build produces these warnings:

net/wireless/nl80211.c:4389:1: warning: the frame size of 2240 bytes is larger than 2048 bytes [-Wframe-larger-than=]
net/wireless/nl80211.c:1895:1: warning: the frame size of 3776 bytes is larger than 2048 bytes [-Wframe-larger-than=]
net/wireless/nl80211.c:1410:1: warning: the frame size of 2208 bytes is larger than 2048 bytes [-Wframe-larger-than=]
net/bridge/br_netlink.c:1282:1: warning: the frame size of 2544 bytes is larger than 2048 bytes [-Wframe-larger-than=]

Most of this problem is now solved in gcc-8, which can consolidate
the stack slots for the inline function arguments. On older compilers
we can add a workaround by declaring a local variable in each function
to pass the inline function argument.

Cc: sta...@vger.kernel.org
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715
Signed-off-by: Arnd Bergmann 
---
 include/net/netlink.h | 73 ++-
 1 file changed, 55 insertions(+), 18 deletions(-)

diff --git a/include/net/netlink.h b/include/net/netlink.h
index e51cf5f81597..14c289393071 100644
--- a/include/net/netlink.h
+++ b/include/net/netlink.h
@@ -773,7 +773,10 @@ static inline int nla_parse_nested(struct nlattr *tb[], 
int maxtype,
  */
 static inline int nla_put_u8(struct sk_buff *skb, int attrtype, u8 value)
 {
-   return nla_put(skb, attrtype, sizeof(u8), );
+   /* temporary variables to work around GCC PR81715 with asan-stack=1 */
+   u8 tmp = value;
+
+   return nla_put(skb, attrtype, sizeof(u8), );
 }
 
 /**
@@ -784,7 +787,9 @@ static inline int nla_put_u8(struct sk_buff *skb, int 
attrtype, u8 value)
  */
 static inline int nla_put_u16(struct sk_buff *skb, int attrtype, u16 value)
 {
-   return nla_put(skb, attrtype, sizeof(u16), );
+   u16 tmp = value;
+
+   return nla_put(skb, attrtype, sizeof(u16), );
 }
 
 /**
@@ -795,7 +800,9 @@ static inline int nla_put_u16(struct sk_buff *skb, int 
attrtype, u16 value)
  */
 static inline int nla_put_be16(struct sk_buff *skb, int attrtype, __be16 value)
 {
-   return nla_put(skb, attrtype, sizeof(__be16), );
+   __be16 tmp = value;
+
+   return nla_put(skb, attrtype, sizeof(__be16), );
 }
 
 /**
@@ -806,7 +813,9 @@ static inline int nla_put_be16(struct sk_buff *skb, int 
attrtype, __be16 value)
  */
 static inline int nla_put_net16(struct sk_buff *skb, int attrtype, __be16 
value)
 {
-   return nla_put_be16(skb, attrtype | NLA_F_NET_BYTEORDER, value);
+   __be16 tmp = value;
+
+   return nla_put_be16(skb, attrtype | NLA_F_NET_BYTEORDER, tmp);
 }
 
 /**
@@ -817,7 +826,9 @@ static inline int nla_put_net16(struct sk_buff *skb, int 
attrtype, __be16 value)
  */
 static inline int nla_put_le16(struct sk_buff *skb, int attrtype, __le16 value)
 {
-   return nla_put(skb, attrtype, sizeof(__le16), );
+   __le16 tmp = value;
+
+   return nla_put(skb, attrtype, sizeof(__le16), );
 }
 
 /**
@@ -828,7 +839,9 @@ static inline int nla_put_le16(struct sk_buff *skb, int 
attrtype, __le16 value)
  */
 static inline int nla_put_u32(struct sk_buff *skb, int attrtype, u32 value)
 {
-   return nla_put(skb, attrtype, sizeof(u32), );
+   u32 tmp = value;
+
+   return nla_put(skb, attrtype, sizeof(u32), );
 }
 
 /**
@@ -839,7 +852,9 @@ static inline int nla_put_u32(struct sk_buff *skb, int 
attrtype, u32 value)
  */
 static inline int nla_put_be32(struct sk_buff *skb, int attrtype, __be32 value)
 {
-   return nla_put(skb, attrtype, sizeof(__be32), );
+   __be32 tmp = value;
+
+   return nla_put(skb, attrtype, sizeof(__be32), );
 }
 
 /**
@@ -850,7 +865,9 @@ static inline int nla_put_be32(struct sk_buff *skb, int 
attrtype, __be32 value)
  */
 static inline int nla_put_net32(struct sk_buff *skb, int attrtype, __be32 
value)
 {
-   return nla_put_be32(skb, attrtype | NLA_F_NET_BYTEORDER, value);
+   __be32 tmp = value;
+
+   return nla_put_be32(skb, attrtype | NLA_F_NET_BYTEORDER, tmp);
 }
 
 /**
@@ -861,7 +878,9 @@ static inline int nla_put_net32(struct sk_buff *skb, int 
attrtype, __be32 value)
  */
 static inline int nla_put_le32(struct sk_buff *skb, int attrtype, __le32 value)
 {
-   return nla_put(skb, attrtype, sizeof(__le32), &value);
+   __le32 tmp = value;
+
+   return nla_put(skb, attrtype, sizeof(__le32), &tmp);
 }
 
 /**
@@ -874,7 +893,9 @@ static inline int nla_put_le32(struct sk_buff *skb, int 
attrtype, __le32 value)
 static inline int nla_put_u64_64bit(struct sk_buff *skb, int attrtype,
u64 value,

[PATCH v4 9/9] kasan: rework Kconfig settings

2017-09-22 Thread Arnd Bergmann

We get a lot of very large stack frames using gcc-7.0.1 with the default
-fsanitize-address-use-after-scope --param asan-stack=1 options, which
can easily cause an overflow of the kernel stack, e.g.

drivers/gpu/drm/i915/gvt/handlers.c:2407:1: error: the frame size of 31216 bytes is larger than 2048 bytes
drivers/net/wireless/ralink/rt2x00/rt2800lib.c:5650:1: error: the frame size of 23632 bytes is larger than 2048 bytes
drivers/scsi/fnic/fnic_trace.c:451:1: error: the frame size of 5152 bytes is larger than 2048 bytes
fs/btrfs/relocation.c:1202:1: error: the frame size of 4256 bytes is larger than 2048 bytes
fs/fscache/stats.c:287:1: error: the frame size of 6552 bytes is larger than 2048 bytes
lib/atomic64_test.c:250:1: error: the frame size of 12616 bytes is larger than 2048 bytes
mm/vmscan.c:1367:1: error: the frame size of 5080 bytes is larger than 2048 bytes
net/wireless/nl80211.c:1905:1: error: the frame size of 4232 bytes is larger than 2048 bytes

To reduce this risk, -fsanitize-address-use-after-scope is now split
out into a separate CONFIG_KASAN_EXTRA Kconfig option, leading to stack
frames that are smaller than 2 kilobytes most of the time on x86_64. An
earlier version of this patch also prevented combining KASAN_EXTRA with
KASAN_INLINE, but that is no longer necessary with gcc-7.0.1.

A lot of warnings with KASAN_EXTRA go away if we disable KMEMCHECK,
as -fsanitize-address-use-after-scope seems to understand the builtin
memcpy, but adds checking code around an extern memcpy call. I had to work
around a circular dependency, as DEBUG_SLAB/SLUB depended on !KMEMCHECK,
while KASAN did it the other way round. Now we handle both the same way
and make KASAN and KMEMCHECK mutually exclusive.

All patches to get the frame size below 2048 bytes with CONFIG_KASAN=y
and CONFIG_KASAN_EXTRA=n have been submitted along with this patch, so
we can bring back that default now. KASAN_EXTRA=y still causes lots of
warnings but now defaults to !COMPILE_TEST to disable it in allmodconfig,
and it remains disabled in all other defconfigs since it is a new option.
I arbitrarily raise the warning limit for KASAN_EXTRA to 3072 to reduce
the noise, but an allmodconfig kernel still has around 50 warnings
on gcc-7.

I experimented a bit more with smaller stack frames and have another
follow-up series that reduces the warning limit for 64-bit architectures
to 1280 bytes (without CONFIG_KASAN).

With earlier versions of this patch series, I also had patches to
address the warnings we get with KASAN and/or KASAN_EXTRA, using a
"noinline_if_stackbloat" annotation. That annotation now got replaced with
a gcc-8 bugfix (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715)
and a workaround for older compilers, which means that KASAN_EXTRA is
now just as bad as before and will lead to an instant stack overflow in
a few extreme cases.

This reverts parts of commit 3f181b4 ("lib/Kconfig.debug: disable
-Wframe-larger-than warnings with KASAN=y").

Signed-off-by: Arnd Bergmann 
---
 lib/Kconfig.debug  |  4 ++--
 lib/Kconfig.kasan  | 13 -
 lib/Kconfig.kmemcheck  |  1 +
 scripts/Makefile.kasan |  3 +++
 4 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index b19c491cbc4e..5755875d4a80 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -217,7 +217,7 @@ config ENABLE_MUST_CHECK
 config FRAME_WARN
int "Warn for stack frames larger than (needs gcc 4.4)"
range 0 8192
-   default 0 if KASAN
+   default 3072 if KASAN_EXTRA
default 2048 if GCC_PLUGIN_LATENT_ENTROPY
default 1024 if !64BIT
default 2048 if 64BIT
@@ -503,7 +503,7 @@ config DEBUG_OBJECTS_ENABLE_DEFAULT
 
 config DEBUG_SLAB
bool "Debug slab memory allocations"
-   depends on DEBUG_KERNEL && SLAB && !KMEMCHECK
+   depends on DEBUG_KERNEL && SLAB && !KMEMCHECK && !KASAN
help
  Say Y here to have the kernel do limited verification on memory
  allocation as well as poisoning memory on free to catch use of freed
diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan
index bd38aab05929..db799e6e9dba 100644
--- a/lib/Kconfig.kasan
+++ b/lib/Kconfig.kasan
@@ -5,7 +5,7 @@ if HAVE_ARCH_KASAN
 
 config KASAN
bool "KASan: runtime memory debugger"
-   depends on SLUB || (SLAB && !DEBUG_SLAB)
+   depends on SLUB || SLAB
select CONSTRUCTORS
select STACKDEPOT
help
@@ -20,6 +20,17 @@ config KASAN
  Currently CONFIG_KASAN doesn't work with CONFIG_DEBUG_SLAB
  (the resulting kernel does not boot).
 
+config KASAN_EXTRA
+   bool "KAsan: extra checks"
+   depends on KASAN && DEBUG_KERNEL && !COMPILE_TEST
+   help
+ This enables further checks in the kernel address sanitizer, for now
+ it only includes the address-use-after-scope check that can lead
+ to excessive kernel stack usage, frame size warnings and longer
+

[PATCH v4 7/9] rocker: fix rocker_tlv_put_* functions for KASAN

2017-09-22 Thread Arnd Bergmann

Inlining these functions creates lots of stack variables that each take
64 bytes when KASAN is enabled, leading to this warning about potential
stack overflow:

drivers/net/ethernet/rocker/rocker_ofdpa.c: In function 'ofdpa_cmd_flow_tbl_add':
drivers/net/ethernet/rocker/rocker_ofdpa.c:621:1: error: the frame size of 2752 bytes is larger than 1536 bytes [-Werror=frame-larger-than=]

gcc-8 can now consolidate the stack slots itself, but on older versions
we get the same behavior by using a temporary variable that holds a
copy of the inline function argument.

Cc: sta...@vger.kernel.org
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715
Signed-off-by: Arnd Bergmann 
---
 drivers/net/ethernet/rocker/rocker_tlv.h | 48 
 1 file changed, 30 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/rocker/rocker_tlv.h 
b/drivers/net/ethernet/rocker/rocker_tlv.h
index a63ef82e7c72..dfae3c9d57c6 100644
--- a/drivers/net/ethernet/rocker/rocker_tlv.h
+++ b/drivers/net/ethernet/rocker/rocker_tlv.h
@@ -139,40 +139,52 @@ rocker_tlv_start(struct rocker_desc_info *desc_info)
 int rocker_tlv_put(struct rocker_desc_info *desc_info,
   int attrtype, int attrlen, const void *data);
 
-static inline int rocker_tlv_put_u8(struct rocker_desc_info *desc_info,
-   int attrtype, u8 value)
+static inline int
+rocker_tlv_put_u8(struct rocker_desc_info *desc_info, int attrtype, u8 value)
 {
-   return rocker_tlv_put(desc_info, attrtype, sizeof(u8), &value);
+   u8 tmp = value; /* work around GCC PR81715 */
+
+   return rocker_tlv_put(desc_info, attrtype, sizeof(u8), &tmp);
 }
 
-static inline int rocker_tlv_put_u16(struct rocker_desc_info *desc_info,
-int attrtype, u16 value)
+static inline int
+rocker_tlv_put_u16(struct rocker_desc_info *desc_info, int attrtype, u16 value)
 {
-   return rocker_tlv_put(desc_info, attrtype, sizeof(u16), &value);
+   u16 tmp = value;
+
+   return rocker_tlv_put(desc_info, attrtype, sizeof(u16), &tmp);
 }
 
-static inline int rocker_tlv_put_be16(struct rocker_desc_info *desc_info,
- int attrtype, __be16 value)
+static inline int
+rocker_tlv_put_be16(struct rocker_desc_info *desc_info, int attrtype, __be16 
value)
 {
-   return rocker_tlv_put(desc_info, attrtype, sizeof(__be16), &value);
+   __be16 tmp = value;
+
+   return rocker_tlv_put(desc_info, attrtype, sizeof(__be16), &tmp);
 }
 
-static inline int rocker_tlv_put_u32(struct rocker_desc_info *desc_info,
-int attrtype, u32 value)
+static inline int
+rocker_tlv_put_u32(struct rocker_desc_info *desc_info, int attrtype, u32 value)
 {
-   return rocker_tlv_put(desc_info, attrtype, sizeof(u32), &value);
+   u32 tmp = value;
+
+   return rocker_tlv_put(desc_info, attrtype, sizeof(u32), &tmp);
 }
 
-static inline int rocker_tlv_put_be32(struct rocker_desc_info *desc_info,
- int attrtype, __be32 value)
+static inline int
+rocker_tlv_put_be32(struct rocker_desc_info *desc_info, int attrtype, __be32 
value)
 {
-   return rocker_tlv_put(desc_info, attrtype, sizeof(__be32), &value);
+   __be32 tmp = value;
+
+   return rocker_tlv_put(desc_info, attrtype, sizeof(__be32), &tmp);
 }
 
-static inline int rocker_tlv_put_u64(struct rocker_desc_info *desc_info,
-int attrtype, u64 value)
+static inline int
+rocker_tlv_put_u64(struct rocker_desc_info *desc_info, int attrtype, u64 value)
 {
-   return rocker_tlv_put(desc_info, attrtype, sizeof(u64), &value);
+   u64 tmp = value;
+
+   return rocker_tlv_put(desc_info, attrtype, sizeof(u64), &tmp);
 }
 
 static inline struct rocker_tlv *
-- 
2.9.0
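
The workaround applied throughout the rocker patch above can be sketched
outside the kernel as follows. This is a user-space illustration only:
struct tlv_buf, tlv_put() and tlv_put_u16() are hypothetical stand-ins for
the rocker descriptor and rocker_tlv_put*(), and the stack-frame effect
itself would only be visible when building with an affected gcc and
asan-stack=1.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the descriptor buffer; not the rocker type. */
struct tlv_buf {
	unsigned char data[64];
	size_t len;
};

/* Hypothetical out-of-line helper: stores a type byte, then attrlen bytes. */
static int tlv_put(struct tlv_buf *buf, int attrtype, size_t attrlen,
		   const void *data)
{
	if (buf->len + 1 + attrlen > sizeof(buf->data))
		return -1;
	buf->data[buf->len++] = (unsigned char)attrtype;
	memcpy(buf->data + buf->len, data, attrlen);
	buf->len += attrlen;
	return 0;
}

/*
 * Pass the address of a local copy rather than the address of the
 * inline function argument, mirroring the pattern in the patch.
 */
static inline int tlv_put_u16(struct tlv_buf *buf, int attrtype, uint16_t value)
{
	uint16_t tmp = value;

	return tlv_put(buf, attrtype, sizeof(tmp), &tmp);
}
```

The behavior is the same as passing the argument's address directly; only
the object whose address escapes to the out-of-line helper changes.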

[PATCH v4 4/9] em28xx: fix em28xx_dvb_init for KASAN

2017-09-22 Thread Arnd Bergmann

With CONFIG_KASAN, the init function uses a large amount of kernel stack:

drivers/media/usb/em28xx/em28xx-dvb.c: In function 'em28xx_dvb_init.part.4':
drivers/media/usb/em28xx/em28xx-dvb.c:2061:1: error: the frame size of 3232 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]

It seems that this is triggered in part by using strlcpy(), which the
compiler doesn't recognize as copying at most 'len' bytes, since strlcpy
is not part of the C standard.

It does however recognize the standard strncpy() and optimizes away
the extra checks for that, using only 1688 bytes in the end.
I have another larger patch that we could use in addition to this one,
in order to shrink the stack for -fsanitize-address-use-after-scope
(with gcc-7.1.1) as well, but that would not be appropriate for
stable backports, so let's focus on this one first.
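
As a user-space illustration of the strncpy() replacement described above:
because the structure is zeroed first, copying at most size - 1 bytes
always leaves the final byte as a terminator. NAME_SIZE, struct board_info
and set_type() are hypothetical stand-ins here, and the value 20 is an
assumption rather than something taken from this patch.

```c
#include <assert.h>
#include <string.h>

#define NAME_SIZE 20	/* stand-in for I2C_NAME_SIZE; value assumed */

/* Hypothetical, reduced stand-in for struct i2c_board_info. */
struct board_info {
	char type[NAME_SIZE];
	unsigned short addr;
};

static void set_type(struct board_info *info, const char *name)
{
	/* Zero the whole structure first, as the driver does with memset(). */
	memset(info, 0, sizeof(*info));
	/* Copy at most NAME_SIZE - 1 bytes; the last byte stays zero. */
	strncpy(info->type, name, NAME_SIZE - 1);
}
```

Without the preceding memset(), strncpy() alone would not guarantee
termination for names of NAME_SIZE - 1 bytes or longer.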

Cc: sta...@vger.kernel.org
Signed-off-by: Arnd Bergmann 
---
 drivers/media/usb/em28xx/em28xx-dvb.c | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/media/usb/em28xx/em28xx-dvb.c 
b/drivers/media/usb/em28xx/em28xx-dvb.c
index 4a7db623fe29..06c363dc55ed 100644
--- a/drivers/media/usb/em28xx/em28xx-dvb.c
+++ b/drivers/media/usb/em28xx/em28xx-dvb.c
@@ -1440,7 +1440,7 @@ static int em28xx_dvb_init(struct em28xx *dev)
tda10071_pdata.pll_multiplier = 20,
tda10071_pdata.tuner_i2c_addr = 0x14,
memset(&board_info, 0, sizeof(board_info));
-   strlcpy(board_info.type, "tda10071_cx24118", I2C_NAME_SIZE);
+   strncpy(board_info.type, "tda10071_cx24118", I2C_NAME_SIZE - 1);
board_info.addr = 0x55;
board_info.platform_data = &tda10071_pdata;
request_module("tda10071");
@@ -1460,7 +1460,7 @@ static int em28xx_dvb_init(struct em28xx *dev)
/* attach SEC */
a8293_pdata.dvb_frontend = dvb->fe[0];
memset(&board_info, 0, sizeof(board_info));
-   strlcpy(board_info.type, "a8293", I2C_NAME_SIZE);
+   strncpy(board_info.type, "a8293", I2C_NAME_SIZE - 1);
board_info.addr = 0x08;
board_info.platform_data = &a8293_pdata;
request_module("a8293");
@@ -1643,7 +1643,7 @@ static int em28xx_dvb_init(struct em28xx *dev)
m88ds3103_pdata.ts_clk_pol = 1;
m88ds3103_pdata.agc = 0x99;
memset(&board_info, 0, sizeof(board_info));
-   strlcpy(board_info.type, "m88ds3103", I2C_NAME_SIZE);
+   strncpy(board_info.type, "m88ds3103", I2C_NAME_SIZE - 1);
board_info.addr = 0x68;
board_info.platform_data = &m88ds3103_pdata;
request_module("m88ds3103");
@@ -1664,7 +1664,7 @@ static int em28xx_dvb_init(struct em28xx *dev)
/* attach tuner */
ts2020_config.fe = dvb->fe[0];
memset(&board_info, 0, sizeof(board_info));
-   strlcpy(board_info.type, "ts2022", I2C_NAME_SIZE);
+   strncpy(board_info.type, "ts2022", I2C_NAME_SIZE - 1);
board_info.addr = 0x60;
board_info.platform_data = &ts2020_config;
request_module("ts2020");
@@ -1690,7 +1690,7 @@ static int em28xx_dvb_init(struct em28xx *dev)
/* attach SEC */
a8293_pdata.dvb_frontend = dvb->fe[0];
memset(&board_info, 0, sizeof(board_info));
-   strlcpy(board_info.type, "a8293", I2C_NAME_SIZE);
+   strncpy(board_info.type, "a8293", I2C_NAME_SIZE - 1);
board_info.addr = 0x08;
board_info.platform_data = &a8293_pdata;
request_module("a8293");
@@ -1729,7 +1729,7 @@ static int em28xx_dvb_init(struct em28xx *dev)
si2168_config.fe = &dvb->fe[0];
si2168_config.ts_mode = SI2168_TS_PARALLEL;
memset(&info, 0, sizeof(struct i2c_board_info));
-   strlcpy(info.type, "si2168", I2C_NAME_SIZE);
+   strncpy(info.type, "si2168", I2C_NAME_SIZE - 1);
info.addr = 0x64;
info.platform_data = &si2168_config;
request_module(info.type);
@@ -1755,7 +1755,7 @@ static int em28xx_dvb_init(struct em28xx *dev)
si2157_config.mdev = dev->media_dev;
 #endif
memset(&info, 0, sizeof(struct i2c_board_info));
-   strlcpy(info.type, "si2157", I2C_NAME_SIZE);
+   strncpy(info.type, "si2157", I2C_NAME_SIZE - 1);
info.addr = 0x60;
info.platform_data = &si2157_config;
request_module(info.type);
@@ -1793,7 +1793,7 @@ static int em28xx_dvb_init(struct em28xx *dev)
si2168_config.fe = &dvb->fe[0];
si2168_config.ts_mode = SI2168_TS_PARALLEL;
memset(&info, 0,

[PATCH v4 3/9] brcmsmac: reindent split functions

2017-09-22 Thread Arnd Bergmann

In the previous commit I left the indentation alone to make the patch
easier to review. This one now runs the three new functions through
'indent -kr -8', with some manual fixups to avoid silliness.

No changes other than whitespace are intended here.

Signed-off-by: Arnd Bergmann 
Acked-by: Arend van Spriel 
---
 .../broadcom/brcm80211/brcmsmac/phy/phy_n.c| 1507 +---
 1 file changed, 697 insertions(+), 810 deletions(-)

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c 
b/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c
index ed409a80f3d2..763e8ba6b178 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c
@@ -16074,7 +16074,8 @@ static void wlc_phy_workarounds_nphy_rev7(struct 
brcms_phy *pi)
NPHY_REV3_RFSEQ_CMD_INT_PA_PU,
NPHY_REV3_RFSEQ_CMD_END
};
-   static const u8 rfseq_rx2tx_dlys_rev3_ipa[] = { 8, 6, 6, 4, 4, 16, 43, 
1, 1 };
+   static const u8 rfseq_rx2tx_dlys_rev3_ipa[] =
+   { 8, 6, 6, 4, 4, 16, 43, 1, 1 };
static const u16 rfseq_rx2tx_dacbufpu_rev7[] = { 0x10f, 0x10f };
u32 leg_data_weights;
u8 chan_freq_range = 0;
@@ -16114,526 +16115,452 @@ static void wlc_phy_workarounds_nphy_rev7(struct 
brcms_phy *pi)
int coreNum;
 
 
-   if (NREV_IS(pi->pubpi.phy_rev, 7)) {
-   mod_phy_reg(pi, 0x221, (0x1 << 4), (1 << 4));
-
-   mod_phy_reg(pi, 0x160, (0x7f << 0), (32 << 0));
-   mod_phy_reg(pi, 0x160, (0x7f << 8), (39 << 8));
-   mod_phy_reg(pi, 0x161, (0x7f << 0), (46 << 0));
-   mod_phy_reg(pi, 0x161, (0x7f << 8), (51 << 8));
-   mod_phy_reg(pi, 0x162, (0x7f << 0), (55 << 0));
-   mod_phy_reg(pi, 0x162, (0x7f << 8), (58 << 8));
-   mod_phy_reg(pi, 0x163, (0x7f << 0), (60 << 0));
-   mod_phy_reg(pi, 0x163, (0x7f << 8), (62 << 8));
-   mod_phy_reg(pi, 0x164, (0x7f << 0), (62 << 0));
-   mod_phy_reg(pi, 0x164, (0x7f << 8), (63 << 8));
-   mod_phy_reg(pi, 0x165, (0x7f << 0), (63 << 0));
-   mod_phy_reg(pi, 0x165, (0x7f << 8), (64 << 8));
-   mod_phy_reg(pi, 0x166, (0x7f << 0), (64 << 0));
-   mod_phy_reg(pi, 0x166, (0x7f << 8), (64 << 8));
-   mod_phy_reg(pi, 0x167, (0x7f << 0), (64 << 0));
-   mod_phy_reg(pi, 0x167, (0x7f << 8), (64 << 8));
-   }
-
-   if (NREV_LE(pi->pubpi.phy_rev, 8)) {
-   write_phy_reg(pi, 0x23f, 0x1b0);
-   write_phy_reg(pi, 0x240, 0x1b0);
-   }
+   if (NREV_IS(pi->pubpi.phy_rev, 7)) {
+   mod_phy_reg(pi, 0x221, (0x1 << 4), (1 << 4));
+
+   mod_phy_reg(pi, 0x160, (0x7f << 0), (32 << 0));
+   mod_phy_reg(pi, 0x160, (0x7f << 8), (39 << 8));
+   mod_phy_reg(pi, 0x161, (0x7f << 0), (46 << 0));
+   mod_phy_reg(pi, 0x161, (0x7f << 8), (51 << 8));
+   mod_phy_reg(pi, 0x162, (0x7f << 0), (55 << 0));
+   mod_phy_reg(pi, 0x162, (0x7f << 8), (58 << 8));
+   mod_phy_reg(pi, 0x163, (0x7f << 0), (60 << 0));
+   mod_phy_reg(pi, 0x163, (0x7f << 8), (62 << 8));
+   mod_phy_reg(pi, 0x164, (0x7f << 0), (62 << 0));
+   mod_phy_reg(pi, 0x164, (0x7f << 8), (63 << 8));
+   mod_phy_reg(pi, 0x165, (0x7f << 0), (63 << 0));
+   mod_phy_reg(pi, 0x165, (0x7f << 8), (64 << 8));
+   mod_phy_reg(pi, 0x166, (0x7f << 0), (64 << 0));
+   mod_phy_reg(pi, 0x166, (0x7f << 8), (64 << 8));
+   mod_phy_reg(pi, 0x167, (0x7f << 0), (64 << 0));
+   mod_phy_reg(pi, 0x167, (0x7f << 8), (64 << 8));
+   }
 
-   if (NREV_GE(pi->pubpi.phy_rev, 8))
-   mod_phy_reg(pi, 0xbd, (0xff << 0), (114 << 0));
+   if (NREV_LE(pi->pubpi.phy_rev, 8)) {
+   write_phy_reg(pi, 0x23f, 0x1b0);
+   write_phy_reg(pi, 0x240, 0x1b0);
+   }
 
-   wlc_phy_table_write_nphy(pi, NPHY_TBL_ID_AFECTRL, 1, 0x00, 16,
-_control);
-   wlc_phy_table_write_nphy(pi, NPHY_TBL_ID_AFECTRL, 1, 0x10, 16,
-_control);
+   if (NREV_GE(pi->pubpi.phy_rev, 8))
+   mod_phy_reg(pi, 0xbd, (0xff << 0), (114 << 0));
 
-   wlc_phy_table_read_nphy(pi, NPHY_TBL_ID_CMPMETRICDATAWEIGHTTBL,
-   1, 0, 32, &leg_data_weights);
-   leg_data_weights = leg_data_weights & 0xff;
-   wlc_phy_table_write_nphy(pi, NPHY_TBL_ID_CMPMETRICDATAWEIGHTTBL,
-

[PATCH v4 6/9] dvb-frontends: fix i2c access helpers for KASAN

2017-09-22 Thread Arnd Bergmann

A typical code fragment was copied across many dvb-frontend drivers and
causes large stack frames when built with CONFIG_KASAN on gcc-5/6/7:

drivers/media/dvb-frontends/cxd2841er.c:3225:1: error: the frame size of 3992 bytes is larger than 3072 bytes [-Werror=frame-larger-than=]
drivers/media/dvb-frontends/cxd2841er.c:3404:1: error: the frame size of 3136 bytes is larger than 3072 bytes [-Werror=frame-larger-than=]
drivers/media/dvb-frontends/stv0367.c:3143:1: error: the frame size of 4016 bytes is larger than 3072 bytes [-Werror=frame-larger-than=]
drivers/media/dvb-frontends/stv090x.c:3430:1: error: the frame size of 5312 bytes is larger than 3072 bytes [-Werror=frame-larger-than=]
drivers/media/dvb-frontends/stv090x.c:4248:1: error: the frame size of 4872 bytes is larger than 3072 bytes [-Werror=frame-larger-than=]

gcc-8 now solves this by consolidating the stack slots for the argument
variables, but on older compilers we can get the same behavior by taking
the pointer of a local variable rather than the inline function argument.

Cc: sta...@vger.kernel.org
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715
Signed-off-by: Arnd Bergmann 
---
I'm undecided whether each file that uses the bogus local variable
workaround should carry a comment pointing to PR81715, to prevent the
workaround from being cleaned up again. It's probably not necessary,
since anything that causes actual problems would also trigger a build
warning.
---
 drivers/media/dvb-frontends/ascot2e.c | 4 +++-
 drivers/media/dvb-frontends/cxd2841er.c   | 4 +++-
 drivers/media/dvb-frontends/helene.c  | 4 +++-
 drivers/media/dvb-frontends/horus3a.c | 4 +++-
 drivers/media/dvb-frontends/itd1000.c | 5 +++--
 drivers/media/dvb-frontends/mt312.c   | 4 +++-
 drivers/media/dvb-frontends/stb0899_drv.c | 3 ++-
 drivers/media/dvb-frontends/stb6100.c | 6 --
 drivers/media/dvb-frontends/stv0367.c | 4 +++-
 drivers/media/dvb-frontends/stv090x.c | 4 +++-
 drivers/media/dvb-frontends/stv6110x.c| 4 +++-
 drivers/media/dvb-frontends/zl10039.c | 4 +++-
 12 files changed, 36 insertions(+), 14 deletions(-)

diff --git a/drivers/media/dvb-frontends/ascot2e.c 
b/drivers/media/dvb-frontends/ascot2e.c
index 0ee0df53b91b..1219272ca3f0 100644
--- a/drivers/media/dvb-frontends/ascot2e.c
+++ b/drivers/media/dvb-frontends/ascot2e.c
@@ -155,7 +155,9 @@ static int ascot2e_write_regs(struct ascot2e_priv *priv,
 
 static int ascot2e_write_reg(struct ascot2e_priv *priv, u8 reg, u8 val)
 {
-   return ascot2e_write_regs(priv, reg, &val, 1);
+   u8 tmp = val;
+
+   return ascot2e_write_regs(priv, reg, &tmp, 1);
 }
 
 static int ascot2e_read_regs(struct ascot2e_priv *priv,
diff --git a/drivers/media/dvb-frontends/cxd2841er.c 
b/drivers/media/dvb-frontends/cxd2841er.c
index 48ee9bc00c06..b7574deff5c6 100644
--- a/drivers/media/dvb-frontends/cxd2841er.c
+++ b/drivers/media/dvb-frontends/cxd2841er.c
@@ -257,7 +257,9 @@ static int cxd2841er_write_regs(struct cxd2841er_priv *priv,
 static int cxd2841er_write_reg(struct cxd2841er_priv *priv,
   u8 addr, u8 reg, u8 val)
 {
-   return cxd2841er_write_regs(priv, addr, reg, &val, 1);
+   u8 tmp = val;
+
+   return cxd2841er_write_regs(priv, addr, reg, &tmp, 1);
 }
 
 static int cxd2841er_read_regs(struct cxd2841er_priv *priv,
diff --git a/drivers/media/dvb-frontends/helene.c 
b/drivers/media/dvb-frontends/helene.c
index 4bf5a551ba40..6e93f2d1575b 100644
--- a/drivers/media/dvb-frontends/helene.c
+++ b/drivers/media/dvb-frontends/helene.c
@@ -331,7 +331,9 @@ static int helene_write_regs(struct helene_priv *priv,
 
 static int helene_write_reg(struct helene_priv *priv, u8 reg, u8 val)
 {
-   return helene_write_regs(priv, reg, &val, 1);
+   u8 tmp = val;
+
+   return helene_write_regs(priv, reg, &tmp, 1);
 }
 
 static int helene_read_regs(struct helene_priv *priv,
diff --git a/drivers/media/dvb-frontends/horus3a.c 
b/drivers/media/dvb-frontends/horus3a.c
index 68d759c4c52e..fa9e2d373073 100644
--- a/drivers/media/dvb-frontends/horus3a.c
+++ b/drivers/media/dvb-frontends/horus3a.c
@@ -89,7 +89,9 @@ static int horus3a_write_regs(struct horus3a_priv *priv,
 
 static int horus3a_write_reg(struct horus3a_priv *priv, u8 reg, u8 val)
 {
-   return horus3a_write_regs(priv, reg, &val, 1);
+   u8 tmp = val;
+
+   return horus3a_write_regs(priv, reg, &tmp, 1);
 }
 
 static int horus3a_enter_power_save(struct horus3a_priv *priv)
diff --git a/drivers/media/dvb-frontends/itd1000.c 
b/drivers/media/dvb-frontends/itd1000.c
index 5bb1e73a10b4..1ac5177162f6 100644
--- a/drivers/media/dvb-frontends/itd1000.c
+++ b/drivers/media/dvb-frontends/itd1000.c
@@ -95,8 +95,9 @@ static int itd1000_read_reg(struct itd1000_state *state, u8 
reg)
 
 static inline int itd1000_write_reg(struct itd1000_state *state, u8 r, u8 v)
 {
-   int ret = itd1000_write_regs(state, r, &v, 1);
-   state->shadow[r] = v;
+   u8 tmp = v;
+   int ret =

[PATCH v4 5/9] r820t: fix r820t_write_reg for KASAN

2017-09-22 Thread Arnd Bergmann

With CONFIG_KASAN, we get an overly long stack frame due to inlining
the register access functions:

drivers/media/tuners/r820t.c: In function 'generic_set_freq.isra.7':
drivers/media/tuners/r820t.c:1334:1: error: the frame size of 2880 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]

This is caused by a gcc bug that has now been fixed in gcc-8.
To work around the problem, we can pass the register data
through a local variable that older gcc versions can optimize
out as well.

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715
Signed-off-by: Arnd Bergmann 
---
 drivers/media/tuners/r820t.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/media/tuners/r820t.c b/drivers/media/tuners/r820t.c
index ba80376a3b86..d097eb04a0e9 100644
--- a/drivers/media/tuners/r820t.c
+++ b/drivers/media/tuners/r820t.c
@@ -396,9 +396,11 @@ static int r820t_write(struct r820t_priv *priv, u8 reg, 
const u8 *val,
return 0;
 }
 
-static int r820t_write_reg(struct r820t_priv *priv, u8 reg, u8 val)
+static inline int r820t_write_reg(struct r820t_priv *priv, u8 reg, u8 val)
 {
-   return r820t_write(priv, reg, &val, 1);
+   u8 tmp = val; /* work around GCC PR81715 with asan-stack=1 */
+
+   return r820t_write(priv, reg, &tmp, 1);
 }
 
 static int r820t_read_cache_reg(struct r820t_priv *priv, int reg)
@@ -411,17 +413,18 @@ static int r820t_read_cache_reg(struct r820t_priv *priv, 
int reg)
return -EINVAL;
 }
 
-static int r820t_write_reg_mask(struct r820t_priv *priv, u8 reg, u8 val,
+static inline int r820t_write_reg_mask(struct r820t_priv *priv, u8 reg, u8 val,
u8 bit_mask)
 {
+   u8 tmp = val;
int rc = r820t_read_cache_reg(priv, reg);
 
if (rc < 0)
return rc;
 
-   val = (rc & ~bit_mask) | (val & bit_mask);
+   tmp = (rc & ~bit_mask) | (tmp & bit_mask);
 
-   return r820t_write(priv, reg, &val, 1);
+   return r820t_write(priv, reg, &tmp, 1);
 }
 
 static int r820t_read(struct r820t_priv *priv, u8 reg, u8 *val, int len)
-- 
2.9.0
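
The read-modify-write step in r820t_write_reg_mask() above keeps the cached
bits outside the mask and takes the new bits inside it. A minimal
user-space sketch of just that expression, where merge_masked() is a
hypothetical name that does not exist in the driver:

```c
#include <assert.h>
#include <stdint.h>

/* Keep the cached bits outside bit_mask, take the new bits inside it. */
static uint8_t merge_masked(uint8_t cached, uint8_t val, uint8_t bit_mask)
{
	return (uint8_t)((cached & ~bit_mask) | (val & bit_mask));
}
```

For example, merge_masked(0xF0, 0x0A, 0x0F) yields 0xFA: the high nibble
comes from the cached value and the low nibble from the new one.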

[RFC PATCH 07/11] ipv6/addrconf: add a helper for inet6 address lookup

2017-09-22 Thread Paolo Abeni

Reduce code duplication; this will simplify a follow-up patch.

Signed-off-by: Paolo Abeni 
---
 net/ipv6/addrconf.c | 65 +
 1 file changed, 31 insertions(+), 34 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index c2e2a78787ec..5940062cac8d 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1796,35 +1796,46 @@ int ipv6_chk_addr(struct net *net, const struct 
in6_addr *addr,
 }
 EXPORT_SYMBOL(ipv6_chk_addr);
 
+/* called under RCU lock with bh disabled */
+static struct inet6_ifaddr *ipv6_lookup_ifaddr_rcu_bh(struct net *net,
+   const struct in6_addr *addr)
+{
+   unsigned int hash = inet6_addr_hash(addr);
+   struct inet6_ifaddr *ifp;
+
+   hlist_for_each_entry_rcu_bh(ifp, &inet6_addr_lst[hash], addr_lst)
+   if (net_eq(dev_net(ifp->idev->dev), net) &&
+   ipv6_addr_equal(&ifp->addr, addr))
+   return ifp;
+
+   return NULL;
+}
+
 int ipv6_chk_addr_and_flags(struct net *net, const struct in6_addr *addr,
const struct net_device *dev, int strict,
u32 banned_flags)
 {
struct inet6_ifaddr *ifp;
-   unsigned int hash = inet6_addr_hash(addr);
u32 ifp_flags;
+   int ret = 0;
 
rcu_read_lock_bh();
-   hlist_for_each_entry_rcu(ifp, &inet6_addr_lst[hash], addr_lst) {
-   if (!net_eq(dev_net(ifp->idev->dev), net))
-   continue;
+   ifp = ipv6_lookup_ifaddr_rcu_bh(net, addr);
+   if (ifp) {
/* Decouple optimistic from tentative for evaluation here.
 * Ban optimistic addresses explicitly, when required.
 */
ifp_flags = (ifp->flags&IFA_F_OPTIMISTIC)
? (ifp->flags&~IFA_F_TENTATIVE)
: ifp->flags;
-   if (ipv6_addr_equal(&ifp->addr, addr) &&
-   !(ifp_flags&banned_flags) &&
+   if (!(ifp_flags&banned_flags) &&
(!dev || ifp->idev->dev == dev ||
-!(ifp->scope&(IFA_LINK|IFA_HOST) || strict))) {
-   rcu_read_unlock_bh();
-   return 1;
-   }
+!(ifp->scope&(IFA_LINK|IFA_HOST) || strict)))
+   ret = 1;
}
 
rcu_read_unlock_bh();
-   return 0;
+   return ret;
 }
 EXPORT_SYMBOL(ipv6_chk_addr_and_flags);
 
@@ -1900,20 +1911,13 @@ struct inet6_ifaddr *ipv6_get_ifaddr(struct net *net, 
const struct in6_addr *add
 struct net_device *dev, int strict)
 {
struct inet6_ifaddr *ifp, *result = NULL;
-   unsigned int hash = inet6_addr_hash(addr);
 
rcu_read_lock_bh();
-   hlist_for_each_entry_rcu_bh(ifp, &inet6_addr_lst[hash], addr_lst) {
-   if (!net_eq(dev_net(ifp->idev->dev), net))
-   continue;
-   if (ipv6_addr_equal(&ifp->addr, addr)) {
-   if (!dev || ifp->idev->dev == dev ||
-   !(ifp->scope&(IFA_LINK|IFA_HOST) || strict)) {
-   result = ifp;
-   in6_ifa_hold(ifp);
-   break;
-   }
-   }
+   ifp = ipv6_lookup_ifaddr_rcu_bh(net, addr);
+   if (ifp && (!dev || ifp->idev->dev == dev ||
+   !(ifp->scope & (IFA_LINK|IFA_HOST) || strict))) {
+   result = ifp;
+   in6_ifa_hold(ifp);
}
rcu_read_unlock_bh();
 
@@ -4226,20 +4230,13 @@ void if6_proc_exit(void)
 /* Check if address is a home address configured on any interface. */
 int ipv6_chk_home_addr(struct net *net, const struct in6_addr *addr)
 {
-   int ret = 0;
struct inet6_ifaddr *ifp = NULL;
-   unsigned int hash = inet6_addr_hash(addr);
+   int ret = 0;
 
rcu_read_lock_bh();
-   hlist_for_each_entry_rcu_bh(ifp, &inet6_addr_lst[hash], addr_lst) {
-   if (!net_eq(dev_net(ifp->idev->dev), net))
-   continue;
-   if (ipv6_addr_equal(&ifp->addr, addr) &&
-   (ifp->flags & IFA_F_HOMEADDRESS)) {
-   ret = 1;
-   break;
-   }
-   }
+   ifp = ipv6_lookup_ifaddr_rcu_bh(net, addr);
+   if (ifp && ifp->flags & IFA_F_HOMEADDRESS)
+   ret = 1;
rcu_read_unlock_bh();
return ret;
 }
-- 
2.13.5
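
The shape of the refactoring above, with one shared bucket walk and
callers that keep only their own predicate, can be sketched in user space
as follows. Everything here is hypothetical: struct addr_entry, the bucket
array, addr_hash(), lookup_addr() and chk_flag() are illustration-only
names, and no RCU or locking is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 8

/* Hypothetical, reduced stand-in for struct inet6_ifaddr. */
struct addr_entry {
	uint32_t addr;
	int netns;
	unsigned int flags;
	struct addr_entry *next;
};

static struct addr_entry *buckets[NBUCKETS];

static unsigned int addr_hash(uint32_t addr)
{
	return addr % NBUCKETS;
}

static void addr_insert(struct addr_entry *e)
{
	unsigned int hash = addr_hash(e->addr);

	e->next = buckets[hash];
	buckets[hash] = e;
}

/* Shared lookup: first entry matching both namespace and address. */
static struct addr_entry *lookup_addr(int netns, uint32_t addr)
{
	struct addr_entry *e;

	for (e = buckets[addr_hash(addr)]; e; e = e->next)
		if (e->netns == netns && e->addr == addr)
			return e;

	return NULL;
}

/* A caller now applies only its own predicate to the lookup result. */
static int chk_flag(int netns, uint32_t addr, unsigned int flag)
{
	struct addr_entry *e = lookup_addr(netns, addr);

	return e && (e->flags & flag) ? 1 : 0;
}
```

As in the helper in the patch, only the first entry that matches the
namespace and address is examined; a caller's extra predicate is applied
to that entry alone.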

[RFC PATCH 01/11] net: add support for noref skb->sk

2017-09-22 Thread Paolo Abeni

Noref sks do not carry a socket refcount, are valid only inside
the current RCU section, and must be explicitly cleared before
exiting that section.

They will be used in a later patch to allow early demux
without sock refcounting.

Signed-off-by: Paolo Abeni 
---
 include/linux/skbuff.h | 31 +++
 net/core/sock.c|  7 +++
 2 files changed, 38 insertions(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 492828801acb..c3fc32636690 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -922,6 +922,37 @@ static inline struct rtable *skb_rtable(const struct 
sk_buff *skb)
return (struct rtable *)skb_dst(skb);
 }
 
+void sock_dummyfree(struct sk_buff *skb);
+
+/* only early demux can set noref socks
+ * noref socks do not carry any refcount and must be
+ * cleared before exiting the current RCU section
+ */
+static inline void skb_set_noref_sk(struct sk_buff *skb, struct sock *sk)
+{
+   skb->sk = sk;
+   skb->destructor = sock_dummyfree;
+}
+
+static inline bool skb_has_noref_sk(struct sk_buff *skb)
+{
+   return skb->destructor == sock_dummyfree;
+}
+
+static inline struct sock *skb_clear_noref_sk(struct sk_buff *skb)
+{
+   struct sock *ret;
+
+   if (!skb_has_noref_sk(skb))
+   return NULL;
+
+   WARN_ON_ONCE(!rcu_read_lock_held());
+   ret = skb->sk;
+   skb->sk = NULL;
+   skb->destructor = NULL;
+   return ret;
+}
+
 /* For mangling skb->pkt_type from user space side from applications
  * such as nft, tc, etc, we only allow a conservative subset of
  * possible pkt_types to be set.
diff --git a/net/core/sock.c b/net/core/sock.c
index 9b7b6bbb2a23..33da8e7e58a0 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1893,6 +1893,13 @@ void sock_efree(struct sk_buff *skb)
 }
 EXPORT_SYMBOL(sock_efree);
 
+/* dummy destructor used by noref sockets */
+void sock_dummyfree(struct sk_buff *skb)
+{
+   WARN_ON_ONCE(!rcu_read_lock_held());
+}
+EXPORT_SYMBOL(sock_dummyfree);
+
 kuid_t sock_i_uid(struct sock *sk)
 {
kuid_t uid;
-- 
2.13.5
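
The tagging scheme in the patch above, where the destructor pointer doubles
as the "noref" marker, can be modelled in user space as follows. This is a
sketch only: struct skb_model, struct sock_model and the helper names are
hypothetical, and the rcu_read_lock_held() checks are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct sock and struct sk_buff. */
struct sock_model {
	int id;
};

struct skb_model {
	struct sock_model *sk;
	void (*destructor)(struct skb_model *skb);
};

/* Does nothing; its address is what marks the socket as noref. */
static void dummy_free(struct skb_model *skb)
{
	(void)skb;
}

static void set_noref_sk(struct skb_model *skb, struct sock_model *sk)
{
	skb->sk = sk;
	skb->destructor = dummy_free;
}

static bool has_noref_sk(const struct skb_model *skb)
{
	return skb->destructor == dummy_free;
}

/* Detach and return the noref socket, or NULL if none is attached. */
static struct sock_model *clear_noref_sk(struct skb_model *skb)
{
	struct sock_model *ret;

	if (!has_noref_sk(skb))
		return NULL;

	ret = skb->sk;
	skb->sk = NULL;
	skb->destructor = NULL;
	return ret;
}
```

Because no reference is taken, the caller is the one responsible for
clearing the pointer before the protected section ends.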

[RFC PATCH 05/11] udp: perform full socket lookup in early demux

2017-09-22 Thread Paolo Abeni

Since UDP early demux lookup fetches noref socket references,
we can safely be optimistic about it and set the sk reference
even if the skb is not going to land on such socket, avoiding
the rx dst cache usage for unconnected unicast sockets.

This avoids a second lookup for unconnected sockets, and cleans
up the whole UDP early demux code a bit.

After this change, on hosts not acting as routers, UDP early
demux never negatively affects receive performance, whereas
before this change it caused a measurable performance impact
for unconnected sockets.

Signed-off-by: Paolo Abeni 
---
 include/linux/udp.h |  2 ++
 net/ipv4/udp.c  | 62 +++--
 net/ipv6/udp.c  | 57 
 3 files changed, 38 insertions(+), 83 deletions(-)

diff --git a/include/linux/udp.h b/include/linux/udp.h
index eaea63bc79bb..9c68b57543cc 100644
--- a/include/linux/udp.h
+++ b/include/linux/udp.h
@@ -92,6 +92,8 @@ static inline struct udp_sock *udp_sk(const struct sock *sk)
return (struct udp_sock *)sk;
 }
 
+void udp_set_skb_rx_dst(struct sock *sk, struct sk_buff *skb, u32 cookie);
+
 static inline void udp_set_no_check6_tx(struct sock *sk, bool val)
 {
udp_sk(sk)->no_check6_tx = val;
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index ba49d5aa9f09..5cbbd78024dc 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2043,6 +2043,11 @@ static inline int udp4_csum_init(struct sk_buff *skb, 
struct udphdr *uh,
 inet_compute_pseudo);
 }
 
+static bool udp_use_rx_dst_cache(struct sock *sk, struct sk_buff *skb)
+{
+   return sk->sk_state == TCP_ESTABLISHED || skb->pkt_type != PACKET_HOST;
+}
+
 /*
  * All we need to do is get the socket, and then do a checksum.
  */
@@ -2088,8 +2093,8 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct udp_table 
*udptable,
struct dst_entry *dst = skb_dst(skb);
int ret;
 
-   if (unlikely(sk->sk_rx_dst != dst))
-   udp_sk_rx_dst_set(sk, dst);
+   if (udp_use_rx_dst_cache(sk, skb))
+   dst_update(&sk->sk_rx_dst, dst);
 
ret = udp_queue_rcv_skb(sk, skb);
if (!noref_sk)
@@ -2196,42 +2201,28 @@ static struct sock 
*__udp4_lib_mcast_demux_lookup(struct net *net,
return result;
 }
 
-/* For unicast we should only early demux connected sockets or we can
- * break forwarding setups.  The chains here can be long so only check
- * if the first socket is an exact match and if not move on.
- */
-static struct sock *__udp4_lib_demux_lookup(struct net *net,
-   __be16 loc_port, __be32 loc_addr,
-   __be16 rmt_port, __be32 rmt_addr,
-   int dif, int sdif)
+void udp_set_skb_rx_dst(struct sock *sk, struct sk_buff *skb, u32 cookie)
 {
-   unsigned short hnum = ntohs(loc_port);
-   unsigned int hash2 = udp4_portaddr_hash(net, loc_addr, hnum);
-   unsigned int slot2 = hash2 & udp_table.mask;
-   struct udp_hslot *hslot2 = &udp_table.hash2[slot2];
-   INET_ADDR_COOKIE(acookie, rmt_addr, loc_addr);
-   const __portpair ports = INET_COMBINED_PORTS(rmt_port, hnum);
-   struct sock *sk;
+   struct dst_entry *dst = dst_access(&sk->sk_rx_dst, cookie);
 
-   udp_portaddr_for_each_entry_rcu(sk, &hslot2->head) {
-   if (INET_MATCH(sk, net, acookie, rmt_addr,
-  loc_addr, ports, dif, sdif))
-   return sk;
-   /* Only check first socket in chain */
-   break;
+   if (dst) {
+   /* set noref for now.
+* any place which wants to hold dst has to call
+* dst_hold_safe()
+*/
+   skb_dst_set_noref(skb, dst);
}
-   return NULL;
 }
+EXPORT_SYMBOL_GPL(udp_set_skb_rx_dst);
 
 void udp_v4_early_demux(struct sk_buff *skb)
 {
struct net *net = dev_net(skb->dev);
+   int dif = skb->dev->ifindex;
+   int sdif = inet_sdif(skb);
const struct iphdr *iph;
const struct udphdr *uh;
struct sock *sk = NULL;
-   struct dst_entry *dst;
-   int dif = skb->dev->ifindex;
-   int sdif = inet_sdif(skb);
int ours;
 
/* validate the packet */
@@ -2260,25 +2251,16 @@ void udp_v4_early_demux(struct sk_buff *skb)
   uh->source, iph->saddr,
   dif, sdif);
} else if (skb->pkt_type == PACKET_HOST) {
-   sk = __udp4_lib_demux_lookup(net, uh->dest, iph->daddr,
-uh->source, iph->saddr, dif, sdif);
+   sk = __udp4_lib_lookup(net, iph->saddr, uh->source, iph->daddr,
+

[RFC PATCH 02/11] net: allow early demux to fetch noref socket

2017-09-22 Thread Paolo Abeni

We must be careful to avoid leaking such sockets outside
the RCU section containing the early demux call; we clear
them on nonlocal delivery.

For ipv4 we clear the noref sk even for multicast traffic entering
the ip_mr_input() path; we lose the mcast early demux
optimization when the host is acting as a multicast router, but
that helps to keep the code simple.

Also update all the iptables/nftables extensions that can
run in the input chain and can transmit the skb outside
such path, namely TEE, nft_dup and nfqueue.

Signed-off-by: Paolo Abeni 
---
 net/ipv4/ip_input.c  | 8 
 net/ipv4/netfilter/nf_dup_ipv4.c | 3 +++
 net/ipv6/ip6_input.c | 4 
 net/ipv6/netfilter/nf_dup_ipv6.c | 3 +++
 net/netfilter/nf_queue.c | 3 +++
 5 files changed, 21 insertions(+)

diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index fa2dc8f692c6..5690ef09da28 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -351,6 +351,14 @@ static int ip_rcv_finish(struct net *net, struct sock *sk, 
struct sk_buff *skb)
}
}
 
+   /* Since the sk has no reference to the socket, we must
+* clear it before escaping this RCU section.
+* The sk is just an hint and we know we are not going to use
+* it outside the input path.
+*/
+   if (skb_dst(skb)->input != ip_local_deliver)
+   skb_clear_noref_sk(skb);
+
 #ifdef CONFIG_IP_ROUTE_CLASSID
if (unlikely(skb_dst(skb)->tclassid)) {
struct ip_rt_acct *st = this_cpu_ptr(ip_rt_acct);
diff --git a/net/ipv4/netfilter/nf_dup_ipv4.c b/net/ipv4/netfilter/nf_dup_ipv4.c
index 39895b9ddeb9..bf8b78492fc8 100644
--- a/net/ipv4/netfilter/nf_dup_ipv4.c
+++ b/net/ipv4/netfilter/nf_dup_ipv4.c
@@ -71,6 +71,9 @@ void nf_dup_ipv4(struct net *net, struct sk_buff *skb, 
unsigned int hooknum,
nf_reset(skb);
nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
 #endif
+   /* Avoid leaking noref sk outside the input path */
+   skb_clear_noref_sk(skb);
+
/*
 * If we are in PREROUTING/INPUT, decrease the TTL to mitigate potential
 * loops between two hosts.
diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
index 9ee208a348f5..e15ec2d36b9e 100644
--- a/net/ipv6/ip6_input.c
+++ b/net/ipv6/ip6_input.c
@@ -68,6 +68,10 @@ int ip6_rcv_finish(struct net *net, struct sock *sk, struct 
sk_buff *skb)
if (!skb_valid_dst(skb))
ip6_route_input(skb);
 
+   /* see comment on ipv4 edmux */
+   if (skb_dst(skb)->input != ip6_input)
+   skb_clear_noref_sk(skb);
+
return dst_input(skb);
 }
 
diff --git a/net/ipv6/netfilter/nf_dup_ipv6.c b/net/ipv6/netfilter/nf_dup_ipv6.c
index 4a7ddeddbaab..939f6a2238f9 100644
--- a/net/ipv6/netfilter/nf_dup_ipv6.c
+++ b/net/ipv6/netfilter/nf_dup_ipv6.c
@@ -60,6 +60,9 @@ void nf_dup_ipv6(struct net *net, struct sk_buff *skb, 
unsigned int hooknum,
nf_reset(skb);
nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
 #endif
+   /* Avoid leaking noref sk outside the input path */
+   skb_clear_noref_sk(skb);
+
if (hooknum == NF_INET_PRE_ROUTING ||
hooknum == NF_INET_LOCAL_IN) {
struct ipv6hdr *iph = ipv6_hdr(skb);
diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
index f7e21953b1de..100eff08cb51 100644
--- a/net/netfilter/nf_queue.c
+++ b/net/netfilter/nf_queue.c
@@ -145,6 +145,9 @@ static int __nf_queue(struct sk_buff *skb, const struct 
nf_hook_state *state,
.size   = sizeof(*entry) + afinfo->route_key_size,
};
 
+   /* Avoid leaking noref sk outside the input path */
+   skb_clear_noref_sk(skb);
+
nf_queue_entry_get_refs(entry);
skb_dst_force(skb);
afinfo->saveroute(skb, entry);
-- 
2.13.5

[RFC PATCH 09/11] route: add ipv4/6 helpers to do partial route lookup vs local dst

2017-09-22 Thread Paolo Abeni

For ipv4, also implement proper source address validation, including
the checks against martian addresses, and return an error code accordingly.

These helpers will be used by later patches to perform the dst lookup
in early demux for unconnected sockets.
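
The martian source checks described above can be sketched in userspace as
follows. This is only an illustration: the demo_* names are hypothetical,
addresses are taken in host byte order rather than the kernel's __be32, and
the later fib_validate_source() step of the real helper is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Addresses are in host byte order here, unlike the kernel's __be32. */
static bool demo_is_multicast(uint32_t a) { return (a & 0xf0000000u) == 0xe0000000u; }
static bool demo_is_lbcast(uint32_t a)    { return a == 0xffffffffu; }
static bool demo_is_zeronet(uint32_t a)   { return (a & 0xff000000u) == 0; }
static bool demo_is_loopback(uint32_t a)  { return (a & 0xff000000u) == 0x7f000000u; }

/* Returns true when the source address must be rejected as martian. */
static bool demo_martian_source(uint32_t saddr, bool route_localnet)
{
	if (demo_is_multicast(saddr) || demo_is_lbcast(saddr))
		return true;
	if (demo_is_zeronet(saddr))
		return true;
	/* Loopback sources are tolerated only when route_localnet is enabled. */
	return demo_is_loopback(saddr) && !route_localnet;
}
```

With these definitions 127.0.0.1 is rejected unless route_localnet is set,
while an ordinary unicast source such as 192.168.0.1 is accepted.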

Signed-off-by: Paolo Abeni 
---
 include/net/ip6_route.h |  1 +
 include/net/route.h |  2 ++
 net/ipv4/route.c| 43 +++
 net/ipv6/route.c| 13 +
 4 files changed, 59 insertions(+)

diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h
index ee96f402cb75..edb24456a609 100644
--- a/include/net/ip6_route.h
+++ b/include/net/ip6_route.h
@@ -65,6 +65,7 @@ static inline bool rt6_need_strict(const struct in6_addr 
*daddr)
(IPV6_ADDR_MULTICAST | IPV6_ADDR_LINKLOCAL | 
IPV6_ADDR_LOOPBACK);
 }
 
+void ip6_route_try_local_rcu_bh(struct net *net, struct sk_buff *skb);
 void ip6_route_input(struct sk_buff *skb);
 struct dst_entry *ip6_route_input_lookup(struct net *net,
 struct net_device *dev,
diff --git a/include/net/route.h b/include/net/route.h
index ec09c3d73581..21927231cc14 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -178,6 +178,8 @@ static inline struct rtable *ip_route_output_gre(struct net 
*net, struct flowi4
 
 struct rtable *ip_local_route_alloc(struct net_device *dev, unsigned int flags,
u32 itag, unsigned char type, bool docache);
+int ip_route_try_local_rcu(struct net *net, struct sk_buff *skb,
+  const struct iphdr *iph);
 int ip_route_input_noref(struct sk_buff *skb, __be32 dst, __be32 src,
 u8 tos, struct net_device *devin);
 int ip_route_input_rcu(struct sk_buff *skb, __be32 dst, __be32 src,
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 515589f1b3d1..84248dd41da6 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2079,6 +2079,49 @@ out: return err;
goto out;
 }
 
+/* try to resolve and set the route for the ingress packet in the local
+ * destination, looking-up the destination address against the local ones
+ * and performing source validation
+ * return an error only if the local look up is successful and validation fails
+ * Called under RCU
+ */
+int ip_route_try_local_rcu(struct net *net, struct sk_buff *skb,
+  const struct iphdr *iph)
+{
+   __be32 saddr = iph->saddr;
+   struct in_device *in_dev;
+   struct dst_entry *dst;
+   int err = -EINVAL;
+   u32 itag;
+
+   dst = inet_get_ifaddr_dst_rcu(net, iph->daddr);
+   if (!dst)
+   return 0;
+
+   in_dev = __in_dev_get_rcu(skb->dev);
+   if (ipv4_is_multicast(saddr) || ipv4_is_lbcast(saddr))
+   goto martian_source;
+
+   /* check for zeronet only after successful lookup, so that we don't trip
+* over limited broadcast destination, see ip_route_input_slow()
+*/
+   if (ipv4_is_zeronet(saddr) || (ipv4_is_loopback(saddr) &&
+  !IN_DEV_NET_ROUTE_LOCALNET(in_dev, net)))
+   goto martian_source;
+
+   err = fib_validate_source(skb, saddr, iph->daddr, iph->tos, 0, skb->dev,
+ in_dev, &itag);
+   if (err < 0)
+   goto martian_source;
+
+   skb_dst_set_noref(skb, dst);
+   return 0;
+
+martian_source:
+   ip_handle_martian_source(skb->dev, in_dev, skb, iph->daddr, iph->saddr);
+   return err;
+}
+
 int ip_route_input_noref(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 u8 tos, struct net_device *dev)
 {
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 26cc9f483b6d..d957e30b1cbe 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1283,6 +1283,19 @@ void ip6_route_input(struct sk_buff *skb)
skb_dst_set(skb, ip6_route_input_lookup(net, skb->dev, &fl6, flags));
 }
 
+/* try to resolve and set the route for the ingress packet in the local
+ * destination
+ * Called under RCU
+ */
+void ip6_route_try_local_rcu_bh(struct net *net, struct sk_buff *skb)
+{
+   struct dst_entry *dst;
+
+   dst = inet6_get_ifaddr_dst_rcu_bh(net, &ipv6_hdr(skb)->daddr);
+   if (dst)
+   skb_dst_set_noref(skb, dst);
+}
+
 static struct rt6_info *ip6_pol_route_output(struct net *net, struct 
fib6_table *table,
 struct flowi6 *fl6, int flags)
 {
-- 
2.13.5

[RFC PATCH 06/11] ip/route: factor out helper for local route creation

2017-09-22 Thread Paolo Abeni

Will be used by a later patch to build the ifaddr dst cache.
No functional changes are introduced here.

Signed-off-by: Paolo Abeni 
---
 include/net/route.h |  2 ++
 net/ipv4/route.c| 30 ++
 2 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/include/net/route.h b/include/net/route.h
index 1b09a9368c68..ec09c3d73581 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -176,6 +176,8 @@ static inline struct rtable *ip_route_output_gre(struct net 
*net, struct flowi4
return ip_route_output_key(net, fl4);
 }
 
+struct rtable *ip_local_route_alloc(struct net_device *dev, unsigned int flags,
+   u32 itag, unsigned char type, bool docache);
 int ip_route_input_noref(struct sk_buff *skb, __be32 dst, __be32 src,
 u8 tos, struct net_device *devin);
 int ip_route_input_rcu(struct sk_buff *skb, __be32 dst, __be32 src,
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 94d4cd2d5ea4..515589f1b3d1 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1859,6 +1859,27 @@ static int ip_mkroute_input(struct sk_buff *skb,
return __mkroute_input(skb, res, in_dev, daddr, saddr, tos);
 }
 
+struct rtable *ip_local_route_alloc(struct net_device *dev, unsigned int flags,
+   u32 itag, unsigned char type, bool do_cache)
+{
+   struct in_device *in_dev = __in_dev_get_rcu(dev);
+   struct net *net = dev_net(dev);
+   struct rtable *rth;
+
+   rth = rt_dst_alloc(l3mdev_master_dev_rcu(dev) ? : net->loopback_dev,
+  flags | RTCF_LOCAL, type,
+  IN_DEV_CONF_GET(in_dev, NOPOLICY), false, do_cache);
+   if (!rth)
+   return NULL;
+
+   rth->dst.output= ip_rt_bug;
+#ifdef CONFIG_IP_ROUTE_CLASSID
+   rth->dst.tclassid = itag;
+#endif
+   rth->rt_is_input = 1;
+   return rth;
+}
+
 /*
  * NOTE. We drop all the packets that has local source
  * addresses, because every properly looped back packet
@@ -1996,17 +2017,10 @@ out:return err;
}
}
 
-   rth = rt_dst_alloc(l3mdev_master_dev_rcu(dev) ? : net->loopback_dev,
-  flags | RTCF_LOCAL, res->type,
-  IN_DEV_CONF_GET(in_dev, NOPOLICY), false, do_cache);
+   rth = ip_local_route_alloc(dev, flags, itag, res->type, do_cache);
if (!rth)
goto e_nobufs;
 
-   rth->dst.output= ip_rt_bug;
-#ifdef CONFIG_IP_ROUTE_CLASSID
-   rth->dst.tclassid = itag;
-#endif
-   rth->rt_is_input = 1;
if (res->table)
rth->rt_table_id = res->table->tb_id;
 
-- 
2.13.5

[RFC PATCH 08/11] net: implement local route cache inside ifaddr

2017-09-22 Thread Paolo Abeni

Add storage and helpers to associate an ipv{4,6} address
with the local route to self. This will be used by a
later patch to implement early demux for unconnected UDP
sockets.

The caches are filled on address creation, with DST_OBSOLETE_NONE.
The ipv6 caches are explicitly cleared and refreshed on underlying
device down/up events.

The above scheme is simpler than refreshing the cache every
time the dst expires under the default obsolete scheme.
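
A minimal userspace sketch of such a per-address cache slot is shown below.
It assumes C11 atomics in place of the kernel's xchg(), uses a plain int as a
stand-in reference count, and all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct demo_dst { int refcnt; };	/* stand-in for struct dst_entry */

static void demo_dst_release(struct demo_dst *dst)
{
	if (dst)
		dst->refcnt--;	/* not atomic: simplification for the sketch */
}

/* Install dst (already carrying a reference) and drop the previous entry. */
static void demo_cache_set(struct demo_dst *_Atomic *cache, struct demo_dst *dst)
{
	demo_dst_release(atomic_exchange(cache, dst));
}

/* Address removal: empty the slot and release what it held. */
static void demo_cache_clear(struct demo_dst *_Atomic *cache)
{
	demo_cache_set(cache, NULL);
}
```

Swapping the pointer first and releasing the old entry afterwards means a
concurrent reader sees either the old or the new entry, never a half-updated
slot.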

Signed-off-by: Paolo Abeni 
---
 include/linux/inetdevice.h |  4 
 include/net/addrconf.h |  3 +++
 include/net/if_inet6.h |  4 
 net/ipv4/devinet.c | 29 -
 net/ipv6/addrconf.c| 44 
 5 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/include/linux/inetdevice.h b/include/linux/inetdevice.h
index 751d051f0bc7..c29982f178bb 100644
--- a/include/linux/inetdevice.h
+++ b/include/linux/inetdevice.h
@@ -130,6 +130,8 @@ static inline void ipv4_devconf_setall(struct in_device 
*in_dev)
 #define IN_DEV_ARP_IGNORE(in_dev)  IN_DEV_MAXCONF((in_dev), ARP_IGNORE)
 #define IN_DEV_ARP_NOTIFY(in_dev)  IN_DEV_MAXCONF((in_dev), ARP_NOTIFY)
 
+struct dst_entry;
+
 struct in_ifaddr {
struct hlist_node   hash;
struct in_ifaddr*ifa_next;
@@ -149,6 +151,7 @@ struct in_ifaddr {
__u32   ifa_preferred_lft;
unsigned long   ifa_cstamp; /* created timestamp */
unsigned long   ifa_tstamp; /* updated timestamp */
+   struct dst_entry*dst; /* local route to self */
 };
 
 struct in_validator_info {
@@ -180,6 +183,7 @@ __be32 inet_confirm_addr(struct net *net, struct in_device 
*in_dev, __be32 dst,
 struct in_ifaddr *inet_ifa_byprefix(struct in_device *in_dev, __be32 prefix,
__be32 mask);
 struct in_ifaddr *inet_lookup_ifaddr_rcu(struct net *net, __be32 addr);
+struct dst_entry *inet_get_ifaddr_dst_rcu(struct net *net, __be32 addr);
 static __inline__ bool inet_ifa_match(__be32 addr, struct in_ifaddr *ifa)
 {
return !((addr^ifa->ifa_address)&ifa->ifa_mask);
diff --git a/include/net/addrconf.h b/include/net/addrconf.h
index 87981cd63180..bdfa3306a4c5 100644
--- a/include/net/addrconf.h
+++ b/include/net/addrconf.h
@@ -87,6 +87,9 @@ struct inet6_ifaddr *ipv6_get_ifaddr(struct net *net,
 const struct in6_addr *addr,
 struct net_device *dev, int strict);
 
+struct dst_entry *inet6_get_ifaddr_dst_rcu_bh(struct net *net,
+ const struct in6_addr *addr);
+
 int ipv6_dev_get_saddr(struct net *net, const struct net_device *dev,
   const struct in6_addr *daddr, unsigned int srcprefs,
   struct in6_addr *saddr);
diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h
index d4088d1a688d..1dd42e7c17a4 100644
--- a/include/net/if_inet6.h
+++ b/include/net/if_inet6.h
@@ -39,6 +39,8 @@ enum {
INET6_IFADDR_STATE_DEAD,
 };
 
+struct dst_entry;
+
 struct inet6_ifaddr {
struct in6_addr addr;
__u32   prefix_len;
@@ -77,6 +79,8 @@ struct inet6_ifaddr {
 
struct rcu_head rcu;
struct in6_addr peer_addr;
+
+   struct dst_entry*dst; /* local route to self */
 };
 
 struct ip6_sf_socklist {
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index 7ce22a2c07ce..a7748f787866 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -179,6 +179,17 @@ struct in_ifaddr *inet_lookup_ifaddr_rcu(struct net *net, 
__be32 addr)
return NULL;
 }
 
+/* called under RCU lock */
+struct dst_entry *inet_get_ifaddr_dst_rcu(struct net *net, __be32 addr)
+{
+   struct in_ifaddr *ifa = inet_lookup_ifaddr_rcu(net, addr);
+
+   if (!ifa)
+   return NULL;
+
+   return dst_access(&ifa->dst, 0);
+}
+
 static void rtmsg_ifa(int event, struct in_ifaddr *, struct nlmsghdr *, u32);
 
 static BLOCKING_NOTIFIER_HEAD(inetaddr_chain);
@@ -337,6 +348,7 @@ static void __inet_del_ifa(struct in_device *in_dev, struct 
in_ifaddr **ifap,
struct in_ifaddr *last_prim = in_dev->ifa_list;
struct in_ifaddr *prev_prom = NULL;
int do_promote = IN_DEV_PROMOTE_SECONDARIES(in_dev);
+   struct dst_entry *dst;
 
ASSERT_RTNL();
 
@@ -395,7 +407,12 @@ static void __inet_del_ifa(struct in_device *in_dev, 
struct in_ifaddr **ifap,
*ifap = ifa1->ifa_next;
inet_hash_remove(ifa1);
 
-   /* 3. Announce address deletion */
+   /* 3. Clear dst cache */
+
+   dst = xchg(&ifa1->dst, NULL);
+   dst_release(dst);
+
+   /* 4. Announce address deletion */
 
/* Send message first, then call notifier.
   At first sight, FIB update triggered by notifier
@@ -449,6 +466,7 @@ static int __inet_insert_ifa(struct in_ifaddr *ifa, struct 
nlmsghdr *nlh,
struct

[RFC PATCH 00/11] udp: full early demux for unconnected sockets

2017-09-22 Thread Paolo Abeni

This series refactors the UDP early demux code so that:

* full socket lookup is performed for unicast packets
* a sk is grabbed even for an unconnected socket match
* a dst cache is used even in such a scenario

To perform these tasks a couple of facilities are added:

* noref socket references, scoped inside the current RCU section, to be
  explicitly cleared before leaving such section
* a dst cache inside the inet and inet6 local addresses tables, caching the
  related local dst entry
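
As a rough illustration of the first facility, a noref reference can be
modelled as a borrowed pointer tagged in its low bit. How the series actually
encodes the flag is not shown in this message; the tagging below is an
assumption modelled on the existing noref dst convention, and every demo_*
name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NOREF ((uintptr_t)1)	/* hypothetical flag in the pointer's low bit */

struct demo_sock { int refcnt; };
struct demo_skb  { uintptr_t sk_ref; };

/* Attach sk without taking a reference; valid only inside the RCU section. */
static void demo_set_noref_sk(struct demo_skb *skb, struct demo_sock *sk)
{
	assert(((uintptr_t)sk & DEMO_NOREF) == 0);
	skb->sk_ref = (uintptr_t)sk | DEMO_NOREF;
}

static bool demo_has_noref_sk(const struct demo_skb *skb)
{
	return (skb->sk_ref & DEMO_NOREF) != 0;
}

static struct demo_sock *demo_sk(const struct demo_skb *skb)
{
	return (struct demo_sock *)(skb->sk_ref & ~DEMO_NOREF);
}

/* Drop a borrowed socket before the skb leaves the RCU section. */
static void demo_clear_noref_sk(struct demo_skb *skb)
{
	if (demo_has_noref_sk(skb))
		skb->sk_ref = 0;
}
```

The point of the sketch is that attaching and clearing the socket never
touches its reference count, which is where the saving comes from.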

The measured performance gain under small packet UDP flood is as follows:

ingress NIC     vanilla patched delta
rx queues       (kpps)  (kpps)  (%)
[ipv4]
1               2177    2414    10
2               2527    2892    14
3               3050    3733    22
4               3918    4643    18
5               5074    5699    12
6               5654    6869    21

[ipv6]
1               2002    2821    40
2               2087    3148    50
3               2583    4008    55
4               3072    4963    61
5               3719    5992    61
6               4314    6910    60
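
The delta column is consistent with the relative gain of the patched kernel
over vanilla, truncated to an integer percentage. The truncation is inferred
from the figures rather than stated by the author, and demo_delta_pct() is a
hypothetical helper used only to show the arithmetic.

```c
#include <assert.h>

/* Percentage gain of "patched" over "vanilla", truncated toward zero. */
static unsigned int demo_delta_pct(unsigned int vanilla, unsigned int patched)
{
	assert(vanilla > 0 && patched >= vanilla);
	return (patched - vanilla) * 100u / vanilla;
}
```

For the single queue ipv4 row this gives (2414 - 2177) * 100 / 2177, which
truncates to 10.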

The number of user space processes in use is equal to the number of
NIC rx queues; when multiple user space processes are used, the
SO_REUSEPORT option is set, as described below:

ethtool -L em2 combined $n
MASK=1
for I in `seq 0 $((n - 1))`; do
    udp_sink --reuse-port --recvfrom --count 10 --port 9 $1 &
    taskset -p $((MASK << ($I + $n) )) $!
done

Paolo Abeni (11):
  net: add support for noref skb->sk
  net: allow early demux to fetch noref socket
  udp: do not touch socket refcount in early demux
  net: add simple socket-like dst cache helpers
  udp: perform full socket lookup in early demux
  ip/route: factor out helper for local route creation
  ipv6/addrconf: add an helper for inet6 address lookup
  net: implement local route cache inside ifaddr
  route: add ipv4/6 helpers to do partial route lookup vs local dst
  IP: early demux can return an error code
  udp: dst lookup in early demux for unconnected sockets

 include/linux/inetdevice.h   |   4 ++
 include/linux/skbuff.h   |  31 +++
 include/linux/udp.h  |   2 +
 include/net/addrconf.h   |   3 ++
 include/net/dst.h|  20 +++
 include/net/if_inet6.h   |   4 ++
 include/net/ip6_route.h  |   1 +
 include/net/protocol.h   |   4 +-
 include/net/route.h  |   4 ++
 include/net/tcp.h|   2 +-
 include/net/udp.h|   2 +-
 net/core/dst.c   |  12 +
 net/core/sock.c  |   7 +++
 net/ipv4/devinet.c   |  29 ++-
 net/ipv4/ip_input.c  |  33 
 net/ipv4/netfilter/nf_dup_ipv4.c |   3 ++
 net/ipv4/route.c |  73 +++---
 net/ipv4/tcp_ipv4.c  |   9 ++--
 net/ipv4/udp.c   |  95 +++---
 net/ipv6/addrconf.c  | 109 +++
 net/ipv6/ip6_input.c |   4 ++
 net/ipv6/netfilter/nf_dup_ipv6.c |   3 ++
 net/ipv6/route.c |  13 +
 net/ipv6/udp.c   |  72 ++
 net/netfilter/nf_queue.c |   3 ++
 25 files changed, 383 insertions(+), 159 deletions(-)

-- 
2.13.5

[RFC PATCH 03/11] udp: do not touch socket refcount in early demux

2017-09-22 Thread Paolo Abeni

Use noref sockets instead. This gives a small performance
improvement and will allow efficient early demux for unconnected
sockets in a later patch.

Signed-off-by: Paolo Abeni 
---
 net/ipv4/udp.c | 18 ++
 net/ipv6/udp.c | 10 ++
 2 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 784ced0b9150..ba49d5aa9f09 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2050,12 +2050,13 @@ static inline int udp4_csum_init(struct sk_buff *skb, 
struct udphdr *uh,
 int __udp4_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
   int proto)
 {
-   struct sock *sk;
-   struct udphdr *uh;
-   unsigned short ulen;
+   struct net *net = dev_net(skb->dev);
struct rtable *rt = skb_rtable(skb);
+   unsigned short ulen;
__be32 saddr, daddr;
-   struct net *net = dev_net(skb->dev);
+   struct udphdr *uh;
+   struct sock *sk;
+   bool noref_sk;
 
/*
 *  Validate the packet.
@@ -2081,6 +2082,7 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct udp_table 
*udptable,
if (udp4_csum_init(skb, uh, proto))
goto csum_error;
 
+   noref_sk = skb_has_noref_sk(skb);
sk = skb_steal_sock(skb);
if (sk) {
struct dst_entry *dst = skb_dst(skb);
@@ -2090,7 +2092,8 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct udp_table 
*udptable,
udp_sk_rx_dst_set(sk, dst);
 
ret = udp_queue_rcv_skb(sk, skb);
-   sock_put(sk);
+   if (!noref_sk)
+   sock_put(sk);
/* a return value > 0 means to resubmit the input, but
 * it wants the return to be -protocol, or 0
 */
@@ -2261,11 +2264,10 @@ void udp_v4_early_demux(struct sk_buff *skb)
 uh->source, iph->saddr, dif, sdif);
}
 
-   if (!sk || !refcount_inc_not_zero(&sk->sk_refcnt))
+   if (!sk)
return;
 
-   skb->sk = sk;
-   skb->destructor = sock_efree;
+   skb_set_noref_sk(skb, sk);
dst = READ_ONCE(sk->sk_rx_dst);
 
if (dst)
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index e2ecfb137297..8f62392c4c35 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -787,6 +787,7 @@ int __udp6_lib_rcv(struct sk_buff *skb, struct udp_table 
*udptable,
struct net *net = dev_net(skb->dev);
struct udphdr *uh;
struct sock *sk;
+   bool noref_sk;
u32 ulen = 0;
 
if (!pskb_may_pull(skb, sizeof(struct udphdr)))
@@ -823,6 +824,7 @@ int __udp6_lib_rcv(struct sk_buff *skb, struct udp_table 
*udptable,
goto csum_error;
 
/* Check if the socket is already available, e.g. due to early demux */
+   noref_sk = skb_has_noref_sk(skb);
sk = skb_steal_sock(skb);
if (sk) {
struct dst_entry *dst = skb_dst(skb);
@@ -832,7 +834,8 @@ int __udp6_lib_rcv(struct sk_buff *skb, struct udp_table 
*udptable,
udp6_sk_rx_dst_set(sk, dst);
 
ret = udpv6_queue_rcv_skb(sk, skb);
-   sock_put(sk);
+   if (!noref_sk)
+   sock_put(sk);
 
/* a return value > 0 means to resubmit the input */
if (ret > 0)
@@ -948,11 +951,10 @@ static void udp_v6_early_demux(struct sk_buff *skb)
else
return;
 
-   if (!sk || !refcount_inc_not_zero(&sk->sk_refcnt))
+   if (!sk)
return;
 
-   skb->sk = sk;
-   skb->destructor = sock_efree;
+   skb_set_noref_sk(skb, sk);
dst = READ_ONCE(sk->sk_rx_dst);
 
if (dst)
-- 
2.13.5

[RFC PATCH 04/11] net: add simple socket-like dst cache helpers

2017-09-22 Thread Paolo Abeni

It will be used by later patches to reduce code duplication.

Signed-off-by: Paolo Abeni 
---
 include/net/dst.h | 20 
 net/core/dst.c| 12 
 2 files changed, 32 insertions(+)

diff --git a/include/net/dst.h b/include/net/dst.h
index 93568bd0a352..4fcca0e368c6 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -485,6 +485,26 @@ static inline struct dst_entry *dst_check(struct dst_entry 
*dst, u32 cookie)
return dst;
 }
 
+/* update the cache with dst, assuming the latter already carries a refcount */
+static inline bool __dst_update(struct dst_entry **cache, struct dst_entry 
*dst)
+{
+   struct dst_entry *old = xchg(cache, dst);
+
+   dst_release(old);
+   return old != dst;
+}
+bool dst_update(struct dst_entry **cache, struct dst_entry *dst);
+static inline struct dst_entry *dst_access(struct dst_entry **cache,
+ u32 cookie)
+{
+   struct dst_entry *dst = READ_ONCE(*cache);
+
+   if (!dst)
+   return NULL;
+
+   return dst_check(dst, cookie);
+}
+
 /* Flags for xfrm_lookup flags argument. */
 enum {
XFRM_LOOKUP_ICMP = 1 << 0,
diff --git a/net/core/dst.c b/net/core/dst.c
index a6c47da7d0f8..4076f9af45d7 100644
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -205,6 +205,18 @@ void dst_release_immediate(struct dst_entry *dst)
 }
 EXPORT_SYMBOL(dst_release_immediate);
 
+/* update the cache with dst, assuming the latter does not carry a refcount */
+bool dst_update(struct dst_entry **cache, struct dst_entry *dst)
+{
+   if (likely(*cache == dst))
+   return false;
+
+   if (dst_hold_safe(dst))
+   return __dst_update(cache, dst);
+   return false;
+}
+EXPORT_SYMBOL_GPL(dst_update);
+
 u32 *dst_cow_metrics_generic(struct dst_entry *dst, unsigned long old)
 {
struct dst_metrics *p = kmalloc(sizeof(*p), GFP_ATOMIC);
-- 
2.13.5

[PATCH,v3,net-next 2/2] tun: enable napi_gro_frags() for TUN/TAP driver

2017-09-22 Thread Petar Penkov

Add a TUN/TAP receive mode that exercises the napi_gro_frags()
interface. This mode is available only in TAP mode, as the interface
expects packets with Ethernet headers.

Furthermore, packets follow the layout of the iovec_iter that was
received. The first iovec is the linear data, and every one after the
first is a fragment. If there are more fragments than the max number,
drop the packet. Additionally, invoke eth_get_headlen() to exercise flow
dissector code and to verify that the header resides in the linear data.
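
The layout rule can be sketched in userspace as below. The two limits are
assumed stand-ins for MAX_SKB_FRAGS and PAGE_SIZE, the rejection of an empty
vector is an addition of this sketch, and demo_count_frags() is a
hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

#define DEMO_MAX_FRAGS 17	/* assumed stand-in for MAX_SKB_FRAGS */
#define DEMO_PAGE_SIZE 4096	/* assumed page size */

/*
 * iov[0] is the linear part, iov[1..] are fragments.
 * Returns the fragment count, or -1 if the packet would be dropped.
 */
static int demo_count_frags(const struct iovec *iov, size_t nr_segs)
{
	size_t i;

	if (nr_segs == 0 || nr_segs > DEMO_MAX_FRAGS + 1)
		return -1;
	for (i = 1; i < nr_segs; i++) {
		/* Each fragment must be non-empty and fit in one page. */
		if (iov[i].iov_len == 0 || iov[i].iov_len > DEMO_PAGE_SIZE)
			return -1;
	}
	return (int)(nr_segs - 1);
}
```

A three element vector therefore describes one linear area plus two
fragments.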

The napi_gro_frags() mode requires setting the IFF_NAPI_FRAGS option.
This is imposed because this mode is intended for testing via tools like
syzkaller and packetdrill, and the increased flexibility it provides can
introduce security vulnerabilities. This flag is accepted only if the
device is in TAP mode and has the IFF_NAPI flag set as well. This is
done because both of these are explicit requirements for correct
operation in this mode.
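
The flag precondition can be expressed as a small predicate. The numeric
values below mirror those in mainline include/uapi/linux/if_tun.h and are
redefined under hypothetical DEMO_ names so the sketch stays standalone;
treat them as an assumption, since the header change is not quoted here.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to match include/uapi/linux/if_tun.h. */
#define DEMO_IFF_TAP        0x0002
#define DEMO_IFF_NAPI       0x0010
#define DEMO_IFF_NAPI_FRAGS 0x0020

/* IFF_NAPI_FRAGS is acceptable only together with IFF_TAP and IFF_NAPI. */
static bool demo_napi_frags_ok(unsigned int flags)
{
	if (!(flags & DEMO_IFF_NAPI_FRAGS))
		return true;	/* nothing extra to check */
	return (flags & DEMO_IFF_TAP) && (flags & DEMO_IFF_NAPI);
}
```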

Signed-off-by: Petar Penkov 
Cc: Eric Dumazet 
Cc: Mahesh Bandewar 
Cc: Willem de Bruijn 
Cc: da...@davemloft.net
Cc: ppen...@stanford.edu
---
 drivers/net/tun.c   | 134 ++--
 include/uapi/linux/if_tun.h |   1 +
 2 files changed, 129 insertions(+), 6 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index f16407242b18..9880b3bc8fa5 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -75,6 +75,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -121,7 +122,8 @@ do {
\
 #define TUN_VNET_BE 0x4000
 
 #define TUN_FEATURES (IFF_NO_PI | IFF_ONE_QUEUE | IFF_VNET_HDR | \
- IFF_MULTI_QUEUE | IFF_NAPI)
+ IFF_MULTI_QUEUE | IFF_NAPI | IFF_NAPI_FRAGS)
+
 #define GOODCOPY_LEN 128
 
 #define FLT_EXACT_COUNT 8
@@ -173,6 +175,7 @@ struct tun_file {
unsigned int ifindex;
};
struct napi_struct napi;
+   struct mutex napi_mutex;/* Protects access to the above napi */
struct list_head next;
struct tun_struct *detached;
struct skb_array tx_array;
@@ -277,6 +280,7 @@ static void tun_napi_init(struct tun_struct *tun, struct 
tun_file *tfile,
netif_napi_add(tun->dev, &tfile->napi, tun_napi_poll,
   NAPI_POLL_WEIGHT);
napi_enable(&tfile->napi);
+   mutex_init(&tfile->napi_mutex);
}
 }
 
@@ -292,6 +296,11 @@ static void tun_napi_del(struct tun_struct *tun, struct 
tun_file *tfile)
netif_napi_del(&tfile->napi);
 }
 
+static bool tun_napi_frags_enabled(const struct tun_struct *tun)
+{
+   return READ_ONCE(tun->flags) & IFF_NAPI_FRAGS;
+}
+
 #ifdef CONFIG_TUN_VNET_CROSS_LE
 static inline bool tun_legacy_is_little_endian(struct tun_struct *tun)
 {
@@ -1036,7 +1045,8 @@ static void tun_poll_controller(struct net_device *dev)
 * supports polling, which enables bridge devices in virt setups to
 * still use netconsole
 * If NAPI is enabled, however, we need to schedule polling for all
-* queues.
+* queues unless we are using napi_gro_frags(), which we call in
+* process context and not in NAPI context.
 */
struct tun_struct *tun = netdev_priv(dev);
 
@@ -1044,6 +1054,9 @@ static void tun_poll_controller(struct net_device *dev)
struct tun_file *tfile;
int i;
 
+   if (tun_napi_frags_enabled(tun))
+   return;
+
rcu_read_lock();
for (i = 0; i < tun->numqueues; i++) {
tfile = rcu_dereference(tun->tfiles[i]);
@@ -1266,6 +1279,64 @@ static unsigned int tun_chr_poll(struct file *file, 
poll_table *wait)
return mask;
 }
 
+static struct sk_buff *tun_napi_alloc_frags(struct tun_file *tfile,
+   size_t len,
+   const struct iov_iter *it)
+{
+   struct sk_buff *skb;
+   size_t linear;
+   int err;
+   int i;
+
+   if (it->nr_segs > MAX_SKB_FRAGS + 1)
+   return ERR_PTR(-ENOMEM);
+
+   local_bh_disable();
+   skb = napi_get_frags(&tfile->napi);
+   local_bh_enable();
+   if (!skb)
+   return ERR_PTR(-ENOMEM);
+
+   linear = iov_iter_single_seg_count(it);
+   err = __skb_grow(skb, linear);
+   if (err)
+   goto free;
+
+   skb->len = len;
+   skb->data_len = len - linear;
+   skb->truesize += skb->data_len;
+
+   for (i = 1; i < it->nr_segs; i++) {
+   size_t fragsz = it->iov[i].iov_len;
+   unsigned long offset;
+   struct page *page;
+   void *data;
+
+   if (fragsz == 0 || fragsz > PAGE_SIZE) {
+   err = -EINVAL;
+   goto free;

[PATCH,v3,net-next 1/2] tun: enable NAPI for TUN/TAP driver

2017-09-22 Thread Petar Penkov

Change the TUN driver to use napi_gro_receive() upon receiving packets
rather than netif_rx_ni(). Add the IFF_NAPI flag to enable these
changes; operation is unaffected when the flag is not set.  SKBs
are constructed upon packet arrival and are queued to be processed
later.

The new path was evaluated with a benchmark with the following setup:
Open two tap devices and a receiver thread that reads in a loop for
each device. Start one sender thread and pin all threads to different
CPUs. Send 1M minimum-size UDP packets to each device and measure the
sending time for each of the sending methods:
napi_gro_receive():     4.90s
netif_rx_ni():          4.90s
netif_receive_skb():    7.20s
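
The deferred processing follows the usual budgeted poll pattern, which can be
sketched as below. This is a simplified userspace model with hypothetical
demo_* names; the queue is reduced to a counter, and the two comments mark
where the driver calls napi_gro_receive() and napi_complete_done().

```c
#include <assert.h>
#include <stdbool.h>

struct demo_queue {
	int pending;	/* packets waiting to be processed */
	bool scheduled;	/* true while the poller still has work */
};

/* Process at most budget packets; stop polling once the queue drains. */
static int demo_poll(struct demo_queue *q, int budget)
{
	int received = 0;

	while (received < budget && q->pending > 0) {
		q->pending--;	/* stands in for napi_gro_receive() */
		received++;
	}
	if (received < budget)
		q->scheduled = false;	/* stands in for napi_complete_done() */
	return received;
}
```

Returning the full budget signals that more work may remain, so the poller
stays scheduled and is called again.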

Signed-off-by: Petar Penkov 
Cc: Eric Dumazet 
Cc: Mahesh Bandewar 
Cc: Willem de Bruijn 
Cc: da...@davemloft.net
Cc: ppen...@stanford.edu
---
 drivers/net/tun.c   | 133 +++-
 include/uapi/linux/if_tun.h |   1 +
 2 files changed, 119 insertions(+), 15 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 3c9985f29950..f16407242b18 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -121,7 +121,7 @@ do {
\
 #define TUN_VNET_BE 0x4000
 
 #define TUN_FEATURES (IFF_NO_PI | IFF_ONE_QUEUE | IFF_VNET_HDR | \
- IFF_MULTI_QUEUE)
+ IFF_MULTI_QUEUE | IFF_NAPI)
 #define GOODCOPY_LEN 128
 
 #define FLT_EXACT_COUNT 8
@@ -172,6 +172,7 @@ struct tun_file {
u16 queue_index;
unsigned int ifindex;
};
+   struct napi_struct napi;
struct list_head next;
struct tun_struct *detached;
struct skb_array tx_array;
@@ -229,6 +230,68 @@ struct tun_struct {
struct bpf_prog __rcu *xdp_prog;
 };
 
+static int tun_napi_receive(struct napi_struct *napi, int budget)
+{
+   struct tun_file *tfile = container_of(napi, struct tun_file, napi);
+   struct sk_buff_head *queue = &tfile->sk.sk_write_queue;
+   struct sk_buff_head process_queue;
+   struct sk_buff *skb;
+   int received = 0;
+
+   __skb_queue_head_init(&process_queue);
+
+   spin_lock(&queue->lock);
+   skb_queue_splice_tail_init(queue, &process_queue);
+   spin_unlock(&queue->lock);
+
+   while (received < budget && (skb = __skb_dequeue(&process_queue))) {
+   napi_gro_receive(napi, skb);
+   ++received;
+   }
+
+   if (!skb_queue_empty(&process_queue)) {
+   spin_lock(&queue->lock);
+   skb_queue_splice(&process_queue, queue);
+   spin_unlock(&queue->lock);
+   }
+
+   return received;
+}
+
+static int tun_napi_poll(struct napi_struct *napi, int budget)
+{
+   unsigned int received;
+
+   received = tun_napi_receive(napi, budget);
+
+   if (received < budget)
+   napi_complete_done(napi, received);
+
+   return received;
+}
+
+static void tun_napi_init(struct tun_struct *tun, struct tun_file *tfile,
+ bool napi_en)
+{
+   if (napi_en) {
+   netif_napi_add(tun->dev, &tfile->napi, tun_napi_poll,
+  NAPI_POLL_WEIGHT);
+   napi_enable(&tfile->napi);
+   }
+}
+
+static void tun_napi_disable(struct tun_struct *tun, struct tun_file *tfile)
+{
+   if (tun->flags & IFF_NAPI)
+   napi_disable(&tfile->napi);
+}
+
+static void tun_napi_del(struct tun_struct *tun, struct tun_file *tfile)
+{
+   if (tun->flags & IFF_NAPI)
+   netif_napi_del(&tfile->napi);
+}
+
 #ifdef CONFIG_TUN_VNET_CROSS_LE
 static inline bool tun_legacy_is_little_endian(struct tun_struct *tun)
 {
@@ -541,6 +604,11 @@ static void __tun_detach(struct tun_file *tfile, bool 
clean)
 
tun = rtnl_dereference(tfile->tun);
 
+   if (tun && clean) {
+   tun_napi_disable(tun, tfile);
+   tun_napi_del(tun, tfile);
+   }
+
if (tun && !tfile->detached) {
u16 index = tfile->queue_index;
BUG_ON(index >= tun->numqueues);
@@ -598,6 +666,7 @@ static void tun_detach_all(struct net_device *dev)
for (i = 0; i < n; i++) {
tfile = rtnl_dereference(tun->tfiles[i]);
BUG_ON(!tfile);
+   tun_napi_disable(tun, tfile);
tfile->socket.sk->sk_shutdown = RCV_SHUTDOWN;
tfile->socket.sk->sk_data_ready(tfile->socket.sk);
RCU_INIT_POINTER(tfile->tun, NULL);
@@ -613,6 +682,7 @@ static void tun_detach_all(struct net_device *dev)
synchronize_net();
for (i = 0; i < n; i++) {
tfile = rtnl_dereference(tun->tfiles[i]);
+   tun_napi_del(tun, tfile);
/* Drop read queue */
tun_queue_purge(tfile);
sock_put(&tfile->sk);
@@ -631,7 +701,8 @@ static void tun_detach_all(struct net_device *dev)

[PATCH,v3,net-next 0/2] Improve code coverage of syzkaller

2017-09-22 Thread Petar Penkov

This patch series is intended to improve code coverage of syzkaller on
the early receive path, specifically including flow dissector, GRO,
and GRO with frags parts of the networking stack. Syzkaller exercises
the stack through the TUN driver and this is therefore where changes
reside. Current coverage through netif_receive_skb() is limited as it
does not touch on any of the aforementioned code paths. Furthermore,
for full coverage, it is necessary to have more flexibility over the
linear and non-linear data of the skbs.

The following patches address this by providing the user (syzkaller)
with the ability to send via napi_gro_receive() and napi_gro_frags().
Additionally, syzkaller can specify how many fragments there are and
how much data per fragment there is. This is done by exploiting the
convenient structure of iovecs. Finally, this patch series adds
support for exercising the flow dissector during fuzzing.
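
As a rough user-space illustration of the budgeted receive loop that the
IFF_NAPI path relies on, the sketch below moves all pending packets to a
private list, hands over at most "budget" of them, and puts any remainder
back on the shared queue. Every name in it (pkt_queue, pkt_enqueue,
poll_once) is hypothetical, packets are plain ints, and locking is left
out entirely; it is a model of the idea, not the driver code.

```c
#include <stddef.h>

/* Hypothetical fixed-size FIFO standing in for an skb queue. */
#define QCAP 64

struct pkt_queue {
    int pkts[QCAP];
    size_t len;
};

/* Append one packet; returns 0 on success, -1 if the queue is full. */
static int pkt_enqueue(struct pkt_queue *q, int pkt)
{
    if (q->len == QCAP)
        return -1;
    q->pkts[q->len++] = pkt;
    return 0;
}

/*
 * Model of a single poll call: take everything that is queued, copy at
 * most `budget` packets into `out` (which must have room for `budget`
 * entries), and requeue whatever was not processed, in original order.
 * Returns the number of packets delivered.
 */
static int poll_once(struct pkt_queue *q, int budget, int *out)
{
    struct pkt_queue local = *q;    /* move everything to a private list */
    int received = 0;
    size_t i;

    q->len = 0;

    while (received < budget && (size_t)received < local.len) {
        out[received] = local.pkts[received];
        ++received;
    }

    /* Leftovers go back so that the next poll call picks them up. */
    for (i = (size_t)received; i < local.len; i++)
        q->pkts[q->len++] = local.pkts[i];

    return received;
}
```

A caller would treat a return value below the budget as "queue drained",
which is the point at which the patch calls napi_complete_done().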

The code path including napi_gro_receive() can be enabled via the
IFF_NAPI flag.  The remainder of the changes in this patch series give
the user significantly more control over packets entering the kernel.
To avoid potential security vulnerabilities, hide the ability to send
custom skbs and the flow dissector code paths behind a
capable(CAP_NET_ADMIN) check to require special user privileges.

Changes since v2 based on feedback from Willem de Bruijn and Mahesh
Bandewar:

Patch 1/ No changes.
Patch 2/ Check if the preconditions for IFF_NAPI_FRAGS (IFF_NAPI and
 IFF_TAP) are met before opening/attaching rather than after.
 If they are not, change the behavior from discarding the
 flag to rejecting the command with EINVAL.
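
The precondition check described for patch 2 can be sketched as a small
stand-alone function. The flag names carry an EX_ prefix and their values
are placeholders chosen for this sketch, and ex_check_flags is a
hypothetical helper; only the rule itself (NAPI_FRAGS needs both NAPI and
TAP, otherwise EINVAL) comes from the changelog above.

```c
#include <errno.h>

/* Illustrative flag bits; the values are placeholders for this sketch. */
#define EX_IFF_TAP        0x0002u
#define EX_IFF_NAPI       0x0010u
#define EX_IFF_NAPI_FRAGS 0x0020u

/*
 * Validate the requested flags before doing any open/attach work.
 * NAPI_FRAGS is only accepted together with both NAPI and TAP; any
 * other combination that includes it is rejected with -EINVAL rather
 * than silently dropping the flag.
 */
static int ex_check_flags(unsigned int flags)
{
    if ((flags & EX_IFF_NAPI_FRAGS) &&
        (!(flags & EX_IFF_NAPI) || !(flags & EX_IFF_TAP)))
        return -EINVAL;

    return 0;
}
```

Checking before attaching means a rejected request leaves no partially
configured state behind.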

Petar Penkov (2):
  tun: enable NAPI for TUN/TAP driver
  tun: enable napi_gro_frags() for TUN/TAP driver

 drivers/net/tun.c   | 261 +---
 include/uapi/linux/if_tun.h |   2 +
 2 files changed, 245 insertions(+), 18 deletions(-)

-- 
2.11.0

[PATCH] [for 4.14] net: qcom/emac: specify the correct size when mapping a DMA buffer

2017-09-22 Thread Timur Tabi

When mapping the RX DMA buffers, the driver was accidentally specifying
zero for the buffer length.  Under normal circumstances, SWIOTLB does not
need to allocate a bounce buffer, so the address is just mapped without
checking the size field.  This is why the error was not detected earlier.

Fixes: b9b17debc69d ("net: emac: emac gigabit ethernet controller driver")
Cc: sta...@vger.kernel.org
Signed-off-by: Timur Tabi 
---
 drivers/net/ethernet/qualcomm/emac/emac-mac.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/qualcomm/emac/emac-mac.c b/drivers/net/ethernet/qualcomm/emac/emac-mac.c
index 0ea3ca09c689..3ed9033e56db 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-mac.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-mac.c
@@ -898,7 +898,8 @@ static void emac_mac_rx_descs_refill(struct emac_adapter *adpt,
 
curr_rxbuf->dma_addr =
dma_map_single(adpt->netdev->dev.parent, skb->data,
-  curr_rxbuf->length, DMA_FROM_DEVICE);
+  adpt->rxbuf_size, DMA_FROM_DEVICE);
+
ret = dma_mapping_error(adpt->netdev->dev.parent,
curr_rxbuf->dma_addr);
if (ret) {
-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

Re: [PATCH net-next v2 1/3] net: dsa: use slave device phydev

2017-09-22 Thread Vivien Didelot

Hi Florian,

Florian Fainelli  writes:

> On 09/22/2017 12:40 PM, Vivien Didelot wrote:
>> There is no need to store a phy_device in dsa_slave_priv since
>> net_device already provides one. Simply s/p->phy/dev->phydev/.
>
> You can therefore remove the phy_device from dsa_slave_priv, see below
> for more comments. I will have to regress test the heck out of this,
> this should take a few hours.

OK, since this is a sensitive topic, I will respin a v3 without this
patch, so that a future patchset can address your comments below and
also give you time to test this one patch alone.

>>  static int dsa_slave_port_attr_set(struct net_device *dev,
>> @@ -435,12 +433,10 @@ static int
>>  dsa_slave_get_link_ksettings(struct net_device *dev,
>>   struct ethtool_link_ksettings *cmd)
>>  {
>> -struct dsa_slave_priv *p = netdev_priv(dev);
>> +if (!dev->phydev)
>> +return -ENODEV;
>>  
>> -if (!p->phy)
>> -return -EOPNOTSUPP;
>> -
>> -phy_ethtool_ksettings_get(p->phy, cmd);
>> +phy_ethtool_ksettings_get(dev->phydev, cmd);
>
> This can be replaced by phy_ethtool_get_link_ksettings()
>
>>  
>>  return 0;
>>  }
>> @@ -449,12 +445,10 @@ static int
>>  dsa_slave_set_link_ksettings(struct net_device *dev,
>>   const struct ethtool_link_ksettings *cmd)
>>  {
>> -struct dsa_slave_priv *p = netdev_priv(dev);
>> +if (!dev->phydev)
>> +return -ENODEV;
>>  
>> -if (p->phy != NULL)
>> -return phy_ethtool_ksettings_set(p->phy, cmd);
>> -
>> -return -EOPNOTSUPP;
>> +return phy_ethtool_ksettings_set(dev->phydev, cmd);
>>  }
>
> This can disappear and you can assign this ethtool operation to
> phy_ethtool_set_link_ksettings()
>
>>  
>>  static void dsa_slave_get_drvinfo(struct net_device *dev,
>> @@ -488,24 +482,20 @@ dsa_slave_get_regs(struct net_device *dev, struct 
>> ethtool_regs *regs, void *_p)
>>  
>>  static int dsa_slave_nway_reset(struct net_device *dev)
>>  {
>> -struct dsa_slave_priv *p = netdev_priv(dev);
>> +if (!dev->phydev)
>> +return -ENODEV;
>>  
>> -if (p->phy != NULL)
>> -return genphy_restart_aneg(p->phy);
>> -
>> -return -EOPNOTSUPP;
>> +return genphy_restart_aneg(dev->phydev);
>>  }
>
> This can now disappear and you can use phy_ethtool_nway_reset() directly
> in ethtool_ops
>
>>  
>>  static u32 dsa_slave_get_link(struct net_device *dev)
>>  {
>> -struct dsa_slave_priv *p = netdev_priv(dev);
>> +if (!dev->phydev)
>> +return -ENODEV;
>>  
>> -if (p->phy != NULL) {
>> -genphy_update_link(p->phy);
>> -return p->phy->link;
>> -}
>> +genphy_update_link(dev->phydev);
>>  
>> -return -EOPNOTSUPP;
>> +return dev->phydev->link;
>>  }
>
> This should certainly be just ethtool_op_get_link(), not sure why we
> kept that around here...

Haaa, good to read that! I wasn't sure about this, but with this patch
the slave phy ethtool functions seemed indeed quite generic...


Thanks,

Vivien

Re: [PATCH net-next 2/2] net: dsa: lan9303: Add basic offloading of unicast traffic

2017-09-22 Thread Andrew Lunn

> >I'm wondering how this is supposed to work. Please add a good comment
> >here, since the hardware is forcing you to do something odd.
> >
> >Maybe it would be a good idea to save the STP state in chip.  And then
> >when chip->is_bridged is set true, change the state in the hardware to
> >the saved value?
> >
> >What happens when port 0 is added to the bridge, there is then a
> >minute pause and then port 1 is added? I would expect that as soon as
> >port 0 is added, the STP state machine for port 0 will start and move
> >into listening and then forwarding. Due to hardware limitations it
> >looks like you cannot do this. So what state is the hardware
> >effectively in? Blocking? Forwarding?
> >
> >Then port 1 is added. You can then respect the states. Port 1 will
> >do blocking->listening->forwarding, but what about port 0? The calls
> >won't get repeated? How does it transition to forwarding?
> >
> >   Andrew
> >
> 
> I see your point with the "minute pause" argument. Although it is a
> somewhat contrived use case, it is easy to fix by caching the STP
> state, as you suggest. So I can do that.

I don't think it is contrived. I've done bridge configuration by hand
for testing purposes. I've also set the forwarding delay to very small
values, so there is a clear race condition here.
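
To make the caching idea concrete, here is a minimal stand-alone sketch.
All of the names (ex_chip, ex_stp_state_set, ex_set_bridged) are made up
for illustration and the "hardware" is just an array; it only assumes
what is discussed above, namely that per-port states cannot be applied
until the chip is in bridged mode.

```c
#include <stdbool.h>

#define EX_NUM_PORTS 2

enum ex_stp_state {
    EX_DISABLED,
    EX_BLOCKING,
    EX_LISTENING,
    EX_LEARNING,
    EX_FORWARDING
};

/* Hypothetical chip state: what was requested vs. what is programmed. */
struct ex_chip {
    bool is_bridged;
    enum ex_stp_state saved[EX_NUM_PORTS]; /* last state requested */
    enum ex_stp_state hw[EX_NUM_PORTS];    /* state written to "hardware" */
};

/* Always remember the request; only touch the hardware when bridged. */
static void ex_stp_state_set(struct ex_chip *chip, int port,
                             enum ex_stp_state state)
{
    chip->saved[port] = state;
    if (chip->is_bridged)
        chip->hw[port] = state;
}

/* On entering bridged mode, replay the cached state of every port. */
static void ex_set_bridged(struct ex_chip *chip, bool bridged)
{
    int port;

    chip->is_bridged = bridged;
    if (!bridged)
        return;

    for (port = 0; port < EX_NUM_PORTS; port++)
        chip->hw[port] = chip->saved[port];
}
```

With this, a port that reached forwarding before the second port joined
does not depend on the STP calls being repeated.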

> How do other DSA HW chips handle port separation? Knowing that
> could perhaps help me know what to look for.

They have better hardware :-)

Generally each port is totally independent. You can change the STP
state per port without restrictions.

  Andrew

Re: [PATCH net-next v2 1/3] net: dsa: use slave device phydev

2017-09-22 Thread Andrew Lunn

On Fri, Sep 22, 2017 at 03:40:43PM -0400, Vivien Didelot wrote:
> There is no need to store a phy_device in dsa_slave_priv since
> net_device already provides one. Simply s/p->phy/dev->phydev/.
> 
> While at it, return -ENODEV when it is NULL instead of -EOPNOTSUPP.

I just did a quick poll for calling phy_mii_ioctl(). ENODEV seems the
most popular, second to EINVAL. Marvell drivers all use EOPNOTSUPP.

>  static int dsa_slave_nway_reset(struct net_device *dev)
>  {
> - struct dsa_slave_priv *p = netdev_priv(dev);
> + if (!dev->phydev)
> + return -ENODEV;
>  
> - if (p->phy != NULL)
> - return genphy_restart_aneg(p->phy);
> -
> - return -EOPNOTSUPP;
> + return genphy_restart_aneg(dev->phydev);
>  }

It looks like this can now be replaced with phy_ethtool_nway_reset().

It could be there are other phy_ethtool_ helpers which can be used,
now that we have phydev in ndev.

Andrew

Re: [PATCH net-next v2 1/3] net: dsa: use slave device phydev

2017-09-22 Thread Florian Fainelli

On 09/22/2017 12:40 PM, Vivien Didelot wrote:
> There is no need to store a phy_device in dsa_slave_priv since
> net_device already provides one. Simply s/p->phy/dev->phydev/.

You can therefore remove the phy_device from dsa_slave_priv, see below
for more comments. I will have to regress test the heck out of this,
this should take a few hours.

> 
> While at it, return -ENODEV when it is NULL instead of -EOPNOTSUPP.
> 
> Signed-off-by: Vivien Didelot 
> ---

>  static int dsa_slave_port_attr_set(struct net_device *dev,
> @@ -435,12 +433,10 @@ static int
>  dsa_slave_get_link_ksettings(struct net_device *dev,
>struct ethtool_link_ksettings *cmd)
>  {
> - struct dsa_slave_priv *p = netdev_priv(dev);
> + if (!dev->phydev)
> + return -ENODEV;
>  
> - if (!p->phy)
> - return -EOPNOTSUPP;
> -
> - phy_ethtool_ksettings_get(p->phy, cmd);
> + phy_ethtool_ksettings_get(dev->phydev, cmd);

This can be replaced by phy_ethtool_get_link_ksettings()

>  
>   return 0;
>  }
> @@ -449,12 +445,10 @@ static int
>  dsa_slave_set_link_ksettings(struct net_device *dev,
>const struct ethtool_link_ksettings *cmd)
>  {
> - struct dsa_slave_priv *p = netdev_priv(dev);
> + if (!dev->phydev)
> + return -ENODEV;
>  
> - if (p->phy != NULL)
> - return phy_ethtool_ksettings_set(p->phy, cmd);
> -
> - return -EOPNOTSUPP;
> + return phy_ethtool_ksettings_set(dev->phydev, cmd);
>  }

This can disappear and you can assign this ethtool operation to
phy_ethtool_set_link_ksettings()

>  
>  static void dsa_slave_get_drvinfo(struct net_device *dev,
> @@ -488,24 +482,20 @@ dsa_slave_get_regs(struct net_device *dev, struct 
> ethtool_regs *regs, void *_p)
>  
>  static int dsa_slave_nway_reset(struct net_device *dev)
>  {
> - struct dsa_slave_priv *p = netdev_priv(dev);
> + if (!dev->phydev)
> + return -ENODEV;
>  
> - if (p->phy != NULL)
> - return genphy_restart_aneg(p->phy);
> -
> - return -EOPNOTSUPP;
> + return genphy_restart_aneg(dev->phydev);
>  }

This can now disappear and you can use phy_ethtool_nway_reset() directly
in ethtool_ops

>  
>  static u32 dsa_slave_get_link(struct net_device *dev)
>  {
> - struct dsa_slave_priv *p = netdev_priv(dev);
> + if (!dev->phydev)
> + return -ENODEV;
>  
> - if (p->phy != NULL) {
> - genphy_update_link(p->phy);
> - return p->phy->link;
> - }
> + genphy_update_link(dev->phydev);
>  
> - return -EOPNOTSUPP;
> + return dev->phydev->link;
>  }

This should certainly be just ethtool_op_get_link(), not sure why we
kept that around here...
-- 
Florian
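
A side note on the last hunk, as a stand-alone sketch rather than kernel
code: a function declared to return an unsigned 32-bit value cannot
report a negative errno, because the value is converted to a large
positive number that callers read as "link up". The types ex_phy and
ex_netdev and the function ex_get_link below are hypothetical stand-ins
that only mirror the shape of the quoted function.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the PHY and net device structures. */
struct ex_phy {
    int link;
};

struct ex_netdev {
    struct ex_phy *phydev;
};

/*
 * Same shape as the quoted get_link: unsigned return type, but an
 * errno-style return when no PHY is attached.  The negative value is
 * converted to uint32_t, so the caller sees a non-zero result.
 */
static uint32_t ex_get_link(const struct ex_netdev *dev)
{
    if (!dev->phydev)
        return -ENODEV;

    return dev->phydev->link;
}
```

This is one more reason to prefer a generic helper for this operation
over a hand-written wrapper.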

[PATCH net-next v2 2/3] net: dsa: make slave close symmetrical to open

2017-09-22 Thread Vivien Didelot

The DSA slave open function configures the unicast MAC addresses on the
master device, enables the switch port, changes its STP state, then starts
the PHY device.

Make the close function symmetric, by first stopping the PHY device,
then changing the STP state, disabling the switch port and restoring the
master device.

Signed-off-by: Vivien Didelot 
Reviewed-by: Florian Fainelli 
Reviewed-by: Andrew Lunn 
---
 net/dsa/slave.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 3760472bf41d..0aab29928152 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -133,6 +133,11 @@ static int dsa_slave_close(struct net_device *dev)
if (dev->phydev)
phy_stop(dev->phydev);
 
+   dsa_port_set_state_now(p->dp, BR_STATE_DISABLED);
+
+   if (ds->ops->port_disable)
+   ds->ops->port_disable(ds, p->dp->index, dev->phydev);
+
dev_mc_unsync(master, dev);
dev_uc_unsync(master, dev);
if (dev->flags & IFF_ALLMULTI)
@@ -143,11 +148,6 @@ static int dsa_slave_close(struct net_device *dev)
if (!ether_addr_equal(dev->dev_addr, master->dev_addr))
dev_uc_del(master, dev->dev_addr);
 
-   if (ds->ops->port_disable)
-   ds->ops->port_disable(ds, p->dp->index, dev->phydev);
-
-   dsa_port_set_state_now(p->dp, BR_STATE_DISABLED);
-
return 0;
 }
 
-- 
2.14.1
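
The ordering argument in the commit message above can be checked with a
small stand-alone model. The step names and helpers (ex_open, ex_close,
ex_is_reverse) are invented for this sketch; each step is only recorded
in a trace, nothing is actually configured.

```c
#include <stddef.h>

#define EX_MAX_STEPS 8

/* The four stages named in the commit message. */
enum ex_step {
    EX_STEP_MASTER, /* master device addresses and flags */
    EX_STEP_PORT,   /* switch port enable/disable */
    EX_STEP_STP,    /* STP state change */
    EX_STEP_PHY     /* PHY start/stop */
};

struct ex_trace {
    enum ex_step steps[EX_MAX_STEPS];
    size_t n;
};

static void ex_record(struct ex_trace *t, enum ex_step s)
{
    if (t->n < EX_MAX_STEPS)
        t->steps[t->n++] = s;
}

/* Open: master first, PHY last. */
static void ex_open(struct ex_trace *t)
{
    ex_record(t, EX_STEP_MASTER);
    ex_record(t, EX_STEP_PORT);
    ex_record(t, EX_STEP_STP);
    ex_record(t, EX_STEP_PHY);
}

/* Close after the patch: exact mirror image of open. */
static void ex_close(struct ex_trace *t)
{
    ex_record(t, EX_STEP_PHY);
    ex_record(t, EX_STEP_STP);
    ex_record(t, EX_STEP_PORT);
    ex_record(t, EX_STEP_MASTER);
}

/* Returns 1 if trace b is the exact reverse of trace a, else 0. */
static int ex_is_reverse(const struct ex_trace *a, const struct ex_trace *b)
{
    size_t i;

    if (a->n != b->n)
        return 0;

    for (i = 0; i < a->n; i++)
        if (a->steps[i] != b->steps[b->n - 1 - i])
            return 0;

    return 1;
}
```

Before the patch, close handled the master device before the port and
the STP state, so the same check would have failed.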

[PATCH net-next v2 3/3] net: dsa: add port enable and disable helpers

2017-09-22 Thread Vivien Didelot

Provide dsa_port_enable and dsa_port_disable helpers to respectively
enable and disable a switch port. This makes the dsa_port_set_state_now
helper static.

Signed-off-by: Vivien Didelot 
Reviewed-by: Florian Fainelli 
Reviewed-by: Andrew Lunn 
---
 net/dsa/dsa_priv.h |  3 ++-
 net/dsa/port.c | 31 ++-
 net/dsa/slave.c| 19 +--
 3 files changed, 37 insertions(+), 16 deletions(-)

diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 9803952a5b40..0298a0f6a349 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -117,7 +117,8 @@ void dsa_master_ethtool_restore(struct net_device *dev);
 /* port.c */
 int dsa_port_set_state(struct dsa_port *dp, u8 state,
   struct switchdev_trans *trans);
-void dsa_port_set_state_now(struct dsa_port *dp, u8 state);
+int dsa_port_enable(struct dsa_port *dp, struct phy_device *phy);
+void dsa_port_disable(struct dsa_port *dp, struct phy_device *phy);
 int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br);
 void dsa_port_bridge_leave(struct dsa_port *dp, struct net_device *br);
 int dsa_port_vlan_filtering(struct dsa_port *dp, bool vlan_filtering,
diff --git a/net/dsa/port.c b/net/dsa/port.c
index 76d43a82d397..72c8dbd3d3f2 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -56,7 +56,7 @@ int dsa_port_set_state(struct dsa_port *dp, u8 state,
return 0;
 }
 
-void dsa_port_set_state_now(struct dsa_port *dp, u8 state)
+static void dsa_port_set_state_now(struct dsa_port *dp, u8 state)
 {
int err;
 
@@ -65,6 +65,35 @@ void dsa_port_set_state_now(struct dsa_port *dp, u8 state)
pr_err("DSA: failed to set STP state %u (%d)\n", state, err);
 }
 
+int dsa_port_enable(struct dsa_port *dp, struct phy_device *phy)
+{
+   u8 stp_state = dp->bridge_dev ? BR_STATE_BLOCKING : BR_STATE_FORWARDING;
+   struct dsa_switch *ds = dp->ds;
+   int port = dp->index;
+   int err;
+
+   if (ds->ops->port_enable) {
+   err = ds->ops->port_enable(ds, port, phy);
+   if (err)
+   return err;
+   }
+
+   dsa_port_set_state_now(dp, stp_state);
+
+   return 0;
+}
+
+void dsa_port_disable(struct dsa_port *dp, struct phy_device *phy)
+{
+   struct dsa_switch *ds = dp->ds;
+   int port = dp->index;
+
+   dsa_port_set_state_now(dp, BR_STATE_DISABLED);
+
+   if (ds->ops->port_disable)
+   ds->ops->port_disable(ds, port, phy);
+}
+
 int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br)
 {
struct dsa_notifier_bridge_info info = {
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 0aab29928152..4ea1c6eb0da8 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -73,9 +73,7 @@ static int dsa_slave_open(struct net_device *dev)
 {
struct dsa_slave_priv *p = netdev_priv(dev);
struct dsa_port *dp = p->dp;
-   struct dsa_switch *ds = dp->ds;
struct net_device *master = dsa_master_netdev(p);
-   u8 stp_state = dp->bridge_dev ? BR_STATE_BLOCKING : BR_STATE_FORWARDING;
int err;
 
if (!(master->flags & IFF_UP))
@@ -98,13 +96,9 @@ static int dsa_slave_open(struct net_device *dev)
goto clear_allmulti;
}
 
-   if (ds->ops->port_enable) {
-   err = ds->ops->port_enable(ds, p->dp->index, dev->phydev);
-   if (err)
-   goto clear_promisc;
-   }
-
-   dsa_port_set_state_now(p->dp, stp_state);
+   err = dsa_port_enable(dp, dev->phydev);
+   if (err)
+   goto clear_promisc;
 
if (dev->phydev)
phy_start(dev->phydev);
@@ -128,15 +122,12 @@ static int dsa_slave_close(struct net_device *dev)
 {
struct dsa_slave_priv *p = netdev_priv(dev);
struct net_device *master = dsa_master_netdev(p);
-   struct dsa_switch *ds = p->dp->ds;
+   struct dsa_port *dp = p->dp;
 
if (dev->phydev)
phy_stop(dev->phydev);
 
-   dsa_port_set_state_now(p->dp, BR_STATE_DISABLED);
-
-   if (ds->ops->port_disable)
-   ds->ops->port_disable(ds, p->dp->index, dev->phydev);
+   dsa_port_disable(dp, dev->phydev);
 
dev_mc_unsync(master, dev);
dev_uc_unsync(master, dev);
-- 
2.14.1
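
For readers who want to see the helper pattern in isolation, below is a
stand-alone sketch of "optional driver callback plus state change" with
error propagation. The types and names (ex_port, ex_port_ops, ex_hw_on
and friends) are hypothetical and much simpler than the DSA structures;
the sketch only assumes the behaviour visible in the diff above.

```c
#include <errno.h>
#include <stddef.h>

enum ex_stp { EX_STP_DISABLED, EX_STP_BLOCKING, EX_STP_FORWARDING };

struct ex_port;

/* Driver callbacks; either pointer may be NULL. */
struct ex_port_ops {
    int (*port_enable)(struct ex_port *port);
    void (*port_disable)(struct ex_port *port);
};

struct ex_port {
    const struct ex_port_ops *ops;
    int bridged;     /* non-zero when the port is a bridge member */
    int hw_on;       /* set by the example callbacks below */
    enum ex_stp stp;
};

/* Enable: run the optional callback first, then set the STP state. */
static int ex_port_enable(struct ex_port *port)
{
    if (port->ops->port_enable) {
        int err = port->ops->port_enable(port);

        if (err)
            return err; /* state is left untouched on failure */
    }

    port->stp = port->bridged ? EX_STP_BLOCKING : EX_STP_FORWARDING;
    return 0;
}

/* Disable: mirror image, state first, then the optional callback. */
static void ex_port_disable(struct ex_port *port)
{
    port->stp = EX_STP_DISABLED;

    if (port->ops->port_disable)
        port->ops->port_disable(port);
}

/* Example callbacks used to exercise the helpers. */
static int ex_hw_on(struct ex_port *port)
{
    port->hw_on = 1;
    return 0;
}

static int ex_hw_fail(struct ex_port *port)
{
    (void)port;
    return -EIO;
}

static void ex_hw_off(struct ex_port *port)
{
    port->hw_on = 0;
}
```

Keeping both halves in one place is what lets the open and close paths
shrink to a single call each.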

[PATCH net-next v2 0/3] net: dsa: use slave device phydev

2017-09-22 Thread Vivien Didelot

This patchset removes the private phy_device in favor of the one
provided by the slave net_device, makes slave open and close symmetrical
and finally provides helpers for enabling or disabling a DSA port.

Changes in v2:
  - do not remove the phy argument from port enable/disable

Vivien Didelot (3):
  net: dsa: use slave device phydev
  net: dsa: make slave close symmetrical to open
  net: dsa: add port enable and disable helpers

 net/dsa/dsa_priv.h |   3 +-
 net/dsa/port.c |  31 +++-
 net/dsa/slave.c| 143 +++--
 3 files changed, 94 insertions(+), 83 deletions(-)

-- 
2.14.1

[PATCH net-next v2 1/3] net: dsa: use slave device phydev

2017-09-22 Thread Vivien Didelot

There is no need to store a phy_device in dsa_slave_priv since
net_device already provides one. Simply s/p->phy/dev->phydev/.

While at it, return -ENODEV when it is NULL instead of -EOPNOTSUPP.

Signed-off-by: Vivien Didelot 
---
 net/dsa/slave.c | 126 ++--
 1 file changed, 58 insertions(+), 68 deletions(-)

diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 02ace7d462c4..3760472bf41d 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -99,15 +99,15 @@ static int dsa_slave_open(struct net_device *dev)
}
 
if (ds->ops->port_enable) {
-   err = ds->ops->port_enable(ds, p->dp->index, p->phy);
+   err = ds->ops->port_enable(ds, p->dp->index, dev->phydev);
if (err)
goto clear_promisc;
}
 
dsa_port_set_state_now(p->dp, stp_state);
 
-   if (p->phy)
-   phy_start(p->phy);
+   if (dev->phydev)
+   phy_start(dev->phydev);
 
return 0;
 
@@ -130,8 +130,8 @@ static int dsa_slave_close(struct net_device *dev)
struct net_device *master = dsa_master_netdev(p);
struct dsa_switch *ds = p->dp->ds;
 
-   if (p->phy)
-   phy_stop(p->phy);
+   if (dev->phydev)
+   phy_stop(dev->phydev);
 
dev_mc_unsync(master, dev);
dev_uc_unsync(master, dev);
@@ -144,7 +144,7 @@ static int dsa_slave_close(struct net_device *dev)
dev_uc_del(master, dev->dev_addr);
 
if (ds->ops->port_disable)
-   ds->ops->port_disable(ds, p->dp->index, p->phy);
+   ds->ops->port_disable(ds, p->dp->index, dev->phydev);
 
dsa_port_set_state_now(p->dp, BR_STATE_DISABLED);
 
@@ -273,12 +273,10 @@ dsa_slave_fdb_dump(struct sk_buff *skb, struct netlink_callback *cb,
 
 static int dsa_slave_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 {
-   struct dsa_slave_priv *p = netdev_priv(dev);
+   if (!dev->phydev)
+   return -ENODEV;
 
-   if (p->phy != NULL)
-   return phy_mii_ioctl(p->phy, ifr, cmd);
-
-   return -EOPNOTSUPP;
+   return phy_mii_ioctl(dev->phydev, ifr, cmd);
 }
 
 static int dsa_slave_port_attr_set(struct net_device *dev,
@@ -435,12 +433,10 @@ static int
 dsa_slave_get_link_ksettings(struct net_device *dev,
 struct ethtool_link_ksettings *cmd)
 {
-   struct dsa_slave_priv *p = netdev_priv(dev);
+   if (!dev->phydev)
+   return -ENODEV;
 
-   if (!p->phy)
-   return -EOPNOTSUPP;
-
-   phy_ethtool_ksettings_get(p->phy, cmd);
+   phy_ethtool_ksettings_get(dev->phydev, cmd);
 
return 0;
 }
@@ -449,12 +445,10 @@ static int
 dsa_slave_set_link_ksettings(struct net_device *dev,
 const struct ethtool_link_ksettings *cmd)
 {
-   struct dsa_slave_priv *p = netdev_priv(dev);
+   if (!dev->phydev)
+   return -ENODEV;
 
-   if (p->phy != NULL)
-   return phy_ethtool_ksettings_set(p->phy, cmd);
-
-   return -EOPNOTSUPP;
+   return phy_ethtool_ksettings_set(dev->phydev, cmd);
 }
 
 static void dsa_slave_get_drvinfo(struct net_device *dev,
@@ -488,24 +482,20 @@ dsa_slave_get_regs(struct net_device *dev, struct ethtool_regs *regs, void *_p)
 
 static int dsa_slave_nway_reset(struct net_device *dev)
 {
-   struct dsa_slave_priv *p = netdev_priv(dev);
+   if (!dev->phydev)
+   return -ENODEV;
 
-   if (p->phy != NULL)
-   return genphy_restart_aneg(p->phy);
-
-   return -EOPNOTSUPP;
+   return genphy_restart_aneg(dev->phydev);
 }
 
 static u32 dsa_slave_get_link(struct net_device *dev)
 {
-   struct dsa_slave_priv *p = netdev_priv(dev);
+   if (!dev->phydev)
+   return -ENODEV;
 
-   if (p->phy != NULL) {
-   genphy_update_link(p->phy);
-   return p->phy->link;
-   }
+   genphy_update_link(dev->phydev);
 
-   return -EOPNOTSUPP;
+   return dev->phydev->link;
 }
 
 static int dsa_slave_get_eeprom_len(struct net_device *dev)
@@ -640,7 +630,7 @@ static int dsa_slave_set_eee(struct net_device *dev, struct ethtool_eee *e)
int ret;
 
/* Port's PHY and MAC both need to be EEE capable */
-   if (!p->phy)
+   if (!dev->phydev)
return -ENODEV;
 
if (!ds->ops->set_mac_eee)
@@ -651,12 +641,12 @@ static int dsa_slave_set_eee(struct net_device *dev, struct ethtool_eee *e)
return ret;
 
if (e->eee_enabled) {
-   ret = phy_init_eee(p->phy, 0);
+   ret = phy_init_eee(dev->phydev, 0);
if (ret)
return ret;
}
 
-   return phy_ethtool_set_eee(p->phy, e);
+   return phy_ethtool_set_eee(dev->phydev, e);
 }
 
 static int dsa_slave_get_eee(struct net_device *dev, struct ethtool_eee *e)
@@

Re: [PATCH net-next 2/4] net: dsa: remove phy arg from port enable/disable

2017-09-22 Thread Andrew Lunn

> Historical reasons mostly. Considering the complexity of
> dsa_slave_phy_setup(), I would certainly be extremely careful in
> changing any of this, the potential for breakage is pretty big.

Yes, I took a look at this, wondering how to convert to phylink. I
went away and got a stiff drink :-)

 Andrew

[PATCH] r8152: add Linksys USB3GIGV1 id

2017-09-22 Thread Grant Grundler

This Linksys dongle by default comes up in cdc_ether mode.
This patch allows r8152 to claim the device:
   Bus 002 Device 002: ID 13b1:0041 Linksys

Signed-off-by: Grant Grundler 
---
 drivers/net/usb/r8152.c | 2 ++
 1 file changed, 2 insertions(+)

This was tested on chromeos-3.14, chromeos-3.18, and chromeos-4.4 kernels
with a mix of ARM/x86-64 systems.

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index ceb78e2ea4f0..941ece08ba78 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -613,6 +613,7 @@ enum rtl8152_flags {
 #define VENDOR_ID_MICROSOFT0x045e
 #define VENDOR_ID_SAMSUNG  0x04e8
 #define VENDOR_ID_LENOVO   0x17ef
+#define VENDOR_ID_LINKSYS  0x13b1
 #define VENDOR_ID_NVIDIA   0x0955
 
 #define MCU_TYPE_PLA   0x0100
@@ -5316,6 +5317,7 @@ static const struct usb_device_id rtl8152_table[] = {
{REALTEK_USB_DEVICE(VENDOR_ID_LENOVO,  0x7205)},
{REALTEK_USB_DEVICE(VENDOR_ID_LENOVO,  0x720c)},
{REALTEK_USB_DEVICE(VENDOR_ID_LENOVO,  0x7214)},
+   {REALTEK_USB_DEVICE(VENDOR_ID_LINKSYS, 0x0041)},
{REALTEK_USB_DEVICE(VENDOR_ID_NVIDIA,  0x09ff)},
{}
 };
-- 
2.14.1.821.g8fa685d3b7-goog
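
As a simplified picture of what adding a table entry achieves, the
stand-alone sketch below matches a vendor/product pair against a
zero-terminated list. The struct and function names are hypothetical,
and real USB matching considers more than these two fields; the IDs are
the ones that appear in the diff above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced ID entry: vendor and product only. */
struct ex_usb_id {
    uint16_t vendor;
    uint16_t product;
};

#define EX_VENDOR_LENOVO  0x17ef
#define EX_VENDOR_LINKSYS 0x13b1
#define EX_VENDOR_NVIDIA  0x0955

/* Zero-terminated table, in the same spirit as the driver's ID list. */
static const struct ex_usb_id ex_table[] = {
    { EX_VENDOR_LENOVO,  0x7205 },
    { EX_VENDOR_LINKSYS, 0x0041 }, /* the dongle added by this patch */
    { EX_VENDOR_NVIDIA,  0x09ff },
    { 0, 0 }
};

/* Returns 1 if the vendor/product pair is listed, 0 otherwise. */
static int ex_match(const struct ex_usb_id *table,
                    uint16_t vendor, uint16_t product)
{
    for (; table->vendor || table->product; table++)
        if (table->vendor == vendor && table->product == product)
            return 1;

    return 0;
}
```

A device with ID 13b1:0041 is found by this lookup only once the entry
is present in the table.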

Re: [PATCH net-next 2/4] net: dsa: remove phy arg from port enable/disable

2017-09-22 Thread Florian Fainelli

On 09/22/2017 11:12 AM, Vivien Didelot wrote:
> Hi Florian,
> 
> Florian Fainelli  writes:
> 
>> On 09/22/2017 09:17 AM, Vivien Didelot wrote:
>>> The .port_enable and .port_disable functions are meant to deal with the
>>> switch ports only, and no driver is using the phy argument anyway.
>>> Remove it.
>>
>> I don't think this makes sense, there are perfectly legit reasons why a
>> switch driver may have something to do with the PHY device attached to
>> its per-port network interface, we should definitely keep that around,
>> unless you think we should be accessing the PHY within the switch
>> drivers by doing:
>>
>> struct phy_device *phydev = ds->ports[port].netdev->phydev?
> 
> bcm_sf2 is the only user for this phy argument right now. The reason I'm
> doing this is because I prefer to discourage switch drivers from digging
> into the phy device themselves, while, as you said, there must be a
> cleaner solution. This must be handled somehow elsewhere in the stack.

The current approach of passing the phy_device reference as an argument
is certainly a cleaner way then. The port_enable caller can provide the
correct phy_device, and that relieves the switch driver of having to dig
it out of its per-port netdev itself.

> 
> In the meantime, moving the PHY device up to the dsa_port structure is a
> good solution, in order not to expose it in switch ops, but still make
> it available to more complex drivers.
> 
> Do you know if netdev->phydev is usable? Why does DSA have its own copy
> in dsa_slave_priv then?

Historical reasons mostly. Considering the complexity of
dsa_slave_phy_setup(), I would certainly be extremely careful in
changing any of this, the potential for breakage is pretty big. At first
glance, I would say that this is a safe conversion to do, and I can test
this on the HW I have here anyway.
-- 
Florian

Re: [PATCH net-next 2/4] net: dsa: remove phy arg from port enable/disable

2017-09-22 Thread Vivien Didelot

Hi Florian,

Florian Fainelli  writes:

> On 09/22/2017 09:17 AM, Vivien Didelot wrote:
>> The .port_enable and .port_disable functions are meant to deal with the
>> switch ports only, and no driver is using the phy argument anyway.
>> Remove it.
>
> I don't think this makes sense, there are perfectly legit reasons why a
> switch driver may have something to do with the PHY device attached to
> its per-port network interface, we should definitely keep that around,
> unless you think we should be accessing the PHY within the switch
> drivers by doing:
>
> struct phy_device *phydev = ds->ports[port].netdev->phydev?

bcm_sf2 is the only user for this phy argument right now. The reason I'm
doing this is because I prefer to discourage switch drivers from digging
into the phy device themselves, while, as you said, there must be a
cleaner solution. This must be handled somehow elsewhere in the stack.

In the meantime, moving the PHY device up to the dsa_port structure is a
good solution, in order not to expose it in switch ops, but still make
it available to more complex drivers.

Do you know if netdev->phydev is usable? Why does DSA have its own copy in
dsa_slave_priv then?

I'll respin, thanks.

Vivien

Re: [PATCH,v2,net-next 1/2] tun: enable NAPI for TUN/TAP driver

2017-09-22 Thread महेश बंडेवार

On Fri, Sep 22, 2017 at 11:03 AM, Willem de Bruijn
 wrote:
> On Fri, Sep 22, 2017 at 1:11 PM, Mahesh Bandewar (महेश बंडेवार)
>  wrote:
>>>  #ifdef CONFIG_TUN_VNET_CROSS_LE
>>>  static inline bool tun_legacy_is_little_endian(struct tun_struct *tun)
>>>  {
>>> @@ -541,6 +604,11 @@ static void __tun_detach(struct tun_file *tfile, bool 
>>> clean)
>>>
>>> tun = rtnl_dereference(tfile->tun);
>>>
>>> +   if (tun && clean) {
>>> +   tun_napi_disable(tun, tfile);
>> are we missing synchronize_net() separating disable and del calls?
>
> That is not needed here. napi_disable has its own mechanism for waiting
> until a napi struct is no longer run. netif_napi_del will call synchronize_net
> if needed.
Yes, that will do. Thanks.

> These two calls are made one after the other in quite a few drivers.

Re: [Intel-wired-lan] [PATCH][V3] e1000: avoid null pointer dereference on invalid stat type

2017-09-22 Thread Alexander Duyck

On Fri, Sep 22, 2017 at 10:13 AM, Colin King  wrote:
> From: Colin Ian King 
>
> Currently if the stat type is invalid then data[i] is being set
> either by dereferencing a null pointer p, or it is reading from
> an incorrect previous location if we had a valid stat type
> previously.  Fix this by skipping over the read of p on an invalid
> stat type.
>
> Detected by CoverityScan, CID#113385 ("Explicit null dereferenced")
>
> Signed-off-by: Colin Ian King 

Looks good to me.

Reviewed-by: Alexander Duyck 

> ---
>  drivers/net/ethernet/intel/e1000/e1000_ethtool.c | 9 -
>  1 file changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/e1000/e1000_ethtool.c 
> b/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
> index ec8aa4562cc9..3b3983a1ffbb 100644
> --- a/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
> +++ b/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
> @@ -1824,11 +1824,12 @@ static void e1000_get_ethtool_stats(struct net_device 
> *netdev,
>  {
> struct e1000_adapter *adapter = netdev_priv(netdev);
> int i;
> -   char *p = NULL;
> const struct e1000_stats *stat = e1000_gstrings_stats;
>
> e1000_update_stats(adapter);
> -   for (i = 0; i < E1000_GLOBAL_STATS_LEN; i++) {
> +   for (i = 0; i < E1000_GLOBAL_STATS_LEN; i++, stat++) {
> +   char *p;
> +
> switch (stat->type) {
> case NETDEV_STATS:
> p = (char *)netdev + stat->stat_offset;
> @@ -1839,15 +1840,13 @@ static void e1000_get_ethtool_stats(struct net_device 
> *netdev,
> default:
> WARN_ONCE(1, "Invalid E1000 stat type: %u index %d\n",
>   stat->type, i);
> -   break;
> +   continue;
> }
>
> if (stat->sizeof_stat == sizeof(u64))
> data[i] = *(u64 *)p;
> else
> data[i] = *(u32 *)p;
> -
> -   stat++;
> }
>  /* BUG_ON(i != E1000_STATS_LEN); */
>  }
> --
> 2.14.1
>
> ___
> Intel-wired-lan mailing list
> intel-wired-...@osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
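
The control-flow change in this fix can be reproduced in a few lines of
stand-alone C. The names below (ex_stat, ex_get_stats) are hypothetical
and the layout is reduced to what the loop needs; the two points carried
over from the patch are that the entry pointer advances in the loop
header and that an unknown type skips the read with "continue".

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum ex_stat_type { EX_NETDEV_STATS, EX_ADAPTER_STATS, EX_BOGUS_STATS };

/* Hypothetical, reduced descriptor of one statistic. */
struct ex_stat {
    enum ex_stat_type type;
    size_t size;    /* sizeof(uint64_t) or sizeof(uint32_t) */
    size_t offset;  /* byte offset inside the source object */
};

/*
 * Copy n statistics into data[].  An entry with an unknown type is
 * skipped, leaving data[i] untouched, and because `stat` is advanced
 * in the loop header the skip cannot desynchronise it from `i`.
 */
static void ex_get_stats(const struct ex_stat *stat, size_t n,
                         const char *netdev, const char *adapter,
                         uint64_t *data)
{
    size_t i;

    for (i = 0; i < n; i++, stat++) {
        const char *p;

        switch (stat->type) {
        case EX_NETDEV_STATS:
            p = netdev + stat->offset;
            break;
        case EX_ADAPTER_STATS:
            p = adapter + stat->offset;
            break;
        default:
            continue; /* never read through an unset pointer */
        }

        if (stat->size == sizeof(uint64_t)) {
            uint64_t v;

            memcpy(&v, p, sizeof(v));
            data[i] = v;
        } else {
            uint32_t v;

            memcpy(&v, p, sizeof(v));
            data[i] = v;
        }
    }
}
```

With "break" in the default case and the increment at the end of the
loop body, the same input would have reached the read with a stale or
null pointer.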

Re: [PATCH,v2,net-next 1/2] tun: enable NAPI for TUN/TAP driver

2017-09-22 Thread Willem de Bruijn

On Fri, Sep 22, 2017 at 1:11 PM, Mahesh Bandewar (महेश बंडेवार)
 wrote:
>>  #ifdef CONFIG_TUN_VNET_CROSS_LE
>>  static inline bool tun_legacy_is_little_endian(struct tun_struct *tun)
>>  {
>> @@ -541,6 +604,11 @@ static void __tun_detach(struct tun_file *tfile, bool 
>> clean)
>>
>> tun = rtnl_dereference(tfile->tun);
>>
>> +   if (tun && clean) {
>> +   tun_napi_disable(tun, tfile);
> are we missing synchronize_net() separating disable and del calls?

That is not needed here. napi_disable has its own mechanism for waiting
until a napi struct is no longer run. netif_napi_del will call synchronize_net
if needed. These two calls are made one after the other in quite a few drivers.

Re: [PATCH] Add a driver for Renesas uPD60620 and uPD60620A PHYs

2017-09-22 Thread Andrew Lunn

On Fri, Sep 22, 2017 at 05:08:45PM +, Bernd Edlinger wrote:
> Signed-off-by: Bernd Edlinger 
> ---
>   drivers/net/phy/Kconfig|   5 +
>   drivers/net/phy/Makefile   |   1 +
>   drivers/net/phy/uPD60620.c | 226 
> +
>   3 files changed, 232 insertions(+)
>   create mode 100644 drivers/net/phy/uPD60620.c
> 
> diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
> index a9d16a3..25089f0 100644
> --- a/drivers/net/phy/Kconfig
> +++ b/drivers/net/phy/Kconfig
> @@ -287,6 +287,11 @@ config DP83867_PHY
>   ---help---
> Currently supports the DP83867 PHY.
> 
> +config RENESAS_PHY
> + tristate "Driver for Renesas PHYs"
> + ---help---
> +   Supports the uPD60620 and uPD60620A PHYs.
> +

Hi Bernd

Please call this "Renesas PHYs" and place it in alphabetical order.

>   config FIXED_PHY
>   tristate "MDIO Bus/PHY emulation with fixed speed/link PHYs"
>   depends on PHYLIB
> diff --git a/drivers/net/phy/Makefile b/drivers/net/phy/Makefile
> index 416df92..1404ad3 100644
> --- a/drivers/net/phy/Makefile
> +++ b/drivers/net/phy/Makefile
> @@ -72,6 +72,7 @@ obj-$(CONFIG_MICROSEMI_PHY) += mscc.o
>   obj-$(CONFIG_NATIONAL_PHY)  += national.o
>   obj-$(CONFIG_QSEMI_PHY) += qsemi.o
>   obj-$(CONFIG_REALTEK_PHY)   += realtek.o
> +obj-$(CONFIG_RENESAS_PHY)+= uPD60620.o
>   obj-$(CONFIG_ROCKCHIP_PHY)  += rockchip.o
>   obj-$(CONFIG_SMSC_PHY)  += smsc.o
>   obj-$(CONFIG_STE10XP)   += ste10Xp.o
> diff --git a/drivers/net/phy/uPD60620.c b/drivers/net/phy/uPD60620.c
> new file mode 100644
> index 000..b3d900c
> --- /dev/null
> +++ b/drivers/net/phy/uPD60620.c
> @@ -0,0 +1,226 @@
> +/*
> + * Driver for the Renesas PHY uPD60620.
> + *
> + * Copyright (C) 2015 Softing Industrial Automation GmbH
> + *
> + *  This program is free software; you can redistribute it and/or modify
> + *  it under the terms of the GNU General Public License as published by
> + *  the Free Software Foundation; either version 2 of the License, or
> + *  (at your option) any later version.
> + *
> + */
> +
> +#include 
> +#include 
> +#include 
> +
> +#define UPD60620_PHY_ID    0xb8242824
> +
> +/* Extended Registers and values */
> +/* PHY Special Control/Status */
> +#define PHY_PHYSCR         0x1F    /* PHY.31 */
> +#define PHY_PHYSCR_10MB    0x0004  /* PHY speed = 10mb */
> +#define PHY_PHYSCR_100MB   0x0008  /* PHY speed = 100mb */
> +#define PHY_PHYSCR_DUPLEX  0x0010  /* PHY Duplex */
> +#define PHY_PHYSCR_RSVD5   0x0020  /* Reserved Bit 5 */
> +#define PHY_PHYSCR_MIIMOD  0x0040  /* Enable 4B5B MII mode */

Are any of these comments actually useful? It seems like the defines
are pretty obvious.

> +#define PHY_PHYSCR_RSVD7   0x0080  /* Reserved Bit 7 */
> +#define PHY_PHYSCR_RSVD8   0x0100  /* Reserved Bit 8 */
> +#define PHY_PHYSCR_RSVD9   0x0200  /* Reserved Bit 9 */
> +#define PHY_PHYSCR_RSVD10  0x0400  /* Reserved Bit 10 */
> +#define PHY_PHYSCR_RSVD11  0x0800  /* Reserved Bit 11 */
> +#define PHY_PHYSCR_ANDONE  0x1000  /* Auto negotiation done */
> +#define PHY_PHYSCR_RSVD13  0x2000  /* Reserved Bit 13 */
> +#define PHY_PHYSCR_RSVD14  0x4000  /* Reserved Bit 14 */
> +#define PHY_PHYSCR_RSVD15  0x8000  /* Reserved Bit 15 */

It looks like the only registers you use are SCR and SPM. Maybe delete
all the rest? Or do you plan to add more features making use of these
registers?

> +/* Init PHY */
> +
> +static int upd60620_config_init(struct phy_device *phydev)
> +{
> + /* Enable support for passive HUBs (could be a strap option) */
> + /* PHYMODE: All speeds, HD in parallel detect */
> + return phy_write(phydev, PHY_SPM, 0x0180 | phydev->mdio.addr);
> +}
> +
> +/* Get PHY status from common registers */
> +
> +static int upd60620_read_status(struct phy_device *phydev)
> +{
> + int phy_state;
> +
> + /* Read negotiated state */
> + phy_state = phy_read(phydev, MII_BMSR);
> + if (phy_state < 0)
> + return phy_state;
> +
> + phydev->link = 0;
> + phydev->lp_advertising = 0;
> + phydev->pause = 0;
> + phydev->asym_pause = 0;
> +
> + if (phy_state & BMSR_ANEGCOMPLETE) {

It is worth comparing this against genphy_read_status() which is the
reference implementation. You would normally check if auto negotiation
is enabled, not if it has completed. If it is enabled you read the
current negotiated state, even if it is not completed.
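
To illustrate the ordering described above, here is a minimal userspace sketch, not the driver code. The PHYSCR bit values are copied from the patch; the BMCR bit values are the standard MII ones. The helper `decode_link_mode` and `struct link_mode` are hypothetical names introduced only for this example. The branch is chosen by whether autonegotiation is enabled, and the negotiated state is read even if negotiation has not completed.

```c
#include <stdint.h>

/* Bit values copied from the patch under review. */
#define PHY_PHYSCR_10MB   0x0004
#define PHY_PHYSCR_100MB  0x0008
#define PHY_PHYSCR_DUPLEX 0x0010

/* Standard MII control register bits (as in include/uapi/linux/mii.h). */
#define BMCR_FULLDPLX 0x0100
#define BMCR_SPEED100 0x2000

/* Hypothetical result type for this sketch only. */
struct link_mode {
	int speed;       /* 10 or 100 */
	int full_duplex; /* 0 = half, 1 = full */
};

/*
 * Hypothetical helper: pick the source of truth based on whether
 * autonegotiation is enabled, not on whether it has completed.
 */
struct link_mode decode_link_mode(int autoneg_enabled, uint16_t bmcr,
				  uint16_t physcr)
{
	struct link_mode m = { 10, 0 };

	if (autoneg_enabled) {
		/* Negotiated state comes from the vendor status register. */
		if (physcr & PHY_PHYSCR_100MB)
			m.speed = 100;
		if (physcr & PHY_PHYSCR_DUPLEX)
			m.full_duplex = 1;
	} else {
		/* Forced mode: speed and duplex come from BMCR. */
		if (bmcr & BMCR_SPEED100)
			m.speed = 100;
		if (bmcr & BMCR_FULLDPLX)
			m.full_duplex = 1;
	}
	return m;
}
```

In the real driver the link bit would still come from BMSR; this sketch only covers the speed/duplex decision.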

> + phy_state = phy_read(phydev, PHY_PHYSCR);
> + if (phy_state < 0)
> + return phy_state;
> +
> + if (phy_state & (PHY_PHYSCR_10MB | PHY_PHYSCR_100MB)) {
> + phydev->link = 1;
> + phydev->speed = SPEED_10;
> + phydev->duplex = DUPLEX_HALF;
> +
> + if (phy_state & PHY_PHYSCR_100MB)
> + phydev->speed = SPEED_100;
> +

Re: [PATCH,v2,net-next 2/2] tun: enable napi_gro_frags() for TUN/TAP driver

2017-09-22 Thread Willem de Bruijn

On Fri, Sep 22, 2017 at 1:48 PM, Petar Penkov  wrote:
> On Fri, Sep 22, 2017 at 9:51 AM, Mahesh Bandewar (महेश बंडेवार)
>  wrote:
>> On Fri, Sep 22, 2017 at 7:06 AM, Willem de Bruijn
>>  wrote:
>>>> @@ -2061,6 +2174,9 @@ static int tun_set_iff(struct net *net, struct file 
>>>> *file, struct ifreq *ifr)
>>>> if (tfile->detached)
>>>> return -EINVAL;
>>>>
>>>> +   if ((ifr->ifr_flags & IFF_NAPI_FRAGS) && !capable(CAP_NET_ADMIN))
>>>> +   return -EPERM;
>>>> +
>>>
>>> This should perhaps be moved into the !dev branch, directly below the
>>> ns_capable check.
>>>
>> Hmm, does that mean fail only on creation but allow to attach if
>> exists? That would be wrong, isn't it? Correct me if I'm wrong but we
>> want to prevent both these scenarios if user does not have sufficient
>> privileges (i.e. NET_ADMIN in init-ns).

Ok.

>>
> My understanding is we want to protect both scenarios.
>>>> dev = __dev_get_by_name(net, ifr->ifr_name);
>>>> if (dev) {
>>>> if (ifr->ifr_flags & IFF_TUN_EXCL)
>>>> @@ -2185,6 +2301,9 @@ static int tun_set_iff(struct net *net, struct file 
>>>> *file, struct ifreq *ifr)
>>>> tun->flags = (tun->flags & ~TUN_FEATURES) |
>>>> (ifr->ifr_flags & TUN_FEATURES);
>>>>
>>>> +   if (!(tun->flags & IFF_NAPI) || (tun->flags & TUN_TYPE_MASK) != 
>>>> IFF_TAP)
>>>> +   tun->flags = tun->flags & ~IFF_NAPI_FRAGS;
>>>> +
>>>
>>> Similarly, this check only needs to be performed in that branch.
>>> Instead of reverting to non-frags mode, a tun_set_iff with the wrong
>>> set of flags should probably fail hard.
>> Yes, agree, wrong set of flags should fail hard and probably be done
>> before attach or open, no?
> Agreed, in v3 I will push this check before the conditional so both
> branches can be rejected with EINVAL.

Sounds great.
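
For reference, the "fail hard" behaviour agreed on in this thread could look roughly like the userspace sketch below. It is not the v3 patch. The flag values are assumed to mirror include/uapi/linux/if_tun.h, the helper name `check_napi_frags_flags` is hypothetical, and the CAP_NET_ADMIN check is left out because it cannot be modelled outside the kernel.

```c
#include <errno.h>

/* Values assumed to mirror include/uapi/linux/if_tun.h. */
#define IFF_TUN        0x0001
#define IFF_TAP        0x0002
#define IFF_NAPI       0x0010
#define IFF_NAPI_FRAGS 0x0020
#define TUN_TYPE_MASK  0x000f

/*
 * Hypothetical helper: reject an inconsistent flag combination up
 * front instead of silently clearing IFF_NAPI_FRAGS afterwards.
 * Returns 0 if the flags are acceptable, -EINVAL otherwise.
 */
int check_napi_frags_flags(unsigned int flags)
{
	if (!(flags & IFF_NAPI_FRAGS))
		return 0; /* nothing to validate */

	/* Frags mode needs NAPI enabled and a TAP (not TUN) device. */
	if (!(flags & IFF_NAPI) || (flags & TUN_TYPE_MASK) != IFF_TAP)
		return -EINVAL;

	return 0;
}
```

Running such a check before the `dev` lookup means both the attach path and the create path reject the request in the same way.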

Re: [PATCH,v2,net-next 2/2] tun: enable napi_gro_frags() for TUN/TAP driver

2017-09-22 Thread Petar Penkov

On Fri, Sep 22, 2017 at 9:51 AM, Mahesh Bandewar (महेश बंडेवार)
 wrote:
> On Fri, Sep 22, 2017 at 7:06 AM, Willem de Bruijn
>  wrote:
>>> @@ -2061,6 +2174,9 @@ static int tun_set_iff(struct net *net, struct file 
>>> *file, struct ifreq *ifr)
>>> if (tfile->detached)
>>> return -EINVAL;
>>>
>>> +   if ((ifr->ifr_flags & IFF_NAPI_FRAGS) && !capable(CAP_NET_ADMIN))
>>> +   return -EPERM;
>>> +
>>
>> This should perhaps be moved into the !dev branch, directly below the
>> ns_capable check.
>>
> Hmm, does that mean fail only on creation but allow to attach if
> exists? That would be wrong, isn't it? Correct me if I'm wrong but we
> want to prevent both these scenarios if user does not have sufficient
> privileges (i.e. NET_ADMIN in init-ns).
>
My understanding is we want to protect both scenarios.
>>> dev = __dev_get_by_name(net, ifr->ifr_name);
>>> if (dev) {
>>> if (ifr->ifr_flags & IFF_TUN_EXCL)
>>> @@ -2185,6 +2301,9 @@ static int tun_set_iff(struct net *net, struct file 
>>> *file, struct ifreq *ifr)
>>> tun->flags = (tun->flags & ~TUN_FEATURES) |
>>> (ifr->ifr_flags & TUN_FEATURES);
>>>
>>> +   if (!(tun->flags & IFF_NAPI) || (tun->flags & TUN_TYPE_MASK) != 
>>> IFF_TAP)
>>> +   tun->flags = tun->flags & ~IFF_NAPI_FRAGS;
>>> +
>>
>> Similarly, this check only needs to be performed in that branch.
>> Instead of reverting to non-frags mode, a tun_set_iff with the wrong
>> set of flags should probably fail hard.
> Yes, agree, wrong set of flags should fail hard and probably be done
> before attach or open, no?
Agreed, in v3 I will push this check before the conditional so both
branches can be rejected with EINVAL.

Re: [PATCH net-next 4/4] net: dsa: add port enable and disable helpers

2017-09-22 Thread Florian Fainelli

On 09/22/2017 09:17 AM, Vivien Didelot wrote:
> Provide dsa_port_enable and dsa_port_disable helpers to respectively
> enable and disable a switch port. This makes the dsa_port_set_state_now
> helper static.
> 
> Signed-off-by: Vivien Didelot 

Reviewed-by: Florian Fainelli 
-- 
Florian

Re: [PATCH net-next 2/4] net: dsa: remove phy arg from port enable/disable

2017-09-22 Thread Florian Fainelli

On 09/22/2017 09:17 AM, Vivien Didelot wrote:
> The .port_enable and .port_disable functions are meant to deal with the
> switch ports only, and no driver is using the phy argument anyway.
> Remove it.

I don't think this makes sense; there are perfectly legit reasons why a
switch driver may have something to do with the PHY device attached to
its per-port network interface. We should definitely keep that around,
unless you think we should be accessing the PHY within the switch
drivers by doing:

struct phy_device *phydev = ds->ports[port].netdev->phydev?

> 
> Signed-off-by: Vivien Didelot 
> ---
>  drivers/net/dsa/b53/b53_common.c   |  6 +++---
>  drivers/net/dsa/b53/b53_priv.h |  4 ++--
>  drivers/net/dsa/bcm_sf2.c  | 16 +++-
>  drivers/net/dsa/lan9303-core.c |  6 ++
>  drivers/net/dsa/microchip/ksz_common.c |  6 ++
>  drivers/net/dsa/mt7530.c   |  8 +++-
>  drivers/net/dsa/mv88e6xxx/chip.c   |  6 ++
>  drivers/net/dsa/qca8k.c|  6 ++
>  include/net/dsa.h  |  6 ++
>  net/dsa/slave.c|  4 ++--
>  10 files changed, 27 insertions(+), 41 deletions(-)
> 
> diff --git a/drivers/net/dsa/b53/b53_common.c 
> b/drivers/net/dsa/b53/b53_common.c
> index d4ce092def83..e46eb29d29f0 100644
> --- a/drivers/net/dsa/b53/b53_common.c
> +++ b/drivers/net/dsa/b53/b53_common.c
> @@ -502,7 +502,7 @@ void b53_imp_vlan_setup(struct dsa_switch *ds, int 
> cpu_port)
>  }
>  EXPORT_SYMBOL(b53_imp_vlan_setup);
>  
> -int b53_enable_port(struct dsa_switch *ds, int port, struct phy_device *phy)
> +int b53_enable_port(struct dsa_switch *ds, int port)
>  {
>   struct b53_device *dev = ds->priv;
>   unsigned int cpu_port = dev->cpu_port;
> @@ -531,7 +531,7 @@ int b53_enable_port(struct dsa_switch *ds, int port, 
> struct phy_device *phy)
>  }
>  EXPORT_SYMBOL(b53_enable_port);
>  
> -void b53_disable_port(struct dsa_switch *ds, int port, struct phy_device 
> *phy)
> +void b53_disable_port(struct dsa_switch *ds, int port)
>  {
>   struct b53_device *dev = ds->priv;
>   u8 reg;
> @@ -874,7 +874,7 @@ static int b53_setup(struct dsa_switch *ds)
>   if (dsa_is_cpu_port(ds, port))
>   b53_enable_cpu_port(dev, port);
>   else if (!(BIT(port) & ds->enabled_port_mask))
> - b53_disable_port(ds, port, NULL);
> + b53_disable_port(ds, port);
>   }
>  
>   return ret;
> diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
> index 603c66d240d8..688d02ee6155 100644
> --- a/drivers/net/dsa/b53/b53_priv.h
> +++ b/drivers/net/dsa/b53/b53_priv.h
> @@ -311,8 +311,8 @@ int b53_mirror_add(struct dsa_switch *ds, int port,
>  struct dsa_mall_mirror_tc_entry *mirror, bool ingress);
>  void b53_mirror_del(struct dsa_switch *ds, int port,
>   struct dsa_mall_mirror_tc_entry *mirror);
> -int b53_enable_port(struct dsa_switch *ds, int port, struct phy_device *phy);
> -void b53_disable_port(struct dsa_switch *ds, int port, struct phy_device 
> *phy);
> +int b53_enable_port(struct dsa_switch *ds, int port);
> +void b53_disable_port(struct dsa_switch *ds, int port);
>  void b53_brcm_hdr_setup(struct dsa_switch *ds, int port);
>  void b53_eee_enable_set(struct dsa_switch *ds, int port, bool enable);
>  int b53_eee_init(struct dsa_switch *ds, int port, struct phy_device *phy);
> diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
> index ad96b9725a2c..77e0c43f973b 100644
> --- a/drivers/net/dsa/bcm_sf2.c
> +++ b/drivers/net/dsa/bcm_sf2.c
> @@ -159,8 +159,7 @@ static inline void bcm_sf2_port_intr_disable(struct 
> bcm_sf2_priv *priv,
>   intrl2_1_writel(priv, P_IRQ_MASK(off), INTRL2_CPU_CLEAR);
>  }
>  
> -static int bcm_sf2_port_setup(struct dsa_switch *ds, int port,
> -   struct phy_device *phy)
> +static int bcm_sf2_port_setup(struct dsa_switch *ds, int port)
>  {
>   struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds);
>   unsigned int i;
> @@ -191,11 +190,10 @@ static int bcm_sf2_port_setup(struct dsa_switch *ds, 
> int port,
>   if (port == priv->moca_port)
>   bcm_sf2_port_intr_enable(priv, port);
>  
> - return b53_enable_port(ds, port, phy);
> + return b53_enable_port(ds, port);
>  }
>  
> -static void bcm_sf2_port_disable(struct dsa_switch *ds, int port,
> -  struct phy_device *phy)
> +static void bcm_sf2_port_disable(struct dsa_switch *ds, int port)
>  {
>   struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds);
>   u32 off, reg;
> @@ -214,7 +212,7 @@ static void bcm_sf2_port_disable(struct dsa_switch *ds, 
> int port,
>   else
>   off = CORE_G_PCTL_PORT(port);
>  
> - b53_disable_port(ds, port, phy);
> + b53_disable_port(ds, port);
>  
>   /* Power down the port memory */
>   reg =

Re: [PATCH net-next 3/4] net: dsa: make slave close symmetrical to open

2017-09-22 Thread Florian Fainelli

On 09/22/2017 09:17 AM, Vivien Didelot wrote:
> The DSA slave open function configures the unicast MAC addresses on the
> master device, enables the switch port, changes its STP state, then starts
> the PHY device.
> 
> Make the close function symmetric, by first stopping the PHY device,
> then changing the STP state, disabling the switch port and restoring the
> master device.
> 
> Signed-off-by: Vivien Didelot 

Reviewed-by: Florian Fainelli 
-- 
Florian

[PATCH][V3] e1000: avoid null pointer dereference on invalid stat type

2017-09-22 Thread Colin King

From: Colin Ian King 

Currently if the stat type is invalid then data[i] is being set
either by dereferencing a null pointer p, or it is reading from
an incorrect previous location if we had a valid stat type
previously.  Fix this by skipping over the read of p on an invalid
stat type.

Detected by CoverityScan, CID#113385 ("Explicit null dereferenced")

Signed-off-by: Colin Ian King 
---
 drivers/net/ethernet/intel/e1000/e1000_ethtool.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000/e1000_ethtool.c 
b/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
index ec8aa4562cc9..3b3983a1ffbb 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
@@ -1824,11 +1824,12 @@ static void e1000_get_ethtool_stats(struct net_device 
*netdev,
 {
struct e1000_adapter *adapter = netdev_priv(netdev);
int i;
-   char *p = NULL;
const struct e1000_stats *stat = e1000_gstrings_stats;
 
e1000_update_stats(adapter);
-   for (i = 0; i < E1000_GLOBAL_STATS_LEN; i++) {
+   for (i = 0; i < E1000_GLOBAL_STATS_LEN; i++, stat++) {
+   char *p;
+
switch (stat->type) {
case NETDEV_STATS:
p = (char *)netdev + stat->stat_offset;
@@ -1839,15 +1840,13 @@ static void e1000_get_ethtool_stats(struct net_device 
*netdev,
default:
WARN_ONCE(1, "Invalid E1000 stat type: %u index %d\n",
  stat->type, i);
-   break;
+   continue;
}
 
if (stat->sizeof_stat == sizeof(u64))
data[i] = *(u64 *)p;
else
data[i] = *(u32 *)p;
-
-   stat++;
}
 /* BUG_ON(i != E1000_STATS_LEN); */
 }
-- 
2.14.1
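
The control flow of the fix can be modelled in userspace as below. This is a simplified sketch, not the e1000 code: `struct demo_stat`, the `DEMO_*` type values and `collect_stats` are hypothetical stand-ins. The point it shows is that advancing `stat` in the loop header keeps it in step with `i` when an invalid type is skipped with `continue`, and that `p` is never read on that path.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for NETDEV_STATS / E1000_STATS. */
enum { DEMO_NETDEV_STATS, DEMO_ADAPTER_STATS };

/* Hypothetical, reduced version of a stats descriptor. */
struct demo_stat {
	int type;
	size_t sizeof_stat;
	size_t stat_offset;
};

void collect_stats(const struct demo_stat *stat, size_t n,
		   const char *netdev_base, const char *adapter_base,
		   uint64_t *data)
{
	size_t i;

	/* stat++ lives in the loop header so that 'continue' still advances it. */
	for (i = 0; i < n; i++, stat++) {
		const char *p;

		switch (stat->type) {
		case DEMO_NETDEV_STATS:
			p = netdev_base + stat->stat_offset;
			break;
		case DEMO_ADAPTER_STATS:
			p = adapter_base + stat->stat_offset;
			break;
		default:
			/* Invalid type: skip the read, data[i] stays untouched. */
			continue;
		}

		if (stat->sizeof_stat == sizeof(uint64_t)) {
			uint64_t v;

			memcpy(&v, p, sizeof(v));
			data[i] = v;
		} else {
			uint32_t v;

			memcpy(&v, p, sizeof(v));
			data[i] = v;
		}
	}
}
```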

Re: [PATCH,v2,net-next 1/2] tun: enable NAPI for TUN/TAP driver

2017-09-22 Thread महेश बंडेवार

>  #ifdef CONFIG_TUN_VNET_CROSS_LE
>  static inline bool tun_legacy_is_little_endian(struct tun_struct *tun)
>  {
> @@ -541,6 +604,11 @@ static void __tun_detach(struct tun_file *tfile, bool 
> clean)
>
> tun = rtnl_dereference(tfile->tun);
>
> +   if (tun && clean) {
> +   tun_napi_disable(tun, tfile);
are we missing synchronize_net() separating disable and del calls?
> +   tun_napi_del(tun, tfile);
> +   }
> +
> if (tun && !tfile->detached) {
> u16 index = tfile->queue_index;
> BUG_ON(index >= tun->numqueues);

Re: [PATCH iproute2 v2] man: fix documentation for range of route table ID

2017-09-22 Thread Stephen Hemminger

On Fri, 22 Sep 2017 13:28:54 +0200
Thomas Haller  wrote:

> Signed-off-by: Thomas Haller 
> ---
> Changes in v2:
>   - "0" is not a valid table ID.
> 
>  man/man8/ip-route.8.in | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/man/man8/ip-route.8.in b/man/man8/ip-route.8.in
> index 803de3b9..705ceb20 100644
> --- a/man/man8/ip-route.8.in
> +++ b/man/man8/ip-route.8.in
> @@ -322,7 +322,7 @@ normal routing tables.
>  .P
>  .B Route tables:
>  Linux-2.x can pack routes into several routing tables identified
> -by a number in the range from 1 to 2^31 or by name from the file
> +by a number in the range from 1 to 2^32-1 or by name from the file
>  .B @SYSCONFDIR@/rt_tables
>  By default all normal routes are inserted into the
>  .B main

Applied
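
As a small illustration of the documented range, a numeric table ID check could be sketched as follows. `parse_table_id` is a hypothetical helper written for this example and is not taken from iproute2; it handles only numeric IDs, not names from rt_tables.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a numeric route table ID and accept
 * only values in the documented range 1 .. 2^32-1.
 * Returns 0 on success and stores the ID in *out, -1 otherwise.
 */
int parse_table_id(const char *s, uint32_t *out)
{
	char *end;
	unsigned long long v;

	/* Require a leading digit so signs and whitespace are rejected. */
	if (!s || s[0] < '0' || s[0] > '9')
		return -1;

	errno = 0;
	v = strtoull(s, &end, 0);
	if (errno || *end != '\0')
		return -1;

	/* 0 is not a valid table ID; the upper bound is 2^32-1. */
	if (v < 1 || v > UINT32_MAX)
		return -1;

	*out = (uint32_t)v;
	return 0;
}
```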

[PATCH] Add a driver for Renesas uPD60620 and uPD60620A PHYs

2017-09-22 Thread Bernd Edlinger

Signed-off-by: Bernd Edlinger 
---
  drivers/net/phy/Kconfig|   5 +
  drivers/net/phy/Makefile   |   1 +
  drivers/net/phy/uPD60620.c | 226 
+
  3 files changed, 232 insertions(+)
  create mode 100644 drivers/net/phy/uPD60620.c

diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index a9d16a3..25089f0 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -287,6 +287,11 @@ config DP83867_PHY
---help---
  Currently supports the DP83867 PHY.

+config RENESAS_PHY
+   tristate "Driver for Renesas PHYs"
+   ---help---
+ Supports the uPD60620 and uPD60620A PHYs.
+
  config FIXED_PHY
tristate "MDIO Bus/PHY emulation with fixed speed/link PHYs"
depends on PHYLIB
diff --git a/drivers/net/phy/Makefile b/drivers/net/phy/Makefile
index 416df92..1404ad3 100644
--- a/drivers/net/phy/Makefile
+++ b/drivers/net/phy/Makefile
@@ -72,6 +72,7 @@ obj-$(CONFIG_MICROSEMI_PHY)   += mscc.o
  obj-$(CONFIG_NATIONAL_PHY)+= national.o
  obj-$(CONFIG_QSEMI_PHY)   += qsemi.o
  obj-$(CONFIG_REALTEK_PHY) += realtek.o
+obj-$(CONFIG_RENESAS_PHY)  += uPD60620.o
  obj-$(CONFIG_ROCKCHIP_PHY)+= rockchip.o
  obj-$(CONFIG_SMSC_PHY)+= smsc.o
  obj-$(CONFIG_STE10XP) += ste10Xp.o
diff --git a/drivers/net/phy/uPD60620.c b/drivers/net/phy/uPD60620.c
new file mode 100644
index 000..b3d900c
--- /dev/null
+++ b/drivers/net/phy/uPD60620.c
@@ -0,0 +1,226 @@
+/*
+ * Driver for the Renesas PHY uPD60620.
+ *
+ * Copyright (C) 2015 Softing Industrial Automation GmbH
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ */
+
+#include 
+#include 
+#include 
+
+#define UPD60620_PHY_ID    0xb8242824
+
+/* Extended Registers and values */
+/* PHY Special Control/Status */
+#define PHY_PHYSCR         0x1F    /* PHY.31 */
+#define PHY_PHYSCR_10MB    0x0004  /* PHY speed = 10mb */
+#define PHY_PHYSCR_100MB   0x0008  /* PHY speed = 100mb */
+#define PHY_PHYSCR_DUPLEX  0x0010  /* PHY Duplex */
+#define PHY_PHYSCR_RSVD5   0x0020  /* Reserved Bit 5 */
+#define PHY_PHYSCR_MIIMOD  0x0040  /* Enable 4B5B MII mode */
+#define PHY_PHYSCR_RSVD7   0x0080  /* Reserved Bit 7 */
+#define PHY_PHYSCR_RSVD8   0x0100  /* Reserved Bit 8 */
+#define PHY_PHYSCR_RSVD9   0x0200  /* Reserved Bit 9 */
+#define PHY_PHYSCR_RSVD10  0x0400  /* Reserved Bit 10 */
+#define PHY_PHYSCR_RSVD11  0x0800  /* Reserved Bit 11 */
+#define PHY_PHYSCR_ANDONE  0x1000  /* Auto negotiation done */
+#define PHY_PHYSCR_RSVD13  0x2000  /* Reserved Bit 13 */
+#define PHY_PHYSCR_RSVD14  0x4000  /* Reserved Bit 14 */
+#define PHY_PHYSCR_RSVD15  0x8000  /* Reserved Bit 15 */
+
+/* PHY Global Config Mapping */
+#define PHY_GLOBAL_CONFIG  0x07
+/* PHY GPIO Config Register 1 */
+#define PHY_GPIO_CONFIG1   0x01    /* PHY 7.1 */
+#define PHY_GPIO4_INT0     0x000d  /* GPIO4 configuration */
+#define PHY_GPIO5_INT1     0x00d0  /* GPIO5 configuration */
+
+/* PHY Interrupt Control Register */
+#define PHY_ICR            0x1e    /* PHY.30 */
+#define PHY_ICR_RSVD0      0x0001  /* Reserved bit 0 */
+#define PHY_ICR_ANCPRRN    0x0002  /* Auto negotiation paged received */
+#define PHY_ICR_PDFEN      0x0004  /* Parallel detection fault */
+#define PHY_ICR_ANCLPAEN   0x0008  /* Auto negotiation last page ack */
+#define PHY_ICR_LNKINTEN   0x0010  /* Link down */
+#define PHY_ICR_REMFD      0x0020  /* Remote fault detected */
+#define PHY_ICR_ANCINTEN   0x0040  /* Auto negotiation complete */
+#define PHY_ICR_EOEN       0x0080  /* Energy on generated */
+#define PHY_ICR_RSVD8      0x0100  /* Reserved bit 8 */
+#define PHY_ICR_FEQTRGEN   0x0200  /* FEQ Trigger */
+#define PHY_ICR_BERTRGEN   0x0400  /* BER Counter Trigger */
+#define PHY_ICR_MLINTEN    0x0800  /* Maxlvl */
+#define PHY_ICR_CLPINTEN   0x1000  /* Clipping */
+#define PHY_ICR_RSVD13     0x2000  /* Reserved bit 13 */
+#define PHY_ICR_RSVD14     0x4000  /* Reserved bit 14 */
+#define PHY_ICR_RSVD15     0x8000  /* Reserved bit 15 */
+
+/* PHY Interrupt Status Register */
+#define PHY_ISR            0x1d    /* PHY.29 */
+#define PHY_ISR_DUPINT     0x      /* Placeholder for Duplex/Speed intr */
+#define PHY_ISR_RSVD0      0x0001  /* Reserved bit 0 */
+#define PHY_ISR_ANCPR      0x0002  /* Auto negotiation paged received */
+#define PHY_ISR_PDF        0x0004  /* Parallel detection fault */
+#define PHY_ISR_ANCLPA     0x0008  /* Auto negotiation last page ack */
+#define PHY_ISR_LNKINT     0x0010  /* Link down */
+#define PHY_ISR_REMFD      0x0020  /* Remote fault detected */
+#define PHY_ISR_ANCINT     0x0040  /* Auto negotiation complete */
+#define PHY_ISR_EO

Re: [PATCH iproute2 master 0/2] BPF/XDP json follow-up

2017-09-22 Thread Stephen Hemminger

On Thu, 21 Sep 2017 10:42:27 +0200
Daniel Borkmann  wrote:

> After merging net-next branch into master, Stephen asked to
> fix up json dump for XDP as there were some merge conflicts,
> so here it is.
> 
> Thanks!
> 
> Daniel Borkmann (2):
>   json: move json printer to common library
>   bpf: properly output json for xdp
> 
>  include/json_print.h |  71 
>  ip/Makefile  |   2 +-
>  ip/ip_common.h   |  65 ++
>  ip/ip_print.c| 233 
> ---
>  ip/iplink_xdp.c  |  74 +---
>  lib/Makefile |   2 +-
>  lib/bpf.c|  19 +++--
>  lib/json_print.c | 231 ++
>  8 files changed, 369 insertions(+), 328 deletions(-)
>  create mode 100644 include/json_print.h
>  delete mode 100644 ip/ip_print.c
>  create mode 100644 lib/json_print.c
> 

Applied.

usb/wireless/rsi_91x: use-after-free write in __run_timers

2017-09-22 Thread Andrey Konovalov

Hi!

I've got the following report while fuzzing the kernel with syzkaller.

On commit 6e80ecdddf4ea6f3cd84e83720f3d852e6624a68 (Sep 21).

==
BUG: KASAN: use-after-free in __run_timers+0xc0e/0xd40
Write of size 8 at addr 880069f701b8 by task swapper/0/0

CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.0-rc1-42311-g6e80ecdddf4e #234
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 
 __dump_stack lib/dump_stack.c:16
 dump_stack+0x292/0x395 lib/dump_stack.c:52
 print_address_description+0x78/0x280 mm/kasan/report.c:252
 kasan_report_error mm/kasan/report.c:351
 kasan_report+0x22f/0x340 mm/kasan/report.c:409
 __asan_report_store8_noabort+0x1c/0x20 mm/kasan/report.c:435
 collect_expired_timers ./include/linux/list.h:729
 __run_timers+0xc0e/0xd40 kernel/time/timer.c:1616
 run_timer_softirq+0x83/0x140 kernel/time/timer.c:1646
 __do_softirq+0x305/0xc2d kernel/softirq.c:284
 invoke_softirq kernel/softirq.c:364
 irq_exit+0x171/0x1a0 kernel/softirq.c:405
 exiting_irq ./arch/x86/include/asm/apic.h:638
 smp_apic_timer_interrupt+0x2b9/0x8d0 arch/x86/kernel/apic/apic.c:1048
 apic_timer_interrupt+0x9d/0xb0
 
RIP: 0010:native_safe_halt+0x6/0x10 ./arch/x86/include/asm/irqflags.h:53
RSP: 0018:86607958 EFLAGS: 0282 ORIG_RAX: ff10
RAX: dc20 RBX: 10cc0f2f RCX: 
RDX:  RSI: 0001 RDI: 8662ea64
RBP: 86607958 R08: 813d3501 R09: 
R10:  R11:  R12: 10cc0f3b
R13: 86607a98 R14: 86fc1628 R15: 
 arch_safe_halt ./arch/x86/include/asm/paravirt.h:93
 default_idle+0x127/0x690 arch/x86/kernel/process.c:341
 arch_cpu_idle+0xf/0x20 arch/x86/kernel/process.c:332
 default_idle_call+0x3b/0x60 kernel/sched/idle.c:98
 cpuidle_idle_call kernel/sched/idle.c:156
 do_idle+0x35c/0x440 kernel/sched/idle.c:246
 cpu_startup_entry+0x1d/0x20 kernel/sched/idle.c:351
 rest_init+0xf3/0x100 init/main.c:435
 start_kernel+0x782/0x7b0 init/main.c:710
 x86_64_start_reservations+0x2a/0x2c arch/x86/kernel/head64.c:377
 x86_64_start_kernel+0x77/0x7a arch/x86/kernel/head64.c:358
 secondary_startup_64+0xa5/0xa5 arch/x86/kernel/head_64.S:235

Allocated by task 1845:
 save_stack_trace+0x1b/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:447
 set_track mm/kasan/kasan.c:459
 kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
 kmem_cache_alloc_trace+0x11e/0x2d0 mm/slub.c:2772
 kmalloc ./include/linux/slab.h:493
 kzalloc ./include/linux/slab.h:666
 rsi_91x_init+0x98/0x510 drivers/net/wireless/rsi/rsi_91x_main.c:203
 rsi_probe+0xb6/0x13b0 drivers/net/wireless/rsi/rsi_91x_usb.c:665
 usb_probe_interface+0x35d/0x8e0 drivers/usb/core/driver.c:361
 really_probe drivers/base/dd.c:413
 driver_probe_device+0x610/0xa00 drivers/base/dd.c:557
 __device_attach_driver+0x230/0x290 drivers/base/dd.c:653
 bus_for_each_drv+0x161/0x210 drivers/base/bus.c:463
 __device_attach+0x26e/0x3d0 drivers/base/dd.c:710
 device_initial_probe+0x1f/0x30 drivers/base/dd.c:757
 bus_probe_device+0x1eb/0x290 drivers/base/bus.c:523
 device_add+0xd0b/0x1660 drivers/base/core.c:1835
 usb_set_configuration+0x104e/0x1870 drivers/usb/core/message.c:1932
 generic_probe+0x73/0xe0 drivers/usb/core/generic.c:174
 usb_probe_device+0xaf/0xe0 drivers/usb/core/driver.c:266
 really_probe drivers/base/dd.c:413
 driver_probe_device+0x610/0xa00 drivers/base/dd.c:557
 __device_attach_driver+0x230/0x290 drivers/base/dd.c:653
 bus_for_each_drv+0x161/0x210 drivers/base/bus.c:463
 __device_attach+0x26e/0x3d0 drivers/base/dd.c:710
 device_initial_probe+0x1f/0x30 drivers/base/dd.c:757
 bus_probe_device+0x1eb/0x290 drivers/base/bus.c:523
 device_add+0xd0b/0x1660 drivers/base/core.c:1835
 usb_new_device+0x7b8/0x1020 drivers/usb/core/hub.c:2457
 hub_port_connect drivers/usb/core/hub.c:4903
 hub_port_connect_change drivers/usb/core/hub.c:5009
 port_event drivers/usb/core/hub.c:5115
 hub_event+0x194d/0x3740 drivers/usb/core/hub.c:5195
 process_one_work+0xc7f/0x1db0 kernel/workqueue.c:2119
 worker_thread+0x221/0x1850 kernel/workqueue.c:2253
 kthread+0x3a1/0x470 kernel/kthread.c:231
 ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431

Freed by task 1845:
 save_stack_trace+0x1b/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:447
 set_track mm/kasan/kasan.c:459
 kasan_slab_free+0x72/0xc0 mm/kasan/kasan.c:524
 slab_free_hook mm/slub.c:1390
 slab_free_freelist_hook mm/slub.c:1412
 slab_free mm/slub.c:2988
 kfree+0xf6/0x2f0 mm/slub.c:3919
 rsi_91x_deinit+0x1e8/0x250 drivers/net/wireless/rsi/rsi_91x_main.c:268
 rsi_probe+0xed1/0x13b0 drivers/net/wireless/rsi/rsi_91x_usb.c:709
 usb_probe_interface+0x35d/0x8e0 drivers/usb/core/driver.c:361
 really_probe drivers/base/dd.c:413
 driver_probe_device+0x610/0xa00 drivers/base/dd.c:557
 __device_attach_driver+0x230/0x290 drivers/base/dd.c:653

Re: [PATCH net-next 1/4] net: dsa: move up phy enabling in core

2017-09-22 Thread Florian Fainelli

On 09/22/2017 09:32 AM, Andrew Lunn wrote:
> On Fri, Sep 22, 2017 at 12:17:50PM -0400, Vivien Didelot wrote:
>> bcm_sf2 is currently the only driver using the phy argument passed to
>> .port_enable. It resets the state machine if the phy has been hard
>> reset. This check is generic and can be moved to DSA core.
>>  
>>  dsa_port_set_state_now(p->dp, stp_state);
>>  
>> -if (p->phy)
>> -phy_start(p->phy);
>> +if (phy) {
>> +/* If phy_stop() has been called before, phy will be in
>> + * halted state, and phy_start() will call resume.
>> + *
>> + * The resume path does not configure back autoneg
>> + * settings, and since the internal phy may have been
>> + * hard reset, we need to reset the state machine also.
>> + */
>> +phy->state = PHY_READY;
>> +phy_init_hw(phy);
>> +phy_start(phy);
>> +}
> 
> Hi Vivien
> 
> If this is generic, why is it needed at all here? Shouldn't this
> actually be in phylib?

This does not belong in the core logic within net/dsa/slave.c. The
reason why this is necessary here is because we are doing a HW-based
reset of the PHY, as the comment explains this is specific to how the HW
works. There may be a cleaner solution to this problem, but in any case,
I don't think other drivers should inherit that logic.
-- 
Florian

Re: [PATCH net-next 1/4] net: dsa: move up phy enabling in core

2017-09-22 Thread Florian Fainelli

On 09/22/2017 09:17 AM, Vivien Didelot wrote:
> bcm_sf2 is currently the only driver using the phy argument passed to
> .port_enable. It resets the state machine if the phy has been hard
> reset. This check is generic and can be moved to DSA core.

This is completely specific to bcm_sf2 because it does call
bcm_sf2_gphy_enable_set() which performs a HW reset of the PHY, you
can't move this to the generic portion of net/dsa/slave.c. NACK.

> 
> Signed-off-by: Vivien Didelot 
> ---
>  drivers/net/dsa/bcm_sf2.c | 16 +---
>  net/dsa/slave.c   | 15 +--
>  2 files changed, 14 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
> index 898d5642b516..ad96b9725a2c 100644
> --- a/drivers/net/dsa/bcm_sf2.c
> +++ b/drivers/net/dsa/bcm_sf2.c
> @@ -184,22 +184,8 @@ static int bcm_sf2_port_setup(struct dsa_switch *ds, int 
> port,
>   core_writel(priv, reg, CORE_PORT_TC2_QOS_MAP_PORT(port));
>  
>   /* Re-enable the GPHY and re-apply workarounds */
> - if (priv->int_phy_mask & 1 << port && priv->hw_params.num_gphy == 1) {
> + if (priv->int_phy_mask & 1 << port && priv->hw_params.num_gphy == 1)
>   bcm_sf2_gphy_enable_set(ds, true);
> - if (phy) {
> - /* if phy_stop() has been called before, phy
> -  * will be in halted state, and phy_start()
> -  * will call resume.
> -  *
> -  * the resume path does not configure back
> -  * autoneg settings, and since we hard reset
> -  * the phy manually here, we need to reset the
> -  * state machine also.
> -  */
> - phy->state = PHY_READY;
> - phy_init_hw(phy);
> - }
> - }
>  
>   /* Enable MoCA port interrupts to get notified */
>   if (port == priv->moca_port)
> diff --git a/net/dsa/slave.c b/net/dsa/slave.c
> index 02ace7d462c4..606812160fd5 100644
> --- a/net/dsa/slave.c
> +++ b/net/dsa/slave.c
> @@ -72,6 +72,7 @@ static int dsa_slave_get_iflink(const struct net_device 
> *dev)
>  static int dsa_slave_open(struct net_device *dev)
>  {
>   struct dsa_slave_priv *p = netdev_priv(dev);
> + struct phy_device *phy = p->phy;
>   struct dsa_port *dp = p->dp;
>   struct dsa_switch *ds = dp->ds;
>   struct net_device *master = dsa_master_netdev(p);
> @@ -106,8 +107,18 @@ static int dsa_slave_open(struct net_device *dev)
>  
>   dsa_port_set_state_now(p->dp, stp_state);
>  
> - if (p->phy)
> - phy_start(p->phy);
> + if (phy) {
> + /* If phy_stop() has been called before, phy will be in
> +  * halted state, and phy_start() will call resume.
> +  *
> +  * The resume path does not configure back autoneg
> +  * settings, and since the internal phy may have been
> +  * hard reset, we need to reset the state machine also.
> +  */
> + phy->state = PHY_READY;
> + phy_init_hw(phy);
> + phy_start(phy);
> + }
>  
>   return 0;
>  
> 


-- 
Florian

Re: [PATCH] brcm80211: make const array ucode_ofdm_rates static, reduces object code size

2017-09-22 Thread Arend van Spriel


Please use 'brcmsmac:' as prefix instead of 'brcm80211:'.

On 22-09-17 16:03, Colin King wrote:

From: Colin Ian King 

Don't populate const array ucode_ofdm_rates on the stack, instead make it
static. Makes the object code smaller by 100 bytes:

Before:
   text    data     bss     dec     hex filename
  39482     564       0   40046    9c6e phy_cmn.o

After:
   text    data     bss     dec     hex filename
  39326     620       0   39946    9c0a phy_cmn.o

(gcc 6.3.0, x86-64)


Acked-by: Arend van Spriel 

Signed-off-by: Colin Ian King 
---
  drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_cmn.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_cmn.c 
b/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_cmn.c
index 1c4e9dd57960..3a13d176b221 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_cmn.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_cmn.c
@@ -1916,7 +1916,7 @@ void wlc_phy_txpower_update_shm(struct brcms_phy *pi)
 pi->hwpwr_txcur);
  
  		for (j = TXP_FIRST_OFDM; j <= TXP_LAST_OFDM; j++) {

-   const u8 ucode_ofdm_rates[] = {
+   static const u8 ucode_ofdm_rates[] = {
0x0c, 0x12, 0x18, 0x24, 0x30, 0x48, 0x60, 0x6c
};
offset = wlapi_bmac_rate_shm_offset(
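
The pattern behind this change can be shown with a standalone sketch. The table contents are taken from the diff above; the wrapper function `ofdm_rate_code` and its bounds check are hypothetical and exist only to make the example self-contained.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical wrapper around the table from the patch. Marking the
 * array 'static const' lets the compiler place it once in read-only
 * data, rather than possibly emitting code that rebuilds it on the
 * stack each time the enclosing scope is entered.
 */
uint8_t ofdm_rate_code(size_t idx)
{
	static const uint8_t ucode_ofdm_rates[] = {
		0x0c, 0x12, 0x18, 0x24, 0x30, 0x48, 0x60, 0x6c
	};

	/* Out-of-range index: return 0 (a choice made for this sketch). */
	if (idx >= sizeof(ucode_ofdm_rates) / sizeof(ucode_ofdm_rates[0]))
		return 0;

	return ucode_ofdm_rates[idx];
}
```

The size figures quoted in the commit message are consistent with this: text shrinks while data grows slightly, because the table moves out of generated code and into a data section.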

Re: [PATCH,v2,net-next 2/2] tun: enable napi_gro_frags() for TUN/TAP driver

2017-09-22 Thread महेश बंडेवार

On Fri, Sep 22, 2017 at 7:06 AM, Willem de Bruijn
 wrote:
>> @@ -2061,6 +2174,9 @@ static int tun_set_iff(struct net *net, struct file 
>> *file, struct ifreq *ifr)
>> if (tfile->detached)
>> return -EINVAL;
>>
>> +   if ((ifr->ifr_flags & IFF_NAPI_FRAGS) && !capable(CAP_NET_ADMIN))
>> +   return -EPERM;
>> +
>
> This should perhaps be moved into the !dev branch, directly below the
> ns_capable check.
>
Hmm, does that mean it would fail only on creation but allow attaching if
the device already exists? That would be wrong, wouldn't it? Correct me if
I'm wrong, but we want to prevent both of these scenarios if the user does
not have sufficient privileges (i.e. NET_ADMIN in init-ns).

>> dev = __dev_get_by_name(net, ifr->ifr_name);
>> if (dev) {
>> if (ifr->ifr_flags & IFF_TUN_EXCL)
>> @@ -2185,6 +2301,9 @@ static int tun_set_iff(struct net *net, struct file 
>> *file, struct ifreq *ifr)
>> tun->flags = (tun->flags & ~TUN_FEATURES) |
>> (ifr->ifr_flags & TUN_FEATURES);
>>
>> +   if (!(tun->flags & IFF_NAPI) || (tun->flags & TUN_TYPE_MASK) != 
>> IFF_TAP)
>> +   tun->flags = tun->flags & ~IFF_NAPI_FRAGS;
>> +
>
> Similarly, this check only needs to be performed in that branch.
> Instead of reverting to non-frags mode, a tun_set_iff with the wrong
> set of flags should probably fail hard.
Yes, agreed: a wrong set of flags should fail hard, and the check should
probably be done before attach or open, no?
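
A sketch of the "fail hard" variant under discussion, written as a standalone function so it can be read outside the driver. The flag values are defined locally for illustration (the real definitions live in the uapi if_tun.h header), and `check_napi_frags()` and its `is_net_admin` parameter are hypothetical names, not code from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Local stand-ins for illustration; see if_tun.h for the real ones. */
#define IFF_TUN		0x0001
#define IFF_TAP		0x0002
#define IFF_NAPI	0x0010
#define IFF_NAPI_FRAGS	0x0020
#define TUN_TYPE_MASK	0x000f

static_assert((IFF_NAPI_FRAGS & TUN_TYPE_MASK) == 0,
	      "feature bit must not overlap the type mask");

/* Reject an invalid request instead of silently clearing the flag. */
int check_napi_frags(unsigned int flags, bool is_net_admin)
{
	if (!(flags & IFF_NAPI_FRAGS))
		return 0;		/* nothing to validate */

	if (!is_net_admin)
		return -EPERM;		/* privilege check comes first */

	/* Frags mode only makes sense for a TAP device with NAPI on. */
	if (!(flags & IFF_NAPI) || (flags & TUN_TYPE_MASK) != IFF_TAP)
		return -EINVAL;

	return 0;
}
```

Running such a check before the create/attach split would cover both scenarios raised above, at the cost of also validating flags that the attach path ignores.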

Re: [Intel-wired-lan] [PATCH] i40e: make const array patterns static, reduces object code size

2017-09-22 Thread Jesse Brandeburg

On Fri, 22 Sep 2017 15:11:38 +0100
Colin King  wrote:

> From: Colin Ian King 
> 
> Don't populate const array patterns on the stack, instead make it
> static. Makes the object code smaller by over 60 bytes:
> 
> Before:
>    text    data     bss     dec     hex filename
>    1953     496       0    2449     991 i40e_diag.o
> 
> After:
>    text    data     bss     dec     hex filename
>    1798     584       0    2382     94e i40e_diag.o
> 
> (gcc 6.3.0, x86-64)
> 
> Signed-off-by: Colin Ian King 

Looks good, thanks Colin!

Acked-by: Jesse Brandeburg

Re: [PATCH net-next 4/4] net: dsa: add port enable and disable helpers

2017-09-22 Thread Andrew Lunn

On Fri, Sep 22, 2017 at 12:17:53PM -0400, Vivien Didelot wrote:
> Provide dsa_port_enable and dsa_port_disable helpers to respectively
> enable and disable a switch port. This makes the dsa_port_set_state_now
> helper static.
> 
> Signed-off-by: Vivien Didelot 

Reviewed-by: Andrew Lunn 

Andrew

Re: [PATCH net-next 3/4] net: dsa: make slave close symmetrical to open

2017-09-22 Thread Andrew Lunn

On Fri, Sep 22, 2017 at 12:17:52PM -0400, Vivien Didelot wrote:
> The DSA slave open function configures the unicast MAC addresses on the
> master device, enables the switch port, changes its STP state, then starts
> the PHY device.
> 
> Make the close function symmetric, by first stopping the PHY device,
> then changing the STP state, disabling the switch port and restoring the
> master device.
> 
> Signed-off-by: Vivien Didelot 

Reviewed-by: Andrew Lunn 

Andrew

Re: [PATCH net-next 1/4] net: dsa: move up phy enabling in core

2017-09-22 Thread Andrew Lunn

On Fri, Sep 22, 2017 at 12:17:50PM -0400, Vivien Didelot wrote:
> bcm_sf2 is currently the only driver using the phy argument passed to
> .port_enable. It resets the state machine if the phy has been hard
> reset. This check is generic and can be moved to DSA core.
>  
>   dsa_port_set_state_now(p->dp, stp_state);
>  
> - if (p->phy)
> - phy_start(p->phy);
> + if (phy) {
> + /* If phy_stop() has been called before, phy will be in
> +  * halted state, and phy_start() will call resume.
> +  *
> +  * The resume path does not configure back autoneg
> +  * settings, and since the internal phy may have been
> +  * hard reset, we need to reset the state machine also.
> +  */
> + phy->state = PHY_READY;
> + phy_init_hw(phy);
> + phy_start(phy);
> + }

Hi Vivien

If this is generic, why is it needed at all here? Shouldn't this
actually be in phylib?

Florian ?

   Andrew

Re: [PATCH net-next] bpf/verifier: improve disassembly of BPF_END instructions

2017-09-22 Thread Edward Cree

On 22/09/17 16:16, Alexei Starovoitov wrote:
> looks like we're converging on
> "be16/be32/be64/le16/le32/le64 #register" for BPF_END.
> I guess I can live with that. I would prefer more C like syntax
> to match the rest, but llvm parsing point is a strong one.
Yep, agreed.  I'll post a v2 once we've settled BPF_NEG.
> For BPF_NEG I prefer to do it in C syntax like interpreter does:
> ALU_NEG:
> DST = (u32) -DST;
> ALU64_NEG:
> DST = -DST;
> Yonghong, does it mean that asmparser will equally suffer?
Correction to my earlier statements: verifier will currently disassemble
 neg as:
(87) r0 neg 0
(84) (u32) r0 neg (u32) 0
 because it pretends 'neg' is a compound-assignment operator like +=.
The analogy with be16 and friends would be to use
neg64 r0
neg32 r0
 whereas the analogy with everything else would be
r0 = -r0
r0 = (u32) -r0
 as Alexei says.
I'm happy to go with Alexei's version if it doesn't cause problems for llvm.
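
To make the two proposals concrete, here is a small standalone sketch that prints BPF_NEG in the C-like form and BPF_END in the "be16 r1" form. It is not the verifier's code: `sketch_disasm()` is a hypothetical name, and only the opcode field values are taken from the BPF uapi headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Opcode fields, with the values used by the BPF uapi headers. */
#define BPF_CLASS(code)	((code) & 0x07)
#define BPF_OP(code)	((code) & 0xf0)
#define BPF_SRC(code)	((code) & 0x08)
#define BPF_ALU		0x04
#define BPF_ALU64	0x07
#define BPF_NEG		0x80
#define BPF_END		0xd0
#define BPF_TO_BE	0x08

/* Returns the formatted length, or -1 for opcodes not handled here. */
int sketch_disasm(uint8_t code, int dst, int imm, char *buf, size_t len)
{
	uint8_t cls = BPF_CLASS(code);

	assert(buf != NULL && len > 0);

	if (cls != BPF_ALU && cls != BPF_ALU64)
		return -1;

	switch (BPF_OP(code)) {
	case BPF_NEG:
		/* C-like form: r0 = -r0, or r0 = (u32) -r0 for 32-bit */
		return snprintf(buf, len, "(%02x) r%d = %s-r%d",
				(unsigned int)code, dst,
				cls == BPF_ALU64 ? "" : "(u32) ", dst);
	case BPF_END:
		/* Mnemonic form: be16/le32/... followed by the register;
		 * imm carries the width (16, 32 or 64).
		 */
		return snprintf(buf, len, "(%02x) %s%d r%d",
				(unsigned int)code,
				BPF_SRC(code) == BPF_TO_BE ? "be" : "le",
				imm, dst);
	default:
		return -1;
	}
}
```

With this sketch, opcode 0x87 on r0 is rendered as `(87) r0 = -r0` and opcode 0x84 as `(84) r0 = (u32) -r0`.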

[PATCH net-next 1/4] net: dsa: move up phy enabling in core

2017-09-22 Thread Vivien Didelot

bcm_sf2 is currently the only driver using the phy argument passed to
.port_enable. It resets the state machine if the phy has been hard
reset. This check is generic and can be moved to DSA core.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/bcm_sf2.c | 16 +---
 net/dsa/slave.c   | 15 +--
 2 files changed, 14 insertions(+), 17 deletions(-)

diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 898d5642b516..ad96b9725a2c 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -184,22 +184,8 @@ static int bcm_sf2_port_setup(struct dsa_switch *ds, int 
port,
core_writel(priv, reg, CORE_PORT_TC2_QOS_MAP_PORT(port));
 
/* Re-enable the GPHY and re-apply workarounds */
-   if (priv->int_phy_mask & 1 << port && priv->hw_params.num_gphy == 1) {
+   if (priv->int_phy_mask & 1 << port && priv->hw_params.num_gphy == 1)
bcm_sf2_gphy_enable_set(ds, true);
-   if (phy) {
-   /* if phy_stop() has been called before, phy
-* will be in halted state, and phy_start()
-* will call resume.
-*
-* the resume path does not configure back
-* autoneg settings, and since we hard reset
-* the phy manually here, we need to reset the
-* state machine also.
-*/
-   phy->state = PHY_READY;
-   phy_init_hw(phy);
-   }
-   }
 
/* Enable MoCA port interrupts to get notified */
if (port == priv->moca_port)
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 02ace7d462c4..606812160fd5 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -72,6 +72,7 @@ static int dsa_slave_get_iflink(const struct net_device *dev)
 static int dsa_slave_open(struct net_device *dev)
 {
struct dsa_slave_priv *p = netdev_priv(dev);
+   struct phy_device *phy = p->phy;
struct dsa_port *dp = p->dp;
struct dsa_switch *ds = dp->ds;
struct net_device *master = dsa_master_netdev(p);
@@ -106,8 +107,18 @@ static int dsa_slave_open(struct net_device *dev)
 
dsa_port_set_state_now(p->dp, stp_state);
 
-   if (p->phy)
-   phy_start(p->phy);
+   if (phy) {
+   /* If phy_stop() has been called before, phy will be in
+* halted state, and phy_start() will call resume.
+*
+* The resume path does not configure back autoneg
+* settings, and since the internal phy may have been
+* hard reset, we need to reset the state machine also.
+*/
+   phy->state = PHY_READY;
+   phy_init_hw(phy);
+   phy_start(phy);
+   }
 
return 0;
 
-- 
2.14.1

[PATCH net-next 2/4] net: dsa: remove phy arg from port enable/disable

2017-09-22 Thread Vivien Didelot

The .port_enable and .port_disable functions are meant to deal with the
switch ports only, and no driver is using the phy argument anyway.
Remove it.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/b53/b53_common.c   |  6 +++---
 drivers/net/dsa/b53/b53_priv.h |  4 ++--
 drivers/net/dsa/bcm_sf2.c  | 16 +++-
 drivers/net/dsa/lan9303-core.c |  6 ++
 drivers/net/dsa/microchip/ksz_common.c |  6 ++
 drivers/net/dsa/mt7530.c   |  8 +++-
 drivers/net/dsa/mv88e6xxx/chip.c   |  6 ++
 drivers/net/dsa/qca8k.c|  6 ++
 include/net/dsa.h  |  6 ++
 net/dsa/slave.c|  4 ++--
 10 files changed, 27 insertions(+), 41 deletions(-)

diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index d4ce092def83..e46eb29d29f0 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -502,7 +502,7 @@ void b53_imp_vlan_setup(struct dsa_switch *ds, int cpu_port)
 }
 EXPORT_SYMBOL(b53_imp_vlan_setup);
 
-int b53_enable_port(struct dsa_switch *ds, int port, struct phy_device *phy)
+int b53_enable_port(struct dsa_switch *ds, int port)
 {
struct b53_device *dev = ds->priv;
unsigned int cpu_port = dev->cpu_port;
@@ -531,7 +531,7 @@ int b53_enable_port(struct dsa_switch *ds, int port, struct 
phy_device *phy)
 }
 EXPORT_SYMBOL(b53_enable_port);
 
-void b53_disable_port(struct dsa_switch *ds, int port, struct phy_device *phy)
+void b53_disable_port(struct dsa_switch *ds, int port)
 {
struct b53_device *dev = ds->priv;
u8 reg;
@@ -874,7 +874,7 @@ static int b53_setup(struct dsa_switch *ds)
if (dsa_is_cpu_port(ds, port))
b53_enable_cpu_port(dev, port);
else if (!(BIT(port) & ds->enabled_port_mask))
-   b53_disable_port(ds, port, NULL);
+   b53_disable_port(ds, port);
}
 
return ret;
diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
index 603c66d240d8..688d02ee6155 100644
--- a/drivers/net/dsa/b53/b53_priv.h
+++ b/drivers/net/dsa/b53/b53_priv.h
@@ -311,8 +311,8 @@ int b53_mirror_add(struct dsa_switch *ds, int port,
   struct dsa_mall_mirror_tc_entry *mirror, bool ingress);
 void b53_mirror_del(struct dsa_switch *ds, int port,
struct dsa_mall_mirror_tc_entry *mirror);
-int b53_enable_port(struct dsa_switch *ds, int port, struct phy_device *phy);
-void b53_disable_port(struct dsa_switch *ds, int port, struct phy_device *phy);
+int b53_enable_port(struct dsa_switch *ds, int port);
+void b53_disable_port(struct dsa_switch *ds, int port);
 void b53_brcm_hdr_setup(struct dsa_switch *ds, int port);
 void b53_eee_enable_set(struct dsa_switch *ds, int port, bool enable);
 int b53_eee_init(struct dsa_switch *ds, int port, struct phy_device *phy);
diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index ad96b9725a2c..77e0c43f973b 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -159,8 +159,7 @@ static inline void bcm_sf2_port_intr_disable(struct 
bcm_sf2_priv *priv,
intrl2_1_writel(priv, P_IRQ_MASK(off), INTRL2_CPU_CLEAR);
 }
 
-static int bcm_sf2_port_setup(struct dsa_switch *ds, int port,
- struct phy_device *phy)
+static int bcm_sf2_port_setup(struct dsa_switch *ds, int port)
 {
struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds);
unsigned int i;
@@ -191,11 +190,10 @@ static int bcm_sf2_port_setup(struct dsa_switch *ds, int 
port,
if (port == priv->moca_port)
bcm_sf2_port_intr_enable(priv, port);
 
-   return b53_enable_port(ds, port, phy);
+   return b53_enable_port(ds, port);
 }
 
-static void bcm_sf2_port_disable(struct dsa_switch *ds, int port,
-struct phy_device *phy)
+static void bcm_sf2_port_disable(struct dsa_switch *ds, int port)
 {
struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds);
u32 off, reg;
@@ -214,7 +212,7 @@ static void bcm_sf2_port_disable(struct dsa_switch *ds, int 
port,
else
off = CORE_G_PCTL_PORT(port);
 
-   b53_disable_port(ds, port, phy);
+   b53_disable_port(ds, port);
 
/* Power down the port memory */
reg = core_readl(priv, CORE_MEM_PSM_VDD_CTRL);
@@ -613,7 +611,7 @@ static int bcm_sf2_sw_suspend(struct dsa_switch *ds)
for (port = 0; port < DSA_MAX_PORTS; port++) {
if ((1 << port) & ds->enabled_port_mask ||
dsa_is_cpu_port(ds, port))
-   bcm_sf2_port_disable(ds, port, NULL);
+   bcm_sf2_port_disable(ds, port);
}
 
return 0;
@@ -636,7 +634,7 @@ static int bcm_sf2_sw_resume(struct dsa_switch *ds)
 
for (port = 0; port < DSA_MAX_PORTS; port++) {
if ((1 <<

[PATCH net-next 3/4] net: dsa: make slave close symmetrical to open

2017-09-22 Thread Vivien Didelot

The DSA slave open function configures the unicast MAC addresses on the
master device, enables the switch port, changes its STP state, then starts
the PHY device.

Make the close function symmetric, by first stopping the PHY device,
then changing the STP state, disabling the switch port and restoring the
master device.

Signed-off-by: Vivien Didelot 
---
 net/dsa/slave.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 6290741e496a..235a5c95dfcc 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -144,6 +144,11 @@ static int dsa_slave_close(struct net_device *dev)
if (p->phy)
phy_stop(p->phy);
 
+   dsa_port_set_state_now(p->dp, BR_STATE_DISABLED);
+
+   if (ds->ops->port_disable)
+   ds->ops->port_disable(ds, p->dp->index);
+
dev_mc_unsync(master, dev);
dev_uc_unsync(master, dev);
if (dev->flags & IFF_ALLMULTI)
@@ -154,11 +159,6 @@ static int dsa_slave_close(struct net_device *dev)
if (!ether_addr_equal(dev->dev_addr, master->dev_addr))
dev_uc_del(master, dev->dev_addr);
 
-   if (ds->ops->port_disable)
-   ds->ops->port_disable(ds, p->dp->index);
-
-   dsa_port_set_state_now(p->dp, BR_STATE_DISABLED);
-
return 0;
 }
 
-- 
2.14.1

[PATCH net-next 0/4] net: dsa: simplify port enabling

2017-09-22 Thread Vivien Didelot

This patchset removes the unnecessary PHY device argument in port
enable/disable switch operations, makes slave open and close symmetrical
and finally provides helpers for enabling or disabling a DSA port.

Vivien Didelot (4):
  net: dsa: move up phy enabling in core
  net: dsa: remove phy arg from port enable/disable
  net: dsa: make slave close symmetrical to open
  net: dsa: add port enable and disable helpers

 drivers/net/dsa/b53/b53_common.c   |  6 +++---
 drivers/net/dsa/b53/b53_priv.h |  4 ++--
 drivers/net/dsa/bcm_sf2.c  | 32 --
 drivers/net/dsa/lan9303-core.c |  6 ++
 drivers/net/dsa/microchip/ksz_common.c |  6 ++
 drivers/net/dsa/mt7530.c   |  8 +++-
 drivers/net/dsa/mv88e6xxx/chip.c   |  6 ++
 drivers/net/dsa/qca8k.c|  6 ++
 include/net/dsa.h  |  6 ++
 net/dsa/dsa_priv.h |  3 ++-
 net/dsa/port.c | 31 -
 net/dsa/slave.c| 36 ++
 12 files changed, 77 insertions(+), 73 deletions(-)

-- 
2.14.1

[PATCH net-next 4/4] net: dsa: add port enable and disable helpers

2017-09-22 Thread Vivien Didelot

Provide dsa_port_enable and dsa_port_disable helpers to respectively
enable and disable a switch port. This makes the dsa_port_set_state_now
helper static.

Signed-off-by: Vivien Didelot 
---
 net/dsa/dsa_priv.h |  3 ++-
 net/dsa/port.c | 31 ++-
 net/dsa/slave.c| 19 +--
 3 files changed, 37 insertions(+), 16 deletions(-)

diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 9803952a5b40..6bfff19d1615 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -117,7 +117,8 @@ void dsa_master_ethtool_restore(struct net_device *dev);
 /* port.c */
 int dsa_port_set_state(struct dsa_port *dp, u8 state,
   struct switchdev_trans *trans);
-void dsa_port_set_state_now(struct dsa_port *dp, u8 state);
+int dsa_port_enable(struct dsa_port *dp);
+void dsa_port_disable(struct dsa_port *dp);
 int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br);
 void dsa_port_bridge_leave(struct dsa_port *dp, struct net_device *br);
 int dsa_port_vlan_filtering(struct dsa_port *dp, bool vlan_filtering,
diff --git a/net/dsa/port.c b/net/dsa/port.c
index 76d43a82d397..50749339e252 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -56,7 +56,7 @@ int dsa_port_set_state(struct dsa_port *dp, u8 state,
return 0;
 }
 
-void dsa_port_set_state_now(struct dsa_port *dp, u8 state)
+static void dsa_port_set_state_now(struct dsa_port *dp, u8 state)
 {
int err;
 
@@ -65,6 +65,35 @@ void dsa_port_set_state_now(struct dsa_port *dp, u8 state)
pr_err("DSA: failed to set STP state %u (%d)\n", state, err);
 }
 
+int dsa_port_enable(struct dsa_port *dp)
+{
+   u8 stp_state = dp->bridge_dev ? BR_STATE_BLOCKING : BR_STATE_FORWARDING;
+   struct dsa_switch *ds = dp->ds;
+   int port = dp->index;
+   int err;
+
+   if (ds->ops->port_enable) {
+   err = ds->ops->port_enable(ds, port);
+   if (err)
+   return err;
+   }
+
+   dsa_port_set_state_now(dp, stp_state);
+
+   return 0;
+}
+
+void dsa_port_disable(struct dsa_port *dp)
+{
+   struct dsa_switch *ds = dp->ds;
+   int port = dp->index;
+
+   dsa_port_set_state_now(dp, BR_STATE_DISABLED);
+
+   if (ds->ops->port_disable)
+   ds->ops->port_disable(ds, port);
+}
+
 int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br)
 {
struct dsa_notifier_bridge_info info = {
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 235a5c95dfcc..e40623939323 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -74,9 +74,7 @@ static int dsa_slave_open(struct net_device *dev)
struct dsa_slave_priv *p = netdev_priv(dev);
struct phy_device *phy = p->phy;
struct dsa_port *dp = p->dp;
-   struct dsa_switch *ds = dp->ds;
struct net_device *master = dsa_master_netdev(p);
-   u8 stp_state = dp->bridge_dev ? BR_STATE_BLOCKING : BR_STATE_FORWARDING;
int err;
 
if (!(master->flags & IFF_UP))
@@ -99,13 +97,9 @@ static int dsa_slave_open(struct net_device *dev)
goto clear_allmulti;
}
 
-   if (ds->ops->port_enable) {
-   err = ds->ops->port_enable(ds, p->dp->index);
-   if (err)
-   goto clear_promisc;
-   }
-
-   dsa_port_set_state_now(p->dp, stp_state);
+   err = dsa_port_enable(dp);
+   if (err)
+   goto clear_promisc;
 
if (phy) {
/* If phy_stop() has been called before, phy will be in
@@ -139,15 +133,12 @@ static int dsa_slave_close(struct net_device *dev)
 {
struct dsa_slave_priv *p = netdev_priv(dev);
struct net_device *master = dsa_master_netdev(p);
-   struct dsa_switch *ds = p->dp->ds;
+   struct dsa_port *dp = p->dp;
 
if (p->phy)
phy_stop(p->phy);
 
-   dsa_port_set_state_now(p->dp, BR_STATE_DISABLED);
-
-   if (ds->ops->port_disable)
-   ds->ops->port_disable(ds, p->dp->index);
+   dsa_port_disable(dp);
 
dev_mc_unsync(master, dev);
dev_uc_unsync(master, dev);
-- 
2.14.1
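
The shape of the two helpers can be illustrated outside the kernel with a toy model. Everything prefixed `toy_` or `demo_` below is hypothetical and only mirrors the structure of the patch: an optional driver callback, an STP state change, and a disable path that runs the same steps in reverse order. The trace string exists purely to make the ordering observable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_switch;

/* Toy stand-in for the switch ops; both callbacks are optional. */
struct toy_switch_ops {
	int  (*port_enable)(struct toy_switch *ds, int port);
	void (*port_disable)(struct toy_switch *ds, int port);
};

struct toy_switch {
	const struct toy_switch_ops *ops;
	char trace[32];		/* records the order of operations */
};

struct toy_port {
	struct toy_switch *ds;
	int index;
};

void toy_trace(struct toy_switch *ds, const char *step)
{
	assert(strlen(ds->trace) + strlen(step) < sizeof(ds->trace));
	strcat(ds->trace, step);
}

/* Enable: driver callback first, then the STP state. */
int toy_port_enable(struct toy_port *dp)
{
	struct toy_switch *ds = dp->ds;
	int err;

	if (ds->ops->port_enable) {
		err = ds->ops->port_enable(ds, dp->index);
		if (err)
			return err;	/* state is left untouched */
	}

	toy_trace(ds, "S+ ");	/* stands in for the STP state change */
	return 0;
}

/* Disable: the reverse order, STP state first, then the callback. */
void toy_port_disable(struct toy_port *dp)
{
	struct toy_switch *ds = dp->ds;

	toy_trace(ds, "S- ");
	if (ds->ops->port_disable)
		ds->ops->port_disable(ds, dp->index);
}

/* Example driver callbacks that only log themselves. */
int demo_enable(struct toy_switch *ds, int port)
{
	(void)port;
	toy_trace(ds, "E ");
	return 0;
}

void demo_disable(struct toy_switch *ds, int port)
{
	(void)port;
	toy_trace(ds, "D ");
}
```

Because the error from the enable callback is returned before the state change, a caller can keep a single unwind label for the failure case, as the open path in the patch does.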

[PATCH] Switch to use the new hashtable implementation. This reduces the code and need for yet another hashtable implementation.

2017-09-22 Thread Aaron Wood

Signed-off-by: Aaron Wood 
---
 net/9p/error.c | 21 +
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/net/9p/error.c b/net/9p/error.c
index 126fd0dceea2..2e966fcc5cbb 100644
--- a/net/9p/error.c
+++ b/net/9p/error.c
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -50,8 +51,8 @@ struct errormap {
struct hlist_node list;
 };
 
-#define ERRHASHSZ  32
-static struct hlist_head hash_errmap[ERRHASHSZ];
+#define ERR_HASH_BITS  5
+static DEFINE_HASHTABLE(hash_errmap, ERR_HASH_BITS);
 
 /* FixMe - reduce to a reasonable size */
 static struct errormap errmap[] = {
@@ -193,18 +194,14 @@ static struct errormap errmap[] = {
 int p9_error_init(void)
 {
struct errormap *c;
-   int bucket;
-
-   /* initialize hash table */
-   for (bucket = 0; bucket < ERRHASHSZ; bucket++)
-   INIT_HLIST_HEAD(&hash_errmap[bucket]);
+   int key;
 
/* load initial error map into hash table */
for (c = errmap; c->name != NULL; c++) {
c->namelen = strlen(c->name);
-   bucket = jhash(c->name, c->namelen, 0) % ERRHASHSZ;
+   key = jhash(c->name, c->namelen, 0);
INIT_HLIST_NODE(&c->list);
-   hlist_add_head(&c->list, &hash_errmap[bucket]);
+   hash_add(hash_errmap, &c->list, key);
}
 
return 1;
@@ -222,12 +219,12 @@ int p9_errstr2errno(char *errstr, int len)
 {
int errno;
struct errormap *c;
-   int bucket;
+   int key;
 
errno = 0;
c = NULL;
-   bucket = jhash(errstr, len, 0) % ERRHASHSZ;
-   hlist_for_each_entry(c, &hash_errmap[bucket], list) {
+   key = jhash(errstr, len, 0);
+   hash_for_each_possible(hash_errmap, c, list, key) {
if (c->namelen == len && !memcmp(c->name, errstr, len)) {
errno = c->val;
break;
-- 
2.11.0
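
The patch drops the explicit `% ERRHASHSZ` because the hashtable helpers reduce the key to a bucket index themselves. A userspace sketch of that idea follows. All `toy_` names are hypothetical, a simple FNV-1a hash stands in for jhash(), a plain `next` pointer stands in for struct hlist_node, and the multiplicative fold is only an approximation of what the kernel helpers do internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ERR_HASH_BITS	5
#define ERR_HASH_SIZE	(1u << ERR_HASH_BITS)

struct toy_errormap {
	const char *name;
	int val;
	size_t namelen;
	struct toy_errormap *next;	/* stand-in for struct hlist_node */
};

static struct toy_errormap *toy_table[ERR_HASH_SIZE];

/* Stand-in string hash (FNV-1a); the patch itself uses jhash(). */
static uint32_t toy_hash(const char *s, size_t len)
{
	uint32_t h = 2166136261u;

	for (size_t i = 0; i < len; i++) {
		h ^= (unsigned char)s[i];
		h *= 16777619u;
	}
	return h;
}

/* Fold a 32-bit key down to ERR_HASH_BITS bits, so callers pass the
 * full key and never compute a modulo themselves.
 */
static uint32_t toy_bucket(uint32_t key)
{
	return (key * 0x61C88647u) >> (32 - ERR_HASH_BITS);
}

/* Load a NULL-terminated map into the table; call once per map. */
void toy_error_init(struct toy_errormap *map)
{
	for (struct toy_errormap *c = map; c->name != NULL; c++) {
		uint32_t b;

		c->namelen = strlen(c->name);
		b = toy_bucket(toy_hash(c->name, c->namelen));
		c->next = toy_table[b];
		toy_table[b] = c;
	}
}

/* Returns the mapped value, or 0 if the string is unknown. */
int toy_errstr2errno(const char *errstr, size_t len)
{
	uint32_t b = toy_bucket(toy_hash(errstr, len));

	assert(b < ERR_HASH_SIZE);
	for (const struct toy_errormap *c = toy_table[b]; c; c = c->next)
		if (c->namelen == len && !memcmp(c->name, errstr, len))
			return c->val;
	return 0;
}
```

As in the patch, the lookup only walks the entries of one bucket and still compares the full string, so hash collisions affect speed but not correctness.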
