date:20171202

Re: [PATCH iproute2] iproute2: Fix undeclared __kernel_long_t type build error in RHEL 6.8

2017-12-02 Thread Leon Romanovsky

On Fri, Dec 01, 2017 at 08:56:24PM +0100, Michal Kubecek wrote:
> On Fri, Dec 01, 2017 at 08:48:07AM -0800, Stephen Hemminger wrote:
> > On Fri,  1 Dec 2017 13:04:51 +0200
> > Leon Romanovsky  wrote:
> >
> > > From: Leon Romanovsky 
> > >
> > > Add asm/posix_types.h header file to the list of needed includes,
> > > because the header files in RHEL 6.8 are too old and don't
> > > have a declaration of __kernel_long_t.
> > >
> > > In file included from ../include/uapi/linux/kernel.h:5,
> > >  from ../include/uapi/linux/netfilter/x_tables.h:4,
> > >  from ../include/xtables.h:20,
> > >  from em_ipset.c:26:
> > > ../include/uapi/linux/sysinfo.h:9: error: expected 
> > > specifier-qualifier-list before ‘__kernel_long_t’
> > >
> > > Cc: Riad Abo Raed 
> > > Cc: Guy Ergas 
> > > Signed-off-by: Leon Romanovsky 
> >
> > I see the problem, but the solution of dragging in posix_types.h
> > would be too much of a long term maintenance issue.
> > All the headers in uapi are regularly generated from upstream
> > kernel headers; I don't want to start making exceptions.
> >
> > Is it just the xtables stuff (which has always been problematic)?
>
> Actually, the only place where __kernel_long_t and __kernel_ulong_t
> appear is struct sysinfo in include/uapi/linux/sysinfo.h and this
> structure isn't even used anywhere in iproute2 source (not even in the
> include/uapi/linux/kernel.h file which includes <linux/sysinfo.h>).
>
> So one could work around the problem by defining _LINUX_SYSINFO_H but
> that seems a bit dirty hack.

It is too dirty :). It can cause completely unpredictable compilation
failures in the future, which won't be easy to track down.
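
To make the include-guard trick discussed above concrete, here is a minimal
sketch. It assumes the guard macro is spelled _LINUX_SYSINFO_H, as stated in
the quoted message; the shortened struct body and the HAVE_TOY_SYSINFO marker
are stand-ins invented for illustration, not the real header contents.

```c
#include <assert.h>

/* Pre-define the guard, as the proposed workaround would do. */
#define _LINUX_SYSINFO_H

/*
 * Stand-in for the body of include/uapi/linux/sysinfo.h (hypothetical and
 * heavily shortened). Because the guard is already defined, the
 * preprocessor drops this whole region, so the compiler never sees the
 * undeclared __kernel_long_t.
 */
#ifndef _LINUX_SYSINFO_H
#define _LINUX_SYSINFO_H
struct sysinfo {
	__kernel_long_t uptime;
};
#define HAVE_TOY_SYSINFO 1
#endif

/* Returns 1 if the header body was compiled, 0 if the guard skipped it. */
int toy_sysinfo_body_compiled(void)
{
#ifdef HAVE_TOY_SYSINFO
	return 1;
#else
	return 0;
#endif
}
```

The build passes, but struct sysinfo silently disappears for every later
user in the same translation unit, which is the kind of hard-to-track
failure the reply warns about.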

>
> Michal Kubecek
>



[PATCH net-next 2/2 v6] net: ethernet: Add a driver for Gemini gigabit ethernet

2017-12-02 Thread Linus Walleij

The Gemini ethernet has been around for years as an out-of-tree
patch used with the NAS boxen and routers built on StorLink
SL3512 and SL3516, later Storm Semiconductor, later Cortina
Systems. These ASICs are still being deployed and brand new
off-the-shelf systems using it can easily be acquired.

Gemini is the common codename used for all the SoCs using
this IP. An earlier codename was "Lepus".

Cc: Tobias Waldvogel 
Signed-off-by: Michał Mirosław 
Signed-off-by: Linus Walleij 
---
The latest v6 incarnation of this driver was written by Michał
Mirosław and submitted for inclusion in 2011. This was the
last post:
https://lwn.net/Articles/437889/

DaveM ACKed it at the time:
https://marc.info/?l=linux-netdev&m=130255434310315&w=2

The controversial pieces under ARM (board files) and other
subsystems are now gone and replaced by DeviceTree.

Michał: I hope you don't mind me picking it up and hope
you can still test it on your ICYbox, I have device tree
patches in my tree:
https://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-nomadik.git/log/?h=gemini-ethernet

Changes from v6:
- Drop all arch support code using the old board files.
- Adapted for device tree probing
- Getting all resources using devm_* accessors where applicable
- Split in parent ethernet device and two per-port devices
  that get spawned from the parent. This is necessary with
  device tree and other aspects of the PHY device model and
  device tree structure that requires a 1:1 mapping between
  a device and PHY to work properly.
- Grab clocks and reset handles as resources from the clock
  and reset subsystems infrastructure instead of open coding
  access to system devices.
- Let the pin control subsystem deal with setting up the
  multiplexing and clock skew/delay settings of the RGMII
  lines.
- A separate SoC driver was created to deal with setting up
  bus arbitration and will be merged separately.
- Tested with the D-Link DNS-313 NAS box with a Realtek RTL8211B
  transceiver.
- Rename and move code around to fit better with the new device
  handling with a top level device and two children.
- Order code as net vendor Cortina and adapter Gemini. We have
  confirmed with Faraday that this network device is not from
  them (which was initially suspected).
- Rebased onto v4.15-rc1

Changes from v5:
 - merge arch setup code into the patch
 - move platform data include to include/linux/platform_data/gemini_gmac.h
 - use new hw_features instead of ethtool_ops for offload setting
 - add some #ifdefs for build testing on other arches
 - a bit of cleanups

Changes from v4:
 - rebased on upcoming 2.6.38 (removal of page_to_dma() and per-txq stats)
 - removed setting last_rx and trans_start as that's handled by net core
 - changed __raw_read/writel() to read/writel()
 - added setting of AHB_WEIGHT register (didn't improve anything, I'm afraid)
 - fixed DMA unmapping bug
 - added limit of packet size for TX offload (HW checks only 13 bits of
   mtu_size field)
 - reduced RX_MAX_ALLOC_ORDER as it caused a lot of order 4 allocation failures
   under load
 - cleanups

Changes from v3:
 - fixed remaining tx_queue_len misuse bugs
 - bulk RX DMA page map/unmap
 - whitespace changes to make checkpatch happier (please ignore remaining
   complaints - long lines in .c and typedefs/whitespace/long lines in .h)

Changes from v2:
 - converted to page buffers and napi_gro_frags()
 - later IRQ acking and NAPI exits
 - larger rings by default
 - tx-interrupt coalescing
 - MTU changing
 - jumbo frames support
 - ringparam and coalesce settings via ethtool
 - more fixes/cleanups

Changes from v1:
 - fixed stats (now using u64_stats_sync; no-op on UP anyway)
 - pre-load mdio-gpio if built as module
 - disable TX checksum offload by default (unreliable HW)
 - convert to NAPI+GRO (netperf TCP STREAM RX test:
   before: 156mbit/s, now: 185mbit/s)

Later TODO:
 - netpoll (netconsole)
 - parse MAC address from flash settings and pass it through platform data
 - move TX completion to NAPI poll
 - implement rx copybreak
 - remove DMA API abuse on RX (large map, small unmaps)
 - better test multicast support
---
 MAINTAINERS   |2 +
 drivers/net/ethernet/Kconfig  |1 +
 drivers/net/ethernet/Makefile |1 +
 drivers/net/ethernet/cortina/Kconfig  |   24 +
 drivers/net/ethernet/cortina/Makefile |4 +
 drivers/net/ethernet/cortina/gemini.c | 2461 +
 drivers/net/ethernet/cortina/gemini.h | 1432 +++
 7 files changed, 3925 insertions(+)
 create mode 100644 drivers/net/ethernet/cortina/Kconfig
 create mode 100644 drivers/net/ethernet/cortina/Makefile
 create mode 100644 drivers/net/ethernet/cortina/gemini.c
 create mode 100644 drivers/net/ethernet/cortina/gemini.h

diff --git a/MAINTAINERS b/MAINTAINERS
index aa71ab52fd76..200ff7670276 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1326,8 +1326,10 @@

RE: [EXT] Re: [PATCH net] net: phylink: fix link state on phy-connect

2017-12-02 Thread Yan Markman

Hi Russell,
Grygorii has raised one additional point (about netif_carrier_off) that I just
didn't want to start on before finishing the previous one.
On ifconfig down, mac_config() is called, but with LINK=0.
The config has no knowledge of what the intention is (up or down) and should be
done with ingress/egress disabled,
   and so one of the actions of mac_config is netif_carrier_off.

After calling mac_config(), phylink checks  if (!link  &&
!netif_carrier_ok()) and decides to skip the remaining work since everything
looks done...

Removing netif_carrier_off looks correct, BUT there are cases where the driver
then stops working properly (sorry, I can't remember now exactly which).
So in the end I placed a CONDITIONAL carrier-off there, depending upon link:

static void mvpp2_mac_config(){
	if (state->link)	/* occasionally TRUE on up, but FALSE on down */
		netif_carrier_off(port->dev);	//YANM

BTW: It seems your patch below should be present anyway.
+++ b/drivers/net/phy/phylink.c
@@ -798,6 +798,7 @@ void phylink_disconnect_phy(struct phylink *pl)
+   pl->phy_state.link = false;

Thank you
Best regards
Yan Markman
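
The early-exit behaviour described above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions: the toy_*
names are invented for illustration and none of this is the actual phylink or
mvpp2 code. It models nothing more than the quoted check
(!link && !netif_carrier_ok()) and counts how often the link-down handler
would run.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a port; not a kernel structure. */
struct toy_port {
	bool carrier_ok;		/* models netif_carrier_ok() */
	int mac_link_down_calls;	/* how often the MAC was told "link down" */
};

/* Models the driver calling netif_carrier_off() on its own. */
static void toy_carrier_off(struct toy_port *p)
{
	p->carrier_ok = false;
}

/* Models the resolve step: skip the work if the link already looks down. */
static void toy_resolve(struct toy_port *p, bool link)
{
	if (!link && !p->carrier_ok)
		return;			/* believes everything is already done */
	if (!link) {
		p->carrier_ok = false;
		p->mac_link_down_calls++;
	}
}

/* Returns how many times the link-down handler ran during a stop. */
int toy_stop(bool driver_calls_carrier_off_first)
{
	struct toy_port p = { .carrier_ok = true, .mac_link_down_calls = 0 };

	if (driver_calls_carrier_off_first)
		toy_carrier_off(&p);
	toy_resolve(&p, false);
	return p.mac_link_down_calls;
}
```

In this model toy_stop(false) reports one link-down call, while
toy_stop(true) reports none, which mirrors the ordering problem raised in
the thread.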

-----Original Message-----
From: Russell King - ARM Linux [mailto:li...@armlinux.org.uk] 
Sent: Friday, December 01, 2017 7:48 PM
To: Florian Fainelli 
Cc: Grygorii Strashko ; Yan Markman 
; Antoine Tenart ; 
and...@lunn.ch; da...@davemloft.net; gregory.clem...@free-electrons.com; 
thomas.petazz...@free-electrons.com; miquel.ray...@free-electrons.com; Nadav 
Haklai ; m...@semihalf.com; Stefan Chulski 
; netdev@vger.kernel.org; linux-ker...@vger.kernel.org
Subject: Re: [EXT] Re: [PATCH net] net: phylink: fix link state on phy-connect

On Fri, Dec 01, 2017 at 09:36:42AM -0800, Florian Fainelli wrote:
> On 12/01/2017 09:24 AM, Russell King - ARM Linux wrote:
> > On Fri, Dec 01, 2017 at 11:07:22AM -0600, Grygorii Strashko wrote:
> >> Hi Russell,
> >>
> >> On 11/30/2017 07:28 AM, Russell King - ARM Linux wrote:
> >>> On Thu, Nov 30, 2017 at 10:10:18AM +0000, Russell King - ARM Linux wrote:
>  On Thu, Nov 30, 2017 at 08:51:21AM +0000, Yan Markman wrote:
> > The phylink_stop is called before phylink_disconnect_phy You 
> > could see in mvpp2.c:
> >
> > mvpp2_stop_dev() {
> > phylink_stop(port->phylink);
> > }
> >
> > mvpp2_stop()   {
> > mvpp2_stop_dev(port);
> > phylink_disconnect_phy(port->phylink);
> > }
> >
> > .ndo_stop = mvpp2_stop,
> 
>  Sorry, I don't have this in mvpp2.c, so I have no visibility of 
>  what you're working with.
> 
>  What you have above looks correct, and I see no reason why the 
>  p21 patch would not have resolved your issue.  The p21 patch 
>  ensures that phylink_resolve() gets called and completes before 
>  phylink_stop() returns.  In that case, phylink_resolve() will 
>  call the mac_link_down() method if the link is not already down.  
>  It will also print the "Link is Down" message.
> 
>  Florian has already tested this patch after encountering a 
>  similar issue, and has reported that it solves the problem for 
>  him.  I've also tested it with mvneta, and the original mvpp2x driver on 
>  Macchiatobin.
> 
>  Maybe there's something different about mvpp2, but as I have no 
>  visibility of that driver and the modifications therein, I can't 
>  comment further other than stating that it works for three 
>  different implementations.
> 
>  Maybe you could try and work out what's going on with the p21 
>  patch in your case?
> >>>
> >>> I think I now realise what's probably going on.
> >>>
> >>> If you call netif_carrier_off() before phylink_stop(), then 
> >>> phylink will believe that the link is already down, and so it 
> >>> won't bother calling
> >>> mac_link_down() - it will believe that the link is already down.
> >>>
> >>> I'll update the documentation for phylink_stop() to spell out this 
> >>> aspect.
> >>>
> >>
> >> There are pretty high number of net drivers which do call
> >>netif_carrier_off(dev);
> >> before
> >>phy_stop(dev->phydev);
> >> in .ndo_stop() callback.
> >>
> >> As per you comment this seems to be incorrect, so should such calls 
> >> be removed?
> > 
> > Well, I think the question that needs to be asked is this:
> > 
> >   Is calling netif_carrier_off() before phy_stop() safe?
> > 
> > Well, reading the phylib code, this is the answer I've come to:
> > 
> >   Between phy_start() and phy_stop(), phylib is free to manage the
> >   carrier state itself through the phylib state machine.
> > 
> >   This means if you call netif_carrier_off() prior to phy_stop(),
> >   there is nothing preventing the phylib state machine from running,
> >   and a co-incident poll of the PHY could notice that the

Re: [Patch net-next] net_sched: get rid of rcu_barrier() in tcf_block_put_ext()

2017-12-02 Thread Jiri Pirko

Sat, Dec 02, 2017 at 01:18:04AM CET, xiyou.wangc...@gmail.com wrote:
>Both Eric and Paolo noticed the rcu_barrier() we use in
>tcf_block_put_ext() could be a performance bottleneck when
>we have lots of filters.

The problem is not lots of filters; the problem is lots of classes and
therefore tcf_blocks.


>
>Paolo provided the following to demonstrate the issue:
>
>tc qdisc add dev lo root htb
>for I in `seq 1 1000`; do
>tc class add dev lo parent 1: classid 1:$I htb rate 100kbit
>tc qdisc add dev lo parent 1:$I handle $((I + 1)): htb
>for J in `seq 1 10`; do
>tc filter add dev lo parent $((I + 1)): u32 match ip src 
> 1.1.1.$J
>done
>done
>time tc qdisc del dev root
>
>real    0m54.764s
>user    0m0.023s
>sys     0m0.000s
>
>The rcu_barrier() there is to ensure we free the block after all chains
>are gone, that is, to queue tcf_block_put_final() at the tail of workqueue.
>We can achieve this ordering requirement by refcnt'ing tcf block instead,
>that is, the tcf block is freed only when the last chain in this block is
>gone. This also simplifies the code.
>
>Paolo reported after this patch we get:
>
>real    0m0.017s
>user    0m0.000s
>sys     0m0.017s
>
>Tested-by: Paolo Abeni 
>Cc: Eric Dumazet 
>Cc: Jiri Pirko 
>Cc: Jamal Hadi Salim 
>Signed-off-by: Cong Wang 
>---
> include/net/sch_generic.h |  2 +-
> net/sched/cls_api.c   | 31 +--
> 2 files changed, 10 insertions(+), 23 deletions(-)
>
>diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
>index 65d0d25f2648..b013ded1a38d 100644
>--- a/include/net/sch_generic.h
>+++ b/include/net/sch_generic.h
>@@ -278,7 +278,7 @@ struct tcf_block {
>   struct net *net;
>   struct Qdisc *q;
>   struct list_head cb_list;
>-  struct work_struct work;
>+  unsigned int nr_chains;
> };
> 
> static inline void qdisc_cb_private_validate(const struct sk_buff *skb, int 
> sz)
>diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>index ddcf04b4ab43..dec0d36078c8 100644
>--- a/net/sched/cls_api.c
>+++ b/net/sched/cls_api.c
>@@ -190,6 +190,7 @@ static struct tcf_chain *tcf_chain_create(struct tcf_block 
>*block,
>   return NULL;
>   list_add_tail(&chain->list, &block->chain_list);
>   chain->block = block;
>+  block->nr_chains++;
>   chain->index = chain_index;
>   chain->refcnt = 1;
>   return chain;
>@@ -218,8 +219,12 @@ static void tcf_chain_flush(struct tcf_chain *chain)
> 
> static void tcf_chain_destroy(struct tcf_chain *chain)
> {
>+  struct tcf_block *block = chain->block;
>+
>   list_del(&chain->list);
>   kfree(chain);
>+  if (!--block->nr_chains)

You don't need this counter. You can just check
list_empty(block->chain_list);
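
A minimal userspace sketch of that suggestion follows: the owner is freed
when its list of children becomes empty, with no separate counter. The toy_*
types and the hand-rolled sentinel list are assumptions made for
illustration; the kernel code would use struct list_head and list_empty()
instead.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_block;

/* Hypothetical chain node on a circular doubly linked list. */
struct toy_chain {
	struct toy_chain *prev, *next;
	struct toy_block *block;
};

/* Hypothetical block; the list is empty when head.next == &head. */
struct toy_block {
	struct toy_chain head;
};

int toy_blocks_freed;	/* lets a caller observe when a block goes away */

struct toy_block *toy_block_create(void)
{
	struct toy_block *b = malloc(sizeof(*b));

	assert(b);
	b->head.prev = b->head.next = &b->head;
	b->head.block = b;
	return b;
}

struct toy_chain *toy_chain_create(struct toy_block *b)
{
	struct toy_chain *c = malloc(sizeof(*c));

	assert(c);
	c->block = b;
	/* Append at the tail, just before the sentinel. */
	c->prev = b->head.prev;
	c->next = &b->head;
	b->head.prev->next = c;
	b->head.prev = c;
	return c;
}

void toy_chain_destroy(struct toy_chain *c)
{
	struct toy_block *b = c->block;

	c->prev->next = c->next;
	c->next->prev = c->prev;
	free(c);
	/* Emptiness check replaces the nr_chains counter. */
	if (b->head.next == &b->head) {
		free(b);
		toy_blocks_freed++;
	}
}
```

Destroying the first of two chains leaves the block alive; destroying the
second frees it.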


>+  kfree(block);
> }
> 
> static void tcf_chain_hold(struct tcf_chain *chain)
>@@ -330,27 +335,13 @@ int tcf_block_get(struct tcf_block **p_block,
> }
> EXPORT_SYMBOL(tcf_block_get);
> 
>-static void tcf_block_put_final(struct work_struct *work)
>-{
>-  struct tcf_block *block = container_of(work, struct tcf_block, work);
>-  struct tcf_chain *chain, *tmp;
>-
>-  rtnl_lock();
>-
>-  /* At this point, all the chains should have refcnt == 1. */
>-  list_for_each_entry_safe(chain, tmp, &block->chain_list, list)
>-  tcf_chain_put(chain);
>-  rtnl_unlock();
>-  kfree(block);
>-}
>-
> /* XXX: Standalone actions are not allowed to jump to any chain, and bound
>  * actions should be all removed after flushing.
>  */
> void tcf_block_put_ext(struct tcf_block *block, struct Qdisc *q,
>  struct tcf_block_ext_info *ei)
> {
>-  struct tcf_chain *chain;
>+  struct tcf_chain *chain, *tmp;
> 
>   /* Hold a refcnt for all chains, except 0, so that they don't disappear
>* while we are iterating.
>@@ -364,13 +355,9 @@ void tcf_block_put_ext(struct tcf_block *block, struct 
>Qdisc *q,
> 
>   tcf_block_offload_unbind(block, q, ei);
> 
>-  INIT_WORK(&block->work, tcf_block_put_final);
>-  /* Wait for existing RCU callbacks to cool down, make sure their works
>-   * have been queued before this. We can not flush pending works here
>-   * because we are holding the RTNL lock.
>-   */
>-  rcu_barrier();
>-  tcf_queue_work(&block->work);
>+  /* At this point, all the chains should have refcnt >= 1. */
>+  list_for_each_entry_safe(chain, tmp, &block->chain_list, list)
>+  tcf_chain_put(chain);

I think this is correct. It would probably be good to elaborate a bit more
on what is happening. Perhaps a comment?

> }
> EXPORT_SYMBOL(tcf_block_put_ext);
> 
>-- 
>2.13.0
>

Re: [PATCH net] nfp: fix port stats for mac representors

2017-12-02 Thread Jakub Kicinski

On Fri, Dec 1, 2017 at 9:37 PM, Jakub Kicinski
 wrote:
> From: Pieter Jansen van Vuuren 
>
> Previously we swapped the tx_packets, tx_bytes and tx_dropped counters
> with rx_packets, rx_bytes and rx_dropped counters, respectively. This
> behaviour is correct and expected for VF representors but it should not
> be swapped for physical port mac representors.

Ah, I forgot to point the finger.  Should I repost?

Fixes: eadfa4c3be99 ("nfp: add stats and xmit helpers for representors")

> Signed-off-by: Pieter Jansen van Vuuren 
> Reviewed-by: Simon Horman 
> Reviewed-by: Jakub Kicinski

Re: [BUG] mveta: mvneta_txq_bufs_free NULL pointer dereference

2017-12-02 Thread Sean Nyekjær

Hi

>> I'm not sure at all, but could you try to apply
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0d63785c6b94b5d2f095f90755825f90eea791f5
>> and see if the problem is resolved?
>
> I will apply the patch right away, and report back.
>

The same issue reappeared yesterday, with the patch applied :-)

BR
/Sean

Re: [PATCH iproute2] iproute2: Fix undeclared __kernel_long_t type build error in RHEL 6.8

2017-12-02 Thread Leon Romanovsky

On Fri, Dec 01, 2017 at 08:48:07AM -0800, Stephen Hemminger wrote:
> On Fri,  1 Dec 2017 13:04:51 +0200
> Leon Romanovsky  wrote:
>
> > From: Leon Romanovsky 
> >
> > Add asm/posix_types.h header file to the list of needed includes,
> > because the header files in RHEL 6.8 are too old and don't
> > have a declaration of __kernel_long_t.
> >
> > In file included from ../include/uapi/linux/kernel.h:5,
> >  from ../include/uapi/linux/netfilter/x_tables.h:4,
> >  from ../include/xtables.h:20,
> >  from em_ipset.c:26:
> > ../include/uapi/linux/sysinfo.h:9: error: expected specifier-qualifier-list 
> > before ‘__kernel_long_t’
> >
> > Cc: Riad Abo Raed 
> > Cc: Guy Ergas 
> > Signed-off-by: Leon Romanovsky 
>
> I see the problem, but the solution of dragging in posix_types.h
> would be too much of a long term maintenance issue.
> All the headers in uapi are regularly generated from upstream
> kernel headers; I don't want to start making exceptions.
>
> Is it just the xtables stuff (which has always been problematic)?

Yes, both failures are related to xtables. And this was my naive approach to
solving the first one; the second, mentioned in the original commit log
(missing xtables-version.h), is harder to fix.

Will it work if I test for the existence of __kernel_long_t in the configure
script and fall back to xt-internal.h?

Thanks



Re: [Patch net-next] act_mirred: use tcfm_dev in tcf_mirred_get_dev()

2017-12-02 Thread Jiri Pirko

Thu, Nov 30, 2017 at 11:53:32PM CET, xiyou.wangc...@gmail.com wrote:
>tcfm_dev always points to the correct netdev and we already
>hold a refcnt, so no need to use ifindex to lookup again.
>
>If we would support moving target netdev across netns, using
>pointer would be better than ifindex.
>
>Cc: Jiri Pirko 
>Cc: Jamal Hadi Salim 
>Signed-off-by: Cong Wang 
>---
> include/net/tc_act/tc_mirred.h | 1 -
> net/sched/act_mirred.c | 3 +--
> 2 files changed, 1 insertion(+), 3 deletions(-)
>
>diff --git a/include/net/tc_act/tc_mirred.h b/include/net/tc_act/tc_mirred.h
>index 21d253c9a8c6..b2dbbfaefd22 100644
>--- a/include/net/tc_act/tc_mirred.h
>+++ b/include/net/tc_act/tc_mirred.h
>@@ -11,7 +11,6 @@ struct tcf_mirred {
>   int tcfm_ifindex;
>   bool            tcfm_mac_header_xmit;
>   struct net_device __rcu *tcfm_dev;
>-  struct net  *net;
>   struct list_head        tcfm_list;
> };
> #define to_mirred(a) ((struct tcf_mirred *)a)
>diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
>index 8b3e59388480..fe6489f9c3cf 100644
>--- a/net/sched/act_mirred.c
>+++ b/net/sched/act_mirred.c
>@@ -140,7 +140,6 @@ static int tcf_mirred_init(struct net *net, struct nlattr 
>*nla,
>   m->tcfm_eaction = parm->eaction;
>   if (dev != NULL) {
>   m->tcfm_ifindex = parm->ifindex;
>-  m->net = net;
>   if (ret != ACT_P_CREATED)
>   dev_put(rcu_dereference_protected(m->tcfm_dev, 1));
>   dev_hold(dev);
>@@ -318,7 +317,7 @@ static struct net_device *tcf_mirred_get_dev(const struct 
>tc_action *a)
> {
>   struct tcf_mirred *m = to_mirred(a);
> 
>-  return __dev_get_by_index(m->net, m->tcfm_ifindex);
>+  return rtnl_dereference(m->tcfm_dev);

Good. Please also use m->tcfm_dev->ifindex in tcf_mirred_dump and
tcf_mirred_ifindex. Then you can remove tcfm_ifindex completely.
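
As a rough illustration of that suggestion, the sketch below derives the
ifindex from the device pointer on every query instead of keeping a cached
copy. The toy_* types are hypothetical stand-ins, not the kernel structures,
and the sketch ignores RCU and locking entirely.

```c
#include <assert.h>

/* Hypothetical stand-in for struct net_device. */
struct toy_netdev {
	int ifindex;
};

/* Hypothetical stand-in for struct tcf_mirred without a cached ifindex. */
struct toy_mirred {
	struct toy_netdev *dev;
};

/* Always read the index through the pointer, so it can never go stale. */
int toy_mirred_ifindex(const struct toy_mirred *m)
{
	assert(m);
	return m->dev ? m->dev->ifindex : 0;
}
```

If the device's index changes later (the toy equivalent of moving it to
another namespace), the next query returns the new value without any extra
bookkeeping.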


> }
> 
> static struct tc_action_ops act_mirred_ops = {
>-- 
>2.13.0
>

Re: [PATCH 01/10] trailing whitespace fixed

2017-12-02 Thread Greg KH

On Sun, Nov 26, 2017 at 01:45:56AM +0300, Mike wrote:
> Signed-off-by: Mike 
> ---
>  drivers/staging/irda/drivers/ali-ircc.c | 1002 
> +++
>  1 file changed, 501 insertions(+), 501 deletions(-)
> 

Hi,

This is the friendly patch-bot of Greg Kroah-Hartman.  You have sent him
a patch that has triggered this response.  He used to manually respond
to these common problems, but in order to save his sanity (he kept
writing the same thing over and over, yet to different people), I was
created.  Hopefully you will not take offence and will fix the problem
in your patch and resubmit it so that it can be accepted into the Linux
kernel tree.

You are receiving this message because of the following common error(s)
as indicated below:

- Your patch was attached, please place it inline so that it can be
  applied directly from the email message itself.

- You did not specify a description of why the patch is needed, or
  possibly, any description at all, in the email body.  Please read the
  section entitled "The canonical patch format" in the kernel file,
  Documentation/SubmittingPatches for what is needed in order to
  properly describe the change.

- You did not write a descriptive Subject: for the patch, allowing Greg,
  and everyone else, to know what this patch is all about.  Please read
  the section entitled "The canonical patch format" in the kernel file,
  Documentation/SubmittingPatches for what a proper Subject: line should
  look like.

- You did not use your real name.  Signed-off-by: and From: has to have
  a real name associated with it.  Use whatever you sign legal documents
  with, no anonymous patches are allowed.

If you wish to discuss this problem further, or you have questions about
how to resolve this issue, please feel free to respond to this email and
Greg will reply once he has dug out from the pending patches received
from other developers.

thanks,

greg k-h's patch email bot

[PATCH v2 net-next] net/tcp: trace all TCP/IP state transition with tcp_set_state tracepoint

2017-12-02 Thread Yafang Shao

The TCP/IP transition from TCP_LISTEN to TCP_SYN_RECV and some other
transitions are not traced with the tcp_set_state tracepoint.

In order to trace the whole TCP lifespan, two helpers are introduced:
void __tcp_set_state(struct sock *sk, int state)
void __sk_state_store(struct sock *sk, int newstate)

When doing a TCP/IP state transition, we should use these two helpers or
tcp_set_state() rather than assigning a value to sk_state directly.

Signed-off-by: Yafang Shao 

---
v2: test
---
 include/net/tcp.h   |  2 ++
 net/ipv4/inet_connection_sock.c |  6 +++---
 net/ipv4/inet_hashtables.c  |  2 +-
 net/ipv4/tcp.c  | 12 
 4 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 85ea578..4f2d015 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1247,6 +1247,8 @@ static inline bool tcp_checksum_complete(struct sk_buff 
*skb)
"Close Wait","Last ACK","Listen","Closing"
 };
 #endif
+void __sk_state_store(struct sock *sk, int newstate);
+void __tcp_set_state(struct sock *sk, int state);
 void tcp_set_state(struct sock *sk, int state);
 
 void tcp_done(struct sock *sk);
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 4ca46dc..f3967f1 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -783,7 +783,7 @@ struct sock *inet_csk_clone_lock(const struct sock *sk,
if (newsk) {
struct inet_connection_sock *newicsk = inet_csk(newsk);
 
-   newsk->sk_state = TCP_SYN_RECV;
+   __tcp_set_state(newsk, TCP_SYN_RECV);
newicsk->icsk_bind_hash = NULL;
 
inet_sk(newsk)->inet_dport = inet_rsk(req)->ir_rmt_port;
@@ -877,7 +877,7 @@ int inet_csk_listen_start(struct sock *sk, int backlog)
 * It is OK, because this socket enters to hash table only
 * after validation is complete.
 */
-   sk_state_store(sk, TCP_LISTEN);
+   __sk_state_store(sk, TCP_LISTEN);
if (!sk->sk_prot->get_port(sk, inet->inet_num)) {
inet->inet_sport = htons(inet->inet_num);
 
@@ -888,7 +888,7 @@ int inet_csk_listen_start(struct sock *sk, int backlog)
return 0;
}
 
-   sk->sk_state = TCP_CLOSE;
+   __tcp_set_state(sk, TCP_CLOSE);
return err;
 }
 EXPORT_SYMBOL_GPL(inet_csk_listen_start);
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index e7d15fb..72c15b6 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -430,7 +430,7 @@ bool inet_ehash_nolisten(struct sock *sk, struct sock *osk)
sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1);
} else {
percpu_counter_inc(sk->sk_prot->orphan_count);
-   sk->sk_state = TCP_CLOSE;
+   __tcp_set_state(sk, TCP_CLOSE);
sock_set_flag(sk, SOCK_DEAD);
inet_csk_destroy_sock(sk);
}
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index bf97317..2bc7e04 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2036,6 +2036,18 @@ int tcp_recvmsg(struct sock *sk, struct msghdr *msg, 
size_t len, int nonblock,
 }
 EXPORT_SYMBOL(tcp_recvmsg);
 
+void __sk_state_store(struct sock *sk, int newstate)
+{
+   trace_tcp_set_state(sk, sk->sk_state, newstate);
+   sk_state_store(sk, newstate);
+}
+
+void __tcp_set_state(struct sock *sk, int state)
+{
+   trace_tcp_set_state(sk, sk->sk_state, state);
+   sk->sk_state = state;
+}
+
 void tcp_set_state(struct sock *sk, int state)
 {
int oldstate = sk->sk_state;
-- 
1.8.3.1
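
The helper pattern in the patch above can be sketched in userspace as
follows. This is an illustration only: the toy_* names and the in-memory
trace log are assumptions standing in for the real tracepoint, and the state
values are arbitrary.

```c
#include <assert.h>

enum toy_tcp_state { TOY_CLOSE, TOY_LISTEN, TOY_SYN_RECV };

/* Hypothetical stand-in for struct sock. */
struct toy_sock {
	int sk_state;
};

/* Hypothetical stand-in for the tracepoint: records the last transition. */
struct toy_trace {
	int events;
	int last_old;
	int last_new;
};

struct toy_trace toy_trace_log;

static void toy_trace_set_state(int oldstate, int newstate)
{
	toy_trace_log.events++;
	toy_trace_log.last_old = oldstate;
	toy_trace_log.last_new = newstate;
}

/* Traced transition: emit the event first, then store the new state. */
void toy_set_state(struct toy_sock *sk, int state)
{
	assert(sk);
	toy_trace_set_state(sk->sk_state, state);
	sk->sk_state = state;
}

/* Direct assignment, as in the code being replaced: nothing is recorded. */
void toy_set_state_untraced(struct toy_sock *sk, int state)
{
	assert(sk);
	sk->sk_state = state;
}
```

Any transition made through toy_set_state_untraced() is invisible in the
log, which is the gap the two new helpers are meant to close.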

[PATCH net-next 1/2 v6] net: ethernet: Add DT bindings for the Gemini ethernet

2017-12-02 Thread Linus Walleij

This adds the device tree bindings for the Gemini ethernet
controller. It is pretty straight-forward, using standard
bindings and modelling the two child ports as child devices
under the parent ethernet controller device.

Cc: devicet...@vger.kernel.org
Cc: Tobias Waldvogel 
Cc: Michał Mirosław 
Signed-off-by: Linus Walleij 
---
 .../bindings/net/cortina,gemini-ethernet.txt   | 92 ++
 1 file changed, 92 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/net/cortina,gemini-ethernet.txt

diff --git a/Documentation/devicetree/bindings/net/cortina,gemini-ethernet.txt 
b/Documentation/devicetree/bindings/net/cortina,gemini-ethernet.txt
new file mode 100644
index 000000000000..35fa3abd1c73
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/cortina,gemini-ethernet.txt
@@ -0,0 +1,92 @@
+Cortina Systems Gemini Ethernet Controller
+==========================================
+
+This ethernet controller is found in the Gemini SoC family:
+StorLink SL3512 and SL3516, also known as Cortina Systems
+CS3512 and CS3516.
+
+Required properties:
+- compatible: must be "cortina,gemini-ethernet"
+- reg: must contain the global registers and the V-bit and A-bit
+  memory areas, in total three register sets.
+- syscon: a phandle to the system controller
+- #address-cells: must be specified, must be <1>
+- #size-cells: must be specified, must be <1>
+- ranges: should be stated like this giving a 1:1 address translation
+  for the subnodes
+
+The subnodes represent the two ethernet ports in this device.
+They are not independent of each other since they share resources
+in the parent node, and are thus children.
+
+Required subnodes:
+- port0: contains the resources for ethernet port 0
+- port1: contains the resources for ethernet port 1
+
+Required subnode properties:
+- compatible: must be "cortina,gemini-ethernet-port"
+- reg: must contain two register areas: the DMA/TOE memory and
+  the GMAC memory area of the port
+- interrupts: should contain the interrupt line of the port.
+  this is nominally a level interrupt active high.
+- resets: this must provide an SoC-integrated reset line for
+  the port.
+- clocks: this should contain a handle to the PCLK clock for
+  clocking the silicon in this port
+- clock-names: must be "PCLK"
+
+Optional subnode properties:
+- phy-mode: see ethernet.txt
+- phy-handle: see ethernet.txt
+
+Example:
+
+mdio-bus {
+   (...)
+   phy0: ethernet-phy@1 {
+   reg = <1>;
+   device_type = "ethernet-phy";
+   };
+   phy1: ethernet-phy@3 {
+   reg = <3>;
+   device_type = "ethernet-phy";
+   };
+};
+
+
+ethernet@60000000 {
+   compatible = "cortina,gemini-ethernet";
+   reg = <0x60000000 0x4000>, /* Global registers, queue */
+ <0x60004000 0x2000>, /* V-bit */
+ <0x60006000 0x2000>; /* A-bit */
+   syscon = <>;
+   #address-cells = <1>;
+   #size-cells = <1>;
+   ranges;
+
+   gmac0: port0 {
+   compatible = "cortina,gemini-ethernet-port";
+   reg = <0x60008000 0x2000>, /* Port 0 DMA/TOE */
+ <0x6000a000 0x2000>; /* Port 0 GMAC */
+   interrupt-parent = <>;
+   interrupts = <1 IRQ_TYPE_LEVEL_HIGH>;
+   resets = < GEMINI_RESET_GMAC0>;
+   clocks = < GEMINI_CLK_GATE_GMAC0>;
+   clock-names = "PCLK";
+   phy-mode = "rgmii";
+   phy-handle = <&phy0>;
+   };
+
+   gmac1: port1 {
+   compatible = "cortina,gemini-ethernet-port";
+   reg = <0x6000c000 0x2000>, /* Port 1 DMA/TOE */
+ <0x6000e000 0x2000>; /* Port 1 GMAC */
+   interrupt-parent = <>;
+   interrupts = <2 IRQ_TYPE_LEVEL_HIGH>;
+   resets = < GEMINI_RESET_GMAC1>;
+   clocks = < GEMINI_CLK_GATE_GMAC1>;
+   clock-names = "PCLK";
+   phy-mode = "rgmii";
+   phy-handle = <&phy1>;
+   };
+};
-- 
2.14.3

Re: [PATCH 00/10] Code style patches for staging/irda

2017-12-02 Thread Greg KH

On Sun, Nov 26, 2017 at 01:45:55AM +0300, Mike wrote:
> 
> This is my task from Little Penguin :)

What does that mean?

> 
> Mike (10):
>   trailing whitespace fixed
>   spaces before tabs are fixed
>   trailing */ on a separate line fixed
>   if-else code style fixed
>   space prohibited before that ',' fixed
>   space required before the open parenthesis - fixed
>   code indent should use tabs where possible - fixed
>   spaces required - fixed
>   others not regular and missed errors are fixed
>   lines over 80 characters are fixed
> 
>  drivers/staging/irda/drivers/ali-ircc.c | 1516 
> +++
>  1 file changed, 722 insertions(+), 794 deletions(-)

Did you read drivers/staging/irda/TODO ?

[PATCH net-next] net: phy: broadcom: re-add mistakenly removed config settings

2017-12-02 Thread Heiner Kallweit

Previous patch mistakenly removed three chip-specific config settings.
Add them again.

Fixes: 80274abafc60 "net: phy: remove generic settings for callbacks 
config_aneg and read_status from drivers"
Signed-off-by: Heiner Kallweit 
---
 drivers/net/phy/broadcom.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/phy/broadcom.c b/drivers/net/phy/broadcom.c
index e008411a3..a8f69c577 100644
--- a/drivers/net/phy/broadcom.c
+++ b/drivers/net/phy/broadcom.c
@@ -611,6 +611,7 @@ static struct phy_driver broadcom_drivers[] = {
.features   = PHY_GBIT_FEATURES,
.flags  = PHY_HAS_INTERRUPT,
.config_init= bcm54xx_config_init,
+   .config_aneg= bcm5481_config_aneg,
.ack_interrupt  = bcm_phy_ack_intr,
.config_intr= bcm_phy_config_intr,
 }, {
@@ -620,6 +621,7 @@ static struct phy_driver broadcom_drivers[] = {
.features   = PHY_GBIT_FEATURES,
.flags  = PHY_HAS_INTERRUPT,
.config_init= bcm54xx_config_init,
+   .config_aneg= bcm5481_config_aneg,
.ack_interrupt  = bcm_phy_ack_intr,
.config_intr= bcm_phy_config_intr,
 }, {
@@ -629,6 +631,7 @@ static struct phy_driver broadcom_drivers[] = {
.features   = PHY_GBIT_FEATURES,
.flags  = PHY_HAS_INTERRUPT,
.config_init= bcm5482_config_init,
+   .read_status= bcm5482_read_status,
.ack_interrupt  = bcm_phy_ack_intr,
.config_intr= bcm_phy_config_intr,
 }, {
-- 
2.15.1

Re: [PATCH 1/3] iio: trigger: Fix platform_get_irq's error checking

2017-12-02 Thread Jonathan Cameron

On Thu, 30 Nov 2017 21:13:34 +0530
Arvind Yadav  wrote:

> The platform_get_irq() function returns a negative number if an error
> occurs, and zero or a positive number on success. Checking
> platform_get_irq() for zero to detect an error is not correct.
> 
> Signed-off-by: Arvind Yadav 
Applied to the togreg branch of iio.git.  This is probably just
a theoretical problem as obviously the blackfin trigger only runs
on blackfin boards and I assume they only return 0.

Anyhow, nothing wrong with tidying it up, as it might possibly get
cut and pasted to somewhere it does matter in future!

Jonathan

> ---
>  drivers/staging/iio/trigger/iio-trig-bfin-timer.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/staging/iio/trigger/iio-trig-bfin-timer.c b/drivers/staging/iio/trigger/iio-trig-bfin-timer.c
> index d80dcf8..f389f5c 100644
> --- a/drivers/staging/iio/trigger/iio-trig-bfin-timer.c
> +++ b/drivers/staging/iio/trigger/iio-trig-bfin-timer.c
> @@ -187,9 +187,9 @@ static int iio_bfin_tmr_trigger_probe(struct platform_device *pdev)
>   return -ENOMEM;
>  
>   st->irq = platform_get_irq(pdev, 0);
> - if (!st->irq) {
> + if (st->irq < 0) {
>   dev_err(&pdev->dev, "No IRQs specified");
> - return -ENODEV;
> + return st->irq;
>   }
>  
>   ret = iio_bfin_tmr_get_number(st->irq);
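
For readers of the archive, the check this patch corrects can be sketched in userspace C. This is only an illustration under stated assumptions: fake_platform_get_irq() and probe_irq() are hypothetical stand-ins, not kernel functions, and the -ENXIO error value is picked arbitrarily for the example.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for platform_get_irq(): returns the IRQ number on
 * success or a negative errno on failure. Not the kernel implementation.
 */
static int fake_platform_get_irq(int configured_irq)
{
	return configured_irq >= 0 ? configured_irq : -ENXIO;
}

/*
 * Mirrors the patched probe logic: only negative values are errors, and the
 * error code is passed through instead of being replaced with -ENODEV.
 */
static int probe_irq(int configured_irq, int *irq_out)
{
	int irq = fake_platform_get_irq(configured_irq);

	if (irq < 0) {
		fprintf(stderr, "No IRQs specified\n");
		return irq;
	}

	*irq_out = irq;
	return 0;
}
```

With the old "if (!irq)" test, a negative return would have been stored as if it were a valid IRQ number; the "< 0" test catches it and keeps the original error code.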

Re: netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1'

2017-12-02 Thread Al Viro

On Fri, Dec 01, 2017 at 09:47:00PM +0100, Daniel Borkmann wrote:

> > Might want to replace security_path_mknod() with something saner, while
> > we are at it.
> > 
> > Objections?
> 
> No, thanks for looking into this, and sorry for this fugly hack! :( Not
> that this doesn't make it any better, but I think back then I took it
> over from mqueue implementation ... should have known better and looking
> into making this generic instead, sigh. The above looks good to me, so
> no objections from my side and thanks for working on it!
> 
> > PS: mqueue.c would also benefit from such primitive - do_create() there
> > would simply pass attr as callback's argument into vfs_mkobj(), with
> > callback being the guts of mqueue_create()...

OK...  See vfs.git#untested.mkobj; it really needs testing, though - mq_open(2)
passes LTP tests, but that's not saying much, and BPF side is completely
untested.

Re: [PATCH 3/4] RFC: net: dsa: Add bindings for Realtek SMI DSAs

2017-12-02 Thread Linus Walleij

On Thu, Nov 30, 2017 at 12:26 AM, Florian Fainelli  wrote:
> On 11/29/2017 03:19 PM, Linus Walleij wrote:

>> Or are there in practice things such that reg is different
>> on the port and the PHY connected to it? Then it makes
>> much sense to put an MDIO bus inside the switch DT
>> node and populate the PHY interrupts from there as you
>> say.
>
> Yes, I have such systems here, Port 0 has its PHY at MDIO address 5 for
> instance.

That explains it.

> switch@0 {
> compatible = "acme,switch";
> #address-cells = <1>;
> #size-cells = <0>;
>
> ports {
>
> port@0 {
> reg = <0>;
> phy-handle = <>;
> };
>
> port@1 {
> reg = <1>;
> phy-handle = <>;
> };
>
> port@8 {
> reg = <8>;
> ethernet = = <>;
> };
> };
>
> mdio {
> compatible = "acme,switch-mdio";
>
> phy@0 {
> reg = <0>;
> };
>
> phy@1 {
> reg = <1>;
> };
> };
> };
>
> That way it's clear which port maps to which PHY, and that the MDIO
> controller is internal within the switch (and so are the PHYs).

So why not:

switch@0 {
compatible = "acme,switch";
#address-cells = <1>;
#size-cells = <0>;

ports {

port@0 {
reg = <0>;
phy@0 {
 reg = <0>;
};
};

port@1 {
reg = <1>;
phy@1 {
 reg = <1>;
};
};

port@8 {
reg = <8>;
ethernet = = <>;
};
};

This avoids the cross-referencing of phandles.

Yours,
Linus Walleij

Re: NETDEV WATCHDOG: eth0 (dwc-eth-dwmac): transmit queue 1 timed out

2017-12-02 Thread Giuseppe CAVALLARO


Ok Bhadram

thx for this check, I was afraid that the HW FIFO had some issues.

Best Regards
Peppe

On 12/1/2017 4:39 PM, Bhadram Varka wrote:

Hi Giuseppe,

I don't see any issue when running "ping -s 1400". I believe TSO is not
triggered in this case.

Thanks,
Bhadram.

-Original Message-
From: Giuseppe CAVALLARO [mailto:peppe.cavall...@st.com]
Sent: Thursday, November 23, 2017 11:58 AM
To: Bhadram Varka ; joao.pi...@synopsys.com
Cc: linux-netdev 
Subject: Re: NETDEV WATCHDOG: eth0 (dwc-eth-dwmac): transmit queue 1 timed out

Hi Bhadram

you said that this is not observed in the normal ping scenario. I wonder if you
could try, for example, ping with -s 1400. If that still fails, I think
the issue could be the FIFO tuning, and I would expect overflows on the RX MMC counters.

Let me know
Regards,
Peppe

On 11/20/2017 3:22 PM, Bhadram Varka wrote:

Hi Giuseppe,

Thanks for responding.

Actually I am using the net-next tree for making the changes. The patches below
are already present in the code base.

a0daae1 net: stmmac: Disable flow ctrl for RX AVB queues and really enable TX AVB queues
52a7623 net: stmmac: Use correct values in TQS/RQS fields

Thanks,
Bhadram.

-Original Message-
From: Giuseppe CAVALLARO [mailto:peppe.cavall...@st.com]
Sent: Monday, November 20, 2017 6:37 PM
To: Bhadram Varka ; joao.pi...@synopsys.com
Cc: linux-netdev 
Subject: Re: NETDEV WATCHDOG: eth0 (dwc-eth-dwmac): transmit queue 1 timed out

Hello Bhadram

there are some new patches actually in net/net-next repo that you should have; 
for example:

  [PATCH net-next v2 0/2] net: stmmac: Improvements for multi-queuing and for AVB

Let me know if these help you.

Regards
Peppe

On 11/20/2017 7:38 AM, Bhadram Varka wrote:

Hi Joao/Peppe,

I observe this issue more frequently in the multi-channel case. Am I missing
something in the DT?
Please help here to understand the issue.

Thanks,
Bhadram

-Original Message-
From: Bhadram Varka
Sent: Thursday, November 16, 2017 9:41 AM
To: linux-netdev 
Subject: NETDEV WATCHDOG: eth0 (dwc-eth-dwmac): transmit queue 1 timed out

Hi,

I am trying to enable multi-queue in Tegra186 EQOS (which has support for 4
channels) and observed the netdev watchdog warning below. It is easily
reproducible with an iperf test.
This is not observed in the normal ping scenario, and I did not observe any issue
when TSO is disabled. It looks like an issue in stmmac_tso_xmit() in the
multi-channel scenario.

[   88.801672] NETDEV WATCHDOG: eth0 (dwc-eth-dwmac): transmit queue 0 timed out
[   88.808818] [ cut here ]
[   88.813435] WARNING: CPU: 5 PID: 0 at net/sched/sch_generic.c:320 
dev_watchdog+0x2cc/0x2d8
[   88.821681] Modules linked in: dwmac_dwc_qos_eth stmmac_platform crc32_ce 
crct10dif_ce stmmac ip_tables x_tables ipv6
[   88.832290] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G S  
4.14.0-rc7-01956-g9395db5-dirty #21
[   88.841663] Hardware name: NVIDIA Tegra186 P2771- Development Board (DT)
[   88.848697] task: 8001ec8fd400 task.stack: 09e38000
[   88.854606] PC is at dev_watchdog+0x2cc/0x2d8
[   88.858952] LR is at dev_watchdog+0x2cc/0x2d8
[   88.863300] pc : [] lr : [] pstate: 
2145
[   88.870678] sp : 0802bd80
[   88.873983] x29: 0802bd80 x28: 00a0
[   88.879287] x27:  x26: 8001eae2c3b0
[   88.884589] x25: 0005 x24: 8001ecb6be80
[   88.889891] x23: 8001eae2c39c x22: 8001eae2bfb0
[   88.895192] x21: 8001eae2c000 x20: 08fe7000
[   88.900493] x19: 0001 x18: 0010
[   88.905795] x17:  x16: 
[   88.911098] x15:  x14: 756f2064656d6974
[   88.916399] x13: 2031206575657571 x12: 08fe9df0
[   88.921699] x11: 08586180 x10: 642d6874652d6377
[   88.927000] x9 : 0016 x8 : 3a474f4448435441
[   88.932301] x7 : 572056454454454e x6 : 014f
[   88.937602] x5 : 0020 x4 : 
[   88.942902] x3 :  x2 : 08fec4c0
[   88.948203] x1 : 8001ec8fd400 x0 : 0041
[   88.953504] Call trace:
[   88.955944] Exception stack(0x0802bc40 to 0x0802bd80)
[   88.962371] bc40: 0041 8001ec8fd400 08fec4c0 

[   88.970184] bc60:  0020 014f 
572056454454454e
[   88.977998] bc80: 3a474f4448435441 0016 642d6874652d6377 
08586180
[   88.985811] bca0: 08fe9df0 2031206575657571 756f2064656d6974 

[   88.993624] bcc0:   0010 
0001
[   89.001439] bce0: 08fe7000 8001eae2c000 8001eae2bfb0 
8001eae2c39c
[   89.009252] bd00: 8001ecb6be80 0005 8001eae2c3b0 

[   89.017065] bd20: 00a0 0802bd80

Re: [EXT] Re: [PATCH net] net: phylink: fix link state on phy-connect

2017-12-02 Thread Russell King - ARM Linux

On Sat, Dec 02, 2017 at 11:08:45AM +, Yan Markman wrote:
> Hi Russel
>
> The Grygorii has raised one Additional point (about netif_carrier_off)
> I just didn't want to start before finishing the previous one.
>
> On ifconfig down, mac_config() is called, but with LINK=0.
> mac_config() has no knowledge of the intention (up or down) and
> should run with ingress/egress disabled, so one of its actions
> is netif_carrier_off().

With the "p21" patch applied, which is now queued for 4.15-rc by davem,
the behaviour of phylink when phylink_stop() is called becomes entirely
predictable.

When phylink_stop() has been called, provided the carrier state is left
alone, it is guaranteed that mac_link_down() will be called if the link
was originally up, and this will complete prior to phylink_stop()
returning.

After that call has been made, and provided no further calls from the
MAC driver to phylink are made, phylink will make no further calls
to the MAC driver via mac_config(), mac_link_up() or mac_link_down().

It will only resume making these calls once phylink_start() is called.
phylink_start() will cause mac_config() to be called for the current
link mode.  A resolve of the current state is then triggered, which
may trigger further mac_config() calls to be made.  If the link is
then deemed to be up, a call to mac_link_up() will be made.
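
The call ordering described above can be modelled with a small toy state machine. This is a sketch of the contract as stated in this mail only, not phylink's implementation: struct toy_phylink and the toy_*() functions are hypothetical, and the counters merely stand in for calls to mac_config(), mac_link_up() and mac_link_down().

```c
#include <stdbool.h>

/* Hypothetical model; the counters stand in for calls into the MAC driver. */
struct toy_phylink {
	bool started;      /* between toy_start() and toy_stop() */
	bool link_up;      /* last link state reported to the MAC */
	int config_calls;  /* models mac_config() */
	int up_calls;      /* models mac_link_up() */
	int down_calls;    /* models mac_link_down() */
};

/* Resolve the link state; while stopped, nothing is reported to the MAC. */
static void toy_resolve(struct toy_phylink *pl, bool phy_link)
{
	if (!pl->started)
		return;

	if (phy_link && !pl->link_up) {
		pl->up_calls++;
		pl->link_up = true;
	} else if (!phy_link && pl->link_up) {
		pl->down_calls++;
		pl->link_up = false;
	}
}

/* Start: configure for the current link mode, then resolve the state. */
static void toy_start(struct toy_phylink *pl, bool phy_link)
{
	pl->started = true;
	pl->config_calls++;
	toy_resolve(pl, phy_link);
}

/* Stop: report link down exactly once if the link was up, then go quiet. */
static void toy_stop(struct toy_phylink *pl)
{
	if (pl->link_up) {
		pl->down_calls++;
		pl->link_up = false;
	}
	pl->started = false;
}
```

In this model the "link was up" information lives only in link_up. A driver that cleared the equivalent state behind the model's back (the analogue of calling netif_carrier_off() before phylink_stop()) would make toy_stop() skip the link-down report.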

> After calling mac_config() the phylink checks
>   if (!link  &&  !netif_carrier_ok())
> and decides to abort further down since all-done...

phylink does not contain any such if () statement, so I'm not sure
what code you are referring to.

> Removing netif_carrier_off() looks correct, BUT there are cases where the
> driver stops working properly (sorry, I can't remember now what exactly).
> So finally I have placed a CONDITIONAL carrier-off there, depending upon
> the link:
> 
> static void mvpp2_mac_config(){
>   if (state->link)--- occasionally is TRUE on UP but FALSE on down
>   netif_carrier_off(port->dev);//YANM

You should not be changing the carrier state in your mac_config()
function, because, again, just like having netif_carrier_off() before
phylink_stop(), it will mess phylink's tracking of the current state
and will cause the mac_link_*() functions to be called erratically.

> BTW: It seems your patch below should be present anyway.
> +++ b/drivers/net/phy/phylink.c
> @@ -798,6 +798,7 @@ void phylink_disconnect_phy(struct phylink *pl)
> + pl->phy_state.link = false;

Here's an example without the above on Macchiatobin of a up -> down -> up
sequence on the gigabit wired ethernet port on this board (which I have
bound to a Linux bridge device).  The exact command used for this was:

# ifconfig eth2 down; sleep 2; ifconfig eth2 up

[66926.127009] mvpp2x f400.ppv22 eth2: Link is Down
[66926.131557] br0: port 1(eth2) entered disabled state
[66928.144845] mvpp2x f400.ppv22 eth2: configuring for inband/sgmii link mode
[66928.144853] mvpp2x f400.ppv22 eth2: reconfig: pm 4->4 cm 201->201 f 2->2
[66928.154937] IPv6: ADDRCONF(NETDEV_UP): eth2: link is not ready
[66929.783866] IPv6: ADDRCONF(NETDEV_UP): br0: link is not ready
[66929.979499] IPv6: ADDRCONF(NETDEV_UP): br0: link is not ready
[66931.213407] mvpp2x f400.ppv22 eth2: reconfig: pm 4->4 cm 201->201 f a->a
[66931.213424] mvpp2x f400.ppv22 eth2: Link is Up - 1Gbps/Full - flow control off
[66931.213433] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
[66931.213682] br0: port 1(eth2) entered blocking state
[66931.213685] br0: port 1(eth2) entered forwarding state
[66931.213920] IPv6: ADDRCONF(NETDEV_CHANGE): br0: link becomes ready

This is with the "p21" patch applied, and the netif_carrier_off()
before phylink_stop() in mvpp2x removed.  Basically:

void mv_pp2x_stop_dev(struct mv_pp2x_port *port)
{
	struct gop_hw *gop = &port->priv->hw.gop;
	struct mv_mac_data *mac = &port->mac_data;

	if (port->mac_data.phylink) {
		phylink_stop(port->mac_data.phylink);

		/* Disable interrupts on all CPUs */
		mv_pp2x_port_interrupts_disable(port);
		mv_pp2x_port_napi_disable(port);
		netif_tx_stop_all_queues(port->dev);
	} else {
		/* Stop new packets from arriving to RXQs */
		mv_pp2x_ingress_disable(port);

		mdelay(10);

		/* Disable interrupts on all CPUs */
		mv_pp2x_port_interrupts_disable(port);

		mv_pp2x_port_napi_disable(port);

		netif_carrier_off(port->dev);
		netif_tx_stop_all_queues(port->dev);

		mv_pp2x_egress_disable(port);
	}

	if (port->comphy)
		phy_power_off(port->comphy);

	if (port->priv->pp2_version == PPV21) {
		mv_pp21_port_disable(port);
	} else {
		mv_gop110_port_events_mask(gop, mac);
		mv_gop110_port_disable(gop,

Re: ath9k: dfs: use swap macro in ath9k_check_chirping

2017-12-02 Thread Kalle Valo

"Gustavo A. R. Silva"  wrote:

> Make use of the swap macro and remove unnecessary variable temp.
> This makes the code easier to read and maintain.
> 
> This code was detected with the help of Coccinelle.
> 
> Signed-off-by: Gustavo A. R. Silva 
> Signed-off-by: Kalle Valo 

Patch applied to ath-next branch of ath.git, thanks.

626ab6707abe ath9k: dfs: use swap macro in ath9k_check_chirping
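
For context, the kind of cleanup described in the commit message can be sketched in userspace C. The macro below only approximates the kernel's swap() helper and relies on the GCC/Clang __typeof__ extension; order_timestamps() is a hypothetical function made up for this illustration and is not taken from ath9k.

```c
/*
 * Userspace approximation of the kernel's swap() helper. It relies on the
 * GCC/Clang __typeof__ extension, so it is not strictly portable C.
 */
#define swap(a, b) \
	do { __typeof__(a) tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Hypothetical helper: order two timestamps so that *first <= *second. */
static void order_timestamps(unsigned long long *first,
			     unsigned long long *second)
{
	if (*first > *second)
		swap(*first, *second);
}
```

The macro keeps the temporary inside its own scope, which is what lets the caller drop its explicit temp variable.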

-- 
https://patchwork.kernel.org/patch/10041197/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

Re: [v4] wcn36xx: Set default BTLE coexistence config

2017-12-02 Thread Kalle Valo

Ramon Fried  wrote:

> If the value for the firmware configuration parameters
> BTC_STATIC_LEN_LE_BT and BTC_STATIC_LEN_LE_WLAN are not set the duty
> cycle between BT and WLAN is such that if BT (including BLE) is active
> WLAN gets 0 bandwidth. When tuning these parameters having a too high
> value for WLAN means that BLE performance degrades.
> The "sweet" point of roughly half of the maximal values was empirically
> found to achieve a balance between BLE and Wi-Fi coexistence
> performance.
> 
> Signed-off-by: Eyal Ilsar 
> Signed-off-by: Ramon Fried 
> Acked-by: Bjorn Andersson 
> Signed-off-by: Kalle Valo 

Patch applied to ath-next branch of ath.git, thanks.

4119b6160a35 wcn36xx: set default BTLE coexistence config

-- 
https://patchwork.kernel.org/patch/10060833/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

Re: [PATCH v2 22/35] nds32: Device tree support

2017-12-02 Thread Greentime Hu

2017-11-28 3:07 GMT+08:00 Rob Herring :
> On Mon, Nov 27, 2017 at 6:28 AM, Greentime Hu  wrote:
>> From: Greentime Hu 
>>
>> This patch adds support for device tree.
>>
>> Signed-off-by: Vincent Chen 
>> Signed-off-by: Greentime Hu 
>> ---
>>  arch/nds32/boot/dts/Makefile   |8 ++
>>  arch/nds32/boot/dts/ae3xx.dts  |   55 
>>  arch/nds32/boot/dts/ag101p.dts |   60 
>> 
>>  arch/nds32/kernel/devtree.c|   45 ++
>>  4 files changed, 168 insertions(+)
>>  create mode 100644 arch/nds32/boot/dts/Makefile
>>  create mode 100644 arch/nds32/boot/dts/ae3xx.dts
>>  create mode 100644 arch/nds32/boot/dts/ag101p.dts
>>  create mode 100644 arch/nds32/kernel/devtree.c
>>
>> diff --git a/arch/nds32/boot/dts/Makefile b/arch/nds32/boot/dts/Makefile
>> new file mode 100644
>> index 000..d31faa8
>> --- /dev/null
>> +++ b/arch/nds32/boot/dts/Makefile
>> @@ -0,0 +1,8 @@
>> +ifneq '$(CONFIG_NDS32_BUILTIN_DTB)' '""'
>
> Built-in dtb's are really for legacy bootloader cases where the
> bootloader doesn't understand dtbs. Do you have that here?
>
> Plus, I don't see any code here to handle the built-in dtb.

As you mentioned in the next thread, it is handled in head.S
We would like to keep it because we debug kernel through gdb without
bootloader very often.

>> +BUILTIN_DTB := $(patsubst "%",%,$(CONFIG_NDS32_BUILTIN_DTB)).dtb.o
>> +else
>> +BUILTIN_DTB :=
>> +endif
>> +obj-$(CONFIG_OF) += $(BUILTIN_DTB)
>> +
>> +clean-files := *.dtb *.dtb.S
>> diff --git a/arch/nds32/boot/dts/ae3xx.dts b/arch/nds32/boot/dts/ae3xx.dts
>> new file mode 100644
>> index 000..4181060
>> --- /dev/null
>> +++ b/arch/nds32/boot/dts/ae3xx.dts
>> @@ -0,0 +1,55 @@
>> +/dts-v1/;
>> +/ {
>> +   compatible = "nds32 ae3xx";
>
> This compatible needs to be documented and is not valid. Needs to be
> in the form "vendor,board-name" without spaces.

Sorry I forgot to check this.
I will provide a document in bindings like
"Documentation/devicetree/bindings/nds32/andestech-boards".

>> +   #address-cells = <1>;
>> +   #size-cells = <1>;
>> +   interrupt-parent = <>;
>> +
>> +   chosen {
>> +   bootargs = "earlycon console=ttyS0,38400n8 debug loglevel=7";
>> +   stdout-path = 
>> +   };
>> +
>> +   memory@0 {
>> +   device_type = "memory";
>> +   reg = <0x 0x4000>;
>> +   };
>> +
>> +   cpu {
>> +   device_type = "cpu";
>> +   compatible = "andestech,n13", "andestech,nds32v3";
>> +   clock-frequency = <6000>;
>> +   };
>> +
>> +   intc: interrupt-controller {
>> +   compatible = "andestech,ativic32";
>> +   #interrupt-cells = <1>;
>> +   interrupt-controller;
>> +   };
>> +
>> +   serial0: serial@f030 {
>> +   compatible = "andestech,uart16550", "ns16550a";
>> +   reg = <0xf030 0x1000>;
>> +   interrupts = <8>;
>> +   clock-frequency = <14745600>;
>> +   reg-shift = <2>;
>> +   reg-offset = <32>;
>> +   no-loopback-test = <1>;
>> +   };
>> +
>> +   timer0: timer@f040 {
>> +   compatible = "andestech,atcpit100";
>> +   reg = <0xf040 0x1000>;
>> +   interrupts = <2>;
>> +   clock-frequency = <3000>;
>> +   cycle-count-offset = <0x38>;
>> +   cycle-count-down;
>> +   };
>> +
>> +   mac0: mac@e010 {
>
> ethernet@...
>
>> +   compatible = "andestech,atmac100";
>> +   reg = <0xe010 0x1000>;
>> +   interrupts = <18>;
>> +   };
>> +
>> +};
>> diff --git a/arch/nds32/boot/dts/ag101p.dts b/arch/nds32/boot/dts/ag101p.dts
>> new file mode 100644
>> index 000..f1cb540
>> --- /dev/null
>> +++ b/arch/nds32/boot/dts/ag101p.dts
>> @@ -0,0 +1,60 @@
>> +/dts-v1/;
>> +/ {
>> +   compatible = "nds32 ag101p";
>
> Same here.

Sorry I forgot to check this.
I will provide a document in bindings like
"Documentation/devicetree/bindings/nds32/andestech-boards".

>> +   #address-cells = <1>;
>> +   #size-cells = <1>;
>> +   interrupt-parent = <>;
>> +
>> +   chosen {
>> +   bootargs = "earlycon console=ttyS0,38400n8 debug loglevel=7";
>> +   stdout-path = 
>> +   };
>> +
>> +   memory@0 {
>> +   device_type = "memory";
>> +   reg = <0x 0x4000>;
>> +   };
>> +
>> +   cpu@0 {
>> +   device_type = "cpu";
>> +   compatible = "andestech,n13";
>> +   clock-frequency = <6000>;
>> +   next-level-cache = <>;
>> +   };
>> +
>> +   intc: interrupt-controller {
>> +   compatible =

Re: [PATCH] net: phy: realtek: fix RTL8211F interrupt mode

2017-12-02 Thread Martin Blumenstingl

Hi Heiner,

On Sun, Nov 12, 2017 at 4:16 PM, Heiner Kallweit  wrote:
> After commit b94d22d94ad22 "ARM64: dts: meson-gx: add external PHY
> interrupt on some platforms" ethernet stopped working on my Odroid-C2
> which has a RTL8211F phy.
>
> It turned out that no interrupts were triggered. Further analysis
> showed the register INER can't be altered on page 0.
> Because register INSR needs to be accessed via page 0xa43 I assumed
> that register INER needs to be accessed via some page too.
> Some brute force check resulted in page 0xa42 being the right one.
unfortunately there's no public datasheet for the RTL8211F.
I contacted Realtek to see if we could get a datasheet. unfortunately
an NDA is required for that
however, they were kind enough to share some information from the
RTL8211F datasheet with me

RTL821x_INER is called INER (Interrupt Enable Register) in the datasheet.
it is located at page 0xa42, address (the register after selecting the
page) 0x12 (RTL821x_INER is also 0x12)

in other words: your findings were correct!
(I know that my mail is too late to make it into the commit message -
but with this mail it's "documented" online now)

RTL8211E also uses RTL821x_INER (0x12) register, but according to the
information I got from Realtek it is located in page 0x0 (so no
special page has to be selected before changing that register on
RTL8211E)
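
The paged access described above can be sketched against a toy PHY model. Everything here is an illustration under assumptions: struct mock_phy, mock_write() and write_iner() are hypothetical, the page-select register at 0x1f is an assumption not stated in this mail, and restoring page 0 afterwards is a design choice of the sketch. Only the INER address (0x12) and the pages (0xa42 for RTL8211F, 0x0 for RTL8211E) come from the text.

```c
#include <stdint.h>

#define PAGE_SELECT_REG  0x1f   /* assumed page-select register */
#define INER_REG         0x12   /* RTL821x_INER address, per the mail */
#define INER_PAGE_8211F  0xa42  /* INER page on RTL8211F, per the mail */
#define INER_PAGE_8211E  0x0    /* INER page on RTL8211E, per the mail */

/* Toy PHY: tracks the selected page and exposes INER on one page only. */
struct mock_phy {
	uint16_t page;       /* currently selected page */
	uint16_t iner_page;  /* page on which INER is reachable */
	uint16_t iner;       /* contents of INER */
};

/* Register write as the toy PHY sees it. */
static void mock_write(struct mock_phy *phy, uint8_t reg, uint16_t val)
{
	if (reg == PAGE_SELECT_REG)
		phy->page = val;
	else if (reg == INER_REG && phy->page == phy->iner_page)
		phy->iner = val;
	/* A write to INER_REG on any other page does not reach INER. */
}

/* Select the page, write INER, then go back to page 0. */
static void write_iner(struct mock_phy *phy, uint16_t page, uint16_t val)
{
	mock_write(phy, PAGE_SELECT_REG, page);
	mock_write(phy, INER_REG, val);
	mock_write(phy, PAGE_SELECT_REG, 0);
}
```

On the RTL8211F model, write_iner(&phy, 0, val) leaves INER untouched, which matches the symptom from the original report, while write_iner(&phy, INER_PAGE_8211F, val) takes effect.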


Regards
Martin

Wireless regressions in v4.15-rc1

2017-12-02 Thread Kalle Valo

Hi,

just a heads up to everyone that there are multiple regressions in
v4.15-rc1. For starters hostapd (=AP mode) doesn't work because of:

net: netlink: Update attr validation to require exact length for some types
https://git.kernel.org/linus/28033ae4e0f5

Jouni fixed this already in hostapd but we also need a fix for kernel so
that old hostapd versions continue to work:

https://w1.fi/cgit/hostap/commit/?id=a2426829ce426de82d2fa47071ca41ea81c43307

Jouni also found a similar problem with mesh:

https://w1.fi/cgit/hostap/commit/?id=963d3149abfcbab5b83f9023bc50321f777360d1

And Johannes already submitted a revert related to wpa_supplicant:

[net] Revert "net: core: maybe return -EEXIST in __dev_alloc_name" 
https://patchwork.ozlabs.org/patch/843863/

And with ath10k I'm now seeing this:

[  133.175508] WARNING: CPU: 2 PID: 1743 at net/mac80211/agg-tx.c:315 
___ieee80211_stop_tx_ba_session+0x1ab/0x280 [mac80211]
[  133.175660] Modules linked in: ctr ccm arc4 ath10k_pci(E) ath10k_core(E) 
ath(E) mac80211(E) cfg80211(E) snd_hda_codec_hdmi snd_hda_codec_idt 
snd_hda_codec_generic joydev btusb snd_hda_i
[  133.175924] CPU: 2 PID: 1743 Comm: hostapd Tainted: GE
4.15.0-rc1-wt-ath+ #553
[  133.175960] Hardware name: Hewlett-Packard HP ProBook 6540b/1722, BIOS 68CDD 
Ver. F.04 01/27/2010
[  133.175996] task: 9a1b9b6e1cc0 task.stack: b13800348000
[  133.176072] RIP: 0010:___ieee80211_stop_tx_ba_session+0x1ab/0x280 [mac80211]
[  133.176111] RSP: 0018:b1380034b8f8 EFLAGS: 00210246
[  133.176151] RAX:  RBX: 9a1b93e85698 RCX: 
[  133.176187] RDX:  RSI:  RDI: 00200246
[  133.176223] RBP:  R08:  R09: 0001
[  133.176259] R10:  R11: 0012 R12: 
[  133.176295] R13: 0003 R14:  R15: 9a1b93060dc0
[  133.176332] FS:  () GS:9a1bb260(0063) 
knlGS:f7b69b00
[  133.176368] CS:  0010 DS: 002b ES: 002b CR0: 80050033
[  133.181034] CR2: f7dd56c0 CR3: 000119076000 CR4: 06e0
[  133.181068] Call Trace:
[  133.181191]  ieee80211_sta_tear_down_BA_sessions+0x6d/0x140 [mac80211]
[  133.181264]  __sta_info_destroy_part1+0x5e/0x9d0 [mac80211]
[  133.181336]  __sta_info_flush+0x129/0x190 [mac80211]
[  133.181421]  ieee80211_stop_ap+0x14c/0x5c0 [mac80211]
[  133.181516]  __cfg80211_stop_ap+0xdd/0x620 [cfg80211]
[  133.181591]  cfg80211_stop_ap+0x3a/0x50 [cfg80211]
[  133.181631]  genl_family_rcv_msg+0x1b9/0x370
[  133.181680]  genl_rcv_msg+0x47/0x90
[  133.181712]  ? genl_rcv+0x15/0x40
[  133.181744]  ? genl_family_rcv_msg+0x370/0x370
[  133.181777]  netlink_rcv_skb+0xd2/0xf0
[  133.181816]  genl_rcv+0x24/0x40
[  133.181850]  netlink_unicast+0x1c0/0x2e0
[  133.181887]  netlink_sendmsg+0x2ac/0x390
[  133.181931]  sock_sendmsg+0x30/0x40
[  133.181965]  ___sys_sendmsg+0x2a6/0x2b0
[  133.182003]  ? trace_hardirqs_on_caller+0x124/0x190
[  133.182042]  ? trace_hardirqs_on_caller+0x124/0x190
[  133.182083]  ? free_debug_processing+0x271/0x380
[  133.182122]  ? __lock_acquire+0x52d/0x1120
[  133.182159]  ? __lock_acquire+0x52d/0x1120
[  133.182200]  ? __sys_sendmsg+0x41/0x70
[  133.182231]  __sys_sendmsg+0x41/0x70
[  133.182278]  compat_SyS_socketcall+0x2db/0x480
[  133.182311]  ? task_work_run+0x6a/0xb0
[  133.182366]  ? trace_hardirqs_off_thunk+0x1a/0x1c
[  133.182406]  do_fast_syscall_32+0x9c/0x300
[  133.182443]  entry_SYSENTER_compat+0x51/0x60
[  133.182493] Code: 5d 41 5e 41 5f c3 c7 04 24 03 00 00 00 e9 dc fe ff ff 48 
8d bf 58 09 00 00 be ff ff ff ff e8 dd 36 90 e7 85 c0 0f 85 a8 fe ff ff <0f> ff 
e9 a1 fe ff ff 4c 89 f7 e8 76 
[  133.182798] ---[ end trace fb3fb3b808e4ce1b ]---

-- 
Kalle Valo

[PATCH 10/10] net: ethernet: cpmac: Handle return value of platform_get_irq_byname

2017-12-02 Thread Arvind Yadav

platform_get_irq_byname() can fail here and we must check its return
value.

Signed-off-by: Arvind Yadav 
---
 drivers/net/ethernet/ti/cpmac.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/ti/cpmac.c b/drivers/net/ethernet/ti/cpmac.c
index 9b8a30b..f3acfc0 100644
--- a/drivers/net/ethernet/ti/cpmac.c
+++ b/drivers/net/ethernet/ti/cpmac.c
@@ -1124,6 +1124,10 @@ static int cpmac_probe(struct platform_device *pdev)
}
 
dev->irq = platform_get_irq_byname(pdev, "irq");
+   if (dev->irq < 0) {
+   rc = dev->irq;
+   goto fail;
+   }
 
dev->netdev_ops = &cpmac_netdev_ops;
dev->ethtool_ops = &cpmac_ethtool_ops;
-- 
2.7.4

[PATCH 08/10] net: fjes: Handle return value of platform_get_irq and platform_get_resource

2017-12-02 Thread Arvind Yadav

platform_get_irq() and platform_get_resource() can fail here and
we must check their return values.

Signed-off-by: Arvind Yadav 
---
 drivers/net/fjes/fjes_main.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index 750954b..540dd51 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -1265,9 +1265,19 @@ static int fjes_probe(struct platform_device *plat_dev)
adapter->interrupt_watch_enable = false;
 
res = platform_get_resource(plat_dev, IORESOURCE_MEM, 0);
+   if (!res) {
+   err = -EINVAL;
+   goto err_free_netdev;
+   }
+
hw->hw_res.start = res->start;
hw->hw_res.size = resource_size(res);
hw->hw_res.irq = platform_get_irq(plat_dev, 0);
+   if (hw->hw_res.irq <= 0) {
+   err = hw->hw_res.irq ? hw->hw_res.irq : -ENODEV;
+   goto err_free_netdev;
+   }
+
err = fjes_hw_init(&adapter->hw);
if (err)
goto err_free_netdev;
-- 
2.7.4
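
The "irq ? irq : -ENODEV" expression used in this patch folds two failure cases into one branch. A minimal userspace sketch of that mapping follows; irq_result_to_err() is a hypothetical helper written for illustration, not something the patch adds.

```c
#include <errno.h>

/*
 * Hypothetical helper: turn a platform_get_irq()-style result into
 * 0 (usable IRQ) or a negative errno. A negative result is passed through
 * unchanged; 0 is treated as "no usable IRQ" and mapped to -ENODEV.
 */
static int irq_result_to_err(int irq)
{
	if (irq <= 0)
		return irq ? irq : -ENODEV;

	return 0;
}
```

This differs from a plain "< 0" check only in how a return value of 0 is handled.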

[PATCH 09/10] net: ethernet: korina: Handle return value of platform_get_irq_byname

2017-12-02 Thread Arvind Yadav

platform_get_irq_byname() can fail here and we must check its return
value.

Signed-off-by: Arvind Yadav 
---
 drivers/net/ethernet/korina.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/ethernet/korina.c b/drivers/net/ethernet/korina.c
index ae195f8..e778504 100644
--- a/drivers/net/ethernet/korina.c
+++ b/drivers/net/ethernet/korina.c
@@ -1039,7 +1039,16 @@ static int korina_probe(struct platform_device *pdev)
memcpy(dev->dev_addr, bif->mac, ETH_ALEN);
 
lp->rx_irq = platform_get_irq_byname(pdev, "korina_rx");
+   if (lp->rx_irq < 0) {
+   rc = lp->rx_irq;
+   goto probe_err_out;
+   }
+
lp->tx_irq = platform_get_irq_byname(pdev, "korina_tx");
+   if (lp->tx_irq < 0) {
+   rc = lp->tx_irq;
+   goto probe_err_out;
+   }
 
r = platform_get_resource_byname(pdev, IORESOURCE_MEM, "korina_regs");
dev->base_addr = r->start;
-- 
2.7.4

[PATCH 07/10] net: ethernet: smsc: Handle return value of platform_get_irq

2017-12-02 Thread Arvind Yadav

platform_get_irq() can fail here and we must check its return value.

Signed-off-by: Arvind Yadav 
---
 drivers/net/ethernet/smsc/smc911x.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/ethernet/smsc/smc911x.c b/drivers/net/ethernet/smsc/smc911x.c
index 0515744..a1cf18c 100644
--- a/drivers/net/ethernet/smsc/smc911x.c
+++ b/drivers/net/ethernet/smsc/smc911x.c
@@ -2088,6 +2088,11 @@ static int smc911x_drv_probe(struct platform_device *pdev)
 
ndev->dma = (unsigned char)-1;
ndev->irq = platform_get_irq(pdev, 0);
+   if (ndev->irq <= 0) {
+   ret = ndev->irq ? ndev->irq : -ENODEV;
+   goto release_both;
+   }
+
lp = netdev_priv(ndev);
lp->netdev = ndev;
 #ifdef SMC_DYNAMIC_BUS_CONFIG
-- 
2.7.4

[PATCH net 1/2] netlink: add NLA_U8_BUGGY attribute type

2017-12-02 Thread Johannes Berg

From: Johannes Berg 

This netlink type is used only for backwards compatibility
with broken userspace that used the wrong size for a given
u8 attribute, which is now rejected. It would've been wrong
before already, since on big endian the wrong value (always
zero) would be used by the kernel, but we can't break the
existing deployed userspace - hostapd for example now fails
to initialize entirely.

We could try to fix up the big endian problem here, but we
don't know *how* userspace misbehaved - if using nla_put_u32
then we could, but we also found a debug tool (which we'll
ignore for the purposes of this regression) that was putting
the padding into the length.

Fixes: 28033ae4e0f5 ("net: netlink: Update attr validation to require exact length for some types")
Signed-off-by: Johannes Berg 
---
 include/net/netlink.h | 1 +
 lib/nlattr.c  | 1 +
 2 files changed, 2 insertions(+)

diff --git a/include/net/netlink.h b/include/net/netlink.h
index 0c154f98e987..448a9b86c959 100644
--- a/include/net/netlink.h
+++ b/include/net/netlink.h
@@ -180,6 +180,7 @@ enum {
NLA_S32,
NLA_S64,
NLA_BITFIELD32,
+   NLA_U8_BUGGY, /* don't use this - only for bug-ward compatibility */
__NLA_TYPE_MAX,
 };
 
diff --git a/lib/nlattr.c b/lib/nlattr.c
index 8bf78b4b78f0..2b89d25d4745 100644
--- a/lib/nlattr.c
+++ b/lib/nlattr.c
@@ -28,6 +28,7 @@ static const u8 nla_attr_len[NLA_TYPE_MAX+1] = {
 };
 
 static const u8 nla_attr_minlen[NLA_TYPE_MAX+1] = {
+   [NLA_U8_BUGGY]  = sizeof(u8),
[NLA_MSECS] = sizeof(u64),
[NLA_NESTED]= NLA_HDRLEN,
 };
-- 
2.14.2
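
The difference between an exact-length and a minimum-length attribute type can be sketched in plain C. This is a simplified model under assumptions, not the code in lib/nlattr.c: the enum, the two tables and validate_len() are hypothetical names, and -ERANGE is assumed as the rejection code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical attribute types for this model. */
enum attr_type { ATTR_U8, ATTR_U8_BUGGY, ATTR_TYPE_COUNT };

/* Exact payload length required; 0 means no exact-length rule applies. */
static const size_t exact_len[ATTR_TYPE_COUNT] = {
	[ATTR_U8] = 1,
};

/* Minimum payload length accepted when no exact length is required. */
static const size_t min_len[ATTR_TYPE_COUNT] = {
	[ATTR_U8_BUGGY] = 1,
};

/* Return 0 if payload_len is acceptable for the type, else -ERANGE. */
static int validate_len(enum attr_type type, size_t payload_len)
{
	if (exact_len[type])
		return payload_len == exact_len[type] ? 0 : -ERANGE;

	return payload_len >= min_len[type] ? 0 : -ERANGE;
}
```

In this model a userspace tool that sends 4 bytes for a u8 attribute is rejected for ATTR_U8 but accepted for ATTR_U8_BUGGY, which is the compatibility behaviour the patch is after.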

[PATCH net 2/2] nl80211: use NLA_U8_BUGGY for two attributes

2017-12-02 Thread Johannes Berg

From: Johannes Berg 

We discovered that these are set incorrectly by the
corresponding userspace code, so keep compatible with
their bugs even if they'd always set the value to 0.

Reported-by: Jouni Malinen 
Fixes: 28033ae4e0f5 ("net: netlink: Update attr validation to require exact length for some types")
Signed-off-by: Johannes Berg 
---
 net/wireless/nl80211.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
index b1ac23ca20c8..751b4efbf09a 100644
--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -384,7 +384,7 @@ static const struct nla_policy nl80211_policy[NUM_NL80211_ATTR] = {
[NL80211_ATTR_TSID] = { .type = NLA_U8 },
[NL80211_ATTR_USER_PRIO] = { .type = NLA_U8 },
[NL80211_ATTR_ADMITTED_TIME] = { .type = NLA_U16 },
-   [NL80211_ATTR_SMPS_MODE] = { .type = NLA_U8 },
+   [NL80211_ATTR_SMPS_MODE] = { .type = NLA_U8_BUGGY },
[NL80211_ATTR_MAC_MASK] = { .len = ETH_ALEN },
[NL80211_ATTR_WIPHY_SELF_MANAGED_REG] = { .type = NLA_FLAG },
[NL80211_ATTR_NETNS_FD] = { .type = NLA_U32 },
@@ -5830,7 +5830,7 @@ static const struct nla_policy 
nl80211_meshconf_params_policy[NL80211_MESHCONF_A
[NL80211_MESHCONF_MAX_RETRIES] = { .type = NLA_U8 },
[NL80211_MESHCONF_TTL] = { .type = NLA_U8 },
[NL80211_MESHCONF_ELEMENT_TTL] = { .type = NLA_U8 },
-   [NL80211_MESHCONF_AUTO_OPEN_PLINKS] = { .type = NLA_U8 },
+   [NL80211_MESHCONF_AUTO_OPEN_PLINKS] = { .type = NLA_U8_BUGGY },
[NL80211_MESHCONF_SYNC_OFFSET_MAX_NEIGHBOR] = { .type = NLA_U32 },
[NL80211_MESHCONF_HWMP_MAX_PREQ_RETRIES] = { .type = NLA_U8 },
[NL80211_MESHCONF_PATH_REFRESH_TIME] = { .type = NLA_U32 },
-- 
2.14.2

[PATCH net-next 2/4] rtnetlink: get reference on module before invoking handlers

2017-12-02 Thread Florian Westphal

Add yet another rtnl_register function.  It will be used by modules
that can be removed.

The passed module struct is used to prevent module unload while
a netlink dump is in progress or when a DOIT_UNLOCKED doit callback
is called.

Cc: Peter Zijlstra 
Signed-off-by: Florian Westphal 
---
 include/net/rtnetlink.h |   2 +
 net/core/rtnetlink.c| 113 +---
 2 files changed, 80 insertions(+), 35 deletions(-)

diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h
index ead018744ff5..e326b3f9eb5f 100644
--- a/include/net/rtnetlink.h
+++ b/include/net/rtnetlink.h
@@ -17,6 +17,8 @@ int __rtnl_register(int protocol, int msgtype,
rtnl_doit_func, rtnl_dumpit_func, unsigned int flags);
 void rtnl_register(int protocol, int msgtype,
   rtnl_doit_func, rtnl_dumpit_func, unsigned int flags);
+int rtnl_register_module(struct module *owner, int protocol, int msgtype,
+rtnl_doit_func, rtnl_dumpit_func, unsigned int flags);
 int rtnl_unregister(int protocol, int msgtype);
 void rtnl_unregister_all(int protocol);
 
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index ff292d3f2c41..de6390365c90 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -62,6 +62,7 @@
 struct rtnl_link {
rtnl_doit_func  doit;
rtnl_dumpit_funcdumpit;
+   struct module   *owner;
unsigned intflags;
struct rcu_head rcu;
 };
@@ -129,7 +130,6 @@ EXPORT_SYMBOL(lockdep_rtnl_is_held);
 #endif /* #ifdef CONFIG_PROVE_LOCKING */
 
 static struct rtnl_link __rcu **rtnl_msg_handlers[RTNL_FAMILY_MAX + 1];
-static refcount_t rtnl_msg_handlers_ref[RTNL_FAMILY_MAX + 1];
 
 static inline int rtm_msgindex(int msgtype)
 {
@@ -159,27 +159,10 @@ static struct rtnl_link *rtnl_get_link(int protocol, int msgtype)
return tab[msgtype];
 }
 
-/**
- * __rtnl_register - Register a rtnetlink message type
- * @protocol: Protocol family or PF_UNSPEC
- * @msgtype: rtnetlink message type
- * @doit: Function pointer called for each request message
- * @dumpit: Function pointer called for each dump request (NLM_F_DUMP) message
- * @flags: rtnl_link_flags to modifiy behaviour of doit/dumpit functions
- *
- * Registers the specified function pointers (at least one of them has
- * to be non-NULL) to be called whenever a request message for the
- * specified protocol family and message type is received.
- *
- * The special protocol family PF_UNSPEC may be used to define fallback
- * function pointers for the case when no entry for the specific protocol
- * family exists.
- *
- * Returns 0 on success or a negative error code.
- */
-int __rtnl_register(int protocol, int msgtype,
-   rtnl_doit_func doit, rtnl_dumpit_func dumpit,
-   unsigned int flags)
+static int rtnl_register_internal(struct module *owner,
+ int protocol, int msgtype,
+ rtnl_doit_func doit, rtnl_dumpit_func dumpit,
+ unsigned int flags)
 {
struct rtnl_link **tab, *link, *old;
int msgindex;
@@ -210,6 +193,9 @@ int __rtnl_register(int protocol, int msgtype,
goto unlock;
}
 
+   WARN_ON(link->owner && link->owner != owner);
+   link->owner = owner;
+
WARN_ON(doit && link->doit && link->doit != doit);
if (doit)
link->doit = doit;
@@ -228,6 +214,54 @@ int __rtnl_register(int protocol, int msgtype,
rtnl_unlock();
return ret;
 }
+
+/**
+ * rtnl_register_module - Register a rtnetlink message type
+ *
+ * @owner: module registering the hook (THIS_MODULE)
+ * @protocol: Protocol family or PF_UNSPEC
+ * @msgtype: rtnetlink message type
+ * @doit: Function pointer called for each request message
+ * @dumpit: Function pointer called for each dump request (NLM_F_DUMP) message
+ * @flags: rtnl_link_flags to modifiy behaviour of doit/dumpit functions
+ *
+ * Like rtnl_register, but for use by removable modules.
+ */
+int rtnl_register_module(struct module *owner,
+int protocol, int msgtype,
+rtnl_doit_func doit, rtnl_dumpit_func dumpit,
+unsigned int flags)
+{
+   return rtnl_register_internal(owner, protocol, msgtype,
+ doit, dumpit, flags);
+}
+EXPORT_SYMBOL_GPL(rtnl_register_module);
+
+/**
+ * __rtnl_register - Register a rtnetlink message type
+ * @protocol: Protocol family or PF_UNSPEC
+ * @msgtype: rtnetlink message type
+ * @doit: Function pointer called for each request message
+ * @dumpit: Function pointer called for each dump request (NLM_F_DUMP) message
+ * @flags: rtnl_link_flags to modifiy behaviour of doit/dumpit functions
+ *
+ * Registers the specified function pointers (at least one of them has
+ * to be non-NULL) to be

Re: [PATCH 08/10] net: fjes: Handle return value of platform_get_irq and platform_get_resource

2017-12-02 Thread arvindY


Hi Sergei,

On Sunday 03 December 2017 01:36 AM, Sergei Shtylyov wrote:

Hello!

On 12/02/2017 10:26 PM, Arvind Yadav wrote:


platform_get_irq() and platform_get_resource() can fail here and
we must check their return values.

Signed-off-by: Arvind Yadav 
---
  drivers/net/fjes/fjes_main.c | 10 ++
  1 file changed, 10 insertions(+)

diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index 750954b..540dd51 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -1265,9 +1265,19 @@ static int fjes_probe(struct platform_device *plat_dev)

  adapter->interrupt_watch_enable = false;
res = platform_get_resource(plat_dev, IORESOURCE_MEM, 0);
+if (!res) {
+err = -EINVAL;
+goto err_free_netdev;
+}
+
  hw->hw_res.start = res->start;
  hw->hw_res.size = resource_size(res);
  hw->hw_res.irq = platform_get_irq(plat_dev, 0);
+if (hw->hw_res.irq <= 0) {


   This function no longer returns 0 on error, no need to check for <= 0.


+err = hw->hw_res.irq ? hw->hw_res.irq : -ENODEV;
+goto err_free_netdev;


   gcc allows a shorter way to write that.

err = hw->hw_res.irq ?: -ENODEV;

Yes, you are right, but the first form is more readable. That's why I used it.
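
To make the two spellings discussed here concrete, the sketch below puts them side by side in userspace C. The function names irq_error_explicit() and irq_error_short() are hypothetical; both are meant to be called only on the failure path, that is, with irq <= 0. The short form relies on the GNU C conditional with omitted middle operand, so it is guarded by __GNUC__.

```c
#include <errno.h>

/* Explicit form used in the patch: map 0 to -ENODEV, pass negatives through. */
static int irq_error_explicit(int irq)
{
	return irq ? irq : -ENODEV;
}

#ifdef __GNUC__
/*
 * GNU C shorthand suggested in the review: "a ?: b" behaves like
 * "a ? a : b" with "a" evaluated only once. It is not part of ISO C.
 */
static int irq_error_short(int irq)
{
	return irq ?: -ENODEV;
}
#endif
```

Both helpers return the same value for every input; the choice between them is purely one of readability versus brevity.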



+}
+
  err = fjes_hw_init(&adapter->hw);
  if (err)
  goto err_free_netdev;


MBR, Sergei

Thanks,
~arvind

Re: [PATCH net v2 2/3] xfrm: Add an activate() offload dev op

2017-12-02 Thread Shannon Nelson


On 12/2/2017 2:33 PM, Yossi Kuperman wrote:




On 1 Dec 2017, at 9:09, Steffen Klassert  wrote:

On Tue, Nov 28, 2017 at 07:55:41PM +0200, av...@mellanox.com wrote:
From: Aviv Heller 

Adding the state to the offload device prior to replay init in
xfrm_state_construct() will result in NULL dereference if a matching
ESP packet is received in between.

In order to inhibit driver offload logic from processing the state's
packets prior to the xfrm_state object being completely initialized and
added to the SADBs, a new activate() operation was added to inform the
driver the aforementioned conditions have been met.


We discussed this already some time ago, and I still think that
we should fix this by setting XFRM_STATE_VALID only after the
state is fully initialized.


An upcoming patch will refactor the if statement (encap_type < 0) in 
xfrm_input in order to support crypto offload with GRO disabled, which 
currently doesn’t work. This entails yet another check for the validity of 
the state, resulting in a total of 3 copies: 1) for normal traffic, 2) GRO 
and 3) crypto offload.

Anyway, IMO it is not right that we (the driver) allow an incoming packet to be 
delivered while the SA is not yet ready. Rather than checking for an invalid 
input I prefer to make sure that such a case won’t happen in the first place.

To complete the picture, there is another patch to the driver which simply drops 
incoming packets that underwent successful decryption but whose SA hasn’t been 
activated yet. Active state merely means that the SA is present in the driver’s 
hash table.

We can make a separate patch to set the state to valid once it is fully 
initialized; it makes sense on its own.

What do you think?



If the SA isn't ready, just don't tell the driver about it.  Please 
don't add yet another state for the driver to track.  This should be as 
simple as possible, and shouldn't be any more complex than the model 
already used by ndo_vlan_rx_add_vid and ndo_vlan_rx_kill_vid.
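
One way to picture the "just don't tell the driver" approach is the userspace sketch below. Everything in it is hypothetical (fake_sa, fake_dev, the fixed-size table and all function names); it is not the xfrm or driver API. It only shows the ordering: the state is fully initialized first and handed to the device table last, so a lookup can never return a half-built entry.

```c
#include <stdbool.h>
#include <stddef.h>

#define SA_SLOTS 8	/* hypothetical table size */

/* Hypothetical security association: an SPI plus an "init finished" marker. */
struct fake_sa {
	unsigned int spi;
	bool replay_ready;
};

/* Hypothetical device-side table of offloaded SAs. */
struct fake_dev {
	struct fake_sa *table[SA_SLOTS];
};

/* Receive-path lookup: NULL means the device knows nothing about this SPI. */
static struct fake_sa *fake_dev_lookup(struct fake_dev *dev, unsigned int spi)
{
	for (size_t i = 0; i < SA_SLOTS; i++)
		if (dev->table[i] && dev->table[i]->spi == spi)
			return dev->table[i];
	return NULL;
}

/* Make an SA visible to the device; returns -1 when the table is full. */
static int fake_dev_add(struct fake_dev *dev, struct fake_sa *sa)
{
	for (size_t i = 0; i < SA_SLOTS; i++) {
		if (!dev->table[i]) {
			dev->table[i] = sa;
			return 0;
		}
	}
	return -1;
}

/* Finish all initialization first, and only then publish to the device. */
static int fake_sa_construct(struct fake_dev *dev, struct fake_sa *sa,
			     unsigned int spi)
{
	sa->spi = spi;
	sa->replay_ready = true;
	return fake_dev_add(dev, sa);
}
```

With this ordering the device needs no extra "activated" state: an SA is either absent from the table or present and complete.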


sln

Re: [PATCH net-next] openvswitch: do not propagate headroom updates to internal port

2017-12-02 Thread David Miller

From: Paolo Abeni 
Date: Thu, 30 Nov 2017 15:35:33 +0100

> After commit 3a927bc7cf9d ("ovs: propagate per dp max headroom to
> all vports") the need_headroom for the internal vport is updated
> accordingly to the max needed headroom in its datapath.
> 
> That avoids the pskb_expand_head() costs when sending/forwarding
> packets towards tunnel devices, at least for some scenarios.
> 
> We still require such copy when using the ovs-preferred configuration
> for vxlan tunnels:
> 
> br_int
>   /   \
> tap  vxlan
>(remote_ip:X)
> 
> br_phy
>  \
> NIC
> 
> where the route towards the IP 'X' is via 'br_phy'.
> 
> When forwarding traffic from the tap towards the vxlan device, we
> will call pskb_expand_head() in vxlan_build_skb() because
> br-phy->needed_headroom is equal to tun->needed_headroom.
> 
> With this change we avoid updating the internal vport needed_headroom,
> so that in the above scenario no head copy is needed, giving 5%
> performance improvement in UDP throughput test.
> 
> As a trade-off, packets sent from the internal port towards a tunnel
> device will now experience the head copy overhead. The rationale is
> that the latter use-case is less relevant performance-wise.
> 
> Signed-off-by: Paolo Abeni 

Applied, thanks.

Re: [PATCH net-next 0/4] net: dsa: simplify switchdev prepare phase

2017-12-02 Thread David Miller

From: Vivien Didelot 
Date: Thu, 30 Nov 2017 11:23:56 -0500

> This patch series brings no functional changes.
> 
> It removes the unused switchdev_trans arguments from the dsa_switch_ops
> for both MDB and VLAN operations, and provides functions to prepare and
> add these objects for a given bitmap of ports.

Series applied.

Re: [PATCH net 0/4] bnxt_en: Fixes.

2017-12-02 Thread David Miller

From: Michael Chan 
Date: Fri,  1 Dec 2017 03:13:01 -0500

> A shutdown fix for SMARTNIC, 2 fixes related to TC Flower vxlan
> filters, and the last one fixes an out-of-scope variable when sending
> short firmware messages.

Series applied, thanks Michael.

Re: [PATCH 07/10] net: ethernet: smsc: Handle return value of platform_get_irq

2017-12-02 Thread Sergei Shtylyov


On 12/02/2017 10:26 PM, Arvind Yadav wrote:


platform_get_irq() can fail here and we must check its return value.

Signed-off-by: Arvind Yadav 
---
  drivers/net/ethernet/smsc/smc911x.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/drivers/net/ethernet/smsc/smc911x.c b/drivers/net/ethernet/smsc/smc911x.c
index 0515744..a1cf18c 100644
--- a/drivers/net/ethernet/smsc/smc911x.c
+++ b/drivers/net/ethernet/smsc/smc911x.c
@@ -2088,6 +2088,11 @@ static int smc911x_drv_probe(struct platform_device *pdev)
  
  	ndev->dma = (unsigned char)-1;

ndev->irq = platform_get_irq(pdev, 0);
+   if (ndev->irq <= 0) {
+   ret = ndev->irq ? ndev->irq : -ENODEV;


   Same comments as the next patch...


+   goto release_both;
+   }
+
lp = netdev_priv(ndev);
lp->netdev = ndev;
  #ifdef SMC_DYNAMIC_BUS_CONFIG


MBR, Sergei

Re: [PATCH 05/10] net: ethernet: i825xx: Fix platform_get_irq's error checking

2017-12-02 Thread arvindY


Hi Sergei,

On Sunday 03 December 2017 01:38 AM, Sergei Shtylyov wrote:

Hello.

On 12/02/2017 10:26 PM, Arvind Yadav wrote:


The platform_get_irq() function returns a negative value if an error occurs,
and zero or a positive number on success. Checking the platform_get_irq()
return value only against zero is not correct.


   Then why do you consider returning 0 a sign of failure?

Here, returning 0 is a problem, because IRQ0 is always a problem.
This function gets an IRQ for a device, so we should also check for 0.



Signed-off-by: Arvind Yadav 
---
  drivers/net/ethernet/i825xx/sni_82596.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/i825xx/sni_82596.c b/drivers/net/ethernet/i825xx/sni_82596.c

index b2c04a7..a6d56f5 100644
--- a/drivers/net/ethernet/i825xx/sni_82596.c
+++ b/drivers/net/ethernet/i825xx/sni_82596.c
@@ -120,9 +120,10 @@ static int sni_82596_probe(struct platform_device *dev)

  netdevice->dev_addr[5] = readb(eth_addr + 0x06);
  iounmap(eth_addr);
  -if (!netdevice->irq) {
+if (netdevice->irq <= 0) {
  printk(KERN_ERR "%s: IRQ not found for i82596 at 0x%lx\n",
  __FILE__, netdevice->base_addr);
+retval = netdevice->irq ? netdevice->irq : -ENODEV;
  goto probe_failed;
  }


MBR, Sergei

~arvind

[PATCH net-next 5/5] net: phy: realtek: add utility functions to read/write page addresses

2017-12-02 Thread Martin Blumenstingl

Realtek PHYs implement the concept of so-called "extension pages". The
reason for this is probably because these PHYs expose more registers
than available in the standard address range.
After all read/write operations on such a page are done the driver
should switch back to page 0 where the standard MII registers (such as
MII_BMCR) are available.

When referring to such a register the datasheets of RTL8211E and
RTL8211F always specify:
- the page / "ext. page" which has to be written to RTL821x_PAGE_SELECT
- an address (sometimes also called reg)

These new utility functions make the existing code easier to read since
it removes some duplication (switching back to page 0 is done within the
new helpers for example).

No functional changes are intended.
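
As a rough userspace model of the paged access scheme described above (not the driver code itself), the sketch below simulates a PHY with a handful of register pages. fake_phy, FAKE_PAGES, FAKE_REGS and all function names are hypothetical; only the page-select register number 0x1f is taken from the driver. Real page numbers such as 0xa43 would not fit this small table.

```c
#include <stdint.h>

#define PAGE_SELECT_REG	0x1f	/* same value as RTL821x_PAGE_SELECT */
#define FAKE_PAGES	4	/* hypothetical: real PHYs use other page numbers */
#define FAKE_REGS	32

/* Hypothetical simulated PHY: one register bank per page. */
struct fake_phy {
	uint16_t page;
	uint16_t regs[FAKE_PAGES][FAKE_REGS];
};

/* Write a register in the current page; the page-select register switches pages. */
static int fake_phy_write(struct fake_phy *phy, uint8_t reg, uint16_t val)
{
	if (reg >= FAKE_REGS)
		return -1;
	if (reg == PAGE_SELECT_REG) {
		if (val >= FAKE_PAGES)
			return -1;
		phy->page = val;
		return 0;
	}
	phy->regs[phy->page][reg] = val;
	return 0;
}

/* Read a register from the current page; negative on error. */
static int fake_phy_read(struct fake_phy *phy, uint8_t reg)
{
	if (reg >= FAKE_REGS)
		return -1;
	if (reg == PAGE_SELECT_REG)
		return phy->page;
	return phy->regs[phy->page][reg];
}

/* Select the page, read, then always switch back to page 0. */
static int paged_read(struct fake_phy *phy, uint16_t page, uint8_t reg)
{
	int ret = fake_phy_write(phy, PAGE_SELECT_REG, page);

	if (ret)
		return ret;
	ret = fake_phy_read(phy, reg);
	fake_phy_write(phy, PAGE_SELECT_REG, 0);
	return ret;
}

/* Select the page, write, then always switch back to page 0. */
static int paged_write(struct fake_phy *phy, uint16_t page, uint8_t reg,
		       uint16_t val)
{
	int ret = fake_phy_write(phy, PAGE_SELECT_REG, page);

	if (ret)
		return ret;
	ret = fake_phy_write(phy, reg, val);
	fake_phy_write(phy, PAGE_SELECT_REG, 0);
	return ret;
}
```

Because the helpers restore page 0 themselves, callers can no longer forget to switch back, which is the duplication the patch removes.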

Signed-off-by: Martin Blumenstingl 
---
 drivers/net/phy/realtek.c | 83 ++-
 1 file changed, 53 insertions(+), 30 deletions(-)

diff --git a/drivers/net/phy/realtek.c b/drivers/net/phy/realtek.c
index d6868e8daaab..5416ec5af042 100644
--- a/drivers/net/phy/realtek.c
+++ b/drivers/net/phy/realtek.c
@@ -41,6 +41,39 @@ MODULE_DESCRIPTION("Realtek PHY driver");
 MODULE_AUTHOR("Johnson Leung");
 MODULE_LICENSE("GPL");
 
+static int rtl8211x_page_read(struct phy_device *phydev, u16 page, u16 address)
+{
+   int ret;
+
+   ret = phy_write(phydev, RTL821x_PAGE_SELECT, page);
+   if (ret)
+   return ret;
+
+   ret = phy_read(phydev, address);
+
+   /* restore to default page 0 */
+   phy_write(phydev, RTL821x_PAGE_SELECT, 0x0);
+
+   return ret;
+}
+
+static int rtl8211x_page_write(struct phy_device *phydev, u16 page,
+  u16 address, u16 val)
+{
+   int ret;
+
+   ret = phy_write(phydev, RTL821x_PAGE_SELECT, page);
+   if (ret)
+   return ret;
+
+   ret = phy_write(phydev, address, val);
+
+   /* restore to default page 0 */
+   phy_write(phydev, RTL821x_PAGE_SELECT, 0x0);
+
+   return ret;
+}
+
 static int rtl8201_ack_interrupt(struct phy_device *phydev)
 {
int err;
@@ -63,31 +96,21 @@ static int rtl8211f_ack_interrupt(struct phy_device *phydev)
 {
int err;
 
-   phy_write(phydev, RTL821x_PAGE_SELECT, 0xa43);
-   err = phy_read(phydev, RTL8211F_INSR);
-   /* restore to default page 0 */
-   phy_write(phydev, RTL821x_PAGE_SELECT, 0x0);
+   err = rtl8211x_page_read(phydev, 0xa43, RTL8211F_INSR);
 
return (err < 0) ? err : 0;
 }
 
 static int rtl8201_config_intr(struct phy_device *phydev)
 {
-   int err;
-
-   /* switch to page 7 */
-   phy_write(phydev, RTL821x_PAGE_SELECT, 0x7);
+   u16 val;
 
if (phydev->interrupts == PHY_INTERRUPT_ENABLED)
-   err = phy_write(phydev, RTL8201F_IER,
-   BIT(13) | BIT(12) | BIT(11));
+   val = BIT(13) | BIT(12) | BIT(11);
else
-   err = phy_write(phydev, RTL8201F_IER, 0);
+   val = 0;
 
-   /* restore to default page 0 */
-   phy_write(phydev, RTL821x_PAGE_SELECT, 0x0);
-
-   return err;
+   return rtl8211x_page_write(phydev, 0x7, RTL8201F_IER, val);
 }
 
 static int rtl8211b_config_intr(struct phy_device *phydev)
@@ -118,41 +141,41 @@ static int rtl8211e_config_intr(struct phy_device *phydev)
 
 static int rtl8211f_config_intr(struct phy_device *phydev)
 {
-   int err;
+   u16 val;
 
-   phy_write(phydev, RTL821x_PAGE_SELECT, 0xa42);
if (phydev->interrupts == PHY_INTERRUPT_ENABLED)
-   err = phy_write(phydev, RTL821x_INER,
-   RTL8211F_INER_LINK_STATUS);
+   val = RTL8211F_INER_LINK_STATUS;
else
-   err = phy_write(phydev, RTL821x_INER, 0);
-   phy_write(phydev, RTL821x_PAGE_SELECT, 0);
+   val = 0;
 
-   return err;
+   return rtl8211x_page_write(phydev, 0xa42, RTL821x_INER, val);
 }
 
 static int rtl8211f_config_init(struct phy_device *phydev)
 {
int ret;
-   u16 reg;
+   u16 val;
 
ret = genphy_config_init(phydev);
if (ret < 0)
return ret;
 
-   phy_write(phydev, RTL821x_PAGE_SELECT, 0xd08);
-   reg = phy_read(phydev, 0x11);
+   ret = rtl8211x_page_read(phydev, 0xd08, 0x11);
+   if (ret < 0)
+   return ret;
+
+   val = ret & 0xffff;
 
/* enable TX-delay for rgmii-id and rgmii-txid, otherwise disable it */
if (phydev->interface == PHY_INTERFACE_MODE_RGMII_ID ||
phydev->interface == PHY_INTERFACE_MODE_RGMII_TXID)
-   reg |= RTL8211F_TX_DELAY;
+   val |= RTL8211F_TX_DELAY;
else
-   reg &= ~RTL8211F_TX_DELAY;
+   val &= ~RTL8211F_TX_DELAY;
 
-   phy_write(phydev, 0x11, reg);
-   /* restore to default page 0 */
-   phy_write(phydev, RTL821x_PAGE_SELECT, 0x0);
+   ret =

[PATCH net-next 3/5] net: phy: realtek: group all register bit #defines for RTL821x_INER

2017-12-02 Thread Martin Blumenstingl

This simply moves all register bit #defines which describe the (PHY
specific) bits in the RTL821x_INER right below the RTL821x_INER register
definition. This makes it easier to spot which registers and bits belong
together.
No functional changes.

Signed-off-by: Martin Blumenstingl 
---
 drivers/net/phy/realtek.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/phy/realtek.c b/drivers/net/phy/realtek.c
index 59f0688e4d28..da263a92d6b1 100644
--- a/drivers/net/phy/realtek.c
+++ b/drivers/net/phy/realtek.c
@@ -20,13 +20,16 @@
 #define RTL821x_PHYSR  0x11
 #define RTL821x_PHYSR_DUPLEX   BIT(13)
 #define RTL821x_PHYSR_SPEED    GENMASK(15, 14)
+
 #define RTL821x_INER   0x12
 #define RTL8211B_INER_INIT 0x6400
+#define RTL8211E_INER_LINK_STATUS  BIT(10)
+#define RTL8211F_INER_LINK_STATUS  BIT(4)
+
 #define RTL821x_INSR   0x13
+
 #define RTL821x_PAGE_SELECT    0x1f
-#define RTL8211E_INER_LINK_STATUS  BIT(10)
 
-#define RTL8211F_INER_LINK_STATUS  BIT(4)
 #define RTL8211F_INSR  0x1d
 #define RTL8211F_TX_DELAY  BIT(8)
 
-- 
2.15.1

[PATCH net-next 2/5] net: phy: realtek: rename RTL821x_INER_INIT to RTL8211B_INER_INIT

2017-12-02 Thread Martin Blumenstingl

This macro is only used by the RTL8211B code. RTL8211E and RTL8211F both
use other bits to initialize the RTL821x_INER register.
No functional changes.

Signed-off-by: Martin Blumenstingl 
---
 drivers/net/phy/realtek.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/phy/realtek.c b/drivers/net/phy/realtek.c
index 9708aa9c58dd..59f0688e4d28 100644
--- a/drivers/net/phy/realtek.c
+++ b/drivers/net/phy/realtek.c
@@ -21,7 +21,7 @@
 #define RTL821x_PHYSR_DUPLEX   BIT(13)
 #define RTL821x_PHYSR_SPEED    GENMASK(15, 14)
 #define RTL821x_INER   0x12
-#define RTL821x_INER_INIT  0x6400
+#define RTL8211B_INER_INIT 0x6400
 #define RTL821x_INSR   0x13
 #define RTL821x_PAGE_SELECT    0x1f
 #define RTL8211E_INER_LINK_STATUS  BIT(10)
@@ -92,7 +92,7 @@ static int rtl8211b_config_intr(struct phy_device *phydev)
 
if (phydev->interrupts == PHY_INTERRUPT_ENABLED)
err = phy_write(phydev, RTL821x_INER,
-   RTL821x_INER_INIT);
+   RTL8211B_INER_INIT);
else
err = phy_write(phydev, RTL821x_INER, 0);
 
-- 
2.15.1

Re: [PATCH net v2 2/3] xfrm: Add an activate() offload dev op

2017-12-02 Thread Yossi Kuperman

>> On 1 Dec 2017, at 9:09, Steffen Klassert  wrote:
>> 
>> On Tue, Nov 28, 2017 at 07:55:41PM +0200, av...@mellanox.com wrote:
>> From: Aviv Heller 
>> 
>> Adding the state to the offload device prior to replay init in
>> xfrm_state_construct() will result in NULL dereference if a matching
>> ESP packet is received in between.
>> 
>> In order to inhibit driver offload logic from processing the state's
>> packets prior to the xfrm_state object being completely initialized and
>> added to the SADBs, a new activate() operation was added to inform the
>> driver the aforementioned conditions have been met.
> 
> We discussed this already some time ago, and I still think that
> we should fix this by setting XFRM_STATE_VALID only after the
> state is fully initialized.

An upcoming patch will refactor the if statement (encap_type < 0) in 
xfrm_input in order to support crypto offload with GRO disabled, which 
currently doesn’t work. This entails yet another check for the validity of 
the state, resulting in a total of 3 copies: 1) for normal traffic, 2) GRO 
and 3) crypto offload.

Anyway, IMO it is not right that we (the driver) allow an incoming packet to be 
delivered while the SA is not yet ready. Rather than checking for an invalid 
input I prefer to make sure that such a case won’t happen in the first place.

To complete the picture, there is another patch to the driver which simply drops 
incoming packets that underwent successful decryption but whose SA hasn’t been 
activated yet. Active state merely means that the SA is present in the driver’s 
hash table.

We can make a separate patch to set the state to valid once it is fully 
initialized; it makes sense on its own.

What do you think?

Re: [PATCH net 1/2] netlink: add NLA_U8_BUGGY attribute type

2017-12-02 Thread David Ahern

On 12/2/17 1:23 PM, Johannes Berg wrote:
> From: Johannes Berg 
> 
> This netlink type is used only for backwards compatibility
> with broken userspace that used the wrong size for a given
> u8 attribute, which is now rejected. It would've been wrong
> before already, since on big endian the wrong value (always
> zero) would be used by the kernel, but we can't break the
> existing deployed userspace - hostapd for example now fails
> to initialize entirely.
> 
> We could try to fix up the big endian problem here, but we
> don't know *how* userspace misbehaved - if using nla_put_u32
> then we could, but we also found a debug tool (which we'll
> ignore for the purposes of this regression) that was putting
> the padding into the length.
> 
> Fixes: 28033ae4e0f5 ("net: netlink: Update attr validation to require exact length for some types")
> Signed-off-by: Johannes Berg 

Hi Johannes:

I have been really busy the past 2 weeks, so have not gotten around to
dealing with this. I was planning to partially revert 28033ae4e0f5 --
change it from failure to log an error message so buggy commands can be
fixed.

David
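
The big-endian remark in the quoted commit message can be checked with a small userspace sketch. It assumes, purely for illustration, that the broken sender serialized the value as a 32-bit integer and that the receiver reads a u8 from the first payload byte; the helper names and the two length checks are hypothetical and do not reproduce the netlink validation code.

```c
#include <stddef.h>
#include <stdint.h>

/* Serialize a 32-bit value in little-endian byte order. */
static void put_u32_le(uint8_t buf[4], uint32_t v)
{
	buf[0] = (uint8_t)(v & 0xff);
	buf[1] = (uint8_t)((v >> 8) & 0xff);
	buf[2] = (uint8_t)((v >> 16) & 0xff);
	buf[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Serialize a 32-bit value in big-endian byte order. */
static void put_u32_be(uint8_t buf[4], uint32_t v)
{
	buf[0] = (uint8_t)((v >> 24) & 0xff);
	buf[1] = (uint8_t)((v >> 16) & 0xff);
	buf[2] = (uint8_t)((v >> 8) & 0xff);
	buf[3] = (uint8_t)(v & 0xff);
}

/* What a u8 reader sees: only the first byte of the payload. */
static uint8_t read_u8(const uint8_t *payload)
{
	return payload[0];
}

/* Hypothetical strict check: a u8 attribute must be exactly one byte long. */
static int len_ok_exact(size_t len)
{
	return len == 1;
}

/* Hypothetical lenient check: anything with at least one byte is accepted. */
static int len_ok_min(size_t len)
{
	return len >= 1;
}
```

For a small value such as 1, the little-endian layout happens to put the value in the first byte, while the big-endian layout puts a zero there, which is the "always zero" case mentioned in the commit message.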

Re: [PATCH net-next v3 0/8] xdp: make stack perform remove and add selftests

2017-12-02 Thread Daniel Borkmann

On 12/02/2017 12:08 AM, Jakub Kicinski wrote:
> Hi!
> 
> The purpose of this series is to add a software model of BPF offloads
> to make it easier for everyone to test them and make some of the more
> arcane rules and assumptions more clear.
> 
> The series starts with 3 patches aiming to make XDP handling in the
> drivers less error prone.  Currently driver authors have to remember
> to free XDP programs if XDP is active during unregister.  With this
> series the core will disable XDP on its own.  It will take place
> after close, drivers are not expected to perform reconfiguration
> when disabling XDP on a downed device.
> 
> Next two patches add the software netdev driver, followed by a python
> test which exercises all the corner cases which came to my mind.
> 
> Test needs to be run as root.  It will print basic information to
> stdout, but can also create a more detailed log of all commands
> when --log option is passed.  Log is in Emacs Org-mode format.
> 
>   ./tools/testing/selftests/bpf/test_offload.py --log /tmp/log
> 
> Last two patches replace the SR-IOV API implementation of dummy.
> 
> v3:
>  - move the freeing of vfs to release (Phil).
> v2:
>  - free device from the release function;
>  - use bus-based name generation instead of netdev name.
> v1:
>  - replace the SR-IOV API implementation of dummy;
>  - make the dev_xdp_uninstall() also handle the XDP generic (Daniel).

Series applied to bpf-next, thanks Jakub!

Re: Fixing CVE-2017-16939 in v4.4.y and possibly v3.18.y

2017-12-02 Thread Guenter Roeck


On 12/01/2017 11:48 AM, Michal Kubecek wrote:

On Thu, Nov 30, 2017 at 10:37:40AM -0800, Guenter Roeck wrote:

Hi,

The fix for CVE-2017-16939 has been applied to v4.9.y, but not to v4.4.y
and older kernels. However, I confirmed that running the published POC
(see https://blogs.securiteam.com/index.php/archives/3535) does crash a 4.4
kernel.

I confirmed that the following two patches fix the problem in v4.4.y.
Please consider applying them to v4.4.y (and possibly v3.18.y).

fc9e50f5a5a4e ("netlink: add a start callback for starting a netlink dump")
1137b5e2529a8 ("ipsec: Fix aborted xfrm policy dump crash")

My apologies for the noise if this is already under consideration.


It's a bit too big hammer. As Nicolai Stange noticed when we were


The hammer is just as big as the upstream hammer. Personally I prefer the
upstream patch; I don't see a reason to deviate from upstream just because
the upstream solution is more complex than necessary.


handling this for SLE12 (where fc9e50f5a5a4e would break kABI), it's


I didn't know that this is even a concern for stable releases. Is there
some guideline that kABI changes should be avoided in stable releases?

Thanks,
Guenter


much simpler to use the flag we already have in cb->args[0] to let
xfrm_dump_policy_done() call xfrm_policy_walk_done() only if the walk
structure has been initialized. Thus all you need is the patch below.

Michal Kubecek

diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index 7a5a64e70b4d..c01c7a7eb4d3 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -1655,7 +1655,9 @@ static int xfrm_dump_policy_done(struct netlink_callback *cb)
struct xfrm_policy_walk *walk = (struct xfrm_policy_walk *) &cb->args[1];
struct net *net = sock_net(cb->skb->sk);
  
-	xfrm_policy_walk_done(walk, net);

+   /* cb->args[0] is set when walk is initialized */
+   if (cb->args[0])
+   xfrm_policy_walk_done(walk, net);
return 0;
  }
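
The idea behind the smaller fix can be modelled in userspace as follows. This is a sketch under assumptions: fake_cb, fake_walk and the function names are hypothetical, and the only detail taken from the patch is that args[0] doubles as the "walk has been initialized" flag.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the policy walk state. */
struct fake_walk {
	bool on_list;
};

/* Hypothetical stand-in for the netlink callback context. */
struct fake_cb {
	long args[2];		/* args[0]: set once the walk is initialized */
	struct fake_walk walk;
	int walk_done_calls;	/* instrumentation for this sketch only */
};

static void fake_walk_init(struct fake_walk *w)
{
	w->on_list = true;
}

static void fake_walk_done(struct fake_cb *cb)
{
	cb->walk.on_list = false;
	cb->walk_done_calls++;
}

/* First dump call initializes the walk and records that in args[0]. */
static void fake_dump(struct fake_cb *cb)
{
	if (!cb->args[0]) {
		cb->args[0] = 1;
		fake_walk_init(&cb->walk);
	}
}

/* Done callback: tear the walk down only if it was ever set up. */
static int fake_dump_done(struct fake_cb *cb)
{
	if (cb->args[0])
		fake_walk_done(cb);
	return 0;
}
```

An aborted dump reaches the done callback without ever having run the dump function, so the guard is what keeps the teardown from touching uninitialized state.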

[PATCH 05/10] net: ethernet: i825xx: Fix platform_get_irq's error checking

2017-12-02 Thread Arvind Yadav

The platform_get_irq() function returns a negative value if an error occurs,
and zero or a positive number on success. Checking the platform_get_irq()
return value only against zero is not correct.

Signed-off-by: Arvind Yadav 
---
 drivers/net/ethernet/i825xx/sni_82596.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/i825xx/sni_82596.c b/drivers/net/ethernet/i825xx/sni_82596.c
index b2c04a7..a6d56f5 100644
--- a/drivers/net/ethernet/i825xx/sni_82596.c
+++ b/drivers/net/ethernet/i825xx/sni_82596.c
@@ -120,9 +120,10 @@ static int sni_82596_probe(struct platform_device *dev)
netdevice->dev_addr[5] = readb(eth_addr + 0x06);
iounmap(eth_addr);
 
-   if (!netdevice->irq) {
+   if (netdevice->irq <= 0) {
printk(KERN_ERR "%s: IRQ not found for i82596 at 0x%lx\n",
__FILE__, netdevice->base_addr);
+   retval = netdevice->irq ? netdevice->irq : -ENODEV;
goto probe_failed;
}
 
-- 
2.7.4

[PATCH 01/10] net: bcmgenet: Fix platform_get_irq's error checking

2017-12-02 Thread Arvind Yadav

The platform_get_irq() function returns a negative value if an error occurs,
and zero or a positive number on success. Checking the platform_get_irq()
return value only against zero is not correct.

Signed-off-by: Arvind Yadav 
---
 drivers/net/ethernet/broadcom/genet/bcmgenet.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
index 24b4f4c..e2f1268 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
@@ -3371,7 +3371,7 @@ static int bcmgenet_probe(struct platform_device *pdev)
priv->irq0 = platform_get_irq(pdev, 0);
priv->irq1 = platform_get_irq(pdev, 1);
priv->wol_irq = platform_get_irq(pdev, 2);
-   if (!priv->irq0 || !priv->irq1) {
+   if (priv->irq0 <= 0 || priv->irq1 <= 0 || priv->wol_irq <= 0) {
dev_err(&pdev->dev, "can't find IRQs\n");
err = -EINVAL;
goto err;
-- 
2.7.4

[PATCH 04/10] can: xilinx: Handle return value of platform_get_irq

2017-12-02 Thread Arvind Yadav

platform_get_irq() can fail here and we must check its return value.

Signed-off-by: Arvind Yadav 
---
 drivers/net/can/xilinx_can.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
index 89aec07..e36e2a2 100644
--- a/drivers/net/can/xilinx_can.c
+++ b/drivers/net/can/xilinx_can.c
@@ -,6 +,10 @@ static int xcan_probe(struct platform_device *pdev)
 
/* Get IRQ for the device */
ndev->irq = platform_get_irq(pdev, 0);
+   if (ndev->irq <= 0) {
+   ret = ndev->irq ? ndev->irq : -ENODEV;
+   goto err_free;
+   }
ndev->flags |= IFF_ECHO;/* We support local echo */
 
platform_set_drvdata(pdev, ndev);
-- 
2.7.4

[PATCH 06/10] net: ethernet: natsemi: Handle return value of platform_get_irq

2017-12-02 Thread Arvind Yadav

platform_get_irq() can fail here and we must check its return value.

Signed-off-by: Arvind Yadav 
---
 drivers/net/ethernet/natsemi/jazzsonic.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/ethernet/natsemi/jazzsonic.c b/drivers/net/ethernet/natsemi/jazzsonic.c
index d5b2888..3cf0856 100644
--- a/drivers/net/ethernet/natsemi/jazzsonic.c
+++ b/drivers/net/ethernet/natsemi/jazzsonic.c
@@ -242,6 +242,11 @@ static int jazz_sonic_probe(struct platform_device *pdev)
 
dev->base_addr = res->start;
dev->irq = platform_get_irq(pdev, 0);
+   if (dev->irq <= 0) {
+   err = dev->irq ? dev->irq : -ENODEV;
+   goto out;
+   }
+
err = sonic_probe1(dev);
if (err)
goto out;
-- 
2.7.4

[PATCH 03/10] net: ezchip: nps_enet: Fix platform_get_irq's error checking

2017-12-02 Thread Arvind Yadav

The platform_get_irq() function returns a negative value if an error occurs,
and zero or a positive number on success. Checking the platform_get_irq()
return value only against zero is not correct. Also remove the unnecessary
check before free_netdev().

Signed-off-by: Arvind Yadav 
---
 drivers/net/ethernet/ezchip/nps_enet.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/ezchip/nps_enet.c b/drivers/net/ethernet/ezchip/nps_enet.c
index 659f1ad..82dc6d0 100644
--- a/drivers/net/ethernet/ezchip/nps_enet.c
+++ b/drivers/net/ethernet/ezchip/nps_enet.c
@@ -623,9 +623,9 @@ static s32 nps_enet_probe(struct platform_device *pdev)
 
/* Get IRQ number */
priv->irq = platform_get_irq(pdev, 0);
-   if (!priv->irq) {
+   if (priv->irq <= 0) {
dev_err(dev, "failed to retrieve  value from device tree\n");
-   err = -ENODEV;
+   err = priv->irq ? priv->irq : -ENODEV;
goto out_netdev;
}
 
@@ -646,8 +646,7 @@ static s32 nps_enet_probe(struct platform_device *pdev)
 out_netif_api:
netif_napi_del(&priv->napi);
 out_netdev:
-   if (err)
-   free_netdev(ndev);
+   free_netdev(ndev);
 
return err;
 }
-- 
2.7.4

[PATCH 02/10] net: bcmgenet: free netdev on of_match_node() error

2017-12-02 Thread Arvind Yadav

Call free_netdev() if of_match_node() fails.

Signed-off-by: Arvind Yadav 
---
 drivers/net/ethernet/broadcom/genet/bcmgenet.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
index e2f1268..e0a8f79 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
@@ -3363,8 +3363,10 @@ static int bcmgenet_probe(struct platform_device *pdev)
 
if (dn) {
of_id = of_match_node(bcmgenet_match, dn);
-   if (!of_id)
-   return -EINVAL;
+   if (!of_id) {
+   err = -EINVAL;
+   goto err;
+   }
}
 
priv = netdev_priv(dev);
-- 
2.7.4

[PATCH 00/10] Handle return value of platform_get_*

2017-12-02 Thread Arvind Yadav

 - The platform_get_*_*() functions return a negative value if an error
   occurs, and zero or a positive number on success. Checking the
   platform_get_irq_byname() return value only against zero is not correct.
 - Call free_netdev() if of_match_node() fails.
 - Handle return value of platform_get_resource()
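
The error-path pattern that the series applies (check each lookup, then jump to a single cleanup label) can be sketched in userspace as below. All names are hypothetical (fake_netdev, fake_probe(), live_netdevs), and the irq <= 0 test simply follows the series as posted, without taking a side in the review discussion about whether 0 should count as an error.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct net_device. */
struct fake_netdev {
	int irq;
};

/* Number of devices allocated and not yet freed; lets a caller detect leaks. */
static int live_netdevs;

static struct fake_netdev *fake_alloc_netdev(void)
{
	struct fake_netdev *dev = calloc(1, sizeof(*dev));

	if (dev)
		live_netdevs++;
	return dev;
}

static void fake_free_netdev(struct fake_netdev *dev)
{
	free(dev);
	live_netdevs--;
}

/*
 * Probe model: match_ok plays the role of a successful of_match_node(),
 * irq the role of the platform_get_irq() return value.
 */
static int fake_probe(int match_ok, int irq, struct fake_netdev **out)
{
	struct fake_netdev *dev = fake_alloc_netdev();
	int err;

	if (!dev)
		return -ENOMEM;

	if (!match_ok) {
		err = -EINVAL;
		goto err;	/* previously a bare return, which leaked dev */
	}

	if (irq <= 0) {
		err = irq ? irq : -ENODEV;
		goto err;
	}

	dev->irq = irq;
	*out = dev;
	return 0;

err:
	fake_free_netdev(dev);
	return err;
}
```

Routing every failure through the same label means each new check added later gets the cleanup for free, which is the motivation for patch 02/10.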

Arvind Yadav (10):
  [PATCH 01/10] net: bcmgenet: Fix platform_get_irq's error checking
  [PATCH 02/10] net: bcmgenet: free netdev on of_match_node() error
  [PATCH 03/10] net: ezchip: nps_enet: Fix platform_get_irq's error checking
  [PATCH 04/10] can: xilinx: Handle return value of platform_get_irq
  [PATCH 05/10] net: ethernet: i825xx: Fix platform_get_irq's error checking
  [PATCH 06/10] net: ethernet: natsemi: Handle return value of platform_get_irq
  [PATCH 07/10] net: ethernet: smsc: Handle return value of platform_get_irq
  [PATCH 08/10] net: fjes: Handle return value of platform_get_irq and platform_get_resource
  [PATCH 09/10] net: ethernet: korina: Handle return value of platform_get_irq_byname
  [PATCH 10/10] net: ethernet: cpmac: Handle return value of platform_get_irq_byname

 drivers/net/can/xilinx_can.c   |  4 
 drivers/net/ethernet/broadcom/genet/bcmgenet.c |  8 +---
 drivers/net/ethernet/ezchip/nps_enet.c |  7 +++
 drivers/net/ethernet/i825xx/sni_82596.c|  3 ++-
 drivers/net/ethernet/korina.c  |  9 +
 drivers/net/ethernet/natsemi/jazzsonic.c   |  5 +
 drivers/net/ethernet/smsc/smc911x.c|  5 +
 drivers/net/ethernet/ti/cpmac.c|  4 
 drivers/net/fjes/fjes_main.c   | 10 ++
 9 files changed, 47 insertions(+), 8 deletions(-)

-- 
2.7.4

Re: [PATCH 05/10] net: ethernet: i825xx: Fix platform_get_irq's error checking

2017-12-02 Thread Sergei Shtylyov


Hello.

On 12/02/2017 10:26 PM, Arvind Yadav wrote:


The platform_get_irq() function returns a negative value if an error occurs,
and zero or a positive number on success. Checking the platform_get_irq()
return value only against zero is not correct.


   Then why do you consider returning 0 a sign of failure?


Signed-off-by: Arvind Yadav 
---
  drivers/net/ethernet/i825xx/sni_82596.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/i825xx/sni_82596.c b/drivers/net/ethernet/i825xx/sni_82596.c
index b2c04a7..a6d56f5 100644
--- a/drivers/net/ethernet/i825xx/sni_82596.c
+++ b/drivers/net/ethernet/i825xx/sni_82596.c
@@ -120,9 +120,10 @@ static int sni_82596_probe(struct platform_device *dev)
netdevice->dev_addr[5] = readb(eth_addr + 0x06);
iounmap(eth_addr);
  
-	if (!netdevice->irq) {

+   if (netdevice->irq <= 0) {
printk(KERN_ERR "%s: IRQ not found for i82596 at 0x%lx\n",
__FILE__, netdevice->base_addr);
+   retval = netdevice->irq ? netdevice->irq : -ENODEV;
goto probe_failed;
}
  


MBR, Sergei

[PATCH net-next 1/5] net: phy: realtek: use the BIT and GENMASK macros

2017-12-02 Thread Martin Blumenstingl

This makes it easier to compare the #defines with the datasheets.
No functional changes.

Signed-off-by: Martin Blumenstingl 
---
 drivers/net/phy/realtek.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/net/phy/realtek.c b/drivers/net/phy/realtek.c
index eda0a6e86918..9708aa9c58dd 100644
--- a/drivers/net/phy/realtek.c
+++ b/drivers/net/phy/realtek.c
@@ -13,21 +13,22 @@
  * option) any later version.
  *
  */
+#include 
 #include 
 #include 
 
 #define RTL821x_PHYSR  0x11
-#define RTL821x_PHYSR_DUPLEX   0x2000
-#define RTL821x_PHYSR_SPEED    0xc000
+#define RTL821x_PHYSR_DUPLEX   BIT(13)
+#define RTL821x_PHYSR_SPEED    GENMASK(15, 14)
 #define RTL821x_INER   0x12
 #define RTL821x_INER_INIT  0x6400
 #define RTL821x_INSR   0x13
 #define RTL821x_PAGE_SELECT    0x1f
-#define RTL8211E_INER_LINK_STATUS 0x400
+#define RTL8211E_INER_LINK_STATUS  BIT(10)
 
-#define RTL8211F_INER_LINK_STATUS 0x0010
+#define RTL8211F_INER_LINK_STATUS  BIT(4)
 #define RTL8211F_INSR  0x1d
-#define RTL8211F_TX_DELAY  0x100
+#define RTL8211F_TX_DELAY  BIT(8)
 
 #define RTL8201F_ISR   0x1e
 #define RTL8201F_IER   0x13
-- 
2.15.1

[PATCH net-next 0/5] Realtek Ethernet PHY driver improvements

2017-12-02 Thread Martin Blumenstingl

This series provides some small improvements and cleanups for the
Realtek Ethernet PHY driver.
None of the patches in this series should change any functionality.
The goal is to make the code a bit easier to read by:
- re-using the BIT and GENMASK macros (which makes it easier to compare
  the #defines in the kernel with the values from the datasheets)
- renaming a #define from a generic name to a PHY-specific name since it's
  only used for one specific PHY
- logically grouping the register #defines and their register bit #defines
  together
- cleaning up the indentation
- removing some duplicated code for reading/writing registers on a
  Realtek specific "page"
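
To illustrate why the BIT and GENMASK conversion is not expected to change behaviour, here is a small userspace sketch. BIT16() and GENMASK16() are simplified stand-ins written for this example, not the kernel's own macro definitions; physr_speed_field() is likewise hypothetical:

```c
#include <assert.h>

/* Simplified stand-ins for the kernel's BIT() and GENMASK() macros. */
#define BIT16(n)		(1u << (n))
#define GENMASK16(h, l)		(((1u << ((h) - (l) + 1)) - 1u) << (l))

/* Extract the two-bit speed field (bits 15:14) from a PHYSR-style value. */
unsigned int physr_speed_field(unsigned int physr)
{
	return (physr & GENMASK16(15, 14)) >> 14;
}
```

With these stand-ins, BIT16(13) equals the old 0x2000 and GENMASK16(15, 14) equals the old 0xc000, and the bit positions can be read straight off a datasheet.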


Martin Blumenstingl (5):
  net: phy: realtek: use the BIT and GENMASK macros
  net: phy: realtek: rename RTL821x_INER_INIT to RTL8211B_INER_INIT
  net: phy: realtek: group all register bit #defines for RTL821x_INER
  net: phy: realtek: use the same indentation for all #defines
  net: phy: realtek: add utility functions to read/write page addresses

 drivers/net/phy/realtek.c | 116 --
 1 file changed, 72 insertions(+), 44 deletions(-)

-- 
2.15.1

[PATCH net-next 4/5] net: phy: realtek: use the same indentation for all #defines

2017-12-02 Thread Martin Blumenstingl

This simply makes the code easier to read. No functional changes.

Signed-off-by: Martin Blumenstingl 
---
 drivers/net/phy/realtek.c | 27 ++-
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/drivers/net/phy/realtek.c b/drivers/net/phy/realtek.c
index da263a92d6b1..d6868e8daaab 100644
--- a/drivers/net/phy/realtek.c
+++ b/drivers/net/phy/realtek.c
@@ -17,24 +17,25 @@
 #include 
 #include 
 
-#define RTL821x_PHYSR  0x11
-#define RTL821x_PHYSR_DUPLEX   BIT(13)
-#define RTL821x_PHYSR_SPEED    GENMASK(15, 14)
+#define RTL821x_PHYSR  0x11
+#define RTL821x_PHYSR_DUPLEX   BIT(13)
+#define RTL821x_PHYSR_SPEED    GENMASK(15, 14)
 
-#define RTL821x_INER   0x12
-#define RTL8211B_INER_INIT 0x6400
-#define RTL8211E_INER_LINK_STATUS  BIT(10)
-#define RTL8211F_INER_LINK_STATUS  BIT(4)
+#define RTL821x_INER   0x12
+#define RTL8211B_INER_INIT 0x6400
+#define RTL8211E_INER_LINK_STATUS  BIT(10)
+#define RTL8211F_INER_LINK_STATUS  BIT(4)
 
-#define RTL821x_INSR   0x13
+#define RTL821x_INSR   0x13
 
-#define RTL821x_PAGE_SELECT    0x1f
+#define RTL821x_PAGE_SELECT    0x1f
 
-#define RTL8211F_INSR  0x1d
-#define RTL8211F_TX_DELAY  BIT(8)
+#define RTL8211F_INSR  0x1d
 
-#define RTL8201F_ISR   0x1e
-#define RTL8201F_IER   0x13
+#define RTL8211F_TX_DELAY  BIT(8)
+
+#define RTL8201F_ISR   0x1e
+#define RTL8201F_IER   0x13
 
 MODULE_DESCRIPTION("Realtek PHY driver");
 MODULE_AUTHOR("Johnson Leung");
-- 
2.15.1

[RfC net-next 1/3] net: phy: realtek: add support for configuring the RX delay on RTL8211F

2017-12-02 Thread Martin Blumenstingl

On RTL8211F the RX delay can also be enabled/disabled.
The overall behavior of the RX delay is similar to the behavior of the
TX delay, which was already supported by the driver.

The RX delay (similar to the TX delay) may be enabled using hardware pin
strapping. If the MAC already configures the RX delay (if required) then
the RX delay generated by the RTL8211F PHY has to be turned off.

While here, update the comment regarding the TX delay why it has to be
enabled or disabled within the driver.
Also avoid code-duplication by extracting the code to mask/unmask bits
in a paged register into a new rtl8211x_page_mask_bits helper function.
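
The mask/unmask step performed by the new helper can be sketched in isolation as a pure function. This is a userspace illustration only; apply_mask_bits() is a hypothetical name, and the real helper additionally reads and writes the paged register around this step:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Clear every bit covered by mask, then set only those bits of set that
 * fall inside mask. Bits outside mask are left untouched.
 */
uint16_t apply_mask_bits(uint16_t old, uint16_t mask, uint16_t set)
{
	uint16_t val = old;

	val &= (uint16_t)~mask;
	val |= (uint16_t)(set & mask);

	return val;
}
```

Passing the enable bit as both mask and set turns the delay on, while passing 0 as set turns it off, which is how the TX and RX delay cases share one code path.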

Signed-off-by: Martin Blumenstingl 
---
 drivers/net/phy/realtek.c | 55 ++-
 1 file changed, 45 insertions(+), 10 deletions(-)

diff --git a/drivers/net/phy/realtek.c b/drivers/net/phy/realtek.c
index 5416ec5af042..d4e7f249a4bc 100644
--- a/drivers/net/phy/realtek.c
+++ b/drivers/net/phy/realtek.c
@@ -32,7 +32,10 @@
 
 #define RTL8211F_INSR  0x1d
 
-#define RTL8211F_TX_DELAY  BIT(8)
+#define RTL8211F_RX_DELAY_REG  0x15
+#define RTL8211F_RX_DELAY_EN   BIT(3)
+#define RTL8211F_TX_DELAY_REG  0x11
+#define RTL8211F_TX_DELAY_EN   BIT(8)
 
 #define RTL8201F_ISR   0x1e
 #define RTL8201F_IER   0x13
@@ -74,6 +77,23 @@ static int rtl8211x_page_write(struct phy_device *phydev, u16 page,
return ret;
 }
 
+static int rtl8211x_page_mask_bits(struct phy_device *phydev, u16 page,
+  u16 address, u16 mask, u16 set)
+{
+   int ret;
+   u16 val;
+
+   ret = rtl8211x_page_read(phydev, page, address);
+   if (ret < 0)
+   return ret;
+
+   val = ret & 0x;
+   val &= ~mask;
+   val |= (set & mask);
+
+   return rtl8211x_page_write(phydev, page, address, val);
+}
+
 static int rtl8201_ack_interrupt(struct phy_device *phydev)
 {
int err;
@@ -160,20 +180,35 @@ static int rtl8211f_config_init(struct phy_device *phydev)
if (ret < 0)
return ret;
 
-   ret = rtl8211x_page_read(phydev, 0xd08, 0x11);
-   if (ret < 0)
-   return ret;
+   /*
+* enable TX-delay for rgmii-id and rgmii-txid, otherwise disable it.
+* this is needed because it can be enabled by pin strapping and
+* conflict with the TX-delay configured by the MAC.
+*/
+   if (phydev->interface == PHY_INTERFACE_MODE_RGMII_ID ||
+   phydev->interface == PHY_INTERFACE_MODE_RGMII_TXID)
+   val = RTL8211F_TX_DELAY_EN;
+   else
+   val = 0;
 
-   val = ret & 0x;
+   ret = rtl8211x_page_mask_bits(phydev, 0xd08, RTL8211F_TX_DELAY_REG,
+ RTL8211F_TX_DELAY_EN, val);
+   if (ret)
+   return ret;
 
-   /* enable TX-delay for rgmii-id and rgmii-txid, otherwise disable it */
+   /*
+* enable RX-delay for rgmii-id and rgmii-rxid, otherwise disable it.
+* this is needed because it can be enabled by pin strapping and
+* conflict with the RX-delay configured by the MAC.
+*/
if (phydev->interface == PHY_INTERFACE_MODE_RGMII_ID ||
-   phydev->interface == PHY_INTERFACE_MODE_RGMII_TXID)
-   val |= RTL8211F_TX_DELAY;
+   phydev->interface == PHY_INTERFACE_MODE_RGMII_RXID)
+   val = RTL8211F_RX_DELAY_EN;
else
-   val &= ~RTL8211F_TX_DELAY;
+   val = 0;
 
-   ret = rtl8211x_page_write(phydev, 0xd08, 0x11, val);
+   ret = rtl8211x_page_mask_bits(phydev, 0xd08, RTL8211F_RX_DELAY_REG,
+ RTL8211F_RX_DELAY_EN, val);
if (ret)
return ret;
 
-- 
2.15.1

Re: netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1'

2017-12-02 Thread Al Viro

On Sat, Dec 02, 2017 at 06:48:50PM +, Al Viro wrote:
> On Fri, Dec 01, 2017 at 09:47:00PM +0100, Daniel Borkmann wrote:
> 
> > > > Might want to replace security_path_mknod() with something saner, while
> > > > we are at it.
> > > 
> > > Objections?
> > 
> > No, thanks for looking into this, and sorry for this fugly hack! :( Not
> > that this doesn't make it any better, but I think back then I took it
> > over from mqueue implementation ... should have known better and looking
> > into making this generic instead, sigh. The above looks good to me, so
> > no objections from my side and thanks for working on it!
> > 
> > > > PS: mqueue.c would also benefit from such primitive - do_create() there
> > > > would simply pass attr as callback's argument into vfs_mkobj(), with
> > > > callback being the guts of mqueue_create()...
> 
> OK...  See vfs.git#untested.mkobj; it really needs testing, though -
> mq_open(2) passes LTP tests, but that's not saying much, and BPF side is
> completely untested.

... and FWIW, completely untested patch for net/netfilter/xt_bpf.c follows:

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index e55e4255a210..a7000e4775e7 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -514,6 +514,9 @@ static inline struct bpf_prog *bpf_prog_get_type(u32 ufd,
return bpf_prog_get_type_dev(ufd, type, false);
 }
 
+struct bpf_prog *bpf_prog_get_type_path(const char *name, enum bpf_prog_type type);
+bool bpf_prog_get_ok(struct bpf_prog *, enum bpf_prog_type *, bool);
+
 int bpf_prog_offload_compile(struct bpf_prog *prog);
 void bpf_prog_offload_destroy(struct bpf_prog *prog);
 
diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c
index 2b75faccc771..9d1050dc2a7a 100644
--- a/kernel/bpf/inode.c
+++ b/kernel/bpf/inode.c
@@ -364,6 +364,45 @@ int bpf_obj_get_user(const char __user *pathname, int flags)
 }
 EXPORT_SYMBOL_GPL(bpf_obj_get_user);
 
+static struct bpf_prog *__get_prog_inode(struct inode *inode, enum bpf_prog_type type)
+{
+   struct bpf_prog *prog;
+   int ret = inode_permission(inode, MAY_READ | MAY_WRITE);
+   if (ret)
+   return ERR_PTR(ret);
+
+   if (inode->i_op == _map_iops)
+   return ERR_PTR(-EINVAL);
+   if (inode->i_op != _prog_iops)
+   return ERR_PTR(-EACCES);
+
+   prog = inode->i_private;
+
+   ret = security_bpf_prog(prog);
+   if (ret < 0)
+   return ERR_PTR(ret);
+
+   if (!bpf_prog_get_ok(prog, , false))
+   return ERR_PTR(-EINVAL);
+
+   return bpf_prog_inc(prog);
+}
+
+struct bpf_prog *bpf_prog_get_type_path(const char *name, enum bpf_prog_type type)
+{
+   struct bpf_prog *prog;
+   struct path path;
+   int ret = kern_path(name, LOOKUP_FOLLOW, );
+   if (ret)
+   return ERR_PTR(ret);
+   prog = __get_prog_inode(d_backing_inode(path.dentry), type);
+   if (!IS_ERR(prog))
+   touch_atime();
+   path_put();
+   return prog;
+}
+EXPORT_SYMBOL(bpf_prog_get_type_path);
+
 static void bpf_evict_inode(struct inode *inode)
 {
enum bpf_type type;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 2c4cfeaa8d5e..5cb783fc8224 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1057,7 +1057,7 @@ struct bpf_prog *bpf_prog_inc_not_zero(struct bpf_prog *prog)
 }
 EXPORT_SYMBOL_GPL(bpf_prog_inc_not_zero);
 
-static bool bpf_prog_get_ok(struct bpf_prog *prog,
+bool bpf_prog_get_ok(struct bpf_prog *prog,
enum bpf_prog_type *attach_type, bool attach_drv)
 {
/* not an attachment, just a refcount inc, always allow */
diff --git a/net/netfilter/xt_bpf.c b/net/netfilter/xt_bpf.c
index 041da0d9c06f..fa2ca0a13619 100644
--- a/net/netfilter/xt_bpf.c
+++ b/net/netfilter/xt_bpf.c
@@ -52,18 +52,8 @@ static int __bpf_mt_check_fd(int fd, struct bpf_prog **ret)
 
 static int __bpf_mt_check_path(const char *path, struct bpf_prog **ret)
 {
-   mm_segment_t oldfs = get_fs();
-   int retval, fd;
-
-   set_fs(KERNEL_DS);
-   fd = bpf_obj_get_user(path, 0);
-   set_fs(oldfs);
-   if (fd < 0)
-   return fd;
-
-   retval = __bpf_mt_check_fd(fd, ret);
-   sys_close(fd);
-   return retval;
+   *ret = bpf_prog_get_type_path(path, BPF_PROG_TYPE_SOCKET_FILTER);
+   return PTR_ERR_OR_ZERO(*ret);
 }
 
 static int bpf_mt_check(const struct xt_mtchk_param *par)
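
The new __bpf_mt_check_path() relies on the kernel convention of encoding a negative errno in a pointer value. The following userspace sketch mimics that convention; the lower-case helpers are stand-ins written for this example, not the kernel's ERR_PTR(), IS_ERR() or PTR_ERR_OR_ZERO() themselves:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO	4095	/* the top 4095 addresses are reserved for errors */

/* Encode a negative errno value as a pointer. */
void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True if ptr carries an encoded errno rather than a real address. */
int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Return the encoded errno, or 0 if ptr is an ordinary pointer. */
int ptr_err_or_zero(const void *ptr)
{
	return is_err(ptr) ? (int)(intptr_t)ptr : 0;
}
```

This is why the two-line replacement can store the lookup result in *ret and still hand a plain integer status back to its caller.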

[RfC net-next 3/3] net: phy: realtek: add more interrupt bits for RTL8211E and RTL8211F

2017-12-02 Thread Martin Blumenstingl

This documents a few more bits in the RTL821x_INER register for RTL8211E
and RTL8211F. These are added only to document them (as no public
datasheets are available for these PHYs), they are currently not used.

Signed-off-by: Martin Blumenstingl 
---
 drivers/net/phy/realtek.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/phy/realtek.c b/drivers/net/phy/realtek.c
index 961165d128d6..a793c35cbaae 100644
--- a/drivers/net/phy/realtek.c
+++ b/drivers/net/phy/realtek.c
@@ -24,7 +24,14 @@
 #define RTL821x_INER   0x12
 #define RTL8211B_INER_INIT 0x6400
 #define RTL8211E_INER_LINK_STATUS  BIT(10)
+#define RTL8211E_INER_ANEG_COMPLETED   BIT(11)
+#define RTL8211E_INER_PAGE_RECEIVED    BIT(12)
+#define RTL8211E_INER_ANEG_ERROR   BIT(15)
 #define RTL8211F_INER_LINK_STATUS  BIT(4)
+#define RTL8211F_INER_PHY_REGISTER_ACCESSIBLE  BIT(5)
+#define RTL8211F_INER_WOL_PME  BIT(7)
+#define RTL8211F_INER_ALDPS_STATE_CHANGE   BIT(9)
+#define RTL8211F_INER_JABBER   BIT(10)
 
 #define RTL821x_INSR   0x13
 
-- 
2.15.1

Re: [PATCH] vsock.7: document VSOCK socket address family

2017-12-02 Thread Stefan Hajnoczi

On Fri, Dec 01, 2017 at 09:57:04AM -0500, G. Branden Robinson wrote:
> At 2017-12-01T13:09:01+, Stefan Hajnoczi wrote:
> > On Thu, Nov 30, 2017 at 01:21:26PM +, Jorgen S. Hansen wrote:
> > > > > On Nov 30, 2017, at 12:21 PM, Stefan Hajnoczi wrote:
> > 
> > Thanks for the quick review!
> > 
> > I forgot to ask you: Is SOCK_DGRAM reliable and in-order over VMCI?
> > 
> > > > +.PP
> > > > +Valid socket types are
> > > > +.B SOCK_STREAM
> > > > +and
> > > > +.B SOCK_DGRAM .
> > > 
> > > > The space here results in a space between SOCK_DGRAM and the “.” in the
> > > > formatted text. Is that intentional?
> > 
> > I haven't figured out the groff syntax to avoid the space :(.  Any
> > ideas?
> 
> What you want is the .BR macro.
> 
> .BR SOCK_DGRAM .
> 
> The man macro package defines six "two-font" macros for switching
> between roman, bold, and italic faces without intervening space.
> 
> See man(7) and groff_man(7).

Excellent, thank you!

Stefan



RE: [PATCH net-next] net: hns3: Refactors "reset" handling code in HCLGE layer of HNS3 driver

2017-12-02 Thread Salil Mehta

Hi Andrew,

> -----Original Message-----
> From: Andrew Lunn [mailto:and...@lunn.ch]
> Sent: Friday, December 01, 2017 1:44 PM
> To: Salil Mehta 
> Cc: da...@davemloft.net; Zhuangyuzeng (Yisen)
> ; lipeng (Y) ;
> mehta.salil@gmail.com; netdev@vger.kernel.org; linux-
> ker...@vger.kernel.org; linux-r...@vger.kernel.org; Linuxarm
> 
> Subject: Re: [PATCH net-next] net: hns3: Refactors "reset" handling
> code in HCLGE layer of HNS3 driver
> 
> On Fri, Dec 01, 2017 at 03:37:44AM +, Salil Mehta wrote:
> > This patch refactors the code of the reset feature in HCLGE layer
> > of HNS3 PF driver. Prime motivation to do this change is:
> > 1. To reduce the time for which common miscellaneous Vector 0
> >interrupt is disabled because of the reset.
> > 2. Simplification of reset request submission and pending reset
> >logic.
> > 3. Simplification of the common miscellaneous interrupt handler
> >routine(for Vector 0) used to handle reset and other sources
> >of Vector 0 interrupt.
> >
> > To achieve the above, the following has been done:
> > 1. Interrupt is disabled while common miscellaneous interrupt
> >handler is entered and re-enabled before it exits. This
> >reduces the interrupt handling latency as compared to older
> >interrupt handling scheme where interrupt was being disabled
> >in interrupt handler context and re-enabled in task context
> >some time later.
> > 2. Introduces new reset service task for honoring software reset
> >requests like from network stack related to timeout and serving
> >the pending reset request(to reset the driver and associated
> >clients).
> > 3. Made Miscellaneous interrupt handler more generic to handle
> >all sources including reset interrupt source.
> 
> Hi Salil
> 
> This is a rather large patch. Can you break it up? It seems like you
> should be able to break it up into at least three parts, maybe more.
> 
> You are aiming to have small patches which are obviously correct. It
> is much easier to review than one big patch which is not obvious at
> all.
Ok. No issues. I will try to fix this in V2 version.

> 
>   Andrew

Re: [PATCH 1/1] timecounter: Make cyclecounter struct part of timecounter struct

2017-12-02 Thread Richard Cochran

On Sat, Dec 02, 2017 at 10:01:35AM +0530, Sagar Arun Kamble wrote:
> There is no real need for the users of timecounters to define cyclecounter
> and timecounter variables separately. Since timecounter will always be
> based on cyclecounter, have cyclecounter struct as member of timecounter
> struct.

Overall, this is a welcome change.  However, it doesn't go far enough,
IMHO, and I'll explain that more below.

> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_clock.c 
> b/drivers/net/ethernet/mellanox/mlx4/en_clock.c
> index 0247885..35987b5 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_clock.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_clock.c
...

As it stands now, timecounter_init() is used for two totally different
reasons.  Some callers only want to set the time, ...

> @@ -207,7 +207,7 @@ static int mlx4_en_phc_settime(struct ptp_clock_info *ptp,
>  
>   /* reset the timecounter */
>   write_seqlock_irqsave(>clock_lock, flags);
> - timecounter_init(>clock, >cycles, ns);
> + timecounter_init(>clock, ns);
>   write_sequnlock_irqrestore(>clock_lock, flags);
>  
>   return 0;

... while others initialize the data structure the first time:

> @@ -274,17 +274,17 @@ void mlx4_en_init_timestamp(struct mlx4_en_dev *mdev)
>  
>   seqlock_init(>clock_lock);
>  
> - memset(>cycles, 0, sizeof(mdev->cycles));
> - mdev->cycles.read = mlx4_en_read_clock;
> - mdev->cycles.mask = CLOCKSOURCE_MASK(48);
> - mdev->cycles.shift = freq_to_shift(dev->caps.hca_core_clock);
> - mdev->cycles.mult =
> - clocksource_khz2mult(1000 * dev->caps.hca_core_clock, 
> mdev->cycles.shift);
> - mdev->nominal_c_mult = mdev->cycles.mult;
> + memset(>clock.cc, 0, sizeof(mdev->clock.cc));
> + mdev->clock.cc.read = mlx4_en_read_clock;
> + mdev->clock.cc.mask = CLOCKSOURCE_MASK(48);
> + mdev->clock.cc.shift = freq_to_shift(dev->caps.hca_core_clock);
> + mdev->clock.cc.mult =
> + clocksource_khz2mult(1000 * dev->caps.hca_core_clock,
> +  mdev->clock.cc.shift);
> + mdev->nominal_c_mult = mdev->clock.cc.mult;
>  
>   write_seqlock_irqsave(>clock_lock, flags);
> - timecounter_init(>clock, >cycles,
> -  ktime_to_ns(ktime_get_real()));
> + timecounter_init(>clock, ktime_to_ns(ktime_get_real()));

I'd like to see two followup patches to this one:

1. Convert timecounter_init() callers to a new timecounter_reset()
   function where the intent is to reset the time.

2. Change timecounter_init() to take the cyclecounter fields as
   arguments.

void timecounter_init(struct timecounter *tc,
  u64 (*read)(const struct cyclecounter *cc),
  u64 mask,
  u32 mult,
  u32 shift,
  u64 start_tstamp);

Then we can clean up all this stuff:

mdev->clock.cc.read = mlx4_en_read_clock;
mdev->clock.cc.mask = CLOCKSOURCE_MASK(48);
mdev->clock.cc.shift = freq_to_shift(dev->caps.hca_core_clock);
mdev->clock.cc.mult = clocksource_khz2mult(...);

This second step can be phased in by calling the new function
timecounter_initialize() and converting the drivers one by one.
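
A rough userspace sketch of the two-function split proposed above, under the assumption that the cyclecounter is embedded in the timecounter as in the patch under review. The structs are heavily simplified (the kernel's timecounter carries more state), and both function bodies are hypothetical since no implementation is given in this thread:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's cyclecounter. */
struct cyclecounter {
	uint64_t (*read)(const struct cyclecounter *cc);
	uint64_t mask;
	uint32_t mult;
	uint32_t shift;
};

/* Simplified stand-in for the kernel's timecounter with cc embedded. */
struct timecounter {
	struct cyclecounter cc;
	uint64_t cycle_last;
	uint64_t nsec;
};

/* Hypothetical: only re-base the time; the cyclecounter is left alone. */
void timecounter_reset(struct timecounter *tc, uint64_t start_tstamp)
{
	tc->cycle_last = tc->cc.read(&tc->cc);
	tc->nsec = start_tstamp;
}

/* Hypothetical: fill in the cyclecounter fields and set the time at once. */
void timecounter_initialize(struct timecounter *tc,
			    uint64_t (*read)(const struct cyclecounter *cc),
			    uint64_t mask, uint32_t mult, uint32_t shift,
			    uint64_t start_tstamp)
{
	assert(read != NULL);
	tc->cc.read = read;
	tc->cc.mask = mask;
	tc->cc.mult = mult;
	tc->cc.shift = shift;
	timecounter_reset(tc, start_tstamp);
}

/* Demo counter that always reports 1000 cycles; a driver reads hardware. */
uint64_t demo_read_cycles(const struct cyclecounter *cc)
{
	(void)cc;
	return 1000;
}
```

A settime path would then call only timecounter_reset(), while the first-time setup calls timecounter_initialize() and never touches the cc fields directly.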

> diff --git a/include/linux/timecounter.h b/include/linux/timecounter.h
> index 2496ad4..6daca06 100644
> --- a/include/linux/timecounter.h
> +++ b/include/linux/timecounter.h
...
> @@ -98,7 +98,6 @@ static inline void timecounter_adjtime(struct timecounter *tc, s64 delta)
>  /**
>   * timecounter_init - initialize a time counter
>   * @tc:  Pointer to time counter which is to be 
> initialized/reset
> - * @cc:  A cycle counter, ready to be used.

This "ready to be used" requirement should go.  The init() function
should make the instance ready to be used all at once.

Thanks,
Richard

Re: [PATCH net-next 0/2] net: dsa: cross-chip FDB support

2017-12-02 Thread David Miller

From: Vivien Didelot 
Date: Thu, 30 Nov 2017 12:56:41 -0500

> DSA can have interconnected switches. For instance, the ZII Dev Rev B
> board described in arch/arm/boot/dts/vf610-zii-dev-rev-b.dts has a
> switch fabric composed of 3 switch devices like this:
> 
>   lan4 lan6
> CPU (eth1)|  lan5 |  lan7
>   |   | | | |
>[0 1 2 3 4 6 5]---[6 0 1 2 3 4 5]---[9 0 1 2 3 4 5 6 7 8]
> | | |   | | | |
> lan0  |  lan2   lan3  lan8  |  optical4
>lan1  optical3
> 
> One current issue with DSA is cross-chip FDB. If we add a static MAC
> address on lan3, only its parent switch 1 (the one in the middle) will
> be programmed. That is not correct in a cross-chip environment, because
> the DSA ports connecting to switch 1 of adjacent switch 0 (on the left)
> and switch 2 (on the right) must be programmed too.
> 
> Without this patchset, a dump of the hardware FDB of switches 0, 1 and 2
> after programming a MAC address on lan3 looks like this (*):
> 
> # bridge fdb add 11:22:33:44:55:66 dev lan3
> # cat /sys/kernel/debug/mv88e6xxx/sw*/atu/0 | grep -v FID
>0  ff:ff:ff:ff:ff:ffMC_STATIC   n  0 1 2 3 4 5 6
>0  11:22:33:44:55:66MC_STATIC_MGMT_PO   n  0 - - - - - -
>0  ff:ff:ff:ff:ff:ffMC_STATIC   n  0 1 2 3 4 5 6
>0  ff:ff:ff:ff:ff:ffMC_STATIC   n  0 1 2 3 4 5 6 7 8 9
> 
> With this patchset applied, adjacent DSA ports get programmed too:
> 
> # bridge fdb add 11:22:33:44:55:66 dev lan3
> # cat /sys/kernel/debug/mv88e6xxx/sw*/atu/0 | grep -v FID
>0  11:22:33:44:55:66MC_STATIC_MGMT_PO   n  - - - - - 5 -
>0  ff:ff:ff:ff:ff:ffMC_STATIC   n  0 1 2 3 4 5 6
>0  11:22:33:44:55:66MC_STATIC_MGMT_PO   n  0 - - - - - -
>0  ff:ff:ff:ff:ff:ffMC_STATIC   n  0 1 2 3 4 5 6
>0  11:22:33:44:55:66MC_STATIC_MGMT_PO   n  - - - - - - - - - 9
>0  ff:ff:ff:ff:ff:ffMC_STATIC   n  0 1 2 3 4 5 6 7 8 9
 ...

Series applied, thanks.
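
The port selection shown in the second ATU dump can be sketched as a routing-table lookup. This is a userspace illustration, not the DSA core code: the table values are read off the diagram in the cover letter and are an assumption, and fdb_port_on_switch() is a hypothetical name:

```c
#include <assert.h>

#define NUM_SWITCHES 3

/*
 * rtable[a][b]: local port on switch a that leads towards switch b,
 * or -1 when a == b. Assumed from the cover letter's diagram.
 */
static const int rtable[NUM_SWITCHES][NUM_SWITCHES] = {
	{ -1,  5,  5 },
	{  6, -1,  5 },
	{  9,  9, -1 },
};

/*
 * Port on switch sw that should carry a static FDB entry whose owner is
 * port target_port on switch target_sw.
 */
int fdb_port_on_switch(int sw, int target_sw, int target_port)
{
	assert(sw >= 0 && sw < NUM_SWITCHES);
	assert(target_sw >= 0 && target_sw < NUM_SWITCHES);

	if (sw == target_sw)
		return target_port;	/* the owning switch uses the real port */

	return rtable[sw][target_sw];	/* others use the DSA port towards it */
}
```

For an address added on port 0 of switch 1, this yields port 5 on switch 0 and port 9 on switch 2, matching the dump quoted above.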

Re: netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1'

2017-12-02 Thread Willem de Bruijn

>> OK...  See vfs.git#untested.mkobj; it really needs testing, though -
>> mq_open(2) passes LTP tests, but that's not saying much, and BPF side is
>> completely untested.
>
> ... and FWIW, completely untested patch for net/netfilter/xt_bpf.c follows:

Thanks a lot for this fix.

The tree including the bpf fix passes this basic xt_bpf test:

  mount -t bpf bpf /sys/fs/bpf
  ./pin /sys/fs/bpf/pass
  iptables -A INPUT -m bpf --object-pinned /sys/fs/bpf/pass -j LOG
  iptables -L INPUT
  iptables -F INPUT

where pin is as follows:

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index adeaa1302f34..0cd2bb8d634b 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -41,6 +41,7 @@ hostprogs-y += xdp_redirect_map
 hostprogs-y += xdp_redirect_cpu
 hostprogs-y += xdp_monitor
 hostprogs-y += syscall_tp
+hostprogs-y += pin

 # Libbpf dependencies
 LIBBPF := ../../tools/lib/bpf/bpf.o
@@ -89,6 +90,7 @@ xdp_redirect_map-objs := bpf_load.o $(LIBBPF) xdp_redirect_map_user.o
 xdp_redirect_cpu-objs := bpf_load.o $(LIBBPF) xdp_redirect_cpu_user.o
 xdp_monitor-objs := bpf_load.o $(LIBBPF) xdp_monitor_user.o
 syscall_tp-objs := bpf_load.o $(LIBBPF) syscall_tp_user.o
+pin-objs := $(LIBBPF) pin.o

 # Tell kbuild to always build the programs
 always := $(hostprogs-y)
diff --git a/samples/bpf/pin.c b/samples/bpf/pin.c
new file mode 100644
index ..826e86784edf
--- /dev/null
+++ b/samples/bpf/pin.c
@@ -0,0 +1,41 @@
+#define _GNU_SOURCE
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "libbpf.h"
+#include "bpf_load.h"
+
+static char log_buf[1 << 16];
+
+int main(int argc, char **argv)
+{
+   struct bpf_insn prog[] = {
+   BPF_MOV64_IMM(BPF_REG_0, 1),
+   BPF_EXIT_INSN(),
+   };
+   int fd;
+
+   if (argc != 2)
+   error(1, 0, "Usage: %s \n", argv[0]);
+
+   fd = bpf_load_program(BPF_PROG_TYPE_SOCKET_FILTER, prog,
+ sizeof(prog) / sizeof(prog[0]),
+ "GPL", 0, log_buf, sizeof(log_buf));
+   if (fd == -1)
+   error(1, errno, "load: %s", log_buf);
+
+   if (bpf_obj_pin(fd, argv[1]))
+   error(1, errno, "pin");
+
+   if (close(fd))
+   error(1, errno, "close");
+
+   return 0;
+}

Re: [PATCH net 0/3] s390/qeth: fixes 2017-12-01

2017-12-02 Thread David Miller

From: Julian Wiedmann 
Date: Fri,  1 Dec 2017 10:14:48 +0100

> please apply the following three fixes for 4.15. These should also go
> back to stable.

Series applied and queued up for -stable, thanks.

Re: [PATCH net,stable v4 0/3] vhost: fix a few skb leaks

2017-12-02 Thread David Miller

From: w...@redhat.com
Date: Fri,  1 Dec 2017 05:10:35 -0500

> Matthew found a roughly 40% tcp throughput regression with commit
> c67df11f(vhost_net: try batch dequing from skb array) as discussed
> in the following thread:
> https://www.mail-archive.com/netdev@vger.kernel.org/msg187936.html

Series applied and queued up for -stable.

Re: [PATCH net-next 5/5] net: phy: realtek: add utility functions to read/write page addresses

2017-12-02 Thread Andrew Lunn

On Sat, Dec 02, 2017 at 10:51:28PM +0100, Martin Blumenstingl wrote:
> Realtek PHYs implement the concept of so-called "extension pages". The
> reason for this is probably because these PHYs expose more registers
> than available in the standard address range.
> After all read/write operations on such a page are done the driver
> should switch back to page 0 where the standard MII registers (such as
> MII_BMCR) are available.
> 
> When referring to such a register the datasheets of RTL8211E and
> RTL8211F always specify:
> - the page / "ext. page" which has to be written to RTL821x_PAGE_SELECT
> - an address (sometimes also called reg)
> 
> These new utility functions make the existing code easier to read since
> it removes some duplication (switching back to page 0 is done within the
> new helpers for example).
> 
> No functional changes are intended.
> 
> Signed-off-by: Martin Blumenstingl 

Reviewed-by: Andrew Lunn 

Andrew
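
The page-switching pattern described in the quoted commit message can be sketched against a mock PHY. Everything here is a userspace stand-in written for illustration: the mock has only two small pages, the function names are hypothetical, and the error handling of the real helpers in patch 5/5 may differ:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SELECT_REG	0x1f	/* same address as RTL821x_PAGE_SELECT */
#define MOCK_NUM_PAGES	2
#define MOCK_NUM_REGS	32

/* Mock PHY standing in for real MDIO access. */
struct mock_phy {
	uint16_t current_page;
	uint16_t regs[MOCK_NUM_PAGES][MOCK_NUM_REGS];
};

/* Write one register; writing PAGE_SELECT_REG switches the active page. */
int mock_phy_write(struct mock_phy *phy, unsigned int reg, uint16_t val)
{
	assert(reg < MOCK_NUM_REGS);
	if (reg == PAGE_SELECT_REG) {
		if (val >= MOCK_NUM_PAGES)
			return -1;	/* no such page in this mock */
		phy->current_page = val;
		return 0;
	}
	phy->regs[phy->current_page][reg] = val;
	return 0;
}

/* Read one register from the currently selected page. */
int mock_phy_read(const struct mock_phy *phy, unsigned int reg)
{
	assert(reg < MOCK_NUM_REGS);
	return phy->regs[phy->current_page][reg];
}

/* Select the page, read the register, then always return to page 0. */
int mock_page_read(struct mock_phy *phy, uint16_t page, unsigned int reg)
{
	int ret, val;

	ret = mock_phy_write(phy, PAGE_SELECT_REG, page);
	if (ret < 0)
		return ret;

	val = mock_phy_read(phy, reg);

	ret = mock_phy_write(phy, PAGE_SELECT_REG, 0);
	if (ret < 0)
		return ret;

	return val;
}
```

Because the switch back to page 0 lives inside the helper, callers cannot forget it, which is the duplication the patch removes.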

Re: [PATCH net-next 1/5] net: phy: realtek: use the BIT and GENMASK macros

2017-12-02 Thread Andrew Lunn

On Sat, Dec 02, 2017 at 10:51:24PM +0100, Martin Blumenstingl wrote:
> This makes it easier to compare the #defines with the datasheets.
> No functional changes.
> 
> Signed-off-by: Martin Blumenstingl 

Reviewed-by: Andrew Lunn 

Andrew

Re: [PATCH net-next 2/5] net: phy: realtek: rename RTL821x_INER_INIT to RTL8211B_INER_INIT

2017-12-02 Thread Andrew Lunn

On Sat, Dec 02, 2017 at 10:51:25PM +0100, Martin Blumenstingl wrote:
> This macro is only used by the RTL8211B code. RTL8211E and RTL8211F both
> use other bits to initialize the RTL821x_INER register.
> No functional changes.
> 
> Signed-off-by: Martin Blumenstingl 

Reviewed-by: Andrew Lunn 

Andrew

Re: [Patch net-next] act_mirred: use tcfm_dev in tcf_mirred_get_dev()

2017-12-02 Thread Jiri Pirko

Sat, Dec 02, 2017 at 08:53:20PM CET, xiyou.wangc...@gmail.com wrote:
>On Sat, Dec 2, 2017 at 11:47 AM, Cong Wang  wrote:
>> On Sat, Dec 2, 2017 at 12:57 AM, Jiri Pirko  wrote:
>>> Good. Please also use m->tcfm_dev->ifindex in tcf_mirred_dump and
>>> tcf_mirred_ifindex. Then you can remove tcfm_ifindex completely.
>>
>> Sounds good. Will send v2.
>
>Hold on, m->tcfm_dev could be NULL, so we can't just deref it here.
>
>I think we can just use 0 when it is NULL:
>
>.ifindex = m->tcfm_dev ? m->tcfm_dev->ifindex : 0,
>
>This also "fixes" the garbage ifindex dump when the target device is gone.

Sounds fine.
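
The NULL guard agreed on here can be sketched with stub types. The struct names below are hypothetical stand-ins for the kernel structures, and the RCU handling that the real tcfm_dev access needs is left out:

```c
#include <assert.h>
#include <stddef.h>

/* Stub types standing in for struct net_device and struct tcf_mirred. */
struct net_device_stub {
	int ifindex;
};

struct mirred_stub {
	struct net_device_stub *tcfm_dev;	/* NULL once the target is gone */
};

/* Report the target's ifindex, or 0 when there is no target device. */
int mirred_ifindex(const struct mirred_stub *m)
{
	assert(m != NULL);
	return m->tcfm_dev ? m->tcfm_dev->ifindex : 0;
}
```

Returning 0 for a missing device also avoids dumping a stale index, since 0 is never a valid ifindex.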

Re: [stable] s390/qeth: stable candidates

2017-12-02 Thread David Miller

From: Julian Wiedmann 
Date: Fri, 1 Dec 2017 10:22:37 +0100

> 83cf79a2fec3 s390/qeth: fix early exit from error path
> Fixes:  5f78e29ceebf ("qeth: optimize IP handling in rx_mode callback")
> For 4.8+

Queued up.

The rest are for truly ancient kernels.  I only submit -stable backports
for the most recent kernel releases.

Re: [PATCH net-next 3/5] net: phy: realtek: group all register bit #defines for RTL821x_INER

2017-12-02 Thread Andrew Lunn

On Sat, Dec 02, 2017 at 10:51:26PM +0100, Martin Blumenstingl wrote:
> This simply moves all register bit #defines which describe the (PHY
> specific) bits in the RTL821x_INER right below the RTL821x_INER register
> definition. This makes it easier to spot which registers and bits belong
> together.
> No functional changes.
> 
> Signed-off-by: Martin Blumenstingl 

Reviewed-by: Andrew Lunn 

Andrew

Re: [PATCH net-next 4/5] net: phy: realtek: use the same indentation for all #defines

2017-12-02 Thread Andrew Lunn

On Sat, Dec 02, 2017 at 10:51:27PM +0100, Martin Blumenstingl wrote:
> This simply makes the code easier to read. No functional changes.
> 
> Signed-off-by: Martin Blumenstingl 

Reviewed-by: Andrew Lunn 

Andrew

Re: [PATCH net-next v4 2/2] net: ethernet: socionext: add AVE ethernet driver

2017-12-02 Thread kbuild test robot

Hi Kunihiko,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]

url:
https://github.com/0day-ci/linux/commits/Kunihiko-Hayashi/dt-bindings-net-add-DT-bindings-for-Socionext-UniPhier-AVE/20171203-095248
config: alpha-allyesconfig (attached as .config)
compiler: alpha-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=alpha 

All warnings (new ones prefixed by >>):

   drivers/net/ethernet/socionext/sni_ave.c: In function 'ave_pfsel_set_promisc':
>> drivers/net/ethernet/socionext/sni_ave.c:172:27: warning: large integer implicitly truncated to unsigned type [-Woverflow]
#define AVE_PFMBYTE_MASK0 (~GENMASK(7, 6))
  ^
>> drivers/net/ethernet/socionext/sni_ave.c:1046:9: note: in expansion of macro 'AVE_PFMBYTE_MASK0'
 writel(AVE_PFMBYTE_MASK0, priv->base + AVE_PFMBYTE(entry));
^

vim +172 drivers/net/ethernet/socionext/sni_ave.c

   170  
   171  /* Packet filter */
 > 172  #define AVE_PFMBYTE_MASK0   (~GENMASK(7, 6))
   173  #define AVE_PFMBYTE_MASK1   GENMASK(25, 0)
   174  #define AVE_PFMBIT_MASK GENMASK(15, 0)
   175  
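
The warning comes from inverting a mask at 64-bit width and then handing it to a 32-bit register write. The sketch below reproduces the arithmetic in userspace; GENMASK64() is a stand-in written for this example, and the explicit cast is only one possible way to narrow the value, not necessarily the fix the driver author will choose:

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit stand-in for GENMASK() where unsigned long is 64 bits wide. */
#define GENMASK64(h, l) \
	(((~UINT64_C(0)) << (l)) & (~UINT64_C(0) >> (63 - (h))))

/* Inverted mask narrowed explicitly to the 32-bit register width. */
uint32_t pfmbyte_mask0(void)
{
	return (uint32_t)~GENMASK64(7, 6);
}
```

Without the narrowing, the inverted value has all of its upper 32 bits set, which is what the compiler reports as being truncated.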

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation



Re: [RFC] virtio-net: help live migrate SR-IOV devices

2017-12-02 Thread Michael S. Tsirkin

On Fri, Dec 01, 2017 at 12:08:59PM -0800, Shannon Nelson wrote:
> On 11/30/2017 6:11 AM, Michael S. Tsirkin wrote:
> > On Thu, Nov 30, 2017 at 10:08:45AM +0200, achiad shochat wrote:
> > > Re. problem #2:
> > > Indeed the best way to address it seems to be to enslave the VF driver
> > > netdev under a persistent anchor netdev.
> > > And it's indeed desired to allow (but not enforce) PV netdev and VF
> > > netdev to work in conjunction.
> > > And it's indeed desired that this enslavement logic work out-of-the box.
> > > But in case of PV+VF some configurable policies must be in place (and
> > > they'd better be generic rather than differ per PV technology).
> > > For example - based on which characteristics should the PV+VF coupling
> > > be done? netvsc uses MAC address, but that might not always be the
> > > desire.
> > 
> > It's a policy but not guest userspace policy.
> > 
> > The hypervisor certainly knows.
> > 
> > Are you concerned that someone might want to create two devices with the
> > same MAC for an unrelated reason?  If so, hypervisor could easily set a
> > flag in the virtio device to say "this is a backup, use MAC to find
> > another device".
> 
> This is something I was going to suggest: a flag or other configuration on
> the virtio device to help control how this new feature is used.  I can
> imagine this might be useful to control from either the hypervisor side or
> the VM side.
> 
> The hypervisor might want to (1) disable it (force it off), (2) enable it
> for VM choice, or (3) force it on for the VM.  In case (2), the VM might be
> able to choose whether it wants to make use of the feature, or stick with the
> bonding solution.
> 
> Either way, the kernel is making a feature available, and the user (VM or
> hypervisor) is able to control it by selecting the feature based on the
> policy desired.
> 
> sln

I'm not sure what feature is being made available here.

I saw this as a flag that says "this device shares backend with another
network device which can be found using MAC, and that backend should be
preferred".  kernel then forces configuration which uses that other
backend - as long as it exists.

However, please Cc virtio-dev mailing list if we are doing this since
this is a spec extension.
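
A rough sketch of the lookup such a flag would imply, assuming a hypothetical per-device "backup" bit and a flat array of devices. None of these names exist in virtio or the kernel; they only illustrate the "find the other device by MAC and prefer it" idea.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_ETH_ALEN 6

/* Hypothetical, cut-down view of a network device. */
struct sketch_netdev {
	unsigned char mac[SKETCH_ETH_ALEN];
	bool backup;	/* set by the hypervisor: "use MAC to find another device" */
};

/*
 * Return the device that shares @self's MAC and should be preferred, or
 * NULL when @self is not flagged as a backup or no such device exists.
 */
static struct sketch_netdev *
sketch_find_preferred(struct sketch_netdev *devs, size_t n,
		      const struct sketch_netdev *self)
{
	size_t i;

	if (!self->backup)
		return NULL;

	for (i = 0; i < n; i++) {
		if (&devs[i] == self || devs[i].backup)
			continue;
		if (!memcmp(devs[i].mac, self->mac, SKETCH_ETH_ALEN))
			return &devs[i];
	}
	return NULL;
}
```

Returning NULL when the other device is absent matches the "as long as it exists" condition: the backup keeps carrying traffic until a matching device shows up.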

-- 
MST

Re: [Patch net-next] act_mirred: use tcfm_dev in tcf_mirred_get_dev()

2017-12-02 Thread Cong Wang

On Sat, Dec 2, 2017 at 12:57 AM, Jiri Pirko  wrote:
> Good. Please also use m->tcfm_dev->ifindex in tcf_mirred_dump and
> tcf_mirred_ifindex. Then you can remove tcfm_ifindex completely.

Sounds good. Will send v2.

Re: [Patch net-next] act_mirred: use tcfm_dev in tcf_mirred_get_dev()

2017-12-02 Thread Cong Wang

On Sat, Dec 2, 2017 at 11:47 AM, Cong Wang  wrote:
> On Sat, Dec 2, 2017 at 12:57 AM, Jiri Pirko  wrote:
>> Good. Please also use m->tcfm_dev->ifindex in tcf_mirred_dump and
>> tcf_mirred_ifindex. Then you can remove tcfm_ifindex completely.
>
> Sounds good. Will send v2.

Hold on, m->tcfm_dev could be NULL, so we can't just deref it here.

I think we can just use 0 when it is NULL:

.ifindex = m->tcfm_dev ? m->tcfm_dev->ifindex : 0,

This also "fixes" the garbage ifindex dump when the target device is gone.
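
As a standalone sketch of that guard, using cut-down stand-ins for the real structures (the sketch_ names are hypothetical and not the act_mirred code):

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for struct net_device and struct tcf_mirred. */
struct sketch_dev {
	int ifindex;
};

struct sketch_mirred {
	struct sketch_dev *tcfm_dev;	/* NULL once the target device is gone */
};

/* Report 0 when there is no device to dereference. */
static int sketch_mirred_ifindex(const struct sketch_mirred *m)
{
	return m->tcfm_dev ? m->tcfm_dev->ifindex : 0;
}
```

Since 0 is not a valid interface index, a dump that reports 0 clearly signals "no device" instead of a stale number.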

Re: [PATCH 08/10] net: fjes: Handle return value of platform_get_irq and platform_get_resource

2017-12-02 Thread Sergei Shtylyov


Hello!

On 12/02/2017 10:26 PM, Arvind Yadav wrote:


platform_get_irq() and platform_get_resource() can fail here and
their return values must be checked.

Signed-off-by: Arvind Yadav 
---
  drivers/net/fjes/fjes_main.c | 10 ++
  1 file changed, 10 insertions(+)

diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index 750954b..540dd51 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -1265,9 +1265,19 @@ static int fjes_probe(struct platform_device *plat_dev)
adapter->interrupt_watch_enable = false;
  
  	res = platform_get_resource(plat_dev, IORESOURCE_MEM, 0);

+   if (!res) {
+   err = -EINVAL;
+   goto err_free_netdev;
+   }
+
hw->hw_res.start = res->start;
hw->hw_res.size = resource_size(res);
hw->hw_res.irq = platform_get_irq(plat_dev, 0);
+   if (hw->hw_res.irq <= 0) {


   This function no longer returns 0 on error, no need to check for <= 0.


+   err = hw->hw_res.irq ? hw->hw_res.irq : -ENODEV;
+   goto err_free_netdev;


   gcc allows a shorter way to write that.

err = hw->hw_res.irq ?: -ENODEV;


+   }
+
err = fjes_hw_init(&adapter->hw);
if (err)
goto err_free_netdev;


MBR, Sergei
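
For reference, a small sketch of the shorter form mentioned above. "a ?: b" is a GNU C extension (accepted by gcc and clang) that behaves like "a ? a : b" with "a" evaluated only once. The helper name is made up for illustration.

```c
#include <errno.h>

/*
 * Map an IRQ lookup result to an error code: keep a non-zero value as is
 * and fall back to -ENODEV for 0.  Uses the GNU "?:" extension.
 */
static int sketch_irq_to_err(int irq)
{
	return irq ?: -ENODEV;
}
```

If the function can no longer return 0 on error, the fallback branch is never taken and the check reduces to testing for a negative value.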

Re: [PATCH net v2 2/3] xfrm: Add an activate() offload dev op

2017-12-02 Thread Shannon Nelson


On 12/1/2017 11:47 AM, Shannon Nelson wrote:

On 11/28/2017 9:55 AM, av...@mellanox.com wrote:

From: Aviv Heller 

Adding the state to the offload device prior to replay init in
xfrm_state_construct() will result in NULL dereference if a matching
ESP packet is received in between.

In order to inhibit driver offload logic from processing the state's
packets prior to the xfrm_state object being completely initialized and
added to the SADBs, a new activate() operation was added to inform the
driver the aforementioned conditions have been met.


Are there also conditions where you would want to temporarily 
deactivate, or pause, the incoming driver offload, followed then by 
another activate?


sln


Instead of setting up a half-ready state that needs the activate() 
operation to finish, can we instead just move the xfrm_dev_state_add() 
call to after the xfrm_init_replay()?  Especially since this really only 
makes sense for the inbound, and makes no sense for the outbound path.


sln





Signed-off-by: Aviv Heller 
Signed-off-by: Yossi Kuperman 
---
v1 -> v2:
- Separate to state addition and then activation, instead
  of relocating dev state addition call.
---
  include/linux/netdevice.h |  1 +
  include/net/xfrm.h    | 12 
  net/xfrm/xfrm_user.c  |  5 +
  3 files changed, 18 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 2eaac7d..c6ca356 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -819,6 +819,7 @@ struct netdev_xdp {
  #ifdef CONFIG_XFRM_OFFLOAD
  struct xfrmdev_ops {
  int    (*xdo_dev_state_add) (struct xfrm_state *x);
+    void    (*xdo_dev_state_activate) (struct xfrm_state *x);
  void    (*xdo_dev_state_delete) (struct xfrm_state *x);
  void    (*xdo_dev_state_free) (struct xfrm_state *x);
  bool    (*xdo_dev_offload_ok) (struct sk_buff *skb,
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index e015e16..324374e 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -1877,6 +1877,14 @@ static inline bool xfrm_dst_offload_ok(struct dst_entry *dst)

  return false;
  }
+static inline void xfrm_dev_state_activate(struct xfrm_state *x)
+{
+    struct xfrm_state_offload *xso = &x->xso;
+
+    if (xso->dev && xso->dev->xfrmdev_ops->xdo_dev_state_activate)
+    xso->dev->xfrmdev_ops->xdo_dev_state_activate(x);
+}
+
  static inline void xfrm_dev_state_delete(struct xfrm_state *x)
  {
  struct xfrm_state_offload *xso = &x->xso;
@@ -1907,6 +1915,10 @@ static inline int xfrm_dev_state_add(struct net *net, struct xfrm_state *x, stru

  return 0;
  }
+static inline void xfrm_dev_state_activate(struct xfrm_state *x)
+{
+}
+
  static inline void xfrm_dev_state_delete(struct xfrm_state *x)
  {
  }
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index e44a0fe..d06f579 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -662,6 +662,11 @@ static int xfrm_add_sa(struct sk_buff *skb, struct nlmsghdr *nlh,

  goto out;
  }
+    spin_lock_bh(&x->lock);
+    if (x->km.state == XFRM_STATE_VALID)
+        xfrm_dev_state_activate(x);
+    spin_unlock_bh(&x->lock);
+
  c.seq = nlh->nlmsg_seq;
  c.portid = nlh->nlmsg_pid;
  c.event = nlh->nlmsg_type;
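
The optional-callback shape used by xfrm_dev_state_activate() in the quoted patch can be sketched in isolation as follows. The types and names are hypothetical stand-ins, not the xfrm structures.

```c
#include <stdbool.h>
#include <stddef.h>

struct sketch_state;

/* Hypothetical ops table; the activate hook is optional, as in the patch. */
struct sketch_offload_ops {
	int  (*state_add)(struct sketch_state *x);
	void (*state_activate)(struct sketch_state *x);
};

struct sketch_state {
	const struct sketch_offload_ops *ops;	/* NULL when not offloaded */
	bool active;
};

/* Call the optional hook only when a driver actually provides it. */
static void sketch_state_activate(struct sketch_state *x)
{
	if (x->ops && x->ops->state_activate)
		x->ops->state_activate(x);
}

/* Example driver hook: from here on the state's packets may be processed. */
static void sketch_drv_activate(struct sketch_state *x)
{
	x->active = true;
}
```

Drivers that do not need the second phase simply leave the hook unset, so existing implementations keep working unchanged.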

pull-request: bpf-next 2017-12-03

2017-12-02 Thread Daniel Borkmann

Hi David,

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Addition of a software model for BPF offloads in order to ease
   testing code changes in that area and make semantics more clear.
   This is implemented in a new driver called netdevsim, which can
   later also be extended for other offloads. SR-IOV support is added
   as well to netdevsim. BPF kernel selftests for offloading are
   added so we can track basic functionality as well as exercising
   all corner cases around BPF offloading, from Jakub.

2) Today drivers have to drop the reference on BPF progs they hold
   due to XDP on device teardown themselves. Change this in order
   to make XDP handling inside the drivers less error prone, and
   move disabling XDP to the core instead, also from Jakub.

3) Misc set of BPF verifier improvements and cleanups as preparatory
   work for upcoming BPF-to-BPF calls. Among others, this set also
   improves liveness marking such that pruning can be slightly more
   effective. Register and stack liveness information is now included
   in the verifier log as well, from Alexei.

4) nfp JIT improvements in order to identify load/store sequences in
   the BPF prog e.g. coming from memcpy lowering and optimizing them
   through the NPU's command push pull (CPP) instruction, from Jiong.

5) Cleanups to test_cgrp2_attach2.c BPF sample code in order to remove
   bpf_prog_attach() magic values and replace them with the proper
   attach flags instead, from David.

Please consider pulling these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git

Thanks a lot!



The following changes since commit 9f66816a6a4dd740bfa29cc8a8e19b90fd7df4e7:

  net: dsa: bcm_sf2: Utilize b53_get_tag_protocol() (2017-11-30 13:00:04 -0500)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git 

for you to fetch changes up to 6720f1084c066a5ba051a250e9d5d8c2ad4f554c:

  Merge branch 'bpf-xdp-stack-uninit-and-offload-tests' (2017-12-03 00:27:59 +0100)


Alexei Starovoitov (7):
  bpf: fix stack state printing in verifier log
  bpf: print liveness info to verifier log
  bpf: don't mark FP reg as uninit
  bpf: improve verifier liveness marks
  bpf: improve JEQ/JNE path walking
  bpf: cleanup register_is_null()
  selftests/bpf: adjust test_align expected output

Daniel Borkmann (3):
  Merge branch 'bpf-verifier-misc-improvements'
  Merge branch 'bpf-nfp-jmp-memcpy-improvements'
  Merge branch 'bpf-xdp-stack-uninit-and-offload-tests'

David Ahern (1):
  samples/bpf: Convert magic numbers to names in multi-prog cgroup test case

Jakub Kicinski (10):
  nfp: fix old kdoc issues
  nfp: bpf: encode indirect commands
  net: xdp: avoid output parameters when querying XDP prog
  net: xdp: report flags program was installed with on query
  net: xdp: make the stack take care of the tear down
  netdevsim: add software driver for testing offloads
  netdevsim: add bpf offload support
  selftests/bpf: add offload test based on netdevsim
  netdevsim: add SR-IOV functionality
  net: dummy: remove fake SR-IOV functionality

Jiong Wang (11):
  nfp: bpf: support backward jump
  nfp: bpf: record jump destination to simplify jump fixup
  nfp: bpf: flag jump destination to guide insn combine optimizations
  nfp: bpf: don't do ld/mask combination if mask is jump destination
  nfp: bpf: don't do ld/shifts combination if shifts are jump destination
  nfp: bpf: relax source operands check
  nfp: bpf: correct the encoding for No-Dest immed
  nfp: bpf: factor out is_mbpf_load & is_mbpf_store
  nfp: bpf: implement memory bulk copy for length within 32-bytes
  nfp: bpf: implement memory bulk copy for length bigger than 32-bytes
  nfp: bpf: detect load/store sequences lowered from memory copy

 MAINTAINERS                                        |   5 +
 drivers/net/Kconfig                                |  11 +
 drivers/net/Makefile                               |   1 +
 drivers/net/dummy.c                                | 215 +--
 drivers/net/ethernet/broadcom/bnxt/bnxt.c          |   2 -
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |   3 -
 drivers/net/ethernet/netronome/nfp/bpf/jit.c       | 487 +--
 drivers/net/ethernet/netronome/nfp/bpf/main.c      |   7 -
 drivers/net/ethernet/netronome/nfp/bpf/main.h      |  35 +-
 drivers/net/ethernet/netronome/nfp/bpf/offload.c   |  23 +-
 drivers/net/ethernet/netronome/nfp/bpf/verifier.c  |   8 +-
 drivers/net/ethernet/netronome/nfp/nfp_asm.c       |   7 +-
 drivers/net/ethernet/netronome/nfp/nfp_asm.h       |   7 +-
 drivers/net/ethernet/netronome/nfp/nfp_net.h       |   2 +

[PATCH net-next 4/4] rtnetlink: remove __rtnl_register

2017-12-02 Thread Florian Westphal

This removes __rtnl_register and switches callers to either
rtnl_register or rtnl_register_module.

Also, rtnl_register() will now print an error if memory allocation
failed rather than panic the kernel.

Signed-off-by: Florian Westphal 
---
 include/net/rtnetlink.h |  2 --
 net/core/rtnetlink.c    | 33 -
 net/ipv6/addrconf.c     | 44 ++--
 net/ipv6/addrlabel.c    | 13 ++---
 net/ipv6/ip6_fib.c      |  4 ++--
 net/ipv6/route.c        | 20 +++-
 6 files changed, 61 insertions(+), 55 deletions(-)

diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h
index e326b3f9eb5f..14b6b3af8918 100644
--- a/include/net/rtnetlink.h
+++ b/include/net/rtnetlink.h
@@ -13,8 +13,6 @@ enum rtnl_link_flags {
RTNL_FLAG_DOIT_UNLOCKED = 1,
 };
 
-int __rtnl_register(int protocol, int msgtype,
-   rtnl_doit_func, rtnl_dumpit_func, unsigned int flags);
 void rtnl_register(int protocol, int msgtype,
   rtnl_doit_func, rtnl_dumpit_func, unsigned int flags);
 int rtnl_register_module(struct module *owner, int protocol, int msgtype,
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index de6390365c90..fb2d61df1e2f 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -238,7 +238,7 @@ int rtnl_register_module(struct module *owner,
 EXPORT_SYMBOL_GPL(rtnl_register_module);
 
 /**
- * __rtnl_register - Register a rtnetlink message type
+ * rtnl_register - Register a rtnetlink message type
  * @protocol: Protocol family or PF_UNSPEC
  * @msgtype: rtnetlink message type
  * @doit: Function pointer called for each request message
@@ -252,35 +252,18 @@ EXPORT_SYMBOL_GPL(rtnl_register_module);
  * The special protocol family PF_UNSPEC may be used to define fallback
  * function pointers for the case when no entry for the specific protocol
  * family exists.
- *
- * Returns 0 on success or a negative error code.
- */
-int __rtnl_register(int protocol, int msgtype,
-   rtnl_doit_func doit, rtnl_dumpit_func dumpit,
-   unsigned int flags)
-{
-   return rtnl_register_internal(NULL, protocol, msgtype,
- doit, dumpit, flags);
-}
-EXPORT_SYMBOL_GPL(__rtnl_register);
-
-/**
- * rtnl_register - Register a rtnetlink message type
- *
- * Identical to __rtnl_register() but panics on failure. This is useful
- * as failure of this function is very unlikely, it can only happen due
- * to lack of memory when allocating the chain to store all message
- * handlers for a protocol. Meant for use in init functions where lack
- * of memory implies no sense in continuing.
  */
 void rtnl_register(int protocol, int msgtype,
   rtnl_doit_func doit, rtnl_dumpit_func dumpit,
   unsigned int flags)
 {
-   if (__rtnl_register(protocol, msgtype, doit, dumpit, flags) < 0)
-   panic("Unable to register rtnetlink message handler, "
- "protocol = %d, message type = %d\n",
- protocol, msgtype);
+   int err;
+
+   err = rtnl_register_internal(NULL, protocol, msgtype, doit, dumpit,
+flags);
+   if (err)
+   pr_err("Unable to register rtnetlink message handler, "
+  "protocol = %d, message type = %d\n", protocol, msgtype);
 }
 EXPORT_SYMBOL_GPL(rtnl_register);
 
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index f49bd7897e95..a5ad8425551a 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -6595,27 +6595,43 @@ int __init addrconf_init(void)
 
rtnl_af_register(&inet6_ops);
 
-   err = __rtnl_register(PF_INET6, RTM_GETLINK, NULL, inet6_dump_ifinfo,
- 0);
+   err = rtnl_register_module(THIS_MODULE, PF_INET6, RTM_GETLINK,
+  NULL, inet6_dump_ifinfo, 0);
if (err < 0)
goto errout;
 
-   /* Only the first call to __rtnl_register can fail */
-   __rtnl_register(PF_INET6, RTM_NEWADDR, inet6_rtm_newaddr, NULL, 0);
-   __rtnl_register(PF_INET6, RTM_DELADDR, inet6_rtm_deladdr, NULL, 0);
-   __rtnl_register(PF_INET6, RTM_GETADDR, inet6_rtm_getaddr,
-   inet6_dump_ifaddr, RTNL_FLAG_DOIT_UNLOCKED);
-   __rtnl_register(PF_INET6, RTM_GETMULTICAST, NULL,
-   inet6_dump_ifmcaddr, 0);
-   __rtnl_register(PF_INET6, RTM_GETANYCAST, NULL,
-   inet6_dump_ifacaddr, 0);
-   __rtnl_register(PF_INET6, RTM_GETNETCONF, inet6_netconf_get_devconf,
-   inet6_netconf_dump_devconf, RTNL_FLAG_DOIT_UNLOCKED);
-
+   err = rtnl_register_module(THIS_MODULE, PF_INET6, RTM_NEWADDR,
+  inet6_rtm_newaddr, NULL, 0);
+   if (err < 0)
+   goto errout;
+   err = rtnl_register_module(THIS_MODULE, PF_INET6, RTM_DELADDR,
+

[PATCH next-next 0/4] rtnetlink: rework handler (un)registering

2017-12-02 Thread Florian Westphal

Peter Zijlstra reported (referring to commit 019a316992ee0d983,
"rtnetlink: add reference counting to prevent module unload while dump is in 
progress"):

 1) it is not in fact a refcount, so using refcount_t is silly
 2) there is a distinct lack of memory barriers, so we can easily
observe the decrement while the msg_handler is still in progress.
 3) waiting with a schedule()/yield() loop is complete crap and subject
to live-locks, imagine doing that rtnl_unregister_all() from a RT task.

In ancient times rtnetlink exposed a statically-sized table with
preset doit/dumpit handlers to be called for a protocol/type pair.

Later the rtnl_register interface was added and the table was allocated
on demand.  Eventually these were also used by modules.

The problem is that nothing prevents module unload while a netlink dump
is in progress.  Netlink dumps can span multiple recv calls, and the
netlink core saves the to-be-repeated dumper address for later invocation.

To prevent rmmod the netlink core expects callers to pass in the owning
module so a reference can be taken.

So far rtnetlink wasn't doing this; add a new interface to pass THIS_MODULE.
Moreover, when converting parts of the rtnetlink handling to rcu, this code
gained way too many READ_ONCE spots; remove them and the extra refcounting.

Take a module reference when running dumpit and doit callbacks
and never alter content of rtnl_link structures after they have been
published via rcu_assign_pointer.

Based partially on earlier patch from Peter.

 include/net/rtnetlink.h |    4
 net/bridge/br_mdb.c     |    6 -
 net/can/gw.c            |   14 +-
 net/core/rtnetlink.c    |  270 ++--
 net/decnet/dn_dev.c     |    9 +
 net/decnet/dn_fib.c     |    6 -
 net/decnet/dn_route.c   |    8 -
 net/ipv6/addrconf.c     |   44 +--
 net/ipv6/addrlabel.c    |   13 +-
 net/ipv6/ip6_fib.c      |    4
 net/ipv6/route.c        |   20 ++-
 net/mpls/af_mpls.c      |   15 +-
 net/phonet/pn_netlink.c |   21 ++-
 net/qrtr/qrtr.c         |    8 +
 14 files changed, 282 insertions(+), 160 deletions(-)
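
To illustrate the "take a module reference while a callback runs" idea from the cover letter, here is a userspace model built on C11 atomics. It is only a sketch under stated assumptions: it is not how try_module_get() is implemented, and every sketch_ name is hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* refs >= 0: live with that many users; SKETCH_DEAD: unload has begun. */
#define SKETCH_DEAD (-1)

struct sketch_module {
	atomic_int refs;
};

/* Take a reference unless unload already started; the CAS loop avoids races. */
static bool sketch_try_get(struct sketch_module *m)
{
	int v = atomic_load(&m->refs);

	while (v != SKETCH_DEAD) {
		if (atomic_compare_exchange_weak(&m->refs, &v, v + 1))
			return true;
	}
	return false;
}

static void sketch_put(struct sketch_module *m)
{
	atomic_fetch_sub(&m->refs, 1);
}

/* Unload succeeds only while nobody holds a reference. */
static bool sketch_try_unload(struct sketch_module *m)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&m->refs, &expected, SKETCH_DEAD);
}

/* Run a callback (think doit/dumpit) with its owner pinned for the duration. */
static int sketch_call_pinned(struct sketch_module *m, int (*cb)(void))
{
	int ret;

	if (!sketch_try_get(m))
		return -1;	/* owner is going away; do not call into it */
	ret = cb();
	sketch_put(m);
	return ret;
}

/* Example callback standing in for a dump handler. */
static int sketch_dump(void)
{
	return 42;
}
```

The point of the model is the ordering: unload is refused while a reference is held, and no new reference can be taken once unload has begun, so no waiting loop is needed.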

[PATCH net-next 3/4] net: use rtnl_register_module where needed

2017-12-02 Thread Florian Westphal

All of these can be compiled as modules, so use the new
_module version to make sure the module can no longer be removed
while a callback/dump is in use.

Signed-off-by: Florian Westphal 
---
 net/bridge/br_mdb.c     |  6 +++---
 net/can/gw.c            | 14 ++
 net/decnet/dn_dev.c     |  9 ++---
 net/decnet/dn_fib.c     |  6 --
 net/decnet/dn_route.c   |  8 
 net/mpls/af_mpls.c      | 15 +--
 net/phonet/pn_netlink.c | 21 +
 net/qrtr/qrtr.c         |  8 ++--
 8 files changed, 55 insertions(+), 32 deletions(-)

diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index b0f4c734900b..6d9f48bd374a 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -760,9 +760,9 @@ static int br_mdb_del(struct sk_buff *skb, struct nlmsghdr *nlh,
 
 void br_mdb_init(void)
 {
-   rtnl_register(PF_BRIDGE, RTM_GETMDB, NULL, br_mdb_dump, 0);
-   rtnl_register(PF_BRIDGE, RTM_NEWMDB, br_mdb_add, NULL, 0);
-   rtnl_register(PF_BRIDGE, RTM_DELMDB, br_mdb_del, NULL, 0);
+   rtnl_register_module(THIS_MODULE, PF_BRIDGE, RTM_GETMDB, NULL, br_mdb_dump, 0);
+   rtnl_register_module(THIS_MODULE, PF_BRIDGE, RTM_NEWMDB, br_mdb_add, NULL, 0);
+   rtnl_register_module(THIS_MODULE, PF_BRIDGE, RTM_DELMDB, br_mdb_del, NULL, 0);
 }
 
 void br_mdb_uninit(void)
diff --git a/net/can/gw.c b/net/can/gw.c
index 73a02af4b5d7..398dd0395ad9 100644
--- a/net/can/gw.c
+++ b/net/can/gw.c
@@ -1014,6 +1014,8 @@ static struct pernet_operations cangw_pernet_ops = {
 
 static __init int cgw_module_init(void)
 {
+   int ret;
+
/* sanitize given module parameter */
max_hops = clamp_t(unsigned int, max_hops, CGW_MIN_HOPS, CGW_MAX_HOPS);
 
@@ -1031,15 +1033,19 @@ static __init int cgw_module_init(void)
notifier.notifier_call = cgw_notifier;
register_netdevice_notifier(&notifier);
 
-   if (__rtnl_register(PF_CAN, RTM_GETROUTE, NULL, cgw_dump_jobs, 0)) {
+   ret = rtnl_register_module(THIS_MODULE, PF_CAN, RTM_GETROUTE,
+  NULL, cgw_dump_jobs, 0);
+   if (ret) {
unregister_netdevice_notifier(&notifier);
kmem_cache_destroy(cgw_cache);
return -ENOBUFS;
}
 
-   /* Only the first call to __rtnl_register can fail */
-   __rtnl_register(PF_CAN, RTM_NEWROUTE, cgw_create_job, NULL, 0);
-   __rtnl_register(PF_CAN, RTM_DELROUTE, cgw_remove_job, NULL, 0);
+   /* Only the first call to rtnl_register_module can fail */
+   rtnl_register_module(THIS_MODULE, PF_CAN, RTM_NEWROUTE,
+cgw_create_job, NULL, 0);
+   rtnl_register_module(THIS_MODULE, PF_CAN, RTM_DELROUTE,
+cgw_remove_job, NULL, 0);
 
return 0;
 }
diff --git a/net/decnet/dn_dev.c b/net/decnet/dn_dev.c
index 9153247dad28..d1885cf59319 100644
--- a/net/decnet/dn_dev.c
+++ b/net/decnet/dn_dev.c
@@ -1418,9 +1418,12 @@ void __init dn_dev_init(void)
 
dn_dev_devices_on();
 
-   rtnl_register(PF_DECnet, RTM_NEWADDR, dn_nl_newaddr, NULL, 0);
-   rtnl_register(PF_DECnet, RTM_DELADDR, dn_nl_deladdr, NULL, 0);
-   rtnl_register(PF_DECnet, RTM_GETADDR, NULL, dn_nl_dump_ifaddr, 0);
+   rtnl_register_module(THIS_MODULE, PF_DECnet, RTM_NEWADDR,
+dn_nl_newaddr, NULL, 0);
+   rtnl_register_module(THIS_MODULE, PF_DECnet, RTM_DELADDR,
+dn_nl_deladdr, NULL, 0);
+   rtnl_register_module(THIS_MODULE, PF_DECnet, RTM_GETADDR,
+NULL, dn_nl_dump_ifaddr, 0);
 
proc_create("decnet_dev", S_IRUGO, init_net.proc_net, &dn_dev_seq_fops);
 
diff --git a/net/decnet/dn_fib.c b/net/decnet/dn_fib.c
index b37a1b833c77..fce94cbd4378 100644
--- a/net/decnet/dn_fib.c
+++ b/net/decnet/dn_fib.c
@@ -792,8 +792,10 @@ void __init dn_fib_init(void)
 
register_dnaddr_notifier(&dn_fib_dnaddr_notifier);
 
-   rtnl_register(PF_DECnet, RTM_NEWROUTE, dn_fib_rtm_newroute, NULL, 0);
-   rtnl_register(PF_DECnet, RTM_DELROUTE, dn_fib_rtm_delroute, NULL, 0);
+   rtnl_register_module(THIS_MODULE, PF_DECnet, RTM_NEWROUTE,
+dn_fib_rtm_newroute, NULL, 0);
+   rtnl_register_module(THIS_MODULE, PF_DECnet, RTM_DELROUTE,
+dn_fib_rtm_delroute, NULL, 0);
 }
 
 
diff --git a/net/decnet/dn_route.c b/net/decnet/dn_route.c
index 4b3ca70be723..73160d4aebbe 100644
--- a/net/decnet/dn_route.c
+++ b/net/decnet/dn_route.c
@@ -1923,11 +1923,11 @@ void __init dn_route_init(void)
_rt_cache_seq_fops);
 
 #ifdef CONFIG_DECNET_ROUTER
-   rtnl_register(PF_DECnet, RTM_GETROUTE, dn_cache_getroute,
- dn_fib_dump, 0);
+   rtnl_register_module(THIS_MODULE, PF_DECnet, RTM_GETROUTE,
+dn_cache_getroute, dn_fib_dump, 0);
 #else
-   rtnl_register(PF_DECnet, RTM_GETROUTE, dn_cache_getroute,
- dn_cache_dump, 0);
+

[PATCH net-next 1/4] net: rtnetlink: use rcu to free rtnl message handlers

2017-12-02 Thread Florian Westphal

rtnetlink is littered with READ_ONCE() because we can have read accesses
while another cpu can write to the structure we're reading by
(un)registering doit or dumpit handlers.

This patch changes this so that the (un)registering cpu allocates a new
structure and then publishes it via rcu_assign_pointer, i.e. once
another cpu can see such a pointer, no modifications will occur anymore.

based on initial patch from Peter Zijlstra.
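
The copy, modify, publish sequence described above can be modelled in userspace roughly as follows. This is a sketch using C11 atomics in place of rcu_assign_pointer()/rcu_dereference(); it assumes writers are serialised by the caller (the patch uses rtnl_lock() for that) and all sketch_ names are hypothetical.

```c
#include <stdatomic.h>
#include <stdlib.h>

struct sketch_link {
	int doit;	/* stand-ins for the handler function pointers */
	int dumpit;
};

/* One published slot; readers only ever see fully initialised entries. */
static _Atomic(struct sketch_link *) sketch_slot;

/*
 * Writer side: copy the old entry (or start from zero), modify the private
 * copy, then publish it with release semantics.  The old entry is handed
 * back through @oldp; the kernel would pass it to kfree_rcu(), here the
 * caller frees it once no reader can still be using it.
 */
static int sketch_publish(int doit, int dumpit, struct sketch_link **oldp)
{
	struct sketch_link *old =
		atomic_load_explicit(&sketch_slot, memory_order_relaxed);
	struct sketch_link *link = calloc(1, sizeof(*link));

	if (!link)
		return -1;
	if (old)
		*link = *old;
	if (doit)
		link->doit = doit;
	if (dumpit)
		link->dumpit = dumpit;

	atomic_store_explicit(&sketch_slot, link, memory_order_release);
	*oldp = old;
	return 0;
}

/* Reader side: one acquire load, after which the entry is read-only. */
static const struct sketch_link *sketch_lookup(void)
{
	return atomic_load_explicit(&sketch_slot, memory_order_acquire);
}
```

Because a published entry is never written again, readers need no per-field READ_ONCE(); they either see the old entry or the new one, never a mix.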

Cc: Peter Zijlstra 
Signed-off-by: Florian Westphal 
---
 net/core/rtnetlink.c | 154 +--
 1 file changed, 101 insertions(+), 53 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index dabba2a91fc8..ff292d3f2c41 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -63,6 +63,7 @@ struct rtnl_link {
rtnl_doit_func  doit;
rtnl_dumpit_funcdumpit;
unsigned intflags;
+   struct rcu_head rcu;
 };
 
 static DEFINE_MUTEX(rtnl_mutex);
@@ -127,7 +128,7 @@ bool lockdep_rtnl_is_held(void)
 EXPORT_SYMBOL(lockdep_rtnl_is_held);
 #endif /* #ifdef CONFIG_PROVE_LOCKING */
 
-static struct rtnl_link __rcu *rtnl_msg_handlers[RTNL_FAMILY_MAX + 1];
+static struct rtnl_link __rcu **rtnl_msg_handlers[RTNL_FAMILY_MAX + 1];
 static refcount_t rtnl_msg_handlers_ref[RTNL_FAMILY_MAX + 1];
 
 static inline int rtm_msgindex(int msgtype)
@@ -144,6 +145,20 @@ static inline int rtm_msgindex(int msgtype)
return msgindex;
 }
 
+static struct rtnl_link *rtnl_get_link(int protocol, int msgtype)
+{
+   struct rtnl_link **tab;
+
+   if (protocol >= ARRAY_SIZE(rtnl_msg_handlers))
+   protocol = PF_UNSPEC;
+
+   tab = rcu_dereference_rtnl(rtnl_msg_handlers[protocol]);
+   if (!tab)
+   tab = rcu_dereference_rtnl(rtnl_msg_handlers[PF_UNSPEC]);
+
+   return tab[msgtype];
+}
+
 /**
  * __rtnl_register - Register a rtnetlink message type
  * @protocol: Protocol family or PF_UNSPEC
@@ -166,28 +181,52 @@ int __rtnl_register(int protocol, int msgtype,
rtnl_doit_func doit, rtnl_dumpit_func dumpit,
unsigned int flags)
 {
-   struct rtnl_link *tab;
+   struct rtnl_link **tab, *link, *old;
int msgindex;
+   int ret = -ENOBUFS;
 
BUG_ON(protocol < 0 || protocol > RTNL_FAMILY_MAX);
msgindex = rtm_msgindex(msgtype);
 
-   tab = rcu_dereference_raw(rtnl_msg_handlers[protocol]);
+   rtnl_lock();
+   tab = rtnl_msg_handlers[protocol];
if (tab == NULL) {
-   tab = kcalloc(RTM_NR_MSGTYPES, sizeof(*tab), GFP_KERNEL);
-   if (tab == NULL)
-   return -ENOBUFS;
+   tab = kcalloc(RTM_NR_MSGTYPES, sizeof(void *), GFP_KERNEL);
+   if (!tab)
+   goto unlock;
 
+   /* ensures we see the 0 stores */
rcu_assign_pointer(rtnl_msg_handlers[protocol], tab);
}
 
+   old = rtnl_dereference(tab[msgindex]);
+   if (old) {
+   link = kmemdup(old, sizeof(*old), GFP_KERNEL);
+   if (!link)
+   goto unlock;
+   } else {
+   link = kzalloc(sizeof(*link), GFP_KERNEL);
+   if (!link)
+   goto unlock;
+   }
+
+   WARN_ON(doit && link->doit && link->doit != doit);
if (doit)
-   tab[msgindex].doit = doit;
+   link->doit = doit;
+   WARN_ON(dumpit && link->dumpit && link->dumpit != dumpit);
if (dumpit)
-   tab[msgindex].dumpit = dumpit;
-   tab[msgindex].flags |= flags;
+   link->dumpit = dumpit;
 
-   return 0;
+   link->flags |= flags;
+
+   /* publish protocol:msgtype */
+   rcu_assign_pointer(tab[msgindex], link);
+   ret = 0;
+   if (old)
+   kfree_rcu(old, rcu);
+unlock:
+   rtnl_unlock();
+   return ret;
 }
 EXPORT_SYMBOL_GPL(__rtnl_register);
 
@@ -220,24 +259,25 @@ EXPORT_SYMBOL_GPL(rtnl_register);
  */
 int rtnl_unregister(int protocol, int msgtype)
 {
-   struct rtnl_link *handlers;
+   struct rtnl_link **tab, *link;
int msgindex;
 
BUG_ON(protocol < 0 || protocol > RTNL_FAMILY_MAX);
msgindex = rtm_msgindex(msgtype);
 
rtnl_lock();
-   handlers = rtnl_dereference(rtnl_msg_handlers[protocol]);
-   if (!handlers) {
+   tab = rtnl_dereference(rtnl_msg_handlers[protocol]);
+   if (!tab) {
rtnl_unlock();
return -ENOENT;
}
 
-   handlers[msgindex].doit = NULL;
-   handlers[msgindex].dumpit = NULL;
-   handlers[msgindex].flags = 0;
+   link = tab[msgindex];
+   rcu_assign_pointer(tab[msgindex], NULL);
rtnl_unlock();
 
+   kfree_rcu(link, rcu);
+
return 0;
 }
 EXPORT_SYMBOL_GPL(rtnl_unregister);
@@ -251,20 +291,29 @@ EXPORT_SYMBOL_GPL(rtnl_unregister);
  */
 void

[RfC net-next 2/3] net: phy: realtek: configure the INTB pin on RTL8211F

2017-12-02 Thread Martin Blumenstingl

The interrupt pin on the RTL8211F PHY can be used in two different
modes:
INTB:
- the default mode of the PHY
- interrupts can be configured through page 0xa42 register RTL821x_INER
- interrupts can be ACK'ed through RTL8211F_INSR
- it acts as a level-interrupt which is active low
- Wake-on-LAN "wakeup" status is available in RTL8211F_INSR bit 7

PMEB:
- special mode for Wake-on-LAN
- interrupts configured through page 0xa42 register RTL821x_INER are
  disabled
- it supports a "pulse low" waveform for the interrupt

For now we simply force the pin into INTB mode since the PHY driver does
not support Wake-on-LAN yet.
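
Forcing INTB mode boils down to clearing one bit with a read-modify-write. A minimal sketch of that core step, assuming the page_mask_bits helper from the other series clears the mask bits and then sets the given value (that semantics is an assumption; the sketch_ names are illustrative only):

```c
#include <stdint.h>

#define SKETCH_BIT(n)			(1u << (n))
#define SKETCH_INTBCR_INTB_PMEB		SKETCH_BIT(5)	/* 0: INTB, 1: PMEB */

/* Clear the bits in @mask, then set the bits in @set. */
static uint16_t sketch_mask_bits(uint16_t old, uint16_t mask, uint16_t set)
{
	return (uint16_t)((old & ~mask) | set);
}
```

Passing the INTB/PMEB bit as the mask and 0 as the value leaves every other bit of the register untouched and selects INTB mode.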

Signed-off-by: Martin Blumenstingl 
---
 drivers/net/phy/realtek.c | 27 +--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/net/phy/realtek.c b/drivers/net/phy/realtek.c
index d4e7f249a4bc..961165d128d6 100644
--- a/drivers/net/phy/realtek.c
+++ b/drivers/net/phy/realtek.c
@@ -40,6 +40,9 @@
 #define RTL8201F_ISR   0x1e
 #define RTL8201F_IER   0x13
 
+#define RTL8211F_INTBCR0x16
+#define RTL8211F_INTBCR_INTB_PMEB  BIT(5)
+
 MODULE_DESCRIPTION("Realtek PHY driver");
 MODULE_AUTHOR("Johnson Leung");
 MODULE_LICENSE("GPL");
@@ -161,12 +164,32 @@ static int rtl8211e_config_intr(struct phy_device *phydev)
 
 static int rtl8211f_config_intr(struct phy_device *phydev)
 {
+   int err;
u16 val;
 
-   if (phydev->interrupts == PHY_INTERRUPT_ENABLED)
+   if (phydev->interrupts == PHY_INTERRUPT_ENABLED) {
+   /*
+* The interrupt pin has two functions:
+* 0: INTB: it acts as interrupt pin which can be configured
+*through RTL821x_INER and the status can be read through
+*RTL8211F_INSR
+* 1: PMEB: a special "Power Management Event" mode for
+*Wake-on-LAN operation (with support for a "pulse low"
+*wave format). Interrupts configured through RTL821x_INER
+*will not work in this mode
+*
+* select INTB mode in the "INTB pin control" register to
+* ensure that the interrupt pin is in the correct mode.
+*/
+   err = rtl8211x_page_mask_bits(phydev, 0xd40, RTL8211F_INTBCR,
+ RTL8211F_INTBCR_INTB_PMEB, 0);
+   if (err)
+   return err;
+
val = RTL8211F_INER_LINK_STATUS;
-   else
+   } else {
val = 0;
+   }
 
return rtl8211x_page_write(phydev, 0xa42, RTL821x_INER, val);
 }
-- 
2.15.1

[RfC net-next 0/3] RTL8211F Ethernet PHY "documentation"

2017-12-02 Thread Martin Blumenstingl

A recent patch from Heiner made me curious if the RTL8211F part of
the realtek.c PHY driver is correct: [0]
I contacted Realtek and asked if we could get a datasheet for the
RTL8211F PHY. It seems that the full datasheet can only be obtained
under an NDA. However, the contact at Realtek kindly answered all
questions I had regarding the RTL8211F PHY (thank you very much
again!). I am not permitted to share the exact answers I received,
but I am allowed to put them into my own words.

This series is a result of the conversation with Realtek: the main
intention behind this series is to *document* the information I got.
I am not aware of any board which needs the RX delay, has a problem
with the INTB pin configuration or requires any of the additional
interrupts.

I only tested that these patches don't break existing functionality
on a Khadas VIM2 board with an RTL8211F PHY.

PS: I also received information about the RTL8211E PHY's RX and TX
delay configuration. However, I don't understand that part of the
datasheet, so I did not add any #defines in patch #3. The datasheet
says that:
- TX delay is configured through bit 12 and bit 13 in register 0x1c
  in page 0xa4 (value 1 = enabled, 0 = disabled)
- RX delay is configured through bit 11 and bit 13 in register 0x1c
  in page 0xa4 (value 1 = enabled, 0 = disabled)
I don't have any board with an RTL8211E PHY, so I could not test this
part at all. Thus I don't know why bit 13 is listed for both RX and
TX delay.

I do not expect this series to be applied. If someone is interested
in testing it: it applies on top of my other series:
"Realtek Ethernet PHY driver improvements" [1]


[0] https://www.spinics.net/lists/netdev/msg41.html
[1] https://marc.info/?l=linux-netdev&m=151225151410593&w=2

Martin Blumenstingl (3):
  net: phy: realtek: add support for configuring the RX delay on
RTL8211F
  net: phy: realtek: configure the INTB pin on RTL8211F
  net: phy: realtek: add more interrupt bits for RTL8211E and RTL8211F

 drivers/net/phy/realtek.c | 89 ---
 1 file changed, 77 insertions(+), 12 deletions(-)

-- 
2.15.1

Re: Fixing CVE-2017-16939 in v4.4.y and possibly v3.18.y

2017-12-02 Thread Michal Kubecek

On Sat, Dec 02, 2017 at 04:20:40PM -0800, Guenter Roeck wrote:
> On 12/01/2017 11:48 AM, Michal Kubecek wrote:
> > On Thu, Nov 30, 2017 at 10:37:40AM -0800, Guenter Roeck wrote:
> > > Hi,
> > > 
> > > The fix for CVE-2017-16939 has been applied to v4.9.y, but not to v4.4.y
> > > and older kernels. However, I confirmed that running the published POC
> > > (see https://blogs.securiteam.com/index.php/archives/3535) does crash a
> > > 4.4 kernel.
> > > 
> > > I confirmed that the following two patches fix the problem in v4.4.y.
> > > Please consider applying them to v4.4.y (and possibly v3.18.y).
> > > 
> > > fc9e50f5a5a4e ("netlink: add a start callback for starting a netlink dump")
> > > 1137b5e2529a8 ("ipsec: Fix aborted xfrm policy dump crash")
> > > 
> > > My apologies for the noise if this is already under consideration.
> > 
> > It's a bit too big hammer. As Nicolai Stange noticed when we were
> 
> The hammer is just as big as the upstream hammer. Personally I prefer the
> upstream patch; I don't see a reason to deviate from upstream just because
> the upstream solution is more complex than necessary.

Comparing that little patch with the combination of the two commits,
I would say we have a very different idea what "as big as" means. :-)

> > handling this for SLE12 (where fc9e50f5a5a4e would break kABI), it's
> 
> I didn't know that this is even a concern for stable releases. Is there
> some guideline that kABI changes should be avoided in stable releases ?

Not to my knowledge, stable updates break kABI quite often. I just
mentioned it to explain why we had stronger motivation to find another
solution.

Michal Kubecek
