date:20170923

Re: [PATCH net-next] bpf/verifier: improve disassembly of BPF_END instructions

2017-09-23 Thread Alexei Starovoitov

On Fri, Sep 22, 2017 at 09:49:10PM -0700, Y Song wrote:
> On Fri, Sep 22, 2017 at 9:23 AM, Edward Cree  wrote:
> > On 22/09/17 16:16, Alexei Starovoitov wrote:
> >> looks like we're converging on
> >> "be16/be32/be64/le16/le32/le64 #register" for BPF_END.
> >> I guess it can live with that. I would prefer more C like syntax
> >> to match the rest, but llvm parsing point is a strong one.
> > Yep, agreed.  I'll post a v2 once we've settled BPF_NEG.
> >> For BPG_NEG I prefer to do it in C syntax like interpreter does:
> >> ALU_NEG:
> >> DST = (u32) -DST;
> >> ALU64_NEG:
> >> DST = -DST;
> >> Yonghong, does it mean that asmparser will equally suffer?
> > Correction to my earlier statements: verifier will currently disassemble
> >  neg as:
> > (87) r0 neg 0
> > (84) (u32) r0 neg (u32) 0
> >  because it pretends 'neg' is a compound-assignment operator like +=.
> > The analogy with be16 and friends would be to use
> > neg64 r0
> > neg32 r0
> >  whereas the analogy with everything else would be
> > r0 = -r0
> > r0 = (u32) -r0
> >  as Alexei says.
> > I'm happy to go with Alexei's version if it doesn't cause problems for llvm.
> 
> I got some time to do some prototyping in llvm and it looks like that
> I am able to
> resolve the issue and we are able to use more C-like syntax. That is:
> for bswap:
>  r1 = (be16) (u16) r1
>  or
>  r1 = (be16) r1
>  or
>  r1 = be16 r1
> for neg:
>  r0 = -r0
>  (for 32bit support, llvm may output "w0 = -w0" in the future. But
> since it is not
>   enabled yet, you can continue to output "r0 = (u32) -r0".)
> 
> Not sure which syntax is best for bswap. The "r1 = (be16) (u16) r1" is most
> explicit in its intention.
> 
> Attaching my llvm patch as well and cc'ing Jiong and Jakub so they can see my
> implementation and the relative discussion here. (In this patch, I did
> not implement
> bswap for little endian yet.) Maybe they can provide additional comments.

This is awesome. In such case I'd like to swing back to the C syntax for 
bpf_end :)
Any of these
  r1 = (be16) (u16) r1
  or
  r1 = (be16) r1
  or
  r1 = be16 r1
are better than just
  be16 r1
I like 1st the most, since it's explicit in terms of what happens with upper 
bits,
but 2nd is also ok. 3rd is not quite C-like.

[PATCH] mac80211: aead api to reduce redundancy

2017-09-23 Thread Xiang Gao

Currently, the aes_ccm.c and aes_gcm.c are almost line by line
copy of each other. This patch reduce code redundancy by moving
the code in these two files to crypto/aead_api.c to make it a
higher level aead api. The aes_ccm.c and aes_gcm.c are removed
and all the functions are now implemented in their headers using
the newly added aead api.

Signed-off-by: Xiang Gao 
---
 crypto/Makefile |   2 +-
 net/mac80211/aes_gcm.c => crypto/aead_api.c |  53 +++--
 include/crypto/aead_api.h   |  21 +
 net/mac80211/Makefile   |   2 -
 net/mac80211/aes_ccm.c  | 115 
 net/mac80211/aes_ccm.h  |  42 +++---
 net/mac80211/aes_gcm.h  |  40 --
 net/mac80211/key.c  |   1 +
 net/mac80211/wpa.c  |  11 +--
 9 files changed, 118 insertions(+), 169 deletions(-)
 rename net/mac80211/aes_gcm.c => crypto/aead_api.c (54%)
 create mode 100644 include/crypto/aead_api.h
 delete mode 100644 net/mac80211/aes_ccm.c

diff --git a/crypto/Makefile b/crypto/Makefile
index d41f0331b085..541316db5841 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -14,7 +14,7 @@ crypto_algapi-$(CONFIG_PROC_FS) += proc.o
 crypto_algapi-y := algapi.o scatterwalk.o $(crypto_algapi-y)
 obj-$(CONFIG_CRYPTO_ALGAPI2) += crypto_algapi.o
 
-obj-$(CONFIG_CRYPTO_AEAD2) += aead.o
+obj-$(CONFIG_CRYPTO_AEAD2) += aead.o aead_api.o
 
 crypto_blkcipher-y := ablkcipher.o
 crypto_blkcipher-y += blkcipher.o
diff --git a/net/mac80211/aes_gcm.c b/crypto/aead_api.c
similarity index 54%
rename from net/mac80211/aes_gcm.c
rename to crypto/aead_api.c
index 8a4397cc1b08..9585ee400b2e 100644
--- a/net/mac80211/aes_gcm.c
+++ b/crypto/aead_api.c
@@ -1,7 +1,4 @@
-/*
- * Copyright 2014-2015, Qualcomm Atheros, Inc.
- *
- * This program is free software; you can redistribute it and/or modify
+/* This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
  * published by the Free Software Foundation.
  */
@@ -9,43 +6,43 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
-#include 
-#include "key.h"
-#include "aes_gcm.h"
+#include 
 
-int ieee80211_aes_gcm_encrypt(struct crypto_aead *tfm, u8 *j_0, u8 *aad,
- u8 *data, size_t data_len, u8 *mic)
+int aead_encrypt(struct crypto_aead *tfm, u8 *b_0, u8 *aad, size_t aad_len,
+u8 *data, size_t data_len, u8 *auth)
 {
struct scatterlist sg[3];
struct aead_request *aead_req;
int reqsize = sizeof(*aead_req) + crypto_aead_reqsize(tfm);
u8 *__aad;
 
-   aead_req = kzalloc(reqsize + GCM_AAD_LEN, GFP_ATOMIC);
+   aead_req = kzalloc(reqsize + aad_len, GFP_ATOMIC);
if (!aead_req)
return -ENOMEM;
 
__aad = (u8 *)aead_req + reqsize;
-   memcpy(__aad, aad, GCM_AAD_LEN);
+   memcpy(__aad, aad, aad_len);
 
sg_init_table(sg, 3);
-   sg_set_buf([0], &__aad[2], be16_to_cpup((__be16 *)__aad));
+   sg_set_buf([0], __aad, aad_len);
sg_set_buf([1], data, data_len);
-   sg_set_buf([2], mic, IEEE80211_GCMP_MIC_LEN);
+   sg_set_buf([2], auth, tfm->authsize);
 
aead_request_set_tfm(aead_req, tfm);
-   aead_request_set_crypt(aead_req, sg, sg, data_len, j_0);
+   aead_request_set_crypt(aead_req, sg, sg, data_len, b_0);
aead_request_set_ad(aead_req, sg[0].length);
 
crypto_aead_encrypt(aead_req);
kzfree(aead_req);
+
return 0;
 }
+EXPORT_SYMBOL_GPL(aead_encrypt);
 
-int ieee80211_aes_gcm_decrypt(struct crypto_aead *tfm, u8 *j_0, u8 *aad,
- u8 *data, size_t data_len, u8 *mic)
+int aead_decrypt(struct crypto_aead *tfm, u8 *b_0, u8 *aad, size_t aad_len,
+u8 *data, size_t data_len, u8 *auth)
 {
struct scatterlist sg[3];
struct aead_request *aead_req;
@@ -56,21 +53,20 @@ int ieee80211_aes_gcm_decrypt(struct crypto_aead *tfm, u8 
*j_0, u8 *aad,
if (data_len == 0)
return -EINVAL;
 
-   aead_req = kzalloc(reqsize + GCM_AAD_LEN, GFP_ATOMIC);
+   aead_req = kzalloc(reqsize + aad_len, GFP_ATOMIC);
if (!aead_req)
return -ENOMEM;
 
__aad = (u8 *)aead_req + reqsize;
-   memcpy(__aad, aad, GCM_AAD_LEN);
+   memcpy(__aad, aad, aad_len);
 
sg_init_table(sg, 3);
sg_set_buf([0], &__aad[2], be16_to_cpup((__be16 *)__aad));
sg_set_buf([1], data, data_len);
-   sg_set_buf([2], mic, IEEE80211_GCMP_MIC_LEN);
+   sg_set_buf([2], auth, tfm->authsize);
 
aead_request_set_tfm(aead_req, tfm);
-   aead_request_set_crypt(aead_req, sg, sg,
-  data_len + IEEE80211_GCMP_MIC_LEN, j_0);
+   aead_request_set_crypt(aead_req, sg, sg, data_len +

Re: [PATCH 1/1] forcedeth: optimize the xmit/rx with unlikely

2017-09-23 Thread David Miller

From: Zhu Yanjun 
Date: Fri, 22 Sep 2017 10:20:21 -0400

> In the xmit/rx fastpath, the function dma_map_single rarely fails.
> Therefore, add an unlikely() optimization to this error check
> conditional.
> 
> Signed-off-by: Zhu Yanjun 

Applied to net-next, thanks.

Re: [PATCH net] net: qualcomm: rmnet: Fix rcu splat in rmnet_is_real_dev_registered

2017-09-23 Thread David Miller

From: Subash Abhinov Kasiviswanathan 
Date: Thu, 21 Sep 2017 18:00:36 -0600

> Xiaolong reported a suspicious rcu_dereference_check in the device
> unregister notifier callback. Since we do not dereference the
> rx_handler_data, it's ok to just check for the value of the pointer.
> Note that this section is already protected by rtnl_lock.
 ...
> Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial 
> implementation")
> Signed-off-by: Subash Abhinov Kasiviswanathan 
> Reported-by: kernel test robot 

Applied, thank you.

Re: [PATCH net-next 09/14] gtp: Allow configuring GTP interface as standalone

2017-09-23 Thread Harald Welte

Hi Tom,

On Thu, Sep 21, 2017 at 09:43:02AM -0700, Tom Herbert wrote:
> Please see the cover letter for the original GTP kernel patches dated
> May 10, 2016. My first question on those was "Is there a timeline for
> adding IPv6 support?". To which Pablo replied that there was a
> preliminary patch for it that has not been released. 

I'll suggest Pablo to comment on that.  I don't recall the details at
that time, I was only involved in the earliest development of the module
and then handed over.

> If you don't like my patches or don't think that can be adapted to
> fully support the GTP specification, that's fine. 

It's not about "not liking".  I'm very happy about contributions,
including (of course) yours.  It's about making sure that code we merge
into the kernel GTP driver will actually be usable to create a
standards-compliant GTP application or not.

There's no use in merging an IPv6 support patch if already by code
review it can be shown that it's impossible to create a spec-compliant
implementation using that patch.  To me, that would be "merging IPv6
support so we can check off a box on a management form or marketing
sheet", but not for any practical value.

> But then you need to provide a viable alternative.

Why do *I* have to provide a viable alternative?  Who says that *I* have
an obligation to do so?  A (co-)maintainer of a given driver doesn't
have the obligation of implementing any feature as requested.

Community based collaborative development only gets those things done
that people contribute.  I have already contributed almost a decade of
my life to creating Free Software implementations of cellular protocol
stacks, and it continues to be the center of my work and spare time.
GTP is only one protocol layer on one of those stacks.

Pablo, Andreas and I have contributed a Linux kernel implementation that
currently only implements IPv4.  This implementation can by anyone
extended to support IPv6, and as you see from this e-mail thread, there
is interest in helping this along by
* providing code review (even at times when it's personally difficult
  for me)
* providing free hardware for setting up a "private cellular network"
  to test interoperability
* providing testing tools for validation in absence of such a cellular
  network

> We are at the point where a kernel networking feature that only
> supports IPv4 when it could support IPv6 must be considered
> incomplete.

I agree it is incomplete.  There's no doubt about that.  But then,
even the current "incomplete" implementation is working and can be used
to operate an interoperable, spec-compatible IPv4 GGSN.  So it serves a
practical purpose.  All I'm asking is that any IPv6 support patches are
developed with that same practical purpose in mind.

Going through the cover letter of your series again:

>  - IPv6 support

Cannot be merged as-is, see lengthy review discussion

> - Configurable networking interfaces so that GTP kernel can be
>   used and tested without needing GSN network emulation (i.e. no user
>   space daemon needed).
> - Port numbers are configurable

As I indicated, I'm not fundamentally opposed to it, but I'm wondering
how much value they bring in reality.  Andreas has raised the valid
concern that we're adding code that is not used in production setups or
by any of the userspace implementations using this tunneling module.
The code gets more complex and gets code paths that will not be
exercised/tested.

Nevertheless, if it helps you to work on GTP, we can merge them from my
point of view - unless Pablo and/or Andreas object more strongly.

> - GSO,GRO
> - A facility that allows application specific GSO callbacks

Fine with me, but I think you need to convince other folks about the
"application specific GSO" and the usage of the upper bits of
shinfo(skb)->gso_type.

>  - Control of zero UDP checksums

Same as above, Dave was raising some question about it, not sure if
his concern remains.

>  - Addition of a dst_cache in the GTP structure

Fine with me.

As for the patches touching gtp.c:

* 04/14 udp recv clean up:
fine with me, but kbuild robot complaint?
On a minor note, I think you're mixing two unrelated topics:
Separating the UDP receive functions and conversion to gro_cells,
which violates the "one patch per feature" rule.  I'd still
merge it, but would prefer two separate patches

* 05/14 Remove special mtu handling
Pending your rework

* 06/14 Eliminate pktinfo and add port configuration
I don't like the combination of a non-functional "cosmetic"
refactoring of removing a data structure with the introduction
of a new feature.  Makes it harder to review, impossible to
merge only one of the two.  For the rationale of introducing the
gtp_pktinfo struct, see

http://git.osmocom.org/osmo-gtp-kernel/commit/?id=3bc7019c7afd06b5c7d94e5621728d092b82bb85
it was actually intended to make IPv6 support easier, but

Re: [PATCH net-next 03/14] gtp: Call common functions to get tunnel routes and add dst_cache

2017-09-23 Thread Harald Welte

Hi Andreas,

On Wed, Sep 20, 2017 at 05:37:52PM +0200, Andreas Schultz wrote:
> I think we had this discussion before. The sending IP and port are not part
> of the identity of the PDP context. So IMHO the sender is permitted
> to change the source IP at random.

Thanks for the reminder: You are correct, at least in the uplink case
(MS->GGSN) where there is mobility of the MS. In the downlink case
(GGSN->MS), which is the "sending" part for the kernel GTP code used at
a GGSN, I'm not sure if that theory holds true in reality.

Do you agree that the current behavior of not using automatic source
address selection for encapsulated GTP packets but rather using the
source address of the socket is intended?

Do you further agree that the dst_cache support patch by Tom retains
that intended behavior and it should be merged?

-- 
- Harald Welte    http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

[PATCH] staging: rtl8822be: Keep array subscript no lower than zero

2017-09-23 Thread Larry Finger

The kbuild test robot reports the following:
   drivers/staging//rtlwifi/phydm/phydm_dig.c: In function 'odm_pause_dig':
   drivers/staging//rtlwifi/phydm/phydm_dig.c:494:45: warning: array subscript 
is below array bounds [-Warray-bounds]
  odm_write_dig(dm, dig_tab->pause_dig_value[max_level]);

This condition is caused when a loop falls through. The fix is to pin
max_level to be >= 0.

Signed-off-by: Larry Finger 
c: kbuild test robot 
Fixes: 9ce99b04b5b82fdf11e4c76b60a5f82c1e541297 staging: r8822be: Add phydm 
mini driver
---
 drivers/staging/rtlwifi/phydm/phydm_dig.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/staging/rtlwifi/phydm/phydm_dig.c 
b/drivers/staging/rtlwifi/phydm/phydm_dig.c
index 31a4f3fcad19..c88b9788363a 100644
--- a/drivers/staging/rtlwifi/phydm/phydm_dig.c
+++ b/drivers/staging/rtlwifi/phydm/phydm_dig.c
@@ -490,6 +490,8 @@ void odm_pause_dig(void *dm_void, enum phydm_pause_type 
pause_type,
break;
}
 
+   /* pin max_level to be >= 0 */
+   max_level = max_t(s8, 0, max_level);
/* write IGI of lower level */
odm_write_dig(dm, dig_tab->pause_dig_value[max_level]);
ODM_RT_TRACE(dm, ODM_COMP_DIG,
-- 
2.12.3

Re: [Patch net-next] net_sched: use idr to allocate bpf filter handles

2017-09-23 Thread David Miller

From: Cong Wang 
Date: Thu, 21 Sep 2017 16:43:00 -0700

> @@ -421,27 +427,6 @@ static int cls_bpf_set_parms(struct net *net, struct 
> tcf_proto *tp,
>   return 0;
>  }
>  
> -static u32 cls_bpf_grab_new_handle(struct tcf_proto *tp,
> -struct cls_bpf_head *head)
> -{
> - unsigned int i = 0x8000;
> - u32 handle;
> -
> - do {
> - if (++head->hgen == 0x7FFF)
> - head->hgen = 1;
> - } while (--i > 0 && cls_bpf_get(tp, head->hgen));
> -
> - if (unlikely(i == 0)) {
> - pr_err("Insufficient number of handles\n");
> - handle = 0;
> - } else {
> - handle = head->hgen;
> - }
> -
> - return handle;
> -}
> -

If I read the code properly, the existing code allows IDs from '1' to
and including '0x7ffe', whereas the new code allows up to and
including '0x7'.

Whether intentional or not this is a change in behavior.  If it's OK,
it should be at least documented either in a comment or in the
commit log itself.

Similarly for your other scheduler IDR changes.

Thanks.

Re: [PATCH] cnic: Fix an error handling path in 'cnic_alloc_bnx2x_resc()'

2017-09-23 Thread David Miller

From: Christophe JAILLET 
Date: Fri, 22 Sep 2017 01:01:11 +0200

> All the error handling paths 'goto error', except this one.
> We should also go to error in this case, or some resources will be
> leaking.
> 
> Signed-off-by: Christophe JAILLET 

Applied, thank you.

Re: [PATCH net-next v3 3/6] rtnetlink: add helper to dump ifalias

2017-09-23 Thread Florian Westphal

Eric Dumazet  wrote:
> On Sat, 2017-09-23 at 21:26 +0200, Florian Westphal wrote:
> > Reviewed-by: David Ahern 
> > Signed-off-by: Florian Westphal 
> > ---
> >  Changes since v3: don't add rtnl assertion, I placed the assertion
> >  in my working tree instead as a reminder.
> > 
> >  net/core/rtnetlink.c | 11 +--
> >  1 file changed, 9 insertions(+), 2 deletions(-)
> > 
> > diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> > index c801212ee40e..47c17c3de79a 100644
> > --- a/net/core/rtnetlink.c
> > +++ b/net/core/rtnetlink.c
> > @@ -1332,6 +1332,14 @@ static int nla_put_iflink(struct sk_buff *skb, const 
> > struct net_device *dev)
> > return nla_put_u32(skb, IFLA_LINK, ifindex);
> >  }
> >  
> > +static noinline int nla_put_ifalias(struct sk_buff *skb, struct net_device 
> > *dev)
> 
> 
> Why noinline here ?
> 
> This function does not use stack at all (and that would call for
> noinline_for_stack )
> 
> > +{
> > +   if (dev->ifalias)
> > +   return nla_put_string(skb, IFLA_IFALIAS, dev->ifalias);
> > +
> > +   return 0;
> > +}
> > +
> 
> I really do not see the point of not making this RCU aware right away,
> or at least make it in the same patch series...

I saw no point to mix these refactoring with actual change :-|

(and it doesn't help either as-is with netlink dumping, only sysfs path can
 elide rtnl).

Subject: [PATCH net-next] net: core: decouple ifalias get/set from rtnl lock

Device alias can be set by either rtnetlink (rtnl is held) or sysfs.

rtnetlink holds rtnl mutex, sysfs acquires it for this purpose.
Add a new mutex for it plus a seqcount to get a consistent snapshot
of the alias buffer.

This allows the sysfs path to not take rtnl and would later allow
to not hold it when dumping ifalias.

Signed-off-by: Florian Westphal 
---
 include/linux/netdevice.h |  3 +-
 net/core/dev.c| 70 +++
 net/core/net-sysfs.c  | 14 --
 net/core/rtnetlink.c  |  7 +++--
 4 files changed, 70 insertions(+), 24 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index f535779d9dc1..0bcedb498f6f 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1632,7 +1632,7 @@ enum netdev_priv_flags {
 struct net_device {
charname[IFNAMSIZ];
struct hlist_node   name_hlist;
-   char*ifalias;
+   char__rcu *ifalias;
/*
 *  I/O specific fields
 *  FIXME: Merge these and struct ifmap into one
@@ -3275,6 +3275,7 @@ void __dev_notify_flags(struct net_device *, unsigned int 
old_flags,
unsigned int gchanges);
 int dev_change_name(struct net_device *, const char *);
 int dev_set_alias(struct net_device *, const char *, size_t);
+int dev_get_alias(const struct net_device *, char *, size_t);
 int dev_change_net_namespace(struct net_device *, struct net *, const char *);
 int __dev_set_mtu(struct net_device *, int);
 int dev_set_mtu(struct net_device *, int);
diff --git a/net/core/dev.c b/net/core/dev.c
index 97abddd9039a..92d87b4a2db1 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -188,6 +188,9 @@ static struct napi_struct *napi_by_id(unsigned int napi_id);
 DEFINE_RWLOCK(dev_base_lock);
 EXPORT_SYMBOL(dev_base_lock);
 
+static DEFINE_MUTEX(ifalias_mutex);
+static seqcount_t ifalias_rename_seq;
+
 /* protects napi_hash addition/deletion and napi_gen_id */
 static DEFINE_SPINLOCK(napi_hash_lock);
 
@@ -1265,29 +1268,72 @@ int dev_change_name(struct net_device *dev, const char 
*newname)
  */
 int dev_set_alias(struct net_device *dev, const char *alias, size_t len)
 {
-   char *new_ifalias;
-
-   ASSERT_RTNL();
+   char *new_ifalias, *old_ifalias;
 
if (len >= IFALIASZ)
return -EINVAL;
 
+   mutex_lock(_mutex);
+
+   old_ifalias = rcu_dereference_protected(dev->ifalias,
+   
mutex_is_locked(_mutex));
if (!len) {
-   kfree(dev->ifalias);
-   dev->ifalias = NULL;
-   return 0;
+   RCU_INIT_POINTER(dev->ifalias, NULL);
+   goto out;
}
 
-   new_ifalias = krealloc(dev->ifalias, len + 1, GFP_KERNEL);
-   if (!new_ifalias)
+   new_ifalias = __krealloc(old_ifalias, len + 1, GFP_KERNEL);
+   if (!new_ifalias) {
+   mutex_unlock(_mutex);
return -ENOMEM;
-   dev->ifalias = new_ifalias;
-   memcpy(dev->ifalias, alias, len);
-   dev->ifalias[len] = 0;
+   }
 
+   if (new_ifalias == old_ifalias) {
+   write_seqcount_begin(_rename_seq);
+   memcpy(new_ifalias, alias, len);
+   new_ifalias[len] = 0;
+   write_seqcount_end(_rename_seq);
+   mutex_unlock(_mutex);
+   return len;
+   }

Re: [PATCH net-next v3 3/6] rtnetlink: add helper to dump ifalias

2017-09-23 Thread Eric Dumazet

On Sat, 2017-09-23 at 21:26 +0200, Florian Westphal wrote:
> Reviewed-by: David Ahern 
> Signed-off-by: Florian Westphal 
> ---
>  Changes since v3: don't add rtnl assertion, I placed the assertion
>  in my working tree instead as a reminder.
> 
>  net/core/rtnetlink.c | 11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> index c801212ee40e..47c17c3de79a 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -1332,6 +1332,14 @@ static int nla_put_iflink(struct sk_buff *skb, const 
> struct net_device *dev)
>   return nla_put_u32(skb, IFLA_LINK, ifindex);
>  }
>  
> +static noinline int nla_put_ifalias(struct sk_buff *skb, struct net_device 
> *dev)


Why noinline here ?

This function does not use stack at all (and that would call for
noinline_for_stack )

> +{
> + if (dev->ifalias)
> + return nla_put_string(skb, IFLA_IFALIAS, dev->ifalias);
> +
> + return 0;
> +}
> +

I really do not see the point of not making this RCU aware right away,
or at least make it in the same patch series...

[PATCH] au1k_ir.c fix warning: Prefer [subsystem eg: netdev]_info([subsystem]dev, ...

2017-09-23 Thread Yurii Pavlenko

This patch fixes the following checkpatch.pl warning: fix
Prefer [subsystem eg: netdev]_info([subsystem]dev, ... then dev_info(dev, ...
then pr_info(...  to printk(KERN_INFO ...

Signed-off-by: Yurii Pavlenko 
---
 drivers/staging/irda/drivers/au1k_ir.c | 40 +++---
 1 file changed, 18 insertions(+), 22 deletions(-)

diff --git a/drivers/staging/irda/drivers/au1k_ir.c 
b/drivers/staging/irda/drivers/au1k_ir.c
index be4ea6a..73e3e4b 100644
--- a/drivers/staging/irda/drivers/au1k_ir.c
+++ b/drivers/staging/irda/drivers/au1k_ir.c
@@ -290,8 +290,7 @@ static int au1k_irda_set_speed(struct net_device *dev, int 
speed)
while (irda_read(aup, IR_STATUS) & (IR_RX_STATUS | IR_TX_STATUS)) {
msleep(20);
if (!timeout--) {
-   printk(KERN_ERR "%s: rx/tx disable timeout\n",
-   dev->name);
+   netdev_err(dev, "rx/tx disable timeout\n");
break;
}
}
@@ -349,7 +348,7 @@ static int au1k_irda_set_speed(struct net_device *dev, int 
speed)
IR_RX_ENABLE);
break;
default:
-   printk(KERN_ERR "%s unsupported speed %x\n", dev->name, speed);
+   netdev_err(dev, "unsupported speed %x\n", speed);
ret = -EINVAL;
break;
}
@@ -361,18 +360,18 @@ static int au1k_irda_set_speed(struct net_device *dev, 
int speed)
irda_write(aup, IR_RING_PROMPT, 0);
 
if (control & (1 << 14)) {
-   printk(KERN_ERR "%s: configuration error\n", dev->name);
+   netdev_err(dev, "configuration error\n");
} else {
if (control & (1 << 11))
-   printk(KERN_DEBUG "%s Valid SIR config\n", dev->name);
+   netdev_debug(dev, "Valid SIR config\n");
if (control & (1 << 12))
-   printk(KERN_DEBUG "%s Valid MIR config\n", dev->name);
+   netdev_debug(dev, "Valid MIR config\n");
if (control & (1 << 13))
-   printk(KERN_DEBUG "%s Valid FIR config\n", dev->name);
+   netdev_debug(dev, "Valid FIR config\n");
if (control & (1 << 10))
-   printk(KERN_DEBUG "%s TX enabled\n", dev->name);
+   netdev_debug(dev, "TX enabled\n");
if (control & (1 << 9))
-   printk(KERN_DEBUG "%s RX enabled\n", dev->name);
+   netdev_debug(dev, "RX enabled\n");
}
 
return ret;
@@ -584,23 +583,21 @@ static int au1k_irda_start(struct net_device *dev)
 
retval = au1k_init(dev);
if (retval) {
-   printk(KERN_ERR "%s: error in au1k_init\n", dev->name);
+   netdev_err(dev, "error in au1k_init\n");
return retval;
}
 
retval = request_irq(aup->irq_tx, _irda_interrupt, 0,
 dev->name, dev);
if (retval) {
-   printk(KERN_ERR "%s: unable to get IRQ %d\n",
-   dev->name, dev->irq);
+   netdev_err(dev, "unable to get IRQ %d\n", dev->irq);
return retval;
}
retval = request_irq(aup->irq_rx, _irda_interrupt, 0,
 dev->name, dev);
if (retval) {
free_irq(aup->irq_tx, dev);
-   printk(KERN_ERR "%s: unable to get IRQ %d\n",
-   dev->name, dev->irq);
+   netdev_err(dev, "unable to get IRQ %d\n", dev->irq);
return retval;
}
 
@@ -673,12 +670,12 @@ static int au1k_irda_hard_xmit(struct sk_buff *skb, 
struct net_device *dev)
flags = ptxd->flags;
 
if (flags & AU_OWN) {
-   printk(KERN_DEBUG "%s: tx_full\n", dev->name);
+   netdev_debug(dev, "tx_full\n");
netif_stop_queue(dev);
aup->tx_full = 1;
return 1;
} else if (((aup->tx_head + 1) & (NUM_IR_DESC - 1)) == aup->tx_tail) {
-   printk(KERN_DEBUG "%s: tx_full\n", dev->name);
+   netdev_debug(dev, "tx_full\n");
netif_stop_queue(dev);
aup->tx_full = 1;
return 1;
@@ -688,7 +685,7 @@ static int au1k_irda_hard_xmit(struct sk_buff *skb, struct 
net_device *dev)
 
 #if 0
if (irda_read(aup, IR_RX_BYTE_CNT) != 0) {
-   printk(KERN_DEBUG "tx warning: rx byte cnt %x\n",
+   netdev_debug(dev, "tx warning: rx byte cnt %x\n",
irda_read(aup, IR_RX_BYTE_CNT));
}
 #endif
@@ -726,7 +723,7 @@ static void au1k_tx_timeout(struct net_device *dev)
u32 speed;
struct au1k_private *aup = netdev_priv(dev);
 
-   printk(KERN_ERR "%s: tx timeout\n", dev->name);
+

[PATCH 2/2] neigh: make strucrt neigh_table::entry_size unsigned int

2017-09-23 Thread Alexey Dobriyan

Key length can't be negative.

Leave comparisons against nla_len() signed just in case truncated attribute
can sneak in there.

Space savings:

add/remove: 0/0 grow/shrink: 0/7 up/down: 0/-7 (-7)
function old new   delta
pneigh_delete273 272  -1
mlx5e_rep_netevent_event14151414  -1
mlx5e_create_encap_header_ipv6  11941193  -1
mlx5e_create_encap_header_ipv4  10711070  -1
cxgb4_l2t_get   11041103  -1
__pneigh_lookup   69  68  -1
__neigh_create  24522451  -1

Signed-off-by: Alexey Dobriyan 
---

 drivers/net/ethernet/chelsio/cxgb4/l2t.c |4 ++--
 include/net/neighbour.h  |2 +-
 net/core/neighbour.c |   18 +-
 3 files changed, 12 insertions(+), 12 deletions(-)

--- a/drivers/net/ethernet/chelsio/cxgb4/l2t.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/l2t.c
@@ -422,7 +422,7 @@ struct l2t_entry *cxgb4_l2t_get(struct l2t_data *d, struct 
neighbour *neigh,
u8 lport;
u16 vlan;
struct l2t_entry *e;
-   int addr_len = neigh->tbl->key_len;
+   unsigned int addr_len = neigh->tbl->key_len;
u32 *addr = (u32 *)neigh->primary_key;
int ifidx = neigh->dev->ifindex;
int hash = addr_hash(d, addr, addr_len, ifidx);
@@ -536,7 +536,7 @@ void t4_l2t_update(struct adapter *adap, struct neighbour 
*neigh)
struct l2t_entry *e;
struct sk_buff_head *arpq = NULL;
struct l2t_data *d = adap->l2t;
-   int addr_len = neigh->tbl->key_len;
+   unsigned int addr_len = neigh->tbl->key_len;
u32 *addr = (u32 *) neigh->primary_key;
int ifidx = neigh->dev->ifindex;
int hash = addr_hash(d, addr, addr_len, ifidx);
--- a/include/net/neighbour.h
+++ b/include/net/neighbour.h
@@ -191,7 +191,7 @@ struct neigh_hash_table {
 struct neigh_table {
int family;
unsigned intentry_size;
-   int key_len;
+   unsigned intkey_len;
__be16  protocol;
__u32   (*hash)(const void *pkey,
const struct net_device *dev,
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -457,7 +457,7 @@ struct neighbour *neigh_lookup_nodev(struct neigh_table 
*tbl, struct net *net,
 const void *pkey)
 {
struct neighbour *n;
-   int key_len = tbl->key_len;
+   unsigned int key_len = tbl->key_len;
u32 hash_val;
struct neigh_hash_table *nht;
 
@@ -488,7 +488,7 @@ struct neighbour *__neigh_create(struct neigh_table *tbl, 
const void *pkey,
 struct net_device *dev, bool want_ref)
 {
u32 hash_val;
-   int key_len = tbl->key_len;
+   unsigned int key_len = tbl->key_len;
int error;
struct neighbour *n1, *rc, *n = neigh_alloc(tbl, dev);
struct neigh_hash_table *nht;
@@ -572,7 +572,7 @@ struct neighbour *__neigh_create(struct neigh_table *tbl, 
const void *pkey,
 }
 EXPORT_SYMBOL(__neigh_create);
 
-static u32 pneigh_hash(const void *pkey, int key_len)
+static u32 pneigh_hash(const void *pkey, unsigned int key_len)
 {
u32 hash_val = *(u32 *)(pkey + key_len - 4);
hash_val ^= (hash_val >> 16);
@@ -585,7 +585,7 @@ static u32 pneigh_hash(const void *pkey, int key_len)
 static struct pneigh_entry *__pneigh_lookup_1(struct pneigh_entry *n,
  struct net *net,
  const void *pkey,
- int key_len,
+ unsigned int key_len,
  struct net_device *dev)
 {
while (n) {
@@ -601,7 +601,7 @@ static struct pneigh_entry *__pneigh_lookup_1(struct 
pneigh_entry *n,
 struct pneigh_entry *__pneigh_lookup(struct neigh_table *tbl,
struct net *net, const void *pkey, struct net_device *dev)
 {
-   int key_len = tbl->key_len;
+   unsigned int key_len = tbl->key_len;
u32 hash_val = pneigh_hash(pkey, key_len);
 
return __pneigh_lookup_1(tbl->phash_buckets[hash_val],
@@ -614,7 +614,7 @@ struct pneigh_entry * pneigh_lookup(struct neigh_table *tbl,
struct net_device *dev, int creat)
 {
struct pneigh_entry *n;
-   int key_len = tbl->key_len;
+   unsigned int key_len = tbl->key_len;
u32 hash_val = pneigh_hash(pkey, key_len);
 
read_lock_bh(>lock);
@@ -659,7 +659,7 @@ int pneigh_delete(struct neigh_table *tbl, struct net *net, 
const void *pkey,
  struct

[PATCH 1/2] neigh: make struct neigh_table::entry_size unsigned int

2017-09-23 Thread Alexey Dobriyan

Neigh entry size can't be negative.

Space savings:

add/remove: 0/0 grow/shrink: 0/5 up/down: 0/-7 (-7)
function old new   delta
lowpan_neigh_construct25  24  -1
clip_seq_sub_iter152 151  -1
clip_ioctl  14751474  -1
clip_constructor  93  92  -1
__neigh_create  24552452  -3

Signed-off-by: Alexey Dobriyan 
---

 include/net/neighbour.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/include/net/neighbour.h
+++ b/include/net/neighbour.h
@@ -190,7 +190,7 @@ struct neigh_hash_table {
 
 struct neigh_table {
int family;
-   int entry_size;
+   unsigned intentry_size;
int key_len;
__be16  protocol;
__u32   (*hash)(const void *pkey,

[PATCH net-next] net: speed up skb_rbtree_purge()

2017-09-23 Thread Eric Dumazet

From: Eric Dumazet 

As measured in my prior patch ("sch_netem: faster rb tree removal"),
rbtree_postorder_for_each_entry_safe() is nice looking but much slower
than using rb_next() directly, except when tree is small enough
to fit in CPU caches (then the cost is the same)

Also note that there is not even an increase of text size :
$ size net/core/skbuff.o.before net/core/skbuff.o
   textdata bss dec hex filename
  407111298   0   42009a419 net/core/skbuff.o.before
  407111298   0   42009a419 net/core/skbuff.o


From: Eric Dumazet 
---
 net/core/skbuff.c |   11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 
16982de649b97b92423a4f9f5eac1e98ca803370..000ce735fa8d649e7abeeef2ebab8501dea96efd
 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2848,12 +2848,15 @@ EXPORT_SYMBOL(skb_queue_purge);
  */
 void skb_rbtree_purge(struct rb_root *root)
 {
-   struct sk_buff *skb, *next;
+   struct rb_node *p = rb_first(root);
 
-   rbtree_postorder_for_each_entry_safe(skb, next, root, rbnode)
-   kfree_skb(skb);
+   while (p) {
+   struct sk_buff *skb = rb_entry(p, struct sk_buff, rbnode);
 
-   *root = RB_ROOT;
+   p = rb_next(p);
+   rb_erase(>rbnode, root);
+   kfree_skb(skb);
+   }
 }
 
 /**

[RESEND] Re: usb/net/p54: trying to register non-static key in p54_unregister_leds

2017-09-23 Thread Christian Lamparter

This got rejected by gmail once. Let's see if it works now.

On Thursday, September 21, 2017 8:22:45 PM CEST Andrey Konovalov wrote:
> On Wed, Sep 20, 2017 at 9:55 PM, Johannes Berg
>  wrote:
> > On Wed, 2017-09-20 at 21:27 +0200, Christian Lamparter wrote:
> >
> >> It seems this is caused as a result of:
> >> -> lock_map_acquire(>lockdep_map);
> >>   lock_map_release(>lockdep_map);
> >>
> >> in flush_work() [0]
> >
> > Agree.
> >
> >> This was added by:
> >>
> >>   commit 0976dfc1d0cd80a4e9dfaf87bd8744612bde475a
> >>   Author: Stephen Boyd 
> >>   Date:   Fri Apr 20 17:28:50 2012 -0700
> >>
> >>   workqueue: Catch more locking problems with flush_work()
> >
> > Yes, but that doesn't matter.
> >
> >> Looking at the Stephen's patch, it's clear that it was made
> >> with "static DECLARE_WORK(work, my_work)" in mind. However
> >> p54's led_work is "per-device", hence it is stored in the
> >> devices context p54_common, which is dynamically allocated.
> >> So, maybe revert Stephen's patch?
> >
> > I disagree - as the lockdep warning says:
> >
> >> > INFO: trying to register non-static key.
> >> > the code is fine but needs lockdep annotation.
> >> > turning off the locking correctness validator.
> >
> > What it needs is to actually correctly go through initializing the work
> > at least once.
> >
> > Without more information, I can't really say what's going on, but I
> > assume that something is failing and p54_unregister_leds() is getting
> > invoked without p54_init_leds() having been invoked, so essentially
> > it's trying to flush a work that was never initialized?
> >
> > INIT_DELAYED_WORK() does, after all, initialize the lockdep map
> > properly via __INIT_WORK().

Ok, thanks. This does indeed explain it.

But this also begs the question: Is this really working then?
>From what I can tell, if CONFIG_LOCKDEP is not set then there's no BUG
no WARN, no other splat or any other odd system behaviour. Does
[cancel | flush]_[delayed_]work[_sync] really "just work" by *accident*,
as long the delayed_work | work_struct is zeroed out? 
And should it work in the future as well?

> Since I'm able to reproduce this, please let me know if you need me to
> collect some debug traces to help with the triage.

Do you want to take a shot at making a patch too? At a quick glance, it
should be enough to move the [#ifdef CONFIG_P54_LEDS ... #endif] block
in p54_unregister_common() into the if (priv->registered) { block
(preferably before the ieee80211_unregister_hw(dev).

Regards,
Christian

[PATCH net-next v3 5/6] rtnetlink: add helpers to dump netnsid information

2017-09-23 Thread Florian Westphal

Reviewed-by: David Ahern 
Signed-off-by: Florian Westphal 
---
 Changes since v2:
 this hunk was part of patch #4.

 net/core/rtnetlink.c | 30 +++---
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 625342d27c44..e858a2b48d7e 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1370,6 +1370,23 @@ static noinline int nla_put_ifalias(struct sk_buff *skb, 
struct net_device *dev)
return 0;
 }
 
+static noinline int rtnl_fill_link_netnsid(struct sk_buff *skb,
+  const struct net_device *dev)
+{
+   if (dev->rtnl_link_ops && dev->rtnl_link_ops->get_link_net) {
+   struct net *link_net = dev->rtnl_link_ops->get_link_net(dev);
+
+   if (!net_eq(dev_net(dev), link_net)) {
+   int id = peernet2id_alloc(dev_net(dev), link_net);
+
+   if (nla_put_s32(skb, IFLA_LINK_NETNSID, id))
+   return -EMSGSIZE;
+   }
+   }
+
+   return 0;
+}
+
 static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
int type, u32 pid, u32 seq, u32 change,
unsigned int flags, u32 ext_filter_mask,
@@ -1458,17 +1475,8 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct 
net_device *dev,
goto nla_put_failure;
}
 
-   if (dev->rtnl_link_ops &&
-   dev->rtnl_link_ops->get_link_net) {
-   struct net *link_net = dev->rtnl_link_ops->get_link_net(dev);
-
-   if (!net_eq(dev_net(dev), link_net)) {
-   int id = peernet2id_alloc(dev_net(dev), link_net);
-
-   if (nla_put_s32(skb, IFLA_LINK_NETNSID, id))
-   goto nla_put_failure;
-   }
-   }
+   if (rtnl_fill_link_netnsid(skb, dev))
+   goto nla_put_failure;
 
if (!(af_spec = nla_nest_start(skb, IFLA_AF_SPEC)))
goto nla_put_failure;
-- 
2.13.5

[PATCH net-next v3 1/6] selftests: rtnetlink.sh: add rudimentary vrf test

2017-09-23 Thread Florian Westphal

Acked-by: David Ahern 
Signed-off-by: Florian Westphal 
---
 Changes since v1: indent all lines with tabs, not spaces

 tools/testing/selftests/net/rtnetlink.sh | 42 
 1 file changed, 42 insertions(+)

diff --git a/tools/testing/selftests/net/rtnetlink.sh 
b/tools/testing/selftests/net/rtnetlink.sh
index 4b48de565cae..a048f7a5f94c 100755
--- a/tools/testing/selftests/net/rtnetlink.sh
+++ b/tools/testing/selftests/net/rtnetlink.sh
@@ -291,6 +291,47 @@ kci_test_ifalias()
echo "PASS: set ifalias $namewant for $devdummy"
 }
 
+kci_test_vrf()
+{
+   vrfname="test-vrf"
+   ret=0
+
+   ip link show type vrf 2>/dev/null
+   if [ $? -ne 0 ]; then
+   echo "SKIP: vrf: iproute2 too old"
+   return 0
+   fi
+
+   ip link add "$vrfname" type vrf table 10
+   check_err $?
+   if [ $ret -ne 0 ];then
+   echo "FAIL: can't add vrf interface, skipping test"
+   return 0
+   fi
+
+   ip -br link show type vrf | grep -q "$vrfname"
+   check_err $?
+   if [ $ret -ne 0 ];then
+   echo "FAIL: created vrf device not found"
+   return 1
+   fi
+
+   ip link set dev "$vrfname" up
+   check_err $?
+
+   ip link set dev "$devdummy" master "$vrfname"
+   check_err $?
+   ip link del dev "$vrfname"
+   check_err $?
+
+   if [ $ret -ne 0 ];then
+   echo "FAIL: vrf"
+   return 1
+   fi
+
+   echo "PASS: vrf"
+}
+
 kci_test_rtnl()
 {
kci_add_dummy
@@ -306,6 +347,7 @@ kci_test_rtnl()
kci_test_bridge
kci_test_addrlabel
kci_test_ifalias
+   kci_test_vrf
 
kci_del_dummy
 }
-- 
2.13.5

[PATCH net-next v3 3/6] rtnetlink: add helper to dump ifalias

2017-09-23 Thread Florian Westphal

Reviewed-by: David Ahern 
Signed-off-by: Florian Westphal 
---
 Changes since v3: don't add rtnl assertion, I placed the assertion
 in my working tree instead as a reminder.

 net/core/rtnetlink.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index c801212ee40e..47c17c3de79a 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1332,6 +1332,14 @@ static int nla_put_iflink(struct sk_buff *skb, const 
struct net_device *dev)
return nla_put_u32(skb, IFLA_LINK, ifindex);
 }
 
+static noinline int nla_put_ifalias(struct sk_buff *skb, struct net_device 
*dev)
+{
+   if (dev->ifalias)
+   return nla_put_string(skb, IFLA_IFALIAS, dev->ifalias);
+
+   return 0;
+}
+
 static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
int type, u32 pid, u32 seq, u32 change,
unsigned int flags, u32 ext_filter_mask,
@@ -1374,8 +1382,7 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct 
net_device *dev,
nla_put_u8(skb, IFLA_CARRIER, netif_carrier_ok(dev)) ||
(dev->qdisc &&
 nla_put_string(skb, IFLA_QDISC, dev->qdisc->ops->id)) ||
-   (dev->ifalias &&
-nla_put_string(skb, IFLA_IFALIAS, dev->ifalias)) ||
+   nla_put_ifalias(skb, dev) ||
nla_put_u32(skb, IFLA_CARRIER_CHANGES,
atomic_read(>carrier_changes)) ||
nla_put_u8(skb, IFLA_PROTO_DOWN, dev->proto_down))
-- 
2.13.5

[PATCH net-next v3 6/6] rtnetlink: rtnl_have_link_slave_info doesn't need rtnl

2017-09-23 Thread Florian Westphal

it can be switched to rcu.

Reviewed-by: David Ahern 
Signed-off-by: Florian Westphal 
---
 Changes since v2: remove ASSERT_RTNL.

 net/core/rtnetlink.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index e858a2b48d7e..c69451964a44 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -522,11 +522,15 @@ static size_t rtnl_link_get_af_size(const struct 
net_device *dev,
 static bool rtnl_have_link_slave_info(const struct net_device *dev)
 {
struct net_device *master_dev;
+   bool ret = false;
 
-   master_dev = netdev_master_upper_dev_get((struct net_device *) dev);
+   rcu_read_lock();
+
+   master_dev = netdev_master_upper_dev_get_rcu((struct net_device *)dev);
if (master_dev && master_dev->rtnl_link_ops)
-   return true;
-   return false;
+   ret = true;
+   rcu_read_unlock();
+   return ret;
 }
 
 static int rtnl_link_slave_info_fill(struct sk_buff *skb,
-- 
2.13.5

[PATCH net-next v3 4/6] rtnetlink: add helpers to dump vf information

2017-09-23 Thread Florian Westphal

similar to earlier patches, split out more parts of this function to
better see what is happening and where we assume rtnl is locked.

Reviewed-by: David Ahern 
Signed-off-by: Florian Westphal 
---
 changes since v2: split this patch into two, last submission also
 added netnsid helper, which was moved to next patch.

 net/core/rtnetlink.c | 50 +++---
 1 file changed, 31 insertions(+), 19 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 47c17c3de79a..625342d27c44 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1211,6 +1211,36 @@ static noinline_for_stack int rtnl_fill_vfinfo(struct 
sk_buff *skb,
return -EMSGSIZE;
 }
 
+static noinline_for_stack int rtnl_fill_vf(struct sk_buff *skb,
+  struct net_device *dev,
+  u32 ext_filter_mask)
+{
+   struct nlattr *vfinfo;
+   int i, num_vfs;
+
+   if (!dev->dev.parent || ((ext_filter_mask & RTEXT_FILTER_VF) == 0))
+   return 0;
+
+   num_vfs = dev_num_vf(dev->dev.parent);
+   if (nla_put_u32(skb, IFLA_NUM_VF, num_vfs))
+   return -EMSGSIZE;
+
+   if (!dev->netdev_ops->ndo_get_vf_config)
+   return 0;
+
+   vfinfo = nla_nest_start(skb, IFLA_VFINFO_LIST);
+   if (!vfinfo)
+   return -EMSGSIZE;
+
+   for (i = 0; i < num_vfs; i++) {
+   if (rtnl_fill_vfinfo(skb, dev, i, vfinfo))
+   return -EMSGSIZE;
+   }
+
+   nla_nest_end(skb, vfinfo);
+   return 0;
+}
+
 static int rtnl_fill_link_ifmap(struct sk_buff *skb, struct net_device *dev)
 {
struct rtnl_link_ifmap map;
@@ -1414,27 +1444,9 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct 
net_device *dev,
if (rtnl_fill_stats(skb, dev))
goto nla_put_failure;
 
-   if (dev->dev.parent && (ext_filter_mask & RTEXT_FILTER_VF) &&
-   nla_put_u32(skb, IFLA_NUM_VF, dev_num_vf(dev->dev.parent)))
+   if (rtnl_fill_vf(skb, dev, ext_filter_mask))
goto nla_put_failure;
 
-   if (dev->netdev_ops->ndo_get_vf_config && dev->dev.parent &&
-   ext_filter_mask & RTEXT_FILTER_VF) {
-   int i;
-   struct nlattr *vfinfo;
-   int num_vfs = dev_num_vf(dev->dev.parent);
-
-   vfinfo = nla_nest_start(skb, IFLA_VFINFO_LIST);
-   if (!vfinfo)
-   goto nla_put_failure;
-   for (i = 0; i < num_vfs; i++) {
-   if (rtnl_fill_vfinfo(skb, dev, i, vfinfo))
-   goto nla_put_failure;
-   }
-
-   nla_nest_end(skb, vfinfo);
-   }
-
if (rtnl_port_fill(skb, dev, ext_filter_mask))
goto nla_put_failure;
 
-- 
2.13.5

[PATCH net-next v3 2/6] rtnetlink: add helper to put master and link ifindexes

2017-09-23 Thread Florian Westphal

rtnl_fill_ifinfo currently requires caller to hold the rtnl mutex.
Unfortunately the function is quite large which makes it harder to see
which spots require the lock, which spots assume it and which ones could
do without.

Add helpers to factor out the ifindex dumping, one can use rcu to avoid
rtnl dependency.

Reviewed-by: David Ahern 
Signed-off-by: Florian Westphal 
---
 no changes since v2.
 net/core/rtnetlink.c | 32 +++-
 1 file changed, 27 insertions(+), 5 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index a78fd61da0ec..c801212ee40e 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1307,6 +1307,31 @@ static u32 rtnl_get_event(unsigned long event)
return rtnl_event_type;
 }
 
+static int put_master_ifindex(struct sk_buff *skb, struct net_device *dev)
+{
+   const struct net_device *upper_dev;
+   int ret = 0;
+
+   rcu_read_lock();
+
+   upper_dev = netdev_master_upper_dev_get_rcu(dev);
+   if (upper_dev)
+   ret = nla_put_u32(skb, IFLA_MASTER, upper_dev->ifindex);
+
+   rcu_read_unlock();
+   return ret;
+}
+
+static int nla_put_iflink(struct sk_buff *skb, const struct net_device *dev)
+{
+   int ifindex = dev_get_iflink(dev);
+
+   if (dev->ifindex == ifindex)
+   return 0;
+
+   return nla_put_u32(skb, IFLA_LINK, ifindex);
+}
+
 static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
int type, u32 pid, u32 seq, u32 change,
unsigned int flags, u32 ext_filter_mask,
@@ -1316,7 +1341,6 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct 
net_device *dev,
struct nlmsghdr *nlh;
struct nlattr *af_spec;
struct rtnl_af_ops *af_ops;
-   struct net_device *upper_dev = netdev_master_upper_dev_get(dev);
 
ASSERT_RTNL();
nlh = nlmsg_put(skb, pid, seq, type, sizeof(*ifm), flags);
@@ -1345,10 +1369,8 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct 
net_device *dev,
 #ifdef CONFIG_RPS
nla_put_u32(skb, IFLA_NUM_RX_QUEUES, dev->num_rx_queues) ||
 #endif
-   (dev->ifindex != dev_get_iflink(dev) &&
-nla_put_u32(skb, IFLA_LINK, dev_get_iflink(dev))) ||
-   (upper_dev &&
-nla_put_u32(skb, IFLA_MASTER, upper_dev->ifindex)) ||
+   nla_put_iflink(skb, dev) ||
+   put_master_ifindex(skb, dev) ||
nla_put_u8(skb, IFLA_CARRIER, netif_carrier_ok(dev)) ||
(dev->qdisc &&
 nla_put_string(skb, IFLA_QDISC, dev->qdisc->ops->id)) ||
-- 
2.13.5

[PATCH net-next v3 0/6] rtnetlink: preparation patches for further rtnl lock pushdown/removal

2017-09-23 Thread Florian Westphal

First patch adds a rudimentary vrf test case.

Remaining patches split large rtnl_fill_ifinfo into smaller chunks
to better see which parts

1. require rtnl
2. do not require it at all
3. rely on rtnl locking now but could be converted

I removed all the ASSERT_RTNL spots that v1 and v2 added,
i will keep that back in my working branch since those
are just 'todo' markers for myself.

Eric Dumazet pointed out that qdiscs are now freed directly without
call_rcu so I dropped the patch that made this assumption from the series.

As the remaining patches did not see major changes vs. v2 I retained
all reviewed/acked-by tags from David Ahern.

Re: [PATCH net-next v2 3/6] rtnetlink: add helper to dump qdisc name

2017-09-23 Thread Florian Westphal

Eric Dumazet  wrote:
> On Fri, 2017-09-22 at 08:10 +0200, Florian Westphal wrote:
> > We can use rcu here to make this safe even if we would not hold rtnl:
> > qdisc_destroy uses call_rcu to free the Qdisc struct.
> 
> 
> Where do you see call_rcu() called from qdisc_destroy() ?
> 
> You missed this commit I guess
> 
> 752fbcc33405d6f8249465e4b2c4e420091bb825
> ("net_sched: no need to free qdisc in RCU callback")

Indeed, I did, patch dropped, thanks for the heads-up.

[PATCH net-next] sch_netem: faster rb tree removal

2017-09-23 Thread Eric Dumazet

From: Eric Dumazet 

While running TCP tests involving netem storing millions of packets,
I had the idea to speed up tfifo_reset() and did experiments.

I tried the rbtree_postorder_for_each_entry_safe() method that is
used in skb_rbtree_purge() but discovered it was slower than the
current tfifo_reset() method.

I measured time taken to release skbs with three occupation levels :
10^4, 10^5 and 10^6 skbs with three methods :

1) (current 'naive' method)

while ((p = rb_first(>t_root))) {
struct sk_buff *skb = netem_rb_to_skb(p);
 
rb_erase(p, >t_root);
rtnl_kfree_skbs(skb, skb);
}

2) Use rb_next() instead of rb_first() in the loop :

p = rb_first(>t_root);
while (p) {
struct sk_buff *skb = netem_rb_to_skb(p);

p = rb_next(p);
rb_erase(>rbnode, >t_root);
rtnl_kfree_skbs(skb, skb);
}

3) "optimized" method using rbtree_postorder_for_each_entry_safe()

struct sk_buff *skb, *next;

rbtree_postorder_for_each_entry_safe(skb, next,
 >t_root, rbnode) {
   rtnl_kfree_skbs(skb, skb);
}
q->t_root = RB_ROOT;

Results :

method_1:while (rb_first()) rb_erase() 1 skbs in 690378 ns (69 ns per skb)
method_2:rb_first; while (p) { p = rb_next(p); ...}  1 skbs in 541846 ns 
(54 ns per skb)
method_3:rbtree_postorder_for_each_entry_safe() 1 skbs in 868307 ns (86 ns 
per skb)

method_1:while (rb_first()) rb_erase() 6 skbs in 7804021 ns (78 ns per skb)
method_2:rb_first; while (p) { p = rb_next(p); ...}  10 skbs in 5942456 ns 
(59 ns per skb)
method_3:rbtree_postorder_for_each_entry_safe() 10 skbs in 11584940 ns (115 
ns per skb)

method_1:while (rb_first()) rb_erase() 100 skbs in 108577838 ns (108 ns per 
skb)
method_2:rb_first; while (p) { p = rb_next(p); ...}  100 skbs in 82619635 
ns (82 ns per skb)
method_3:rbtree_postorder_for_each_entry_safe() 100 skbs in 127328743 ns 
(127 ns per skb)

Method 2) is simply faster, probably because it maintains a smaller
working size set.

Note that this is the method we use in tcp_ofo_queue() already.

I will also change skb_rbtree_purge() in a second patch.

Signed-off-by: Eric Dumazet 
---
 net/sched/sch_netem.c |7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 
063a4bdb9ee6f26b01387959e8f6ccd15ec16191..5a4f1008029068372019a965186e7a3c0a18aac3
 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -361,12 +361,13 @@ static psched_time_t packet_len_2_sched_time(unsigned int 
len, struct netem_sche
 static void tfifo_reset(struct Qdisc *sch)
 {
struct netem_sched_data *q = qdisc_priv(sch);
-   struct rb_node *p;
+   struct rb_node *p = rb_first(>t_root);
 
-   while ((p = rb_first(>t_root))) {
+   while (p) {
struct sk_buff *skb = netem_rb_to_skb(p);
 
-   rb_erase(p, >t_root);
+   p = rb_next(p);
+   rb_erase(>rbnode, >t_root);
rtnl_kfree_skbs(skb, skb);
}
 }

Re: [PATCH net-next v2 3/6] rtnetlink: add helper to dump qdisc name

2017-09-23 Thread Eric Dumazet

On Fri, 2017-09-22 at 08:10 +0200, Florian Westphal wrote:
> We can use rcu here to make this safe even if we would not hold rtnl:
> qdisc_destroy uses call_rcu to free the Qdisc struct.


Where do you see call_rcu() called from qdisc_destroy() ?

You missed this commit I guess

752fbcc33405d6f8249465e4b2c4e420091bb825
("net_sched: no need to free qdisc in RCU callback")

Re: [PATCH] net: dsa: avoid null pointer dereference on p->phy

2017-09-23 Thread Florian Fainelli



On 09/23/2017 09:57 AM, Colin King wrote:
> From: Colin Ian King 
> 
> Currently p->phy is being null checked in several places to avoid
> null pointer dereferences on p->phy, however, the final call
> to phy_attached_info on p->phy when p->phy will perform a null
> pointer dereference. Fix this by simply moving the call into
> the previous code block that is only executed if p->phy is
> not null.
> 
> Detected by CoverityScan, CID#1457034 ("Dereference after null check")

The code flow is not exactly easy to read, but I don't see how we can
actually wind up in that situation because we check the return values of
of_phy_connect() and dsa_slave_phy_connect() earlier on.

> 
> Fixes: 2220943a21e2 ("phy: Centralise print about attached phy")
> Signed-off-by: Colin Ian King 
> ---
>  net/dsa/slave.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/net/dsa/slave.c b/net/dsa/slave.c
> index 02ace7d462c4..29ab4e98639b 100644
> --- a/net/dsa/slave.c
> +++ b/net/dsa/slave.c
> @@ -1115,10 +1115,9 @@ static int dsa_slave_phy_setup(struct net_device 
> *slave_dev)
>   of_phy_deregister_fixed_link(port_dn);
>   return ret;
>   }
> + phy_attached_info(p->phy);
>   }
>  
> - phy_attached_info(p->phy);
> -
>   return 0;
>  }
>  
> 

-- 
Florian

Re: [PATCH net-next v2 6/6] rtnetlink: rtnl_have_link_slave_info doesn't need rtnl

2017-09-23 Thread David Ahern

On 9/22/17 12:10 AM, Florian Westphal wrote:
> Switch it to rcu.
> 
> rtnl_link_slave_info_fill on to other hand does need rtnl mutex for now so
> add an annotation to its caller as a reminder.
> 
> Signed-off-by: Florian Westphal 
> ---
>  Change since v1:
>   - don't add ASSERT_RTNL to rtnl_link_slave_info_fill and
>   rtnl_link_info_fill they are only called via rtnl_link_fill.
> 
>  net/core/rtnetlink.c | 12 +---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> index 590823f70cc3..e6f9e36d9d5a 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -522,11 +522,15 @@ static size_t rtnl_link_get_af_size(const struct 
> net_device *dev,
>  static bool rtnl_have_link_slave_info(const struct net_device *dev)
>  {
>   struct net_device *master_dev;
> + bool ret = false;
>  
> - master_dev = netdev_master_upper_dev_get((struct net_device *) dev);
> + rcu_read_lock();
> +
> + master_dev = netdev_master_upper_dev_get_rcu((struct net_device *)dev);
>   if (master_dev && master_dev->rtnl_link_ops)
> - return true;
> - return false;
> + ret = true;
> + rcu_read_unlock();
> + return ret;
>  }
>  
>  static int rtnl_link_slave_info_fill(struct sk_buff *skb,
> @@ -598,6 +602,8 @@ static int rtnl_link_fill(struct sk_buff *skb, const 
> struct net_device *dev)
>   struct nlattr *linkinfo;
>   int err = -EMSGSIZE;
>  
> + ASSERT_RTNL();

Rather than sprinkling the ASSERT_RTNL why not just add a comment above
the function which is done in many places. Since this is really meant as
your notes as you remove rtnl usage a comment serves the same purpose.

> +
>   linkinfo = nla_nest_start(skb, IFLA_LINKINFO);
>   if (linkinfo == NULL)
>   goto out;
> 


Reviewed-by: David Ahern

Re: [PATCH net-next v2 4/6] rtnetlink: add helper to dump ifalias

2017-09-23 Thread David Ahern

On 9/22/17 12:10 AM, Florian Westphal wrote:
> ifalias is currently protected by rtnl mutex, add assertion
> as a reminder.
> 
> Signed-off-by: Florian Westphal 
> ---
>  net/core/rtnetlink.c | 13 +++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> index ad3f27da37a8..42ff582a010e 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -1345,6 +1345,16 @@ static int nla_put_qdisc(struct sk_buff *skb, struct 
> net_device *dev)
>   return ret;
>  }
>  
> +static noinline int nla_put_ifalias(struct sk_buff *skb, struct net_device 
> *dev)
> +{
> + ASSERT_RTNL();

The assert is not needed given the code path.

> +
> + if (dev->ifalias)
> + return nla_put_string(skb, IFLA_IFALIAS, dev->ifalias);
> +
> + return 0;
> +}
> +
>  static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
>   int type, u32 pid, u32 seq, u32 change,
>   unsigned int flags, u32 ext_filter_mask,
> @@ -1386,8 +1396,7 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct 
> net_device *dev,
>   put_master_ifindex(skb, dev) ||
>   nla_put_u8(skb, IFLA_CARRIER, netif_carrier_ok(dev)) ||
>   nla_put_qdisc(skb, dev) ||
> - (dev->ifalias &&
> -  nla_put_string(skb, IFLA_IFALIAS, dev->ifalias)) ||
> + nla_put_ifalias(skb, dev) ||
>   nla_put_u32(skb, IFLA_CARRIER_CHANGES,
>   atomic_read(>carrier_changes)) ||
>   nla_put_u8(skb, IFLA_PROTO_DOWN, dev->proto_down))
> 

Reviewed-by: David Ahern

Re: [PATCH] net: dsa: avoid null pointer dereference on p->phy

2017-09-23 Thread Joe Perches

On Sat, 2017-09-23 at 17:57 +0100, Colin King wrote:
> From: Colin Ian King 
> 
> Currently p->phy is being null checked in several places to avoid
> null pointer dereferences on p->phy, however, the final call
> to phy_attached_info on p->phy when p->phy will perform a null
> pointer dereference. Fix this by simply moving the call into
> the previous code block that is only executed if p->phy is
> not null.
> 
> Detected by CoverityScan, CID#1457034 ("Dereference after null check")
> 
> Fixes: 2220943a21e2 ("phy: Centralise print about attached phy")
> Signed-off-by: Colin Ian King 
> ---
>  net/dsa/slave.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/net/dsa/slave.c b/net/dsa/slave.c
> index 02ace7d462c4..29ab4e98639b 100644
> --- a/net/dsa/slave.c
> +++ b/net/dsa/slave.c
> @@ -1115,10 +1115,9 @@ static int dsa_slave_phy_setup(struct net_device 
> *slave_dev)
>   of_phy_deregister_fixed_link(port_dn);
>   return ret;
>   }
> + phy_attached_info(p->phy);
>   }
>  
> - phy_attached_info(p->phy);
> -
>   return 0;
>  }

Huh?  Why move this into the test?

The test of the block above this change is

if (!p->phy) {

Perhaps this should be
'
if (p->phy)
phy_attached_info(p->phy);

or simpler

} else {
phy_attached_info(p->phy);
}

or maybe reverse the block

if (p->phy) {
phy_attached_info(p->phy);
} else {
ret = dsa_slave_phy_connect(slave_dev, p->dp->index);
if (ret) {
netdev_err(slave_dev, "failed to connect to port %d: 
%d\n",
   p->dp->index, ret);
if (phy_is_fixed)
of_phy_deregister_fixed_link(port_dn);
return ret;
}
}

return 0;
}

Re: [PATCH net-next v2 5/6] rtnetlink: add helpers to dump vf and netnsid information

2017-09-23 Thread David Ahern

On 9/22/17 12:10 AM, Florian Westphal wrote:
> +static noinline_for_stack int rtnl_fill_vf(struct sk_buff *skb,
> +struct net_device *dev,
> +u32 ext_filter_mask)
> +{
> + struct nlattr *vfinfo;
> + int i, num_vfs;
> +
> + if (!dev->dev.parent || ((ext_filter_mask & RTEXT_FILTER_VF) == 0))
> + return 0;
> +
> + num_vfs = dev_num_vf(dev->dev.parent);
> + if (nla_put_u32(skb, IFLA_NUM_VF, num_vfs))
> + return -EMSGSIZE;
> +
> + if (!dev->netdev_ops->ndo_get_vf_config)
> + return 0;
> +
> + vfinfo = nla_nest_start(skb, IFLA_VFINFO_LIST);
> + if (!vfinfo)
> + return -EMSGSIZE;
> +
> + for (i = 0; i < num_vfs; i++) {
> + if (rtnl_fill_vfinfo(skb, dev, i, vfinfo))
> + return -EMSGSIZE;
> + }
> +
> + nla_nest_end(skb, vfinfo);
> + return 0;
> +}
> +
>  static int rtnl_fill_link_ifmap(struct sk_buff *skb, struct net_device *dev)
>  {
>   struct rtnl_link_ifmap map;
> @@ -1355,6 +1385,23 @@ static noinline int nla_put_ifalias(struct sk_buff 
> *skb, struct net_device *dev)
>   return 0;
>  }
>  
> +static noinline int rtnl_fill_link_netnsid(struct sk_buff *skb,
> +const struct net_device *dev)
> +{
> + if (dev->rtnl_link_ops && dev->rtnl_link_ops->get_link_net) {
> + struct net *link_net = dev->rtnl_link_ops->get_link_net(dev);
> +
> + if (!net_eq(dev_net(dev), link_net)) {
> + int id = peernet2id_alloc(dev_net(dev), link_net);
> +
> + if (nla_put_s32(skb, IFLA_LINK_NETNSID, id))
> + return -EMSGSIZE;
> + }
> + }
> +
> + return 0;
> +}
> +

No reason to combine vf and netnsid into 1 patch; completely separate topics


>  static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
>   int type, u32 pid, u32 seq, u32 change,
>   unsigned int flags, u32 ext_filter_mask,
> @@ -1428,27 +1475,9 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, 
> struct net_device *dev,
>   if (rtnl_fill_stats(skb, dev))
>   goto nla_put_failure;
>  
> - if (dev->dev.parent && (ext_filter_mask & RTEXT_FILTER_VF) &&
> - nla_put_u32(skb, IFLA_NUM_VF, dev_num_vf(dev->dev.parent)))
> + if (rtnl_fill_vf(skb, dev, ext_filter_mask))
>   goto nla_put_failure;
>  
> - if (dev->netdev_ops->ndo_get_vf_config && dev->dev.parent &&
> - ext_filter_mask & RTEXT_FILTER_VF) {
> - int i;
> - struct nlattr *vfinfo;
> - int num_vfs = dev_num_vf(dev->dev.parent);
> -
> - vfinfo = nla_nest_start(skb, IFLA_VFINFO_LIST);
> - if (!vfinfo)
> - goto nla_put_failure;
> - for (i = 0; i < num_vfs; i++) {
> - if (rtnl_fill_vfinfo(skb, dev, i, vfinfo))
> - goto nla_put_failure;
> - }
> -
> - nla_nest_end(skb, vfinfo);
> - }
> -
>   if (rtnl_port_fill(skb, dev, ext_filter_mask))
>   goto nla_put_failure;
>  
> @@ -1460,17 +1489,8 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, 
> struct net_device *dev,
>   goto nla_put_failure;
>   }
>  
> - if (dev->rtnl_link_ops &&
> - dev->rtnl_link_ops->get_link_net) {
> - struct net *link_net = dev->rtnl_link_ops->get_link_net(dev);
> -
> - if (!net_eq(dev_net(dev), link_net)) {
> - int id = peernet2id_alloc(dev_net(dev), link_net);
> -
> - if (nla_put_s32(skb, IFLA_LINK_NETNSID, id))
> - goto nla_put_failure;
> - }
> - }
> + if (rtnl_fill_link_netnsid(skb, dev))
> + goto nla_put_failure;
>  
>   if (!(af_spec = nla_nest_start(skb, IFLA_AF_SPEC)))
>   goto nla_put_failure;
> 

Reviewed-by: David Ahern

Re: [PATCH net-next v2 1/6] selftests: rtnetlink.sh: add rudimentary vrf test

2017-09-23 Thread David Ahern

On 9/22/17 12:10 AM, Florian Westphal wrote:
> +kci_test_vrf()
> +{
> + vrfname="test-vrf"
> + ret=0
> +
> + ip link show type vrf 2>/dev/null
> + if [ $? -ne 0 ]; then
> + echo "SKIP: vrf: iproute2 too old"
> + return 0
> + fi
> +
> + ip link add "$vrfname" type vrf table 10
> + check_err $?
> + if [ $ret -ne 0 ];then
> + echo "FAIL: can't add vrf interface, skipping test"
> + return 0
> + fi
> +
> + ip -br link show type vrf | grep -q "$vrfname"
> + check_err $?
> + if [ $ret -ne 0 ];then
> + echo "FAIL: created vrf device not found"
> + return 1
> + fi
> +
> +ip link set dev "$vrfname" up

BTW, if there is a v3 of this set, that ip command is shifted - uses
spaces instead of tab.

Re: [PATCH net-next v2 3/6] rtnetlink: add helper to dump qdisc name

2017-09-23 Thread David Ahern

On 9/22/17 12:10 AM, Florian Westphal wrote:
> We can use rcu here to make this safe even if we would not hold rtnl:
> qdisc_destroy uses call_rcu to free the Qdisc struct.
> 
> Signed-off-by: Florian Westphal 
> ---
>  net/core/rtnetlink.c | 16 ++--
>  1 file changed, 14 insertions(+), 2 deletions(-)

Reviewed-by: David Ahern

Re: [PATCH net-next v2 2/6] rtnetlink: add helper to put master ifindex

2017-09-23 Thread David Ahern

On 9/22/17 12:10 AM, Florian Westphal wrote:
> rtnl_fill_ifinfo currently requires caller to hold the rtnl mutex.
> Unfortunately the function is quite large which makes it harder to see
> which spots need the lock, which spots assume it and which ones could do
> without.
> 
> Add helpers to factor out the ifindex dumping.
> 
> One helper can use rcu to remove rtnl dependency.
> 
> Signed-off-by: Florian Westphal 
> ---
>  net/core/rtnetlink.c | 32 +++-
>  1 file changed, 27 insertions(+), 5 deletions(-)

Reviewed-by: David Ahern

Re: [PATCH net-next v2 1/6] selftests: rtnetlink.sh: add rudimentary vrf test

2017-09-23 Thread David Ahern

On 9/22/17 12:10 AM, Florian Westphal wrote:
> Cc: David Ahern 
> Signed-off-by: Florian Westphal 
> ---
>  tools/testing/selftests/net/rtnetlink.sh | 42 
> 
>  1 file changed, 42 insertions(+)

Acked-by: David Ahern

Re: tools: selftests: psock_tpacket: skip un-supported tpacket_v3 test

2017-09-23 Thread David Miller

From: Fathi Boudra 
Date: Sat, 23 Sep 2017 14:27:15 +0300

> On 23 September 2017 at 04:20, David Miller  wrote:
>> From: Orson Zhai 
>> Date: Fri, 22 Sep 2017 18:17:17 +0800
>>
>>> The TPACKET_V3 test of PACKET_TX_RING will fail with kernel version
>>> lower than v4.11. Supported code of tx ring was add with commit id
>>> <7f953ab2ba46: af_packet: TX_RING support for TPACKET_V3> at Jan. 3
>>> of 2017.
>>>
>>> So skip this item test instead of reporting failing for old kernels.
>>>
>>> Signed-off-by: Orson Zhai 
>>
>> The whole point is to make sure the kernel in which the selftest
>> code is present functions properly.
>>
>> There are many tests in selftests that only work on recent kernels.
> 
> For the background, a similar discussion happened on this thread:
> https://lkml.org/lkml/2017/6/22/802
> 
> There's cases where we'd like to run latest selftests on stable kernels.
> You're right, there are many tests in selftests that only work on
> recent kernels and we intend to fix it.
> Skipping gracefully a test because the feature is missing on the
> kernel under test is preferred to fail.

This approach is extremely ill advised.

It is hard enough to get developers to add new tests in the first
place.

Having the extra burdon of needing to make the test work on older
kernels is going to discourage test writing even more.

If you want to "backport" tests, handle them the same way -stable
backports are done.  With extreme care and making sure they get
backported to the kernel they actually would work on.

[PATCH] net: dsa: avoid null pointer dereference on p->phy

2017-09-23 Thread Colin King

From: Colin Ian King 

Currently p->phy is being null checked in several places to avoid
null pointer dereferences on p->phy, however, the final call
to phy_attached_info on p->phy when p->phy will perform a null
pointer dereference. Fix this by simply moving the call into
the previous code block that is only executed if p->phy is
not null.

Detected by CoverityScan, CID#1457034 ("Dereference after null check")

Fixes: 2220943a21e2 ("phy: Centralise print about attached phy")
Signed-off-by: Colin Ian King 
---
 net/dsa/slave.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 02ace7d462c4..29ab4e98639b 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -1115,10 +1115,9 @@ static int dsa_slave_phy_setup(struct net_device 
*slave_dev)
of_phy_deregister_fixed_link(port_dn);
return ret;
}
+   phy_attached_info(p->phy);
}
 
-   phy_attached_info(p->phy);
-
return 0;
 }
 
-- 
2.14.1

Re: [PATCH net-next] liquidio: pass date and time info to NIC firmware

2017-09-23 Thread Andrew Lunn

On Fri, Sep 22, 2017 at 05:35:18PM -0700, Felix Manlunas wrote:
> From: Veerasenareddy Burru 

This is kind of interesting. So you do this once. It could be before
the RTC driver has probed, so it is 1970. It could be before the NTP
daemon has started, and so the host clock will later jump, or stretch
time to gain synchronisation.

It seems like you should be periodically giving the time to the
firmware, not just once.

   Andrew

Re: [patch net-next 07/12] mlxsw: spectrum: Add the multicast routing offloading logic

2017-09-23 Thread Andrew Lunn

> > So when i look at these patches, i try to make sure the general use
> > cases works, not just the plain boring Ethernet switch box use cases
> > :-)
> 
> So when doing it, we did think about multi-ASIC situations, so I think it 
> should
> fit :)

Maybe, maybe not. DSA is multi switch. That is what the D means,
Distributed. All switches in a cluster are presented to the software
stack as a single switch.

If your multi-ASIC architecture is the same, a distributed switch,
then you don't cover the general case. If you represent it as two
switches, with frames going between switches going via CPU, then yes,
you probably cover the general case.

Andrew

[PATCH net-next] tun: delete original tun_get() and rename __tun_get() to tun_get()

2017-09-23 Thread yuan linyu

From: yuan linyu 

it seems no need to keep tun_get() and __tun_get() at same time.

Signed-off-by: yuan linyu 
---
 drivers/net/tun.c | 26 +++---
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 3c9985f..206bc6c 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -692,7 +692,7 @@ static int tun_attach(struct tun_struct *tun, struct file 
*file, bool skip_filte
return err;
 }
 
-static struct tun_struct *__tun_get(struct tun_file *tfile)
+static struct tun_struct *tun_get(struct tun_file *tfile)
 {
struct tun_struct *tun;
 
@@ -705,11 +705,6 @@ static struct tun_struct *__tun_get(struct tun_file *tfile)
return tun;
 }
 
-static struct tun_struct *tun_get(struct file *file)
-{
-   return __tun_get(file->private_data);
-}
-
 static void tun_put(struct tun_struct *tun)
 {
dev_put(tun->dev);
@@ -1149,7 +1144,7 @@ static void tun_net_init(struct net_device *dev)
 static unsigned int tun_chr_poll(struct file *file, poll_table *wait)
 {
struct tun_file *tfile = file->private_data;
-   struct tun_struct *tun = __tun_get(tfile);
+   struct tun_struct *tun = tun_get(tfile);
struct sock *sk;
unsigned int mask = 0;
 
@@ -1569,8 +1564,8 @@ static ssize_t tun_get_user(struct tun_struct *tun, 
struct tun_file *tfile,
 static ssize_t tun_chr_write_iter(struct kiocb *iocb, struct iov_iter *from)
 {
struct file *file = iocb->ki_filp;
-   struct tun_struct *tun = tun_get(file);
struct tun_file *tfile = file->private_data;
+   struct tun_struct *tun = tun_get(tfile);
ssize_t result;
 
if (!tun)
@@ -1754,7 +1749,7 @@ static ssize_t tun_chr_read_iter(struct kiocb *iocb, 
struct iov_iter *to)
 {
struct file *file = iocb->ki_filp;
struct tun_file *tfile = file->private_data;
-   struct tun_struct *tun = __tun_get(tfile);
+   struct tun_struct *tun = tun_get(tfile);
ssize_t len = iov_iter_count(to), ret;
 
if (!tun)
@@ -1831,7 +1826,7 @@ static int tun_sendmsg(struct socket *sock, struct msghdr 
*m, size_t total_len)
 {
int ret;
struct tun_file *tfile = container_of(sock, struct tun_file, socket);
-   struct tun_struct *tun = __tun_get(tfile);
+   struct tun_struct *tun = tun_get(tfile);
 
if (!tun)
return -EBADFD;
@@ -1847,7 +1842,7 @@ static int tun_recvmsg(struct socket *sock, struct msghdr 
*m, size_t total_len,
   int flags)
 {
struct tun_file *tfile = container_of(sock, struct tun_file, socket);
-   struct tun_struct *tun = __tun_get(tfile);
+   struct tun_struct *tun = tun_get(tfile);
int ret;
 
if (!tun)
@@ -1879,7 +1874,7 @@ static int tun_peek_len(struct socket *sock)
struct tun_struct *tun;
int ret = 0;
 
-   tun = __tun_get(tfile);
+   tun = tun_get(tfile);
if (!tun)
return 0;
 
@@ -2265,7 +2260,7 @@ static long __tun_chr_ioctl(struct file *file, unsigned 
int cmd,
ret = 0;
rtnl_lock();
 
-   tun = __tun_get(tfile);
+   tun = tun_get(tfile);
if (cmd == TUNSETIFF) {
ret = -EEXIST;
if (tun)
@@ -2612,15 +2607,16 @@ static int tun_chr_close(struct inode *inode, struct 
file *file)
 }
 
 #ifdef CONFIG_PROC_FS
-static void tun_chr_show_fdinfo(struct seq_file *m, struct file *f)
+static void tun_chr_show_fdinfo(struct seq_file *m, struct file *file)
 {
+   struct tun_file *tfile = file->private_data;
struct tun_struct *tun;
struct ifreq ifr;
 
memset(, 0, sizeof(ifr));
 
rtnl_lock();
-   tun = tun_get(f);
+   tun = tun_get(tfile);
if (tun)
tun_get_iff(current->nsproxy->net_ns, tun, );
rtnl_unlock();
-- 
2.7.4

Re: [PATCH net-next 2/2] net: dsa: lan9303: Add basic offloading of unicast traffic

2017-09-23 Thread Andrew Lunn

> The point is: Once both external ports are in "forwarding", I see no way
> to prevent traffic flowing directly between the external ports.

Generally, there are port vectors. Port X can send frames only to Port
Y.

If you don't have that, there are possibilities with VLANs. Each port
is given a unique VLAN. All incoming untagged traffic is tagged with
the VLAN. You just need to keep the VLAN separated and add/remove the
VLAN tag in the dsa tag driver.

 Andrew

Re: [PATCH net] sctp: Fix a big endian bug in sctp_for_each_transport()

2017-09-23 Thread Xin Long

On Sat, Sep 23, 2017 at 6:25 PM, Dan Carpenter  wrote:
> Fundamentally, the "pos" pointer points to "cb->args[2]" which is a long.
> In the current code, we only use the high 32 bits and cast it as an
> int.  That works on little endian systems but will fail on big endian
> systems.
>
> Fixes: d25adbeb0cdb ("sctp: fix an use-after-free issue in sctp_sock_dump")
> Signed-off-by: Dan Carpenter 
>
> diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
> index d7d8cba01469..7d87439f299a 100644
> --- a/include/net/sctp/sctp.h
> +++ b/include/net/sctp/sctp.h
> @@ -121,14 +121,14 @@ void sctp_transport_walk_stop(struct rhashtable_iter 
> *iter);
>  struct sctp_transport *sctp_transport_get_next(struct net *net,
> struct rhashtable_iter *iter);
>  struct sctp_transport *sctp_transport_get_idx(struct net *net,
> -   struct rhashtable_iter *iter, int pos);
> +   struct rhashtable_iter *iter, long pos);
>  int sctp_transport_lookup_process(int (*cb)(struct sctp_transport *, void *),
>   struct net *net,
>   const union sctp_addr *laddr,
>   const union sctp_addr *paddr, void *p);
>  int sctp_for_each_transport(int (*cb)(struct sctp_transport *, void *),
> int (*cb_done)(struct sctp_transport *, void *),
> -   struct net *net, int *pos, void *p);
> +   struct net *net, long *pos, void *p);
>  int sctp_for_each_endpoint(int (*cb)(struct sctp_endpoint *, void *), void 
> *p);
>  int sctp_get_sctp_info(struct sock *sk, struct sctp_association *asoc,
>struct sctp_info *info);
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index d4730ada7f32..0222743b3aa8 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -4603,7 +4603,7 @@ struct sctp_transport *sctp_transport_get_next(struct 
> net *net,
>
>  struct sctp_transport *sctp_transport_get_idx(struct net *net,
>   struct rhashtable_iter *iter,
> - int pos)
> + long pos)
>  {
> void *obj = SEQ_START_TOKEN;
>
> @@ -4659,7 +4659,7 @@ EXPORT_SYMBOL_GPL(sctp_transport_lookup_process);
>
>  int sctp_for_each_transport(int (*cb)(struct sctp_transport *, void *),
> int (*cb_done)(struct sctp_transport *, void *),
> -   struct net *net, int *pos, void *p) {
> +   struct net *net, long *pos, void *p) {
> struct rhashtable_iter hti;
> struct sctp_transport *tsp;
> int ret;
> diff --git a/net/sctp/sctp_diag.c b/net/sctp/sctp_diag.c
> index 22ed01a76b19..e9d5405aa6ac 100644
> --- a/net/sctp/sctp_diag.c
> +++ b/net/sctp/sctp_diag.c
> @@ -493,7 +493,7 @@ static void sctp_diag_dump(struct sk_buff *skb, struct 
> netlink_callback *cb,
> goto done;
>
> sctp_for_each_transport(sctp_sock_filter, sctp_sock_dump,
> -   net, (int *)>args[2], );
> +   net, >args[2], );
>
>  done:
> cb->args[1] = cb->args[4];
Better not to touch sctp_for_each_transport and sctp_transport_get_idx
which are supposed to be also used elsewhere as common apis.

Can you pls try to fix this in sctp_diag_dump(), like:
@@ -463,6 +463,7 @@ static void sctp_diag_dump(struct sk_buff *skb,
struct netlink_callback *cb,
.r = r,
.net_admin = netlink_net_capable(cb->skb, CAP_NET_ADMIN),
};
+   int pos = cb->args[2];

/* eps hashtable dumps
 * args:
@@ -493,7 +494,8 @@ static void sctp_diag_dump(struct sk_buff *skb,
struct netlink_callback *cb,
goto done;

sctp_for_each_transport(sctp_sock_filter, sctp_sock_dump,
-   net, (int *)>args[2], );
+   net, , );
+   cb->args[2] = pos;

Re: tools: selftests: psock_tpacket: skip un-supported tpacket_v3 test

2017-09-23 Thread Fathi Boudra

On 23 September 2017 at 04:20, David Miller  wrote:
> From: Orson Zhai 
> Date: Fri, 22 Sep 2017 18:17:17 +0800
>
>> The TPACKET_V3 test of PACKET_TX_RING will fail with kernel version
>> lower than v4.11. Supported code of tx ring was add with commit id
>> <7f953ab2ba46: af_packet: TX_RING support for TPACKET_V3> at Jan. 3
>> of 2017.
>>
>> So skip this item test instead of reporting failing for old kernels.
>>
>> Signed-off-by: Orson Zhai 
>
> The whole point is to make sure the kernel in which the selftest
> code is present functions properly.
>
> There are many tests in selftests that only work on recent kernels.

For the background, a similar discussion happened on this thread:
https://lkml.org/lkml/2017/6/22/802

There's cases where we'd like to run latest selftests on stable kernels.
You're right, there are many tests in selftests that only work on
recent kernels and we intend to fix it.
Skipping gracefully a test because the feature is missing on the
kernel under test is preferred to fail.

> I'm not applying this, sorry.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kselftest" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH net-next] cxgb4: do DCB state reset in couple of places

2017-09-23 Thread Ganesh Goudar

reset the driver's DCB state in couple of places
where it was missing.

Signed-off-by: Casey Leedom 
Signed-off-by: Ganesh Goudar 
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c  | 15 +++
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.h  |  1 +
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 10 --
 3 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c 
b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c
index 6ee2ed3..4e7f72b 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c
@@ -40,8 +40,7 @@ static inline bool cxgb4_dcb_state_synced(enum 
cxgb4_dcb_state state)
return false;
 }
 
-/* Initialize a port's Data Center Bridging state.  Typically used after a
- * Link Down event.
+/* Initialize a port's Data Center Bridging state.
  */
 void cxgb4_dcb_state_init(struct net_device *dev)
 {
@@ -106,6 +105,15 @@ static void cxgb4_dcb_cleanup_apps(struct net_device *dev)
}
 }
 
+/* Reset a port's Data Center Bridging state.  Typically used after a
+ * Link Down event.
+ */
+void cxgb4_dcb_reset(struct net_device *dev)
+{
+   cxgb4_dcb_cleanup_apps(dev);
+   cxgb4_dcb_state_init(dev);
+}
+
 /* Finite State machine for Data Center Bridging.
  */
 void cxgb4_dcb_state_fsm(struct net_device *dev,
@@ -194,8 +202,7 @@ void cxgb4_dcb_state_fsm(struct net_device *dev,
 * state.  We need to reset back to a ground state
 * of incomplete.
 */
-   cxgb4_dcb_cleanup_apps(dev);
-   cxgb4_dcb_state_init(dev);
+   cxgb4_dcb_reset(dev);
dcb->state = CXGB4_DCB_STATE_FW_INCOMPLETE;
dcb->supported = CXGB4_DCBX_FW_SUPPORT;
linkwatch_fire_event(dev);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.h 
b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.h
index ccf24d3..02040b9 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.h
@@ -131,6 +131,7 @@ struct port_dcb_info {
 
 void cxgb4_dcb_state_init(struct net_device *);
 void cxgb4_dcb_version_init(struct net_device *);
+void cxgb4_dcb_reset(struct net_device *dev);
 void cxgb4_dcb_state_fsm(struct net_device *, enum cxgb4_dcb_state_input);
 void cxgb4_dcb_handle_fw_update(struct adapter *, const struct fw_port_cmd *);
 void cxgb4_dcb_set_caps(struct adapter *, const struct fw_port_cmd *);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c 
b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index aa93ae9..13b636b 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -281,7 +281,7 @@ void t4_os_link_changed(struct adapter *adapter, int 
port_id, int link_stat)
else {
 #ifdef CONFIG_CHELSIO_T4_DCB
if (cxgb4_dcb_enabled(dev)) {
-   cxgb4_dcb_state_init(dev);
+   cxgb4_dcb_reset(dev);
dcb_tx_queue_prio_enable(dev, false);
}
 #endif /* CONFIG_CHELSIO_T4_DCB */
@@ -2304,10 +2304,16 @@ static int cxgb_close(struct net_device *dev)
 {
struct port_info *pi = netdev_priv(dev);
struct adapter *adapter = pi->adapter;
+   int ret;
 
netif_tx_stop_all_queues(dev);
netif_carrier_off(dev);
-   return t4_enable_vi(adapter, adapter->pf, pi->viid, false, false);
+   ret = t4_enable_vi(adapter, adapter->pf, pi->viid, false, false);
+#ifdef CONFIG_CHELSIO_T4_DCB
+   cxgb4_dcb_reset(dev);
+   dcb_tx_queue_prio_enable(dev, false);
+#endif
+   return ret;
 }
 
 int cxgb4_create_server_filter(const struct net_device *dev, unsigned int stid,
-- 
2.1.0

[PATCH net] sctp: Fix a big endian bug in sctp_for_each_transport()

2017-09-23 Thread Dan Carpenter

Fundamentally, the "pos" pointer points to "cb->args[2]" which is a long.
In the current code, we only use the high 32 bits and cast it as an
int.  That works on little endian systems but will fail on big endian
systems.

Fixes: d25adbeb0cdb ("sctp: fix an use-after-free issue in sctp_sock_dump")
Signed-off-by: Dan Carpenter 

diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
index d7d8cba01469..7d87439f299a 100644
--- a/include/net/sctp/sctp.h
+++ b/include/net/sctp/sctp.h
@@ -121,14 +121,14 @@ void sctp_transport_walk_stop(struct rhashtable_iter 
*iter);
 struct sctp_transport *sctp_transport_get_next(struct net *net,
struct rhashtable_iter *iter);
 struct sctp_transport *sctp_transport_get_idx(struct net *net,
-   struct rhashtable_iter *iter, int pos);
+   struct rhashtable_iter *iter, long pos);
 int sctp_transport_lookup_process(int (*cb)(struct sctp_transport *, void *),
  struct net *net,
  const union sctp_addr *laddr,
  const union sctp_addr *paddr, void *p);
 int sctp_for_each_transport(int (*cb)(struct sctp_transport *, void *),
int (*cb_done)(struct sctp_transport *, void *),
-   struct net *net, int *pos, void *p);
+   struct net *net, long *pos, void *p);
 int sctp_for_each_endpoint(int (*cb)(struct sctp_endpoint *, void *), void *p);
 int sctp_get_sctp_info(struct sock *sk, struct sctp_association *asoc,
   struct sctp_info *info);
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index d4730ada7f32..0222743b3aa8 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -4603,7 +4603,7 @@ struct sctp_transport *sctp_transport_get_next(struct net 
*net,
 
 struct sctp_transport *sctp_transport_get_idx(struct net *net,
  struct rhashtable_iter *iter,
- int pos)
+ long pos)
 {
void *obj = SEQ_START_TOKEN;
 
@@ -4659,7 +4659,7 @@ EXPORT_SYMBOL_GPL(sctp_transport_lookup_process);
 
 int sctp_for_each_transport(int (*cb)(struct sctp_transport *, void *),
int (*cb_done)(struct sctp_transport *, void *),
-   struct net *net, int *pos, void *p) {
+   struct net *net, long *pos, void *p) {
struct rhashtable_iter hti;
struct sctp_transport *tsp;
int ret;
diff --git a/net/sctp/sctp_diag.c b/net/sctp/sctp_diag.c
index 22ed01a76b19..e9d5405aa6ac 100644
--- a/net/sctp/sctp_diag.c
+++ b/net/sctp/sctp_diag.c
@@ -493,7 +493,7 @@ static void sctp_diag_dump(struct sk_buff *skb, struct 
netlink_callback *cb,
goto done;
 
sctp_for_each_transport(sctp_sock_filter, sctp_sock_dump,
-   net, (int *)>args[2], );
+   net, >args[2], );
 
 done:
cb->args[1] = cb->args[4];

Re: [PATCH net-next 2/2] net: dsa: lan9303: Add basic offloading of unicast traffic

2017-09-23 Thread Egil Hjelmeland


Den 22. sep. 2017 22:08, skrev Andrew Lunn:

I'm wondering how this is supposed to work. Please add a good comment
here, since the hardware is forcing you to do something odd.

Maybe it would be a good idea to save the STP state in chip.  And then
when chip->is_bridged is set true, change the state in the hardware to
the saved value?

What happens when port 0 is added to the bridge, there is then a
minute pause and then port 1 is added? I would expect that as soon as
port 0 is added, the STP state machine for port 0 will start and move
into listening and then forwarding. Due to hardware limitations it
looks like you cannot do this. So what state is the hardware
effectively in? Blocking? Forwarding?

Then port 1 is added. You can then can respect the states. port 1 will
do blocking->listening->forwarding, but what about port 0? The calls
won't get repeated? How does it transition to forwarding?

   Andrew



I see your point with the "minute pause" argument. Although a bit
contrived use case, it is easy to fix by caching the STP state, as
you suggest. So I can do that.


I don't think it is contrived. I've done bridge configuration by hand
for testing purposes. I've also set the forwarding delay to very small
values, so there is a clear race condition here.


How does other DSA HW chips handle port separation? Knowing that
could perhaps help me know what to look for.


They have better hardware :-)

Generally each port is totally independent. You can change the STP
state per port without restrictions.


We can indeed change the STP state per lan9303 port "without
restrictions".

The point is: Once both external ports are in "forwarding", I see no way
to prevent traffic flowing directly between the external ports.



   Andrew



Egil

Re: [patch net-next 07/12] mlxsw: spectrum: Add the multicast routing offloading logic

2017-09-23 Thread Yotam Gigi

On 09/22/2017 04:21 PM, Andrew Lunn wrote:
> On Fri, Sep 22, 2017 at 11:36:59AM +0300, Yotam Gigi wrote:
>> On 09/21/2017 06:26 PM, Andrew Lunn wrote:
 +static void mlxsw_sp_mr_route_stats_update(struct mlxsw_sp *mlxsw_sp,
 + struct mlxsw_sp_mr_route *mr_route)
 +{
 +  struct mlxsw_sp_mr *mr = mlxsw_sp->mr;
 +  u64 packets, bytes;
 +
 +  if (mr_route->route_action == MLXSW_SP_MR_ROUTE_ACTION_TRAP)
 +  return;
 +
 +  mr->mr_ops->route_stats(mlxsw_sp, mr_route->route_priv, ,
 +  );
 +
 +  switch (mr_route->mr_table->proto) {
 +  case MLXSW_SP_L3_PROTO_IPV4:
 +  mr_route->mfc4->mfc_un.res.pkt = packets;
 +  mr_route->mfc4->mfc_un.res.bytes = bytes;
>>> What about wrong_if and lastuse? 
> Hi Yotam
>
>> wronf_if is updated by ipmr as it is trapped to the CPU.
> Great.
>
>> We did not address lastuse currently, though it can be easily
>> addressed here.
> Please do. I've written multicast routing daemons, where i use it to
> flush out MFCs which are no longer in use. Having it always 0 is going
> to break daemons.

I will. Thanks for the feedback!

>  
>>> Is an mfc with iif on the host, not the switch, not offloaded?
>>
>> I am not sure I followed. What do you mean MFC with iif on the host? you mean
>> MFC with iif that is an external NIC which is not part of the spectrum ASIC?
> Yes. We probably have different perspectives on the world. To
> Mellanox, everything is a switch in a box. In the DSA world, we tend
> to think of having a general purpose machine which also has a switch
> connected. Think of a wireless access point, set top box, passenger
> entertainment system. We have applications on the general purpose
> computer, we have wifi interfaces, cable modems, etc. Think about all
> the different packages in LEDE. We might have a multicast video
> stream, coming from the cable modem being sent over ports of the
> switch to clients.
>
> So when i look at these patches, i try to make sure the general use
> cases works, not just the plain boring Ethernet switch box use cases
> :-)

So when doing it, we did think about multi-ASIC situations, so I think it should
fit :)

>
>> in this case, the route will not be offloaded and all traffic will
>> pass in slowpath.
> O.K. I was just thinking if those counters need to be +=, not =.  But
> either the iif is on the host, or it is in the switch. It cannot be
> both. So = is O.K.
>
> Thanks
>   Andrew

47 matches

Mail list logo