Re: [PATCH net-next] bpf/verifier: improve disassembly of BPF_END instructions
On Fri, Sep 22, 2017 at 09:49:10PM -0700, Y Song wrote: > On Fri, Sep 22, 2017 at 9:23 AM, Edward Creewrote: > > On 22/09/17 16:16, Alexei Starovoitov wrote: > >> looks like we're converging on > >> "be16/be32/be64/le16/le32/le64 #register" for BPF_END. > >> I guess it can live with that. I would prefer more C like syntax > >> to match the rest, but llvm parsing point is a strong one. > > Yep, agreed. I'll post a v2 once we've settled BPF_NEG. > >> For BPG_NEG I prefer to do it in C syntax like interpreter does: > >> ALU_NEG: > >> DST = (u32) -DST; > >> ALU64_NEG: > >> DST = -DST; > >> Yonghong, does it mean that asmparser will equally suffer? > > Correction to my earlier statements: verifier will currently disassemble > > neg as: > > (87) r0 neg 0 > > (84) (u32) r0 neg (u32) 0 > > because it pretends 'neg' is a compound-assignment operator like +=. > > The analogy with be16 and friends would be to use > > neg64 r0 > > neg32 r0 > > whereas the analogy with everything else would be > > r0 = -r0 > > r0 = (u32) -r0 > > as Alexei says. > > I'm happy to go with Alexei's version if it doesn't cause problems for llvm. > > I got some time to do some prototyping in llvm and it looks like that > I am able to > resolve the issue and we are able to use more C-like syntax. That is: > for bswap: > r1 = (be16) (u16) r1 > or > r1 = (be16) r1 > or > r1 = be16 r1 > for neg: > r0 = -r0 > (for 32bit support, llvm may output "w0 = -w0" in the future. But > since it is not > enabled yet, you can continue to output "r0 = (u32) -r0".) > > Not sure which syntax is best for bswap. The "r1 = (be16) (u16) r1" is most > explicit in its intention. > > Attaching my llvm patch as well and cc'ing Jiong and Jakub so they can see my > implementation and the relative discussion here. (In this patch, I did > not implement > bswap for little endian yet.) Maybe they can provide additional comments. This is awesome. In such case I'd like to swing back to the C syntax for bpf_end :) Any of these r1 = (be16) (u16) r1 or r1 = (be16) r1 or r1 = be16 r1 are better than just be16 r1 I like 1st the most, since it's explicit in terms of what happens with upper bits, but 2nd is also ok. 3rd is not quite C-like.
[PATCH] mac80211: aead api to reduce redundancy
Currently, the aes_ccm.c and aes_gcm.c are almost line by line copy of each other. This patch reduce code redundancy by moving the code in these two files to crypto/aead_api.c to make it a higher level aead api. The aes_ccm.c and aes_gcm.c are removed and all the functions are now implemented in their headers using the newly added aead api. Signed-off-by: Xiang Gao--- crypto/Makefile | 2 +- net/mac80211/aes_gcm.c => crypto/aead_api.c | 53 +++-- include/crypto/aead_api.h | 21 + net/mac80211/Makefile | 2 - net/mac80211/aes_ccm.c | 115 net/mac80211/aes_ccm.h | 42 +++--- net/mac80211/aes_gcm.h | 40 -- net/mac80211/key.c | 1 + net/mac80211/wpa.c | 11 +-- 9 files changed, 118 insertions(+), 169 deletions(-) rename net/mac80211/aes_gcm.c => crypto/aead_api.c (54%) create mode 100644 include/crypto/aead_api.h delete mode 100644 net/mac80211/aes_ccm.c diff --git a/crypto/Makefile b/crypto/Makefile index d41f0331b085..541316db5841 100644 --- a/crypto/Makefile +++ b/crypto/Makefile @@ -14,7 +14,7 @@ crypto_algapi-$(CONFIG_PROC_FS) += proc.o crypto_algapi-y := algapi.o scatterwalk.o $(crypto_algapi-y) obj-$(CONFIG_CRYPTO_ALGAPI2) += crypto_algapi.o -obj-$(CONFIG_CRYPTO_AEAD2) += aead.o +obj-$(CONFIG_CRYPTO_AEAD2) += aead.o aead_api.o crypto_blkcipher-y := ablkcipher.o crypto_blkcipher-y += blkcipher.o diff --git a/net/mac80211/aes_gcm.c b/crypto/aead_api.c similarity index 54% rename from net/mac80211/aes_gcm.c rename to crypto/aead_api.c index 8a4397cc1b08..9585ee400b2e 100644 --- a/net/mac80211/aes_gcm.c +++ b/crypto/aead_api.c @@ -1,7 +1,4 @@ -/* - * Copyright 2014-2015, Qualcomm Atheros, Inc. - * - * This program is free software; you can redistribute it and/or modify +/* This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License version 2 as * published by the Free Software Foundation. */ @@ -9,43 +6,43 @@ #include #include #include -#include +#include -#include -#include "key.h" -#include "aes_gcm.h" +#include -int ieee80211_aes_gcm_encrypt(struct crypto_aead *tfm, u8 *j_0, u8 *aad, - u8 *data, size_t data_len, u8 *mic) +int aead_encrypt(struct crypto_aead *tfm, u8 *b_0, u8 *aad, size_t aad_len, +u8 *data, size_t data_len, u8 *auth) { struct scatterlist sg[3]; struct aead_request *aead_req; int reqsize = sizeof(*aead_req) + crypto_aead_reqsize(tfm); u8 *__aad; - aead_req = kzalloc(reqsize + GCM_AAD_LEN, GFP_ATOMIC); + aead_req = kzalloc(reqsize + aad_len, GFP_ATOMIC); if (!aead_req) return -ENOMEM; __aad = (u8 *)aead_req + reqsize; - memcpy(__aad, aad, GCM_AAD_LEN); + memcpy(__aad, aad, aad_len); sg_init_table(sg, 3); - sg_set_buf([0], &__aad[2], be16_to_cpup((__be16 *)__aad)); + sg_set_buf([0], __aad, aad_len); sg_set_buf([1], data, data_len); - sg_set_buf([2], mic, IEEE80211_GCMP_MIC_LEN); + sg_set_buf([2], auth, tfm->authsize); aead_request_set_tfm(aead_req, tfm); - aead_request_set_crypt(aead_req, sg, sg, data_len, j_0); + aead_request_set_crypt(aead_req, sg, sg, data_len, b_0); aead_request_set_ad(aead_req, sg[0].length); crypto_aead_encrypt(aead_req); kzfree(aead_req); + return 0; } +EXPORT_SYMBOL_GPL(aead_encrypt); -int ieee80211_aes_gcm_decrypt(struct crypto_aead *tfm, u8 *j_0, u8 *aad, - u8 *data, size_t data_len, u8 *mic) +int aead_decrypt(struct crypto_aead *tfm, u8 *b_0, u8 *aad, size_t aad_len, +u8 *data, size_t data_len, u8 *auth) { struct scatterlist sg[3]; struct aead_request *aead_req; @@ -56,21 +53,20 @@ int ieee80211_aes_gcm_decrypt(struct crypto_aead *tfm, u8 *j_0, u8 *aad, if (data_len == 0) return -EINVAL; - aead_req = kzalloc(reqsize + GCM_AAD_LEN, GFP_ATOMIC); + aead_req = kzalloc(reqsize + aad_len, GFP_ATOMIC); if (!aead_req) return -ENOMEM; __aad = (u8 *)aead_req + reqsize; - memcpy(__aad, aad, GCM_AAD_LEN); + memcpy(__aad, aad, aad_len); sg_init_table(sg, 3); sg_set_buf([0], &__aad[2], be16_to_cpup((__be16 *)__aad)); sg_set_buf([1], data, data_len); - sg_set_buf([2], mic, IEEE80211_GCMP_MIC_LEN); + sg_set_buf([2], auth, tfm->authsize); aead_request_set_tfm(aead_req, tfm); - aead_request_set_crypt(aead_req, sg, sg, - data_len + IEEE80211_GCMP_MIC_LEN, j_0); + aead_request_set_crypt(aead_req, sg, sg, data_len +
Re: [PATCH 1/1] forcedeth: optimize the xmit/rx with unlikely
From: Zhu YanjunDate: Fri, 22 Sep 2017 10:20:21 -0400 > In the xmit/rx fastpath, the function dma_map_single rarely fails. > Therefore, add an unlikely() optimization to this error check > conditional. > > Signed-off-by: Zhu Yanjun Applied to net-next, thanks.
Re: [PATCH net] net: qualcomm: rmnet: Fix rcu splat in rmnet_is_real_dev_registered
From: Subash Abhinov KasiviswanathanDate: Thu, 21 Sep 2017 18:00:36 -0600 > Xiaolong reported a suspicious rcu_dereference_check in the device > unregister notifier callback. Since we do not dereference the > rx_handler_data, it's ok to just check for the value of the pointer. > Note that this section is already protected by rtnl_lock. ... > Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial > implementation") > Signed-off-by: Subash Abhinov Kasiviswanathan > Reported-by: kernel test robot Applied, thank you.
Re: [PATCH net-next 09/14] gtp: Allow configuring GTP interface as standalone
Hi Tom, On Thu, Sep 21, 2017 at 09:43:02AM -0700, Tom Herbert wrote: > Please see the cover letter for the original GTP kernel patches dated > May 10, 2016. My first question on those was "Is there a timeline for > adding IPv6 support?". To which Pablo replied that there was a > preliminary patch for it that has not been released. I'll suggest Pablo to comment on that. I don't recall the details at that time, I was only involved in the earliest development of the module and then handed over. > If you don't like my patches or don't think that can be adapted to > fully support the GTP specification, that's fine. It's not about "not liking". I'm very happy about contributions, including (of course) yours. It's about making sure that code we merge into the kernel GTP driver will actually be usable to create a standards-compliant GTP application or not. There's no use in merging an IPv6 support patch if already by code review it can be shown that it's impossible to create a spec-compliant implementation using that patch. To me, that would be "merging IPv6 support so we can check off a box on a management form or marketing sheet", but not for any practical value. > But then you need to provide a viable alternative. Why do *I* have to provide a viable alternative? Who says that *I* have an obligation to do so? A (co-)maintainer of a given driver doesn't have the obligation of implementing any feature as requested. Community based collaborative development only gets those things done that people contribute. I have already contributed almost a decade of my life to creating Free Software implementations of cellular protocol stacks, and it continues to be the center of my work and spare time. GTP is only one protocol layer on one of those stacks. Pablo, Andreas and I have contributed a Linux kernel implementation that currently only implements IPv4. This implementation can by anyone extended to support IPv6, and as you see from this e-mail thread, there is interest in helping this along by * providing code review (even at times when it's personally difficult for me) * providing free hardware for setting up a "private cellular network" to test interoperability * providing testing tools for validation in absence of such a cellular network > We are at the point where a kernel networking feature that only > supports IPv4 when it could support IPv6 must be considered > incomplete. I agree it is incomplete. There's no doubt about that. But then, even the current "incomplete" implementation is working and can be used to operate an interoperable, spec-compatible IPv4 GGSN. So it serves a practical purpose. All I'm asking is that any IPv6 support patches are developed with that same practical purpose in mind. Going through the cover letter of your series again: > - IPv6 support Cannot be merged as-is, see lengthy review discussion > - Configurable networking interfaces so that GTP kernel can be > used and tested without needing GSN network emulation (i.e. no user > space daemon needed). > - Port numbers are configurable As I indicated, I'm not fundamentally opposed to it, but I'm wondering how much value they bring in reality. Andreas has raised the valid concern that we're adding code that is not used in production setups or by any of the userspace implementations using this tunneling module. The code gets more complex and gets code paths that will not be exercised/tested. Nevertheless, if it helps you to work on GTP, we can merge them from my point of view - unless Pablo and/or Andreas object more strongly. > - GSO,GRO > - A facility that allows application specific GSO callbacks Fine with me, but I think you need to convince other folks about the "application specific GSO" and the usage of the upper bits of shinfo(skb)->gso_type. > - Control of zero UDP checksums Same as above, Dave was raising some question about it, not sure if his concern remains. > - Addition of a dst_cache in the GTP structure Fine with me. As for the patches touching gtp.c: * 04/14 udp recv clean up: fine with me, but kbuild robot complaint? On a minor note, I think you're mixing two unrelated topics: Separating the UDP receive functions and conversion to gro_cells, which violates the "one patch per feature" rule. I'd still merge it, but would prefer two separate patches * 05/14 Remove special mtu handling Pending your rework * 06/14 Eliminate pktinfo and add port configuration I don't like the combination of a non-functional "cosmetic" refactoring of removing a data structure with the introduction of a new feature. Makes it harder to review, impossible to merge only one of the two. For the rationale of introducing the gtp_pktinfo struct, see http://git.osmocom.org/osmo-gtp-kernel/commit/?id=3bc7019c7afd06b5c7d94e5621728d092b82bb85 it was actually intended to make IPv6 support easier, but
Re: [PATCH net-next 03/14] gtp: Call common functions to get tunnel routes and add dst_cache
Hi Andreas, On Wed, Sep 20, 2017 at 05:37:52PM +0200, Andreas Schultz wrote: > I think we had this discussion before. The sending IP and port are not part > of the identity of the PDP context. So IMHO the sender is permitted > to change the source IP at random. Thanks for the reminder: You are correct, at least in the uplink case (MS->GGSN) where there is mobility of the MS. In the downlink case (GGSN->MS), which is the "sending" part for the kernel GTP code used at a GGSN, I'm not sure if that theory holds true in reality. Do you agree that the current behavior of not using automatic source address selection for encapsulated GTP packets but rather using the source address of the socket is intended? Do you further agree that the dst_cache support patch by Tom retains that intended behavior and it should be merged? -- - Harald Weltehttp://laforge.gnumonks.org/ "Privacy in residential applications is a desirable marketing option." (ETSI EN 300 175-7 Ch. A6)
[PATCH] staging: rtl8822be: Keep array subscript no lower than zero
The kbuild test robot reports the following: drivers/staging//rtlwifi/phydm/phydm_dig.c: In function 'odm_pause_dig': drivers/staging//rtlwifi/phydm/phydm_dig.c:494:45: warning: array subscript is below array bounds [-Warray-bounds] odm_write_dig(dm, dig_tab->pause_dig_value[max_level]); This condition is caused when a loop falls through. The fix is to pin max_level to be >= 0. Signed-off-by: Larry Fingerc: kbuild test robot Fixes: 9ce99b04b5b82fdf11e4c76b60a5f82c1e541297 staging: r8822be: Add phydm mini driver --- drivers/staging/rtlwifi/phydm/phydm_dig.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/staging/rtlwifi/phydm/phydm_dig.c b/drivers/staging/rtlwifi/phydm/phydm_dig.c index 31a4f3fcad19..c88b9788363a 100644 --- a/drivers/staging/rtlwifi/phydm/phydm_dig.c +++ b/drivers/staging/rtlwifi/phydm/phydm_dig.c @@ -490,6 +490,8 @@ void odm_pause_dig(void *dm_void, enum phydm_pause_type pause_type, break; } + /* pin max_level to be >= 0 */ + max_level = max_t(s8, 0, max_level); /* write IGI of lower level */ odm_write_dig(dm, dig_tab->pause_dig_value[max_level]); ODM_RT_TRACE(dm, ODM_COMP_DIG, -- 2.12.3
Re: [Patch net-next] net_sched: use idr to allocate bpf filter handles
From: Cong WangDate: Thu, 21 Sep 2017 16:43:00 -0700 > @@ -421,27 +427,6 @@ static int cls_bpf_set_parms(struct net *net, struct > tcf_proto *tp, > return 0; > } > > -static u32 cls_bpf_grab_new_handle(struct tcf_proto *tp, > -struct cls_bpf_head *head) > -{ > - unsigned int i = 0x8000; > - u32 handle; > - > - do { > - if (++head->hgen == 0x7FFF) > - head->hgen = 1; > - } while (--i > 0 && cls_bpf_get(tp, head->hgen)); > - > - if (unlikely(i == 0)) { > - pr_err("Insufficient number of handles\n"); > - handle = 0; > - } else { > - handle = head->hgen; > - } > - > - return handle; > -} > - If I read the code properly, the existing code allows IDs from '1' to and including '0x7ffe', whereas the new code allows up to and including '0x7'. Whether intentional or not this is a change in behavior. If it's OK, it should be at least documented either in a comment or in the commit log itself. Similarly for your other scheduler IDR changes. Thanks.
Re: [PATCH] cnic: Fix an error handling path in 'cnic_alloc_bnx2x_resc()'
From: Christophe JAILLETDate: Fri, 22 Sep 2017 01:01:11 +0200 > All the error handling paths 'goto error', except this one. > We should also go to error in this case, or some resources will be > leaking. > > Signed-off-by: Christophe JAILLET Applied, thank you.
Re: [PATCH net-next v3 3/6] rtnetlink: add helper to dump ifalias
Eric Dumazetwrote: > On Sat, 2017-09-23 at 21:26 +0200, Florian Westphal wrote: > > Reviewed-by: David Ahern > > Signed-off-by: Florian Westphal > > --- > > Changes since v3: don't add rtnl assertion, I placed the assertion > > in my working tree instead as a reminder. > > > > net/core/rtnetlink.c | 11 +-- > > 1 file changed, 9 insertions(+), 2 deletions(-) > > > > diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c > > index c801212ee40e..47c17c3de79a 100644 > > --- a/net/core/rtnetlink.c > > +++ b/net/core/rtnetlink.c > > @@ -1332,6 +1332,14 @@ static int nla_put_iflink(struct sk_buff *skb, const > > struct net_device *dev) > > return nla_put_u32(skb, IFLA_LINK, ifindex); > > } > > > > +static noinline int nla_put_ifalias(struct sk_buff *skb, struct net_device > > *dev) > > > Why noinline here ? > > This function does not use stack at all (and that would call for > noinline_for_stack ) > > > +{ > > + if (dev->ifalias) > > + return nla_put_string(skb, IFLA_IFALIAS, dev->ifalias); > > + > > + return 0; > > +} > > + > > I really do not see the point of not making this RCU aware right away, > or at least make it in the same patch series... I saw no point to mix these refactoring with actual change :-| (and it doesn't help either as-is with netlink dumping, only sysfs path can elide rtnl). Subject: [PATCH net-next] net: core: decouple ifalias get/set from rtnl lock Device alias can be set by either rtnetlink (rtnl is held) or sysfs. rtnetlink holds rtnl mutex, sysfs acquires it for this purpose. Add a new mutex for it plus a seqcount to get a consistent snapshot of the alias buffer. This allows the sysfs path to not take rtnl and would later allow to not hold it when dumping ifalias. Signed-off-by: Florian Westphal --- include/linux/netdevice.h | 3 +- net/core/dev.c| 70 +++ net/core/net-sysfs.c | 14 -- net/core/rtnetlink.c | 7 +++-- 4 files changed, 70 insertions(+), 24 deletions(-) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index f535779d9dc1..0bcedb498f6f 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -1632,7 +1632,7 @@ enum netdev_priv_flags { struct net_device { charname[IFNAMSIZ]; struct hlist_node name_hlist; - char*ifalias; + char__rcu *ifalias; /* * I/O specific fields * FIXME: Merge these and struct ifmap into one @@ -3275,6 +3275,7 @@ void __dev_notify_flags(struct net_device *, unsigned int old_flags, unsigned int gchanges); int dev_change_name(struct net_device *, const char *); int dev_set_alias(struct net_device *, const char *, size_t); +int dev_get_alias(const struct net_device *, char *, size_t); int dev_change_net_namespace(struct net_device *, struct net *, const char *); int __dev_set_mtu(struct net_device *, int); int dev_set_mtu(struct net_device *, int); diff --git a/net/core/dev.c b/net/core/dev.c index 97abddd9039a..92d87b4a2db1 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -188,6 +188,9 @@ static struct napi_struct *napi_by_id(unsigned int napi_id); DEFINE_RWLOCK(dev_base_lock); EXPORT_SYMBOL(dev_base_lock); +static DEFINE_MUTEX(ifalias_mutex); +static seqcount_t ifalias_rename_seq; + /* protects napi_hash addition/deletion and napi_gen_id */ static DEFINE_SPINLOCK(napi_hash_lock); @@ -1265,29 +1268,72 @@ int dev_change_name(struct net_device *dev, const char *newname) */ int dev_set_alias(struct net_device *dev, const char *alias, size_t len) { - char *new_ifalias; - - ASSERT_RTNL(); + char *new_ifalias, *old_ifalias; if (len >= IFALIASZ) return -EINVAL; + mutex_lock(_mutex); + + old_ifalias = rcu_dereference_protected(dev->ifalias, + mutex_is_locked(_mutex)); if (!len) { - kfree(dev->ifalias); - dev->ifalias = NULL; - return 0; + RCU_INIT_POINTER(dev->ifalias, NULL); + goto out; } - new_ifalias = krealloc(dev->ifalias, len + 1, GFP_KERNEL); - if (!new_ifalias) + new_ifalias = __krealloc(old_ifalias, len + 1, GFP_KERNEL); + if (!new_ifalias) { + mutex_unlock(_mutex); return -ENOMEM; - dev->ifalias = new_ifalias; - memcpy(dev->ifalias, alias, len); - dev->ifalias[len] = 0; + } + if (new_ifalias == old_ifalias) { + write_seqcount_begin(_rename_seq); + memcpy(new_ifalias, alias, len); + new_ifalias[len] = 0; + write_seqcount_end(_rename_seq); + mutex_unlock(_mutex); + return len; + }
Re: [PATCH net-next v3 3/6] rtnetlink: add helper to dump ifalias
On Sat, 2017-09-23 at 21:26 +0200, Florian Westphal wrote: > Reviewed-by: David Ahern> Signed-off-by: Florian Westphal > --- > Changes since v3: don't add rtnl assertion, I placed the assertion > in my working tree instead as a reminder. > > net/core/rtnetlink.c | 11 +-- > 1 file changed, 9 insertions(+), 2 deletions(-) > > diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c > index c801212ee40e..47c17c3de79a 100644 > --- a/net/core/rtnetlink.c > +++ b/net/core/rtnetlink.c > @@ -1332,6 +1332,14 @@ static int nla_put_iflink(struct sk_buff *skb, const > struct net_device *dev) > return nla_put_u32(skb, IFLA_LINK, ifindex); > } > > +static noinline int nla_put_ifalias(struct sk_buff *skb, struct net_device > *dev) Why noinline here ? This function does not use stack at all (and that would call for noinline_for_stack ) > +{ > + if (dev->ifalias) > + return nla_put_string(skb, IFLA_IFALIAS, dev->ifalias); > + > + return 0; > +} > + I really do not see the point of not making this RCU aware right away, or at least make it in the same patch series...
[PATCH] au1k_ir.c fix warning: Prefer [subsystem eg: netdev]_info([subsystem]dev, ...
This patch fixes the following checkpatch.pl warning: fix Prefer [subsystem eg: netdev]_info([subsystem]dev, ... then dev_info(dev, ... then pr_info(... to printk(KERN_INFO ... Signed-off-by: Yurii Pavlenko--- drivers/staging/irda/drivers/au1k_ir.c | 40 +++--- 1 file changed, 18 insertions(+), 22 deletions(-) diff --git a/drivers/staging/irda/drivers/au1k_ir.c b/drivers/staging/irda/drivers/au1k_ir.c index be4ea6a..73e3e4b 100644 --- a/drivers/staging/irda/drivers/au1k_ir.c +++ b/drivers/staging/irda/drivers/au1k_ir.c @@ -290,8 +290,7 @@ static int au1k_irda_set_speed(struct net_device *dev, int speed) while (irda_read(aup, IR_STATUS) & (IR_RX_STATUS | IR_TX_STATUS)) { msleep(20); if (!timeout--) { - printk(KERN_ERR "%s: rx/tx disable timeout\n", - dev->name); + netdev_err(dev, "rx/tx disable timeout\n"); break; } } @@ -349,7 +348,7 @@ static int au1k_irda_set_speed(struct net_device *dev, int speed) IR_RX_ENABLE); break; default: - printk(KERN_ERR "%s unsupported speed %x\n", dev->name, speed); + netdev_err(dev, "unsupported speed %x\n", speed); ret = -EINVAL; break; } @@ -361,18 +360,18 @@ static int au1k_irda_set_speed(struct net_device *dev, int speed) irda_write(aup, IR_RING_PROMPT, 0); if (control & (1 << 14)) { - printk(KERN_ERR "%s: configuration error\n", dev->name); + netdev_err(dev, "configuration error\n"); } else { if (control & (1 << 11)) - printk(KERN_DEBUG "%s Valid SIR config\n", dev->name); + netdev_debug(dev, "Valid SIR config\n"); if (control & (1 << 12)) - printk(KERN_DEBUG "%s Valid MIR config\n", dev->name); + netdev_debug(dev, "Valid MIR config\n"); if (control & (1 << 13)) - printk(KERN_DEBUG "%s Valid FIR config\n", dev->name); + netdev_debug(dev, "Valid FIR config\n"); if (control & (1 << 10)) - printk(KERN_DEBUG "%s TX enabled\n", dev->name); + netdev_debug(dev, "TX enabled\n"); if (control & (1 << 9)) - printk(KERN_DEBUG "%s RX enabled\n", dev->name); + netdev_debug(dev, "RX enabled\n"); } return ret; @@ -584,23 +583,21 @@ static int au1k_irda_start(struct net_device *dev) retval = au1k_init(dev); if (retval) { - printk(KERN_ERR "%s: error in au1k_init\n", dev->name); + netdev_err(dev, "error in au1k_init\n"); return retval; } retval = request_irq(aup->irq_tx, _irda_interrupt, 0, dev->name, dev); if (retval) { - printk(KERN_ERR "%s: unable to get IRQ %d\n", - dev->name, dev->irq); + netdev_err(dev, "unable to get IRQ %d\n", dev->irq); return retval; } retval = request_irq(aup->irq_rx, _irda_interrupt, 0, dev->name, dev); if (retval) { free_irq(aup->irq_tx, dev); - printk(KERN_ERR "%s: unable to get IRQ %d\n", - dev->name, dev->irq); + netdev_err(dev, "unable to get IRQ %d\n", dev->irq); return retval; } @@ -673,12 +670,12 @@ static int au1k_irda_hard_xmit(struct sk_buff *skb, struct net_device *dev) flags = ptxd->flags; if (flags & AU_OWN) { - printk(KERN_DEBUG "%s: tx_full\n", dev->name); + netdev_debug(dev, "tx_full\n"); netif_stop_queue(dev); aup->tx_full = 1; return 1; } else if (((aup->tx_head + 1) & (NUM_IR_DESC - 1)) == aup->tx_tail) { - printk(KERN_DEBUG "%s: tx_full\n", dev->name); + netdev_debug(dev, "tx_full\n"); netif_stop_queue(dev); aup->tx_full = 1; return 1; @@ -688,7 +685,7 @@ static int au1k_irda_hard_xmit(struct sk_buff *skb, struct net_device *dev) #if 0 if (irda_read(aup, IR_RX_BYTE_CNT) != 0) { - printk(KERN_DEBUG "tx warning: rx byte cnt %x\n", + netdev_debug(dev, "tx warning: rx byte cnt %x\n", irda_read(aup, IR_RX_BYTE_CNT)); } #endif @@ -726,7 +723,7 @@ static void au1k_tx_timeout(struct net_device *dev) u32 speed; struct au1k_private *aup = netdev_priv(dev); - printk(KERN_ERR "%s: tx timeout\n", dev->name); +
[PATCH 2/2] neigh: make strucrt neigh_table::entry_size unsigned int
Key length can't be negative. Leave comparisons against nla_len() signed just in case truncated attribute can sneak in there. Space savings: add/remove: 0/0 grow/shrink: 0/7 up/down: 0/-7 (-7) function old new delta pneigh_delete273 272 -1 mlx5e_rep_netevent_event14151414 -1 mlx5e_create_encap_header_ipv6 11941193 -1 mlx5e_create_encap_header_ipv4 10711070 -1 cxgb4_l2t_get 11041103 -1 __pneigh_lookup 69 68 -1 __neigh_create 24522451 -1 Signed-off-by: Alexey Dobriyan--- drivers/net/ethernet/chelsio/cxgb4/l2t.c |4 ++-- include/net/neighbour.h |2 +- net/core/neighbour.c | 18 +- 3 files changed, 12 insertions(+), 12 deletions(-) --- a/drivers/net/ethernet/chelsio/cxgb4/l2t.c +++ b/drivers/net/ethernet/chelsio/cxgb4/l2t.c @@ -422,7 +422,7 @@ struct l2t_entry *cxgb4_l2t_get(struct l2t_data *d, struct neighbour *neigh, u8 lport; u16 vlan; struct l2t_entry *e; - int addr_len = neigh->tbl->key_len; + unsigned int addr_len = neigh->tbl->key_len; u32 *addr = (u32 *)neigh->primary_key; int ifidx = neigh->dev->ifindex; int hash = addr_hash(d, addr, addr_len, ifidx); @@ -536,7 +536,7 @@ void t4_l2t_update(struct adapter *adap, struct neighbour *neigh) struct l2t_entry *e; struct sk_buff_head *arpq = NULL; struct l2t_data *d = adap->l2t; - int addr_len = neigh->tbl->key_len; + unsigned int addr_len = neigh->tbl->key_len; u32 *addr = (u32 *) neigh->primary_key; int ifidx = neigh->dev->ifindex; int hash = addr_hash(d, addr, addr_len, ifidx); --- a/include/net/neighbour.h +++ b/include/net/neighbour.h @@ -191,7 +191,7 @@ struct neigh_hash_table { struct neigh_table { int family; unsigned intentry_size; - int key_len; + unsigned intkey_len; __be16 protocol; __u32 (*hash)(const void *pkey, const struct net_device *dev, --- a/net/core/neighbour.c +++ b/net/core/neighbour.c @@ -457,7 +457,7 @@ struct neighbour *neigh_lookup_nodev(struct neigh_table *tbl, struct net *net, const void *pkey) { struct neighbour *n; - int key_len = tbl->key_len; + unsigned int key_len = tbl->key_len; u32 hash_val; struct neigh_hash_table *nht; @@ -488,7 +488,7 @@ struct neighbour *__neigh_create(struct neigh_table *tbl, const void *pkey, struct net_device *dev, bool want_ref) { u32 hash_val; - int key_len = tbl->key_len; + unsigned int key_len = tbl->key_len; int error; struct neighbour *n1, *rc, *n = neigh_alloc(tbl, dev); struct neigh_hash_table *nht; @@ -572,7 +572,7 @@ struct neighbour *__neigh_create(struct neigh_table *tbl, const void *pkey, } EXPORT_SYMBOL(__neigh_create); -static u32 pneigh_hash(const void *pkey, int key_len) +static u32 pneigh_hash(const void *pkey, unsigned int key_len) { u32 hash_val = *(u32 *)(pkey + key_len - 4); hash_val ^= (hash_val >> 16); @@ -585,7 +585,7 @@ static u32 pneigh_hash(const void *pkey, int key_len) static struct pneigh_entry *__pneigh_lookup_1(struct pneigh_entry *n, struct net *net, const void *pkey, - int key_len, + unsigned int key_len, struct net_device *dev) { while (n) { @@ -601,7 +601,7 @@ static struct pneigh_entry *__pneigh_lookup_1(struct pneigh_entry *n, struct pneigh_entry *__pneigh_lookup(struct neigh_table *tbl, struct net *net, const void *pkey, struct net_device *dev) { - int key_len = tbl->key_len; + unsigned int key_len = tbl->key_len; u32 hash_val = pneigh_hash(pkey, key_len); return __pneigh_lookup_1(tbl->phash_buckets[hash_val], @@ -614,7 +614,7 @@ struct pneigh_entry * pneigh_lookup(struct neigh_table *tbl, struct net_device *dev, int creat) { struct pneigh_entry *n; - int key_len = tbl->key_len; + unsigned int key_len = tbl->key_len; u32 hash_val = pneigh_hash(pkey, key_len); read_lock_bh(>lock); @@ -659,7 +659,7 @@ int pneigh_delete(struct neigh_table *tbl, struct net *net, const void *pkey, struct
[PATCH 1/2] neigh: make struct neigh_table::entry_size unsigned int
Neigh entry size can't be negative. Space savings: add/remove: 0/0 grow/shrink: 0/5 up/down: 0/-7 (-7) function old new delta lowpan_neigh_construct25 24 -1 clip_seq_sub_iter152 151 -1 clip_ioctl 14751474 -1 clip_constructor 93 92 -1 __neigh_create 24552452 -3 Signed-off-by: Alexey Dobriyan--- include/net/neighbour.h |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/include/net/neighbour.h +++ b/include/net/neighbour.h @@ -190,7 +190,7 @@ struct neigh_hash_table { struct neigh_table { int family; - int entry_size; + unsigned intentry_size; int key_len; __be16 protocol; __u32 (*hash)(const void *pkey,
[PATCH net-next] net: speed up skb_rbtree_purge()
From: Eric DumazetAs measured in my prior patch ("sch_netem: faster rb tree removal"), rbtree_postorder_for_each_entry_safe() is nice looking but much slower than using rb_next() directly, except when tree is small enough to fit in CPU caches (then the cost is the same) Also note that there is not even an increase of text size : $ size net/core/skbuff.o.before net/core/skbuff.o textdata bss dec hex filename 407111298 0 42009a419 net/core/skbuff.o.before 407111298 0 42009a419 net/core/skbuff.o From: Eric Dumazet --- net/core/skbuff.c | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 16982de649b97b92423a4f9f5eac1e98ca803370..000ce735fa8d649e7abeeef2ebab8501dea96efd 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -2848,12 +2848,15 @@ EXPORT_SYMBOL(skb_queue_purge); */ void skb_rbtree_purge(struct rb_root *root) { - struct sk_buff *skb, *next; + struct rb_node *p = rb_first(root); - rbtree_postorder_for_each_entry_safe(skb, next, root, rbnode) - kfree_skb(skb); + while (p) { + struct sk_buff *skb = rb_entry(p, struct sk_buff, rbnode); - *root = RB_ROOT; + p = rb_next(p); + rb_erase(>rbnode, root); + kfree_skb(skb); + } } /**
[RESEND] Re: usb/net/p54: trying to register non-static key in p54_unregister_leds
This got rejected by gmail once. Let's see if it works now. On Thursday, September 21, 2017 8:22:45 PM CEST Andrey Konovalov wrote: > On Wed, Sep 20, 2017 at 9:55 PM, Johannes Berg >wrote: > > On Wed, 2017-09-20 at 21:27 +0200, Christian Lamparter wrote: > > > >> It seems this is caused as a result of: > >> -> lock_map_acquire(>lockdep_map); > >> lock_map_release(>lockdep_map); > >> > >> in flush_work() [0] > > > > Agree. > > > >> This was added by: > >> > >> commit 0976dfc1d0cd80a4e9dfaf87bd8744612bde475a > >> Author: Stephen Boyd > >> Date: Fri Apr 20 17:28:50 2012 -0700 > >> > >> workqueue: Catch more locking problems with flush_work() > > > > Yes, but that doesn't matter. > > > >> Looking at the Stephen's patch, it's clear that it was made > >> with "static DECLARE_WORK(work, my_work)" in mind. However > >> p54's led_work is "per-device", hence it is stored in the > >> devices context p54_common, which is dynamically allocated. > >> So, maybe revert Stephen's patch? > > > > I disagree - as the lockdep warning says: > > > >> > INFO: trying to register non-static key. > >> > the code is fine but needs lockdep annotation. > >> > turning off the locking correctness validator. > > > > What it needs is to actually correctly go through initializing the work > > at least once. > > > > Without more information, I can't really say what's going on, but I > > assume that something is failing and p54_unregister_leds() is getting > > invoked without p54_init_leds() having been invoked, so essentially > > it's trying to flush a work that was never initialized? > > > > INIT_DELAYED_WORK() does, after all, initialize the lockdep map > > properly via __INIT_WORK(). Ok, thanks. This does indeed explain it. But this also begs the question: Is this really working then? >From what I can tell, if CONFIG_LOCKDEP is not set then there's no BUG no WARN, no other splat or any other odd system behaviour. Does [cancel | flush]_[delayed_]work[_sync] really "just work" by *accident*, as long the delayed_work | work_struct is zeroed out? And should it work in the future as well? > Since I'm able to reproduce this, please let me know if you need me to > collect some debug traces to help with the triage. Do you want to take a shot at making a patch too? At a quick glance, it should be enough to move the [#ifdef CONFIG_P54_LEDS ... #endif] block in p54_unregister_common() into the if (priv->registered) { block (preferably before the ieee80211_unregister_hw(dev). Regards, Christian
[PATCH net-next v3 5/6] rtnetlink: add helpers to dump netnsid information
Reviewed-by: David AhernSigned-off-by: Florian Westphal --- Changes since v2: this hunk was part of patch #4. net/core/rtnetlink.c | 30 +++--- 1 file changed, 19 insertions(+), 11 deletions(-) diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 625342d27c44..e858a2b48d7e 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -1370,6 +1370,23 @@ static noinline int nla_put_ifalias(struct sk_buff *skb, struct net_device *dev) return 0; } +static noinline int rtnl_fill_link_netnsid(struct sk_buff *skb, + const struct net_device *dev) +{ + if (dev->rtnl_link_ops && dev->rtnl_link_ops->get_link_net) { + struct net *link_net = dev->rtnl_link_ops->get_link_net(dev); + + if (!net_eq(dev_net(dev), link_net)) { + int id = peernet2id_alloc(dev_net(dev), link_net); + + if (nla_put_s32(skb, IFLA_LINK_NETNSID, id)) + return -EMSGSIZE; + } + } + + return 0; +} + static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev, int type, u32 pid, u32 seq, u32 change, unsigned int flags, u32 ext_filter_mask, @@ -1458,17 +1475,8 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev, goto nla_put_failure; } - if (dev->rtnl_link_ops && - dev->rtnl_link_ops->get_link_net) { - struct net *link_net = dev->rtnl_link_ops->get_link_net(dev); - - if (!net_eq(dev_net(dev), link_net)) { - int id = peernet2id_alloc(dev_net(dev), link_net); - - if (nla_put_s32(skb, IFLA_LINK_NETNSID, id)) - goto nla_put_failure; - } - } + if (rtnl_fill_link_netnsid(skb, dev)) + goto nla_put_failure; if (!(af_spec = nla_nest_start(skb, IFLA_AF_SPEC))) goto nla_put_failure; -- 2.13.5
[PATCH net-next v3 1/6] selftests: rtnetlink.sh: add rudimentary vrf test
Acked-by: David AhernSigned-off-by: Florian Westphal --- Changes since v1: indent all lines with tabs, not spaces tools/testing/selftests/net/rtnetlink.sh | 42 1 file changed, 42 insertions(+) diff --git a/tools/testing/selftests/net/rtnetlink.sh b/tools/testing/selftests/net/rtnetlink.sh index 4b48de565cae..a048f7a5f94c 100755 --- a/tools/testing/selftests/net/rtnetlink.sh +++ b/tools/testing/selftests/net/rtnetlink.sh @@ -291,6 +291,47 @@ kci_test_ifalias() echo "PASS: set ifalias $namewant for $devdummy" } +kci_test_vrf() +{ + vrfname="test-vrf" + ret=0 + + ip link show type vrf 2>/dev/null + if [ $? -ne 0 ]; then + echo "SKIP: vrf: iproute2 too old" + return 0 + fi + + ip link add "$vrfname" type vrf table 10 + check_err $? + if [ $ret -ne 0 ];then + echo "FAIL: can't add vrf interface, skipping test" + return 0 + fi + + ip -br link show type vrf | grep -q "$vrfname" + check_err $? + if [ $ret -ne 0 ];then + echo "FAIL: created vrf device not found" + return 1 + fi + + ip link set dev "$vrfname" up + check_err $? + + ip link set dev "$devdummy" master "$vrfname" + check_err $? + ip link del dev "$vrfname" + check_err $? + + if [ $ret -ne 0 ];then + echo "FAIL: vrf" + return 1 + fi + + echo "PASS: vrf" +} + kci_test_rtnl() { kci_add_dummy @@ -306,6 +347,7 @@ kci_test_rtnl() kci_test_bridge kci_test_addrlabel kci_test_ifalias + kci_test_vrf kci_del_dummy } -- 2.13.5
[PATCH net-next v3 3/6] rtnetlink: add helper to dump ifalias
Reviewed-by: David AhernSigned-off-by: Florian Westphal --- Changes since v3: don't add rtnl assertion, I placed the assertion in my working tree instead as a reminder. net/core/rtnetlink.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index c801212ee40e..47c17c3de79a 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -1332,6 +1332,14 @@ static int nla_put_iflink(struct sk_buff *skb, const struct net_device *dev) return nla_put_u32(skb, IFLA_LINK, ifindex); } +static noinline int nla_put_ifalias(struct sk_buff *skb, struct net_device *dev) +{ + if (dev->ifalias) + return nla_put_string(skb, IFLA_IFALIAS, dev->ifalias); + + return 0; +} + static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev, int type, u32 pid, u32 seq, u32 change, unsigned int flags, u32 ext_filter_mask, @@ -1374,8 +1382,7 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev, nla_put_u8(skb, IFLA_CARRIER, netif_carrier_ok(dev)) || (dev->qdisc && nla_put_string(skb, IFLA_QDISC, dev->qdisc->ops->id)) || - (dev->ifalias && -nla_put_string(skb, IFLA_IFALIAS, dev->ifalias)) || + nla_put_ifalias(skb, dev) || nla_put_u32(skb, IFLA_CARRIER_CHANGES, atomic_read(>carrier_changes)) || nla_put_u8(skb, IFLA_PROTO_DOWN, dev->proto_down)) -- 2.13.5
[PATCH net-next v3 6/6] rtnetlink: rtnl_have_link_slave_info doesn't need rtnl
it can be switched to rcu. Reviewed-by: David AhernSigned-off-by: Florian Westphal --- Changes since v2: remove ASSERT_RTNL. net/core/rtnetlink.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index e858a2b48d7e..c69451964a44 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -522,11 +522,15 @@ static size_t rtnl_link_get_af_size(const struct net_device *dev, static bool rtnl_have_link_slave_info(const struct net_device *dev) { struct net_device *master_dev; + bool ret = false; - master_dev = netdev_master_upper_dev_get((struct net_device *) dev); + rcu_read_lock(); + + master_dev = netdev_master_upper_dev_get_rcu((struct net_device *)dev); if (master_dev && master_dev->rtnl_link_ops) - return true; - return false; + ret = true; + rcu_read_unlock(); + return ret; } static int rtnl_link_slave_info_fill(struct sk_buff *skb, -- 2.13.5
[PATCH net-next v3 4/6] rtnetlink: add helpers to dump vf information
similar to earlier patches, split out more parts of this function to better see what is happening and where we assume rtnl is locked. Reviewed-by: David AhernSigned-off-by: Florian Westphal --- changes since v2: split this patch into two, last submission also added netnsid helper, which was moved to next patch. net/core/rtnetlink.c | 50 +++--- 1 file changed, 31 insertions(+), 19 deletions(-) diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 47c17c3de79a..625342d27c44 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -1211,6 +1211,36 @@ static noinline_for_stack int rtnl_fill_vfinfo(struct sk_buff *skb, return -EMSGSIZE; } +static noinline_for_stack int rtnl_fill_vf(struct sk_buff *skb, + struct net_device *dev, + u32 ext_filter_mask) +{ + struct nlattr *vfinfo; + int i, num_vfs; + + if (!dev->dev.parent || ((ext_filter_mask & RTEXT_FILTER_VF) == 0)) + return 0; + + num_vfs = dev_num_vf(dev->dev.parent); + if (nla_put_u32(skb, IFLA_NUM_VF, num_vfs)) + return -EMSGSIZE; + + if (!dev->netdev_ops->ndo_get_vf_config) + return 0; + + vfinfo = nla_nest_start(skb, IFLA_VFINFO_LIST); + if (!vfinfo) + return -EMSGSIZE; + + for (i = 0; i < num_vfs; i++) { + if (rtnl_fill_vfinfo(skb, dev, i, vfinfo)) + return -EMSGSIZE; + } + + nla_nest_end(skb, vfinfo); + return 0; +} + static int rtnl_fill_link_ifmap(struct sk_buff *skb, struct net_device *dev) { struct rtnl_link_ifmap map; @@ -1414,27 +1444,9 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev, if (rtnl_fill_stats(skb, dev)) goto nla_put_failure; - if (dev->dev.parent && (ext_filter_mask & RTEXT_FILTER_VF) && - nla_put_u32(skb, IFLA_NUM_VF, dev_num_vf(dev->dev.parent))) + if (rtnl_fill_vf(skb, dev, ext_filter_mask)) goto nla_put_failure; - if (dev->netdev_ops->ndo_get_vf_config && dev->dev.parent && - ext_filter_mask & RTEXT_FILTER_VF) { - int i; - struct nlattr *vfinfo; - int num_vfs = dev_num_vf(dev->dev.parent); - - vfinfo = nla_nest_start(skb, IFLA_VFINFO_LIST); - if (!vfinfo) - goto nla_put_failure; - for (i = 0; i < num_vfs; i++) { - if (rtnl_fill_vfinfo(skb, dev, i, vfinfo)) - goto nla_put_failure; - } - - nla_nest_end(skb, vfinfo); - } - if (rtnl_port_fill(skb, dev, ext_filter_mask)) goto nla_put_failure; -- 2.13.5
[PATCH net-next v3 2/6] rtnetlink: add helper to put master and link ifindexes
rtnl_fill_ifinfo currently requires caller to hold the rtnl mutex. Unfortunately the function is quite large which makes it harder to see which spots require the lock, which spots assume it and which ones could do without. Add helpers to factor out the ifindex dumping, one can use rcu to avoid rtnl dependency. Reviewed-by: David AhernSigned-off-by: Florian Westphal --- no changes since v2. net/core/rtnetlink.c | 32 +++- 1 file changed, 27 insertions(+), 5 deletions(-) diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index a78fd61da0ec..c801212ee40e 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -1307,6 +1307,31 @@ static u32 rtnl_get_event(unsigned long event) return rtnl_event_type; } +static int put_master_ifindex(struct sk_buff *skb, struct net_device *dev) +{ + const struct net_device *upper_dev; + int ret = 0; + + rcu_read_lock(); + + upper_dev = netdev_master_upper_dev_get_rcu(dev); + if (upper_dev) + ret = nla_put_u32(skb, IFLA_MASTER, upper_dev->ifindex); + + rcu_read_unlock(); + return ret; +} + +static int nla_put_iflink(struct sk_buff *skb, const struct net_device *dev) +{ + int ifindex = dev_get_iflink(dev); + + if (dev->ifindex == ifindex) + return 0; + + return nla_put_u32(skb, IFLA_LINK, ifindex); +} + static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev, int type, u32 pid, u32 seq, u32 change, unsigned int flags, u32 ext_filter_mask, @@ -1316,7 +1341,6 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev, struct nlmsghdr *nlh; struct nlattr *af_spec; struct rtnl_af_ops *af_ops; - struct net_device *upper_dev = netdev_master_upper_dev_get(dev); ASSERT_RTNL(); nlh = nlmsg_put(skb, pid, seq, type, sizeof(*ifm), flags); @@ -1345,10 +1369,8 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev, #ifdef CONFIG_RPS nla_put_u32(skb, IFLA_NUM_RX_QUEUES, dev->num_rx_queues) || #endif - (dev->ifindex != dev_get_iflink(dev) && -nla_put_u32(skb, IFLA_LINK, dev_get_iflink(dev))) || - (upper_dev && -nla_put_u32(skb, IFLA_MASTER, upper_dev->ifindex)) || + nla_put_iflink(skb, dev) || + put_master_ifindex(skb, dev) || nla_put_u8(skb, IFLA_CARRIER, netif_carrier_ok(dev)) || (dev->qdisc && nla_put_string(skb, IFLA_QDISC, dev->qdisc->ops->id)) || -- 2.13.5
[PATCH net-next v3 0/6] rtnetlink: preparation patches for further rtnl lock pushdown/removal
First patch adds a rudimentary vrf test case. Remaining patches split large rtnl_fill_ifinfo into smaller chunks to better see which parts 1. require rtnl 2. do not require it at all 3. rely on rtnl locking now but could be converted I removed all the ASSERT_RTNL spots that v1 and v2 added, i will keep that back in my working branch since those are just 'todo' markers for myself. Eric Dumazet pointed out that qdiscs are now freed directly without call_rcu so I dropped the patch that made this assumption from the series. As the remaining patches did not see major changes vs. v2 I retained all reviewed/acked-by tags from David Ahern.
Re: [PATCH net-next v2 3/6] rtnetlink: add helper to dump qdisc name
Eric Dumazetwrote: > On Fri, 2017-09-22 at 08:10 +0200, Florian Westphal wrote: > > We can use rcu here to make this safe even if we would not hold rtnl: > > qdisc_destroy uses call_rcu to free the Qdisc struct. > > > Where do you see call_rcu() called from qdisc_destroy() ? > > You missed this commit I guess > > 752fbcc33405d6f8249465e4b2c4e420091bb825 > ("net_sched: no need to free qdisc in RCU callback") Indeed, I did, patch dropped, thanks for the heads-up.
[PATCH net-next] sch_netem: faster rb tree removal
From: Eric DumazetWhile running TCP tests involving netem storing millions of packets, I had the idea to speed up tfifo_reset() and did experiments. I tried the rbtree_postorder_for_each_entry_safe() method that is used in skb_rbtree_purge() but discovered it was slower than the current tfifo_reset() method. I measured time taken to release skbs with three occupation levels : 10^4, 10^5 and 10^6 skbs with three methods : 1) (current 'naive' method) while ((p = rb_first(>t_root))) { struct sk_buff *skb = netem_rb_to_skb(p); rb_erase(p, >t_root); rtnl_kfree_skbs(skb, skb); } 2) Use rb_next() instead of rb_first() in the loop : p = rb_first(>t_root); while (p) { struct sk_buff *skb = netem_rb_to_skb(p); p = rb_next(p); rb_erase(>rbnode, >t_root); rtnl_kfree_skbs(skb, skb); } 3) "optimized" method using rbtree_postorder_for_each_entry_safe() struct sk_buff *skb, *next; rbtree_postorder_for_each_entry_safe(skb, next, >t_root, rbnode) { rtnl_kfree_skbs(skb, skb); } q->t_root = RB_ROOT; Results : method_1:while (rb_first()) rb_erase() 1 skbs in 690378 ns (69 ns per skb) method_2:rb_first; while (p) { p = rb_next(p); ...} 1 skbs in 541846 ns (54 ns per skb) method_3:rbtree_postorder_for_each_entry_safe() 1 skbs in 868307 ns (86 ns per skb) method_1:while (rb_first()) rb_erase() 6 skbs in 7804021 ns (78 ns per skb) method_2:rb_first; while (p) { p = rb_next(p); ...} 10 skbs in 5942456 ns (59 ns per skb) method_3:rbtree_postorder_for_each_entry_safe() 10 skbs in 11584940 ns (115 ns per skb) method_1:while (rb_first()) rb_erase() 100 skbs in 108577838 ns (108 ns per skb) method_2:rb_first; while (p) { p = rb_next(p); ...} 100 skbs in 82619635 ns (82 ns per skb) method_3:rbtree_postorder_for_each_entry_safe() 100 skbs in 127328743 ns (127 ns per skb) Method 2) is simply faster, probably because it maintains a smaller working size set. Note that this is the method we use in tcp_ofo_queue() already. I will also change skb_rbtree_purge() in a second patch. Signed-off-by: Eric Dumazet --- net/sched/sch_netem.c |7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c index 063a4bdb9ee6f26b01387959e8f6ccd15ec16191..5a4f1008029068372019a965186e7a3c0a18aac3 100644 --- a/net/sched/sch_netem.c +++ b/net/sched/sch_netem.c @@ -361,12 +361,13 @@ static psched_time_t packet_len_2_sched_time(unsigned int len, struct netem_sche static void tfifo_reset(struct Qdisc *sch) { struct netem_sched_data *q = qdisc_priv(sch); - struct rb_node *p; + struct rb_node *p = rb_first(>t_root); - while ((p = rb_first(>t_root))) { + while (p) { struct sk_buff *skb = netem_rb_to_skb(p); - rb_erase(p, >t_root); + p = rb_next(p); + rb_erase(>rbnode, >t_root); rtnl_kfree_skbs(skb, skb); } }
Re: [PATCH net-next v2 3/6] rtnetlink: add helper to dump qdisc name
On Fri, 2017-09-22 at 08:10 +0200, Florian Westphal wrote: > We can use rcu here to make this safe even if we would not hold rtnl: > qdisc_destroy uses call_rcu to free the Qdisc struct. Where do you see call_rcu() called from qdisc_destroy() ? You missed this commit I guess 752fbcc33405d6f8249465e4b2c4e420091bb825 ("net_sched: no need to free qdisc in RCU callback")
Re: [PATCH] net: dsa: avoid null pointer dereference on p->phy
On 09/23/2017 09:57 AM, Colin King wrote: > From: Colin Ian King> > Currently p->phy is being null checked in several places to avoid > null pointer dereferences on p->phy, however, the final call > to phy_attached_info on p->phy when p->phy will perform a null > pointer dereference. Fix this by simply moving the call into > the previous code block that is only executed if p->phy is > not null. > > Detected by CoverityScan, CID#1457034 ("Dereference after null check") The code flow is not exactly easy to read, but I don't see how we can actually wind up in that situation because we check the return values of of_phy_connect() and dsa_slave_phy_connect() earlier on. > > Fixes: 2220943a21e2 ("phy: Centralise print about attached phy") > Signed-off-by: Colin Ian King > --- > net/dsa/slave.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/net/dsa/slave.c b/net/dsa/slave.c > index 02ace7d462c4..29ab4e98639b 100644 > --- a/net/dsa/slave.c > +++ b/net/dsa/slave.c > @@ -1115,10 +1115,9 @@ static int dsa_slave_phy_setup(struct net_device > *slave_dev) > of_phy_deregister_fixed_link(port_dn); > return ret; > } > + phy_attached_info(p->phy); > } > > - phy_attached_info(p->phy); > - > return 0; > } > > -- Florian
Re: [PATCH net-next v2 6/6] rtnetlink: rtnl_have_link_slave_info doesn't need rtnl
On 9/22/17 12:10 AM, Florian Westphal wrote: > Switch it to rcu. > > rtnl_link_slave_info_fill on to other hand does need rtnl mutex for now so > add an annotation to its caller as a reminder. > > Signed-off-by: Florian Westphal> --- > Change since v1: > - don't add ASSERT_RTNL to rtnl_link_slave_info_fill and > rtnl_link_info_fill they are only called via rtnl_link_fill. > > net/core/rtnetlink.c | 12 +--- > 1 file changed, 9 insertions(+), 3 deletions(-) > > diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c > index 590823f70cc3..e6f9e36d9d5a 100644 > --- a/net/core/rtnetlink.c > +++ b/net/core/rtnetlink.c > @@ -522,11 +522,15 @@ static size_t rtnl_link_get_af_size(const struct > net_device *dev, > static bool rtnl_have_link_slave_info(const struct net_device *dev) > { > struct net_device *master_dev; > + bool ret = false; > > - master_dev = netdev_master_upper_dev_get((struct net_device *) dev); > + rcu_read_lock(); > + > + master_dev = netdev_master_upper_dev_get_rcu((struct net_device *)dev); > if (master_dev && master_dev->rtnl_link_ops) > - return true; > - return false; > + ret = true; > + rcu_read_unlock(); > + return ret; > } > > static int rtnl_link_slave_info_fill(struct sk_buff *skb, > @@ -598,6 +602,8 @@ static int rtnl_link_fill(struct sk_buff *skb, const > struct net_device *dev) > struct nlattr *linkinfo; > int err = -EMSGSIZE; > > + ASSERT_RTNL(); Rather than sprinkling the ASSERT_RTNL why not just add a comment above the function which is done in many places. Since this is really meant as your notes as you remove rtnl usage a comment serves the same purpose. > + > linkinfo = nla_nest_start(skb, IFLA_LINKINFO); > if (linkinfo == NULL) > goto out; > Reviewed-by: David Ahern
Re: [PATCH net-next v2 4/6] rtnetlink: add helper to dump ifalias
On 9/22/17 12:10 AM, Florian Westphal wrote: > ifalias is currently protected by rtnl mutex, add assertion > as a reminder. > > Signed-off-by: Florian Westphal> --- > net/core/rtnetlink.c | 13 +++-- > 1 file changed, 11 insertions(+), 2 deletions(-) > > diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c > index ad3f27da37a8..42ff582a010e 100644 > --- a/net/core/rtnetlink.c > +++ b/net/core/rtnetlink.c > @@ -1345,6 +1345,16 @@ static int nla_put_qdisc(struct sk_buff *skb, struct > net_device *dev) > return ret; > } > > +static noinline int nla_put_ifalias(struct sk_buff *skb, struct net_device > *dev) > +{ > + ASSERT_RTNL(); The assert is not needed given the code path. > + > + if (dev->ifalias) > + return nla_put_string(skb, IFLA_IFALIAS, dev->ifalias); > + > + return 0; > +} > + > static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev, > int type, u32 pid, u32 seq, u32 change, > unsigned int flags, u32 ext_filter_mask, > @@ -1386,8 +1396,7 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct > net_device *dev, > put_master_ifindex(skb, dev) || > nla_put_u8(skb, IFLA_CARRIER, netif_carrier_ok(dev)) || > nla_put_qdisc(skb, dev) || > - (dev->ifalias && > - nla_put_string(skb, IFLA_IFALIAS, dev->ifalias)) || > + nla_put_ifalias(skb, dev) || > nla_put_u32(skb, IFLA_CARRIER_CHANGES, > atomic_read(>carrier_changes)) || > nla_put_u8(skb, IFLA_PROTO_DOWN, dev->proto_down)) > Reviewed-by: David Ahern
Re: [PATCH] net: dsa: avoid null pointer dereference on p->phy
On Sat, 2017-09-23 at 17:57 +0100, Colin King wrote: > From: Colin Ian King> > Currently p->phy is being null checked in several places to avoid > null pointer dereferences on p->phy, however, the final call > to phy_attached_info on p->phy when p->phy will perform a null > pointer dereference. Fix this by simply moving the call into > the previous code block that is only executed if p->phy is > not null. > > Detected by CoverityScan, CID#1457034 ("Dereference after null check") > > Fixes: 2220943a21e2 ("phy: Centralise print about attached phy") > Signed-off-by: Colin Ian King > --- > net/dsa/slave.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/net/dsa/slave.c b/net/dsa/slave.c > index 02ace7d462c4..29ab4e98639b 100644 > --- a/net/dsa/slave.c > +++ b/net/dsa/slave.c > @@ -1115,10 +1115,9 @@ static int dsa_slave_phy_setup(struct net_device > *slave_dev) > of_phy_deregister_fixed_link(port_dn); > return ret; > } > + phy_attached_info(p->phy); > } > > - phy_attached_info(p->phy); > - > return 0; > } Huh? Why move this into the test? The test of the block above this change is if (!p->phy) { Perhaps this should be ' if (p->phy) phy_attached_info(p->phy); or simpler } else { phy_attached_info(p->phy); } or maybe reverse the block if (p->phy) { phy_attached_info(p->phy); } else { ret = dsa_slave_phy_connect(slave_dev, p->dp->index); if (ret) { netdev_err(slave_dev, "failed to connect to port %d: %d\n", p->dp->index, ret); if (phy_is_fixed) of_phy_deregister_fixed_link(port_dn); return ret; } } return 0; }
Re: [PATCH net-next v2 5/6] rtnetlink: add helpers to dump vf and netnsid information
On 9/22/17 12:10 AM, Florian Westphal wrote: > +static noinline_for_stack int rtnl_fill_vf(struct sk_buff *skb, > +struct net_device *dev, > +u32 ext_filter_mask) > +{ > + struct nlattr *vfinfo; > + int i, num_vfs; > + > + if (!dev->dev.parent || ((ext_filter_mask & RTEXT_FILTER_VF) == 0)) > + return 0; > + > + num_vfs = dev_num_vf(dev->dev.parent); > + if (nla_put_u32(skb, IFLA_NUM_VF, num_vfs)) > + return -EMSGSIZE; > + > + if (!dev->netdev_ops->ndo_get_vf_config) > + return 0; > + > + vfinfo = nla_nest_start(skb, IFLA_VFINFO_LIST); > + if (!vfinfo) > + return -EMSGSIZE; > + > + for (i = 0; i < num_vfs; i++) { > + if (rtnl_fill_vfinfo(skb, dev, i, vfinfo)) > + return -EMSGSIZE; > + } > + > + nla_nest_end(skb, vfinfo); > + return 0; > +} > + > static int rtnl_fill_link_ifmap(struct sk_buff *skb, struct net_device *dev) > { > struct rtnl_link_ifmap map; > @@ -1355,6 +1385,23 @@ static noinline int nla_put_ifalias(struct sk_buff > *skb, struct net_device *dev) > return 0; > } > > +static noinline int rtnl_fill_link_netnsid(struct sk_buff *skb, > +const struct net_device *dev) > +{ > + if (dev->rtnl_link_ops && dev->rtnl_link_ops->get_link_net) { > + struct net *link_net = dev->rtnl_link_ops->get_link_net(dev); > + > + if (!net_eq(dev_net(dev), link_net)) { > + int id = peernet2id_alloc(dev_net(dev), link_net); > + > + if (nla_put_s32(skb, IFLA_LINK_NETNSID, id)) > + return -EMSGSIZE; > + } > + } > + > + return 0; > +} > + No reason to combine vf and netnsid into 1 patch; completely separate topics > static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev, > int type, u32 pid, u32 seq, u32 change, > unsigned int flags, u32 ext_filter_mask, > @@ -1428,27 +1475,9 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, > struct net_device *dev, > if (rtnl_fill_stats(skb, dev)) > goto nla_put_failure; > > - if (dev->dev.parent && (ext_filter_mask & RTEXT_FILTER_VF) && > - nla_put_u32(skb, IFLA_NUM_VF, dev_num_vf(dev->dev.parent))) > + if (rtnl_fill_vf(skb, dev, ext_filter_mask)) > goto nla_put_failure; > > - if (dev->netdev_ops->ndo_get_vf_config && dev->dev.parent && > - ext_filter_mask & RTEXT_FILTER_VF) { > - int i; > - struct nlattr *vfinfo; > - int num_vfs = dev_num_vf(dev->dev.parent); > - > - vfinfo = nla_nest_start(skb, IFLA_VFINFO_LIST); > - if (!vfinfo) > - goto nla_put_failure; > - for (i = 0; i < num_vfs; i++) { > - if (rtnl_fill_vfinfo(skb, dev, i, vfinfo)) > - goto nla_put_failure; > - } > - > - nla_nest_end(skb, vfinfo); > - } > - > if (rtnl_port_fill(skb, dev, ext_filter_mask)) > goto nla_put_failure; > > @@ -1460,17 +1489,8 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, > struct net_device *dev, > goto nla_put_failure; > } > > - if (dev->rtnl_link_ops && > - dev->rtnl_link_ops->get_link_net) { > - struct net *link_net = dev->rtnl_link_ops->get_link_net(dev); > - > - if (!net_eq(dev_net(dev), link_net)) { > - int id = peernet2id_alloc(dev_net(dev), link_net); > - > - if (nla_put_s32(skb, IFLA_LINK_NETNSID, id)) > - goto nla_put_failure; > - } > - } > + if (rtnl_fill_link_netnsid(skb, dev)) > + goto nla_put_failure; > > if (!(af_spec = nla_nest_start(skb, IFLA_AF_SPEC))) > goto nla_put_failure; > Reviewed-by: David Ahern
Re: [PATCH net-next v2 1/6] selftests: rtnetlink.sh: add rudimentary vrf test
On 9/22/17 12:10 AM, Florian Westphal wrote: > +kci_test_vrf() > +{ > + vrfname="test-vrf" > + ret=0 > + > + ip link show type vrf 2>/dev/null > + if [ $? -ne 0 ]; then > + echo "SKIP: vrf: iproute2 too old" > + return 0 > + fi > + > + ip link add "$vrfname" type vrf table 10 > + check_err $? > + if [ $ret -ne 0 ];then > + echo "FAIL: can't add vrf interface, skipping test" > + return 0 > + fi > + > + ip -br link show type vrf | grep -q "$vrfname" > + check_err $? > + if [ $ret -ne 0 ];then > + echo "FAIL: created vrf device not found" > + return 1 > + fi > + > +ip link set dev "$vrfname" up BTW, if there is a v3 of this set, that ip command is shifted - uses spaces instead of tab.
Re: [PATCH net-next v2 3/6] rtnetlink: add helper to dump qdisc name
On 9/22/17 12:10 AM, Florian Westphal wrote: > We can use rcu here to make this safe even if we would not hold rtnl: > qdisc_destroy uses call_rcu to free the Qdisc struct. > > Signed-off-by: Florian Westphal> --- > net/core/rtnetlink.c | 16 ++-- > 1 file changed, 14 insertions(+), 2 deletions(-) Reviewed-by: David Ahern
Re: [PATCH net-next v2 2/6] rtnetlink: add helper to put master ifindex
On 9/22/17 12:10 AM, Florian Westphal wrote: > rtnl_fill_ifinfo currently requires caller to hold the rtnl mutex. > Unfortunately the function is quite large which makes it harder to see > which spots need the lock, which spots assume it and which ones could do > without. > > Add helpers to factor out the ifindex dumping. > > One helper can use rcu to remove rtnl dependency. > > Signed-off-by: Florian Westphal> --- > net/core/rtnetlink.c | 32 +++- > 1 file changed, 27 insertions(+), 5 deletions(-) Reviewed-by: David Ahern
Re: [PATCH net-next v2 1/6] selftests: rtnetlink.sh: add rudimentary vrf test
On 9/22/17 12:10 AM, Florian Westphal wrote: > Cc: David Ahern> Signed-off-by: Florian Westphal > --- > tools/testing/selftests/net/rtnetlink.sh | 42 > > 1 file changed, 42 insertions(+) Acked-by: David Ahern
Re: tools: selftests: psock_tpacket: skip un-supported tpacket_v3 test
From: Fathi BoudraDate: Sat, 23 Sep 2017 14:27:15 +0300 > On 23 September 2017 at 04:20, David Miller wrote: >> From: Orson Zhai >> Date: Fri, 22 Sep 2017 18:17:17 +0800 >> >>> The TPACKET_V3 test of PACKET_TX_RING will fail with kernel version >>> lower than v4.11. Supported code of tx ring was add with commit id >>> <7f953ab2ba46: af_packet: TX_RING support for TPACKET_V3> at Jan. 3 >>> of 2017. >>> >>> So skip this item test instead of reporting failing for old kernels. >>> >>> Signed-off-by: Orson Zhai >> >> The whole point is to make sure the kernel in which the selftest >> code is present functions properly. >> >> There are many tests in selftests that only work on recent kernels. > > For the background, a similar discussion happened on this thread: > https://lkml.org/lkml/2017/6/22/802 > > There's cases where we'd like to run latest selftests on stable kernels. > You're right, there are many tests in selftests that only work on > recent kernels and we intend to fix it. > Skipping gracefully a test because the feature is missing on the > kernel under test is preferred to fail. This approach is extremely ill advised. It is hard enough to get developers to add new tests in the first place. Having the extra burdon of needing to make the test work on older kernels is going to discourage test writing even more. If you want to "backport" tests, handle them the same way -stable backports are done. With extreme care and making sure they get backported to the kernel they actually would work on.
[PATCH] net: dsa: avoid null pointer dereference on p->phy
From: Colin Ian KingCurrently p->phy is being null checked in several places to avoid null pointer dereferences on p->phy, however, the final call to phy_attached_info on p->phy when p->phy will perform a null pointer dereference. Fix this by simply moving the call into the previous code block that is only executed if p->phy is not null. Detected by CoverityScan, CID#1457034 ("Dereference after null check") Fixes: 2220943a21e2 ("phy: Centralise print about attached phy") Signed-off-by: Colin Ian King --- net/dsa/slave.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/net/dsa/slave.c b/net/dsa/slave.c index 02ace7d462c4..29ab4e98639b 100644 --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -1115,10 +1115,9 @@ static int dsa_slave_phy_setup(struct net_device *slave_dev) of_phy_deregister_fixed_link(port_dn); return ret; } + phy_attached_info(p->phy); } - phy_attached_info(p->phy); - return 0; } -- 2.14.1
Re: [PATCH net-next] liquidio: pass date and time info to NIC firmware
On Fri, Sep 22, 2017 at 05:35:18PM -0700, Felix Manlunas wrote: > From: Veerasenareddy BurruThis is kind of interesting. So you do this once. It could be before the RTC driver has probed, so it is 1970. It could be before the NTP daemon has started, and so the host clock will later jump, or stretch time to gain synchronisation. It seems like you should be periodically giving the time to the firmware, not just once. Andrew
Re: [patch net-next 07/12] mlxsw: spectrum: Add the multicast routing offloading logic
> > So when i look at these patches, i try to make sure the general use > > cases works, not just the plain boring Ethernet switch box use cases > > :-) > > So when doing it, we did think about multi-ASIC situations, so I think it > should > fit :) Maybe, maybe not. DSA is multi switch. That is what the D means, Distributed. All switches in a cluster are presented to the software stack as a single switch. If your multi-ASIC architecture is the same, a distributed switch, then you don't cover the general case. If you represent it as two switches, with frames going between switches going via CPU, then yes, you probably cover the general case. Andrew
[PATCH net-next] tun: delete original tun_get() and rename __tun_get() to tun_get()
From: yuan linyuit seems no need to keep tun_get() and __tun_get() at same time. Signed-off-by: yuan linyu --- drivers/net/tun.c | 26 +++--- 1 file changed, 11 insertions(+), 15 deletions(-) diff --git a/drivers/net/tun.c b/drivers/net/tun.c index 3c9985f..206bc6c 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -692,7 +692,7 @@ static int tun_attach(struct tun_struct *tun, struct file *file, bool skip_filte return err; } -static struct tun_struct *__tun_get(struct tun_file *tfile) +static struct tun_struct *tun_get(struct tun_file *tfile) { struct tun_struct *tun; @@ -705,11 +705,6 @@ static struct tun_struct *__tun_get(struct tun_file *tfile) return tun; } -static struct tun_struct *tun_get(struct file *file) -{ - return __tun_get(file->private_data); -} - static void tun_put(struct tun_struct *tun) { dev_put(tun->dev); @@ -1149,7 +1144,7 @@ static void tun_net_init(struct net_device *dev) static unsigned int tun_chr_poll(struct file *file, poll_table *wait) { struct tun_file *tfile = file->private_data; - struct tun_struct *tun = __tun_get(tfile); + struct tun_struct *tun = tun_get(tfile); struct sock *sk; unsigned int mask = 0; @@ -1569,8 +1564,8 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile, static ssize_t tun_chr_write_iter(struct kiocb *iocb, struct iov_iter *from) { struct file *file = iocb->ki_filp; - struct tun_struct *tun = tun_get(file); struct tun_file *tfile = file->private_data; + struct tun_struct *tun = tun_get(tfile); ssize_t result; if (!tun) @@ -1754,7 +1749,7 @@ static ssize_t tun_chr_read_iter(struct kiocb *iocb, struct iov_iter *to) { struct file *file = iocb->ki_filp; struct tun_file *tfile = file->private_data; - struct tun_struct *tun = __tun_get(tfile); + struct tun_struct *tun = tun_get(tfile); ssize_t len = iov_iter_count(to), ret; if (!tun) @@ -1831,7 +1826,7 @@ static int tun_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len) { int ret; struct tun_file *tfile = container_of(sock, struct tun_file, socket); - struct tun_struct *tun = __tun_get(tfile); + struct tun_struct *tun = tun_get(tfile); if (!tun) return -EBADFD; @@ -1847,7 +1842,7 @@ static int tun_recvmsg(struct socket *sock, struct msghdr *m, size_t total_len, int flags) { struct tun_file *tfile = container_of(sock, struct tun_file, socket); - struct tun_struct *tun = __tun_get(tfile); + struct tun_struct *tun = tun_get(tfile); int ret; if (!tun) @@ -1879,7 +1874,7 @@ static int tun_peek_len(struct socket *sock) struct tun_struct *tun; int ret = 0; - tun = __tun_get(tfile); + tun = tun_get(tfile); if (!tun) return 0; @@ -2265,7 +2260,7 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd, ret = 0; rtnl_lock(); - tun = __tun_get(tfile); + tun = tun_get(tfile); if (cmd == TUNSETIFF) { ret = -EEXIST; if (tun) @@ -2612,15 +2607,16 @@ static int tun_chr_close(struct inode *inode, struct file *file) } #ifdef CONFIG_PROC_FS -static void tun_chr_show_fdinfo(struct seq_file *m, struct file *f) +static void tun_chr_show_fdinfo(struct seq_file *m, struct file *file) { + struct tun_file *tfile = file->private_data; struct tun_struct *tun; struct ifreq ifr; memset(, 0, sizeof(ifr)); rtnl_lock(); - tun = tun_get(f); + tun = tun_get(tfile); if (tun) tun_get_iff(current->nsproxy->net_ns, tun, ); rtnl_unlock(); -- 2.7.4
Re: [PATCH net-next 2/2] net: dsa: lan9303: Add basic offloading of unicast traffic
> The point is: Once both external ports are in "forwarding", I see no way > to prevent traffic flowing directly between the external ports. Generally, there are port vectors. Port X can send frames only to Port Y. If you don't have that, there are possibilities with VLANs. Each port is given a unique VLAN. All incoming untagged traffic is tagged with the VLAN. You just need to keep the VLAN separated and add/remove the VLAN tag in the dsa tag driver. Andrew
Re: [PATCH net] sctp: Fix a big endian bug in sctp_for_each_transport()
On Sat, Sep 23, 2017 at 6:25 PM, Dan Carpenterwrote: > Fundamentally, the "pos" pointer points to "cb->args[2]" which is a long. > In the current code, we only use the high 32 bits and cast it as an > int. That works on little endian systems but will fail on big endian > systems. > > Fixes: d25adbeb0cdb ("sctp: fix an use-after-free issue in sctp_sock_dump") > Signed-off-by: Dan Carpenter > > diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h > index d7d8cba01469..7d87439f299a 100644 > --- a/include/net/sctp/sctp.h > +++ b/include/net/sctp/sctp.h > @@ -121,14 +121,14 @@ void sctp_transport_walk_stop(struct rhashtable_iter > *iter); > struct sctp_transport *sctp_transport_get_next(struct net *net, > struct rhashtable_iter *iter); > struct sctp_transport *sctp_transport_get_idx(struct net *net, > - struct rhashtable_iter *iter, int pos); > + struct rhashtable_iter *iter, long pos); > int sctp_transport_lookup_process(int (*cb)(struct sctp_transport *, void *), > struct net *net, > const union sctp_addr *laddr, > const union sctp_addr *paddr, void *p); > int sctp_for_each_transport(int (*cb)(struct sctp_transport *, void *), > int (*cb_done)(struct sctp_transport *, void *), > - struct net *net, int *pos, void *p); > + struct net *net, long *pos, void *p); > int sctp_for_each_endpoint(int (*cb)(struct sctp_endpoint *, void *), void > *p); > int sctp_get_sctp_info(struct sock *sk, struct sctp_association *asoc, >struct sctp_info *info); > diff --git a/net/sctp/socket.c b/net/sctp/socket.c > index d4730ada7f32..0222743b3aa8 100644 > --- a/net/sctp/socket.c > +++ b/net/sctp/socket.c > @@ -4603,7 +4603,7 @@ struct sctp_transport *sctp_transport_get_next(struct > net *net, > > struct sctp_transport *sctp_transport_get_idx(struct net *net, > struct rhashtable_iter *iter, > - int pos) > + long pos) > { > void *obj = SEQ_START_TOKEN; > > @@ -4659,7 +4659,7 @@ EXPORT_SYMBOL_GPL(sctp_transport_lookup_process); > > int sctp_for_each_transport(int (*cb)(struct sctp_transport *, void *), > int (*cb_done)(struct sctp_transport *, void *), > - struct net *net, int *pos, void *p) { > + struct net *net, long *pos, void *p) { > struct rhashtable_iter hti; > struct sctp_transport *tsp; > int ret; > diff --git a/net/sctp/sctp_diag.c b/net/sctp/sctp_diag.c > index 22ed01a76b19..e9d5405aa6ac 100644 > --- a/net/sctp/sctp_diag.c > +++ b/net/sctp/sctp_diag.c > @@ -493,7 +493,7 @@ static void sctp_diag_dump(struct sk_buff *skb, struct > netlink_callback *cb, > goto done; > > sctp_for_each_transport(sctp_sock_filter, sctp_sock_dump, > - net, (int *)>args[2], ); > + net, >args[2], ); > > done: > cb->args[1] = cb->args[4]; Better not to touch sctp_for_each_transport and sctp_transport_get_idx which are supposed to be also used elsewhere as common apis. Can you pls try to fix this in sctp_diag_dump(), like: @@ -463,6 +463,7 @@ static void sctp_diag_dump(struct sk_buff *skb, struct netlink_callback *cb, .r = r, .net_admin = netlink_net_capable(cb->skb, CAP_NET_ADMIN), }; + int pos = cb->args[2]; /* eps hashtable dumps * args: @@ -493,7 +494,8 @@ static void sctp_diag_dump(struct sk_buff *skb, struct netlink_callback *cb, goto done; sctp_for_each_transport(sctp_sock_filter, sctp_sock_dump, - net, (int *)>args[2], ); + net, , ); + cb->args[2] = pos;
Re: tools: selftests: psock_tpacket: skip un-supported tpacket_v3 test
On 23 September 2017 at 04:20, David Millerwrote: > From: Orson Zhai > Date: Fri, 22 Sep 2017 18:17:17 +0800 > >> The TPACKET_V3 test of PACKET_TX_RING will fail with kernel version >> lower than v4.11. Supported code of tx ring was add with commit id >> <7f953ab2ba46: af_packet: TX_RING support for TPACKET_V3> at Jan. 3 >> of 2017. >> >> So skip this item test instead of reporting failing for old kernels. >> >> Signed-off-by: Orson Zhai > > The whole point is to make sure the kernel in which the selftest > code is present functions properly. > > There are many tests in selftests that only work on recent kernels. For the background, a similar discussion happened on this thread: https://lkml.org/lkml/2017/6/22/802 There's cases where we'd like to run latest selftests on stable kernels. You're right, there are many tests in selftests that only work on recent kernels and we intend to fix it. Skipping gracefully a test because the feature is missing on the kernel under test is preferred to fail. > I'm not applying this, sorry. > -- > To unsubscribe from this list: send the line "unsubscribe linux-kselftest" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH net-next] cxgb4: do DCB state reset in couple of places
reset the driver's DCB state in couple of places where it was missing. Signed-off-by: Casey LeedomSigned-off-by: Ganesh Goudar --- drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c | 15 +++ drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.h | 1 + drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 10 -- 3 files changed, 20 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c index 6ee2ed3..4e7f72b 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c @@ -40,8 +40,7 @@ static inline bool cxgb4_dcb_state_synced(enum cxgb4_dcb_state state) return false; } -/* Initialize a port's Data Center Bridging state. Typically used after a - * Link Down event. +/* Initialize a port's Data Center Bridging state. */ void cxgb4_dcb_state_init(struct net_device *dev) { @@ -106,6 +105,15 @@ static void cxgb4_dcb_cleanup_apps(struct net_device *dev) } } +/* Reset a port's Data Center Bridging state. Typically used after a + * Link Down event. + */ +void cxgb4_dcb_reset(struct net_device *dev) +{ + cxgb4_dcb_cleanup_apps(dev); + cxgb4_dcb_state_init(dev); +} + /* Finite State machine for Data Center Bridging. */ void cxgb4_dcb_state_fsm(struct net_device *dev, @@ -194,8 +202,7 @@ void cxgb4_dcb_state_fsm(struct net_device *dev, * state. We need to reset back to a ground state * of incomplete. */ - cxgb4_dcb_cleanup_apps(dev); - cxgb4_dcb_state_init(dev); + cxgb4_dcb_reset(dev); dcb->state = CXGB4_DCB_STATE_FW_INCOMPLETE; dcb->supported = CXGB4_DCBX_FW_SUPPORT; linkwatch_fire_event(dev); diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.h index ccf24d3..02040b9 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.h +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.h @@ -131,6 +131,7 @@ struct port_dcb_info { void cxgb4_dcb_state_init(struct net_device *); void cxgb4_dcb_version_init(struct net_device *); +void cxgb4_dcb_reset(struct net_device *dev); void cxgb4_dcb_state_fsm(struct net_device *, enum cxgb4_dcb_state_input); void cxgb4_dcb_handle_fw_update(struct adapter *, const struct fw_port_cmd *); void cxgb4_dcb_set_caps(struct adapter *, const struct fw_port_cmd *); diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c index aa93ae9..13b636b 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c @@ -281,7 +281,7 @@ void t4_os_link_changed(struct adapter *adapter, int port_id, int link_stat) else { #ifdef CONFIG_CHELSIO_T4_DCB if (cxgb4_dcb_enabled(dev)) { - cxgb4_dcb_state_init(dev); + cxgb4_dcb_reset(dev); dcb_tx_queue_prio_enable(dev, false); } #endif /* CONFIG_CHELSIO_T4_DCB */ @@ -2304,10 +2304,16 @@ static int cxgb_close(struct net_device *dev) { struct port_info *pi = netdev_priv(dev); struct adapter *adapter = pi->adapter; + int ret; netif_tx_stop_all_queues(dev); netif_carrier_off(dev); - return t4_enable_vi(adapter, adapter->pf, pi->viid, false, false); + ret = t4_enable_vi(adapter, adapter->pf, pi->viid, false, false); +#ifdef CONFIG_CHELSIO_T4_DCB + cxgb4_dcb_reset(dev); + dcb_tx_queue_prio_enable(dev, false); +#endif + return ret; } int cxgb4_create_server_filter(const struct net_device *dev, unsigned int stid, -- 2.1.0
[PATCH net] sctp: Fix a big endian bug in sctp_for_each_transport()
Fundamentally, the "pos" pointer points to "cb->args[2]" which is a long. In the current code, we only use the high 32 bits and cast it as an int. That works on little endian systems but will fail on big endian systems. Fixes: d25adbeb0cdb ("sctp: fix an use-after-free issue in sctp_sock_dump") Signed-off-by: Dan Carpenterdiff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h index d7d8cba01469..7d87439f299a 100644 --- a/include/net/sctp/sctp.h +++ b/include/net/sctp/sctp.h @@ -121,14 +121,14 @@ void sctp_transport_walk_stop(struct rhashtable_iter *iter); struct sctp_transport *sctp_transport_get_next(struct net *net, struct rhashtable_iter *iter); struct sctp_transport *sctp_transport_get_idx(struct net *net, - struct rhashtable_iter *iter, int pos); + struct rhashtable_iter *iter, long pos); int sctp_transport_lookup_process(int (*cb)(struct sctp_transport *, void *), struct net *net, const union sctp_addr *laddr, const union sctp_addr *paddr, void *p); int sctp_for_each_transport(int (*cb)(struct sctp_transport *, void *), int (*cb_done)(struct sctp_transport *, void *), - struct net *net, int *pos, void *p); + struct net *net, long *pos, void *p); int sctp_for_each_endpoint(int (*cb)(struct sctp_endpoint *, void *), void *p); int sctp_get_sctp_info(struct sock *sk, struct sctp_association *asoc, struct sctp_info *info); diff --git a/net/sctp/socket.c b/net/sctp/socket.c index d4730ada7f32..0222743b3aa8 100644 --- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -4603,7 +4603,7 @@ struct sctp_transport *sctp_transport_get_next(struct net *net, struct sctp_transport *sctp_transport_get_idx(struct net *net, struct rhashtable_iter *iter, - int pos) + long pos) { void *obj = SEQ_START_TOKEN; @@ -4659,7 +4659,7 @@ EXPORT_SYMBOL_GPL(sctp_transport_lookup_process); int sctp_for_each_transport(int (*cb)(struct sctp_transport *, void *), int (*cb_done)(struct sctp_transport *, void *), - struct net *net, int *pos, void *p) { + struct net *net, long *pos, void *p) { struct rhashtable_iter hti; struct sctp_transport *tsp; int ret; diff --git a/net/sctp/sctp_diag.c b/net/sctp/sctp_diag.c index 22ed01a76b19..e9d5405aa6ac 100644 --- a/net/sctp/sctp_diag.c +++ b/net/sctp/sctp_diag.c @@ -493,7 +493,7 @@ static void sctp_diag_dump(struct sk_buff *skb, struct netlink_callback *cb, goto done; sctp_for_each_transport(sctp_sock_filter, sctp_sock_dump, - net, (int *)>args[2], ); + net, >args[2], ); done: cb->args[1] = cb->args[4];
Re: [PATCH net-next 2/2] net: dsa: lan9303: Add basic offloading of unicast traffic
Den 22. sep. 2017 22:08, skrev Andrew Lunn: I'm wondering how this is supposed to work. Please add a good comment here, since the hardware is forcing you to do something odd. Maybe it would be a good idea to save the STP state in chip. And then when chip->is_bridged is set true, change the state in the hardware to the saved value? What happens when port 0 is added to the bridge, there is then a minute pause and then port 1 is added? I would expect that as soon as port 0 is added, the STP state machine for port 0 will start and move into listening and then forwarding. Due to hardware limitations it looks like you cannot do this. So what state is the hardware effectively in? Blocking? Forwarding? Then port 1 is added. You can then can respect the states. port 1 will do blocking->listening->forwarding, but what about port 0? The calls won't get repeated? How does it transition to forwarding? Andrew I see your point with the "minute pause" argument. Although a bit contrived use case, it is easy to fix by caching the STP state, as you suggest. So I can do that. I don't think it is contrived. I've done bridge configuration by hand for testing purposes. I've also set the forwarding delay to very small values, so there is a clear race condition here. How does other DSA HW chips handle port separation? Knowing that could perhaps help me know what to look for. They have better hardware :-) Generally each port is totally independent. You can change the STP state per port without restrictions. We can indeed change the STP state per lan9303 port "without restrictions". The point is: Once both external ports are in "forwarding", I see no way to prevent traffic flowing directly between the external ports. Andrew Egil
Re: [patch net-next 07/12] mlxsw: spectrum: Add the multicast routing offloading logic
On 09/22/2017 04:21 PM, Andrew Lunn wrote: > On Fri, Sep 22, 2017 at 11:36:59AM +0300, Yotam Gigi wrote: >> On 09/21/2017 06:26 PM, Andrew Lunn wrote: +static void mlxsw_sp_mr_route_stats_update(struct mlxsw_sp *mlxsw_sp, + struct mlxsw_sp_mr_route *mr_route) +{ + struct mlxsw_sp_mr *mr = mlxsw_sp->mr; + u64 packets, bytes; + + if (mr_route->route_action == MLXSW_SP_MR_ROUTE_ACTION_TRAP) + return; + + mr->mr_ops->route_stats(mlxsw_sp, mr_route->route_priv, , + ); + + switch (mr_route->mr_table->proto) { + case MLXSW_SP_L3_PROTO_IPV4: + mr_route->mfc4->mfc_un.res.pkt = packets; + mr_route->mfc4->mfc_un.res.bytes = bytes; >>> What about wrong_if and lastuse? > Hi Yotam > >> wronf_if is updated by ipmr as it is trapped to the CPU. > Great. > >> We did not address lastuse currently, though it can be easily >> addressed here. > Please do. I've written multicast routing daemons, where i use it to > flush out MFCs which are no longer in use. Having it always 0 is going > to break daemons. I will. Thanks for the feedback! > >>> Is an mfc with iif on the host, not the switch, not offloaded? >> >> I am not sure I followed. What do you mean MFC with iif on the host? you mean >> MFC with iif that is an external NIC which is not part of the spectrum ASIC? > Yes. We probably have different perspectives on the world. To > Mellanox, everything is a switch in a box. In the DSA world, we tend > to think of having a general purpose machine which also has a switch > connected. Think of a wireless access point, set top box, passenger > entertainment system. We have applications on the general purpose > computer, we have wifi interfaces, cable modems, etc. Think about all > the different packages in LEDE. We might have a multicast video > stream, coming from the cable modem being sent over ports of the > switch to clients. > > So when i look at these patches, i try to make sure the general use > cases works, not just the plain boring Ethernet switch box use cases > :-) So when doing it, we did think about multi-ASIC situations, so I think it should fit :) > >> in this case, the route will not be offloaded and all traffic will >> pass in slowpath. > O.K. I was just thinking if those counters need to be +=, not =. But > either the iif is on the host, or it is in the switch. It cannot be > both. So = is O.K. > > Thanks > Andrew