Re: [PATCH] net/neighbour: fix potential null pointer deference

2019-05-31 Thread Konstantin Khlebnikov
On 31.05.2019 11:29, Young Xiao wrote: There is a possible null pointer dereference bug in neigh_fill_info(), which is similar to the bug which was fixed in commit 6adc5fd6a142 ("net/neighbour: fix crash at dumping device-agnostic proxy entries"). Have you seen this in real life? I see nobody wh

Re: Repeating "unregister_netdevice: waiting for lo to become free" caused by upstream 76da0704507bb ("ipv6: only call ip6_route_dev_notify() once for NETDEV_UNREGISTER")

2018-04-25 Thread Konstantin Khlebnikov
On 25.04.2018 17:16, Rafał Miłecki wrote: On 23.04.2018 15:08, Rafał Miłecki wrote: I've just updated my kernel 4.4.x and noticed a regression. Bisecting pointed me to the commit 2417da3f4d6bc ("ipv6: only call ip6_route_dev_notify() once for NETDEV_UNREGISTER") [0] which is backport of upstrea

Re: [PATCH] net_sched/sfq: update hierarchical backlog when drop packet

2017-08-15 Thread Konstantin Khlebnikov
On 15.08.2017 17:09, Eric Dumazet wrote: On Tue, 2017-08-15 at 16:37 +0300, Konstantin Khlebnikov wrote: When sfq_enqueue() drops head packet or packet from another queue it has to update backlog at upper qdiscs too. Signed-off-by: Konstantin Khlebnikov Fixes: 2f5fb43f ("net_

Re: [PATCH 1/2] net_sched: call qlen_notify only if child qdisc is empty

2017-08-16 Thread Konstantin Khlebnikov
On 16.08.2017 20:22, Cong Wang wrote: On Tue, Aug 15, 2017 at 6:39 AM, Konstantin Khlebnikov wrote: This callback is used for deactivating class in parent qdisc. This is cheaper to test queue length right here. Also this allows to catch draining screwed backlog and prevent second

[PATCH] net_sched: fix order of queue length updates in qdisc_replace()

2017-08-19 Thread Konstantin Khlebnikov
ackets from empty qdisc and corrupting state at reactivating this class in future. Signed-off-by: Konstantin Khlebnikov Fixes: 86a7996cc8a0 ("net_sched: introduce qdisc_replace() helper") Cc: Stable --- include/net/sch_generic.h |5 - 1 file changed, 4 insertions(+), 1 deletion(

Re: [PATCH] net_sched: fix order of queue length updates in qdisc_replace()

2017-08-19 Thread Konstantin Khlebnikov
17 15:37, Konstantin Khlebnikov wrote: It is important to call qdisc_tree_reduce_backlog() after changing queue length. Parent qdisc should deactivate class in ->qlen_notify() called from qdisc_tree_reduce_backlog() but this happens only if qdisc->q.qlen is zero. Missed class deactivations le
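
The ordering problem described in this thread can be modelled in user space. The following is only a sketch: `toy_class`, `toy_qdisc`, `toy_tree_reduce_backlog()` and both `toy_replace_*()` functions are hypothetical stand-ins, not the kernel's structures and not the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a class in a parent qdisc. */
struct toy_class {
	bool active;
};

/* Hypothetical stand-in for a child qdisc attached to that class. */
struct toy_qdisc {
	unsigned int qlen;        /* packets currently queued */
	struct toy_class *parent; /* class that owns this qdisc */
};

/*
 * Models the notification step: the parent deactivates the class
 * only when it sees that the child queue is already empty.
 */
static void toy_tree_reduce_backlog(struct toy_qdisc *q, unsigned int n)
{
	(void)n; /* a fuller model would also adjust upper-level counters */
	if (q->qlen == 0)
		q->parent->active = false;
}

/* Problematic order: notify first, then empty the queue. */
static void toy_replace_wrong(struct toy_qdisc *old)
{
	unsigned int dropped = old->qlen;

	assert(old->parent != NULL);
	toy_tree_reduce_backlog(old, dropped); /* qlen != 0: no deactivation */
	old->qlen = 0;
}

/* Corrected order: empty the queue first, then notify. */
static void toy_replace_fixed(struct toy_qdisc *old)
{
	unsigned int dropped = old->qlen;

	assert(old->parent != NULL);
	old->qlen = 0;
	toy_tree_reduce_backlog(old, dropped); /* qlen == 0: class deactivated */
}
```

In this model the "wrong" variant leaves the class marked active although its queue is empty, which is the kind of missed deactivation the message refers to.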

[PATCH] net_sched/hhf: update hierarchical backlog when drop packet

2017-08-21 Thread Konstantin Khlebnikov
When hhf_enqueue() drops packet from another bucket it has to update backlog at upper qdiscs too. Signed-off-by: Konstantin Khlebnikov Fixes: 2f5fb43f ("net_sched: update hierarchical backlog too") --- net/sched/sch_hhf.c |5 - 1 file changed, 4 insertions(+), 1 deletio

[PATCH RFC] net_sched/codel: do not defer queue length update

2017-08-21 Thread Konstantin Khlebnikov
her problem in HFSC - now operation peek could fail and deactivate parent class. Signed-off-by: Konstantin Khlebnikov Link: https://bugzilla.kernel.org/show_bug.cgi?id=109581 --- net/sched/sch_codel.c| 14 ++ net/sched/sch_fq_codel.c | 24 +++- 2 files

[PATCH] net: bpfilter: fallback to netfilter if failed to load bpfilter kernel module

2019-05-15 Thread Konstantin Khlebnikov
If bpfilter is not available, return ENOPROTOOPT to fall back to netfilter. Function request_module() returns both errors and userspace exit codes. Just ignore them. Rechecking bpfilter_ops is enough. Fixes: d2ba09c17a06 ("net: add skeleton of bpfilter kernel module") Signed-off-by:

[PATCH RFC] proc/meminfo: add NetBuffers counter for socket buffers

2019-05-15 Thread Konstantin Khlebnikov
-by: Konstantin Khlebnikov --- fs/proc/meminfo.c |5 - include/linux/mm.h |6 ++ mm/page_alloc.c|3 ++- net/core/sock.c| 20 net/sctp/socket.c |2 +- 5 files changed, 33 insertions(+), 3 deletions(-) diff --git a/fs/proc/meminfo.c b/fs/proc

[BUG] mlx5 have problems with ipv4-ipv6 tunnels in linux 4.4

2018-07-03 Thread Konstantin Khlebnikov
I'm seeing problems with tunnelled traffic with Mellanox Technologies MT27710 Family [ConnectX-4 Lx] using vanilla driver from linux 4.4.y. Packets with payload bigger than 116 bytes are not transmitted. Smaller packets and normal ipv6 work fine. In linux 4.9, 4.14 and out-of-tree driver everything

Re: [BUG] mlx5 have problems with ipv4-ipv6 tunnels in linux 4.4

2018-07-10 Thread Konstantin Khlebnikov
On 10.07.2018 01:31, Saeed Mahameed wrote: On Tue, Jul 3, 2018 at 10:45 PM, Konstantin Khlebnikov wrote: I'm seeing problems with tunnelled traffic with Mellanox Technologies MT27710 Family [ConnectX-4 Lx] using vanilla driver from linux 4.4.y Packets with payload bigger than 116 byte

[PATCH] net_sched: always reset qdisc backlog in qdisc_reset()

2017-09-20 Thread Konstantin Khlebnikov
SKB stored in qdisc->gso_skb is also counted into backlog. Some qdiscs don't reset backlog to zero in ->reset(), for example sfq just dequeues and frees all queued skbs. Signed-off-by: Konstantin Khlebnikov Fixes: 2f5fb43f ("net_sched: update hierarchical backlog too"

[PATCH] net_sched/hfsc: fix curve activation in hfsc_change_class()

2017-09-20 Thread Konstantin Khlebnikov
If real-time or fair-share curves are enabled in hfsc_change_class() class isn't inserted into rb-trees yet. Thus init_ed() and init_vf() must be called in place of update_ed() and update_vf(). Remove isn't required because for now curves cannot be disabled. Signed-off-by: Konstantin

[PATCH RFC] net/mlx5_en: switch to Toeplitz RSS hash by default

2018-08-31 Thread Konstantin Khlebnikov
g can limit RPS functionality". XOR is default in mlx5_en since commit 2be6967cdbc9 ("net/mlx5e: Support ETH_RSS_HASH_XOR"). Hash function could be set via ethtool. But it would be nice to have single standard for drivers or proper description why this one is special. Signed-off-by: K

Re: [PATCH RFC] net/mlx5_en: switch to Toeplitz RSS hash by default

2018-09-02 Thread Konstantin Khlebnikov
On 02.09.2018 12:29, Tariq Toukan wrote: On 31/08/2018 2:29 PM, Konstantin Khlebnikov wrote: XOR (MLX5_RX_HASH_FN_INVERTED_XOR8) gives only 8 bits. It seems not enough for RFS. All other drivers use toeplitz. Driver mlx4_en uses Toeplitz by default and warns if hash XOR is used together with

Re: [PATCH RFC] net/mlx5_en: switch to Toeplitz RSS hash by default

2018-09-06 Thread Konstantin Khlebnikov
On 06.09.2018 08:24, Saeed Mahameed wrote: On Sun, Sep 2, 2018 at 2:55 AM, Konstantin Khlebnikov wrote: On 02.09.2018 12:29, Tariq Toukan wrote: On 31/08/2018 2:29 PM, Konstantin Khlebnikov wrote: XOR (MLX5_RX_HASH_FN_INVERTED_XOR8) gives only 8 bits. It seems not enough for RFS. All

Re: 4.4.103 linux kernel regression

2017-12-23 Thread Konstantin Khlebnikov
ck device. Mathias, please try debug patch from attachment. It logs all refcount changes for loopback in non-host net namespace. Hopefully the log will be tiny and show what is missing. Looks like vsftpd creates and destroys empty net-ns, like "unshare -n true" net: debug lo refcnt

Re: 4.4.103 linux kernel regression

2017-12-24 Thread Konstantin Khlebnikov
ly, so I'm not sure it would work on the latest - but I can give it a try. Regards Mathias On Sat, 23 Dec 2017, 17:36 Konstantin Khlebnikov wrote: On 23.12.2017 16:52, Greg KH wrote: > adding stable@ and netdev@ > > On

[PATCH] net_sched: blackhole: tell upper qdisc about dropped packets

2018-06-15 Thread Konstantin Khlebnikov
schedules watchdog work endlessly. This patch returns __NET_XMIT_BYPASS in addition to NET_XMIT_SUCCESS; this flag tells the upper layer that the packet is gone and isn't queued. Signed-off-by: Konstantin Khlebnikov --- net/sched/sch_blackhole.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
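
A rough user-space illustration of the idea, assuming an upper qdisc that counts a packet only when the child did not signal a bypass. The `TOY_*` constants, their values, and both functions are hypothetical and are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_XMIT_SUCCESS 0x00000 /* hypothetical: packet accepted */
#define TOY_XMIT_BYPASS  0x20000 /* hypothetical flag: packet was not queued */

/* Hypothetical stand-in for the upper (classful) qdisc. */
struct toy_parent {
	unsigned int qlen; /* packets the parent believes are queued below */
};

/* A blackhole child accepts the packet but reports that it is gone. */
static int toy_blackhole_enqueue(void)
{
	return TOY_XMIT_SUCCESS | TOY_XMIT_BYPASS;
}

/* The parent bumps its counter only for packets that were really queued. */
static int toy_parent_enqueue(struct toy_parent *p)
{
	int ret;

	assert(p != NULL);
	ret = toy_blackhole_enqueue();
	if (!(ret & TOY_XMIT_BYPASS))
		p->qlen++;
	return ret;
}
```

Without the extra flag the parent in this model would keep incrementing `qlen` for packets that no longer exist anywhere below it.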

Re: [PATCH] net_sched: blackhole: tell upper qdisc about dropped packets

2018-06-15 Thread Konstantin Khlebnikov
On 15.06.2018 16:13, Eric Dumazet wrote: On 06/15/2018 03:27 AM, Konstantin Khlebnikov wrote: When blackhole is used on top of classful qdisc like hfsc it breaks qlen and backlog counters because packets disappear without notice. In HFSC non-zero qlen while all classes are inactive

[PATCH] net/ipv4: add comment about connect() to INADDR_ANY

2020-07-25 Thread Konstantin Khlebnikov
Copy comment from net/ipv6/tcp_ipv6.c to help future readers. Signed-off-by: Konstantin Khlebnikov --- net/ipv4/route.c |1 + 1 file changed, 1 insertion(+) diff --git a/net/ipv4/route.c b/net/ipv4/route.c index a01efa062f6b..303fe706cbd2 100644 --- a/net/ipv4/route.c +++ b/net/ipv4

Re: [PATCH] net/core/neighbour: tell kmemleak about hash tables

2019-01-10 Thread Konstantin Khlebnikov
On Thu, Jan 10, 2019 at 11:45 PM Cong Wang wrote: > > On Tue, Jan 8, 2019 at 1:30 AM Konstantin Khlebnikov > wrote: > > @@ -443,12 +444,14 @@ static struct neigh_hash_table > > *neigh_hash_alloc(unsigned int shift) > > ret = kmalloc(sizeof(*ret), GFP_AT

[PATCH] net/core/neighbour: fix kmemleak minimal reference count for hash tables

2019-01-14 Thread Konstantin Khlebnikov
This should be 1 for normal allocations, 0 disables leak reporting. Signed-off-by: Konstantin Khlebnikov Reported-by: Cong Wang Fixes: 85704cb8dcfd ("net/core/neighbour: tell kmemleak about hash tables") --- net/core/neighbour.c |2 +- 1 file changed, 1 insertion(+), 1 deletio

[PATCH] e1000e: fix cyclic resets at link up with active tx

2019-01-14 Thread Konstantin Khlebnikov
in order to enable jumbo frames [ 37.790342] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None This patch flushes tx buffers only once when carrier is off rather than at each watchdog iteration. Signed-off-by: Konstantin Khlebnikov --- drivers/net/ethernet/int

Re: [Intel-wired-lan] [PATCH] e1000e: fix cyclic resets at link up with active tx

2019-01-17 Thread Konstantin Khlebnikov
On 17.01.2019 10:57, Neftin, Sasha wrote: On 1/14/2019 15:29, Konstantin Khlebnikov wrote: I'm seeing series of e1000e resets (sometimes endless) at system boot if something generates tx traffic at this time. In my case this is netconsole who sends message "e1000e :02:00.0:

[PATCH v2] macvlan: use per-cpu queues for broadcast and multicast packets

2018-12-19 Thread Konstantin Khlebnikov
Currently macvlan has single per-port queue for broadcast and multicast. This disrupts order of packets when flows from different cpus are mixed. This patch replaces this queue with single set of per-cpu queues. Pointer to macvlan port is passed in skb control block. Signed-off-by: Konstantin

[PATCH] net/core/neighbour: tell kmemleak about hash tables

2019-01-08 Thread Konstantin Khlebnikov
9b/0x400 [<81cdb353>] entry_SYSCALL_64_after_hwframe+0x49/0xbe [<00005767ed39>] 0x Signed-off-by: Konstantin Khlebnikov --- net/core/neighbour.c | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/net/core/neighbour.c b

Re: [PATCH] net/core/neighbour: tell kmemleak about hash tables

2019-01-08 Thread Konstantin Khlebnikov
On 08.01.2019 14:59, Eric Dumazet wrote: On 01/08/2019 01:30 AM, Konstantin Khlebnikov wrote: This fixes false-positive kmemleak reports about leaked neighbour entries: unreferenced object 0x8885c6e4d0a8 (size 1024): size 1024 object : should have been allocated by kzalloc(), right

[PATCH] inet_diag: fix reporting cgroup classid and fallback to priority

2019-02-09 Thread Konstantin Khlebnikov
roup2 id because it is 64-bit (ino+gen). So, after this patch INET_DIAG_CLASS_ID will report socket priority for most common setup when net_cls isn't set and/or cgroup2 in use. Signed-off-by: Konstantin Khlebnikov Fixes: 0888e372c37f ("net: inet: diag: expose sockets cgroup classid"

Re: [PATCH] inet_diag: fix reporting cgroup classid and fallback to priority

2019-02-13 Thread Konstantin Khlebnikov
On 12.02.2019 21:37, David Miller wrote: From: Konstantin Khlebnikov Date: Sat, 09 Feb 2019 13:35:52 +0300 Field idiag_ext in struct inet_diag_req_v2 used as bitmap of requested extensions has only 8 bits. Thus extensions starting from DCTCPINFO cannot be requested directly. Some of them

[PATCH iproute2] ss: add option --tos for requesting ipv4 tos and ipv6 tclass

2019-02-13 Thread Konstantin Khlebnikov
Also show socket class_id/priority used by classful qdisc. The kernel reports this together with tclass since commit ("inet_diag: fix reporting cgroup classid and fallback to priority") Signed-off-by: Konstantin Khlebnikov --- man/man8/ss.8 | 17 + misc/ss.c

Re: [PATCH net-next 0/2] inet_diag: add cgroup attribute and filter

2020-04-30 Thread Konstantin Khlebnikov
On 30/04/2020 22.55, David Miller wrote: From: Dmitry Yakunin Date: Thu, 30 Apr 2020 18:51:13 +0300 This patch series extends inet diag with cgroup v2 ID attribute and filter, which allows investigating sockets on a per-cgroup basis. Patch for ss is already sent to iproute2-next mailing list. Ok

[PATCH] ipv6/addrconf: use netdev_info()/netdev_warn()/netdev_dbg() for logging

2019-09-20 Thread Konstantin Khlebnikov
Print prefix " : " or " : ". Add "IPv6: " into format: netdev_info() does not use macro pr_fmt(). Signed-off-by: Konstantin Khlebnikov --- net/ipv6/addrconf.c | 28 net/ipv6/addrconf_core.c |2 +- 2 files changed, 13 insert

[PATCH] net/core/dev: print rtnl kind as driver name for virtual devices

2019-09-20 Thread Konstantin Khlebnikov
Device kind gives more information than only arbitrary device name. Signed-off-by: Konstantin Khlebnikov --- net/core/dev.c | 15 ++- 1 file changed, 6 insertions(+), 9 deletions(-) diff --git a/net/core/dev.c b/net/core/dev.c index 71b18e80389f..c84561634afd 100644 --- a/net

Re: [PATCH] mm/vmstats: add counters for the page frag cache

2017-09-04 Thread Konstantin Khlebnikov
will happen and would remove pgfrag_alloc_calls and pgfrag_free_calls. Thanks, Kyeongdon Kim On 2017-09-01 6:12 PM, Konstantin Khlebnikov wrote: IMHO that's too many counters. Per-node NR_FRAGMENT_PAGES should be enough for guessing what's going on. Perf probes provide enough features for

[PATCH] iptables: ip6t_MASQUERADE: add dependency on conntrack module

2017-12-11 Thread Konstantin Khlebnikov
After commit 4d3a57f23dec ("netfilter: conntrack: do not enable connection tracking unless needed") conntrack is disabled by default unless some module explicitly declares dependency in particular network namespace. Signed-off-by: Konstantin Khlebnikov Fixes: a357b3f80bc8 ("netf

Re: [PATCH] iptables: ip6t_MASQUERADE: add dependency on conntrack module

2017-12-15 Thread Konstantin Khlebnikov
On 11.12.2017 18:47, Pablo Neira Ayuso wrote: On Mon, Dec 11, 2017 at 06:19:33PM +0300, Konstantin Khlebnikov wrote: After commit 4d3a57f23dec ("netfilter: conntrack: do not enable connection tracking unless needed") conntrack is disabled by default unless some module explicitl

[PATCH] net/sched: reset block pointer in tcf_block_put()

2017-08-10 Thread Konstantin Khlebnikov
In previous API tcf_destroy_chain() could be called several times and some schedulers like hfsc and atm use that. In new API tcf_block_put() frees block but leaves stale pointer, second call will free it once again. This patch fixes such double-frees. Signed-off-by: Konstantin Khlebnikov Fixes
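
The double-free pattern and the "reset the pointer" remedy can be sketched in plain C. This is a minimal user-space model under stated assumptions: `toy_block`, `toy_block_get()` and `toy_block_put()` are hypothetical names, and the real tcf_block_put() signature is not reproduced here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a filter block; not the kernel's structure. */
struct toy_block {
	int dummy;
};

/* Allocate a block; returns NULL on allocation failure. */
static struct toy_block *toy_block_get(void)
{
	return calloc(1, sizeof(struct toy_block));
}

/*
 * Free the block and clear the caller's pointer, so that a second
 * call through the same pointer becomes a harmless no-op instead of
 * a double free.
 */
static void toy_block_put(struct toy_block **slot)
{
	assert(slot != NULL);
	if (*slot == NULL)
		return;
	free(*slot);
	*slot = NULL;
}
```

A destructor that may run twice can then call `toy_block_put(&cl->block)` both times without freeing stale memory; `cl` here is likewise a hypothetical owner structure.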

[PATCH] net/sched/hfsc: allocate tcf block for hfsc root class

2017-08-10 Thread Konstantin Khlebnikov
Without this filters cannot be attached. Signed-off-by: Konstantin Khlebnikov Fixes: 6529eaba33f0 ("net: sched: introduce tcf block infractructure") --- net/sched/sch_hfsc.c |8 1 file changed, 8 insertions(+) diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hf

Re: [PATCH] net/sched: reset block pointer in tcf_block_put()

2017-08-11 Thread Konstantin Khlebnikov
On 11.08.2017 23:18, Cong Wang wrote: On Thu, Aug 10, 2017 at 2:31 AM, Konstantin Khlebnikov wrote: In previous API tcf_destroy_chain() could be called several times and some schedulers like hfsc and atm use that. In new API tcf_block_put() frees block but leaves stale pointer, second call

Re: [PATCH] net/sched: reset block pointer in tcf_block_put()

2017-08-14 Thread Konstantin Khlebnikov
On 12.08.2017 00:38, Cong Wang wrote: On Fri, Aug 11, 2017 at 1:36 PM, Konstantin Khlebnikov wrote: On 11.08.2017 23:18, Cong Wang wrote: On Thu, Aug 10, 2017 at 2:31 AM, Konstantin Khlebnikov wrote: In previous API tcf_destroy_chain() could be called several times and some schedulers

[PATCH] net_sched: reset pointers to tcf blocks in classful qdiscs' destructors

2017-08-15 Thread Konstantin Khlebnikov
ot be called second time. This patch sets class->block to NULL after the first tcf_block_put() and turns the second call into a no-op. Signed-off-by: Konstantin Khlebnikov Fixes: 6529eaba33f0 ("net: sched: introduce tcf block infractructure") --- net/sched/sch_atm.c |4 +++- net/sched

[PATCH 1/2] net_sched: call qlen_notify only if child qdisc is empty

2017-08-15 Thread Konstantin Khlebnikov
destruction of child qdisc where no packets but backlog is not zero. Signed-off-by: Konstantin Khlebnikov --- net/sched/sch_api.c | 10 +- net/sched/sch_cbq.c |3 +-- net/sched/sch_drr.c |3 +-- net/sched/sch_hfsc.c |6 ++ net/sched/sch_htb.c |3 +-- net/sched/sch_qfq.c

[PATCH] net_sched/sfq: update hierarchical backlog when drop packet

2017-08-15 Thread Konstantin Khlebnikov
When sfq_enqueue() drops head packet or packet from another queue it has to update backlog at upper qdiscs too. Signed-off-by: Konstantin Khlebnikov Fixes: 2f5fb43f ("net_sched: update hierarchical backlog too") --- net/sched/sch_sfq.c |5 - 1 file changed, 4 insert

[PATCH] net_sched: remove warning from qdisc_hash_add

2017-08-15 Thread Konstantin Khlebnikov
ful qdisc is added to inactive device because default qdiscs are added before switching root qdisc. Anyway after commit ea3274695353 ("net: sched: avoid duplicates in qdisc dump") duplicates are filtered right in dumper. Signed-off-by: Konstantin Khlebnikov --- net/sched/sch_api.c

[PATCH 2/2] net_sched/hfsc: opencode trivial set_active() and set_passive()

2017-08-15 Thread Konstantin Khlebnikov
And move comment about update_vf() into the right place. Signed-off-by: Konstantin Khlebnikov --- net/sched/sch_hfsc.c | 45 - 1 file changed, 16 insertions(+), 29 deletions(-) diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c index

Re: [PATCH] net/sched: reset block pointer in tcf_block_put()

2017-08-15 Thread Konstantin Khlebnikov
On 15.08.2017 00:15, Cong Wang wrote: On Mon, Aug 14, 2017 at 5:59 AM, Konstantin Khlebnikov wrote: This should work, I suppose. But this approach requires careful review for all qdisc, mine is completely mechanical. Well, we don't have many classful qdisc's. Your patch actual

[PATCH] e1000e: use disable_hardirq() also for MSIX vectors in e1000_netpoll()

2017-05-19 Thread Konstantin Khlebnikov
Replace disable_irq() which waits for threaded irq handlers with disable_hardirq() which waits only for hardirq part. Signed-off-by: Konstantin Khlebnikov Fixes: 311191297125 ("e1000: use disable_hardirq() for e1000_netpoll()") --- drivers/net/ethernet/intel/e1000e/netde

[BUG] division by zero in tcpnv_acked()

2017-10-30 Thread Konstantin Khlebnikov
I've got this on two different machines: [ 24.405015] divide error: [#1] SMP [ 24.405403] Modules linked in: nf_log_ipv6 nf_log_common xt_LOG xt_u32 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_filter ip6_tables xt_tcpudp ipt

[PATCH] tcp_nv: fix division by zero in tcpnv_acked()

2017-11-01 Thread Konstantin Khlebnikov
Average RTT could become zero. This happened in real life at least twice. This patch treats zero as 1us. Signed-off-by: Konstantin Khlebnikov --- net/ipv4/tcp_nv.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/ipv4/tcp_nv.c b/net/ipv4/tcp_nv.c index 1ff73982e28c
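
The guard described here ("treat zero as 1us") amounts to clamping the divisor before use. A minimal sketch, assuming a 32-bit average RTT in microseconds; `toy_rate_per_us()` is a hypothetical helper and not the code from tcp_nv.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: bytes acknowledged per microsecond of average RTT.
 * A zero average RTT is treated as 1 us so the division is always defined.
 */
static uint64_t toy_rate_per_us(uint64_t bytes_acked, uint32_t avg_rtt_us)
{
	uint32_t rtt = avg_rtt_us ? avg_rtt_us : 1;

	assert(rtt != 0);
	return bytes_acked / rtt;
}
```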

[PATCH] tcp_nv: use do_div() instead of expensive div64_u64()

2017-11-02 Thread Konstantin Khlebnikov
Average RTT is 32-bit thus full 64-bit division is redundant. Signed-off-by: Konstantin Khlebnikov Suggested-by: Stephen Hemminger Suggested-by: Eric Dumazet --- net/ipv4/tcp_nv.c |7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/net/ipv4/tcp_nv.c b/net/ipv4

[PATCH RFC] openvswitch: add support for netpoll

2015-04-23 Thread Konstantin Khlebnikov
: Konstantin Khlebnikov --- net/openvswitch/vport-internal_dev.c | 74 ++ net/openvswitch/vport-netdev.c | 63 - net/openvswitch/vport-netdev.h | 15 +++ 3 files changed, 148 insertions(+), 4 deletions(-) diff --git a

Re: [PATCH 2/3] ipvlan: grab rcu_read_lock on xmit path

2015-05-21 Thread Konstantin Khlebnikov
On 20.05.2015 02:33, Mahesh Bandewar wrote: On Thu, May 14, 2015 at 6:56 AM, Konstantin Khlebnikov wrote: ipvlan_start_xmit() is called with rcu_read_lock_bh() while its internal structures require normal rcu_read_lock(). Signed-off-by: Konstantin Khlebnikov --- [ 802.945151

Re: [PATCH 3/3] ipvlan: set dev_id for l2 ports to generate unique IPv6 addresses

2015-05-21 Thread Konstantin Khlebnikov
On 20.05.2015 02:59, Mahesh Bandewar wrote: On Thu, May 14, 2015 at 6:56 AM, Konstantin Khlebnikov wrote: All ipvlan ports use one MAC address, this way ipv6 RA tries to assign one ipv6 address to all of them. This patch assigns unique dev_id to each ipvlan port. This field is used instead of

[PATCH] mac80211: minstrel_ht: fix out-of-bound in minstrel_ht_set_best_prob_rate

2016-01-29 Thread Konstantin Khlebnikov
Patch fixes this splat BUG: KASAN: slab-out-of-bounds in minstrel_ht_update_stats.isra.7+0x6e1/0x9e0 [mac80211] at addr 8800cee640f4 Read of size 4 by task swapper/3/0 Signed-off-by: Konstantin Khlebnikov Link: http://lkml.kernel.org/r/calygninyjhsavne35qs6ucgasb2dx1_i5hcravuox14otz2

IPv4/IPv6 sysctl defaults in new namespace

2016-02-15 Thread Konstantin Khlebnikov
IPv6 initialized with default. That's ok. IPv4 makes a copy from init_net. Looks like a bug, here v2.6.24-2577-g752d14dc6aa9 root@zurg:~# sysctl net.ipv4.conf.all.forwarding=0 net.ipv6.conf.all.forwarding=0 net.ipv4.conf.all.forwarding = 0 net.ipv6.conf.all.forwarding = 0 root@zurg:~# unshare -n s

[PATCH 2/2] net/ipv6/addrconf: fix sysctl table indentation

2016-04-18 Thread Konstantin Khlebnikov
Separated from previous patch for readability. Signed-off-by: Konstantin Khlebnikov --- net/ipv6/addrconf.c | 616 +-- 1 file changed, 307 insertions(+), 309 deletions(-) diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 8a724c7136b0

[PATCH] net/mlx4_en: allocate non 0-order pages for RX ring with __GFP_NOMEMALLOC

2016-04-18 Thread Konstantin Khlebnikov
High order pages are optional here since commit 51151a16a60f ("mlx4: allow order-0 memory allocations in RX path"), so there is no reason for depleting reserves. Generic __netdev_alloc_frag() implements the same logic. Signed-off-by: Konstantin Khlebnikov --- drivers/net/ethernet/mel

[PATCH] net/mlx4_en: do batched put_page using atomic_sub

2016-04-18 Thread Konstantin Khlebnikov
This patch fixes a couple of error paths after allocation failures. Atomic set of page reference counter is safe only if it is zero, otherwise set can race with any speculative get_page_unless_zero. Signed-off-by: Konstantin Khlebnikov --- drivers/net/ethernet/mellanox/mlx4/en_rx.c |8
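
Why a plain "set" of a non-zero reference counter is unsafe while a subtraction is fine can be shown with C11 atomics. This is a sketch under assumptions: `toy_get_unless_zero()` and `toy_put_many()` are hypothetical user-space analogues, not the kernel's page reference helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Speculative reference: take one reference unless the counter is zero.
 * Any thread may call this at any time on a live object.
 */
static bool toy_get_unless_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Drop n references in one atomic step and return the new value.
 * Unlike atomic_store(ref, value), this cannot overwrite a reference
 * that another thread took concurrently via toy_get_unless_zero().
 */
static int toy_put_many(atomic_int *ref, int n)
{
	assert(n > 0);
	return atomic_fetch_sub(ref, n) - n;
}
```

If the owner instead stored a fixed value into a non-zero counter, an increment that landed just before the store would be lost; the subtraction preserves it.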

[PATCH 1/2] net/ipv6/addrconf: simplify sysctl registration

2016-04-18 Thread Konstantin Khlebnikov
options are disabled in config. Signed-off-by: Konstantin Khlebnikov --- include/linux/ipv6.h |3 ++- net/ipv6/addrconf.c | 43 +-- 2 files changed, 19 insertions(+), 27 deletions(-) diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h index

[PATCH] cls_cgroup: get sk_classid only from full sockets

2016-04-18 Thread Konstantin Khlebnikov
skb->sk could point to timewait or request socket which has no sk_classid. Detected as "BUG: KASAN: slab-out-of-bounds in cls_cgroup_classify". Signed-off-by: Konstantin Khlebnikov --- include/net/cls_cgroup.h |7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --

Re: [PATCH] ipv4: in new netns initialize sysctls in net.ipv4.conf.* with defaults

2016-02-21 Thread Konstantin Khlebnikov
akes sense. However, there is a corner case: a module with sysctl can be loaded after creation of namespaces. In this case namespaces will get pre-compiled sysctl defaults, and will not be able to adjust them even if they want to do it. Thank you, Vasily Averin On 21.02.2016 10:11, Konstantin Khlebn

[PATCH] tcp: convert cached rtt from usec to jiffies when feeding initial rto

2016-02-21 Thread Konstantin Khlebnikov
Currently it's converted into msecs, thus HZ=1000 intact. Signed-off-by: Konstantin Khlebnikov Fixes: 740b0f1841f6 ("tcp: switch rtt estimations to usec resolution") --- net/ipv4/tcp_metrics.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/ipv4/tcp_
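
With HZ=1000 one jiffy lasts one millisecond, so a microsecond-to-millisecond conversion happens to give the right tick count; at other tick rates the two differ. The sketch below shows the arithmetic. `toy_usecs_to_ticks()` and `toy_usecs_to_msecs()` are hypothetical helpers, and rounding up is an assumption made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Convert microseconds to timer ticks at tick rate hz, rounding up. */
static uint64_t toy_usecs_to_ticks(uint64_t usecs, uint32_t hz)
{
	assert(hz > 0);
	return (usecs * hz + 999999u) / 1000000u;
}

/* Convert microseconds to milliseconds, rounding up. */
static uint64_t toy_usecs_to_msecs(uint64_t usecs)
{
	return (usecs + 999u) / 1000u;
}
```

For a cached RTT of 200000 us, the millisecond conversion yields 200, which matches the tick count at a rate of 1000 but is four times too large at a rate of 250, where the correct result is 50 ticks.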

[PATCH] ipv4: in new netns initialize sysctls in net.ipv4.conf.* with defaults

2016-02-21 Thread Konstantin Khlebnikov
are enabled. Other sysctls in net.ipv4 and net.ipv6 already initialized with default values at namespace creation. Signed-off-by: Konstantin Khlebnikov Fixes: 752d14dc6aa9 ("[IPV4]: Move the devinet pointers on the struct net") --- net/ipv4/devinet.c |2 +- 1 file changed, 1 inser

Re: [PATCH] ipv4: in new netns initialize sysctls in net.ipv4.conf.* with defaults

2016-02-23 Thread Konstantin Khlebnikov
On Wed, Feb 24, 2016 at 2:21 AM, David Miller wrote: > From: Konstantin Khlebnikov > Date: Sun, 21 Feb 2016 10:11:02 +0300 > >> Currently initial net.ipv4.conf.all.* and net.ipv4.conf.default.* are >> copied from init network namespace because static structures are used

[BUG] skb corruption and kernel panic at forwarding with fragmentation

2016-01-06 Thread Konstantin Khlebnikov
I've got some of these: [84408.314676] BUG: unable to handle kernel NULL pointer dereference at (null) [84408.317324] IP: [] put_page+0x5/0x50 [84408.319985] PGD 0 [84408.322583] Oops: [#1] SMP [84408.325156] Modules linked in: ppp_mppe ppp_async ppp_generic slhc 8021q fuse nfsd aut

Re: [BUG] skb corruption and kernel panic at forwarding with fragmentation

2016-01-06 Thread Konstantin Khlebnikov
On Wed, Jan 6, 2016 at 10:59 PM, Cong Wang wrote: > On Wed, Jan 6, 2016 at 11:15 AM, Konstantin Khlebnikov > wrote: >> Looks like this happens because ip_options_fragment() relies on >> correct ip options length in ip control block in skb. But in >> ip_finish_outpu

Re: [BUG] skb corruption and kernel panic at forwarding with fragmentation

2016-01-07 Thread Konstantin Khlebnikov
On Thu, Jan 7, 2016 at 2:00 PM, Konstantin Khlebnikov wrote: > On Thu, Jan 7, 2016 at 2:49 AM, Florian Westphal wrote: >> Florian Westphal wrote: >>> Thadeu Lima de Souza Cascardo wrote: >>> > On Wed, Jan 06, 2016 at 11:11:41PM +0300, Konstantin Khlebnikov wrote

Re: [PATCH net-next] mlx4: support __GFP_MEMALLOC for rx

2017-01-18 Thread Konstantin Khlebnikov
reserves by flood from network. Note that this driver does not reuse pages (yet) so we do not have to add anything else. Signed-off-by: Eric Dumazet Cc: Konstantin Khlebnikov Cc: Tariq Toukan --- drivers/net/ethernet/mellanox/mlx4/en_rx.c |3 ++- 1 file changed, 2 insertions(+),

Re: [PATCH net-next] mlx4: support __GFP_MEMALLOC for rx

2017-01-18 Thread Konstantin Khlebnikov
On 18.01.2017 17:23, Eric Dumazet wrote: On Wed, 2017-01-18 at 12:31 +0300, Konstantin Khlebnikov wrote: On 18.01.2017 07:14, Eric Dumazet wrote: From: Eric Dumazet Commit 04aeb56a1732 ("net/mlx4_en: allocate non 0-order pages for RX ring with __GFP_NOMEMALLOC") added code that app

[PATCH] net/sched/sch_htb: clamp xstats tokens to fit into 32-bit int

2016-07-16 Thread Konstantin Khlebnikov
hus tool 'tc' prints them as signed. Big values lose higher bits and/or become negative. This patch clamps tokens in xstat into range from INT_MIN to INT_MAX. In this way it's easier to understand what's going on here. Signed-off-by: Konstantin Khlebnikov --- net/sched/sc
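
Clamping before narrowing avoids the wrap-around described above. A minimal sketch, assuming the internal token counters are 64-bit signed values; `toy_clamp_s64()` and `toy_tokens_to_int()` are hypothetical helpers rather than the code in the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Restrict v to the inclusive range [lo, hi]. */
static int64_t toy_clamp_s64(int64_t v, int64_t lo, int64_t hi)
{
	assert(lo <= hi);
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}

/*
 * Narrow a 64-bit token count to int for reporting. Out-of-range
 * values saturate at INT_MIN or INT_MAX instead of losing high bits.
 */
static int toy_tokens_to_int(int64_t tokens)
{
	return (int)toy_clamp_s64(tokens, INT_MIN, INT_MAX);
}
```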

[PATCH] ovs: do not allocate memory from offline numa node

2015-10-02 Thread Konstantin Khlebnikov
patch disables numa affinity in this case. Signed-off-by: Konstantin Khlebnikov --- <4>[ 24.368805] [ cut here ] <2>[ 24.368846] kernel BUG at include/linux/gfp.h:325! <4>[ 24.368868] invalid opcode: [#1] SMP <4>[ 24.368892] Module

[PATCH 3.10.y 2/2] ipv6: update ip6_rt_last_gc every time GC is run

2015-06-10 Thread Konstantin Khlebnikov
From: Michal Kubeček commit 49a18d86f66d33a20144ecb5a34bba0d1856b260 upstream As pointed out by Eric Dumazet, net->ipv6.ip6_rt_last_gc should hold the last time garbage collector was run so that we should update it whenever fib6_run_gc() calls fib6_clean_all(), not only if we got there from ip6_

[PATCH 3.10.y 0/2] ipv6: avoid soft lockups in fib6_run_gc()

2015-06-10 Thread Konstantin Khlebnikov
Two patches from 3.11 which are missing in 3.10.y I've just seen livelock in 3.10.69+ where all cpus are stuck in fib6_run_gc() <4>[2919865.977745] Call Trace: <4>[2919865.977748] <4>[2919865.977754] [] _raw_spin_lock_bh+0x1e/0x30 <4>[2919865.977759] [] fib6_run_gc+0x28/0x100 <4>[2919865.977

[PATCH 3.10.y 1/2] ipv6: prevent fib6_run_gc() contention

2015-06-10 Thread Konstantin Khlebnikov
From: Michal Kubeček commit 2ac3ac8f86f2fe065d746d9a9abaca867adec577 upstream On a high-traffic router with many processors and many IPv6 dst entries, soft lockup in fib6_run_gc() can occur when number of entries reaches gc_thresh. This happens because fib6_run_gc() uses fib6_gc_lock to allow o

Re: netlink & rhashtable status

2015-06-26 Thread Konstantin Khlebnikov
On 14.05.2015 07:21, Herbert Xu wrote: On Thu, May 14, 2015 at 12:16:28PM +0800, Herbert Xu wrote: On Wed, May 13, 2015 at 09:13:38PM -0700, Eric Dumazet wrote: So it looks like we lost an skb or something OK that sounds reasonable. So my plan is to disable dynamic rehashing and then hu

[PATCH v3.17 .. v3.19] lib/rhashtable: fix race between rhashtable_lookup_compare and hashtable resize

2015-06-26 Thread Konstantin Khlebnikov
adds comment for rhashtable_hashfn and rhashtable_obj_hashfn: user must prevent concurrent insert/remove otherwise returned hash value could be invalid. Signed-off-by: Konstantin Khlebnikov Fixes: e341694e3eb5 ("netlink: Convert netlink_lookup() to use RCU protected hash table")

Re: [PATCH v3.17 .. v3.19] lib/rhashtable: fix race between rhashtable_lookup_compare and hashtable resize

2015-06-30 Thread Konstantin Khlebnikov
+CC Sasha Levin FYI: this patch fixes a race in netlink which leads to a hang in glibc function getaddrinfo() because it doesn't handle errors at all. On 26.06.2015 13:48, Konstantin Khlebnikov wrote: Hash value passed as argument into rhashtable_lookup_compare could be computed using diff

[PATCH] net/neighbour: fix crash at dumping device-agnostic proxy entries

2015-11-30 Thread Konstantin Khlebnikov
Proxy entries could have null pointer to net-device. Signed-off-by: Konstantin Khlebnikov Fixes: 84920c1420e2 ("net: Allow ipv6 proxies and arp proxies be shown with iproute2") Cc: # v3.4 --- net/core/neighbour.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff -

[PATCH] ip neigh: device is optional for proxy entries

2015-11-30 Thread Konstantin Khlebnikov
Though dumping such entries crashes present kernels. Signed-off-by: Konstantin Khlebnikov --- ip/ipneigh.c | 13 - 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/ip/ipneigh.c b/ip/ipneigh.c index 54655842ed38..92b7cd6f2a75 100644 --- a/ip/ipneigh.c +++ b/ip

Re: net: Fix skb_set_peeked use-after-free bug

2015-08-05 Thread Konstantin Khlebnikov
ue to use the old freed skb. This patch fixes it by making skb_set_peeked return the new skb (or the old one if unchanged). Fixes: 738ac1ebb96d ("net: Clone skb before setting peeked flag") Reported-by: Brenden Blanco Signed-off-by: Herbert Xu Seems correct. You can add: Reviewed-by

[PATCH v2 5/5] ipvlan: set dev_id for l2 ports to generate unique IPv6 addresses

2015-07-03 Thread Konstantin Khlebnikov
All ipvlan ports use one MAC address, this way ipv6 RA tries to assign one ipv6 address to all of them. This patch assigns unique dev_id to each ipvlan port. This field is used instead of common FF-FE in Modified EUI-64. Signed-off-by: Konstantin Khlebnikov --- Documentation/networking
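
How a per-port dev_id can make the interface identifiers differ even when the MAC address is shared can be sketched as follows. `toy_ifid_from_mac()` is a hypothetical helper written for this illustration; in particular, leaving the universal/local bit untouched in the dev_id case is an assumption, not something stated in the message.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Build an 8-byte interface identifier from a 6-byte MAC address.
 * With dev_id == 0 the usual FF-FE filler is inserted and the
 * universal/local bit is flipped (Modified EUI-64). With a non-zero
 * dev_id, its two bytes take the place of FF-FE (assumed behaviour).
 */
static void toy_ifid_from_mac(uint8_t eui[8], const uint8_t mac[6],
			      uint16_t dev_id)
{
	assert(eui != NULL && mac != NULL);
	memcpy(eui, mac, 3);
	memcpy(eui + 5, mac + 3, 3);
	if (dev_id) {
		eui[3] = (uint8_t)(dev_id >> 8);
		eui[4] = (uint8_t)(dev_id & 0xff);
	} else {
		eui[3] = 0xff;
		eui[4] = 0xfe;
		eui[0] ^= 0x02;
	}
}
```

Two ports with the same MAC but dev_id 1 and 2 then get identifiers that differ in the middle bytes, so autoconfiguration no longer produces the same address for both.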

[PATCH v2 2/5] ipvlan: plug memory leak in ipvlan_link_delete

2015-07-03 Thread Konstantin Khlebnikov
Add missing kfree_rcu(addr, rcu); Signed-off-by: Konstantin Khlebnikov --- drivers/net/ipvlan/ipvlan_main.c |1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c index 62577b3f01f2..4c3a0ac85381 100644 --- a/drivers/net/ipvlan

[PATCH v2 0/5] ipvlan: fix ipv6 autoconfiguration

2015-07-03 Thread Konstantin Khlebnikov
with this: https://patchwork.ozlabs.org/patch/471481/ * new fix for trivial memory leak and patch which removes address counters --- Konstantin Khlebnikov (5): ipvlan: remove counters of ipv4 and ipv6 addresses ipvlan: plug memory leak in ipvlan_link_delete ipvlan: unhash

[PATCH v2 4/5] ipvlan: protect addresses with internal spinlock

2015-07-03 Thread Konstantin Khlebnikov
off-by: Konstantin Khlebnikov --- drivers/net/ipvlan/ipvlan.h | 11 +++ drivers/net/ipvlan/ipvlan_core.c |2 -- drivers/net/ipvlan/ipvlan_main.c | 33 ++--- 3 files changed, 41 insertions(+), 5 deletions(-) diff --git a/drivers/net/ipvlan/ipvlan.

[PATCH v2 1/5] ipvlan: remove counters of ipv4 and ipv6 addresses

2015-07-03 Thread Konstantin Khlebnikov
They are unused after commit f631c44bbe15 ("ipvlan: Always set broadcast bit in multicast filter"). Signed-off-by: Konstantin Khlebnikov --- drivers/net/ipvlan/ipvlan.h |2 - drivers/net/ipvlan/ipvlan_main.c | 65 +++--- 2 files changed, 26

[PATCH v2 3/5] ipvlan: unhash addresses without synchronize_rcu

2015-07-03 Thread Konstantin Khlebnikov
All structures used in traffic forwarding are rcu-protected: ipvl_addr, ipvl_dev and ipvl_port. Thus we can unhash addresses without synchronization. We'll hash it back into the same bucket anyway; in the worst case a lockless lookup will scan the hash once again. Signed-off-by: Konstantin Khleb

[PATCH] netlink: enable skb header refcounting before sending first broadcast

2015-07-10 Thread Konstantin Khlebnikov
ees it twice. Race leads to double-free in kmalloc-xxx. Signed-off-by: Konstantin Khlebnikov Fixes: b19372273164 ("net: reorganize sk_buff for faster __copy_skb_header()") --- net/netlink/af_netlink.c |6 ++ 1 file changed, 6 insertions(+) diff --git a/net/netlink/af_netlink

[PATCH] netlink: reset skb->peeked when reuse orphan skb for next broadcast

2015-07-10 Thread Konstantin Khlebnikov
This patch clears skb->peeked set by previous recipient of broadcast. Signed-off-by: Konstantin Khlebnikov Fixes: add05ad4e9f5 ("unix/dgram: peek beyond 0-sized skbs") --- net/netlink/af_netlink.c |1 + 1 file changed, 1 insertion(+) diff --git a/net/netlink/af_netlink.c

Re: [PATCH v2 4/5] ipvlan: protect addresses with internal spinlock

2015-07-10 Thread Konstantin Khlebnikov
On 08.07.2015 07:05, Mahesh Bandewar wrote: On Fri, Jul 3, 2015 at 5:58 AM, Konstantin Khlebnikov wrote: Inet6addr notifier is atomic and runs in bh context without RTNL when ipv6 receives router advertisement packet and performs autoconfiguration. This patch adds ipvl_port->addr_lock

Re: [PATCH] netlink: reset skb->peeked when reuse orphan skb for next broadcast

2015-07-10 Thread Konstantin Khlebnikov
On 10.07.2015 14:51, Konstantin Khlebnikov wrote: This patch clears skb->peeked set by previous recipient of broadcast. Signed-off-by: Konstantin Khlebnikov Fixes: add05ad4e9f5 ("unix/dgram: peek beyond 0-sized skbs") --- net/netlink/af_netlink.c |1 + 1 file changed

[PATCH v2] netlink: reset skb->peeked when reuse orphan skb for next broadcast

2015-07-10 Thread Konstantin Khlebnikov
This patch clears skb->peeked set by previous recipient of broadcast. Signed-off-by: Konstantin Khlebnikov Fixes: add05ad4e9f5 ("unix/dgram: peek beyond 0-sized skbs") --- net/netlink/af_netlink.c |1 + 1 file changed, 1 insertion(+) diff --git a/net/netlink/af_netlink.c

Re: [PATCH] netlink: enable skb header refcounting before sending first broadcast

2015-07-10 Thread Konstantin Khlebnikov
On 10.07.2015 16:49, Eric Dumazet wrote: On Fri, 2015-07-10 at 14:51 +0300, Konstantin Khlebnikov wrote: This fixes a race between non-atomic updates of adjacent bit-fields: skb->cloned could be lost because netlink broadcast clones skb after sending it to the first listener who sets skb->

Re: [PATCH] netlink: enable skb header refcounting before sending first broadcast

2015-07-13 Thread Konstantin Khlebnikov
On 13.07.2015 10:23, Herbert Xu wrote: On Fri, Jul 10, 2015 at 02:51:41PM +0300, Konstantin Khlebnikov wrote: This fixes a race between non-atomic updates of adjacent bit-fields: skb->cloned could be lost because netlink broadcast clones skb after sending it to the first listener who sets
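
The lost-update pattern behind this kind of race can be replayed deterministically in userspace. The sketch below is a model only: `MODEL_CLONED` stands in for the skb->cloned bit, `MODEL_OTHER` is a hypothetical neighbouring flag in the same byte, and the two "CPUs" are simulated by hand-ordered loads and stores rather than real threads.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_CLONED 0x01u	/* stands in for the skb->cloned bit */
#define MODEL_OTHER  0x02u	/* hypothetical neighbouring flag, same byte */

/*
 * Replay two read-modify-write updates of one shared byte.
 * If interleaved is non-zero, both sides load the byte before either
 * stores, so the second store overwrites the bit set by the first.
 */
static inline uint8_t model_update_flags(uint8_t flags, int interleaved)
{
	uint8_t cpu_a, cpu_b;

	/* Precondition of the model: only the two modelled bits are in use. */
	assert((flags & ~(MODEL_CLONED | MODEL_OTHER)) == 0);

	if (interleaved) {
		cpu_a = flags;					/* A loads */
		cpu_b = flags;					/* B loads the same old value */
		flags = (uint8_t)(cpu_a | MODEL_CLONED);	/* A stores cloned */
		flags = (uint8_t)(cpu_b | MODEL_OTHER);		/* B stores; cloned is lost */
	} else {
		cpu_a = flags;
		flags = (uint8_t)(cpu_a | MODEL_CLONED);
		cpu_b = flags;					/* B sees A's store */
		flags = (uint8_t)(cpu_b | MODEL_OTHER);
	}
	return flags;
}
```

With the serialized order both bits survive; with the interleaved order only `MODEL_OTHER` remains, which is the lost write of the cloned bit that the quoted description refers to.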

[PATCH] net: stop endless flood about dst entry refcount underflow or overflow

2015-07-14 Thread Konstantin Khlebnikov
ready fixed in upstream. Anyway, a flood of these warnings completely kills the machine and makes further debugging impossible. Signed-off-by: Konstantin Khlebnikov --- net/core/dst.c |3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/core/dst.c b/net/core/dst.c index e956ce6

Re: [PATCH] net: stop endless flood about dst entry refcount underflow or overflow

2015-07-14 Thread Konstantin Khlebnikov
On 14.07.2015 15:04, Eric Dumazet wrote: On Tue, 2015-07-14 at 14:43 +0300, Konstantin Khlebnikov wrote: The kernel generates a lot of warnings when the dst entry reference counter overflows and becomes negative. This patch prints the address of the dst entry, its refcount and then resets reference counter to

[PATCH v3 5/5] ipvlan: ignore addresses from ipv6 autoconfiguration

2015-07-14 Thread Konstantin Khlebnikov
l.org/r/20150703125840.24121.91556.stgit@buzz Signed-off-by: Konstantin Khlebnikov --- drivers/net/ipvlan/ipvlan_main.c |4 1 file changed, 4 insertions(+) diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c index e995bc501ee6..20b58bdecf75 100644 --- a/dr

[PATCH v3 4/5] ipvlan: use rcu_deference_bh() in ipvlan_queue_xmit()

2015-07-14 Thread Konstantin Khlebnikov
/0xe8 [] SyS_write+0x47/0x7e [] system_call_fastpath+0x12/0x6f Fixes: 2ad7bf363841 ("ipvlan: Initial check-in of the IPVLAN driver.") Cc: Mahesh Bandewar Signed-off-by: Cong Wang Acked-by: Mahesh Bandewar Acked-by: Konstantin Khlebnikov --- drivers/net/ipvlan/ipvlan.h

[PATCH v3 0/5] ipvlan: cleanups and fixes

2015-07-14 Thread Konstantin Khlebnikov
v1: http://comments.gmane.org/gmane.linux.network/363346 v2: http://comments.gmane.org/gmane.linux.network/369086 v3 has a reduced set of patches from "ipvlan: fix ipv6 autoconfiguration". It contains just cleanups and a patch which ignores ipv6 notifications from RA. --- Konstantin Khl
