On 06.09.2018 08:24, Saeed Mahameed wrote:
On Sun, Sep 2, 2018 at 2:55 AM, Konstantin Khlebnikov
wrote:
On 02.09.2018 12:29, Tariq Toukan wrote:
On 31/08/2018 2:29 PM, Konstantin Khlebnikov wrote:
XOR (MLX5_RX_HASH_FN_INVERTED_XOR8) gives only 8 bits.
It seems not enough for RFS. All
On 02.09.2018 12:29, Tariq Toukan wrote:
On 31/08/2018 2:29 PM, Konstantin Khlebnikov wrote:
XOR (MLX5_RX_HASH_FN_INVERTED_XOR8) gives only 8 bits.
It seems not enough for RFS. All other drivers use toeplitz.
Driver mlx4_en uses Toeplitz by default and warns if hash XOR is used
together
g can limit RPS functionality".
XOR is default in mlx5_en since commit 2be6967cdbc9
("net/mlx5e: Support ETH_RSS_HASH_XOR").
Hash function could be set via ethtool. But it would be nice to have
single standard for drivers or proper description why this one is special.
Signed-off-by: K
On 10.07.2018 01:31, Saeed Mahameed wrote:
On Tue, Jul 3, 2018 at 10:45 PM, Konstantin Khlebnikov
wrote:
I'm seeing problems with tunnelled traffic with Mellanox Technologies
MT27710 Family [ConnectX-4 Lx] using vanilla driver from linux 4.4.y
Packets with payload bigger than 116 bytes
I'm seeing problems with tunnelled traffic with Mellanox Technologies MT27710
Family [ConnectX-4 Lx] using vanilla driver from linux 4.4.y
Packets with payload bigger than 116 bytes are not transmitted.
Smaller packets and normal ipv6 works fine.
In linux 4.9, 4.14 and out-of-tree driver
On 15.06.2018 16:13, Eric Dumazet wrote:
On 06/15/2018 03:27 AM, Konstantin Khlebnikov wrote:
When blackhole is used on top of classful qdisc like hfsc it breaks
qlen and backlog counters because packets disappear without notice.
In HFSC non-zero qlen while all classes are inactive
]
and schedules watchdog work endlessly.
This patch returns __NET_XMIT_BYPASS in addition to NET_XMIT_SUCCESS,
this flag tells upper layer: this packet is gone and isn't queued.
Signed-off-by: Konstantin Khlebnikov
---
net/sched/sch_blackhole.c |2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff
On 25.04.2018 17:16, Rafał Miłecki wrote:
On 23.04.2018 15:08, Rafał Miłecki wrote:
I've just updated my kernel 4.4.x and noticed a regression. Bisecting
pointed me to the commit 2417da3f4d6bc ("ipv6: only call
ip6_route_dev_notify() once for NETDEV_UNREGISTER") [0] which is
backport of
on the
latest - but I can give it a try.
Regards
Mathias
On Sat, 23 Dec 2017, 17:36 Konstantin Khlebnikov, <khlebni...@yandex-team.ru
<mailto:khlebni...@yandex-team.ru>> wrote:
On 23.12.2017 16:52, Greg KH wrote:
> adding stable@ and netdev@
>
> On Sat, Dec 23, 201
, please try debug patch from attachment.
It logs all refcount changes for loopback in non-host net namespace.
Hopefully the log will be tiny and show what is missing.
Looks like vsftpd creates and destroys empty net-ns, like "unshare -n true"
net: debug lo refcnt
From: Konstantin Khlebniko
On 11.12.2017 18:47, Pablo Neira Ayuso wrote:
On Mon, Dec 11, 2017 at 06:19:33PM +0300, Konstantin Khlebnikov wrote:
After commit 4d3a57f23dec ("netfilter: conntrack: do not enable connection
tracking unless needed") conntrack is disabled by default unless some
module explicitl
After commit 4d3a57f23dec ("netfilter: conntrack: do not enable connection
tracking unless needed") conntrack is disabled by default unless some
module explicitly declares dependency in particular network namespace.
Signed-off-by: Konstantin Khlebnikov <khlebni...@yandex-t
Average RTT is 32-bit thus full 64-bit division is redundant.
Signed-off-by: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
Suggested-by: Stephen Hemminger <step...@networkplumber.org>
Suggested-by: Eric Dumazet <eric.duma...@gmail.com>
---
net/ipv4/tcp_nv.c |7 ---
Average RTT could become zero. This happened in real life at least twice.
This patch treats zero as 1us.
Signed-off-by: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
---
net/ipv4/tcp_nv.c |2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/tcp_nv.c b/ne
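The zero-RTT guard described in this patch can be illustrated with a small userspace sketch. The diff itself is cut off above, so this is not the actual tcp_nv.c change: the helper name rate_bytes_per_us() and its parameters are hypothetical, and only the idea of treating a zero average RTT as 1us comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the real tcp_nv.c code: derive a rate in
 * bytes per microsecond from an averaged RTT that may have decayed to 0.
 */
static inline uint64_t rate_bytes_per_us(uint64_t bytes_acked,
					 uint32_t avg_rtt_us)
{
	/* Treat a zero average RTT as 1us so the division cannot trap. */
	if (avg_rtt_us == 0)
		avg_rtt_us = 1;

	return bytes_acked / avg_rtt_us;
}
```

Clamping the divisor keeps the result finite without needing a separate error path in the caller.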
I've got this on two different machines:
[ 24.405015] divide error: [#1] SMP
[ 24.405403] Modules linked in: nf_log_ipv6 nf_log_common xt_LOG xt_u32 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip6table_nat
nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_filter ip6_tables xt_tcpudp
If real-time or fair-share curves are enabled in hfsc_change_class()
class isn't inserted into rb-trees yet. Thus init_ed() and init_vf()
must be called in place of update_ed() and update_vf().
Removal isn't required because for now curves cannot be disabled.
Signed-off-by: Konstantin Khlebnikov
SKB stored in qdisc->gso_skb also counted into backlog.
Some qdiscs don't reset backlog to zero in ->reset(),
for example sfq just dequeues and frees all queued skbs.
Signed-off-by: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
Fixes: 2f5fb43f ("net_sched: update hierarc
pen
and would remove pgfrag_alloc_calls and pgfrag_free_calls.
Thanks,
Kyeongdon Kim
On 2017-09-01 오후 6:12, Konstantin Khlebnikov wrote:
IMHO that's too many counters.
Per-node NR_FRAGMENT_PAGES should be enough for guessing what's going on.
Perf probes provide enough features for further debugging.
roblem in HFSC - now operation peek could fail and
deactivate parent class.
Signed-off-by: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=109581
---
net/sched/sch_codel.c| 14 ++
net/sched/sch_fq_co
When hhf_enqueue() drops a packet from another bucket, it
has to update the backlog at upper qdiscs too.
Signed-off-by: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
Fixes: 2f5fb43f ("net_sched: update hierarchical backlog too")
---
net/sched/sch_hhf.c |5 -
1
:37, Konstantin Khlebnikov wrote:
It is important to call qdisc_tree_reduce_backlog() after changing queue
length. Parent qdisc should deactivate class in ->qlen_notify() called from
qdisc_tree_reduce_backlog() but this happens only if qdisc->q.qlen is zero.
Missed class deactivations leads t
ackets
from empty qdisc and corrupting state at reactivating this class in future.
Signed-off-by: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
Fixes: 86a7996cc8a0 ("net_sched: introduce qdisc_replace() helper")
Cc: Stable <sta...@vger.kernel.org>
---
include/net/sch_ge
On 16.08.2017 20:22, Cong Wang wrote:
On Tue, Aug 15, 2017 at 6:39 AM, Konstantin Khlebnikov
<khlebni...@yandex-team.ru> wrote:
This callback is used for deactivating class in parent qdisc.
It is cheaper to test queue length right here.
Also this allows to catch draining screwed b
On 15.08.2017 17:09, Eric Dumazet wrote:
On Tue, 2017-08-15 at 16:37 +0300, Konstantin Khlebnikov wrote:
When sfq_enqueue() drops the head packet or a packet from another queue, it
has to update the backlog at upper qdiscs too.
Signed-off-by: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
On 15.08.2017 00:15, Cong Wang wrote:
On Mon, Aug 14, 2017 at 5:59 AM, Konstantin Khlebnikov
<khlebni...@yandex-team.ru> wrote:
This should work, I suppose.
But this approach requires careful review for all qdisc, mine is completely
mechanical.
Well, we don't have many classful q
And move comment about update_vf() into the right place.
Signed-off-by: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
---
net/sched/sch_hfsc.c | 45 -
1 file changed, 16 insertions(+), 29 deletions(-)
diff --git a/net/sched/sch_hfsc.c b/net
ful qdisc is added to inactive device because
default qdiscs are added before switching root qdisc.
Anyway after commit ea3274695353 ("net: sched: avoid duplicates in
qdisc dump") duplicates are filtered right in dumper.
Signed-off-by: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
When sfq_enqueue() drops the head packet or a packet from another queue, it
has to update the backlog at upper qdiscs too.
Signed-off-by: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
Fixes: 2f5fb43f ("net_sched: update hierarchical backlog too")
---
net/sched/sch_sfq.c |
at destruction
of child qdisc where no packets but backlog is not zero.
Signed-off-by: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
---
net/sched/sch_api.c | 10 +-
net/sched/sch_cbq.c |3 +--
net/sched/sch_drr.c |3 +--
net/sched/sch_hfsc.c |6 ++
net/sched/sch
be called second time.
This patch sets class->block to NULL after the first tcf_block_put() and
turns the second call into a no-op.
Signed-off-by: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
Fixes: 6529eaba33f0 ("net: sched: introduce tcf block infractructure")
---
net/sched/sch_atm.
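The "clear the pointer after the first put" idea can be sketched in plain userspace C. The structures and functions below (struct block, struct sched_class, block_put(), class_release_block()) are hypothetical stand-ins rather than the kernel's tcf_block API, and the sketch assumes that the put helper tolerates a NULL argument, as the commit message implies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures; names are illustrative. */
struct block {
	int dummy;
};

struct sched_class {
	struct block *block;
};

/* Counts real frees so the effect of a repeated call is observable. */
static int blocks_freed;

/* Release a block; a NULL pointer makes the call a no-op. */
static inline void block_put(struct block *b)
{
	if (!b)
		return;
	free(b);
	blocks_freed++;
}

/* Drop the class's block and clear the stale pointer afterwards. */
static inline void class_release_block(struct sched_class *cl)
{
	block_put(cl->block);
	cl->block = NULL;
}
```

With the pointer cleared, calling class_release_block() twice on the same class frees the block only once instead of double-freeing it.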
On 12.08.2017 00:38, Cong Wang wrote:
On Fri, Aug 11, 2017 at 1:36 PM, Konstantin Khlebnikov
<khlebni...@yandex-team.ru> wrote:
On 11.08.2017 23:18, Cong Wang wrote:
On Thu, Aug 10, 2017 at 2:31 AM, Konstantin Khlebnikov
<khlebni...@yandex-team.ru> wrote:
In
On 11.08.2017 23:18, Cong Wang wrote:
On Thu, Aug 10, 2017 at 2:31 AM, Konstantin Khlebnikov
<khlebni...@yandex-team.ru> wrote:
In previous API tcf_destroy_chain() could be called several times and
some schedulers like hfsc and atm use that. In new API tcf_block_put()
frees block but
Without this filters cannot be attached.
Signed-off-by: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
Fixes: 6529eaba33f0 ("net: sched: introduce tcf block infractructure")
---
net/sched/sch_hfsc.c |8
1 file changed, 8 insertions(+)
diff --git a/net/sched/
In previous API tcf_destroy_chain() could be called several times and
some schedulers like hfsc and atm use that. In new API tcf_block_put()
frees block but leaves stale pointer, second call will free it once again.
This patch fixes such double-frees.
Signed-off-by: Konstantin Khlebnikov
Replace disable_irq() which waits for threaded irq handlers with
disable_hardirq() which waits only for hardirq part.
Signed-off-by: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
Fixes: 311191297125 ("e1000: use disable_hardirq() for e1000_netpoll()")
---
drivers/net/ether
On 18.01.2017 17:23, Eric Dumazet wrote:
On Wed, 2017-01-18 at 12:31 +0300, Konstantin Khlebnikov wrote:
On 18.01.2017 07:14, Eric Dumazet wrote:
From: Eric Dumazet <eduma...@google.com>
Commit 04aeb56a1732 ("net/mlx4_en: allocate non 0-order pages for RX
ring with __GFP_NOMEMAL
s a straight way to depleting all
reserves by flood from network.
Note that this driver does not reuse pages (yet) so we do not have to
add anything else.
Signed-off-by: Eric Dumazet <eduma...@google.com>
Cc: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
Cc: Tariq Toukan <tar...
hus tool 'tc'
prints them as signed. Big values lose higher bits and/or become negative.
This patch clamps tokens in xstat into range from INT_MIN to INT_MAX.
In this way it's easier to understand what's going on here.
Signed-off-by: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
---
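The clamping described above might look roughly like the following sketch. The start of the message is cut off, so the width of the internal counter is an assumption: it is taken to be a signed 64-bit value, and the helper name clamp_tokens() is hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical helper: squeeze a wide token counter (assumed to be a
 * signed 64-bit value) into the int range before it is reported.
 */
static inline int clamp_tokens(int64_t tokens)
{
	if (tokens > INT_MAX)
		return INT_MAX;
	if (tokens < INT_MIN)
		return INT_MIN;
	return (int)tokens;
}
```

A saturated value of INT_MAX or INT_MIN is easier to interpret in the output than a truncated number whose sign has flipped.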
skb->sk could point to timewait or request socket which has no sk_classid.
Detected as "BUG: KASAN: slab-out-of-bounds in cls_cgroup_classify".
Signed-off-by: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
---
include/net/cls_cgroup.h |7 +--
1 file changed,
options are disabled in config.
Signed-off-by: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
---
include/linux/ipv6.h |3 ++-
net/ipv6/addrconf.c | 43 +--
2 files changed, 19 insertions(+), 27 deletions(-)
diff --git a/include/linux/ipv6.h b/i
High order pages are optional here since commit 51151a16a60f ("mlx4: allow
order-0 memory allocations in RX path"), so there is no reason for depleting
reserves. Generic __netdev_alloc_frag() implements the same logic.
Signed-off-by: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
This patch fixes a couple of error paths after allocation failures.
Atomic set of page reference counter is safe only if it is zero,
otherwise set can race with any speculative get_page_unless_zero.
Signed-off-by: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
---
drivers/net/ethernet/me
Separated from previous patch for readability.
Signed-off-by: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
---
net/ipv6/addrconf.c | 616 +--
1 file changed, 307 insertions(+), 309 deletions(-)
diff --git a/net/ipv6/addrconf.c b/ne
On Wed, Feb 24, 2016 at 2:21 AM, David Miller <da...@davemloft.net> wrote:
> From: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
> Date: Sun, 21 Feb 2016 10:11:02 +0300
>
>> Currently initial net.ipv4.conf.all.* and net.ipv4.conf.default.* are
>> copied fr
are enabled. Other sysctls in net.ipv4 and net.ipv6
already initialized with default values at namespace creation.
Signed-off-by: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
Fixes: 752d14dc6aa9 ("[IPV4]: Move the devinet pointers on the struct net")
---
net/ipv4/devinet.c |
Currently it's converted into msecs, thus HZ=1000 intact.
Signed-off-by: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
Fixes: 740b0f1841f6 ("tcp: switch rtt estimations to usec resolution")
---
net/ipv4/tcp_metrics.c |2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
.
However, there is corner case:
module with sysctl can be loaded after creation of namespaces.
In this case namespaces will get pre-compiled sysctl defaults,
and are not able to adjust them even if they want to do it.
Thank you,
Vasily Averin
On 21.02.2016 10:11, Konstantin Khlebnikov wrote
IPv6 initialized with default. That's ok.
IPv4 makes a copy from init_net. Looks like a bug, here
v2.6.24-2577-g752d14dc6aa9
root@zurg:~# sysctl net.ipv4.conf.all.forwarding=0
net.ipv6.conf.all.forwarding=0
net.ipv4.conf.all.forwarding = 0
net.ipv6.conf.all.forwarding = 0
root@zurg:~# unshare -n
Patch fixes this splat
BUG: KASAN: slab-out-of-bounds in minstrel_ht_update_stats.isra.7+0x6e1/0x9e0
[mac80211] at addr 8800cee640f4 Read of size 4 by task swapper/3/0
Signed-off-by: Konstantin Khlebnikov <koc...@gmail.com>
Link:
http://lkml.kernel
On Thu, Jan 7, 2016 at 2:00 PM, Konstantin Khlebnikov <koc...@gmail.com> wrote:
> On Thu, Jan 7, 2016 at 2:49 AM, Florian Westphal <f...@strlen.de> wrote:
>> Florian Westphal <f...@strlen.de> wrote:
>>> Thadeu Lima de Souza Cascardo <casca...@redhat.com>
I've got some of these:
[84408.314676] BUG: unable to handle kernel NULL pointer dereference
at (null)
[84408.317324] IP: [] put_page+0x5/0x50
[84408.319985] PGD 0
[84408.322583] Oops: [#1] SMP
[84408.325156] Modules linked in: ppp_mppe ppp_async ppp_generic slhc
8021q fuse nfsd
On Wed, Jan 6, 2016 at 10:59 PM, Cong Wang <xiyou.wangc...@gmail.com> wrote:
> On Wed, Jan 6, 2016 at 11:15 AM, Konstantin Khlebnikov <koc...@gmail.com>
> wrote:
>> Looks like this happens because ip_options_fragment() relies on
>> correct ip options len
Though dumping such entries crashes present kernels.
Signed-off-by: Konstantin Khlebnikov <koc...@gmail.com>
---
ip/ipneigh.c | 13 -
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/ip/ipneigh.c b/ip/ipneigh.c
index 54655842ed38..92b7cd6f2a75 100644
--- a/ip/ipn
Proxy entries could have null pointer to net-device.
Signed-off-by: Konstantin Khlebnikov <koc...@gmail.com>
Fixes: 84920c1420e2 ("net: Allow ipv6 proxies and arp proxies be shown with
iproute2")
Cc: <sta...@vger.kernel.org> # v3.4
---
net/core/neighbour.c |4 +
patch disables numa affinity in this case.
Signed-off-by: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
---
<4>[ 24.368805] [ cut here ]
<2>[ 24.368846] kernel BUG at include/linux/gfp.h:325!
<4>[ 24.368868] invalid opcode: [#1]
...@gondor.apana.org.au
Seems correct. You can add:
Reviewed-by: Konstantin Khlebnikov khlebni...@yandex-team.ru
Your skb_set_peeked() doesn't set prev/next to NULL when
it unlinks the old skb from the queue, unlike __skb_unlink().
It isn't a big deal, but nulling might be useful.
diff --git a/net/core/datagram.c b
] vfs_write+0xb8/0x190
[ 270.730236] [811fe8c2] SyS_write+0x52/0xb0
[ 270.730239] [817b6bae] entry_SYSCALL_64_fastpath+0x12/0x76
Signed-off-by: Konstantin Khlebnikov khlebni...@yandex-team.ru
---
net/core/netclassid_cgroup.c |3 ++-
1 file changed, 2 insertions(+), 1 deletion
On 22.07.2015 14:56, Daniel Borkmann wrote:
On 07/22/2015 11:23 AM, Konstantin Khlebnikov wrote:
In dev_queue_xmit() net_cls protected with rcu-bh.
...
Signed-off-by: Konstantin Khlebnikov khlebni...@yandex-team.ru
---
net/core/netclassid_cgroup.c |3 ++-
1 file changed, 2 insertions
.
Signed-off-by: Konstantin Khlebnikov khlebni...@yandex-team.ru
---
net/core/dst.c |4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/core/dst.c b/net/core/dst.c
index e956ce6d1378..002144bea935 100644
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -284,7 +284,9 @@ void
fixed in upstream. Anyway, a flood of these warnings
completely kills machine and makes further debugging impossible.
Signed-off-by: Konstantin Khlebnikov khlebni...@yandex-team.ru
---
net/core/dst.c |3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/core/dst.c b/net/core
On 14.07.2015 15:04, Eric Dumazet wrote:
On Tue, 2015-07-14 at 14:43 +0300, Konstantin Khlebnikov wrote:
Kernel generates a lot of warnings when dst entry reference counter
overflows and becomes negative. This patch prints address of dst entry,
its refcount and then resets reference counter
/r/20150703125840.24121.91556.stgit@buzz
Signed-off-by: Konstantin Khlebnikov khlebni...@yandex-team.ru
---
drivers/net/ipvlan/ipvlan_main.c |4
1 file changed, 4 insertions(+)
diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
index e995bc501ee6
xiyou.wangc...@gmail.com
Acked-by: Mahesh Bandewar mahe...@google.com
Acked-by: Konstantin Khlebnikov khlebni...@yandex-team.ru
---
drivers/net/ipvlan/ipvlan.h |5 +
drivers/net/ipvlan/ipvlan_core.c |2 +-
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ipvlan
v1: http://comments.gmane.org/gmane.linux.network/363346
v2: http://comments.gmane.org/gmane.linux.network/369086
v3 has reduced set of patches from ipvlan: fix ipv6 autoconfiguration.
Here just cleanups and patch which ignores ipv6 notifications from RA.
---
Konstantin Khlebnikov (4
Add missing kfree_rcu(addr, rcu);
Signed-off-by: Konstantin Khlebnikov khlebni...@yandex-team.ru
---
drivers/net/ipvlan/ipvlan_main.c |1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
index 048ecf0c76fb..7d81e37c3f76 100644
They are unused after commit f631c44bbe15 (ipvlan: Always set broadcast bit in
multicast filter).
Signed-off-by: Konstantin Khlebnikov khlebni...@yandex-team.ru
---
drivers/net/ipvlan/ipvlan.h |2 --
drivers/net/ipvlan/ipvlan_main.c | 33 -
2 files
All structures used in traffic forwarding are rcu-protected:
ipvl_addr, ipvl_dev and ipvl_port. Thus we can unhash addresses
without synchronization. We'll anyway hash it back into the same
bucket: in worst case lockless lookup will scan hash once again.
Signed-off-by: Konstantin Khlebnikov
On 13.07.2015 10:23, Herbert Xu wrote:
On Fri, Jul 10, 2015 at 02:51:41PM +0300, Konstantin Khlebnikov wrote:
This fixes race between non-atomic updates of adjacent bit-fields:
skb-cloned could be lost because netlink broadcast clones skb after
sending it to the first listener who sets skb
it twice. Race leads to double-free in kmalloc-xxx.
Signed-off-by: Konstantin Khlebnikov khlebni...@yandex-team.ru
Fixes: b19372273164 (net: reorganize sk_buff for faster __copy_skb_header())
---
net/netlink/af_netlink.c |6 ++
1 file changed, 6 insertions(+)
diff --git a/net/netlink
This patch clears skb-peeked set by previous recipient of broadcast.
Signed-off-by: Konstantin Khlebnikov khlebni...@yandex-team.ru
Fixes: add05ad4e9f5 (unix/dgram: peek beyond 0-sized skbs)
---
net/netlink/af_netlink.c |1 +
1 file changed, 1 insertion(+)
diff --git a/net/netlink
On 10.07.2015 14:51, Konstantin Khlebnikov wrote:
This patch clears skb-peeked set by previous recipient of broadcast.
Signed-off-by: Konstantin Khlebnikov khlebni...@yandex-team.ru
Fixes: add05ad4e9f5 (unix/dgram: peek beyond 0-sized skbs)
---
net/netlink/af_netlink.c |1 +
1 file
On 08.07.2015 07:05, Mahesh Bandewar wrote:
On Fri, Jul 3, 2015 at 5:58 AM, Konstantin Khlebnikov
khlebni...@yandex-team.ru wrote:
Inet6addr notifier is atomic and runs in bh context without RTNL when
ipv6 receives router advertisement packet and performs autoconfiguration.
This patch adds
On 10.07.2015 16:49, Eric Dumazet wrote:
On Fri, 2015-07-10 at 14:51 +0300, Konstantin Khlebnikov wrote:
This fixes race between non-atomic updates of adjacent bit-fields:
skb-cloned could be lost because netlink broadcast clones skb after
sending it to the first listener who sets skb-peeked
They are unused after commit f631c44bbe15 (ipvlan: Always set broadcast bit in
multicast filter).
Signed-off-by: Konstantin Khlebnikov khlebni...@yandex-team.ru
---
drivers/net/ipvlan/ipvlan.h |2 -
drivers/net/ipvlan/ipvlan_main.c | 65 +++---
2 files
All structures used in traffic forwarding are rcu-protected:
ipvl_addr, ipvl_dev and ipvl_port. Thus we can unhash addresses
without synchronization. We'll anyway hash it back into the same
bucket, in worst case lockless lookup will scan hash once again.
Signed-off-by: Konstantin Khlebnikov
-by: Konstantin Khlebnikov khlebni...@yandex-team.ru
---
drivers/net/ipvlan/ipvlan.h | 11 +++
drivers/net/ipvlan/ipvlan_core.c |2 --
drivers/net/ipvlan/ipvlan_main.c | 33 ++---
3 files changed, 41 insertions(+), 5 deletions(-)
diff --git a/drivers
with this: https://patchwork.ozlabs.org/patch/471481/
* new fix for trivial memory leak and patch which removes address counters
---
Konstantin Khlebnikov (5):
ipvlan: remove counters of ipv4 and ipv6 addresses
ipvlan: plug memory leak in ipvlan_link_delete
ipvlan: unhash
All ipvlan ports use one MAC address, this way ipv6 RA tries to assign
one ipv6 address to all of them. This patch assigns unique dev_id to each
ipvlan port. This field is used instead of common FF-FE in Modified EUI-64.
Signed-off-by: Konstantin Khlebnikov khlebni...@yandex-team.ru
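How a per-port dev_id can replace the common FF-FE filler in the interface identifier is sketched below. This is a userspace illustration, not the ipvlan patch: the function name make_ifid() is hypothetical, and the exact handling of the universal/local bit when dev_id is set is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Build an 8-byte interface identifier from a 6-byte MAC address.
 * dev_id is a per-port number; 0 means "not set" and selects the usual
 * Modified EUI-64 layout with FF-FE in the middle.
 */
static inline void make_ifid(uint8_t eui[8], const uint8_t mac[6],
			     uint16_t dev_id)
{
	memcpy(eui, mac, 3);		/* first three MAC bytes */
	memcpy(eui + 5, mac + 3, 3);	/* last three MAC bytes */

	if (dev_id) {
		/* Assumption: dev_id fills the two middle bytes as-is. */
		eui[3] = (dev_id >> 8) & 0xFF;
		eui[4] = dev_id & 0xFF;
	} else {
		eui[3] = 0xFF;
		eui[4] = 0xFE;
		eui[0] ^= 2;		/* flip the universal/local bit */
	}
}
```

Two ports that share one MAC address but carry different dev_id values then end up with different identifiers, so autoconfiguration no longer derives the same address for all of them.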
Add missing kfree_rcu(addr, rcu);
Signed-off-by: Konstantin Khlebnikov khlebni...@yandex-team.ru
---
drivers/net/ipvlan/ipvlan_main.c |1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
index 62577b3f01f2..4c3a0ac85381 100644
+CC Sasha Levin
FYI: this patch fixes a race in netlink which leads to a hang in the glibc
function getaddrinfo() because it doesn't handle errors at all.
On 26.06.2015 13:48, Konstantin Khlebnikov wrote:
Hash value passed as argument into rhashtable_lookup_compare could be
computed using different
On 14.05.2015 07:21, Herbert Xu wrote:
On Thu, May 14, 2015 at 12:16:28PM +0800, Herbert Xu wrote:
On Wed, May 13, 2015 at 09:13:38PM -0700, Eric Dumazet wrote:
So it looks like we lost an skb or something
OK that sounds reasonable. So my plan is to disable dynamic
rehashing and then
it adds comment for rhashtable_hashfn and rhashtable_obj_hashfn:
user must prevent concurrent insert/remove otherwise returned hash value
could be invalid.
Signed-off-by: Konstantin Khlebnikov khlebni...@yandex-team.ru
Fixes: e341694e3eb5 (netlink: Convert netlink_lookup() to use RCU protected
hash
From: Michal Kubeček mkube...@suse.cz
commit 2ac3ac8f86f2fe065d746d9a9abaca867adec577 upstream
On a high-traffic router with many processors and many IPv6 dst
entries, soft lockup in fib6_run_gc() can occur when number of
entries reaches gc_thresh.
This happens because fib6_run_gc() uses
From: Michal Kubeček mkube...@suse.cz
commit 49a18d86f66d33a20144ecb5a34bba0d1856b260 upstream
As pointed out by Eric Dumazet, net-ipv6.ip6_rt_last_gc should
hold the last time garbage collector was run so that we should
update it whenever fib6_run_gc() calls fib6_clean_all(), not only
if we got
Two patches from 3.11 which are missing in 3.10.y
I've just seen livelock in 3.10.69+ where all cpus are stuck in fib6_run_gc()
4[2919865.977745] Call Trace:
4[2919865.977748] IRQ
4[2919865.977754] [8163b87e] _raw_spin_lock_bh+0x1e/0x30
4[2919865.977759] [815e4018]
On 20.05.2015 02:59, Mahesh Bandewar wrote:
On Thu, May 14, 2015 at 6:56 AM, Konstantin Khlebnikov
khlebni...@yandex-team.ru wrote:
All ipvlan ports use one MAC address, this way ipv6 RA tries to assign
one ipv6 address to all of them. This patch assigns unique dev_id to each
ipvlan port
On 20.05.2015 02:33, Mahesh Bandewar wrote:
On Thu, May 14, 2015 at 6:56 AM, Konstantin Khlebnikov
khlebni...@yandex-team.ru wrote:
ipvlan_start_xmit() is called with rcu_read_lock_bh() while its internal
structures require normal rcu_read_lock().
Signed-off-by: Konstantin Khlebnikov khlebni
-by: Konstantin Khlebnikov khlebni...@yandex-team.ru
---
net/openvswitch/vport-internal_dev.c | 74 ++
net/openvswitch/vport-netdev.c | 63 -
net/openvswitch/vport-netdev.h | 15 +++
3 files changed, 148 insertions(+), 4