Re: 4.19.4 nf_conntrack_count kernel panic

2018-11-26 Thread Denys Fedoryshchenko
On 2018-11-26 21:46, Sami Farin wrote: 4.18.20 works OK, but unfortunately 4.18 series is EOL. I have Ryzen 1600X, 32 GB RAM, Fedora 28, gcc-8.2.1-5, nosmt=force, igb module for Intel I211, using XFS filesystems only. To reproduce, I only do this: connect to VPN using a tunnel (e.g. tun0), sta

4.15.13 kernel panic, ip_rcv_finish, nf_xfrm_me_harder warnings continue to fill dmesg

2018-04-11 Thread Denys Fedoryshchenko
Apr 11 18:01:34[99194.935520] general protection fault: [#1] SMP Apr 11 18:01:34[99194.935998] Modules linked in: pppoe pppox ppp_generic slhc ip_set_hash_net xt_nat xt_string xt_connmark xt_TCPMSS xt_mark xt_CT xt_set xt_tcpudp ip_set_bitmap_port ip_set nfnetlink iptable_raw iptable_filter

Re: ppp/pppoe, still panic 4.15.3 in ppp_push

2018-03-03 Thread Denys Fedoryshchenko
On 2018-03-02 19:43, Guillaume Nault wrote: On Thu, Mar 01, 2018 at 10:07:05PM +0200, Denys Fedoryshchenko wrote: On 2018-03-01 22:01, Guillaume Nault wrote: > diff --git a/drivers/net/ppp/ppp_generic.c > b/drivers/net/ppp/ppp_generic.c > index 255a5def56e9..2acf4b0eabd1 100644 > -

Re: ppp/pppoe, still panic 4.15.3 in ppp_push

2018-03-01 Thread Denys Fedoryshchenko
On 2018-03-01 22:01, Guillaume Nault wrote: On Tue, Feb 27, 2018 at 07:56:27PM +0100, Guillaume Nault wrote: On Tue, Feb 27, 2018 at 12:58:55PM +0200, Denys Fedoryshchenko wrote: > On 2018-02-23 12:07, Guillaume Nault wrote: > > On Fri, Feb 23, 2018 at 11:41:43AM +0200, Denys Fedor

Re: ppp/pppoe, still panic 4.15.3 in ppp_push

2018-02-27 Thread Denys Fedoryshchenko
On 2018-02-23 12:07, Guillaume Nault wrote: On Fri, Feb 23, 2018 at 11:41:43AM +0200, Denys Fedoryshchenko wrote: On 2018-02-23 11:38, Guillaume Nault wrote: > On Thu, Feb 22, 2018 at 08:51:19PM +0200, Denys Fedoryshchenko wrote: > > I'm using accel-ppp that has unit-cache optio

Re: ppp/pppoe, still panic 4.15.3 in ppp_push

2018-02-24 Thread Denys Fedoryshchenko
On 2018-02-23 12:07, Guillaume Nault wrote: On Fri, Feb 23, 2018 at 11:41:43AM +0200, Denys Fedoryshchenko wrote: On 2018-02-23 11:38, Guillaume Nault wrote: > On Thu, Feb 22, 2018 at 08:51:19PM +0200, Denys Fedoryshchenko wrote: > > I'm using accel-ppp that has unit-cache optio

Re: ppp/pppoe, still panic 4.15.3 in ppp_push

2018-02-23 Thread Denys Fedoryshchenko
On 2018-02-23 12:07, Guillaume Nault wrote: On Fri, Feb 23, 2018 at 11:41:43AM +0200, Denys Fedoryshchenko wrote: On 2018-02-23 11:38, Guillaume Nault wrote: > On Thu, Feb 22, 2018 at 08:51:19PM +0200, Denys Fedoryshchenko wrote: > > I'm using accel-ppp that has unit-cache optio

Re: ppp/pppoe, still panic 4.15.3 in ppp_push

2018-02-23 Thread Denys Fedoryshchenko
On 2018-02-23 11:38, Guillaume Nault wrote: On Thu, Feb 22, 2018 at 08:51:19PM +0200, Denys Fedoryshchenko wrote: I'm using accel-ppp that has unit-cache option, i guess for "reusing" ppp interfaces (because creating a lot of interfaces on BRAS with 8k users quite expensiv

Re: ppp/pppoe, still panic 4.15.3 in ppp_push

2018-02-22 Thread Denys Fedoryshchenko
On 2018-02-22 20:30, Guillaume Nault wrote: On Wed, Feb 21, 2018 at 12:04:30PM -0800, Cong Wang wrote: On Thu, Feb 15, 2018 at 11:31 AM, Guillaume Nault wrote: > On Thu, Feb 15, 2018 at 06:01:16PM +0200, Denys Fedoryshchenko wrote: >> On 2018-02-15 17:55, Guillaume Nault wrote:

Re: ppp/pppoe, still panic 4.15.3 in ppp_push

2018-02-21 Thread Denys Fedoryshchenko
On 2018-02-21 20:55, Guillaume Nault wrote: On Wed, Feb 21, 2018 at 12:26:51PM +0200, Denys Fedoryshchenko wrote: It seems even rebuilding seemingly stable version triggering crashes too (but different ones) Different ones? The trace following your message looks very similar to your first

Re: ppp/pppoe, still panic 4.15.3 in ppp_push

2018-02-21 Thread Denys Fedoryshchenko
It seems that even rebuilding a seemingly stable version triggers crashes too (but different ones). Maybe it is a coincidence, and the bug reproducer appeared in the network at the same time I decided to upgrade the kernel, as happened with xt_MSS (and that bug had existed for years). Deleted quoting, I added more debug op

Re: ppp/pppoe, still panic 4.15.3 in ppp_push

2018-02-20 Thread Denys Fedoryshchenko
On 2018-02-16 20:48, Guillaume Nault wrote: On Fri, Feb 16, 2018 at 01:13:18PM +0200, Denys Fedoryshchenko wrote: On 2018-02-15 21:42, Guillaume Nault wrote: > On Thu, Feb 15, 2018 at 09:34:42PM +0200, Denys Fedoryshchenko wrote: > > On 2018-02-15 21:31, Guillaume Nault wrote: > >

a lot of WARNING, nf_xfrm_me_harder in 4.15.x

2018-02-18 Thread Denys Fedoryshchenko
Is there a real bug behind this, or is it just some sort of log spam? I ask because I am troubleshooting a "hard to catch" bug in ppp/pppoe at the same time. Workload: pppoe BRAS. I am going to try the latest stable 4.14.x in 1-2 days as well, but I think I noticed this message appearing there too, under some condition

Re: ppp/pppoe, still panic 4.15.3 in ppp_push

2018-02-18 Thread Denys Fedoryshchenko
On 2018-02-16 20:48, Guillaume Nault wrote: On Fri, Feb 16, 2018 at 01:13:18PM +0200, Denys Fedoryshchenko wrote: On 2018-02-15 21:42, Guillaume Nault wrote: > On Thu, Feb 15, 2018 at 09:34:42PM +0200, Denys Fedoryshchenko wrote: > > On 2018-02-15 21:31, Guillaume Nault wrote: > >

Re: ppp/pppoe, still panic 4.15.3 in ppp_push

2018-02-16 Thread Denys Fedoryshchenko
On 2018-02-15 21:42, Guillaume Nault wrote: On Thu, Feb 15, 2018 at 09:34:42PM +0200, Denys Fedoryshchenko wrote: On 2018-02-15 21:31, Guillaume Nault wrote: > On Thu, Feb 15, 2018 at 06:01:16PM +0200, Denys Fedoryshchenko wrote: > > On 2018-02-15 17:55, Guillaume Nault wrote: > >

Re: ppp/pppoe, still panic 4.15.3 in ppp_push

2018-02-15 Thread Denys Fedoryshchenko
On 2018-02-15 21:31, Guillaume Nault wrote: On Thu, Feb 15, 2018 at 06:01:16PM +0200, Denys Fedoryshchenko wrote: On 2018-02-15 17:55, Guillaume Nault wrote: > On Thu, Feb 15, 2018 at 12:19:52PM +0200, Denys Fedoryshchenko wrote: > > Here we go: > > >

Re: ppp/pppoe, still panic 4.15.3 in ppp_push

2018-02-15 Thread Denys Fedoryshchenko
On 2018-02-15 17:55, Guillaume Nault wrote: On Thu, Feb 15, 2018 at 12:19:52PM +0200, Denys Fedoryshchenko wrote: Here we go: [24558.921549] == [24558.922167] BUG: KASAN: use-after-free in ppp_ioctl+0xa6a/0x1522 [ppp_generic

Re: ppp/pppoe, still panic 4.15.3 in ppp_push

2018-02-15 Thread Denys Fedoryshchenko
On 2018-02-14 19:25, Guillaume Nault wrote: On Wed, Feb 14, 2018 at 06:49:19PM +0200, Denys Fedoryshchenko wrote: On 2018-02-14 18:47, Guillaume Nault wrote: > On Wed, Feb 14, 2018 at 06:29:34PM +0200, Denys Fedoryshchenko wrote: > > On 2018-02-14 18:07, Guillaume Nault wrote: > >

Re: ppp/pppoe, still panic 4.15.3 in ppp_push

2018-02-14 Thread Denys Fedoryshchenko
On 2018-02-14 18:47, Guillaume Nault wrote: On Wed, Feb 14, 2018 at 06:29:34PM +0200, Denys Fedoryshchenko wrote: On 2018-02-14 18:07, Guillaume Nault wrote: > On Wed, Feb 14, 2018 at 03:17:23PM +0200, Denys Fedoryshchenko wrote: > > Hi, > > > > Upgraded kernel to 4.15.3, s

Re: ppp/pppoe, still panic 4.15.3 in ppp_push

2018-02-14 Thread Denys Fedoryshchenko
On 2018-02-14 18:07, Guillaume Nault wrote: On Wed, Feb 14, 2018 at 03:17:23PM +0200, Denys Fedoryshchenko wrote: Hi, Upgraded kernel to 4.15.3, still it crashes after while (several hours, cannot do bisect, as it is production server). dev ppp # gdb ppp_generic.o GNU gdb (Gentoo 7.12.1

ppp/pppoe, still panic 4.15.3 in ppp_push

2018-02-14 Thread Denys Fedoryshchenko
Hi, Upgraded kernel to 4.15.3, still it crashes after while (several hours, cannot do bisect, as it is production server). dev ppp # gdb ppp_generic.o GNU gdb (Gentoo 7.12.1 vanilla) 7.12.1 <> Reading symbols from ppp_generic.o...done. (gdb) list *ppp_push+0x73 0x681 is in ppp_push (drivers/ne

4.15.2 kernel panic, nat, ppp bug?

2018-02-12 Thread Denys Fedoryshchenko
Hello, Got this and then server rebooted with panic (second message). Workload: pppoe BRAS, lots of shapers, ppp interfaces. Please let me know if I need to provide more information. Feb 12 06:00:58 [13750.606169] WARNING: CPU: 6 PID: 0 at ./include/net/dst.h:256 nf_xfrm_me_harder+0x52/0xd9 [nf

Re: e1000e hardware unit hangs

2018-01-24 Thread Denys Fedoryshchenko
On 2018-01-24 20:31, Ben Greear wrote: On 01/24/2018 08:34 AM, Neftin, Sasha wrote: On 1/24/2018 18:11, Alexander Duyck wrote: On Tue, Jan 23, 2018 at 3:46 PM, Ben Greear wrote: Hello, Anyone have any more suggestions for making e1000e work better? This is from a 4.9.65+ kernel, with thes

Re: Fw: [Bug 197099] New: Kernel panic in interrupt [l2tp_ppp]

2017-10-07 Thread Denys Fedoryshchenko
On 2017-10-07 15:09, SviMik wrote: Unfortunately, netconsole has managed to send a kernel panic trace only once, and it's not related to this bug. Looks like something crashes really hard to make netconsole unusable. In some cases i had luck with pstore, when netconsole failed me (especially ne

Question about "prevent dst uses after free" and WARNING in nf_xfrm_me_harder / refcnt / 4.13.3

2017-10-02 Thread Denys Fedoryshchenko
Hi, I'm running now 4.13.3, is this patch required for 4.13 as well? (it doesnt apply cleanly, as in 4.13 tcp_prequeue use skb_dst_force_safe, so i just renamed it there to skb_dst_force ) This is what i get on PPPoE BRAS on this kernel, patch applied (no idea if its related to patch, but just

Re: [PATCH] bgmac: Remove all offloading features, including GRO.

2017-09-15 Thread Denys Fedoryshchenko
On 2017-09-16 03:18, Eric Dumazet wrote: On Fri, 2017-09-15 at 17:10 -0700, ros...@gmail.com wrote: Ok fair enough. Will only disable GRO in the driver. Well, do not even try. NETIF_F_SOFT_FEATURES is set by core networking stack in register_netdevice(), ( commit 212b573f5552c60265da721ff9ce3

Re: HTB going crazy over ~5Gbit/s (4.12.9, but problem present in older kernels as well)

2017-09-13 Thread Denys Fedoryshchenko
On 2017-09-13 20:20, Eric Dumazet wrote: On Wed, 2017-09-13 at 20:12 +0300, Denys Fedoryshchenko wrote: For my case, as load increased now, i am hitting same issue (i tried to play with quantum / bursts as well, didnt helped): tc -s -d class show dev eth3.777 classid 1:111;sleep 5;tc -s -d

Re: HTB going crazy over ~5Gbit/s (4.12.9, but problem present in older kernels as well)

2017-09-13 Thread Denys Fedoryshchenko
On 2017-09-13 19:55, Eric Dumazet wrote: On Wed, 2017-09-13 at 09:42 -0700, Eric Dumazet wrote: On Wed, 2017-09-13 at 19:27 +0300, Denys Fedoryshchenko wrote: > On 2017-09-13 19:16, Eric Dumazet wrote: > > On Wed, 2017-09-13 at 18:34 +0300, Denys Fedoryshchenko wrote: > >> W

Re: HTB going crazy over ~5Gbit/s (4.12.9, but problem present in older kernels as well)

2017-09-13 Thread Denys Fedoryshchenko
On 2017-09-13 19:16, Eric Dumazet wrote: On Wed, 2017-09-13 at 18:34 +0300, Denys Fedoryshchenko wrote: Well, probably i am answering my own question, removing estimator from classes seems drastically improve situation. It seems estimator has some issues that cause shaper to behave incorrectly

Re: HTB going crazy over ~5Gbit/s (4.12.9, but problem present in older kernels as well)

2017-09-13 Thread Denys Fedoryshchenko
On 2017-09-13 18:51, Eric Dumazet wrote: On Wed, 2017-09-13 at 18:20 +0300, Denys Fedoryshchenko wrote: Hi, I noticed after increasing bandwidth over some amount HTB started to throttle classes it should not throttle. Also estimated rate in htb totally wrong, while byte counters is correct

Re: HTB going crazy over ~5Gbit/s (4.12.9, but problem present in older kernels as well)

2017-09-13 Thread Denys Fedoryshchenko
bottleneck by CPU load measurements. On 2017-09-13 18:20, Denys Fedoryshchenko wrote: Hi, I noticed after increasing bandwidth over some amount HTB started to throttle classes it should not throttle. Also estimated rate in htb totally wrong, while byte counters is correct. Is there any overflow

HTB going crazy over ~5Gbit/s (4.12.9, but problem present in older kernels as well)

2017-09-13 Thread Denys Fedoryshchenko
Hi, I noticed after increasing bandwidth over some amount HTB started to throttle classes it should not throttle. Also estimated rate in htb totally wrong, while byte counters is correct. Is there any overflow or something? X520 card (but XL710 same) br1 8000.90e2ba86c38c n

Re: ipset losing entries on its own

2017-09-06 Thread Denys Fedoryshchenko
On 2017-09-06 13:08, Akshat Kakkar wrote: I am having ipset 6.32 The hash type is hash:ip I am adding/deleting IP addresses to it dynamically using scripts. However, it has been observed that at times few IPs (3-4 out of 4000) are not found in the set though it was added. Also, logs show there

Re: nf_nat_pptp 4.12.3 kernel lockup/reboot

2017-08-25 Thread Denys Fedoryshchenko
On 2017-08-25 08:21, Florian Westphal wrote: Denys Fedoryshchenko wrote: >>> I am trying to upgrade kernel 4.11.8 to 4.12.3 (it is a nat/router, handling >>> approx 2gbps of pppoe users traffic) and noticed that after while server >>> rebooting(i have set reboot on

Re: nf_nat_pptp 4.12.3 kernel lockup/reboot

2017-08-24 Thread Denys Fedoryshchenko
On 2017-07-24 19:20, Florian Westphal wrote: Florian Westphal wrote: Denys Fedoryshchenko wrote: > Hi, > > I am trying to upgrade kernel 4.11.8 to 4.12.3 (it is a nat/router, handling > approx 2gbps of pppoe users traffic) and noticed that after while server > rebooting(i hav

Re: nf_nat_pptp 4.12.3 kernel lockup/reboot

2017-07-26 Thread Denys Fedoryshchenko
On 2017-07-24 19:20, Florian Westphal wrote: Florian Westphal wrote: Denys Fedoryshchenko wrote: > Hi, > > I am trying to upgrade kernel 4.11.8 to 4.12.3 (it is a nat/router, handling > approx 2gbps of pppoe users traffic) and noticed that after while server > rebooting(i hav

Re: nf_nat_pptp 4.12.3 kernel lockup/reboot

2017-07-25 Thread Denys Fedoryshchenko
On 2017-07-24 19:20, Florian Westphal wrote: Florian Westphal wrote: Denys Fedoryshchenko wrote: > Hi, > > I am trying to upgrade kernel 4.11.8 to 4.12.3 (it is a nat/router, handling > approx 2gbps of pppoe users traffic) and noticed that after while server > rebooting(i hav

nf_nat_pptp 4.12.3 kernel lockup/reboot

2017-07-24 Thread Denys Fedoryshchenko
Hi, I am trying to upgrade kernel 4.11.8 to 4.12.3 (it is a nat/router, handling approx 2gbps of pppoe users traffic) and noticed that after while server rebooting(i have set reboot on panic and etc). I can't run serial console, and in pstore / netconsole there is nothing. Best i got is some v

Re: [PATCH net] netfilter: xt_TCPMSS: add more sanity tests on tcph->doff

2017-04-20 Thread Denys Fedoryshchenko
On 2017-04-08 23:24, Pablo Neira Ayuso wrote: On Mon, Apr 03, 2017 at 10:55:11AM -0700, Eric Dumazet wrote: From: Eric Dumazet Denys provided an awesome KASAN report pointing to an use after free in xt_TCPMSS I have provided three patches to fix this issue, either in xt_TCPMSS or in xt_tcpu

Re: KASAN, xt_TCPMSS finally found nasty use-after-free bug? 4.10.8

2017-04-03 Thread Denys Fedoryshchenko
On 2017-04-03 15:09, Eric Dumazet wrote: On Mon, 2017-04-03 at 11:10 +0300, Denys Fedoryshchenko wrote: I modified patch a little as: if (th->doff * 4 < sizeof(_tcph)) { par->hotdrop = true; WARN_ON_ONCE(!tcpinfo->option); return false; } And it did triggered WARN once at

Re: KASAN, xt_TCPMSS finally found nasty use-after-free bug? 4.10.8

2017-04-03 Thread Denys Fedoryshchenko
On 2017-04-02 20:26, Eric Dumazet wrote: On Sun, 2017-04-02 at 10:14 -0700, Eric Dumazet wrote: Could that be that netfilter does not abort earlier if TCP header is completely wrong ? Yes, I wonder if this patch would be better, unless we replicate the th->doff sanity check in all netfilter

Re: KASAN, xt_TCPMSS finally found nasty use-after-free bug? 4.10.8

2017-04-02 Thread Denys Fedoryshchenko
On 2017-04-02 15:32, Eric Dumazet wrote: On Sun, 2017-04-02 at 15:25 +0300, Denys Fedoryshchenko wrote: > */ I will add also WARN_ON_ONCE(tcp_hdrlen >= 15 * 4) before, for curiosity, if this condition are triggered. Is it fine like that? Sure. It didnt triggered WARN_ON, and wit

Re: KASAN, xt_TCPMSS finally found nasty use-after-free bug? 4.10.8

2017-04-02 Thread Denys Fedoryshchenko
On 2017-04-02 15:19, Eric Dumazet wrote: On Sun, 2017-04-02 at 04:54 -0700, Eric Dumazet wrote: On Sun, 2017-04-02 at 13:45 +0200, Florian Westphal wrote: > Eric Dumazet wrote: > > - for (i = sizeof(struct tcphdr); i <= tcp_hdrlen - TCPOLEN_MSS; i += optlen(opt, i)) { > > + for (i = si

Re: KASAN, xt_TCPMSS finally found nasty use-after-free bug? 4.10.8

2017-04-02 Thread Denys Fedoryshchenko
On 2017-04-02 14:45, Florian Westphal wrote: Eric Dumazet wrote: - for (i = sizeof(struct tcphdr); i <= tcp_hdrlen - TCPOLEN_MSS; i += optlen(opt, i)) { + for (i = sizeof(struct tcphdr); i < tcp_hdrlen - TCPOLEN_MSS; i += optlen(opt, i)) { if (opt[i] == TCPOPT_MSS && opt[i+1]

KASAN, xt_TCPMSS finally found nasty use-after-free bug? 4.10.8

2017-04-02 Thread Denys Fedoryshchenko
Repost; being sleepy, I missed a few important points. I am searching for the cause of crashes on multiple conntrack-enabled servers. The traces usually point to conntrack, but I suspect the use-after-free might be somewhere else, so I enabled KASAN. It seems I got something after a few hours, and it l

finally found nasty use-after-free bug? 4.10.8

2017-04-02 Thread Denys Fedoryshchenko
I am searching for the cause of crashes on multiple NAT servers, and tried enabling KASAN. It seems I got something, and it looks quite possibly related to all the crashes, because I have MSS on all those servers. [25181.855611] == [25181.

Re: probably serious conntrack/netfilter panic, 4.8.14, timers and intel turbo

2017-03-31 Thread Denys Fedoryshchenko
I am not sure if it is the same issue, but panics still happen, though much less often. Same server, NAT. I will upgrade to the latest 4.10.x build, because for this one I don't have the files anymore (for symbols etc.). [864288.511464] Modules linked in: nf_conntrack_netlink nf_nat_pptp nf_nat_proto_gre xt_TCP

__nf_conntrack_find_get - NMI watchdog, 4.10.5

2017-03-25 Thread Denys Fedoryshchenko
Hi, While applying/removing shapers on few thousands of ppp interfaces got pppoe server rebooted with this message: [51306.144984] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:0] [51306.145319] Modules linked in: sch_sfq cls_fw act_police cls_u32 sch_ingress sch_htb pppoe p

4.9.4 panic, nf_conntrack_tuple_taken

2017-02-12 Thread Denys Fedoryshchenko
Hi, Seems I'm quite "lucky" and am hitting another bug. This time it is a different server; I believe I've seen this bug on a few pppoe servers, but here it happens once every 1-2 days. An out-of-tree patch is applied, to optimize gc heuristics. I don't exclude (but very small chance) hardware issu

Re: 4.9 conntrack performance issues

2017-01-30 Thread Denys Fedoryshchenko
On 2017-01-30 13:26, Guillaume Nault wrote: On Sun, Jan 15, 2017 at 01:05:58AM +0200, Denys Fedoryshchenko wrote: Hi! Sorry if i added someone wrongly to CC, please let me know, if i should remove. I just run successfully 4.9 on my nat several days ago, and seems panic issue disappeared

Re: 4.9 conntrack performance issues

2017-01-14 Thread Denys Fedoryshchenko
On 2017-01-15 02:29, Florian Westphal wrote: Denys Fedoryshchenko wrote: On 2017-01-15 01:53, Florian Westphal wrote: >Denys Fedoryshchenko wrote: > >I suspect you might also have to change > >1011 } else if (expired_count) { >1012 gc_work->nex

Re: 4.9 conntrack performance issues

2017-01-14 Thread Denys Fedoryshchenko
On 2017-01-15 01:53, Florian Westphal wrote: Denys Fedoryshchenko wrote: [ CC Nicolas since he also played with gc heuristics in the past ] Sorry if i added someone wrongly to CC, please let me know, if i should remove. I just run successfully 4.9 on my nat several days ago, and seems

4.9 conntrack performance issues

2017-01-14 Thread Denys Fedoryshchenko
Hi! Sorry if i added someone wrongly to CC, please let me know, if i should remove. I just run successfully 4.9 on my nat several days ago, and seems panic issue disappeared. But i started to face another issue, it seems garbage collector is hogging one of CPU's. Here is my data: 2xE5-2640 v

Re: probably serious conntrack/netfilter panic, 4.8.14, timers and intel turbo

2017-01-11 Thread Denys Fedoryshchenko
On 2017-01-11 19:22, Guillaume Nault wrote: Cc: netfilter-de...@vger.kernel.org, I'm afraid I'll need some help for this case. On Sat, Dec 17, 2016 at 09:48:13PM +0200, Denys Fedoryshchenko wrote: Hi, I posted recently several netfilter related crashes, didn't got any answer

Re: 4.9.2 panic, __skb_flow_dissect, gro?

2017-01-10 Thread Denys Fedoryshchenko
Yes, it is in the list (ixgbe) On 2017-01-11 02:16, Ian Kumlien wrote: Added David Miller to CC since he said it was queued for stable, maybe he can comment On Wed, Jan 11, 2017 at 12:49 AM, Denys Fedoryshchenko wrote: It seems this patch solve issue. I hope it will go to stable asap

Re: 4.9.2 panic, __skb_flow_dissect, gro?

2017-01-10 Thread Denys Fedoryshchenko
skb_header_pointer. On 2017-01-11 01:26, Denys Fedoryshchenko wrote: Hi, Got panic message on 4.9.2 with latest patches from stable-queue, probably it affects all 4.9 version Panic message: dmesg-erst-6374119981415661569:<6>[ 23.110324] ip_set: protocol 6 dmesg-erst-6374119981415661569:<1>[ 28

4.9.2 panic, __skb_flow_dissect, gro?

2017-01-10 Thread Denys Fedoryshchenko
Hi, Got panic message on 4.9.2 with latest patches from stable-queue, probably it affects all 4.9 version Panic message: dmesg-erst-6374119981415661569:<6>[ 23.110324] ip_set: protocol 6 dmesg-erst-6374119981415661569:<1>[ 28.117455] BUG: unable to handle kernel NULL pointer dereference

probably serious conntrack/netfilter panic, 4.8.14, timers and intel turbo

2016-12-17 Thread Denys Fedoryshchenko
Hi, I posted recently several netfilter related crashes, didn't got any answers, one of them started to happen quite often on loaded NAT (17Gbps), so after trying endless ways to make it stable, i found out that in backtrace i can often see timers, and this bug probably appearing on older rel

Kernel panic in netfilter 4.8.10 probably on conntrack -L

2016-12-05 Thread Denys Fedoryshchenko
Hi! I have a heavily loaded NAT server (approx 17Gbps of traffic) where a periodic "conntrack -L" might trigger a kernel panic about once per day. I am not sure whether it is triggered by running the tool itself, or just by enabling events. Here is the panic message: [221287.380762] general protection fault:

Re: SNAT --random & fully is not actually random for ips

2016-11-28 Thread Denys Fedoryshchenko
On 2016-11-28 13:29, Pablo Neira Ayuso wrote: On Mon, Nov 28, 2016 at 01:12:07PM +0200, Denys Fedoryshchenko wrote: On 2016-11-28 13:06, Pablo Neira Ayuso wrote: >Why does your patch reverts NF_NAT_RANGE_PROTO_RANDOM_FULLY? Ops, sorry i just did mistake with files, actually it is in reve

Re: SNAT --random & fully is not actually random for ips

2016-11-28 Thread Denys Fedoryshchenko
On 2016-11-28 13:06, Pablo Neira Ayuso wrote: On Mon, Nov 28, 2016 at 12:45:59PM +0200, Denys Fedoryshchenko wrote: Hello, I noticed that if i specify -j SNAT with options --random --random-fully still it keeps persistence for source IP. So you specify both? Actually truly random src ip

SNAT --random & fully is not actually random for ips

2016-11-28 Thread Denys Fedoryshchenko
Hello, I noticed that if i specify -j SNAT with options --random --random-fully still it keeps persistence for source IP. Actually truly random src ip required in some scenarios like links balanced by IPs, but seems since 2012 at least it is not possible. But actually if i do something like:

Re: kernel panic TPROXY , vanilla 4.7.1

2016-08-17 Thread Denys Fedoryshchenko
On 2016-08-17 19:04, Eric Dumazet wrote: On Wed, 2016-08-17 at 08:42 -0700, Eric Dumazet wrote: On Wed, 2016-08-17 at 17:31 +0300, Denys Fedoryshchenko wrote: > Hi! > > Tried to run squid on latest kernel, and hit a panic > Sometimes it just shows warning in dmesg (but doesnt w

kernel panic TPROXY , vanilla 4.7.1

2016-08-17 Thread Denys Fedoryshchenko
Hi! Tried to run squid on latest kernel, and hit a panic Sometimes it just shows warning in dmesg (but doesnt work properly) [ 75.701666] IPv4: Attempt to release TCP socket in state 10 88102d430780 [ 83.866974] squid (2700) used greatest stack depth: 12912 bytes left [ 87.506644] IPv

Re: 4.6.3, pppoe + shaper workload, skb_panic / skb_push / ppp_start_xmit

2016-08-17 Thread Denys Fedoryshchenko
On 2016-08-09 00:05, Guillaume Nault wrote: On Mon, Aug 08, 2016 at 02:25:00PM +0300, Denys Fedoryshchenko wrote: On 2016-08-01 23:59, Guillaume Nault wrote: > Do you still have the vmlinux file with debug symbols that generated > this panic? Sorry for delay, i didn't had same i

Re: 4.6.3, pppoe + shaper workload, skb_panic / skb_push / ppp_start_xmit

2016-08-08 Thread Denys Fedoryshchenko
On 2016-08-01 23:59, Guillaume Nault wrote: Do you still have the vmlinux file with debug symbols that generated this panic? Sorry for the delay, I didn't have the same image on all servers. I have probably found the cause of the panic, but I am still testing on several servers. If I remove the SFQ qdisc from ppp shapers,

Re: 4.6.3, pppoe + shaper workload, skb_panic / skb_push / ppp_start_xmit

2016-08-01 Thread Denys Fedoryshchenko
On 2016-08-01 23:59, Guillaume Nault wrote: On Thu, Jul 28, 2016 at 02:28:23PM +0300, Denys Fedoryshchenko wrote: [ 5449.904989] CPU: 1 PID: 6359 Comm: ip Not tainted 4.7.0-build-0109 #2 [ 5449.905255] Hardware name: Supermicro X10SLM+-LN4F/X10SLM+-LN4F, BIOS 3.0 04/24/2015 [ 5449.905712

Re: 4.6.3, pppoe + shaper workload, skb_panic / skb_push / ppp_start_xmit

2016-07-28 Thread Denys Fedoryshchenko
On 2016-07-28 14:09, Guillaume Nault wrote: On Tue, Jul 12, 2016 at 10:31:18AM -0700, Cong Wang wrote: On Mon, Jul 11, 2016 at 12:45 PM, wrote: > Hi > > On latest kernel i noticed kernel panic happening 1-2 times per day. It is > also happening on older kernel (at least 4.5.3). > ... > [42916

Re: kernel panic, __neigh_notify, 4.7.0-rc7, Workqueue: events_power_efficient neigh_periodic_work

2016-07-24 Thread Denys Fedoryshchenko
On 2016-07-24 21:40, nuclear...@nuclearcat.com wrote: Different hardware, but same workload. Seems different bug, happened at least twice on this unit (both kernel panic messages here) As additional sidenote, that might be useful (found in commits, that proxy arp might induce this bug, such as i

Re: kernel panic in 4.2.3, rb_erase in sch_fq

2015-11-13 Thread Denys Fedoryshchenko
at least one more person with similar conntrack crashes on latest kernels. On 2015-11-04 06:46, Eric Dumazet wrote: On Wed, 2015-11-04 at 06:25 +0200, Denys Fedoryshchenko wrote: On 2015-11-04 00:06, Cong Wang wrote: > On Mon, Nov 2, 2015 at 6:11 AM, Denys Fedoryshchenko > wrote:

4.3.0, neighbour: arp_cache: neighbor table overflow! and panic

2015-11-06 Thread Denys Fedoryshchenko
Hi, I have several pppoe servers running older kernels, and upgraded two of them to 4.3.0. After that, one of them reboots randomly, and the stack trace is always different. I also noticed a message that did not appear on older kernels and now appears on both: "neighbour: arp_cache:

Re: kernel panic in 4.2.3, rb_erase in sch_fq

2015-11-03 Thread Denys Fedoryshchenko
On 2015-11-04 06:58, Eric Dumazet wrote: On Tue, 2015-11-03 at 20:46 -0800, Eric Dumazet wrote: On Wed, 2015-11-04 at 06:25 +0200, Denys Fedoryshchenko wrote: > On 2015-11-04 00:06, Cong Wang wrote: > > On Mon, Nov 2, 2015 at 6:11 AM, Denys Fedoryshchenko > > wrote: > >>

Re: HTB, HFSC, PIE, FIFO stuck on 2.4Gbit on default values

2015-11-03 Thread Denys Fedoryshchenko
On 2015-11-04 06:28, Eric Dumazet wrote: On Wed, 2015-11-04 at 06:12 +0200, Denys Fedoryshchenko wrote: Just enabling gro or gso (or together) is fine there. Thanks for advice. Seems only tso causing problems. Also i guess if i keep tso disabled, it will solve my MTU issues (i had once issue

Re: kernel panic in 4.2.3, rb_erase in sch_fq

2015-11-03 Thread Denys Fedoryshchenko
On 2015-11-04 00:06, Cong Wang wrote: On Mon, Nov 2, 2015 at 6:11 AM, Denys Fedoryshchenko wrote: Hi! Actually seems i was getting this panic for a while (once per week) on loaded pppoe server, but just now was able to get full panic message. After checking commit logs on sch_fq.c i didnt

Re: HTB, HFSC, PIE, FIFO stuck on 2.4Gbit on default values

2015-11-03 Thread Denys Fedoryshchenko
On 2015-11-03 23:23, Eric Dumazet wrote: On Tue, 2015-11-03 at 22:24 +0200, Denys Fedoryshchenko wrote: I wont argue on that, you are right. Ok, then it is a bit offtopic in current case, different setup, but i know this one has easy to reproduce issues with offloading. but this is bug

Re: HTB, HFSC, PIE, FIFO stuck on 2.4Gbit on default values

2015-11-03 Thread Denys Fedoryshchenko
On 2015-11-03 21:49, Eric Dumazet wrote: Well, I am telling you. Say no to people advising to turn off GRO/TSO. If you were the guy advising others to do so, it is time to see the light. Let's fix the bugs if any, instead of spreading disinformation. I am so tired of telling these very simple

Re: HTB, HFSC, PIE, FIFO stuck on 2.4Gbit on default values

2015-11-03 Thread Denys Fedoryshchenko
On 2015-11-03 21:11, Eric Dumazet wrote: On Tue, 2015-11-03 at 19:33 +0200, Denys Fedoryshchenko wrote: Hi Recently i was testing shaping over single 10G cards, for speeds up to 3-4Gbps, and noticed interesting effect. Shaping scheme: Incoming bandwidth comes to switch port, with access vlan

HTB, HFSC, PIE, FIFO stuck on 2.4Gbit on default values

2015-11-03 Thread Denys Fedoryshchenko
Hi Recently i was testing shaping over single 10G cards, for speeds up to 3-4Gbps, and noticed interesting effect. Shaping scheme: Incoming bandwidth comes to switch port, with access vlan 100 Outgoing bandwidth leaves switch port with access vlan 200 Linux with Intel X710 connected to trunk p

Re: kernel panic in 4.2.3, rb_erase in sch_fq

2015-11-02 Thread Denys Fedoryshchenko
On 2015-11-02 18:12, Eric Dumazet wrote: On Mon, 2015-11-02 at 17:58 +0200, Denys Fedoryshchenko wrote: On 2015-11-02 17:24, Eric Dumazet wrote: > On Mon, 2015-11-02 at 16:11 +0200, Denys Fedoryshchenko wrote: >> Hi! >> >> Actually seems i was getting this panic for a whi

Re: kernel panic in 4.2.3, rb_erase in sch_fq

2015-11-02 Thread Denys Fedoryshchenko
On 2015-11-02 17:24, Eric Dumazet wrote: On Mon, 2015-11-02 at 16:11 +0200, Denys Fedoryshchenko wrote: Hi! Actually seems i was getting this panic for a while (once per week) on loaded pppoe server, but just now was able to get full panic message. After checking commit logs on sch_fq.c i

kernel panic in 4.2.3, rb_erase in sch_fq

2015-11-02 Thread Denys Fedoryshchenko
Hi! Actually seems i was getting this panic for a while (once per week) on loaded pppoe server, but just now was able to get full panic message. After checking commit logs on sch_fq.c i didnt seen any fixes, so probably upgrading to newer kernel wont help? [237470.633382] general protection

Re: [PATCH net] ppp: don't override sk->sk_state in pppoe_flush_dev()

2015-10-21 Thread Denys Fedoryshchenko
On 2015-10-22 03:14, Matt Bennett wrote: On Tue, 2015-10-13 at 05:13 +0300, Denys Fedoryshchenko wrote: On 2015-10-07 15:12, Guillaume Nault wrote: > On Mon, Oct 05, 2015 at 02:08:44PM +0200, Guillaume Nault wrote: >>if (po) { >>struct sock *s

Re: [PATCH net] ppp: don't override sk->sk_state in pppoe_flush_dev()

2015-10-12 Thread Denys Fedoryshchenko
On 2015-10-07 15:12, Guillaume Nault wrote: On Mon, Oct 05, 2015 at 02:08:44PM +0200, Guillaume Nault wrote: if (po) { struct sock *sk = sk_pppox(po); - bh_lock_sock(sk); - - /* If the user has locked the socket, just ignore -*

Re: [PATCH net] ppp: don't override sk->sk_state in pppoe_flush_dev()

2015-10-04 Thread Denys Fedoryshchenko
On 2015-10-02 20:54, Guillaume Nault wrote: On Fri, Oct 02, 2015 at 11:01:45AM +0300, Denys Fedoryshchenko wrote: Here is similar panic after patch applied (it might be different bug), got over netconsole: [126348.617115] CPU: 0 PID: 5254 Comm: accel-pppd Not tainted 4.2.2-build-0087 #2

Re: [PATCH net] ppp: don't override sk->sk_state in pppoe_flush_dev()

2015-10-02 Thread Denys Fedoryshchenko
Here is similar panic after patch applied (it might be different bug), got over netconsole: [126348.610996] BUG: unable to handle kernel NULL pointer dereference at 0428 [126348.611656] IP: [] pppoe_release+0x56/0x142 [pppoe] [126348.612033] PGD 17d0b03067 PUD 17c721b067 PMD

Re: 4.1.0, kernel panic, pppoe_release

2015-09-25 Thread Denys Fedoryshchenko
On 2015-09-25 17:38, Guillaume Nault wrote: On Tue, Sep 22, 2015 at 04:47:48AM +0300, Denys Fedoryshchenko wrote: Hi, Sorry for late reply, was not able to push new kernel on pppoes without permissions (it's production servers), just got OK. I am testing patch on another pppoe server wi

Re: 4.1.0, kernel panic, pppoe_release

2015-09-21 Thread Denys Fedoryshchenko
. On 2015-09-10 18:56, Guillaume Nault wrote: On Fri, Jul 17, 2015 at 09:16:14PM +0300, Denys Fedoryshchenko wrote: Probably my knowledge of kernel is not sufficient, but i will try few approaches. One of them to add to pppoe_unbind_sock_work: pppox_unbind_sock(sk); +/* Signa

Re: 4.1.0, kernel panic, pppoe_release

2015-07-17 Thread Denys Fedoryshchenko
was causing kernel panic (it needs 24h testing cycle), then i will try this fix. On 2015-07-17 18:36, Dan Williams wrote: On Fri, 2015-07-17 at 12:24 +0300, Denys Fedoryshchenko wrote: As i suspect, this kernel panic caused by recent changes to pppoe. This problem appearing in accel-pppd (server),

Re: 4.1.0, kernel panic, pppoe_release

2015-07-17 Thread Denys Fedoryshchenko
nd related patches. On 2015-07-14 13:57, Denys Fedoryshchenko wrote: Here is panic message from netconsole. Please let me know if any additional information required. Jul 14 13:49:16 10.0.252.10 [76078.867822] BUG: unable to handle kernel Jul 14 13:49:16 10.0.252.10 NULL pointer dereference Jul 1

4.1.0, kernel panic, pppoe_release

2015-07-14 Thread Denys Fedoryshchenko
Here is panic message from netconsole. Please let me know if any additional information required. Jul 14 13:49:16 10.0.252.10 [76078.867822] BUG: unable to handle kernel Jul 14 13:49:16 10.0.252.10 NULL pointer dereference Jul 14 13:49:16 10.0.252.10 at 03f0 Jul 14 13:49:16 10.0.252.

Re: circular locking, mirred, 2.6.24.2

2008-02-25 Thread Denys Fedoryshchenko
-02-2008 23:20, Denys Fedoryshchenko wrote: > > 2.6.24.2 with applied patches for printk,softlockup, and patch for htb (as i > > understand, they are in 2.6.25 git and it is fixes). > > > > I will send also to private mails QoS rules i am us

circular locking, mirred, 2.6.24.2

2008-02-24 Thread Denys Fedoryshchenko
0x296/0x44c [ 118.854820] [] e100_poll+0x14b/0x26a [e100] [ 118.854890] [] net_rx_action+0xbf/0x201 [ 118.854958] [] __do_softirq+0x6f/0xe9 [ 118.855025] [] do_softirq+0x61/0xc8 -- Denys Fedoryshchenko Technical Manager Virtual ISP S.A.L. -- To unsubscribe from this list: send the li

Re: RESEND, HTB(?) softlockup, vanilla 2.6.24

2008-02-16 Thread Denys Fedoryshchenko
OOPS), but i am not sure it is correct and will work if i will setup MTD emulation for block device. That just idea. On Sat, 16 Feb 2008 21:45:19 +0100, Jarek Poplawski wrote > On Sat, Feb 16, 2008 at 12:25:31PM +0200, Denys Fedoryshchenko wrote: > > Thanks, i will try it. > > You t

Re: RESEND, HTB(?) softlockup, vanilla 2.6.24

2008-02-16 Thread Denys Fedoryshchenko
Thanks, i will try it. You think lockdep can be buggy? On Sat, 16 Feb 2008 09:00:36 +0100, Jarek Poplawski wrote > Denys Fedoryshchenko wrote, On 02/13/2008 09:13 AM: > > > It is very difficult to reproduce, happened after running about 1month. No > > changes done in classe

Re: BUG/ spinlock lockup, 2.6.24

2008-02-15 Thread Denys Fedoryshchenko
Fri, 15 Feb 2008 16:24:56 +0100, Bart Van Assche wrote > 2008/2/15 Denys Fedoryshchenko <[EMAIL PROTECTED]>: > > I have random crashes, at least once per week. It is very difficult to catch > > error message, and only recently i setup netconsole. Now i got crash, but >

BUG/ spinlock lockup, 2.6.24

2008-02-15 Thread Denys Fedoryshchenko
de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pebs bts sync_rdtsc pni monitor ds_cpl vmx cid cx16 xtpr lahf_lm bogomips: 6383.76 clflush size: 64 -- Denys Fedoryshchenko Technical Manager Virtual ISP

HTB(?) softlockup, vanilla 2.6.24

2008-02-10 Thread Denys Fedoryshchenko
SHAPER -- Denys Fedoryshchenko Technical Manager Virtual ISP S.A.L. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html

kernel panic on 2.6.24 with esfq patch applied

2008-02-01 Thread Denys Fedoryshchenko
09:08:50 SERVER [12380.068978] Kernel panic - not syncing: Fatal exception in interrupt -- Denys Fedoryshchenko Technical Manager Virtual ISP S.A.L.

pppoe, /proc/net/pppoe wrong (extra entries)

2008-01-29 Thread Denys Fedoryshchenko
e servers, extra entries with same mac at the end. If you need more info or access, please let me know. -- Denys Fedoryshchenko Technical Manager Virtual ISP S.A.L.

WARNING, tcp_fastretrans_alert, rc6-git11

2008-01-22 Thread Denys Fedoryshchenko
61199.895040] === -- Denys Fedoryshchenko Technical Manager Virtual ISP S.A.L.
