[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-06-20 Thread Heikki Hannikainen
I deployed the actual -proposed kernel 4.4.0-152.179 on 4 servers, and it is stable for us. Previously there were multiple crashes per day. Confirming, verification done. Thank you! ** Tags added: amd64 apport-bug apport-collected xenial -- You received this bug notification because you are a

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-06-19 Thread Dirk
Verified fix by reporter ** Tags removed: amd64 apport-bug apport-collected verification-needed-xenial xenial ** Tags added: verification-done-xenial -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-06-19 Thread Dirk
I tested with kernel: Linux tor3 4.4.0-149-generic #175+lp1824687v4 SMP Mon May 27 17:21:02 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux. IPv6 is enabled and the system is under its usual load. No crashes in 24h. For me this is a clear indication that the problem is fixed. Before, there were crashes all

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-06-18 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-06-11 Thread Stefan Bader
I reverted the changes to Cosmic because that needs at least a different approach. In that version the rbtree usage is not yet present and the IPv4 expire function does exactly the same thing (increment the refcount of the skb) and we have no hard evidence this actually causes crashes in the 4.18

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-06-06 Thread Stefan Bader
** Changed in: linux (Ubuntu Xenial) Status: Triaged => Fix Committed ** Changed in: linux (Ubuntu Cosmic) Status: Triaged => Fix Committed

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-06-06 Thread Stefan Bader
** Description changed: + [SRU Justification] + + == Impact == + + Since 05c0b86b96 "ipv6: frags: rewrite ip6_expire_frag_queue()" the + 16.04/4.4 kernel crashes whenever that function gets called (on busy + systems this can be every 3-4 hours). While this potentially affects + Cosmic and

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-06-05 Thread Heikki Hannikainen
I've now got the v4 debs on 5 servers, and not a single crash since they were installed on each. Looks good to me. Thank you! -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1824687 Title:

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-05-31 Thread Heikki Hannikainen
Thanks, I deployed the v4 debs on one server which was particularly unstable, and it's still up after 1 day and 8 hours now. I'll deploy more widely on Monday and Tuesday.

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-05-28 Thread Stefan Bader
So far I have not managed to trigger the code path which leads to the crashes on my test system. I have, however, been able to extend the patch I had in v2 in a way that makes me a bit more hopeful that it might get us somewhere. Potentially not the most optimized handling but that could

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-05-21 Thread Heikki Hannikainen
I've now got 6 crashes within the past 24 hours on the #170+lp1824687v2 testing kernel on a *single* server. It's a production environment, so I'll roll back for now. The two latest backtraces: [ 6251.834160] Call Trace: [ 6251.834166] [ 6251.834174] [] skb_release_head_state+0x90/0xb0 [ 6251.834189]

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-05-21 Thread Stefan Bader
Spent a little more time on this yesterday. While it is somewhat clear that this results from fixing the original issue (now it crashes when releasing memory a little later), my past experience of looking at network issues like that is that memory dumps are of rather limited use as the reasons

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-05-17 Thread Stefan Bader
Thanks. At a quick glance this looks different: it now crashes at a different point (when releasing a buffer). I will have to take a closer look but probably not today.

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-05-17 Thread Heikki Hannikainen
Unfortunately 4.4.0-144-generic #170+lp1824687v2 testing kernel still crashes. I have 4 hardware instances running it now, there were 2 panics (Australia, Sweden) within 24 hours. I installed linux-crashdump on them after the first crash to get the panic logs reliably. Attached a log from the

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-05-15 Thread Heikki Hannikainen
Sorry for the delay. I'm back in the office now and deploying the test kernel today to a few servers, and to additional ones tomorrow if it's OK on the first ones.

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-05-07 Thread Stefan Bader
Just a reminder for the test kernel. If this can be tested soon, it could make it into the next update cycle which starts next week. But for that it has to be submitted before end of Wednesday.

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-05-03 Thread Heikki Hannikainen
Thank you! I can test this on Monday, weekend is starting here in 2 minutes and this is not the greatest moment to start testing. :)

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-05-03 Thread Stefan Bader
Ok, http://people.canonical.com/~smb/lp1824687/ has been updated with a v2 set which has the upstream patch backported.

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-05-03 Thread Stefan Bader
From the upstream discussion thread it looks like I was on the right track (https://marc.info/?l=linux-netdev&m=155688404826002&w=2). For confirmation I am building another set of test kernel packages and once this can be confirmed will proceed to SRU this into the other series. This looks to have

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-05-03 Thread Heikki Hannikainen
There is the kernel.org bug ticket which describes similar oopsing through ip6_expire_frag_queue() in 4.9 and 4.19 kernels: https://bugzilla.kernel.org/show_bug.cgi?id=202669 I also saw crashes on 4.15.0-48-generic on a server running the same task; I don't have stack traces to show yet since

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-05-03 Thread Stefan Bader
As a status update: thanks for testing. It is a pity it did not help. So far I was looking through all related changes in that set but could not find anything that immediately stuck out. Thinking more about the crash stack trace, it is a netfilter conntrack timer expiring which causes a call into

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-05-01 Thread Dirk
OK, tested with the kernel provided. It does not improve the situation. The machine crashed for the first time after about 2 hours, with the same error as always. I rebooted it, and it took 2-3 hours until the next crash.

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-04-30 Thread Dirk
Thanks for the updated kernel. Sorry for the late reply. The IPMI changes were unsuccessful. However, I could install and boot your kernel: Linux version 4.4.0-144-generic (smb@kathleen) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.10) ) #170+lp1824687v1 SMP Tue Apr 30 11:18:53 UTC

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-04-30 Thread Stefan Bader
The issue is a check which is causing an oops/crash when a send buffer is referenced more than once when calling pskb_expand_head(). As mentioned in comment #18, this seems to be introduced by a series of patches modifying the way fragments are handled. The networking code is quite complex, so I
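
The failure mode described above can be illustrated with a toy user-space model in Python. This is not kernel code: the `Buffer` class, its methods, and `expire_with_extra_reference()` are hypothetical stand-ins invented for illustration. The only kernel behaviour assumed is the one stated in the message, namely that pskb_expand_head() trips a check when the buffer is referenced more than once.

```python
class SharedBufferError(Exception):
    """Raised where, in this toy model, the kernel would oops."""


class Buffer:
    """Hypothetical stand-in for a socket buffer with a reference count."""

    def __init__(self, data: bytes) -> None:
        self.data = bytearray(data)
        self.users = 1  # the creator holds the only reference

    def get(self) -> "Buffer":
        """Take an additional reference (like bumping the skb refcount)."""
        self.users += 1
        return self

    def put(self) -> None:
        """Drop one reference."""
        self.users -= 1

    def is_shared(self) -> bool:
        """True when anyone besides the caller holds a reference."""
        return self.users != 1

    def expand_head(self, extra: int) -> None:
        """Grow the headroom; only legal for the sole owner of the buffer."""
        if self.is_shared():
            raise SharedBufferError(
                f"buffer has {self.users} users, expected exactly 1"
            )
        self.data = bytearray(extra) + self.data


def expire_with_extra_reference(buf: Buffer) -> None:
    """Mimics the crashing pattern: hold an extra reference, then expand."""
    buf.get()
    try:
        buf.expand_head(16)  # fails: the buffer now has two users
    finally:
        buf.put()
```

In this model, calling `expand_head()` on a freshly created buffer succeeds, while `expire_with_extra_reference()` always raises, because the extra reference makes the buffer shared at the moment the check runs.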

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-04-30 Thread Stefan Bader
Thanks for the stack traces. Those help a lot to pinpoint the problem. Will be taking a look. ** Changed in: linux (Ubuntu Xenial) Importance: Undecided => High ** Changed in: linux (Ubuntu Xenial) Status: Incomplete => Triaged ** Changed in: linux (Ubuntu Xenial) Assignee:

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-04-29 Thread Heikki Hannikainen
kernel.org bug ticket, showing similar crashes on 4.9 and 4.19 kernels: https://bugzilla.kernel.org/show_bug.cgi?id=202669 ** Bug watch added: Linux Kernel Bug Tracker #202669 https://bugzilla.kernel.org/show_bug.cgi?id=202669

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-04-29 Thread Heikki Hannikainen
** Attachment added: "5 kernel stack traces of crashes on 4.4.0-145 and -146, on 4 different hardware nodes" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1824687/+attachment/5260024/+files/hessu-ipv6_expire_frag_queue-crashes.txt

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-04-29 Thread Heikki Hannikainen
I have had this crash, with the ip6_expire_frag_queue stack trace, more than 18 times since 2019-04-16 on more than 10 different servers in 8 different countries. There have been some more crashes, but for these the panic dump managed to go out to a remote syslog server where it's easy to

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-04-17 Thread Stefan Bader
The latter means that you still have 4.4.0-143 around and could select that if you had any way of interfacing with the booting server. So you could go back and confirm the regression happened between 143 and 145. About IPMI, I don't know how one would do that with Windows, but using a Linux box,

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-04-16 Thread Dirk
I spent the better part of 2 hours installing Java on an old Windows machine to satisfy the IPMI requirements. I was able to start SOL, but it is just a black window displaying nothing. I give up on this, sorry. I do not know how to produce a better screenshot (yes, I googled). IPMI Viewer produced the same bad

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-04-16 Thread Stefan Bader
Knowing which was the last good kernel would be good to minimize the delta of changes. Note that if you are able to interact with the grub loader at boot, you can go back to at least the previous kernel before the reboot. For the trace it would be good to capture the full message. If the server
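
As a rough sketch of what "minimizing the delta" means in practice, the following Python snippet orders Ubuntu kernel version strings and lists the builds that fall between a last known good and a first known bad kernel. The function names and the list of installed kernels are assumptions made for illustration; the version numbers are ones mentioned in this thread, and the last good kernel was never firmly confirmed by the reporter.

```python
import re

# Matches the numeric part of names such as "4.4.0-145-generic".
_KERNEL_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)-(\d+)")


def kernel_key(version: str) -> tuple:
    """Turn a kernel version string into a tuple that sorts numerically."""
    match = _KERNEL_RE.search(version)
    if match is None:
        raise ValueError(f"unrecognized kernel version: {version!r}")
    return tuple(int(part) for part in match.groups())


def suspect_range(installed, last_good, first_bad):
    """Return kernels newer than last_good, up to and including first_bad."""
    low, high = kernel_key(last_good), kernel_key(first_bad)
    candidates = [v for v in installed if low < kernel_key(v) <= high]
    return sorted(candidates, key=kernel_key)


# Illustrative input only, not taken from any real server.
installed = [
    "4.4.0-146-generic",
    "4.4.0-143-generic",
    "4.4.0-145-generic",
    "4.4.0-144-generic",
]
print(suspect_range(installed, "4.4.0-143-generic", "4.4.0-145-generic"))
```

The smaller this list, the fewer changes have to be reviewed to find the regression, which is why confirming the last good kernel matters.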

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-04-15 Thread Dirk
Regarding #2: I do not know which kernel ran before. I assume linux-image-4.4.0-143, based on the following apt history logs. However, I do not know if we rebooted. I have a different type of server here with Ubuntu. I can try to stress it and see, but I doubt I will get the same quality of traffic since

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-04-15 Thread Dirk
Added the logs from apport-collect 1824687 and changed the status of the bug to 'Confirmed'. ** Changed in: linux (Ubuntu) Status: Incomplete => Confirmed

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-04-15 Thread Dirk
apport information ** Tags added: apport-collected ** Description changed: Description: Ubuntu 16.04.6 LTS Release: 16.04 After upgrading our server to this Kernel we experience frequent Kernel panics (Attachment). Every 3 hours. Our machine has a throughput of about 600

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-04-15 Thread Stefan Bader
Which kernel version was used before (and did not show this crash)? Can you reproduce the issue on a non-production server (which would allow experimenting with the HWE (4.15) kernel)?

[Kernel-packages] [Bug 1824687] Re: 4.4.0-145-generic Kernel Panic ip6_expire_frag_queue

2019-04-15 Thread Stefan Bader
** Package changed: linux-signed (Ubuntu) => linux (Ubuntu) ** Also affects: linux (Ubuntu Xenial) Importance: Undecided Status: New