Re: Asterisk deadlocks since Kernel 4.1

2015-12-06 Thread Stefan Priebe
Hi Herbert, I think I found the issue in 4.1 with netlink. Somebody made a mistake while backporting or cherry-picking your patch "netlink: Fix autobind race condition that leads to zero port ID" to 4.1. The 4.1 backport lacks the following goto: diff --git

Re: Asterisk deadlocks since Kernel 4.1

2015-12-06 Thread Herbert Xu
On Sun, Dec 06, 2015 at 09:56:34PM +0100, Stefan Priebe wrote: > Hi Herbert, > > i think i found the issue in 4.1 with netlink. Somebody made a > mistake while backporting or cherry-picking your patch "netlink: Fix > autobind race condition that leads to zero port ID" to 4.1. > > It misses a

Re: Asterisk deadlocks since Kernel 4.1

2015-12-06 Thread Philipp Hahn
Hello Stefan, Am 06.12.2015 um 21:56 schrieb Stefan Priebe: > i think i found the issue in 4.1 with netlink. Somebody made a mistake > while backporting or cherry-picking your patch "netlink: Fix autobind > race condition that leads to zero port ID" to 4.1. > > It misses a goto in 4.1. > > This

Re: Asterisk deadlocks since Kernel 4.1

2015-12-06 Thread Stefan Priebe - Profihost AG
Hi Herbert, Am 07.12.2015 um 02:20 schrieb Herbert Xu: > On Sun, Dec 06, 2015 at 09:56:34PM +0100, Stefan Priebe wrote: >> Hi Herbert, >> >> i think i found the issue in 4.1 with netlink. Somebody made a >> mistake while backporting or cherry-picking your patch "netlink: Fix >> autobind race

Re: Asterisk deadlocks since Kernel 4.1

2015-12-05 Thread Stefan Priebe
Hello Philipp, Am 05.12.2015 um 15:19 schrieb Philipp Matthias Hahn: Hello Hannes, On Wed, Dec 02, 2015 at 12:40:32PM +0100, Hannes Frederic Sowa wrote: git bisect tells me it stopped working after those two commits were applied: commit d48623677191e0f035d7afd344f92cf880b01f8e Author:

Re: Asterisk deadlocks since Kernel 4.1

2015-12-05 Thread Philipp Matthias Hahn
Hello Hannes, On Wed, Dec 02, 2015 at 12:40:32PM +0100, Hannes Frederic Sowa wrote: > > git bisect tells me it stopped working after those two commits were applied: > > > > commit d48623677191e0f035d7afd344f92cf880b01f8e > > Author: Herbert Xu > > Date: Tue Sep 22

Re: Asterisk deadlocks since Kernel 4.1

2015-12-04 Thread Herbert Xu
On Fri, Dec 04, 2015 at 07:26:12PM +0100, Stefan Priebe wrote: > > * 9f87e0c - (2 months ago) netlink: Replace rhash_portid with bound > - Herbert Xu > * 35e9890 - (3 months ago) netlink: Fix autobind race condition that > leads to zero port ID - Herbert Xu > * 30c6472 - (7 months ago) netlink:

Re: Asterisk deadlocks since Kernel 4.1

2015-12-04 Thread Stefan Priebe
Hi, I got it fixed, or at least it no longer livelocks or deadlocks, by applying the following patch, which is the diff of the commits below on top of 4.1.13. patch: http://pastebin.com/raw.php?i=hiuq4bsW all commits / changes in reverse order: * 0ceb380 - (6 weeks ago) netlink: fix locking around

Re: Asterisk deadlocks since Kernel 4.1

2015-12-03 Thread Stefan Priebe - Profihost AG
> Am 02.12.2015 um 12:40 schrieb Hannes Frederic Sowa > : > > Hello Stefan, > > Stefan Priebe - Profihost AG writes: > > >> here are the results. >> >> It works with 4.1. >> It works with 4.2. >> It does not work with 4.1.13. >> >> git

Re: Asterisk deadlocks since Kernel 4.1

2015-12-02 Thread Hannes Frederic Sowa
Hello, On Wed, Dec 2, 2015, at 18:15, Philipp Hahn wrote: > Hi, > > Am 02.12.2015 um 10:45 schrieb Stefan Priebe - Profihost AG: > > here are the results. > > > > It works with 4.1. > > It works with 4.2. > > It does not work with 4.1.13. > > the patches were first committed in v4.3-rc3 and

Re: Asterisk deadlocks since Kernel 4.1

2015-12-02 Thread Philipp Hahn
Hi, Am 02.12.2015 um 10:45 schrieb Stefan Priebe - Profihost AG: > here are the results. > > It works with 4.1. > It works with 4.2. > It does not work with 4.1.13. The patches were first committed in v4.3-rc3 and have appeared as backports only since v4.2.3 and v4.1.10. > git bisect tells me it

Re: Asterisk deadlocks since Kernel 4.1

2015-12-02 Thread Philipp Hahn
Hi, Am 02.12.2015 um 12:40 schrieb Hannes Frederic Sowa: > Cool, thanks a lot. Does this patch make a difference? > > diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c > index 59651af..278e94c 100644 > --- a/net/netlink/af_netlink.c > +++ b/net/netlink/af_netlink.c > @@ -1137,7

Re: Asterisk deadlocks since Kernel 4.1

2015-12-02 Thread Stefan Priebe - Profihost AG
Hi, here are the results. It works with 4.1. It works with 4.2. It does not work with 4.1.13. git bisect tells me it stopped working after those two commits were applied: commit d48623677191e0f035d7afd344f92cf880b01f8e Author: Herbert Xu Date: Tue Sep 22

Re: Asterisk deadlocks since Kernel 4.1

2015-12-02 Thread Hannes Frederic Sowa
Hello Stefan, Stefan Priebe - Profihost AG writes: > here are the results. > > It works with 4.1. > It works with 4.2. > It does not work with 4.1.13. > > git bisect tells me it stopped working after those two commits were applied: > > commit

Re: Asterisk deadlocks since Kernel 4.1

2015-11-24 Thread Stefan Priebe - Profihost AG
Am 23.11.2015 um 13:57 schrieb Hannes Frederic Sowa: > On Mon, Nov 23, 2015, at 13:44, Stefan Priebe - Profihost AG wrote: >> Am 19.11.2015 um 20:51 schrieb Stefan Priebe: >>> >>> Am 19.11.2015 um 14:19 schrieb Florian Weimer: On 11/19/2015 01:46 PM, Stefan Priebe - Profihost AG wrote:

Re: Asterisk deadlocks since Kernel 4.1

2015-11-23 Thread Stefan Priebe - Profihost AG
Am 19.11.2015 um 20:51 schrieb Stefan Priebe: > > Am 19.11.2015 um 14:19 schrieb Florian Weimer: >> On 11/19/2015 01:46 PM, Stefan Priebe - Profihost AG wrote: >> >>> I can try Kernel 4.4-rc1 next week. Or something else? >> >> I found this bug report which indicates that 4.1.10 works: >> >>

Re: Asterisk deadlocks since Kernel 4.1

2015-11-23 Thread Hannes Frederic Sowa
On Mon, Nov 23, 2015, at 13:44, Stefan Priebe - Profihost AG wrote: > Am 19.11.2015 um 20:51 schrieb Stefan Priebe: > > > > Am 19.11.2015 um 14:19 schrieb Florian Weimer: > >> On 11/19/2015 01:46 PM, Stefan Priebe - Profihost AG wrote: > >> > >>> I can try Kernel 4.4-rc1 next week. Or something

Re: Asterisk deadlocks since Kernel 4.1

2015-11-19 Thread Florian Weimer
On 11/18/2015 10:23 PM, Stefan Priebe wrote: >> please try to get a backtrace with debugging information. It is likely >> that this is the make_request/__check_pf functionality in glibc, but it >> would be nice to get some certainty. >> >> Which glibc version do you use? Has it got a fix for

Re: Asterisk deadlocks since Kernel 4.1

2015-11-19 Thread Stefan Priebe - Profihost AG
Am 19.11.2015 um 10:44 schrieb Florian Weimer: > On 11/18/2015 10:36 PM, Stefan Priebe wrote: > >>> please try to get a backtrace with debugging information. It is likely >>> that this is the make_request/__check_pf functionality in glibc, but it >>> would be nice to get some certainty. >> >>

Re: Asterisk deadlocks since Kernel 4.1

2015-11-19 Thread Stefan Priebe - Profihost AG
Am 18.11.2015 um 22:22 schrieb Hannes Frederic Sowa: > On Wed, Nov 18, 2015, at 22:20, Stefan Priebe wrote: >> you mean just: >> la /proc/$pid/fd > > ls -l /proc/pid/fd/ > > the numbers in brackets in return from readlink are the inode numbers. > >> and >> >> cat /proc/net/netlink > >

Re: Asterisk deadlocks since Kernel 4.1

2015-11-19 Thread Florian Weimer
On 11/18/2015 10:36 PM, Stefan Priebe wrote: >> please try to get a backtrace with debugging information. It is likely >> that this is the make_request/__check_pf functionality in glibc, but it >> would be nice to get some certainty. > > sorry here it is. What I'm wondering is why is there ipv6

Re: Asterisk deadlocks since Kernel 4.1

2015-11-19 Thread Stefan Priebe - Profihost AG
OK it had a livelock again. It just took more time. So here is the data: # la /proc/2598/fd total 0 dr-x-- 2 root root0 Nov 19 06:53 . dr-xr-xr-x 7 callweaver callweaver 0 Nov 18 22:38 .. lrwx-- 1 root root 64 Nov 19 06:54 0 -> /dev/null lrwx-- 1 root

Re: Asterisk deadlocks since Kernel 4.1

2015-11-19 Thread Stefan Priebe
Am 19.11.2015 um 14:19 schrieb Florian Weimer: On 11/19/2015 01:46 PM, Stefan Priebe - Profihost AG wrote: I can try Kernel 4.4-rc1 next week. Or something else? I found this bug report which indicates that 4.1.10 works: But in

Re: Asterisk deadlocks since Kernel 4.1

2015-11-19 Thread Hannes Frederic Sowa
On Thu, Nov 19, 2015, at 12:43, Stefan Priebe - Profihost AG wrote: > > Am 19.11.2015 um 12:41 schrieb Hannes Frederic Sowa: > > On Thu, Nov 19, 2015, at 10:56, Stefan Priebe - Profihost AG wrote: > >> OK it had a livelock again. It just took more time. > >> > >> So here is the data: > > > >

Re: Asterisk deadlocks since Kernel 4.1

2015-11-19 Thread Stefan Priebe - Profihost AG
Am 19.11.2015 um 13:41 schrieb Hannes Frederic Sowa: > On Thu, Nov 19, 2015, at 12:43, Stefan Priebe - Profihost AG wrote: >> >> Am 19.11.2015 um 12:41 schrieb Hannes Frederic Sowa: >>> On Thu, Nov 19, 2015, at 10:56, Stefan Priebe - Profihost AG wrote: OK it had a livelock again. It just

Re: Asterisk deadlocks since Kernel 4.1

2015-11-19 Thread Stefan Priebe - Profihost AG
Am 19.11.2015 um 12:41 schrieb Hannes Frederic Sowa: > On Thu, Nov 19, 2015, at 10:56, Stefan Priebe - Profihost AG wrote: >> OK it had a livelock again. It just took more time. >> >> So here is the data: > > Thanks, I couldn't reproduce it so far with a simple threaded resolver > loop on your

Re: Asterisk deadlocks since Kernel 4.1

2015-11-19 Thread Hannes Frederic Sowa
On Thu, Nov 19, 2015, at 10:56, Stefan Priebe - Profihost AG wrote: > OK it had a livelock again. It just took more time. > > So here is the data: Thanks, so far I couldn't reproduce it with a simple threaded resolver loop on your kernel. :/ Your data is useless if you don't also provide the file

Re: Asterisk deadlocks since Kernel 4.1

2015-11-19 Thread Florian Weimer
On 11/19/2015 01:46 PM, Stefan Priebe - Profihost AG wrote: > I can try Kernel 4.4-rc1 next week. Or something else? I found this bug report which indicates that 4.1.10 works: But in your original report, you said that 4.1.13 is

Re: Asterisk deadlocks since Kernel 4.1

2015-11-18 Thread Stefan Priebe
Am 18.11.2015 um 22:40 schrieb Hannes Frederic Sowa: On Wed, Nov 18, 2015, at 22:36, Stefan Priebe wrote: sorry here it is. What I'm wondering is why is there ipv6 stuff? I don't have ipv6 except for link local. Could it be this one? https://bugzilla.redhat.com/show_bug.cgi?id=505105#c79

Re: Asterisk deadlocks since Kernel 4.1

2015-11-18 Thread Hannes Frederic Sowa
On Wed, Nov 18, 2015, at 22:36, Stefan Priebe wrote: > sorry here it is. What I'm wondering is why is there ipv6 stuff? I don't > have ipv6 except for link local. Could it be this one? > > https://bugzilla.redhat.com/show_bug.cgi?id=505105#c79 > > Thread 31 (Thread 0x7f295c011700 (LWP 26654)):

Re: Asterisk deadlocks since Kernel 4.1

2015-11-18 Thread Hannes Frederic Sowa
On Wed, Nov 18, 2015, at 22:42, Stefan Priebe wrote: > > Am 18.11.2015 um 22:40 schrieb Hannes Frederic Sowa: > > On Wed, Nov 18, 2015, at 22:36, Stefan Priebe wrote: > >> sorry here it is. What I'm wondering is why is there ipv6 stuff? I don't > >> have ipv6 except for link local. Could it be

Re: Asterisk deadlocks since Kernel 4.1

2015-11-18 Thread Stefan Priebe
Am 18.11.2015 um 22:18 schrieb Florian Weimer: On 11/18/2015 09:23 PM, Stefan Priebe wrote: Am 17.11.2015 um 20:43 schrieb Thomas Gleixner: On Tue, 17 Nov 2015, Stefan Priebe wrote: I've now also two gdb backtraces from two crashes: http://pastebin.com/raw.php?i=yih5jNt8

Re: Asterisk deadlocks since Kernel 4.1

2015-11-18 Thread Hannes Frederic Sowa
On Wed, Nov 18, 2015, at 21:23, Stefan Priebe wrote: > > Am 17.11.2015 um 20:43 schrieb Thomas Gleixner: > > On Tue, 17 Nov 2015, Stefan Priebe wrote: > >> I've now also two gdb backtraces from two crashes: > >> http://pastebin.com/raw.php?i=yih5jNt8 > >> > >>

Re: Asterisk deadlocks since Kernel 4.1

2015-11-18 Thread Florian Weimer
On 11/18/2015 09:23 PM, Stefan Priebe wrote: > > Am 17.11.2015 um 20:43 schrieb Thomas Gleixner: >> On Tue, 17 Nov 2015, Stefan Priebe wrote: >>> I've now also two gdb backtraces from two crashes: >>> http://pastebin.com/raw.php?i=yih5jNt8 >>> >>> http://pastebin.com/raw.php?i=kGEcvH4T >> >> They

Re: Asterisk deadlocks since Kernel 4.1

2015-11-18 Thread Stefan Priebe
Am 18.11.2015 um 22:00 schrieb Hannes Frederic Sowa: On Wed, Nov 18, 2015, at 21:23, Stefan Priebe wrote: Am 17.11.2015 um 20:43 schrieb Thomas Gleixner: On Tue, 17 Nov 2015, Stefan Priebe wrote: I've now also two gdb backtraces from two crashes: http://pastebin.com/raw.php?i=yih5jNt8

Re: Asterisk deadlocks since Kernel 4.1

2015-11-18 Thread Hannes Frederic Sowa
On Wed, Nov 18, 2015, at 22:20, Stefan Priebe wrote: > you mean just: > la /proc/$pid/fd ls -l /proc/pid/fd/ The numbers in brackets returned by readlink are the inode numbers. > and > > cat /proc/net/netlink Exactly, the last column is the inode number. Bye, Hannes
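
The matching Hannes describes here (socket inodes from /proc/pid/fd looked up in /proc/net/netlink) could be sketched roughly as below. This is only an illustration and not something posted in the thread: the function names are made up, and it assumes that /proc/net/netlink starts with a header line and carries the inode in its last column.

```python
import os
import re

# readlink on a socket fd yields a target like "socket:[12345]"
SOCKET_LINK = re.compile(r"^socket:\[(\d+)\]$")


def socket_inode(link_target):
    """Return the inode from a target such as 'socket:[12345]', else None."""
    match = SOCKET_LINK.match(link_target)
    return int(match.group(1)) if match else None


def netlink_rows(table_text):
    """Parse the text of /proc/net/netlink into {inode: {column: value}}.

    Assumption: the first line is a header and the inode is the last column.
    """
    lines = [line.split() for line in table_text.splitlines() if line.strip()]
    if not lines:
        return {}
    header, rows = lines[0], lines[1:]
    result = {}
    for fields in rows:
        if not fields[-1].isdigit():
            continue  # skip rows whose last field is not an inode number
        result[int(fields[-1])] = dict(zip(header, fields))
    return result


def netlink_sockets_of(pid, proc_root="/proc"):
    """Map fd number -> netlink table row for each netlink socket of pid."""
    with open(os.path.join(proc_root, "net", "netlink")) as handle:
        by_inode = netlink_rows(handle.read())
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    found = {}
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # the fd was closed while we were scanning
        inode = socket_inode(target)
        if inode in by_inode:
            found[int(name)] = by_inode[inode]
    return found
```

Calling netlink_sockets_of(pid) for the hung process (typically as root, since the fd directory belongs to another user) would list which of its descriptors are netlink sockets, together with the corresponding rows of the netlink table.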

Re: Asterisk deadlocks since Kernel 4.1

2015-11-18 Thread Stefan Priebe
Am 18.11.2015 um 22:18 schrieb Florian Weimer: On 11/18/2015 09:23 PM, Stefan Priebe wrote: Am 17.11.2015 um 20:43 schrieb Thomas Gleixner: On Tue, 17 Nov 2015, Stefan Priebe wrote: I've now also two gdb backtraces from two crashes: http://pastebin.com/raw.php?i=yih5jNt8

Re: Asterisk deadlocks since Kernel 4.1

2015-11-17 Thread Stefan Priebe
Am 17.11.2015 um 20:15 schrieb Thomas Gleixner: On Tue, 17 Nov 2015, Stefan Priebe - Profihost AG wrote: Since upgrading our Asterisk system from kernel 3.18.17 to 4.1.13, it deadlocks every few hours (kill -9 is the only thing that works). Booting 3.18 again makes it run smoothly again. An

Re: Asterisk deadlocks since Kernel 4.1

2015-11-17 Thread Thomas Gleixner
On Tue, 17 Nov 2015, Stefan Priebe wrote: > I've now also two gdb backtraces from two crashes: > http://pastebin.com/raw.php?i=yih5jNt8 > > http://pastebin.com/raw.php?i=kGEcvH4T They don't tell me anything, as I have no idea of the inner workings of Asterisk. You might be better off talking to