OK, LMK if you do find time and/or need further crash traces etc.
Thanks
________________________________________
From: [email protected] <[email protected]> on behalf of Rainer Gerhards <[email protected]>
Sent: Wednesday, March 4, 2015 10:34 AM
To: rsyslog-users
Subject: Re: [rsyslog] rsyslogd crashes (segv, signal 11) intermittantly on strmSetWCntr (stream.c:1918)

Unchanged, no time so far :(

Sent from phone, thus brief.
On 04.03.2015 at 16:27, "Raymond Wu" <[email protected]> wrote:

> Rainer,
> Any update on this? We're still experiencing the crashes, every few days.
>
> Thanks
> Ray
> ________________________________________
> From: [email protected] <[email protected]>
> on behalf of Rainer Gerhards <[email protected]>
> Sent: Friday, February 27, 2015 3:37 AM
> To: rsyslog-users
> Subject: Re: [rsyslog] rsyslogd crashes (segv, signal 11) intermittantly on strmSetWCntr (stream.c:1918)
>
> thanks, that is very good. I'll try to have a look ASAP, probably early
> next week. Please ping me if you haven't heard any more back by mid-week. I
> don't need the core.
>
> Rainer
>
> 2015-02-26 17:11 GMT+01:00 Raymond Wu <[email protected]>:
>
> > Rainer,
> >
> > I was able to run rsyslogd under valgrind and catch this crash...
> >
> > [root@tsqweb03 data]# cat valgrind-rsyslogd-tsqweb03-20150226.log
> > ==19146== Thread 17:
> > ==19146== Invalid read of size 8
> > ==19146==    at 0x13A679: strmPhysWrite (stream.c:1288)
> > ==19146==    by 0x13AF66: strmFlush (stream.c:1440)
> > ==19146==    by 0x13EDB6: qAddDisk (queue.c:932)
> > ==19146==    by 0x13F1E5: doEnqSingleObj (queue.c:1034)
> > ==19146==    by 0x14208C: qqueueEnqMsg (queue.c:2858)
> > ==19146==    by 0x142D05: ConsumerDA (queue.c:1955)
> > ==19146==    by 0x13D4B1: wtiWorker (wti.c:334)
> > ==19146==    by 0x13D001: wtpWorker (wtp.c:389)
> > ==19146==    by 0x504C9D0: start_thread (in /lib64/libpthread-2.12.so)
> > ==19146==    by 0x61819DC: clone (in /lib64/libc-2.12.so)
> > ==19146==  Address 0x65b8d60 is 80 bytes inside a block of size 616 free'd
> > ==19146==    at 0x4C27430: free (vg_replace_malloc.c:446)
> > ==19146==    by 0x13B790: strmDestruct (stream.c:947)
> > ==19146==    by 0x13E1C6: qDestructDisk (queue.c:914)
> > ==19146==    by 0x142F75: ConsumerReg (queue.c:711)
> > ==19146==    by 0x13D4B1: wtiWorker (wti.c:334)
> > ==19146==    by 0x13D001: wtpWorker (wtp.c:389)
> > ==19146==    by 0x504C9D0: start_thread (in /lib64/libpthread-2.12.so)
> > ==19146==    by 0x61819DC: clone (in /lib64/libc-2.12.so)
> > ==19146==
> > ==19146== Invalid read of size 8
> > ==19146==    at 0x13A682: strmPhysWrite (stream.c:1286)
> > ==19146==    by 0x13AF66: strmFlush (stream.c:1440)
> > ==19146==    by 0x13EDB6: qAddDisk (queue.c:932)
> > ==19146==    by 0x13F1E5: doEnqSingleObj (queue.c:1034)
> > ==19146==    by 0x14208C: qqueueEnqMsg (queue.c:2858)
> > ==19146==    by 0x142D05: ConsumerDA (queue.c:1955)
> > ==19146==    by 0x13D4B1: wtiWorker (wti.c:334)
> > ==19146==    by 0x13D001: wtpWorker (wtp.c:389)
> > ==19146==    by 0x504C9D0: start_thread (in /lib64/libpthread-2.12.so)
> > ==19146==    by 0x61819DC: clone (in /lib64/libc-2.12.so)
> > ==19146==  Address 0x65b8d58 is 72 bytes inside a block of size 616 free'd
> > ==19146==    at 0x4C27430: free (vg_replace_malloc.c:446)
> > ==19146==    by 0x13B790: strmDestruct (stream.c:947)
> > ==19146==    by 0x13E1C6: qDestructDisk (queue.c:914)
> > ==19146==    by 0x142F75: ConsumerReg (queue.c:711)
> > ==19146==    by 0x13D4B1: wtiWorker (wti.c:334)
> > ==19146==    by 0x13D001: wtpWorker (wtp.c:389)
> > ==19146==    by 0x504C9D0: start_thread (in /lib64/libpthread-2.12.so)
> > ==19146==    by 0x61819DC: clone (in /lib64/libc-2.12.so)
> > ==19146==
> > ==19146== Invalid read of size 1
> > ==19146==    at 0x13A68E: strmPhysWrite (stream.c:1291)
> > ==19146==    by 0x13AF66: strmFlush (stream.c:1440)
> > ==19146==    by 0x13EDB6: qAddDisk (queue.c:932)
> > ==19146==    by 0x13F1E5: doEnqSingleObj (queue.c:1034)
> > ==19146==    by 0x14208C: qqueueEnqMsg (queue.c:2858)
> > ==19146==    by 0x142D05: ConsumerDA (queue.c:1955)
> > ==19146==    by 0x13D4B1: wtiWorker (wti.c:334)
> > ==19146==    by 0x13D001: wtpWorker (wtp.c:389)
> > ==19146==    by 0x504C9D0: start_thread (in /lib64/libpthread-2.12.so)
> > ==19146==    by 0x61819DC: clone (in /lib64/libc-2.12.so)
> > ==19146==  Address 0x65b8d69 is 89 bytes inside a block of size 616 free'd
> > ==19146==    at 0x4C27430: free (vg_replace_malloc.c:446)
> > ==19146==    by 0x13B790: strmDestruct (stream.c:947)
> > ==19146==    by 0x13E1C6: qDestructDisk (queue.c:914)
> > ==19146==    by 0x142F75: ConsumerReg (queue.c:711)
> > ==19146==    by 0x13D4B1: wtiWorker (wti.c:334)
> > ==19146==    by 0x13D001: wtpWorker (wtp.c:389)
> > ==19146==    by 0x504C9D0: start_thread (in /lib64/libpthread-2.12.so)
> > ==19146==    by 0x61819DC: clone (in /lib64/libc-2.12.so)
> > ==19146==
> > ==19146== Invalid read of size 4
> > ==19146==    at 0x13A698: strmPhysWrite (stream.c:1295)
> > ==19146==    by 0x13AF66: strmFlush (stream.c:1440)
> > ==19146==    by 0x13EDB6: qAddDisk (queue.c:932)
> > ==19146==    by 0x13F1E5: doEnqSingleObj (queue.c:1034)
> > ==19146==    by 0x14208C: qqueueEnqMsg (queue.c:2858)
> > ==19146==    by 0x142D05: ConsumerDA (queue.c:1955)
> > ==19146==    by 0x13D4B1: wtiWorker (wti.c:334)
> > ==19146==    by 0x13D001: wtpWorker (wtp.c:389)
> > ==19146==    by 0x504C9D0: start_thread (in /lib64/libpthread-2.12.so)
> > ==19146==    by 0x61819DC: clone (in /lib64/libc-2.12.so)
> > ==19146==  Address 0x65b8d20 is 16 bytes inside a block of size 616 free'd
> > ==19146==    at 0x4C27430: free (vg_replace_malloc.c:446)
> > ==19146==    by 0x13B790: strmDestruct (stream.c:947)
> > ==19146==    by 0x13E1C6: qDestructDisk (queue.c:914)
> > ==19146==    by 0x142F75: ConsumerReg (queue.c:711)
> > ==19146==    by 0x13D4B1: wtiWorker (wti.c:334)
> > ==19146==    by 0x13D001: wtpWorker (wtp.c:389)
> > ==19146==    by 0x504C9D0: start_thread (in /lib64/libpthread-2.12.so)
> > ==19146==    by 0x61819DC: clone (in /lib64/libc-2.12.so)
> > ==19146==
> > ==19146== Invalid read of size 4
> > ==19146==    at 0x13A51B: strmCheckNextOutputFile (stream.c:959)
> > ==19146==    by 0x13A7BF: strmPhysWrite (stream.c:1296)
> > ==19146==    by 0x13AF66: strmFlush (stream.c:1440)
> > ==19146==    by 0x13EDB6: qAddDisk (queue.c:932)
> > ==19146==    by 0x13F1E5: doEnqSingleObj (queue.c:1034)
> > ==19146==    by 0x14208C: qqueueEnqMsg (queue.c:2858)
> > ==19146==    by 0x142D05: ConsumerDA (queue.c:1955)
> > ==19146==    by 0x13D4B1: wtiWorker (wti.c:334)
> > ==19146==    by 0x13D001: wtpWorker (wtp.c:389)
> > ==19146==    by 0x504C9D0: start_thread (in /lib64/libpthread-2.12.so)
> > ==19146==    by 0x61819DC: clone (in /lib64/libc-2.12.so)
> > ==19146==  Address 0x65b8d84 is 116 bytes inside a block of size 616 free'd
> > ==19146==    at 0x4C27430: free (vg_replace_malloc.c:446)
> > ==19146==    by 0x13B790: strmDestruct (stream.c:947)
> > ==19146==    by 0x13E1C6: qDestructDisk (queue.c:914)
> > ==19146==    by 0x142F75: ConsumerReg (queue.c:711)
> > ==19146==    by 0x13D4B1: wtiWorker (wti.c:334)
> > ==19146==    by 0x13D001: wtpWorker (wtp.c:389)
> > ==19146==    by 0x504C9D0: start_thread (in /lib64/libpthread-2.12.so)
> > ==19146==    by 0x61819DC: clone (in /lib64/libc-2.12.so)
> > ==19146==
> > ==19146== Invalid read of size 1
> > ==19146==    at 0x13AF67: strmFlush (stream.c:1443)
> > ==19146==    by 0x13EDB6: qAddDisk (queue.c:932)
> > ==19146==    by 0x13F1E5: doEnqSingleObj (queue.c:1034)
> > ==19146==    by 0x14208C: qqueueEnqMsg (queue.c:2858)
> > ==19146==    by 0x142D05: ConsumerDA (queue.c:1955)
> > ==19146==    by 0x13D4B1: wtiWorker (wti.c:334)
> > ==19146==    by 0x13D001: wtpWorker (wtp.c:389)
> > ==19146==    by 0x504C9D0: start_thread (in /lib64/libpthread-2.12.so)
> > ==19146==    by 0x61819DC: clone (in /lib64/libc-2.12.so)
> > ==19146==  Address 0x65b8dd0 is 192 bytes inside a block of size 616 free'd
> > ==19146==    at 0x4C27430: free (vg_replace_malloc.c:446)
> > ==19146==    by 0x13B790: strmDestruct (stream.c:947)
> > ==19146==    by 0x13E1C6: qDestructDisk (queue.c:914)
> > ==19146==    by 0x142F75: ConsumerReg (queue.c:711)
> > ==19146==    by 0x13D4B1: wtiWorker (wti.c:334)
> > ==19146==    by 0x13D001: wtpWorker (wtp.c:389)
> > ==19146==    by 0x504C9D0: start_thread (in /lib64/libpthread-2.12.so)
> > ==19146==    by 0x61819DC: clone (in /lib64/libc-2.12.so)
> > ==19146==
> > ==19146== Invalid write of size 8
> > ==19146==    at 0x13915C: strmSetWCntr (stream.c:1922)
> > ==19146==    by 0x13EDCB: qAddDisk (queue.c:933)
> > ==19146==    by 0x13F1E5: doEnqSingleObj (queue.c:1034)
> > ==19146==    by 0x14208C: qqueueEnqMsg (queue.c:2858)
> > ==19146==    by 0x142D05: ConsumerDA (queue.c:1955)
> > ==19146==    by 0x13D4B1: wtiWorker (wti.c:334)
> > ==19146==    by 0x13D001: wtpWorker (wtp.c:389)
> > ==19146==    by 0x504C9D0: start_thread (in /lib64/libpthread-2.12.so)
> > ==19146==    by 0x61819DC: clone (in /lib64/libc-2.12.so)
> > ==19146==  Address 0x50 is not stack'd, malloc'd or (recently) free'd
> > ==19146==
> > ==19146==
> > ==19146== Process terminating with default action of signal 11 (SIGSEGV): dumping core
> > ==19146==  Access not within mapped region at address 0x50
> > ==19146==    at 0x13915C: strmSetWCntr (stream.c:1922)
> > ==19146==    by 0x13EDCB: qAddDisk (queue.c:933)
> > ==19146==    by 0x13F1E5: doEnqSingleObj (queue.c:1034)
> > ==19146==    by 0x14208C: qqueueEnqMsg (queue.c:2858)
> > ==19146==    by 0x142D05: ConsumerDA (queue.c:1955)
> > ==19146==    by 0x13D4B1: wtiWorker (wti.c:334)
> > ==19146==    by 0x13D001: wtpWorker (wtp.c:389)
> > ==19146==    by 0x504C9D0: start_thread (in /lib64/libpthread-2.12.so)
> > ==19146==    by 0x61819DC: clone (in /lib64/libc-2.12.so)
> > ==19146==  If you believe this happened as a result of a stack
> > ==19146==  overflow in your program's main thread (unlikely but
> > ==19146==  possible), you can try to increase the size of the
> > ==19146==  main thread stack using the --main-stacksize= flag.
> > ==19146==  The main thread stack size used in this run was 10485760.
> >
> > [root@tsqweb03 data]# rpm -qv rsyslog
> > rsyslog-8.7.0-1.el6.x86_64
> >
> > If you need the .core files, I can upload them somewhere.
> >
> > Ray
> > ________________________________________
> > From: [email protected] <[email protected]>
> > on behalf of Rainer Gerhards <[email protected]>
> > Sent: Thursday, February 12, 2015 12:12 PM
> > To: rsyslog-users
> > Subject: Re: [rsyslog] rsyslogd crashes (segv, signal 11) intermittantly on strmSetWCntr (stream.c:1918)
> >
> > 2015-02-12 18:07 GMT+01:00 Raymond Wu <[email protected]>:
> >
> > > Rainer, we're still seeing these crashes of rsyslogd even in v8.7.0...
> > > I will try to run this under Valgrind, but here is another gdb trace.
> > >
> > >
> > valgrind would be great, as the gdb trace is "after the fact" -- the
> > problem can have occurred much earlier at a very different location.
> >
> > Rainer
> >
> > > RW
> > > ________________________________________
> > > From: [email protected] <[email protected]>
> > > on behalf of Rainer Gerhards <[email protected]>
> > > Sent: Wednesday, October 29, 2014 12:40 PM
> > > To: rsyslog-users
> > > Subject: Re: [rsyslog] rsyslogd crashes (segv, signal 11) intermittantly on strmSetWCntr (stream.c:1918)
> > >
> > > oops again, I meant
> > >
> > > $ valgrind rsyslogd -n ...options....
> > >
> > > The -n makes it run in the foreground. You can Ctrl-C out of it when done.
> > >
> > > Rainer
> > >
> > > 2014-10-29 17:39 GMT+01:00 Rainer Gerhards <[email protected]>:
> > >
> > > > sorry, I thought I had answered that.
> > > >
> > > > You can simply install the symbols via
> > > >
> > > > $ yum install rsyslog-debuginfo
> > > >
> > > > from our repository.
> > > >
> > > > Then all you need to do is stop rsyslog and interactively call it via
> > > >
> > > > $ valgrind rsyslogd ...options....
> > > >
> > > > That should do the trick.
> > > >
> > > > Thanks,
> > > > Rainer
> > > >
> > > > 2014-10-27 20:01 GMT+01:00 Raymond Wu <[email protected]>:
> > > >
> > > >> This will require rebuilding the program with -g, correct?
> > > >> We can try this but probably can't do it today.
> > > >> Would increasing the polling interval help?
> > > >>
> > > >> RW
> > > >> ________________________________________
> > > >> From: [email protected] <[email protected]>
> > > >> on behalf of Rainer Gerhards <[email protected]>
> > > >> Sent: Monday, October 27, 2014 2:35 PM
> > > >> To: rsyslog-users
> > > >> Subject: Re: [rsyslog] rsyslogd crashes (segv, signal 11) intermittantly on strmSetWCntr (stream.c:1918)
> > > >>
> > > >> Could you try under valgrind? That would be very helpful...
> > > >>
> > > >> Sent from phone, thus brief.
> > > >> On 27.10.2014 at 19:11, "Raymond Wu" <[email protected]> wrote:
> > > >>
> > > >> > OK. I ran gdb on a few more cores produced from similar crashes...
> > > >> > Seems to be the same every time (stream.c:1918).
> > > >> > ________________________________________
> > > >> > From: [email protected] <[email protected]>
> > > >> > on behalf of Rainer Gerhards <[email protected]>
> > > >> > Sent: Monday, October 27, 2014 1:26 PM
> > > >> > To: rsyslog-users
> > > >> > Subject: Re: [rsyslog] rsyslogd crashes (segv, signal 11) intermittantly on strmSetWCntr (stream.c:1918)
> > > >> >
> > > >> > 2014-10-27 17:33 GMT+01:00 Raymond Wu <[email protected]>:
> > > >> >
> > > >> > > Thanks for the quick reply, Rainer.
> > > >> > > Is there a $ version of mode=polling directive?
> > > >> > > We're using the old style configs...
> > > >> > >
> > > >> > >
> > > >> > ah, sorry, I overlooked that. so this can't be the issue here.
> > > >> > Old-style always uses polling.
> > > >> >
> > > >> > Rainer
> > > >> >
> > > >> > > $ModLoad imfile
> > > >> > > $InputFileName /var/log/tsq/beakley.log
> > > >> > > $InputFileTag beakley.log_app:
> > > >> > > $InputFileStateFile stat-beakley.log
> > > >> > > $InputFileSeverity error
> > > >> > > $InputFileFacility local7
> > > >> > > $InputRunFileMonitor
> > > >> > > $InputFilePollingInterval 1
> > > >> > >
> > > >> > > Ray
> > > >> > > ________________________________________
> > > >> > > From: [email protected] <[email protected]>
> > > >> > > on behalf of Rainer Gerhards <[email protected]>
> > > >> > > Sent: Monday, October 27, 2014 11:57 AM
> > > >> > > To: rsyslog-users
> > > >> > > Subject: Re: [rsyslog] rsyslogd crashes (segv, signal 11) intermittantly on strmSetWCntr (stream.c:1918)
> > > >> > >
> > > >> > > This strongly smells like being caused by this bug:
> > > >> > >
> > > >> > > https://github.com/rsyslog/rsyslog/issues/135
> > > >> > >
> > > >> > > Rainer
> > > >> > >
> > > >> > > 2014-10-27 16:53 GMT+01:00 Raymond Wu <[email protected]>:
> > > >> > >
> > > >> > > >
> > > >> > > > Hello, I'm running rsyslog-8.4.2 (Adiscon package) on CentOS 6.5. The
> > > >> > > > rsyslogd daemon crashes intermittently on a set of our systems. These hosts
> > > >> > > > run apps which write their own log files, which are fed to rsyslog via the
> > > >> > > > IMFILE feature. I've patched the OS (CentOS 6.5) on these hosts with all
> > > >> > > > updates as of a few days ago. Often the daemon won't start, post-crash,
> > > >> > > > unless (1) it's run under strace, or (2) I remove (some of) the
> > > >> > > > /var/log/srvrfwd-*.queue.* files. Perhaps some bug parsing/flushing queue
> > > >> > > > files. I can post the rsyslog configuration if necessary, but there are a
> > > >> > > > crapload of small include .conf files (around 40, to IMFILE various
> > > >> > > > application logs) but nothing really fancy is being done. Here is the
> > > >> > > > package info:
> > > >> > > >
> > > >> > > >
> > > >> > > > $ rpm -qvi rsyslog
> > > >> > > >
> > > >> > > > Name        : rsyslog                      Relocations: (not relocatable)
> > > >> > > > Version     : 8.4.2                             Vendor: (none)
> > > >> > > > Release     : 1.el6                         Build Date: Thu 02 Oct 2014 03:25:33 AM EDT
> > > >> > > > Install Date: Mon 20 Oct 2014 04:59:59 PM EDT      Build Host: vmrpm.adiscon.com
> > > >> > > > Group       : System Environment/Daemons    Source RPM: rsyslog-8.4.2-1.el6.src.rpm
> > > >> > > > Size        : 2186241                          License: (GPLv3+ and ASL 2.0)
> > > >> > > > Signature   : RSA/SHA1, Thu 02 Oct 2014 03:28:29 AM EDT, Key ID e0f233b3e00b8985
> > > >> > > > URL         : http://www.rsyslog.com/
> > > >> > > > Summary     : Enhanced system logging and kernel message trapping daemon
> > > >> > > > Description :
> > > >> > > > Rsyslog is an enhanced, multi-threaded syslog daemon. It supports MySQL,
> > > >> > > > syslog/TCP, RFC 3195, permitted sender lists, filtering on any message part,
> > > >> > > > and fine grain output format control. It is compatible with stock sysklogd
> > > >> > > > and can be used as a drop-in replacement. Rsyslog is simple to set up, with
> > > >> > > > advanced features suitable for enterprise-class, encryption-protected syslog
> > > >> > > > relay chains.
> > > >> > > >
> > > >> > > >
> > > >> > > > Attached is the suggested GDB dump from a core file that was produced by
> > > >> > > > the crash.
> > > >> > > >
> > > >> > > >
> > > >> > > > Any ideas on how to fix this, are appreciated. Thanks!
> > > >> > > >
> > > >> > > >
> > > >> > > > Ray
> > > >> > > >
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com/professional-services/
> What's up with rsyslog? Follow https://twitter.com/rgerhards
> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
> DON'T LIKE THAT.
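
For anyone reading a long memcheck log like the one quoted in this thread, here is a minimal Python sketch (not part of rsyslog or valgrind) that condenses each "Invalid read/write" report to its faulting frame and, where valgrind reports the block as free'd, the first non-allocator frame of the freeing stack. The function name `summarize` and the regular expressions are assumptions made for illustration; they only cover the line shapes that appear in this log.

```python
import re

# Header of a memcheck error, e.g. "Invalid read of size 8".
ERROR_RE = re.compile(r"Invalid (read|write) of size (\d+)")
# Marks the start of the freeing stack, e.g.
# "Address 0x65b8d60 is 80 bytes inside a block of size 616 free'd".
BLOCK_FREED_RE = re.compile(r"inside a block of size \d+ free'd")
# A stack frame, e.g. "at 0x13A679: strmPhysWrite (stream.c:1288)".
FRAME_RE = re.compile(r"(?:at|by) 0x[0-9A-Fa-f]+: (\S+) \(([^)]*)\)")


def summarize(log_text):
    """Return one (kind, size, faulting_frame, freeing_frame) tuple per error.

    faulting_frame is the first frame after the error header.
    freeing_frame is the first frame of the freeing stack that is not in
    valgrind's malloc replacement, or None if no freeing stack is reported.
    """
    results = []
    current = None   # error record being filled in
    state = None     # "fault", "await", "free" or None
    for line in log_text.splitlines():
        match = ERROR_RE.search(line)
        if match:
            current = [match.group(1), int(match.group(2)), None, None]
            results.append(current)
            state = "fault"
            continue
        if current is None:
            continue
        if BLOCK_FREED_RE.search(line):
            state = "free"
            continue
        frame = FRAME_RE.search(line)
        if not frame:
            continue
        func, location = frame.groups()
        label = "%s (%s)" % (func, location)
        if state == "fault":
            current[2] = label
            state = "await"   # ignore the rest of the faulting stack
        elif state == "free" and not location.startswith("vg_replace_malloc"):
            current[3] = label
            state = None      # ignore the rest of the freeing stack
    return [tuple(record) for record in results]
```

Fed the raw contents of a log such as valgrind-rsyslogd-tsqweb03-20150226.log, this should pair each invalid read (in strmPhysWrite, strmCheckNextOutputFile or strmFlush) with strmDestruct (stream.c:947), i.e. one worker still writing to a stream that another worker has already destructed. The final invalid write in strmSetWCntr has no freeing stack, so its last field is None.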

