Thanks for the excellent bug report. I have one idea based on a case I had in the past. Could it be that the GCC version on CentOS does not properly support atomic instructions, or is set to a too-old version? I have one other case open, and I think it was on CentOS. If atomics do not work right, nothing works well at all in rsyslog. Note that the build system seems to be unable to detect this correctly, as can be seen in this bug tracker entry:
http://bugzilla.adiscon.com/show_bug.cgi?id=202

Could you please rebuild it with -march=i586 added to the CFLAGS? If my idea is right, it will probably run with that.

Thanks,
Rainer

> -----Original Message-----
> From: [email protected] [mailto:rsyslog-[email protected]] On Behalf Of Dražen Kacar
> Sent: Friday, November 12, 2010 1:38 PM
> To: [email protected]
> Subject: [rsyslog] SIGSEGV because of double free in msgDestruct
>
> Hello.
>
> I have rsyslog 5.6.0 on CentOS 5.5 with a slightly complex configuration
> and it's crashing. The complete configuration file is attached. The crash
> is perfectly reproducible and it happens very soon after the data starts
> arriving. The program was started with:
>
> rsyslogd -c5 -x -f rsyslog-datasink.conf
>
> I have two queues in order to have two thread pools. Input queue just
> takes the message from UDP or TCP socket and uses omruleset to pass it to
> the output queue. The output queue then uses omprog to pass the message to
> the external program. Omprog blocks when the pipe to the external program
> is full, so I wanted to have unblocked threads to accept incoming messages
> (which will mostly use UDP). Hence, the configuration has two queues.
>
> It's possible that I made some error in the configuration and rsyslogd is
> crashing because I'm doing something that I wasn't supposed to do, but it
> didn't detect the faulty configuration early on.
>
> The whole thing works fine when I have only one queue (created with
> $Ruleset) and input and omprog modules on it. But I'd really like to use
> two thread pools. It should be possible to reproduce this problem with cat
> as the omprog binary, although I haven't tried.
>
> One curiosity (probably unrelated to the problem): $GenerateConfigGraph at
> the end of the config file creates a picture which has only the main
> queue, but the queues I configured with $Ruleset directives are not on the
> picture.
>
> The below is from gdb. The process was started from gdb, so there's no
> call to sigsegvHdlr(), which can be seen in the core file when I start
> rsyslogd on its own.
>
> (gdb) info threads
> * 8 Thread 0xb4debb90 (LWP 11149)  ConsumerReg (pThis=0x80b7988,
>     pWti=0x80b7cb8) at queue.c:1679
>   7 Thread 0xb57ecb90 (LWP 11148)  0x00d46402 in __kernel_vsyscall ()
>   6 Thread 0xb61edb90 (LWP 11147)  msgDestruct (ppThis=0xb61ed1d4) at
>     msg.c:790
>   5 Thread 0xb6beeb90 (LWP 11146)  0x00d46402 in __kernel_vsyscall ()
>   4 Thread 0xb75efb90 (LWP 11145)  0x00d46402 in __kernel_vsyscall ()
>   3 Thread 0xb7ff0b90 (LWP 11144)  0x00d46402 in __kernel_vsyscall ()
>   2 Thread 0xb7ff1ac0 (LWP 11111)  0x00d46402 in __kernel_vsyscall ()
> (gdb) bt
> #0  0x00d46402 in __kernel_vsyscall ()
> #1  0x00b2f040 in raise () from /lib/i686/nosegneg/libc.so.6
> #2  0x00b30a21 in abort () from /lib/i686/nosegneg/libc.so.6
> #3  0x00b67e3b in __libc_message () from /lib/i686/nosegneg/libc.so.6
> #4  0x00b70758 in free () from /lib/i686/nosegneg/libc.so.6
> #5  0x080612ee in msgDestruct (ppThis=0xb4deb1d4) at msg.c:816
> #6  0x08079e35 in DeleteProcessedBatch (pThis=0x80b7988, pBatch=0x80b7cd0)
>     at queue.c:1404
> #7  0x0807a3b9 in DequeueConsumableElements (pThis=0x80b7988, pWti=0x80b7cb8)
>     at queue.c:1441
> #8  DequeueConsumable (pThis=0x80b7988, pWti=0x80b7cb8) at queue.c:1489
> #9  0x0807a5d7 in DequeueForConsumer (pThis=0x80b7988, pWti=0x80b7cb8)
>     at queue.c:1626
> #10 ConsumerReg (pThis=0x80b7988, pWti=0x80b7cb8) at queue.c:1679
> #11 0x0807350e in wtiWorker (pThis=0x80b7cb8) at wti.c:315
> #12 0x08072e1f in wtpWorker (arg=0x80b7cb8) at wtp.c:381
> #13 0x00c9b869 in start_thread () from /lib/i686/nosegneg/libpthread.so.0
> #14 0x00bd9e9e in clone () from /lib/i686/nosegneg/libc.so.6
>
> The crash happens in msgDestruct() when it tries to free
> pThis->rcvFrom.pfrominet. Valgrind says it's a double free problem.
>
> The queue mutex used by DequeueForConsumer seems to be properly locked by
> thread 8. From stack frame 10:
>
> (gdb) p *pThis->mut
> $261 = {__data = {__lock = 2, __count = 0, __owner = 11149, __kind = 0,
>     __nusers = 1, {__spins = 0, __list = {__next = 0x0}}},
>   __size = "\002\000\000\000\000\000\000\000\215+\000\000\000\000\000\000\001\000\000\000\000\000\000",
>   __align = 2}
>
> The value for __lock is curious. It's usually 1 for locked or 0 for
> unlocked, but it might have something to do with gdb. It's 1 in the core
> files. pThis->mutThrdMgmt is unlocked.
>
> I've checked omruleset code and it does a proper deep copy, as far as I
> can tell. All the code in msg.c also seems fine. So I don't know what's
> happening.
>
> --
>  .-.   .-.    Yes, I am an agent of Satan, but my duties are largely
> (_  \ /  _)   ceremonial.
>      |
>      |        [email protected]
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com
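
The atomics theory from the reply at the top can also be checked outside of rsyslog. What follows is only a minimal sketch, assuming the atomics in question are GCC's `__sync_*` builtins. The names `run_atomic_check`, `worker`, `worker_arg` and `shared_counter` are made up for this illustration and are not rsyslog code. Several threads increment one shared counter through `__sync_fetch_and_add`, and the final value is compared with the expected total.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-alone check; none of these names exist in rsyslog. */
#define MAX_CHECK_THREADS 16

static int shared_counter;

struct worker_arg {
    int iters; /* number of increments each thread performs */
};

static void *worker(void *p)
{
    const struct worker_arg *arg = p;
    int i;

    for (i = 0; i < arg->iters; i++)
        __sync_fetch_and_add(&shared_counter, 1); /* GCC atomic builtin */
    return NULL;
}

/*
 * Starts nthreads threads that each add 1 to the shared counter iters
 * times. Returns the final counter value, or -1 if the arguments are out
 * of range or a thread could not be started.
 */
int run_atomic_check(int nthreads, int iters)
{
    pthread_t tid[MAX_CHECK_THREADS];
    struct worker_arg arg;
    int i;
    int started = 0;

    if (nthreads < 1 || nthreads > MAX_CHECK_THREADS || iters < 0)
        return -1;

    arg.iters = iters;
    shared_counter = 0;

    for (i = 0; i < nthreads; i++) {
        if (pthread_create(&tid[i], NULL, worker, &arg) != 0)
            break;
        started++;
    }
    /* Join whatever was started, even on partial failure. */
    for (i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    if (started != nthreads)
        return -1;

    /* Adding 0 atomically reads the current value. */
    return __sync_fetch_and_add(&shared_counter, 0);
}
```

Called from a small `main` and built with `gcc -pthread`, `run_atomic_check(4, 100000)` should return 400000 when the builtins work. On a 32-bit x86 compiler that targets plain i386, the build may instead fail at link time with an undefined reference to `__sync_fetch_and_add_4`; adding `-march=i586` to the compiler flags, as suggested for rsyslog itself, would be the thing to compare against.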

