BTW, is there a way to stop rsyslog aside from sending it SIGTERM? The data collector doesn't like that, and then I have problems viewing the results in the GUI.
Rainer Gerhards wrote:
> Just a quick note, will go through your mail in full later. I regularly use
> valgrind, which is obviously different, and ran the clang static analyzer in
> December (or January?) on the code, with a number of minor fixes. I am aware
> there is a race somewhere and I am trying to find it for a while now. So far,
> we have been unable to reproduce it in lab. The bugzilla has a couple of
> entries plus additional information.
>
> Rainer
>
> > -----Original Message-----
> > From: [email protected] [mailto:rsyslog-
> > [email protected]] On Behalf Of Dražen Kacar
> > Sent: Wednesday, February 16, 2011 1:17 PM
> > To: [email protected]
> > Subject: [rsyslog] Race conditions and crashes
> >
> > Hello.
> >
> > I have rsyslog 5.6.2 (+ patches for blocking FIFO write and setting thread
> > scheduling class) on CentOS 5.5 (64-bit) and I have a number of crashes.
> > Since 2011-02-02 there were 27 SIGSEGVs and 35 SIGABRTs on one of the
> > machines in the cluster.
> >
> > SIGABRTs are generated by glibc:
> >
> > *** glibc detected *** /opt/bulb/sbin/rsyslogd: double free or corruption
> > (fasttop): 0x00002aaab02bc4c0 ***
> >
> > SIGSEGVs are the usual NULL pointer accesses. I didn't check all core
> > files, but the ones I checked had that condition.
> >
> > I decided to run rsyslog through Sun's Data Race analyzer[1] and it found
> > a few problems. The tool is free and it runs under Linux as well, but it
> > brings Sun's compiler which doesn't handle all of gcc extensions, so I had
> > to change the code to make it compile. The patch is attached. It adds
> > members to empty structs in a few places.
> >
> > Since that compiler doesn't have gcc atomic access builtins, config.h
> > contains this:
> >
> > /* Define if compiler provides atomic builtins */
> > /* #undef HAVE_ATOMIC_BUILTINS */
> >
> > /* Define if compiler provides 64 bit atomic builtins */
> > /* #undef HAVE_ATOMIC_BUILTINS_64BIT */
> >
> > My test was receiving 4 lines via UDP and writing them to a file and a
> > FIFO. It was as simple as I could make it. Thread scheduling class was not
> > set.
> >
> > The tool found the following problems:
> >
> > Total Races: 4 Experiment: exp1.er
> >
> > Race #1, Vaddr: 0x13909168
> >       Access 1: Read,  GetNxt + 0x0000008A,
> >                        line 346 in "modules.c"
> >       Access 2: Write, addModToList + 0x00000131,
> >                        line 326 in "modules.c"
> >   Total Callstack Traces: 1
> >
> > Race #2, Vaddr: (Multiple Addresses)
> >       Access 1: Read,  wtpShutdownAll + 0x00000371,
> >                        line 247 in "wtp.c"
> >       Access 2: Write, wtpWrkrExecCleanup + 0x000000F2,
> >                        line 310 in "wtp.c"
> >   Total Callstack Traces: 2
> >
> > Race #3, Vaddr: (Multiple Addresses)
> >       Access 1: Read,  thrdDestruct + 0x00000058,
> >                        line 76 in "threads.c"
> >       Access 2: Write, thrdStarter + 0x000001A2,
> >                        line 197 in "threads.c"
> >   Total Callstack Traces: 1
> >
> > Race #4, Vaddr: 0x1394764c
> >       Access 1: Read,  processSocket + 0x000000FE,
> >                        line 314 in "imudp.c"
> >       Access 2: Write, thrdTerminateNonCancel + 0x000000CC,
> >                        line 100 in "threads.c"
> >   Total Callstack Traces: 1
> >
> >
> > What it found really are unprotected memory accesses (i.e. bugs), but all
> > of them are in insignificant places:
> >
> > race #1 - module loading
> > race #2 - shutdown all workers
> > race #3 - thread destructor (this one might be responsible for something)
> > race #4 - thread termination on SIGTTIN
> >
> >
> > My production system is a bit more complicated than that. It has UDP and
> > TCP receivers and a few more threads created than the test system.
> > I suppose I could test some more and try to find errors in other places,
> > but before I do I'd like to know if anyone else used tools of this kind on
> > rsyslog. And if so, what the results were.
> >
> > [1] http://download.oracle.com/docs/cd/E19205-01/821-2124/index.html
> >
> > --
> >  .-.   .-.    Yes, I am an agent of Satan, but my duties are largely
> > (_  \ /  _)   ceremonial.
> >      |
> >      |        [email protected]
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com
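
For anyone reading the race report above: each entry has the same shape, a read in one thread and a write in another to the same address with no synchronization between them. Below is a minimal sketch of that shape and a mutex-based fix, for a build without atomic builtins. It is not rsyslog code. The names worker_t, worker_request_stop, worker_should_stop, worker_main and run_demo are made up for illustration, and the stop flag is only an assumed stand-in for the kind of variable such a report points at.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical worker state; NOT rsyslog's actual thread structure. */
typedef struct {
    pthread_mutex_t lock;      /* guards should_stop */
    bool should_stop;          /* written by the controller, read by the worker */
    unsigned long iterations;  /* touched only by the worker until it is joined */
} worker_t;

static void worker_init(worker_t *w)
{
    pthread_mutex_init(&w->lock, NULL);
    w->should_stop = false;
    w->iterations = 0;
}

/* Controller side: the write a race detector would flag if done unlocked. */
static void worker_request_stop(worker_t *w)
{
    pthread_mutex_lock(&w->lock);
    w->should_stop = true;
    pthread_mutex_unlock(&w->lock);
}

/* Worker side: the matching read, taken under the same lock. */
static bool worker_should_stop(worker_t *w)
{
    pthread_mutex_lock(&w->lock);
    bool stop = w->should_stop;
    pthread_mutex_unlock(&w->lock);
    return stop;
}

/* Thread body: loops until the controller asks it to stop. */
static void *worker_main(void *arg)
{
    worker_t *w = arg;
    while (!worker_should_stop(w)) {
        w->iterations++;  /* stand-in for receiving and processing one message */
    }
    return NULL;
}

/* Starts one worker, asks it to stop, joins it. Returns 0 on success. */
static int run_demo(void)
{
    worker_t w;
    pthread_t tid;

    worker_init(&w);
    if (pthread_create(&tid, NULL, worker_main, &w) != 0)
        return -1;
    worker_request_stop(&w);
    if (pthread_join(tid, NULL) != 0)
        return -1;
    assert(w.should_stop);  /* safe to read directly: the worker has exited */
    pthread_mutex_destroy(&w.lock);
    return 0;
}
```

Calling run_demo() from main() and building with -pthread is enough to try it. Removing the lock/unlock pairs from the two accessor functions should give a race detector a read/write pair of the same kind as in the report.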

