> -----Original Message-----
> From: [email protected] [mailto:rsyslog-
> [email protected]] On Behalf Of Dražen Kacar
> Sent: Wednesday, February 16, 2011 3:30 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] Race conditions and crashes
> 
> BTW, is there a way to stop rsyslog aside from sending it SIGTERM? Data
> collector doesn't like that and then I have problems with viewing results
> in the GUI.

No, and I honestly do not see why this may be a problem. There is no other
method because one shutdown method seemed sufficient...

Rainer

> 
> Rainer Gerhards wrote:
> > Just a quick note, will go through your mail in full later. I regularly use
> > valgrind, which is obviously different, and ran the clang static analyzer in
> > December (or January?) on the code, with a number of minor fixes. I am aware
> > there is a race somewhere and I have been trying to find it for a while now.
> > So far, we have been unable to reproduce it in the lab. The bugzilla has a
> > couple of entries plus additional information.
> >
> > Rainer
> >
> > > -----Original Message-----
> > > From: [email protected] [mailto:rsyslog-
> > > [email protected]] On Behalf Of Dražen Kacar
> > > Sent: Wednesday, February 16, 2011 1:17 PM
> > > To: [email protected]
> > > Subject: [rsyslog] Race conditions and crashes
> > >
> > > Hello.
> > >
> > > > I have rsyslog 5.6.2 (+ patches for blocking FIFO write and setting
> > > > thread scheduling class) on CentOS 5.5 (64-bit) and I have a number of
> > > > crashes. Since 2011-02-02 there were 27 SIGSEGVs and 35 SIGABRTs on one
> > > > of the machines in the cluster.
> > >
> > > SIGABRTs are generated by glibc:
> > >
> > > > *** glibc detected *** /opt/bulb/sbin/rsyslogd: double free or corruption (fasttop): 0x00002aaab02bc4c0 ***
> > >
> > > > SIGSEGVs are the usual NULL pointer accesses. I didn't check all core
> > > > files, but the ones I checked had that condition.
> > >
> > > > I decided to run rsyslog through Sun's Data Race analyzer[1] and it
> > > > found a few problems. The tool is free and it runs under Linux as well,
> > > > but it brings Sun's compiler, which doesn't handle all of gcc's
> > > > extensions, so I had to change the code to make it compile. The patch
> > > > is attached. It adds members to empty structs in a few places.
> > >
> > > > Since that compiler doesn't have gcc atomic access builtins, config.h
> > > > contains this:
> > >
> > > /* Define if compiler provides atomic builtins */
> > > /* #undef HAVE_ATOMIC_BUILTINS */
> > >
> > > /* Define if compiler provides 64 bit atomic builtins */
> > > /* #undef HAVE_ATOMIC_BUILTINS_64BIT */
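
With both macros undefined, shared counters need some other form of protection. The following is a minimal sketch of that pattern, not rsyslog's actual code: it uses gcc's `__sync_add_and_fetch` builtin when `HAVE_ATOMIC_BUILTINS` is defined and falls back to a pthread mutex otherwise. Only the macro name comes from the quoted config.h; `counter_t`, `counter_inc` and `counter_demo` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <pthread.h>

/* HAVE_ATOMIC_BUILTINS is the config.h macro quoted above. All other names
 * (counter_t, counter_inc, counter_demo, ...) are hypothetical. */
typedef struct {
    int value;
#ifndef HAVE_ATOMIC_BUILTINS
    pthread_mutex_t mut; /* guards value when no atomic builtins exist */
#endif
} counter_t;

void counter_init(counter_t *c)
{
    c->value = 0;
#ifndef HAVE_ATOMIC_BUILTINS
    pthread_mutex_init(&c->mut, NULL);
#endif
}

/* Increment and return the new value; safe to call from several threads
 * on either code path. */
int counter_inc(counter_t *c)
{
#ifdef HAVE_ATOMIC_BUILTINS
    return __sync_add_and_fetch(&c->value, 1);
#else
    int v;
    pthread_mutex_lock(&c->mut);
    v = ++c->value;
    pthread_mutex_unlock(&c->mut);
    return v;
#endif
}

typedef struct {
    counter_t *c;
    int iters;
} job_t;

static void *worker(void *arg)
{
    job_t *job = arg;
    for (int i = 0; i < job->iters; ++i)
        counter_inc(job->c);
    return NULL;
}

/* Run nthreads workers (at most 16), each incrementing iters times,
 * and return the final count. */
int counter_demo(int nthreads, int iters)
{
    pthread_t tid[16];
    counter_t c;
    job_t job = { &c, iters };

    assert(nthreads > 0 && nthreads <= 16);
    counter_init(&c);
    for (int i = 0; i < nthreads; ++i) {
        int rc = pthread_create(&tid[i], NULL, worker, &job);
        assert(rc == 0);
    }
    for (int i = 0; i < nthreads; ++i)
        pthread_join(tid[i], NULL);
#ifndef HAVE_ATOMIC_BUILTINS
    pthread_mutex_destroy(&c.mut);
#endif
    return c.value; /* plain read is safe: all workers have been joined */
}
```

Calling `counter_demo(4, 10000)` should return 40000 on either path. If an increment were left as a bare `++c->value` in the fallback build, updates could be lost and a race detector would flag the access.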
> > >
> > > > My test was receiving 4 lines via UDP and writing them to a file and a
> > > > FIFO. It was as simple as I could make it. Thread scheduling class was
> > > > not set.
> > >
> > > The tool found the following problems:
> > >
> > > Total Races:  4 Experiment:  exp1.er
> > >
> > > Race #1, Vaddr: 0x13909168
> > >       Access 1: Read,  GetNxt + 0x0000008A,
> > >                        line 346 in "modules.c"
> > >       Access 2: Write, addModToList + 0x00000131,
> > >                        line 326 in "modules.c"
> > >   Total Callstack Traces: 1
> > >
> > > Race #2, Vaddr: (Multiple Addresses)
> > >       Access 1: Read,  wtpShutdownAll + 0x00000371,
> > >                        line 247 in "wtp.c"
> > >       Access 2: Write, wtpWrkrExecCleanup + 0x000000F2,
> > >                        line 310 in "wtp.c"
> > >   Total Callstack Traces: 2
> > >
> > > Race #3, Vaddr: (Multiple Addresses)
> > >       Access 1: Read,  thrdDestruct + 0x00000058,
> > >                        line 76 in "threads.c"
> > >       Access 2: Write, thrdStarter + 0x000001A2,
> > >                        line 197 in "threads.c"
> > >   Total Callstack Traces: 1
> > >
> > > Race #4, Vaddr: 0x1394764c
> > >       Access 1: Read,  processSocket + 0x000000FE,
> > >                        line 314 in "imudp.c"
> > >       Access 2: Write, thrdTerminateNonCancel + 0x000000CC,
> > >                        line 100 in "threads.c"
> > >   Total Callstack Traces: 1
> > >
> > >
> > > > What it found really are unprotected memory accesses (i.e. bugs), but
> > > > all of them are in insignificant places:
> > >
> > > race #1 - module loading
> > > race #2 - shutdown all workers
> > > > race #3 - thread destructor (this one might be responsible for something)
> > > race #4 - thread termination on SIGTTIN
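
Race #4 above has the typical shape of a stop flag that one thread writes while another reads it in its receive loop. The following is a minimal sketch of one way to make such a flag race-free, assuming the shared word really is a stop flag (the report only gives addresses and line numbers). It is not rsyslog's code, and all names (`worker_t`, `worker_request_stop`, `stop_flag_demo`) are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* All names here are hypothetical; this is not rsyslog's code. */
typedef struct {
    pthread_mutex_t mut; /* guards shall_stop */
    int shall_stop;      /* written by the controller, read by the worker */
    long iterations;     /* touched only by the worker until it is joined */
} worker_t;

/* Controller side: ask the worker to leave its loop. */
void worker_request_stop(worker_t *w)
{
    pthread_mutex_lock(&w->mut);
    w->shall_stop = 1;
    pthread_mutex_unlock(&w->mut);
}

/* Worker side: read the flag under the same mutex. */
int worker_should_stop(worker_t *w)
{
    int v;
    pthread_mutex_lock(&w->mut);
    v = w->shall_stop;
    pthread_mutex_unlock(&w->mut);
    return v;
}

static void *receiver_loop(void *arg)
{
    worker_t *w = arg;
    while (!worker_should_stop(w)) {
        ++w->iterations; /* stands in for receiving one message */
        sched_yield();
    }
    return NULL;
}

/* Start a worker, request a stop, wait for it to exit.
 * Returns the flag value as seen after the join. */
int stop_flag_demo(void)
{
    worker_t w;
    pthread_t tid;
    int rc;

    pthread_mutex_init(&w.mut, NULL);
    w.shall_stop = 0;
    w.iterations = 0;

    rc = pthread_create(&tid, NULL, receiver_loop, &w);
    assert(rc == 0);
    worker_request_stop(&w);
    rc = pthread_join(tid, NULL);
    assert(rc == 0);

    pthread_mutex_destroy(&w.mut);
    return w.shall_stop; /* plain read is safe: the worker has been joined */
}
```

Because both the write and the read go through the same mutex, a race detector should no longer report the flag. An atomic load and store would serve equally well where the compiler provides them.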
> > >
> > >
> > > > My production system is a bit more complicated than that. It has UDP
> > > > and TCP receivers and a few more threads created than the test system.
> > > > I suppose I could test some more and try to find errors in other
> > > > places, but before I do I'd like to know if anyone else has used tools
> > > > of this kind on rsyslog. And if so, what the results were.
> > >
> > > > [1] http://download.oracle.com/docs/cd/E19205-01/821-2124/index.html
> > >
> > > --
> > >  .-.   .-.    Yes, I am an agent of Satan, but my duties are
> largely
> > > (_  \ /  _)   ceremonial.
> > >      |
> > >      |        [email protected]
> > _______________________________________________
> > rsyslog mailing list
> > http://lists.adiscon.net/mailman/listinfo/rsyslog
> > http://www.rsyslog.com
> 
