And here's the output with the leak-check=full option:

> 3049.206693904:766b700: end prop repl, pRes='2012', len 4
> 3049.206849430:766b700: prop repl 4, pRes='05', len -1
> 3049.206965668:766b700: end prop repl, pRes='05', len 2
> 3049.207120712:766b700: prop repl 4, pRes='03', len -1
> 3049.207236643:766b700: end prop repl, pRes='03', len 2
> 3049.207385859:766b700: prop repl 4, pRes='10.191.19.27', len -1
> 3049.207503212:766b700: end prop repl, pRes='10.191.19.27', len 12
> 3049.207621820:766b700: prop repl 4, pRes='authpriv', len -1
> 3049.207735189:766b700: end prop repl, pRes='authpriv', len 8
> 3049.207903492:766b700: prop repl 4, pRes='2012', len -1
> 3049.208020776:766b700: end prop repl, pRes='2012', len 4
> 3049.208175477:766b700: prop repl 4, pRes='05', len -1
> 3049.208291580:766b700: end prop repl, pRes='05', len 2
> 3049.208445428:766b700: prop repl 4, pRes='03', len -1
> 3049.208561306:766b700: end prop repl, pRes='03', len 2
> 3049.208709151:766b700: prop repl 4, pRes='204.236.165.198', len -1
> 3049.208826659:766b700: end prop repl, pRes='204.236.165.198', len 15
> 3049.208946453:766b700: prop repl 4, pRes='local2', len -1
> 3049.225906852:766b700: end prop repl, pRes='local2', len 6
> 3049.226074826:766b700: prop repl 4, pRes='2012', len -1
> 3049.226191980:766b700: end prop repl, pRes='2012', len 4
> 3049.226348188:766b700: prop repl 4, pRes='05', len -1
> 3049.226464604:766b700: end prop repl, pRes='05', len 2
> 3049.226625159:766b700: prop repl 4, pRes='03', len -1
> 3049.226741423:766b700: end prop repl, pRes='03', len 2
> 3049.226909093:766b700: prop repl 4, pRes='50.18.75.156', len -1
> 3049.227027296:766b700: end prop repl, pRes='50.18.75.156', len 12
> 3049.227146531:766b700: prop repl 4, pRes='local2', len -1
> 3049.227260506:766b700: end prop repl, pRes='local2', len 6
> 3049.227415344:766b700: prop repl 4, pRes='2012', len -1
> 3049.227531737:766b700: end prop repl, pRes='2012', len 4
> 3049.227686657:766b700: prop repl 4, pRes='05', len -1
> 3049.227823771:766b700: end prop repl, pRes='05', len 2
> 3049.227986981:766b700: prop repl 4, pRes='03', len -1
> 3049.228103631:766b700: end prop repl, pRes='03', len 2
> 3049.228232819:766b700: XXXXX:  tryDoAction 0x5a2e6d0, pnElem 5, nElem 5
> 3049.228352738:766b700: Action 0x5a2e6d0 transitioned to state: itx
> 3049.228462772:766b700: entering actionCalldoAction(), state: itx
> 3049.228570909:766b700: file to log to: FILENAME
> 3049.228697210:766b700: write to stream, pData->pStrm 0xb1a18e0, lenBuf 223
> 3049.228822219:766b700: action 0x5a2e6d0 call returned -2121
> 3049.228938493:766b700: entering actionCalldoAction(), state: itx
> 3049.246724313:766b700: file to log to: FILENAME
> 3049.246865159:766b700: write to stream, pData->pStrm 0xb399440, lenBuf 111
> 3049.246990810:766b700: action 0x5a2e6d0 call returned -2121
> 3049.247106361:766b700: entering actionCalldoAction(), state: itx
> 3049.247214322:766b700: file to log to: FILENAME
> 3049.247340965:766b700: write to stream, pData->pStrm 0xb399bc0, lenBuf 112
> 3049.247462568:766b700: action 0x5a2e6d0 call returned -2121
> 3049.247574716:766b700: entering actionCalldoAction(), state: itx
> 3049.247682310:766b700: file to log to: FILENAME
> 3049.247823611:766b700: write to stream, pData->pStrm 0x5bccd90, lenBuf 164
> 3049.247949863:766b700: action 0x5a2e6d0 call returned -2121
> 3049.248062303:766b700: entering actionCalldoAction(), state: itx
> 3049.248169734:766b700: file to log to: FILENAME
> 3049.248296637:766b700: Removed entry 2 for file 
> '/mnt/logs/50.112.37.82/authpriv-2012-05-03.log' from dynaCache.
> 3049.248427757:766b700: strm 0xaa7a1a0: file 13(authpriv-2012-05-03.log) 
> closing
> 3049.248569534:766b700: strm 0xaa7a1a0: file 13(authpriv-2012-05-03.log) 
> flush, buflen 0 (no need to flush)
> 3049.248848350:766b700: file stream local2-2012-05-03.log params: flush 
> interval 0, async write 0
> 3049.248994574:766b700: Added new entry 2 for file cache, file 
> '/mnt/logs/50.18.75.156/local2-2012-05-03.log'.
> 3049.249244122:766b700: write to stream, pData->pStrm 0x5c0cab0, lenBuf 368
> 3049.266993352:766b700: action 0x5a2e6d0 call returned -2121
> 3049.267156843:766b700: strm 0x5c0cab0: file -1(local2-2012-05-03.log) flush, 
> buflen 368
> 3049.267340310:766b700: file '/mnt/logs/50.18.75.156/local2-2012-05-03.log' 
> opened as #13 with mode 416
> 3049.267484698:766b700: strm 0x5c0cab0: opened file 
> '/mnt/logs/50.18.75.156/local2-2012-05-03.log' for WRITE as 13
> 3049.267638704:766b700: strm 0x5c0cab0: file 13 write wrote 368 bytes
> 3049.267788358:766b700: Action 0x5a2e6d0 transitioned to state: rdy
> 3049.267923620:766b700: ruleset: get iRet 0 from rule.ProcessMsg()
> 3049.268036343:766b700: ruleset.ProcessMsg() returns 0
> 3049.268158427:766b700: regular consumer finished, iret=0, szlog 1 sz phys 6
> 3049.268362726:766b700: we deleted 5 objects and enqueued 0 objects
> 3049.268483154:766b700: delete batch from store, new sizes: log 1, phys 1
> 3049.268633137:766b700: msg parser: flags 30, from 
> 'ec2-204-236-179-181.us-west-1.compute.amazonaws.com', msg 
> '<150>2012-05-03T21:33:27.973443+00:00 prod-ts-uswest1-9-i-23'
> 3049.268751986:766b700: parse using parser list 0x5a02e40 (the default list).
> 3049.268908367:766b700: Parser 'rsyslog.rfc5424' returned -2160
> 3049.269094815:766b700: Message will now be parsed by the legacy syslog 
> parser (one size fits all... ;)).
> 3049.269318477:766b700: MsgSetTAG in: len 7, pszBuf: apache:
> 3049.269516203:766b700: MsgSetTAG exit: pMsg->iLenTAG 7, pMsg->TAG.szBuf: 
> apache:
> 3049.287378551:766b700: Parser 'rsyslog.rfc3164' returned 0
> 3049.287498785:766b700: processBatch: batch of 1 elements must be processed
> 3049.287607743:766b700: Processing next rule
> 3049.287719767:766b700: testing filter, f_pmask 255
> 3049.287843919:766b700: Processing next action
> 3049.287968743:766b700: Called action(NotAllMark), processing batch[0] via 
> 'builtin-file'
> 3049.288078927:766b700: Called action(Batch), logging to builtin-file
> 3049.288248978:766b700: prop repl 4, pRes='204.236.179.181', len -1
> 3049.288367943:766b700: end prop repl, pRes='204.236.179.181', len 15
> 3049.288488240:766b700: prop repl 4, pRes='local2', len -1
> 3049.288602119:766b700: end prop repl, pRes='local2', len 6
> 3049.288758247:766b700: prop repl 4, pRes='2012', len -1
> 3049.288874567:766b700: end prop repl, pRes='2012', len 4
> 3049.289168469:766b700: prop repl 4, pRes='05', len -1
> 3049.289365269:766b700: end prop repl, pRes='05', len 2
> 3049.289601218:766b700: prop repl 4, pRes='03', len -1
> 3049.289795968:766b700: end prop repl, pRes='03', len 2
> 3049.290002018:766b700: XXXXX:  tryDoAction 0x5a2e6d0, pnElem 1, nElem 1
> 3049.307849612:766b700: Action 0x5a2e6d0 transitioned to state: itx
> 3049.307969368:766b700: entering actionCalldoAction(), state: itx
> 3049.308078202:766b700: file to log to: FILENAME
> 3049.308205615:766b700: Removed entry 6 for file 
> '/mnt/logs/184.169.236.243/authpriv-2012-05-03.log' from dynaCache.
> 3049.308330100:766b700: strm 0xaa8daa0: file 15(authpriv-2012-05-03.log) 
> closing
> 3049.308472984:766b700: strm 0xaa8daa0: file 15(authpriv-2012-05-03.log) 
> flush, buflen 120
> 3049.308623920:766b700: strm 0xaa8daa0: file 15 write wrote 120 bytes
> 3049.308874666:766b700: file stream local2-2012-05-03.log params: flush 
> interval 0, async write 0
> 3049.309152842:766b700: Added new entry 6 for file cache, file 
> '/mnt/logs/204.236.179.181/local2-2012-05-03.log'.
> 3049.309358728:766b700: write to stream, pData->pStrm 0x5c0d6d0, lenBuf 578
> 3049.309567506:766b700: action 0x5a2e6d0 call returned -2121
> 3049.309798574:766b700: strm 0x5c0d6d0: file -1(local2-2012-05-03.log) flush, 
> buflen 578
> 3049.310075699:766b700: file 
> '/mnt/logs/204.236.179.181/local2-2012-05-03.log' opened as #15 with mode 416
> 3049.310294286:766b700: strm 0x5c0d6d0: opened file 
> '/mnt/logs/204.236.179.181/local2-2012-05-03.log' for WRITE as 15
> 3049.310566768:766b700: strm 0x5c0d6d0: file 15 write wrote 578 bytes
> 3049.310804697:766b700: Action 0x5a2e6d0 transitioned to state: rdy
> 3049.319945343:766b700: ruleset: get iRet 0 from rule.ProcessMsg()
> 3049.320061350:766b700: ruleset.ProcessMsg() returns 0
> 3049.320180243:766b700: regular consumer finished, iret=0, szlog 0 sz phys 1
> 3049.320324960:766b700: we deleted 1 objects and enqueued 0 objects
> 3049.320442164:766b700: delete batch from store, new sizes: log 0, phys 0
> 3049.320565162:766b700: regular consumer finished, iret=4, szlog 0 sz phys 0
> 3049.320675938:766b700: main Q:Reg/w0: worker IDLE, waiting for work.
> 3049.473251254:966f700: XXXXXX: doRetry: iRet 0, pNsd->bAbortConn 0
> 3049.473510223:966f700: hasRcvInBuffer on nsd 0x5bc3510: pszRcvBuf (nil), 
> lenRcvBuf 0
> 3049.473700422:966f700: GnuTLS requested retry of 1 operation - executing
> 3049.725691130:966f700: XXXXXX: doRetry: iRet 0, pNsd->bAbortConn 0
> 3049.726059044:966f700: hasRcvInBuffer on nsd 0x5d9b260: pszRcvBuf (nil), 
> lenRcvBuf 0
> 3049.726266390:966f700: hasRcvInBuffer on nsd 0x5d9b5c0: pszRcvBuf (nil), 
> lenRcvBuf 0
> 3049.726464907:966f700: hasRcvInBuffer on nsd 0x5d9c430: pszRcvBuf 0xaa6fec0, 
> lenRcvBuf -1
> 3049.726663962:966f700: hasRcvInBuffer on nsd 0x5dba7f0: pszRcvBuf 0xaa82e20, 
> lenRcvBuf -1
> 3049.726868320:966f700: hasRcvInBuffer on nsd 0xace90a0: pszRcvBuf 0xb186510, 
> lenRcvBuf -1
> 3049.727069330:966f700: hasRcvInBuffer on nsd 0xad0f980: pszRcvBuf 0xb18d5e0, 
> lenRcvBuf -1
> 3049.727267081:966f700: hasRcvInBuffer on nsd 0xad2e750: pszRcvBuf 0xb1b6670, 
> lenRcvBuf -1
> 3049.727465128:966f700: hasRcvInBuffer on nsd 0xb167bc0: pszRcvBuf 0xb384ec0, 
> lenRcvBuf -1
> 3049.727672278:966f700: hasRcvInBuffer on nsd 0xb194b80: pszRcvBuf 0x5bb35c0, 
> lenRcvBuf -1
> 3049.727874922:966f700: hasRcvInBuffer on nsd 0xb363940: pszRcvBuf 0x5ba79e0, 
> lenRcvBuf -1
> 3049.728073101:966f700: hasRcvInBuffer on nsd 0xb38c3b0: pszRcvBuf 0x5be4040, 
> lenRcvBuf -1
> 3049.728270670:966f700: hasRcvInBuffer on nsd 0x5b91600: pszRcvBuf (nil), 
> lenRcvBuf 0
> 3049.728468483:966f700: hasRcvInBuffer on nsd 0x5bc3510: pszRcvBuf (nil), 
> lenRcvBuf 0
> 3049.728665936:966f700: --------<NSDSEL_PTCP> calling select, active fds (max 
> 25): 9 10 11 12 14 16 18 19 20 21 22 23 25 
> 3049.730337867:966f700: hasRcvInBuffer on nsd 0x5d9b260: pszRcvBuf (nil), 
> lenRcvBuf 0
> 3049.730531724:966f700: New connect on NSD 0x5d9b4c0.
> ==3607== Thread 3:
> ==3607== Jump to the invalid address stated on the next line
> ==3607==    at 0x0: ???
> ==3607==    by 0x6215C3C: MsgSetInputName (in /usr/lib/rsyslog/imuxsock.so)
> ==3607==    by 0x6210BD8: ??? (in /usr/lib/rsyslog/imuxsock.so)
> ==3607==    by 0x6211392: ??? (in /usr/lib/rsyslog/imuxsock.so)
> ==3607==    by 0x43B809: ??? (in /usr/sbin/rsyslogd)
> ==3607==    by 0x504A9C9: start_thread (pthread_create.c:300)
> ==3607==    by 0x7E6C6FF: ???
> ==3607==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
> ==3607== 
> ==3607== 
> ==3607== Process terminating with default action of signal 11 (SIGSEGV)
> ==3607==  Bad permissions for mapped region at address 0x0
> ==3607==    at 0x0: ???
> ==3607==    by 0x6215C3C: MsgSetInputName (in /usr/lib/rsyslog/imuxsock.so)
> ==3607==    by 0x6210BD8: ??? (in /usr/lib/rsyslog/imuxsock.so)
> ==3607==    by 0x6211392: ??? (in /usr/lib/rsyslog/imuxsock.so)
> ==3607==    by 0x43B809: ??? (in /usr/sbin/rsyslogd)
> ==3607==    by 0x504A9C9: start_thread (pthread_create.c:300)
> ==3607==    by 0x7E6C6FF: ???
> ==3607== 
> ==3607== HEAP SUMMARY:
> ==3607==     in use at exit: 681,871 bytes in 3,582 blocks
> ==3607==   total heap usage: 153,960 allocs, 150,378 frees, 12,862,358 bytes 
> allocated
> ==3607== 
> ==3607== Thread 1:
> ==3607== 272 bytes in 1 blocks are possibly lost in loss record 1,129 of 1,226
> ==3607==    at 0x4C267CC: calloc (vg_replace_malloc.c:467)
> ==3607==    by 0x4012495: _dl_allocate_tls (dl-tls.c:300)
> ==3607==    by 0x504B728: pthread_create@@GLIBC_2.2.5 (allocatestack.c:561)
> ==3607==    by 0x42BDEA: wtpAdviseMaxWorkers (in /usr/sbin/rsyslogd)
> ==3607==    by 0x432D22: qqueueEnqObj (in /usr/sbin/rsyslogd)
> ==3607==    by 0x40CE5C: submitMsg (in /usr/sbin/rsyslogd)
> ==3607==    by 0x64563D3: ??? (in /usr/lib/rsyslog/imklog.so)
> ==3607==    by 0x6456591: imklogLogIntMsg (in /usr/lib/rsyslog/imklog.so)
> ==3607==    by 0x6456713: klogWillRun (in /usr/lib/rsyslog/imklog.so)
> ==3607==    by 0x40DB80: ??? (in /usr/sbin/rsyslogd)
> ==3607==    by 0x40E14D: ??? (in /usr/sbin/rsyslogd)
> ==3607==    by 0x40F8A6: realMain (in /usr/sbin/rsyslogd)
> ==3607== 
> ==3607== 1,088 bytes in 4 blocks are possibly lost in loss record 1,176 of 
> 1,226
> ==3607==    at 0x4C267CC: calloc (vg_replace_malloc.c:467)
> ==3607==    by 0x4012495: _dl_allocate_tls (dl-tls.c:300)
> ==3607==    by 0x504B728: pthread_create@@GLIBC_2.2.5 (allocatestack.c:561)
> ==3607==    by 0x43B769: thrdCreate (in /usr/sbin/rsyslogd)
> ==3607==    by 0x40E20E: ??? (in /usr/sbin/rsyslogd)
> ==3607==    by 0x40F8A6: realMain (in /usr/sbin/rsyslogd)
> ==3607==    by 0x568BC4C: (below main) (libc-start.c:226)
> ==3607== 
> ==3607== LEAK SUMMARY:
> ==3607==    definitely lost: 0 bytes in 0 blocks
> ==3607==    indirectly lost: 0 bytes in 0 blocks
> ==3607==      possibly lost: 1,360 bytes in 5 blocks
> ==3607==    still reachable: 680,511 bytes in 3,577 blocks
> ==3607==         suppressed: 0 bytes in 0 blocks
> ==3607== Reachable blocks (those to which a pointer was found) are not shown.
> ==3607== To see them, rerun with: --leak-check=full --show-reachable=yes
> ==3607== 
> ==3607== For counts of detected and suppressed errors, rerun with: -v
> ==3607== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 43 from 7)

On May 3, 2012, at 1:02 PM, Rainer Gerhards wrote:

> Can you run it under valgrind?
> 
> Rainer
> 
> Matt Wise <[email protected]> hat geschrieben:We're using Rsyslog in our 
> inftrastructure on a bunch of Ubuntu 10.04 hosts. Due to memory and CPU leaks 
> in the 4.xx rsyslog code (we're using TLS on everything), we've had to 
> upgrade to the 5.xx series of code.
> 
> Our clients are running the Evax-built Rsyslog 5.6.3 and they generally seem 
> to work fine. Our rsyslog 'receiver' hosts were running 5.6.3, but we ran 
> into a serious memory leak that caused Rsyslog to bloat after about a day. We 
> have upgraded them to hand-built Rsyslog 5.8.10 code (and also tried 5.9.3 at 
> one point). This worked fine for about 2 days.. and now we are getting near 
> non-stop Seg Faults on the receiver host running Rsyslog.
> 
> I've tried everything I can think of, but we still have the problem.. here 
> are our configs:
> 
> /etc/rsyslog.d/custom.conf:
>> $ModLoad imtcp # load TCP listener
>> $DefaultNetstreamDriverCAFile /etc/rsyslog.d/certs/nextdoor/logd-ca.pem
>> $DefaultNetstreamDriverCertFile /etc/rsyslog.d/certs/nextdoor/nextdoor.pem
>> $DefaultNetstreamDriverKeyFile /etc/rsyslog.d/certs/nextdoor/nextdoor.key
>> $DefaultNetstreamDriver gtls
>> $InputTCPServerStreamDriverMode 1
>> $InputTCPServerStreamDriverAuthMode x509/certvalid
>> $CreateDirs on
>> $template 
>> FILENAME,"/mnt/logs/%fromhost-ip%/%syslogfacility-text%-%$year%-%$month%-%$day%.log"
>> *.* ?FILENAME
>> # Create an additional socket in postfix's chroot in order not to break
>> # mail logging when rsyslog is restarted.  If the directory is missing,
>> # rsyslog will silently skip creating the socket.
>> $AddUnixListenSocket /var/spool/postfix/dev/log
> 
> 
> /etc/rsyslog.conf
>> $ModLoad imuxsock # provides support for local system logging
>> $ModLoad imklog   # provides kernel logging support (previously done by 
>> rklogd)
>> $ModLoad immark  # provides --MARK-- message capability
>> $ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
>> $EscapeControlCharactersOnReceive off
>> $RepeatedMsgReduction off
>> $FileOwner sysadm
>> $FileGroup adm
>> $FileCreateMode 0640
>> $DirOwner sysadm
>> $DirGroup adm
>> $DirCreateMode 0755
>> $Umask 0022
>> $PreserveFQDN on
>> $IncludeConfig /etc/rsyslog.d/*.conf
> 
> 
>> rsyslogd 5.8.10, compiled with:
>>      FEATURE_REGEXP:                         Yes
>>      FEATURE_LARGEFILE:                      No
>>      GSSAPI Kerberos 5 support:              Yes
>>      FEATURE_DEBUG (debug build, slow code): No
>>      32bit Atomic operations supported:      Yes
>>      64bit Atomic operations supported:      Yes
>>      Runtime Instrumentation (slow code):    No
> 
> 
>> 2.6.32-317-ec2 #36-Ubuntu SMP Fri Jul 8 18:12:30 UTC 2011 x86_64 GNU/Linux
> 
> 
> And here's the log output of the failure.. 
>> 8901.295674722:7f04c029a700: New connect on NSD 0x1db0ee0
>> .
>> 
>> 8901.359429713:7f04c1a9d700: Message from UNIX socket: #3
>> 8901.359450981:7f04c1a9d700: XXX: pre CM loop, length of control message 32
>> 8901.359461759:7f04c1a9d700: XXX: in CM loop, 1, 2
>> 8901.359471828:7f04c1a9d700: XXX: got credentials pid 24640
>> 8901.359481718:7f04c1a9d700: XXX: post CM loop
>> 8901.359493088:7f04c1a9d700: imuxsock: no ratelimiter for pid 24640, 
>> creating one
>> 
>> Segmentation fault
> 
> The failure looks the same every single time... some message comes in, the 
> 'CM Loop' comments start, the 'rate limit' comment comes, and then the 
> segfault. 
> 
> 
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com/professional-services/
> What's up with rsyslog? Follow https://twitter.com/rgerhards
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com/professional-services/
> What's up with rsyslog? Follow https://twitter.com/rgerhards

_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards

Reply via email to