This is what valgrind has to say about rsyslog:

==19758== Memcheck, a memory error detector
==19758== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==19758== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==19758== Command: rsyslogd
==19758==
==19758== HEAP SUMMARY:
==19758==     in use at exit: 83,791 bytes in 1,675 blocks
==19758==   total heap usage: 2,392 allocs, 717 frees, 163,024 bytes
allocated
==19758==
==19758== 160 (40 direct, 120 indirect) bytes in 1 blocks are definitely
lost in loss record 1,075 of 1,131
==19758==    at 0x4026775: malloc (vg_replace_malloc.c:291)
==19758==    by 0x42164C3: nss_parse_service_list (nsswitch.c:622)
==19758==    by 0x4216C06: __nss_database_lookup (nsswitch.c:164)
==19758==    by 0x4713EAB: ???
==19758==    by 0x4714F84: ???
==19758==    by 0x41CE9C4: getpwnam_r@@GLIBC_2.1.2 (getXXbyYY_r.c:253)
==19758==    by 0x80939AF: doGetUID (cfsysline.c:419)
==19758==    by 0x8092AB5: processCfSysLineCommand (cfsysline.c:758)
==19758==    by 0x806F7DA: cfsysline (conf.c:215)
==19758==    by 0x80702B3: cnfDoCfsysline (rsconf.c:448)
==19758==    by 0x8063827: yylex (lexer.l:237)
==19758==    by 0x80606F4: yyparse (grammar.c:1505)
==19758==
==19758== 160 (40 direct, 120 indirect) bytes in 1 blocks are definitely
lost in loss record 1,076 of 1,131
==19758==    at 0x4026775: malloc (vg_replace_malloc.c:291)
==19758==    by 0x42164C3: nss_parse_service_list (nsswitch.c:622)
==19758==    by 0x4216C06: __nss_database_lookup (nsswitch.c:164)
==19758==    by 0x4712F2B: ???
==19758==    by 0x4713A64: ???
==19758==    by 0x41CD834: getgrnam_r@@GLIBC_2.1.2 (getXXbyYY_r.c:253)
==19758==    by 0x80937AD: doGetGID (cfsysline.c:369)
==19758==    by 0x8092AB5: processCfSysLineCommand (cfsysline.c:758)
==19758==    by 0x806F7DA: cfsysline (conf.c:215)
==19758==    by 0x80702B3: cnfDoCfsysline (rsconf.c:448)
==19758==    by 0x8063827: yylex (lexer.l:237)
==19758==    by 0x80606F4: yyparse (grammar.c:1505)
==19758==
==19758== LEAK SUMMARY:
==19758==    definitely lost: 80 bytes in 2 blocks
==19758==    indirectly lost: 240 bytes in 20 blocks
==19758==      possibly lost: 0 bytes in 0 blocks
==19758==    still reachable: 83,471 bytes in 1,653 blocks
==19758==         suppressed: 0 bytes in 0 blocks
==19758== Reachable blocks (those to which a pointer was found) are not
shown.
==19758== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==19758==
==19758== For counts of detected and suppressed errors, rerun with: -v
==19758== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 61 from 11)

ps -ef | grep rsyslog
syslog   19759     1  0 01:47 ?        00:00:00 valgrind --tool=memcheck
--leak-check=full rsyslogd

sudo kill 19759

 ==19759==
==19759== HEAP SUMMARY:
==19759==     in use at exit: 5,000 bytes in 95 blocks
==19759==   total heap usage: 4,332 allocs, 4,237 frees, 335,520 bytes
allocated
==19759==
==19759== 73 (64 direct, 9 indirect) bytes in 1 blocks are definitely lost
in loss record 79 of 90
==19759==    at 0x4025509: calloc (vg_replace_malloc.c:618)
==19759==    by 0x8092040: ratelimitNew (ratelimit.c:277)
==19759==    by 0x46A67C9: ???
==19759==    by 0x807087F: activate (rsconf.c:657)
==19759==    by 0x8055E17: realMain (syslogd.c:1161)
==19759==    by 0x805643F: main (syslogd.c:2067)
==19759==
==19759== 128 bytes in 1 blocks are definitely lost in loss record 83 of 90
==19759==    at 0x4026775: malloc (vg_replace_malloc.c:291)
==19759==    by 0x40268FD: realloc (vg_replace_malloc.c:687)
==19759==    by 0x807CEE3: rsCStrExtendBuf (stringbuf.c:255)
==19759==    by 0x8092D3B: getWord (stringbuf.h:80)
==19759==    by 0x8092DC8: doGetWord (cfsysline.c:531)
==19759==    by 0x8092AB5: processCfSysLineCommand (cfsysline.c:758)
==19759==    by 0x806F7DA: cfsysline (conf.c:215)
==19759==    by 0x80702B3: cnfDoCfsysline (rsconf.c:448)
==19759==    by 0x8063827: yylex (lexer.l:237)
==19759==    by 0x80606F4: yyparse (grammar.c:1505)
==19759==    by 0x8072A8C: load (rsconf.c:1285)
==19759==    by 0x8055CEC: realMain (syslogd.c:1970)
==19759==
==19759== 160 (40 direct, 120 indirect) bytes in 1 blocks are definitely
lost in loss record 84 of 90
==19759==    at 0x4026775: malloc (vg_replace_malloc.c:291)
==19759==    by 0x42164C3: nss_parse_service_list (nsswitch.c:622)
==19759==    by 0x4216C06: __nss_database_lookup (nsswitch.c:164)
==19759==    by 0x4713EAB: ???
==19759==    by 0x4714F84: ???
==19759==    by 0x41CE9C4: getpwnam_r@@GLIBC_2.1.2 (getXXbyYY_r.c:253)
==19759==    by 0x80939AF: doGetUID (cfsysline.c:419)
==19759==    by 0x8092AB5: processCfSysLineCommand (cfsysline.c:758)
==19759==    by 0x806F7DA: cfsysline (conf.c:215)
==19759==    by 0x80702B3: cnfDoCfsysline (rsconf.c:448)
==19759==    by 0x8063827: yylex (lexer.l:237)
==19759==    by 0x80606F4: yyparse (grammar.c:1505)
==19759==
==19759== 160 (40 direct, 120 indirect) bytes in 1 blocks are definitely
lost in loss record 85 of 90
==19759==    at 0x4026775: malloc (vg_replace_malloc.c:291)
==19759==    by 0x42164C3: nss_parse_service_list (nsswitch.c:622)
==19759==    by 0x4216C06: __nss_database_lookup (nsswitch.c:164)
==19759==    by 0x4712F2B: ???
==19759==    by 0x4713A64: ???
==19759==    by 0x41CD834: getgrnam_r@@GLIBC_2.1.2 (getXXbyYY_r.c:253)
==19759==    by 0x80937AD: doGetGID (cfsysline.c:369)
==19759==    by 0x8092AB5: processCfSysLineCommand (cfsysline.c:758)
==19759==    by 0x806F7DA: cfsysline (conf.c:215)
==19759==    by 0x80702B3: cnfDoCfsysline (rsconf.c:448)
==19759==    by 0x8063827: yylex (lexer.l:237)
==19759==    by 0x80606F4: yyparse (grammar.c:1505)
==19759==
==19759== 2,076 (32 direct, 2,044 indirect) bytes in 1 blocks are
definitely lost in loss record 90 of 90
==19759==    at 0x4026775: malloc (vg_replace_malloc.c:291)
==19759==    by 0x809A825: create_hashtable (hashtable.c:52)
==19759==    by 0x46A4DE9: ???
==19759==    by 0x8083589: doModInit (modules.c:575)
==19759==    by 0x8084091: Load (modules.c:1168)
==19759==    by 0x806FA72: doModLoad (conf.c:125)
==19759==    by 0x8092625: doCustomHdlr (cfsysline.c:101)
==19759==    by 0x8092AB5: processCfSysLineCommand (cfsysline.c:758)
==19759==    by 0x806F7DA: cfsysline (conf.c:215)
==19759==    by 0x80702B3: cnfDoCfsysline (rsconf.c:448)
==19759==    by 0x8063827: yylex (lexer.l:237)
==19759==    by 0x80606F4: yyparse (grammar.c:1505)
==19759==
==19759== LEAK SUMMARY:
==19759==    definitely lost: 304 bytes in 5 blocks
==19759==    indirectly lost: 2,293 bytes in 34 blocks
==19759==      possibly lost: 0 bytes in 0 blocks
==19759==    still reachable: 2,403 bytes in 56 blocks
==19759==         suppressed: 0 bytes in 0 blocks
==19759== Reachable blocks (those to which a pointer was found) are not
shown.
==19759== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==19759==
==19759== For counts of detected and suppressed errors, rerun with: -v
==19759== ERROR SUMMARY: 5 errors from 5 contexts (suppressed: 63 from 11)

This was only from running rsyslog for less than a minute. :)
Thanks,

Reinoud.




On Wed, Jul 23, 2014 at 1:11 AM, David Lang <[email protected]> wrote:

> On Tue, 22 Jul 2014, Micah Yoder wrote:
>
>  On 7/22/14, 3:21 PM, David Lang wrote:
>>
>>  Well, that indicates that you have a problem with your dynafile
>>> configuration, you don't allow enough files to be open, lookup
>>> dynafilecachesize and increase it to be much larger
>>>
>>> With this sort of problem, your performance is going to be horrid, and
>>> as such, your queues will grow. Not seeing your impstats output, but
>>> I'll bet the size of the queues is large, and if you mutiply those sizes
>>> by the maxmessagesize, you will probably end up accounting for a lot of
>>> the memory that you think is 'leaking'
>>>
>>
>> Thanks, I will check that.  The "main queue" has size=2, but this one is
>> rather different:
>>
>> Tue Jul 22 09:57:51 2014: action 16 queue: size=438266 enqueued=438268
>> full=0 discarded.full=0 discarded.nf=0 maxqsize=438266
>>
>> Any idea how one goes about finding which action line this corresponds to?
>>
>
> well, it's the 16th action in the config file, if you set name='value' in
> your action() statements it will replace 'action #' with the name.
> depending on how many actions you have, it may be easier to count up from
> the bottom of the file.
>
>
>  Also, if this accounted for the majority of the memory, each item would
>> have to consume 5-10k. Does that sound right?  (The vast majority of our
>> log lines are much less than that, but it's possible that something
>> sends a stream of logs that large.)
>>
>
> the datastructures are based on the max message size, not the size of the
> actual message (at least as I understand it), and you have to have both the
> raw message and all the parsed components, so it's not unreasonable to hit
> 5-10K of RAM per message if the default message size is 4K (I know it's at
> least 2K and I believe it's 4K)
>
> Also, if you are using the default FixedArray size for your main message
> queue, it's eating up it's max size even when empty.
>
> while the exact details are murky, it really looks like this is your
> problem. In any case, it's clear that you have a big problem with that
> action, it's only processed two messages while getting 438K sent to it.
>
> David Lang
>
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com/professional-services/
> What's up with rsyslog? Follow https://twitter.com/rgerhards
> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
> DON'T LIKE THAT.
>
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.

Reply via email to