This is what valgrind has to say about rsyslog: ==19758== Memcheck, a memory error detector ==19758== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al. ==19758== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info ==19758== Command: rsyslogd ==19758== ==19758== HEAP SUMMARY: ==19758== in use at exit: 83,791 bytes in 1,675 blocks ==19758== total heap usage: 2,392 allocs, 717 frees, 163,024 bytes allocated ==19758== ==19758== 160 (40 direct, 120 indirect) bytes in 1 blocks are definitely lost in loss record 1,075 of 1,131 ==19758== at 0x4026775: malloc (vg_replace_malloc.c:291) ==19758== by 0x42164C3: nss_parse_service_list (nsswitch.c:622) ==19758== by 0x4216C06: __nss_database_lookup (nsswitch.c:164) ==19758== by 0x4713EAB: ??? ==19758== by 0x4714F84: ??? ==19758== by 0x41CE9C4: getpwnam_r@@GLIBC_2.1.2 (getXXbyYY_r.c:253) ==19758== by 0x80939AF: doGetUID (cfsysline.c:419) ==19758== by 0x8092AB5: processCfSysLineCommand (cfsysline.c:758) ==19758== by 0x806F7DA: cfsysline (conf.c:215) ==19758== by 0x80702B3: cnfDoCfsysline (rsconf.c:448) ==19758== by 0x8063827: yylex (lexer.l:237) ==19758== by 0x80606F4: yyparse (grammar.c:1505) ==19758== ==19758== 160 (40 direct, 120 indirect) bytes in 1 blocks are definitely lost in loss record 1,076 of 1,131 ==19758== at 0x4026775: malloc (vg_replace_malloc.c:291) ==19758== by 0x42164C3: nss_parse_service_list (nsswitch.c:622) ==19758== by 0x4216C06: __nss_database_lookup (nsswitch.c:164) ==19758== by 0x4712F2B: ??? ==19758== by 0x4713A64: ??? ==19758== by 0x41CD834: getgrnam_r@@GLIBC_2.1.2 (getXXbyYY_r.c:253) ==19758== by 0x80937AD: doGetGID (cfsysline.c:369) ==19758== by 0x8092AB5: processCfSysLineCommand (cfsysline.c:758) ==19758== by 0x806F7DA: cfsysline (conf.c:215) ==19758== by 0x80702B3: cnfDoCfsysline (rsconf.c:448) ==19758== by 0x8063827: yylex (lexer.l:237) ==19758== by 0x80606F4: yyparse (grammar.c:1505) ==19758== ==19758== LEAK SUMMARY: ==19758== definitely lost: 80 bytes in 2 blocks ==19758== indirectly lost: 240 bytes in 20 blocks ==19758== possibly lost: 0 bytes in 0 blocks ==19758== still reachable: 83,471 bytes in 1,653 blocks ==19758== suppressed: 0 bytes in 0 blocks ==19758== Reachable blocks (those to which a pointer was found) are not shown. ==19758== To see them, rerun with: --leak-check=full --show-leak-kinds=all ==19758== ==19758== For counts of detected and suppressed errors, rerun with: -v ==19758== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 61 from 11)
ps -ef | grep rsyslog syslog 19759 1 0 01:47 ? 00:00:00 valgrind --tool=memcheck --leak-check=full rsyslogd sudo kill 19759 ==19759== ==19759== HEAP SUMMARY: ==19759== in use at exit: 5,000 bytes in 95 blocks ==19759== total heap usage: 4,332 allocs, 4,237 frees, 335,520 bytes allocated ==19759== ==19759== 73 (64 direct, 9 indirect) bytes in 1 blocks are definitely lost in loss record 79 of 90 ==19759== at 0x4025509: calloc (vg_replace_malloc.c:618) ==19759== by 0x8092040: ratelimitNew (ratelimit.c:277) ==19759== by 0x46A67C9: ??? ==19759== by 0x807087F: activate (rsconf.c:657) ==19759== by 0x8055E17: realMain (syslogd.c:1161) ==19759== by 0x805643F: main (syslogd.c:2067) ==19759== ==19759== 128 bytes in 1 blocks are definitely lost in loss record 83 of 90 ==19759== at 0x4026775: malloc (vg_replace_malloc.c:291) ==19759== by 0x40268FD: realloc (vg_replace_malloc.c:687) ==19759== by 0x807CEE3: rsCStrExtendBuf (stringbuf.c:255) ==19759== by 0x8092D3B: getWord (stringbuf.h:80) ==19759== by 0x8092DC8: doGetWord (cfsysline.c:531) ==19759== by 0x8092AB5: processCfSysLineCommand (cfsysline.c:758) ==19759== by 0x806F7DA: cfsysline (conf.c:215) ==19759== by 0x80702B3: cnfDoCfsysline (rsconf.c:448) ==19759== by 0x8063827: yylex (lexer.l:237) ==19759== by 0x80606F4: yyparse (grammar.c:1505) ==19759== by 0x8072A8C: load (rsconf.c:1285) ==19759== by 0x8055CEC: realMain (syslogd.c:1970) ==19759== ==19759== 160 (40 direct, 120 indirect) bytes in 1 blocks are definitely lost in loss record 84 of 90 ==19759== at 0x4026775: malloc (vg_replace_malloc.c:291) ==19759== by 0x42164C3: nss_parse_service_list (nsswitch.c:622) ==19759== by 0x4216C06: __nss_database_lookup (nsswitch.c:164) ==19759== by 0x4713EAB: ??? ==19759== by 0x4714F84: ??? ==19759== by 0x41CE9C4: getpwnam_r@@GLIBC_2.1.2 (getXXbyYY_r.c:253) ==19759== by 0x80939AF: doGetUID (cfsysline.c:419) ==19759== by 0x8092AB5: processCfSysLineCommand (cfsysline.c:758) ==19759== by 0x806F7DA: cfsysline (conf.c:215) ==19759== by 0x80702B3: cnfDoCfsysline (rsconf.c:448) ==19759== by 0x8063827: yylex (lexer.l:237) ==19759== by 0x80606F4: yyparse (grammar.c:1505) ==19759== ==19759== 160 (40 direct, 120 indirect) bytes in 1 blocks are definitely lost in loss record 85 of 90 ==19759== at 0x4026775: malloc (vg_replace_malloc.c:291) ==19759== by 0x42164C3: nss_parse_service_list (nsswitch.c:622) ==19759== by 0x4216C06: __nss_database_lookup (nsswitch.c:164) ==19759== by 0x4712F2B: ??? ==19759== by 0x4713A64: ??? ==19759== by 0x41CD834: getgrnam_r@@GLIBC_2.1.2 (getXXbyYY_r.c:253) ==19759== by 0x80937AD: doGetGID (cfsysline.c:369) ==19759== by 0x8092AB5: processCfSysLineCommand (cfsysline.c:758) ==19759== by 0x806F7DA: cfsysline (conf.c:215) ==19759== by 0x80702B3: cnfDoCfsysline (rsconf.c:448) ==19759== by 0x8063827: yylex (lexer.l:237) ==19759== by 0x80606F4: yyparse (grammar.c:1505) ==19759== ==19759== 2,076 (32 direct, 2,044 indirect) bytes in 1 blocks are definitely lost in loss record 90 of 90 ==19759== at 0x4026775: malloc (vg_replace_malloc.c:291) ==19759== by 0x809A825: create_hashtable (hashtable.c:52) ==19759== by 0x46A4DE9: ??? ==19759== by 0x8083589: doModInit (modules.c:575) ==19759== by 0x8084091: Load (modules.c:1168) ==19759== by 0x806FA72: doModLoad (conf.c:125) ==19759== by 0x8092625: doCustomHdlr (cfsysline.c:101) ==19759== by 0x8092AB5: processCfSysLineCommand (cfsysline.c:758) ==19759== by 0x806F7DA: cfsysline (conf.c:215) ==19759== by 0x80702B3: cnfDoCfsysline (rsconf.c:448) ==19759== by 0x8063827: yylex (lexer.l:237) ==19759== by 0x80606F4: yyparse (grammar.c:1505) ==19759== ==19759== LEAK SUMMARY: ==19759== definitely lost: 304 bytes in 5 blocks ==19759== indirectly lost: 2,293 bytes in 34 blocks ==19759== possibly lost: 0 bytes in 0 blocks ==19759== still reachable: 2,403 bytes in 56 blocks ==19759== suppressed: 0 bytes in 0 blocks ==19759== Reachable blocks (those to which a pointer was found) are not shown. ==19759== To see them, rerun with: --leak-check=full --show-leak-kinds=all ==19759== ==19759== For counts of detected and suppressed errors, rerun with: -v ==19759== ERROR SUMMARY: 5 errors from 5 contexts (suppressed: 63 from 11) This was only from running rsyslog for less than a minute. :) Thanks, Reinoud. On Wed, Jul 23, 2014 at 1:11 AM, David Lang <[email protected]> wrote: > On Tue, 22 Jul 2014, Micah Yoder wrote: > > On 7/22/14, 3:21 PM, David Lang wrote: >> >> Well, that indicates that you have a problem with your dynafile >>> configuration, you don't allow enough files to be open, lookup >>> dynafilecachesize and increase it to be much larger >>> >>> With this sort of problem, your performance is going to be horrid, and >>> as such, your queues will grow. Not seeing your impstats output, but >>> I'll bet the size of the queues is large, and if you mutiply those sizes >>> by the maxmessagesize, you will probably end up accounting for a lot of >>> the memory that you think is 'leaking' >>> >> >> Thanks, I will check that. The "main queue" has size=2, but this one is >> rather different: >> >> Tue Jul 22 09:57:51 2014: action 16 queue: size=438266 enqueued=438268 >> full=0 discarded.full=0 discarded.nf=0 maxqsize=438266 >> >> Any idea how one goes about finding which action line this corresponds to? >> > > well, it's the 16th action in the config file, if you set name='value' in > your action() statements it will replace 'action #' with the name. > depending on how many actions you have, it may be easier to count up from > the bottom of the file. > > > Also, if this accounted for the majority of the memory, each item would >> have to consume 5-10k. Does that sound right? (The vast majority of our >> log lines are much less than that, but it's possible that something >> sends a stream of logs that large.) >> > > the datastructures are based on the max message size, not the size of the > actual message (at least as I understand it), and you have to have both the > raw message and all the parsed components, so it's not unreasonable to hit > 5-10K of RAM per message if the default message size is 4K (I know it's at > least 2K and I believe it's 4K) > > Also, if you are using the default FixedArray size for your main message > queue, it's eating up it's max size even when empty. > > while the exact details are murky, it really looks like this is your > problem. In any case, it's clear that you have a big problem with that > action, it's only processed two messages while getting 438K sent to it. > > David Lang > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com/professional-services/ > What's up with rsyslog? Follow https://twitter.com/rgerhards > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad > of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you > DON'T LIKE THAT. > _______________________________________________ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com/professional-services/ What's up with rsyslog? Follow https://twitter.com/rgerhards NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.

