On Wed, Jul 23, 2014 at 11:02 AM, Reinoud Koornstra < [email protected]> wrote:
> This is what valgrind has to say about rsyslog: > > ==19758== Memcheck, a memory error detector > ==19758== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al. > ==19758== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info > ==19758== Command: rsyslogd > ==19758== > ==19758== HEAP SUMMARY: > ==19758== in use at exit: 83,791 bytes in 1,675 blocks > ==19758== total heap usage: 2,392 allocs, 717 frees, 163,024 bytes > allocated > ==19758== > ==19758== 160 (40 direct, 120 indirect) bytes in 1 blocks are definitely > lost in loss record 1,075 of 1,131 > ==19758== at 0x4026775: malloc (vg_replace_malloc.c:291) > ==19758== by 0x42164C3: nss_parse_service_list (nsswitch.c:622) > ==19758== by 0x4216C06: __nss_database_lookup (nsswitch.c:164) > ==19758== by 0x4713EAB: ??? > ==19758== by 0x4714F84: ??? > ==19758== by 0x41CE9C4: getpwnam_r@@GLIBC_2.1.2 (getXXbyYY_r.c:253) > ==19758== by 0x80939AF: doGetUID (cfsysline.c:419) > ==19758== by 0x8092AB5: processCfSysLineCommand (cfsysline.c:758) > ==19758== by 0x806F7DA: cfsysline (conf.c:215) > ==19758== by 0x80702B3: cnfDoCfsysline (rsconf.c:448) > ==19758== by 0x8063827: yylex (lexer.l:237) > ==19758== by 0x80606F4: yyparse (grammar.c:1505) > ==19758== > ==19758== 160 (40 direct, 120 indirect) bytes in 1 blocks are definitely > lost in loss record 1,076 of 1,131 > ==19758== at 0x4026775: malloc (vg_replace_malloc.c:291) > ==19758== by 0x42164C3: nss_parse_service_list (nsswitch.c:622) > ==19758== by 0x4216C06: __nss_database_lookup (nsswitch.c:164) > ==19758== by 0x4712F2B: ??? > ==19758== by 0x4713A64: ??? > ==19758== by 0x41CD834: getgrnam_r@@GLIBC_2.1.2 (getXXbyYY_r.c:253) > ==19758== by 0x80937AD: doGetGID (cfsysline.c:369) > ==19758== by 0x8092AB5: processCfSysLineCommand (cfsysline.c:758) > ==19758== by 0x806F7DA: cfsysline (conf.c:215) > ==19758== by 0x80702B3: cnfDoCfsysline (rsconf.c:448) > ==19758== by 0x8063827: yylex (lexer.l:237) > ==19758== by 0x80606F4: yyparse (grammar.c:1505) > ==19758== > ==19758== LEAK SUMMARY: > ==19758== definitely lost: 80 bytes in 2 blocks > ==19758== indirectly lost: 240 bytes in 20 blocks > ==19758== possibly lost: 0 bytes in 0 blocks > ==19758== still reachable: 83,471 bytes in 1,653 blocks > ==19758== suppressed: 0 bytes in 0 blocks > ==19758== Reachable blocks (those to which a pointer was found) are not > shown. > ==19758== To see them, rerun with: --leak-check=full --show-leak-kinds=all > ==19758== > ==19758== For counts of detected and suppressed errors, rerun with: -v > ==19758== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 61 from 11) > > ps -ef | grep rsyslog > syslog 19759 1 0 01:47 ? 00:00:00 valgrind --tool=memcheck > --leak-check=full rsyslogd > > sudo kill 19759 > > ==19759== > ==19759== HEAP SUMMARY: > ==19759== in use at exit: 5,000 bytes in 95 blocks > ==19759== total heap usage: 4,332 allocs, 4,237 frees, 335,520 bytes > allocated > ==19759== > ==19759== 73 (64 direct, 9 indirect) bytes in 1 blocks are definitely lost > in loss record 79 of 90 > ==19759== at 0x4025509: calloc (vg_replace_malloc.c:618) > ==19759== by 0x8092040: ratelimitNew (ratelimit.c:277) > ==19759== by 0x46A67C9: ??? > ==19759== by 0x807087F: activate (rsconf.c:657) > ==19759== by 0x8055E17: realMain (syslogd.c:1161) > ==19759== by 0x805643F: main (syslogd.c:2067) > ==19759== > ==19759== 128 bytes in 1 blocks are definitely lost in loss record 83 of 90 > ==19759== at 0x4026775: malloc (vg_replace_malloc.c:291) > ==19759== by 0x40268FD: realloc (vg_replace_malloc.c:687) > ==19759== by 0x807CEE3: rsCStrExtendBuf (stringbuf.c:255) > ==19759== by 0x8092D3B: getWord (stringbuf.h:80) > ==19759== by 0x8092DC8: doGetWord (cfsysline.c:531) > ==19759== by 0x8092AB5: processCfSysLineCommand (cfsysline.c:758) > ==19759== by 0x806F7DA: cfsysline (conf.c:215) > ==19759== by 0x80702B3: cnfDoCfsysline (rsconf.c:448) > ==19759== by 0x8063827: yylex (lexer.l:237) > ==19759== by 0x80606F4: yyparse (grammar.c:1505) > ==19759== by 0x8072A8C: load (rsconf.c:1285) > ==19759== by 0x8055CEC: realMain (syslogd.c:1970) > ==19759== > ==19759== 160 (40 direct, 120 indirect) bytes in 1 blocks are definitely > lost in loss record 84 of 90 > ==19759== at 0x4026775: malloc (vg_replace_malloc.c:291) > ==19759== by 0x42164C3: nss_parse_service_list (nsswitch.c:622) > ==19759== by 0x4216C06: __nss_database_lookup (nsswitch.c:164) > ==19759== by 0x4713EAB: ??? > ==19759== by 0x4714F84: ??? > ==19759== by 0x41CE9C4: getpwnam_r@@GLIBC_2.1.2 (getXXbyYY_r.c:253) > ==19759== by 0x80939AF: doGetUID (cfsysline.c:419) > ==19759== by 0x8092AB5: processCfSysLineCommand (cfsysline.c:758) > ==19759== by 0x806F7DA: cfsysline (conf.c:215) > ==19759== by 0x80702B3: cnfDoCfsysline (rsconf.c:448) > ==19759== by 0x8063827: yylex (lexer.l:237) > ==19759== by 0x80606F4: yyparse (grammar.c:1505) > ==19759== > ==19759== 160 (40 direct, 120 indirect) bytes in 1 blocks are definitely > lost in loss record 85 of 90 > ==19759== at 0x4026775: malloc (vg_replace_malloc.c:291) > ==19759== by 0x42164C3: nss_parse_service_list (nsswitch.c:622) > ==19759== by 0x4216C06: __nss_database_lookup (nsswitch.c:164) > ==19759== by 0x4712F2B: ??? > ==19759== by 0x4713A64: ??? > ==19759== by 0x41CD834: getgrnam_r@@GLIBC_2.1.2 (getXXbyYY_r.c:253) > ==19759== by 0x80937AD: doGetGID (cfsysline.c:369) > ==19759== by 0x8092AB5: processCfSysLineCommand (cfsysline.c:758) > ==19759== by 0x806F7DA: cfsysline (conf.c:215) > ==19759== by 0x80702B3: cnfDoCfsysline (rsconf.c:448) > ==19759== by 0x8063827: yylex (lexer.l:237) > ==19759== by 0x80606F4: yyparse (grammar.c:1505) > ==19759== > ==19759== 2,076 (32 direct, 2,044 indirect) bytes in 1 blocks are > definitely lost in loss record 90 of 90 > ==19759== at 0x4026775: malloc (vg_replace_malloc.c:291) > ==19759== by 0x809A825: create_hashtable (hashtable.c:52) > ==19759== by 0x46A4DE9: ??? > ==19759== by 0x8083589: doModInit (modules.c:575) > ==19759== by 0x8084091: Load (modules.c:1168) > ==19759== by 0x806FA72: doModLoad (conf.c:125) > ==19759== by 0x8092625: doCustomHdlr (cfsysline.c:101) > ==19759== by 0x8092AB5: processCfSysLineCommand (cfsysline.c:758) > ==19759== by 0x806F7DA: cfsysline (conf.c:215) > ==19759== by 0x80702B3: cnfDoCfsysline (rsconf.c:448) > ==19759== by 0x8063827: yylex (lexer.l:237) > ==19759== by 0x80606F4: yyparse (grammar.c:1505) > ==19759== > ==19759== LEAK SUMMARY: > ==19759== definitely lost: 304 bytes in 5 blocks > ==19759== indirectly lost: 2,293 bytes in 34 blocks > ==19759== possibly lost: 0 bytes in 0 blocks > ==19759== still reachable: 2,403 bytes in 56 blocks > ==19759== suppressed: 0 bytes in 0 blocks > ==19759== Reachable blocks (those to which a pointer was found) are not > shown. > ==19759== To see them, rerun with: --leak-check=full --show-leak-kinds=all > ==19759== > ==19759== For counts of detected and suppressed errors, rerun with: -v > ==19759== ERROR SUMMARY: 5 errors from 5 contexts (suppressed: 63 from 11) > > This was only from running rsyslog for less than a minute. :) > so what? Rainer > Thanks, > > Reinoud. > > > > > On Wed, Jul 23, 2014 at 1:11 AM, David Lang <[email protected]> wrote: > > > On Tue, 22 Jul 2014, Micah Yoder wrote: > > > > On 7/22/14, 3:21 PM, David Lang wrote: > >> > >> Well, that indicates that you have a problem with your dynafile > >>> configuration, you don't allow enough files to be open, lookup > >>> dynafilecachesize and increase it to be much larger > >>> > >>> With this sort of problem, your performance is going to be horrid, and > >>> as such, your queues will grow. Not seeing your impstats output, but > >>> I'll bet the size of the queues is large, and if you mutiply those > sizes > >>> by the maxmessagesize, you will probably end up accounting for a lot of > >>> the memory that you think is 'leaking' > >>> > >> > >> Thanks, I will check that. The "main queue" has size=2, but this one is > >> rather different: > >> > >> Tue Jul 22 09:57:51 2014: action 16 queue: size=438266 enqueued=438268 > >> full=0 discarded.full=0 discarded.nf=0 maxqsize=438266 > >> > >> Any idea how one goes about finding which action line this corresponds > to? > >> > > > > well, it's the 16th action in the config file, if you set name='value' in > > your action() statements it will replace 'action #' with the name. > > depending on how many actions you have, it may be easier to count up from > > the bottom of the file. > > > > > > Also, if this accounted for the majority of the memory, each item would > >> have to consume 5-10k. Does that sound right? (The vast majority of our > >> log lines are much less than that, but it's possible that something > >> sends a stream of logs that large.) > >> > > > > the datastructures are based on the max message size, not the size of the > > actual message (at least as I understand it), and you have to have both > the > > raw message and all the parsed components, so it's not unreasonable to > hit > > 5-10K of RAM per message if the default message size is 4K (I know it's > at > > least 2K and I believe it's 4K) > > > > Also, if you are using the default FixedArray size for your main message > > queue, it's eating up it's max size even when empty. > > > > while the exact details are murky, it really looks like this is your > > problem. In any case, it's clear that you have a big problem with that > > action, it's only processed two messages while getting 438K sent to it. > > > > David Lang > > > > _______________________________________________ > > rsyslog mailing list > > http://lists.adiscon.net/mailman/listinfo/rsyslog > > http://www.rsyslog.com/professional-services/ > > What's up with rsyslog? Follow https://twitter.com/rgerhards > > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad > > of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you > > DON'T LIKE THAT. > > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com/professional-services/ > What's up with rsyslog? Follow https://twitter.com/rgerhards > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad > of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you > DON'T LIKE THAT. > _______________________________________________ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com/professional-services/ What's up with rsyslog? Follow https://twitter.com/rgerhards NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.

