An apology is due... sorry Reinoud, I shouldn't have posted immediately... While the tech contents of my message is right, the wording is unacceptable. You for sure just wanted to help and I ranted at you. :-(
sorry again... Rainer On Wed, Jul 23, 2014 at 11:14 AM, Rainer Gerhards <[email protected]> wrote: > > On Wed, Jul 23, 2014 at 11:08 AM, Reinoud Koornstra < > [email protected]> wrote: > >> valgrind thinks there are memory leaks. That's all it says. >> >> > well, that's a pretty shot in the dark. First of, you miss the supressions > file for nss, because a couple of the leaks are for nss. Secondly, the test > was done incorrectly, as valgrind does not work reliably with > auto-backgrounding AND you need to test the actual config in question. > Thirdly, there are three real 1-block leaks, and they are there because > this is memory we need up until termination. For better cosmetics, we could > free them immediately before the OS cleanup. > > Looking blindly at valgrind results introduces such nice bugs like zeroing > out crypto randomness ;) Spreading memleak fud on rsyslog does not really > help solving this issue... > > Rainer > > Reinoud. >> On Jul 23, 2014 2:05 AM, "Rainer Gerhards" <[email protected]> >> wrote: >> >> > On Wed, Jul 23, 2014 at 11:02 AM, Reinoud Koornstra < >> > [email protected]> wrote: >> > >> > > This is what valgrind has to say about rsyslog: >> > > >> > > ==19758== Memcheck, a memory error detector >> > > ==19758== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et >> al. >> > > ==19758== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright >> > info >> > > ==19758== Command: rsyslogd >> > > ==19758== >> > > ==19758== HEAP SUMMARY: >> > > ==19758== in use at exit: 83,791 bytes in 1,675 blocks >> > > ==19758== total heap usage: 2,392 allocs, 717 frees, 163,024 bytes >> > > allocated >> > > ==19758== >> > > ==19758== 160 (40 direct, 120 indirect) bytes in 1 blocks are >> definitely >> > > lost in loss record 1,075 of 1,131 >> > > ==19758== at 0x4026775: malloc (vg_replace_malloc.c:291) >> > > ==19758== by 0x42164C3: nss_parse_service_list (nsswitch.c:622) >> > > ==19758== by 0x4216C06: __nss_database_lookup (nsswitch.c:164) >> > > ==19758== by 0x4713EAB: ??? >> > > ==19758== by 0x4714F84: ??? >> > > ==19758== by 0x41CE9C4: getpwnam_r@@GLIBC_2.1.2 >> (getXXbyYY_r.c:253) >> > > ==19758== by 0x80939AF: doGetUID (cfsysline.c:419) >> > > ==19758== by 0x8092AB5: processCfSysLineCommand (cfsysline.c:758) >> > > ==19758== by 0x806F7DA: cfsysline (conf.c:215) >> > > ==19758== by 0x80702B3: cnfDoCfsysline (rsconf.c:448) >> > > ==19758== by 0x8063827: yylex (lexer.l:237) >> > > ==19758== by 0x80606F4: yyparse (grammar.c:1505) >> > > ==19758== >> > > ==19758== 160 (40 direct, 120 indirect) bytes in 1 blocks are >> definitely >> > > lost in loss record 1,076 of 1,131 >> > > ==19758== at 0x4026775: malloc (vg_replace_malloc.c:291) >> > > ==19758== by 0x42164C3: nss_parse_service_list (nsswitch.c:622) >> > > ==19758== by 0x4216C06: __nss_database_lookup (nsswitch.c:164) >> > > ==19758== by 0x4712F2B: ??? >> > > ==19758== by 0x4713A64: ??? >> > > ==19758== by 0x41CD834: getgrnam_r@@GLIBC_2.1.2 >> (getXXbyYY_r.c:253) >> > > ==19758== by 0x80937AD: doGetGID (cfsysline.c:369) >> > > ==19758== by 0x8092AB5: processCfSysLineCommand (cfsysline.c:758) >> > > ==19758== by 0x806F7DA: cfsysline (conf.c:215) >> > > ==19758== by 0x80702B3: cnfDoCfsysline (rsconf.c:448) >> > > ==19758== by 0x8063827: yylex (lexer.l:237) >> > > ==19758== by 0x80606F4: yyparse (grammar.c:1505) >> > > ==19758== >> > > ==19758== LEAK SUMMARY: >> > > ==19758== definitely lost: 80 bytes in 2 blocks >> > > ==19758== indirectly lost: 240 bytes in 20 blocks >> > > ==19758== possibly lost: 0 bytes in 0 blocks >> > > ==19758== still reachable: 83,471 bytes in 1,653 blocks >> > > ==19758== suppressed: 0 bytes in 0 blocks >> > > ==19758== Reachable blocks (those to which a pointer was found) are >> not >> > > shown. >> > > ==19758== To see them, rerun with: --leak-check=full >> > --show-leak-kinds=all >> > > ==19758== >> > > ==19758== For counts of detected and suppressed errors, rerun with: -v >> > > ==19758== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 61 from >> > 11) >> > > >> > > ps -ef | grep rsyslog >> > > syslog 19759 1 0 01:47 ? 00:00:00 valgrind >> --tool=memcheck >> > > --leak-check=full rsyslogd >> > > >> > > sudo kill 19759 >> > > >> > > ==19759== >> > > ==19759== HEAP SUMMARY: >> > > ==19759== in use at exit: 5,000 bytes in 95 blocks >> > > ==19759== total heap usage: 4,332 allocs, 4,237 frees, 335,520 bytes >> > > allocated >> > > ==19759== >> > > ==19759== 73 (64 direct, 9 indirect) bytes in 1 blocks are definitely >> > lost >> > > in loss record 79 of 90 >> > > ==19759== at 0x4025509: calloc (vg_replace_malloc.c:618) >> > > ==19759== by 0x8092040: ratelimitNew (ratelimit.c:277) >> > > ==19759== by 0x46A67C9: ??? >> > > ==19759== by 0x807087F: activate (rsconf.c:657) >> > > ==19759== by 0x8055E17: realMain (syslogd.c:1161) >> > > ==19759== by 0x805643F: main (syslogd.c:2067) >> > > ==19759== >> > > ==19759== 128 bytes in 1 blocks are definitely lost in loss record 83 >> of >> > 90 >> > > ==19759== at 0x4026775: malloc (vg_replace_malloc.c:291) >> > > ==19759== by 0x40268FD: realloc (vg_replace_malloc.c:687) >> > > ==19759== by 0x807CEE3: rsCStrExtendBuf (stringbuf.c:255) >> > > ==19759== by 0x8092D3B: getWord (stringbuf.h:80) >> > > ==19759== by 0x8092DC8: doGetWord (cfsysline.c:531) >> > > ==19759== by 0x8092AB5: processCfSysLineCommand (cfsysline.c:758) >> > > ==19759== by 0x806F7DA: cfsysline (conf.c:215) >> > > ==19759== by 0x80702B3: cnfDoCfsysline (rsconf.c:448) >> > > ==19759== by 0x8063827: yylex (lexer.l:237) >> > > ==19759== by 0x80606F4: yyparse (grammar.c:1505) >> > > ==19759== by 0x8072A8C: load (rsconf.c:1285) >> > > ==19759== by 0x8055CEC: realMain (syslogd.c:1970) >> > > ==19759== >> > > ==19759== 160 (40 direct, 120 indirect) bytes in 1 blocks are >> definitely >> > > lost in loss record 84 of 90 >> > > ==19759== at 0x4026775: malloc (vg_replace_malloc.c:291) >> > > ==19759== by 0x42164C3: nss_parse_service_list (nsswitch.c:622) >> > > ==19759== by 0x4216C06: __nss_database_lookup (nsswitch.c:164) >> > > ==19759== by 0x4713EAB: ??? >> > > ==19759== by 0x4714F84: ??? >> > > ==19759== by 0x41CE9C4: getpwnam_r@@GLIBC_2.1.2 >> (getXXbyYY_r.c:253) >> > > ==19759== by 0x80939AF: doGetUID (cfsysline.c:419) >> > > ==19759== by 0x8092AB5: processCfSysLineCommand (cfsysline.c:758) >> > > ==19759== by 0x806F7DA: cfsysline (conf.c:215) >> > > ==19759== by 0x80702B3: cnfDoCfsysline (rsconf.c:448) >> > > ==19759== by 0x8063827: yylex (lexer.l:237) >> > > ==19759== by 0x80606F4: yyparse (grammar.c:1505) >> > > ==19759== >> > > ==19759== 160 (40 direct, 120 indirect) bytes in 1 blocks are >> definitely >> > > lost in loss record 85 of 90 >> > > ==19759== at 0x4026775: malloc (vg_replace_malloc.c:291) >> > > ==19759== by 0x42164C3: nss_parse_service_list (nsswitch.c:622) >> > > ==19759== by 0x4216C06: __nss_database_lookup (nsswitch.c:164) >> > > ==19759== by 0x4712F2B: ??? >> > > ==19759== by 0x4713A64: ??? >> > > ==19759== by 0x41CD834: getgrnam_r@@GLIBC_2.1.2 >> (getXXbyYY_r.c:253) >> > > ==19759== by 0x80937AD: doGetGID (cfsysline.c:369) >> > > ==19759== by 0x8092AB5: processCfSysLineCommand (cfsysline.c:758) >> > > ==19759== by 0x806F7DA: cfsysline (conf.c:215) >> > > ==19759== by 0x80702B3: cnfDoCfsysline (rsconf.c:448) >> > > ==19759== by 0x8063827: yylex (lexer.l:237) >> > > ==19759== by 0x80606F4: yyparse (grammar.c:1505) >> > > ==19759== >> > > ==19759== 2,076 (32 direct, 2,044 indirect) bytes in 1 blocks are >> > > definitely lost in loss record 90 of 90 >> > > ==19759== at 0x4026775: malloc (vg_replace_malloc.c:291) >> > > ==19759== by 0x809A825: create_hashtable (hashtable.c:52) >> > > ==19759== by 0x46A4DE9: ??? >> > > ==19759== by 0x8083589: doModInit (modules.c:575) >> > > ==19759== by 0x8084091: Load (modules.c:1168) >> > > ==19759== by 0x806FA72: doModLoad (conf.c:125) >> > > ==19759== by 0x8092625: doCustomHdlr (cfsysline.c:101) >> > > ==19759== by 0x8092AB5: processCfSysLineCommand (cfsysline.c:758) >> > > ==19759== by 0x806F7DA: cfsysline (conf.c:215) >> > > ==19759== by 0x80702B3: cnfDoCfsysline (rsconf.c:448) >> > > ==19759== by 0x8063827: yylex (lexer.l:237) >> > > ==19759== by 0x80606F4: yyparse (grammar.c:1505) >> > > ==19759== >> > > ==19759== LEAK SUMMARY: >> > > ==19759== definitely lost: 304 bytes in 5 blocks >> > > ==19759== indirectly lost: 2,293 bytes in 34 blocks >> > > ==19759== possibly lost: 0 bytes in 0 blocks >> > > ==19759== still reachable: 2,403 bytes in 56 blocks >> > > ==19759== suppressed: 0 bytes in 0 blocks >> > > ==19759== Reachable blocks (those to which a pointer was found) are >> not >> > > shown. >> > > ==19759== To see them, rerun with: --leak-check=full >> > --show-leak-kinds=all >> > > ==19759== >> > > ==19759== For counts of detected and suppressed errors, rerun with: -v >> > > ==19759== ERROR SUMMARY: 5 errors from 5 contexts (suppressed: 63 from >> > 11) >> > > >> > > This was only from running rsyslog for less than a minute. :) >> > > >> > >> > so what? >> > >> > Rainer >> > >> > >> > > Thanks, >> > > >> > > Reinoud. >> > > >> > > >> > > >> > > >> > > On Wed, Jul 23, 2014 at 1:11 AM, David Lang <[email protected]> wrote: >> > > >> > > > On Tue, 22 Jul 2014, Micah Yoder wrote: >> > > > >> > > > On 7/22/14, 3:21 PM, David Lang wrote: >> > > >> >> > > >> Well, that indicates that you have a problem with your dynafile >> > > >>> configuration, you don't allow enough files to be open, lookup >> > > >>> dynafilecachesize and increase it to be much larger >> > > >>> >> > > >>> With this sort of problem, your performance is going to be horrid, >> > and >> > > >>> as such, your queues will grow. Not seeing your impstats output, >> but >> > > >>> I'll bet the size of the queues is large, and if you mutiply those >> > > sizes >> > > >>> by the maxmessagesize, you will probably end up accounting for a >> lot >> > of >> > > >>> the memory that you think is 'leaking' >> > > >>> >> > > >> >> > > >> Thanks, I will check that. The "main queue" has size=2, but this >> one >> > is >> > > >> rather different: >> > > >> >> > > >> Tue Jul 22 09:57:51 2014: action 16 queue: size=438266 >> enqueued=438268 >> > > >> full=0 discarded.full=0 discarded.nf=0 maxqsize=438266 >> > > >> >> > > >> Any idea how one goes about finding which action line this >> corresponds >> > > to? >> > > >> >> > > > >> > > > well, it's the 16th action in the config file, if you set >> name='value' >> > in >> > > > your action() statements it will replace 'action #' with the name. >> > > > depending on how many actions you have, it may be easier to count up >> > from >> > > > the bottom of the file. >> > > > >> > > > >> > > > Also, if this accounted for the majority of the memory, each item >> > would >> > > >> have to consume 5-10k. Does that sound right? (The vast majority >> of >> > our >> > > >> log lines are much less than that, but it's possible that something >> > > >> sends a stream of logs that large.) >> > > >> >> > > > >> > > > the datastructures are based on the max message size, not the size >> of >> > the >> > > > actual message (at least as I understand it), and you have to have >> both >> > > the >> > > > raw message and all the parsed components, so it's not unreasonable >> to >> > > hit >> > > > 5-10K of RAM per message if the default message size is 4K (I know >> it's >> > > at >> > > > least 2K and I believe it's 4K) >> > > > >> > > > Also, if you are using the default FixedArray size for your main >> > message >> > > > queue, it's eating up it's max size even when empty. >> > > > >> > > > while the exact details are murky, it really looks like this is your >> > > > problem. In any case, it's clear that you have a big problem with >> that >> > > > action, it's only processed two messages while getting 438K sent to >> it. >> > > > >> > > > David Lang >> > > > >> > > > _______________________________________________ >> > > > rsyslog mailing list >> > > > http://lists.adiscon.net/mailman/listinfo/rsyslog >> > > > http://www.rsyslog.com/professional-services/ >> > > > What's up with rsyslog? Follow https://twitter.com/rgerhards >> > > > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a >> > myriad >> > > > of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if >> you >> > > > DON'T LIKE THAT. >> > > > >> > > _______________________________________________ >> > > rsyslog mailing list >> > > http://lists.adiscon.net/mailman/listinfo/rsyslog >> > > http://www.rsyslog.com/professional-services/ >> > > What's up with rsyslog? Follow https://twitter.com/rgerhards >> > > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a >> myriad >> > > of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you >> > > DON'T LIKE THAT. >> > > >> > _______________________________________________ >> > rsyslog mailing list >> > http://lists.adiscon.net/mailman/listinfo/rsyslog >> > http://www.rsyslog.com/professional-services/ >> > What's up with rsyslog? Follow https://twitter.com/rgerhards >> > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad >> > of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you >> > DON'T LIKE THAT. >> > >> _______________________________________________ >> rsyslog mailing list >> http://lists.adiscon.net/mailman/listinfo/rsyslog >> http://www.rsyslog.com/professional-services/ >> What's up with rsyslog? Follow https://twitter.com/rgerhards >> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad >> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you >> DON'T LIKE THAT. >> > > _______________________________________________ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com/professional-services/ What's up with rsyslog? Follow https://twitter.com/rgerhards NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.

