On Wed, Jul 23, 2014 at 11:29 AM, Reinoud Koornstra < [email protected]> wrote:
> It was not my goal to spread memleak fud. Since a possible memleak was > discussed I thought to try valgrind and see what it says and paste it to > try to help in finding some if there are any. Clearly every valgrind output > should be examined thoroughly before any claim can be made. I apologize for > any inconveniences and misinformation my mail has caused as this was not my > intention. Forgive me. > > Our messages crossed. It was entirely my fault. I appreciate your intention to help. The key point is that valgrind output needs to be "examined" and you need to be used to do this *for a certain project*. For example, nss has some "leaks" (I guess for the same OS-cleanup reason like rsyslog has) and there exist valgrind suppression files especially for these (it's so common). Also, I use valgrind routinely during development, almost exclusively test under it's direct control. Finally, the testbench (now run automatically nightly) has quite a bit of valgrind tests included. So by just a regular run, it is extremely unlike to find a memleak. With specific configs, of course, it may happen. We always try hard to add specific tests for such issues to the testbench when we fixed one (but that's not always possible). Again, sorry for my flamy response ... maybe a bit stress too much or a bad day... Rainer > Reinoud. > On Jul 23, 2014 2:14 AM, "Rainer Gerhards" <[email protected]> > wrote: > > > On Wed, Jul 23, 2014 at 11:08 AM, Reinoud Koornstra < > > [email protected]> wrote: > > > > > valgrind thinks there are memory leaks. That's all it says. > > > > > > > > well, that's a pretty shot in the dark. First of, you miss the > supressions > > file for nss, because a couple of the leaks are for nss. Secondly, the > test > > was done incorrectly, as valgrind does not work reliably with > > auto-backgrounding AND you need to test the actual config in question. > > Thirdly, there are three real 1-block leaks, and they are there because > > this is memory we need up until termination. For better cosmetics, we > could > > free them immediately before the OS cleanup. > > > > Looking blindly at valgrind results introduces such nice bugs like > zeroing > > out crypto randomness ;) Spreading memleak fud on rsyslog does not really > > help solving this issue... > > > > Rainer > > > > Reinoud. > > > On Jul 23, 2014 2:05 AM, "Rainer Gerhards" <[email protected]> > > > wrote: > > > > > > > On Wed, Jul 23, 2014 at 11:02 AM, Reinoud Koornstra < > > > > [email protected]> wrote: > > > > > > > > > This is what valgrind has to say about rsyslog: > > > > > > > > > > ==19758== Memcheck, a memory error detector > > > > > ==19758== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward > et > > > al. > > > > > ==19758== Using Valgrind-3.9.0 and LibVEX; rerun with -h for > > copyright > > > > info > > > > > ==19758== Command: rsyslogd > > > > > ==19758== > > > > > ==19758== HEAP SUMMARY: > > > > > ==19758== in use at exit: 83,791 bytes in 1,675 blocks > > > > > ==19758== total heap usage: 2,392 allocs, 717 frees, 163,024 > bytes > > > > > allocated > > > > > ==19758== > > > > > ==19758== 160 (40 direct, 120 indirect) bytes in 1 blocks are > > > definitely > > > > > lost in loss record 1,075 of 1,131 > > > > > ==19758== at 0x4026775: malloc (vg_replace_malloc.c:291) > > > > > ==19758== by 0x42164C3: nss_parse_service_list (nsswitch.c:622) > > > > > ==19758== by 0x4216C06: __nss_database_lookup (nsswitch.c:164) > > > > > ==19758== by 0x4713EAB: ??? > > > > > ==19758== by 0x4714F84: ??? > > > > > ==19758== by 0x41CE9C4: getpwnam_r@@GLIBC_2.1.2 > > (getXXbyYY_r.c:253) > > > > > ==19758== by 0x80939AF: doGetUID (cfsysline.c:419) > > > > > ==19758== by 0x8092AB5: processCfSysLineCommand > (cfsysline.c:758) > > > > > ==19758== by 0x806F7DA: cfsysline (conf.c:215) > > > > > ==19758== by 0x80702B3: cnfDoCfsysline (rsconf.c:448) > > > > > ==19758== by 0x8063827: yylex (lexer.l:237) > > > > > ==19758== by 0x80606F4: yyparse (grammar.c:1505) > > > > > ==19758== > > > > > ==19758== 160 (40 direct, 120 indirect) bytes in 1 blocks are > > > definitely > > > > > lost in loss record 1,076 of 1,131 > > > > > ==19758== at 0x4026775: malloc (vg_replace_malloc.c:291) > > > > > ==19758== by 0x42164C3: nss_parse_service_list (nsswitch.c:622) > > > > > ==19758== by 0x4216C06: __nss_database_lookup (nsswitch.c:164) > > > > > ==19758== by 0x4712F2B: ??? > > > > > ==19758== by 0x4713A64: ??? > > > > > ==19758== by 0x41CD834: getgrnam_r@@GLIBC_2.1.2 > > (getXXbyYY_r.c:253) > > > > > ==19758== by 0x80937AD: doGetGID (cfsysline.c:369) > > > > > ==19758== by 0x8092AB5: processCfSysLineCommand > (cfsysline.c:758) > > > > > ==19758== by 0x806F7DA: cfsysline (conf.c:215) > > > > > ==19758== by 0x80702B3: cnfDoCfsysline (rsconf.c:448) > > > > > ==19758== by 0x8063827: yylex (lexer.l:237) > > > > > ==19758== by 0x80606F4: yyparse (grammar.c:1505) > > > > > ==19758== > > > > > ==19758== LEAK SUMMARY: > > > > > ==19758== definitely lost: 80 bytes in 2 blocks > > > > > ==19758== indirectly lost: 240 bytes in 20 blocks > > > > > ==19758== possibly lost: 0 bytes in 0 blocks > > > > > ==19758== still reachable: 83,471 bytes in 1,653 blocks > > > > > ==19758== suppressed: 0 bytes in 0 blocks > > > > > ==19758== Reachable blocks (those to which a pointer was found) are > > not > > > > > shown. > > > > > ==19758== To see them, rerun with: --leak-check=full > > > > --show-leak-kinds=all > > > > > ==19758== > > > > > ==19758== For counts of detected and suppressed errors, rerun with: > > -v > > > > > ==19758== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 61 > > from > > > > 11) > > > > > > > > > > ps -ef | grep rsyslog > > > > > syslog 19759 1 0 01:47 ? 00:00:00 valgrind > > > --tool=memcheck > > > > > --leak-check=full rsyslogd > > > > > > > > > > sudo kill 19759 > > > > > > > > > > ==19759== > > > > > ==19759== HEAP SUMMARY: > > > > > ==19759== in use at exit: 5,000 bytes in 95 blocks > > > > > ==19759== total heap usage: 4,332 allocs, 4,237 frees, 335,520 > > bytes > > > > > allocated > > > > > ==19759== > > > > > ==19759== 73 (64 direct, 9 indirect) bytes in 1 blocks are > definitely > > > > lost > > > > > in loss record 79 of 90 > > > > > ==19759== at 0x4025509: calloc (vg_replace_malloc.c:618) > > > > > ==19759== by 0x8092040: ratelimitNew (ratelimit.c:277) > > > > > ==19759== by 0x46A67C9: ??? > > > > > ==19759== by 0x807087F: activate (rsconf.c:657) > > > > > ==19759== by 0x8055E17: realMain (syslogd.c:1161) > > > > > ==19759== by 0x805643F: main (syslogd.c:2067) > > > > > ==19759== > > > > > ==19759== 128 bytes in 1 blocks are definitely lost in loss record > 83 > > > of > > > > 90 > > > > > ==19759== at 0x4026775: malloc (vg_replace_malloc.c:291) > > > > > ==19759== by 0x40268FD: realloc (vg_replace_malloc.c:687) > > > > > ==19759== by 0x807CEE3: rsCStrExtendBuf (stringbuf.c:255) > > > > > ==19759== by 0x8092D3B: getWord (stringbuf.h:80) > > > > > ==19759== by 0x8092DC8: doGetWord (cfsysline.c:531) > > > > > ==19759== by 0x8092AB5: processCfSysLineCommand > (cfsysline.c:758) > > > > > ==19759== by 0x806F7DA: cfsysline (conf.c:215) > > > > > ==19759== by 0x80702B3: cnfDoCfsysline (rsconf.c:448) > > > > > ==19759== by 0x8063827: yylex (lexer.l:237) > > > > > ==19759== by 0x80606F4: yyparse (grammar.c:1505) > > > > > ==19759== by 0x8072A8C: load (rsconf.c:1285) > > > > > ==19759== by 0x8055CEC: realMain (syslogd.c:1970) > > > > > ==19759== > > > > > ==19759== 160 (40 direct, 120 indirect) bytes in 1 blocks are > > > definitely > > > > > lost in loss record 84 of 90 > > > > > ==19759== at 0x4026775: malloc (vg_replace_malloc.c:291) > > > > > ==19759== by 0x42164C3: nss_parse_service_list (nsswitch.c:622) > > > > > ==19759== by 0x4216C06: __nss_database_lookup (nsswitch.c:164) > > > > > ==19759== by 0x4713EAB: ??? > > > > > ==19759== by 0x4714F84: ??? > > > > > ==19759== by 0x41CE9C4: getpwnam_r@@GLIBC_2.1.2 > > (getXXbyYY_r.c:253) > > > > > ==19759== by 0x80939AF: doGetUID (cfsysline.c:419) > > > > > ==19759== by 0x8092AB5: processCfSysLineCommand > (cfsysline.c:758) > > > > > ==19759== by 0x806F7DA: cfsysline (conf.c:215) > > > > > ==19759== by 0x80702B3: cnfDoCfsysline (rsconf.c:448) > > > > > ==19759== by 0x8063827: yylex (lexer.l:237) > > > > > ==19759== by 0x80606F4: yyparse (grammar.c:1505) > > > > > ==19759== > > > > > ==19759== 160 (40 direct, 120 indirect) bytes in 1 blocks are > > > definitely > > > > > lost in loss record 85 of 90 > > > > > ==19759== at 0x4026775: malloc (vg_replace_malloc.c:291) > > > > > ==19759== by 0x42164C3: nss_parse_service_list (nsswitch.c:622) > > > > > ==19759== by 0x4216C06: __nss_database_lookup (nsswitch.c:164) > > > > > ==19759== by 0x4712F2B: ??? > > > > > ==19759== by 0x4713A64: ??? > > > > > ==19759== by 0x41CD834: getgrnam_r@@GLIBC_2.1.2 > > (getXXbyYY_r.c:253) > > > > > ==19759== by 0x80937AD: doGetGID (cfsysline.c:369) > > > > > ==19759== by 0x8092AB5: processCfSysLineCommand > (cfsysline.c:758) > > > > > ==19759== by 0x806F7DA: cfsysline (conf.c:215) > > > > > ==19759== by 0x80702B3: cnfDoCfsysline (rsconf.c:448) > > > > > ==19759== by 0x8063827: yylex (lexer.l:237) > > > > > ==19759== by 0x80606F4: yyparse (grammar.c:1505) > > > > > ==19759== > > > > > ==19759== 2,076 (32 direct, 2,044 indirect) bytes in 1 blocks are > > > > > definitely lost in loss record 90 of 90 > > > > > ==19759== at 0x4026775: malloc (vg_replace_malloc.c:291) > > > > > ==19759== by 0x809A825: create_hashtable (hashtable.c:52) > > > > > ==19759== by 0x46A4DE9: ??? > > > > > ==19759== by 0x8083589: doModInit (modules.c:575) > > > > > ==19759== by 0x8084091: Load (modules.c:1168) > > > > > ==19759== by 0x806FA72: doModLoad (conf.c:125) > > > > > ==19759== by 0x8092625: doCustomHdlr (cfsysline.c:101) > > > > > ==19759== by 0x8092AB5: processCfSysLineCommand > (cfsysline.c:758) > > > > > ==19759== by 0x806F7DA: cfsysline (conf.c:215) > > > > > ==19759== by 0x80702B3: cnfDoCfsysline (rsconf.c:448) > > > > > ==19759== by 0x8063827: yylex (lexer.l:237) > > > > > ==19759== by 0x80606F4: yyparse (grammar.c:1505) > > > > > ==19759== > > > > > ==19759== LEAK SUMMARY: > > > > > ==19759== definitely lost: 304 bytes in 5 blocks > > > > > ==19759== indirectly lost: 2,293 bytes in 34 blocks > > > > > ==19759== possibly lost: 0 bytes in 0 blocks > > > > > ==19759== still reachable: 2,403 bytes in 56 blocks > > > > > ==19759== suppressed: 0 bytes in 0 blocks > > > > > ==19759== Reachable blocks (those to which a pointer was found) are > > not > > > > > shown. > > > > > ==19759== To see them, rerun with: --leak-check=full > > > > --show-leak-kinds=all > > > > > ==19759== > > > > > ==19759== For counts of detected and suppressed errors, rerun with: > > -v > > > > > ==19759== ERROR SUMMARY: 5 errors from 5 contexts (suppressed: 63 > > from > > > > 11) > > > > > > > > > > This was only from running rsyslog for less than a minute. :) > > > > > > > > > > > > > so what? > > > > > > > > Rainer > > > > > > > > > > > > > Thanks, > > > > > > > > > > Reinoud. > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Jul 23, 2014 at 1:11 AM, David Lang <[email protected]> wrote: > > > > > > > > > > > On Tue, 22 Jul 2014, Micah Yoder wrote: > > > > > > > > > > > > On 7/22/14, 3:21 PM, David Lang wrote: > > > > > >> > > > > > >> Well, that indicates that you have a problem with your dynafile > > > > > >>> configuration, you don't allow enough files to be open, lookup > > > > > >>> dynafilecachesize and increase it to be much larger > > > > > >>> > > > > > >>> With this sort of problem, your performance is going to be > > horrid, > > > > and > > > > > >>> as such, your queues will grow. Not seeing your impstats > output, > > > but > > > > > >>> I'll bet the size of the queues is large, and if you mutiply > > those > > > > > sizes > > > > > >>> by the maxmessagesize, you will probably end up accounting for > a > > > lot > > > > of > > > > > >>> the memory that you think is 'leaking' > > > > > >>> > > > > > >> > > > > > >> Thanks, I will check that. The "main queue" has size=2, but > this > > > one > > > > is > > > > > >> rather different: > > > > > >> > > > > > >> Tue Jul 22 09:57:51 2014: action 16 queue: size=438266 > > > enqueued=438268 > > > > > >> full=0 discarded.full=0 discarded.nf=0 maxqsize=438266 > > > > > >> > > > > > >> Any idea how one goes about finding which action line this > > > corresponds > > > > > to? > > > > > >> > > > > > > > > > > > > well, it's the 16th action in the config file, if you set > > > name='value' > > > > in > > > > > > your action() statements it will replace 'action #' with the > name. > > > > > > depending on how many actions you have, it may be easier to count > > up > > > > from > > > > > > the bottom of the file. > > > > > > > > > > > > > > > > > > Also, if this accounted for the majority of the memory, each > item > > > > would > > > > > >> have to consume 5-10k. Does that sound right? (The vast > majority > > of > > > > our > > > > > >> log lines are much less than that, but it's possible that > > something > > > > > >> sends a stream of logs that large.) > > > > > >> > > > > > > > > > > > > the datastructures are based on the max message size, not the > size > > of > > > > the > > > > > > actual message (at least as I understand it), and you have to > have > > > both > > > > > the > > > > > > raw message and all the parsed components, so it's not > unreasonable > > > to > > > > > hit > > > > > > 5-10K of RAM per message if the default message size is 4K (I > know > > > it's > > > > > at > > > > > > least 2K and I believe it's 4K) > > > > > > > > > > > > Also, if you are using the default FixedArray size for your main > > > > message > > > > > > queue, it's eating up it's max size even when empty. > > > > > > > > > > > > while the exact details are murky, it really looks like this is > > your > > > > > > problem. In any case, it's clear that you have a big problem with > > > that > > > > > > action, it's only processed two messages while getting 438K sent > to > > > it. > > > > > > > > > > > > David Lang > > > > > > > > > > > > _______________________________________________ > > > > > > rsyslog mailing list > > > > > > http://lists.adiscon.net/mailman/listinfo/rsyslog > > > > > > http://www.rsyslog.com/professional-services/ > > > > > > What's up with rsyslog? Follow https://twitter.com/rgerhards > > > > > > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a > > > > myriad > > > > > > of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST > if > > > you > > > > > > DON'T LIKE THAT. > > > > > > > > > > > _______________________________________________ > > > > > rsyslog mailing list > > > > > http://lists.adiscon.net/mailman/listinfo/rsyslog > > > > > http://www.rsyslog.com/professional-services/ > > > > > What's up with rsyslog? Follow https://twitter.com/rgerhards > > > > > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a > > > myriad > > > > > of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if > > you > > > > > DON'T LIKE THAT. > > > > > > > > > _______________________________________________ > > > > rsyslog mailing list > > > > http://lists.adiscon.net/mailman/listinfo/rsyslog > > > > http://www.rsyslog.com/professional-services/ > > > > What's up with rsyslog? Follow https://twitter.com/rgerhards > > > > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a > > myriad > > > > of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if > you > > > > DON'T LIKE THAT. > > > > > > > _______________________________________________ > > > rsyslog mailing list > > > http://lists.adiscon.net/mailman/listinfo/rsyslog > > > http://www.rsyslog.com/professional-services/ > > > What's up with rsyslog? Follow https://twitter.com/rgerhards > > > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a > myriad > > > of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you > > > DON'T LIKE THAT. > > > > > _______________________________________________ > > rsyslog mailing list > > http://lists.adiscon.net/mailman/listinfo/rsyslog > > http://www.rsyslog.com/professional-services/ > > What's up with rsyslog? Follow https://twitter.com/rgerhards > > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad > > of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you > > DON'T LIKE THAT. > > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com/professional-services/ > What's up with rsyslog? Follow https://twitter.com/rgerhards > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad > of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you > DON'T LIKE THAT. > _______________________________________________ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com/professional-services/ What's up with rsyslog? Follow https://twitter.com/rgerhards NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.

