To try to make it easier I've created a fake pull request, allowing comments on the diff and discussion, and a clear set of diffs between this stuff and 5.8-patches:
https://github.com/fenner/net-snmp/pull/3 Bill On Mon, May 20, 2019 at 11:27 AM Bill Fenner <fen...@gmail.com> wrote: > Some of the back and forth with rstory happened on the admin list; some of > it was unicast. I put together all of the patches, even though there are > now at least 2 fixes for the problem, and published them at > https://github.com/fenner/net-snmp/commits/rstory-getbulk-patches . > > I can't imagine that the changes are useful as they are (other than the > added debugging, perhaps) but may be useful to see what kinds of things we > ran into. > > And my opinion remains that the fundamental problem here is that the > reverse encoding knows how to stop encoding in time, but doesn't know about > the overhead that usm adds, and for some reason the changes in 5.8 try to > address this by switching to forward encoding, which relies on the state > that was freed (and really probably only helps because forward encoding > uses a fixed-sized 1472-byte buffer). (It's possible that the delayed free > that Sam introduced helps the "state that was freed" part. With 3 > different solutions for the problem it's hard to figure out exactly how to > proceed.) > > Bill > > > On Mon, May 20, 2019 at 9:28 AM Bill Fenner <fen...@gmail.com> wrote: > >> On Tue, May 14, 2019 at 1:38 PM Bart Van Assche <bvanass...@acm.org> >> wrote: >> >>> On 5/14/19 4:01 PM, Bill Fenner wrote: >>> > Perhaps getbulk no longer dumps core, but I can not get it to return >>> > anything but GENERR any more, and, it seems to leak memory. >>> > >>> > Any "large enough" request seems to fail in this way, e.g., >>> > snmpbulkget -v 3 ... -Cn 5 -Cr 50 sysUpTime sysUpTime sysUpTime >>> > sysUpTime sysUpTime .1 >>> > >>> > This is particularly frustrating because code was added to 5.8 to >>> > rebuild a getbulk reply if it's too big. But there was already code >>> > to not build the PDU too big, it's just not taking the v3 header into >>> > account properly (that's my best guess, at least). So now there are >>> > two different mechanisms to send a "right-size" reply and they both >>> > don't work. Around 5.8 release time I was working with Robert Story >>> > to fix this, but that effort kind of petered out and Robert's work >>> > didn't make it into git. >>> > >>> > Can anyone get getbulk to work in the current 5.8-patches with SNMPv3? >>> > >>> > I've attached a test script with 504 failing test cases when I run it >>> > against an unmodified V5-8-patches branch, and net-snmp leaks about a >>> > meg of RAM during the test. This is an adaptation of my internal test >>> > to the net-snmp test framework; the complete test would use all >>> > supported values for -l, -a, -x but for now this is the simplest one >>> > using nanp. >>> >>> >>> Hi Bill, >>> >>> A new test has been added >>> (testing/fulltests/default/T0221snmpbulkget_large_simple). That test >>> passes on my setup. Can you have a look whether that test covers the >>> issue you ran into? >>> >> >> It doesn't - the problems arise when the agent decides that the message >> is too big and doesn't fit in the buffer. (E.g., the usage >> of --sendMessageMaxSize in my test, or, the repetition of .1 .1 .1 .1 on >> the snmpbulkget command line). >> >> Regarding the "memory leak": RSS is not a reliable source of information >>> to detect memory leaks. If you want to verify whether or not a new >>> memory leak has been introduced please use Valgrind. >>> >> >> Yes, my internal test uses valgrind and shows the leaks, I was using RSS >> growth as a proxy. I also used a script something like >> http://www.net-snmp.org/wiki/index.php/Running_Net-SNMP_Regression_Tests_under_Valgrind >> and >> it showed leaks when I ran the test that I sent, e.g., >> >> ==32438== 31,584 (8,736 direct, 22,848 indirect) bytes in 168 blocks are >> definitely lost in loss record 756 of 758 >> ==32438== at 0x402D05E: calloc (vg_replace_malloc.c:711) >> ==32438== by 0x430EA7A: usm_malloc_usmStateReference (snmpusm.c:288) >> ==32438== by 0x43117A7: usm_process_in_msg (snmpusm.c:2431) >> ==32438== by 0x4312B44: usm_secmod_process_in_msg (snmpusm.c:2335) >> ==32438== by 0x42D45EF: snmpv3_parse (snmp_api.c:3938) >> >> When I run your new test, I agree it passes: >> ~/src/test-for-bart/testing @fenner-test-for-bart.sjc% ./RUNFULLTESTS -r >> T0221 >> large SNMPv3 bulkget .. ok >> >> If I add "--sendMessageMaxSize=1472", then it fails in the same way as my >> test fails: returning genError. >> >> Your subsequent patch may address this problem if the manager does not >> specify a max response size, but as this is part of the SNMPv3 protocol and >> we do not necessarily control what the manager does, it's not sufficient. >> >> Bill >> >>
_______________________________________________ Net-snmp-coders mailing list Net-snmp-coders@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/net-snmp-coders