Hi, thanks for your patch. Unfortunately, even with it applied, snmpd still ends with a core dump because of 'double free or corruption (fasttop)'.
When I run snmpd with -Dsnmp_agent,agentx/master it ends with:

agentx/master: sending pdu (req=0x1d4,trans=0x1d3,sess=0x5)
snmp_agent: delegate session == 0x56207e165240
snmp_agent: end of handle_snmp_packet, asp = 0x56207e165240
agentx/master: callback resend
agentx/master: callback resend
agentx/master: timeout on session 0x56207dfd5400 req=0x1c9
agentx/master: close 0x56207dfd5400, -1
snmp_agent: removed 40 delegated request(s) for session 0x56207dfce490
snmp_agent: processing delegated request, asp = 0x56207e165240
snmp_agent: canceling next walk for asp 0x56207e165240
snmp_agent: REMOVE session == 0x56207e165240
snmp_agent: agent_session 0x56207e165240 released
snmp_agent: processing delegated request, asp = 0x56207e1041a0
snmp_agent: canceling next walk for asp 0x56207e1041a0
snmp_agent: REMOVE session == 0x56207e1041a0
snmp_agent: agent_session 0x56207e1041a0 released
snmp_agent: processing delegated request, asp = 0x56207e1656c0
snmp_agent: canceling next walk for asp 0x56207e1656c0
snmp_agent: REMOVE session == 0x56207e1656c0
snmp_agent: agent_session 0x56207e1656c0 released
snmp_agent: processing delegated request, asp = 0x56207e11af40
snmp_agent: canceling next walk for asp 0x56207e11af40
snmp_agent: REMOVE session == 0x56207e11af40
snmp_agent: agent_session 0x56207e11af40 released
snmp_agent: processing delegated request, asp = 0x56207e118f00
snmp_agent: canceling next walk for asp 0x56207e118f00
snmp_agent: REMOVE session == 0x56207e118f00
snmp_agent: agent_session 0x56207e118f00 released
snmp_agent: processing delegated request, asp = 0x56207e11b540
snmp_agent: canceling next walk for asp 0x56207e11b540
snmp_agent: REMOVE session == 0x56207e11b540
snmp_agent: agent_session 0x56207e11b540 released
snmp_agent: processing delegated request, asp = 0x56207e11bd00
snmp_agent: canceling next walk for asp 0x56207e11bd00
snmp_agent: REMOVE session == 0x56207e11bd00
snmp_agent: agent_session 0x56207e11bd00 released
agentx/master: Continue removing delegated subsession reqests
agentx/master: close transport
snmp_agent: REMOVE session == 0x56207dfd5400
agentx/master: response too late on session 0x56207dfd5400
agentx/master: response too late on session 0x56207dfd5400
double free or corruption (fasttop)
Aborted (core dumped)

Interestingly, when I run it with -DALL it passes (at least for several rounds). It looks like some strange race condition.

Regards

Josef Ridky
Software Engineer
Core Services Team
Red Hat Czech, s.r.o.

----- Original Message -----
| From: "Anders Wallin" <walli...@gmail.com>
| To: "Josef Ridky" <jri...@redhat.com>
| Cc: "net-snmp-coders" <net-snmp-coders@lists.sourceforge.net>
| Sent: Tuesday, April 2, 2019 1:46:40 PM
| Subject: Re: Core dump with net-snmp-5.8
|
| Hi Josef,
|
| I think it's the same issue as https://sourceforge.net/p/net-snmp/bugs/2914/
| (where I also posted the solution)
| Regards
| Anders Wallin
|
|
| On Tue, Apr 2, 2019 at 12:43 PM Josef Ridky <jri...@redhat.com> wrote:
|
| > Hi,
| >
| > recently, I have hit an issue in net-snmp-5.8 that is connected to the
| > bug report [1].
| >
| > When I run the agentofdeath test from [1], the snmpd daemon crashes
| > with malloc(): smallbin double linked list corrupted or double free() issue
| > and dumps core (see below).
| > From the log file, I can identify one issue with "Unknown operation".
| >
| > This issue is in the agentx_got_response function
| > (agent/mibgroup/agentx/master.c). There is no action implemented for
| > NETSNMP_CALLBACK_OP_RESEND (defined in
| > include/net-snmp/library/snmp_api.h).
| > As a result, "Unknown operation 6 in agentx_got_response" is shown in the log
| > file.
| >
| > /var/log/messages
| > -------------------------------
| > Mar 28 06:52:42 localhost snmpd[12073]: Unknown operation 6 in
| > agentx_got_response
| > Mar 28 06:52:43 localhost snmpd[12073]: Unknown operation 6 in
| > agentx_got_response
| > Mar 28 06:52:43 localhost snmpd[12073]: malloc(): smallbin double linked
| > list corrupted
| > Mar 28 06:52:43 localhost systemd[1]: Started Process Core Dump (PID
| > 13652/UID 0).
| > Mar 28 06:52:48 localhost systemd[1]: snmpd.service: Main process exited,
| > code=dumped, status=6/ABRT
| > Mar 28 06:52:48 localhost systemd[1]: snmpd.service: Failed with result
| > 'core-dump'.
| > -------------------------------
| >
| > The "Unknown operation" callback is caused by a newly added piece of code in
| > snmplib/snmp_api.c:
| >
| > static int
| > snmp_resend_request(struct session_list *slp, netsnmp_request_list *rp,
| >                     int incr_retries)
| > {
| >
| > ...
| >
| >         tv.tv_sec += tv.tv_usec / 1000000L;
| >         tv.tv_usec %= 1000000L;
| >         rp->expireM = tv;
| > +       if (rp->callback)
| > +           rp->callback(NETSNMP_CALLBACK_OP_RESEND, sp,
| > +                        rp->pdu->reqid, rp->pdu, rp->cb_data);
| >     }
| >     return 0;
| > }
| >
| >
| > When I removed it, snmpd just stopped complaining about operation 6, but
| > the core dump is still present.
| >
| > May I ask you for help with this issue? Do you have any idea what is causing
| > this issue in 5.8 and how to fix it?
| > I know that Jan Safranek fixed this for 5.7 in commit [2], but it
| > looks like something else has changed and the issue is back.
| >
| > [1] https://sourceforge.net/p/net-snmp/bugs/2411/
| > [2]
| > https://github.com/net-snmp/net-snmp/commit/793d596838ff7cb48a73b675d62897c56c9e62df
| >
| > Regards
| >
| > Josef Ridky
| > Software Engineer
| > Core Services Team
| > Red Hat Czech, s.r.o.
_______________________________________________
Net-snmp-coders mailing list
Net-snmp-coders@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/net-snmp-coders
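For illustration only, the "Unknown operation 6" message quoted in the thread can be reproduced in miniature. The sketch below is not net-snmp code: the enum names, the handler name and its return convention are all hypothetical, and only the value 6 for the resend operation comes from the log. It shows a response callback whose switch falls into a default branch for any operation it does not know, and how an explicit case for the resend notification keeps it out of that branch. It does not address the double free itself.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for callback operation codes. Only the value 6
 * for the resend operation is taken from the "Unknown operation 6" log
 * line; the other names and values are assumptions for this sketch. */
enum sketch_cb_op {
    SKETCH_OP_RECEIVED_MESSAGE = 1,
    SKETCH_OP_TIMED_OUT        = 2,
    SKETCH_OP_RESEND           = 6
};

/* Hypothetical response handler. Returns 1 when the operation was
 * recognised, 0 when it fell through to the "unknown" branch.
 * *unknown_count is incremented for every unrecognised operation. */
int sketch_got_response(int op, int *unknown_count)
{
    assert(unknown_count != NULL);

    switch (op) {
    case SKETCH_OP_RECEIVED_MESSAGE:
        /* a real handler would process the response here */
        return 1;
    case SKETCH_OP_TIMED_OUT:
        /* a real handler would clean up the timed-out request here */
        return 1;
    case SKETCH_OP_RESEND:
        /* Retransmission notice: the request is still pending, so this
         * sketch deliberately frees nothing and just acknowledges it. */
        return 1;
    default:
        fprintf(stderr, "Unknown operation %d in sketch_got_response\n", op);
        ++*unknown_count;
        return 0;
    }
}
```

Without the SKETCH_OP_RESEND case, a call with operation 6 would land in the default branch and print the warning, which matches the symptom in /var/log/messages. Whether the real handler should ignore the notification or do more work on resend is a separate question that this sketch does not answer.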