Hi folks,

I've got a problem with net-snmp which upsets one of our customers. The
problem is high memory consumption. It affects net-snmp-5.0.9 and it
seems that it's still present in net-snmp-5.0.10. I would appreciate any
comments on this problem, thanks a lot...

<snip>
After further exchanges with the folks reporting this problem it would
appear that there are some terminology issues.  The complaint doesn't
actually refer to a leak, since the memory in question is still
'reachable'; it relates to the rapid consumption of resources without
any accompanying free.

If you re-run valgrind you will see that it reports a large amount of memory
being consumed over time that is never released (shown as still reachable).  Now
whilst I don't pretend to understand snmp, I have spent some hours over the last
few days trying to figure some of this out.  As a result I now have more
questions, which may or may not point at a defect.
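
To make the leak versus 'still reachable' distinction concrete, here is a
minimal C sketch of memory that grows without being freed yet is never lost.
It is not taken from net-snmp: toy_request, toy_remember, toy_pending_count
and toy_release_all are hypothetical names. Because the list head is a global,
every block still has a pointer to it at exit, which is the case valgrind
labels 'still reachable' rather than 'definitely lost'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-request record; not a net-snmp type. */
struct toy_request {
    struct toy_request *next;
    long                request_id;
};

/* The list head is a global, so every node stays reachable until exit. */
static struct toy_request *toy_pending = NULL;

/*
 * Allocate one record and link it in; nothing on this path frees it.
 * Returns 0 on success, -1 if the allocation fails.
 */
int toy_remember(long request_id)
{
    struct toy_request *rp = calloc(1, sizeof(*rp));

    if (rp == NULL)
        return -1;
    rp->request_id = request_id;
    rp->next = toy_pending;
    toy_pending = rp;
    return 0;
}

/* Count the records still held on the list. */
size_t toy_pending_count(void)
{
    size_t n = 0;
    const struct toy_request *rp;

    for (rp = toy_pending; rp != NULL; rp = rp->next)
        n++;
    return n;
}

/* Free every record: the step that is missing in the reported scenario. */
void toy_release_all(void)
{
    while (toy_pending != NULL) {
        struct toy_request *rp = toy_pending;

        toy_pending = rp->next;
        free(rp);
    }
    assert(toy_pending == NULL);
}
```

Calling toy_remember() once per trap and never calling toy_release_all()
would reproduce the shape of the report: steady growth, with nothing flagged
as lost.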

Firstly, from the code perspective, you should consider the following
functions:


#0  _sess_async_send (sessp=0x9d5efe8, pdu=0x9d5efe8, callback=0x9d5efe8,
   cb_data=0x9d5efe8) at snmp_api.c:4446
#1  0x00e1e65b in snmp_sess_async_send (sessp=0x9d29f58, pdu=0x9d5efe8,
   callback=0x9d5efe8, cb_data=0x9d5efe8) at snmp_api.c:4698
#2  0x00e1de4c in snmp_async_send (session=0x9d5efe8, pdu=0x9d5efe8,
   callback=0x9d5efe8, cb_data=0x9d5efe8) at snmp_api.c:4430
#3  0x00e1ddc6 in snmp_send (session=0x9d5efe8, pdu=0x9d5efe8)
   at snmp_api.c:4416
#4  0x007615a8 in send_trap_to_sess (sess=0x9d29fa8, template_pdu=0x9d5ef10)
   at agent_trap.c:566
#5  0x0076129d in send_enterprise_trap_vars (trap=6, specific=3414,
   enterprise=0x804a360, enterprise_length=13, vars=0xbfffdf50)
   at agent_trap.c:499
#6  0x08048e5d in main (argc=3, argv=0xbfffe484) at agt_decss7_main.c:449
(gdb)

Ignore the values displayed for callback and cb_data in frames #0, #1 and #2.
I don't know why gdb shows them, since both values are NULL; that is clear
from the source of the function in frame #3, e.g.:


int
snmp_send(netsnmp_session * session, netsnmp_pdu *pdu)
{
   return snmp_async_send(session, pdu, NULL, NULL);
}                                          |    |-----  cb_data
                                           |----------  callback
   |
   |
   V
  

int
snmp_async_send(netsnmp_session * session,
               netsnmp_pdu *pdu, snmp_callback callback, void *cb_data)
{
   void           *sessp = snmp_sess_pointer(session);
   return snmp_sess_async_send(sessp, pdu, callback, cb_data);
}


   |
   |
   V
 

int
snmp_sess_async_send(void *sessp,
                    netsnmp_pdu *pdu,
                    snmp_callback callback, void *cb_data)
{
   int             rc;

   if (sessp == NULL) {
       snmp_errno = SNMPERR_BAD_SESSION;       /*MTCRITICAL_RESOURCE */
       return (0);
   }
   rc = _sess_async_send(sessp, pdu, callback, cb_data);


   |
   |
   V


static int
_sess_async_send(void *sessp,
                netsnmp_pdu *pdu, snmp_callback callback, void *cb_data)
{
.........


Now my understanding, which may well be incorrect, is that the callback function
is used to handle a response to the trap?  Is that correct?  If so, why do we
pass in a NULL pointer for the callback function?  I know we expect a response
because:

a. I traced that with gdb

We enter the following code:


_sess_async_send(void *sessp,
                netsnmp_pdu *pdu, snmp_callback callback, void *cb_data)
{

.........


   /*
    * Add to pending requests list if we expect a response.
    */
   if (pdu->flags & UCD_MSG_FLAG_EXPECT_RESPONSE) {
       netsnmp_request_list *rp;
       struct timeval  tv;
..........
      rp->callback = callback;      <---- NULL
      rp->cb_data = cb_data;



b. I traced it with strace

An strace of the app shows us doing a send(2) but no recv(2)

An strace of snmpd shows us doing a recv(2) and then a send(2)


An strace shows that we eventually hang.  I have not checked this with crash(1),
but I suspect it is because we are not draining the socket of the snmpd
responses.

The source also suggests that we do not free the resources in the situation
where we expect a response; that would be dealt with by the callback function
(is that correct?).

So my current hypothesis is that:

o we continue to hold memory because we never employ a callback to
 consume the response, which would also result in a freeing of
 associated resources

o we eventually hang because we are not handling the response and
 therefore consume resources associated with the socket
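
The first hypothesis can be pictured with a toy model in C. Everything named
toy_* is hypothetical and this is not the net-snmp implementation. In
particular, the model assumes that an entry is unlinked and freed when its
response is processed, whether or not a callback was supplied, which I have
not confirmed against the library source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* All names below are hypothetical; this is a toy model, not net-snmp code. */
typedef int (*toy_callback)(long request_id, void *cb_data);

struct toy_request {
    struct toy_request *next;
    long                request_id;
    toy_callback        callback;   /* may be NULL, as in the traced call */
    void               *cb_data;
};

struct toy_session {
    struct toy_request *pending;        /* requests awaiting a response */
    size_t              pending_count;  /* length of the pending list */
};

/* Queue a request that expects a response. Returns 0, or -1 if out of memory. */
int toy_send(struct toy_session *s, long request_id,
             toy_callback callback, void *cb_data)
{
    struct toy_request *rp;

    assert(s != NULL);
    rp = calloc(1, sizeof(*rp));
    if (rp == NULL)
        return -1;
    rp->request_id = request_id;
    rp->callback = callback;
    rp->cb_data = cb_data;
    rp->next = s->pending;
    s->pending = rp;
    s->pending_count++;
    return 0;
}

/*
 * Handle one response: find the matching request, run its callback if
 * there is one, then unlink and free the entry. Returns 1 if a request
 * was retired, 0 if no pending request matched.
 */
int toy_process_response(struct toy_session *s, long request_id)
{
    struct toy_request **link;

    assert(s != NULL);
    for (link = &s->pending; *link != NULL; link = &(*link)->next) {
        struct toy_request *rp = *link;

        if (rp->request_id != request_id)
            continue;
        if (rp->callback != NULL)
            rp->callback(rp->request_id, rp->cb_data);
        *link = rp->next;
        free(rp);
        s->pending_count--;
        return 1;
    }
    return 0;
}
```

In this model a NULL callback is harmless as long as something calls
toy_process_response(); if nothing ever reads the responses, pending_count
only grows. Whether the library's own response path (snmp_read() and
snmp_timeout(), as far as I can tell) behaves the same way is exactly what I
would like confirmed.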


I initially wondered whether this was an app defect.  However, given the current
coding within the library, I cannot see how we could specify a callback function,
since the call flow includes:



int
snmp_send(netsnmp_session * session, netsnmp_pdu *pdu)
{
   return snmp_async_send(session, pdu, NULL, NULL);
}                                          |    |-----  cb_data
                                           |----------  callback
  

</snip>
-- 
Radek Vokál <[EMAIL PROTECTED]>
