I have narrowed down the cause of my problem quite a bit. It looks like the
"select()" call within the "agent_check_and_process()" routine (in module
snmp_agent.c) leaves a bad address value in the "Sessions" linked list. Once
this address is accessed, my code core dumps.
Here is a quick summary of what I found:
In my "subagent.c" file, the "agent_check_and_process()" routine is called
with "1" as its parameter, which asks it to "block forever" (see below):
    while (keep_running) {
        /* if you use select(), see snmp_select_info() in snmp_api(3) */
        /* --- OR --- */
        agent_check_and_process(1); /* 0 == don't block */
    }
Here is the "agent_check_and_process()" routine:
    int
    agent_check_and_process(int block)
    {
        int             numfds;
        fd_set          fdset;
        struct timeval  timeout = { LONG_MAX, 0 }, *tvp = &timeout;
        int             count;
        int             fakeblock = 0;

        numfds = 0;
        FD_ZERO(&fdset);
        snmp_select_info(&numfds, &fdset, tvp, &fakeblock);
        if (block != 0 && fakeblock != 0) {
            /*
             * There are no alarms registered, and the caller asked for
             * blocking, so let select() block forever.
             */
            tvp = NULL;
        } else if (block != 0 && fakeblock == 0) {
            /*
             * The caller asked for blocking, but there is an alarm due
             * sooner than LONG_MAX seconds from now, so use the modified
             * timeout returned by snmp_select_info as the timeout for
             * select().
             */
        } else if (block == 0) {
            /*
             * The caller does not want us to block at all.
             */
            tvp->tv_sec = 0;
            tvp->tv_usec = 0;
        }

        count = select(numfds, &fdset, 0, 0, tvp);

        if (count > 0) {
            /*
             * packets found, process them
             */
            snmp_read(&fdset);
        } else
            switch (count) {
            case 0:
                snmp_timeout();
                break;
            case -1:
                if (errno != EINTR) {
                    snmp_log_perror("select");
                }
                return -1;
            default:
                snmp_log(LOG_ERR, "select returned %d\n", count);
                return -1;
            }                   /* endif -- count>0 */

        /*
         * Run requested alarms.
         */
        run_alarms();

        netsnmp_check_outstanding_agent_requests();

        return count;
    }
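
To make the three timeout branches of this routine easier to reason about, here is a minimal sketch that isolates just that decision. The function name "pick_select_timeout" is hypothetical and is not part of Net-SNMP; it only mirrors the if/else chain quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/*
 * Hypothetical helper (not a Net-SNMP function): returns the timeout
 * pointer that the routine above would pass to select().
 *   block != 0, fakeblock != 0  ->  NULL (select() blocks forever)
 *   block != 0, fakeblock == 0  ->  tvp unchanged (an alarm is due)
 *   block == 0                  ->  tvp zeroed (poll, do not block)
 */
struct timeval *
pick_select_timeout(int block, int fakeblock, struct timeval *tvp)
{
    assert(tvp != NULL);

    if (block != 0 && fakeblock != 0) {
        return NULL;
    }
    if (block == 0) {
        tvp->tv_sec = 0;
        tvp->tv_usec = 0;
    }
    return tvp;
}
```

With "block" non-zero and "fakeblock" equal to 0, the sketch returns the timeout unchanged, so select() would be expected to return 0 once that timeout expires even if no packet arrives.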
Before (and after) the "snmp_select_info()" routine executes, I displayed the
addresses in the "Sessions" linked list and verified that they are
valid-looking addresses (which is good):
slp = 0x1032fcb8 (first linked list element)
slp = 0x1032f288 (second linked list element)
After the call to the "snmp_select_info()" routine completes, the following
variable values are displayed:
"block" is not 0
"fakeblock" is 0
After the call to "select()" completes, I verified that the "Sessions" linked
list now contains an invalid address in its first element:
slp = 0x30000 (first linked list element)
As a result, once the "snmp_timeout()" routine is eventually called, a field at
this invalid address is accessed, which causes the core dump.
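
For reference, the kind of list dump used to collect the "slp" values can be sketched as follows. This is only an illustration: "struct demo_node" and "dump_list" are hypothetical stand-ins, not the real Net-SNMP session list type or any library function.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for a singly linked list element. */
struct demo_node {
    struct demo_node *next;
};

/*
 * Prints the address of each element and returns how many were visited.
 * Each address is printed before the node is dereferenced, so a bogus
 * pointer shows up in the output before it can cause a crash.
 * max_nodes bounds the walk in case the list has been corrupted.
 */
size_t
dump_list(const char *label, const struct demo_node *head, size_t max_nodes)
{
    const struct demo_node *p = head;
    size_t n = 0;

    while (p != NULL && n < max_nodes) {
        fprintf(stderr, "%s: slp = %p\n", label, (const void *) p);
        n++;
        p = p->next;
    }
    return n;
}
```

Calling such a helper immediately before and after each suspect call, for example dump_list("after select", head, 16), narrows down which call changes the list.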
It seems "select()" is used to check for arriving packets, or at least to
wait for a packet to arrive when none exist yet. Since I am not really
performing any SNMP requests, I am not sure why "select()" is not blocking
(i.e., waiting for a packet) for me. I really do not understand what this has
to do with the "Sessions" linked list either.
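
For context, "select()" is a POSIX system call (documented in the select(2) man page) rather than a Net-SNMP routine. A minimal sketch of its behavior on a single descriptor is shown here; the wrapper name "wait_readable" is hypothetical. It returns 0 when the timeout expires with nothing to read, a positive count when the descriptor is readable, and -1 on error.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: waits up to sec/usec for fd to become readable
 * and returns select()'s result unchanged.
 */
int
wait_readable(int fd, long sec, long usec)
{
    fd_set          fdset;
    struct timeval  tv;

    /* select() can only track descriptors below FD_SETSIZE. */
    assert(fd >= 0 && fd < FD_SETSIZE);

    FD_ZERO(&fdset);
    FD_SET(fd, &fdset);
    tv.tv_sec = sec;
    tv.tv_usec = usec;

    /* First argument is the highest descriptor number plus one. */
    return select(fd + 1, &fdset, NULL, NULL, &tv);
}
```

On an empty pipe with a zero timeout this returns 0 immediately, which matches the "case 0" path that leads to "snmp_timeout()" in the routine quoted earlier.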
Is this a NetSNMP bug which results in the "Sessions" linked list being
corrupted somehow?
Perhaps someone can help me out, since I really do not know where to go from
here. I do not understand what the "Sessions" linked list is used for, and I
do not understand what the "select()" routine really does (I cannot find it
anywhere).