Hi,

I ran into an insidious agent (snmpd) hang on Linux that had me stumped for 
quite a while. The hang occurred when walking mib-2, and always during the 
walk of TCP-MIB::tcpConnTable. At first I thought this might be related to my 
build of v5.7.3 'snmpd' (see below), but I eventually discovered that the cause 
was that the system the agent was running on did not have netlink socket 
diagnostics configured in the kernel. When this is the case the netlink 
(TCPDIAG_GETSOCK) request is answered with an error message (NLMSG_ERROR); the 
v5.7.3 version of agent/mibgroup/mibII/tcpTable.c does not handle this case 
and its receive loop ends up waiting forever. 

I've enclosed a patch that illustrates one way of fixing this. It simply logs 
an error and then returns from the loop when this case is encountered. The 
additional check for a non-zero error code isn't strictly needed here, since 
the original netlink message doesn't request an ACK (netlink ACKs are returned 
as NLMSG_ERROR message types, but with a zero error code). However, I've left 
this check in for completeness. I've also removed a redundant assignment to 
the 'r' pointer.
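
As a rough illustration of the handling described above (this is not the 
enclosed patch, and not the actual tcpTable.c code), here is a minimal sketch 
of scanning one received buffer of netlink messages. The function name 
scan_netlink_reply, the SCAN_* result codes and the use of fprintf() in place 
of the agent's logging are assumptions made for this example; only the 
NLMSG_* macros and struct nlmsgerr come from <linux/netlink.h>.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <linux/netlink.h>

/* Hypothetical result codes for this sketch (not Net-SNMP names). */
enum scan_result { SCAN_ERROR = -1, SCAN_MORE = 0, SCAN_DONE = 1 };

/*
 * Walk one buffer of netlink messages as filled in by recv().
 * Returns SCAN_DONE on NLMSG_DONE, SCAN_ERROR on an NLMSG_ERROR that
 * carries a non-zero error code, and SCAN_MORE when the buffer is
 * exhausted and the caller should recv() again.
 * *rows is incremented once per ordinary (data) message.
 */
static int scan_netlink_reply(void *buf, int len, int *rows)
{
    struct nlmsghdr *h = (struct nlmsghdr *)buf;

    while (NLMSG_OK(h, len)) {
        if (h->nlmsg_type == NLMSG_DONE)
            return SCAN_DONE;

        if (h->nlmsg_type == NLMSG_ERROR) {
            struct nlmsgerr *err = (struct nlmsgerr *)NLMSG_DATA(h);

            if (h->nlmsg_len < NLMSG_LENGTH(sizeof(*err))) {
                fprintf(stderr, "netlink: truncated error message\n");
                return SCAN_ERROR;
            }
            if (err->error != 0) {
                /* The kernel reports failures as negative errno values. */
                fprintf(stderr, "netlink: request failed: %s\n",
                        strerror(-err->error));
                return SCAN_ERROR;
            }
            /* error == 0 is an ACK, not a failure: skip it. */
        } else {
            (*rows)++;  /* a real caller would decode the payload here */
        }
        h = NLMSG_NEXT(h, len);
    }
    return SCAN_MORE;
}
```

A caller would keep calling recv() and passing each buffer to this function 
until the result is something other than SCAN_MORE, so that an error reply 
ends the loop instead of leaving it waiting for an NLMSG_DONE that never 
arrives.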

Regards,
-David

* I mentioned that I originally thought this was an issue related to the build. 
I still think there are issues with the new 'libnl3' support that was added in 
v5.7.3. I'm not an 'autoconf' expert so I won't go into details about it, but I 
can describe at a high level what I think is wrong. In my case I'm performing a 
cross-platform build of Net-SNMP with an installed cross toolchain. This cross 
toolchain has both libnl2 and libnl3 installed. As such, the include files for 
the latter version are located in the <cross-toolchain location>/include/libnl3 
directory. There are a couple of things to consider here:
1. It should be enough to check for the presence of the cross toolchain's 
...include/netlink/netlink.h or ...include/libnl3/netlink/netlink.h files to 
determine whether netlink (this is really libnl and not the OS netlink API, 
btw) is available.
2. If 'libnl3' is available (using the -lnl-3 mechanism in 'configure' seems 
fine for this determination) then the *cross toolchain's* ...include/libnl3 
include directory should be added to the CPPFLAGS and 
EXTERNAL_MIBGROUP_INCLUDES variables (and *not* the build system's include 
directory, which is currently '/usr/include/libnl3').
    I would have tried to patch configure for this but unfortunately I'm not 
sufficiently savvy re: autoconf to do this. Hopefully, this is something one of 
the authors can do reasonably quickly?

Attachment: netlink-fix.patch
Description: netlink-fix.patch

_______________________________________________
Net-snmp-coders mailing list
Net-snmp-coders@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/net-snmp-coders
