RFC: removal of the NETSNMP_DS_LIB_ALARM_DONT_USE_SIG feature

Bart Van Assche Sat, 19 Dec 2009 01:09:32 -0800

Hello,

As you know, libnetsnmp supports time-based alarms via
snmp_alarm_register(), run_alarms() and related functions. libnetsnmp
supports two different ways to trigger run_alarms():
1. By making sure that the timeout argument of select() is small enough such
that select() returns before the next alarm must be handled (when the
variable NETSNMP_DS_LIB_ALARM_DONT_USE_SIG is set to one, which is the
default).
2. By making sure that the kernel fires SIGALRM at the time when
run_alarms() should be called (when the variable
NETSNMP_DS_LIB_ALARM_DONT_USE_SIG is set to zero, which has to be configured
explicitly).
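
To illustrate approach (1), here is a minimal, self-contained sketch in
plain POSIX C. It does not use the libnetsnmp API: struct demo_alarm,
demo_run_alarms() and demo_event_loop() are hypothetical stand-ins, and the
50 ms alarm interval is arbitrary. The only point is that the select()
timeout is derived from the time remaining until the next alarm, so the
alarm callback runs in ordinary (non-signal) context.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <sys/select.h>
#include <time.h>

/* Hypothetical stand-in for one registered alarm; not the libnetsnmp type. */
struct demo_alarm {
    struct timespec due;        /* absolute monotonic time at which it fires */
    void (*callback)(void);
    int pending;
};

static int demo_fired;          /* counts callback invocations */
static void demo_callback(void) { demo_fired++; }

/* Nanoseconds from now until the alarm is due; 0 if it is already overdue. */
static long long demo_ns_until(const struct demo_alarm *a)
{
    struct timespec now;
    long long ns;

    clock_gettime(CLOCK_MONOTONIC, &now);
    ns = (long long)(a->due.tv_sec - now.tv_sec) * 1000000000LL
       + (a->due.tv_nsec - now.tv_nsec);
    return ns > 0 ? ns : 0;
}

/* Run the alarm if it is due. Called from the event loop, never from a
 * signal handler, so the callback may use any library function. */
static void demo_run_alarms(struct demo_alarm *a)
{
    if (a->pending && demo_ns_until(a) == 0) {
        a->pending = 0;
        a->callback();
    }
}

/* Register one alarm 50 ms ahead and loop until it has fired.
 * Returns the number of callback invocations. */
int demo_event_loop(void)
{
    struct demo_alarm a;

    clock_gettime(CLOCK_MONOTONIC, &a.due);
    a.due.tv_nsec += 50 * 1000000L;
    if (a.due.tv_nsec >= 1000000000L) {
        a.due.tv_sec++;
        a.due.tv_nsec -= 1000000000L;
    }
    a.callback = demo_callback;
    a.pending = 1;
    demo_fired = 0;

    while (a.pending) {
        struct timeval timeout;
        /* Round up to whole microseconds so select() does not wake early. */
        long long us = (demo_ns_until(&a) + 999) / 1000;

        timeout.tv_sec = (time_t)(us / 1000000);
        timeout.tv_usec = us % 1000000;
        /* A real loop would also pass the SNMP socket descriptors here. */
        select(0, NULL, NULL, NULL, &timeout);
        demo_run_alarms(&a);
    }
    return demo_fired;
}
```

If select() returns early (for example because of EINTR), the loop simply
recomputes the remaining time and waits again.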


The following issues are associated with the second approach:
1. Alarm functions are used inside Net-SNMP to e.g. refresh cached table
contents. As far as I can see there is nothing in the Net-SNMP source code
that prevents the following from happening: a table refresh triggered via
SIGALRM while a row is being removed from a cached table. This can result in
dangling pointer dereferences and even a crash.
2. POSIX restricts signal handlers to calling functions that are either
reentrant or non-interruptible (
http://www.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_04.html#tag_02_04).
Standard I/O functions like printf() and fprintf() are neither reentrant nor
non-interruptible. run_alarms() is called from inside a signal handler,
which means that this restriction applies to the function run_alarms()
itself and all functions called by it (which includes the alarm callback
functions). In other words, functions such as snmp_log() and their callers
must not be called from inside run_alarms() when it is invoked from inside a
signal handler. This is a severe restriction, and one that is hard to work with.
3. Not all software developers know how to make sure that signal delivery
works correctly in a multithreaded context. POSIX does not guarantee to
which thread a signal like SIGALRM will be delivered, unless that signal has
been blocked before thread creation and is unblocked after thread creation
(see also
http://www.opengroup.org/onlinepubs/009695399/functions/pthread_sigmask.html).
This is relevant for the Net-SNMP project not only because a worker thread
is created inside agent/mibgroup/if-mib/data_access/interface_linux.c but
also because libnetsnmp is often used inside multithreaded software.
Currently no attempt is made to make sure that SIGALRM is processed by the
Net-SNMP event processing loop thread. If SIGALRM is processed by another
thread, this will result in one or more data races.
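
For issue (3), the block-before-create, unblock-after-create sequence could
look roughly like the sketch below. All demo_* names are hypothetical and
nothing here is taken from the Net-SNMP sources. The handler only sets a
flag, so that it stays within the restriction described in issue (2).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t demo_alarm_seen;  /* written by the handler only */
static int demo_worker_blocked;                /* written by worker, read after join */
static pthread_mutex_t demo_gate = PTHREAD_MUTEX_INITIALIZER;

/* Handler restricted to setting a flag of type volatile sig_atomic_t. */
static void demo_on_alarm(int sig)
{
    (void)sig;
    demo_alarm_seen = 1;
}

/* Worker: inherits the creator's signal mask, so SIGALRM is blocked here. */
static void *demo_worker(void *arg)
{
    sigset_t cur;

    (void)arg;
    pthread_sigmask(SIG_BLOCK, NULL, &cur);    /* query only: set is NULL */
    demo_worker_blocked = sigismember(&cur, SIGALRM);
    pthread_mutex_lock(&demo_gate);            /* stay alive until released */
    pthread_mutex_unlock(&demo_gate);
    return NULL;
}

/* Returns 1 if the worker had SIGALRM blocked and the handler ran,
 * 0 if not, and -1 if the worker thread could not be created. */
int demo_route_sigalrm(void)
{
    struct sigaction sa;
    sigset_t alrm;
    pthread_t tid;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = demo_on_alarm;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGALRM, &sa, NULL);

    sigemptyset(&alrm);
    sigaddset(&alrm, SIGALRM);

    pthread_sigmask(SIG_BLOCK, &alrm, NULL);   /* 1. block before creation */
    pthread_mutex_lock(&demo_gate);
    if (pthread_create(&tid, NULL, demo_worker, NULL) != 0) {
        pthread_mutex_unlock(&demo_gate);
        pthread_sigmask(SIG_UNBLOCK, &alrm, NULL);
        return -1;
    }
    pthread_sigmask(SIG_UNBLOCK, &alrm, NULL); /* 2. unblock in this thread only */

    /* Process-directed signal, as a timer would generate. This thread is
     * the only one with SIGALRM unblocked, so it receives the signal. */
    kill(getpid(), SIGALRM);

    pthread_mutex_unlock(&demo_gate);
    pthread_join(tid, NULL);
    return demo_worker_blocked == 1 && demo_alarm_seen == 1;
}
```

Note that this only helps for threads created after the mask was set up. A
library cannot enforce this for threads the application has created itself.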

Because of all the difficulties associated with processing alarms from
inside a signal handler function, and because fixing these would require
more effort than it is worth, I propose to remove this feature from the
Net-SNMP code
base and to always use approach (1), whether or not
NETSNMP_DS_LIB_ALARM_DONT_USE_SIG has been set.

Any feedback is welcome.

Bart.
