Hi David,

SNMP4J does not implement "transmit pacing" because (pure) Java gives you no information about UDP buffer usage. Slowing down the sending of messages in general
would not be acceptable for 99% of users and use cases.

To control burst sending of messages, many large network management applications use their own centrally implemented throttle mechanisms, which apply mostly to trap-directed polling (for example, in a thunderstorm situation when many network elements
send alarm traps at once).
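
As a rough illustration of such a throttle, an application could route every send through a token bucket. This is a sketch only and not part of SNMP4J; the class name SendPacer, its methods, and its parameters are made up for this example.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/**
 * Hypothetical token-bucket pacer (not SNMP4J API): permits short bursts,
 * then limits sends to a steady rate.
 */
final class SendPacer {
    private final double tokensPerNano; // refill rate
    private final double capacity;      // maximum burst size
    private final LongSupplier clock;   // nanosecond clock, injectable for tests
    private double tokens;
    private long lastRefill;

    SendPacer(double sendsPerSecond, int burst) {
        this(sendsPerSecond, burst, System::nanoTime);
    }

    SendPacer(double sendsPerSecond, int burst, LongSupplier nanoClock) {
        if (sendsPerSecond <= 0 || burst < 1) {
            throw new IllegalArgumentException("rate must be > 0 and burst >= 1");
        }
        this.tokensPerNano = sendsPerSecond / 1e9;
        this.capacity = burst;
        this.clock = nanoClock;
        this.tokens = burst;
        this.lastRefill = nanoClock.getAsLong();
    }

    /**
     * Non-blocking check. Returns 0 and consumes one token if a send may
     * proceed now; otherwise returns the estimated nanoseconds to wait.
     */
    synchronized long tryAcquire() {
        long now = clock.getAsLong();
        tokens = Math.min(capacity, tokens + (now - lastRefill) * tokensPerNano);
        lastRefill = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return 0L;
        }
        return (long) Math.ceil((1.0 - tokens) / tokensPerNano);
    }

    /** Blocks the calling thread until a send is permitted. */
    void acquire() throws InterruptedException {
        long waitNanos;
        while ((waitNanos = tryAcquire()) > 0) {
            TimeUnit.NANOSECONDS.sleep(waitNanos);
        }
    }
}
```

A caller would invoke acquire() on a shared SendPacer instance immediately before each call that sends a PDU. Suitable rate and burst values depend on the system and have to be found by measurement.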

SNMP4J-Model, a high-level API for client-side SNMP development that is currently under development, will provide means to centrally control load on transport ports
and can be used to implement such "transmit pacing".

Best regards,
Frank

On 14.10.2014 12:50, David Corbin wrote:
TL;DR: Does SNMP4J provide "transmit pacing" for UDP?  Does it handle
Linux's "EPERM" error when buffers overflow? (By handle I mean beyond
throwing an exception and failing.)

I'm going to start with some background.  We have a large, complex (overly
so) system that monitors some "stuff" using SNMP4J.  It generally works.  I
have an integration test suite that drives our system.  We use a dedicated
machine with dozens of IP addresses that is programmed to respond to SNMP
requests in a way that the integration test suite expects.  For the
purposes of this conversation, there's one specific test class with 11
tests, and when I run it against our production code all of its tests
pass very consistently.  Our complex system depends on Mule and JMS.  Most
if not all SNMP requests being made are sent through JMS to an
SnmpExecutor service.  That Mule service in turn calls a Java class
(SnmpQueryExecutor) to synchronously resolve the request for SNMP data (the
SNMP request is asynchronous, but the Java code has its own wait for the
answer or timeout) before proceeding.  The JMS client blocks waiting for
the SNMP request to complete (or time out) before continuing.

In an attempt to simplify our complex system, I made a refactoring (on some
execution paths) to call the Java class directly, bypassing the JMS and
Mule part.  The integration tests now fail intermittently.  There are about
4 tests that sometimes fail.  The nature of the failures is also not
consistent.  Initially, one of them would fail on most runs (4 out of 5).
After I reduced the code paths that use the new code to exactly one,
this dropped to about 1 failure every 3 or 4 runs.  I've learned that I
can make the test pass reliably by adding a 1 second delay to the new
code path.  This change was for investigative purposes only.  It suggests
that there is a race condition or some type of failure that is related to
doing too much SNMP too fast, and since Mule and JMS add a fair amount of
overhead, it has probably been masked for some time.

After a lot of work, I was able to discover that some of the failures are
caused by this exception:
java.io.IOException: Operation not permitted
     at java.net.PlainDatagramSocketImpl.send(Native Method)
     at java.net.DatagramSocket.send(DatagramSocket.java:676)
     at org.snmp4j.transport.DefaultUdpTransportMapping.sendMessage(DefaultUdpTransportMapping.java:117)
     at org.snmp4j.transport.DefaultUdpTransportMapping.sendMessage(DefaultUdpTransportMapping.java:42)
     at org.snmp4j.MessageDispatcherImpl.sendMessage(MessageDispatcherImpl.java:198)
     at org.snmp4j.MessageDispatcherImpl.sendPdu(MessageDispatcherImpl.java:498)
     at org.snmp4j.util.MultiThreadedMessageDispatcher.sendPdu(MultiThreadedMessageDispatcher.java:127)
     at org.snmp4j.Snmp.sendMessage(Snmp.java:1004)
     at org.snmp4j.Snmp.send(Snmp.java:974)
     at org.snmp4j.Snmp.send(Snmp.java:958)
     ....

While digging for information about this, I found this thread
https://github.com/typesafehub/play-plugins/issues/64 which suggests at the
end that this error happens on Linux systems when the network buffers get
full. (Yes, I'm developing on a Linux system).  Digging for more
information about that, I found this thread
http://compgroups.net/comp.protocols.tcp-ip/udp-socket-sendto-eperm/2624182,
where they talk about UDP and "transmit pacing".  Essentially they say it's
the programmer's responsibility to not send UDP packets too fast.  Seems
reasonable.

So, my assumption is that the UDPTransport should be doing this.    I did
look through some of the code, and while I did not see anything that would
do this, that doesn't mean it's not there.  Is it?  Is the IOException
(EPERM) causing "transmit pacing" or even normal retries of UDP to not work?
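
One application-level way to cope with this exception, independent of what the transport mapping does, is to retry the send after a short, growing delay. The following is a minimal sketch under assumptions: BackoffSender, IoAction, and sendWithBackoff are hypothetical names, not SNMP4J API, and the send action stands in for whatever call ends up in Snmp.send.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not SNMP4J API): retries a send with exponential backoff. */
final class BackoffSender {

    /** Stand-in for the real send call; may fail with an IOException such as EPERM. */
    interface IoAction {
        void run() throws IOException;
    }

    /**
     * Runs the action until it succeeds or maxAttempts is reached.
     * The delay doubles after every failed attempt.
     *
     * @return the number of attempts that were needed
     * @throws IOException the last failure if all attempts fail
     */
    static int sendWithBackoff(IoAction send, int maxAttempts, long initialDelayMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1 || initialDelayMillis < 0) {
            throw new IllegalArgumentException("maxAttempts >= 1 and delay >= 0 required");
        }
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                send.run();
                return attempt;
            } catch (IOException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up and report the last failure
                }
                TimeUnit.MILLISECONDS.sleep(delay);
                delay *= 2;
            }
        }
    }
}
```

This only reacts after the send has already failed; it does not prevent the buffers from filling, so it complements rather than replaces limiting the send rate.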

More information:
One test that fails (and the most SNMP-active one, I think) makes 5 SNMP
requests to 3 different IP addresses.  The requests are all simple GET
operations, totalling 8 OIDs in all.  Some of these requests are made in
parallel by different threads.

I'm very open to any other suggestions people want to make as to why this
change would cause this behavior.  All help appreciated.

David Corbin
_______________________________________________
SNMP4J mailing list
[email protected]
https://oosnmp.net/mailman/listinfo/snmp4j

--
---
AGENT++
Maximilian-Kolbe-Str. 10
73257 Koengen, Germany
https://agentpp.com
Phone: +49 7024 8688230
Fax:   +49 7024 8688231
