Frank (and other readers),

Today I did some more testing with the new code, and everything appears to work as expected. I also ran a test (10 threads, 10 devices) in which the authoritative engine ID of each device is unique: the problem also occurs with the old software, whereas with the new software I have not seen any issues so far.

My conclusion is that the single instance of the USM is not well suited to multithreaded use (on an 8-core system); the workaround you suggested seems to overcome the problem.

Thanks for the help. If any new conclusions come up over the next few weeks, I'll keep you posted.

Regards,
Sjoerd
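
For readers who want to see the effect in isolation, here is a minimal sketch in plain Java (no SNMP4J on the classpath) of the behaviour Frank describes in this thread: a process-wide registry keeps only the most recently added security model for a given model ID, whereas a registry owned by each session keeps that session's own model. `Model`, `ModelRegistry`, and `RegistryDemo` are hypothetical names invented for this illustration and are not SNMP4J classes; the only real-world value is 3, the security model number assigned to USM. It is a simplified model of the idea, not a description of SNMP4J internals.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a security model such as the USM. */
final class Model {
    final int id;        // security model number (3 for USM)
    final String owner;  // label of the session that created it

    Model(int id, String owner) {
        this.id = id;
        this.owner = owner;
    }
}

/** Hypothetical stand-in for a registry that holds one model per model ID. */
class ModelRegistry {
    private static final ModelRegistry SHARED = new ModelRegistry();
    private final Map<Integer, Model> models = new HashMap<>();

    /** Process-wide instance, analogous to a singleton registry. */
    static ModelRegistry shared() {
        return SHARED;
    }

    /** Adding a model under an ID that is already present replaces the old entry. */
    synchronized void add(Model model) {
        models.put(model.id, model);
    }

    synchronized Model get(int id) {
        return models.get(id);
    }
}

final class RegistryDemo {
    static final int USM_ID = 3;

    public static void main(String[] args) {
        // Two sessions register their model in the shared registry.
        ModelRegistry.shared().add(new Model(USM_ID, "session-1"));
        ModelRegistry.shared().add(new Model(USM_ID, "session-2"));
        // Only the latest registration survives.
        System.out.println(ModelRegistry.shared().get(USM_ID).owner); // prints "session-2"

        // With one registry per session, each session sees its own model.
        ModelRegistry first = new ModelRegistry();
        ModelRegistry second = new ModelRegistry();
        first.add(new Model(USM_ID, "session-1"));
        second.add(new Model(USM_ID, "session-2"));
        System.out.println(first.get(USM_ID).owner);  // prints "session-1"
        System.out.println(second.get(USM_ID).owner); // prints "session-2"
    }
}
```

In this model, the workaround discussed in the thread corresponds to the second half of `main`: every session gets its own registry instead of sharing the process-wide one.
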
--- On Thu, 3/5/09, Sjoerd van Doorn <[email protected]> wrote:

From: Sjoerd van Doorn <[email protected]>
Subject: Re: [SNMP4J] Resent (full code now): Why is the USM a singleton ?
To: "Frank Fock" <[email protected]>
Cc: "SNMP4j" <[email protected]>
Date: Thursday, March 5, 2009, 8:21 PM

Frank,

Thanks a lot, I think this did the trick. I still have some testing to do, but the initial tests show improvement.

I was able to reproduce the problem by creating 5 Snmp instances to the same device and getting some SNMP variables from it in 5 different threads (this time on a SPARC machine with 2 cores); I saw a lot of timeouts and other sorts of errors. Then I modified my code (see snippets below) and ran the test again without errors.

Please feel free to comment on my modifications. (I know it is not "clean" to throw RuntimeException, but I wanted to avoid modifying the API ;-)

Thanks and best regards,
Sjoerd.

==== new class

    public class MultiThreadedSecurityModels extends SecurityModels {
        private static SecurityModels instance = null;

        public MultiThreadedSecurityModels() {
            super();
        }

        public synchronized static SecurityModels getInstance() {
            throw new RuntimeException("Cannot get instance in this object !");
        }
    }

=== Method for init

    protected Snmp initSnmpAndSecurity( ) throws IOException {
        Snmp snmp = new Snmp(new DefaultUdpTransportMapping());
        USM usm = new USM(SecurityProtocols.getInstance(),
                          new OctetString(MPv3.createLocalEngineID()), 0);
        UsmUser user = new UsmUser(_agent.securityName,
                                   _agent.authProtocol, _agent.authPassphrase,
                                   _agent.privProtocol, _agent.privPassphrase);
        usm.addUser(_agent.securityName, user);
        MultiThreadedSecurityModels mtm = new MultiThreadedSecurityModels();
        mtm.addSecurityModel(usm);
        MessageProcessingModel mpm = snmp.getMessageProcessingModel(MPv3.ID);
        if ( mpm instanceof MPv3 )
            ((MPv3)mpm).setSecurityModels(mtm);
        else
            throw new RuntimeException("Wowsers, this is impossible");
        return snmp;
    }

--- On Wed, 3/4/09, Frank Fock <[email protected]>
wrote:

From: Frank Fock <[email protected]>
Subject: Re: [SNMP4J] Resent (full code now): Why is the USM a singleton ?
To: [email protected]
Cc: "SNMP4j" <[email protected]>
Date: Wednesday, March 4, 2009, 5:23 PM

Sjoerd,

If you have an existing architecture that uses sync requests, it is OK to keep it. Creating a Snmp instance creates a socket (very expensive relative to other things) and a thread (needs up to 1 MB of memory with the default stack size). A single-threaded sync approach is of course not an option.

Back to the problem: There are several USM instances (you create them by calling the Snmp constructor); however, by default the singleton SecurityModels is used, which holds only the latest USM created by the Snmp instances. I agree that this is surprising and should be changed. I will change that for the next release, to allow better control of how many USMs are used and when they are created.

But you do not have to wait until then. You can use MPv3.setSecurityModels to set your own subclass of SecurityModels to be used by the particular MPv3 instance.

Before taking the above approach, you will have to make sure that each MPv3 and USM pair uses its own unique engine ID.

Best regards,
Frank

Sjoerd van Doorn wrote:
> Frank,
> Rewriting to ASYNC would mean a redesign of an implementation that has run OK for over a year (meaning a very big risk I would like to avoid). The only difference we have is that we switched to a new system (from a dual-core SPARC to a 2x4 = 8-core Intel machine).
> Since this new machine was installed we get the issues as described.
> I am aware that context switches are expensive; however, I can live with that.
> I do not understand why creating the SNMP object would be very expensive, besides the socket creation.
> I did a test with a single-threaded approach and it takes about 4 seconds to complete all gets and sets on all elements. I'm afraid this time will be too long when scaling up to 50 or more devices.
> The engine ID that is non-unique is the engine ID of the network element, and I cannot control these.
> At this moment, I am NOT looking for a performance increase, but for a fix to avoid these errors that have arisen since I installed a more parallel hardware system. I suspect the global table in combination with the non-unique authoritative engine ID to be the cause.
> As far as I can see, there is only one USM for all SNMP objects, and within this USM the user and time tables are stored indexed on the (in my case non-unique) authoritative engine ID. What am I missing?
> Regards,
> Sjoerd
>
> --- On Tue, 3/3/09, Frank Fock <[email protected]> wrote:
>
> From: Frank Fock <[email protected]>
> Subject: Re: [SNMP4J] Resent (full code now): Why is the USM a singleton ?
> To: [email protected]
> Cc: "SNMP4j" <[email protected]>
> Date: Tuesday, March 3, 2009, 11:07 PM
>
> Hello Sjoerd,
>
> First, for 200+ elements using a single thread with async
> response processing will be sufficient. With 15 threads
> each creating a Snmp instance, you waste resources and
> probably run into problems with port allocation on your
> system. The thread context switches are also expensive,
> unless you have 16 CPUs.
>
> The USM is NOT a singleton! The problem with your code
> is that you have 15 (or more) Snmp instances with the
> same engine ID. Just create a different engine ID for
> each instance.
>
> BTW, creating a Snmp instance is expensive. I would
> always share an instance if possible.
>
> Best regards,
> Frank
>
> Sjoerd van Doorn wrote:
> >
> > --- On Tue, 3/3/09, Sjoerd van Doorn <[email protected]> wrote:
> >
> > From: Sjoerd van Doorn <[email protected]>
> > Subject: Why is the USM a singleton ?
> > To: "SNMP4j" <[email protected]>
> > Date: Tuesday, March 3, 2009, 10:35 PM
> >
> > Hello all,
> > I'm working on an issue and I suspect that the fact that the USM is a singleton is one of the reasons I'm having problems.
> > Can someone explain why there is not an instance of the USM for every MPv3 instance?
> > My problem is having timeouts, usmStatsNotInTime, usmUnknownEngineId and MessageException (1404) every now and then.
> > I'm in a network with 200+ elements and I have already seen that the authoritative engine ID of the elements is not unique; however, I cannot have them changed for my purpose (I am aware that this is against RFC 3414).
> > The issues show when querying approx. 15 devices in parallel (multithreaded). (I'll post a snippet at the end of my mail.)
> > I suspect the internal administration of the USM is broken due to the fact that the engine ID is non-unique, and that this is causing my errors.
> > After analysing the code for a couple of days and going through the previous posts, I can see more people are having these kinds of problems; however, I could not find any solution.
> > I'm thinking that changing the USM from a singleton to an instance per MPv3 could solve the problem, but I can't really see why it was designed as a singleton from the beginning. Here is my code (executed by 15 threads in parallel in synchronous mode).
> >
> > private final Snmp4jAgent _agent;
> > private final String _requestType;
> >
> > protected SnmpCommand(Snmp4jAgent agent, String requestType){
> >     _agent = agent;
> >     _requestType = requestType;
> >     initSecurityModels(createUSM());
> > }
> >
> > protected void initSecurityModels( USM usm ){
> >     SecurityModels.getInstance().addSecurityModel(usm);
> > }
> >
> > protected USM createUSM(){
> >     return new USM(SecurityProtocols.getInstance(),
> >                    new OctetString(MPv3.createLocalEngineID()), 0);
> > }
> >
> > protected PDU createRequest(){
> >     final PDU request = new ScopedPDU();
> >     request.setType(PDU.getTypeFromString(_requestType));
> >     request.setMaxRepetitions(15);
> >     request.setNonRepeaters(0);
> >     return request;
> > }
> >
> > protected Target createTarget(){
> >     final UserTarget target = new UserTarget();
> >     target.setSecurityLevel(_agent.securityLevel);
> >     target.setSecurityName(_agent.securityName);
> >     target.setVersion(_agent.version);
> >     target.setAddress(_agent.udpAddress);
> >     target.setRetries(_agent.retries);
> >     target.setTimeout(_agent.timeoutInSeconds * 1000);
> >     return target;
> > }
> >
> > protected Snmp createSnmp() throws IOException{
> >     Snmp snmp = new Snmp(new DefaultUdpTransportMapping());
> >     UsmUser user = new UsmUser(_agent.securityName,
> >                                _agent.authProtocol, _agent.authPassphrase,
> >                                _agent.privProtocol, _agent.privPassphrase);
> >     snmp.getUSM().addUser(_agent.securityName, user);
> >     return snmp;
> > }
> >
> > public PDU execute() throws IOException{
> >     Snmp snmp = null;
> >     try{
> >         snmp = createSnmp();
> >         final List<VariableBinding> results = new ArrayList<VariableBinding>();
> >         snmp.listen();
> >         final PDU request = createRequest();
> >         request.add(new VariableBinding(_oid));
> >         final Target target = createTarget();
> >         ResponseEvent responseEvent = snmp.send(request, target);
> >         if (responseEvent.getPeerAddress() == null){
> >             throw new IOException("No response received");
> >         }
> >         PDU response = responseEvent.getResponse();
> >         if (response == null){
> >             log.error("SNMP GetNextCommand :: response==null");
> >             return null;
> >         }
> >         _agent.check(response);
> >         VariableBinding binding = response.get(0);
> >         OID checker = binding.getOid();
> >         while (binding.getOid().leftMostCompare(_oid.size(), _oid) == 0){
> >             results.add(binding);
> >             request.set(0, new VariableBinding(binding.getOid()));
> >             responseEvent = snmp.send(request, target);
> >             response = responseEvent.getResponse();
> >             if (response == null){
> >                 throw new IOException("Timeout occurred");
> >             }
> >             binding = response.get(0);
> >             // check oid duplicated ( last one ...
> >             if (checker.equals(binding.getOid())){
> >                 // if there is only one remove the last one ... bogus
> >                 if (results.size() == 1){
> >                     results.clear();
> >                 }
> >                 log.debug("loopy checking the same");
> >                 break;
> >             }
> >             checker = binding.getOid();
> >         }
> >         response.clear();
> >         for (int i = 0; i < results.size(); i++){
> >             response.add((VariableBinding) results.get(i));
> >         }
> >         return response;
> >     }
> >     finally{
> >         // always close the snmp connection
> >         if (snmp != null){
> >             snmp.close();
> >         }
> >     }
> > }
> >
> > _______________________________________________
> > SNMP4J mailing list
> > [email protected]
> > http://lists.agentpp.org/mailman/listinfo/snmp4j
>
> -- AGENT++
> http://www.agentpp.com
> http://www.mibexplorer.com
> http://www.mibdesigner.com
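
The termination logic of the walk loop quoted above (stop when the returned OID leaves the requested subtree, or when the agent returns the same OID again) can be sketched without SNMP4J as follows. This is a simplified model under stated assumptions: OIDs are plain `int[]` arrays, the agent is a hypothetical `UnaryOperator<int[]>` that returns the next OID or `null`, and `WalkSketch`, `inSubtree`, and `walk` are names invented for this illustration. Unlike the quoted code, it does not discard the single collected result when a repeat is detected. It requires Java 9 or later for `Arrays.compare`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;
import java.util.function.UnaryOperator;

final class WalkSketch {
    /** Lexicographic OID order: element by element, a shorter prefix sorts first. */
    static final Comparator<int[]> OID_ORDER = Arrays::compare;

    /** True if oid begins with every sub-identifier of root. */
    static boolean inSubtree(int[] root, int[] oid) {
        return oid.length >= root.length
            && Arrays.equals(root, Arrays.copyOf(oid, root.length));
    }

    /**
     * Collects successive OIDs returned by the (hypothetical) agent while they
     * stay below root. Stops early if the agent returns the same OID twice in
     * a row, which would otherwise loop forever.
     */
    static List<int[]> walk(UnaryOperator<int[]> agent, int[] root) {
        List<int[]> results = new ArrayList<>();
        int[] current = agent.apply(root);
        while (current != null && inSubtree(root, current)) {
            results.add(current);
            int[] next = agent.apply(current);
            if (next != null && Arrays.equals(next, current)) {
                break; // agent is not advancing
            }
            current = next;
        }
        return results;
    }

    public static void main(String[] args) {
        // A tiny in-memory "MIB"; higher() plays the role of GETNEXT.
        TreeSet<int[]> mib = new TreeSet<>(OID_ORDER);
        mib.add(new int[] {1, 3, 6, 1, 2, 1, 1, 1, 0});
        mib.add(new int[] {1, 3, 6, 1, 2, 1, 1, 2, 0});
        mib.add(new int[] {1, 3, 6, 1, 2, 1, 2, 1, 0});

        int[] root = {1, 3, 6, 1, 2, 1, 1};
        for (int[] oid : walk(mib::higher, root)) {
            System.out.println(Arrays.toString(oid));
        }
    }
}
```

Running `main` lists the two OIDs under 1.3.6.1.2.1.1 and stops at the first OID outside that subtree.
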
