Re: using net-snmp at scale with one process and one socket
I'm happy to share the changes I made to the library, but they were minimal. Really just: (a) adding this hook to snmp_api.c to allow response packets to be fed back in... // add hook to allow this to be called from outside void snmp_sess_receive_packet(void *sessp, u_char *packet, int length, struct sockaddr *from, int from_len) { struct session_list *slp = (struct session_list *) sessp; if(slp) { netsnmp_session *sp = slp->session; struct snmp_internal_session *isp = slp->internal; netsnmp_transport *transport = slp->transport; if(transport && transport->sock > 0 && sp && isp && isp->requests) { // The opaque is expected to be malloc'd by the transport receive fn // because it is then freed by _sess_process_packet() later. void *from2 = malloc(from_len); memcpy(from2, from, from_len); int rc = _sess_process_packet(sessp, sp, isp, transport, from2, from_len, packet, length); if(rc && sp->s_snmp_errno) SET_SNMP_ERROR(sp->s_snmp_errno); } } } (b) In snmpusm.c, declaring usm_build_probe_pdu() as a global function (by removing "static"). (c) reworking snmpusm.c to use a hash-table of lists, instead of just one (struct usmUser *) list. This was a bigger change, but it was also a _breaking_ change because there is an SNMP v3 agent table that expects a certain ordering here. I think the right compromise might be to use a tree-structure so that lookups are efficient while ordering is preserved, or better still just use a hash table and change the agent code to comb through the table any time it wants a GET_NEXT. But either way that is a big change for such an obscure benefit, and extra care would need to be taken - especially in the presence of multiple threads. Unfortunately the other code I wrote to interface between the rest of the application and the library is too intertwined with product code to share just as it is. But I'd be happy to discuss choices if someone were to look at this. For now I can share this psuedo code outline below, which is probably more helpful anyway. Notes: (1) I used a red-black-tree for the retry-timeout collection when it might have been more appropriate to use a binary-heap(?) (2) I had to maintain separate hash lookups keyed by reqid and msgid so that processing SNMPv3 responses could work the same way as for SNMPv2. For v3 responses the msgid appears unencrypted, while the reqid only emerges after decryption is successful. (3) Since a response might come in for a request that has already been retried the request-pdu needs to keep the list of all attempts made so far. Those requests are kept in an ordered list by the pdu and only deleted when the pdu life-cycle is complete. (4) It might have worked better to make the Request object be the callback magic pointer, rather than the more persistent "Target" object, which then needs yet another hash table lookup to get to the original request that matches the pdu. I think this was to make sure it was OK to delete a request at any time, but in practice the lifecycle is clear so maybe there is one indirection too many in this scheme. The cast of characters here are: snmp - one persistent object representing this wrapper \_target - one mostly-persistent object for each remote target IP \_pdu - query and response objects |_request - individual attempt // the callback function given to the library. Called after snmp_timeout() or snmp_sess_receive_packet() SNMP_callback(int op, netsnmp_session *ss, int reqid, netsnmp_pdu *pdu, void *magic) { target = (target)magic; req = removeFromHashTable(target->requestsByReqid, pdu->reqid) // process Pdu/error and call back to client // now we can finally clean up for(req in pdu->requests) { // make sure req is removed from all 5 collections: removeFromList(pdu->requests, req) removeFromTree(snmp->timeoutTree, req) removeFromHashTable(target->requestsByReqid, pdu->reqid) removeFromHashTable(snmp->requestsByReqId, req->reqid) removeFromHashTable(snmp->requestsByMsgId, req->msgid) delete(req) } } // sending a request SNMP_sendRequest(pdu, target) { long reqid = snmp_sess_async_send(target->sessp, pdu, SNMP_callback, target) req = newRequest(target, pdu, reqid) // req goes in up to 5 collections: addToList(pdu->requests,req) addToHashTable(target->requestsByReqId, reqid) addToTree(snmp->timeoutTree, req) addToHashTable(snmp->requestsByReqId, reqid, req); if(pdu->version == 3) addToHashTable(snmp->requestsByMsgId, pdu->msgid, req); } // this is called every 100mS or so SNMP_timeout(snmp) { while((req = timedOutRequest(snmp)) != NULL) { removeReqFromTimeoutTree(req); snmp_sess_timeout(req->target->sessp); } } // this is called when a response packet appears on the socket as a result of select() or poll() SNMP_response(snmp, socket, msg) { Request req=NULL; long version=0,msgid=0,reqid=0;
Re: using net-snmp at scale with one process and one socket
Neil Mckee writes: > You are very welcome. Let me know if you want to discuss any of the > particulars. Well, the biggest question in my mind is: are you willing to contribute the patches back? It's unclear if any of them break backwards compatability (which we always strive for), but certainly many of them would not. > Adding or importing your favorite tree implementation to the library > would be a lot of new code Well, there's one issue: lots of new code means bigger applications and we do strive to be small to run on small devices. That being said, the sliding definition of "small" has certainly increased in size over time with the continued increase of storage/memory availability in even the smallest devices. -- Wes Hardaker Please mail all replies to net-snmp-coders@lists.sourceforge.net ___ Net-snmp-coders mailing list Net-snmp-coders@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/net-snmp-coders
Re: using net-snmp at scale with one process and one socket
You are very welcome. Let me know if you want to discuss any of the particulars. For example, it seems like snmpusm.c is also trying to please the SNMPv3 user MIB by getting everything in the right order for GETNEXT there. So my use of a hash-table breaks that, but that's OK for me because I am only using the library as a client. I did wonder if using something like a red-black tree with a carefully crafted sort function would allow efficient lookup without breaking the ordering, but it was hard to make the logic turn out exactly the same as it is now (e.g. I think it would require great care to make sure an entry is removed and added again if it is ever modified in situ). Adding or importing your favorite tree implementation to the library would be a lot of new code, but could also be used to store outstanding requests sorted by monotonic-clock "due" time, which helps to streamline the timeout mechanism when there are hundreds/thousands of concurrent requests. On balance, however, it doesn't seem like there is a case for rocking the boat - especially when most client use-cases (perl, python, snmpwalk etc.) make only one request in the life cycle of a session. Plus it looks like you now have to consider multiple threads too, so using more complex data-structures is hard. I guess if you decided to carve out a single-session, single-threaded client API then it would be more tunable, but as long as the package is also an agent and is also multi-threaded then it seems like your hands are tied? I certainly won't object if nothing changes. On Mon, Oct 18, 2021 at 10:30 AM Wes Hardaker wrote: > > Neil Mckee writes: > > > This may have been covered many times before, but just in case it > > helps someone, here is a summary of my experience with using net-snmp > > in a large network. > > Thank you for the very useful list of changes. > -- > Wes Hardaker > Please mail all replies to net-snmp-coders@lists.sourceforge.net ___ Net-snmp-coders mailing list Net-snmp-coders@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/net-snmp-coders
Re: using net-snmp at scale with one process and one socket
Neil Mckee writes: > This may have been covered many times before, but just in case it > helps someone, here is a summary of my experience with using net-snmp > in a large network. Thank you for the very useful list of changes. -- Wes Hardaker Please mail all replies to net-snmp-coders@lists.sourceforge.net ___ Net-snmp-coders mailing list Net-snmp-coders@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/net-snmp-coders
using net-snmp at scale with one process and one socket
This may have been covered many times before, but just in case it helps someone, here is a summary of my experience with using net-snmp in a large network. Specifically, I use it to talk to thousands of SNMPv2 and SNMPv3 agents using just one UDP socket in a process that runs continuously. The net-snmp library uses linear linked lists for sessions, requests and usmuser data. That would normally be OK, but does not scale so well for this use-case. So I had to make a few small changes: (1) I still create a separate netsnmp-session for each remote agent IP, but force them all to use the same UDP socket. (2) I turn off the v3 engine-id probe hooks so I can send and process those requests manually. (3) I do my own select() using hash tables keyed by reqid and msgid to map back to the right netsnmp-session. There is no blocking. (4) I do my own timeout check using a time-sorted tree. (5) I changed the library to expose this function so I can feed response PDUs back to the library directly: extern void snmp_sess_receive_packet(void *sessp, u_char *packet, int length, struct sockaddr *from, int from_len); (6) I exposed this function to help construct engine-id probes: extern int usm_build_probe_pdu(netsnmp_pdu **pdu); (6) I changed the userList in snmplib/snmpusm.c to be a hash-table of userLists keyed by engine-id (but the logic within each list is otherwise unchanged). These were the smallest changes I could make. There don't seem to be any significant bottlenecks now. Neil P.S. I was also tempted to set SNMPV3_IGNORE_UNAUTH_REPORTS=1 because otherwise the library delivers an unexpected extra callback after a failed engine-id probe, but it seemed better to harden the logic to handle unexpected callbacks gracefully. ___ Net-snmp-coders mailing list Net-snmp-coders@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/net-snmp-coders