Hi!

Yes, lots of things to consider... I have done some information gathering on CMTSes (cable modem routers), reading out several thousand parameters. In general I can say that an snmpbulkwalk (command line), or its equivalent in other languages, is MUCH faster than reading individual objects if you need several adjacent objects.

# time for ((i=1001 ; i<1025 ; i++)) ; do snmpget -v 2c -c public 192.168.8.51 IF-MIB::ifInOctets.$i; done
IF-MIB::ifInOctets.1001 = Counter32: 1105357170
...
real    0m0.386s
user    0m0.316s
sys     0m0.020s

# time snmpbulkwalk -v 2c -c public 192.168.8.51 IF-MIB::ifInOctets
IF-MIB::ifInOctets.1001 = Counter32: 1105357170
...
real    0m0.055s
user    0m0.008s
sys     0m0.008s

Above, the snmpbulkwalk even reads three extra OIDs. Obviously, CLI/bash is not the way to go when you have a need for speed; I just used it to demonstrate the difference. The waiting time is mostly in the switch anyway. I have used PHP, which is not optimal but good enough for my needs.
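A rough way to see where the difference comes from is to count request/response round trips. The sketch below is a simplified model, not a measurement. It assumes one round trip per snmpget, and that a bulk walk keeps sending GETBULK requests until a response contains one varbind past the end of the subtree. The function names are made up for illustration, and the max-repetitions value of 10 is what I believe snmpbulkwalk uses by default (it can be changed with -Cr).

```python
import math


def get_round_trips(n_oids: int) -> int:
    """One snmpget per OID means one request/response pair per OID."""
    return n_oids


def bulkwalk_round_trips(n_oids: int, max_repetitions: int = 10) -> int:
    """Simplified model of a bulk walk over a subtree holding n_oids objects.

    Each GETBULK returns up to max_repetitions varbinds, and the walk ends
    once a response contains a varbind outside the subtree, so n_oids + 1
    varbinds have to be fetched in total.
    """
    return math.ceil((n_oids + 1) / max_repetitions)


if __name__ == "__main__":
    # 24 individual gets versus a walk that returned 27 objects (24 + 3 extra).
    print(get_round_trips(24), bulkwalk_round_trips(27))
```

Under these assumptions the loop of gets needs 24 round trips (plus 24 process start-ups in the bash version), while the walk needs 3.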

If you need, for instance, most info from the ifEntry-tree, like the red ones below:

IF-MIB::ifIndex.1001 = INTEGER: 1001
IF-MIB::ifDescr.1001 = STRING: Alcatel-Lucent 1/1
IF-MIB::ifType.1001 = INTEGER: ethernetCsmacd(6)
IF-MIB::ifMtu.1001 = INTEGER: 9216
IF-MIB::ifSpeed.1001 = Gauge32: 0
IF-MIB::ifPhysAddress.1001 = STRING: e8:e7:32:2:c5:2a
IF-MIB::ifAdminStatus.1001 = INTEGER: up(1)
IF-MIB::ifOperStatus.1001 = INTEGER: down(2)
IF-MIB::ifLastChange.1001 = Timeticks: (48888100) 5 days, 15:48:01.00
IF-MIB::ifInOctets.1001 = Counter32: 1105357170
IF-MIB::ifInUcastPkts.1001 = Counter32: 159748417
IF-MIB::ifInNUcastPkts.1001 = Counter32: 0
IF-MIB::ifInDiscards.1001 = Counter32: 0
IF-MIB::ifInErrors.1001 = Counter32: 0
IF-MIB::ifInUnknownProtos.1001 = Counter32: 0
IF-MIB::ifOutOctets.1001 = Counter32: 1778801908
IF-MIB::ifOutUcastPkts.1001 = Counter32: 2041113484
IF-MIB::ifOutNUcastPkts.1001 = Counter32: 0
IF-MIB::ifOutDiscards.1001 = Counter32: 23
IF-MIB::ifOutErrors.1001 = Counter32: 0
IF-MIB::ifOutQLen.1001 = Gauge32: 0
IF-MIB::ifSpecific.1001 = OID: SNMPv2-SMI::zeroDotZero

I'd definitely recommend walking the entire ifEntry tree instead of doing several separate walks (again, CLI only for educational purposes):

snmpbulkwalk  -v 2c -c public 192.168.38.51 IF-MIB::ifEntry -m all

It will probably be much more efficient to discard the extra info (black lines above) than to do multiple walks. I think your switches/routers will agree too, but that's very vendor- and even product-dependent. The downside is a little more traffic over the network, but I'd say it's negligible.
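A minimal sketch of the "walk everything, discard the rest" idea, assuming the walk output is available as text in the format shown above. The set of wanted columns is only an example selection, and parse_walk is a hypothetical helper name, not part of any library.

```python
import re

# Columns to keep; an assumed selection for illustration only.
WANTED = {"ifDescr", "ifOperStatus", "ifInOctets", "ifOutOctets",
          "ifInErrors", "ifOutErrors"}

# Matches lines such as "IF-MIB::ifInOctets.1001 = Counter32: 1105357170".
LINE_RE = re.compile(
    r"^IF-MIB::(?P<column>\w+)\.(?P<index>\d+) = "
    r"(?:(?P<type>[\w-]+): )?(?P<value>.*)$"
)


def parse_walk(text, wanted=WANTED):
    """Return {ifIndex: {column: raw value string}} for the wanted columns."""
    table = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None or m["column"] not in wanted:
            continue  # unparsable line or a column we do not care about
        table.setdefault(int(m["index"]), {})[m["column"]] = m["value"]
    return table


SAMPLE = """\
IF-MIB::ifDescr.1001 = STRING: Alcatel-Lucent 1/1
IF-MIB::ifMtu.1001 = INTEGER: 9216
IF-MIB::ifOperStatus.1001 = INTEGER: down(2)
IF-MIB::ifInOctets.1001 = Counter32: 1105357170
IF-MIB::ifOutOctets.1001 = Counter32: 1778801908
"""

if __name__ == "__main__":
    print(parse_walk(SAMPLE))
```

The filtering happens on the management station, so the device only has to answer one walk.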

I store all values retrieved in a database (I use MySQL or Postgres, but choice is free) and then I can have the front-end pick out values whenever convenient.
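As a sketch of that storage step, the example below uses Python's built-in sqlite3 module purely as a stand-in for MySQL or Postgres. The table name, the column layout and the helper names are all assumptions made for illustration; the real schema is not described here.

```python
import sqlite3
import time


def open_store(path=":memory:"):
    """Open the database and create the (hypothetical) sample table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS if_sample (
               host        TEXT    NOT NULL,
               if_index    INTEGER NOT NULL,
               column_name TEXT    NOT NULL,
               value       TEXT    NOT NULL,
               polled_at   REAL    NOT NULL
           )"""
    )
    return conn


def store_walk(conn, host, table, polled_at=None):
    """Insert one row per (interface, column).

    `table` maps ifIndex -> {column: value}. Returns the number of rows.
    """
    polled_at = time.time() if polled_at is None else polled_at
    rows = [
        (host, if_index, column, value, polled_at)
        for if_index, columns in table.items()
        for column, value in columns.items()
    ]
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO if_sample VALUES (?, ?, ?, ?, ?)", rows)
    return len(rows)


def latest_value(conn, host, if_index, column):
    """What a front-end might run: newest stored value, or None."""
    row = conn.execute(
        "SELECT value FROM if_sample"
        " WHERE host = ? AND if_index = ? AND column_name = ?"
        " ORDER BY polled_at DESC LIMIT 1",
        (host, if_index, column),
    ).fetchone()
    return row[0] if row else None
```

Storing the poll time with every row lets the front-end compute rates from two samples later, independently of when it happens to read them.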

I even use the "bulkwalk" strategy to monitor almost 250 emergency phones spread over 28 AudioCodes phone-to-SIP concentrators for a customer. We can detect a "hook off" event in less than 5 seconds (the criterion) by using 4 or 5 parallel jobs that walk the AudioCodes units constantly. There are 24 ports in each unit, so selecting only the active ports and reading just those would have required far more time and many more parallel jobs.
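The parallel-jobs idea could be sketched roughly as follows, here with a thread pool in Python rather than PHP. Everything specific is an assumption: the function names, the 5-second timeout, and especially the subtree, since the OID that carries the hook state on the AudioCodes units is not given here (IF-MIB::ifEntry is only a placeholder).

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def walk_unit(host, community="public", subtree="IF-MIB::ifEntry"):
    """Run one snmpbulkwalk against a unit and return its raw text output."""
    result = subprocess.run(
        ["snmpbulkwalk", "-v", "2c", "-c", community, host, subtree],
        capture_output=True, text=True, timeout=5, check=True,
    )
    return result.stdout


def poll_round(hosts, walk=walk_unit, jobs=5):
    """Walk every host once using `jobs` worker threads.

    Returns {host: walk output, or the exception raised for that host}.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        futures = {host: pool.submit(walk, host) for host in hosts}
        for host, future in futures.items():
            try:
                results[host] = future.result()
            except Exception as exc:  # one dead unit must not stall the rest
                results[host] = exc
    return results
```

Calling poll_round in an endless loop gives the "walk constantly" behaviour. With 28 units and 5 workers, each worker handles 5 or 6 units per round, so the round time is roughly that many walk times rather than 28.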

There are functions in sFlow and similar that can send/push info on the amount of traffic to an sFlow server, if traffic volumes are all you're interested in. I'm often also interested in errors, queues and so on, but I could settle for sFlow-based traffic data every minute and poll errors and such less often.

/Fredrik


On 2016-05-09 at 11:25, Jurkiewicz Jean-Marc wrote:
Hi,

Like James, I wrote tools that handle several 10K variables on a 5-minute interval. I am not at all trying to discourage you in your project.

There are some concerns you should keep in mind:

Please always remember that the primary role of the network / network equipment you are trying to manage is to transport data. I mean payload data, not management data. Have you estimated/calculated the ratio of the bandwidth you will consume "just" for management? (You did not mention how many counters per router you intend to collect.)

Measurement should not bias what is measured (or only to a minimal extent).

Switches and routers are not designed to assist the management station to this extent.

SNMP is not the best method (high CPU load, low priority on the equipment) => How about NetFlow (nfsen is a wonderful free tool, if you accept a 5 min. period)?

There is a big difference between what can be done and what makes sense to do.

In any case, Get-bulk (if there are several counters per router) seems more appropriate than Traps.

There are some questions you should consider and find an answer to:

How are the time-outs and retries of your SNMP requests configured (you intend to address some "real" equipment that may respond with latency)? A two-second time-out and 3 retries don't really make sense when polling every second.

What is the impact of a "non-responding" device (a faulty or defective one), or of several faulty devices (let's say 10)? What happens to the 190 others?

What is the latency of the network that interconnects your 200 routers?

How do you "time-stamp" the collected data?

Good luck with your work.

JM

-----Original Message-----
From: Richard Mayers [mailto:richard.mayer...@gmail.com]
Sent: Sunday, 8 May 2016 12:11
To: j...@mindspring.com; net-snmp-users@lists.sourceforge.net
Subject: Re: Best practice to get counters from a huge amount of routers.

Hi,

Thanks for the replies. What about using Traps? Can I make the routers send me the counters every second? Is it hard to set up?

Kind regards,
Richard

2016-05-07 20:56 GMT+02:00 James Leu <j...@mindspring.com>:
For my day job we tried open source packages for SNMP polling like:

mrtg
cricket

They were fine for small numbers and long intervals, but will not scale to your usage. I wrote an implementation from scratch that handles 10K variables on a 5-minute interval. So it can be done.

Issues I think you will run into:

router CPU: hitting a device every second for a number of interfaces may consume too much CPU time to be practical.

storage IO: if you are successful in retrieving the data, standard DB storage, even RRD, will not be able to handle the IO load.

Things to consider:

Bulk SNMP gets
RMON
Local device scripting: I believe Cisco routers can do local TCL scripting, and Junos-based devices have local scripting as well.
Directory-based queues with small files storing samples

Good luck with your work.

On Sat, May 07, 2016 at 05:20:53PM +0200, Richard Mayers wrote:
Hi folks,

For my master's thesis I am doing a load balancing project and I have to know the link usage, if possible every second. For that I set the refresh interval to 1 second, so everything is good so far.

My problem is that I am working with big topologies and I may have 200 or more routers. If I get the counters by polling, it takes forever; I cannot poll the routers one by one, or even use threads (at some point it would not scale).

What would be the best way to get all the counters? Since I am simulating everything on a single machine I can do a trick and write the counters to a file, but that will not be useful when I test my solution in a real network.

Kind regards,
Richard

------------------------------------------------------------------------------
Find and fix application performance issues faster with Applications Manager
Applications Manager provides deep performance insights into multiple tiers of
your business applications. It resolves application problems quickly and
reduces your MTTR. Get your free trial!
https://ad.doubleclick.net/ddm/clk/302982198;130105516;z
_______________________________________________
Net-snmp-users mailing list
Net-snmp-users@lists.sourceforge.net
Please see the following page to unsubscribe or change other options:
https://lists.sourceforge.net/lists/listinfo/net-snmp-users
-- James R. Leu j...@mindspring.com

