On Tue, Dec 31, 2019 at 11:44:07AM +0100, Claudio Jeker wrote:
> On Tue, Dec 31, 2019 at 11:16:37AM +0100, Martijn van Duren wrote:
> > I'm on the fence about this. So if you feel strongly about this go
> > ahead if it works.
>
> In some regard I agree but in this case I think it makes sense.
>
> > I am however somewhat confused about your description. You say it
> > times out, but considering it (by default) only retrieves 10 entries
> > per packet it should return somewhat quick and not cause a timeout.
> > If you want to move through it quicker you can increase the amount
> > retrieved per request by setting -CrX where X can be any value.
> > I reckon -Cr40 is a decent number.
>
> I doubt this will help. Looking at the code it seems that snmpd will fetch
> the full pftable from the kernel for each request which is probably where
> the inefficiency and timeout happen. Also it does a linear search over
> Florian's 300'000 addresses to pick up the next address.

I'm currently using the net-snmp tools, I haven't switched to what we
have in base, yet.

Having another look with -t 60, so it waits 60 seconds before retrying,
I get 10 answers every 2 - 5 seconds on this particular vm. At the same
time snmpd is spinning at 100% cpu and ktrace'ing for 10 seconds shows
these ioctls:

  88 SIOCGIFDESCR
  90 DIOCRGETTSTATS
 180 DIOCRGETASTATS

To traverse the 300k entries will take days at this rate.

>
> In short the pftable code and probably also the rest of the pf code in
> snmpd is not built to scale. There is no caching, no fast lookups, no
> smart paging implemented. This code needs major work to be usable.
> The rest is actually OK-ish.

I guess I have two use-cases, and maybe I'm approaching this wrong:

1) Snoop around what's available / point an auto-discover NMS at snmpd.
   I did a snmpbulkwalk host private to see what data snmpd gives me
   and eventually it timed out (when it hit the addr table). Pointing
   an auto-discover NMS at "private" will not work out.

2) Replacing munin with snmpd. I'm using prometheus & snmp_exporter and
   I naively pointed it at pfMIBObjects. That worked out on most hosts,
   but not on my two MXs, which have these huge block lists.

The workaround in both cases is of course "don't do that then". Which is
fair enough, I painstakingly arrived at a list of oids I can touch
without everything blowing up and that contain interesting data (for
me):

- pfInfo
- pfCounters
- pfStateTable
- pfIfTable
- pfTblTable
- pfLabelTable

I'm happy to put this in with Claudio's OK and Martijn's "half-OK", but
if the conclusion is that I'm doing something stupid I'm not supposed to
do that's fine, too.

Even if this code were efficient, it would still not be sensible to
export 300k prefixes this way, at least not in most cases.

>
> I think the diff by Florian is OK and should go in.
> --
> :wq Claudio

--
I'm not entirely sure you are real.
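
As a rough check on the rate quoted above, here is a back-of-the-envelope
sketch. It assumes 300,000 entries, 10 answers per response, and 2 to 5
seconds per response, as observed on this vm; the function name is made
up for illustration.

```python
def walk_time_hours(entries, per_response, seconds_per_response):
    """Estimate the wall-clock hours needed to walk `entries` table rows."""
    responses = -(-entries // per_response)  # ceiling division
    return responses * seconds_per_response / 3600


best = walk_time_hours(300_000, 10, 2)
worst = walk_time_hours(300_000, 10, 5)
print(f"{best:.1f} to {worst:.1f} hours")  # prints "16.7 to 41.7 hours"
```

Under those assumptions a full walk takes somewhere between most of a
day and nearly two days, for one table on one host.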

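
To illustrate the "caching" and "fast lookups" Claudio mentions, here is
a minimal sketch of the general idea, not snmpd's code and not a
proposed diff. It assumes plain IPv4 addresses rather than prefixes, and
TableCache, fetch and max_age are hypothetical names. The table is
fetched once, kept sorted for a short time, and the "next" entry is
found by binary search instead of a linear scan.

```python
import bisect
import ipaddress
import time


class TableCache:
    """Hypothetical cache of a sorted address snapshot (illustration only)."""

    def __init__(self, fetch, max_age=5.0, clock=time.monotonic):
        self._fetch = fetch      # callable returning an iterable of address strings
        self._max_age = max_age  # seconds a snapshot stays valid
        self._clock = clock
        self._keys = []
        self._stamp = None

    def _refresh_if_stale(self):
        now = self._clock()
        if self._stamp is None or now - self._stamp > self._max_age:
            # One expensive fetch, then sort once so lookups can bisect.
            self._keys = sorted(
                int(ipaddress.IPv4Address(a)) for a in self._fetch()
            )
            self._stamp = now

    def get_next(self, addr=None):
        """Return the address after `addr` in sorted order, or None at the end."""
        self._refresh_if_stale()
        if addr is None:
            i = 0
        else:
            key = int(ipaddress.IPv4Address(addr))
            i = bisect.bisect_right(self._keys, key)
        if i == len(self._keys):
            return None
        return str(ipaddress.IPv4Address(self._keys[i]))
```

With a snapshot like this, each "get next" costs O(log n) comparisons,
and the expensive fetch happens at most once per max_age interval rather
than once per request. The trade-off is that a walk may see data that is
up to max_age seconds old.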