On Fri, Oct 28, 2022 at 01:22:57PM +0200, Martijn van Duren wrote:
> I wondered that as well, but I tried to simulate the not found and
> error code-paths, but I couldn't trigger it. So I'm not ruling it
> out, I just can't reproduce it.
>
> Another thing that's weird is that it looks like the index has been
> stripped from sensorStatus, which might be an indication that
> something weird is going on inside libagentx. But like I said:
> without a reproducer I haven't been able to pin it down.
>
> So the additional verbose information should be useful.
> Come to think of it: The `sysctl hw.sensors` output might be
> helpful as well, both on a succeeding run, as well as at the time
> of the crash (maybe something like:
> `while true; do date; sysctl hw.sensors; sleep 1; done > \
> /path/to/output`)

As the offending machines are VMs, hw.sensors actually returns nothing.
I will send you the output for the entire 'hw' key, and the log output
from snmpd -vv the next time the issue occurs.

It does seem to coincide with librenms's discovery process, which comes
from librenms upstream as this cron job (on a linux machine):

33 */6 * * * librenms /opt/librenms/cronic /opt/librenms/discovery-wrapper.py 1

So, it is the one job running every ~6 hours, which would match up with
when snmpd is dying on these OpenBSD 7.2 VMs. I still have 30+ VMs on
<7.2 that are OK. Any physical machines I've upgraded to 7.2 are only
at home, not at $WORKPLACE where librenms lives.

Not trying to be noisy, just hoping to narrow down the actual cause :)

Thanks for the hints!

Regards,
-Ryan

>
> @Ryan: this info doesn't need to be on the list, so feel free to
> send it to me in private if you want.
>
> On Fri, 2022-10-28 at 11:08 +0100, Stuart Henderson wrote:
> > I wonder if there are any sensors which disappear and reappear..
> >
> > On 2022/10/28 10:01, Martijn van Duren wrote:
> > > Could you run snmpd with `-vv`? That way I also have the specific
> > > OIDs being requested and returned (both frontend and backend) and
> > > might make it a little more easy to reproduce.
> > >
> > > Do note that this adds at least 4 log lines for every request
> > > issued to snmpd, so your logfile might explode a bit.
> > >
> > > martijn@
> > >
> > > On Thu, 2022-10-27 at 14:08 -0700, Ryan Freeman wrote:
> > > > On Thu, Oct 27, 2022 at 01:46:21PM -0700, Ryan Freeman wrote:
> > > > > Hello,
> > > > > After upgrading some virtual machines to OpenBSD 7.2, I started
> > > > > noticing snmpd dying approx every 6 hours on the upgraded machines.
> > > > >
> > > > > Oct 27 13:14:33 mirror snmpd[98795]: AgentX(1268939451/2580462718):
> > > > > 2506302838
> > > > > iso.org.dod.internet.private.enterprises.openBSD.sensorsMIBObjects.sensors.sensorTable.sensorEntry.sensorStatus:
> > > > > oids not equal
> > > > > Oct 27 13:14:33 mirror snmpd[98795]: AgentX(1268939451/2580462718):
> > > > > Closing: Too many parse errors
> > > > > Oct 27 13:14:33 mirror snmpd[98795]: AgentX(1268939451/2580462718):
> > > > > Closed by snmpd (Too many AgentX parse errors from peer)
> > > > > Oct 27 13:14:33 mirror snmpd_metrics[88325]: [fd:0 sess:2580462718
> > > > > ctx:<default>]: unsupported call: agentx-Close-PDU
> > > > > Oct 27 13:14:33 mirror snmpd[98795]: AgentX(1268939451): Connection
> > > > > reset by peer
> > > > > Oct 27 13:14:33 mirror snmpd[98795]: snmpe: AgentX(1268939451):
> > > > > disappeared unexpected
> > > > >
> > > > > The message is always the same, it tends to be around 1:20am, 7:20am,
> > > > > 1:20pm, 7:20pm
> > > > > I have a script set to check "rcctl ls failed" and notify if
> > > > > something has failed.
> > > > >
> > > > > LibreNMS is used to scrape the snmpd instances on the affected VMs.
> > > > >
> > > > And, forgot to include the snmpd.conf, apologies. here it is with minor
> > > > changes values only:
> > > > # $OpenBSD: snmpd.conf,v 1.2 2021/08/08 13:43:10 sthen Exp $
> > > >
> > > > # See snmpd.conf(5) for more options (tcp, alternative ports, trap
> > > > listener)
> > > > listen on 127.0.0.1
> > > >
> > > > user "changed" auth hmac-sha1 authkey "randomstuff" enc aes enckey
> > > > "morerandomstuff"
> > > >
> > > > # Adjust the local system information
> > > > system contact "Systems Team ([email protected])"
> > > > #system location "Rack A1-24, Room 13"
> > > >
> > > > # Required by some management software
> > > > system services 74
> > > >
> > > > LibreNMS then scrapes it using snmpv3 and authPriv mode.
> > > > no core file is being dropped by snmpd
> > > >
> > > > -Ryan
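
To make the timing comparison above concrete, here is a rough Python sketch that lists when a `33 */6 * * *` cron entry fires on a given day and how long after the most recent fire a logged event happened. The helper names (`cron_fire_times`, `minutes_since_last_fire`) are made up for illustration, and it assumes the LibreNMS host and the OpenBSD VMs share a timezone and have synced clocks, which the thread does not confirm.

```python
from datetime import datetime

def cron_fire_times(minute, hour_step, day):
    """Return the fire times of a 'MIN */STEP * * *' cron entry on one day."""
    return [day.replace(hour=h, minute=minute, second=0, microsecond=0)
            for h in range(0, 24, hour_step)]

def minutes_since_last_fire(event, fires):
    """Minutes between an event and the latest fire at or before it.

    Returns None if the event precedes every fire in the list.
    """
    earlier = [f for f in fires if f <= event]
    if not earlier:
        return None
    return (event - max(earlier)).total_seconds() / 60

# Discovery job from the crontab line: minute 33, every 6th hour.
day = datetime(2022, 10, 27)
fires = cron_fire_times(33, 6, day)

# Timestamp of the quoted "oids not equal" log line.
# ASSUMPTION: same timezone as the host running cron.
crash = datetime(2022, 10, 27, 13, 14, 33)

for f in fires:
    print(f.strftime("%H:%M"))
print(minutes_since_last_fire(crash, fires))
```

If the gap between a discovery start and the crash stays roughly constant across several crashes, that would support the discovery run as the trigger; a varying gap would point elsewhere.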
