Moving this to tech@. Not an answer but here's some more information.
misc thread archived at https://marc.info/?t=160469547200002&r=1&w=2
but I think the important bits are all in quotes in this mail anyway.

On 2020/11/12 20:35, Winfred Harrelson wrote:
> On Thu, Nov 12, 2020 at 09:51:46AM -0000, Stuart Henderson wrote:
> > On 2020-11-09, Winfred Harrelson <wharr...@kettering.edu> wrote:
> > > On Sat, Nov 07, 2020 at 01:53:00PM -0000, Stuart Henderson wrote:
> > >> On 2020-11-06, Winfred Harrelson <wharr...@kettering.edu> wrote:
> > >> > I am running OpenBSD 6.7 and am having a strange issue with snmpd(8).
> > >> >
> > >> > The issue is that it doesn't have all the arp entries but this was
> > >> > working before.  I don't know exactly when this started happening
> > >> > but I just noticed today.
> > >> >
> > >> > Here is the machine in question and what I get:
> > >> >
> > >> > wharrels@styx1:/home/wharrels$ uname -a
> > >> > OpenBSD styx1 6.7 GENERIC.MP#3 amd64
> > >> >
> > >> > wharrels@styx1:/home/wharrels$ arp -a | wc -l
> > >> >      985
> > >> >
> > >> > Box is acting as a firewall so that is normal.  Actually normal to
> > >> > have many more than that.  But if I do a query from another machine
> > >> > via snmpwalk I get a completely different number of machines in
> > >> > the arp table:
> > >> >
> > >> > [wharrels@newtron ~]$ snmpwalk -v2c -c public styx1 \
> > >> >     ip.ipNetToMediaTable.ipNetToMediaEntry.ipNetToMediaPhysAddress | wc -l
> > >> > 456
> > >> >
> > >> > Not even close to the same number of machines.  The above OID
> > >> > translates to 1.3.6.1.2.1.4.22.1.2 if you want to try this and see
> > >> > what you get.
> > >> >
> > >> > Should I be using a different OID to get the arp table or is there
> > >> > another way to do this?  It might be that this was not working quite
> > >> > right before but I don't remember it being off like this.
> > >> >
> > >> > Any help would be appreciated, thanks.
> > >> >
> > >> > Winfred
> > >> >
> > >> >
> > >> 
> > >> If you have set "filter-routes yes" then this is expected as it will
> > >> stop snmpd from seeing route updates and thus new additions to the
> > >> ARP table.
> > >
> > > I do not have that in my config file.  Man page says the default is "no"
> > > so this should not be it correct?  I will try adding the line with a
> > > "no" just to see if that changes anything though.
> > 
> > Correct.
> > 
> > >> If you have not then I'd say this is a bug and best reported to
> > >> bugs@ rather than misc@.
> > >
> > > I am running 6.7 on this box so I may wait until I can get it updated
> > > to 6.8 before reporting to bugs@.
> > 
> > Worth doing though I think 6.8 is unlikely to help.
> > 
> > Does restarting snmpd result in picking up the full arp table again?
> 
> Yes initially.  Unfortunately, after a while they get out of sync
> again.  Gets more out of sync longer it runs though not quickly.
> 
> I have another box running 6.4 that is also having the same issue.
> I am not going to bother complaining about that one because it is
> too old and in more of a need to be updated.  Am hoping to update
> both over the coming holidays.

I see what looks like the same with -current too. I usually have
filter-routes on most firewalls/routers to reduce cpu load (especially
if running dynamic routing protocols), so I have to ignore results
from those machines because they will definitely miss any changes
made after the initial load[1].

After hunting around I found some other machines that don't use
filter-routes and do have a mismatch between arp -an and
ipNetToMediaTable entries (I suppose the simplest query to find
these is a plain snmpwalk on ipNetToMediaPhysAddress). I usually
see either the same number of addresses or one fewer in
ipNetToMediaTable than in arp -an.

I wanted to see if I could spot anything in snmpd debug output
(snmpd -dv) when this happened, so I ran this in one terminal:

$ while true; do
    echo `snmp walk -v 2c -c public 82.68.199.132 ipNetToMediaPhysAddress | grep -c [H]ex-STRING` \
         `arp -an | grep -c :`
    sleep .5
  done

to show up discrepancies, while running "arp -da" several times in
another terminal.
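
The counts only show that the two tables differ. To pick an address to
search for in the monitor output, it helps to list which addresses are in
arp but not in snmp. A rough sketch, not something from the original
test: missing_in_snmp is a made-up helper name, and it assumes that
"arp -an" lines start with the IPv4 address and that the snmp walk lines
look like "ipNetToMediaPhysAddress.<ifIndex>.<a>.<b>.<c>.<d> = Hex-STRING: ...".

```sh
# Hypothetical helper: print addresses that appear in a saved "arp -an"
# dump ($1) but not in a saved snmp walk dump ($2).
missing_in_snmp() {
    arp_ips=$(mktemp) || return 1
    snmp_ips=$(mktemp) || return 1

    # arp dump: keep lines that contain a MAC address (a colon) and whose
    # first column looks like an IPv4 address (assumed output format).
    awk '/:/ && $1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ { print $1 }' "$1" |
        sort -u > "$arp_ips"

    # snmp dump: the last four dotted components of the OID are taken to
    # be the IPv4 address (assumed index layout: ifIndex, then address).
    awk '/Hex-STRING/ {
            n = split($1, p, ".")
            if (n >= 5)
                print p[n-3] "." p[n-2] "." p[n-1] "." p[n]
         }' "$2" | sort -u > "$snmp_ips"

    # Lines only in the first (arp) list.
    comm -23 "$arp_ips" "$snmp_ips"

    rm -f "$arp_ips" "$snmp_ips"
}
```

With "arp -an > arp.txt" and the snmp walk above redirected to snmp.txt,
"missing_in_snmp arp.txt snmp.txt" would print one candidate address per
line.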

The debug output was pretty noisy so I started again with the query
parsing lines filtered out ("snmpd -dv 2>&1 | grep -v snmpe_parse")
and did another round of arp -da; no unexpected output there.

Then I restarted again while "route -n monitor" was running, provoked
the discrepancy again, and searched monitor output for the IP address
that was missing in snmp, and compared to one that showed up.

An address showing in both snmpd and arp -an had

RTM_GET         <UP,HOST,DONE,LLINFO,CLONED>
RTM_DELETE      <HOST,DONE,CLONED>
RTM_ADD         <UP,HOST,DONE,LLINFO,CLONED>
RTM_RESOLVE     <UP,HOST,DONE,LLINFO,CLONED>

And the address only showing in arp -an had

RTM_GET         <UP,HOST,DONE,LLINFO,CLONED,CACHED>
RTM_DELETE      <UP,HOST,DONE,LLINFO,CLONED,CACHED>
(no RTM_ADD/RTM_RESOLVE)

So whatever the root cause (probably something to do with the fact that
it's a CACHED route), there doesn't appear to be a new route socket
message generated when those arp entries come back. The lack of route
socket message would certainly explain why snmpd doesn't show the arp
entry any more.
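
The comparison above can be mechanised once the messages for one address
have been boiled down to the condensed "TYPE <FLAGS>" form used in this
mail. A minimal sketch under that assumption: check_readd is a made-up
name, and the input is the hand-condensed form, not raw "route -n monitor"
output.

```sh
# Hypothetical helper: read condensed "TYPE <FLAGS>" lines for ONE address
# on stdin and report whether the last RTM_DELETE was followed by an
# RTM_ADD or RTM_RESOLVE.
check_readd() {
    awk '
        # Remember a delete and whether the deleted entry was CACHED.
        $1 == "RTM_DELETE" { deleted = 1; cached = ($2 ~ /CACHED/) }

        # A later add/resolve means the entry was announced again.
        $1 == "RTM_ADD" || $1 == "RTM_RESOLVE" { deleted = 0 }

        END {
            if (deleted) {
                msg = "no re-add after delete"
                if (cached)
                    msg = msg " (CACHED)"
                print msg
            } else {
                print "ok"
            }
        }
    '
}
```

Fed the first sequence above it prints "ok"; fed the second it prints
"no re-add after delete (CACHED)".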


> Have been emailing back and forth with Martijn van Duren and he
> is unable to recreate the issue.
> 
> > >> BTW you can see this table in a nicer output format:
> > >> 
> > >> $ snmptable -v2c -c public host ip.ipNetToMediaTable
> > >
> > > I did not know that, thanks for the info.  Doesn't look to be much
> > > different though.
> > 
> > Yes, it's just nicer formatting and pulls details from the various oids
> > that make up the snmp table in one place, it works for various other
> > tables too.
> 
> I do like this format very much and will be using it from time to time
> in the future.  May be playing around with that command some more for
> other tables and see what I can do.  :)
> 
> Winfred
> 

[1] snmpd handles route/arp/ip addresses by doing an initial fetch
of addresses/tables via sysctl. It then monitors route socket messages
for changes, and only does another full sysctl fetch if it gets an
RTM_DESYNC message, i.e. the kernel detected that the buffer was full.
Setting "filter-routes yes" has it ask the kernel not to send route
socket messages relating to route changes, but doesn't disable the
initial fetch, so you'll see a snapshot from the time snmpd started.
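
To make the footnote concrete, here is a toy model of that behaviour. It
is only an illustration, not snmpd's code: simulate_snmpd, the event names
(start, add, del, desync) and the "silent" marker for a kernel change that
produces no route socket message are all invented for this sketch. It also
assumes that no desync is acted on when filter-routes is set.

```sh
# Toy model: track the kernel table and snmpd's view of it.
# $1: "yes" to model "filter-routes yes", anything else for the default.
# stdin: one event per line: start | add IP [silent] | del IP [silent] | desync
simulate_snmpd() {
    awk -v filter="$1" '
        # Full fetch: replace the view with a copy of the kernel table.
        function fetch(   ip) {
            split("", view)
            for (ip in kern)
                view[ip] = 1
        }

        $1 == "start"  { fetch(); next }
        $1 == "desync" { if (filter != "yes") fetch(); next }

        # Kernel changes reach the view only if a message is delivered.
        $1 == "add" {
            kern[$2] = 1
            if (filter != "yes" && $3 != "silent")
                view[$2] = 1
            next
        }
        $1 == "del" {
            delete kern[$2]
            if (filter != "yes" && $3 != "silent")
                delete view[$2]
            next
        }

        END {
            k = 0; v = 0
            for (ip in kern) k++
            for (ip in view) v++
            print "kernel=" k " snmpd=" v
        }
    '
}
```

A delete followed by a "silent" add leaves the view one entry short until
a desync event forces a new fetch, which matches the off-by-one seen
above.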
