Send netdisco-users mailing list submissions to
netdisco-users@lists.sourceforge.net
To subscribe or unsubscribe via the World Wide Web, visit
https://lists.sourceforge.net/lists/listinfo/netdisco-users
or, via email, send a message with subject or body 'help' to
netdisco-users-requ...@lists.sourceforge.net
You can reach the person managing the list at
netdisco-users-ow...@lists.sourceforge.net
When replying, please edit your Subject line so it is more specific
than "Re: Contents of netdisco-users digest..."
Today's Topics:
1. Re: Fortinet FortiOS Shenanigans (Christian Ramseyer)
2. Re: Fortinet FortiOS Shenanigans (Michael Butash)
--- Begin Message ---
On 16.03.2025 12:10, Michael Butash wrote:
Thanks Christian, that doesn't seem related for me, I tried it and it
didn't seem to affect this hang. Though it's good to know, actually
I've had challenges with another customer and memory on their
fortigates. I think they're only using prometheus via api, but might
check in...
So in testing this quick, I decided to fiddle with diag debug
application snmpd, and woah - it's not simply hanging netdisco,
netdisco is blasting it over and over for the entire timeout duration
with the same request it seems. This is really odd, again only
happening on one of the 3 boxes. I'm sure the fortigate cpu doesn't
appreciate an ion cannon blast of bad requests either.
snmpd: v3 recv: get-bulk
snmpd: get-next: ifStackStatus.0.7 -> () -> 0
snmpd: get-next: ifStackStatus.0.7 -> () -> 0
snmpd: get-next: ifStackStatus.0.7 -> () -> 0
snmpd: </msg> 1
Any thoughts on why the heck it's doing THAT over and over
indefinitely until the timeout expiration?
Actually our Forti SNMP crashes presumably were also caused by infinite
loops which go OOM. Support suspected something with VDOMs and BGP MIBs,
but I'm not even sure they are on the right track there. I'd play around
with putting the device into bulkwalk_no and see if it happens in both
cases. Then I'd try to reproduce it with snmpwalk and open a Fortinet
case if the problem is not specific to Netdisco. For the unlikely case
it is, we'd need to do some packet-level poking at snmp-info to see how we
end up in this pickle.
Per your other suggestion, I tried exactly that by various fqdn or ip,
and it simply always chooses to use snmp as a general option, even
with the "only" specified for the ssh one. I commented out the other
snmp options for a test, and with those gone it simply seems to see no
viable option to try, so nada there either, ending with:
[1752339] 2025-03-16 10:54:55 debug ⬅ (info) [10.0.0.10] hooks - skipping due to incomplete job
[1752339] 2025-03-16 10:54:55 info discover: finished at Sun Mar 16 03:54:55 2025
[1752339] 2025-03-16 10:54:55 info discover: status info: skip: driver or action not applicable
Thanks again for the help!
Wait, this output is from discover? SSH only does something in arpnip or
(sometimes) macsuck. Discover still needs a regular snmp device_auth to
discover all the interfaces, IPs, neighbors etc. All that SSH provides
is an improved arp table scraping that can deal with VDOMs.
If you're still struggling with arpnip too, can you post the full output of
ND2_LOG_PLUGINS=1 netdisco-do arpnip -DQ -d <ip>
Cheers
Christian
On Sun, Mar 16, 2025 at 2:53 AM Christian Ramseyer
<ramse...@netnea.com> wrote:
Hi Mike
On 16.03.2025 06:16, Michael Butash wrote:
> I'm setting up ND for a new customer, and found a few problems I'm
> scratching my head with.
>
> First, out of 3 Fortigates, two work fine, but one I'm unable to get port
> data out of. A snmpwalk comes back fine, and librenms also polls it and sees
> all ports, but the one simply hangs ND for the 10 minute timeout, comes
> back with a weird HASH output then an error, grabs the rest of the node
> entity data, and exits normally.
>
>
Just recently we've encountered crashing SNMP agents on Fortigates and I
suspect it might be triggered by Netdisco queries. Can you check in the
Forti logs if you get something like
200: 2024-09-30 03:21:47 snmpd watchdog timeout
201: 2024-09-30 03:21:47 <16862> firmware FortiGate-600E v7.2.10,build1706b1706,240918 (GA.M) (Release)
202: 2024-09-30 03:21:47 <16862> application snmpd
203: 2024-09-30 03:21:47 <16862> *** signal 6 (Aborted) received ***
Apparently a fix for 7.2 and 7.4 should be in the works. Also a
workaround was proposed:
config system snmp sysinfo
    set append-index enable
end
Not sure if this is related to your problem, but might be worth a look.
>
> Second, I set up the ssh arp collector for FortiOS per instructions, but
> the collector simply never seems to attempt to use ssh, only using the
> snmp v3 set. I tried to set up a host_group again per github docs on
> using it for FortiOS, but nothing. I tried with and without including
> SNMP with the ssh collector bits, not sure if I should, but still I
> never see the ssh portion.
>
> Here's my deployment.yml bits, do you see anything wrong with this approach?
>
> <code>
> device_auth:
>   - tag: 'corp_fortinet'
>     # user: 'monitor-user'
>     # auth:
>     #   pass: 'secret'
>     #   proto: SHA
>     # priv:
>     #   pass: 'secret'
>     #   proto: AES
>     action: arpnip::nodes
>     only: 'group:Fortinet-Fortigate'
>     driver: cli
>     platform: FortiOS
>     username: 'monitor-user'
>     password: 'secret'
>     banner: true
>     ssh_master_opts:
>       - "-o"
>       - "StrictHostKeyChecking=no"
>   - tag: 'corp_v3authpriv_network_security'
>     user: 'monitor-user'
>     auth:
>       pass: 'secret'
>       proto: SHA
>     priv:
>       pass: 'secret'
>       proto: AES
>   - tag: 'corp_v2c_servers'
>     community: 'secret'
>     read: true
>     write: false
>
> host_groups:
>   Fortinet-Fortigate:
>     - 'platform:FortiOS'
Sorry, you can't match on platform: like that in the group, since
platform is only the name of the Perl module that is used to do the ssh
arpnip, not really a property that is materialized in the database. I'd
try matching just a single IP, see if that works, and go from there:
device_auth:
  - tag: 'corp_fortinet'
    #action: arpnip::nodes
    #only: 'group:Fortinet-Fortigate'
    only:
      - 192.168.1.42
    driver: cli
    platform: FortiOS
    username: 'monitor-user'
    password: 'secret'
    banner: true
    ssh_master_opts:
      - "-o"
      - "StrictHostKeyChecking=no"
Putting SNMP credentials into the cli stanzas is not required and won't
do anything (but also shouldn't make it fail).
Also note the bit from the doc that SSH will not enumerate multiple
credentials, "For drivers other than SNMP, only one stanza will be
tried, and it is a fatal error to have more than one stanza available
for a target device." - this has gotten various people already.
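If a named group is still wanted instead of listing addresses directly in the stanza, one possible shape is sketched below. It assumes host_groups accepts plain IP addresses as entries and reuses the 'group:' reference from the original config; 192.168.1.43 is a hypothetical second FortiGate added only for illustration.

```yaml
# Sketch only: group membership by address instead of 'platform:'.
host_groups:
  Fortinet-Fortigate:
    - 192.168.1.42   # address from the example above
    - 192.168.1.43   # hypothetical second FortiGate

device_auth:
  # Keep exactly one non-SNMP (cli) stanza per target device.
  - tag: 'corp_fortinet'
    only: 'group:Fortinet-Fortigate'
    driver: cli
    platform: FortiOS
    username: 'monitor-user'
    password: 'secret'
```

The regular SNMP stanzas stay in device_auth alongside this one, since discover still relies on them.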
Cheers
Christian
--
Christian Ramseyer, netnea ag
Network Management. Security. OpenSource.
https://www.netnea.com
Phone: +41 79 644 77 64
--- End Message ---
--- Begin Message ---
Ahh, so money! Yeah, once I found a reference on how to set bulkwalk_no
(thanks mailing list, your docs *should* really give an example of use
:)), it ran right through with no issues using getnext. Reading the docs,
it wasn't clear where to use it, and searches turn up nothing on how or
where to declare it; even chatgpt said I should jab it into the
device_auth section, but otherwise... Thank you!!
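For anyone finding this in the archive later, here is a minimal sketch of how the setting might be declared. It assumes the option in question is Netdisco's bulkwalk_no, that it takes a list of device addresses, and that it belongs at the top level of deployment.yml rather than inside device_auth. The address is simply the test device from the logs in this thread.

```yaml
# deployment.yml, top level (a sibling of device_auth, not nested in it).
# Devices listed here should be walked with GETNEXT instead of GETBULK.
bulkwalk_no:
  - 10.0.0.10
```

Scoping it to the one misbehaving box leaves bulk walks enabled for the other two Fortigates, which work fine.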
So now the question is why is ND misbehaving? There really is little
configuration difference between the working fortigate and the non-working
one, particularly nothing special around SNMP, so I have no idea why ND would
behave like this for one fortigate and not another. This seems more of an ND
problem than a fortigate one.
And yes, I'm a dork re: discover vs arpnip, I was doing discovery. Sorry
for barking up the wrong tree.
Still though, it seems to try, but fails weirdly with an error about
.libnet-openssh-perl not being secure. I wasn't really sure what part it
was considering "not secure", chatgpt seemed to think it was related to the
directory not being secure, but it's chmod 700 to netdisco only, not sure
how much more secure it wants it. I can ssh to the device normally with
that account otherwise from the server.
ND2_LOG_PLUGINS=1 ~/bin/netdisco-do arpnip -DQ -d 10.0.0.10
[2179658] 2025-03-16 20:41:55 debug //// EARLY \\\\ phase
[2179658] 2025-03-16 20:41:55 debug ⮕ worker Arpnip::Nodes p0 "prepare common data"
[2179658] 2025-03-16 20:41:55 debug //// MAIN \\\\ phase
[2179658] 2025-03-16 20:41:55 debug ⮕ worker Arpnip::Nodes p1000000
[2179658] 2025-03-16 20:41:55 debug ⬅ (info) skip: arp table data supplied by other source
[2179658] 2025-03-16 20:41:55 debug ⮕ worker Arpnip::Nodes p200
[2179658] 2025-03-16 20:41:55 debug cli session cache warm: [10.0.0.10]
## now it tries ssh
[2179658] 2025-03-16 20:41:55 error [10.0.0.10] ssh connection error [ctl_dir /opt/netdisco/.libnet-openssh-perl/ is not secure]
[2179658] 2025-03-16 20:41:55 debug ⬅ (defer) arpnip failed: could not SSH connect to 10.0.0.10
## now it goes on attempting snmp
[2179658] 2025-03-16 20:41:55 debug ⮕ worker Arpnip::Nodes p100
[2179658] 2025-03-16 20:41:55 debug snmp reader cache warm: [10.0.0.10]
What else could that refer to? There isn't much else interesting in the
debug output.
Thanks again for all your help!
-mb
--- End Message ---