Send netdisco-users mailing list submissions to
        netdisco-users@lists.sourceforge.net

To subscribe or unsubscribe via the World Wide Web, visit
        https://lists.sourceforge.net/lists/listinfo/netdisco-users
or, via email, send a message with subject or body 'help' to
        netdisco-users-requ...@lists.sourceforge.net

You can reach the person managing the list at
        netdisco-users-ow...@lists.sourceforge.net

When replying, please edit your Subject line so it is more specific
than "Re: Contents of netdisco-users digest..."
Today's Topics:

   1. Re: Netdisco 2 - port name problem (Steven Xu)
--- Begin Message ---
Hi Edward,

I don't have a networking background, so, unfortunately, I can't help look at your device SNMP config if it is indeed the cause of your problems. I was simply speculating as a layman.

Note that in Netdisco V1, as far as I know, it only discovered one device at a time unless you specifically wrote custom code otherwise. In Netdisco V2, it queries multiple devices at the same time. The problem may have only become visible when querying multiple devices at a time.

Steven

-----Edward Vopata <vop...@ti.com> wrote: -----
To: Steven Xu <stev...@yorku.ca>
From: Edward Vopata <vop...@ti.com>
Date: 02/09/2016 05:42PM
Cc: <netdisco-users@lists.sourceforge.net>
Subject: Re: [Netdisco] Netdisco 2 - port name problem

Okay.  I'm going to purge my data and set up to add only about 15 devices and see what happens.
This should make the log fairly reasonable.

I don't understand how my SNMP configuration could be causing issues:
    - When I do a discover on a single device,  it works.
    - The same snmp configuration is NOT causing an issue with the old NetDisco 1 instance.
I have attached a sample of my snmp config for a device.
 
I have also attached the snmp config section from my deployment.yml


-- Ed.


On 02/09/16 15:45, Steven Xu wrote:
Hi Edward,

I'm rather stumped. For all devices that Netdisco knows about, "discoverall" simply adds a discover job for those that aren't already in the queue. The single discover runs the discover job (same code) without adding it to the queue.

Maybe this is the interaction of several devices on your test subnet? Try limiting your discover to a single device. Configuration:

discover_only: <ip>

With a single device, your logs should be considerably smaller. You can further reduce the size by running
cat ~/logs/netdisco-daemon.log | grep -vE "(mgr|sched)"

I would also propose that it may have to do with the SNMP configuration on your devices since it seems no one else is having your problem.

Steven

-----Edward Vopata <vop...@ti.com> wrote: -----
To: Steven Xu <stev...@yorku.ca>
From: Edward Vopata <vop...@ti.com>
Date: 02/09/2016 11:44AM
Cc: <netdisco-users@lists.sourceforge.net>
Subject: Re: [Netdisco] Netdisco 2 - port name problem


I have been running this for a few days.
It looks like the problem is somewhere in the "discoverall" code.

I ran without the discoverall (with just macwalk and arpwalk scheduled) and  the problem did NOT appear.

So, I enabled the discoverall schedule and the problem is back.

I then tried to limit the discover to a single subnet with "log: debug",
but the output is overwhelming (74 MB of logs covering 2 hours' worth of runtime).
I think I need to be able to limit the logging to only log for a set of IP addresses.
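
One way to get that effect after the fact is to filter the existing log down to the lines that mention the devices of interest. This is only a minimal sketch: it assumes the daemon log is plain text with the device IP written out on each relevant line, and the function name `lines_for_devices` is made up for illustration (it is not part of Netdisco).

```python
import re
from typing import Iterable, Iterator, Set

# Matches dotted-quad IPv4 addresses that are not part of a longer digit run.
IPV4 = re.compile(r"(?<!\d)(?:\d{1,3}\.){3}\d{1,3}(?!\d)")


def lines_for_devices(lines: Iterable[str], wanted: Set[str]) -> Iterator[str]:
    """Yield only the log lines that mention one of the wanted IP addresses.

    Assumption: the log is plain text and the device IP appears literally
    on the lines we care about. Lines without any wanted IP are dropped.
    """
    for line in lines:
        # Compare whole addresses, so 10.255.32.7 does not match 10.255.32.73.
        if wanted.intersection(IPV4.findall(line)):
            yield line
```

It could be driven with something like `for line in lines_for_devices(open(path), {"10.255.32.73"}): print(line, end="")`, where `path` points at the daemon log. Matching whole addresses rather than substrings avoids pulling in neighbouring devices whose IP merely starts with the same digits.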
    

Why does the single discover fix the problem (ie: netdisco-do -DIQS  discover -d 10.255.32.73),
but the discoverall schedule corrupt my data (sometimes)?

-- Ed.



On 02/05/16 10:31, Steven Xu wrote:
Hi Edward,

I noticed that your scheduling is a slightly different time than your previous schedule. If you see no problems after running this discover, it may have to do with the time.

I recommend you set the log configuration to debug for this scheduled discover. If you include your logs, put them on pastie.org because they can get quite large.
log: debug

Steven

-----Edward Vopata <vop...@ti.com> wrote: -----
To: Steven Xu <stev...@yorku.ca>
From: Edward Vopata <vop...@ti.com>
Date: 02/05/2016 10:15AM
Cc: <netdisco-users@lists.sourceforge.net>
Subject: Re: [Netdisco] Netdisco 2 - port name problem

Okay.  I have set "discover_only" to a single /24 subnet.

I have set the schedule as follows:

    schedule:
      macwalk:
        when:
          min: 20
      arpwalk:
        when:
          min: 50
      expire:
        when: '20 23 * * *'


I forced a discover of the target list of devices that are exhibiting the ifIndex issue.

I ran the above schedule for a full day, and the ifIndex issue has NOT reappeared.

So, I am enabling the discoverall scheduling:

    discoverall:
        when: '1 9 * *

And we will see what happens.
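
A quick sanity check for the string form of "when" can be sketched as below. It assumes these strings follow the standard five-field crontab layout (minute, hour, day of month, month, day of week), as the expire entry does; the helper name `looks_like_crontab` is hypothetical and is not a Netdisco function. The discoverall string as typed in this message shows only four fields, which may simply be a copy-and-paste slip in the email.

```python
def looks_like_crontab(expr: str) -> bool:
    """Return True if expr has exactly five whitespace-separated fields.

    Assumption: the schedule string uses the standard crontab layout
    (minute, hour, day of month, month, day of week). This checks the
    field count only, not whether each field holds a valid value.
    """
    return len(expr.split()) == 5


# Strings as they appear in this message.
expire_when = "20 23 * * *"
discoverall_when = "1 9 * *"

print(looks_like_crontab(expire_when))
print(looks_like_crontab(discoverall_when))
```

This only counts fields, so it will not catch an out-of-range minute or hour.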

-- Ed.


On 02/04/16 07:47, Steven Xu wrote:
Hi Edward,

My suggestion in the last email was to run a manual discover, disabling the scheduled jobs and checking the port names later in the day. This won't solve the problem, but at least we might be able to identify what causes it.

Whoops, it hadn't dawned on me that arpnip and macsuck would fail when the port names are wrong.

Steven

-----Edward Vopata <vop...@ti.com> wrote: -----
To: Steven Xu <stev...@yorku.ca>
From: Edward Vopata <vop...@ti.com>
Date: 02/03/2016 05:24PM
Cc: <netdisco-users@lists.sourceforge.net>
Subject: Re: [Netdisco] Netdisco 2 - port name problem



> I mean, you should enable those "Display Columns" on the right sidebar of the web interface when looking at ports for a device.

This doesn't solve the problem of the port switching.   Also, once the port switches, updates (arp and mac) start failing because the
port does not match.


> If this machine is not in production, it's possible you could manually run a discover, disable the daemon or limit them to a single subnet and check back on the port names later. If doing this stops the port names changing to the numbers, then one of the scheduled jobs is incorrectly changing your data.

The system is NOT production, I just need some hints on what to try.

> number of workers

   I have 2 x 8 core CPUs ==> 16 cores
   So, I'm at "4 * AUTO"?

    I don't think I am stressing my system at this point.
    I am just trying to get it to work..
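
The arithmetic behind that guess, as a small sketch. It reads "N * AUTO" as N workers per CPU core, which is how the "10 * AUTO" (10 workers * 4 cores) example in this thread describes it; the function names are made up for illustration.

```python
def worker_count(per_core: int, cores: int) -> int:
    """Total workers when "N * AUTO" is read as N workers per CPU core."""
    return per_core * cores


def per_core_equivalent(workers: int, cores: int) -> float:
    """The "N" in "N * AUTO" that a fixed worker count corresponds to."""
    return workers / cores


# Figures quoted in this thread: a fixed setting of 64 on 16 cores,
# and "10 * AUTO" on a 4-core machine.
print(per_core_equivalent(64, 16))
print(worker_count(10, 4))
```

On these numbers the fixed setting of 64 on 16 cores is the same as 4 per core, while the "10 * AUTO" setup is 10 per core on fewer cores.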

> There also isn't much difference between your version of netdisco and the latest.

The App::Netdisco (version 2.033005) showed up today, so I installed it.
I would like to report that this version did NOT resolve the issue.

-- Ed.


On 02/03/16 14:51, Steven Xu wrote:
Hi Edward,

I mean, you should enable those "Display Columns" on the right sidebar of the web interface when looking at ports for a device.

If this machine is not in production, it's possible you could manually run a discover, disable the daemon or limit them to a single subnet and check back on the port names later. If doing this stops the port names changing to the numbers, then one of the scheduled jobs is incorrectly changing your data.

I can't exactly comment on the number of workers that you're using other than we use "10 * AUTO" (10 workers * 4 cores). There also isn't much difference between your version of netdisco and the latest.

Steven

-----Edward Vopata <vop...@ti.com> wrote: -----
To: Steven Xu <stev...@yorku.ca>
From: Edward Vopata <vop...@ti.com>
Date: 02/03/2016 02:10PM
Cc: <netdisco-users@lists.sourceforge.net>
Subject: Re: [Netdisco] Netdisco 2 - port name problem


> I recommend, when in the web UI, at least enabling the description or name fields to at least mitigate the problem if you haven't already.

I don't understand what you mean here?


> Are you running Netdisco V1 and V2 on the same server? Have you ensured they are properly set up separately from another?

No.  My NetDisco V1 is running on a different server with its own database.
Both versions are running the same version of SNMP::Info.


> How many workers have you set up?
 
workers:
  task: 64

dns:
  max_outstanding: 80


FYI:  I upgraded to the latest netdisco 2 version:

    Hostname      : netdisco2
    OS            : Ubuntu 14.04.3 LTS
    Perl          : v5.18.2
    App::NetDisco : 2.033005
    DB Schema     : 40
    SNMP::Info    : 3.31
    Apache        : 2.4.7
    Net-SNMP      : 5.7.2
    PostgreSQL    : 9.3.10


I am game to try anything.  I am using the netdisco 1 as my production Netdisco.

     I suggest
        - totally disabling the nbtstat job.
        - maybe disabling the macsuck job.
        - leave the arpwalk job. 
        - add  discover_only  and arpwalk_only to limit the discover to a single subnet.


Thoughts
    
--- Ed.



On 02/03/16 11:48, Steven Xu wrote:
Hi Edward,

I recommend, when in the web UI, at least enabling the description or name fields to at least mitigate the problem if you haven't already.

Interesting observation: you only run discovers at 3AM, but you mention that within a few hours after the initial discovery (presumably during the workday), the port names change.

Some (random) guesses:
Are you running Netdisco V1 and V2 on the same server? Have you ensured they are properly set up separately from another?
How many workers have you set up? If you set up too many, it could possibly cause SNMP timeouts as in the other thread?

Otherwise, somehow, the scheduled macsuck, arpnip and nbtstat jobs are affecting your device port entries?

Steven

-----Edward Vopata <vop...@ti.com> wrote: -----
To: Steven Xu <stev...@yorku.ca>
From: Edward Vopata <vop...@ti.com>
Date: 02/03/2016 10:03AM
Cc: <netdisco-users@lists.sourceforge.net>
Subject: Re: [Netdisco] Netdisco 2 - port name problem


Attached is the schedule section from my deployment.yml file.

I found logs/netdisco-daemon.log; however, I don't find anything about the job scheduling.

My Log level is set as:
    log: 'warning'

Now you are starting to understand my frustration, this is proving to be a very difficult problem.


Yes, there are other people having this problem.  There was a message thread back in November 2015
"Subject: [Netdisco] Some Ports in Portview showing numbers instead of interface ID"

Thanks,
-- Ed.

 


On 02/03/16 08:25, Steven Xu wrote:
Hi Edward,

Since you're observing that the names change after a couple hours, it leads me to believe that some scheduled job is messing with your data. Can you include your scheduled jobs configuration?

You can also check the status of your scheduled jobs by checking the logs in logs/netdisco-daemon.log and searching for the last logs for a particular device.

I have a hard time imagining what could be causing this problem. When you run the discover manually, the results are fine, but when some scheduled job runs (presumably the scheduled discover), the device port entry gets the wrong name.

Maybe someone else has had a similar issue before?

Steven

-----Edward Vopata <vop...@ti.com> wrote: -----
To: Steven Xu <stev...@yorku.ca>, <netdisco-users@lists.sourceforge.net>
From: Edward Vopata <vop...@ti.com>
Date: 02/02/2016 01:00PM
Subject: Re: [Netdisco] Netdisco 2 - port name problem


I have about 4300 devices in my netdisco2 database, of which 2200 of these devices are having this ifindex issue.

Attached is a list of device models that are having this issue.
I am currently focusing on the cisco 2911 and 2811 devices.

Attached are some log & data files:

 
Device-before-discover.txt    - this is a query of a device before the discovery, showing the problem
                                (ie. device_port.port is the ifIndex number)

Device-netdisco-discover.log  - this is the log of the netdisco discover on that device.

Device-after-discover.txt     - this is the same query after the discovery.

The view after the discover is how I would expect the device to look.
However, after several hours, the device reverts back to the "before" view,
which is NOT right.

This is a Netdisco 2 problem. 
    I am running an old version of netdisco (based on the 0.96 release),
    with the latest SNMP::Info (version 3.31) and I am not having the problem.

    The same device on the old netdisco does NOT exhibit this problem.

I did intend to reply to the mailing list.

Thanks,

- Ed Vopata


On 02/02/16 09:03, Steven Xu wrote:
Hi Ed,

I suspect the problem lies with the SNMP setup for your routers; netdisco always uses the port name given by SNMP. The problem could also lie in the SNMP::Info implementation for your routers (which netdisco uses to gather information). What model are they?

Some logging code is already present and may already be enough to diagnose your issue.  Run the script bin/netdisco-do to discover a device that has already been discovered and is experiencing this problem, optionally with the -D and -S flags (try "bin/netdisco-do help" for more details).

I think, even without the flags, warnings will be present if there is a problem with discovering your device. Do this after you notice that the port name gets changed to the ifIndex value. Include the logs or let me know what you find.

Also, I noticed that you didn't include the netdisco mailing list this time. Was this intentional? Others may have more insights into the problem that I don't.

Steven


-----Edward Vopata <vop...@ti.com> wrote: -----
To: Steven Xu <stev...@yorku.ca>
From: Edward Vopata <vop...@ti.com>
Date: 02/02/2016 09:44AM
Subject: Re: [Netdisco] Netdisco 2 - port name problem


It is very difficult to isolate this problem.

An initial device discovery will set the port name to the correct value (ie GigabitEthernet0/0),
but sometime later the value gets changed to the ifIndex value.  I have a small group of
routers that are consistently exhibiting this problem, but I haven't been able to get much deeper.

Question:
    Where does the port name (device_port.port) value get set?
        - If it is a few places, then I can add some logging code to detect specific changes.
    Where is the source of the port name?
        - maybe I can add some checks there?
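
Until the place where the value is set is known, the symptom can at least be detected from outside Netdisco by scanning port values for entries that are nothing but digits. The sketch below is an assumption-laden illustration: it takes (device IP, port) pairs however they were obtained (for example from a query on device_port; the ip column name is my assumption), and the function name and sample rows are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def numeric_ports_by_device(rows: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group port values that consist only of digits by device IP.

    A purely numeric port is treated as a suspected ifIndex value.
    Devices with no such ports are left out of the result.
    """
    suspects: Dict[str, List[str]] = defaultdict(list)
    for ip, port in rows:
        if port.strip().isdigit():
            suspects[ip].append(port)
    return dict(suspects)


# Hypothetical sample rows, not real data from this installation.
rows = [
    ("10.255.32.73", "GigabitEthernet0/0"),
    ("10.255.32.73", "2"),
    ("10.255.32.74", "GigabitEthernet0/1"),
]
print(numeric_ports_by_device(rows))
```

Running this on a schedule and recording when a device first shows up in the result would narrow down which job ran just before the change. Note that some device types legitimately use plain numbers as port names, so the output is a list of suspects rather than proof.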

I still don't think that the issue is a timeout issue.

-- Ed Vopata


On 02/01/16 07:09, Steven Xu wrote:
Hi Edward,

It would help to provide some debug logs, although I'm not entirely familiar with which logs you should be providing.

Steven

-----Edward Vopata <vop...@ti.com> wrote: -----
To: <netdisco-users@lists.sourceforge.net>
From: Edward Vopata <vop...@ti.com>
Date: 01/18/2016 02:05PM
Subject: [Netdisco] Netdisco 2 - port name problem

I am still having a problem with netdisco 2, where the port
(device_port.port) is getting set to the SNMP ifIndex value of the port
instead of the port name value (ie: GigabitEthernet0/0).   I have tried
adjusting the SNMP retries and SNMP timeouts, but I am still getting the
same results.  An initial discover on the device will set the port to the
correct name value (ie GigabitEthernet0/0), however after a few hours the
port changes to the ifIndex value.

I don't believe that the problem is with the SNMP timeouts, since some
of the devices having the problem are in the same data center as the
netdisco server.

Here is my NetDisco Information:
     Hostname      : netdisco2
     OS            : Ubuntu 14.04.3 LTS
     Perl          : v5.18.2
     App::NetDisco : 2.033004
     DB Schema     : 40
     SNMP::Info    : 3.30
     Apache        : 2.4.7
     Net-SNMP      : 5.7.2
     PostgreSQL    : 9.3.10

Please advise.

Thanks.
-- Ed.

_______________________________________________
Netdisco mailing list
netdisco-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/netdisco-users




[attachment "Netdisco-models.txt" removed by Steven Xu/fs/YorkU]
[attachment "Device-after-discover.txt" removed by Steven Xu/fs/YorkU]
[attachment "Device-before-discover.txt" removed by Steven Xu/fs/YorkU]
[attachment "Device-netdisco-discover.log" removed by Steven Xu/fs/YorkU]
[attachment "Netdisco-Schedule.txt" removed by Steven Xu/fs/YorkU]
[attachment "Device-SNMP-config.txt" removed by Steven Xu/fs/YorkU]
[attachment "NetDisco-SNMP-config.txt" removed by Steven Xu/fs/YorkU]

--- End Message ---