Good idea. Did the pcap trace, and it sure looks like the SiteMonitor is responding with the correct values. So the question remains as to why cacti thinks otherwise (problem is cacti, but I have no idea why). Maybe need to trace it at the other end as well....

bp

On 10/24/2014 12:32 PM, Forrest Christian (List Account) via Af wrote:

Darn autocorrect.  SNMP not snap.

On Oct 24, 2014 12:32 PM, "Forrest Christian (List Account)" <[email protected] <mailto:[email protected]>> wrote:

    Before you do that I'd look at what is coming back via snap via a
    wireshark or similar.

    If you zeroed an expansion module in the middle of the list, then
    all of the oids for devices after that entry in the list would
    have shifted to a lower number.

    The sitemonitor assigns oids based on its knowledge of how many of
    each i/o type each device takes. It remembers this even if the
    device isn't attached anymore.  By zeroing a device in the middle,
    it reassigns oids after that point in the table, since it doesn't
    have the zeroed device info as a placeholder.
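To make the shifting concrete, here is a minimal sketch of the behaviour described above. It is not the SiteMonitor's actual numbering scheme: the device names, I/O counts, and the simple 1-based consecutive index are all hypothetical, chosen only to show how removing a placeholder in the middle of the list moves every later entry down.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    io_count: int  # hypothetical number of I/O points this device occupies


def assign_indices(devices):
    """Assign consecutive 1-based table indices to each device's I/O points."""
    table = {}
    next_index = 1
    for dev in devices:
        table[dev.name] = list(range(next_index, next_index + dev.io_count))
        next_index += dev.io_count
    return table


# Hypothetical layout: a base unit, a removed expansion that is still
# remembered as a placeholder, and a second expansion after it.
before = [
    Device("base", 4),
    Device("expansion-A (removed)", 2),
    Device("expansion-B", 3),
]

# Zeroing the removed device drops its placeholder from the list.
after = [d for d in before if d.name != "expansion-A (removed)"]

print(assign_indices(before))
print(assign_indices(after))
```

In this toy model the base unit's indices do not change, and only the entries after the zeroed device move to lower numbers.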

    On Oct 24, 2014 12:01 PM, "Bill Prince via Af" <[email protected]
    <mailto:[email protected]>> wrote:

        You think you're confused.

        I did not change the community string, and it works from the
        CLI and/or through the realtime plugin.  The device shows as
        UP, and I use "SNMP or ping" as up/down detection.

        I also tried changing the SNMP timeout to 1000 ms.  All that
        did was change the error log to this:

        10/24/2014 11:29:22 AM - SPINE: Poller[0] Host[703] TH[1]
        DS[12223] WARNING: SNMP timeout detected [1000 ms], ignoring
        host '10.13.114.254'
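When comparing many of these warnings, for example to see which hosts and data sources are affected and at what timeout, it can help to pull the fields out of each line. The following is a rough sketch that assumes the warning always has the shape of the line quoted above. The regular expression and the field names are my own and are not taken from the Cacti or spine source.

```python
import re

# Pattern modelled on the single log line quoted in this thread (an assumption,
# not an official description of the spine log format).
SPINE_TIMEOUT = re.compile(
    r"SPINE: Poller\[(?P<poller>\d+)\] Host\[(?P<host_id>\d+)\] TH\[(?P<thread>\d+)\] "
    r"DS\[(?P<ds_id>\d+)\] WARNING: SNMP timeout detected \[(?P<timeout_ms>\d+) ms\], "
    r"ignoring host '(?P<address>[^']+)'"
)


def parse_timeout(line):
    """Return the fields of a spine SNMP-timeout warning, or None if no match."""
    # Collapse line wrapping and repeated spaces before matching.
    flat = " ".join(line.split())
    match = SPINE_TIMEOUT.search(flat)
    if match is None:
        return None
    fields = match.groupdict()
    # Every captured field is numeric except the host address.
    return {k: (v if k == "address" else int(v)) for k, v in fields.items()}


sample = (
    "10/24/2014 11:29:22 AM - SPINE: Poller[0] Host[703] TH[1] "
    "DS[12223] WARNING: SNMP timeout detected [1000 ms], ignoring "
    "host '10.13.114.254'"
)
result = parse_timeout(sample)
print(result)
```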

        I've tried "SNMP Uptime", "SNMP Desc", and "SNMP getNext" as
        well.  On the Device Management screen, it retrieves the
        correct SNMP information.  The only thing that seems to not
        be working is the polling through spine.

        I'm curious why zeroing the serial number of a non-existent
        expansion unit caused this problem.

        I've also rebooted the SiteMonitor at least a couple of times
        to no effect.

        My next thing will be to just replace the SiteMonitor with a
        spare.  It's all the way down in town, so that is a half-day
        time hit.

        bp

        On 10/24/2014 11:16 AM, George Skorup (Cyber Broadcasting) via
        Af wrote:
        I am thoroughly confused. Is your community string correct?
        Can you increase the device SNMP timeout, like 1000ms instead
        of 250ms. What's your device down detection set to? Is it
        showing down in the device list?

        I have seen some base units go kinda screwy and respond
        slower and a reboot doesn't fix it, they needed a power-cycle.

        On 10/24/2014 11:25 AM, Bill Prince via Af wrote:
        Now thrice.

        No joy in Mudville.

        bp
        On 10/24/2014 8:07 AM, Bill Prince via Af wrote:
        Yah.  Twice now.

        bp
        On 10/23/2014 11:06 PM, George Skorup (Cyber Broadcasting)
        via Af wrote:
        Gotta be the poller cache. Did you try a rebuild?

        On 10/23/2014 11:03 PM, Bill Prince via Af wrote:
        Getting closer.  When I look in the SNMP cache, there
        is no entry for the device.

        Looking in the log (without debug), I get:

        10/23/2014 08:34:25 PM - SPINE: Poller[0] Host[797] TH[1]
        DS[12316] WARNING: SNMP timeout detected [250 ms], ignoring
        host '10.13.114.254'

        So there is something causing the SNMP request to barf
        inside cacti.  When I do an snmpget from the CLI, it
        all looks fine.  Likewise, the realtime plugin is
        working fine too.

        So when realtime is doing the SNMP queries outside the
        poller, they are fine.  Just when spine is doing the
        SNMP requests.


        bp
        On 10/23/2014 4:12 PM, George Skorup (Cyber Broadcasting)
        via Af wrote:
        You divided by zero, didn't you?

        Are you sure your modules are in the same order as before?

        On 10/23/2014 1:29 PM, Bill Prince via Af wrote:

        I noticed an "Expansion Unit" on one of my SiteMonitors
        this morning.  It said something about "Device
        Removed" or something like that.

        Remembering the discussion the other day on this topic,
        I put a "0" in the Serial # for the non-existent unit,
        rescanned, & rebooted.

        Now, none of the OIDs work in Cacti.  If I do a
        simple snmpget on any of the OIDs that I use, the
        correct information comes back. Several of the OIDs are
        on the base unit anyway, so they would not have moved,
        and further, the OIDs don't reference the serial number.

        So... what did I do, and how do I fix it?









