Most people end up with a set of three or four configurations. I.e., a SiteMonitor plus an injector is one configuration, and a SiteMonitor by itself is another.
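A rough sketch of what "configuration" means here, in Python. It assumes (as in Bill's example further down the thread) that each module's index is simply its position on the expansion bus, with the base unit at 0. The function names and module lists are made up for illustration and are not part of any PacketFlux or Cacti API.

```python
from typing import List, Set, Tuple


def monitored_slots(modules: List[str], monitored: Set[str]) -> List[Tuple[int, str]]:
    """Return (index, name) for every module we actually poll.

    Assumption: the index is the module's position on the bus, base unit = 0.
    """
    return [(i, name) for i, name in enumerate(modules) if name in monitored]


def same_for_monitoring(a: List[str], b: List[str], monitored: Set[str]) -> bool:
    """Two configurations are interchangeable if the polled modules sit at the same indices."""
    return monitored_slots(a, monitored) == monitored_slots(b, monitored)


# Hypothetical module orderings, for illustration only.
config_a = ["SiteMonitor", "SyncInjector"]
config_b = ["SiteMonitor", "SyncInjector", "PoE"]   # unmonitored module last
config_c = ["SiteMonitor", "PoE", "SyncInjector"]   # unmonitored module in the middle
monitored = {"SiteMonitor", "SyncInjector"}

print(same_for_monitoring(config_a, config_b, monitored))
print(same_for_monitoring(config_a, config_c, monitored))
```

Under that assumption, config_a and config_b put the polled modules at the same indices, while config_c pushes the SyncInjector to a different one.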
If you put the modules you don't ever monitor at the end of the list, then you can reuse configurations. I.e., a SiteMonitor and SyncInjector is the same as a SiteMonitor, SyncInjector, and PoE as far as monitoring goes.

On Oct 25, 2014 1:06 PM, "Bill Prince via Af" <[email protected]> wrote:

> OK. I think I have an approach. The SiteMonitor plus all its expansion
> units is not the "device".
>
> The "device" is the SiteMonitor plus the index of the expansion unit.
>
> For example:
>
> - SiteMonitor, index 0 is the SiteMonitor device
> - SiteMonitor, index 1 is the 4-port POE device
> - SiteMonitor, index 2 is the SyncInjector (first instance)
> - SiteMonitor, index 3 is the SyncInjector (second instance)
>
> and so on.
>
> So when you add a SiteMonitor, you just add the SiteMonitor. If you add
> another Packetflux expansion unit, you have to add it knowing which index
> (AKA "slot") it is. Put the device in a different position, and you need
> to update the index.
>
> bp
>
> On 10/25/2014 10:52 AM, Bill Prince via Af wrote:
>
> Yah. Except that the index moves around, depending on what's in front of
> it (e.g. 4-port POE versus an 8-port POE). So I can't depend on what index
> number I'll be using at any given installation. The index name will have
> to stay static if I ever hope to find it. Then again, if I install two of
> anything, there will be more than one index with the same description.
>
> Hmmm. How to do this. Maybe I do have to give each device a unique
> description, and then teach cacti to index on the unique description?
>
> bp
>
> On 10/25/2014 10:16 AM, Forrest Christian (List Account) via Af wrote:
>
> They should be offset by a fixed amount. Ie subtract 4
> On Oct 25, 2014 10:58 AM, "Bill Prince via Af" <[email protected]> wrote:
>
>> I think that may be it. The OID I was using is no longer valid. So
>> the SNMP response that came back had numbers in it, but it also looks like
>> the checksum was broken.
>>
>> Not clear to me why I thought I could do this without doing the index
>> thing.
>>
>> I hate doing the index thing.
>>
>> bp
>>
>> On 10/24/2014 10:32 PM, Forrest Christian (List Account) via Af wrote:
>>
>> A power cycle and a reboot should be identical in almost every case. The
>> reboot actually triggers a hardware reset internally in the processor,
>> which should clear everything out. Of course as soon as I say that it is
>> identical, someone will find an example where it is not.
>>
>> I'm not where I can look at the trace you sent, but I'm surprised it
>> contains errors. I do know that the unit will return a response which may
>> look like this if the oid is invalid.
>>
>> Did you adjust your oids in cacti after the removal of the mystery
>> expansion unit from the table? If not, this is likely the problem.
>>
>> In regards to the unit being there from the factory: my guess is if you
>> had this unit listed in there from the get go, then it probably was the
>> expansion unit we use to test the expansion bus here. It's supposed to be
>> factory reset before shipping but it would not shock me if it wasn't. We
>> actually had a short period that a largish percentage went out not factory
>> reset due to a tester software issue. Not really a problem but we hate to
>> have them go out in any other state.
>> On Oct 24, 2014 5:08 PM, "Bill Prince via Af" <[email protected]> wrote:
>>
>>> You mean from the web GUI? Sure.
>>>
>>> I presume a power cycle does something different from a reboot?
>>>
>>> I was always curious about this particular SiteMonitor, as it came up
>>> with the extra device on the expansion bus from the get-go. I'd never
>>> worried about it, and then I saw the discussion about getting rid of old
>>> devices with the zeroed-serial trick.
>>>
>>> Don't go there! It's a trap!
>>>
>>> bp
>>>
>>> On 10/24/2014 2:52 PM, George Skorup (Cyber Broadcasting) via Af wrote:
>>>
>>> Can you post a screenshot of your expansion, binary and analog tabs?
>>>
>>> Also, I bet if you power-cycle it, it will be fine again. I was working
>>> with Forrest on a bug where the SyncInjector and some other newer modules
>>> would mysteriously disappear from the bus. He was able to reproduce and get
>>> a fixed up firmware load for the modules. Something about one thing booting
>>> up faster than another, or something like that.
>>>
>>> On 10/24/2014 4:41 PM, Bill Prince via Af wrote:
>>>
>>> Gotcha!
>>>
>>> I removed all the Data Sources except one (PWR1). Suddenly that data
>>> was making it into cacti.
>>>
>>> Then I added back in all the Data Sources coming _JUST_ from the
>>> SiteMonitor itself. That also worked.
>>>
>>> Then I added in one of the Data Sources from the SyncInjector (sync
>>> events), which happens to be the only unit on the expansion bus past where
>>> I removed the non-existent unit. This broke it again.
>>>
>>> So I have apparently uncovered a bug where removing a unit from the
>>> expansion bus (by zeroing the serial number) causes the SiteMonitor to
>>> break SNMP responses. I think it's probably just a bad checksum, but I
>>> will leave that up to him. I forwarded the pcap trace to him.
>>>
>>> I will probably also swap out the SiteMonitor that has the problem.
>>>
>>> Thanks guys!
>>>
>>> bp
>>>
>>> On 10/24/2014 1:57 PM, Bill Prince via Af wrote:
>>>
>>> Then again....
>>>
>>> Not sure why I didn't notice this the first (or second) time.
>>> Wireshark is telling me I have a malformed packet; either a broken header
>>> or bad checksum. So even though the SNMP response is coming in with the
>>> expected data, it's getting dropped before it gets into cacti because of
>>> the malformed packet.
>>>
>>> This would explain why removing a unit on the expansion bus changed
>>> things...
>>> >>> bp >>> >>> >>> >>> >>> On 10/24/2014 1:32 PM, Bill Prince via Af wrote: >>> >>> OK. Confirmed.� The SiteMonitor is getting the SNMP requests, and it >>> is responding with the expected values. >>> >>> I ran a pcap trace both at the SiteMonitor as well as at the ethernet >>> port on the cacti server.� SNMP requests/responses are going both ways >>> (and at both ends). In fact, spine appears to be doing 3 retries. >>> >>> One thing I didn't expect is that just before the SNMP requests, there >>> are two attempts to open a telnet on the SiteMonitor.� Not sure where >>> that is coming from, except perhaps for the Manage plugin (which I >>> de-installed several weeks ago). >>> >>> So something is broken inside cacti.� How/why this was caused by >>> zeroing a serial number from a non-existent expansion unit is completely >>> baffling to me. >>> >>> I also have no clue how to fix it, because cacti "thinks" there was no >>> response. >>> >>> bp >>> >>> On 10/24/2014 11:16 AM, George Skorup (Cyber Broadcasting) via Af wrote: >>> >>> I am thoroughly confused. Is your community string correct? Can you >>> increase the device SNMP timeout, like 1000ms instead of 250ms. What's your >>> device down detection set to? Is it showing down in the device list? >>> >>> I have seen some base units go kinda screwy and respond slower and a >>> reboot doesn't fix it, they needed a power-cycle. >>> >>> On 10/24/2014 11:25 AM, Bill Prince via Af wrote: >>> >>> Now thrice. >>> >>> No joy in Mudville. >>> >>> bp >>> >>> On 10/24/2014 8:07 AM, Bill Prince via Af wrote: >>> >>> Yah.� Twice now. >>> >>> bp >>> >>> On 10/23/2014 11:06 PM, George Skorup (Cyber Broadcasting) via Af wrote: >>> >>> Gotta be the poller cache. Did you try a rebuild? >>> >>> On 10/23/2014 11:03 PM, Bill Prince via Af wrote: >>> >>> Getting closer.� When I look in the SNMP cache, there is no entry for >>> the device. 
>>>
>>> Looking in the log (without debug), I get:
>>>
>>> 10/23/2014 08:34:25 PM - SPINE: Poller[0] Host[797
>>> <http://10.13.112.20/host.php?action=edit&id=797>] TH[1] DS[12316
>>> <http://10.13.112.20/data_sources.php?action=ds_edit&id=12316>]
>>> WARNING: SNMP timeout detected [250 ms], ignoring host '10.13.114.254'
>>>
>>> So there is something causing the SNMP request to barf inside cacti.
>>> When I do an snmpget from the CLI, it all looks fine. Likewise, the
>>> realtime plugin is working fine too.
>>>
>>> So when realtime is doing the SNMP queries outside the poller, they are
>>> fine. Just when spine is doing the SNMP requests.
>>>
>>> bp
>>>
>>> On 10/23/2014 4:12 PM, George Skorup (Cyber Broadcasting) via Af wrote:
>>>
>>> You divided by zero, didn't you?
>>>
>>> Are you sure your modules are in the same order as before?
>>>
>>> On 10/23/2014 1:29 PM, Bill Prince via Af wrote:
>>>
>>> I noticed an "Expansion Unit" on one of my SiteMonitors this morning.
>>> It said something about "Device Removed" or something like that.
>>>
>>> Remembering the discussion the other day on this topic, I put a "0" in
>>> the Serial # for the non-existent unit, rescanned, & rebooted.
>>>
>>> Now, none of the OIDs work in Cacti. If I do a simple snmpget on any
>>> of the OIDs that I use, the correct information comes back. Several of the
>>> OIDs are on the base unit anyway, so they would not have moved, and
>>> further, the OIDs don't reference the serial number.
>>>
>>> So... what did I do, and how do I fix it?
