Re: [AFMUG] Cambium 450 Watchdog resets - was: To Cambium With Love- Replace the bad ePMP units.

George Skorup Fri, 22 Jan 2016 18:35:32 -0800

I'm not sure if that's exactly my problem. I've been seeing this weird stuff on APs with their own parasitic pipe, or multiple APs on a SyncBox. Those timing cables are very short, less than 5 feet. Even APs running on the iGPS do stupid things sometimes. OK, a lot, because of the orientation, which is why I've been disabling them. What I have noticed is that the newer radios with the GTop GNSS (GPS+GLONASS) receivers seem much more stable. And this is on 5GHz too, not just 3GHz.

Anyway... do you recall the recent 450i + GigE SyncInjector thread? Forrest mentioned exactly what you're talking about. The sync over power pulse on long cables has this bounce issue. I wonder if the new revision that deals with this would benefit the regular 450 APs as well. If that's even the issue. I've seen these problems on ~100 to 290 feet of cable. FM sites, grain elevators, water towers and regular towers with only our stuff or other gear like SCADA and two-way. So there's no way to tell if it's one type of site or another causing these problems.


On 1/22/2016 12:16 PM, Mark Radabaugh wrote:

I do see the SMs drop session when the unit goes from one sync source to a 
different source - there is definitely a shift between the various sources.   
There are telnet commands in the newer software to disable and enable various 
sync sources without rebooting the unit - very helpful for getting an AP back 
where you want it without a reboot, at the risk of causing a reregistration of 
some or all of the SMs.

If I had to speculate on the actual issue it would be a combination of a couple 
of things.

a) Cambium has never (to my knowledge) published a detailed specification of 
the sync over power requirements.   As a result the 3rd party vendors have had 
to reverse engineer the design.
b) Cambium was not consistent in the timing from their own equipment - 
internal, CMM3, CMM4, iGPS
c) The POE cabling and voltage has changed a number of times (this had to 
happen - nothing that can be done about that, standards move on)
d) The ’sync detection’ circuit in the APs has changed a number of times.  
Again this had to happen, components and designs change.   It’s nearly 
impossible to make them precisely the same.
e) The 3rd party timing source vendors have tried to work around timing issues, 
while Cambium has tried to tweak things to deal with 3rd party timing - and 
they end up confusing each other when (a) doesn’t exist.

I think I got off topic….    Anyway - since this is sync over power and the 
signal is a chop in the DC voltage (a square wave) a couple of things happen 
when you attach a long cable to it.   The inductance and capacitance of the 
wire distort the square wave.   The leading and trailing edges become slopes, 
possibly with some ringing on top of that.  The detection circuit in the AP has 
to decide where on the slope it’s going to ‘detect’ the timing.  Even if that 
point is precisely the same voltage on every AP revision, the variation in 
cable length, inductance, and capacitance is going to make each installation 
different.   More knobs could be added to the AP, but it’s going to be mighty 
hard to figure out what you’re supposed to set them to.
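
As a rough illustration of the cable-length point above, here is a minimal Python sketch that treats the cable as a single lumped RC low-pass, so the edge becomes an exponential and the detector fires when the edge crosses a fixed fraction of the full swing. Every number in it is an assumption for illustration, not a Cambium or PacketFlux spec: the 100 ohm source resistance, the 15 pF per foot cable capacitance, and the 50% threshold. Inductance is ignored, so this shows the slope but not the ringing.

```python
import math


def threshold_crossing_delay(length_ft, threshold_fraction=0.5,
                             source_ohms=100.0, pf_per_ft=15.0):
    """Seconds between the ideal edge and the moment the detector trips.

    Lumped-RC model (an assumption): the edge rises as 1 - exp(-t / tau),
    with tau = R * C and C proportional to cable length.
    """
    capacitance_f = length_ft * pf_per_ft * 1e-12
    tau = source_ohms * capacitance_f
    # Solve 1 - exp(-t / tau) = threshold_fraction for t.
    return -tau * math.log(1.0 - threshold_fraction)


# Hypothetical cable runs: a short timing cable and two tower runs.
for feet in (5, 100, 290):
    delay_ns = threshold_crossing_delay(feet) * 1e9
    print(f"{feet:>4} ft: detect point shifts by about {delay_ns:.1f} ns")
```

In this simplified model the shift grows linearly with cable length and also moves when the threshold moves, which is the point: two APs with identical detection circuits still "see" the pulse at different times if their cable runs differ.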

Sync over power, while an elegant solution to timing on the original FSK 
radios, has outlived its usefulness - timing is much more critical as data 
rates go up and sync over power isn’t going to cut it in its current form.

So why does this one AP go wonky?   My guess would be an interaction between 
the SyncInjector, something unique about the cabling to this specific AP (how 
and what the cable is strapped to, if the installer did something different 
with drip loops, slack storage, surge suppressor components, etc.) that causes 
the square wave to either ring or slew, and the AP either misinterprets this or 
tries to compensate for it and fails.   That doesn’t mean the SyncInjector did 
anything wrong - it’s probably doing exactly what it’s supposed to do.   The AP 
isn’t really necessarily wrong either - it’s doing what it can to recover a 
mangled signal.

Nobody is truly wrong, it’s just a broken system.

Mark

On Jan 22, 2016, at 9:56 AM, George Skorup <[email protected]> wrote:

OK, wow. I feel so not alone now.

A week or two ago I had this happen... I have two sites about 8 miles apart. 
One has a full 3.6 cluster (N,S,E,W) on a SyncInjector. The other is a single 
NE sector on a Parasitic SyncPipe. Same frequency and none of the SMs on either 
sector can hear the other APs, they all face away (i.e. the two sites don't 
line up together for any SM), so standard frequency reuse scenario. We had some 
speed complaints and I found all of the SMs on the North sector at 8X/1X 
downlink. Rebooted the whole cluster with no resolution. Figured something 
happened to that AP. Then I figured I'd check all other APs for any issues. 
That's when I found this NE sector running on the on-board GPS. Everything 
looked completely normal on this sector (SM links, etc). As soon as I disabled 
the on-board GPS to force it back onto the SyncPipe, everything returned to 
normal at the other site. I still don't understand that. It seems like the iGPS 
AP's timing was slightly different and interfering with the uplink on the other 
(the sectors can hear each other) which was causing a ton of retries so it 
backed off the downlink? That's all I can figure.

Also, before I started disabling the on-board GPS on all APs in a cluster, one 
would fail over from power port to on-board and I'd get LBT events. So I do 
believe there's a difference in the timing, PacketFlux vs on-board, aka CMM3 vs 
CMM4.

The other thing I see is dropped sessions when AutoSync switches to or from 
on-board GPS and timing port/power port (PacketFlux on both). Sometimes it's 
only the SMs actively passing traffic that go idle. I believe the pulse is 
different and the AP has to re-align the frame. I could be wrong.

I've also mentioned this to Cambium multiple times now. I believe the on-board 
GPS, even with it disabled because they don't actually turn the receiver off, 
plays a role with AutoSync going stupid. Again, I could be wrong.

I remember you mentioned that you're seeing very close 3.6 SMs getting fairly 
bad downlink SNR due to them hearing the opposite AP on the same freq, even 
with the high f/b OEM Laird sectors. We're definitely seeing that too.

Anyway, that's all I got for now. I'm not blaming anyone or pointing fingers, 
mostly because I'm not entirely sure it's Cambium's or PacketFlux's fault, 
could be both, I don't know. Shit happens. But shit has been happening for a 
really long time and I'm very frustrated. Aaron knows this, I send him stuff 
all the time. :)

On 1/22/2016 5:38 AM, Mark Radabaugh wrote:

Inline

On Jan 22, 2016, at 12:02 AM, George Skorup <[email protected]> wrote:

Sounds like you mean that sector's frame timing is drifting? I haven't seen 
that. But I have seen sectors go nuts where all of the SMs show 12dB SNR and 
MIMO-A downlink. For no reason. Reboot the AP and it's gone.

I do think the sector’s frame timing is drifting.   The end result is exactly 
the same thing you are seeing - all the SMs end up with low SNR and MIMO-A.  
Reboot it and it goes away for 8 to 12 hours before it’s back.   It truly 
smells like timing drift.  I don’t think it keeps drifting though - I think 
the AP works its way off to some sort of offset limit on the timing and then 
sits there.  Cambium was playing around with something in the software with a 
“debounce timing” that (I think) compared the incoming pulse with the 
internal clock for a sanity check and then rejected pulses outside the window.  
I’m not sure they didn’t end up putting in some type of averaging or 
compensating calculation that might be working its way to a 
limit.
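
To make that speculation concrete, here is a minimal Python sketch of a window check plus averaging that walks to a limit and then sits there. It is purely hypothetical and is not Cambium's algorithm: the function name, the 5 us window, the 0.1 averaging gain, the 20 us offset limit and the slowly ramping input are all made up for illustration.

```python
def track_offset(pulse_errors_us, window_us=5.0, gain=0.1, limit_us=20.0):
    """Follow incoming pulse timing with a clamped running correction.

    pulse_errors_us: pulse arrival time minus the internal clock's
    prediction, in microseconds (hypothetical measurements).
    Returns (offset after each pulse, number of rejected pulses).
    """
    offset = 0.0
    rejected = 0
    history = []
    for error in pulse_errors_us:
        residual = error - offset
        if abs(residual) > window_us:
            # Sanity check failed: ignore this pulse entirely.
            rejected += 1
        else:
            # Averaging step: move part of the way toward the pulse.
            offset += gain * residual
            # Clamp the correction to the allowed range.
            offset = max(-limit_us, min(limit_us, offset))
        history.append(offset)
    return history, rejected


# A pulse that slides 0.2 us later on every tick (assumed input).
drifting = [0.2 * n for n in range(200)]
offsets, rejected = track_offset(drifting)
print("final offset (us):", offsets[-1])
print("pulses rejected:", rejected)
```

With this input the correction follows the pulse until it hits the clamp, stays pinned there, and every later pulse lands outside the window and is thrown away. That would look like "drifts to an offset and sits there", but again, this only shows that the behavior is plausible for this kind of design, not that it is what the AP does.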

On full cluster sites, we're using SyncInjectors, and only SyncInjectors. I.E. 
I disable the on-board and timing ports on all sectors. If the sync pulse from 
the injector drops, oh well. FreeRun is a PITA especially with LBT.

That’s my usual practice, for the same reasons.

And I've said this many, many times. There's a difference between the on-board 
GPS which is CMM4 aligned and PacketFlux stuff which is CMM3 aligned. The 
frames don't match when you have a mix of this in a cluster.

I don’t have any way to measure the on-board timing pulse so I can’t say 
where it’s at relative to a sync injector, but I am getting very good 
performance with the north AP on the same frequency as the south with one on a 
SyncInjector and the other on internal GPS.   SNR on both sectors is stable and 
modulation is where I expect it to be.

It's very obvious when this problem crops up. The AP session list shows most 
SMs sitting at 8X/1X. And sessions with HP show the normal VC as 8X/1X, HP VC 
as 1X/1X.

Yep

I know at one site in particular, the SyncInjector doesn't show any 1PPS active 
events, yet the APs show a few inSync and outSync counts. Those could be actual 
losses in timing, but Forrest said it needs to see something like 3-4 in a row 
before it's logged in the event counter. However, at another site, the timing 
is very stable and the APs show zero outSync and one inSync count. But weird 
things are still happening.
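
A minimal Python sketch of the "needs several in a row before it's logged" behavior, to show why short losses might never reach the event counter. The threshold of 3 and the function name are assumptions for illustration; the thread only says "something like 3-4".

```python
def count_logged_losses(pulse_present, consecutive_needed=3):
    """Count loss events, logging one only after N missed pulses in a row.

    pulse_present: iterable of booleans, one per expected 1PPS tick.
    """
    logged = 0
    missed_in_a_row = 0
    for present in pulse_present:
        if present:
            missed_in_a_row = 0
        else:
            missed_in_a_row += 1
            # Log once, at the moment the run reaches the threshold.
            if missed_in_a_row == consecutive_needed:
                logged += 1
    return logged


# Two missed pulses, then a separate run of four missed pulses.
ticks = [True, False, False, True, False, False, False, False, True]
print(count_logged_losses(ticks))
```

Under this assumption a two-pulse dropout leaves no trace in the counter, while the AP may still have reacted to it.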

Same - the SyncInjector logs don’t show anything unusual, nor does the AP.

Another thing I've noticed is that a loss in the sync pulse doesn't always show 
up in the AP event log. Or I'll see a loss message, but no acquired message 
after that.

And I've been seeing this weird stuff for well over a year. I just don't know 
what else to do. CMMs, CTMs and new radios do not fit in the budget.

Using the internal GPS isn’t a viable permanent solution for this, but it’s 
working for the moment, and I have not seen a recurrence of the problem in a 
week.   I’m going to try a CMM to see if it makes any difference, and if that 
fails I’ll probably resort to dragging a timing port device up there and 
using that.

Mark
