Another thing to note is that there does appear to be a relationship between the quantity of data being moved by the EPC and the odds of it going into an "ethernet coma". The more data it is moving, the better the chances that this occurs.

-- Nathan

From: telrad-boun...@wispa.org [mailto:telrad-boun...@wispa.org] On Behalf Of Nathan Anderson
Sent: Monday, February 20, 2017 3:47 PM
To: 'Tristan Johnson'; telrad@wispa.org
Subject: Re: [Telrad] Ethernet RX FCS errors from BreezeWay?

So, we do have an update on this situation. The FCS errors that we get on ports 9-12 appear to just be symptomatic of a bigger problem. They occur on those ports (and only those ports) whenever the EPC decides to reinitialize all of the ethernet ports / the switching infrastructure, which it decides to do (at least in our case) when it gets into a certain state where it claims that the transmit queues are "full" (according to the log entries we see). When this happens, it reinitializes the ports, which makes the EPC unavailable/unreachable for a good 4-5 seconds. During this reinit, we see 2 frames with incorrectly computed CRC received from ports 9-12 but not the other ports.

Moving away from any port 9-12 only made it so that we don't see those FCS errors. It did not address the underlying problem, which is that the EPC switch gets into a state that it apparently only knows how to deal with by resetting it completely, which knocks the EPC off-line. This is the actual reason why we are experiencing the SCTP failures to some of our eNBs. The FCS errors on those particular ports are just another symptom (and apparently harmless).

I happened to be watching the network at the exact instant that one of these network "burps" happened, and I was also tailing the tlsyslog file at that same moment as well.
Here is what was written to tlsyslog when the EPC became unresponsive:

16118:2017-02-20,14:34:05.859171:NOTICE:0:06.06.02161:MEAAPI:1008:mea_adapter.cpp:QueueFullControlHandler(5696):Queue is full for port 102
16119:2017-02-20,14:34:05.859638:NOTICE:0:06.06.02161:MEAAPI:1008:mea_adapter.cpp:QueueFullControlHandler(5701):Tx counter: 16681286/0
16120:2017-02-20,14:34:05.859712:NOTICE:0:06.06.02161:MEAAPI:1008:mea_adapter.cpp:QueueFullControlHandler(5716):Save state for port 102, queue full:Yes, mcqueue full:No, cnt:16681286
16121:2017-02-20,14:34:14.569762:NOTICE:0:06.06.02161:MEAAPI:1008:mea_adapter.cpp:QueueFullControlHandler(5696):Queue is full for port 102
16122:2017-02-20,14:34:14.570168:NOTICE:0:06.06.02161:MEAAPI:1008:mea_adapter.cpp:QueueFullControlHandler(5701):Tx counter: 16681286/16681286
16123:2017-02-20,14:34:14.570279:NOTICE:0:06.06.02161:MEAAPI:1008:mea_adapter.cpp:QueueFullControlHandler(5706):Queue full condition detected for port 102
16124:2017-02-20,14:34:21.717955:NOTICE:0:06.06.02161:MEAAPI:1008:mea_adapter.cpp:QueueFullControlHandler(5789):MEA FPGA reinit is triggered by queue control engine

I found similar-looking errors in tlsyslogs from the past, and I have a feeling that if I took the time, I could correlate every SCTP failure that we have seen (as well as every logged FCS error) to one of these.

I have already updated my tickets with Telrad to include this new finding.

-- Nathan

From: telrad-boun...@wispa.org [mailto:telrad-boun...@wispa.org] On Behalf Of Tristan Johnson
Sent: Monday, February 20, 2017 3:00 PM
To: telrad@wispa.org
Subject: Re: [Telrad] Ethernet RX FCS errors from BreezeWay?

Our low usage EPC uses Network and access on the same port. Plugged in to port 9 I think on the EPC (we wanted to take advantage of the small surge protection on those last 4 ports), into a MT CRS125-24G-1S we have no FCS errors. Probably a 3' cat5e patch cable.
We just put our first customer on it a couple days ago though.

Thanks,
Tristan Johnson
Owner
www.wirelessdatanet.net
309-893-4152

------ Original Message ------
From: "Nathan Anderson" <nath...@fsr.com>
To: "'telrad@wispa.org'" <telrad@wispa.org>
Sent: 2/17/2017 2:11:51 PM
Subject: Re: [Telrad] Ethernet RX FCS errors from BreezeWay?

Update on this: to be clear, we are running (as I think I have stated in the past) a network-if + access-if config. And it was the access-if port that has been showing the FCS errors.

I realized that the network-if port, which is plugged into an 1100AHx2, has been fine. 1100 shows no FCS receive errors. Huh.

The network-if port happened to be port 8, so I got the idea of swapping my network-if port and access-if port, and now network-if is port 11 and access-if is port 8.

No more FCS errors, now that I am using port *8*. ??????????????

Also, no FCS errors on port 11 now that it is configured as a network-if port. ?!?!?!?!?!?!?!?!?!?!?!?!?!

Our backup EPC, which is configured identically, has network-if on 8 and access-if on 11, and I cannot reproduce the FCS error problem on that box, either.

This is SUPER bizarre. On the bright side, I haven't had an SCTP reset since I made this change.

-- Nathan

From: Nathan Anderson
Sent: Friday, February 17, 2017 2:01 AM
To: telrad@wispa.org
Subject: RE: [Telrad] Ethernet RX FCS errors from BreezeWay?

I decided to take a switch that we had laying around, and put it in between the EPC and the CCR. Wouldn't you know it: the FCS errors started showing up on the switchport that the EPC is plugged into.

This isn't an EPC <-> CCR issue. This, at least for us, is purely an EPC issue.

This particular issue is really starting to get on my nerves...

-- Nathan

From: telrad-boun...@wispa.org [mailto:telrad-boun...@wispa.org] On Behalf Of Nathan Anderson
Sent: Thursday, February 16, 2017 11:11 PM
To: telrad@wispa.org
Subject: Re: [Telrad] Ethernet RX FCS errors from BreezeWay?

The cables we were using were 7ft pre-manufactured 5e patch cables. So for grins, I took 10ft of Cat. 6 off of a spool and put my own ends on.

10 minutes later, what happened? FCS ERROR.

https://getyarn.io/yarn-clip/0f901d47-7954-40e1-a539-b6cea39d93c0
https://getyarn.io/yarn-clip/a8ae5833-4912-4a5a-81ce-ae8d2cdd2823

-- Nathan

From: telrad-boun...@wispa.org [mailto:telrad-boun...@wispa.org] On Behalf Of Skywerx Support
Sent: Thursday, February 16, 2017 5:53 AM
To: telrad@wispa.org
Subject: Re: [Telrad] Ethernet RX FCS errors from BreezeWay?

We went from like 1 meter to 2 meters and from port 5 into port nine on two different EPCs, and that did the trick at the locations where we were having an issue.

--
Justin Davis
COO
SkyWerx Industries, LLC

On Feb 16, 2017, at 5:54 AM, Nathan Anderson <nath...@fsr.com> wrote:

Port 9 was the other port I tried. No difference.

How much is "a bit" longer? The cable currently being used is I think between 7 and 10 ft (I'll check tomorrow).

-- Nathan

________________________________
From: telrad-boun...@wispa.org <telrad-boun...@wispa.org> on behalf of Skywerx Support <jus...@skywerxsupport.com>
Sent: Thursday, February 16, 2017 4:19 AM
To: telrad@wispa.org
Subject: Re: [Telrad] Ethernet RX FCS errors from BreezeWay?

Nathan, have you tried a slightly longer cable from the EPC to the router, plugged into port 9? This fixed our FCS error issue.

--
Justin Davis
COO
SkyWerx Industries, LLC

On Feb 16, 2017, at 5:00 AM, Nathan Anderson <nath...@fsr.com> wrote:

So, as an update to this, we are still seeing these FCS errors between the EPC (port 11) and the CCR-1036. I know it is not just a reporting error, because we have been having issues for a while with UEs dropping off randomly, and until now we just assumed the underlying cause was whatever issue supposedly solves this for others in the upcoming 6.6M2 release. However, looking at the EPC logs, it is clear that the UEs are dropping at least in our case because the SCTP session between the EPC and some eNB on the network breaks down. And every time this happens, I can take the timestamp from the EPC log of the SCTP failure, go over to the CCR, look through its logs, and find an FCS error log entry with a timestamp that matches up with the SCTP failure *exactly*. Not every FCS error results in an eNB disconnect, but every eNB disconnect can be traced back to an FCS error.

Soo, clearly we need to get this fixed. To that end, I have tried the following:

1) Replaced the ethernet cable
2) Tried a different port on the CCR
3) Tried a different port on the EPC (that's always fun, because "reboots")
4) REPLACED THE CCR

We are still seeing FCS receive errors on that port and that port only.

Anybody have any other ideas?

-- Nathan

From: telrad-boun...@wispa.org [mailto:telrad-boun...@wispa.org] On Behalf Of Nathan Anderson
Sent: Monday, January 30, 2017 1:31 PM
To: telrad@wispa.org
Subject: Re: [Telrad] Ethernet RX FCS errors from BreezeWay?

We have always had a habit of doing this. And we are on 6.32.x. Searching through the ROS changelogs (which are a lot more verbose these days than they used to be) for "FCS" revealed nothing interesting.

-- Nathan

From: telrad-boun...@wispa.org [mailto:telrad-boun...@wispa.org] On Behalf Of Shayne Lebrun
Sent: Monday, January 30, 2017 7:15 AM
To: telrad@wispa.org
Subject: Re: [Telrad] Ethernet RX FCS errors from BreezeWay?

And always make sure, when upgrading RouterOS, that you then upgrade the Routerboard firmware. I think it happens semi-automatically now; you no longer need to do a '/system routerboard upgrade', but you do still need a second reboot.

From: telrad-boun...@wispa.org [mailto:telrad-boun...@wispa.org] On Behalf Of Jesse Dupont
Sent: Monday, January 30, 2017 8:51 AM
To: telrad@wispa.org
Subject: Re: [Telrad] Ethernet RX FCS errors from BreezeWay?

What RouterOS version is on this CCR? I think around the 6.34 mark, these started showing up in some circumstances, but a RouterOS upgrade resolves them (may just be a reporting error or hardware driver bug).

On Mon, Jan 30, 2017 at 12:37 AM -0700, "Nathan Anderson" <nath...@fsr.com> wrote:

We have recently noticed a new problem: we have an access port on our BreezeWay (in this case, so happens it's the BW's port 10, but may not be relevant) that is plugged into a MikroTik CCR 1036. The MikroTik is reporting that it is sporadically seeing FCS errors on frames received from the BreezeWay.

I have replaced the ethernet cable and also tried moving to a different ethernet port on the CCR. Neither has made a difference.

I'm wondering if there is any way I can see any ethernet stats or diagnostic information from the BreezeWay's perspective. In my poking around, so far I have come up empty.

I'm not necessarily convinced at this point that this is the BreezeWay's fault, mind you. I'm just wondering if anybody else has seen something similar, and how to best go about chasing this problem down.
My perception is that the CCRs in particular have had a troubled history when it comes to their copper gig ports...search for "Ubiquiti AirFiber MikroTik CCR" if you want some fun afternoon light reading.

Thanks,

--
Nathan Anderson
First Step Internet, LLC
nath...@fsr.com

_______________________________________________
Telrad mailing list
Telrad@wispa.org
http://lists.wispa.org/mailman/listinfo/telrad
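
For anyone who wants to check the correlation suggested in this thread (every SCTP failure or logged FCS error lining up with a queue-full reinit on the EPC), here is a minimal Python sketch. It assumes that tlsyslog lines follow the layout of the excerpt quoted above. The regular expression, the function names reinit_times and correlate, and the 10-second window are all illustrative assumptions and do not come from Telrad. Timestamps for the other events (SCTP failures from the EPC log, FCS errors from the router log) would have to be collected separately and passed in.

```python
import re
from datetime import datetime, timedelta

# Layout inferred from the tlsyslog excerpt quoted in the thread (an assumption).
# The leading number seen in that excerpt is treated as optional.
LINE_RE = re.compile(
    r"^(?:\d+:)?(?P<date>\d{4}-\d{2}-\d{2}),(?P<time>\d{2}:\d{2}:\d{2}\.\d{1,6}):"
    r".*?:(?P<func>\w+)\((?P<src>\d+)\):(?P<msg>.*)$"
)
# Text of the final entry in the quoted sequence.
REINIT_MARKER = "MEA FPGA reinit is triggered"


def reinit_times(lines):
    """Return the timestamp of every FPGA-reinit entry found in `lines`."""
    found = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and REINIT_MARKER in m.group("msg"):
            stamp = datetime.strptime(
                m.group("date") + " " + m.group("time"), "%Y-%m-%d %H:%M:%S.%f"
            )
            found.append(stamp)
    return found


def correlate(reinits, other_events, window=timedelta(seconds=10)):
    """Pair each other event with the first reinit within `window`, else None."""
    pairs = []
    for event in other_events:
        match = next((r for r in reinits if abs(event - r) <= window), None)
        pairs.append((event, match))
    return pairs


# Two lines copied from the excerpt in the thread, used as sample input.
SAMPLE = [
    "16118:2017-02-20,14:34:05.859171:NOTICE:0:06.06.02161:MEAAPI:1008:"
    "mea_adapter.cpp:QueueFullControlHandler(5696):Queue is full for port 102",
    "16124:2017-02-20,14:34:21.717955:NOTICE:0:06.06.02161:MEAAPI:1008:"
    "mea_adapter.cpp:QueueFullControlHandler(5789):"
    "MEA FPGA reinit is triggered by queue control engine",
]

if __name__ == "__main__":
    for stamp in reinit_times(SAMPLE):
        print("reinit at", stamp.isoformat())
```

Any event that correlate pairs with None had no reinit within the window, which would be the interesting case if the theory in the thread is right. The window should be widened if the clocks on the EPC and the router are not synchronized.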