We've had success with the Friday version as well.

On Thu, Feb 9, 2017 at 1:10 PM, Adam Moffett <[email protected]> wrote:
> ...and with that warning out of the way, I'll say I'm running it on every 8000 UE for an entire site. So far they're all fine, but it's only been 12 hours.
>
> ------ Original Message ------
> From: "Adam Moffett" <[email protected]>
> To: [email protected]
> Sent: 2/9/2017 5:05:24 PM
> Subject: Re: [Telrad] UE upgrade failure rate
>
> It's labeled "beta".
>
> I asked today if it was safe for general deployment and the answer was essentially "gee I think so".
>
> ------ Original Message ------
> From: "Nathan Anderson" <[email protected]>
> To: "Telrad List" <[email protected]>
> Sent: 2/9/2017 4:57:49 PM
> Subject: Re: [Telrad] UE upgrade failure rate
>
> I have had tickets open with Telrad for weeks about our various CPE8000 issues, with a response that they are working on new firmware for us, but nobody updated the ticket to inform me that a new release is out. I just found it on the Zendesk after seeing your message, and it was posted Friday. Thanks for the heads-up; I'll give it a go.
>
> It is good to hear that this new firmware brings reattach times back to pre-826 levels. That alone has been preventing us from upgrading from 694. My big hope is that this release also fixes ACS access when DMZ is enabled.
>
> -- Nathan
>
> *From:* [email protected] [mailto:[email protected]] *On Behalf Of *Matthew Carpenter
> *Sent:* Thursday, February 09, 2017 7:52 AM
> *To:* Telrad List
> *Subject:* Re: [Telrad] UE upgrade failure rate
>
> So far I have moved 12 CPE8000's to the new firmware V2.2.2 Pack 0 (Ver.983).
>
> One UE did not recover from the update.
>
> But now if the S1 goes down the updated UE's will reconnect within 1 - 2 minutes, while the CPE8000 with V2.2.1 Pack 9 (Ver.826 & Ver.832) would not reconnect for anywhere between 10 Minutes and 40 Minutes.
>
> I am seeing a major improvement with the V2.2.0 Pack 0 (Ver.983) !
>
> Matt Carpenter
> Amarillo Wireless
>
> On Thu, Feb 9, 2017 at 9:43 AM, Nathan Anderson <[email protected]> wrote:
>
> Also, to be clear, what I have found is that the "admin" config file is neither included as part of the config backup (and so does not get changed or overwritten if you restore a config backup to a CPE), nor is it touched if you reset the CPE to defaults. So neither simply resetting to defaults nor restoring a config to the CPE will disable the FTP server on CPEs that have "admin_services" in "debug" mode.
>
> -- Nathan
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Nathan Anderson
> Sent: Thursday, February 09, 2017 7:33 AM
> To: '[email protected]'; '[email protected]'; 'Adam Moffett'
> Subject: Re: [Telrad] UE upgrade failure rate
>
> Brief update: I went through more CPE8000s on our network, and found a few that are running the rogue FTP server process.
>
> Nothing has uploaded any files to them, though, because we have standardized on enabling DMZ on every installed UE. And so the DMZ feature is forwarding all incoming connections bound for the FTP port (TCP 21) to the DMZ IP address, which ends up protecting the UE.
>
> The DMZ feature on the 8000s currently is buggy, though, insofar as enabling it also breaks the ability of the ACS server to talk to the UEs. So right now we have no control over the 8000s from the ACS until Telrad fixes this.
>
> I think what must have happened with the few CPEs that failed in the way that they did is that in a few very select cases, we disabled the DMZ precisely so that we could either monitor a particular CPE from the ACS or do things like push out a firmware upgrade from the ACS. Disabling the DMZ would have made that CPE vulnerable if it was one that the FTP server process happened to be enabled on.
>
> We only upgraded a very small number of CPE8000s from the stock factory firmware to the most recent before we stopped doing that entirely, due to the discovery of the "it takes the new firmware 15 minutes to re-attach after a network detach" bug. So the vast majority of 8000s on our network are still running stock 694 firmware. And all CPEs, as I said before, have been installed and configured the exact same way (with the restoration of a config backup template file). Still, I have found both never upgraded 8000s that aren't running the FTP server, and never upgraded 8000s that are. So it isn't like the FTP server was something that the old firmware did in all circumstances and which got fixed in later firmware, because there are many 8000s running on the network right now that are running 694 and aren't running the FTP daemon.
>
> The only difference I can see between the two is the contents of the /etc/config/admin file. This file controls (among other things) whether the FTP server process gets launched on bootup. And the set_admin_services.sh command/script that I described in my "how to fix" post either modifies that file to enable ("debug" mode) or disable ("commercial" mode) that (and other) processes, depending on how it is invoked. The question is why do some CPEs have the "debug" admin config file in place while others have the "commercial" one in place. What conditions determine this?
>
> I plan to take some brand-new-in-box 8000s out, power them up, and see if right out of the box they are all the same. Maybe some got shipped out with the "admin services" set to "debug" mode for some reason?
>
> -- Nathan
>
> -----Original Message-----
> From: Nathan Anderson
> Sent: Thursday, February 09, 2017 5:02 AM
> To: '[email protected]'; [email protected]; Adam Moffett
> Subject: RE: [Telrad] UE upgrade failure rate
>
> To be fair, I haven't ruled out the possibility that we did something stupid, and left a security door wide open somewhere. But I'm having a hard time figuring out what that would have been.
>
> I also went through and checked a handful of 8000s in the field, and haven't found any others that look like they either have an FTP server running, or that look like they have served as a dumping ground for these sorts of worms. We program each CPE identically when they get installed, and the configuration is stamped out of a template (we configured one CPE how we wanted it, then backed up the config and restore that config file to a new CPE during installation).
>
> This is a weird one, to be sure.
>
> -- Nathan
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Robert
> Sent: Thursday, February 09, 2017 3:48 AM
> To: Adam Moffett; [email protected]
> Subject: Re: [Telrad] UE upgrade failure rate
>
> Sometimes, you feel like reaching out and finding a particular software engineer and slapping them upside the back of the head... Over and over and over and over.... WTF were they thinking...
>
> On 2/9/17 2:56 AM, Adam Moffett wrote:
> > WOW.
> >
> > ------ Original Message ------
> > From: "Nathan Anderson" <[email protected]>
> > To: "Telrad List" <[email protected]>
> > Sent: 2/9/2017 5:54:27 AM
> > Subject: Re: [Telrad] UE upgrade failure rate
> >
> >> After much trial and error, I managed to learn that the white port is a TTL-level serial interface. And there was much rejoicing.
> >>
> >> ALSO, I FIGURED OUT WHAT HAS BEEN KILLING (at least our) CPE8000s.
> >>
> >> Remember that problem that the EPC firmware had back when it was first released? Back when root access was still available on the EPC firmware, there was an FTP server running on it that accepted connections via the PDN IP address, and if you didn't change the root password from the default insecure one (which was ironically named), then infected machines trying to spread that stupid Photo.scr worm would successfully log into the EPC via FTP and, thinking that it had managed to log into a public web server somewhere, upload a bajillion copies of the virus to it in various directories, filling up the disk.
> >>
> >> The exact same thing is happening here, believe it or not. It hadn't ever occurred to me to test for this, but it turns out that under certain circumstances that I haven't yet managed to nail down, the CPE8000 firmware actually starts running an FTP server. Even worse, this FTP server, once enabled, does not ask for any credentials. You can literally type in any username when prompted, and you are in. I see no config option on the web interface for the CPE that allows you to turn this on and off...but whatever is triggering it ends up creating a ready and completely unsecured backdoor to the CPE.
> >>
> >> *headdesk*
> >>
> >> If you guys give out routable IPs to your LTE users, or if you have somebody on your network that has a PC infected with this particular virus, then it might be that this could also explain your CPE8000 firmware upgrade problems.
> >>
> >> After figuring out the serial port bit and examining the "dead" CPEs more in-depth, I found the filesystems littered with files named things like Photo.scr, IMG001.scr, Info.zip, etc.
> >> Once the writable partition with the CPE configuration is completely full, if at that point you issue either a reset-to-defaults, or upload a configuration backup, or initiate a firmware upgrade (which has to migrate your configuration from the old firmware version during the process), your CPE gets bricked because there isn't enough disk space left for it to properly finish writing the config changes to disk. So it gets only half-done, and the configuration is left in an inconsistent state.
> >>
> >> I've managed to fix my dead units, and also found the mechanism for disabling the FTP server. Still not sure how it is getting toggled on in the first place (perhaps there is some other vulnerability that is getting exploited first?), but I'll keep looking.
> >>
> >> I'll write up some instructions for y'all and post them here soon.
> >>
> >> -- Nathan
> >>
> >> *From:* [email protected] [mailto:[email protected]] *On Behalf Of *Nathan Anderson
> >> *Sent:* Wednesday, February 08, 2017 1:49 PM
> >> *To:* Telrad List; Adam Moffett
> >> *Subject:* Re: [Telrad] UE upgrade failure rate
> >>
> >> Does anybody happen to know if the 6-pin white connector on the 8000's board is either a serial port or a JTAG interface?
> >>
> >> -- Nathan
> >>
> >> *From:* [email protected] [mailto:[email protected]] *On Behalf Of *Nathan Anderson
> >> *Sent:* Wednesday, February 08, 2017 1:40 PM
> >> *To:* Telrad List; Adam Moffett
> >> *Subject:* Re: [Telrad] UE upgrade failure rate
> >>
> >> This thread is interesting because I was just complaining last night to our vendor about how fragile the software on the CPE8000s seems to be.
> >>
> >> We have not had specific issues with flashing CPEs "over the air" from the web interface, but sometimes ACS-initiated updates don't complete correctly. On 7000s it usually takes the form of the upgrade not completing and the UE falling off of the ACS, but the radio stays up and attached to the network. We go in via the web interface OTA and reboot it and it comes back with the same version of firmware it was already running. Second time is usually the charm, and I'm thinking that perhaps if the UE had been freshly-rebooted before attempting the update, we might have a higher success rate. (We have also seen 7000s just stop talking to the ACS without us touching the firmware, and even though they are otherwise working fine. Again, rebooting the CPE fixes this. Although it is rare, we have seen this even on the latest .116)
> >>
> >> We once had a 7000 that did drop off the network after pushing the update via ACS. We never checked what state it was in from the ethernet side, but we had the customer powercycle it themselves and it came back...again running the same firmware. So the upgrade did not take, but it didn't brick it either and resetting config to defaults on the UE was not (and at least for us never has been) necessary.
> >>
> >> So we have never had to truck-roll to a 7000 as a result of a failed firmware upgrade. The 8000s, however, seem to be another story. I am so scared to touch the ones we have in the field anymore. We have had a couple that seem to get their configs corrupted after a firmware change, and get into very funky states.
> >>
> >> One of them had these symptoms: defaulted to a 192.168.0.1 IP on the ethernet (!), no web server running, no DHCP server running, had telnet access that didn't prompt for a password (!!).
> >> Fixed it by resetting to defaults (found a shell script that performs this function on the CPE's filesystem). I got lucky with this one.
> >>
> >> One that I have sitting on my desk now is one that we tried to rollback the firmware on (customer was experiencing random network detaches, and the latest 8000 firmware doesn't reattach for 15 minutes on-the-dot, so customer was -- I think justifiably -- getting a bit pissy). Current symptoms are: NO IPv4 on the ethernet, IPv6 link-local responds, no web server running, no DHCP server running, telnet responds (calls itself "KZTECH") but default root/root123 doesn't work, so I have NO way to get in and reset the damn thing, and the 8000s don't seem to have a reset button. Thus it seems that it is possible for a scrambled config to completely brick an 8000.
> >>
> >> If anybody has reliable information on how to get the 8000 to wipe its config during bootup even though it seemingly lacks a reset button, I would be eternally grateful...
> >>
> >> -- Nathan
> >>
> >> *From:* [email protected] [mailto:[email protected]] *On Behalf Of *Matthew Carpenter
> >> *Sent:* Wednesday, February 08, 2017 6:38 AM
> >> *To:* Adam Moffett; Telrad List
> >> *Subject:* Re: [Telrad] UE upgrade failure rate
> >>
> >> Hi,
> >>
> >> So far only 1 CPE8000 UE that did not come back after a firmware update. Normally a hard reboot would fix it, but in this case we had to replace it.
> >>
> >> I have that CPE8000 on my desk and need to see what the status is from the LAN side. Thanks for the info on defaulting it, will try it.
> >>
> >> Matt C.
> >>
> >> On Wed, Feb 8, 2017 at 8:23 AM, Adam Moffett <[email protected]> wrote:
> >>
> >> We've had a helluva time upgrading UE firmware over the air.
> >> It was worse with Wimax. On Wimax it was more like 75% of the time we would lose the channel scan table and have to go on site to add it back in. It became SOP to leave the operator password at default so we had the option of having the customer log in and fix the scan table for us.
> >>
> >> I think we've had more success since going to LTE. However, failed firmware updates were one of the incentives to set up a dedicated management bearer. I was hoping it would help with these things. We haven't pushed out an update recently enough to say whether it helped.
> >>
> >> -Adam
> >>
> >> ------ Original Message ------
> >> From: "Shayne Lebrun" <[email protected]>
> >> To: [email protected]
> >> Sent: 2/8/2017 9:14:36 AM
> >> Subject: [Telrad] UE upgrade failure rate
> >>
> >> Does anybody else experience a ten to fifteen percent failure rate when upgrading UEs? The behavior is, you upgrade the firmware, reboot, and the device doesn't come back. Logging into the UE's management from LAN, you'll see it's stuck in 'device init.' Defaulting the unit and rebooting allows it to boot and attach.
> >>
> >> We're not using the residential gateway device or anything, and the only config we put in is device name, SNMP and ACS settings. Sometimes we hardcode the client's device in the DHCP server, to turn on DMZ to allow port forwarding, but that doesn't seem to be a causal factor.
> >>
> >> --
> >> *Matthew Carpenter*
> >> *806-316-5071 office*
> >> *806-236-9558 cell*
>
> --
> *Matthew Carpenter*
> *806-316-5071 office*
> *806-236-9558 cell*

--
Jeremy Austin
(907) 895-2311
(907) 803-5422
[email protected]

Heritage NetWorks
Whitestone Power & Communications
Vertical Broadband, LLC

Schedule a meeting: http://doodle.com/jermudgeon
_______________________________________________
Telrad mailing list
[email protected]
http://lists.wispa.org/mailman/listinfo/telrad
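
For anyone who wants to check their own CPEs for the condition described in this thread (an FTP service on TCP 21 that accepts any username, and a filesystem littered with files such as Photo.scr, IMG001.scr and Info.zip), here is a minimal Python sketch using only the standard library. Everything in it beyond those three file names and the port number is an assumption made for illustration: the function names, the placeholder username "anyuser", and the idea that a plain FTP directory listing is enough to spot the worm files. It is not a Telrad tool, it has not been run against a CPE8000, and it should only be pointed at equipment you operate.

```python
import ftplib
import socket

# File names reported in the thread on affected CPEs (compared case-insensitively).
WORM_NAMES = {"photo.scr", "img001.scr", "info.zip"}


def ftp_port_open(host, port=21, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def suspicious_files(names):
    """Return the listing entries whose base name matches a known worm file."""
    hits = []
    for name in names:
        base = name.replace("\\", "/").rsplit("/", 1)[-1]
        if base.lower() in WORM_NAMES:
            hits.append(name)
    return sorted(hits)


def audit_cpe(host, user="anyuser", password="", timeout=5.0):
    """Probe one CPE: is FTP reachable, does an arbitrary login work,
    and does the top-level listing contain worm file names?

    The username is an arbitrary placeholder (hypothetical); the thread
    reports that any username is accepted on affected units.
    """
    result = {"host": host, "ftp_open": False, "login_ok": False, "worm_files": []}
    if not ftp_port_open(host, timeout=timeout):
        return result
    result["ftp_open"] = True
    try:
        with ftplib.FTP() as ftp:
            ftp.connect(host, 21, timeout=timeout)
            ftp.login(user, password)
            result["login_ok"] = True
            result["worm_files"] = suspicious_files(ftp.nlst())
    except ftplib.all_errors:
        # Refused login or failed listing: keep whatever was learned so far.
        pass
    return result
```

Calling audit_cpe() once per management IP and flagging any result where login_ok is true would give a quick list of units to look at before attempting a reset-to-defaults, config restore or firmware upgrade on them. Note that it only lists the top-level directory, so worm files dropped into subdirectories would be missed.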
