On the other hand, the dips are back for us again. This is getting to be very wearing.
To recap:

1) We are running the prerelease code.
2) We have had to reset S1 or reboot ENBs periodically (multiple times a day on one particular sector) because RF usage gets stuck high.
3) The dips are back (60-second cycle, with a significant drop in throughput for about 7 seconds).
4) Otherwise, throughput is performing *better than ever*.

We are now going on 8 full months without a resolution to these. (To be fair, other manufacturers can take a while to fix things as well.)

Customers are complaining. Telrad has confirmed (multiple times) that there is nothing wrong with our network/setup/UEs. They have confirmed that we have done every single thing we can do to verify that the performance issues are *not* our problem, but Telrad's. However, replacing Telrad is not an option at present.

I doubt this falls under any lemon laws, but I can only describe our experience of failures as symptomatic of core issues with the Telrad business. Individually, I take no issue with either the American or Israeli support team. Collectively, however, we have a significant problem. There is no 24/7 NOC/TAC… and even if there were, the fact that we probably couldn't rouse an engineer to observe/collect data when the trouble is occurring is a serious defect.

I'm looking forward to seeing the Telrad team at Wispamerica, but with extremely mixed feelings about the support experience. I have attempted to burn no bridges, but at the same time be very clear about what I perceive as a systematic failure to deliver as promised.

I have been fairly quiet on list about our outstanding issues, thinking that they would be better solved by superior troubleshooting and Telrad engineering than by social engineering. Perhaps it is time for that to change. Perhaps I am doing a disservice to other Telrad customers by keeping quiet. Thoughts?

On Thu, Feb 16, 2017 at 2:40 AM, Nathan Anderson <[email protected]> wrote:

> Ugh, this is what I get for jumping to conclusions and running my mouth
> off before doing just the slightest bit of investigation.
>
> I think it might somehow just be the tool I'm using to do the graphing.
> If I watch one of the active bandwidth tests closely while also watching
> the graph of the eNB that UE is attached to, I don't (always) see the same
> dips.
>
> Sooo, false alarm. Possibly. I'll keep watching things and report back.
>
> If it's just a graphing error/anomaly, not sure what the problem would be
> here. Both the tool and the switch that the eNBs are plugged into
> supposedly support SNMP v2c, so we shouldn't be overrunning a 32-bit
> integer.
>
> -- Nathan
>
> *From:* [email protected] [mailto:[email protected]] *On
> Behalf Of *Adam Moffett
> *Sent:* Thursday, February 16, 2017 2:18 AM
> *To:* [email protected]
> *Subject:* Re: [Telrad] Uplink throughput again
>
> Interesting.
>
> ------ Original Message ------
> From: "Nathan Anderson" <[email protected]>
> To: "[email protected]" <[email protected]>
> Sent: 2/16/2017 4:24:00 AM
> Subject: Re: [Telrad] Uplink throughput again
>
> Jeremy mentioned his periodic traffic dips to me recently off-list. I
> haven't seen anything exactly like what either of you two are talking
> about, but...attached is an interesting screenshot I just took of downlink
> usage on 3 separate eNBs on our network, each of which I am currently
> saturating (off-hours) with MT download bandwidth test (occurring behind 1
> UE on each sector, and each UE has been temporarily granted 100Mbit
> downlink AMBR).
>
> Notice the little icicle-like formations? Also notice how they seem to be
> fairly regular, and also seem to occur at the exact same interval on every
> sector, but don't perfectly line up with each other?
>
> WTF is *that* about?
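
On the 32-bit counter point raised in the quoted reply above, a rough sketch of the arithmetic may help. It assumes a sustained 100 Mbit/s (the temporary downlink AMBR mentioned in the thread) and compares a 32-bit octet counter with the 64-bit counters that SNMP v2c makes available. `counter_wrap_seconds` is a hypothetical helper written for illustration, not part of any SNMP tool.

```python
def counter_wrap_seconds(rate_bps, counter_bits):
    """Seconds until an octet counter of the given width wraps at a sustained bit rate."""
    bytes_per_second = rate_bps / 8
    return (2 ** counter_bits) / bytes_per_second


# Assumption: traffic is sustained at 100 Mbit/s for the whole interval.
wrap_32 = counter_wrap_seconds(100e6, 32)
wrap_64 = counter_wrap_seconds(100e6, 64)

print(f"32-bit counter wraps after about {wrap_32:.0f} s")
print(f"64-bit counter wraps after about {wrap_64 / (365 * 86400):.0f} years")
```

Under these assumptions a 32-bit counter wraps in a little under six minutes, so it would only distort a graph if the poller were reading 32-bit counters and polling slowly; with 64-bit counters wrapping is not a practical concern.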
>
> -- Nathan
>
> *From:* [email protected] [mailto:[email protected]] *On
> Behalf Of *Jeremy Austin
> *Sent:* Wednesday, February 15, 2017 8:44 PM
> *To:* Adam Moffett; [email protected]
> *Subject:* Re: [Telrad] Uplink throughput again
>
> Adam, I'm going to assume that no other traffic on the same equipment
> (sans EPC and ENB) show this periodicity?
>
> I have seen something in the same ballpark, but not identical, since
> August. I have been planning to post it to the list to get more eyes on it
> (after letting Telrad have some time to look at it first).
>
> Just wanted to check that you had isolated the behavior entirely to LTE,
> and not routers/backhauls/switches.
>
> On Wed, Feb 15, 2017 at 7:15 PM Adam Moffett <[email protected]>
> wrote:
>
> Weird. Maybe overflow from the dedicated bearer falls into the default
> bearer? I also have to wonder if it's a bug in the UE. It seems like it
> must fall on the UE to ultimately enforce the rate limit.
>
> In our uplink throughput issue, I might have tripped over something of
> interest. I originally reported to Telrad that I was getting about half of
> what I expect for UL throughput. Now I think we actually do get the
> expected throughput, but only for a moment. Five seconds later there's
> next to nothing, then 5 seconds later back to full speed, and so on. I see
> it when looking at the realtime traffic display on our switch port, but on
> your typical chart with a 5 minute average it just looks like you're
> getting half speed.
>
> Weird thing is that it's not happening all the time. I started iPerf on 6
> UE at one site at 4am the other day and when looking at traffic at the
> switch port I saw a perfect sine wave with 10 seconds peak to peak. Later
> that day I repeated the test to show one of my co-workers and the damn
> thing wouldn't do it.
>
> I don't know what to make of it yet.
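
A quick sketch of why a 5-minute average would hide the on/off pattern Adam describes above. The figures (10 Mbit/s peak, 5 seconds on, 5 seconds off) are illustrative assumptions rather than measurements from the thread, and both helper functions are hypothetical.

```python
def square_wave_throughput(peak_mbps, on_s, off_s, duration_s):
    """Per-second throughput samples for a link alternating between peak and zero."""
    period = on_s + off_s
    return [peak_mbps if (t % period) < on_s else 0.0 for t in range(duration_s)]


def window_average(samples, window_s):
    """Average per-second samples over consecutive windows of window_s seconds."""
    averages = []
    for start in range(0, len(samples), window_s):
        window = samples[start:start + window_s]
        averages.append(sum(window) / len(window))
    return averages


# Assumed values: 10 Mbit/s peak, 5 s on / 5 s off, observed for 10 minutes.
samples = square_wave_throughput(peak_mbps=10.0, on_s=5, off_s=5, duration_s=600)
five_minute = window_average(samples, window_s=300)

print(max(samples), min(samples))  # a realtime view sees both extremes
print(five_minute)                 # each 5-minute bucket averages to half of peak
```

The same reasoning applies to the 60-second cycle with a roughly 7-second dip: a coarse average would show only a modest reduction, so confirming the pattern needs per-second (or similar) sampling.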
>
> ------ Original Message ------
> From: "Nathan Anderson" <[email protected]>
> To: "[email protected]" <[email protected]>; "'Adam Moffett'" <
> [email protected]>
> Sent: 2/10/2017 3:59:40 PM
> Subject: RE: [Telrad] Uplink throughput again
>
> So last night, I re-ran this test again, and captured the whole thing not
> just at the edge of the LTE network coming out of the EPC, but between the
> EPC and eNB, so that I could grab the user traffic together with the
> encapsulating GTP headers.
>
> What I found was that when traffic comes from behind the UE with the
> proper DSCP value set, it DOES get transmitted by the UE on the dedicated
> bearer, but the MBR is still not being enforced. I had a 10Mbit/s UL AMBR
> configured and a 256Kbit/s UL MBR set on the dedicated bearer, and when I
> ran an upload test on the dedicated bearer, it hit 10 megs. (Download test
> on the dedicated bearer was limited to the configured 256Kbit/s DL MBR.)
>
> What makes this so bizarre is that even if there is a bug that causes the
> system (which part?) to not enforce the configured rate limit for the
> dedicated bearer on the uplink, the UE AMBR should not be taken into
> account for GBR bearers, as discussed before. But it sure seems like what
> is happening is that whatever is supposed to be policing the uplink is
> mistakenly enforcing the UE UL AMBR on the dedicated bearer instead of the
> UL MBR.
>
> Ticket opened with Telrad.
>
> -- Nathan
>
> *From:* [email protected] [mailto:[email protected]] *On
> Behalf Of *Nathan Anderson
> *Sent:* Monday, February 06, 2017 3:56 PM
> *To:* 'Adam Moffett'; [email protected]
> *Subject:* Re: [Telrad] Uplink throughput again
>
> Then maybe the problem is not that the properly-marked upload traffic
> isn't getting transmitted on the right bearer, but rather that the UL
> GBR/MBR are not being enforced?
>
> Whose responsibility is enforcement of bitrates on uplink? The UE's? The
> eNB?
> The EPC? A little of columns A, B, and C?
>
> -- Nathan
>
> *From:* [email protected] [mailto:[email protected]
> <[email protected]>] *On Behalf Of *Adam Moffett
> *Sent:* Monday, February 06, 2017 2:50 PM
> *To:* [email protected]
> *Subject:* Re: [Telrad] Uplink throughput again
>
> Somewhere there must be traffic counters for each QCI, or for individual
> bearers, or something. Without seeing them it's hard to say for sure.
>
> On a busy eNB (50+ UE), I tried changing the mgmt DSCP value on an
> individual UE from 6 to 5 and testing before and after.
>
> With the UE set to DSCP 5 for mgmt, I get 0.1 mbps upload and 7% packet
> loss (500 byte pings, 0.1 second interval)
>
> On DSCP 6 I get 0.5mbps and 0% packet loss.
>
> That's not scientific rigor, but it seems like it's working.
>
> On a lighter loaded eNB I was actually getting slightly more UL throughput
> with the UE Mgmt DSCP set to 5. I don't know why.
>
> -Adam
>
> ------ Original Message ------
> From: "Nathan Anderson" <[email protected]>
> To: "[email protected]" <[email protected]>; "'Adam Moffett'" <
> [email protected]>
> Sent: 2/6/2017 5:11:49 PM
> Subject: RE: [Telrad] Uplink throughput again
>
> ...also, I still remain unconvinced that the UEs are transmitting any
> upload traffic -- even when properly marked with the right DSCP -- on the
> dedicated bearer. Until it is proven beyond a doubt that this works,
> testing upload capacity using dedicated bearers is probably a waste of time
> because it isn't doing what you think it is doing.
>
> I have tested both CPE7000 and CPE8000 at this point, and have the same
> issue on both, so I don't think it is a CPE firmware bug (that would be a
> freaky coincidence, given that both CPEs are contract-manufactured by
> different companies). So I don't know if this is me being stupid and not
> configuring my EPCs correctly, or what. But something is not working here.
>
> -- Nathan
>
> *From:* [email protected] [mailto:[email protected]] *On
> Behalf Of *Nathan Anderson
> *Sent:* Monday, February 06, 2017 2:06 PM
> *To:* 'Adam Moffett'; [email protected]
> *Subject:* Re: [Telrad] Uplink throughput again
>
> Something that I learned that I should point out:
>
> A dedicated bearer with a higher priority should take precedence over
> default bearer traffic, yes. But from what I can tell, LTE spec. does not
> have a way of putting a total speed cap on the entire UE across any and all
> bearers. The UE AMBRs only restrict all non-GBR bearers (default or not,
> even across multiple APNs) but do NOT take into account GBR bearers, and
> QCI 1 is GBR.
>
> What this means is that, for example, if you have a default bearer with
> QCI 6, and dedicated bearer with QCI 1, and the UE DL and UL AMBRs are set
> to 10 and 1 Mbit/s respectively, and your dedicated bearer's MBRs are set
> to 5 and 0.5 (half of the UE AMBRs, for the sake of this example), you
> haven't actually set up things such that up to half of the subscriber's
> AMBRs are given priority on the dedicated bearer, leaving that user half of
> his total bandwidth if you end up filling the dedicated bearer up to its
> MBR in both directions. No, instead because the GBR QCIs are not accounted
> for within the AMBR, the user can move up to 5x0.5 on the dedicated bearer
> and *simultaneously* also move up to 10x1 (assuming there is enough sector
> capacity at the time) on the default bearer.
>
> Maybe in some cases, this is desirable. If you use QCI 1 for VoIP, for
> example, then you are effectively providing the customer with a separate
> channel for their voice calls that does not dip into their configured speed
> package, but is instead additive. But it is something to keep in mind as
> you are planning and building your network as well as running tests.
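
The arithmetic in Nathan's example above can be sketched as follows. This assumes, as he describes, that the UE AMBR caps only the non-GBR bearers and that each GBR bearer is capped separately by its own MBR; `max_aggregate_rate` is a hypothetical helper for illustration only.

```python
def max_aggregate_rate(ue_ambr_mbps, gbr_bearer_mbrs_mbps):
    """Upper bound on a UE's throughput in one direction, in Mbit/s.

    Assumes the UE AMBR limits only non-GBR bearers, while each GBR bearer
    is limited independently by its own MBR, so the two are additive.
    """
    return ue_ambr_mbps + sum(gbr_bearer_mbrs_mbps)


# Values from the example: AMBR 10/1 Mbit/s, one QCI 1 bearer with MBR 5/0.5.
dl_total = max_aggregate_rate(10.0, [5.0])
ul_total = max_aggregate_rate(1.0, [0.5])

print(dl_total, ul_total)  # ceiling is the sum, given enough sector capacity
```

So a subscriber nominally sold 10x1 could in principle move 15x1.5 while both bearers are full, which is worth remembering when sizing sectors or interpreting test results.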
>
> -- Nathan
>
> *From:* [email protected] [mailto:[email protected]
> <[email protected]>] *On Behalf Of *Adam Moffett
> *Sent:* Monday, February 06, 2017 1:48 PM
> *To:* [email protected]
> *Subject:* Re: [Telrad] Uplink throughput again
>
> The EPC and most of the eNB are running the latest general release
> available on Zendesk.
>
> A couple of eNB are running some kind of maintenance release that support
> wanted us to try.
>
> I'm making sure to run iPerf on the dedicated bearer to eliminate other
> user traffic from weaker UE as a factor. At QCI 1 it should take
> precedence over the default bearer traffic.
>
> I would definitely take the time to set one up, not necessarily for this
> purpose, but rather to ensure you always have access to your UE. If the
> default bearer is hosed with a torrent and you don't have a dedicated
> bearer for management access then you can be completely locked out of the
> unit. Monitoring, management access, and firmware updates all work more
> reliably with the dedicated bearer and I'd strongly recommend it. There's
> a knowledge base article in Zendesk about it. Use DSCP 6 because that's
> tagged by default in the UE.
>
> ------ Original Message ------
> From: "Jeremy Austin" <[email protected]>
> To: "Adam Moffett" <[email protected]>; [email protected]
> Sent: 2/6/2017 4:30:43 PM
> Subject: Re: [Telrad] Uplink throughput again
>
> On Mon, Feb 6, 2017 at 12:20 PM, Adam Moffett <[email protected]>
> wrote:
>
> Can somebody tell me if they're getting expected uplink throughput?
>
> What ENB and EPC revisions are you at, Adam?
>
> We're investigating this same issue ourselves, although we haven't tried a
> dedicated bearer.
>
> --
> Jeremy Austin
>
> (907) 895-2311
> (907) 803-5422
> [email protected]
>
> Heritage NetWorks
> Whitestone Power & Communications
> Vertical Broadband, LLC
>
> Schedule a meeting: http://doodle.com/jermudgeon
>
> _______________________________________________
> Telrad mailing list
> [email protected]
> http://lists.wispa.org/mailman/listinfo/telrad

--
Jeremy Austin

(907) 895-2311
(907) 803-5422
[email protected]

Heritage NetWorks
Whitestone Power & Communications
Vertical Broadband, LLC

Schedule a meeting: http://doodle.com/jermudgeon

_______________________________________________
Telrad mailing list
[email protected]
http://lists.wispa.org/mailman/listinfo/telrad
