That Flakeway tool sounds like an early precursor of Netflix's Chaos Monkey. On that note, Apple maintains a developer tool called Network Link Conditioner that does a good job of simulating degraded network conditions (added latency, packet loss, constrained bandwidth).
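For anyone who wants to poke at the same idea without Apple's tooling, here is a rough sketch in plain Python of a Flakeway-style UDP relay that drops, delays, and duplicates datagrams passing through it. The addresses and probabilities are made up for illustration; this is a toy, not a faithful reimplementation of the BBN tool.

import random
import socket
import time

# Illustrative values only -- tune to taste.
LISTEN_ADDR = ("127.0.0.1", 9000)   # point the client here instead of the real server
TARGET_ADDR = ("127.0.0.1", 9001)   # the real server
DROP_P, DELAY_P, DUP_P = 0.05, 0.10, 0.02
MAX_DELAY_S = 0.5

def main():
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(LISTEN_ADDR)
    while True:
        data, _src = sock.recvfrom(65535)
        r = random.random()
        if r < DROP_P:
            continue                                 # silently discard
        if r < DROP_P + DELAY_P:
            # Crude: sleeping here also holds back whatever arrives next,
            # which produces the bursty delivery real paths tend to show.
            time.sleep(random.uniform(0.0, MAX_DELAY_S))
        sock.sendto(data, TARGET_ADDR)
        if random.random() < DUP_P:
            sock.sendto(data, TARGET_ADDR)           # occasional duplicate

if __name__ == "__main__":
    main()

It only handles one direction and knows nothing about ARP, unlike the real Flakeway described below, but it is enough to watch how a transport reacts to loss, delay, and duplication.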
> On Oct 15, 2023, at 23:30, Jack Haverty via Nnagain <nnagain@lists.bufferbloat.net> wrote:
>
> Even back in 1978, I didn't think Source Quench would work. I recall that I was trying to adapt my TCP2.5 Unix implementation to become TCP4, and I asked what my TCP should do if it sent the first IP datagram to open a TCP connection and received a Source Quench. It wasn't clear at all how I should "slow down". Other TCP implementors took the receipt of an SQ as an indication that a datagram they had sent had been discarded, so the obvious reaction for user satisfaction was to retransmit immediately. Slowing down would simply degrade their user's experience.
>
> Glad to hear SQ is gone. I hope whatever replaced it works.
>
> There's some confusion about the Arpanet. The Arpanet was known as a "packet switching network", but it had lots of internal mechanisms that essentially created virtual circuits between attached computers. Every packet sent into the network by a user computer came out at the destination intact, in order, and not duplicated or lost. The Arpanet switches even had a hardware mechanism for flow control; a switch could halt data transfer from a user computer when necessary. During the 80s, the Arpanet evolved to have an X.25 interface, and operated as a true "virtual circuit" provider. Even in the Defense Data Network (DDN), the network delivered a virtual circuit service. The attached users' computers had TCP, but the TCP didn't need to deal with most of the network behavior that TCP was designed to handle. Congestion was similarly handled by internal Arpanet mechanisms (there were several technical reports from BBN to ARPA with details). I don't remember any time that "an explicit ack for every packet was ripped out of the arpanet". None of those events happened when two TCP computers were connected to the Arpanet.
>
> The Internet grew up around the Arpanet, which provided most of the wide-area connectivity through the mid-80s. Since the Arpanet provided the same "reliable byte stream" behavior as TCP provided, and most user computers were physically attached to an Arpanet switch, it wasn't obvious how to test a TCP implementation to see how well it dealt with reordering, duplication, dropping, or corruption of IP datagrams.
>
> We (at BBN) actually had to implement a software package called a "Flakeway", which ran on a SparcStation. Using a "feature" of Ethernets and ARP (some would call it a vulnerability), the Flakeway could insert itself invisibly in the stream of datagrams between any two computers on that LAN (e.g., between a user computer and the gateway/router providing a path to other sites). The Flakeway could then simulate "real" Internet behavior by dropping, duplicating, reordering, mangling, delaying, or otherwise interfering with the flow. That was extremely useful in testing and diagnosing TCP implementations.
>
> I understand that there has been a lot of technical work over the years, and lots of new mechanisms defined for use in the Internet to solve various problems. But one issue has not been addressed: how do you know whether or not some such mechanism has actually been implemented, and configured correctly, in the millions of devices that are now using TCP (and UDP, IP, etc.)? AFAIK, there's no way to tell unless you can examine the actual code.
>
> The Internet, and TCP, was an experiment. One aspect of that experiment involved changing the traditional role of a network "switch", moving the mechanisms for flow control, error control, and the other functions used to create "virtual circuit" behavior. Instead of being implemented inside some switching equipment, TCP's mechanisms are implemented inside users' computers. That was a significant break from traditional network architecture.
>
> I didn't realize it at the time, but now, with users' devices being uncountable handheld or desktop computers rather than huge racks in relatively few data centers, moving all those mechanisms from switches to users' computers significantly complicates the system design and especially operation.
>
> That may be one of the more important results of the long-running experiment.
>
> Jack Haverty
>
> On 10/15/23 18:39, Dave Taht wrote:
>> It is wonderful to have your original perspectives here, Jack.
>>
>> But please, everyone, before a major subject change, change the subject?
>>
>> Jack's email conflates a few things that probably deserve other threads for them. One is VGV - great acronym! Another is about the "Placeholders" of TTL and TOS. The last is the history of congestion control - and its future! Having been part of the most recent episodes here, I have written extensively on the subject, but what I most like to point people to is my fun talks trying to make it more accessible, like this one at APNIC: https://blog.apnic.net/2020/01/22/bufferbloat-may-be-solved-but-its-not-over-yet/ or my more recent one at tti/vanguard.
>>
>> Most recently one of our LibreQos clients has been collecting 10ms samples and movies of what real-world residential traffic actually looks like:
>>
>> https://www.youtube.com/@trendaltoews7143
>>
>> And it is my hope that that conveys intuition to others... as compared to speedtest traffic, which proves nothing about the actual behaviors of VGV traffic, which I ranted about here: https://blog.cerowrt.org/post/speedtests/ - I am glad that these speedtests now have latency-under-load reports almost universally, but see the rant for more detail.
>>
>> Most people only have a picture of traffic in the large, over 5 minute intervals, which behaves quite differently, or a pre-conception that backpressure actually exists across the internet. It doesn't. An explicit ack for every packet was ripped out of the arpanet as costing too much time. Wifi, to some extent, recreates the arpanet problem by having explicit acks on the local loop that are repeated until by god the packet comes through, usually without exponential backoff.
>>
>> We have some really amazing encoding schemes now - I do not understand how starlink works without retries for example, and my grip on 5G's encodings is non-existent, except knowing it is the most bufferbloated of all our technologies.
>>
>> ...
>>
>> Anyway, my hope for this list is that we come up with useful technical feedback to the powers-that-be that want to regulate the internet under some title ii provisions, and I certainly hope we can make strides towards fixing bufferbloat along the way! There are many other issues. Let's talk about those instead!
>>
>> But...
>> ...
>>
>> In "brief" response to the notes below - source quench died due to easy ddos; AQMs from RED (1992) until codel (2012) struggled with measuring the wrong things (Kathie's updated paper on RED in a different light: https://pollere.net/Codel.html); SFQ was adopted by many devices, WRR used in others, ARED I think is common in juniper boxes, fq_codel is pretty much the default now for most of linux, and I helped write CAKE.
>>
>> TCPs evolved from reno to vegas to cubic to bbr, and the paper on BBR is excellent: https://research.google/pubs/pub45646/ as is Len Kleinrock's monograph on it. However, problems with self-congestion and excessive packet loss were observed, and after entering the IETF process, it is now in its 3rd revision, which looks pretty good.
>>
>> Hardware pause frames in ethernet are often available, there are all kinds of specialized new hardware flow control standards in 802.1, and a new, more centralized controller in wifi7.
>>
>> To this day I have no idea how infiniband works. Or how ATM was supposed to work. I have a good grip on wifi up to version 6, and the work we did on wifi is in use now on a lot of wifi gear like openwrt, eero and evenroute. I am proudest of all my teams' work on achieving airtime fairness, and the better scheduling described in this paper here: https://www.cs.kau.se/tohojo/airtime-fairness/ for wifi - and MOS to die for.
>>
>> There is new work on this thing called L4S, which has a bunch of RFCs for it, leverages multi-bit DCTCP-style ECN, and is under test by apple and comcast; it is discussed on the tsvwg list a lot. I encourage users to jump in on the comcast/apple beta, and operators to at least read this: https://datatracker.ietf.org/doc/draft-ietf-tsvwg-l4sops/
>>
>> Knowing that there is a book or three left to write on this subject that nobody will read is an issue, as is coming up with an architecture to take packet handling as we know it to the moon and the rest of the solar system, which seems kind of difficult.
>>
>> Ideally I would love to be working on that earth-moon architecture rather than trying to finish getting stuff we designed in 2012-2016 deployed.
>>
>> I am going to pull out a few specific questions from the below and answer separately.
>>
>> On Sun, Oct 15, 2023 at 1:00 PM Jack Haverty via Nnagain <nnagain@lists.bufferbloat.net> wrote:
>>> The "VGV User" (Voice, Gaming, Videoconferencing) cares a lot about latency. It's not just "rewarding" to have lower latencies; high latencies may make VGV unusable. Average (or "typical") latency as the FCC label proposes isn't a good metric to judge usability. A path which has high variance in latency can be unusable even if the average is quite low. Having your voice or video or gameplay "break up" every minute or so when latency spikes to 500 msec makes the "user experience" intolerable.
>>>
>>> A few years ago, I ran some simple "ping" tests to help a friend who was trying to use a gaming app. My data was only for one specific path so it's anecdotal. What I saw was surprising - zero data loss, every datagram was delivered, but occasionally a datagram would take up to 30 seconds to arrive. I didn't have the ability to poke around inside, but I suspected it was an experience of "bufferbloat", enabled by the dramatic drop in price of memory over the decades.
>>>
>>> It's been a long time since I was involved in operating any part of the Internet, so I don't know much about the inner workings today. Apologies for my ignorance....
>>>
>>> There was a scenario in the early days of the Internet for which we struggled to find a technical solution. Imagine some node in the bowels of the network, with 3 connected "circuits" to some other nodes. On two of those inputs, traffic is arriving to be forwarded out the third circuit. The incoming flows are significantly more than the outgoing path can accept.
>>>
>>> What happens? How is "backpressure" generated so that the incoming flows are reduced to the point that the outgoing circuit can handle the traffic?
>>>
>>> About 45 years ago, while we were defining TCPV4, we struggled with this issue, but didn't find any consensus solutions. So "placeholder" mechanisms were defined in TCPV4, to be replaced as research continued and found a good solution.
>>>
>>> In that "placeholder" scheme, the "Source Quench" (SQ) IP message was defined; it was to be sent by a switching node back toward the sender of any datagram that had to be discarded because there wasn't any place to put it.
>>>
>>> In addition, the TOS (Type Of Service) and TTL (Time To Live) fields were defined in IP.
>>>
>>> TOS would allow the sender to distinguish datagrams based on their needs. For example, we thought "Interactive" service might be needed for VGV traffic, where timeliness of delivery was most important. "Bulk" service might be useful for activities like file transfers, backups, et al. "Normal" service might now mean activities like using the Web.
>>>
>>> The TTL field was an attempt to inform each switching node about the "expiration date" for a datagram. If a node somehow knew that a particular datagram was unlikely to reach its destination in time to be useful (such as a video datagram for a frame that has already been displayed), the node could, and should, discard that datagram to free up resources for useful traffic. Sadly we had no mechanisms for measuring delay, either in transit or in queuing, so TTL was defined in terms of "hops", which is not an accurate proxy for time. But it's all we had.
>>>
>>> Part of the complexity was that the "flow control" mechanism of the Internet had put much of the mechanism in the users' computers' TCP implementations, rather than in the switches, which handle only IP. Without mechanisms in the users' computers, all a switch could do was order more circuits and add more memory to the switches for queuing. Perhaps that led to "bufferbloat".
>>>
>>> So TOS, SQ, and TTL were all placeholders, for some mechanism in a future release that would introduce a "real" form of backpressure and the ability to handle different types of traffic. Meanwhile, these rudimentary mechanisms would provide some flow control. Hopefully the users' computers sending the flows would respond to the SQ backpressure, and switches would prioritize traffic using the TTL and TOS information.
>>>
>>> But, being way out of touch, I don't know what actually happens today. Perhaps the current operators and current government watchers can answer?
>>
>> I would love more feedback about RED's deployment at scale in particular.
>>
>>> 1/ How do current switches exert backpressure to reduce competing traffic flows? Do they still send SQs?
>> Some send various forms of hardware flow control, an ethernet pause frame derivative.
>>
>>> 2/ How do the current and proposed government regulations treat the different needs of different types of traffic, e.g., "Bulk" versus "Interactive" versus "Normal"? Are Internet carriers permitted to treat traffic types differently? Are they permitted to charge different amounts for different types of service?
>>>
>>> Jack Haverty
>>>
>>> On 10/15/23 09:45, Dave Taht via Nnagain wrote:
>>>> For starters I would like to apologize for cc-ing both nanog and my new nn list. (I will add sender filters)
>>>>
>>>> A bit more below.
>>>>
>>>> On Sun, Oct 15, 2023 at 9:32 AM Tom Beecher <beec...@beecher.cc> wrote:
>>>>>> So for now, we'll keep paying for transit to get to the others (since it's about as much as transporting IXP from Dallas), and hoping someone at Google finally sees Houston as more than a third rate city hanging off of Dallas. Or… someone finally brings a worthwhile IX to Houston that gets us more than peering to Kansas City. Yeah, I think the former is more likely. 😊
>>>>>
>>>>> There is often a chicken/egg scenario here with the economics. As an eyeball network, your costs to build out and connect to Dallas are greater than your transit cost, so you do that. Totally fair.
>>>>>
>>>>> However, think about it from the content side. Say I want to build into Houston. I have to put routers in, and a bunch of cache servers, so I have capital outlay, plus opex for space, power, and IX/backhaul/transit costs. That's not cheap, so there are a lot of calculations that go into it. Is there enough total eyeball traffic there to make it worth it? Is saving 8-10ms enough of a performance boost to justify the spend? What are the long term trends in that market? These answers are of course different for a company running their own CDN vs the commercial CDNs.
>>>>>
>>>>> I don't work for Google and obviously don't speak for them, but I would suspect that they're happy to eat an 8-10ms performance hit to serve from Dallas, versus the amount of capital outlay to build out there right now.
>>>>
>>>> The three forms of traffic I care most about are voip, gaming, and videoconferencing, which are rewarding to have at lower latencies. When I was a kid, we had switched phone networks, and while the sound quality was poorer than today, the voice latency cross-town was just like "being there". Nowadays we see 500+ms latencies for this kind of traffic.
>>>>
>>>> As to how to make calls across town work that well again, cost-wise, I do not know, but the volume of traffic that would be better served by these interconnects is quite low relative to the overall gains in lower-latency experiences for them.
>>>>
>>>>> On Sat, Oct 14, 2023 at 11:47 PM Tim Burke <t...@mid.net> wrote:
>>>>>> I would say that a 1Gbit IP transit in a carrier neutral DC can be had for a good bit less than $900 on the wholesale market.
>>>>>>
>>>>>> Sadly, IXPs are seemingly turning into a pay to play game, with rates almost costing as much as transit in many cases after you factor in loop costs.
>>>>>>
>>>>>> For example, in the Houston market (one of the largest and fastest growing regions in the US!), we do not have a major IX, so to get up to Dallas it's several thousand for a 100g wave, plus several thousand for a 100g port on one of those major IXes. Or, a better option, we can get a 100g flat internet transit for just a little bit more.
>>>>>>
>>>>>> Fortunately, for us as an eyeball network, there are a good number of major content networks that are allowing for private peering in markets like Houston for just the cost of a cross connect and a QSFP if you're in the right DC, with Google and some others being the outliers.
>>>>>>
>>>>>> So for now, we'll keep paying for transit to get to the others (since it's about as much as transporting IXP from Dallas), and hoping someone at Google finally sees Houston as more than a third rate city hanging off of Dallas. Or… someone finally brings a worthwhile IX to Houston that gets us more than peering to Kansas City. Yeah, I think the former is more likely. 😊
>>>>>>
>>>>>> See y'all in San Diego this week,
>>>>>> Tim
>>>>>>
>>>>>> On Oct 14, 2023, at 18:04, Dave Taht <dave.t...@gmail.com> wrote:
>>>>>>> This set of trendlines was very interesting. Unfortunately the data stops in 2015. Does anyone have more recent data?
>>>>>>>
>>>>>>> https://drpeering.net/white-papers/Internet-Transit-Pricing-Historical-And-Projected.php
>>>>>>>
>>>>>>> I believe a gbit circuit that an ISP can resell still runs at about $900 - $1.4k (?) in the usa? How about elsewhere?
>>>>>>>
>>>>>>> ...
>>>>>>>
>>>>>>> I am under the impression that many IXPs remain very successful, states without them suffer, and I also find the concept of doing micro IXPs at the city level appealing, and now achievable with cheap gear. Finer-grained cross connects between telco and ISP and IXP would lower latencies across town quite hugely...
>>>>>>>
>>>>>>> PS I hear ARIN is planning on dropping the price for, and bundling, 3 BGP AS numbers at a time, as of the end of this year, also.
>>>>>>>
>>>>>>> --
>>>>>>> Oct 30: https://netdevconf.info/0x17/news/the-maestro-and-the-music-bof.html
>>>>>>> Dave Täht CSO, LibreQos
>>>>
>>> _______________________________________________
>>> Nnagain mailing list
>>> Nnagain@lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/nnagain
>>
>
> _______________________________________________
> Nnagain mailing list
> Nnagain@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/nnagain
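P.S. One footnote on Dave's point above about AQMs measuring the wrong thing: the shift in codel was to act on how long each packet has sat in the queue (its sojourn time) rather than on how long the queue is. Below is a much-simplified Python sketch of that control law, loosely after RFC 8289; the constants are the usual defaults, the class and method names are mine, and the refinements in the RFC (count decay, lazy re-entry, and so on) are omitted.

import math
import time

TARGET = 0.005     # 5 ms of acceptable standing-queue delay
INTERVAL = 0.100   # 100 ms observation window

class CodelSketch:
    """Per dequeued packet, decide whether to drop it based on its
    sojourn time (time spent queued), not on queue length."""

    def __init__(self):
        self.first_above = None   # when sojourn time will have stayed above TARGET long enough
        self.dropping = False
        self.drop_next = 0.0
        self.count = 0

    def should_drop(self, sojourn, now=None):
        now = time.monotonic() if now is None else now
        if sojourn < TARGET:
            # Queue drained below target: leave the dropping state entirely.
            self.first_above = None
            self.dropping = False
            return False
        if self.first_above is None:
            self.first_above = now + INTERVAL
            return False
        if not self.dropping:
            if now < self.first_above:
                return False
            # Delay stayed above TARGET for a whole INTERVAL: start dropping.
            self.dropping = True
            self.count = 1
        elif now < self.drop_next:
            return False
        else:
            self.count += 1
        # Each successive drop comes sooner, until the delay falls again.
        self.drop_next = now + INTERVAL / math.sqrt(self.count)
        return True

Real implementations such as fq_codel and CAKE wrap this in per-flow queues, but the sojourn-time idea is the part that made it deployable without tuning knobs.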
_______________________________________________
Nnagain mailing list
Nnagain@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/nnagain